Regarding source text(s)
The initial and quite basic idea for the GEBCom reception test was to have the participants read one or more texts in English and then somehow test their comprehension of this afterwards. We discussed several options for the design of the texts for the test.
A single, long text vs. several short texts
The advantage of using a single long text is that it may allow for a more profound and thorough testing of comprehension, as it makes it possible to include several of the elements we would like to test and to examine how these influence each other. It is easily imaginable that if one part of the text gives rise to differences in comprehension, another part might either enhance these differences or, on the contrary, attenuate them. In this sense a single long text may more adequately reflect reading comprehension in a natural setting. A problem with using a single long text, however, is that it may prove difficult to either find or produce one that includes all of the desired elements to be tested. In addition, a long text may be too difficult for a less proficient reader, and even a proficient reader might lose interest or focus while reading it, which may affect comprehension. This makes it difficult, maybe even impossible, to know whether possible differences in comprehension are caused by differences in mother tongues or in fact merely by a lack of concentration.
Instead I chose to include several short texts in the hope that this would be more manageable for the participant, regardless of proficiency. Several texts also allowed for the possibility of letting one text deal with only one or a few of the items that I wished to test. This might make it easier to elicit the comprehension of the specific element(s) under investigation. However, finding, selecting or even creating several texts with the desired elements could be a difficult task and raised important questions such as how many texts to choose, whether they should be related or not, in which order they should appear, etc. Furthermore, one could argue that short texts might not be a true reflection of reality in the sense that they would leave no room for the extra explanation or supportive move that might (or might not) be present in a real-life intercultural interaction. And, as with a single long text, several short texts might prove too tiring for the participant, which could mean that he/she would lose focus towards the end, endangering the comprehension of the last texts.
Real-life text(s) vs. composed text(s)
Another decision to make was whether to use real-life texts, i.e. texts written by other people and not necessarily specifically for this test, or to write our own texts specifically for this test. Using real-life texts could strengthen the test, first of all because it would ensure that the texts reflect real life (at least to a certain degree), and secondly because it could reduce the risk of our own assumptions about the influence of the mother tongue affecting the test. Ideally, the texts used should involve or revolve around the participants’ daily lives now, i.e. university and perhaps work.
This might make it easier for them to relate to the texts. On the other hand, it would be very difficult and time-consuming to find and select one or more texts that would encompass the elements desired for testing. Furthermore, it might not be possible to establish who the author of a text actually was and how this might have affected its language. There was also the risk that using texts that either included elements from the participants’ university lives or revolved around them, e.g. texts with specific references to professors or locations, might leave the participants feeling too exposed and might make them reluctant to give a true answer if they felt that this answer would somehow compromise them in relation to their studies.
As a result, I chose to compose the texts for the test myself in discussion with the rest of the GEBCom group. This would make it easier to integrate the elements I wanted to test and to control exactly where and how they appear. By using fictive texts with fictive persons and circumstances, the participants might also not feel uneasy or uncomfortable in relation to their professors or fellow students and might be able to give an answer that reflects their comprehension of the text rather than their concern for their studies. The problem with composing the texts ourselves is that it could be said to compromise the validity of the test, as we might let our own assumptions affect the texts.
Regarding method for assessing reading comprehension
Having more or less established the sorts of texts to be included in the GEBCom reception test, there was still the question of which method to use to best assess the comprehension of these texts.
In the following I shall briefly discuss some of the methods used for assessing reading comprehension, i.e. which questions to ask and what sort of answers to ask for.
Multiple-choice test
In a multiple-choice test the participant first reads a text, is then asked a question (or several) and given a list of answers to choose from. It has been suggested (e.g. by Shohamy 1984) that the multiple-choice format is easier for the participant, especially for the less proficient participant, than other methods requiring the participant to actively produce text, e.g. open-ended questions or written recall. The amount and format of the data resulting from multiple-choice questions are also more easily manageable and comparable than individually produced data. In addition, this sort of data lends itself more easily to statistical calculations, if possible or desirable. A considerable advantage is the fact that the participant does not need to produce text to express his/her comprehension. He/she merely needs to make a choice. This is important since asking the participant to produce text in order to test his/her reception of text makes it difficult to know whether his/her answers are truly a reflection of reading comprehension or merely a result of writing skills.
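The comparability of multiple-choice data across mother-tongue groups could be operationalised along the following lines. This is a minimal sketch with purely invented answer data; the function name, participant IDs and option labels are illustrative assumptions, not part of the actual GEBCom analysis.

```python
from collections import Counter

def answer_distribution(responses):
    """Tally how often each multiple-choice option was chosen.

    `responses` maps participant IDs to the option they selected
    (e.g. "a", "b"). The data below is purely illustrative.
    """
    counts = Counter(responses.values())
    total = sum(counts.values())
    return {option: counts[option] / total for option in counts}

# Hypothetical answers to one question from two mother-tongue groups.
danish = {"p1": "a", "p2": "a", "p3": "b", "p4": "a"}
chinese = {"p5": "b", "p6": "b", "p7": "a", "p8": "b"}

print(answer_distribution(danish))   # option "a" dominates
print(answer_distribution(chinese))  # option "b" dominates
```

Because every participant chooses from the same closed set of options, two such distributions can be compared directly, which is what makes this format amenable to statistical treatment.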
On the other hand, a multiple-choice test is quite limiting in the sense that it very much controls the participant’s ability to express his/her comprehension of the test. We might risk losing out on valuable information because there might be (and probably would be) elements to comprehension that we simply could not imagine and therefore had not included in the possible answers. There would also be a risk that the possible answers we provided actually affected the participant’s comprehension of the text, perhaps encouraging him/her to change his/her initial answer to a more strategic answer, which would naturally compromise the result. Some researchers, e.g. Lee &
Riley (1996) and Chiramanee & Currie (2010), argue that a multiple-choice test is not a valid format for testing reading comprehension because it focuses on specific, isolated bits of the text rather than the overall integrated understanding of the text.
Cloze test
In a cloze test the participant is given a text where some of the words have been replaced by blanks. It is then the participant’s task to fill in the blanks, either on his/her own or by choosing from a list of options. The cloze test seems to be a popular method of testing reading comprehension in a second language and forms part of many of the official English tests. Sharp (2010) argues that when using a rational rather than a fixed deletion pattern, a cloze test “correlates very highly with other L2 reading assessment procedures” (Sharp 2010, 479). In contrast, Lee & Riley (1996) argue that, as with the multiple-choice test, a cloze test does not validly test reading comprehension because it focuses on isolated bits of text comprehension rather than on an integrated comprehension of the text. Moreover, it may not be the best-suited method for the GEBCom reception test, as it seems to yield a strong focus on right/wrong answers rather than on differences in answers.
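The difference between the two deletion patterns mentioned above can be made concrete. The sketch below implements a fixed deletion pattern (every n-th word becomes a blank); a rational pattern would instead select the words by hand, e.g. targeting the grammatical elements under investigation. The function name and sample sentence are my own illustrations, not taken from any published cloze test.

```python
def fixed_deletion_cloze(text, n=7, blank="____"):
    """Replace every n-th word with a blank (a fixed deletion pattern).

    A rational deletion pattern would instead pick words deliberately,
    e.g. the specific grammatical elements a researcher wants to test.
    """
    words = text.split()
    for i in range(n - 1, len(words), n):
        words[i] = blank
    return " ".join(words)

# Illustrative sentence, not from an actual test.
sample = ("The participant reads a short text and then fills in the "
          "blanks either freely or by choosing from a list of options.")
print(fixed_deletion_cloze(sample))
```

The mechanical nature of the fixed pattern is exactly why Sharp (2010) prefers rational deletion: which words are blanked determines what the test actually measures.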
Open-ended questions
As indicated by their name, open-ended questions are questions with no fixed answer. Upon reading a text the participant is asked a question and is free to produce any answer he/she likes.
Contrary to a multiple-choice or cloze test, the participant is not restricted in his/her answer but is free to include anything that he/she feels is relevant. There is a chance this would yield more diverse and interesting data, hopefully giving way to a more profound understanding of the actual comprehension of the text. On the other hand, the participant is still restricted by the formulation of the question in the sense that he/she is unable to go beyond the frame of the question. This could perhaps be avoided, or at least to some extent controlled, by framing more general questions.
However, making the questions too general could make it difficult to compare the answers. It is also worth mentioning that Shohamy (1984) found that open-ended questions proved more difficult for less proficient L2 learners because they had to produce text to express their understanding of text. There is a risk that this might distort or skew the data. As mentioned, one of the major difficulties with testing reading comprehension is precisely the issue of using text production to test text reception. In the end we cannot really know whether what we are analysing is the result of reading comprehension or of text production skills.
Written recall and written summaries
Another test method is written recall. Upon reading a text the participant is asked to write down everything and anything he/she remembers about the text. A written summary is similar to a written recall, except that the participant is requested to structure his/her recall according to importance, i.e. produce a summary of the text read. Written recall and written summaries are praised by some researchers (Sharp 2010, Lee & Riley 1996), who argue that they produce a more accurate assessment of reading comprehension than e.g. multiple-choice or cloze tests because they leave it to the participants themselves to structure their answers. However, written recall in particular has been criticised for favouring quantity of recall rather than quality, which according to Lee & Riley (1996) cannot be said to truly reflect reading comprehension. For this reason they find the written summary better suited, as it encourages the participant to focus on whatever he/she finds most important, thus, according to Lee & Riley (1996), resembling a more natural reading situation. One could argue, as does Sharp (2010), that the focus on quantity rather than quality in written recall is avoided if data from written recall is categorised according to both the number of idea units in the recall and the importance of the idea units included. Regardless of whether one chooses written recall or written summaries, a real challenge with this method is, as with open-ended questions, that it requires a certain amount of text production skills from the participant. And, as with open-ended questions, this again leads to the problem of telling whether differences in answers should be attributed to differences in comprehension or in fact to differences in text production skills.
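The idea-unit categorisation attributed to Sharp (2010) above could be operationalised roughly as follows. This is a sketch under strong simplifying assumptions: in practice idea units are coded manually by the researcher, whereas here each unit is crudely stood in for by a keyword, and the weights and example sentences are invented for illustration.

```python
def score_recall(recall_text, idea_units):
    """Score a written recall by idea units, weighted by importance.

    `idea_units` maps a keyword (a crude stand-in for manually coded
    idea units) to an importance weight. Returns both the count of
    recalled units and their importance-weighted sum.
    """
    text = recall_text.lower()
    matched = {unit: w for unit, w in idea_units.items() if unit in text}
    return {
        "units_recalled": len(matched),
        "weighted_score": sum(matched.values()),
    }

# Illustrative idea units for a hypothetical text, weighted 1-3.
units = {"deadline": 3, "meeting": 2, "coffee": 1}
recall = "She mentioned a meeting and that the deadline had been moved."
print(score_recall(recall, units))
```

Reporting both figures is what addresses the quantity/quality criticism: a long recall of trivia scores high on count but low on weight, while a short recall of the central points does the opposite.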
The GEBCom reception test method
Having discussed the various methods and their advantages and disadvantages within the GEBCom team, I chose a multiple-choice format as the method for assessing the comprehension of the texts. This first of all relates to the key issue previously raised about reading comprehension being only indirectly assessable. By using a multiple-choice format, we reduced the amount of text that the participant would have to produce, hopefully yielding answers that are less affected by production skills and thus a better assessment of actual comprehension.
Having to read and select an answer is still a process that can be said to impact comprehension; it still places a kind of filter over the actual comprehension. However, it is at least less of a filter than the text production required for e.g. open-ended questions or written recall.
As mentioned, the multiple-choice format has been heavily criticised for focusing too much on isolated bits of information rather than on the overall comprehension of the text (Lee & Riley 1996). Although I do not necessarily disagree with this, I think it might be less of a problem when assessing comprehension in terms of plus/minus differences across mother tongues than when assessing comprehension in the sense of plus/minus comprehension. By focusing on grammatical elements rather than rhetorical patterns, we are at the very core of our research design focusing on specific elements as opposed to an overall understanding, so employing a method that does just this need not be a problem. What may still pose a problem, though, is the risk of restricting the participants in the expression of their comprehension, and thus the possibility of losing out on valuable information. To try to avoid this, we engaged in several discussions about the formulation of the questions as well as the possible answers and adapted these according to input from the pilot tests.
For the first pilot test, I used mainly multiple-choice questions but also included several open-ended ones. However, feedback from the participants showed that the open-ended questions were far from a success. The answers provided were very short and not very insightful. Furthermore, almost all participants reported that the open-ended questions made the test too strenuous and too long. I therefore chose to leave them out, and for the final test all answers were multiple choice.
Naturally the language of the text(s) should be English; however, whether the instructions, questions and even the possible answers should be in English as well, or in the participants’ mother tongues, is a question worth addressing, albeit rather briefly. Keeping the questions and possible answers in the participants’ respective mother tongues could reduce possible misunderstandings of the questions, and it might make the test easier, especially for the less proficient L2 reader. This view is supported by Shohamy (1984), who hypothesises that posing questions in the participants’
mother tongue and allowing them to answer in their mother tongue might even give a more realistic assessment of the participants’ reading comprehension as it reduces the ‘noise’ or the filter that their answers would go through, assuming that the participants do a mental translation of the questions from English to their mother tongue and of their own answers from mother tongue to English.
On the other hand, keeping the questions and answers in the participants’ mother tongues would raise several issues in terms of validity. There is the obvious problem of translation. How could we produce an adequate translation of the questions to ensure that participants of different mother tongues were given the same starting ground? Could they even be said to take the same test if the questions and answers were in different languages? And how would we then compare the answers? How could we ensure that the options were the same across mother tongues, i.e. could we ever really be sure that possible differences in answers were caused by differences in comprehension and not in translation? Furthermore, one could also argue that by keeping questions and answers in the participants’ mother tongues we would in fact be keeping them in the mindset of their mother tongues, affecting their comprehension and even compromising the validity of the test. This is of particular importance, as I am working from the assumption that our mother tongue might influence the way we comprehend texts and that this influence could pervade
into English and affect our comprehension of English texts. I therefore need to ensure that the test does not provoke any differences in comprehension but only allows them to be disclosed if present.
Hence, the questions and possible answers of the GEBCom reception test are kept in English.
This may require participants of a certain proficiency. Working with university students who were all in some way or another familiar with English, we felt quite confident that this would be the case. We therefore chose to keep the language in English throughout the test, hoping to ease comparability across participants of different mother tongues.
The GEBCom reception test was pilot tested several times on volunteers as well as on friends and family, all from different linguistic backgrounds. The feedback from the pilot tests was taken seriously and discussed at meetings with the rest of the GEBCom team, and the necessary amendments were made. We were given feedback both on the formulation of the texts and their questions and on the actual design of the test, i.e. which elements worked and which did not.