CHAPTER I
INTRODUCTION
A. Background
In the public
eye, tests have acquired an aura of infallibility in our culture of mass
producing everything, including the education of school children. Everyone
wants a test for everything, especially if the test is cheap, quickly
administered, and scored instantaneously. But we saw in Chapter 4 that while
the standardized test industry has become a powerful juggernaut of influence on
decisions about people's lives, it also has come under severe criticism from
the public (Kahn, 2000). A more balanced viewpoint is offered by Bailey (1998,
p. 204): "One of the disturbing things about tests is the extent to which
many people accept the results uncritically, while others believe that all
testing is invidious. But tests are simply measurement tools: It is the use to
which we put their results that can be appropriate or inappropriate."
It is clear by
now that tests are one of a number of possible types of assessment. In Chapter 1,
an important distinction was made between testing and assessing. Tests are
formal procedures, usually administered within strict time limitations, to
sample the performance of a test-taker in a specified domain. Assessment
connotes a much broader concept in that most of the time when teachers are
teaching, they are also assessing. Assessment includes all occasions from
informal impromptu observations and comments up to and including tests.
Early in
the decade of the 1990s, in a culture of rebellion against the notion that all
people and all skills could be measured by traditional tests, a novel concept
emerged that began to be labeled "alternative" assessment. As
teachers and students were becoming aware of the shortcomings of standardized
tests, "an alternative to standardized testing and all the problems found
with such testing" (Huerta-Macias, 1995, p. 8) was proposed. That proposal
was to assemble additional measures of students (portfolios, journals,
observations, self-assessments, peer-assessments, and the like) in an effort to
triangulate data about students. For some, such alternatives held "ethical
potential" (Lynch, 2001, p. 228) in their promotion of fairness and the
balance of power relationships in the classroom.
CHAPTER II
DISCUSSION
A. Performance-Based Assessment
Before proceeding
to a direct consideration of types of alternatives in assessment, a word about
performance-based assessment is in order. There has been a great deal of press
in recent years about performance-based assessment, sometimes merely called performance
assessment (Shohamy, 1995; Norris et al., 1998). Is this different from what is
being called "alternative assessment"?
The push toward
more performance-based assessment is part of the same general educational
reform movement that has raised strong objections to using standardized test
scores as the only measures of student competencies (see, for example, Valdez
Pierce & O'Malley, 1992; Shepard & Bliem, 1993). The argument, as you
can guess, was that standardized tests do not elicit actual performance on the
part of test-takers. If a child were asked, for example, to write a description
of earth as seen from space, to work cooperatively with peers to design a
three-dimensional model of the solar system, to explain the project to the rest
of the class, and to take notes on a videotape about space travel, traditional
standardized testing would be involved in none of those performances.
Performance-based assessment, however, would require the performance of
the above-named actions, or samples thereof, which would be systematically
evaluated through direct observation by a teacher and/or possibly by self and
peers.
Performance-based
assessment implies productive, observable skills, such as speaking and writing,
of content-valid tasks. Such performance usually, but not always, brings with
it an air of authenticity: real-world tasks that students have had time to develop.
It often implies an integration of language skills, perhaps all four skills in
the case of project work. Because the tasks that students perform are
consistent with course goals and curriculum, students and teachers are likely
to be more motivated to perform them, as opposed to a set of multiple-choice
questions about facts and figures regarding the solar system.
O'Malley and
Valdez Pierce (1996) considered performance-based assessment to be a subset of
authentic assessment. In other words, not all authentic assessment is
performance-based. One could infer that reading, listening, and thinking have
many authentic manifestations, but since they are not directly observable in
and of themselves, they are not performance-based. According to O'Malley and Valdez
Pierce (p. 5), the following are characteristics of performance assessment:
1. Students make a constructed response.
2. They engage in higher-order thinking, with open-ended tasks.
3. Tasks are meaningful, engaging, and authentic.
4. Tasks call for the integration of language skills.
5. Both process and product are assessed.
6. Depth of a student's mastery is emphasized over breadth.
Performance-based
assessment needs to be approached with caution. It is tempting for teachers to
assume that if a student is doing something, then the process has fulfilled its
own goal and the evaluator needs only to make a mark in the grade book that says
"accomplished" next to a particular competency. In reality, performances
as assessment procedures need to be treated with the same rigor as traditional
tests. This implies that teachers should:
· state the overall goal of the performance,
· specify the objectives (criteria) of the performance in detail,
· prepare students for performance in stepwise progressions,
· use a reliable evaluation form, checklist, or rating sheet,
· treat performances as opportunities for giving feedback and provide that feedback systematically, and
· if possible, utilize self- and peer-assessments judiciously.
To sum up,
performance assessment is not completely synonymous with the concept of
alternative assessment. Rather, it is best understood as one of the primary
traits of the many available alternatives in assessment.
B. Portfolios
One of the most popular
alternatives in assessment, especially within a framework of communicative
language teaching, is portfolio development. According to Genesee and Upshur
(1996), a portfolio is "a purposeful collection of students' work that
demonstrates their efforts, progress, and achievements in given areas" (p.
99). Portfolios include materials such as:
· essays and compositions in draft and final forms,
· reports, project outlines,
· poetry and creative prose,
· artwork, photos, newspaper or magazine clippings,
· audio and/or video recordings of presentations, demonstrations, etc.,
· journals, diaries, and other personal reflections,
· tests, test scores, and written homework and exercises,
· notes on lectures,
· self- and peer-assessments,
· comments, evaluations, and checklists.
The advantages of engaging students in portfolio development have
been extolled in a number of sources (Genesee & Upshur, 1996; O'Malley
& Valdez Pierce, 1996; Brown & Hudson, 1998; Weigle, 2002). A synthesis
of those characteristics gives us a number of potential benefits. Portfolios:
· foster intrinsic motivation, responsibility, and ownership,
· promote student-teacher interaction with the teacher as facilitator,
· individualize learning and celebrate the uniqueness of each student,
· provide tangible evidence of a student's work,
· facilitate critical thinking, self-assessment, and revision processes,
· offer opportunities for collaborative work with peers, and
· permit assessment of multiple dimensions of language learning.
Successful portfolio
development will depend on following a number of steps and guidelines:
1. State objectives clearly.
Pick one or more of the CRADLE attributes (collecting, reflecting, assessing,
documenting, linking, and evaluating) and specify
them as objectives of developing a portfolio. Show how those purposes are
connected to, integrated with, and/or a reinforcement of your already stated
curricular goals. A portfolio attains maximum authenticity and washback when it
is an integral part of a curriculum, not just an optional box of materials.
Show students how their portfolios will include materials from the course they
are taking and how that collection will enhance curricular goals.
2. Give guidelines on what materials to include.
Once the objectives have been determined, name the types of work
that should be included. There is some disagreement among "experts"
about how much negotiation should take place between student and teacher over
those materials. Hamp-Lyons and Condon (2000) suggested advantages for student
control of portfolio contents, but teacher guidance will keep students on
target with curricular objectives. It is helpful to give clear directions on
how to get started since many students will never have compiled a portfolio and
may be mystified about what to do. A sample portfolio from a previous student
can help to stimulate some thoughts on what to include.
3. Communicate assessment criteria to students.
This is both the most important aspect of portfolio development and the most
complex. Two sources (self-assessment and teacher assessment) must be incorporated
in order for students to receive the maximum benefit. Self-assessment should be
as clear and simple as possible. O'Malley and Valdez Pierce (1996) suggested
the following half-page self-evaluation of a writing sample (with spaces for
students to write) for elementary school English language students.
C. Journals
A journal
is a log (or "account") of one's thoughts, feelings, reactions,
assessments, ideas, or progress toward goals, usually written with little
attention to structure, form, or correctness. Learners can articulate their
thoughts without the threat of those thoughts being judged later (usually by
the teacher). Sometimes journals are rambling sets of verbiage that represent a
stream of consciousness with no particular point, purpose, or audience.
Fortunately, models of journal use in educational practice have sought to
tighten up this style of journal in order to give them some focus (Staton et
al., 1987). The result is the emergence of a number of overlapping categories
or purposes in journal writing, such as the following:
· language-learning logs
· grammar journals
· responses to readings
· strategies-based learning logs
· self-assessment reflections
· diaries of attitudes, feelings, and other affective factors
· acculturation logs
Most classroom-oriented journals are what have now come to be
known as dialogue journals, in which the teacher and the student interact
through dialogues or responses. For the best results, those
responses should be dispersed across a course at regular intervals, perhaps
weekly or biweekly. One of the principal objectives in a student's dialogue
journal is to carry on a conversation with the teacher. Through dialogue
journals, teachers can become better acquainted with their students, in terms
of both their learning progress and their affective states, and thus become
better equipped to meet students' individual needs.
The following journal entry from an advanced student from China,
and the teacher's response, is an illustration of the kind of dialogue that can
take place.
Dialogue journal sample
Teacher's response:
This is a powerful piece of writing because you really communicate
what you were feeling. You used vivid details, like "eating tasteless
noodles," "my head seemed to be broken" and "rice gruel,
which has to cook more than three hours and is very delicious." These make
it easy for the reader to picture exactly what you were going through. The
other strong point about this piece is that you bring the reader full circle by
beginning and ending with "the noodles."
Being alone when you are sick is difficult. Now, I know why you
were so quiet in class. If you want to do another entry related to this one,
you could have a dialogue with your "sick" self. What would your
"healthy" self say to the "sick" self? Is there some
advice that could be exchanged about how to prevent illness or how to take care
of yourself better when you do get sick? Start the dialogue with your
"sick" self speaking first.
It is important to turn
the advantages and potential drawbacks of journals into positive general steps
and guidelines for using journals as assessment instruments. The following
steps are not coincidentally parallel to those cited above for portfolio
development:
1. Sensitively introduce students to the concept of journal writing.
2. State the objective(s) of the journal.
The list of types of
journals at the beginning of this section may coincide with the following
examples of some purposes of journals:
· Language-learning logs.
· Grammar journals.
· Responses to readings.
· Strategies-based learning logs.
· Self-assessment reflections.
· Diaries of attitudes, feelings, and other affective factors.
· Acculturation logs.
3. Give guidelines on what kinds of topics to include.
4. Carefully specify the criteria for assessing or grading journals.
5. Provide optimal feedback in your responses.
McNamara (1998, p. 39) recommended three different kinds of
feedback to journals:
· cheerleading feedback, in which you celebrate successes with the students or
encourage them to persevere through difficulties,
· instructional feedback, in which you suggest strategies or materials, suggest
ways to fine-tune strategy use, or instruct students in their writing, and
· reality-check feedback, in which you help the students set more realistic
expectations for their language abilities.
In sum, how
do journals score on principles of assessment? Practicality remains relatively
low, although the appropriation of electronic communication increases
practicality by offering teachers and students convenient, rapid (and legible!)
means of responding. Reliability can be maintained by the journal entries
adhering to stated purposes and objectives, but because of individual
variations in writing and the accompanying variety of responses, reliability
may reach only a moderate level. Content and face validity are very high if the
journal entries are closely interwoven with curriculum goals (which in turn
reflect real-world needs). In the category of washback, the potential in
dialogue journals is off the charts!
D. Conferences and Interviews
Conferences are not limited to drafts of
written work. Including portfolios and journals discussed above, the list of
possible functions and subject matter for conferencing is substantial:
· commenting on drafts of essays and reports
· reviewing portfolios
· responding to journals
· advising on a student's plan for an oral presentation
· assessing a proposal for a project
· giving feedback on the results of performance on a test
· clarifying understanding of a reading
· exploring strategies-based options for enhancement or compensation
· focusing on aspects of oral production
· checking a student's self-assessment of a performance
· setting personal goals for the near future
· assessing general progress in a course
Conferences
must assume that the teacher plays the role of a facilitator and guide, not of
an administrator of a formal assessment. In this intrinsically motivating
atmosphere, students need to understand that the teacher is an ally who is
encouraging self-reflection and improvement. So that the student will be as
candid as possible in self-assessing, the teacher should not consider a
conference as something to be scored or graded. Conferences are by nature
formative, not summative, and their primary purpose is to offer positive
washback.
Genesee and
Upshur (1996, p. 110) offered a number of generic kinds of questions that may
be useful to pose in a conference:
· What did you like about this work?
· What do you think you did well?
· How does it show improvement from previous work? Can you show me
the improvement?
· Are there things about this work you do not like? Are there things
you would like to improve?
· Did you have any difficulties with this piece of work? If so,
where, and what did you do [will you do] to overcome them?
· What strategies did you use to figure out the meaning of words you
could not understand?
· What did you do when you did not know a word that you wanted to
write?
How do
conferences and interviews score in terms of principles of assessment? Their
practicality, as is true for many of the alternatives in assessment, is low because
they are time-consuming. Reliability will vary between conferences and
interviews. In the case of conferences, it may not be important to have rater
reliability because the whole purpose is to offer individualized attention,
which will vary greatly from student to student. For interviews, a relatively
high level of reliability should be maintained with careful attention to
objectives and procedures. Face validity for both can be maintained at a high
level due to their individualized nature. As long as the subject matter of the
conference/interview is clearly focused on the course and course objectives,
content validity should also be upheld. Washback potential and authenticity are
high for conferences, but possibly only moderate for interviews unless the
results of the interview are clearly folded into subsequent learning.
E. Observations
All teachers, whether they are aware of it
or not, observe their students in the classroom almost constantly. Virtually
every question, every response, and almost every nonverbal behavior is, at some
level of perception, noticed. All those intuitive perceptions are stored as
little bits and pieces of information about students that can form a composite
impression of a student's ability. Without ever administering a test or a quiz,
teachers know a lot about their students. In fact, experienced teachers are so
good at this almost subliminal process of assessment that their estimates of a
student's competence are often highly correlated with actual independently
administered test scores. (See Acton, 1979, for an example.)
How do all
these chunks of information become stored in a teacher's brain cells? Usually
not through rating sheets and checklists and carefully completed observation
charts. Still, teachers' intuitions about students' performance are not
infallible, and certainly both the reliability and face validity of their
feedback to students can be increased with the help of empirical means of
observing their language performance. The value of systematic observation of
students has been extolled for decades (Flanders, 1970; Moskowitz, 1971; Spada
& Fröhlich, 1995), and its utilization greatly enhances a teacher's
intuitive impressions by offering tangible corroboration of conclusions.
Occasionally, intuitive information is disconfirmed by observation data.
We will not
be concerned in this section with the kind of observation that rates a formal
presentation or any other prepared, prearranged performance in which the
student is fully aware of some evaluative measure being applied, and in which
the teacher scores or comments on the performance. We are talking about
observation as a systematic, planned procedure for real-time, almost
surreptitious recording of student verbal and nonverbal behavior. One of the
objectives of such observation is to assess students without their awareness
(and possible consequent anxiety) of the observation so that the naturalness of
their linguistic performance is maximized.
What kinds
of student performance can be usefully observed? Consider the following possibilities:
1. Potential observation foci
· sentence-level oral production skills (see microskills, Chapter 7)
  - pronunciation of target sounds, intonation, etc.
  - grammatical features (verb tenses, question formation, etc.)
· discourse-level skills (conversation rules, turn-taking, and other
macroskills)
· interaction with classmates (cooperation, frequency of oral
production)
· reactions to particular students, optimal productive pairs and
groups, which "zones" of the classroom are more vocal, etc.
· frequency of student-initiated responses (whole class, group work)
· quality of teacher-elicited responses
· latencies, pauses, silent periods (number of seconds, minutes,
etc.)
· length of utterances
· evidence of listening comprehension (questions, clarifications,
attention-giving verbal and nonverbal behavior)
· affective states (apparent self-esteem, extroversion, anxiety,
motivation, etc.)
· evidence of attention-span issues, learning style preferences,
etc.
· students' verbal or nonverbal response to materials, types of
activities, teaching styles
· use of strategic options in comprehension or production (use of
communication strategies, avoidance, etc.)
· culturally specific linguistic and nonverbal factors (kinesics;
proxemics; use of humor, slang, metaphor, etc.)
The list
could be even more specific to suit the characteristics of students, the focus
of a lesson or module, the objectives of a curriculum, and other factors. The
list might expand, as well, to include other possible observed performance. In
order to carry out classroom observation, it is of course important to take the
following steps:
1. Determine the specific objectives of the observation.
2. Decide how many students will be observed at one time.
3. Set up the logistics for making unnoticed observations.
4. Design a system for recording observed performances.
5. Do not overestimate the number of different elements you can
observe at one time; keep them very limited.
6. Plan how many observations you will make.
7. Determine specifically how you will use the results.
2. Observation checklist, student errors
Each of the 30-odd checklists that were eventually completed
represented a two-hour class period and was filled in with "ticks" to
show the occurrences and the follow-up in the appropriate cell.
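A tick-based checklist of this kind can be sketched in a few lines of Python. This is only a minimal illustration: the error types and follow-up categories below are hypothetical placeholders, since the rows and columns of the original checklist are not reproduced in this text, and the function names are invented for the example.

```python
from collections import Counter

# Hypothetical rows and columns; the original checklist's categories
# are not shown in the text, so these are placeholders only.
ERROR_TYPES = ("grammar", "vocabulary", "pronunciation")
FOLLOW_UPS = ("teacher corrected", "self-corrected", "ignored")


def new_checklist():
    """Return an empty checklist for one class period."""
    return Counter()


def tick(checklist, error_type, follow_up):
    """Add one tick to the cell for this error type and follow-up."""
    if error_type not in ERROR_TYPES or follow_up not in FOLLOW_UPS:
        raise ValueError("unknown checklist cell")
    checklist[(error_type, follow_up)] += 1


def totals_by_error(checklist):
    """Sum the ticks in each row (error type) across all follow-ups."""
    totals = Counter()
    for (error_type, _follow_up), count in checklist.items():
        totals[error_type] += count
    return totals


# One observed class period with three ticks.
period = new_checklist()
tick(period, "grammar", "teacher corrected")
tick(period, "grammar", "ignored")
tick(period, "pronunciation", "self-corrected")
print(totals_by_error(period))
```

Keeping one such record per class period would make it easy to compare periods later, but how the results are used should follow from the objectives set before observing.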
Rating scales have also been suggested for recording observations.
One type of rating scale asks teachers to indicate the frequency of occurrence
of target performance on a separate frequency scale (always = 5; never = 1).
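As a rough sketch of how such a frequency scale might be recorded, the Python below maps labels to numbers and averages several observations. Only the endpoints (always = 5; never = 1) come from the text; the three intermediate labels and the function names are assumptions made for illustration.

```python
# Only "always" = 5 and "never" = 1 are given in the text; the
# intermediate labels are assumed for the sake of the example.
FREQUENCY_SCALE = {
    "always": 5,
    "usually": 4,
    "sometimes": 3,
    "rarely": 2,
    "never": 1,
}


def score(label):
    """Convert a frequency label to its numeric value."""
    return FREQUENCY_SCALE[label.lower()]


def mean_rating(labels):
    """Average the numeric values of several observed ratings."""
    if not labels:
        raise ValueError("at least one rating is required")
    return sum(score(label) for label in labels) / len(labels)


# Four observations of one target performance for one student.
ratings = ["always", "sometimes", "never", "usually"]
print(mean_rating(ratings))  # (5 + 3 + 1 + 4) / 4 = 3.25
```

An average is only one way to summarize ratings; a teacher might equally keep the raw labels and look for change over time.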
F. Self- and Peer-Assessments
A
conventional view of language assessment might consider the notion of self- and
peer-assessment as an absurd reversal of politically correct power
relationships. After all, how could learners who are still in the process of
acquisition, especially the early processes, be capable of rendering an
accurate assessment of their own performance? Nevertheless, a closer look
at the acquisition of any skill reveals the importance, if not the necessity,
of self-assessment and the benefit of peer-assessment. What successful learner
has not developed the ability to monitor his or her own performance and to use
the data gathered for adjustments and corrections? Most successful learners
extend the learning process well beyond the classroom and the presence of a teacher
or tutor, autonomously mastering the art of self-assessment. Where peers are
available to render assessments, the advantage of such additional input is
obvious.
Self-assessment
derives its theoretical justification from a number of well-established principles
of second language acquisition. The principle of autonomy stands out as one
of the primary foundation stones of successful learning. The ability to set
one's own goals both within and beyond the structure of a classroom curriculum,
to pursue them without the presence of an external prod, and to independently
monitor that pursuit are all keys to success. Developing intrinsic motivation
that comes from a self-propelled desire to excel is at the top of the list of
successful acquisition of any set of skills.
Peer-assessment
appeals to similar principles, the most obvious of which is cooperative
learning. Many people go through a whole regimen of education from kindergarten
up through a graduate degree and never come to appreciate the value of
collaboration in learning: the benefit of a community of learners capable of
teaching each other something. Peer-assessment is simply one arm of a plethora
of tasks and procedures within the domain of learner-centered and collaborative
education.
Researchers
(such as Brown & Hudson, 1998) agree that the above theoretical
underpinnings of self- and peer-assessment offer certain benefits: direct
involvement of students in their own destiny, the encouragement of autonomy,
and increased motivation because of their self-involvement. Of course, some
noteworthy drawbacks must also be taken into account. Subjectivity is a primary
obstacle to overcome. Students may be either too harsh on themselves or too
self-flattering, or they may not have the necessary tools to make an accurate assessment.
Also, especially in the case of direct assessments of performance (see below),
they may not be able to discern their own errors. In contrast, Bailey (1998)
conducted a study in which learners showed moderately high correlations
(between .58 and .64) between self-rated oral production
ability and scores on the OPI, which suggests that in the assessment of
general competence, learners' self-assessments may be more accurate than one
might suppose.
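For readers unfamiliar with correlation coefficients such as those reported above, the following Python sketch shows how a Pearson correlation between self-ratings and interview scores could be computed. It assumes a Pearson coefficient, which the text does not specify, and the numbers are invented purely for illustration; they are not data from the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length (at least 2 items)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when scores do not vary")
    return covariance / math.sqrt(spread_x * spread_y)


# Invented scores for five learners, for illustration only.
self_ratings = [2, 3, 3, 4, 5]
interview_scores = [1, 3, 2, 4, 4]
print(round(pearson_r(self_ratings, interview_scores), 2))
```

A value near 1 means learners who rate themselves higher also tend to score higher; a value near 0 means the self-ratings tell us little about the interview scores.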
1. Types of Self- and Peer-Assessment
a. Assessment of a specific performance.
In this category, a student typically
monitors himself or herself, in either oral or written production, and renders some
kind of evaluation of performance. The evaluation takes place immediately or
very soon after the performance. Thus, having made an oral presentation, the
student (or a peer) fills out a checklist that rates performance on a defined
scale. Or perhaps the student views a video-recorded lecture and completes a
self-corrected comprehension quiz. A journal may serve as a tool for
such self-assessment. Peer editing is an excellent example of direct
assessment of a specific performance.
Today, the availability
of media opens up a number of possibilities for self- and peer-assessment beyond
the classroom. Internet sites such as Dave's ESL Cafe (http://www.eslcafe.com/)
offer many self-correcting quizzes and tests. On this and other similar sites,
a learner may access a grammar or vocabulary quiz on the Internet and then
self-score the result, which may be followed by comparing with a partner.
Television and film media also offer convenient resources for self- and
peer-assessment. Gardner (1996) recommended that students in
non-English-speaking countries access bilingual news, films, and television
programs and then self-assess their comprehension ability. He also noted that
video versions of movies with subtitles can be viewed first without the
subtitles, then with them, as another form of self- and/or peer-assessment.
b. Indirect assessment of [general] competence.
Indirect self- or
peer-assessment targets larger slices of time with a view to rendering an
evaluation of general ability, as opposed to one specific, relatively
time-constrained performance. The distinction between direct and indirect
assessments is the classic competence-performance distinction. Self- and
peer-assessments of performance are limited in time and focus to a relatively
short performance. Assessments of competence may encompass a lesson over
several days, a module, or even a whole term of course work, and the objective
is to ignore minor, nonrepeating performance flaws and thus to evaluate general
ability. A list of attributes can offer a scaled rating, from "strongly
agree" to "strongly disagree," on such items as these:
In a successful
experiment to introduce self-assessment in his advanced intermediate
pre-university ESL class, Phillips (2000) created a questionnaire (Figure 10.2)
through which his students evaluated themselves on their class participation.
The items were simply formatted with just three options to check for each
category, which made the process easy for students to perform. They
completed the questionnaire at midterm, which was followed up immediately with
a teacher-student conference during which students identified weaknesses and
set goals for the remainder of the term.
Of course, indirect
self- and peer-assessment is not confined to scored rating sheets and
questionnaires. An ideal genre for self-assessment is through journals, where
students engage in more open-ended assessment and/or make their own further
comments on the results of completed checklists.
c. Metacognitive assessment [for setting goals]
Some
kinds of evaluation are more strategic in nature, with the purpose not just
of viewing past performance or competence but of setting goals and maintaining
an eye on the process of their pursuit. Personal goal-setting has the advantage
of fostering intrinsic motivation and of providing learners with that
extra-special impetus from having set and accomplished one's own goals.
Strategic planning and self-monitoring can take the form of journal entries,
choices from a list of possibilities, questionnaires, or cooperative (oral)
pair or group planning.
On the back of this same card, which was filled out at the end
of the week, was the student's self-assessment:






