Technology and Assessment
General
These are the meeting summary notes from an NSF panel
on technology and assessment held at the REC-ROLE principal investigators
meeting, Arlington, VA, May 16, 2002. The meeting was chaired by Jim Minstrell
and Earl Hunt and attended by about 17 participants. Three sessions were held,
two in the morning and one in the afternoon. This summary is drawn from
extensive notes taken by Earl Hunt and Larry Sutter.
The notes in italics are those of Hunt and
Minstrell and are not necessarily the views of other participants in the
meeting.
Morning Session
Opening Remarks
The participants briefly reviewed some of their
current research projects involving technology and assessment. These ranged
from computer-simulated "dry labs" in biology to programs that monitor student
e-mail (with the students' knowledge) as the students discuss ways to construct
computer graphics. It was pointed out that hand-held devices will offer new
opportunities for technologically based teaching and assessment.
There was agreement that technological advancements,
primarily involving computer-delivered instruction and assessment, can serve to
organize classroom discussions. Learning improves as a result, although it is not
clear whether this is due to the improved computer-mediated techniques or due
to the initiation of student discussions, which then produce the improved
learning.
The point was made that any teaching reform must
provide an assessment. Therefore computer-oriented teaching methods must
include some method of assessment. Ideally assessment within a technologically
delivered curriculum project will be consistent with and linked to assessment
by the classroom teacher and other ongoing assessment efforts of the school
system.
Problems Discussed
Most projects studying the use of educational
technologies in the school, including assessment, involve a research group and
a fairly small number of highly motivated teachers. In many cases the research
group supplies the necessary equipment. The teachers are then trained in its
use. When assessment (or any technology) is scaled up to the level of a school
system or beyond, assessment must fit into whatever context happens to be in
the school system, including limited time for teacher training and non-uniform,
often obsolescent hardware and software. For these reasons research and
development programs that elicit a great deal of enthusiasm, and even
objectively measured success, may be difficult to scale up for general
adoption.
The discussion then moved to the context of
assessment. Presently most assessment is viewed as an "event," some obvious
particular situation in the classroom. The consensus was that technological
assessment should be integrated into classroom activities, i.e. that it should
be focussed more on formative than summative assessment. However, the point was
made that the formative assessments should be coordinated with summative ones.
Because a formative assessment is supposed to suggest instructional
intervention that a teacher should/could do, teachers probably need to
understand both the assessment methods and the psychological and pedagogical
theories behind them. Therefore the new assessment methods cannot simply be
handed to teachers; the teachers must learn to use them effectively. This is a
major constraint on the use of technological assessment methods, because
teacher training time is a very limited commodity. The point was made that
other countries devote more time to teacher training and self-improvement than
is usually the case in the United States.
The point was also made that, as is the case for any
assessment, the assessment should be carefully coordinated with learning goals;
you do not know what or how to assess unless you have defined what you want
students to know, including facts, concepts, and procedures. Both the
assessment developers and the teachers have to be in agreement about the nature
of learning desired, and how to achieve it. This is important because
assessment methods will, inevitably, be a powerful guide to instructional
practice. Teachers will focus on what the assessment method assesses. If the
assessment technique is in agreement with educational goals, that is a good
thing. If the assessment technique distorts the balance of goals, that is a bad
thing.
In respect to this, there appeared to be general
agreement that assessment should go beyond "right-wrong" grading to a more
diagnostic approach, in which students' current conceptions are identified. The
diagnosis is then used to guide further instruction and learning. Student
self-assessment, as well as teacher assessment, is an important part of the
effort. Technologically based assessment can help here.
A major problem, as was frequently pointed out, is
that teachers are severely time-constrained. This impacts on assessment
technology in two ways. The teacher has limited time to read the results of
diagnostic assessment, and equally limited time to prepare or offer individual
or class instruction based on this assessment. It was also pointed out that
present-day "master teachers" are teachers who have developed teaching methods
that fit into the constraints of the present system. Therefore any new
assessment method must off-load diagnostic and instructional burdens from the
teacher, rather than presenting a teacher with further time demands.
This gets into the whole process of diffusion and
capital investment, a problem which people talk about a lot, but no one seems
to do anything about. Indeed, I am not sure that researchers, qua researchers,
have much leverage over this problem. Nor is it clear that, under the American
system, either NSF or the US government has very much of a role. Here is the
issue.
One can envisage an instructional assessment system
that does off-load burden from a teacher, once the teacher has had time to
master it. However, teachers do not know that this will happen until they take
the time to master the system, and time is just what teachers do not have. Hunt
offered an analogy to the introduction of military technology into the armed
services. Such technology is introduced when units are in rear-areas, on
"training and reconstruction" missions rather than in combat. Teachers, unlike
the infantry, are always in combat. The system simply does not provide enough
professional development time to allow teachers to master the new technologies.
In less military terms, what we have is a classic
example of capital investment. Attitude and motivation are really important.
Let's grant that the school system is NEVER going to provide enough paid
professional development time to allow for mastery of new technologies. Should
the teacher dip into his/her personal time? Is the likely return worth the
up-front cost? In fact, it may be rational for teachers to not invest personal
time. Unlike industrialists, teachers do not perceive any risk in failure to
invest in intellectual capital. Suppose I am a teacher. So the guy in the next
school, or the next school district, is teaching better than I am. How does
that affect my pay check? Of course, there will always be the band of classic
'dedicated teachers.' But if you want to improve a system on a day-in, day-out
basis do not rely on heroic efforts.
One possibility is to embed the considerable
training in preservice teacher education. The problem here is to obtain the
buy-in from the preservice institution. In general, colleges of education
already feel they have too much to "cover" in the little time they have with
preservice teacher trainees. But, this might be an effort NSF could support
through the Math-Science Partnership (MSP) Program or through the Centers for
Learning and Teaching (CLT) Program.
Afternoon Session
Topics Discussed
Several issues mentioned above were "revisited" but
will not be discussed here, as basically the same things were said about these
issues. An exception was the viewpoint strongly expressed by several
individuals that the high-stakes assessments being dictated at the state and
national level were often much narrower than the stated aims of educational
programs. This was seen as a serious concern because of the way that
instructional practice will respond to this operationalization of education
goals through actual high-stakes assessment, rather than responding with
appropriate action to the abstract statement of these goals in a 'mission
statement.' See comments above.
The panel discussed what research goals NSF should
pursue. There was general agreement that more attention had to be paid to the
process of adoption and diffusion, with a focus on how successful teachers had
used technology. The sorts of studies required are analogous to studies that
have been done of diffusion of ideas in agriculture and public health.
It was pointed out that high technology assessment and
training technologies appear to be much more successful in industry than in the
educational system. Some of the reasons suggested were clearer learning goals,
better student motivation because mastery was linked to clearly perceived
rewards (e.g. promotion upon course completion in Navy technology schools or
increased salary in business) or because the training addressed problems that the users knew they
had. Other factors were also mentioned, including better financing and more
clearly defined instructional goals; e.g. maintaining a particular aircraft
rather than 'understanding aerodynamics.'
Following up on this idea, it was pointed out that
some of the most successful commercial products addressed to educational
issues, such as Accelerated Reader, Reader Rabbit, and The
Magic School Bus, are all for younger students and are clearly related to
instructional material and/or procedures that parents expect students to
master. They also address basic skills, such as reading, that parents and
student-users can understand and appreciate.
Finally, there was a discussion of security issues. If
technologically advanced assessment methods were used widely, computer-readable
records could, literally, follow a student around throughout his/her
educational career. This could be a great help, both to the educational system
and to the individual student. At the same time, such extensive record keeping
could lead to invasion-of-privacy issues.
The consensus of the meeting seemed to be that these
are essentially the same issues raised by financial record keeping. One has to
be aware of the issues involved, but they are manageable.
No system is ever going to be completely foolproof.
Excessive bureaucracy built around security issues can inhibit research,
to the point where the cost to progress in knowledge exceeds the marginal
benefits of extra security for the individual. NSF could do field researchers
a great benefit by making this point to agencies (primarily NIH) that issue
guidelines for institutional review boards (IRBs). At present the IRBs
essentially get the order to treat everything as if it were a personal medical
record. As a result, the educational benefits of high-technology record
evaluation are being hindered, without a material increase in realistic
security for the individual user.
Summary Comments (by Earl Hunt)
There was no clear-cut consensus about what the most
important funding priorities should be. There was agreement that perhaps the
most important aspect of technological assessment is the capacity to provide
immediate feedback that (a) improves self-assessment and (b) permits the
teacher to respond to individual differences. But how is this to be achieved?
Although several commentators said that technologies of instruction should be
built on psychological theory, very little specific was said about such a
theory beyond vague remarks about 'constructivism.' One summarizer, EH, was
disappointed (but not surprised) that there was essentially no discussion of
the well-developed psychometric literature on aptitude x treatment
interactions. The reading of that literature, which has a far stronger base
than research on "multiple intelligences" or "practical vs. academic
intelligence," argues for the use of highly structured programs in certain
learning situations.
No one saw NSF's role as being a major player in
accountability issues. Nevertheless, how can the NSF research program have much
effect beyond those schools directly involved in the research if NSF and its
funded researchers continue to be relatively uninvolved in high-stakes
accountability issues? As was pointed out, on-demand assessment programs can
drive (or at the least, constrain) teaching procedures. This is particularly
true of high-stakes assessment. One of the group members (Goldman) observed, in
approximately her words, that the high-stakes assessment train has already left
the station. If low-stakes assessments are proceeding on one theory of
learning, and high-stakes assessment on another, either low-stakes assessment
will be overridden or a compromise position will be reached. If the two systems
continue to be independent, intellectually, then the views of the high-stakes
assessment community will have far more influence on actual school practice
than will the views of the formative assessment community.
Several members of the panel stressed the point that
formative assessment should be coordinated with summative testing.
Nevertheless, with a few exceptions these researchers, and insofar as we know
NSF-sponsored researchers in general, seem to have fairly little interaction
with traditional test development companies. The same is true of the relation
between academic and educationally based researchers and companies that prepare
instructional material for industry. Active collaborations of this sort should
be encouraged. This might extend, at least, to ensuring that traditional,
summative test developers and industrial training firms had some representation
at meetings such as the one being summarized.
Exceptions might occur when the researchers appear
as consultants to state or district agencies who are the clients of the test
developers, but typically these relationships do not encourage test developers
to genuinely collaborate with researchers. Experience suggests the researchers
are politely ignored. ETS seems to be an exception to this general trend.