Learning and Education: Building Knowledge, Understanding Its Implications, May 15-17, 2002, Arlington, VA

Technology and Assessment

General

These are the meeting summary notes from an NSF panel on technology and assessment held at the REC-ROLE principal investigators meeting, Arlington, VA, May 16, 2002. The meeting was chaired by Jim Minstrell and Earl Hunt and attended by about 17 participants. Three sessions were held, two in the morning and one in the afternoon. This summary is drawn from extensive notes taken by Earl Hunt and Larry Sutter.

The notes in italics are those of Hunt and Minstrell and do not necessarily express the views of other participants in the meeting.

Morning Session

Opening Remarks

The participants briefly reviewed some of their current research projects involving technology and assessment. These ranged from computer-simulated "dry labs" in biology to programs that monitor student e-mail (with the students' knowledge) as the students discuss ways to construct computer graphics. It was pointed out that hand-held devices will offer new opportunities for technologically based teaching and assessment.

There was agreement that technological advancements, primarily involving computer-delivered instruction and assessment, can serve to organize classroom discussions. Learning improves as a result, although it is not clear whether this is due to the computer-mediated techniques themselves or to the student discussions they initiate, which then produce the improved learning.

The point was made that any teaching reform must provide an assessment. Therefore computer-oriented teaching methods must include some method of assessment. Ideally assessment within a technologically delivered curriculum project will be consistent with and linked to assessment by the classroom teacher and other ongoing assessment efforts of the school system.

Problems Discussed

Most projects studying the use of educational technologies in the school, including assessment, involve a research group and a fairly small number of highly motivated teachers. In many cases the research group supplies the necessary equipment. The teachers are then trained in its use. When assessment (or any technology) is scaled up to the level of a school system or beyond, it must fit into whatever context exists in that school system, including limited time for teacher training and non-uniform, often obsolescent hardware and software. For these reasons research and development programs that elicit a great deal of enthusiasm, and even objectively measured success, may be difficult to scale up for general adoption.

The discussion then moved to the context of assessment. Presently most assessment is viewed as an "event," an obvious, distinct occasion in the classroom. The consensus was that technological assessment should be integrated into classroom activities, i.e., that it should focus more on formative than on summative assessment. However, the point was made that the formative assessments should be coordinated with summative ones. Because a formative assessment is supposed to suggest instructional interventions that a teacher could or should make, teachers probably need to understand both the assessment methods and the psychological and pedagogical theories behind them. Therefore the new assessment methods cannot simply be handed to teachers; the teachers must learn to use them effectively. This is a major constraint on the use of technological assessment methods, because teacher training time is a very limited commodity. The point was made that other countries devote more time to teacher training and self-improvement than is usually the case in the United States.

The point was also made that, as is the case for any assessment, the assessment should be carefully coordinated with learning goals; you do not know what or how to assess unless you have defined what you want students to know, including facts, concepts, and procedures. Both the assessment developers and the teachers have to be in agreement about the nature of learning desired, and how to achieve it. This is important because assessment methods will, inevitably, be a powerful guide to instructional practice. Teachers will focus on what the assessment method assesses. If the assessment technique is in agreement with educational goals, that is a good thing. If the assessment technique distorts the balance of goals, that is a bad thing.

In this respect, there appeared to be general agreement that assessment should go beyond "right-wrong" grading to a more diagnostic approach, in which students' current conceptions are identified. The diagnosis is then used to guide further instruction and learning. Student self-assessment, as well as teacher assessment, is an important part of the effort. Technologically based assessment can help here.

A major problem, as was frequently pointed out, is that teachers are severely time-constrained. This affects assessment technology in two ways. The teacher has limited time to read the results of diagnostic assessment, and equally limited time to prepare or offer individual or class instruction based on this assessment. It was also pointed out that present-day "master teachers" are teachers who have developed teaching methods that fit into the constraints of the present system. Therefore any new assessment method must off-load diagnostic and instructional burdens from the teacher, rather than presenting the teacher with further time demands.

This gets into the whole process of diffusion and capital investment, a problem that people talk about a lot but that no one seems to do anything about. Indeed, I am not sure that researchers, qua researchers, have much leverage over this problem. Nor is it clear that, under the American system, either NSF or the US government has very much of a role. Here is the issue.

One can envisage an instructional assessment system that does off-load burden from a teacher, once the teacher has had time to master it. However teachers do not know that this will happen until they take the time to master the system, and time is just what teachers do not have. Hunt offered an analogy to the introduction of military technology into the armed services. Such technology is introduced when units are in rear-areas, on "training and reconstruction" missions rather than in combat. Teachers, unlike the infantry, are always in combat. The system simply does not provide enough professional development time to allow teachers to master the new technologies.

In less military terms, what we have is a classic example of capital investment. Attitude and motivation are really important. Let's grant that the school system is NEVER going to provide enough paid professional development time to allow for mastery of new technologies. Should the teacher dip into his/her personal time? Is the likely return worth the up-front cost? In fact, it may be rational for teachers not to invest personal time. Unlike industrialists, teachers do not perceive any risk in failure to invest in intellectual capital. Suppose I am a teacher. So the guy in the next school, or the next school district, is teaching better than I am. How does that affect my paycheck? Of course, there will always be the band of classic 'dedicated teachers.' But if you want to improve a system on a day-in, day-out basis, do not rely on heroic efforts.

One possibility is to embed the considerable training in preservice teacher education. The problem here is to obtain buy-in from the preservice institution. In general, colleges of education already feel they have too much to "cover" in the little time they have with preservice teacher trainees. However, this might be an effort NSF could support through the Math-Science Partnership (MSP) Program or through the Centers for Learning and Teaching (CLT) Program.

Afternoon Session

Topics Discussed

Several issues mentioned above were "revisited" but will not be discussed here, as basically the same things were said about them. An exception was the viewpoint, strongly expressed by several individuals, that the high-stakes assessments being dictated at the state and national level were often much narrower than the stated aims of educational programs. This was seen as a serious concern because instructional practice will respond to the way education goals are operationalized in actual high-stakes assessment, rather than to the abstract statement of these goals in a 'mission statement.' See comments above.

The panel discussed what research goals NSF should pursue. There was general agreement that more attention had to be paid to the process of adoption and diffusion, with a focus on how successful teachers had used technology. The sorts of studies required are analogous to studies that have been done of diffusion of ideas in agriculture and public health.

It was pointed out that high-technology assessment and training methods appear to be much more successful in industry than in the educational system. Some of the reasons suggested were clearer learning goals and better student motivation, either because mastery was linked to clearly perceived rewards (e.g., promotion upon course completion in Navy technology schools or increased salary in business) or because the training addressed problems that the users knew they had. Other factors were also mentioned, including better financing and more clearly defined instructional goals, e.g., maintaining a particular aircraft rather than 'understanding aerodynamics.'

Following up on this idea, it was pointed out that some of the most successful commercial products addressed to educational issues, such as Accelerated Reader, Reader Rabbit, and The Magic School Bus, are all for younger students and are clearly related to instructional material and/or procedures that parents expect students to master. They also address basic skills, such as reading, that parents and student-users can understand and appreciate.

Finally, there was a discussion of security issues. If technologically advanced assessment methods were used widely, computer-readable records could, literally, follow a student throughout his/her educational career. This could be a great help, both to the educational system and to the individual student. At the same time, such extensive record keeping could raise invasion-of-privacy issues.

The consensus of the meeting seemed to be that these are essentially the same issues raised by financial record keeping. One has to be aware of the issues involved, but they are manageable.

No system is ever going to be completely foolproof. Excessive bureaucracy built around security issues can inhibit research, to the point where the cost to progress in knowledge exceeds the marginal benefit of extra security for the individual. NSF could do field researchers a great service by making this point to agencies (primarily NIH) that issue guidelines for institutional review boards (IRBs). At present the IRBs are essentially ordered to treat everything as if it were a personal medical record. As a result, the educational benefits of high-technology record evaluation are being hindered, without a material increase in realistic security for the individual user.

Summary Comments (by Earl Hunt)

There was no clear-cut consensus about what the most important funding priorities should be. There was agreement that perhaps the most important aspect of technological assessment is the capacity to provide immediate feedback that (a) improves self-assessment and (b) permits the teacher to respond to individual differences. But how is this to be achieved? Although several commentators said that technologies of instruction should be built on psychological theory, very little that was specific was said about such a theory beyond vague remarks about 'constructivism.' One summarizer, EH, was disappointed (but not surprised) that there was essentially no discussion of the well-developed psychometric literature on aptitude x treatment interactions. His reading of that literature, which has a far stronger base than research on "multiple intelligences" or "practical vs. academic intelligence," is that it argues for the use of highly structured programs in certain learning situations.

No one saw NSF's role as being a major player in accountability issues. Nevertheless, how can the NSF research program have much effect beyond those schools directly involved in the research if NSF and its funded researchers continue to be relatively uninvolved in high-stakes accountability issues? As was pointed out, on-demand assessment programs can drive (or at the least, constrain) teaching procedures. This is particularly true of high-stakes assessment. One of the group members (Goldman) observed, in approximately her words, that the high-stakes assessment train has already left the station. If low-stakes assessments are proceeding on one theory of learning, and high-stakes assessment on another, either low-stakes assessment will be overridden or a compromise position will be reached. If the two systems continue to be independent, intellectually, then the views of the high-stakes assessment community will have far more influence on actual school practice than will the views of the formative assessment community.

Several members of the panel stressed the point that formative assessment should be coordinated with summative testing. Nevertheless, with a few exceptions, these researchers, and insofar as we know NSF-sponsored researchers in general, seem to have fairly little interaction with traditional test development companies. The same is true of the relation between academic and educationally based researchers and companies that prepare instructional material for industry. Active collaborations of this sort should be encouraged. This might extend, at least, to ensuring that traditional, summative test developers and industrial training firms have some representation at meetings such as the one being summarized.

Exceptions might occur when the researchers appear as consultants to state or district agencies that are the clients of the test developers, but typically these relationships do not encourage test developers to genuinely collaborate with researchers. Experience suggests the researchers are politely ignored. ETS seems to be an exception to this general trend.

Division of Research, Evaluation and Communication
National Science Foundation
4201 Wilson Boulevard • Arlington, Virginia • (703)292-8650