[Federal Register Volume 61, Number 96 (Thursday, May 16, 1996)]
[Notices]
[Pages 24765-24771]
From the Federal Register Online via the Government Publishing Office [www.gpo.gov]
[FR Doc No: 96-12264]



=======================================================================
-----------------------------------------------------------------------

DEPARTMENT OF EDUCATION


National Assessment Governing Board; Opportunity for Comment

AGENCY: National Assessment Governing Board; Education.

ACTION: Notice of public meetings and request for comments.

-----------------------------------------------------------------------

SUMMARY: The National Assessment Governing Board announces the 
opportunity for public review and comment on a proposed policy for the 
redesign of the National Assessment of Educational Progress. Comments 
may be provided orally by participating in one of two public meetings 
described below or in writing. The Governing Board, in accordance with 
its statutory responsibility to ``take appropriate actions needed to 
improve the form and use of the National Assessment,'' has developed 
the proposed policy following an 18-month period of deliberation, 
involving review of commissioned papers, meetings with interested 
groups, and advice from experts. The proposed policy follows below.
    The period for submitting comments in writing begins with the 
publication of this notice; only comments received by June 28, 1996 
will be considered. Comments should be mailed to Ray Fields, Assistant 
Director for Policy and Research, 800 North Capitol Street NW., Suite 
825, Washington, DC, 20002-4233.
    The purpose of the two public meetings is to give individuals and 
groups an opportunity to discuss the

[[Page 24766]]

proposed policy with representatives of the Governing Board and to 
present their views. The Governing Board will consider the information 
obtained through these discussions and through written comments before 
taking action on a final policy statement to guide the redesign of the 
National Assessment.
    The two public meetings are scheduled as follows:
    Date: June 14, 1996.
    Time: 9:30 am to 12:00 noon.
    Place: The Madison Hotel, 15th and M Streets NW., Washington, DC 
(202) 862-1600.
    Date: June 17, 1996.
    Time: 9:30 am to 12:00 noon.
    Place: Park Hyatt Hotel, 800 North Michigan Ave., Chicago, IL (312) 
280-2222. Persons who wish to participate in these public meetings must 
register by 4:30 pm (Eastern Time), June 7, 1996. Persons who register 
may be assigned a specific time to appear. To register for the meeting, 
call 1-800-638-2784, extension 8623.

FOR FURTHER INFORMATION CONTACT:
Ray Fields, Assistant Director for Policy and Research, 800 North 
Capitol Street, N.W., Suite 825, Washington, DC, 20002-4233. Telephone: 
(202) 357-0395.

SUPPLEMENTARY INFORMATION: The National Assessment of Educational 
Progress is the primary means by which the public is able to know how 
students in grades 4, 8 and 12 are achieving nationally and state-by-
state. The National Assessment Governing Board is established to 
formulate policy guidelines for the National Assessment. The National 
Assessment and its Governing Board are authorized under sections 411 
and 412, respectively, of the Improving America's Schools Act of 1994, 
(Pub. L. 103-382).
    At its May 10 meeting, the Governing Board gave approval to 
disseminate the proposed policy for public comment, to be obtained both 
through submitted written comments and through the conduct of public 
meetings to discuss the proposed policy. The public comment period 
closes on June 28, 1996. Only comments received by June 28, 1996 will 
be considered. The Governing Board intends to take action on a final 
policy at its meeting scheduled for August 2-3, 1996, in Washington, 
DC.
    Records are kept of all Board proceedings, and are available for 
public inspection at the National Assessment Governing Board, 800 North 
Capitol Street NW., Suite 825, Washington, DC, from 8:30 am to 5:00 pm, 
Monday through Friday.

Proposed Policy Statement for the National Assessment of Educational Progress

Redesigning the National Assessment of Educational Progress

A Better Way To Measure Educational Progress in America

    An effective democracy and a strong economy require well-educated 
citizens. A good education lays a foundation for getting a good job, 
leading a fulfilling life, and participating constructively in society.
    But is the education provided in your State and in America good 
enough? How do our 12th graders compare with students in other nations 
in mathematics and science? Do our 8th grade students have an adequate 
understanding of the working of our constitutional democracy? How well 
do our 4th grade students read, write, and compute? The National 
Assessment of Educational Progress is the only way for the public to 
know with accuracy how American students are achieving nationally and 
state-by-state.
    The National Assessment tests at grades 4, 8 and 12. By law, it 
covers ten subjects, including reading, writing, math and science. The 
National Assessment has performance standards that indicate whether 
student achievement is ``good enough.'' The National Assessment is not 
a national exam taken by all students. In fact, only several thousand 
students are tested per grade, comprising carefully drawn samples that 
represent the nation and the participating states. Since its first test 
in 1969, the National Assessment has earned a trusted reputation for 
its quality and credibility. That reputation must be maintained.
    The National Assessment is unique because of its national, state-
by-state, and 12th grade results. State and local test results cannot 
be used to provide a national picture of student achievement. States 
and local schools use different tests that vary in many ways. The 
results cannot simply be ``added up'' to get a national score nor can 
state scores on their different tests be compared. Virtually no state 
tests 12th graders, so the only source of information about 12th grade 
achievement is the National Assessment. College entrance tests such as 
the ACT and the SAT are taken only by students planning on higher 
education; the results do not represent the achievement of the total 
12th grade class. Twelfth grade achievement is important to monitor 
because it marks the end of elementary and secondary education, the 
transition point for most students from school to work, to college, or 
to technical training.
    While there is much about the National Assessment that is working 
well, there is a problem. Under its current design, the National 
Assessment tests too few subjects, too infrequently, and reports 
achievement results too late--as much as 18 to 24 months after testing. 
Testing occurs every other year. During the 1990's, only reading and 
mathematics will be tested more than once using up-to-date tests and 
performance standards. Six subjects will be tested only once and two 
subjects not at all during the 1990's.
    Why is the National Assessment testing so few subjects and fewer 
subjects now than years ago? Over the years, the National Assessment 
has become increasingly complex. Its quality and integrity have led to 
a multitude of demands and expectations beyond its central purpose. 
Meeting those expectations was done with good intentions and seemed 
right for the situation at the time. However, additions to the National 
Assessment have been ``tacked on'' without changing the basic design, 
reducing the number of subjects that can be tested and driving up 
costs.
    For example, where a single 120 page mathematics report once 
sufficed, mathematics reporting in 1992 consisted of seven volumes 
totalling almost 1,800 pages, not including individual state reports. 
Also, there are now two separate testing programs for reading, writing, 
math and science. One monitors trends using tests developed during the 
1970's; the other reflects current views on instruction and uses 
performance standards to report whether achievement is good enough. In 
addition, there are separate samples for reporting national and state 
results, even when the state samples may be adequate for some national 
reports.
    The current National Assessment design is overburdened, inefficient 
and redundant. It is unable to provide the frequent, timely reports on 
student achievement the American public needs. The challenge is to 
supply more information, more quickly, with the funding available.
    To meet this challenge, the National Assessment design must be 
changed, building on its strengths while making it more efficient. The 
design of the National Assessment must be simplified. The purpose of 
the National Assessment must be sharply focused and its principal 
audience clearly defined. Because the National Assessment cannot do all 
that some would have it do, trade-offs must be made among desirable 
activities. Useful but less important activities may have to be 
reduced, eliminated, or carried out by others. The National Assessment 
must ``stick to its knitting'' in order to be more cost-effective, 
reach more of the

[[Page 24767]]

public, provide more information more promptly, and maintain its 
integrity.

(Following below are preliminary proposals for new policies for the 
National Assessment being offered for public comment by the National 
Assessment Governing Board. The intent of these proposals is to specify 
purposes, audiences, and changes that will make the National Assessment 
a more effective monitor of student achievement.)

Purpose of the National Assessment of Educational Progress

    The purpose of the National Assessment is stated in its 
legislation: to provide a fair and accurate presentation of educational 
achievement in reading, writing, and the other subjects included in the 
third National Education Goal, regarding student achievement and 
citizenship.
    Thus, the central concern of the National Assessment is to inform 
the nation on the status of student achievement. The National 
Assessment Governing Board believes that this should be accomplished 
through the following objectives:
    (1) To measure national and state progress toward the third 
National Education Goal and provide timely, fair and accurate data 
about student achievement at the national level, among the states, and 
in comparison with other nations;
    (2) To develop, through a national consensus, sound assessments to 
measure what students know and can do as well as what students should know 
and be able to do; and
    (3) To help states and others link their assessments with the 
National Assessment and use National Assessment data to improve 
education performance.

The Audience for the National Assessment

    The primary audience for National Assessment results is the 
American public, including the general public in states that receive 
their own results from the National Assessment. Reports should be 
written for this audience. Results should be released within 6 months 
of testing. Reports should be understandable, jargon free, easy to use, 
and widely disseminated.
    Principal users of National Assessment data are state policymakers 
and educators concerned with student achievement, curricula, testing 
and standards. National Assessment data should be available to these 
users in forms that support their efforts to interpret results to the 
public and to improve education performance.

What the National Assessment Is Not

    The National Assessment is intended to describe how well students 
are performing, but not to explain why. The National Assessment only 
provides group results; it is not an individual student test. The 
National Assessment tests academic subjects and does not collect 
information on individual students' personal values or attitudes. Each 
National Assessment test is developed through a national consensus 
process. This national consensus process takes into account education 
practices, the results of education research, and changes in the 
curricula. However, the National Assessment is independent of any 
particular curriculum and does not promote specific ideas, ideologies, 
or teaching techniques. Nor is the National Assessment an appropriate 
means, by itself, for improving instruction in individual classrooms, 
evaluating the effects of specific teaching practices, or determining 
whether particular approaches to curricula are working.

Recommended Changes to the National Assessment

    To provide the American public with more frequent information in 
more subjects about the progress of student achievement, changes must 
be made in the way that the National Assessment is designed and the 
results are reported. Many current policies should continue. 
Reliability, validity, and quality of data will remain a hallmark of 
the National Assessment. The sample of tested students will be as 
representative as possible, keeping to a minimum the number of students 
excluded because of disability or limited English proficiency. Tests 
and test frameworks will be kept stable to measure progress in student 
achievement over time.
    The recommended changes relate to the three objectives outlined 
above. Current contracts for conducting the National Assessment extend 
through 1998. Changes can be incorporated in assessments in the year 
1999 and thereafter. Where feasible, these recommendations should be 
used to guide decisions under current contracts.
    Objective 1: To measure national and state progress toward the 
third National Education Goal and provide timely, fair and accurate 
data about student achievement at the national level, among the states, 
and in comparison with other nations.
    Test all subjects specified by Congress: reading, writing, 
mathematics, science, history, geography, civics, the arts, foreign 
language, and economics.
    The gap must be closed between the number of subjects the National 
Assessment is required to test and the number of subjects it can test 
under the current design. By law, the National Assessment is required 
to test ten subjects and report results and trends. In order to chart 
progress and report trends, subjects must be tested more than once. 
However, during the 1990's only reading and mathematics will have been 
tested more than once using up-to-date tests and performance standards 
to report how well students are doing.
    Recommendations:
     The National Assessment should be conducted annually;
     Reading, writing, mathematics and science should be given 
priority, with testing in these subjects conducted according to a 
publicly released 10-year schedule adopted by the National Assessment 
Governing Board;
     History, geography, the arts, civics, foreign language, 
and economics also should be tested on a reliable basis according to a 
publicly released schedule adopted by the National Assessment Governing 
Board.
    Vary the amount of detail in testing and in reporting.
    More subjects can be tested if different strategies are used. But 
each time the National Assessment is conducted, it uses a similar 
approach, regardless of the nature of the subject or the number of 
times a subject has been tested. This approach is locked-in through 
1998 under current contracts. Under this approach, a larger number of 
students is tested in order to provide not just overall results, but 
fine-grained details as well (e.g., the achievement scores of 4th grade 
students whose teachers that year had five hours or more of in-service 
training). The National Assessment also collects ``background'' 
information through questionnaires completed by students, teachers, and 
principals. The questionnaires ask about teaching practices, school 
policies, and television watching, to name a few. Data analyses are 
elaborate. Reports are detailed and exhaustive, involving as many as 
seven separate reports per subject. Although the National Assessment 
has been praised for this thoroughness, it comes at the expense of testing 
more subjects, more frequently, with more timely reporting.
    The different strategies needed might include several approaches to 
testing and reporting. For example, these approaches could take the 
form of ``standard report cards,'' ``comprehensive reports,'' and 
special, focused assessments. A standard report card would provide 
overall results in a subject with performance standards and average 
scores. Results for standard report cards would be reported by sex,

[[Page 24768]]

race/ethnicity, socio-economic status, and for public and private 
schools, but would not be broken down further. This may reduce the 
number of students needed for testing and may reduce associated costs. 
Student, teacher and principal survey questionnaires, if collected at 
all, would be limited and selective, with reports of results focused on 
only the most essential issues. Generally, subcategories within a 
subject (e.g., algebra, measurement and geometry within mathematics) 
would not be reported. However, data from the National Assessment would 
continue to be available to state and local educators and policymakers 
for additional analysis. Most National Assessment reports would use 
this strategy.
    Comprehensive reports, like the current approach, would be an in-
depth look at a subject, perhaps using a newly adopted test framework, 
many students, many test questions, and ample background information. 
In addition to overall results using performance standards and average 
scores, subcategories within a subject could be reported. Results would 
be reported by sex, race/ethnicity, socio-economic status, and for 
public and private schools, and might be broken down further as well. 
In some cases, more than one report may be issued in a subject. 
However, comprehensive reporting would occur infrequently, perhaps once 
in ten years in any one subject.
    Special, focused assessments in a subject would be scheduled as 
needed. They would explore a particular question or issue and may be 
limited to particular grades. Generally, the cost would be less than 
the cost of a standard report card. Examples of these smaller-scale, 
focused assessments include: (1) assessing subjects using targeted 
approaches (e.g., 8th grade arts), (2) testing special populations 
(e.g., in-school 12th graders vs. out-of-school youth), and (3) 
examining skills and knowledge across several subjects (e.g., readiness 
for work).
    Recommendations:
     National Assessment testing and reporting should vary, 
using standard report cards most frequently, comprehensive reporting in 
selected subjects about once every ten years, and special, focused 
assessments as needed;
     National Assessment results should be timely, with the 
goal being to release results within 6 months of the completion of 
testing.
Simplify the National Assessment Design
    The current design of the National Assessment is very complex. No 
student takes the complete set of test questions in a subject and as 
many as twenty-six different test booklets are used within each grade. 
Students, teachers, and principals complete separate questionnaires and 
may submit them for scoring at different times. Scores are not 
calculated directly from the test booklets, but are estimated using 
statistical procedures known as ``conditioning,'' ``drawing plausible 
values,'' and ``imputation.'' The estimates are calculated in part by 
using the questionnaire data collected from the students, teachers, and 
principals, in addition to the student answers to the test questions. 
Although using these procedures helps make the data accurate, it also 
increases the possibility of mistakes. Under these procedures, each 
time a problem arises in analyzing the data, everything must be redone. 
It is not unusual for data to be re-calculated hundreds of times. The 
current complex design of the National Assessment lengthens the time 
from testing to reporting and adds significantly to its cost.
Recommendation
     Options should be identified to simplify the design of the 
National Assessment and reduce reliance on conditioning, plausible 
values, and imputation to estimate group scores.
Simplify the Way the National Assessment Reports Trends in Student 
Achievement
    From its beginning in 1969, monitoring achievement trends has been 
a central mission of the National Assessment of Educational Progress. 
Since 1990, the National Assessment has reported achievement trends 
using two unconnected testing programs. The tests, criteria for 
selecting students, and reporting are all different. The first program, 
``the main National Assessment,'' tests at grades 4, 8 and 12 and 
covers ten subjects. The tests are based on a national consensus 
representing current views of each subject. Performance standards are 
used to report whether student achievement on the National Assessment 
is ``good enough.'' The schedule of subjects to be tested in the main 
National Assessment is unrelated to the schedule of subjects tested 
under the second testing program.
    The second testing program reports long-term trends that go as far 
back as 1970. Only four subjects are covered: reading, writing, 
mathematics and science. The tests are based on views of the curricula 
prevalent during the 1970's and have not been changed. Testing is at 
ages 9, 13 and 17 except for writing, which tests at grades 4, 8 and 
11. Trends are reported by average score; performance standards are not 
used. The long-term trend program has been valuable for documenting 
declines and increases in student achievement over time and a decrease 
in the achievement gap between minority and non-minority students.
    It may be impractical and unnecessary to operate two separate 
testing programs. However, it also is likely that curricula will 
continue to change and that current test frameworks may be less 
relevant in the future. The tension between the need for stable 
measures of student achievement and changing curricula must be 
addressed carefully.
Recommendations
     A carefully planned transition should be developed to 
enable ``the main National Assessment'' to become the primary way to 
measure trends in reading, writing, mathematics and science in the 
National Assessment program;
     As a part of the transition, the National Assessment 
Governing Board will review the tests now used to monitor long-term 
trends in reading, writing, mathematics and science to determine how 
they might be used now that new tests and performance standards have 
been developed during the 1990's for ``the main National Assessment.'' 
The Governing Board will decide how to continue the present long-term 
trend assessments, how often they would be used, and how the results 
would be reported.
Use Performance Standards To Report Whether Student Achievement Is 
``Good Enough''
    In reporting on ``educational progress,'' the National Assessment 
has, until recently, only considered current student performance 
compared to student achievement in previous years. Under this approach, 
the only standard was how well students had done previously, not how 
well they should be doing on what is measured by the National 
Assessment. Although this approach has been useful, it began to change 
in 1988 from a sole focus on ``where we have been'' to include ``where 
we want to be'' as well.
    In 1988, Congress created a non-partisan citizens' group--the 
National Assessment Governing Board--and authorized it to set explicit 
performance standards, called achievement levels, for reporting 
National Assessment results.
    The achievement levels describe ``how good is good enough'' on the 
various tests that make up the National Assessment. Previously, it 
might have been reported that the average math score of 4th graders 
went up (or down)

[[Page 24769]]

four points on a five-hundred-point scale. There was no way of knowing 
whether the previous score represented strong or weak performance and 
whether the amount of change should give cause for concern or 
celebration. In contrast, the National Assessment now also reports the 
percentage of students who are performing at or above ``basic,'' 
``proficient,'' and ``advanced'' levels of achievement. Proficient, the 
central level, represents ``competency over challenging subject 
matter,'' as demonstrated by how well students perform on the questions 
on each National Assessment test. Basic denotes partial mastery and 
advanced signifies superior performance on the National Assessment. 
Using achievement levels to report results and track changes allows 
readers to make judgments about whether performance is adequate, 
whether ``progress'' is sufficient, and how the National Assessment 
standards and results compare to those of other tests, such as state 
and local tests.
Recommendation
     The National Assessment should continue to report student 
achievement results based on performance standards.
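
    As an illustration only, the ``at or above'' style of reporting 
described above can be sketched in a few lines of Python. The cut scores 
and student scores below are hypothetical values chosen for the example; 
they are not actual National Assessment achievement levels or data. The 
sketch also ignores sampling weights and the statistical estimation 
procedures the National Assessment uses, so it shows only the basic 
arithmetic behind the percentages.

```python
from typing import Dict, Sequence

# Hypothetical cut scores on a 0-500 scale (assumed for illustration;
# NOT actual National Assessment achievement-level cut scores).
CUT_SCORES: Dict[str, float] = {
    "basic": 210.0,
    "proficient": 250.0,
    "advanced": 280.0,
}


def percent_at_or_above(
    scores: Sequence[float], cuts: Dict[str, float] = CUT_SCORES
) -> Dict[str, float]:
    """Return, for each level, the percentage of scores at or above its cut."""
    if not scores:
        raise ValueError("scores must not be empty")
    n = len(scores)
    return {
        level: 100.0 * sum(1 for s in scores if s >= cut) / n
        for level, cut in cuts.items()
    }


# Hypothetical scores for eight students (unweighted).
sample_scores = [195.0, 212.0, 230.0, 251.0, 255.0, 268.0, 281.0, 300.0]
result = percent_at_or_above(sample_scores)
print(result)
```

    Because each level is counted as ``at or above,'' a student who 
reaches the advanced level is also counted at the proficient and basic 
levels, so the three percentages do not add up to 100.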

Use International Comparisons

    Looking at student performance and curriculum expectations in other 
nations is yet another way to consider the adequacy of U.S. student 
performance. The National Assessment is, and should be, a domestic 
assessment. However, decisions on the content of National Assessment 
tests, the achievement standards, and the interpretation of test 
results, where feasible, should be informed, in part, by the 
expectations for education set by other countries, such as Japan, 
Germany, and England. This, in turn, should take into account problems 
in making international comparisons truly comparable. In addition, the 
National Assessment should promote ``linking'' studies with 
international assessments, as has been done with the Third 
International Mathematics and Science Study, so that states that 
participate in the National Assessment can have state, national and 
international comparisons.
Recommendations
     National Assessment test frameworks, test specifications, 
achievement levels and data interpretations should take into account, 
where feasible, curricula, standards, and student performance in other 
nations;
     The National Assessment should promote ``linking'' studies 
with international assessments.

Emphasize Reporting for Grades 4, 8 and 12

    An aspect of the National Assessment design that needs 
reconsideration is age versus grade-based reporting. At its inception, 
the National Assessment tested only by age. Current law requires 
testing both by age (ages 9, 13 and 17) and by grade (grades 4, 8 and 
12). Grade-based results are generally more useful than age-based 
results. Schools and curricula are organized by grade, not by age. 
Grades 4, 8 and 12 mark key transition points in American education. 
Grade 12 performance is particularly important as an ``exit'' measure 
from the K-12 education system. Grades 4, 8 and 12 are specified for 
monitoring in National Education Goal 3. Age-based samples may be more 
appropriate with respect to international comparisons and, given high 
school drop-out rates, would be more inclusive for age 17 than for 
grade 12 samples, which are limited to youth enrolled in school. 
However, assessing the knowledge and skills of out-of-school youth may 
properly fall under the purpose of another program, such as the 
National Adult Literacy Survey.
    Although grade-based reporting is generally preferable, there is a 
problem about the accuracy of grade 12 National Assessment results. At 
grade 12, a smaller percentage of schools and students that are invited 
actually participate in testing than is the case with 4th and 8th 
graders. Also, more 12th graders fail to complete their tests than do 
4th and 8th graders. In addition, when asked ``How hard did you try on 
this test?'' and ``How important is doing well on this test?'' many 
more 12th graders than 4th or 8th graders say that they didn't try 
hard and that the test wasn't important. Low participation rates, low 
completion rates, and indicators of low motivation suggest that the 
National Assessment may be underestimating what 12th graders know and 
can do.
    One possible reason for low response and low motivation is that 
schools and students receive very little in return for their 
participation in the National Assessment beyond the knowledge that they 
are performing a public service. They do not receive test scores nor do 
they receive other information from the National Assessment that 
teachers and principals might wish to use as a part of the 
instructional program. This should be changed. The National Assessment 
design should use meaningful, practical incentives that will give 
school principals and teachers a greater reason to participate and 
students more of a reason to try harder. The underlying idea is clear: 
if principals and teachers see direct benefits, they are more likely to 
agree to participate in the National Assessment. Students may be more 
likely to take the assessment seriously if they see that their teachers 
and principals are enthusiastic about participating.
Recommendations
     The National Assessment should continue to test in and 
report results for grades 4, 8 and 12; however, in selected subjects, 
one or more of these grades may not be tested;
     Age-based testing and reporting should continue only to 
the extent necessary for international comparisons and for long-term 
trends, should the Governing Board decide to continue long-term trends 
in their current form;
     Grade 12 results should be accompanied by clear, 
highlighted statements about school and student participation, student 
motivation, and cautions, where appropriate, about interpreting 12th 
grade achievement results;
     The National Assessment design should seek to improve 
school and student participation rates and student motivation at grade 
12.

National Assessment Results for States

    In 1988, testing at the state level was added to the National 
Assessment. Previously, the National Assessment reported only national 
and regional results. For the first time, the information was relevant 
to individuals in states who make decisions about education funding, 
governance and policy. As a result, states now are major users of 
National Assessment data.
    Participation was strong in the first state-level assessment in 
1990 and has grown to include even more states. In 1996, 44 states and 
3 jurisdictions participated in the math assessments at grades 4 and 8 
and the science assessment at grade 8.
    Currently, the National Assessment draws a separate sample to 
obtain national results in addition to the samples drawn for individual 
state reports. Testing separate national samples increases costs and 
creates additional burdens on states, particularly small states. If 
this practice can be discontinued, savings should be possible.
    States participate in the National Assessment for many reasons, 
including to have an unbiased, external benchmark to help them make 
judgments about their own tests and standards. National Assessment data 
are used to make comparisons to other states, to help determine if 
curriculum

[[Page 24770]]

and standards are rigorous enough, to develop questions about 
curricular strengths and weaknesses, to make state to international 
comparisons, and to provide a general indicator of achievement.
    There is a strong interest among states to use the National 
Assessment to get state level information in reading, writing, science 
and mathematics. The level of interest in using the National Assessment 
varies with respect to the other subjects. State education officials 
are most interested in the National Assessment testing at grades 4 and 
8. They say that obtaining cooperation from high schools and 12th grade 
students is difficult. Also, from their perspective, 12th grade testing 
comes at the end of compulsory schooling, after which remediation is 
not feasible within the elementary and secondary system.
    States are active partners in the National Assessment program. 
States help develop National Assessment test frameworks, review test 
items, and assist in conducting the tests. The National Assessment 
program is effective, to a great degree, because of the involvement of 
the states.
    Because it is useful to them, and because they invest time and 
resources in it, states want a dependable schedule for National 
Assessment testing. With a dependable schedule, states that want to 
will be better able to coordinate the National Assessment with their 
own state testing program and make better use of the National 
Assessment as an external reference point.
Recommendations
     National Assessment state-level assessments should be 
conducted on a reliable, predictable schedule according to a 10-year 
plan adopted by the Governing Board;
     Reading, writing, mathematics, and science at grades 4 and 
8 should be given priority for National Assessment state-level testing;
     Testing in other subjects and at grade 12 should be 
permitted at state option and cost;
     Where possible, national results should be estimated from 
state samples in order to reduce burden on states, increase efficiency 
and save costs.
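
    As a rough illustration of the last recommendation, a national 
average could be estimated by weighting each state's sample mean by 
that state's share of enrollment. The sketch below is not the National 
Assessment's estimation procedure. The state labels, enrollments, and 
scores are hypothetical, and the names `StateResult` and 
`estimate_national_mean` are assumptions made for illustration. An 
operational estimate would also need student sampling weights and 
standard errors.

```python
from dataclasses import dataclass


@dataclass
class StateResult:
    name: str          # hypothetical state label
    enrollment: int    # students enrolled in the tested grade (assumed known)
    mean_score: float  # mean score from that state's sample


def estimate_national_mean(results):
    """Return the enrollment-weighted average of state sample means."""
    total = sum(r.enrollment for r in results)
    if total <= 0:
        raise ValueError("total enrollment must be positive")
    return sum(r.enrollment * r.mean_score for r in results) / total


# Illustrative numbers only; these are not National Assessment data.
states = [
    StateResult("State A", 100_000, 220.0),
    StateResult("State B", 300_000, 230.0),
    StateResult("State C", 50_000, 210.0),
]
national = estimate_national_mean(states)
print(f"Estimated national mean: {national:.1f}")
```

    Weighting by enrollment keeps large states from being 
under-represented when state samples are of similar size.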

Use Innovations in Measurement and Reporting

    The National Assessment has a record of innovations in large-scale 
testing. These include the early use of performance items, sampling 
both students and test questions, using standards describing what 
students should know and be able to do, and employing computers for 
such things as inventory control, scoring, data analysis and reporting. 
The National Assessment should continue to incorporate promising 
innovative approaches to test administration and improved methods for 
measuring and reporting student achievement.
    Technology can help improve National Assessment reporting and 
testing. For example, reports could be put on computer disc, 
transmitted electronically, and made available through the World-Wide 
Web. Test questions could be catalogued and made available on-line for 
use by state assessment personnel and classroom teachers. Also, the 
National Assessment could be administered by computer, eliminating the 
need for costly test booklet systems and reducing steps related to data 
entry of student responses. Students could answer ``performance items'' 
in cost-effective, computerized formats. The increasing use of 
computers in schools may make it feasible to administer some parts of 
the National Assessment by computer under the next contract for the 
National Assessment, beginning around the year 2000.
    Other examples of promising methods for measuring and reporting 
student achievement include adaptive testing and domain-score 
reporting. In adaptive testing, each student is given a short ``pre-
test'' to estimate that student's level of achievement. On the basis of 
the pre-test, higher achieving students are given tougher questions; 
students who know and can do less are given easier questions. Since the 
test is ``adapted'' to the individual, it is more precise and can be 
markedly more efficient than regular test administration. In domain-
score reporting, a subject (or ``domain'') is well-defined, a goodly 
number of test questions are developed that encompass the subject, and 
student results are reported as a percentage of the ``domain'' that 
students ``know and can do.'' This is in contrast to reporting results 
using an arbitrary scale, such as the 0-500 scale in the National 
Assessment.
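
    The two methods described above can be sketched in a few lines. 
This is a minimal illustration under stated assumptions, not a 
description of how the National Assessment would implement either 
method. The function names and the 50 percent routing cutoff are 
hypothetical, and operational adaptive tests typically rely on more 
elaborate statistical models than a single cutoff.

```python
def route_by_pretest(pretest_correct, pretest_total, cutoff=0.5):
    """Choose the follow-up question set from pre-test performance.

    The cutoff is an assumed value chosen only for illustration.
    """
    if pretest_total <= 0:
        raise ValueError("pretest_total must be positive")
    share_correct = pretest_correct / pretest_total
    return "harder" if share_correct >= cutoff else "easier"


def domain_score(correct, attempted):
    """Percentage of sampled domain questions answered correctly."""
    if attempted <= 0:
        raise ValueError("attempted must be positive")
    return 100.0 * correct / attempted


# A student with 8 of 10 pre-test questions correct gets the harder set.
print(route_by_pretest(8, 10))
# A student with 3 of 10 correct gets the easier set.
print(route_by_pretest(3, 10))
# 45 of 60 domain questions correct is reported as 75 percent of the domain.
print(domain_score(45, 60))
```

    Unlike a point on an arbitrary scale, the domain score reads 
directly as a share of the defined subject matter.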
Recommendations
     The National Assessment should assess the merits of 
advances related to technology and the measurement and reporting of 
student achievement;
     Where warranted, the National Assessment should implement 
such advances in order to reduce costs and/or improve test 
administration, measurement and reporting;
     The next competition for National Assessment contracts, 
for assessments beginning around the year 2000, should ask bidders to 
provide a plan for (1) conducting testing by computer in at least one 
subject at one grade, and (2) making use of technology to improve test 
administration, measurement, and reporting.
    Objective 2: To develop, through a national consensus, sound 
assessments to measure what students know and can do as well as what 
students should know and be able to do.

Keep Test Frameworks and Specifications Stable

    Test frameworks spell out in general terms how a test will be put 
together. The test frameworks also determine what will be reported and 
influence how expensive an assessment will be. Should 8th grade 
mathematics include algebra questions? Should there be both multiple 
choice questions and questions in which students show their work? What 
is the best mix of such types of questions for each grade? Which grades 
are appropriate for testing in a subject area? Test specifications 
provide detailed instructions to the test writers about the specific 
content to be tested at each grade, how test questions will be scored, 
and the format for each test question (e.g., multiple choice or essay).
    Test frameworks and specifications are developed through a national 
consensus process conducted by the Governing Board. The national 
consensus process involves hundreds of teachers, curriculum experts, 
directors of state and local testing programs, administrators, and 
members of the public. The national consensus process helps determine 
what is important for the National Assessment to test, how it should be 
measured, and how much of what is measured by the National Assessment 
students should know and be able to do in each subject.
    Through the national consensus process, both current classroom 
teaching practices and important developments in each subject area are 
considered for inclusion in the National Assessment. In order to ensure 
that National Assessment data fairly represent student achievement, the 
test frameworks and specifications are subjected to wide public review 
before adoption and all test questions developed for the National 
Assessment are reviewed for relevance and quality by representatives 
from each participating state.
    An important role of the National Assessment is to report on trends 
in student achievement over time. For the National Assessment to be 
able to measure trends, the frameworks (and hence the tests) must 
remain stable. However, as new knowledge is gained

[[Page 24771]]

in subject areas and as teaching practices change and evolve, pressures 
arise to change the test frameworks and tests to keep them current. 
But, if frameworks, specifications and tests change too frequently, 
trends may be lost, costs may go up, and reporting time may increase.
Recommendations
     Test frameworks and test specifications developed for the 
National Assessment generally should remain stable for at least ten 
years;
     To ensure that trend results can be reported, the pool of 
test questions developed in each subject for the National Assessment 
should provide a stable measure of student performance for at least ten 
years;
     In rare circumstances, such as where significant changes 
in curricula have occurred, the Governing Board may consider making 
changes to test frameworks and specifications before ten years have 
elapsed;
     In developing new test frameworks and specifications, or 
in making major alterations to approved frameworks and specifications, 
the cost of the resulting assessment should be estimated. The Governing 
Board will consider the effect of that cost on the ability to test 
other subjects before approving a proposed test framework and/or 
specifications.

Use an Appropriate Mix of Multiple-Choice and ``Performance'' Questions

    To provide information about ``what students know and can do,'' the 
National Assessment uses both multiple-choice questions and questions 
in which students are asked to provide their own answers, such as 
writing a response to an essay question or explaining how they solved a 
math problem. Questions of the latter type are sometimes called 
``performance items.'' The two types of questions may require students 
to demonstrate different kinds of skills and knowledge.
    Performance items are desired because they provide direct evidence 
of what students can do. Individuals confronted with problems in the 
real world are seldom handed four possible answers, one of which is 
correct. Although they may be desirable, performance items are more 
expensive than multiple-choice to develop, administer, and score.
    Multiple-choice questions are desired because they make it more 
practical to draw conclusions about the kinds of skills and knowledge 
these items assess, given the time available for testing. However, 
multiple-choice questions are more subject to guessing than are 
performance items.
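
    The effect of guessing can be quantified with simple arithmetic. 
The sketch below assumes four answer options per question, as in the 
example above, and assumes that a student guesses independently and at 
random on every question. The function names and the 20-question test 
length are hypothetical.

```python
from math import comb


def expected_correct_by_guessing(num_questions, num_options=4):
    """Expected number of correct answers from pure random guessing."""
    if num_questions < 0 or num_options < 1:
        raise ValueError("invalid question or option count")
    return num_questions / num_options


def prob_at_least(min_correct, num_questions, num_options=4):
    """Binomial probability of guessing at least min_correct answers."""
    p = 1.0 / num_options
    return sum(
        comb(num_questions, i) * p**i * (1 - p) ** (num_questions - i)
        for i in range(min_correct, num_questions + 1)
    )


# On a hypothetical 20-question test, guessing yields 5 correct on average.
print(expected_correct_by_guessing(20))
# Chance of reaching at least half correct by guessing alone.
print(prob_at_least(10, 20))
```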
    Currently, all students tested by the National Assessment are given 
both types of questions. Generally, about half the testing time is 
devoted to each type of question, but the amount of time for each 
differs based on the skills and knowledge to be assessed, as 
established in the National Assessment test framework. For example, in 
a writing assessment, all students are asked to write their responses 
to specific ``prompts.'' In other subjects, the appropriate mix of 
multiple-choice and performance items varies.
Recommendations
     Both multiple-choice and performance items should continue 
to be used in the National Assessment;
     In developing new test frameworks, specifications, and 
questions, decisions about the appropriate mix of multiple-choice and 
performance items should take into account the nature of the subject, 
the range of skills to be assessed, and cost.
    Objective 3: To help states and others link their assessments with 
National Assessment and use National Assessment data to improve 
education performance.
    The primary job of the National Assessment is to report frequently 
and promptly to the American public on student achievement. The 
resources of the National Assessment must be focused on this central 
purpose if it is to be achieved. However, the products of the National 
Assessment--test questions, test data, frameworks and specifications--
are widely regarded as being of high quality. They are developed with 
public funds and, therefore, should be available for public use as long 
as such uses do not threaten the integrity of the National Assessment 
or its ability to report regularly on student achievement.
    The National Assessment should be designed in a way that permits 
its use by others while protecting the privacy of students, teachers, 
and principals who have participated in the National Assessment. This 
should include making National Assessment test questions and data easy 
to access and use, and providing related technical assistance upon 
request. Generally, the costs of a project should be borne by the 
individual or group making the proposal, not by the National 
Assessment. Examples of areas in which particular interest has been 
expressed for using the National Assessment include linking state and 
local tests with the National Assessment and performing in-depth 
analysis on National Assessment data. States that link their tests to 
the National Assessment would have an unbiased external benchmark to 
help make judgments about their own tests and standards and would also 
have a means for comparing their tests and standards with those of 
other states.
Recommendations
     The National Assessment should develop policies, practices 
and procedures that enable states, school districts and others who want 
to do so at their own cost, to conduct studies to link their test 
results to the National Assessment;
     The National Assessment should be designed so that others 
may access and use National Assessment test questions, test data and 
background information;
     The National Assessment should employ safeguards to 
protect the integrity of the National Assessment program, prevent 
misuse of data, and ensure the privacy of individual test takers.

    Dated: May 13, 1996.
Roy Truby,
Executive Director, National Assessment Governing Board.
[FR Doc. 96-12264 Filed 5-15-96; 8:45 am]
BILLING CODE 4000-01-M