[Federal Register Volume 64, Number 97 (Thursday, May 20, 1999)]
[Notices]
[Pages 27520-27532]
From the Federal Register Online via the Government Publishing Office [www.gpo.gov]
[FR Doc No: 99-12746]


=======================================================================
-----------------------------------------------------------------------

DEPARTMENT OF EDUCATION


National Assessment Governing Board

AGENCY: National Assessment Governing Board; Department of Education.

ACTION: Notice of request for comments.

-----------------------------------------------------------------------

SUMMARY: The National Assessment Governing Board requests public 
comment on two draft documents it has prepared for submission to 
Congress and the President. The first document, required under section 
305(c)(1) of the FY 1999 Omnibus Budget Act (the Act), provides a 
suggested statement of the purpose, intended use, definition of the 
term ``voluntary,'' and the means of reporting results for the proposed 
voluntary national tests in 4th grade reading and 8th grade 
mathematics. The second document, entitled ``National Assessment of 
Educational Progress: Design 2000-2010,'' describes how improvements in 
the National Assessment of Educational Progress will be implemented 
during the 2000-2010 period. Interested individuals and organizations 
are invited to provide written comments to the Governing Board.
    Written Comments: Written comments must be received by June 9, 1999 
at the following address: Mark D. Musick, Chairman (Attention: Ray 
Fields), National Assessment Governing Board, 800 North Capitol Street 
NW, Suite 825, Washington, DC 20002-4233.
    Written comments also may be submitted electronically by sending 
electronic mail (e-mail) to [email protected] by June 9, 1999. 
Comments sent by e-mail must be submitted as an ASCII file avoiding the 
use of special characters and any form of encryption. Inclusion in the 
public record cannot be guaranteed for written statements, whether sent 
by mail or electronically, received after June 9, 1999.
    Public Record: A record of comments received in response to this 
notice will be available for inspection from 8 a.m. to 4:30 p.m., 
Monday through Friday, excluding legal holidays, in Suite 825, 800 
North Capitol Street, NW., Washington, DC, 20002.

The Voluntary National Test: Purpose, Intended Use, Definition of 
Voluntary and Reporting

Background

Purpose
    The purpose of this report is to fulfill one of the requirements of 
the FY 1999 appropriation act for the Department of Education (the 
Act). Specifically, with respect to the proposed voluntary national 
tests in 4th grade reading and 8th grade mathematics, the Act requires 
the National Assessment Governing Board to

* * * determine and clearly articulate the purpose and intended use 
of any proposed federally sponsored national test. Such report shall 
also include--(A) a definition of the meaning of the term 
``voluntary'' in regards to the administration of any national test; 
and (B) a description of the achievement levels and reporting 
methods to be used in grading any national test.

    This report addresses the four required areas: purpose, intended 
use, definition of ``voluntary,'' and reporting. Although the 
legislation states that the Governing Board shall ``determine'' these 
matters, the Governing Board recognizes that this report is advisory to 
Congress and the President. Any final determination on these matters 
will be made in legislation enacted by Congress and signed by the 
President.
    The Act contains other provisions related to the voluntary national 
test. One provision amends the General Education Provisions Act, 
creating a new section 447, prohibiting pilot testing and field testing 
of any federally sponsored national test unless specifically authorized 
in enacted legislation. However, another provision permits the 
development of voluntary national tests, giving the National Assessment 
Governing Board exclusive authority for such test development.
    In order to carry out the congressional assignment to prepare this 
report, the Governing Board had to envision a situation in which there 
was authority to conduct voluntary national tests, while recognizing 
that the Act prohibits such tests at this time. Further, the Governing 
Board had to envision how national testing could work, given that 
schools in the United States are governed by states, localities and 
non-public authorities. The Governing Board attempted to answer the 
question: If there are to be voluntary national tests, what is a 
feasible, coherent plan that would be beneficial to parents, students, 
and teachers? Thus, while not advocating for or against the voluntary 
national test initiative, the Governing Board interprets the 
congressional assignment to be to present a sound and logical case for 
the potential purpose and use of the voluntary national tests.
    The Act sets September 30, 1999 as the deadline for submitting this 
report to Congress and the President. However, to assist Congress and 
the President in deliberating on the future of the voluntary national 
test, help promote a timely decision, and avoid a full year's delay in 
pilot testing should Congress and the President decide to proceed with 
the project, the Governing Board is submitting its report in June.

Report Preparation Process

    In November 1998, the Governing Board established a special ad hoc 
committee to assist in drafting the report. The committee was composed 
of both veteran and new Board members. Chaired by Michael Nettles, the 
committee included Wilmer Cody, Thomas Fisher, Michael Guerra, Nancy 
Kopp, Debra Paulson, Diane Ravitch, and John Stevens.

    The committee developed a plan for preparing the report, engaging 
the Governing Board in related policy deliberations, and obtaining 
public comment. At the March 1999 Board meeting, the committee 
presented materials that were developed for public comment. These 
included an explanatory statement; two possible scenarios addressing 
purpose, use, definition of voluntary, and the methods for reporting; 
and a set of questions related to the scenarios. The purpose of these 
materials was to provide a framework for public comment. They did not 
represent the positions of the Governing Board at the time.
    The Governing Board discussed these materials at length, made 
several changes, and authorized the committee to proceed to obtain 
public comment. The materials and an invitation to provide written 
comments and/or oral testimony at four public hearings during March and 
April were disseminated.
    Taking the comments received into account, the committee then 
prepared a draft report for review at the May 1999 Governing Board 
meeting. The Governing Board discussed and revised the draft report and 
authorized the committee to obtain comment on the draft report. The 
draft report was disseminated by mail, on the Governing Board's web 
site, and in the Federal Register. A hearing on the draft report will 
be conducted June 12 at the annual Large Scale Assessment Conference 
with state and district testing experts.
    After taking the comments received into account, the committee will 
prepare a draft report for presentation to the Board at a special 
meeting on June 23. At the June 23 meeting, the Governing Board will 
discuss the draft and approve a final version for submission to the 
President and Congress.

Overview

    This report is in three sections. The first section is in the form 
of a story. It is intended to put a ``human face'' on the details in 
the section that follows. The second section describes the Governing 
Board's recommendations on purpose, intended use, definition of 
``voluntary,'' and reporting for the proposed voluntary national tests. 
The third section is a summary with recommendations.

The Voluntary National Test: A Story

    It is March 18; the year is 2006. Fourth grader Maria Johnson, 
along with her classmates and many other 4th graders across the nation, 
will be taking the voluntary national test in reading tomorrow. Eighth 
graders will be taking the mathematics test.
    Maria started kindergarten in September 2001; the first voluntary 
national test was administered the following March. That year and each 
year since, Parade magazine devoted an early April article to the test. 
The test questions were published, along with the answers. For 
questions that require students to write their own answers, samples of 
student work from the national tryout of the test the year before were 
included to illustrate different levels of student performance. These 
levels of student performance are based on the achievement levels set 
for the National Assessment of Educational Progress (NAEP). Similar 
materials were made available following each year's tests in 
newspapers, magazines aimed at parents and teachers, on the Internet, 
and on the Public Broadcasting System. Reading and mathematics 
achievement levels posters are displayed in pediatricians' offices 
across the country. January through March of each year, McDonald's, 
Burger King, Wendy's, and KFC print sample test questions on placemats 
and food containers.
    Maria's school district decided to volunteer to participate in the 
national test in 4th grade reading. The school district administration 
had examined the test framework, specifications, and sample test and 
determined that they were consistent with the district's reading 
program. They knew that the results would belong to the district and 
the families. The federal government would not report or maintain any 
of the data resulting from testing nor require the district to report 
any of the data to the federal government.
    Maria's school provided copies of the Parade article to each of the 
families. In the school district, the policy is for all students to 
participate in testing unless a parent specifically objects. When 
Maria's parents finished reading the article, they had a clear picture 
of what a proficient reader in the fourth grade should know and be able 
to do. They understood that proficiency would not come overnight, but 
with many small steps and that each year of school would mark progress 
toward the goal of reading proficiency. Maria's parents decided that 
having a clear goal and following progress toward that goal are good 
things to do and wanted their child to participate.
    Having this initial knowledge, the Johnsons wanted to learn more 
and did their homework. They attended a school-sponsored seminar on the 
reading program. They learned what they could do at home to reinforce 
what Maria was learning in school. The Johnsons obtained a special 
version of the NAEP framework, written for parents, to deepen their 
understanding of the material covered by the test. The Johnsons now had 
a frame of reference for talking with Maria's teachers in specific 
terms about the reading program and for monitoring Maria's progress 
each year toward 4th grade reading proficiency. Maria, with her 
parents' encouragement and teachers' support, has worked hard in school 
and at home on her reading assignments and enjoys reading on her own.
    With this shared understanding and common language about reading 
proficiency, the school was helped in its efforts to involve parents. 
The school had developed its own testing program to track the reading 
progress of each student each year toward 4th grade reading 
proficiency. Thus, needs for extra help were identified early, in-depth 
diagnosis was provided when needed, and remediation occurred before it 
was too late.
    The school liked using the achievement levels. They were consistent 
with the state's performance standards for reading. They helped keep 
the school staff focused as they worked day-by-day, making hundreds of 
decisions about materials, instruction, and curricula to achieve the 
many incremental steps needed for each student to progress.
    Parents and teachers also liked the fact that the test booklet is 
returned. This permits parents and teachers to review with the student 
all of the test questions and the student's answers. The student gets 
reinforcement on what was done well. Parents and teachers can see which 
questions were answered well and which were missed, probe the reasons 
why with the student, and, from the student's response and other 
knowledge of the student, explore whether advanced activities, 
diagnostic testing, or any other intervention should be considered.
    Together with the on-going assessment program and the state's 
standards and assessments, the school and parents found that the 
voluntary national test adds in a unique way to the range of methods 
for monitoring individual student progress. The teachers and principals 
found that the achievement levels used to report voluntary national 
test results were much easier for parents to understand than 
percentiles, stanines, or mean scores. Also, the voluntary national 
test provides parents and schools a single basis of comparison for 
individual student performance across states that is generally not 
available from classroom developed tests or state-wide assessments. 
Most of all, parents have a clear and very specific 
understanding of how their child has performed in comparison to 
rigorous standards.
    Although the test was designed to provide individual results, the 
school district has decided that it will compile the individual student 
results that were provided by the voluntary national testing program. 
The district administrators want to know how the district overall 
compares with the students in the national sample who participated in 
the national trial run of the test the year before.
    The district has joined a consortium of similar districts that have 
agreed among themselves to follow the guidelines for compiling and 
reporting voluntary national test data developed by the National 
Assessment Governing Board (NAGB). Following these guidelines ensures 
that the data analyses are done properly, comparisons between and among 
districts and schools are fair, and inferences about achievement are 
defensible. When the district reports these results to the public, it 
makes a big point of saying that it has followed these guidelines to 
the letter and spirit, as a means for establishing credibility and 
trust.
    The story presents one plausible scenario for how the voluntary 
national test might be implemented in public schools, but other 
scenarios are possible as well. The story is focused on the future 
because effects of the proposed voluntary national test would not be 
fully achieved in its first year. But two things are clear. If there is 
to be such a test, it should be made available to all who would find 
value in it, whether state, public school, private school, home school, 
or individual parent. And, while the federal government would provide 
resources to make the tests available, there should be no federal 
coercion, sanctions, or rewards for participating.
    The story emphasizes that, while having widely recognized standards 
and assessments can provide focus for planning and a common language 
for students, parents and teachers, what is most important is what 
parents, students, and educators actually do with that knowledge. The 
story, implicitly, also suggests that a wide voluntary mobilization of 
private resources in society reinforcing the value and importance of 
learning (e.g., Parade and McDonald's) would be important.

The Purpose of the Voluntary National Test

    As the Governing Board worked on this report, it became evident 
that purpose, intended use, the definition of voluntary, and means for 
reporting are, to a large degree, interdependent. A change in any one 
of these could affect the others. Therefore, it is important that these 
four areas be coherent.
    In addition, the test should serve a unique purpose. If the same 
purpose is already being fulfilled by another testing program, there is 
no need for the voluntary national test. If the same purpose could 
easily be fulfilled by another testing program, it would be prudent to 
consider that possibility in weighing the pros and cons before 
proceeding with full implementation.
    The National Assessment Governing Board suggests that Congress and 
the President consider the following as the purpose of the proposed 
voluntary national test:

    To measure individual student achievement in 4th grade reading 
and 8th grade mathematics, based on the content and rigorous 
performance standards of the National Assessment of Educational 
Progress (NAEP), as set by the National Assessment Governing Board 
(NAGB).
Rationale
    The legislation giving responsibility for voluntary national test 
development to the Governing Board does not specify or limit the 
subjects and grades to be tested. However, the accompanying conference 
report does direct that the tests be based on NAEP content and NAEP 
performance standards and be linked to NAEP to the maximum extent 
possible. The Governing Board in August 1996 adopted a policy on NAEP 
redesign. The redesign policy provides for testing at grades 4, 8, and 
12 at the national level in 11 subjects and, based on the needs and 
interests expressed by states, at grades 4 and 8 at the state level in 
reading, writing, mathematics and science.
    Grades 4, 8, and 12 are transition points in American schooling. 
Consistent with the National Assessment redesign policy and the 
congressional directive that the voluntary national tests be designed 
to parallel NAEP, the Governing Board limited the test development 
contract to cover grade 4 reading and grade 8 mathematics. Proficiency 
in these subjects, by these grades, is considered to be fundamental to 
academic success.
    Most importantly, measuring individual student achievement based on 
the National Assessment affords this proposed testing program a unique 
niche among K-12 academic testing programs in the United States. For 30 
years, the National Assessment has reported the status and progress of 
student achievement on nationally representative samples of students. 
It has done so with credibility, technical competence, and widespread 
acceptance. For the last ten years, the National Assessment also has 
reported on state-representative samples of students in volunteering 
states, providing participating states with the only available 
comparable measure of student achievement.
    However, the National Assessment, by law, does not provide 
individual student results. It provides only group-level results (e.g., 
for students overall, by sex, by race, by type of school, etc.). The 
NAEP state-level assessments represented a watershed event. Ten years 
ago, state-level assessments began amid fears of encroachment on 
state and local autonomy and worry that a national curriculum would 
result. The promise that the NAEP state-level assessment program would 
serve a unique function--to provide comparable state results, trends 
over time, and an external validity check for state standards and 
assessments--has been realized. The fears have not. This is because 
there are checks and balances built into the governance of the program.
    Today, similar fears of federal encroachment and the emergence of a 
national curriculum are being expressed about the voluntary national 
test and must be addressed. As with the NAEP state assessments, checks 
and balances can be provided for in the governance and operation of the 
voluntary national testing program to prevent these reasonable concerns 
about federal encroachment and national curricula from becoming 
reality.

Definition of the Term `Voluntary'

    There are two dimensions to the definition of the term 
``voluntary'' as it would apply in the administration of the voluntary 
national tests. The first dimension has to do with the role of the 
federal government. The second dimension has to do with who makes the 
decision to participate in the voluntary national tests.

Federal Role

    The role of the federal government in the proposed voluntary 
national tests should be limited. The federal government should not 
make any individual take the voluntary national tests or require any 
school to administer the tests. The federal government should have no 
control or authority over any data resulting from the administration of 
the voluntary national tests, nor should participation in the voluntary 
national tests be a condition for receiving federal funds.

    The National Assessment Governing Board suggests that Congress and 
the President consider the following as part of the definition for the 
term ``voluntary'':

    The federal government shall not require participation by any 
state, school district, public or private school, organization, or 
individual in voluntary national tests, make participation in 
voluntary national tests a specified condition for receiving federal 
funds, or require participants to report voluntary national test 
results to the federal government.
Rationale
    It is fundamental that the definition of the term ``voluntary'' 
include limits on the role of the federal government. The limits on the 
federal role should be specified in legislation and designed to ensure 
against any encroachment on state, local, and private school autonomy. 
Several witnesses in the Governing Board's public hearings argued that 
the 55 mile-per-hour speed limit was voluntary, too, but became 
universally implemented by states (and in that sense was ``mandatory'') 
because it was a specified condition required to receive federal 
highway funding. The definition of ``voluntary'' provided here would 
foreclose such an outcome. However, it would not foreclose any federal 
grantee from using the voluntary national test to meet a general 
reporting requirement if other options are available as well and could 
be fulfilled validly and appropriately by the voluntary national tests. 
On the one hand, it is not fair to require that the VNT be used. On the 
other hand, it is not fair to foreclose its use if such use occurs 
without coercion and solely at the participant's discretion.

Who Decides To Participate

    Since the federal government will not coerce participation, it will 
be up to others to decide whether to participate. Education governance 
for public schools in the United States, about 88 percent of K-12 
school enrollment, is vested in state and local public authorities. 
Responsibility for the remaining 12 percent of K-12 school enrollment 
resides with private school authorities and parents.
    The definition of ``voluntary'' needs to accommodate a wide range 
and diversity of governance authority. For example, there is great 
variation among state laws in the degree of central authority and 
responsibility for education and the degree of local district autonomy. 
Similarly, there are differences among private schools in how they are 
governed as well as among state laws regarding the oversight of private 
schools and home schooling. While provisions for who decides to 
participate should accommodate this range and diversity of authority, 
such accommodation must be made in a manner that does not conflict with 
state and local law and policy.
    With respect to who decides to participate in voluntary national 
tests, the National Assessment Governing Board suggests that Congress 
and the President consider the following:

    Public and private school authorities should be afforded the 
option to participate in the voluntary national tests. For public 
schools, state and/or local law and policy should determine whether 
the initial decision to participate is made at the state level or at 
the local district level. Where state law or policy provides that 
the initial decision be made at the state level, and the state 
decides not to participate, school districts should be afforded the 
opportunity to decide whether to participate, to the extent 
permitted by state and local law and policy.
    For private schools, the decision to participate should be made 
by the appropriate governing authority.
    Parents may have their children excused from testing as 
determined by state and local law and policy in the case of public 
schools. In the case of private schools, parents may have their 
children excused from testing as determined by the policy of the 
appropriate governing authority.
    Parents whose schools are not participating but want their 
children to take the voluntary national tests should have access to 
the tests either through a qualified individual or testing 
organization before the tests are released to the public or through 
dissemination procedures at no or minimal cost (e.g., public 
libraries and the Internet) after the tests are released to the 
public.
Rationale
    The definition of ``voluntary'' adopted by the Governing Board is 
intended to align with state and local law and policy regarding the 
authority to make decisions about testing. The definition is designed 
to allow for choice in providing the opportunity to participate, but 
without exceeding the authority of the federal government in this 
sensitive area, without coercion by the federal government, and without 
intruding on the prerogatives of states, school districts, private 
schools, and parents.
    Typically, if not universally, determinations about testing are 
made by school authorities, whether state, local, or private (including 
home schools). They determine what should be tested, what grades should 
be tested, the time of year for testing, the content of reports on test 
results and the use of the results. These authorities decide whether 
tests will be taken by all students or by a sample of students. 
Therefore, the definition of ``voluntary'' is designed to account for 
the fact that schools are the most likely venue through which the 
proposed voluntary national tests would be administered and that school 
authorities decide which tests will be given. At the same time, the 
definition of ``voluntary'' recognizes and accommodates the variation 
in responsibility and authority for education governance that exists 
among states and schools.
    School authorities also decide the extent to which official 
policies will provide for parental intervention to have their children 
excused from testing. The definition of ``voluntary'' intends to 
accommodate this variability as well, again, without intruding on local 
prerogatives.
    Finally, the definition of ``voluntary'' recognizes that there 
could be instances in which school authorities decide not to 
participate in the voluntary national tests, but certain parents want 
their children tested. In such cases, parents may elect to have their 
children tested by appropriately licensed or recognized individuals or 
organizations. Because some parents who wish to have their children 
take the test may not have the resources to pay for private testing, 
the test and scoring guides could be made available for free, or at a 
minimal charge, after the period for conducting the testing is 
completed.

Intended Use of the Voluntary National Tests

    The intended use of the voluntary national tests is related to the 
statement of purpose and definition of ``voluntary'' suggested above. 
The Governing Board suggests that Congress and the President consider 
the following as the intended use of the proposed voluntary national 
tests:

    To provide information to parents, students, and authorized 
educators about the achievement of the individual student in 
relation to the content and the rigorous performance standards for 
the National Assessment, as set by the National Assessment Governing 
Board for 4th grade reading and 8th grade mathematics.
Rationale
    The proposed intended use of the voluntary national tests is 
purposely narrow, and appropriately so. Consistent with the purpose 
statement, which is to measure individual student achievement, the 
intended use is to provide information describing the achievement of 
the individual student. Upon receiving the results of the test, 
parents, students and teachers will have an overall measure of the 
individual student's achievement in 4th grade reading or 8th grade 
mathematics. As described in the following section on reporting, they 
will have 
information on the performance standard reached by the student and 
other detailed related information.
    With information in hand from the voluntary national tests and 
other sources about the child and the school program, it is expected 
that: (1) parents could become more involved with the child's 
education, (2) students could study hard and learn more, (3) teachers 
could work more to emphasize important skills and knowledge in the 
subjects tested without narrowing or limiting their curricula, and (4) 
parents, students, and teachers could have a means for better 
communication about the child's achievement.
    While such outcomes can be hoped for, their achievement relies on 
local effort, resources, skill, and persistence. A test and clear 
performance standards are necessary, but not sufficient conditions for 
their achievement. No testing program can determine, ensure, or 
constrain what will be done with the information it provides. However, 
when the values of a society at large are focused on a clear goal 
widely recognized as important, with consistent methods for monitoring 
progress toward that goal, the likelihood that local effort, resources, 
skill and persistence will voluntarily be brought to bear on the 
achievement of that goal is increased.
    The Governing Board does not assume that uses of data from 
voluntary national tests beyond the intended use described above are 
necessarily inappropriate or should be prohibited to states, districts, 
and private schools. Any such additional use of voluntary national test 
data would be done at the discretion of the participating state, 
district, or private school authorities, who would be responsible for 
following appropriate technical standards and validation procedures.
    However, the voluntary national tests are not tied to a preferred 
curriculum, teaching method, or approach. The voluntary national tests 
are based on the content of the National Assessment of Educational 
Progress. The content of each NAEP test is developed by the Governing 
Board through a national consensus process involving hundreds of 
educators, curriculum specialists, school administrators, parents, and 
members of the public. The content of NAEP is designed to assess what 
students know and can do, not how they are taught.
    The voluntary national tests also are not designed to diagnose 
specific learning problems or English language proficiency. Tests for 
such diagnostic purposes are specifically tailored. For example, a test 
of English language proficiency may involve speaking and listening as 
well as reading. A test to diagnose specific learning problems may 
include motor coordination and perception, but may or may not include 
mathematics skills. Tests for the general population, such as the 
voluntary national tests, are inappropriate for these diagnostic 
purposes.
    The voluntary national tests are not intended to be used as the 
sole criterion in making ``high stakes'' decisions (e.g., placement or 
promotion) about individual students. As the National Academy of 
Sciences/National Research Council (NAS/NRC) stated in its report 
``High Stakes: Testing for Tracking, Promotion, and Graduation'':

    Scores from large-scale assessments should never be the only 
sources of information used to make a promotion or retention 
decision * * * Test scores should always be used in combination with 
other sources of information about student achievement * * * 
Students who fail should have the opportunity to retake any test 
used in making promotion decisions; this implies that tests used in 
making promotion decisions should have alternate forms. (p. 12-11).

The NAS/NRC report also recommends against the use of the voluntary 
national test in any high stakes decision for individual students under 
any circumstances, whether in association with other sources of 
information or not. This recommendation is in contrast to the Governing 
Board's suggestion above that any use of the voluntary national test 
beyond the stated intended use must follow technical standards and be 
validated by the participating state, district, or private school 
authorities. The Governing Board recommends that such uses and their 
validation be left to the professional discretion of participating 
states, districts and schools.

Reporting the Results of the Voluntary National Tests

    Consistent with the purpose and intended use of the voluntary 
national tests, the National Assessment Governing Board suggests that 
results of the voluntary national tests be provided separately for each 
student. Parents, students, and authorized educators (those with direct 
responsibility for the education of the student) should receive the 
test results report for the student. Test results for the student 
should be reported according to the performance standards for the 
National Assessment of Educational Progress (NAEP). These are the NAEP 
achievement levels: Basic, Proficient, and Advanced.\1\ All test 
questions, student answers, and an answer key should be returned with 
the test results, so that it will be clear which questions were answered 
correctly and which were not. The achievement levels should be 
explained and illustrated in light of the questions on the test. Also, 
based on the nationally representative sample of students who 
participated in the national tryout of the test the year before, the 
percent of students nationally at each achievement level should be 
provided with the report.
---------------------------------------------------------------------------

    \1\ N.B. In making the determination that the achievement levels 
will be the basis for reporting voluntary national test results, the 
Governing Board is aware that Congress has asked for its response to 
the assertion that the process for setting the levels is ``flawed.'' 
The Governing Board is submitting simultaneously, under separate 
cover, a report describing its response to this assertion and its 
plan for investigating alternative standard-setting methods.
---------------------------------------------------------------------------

    There should be no compilations of student results provided 
automatically by the program. The program should not provide results 
for the nation as a whole or by state, district, school, or classroom, 
since the purpose and use of the testing program are directed at 
individual student level results.
    However, it is virtually certain that compilations of student 
results will be desired and demanded by at least some of the state and 
district participants and possibly by private school participants as 
well. These participants should be permitted to obtain and compile the 
data at their own cost, but they will bear the full responsibility for 
using the data in appropriate ways and for validating the uses they 
make of the data.
    The Governing Board would develop and provide guidelines and 
criteria for use by states, districts, and schools for compiling and 
reporting the data from the voluntary national tests. The guidelines 
and criteria would explicitly require full and clear disclosure about 
exclusions and/or absences from testing, so that results and 
comparisons would be accurately portrayed. Access to the test data by 
external researchers would be made strictly at the discretion of the 
participating state, district, or private school, as it would with any 
other testing program, without prejudice because of federal support for 
the voluntary national test program.

Other Issues

    There are several issues which the Governing Board would be remiss 
not to raise, although they are outside the requirements for this 
report set by Congress and no attempt is made to resolve them here.

[[Page 27525]]

Implementation

    By law, the Governing Board has exclusive authority for test 
development. The Governing Board has been meticulous in staying within 
the law's boundaries. The Governing Board has focused its efforts on 
developing test questions and on associated activities. Appropriately, 
the Governing Board has not taken up implementation issues such as
    • The process by which states, districts and schools commit 
to participate, to what entity the commitment is made, and in what form 
and of what nature the commitment should be
    • How information about the test program and the opportunity 
to participate will be made available to parents, teachers, and 
students
    • Whether and how quality control monitoring of testing 
should occur
    • How printing of test booklets, scoring of student 
responses, and reporting of test results would be handled
    • Whether the testing program should be controlled by a 
federal agency or private commercial interests
    • Whether all or part of the costs for the test program 
should be paid by the federal government

Linking the Voluntary National Tests to NAEP

    Underlying the concept of the proposed Voluntary National Tests (VNT) is 
the desire to measure and report student achievement based on the 
content and rigorous performance standards of NAEP. Indeed, the 
directive from Congress to the Governing Board is to link the VNT to 
NAEP ``to the maximum extent possible.'' Accomplishing this linkage 
presents a significant challenge--one which affects the design of the 
VNT as well as the manner in which data are calculated and reported. 
Two tests can be linked to the degree that they have common 
characteristics, including types of questions, range of content, test 
administration procedures, etc. Thus, the first task facing the 
Governing Board is to forge a close relationship between the two tests 
as the VNT is being created.
    Linking two tests also depends upon the particular statistical 
approach that can be used. Unless a strong statistical procedure can be 
used legitimately, the VNT results cannot be reported directly on NAEP 
scales. In that case, the VNT would have to be reported without direct 
reference to NAEP.
    Solutions to the challenge of linking will evolve as (and if) work 
on the VNT continues. The Governing Board intends to develop options to 
create a good linkage between the VNT and NAEP. If the linkage cannot 
be established, alternative reporting strategies for the VNT will be 
prepared. These alternatives would, of course, be based on NAEP content 
and performance standards to the maximum extent possible.
    These questions of implementation and linking do not need to be 
settled immediately. They will, however, need to be considered and must 
be settled in a timely manner if Congress and the President decide that 
the voluntary national test program should go forward.

Summary

    This report presents the Governing Board's response to the 
congressional assignment to determine the purpose and intended use of 
the proposed voluntary national tests, including the definition of the 
term ``voluntary'' and a description of the achievement levels and 
other means for reporting results. The Governing Board has prepared 
this report over an eight month period that included extensive 
deliberation, expert advice, four regional public hearings and two 
successive periods of public comment (the first to develop the draft 
report, the second to review the draft report).
    Although the legislation requiring the report calls for a 
``determination,'' the Governing Board views this report as advisory. 
Any final determination on these matters would be made in legislation 
enacted by Congress and signed by the President.
    In submitting this report, the Governing Board is neither 
advocating for nor against a voluntary national test. Rather, the 
Governing Board interprets the assignment from Congress to be to 
present a sound and logical case about the potential purpose and use of 
the voluntary national tests.

Recommendation

    The Governing Board is submitting this report in June, three months 
before the required due date of September 30, 1999. This is to assist 
the Congress and the President in deliberations toward a timely 
decision on the future of the voluntary national tests.
    The Governing Board recommends that a decision be made before 
September 30. The schedule for the voluntary national test, if the 
decision is made to proceed, calls for a pilot test in March 2000 of 
test questions developed by the Governing Board. In order for the pilot 
test to be properly carried out in March 2000, a decision is needed 
before September 30, 1999. This will permit the test development 
contractor to proceed in an orderly and efficient manner to carry out 
activities that are essential to the pilot test, such as determining 
the sample of participating schools and arranging for the printing of 
booklets of test questions.
    A decision to proceed that comes too late will set the schedule for 
the pilot test back one year, to March 2001. This is because pilot 
testing must occur in the same month that testing is to occur, which is 
March. If authorization to proceed does not come before September 30, 
it may not be possible to carry out all of the necessary steps that 
lead up to the pilot test in time for it to occur in March 2000.
    If, on the other hand, the decision is made not to proceed, a 
decision prior to September 30 will allow for an orderly and cost-
effective termination of the test development contract.
    It is important to note the purpose of pilot testing: to determine 
the quality of each individual test question. No individual student 
scores are reported, and no overall test scores are calculated, even 
though a student in the pilot test will respond to many test questions. 
Individual questions are evaluated singly. The only data collected are 
statistics that relate to the specific test question, such as the 
percent of students who answered the question correctly. From the 
analysis of student responses on the individual test questions, three 
decisions are possible: drop the test question, keep the test question 
as is, or keep the test question with changes. Only from the set of 
test questions that remain after pilot testing will test booklets be 
constructed, which then will be tried out in field-testing. The field 
test stage, unlike the pilot test, is designed to simulate the plans 
for actual testing. If the decision is made to proceed, a field test 
would be conducted in March 2001.
    The optimal outcome would be to have a timely final decision on 
whether or not there shall be voluntary national tests. Another 
possible outcome would be to have agreement to proceed with the pilot 
test of questions, while continuing to deliberate on the prospects for 
the voluntary national test program itself. If the pilot test proceeds, 
the test questions could be considered for use in the National 
Assessment of Educational Progress, should the ultimate outcome be the 
continuing prohibition of voluntary national tests.

[[Page 27526]]

National Assessment of Educational Progress: Design 2000-2010

    What should the Nation's Report Card on student achievement look 
like during the next decade? How can it most effectively help the 
public understand the academic readiness of our youth at grades 4, 8, 
and 12--key transition points in American education? Ultimately, how 
can the National Assessment of Educational Progress (NAEP) best be used 
as an indicator of national and state educational preparedness for the 
challenges facing our society?
    The purpose of this report to Congress and the President is to 
describe the recommendations of the National Assessment Governing Board 
for answering these questions. The report will provide a summary of the 
Governing Board's policy to redesign the National Assessment, describe 
the status of implementation of the redesign policy, and address the 
implications for reauthorization of the National Assessment of 
Educational Progress.

Background

    In 1996, prompted by increasing demand for more and more frequent 
information about the status and progress of student achievement in the 
United States, the National Assessment Governing Board, an independent, 
bipartisan citizens' group created by Congress to set policy for the 
National Assessment, charted a course for NAEP through the year 2010. 
The policy to redesign the National Assessment followed two years of 
study, expert advice, deliberation by the Governing Board, and public 
comment.
    In 1997, the National Center for Education Statistics (NCES) 
developed a plan to implement the redesign policy. The plan has two 
phases. The first phase covers assessments in the years 1999-2003. In 
1998, NCES awarded new contracts for NAEP covering this period. During 
this first phase, the Governing Board's annual schedule of assessments 
will be carried out (see Table 1), National Assessment student 
achievement data will be released more quickly, National Assessment 
reports will be redesigned for the general public, and research will be 
conducted to foster a streamlined design for the National Assessment. 
The second phase of National Assessment redesign, covering assessments 
for the years 2004-2007, will continue the earlier improvements and 
begin to implement the innovations aimed at streamlining the design of 
NAEP.
    Even as redesign implementation begins under the new contracts, the 
Governing Board continues to weigh new evidence that may bear on the 
shape of the NAEP redesign policy. For example, following the adoption 
of the redesign policy in 1996, there have been evaluation reports 
issued on the National Assessment, reviews by other experts, and papers 
prepared for the November 1998 Ten-Year Anniversary Conference 
sponsored by the Governing Board. The views expressed raise issues or 
concerns that bear on six areas of the redesign policy. The Governing 
Board decided to examine once again these six areas of the redesign 
policy to determine whether any modifications to the policy are in 
order. These six policy areas were reviewed in detail in a forum 
conducted by the Governing Board on April 15 with technical experts, 
consumers of NAEP data, representatives from the National Center for 
Education Statistics and the NAEP contractors. The results of the April 
15 forum are incorporated in this report.

National Assessment Redesign: A Summary and Status Report

Introduction: The Redesign Principles
    Over its thirty-year history, the National Assessment has earned 
respect and credibility. The National Assessment is widely recognized 
for the comprehensiveness of its tests, the quality of its technical 
design, the accuracy of its reports, and innovation in its execution. 
The data produced by the National Assessment are unique. No other 
program provides regular reports on the status and progress of student 
achievement for our nation as a whole and on a comparable state-by-
state basis.
    Although its original purpose was to measure and report on the 
status of student achievement and on change over time, recognition of 
the quality and integrity of the National Assessment led to a multitude 
of demands and expectations beyond reporting on achievement. Those 
expectations were met with good intentions and seemed right for the 
situation at the time. However, some functions that the National 
Assessment performs less effectively were ``tacked on'' to the original 
design.
    The National Assessment was being asked to do too many things, some 
even beyond its reach to do well, and was attempting to serve too many 
audiences. For example, in contrast to the 1970's, when a single 
120-page report on mathematics was deemed sufficient, the 1992 NAEP 
mathematics reports numbered seven and totaled about 1,800 pages.
    The result of attempting to respond to demands beyond NAEP's 
central purpose was to overburden NAEP's design, drive up costs and 
reduce the number of subjects that could be tested. For example, the 
National Assessment tested two or three subjects each year during the 
1970's, its first decade, but only every other year beginning in the 1980's. 
Another indicator that NAEP had too many distractions was that results 
could be released as many as two to three years after testing. This 
simply was not acceptable, particularly with the advent of state-level 
assessments in the 1990's.
    The Governing Board's solution was to focus NAEP on what it does 
best: measure and report on the status of student achievement and 
change over time. Focusing NAEP on what it does best would permit 
NAEP's design to be simplified and also would mean putting limits on 
demands that are outside NAEP's central purpose. Another part of 
focusing NAEP is to define the audience for reports. The Governing 
Board has determined that the NAEP program should not attempt to serve 
multiple audiences directly. The audience for reports should be the 
general public.
    Specialized needs for NAEP data should be accommodated by making 
the NAEP data easily accessible for analysis by others--educators, 
researchers, policymakers, and the media, among others. In order to 
make data more understandable and useful to the general public, the 
Governing Board has determined that achievement levels, or performance 
standards, should be the primary means for reporting NAEP results.
    Thus, five principles undergird the Governing Board's policy for 
the redesign of the National Assessment:
    • Conduct assessments annually, following a dependable 
schedule
    • Focus NAEP on what it does best
    • Define the audience for NAEP reports
    • Report results using performance standards
    • Simplify NAEP's technical design
    Details on these and other aspects of the redesign policy follow.

Annual Schedule

    A centerpiece of the National Assessment redesign is a dependable 
annual schedule of assessments through the year 2010 (Table 1). In the 
past decade, the focus on education reform, new and revised state 
assessments, and the national education goals have led to demand for 
National Assessment testing more frequently than the biennial schedule 
of the 1980's and most of the 1990's. The schedule for the period 1996 
through 2010 was adopted in March 1997 and revised in November 1998. It 
provides for annual assessments

[[Page 27527]]

at the national level and state-level assessments in even-numbered 
years. The long-term trend assessments in reading, writing, 
mathematics, and science continue on a once per four-year cycle 
beginning in 1999.
    At the national level, grades assessed will be 4, 8 and 12. 
Subjects covered will be reading, writing, mathematics, science, 
geography, U.S. history, world history, civics, economics, foreign 
language, and the arts. These are the subjects listed in the current 
national education goals. Reading, writing, mathematics and science 
will be assessed once every four years. Other subjects will be assessed 
less frequently, but there will generally be two assessments in a 
subject over a ten-year period.
    Testing at the state level will occur in even-numbered years, with 
reading and writing in grades 4 and 8 alternating with mathematics and 
science in grades 4 and 8. Student achievement results in these 
subjects and grades at the state level will be reported on a once per 
four-year basis.
    Many of the other redesign policies, described below, are aimed at 
making the annual schedule affordable through cost-saving efficiencies.

                     Table 1.--Schedule for the National Assessment of Educational Progress
 [The following schedule was adopted by the National Assessment Governing Board on March 8, 1997 and revised in
 November 1998. Assessments shown as scheduled for 1996, 1997, and 1998 were approved previously by the Board.]
----------------------------------------------------------------------------------------------------------------
                  Year                                National                              State
----------------------------------------------------------------------------------------------------------------
1996....................................  Mathematics.....................  Mathematics (4, 8).
                                          Science.........................  Science (8).
                                          Long-term trend* (reading,        ....................................
                                           writing, mathematics, science).
1997....................................  Arts (8)........................
1998....................................  Reading.........................  Reading (4, 8).
                                          Writing.........................  Writing (8).
                                          Civics..........................
1999....................................  Long-term trend*.
2000....................................  Mathematics.....................  Mathematics (4, 8).
                                          Science.........................  Science (4, 8).
                                          Reading (4).
2001....................................  U.S. History.
                                          Geography.
2002....................................  Reading.........................  Reading (4, 8).
                                          Writing.........................   Writing (4, 8).
2003....................................  Civics.
                                          FOREIGN LANGUAGE (12).
                                          Long-term trend*.
2004....................................  MATHEMATICS.....................  MATHEMATICS (4, 8).
                                          Science.........................  Science (4, 8).
2005....................................  WORLD HISTORY (12).
                                          ECONOMICS (12).
2006....................................  READING.........................  READING (4, 8).
                                          Writing.........................  Writing (4, 8).
2007....................................  ARTS.
                                          Long-term trend*.
2008....................................  Mathematics.....................  Mathematics (4, 8).
                                          SCIENCE.........................  SCIENCE (4, 8).
2009....................................  U.S. HISTORY.
                                          GEOGRAPHY.
2010....................................  Reading.........................  Reading (4, 8).
                                          WRITING.........................  WRITING (4, 8).
----------------------------------------------------------------------------------------------------------------
Note: Grades 4, 8, and 12 will be tested unless otherwise indicated. Comprehensive assessments are indicated
  in BOLD ALL CAPS; standard assessments are indicated in upper and lower case.
* Long-term trend assessments are conducted in reading, writing, mathematics and science. These assessments
  provide trend data as far back as 1970 and use tests developed by the National Assessment at that time.

Status of Implementation
    The work in the new NAEP contracts covers the schedule as adopted 
by the Governing Board for the years 1999-2003. The long-term trend 
assessments in reading, writing, mathematics, and science will be 
conducted in 1999 and 2003. In 2000, mathematics and science 
assessments will be conducted in grades 4 and 8 at the state level and 
at grades 4, 8, and 12 at the national level. In addition, a reading 
assessment at grade 4 at the national level will be conducted. In 2001, 
geography and U.S. history assessments will be conducted at grades 4, 
8, and 12 at the national level. In 2002, reading and writing 
assessments will be conducted at the state level in grades 4 and 8 and 
at the national level in grades 4, 8, and 12. In 2003, assessments will 
be conducted at the national level in civics in grades 4, 8, and 12 and 
in foreign language at grade 12.

Define the Audience for NAEP Reports

    The expanded demands and expectations noted above reflected the 
many varied audiences that NAEP was attempting to serve. Trying to 
serve too many audiences has meant that no audience is optimally served 
by the National Assessment. The NAEP redesign policy makes the 
distinction between the audience for reports prepared by the NAEP 
program and the users of NAEP data. The audience for NAEP reports is 
the American public. The primary users of NAEP data are national and 
state policymakers, educators, and researchers.
    This distinction in the policy between the audience for reports and 
users of data is important. It is intended to address the needs of 
various groups and

[[Page 27528]]

individuals interested in NAEP results, while providing an appropriate 
division of labor between them and the federal government.
    National Assessment reports released by the U.S. Department of 
Education should be objective, providing the facts about the status and 
progress of student achievement. Providing objective information about 
student achievement is an appropriate federal role. Since the public is 
the primary audience, NAEP reports should be understandable, jargon 
free, easy to use, widely disseminated, and timely.
    On the other hand, the redesign policy suggests that interpreting 
NAEP data (e.g., developing hypotheses about achievement from 
relationships between test scores and background questions) is a role 
that falls primarily to those outside the Department of Education--the 
states that participate in NAEP, policymakers, curriculum specialists, 
researchers, and the media, to name a few. For the NAEP program itself 
to address the myriad of interests and questions of these diverse 
groups seems both impractical and inappropriate. However, the federal 
government should encourage and provide funds for a wide range of 
individuals and organizations with varied interests and perspectives to 
analyze NAEP data and use the results to improve education. This is the 
point of the redesign policy. Thus, the redesign policy provides that 
National Assessment data are to be made available in easily accessible 
forms to support the efforts of states and others to analyze the data, 
interpret results to the public, and improve education performance.
Status of Implementation
    The National Center for Education Statistics is placing a high 
priority on ``highlight'' reports and national report cards for each 
subject, which are aimed at the general public. NAEP data will be 
accessible through a new Internet web site, customized for particular 
data users. Priorities for NAEP secondary analysis grants were revised 
to encourage wider use of NAEP data by national and state policy 
makers, educators, and researchers and to focus the analyses on 
interpretive and education improvement purposes. Also, NCES is 
continuing to develop and provide training on software for analyzing 
NAEP data.

Report Results Using Performance Standards

    In 1988, Congress created the Governing Board and authorized it to 
set performance standards--called achievement levels--for reporting 
National Assessment results. Under the redesign policy, achievement 
levels are to be used as the primary (although not exclusive) means for 
reporting National Assessment results. The achievement levels describe 
``how good is good enough'' on the various tests that make up the 
National Assessment. Previously, the National Assessment reported 
average scores on a 500-point scale. There was no way of knowing 
whether a particular score represented strong or weak performance and 
whether the amount of change from previous years' assessments should 
give cause for concern or celebration. The National Assessment now also 
reports the percentage of students who are performing at or above 
``Basic,'' ``Proficient,'' and ``Advanced'' levels of achievement.
    The achievement levels have been the subject of several independent 
evaluations, some controversy, and conflicting recommendations. 
Recommendations have been carefully considered and some have been used 
to improve the standard-setting procedures. While the current 
procedures are among the most comprehensive used in education, the 
Governing Board remains committed to making continual improvements.
Status of Implementation
    The Governing Board will continue to set achievement levels for 
reporting NAEP results. These achievement levels are to be used on a 
developmental basis until a determination is made that the levels are 
reasonable, valid, and informative to the public. At that point, the 
developmental designation will be removed.
    The Governing Board views standard setting as a judgmental, not a 
scientific, process. However, the process must be conducted in a manner 
that is technically sound and defensible. The Governing Board is 
preparing a report required by Congress to respond to the assertion 
that the process for setting the achievement levels is ``flawed.'' This 
report will include a detailed plan for reviewing the criticisms and 
compliments found in the evaluation reports that studied the 
achievement levels. The plan also will address alternatives to the 
current level-setting procedures.

Simplify the Technical Design for the National Assessment

    The current design of the National Assessment is very complex. The 
redesign policy requires that the research and testing companies that 
compete for the contract to conduct the National Assessment must 
identify options to simplify the design of the National Assessment. 
Examples of NAEP's complexity include: (1) National and state results 
are based on completely separate samples. (2) No student takes the 
complete set of test questions in a subject and as many as twenty-six 
different test booklets are used within a grade; thus scores on NAEP 
are calculated using very sophisticated statistical procedures. (3) 
Students, teachers, and principals complete separate background 
questionnaires, which may be submitted at different times, complicating 
their use in calculating assessment results. (4) The data for every 
background question collected must be compiled before any report can be 
produced, regardless of whether the data from the background question 
will be included in a report, lengthening the time from data collection 
to reporting.
Status of Implementation
    This is a ``work in progress.'' Options for combining the national 
and state samples are being developed by the contractors in 
collaboration with NCES and the Governing Board. Similarly, options to 
reduce the size of the state sample are being considered. An option to 
increase the precision of the state results will be implemented in the 
year 2000 mathematics and science state assessments. Progress also has 
been made in shortening the time between data collection and reporting 
by eliminating the requirement to link certain background 
questionnaires to student achievement data. Plans for a short-form of 
the National Assessment, using a single test booklet, are being 
implemented, with a pilot possibly as early as the year 2000. The 
purpose of the short-form trial is to enable faster initial reporting 
of results and, possibly, to give states access to NAEP assessment 
results in years in which NAEP assessments are not scheduled in 
particular subjects. Plans also are in the development stage for 
improving the quality, relevance, and efficiency of background 
questionnaires.

Measure Student Achievement at Grades 4, 8, and 12

    The primary purpose of the National Assessment is to measure 
student achievement at grades 4, 8, and 12 in academic subjects at the 
state and national level and for subgroups, showing trends over time in 
the percent of students at or above each achievement level. The 
subjects to assess are those listed in the national educational goals--
reading, writing, mathematics, science, U.S. history, geography, world 
history, civics,

[[Page 27529]]

economics, the arts, and foreign language. Grades 4, 8 and 12 are 
considered to be important transition points in American education. 
Reporting by grade is generally thought to be more relevant for policy 
than the reporting by age which was used at NAEP's inception and in 
long-term trend reporting.
    Although grade 12 performance is important as an ``exit'' measure 
from the K-12 system, there are problems with grade 12 results: student 
and school participation rates and student motivation at grade 12 are 
low. The Governing Board has considered whether to change NAEP to 
another grade at the high school level, examining both anecdotal and 
empirical evidence. Anecdotal evidence about the low motivation of high 
school students taking low-stakes tests in the spring of their senior 
year raises serious questions about whether NAEP should test at grade 
12. However, the empirical evidence in 
NAEP does not indicate that switching to grade 11 would result in 
higher motivation on the part of students or greater accuracy in the 
results. In fact, there is some evidence that twelfth graders taking 
NAEP may try harder in some cases than eleventh graders. The redesign 
policy asks the companies that compete for the NAEP contract to find 
ways to increase school and student participation rates and student 
motivation. Until they increase, National Assessment reports should 
include clear caveats about interpreting grade 12 results.
Status of Implementation
    Because the empirical evidence does not warrant a change at this 
time, NAEP should continue to test at grade 12. New NAEP contracts have 
been awarded for the conduct of assessments through the year 2003. The 
contracts are designed to measure student achievement at grades 4, 8, 
and 12; report state, national, and subgroup results; report trends 
over time; and use performance standards for reporting results. Caveats 
for interpreting grade 12 results have been added to reports. However, 
more attention needs to be placed on improving grade 12 participation 
rates and student motivation. Toward this end, NCES is planning a 
series of studies, including NAEP transcript studies, to examine the 
relationship between student achievement and motivation.

What NAEP Is Not Designed To Do

    The NAEP redesign policy attempts to focus NAEP on what it does 
best. What the National Assessment does best is measure student 
achievement. Focusing NAEP on what it does best comes with a related 
idea--recognizing and limiting what NAEP is not designed to do.
    Although the National Assessment is well designed for measuring 
student achievement and trends over time, it is not a good source of 
data for drawing conclusions about or providing explanations for the 
level of performance that is reported. It also is not a measure of 
personal values, a national curriculum, an appropriate means for 
improving instruction in individual classrooms, or a basis for 
evaluating specific pedagogical approaches.
    The National Assessment is what is known as a ``cross-sectional 
survey,'' an effective and cost-efficient means for gathering data on 
student achievement. A cross-sectional survey gathers data at one point 
in time. In the case of NAEP, data are gathered on national and state-
representative samples of students at a particular time during the 
school year. The sample is large enough to permit reasonably accurate 
estimates of subgroup performance (e.g., by sex, race, and ethnicity). 
Change over time can be measured by administering the same survey again 
in later years, under the same testing conditions, with samples of 
students that are similar to the ones tested earlier. Comparisons can 
be made within and across the subgroups and for the whole sample.
    However, a cross-sectional survey cannot provide answers about what 
causes the level of performance that is reported. Measuring the causes 
of achievement would involve an experimental design, with specific 
research questions to answer, pre- and post-testing of students, and 
comparisons of results between groups of students receiving a 
particular educational approach and those that are not. While some may 
view such research as a worthwhile part of NAEP, the need for pre- and 
post-testing alone would double the costs of NAEP testing. Because pre- 
and post-testing would require additional administrative burden on 
schools and more time away from instruction for students, it could 
severely hamper school and student participation rates in NAEP, 
especially with NAEP's annual assessment schedule. Too few schools and 
students in the sample, in turn, would jeopardize NAEP's ability to 
provide national and state-representative student achievement results.
    The best that can be done regarding explanation or interpretation 
of results is to report on background variables that may be associated 
with achievement. However, in many cases, the data from background 
questions collected by NAEP are inconclusive or counter to what one 
would expect. Even where the associations are stronger, the data are 
not adequate for supporting conclusions that explain why achievement is 
at the level reported. Clearly, the use of NAEP background data to 
explain or interpret achievement results should be done with caution.
Status of Implementation
    Under the new NAEP contracts, the collection of background 
information will be more focused. The plan is to collect a well-defined 
core of background information. For example, the well-defined core of 
background information will include the data that are required for 
every assessment--e.g., data on sex, race, ethnicity, whether the 
students are in public or private schools, etc. In addition, each 
assessment will have a set of background questions designed 
specifically for the subject being assessed, with each set being 
determined by policy. Therefore, the background questions for the 
mathematics assessment will vary from those for the science or reading 
assessments.
    The intent is not only to be more purposeful about what is 
collected, but more strategic about how it is collected as well. For 
example, in the past, information on TV watching by students was 
collected regularly as a part of every assessment. In the same year, 
the same background questions could be asked of the students in each 
separate national sample. Clearly, when two or more subjects are 
being assessed in a particular year, it may not be necessary to ask 
identical questions across all of the assessments. Similarly, it may 
not be necessary to ask certain questions every year. In addition, the 
background questions themselves will be pilot tested to reduce the 
possibility of misinterpretation.

Reporting NAEP Results

    The redesign policy provides that National Assessment results 
should be released with the goal of reporting results six to nine 
months after testing. Reports should be written for the American public 
as the primary audience and should be understandable, free of jargon, 
easy to use, and widely disseminated. National Assessment reports 
should be of high technical quality, with no erosion of reliability, 
validity, or accuracy.
    The amount of detail in reporting should be varied. Comprehensive 
reports would be prepared to provide an in-depth look at a subject the 
first time it is assessed using a newly adopted test

[[Page 27530]]

framework, testing many students and collecting background information. 
Although scale scores also will be used, achievement levels shall 
continue to be the primary method for reporting NAEP results. Test 
questions, scoring guides, and samples of students' work that illustrate 
the achievement levels--Basic, Proficient, and Advanced--will receive 
prominence in reports. Data also would be reported by sex, race/
ethnicity, socio-economic status, and for public and private schools; 
other reporting categories also are possible. Standard reports would be 
more modest, providing overall results in the same subject in 
subsequent years using achievement levels and average scores. Data 
could be reported by sex, race/ethnicity, socio-economic status, and 
for public and private schools, but would not be broken down further. 
The amount of background data collected and reported would be somewhat 
limited in comparison to a comprehensive report. Special, focused 
assessments on timely topics also would be conducted, exploring a 
particular question or issue and possibly limited to one or two grades.
Status of Implementation
    The new NAEP contracts provide for faster release of data, 
standards-based reporting, reports that are targeted to the general 
public, and three different kinds of reports: ``comprehensive,'' 
``standard,'' and ``focused.'' The 1998 national reading results were 
released within 11 months of testing; the state results, within 12 months. 
Although still short of the Board's goal of reporting results in 6 to 9 
months following testing, progress is being made.

Simplify Trend Reporting

    The NAEP redesign policy requires the development of a carefully 
planned transition to enable ``the main National Assessment'' to become 
the primary way to measure trends in reading, writing, mathematics and 
science. This is because there are now two NAEP testing programs for 
reading, writing, mathematics and science. The two programs use 
different tests, draw different samples of students (i.e., one based on 
age--9, 13 and 17-year-olds, the other based on grade--4, 8 and 12), 
and report results in two different ways. Not surprisingly, the two 
different programs can yield different results, which complicates the 
presentation and explanation of NAEP results. In addition, this 
redundancy boosts costs, potentially limiting assessments in other 
subjects.
    The first program, referred to as the ``long-term trend 
assessments,'' monitors change in student performance using tests 
developed during the 1960's and 1970's. The sample of students is based 
on age (i.e., 9, 13, and 17-year-olds) for reading, mathematics, and 
science and on grade for writing (i.e., grades 4, 8 and 11). The age-
based samples include students from two or more grades. For example, 
the 9-year-old sample has 3rd, 4th, and 5th grade students. Long-term 
trend assessment results are reported displaying changes over time in 
average scores. The second program, referred to as ``main NAEP,'' uses 
tests developed more recently, reports results by grade, and employs 
performance standards for reporting whether achievement is good enough. 
As an example of the potential for confusion in maintaining two 
separate programs, in 1996 the long-term trend assessment program 
declared mathematics results flat since 1990, while main NAEP reported 
significant gains.
    Some argue against the policy to make main NAEP the primary means 
for monitoring trends. They feel that being able to compare student 
achievement in the 1990's to achievement in the 1970's and 1980's is 
too important to eliminate. Others argue that the long-term trend 
assessments are not relevant for policy makers. This is because these 
assessments primarily use a sample based on the students' age rather 
than on the students' grade, the content of the tests is simpler, there 
is no standards-based reporting, and the results at times conflict with 
main NAEP.
Status of Implementation
    This is a ``work in progress.'' The National Center for Education 
Statistics is just beginning to develop options for making the 
transition from long-term trend to main NAEP as the primary means for 
monitoring trends in achievement. Identifying options that are 
practical, affordable, and technically feasible will take time. The 
Governing Board has scheduled long-term trend assessments to be 
conducted in 1999, 2003, and 2007. This will afford adequate time to 
evaluate the viability of the options that may be proposed and at the 
same time maintain the long-term trend line. The immediate effect is to 
change the schedule for this part of the testing program from once 
every two years to once every four years.

Keep NAEP Assessment Frameworks Stable

    The NAEP redesign policy states that assessment frameworks shall 
remain stable for at least ten years. The purpose is three-fold: to 
provide for measuring trends in student achievement, to allow for 
change to frameworks when the case for change is compelling, and to 
manage costs.
    By law, National Assessment frameworks are developed by the 
Governing Board through a national consensus process involving hundreds 
of teachers, curriculum experts, state and local testing specialists, 
administrators, and members of the public. The assessment frameworks 
describe how an assessment will be constructed, provide for the subject 
area content to be covered, determine what will be reported, and 
influence the cost of an assessment.
    Both current practice and important developments in each subject 
area are considered: How much algebra should be in the 8th grade 
mathematics assessment? Should there be both multiple-choice and 
constructed-response items and, if so, what is the appropriate mix? How 
much of what is measured should students know and be able to do? The 
frameworks receive wide public review before adoption by the Governing 
Board.
Status of Implementation
    The Governing Board is solely responsible for developing and 
approving assessment frameworks and has been adhering to its policy of 
keeping the frameworks stable. With a decision to be made this year 
about whether to conduct a national consensus process for the 2004 
mathematics assessment, the Governing Board is beginning to examine 
criteria for determining when a new framework is necessary. An 
important factor will be the impact of changing the framework on the 
measurement of trends in student achievement.

Use International Comparisons

    The NAEP redesign policy states that National Assessment 
frameworks, test specifications, achievement levels, and data 
interpretations shall take into account, where feasible, curricula, 
standards, and student performance in other nations, and promote 
studies to ``link'' the National Assessment with international 
assessments.
    The National Assessment is, and should be, an assessment of student 
achievement in the United States. It should be focused on subjects and 
content deemed important for the U.S. through the national consensus 
process used to develop NAEP frameworks. However, decisions on content, 
achievement levels, and interpretation of NAEP results, where feasible, 
should be informed, in part, by the expectations for education set by 
other industrialized

[[Page 27531]]

countries, and comparative test results. Although there are technical 
hurdles to overcome, consideration of such information can be useful in 
determining ``how good is good enough'' in an assessment for U.S. 
students.
Status of Implementation
    The National Center for Education Statistics conducted a linking 
study of the 1996 NAEP science and mathematics assessments with the 
1995 Third International Mathematics and Science Study (TIMSS). The 
Governing Board used information from this linking study in setting 
the achievement levels for the 1996 science assessment. NCES will be 
conducting TIMSS again in the spring of 1999, and thirteen states have 
agreed to participate to collect state-representative TIMSS data. NCES 
will be applying a methodology for relating TIMSS to NAEP and will be 
evaluating the strength of the relationship.

Use Innovations in Measurement and Reporting

    The NAEP redesign policy states that the National Assessment shall 
assess, and, where warranted, implement advances related to technology 
and the measurement and reporting of student achievement. In addition, 
the competition for NAEP contracts for assessments beginning around the 
year 2000 shall include a plan for conducting testing by computer in at 
least one subject and grade and for using technology to improve test 
administration, measurement, and reporting.
Status of Implementation
    The newly awarded NAEP contracts include plans for a short-form 
test (described above) in 4th grade mathematics in the year 2000 and 
for the development of a computer-based assessment.

Help States and Others Link To NAEP and Use NAEP Data To Improve 
Education Performance

    The NAEP redesign policy states that the National Assessment shall assist 
states, districts and others, who want to do so at their own cost, to 
link their test results to the National Assessment. The policy also 
provides that NAEP shall be designed to permit access and use by others 
of NAEP data and materials. These include frameworks, specifications, 
scoring guides, results, questions, achievement levels, and background 
data. In addition, the policy provides that steps be taken to protect 
the integrity of the NAEP program and the privacy of individual test 
takers.
Status of Implementation
    The State of Maryland and the State of North Carolina have 
collaborated with the Governing Board on studies to examine the content 
of their respective state mathematics tests in light of the content of 
NAEP. The National Center for Education Statistics has a special grants 
program that provides funds to analyze NAEP data. NCES has amended the 
priorities for this grants program to encourage applications from 
states (and others) to conduct analyses that will be of practical 
benefit in interpreting NAEP results and in improving education 
performance. 
The National Academy of Sciences report ``Uncommon Measures'' 
describes the many technical difficulties involved in linking state 
results to NAEP. The NCES is planning a major conference with the 
states to provide a forum for discussing and addressing these 
difficulties. In addition, NCES is planning to conduct studies on 
various linking methodologies to provide insight on how the linking of 
NAEP and state assessments may best be done.

National Assessment Redesign: Implications for Reauthorization

    The Governing Board's redesign policy is directed at the operation 
of the National Assessment program. It does not address governance of 
the National Assessment. While there are a number of areas in the 
current NAEP legislation for which change should be considered, the 
NAEP redesign policy can, with two exceptions, be implemented within 
the current NAEP legislation.
    The first exception has to do with the subjects to assess. Current 
law ties the subjects covered by NAEP to reading, and the other 
subjects listed in the national education goals. The Governing Board 
agrees that these subjects should be assessed by the National 
Assessment and, accordingly, has adopted the schedule displayed in 
Table 1 above. However, the national education goals are about to 
expire. The Governing Board recommends that, with respect to subjects 
to assess, the reauthorization of the National Assessment should be 
consistent with the schedule of assessments adopted by the Governing 
Board.
    The second issue has to do with long-term trend assessments. 
Current law requires that assessments using age-based samples be 
conducted at least once every two years. Since the only assessments 
using age-based samples are the reading, science and mathematics long-
term trend assessments, this provision is interpreted as requiring 
long-term trend assessments once every two years. In accordance with 
the schedule of assessments, the Governing Board recommends that the 
NAEP legislation be modified so that the frequency of the long-term 
trend assessments is changed to at least once every four years.

Conclusion

    The National Assessment in the next century will provide student 
achievement results at the national level each year. State-level data 
will be provided every other year. Student achievement in reading, 
writing, mathematics and science will, appropriately, receive the most 
attention, with testing once every four years, but not to the exclusion 
of other important subjects. By continuing to report results using 
achievement levels and improving the process by which achievement 
levels are set, the National Assessment will help advance standards-
based assessment and reporting in the United States. With a focus on 
its core purpose--measuring and reporting on the status of student 
achievement and change over time--the National Assessment design can be 
made more streamlined, more effective, and more efficient. With a clear 
sense of its primary audience--the general public--National Assessment 
reports will have more impact.
    With a predictable schedule of assessments and reporting of 
National Assessment results, the public at regular intervals will 
discuss and debate education quality, states can plan ahead for their 
participation, and educators will have an external standard against 
which to compare their own efforts.
    Additional Information: Written comments must be received by June 
9, 1999 at the following address: Mark D. Musick, Chairman (Attention: 
Ray Fields), National Assessment Governing Board, 800 North Capitol 
Street NW, Suite 825, Washington, DC 20002-4233.
    Written comments also may be submitted electronically by sending 
electronic mail (e-mail) to [email protected] by June 9, 1999. 
Comments sent by e-mail must be submitted as an ASCII file avoiding the 
use of special characters and any form of encryption. Inclusion in the 
public record cannot be guaranteed for written statements, whether sent 
by mail or electronically, received after June 9, 1999.
    Public Record: A record of comments received in response to this 
notice will be available for inspection from 8 a.m. to 4:30 p.m., 
Monday through Friday, excluding legal holidays, in Suite 825,

[[Page 27532]]

800 North Capitol Street, NW, Washington, DC, 20002.

    Dated: May 17, 1999.
Roy Truby,
Executive Director, National Assessment Governing Board.
[FR Doc. 99-12746 Filed 5-19-99; 8:45 am]
BILLING CODE 4000-01-M