Census 2000: Design Choices Contributed to Inaccuracy of Coverage
Evaluation Estimates (12-NOV-04, GAO-05-71).			 
                                                                 
Evaluations of past censuses show that certain groups were	 
undercounted compared to other groups, a problem known as	 
"coverage error." To address this, the Census Bureau included in 
its 2000 Census design the Accuracy and Coverage Evaluation	 
Program (A.C.E.) to (1) measure coverage error and (2) use the	 
results to adjust the census, if warranted. However, the Bureau  
found the A.C.E. results inaccurate and decided not to adjust or 
plan for adjustment in 2010. Congress asked GAO to determine (1) 
factors contributing to A.C.E.'s reported failure to accurately  
estimate census coverage error, and (2) the reliability of the	 
revised coverage error estimates the Bureau subsequently	 
produced. To do this, GAO examined three sets of Bureau research 
published in March 2001, October 2001, and March 2003 and	 
interviewed Bureau officials.					 
-------------------------Indexing Terms------------------------- 
REPORTNUM:   GAO-05-71						        
    ACCNO:   A13553						        
  TITLE:     Census 2000: Design Choices Contributed to Inaccuracy of 
Coverage Evaluation Estimates					 
     DATE:   11/12/2004 
  SUBJECT:   Census						 
	     Computer matching					 
	     Data collection					 
	     Data integrity					 
	     Errors						 
	     Evaluation methods 				 
	     Program evaluation 				 
	     Statistical data					 
	     2000 Decennial Census				 
	     Census Bureau Accuracy and Coverage		 
	     Evaluation Program 				 
                                                                 
	     1990 Decennial Census				 
	     2010 Decennial Census				 

******************************************************************
** This file contains an ASCII representation of the text of a  **
** GAO Product.                                                 **
**                                                              **
** No attempt has been made to display graphic images, although **
** figure captions are reproduced.  Tables are included, but    **
** may not resemble those in the printed version.               **
**                                                              **
** Please see the PDF (Portable Document Format) file, when     **
** available, for a complete electronic file of the printed     **
** document's contents.                                         **
**                                                              **
******************************************************************
GAO-05-71

     

     * Report to Congressional Requesters
          * November 2004
     * CENSUS 2000
          * Design Choices Contributed to Inaccuracy of Coverage Evaluation
            Estimates
     * Contents
          * Results in Brief
          * Background
          * Scope and Methodology
          * Design and Other Bureau Decisions Created Difficulties and Blind
            Spots in Census Coverage Evaluation
               * Bureau Attributes Inaccuracy of Coverage Error Estimates to
                 Residence Rules That Did Not Capture Complexity of U.S.
                 Society
               * Other Design Decisions Also Created Blind Spots in Census
                 Coverage Evaluation
               * Bureau Has Not Fully Accounted for How Design Decisions
                 Affected Coverage Error Estimates
          * The Bureau Has Not Produced Reliable Revised Estimates of
            Coverage Error for the 2000 Census
               * Bureau's Conclusion That A.C.E. Estimates Are Unreliable Is
                 Based on Results That Are Unreliable Themselves
               * The Bureau Has Not Clearly Quantified and Reported the Full
                 Impact of the Methodological Limitations on the Revised
                 Estimates
               * Unexpected Differences in Patterns of Coverage Error between
                 the Housing and the Population Count Further Call into
                 Question Reliability of Revised Estimates
               * Bureau Reports Revised Estimates Resulted in Lessons Learned
          * Conclusions
          * Recommendations for Executive Action
          * Agency Comments and Our Evaluation
     * Comments from the Department of Commerce
     * Bureau Design Decisions Increased Imputations
     * Bureau Estimates of Population and Housing Unit Net Undercounts
     * Glossary
          * Accuracy and Coverage Evaluation
          * Census Adjustment
          * Count of Housing
          * Count of Population
          * Coverage Error
          * Coverage Evaluation
          * Duplicates
          * Residence Rules
     * Related GAO Products
     * http://www.gao.gov

                 United States Government Accountability Office

Report to Congressional Requesters

GAO

November 2004

CENSUS 2000

Design Choices Contributed to Inaccuracy of Coverage Evaluation Estimates


  What GAO Found

According to senior Bureau officials, increasingly complicated social
factors, such as extended families and population mobility, presented
challenges for A.C.E., making it difficult to determine exactly where
certain individuals should have been counted, thus contributing to the
inaccuracy of the coverage error estimates. For example, parents in
custody disputes both may have an incentive to claim their child as a
resident, but the Bureau used rules for determining where people should be
counted (residence rules) that did not account for many of these kinds of
circumstances. Other design decisions concerning both A.C.E. and the
census also may have created "blind spots" that contributed to the
inaccuracy of the estimates (see figure).

The Bureau has not accounted for the effects of these or other key design
decisions on the coverage error estimates, which could hamper the Bureau's
efforts to craft a program that better measures coverage error for the
next national census.

Factors Potentially Affecting Accuracy of Coverage Error Estimates at Different
Points in the A.C.E. Program

     o Bureau used rules that did not fully account for social complexities.
     o Bureau relied on follow-up interviews to clarify unclear survey
       responses, but interviewees often provided information that further
       complicated, rather than clarified, the data.
     o Bureau left some populations out of A.C.E. sample survey.
     o Bureau limited geographic scope of searching for matches.
     o Bureau removed certain records from the census data being matched, but
       not from the census.
     o Limitations in revision methodology raised questions about usefulness
       of revised estimates.

Source: GAO.

Despite having twice revised A.C.E.'s original coverage error estimates,
the Bureau has no reliable estimates of the extent of coverage error for
the 2000 census. While both revisions suggested that the original
estimates were inaccurate, in the course of thoroughly reviewing the
revisions, the Bureau documented (1) extensive limitations in the revision
methodology and (2) an unexpected pattern between the revised estimates
and other A.C.E. data, both of which indicated that the revised coverage
error estimates may be questionable themselves. Furthermore, when the
Bureau published the revised estimates, it did not clearly quantify the
impact of these limitations for readers, thus preventing readers from
accurately judging the overall reliability of the estimates. It is
therefore unclear how A.C.E. information will be useful to the public or
policymakers, or how the Bureau can use it to make better decisions in the
future.


Contents

  Letter
       Results in Brief
       Background
       Scope and Methodology
       Design and Other Bureau Decisions Created Difficulties and Blind
         Spots in Census Coverage Evaluation
       The Bureau Has Not Produced Reliable Revised Estimates of Coverage
         Error for the 2000 Census
       Conclusions
       Recommendations for Executive Action
       Agency Comments and Our Evaluation

  Appendixes
       Appendix I: Comments from the Department of Commerce
       Appendix II: Bureau Design Decisions Increased Imputations
       Appendix III: Bureau Estimates of Population and Housing Unit Net
         Undercounts

  Glossary

  Related GAO Products

  Tables
       Table 1: Estimated Percent Population Undercounts for Selected
         Race/Origin Groups
       Table 2: Estimated Percent Occupied Housing Undercounts Differ from
         Population Undercounts for Selected Race/Origin Groups

  Figures
       Figure 1: A.C.E. Sample Operations Paralleled the Census
       Figure 2: A.C.E. Sample Excluded Individuals and Records from Scope of
         Evaluation
       Figure 3: A.C.E. Matching Results Lacked Data on Suspected Duplicates
         That Were in the Census
       Figure 4: Reported Undercount Estimates Generally Decreased with
         A.C.E. Revisions
       Figure 5: Population and Housing Undercounts

  Abbreviations
       A.C.E.  Accuracy and Coverage Evaluation
       NAS     National Academy of Sciences

This is a work of the U.S. government and is not subject to copyright
protection in the United States. It may be reproduced and distributed in
its entirety without further permission from GAO. However, because this
work may contain copyrighted images or other material, permission from the
copyright holder may be necessary if you wish to reproduce this material
separately.


United States Government Accountability Office Washington, D.C. 20548

November 12, 2004

The Honorable Henry A. Waxman
Ranking Minority Member
Committee on Government Reform
House of Representatives

The Honorable Danny K. Davis
Ranking Minority Member
Subcommittee on Civil Service and Agency Organization
Committee on Government Reform
House of Representatives

The Honorable Wm. Lacy Clay, Jr.
Ranking Minority Member
Subcommittee on Technology, Information Policy, Intergovernmental
Relations and the Census
Committee on Government Reform
House of Representatives

The Honorable Charles A. Gonzalez
The Honorable Carolyn B. Maloney
House of Representatives

A decennial census must be as accurate as possible, because census results
are used to, among other purposes, apportion congressional seats, redraw
congressional districts, and allocate federal aid to state and local
governments. However, given the nation's size and demographic complexity,
some amount of error is inevitable. Unfortunately, evaluations of past
censuses have shown that certain groups, for example African-Americans and
Hispanics, have been undercounted in comparison to other groups. To
estimate the extent to which some groups were over- or undercounted in
2000 (what the Bureau refers to as "coverage error"), the Census Bureau
(Bureau) planned and implemented the Accuracy and Coverage Evaluation
(A.C.E.) program. The primary goals of A.C.E. were to

     o more accurately estimate the rate of coverage error via a sample
       survey of select areas nationwide, and
     o if warranted, to use the results of this survey to adjust census
       estimates of the population for nonapportionment purposes.

In March 2003, after much deliberation and research, the Bureau decided
not to use any A.C.E. estimates of coverage error to adjust the 2000
Census, because it judged these estimates no more accurate than the
official census data. The Bureau found that A.C.E. did not account for at
least 3 million erroneously counted persons (mostly duplicates, or people
counted twice) in the census, which raised questions about the reliability
of the coverage error estimates. Furthermore, because of the difficulties
the Bureau experienced in trying to produce reliable coverage error
estimates, it announced officially in January 2004 that it did not plan to
develop a procedure for adjusting the 2010 Census results for
redistricting purposes. Agency officials said that in light of their past
experiences, they do not think they can produce reliable coverage error
estimates in time to meet deadlines for adjusting the census.

This report responds to your request that we review why A.C.E. coverage
error estimates were reportedly not sufficiently reliable to adjust or
validate the 2000 Census. Specifically, our review examined (1) factors
contributing to A.C.E.'s reported failure to accurately estimate census
coverage error, and (2) the reliability of the revised coverage error
estimates the Bureau subsequently produced.

To meet the objectives of this report, we reviewed and analyzed the
Bureau's publicly available research data and reports on the 2000 Census.
We also reviewed methodology documents and other available information
such as the minutes and supporting documents of the Executive Steering
Committees for Adjustment Policy. Finally, we discussed the results of our
analysis with senior Bureau officials and interviewed Bureau officials and
committee members to obtain their views on the process. Since our focus
was on the process and decisions that led to the results rather than on
determining the underlying numbers themselves, we did not audit the
Bureau's research, the underlying data, or its conclusions. Our work was
performed in Washington, D.C., and at the U.S. Census Bureau headquarters in
Suitland, Maryland, from December 2002 through July 2004 in accordance
with generally accepted government auditing standards.

  Results in Brief

The Bureau attributes the inaccuracy of A.C.E. estimates primarily to the
rules it used for determining where people should be counted (residence
rules), which officials say did not fully capture the increasing complexity
of American society. For example, parents in custody disputes both may
have an incentive to claim their child as a member of their respective
household, but the Bureau's residence rules did not always account for
these kinds of circumstances, so accurately counting such individuals
was difficult. While the Bureau emphasizes residence rules, our research
indicates that some of the Bureau's other design decisions concerning both
A.C.E. and the census created several "blind spots" in A.C.E. that also
may have contributed to the unreliability of coverage error estimates. For
example, the Bureau decided to exclude from A.C.E. people living in group
quarters such as college dorms, because their high mobility made them
difficult to count during the 1990 coverage evaluation, and this weakened
the 1990 coverage error estimates. In addition, the Bureau removed 2.4
million records it suspected were duplicates from the census population
before A.C.E. had a chance to measure them, and then reinstated them after
A.C.E. was finished, so that A.C.E. was blind to over 1 million duplicates
(according to Bureau research) in that reinstated group. However, because
the Bureau has not accounted for and clearly reported how these and other
key design decisions affected the outcome of A.C.E., the full range of
reasons why its estimates were unreliable remains obscure. This in turn
could hamper the Bureau's efforts to craft a more successful coverage
measurement program that better measures the accuracy of the next national
census.

The Bureau's revised coverage error estimates cannot necessarily be
interpreted as any more reliable than the original estimates. In the
course of an extensive review of A.C.E.'s results over a 3-year period,
the Bureau revised its estimates of A.C.E. coverage error twice, first in
October 2001 and again in March 2003, and both revisions suggested that
the original estimates were inaccurate. However, in reviewing the revised
estimates, the Bureau documented (1) extensive limitations in the revision
methodology and (2) an unexpected pattern between the revised estimates
and other data, both of which indicated that the revised estimates may be
questionable themselves. Furthermore, while the Bureau described the
aforementioned limitations in detail in its review documents, when it
published the revised estimates of coverage error, it did not clearly
quantify their impact for readers. It is therefore unclear how these
revised estimates will be useful to the public or policymakers, or how the
Bureau will use them to make better decisions in the future.

As a result, we are recommending that the Bureau's future evaluation
planning take into account the potential effect of future decisions
relating to census or coverage evaluation design to avoid similar or other
problems in the 2010 measure of coverage error. We also recommend that the
Bureau clearly report how any significant changes in the design of the
census and/or A.C.E. might affect the accuracy of the published coverage
error estimates for 2010. Similarly, we recommend that the Bureau not
only identify methodological limitations, as it currently does, but also
report the potential range of impacts that methodological limitations and
design changes may have on the census coverage error estimates it
publishes in the future.

The Under Secretary for Economic Affairs forwarded written comments from
the Department of Commerce on a draft of this report (see app. I). The
Department concurred with our recommendations but took exception to
several statements in our draft report, writing that it thought some of
our conclusions were misleading. In response, we revised the report, as
appropriate, including additional information provided by the Department
and clarifying the presentation of our analyses. We discuss the
Department's comments more fully in the section "Agency Comments and Our
Evaluation."

  Background

Since 1980, the Bureau has used statistical methods to generate detailed
estimates of census undercounts and overcounts, including those of
particular ethnic, racial, and other groups. To carry out the 2000
Census's Accuracy and Coverage Evaluation program (A.C.E.), the Bureau
conducted a separate and independent sample survey that, when matched to
the census data, was to enable the Bureau to use statistical estimates of
net coverage errors to adjust final census tabulations according to the
measured undercounts, if necessary. The Bureau obligated about $207
million to its coverage evaluation program from fiscal years 1996 through
2001, which was about 3 percent of the $6.5 billion total estimated cost
of the 2000 Census.

While the A.C.E. sample survey of people was conducted several weeks after
Census Day (April 1, the "as of" date on which the total population is to
be counted), many of the processes were the same as those of the 2000
Census. For the census, the Bureau tried to count everybody in the nation,
regardless of their dwelling, and certain kinds of dwellings, including
single-family homes, apartments, and mobile homes, and to collect
demographic information on the inhabitants. For A.C.E., the Bureau surveyed
about 314,000 housing units in a representative sample of "clusters,"
geographic areas each with about 30 housing units. The sample comprised
roughly 12,000 of the approximately 3 million clusters nationwide.
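The sample figures above can be cross-checked with simple arithmetic. The
sketch below uses only the rounded numbers quoted in this report; the variable
names are illustrative, and the results are approximations rather than Bureau
statistics.

```python
# Rounded figures quoted in the report text above.
housing_units_surveyed = 314_000
clusters_sampled = 12_000
clusters_nationwide = 3_000_000

# Share of all clusters that fell into the A.C.E. sample.
cluster_sampling_fraction = clusters_sampled / clusters_nationwide

# Average housing units per sampled cluster implied by the rounded totals.
avg_units_per_sampled_cluster = housing_units_surveyed / clusters_sampled

print(f"Clusters sampled: {cluster_sampling_fraction:.2%}")
print(f"Housing units per sampled cluster: {avg_units_per_sampled_cluster:.1f}")
```

The implied average of roughly 26 housing units per sampled cluster is in line
with the "about 30" cited, given that all of the inputs are rounded.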

As illustrated in figure 1, the Bureau used a similar process to develop
address lists, collect response data, and tabulate and disseminate data:
one for the decennial census and one for A.C.E. sample areas. For the
Census, the Bureau mailed out forms for mail-back to most of the housing
units in the country; hand-delivered mail-back forms to most of the rest
of the country; and then carried out a number of follow-up operations
designed to count nonrespondents and improve data quality.

A.C.E. collected response data through interviewing from April 24 through
September 11, 2000.

Figure 1: A.C.E. Sample Operations Paralleled the Census

[Figure not reproduced. Its two panels were titled "Census Operations" and
"A.C.E. Operations."]

                     Source: U.S. Census Bureau documents.

After the census and A.C.E. data collection operations were completed, the
Bureau attempted to match each person counted on the A.C.E. list to the
list of persons counted by the 2000 Census in the A.C.E. sample areas to
determine exactly which persons had been missed or counted more than once
by either A.C.E. or the Census. The results of the matching process, along
with data on the racial/ethnic and other characteristics of persons
compared, were to provide the basis for A.C.E. to estimate the extent of
coverage error in the census and population subgroups and enable the
Bureau to adjust the final decennial census tabulations accordingly. The
matching process needed to be as precise and complete as possible, since
A.C.E. collected data on only a sample of the nation's population, and
small percentages of matching errors could significantly affect the
estimates of under- and overcounts generalized to the entire nation.
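The matching step described above can be illustrated with a minimal sketch.
It assumes each person has already been reduced to a single comparable
identifier, which is a simplification: the report does not describe the
Bureau's actual matching rules or software, and the function name, field
names, and sample identifiers below are hypothetical.

```python
from collections import Counter


def classify(census_ids, ace_ids):
    """Compare the census list and the A.C.E. list for one sample area.

    The identifiers are hypothetical stand-ins for whatever person-level
    fields a real matching operation would compare.
    Returns (matched, census_only, ace_only, census_duplicates) as sets.
    """
    census_counts = Counter(census_ids)
    census_set = set(census_counts)
    ace_set = set(ace_ids)

    matched = census_set & ace_set        # found on both lists
    census_only = census_set - ace_set    # in the census, not found by A.C.E.
    ace_only = ace_set - census_set       # found by A.C.E., not in the census
    census_duplicates = {pid for pid, n in census_counts.items() if n > 1}
    return matched, census_only, ace_only, census_duplicates


# Toy data: "p2" appears twice in the census; "p4" was found only by A.C.E.
census = ["p1", "p2", "p2", "p3", "p5"]
ace = ["p1", "p2", "p3", "p4"]
matched, census_only, ace_only, dups = classify(census, ace)
print(sorted(matched), sorted(census_only), sorted(ace_only), sorted(dups))
```

Because the results for a sample are generalized to the entire nation, each
misclassified record in a step like this stands in for many people in the
final estimates, which is why the report stresses precision in matching.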

Since the 2000 Census, we have issued four other reports on A.C.E.,
addressing its cost and implementation as part of our ongoing series on
the results of the 2000 Census, as well as a report on the lessons learned
for planning a more cost-effective census in 2010. (See the Related GAO
Products section at the end of this report for the assessments issued to
date.) These reports concluded, among other things, that while (1) the
address list the Bureau used for the A.C.E. program appeared much more
accurate than the preliminary lists developed for the 2000 Census and
(2) quality assurance procedures were used in the matching process,
certain implementation problems had the potential to affect subsequent
matching results and thus estimates of total census coverage error.1

1 U.S. General Accounting Office, 2000 Census: Coverage Evaluation
Interviewing Overcame Challenges, but Further Research Needed, GAO-02-26
(Washington, D.C.: Dec. 31, 2001); and 2000 Census: Coverage Evaluation
Matching Implemented as Planned, but Census Bureau Should Evaluate Lessons
Learned, GAO-02-297 (Washington, D.C.: Mar. 14, 2002).

In the end, the Bureau decided not to use A.C.E.'s matching process
results to adjust the 2000 Census. In March 2001, a committee of senior
career Bureau officials recommended against using A.C.E. estimates of
census coverage error to adjust final census tabulations for purposes of
redistricting Congress. In October 2001, the committee also recommended
against adjusting census data used for allocating federal funds and other
purposes, largely because Bureau research indicated that A.C.E. did not
account for at least 3 million erroneously counted persons (mostly
duplicates) in the census, raising questions about the reliability of
coverage error estimates. In March 2003, after considerable additional
research, the Bureau published revised coverage error estimates and again
decided not to adjust official census data, this time for the purpose of
estimating the population between the 2000 and 2010 censuses.

In light of its 2000 experience, the Bureau officially announced in
January 2004 that while it plans to fully evaluate the accuracy of the
2010 census, it will not develop plans for using these coverage error
estimates to adjust the 2010 Decennial Census. Bureau officials have told
us that there is insufficient time to carry out the necessary evaluations
of the coverage estimates between census field data collection and the
Bureau's legally mandated deadline (within 12 months of Census Day) for
releasing redistricting data to the states. Furthermore, the Bureau does
not believe adjustment is possible. In responding to an earlier GAO report
recommending that the Bureau "determine the feasibility" of adjusting the
2010 Census, the Bureau wrote that the 2000 Census and A.C.E. was "a
definitive test of this approach" which "provided more than ample evidence
that this goal cannot be achieved."2 However, in March 2004, the National
Academy of Sciences (NAS) published a report that recommended that the
Bureau and the administration request and Congress provide funding for an
improved coverage evaluation program that could be used as a basis for
adjusting the census, if warranted.3 The Academy agrees with the Bureau
that 12 months is insufficient time for evaluation and possible
adjustment; in the same publication, NAS recommended Congress consider
extending the statutory deadline of 12 months for providing data for
redistricting purposes, a suggestion which, if appropriate, could make
adjustment possible.

2 GAO-04-37, pp. 43-44.

3 Constance F. Citro, Daniel L. Cork, and Janet L. Norwood, eds., The 2000
Census: Counting under Adversity (Washington, D.C.: The National Academies
Press, 2004).

  Scope and Methodology

To identify the factors that may have contributed to A.C.E. missing
coverage errors in the census, we reviewed evaluations of A.C.E. and the
Bureau's subsequent revisions to its estimation methodology, as well as
changes made to the design from its 1990 attempts to estimate coverage. We
interviewed Bureau officials responsible for A.C.E. decision making to
obtain further context and clarification. We did not attempt to identify
all factors contributing to the success or failure of A.C.E. in estimating
coverage error. Since our focus was on the process and decisions that led
to the results rather than on determining the underlying numbers
themselves, we did not audit the Bureau's research, the underlying data,
or its conclusions. We relied on the Bureau's own reporting quality
assurance processes to assure the validity and accuracy of its technical
reporting, and thus we did not independently test or verify individual
Bureau evaluations of its methodologies.

To identify the extent of the census errors not accounted for by A.C.E.,
we reviewed the descriptive coverage error estimates and the limitations
and context of these data as described in the reports the Bureau published
in March 2001, October 2001, and March 2003.

On August 9, 2004, we requested comments on the draft of this report from
the Secretary of Commerce. On September 10, 2004, the Under Secretary for
Economic Affairs, Department of Commerce, forwarded written comments from
the department (see app. I), which we address in the "Agency Comments and
Our Evaluation" section at the end of this report.

  Design and Other Bureau Decisions Created Difficulties and Blind Spots in
  Census Coverage Evaluation

The following Bureau decisions concerning the design of the census and the
A.C.E. program created difficulties and blind spots for the coverage
evaluation, possibly preventing A.C.E. from reliably measuring coverage
error: (1) using residence rules that were unable to capture the
complexity of American society, (2) excluding the group quarters
population from the A.C.E. sample survey, (3) making various decisions that
led to an increase
in the number of "imputed" records in the census, (4) removing 2.4 million
suspected duplicate persons from the census but not the A.C.E. sample, and
(5) reducing the sample area wherein A.C.E. searched for duplicates during
matching. However, the Bureau has not accounted for how these design
decisions have affected coverage error estimates, which has prevented it
from pinpointing what went wrong with A.C.E., and this in turn could
hamper its efforts to craft a more successful coverage measurement program
for the next national head count.

    Bureau Attributes Inaccuracy of Coverage Error Estimates to Residence
    Rules That Did Not Capture Complexity of U.S. Society

Bureau officials attribute A.C.E.'s inaccuracy primarily to the fact that
it used residence rules that do not fully capture the complexity of
American society. According to senior Bureau officials, increasingly
complicated social factors, such as extended families and population
mobility, presented challenges for A.C.E., making it difficult to
determine exactly where certain individuals should have been counted.
Specifically, in developing A.C.E. methodology, Bureau officials assumed
that each person in its sample could be definitively recorded at one known
residence that the Bureau could determine via a set of rules. However,
individuals' residency situations are often complicated: Bureau officials
cite the example of children in custody disputes whose separated parents
both may have strong incentives to claim the children as members of their
household, despite census residence rules that attempt to resolve which
parent should report the child(ren). In such situations, wherein the
residence rules are either not understood, are not followed, or do not
otherwise provide resolution, the Bureau has difficulty determining the
correct location to count the children. Bureau officials cite similar
difficulties counting college students living away from home, as well as
people who live at multiple locations throughout the year, such as
seasonal workers or retirees.

A.C.E. design also assumed that follow-up interviews would clarify and
improve residence data for people for whom vague, incomplete, or ambiguous
data were provided and whose cases remained unresolved. However, the
Bureau found it could not always rely on individuals to provide more
accurate or complete information. In fact, in our earlier reporting on
A.C.E. matching, we described several situations wherein conflicting
information had been provided to the Bureau during follow-up interviews
with individuals, and Bureau staff had to decide which information to
use.4 More recently, the Associate Director for Decennial Census told us
that returning to an A.C.E. household to try to resolve conflicting data
sometimes yielded new or more information but not necessarily better
information or information that would resolve the conflict. The Bureau
plans to review and revise its census residence rules for 2010, which may
clarify some of this confusion.

4 See GAO-02-297, pp. 9-12, for example.

    Other Design Decisions Also Created Blind Spots in Census Coverage
    Evaluation

While the Bureau emphasizes residence rules as the primary cause of
A.C.E. failure, our research indicates some of the Bureau's other design
decisions created blind spots that also undermined the program's ability
to accurately estimate census error. For example, the Bureau decided to
leave people living in group quarters, such as dormitories and nursing
homes, out of the A.C.E. sample survey, which effectively meant they were
left out of the scope of A.C.E. coverage evaluation (see fig. 2).

Figure 2: A.C.E. Sample Excluded Individuals and Records from Scope of
Evaluation

Source: GAO.

Note: Figures are not drawn to proportion.

As a result, the matching results could not provide coverage error
information for the national group quarters population of 7.8 million. In
addition, the Bureau did not design A.C.E. matching to search for
duplicate records within the subset of this population counted by the
census, though later Bureau research estimated that if it had, it would
have measured over 600,000 additional duplicates there. In response to our
draft report, the Department wrote that coverage evaluation was designed to
measure some of these duplicates since during its follow-up interviews at
households during A.C.E. matching, the Bureau included questions intended
to identify college students living away at college.

While coverage evaluation in 1990 included some group quarters, such as
college dormitories and nursing homes, within its sample, the Bureau
reported that the high mobility of these people made it more difficult to
count them, and thus the 1990 estimates of coverage for this population were
weak. The Bureau decided not to gather these data during 2000 A.C.E. data
collection based in part on the difficulty of collecting and matching this
information in the past, and in part as a result of a separate design
decision to change the way to treat information for people who moved
during the time period between the census and the coverage evaluation
interviews. By excluding group quarters from the coverage evaluation
sample, the Bureau collected less information on a population that
included some duplication; that missing information might have enabled
it to better detect and account for such duplication. In
addition, by developing coverage error estimates that were not applicable
to the group quarters population, the Bureau made the task of assessing
the quality of the census as a whole more difficult.

Figure 2 also shows that another blind spot emerged as the Bureau
increased the number of "imputed"5 records in the final census, though
they could not be included in the A.C.E. sample survey. The Bureau
estimates a certain number of individuals, called imputations, that it
has reason to believe exist even though it has no personal
information on them, and adds records to the census (along with certain
characteristics such as age and race/ethnicity) to account for them. For
example, when the Bureau believes a household is occupied but does not
have any information on the number of people living there, it will impute
the number of people as well as their personal characteristics. The Bureau
increased imputations from about 2 million records in 1990 to about 5.8
million records in 2000. Changes in census and coverage evaluation design from
1990 likely contributed to this increase. Specifically, the Bureau reduced
the number of persons who could have their personal information recorded
on the standard census form in 2000. In addition, the Bureau changed the
way coverage evaluation accounted for people who moved between Census Day
and the day of the coverage evaluation interview. These design changes
resulted in less complete information on people and likely contributed to
an increase in imputations. (These and other changes are explained in more
detail in app. II.)

5 The Bureau uses computer-executed algorithms to estimate imputations.

Because imputed records are simply added to the census totals and do not
have names attached to them, it was impossible for A.C.E. to either count
imputed individuals using the A.C.E. sample survey or incorporate them
into the matching process. At any rate, since the true number and
characteristics of these persons are unknown, matching these records via
A.C.E. would not have provided meaningful information on coverage error.
A.C.E. was designed to measure the net census coverage error, in essence
the net effect of people counted more than once minus people missed, and
included measurement of the effects of imputation on coverage error. The
Bureau generalizes its estimates of coverage error to cover imputations
and also maintains that its imputation methods do not introduce any
statistical bias in population counts. But the formulas used by the Bureau
to calculate its estimates of coverage error account for imputations by
subtracting them from the census count being evaluated, not by measuring
the error in them. And the Bureau did not attempt to determine the
accuracy of the individual imputations; that is, although the Bureau
imputed persons it has reason to believe existed, the Bureau does not
know whether it over- or underestimated such persons. As more imputations
are included in the census total, the generalization of coverage error
estimates to that total population becomes less reliable.
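
The distinction between net and gross error drawn above can be illustrated
with a minimal sketch. The function names and the scenario figures are
hypothetical and are not taken from Bureau formulas; the only element drawn
from this report is the description of net error as people counted more than
once minus people missed.

```python
def net_coverage_error(duplicates: int, missed: int) -> int:
    """Net error as described in the text: duplicates minus people missed."""
    return duplicates - missed


def gross_coverage_error(duplicates: int, missed: int) -> int:
    """Gross error: every miscounted person, regardless of direction."""
    return duplicates + missed


# Hypothetical scenarios (illustrative numbers only, not Bureau data).
small_errors = {"duplicates": 1_000_000, "missed": 1_000_000}
large_errors = {"duplicates": 5_000_000, "missed": 5_000_000}

for label, scenario in (("small", small_errors), ("large", large_errors)):
    net = net_coverage_error(**scenario)
    gross = gross_coverage_error(**scenario)
    print(f"{label}: net={net:,} gross={gross:,}")
```

Both scenarios produce a net error of zero although the second contains five
times as many miscounted people, which is why a net measure alone cannot show
how much duplication or omission a census contains.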

Similarly, the Bureau created an additional coverage error blind spot by
including in the census 2.4 million records that it previously suspected
were duplicates and thus were not included in the coverage evaluation.
Prior to A.C.E. matching, the Bureau removed about 6 million persons from
the census population, identifying them as likely duplicates. Then, after
completing additional research on these possible duplicates, the Bureau
decided to reinstate the records for 2.4 million of these persons it no
longer suspected were duplicates. However, it did so after A.C.E. had
completed matching and evaluating the census records from which the 2.4
million persons had been removed and for which coverage error estimation
had begun (see fig. 3).

Figure 3: A.C.E. Matching Results Lacked Data on Suspected Duplicates That
Were in the Census

Source: GAO.

Note: Figures are not drawn to proportion.

The Bureau documented in a memorandum that the exclusion of the records
from A.C.E. processing was not statistically expected to affect
A.C.E. results. However, later Bureau research concluded that over 1
million of these 2.4 million records were likely duplicates, none of which
could have been detected by A.C.E. While the Bureau maintains that the
reinstatement of these over 1 million likely duplicates did not
affect the A.C.E. estimate in a statistically significant way, this
suggests that the resulting A.C.E.-based estimate of national population
itself is blind to the presence in the census of the over 1 million
duplicates the Bureau reintroduced. For 2010, Bureau officials have
chartered a planning group responsible for, among other things, proposing
improvements to reduce duplication in the census, which may address some
of the problem.
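
The record counts cited above imply a simple lower bound on how much of the
reinstated group consisted of likely duplicates. The sketch below uses only
the rounded figures reported in this section; the variable names are
illustrative, and because the source figures are approximate the results are
approximate as well.

```python
# Rounded figures from this section of the report.
removed_as_suspected_duplicates = 6_000_000  # "about 6 million" persons removed
reinstated_records = 2_400_000               # reinstated after further research
likely_duplicates_reinstated = 1_000_000     # "over 1 million", so a lower bound

# Records that stayed out of the census after the Bureau's review.
kept_out = removed_as_suspected_duplicates - reinstated_records

# Lower bound on the share of reinstated records that were likely duplicates.
min_duplicate_share = likely_duplicates_reinstated / reinstated_records

print(f"Records kept out of the census: about {kept_out:,}")
print(f"Likely duplicates among reinstated records: at least {min_duplicate_share:.0%}")
```

On these figures, roughly two in five of the reinstated records were likely
duplicates that A.C.E. matching had no opportunity to examine.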

In addition to excluding populations from the scope of evaluation, the
Bureau further curtailed its ability to measure coverage error by reducing
A.C.E.'s search area to only one geographic ring around selected A.C.E.
sample areas during the matching process. For the 1990 Census, the
Bureau's coverage evaluation program always searched at least one
surrounding ring and an even larger ring in rural areas. However, in 1999,
before a panel of the National Academy of Sciences, Bureau officials
announced and defended the decision to not expand the search area except
in targeted instances, saying that Bureau research indicated that the
additional matches found in 1990 from the expanded search areas did not
justify the additional effort. In its comments on our draft report, the
Department writes that more important than the size of the search area is
maintaining "balance"-i.e., search areas must be used consistently both to
identify people who have been missed and to identify people who have been
counted in error (including duplicates). The Department also justified the
decision to reduce the search area in 2000 from 1990 in part by stating,
"in an expected value sense, the reduced search area would have affected"
[emphasis added] the extra missed people and the extra miscounted people
equally, or been balanced. However, later research discovered large
numbers of the missed duplicates in the actual census by matching A.C.E.
persons to census persons nationwide-far beyond the areas searched during
A.C.E. matching. A 2001 Bureau report presenting the results of computer
rematching of the A.C.E. sample concluded, "Our analysis found an
additional 1.2 million duplicate enumerations in units that were
out-of-scope for the A.C.E. but would have been in-scope for [1990
coverage evaluation]."6 In other words, if the Bureau had continued its
1990 practice of searching in housing units in larger geographic areas in
2000, the A.C.E. process might have identified more duplicates and yielded
better results.7 The Bureau research cited above appears to question the
decision to reduce the search areas. In fact, after the 2000 Census was
completed, again before the National Academy of Sciences, a Bureau
official suggested that any coverage evaluation methods for 2010 should
conduct a more thorough search, perhaps expanding the search area to two
or more geographic rings everywhere.

6 Executive Steering Committee For A.C.E. Policy II (ESCAP II) Report 20,
October 11, 2001, ESCAP II: Person Duplication in Census 2000, Thomas Mule,
p. iv.

7 Even so, it is not clear that all of these 1.2 million duplicates would
have been identified had the search areas been expanded to completely
include them, since the Bureau did not always identify duplicates within
the search areas that it did use.

    Bureau Has Not Fully Accounted for How Design Decisions Affected Coverage
    Error Estimates

This review has identified only some of the decisions that could have
created problems in A.C.E. estimates. Because the Bureau has not attempted
to account for how all of its design decisions relating to A.C.E. and the
census affected the outcome of the program, the full range of reasons that
A.C.E. estimates were not reliable remains obscure.

Bureau research documented, and this report describes, the magnitude of the
direct effects of most of these design decisions in terms of the size of
the census population affected. The Bureau's final reporting on the
revised A.C.E. estimates mentions many design changes, but not together or
comprehensively, and it does not explain how the changes might have
affected the estimates of coverage error. Without clear documentation of
how significant changes in the design of the census and A.C.E. might have
affected the measurements of census accuracy, it is not apparent how
problems that have arisen as a result of the Bureau's own decisions can be
distinguished from problems that are less under the Bureau's control,
i.e., difficulties inherent to conducting coverage evaluation. Thus the
Bureau's plans to measure the coverage error for the 2010 Census are not
based on a full understanding of the relationship between the separate
decisions it makes about how to conduct A.C.E. and the census and the
resulting performance of its coverage measurement program. This in turn
could hamper the Bureau's efforts to craft a more successful coverage
measurement program for the next national head count.

  The Bureau Has Not Produced Reliable Revised Estimates of Coverage Error for
  the 2000 Census

While the Bureau produced a vast body of research regarding the census and
A.C.E., including multiple reassessments and revisions of earlier work,
the revised estimates are not reliable. The initial A.C.E. estimates of
coverage error suggested that while historical patterns of differences in
undercounts between demographic groups persisted, the Bureau had succeeded
in 2000 in reducing the population undercounts of most minorities, and the
subsequent revised estimates showed even greater success in reducing
population undercounts in the Census. However, the large number of
limitations described in the Bureau's documentation of the methodology
used to generate the revised estimates of coverage error suggests that
these estimates are less reliable than reported and may not
describe the true rate of coverage error. The Bureau, however, has not
made the full impact of these methodological limitations on the data
clear. Moreover, the final revised estimates of coverage error for the
count of housing units and the count of people, which the Bureau expected
to be similar if the estimates were reliable, differed, further raising
questions about the reliability of the revised estimates.

    Bureau's Conclusion That A.C.E. Estimates Are Unreliable Is Based on Results
    That Are Unreliable Themselves

The Bureau undertook an extensive review of A.C.E.'s results over a 3-year
period. In doing so, the Bureau revised its estimates of A.C.E. coverage
error twice-first in October 2001 and again in March 2003. These revisions
suggest that the original A.C.E. estimates were unreliable. Figure 4
illustrates how each of the revised A.C.E. estimates of coverage error
reduced the undercount for most of the three major race/origin groups from
the initial A.C.E. estimates. Note that the final revised estimate
indicates that the total population was actually overcounted by one-half
of 1 percent.
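
The sign convention behind these figures can be made concrete with a short
sketch. It assumes the commonly used definition of percent net undercount as
the coverage estimate minus the census count, divided by the coverage
estimate; this report does not spell out the Bureau's formula, so the function
and the population figures below are illustrative assumptions only.

```python
def percent_net_undercount(coverage_estimate: float, census_count: float) -> float:
    """Assumed convention: positive means undercount, negative means overcount."""
    return 100.0 * (coverage_estimate - census_count) / coverage_estimate


# Hypothetical population figures chosen only to show the sign convention.
undercounted = percent_net_undercount(coverage_estimate=102_000, census_count=100_000)
overcounted = percent_net_undercount(coverage_estimate=99_500, census_count=100_000)

print(f"Estimate above the census count: {undercounted:.2f} percent net undercount")
print(f"Estimate below the census count: {overcounted:.2f} percent net undercount")
```

A negative value, as in the second case, corresponds to a net overcount such
as the one reported in the final revised estimate.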

Figure 4: Reported Undercount Estimates Generally Decreased with A.C.E.
Revisions

[Figure not reproduced: chart of percent net undercount (vertical axis, -2
to 4) for the initial A.C.E. estimates (03/01), Revision I (10/01), and
Revision II (03/03), with series for non-hispanic black, hispanic,
non-hispanic white or other, and all race/origin groups.]

Source: GAO analysis of U.S. Census Bureau data.

The differences in the revised estimates presumably provide a measure of
the error in the original A.C.E. estimates. (The estimated net population
undercounts-and their standard errors-for these groups are provided in
app. III.) However, the revised estimates of coverage error may not be
reliable enough themselves to provide an adequate basis for such a
comparison to measure error in the original A.C.E. estimates.

    The Bureau Has Not Clearly Quantified and Reported the Full Impact of the
    Methodological Limitations on the Revised Estimates

First, the number of Bureau-documented limitations with respect to the
methodologies used in generating A.C.E.'s revised estimates raises
questions about the accuracy of the revised estimates. Within voluminous
technical documentation of its process, the Bureau identified several
methodological decisions that, had they been made differently, may have
led to appreciably different results. Thus the
methods the Bureau chose may have affected the estimates of census
coverage error themselves and/or the measures of uncertainty associated
with the estimates, limiting the general reliability of the revisions. The
limitations in the methodologies for the revised estimates included the
following:

     o Errors in the demographic data used to revise estimates may have
       contributed to additional error in the estimates.
     o Errors stemming from the Bureau's choice of model to resolve uncertain
       match cases were accounted for in initial March 2001 A.C.E. estimates
       but were not accounted for in the revised estimates in March 2003.
     o Alternative possible adjustments for known inefficiencies in computer
       matching algorithms would directly affect revised estimates.
     o The Bureau's evaluations of the quality of clerical matching were used
       to revise the initial A.C.E. coverage estimates, leaving the Bureau
       less reliable means to measure the uncertainty in the revised
       estimates.

For the earlier revision of coverage error estimates, the Bureau provided
the range of impacts that could result from some different methodological
decisions, enabling more informed judgments regarding the reliability of
the data. For example, in support of its October 2001 decision to not
adjust census data, the Bureau showed that different assumptions about how
to treat coverage evaluation cases that the Bureau could not resolve could
result in variations in the census count of about 6 million people. The
Bureau also had reported previously the range of impacts on the estimates
resulting from different assumptions and associated calculations to
account for the inefficiency of computer matching. They found that
different assumptions could result in estimates of census error differing
by about 3.5 million people.

However, with the final revision of the A.C.E. coverage error estimates,
the Bureau did not clearly provide the ranges of impact resulting from
different methodological decisions. While the Bureau did discuss major
limitations and indicated their uncertain impact on the revised estimates
of coverage error, the Bureau's primary document for reporting the latest
estimates of coverage error did not report the possible quantitative
impacts of all these limitations-either separately or together-on the
estimates. Thus readers of the reported estimates do not have the
information needed to accurately judge the overall reliability of the
estimates, namely, the extent of the possible ranges of the estimates had
different methodological decisions been made.
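
Reporting the kind of range described above requires little more than
recomputing the estimate under each alternative assumption and publishing the
spread. The sketch below is a minimal illustration; the assumption labels and
the estimates (in millions of people) are hypothetical and are not Bureau
results.

```python
from typing import Mapping, Tuple


def impact_range(estimates: Mapping[str, float]) -> Tuple[float, float, float]:
    """Return (lowest, highest, spread) across alternative assumptions."""
    lowest = min(estimates.values())
    highest = max(estimates.values())
    return lowest, highest, highest - lowest


# Hypothetical net error estimates, in millions of people, under three
# alternative treatments of the same methodological limitation.
estimates_by_assumption = {
    "assumption A": -1.0,
    "assumption B": 0.5,
    "assumption C": 2.5,
}

lowest, highest, spread = impact_range(estimates_by_assumption)
print(f"Estimates range from {lowest} to {highest} million (spread {spread} million)")
```

Publishing such a spread alongside the point estimate and its sampling error
would let readers see how much the result depends on choices that sampling
error does not capture.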

Sampling errors were reported alongside estimates of census error, but
these do not adequately convey the extent of uncertainty associated with
either the reported quantitative estimates themselves or the conclusions
to be drawn from them. For example, the Bureau decided to make no
adjustment to account for the limitation of computer matching efficiency
when calculating its latest revision of estimates of coverage error,
unlike the adjustment it made when calculating its earlier revised
estimates. When justifying its adjustment made in its earlier revised
estimates, the Bureau demonstrated that the choice of adjustment mattered
to the calculated results. But the potential significance to the
calculated results of the Bureau's having made a different assumption was
not reflected in the Bureau's primary presentation of its estimates and
their errors. The absence of clear documentation on the possible
significant impacts of such assumptions could lead readers of the Bureau's
reporting to believe erroneously that all assumptions have been accounted
for in the published statistics, or that the estimates of coverage error
are more reliable than they are.

    Unexpected Differences in Patterns of Coverage Error between the Housing and
    the Population Count Further Call into Question Reliability of Revised
    Estimates

According to Bureau reporting, when it examined the validity of the
revised coverage error estimates the Bureau expected to see across
demographic groups similar patterns between the coverage error for the
count of the population and the count of housing. That is, if a population
was overcounted or undercounted, then the count of housing units for that
population was expected to be overcounted or undercounted as well. The
Bureau developed estimates of coverage error in the count of housing units
from A.C.E. data. But the comparisons of non-hispanic blacks and hispanics
to non-hispanic whites in figure 5 show that the relative housing
undercounts are the opposite of what the Bureau expected.

Figure 5: Population and Housing Undercounts

[Figure not reproduced: chart of percent net undercount and percent net
overcount in the population and housing unit counts for non-hispanic black,
hispanic, non-hispanic white or other, and all race/origin groups.]

Source: GAO analysis of U.S. Census Bureau data.

For example, the estimated population undercount for non-hispanic blacks
is almost 3 percent greater than that of the majority group-non-hispanic
white or other-but the estimated housing unit undercount for non-hispanic
blacks is about 0.8 percent less than that of the majority group. In
addition, while the Bureau estimated that the non-hispanic white majority
group had a net population overcount of over 1 percent, the Bureau
estimated the majority group as having its housing units undercounted by
about one-third of a percent. (The estimated net housing unit
undercounts-and their standard errors-for these groups are provided in
app. III.)

Bureau officials told us that the problems A.C.E. and the census
experienced with identifying coverage error in the population do not seem
likely to have affected housing counts. However, when estimating coverage
error for housing units for specific populations (e.g., by gender or
race/ethnicity) errors in the population count can affect the reliability
of housing tabulations. This is because when the Bureau tabulates housing
data by characteristics like gender or race, it uses the personal
characteristics of the person recorded as the head of the household
living in each housing unit. So if there are problems with the Bureau's
count of population for demographic groups, for example by gender or race,
they will affect the count of housing units for demographic groups. While
the unexpected patterns in population and housing unit coverage error may
be reconcilable, Bureau officials do admit that problems with the
estimations of population coverage error may also adversely affect the
reliability of other measures of housing count accuracy they rely upon,
such as vacancy rates. Bureau officials have indicated the need to review
this carefully for 2010.

    Bureau Reports Revised Estimates Resulted in Lessons Learned

While the multiple reassessments and revisions of earlier work did not
result in reliable estimates, these efforts were not without value,
according to the Bureau. Bureau officials stated that the revision process
and results helped the Bureau focus for 2010 on detecting duplicates,
revising residence rules, and improving the quality of enumeration data
collected from sources outside the household, such as neighbors, as well
as providing invaluable insights for its program of updating census
population estimates throughout the decade.

  Conclusions

The volume and accessibility over the Internet of the Bureau's research
may have made this the most transparent coverage evaluation exercise of a
Decennial Census. However, as the Bureau has closed the book on Census
2000 and turned toward 2010, the reliability of the Bureau's coverage
estimates remains unknown. The Bureau made extensive efforts to evaluate
the census and its coverage error estimates resulting from A.C.E., but
these efforts have not been sufficient to provide reliable revised
estimates of coverage error. So while much is known about operational
performance of the 2000 Census, one of the key performance measures for
the 2000 Census remains unknown.

Moreover, neither Congress nor the public knows why the coverage evaluation
program did not work as intended, because the Bureau has not provided a
clear accounting of how census and A.C.E. design decisions and/or
limitations in the A.C.E. revision methodology discussed in this report
accounted for the apparent weaknesses, or strengths, of A.C.E. Without such an
accounting, the causes of problems and whether they can be addressed will
remain obscure. And as the Bureau makes plans for coverage evaluation for
the 2010 Census, whether that program approximates A.C.E.'s design or not,
the Bureau will be missing valuable data that could help officials make
better decisions about how to improve coverage evaluation.

Finally, this lack of information calls into question the Bureau's claim
(made in response to a prior GAO recommendation that the Bureau determine
the feasibility of adjustment) that it has already established that using
coverage evaluation for adjustment purposes is not feasible. Without
clearly demonstrating what went wrong with its most recent coverage
evaluation and why, the Bureau has not shown that coverage evaluation for
the purpose of adjustment is not feasible. In fact, this report mentions
two census improvements-to residence rules and to efforts to identify and
reduce duplicates-that the Bureau is already considering that could make
A.C.E. estimates more reliable, and perhaps even feasible. Furthermore,
although the Bureau reports that its experience with revising A.C.E.
estimates has provided lessons, it remains unclear how the Bureau will use
its published coverage error estimates to make decisions leading to a more
reliable measure of coverage error in 2010, or how the unreliable
estimates can be of value to policymakers or the public.

  Recommendations for

As the Bureau plans for its coverage evaluation of the next national head
count in 2010, we recommend that the Secretary of Commerce direct that
the Bureau take the following three actions to ensure that coverage
evaluation results the Bureau disseminates are as useful as possible to
Congress and other census stakeholders:

     o To avoid creating any unnecessary blind spots in the 2010 coverage
       evaluation, as the Bureau plans for its coverage evaluation in 2010,
       it should take into account how any significant future design
       decisions relating to census (for example, residence rules, efforts to
       detect and reduce duplicates, or other procedures) or A.C.E. (for
       example, scope of coverage, and changes in search areas, if
       applicable), or their interactions, could affect the accuracy of the
       program.
     o Furthermore, in the future, the Bureau should clearly report in its
       evaluation of A.C.E. how any significant changes in the design of
       census and/or A.C.E. might have affected the accuracy of the coverage
       error estimates.
     o In addition GAO recommends that in the future the Bureau plan to not
       only identify but report, where feasible, the potential range of
       impact of any significant methodological limitation on published
       census coverage error estimates. When the impact on accuracy is not
       readily quantifiable, the Bureau should include clear statements
       disclosing how it could potentially affect how people interpret the
       accuracy of the census or A.C.E.

  Agency Comments and Our Evaluation

The Under Secretary for Economic Affairs at the Department of Commerce
provided us written comments from the Department on a draft of this report
on September 10, 2004 (see appendix I). The Department concurred with our
recommendations, but took exception to some of our analyses and
conclusions and provided additional related context and technical
information. In several places, we have revised the final report to
reflect the additional information and provided further clarification on
our analyses.

The Department was concerned that our draft report implied that A.C.E. was
inaccurate because it should have measured gross coverage error
components, and that this was misleading because the Bureau designed
A.C.E. to measure net coverage errors. While we have previously testified
that the Bureau should measure gross error components, and the Department
in its response states that this is now a Bureau goal for 2010, we
clarified our report to reflect the fact that the Bureau designed A.C.E.
to measure net coverage error.

Further, although the Department agreed with our finding that the Bureau
used residence rules that were unable to capture the complexity of
American society thus creating difficulty for coverage evaluation, the
Department disagreed with our characterization of the role four other
census and A.C.E. design decisions played in affecting coverage
evaluation. Specifically, the Bureau does not believe that any of the
following four design decisions contributed significantly to the
inaccuracy of the A.C.E. results:

     1. The Treatment of the Group Quarters Population-The Department
        commented that we correctly noted that group quarter residents were
        excluded from the A.C.E. universe who would have been within the
        scope of A.C.E. under the 1990 coverage evaluation design, and that a
        large number of these people were counted more than once in 2000. The
        Department maintains that the Bureau designed A.C.E. to measure such
        duplicates. We believe this is misleading. As the Department noted,
        during its follow-up at housing units the Bureau included
        questions intended to identify the possible duplication of college
        students living away at college, and we have now included this in our
        final report. But as we stated in our draft report, A.C.E. did not
        provide coverage error information for the national group quarters
        population. Moreover, during A.C.E. the Bureau did not search for
        duplicate people within the group quarters population counted by the
        census, as it did within housing units counted by the census. In
        fact, later Bureau research estimated that if it had done so, the
        Bureau would have identified over 600,000 additional duplicates
        there. As such, our finding that this may have contributed to the
        unreliability of coverage error estimates still stands.
     2. The Treatment of Census Imputations-The Department stated that
        A.C.E. was designed to include the effects of imputations on its
        measurement of coverage error and that there was no basis for our
        draft report stating that as more imputations were included in the
        census then coverage error estimates became less reliable. While we
        agree that the Bureau's estimates of coverage error accounted for
        the number of imputations, as we report, and as the Department's
        response reiterated, no attempt was made to determine the accuracy
        of the imputations included in the census. Thus any errors in either
        the number or demographic characteristics of the population imputed
        by the Bureau were not known within the coverage error processes or
        estimation. As a result, in generalizing the coverage error
        estimates to the imputed segment of the population, the Bureau
        assumed that the imputed population had coverage error identical to
        the population for which coverage error was actually measured.
        Furthermore, the larger the imputed segment of the population became
        the more this assumption had to be relied upon. Since the real
        people underlying any imputations are not observed by the census,
        the assumption is, in its strictest sense, untestable, thus we
        maintain that increasing the number of imputations included in the
        census may have made generalizing the coverage error estimates to
        the total census population less reliable.
     3. The Treatment of Duplicate Enumerations in the Reinstated Housing
        Units-The Department wrote that our draft report incorrectly
        characterized the effects of reinstating duplicates into the census.
        The Department indicated that A.C.E., having been designed to measure
        net coverage error, treated the over 1 million likely duplicates
        "exactly correctly" and that including them in the census had no
        effect on the mathematical estimates of coverage error produced by
        A.C.E. We reported that, according to Bureau research, introducing
        the additional duplicates into the census appeared to have no impact
        on the A.C.E. estimates. But we view this fact as evidence of a
        limitation, or blind spot, in the Bureau's coverage evaluation. The
        fact that 2.4 million records, containing disproportionately over 1
        million duplicate people could be added to the census without
        affecting the A.C.E. estimates demonstrates a practical limitation of
        those coverage error estimates. We maintain that the resultant
        measure of coverage error cannot be reliably generalized to the
        entire population count of which those 1 million duplicates are a
        part.
     4. Size of the Search Area-The Department wrote that a search area like
        that used in 1990 would have done little to better measure the number
        of students and people with vacation homes who may have been
        duplicated in 2000. It described our conclusion regarding the
        reduction in search area from 1990 as not supported by the relative
        magnitudes of these situations. And finally, the Department offered
        additional support for the decision to reduce the search area by
        describing the reduced search area as balanced, or "in an expected
        value sense" [emphasis added] affecting the number of extra missed
        people and the extra miscounted people equally.

In our final report we added a statement about the Department's concern
over the importance of balance in its use of search areas. But we disagree
that our conclusion is unsupported, since in our draft report we
explicitly cited Bureau research that found an additional 1.2 million
duplicate enumerations in units that were out-of-scope for 2000 A.C.E.
but that would have been in-scope for 1990's coverage evaluation.

In addition, the Department offered several other comments.

Regarding our finding that the Bureau has not produced reliable revised
estimates of coverage error for the 2000 Census, and, specifically, that
the full impact of the Bureau's methodological limitations on the revised
estimates has not been made clear, the Department wrote that the Census
Bureau feels that further evaluations would not be a wise use of
resources. We concur, which is why our recommendations look forward to the
Bureau's preparation for 2010.

The Department commented that it did not see how we could draw conclusions
about the reliability of the Bureau's coverage evaluation estimates if we
did not audit the underlying research, data, or conclusions. We maintain
that the objectives and scope of our review did not require such an audit.
As we described, and at times cited, throughout our draft report, we used
the results of the Bureau's own assessment of the 2000 Census and its
coverage evaluation. That information was sufficient to draw conclusions
about the reliability of the A.C.E. estimates. As a result, there was no
need to verify individual Bureau evaluations and methodologies.

The Department expressed concern that our draft report implied that the
unexpected differences in patterns of coverage error between the housing
and the population count were irreconcilable. That was not our intent, and
we have clarified that in the report.

The Department expressed concern over the report's characterization of the
1990 coverage error estimates for group quarters as weak in part due to
the high mobility of this population. However, the 1990 group quarters
estimates are described as "weak" in a Bureau memorandum proposing that
group quarters be excluded from the 2000 coverage evaluation. The
memorandum also explains how the mobility within the group quarters
population contributes to the resulting estimates. We have not revised the
characterization of the group quarters coverage error estimates or the
causal link due to the mobility of that population, but we have revised
our text to state more clearly that the 1990 estimates being discussed are
those for group quarters.

As agreed with your offices, unless you release its contents earlier, we
plan no further distribution of this report until 30 days from its issue
date. At that time we will send copies to other interested congressional
committees, the Secretary of Commerce, and the Director of the U.S. Census
Bureau. Copies will be made available to others upon request. This report
will also be available at no charge on GAO's Web site at
http://www.gao.gov .

If you or your staff have any questions concerning this report, please
contact me on (202) 512-6806 or by e-mail at [email protected] or Robert
Goldenkoff, Assistant Director, at (202) 512-2757 or [email protected].
Key contributors to this report were Ty Mitchell, Amy Rosewarne, and Elan
Rohde.

Patricia A. Dalton
Director
Strategic Issues

Appendix I

Comments from the Department of Commerce


Appendix II

Bureau Design Decisions Increased Imputations

The Bureau made various design decisions that resulted in an increase in
the number of "imputations" (people guessed to exist) included in the
census population that could not be included within the A.C.E. sample
survey. The Bureau believes certain numbers of people exist despite the
fact that the census records no personal information on them; thus it
projects, via computer-executed algorithms, numbers and characteristics of
people and includes them in the census. Such records are simply added to
the census totals, and do not have names attached to them. Thus it was
impossible for A.C.E. to either count imputed individuals using the A.C.E.
sample survey or incorporate them into the matching process. Since the
true number and the characteristics of these persons are unknown, matching
nameless records via A.C.E. would not have provided any meaningful
information on coverage evaluation. The number of people the Bureau
imputed grew rapidly in 2000, from about 2 million in 1990 to about

5.

One of the reasons for the large increase in imputations may be the
decision by the Bureau to eliminate field edits from field follow-up in
2000. Field edits were the last-minute follow-up operation to collect
additional information from mail-back forms that had too little
information on them to continue processing. While acknowledging that this
decision may have increased imputations for 2000, a senior Bureau official
justified the decision by describing the field edits in 1990 as providing
at times a "clerical imputation" that introduced a subjective source of
error, which computer-based imputation in 2000 lacked.

The Bureau also reduced the number of household members for whom personal
information could be provided on standard census forms, and this also
contributed to the increase in imputations. Households reporting a
household size greater than 6 in 2000, the maximum number for whom
personal characteristics could be provided on the form, were to be
automatically contacted by the Bureau to collect the additional
information. Yet not all large
households could be reached for the additional information, and the
personal characteristics of the remaining household members needed to be
imputed. Again, A.C.E. would have been unable to match people counted by
its sample survey to imputations, so imputed people were excluded from
A.C.E. calculations of coverage errors.

An A.C.E. design choice by the Bureau that likely increased the amount of
data imputed within the A.C.E. sample survey was how the Bureau decided to
account for people who moved between Census Day and the day of the A.C.E.
interview. Departing from how movers were dealt with in 1990, and partly
to accommodate the initial design for the 2000 Census, which relied
on sampling nonrespondents to the census, for 2000 the Bureau relied on
the counts of the people moving into A.C.E. sample areas to estimate the
number of matched people who had actually lived in the A.C.E. areas on
Census Day but moved out. This decision resulted in the Bureau having less
complete information about the Census Day residents in A.C.E. sample areas
who had moved out, and likely increased the number of imputations that
were later required, making it more difficult to match these moving
persons to the census. A Bureau official also cited this decision as
partial justification for not including group quarters in A.C.E. search
areas.

The extent to which imputation affected the accuracy of the census is
unknown.
The National Academy of Sciences discussed in an interim report on the
2000 Census the possibility of a subset of about 1.2 million of these
imputations being duplicates.1 That report stated that, for example, "it
is possible that some of these cases-perhaps a large proportion-were
erroneous or duplicates," and described another subset of about 2.3
million that could include duplicates. However, this Academy report did
not include any data to suggest the extent of duplicates within these
groups, and it may similarly have been possible for the number of persons
in this group to have been underestimated. The Bureau maintains that the
imputations were necessary to account for the people its field operations
led it to believe had been missed, and that its imputation methods do not
introduce statistical bias.

1 Constance F. Citro, Daniel L. Cork, and Janet L. Norwood, eds., The 2000
Census: Interim Assessment (Washington, D.C.: The National Academies
Press, 2001).


Appendix III

Bureau Estimates of Population and Housing Unit Net Undercounts

As shown in Table 1, the initial A.C.E. results suggested that the
differential population undercounts of non-Hispanic blacks and Hispanics
(the difference between their undercount estimate and that of the
majority groups) persisted from Bureau estimates from its coverage
evaluation in 1990. Yet they also demonstrated that the Bureau had
apparently succeeded in reducing the magnitude of those differences since
its evaluation of the 1990 Census.1

Table 1: Estimated Percent Population Undercounts for Selected Race/Origin
Groups

Estimate in % (standard error in %)

                    1990              A.C.E.        Revision I      Revision II
                    Estimates (a)     (March 2001)  (October 2001)  (March 2003)
Non-Hispanic black  4.57 (0.55)       2.17 (0.35)   0.78 (0.45)     1.84 (0.43)
Hispanic            4.99 (0.82)       2.85 (0.38)   1.25 (0.54)     0.71 (0.44)
Other                                 0.73 (0.14)  -0.28 (0.20)
Non-Hispanic white  0.68 (0.22)                                    -1.13 (0.20)
Total               1.61 (0.20)       1.18 (0.13)   0.06 (0.18)    -0.49 (0.20)

Sources: A.C.E. and Revision I data are reported in Basis of "Revised
Early Approximation" of Undercounts Released Oct. 17, 2001 (Oct. 26,
2001). 1990 and Revision II data are from PP-54 A.C.E. Revision II:
Summary of Estimated Net Coverage (Dec. 31, 2002), Table 1.

Notes: The reported net undercounts for the 2000 Census are for household
population only. The reported 1990 net undercount also covers the
population living in noninstitutional, nonmilitary group quarters.

(a) Tabulations by race/origin differed in 1990 from those for the 2000
Census. The estimates reported in this column were labeled in 1990 as
"Black", "Hispanic", and "Non-Hispanic White & Other", which also included
"American Indian off Reservation".
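
The differential undercounts discussed with Table 1 can be reproduced with simple arithmetic. The sketch below is illustrative only: it assumes that the residual category in each column ("Non-Hispanic White & Other" for 1990, "Other" for March 2001) serves as the comparison group, which the report does not state explicitly, and the function and variable names are hypothetical.

```python
# Point estimates (percent net undercount) transcribed from Table 1.
# Assumption: the residual category in each column is used as the
# comparison ("majority") group; the report does not spell this out.
ESTIMATES = {
    "1990": {
        "Non-Hispanic black": 4.57,
        "Hispanic": 4.99,
        "comparison": 0.68,  # labeled "Non-Hispanic White & Other" in 1990
    },
    "A.C.E. (March 2001)": {
        "Non-Hispanic black": 2.17,
        "Hispanic": 2.85,
        "comparison": 0.73,  # labeled "Other" in the March 2001 results
    },
}


def differential_undercount(group_pct, comparison_pct):
    """Percentage-point gap between a group's undercount and the comparison group's."""
    return round(group_pct - comparison_pct, 2)


def differentials(column):
    """Differential undercount for every non-comparison group in one column."""
    base = column["comparison"]
    return {
        group: differential_undercount(pct, base)
        for group, pct in column.items()
        if group != "comparison"
    }


if __name__ == "__main__":
    for label, column in ESTIMATES.items():
        print(label, differentials(column))
```

Under these assumptions the gaps shrink from roughly 3.9 and 4.3 percentage points in 1990 to roughly 1.4 and 2.1 points in the March 2001 results, which is the pattern of persistence with reduced magnitude that the text describes.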

Subsequent revised results published in October 2001 for three race/origin
groups indicated that differential undercounts were generally lower than
the initial A.C.E. estimates, but that only the undercount estimate for
Hispanics was still statistically different from zero. Finally, the latest
revised estimates of undercount, reported in March 2003, indicated that of
these three major race/origin groups, only the non-Hispanic black and
non-Hispanic white percentage undercounts were significantly different
from zero, in addition to the national net overcount.
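
One way to see which estimates differ statistically from zero is to divide each estimate by its standard error and compare the result with a critical value. The sketch below is an illustration, not the Bureau's documented procedure: the two-sided threshold of 1.96 (roughly a 95 percent confidence level) is an assumption, and the function names are hypothetical. The figures are the Revision I and Revision II columns of Table 1.

```python
# Assumption: a two-sided test at roughly the 95 percent level.
# The report does not state which confidence level the Bureau used.
Z_CRITICAL = 1.96

# (estimate in %, standard error in %) transcribed from Table 1.
REVISION_I = {
    "Non-Hispanic black": (0.78, 0.45),
    "Hispanic": (1.25, 0.54),
    "Other": (-0.28, 0.20),
    "Total": (0.06, 0.18),
}
REVISION_II = {
    "Non-Hispanic black": (1.84, 0.43),
    "Hispanic": (0.71, 0.44),
    "Non-Hispanic white": (-1.13, 0.20),
    "Total": (-0.49, 0.20),
}


def z_score(estimate, std_error):
    """Number of standard errors separating the estimate from zero."""
    return estimate / std_error


def differs_from_zero(estimate, std_error, z_critical=Z_CRITICAL):
    """True when the estimate lies more than z_critical standard errors from zero."""
    return abs(z_score(estimate, std_error)) > z_critical


def significant_groups(table):
    """Names of the rows whose estimates differ from zero under the assumed threshold."""
    return [
        group
        for group, (estimate, std_error) in table.items()
        if differs_from_zero(estimate, std_error)
    ]


if __name__ == "__main__":
    print("Revision I:", significant_groups(REVISION_I))
    print("Revision II:", significant_groups(REVISION_II))
```

With the assumed threshold, only the Hispanic estimate is flagged for Revision I, and the non-Hispanic black, non-Hispanic white, and total estimates are flagged for Revision II, which agrees with the description above. A different threshold could change which rows are flagged.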

Unlike the estimates of census population accuracy, which were revised
twice since initial estimates, the census housing count accuracy estimates
have not been revised and are based on the initial A.C.E. data. A subset
of those results, including those provided here, were also published in
October 2001.

1 More complete estimates of differential population undercounts are
available on the Bureau's Internet Web site at www.census.gov.

Table 2: Estimated Percent Occupied Housing Undercounts Differ from
Population Undercounts for Selected Race/Origin Groups

Estimate in % (standard error in %)

                                          Housing         Population
                                          undercount      undercount
Non-Hispanic black                       -0.45 (0.29)     1.84 (0.43)
Hispanic                                  0.06 (0.35)     0.71 (0.44)
Non-Hispanic white or "some other race"   0.38 (0.14)    -1.13 (0.20)
Total                                     0.27 (0.13)    -0.49 (0.20)

Source: J. Gregory Robinson (Population Division) and Glenn S. Wolfgang
(Decennial Statistical Studies Division), Comparison of A.C.E. Revision II
Population Coverage Results with HUCS Housing Coverage Results, DSSD
A.C.E. Revision II Memorandum Series #PP-50 (December 31, 2002).

Notes: The reported net housing undercounts are for housing units only and
do not include group quarters.

Net undercounts reported here are referred to by the Bureau as "single
cell dual system estimates".
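
The report does not give the formula behind a dual system estimate. As a rough illustration only, the sketch below uses the textbook capture-recapture form, in which two independent counts of the same area and the number of people found in both are combined into a population estimate. The Bureau's production estimator includes adjustments (for example, for erroneous enumerations and imputed records) that are not shown here, and every number and name in the example is hypothetical.

```python
def dual_system_estimate(census_count, survey_count, matched_count):
    """Textbook capture-recapture estimate of the true population.

    census_count:  people counted by the census in the area
    survey_count:  people counted by the independent coverage survey
    matched_count: people found in both counts
    """
    if matched_count <= 0:
        raise ValueError("matched_count must be positive")
    return census_count * survey_count / matched_count


def percent_net_undercount(census_count, estimated_population):
    """Net undercount as a percentage of the estimated population.

    A negative result indicates a net overcount.
    """
    return 100.0 * (estimated_population - census_count) / estimated_population


if __name__ == "__main__":
    # Hypothetical figures for a single estimation cell, not Bureau data.
    census, survey, matched = 900, 800, 720
    estimate = dual_system_estimate(census, survey, matched)
    print(estimate, percent_net_undercount(census, estimate))
```

In this simplified form, a low match rate between the two counts raises the population estimate, which is why records that cannot be matched at all, such as nameless imputations, have to be handled outside the matching process.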

                                    Glossary

This glossary is provided for reader convenience, not to provide
authoritative or complete definitions.

Accuracy and Coverage    The Bureau's Accuracy and Coverage Evaluation
Evaluation               (A.C.E.) program was intended to measure coverage
                         error (see below) for the 2000 Decennial Census.
                         The program was to enable the Bureau to more
                         accurately estimate the rate of coverage error via
                         a sample survey of select areas nationwide, and if
                         warranted, to use the results of this survey to
                         adjust census estimates of the population for
                         nonapportionment purposes.

Census Adjustment        The use of statistical information to adjust
                         official census data.

Count of Housing         A tally of certain kinds of dwellings, including
                         single-family homes, apartments, and mobile homes,
                         along with demographic information on the
                         inhabitants.

Count of Population      The headcount of everybody in the nation,
                         regardless of their dwelling.

Coverage Error           The extent that minority groups are over- or
                         undercounted in comparison to other groups in the
                         census.

Coverage Evaluation      Statistical studies to evaluate the level and
                         sources of coverage error in censuses and surveys.

Duplicates               When the census erroneously counts a person more
                         than once.

Residence Rules          The rules the Bureau uses to determine where
                         people should be counted.

Related GAO Products

2010 Census: Cost and Design Issues Need to Be Addressed Soon.
GAO-04-37. (Washington, D.C.: January 15, 2004).

2000 Census: Coverage Measurement Programs' Results, Costs, and
Lessons Learned. GAO-03-287. (Washington, D.C.: January 29, 2003).

2000 Census: Complete Costs of Coverage Evaluation Programs Are Not
Available. GAO-03-41. (Washington, D.C.: October 31, 2002).

2000 Census: Coverage Evaluation Matching Implemented as Planned,
but Census Bureau Should Evaluate Lessons Learned. GAO-02-297.
(Washington, D.C.: March 14, 2002).

2000 Census: Coverage Evaluation Interviewing Overcame
Challenges, but Further Research Needed. GAO-02-26. (Washington, D.C.:
December 31, 2001).

  GAO's Mission

The Government Accountability Office, the audit, evaluation and
investigative arm of Congress, exists to support Congress in meeting its
constitutional responsibilities and to help improve the performance and
accountability of the federal government for the American people. GAO
examines the use of public funds; evaluates federal programs and policies;
and provides analyses, recommendations, and other assistance to help
Congress make informed oversight, policy, and funding decisions. GAO's
commitment to good government is reflected in its core values of
accountability, integrity, and reliability.
