[Federal Register Volume 66, Number 46 (Thursday, March 8, 2001)]
[Notices]
[Pages 14004-14046]
From the Federal Register Online via the Government Publishing Office [www.gpo.gov]
[FR Doc No: 01-5479]



[[Page 14003]]

-----------------------------------------------------------------------

Part II





Department of Commerce





-----------------------------------------------------------------------



Bureau of the Census



-----------------------------------------------------------------------



Report of Tabulations of Population to States and Localities Pursuant 
to Title 13 U.S.C., Section 141(c), and Availability of Other 
Population Information; the Executive Steering Committee for Accuracy 
and Coverage Evaluation Policy (ESCAP) Report; and the Census Bureau 
Director's Recommendation; Notice


[[Page 14004]]


-----------------------------------------------------------------------

DEPARTMENT OF COMMERCE

Bureau of the Census


Report of Tabulations of Population to States and Localities 
Pursuant to Title 13 U.S.C., Section 141(c), and Availability of Other 
Population Information; the Executive Steering Committee for Accuracy 
and Coverage Evaluation Policy (ESCAP) Report; and the Census Bureau 
Director's Recommendation

AGENCY: Bureau of the Census.

ACTION: Notice of recommendation and report.

-----------------------------------------------------------------------

SUMMARY: This notice provides the United States Census Bureau (Census 
Bureau) Director's recommendation on methodology and the Executive 
Steering Committee for Accuracy and Coverage Evaluation (A.C.E.) Policy 
(ESCAP) report analyzing the methodologies that may be used in making 
the tabulations of population reported to states and localities 
pursuant to Title 13 U.S.C., Section 141(c), and the factors relevant 
to the possible choices of methodology. Concurrent with this notice to 
the public, the Census Bureau Director's recommendation and the ESCAP 
report have been delivered to the Secretary of Commerce. The 
recommendation and the report are attached as exhibits to the 
SUPPLEMENTARY INFORMATION section of this notice. In addition to 
publication in the Federal Register, the recommendation and the report 
will be posted on the Census Bureau Web site at http://www.census.gov/dmd/www/2khome.htm.

FOR FURTHER INFORMATION CONTACT:
    John H. Thompson, Associate Director for Decennial Census, U.S. 
Census Bureau, SFC-2, Room 2018, Washington, DC 20233. Telephone: (301) 
457-3946; fax: (301) 457-3024.

SUPPLEMENTARY INFORMATION:

Background Information

    The decennial census is mandated by the United States Constitution 
(Article I, Section 2, Clause 3) to provide the population counts 
needed to apportion the seats in the United States House of 
Representatives among the states. By December 28, 2000, the Census 
Bureau fulfilled its Constitutional duty by delivering to the Secretary 
of Commerce the state population totals used for congressional 
apportionment. In accordance with the January 25, 1999, Supreme Court 
ruling, Department of Commerce v. House of Representatives, 119 S.Ct. 
765 (1999), the Census Bureau did not use statistical sampling to 
produce the state population totals used for congressional 
apportionment.
    However, the Census Bureau did consider the use of statistical 
methods to produce the more detailed data required for legislative 
redistricting. The Census Bureau designed the A.C.E. to permit 
correction of the initial census results to account for systematic 
patterns of net undercount and net overcount. The Census Bureau 
preliminarily determined that the A.C.E., if properly conducted, should 
produce more accurate census data by improving coverage and reducing 
differential undercounts. A senior-level committee, the Executive 
Steering Committee for A.C.E. Policy (ESCAP), was formed to evaluate 
whether the data produced in Census 2000 support this initial 
determination. The ESCAP used analysis from reports on topics chosen 
for their usefulness in informing the decision on the suitability of 
using the A.C.E. data for legislative redistricting. The Committee also 
drew upon work from other Census Bureau staff, as appropriate.
    As required by final rule, Title 15, Code of Federal Regulations, 
Part 101, issued by the Secretary of Commerce (66 FR 11232, February 
23, 2001), the ESCAP has submitted its report (attached below), 
accompanied by the recommendation of the Director of the Census Bureau 
to the Secretary of Commerce. The Secretary will make the final 
determination regarding the methodology to be used in calculating the 
tabulations of population reported to states and localities for 
legislative redistricting. By April 1, 2001, the Census Bureau must 
provide these tabulations, as required by Public Law 94-171, to each 
state so that it can redraw congressional, state, and local 
legislative districts.

    Dated: March 1, 2001.
William G. Barron, Jr.,
Acting Director, Bureau of the Census.

Attachment 1 to Preamble

March 1, 2001
Memorandum for Donald L. Evans, Secretary of Commerce
From: William G. Barron, Jr., Acting Director
Subject: Recommendation on Adjustment of Census Counts
    I am forwarding the report of the Executive Steering Committee 
for A.C.E. Policy (ESCAP) on whether the Accuracy and Coverage 
Evaluation (A.C.E.) should be used to adjust the Census 2000 counts. 
I asked the ESCAP to provide a recommendation in its report because 
I rely on the knowledge, experience, and technical expertise of the 
Committee and Census Bureau staff, who have worked extremely hard and 
with tremendous dedication through every phase of Census 2000.
    As a member of the ESCAP and as Acting Director, I concur with 
and approve the Committee's recommendation that unadjusted census 
data be released as the Census Bureau's official redistricting data. 
The law requires that the Census Bureau issue data for use in 
redistricting by April 1, 2001 (13 U.S.C. 141(c)). The Committee 
reached this recommendation because it is unable, based on the data 
and other information currently available, to conclude that the 
adjusted data are more accurate for use in redistricting. The 
primary reason for arriving at this conclusion is the apparent 
inconsistency in population growth over the decade as estimated by 
the A.C.E. and demographic analysis. These differences cannot be 
resolved in the time available for the Committee's work. The 
importance of completing this type of analysis has been emphasized 
clearly and explicitly in the Census Bureau's public presentations 
outlining the scope, intent, and purpose of ESCAP deliberations. For 
example, the June 2000 Feasibility Document contained various 
references to the importance of demographic analysis and demographic 
estimates as key components of data and analysis to inform the ESCAP 
recommendation. This point was reinforced in materials the Census 
Bureau presented on October 2, 2000, at a public workshop sponsored 
by the National Academy of Sciences. The inconsistency raises the 
possibility of an unidentified error in the A.C.E. estimates or 
Census 2000. This possibility cannot be eliminated by the legally 
mandated deadline.
    I believe the attached report and this cover memo meet the 
requirements set forth in regulation 66 Fed. Reg. 11231 (February 
23, 2001), ``Report of Tabulations of Population to States and 
Localities Pursuant to 13 U.S.C. 141(c) and Availability of Other 
Population Information; Revocation of Delegation of Authority.''
    Please let me know if I can provide you with additional 
information on these matters.

Attachment 2 to Preamble

Report of the Executive Steering Committee for Accuracy and Coverage 
Evaluation Policy

Recommendation Concerning the Methodology to be Used in Producing the 
Tabulations of Population Reported to States and Localities Pursuant to 
13 U.S.C. 141(c), March 1, 2001

Recommendation

    The Executive Steering Committee for A.C.E. Policy (ESCAP) is 
unable to conclude, based on the information available at this time, 
that the adjusted Census 2000 data are more accurate for 
redistricting. Accordingly, ESCAP recommends that the unadjusted 
census data be released as the Census Bureau's official 
redistricting data.
    The Census Bureau publicly set forth the criteria it would use 
to evaluate the success of the Accuracy and Coverage Evaluation 
(A.C.E.), stating that the adjustment decision would be based on: 
(1) a consideration of operational data to validate the successful 
conduct of the A.C.E.; (2) whether the A.C.E. measures of undercount 
were consistent with historical patterns of undercount and 
independent demographic analysis benchmarks; and (3) a review of 
quality measures.

[[Page 14005]]

    The ESCAP spent many weeks examining voluminous evidence, and 
has debated at great length whether adjustment based on the A.C.E. 
would improve Census 2000 data for use in redistricting. As 
described in the following Report, the Committee considered a wide 
variety of evidence relating to the accuracy of Census 2000 and the 
A.C.E. After careful consideration of the data, the Committee has 
concluded that there is considerable evidence to support the use of 
adjusted data, and that Census 2000 and A.C.E. operations were well 
designed and conducted. However, demographic analysis comparisons, 
and possible issues related to synthetic and balancing error, 
preclude a determination at this time that the adjusted data are 
more accurate.
    As described in detail in the Report, demographic analysis 
indicates fundamental differences with the A.C.E. In particular, 
demographic analysis estimates are significantly lower than the 
A.C.E. estimates for important population groups. The Committee 
investigated this inconsistency extensively, but in the time 
available could not adequately explain the result.
    The inconsistency between the A.C.E. and the demographic 
analysis estimates is most likely the result of one or more of the 
following three scenarios:
    1. The estimates from the 1990 census coverage measurement 
survey (the Post-Enumeration Survey), the 1990 demographic analysis 
estimates, and the 1990 census were far below the Nation's true 
population on April 1, 1990. This scenario means that the 1990 
census undercounted the population by a significantly greater amount 
and degree than previously believed, but that Census 2000 included 
portions of this previously un-enumerated population.
    2. Demographic analysis techniques to project population growth 
between 1990 and 2000 do not capture the full measure of the 
Nation's growth.
    3. Census 2000, as corrected by the A.C.E., overestimates the 
Nation's population.
    The inconsistency between the demographic analysis estimates and 
the A.C.E. estimates raises the possibility of an as-yet 
undiscovered problem in the A.C.E. or census methodology, scenario 
3, above. The Census Bureau must further investigate this 
inconsistency, and the possibility of a methodological error, before 
it can recommend that adjustment would improve accuracy. Similarly, 
concerns with synthetic and balancing error must be more fully 
investigated and addressed.
    The ESCAP's recommendation to use the unadjusted data was a 
difficult one. The Committee conducted a number of analyses directed 
at understanding the inconsistency with demographic analysis and the 
synthetic and balancing error issues, but could not find a complete 
explanation in the time available. The Committee believes it likely 
that further research may establish that adjustment based on the 
A.C.E. would result in improved accuracy. However, the uncertainty 
due to these concerns is too large at this time to allow for a 
recommendation to adjust. The Committee believes that further 
research will verify that Census 2000 improved on the coverage 
levels of past censuses, but that the unadjusted census totals will 
still reflect a net national undercount. The Committee further 
believes the evidence will confirm that the differential undercount 
(the lower than average coverage of minorities, renters, and 
children) was reduced, but not eliminated, in Census 2000.
    The ESCAP finds that both the census and the A.C.E. were 
efficient and effective operations that produced high quality data. 
The Committee is proud of the Census Bureau's design work on both 
the census and the A.C.E. and believes that both produced measurably 
better results. The high quality of the census has made the 
adjustment decision more difficult than in 1990. The closeness of 
the A.C.E. and the census heightens the concern that an undiscovered 
problem with Census 2000 or the A.C.E. will result in a decrease in 
accuracy from adjustment. Today's recommendation is, however, in no 
way a reflection of weaknesses in data quality or in the quality of 
staff work.
    The ESCAP makes this recommendation in light of the information 
now available. Additional evaluations, research, and analysis may 
allow the Census Bureau to resolve the noted concerns. The Census 
Bureau will continue to investigate these issues and will make the 
results of this research available, as is consistent with the 
Bureau's long-standing policy of openness.

Executive Summary

    The ESCAP cannot recommend adjustment at this time. The Executive 
Steering Committee for Accuracy and Coverage Evaluation (A.C.E.) Policy 
(ESCAP) is required by regulation to prepare a written report analyzing 
the methodologies and factors involved in the adjustment decision. The 
Acting Director of the Census Bureau asked the ESCAP to include a 
recommendation in its Report. The ESCAP spent many weeks examining 
voluminous evidence, and has debated at great length whether adjustment 
would improve Census 2000 data for use in redistricting. After having 
evaluated a wide variety of evidence relating to the accuracy of Census 
2000, and developed an extensive record of its deliberations, the ESCAP 
is unable to conclude, based on the information available at this time, 
that the adjusted Census 2000 data are more accurate for redistricting.
    While the majority of the evidence indicates both the continued 
existence of a differential undercount of the population and the 
superior accuracy of the adjusted numbers, the ESCAP has concerns. 
There is a significant inconsistency between the A.C.E. estimates and 
demographic analysis estimates. Additionally, possible synthetic and 
balancing errors may affect the accuracy of the adjusted numbers. Until 
these concerns are more fully investigated and addressed, the ESCAP 
cannot recommend using adjustment. Accordingly, ESCAP has recommended 
that unadjusted census data be released as the Census Bureau's official 
redistricting data.
    The ESCAP makes this recommendation in light of the information now 
available. Additional evaluations, research, and analysis may alleviate 
these concerns and support the evidence that indicates the superior 
accuracy of the adjusted data. Accordingly, the Census Bureau intends 
to continue its research into these concerns.
    The Census Bureau relied on three prespecified decision criteria. 
The ESCAP based its adjustment recommendation on: (1) a consideration 
of operational data to validate the successful conduct of the A.C.E.; 
(2) whether the A.C.E. measures of undercount were consistent with 
historical patterns of undercount and independent demographic analysis 
benchmarks; and (3) a review of quality measures. These criteria were 
specified in advance in the Census Bureau's June, 2000 ``Accuracy and 
Coverage Evaluation: Statement on the Feasibility of Using Statistical 
Methods to Improve the Accuracy of Census 2000.''
    Both Census 2000 and the A.C.E. were of high quality. The ESCAP's 
recommendation against adjustment in no way suggests serious concern 
about the quality of the census or the A.C.E. operations, as the ESCAP 
believes that both Census 2000 and the A.C.E. were efficient and 
effective operations that produced high quality data. All major 
programs in the census were completed on schedule and within budget, 
and design improvements in both Census 2000 and the A.C.E. produced 
measurably better results. An innovative advertising and partnership 
program encouraged public participation, and adequate staffing and pay 
contributed to improved data quality. The ESCAP concludes that the 
unadjusted census data are of high quality.
    The A.C.E. was also a design and operational success. The A.C.E. 
included a variety of design improvements that resulted in better data 
quality, including enhanced computer processing and better matching. 
The Census 2000 adjusted data have lower variances and comparable or 
improved missing data rates compared to the 1990 adjusted data. The 
Census Bureau followed the A.C.E.'s prespecified design except for two 
specific instances that are easily explained by good and normal 
statistical practice. Both of these changes should be considered 
enhancements. The ESCAP has concluded that both Census 2000 and the 
A.C.E. were effective and efficient operations.

[[Page 14006]]

    Demographic analysis estimates were inconsistent with the adjusted 
data. The demographic analysis estimates indicate fundamental 
differences with the results of the A.C.E. In particular, the 
demographic analysis estimates are significantly lower than both Census 
2000 and the A.C.E. estimates for important population groups. The 
Committee investigated this inconsistency extensively, but in the time 
available could not adequately explain it. The inconsistency between 
the A.C.E. and the demographic analysis estimates is most likely the 
result of one or more of the following three scenarios:
    1. The estimates from the 1990 census coverage measurement survey 
(the Post-Enumeration Survey), the 1990 demographic analysis estimates, 
and the 1990 census were far below the Nation's true population on 
April 1, 1990. This scenario means that the 1990 census undercounted 
the population by a significantly greater amount and degree than 
previously believed, but that Census 2000 included portions of this 
previously un-enumerated population.
    2. Demographic analysis techniques to project population growth 
between 1990 and 2000 do not capture the full measure of the Nation's 
growth.
    3. Census 2000, as corrected by the A.C.E., overestimates the 
Nation's population.
    The inconsistency between the demographic analysis estimates and 
the A.C.E. estimates raises the possibility of an as-yet undiscovered 
problem in the A.C.E. or census methodology, scenario 3, above. The 
Census Bureau must further investigate this inconsistency, and the 
possibility of a methodological error, before it can recommend that 
adjustment would improve accuracy.
    Quality measures indicate the adjusted data are more accurate 
overall, but concerns were identified. The ESCAP directed the 
preparation of several total error models and loss function analyses to 
evaluate whether the adjusted data are more accurate than the 
unadjusted data. The Committee examined the loss functions for evidence 
of a clearly measurable improvement under a variety of scenarios and 
found the following:

    1. Under what the Committee considered reasonable assumptions, 
state, congressional district, and county level analyses showed a 
marked improvement for adjustment.
    2. However, some less likely scenarios indicated that the 
unadjusted census was more accurate at all geographic levels.
    3. The analysis of accuracy for counties with populations below 
100,000 people indicated that the unadjusted census was more 
accurate.

    The ESCAP believes that under reasonable scenarios, and absent the 
concerns noted above, adjustment would result in more accurate data at 
the state, congressional district, and county levels. Even though 
smaller counties would have been less accurate, the analysis indicated 
an overall improvement in accuracy from adjustment. However, the 
concerns noted above are all potentially indicative of undetected 
problems. The ESCAP is unable to conclude at this time that the 
adjusted data are superior because further research on these concerns 
could reverse the finding of the adjusted data's superior accuracy.
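    The loss function comparisons described above can be illustrated with 
a minimal sketch. The fragment below is not the Census Bureau's 
prespecified loss function analysis; it simply shows, under the 
assumption of a weighted squared-error loss and a set of hypothetical 
``target'' populations taken from a total error model, how adjusted and 
unadjusted counts could be compared and the smaller aggregate loss 
preferred under a given scenario.

# Illustrative sketch only; the function names and weights are generic
# and do not reflect the Bureau's actual loss function specifications.
def weighted_squared_error_loss(counts, targets, weights):
    # Aggregate weighted squared error of a set of area counts against
    # target populations taken from a total error model.
    return sum(w * (c - t) ** 2
               for c, t, w in zip(counts, targets, weights))

def adjustment_preferred(adjusted, unadjusted, targets, weights):
    # True if the adjusted counts have the smaller aggregate loss than
    # the unadjusted counts under this scenario.
    return (weighted_squared_error_loss(adjusted, targets, weights)
            < weighted_squared_error_loss(unadjusted, targets, weights))
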
    The ESCAP assessed other factors that might affect accuracy. The 
ESCAP examined the issues of synthetic and balancing error and 
concluded that the potential for these errors cannot be ignored, 
particularly when considered in conjunction with the inconsistency with 
demographic analysis. Finally, the ESCAP reviewed the treatment of late 
census additions and whole person imputations, because the number of 
these cases significantly increased from 1990, concluding that these 
cases did not raise serious new concerns.
    Additional issues were considered. The ESCAP reiterated that the 
Census Bureau does not consider block-level accuracy to be an important 
criterion with which to evaluate either Census 2000 or the A.C.E., and 
explained that had adjusted data files been released, adjustments for 
overcounts would not have resulted in the removal of any records from 
Census 2000 files.

Table of Contents

Introduction
    Census and A.C.E. Results in Brief
    ESCAP Procedure and Process
Findings
    Conduct of Key Operations
    Census Quality Indicators
    Address List Development
    Questionnaire Return--Census 2000 Mail Return Rates
    Nonresponse Follow-up
    Housing Unit Unduplication Program
    Data Processing
    A.C.E. Quality Indicators
    Historical Measures of Census Coverage--Comparison with 
Demographic Analysis
    Measures of Census and A.C.E. Quality
    Total Error Model
    Loss Function Analysis
    Other Factors That May Affect Accuracy
    Synthetic Error
    Balancing Error
    Late Adds and Whole Person Imputations
    Misclassification Error
Additional Issues
    Block Level Accuracy
    Adjustments for Overcounts
Attachments

Introduction

    This report fulfills the responsibility of the Executive Steering 
Committee for Accuracy and Coverage Evaluation (A.C.E.) Policy (``the 
ESCAP'' or ``the Committee'') to prepare a ``written report to the 
Director of the Census analyzing the methodologies that may be used in 
making the tabulations of population reported to States and localities 
pursuant to 13 U.S.C. 141(c), and the factors relevant to the possible 
choices of methodology.'' \1\ As is required by regulation, the 
Director of the Census will forward this report and his recommendation 
regarding adjustment to the Secretary of Commerce. This report is also 
being released to the public at the same time that it is being 
forwarded to the Secretary of Commerce. \2\ The Secretary of Commerce 
will make the final determination about whether to adjust the data that 
will be released pursuant to P.L. 94-171.
---------------------------------------------------------------------------

    \1\ The phrase ``the methodologies that may be used in making 
the tabulations of population reported to States and localities 
pursuant to 13 U.S.C. 141(c)'' refers to the decision about whether 
the Census Bureau should release adjusted or unadjusted data for the 
states to use in redistricting. Rather than repeating this 
cumbersome legal phrase, this document will often refer simply to 
``the adjustment decision.''
    \2\ In addition to the requirement to make this report public, 
the Census Bureau firmly believes that full disclosure and a 
vigorous and informed debate will improve both the Census Bureau's 
internal processes and the public's understanding of statistical 
adjustment. Accordingly, the Bureau is also making available on its 
Internet site the documentation supporting the ESCAP report. This 
additional documentation includes the analytical reports outlined 
publicly to the National Academy of Sciences Panel to Review the 
2000 Census in October, 2000, along with underlying data, analysis, 
and supporting documentation. An index to the supporting 
documentation is attached.
---------------------------------------------------------------------------

    The Census Bureau released in June 2000 the report ``Accuracy and 
Coverage Evaluation: Statement on the Feasibility of Using Statistical 
Methods to Improve the Accuracy of Census 2000,'' (the Feasibility 
Document). The Feasibility Document stated that ``the Census Bureau 
will make the determination to use the A.C.E. to correct Census 2000 
after evaluating (1) the conduct of key operations, (2) the consistency 
of the A.C.E. to historical measures of undercount, and (3) measures of 
quality.'' \3\ This report will, accordingly, evaluate the conduct of 
key operations, compare the Accuracy and Coverage Evaluation Survey 
(A.C.E.) estimates to historic measures of the undercount, and evaluate 
the quality of both the A.C.E. and the census.
---------------------------------------------------------------------------

    \3\ Feasibility Document, p. 33.

---------------------------------------------------------------------------

[[Page 14007]]

Census and A.C.E. Results in Brief

    As the Census Bureau has stated publicly, Census 2000 was an 
operational success, meeting or exceeding goals. This success may be 
attributed to a number of improvements, including the following:
    • A multi-faceted marketing and partnership program that encouraged 
householders to complete and mail back their census forms,
    • The ability to hire and retain enough highly skilled temporary 
staff throughout the course of the census, permitting timely completion 
of operations,
    • The timely completion of nonresponse follow-up, which provided 
sufficient time and resources to conduct other operations designed to 
improve coverage, and
    • The use of digital imaging and optical character recognition 
technology for the first time to recognize handwritten answers in 
addition to marks on the form, a vast improvement that allowed the 
Census Bureau to process the data faster and permitted multiple 
response options.
    The A.C.E. was also an operational success that met or exceeded 
goals. The A.C.E. was completed on time and generally produced data 
equal to or superior in quality to data from prior coverage measurement 
surveys.
    The A.C.E. supports the conclusion that the quality of the initial 
census was generally good, finding that Census 2000 reduced both net 
and differential undercoverage from 1990 census levels. The A.C.E. 
estimates that the net national undercount was reduced from the 1990 
rate of 1.61 percent to 1.18 percent in 2000. \4\ This reduction is 
substantial and reflects high census quality. The A.C.E. further found 
that not only was the net undercount reduced, but there was a reduction 
in the differential undercount. According to the 1990 Post-Enumeration 
Survey, minorities, renters, and children were differentially 
undercounted in the 1990 census, and other methods indicate a 
differential undercoverage of minorities in earlier censuses. While 
these groups still have higher undercount rates than the population as 
a whole, the differential has dropped considerably.
---------------------------------------------------------------------------

    \4\ These figures compare the 1990 and 2000 undercount rates as 
measured by coverage measurement surveys. The coverage measurement 
survey conducted in connection with the 1990 census was called the 
Post-Enumeration Survey (PES). As will be discussed below, 
Demographic Analysis presents an alternative measure of census 
coverage.
---------------------------------------------------------------------------
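
    For reference, the percent net undercount figures quoted above follow 
the usual convention of expressing the difference between the coverage 
survey's estimate of the true population and the census count as a 
percentage of the estimated true population. The sketch below is 
illustrative only; the figures in the example are hypothetical and were 
chosen merely to reproduce the 1.18 percent rate, not to represent the 
actual A.C.E. totals.

# Illustrative sketch of the percent net undercount convention; not
# production code, and the example figures are hypothetical.
def percent_net_undercount(estimated_true_population, census_count):
    # A positive value is a net undercount; a negative value, as in some
    # rows of the tables below, is a net overcount.
    return (100.0 * (estimated_true_population - census_count)
            / estimated_true_population)

print(round(percent_net_undercount(285_000_000, 281_637_000), 2))  # 1.18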

    The A.C.E. did not judge Census 2000 quality to be perfect, 
however. The A.C.E. indicated that while differential coverage was 
reduced, it was not eliminated, and that Census 2000 continued 
longstanding patterns of differential coverage, with minority groups, 
renters, and children all exhibiting lower coverage rates. \5\
---------------------------------------------------------------------------

    \5\ The percent net undercount for owners was 0.44 percent 
compared to 2.75 percent for renters, and the non-Hispanic White 
undercount rate of 0.67 percent was lower than the rates for non-
Hispanic Blacks (2.17 percent) and Hispanics (2.85 percent).
---------------------------------------------------------------------------

    Coverage measurement surveys such as the A.C.E. are not the only 
methods available to estimate census coverage; the Census Bureau also 
uses demographic analysis (DA) to assess net and differential 
population coverage. DA uses records and estimates of births, deaths, 
legal immigration, and Medicare enrollments, along with estimates of 
emigration and net undocumented immigration, to estimate the national 
population independently of the census. The Census Bureau has long 
relied on DA as an important independent benchmark for validation of 
the accuracy of both the census and coverage measurement surveys such 
as the A.C.E. Initial DA results, however, presented a major 
inconsistency with the A.C.E. results--instead of confirming a net 
undercount, DA estimates that Census 2000 overcounted the national 
population by 1.8 million individuals. Even an alternative DA that 
assumed a doubling of net undocumented immigration during the 1990's 
(compared with the initial DA) showed a small net undercount of 0.9 
million, substantially below the net undercount of 3.3 million shown by 
the A.C.E. These inconsistencies and DA in general will be discussed in 
more detail later in this report. The DA and A.C.E. estimates did 
agree, however, that Census 2000 perpetuated the historical phenomenon 
of the differential undercount.
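    A minimal sketch of the demographic analysis accounting identity 
described above may be helpful. It is illustrative only and omits the 
detail of the Bureau's actual DA system, in which cohorts born before 
1990 are carried forward from earlier estimates and the population aged 
65 and over is benchmarked to adjusted Medicare enrollment.

# Illustrative sketch of the DA accounting identity; the argument names
# are generic and the function is not the Bureau's DA system.
def da_population_estimate(base_population, births, deaths,
                           legal_immigration, emigration,
                           net_undocumented_immigration):
    return (base_population + births - deaths
            + legal_immigration - emigration
            + net_undocumented_immigration)

# Comparing a DA estimate with the census count yields the DA measure of
# net coverage. As noted above, the initial DA estimate fell about 1.8
# million below Census 2000 (a net overcount), while the A.C.E. implied
# a net undercount of roughly 3.3 million.
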
    The following table sets forth the A.C.E.'s results in summary 
fashion:

     Table 1a.--Percent Net Undercount for Major Groups: 2000 A.C.E.
------------------------------------------------------------------------
                                                    Net        Standard
              Estimation grouping                undercount     Error
                                                 (percent)    (percent)
------------------------------------------------------------------------
        Total population in Households........         1.18         0.13
Race and Hispanic Origin:
    American Indian and Alaska Native (on              4.74         1.20
     reservation).............................
    American Indian and Alaska Native (off             3.28         1.33
     reservation).............................
    Hispanic Origin (of any race).............         2.85         0.38
    Black or African American (not Hispanic)..         2.17         0.35
    Native Hawaiian and Other Pacific Islander         4.60         2.77
     (not Hispanic)...........................
    Asian (not Hispanic)......................         0.96         0.64
    White or Some Other Race (not Hispanic)...         0.67         0.14
Age and Sex:
    Under 18 years............................         1.54         0.19
    18 to 29 years:
        Male..................................         3.77         0.32
        Female................................         2.23         0.29
    30 to 49 years:
        Male..................................         1.86         0.19
        Female................................         0.96         0.17
    50 years and over:
        Male..................................        -0.25         0.18
        Female................................        -0.79         0.17
Housing Tenure:
    In owner-occupied housing units...........         0.44         0.14
    In nonowner-occupied units................         2.75         0.26
------------------------------------------------------------------------
Notes:
 The race and Hispanic categories shown on this table represent
  estimation groupings used in developing estimates based on the A.C.E.
  Survey and do not conform with race and Hispanic categories that will
  appear in the redistricting (P.L. 94-171) files and other Census 2000
  data products. In developing the estimation groupings used to evaluate
  the coverage of Census 2000, the principal consideration was to
  combine people who were expected to have the same probability of being
  counted in Census 2000. Consequently, the race and Hispanic origin
  groupings used to create the A.C.E. estimates of coverage are
  exceedingly complex. For a complete description of the estimation
  groups, see DSSD Memorandum Q-37, which will be provided on request.
 In general, American Indians and Alaska Natives (AIAN) are
  included in that category, regardless of whether they marked another
  race or are Hispanic. A few exceptions apply, especially for those who
  do not live on a reservation, on trust lands, or in an AIAN
  statistical area.
 Similarly, Native Hawaiians and Other Pacific Islanders (NHPI)
  generally are included in that category, unless they lived outside of
  Hawaii and marked more than one race or marked Hispanic.
 Hispanics are mostly in that category, unless they marked AIAN
  and lived on a reservation, on trust lands, or in an AIAN statistical
  area, or marked NHPI and lived in Hawaii.
 People who marked Black or African American are generally in
  that category unless they fell in the categories described above;
  similarly those who marked Asian are generally in that category,
  unless they fell in the categories described above.
 The final category includes most people who marked only White
  or only Some Other Race or marked three or more races but did not fall
  into the categories described above.
 The data in this table contain sampling and non-sampling error;
  a minus sign denotes a net overcount.
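
    Because the table reports a sampling standard error alongside each 
estimate, an approximate 95 percent interval for any row can be formed 
as the estimate plus or minus 1.96 times its standard error. The sketch 
below is illustrative only; it assumes approximate normality of the 
estimator and is not an official Census Bureau interval.

# Illustrative sketch: approximate 95 percent confidence interval for a
# row of Table 1a, assuming approximate normality of the estimator (not
# an official Census Bureau computation).
def approximate_interval(estimate_pct, standard_error_pct, z=1.96):
    half_width = z * standard_error_pct
    return (estimate_pct - half_width, estimate_pct + half_width)

# Total household population row: 1.18 percent, standard error 0.13.
print(approximate_interval(1.18, 0.13))  # roughly (0.93, 1.43) percent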

    The following table presents the results from the 1990 Census Post-
Enumeration Survey:

      Table 1b.--Percent Net Undercount for Major Groups: 1990 PES
------------------------------------------------------------------------
                                                    Net        Standard
              Estimation grouping                undercount     error
                                                 (percent)    (percent)
------------------------------------------------------------------------
Total Population\1\...........................         1.61         0.20
Race and Hispanic Origin:
    White or Some Other Race (not Hispanic)\2\         0.68         0.22
    Black or African American.................         4.57         0.55
    Hispanic Origin\3\........................         4.99         0.82
    Asian and Pacific Islander................         2.36         1.39
    American Indian and Alaska Native (on             12.22         5.29
     reservation).............................
Age and Sex:
    Under 18 years............................         3.18         0.29
    18 to 29 years:
        Male..................................         3.30         0.54
        Female................................         2.83         0.47
    30 to 49 years:
        Male..................................         1.89         0.32
        Female................................         0.88         0.25
    50 years and over:
        Male..................................        -0.59         0.34
        Female................................        -1.24         0.29
Housing Tenure:
    In owner-occupied housing units...........         0.04         0.21
    In nonowner-occupied housing units........         4.51         0.43
------------------------------------------------------------------------
Notes:
 The data in this table contain sampling and non-sampling error.
 
 The race and Hispanic categories shown on this table represent
  selected population groupings used in conducting the PES and do not
  conform exactly with race and Hispanic tabulations that were released
  from the 1990 census.
\1\ Includes household population and some Group Quarters; excludes
  institutions, military group quarters.
\2\ Includes American Indians off reservations.
\3\ Excludes Blacks or African Americans, Asian and Pacific Islanders,
  and American Indians on reservations.

    The following table summarizes DA's estimates for Census 2000:

  Table 2.--Demographic Analysis Estimates of Percent Net Undercount by
                         Race, Sex and Age: 2000
------------------------------------------------------------------------
                                         DEMOGRAPHIC ANALYSIS--2000
             Category             --------------------------------------
                                     Average      Model 1      Model 2
------------------------------------------------------------------------
            Black Male
 
    Total........................         5.10         6.94         3.26
0-17.............................         1.47         4.86        -1.92
18-29............................         6.45         8.02         4.88
30-49............................         9.18        10.11         8.25
50+..............................         3.29         4.08         2.49
 
           Black Female
    Total........................         0.63         2.52        -1.27
0-17.............................         1.92         5.39        -1.56
18-29............................         0.12         1.93        -1.70
30-49............................         0.98         2.06        -0.10
50+..............................        -1.31        -0.45        -2.16
 
          Nonblack Male
    Total........................        -0.93        -1.21        -0.65
0-17.............................        -0.90        -1.56        -0.23
18-29............................        -4.17        -4.45        -3.89
30-49............................         0.10        -0.04         0.24
50+..............................        -0.16        -0.24        -0.08
 
          Nonblack Female
    Total........................        -1.44        -1.74        -1.14
0-17.............................        -0.32        -1.01         0.38
18-29............................        -3.66        -4.00        -3.32
30-49............................        -1.21        -1.38        -1.04
50+..............................        -1.45        -1.54        -1.35
------------------------------------------------------------------------
(A minus sign denotes a net overcount.)
Note: Model 1 uses 2000 census tabulations for Blacks that include
  people who reported ``Black'' and no other race. Model 2 uses 2000
  census tabulations for Blacks that include people who reported Black,
  whether or not they reported other races. People who reported only
  ``Some other race'' are reassigned to a specific race category (to be
  consistent with 1990 DA estimates and the historical demographic data
  series).

ESCAP Procedure and Process

    After the Supreme Court ruled in January, 1999 that the Census Act 
barred the use of statistical sampling for reapportioning the House of 
Representatives,\6\ the Census Bureau redesigned its plan for the 
census to assure that sampling was not used to arrive at the 
apportionment counts, and to provide for the possible use of sampling 
for all other purposes. This action was in accordance with the advice 
of the (then) General Counsel of the Department of Commerce that 
``Section 195 of the Census Act requires the Census Bureau, if 
feasible, to produce statistically corrected numbers from the decennial 
census for all non-apportionment purposes.'' \7\
---------------------------------------------------------------------------

    \6\ Dept. of Commerce v. House of Representatives, 119 S.Ct. 765 
(1999).
    \7\ Memorandum to the Secretary and the Director of the Census 
from Andrew J. Pincus, General Counsel, dated June 12, 2000 and 
entitled ``Legal Obligation to Produce Statistically-Corrected Non-
Apportionment Census Numbers.''
---------------------------------------------------------------------------

    The Associate Director for Decennial Census originally chartered 
the ESCAP on November 26, 1999 and charged the Committee to ``advise 
the Director in determining policy for the A.C.E. and the integration 
of the A.C.E. results into the census for all purposes except 
Congressional reapportionment.'' Thereafter, on October 6, 2000, the 
Department of Commerce, by regulation, delegated to the Director of the 
Census Bureau 
the final determination ``regarding the methodology to be used in 
calculating the tabulations of population reported to States and 
localities pursuant to 13 U.S.C. 141(c).'' This regulation further 
required the ESCAP to ``prepare a written report to the Director of the 
Census Bureau recommending the methodology to be used in making the 
tabulations of population reported to States and localities pursuant to 
13 U.S.C. 141 (c).'' \8\ The initial regulation was revised on February 
14, 2001 to provide that the Secretary of Commerce would make the final 
adjustment decision for the redistricting data, but only after 
receiving the recommendation, if any, of the Director of the Census 
Bureau, together with the ESCAP's report.\9\ Accordingly, this document 
constitutes the official report of the ESCAP to the Director analyzing 
the adjustment methodologies and setting forth the relevant factors in 
the adjustment decision. The Acting Director of the Census Bureau asked 
the ESCAP to include a recommendation in the Report. This Report is 
limited to an analysis of whether adjustment would produce improved 
data for legislative redistricting.
---------------------------------------------------------------------------

    \8\ 65 Federal Register 59713, ``Report of Tabulations of 
Population to States and Localities Pursuant to 13 U.S.C. 141(c) and 
Availability of Other Population Information,'' October 6, 2000.
    \9\ 66 Federal Register 11231, ``Report of Tabulations of 
Population to States and Localities Pursuant to 13 U.S.C. 141(c); 
Revocation of Delegation of Authority,'' February 23, 2001.
---------------------------------------------------------------------------

    The ESCAP held its first meeting on December 8, 1999 and met 
regularly until the date of this Report, meeting over 45 times, 
sometimes with more than one meeting per day. The analysis set forth in 
this document is supported by extensive staff work and many analytic 
reports on various aspects of the census and the A.C.E. The documents 
in these ``B-series'' reports represent diligent and thorough 
statistical, demographic, and analytic work conducted over many months 
of intensive effort. These more detailed reports are summarized in 
Report B-1, ``Data and Analysis to Inform the ESCAP Report,'' from 
which this Report draws heavily.\10\
---------------------------------------------------------------------------

    \10\ A list of these reports is attached to this Report.
---------------------------------------------------------------------------

    The ESCAP's membership was originally set forth in its charter and 
repeated in the regulations. There are twelve members on the ESCAP, 
with the Director functioning in an ex officio role. The Committee 
solicited needed assistance from the Associate Director for Field 
Operations, recognizing his unique contribution to the Committee's 
awareness of field operations and procedures. He contributed valuable 
input to the deliberative process and was, in effect, a member of 
ESCAP. The ESCAP represents a body of senior career Census Bureau 
professionals, with advanced degrees in relevant technical fields and/
or decades of experience in the Federal statistical system. All are 
highly competent to evaluate the relative merits of the A.C.E. data 
versus the census data and are recognized for their extensive 
contributions to the professional community.

[[Page 14010]]

    The Committee proceeded through four distinct but overlapping 
stages. The Chair arranged that minutes be prepared for all sessions, 
except for the final sessions which were private deliberations. The 
early sessions were educational, designed to make the Committee members 
aware of the details of the upcoming operations and to explain possible 
adjustment issues. The second phase was devoted to the presentation of 
evidence. As data from the census and the A.C.E. became available, 
knowledgeable individuals in the Census Bureau made presentations to 
the Committee. The Committee reviewed data from all relevant census and 
A.C.E. operations, sometimes asking staff to provide additional and new 
information. The third phase was the deliberation phase. Unlike the 
first two phases, the deliberations were closed to all but Committee 
members and individuals invited for a specific purpose: individuals 
with specialized knowledge who could respond to specific inquiries from 
the Committee members. The final and briefest stage was the review 
stage, where Committee members circulated and commented on the draft 
report.
    During the education and evidence presentation phases, the Chair 
generally arranged presentations on major issues, issues that he 
identified on his own initiative or on the suggestion of Committee 
members. During the evidence presentation stage, authors of the 
analysis reports known as ``the B-series'' presented their data and 
conclusions to the Committee. The deliberation and review phases were 
less structured with various members raising topics for discussion and 
asking for evidence. No formal vote was held; this Report reflects a 
consensus of the ESCAP.
    This report and the analysis preceding it were prepared in light of 
the statutory April 1, 2001 deadline.\11\ The Census Bureau clearly 
would have preferred to have additional time to analyze the data before 
it, and may well have reached a different recommendation had it had 
more time; however, the ESCAP believes that it has analyzed the 
available data sufficiently to make the findings contained in this 
report. This report is based on the best data available at the time. 
More data will be produced in the months and years to come that could 
affect the matters discussed in this report. As in past censuses, the 
Census Bureau will prepare a large number of detailed evaluations of 
both the census and the A.C.E. These evaluations will not be available 
for months, or in some cases, years, after the Census Bureau is 
required by law to provide redistricting data to the states. These 
final evaluations, as distinguished from the analysis reports that 
informed the ESCAP Committee, will be accomplished without the pressure 
of a legal deadline, will be based on additional information, and may, 
in some instances, reach conclusions different from those in the 
analysis reports.\12\
---------------------------------------------------------------------------

    \11\ The Census Act requires that redistricting data be 
``completed, reported, and transmitted to each respective State 
within one year after the decennial census date.'' 13 U.S.C. 
Sec. 141(c).
    \12\ A list of the planned Census 2000 final evaluations can be 
found at Attachment 3.
---------------------------------------------------------------------------

Findings

    The ESCAP has evaluated the conduct of key operations in both the 
census and the A.C.E., the consistency of the A.C.E. to historical 
measures of undercount, and measures of both census and A.C.E. 
accuracy. Accordingly, this section will evaluate:
    • The conduct of key operations (Census Quality Indicators, A.C.E. 
Quality Indicators),
    • Historical measures of census coverage--comparison with 
Demographic Analysis,
    • Measures of census and A.C.E. accuracy (Total Error Models and 
Loss Function Analysis), and
    • Other factors that may affect accuracy.

Conduct of Key Operations

Census Quality Indicators
    The ESCAP concludes that the unadjusted census was well designed 
and executed and that the results are of a high quality. There had been 
considerable concern about potential operational problems, given that 
the Census Bureau finalized its plans for Census 2000 very late in the 
census cycle in response to the Supreme Court ruling in January, 1999. 
However, Census 2000 was an operational success; all major programs 
were completed on schedule and within design parameters. Although there 
were some local problems and minor operational shortcomings, census 
operations were implemented in a controlled manner and within design 
expectations.
    The ESCAP reviewed the results of the initial census to determine 
whether improved census operations could be expected to yield high 
quality results. The ESCAP heard presentations on the results of each 
major census operation and evaluated the extent to which these 
operations were under control. The discussion in this document is not 
meant to be a complete evaluation of census operations, but rather 
focuses on information relevant to the level and pattern of census 
omissions or erroneous inclusions, because this information is directly 
relevant to understanding and assessing the results of the A.C.E.
    While several major improvements were introduced for Census 2000, 
including improved marketing, better questionnaire design, more ways to 
respond, higher pay rates, and improved processing,\13\ the basic 
design of Census 2000 was similar to the design of the last two 
censuses. Address lists were prepared from a variety of sources. 
Questionnaires were delivered to each address on the list. 
Questionnaires were principally delivered by the U.S. Postal Service. 
In areas with rural-style addresses, census workers delivered the 
questionnaires. Households were asked to return the questionnaires by 
mail. Those addresses that did not return a questionnaire by mail were 
followed up by census workers in the nonresponse follow-up (NRFU) 
operation. NRFU was followed by a coverage improvement follow-up 
operation. Each major operation had its own quality control procedures.
---------------------------------------------------------------------------

    \13\ For more detail see ``Census 2000 Operational Plan--
December, 2000.''
---------------------------------------------------------------------------

    The following is a brief discussion of the quality indicators 
associated with some of the major Census 2000 operations.
    Address List Development. A foundation of the decennial census 
process is the list of housing units representing every known residence 
in the country. The address list is dynamic, with updates occurring at 
a number of phases throughout the census. One important measure of its 
quality is the time at which housing units were added. It is preferable 
for the address list to be largely complete before the majority of 
census operations begin, as this would indicate that the building of 
the address list had been successful, by using operations such as 
address listing and block canvassing, and local government input in the 
Local Update of Census Addresses (LUCA) program. The data confirm that 
the address list was largely complete early in the process, as census 
enumerators found few new addresses in the field. The address list was 
nearly 97 percent complete (overall and in each region of the nation) 
before the census forms were mailed out or delivered. The two fastest 
growing regions, the South and the West, not surprisingly, had slightly 
lower percentages of housing unit coverage before the census and higher

[[Page 14011]]

rates of added housing units during questionnaire delivery. (See B-2, 
``Quality Indicators of Census 2000 and the Accuracy and Coverage 
Evaluation.'')
    Questionnaire Return--Census 2000 Mail Return Rates. One of the 
most important quality indicators for the census is the mail return 
rate, the proportion of occupied housing units that mailed back their 
questionnaires. A high mail return rate is crucial to the success of 
the census--operationally, budgetarily, and also in terms of data 
quality; data from mailback questionnaires tend to be more complete and 
of higher quality than the data from forms completed by enumerators.
    Public cooperation is critical for the success of any census. The 
Census Bureau had projected that the mail return rate would be lower in 
Census 2000 than in 1990 and had accordingly developed an enhanced 
marketing and partnership program designed to increase awareness of the 
decennial census and public cooperation. The marketing program was 
designed around a first-ever paid advertising campaign, including a 
national media campaign aimed at increasing mail response, targeted 
advertising directed at raising mail response among historically 
undercounted populations, and special advertising messages and 
campaigns targeted to hard-to-enumerate populations. In the partnership 
program, the Census Bureau worked nationwide with state and local 
partners to encourage all individuals to respond to the census. 
Additionally, the Census Bureau worked with states and local 
jurisdictions to encourage residents of the jurisdictions to raise 
their mail response rates over their 1990 levels.
    The success of the advertising campaign and the partnership program 
is reflected in the final Census 2000 national mail return rate of 72 
percent. In 1990, this figure was 74 percent, but the mailback universe 
was different in Census 2000, including the addition of approximately 
five million (mostly rural) housing units to the universe in 2000. 
These units were enumerated differently in 1990 and the two figures 
are, thus, not wholly comparable. It is fair to say that the level of 
public cooperation in Census 2000 roughly equaled that of 1990, despite 
projections of lower cooperation.
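    As defined above, the mail return rate is the proportion of occupied 
housing units in the mailback universe that returned a questionnaire by 
mail. The sketch below is illustrative only; the counts used are 
hypothetical round numbers, not official Census 2000 figures.

# Illustrative sketch of the mail return rate definition given above;
# the counts are hypothetical, not official figures.
def mail_return_rate(mail_returns, occupied_mailback_housing_units):
    return 100.0 * mail_returns / occupied_mailback_housing_units

print(round(mail_return_rate(75_600_000, 105_000_000)))  # 72 percent
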
    Nonresponse Follow-up. The nonresponse follow-up (NRFU) operation 
involved field follow up of about 42 million housing units that did not 
return a census form within the specified time after Census Day. For 
most Local Census Offices (LCOs), NRFU was completed as scheduled in a 
9-week period between 
April 27 and June 26. This performance is a significant improvement 
over 1990 when NRFU was generally conducted over a 14-week period from 
April 26 through July 30. The Census Bureau believes, based on past 
research, that NRFU interviews conducted closer to Census Day are 
likely to be of higher quality. Thanks in large part to adequate 
funding provided by the Congress, pay rates and levels of staffing in 
2000 were far higher than in the past two censuses. We believe that 
this increased funding and the ability to hire adequate staff 
contributed to an improvement in NRFU quality, and thus improved Census 
2000 data in general.
    The Census Bureau identified local NRFU problems at a few LCOs, 
including the LCO in Hialeah, Florida. The 
Census Bureau responded to the localized problems in the Hialeah office 
by re-enumerating certain areas that were believed to have faulty data 
and does not believe that net coverage in the Hialeah or any other LCO 
was substantially affected by these local problems. The limited local 
imperfections do not detract from the conclusion that NRFU as a whole 
was successful. The local problems experienced were similar to problems 
encountered in previous censuses, and should be expected in any non-
recurring operation of this magnitude.
    Housing Unit Unduplication Program. The Census Bureau became 
concerned that the address list might contain a significant number of 
duplicate addresses, or duplicated persons living in duplicated 
addresses. The Census Bureau responded to this problem by designing and 
conducting the Housing Unit Unduplication Program. This program was a 
special operation designed and instituted to reduce the level of 
housing unit duplication. While this program was not prespecified, the 
Census Bureau believed that failure to address this potential problem 
could have seriously impaired the accuracy of the apportionment 
numbers. Using the results of an address matching operation and a 
person matching operation, 2,411,743 addresses were identified as 
potential duplicates and the person and housing records associated with 
these addresses temporarily removed from the census file. Based on more 
detailed analysis, 1,392,686 addresses of these were permanently 
removed from the address list and 1,019,057 addresses were re-instated 
and included in the census results. Although this operation certainly 
made mistakes of both exclusion and inclusion, the operation was 
necessary and resulted in improved census accuracy.
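    The figures reported above for the Housing Unit Unduplication Program 
partition the potentially duplicated addresses into those permanently 
removed and those reinstated, as the short check below confirms.

# Arithmetic check of the figures reported above.
potential_duplicates = 2_411_743
permanently_removed = 1_392_686
reinstated = 1_019_057
assert permanently_removed + reinstated == potential_duplicates
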
    Data Processing. The large number of address sources used to 
compile the address list, along with an increased number of response 
opportunities, increased the chance of duplicate returns. Census 2000 
included several data processing steps designed to handle multiple 
census returns for a single housing unit. More than 90 percent of 
Census 2000 housing units had only one census return. For households 
returning two or more forms, the Census Bureau conducted a computer 
operation to identify and remove duplicated responses. Imputation is 
discussed later in this Report.
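    For illustration only, the computer operation described above can 
be sketched as selecting a single return per housing unit identifier; 
the record layout, selection rule, and values below are hypothetical 
and are not the Census Bureau's actual Primary Selection Algorithm.

    from collections import defaultdict

    # Hypothetical returns: (housing_unit_id, return_id, number_of_data_items)
    returns = [
        ("HU001", "R1", 12),
        ("HU001", "R2", 7),    # a second form from the same housing unit
        ("HU002", "R3", 10),
    ]

    by_unit = defaultdict(list)
    for unit, ret, items in returns:
        by_unit[unit].append((ret, items))

    # Keep the most complete return for each housing unit; the remaining
    # returns are set aside as duplicates (an illustrative rule only).
    selected = {unit: max(forms, key=lambda f: f[1])[0]
                for unit, forms in by_unit.items()}
    print(selected)   # {'HU001': 'R1', 'HU002': 'R3'}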
A.C.E. Quality Indicators
    The A.C.E. is based on an independent coverage measurement survey, 
meaning that it collects information in operations separate from the 
census to allow comparison with the initial census enumeration. The 
goal is to determine what proportion of the people living in the A.C.E. 
sample blocks were correctly included in the census, what proportion 
were erroneously included in the census, and what proportion were not 
included in the census, so that corrected data can be prepared.
    The Census Bureau selected a stratified random sample of blocks to 
include in the A.C.E. and created an independent list of housing units 
in those blocks. Enumerators conducted initial A.C.E. interviews at the 
housing units on this independent list. Households with discrepant 
information between the A.C.E. and the census received a follow-up 
interview to find the correct answer or ``true'' situation. This 
process led to a determination for each individual regarding whether 
the A.C.E. response or the census response was correct. Missing data 
for households and/or individuals were supplied using prespecified 
procedures, including imputation.\14\ The individuals in the A.C.E. 
sample were then categorized by age, sex, tenure (owner or renter) and 
other predefined variables into groupings called post-strata, and 
coverage correction factors (CCFs) were calculated for each post-
stratum. The methodology used to create the coverage correction factors 
is called Dual System Estimation or DSE. The coverage correction 
factors measure the extent to which the total of people in each post-
stratum is over- or undercounted in the initial census. These factors 
can be used

[[Page 14012]]

to correct the initial census data and to produce tabulated results.
---------------------------------------------------------------------------

    \14\ B-7 ``Missing Data Results'' contains a description of the 
three types of missing data in the A.C.E. and the processes used to 
correct for them.
---------------------------------------------------------------------------
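
    The general form of the dual system estimator that underlies these 
coverage correction factors can be sketched as follows; this is the 
commonly cited form for a coverage measurement survey of this kind, 
while the exact A.C.E. specification is set out in the detailed 
specifications and B-series reports listed in the attachments to this 
notice. Here C is the census count for a post-stratum, II the count of 
census records lacking sufficient information for matching, CE/E the 
proportion of sampled census enumerations found to be correct, M/P the 
proportion of the independent A.C.E. sample found to match a census 
enumeration, and CCF the coverage correction factor:

\[
\widehat{DSE} \approx (C - II)\cdot\frac{CE}{E}\cdot\frac{P}{M},
\qquad
CCF = \frac{\widehat{DSE}}{C} .
\]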

    The Census Bureau incorporated a variety of improvements into the 
2000 A.C.E. compared with the 1990 PES:
    • In order to reduce variance (sampling error), the Bureau 
doubled the size of the sample from the 1990 PES.
    • The design included enhancements to the matching process, 
such as a more fully automated matching system with built-in edits and 
quality checks, centralization of matching in one site, and a change in 
the treatment of movers.
    • Computer processing was improved in a number of ways, such 
as adoption of software validation and verification procedures, 
standardized nomenclature, and improved documentation for technical 
issues.
    • Enhancements to minimize missing data were added to the 
design, including allowance of an additional two weeks for attempts to 
revisit any nonresponding households.
    The ESCAP spent many hours reviewing the elements of A.C.E. quality 
and has concluded that these enhancements succeeded in their goal of 
improving the A.C.E., and that the operational quality of the A.C.E. 
was good.
    The quality of the A.C.E. operations is particularly evident from 
the fact that the A.C.E. was completed on schedule without any major 
difficulties. That operations of the massive size of both the initial 
census and the A.C.E. could be finished on time and under budget is 
testimony to thoughtful design and careful implementation. Listing, 
interviewing, matching, and follow-up were all conducted as designed 
and in a controlled manner. A.C.E. interview response rates met or 
exceeded expectations. The Quality Assurance operations were carried 
out as planned and assured that the A.C.E. was in control, resulting in 
few outliers.\15\ Computer programs were thoroughly tested and improved 
from 1990. This evidence indicates that the A.C.E. was a clear 
operational success.\16\
---------------------------------------------------------------------------

    \15\ Outliers are extreme blocks with high effect on the 
estimates.
    \16\ B-1, ``Data and Analysis to Inform the ESCAP Report.''
---------------------------------------------------------------------------

    A.C.E. prespecified procedures were followed except in two specific 
instances.\17\ Both of these instances were actually enhancements to 
the A.C.E. design permitted by earlier than anticipated availability of 
data; both are consistent with good statistical practice and both 
improved the accuracy of the A.C.E. results.
---------------------------------------------------------------------------

    \17\ B-1, Ibid.
---------------------------------------------------------------------------

    Briefly, the first change was a modification of A.C.E. collapsing 
rules to permit the inclusion of variance as a criterion to collapse 
data cells. The second enhancement to the prespecified rules deals with 
imputation cell estimation, the process by which resident status, match 
status, or enumeration status is imputed for unresolved cases. 
Imputation cell estimation was modified because the results of the 
A.C.E. follow-up forms became available during the missing data 
estimation process. The changes were discussed with the ESCAP and 
documented.
    The ESCAP was pleased with the reduction in sampling variance from 
1990 levels. The A.C.E. was designed so that the coefficients of 
variation (CV) would be lower than in 1990 because of the increased 
sample size, because better measures of population size were available 
for the selection of sample clusters, and because sample weights were 
less variable. The overall CV decreased about 40 percent from 1990 
levels, and forty-seven states saw their CV decline, with an average 
reduction of 37 percent. The A.C.E. design expectation of state-level 
CV of less than 0.5 percent was achieved. CVs at the congressional 
district, place, and county level all showed similar levels of 
improvement, as detailed in Analysis Report B-11, ``Variance Estimates 
by Size of Geographic Area.''
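    For reference, the coefficient of variation is assumed here to 
follow its standard definition, the sampling standard error expressed 
as a share of the estimate:

\[
CV(\hat{N}) = \frac{\sqrt{\widehat{\mathrm{Var}}(\hat{N})}}{\hat{N}} ,
\]

so a state-level CV below 0.5 percent means that the sampling standard 
error of that state's estimate is less than 0.5 percent of the estimate 
itself.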
    Other important quality indicators for the A.C.E. operations 
include the following:
    • Consistent reporting of Census Day address may have been 
somewhat better than in 1990, due to interviews being conducted closer 
to Census Day and an improved interviewing instrument.
    • Matching error in the A.C.E. was low, with indications 
that it is substantially lower than that achieved in 1990. 
Additionally, other processing errors are probably lower than those 
measured in 1990.
    • A.C.E. fabrication was tightly controlled in 2000; an 
improved interviewing instrument, tighter management of field 
operations, and better detection of falsification through targeting 
likely lowered the level of fabrication below 1990 levels.
    • The level and pattern of missing data in the A.C.E. were 
near or below that in the 1990 PES and the effect of missing data on 
A.C.E. quality is similar to that experienced in 1990.
    In short, the A.C.E. operations appear to have been in control, 
performed as expected, and produced data as good as or better than the 
data produced by the 1990 PES.

Historical Measures of Census Coverage--Comparison With Demographic 
Analysis

    By far the largest issue facing the ESCAP has been the surprising 
inconsistency between the DA and A.C.E. estimates. The initial DA 
figures estimate that Census 2000 resulted in a net overcount of 1.8 
million individuals; that is, that Census 2000 overcounted the 
population by 0.7 percent. DA has long provided the standard against 
which the accuracy of both censuses and coverage measurement surveys is 
measured, making this inconsistency troubling. The inconsistency 
between the A.C.E.
estimates and the demographic analysis estimates is most likely the 
result of one or more of the following three scenarios:

    1. The estimates from the 1990 census coverage measurement 
survey (the Post-Enumeration Survey), the 1990 demographic analysis 
estimates, and the 1990 census were far below the Nation's true 
population on April 1, 1990. This scenario means that the 1990 
census undercounted the population by a significantly greater amount 
and degree than previously believed, but that Census 2000 included 
portions of this previously un-enumerated population.
    2. Demographic analysis techniques to project population growth 
between 1990 and 2000 do not capture the full measure of the 
Nation's growth.
    3. Census 2000, as corrected by the A.C.E., overestimates the 
Nation's population.

    The inconsistency between the demographic analysis estimates and 
the A.C.E. estimates raises the possibility of an as-yet undiscovered 
problem in the A.C.E. or census methodology. The Census Bureau has 
determined that it must further investigate this inconsistency, and the 
possibility of a methodological error, before it can recommend that 
adjustment would improve accuracy.
    DA assesses accuracy in a fundamentally different manner from the 
survey-based approach used in the A.C.E. Instead of comparing the 
results of an independent survey with the census, DA uses 
administrative records of
births, deaths, legal immigration, and Medicare enrollments along with 
calculated estimates of legal emigration and net undocumented 
immigration to estimate the national population. Most of these 
components of population change are well measured (especially for 
recent decades), but undocumented immigration is not directly measured 
and must be estimated by comparing detailed data between two 
consecutive censuses with administrative data on legal immigration. 
Given the uncertainty of the initial DA results, the Census

[[Page 14013]]

Bureau has reexamined certain of these components and created an 
alternative set of DA estimates that allows for additional undocumented 
immigration in the 1990s. This alternative set estimates a net 
undercount of 0.9 million individuals. This would imply that the 2000 
Census undercounted the population by 0.3 percent. \18\
---------------------------------------------------------------------------

    \18\ Similarly, legal emigration from the U.S. must be measured 
indirectly and several scenarios were run that varied this component 
as well as undocumented immigration. Those scenarios did not fit the 
observed data as well as those that simply varied undocumented 
immigration. Scenarios that changed smaller components such as legal 
temporary immigration will be examined in future research.
---------------------------------------------------------------------------
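
    The demographic analysis just described rests on a demographic 
accounting identity of roughly the following form; this is a sketch of 
the standard structure rather than the Census Bureau's exact 
specification (in particular, the population aged 65 and over is 
benchmarked to Medicare enrollments rather than carried forward in this 
way):

\[
P_{2000} \approx P_{1990} + B - D + I_{legal} - E_{legal} + I_{undocumented}^{net} ,
\]

where B and D are births and deaths during the decade and the 
immigration and emigration terms are the components discussed above.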

    For the decade between 1990 and 2000, the base demographic analysis 
relied on extrapolations of net undocumented immigration derived from 
data reflecting the changes between the 1980 and 1990 censuses. This 
analysis estimated the flow of undocumented immigration during the 
1990's at 2.8 million. The accuracy of that assumption can only be 
assessed once data from the Census 2000 questions on country of birth 
and year of immigration become available. However, related data on the 
percent Hispanic and Non-Hispanic in Census 2000, and data on the 
percent foreign-born from the re-weighted March 2000 Current Population 
Survey (CPS), provide an indication of the accuracy of the original 
assumptions about immigration and emigration during the 1990's. These 
data show that the base DA implies a foreign-born percentage of the 
population below the value reported in the March 2000 CPS (10.3 percent 
versus 10.6 percent). Similarly, the base DA implies a percent Hispanic 
(12.1 percent) that is below the Census 2000 percent Hispanic (12.6 
percent). Since the undocumented population has recently been 
predominantly Hispanic, these numbers would be consistent with an 
underestimate of the undocumented population component in the base DA.
    Census Bureau researchers have therefore assumed that the base DA 
is a reasonable low estimate of net undocumented immigration in the 
1990s, and examined several different scenarios to create a reasonable 
high estimate. For purposes of simplicity, researchers assumed a 
doubling of net undocumented immigration over the decade for the 
alternative DA. Doubling net undocumented immigration implies a percent 
foreign-born of 11.1, which is higher than the 10.6 percent from the 
re-weighted CPS, and a percent Hispanic of 12.7, which is higher than 
the 12.6 percent in the unadjusted Census 2000 results. Until data from 
Census 2000 on country of birth and year of immigration are available 
to recalibrate DA in detail, this alternative assumption should be 
considered a reasonable upper bound on net undocumented immigration 
during the 1990's.
    DA and the A.C.E. do not differ on every point. DA and the A.C.E. agree 
on a reduction in the net undercount in Census 2000 compared with 1990, 
but DA implies a greater change. DA estimated a 1.8 percent net 
undercount in 1990, compared with either a 0.7 percent net overcount 
(base set), or a 0.3 percent net undercount (alternative set) in 2000. 
The A.C.E. estimates show that the net undercount was reduced from 1.6 
percent in 1990 to 1.2 percent in 2000. DA and the A.C.E. also concur 
that Census 2000 succeeded in reducing the differential undercount. 
Both DA and the A.C.E. measured a reduction in the net undercount rates 
for Black and non-Black children (aged 0-17) compared to 1990. Both 
methods also measure a reduction in the net undercount rates of Black 
men and women (aged 18 and over).
    The DA estimates indicate that correlation bias has not been 
reduced from 1990 levels.\19\ The A.C.E. sex ratios \20\ for Black 
adults are lower than DA ``expected'' sex ratios, implying that the 
A.C.E. did not capture the high undercount rates of Black men relative 
to Black women. Historically, DA's important strength has been its 
ability to measure sex ratios accurately. (The ESCAP believes that 
correlation bias cannot be ignored. The correlation bias for 2000 
measured by DA is about the same magnitude as that measured in 1990.)
---------------------------------------------------------------------------

    \19\ Correlation bias is discussed in B-12, ``Correlation 
Bias.''
    \20\ The ratio of men per 100 women.
---------------------------------------------------------------------------

    It is important to understand the limitations and uncertainties 
associated with the DA estimates:
    • Like the A.C.E., DA has an associated level of 
uncertainty; the ranges of DA uncertainty are a matter of judgment.
    • DA estimates do not provide independent coverage 
benchmarks for all of the characteristics estimated in the A.C.E.\21\
---------------------------------------------------------------------------

    \21\ DA estimates can be tabulated by year-specific age, sex, 
and Black/Non-Black; the A.C.E. permits tabulation for additional 
racial categories and other characteristics, such as whether the 
housing unit is owned or rented.
---------------------------------------------------------------------------

    • DA has difficulty in estimating the sub-national 
population.
    • The DA method requires reconciling the reporting of race 
in the vital statistics system with race as reported in the census. The 
Census 2000 questionnaire used the instruction ``mark one or more 
races,'' introducing a new consideration into the reconciliation of 
reported race data.
    • DA provides estimates for the total population (people 
living in households and group quarters (GQ)), while the A.C.E. 
provides estimates only for the housing unit population, excluding the 
group quarters population, which includes college dormitories and 
prisons.
    DA estimates for the 1980 and 1990 censuses did not immediately 
confirm the results of the coverage measurement surveys in those 
censuses either. Initial DA estimates for the 1980 census implied a net 
overcount of 0.4 percent, but were later revised upward, partially to 
account for an increase in undocumented immigration. DA estimated a 1.8 
percent undercount for the 1990 census, leading Secretary Mosbacher and 
others to question the accuracy of the 1990 adjusted counts. The Census 
Bureau, however, concluded that the differences between DA and the 1990 
PES were explainable as within the bounds of DA uncertainty. \22\
---------------------------------------------------------------------------

    \22\ Bureau of the Census, ``Technical Assessment of the 
Accuracy of Unadjusted Versus Adjusted 1990 Census Counts,'' 4.
---------------------------------------------------------------------------

    However, in Census 2000 the differences between DA and the A.C.E. 
are larger than in 1990, with DA measuring an undercount between 1.9 
(base set) and 0.9 (alternative set) percentage points less than the 
A.C.E. estimate. The Census Bureau 
acknowledges DA's inconsistency with the A.C.E. estimates and will 
continue to research this important issue.

Measures of Census and A.C.E. Accuracy

Total Error Model
    The total error model and loss function analysis are methods used 
to compare the accuracy of the adjusted and unadjusted 2000 data. The 
total error model brings together all of the components of error that 
can be measured for the A.C.E. The total error model is used to correct 
the A.C.E. for biases and thus produces a measure of ``truth'' that can 
be used to assess the accuracy of both the adjusted and unadjusted 
census. The measures of the truth are referred to as targets since the 
components of error must be estimated. By using a range of targets as 
the basis of comparing the A.C.E and Census 2000, calculations can be 
done that indicate whether the adjusted or unadjusted census results 
are more accurate. Situations are defined by the methods and 
assumptions that are used to vary the components of error in the total 
error model.

[[Page 14014]]

    The total error model identifies and estimates the various 
components of error and their variances for groups of the A.C.E. post-
strata designated as evaluation post-strata. \23\ Estimates of the 
component errors are derived for each evaluation post-stratum, then a 
simulation methodology is used to create a range of target populations. 
Loss functions, described in the next section, are then used to 
determine which of the adjusted or unadjusted census populations is 
closer to the targets, taking into account the uncertainty in the 
targets and in the adjustment. \24\
---------------------------------------------------------------------------

    \23\ The Census Bureau can only estimate error components for at 
most sixteen evaluation post-strata. The error components reflect, 
for the most part, measures of nonsampling error. Estimation of 
nonsampling error requires extensive methodology carried out by 
extremely well qualified staff. Because few such staff exist, this 
limits the size of the sample for which measures can be obtained. 
Therefore direct estimates of the targets can only be obtained for a 
smaller number of evaluation post-strata.
    \24\ Mary Mulry and Bruce D. Spencer. ``Accuracy of the 1990 
Census and Undercount Adjustments.'' Journal of the American 
Statistical Association 88 (September 1993): 1080-91; B-19, Mulry 
and Spencer.
---------------------------------------------------------------------------

    The components of error for the total error model are as follows: 
\25\
---------------------------------------------------------------------------

    \25\ Ibid.

1. P-sample matching error
2. P-sample data collection error
3. P-sample fabrication
4. E-sample data collection error
5. E-sample processing error
6. Correlation bias
7. Ratio estimator bias
8. Sampling error
9. Imputation error

    The Census Bureau has data from DA, Census 2000, and the A.C.E. 
that can be used to produce estimates of components 6, 7 and 8 
(Correlation Bias, Ratio Estimator Bias, and Sampling Error), and is 
relying on 1990 data to estimate the remaining components. The ESCAP 
discussed the use of 1990 measures for these error components, and 
determined that doing so would provide conservative estimates of the 
level of error in the A.C.E. The ESCAP noted that the A.C.E. is similar 
in design and operation to the 1990 PES, except that the A.C.E. was 
conducted with higher quality as noted above.\26\
---------------------------------------------------------------------------

    \26\ B-19, Mulry and Spencer.
---------------------------------------------------------------------------
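
    For illustration only, the simulation logic described above 
(drawing values of the error components and removing them from the 
A.C.E.-based estimates to form a range of targets) can be sketched as 
follows. The post-strata, error components, and all numerical values 
are hypothetical and do not reproduce the actual total error model 
documented in B-19.

    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical evaluation post-strata: dual system estimates and assumed
    # net bias components with their standard errors (illustrative values).
    dse = np.array([1_000_000.0, 2_500_000.0, 4_000_000.0])
    bias_mean = np.array([5_000.0, -8_000.0, 12_000.0])
    bias_se = np.array([2_000.0, 3_000.0, 4_000.0])

    def simulate_targets(n_sims=1000):
        """Draw simulated target populations by removing a sampled bias
        from each post-stratum estimate, mimicking the idea of correcting
        the A.C.E. for measured error components."""
        draws = rng.normal(bias_mean, bias_se, size=(n_sims, len(dse)))
        return dse - draws        # shape: (n_sims, number of post-strata)

    targets = simulate_targets()
    print(targets.mean(axis=0))   # average simulated target per post-stratum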

    The ESCAP analyzed the sensitivity of various components of the 
total error model, particularly the office processing components, 
because the Committee believes that it achieved better results for 
these components in 2000 than in 1990. Also, the ESCAP used a number of 
models of correlation bias in the total error model, given the 
importance of this component, and the understanding of the significant 
influence that this component has on the estimates of total error and 
thus on the target populations.
Loss Function Analysis
    Loss function analysis is used to compare the adjusted and 
unadjusted census populations to the target populations derived from 
the total error model as described above. Loss functions are 
constructed to measure the loss in accuracy associated with differences 
from the target populations. Loss functions are also specified based on 
various criteria related to the intended uses of the data. A general 
description of loss functions is as 
follows:
[GRAPHIC] [TIFF OMITTED] TN08MR01.005

Where:

n represents the number of entities for which the comparison is 
conducted;
T_i, P_i, and P'_i represent the target population, the unadjusted 
census population, and the adjusted census population, respectively, 
for the i-th entity; and
W_i represents a weight defined for the criterion to be studied for a 
particular use of the data.
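    Although the graphic itself is omitted from this text rendition, 
the definitions above imply weighted squared-error losses of roughly 
the following general form; this is a sketch in the notation above 
rather than the exact expression shown in the omitted graphic, and the 
relative and share-based loss functions replace the counts with 
relative differences or population shares:

\[
Loss_{Census} = \sum_{i=1}^{n} W_i\,(P_i - T_i)^2 ,
\qquad
Loss_{A.C.E.} = \sum_{i=1}^{n} W_i\,(P'_i - T_i)^2 .
\]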

    If the Census loss is greater than the A.C.E. loss, then the adjusted 
data are determined to be more accurate for the criterion represented 
by the loss function.
    The Census Bureau believes that both numeric and distributive 
accuracy are important measures of census accuracy and accordingly 
designed loss functions to measure both types of accuracy. Numeric 
accuracy refers to how close the overall count of a particular 
geographic area is to the truth, whereas distributive accuracy refers 
to how close the relative proportion or share of a geographic area is 
to its true share relative to other areas.\27\ As discussed in B-13, 
``Comparing Accuracy,'' the ESCAP directed the preparation of four 
types of loss functions:
---------------------------------------------------------------------------

    \27\ The relationship between numeric and distributive accuracy 
is discussed in the Feasibility Document, pp. 15-18.

1. Squared Error Loss
2. Weighted Squared Error Loss
3. Relative Squared Error Loss
4. Equal Congressional District Squared Error Loss
    The Committee determined that the second and fourth loss functions, 
weighted squared error loss and equal CD squared error loss, were the 
most appropriate to measure accuracy for redistricting data. The ESCAP 
directed the preparation of loss functions at the state, congressional 
district and county levels, believing these geographic levels most 
relevant to the decision before the Secretary. County-level data are 
intended to simulate state legislative districts, because these 
districts are usually smaller than congressional districts.
    The ESCAP studied the sensitivity of the loss functions by varying 
the assumptions for various components in the total error model. 
As described above, extensive sensitivity analysis was conducted for 
the various models and levels of correlation bias that were used to 
generate the target populations. \28\
---------------------------------------------------------------------------

    \28\ B-13, ``Comparing Accuracy.''
---------------------------------------------------------------------------

    Loss functions that measure only a small gain in accuracy for the 
A.C.E. may be problematic, given the associated uncertainty with these 
estimates. Accordingly, the Committee examined the loss functions for 
evidence of a clearly measurable improvement and found the following: 
\29\
---------------------------------------------------------------------------

    \29\ Ibid.
---------------------------------------------------------------------------

    1. At the state and congressional district level, when only 
sampling variance was included, the loss functions showed that the 
change due to adjustment was significant in comparison to sampling 
error, that is, if sampling error were the only concern, adjustment 
would result in more accurate data. The ESCAP recognizes, of course, 
that sampling error is not the only error in the A.C.E., and thus this 
analysis was conducted to determine whether sampling error alone would 
result in finding that adjustment was less accurate. This was not the 
case, so the ESCAP proceeded with more extensive analyses.
    2. Correlation bias is a significant factor in influencing the 
results of the loss functions, and a variety of models were used to 
test the sensitivity of the analysis to correlation bias effects in 
creating the target populations. When full components of estimated 
correlation bias were used to construct the target populations, at the 
state, congressional district, and county levels, the loss functions 
showed a marked improvement for adjustment, regardless of the model. 
When only 50 percent of the estimated correlation bias was used in 
constructing the target populations, the loss functions continued to 
show a clear improvement. The ESCAP

[[Page 14015]]

considered this to be an important finding because, while there may be 
disagreement regarding the existence of correlation bias, assuming no 
correlation bias is clearly unrealistic.
    3. When either no or only a modest amount of correlation bias is 
factored into the loss functions, they tend to favor the unadjusted 
census at all geographic levels. The ESCAP noted that assuming no 
correlation bias would result in a lower bound for the degree of 
improvement for adjustment, since as noted above, it is not reasonable 
to assume no correlation bias.
    4. The loss functions for counties with populations below 100,000 
indicated that the unadjusted census was more accurate regardless of 
the level of correlation bias assumed. This caused some concern, since 
this was not the case for the 1990 census adjustment. However, the 
ESCAP found that the adjustment was more accurate when considered for 
all counties using both numeric and distributive accuracy. Therefore, 
the adjustment was improving the data for areas in which the majority 
of the population resided. This is further indication of the closeness 
of the A.C.E. estimates and Census 2000.
    The conclusion that can be drawn from the loss function analysis is 
that, absent the concerns with consistency between DA and the A.C.E., 
the adjustment would result in data that are more distributively and 
numerically accurate at the state and congressional district levels if 
correlation bias is recognized at a likely level, but that the data are 
not more accurate for smaller counties. Even though smaller counties 
would have been less accurate, the analysis indicated an overall 
improvement in accuracy from adjustment. However, the ESCAP notes its 
concern regarding the unexplained differences between DA and the A.C.E. 
estimates, which may be indicative of an unmeasured problem in Census 
2000 or in the A.C.E. The potential for a reversal of these findings is 
strong enough to preclude a conclusion at this time that adjustment 
would improve accuracy. When considering the additional concerns 
described below and taking into account the inconsistencies with DA, 
the Committee was not prepared to recommend at this time that 
adjustment would improve accuracy.

Other Factors That May Affect Accuracy

Synthetic Error
    The A.C.E. methodology produced estimated coverage correction 
factors for each of the post-strata. These factors were carried down 
within the post-strata to the census block level in a process called 
synthetic estimation. The key assumption underlying synthetic 
estimation is that the net census coverage is relatively uniform within 
the post-strata. In other words, the probability that people in a 
particular post-stratum will be missed by the census is assumed to be 
roughly the same regardless of where they live. The failure of this 
assumption causes synthetic 
error.
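    For illustration only, the carry-down step described above can be 
sketched as applying each post-stratum's coverage correction factor to 
the census counts for that post-stratum within every block; the block 
identifiers, post-strata, and values below are hypothetical.

    # Hypothetical block-level census counts, keyed by (block, post-stratum).
    block_counts = {
        ("block_A", "post_stratum_1"): 120,
        ("block_A", "post_stratum_2"): 80,
        ("block_B", "post_stratum_1"): 300,
    }

    # Hypothetical coverage correction factors estimated per post-stratum.
    ccf = {"post_stratum_1": 1.015, "post_stratum_2": 0.996}

    # Synthetic estimation: every block inherits its post-stratum's factor,
    # so net coverage is treated as uniform within the post-stratum.
    adjusted = {
        (block, ps): count * ccf[ps]
        for (block, ps), count in block_counts.items()
    }
    print(adjusted)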
    The design underlying synthetic estimation methodology is directed 
at correcting a systematic under- or overcount in the census. The 
synthetic estimates will not correct random counting errors that occur 
at any geographic level (blocks, tracts, counties, etc). Therefore, the 
synthetic estimate will not result in extreme changes in small 
geographic entities, nor will it correct for extreme errors. Synthetic 
estimation is designed to remove the effects of systematic errors, so 
that when small entities are aggregated, systematic and differential 
coverage errors can be corrected.
    The ESCAP was concerned with synthetic error, because this type of 
error is not included as a component of the total error model (which 
estimates error in post-stratum level DSE's, where there is, by 
definition, no synthetic error). Furthermore, synthetic error cannot be 
estimated directly, as direct estimation would require more sample 
observations for the A.C.E. than practicable.
    The ESCAP analyzed the effects of synthetic error by conducting 
artificial population analysis. This analysis creates artificial 
populations with surrogate variables thought to reflect the 
distribution of net coverage error. These surrogate variables are known 
for the entire population. An analysis of these artificial populations 
for the effect of synthetic error is the basis on which this otherwise 
unknown effect is studied.
    The detailed analysis of synthetic error is described more fully in 
reports B-1, ``Data and Analysis to Inform the ESCAP Report,'' and B-
14, ``Assessment of Synthetic Assumptions.'' Briefly, four artificial 
populations were constructed based on census variables thought to be 
related to census coverage. The Census Bureau calls these variables 
``surrogates.'' \30\ The Census Bureau distributed the post-stratum 
level gross undercount (gross overcount) in proportion to the gross 
undercount surrogate variable (gross overcount surrogate variable) to 
the geographic levels to be studied. This process results in a 
population with surrogate values for coverage error which are known at 
all levels. Unlike other approaches, artificial population analysis 
provides measures of net coverage for all local areas, within a post-
stratum. Therefore the effect of synthetic error can be assessed for 
these artificial populations.
---------------------------------------------------------------------------

    \30\ The methodology used is similar to that suggested by 
Freedman and Wachter (1994, Statistical Science).
---------------------------------------------------------------------------
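
    For illustration only, the proportional-distribution step described 
above can be sketched as follows; the areas, surrogate values, and 
post-stratum gross undercount are hypothetical.

    # Hypothetical surrogate values (e.g., persons in households that did
    # not mail back a questionnaire) for areas within a single post-stratum.
    surrogate = {"area_1": 50, "area_2": 150, "area_3": 300}

    # Hypothetical post-stratum level gross undercount to be distributed.
    gross_undercount = 1_000

    total = sum(surrogate.values())

    # Distribute the post-stratum total to areas in proportion to the
    # surrogate, yielding an artificial population whose coverage error is
    # known at every geographic level.
    artificial_error = {
        area: gross_undercount * value / total
        for area, value in surrogate.items()
    }
    print(artificial_error)   # {'area_1': 100.0, 'area_2': 300.0, 'area_3': 600.0}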

    The four artificial populations are described in Table 3 below:

                       Table 3.--Surrogate Variables Used To Create Artificial Populations
----------------------------------------------------------------------------------------------------------------
                                                          Undercount surrogate          Overcount surrogate
----------------------------------------------------------------------------------------------------------------
Artificial Population 1...........................  (# non-GQ persons)-- (#        (# non-GQ persons)--(#
                                                     persons in whole household     persons for whom date of
                                                     substitutions).                birth was allocated
                                                                                    consistent with reported
                                                                                    age).
Artificial Population 2...........................  (# non-GQ persons)--(#         (# non-GQ persons)--(#
                                                     persons in whole household     persons in whole household
                                                     substitutions).                substitutions).
Artificial Population 3...........................  # non-GQ persons with 2 or     # persons for whom date of
                                                     more item allocations.         birth was allocated
                                                                                    consistent with reported
                                                                                    age.
Artificial Population 4...........................  # non-GQ persons whose         # non-GQ persons whose
                                                     household did not mail back    household did not mail back
                                                     the questionnaire.             the questionnaire.
----------------------------------------------------------------------------------------------------------------


[[Page 14016]]

GQ = Group Quarters
    Three types of analysis were conducted using these artificial 
populations: \31\
---------------------------------------------------------------------------

    \31\ See Analysis Report B-14, ``Assessment of Synthetic 
Assumptions'' for a detailed discussion.
---------------------------------------------------------------------------

    1. The effect of relative bias for synthetic estimation was 
assessed by calculating the ratio of the absolute unadjusted census 
error to the absolute adjusted census error for state and congressional 
district population totals and shares. An analysis of the distribution 
of these relative biases indicated that within the artificial 
populations, synthetic estimation improved the majority of entities.
    2. Biases for synthetic estimation were also calculated and 
compared to the level of bias in the dual system estimates, including 
correlation bias from the total error model. Since the total error 
model does not include synthetic bias, the purpose of this analysis was 
to determine whether the level of synthetic error was small enough to 
be ignored when compared to the other errors estimated for the A.C.E. 
This analysis showed that the level of synthetic error could not be 
ignored for several of the artificial populations. This finding led to 
the third analysis.
    3. Because synthetic error affects both the unadjusted and adjusted 
census, the Census Bureau studied the effect of synthetic error on both 
unadjusted and adjusted census loss, as measured by the loss functions, 
and concluded that synthetic error would increase the loss of both the 
unadjusted and adjusted census. The question was how this error would 
affect the relative losses for the adjusted and unadjusted data.
    Therefore, the ESCAP directed the addition of synthetic error to 
the loss measured for both the adjusted and unadjusted census. This 
study indicated that synthetic error could, in certain situations, 
affect the relative comparison of adjusted or unadjusted loss.
    For the analyses based on several of the artificial populations for 
state and congressional district counts, the loss function analysis 
understated the true gains from adjustment. However, for some of the 
analyses, the loss function results understated the true gain for the 
unadjusted census. In these situations, the effect could be as high as 
58 percent.
    The ESCAP noted that a conservative view of the loss function 
results should be used in assessing the gain in accuracy from 
adjustment. Given the concerns described above, the ESCAP believes that 
this finding must be fully understood before recommending an 
adjustment.
Balancing Error
    The A.C.E. actually consists of two surveys, based on two samples: 
the P-sample and the E-sample. The P-sample is an enumeration 
independent from the census, used to measure omissions or missed 
persons. The E-sample is a sample of census records that are reexamined 
to measure erroneous inclusions. Balancing error occurs when cases are 
handled differently in the P-and E-samples. For example, the effort 
spent to identify gross omissions should be comparable to the effort 
spent to identify erroneous enumerations. The ESCAP examined whether 
balancing error may have been introduced during the Targeted Extended 
Search (TES) operation. TES was the A.C.E. operation designed to look 
for matches in surrounding A.C.E. block clusters. The DSE model 
attempts to match people in the A.C.E. with people in the census. 
Balancing error occurs when the search area for the P-sample matching 
does not agree with the search area for E-sample erroneous 
enumerations. Specifically, if A.C.E. records are allowed to match to 
records that were not in the common area of search, the DSE ratio will 
be incorrectly estimated.
    One can assess TES balance by seeing if the proportions of errors 
of inclusion and of exclusion are approximately equal after completion 
of the search, assuming that there is no geocoding error in the P-
sample. In other words, the number of TES people found on the P-sample 
(coded as a Match) and E-sample (coded as a Correct Enumeration) sides 
should be about equal. In Census 2000, the larger increase in the 
match rate (3.8 percent) than in the correct enumeration rate (2.9 
percent) may indicate that some aspect of the A.C.E. is out of balance. 
The 
ESCAP directed a review of this situation. Preliminary results from an 
early A.C.E. evaluation indicate that a number of E-sample cases coded 
as correct enumerations were in fact outside of the search area. That 
means that they should have been coded as Erroneous Enumerations and 
subtracted from the DSEs. This error could introduce an upward bias in 
the DSE. In addition, there are also concerns that the search for 
census duplicate enumerations in surrounding blocks could have 
understated the estimate of duplicates used in the DSE. The net effect 
of correcting these two errors could have the effect of reducing the 
A.C.E. estimate of total net undercount. However, additional work must 
be completed to quantify this effect.
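    In terms of the general dual system form sketched earlier, the 
direction of this effect can be seen heuristically: if some E-sample 
cases outside the search area are nonetheless coded as correct 
enumerations, the correct enumeration rate CE/E is overstated, and

\[
\widehat{DSE} \approx (C - II)\cdot\frac{CE}{E}\cdot\frac{P}{M}
\]

is therefore overstated as well; correcting such cases would lower the 
dual system estimates and hence the estimated net undercount. This is a 
qualitative observation only, not a quantified estimate of the effect.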
    The ESCAP was concerned about the possibility of balancing error. 
The ESCAP noted that some measures of this error were included in the 
total error model. However, this result, in combination with the 
inconsistency between DA and the A.C.E., added to the concerns that 
adjustment could not be shown to improve accuracy at this time. The 
Committee also believes that balancing error must be further 
investigated before a recommendation can be made.
Late Adds and Whole Person Imputations
    There are records included in Census 2000 that do not contain 
information sufficient for matching to the A.C.E. independent sample. 
The established methodology for producing coverage estimates in this 
situation is to produce the dual system estimate based on the census 
population that has sufficient information to be included in the 
A.C.E. matching process. In effect, 
this excludes records that do not contain sufficient information for 
matching from the dual system estimation. The dual system estimate then 
produces a measure of the correct population total. The undercount (or 
overcount) is estimated by comparing the complete census count to the 
dual system estimate of the correct population total. Therefore, the 
effect of these census records is included in the estimates of 
undercount produced by dual system estimation.
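    The relationship described in this paragraph can be written 
compactly as follows, using the general notation sketched earlier; the 
complete census count C here includes whole person imputations and late 
additions even though such records are excluded from the matching:

\[
\widehat{\mathrm{net\ undercount\ rate}} = \frac{\widehat{DSE} - C}{\widehat{DSE}} .
\]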
    The key assumption underlying this methodology of estimating 
coverage error is that the probability of including the people 
represented by these records in the A.C.E. P-sample is the same as the 
probability of including the people who report sufficient information 
to be included in the matching procedures.
    Census 2000 contains over five million records where imputation 
procedures were used to create all of the information. These are 
referred to as whole person imputations. Since these records do not 
contain information sufficient to be included in the matching, they are 
handled as described above. The Census Bureau plans to evaluate the 
causes for these imputations.
    In addition, as discussed in the preceding section on census 
quality, the Housing Unit Unduplication Operation reinstated over a 
million previously removed housing units (representing over two million 
individual person records) into the census files. These reinstated 
``Late Additions'' were incorporated into the estimates of coverage 
error using the same process as described for census records that do 
not contain sufficient information for

[[Page 14017]]

matching. The same assumption underlies this treatment of late 
additions as described above for records without sufficient information 
for matching. That is, the probability of inclusion in the P-sample for 
the people Census 2000 correctly enumerated in the universe of late 
additions is assumed to be the same as for correctly enumerated people 
not in this universe.
    The ESCAP reviewed the treatment of late additions and whole person 
imputations because the number of these cases had increased 
significantly from 1990. The ESCAP concluded that the key assumptions 
underlying the methodology for including these records in the 
estimates of A.C.E. coverage error could be expected to hold. However, 
the ESCAP noted that these assumptions would not hold perfectly and 
examined the effects of deviations from this assumption. The ESCAP 
concluded that three effects were likely to result: (1) the sampling 
variance of the dual system estimator would be increased; (2) the 
heterogeneity of the A.C.E. inclusion probabilities would be increased, 
leading to increased correlation bias; and (3) to the extent that these 
records clustered geographically within the A.C.E. post-strata, 
synthetic error would be increased.
    The ESCAP was comfortable that the measures available for assessing 
the effects of sampling variance and correlation bias would include the 
effects of the treatment of late additions and whole person 
imputations. However, the ESCAP was concerned that synthetic error 
might be increased and continued its review of the effect of increased 
synthetic error. The committee reviewed tabulations of late census 
additions and whole person imputations for A.C.E. post-strata by census 
region. The committee found that these data did indicate some degree of 
geographic clustering within post-strata. The committee noted that the 
synthetic error analysis included the effect of clustering of whole 
person imputations. The committee concluded that there was a 
possibility for increased synthetic error, and that it was reflected to 
some degree in the analysis based on artificial populations. The 
committee concluded further that a higher degree of conservatism should 
be used in reviewing the results of the loss function analysis. The 
committee did not view the effect of increased synthetic error as large 
enough to change the findings described previously.
Misclassification Error
    Misclassification error occurs when an individual is classified 
into different post-strata in the census and the A.C.E. While the 
Census Bureau has never detected a significant impact of 
misclassification error in earlier post-enumeration surveys, the 
introduction of multiple race reporting in both the census and the 
A.C.E. raised concerns about this type of error. The evidence reveals 
that misclassification error affected only two groups, the domains of 
American Indians off reservation and Native Hawaiians and Pacific 
Islanders. The ESCAP has concluded that, for these two groups, 
inconsistent classification may have contributed to lower than 
anticipated undercount rates. The 
misclassification error in these two domains had little or no effect on 
the validity of the dual system estimates as a whole, given their small 
sizes. Misclassification error, in general, was not a problem.

Additional Issues

    There are several issues or concerns that have been raised 
regarding census adjustment. The ESCAP did not find these issues to be 
of concern, but they are briefly discussed below.

Block Level Accuracy

    Block level accuracy is not an important criterion to evaluate 
either Census 2000 or the A.C.E. The population of stand-alone blocks 
is not used to determine either congressional or state legislative 
districts, nor are block-level data used to distribute funds. Rather, 
blocks are added together to form the more meaningful levels of 
aggregation studied by the loss functions: states, congressional 
districts, and counties.
    Block level accuracy has two components: random error and 
systematic error, or bias. Random error can be minimized through the 
conduct of census operations aimed at improving quality. Systematic 
biases, on the other hand, are caused by systematic errors that occur 
during the conduct of census operations. Random errors at the block 
level diminish greatly as blocks are added together to form larger 
aggregations of the data. Systematic errors, if not corrected, will 
remain in the data at all levels of aggregation, leading to data that 
systematically over or understate affected population groups. 
Therefore, it is more important for adjustment to remove systematic 
errors from block level data.
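    For illustration only, a small simulation shows why mean-zero 
random block-level errors largely cancel under aggregation while a 
systematic bias persists; the block counts, error sizes, and bias rate 
below are hypothetical.

    import numpy as np

    rng = np.random.default_rng(1)

    n_blocks = 10_000
    true_block_pop = rng.integers(20, 200, size=n_blocks)

    random_error = rng.normal(0, 5, size=n_blocks)    # mean-zero block noise
    systematic_bias = -0.01 * true_block_pop          # uniform 1 percent undercount

    counted = true_block_pop + random_error + systematic_bias

    # At the block level both error sources matter; after aggregation the
    # random component largely cancels and the systematic 1 percent remains.
    relative_error = (counted.sum() - true_block_pop.sum()) / true_block_pop.sum()
    print(f"aggregate relative error: {relative_error:.4f}")   # close to -0.01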

Adjustment for Overcounts

    It is important to emphasize that the statistical correction of 
Census 2000 would involve some amount of downward adjustment for 
overcounts. While adjustment based on the A.C.E. would increase the 
estimated size of most geographic entities, there are likely to be a 
small number of overcounted areas that would require 
decreasing the estimated size. The 2000 A.C.E. data do not show that 
any state or congressional district was overcounted; all states and 
congressional districts would increase in measured size. The data do 
reveal, however, that certain substate entities were overcounted and 
would thus be subject to downward adjustment.
    There are concerns that an adjustment for overcounts removes people 
from Census 2000 data files. This is not the case; the downward 
adjustment is accomplished by creating statistical records with 
negative weights that, when added to Census 2000 tabulations, reduce 
the count to reflect overcounts. No records would have been removed 
from the Census 2000 files. However, the effects of the adjustment for 
overcounts may subtract a person's individual characteristics from the 
Census 2000 tabulations.
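    For illustration only, the use of negative-weight records can be 
sketched as follows; the record contents and weight value are 
hypothetical.

    # Each record carries a weight; original census records keep weight 1.0,
    # while statistical records created for overcount adjustment carry
    # negative weights.  No original record is removed from the file.
    records = [
        {"block": "0001", "weight": 1.0},
        {"block": "0001", "weight": 1.0},
        {"block": "0001", "weight": 1.0},
        {"block": "0001", "weight": -0.4},  # adjustment record for an overcount
    ]

    tabulated = sum(r["weight"] for r in records if r["block"] == "0001")
    print(tabulated)   # 2.6: the tabulated count is reduced, yet all records remain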
    The ESCAP discussed the downward adjustment for overcounts, and 
noted that it was subject to the same concerns that are related to 
adjustment for undercounts. The ESCAP concluded that the analysis of 
the accuracy of the adjustment included the effects of uncertainties 
for adjustments of over and undercounts, and that any final 
determination on the potential improvement of accuracy would reflect 
these uncertainties.

Attachments

    1. Index of Supporting Documentation
    2. List of ``B-Series'' Analysis Reports

Attachment 1--Outline of Documents/Records Underlying the Report of 
the Executive Steering Committee for Accuracy and Coverage 
Evaluation Policy Regarding the Methodology To Be Used To Produce 
the Tabulations of Population Reported to States and Localities 
Pursuant to 13 U.S.C. 141(c)

A. Reports Supporting the Recommendation--Chapter B (Final reports)
B. Detailed Specifications for A.C.E. Methodology
C. Executive Steering Committee for A.C.E. Policy (ESCAP) Meetings
D. October 2, 2000, Presentation to the National Academy of Sciences
E. Prior Steps to Determining Feasibility

[[Page 14018]]

F. Prior Documentation of Adjustment Research
G. Census 2000 Decision Memoranda
H. Census 2000 Dress Rehearsal Evaluations Summary (1999)
I. Census 2000 Informational Memoranda
J. U.S. Census Monitoring Board Reports
K. U.S. General Accounting Office Reports
L. U.S. Department of Commerce Office of the Inspector General Reports

    Note: Because of the large volume of underlying documentation, 
not all will be posted to the Census Bureau's website at the time 
that the ESCAP report is made available. The remaining documents 
will be posted in the near future.

Attachment 2--B-Series Documents

February 27, 2001

----------------------------------------------------------------------------------------------------------------
                    Title                                                    Author
----------------------------------------------------------------------------------------------------------------
1. Accuracy and Coverage Evaluation: Data and  Hogan.
 Analysis To Inform the ESCAP Report.
2. Quality Indicators of Census 2000 and the   Farber.
 Accuracy and Coverage Evaluation.
3. Quality of Census 2000 Processes..........  Baumgardner/Moul/Pennington/Piegari/Stackhouse/Zajac/Alberti/
                                                Reichert/Treat.
4. Accuracy and Coverage Evaluation:           Robinson.
 Demographic Analysis Results.
5. Accuracy and Coverage Evaluation: Person    Byrne/Imel/Ramos/Stallone.
 Interviewing Results.
6. Accuracy and Coverage Evaluation: Person    Childers/Byrne/Adams/Feldpausch.
 Matching and Follow-up Results.
7. Accuracy and Coverage Evaluation: Missing   Cantwell/McGrath/Nguyen/Zelenak.
 Data Results.
8. Accuracy and Coverage Evaluation:           Mule.
 Decomposition of Dual System Estimation
 Components.
9. Accuracy and Coverage Evaluation: Dual      Davis.
 System Estimation Results.
10. Accuracy and Coverage Evaluation:          Farber.
 Consistency of Post-Stratification Variables.
11. Accuracy and Coverage Evaluation:          Starsinic/Sissel/Asiala.
 Variance Estimates by Size of Geographic
 Area.
12. Correlation Bias.........................  Bell.
13. Comparing Accuracy.......................  Mulry/Navarro.
14. Accuracy and Coverage Evaluation:          Griffin/Malec.
 Assessment of Synthetic Assumptions.
15. Census 2000: Service Based Enumeration     Griffin.
 Multiplicity Estimation.
16. Demographic Full Count Review: 100% Data   Batutis.
 Files and Products.
17. Census 2000: Missing Housing Unit Status   Griffin.
 and Population Data.
18. Accuracy and Coverage Evaluation: Effect   Navarro/Olson.
 of Targeted Extended Search.
19. Overview of Total Error Modeling and Loss  Mulry/Spencer.
 Function Analysis.
----------------------------------------------------------------------------------------------------------------

Attachment 3 to Preamble

DSSD Census 2000 Procedures and Operations Memorandum Series B-1, March 
1, 2001

Accuracy and Coverage Evaluation: Data and Analysis To Inform the ESCAP 
Report

Howard Hogan, U.S. Census Bureau

Table of Contents

Summary Table
Introduction
Background
Accuracy and Coverage Evaluation (A.C.E.) Dual System Estimates
Review of the Quality of the Census Operations
    Conclusions for this Section
    Analysis reports important to this section
    Discussion
    Address List Development
    Questionnaire Delivery and Return
    Nonresponse Follow-up
    Be Counted Campaign
    Coverage Edit Follow-up
    Coverage Improvement Follow-up
    Housing Unit Duplication Operation
    Primary Selection Algorithm
    Unclassified Unit and Missing Data Estimation
Review of A.C.E. Operations
    Proper execution of the steps between processing and estimation
    Conduct and control of the A.C.E. operations
    Conclusions for this section
    Analysis reports important to this section
    Discussion
    Interviewing
    Matching and Follow-up
Review of A.C.E. Quality
    Individual components of A.C.E. quality
    Sampling variance
    Consistent Reporting of Census Day Residence
    Matching error
    A.C.E. Fabrications
    Missing data
    Balancing error
    Errors in Measuring Census Erroneous Enumerations
    Correlation bias
    Synthetic Assumptions
    Other Measurement and Technical Errors
    Synthesizing A.C.E. Quality
    Comparison with demographic analysis and demographic estimates
    Post-Enumeration Survey--A.C.E. error of closure
    Comparing the accuracy of the A.C.E. to the accuracy of the 
uncorrected census
    References

Tables

Table 1. Summary Table: Data and Analysis to Inform the ESCAP Report
Table 2. Percent Net Undercount for Major Groups: 2000 A.C.E. and 
1990 PES
Table 3. Distribution of Interviews by Week--Unweighted
Table 4a. Census 2000 A.C.E. 64 Post-Stratum Group-Percent Net 
Undercount
Table 4b. Census 2000 A.C.E. 64 Post-Stratum Group-Standard Error of 
the Net Undercount
Table 5. A Comparison of the 1990 PES Total Population with the 
A.C.E. Accounting for Population Change
Table 6. Relative Loss by Degree of Processing Error and Correlation 
Bias
Table A-1. Census 2000 A.C.E. 64 Post-Stratum Groups by Region--
Percent Late Adds
Table A-2. Census 2000 A.C.E. 64 Post-Stratum Group by Region--
 Percent IIs
Table A-3. Census 2000 Evaluations Program Category Report Schedule

[[Page 14019]]



                      Table 1.--Summary Table: Data and Analysis To Inform the ESCAP Report
----------------------------------------------------------------------------------------------------------------
             Finding                        Evidence                 Implication            Limitations \32\
----------------------------------------------------------------------------------------------------------------
                                       What do we know about Census 2000?
----------------------------------------------------------------------------------------------------------------
Census 2000 was similar in design  Procedural histories and   We expected similar
 to the 1980 and 1990 censuses.     Census 2000 Operational    patterns of coverage
                                    Plan, 12/00.               error.
Census 2000 included some          Census 2000 Operational    We expected these
 important improvements such as     Plan 12/00.                programs to have a
 paid advertising, intensive                                   modest impact in
 community outreach, local                                     reducing the
 involvement with address list                                 undercount.
 development, and competitive pay
 scales.
While there were some local        Report B-3, ``Quality of   We expect generally good
 problems and minor operational     Census Processes''.        and uniform patterns of
 shortcomings, Census 2000 was                                 coverage, with perhaps
 generally well designed and                                   local exceptions.
 executed.
There is evidence of geographic    Presentation at ESCAP, 12- Local heterogeneity will  We cannot know the
 heterogeneity in the application   20-99, 4-12-00, 5-24-00    affect the accuracy of    effects of this
 of some census processes.          and B-14, Feasibility,     both the census and the   differential pattern on
                                    pp. 46-48.                 adjusted estimates.       census net undercount
                                                                                         at the local level.
There was a high level of          Census 2000 Operational    We expected these
 advertising and outreach           Plan 12/00.                programs to have a
 targeted at minority populations.                             modest impact in
                                                               reducing the
                                                               undercount.
----------------------------------------------------------------------------------------------------------------
                                 What does the A.C.E. tell us about Census 2000?
----------------------------------------------------------------------------------------------------------------
The level and patterns of          Report B-2, ``Overall      This finding is
 coverage in Census 2000 are        Census and A.C.E.          consistent with what is
 substantially similar to those     Quality Indicators;''      known about the design
 of the past two censuses, with     Report B-3, ``Quality of   and implementation of
 incremental improvements rather    Census Processes;''        Census 2000.
 than wholesale discontinuities.    Feasibility Doc.
While Census 2000 reduced both     Report B-9, ``Dual System  A lower undercount in     All results are subject
 the net and the differential       Estimation Results''.      the Census means the      to sampling and
 undercount, the A.C.E. estimates                              benefits from adjusting   nonsampling errors.
 that the census undercounted the                              a loss.
 total population by
 approximately 1.18 percent and
 continued previous patterns of
 differential coverage, with
 lower coverage rates for
 minorities, renters, and
 children.
----------------------------------------------------------------------------------------------------------------
                                      Was the A.C.E. conducted as designed?
----------------------------------------------------------------------------------------------------------------
The A.C.E. was carried out as pre- Report B-7, ``Missing      The results of the        There were two changes
 specified with only minor          Data Results;'' Report B-  A.C.E. were not           to prespecification,
 modifications, which were          8, ``Decomposition of      manipulated.              one concerning the
 warranted and documented when      Dual System Estimate                                 collapsing rules and
 important information became       Components;'' Report B-                              the other affecting the
 available earlier than expected.   9, ``Dual System                                     missing data
                                    Estimation Results''.                                imputation.
----------------------------------------------------------------------------------------------------------------
                                     Was the A.C.E. an operational success?
----------------------------------------------------------------------------------------------------------------
The A.C.E. was an operational      Report B-5, ``Person       There were no unforeseen
 success; listing, interviewing,    Interviewing Results;''    operational
 matching, and follow-up were all   Report B-6, ``Person       difficulties with a
 conducted as designed and were     Matching and Followup      significant effect on
 well controlled.                   Results;'' Report B-7,     the quality of the
                                    ``Missing Data Results''.  data.
The A.C.E. significantly reduced   Report B-9, ``Dual System  There were no unforeseen
 sampling variance.                 Estimation Results;''      operational
                                    Report B-11, ``Variance    difficulties with a
                                    Estimates by Size of       significant effect on
                                    Geographic Area''.         the quality of the
                                                               data.
Consistent reporting of Census     Report B-6, ``Person       Data collection error
 Day addresses may have been        Matching and Followup      probably lower than in
 somewhat better than that          Results''.                 1990.
 achieved in 1990 due to better
 interviews made possible by the
 Computer Assisted Personal
 Interviewing instrument.
 Interviewing was conducted
 closer to Census Day.

[[Page 14020]]

 
Matching error in the A.C.E. was   Report B-6, ``Person       A.C.E. processing errors
 low, with indications that it is   Matching and Followup      are probably less than
 substantially lower than that      Results;'' Presentation    those measured in 1990.
 achieved in 1990.                  to ESCAP, 11-30-00 and 2-
                                    2-01, Feasibility, p. 43.
A.C.E. fabrication was more        Report B-6, ``Person       Data collection error
 tightly controlled than in the     Matching and Followup      probably lower than in
 1990 PES; tighter field            Results''.                 1990.
 management reduced the
 opportunity for fabrication.
The level and pattern of missing   Report B-7, ``Missing      Effect on A.C.E. quality
 data in the A.C.E. is comparable   Data Results''.            is similar to that
 to that in the 1990 PES.                                      experienced in the 1990
                                                               PES.
Questions remain concerning the    Report B-8,                Increased balancing       A full analysis has not
 level of balancing error.          ``Decomposition of Dual    error could make the      been completed.
                                    System Estimate            adjustment less
                                    Components;'' Minutes,     accurate.
                                    Feasibility p. 50,
                                    Report B-18, ``Effect of
                                    Targeted Extended
                                    Search''.
E-Sample coding errors were        Series T-6: ``Additional   A.C.E. might over-        There was evidence of
 controlled comparably to 1990,     Geographic Coding for      estimate the census       some A.C.E. mis-
 except, perhaps, for e-sample      Erroneously Enumerated     undercount.               geocoding.
 geocoding.                         Housing Units''.
Correlation bias is almost         Report B-4, ``Demographic  A.C.E. could              Limited data on females,
 certainly present for both Black   Analysis Results;''        underestimate the         children etc.
 and non-Black populations. The     Report B-12,               undercount.
 switch to PES-C may have           ``Correlation Bias;''
 increased correlation bias over    Presentation to ESCAP, 7-
 1990 levels, but the evidence on   12-00, Feasibility pp.
 the level of correlation bias is   35-36.
 weak.
The A.C.E. contains bias due to    Report B-14, ``Synthetic   The A.C.E. will not       Only indirect evidence
 synthetic estimation.              Assumptions''.             remove local variations   is available.
                                                               in the net undercount
                                                               that are not correlated
                                                               with the poststrata.
----------------------------------------------------------------------------------------------------------------
                              What does Demographic Analysis say about the census?
----------------------------------------------------------------------------------------------------------------
Initial Demographic Analysis       Report B-4, ``Demographic  The level and pattern of  The Demographic Analysis
 estimates indicate a net census    Analysis Results''.        the A.C.E. estimates      estimates are subject
 overcount of 0.7 percent with                                 differs from the          to their own patterns
 large overcounts for the non-                                 initial Demographic       of uncertainties.
 Black population age 18-29.                                   Analysis estimates.
Alternate Demographic Analysis     .........................  The A.C.E. may be         The Demographic Analysis
 benchmarks indicate a net                                     overestimating the        estimates are subject
 undercount of 0.9 million, or                                 population size.          to their own patterns
  0.32 percent.                                                                           of uncertainties.
Both the initial and revised       B-4, ``Demographic         Census 2000 net coverage
 Demographic Analysis indicate      Analysis Results''.        is higher than in 1990.
 an improvement in coverage from
 the 1990 to the 2000 censuses.
Both the initial and the revised   B-4, ``Demographic         Census 2000 did not
 Demographic Analysis indicate a    Analysis Results''.        eliminate the
 differential undercount in                                    differential
 Census 2000.                                                  undercount.
----------------------------------------------------------------------------------------------------------------
   What does loss function analysis tell us about the relative accuracy of the adjusted and unadjusted census?
----------------------------------------------------------------------------------------------------------------
If there is little or no           Report B-13, ``Comparing   If these conditions are   These results are
 correlation bias and the level     Accuracy''.                true, the census is       dependent on the model
 of A.C.E. errors is the same as                               probably the more         assumptions being
 the 1990 PES, the A.C.E. is less                              accurate.                 approximately true.
 accurate than the census.
If there is moderate correlation   B-13, ``Comparing          If these conditions are   These results are
 bias or if the level of A.C.E.     Accuracy''.                true, the adjusted        dependent on the model
 processing errors is                                          figures are probably      assumptions being
 substantially reduced, the                                    the more accurate.        approximately true.
 A.C.E. is more accurate.
Accounting for local census        B-14, ``Synthetic Error''  Heterogeneity is a        Measuring the effect of
 heterogeneity is unlikely to                                  concern but probably      local variation is
 reverse the findings for the                                  not a deciding factor.    dependent on finding
 loss function analysis.                                                                 observable variables
                                                                                         that have similar
                                                                                         geographic
                                                                                         distributions as the
                                                                                         net undercount.
----------------------------------------------------------------------------------------------------------------

[[Page 14021]]

 
   What does an analysis of the A.C.E./PES error of closure tell us about the level and pattern of DSE errors?
----------------------------------------------------------------------------------------------------------------
The level and pattern of errors    B-13, ``Comparing          The findings from the     This result depends upon
 in the A.C.E. may differ from      Accuracy'' B-14,           loss function analysis,   Demographic Analysis's
 that of the 1990 PES.              ``Synthetic Error,'' and   which depend upon an      ability to place an
                                    Overview of Total Error    assumption of A.C.E./     upper bound on the
                                    Modeling and Loss          PES similarity in error   level of population
                                    Function Analysis.         structure, may be         change between 1990 and
                                                               misleading.               2000.
----------------------------------------------------------------------------------------------------------------
\32\ All findings are based on the best available evidence as of today; further evaluations could modify them.

Accuracy and Coverage Evaluation: Data and Analysis To Inform the 
ESCAP Report

Prepared by Howard Hogan

Introduction

Background

    The Census Bureau designed the Accuracy and Coverage Evaluation 
(A.C.E.) to permit correction of the initial census results to account 
for systematic patterns of net undercount. The Census Bureau 
preliminarily determined that the A.C.E., if properly conducted, should 
produce more accurate census data by improving coverage and reducing 
differential undercounts; the purpose of this document is to evaluate 
whether the data produced in Census 2000 support this initial 
determination.
    This document summarizes and synthesizes the more detailed analysis 
reports that were written to inform the adjustment decision. No one 
analysis report is determinative; rather the information in the 
analysis reports, taken together, permits evaluation of the quality of 
both the census and the A.C.E. The topics of the analysis reports were 
selected because the Census Bureau believed that the information in 
those reports would provide the basis for informing the Census Bureau's 
adjustment decision. In the course of evaluating the conduct of both 
the census and the A.C.E., it became evident that other analyses should 
be completed; thus, two additional reports have been added to the 16 
formal reports originally specified. The information in the analysis 
reports, and the reports themselves in draft form, have been shared 
with the Executive Steering Committee on A.C.E. Policy (ESCAP) on a 
flow basis so that the Committee could evaluate the data as they became 
available. The Committee has sometimes asked for additional 
information, either from the authors of the analysis reports or from 
other Census Bureau staff. Much of the analysis in the attached reports 
is applicable for all possible uses of adjusted data, but in some 
instances the reports focus on the ESCAP Committee's initial regulatory 
charge: to make a recommendation on the suitability of using the A.C.E. 
data for redistricting.
    As this document is written, the ESCAP is in the process of 
evaluating which set of numbers, the adjusted or the unadjusted, is 
more accurate for redistricting purposes. If more than one set of 
numbers is available, each of the 50 states will then make its own 
decision on which set of data to use. The Census Bureau believes it is 
appropriate for it to make one determination on which set of data is 
more accurate, rather than 50 separate determinations, because the 
statistical determination of the relative accuracy of the census versus 
the A.C.E. results is meaningful when summarized across jurisdictions. 
However, we have not attempted, nor do we think it possible, to 
establish the relative accuracy of a particular state.
    The authors of the attached reports have analyzed the best data 
available at this time. It should be noted that in the years following 
Census 2000, as in past censuses, the Census Bureau will prepare an 
extensive array of detailed evaluations of many aspects of both the 
initial census and the A.C.E. A list of the evaluation categories and 
their projected completion dates is attached. These evaluations will 
not be available for months, or in some cases, years, after the Census 
Bureau is required by law to provide redistricting data to the states. 
These final evaluations, as distinguished from these analytical reports 
that inform the ESCAP Committee, will be accomplished without the 
pressure of a legal deadline, will be based on additional information, 
and may, in some instances, reach conclusions different from those in 
certain of these reports.

Accuracy and Coverage Evaluation (A.C.E.) Dual System Estimates

    The A.C.E. indicates that Census 2000 reduced both net and 
differential undercoverage below the levels measured by the 1990 Post-
Enumeration Survey (PES). The net national undercount is estimated to 
have been reduced from the 1990 rate of 1.61 percent (0.20 percent 
standard error) to 1.18 percent (0.13 percent standard error). The 
estimated undercount rate for the Non-Hispanic Blacks domain dropped 
from 4.57 percent (0.55 percent standard error) to 2.17 percent (0.35 
percent standard error), and the estimated undercount rate for the 
Hispanics domain dropped similarly from 4.99 percent (0.82 percent 
standard error) to 2.85 percent (0.38 percent standard error). In 
addition, the estimated undercount rate for children dropped from 3.18 
percent (0.29 percent standard error) to 1.54 percent (0.19 percent 
standard error). (Report B-9, ``Dual System Estimation Results'')
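    The percent net undercount figures cited above can be traced back to 
the underlying dual system estimates and census counts. The short sketch 
below is illustrative only; it uses hypothetical totals and assumes the 
conventional definition in which the percent net undercount equals 
100 x (dual system estimate - census count) / dual system estimate.

    # Illustrative sketch only; not Census Bureau production code.
    # Assumes: percent net undercount = 100 * (DSE - census) / DSE.

    def percent_net_undercount(dse, census_count):
        """Percent net undercount implied by a dual system estimate."""
        return 100.0 * (dse - census_count) / dse

    # Hypothetical post-stratum totals, for illustration only.
    examples = {
        "hypothetical stratum A": (1_050_000, 1_030_000),  # net undercount
        "hypothetical stratum B": (2_500_000, 2_510_000),  # net overcount
    }

    for name, (dse, census) in examples.items():
        print(f"{name}: {percent_net_undercount(dse, census):+.2f} percent")
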
    Nonetheless, the improvements demonstrated in Census 2000 do not 
mean that complete coverage has been achieved or that differential 
coverage has been eliminated. On the contrary, the A.C.E. indicates 
that Census 2000 perpetuated longstanding patterns of differential 
coverage, with minority groups and children exhibiting lower coverage 
rates. The Census 2000 percent net undercount for the non-Hispanic 
Black and the Hispanic domains, 2.17 percent and 2.85 percent 
respectively, remain significant, as does the Census 2000 percent net 
undercount for children of 1.54 percent.
    Tenure continues to be an important characteristic for 
distinguishing coverage. The A.C.E. indicates that the pattern of 
differential 
coverage continues despite improvements in Census 2000. The percent net 
undercount for non-owners was 2.75 percent (0.26 standard error) as 
compared with an estimated net undercount for owners of 0.44 percent 
(0.14 standard error). While this is a distinct improvement over the 
percent net undercount for non-owners in the 1990 census, which is 
estimated at 4.51 percent (0.43 standard error), the A.C.E. indicates 
that the estimated undercount for this population is significant as 
well.
    In addition, the undercount for minority renters also remains high. 
The

[[Page 14022]]

non-owner undercount for non-Hispanic Blacks was estimated to be 3.58 
(0.48 standard error), for Hispanics 4.32 (0.55 standard error), for 
Asians 1.58 (0.98 standard error), for Hawaiians and Pacific Islanders 
6.58 (4.07 standard error), and for American Indians not on 
reservations 5.57 (2.02 standard error).
    Tables 2a and 2b provide the percent net undercount for the race/
origin domains, tenure, and age/sex groups for Census 2000 and the 1990 
census.

     Table 2a.--Percent Net Undercount for Major Groups: 2000 A.C.E.
------------------------------------------------------------------------
                                                    Net        Standard
              Estimation grouping                undercount     error
                                                 (percent)    (percent)
------------------------------------------------------------------------
        Total population in Households........         1.18         0.13
Race and Hispanic Origin:
    American Indian and Alaska Native (on              4.74         1.20
     reservation).............................
    American Indian and Alaska Native (off             3.28         1.33
     reservation).............................
    Hispanic Origin (of any race).............         2.85         0.38
    Black or African American (not Hispanic)..         2.17         0.35
    Native Hawaiian and Other Pacific Islander         4.60         2.77
     (not Hispanic)...........................
    Asian (not Hispanic)......................         0.96         0.64
    White or Some Other Race (not Hispanic)...         0.67         0.14
Age and Sex:
    Under 18 years............................         1.54         0.19
    18 to 29 years:
        Male..................................         3.77         0.32
        Female................................         2.23         0.29
    30 to 49 years:
        Male..................................         1.86         0.19
        Female................................         0.96         0.17
    50 years and over:
        Male..................................        -0.25         0.18
        Female................................        -0.79         0.17
Housing Tenure:
    In owner-occupied housing units...........         0.44         0.14
    In nonowner-occupied units................         2.75         0.26
------------------------------------------------------------------------


       Table 2b--Percent Net Undercount for Major Groups: 1990 PES
------------------------------------------------------------------------
                                                    Net        Standard
              Estimation grouping                undercount     error
                                                 (percent)    (percent)
------------------------------------------------------------------------
        Total Population \1\..................         1.61         0.20
Race and Hispanic Origin:
    White or Some Other Race (not Hispanic)            0.68         0.22
     \2\......................................
    Black or African American.................         4.57         0.55
    Hispanic Origin \3\.......................         4.99         0.82
    Asian and Pacific Islander................         2.36         1.39
    American Indian and Alaska Native (on             12.22         5.29
     reservation).............................
Age and Sex:
    Under 18 years............................         3.18         0.29
    18 to 29 years:
        Male..................................         3.30         0.54
        Female................................         2.83         0.47
    30 to 49 years:
        Male..................................         1.89         0.32
        Female................................         0.88         0.25
    50 years and over:
        Male..................................        -0.59         0.34
        Female................................        -1.24         0.29
Housing Tenure:
    In owner-occupied housing units...........         0.04         0.21
    In nonowner-occupied housing units........         4.51         0.43
------------------------------------------------------------------------

Review of the Quality of the Census Operations

Conclusions for This Section

    While many elements of the design of Census 2000 were fundamentally 
similar to the 1990 census, there were numerous major changes. These 
included involving local governments in the address list building 
process, expanding the methods available for responding to the census, 
designing a simplified questionnaire, developing a multi-step mailing 
strategy, creating a paid advertising campaign, and restructuring the 
pay scale for temporary workers. The paid advertising campaign (over 
$100 million) allowed for a saturation of census awareness across the 
nation, particularly in minority communities. The restructured pay 
scale meant that the census could compete successfully with other 
employers to hire the number and quality of field workers it needed to 
conduct the census well.

[[Page 14023]]

    Operationally, Census 2000 was a success. The census data 
collection was accomplished on schedule with only a few exceptions. A 
review of the evidence from field reports and quality assurance 
processes indicates that Census 2000 programs functioned effectively 
within design parameters.

Analysis Reports Important to This Section

(All Analysis Reports cited in the text are in the DSSD Census 2000 
Procedures and Operations Memorandum Series B)
     Report B-2: ``Quality Indicators of Census 2000 and the 
Accuracy and Coverage Evaluation,'' by James Farber.
     Report B-3: ``Quality of Census 2000 Processes,'' by James 
B. Treat, Nicholas S. Alberti, Jennifer W. Reichert et al.

Discussion

    As documented extensively by Census Bureau and outside 
statisticians, every census since at least 1940 has experienced both a 
net undercount and a substantial differential undercount. In 
particular, the data reveal a persistent differential undercount 
between the Black and non-Black populations, as well as differential 
undercounts for other minority groups and for children.
    Many elements of the design of Census 2000 were fundamentally 
similar to the design of the 1990 census. Address lists were prepared 
using a variety of sources, and questionnaires were delivered to each 
address on the list. Questionnaires were principally delivered by the 
U.S. Postal Service; however, in areas with rural style addresses, 
census workers delivered the questionnaires. Households receiving 
questionnaires were asked to return the questionnaires by mail, 
although in some very rural or isolated areas households were 
interviewed by census enumerators as the enumerators verified and 
updated the address list. Those addresses that did not return a 
questionnaire by mail were followed up by census workers in the 
Nonresponse Follow-up operation (NRFU). NRFU was followed by special 
coverage improvement follow-up operations, which, among other things, 
included contacting addresses listed as vacant or nonexistent by the 
NRFU field staff. Each of these operations had its own quality control 
procedures.
    The Census 2000 plan, however, included several important 
innovations to the census process designed to improve census accuracy. 
Prior to Census 2000, the Census Bureau worked closely with local and 
tribal governments through the Local Update of Census Addresses (LUCA) 
program to review and update the address list. During LUCA, local and 
tribal government officials were given the opportunity to review the 
Census Bureau's address list and identify missing addresses for 
inclusion in the census. The Census Bureau also implemented the New 
Construction Program, during which local governments were invited to 
submit addresses for housing units that had been built subsequent to 
the completion of the address list in January 2000. The ``Be Counted'' 
program was also new in Census 2000. ``Be Counted'' forms were provided 
to individuals who believed that they might have been missed in the 
initial distribution of census questionnaires, as well as to 
individuals without any usual residence. The ``Be Counted'' forms were 
made available to the public at walk-in Census 2000 assistance centers 
and at a variety of public locations identified through consultation 
with local organizations. In addition, Census 2000 questionnaires were 
available upon request in six languages, and language assistance guides 
were available in more than forty languages. Households also were given 
the opportunity to respond to Census 2000 by telephone or via the 
Internet.
    To encourage households to respond to Census 2000, the Census 
Bureau initiated the largest promotion and outreach effort in its 
history for a decennial census. The Census Bureau established 
approximately 140,000 partnerships with a wide range of government and 
nongovernment organizations at the national and local levels. 
Organizations throughout the United States and Puerto Rico implemented 
promotional activities to educate the public about the importance of 
participating in the census. Then, starting in November 1999, the 
Census Bureau launched the first-ever paid advertising campaign for a 
census. This campaign was extended in targeted cities to encourage 
cooperation with enumerators during the NRFU operation. Other efforts 
included the distribution of numerous news releases and a number of 
video news feeds tailored to local areas to media outlets to generate 
media coverage during the various stages of Census 2000.
    The Census Bureau then implemented the A.C.E. because it expected 
that, while these innovations would improve the results of the census, 
the phenomenon of the differential undercount would continue. The 
A.C.E. is designed to serve as a quality check on the census counts 
obtained after all other operations planned for Census 2000 were 
completed. In effect, the goal of the A.C.E. is to make a good census 
even better.
    The discussion in this document is not meant to be a complete 
evaluation of census operations, but rather focuses on information 
relevant to the question of the level and pattern of census omissions 
or erroneous inclusions, because this information is directly relevant 
to understanding and assessing the results of the A.C.E.
    We will discuss what is known about the following major operations:

 Address List Development
 Questionnaire Delivery and Return
 Nonresponse Follow-up
 The ``Be Counted'' Campaign
 Coverage Edit Follow-up
 Coverage Improvement Follow-up
 Housing Unit Duplication Operation \33\
---------------------------------------------------------------------------

    \33\ The Housing Unit Duplication Operation was a special 
operation designed and instituted after the Coverage Improvement 
Follow-up to reduce the level of housing unit duplication. This 
operation has special implications for census coverage and the 
conduct of the A.C.E.
---------------------------------------------------------------------------

 Primary Selection Algorithm
 Unclassified Unit and Missing Data Estimation

Address List Development

    Address list development was conducted over several years, and the 
vast majority (96.7 percent) of addresses were listed before 
questionnaire delivery. One major change from previous censuses was the 
inclusion of the Local Update of Census Addresses (LUCA) program, 
during which the Census Bureau solicited the help of local governments 
in the address list building operation. LUCA was successful in adding 
approximately five million housing units to the address list \34\. 
However, anecdotal evidence suggests that the LUCA program may also 
have contributed duplicate addresses to the Master Address File (MAF). 
Duplicate addresses may have been erroneously added because the Census 
Bureau and local governments refer to the same address in different 
ways.
---------------------------------------------------------------------------

    \34\ Many of these addresses were also added by other operations. At 
this time, we do not know the extent of the overlap. That is, the 
five million figure cannot be considered as a net addition.
---------------------------------------------------------------------------

    The address list development process included several quality 
assurance programs. These programs had the following objectives: to 
prevent errors due to lack of knowledge or understanding on the part of 
the lister, to control coverage and content errors, and to promote 
continuous improvement of performance. In general, the preliminary 
quality

[[Page 14024]]

assurance results for address list development are within the expected 
range for each of the programs.

Questionnaire Delivery and Return

    The United States Postal Service or census workers delivered 
questionnaires to the vast majority of addresses on the address list. 
As in previous censuses, a certain number of questionnaires were 
misdelivered. For example, the questionnaire intended for Apartment A 
might have been delivered to Apartment B and vice versa. We have not 
quantified the level of questionnaire mis-delivery.
    Householders were asked to return the questionnaires by mail. Since 
the Census Bureau does not expect mail responses from vacant or 
nonexistent housing units, the relevant measure of cooperation is the 
return rate, that is, the proportion of occupied housing units that 
returned their questionnaire. This measure differs from the response 
rate, which while available earlier in the census process, includes 
vacant and nonexistent housing units in the denominator. As measured by 
the return rate, the cooperation of the public with Census 2000 was 
approximately the same as in the 1990 census. The 2000 return rate (72 
percent) is approximately the same as the 1990 return rate (74 
percent). The comparison is not exact because the universes are 
slightly different.
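    The distinction between the return rate and the response rate can be 
illustrated with a simple calculation. The sketch below uses entirely 
hypothetical counts and assumes only the definitions given above: 
occupied housing units form the denominator of the return rate, while 
all addresses, including vacant and nonexistent units, form the 
denominator of the response rate.

    # Hypothetical illustration of return rate versus response rate.
    # The counts are invented; only the definitions in the text above
    # are assumed.

    mail_returns = 74_000       # occupied units that mailed back a form
    occupied_units = 100_000    # occupied units in the mailback universe
    all_addresses = 112_000     # occupied + vacant + nonexistent addresses

    return_rate = 100.0 * mail_returns / occupied_units
    response_rate = 100.0 * mail_returns / all_addresses

    print(f"Return rate:   {return_rate:.1f} percent")    # 74.0
    print(f"Response rate: {response_rate:.1f} percent")  # about 66.1
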
    Considering the general trend downward in return rates between 
censuses and for survey interviews in general, the Census Bureau 
considers the Census 2000 return rate to be a major success.

Nonresponse Follow-up

    During Nonresponse Follow-up (NRFU), census workers visited each 
household that the address list identified as not yet having returned a 
mail questionnaire. In Census 2000, approximately 42 million households 
were included in the NRFU process. Thanks in large part to adequate 
funding provided by Congress, pay rates and levels of staffing in 2000 
were far better than in the past two censuses. We believe that this 
increased funding and the ability to hire adequate staff contributed to 
an improvement in NRFU quality.
    For most local census offices (LCOs), NRFU was completed as 
scheduled in a nine-week period between April 27, 2000, and June 26, 
2000. This performance compares favorably with 1990, when NRFU was 
conducted over a 14-week period from April 26 through July 30. The 
Census Bureau believes that, 
all other things being equal, NRFU interviews conducted closer to 
Census Day are likely of higher quality.
    Local NRFU problems were identified in a few local census offices, 
including the local census office in Hialeah, Florida. The Census 
Bureau responded to the localized problems in the Hialeah office by re-
enumerating certain areas that were believed to have faulty data. The 
Census Bureau does not believe that net coverage in the Hialeah or any 
other local census office was substantially affected by these local 
problems; the NRFU operation for the nation as a whole was good to 
excellent.
    The NRFU quality assurance program was conducted through a random 
and targeted reinterview program which had the following three 
objectives:

 Prevent errors due to lack of knowledge or understanding
 Control coverage and content errors
 Promote continuous improvement of performance

    Preliminary NRFU quality assurance results show that the 
reinterview workload was 6 percent, slightly above the expected 
workload of 5 percent. Discrepant cases were found in approximately 
three percent of the reinterview cases. Some local census offices 
experienced delays in starting their reinterview programs, which may 
have hindered the reinterviewers' ability to accurately verify the 
census data. A significant number of quality assurance forms were lost 
and/or completed incorrectly. (Report B-3, ``Quality of Census 2000 
Processes'')
    In spite of local imperfections, the NRFU program as a whole was 
largely successful. The better pay and staffing seemed to have resulted 
in a more professional and controlled labor force. The local problems 
and quality assurance shortcomings were similar to problems encountered 
in previous censuses and should be expected in any nonrecurring 
operation of this magnitude.

Be Counted Campaign

    The ``Be Counted'' campaign was designed to allow people who 
thought they may have been missed by the census to send in a ``Be 
Counted'' form, listing themselves and their April 1, 2000, address. 
The Census Bureau had hoped that this campaign would allow for improved 
cooperation and coverage. The National Academy of Sciences and others 
feared that large numbers of ``Be Counted'' forms would overwhelm the 
system and lead to increased person duplication.
    Neither the hopes nor the fears relating to the ``Be Counted'' 
campaign were realized. The Be Counted workload was only approximately 
600,000, with no large local clusters observed. Its impact on net 
coverage for any group or area was minimal, and it is not believed to 
have contributed to housing unit duplication.

Coverage Edit Follow-up

    Under certain circumstances, the Census Bureau would call a 
responding household on the telephone to gain additional information. 
This extra effort, called Coverage Edit Follow-up (CEFU), was designed 
to improve within-household coverage, especially for large households. 
The census questionnaire had room to collect data for six people and 
asked the respondent living in a household with more than six people to 
list the additional residents. In CEFU, enumerators called these 
households and gathered the required information about the additional 
residents. In addition, CEFU was designed to follow up count 
discrepancies, or cases where the population count on the front of the 
questionnaire differed from the number of person responses inside the 
questionnaire.
    Due to computer problems, the start of CEFU was delayed until May 
8, 2000. It ran through August 13, 2000. Originally, it was planned for 
April 5, 2000, through June 19, 2000. This delay may have made it more 
difficult to obtain good information from households with more than six 
residents because some of the residents may have moved. In addition, 
CEFU had no provision to contact large households without telephones. 
When the Census Bureau could not secure good CEFU data on listed 
additional residents, it imputed their characteristics; to do otherwise 
would have decreased net coverage. Thus, the CEFU operation may have 
resulted in some small coverage loss compared to previous censuses, but 
this possible loss has not yet been quantified and is not expected to 
be significant, given the use of imputation.

Coverage Improvement Follow-up

    Coverage Improvement Follow-up (CIFU) was designed as a check on 
addresses that were determined during the NRFU operation to be vacant 
or deleted (nonexistent). CIFU was also used for addresses requiring 
follow-up that were identified too late to be included in NRFU. CIFU 
was conducted from June 26 until September 13. Both the 1980 and the 
1990 censuses included similar operations.
    CIFU was conducted on 6.5 million addresses for which the housing 
unit was listed as vacant or nonexistent in NRFU. CIFU determined that 
1.5 million of these units were, in fact,

[[Page 14025]]

occupied. In addition, CIFU included 2.2 million other addresses that 
had been added to the MAF after the initial mail out, such as those 
that resulted from the New Construction or Update/Leave programs.
    The quality assurance procedures on CIFU included a questionnaire 
review, a dependent review and data entry quality assurance. The 
dependent review was conducted on housing units identified as vacant or 
nonexistent and excluded certain occupied units for time and budgetary 
considerations. Some districts may have had a difficult time completing 
all of the dependent reviews. A significant number of quality assurance 
forms were lost and/or completed incorrectly. These lost/incorrect 
forms make any analysis of outgoing quality difficult. (Report B-3, 
``Quality of Census 2000 Processes'')

Housing Unit Duplication Operation

    The Census Bureau observed tentative indications as the census 
progressed that the MAF might contain a significant number of duplicate 
addresses. The Census Bureau also concluded that the Hundred Percent 
Census Unedited File (HCUF) might contain a significant number of 
duplicated persons, many of which are assigned to duplicated addresses. 
The Census Bureau responded to this problem by designing and conducting 
the Housing Unit Duplication Operation (HUDO). While this program was 
not prespecified, the Census Bureau believed that failure to address 
this potential problem could impair the accuracy of the apportionment 
numbers. Using the results of an address matching operation and a 
person matching operation, 2,411,743 address listings (address ID's) 
were analyzed on an aggregate basis to see whether these addresses were 
likely to correspond to other addresses already contained in the 
listing. Based on this analysis, 1,392,686 address IDs were permanently 
removed from the HCUF; after further review to identify units that may 
have been removed in error, the remaining 1,019,057 addresses were 
reinstated and included in the census results. The HUDO was designed 
solely to remove address/housing unit duplication. The software used 
for this process was carefully checked.

Primary Selection Algorithm

    Census questionnaires contain a unique ID, an identifier that the 
Census Bureau uses to make sure it records the information for each 
household only once. Nonetheless, the Census Bureau sometimes receives 
more than one questionnaire for a single address ID. For example, a 
household might mail back its questionnaire after the Census Bureau had 
already created NRFU assignment lists; a NRFU interviewer would then 
get an interview for a household that had already mailed back its 
response. As a further example, a ``Be Counted'' form might be received 
for a household with a completed census questionnaire. Since NRFU 
households identified as vacant are sent to CIFU, sometimes multiple 
questionnaires are generated by design. That is to say that we expect 
to have one questionnaire from NRFU and another from CIFU. The Primary 
Selection Algorithm (PSA) examines these multiple questionnaires to 
form one household to represent the housing unit in the census, 
sometimes by combining information from more than one questionnaire. 
The PSA was designed to prevent both overcoverage (including people 
more than once) and undercoverage (deleting too many people).
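    The general idea of selecting a single return can be sketched as 
follows. This is a greatly simplified, hypothetical illustration; the 
actual PSA applies more elaborate rules and, as noted above, can 
combine person records from several returns.

    # Greatly simplified sketch of choosing one return per census ID.
    # Hypothetical data; the real PSA is considerably more elaborate.

    returns = [
        # (census ID, source, number of data-defined persons)
        ("ID-001", "mail", 4),
        ("ID-001", "NRFU", 3),
        ("ID-002", "NRFU", 0),   # reported vacant in NRFU
        ("ID-002", "CIFU", 2),   # found occupied in CIFU
    ]

    selected = {}
    for census_id, source, persons in returns:
        best = selected.get(census_id)
        if best is None or persons > best[1]:
            selected[census_id] = (source, persons)

    for census_id, (source, persons) in sorted(selected.items()):
        print(f"{census_id}: kept {source} return with {persons} persons")
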
    Multiple returns were received for less than 10 percent of the 
address IDs. However, many of these multiple returns were from vacant 
units or multiple listings of the same people on two IDs. The number of 
people in a household was found to be larger than the number reported 
on the most complete questionnaire for fewer than 300,000 IDs. In other 
words, PSA resulted in an increase of individuals in fewer than 300,000 
housing units.
    Although no formal evaluation has been completed, the PSA was well 
programmed and well tested. The results are consistent with the overall 
design of the PSA and of the census.

Unclassified Unit and Missing Data Estimation

    As in the past, Census 2000 had some housing unit records listed on 
the MAF for which the Census Bureau could not obtain information. In 
addition, there were a small number of housing units that the Census 
Bureau knew to be occupied but for which it could not secure precise 
information about the individuals living in those units. The census 
process could not always determine whether other units were occupied or 
vacant. Sometimes, the unit was determined to be occupied, but the 
number of residents could not be determined. In each of these cases, a 
statistical process known as ``imputation'' is used to estimate the 
number of people living in these units.
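    The general logic of count imputation can be sketched with a 
simplified, hypothetical example in which an unresolved unit borrows 
the household size of a randomly selected resolved unit of the same 
structural type. This donor-based sketch illustrates the idea only and 
is not the Census 2000 production procedure.

    import random

    # Simplified, hypothetical sketch of donor-based count imputation.
    random.seed(2000)

    resolved_units = [
        {"type": "single-family", "size": 3},
        {"type": "single-family", "size": 4},
        {"type": "multi-unit", "size": 1},
        {"type": "multi-unit", "size": 2},
    ]
    unresolved_units = [{"type": "single-family"}, {"type": "multi-unit"}]

    for unit in unresolved_units:
        donors = [r for r in resolved_units if r["type"] == unit["type"]]
        unit["imputed_size"] = random.choice(donors)["size"]
        print(unit)
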
    Preliminary results indicate that almost 0.4 percent of person 
records on the preliminary HCUF were in housing units missing a status 
of occupied, vacant, or nonexistent, indicating that the residents of 
the housing unit were imputed. For states, the imputation 
percent ranged from 0.2 percent to 1.1 percent. In 1990, about 0.02 
percent of people in unclassified units were imputed.
    In addition, Census 2000 encountered whole households where the 
number of people could be determined, but the person records for these 
residents were missing. In accordance with past practice, the Census 
Bureau used imputation techniques to estimate characteristics for these 
people. About 0.8 percent of persons were imputed with this technique.
    The total percentage of substituted persons in Census 2000 is 
approximately 1.3 percent. The percentage of substituted persons in 
1990 was only about 0.7 percent.

Review of A.C.E. Operations

    Similar to its review of the operations in the initial census, the 
Census Bureau has reviewed the A.C.E. operations to identify any 
deviations from specified procedures and to assess the extent to which 
the operations were under management control.

Proper Execution of the Steps Between Processing and Estimation

Conclusions for This Section
    The A.C.E. was carried out as designed, with only minor 
modifications. Each modification was well documented and justified by 
good statistical practice. No steps were skipped because of lack of 
time or resources, and there was no manipulation of the results or 
distortions resulting from outside pressures. There is a clear and 
traceable path from the data collected by the interviewer to the final 
results. The Census Bureau carried out the A.C.E. according to its 
public plan, and the steps between processing and estimation were 
properly executed.
Analysis Reports Important to This Section
     Report B-7: ``Accuracy and Coverage Evaluation Survey: 
Missing Data Results,'' by Patrick J. Cantwell.
     Report B-8: ``Accuracy and Coverage Evaluation Survey: 
Decomposition of Dual System Estimate Components,'' by Thomas Mule.
     Report B-9: ``Accuracy and Coverage Evaluation Survey: 
Dual System Estimation Results,'' by Peter P. Davis.
Discussion
    The A.C.E. methodology planned for Census 2000 involves comparing 
(matching) the information from an independent sample survey to initial 
census records. In this process, the

[[Page 14026]]

Census Bureau conducts field interviewing and computerized and clerical 
matching of the records. Using the results of this matching, the Census 
Bureau applies the statistical methodology of Dual System Estimation 
(DSE) to develop coverage correction factors for various population 
groups. The results are then applied to the census files to produce the 
adjusted census data.
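    The logic of dual system estimation can be sketched with a 
simplified capture-recapture calculation. The example below is 
hypothetical and illustrative only: it ignores the treatment of 
erroneous enumerations, whole-person imputations, sampling weights, and 
post-stratification, all of which are part of the actual A.C.E. 
estimation, and it assumes that the coverage correction factor is the 
ratio of the dual system estimate to the census count.

    # Simplified capture-recapture sketch of dual system estimation.
    # Hypothetical counts; the production estimator also handles
    # erroneous enumerations, missing data, weighting, and post-strata.

    census_count = 9_500     # persons counted in a hypothetical post-stratum
    p_sample_total = 1_000   # persons found by the independent survey sample
    matches = 920            # survey persons matched to census enumerations

    match_rate = matches / p_sample_total
    dse = census_count / match_rate            # dual system estimate
    ccf = dse / census_count                   # assumed correction factor

    print(f"Dual system estimate:       {dse:,.0f}")
    print(f"Coverage correction factor: {ccf:.4f}")
    print(f"Implied net undercount:     "
          f"{100 * (dse - census_count) / dse:.2f} percent")
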
    One concern sometimes expressed about statistical correction is 
that statistical processes could be subject to manipulation. The Census 
Bureau believes that this notion is not well founded. The A.C.E. was 
publicly prespecified to assuage these concerns. The first step in 
reviewing the A.C.E. is to evaluate A.C.E. operations to determine 
whether the prespecified procedures were followed and documented. The 
Census Bureau's analysis found that all planned A.C.E. operations were 
carried out in close adherence to the prespecified design, with the two 
exceptions noted below.
    The supporting analysis reports review each of the steps in the 
A.C.E. operation from the creation of the A.C.E. micro-records to the 
computation of the final adjustment factors. In particular, Report B-8, 
``Decomposition of Dual System Estimate Components,'' presents an 
accounting of the A.C.E. estimation components so that the results can 
be independently verified. Beginning with records with complete data 
(meaning records with both post-stratification variables and 
enumeration status), the accounting then proceeds through each stage of 
missing data adjustment and sample weighting until the final weighted 
``matched'' results are provided (which are the results that are the 
input data for the dual system estimates). Report B-8 allows an 
informed reader to see clearly how the final results were derived and 
to understand the relative effect of the estimation steps on the 
results.
    Report B-7, ``Missing Data Results,'' shows in detail the effects 
of individual missing data estimation steps upon the weighted matching 
results. Report B-9, ``Dual System Estimation Results,'' provides 
detailed DSE computations together with useful ``roll-ups'' that 
aggregate the results by age and sex, minority/nonminority, or other 
useful summations. This document allows the reader to verify how the 
final coverage correction factors are computed from the input data.
    These three documents, taken together, demonstrate how the final 
coverage correction factors were derived from the micro-level data and 
document that the prespecified procedures were followed, with the 
following exceptions.
    The following two changes from the prespecified procedures arose 
from the unexpected availability of important information in time to 
improve the A.C.E. estimation:
     The A.C.E. plan provided that cells could be collapsed 
because of cell size but did not explicitly include variance as a 
reason for collapsing. We modified these rules because the estimated 
variance for one cell was unusually large. The design had not 
anticipated having variance estimates available in time to permit their 
use in collapsing. When the variances became available earlier than 
anticipated, the Census Bureau's statistical staff determined that the 
collapsing of ``outlier'' poststrata was appropriate. This change did 
not deviate from the purpose or spirit of the prespecified collapsing 
rules but allowed a more precise application. The change was discussed 
with the ESCAP and documented.
     Our method for imputing unresolved match and residency 
status, namely imputation cell estimation, was modified because the 
results of the A.C.E. follow-up forms became available during the 
missing data estimation process (Report B-7, ``Missing Data 
Results''). The prespecified design had not anticipated that 
these data would be available in time to be used in missing data 
estimation. Analysis of the data indicated that some cases grouped 
together in the initial missing data design could be separated based on 
the keyed follow-up results, allowing for a more precise imputation. 
This change is consistent with normal statistical practice and was 
discussed with the ESCAP and documented.
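    The imputation cell estimation approach mentioned above can be 
sketched in simplified form: unresolved cases are grouped into cells 
with resolved cases that share similar characteristics, and each 
unresolved case is assigned the probability observed among the resolved 
cases in its cell. The code below is a hypothetical illustration of 
that general idea, not the prespecified or modified A.C.E. procedure.

    from collections import defaultdict

    # Hypothetical sketch of imputation cell estimation for unresolved
    # match status: each unresolved case receives the match rate observed
    # among resolved cases in its imputation cell.

    resolved = [
        # (imputation cell, matched?) -- invented records
        ("owner/mailback", True), ("owner/mailback", True),
        ("owner/mailback", False),
        ("renter/nrfu", True), ("renter/nrfu", False), ("renter/nrfu", False),
    ]
    unresolved = ["owner/mailback", "renter/nrfu", "renter/nrfu"]

    totals = defaultdict(lambda: [0, 0])   # cell -> [matches, cases]
    for cell, matched in resolved:
        totals[cell][0] += int(matched)
        totals[cell][1] += 1

    for cell in unresolved:
        m, n = totals[cell]
        print(f"unresolved case in {cell}: imputed match probability {m/n:.2f}")
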

Conduct and Control of the A.C.E. Operations

Conclusions for This Section
    The A.C.E. was an operational success; it was properly conducted 
and encountered no unanticipated difficulties. Listing, interviewing, 
matching, and follow-up were all conducted as designed and were all in 
control.\35\
---------------------------------------------------------------------------

    \35\ A more extensive description of the A.C.E. can be found in 
Howard Hogan's paper, ``Accuracy and Coverage Evaluation: Theory and 
Application,'' prepared for the February 2-3, 2000, DSE Workshop of 
the National Academy of Sciences Panel to Review the 2000 Census; 
and Danny R. Childers and Deborah A. Fenstermaker, ``Accuracy and 
Coverage Evaluation: Overview of Design,'' DSSD Census Procedures 
and Operations Memorandum Series S-DT-02, U.S. Census Bureau, 
Washington, D.C., January 11, 2000.
---------------------------------------------------------------------------

Analysis Reports Important to This Section
     Report B-5: ``Accuracy and Coverage Evaluation Survey: 
Person Interviewing Results,'' by Rosemary L. Byrne, Lynn Imel, and 
Phawn Stallone.
     Report B-6: ``Accuracy and Coverage Evaluation Survey: 
Person Matching and Follow-up Results,'' by Danny R. Childers, Rosemary 
L. Byrne, Tamara S. Adams, and Roxanne Feldpausch.
     Report B-7: ``Accuracy and Coverage Evaluation Survey: 
Missing Data Results,'' by Patrick J. Cantwell.
Discussion
    The second aspect to this review is to establish that the A.C.E. 
operations were well conducted and well controlled. Reports B-5, 
``Person Interviewing Results,'' B-6, ``Person Matching and Follow-up 
Results,'' and B-7, ``Missing Data Results,'' taken together, establish 
that the operational quality of the A.C.E. was generally good and that 
the prespecified design was well followed.

Interviewing

    One change from 1990 was the introduction of telephone 
interviewing. The Census Bureau implemented a telephone program to 
enhance the efficiency and quality of the A.C.E. interview. The Census 
Bureau believed that shortening the elapsed time from Census Day to the 
A.C.E. enumeration would improve data quality and that beginning 
interviewing early in a more easily controlled environment would allow 
the A.C.E. supervisors to gain valuable experience in conducting 
interviews and in operating their laptop computers before training the 
enumerators. The Census Bureau designed this process to maintain the 
independence between the A.C.E. and the other Census 2000 operations.
    A.C.E. interviewing was an operational success. The A.C.E. 
interviewing finished on schedule by September 1, 2000, in every local 
census office except the Hialeah office, where census NRFU interviewing 
finished late (September 11, 2000) due to local difficulties. There 
were no major disruptions or delays introduced by the Computer Assisted 
Personal Interviewing (CAPI) instrument. The timely interviews allowed 
the Census Bureau to complete interviewing in an orderly manner, which 
was a major accomplishment.
    Twenty-nine percent of the total A.C.E. workload was completed 
during the telephone phase (April 24 through June 13). These A.C.E. 
interviews were conducted much closer to Census Day

[[Page 14027]]

(April 1) than had been possible in 1990, thereby reducing recall bias 
(the phenomenon of a respondent not remembering the actual situation 
several months earlier). By design, the telephone phase was restricted 
to a limited universe of households that were deemed unlikely to have 
any exposure to continuing census operations. These were primarily 
households that had mailed back their questionnaires, that had included 
a telephone number on the questionnaire, and that did not live in 
certain multi-unit or rural structures. The Census Bureau's 
conservative use of this interview mode meant that more than 99 percent 
of the telephone cases were classified as complete or partial 
interviews and were conducted with a household member.

                            Table 3.--Distribution of Interviews by Week--Unweighted
----------------------------------------------------------------------------------------------------------------
                                                                                                     Cumulative
                                                                                                     percent of
                     Phase                                 Week starting on             Number of      person
                                                                                          cases     interviewing
                                                                                                      workload
----------------------------------------------------------------------------------------------------------------
Telephone.....................................  April 23, 2000.......................        7,699           2.6
                                                April 30, 2000.......................       20,590           9.4
                                                May 7, 2000..........................       25,638          17.9
                                                May 14, 2000.........................       19,728          24.5
                                                May 21, 2000.........................       10,497          28.0
                                                May 28, 2000.........................        3,232          29.1
                                                June 4, 2000.........................        1,154          29.5
                                                June 11, 2000........................           35          29.5
Personal Visit................................  June 18, 2000........................       45,204          44.5
                                                June 25, 2000........................       57,241          63.5
                                                July 2, 2000.........................       41,642          77.3
                                                July 9, 2000.........................       31,344          87.7
                                                July 16, 2000........................       17,038          93.4
                                                July 23, 2000........................        7,764          96.0
                                                July 30, 2000........................        5,057          97.7
                                                Aug 6, 2000..........................        3,982          99.0
                                                Aug 13, 2000.........................        1,756          99.6
                                                Aug 20, 2000.........................          939          99.9
                                                Aug 27, 2000.........................          336         100.0
                                                Sept 3, 2000.........................           36         100.0
                                                Sept 10, 2000........................            1        100.0
----------------------------------------------------------------------------------------------------------------
Source: Accuracy and Coverage Evaluation Survey 2000-- housing unit data collected by the Computer Assisted
  Personal Interviewing (CAPI) instrument. Report B-5, ``Person Interviewing Results.''

    The automated Computer Assisted Personal Interviewing (CAPI) 
instrument increased the quality of the data captured in the A.C.E. 
interviews, as it included data edits to ensure a predetermined quality 
of data before the interview was considered complete. This was not 
possible with the paper-and-pencil 1990 instrument. CAPI also ensured 
that the interviewer followed the correct path through the interview 
and allowed quick feedback to the interviewers. The Census Bureau's 
observations and debriefings indicated that CAPI instilled the 
interviewers with a sense of professionalism and purpose. Observations 
also indicated that the use of laptop computers enhanced the respect 
and cooperation exhibited toward the interviewers by the respondent 
households thereby leading to improved A.C.E. data quality. However, 
there were a couple of small problems with the CAPI instrument that had 
minor impacts on quality.
    The Nonresponse Conversion Operation (NRCO) was designed to 
``convert'' nonresponse cases, that is, to obtain A.C.E. information 
for nonresponding households. On a national basis, the NRCO operation 
successfully converted 70.8 percent of its cases to complete interviews 
and 14.1 percent to partial interviews. Only 2.2 percent of the cases 
finished as refusals.
    A.C.E. interview rates were very high. The A.C.E. asked questions 
about both the household living at the address on Census Day and the 
current household. Because of this, there are two measures of household 
nonresponse. The rate for occupied housing units on Census Day was 97.1 
percent; on the date of the A.C.E. interview, the rate for occupied 
housing units was 98.8 percent.
    These rates compare favorably to the approximately 98.4 percent 
(unweighted) in the 1990 Post-Enumeration Survey. The unweighted rates 
for 2000 were 97.0 and 98.9, respectively. Due to the high rate of 
response, most of the noninterview adjustment factors were very close 
to one. Consequently, this operation did not change the final weights 
very much. This helps to keep down the variance of the survey weights.
    Missing data rates for characteristic data were very low, ranging 
from 1.4 percent to 2.4 percent. Compared to the 1990 PES, the rates of 
characteristic missing data are slightly higher for the age and sex 
characteristics and slightly lower for tenure and race. Again, this is 
indicative of good quality interviewing.
    The goal of A.C.E. interviewing quality assurance was to ensure 
that the interviewers did, in fact, visit the designated households, 
and to prevent systematic errors caused by a lack of knowledge or 
understanding. The evidence indicates that the A.C.E. interviewing 
quality assurance operation was properly implemented and successful. A 
total of 11.6 percent of the cases were subject to random or targeted 
quality assurance checks. We assume that the 88.4 percent of the cases 
not in quality assurance share the favorable error rates of the 
randomly selected cases (0.13 percent). This error rate may have been 
reduced further, as 171 of the remaining errors were corrected in the 
targeted QA sample.
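    For illustration only, the arithmetic behind this extrapolation can 
be sketched as follows; the workload total below is the approximate sum 
of the Table 3 cases, and treating the quality assurance rates as 
applying to that base is an assumption made here, not a published 
calculation.

# Sketch of the quality assurance extrapolation. The workload total is the
# approximate sum of the Table 3 cases; applying the QA rates to that base is
# an illustrative assumption.
total_cases = 300_913            # approximate person interviewing workload (Table 3)
qa_share = 0.116                 # 11.6 percent of cases received QA checks
random_error_rate = 0.0013       # 0.13 percent error rate in randomly selected QA cases
corrected_in_targeted_qa = 171   # remaining errors corrected in the targeted QA sample

unchecked_cases = total_cases * (1 - qa_share)
estimated_errors = unchecked_cases * random_error_rate
print(f"Estimated errors among unchecked cases: {estimated_errors:.0f}")
print(f"After targeted corrections: {estimated_errors - corrected_in_targeted_qa:.0f}")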

Matching and Follow-up

    Matching refers to the process of determining whether an individual 
enumerated in the A.C.E. was the same person as an individual 
enumerated in the census. The matching and follow-up

[[Page 14028]]

process also determines whether a census record in the E-Sample \36\ 
was complete and correct. Errors in matching can significantly affect 
undercount estimates; highly accurate matching and processing are an 
important component of A.C.E. methodology.
---------------------------------------------------------------------------

    \36\ The E-sample refers to the sample of census data defined 
person records selected for inclusion in the A.C.E. The P-sample 
refers to the independent sample of people included in the initial 
A.C.E. interview.
---------------------------------------------------------------------------

    Although neither Secretary Mosbacher nor the Committee on 
Adjustment of Postcensal Estimates (CAPE) identified matching error as 
a significant problem with the 1990 PES, the Census Bureau made 
significant improvements to the matching process in the 2000 A.C.E. 
design. The A.C.E. computer matched the P-sample to the census using 
the Census Bureau's Statistical Research Division Record Linkage 
System, a system that the Census Bureau has been developing, testing 
and using for nearly two decades. Clerical personnel at a centralized 
location reviewed records that were not matched by the computer 
matcher. The Census Bureau utilized an ample staff of over 200 clerks, 
46 technicians, and 16 analysts so that each successive level of review 
could perform quality assurance on the previous level. Higher level 
staff independently reviewed a sample of each employee's work, a 
process designed to identify random matching errors. Each of the 
matching levels improved on the previous level. The clerks matched what 
the computer could not. The technicians worked on any cases the clerks 
could not resolve and performed the quality assurance on the clerks' 
cases. Then the analysts finished any cases the technicians could not 
resolve and performed quality assurance on the technicians' cases.
    The results indicate computer matching of 69.6 percent of the P-
sample and 64.4 percent of the E-sample. The computer matcher assigned 
matches very conservatively. Numerous studies over the years have shown 
that this operation produces insignificant numbers of false matches. 
Therefore, all questionable matches, possible matches, and near matches 
are left for clerical review. All nonmatches were clerically reviewed.
    We have quality assurance results only on the quality of the 
clerical matching in the before follow-up stage and the first three 
stages of after follow-up. The Census Bureau measures matching quality 
relative to the results that would be produced by the Census Bureau's 
most experienced and best trained matchers, the 16 analysts permanently 
employed by the Census Bureau. The quality of the matching process is 
further measured in terms of changes made by the next level of review; 
this process tends to overstate the matching error, as not all changes 
are the result of erroneous matching. However, given these caveats, the 
outgoing quality rate (the final match rate) for before follow-up was 
well more than 99 percent. For after follow-up, the outgoing quality 
rate was also well more than 99 percent. These rates are calculated 
based on the before follow-up and the after follow-up workload and not 
on the total number of sample cases, that is, they do not include the 
cases matched by computer. These rates exceed expectations and are 
indicative of high quality matching.
    Person follow-up is also an important A.C.E. process. The follow-up 
resolves possible matches and, most importantly, determines which E-
sample nonmatches are, nonetheless, correctly enumerated in the census. 
The person follow-up interviews were conducted either by permanent 
census field staff or by experienced decennial interviewers and the 
quality assurance operation was targeted at ensuring that the interview 
was conducted. Of the randomly selected person follow-up quality 
assurance cases, 0.45 percent resulted in a discrepancy, that is, only 
0.45 percent determined that the person follow-up interview may not 
have been conducted. If we assume that the remaining 84,843 cases not 
randomly selected for quality assurance have the same failure rate, 
roughly 400 follow-up interviews in total may not have been conducted. 
In addition, we corrected 84 of those cases in the targeted samples.
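    The figure of roughly 400 cases follows from simple arithmetic on 
the numbers given above, sketched below for illustration.

# Arithmetic behind the "roughly 400 cases" figure for person follow-up QA.
cases_not_in_random_qa = 84_843   # follow-up cases not randomly selected for QA
discrepancy_rate = 0.0045         # 0.45 percent discrepancy rate in random QA cases
corrected_in_targeted_qa = 84     # cases corrected through the targeted QA samples

possibly_not_conducted = cases_not_in_random_qa * discrepancy_rate
print(f"Interviews possibly not conducted: {possibly_not_conducted:.0f}")
print(f"After targeted corrections: {possibly_not_conducted - corrected_in_targeted_qa:.0f}")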

Review of A.C.E. Quality

    The review in the previous section established that the A.C.E. was 
conducted as designed. This section will take the next step and 
evaluate the quality of the A.C.E. as implemented.
    Our review of A.C.E. quality has two aspects. First, we review the 
available data relating to selected individual components of A.C.E. 
error. The second part of the A.C.E. quality review synthesizes what is 
known about the components of error into a few indicators of overall 
relative accuracy for both the adjusted and the unadjusted census 
results.

Individual Components of A.C.E. Quality

Sampling Variance
Conclusions for This Section
    The A.C.E. significantly reduced sampling variance relative to the 
1990 PES. This result was achieved by nearly doubling the sample size 
coupled with significant improvements in the sample design.
Analysis Reports Important to This Section
     Report B-9: ``Accuracy and Coverage Evaluation Survey: 
Dual System Estimation Results,'' by Peter P. Davis.
     Report B-11: ``Accuracy and Coverage Evaluation Survey: 
Variance Estimates by Size of Geographic Area,'' by Michael D. 
Starsinic, Charles D. Sissel, and Mark E. Asiala.
Discussion
    The dual system estimate shows that the Census 2000 undercount rate 
for the national household population is 1.18 percent, with a standard 
error of 0.13 percent. The net undercount for the 1990 census was 
estimated at 1.61 percent, with a standard error of 0.20 percent (see 
table 2, above). Comparisons by poststrata between 1990 and 2000 are 
necessarily inexact as the universe differs (2000 includes only the 
household population) and the exact poststrata definitions are 
different. Still, some comparisons are instructive. The standard error 
for owners was reduced from 0.21 percent to 0.14 percent, and the 
standard error for non-owners fell from 0.43 percent to 0.26 percent. 
The measured standard error fell for all comparable race/origin groups 
and for each age/sex group. The estimated standard error was 
comparatively high for the two groups estimated separately for the 
first time: Hawaiian and Pacific Islanders (2.77 percent) and American 
Indians and Alaskan Natives living off reservation (1.33 percent). As 
we will see, these groups also had high levels of inconsistent 
reporting between the census and the A.C.E. The estimated standard 
error for American Indians living on reservations fell dramatically 
from 5.29 percent to 1.2 percent. The standard error for Asians was 
0.64 percent. For Hispanics, it was 0.38 percent, and for non-Hispanic 
Blacks it was 0.35 percent.
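    For illustration, the sketch below computes approximate 95 percent 
confidence intervals for the two national net undercount rates cited 
above; the normal approximation is an assumption made only for this 
illustration.

# Approximate 95 percent confidence intervals for the national net undercount,
# using a normal approximation (an illustrative assumption).
Z_95 = 1.96

def confidence_interval(estimate_pct, standard_error_pct, z=Z_95):
    half_width = z * standard_error_pct
    return estimate_pct - half_width, estimate_pct + half_width

for label, est, se in [("Census 2000 (A.C.E.)", 1.18, 0.13),
                       ("1990 census (PES)", 1.61, 0.20)]:
    low, high = confidence_interval(est, se)
    print(f"{label}: {est:.2f}% net undercount, 95% CI ({low:.2f}%, {high:.2f}%)")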
    Table 4 gives the estimated percent net undercount and standard 
errors for the 64 major poststratum groups. The standard errors for 
several groups are above 1 percent and for a few small groups are up to 
4 percent. Because the populations of these groups are small,

[[Page 14029]]

their high variances will have only limited impact on geographic 
variance.
     For Census 2000, persons can self-identify with more than 
one race group. For post-stratification purposes, persons are included 
in a single Race/Hispanic Origin Domain. This classification does not 
change a person's actual response. Further, all official tabulations 
are based on actual responses to the census.
     A negative net undercount denotes a net overcount.
BILLING CODE 3510-07-P
[GRAPHIC] [TIFF OMITTED] TN08MR01.000


[[Page 14030]]


[GRAPHIC] [TIFF OMITTED] TN08MR01.001

     For Census 2000, persons can self-identify with more than 
one race group. For post-stratification purposes, persons are included 
in a single Race/Hispanic Origin Domain. This classification does not 
change a person's actual response. Further, all official tabulations 
are based on actual responses to the census.
BILLING CODE 3510-07-C

[[Page 14031]]

    At the state level, the median coefficient of variation (CV) for 
state population totals dropped from 0.41 percent in 1990 to 0.24 
percent in 2000. More important, the median CV for the congressional 
districts dropped from 0.5 percent to 0.3 percent. Similar drops in the 
CV of 40 percent to 50 percent were estimated for counties and places 
larger than 100,000.
    This decrease in sampling variance is due to the much larger sample 
size of the A.C.E. relative to the PES: 300,913 housing units in 11,303 
clusters for the A.C.E., versus 165,000 housing units in approximately 
5,000 clusters for the 1990 PES. Better measures of population size in 
the sample selection of block clusters, better subsampling methods, 
better methods of treating ``small blocks,'' and a reduction in the 
variability of sampling weights all contributed to this reduction.
    One simple analysis was to compare the estimated undercount rates 
from the A.C.E. with their estimated confidence intervals; that is, to 
compare the undercounts among the 64 post-strata groups (collapsed over 
age and sex) with their confidence intervals. Of course, care must be 
taken in this analysis, with proper correction for multiple 
comparisons.
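    The sketch below illustrates one conventional way to make such a 
correction, a Bonferroni adjustment applied to a handful of 
hypothetical post-strata; neither the adjustment method nor the numbers 
shown are taken from Report B-9.

# Illustrative check of whether post-stratum undercount estimates differ from
# zero after a Bonferroni adjustment for multiple comparisons. The example
# post-strata and their numbers are hypothetical, not published A.C.E. values.
from statistics import NormalDist

groups = {
    "example stratum A": (2.10, 0.60),   # (percent net undercount, standard error)
    "example stratum B": (0.40, 0.35),
    "example stratum C": (3.80, 1.90),
}
alpha = 0.05
z_crit = NormalDist().inv_cdf(1 - alpha / (2 * len(groups)))  # Bonferroni critical value

for name, (est, se) in groups.items():
    significant = abs(est / se) > z_crit
    print(f"{name}: z = {est / se:.2f}, significantly different from zero: {significant}")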
    This analysis clearly showed that the A.C.E. results cannot be 
dismissed as ``simply variance.'' (See Report B-9, ``Dual System 
Estimation Results.'') A clear pattern of minority undercount, most 
pronounced for minority renters, emerged. This pattern is 
consistent with differential undercount patterns found in all prior 
censuses.
Consistent Reporting of Census Day Residence
Conclusions for this Section
    The consistency of reporting of Census Day address should be better 
than in 1990 due to the interviews occurring closer to Census Day and 
better quality interviewing made possible with the CAPI instrument.
Analysis Reports Important to This Section
     Report B-5: ``Accuracy and Coverage Evaluation Survey: 
Person Interviewing Results,'' by Rosemary L. Byrne, Lynn Imel, and 
Phawn Stallone.
Discussion
    Proper application of the DSE model requires consistent reporting 
of Census Day residence between the P and E-samples. If a person who 
was sampled in the P-sample reports a different Census Day residence 
than he/she reported in the E-sample, then that person could be 
considered both missed (based on the P-sample) and correctly enumerated 
(based on the E-sample), or conversely, both enumerated (based on the 
P-sample) and not correctly enumerated (based on the E-sample). Since 
many people fall only into either the P or the E-sample, measuring 
consistent reporting is an important task. When a person is in both the 
P and the E-sample, consistent reporting between the two systems is not 
a problem because we use the same interview for both samples. However, 
some individuals have two interviews, one in the P-sample and one in 
the E-sample. For example, we use the initial A.C.E. interview for 
individuals in the P-sample to determine their correct Census Day 
residence. However, if an individual was missed by the A.C.E. but 
included in the initial census, we would use the A.C.E. follow-up 
interview to determine Census Day residence. Even for matched people, 
if the person was duplicated by the census, we might have a different 
interview at each identified census household. Since these interviews 
use different survey questionnaires, and are administered at different 
times by different interviewers to potentially different respondents, 
there is a chance that the two interviews could result in different 
correct Census Day residences for the same person. Inconsistency in 
Census Day address reporting can influence the dual system estimates.
    The 1990 Evaluations (P studies) measured the consistency of 
reporting Census Day addresses in the PES by comparing the reinterview 
to the production results (see P-4, ``Address Misreporting''). One 
problem in 1990 was the misreporting of Census Day addresses, with an 
estimated 0.7 percent of the P-sample being erroneously reported as 
nonmovers. The 2000 A.C.E. improves on the 1990 PES, in particular 
because the CAPI instrument requires the interviewer to ask all 
questions on the interview form, a vast improvement over the 1990 PES 
pencil-and-paper interview.
    Two factors should have increased the consistency of reporting 
Census Day addresses. First is the time schedule: the A.C.E. interviews 
were conducted much closer to Census Day than were the 1990 PES 
interviews, which would normally increase the accuracy of recall. 
Second, the CAPI interview instrument forced the interviewers to ask 
all probes as to Census Day residence, again probably increasing 
consistency. On the other hand, the A.C.E. interview usually used proxy 
respondents for movers, where the 1990 PES normally interviewed the 
mover households themselves. This has an unknown effect on consistency, 
and we have no direct data on it at this time.
Matching Error
Conclusions for This Section
    The matching error rate for 2000 is low with indications that it is 
substantially lower than that achieved in 1990.
Analysis Reports Important to This Section
     Report B-6: ``Person Matching and Follow-up Results,'' by 
Danny R. Childers, Rosemary L. Byrne, Tamara S. Adams, and Roxanne 
Feldpausch.
Discussion
    Matching error refers to assigning the incorrect code to a P-sample 
record. Matching error can consist of assigning a code of ``matched'' 
to a true nonmatch case and vice-versa. It can also consist of 
assigning an unresolved code to a case that has sufficient information. 
Matching errors can directly influence the final dual system estimates. 
Matching errors have both a random and a systematic component. The 
random component will be partially reflected in the overall variance 
estimates.
    Matching error was measured in 1990 by conducting a rematch study, 
that is, by going back after the fact and rematching a sample of cases. 
(P-7, ``Estimates of P-sample Matching Error from a Rematching 
Evaluation,'' P-10, ``Measurement of the Census Erroneous Enumeration 
Clerical Error Made in the Assignment of Enumeration Status''). A study 
of clerical error in the 1990 PES found error in coding matches (P-5a, 
``Analysis of Fabrications from Evaluation Follow-up Data'') and 
erroneous enumerations (P-6, ``Fabrication in the P-sample--Interviewer 
Effect''). In 1990, codes were entered into a computer system, but the 
actual matching and duplicate searches were done using paper. We 
expected A.C.E. matching to be better controlled and more efficient 
because the clerical matching and quality assurance were fully 
automated and the matching was conducted at a single site. The 
automated interactive system does not prevent all matching error but 
should reduce the chances for error significantly. Our results 
confirmed these expectations.
    The 1990 matching system matched both nonmovers (within E-sample 
area) and in-movers (who could be coded and matched in any area). The 
1990 mover match system not only included several additional steps 
(mainly to

[[Page 14032]]

geographically code the Census Day address) but was also completely 
clerical. For the A.C.E., all matching was within the sample area or 
its surrounding blocks. The 2000 matching system, used to match both 
nonmovers and out-movers, was significantly more automated than the 
1990 system, with less clerical matching, and all clerical matching 
operations were conducted at one location. Comparisons to 1990 must 
take these changes into account. 
\37\
---------------------------------------------------------------------------

    \37\ The A.C.E. design treats movers differently than in 1990, 
using a procedure called PES-C, rather than the 1990 procedure, PES-
B. In 1990, movers were sampled where they lived at the time of the 
PES interview. The Census Bureau then searched the census records at 
the movers' April 1 usual residence to determine if they had been 
correctly enumerated in the census. This procedure was PES-B. In the 
PES-C procedure, the Census Bureau combined information on movers 
from two sources to produce an estimate of movers who were missed in 
Census 2000. First, an estimate of the total number of movers was 
calculated based on people who moved into the A.C.E. sample blocks 
between April 1, 2000, and the time of the A.C.E. interview. Second, 
the rate at which movers matched to Census 2000 was calculated by 
matching the Census Day residents of the A.C.E. sample housing units 
to the initial census records.
---------------------------------------------------------------------------
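    The PES-C computation described in footnote 37 can be illustrated 
with a simplified sketch; the counts below are invented, and the 
production computation is carried out within post-strata using survey 
weights.

# Simplified sketch of the PES-C treatment of movers (footnote 37). All counts
# are invented; the production computation uses survey weights within
# post-strata.
in_movers = 1_000          # estimated total movers, from people who moved into
                           # the A.C.E. sample blocks after April 1, 2000
out_movers = 950           # Census Day residents of sample housing units who moved out
out_mover_matches = 760    # out-movers who matched to initial census records

mover_match_rate = out_mover_matches / out_movers
# The estimated number of movers counted in the census combines the two sources.
movers_counted = in_movers * mover_match_rate
print(f"Mover match rate: {mover_match_rate:.3f}")
print(f"Estimated movers counted in the census: {movers_counted:.0f}")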

    Other examples of the improvements in matching included:
     Electronic filtering allowed searching within a particular 
search area based on first name, last name, characteristics, and 
addresses. For example, the system allowed searching for all people 
named George, all people whose last name began with an H, all people on 
Elm Street, or all people between 30 and 40 years old.
     Only particular codes that fit the situation were allowed. 
For example, only P-sample nonmatch codes could be assigned a P-sample 
nonmatch after follow-up code.
     The electronic searches for duplicates reduced the tedious 
searching through paper lists of census people. The searching in 1990 
was limited to printouts in two sorts: last name and household by 
address. In 2000, the clerks could search electronically by name, 
address, and other characteristics to help identify duplicates.
     Computer images of the Census questionnaire were easily 
accessible.
     The system monitored whether the matcher completed all the 
necessary searches such as looking for duplicates.
     Built in edits checked for consistent coding. For example, 
codes that applied to a household were assigned to all people in the 
household, such as a geographic code.
     The system automatically assigned certain codes, 
minimizing coding error.
     A code to indicate that the case needed review at the next 
level of matching was available to the clerical matchers. This code 
allowed them to flag unusual cases to be done by a person with more 
experience.
     All quality assurance for the clerical matching was 
automated. Therefore, the quality assurance component of the operation 
could not be skipped in 2000.
     Clerical matching was centralized at the National 
Processing Center instead of having different groups of matchers in 
seven processing offices, as was done in 1990. Forty-six technicians 
were hired in September 1999 and thoroughly trained in the design of 
the A.C.E. and matching of people and housing units. These technicians 
performed the quality assurance for the clerical matchers. 
Additionally, 16 analysts were our most experienced matchers. The 
analysts performed the quality assurance for the technicians and 
handled the most difficult cases.
    The results of the matching quality assurance program constitute 
the primary information available for assessing the matching operation. 
This program gives us information about the level of error relative to 
that of our most experienced matching specialists. It should be noted 
that many of these same individuals participated in the 1990 PES. The 
results of the quality assurance process noted above and in B-6 show 
that we achieved a very high level of matching quality. The majority of 
cases were computer matched. The change rate for the clerical operation 
(the rate of cases that the next level of review concludes must be 
changed) is very low and is, in any event, an upper bound on the error 
rate.
A.C.E. Fabrications
Conclusions for This Section
    Fabrication was more tightly controlled in the A.C.E. than it was 
in the 1990 PES because of the tighter field management control made 
possible by the CAPI instrument.
Analysis Reports Important to This Section
     Report B-6: ``Accuracy and Coverage Evaluation Survey: 
Person Matching and Follow-up Results,'' by Danny R. Childers, Rosemary 
L. Byrne, Tamara S. Adams, and Roxanne Feldpausch.
Discussion
    Inclusion of fictitious people in the dual system estimates can 
create a bias unless the number of fictitious people is held to a 
small level. Fictitious records have little chance of being matched 
between the P and E-samples, which means that they can erroneously 
increase the undercount estimates. Fictitious records, of course, 
should not be included in either the P-sample or the census. 
Fabrications in the initial census are measured by the E-sample (See 
below). Here we concentrate upon fabrications in the P-Sample.
    In 1990, the level of fabrication in the P-sample was measured by 
three studies evaluating different measures of potential fabrication. 
The first study (P-5, ``Analysis of P-sample Fabrications from PES 
Quality Control Data'') evaluated interviewer fabrication detected in 
the quality control operation (and rectified by the QC operation), as 
well as fabrication detected in the follow-up operation. The estimated 
number of fabricated persons remaining, at the national level, after 
the quality control operation was approximately 0.13 percent. The 
second study, using data from the 1990 Evaluation Follow-up, concluded 
that an additional 0.09 percent (weighted to the PES unweighted totals 
this figure represents 0.03 percent of the total sample) of the P-
sample follow-up interviews included in the evaluation sample should 
have been coded as fictitious. (P-5a, ``Analysis of Fabrications from 
Evaluation Follow-up Data''). This evaluation was designed to identify 
P-sample fabrication not detected by the quality control procedure. A 
third study (P-6, ``Fabrication in the P-sample--Interviewer 
Effect'') compared the nonmatch rates of interviewers working in 
similar areas, on the assumption that deviations from the expected 
nonmatch rate may indicate undetected curbstoning. This study used a 
model to predict nonmatch rates and found that between 0.9 percent and 
6.5 percent of the interviewers had high nonmatch rates, rates that may 
have corresponded to dishonesty in their data collection.
    We have evaluated potential fictitious records in the A.C.E. by 
reviewing detailed quality assurance results that document the level of 
detected fabrications in the initial A.C.E. interview, as well as 
measures of residual fabrication. In addition we have the results of 
the Person Follow-up interviewing, which should have detected whole 
household P-sample fabrications not detected by the interviewing 
quality assurance program. These sources allowed us to evaluate the 
level of A.C.E. fabrication.
    The evidence indicates that the quality assurance was successful in 
controlling A.C.E. fabrications. Because the A.C.E. interview was taken 
on the CAPI instrument, it was ``time stamped'' so that field staff 
could use automated reports to quickly detect interviewers

[[Page 14033]]

who reported suspicious interviews, such as rapid successive 
interviews, interviews at odd hours (for example, late at night), and 
other implausible interview results. The CAPI instrument allowed 
field management staff to tightly monitor the behavior of the A.C.E. 
interviewers.
    In addition, we examined the data to look for information relating 
to clusters, because fabrication is often highly clustered. An 
otherwise acceptable interviewer might, for example, suddenly fabricate 
his or her last assignment. The matching analysts kept a detailed 
record of any unusual clusters. These analysts could request special 
questions during follow-up or send additional cases to follow-up 
interviewing if they questioned the integrity of one interviewer's 
results. These records would provide an additional clue to whether 
there was substantial, clustered fabrication in the P-Sample. Analysts 
had the discretion to remove cases they believed to have been 
fabricated.
Missing Data
Conclusions for This Section
    The level and pattern of missing data in the A.C.E. is comparable 
to that of the 1990 PES. The effect of the missing data on the overall 
A.C.E. quality is similar to that experienced by the 1990 PES and 
documented in the P studies.
Analysis Report Important to This Section
     Report B-7: ``Accuracy and Coverage Evaluation Survey: 
Missing Data Results,'' by Patrick J. Cantwell.
Discussion
    Missing data can introduce uncertainty into DSE results. Missing 
data can contribute to variance and, if the missing data models are 
poorly specified, can also contribute to bias and differential bias.
    Missing data has three components:

 Whole household noninterviews
 Unresolved match, residence, or enumeration status
 Missing demographic characteristics

    This section focuses on the first two components of missing data: 
whole household noninterviews and unresolved match, residence, or 
enumeration status. The third component of missing data, missing 
post-stratification variables, will generally result in correlation bias 
or synthetic error and will be evaluated in connection with the 
analysis reports on those topics. Missing post-stratification variables 
tend to lead to correlation bias or synthetic error because this 
omission can increase heterogeneity and inconsistent post-
stratification between the initial census and the A.C.E. High levels of 
missing data, particularly for unresolved match, residence, or 
enumeration status, also tend to increase variance. We have not 
evaluated how this type of missing data by itself increases variance 
because this component is largely picked up in our measure of sampling 
variance.
    The 1990 PES dealt with movers by using Procedure-B. Under 
Procedure-B, missing data can occur when the interviewer fails to get 
information from the respondent, in either the initial interview or the 
follow-up interview, or the missing data can occur during follow-up. 
The 1990 PES had low rates of initial missing data, but a greater 
number of unresolved cases in the follow-up process. Procedure-B 
required geocoding before matching, making it possible that completed 
``mover'' cases could not be used because of ambiguities in the 
geographic coding. Procedure-B, therefore, resulted in initially low 
rates of missing data but was responsible for additional missing data 
in later processes.
    The effects of missing data on the 1990 results were studied in two 
ways. First, the modeled results were compared to the results of 
further field work on the nonresponse cases (P-3, ``Evaluation of 
Imputation Methodology for Unresolved Match Status Cases''). The field 
work largely validated the models. This alone is extremely important 
work as it clearly demonstrated that some of the extreme missing data 
adjustments sometimes proposed (for example assuming all nonresponse 
cases were missed) were not supported by the data. Second, additional 
1990 studies (P-1, ``Analysis of Reasonable Imputation Alternatives'') 
tended to show the robustness of the results to reasonable 
alternatives.
    There have been two important changes for Census 2000 that might 
affect missing data rates. First, we expected that the level of missing 
data in the A.C.E. interview might be higher because of a change in how 
we treated movers. In 1990 the Census Bureau only needed to interview 
the current residents, whereas in Census 2000, interviewers required 
information about both the current (A.C.E. Interview Day) residents and 
the Census Day residents. On the other hand, Procedure C, which we used 
in the A.C.E., eliminated the need to geographically code the Census 
Day address of ``in-movers,'' thus eliminating one potential source of 
missing data. Second, the CAPI instrument kept the interviewer on the 
correct set of questions and allowed for tight managerial control.
    The A.C.E. used a different missing data model for unresolved match 
and residence status. The 1990 model was based on hierarchical logistic 
regression, while the 2000 model used the far simpler ``Imputation Cell 
Estimator.'' The input data and behavioral assumptions between the two 
models are similar but not identical.
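    A minimal sketch of an imputation cell estimator of the general 
kind described here follows; the cells and records are hypothetical and 
are not the actual A.C.E. specification.

# Minimal sketch of an imputation cell estimator: unresolved cases are assigned
# the rate observed among resolved cases in the same cell. Cells and records
# are hypothetical, not the actual A.C.E. specification.
from collections import defaultdict

# (cell, resolved_flag, matched_flag) for a handful of hypothetical P-sample people
people = [
    ("owner",  True,  True), ("owner",  True,  True), ("owner",  True,  False),
    ("owner",  False, None), ("renter", True,  False), ("renter", True,  True),
    ("renter", False, None), ("renter", False, None),
]

totals = defaultdict(lambda: [0, 0])   # cell -> [resolved count, resolved matches]
for cell, resolved, matched in people:
    if resolved:
        totals[cell][0] += 1
        totals[cell][1] += int(matched)

for cell, (n_resolved, n_matched) in totals.items():
    cell_rate = n_matched / n_resolved
    print(f"{cell}: unresolved cases assigned match probability {cell_rate:.2f}")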
    The A.C.E. was able to maintain high quality interviewing and keep 
the level of missing data to low levels. This low level of missing data 
minimizes the effect on the final estimates of the missing data 
assumptions.
    Noninterview in the P-sample: A.C.E. interview rates were very 
high. Among occupied housing units, the rates were 97.1 percent for 
Census Day and 98.8 percent for A.C.E. Interview Day. This compares to 
98.4 percent (unweighted) in the 1990 PES. Due to the high response, 
most of the noninterview adjustment factors were very close to one, 
and the resulting changes to the weights were small. This helps to keep 
down the variance of the 
survey weights.
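    In simplified form, the noninterview adjustment works as sketched 
below; the cell counts and base weight are hypothetical.

# Simplified noninterview adjustment: within a weighting cell, the weights of
# interviewed households are inflated to account for eligible households that
# were not interviewed. The counts and weight are hypothetical.
eligible_occupied_units = 1_000     # occupied housing units in a weighting cell
interviewed_units = 988             # units with a completed interview (98.8 percent)
base_weight = 250.0                 # hypothetical initial survey weight

adjustment_factor = eligible_occupied_units / interviewed_units
adjusted_weight = base_weight * adjustment_factor
print(f"Noninterview adjustment factor: {adjustment_factor:.4f}")   # close to one
print(f"Adjusted weight: {adjusted_weight:.1f}")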
    Unresolved resident status in the P-sample: The proportion of 
people with unresolved residence was very low, 2.2 percent. Thus, it 
appears that missing this item has only a minor effect on the 
estimation process. The missing data procedures assigned an average 
resident probability of 82.6 percent to people with unresolved resident 
status, which was, as designed, lower than the average rate among 
people with resolved status (98.2 percent).
    Unresolved match status in the P-sample: Only 1.2 percent of the 
sample had unresolved match status, compared to 1.8 percent in the 1990 
PES. We assigned an average match rate of 84.3 percent to people with 
unresolved match status, compared to 91.7 percent for those with 
resolved status. The low rate of unresolved match status implies only a 
small effect on the estimation.
    Unresolved enumeration status in the E-sample: About 2.6 percent of 
the E-sample had unresolved enumeration status; it was 2.3 percent in 
the 1990 PES. The average rate of correct enumeration for people with 
unresolved status was 76.2 percent as compared with the 95.9 percent 
for those with resolved status.
    The level and direction of the differences between resolved and 
unresolved cases are generally what we expected and are explainable by 
the design of the missing data estimation.

[[Page 14034]]

Balancing Error
Conclusions for This Section
    Although detailed information is not yet available, the evidence 
now available does not permit us to conclude that there was no 
balancing error in 2000. One concern is that a number of E-sample cases 
were coded as correct even though they were outside the search area. 
This concern is discussed in a following section.
Analysis Report Important to This Section
     Report B-8: ``Accuracy and Coverage Evaluation Survey: 
Decomposition of Dual System Estimate Components,'' by Thomas Mule.
     Report B-18: ``Accuracy and Coverage Evaluation Survey: 
Effect of Targeted Extended Search,'' by Douglas B. Olson.
Discussion
    Balancing error occurs when the set of correct enumeration records 
defined and measured in the E-sample does not correspond to the set of 
records against which P-sample matching is allowed. An important type 
of balancing error occurs when the search area, as defined and 
implemented in the E-sample, does not correspond to the search area as 
defined and implemented in the P-sample. The dual system model first 
determines the number of individuals who are correctly in the initial 
census (through the E-sample) and then the proportion of the true 
population that is correctly in the census (through the P-sample). If 
the E-sample and the P-sample use different definitions of ``correctly 
in the census,'' the model will not work. Specifically, if the P-sample 
allows a match (that is, treats a person as correctly enumerated) when 
he or she was found anywhere in a wide area, but the E-sample treats as 
erroneous (that is, not correctly enumerated) any census record not in 
its correct block, then the P and E-samples are using different 
definitions of what constitutes a correct enumeration. Obviously, 
there would also be balancing error if the E-sample definition was 
broad, but the P-sample definition was narrow.
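    In simplified form, the dual system estimate for a post-stratum 
combines these two pieces as sketched below; the counts are 
hypothetical, and the production estimator works with weighted totals 
and further corrections.

# Simplified dual system estimate (DSE) for one post-stratum. Counts are
# hypothetical; the production estimator uses weighted totals and further
# corrections (missing data, targeted extended search, and so on).
census_count = 10_000          # data-defined census persons in the post-stratum
e_sample_correct_rate = 0.95   # share of E-sample records that are correct enumerations
p_sample_total = 1_200         # weighted P-sample persons
p_sample_matches = 1_100       # weighted P-sample persons matched to the census

correct_enumerations = census_count * e_sample_correct_rate
match_rate = p_sample_matches / p_sample_total   # share of true population in the census
dse = correct_enumerations / match_rate
net_undercount_pct = 100 * (dse - census_count) / dse
print(f"Dual system estimate: {dse:.0f}")
print(f"Implied net undercount: {net_undercount_pct:.2f} percent")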
    Balancing error, especially geographic balancing error, was a major 
concern in the 1980 post enumeration survey. The E-sample in 1980 
counted a person as being correctly in the census only if he or she was 
counted in the correct Enumeration District. Enumerations outside the 
correct enumeration district were considered erroneous. However, the P-
sample in 1980 searched several enumeration districts looking for a 
match. Thus some P-sample people were considered correctly enumerated 
because they matched to census records that would have been considered 
erroneous had these records been included in the E-sample. This 
particular problem was addressed in the 1990 PES by using identical 
search areas for nonmovers. A concern remained for movers. (P-11, 
``Balancing Error Evaluation'').
    The A.C.E. used a somewhat more complex balancing design than did 
the 1990 PES. One minor change was that the search area in 2000 was 
somewhat smaller, encompassing only the first ring of blocks of housing 
units around a census block. More important, not all cases were 
eligible for searching, coding, and matching in the surrounding ring; 
only whole household nonmatches and E-sample geocoding errors were 
eligible for surrounding block search. This search area is referred to 
as ``Targeted Extended Search'' or TES. The TES surrounding block 
search was also performed on a sample basis.
    A major goal of extended search, whether targeted or not, is to 
reduce the variance of the estimators, especially for small estimation 
cells where census geocoding errors will not tend to cancel out. To 
assess the effect of TES, we compared correct enumeration rates and 
match rates for TES and non-TES cases.
    Extended search can reduce A.C.E. bias due to A.C.E. P-sample and 
E-sample geocoding errors. If an A.C.E. address listing includes 
housing units outside the actual block, as defined by the census, an 
attempt to match only to the sample block will usually result in 
nonmatches for all units actually outside the block. This situation can 
lead to a falsely high measure of census omission; extending the 
search to the surrounding blocks reduces this bias. Extended search 
essentially converts a first order matching bias to a second or third 
order sampling bias.
    In addition, it is possible for the A.C.E. E-sample follow-up to 
incorrectly code a housing unit as inside a block when the unit is 
actually just outside the block. Without extended search, this 
discrepancy would result in a unit coded ``correctly enumerated'' that 
was actually a geocoding error. With extended search, the enumeration 
of the unit is correct whether coded to the actual block or a 
surrounding block. Obviously, if the unit was actually located 
completely outside the search area, coding it to the block or a 
surrounding block (that is, ``correctly enumerated'') would be an 
error. There is evidence that this type of coding sometimes occurred, 
as discussed below.
    A review of the results of the Targeted Extended Search (TES) 
program indicated an imbalance between P-sample matches to the surrounding 
block and E-sample enumerations coded as ``correctly counted in the 
surrounding block.'' Ideally, these should be similar. This result 
raised concerns. However, it is consistent with the presence of a small 
amount of A.C.E. P and E-sample geocoding error. Similar results were 
encountered in 1990. An imbalance may be due to the geographic 
miscoding of E-sample cases discussed below.
Errors in Measuring Census Erroneous Enumerations
Conclusions for This Section
    In general, the evidence suggests that with the possible exception 
of geographic miscoding, E-sample coding errors were controlled at 
least as well as in 1990. However, preliminary results from an early 
A.C.E. evaluation indicate that a number of E-sample cases coded as 
correct enumerations were in fact outside of the search area. That 
means that they should have been coded as Erroneous Enumerations and 
subtracted from the DSEs. This error could introduce an upward bias in 
the DSE.
Analysis Report Important to This Section
     Report B-6: ``Accuracy and Coverage Evaluation Survey: 
Person Matching and Follow-up Results,'' by Danny R. Childers, Rosemary 
L. Byrne, Tamara S. Adams, and Roxanne Feldpausch.
    In addition,
     DSSD Memorandum Series T-6: ``Additional Geographic Coding 
for Erroneously Enumerated Housing Units,'' by Danny R. Childers and 
Xijian Liu.
Discussion
    Erroneous enumerations occur in the initial census in the following 
circumstances:
     When an individual had another residence where he or she 
should have been counted on Census Day.
     When an entry is fictitious.
     When entries are duplicated.
     When an individual lived in a housing unit subject to 
geocoding error.
     When the Census Bureau had insufficient information for 
matching and follow-up.
    Errors in measuring census erroneous enumerations can have a 
serious and direct impact on the A.C.E. For example, a systematic 
tendency in A.C.E. processing to code census fictitious cases 
(``curbstoned cases'') as

[[Page 14035]]

E-sample follow-up ``noninterviews'' leads to an incorrect estimate of 
the number of respondents correctly enumerated in the initial census. A 
tendency to ``give the census the benefit of the doubt'' can result in 
people who moved out before Census Day being coded as correct 
enumerations. While the 
overlapping of the P and E-samples will lend considerable robustness to 
the A.C.E. estimates, both systematic and random errors can be expected 
to occur.
    E-Sample cases are either coded during the initial matching 
operation or coded based on information gathered during A.C.E. follow-
up. For the A.C.E., we assessed errors in measuring census enumeration 
by analyzing the matching systems' quality assurance results, as well 
as by using information from A.C.E. follow-up. The quality assurance 
program should have indicated any clerical problems in assigning 
enumeration status.
    The Census Bureau found clerical error in assigning erroneous 
enumerations in 1990 (P-10, ``Measurement of the Census Erroneous 
Enumeration Clerical Error Made in the Assignment of Enumeration 
Status''). The improvements in Census 2000 clerical matching (described 
earlier) should have improved the assignment of erroneous enumerations. 
The identification of duplicates was closely monitored to assure that 
the duplicate search was done within the block cluster and in the 
surrounding blocks for TES clusters. The follow-up interview has been 
improved to instruct the interviewer to conduct sufficient searches for 
people to allow accurate coding of fictitious people. The conclusion 
was that the follow-up interviewing was in both managerial and 
statistical control.
    The A.C.E. matching and follow-up quality assurance results 
referenced in the Matching and Follow-up section above indicate that 
these processes were well controlled and that these errors were no 
worse than in 1990.
    The one area of concern is the coding as ``correct'' of E-sample 
cases that were actually outside the search area. Preliminary results 
from an early A.C.E. evaluation indicate that a number of cases that 
were coded as ``correctly enumerated'' were in fact outside the search 
area. This means that the E-sample process accepted a number of 
records as correct when they were in fact erroneous. This would 
understate the gross census overcoverage rate and thus overstate the 
census net undercount.
Correlation Bias
Conclusions for This Section
    Correlation bias is documented for the Black male population and is 
almost certainly present for certain non-Black populations, including 
the non-Black Hispanic population. Unfortunately, evidence on the level 
of correlation bias is weak.
Analysis Reports Important to This Section
     Document 12: ``Accuracy and Coverage Evaluation Survey: 
Correlation Bias'' by William R. Bell.
     Document 4: ``Accuracy and Coverage Evaluation Survey: 
Demographic Analysis Results'' by J. Gregory Robinson.
Discussion
    Correlation bias is the term frequently used to refer to error 
caused by individuals systematically missed in both the initial census 
and the coverage measurement survey. In its purest form, dual system 
estimation assumes that the chance of being included in the P-sample is 
independent of the chance of being correctly included in the initial 
census. Although this assumption has proven useful in providing a 
better estimate of the population, it is, of course, unlikely to be 
absolutely true. Correlation bias can occur from two sources. First, it 
can be caused by inherent heterogeneity within the post-strata. It can 
also arise when the event of being enumerated in the census changes the 
probability of being included in the A.C.E.
    Even within post-strata there may be unobservable sub-groups with 
differing chances of being included in each system. There is also quite 
likely some group (of an indeterminate size) whose probability of being 
included in any survey is so low as to be effectively zero. Correlation 
bias will tend, therefore, to lead to an underestimate of the 
population. Dual system estimation will estimate some, but not all, of 
the people omitted from the initial census.
    Correlation bias is a bias in the dual system estimator. That is, 
it must be considered in light of both the initial census interview and 
the A.C.E. interviewing and processing. Correlation bias due to 
heterogeneity can be reduced either because the initial census was more 
successful in including the ``hard to count,'' or because the A.C.E. 
was more successful in including the ``hard to count.'' The census paid 
advertising and outreach campaign, especially that targeted to ethnic 
minorities including Hispanics, could have the effect of reducing 
correlation bias in the 2000 DSE.
    To measure correlation bias, one would ideally like to have an 
external measure of ``truth.'' Demographic analysis, especially 
demographic sex ratios, has in the past provided an external measure 
that, while not perfect, is useful because it is not subject to many of 
the limitations of the initial census or the dual system estimates. As 
discussed later in this document, comparisons with demographic analysis 
are increasingly difficult.
    Using demographic results, the 1990 studies detected a clear 
pattern of correlation bias in the 1990 PES (P-13, ``Use of Alternative 
Dual System Estimators to Measure Correlation Bias''). Correlation bias 
was especially strong for adult Black males, a group that dual system 
estimation methodology seems to underestimate.
    Recent criticisms of the 1990 studies seem to point to the fact 
that these studies underestimated the level of correlation bias in the 
1990 PES. This conclusion follows from the fact that, in general, 
correlation bias tends to lower the estimated population, while other 
measurement errors tend to raise the estimate. Correlation bias and the 
other kinds of errors therefore may have tended to cancel each other 
out. However, this reasoning applies to comparisons of the 1990 PES 
estimates to demographic analysis population totals. If comparisons are 
instead made to the demographic analysis sex ratios (as was done in the 
P-13 report), and if the other measurement errors are not very 
different between males and females, then these other measurement 
errors should tend to cancel out and have little effect on resulting 
estimates of correlation bias. Note that comparability problems arising 
from Black Hispanics, whom DA assigns to the Black population and the 
A.C.E. assigns to the non-Black population, are expected to have minor 
effects on sex ratios for 2000. However, we have not fully analyzed the 
data that support this expectation.
    An additional problem is that since demographic analysis provides 
national results, one must model how these errors might distribute 
themselves by post-strata. Several alternative models have been tried. 
(P-13, ``Use of Alternative Dual System Estimators to Measure 
Correlation Bias''; Bell, ``Using Information from demographic analysis 
in Post-Enumeration Survey Estimation,'' 1993).
    A final problem arises from the nature of the preliminary 2000 
demographic analysis results, discussed below. These results imply a 
level and pattern of net undercount different from that in any

[[Page 14036]]

previous census studied or from that measured by the A.C.E. Indeed, 
even some of the comparisons of sex ratios, normally the most robust 
aspect of demographic analysis, are quite different from those observed 
in previous censuses. These results make quantifying correlation bias 
even more difficult for Census 2000 than in previous censuses.
    The level of correlation bias in the A.C.E. might be larger than 
that in the 1990 PES because of the use of Procedure C for movers. 
Procedure C was designed to reduce matching error by eliminating mover 
matching. However, since this procedure calls for the reconstruction of 
the Census Day household, its use may increase correlation bias because 
it may result in the ``missing'' of individuals only tenuously 
connected to the household. Weighting the out-mover match-rate by the 
number of in-movers may partially, but probably not completely, 
compensate for the possible increase in correlation bias. Even among 
out-movers, those more likely to be enumerated in the initial census 
may be more likely to be picked up in the A.C.E. interview. Because of 
this potential correlation, we might overestimate the mover match rate.
    Our analysis of correlation bias in the 2000 estimates was, as in 
1990, based on the sex ratios from demographic analysis. It is limited 
to only measuring the correlation bias of Black adult males and non-
Black adult males. The method assumes no correlation bias for females 
and cannot be applied to the A.C.E. estimates for children. 
Essentially, it assumes that any shortfall of the number of males 
relative to females, as implied by the demographic analysis sex ratios, 
is attributable to a correlation bias for males. This analysis 
demonstrates the presence of correlation bias for adult Black males. 
The implied level is similar to that observed in 1990. Specifically, 
our analysis concludes that there is significant correlation bias for 
adult Black males 18-29 and 30-49 at levels very similar to 1990. There 
also is significant correlation bias for adult Black males 50+ that is 
smaller in magnitude than in 1990. Comparisons to demographic analysis 
sex ratios suggest at most small amounts of correlation bias for non-
Black males 30-49 and 50+. The correlation bias estimates for these 
groups are very small, though they were not much larger in 1990. Due to 
inconsistency of demographic analysis and A.C.E. data for non-Blacks 
18-29, we cannot estimate correlation bias for males in this group.
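    The sex ratio method can be sketched as follows; the inputs are 
invented, and the actual analysis in Document 12 is carried out by age 
group with additional modeling.

# Sketch of the sex ratio method for correlation bias. It assumes no
# correlation bias for females; any shortfall of males relative to the
# demographic analysis (DA) sex ratio is attributed to correlation bias for
# males. All inputs are invented, not actual 2000 results.
dse_females = 5_000_000        # dual system estimate for adult females in a group
dse_males = 4_600_000          # dual system estimate for adult males in the same group
da_sex_ratio = 0.95            # males per female implied by demographic analysis

expected_males = dse_females * da_sex_ratio
correlation_bias_males = expected_males - dse_males
print(f"Expected males from DA sex ratio: {expected_males:,.0f}")
print(f"Implied correlation bias for males: {correlation_bias_males:,.0f}")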
    Determining the level of correlation bias for the non-Black 
population is problematic because for some age groups, demographic 
analysis sex ratios imply fewer males than measured by the A.C.E. Taken 
at face value, this result would mean either negative correlation bias 
for males (which has never been observed and is difficult to explain) 
or larger correlation bias for females than for males. Positive 
correlation bias for females is not only possible but likely. However, 
what is also likely is that using initial DA to measure correlation 
bias for non-Blacks using sex ratios has become problematic. This 
conclusion is important since the majority of the Hispanics, as well as 
of course other minority groups, are non-Black. A frequently expressed 
concern about the DSE methodology is the possibly large level of 
correlation bias for Hispanics.
    This analysis only detects differential correlation for the Black 
and non-Black population. We have no measure for correlation bias for 
children or females, nor any separate measure for Hispanics, Asians, or 
other separate ``non-Black'' groups.
    We also examined records and reports for any indication of 
correlation bias due to causal dependence, that is, any indication that 
participation (or non-participation) in the initial census directly 
influenced participation in the A.C.E. For example, we looked at the 
number of letters (approximately 80) received from households that were 
reluctant to participate in the A.C.E. because they had already sent in 
their census form. We looked for reports from the regional offices to 
see if there was any indication of improper contact between the census 
enumerators and the A.C.E. interviewers. We found no reports or other 
evidence to support a problem with causal dependence.
    There were also concerns about the effects on correlation bias of 
the ``late census adds'' and the higher level of imputations. This is 
discussed below. (See Other Measurement and Technical Errors.)
Synthetic Assumptions
Conclusions for This Section
    Local census heterogeneity exists and affects the quality of both 
the adjusted and unadjusted census results. Properly accounting for the 
synthetic bias in the loss functions could potentially reverse a 
finding of small improvement, or small deterioration, from adjustment. 
This effect warrants further examination.
Analysis Report Important to This Section
     Report B-14: ``Accuracy and Coverage Evaluation 2000: 
Assessment of Synthetic Assumptions'' by Donald J. Malec and Richard A. 
Griffin.
Discussion
    Synthetic estimation error differs from the other measurement 
errors discussed in this document because it is not directly related to 
the accuracy of the dual system estimates themselves, but rather to the 
distribution of the measured net undercount to local areas and 
demographic subgroups.
    Another important difference between synthetic error and other 
types of A.C.E. error is that local heterogeneity is present in the 
unadjusted census; this local heterogeneity will affect the quality of 
census results even before A.C.E. adjustment. While this local 
heterogeneity is not, strictly speaking, synthetic error, since no 
synthetic estimation is involved, the effect of local heterogeneity on 
the accuracy of the population estimates is similar. If local 
heterogeneity in the initial census is correlated with post-
stratification variables, then the DSE/synthetic estimation process can 
reduce this heterogeneity. However, if a crew leader applied the census 
procedures in a way that resulted in a locally higher net undercount, 
then the DSE/synthetic model would not correct for this effect locally. 
Evaluations of the synthetic assumption help us to understand residual 
heterogeneity in both the initial and the corrected census.
    Evaluations of the synthetic assumption are necessarily indirect. 
Because the A.C.E. is based on a sample, it may be inefficient at 
detecting truly local heterogeneity. Attempts at measuring local 
heterogeneity at the block cluster level suffer from the problem that 
the A.C.E. is not designed to directly measure the undercount, even for 
the sample clusters. Targeted extended search and large-block 
subsampling, for example, both allow matching beyond the sample 
segments. The A.C.E. is designed to measure undercount at high levels, 
not at the local level.
    However, other data are available for all census areas. Some of 
these data may be related to the net undercount, although in perhaps 
complex ways. These data include the level of census whole person 
imputations and the level of census ID's removed from the census as 
part of HUDO and then reinstated. These can be tabulated at different 
levels than the A.C.E. poststrata. For example, they can be tabulated 
for census region crossed by the other A.C.E. post-stratification 
variables (Attached). These analyses show that individual census 
procedures had different impacts in different census

[[Page 14037]]

regions, even controlling for A.C.E. post-stratification variables. 
What one does not know, of course, is whether these procedures 
corrected for an underlying differential in coverage or created a new 
level of geographic differential in coverage.
    The analysis of the data indicates variation within the poststrata 
for variables that might be related to the net undercoverage. If, 
indeed, these indications are correct, the undercount in the unadjusted 
census varies not only between poststrata but also within poststrata. 
The A.C.E. adjustment process will not remove any differential patterns 
of undercount within poststrata. They will still be present within the 
data. Since this uneven census coverage is present in both the adjusted 
and unadjusted results, it does not seem to greatly affect the relative 
accuracy of the two sets of population estimates.
    A productive approach is to use ``artificial population'' analysis. 
This analysis looks at census operational measures available for all 
areas, scales them to be the size of the gross undercount or overcount, 
and then analyzes the results to assess the impact of local geographic 
heterogeneity on census and A.C.E. accuracy. The analysis looked at 
several variables, including the following:

Surrogates for Gross Omissions

    Number of non-GQ \38\ persons less the number of persons in whole 
household substitutions.
---------------------------------------------------------------------------

    \38\ GQ means Group Quarters, that is, prisons, long-term care 
facilities, college dorms, and other group living arrangements.
---------------------------------------------------------------------------

    Number of non-GQ persons with two or more item allocations.
    Number of non-GQ persons whose household did not return a 
questionnaire by mail, etc.

Surrogates for Gross Erroneous Enumerations

    Number of non-GQ persons less those for whom the date of birth was 
allocated consistent with age.
    Number of non-GQ persons less the number of whole household 
substitutions.
    Number of non-GQ persons whose household did not return a 
questionnaire by mail, etc.
    These surrogates were chosen because they roughly correlated with 
the number of A.C.E. nonmatches and A.C.E. E-sample erroneous and
incomplete enumerations for the sampled block clusters. Of course, for 
the artificial population analyses, we looked at both sample and 
nonsample blocks.
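    The scaling step in the artificial population analysis can be 
illustrated with a short sketch, written here in Python with entirely 
hypothetical block-level counts and control totals; it is a simplified 
illustration of the general approach, not the production methodology. A 
surrogate variable observed for every block is rescaled so that its 
total equals the measured gross omissions, and the scaled values can 
then be examined for heterogeneity within post-strata:

    # Illustrative sketch of the artificial population scaling step.
    # Block-level surrogate counts and the gross-omission total are
    # hypothetical.
    surrogate_by_block = {           # e.g., persons in non-mail-return households
        "block_0001": 120,
        "block_0002": 45,
        "block_0003": 310,
        "block_0004": 80,
    }
    measured_gross_omissions = 400   # hypothetical control total

    total_surrogate = sum(surrogate_by_block.values())
    scale = measured_gross_omissions / total_surrogate

    # Scale each block's surrogate so the totals agree; the spread of the
    # scaled values within a post-stratum is one indicator of local
    # heterogeneity.
    artificial_omissions = {blk: cnt * scale
                            for blk, cnt in surrogate_by_block.items()}
    for blk, value in sorted(artificial_omissions.items()):
        print(blk, round(value, 1))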
    Assessments of the 1990 PES were concerned with the accuracy of the 
synthetic assumption for low levels of geography, such as blocks. Our 
assessment of the synthetic assumption in the A.C.E. accepts that 
perfect homogeneity cannot exist at the block level. The Census 
Bureau's evaluation of synthetic error, therefore, focuses on whether 
heterogeneity at the local level is so great as to prevent an 
improvement from using the A.C.E., not on whether the post-strata are 
absolutely homogeneous.
    The analysis of the relative effect of synthetic error indicated 
that for state level count estimates (numeric state accuracy) three out 
of four loss functions are probably underestimating the true gains from 
adjustment (Report B-14, ``Assessment of Synthetic Assumptions''). 
Thus, correcting for the bias would not change the loss function 
results.
    For state shares, the analysis indicated a small effect of 
synthetic error on the loss function. Thus, for cases where the census 
loss and the A.C.E. loss were quite close showing a small improvement 
for adjustment, correcting for synthetic error could reverse the 
direction indicating a small decrease in accuracy by adjusting. This 
result warrants further investigation.
    For congressional district share estimates, the evidence is mixed. 
That is, some analyses indicated that ignoring synthetic error in the 
loss function would overstate A.C.E. accuracy. Other analyses indicated 
that this would overstate census accuracy. In other words, some 
analyses indicated that the loss function measures would be 
conservative, while other analyses indicated that the effect could be 
large enough to reverse a favorable finding for adjustment. These
analyses would indicate that for congressional districts, loss function 
results that indicate a small or even moderate improvement from 
adjustment could be misleading. Correctly accounting for synthetic 
error would reverse the finding, implying greater census accuracy in
these cases.
Other Measurement and Technical Errors
Conclusions for this Section
    Available evidence does not indicate any appreciable increase in 
the level of any of these other measurement and technical errors over 
what was experienced in 1990.
Analysis Reports Important to This Section
    Report B-10: ``Accuracy and Coverage Evaluation Survey: Consistency 
of Post-Stratification Variables,'' by James Farber.
Discussion
    The coverage measurement process is subject to several other kinds 
of measurement errors that need to be noted, including technical ratio 
bias, contamination error, and inconsistent post-stratification.
    Technical ratio bias is well documented in the statistical 
literature and occurs when the expectation (statistical average) of a 
ratio differs from the expectation of the numerator divided by the 
expectation of the denominator. Technical ratio bias in survey 
estimates is usually not important unless the sample size is small. 
Usually, a sample size of thirty independent observations is adequate 
(Cochran, 1963). The dual system estimator is a ratio estimator and as 
such is subject to ratio bias. Further, since the Procedure C treatment 
of movers is also a ratio estimator, that may introduce a further ratio 
bias. The A.C.E. is designed to guard against large ratio bias by 
requiring a minimum cell size for both the post-stratum and the number 
of out-movers in the Procedure C estimate. We did not expect technical 
ratio bias to be a problem in the A.C.E., and it was in fact shown to 
be small (B-2, ``Quality Indicators of Census 2000 and the Accuracy and 
Coverage Evaluation'').
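    The role of sample size in technical ratio bias can be illustrated 
with a small Monte Carlo sketch; the distributions and sample sizes 
below are hypothetical, and the sketch is not a model of the A.C.E. 
estimator itself. The expectation of a ratio of sample means differs 
from the ratio of the expectations, but the difference becomes 
negligible as the number of independent observations grows toward 
thirty and beyond:

    # Monte Carlo illustration of technical ratio bias: E[X/Y] generally
    # differs from E[X]/E[Y], but the gap shrinks as sample size grows.
    # Distributions and sample sizes are hypothetical.
    import random

    random.seed(12345)

    def mean_ratio_of_means(n, reps=20000):
        # Average of (sample mean of X)/(sample mean of Y) over many
        # replications.
        total = 0.0
        for _ in range(reps):
            xs = [random.uniform(0.5, 1.5) for _ in range(n)]  # numerator
            ys = [random.uniform(1.0, 3.0) for _ in range(n)]  # denominator
            total += (sum(xs) / n) / (sum(ys) / n)
        return total / reps

    true_ratio = 1.0 / 2.0   # ratio of the two expectations above
    for n in (3, 10, 30):
        est = mean_ratio_of_means(n)
        rel_bias = (est - true_ratio) / true_ratio
        print(f"n={n:2d}  ratio estimate={est:.4f}  relative bias={rel_bias:+.2%}")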
    Contamination error occurs when the conduct of the coverage 
measurement survey affects how people react to the initial census in 
the sample areas. If contamination occurs, the coverage measurement 
survey may no longer reflect the error for the population as a whole, 
even if it correctly measures the coverage ratios for the sample areas. 
Contamination error has affected past coverage measurement surveys. The 
1980 coverage measurement study (the PEP) was based on the April 
Current Population Survey, which had been conducted between Census Day 
and the start of NRFU. Evidence pointed to contamination error (See Fay 
et al., 1988). Prior to the 1998 Dress Rehearsal, contamination error 
was a major concern. See, for example, Griffiths, ``Results from the 
1995 Census Test: The Contamination Study'' (1996). The concern arose 
because the Census Bureau planned to conduct NRFU on a sample basis for 
all blocks except those blocks that were to be included in the PES, 
where NRFU would be conducted on a 100 percent basis. Any sampling bias 
due to the nonresponse sampling could have differentially affected the 
Integrated Coverage Measurement and

[[Page 14038]]

the non-Integrated Coverage Measurement blocks. The Census Bureau 
evaluated this possibility and did not detect any contamination. In any 
case, sampling for NRFU was not used.
    With respect to possible contamination, the A.C.E. is, with one 
exception, quite similar to the 1990 PES. In both surveys, housing unit 
listing was conducted before census mailout and NRFU. Personal visit 
interviewing was, in both cases, conducted after the end of NRFU, but 
concurrently with various census coverage follow-up field interviews.
    One possible cause of contamination in the A.C.E. was that 
approximately one third of the A.C.E. interviews were conducted by 
telephone concurrent with census NRFU. The telephone interviews were 
restricted to cases that had a completed census questionnaire that 
provided a telephone number and excluded units in small multi-units 
structures and units without house-number-street-name. It is possible 
that some of these cases might have been visited later during NRFU, and 
that their responses to that operation were influenced by the A.C.E. 
interview.
    We have not been able to detect serious contamination since moving 
away from the 1980 design. The ESCAP analysis of possible contamination 
was restricted to reviewing any available reports. The only report of 
possible contamination was from a General Accounting Office debriefing, 
in which a few A.C.E. interviewers reported that they had observed 
census personnel conducting CIFU. Contamination could have occurred, 
although no actual sharing of
information was reported. The 1990 PES was also run concurrently with 
CIFU.
    Next, we turn to inconsistency in post-stratification between the 
A.C.E. and the census. Some individuals may be classified in the 
initial census into different post-strata than in the P-sample. The 
initial census will certainly misclassify some individuals, causing 
them to be included in the wrong category. For example, some Hispanics 
may be classified as non-Hispanic, or some American Indians as White. 
To the extent that the coverage probabilities are equal only for the 
correct characteristics, census mis-classification (that is, incorrect 
post-stratification) may introduce correlation bias and synthetic 
error.
    The introduction of multiple race reporting in both the census and 
the A.C.E. raised concerns about this type of error.
    The impact of inconsistent post-stratification is a function of the 
proportion of misclassified records and the differences in coverage 
rates between the two post-strata. If only a few records are 
inconsistently classified, there will be little impact. Further, there 
is little impact on coverage if the misclassifications occur between 
post-strata with similar census coverage rates. Misclassification will 
only affect the quality of the estimates if there are large 
inconsistencies between post-strata with highly differential coverage 
rates.
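    A small numerical sketch, using hypothetical counts and coverage 
rates and holding the coverage rates fixed for simplicity, makes this 
point concrete: shifting records between two post-strata with identical 
coverage rates leaves the combined adjusted total unchanged, while the 
same shift between post-strata with very different coverage rates moves 
the total:

    # Hypothetical illustration of inconsistent post-stratification.
    # Adjusted count for a post-stratum = census count / coverage rate.
    def adjusted_total(census_counts, coverage_rates):
        return sum(c / r for c, r in zip(census_counts, coverage_rates))

    # Case 1: both post-strata have a 0.97 coverage rate; misclassifying
    # 1,000 people from stratum A to stratum B changes nothing.
    print(adjusted_total([100_000, 50_000], [0.97, 0.97]))
    print(adjusted_total([99_000, 51_000], [0.97, 0.97]))

    # Case 2: coverage rates of 0.90 and 1.00; the same misclassification
    # now shifts the combined adjusted total.
    print(adjusted_total([100_000, 50_000], [0.90, 1.00]))
    print(adjusted_total([99_000, 51_000], [0.90, 1.00]))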
    One must note that inconsistent classification is not possible for 
all A.C.E. post-stratification variables. Region, metropolitan area
size, type of enumeration area, and census mail return rate are all 
measured at the block level and are inevitably assigned the same value 
in both the P and E-samples. Inconsistent classification is only 
possible for the race/ethnicity, owner/renter, and age/sex domains.
    We studied the differences in post-stratification for those people 
matched between the A.C.E. and the initial census. By assuming that 
these patterns apply equally to missed people and by working with the 
observed (estimated) coverage rates, we assessed the impact of these 
inconsistencies on the coverage estimates. Of course, this analysis 
took into account both the directly reported characteristics and the 
imputed characteristics in both the initial census and the A.C.E.
    Of the two tenure groups, seven age/sex groups, and seven race/
Hispanic origin domains, two groups stand out with particularly high 
rates of inconsistency: the domains of American Indian off reservation 
and Native Hawaiian and Pacific Islander. Both of these domains were 
new for 2000. In 1990, the American Indian off reservation population 
was combined with the non-Hispanic White and Other population, while the Native
Hawaiian and Pacific Islander population was combined with the Asian 
population. The effect of the inconsistency for the American Indian off 
reservation is to push the resulting estimates toward that of the non-
Hispanic White and Other population.
    Another concern has been the treatment in the DSE of the cases 
involved in the Housing Unit Unduplication Operation (HUDO). As noted 
earlier, 1,019,057 housing units were analyzed during the HUDO and 
later re-instated into the census files. These units included 2,366,140 
person records (including census imputations as well as data defined 
records). These records are referred to as ``late census adds.'' These 
records were not included in the A.C.E. matching, processing, or 
follow-up processes. They were also excluded from the DSE. It is 
possible that, had these records been included in the A.C.E. and the 
DSE, the estimated undercount would have differed. To understand this 
difference, one must consider several factors.
    Any of these person records that were not data defined would not 
have been included in the DSE in any case. Excluding them as late 
census adds rather than as whole person imputations makes no difference 
to the final DSE. Some of these person records were duplicates or other
erroneous enumerations. Had they been included, the A.C.E. would have 
sampled and processed them and estimated the level of erroneous 
enumerations. Excluding these records from the DSE should reduce 
sampling error since sampling is no longer involved. It is possible 
that excluding them affected the nonsampling error. For example, it is 
possible that some of these cases might have been mis-coded had they 
been included. Further, given the way these were excluded and 
reinstated, it is possible that this process could have affected 
duplicate search or targeted surrounding block search. We were not able 
to quantify the nonsampling effect.
    Some of these cases were, of course, correct enumerations. 
Including them in the A.C.E. would have had two effects. First, this 
would have raised the number of estimated correct data-defined 
enumerations. Second, it would have raised the number of matches from 
the P-sample to the census, since some of these people would have been 
included in the A.C.E. P-sample. If the ratio of matches to correct 
enumerations in the excluded cases is the same as the ratio of matches 
to correct enumerations in the included cases, the DSE expected value
should be nearly the same. However, if the people referred to in these 
correct cases were either much more likely to have been included in the 
A.C.E. or much less likely to have been included, then excluding these 
cases from the A.C.E. would have changed the level of correlation bias 
and affected the A.C.E. We have no reason to believe this to be the 
case. Finally, excluding these cases would have affected the sampling 
variances, especially if they were clustered. This effect, however, 
should be fully accounted for in the reported sampling error.
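    A stylized version of the dual system estimator shows why the ratio 
of matches to correct enumerations is the operative quantity here. The 
sketch below uses the simple Petersen form, DSE = correct enumerations 
times P-sample total divided by matches, with hypothetical counts; the 
production estimator has additional components, so this is an 
illustration of the logic only:

    # Stylized dual system estimator (Petersen form):
    #     DSE = CE * P / M
    # where CE = census correct enumerations, P = P-sample total, and
    # M = P-sample matches to the census.  All counts are hypothetical.
    def dse(correct_enums, p_sample_total, matches):
        return correct_enums * p_sample_total / matches

    P = 50_000     # P-sample total (unchanged in every scenario)
    CE = 48_000    # correct enumerations with the late adds excluded
    M = 45_600     # matches with the late adds excluded (M/CE = 0.95)

    base = dse(CE, P, M)

    # Reinstating late adds that carry the SAME matches-to-correct-
    # enumerations ratio (0.95) leaves the estimate unchanged.
    same_ratio = dse(CE + 2_000, P, M + 1_900)

    # Late adds that are much harder to match (ratio 0.70) move the
    # estimate.
    lower_ratio = dse(CE + 2_000, P, M + 1_400)

    print(round(base), round(same_ratio), round(lower_ratio))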
    Finally, if these late census adds included geographic clustering 
of erroneous enumerations, they would increase the geographic 
heterogeneity in the census net undercount. Geographic clustering in 
net undercount that is not correlated with the A.C.E. 
poststratification variables will not be corrected by the A.C.E.

[[Page 14039]]

    In 1990, the effect of late census adds on the DSE was studied and 
evaluated. Drawing on the 1990 experience, the treatment of the late 
census adds in the 2000 DSE was specified based on the theory noted 
above. In short, one cannot compare or project the effect of the late
census adds in 1990 to the effect in 2000.
    A related issue concerns the census whole person imputations. These 
are cases included in the census counts that are not data defined. 
These include three groups: cases where the number of people in the 
housing unit had to be estimated, cases where the number of people was 
known but all the characteristics of the household had to be estimated, 
and cases where there was a person reported on the questionnaire but 
with so little data that the census substituted the characteristics of 
another person. There were 5,691,184 whole person imputations in Census 
2000 as opposed to 2,195,716 in the 1990 census. So while the A.C.E. 
design anticipated whole person imputations, the level was greatly 
increased.
    Again, the effect of these whole person imputations will depend 
upon several factors. Some of these will be erroneous. For example, 
they may impute people into a unit that was vacant on Census Day or 
into a group of seasonal vacant units. Such imputations will, 
obviously, decrease the net undercount rate and could lead to an 
overcount. However, they should not affect the DSE in any way. If, on 
the other hand, the imputation was, indeed, correct, then there were people living
in the unit on April 1 who were not elsewhere counted, that is, not 
included in a duplicate housing unit. Had these people been included in 
the census, then some of them would have matched. Therefore, the number 
of census correct data-defined enumerations would have increased and 
the number of matches would have increased. If the ratio of matches to 
correct enumerations in the ``imputed'' cases is the same as the ratio 
of matches to correct enumerations in the ``non-imputed'' cases, the DSE
expected value should be the same. However, if people living in these 
units were either much more likely to have been included in the A.C.E. 
or much less likely to have been included, then imputing these cases 
(rather than enumerating them) would have changed the level of 
correlation bias and affected the A.C.E. Finally, the increased level 
of imputation would have affected the sampling variances, especially if 
they were clustered. This effect, however, should be fully accounted 
for in the reported sampling error.
    Again, if incorrectly imputed cases were geographically clustered, 
they would increase the geographic heterogeneity in the census net 
undercount. Geographic clustering in net undercount that is not 
correlated with the A.C.E. poststratification variables will not be 
corrected by the A.C.E.

Synthesizing A.C.E. Quality

Comparison With Demographic Analysis and Demographic Estimates

Conclusions From This Section
    The disagreement between the results of demographic analysis and 
the A.C.E. removes an important independent verification of A.C.E. 
results. In 1990, demographic analysis clearly demonstrated that an 
adjustment based on the PES would have been conservative, that is, the 
true population would almost certainly have been higher still. In 2000, 
demographic analysis presents no such support, leaving the possibility 
that the A.C.E. would ``over adjust.''
Analysis Reports Important to This Section
     Report B-4: ``Accuracy and Coverage Evaluation Survey: 
Demographic Analysis Results,'' by J. Gregory Robinson.
     Report B-16: ``Demographic Full Count Review: 100 percent 
Data Files and Products,'' by Michael J. Batutis, Jr.
Discussion
    Demographic analysis has long provided the standard against
which census accuracy is measured. See, for example, Committee on 
National Statistics, ``Modernizing the U.S. Census'' (1995) 
(``Demographic estimates are the primary means for comparing coverage 
for censuses over time for the nation as a whole.''). Indeed, when 
people discuss the ``steady improvement in census accuracy'' or say 
that the 1990 census was the ``first to be less accurate than its 
predecessor,'' they are using demographic analysis as a benchmark. See, 
for example, the ``Report to Congress--The Plan for Census 2000'' 
(1997, p. i), and Darga, ``Sampling and the Census'' (1999, page 14-
15).
    Demographic analysis, as the term is usually used, is the 
construction of an estimate of the ``true'' population using birth, 
death, migration and other data sources independent from either the 
current census or the A.C.E. Demographic analysis provides independent 
measures of the net undercount by age, sex, and Black/non-Black. It
represents a generally accepted historic data series, although, of 
course, it is subject to its own limitations and uncertainties. Among 
demographic analysis's important limitations is the lack of an 
historical data series to independently estimate the Hispanic, Asian, 
or American Indian populations. In addition, the level of emigration 
and undocumented immigration must be estimated using indirect methods. 
These limitations and uncertainty are documented in Robinson (1993), 
and Himes and Clogg (1992), as well as in the 1990 ``D'' studies.
    Due to the uncertainty in the estimates of undocumented 
immigration, DA in 2000 uses a high and low range for making 
comparisons to the census and A.C.E. results. The ``base'' DA set of 
estimates includes the current estimate of undocumented immigrants
entering during the 1990's (2.76 million); the ``alternative'' DA set 
increases the DA estimate by doubling the assumed net flow of 
undocumented immigrants in the 1990's (5.52 million). The A.C.E. 
measures a net undercount of 3.3 million, or 1.15 percent for Census 
2000. DA measures a lower net undercount than the A.C.E., according to 
either of the two sets of DA estimates developed. The ``base'' DA set 
estimates a net overcount of 1.8 million, or -0.67 percent in 2000. The 
``alternative'' set, which increases the DA estimate to allow for 
additional undocumented immigration in the 1990's, gives a net 
undercount of 0.9 million, or 0.32 percent. The DA and A.C.E. estimates 
both measure a reduction in the net undercount in Census 2000 compared 
to 1990, but DA implies a greater change. Under the base set, the 
estimated DA net undercount rate fell by 2.5 percentage points from 
1.85 percent net undercount in 1990 to -0.65 percent net undercount in 
2000.
    Further, the comparison of census counts to auxiliary data sets 
(such as school enrollment data for children and Medicare enrollment 
for the population 65 and older) are consistent in indicating Census 
2000 is more complete relative to 1990. Both DA and A.C.E. measure a 
reduction in the net undercount rates of Black and non-Black children 
(ages 0-17) compared to 1990. Both methods also measure a reduction in 
the net undercount rates of Black men and women (ages 18+). DA finds a 
reduction in the net undercount rates of non-Black men and women in 
Census 2000 compared to the rates of previous censuses. The reduction 
is large under the base DA set and moderate under the alternative DA 
set. The A.C.E. indicates no change or a slight increase in undercount 
rates for non-Black adults as a group.

[[Page 14040]]

    The A.C.E. sex ratios (ratio of males per 100 females) for Black 
adults are much lower than DA ``expected'' sex ratios, implying that 
A.C.E. is not capturing the high undercount rates of Black men relative 
to Black women (the well-known ``correlation bias''). The size of this 
bias is about the same as in the 1990 PES.
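    The way sex ratios reveal correlation bias can be sketched with 
hypothetical figures; the numbers below are illustrative only and are 
not the A.C.E. or demographic analysis estimates. Given a demographic 
analysis ``expected'' sex ratio and an A.C.E. estimate of adult women 
in a group, the implied number of men can be compared with the A.C.E. 
estimate of men; a shortfall suggests men missed by both the census and 
the A.C.E.:

    # Hypothetical illustration of reading correlation bias from sex
    # ratios.  Sex ratio = males per 100 females.  All counts are
    # illustrative.
    da_expected_sex_ratio = 95.0      # hypothetical DA benchmark
    ace_females = 12_000_000          # hypothetical A.C.E. estimate of women
    ace_males = 10_800_000            # hypothetical A.C.E. estimate of men

    ace_sex_ratio = 100.0 * ace_males / ace_females
    implied_males = da_expected_sex_ratio / 100.0 * ace_females
    shortfall = implied_males - ace_males

    print(f"A.C.E. sex ratio: {ace_sex_ratio:.1f} per 100 females")
    print(f"Males implied by the DA sex ratio: {implied_males:,.0f}")
    print(f"Implied shortfall (possible correlation bias): {shortfall:,.0f}")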
    A more recent complication warrants mention. The demographic 
analysis method requires reconciling the reporting of race in the vital 
statistics system with race as reported in the census. For example, in 
the birth registration system the race of the mother and the father are 
reported, rather than the race of the child. For the first time, the 
census questionnaire instruction was to ``mark one or more races.'' 
This change introduces a new consideration into the reconciliation of 
reported race data. Depending on the treatment of people who report 
Black and at least one other race, the Black undercount estimate ranges 
from 0.9 percent to 4.7 percent. However, in either case a clear 
differential undercount between the Black and non-Black population is 
evident, ranging from at least 1.8 percent to perhaps 6.2 percent.
    If we take DA as representing a reasonable low estimate of 
population in 2000, what would represent a reasonable high? Although we 
tried several different scenarios raising undocumented immigration, for 
purposes of simplicity, we have assumed a doubling of the undocumented 
population for our alternative demographic assumption. Doubling 
undocumented immigration would result in an alternative DA that implies 
a percent foreign-born of 11.13 (compared to 10.61 in the unweighted 
CPS) and a percent Hispanic of 12.72 (compared to 12.55 in the 
unadjusted Census 2000 results). Until we can get a fuller set of data 
from Census 2000 to recalibrate the DA estimates in detail, this 
alternative would seem to be a reasonable upper bound on the number of 
undocumented immigrants entering in the 1990s.
    The demographic analysis estimates may have underestimated the 
population and, thus, the net undercount, in 1990. Indeed, Robinson et 
al. (1993, p. 1070) states that ``the demographic net undercount 
estimates are biased in that they may underestimate the ``true'' net 
undercount'' (see also 1990 DA Evaluation Project D-10). Thus, in 1990, 
the DA production, or preferred, estimate was a net undercount of 4.55
million. Analysis showed that it was very unlikely that the true 
undercount was less than 4 million. This showed that the 1990 PES 
almost certainly did not overestimate the net national undercount. 
However, the upper range of the demographic analysis uncertainty 
estimates was a net undercount of over eight million people. Indeed, 
the midpoint of the range (6.2 million) is higher than the 1990 
demographic analysis production estimates. (Estimates are based on the 
rates in Robinson et al. 1993, Table 4, applied to a total population 
of 249 million.)
    It is important to note that errors in estimating the 1990 
population will affect a comparison with the A.C.E. with respect to the 
level and pattern of the undercount. It will not affect the measured 
change between censuses. The internal consistency of the demographic 
estimates permits trends and changes in the coverage pattern over time 
to be estimated more accurately than the exact level of net coverage in 
any given census. (Report B-4, ``Demographic Analysis Results'')
    Historically, demographic analysis' important strength has been its 
ability to measure sex ratios accurately. While inconsistency in 
reporting racial data may introduce uncertainty into the demographic 
analysis estimates of a specific population group, in many instances 
the inconsistency will affect both sexes equally, so that the 
inconsistency's effect on the expected sex ratio should be quite small. 
In 1990, many of the comparisons between the initial census, the PES, 
and demographic analysis centered on the sex ratios.

Post-Enumeration Survey--A.C.E. Error of Closure

    The estimated population from the 1990 dual system estimates based 
on the PES can be projected forward and compared to the estimated 
population from the 2000 dual system estimates based on the A.C.E. To 
the extent that the population change during the decade is well 
estimated, the difference must be attributable to changes in the level 
and patterns of errors in the two dual system estimates. The following 
table is instructive:

 Table 5.--A Comparison of the 1990 PES Total Population With the A.C.E.
                    Accounting for Population Change
------------------------------------------------------------------------
                                               Base         Alternative
                                            demographic     demographic
                                             analysis        analysis
                                             estimates       estimates
------------------------------------------------------------------------
1990 Post-Enumeration Survey Dual System     252,756,428     252,756,428
 Estimates..............................
Natural Growth..........................      17,331,261      17,331,261
Legal Immigration.......................       9,266,974       9,266,974
Emigration..............................       2,652,597       2,652,597
Undocumented Immigration................       2,765,196       5,530,392
``Expected 2000 Population''............     279,467,262     282,232,458
2000 Accuracy and Coverage Evaluation        284,683,782     284,683,782
 Survey Dual System Estimates...........
Error of Closure........................       5,216,520       2,451,324
Error of Closure Percent................     1.9 percent    0.9 percent
------------------------------------------------------------------------
Estimates are for the total population, including populations excluded
  from the 1990 and 2000 Dual System Estimation estimates.

    Unless the demographic analysis estimates of change are inaccurate, 
it is clear from this table that the error level of the 1990 PES DSE 
must differ from that of the 2000 A.C.E. There are several possible
causes. Assuming change is correctly measured, the difference between 
the 1990 PES carried forward and the 2000 A.C.E. must be due to either 
sampling or non-sampling errors in the PES or A.C.E. Further, to 
account for differences beyond sampling error, one must assume that the 
non-sampling error levels were different in the two surveys.
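    The arithmetic behind Table 5 can be reproduced directly from the 
components shown: the 1990 PES dual system estimate is carried forward 
by adding natural growth, legal immigration, and undocumented 
immigration and subtracting emigration, and the result is compared with 
the 2000 A.C.E. dual system estimate. The short sketch below repeats 
the calculation for both demographic analysis scenarios:

    # Reproduces the error-of-closure arithmetic shown in Table 5.
    pes_1990_dse = 252_756_428
    natural_growth = 17_331_261
    legal_immig = 9_266_974
    emigration = 2_652_597
    ace_2000_dse = 284_683_782

    for label, undocumented in (("base", 2_765_196),
                                ("alternative", 5_530_392)):
        expected_2000 = (pes_1990_dse + natural_growth + legal_immig
                         - emigration + undocumented)
        closure = ace_2000_dse - expected_2000
        print(f"{label:11s} expected={expected_2000:,}  "
              f"closure={closure:,} ({100 * closure / expected_2000:.1f} percent)")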
    The A.C.E. universe differed from that of the PES. The A.C.E. 
excluded the noninstitutional nonmilitary group quarters, while the 
1990 PES had included this group. The A.C.E. DSE

[[Page 14041]]

implicitly attributes zero coverage error to this group. The PES DSE 
attempted to measure the coverage of this group. However, there is no 
evidence that the coverage of this group was so far from correct
as to explain much of the PES/A.C.E. error of closure.
    Another explanation would be that the 1990 PES DSE had much higher 
levels of correlation bias overall than did the A.C.E. It is certainly
possible, even likely, that the 1990 PES underestimated the net 
undercount. This is the implication of any comparisons with the 1990 
demographic analysis estimates and is reinforced by comparing the 1990 
PES net undercount (1.65 percent) with the range of uncertainty surrounding
the 1990 demographic estimates (1.63 percent to 3.36 percent).
    Noting that the 1990 PES most certainly was affected by correlation 
bias and almost certainly underestimated the net national undercount 
does not, however, explain the change between censuses. One possibility 
is that the improved publicity campaign and improved community outreach 
surrounding the census may indeed have persuaded people to participate 
in the census (both initial and A.C.E. phases) while they, or at least 
similarly situated people, did not participate in 1990. It must be 
noted that at this point, this explanation remains no more than an 
interesting hypothesis.
    The analysis of errors in the 1990 PES indicated that except for 
correlation bias, the other errors tended to increase the estimated 
population. That is, corrections for the bias would lower the 
estimates. Thus, to explain the error of closure one must posit that 
the errors in the A.C.E. were considerably higher than those in the 
1990 PES.
    However, the analyses of the 2000 A.C.E. seem to indicate that the
errors were better controlled and probably smaller in 2000 than they 
were in 1990. The one exception noted above is errors from coding an E-
sample case as correct when, in fact, it was physically located outside 
the search area and should have been coded as erroneous. We have 
documented that this occurred, but not to the scale indicated by the 
error of closure. Since this kind of error was also present in the 1990 
PES, one must assume a large increase in this or some other positive
error to explain the error of closure.
    A fundamental assumption of the loss function analysis conducted in 
connection with the A.C.E. and discussed below is that the pattern of 
errors for the A.C.E. is similar to the pattern measured in the PES. If 
the error level or structure of the A.C.E. differs substantially from 
that of the PES, then findings from the loss function analysis are far 
less certain.

Comparing the Accuracy of the A.C.E. to the Accuracy of the Uncorrected 
Census

Conclusions for This Section
    Analysis shows that if one assumes that A.C.E. processing errors 
are at or near the level measured in 1990 and that there is little or 
no correlation bias, then either the unadjusted
census is more accurate or the two are of nearly equal accuracy. If one 
assumes that the A.C.E. processing errors have been greatly reduced or 
if moderate or substantial correlation bias is present, then the A.C.E. 
adjusted results are more accurate, often by a large margin. Allowing 
for synthetic error does not reverse these findings. However, these 
findings are dependent on the assumption of a pattern of errors similar 
to that measured in 1990. If this assumption is not valid, no
conclusions can be drawn.
Analysis Reports Important to This Section
     Report B-13: ``Accuracy and Coverage Evaluation Survey: 
Comparing Accuracy'' by Mary H. Mulry and Alfredo Navarro.
     Report B-14: ``Accuracy and Coverage Evaluation Survey: 
Assessment of Synthetic Assumptions'' by Donald J. Malec and Richard A. 
Griffin.
Discussion
    Knowing the level of error in the A.C.E. is not enough because the 
A.C.E. decision will not be made in a vacuum; rather the A.C.E. will be 
compared to the unadjusted census to determine which is more accurate 
for redistricting purposes. Both the adjusted and the unadjusted data 
sets will have their own patterns of error.
    As discussed at length in the June 2000 ``Accuracy and Coverage 
Evaluation: Statement on the Feasibility of Using Statistical Methods 
to Improve the Accuracy of Census 2000,'' there are several important 
criteria in assessing accuracy. For purposes of the ESCAP decision, the 
Census Bureau has evaluated both numeric and distributive accuracy. 
Both types of accuracy are important criteria for numbers that will be 
used in the redistricting process, and both types of accuracy have 
independent importance as tools in assessing A.C.E. and census quality. 
Additionally, as discussed in the above document, accuracy can be 
measured at different geographic levels.
    Another way to measure overall accuracy is to prepare Loss 
Functions. Mean squared error is a form of loss function. The Census 
Bureau prepared Loss Function Analyses in connection with the 1990 
adjustment decision and also in connection with the 1993 decision 
regarding use of adjusted data as a base for the intercensal estimates. 
These Loss Functions were able to account for estimated bias in the PES 
estimates. The accuracy criteria discussed above guided our design of 
the loss functions. We prepared loss functions to determine the 
comparative accuracy of the adjusted and unadjusted data sets at the 
state and Congressional district levels, to measure both numeric and 
distributive accuracy.
    The 1990 studies and subsequent analyses addressed this issue 
through complex simulation procedures (See, P-16, as well as Mulry and 
Spencer [1993]). The Census Bureau concluded that adjustment of the 
1990 census would have improved distributive accuracy for states and 
for areas with populations of more than 100,000. Later Census Bureau 
work revealed that in general one could not distinguish an improvement 
in distributive sub-state accuracy for areas with populations of less 
than 100,000 (Obenski and Fay, 2000).
    The Loss Function Analyses that we conducted to inform the ESCAP 
decision should not be considered determinative for several reasons.
    Although A.C.E. variances are available, complete information on 
A.C.E. biases is not. Accurate bias data are a vital component of any 
Loss Function. For the purpose of ascertaining preliminary Loss 
Function information to guide the ESCAP decision, therefore, the Census 
Bureau assumed that the bias in the A.C.E. was similar to biases in the 
1990 PES. To some extent, the PES biases were modified based on an 
analysis of differences in the PES and the A.C.E., but the extent of 
this analysis was limited. Finally, one should keep in mind that more 
complete Loss Functions will be prepared as part of the final 
evaluation process, many months after the ESCAP recommendation. These 
more complete Loss Functions, performed after more data are available, 
may well reach results different from those of the preliminary Loss 
Functions.
    Although several loss functions were computed, three are of 
principal importance.
    The Weighted Squared Error Loss for all levels is a measure of 
numeric accuracy. For example, it treats a 1 percent error in 
estimating the population total for a state as proportional

[[Page 14042]]

to the state's size. If state A is twice the size of state B, then a 1
percent error in estimating the size of state A is considered twice as 
serious.
    The Weighted Squared Error Loss for shares is a measure of 
proportional or share accuracy. It treats a 1 percent (not percentage 
point) error in the share of a state proportional to that state's size. 
If state A is twice the size of state B, that is, state A comprises 2 
percent of the nation's population while state B comprises 1 percent, 
then a 1 percent error in estimating state A's share of the national 
population is weighted twice as heavily as a 1 percent error in 
estimating state B's share.
    Equal Congressional District Squared Error Loss is a measure of 
within state share accuracy closely related to state congressional 
redistricting. This measure only looks at shares within state. The 
shares are computed on the current congressional districts, and errors 
from the census and A.C.E. are estimated. Errors within state shares 
are then summed over the fifty states to produce a national index of 
relative accuracy.
    For each measure of accuracy, we computed the relative loss. This 
is a measure or estimate of how the census and A.C.E. losses compare. 
It is computed as the Census Loss divided by the A.C.E. loss. Relative 
loss of less than one indicates that, for that measure and those 
assumptions, the census is estimated to be more accurate.
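    As a simplified illustration of these measures, the sketch below 
uses hypothetical ``true,'' census, and adjusted counts for three 
areas; the production loss functions also incorporate estimated biases 
and sampling variances. It computes a weighted squared error loss for 
levels and for shares under one reasonable reading of the weighting 
described above and then forms the relative loss as the census loss 
divided by the A.C.E. loss:

    # Simplified weighted squared error loss and relative loss.
    # All counts are hypothetical; the weighting follows one reasonable
    # reading of the description above (error weighted in proportion to
    # the area's size), not the exact production specification.
    true_pop = [5_000_000, 2_500_000, 1_000_000]     # stand-in target values
    census = [4_900_000, 2_470_000, 995_000]
    ace_adjust = [4_980_000, 2_520_000, 990_000]

    def loss_levels(estimates, target):
        # Squared error in levels, scaled so a 1 percent error counts in
        # proportion to the area's population.
        return sum((e - t) ** 2 / t for e, t in zip(estimates, target))

    def loss_shares(estimates, target):
        # Squared error in population shares, scaled by the target share.
        te, tt = sum(estimates), sum(target)
        return sum(((e / te) - (t / tt)) ** 2 / (t / tt)
                   for e, t in zip(estimates, target))

    for name, loss in (("levels", loss_levels), ("shares", loss_shares)):
        relative = loss(census, true_pop) / loss(ace_adjust, true_pop)
        print(f"{name}: relative loss (census/A.C.E.) = {relative:.2f}")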
    To estimate the relative accuracy for the census and the A.C.E., 
one must properly account for several things. First is the estimated 
levels of undercount in the census as measured by the A.C.E. Second is 
the sampling variance in the A.C.E.. Third is the level of bias present 
in the A.C.E. As we had no direct measures of the level of bias in the 
A.C.E. (Except for ratio and correlation bias), we assumed the level 
measured in the 1990 PES. The analysis also took into account the 
variance of the estimated biases in 1990. See B-13 and B-19 for a full 
description.
    Models were run using many variations on the assumptions, which are 
documented in B-13. Most important to the results are the following 
assumptions.
     The Level of A.C.E. processing error: This includes A.C.E. 
matching error and E-sample coding error. One hundred percent error
indicates that there was no improvement from the 1990 measured levels.
     The level of correlation bias for adult men: Zero 
correlation bias indicates that no allowance is made for this bias. 
One hundred percent indicates that the full level of correlation bias 
for adult men implied by the demographic analysis sex ratios is 
assumed. In
addition, some runs were made assuming that the correlation bias for 
Hispanics was at the same level measured for Blacks.
     The 2000 estimated undercounts and their estimated 
sampling variances were used. All other A.C.E. biases were assumed at 
their 1990 levels, including an allowance for the variances on 
estimating these in 1990.
    Some principal findings are summarized in Table 6:

                   Table 6.--Relative Loss by Degree of Processing Error and Correlation Bias
----------------------------------------------------------------------------------------------------------------
                                                     Degree of
                                     Degree of      processing     Census loss/    Census loss/    Census loss/
              Model                 correlation        error        A.C.E. loss     A.C.E. loss     A.C.E. loss
                                  bias (percent)     (percent)     (St. Levels)    (St. Shares)     (CD shares)
----------------------------------------------------------------------------------------------------------------
NA..............................               0             100           0.519           1.783           0.995
1...............................             100               0          17.488           1.125           2.068
1...............................             100              25          18.565           1.318           1.975
1...............................             100              50          14.108           1.500           1.870
1...............................             100              75           8.242           1.656           1.759
1...............................             100             100           4.413           1.780           1.651
2...............................              10              90           0.770           1.761           1.147
2...............................              20              90           0.897           1.792           1.265
2...............................              50              90           1.416           1.838           1.554
2...............................              75              90           2.048           1.821           1.688
----------------------------------------------------------------------------------------------------------------

    Model 1--correlation bias is present for males except for Non-black 
males age 18 to 29.
    Model 2--correlation bias is present for Black males only.
    States use weighted squared error loss and congressional districts 
use equal CD squared error loss.
    The reader can see that if one assumes no reduction in processing 
error over 1990 as well as little or no correlation bias, the census is 
as accurate or more accurate than the adjusted A.C.E. for state levels, 
less accurate for state shares, and about as accurate for CD shares. 
This clearly demonstrates how sensitive the results are to the model 
assumptions. As noted above, an analysis of the Error of Closure 
between the A.C.E. and the PES indicates that the pattern and level of
error of A.C.E. may not necessarily follow that found in the PES.
    Therefore, the results of the loss functions must be interpreted 
cautiously. If the assumptions of similar patterns of errors do not 
hold even approximately, no direct conclusion can be drawn.
    To assess the impact of synthetic error (local heterogeneity in the 
unadjusted census results) on the comparison between census and A.C.E. 
relative accuracy, several models were run including both the local 
heterogeneity and the assumed level of bias in the A.C.E. (B-14, 
``Assessment of Synthetic Assumptions''). These analyses indicated 
gains in accuracy from adjustment even accounting for synthetic bias.
However, these results are subject to the same limitations noted above.
    The loss functions run for counties with populations below 100,000 
indicated that the unadjusted census was more accurate regardless of 
the level of correlation bias assumed. This caused some concern, since 
this was not the case for the 1990 census adjustment. One should 
remember, however, that counties below 100,000 are not the same as, or 
even representative of, all areas with populations of less than 
100,000. However, the analysis
found that the adjustment was more accurate when considered in terms of 
all counties for both numeric and distributive accuracy.

References

    Bell, William R., ``Using Information from Demographic Analysis 
in Post-Enumeration Survey,'' Journal of the American Statistical 
Association 88 (September 1993): 1106-1118.
Census Bureau, ``Accuracy and Coverage Evaluation: Statement on the 
Feasibility of Using Statistical Methods to Improve the Accuracy of 
Census 2000,'' June 2000.
______, DSSD Census 2000 Procedures and Operations Memorandum Series 
B,

[[Page 14043]]

generally, and specifically the following, all dated February 28, 2001:
    B-2, ``Quality Indicators of Census 2000 and the Accuracy and 
Coverage Evaluation'' by James Farber
    B-3, ``Quality of Census 2000 Processes'' by James B. Treat, 
Nicholas S. Alberti, Jennifer W. Reichert et al.
    B-4, ``Accuracy and Coverage Evaluation Survey: Demographic 
Analysis Results'' by J. Gregory Robinson
    B-5, ``Accuracy and Coverage Evaluation Survey: Person 
Interviewing Results'' by Rosemary L. Byrne, Lunn Imel, and Phawn 
Stallone
    B-6, ``Accuracy and Coverage Evaluation Survey: Person Matching 
and Follow-up Results'' by Danny R. Childers, Rosemary L. Byrne, 
Tamara S. Adams, and Roxanne Feldpausch
    B-7, ``Accuracy and Coverage Evaluation Survey: Missing Data 
Results'' by Patrick J. Cantwell
    B-8, ``Accuracy and Coverage Evaluation Survey: Decomposition of 
Dual System Components'' by Thomas Mule
    B-9, ``Accuracy and Coverage Evaluation Survey: Dual System 
Estimation Results'' by Peter P. Davis
    B-10, ``Accuracy and Coverage Evaluation Survey: Consistency of 
Post-Stratification Variables'' by James Farber
    B-11, ``Accuracy and Coverage Evaluation Survey: Variance 
Estimates by Size of Geographic Area'' by Michael D. Starsinic, 
Charles D. Sissel, and Mark E. Asiala
    B-12, ``Accuracy and Coverage Evaluation Survey: Correlation 
Bias'' by William R. Bell
    B-13, ``Accuracy and Coverage Evaluation Survey: Comparing 
Accuracy'' by Mary H. Mulry and Alfredo Navarro
    B-14, ``Accuracy and Coverage Evaluation 2000: Assessment of 
Synthetic Assumptions'' by Donald J. Malec and Richard A. Griffin
    B-16, ``Demographic Full Count Review: 100% Data Files and 
Products'' by Michael J. Batutis, Jr.
    B-17, ``Census 2000: Missing Housing Unit Status and Population 
Data'' by Richard A. Griffin
    B-18, ``Accuracy and Coverage Evaluation Survey: Effect of 
Targeted Extended Search'' by Douglas B. Olson
______, DSSD Census 2000 Procedures and Operations Memorandum Series 
S, specifically the following:
    S-DT-02, ``Accuracy and Coverage Evaluation: Overview of 
Design,'' by Danny R. Childers and Deborah A. Fenstermaker, January 
11, 2000.
______, DSSD Census 2000 Procedures and Operations Memorandum Series 
T, specifically the following:
    T-6 , ``Additional Geographic Coding for Erroneously Enumerated 
Housing Units,'' by Danny R. Childers and Xijian Liu, February 28, 
2001.
______, ``Evaluating Censuses of Population and Housing,'' 
Statistical Training Document ISP-TR-5.
______, POP ``D'' Studies (1990), generally, and specifically the
following:
    D-10, DA Evaluation Project
______, ``Report to Congress--The Plan for Census 2000,'' originally 
issued July 1997, revised and reissued August 1997.
______, STSD ``P'' Studies (1990), generally, and specifically the 
following:
    P-1, ``Analysis of Reasonable Imputation Alternatives''
    P-3, ``Evaluation of Imputation Methodology for Unresolved Match 
Status Cases''
    P-4, ``Address Misreporting''
    P-5, ``Analysis of P-Sample Fabrications from PES Quality 
Control Data''
    P-5a, ``Analysis of Fabrications from Evaluation Follow-up 
Data''
    P-6, ``Fabrication in the P-sample--Interviewer Effect''
    P-7, ``Estimates of P-Sample Clerical Matching Error from a 
Rematching Evaluation''
    P-10, ``Measurement of the Census Erroneous Enumeration Clerical 
Error made in the Assignment of Enumeration Status.''
    P-11, ``Balancing Error Evaluation''
    P-13, ``Use of Alternative Dual System Estimators to Measure 
Correlation Bias''
    P-16, ``Total Error in PES Estimates for Evaluation Post 
Strata''
Cochran, William G., Sampling Techniques, 2d ed., New York: John 
Wiley & Sons, Inc., 1963.
Darga, Kenneth, Fixing the Census until it Breaks: An Assessment of 
the Undercount Adjustment Puzzle, Michigan Information Center, 2000.
Edmonston, Barry and Charles Schultze, eds., Modernizing the U.S. 
Census, Panel on Census Requirement in the Year 2000 and Beyond, 
Committee on National Statistics, National Research Council, 
Washington D.C.: National Academy Press, 1995.
Fay, R.E., J.S. Passel, J.G. Robinson, and C.D. Cowan, The Coverage 
of Population in the 1980 Census, 1980 Census of Population: Housing 
Evaluation and Research Reports, PHC80-E4, U.S. Bureau of the 
Census, Washington, D.C.: U.S. Department of Commerce, 1988.
Griffiths, Richard R., ``Results from the 1995 Census Test: The 
Contamination Study,'' Census Bureau, Washington, D.C., 1996.
Clogg, Clifford C., C.L. Himes, and J.S. Passel, ``An Overview of 
Demographic Analysis as a Method for Evaluating Census Coverage in 
the United States,'' Journal of the American Statistical Association 
88 (September 1993): 1072-1077.
Hogan, Howard M., ``Accuracy and Coverage Evaluation: Theory and 
Application,'' prepared for the February 2-3, 2000 DSE Workshop of 
the National Academy of Sciences Panel to Review the 2000 Census.
Hogan, H. and K.M. Wolter, ``Measuring Accuracy in a Post-
Enumeration Survey,'' Survey Methodology 14 (1988): 99-116.
Marks, E.S., ``The Role of Dual System Estimation for Census 
Evaluation'' in K. Krotki, Recent Developments in PGE, University
of Alberta Press, pp. 156-188.
Mulry, Mary H. and Bruce D. Spencer, ``Accuracy of the 1990 Census 
and Undercount Adjustments,'' Journal of the American Statistical 
Association 88 (September 1993): 1080-1091.
______, ``Overview of Total Error Modeling and Loss Function 
Analysis,'' March 2001.
______, ``Total Error in PES Estimates of Population,'' Journal of the 
American Statistical Association 86 (1991): 839-54.
Obenski, Sally M. and Robert E. Fay, ``Analysis of CAPE Findings on
PES Accuracy at Various Geographic Levels,'' Census Bureau, 
Washington, D.C., June 9, 2000.
Robinson, J.G., B. Ahmed, P.D. Gupta, and K.A. Woodrow, 
``Estimation of Population Coverage in the 1990 United States Census 
Based on Demographic Analysis,'' Journal of the American Statistical 
Association 88 (September 1993): 1061-1071.
Spencer, Bruce D., ``Adaption of CAPE Loss Function Analysis for 
Census 2000,'' (2000 Draft).

BILLING CODE 3510-07-P

[[Page 14044]]

[GRAPHIC] [TIFF OMITTED] TN08MR01.002


[[Page 14045]]


[GRAPHIC] [TIFF OMITTED] TN08MR01.003


BILLING CODE 3510-07-C

[[Page 14046]]



  Table A-3.--Census 2000 Evaluations Program Category Report Schedule
------------------------------------------------------------------------
              Category                  Availability of Category Report
------------------------------------------------------------------------
A: Response Rates & Behavior          Spring 2002.
 Analysis.
B: Content/Data Quality.............  Summer 2003.
C: Data Products....................  Summer 2001.
D: Partnership and Marketing          Winter 2001.
 Programs.
E: Special Populations..............  Winter 2001.
F: Address List Development.........  Fall 2002.
G: Field Recruiting & Management....  Summer 2001.
H: Field Operations.................  Winter 2002.
I: Coverage Improvement.............  Winter 2002.
J: Ethnographic Studies.............  Spring 2003.
K: Data Capture.....................  Fall 2002.
L: Processing Systems...............  Winter 2002.
M: Quality Assurance Evaluations....  Spring 2003.
N: Accuracy & Coverage Evaluation     Fall 2002.
 Survey Operations.
O: Coverage Evaluations of the        Summer 2002.
 Census & of A.C.E. Survey.
P: A.C.E. Survey Statistical Design   Winter 2003.
 & Estimation.
Q: Organization/Budget & MIS........  Fall 2001.
R: Automation of Census Processes...  Summer 2001.
------------------------------------------------------------------------

[FR Doc. 01-5479 Filed 3-2-01; 11:20 am]
BILLING CODE 3510-07-P