[Federal Register Volume 66, Number 214 (Monday, November 5, 2001)]
[Notices]
[Pages 56006-56021]
From the Federal Register Online via the Government Publishing Office [www.gpo.gov]
[FR Doc No: 01-27663]



[[Page 56005]]

-----------------------------------------------------------------------

Part III





Department of Commerce





-----------------------------------------------------------------------



Bureau of the Census



-----------------------------------------------------------------------



Report of the Executive Steering Committee for Accuracy and Coverage 
Evaluation Policy and Statement of the Acting Director of the U.S. 
Census Bureau on Adjustment for Non-Redistricting Uses; Notice


[[Page 56006]]


-----------------------------------------------------------------------

DEPARTMENT OF COMMERCE

Bureau of the Census

[Docket Number 011026262-1262-01]
RIN Number 0607-XX66


Report of the Executive Steering Committee for Accuracy and 
Coverage Evaluation Policy and Statement of the Acting Director of the 
U.S. Census Bureau on Adjustment for Non-Redistricting Uses

AGENCY: Bureau of the Census.

ACTION: Notice of report and statement of Acting Director of the Census 
Bureau regarding adjustment decision.

-----------------------------------------------------------------------

SUMMARY: This notice provides the Executive Steering Committee on 
Accuracy and Coverage Evaluation Policy (ESCAP) report and the 
statement of the Acting Director of the Census Bureau regarding the 
potential application of statistically adjusted data from Census 2000 
for the following uses: (1) As controls to produce estimates from the 
Census 2000 long form (sample) data, (2) as demographic survey 
controls, and (3) as the base for producing post-censal estimates. The 
ESCAP report and statement of the Acting Director are attached as 
exhibits to the SUPPLEMENTARY INFORMATION section of this notice. In 
addition to publication in the Federal Register, the report is posted 
on the Census Bureau Web site at 
http://www.census.gov/dmd/www/EscapRep2.html, and the Acting Director's 
statement is available electronically at 
http://www.census.gov/Press-Release/www/2001/cooper.pdf.

FOR FURTHER INFORMATION CONTACT: John H. Thompson, Principal Associate 
Director for Programs, U.S. Census Bureau, FB-3, Room 2037, Washington, 
DC 20233. Telephone: (301) 457-3946; fax: (301) 457-3024.

SUPPLEMENTARY INFORMATION:

Background Information

    The decennial census is mandated by the United States Constitution 
(Article I, Section 2, Clause 3) to provide the population counts 
needed to apportion the seats in the U.S. House of Representatives 
among the states. By December 28, 2000, the Census Bureau fulfilled its 
constitutional duty by delivering to the Secretary of Commerce the 
state population totals used for congressional apportionment. In 
accordance with the January 25, 1999, Supreme Court ruling, Department 
of Commerce v. House of Representatives, 119 S.Ct. 765 (1999), the 
Census Bureau did not use statistical sampling to produce the state 
population totals used for congressional apportionment.
    However, the Census Bureau has examined the use of statistical 
sampling to produce statistically adjusted Census 2000 data for 
nonapportionment purposes. Pursuant to Title 15, Code of Federal 
Regulations, Part 101, issued by the Secretary of Commerce (66 FR 
11232, February 23, 2001), the Acting Director of the Census Bureau 
submitted his recommendation, based upon the ESCAP's March 1, 2001 
report, regarding the methodology to be used in producing the 
tabulations of population reported to states and localities pursuant to 
13 U.S.C., Section 141(c) (data used for congressional and state and 
local legislative redistricting), to the Secretary of Commerce (66 FR 
14003, March 8, 2001). The Secretary then made the final determination 
regarding statistical adjustment of the redistricting data (66 FR 
14520, March 13, 2001).
    After the issuance of the March 1, 2001, ESCAP report 
(``Recommendation Concerning the Methodology to be Used in Producing 
the Tabulations of Population Reported to States and Localities 
Pursuant to 13 U.S.C., Section 141(c)''), the Committee reconvened to 
examine the potential use of the statistically adjusted data for 
nonredistricting purposes--namely, as controls to produce estimates 
from the Census 2000 long form (sample) data, as demographic survey 
controls, and as the base for producing post-censal estimates. The 
ESCAP used analysis from reports on topics chosen for their usefulness 
in informing its recommendation regarding the suitability of using the 
statistically adjusted data for these nonredistricting purposes. The 
Committee also drew upon work from other Census Bureau staff, as 
appropriate. This notice provides the Committee's report and the Acting 
Director's statement regarding his determination of the appropriateness 
of statistical adjustment of the Census 2000 data for these purposes.

    Dated: October 30, 2001.
William G. Barron, Jr.,
Acting Director, Bureau of the Census.

October 16, 2001.
Memorandum for Kathleen B. Cooper, Under Secretary for Economic 
Affairs
From: William G. Barron, Jr., Acting Director
Subject: Notification of Decision

    I am attaching the recommendation of the Executive Steering 
Committee for A.C.E. Policy (ESCAP) on whether the Accuracy and 
Coverage Evaluation Survey should be used to adjust Census 2000 data 
for non-redistricting purposes. As in March, I asked ESCAP to 
provide a recommendation because I rely on the knowledge, 
experience, and technical expertise of the Committee and Census 
Bureau staff who have worked extremely hard with tremendous 
dedication and expertise through every phase of Census 2000.
    After assessing considerable new evidence, ESCAP now recommends 
that unadjusted Census 2000 data be used for non-redistricting 
purposes. This new evidence indicates that the A.C.E. overstated 
the net undercount by at least 3 million persons. The error arose 
because the A.C.E. failed to measure a significant number of census 
erroneous enumerations, many of which were duplicates. This level 
of error in the A.C.E. measurement of 
net coverage is such that the A.C.E. results cannot be used in their 
current form. This finding of substantial error, in conjunction with 
remaining uncertainties, necessitates that revisions, based on 
additional review and analysis, be made to the A.C.E. estimates 
before any potential uses of these data can be considered.
    As a member of ESCAP and as Acting Director, I concur with and 
approve the Committee's recommendation that unadjusted data be used 
for non-redistricting purposes and have decided that the Census 
Bureau will release the remaining Census 2000 data products, post-
censal estimates, and survey controls using unadjusted data. It is 
possible that further research and analysis could yield revised 
A.C.E. estimates, and that these revised estimates could be used to 
improve estimates developed as part of the Census Bureau's annual 
population adjustments for survey controls and other purposes.

Report of the Executive Steering Committee for Accuracy and Coverage 
Evaluation Policy on Adjustment for Non-Redistricting Uses

October 17, 2001

Recommendation

    The Executive Steering Committee for A.C.E. Policy (ESCAP) 
recommended on March 1, 2001 that unadjusted census data be used for 
redistricting. After assessing considerable new evidence, ESCAP now 
recommends that unadjusted Census 2000 data also be used for 
non-redistricting purposes. This new evidence indicates that the 
Accuracy and Coverage Evaluation (A.C.E.) overstated the net 
undercount by at least 3 million persons. The error arose because 
the A.C.E. failed to measure a significant number of census 
erroneous enumerations, many of which were duplicates. This level of 
error in the A.C.E. measurement of net coverage is such that the 
A.C.E. results cannot be used in their current form. This finding of 
substantial error, in conjunction with remaining uncertainties, 
necessitates that revisions, based on additional review and 
analysis, be made to the A.C.E. estimates before any potential uses 
of these data can be considered. The Census Bureau will release the 
remaining Census 2000 data products, post-censal estimates, and 
survey controls using unadjusted data. It is, however, reasonable to 
expect that further research and analysis may lead to revised A.C.E. 
estimates that can be used to improve future post-censal estimates.

[[Page 56007]]

    The ESCAP review confirmed the finding in the first ESCAP Report 
that most Census 2000 and A.C.E. operations were of high quality. 
The evaluations continue to demonstrate that improvements were 
achieved over both the 1990 census and the 1990 coverage measurement 
survey. Important new information and methods are now available for 
assessing the A.C.E. and Census 2000. As will be discussed in more 
detail below, final analysis of this new information is still in 
progress. However, the Census Bureau believes that this analysis 
will confirm that Census 2000 made substantial gains in reducing the 
total net undercount, as well as reducing net differential 
undercount. Most of the A.C.E. operations were also seen to be well 
conducted, producing valuable information that, when combined with 
the other evaluation findings, provides important new research data. 
The ESCAP feels confident that its research program will enhance the 
evaluations of Census 2000, contribute to planning for the 2010 
census, and, with further analysis, potentially improve future 
post-censal estimates.
    The ESCAP's primary concern in its March decision was that 
fundamental differences between the Demographic Analysis (DA) 
estimates and the A.C.E. estimates could not be explained. The 
estimates differed widely, both for the total national population 
and for important population groups. The Committee investigated this 
inconsistency extensively but could not adequately explain it in the 
time available for the March decision. The Committee concluded in 
March that the inconsistency must have resulted from one or more of 
three possible scenarios. The first scenario was that all available 
1990 census data, including the census results, the coverage 
measurement survey, and the demographic analysis estimates, 
significantly understated the Nation's population, but that Census 
2000 found this previously un-enumerated population. The second 
scenario was that demographic analysis underestimated population 
growth between 1990 and 2000. The third scenario was that the A.C.E. 
overestimated the Nation's population, raising the possibility of an 
undiscovered problem in the A.C.E. or census methodology.
    The Census Bureau's extensive research over the past eight 
months has been directed at examining demographic analysis, the 
A.C.E., and Census 2000. Demographic analysis research examined 
historic levels of the components of population change to address 
the possibility that the 1990 demographic analysis estimates 
understated the national population (the first scenario). This 
analysis did not reveal any significant problems. The Census Bureau 
investigated the second scenario by revising the preliminary 
estimates of international migration, and hence the foreign-born 
population, using actual Census 2000 long form data. The Census 
Bureau also consulted with outside experts on this work. These 
studies resulted in revisions to the ``Base DA'' that was initially 
examined as part of the March 2001 decision. The revisions reflected 
a larger growth in the foreign-born population during the last 
decade. The current revised demographic analysis estimates are much 
closer to the Alternative DA considered during the March 
deliberations. The A.C.E. and demographic analysis evaluations, when 
analyzed together, explain many of the inconsistencies.
    With regard to the third scenario, the ESCAP's review of the 
accuracy of the A.C.E. and Census 2000 was based on a number of 
evaluation studies, including reinterview studies, re-processing 
studies, and computer searches for duplicate enumerations. This 
research found that the A.C.E. did not account for a large number of 
Census 2000 duplicates, leading to an overstatement of the Census 
2000 net undercount. As described previously, this finding, in 
conjunction with the revisions to demographic analysis, explains to 
a large degree the discrepancies between the A.C.E. and demographic 
analysis. The significance of the error in the A.C.E. treatment of 
duplicates compels the recommendation that the current A.C.E. 
estimates cannot be used to adjust the Census 2000 data.
    The ESCAP notes that its extensive evaluation program has 
provided information that was unavailable for previous decennial 
censuses. This important new information was the result of 
outstanding and innovative work on the part of many Census Bureau 
employees. Additionally, the Committee notes that some of the 
information resulted from new methodologies not available in prior 
censuses. Census 2000 was the first census to capture name 
information in a way that permits nationwide computer matching. The 
evaluation results, including the new tool of name matching, will be 
extremely valuable for evaluating the accuracy of Census 2000, 
planning for the 2010 census, and potentially for improving future 
post-censal estimates. Both census taking and coverage measurement 
are processes that evolve and improve with each census. The Census 
2000 experience will help refine both census and coverage 
measurement processes for future censuses.
    While the ESCAP has recommended against use of the adjusted 
data, the A.C.E.'s original objective of addressing the differential 
undercount must still be pursued. The totality of the evidence 
considered by the Committee leads it to believe that while Census 
2000 successfully lowered the historical pattern of the differential 
undercount, it did not eliminate it. The Census Bureau believes that 
the net undercount remains disproportionately distributed among 
renters and minority populations. With further research, it is 
reasonable to expect that new information can be used to produce 
revised A.C.E. estimates. These revised estimates may then be 
employed to improve post-censal population estimates by reducing 
remaining differential coverage error. It is also expected that 
planning for the 2010 census will greatly benefit from these 
findings, with improved operations to identify and remove duplicates 
and refined methods to improve the accuracy of all census 
operations. The Census Bureau will continue research to design 
improved operations, including coverage measurement studies, for 
future censuses and surveys.

Executive Summary

    After assessing considerable new evidence, the second ESCAP 
Committee (ESCAP II) has recommended that unadjusted Census 2000 
data also be used for non-redistricting purposes. New evidence 
indicates that the Accuracy and Coverage Evaluation (A.C.E.) 
overstated the net undercount by at least 3 million persons, and 
that the cause of this error was the A.C.E.'s failure to measure a 
significant number of census erroneous enumerations, many of which 
were duplicates. This level of error in the A.C.E. measurement of 
net coverage is such that the A.C.E. results cannot be used in their 
current form. This finding of substantial error, in conjunction with 
remaining uncertainties, necessitates that revisions, based on 
additional review and analysis, be made to the A.C.E. estimates 
before any potential uses of these data can be considered. The 
Census Bureau will release the remaining Census 2000 data products, 
post-censal estimates, and survey controls using unadjusted data. It 
is, however, reasonable to expect that further research and analysis 
may lead to revised A.C.E. estimates that can be used to improve 
future post-censal estimates.
    ESCAP II has also confirmed the finding in the first ESCAP 
Report that most Census 2000 and A.C.E. operations were of high 
quality. More recent evaluations continue to demonstrate that 
improvements were achieved over both the 1990 census and the 1990 
coverage measurement survey. Important new information and methods 
are now available for assessing the A.C.E. and Census 2000. As will 
be discussed in more detail below, final analysis of the effects of 
this new information is still in progress. However, the Census 
Bureau believes that this analysis will confirm that Census 2000 
made substantial gains in reducing the total net undercount, as well 
as the net differential undercount. Most of the A.C.E. operations 
were also seen to be well conducted, producing valuable information 
that, when combined with the other evaluation findings, provides 
important new research data. The ESCAP feels confident that the 
Census Bureau's continuing research program will enhance the 
evaluations of Census 2000, contribute to planning for the 2010 
census, and, with further analysis, potentially improve the post-
censal estimates.
    The ESCAP's primary concern in its March decision was that 
demographic analysis and the A.C.E. estimates differed widely, both 
for the total national population and for important population 
groups. The Committee concluded in March that the inconsistency must 
have derived from one or more of three possible scenarios. The first 
scenario was that all available 1990 census data, including the 1990 
census, the 1990 coverage measurement survey, and the 1990 
demographic analysis estimates, significantly understated the 
Nation's population, while Census 2000 included portions of this 
previously un-enumerated population. The second scenario was that 
demographic analysis estimates underestimated population growth 
between 1990 and 2000. The third scenario was that the A.C.E. 
overestimated the Nation's population, raising the possibility of an 
undiscovered problem with the A.C.E. or census methodology. The ESCAP 
also identified additional technical concerns that are documented in 
the previous report.

[[Page 56008]]

Areas of Research

    In the months since the ESCAP I Report, the Committee embarked 
on a second round of deliberations to address the concerns 
identified in the report and to enable the Census Bureau to 
recommend whether Census 2000 data should be adjusted for future 
uses. The future uses in consideration included the post-censal 
population estimates, demographic survey controls, and the 
production of Census 2000 long form data products. The ESCAP I 
Committee did not have current results for certain measures of 
A.C.E. accuracy, and was forced to use 1990 data on potential A.C.E. 
errors. The ESCAP therefore directed and documented that a number of 
evaluations be conducted to inform the deliberations. Some of the 
evaluations were designed to provide current measures of accuracy 
for the various components of error. These evaluations involved 
additional technical research, field work, and data processing, as 
well as new computer matching and simulation research. The 
evaluations include:

Demographic Analysis (DA) Research

    The DA research program examined historical levels of the 
components of population change to address the possibility that the 
1990 DA estimates understated the Nation's population and that 
demographic analysis did not capture the full population growth in 
the last decade. The Census Bureau consulted with outside 
demographic experts to plan and conduct its research program, 
focusing on the methodologies and underlying estimates of the 
components of population change. The research activities 
concentrated on two major areas--international migration and the 
robustness of the DA estimates.

Measurement of Erroneous Enumerations, Including Duplication

    Erroneous enumerations refer to individuals who should not be 
included in the census counts because they are duplicated, 
fictitious, or live someplace other than where they were enumerated. 
While the ESCAP I Report did not identify erroneous enumerations as 
an area of concern, Census Bureau researchers quickly noted that 
Census 2000 erroneous enumerations differed substantially from 1990 
measures in ways that were not readily understood. Studies included 
the Measurement Error Reinterview/Evaluation Followup (hereinafter 
called the EFU) and the Person Duplication Studies. EFU results were 
used to determine how well the A.C.E. identified erroneous 
enumerations. The EFU was based on a reinterview of a sample drawn 
from the A.C.E. clusters. The Person Duplication Studies used 
computer matching techniques to identify Census 2000 duplicate 
enumerations throughout the United States, and to determine whether 
the A.C.E. estimates had correctly accounted for these duplications. 
These studies used computer matching methods not available in 
earlier censuses.
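
    As an illustration only, the kind of computer matching described 
above can be sketched as grouping records on a normalized name and date 
of birth. The record layout, field names, and exact-match rule below are 
hypothetical assumptions made for this sketch; this section does not 
describe the matching rules the Person Duplication Studies actually used.

```python
from collections import defaultdict


def find_duplicates(records):
    """Group record ids that share a normalized (name, date of birth) key."""
    groups = defaultdict(list)
    for rec in records:
        # Normalize case and collapse repeated whitespace in the name.
        name_key = " ".join(rec["name"].lower().split())
        groups[(name_key, rec["dob"])].append(rec["id"])
    # Keep only keys that appear on more than one record.
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


# Hypothetical records: one person enumerated in two different states.
records = [
    {"id": 1, "name": "Maria  Lopez", "dob": "1970-03-14", "state": "TX"},
    {"id": 2, "name": "maria lopez", "dob": "1970-03-14", "state": "FL"},
    {"id": 3, "name": "John Smith", "dob": "1985-11-02", "state": "OH"},
]
duplicates = find_duplicates(records)
```

    In this sketch, records 1 and 2 form one candidate duplicate pair 
even though they were enumerated in different states. A real study would 
presumably also need to guard against distinct persons who share a name 
and birth date.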

Measurement of Census Omissions

    Census omissions refer to individuals who should have been 
counted in the census but were not. The A.C.E. methodology must 
accurately account for both erroneous enumerations, as described 
above, and census omissions. The A.C.E. identifies omissions by 
matching an independent sample to the census. The accuracy of this 
measurement of omissions thus depends on the accuracy of the 
matching, as well as the accuracy of the information collected by 
the independent sample. Census omissions were evaluated in the 
Matching Error Study, in which expert matchers re-matched a sample 
of the A.C.E. to determine the accuracy of the A.C.E. matching 
process. Omissions were also evaluated in the EFU described above to 
measure the accuracy of the A.C.E. information on Census Day 
residence, including whether persons had moved since Census Day.
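
    The dependence of a coverage estimate on both the match rate and the 
measurement of erroneous enumerations can be illustrated with the 
textbook dual-system (capture-recapture) form. This is a simplified 
sketch under stated assumptions, not the A.C.E. estimator itself: the 
function name, the single-cell formula, and all numbers are hypothetical.

```python
def dual_system_estimate(census_count, correct_enum_rate, match_rate):
    """Textbook dual-system estimate for one estimation cell.

    census_count      -- census enumerations in the cell
    correct_enum_rate -- share of those enumerations judged correct
    match_rate        -- share of the independent sample found in the census
    """
    if not 0 < match_rate <= 1:
        raise ValueError("match_rate must be in (0, 1]")
    return census_count * correct_enum_rate / match_rate


# Hypothetical cell: 1,000 census records and a 92 percent match rate.
census_count = 1_000
measured = dual_system_estimate(census_count, 0.95, 0.92)

# Suppose some duplicates went undetected, so the true correct
# enumeration rate is lower than the measured one.
corrected = dual_system_estimate(census_count, 0.94, 0.92)

overstatement = measured - corrected  # positive: the estimate was too high
```

    Under these assumptions, failing to detect erroneous enumerations 
raises the measured correct enumeration rate, which inflates the 
population estimate and therefore the estimated net undercount.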

Missing Data Studies

    Missing data occurs in the A.C.E. if, after all attempts, there 
remain persons for whom complete data are not available, including 
demographic characteristics such as age or race. Missing data also 
includes the status of whether a person matched, was a resident on 
Census Day, or was correctly enumerated. The latter types of missing 
data can seriously affect the accuracy of coverage measurement 
surveys such as the A.C.E. The A.C.E. used a statistical model to 
account for the effects of missing data. The ESCAP directed the 
development of alternative missing data models to assess the effect 
on the estimates of using different assumptions to predict the 
effects of missing data.

Balancing Error

    The previous ESCAP report indicated concerns with balancing 
error. Balancing error occurs when the method used to determine the 
number of omissions is different from the method used to determine 
which records are correctly included in the census. The specific 
concern was that the area for matching to find omissions was 
different from the area used to determine erroneous enumerations. 
The ESCAP posited various scenarios that could explain the concerns 
with balancing error, ranging from small to very serious effects on 
the A.C.E. estimates. In order to investigate these concerns, 
additional field operations were conducted.

Synthetic Error Study

    The A.C.E. estimation methodology produced estimated coverage 
correction factors which were carried down within the post-strata in 
a process referred to as synthetic estimation. The key assumption 
underlying synthetic estimation is that net census coverage is 
relatively uniform within the post-strata. Failure of this 
assumption leads to synthetic error. The Census Bureau is concerned 
with synthetic error since it may affect the accuracy of small area 
estimates and cannot be directly estimated. ESCAP I examined the 
effects of synthetic error by studying ``artificial populations,'' 
populations created with surrogate variables that are known for the 
entire population, and are developed to reflect the distribution of 
net coverage error. ESCAP II directed the preparation of additional 
artificial populations.
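
    A minimal sketch of synthetic estimation, and of how synthetic error 
arises, assuming a single post-stratum split across two small areas. The 
area names, counts, and "true" populations are invented for illustration; 
in practice the true populations are unknown, which is why synthetic 
error cannot be directly estimated.

```python
def synthetic_estimates(area_census_counts, coverage_correction_factor):
    """Apply one post-stratum factor to every area's census count."""
    return {
        area: count * coverage_correction_factor
        for area, count in area_census_counts.items()
    }


# Hypothetical post-stratum: equal census counts, unequal true coverage.
census = {"area_a": 1_000, "area_b": 1_000}
truth = {"area_a": 1_010, "area_b": 1_050}  # unknowable in practice

# A factor that is correct for the post-stratum as a whole: 2,060 / 2,000.
ccf = sum(truth.values()) / sum(census.values())

estimates = synthetic_estimates(census, ccf)
synthetic_error = {area: estimates[area] - truth[area] for area in census}
```

    The post-stratum total is reproduced, but area_a is overestimated 
and area_b is underestimated by the same amount because net coverage is 
not uniform within the post-stratum.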

Evaluation Results

    Demographic analysis research examined historical levels of the 
components of population change to address the possibility that the 
1990 demographic analysis estimates understated the national 
population (the first scenario). This analysis did not reveal any 
significant problems. The Census Bureau investigated the second 
scenario by revising the estimates of international migration using 
preliminary Census 2000 long form data, and estimates of the numbers 
of births, using more current assumptions about birth registration. 
The Census Bureau also consulted with outside experts on this work. 
This analysis resulted in revisions to the Base DA that was 
initially examined as part of the ESCAP I decision. The revisions 
reflected a larger growth in the foreign-born population during the 
last decade. The current Revised DA estimates considered by ESCAP II 
are much closer to the Alternative DA considered during the ESCAP I 
deliberations. Many of the inconsistencies previously noted are 
removed when the Revised DA estimates are viewed in light of the 
A.C.E. evaluations. The Revised DA national estimate of 281.8 
million for the U.S. resident population is 2.2 million higher than 
the Base DA and about 0.6 million lower than the Alternative DA. The 
Revised DA net undercount rate of 0.12 percent compares to a net 
overcount of 0.65 percent implied by the Base DA, and a net 
undercount of 0.32 percent using the Alternative DA.
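
    The relationships among the three DA estimates can be checked with a 
short calculation. The sketch assumes the conventional definition of the 
net undercount rate, (DA estimate minus census count) divided by the DA 
estimate, and uses the official Census 2000 resident population count of 
281,421,906, which is not stated in this passage. Because the DA totals 
above are rounded to the nearest 0.1 million, the computed rates only 
approximate the reported ones.

```python
CENSUS_2000_COUNT = 281_421_906  # official resident count; not from this passage


def net_undercount_pct(da_estimate, census_count=CENSUS_2000_COUNT):
    """Net undercount as a percent of the DA estimate (negative = overcount)."""
    return 100.0 * (da_estimate - census_count) / da_estimate


revised_da = 281.8e6                 # stated in the report
base_da = revised_da - 2.2e6         # Revised DA is 2.2 million higher than Base DA
alternative_da = revised_da + 0.6e6  # Revised DA is about 0.6 million lower

rates = {
    "base": net_undercount_pct(base_da),
    "revised": net_undercount_pct(revised_da),
    "alternative": net_undercount_pct(alternative_da),
}
```

    With these rounded inputs, the Base DA implies a net overcount and 
the Revised and Alternative DA imply small net undercounts, in the same 
order as the reported rates.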

Erroneous Enumerations

    The studies examining the accuracy of the measurement of 
erroneous enumerations initially found serious errors that would 
have resulted in a large overstatement of the population by the 
A.C.E. The seriousness of these findings prompted the Committee to 
direct further work to make sure that the findings were correct. 
This additional review indicated that a significant problem existed 
with the measurement of erroneous enumerations, but also indicated 
that the study findings were subject to uncertainties resulting from 
a large number of cases left unresolved or conflicting. The Person 
Duplication Studies provided additional information underscoring the 
seriousness of the errors in measuring erroneous enumerations. These 
duplication studies found that the A.C.E. had seriously understated 
the level of erroneous enumerations because it incompletely 
measured census duplications, and that the EFU had not accounted 
for a significant part of this understatement. They also helped to 
explain some of the uncertainty that arose from the rework of the 
EFU. The net effect of these studies was the conclusion that the 
A.C.E. overstated the level of undercount by at least 3 million 
persons. The level of this error is such that the ESCAP determined 
that the unadjusted data should be used.

Census Omissions

[[Page 56009]]

    With regard to studies of census omissions, the Matching Error 
Study indicated that the A.C.E. overstated the net undercount due to 
P-sample matching error 
by about 385,000. The EFU indicated that a substantial number of 
movers were changed to nonmovers and vice versa. The net effect of 
these mover status changes suggests an overestimate of the match 
rate and therefore an understatement in the A.C.E. estimates of 
about 450,000. At the national level there is therefore a small net 
effect of about 65,000 on the accuracy of the measurement of census 
omissions. However, more research must be conducted to further study 
these effects.
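
    The national net figure follows from offsetting the two effects. The 
sketch below adopts a sign convention that is an assumption of this 
example: positive values mean the A.C.E. overstated the net undercount, 
and negative values mean it understated it.

```python
# Effects on the A.C.E. net undercount estimate, in persons (from the text).
matching_error_effect = 385_000   # overstatement from P-sample matching error
mover_status_effect = -450_000    # understatement from mover status changes

net_effect = matching_error_effect + mover_status_effect
direction = "overstatement" if net_effect > 0 else "understatement"
```

    Under this convention the two errors largely cancel, leaving a net 
understatement of about 65,000 at the national level.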

Missing Data

    The Committee examined a variety of alternative models to 
account for the effects of missing data. These models gave a wide 
range of results, implying widely varying effects on the A.C.E. 
estimates. The data examined by the Committee make clear that 
alternative missing data models both understated and overstated the 
effects of missing data on the A.C.E. estimates, depending on the 
choice of model. The Committee ultimately viewed the choice of model 
as an increase in the uncertainty associated with the A.C.E. 
results, but did not find evidence of bias resulting from this 
choice of model. This uncertainty should be considered in further 
analysis of the A.C.E. estimates.

Balancing Error

    ESCAP I's concern with balancing error has for the most part 
been resolved, as further research indicated that the previously 
observed discrepancy did not appreciably influence the A.C.E. 
estimates.

Total Error Model

    ESCAP I used a total error model to consolidate its research and 
to produce an overall assessment of A.C.E. accuracy. ESCAP II 
directed that an updated model be prepared to account for 
information from the new evaluation studies. The timing of some of 
the new evaluations, along with the complexities of both the studies 
and the A.C.E. design, did not allow preparation of an updated model 
that would incorporate all errors that impact the A.C.E. estimates. 
As discussed more fully in the body of the report, the ESCAP could 
not develop or verify a new total error model that would take into 
account all of the errors discovered in the EFU, Matching Error 
Study, and Person Duplication Studies. Even without the information 
from an updated total error model, however, it was clear to the 
Committee that the magnitude of the discovered errors precluded a 
recommendation in favor of the adjusted data.

Synthetic Error

    Consideration of the synthetic error studies requires the 
completion of the total error model and will be included in the 
continued research.

Other Concerns

    Additional studies allayed other concerns about the A.C.E. and 
the census. Studies revealed no evidence of significant 
contamination bias. The Committee concluded that the effect of 
excluding reinstated census people from the A.C.E. was minimal. The 
Committee further concluded that the kind, level and pattern of 
whole person imputation in Census 2000 did not call the A.C.E. 
results into question.

Next Steps

    While the ESCAP has recommended against use of the adjusted 
data, the A.C.E.'s original objective of addressing the differential 
undercount must still be pursued. The totality of the evidence 
considered by the Committee leads it to believe that while Census 
2000 successfully lowered the historical pattern of the differential 
undercount, it did not eliminate it. The net undercount remains 
disproportionately distributed among renters and minority 
populations. With further research, it is reasonable to expect that 
new information can be used to produce revised A.C.E. estimates. The 
evaluation results, including the new measurement tool of name 
matching, will be extremely valuable for evaluating the accuracy of 
Census 2000, planning for the 2010 census, and potentially for 
improving the post-censal estimates. Both census taking and coverage 
measurement are processes that evolve and improve with each census. 
The Census 2000 experience will help refine both census and coverage 
measurement processes for future censuses.

Table of Contents

Executive Summary
    Areas of Research
    Evaluation Results
    Next Steps

ESCAP II Report
Introduction
    Background
    ESCAP II Proceedings
    Non-redistricting uses of the data
ESCAP II Research
    Demographic Analysis
      International Migration
      Measurement of Vital Events
      Results of Revised DA
    Research to Evaluate the A.C.E. and Census 2000
      Matching Error Study
      Evaluation Followup
      Person Duplication Studies
    Measurement of Erroneous Enumerations, Including Duplicates
    Measurement of Census Omissions
    Correlation Bias
    A.C.E. Missing Data
    Balancing Error
    Conditioning
    Reinstated Late Additions
    Census 2000 Imputations
    Total Error Model and Loss Function Analysis
    Synthetic Estimation
Conclusion
Attachments

ESCAP II Report

Introduction

Background

    On March 1, 2001, the Acting Director of the Census Bureau 
recommended to the Secretary of Commerce that unadjusted census data 
be used as the Census Bureau's official redistricting data. This 
recommendation was in accord with the recommendation of the 
Executive Steering Committee for A.C.E. Policy (ESCAP). The ESCAP 
\1\ was unable to conclude, based on information available at the 
time, that adjusted Census 2000 data were more accurate for 
redistricting. The ESCAP I Report is available on the Census 
Bureau's website, along with a voluminous Administrative Record 
supporting this recommendation.
---------------------------------------------------------------------------

    \1\ For clarity, the Committee that produced the March 1, 2001, 
ESCAP Report is sometimes referred to herein as ``ESCAP I'' and the 
March 1 report as the ``ESCAP I Report.'' The Committee that has 
been meeting since March 1, 2001, is referred to as ``ESCAP II.''
---------------------------------------------------------------------------

    The primary issue that precluded ESCAP I from recommending use 
of the adjusted data was the unexplained difference between the 
A.C.E. and Demographic Analysis estimates of the population. 
Demographic analysis (DA) initially estimated the national total 
population to be below the census count, while the A.C.E. estimated 
the population to be above the census count. This discrepancy raised 
the significant possibility of an undetected problem with the A.C.E. 
or the census. ESCAP I also identified concerns with balancing and 
synthetic estimation error as potential problems in the adjusted 
data. The Committee directed the preparation of an extensive 
evaluation program to inform its deliberations relating to the 
proposed use of adjusted data for non-redistricting purposes.

ESCAP II Proceedings

    In the months since the ESCAP I Report, the Committee has 
embarked on a second round of deliberations to address the concerns 
identified in the report and to enable the Census Bureau to 
recommend whether the adjusted Census 2000 data should be used for 
non-redistricting purposes. The evaluations prepared for this 
purpose, the ESCAP II Report Series, set forth the underlying data 
that support the Committee's findings. The future uses under 
consideration include post-censal 
population estimates, demographic survey controls, and the census 
long form data products. Some of the required evaluations involved 
additional research, including additional field work and matching 
work.
    ESCAP II considered a wide variety of research and analyses, and 
heard presentations of the reports on the attached list (Attachment 
1). Some of these presentations provided background information to 
help the Committee interpret the results of other studies, while 
others bore directly on the adjustment recommendation. While the 
Committee considered and deliberated on all of the listed reports, 
this discussion will focus on those most directly relevant to the 
comparative accuracy of the adjusted and unadjusted data. This 
research was conducted over many months and represents diligent and 
thorough statistical and demographic analysis.\2\
---------------------------------------------------------------------------

    \2\ The ESCAP II Report Series does not represent the entirety 
of the Census Bureau's evaluation of Census 2000. The Census 
Bureau's formal Census 2000 Evaluation Program provides a 
comprehensive evaluation of all Census operations and programs. The 
reports in the ESCAP II series are only those necessary to inform 
the ESCAP's recommendation.

---------------------------------------------------------------------------

[[Page 56010]]

    The Associate Director for Decennial Census originally chartered 
the ESCAP on November 26, 1999, and charged the Committee to 
``advise the Director in determining policy for the A.C.E. and the 
integration of the A.C.E. results into the census for all purposes 
except Congressional reapportionment.'' Although there was a change 
in the Associate Director for the Decennial Census position in June 
2001, ESCAP II continued to be chaired by John Thompson to maintain 
continuity. The ESCAP resumed meeting on March 7, 2001, and met a 
total of 32 times, sometimes with more than one meeting per day. The 
ESCAP represents a body of senior career Census Bureau 
professionals, with advanced degrees in relevant technical fields 
and/or decades of experience in the federal statistical system. All 
are highly competent to evaluate the relative merits of the A.C.E. 
data versus the census data and are recognized for their extensive 
contributions to the professional community.
    As in the ESCAP I process, the early sessions were primarily 
educational, designed to inform Committee members of the research 
operations and to present general information about non-
redistricting uses of the data. The second phase involved 
presentation by knowledgeable employees of the new data and analyses 
as they became available. The Committee reviewed the data and 
analyses, sometimes asking staff to provide additional and new 
information. The third phase was deliberation, where the Committee 
members met privately. The final and briefest stage was review, 
where Committee members commented on the draft report. Again, as in 
the ESCAP I process, minutes were prepared for all sessions, except 
for the final ones, which were private deliberations.
    During the education and evidence presentation phases, the Chair 
generally arranged presentations on major issues, issues that he 
identified on his own initiative or on the suggestion of Committee 
members. During the evidence presentation stage, authors of the 
analysis reports presented their data and conclusions to the 
Committee. The deliberation and review phases were less structured 
with various members raising topics for discussion and asking for 
evidence. No formal vote was held; this Report reflects a consensus 
of the ESCAP.

Non-Redistricting Uses of the Data

    The ESCAP's recommendation covers the three non-redistricting 
uses of census data: post-censal estimates, demographic survey 
controls, and Census 2000 long form products. Certain Census Bureau 
data products have already been issued using only the unadjusted 
data, including the Census 2000 Redistricting Data Summary File, 
Demographic Profiles, Congressional District Demographic Profiles, 
Summary File 1 Data, and reports in the Census 2000 Brief Series.\3\
---------------------------------------------------------------------------

    \3\ These data products can be found at http://factfinder.census.gov.
---------------------------------------------------------------------------

    Post-censal estimates are made by updating the most recent 
census base with estimates of population change (births, deaths, and 
net migration). As directed by the Census Act, the Census Bureau 
prepares post-censal estimates at the national, state, and county 
level every year, and at the functioning governmental unit level 
every other year.\4\ These estimates have a variety of uses, most 
notably in funding allocations, as the basis for sample survey 
controls, and as denominators for many important statistics.
---------------------------------------------------------------------------

    \4\ 13 U.S.C. 181.
---------------------------------------------------------------------------
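
    The updating described above (a census base plus births, minus 
deaths, plus net migration) can be illustrated with a minimal sketch. 
The function name and all input figures are hypothetical and are not 
Census Bureau estimates or production methods.

```python
def postcensal_estimate(base, births, deaths, net_migration):
    """Update a census base with estimated components of population change."""
    return base + births - deaths + net_migration


# Hypothetical inputs for a single area, for illustration only.
estimate = postcensal_estimate(
    base=1_000_000,       # most recent census count for the area
    births=14_000,        # births since the census
    deaths=9_000,         # deaths since the census
    net_migration=3_500,  # in-migrants minus out-migrants
)
print(estimate)
```

    Because the base enters the sum directly, any coverage error in 
the census base carries forward into every later estimate, which is 
why the choice between adjusted and unadjusted data matters for this 
use.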

    The accuracy of the post-censal estimates for funding 
allocations is critical, as about $200 billion are allocated based 
on these data each year. Medicaid (Title XIX) is the largest program 
to distribute federal funding based on population estimates, 
distributing over $100 billion each year based on the post-censal 
estimates. Community Development Block Grants from the Department of 
Housing and Urban Development, and Title I Basic, Concentration, and 
Targeted Grants from the Department of Education are two additional 
federal programs that use post-censal estimates as factors in their 
funding formulas to distribute federal monies. The individual states 
also have within-state fund allocation programs, many of which use 
post-censal estimates to allocate funds to sub-state areas.
    Many federal agencies use post-censal estimates as denominators 
to produce per capita statistics. Examples are per capita income, 
crime statistics, incidence of certain health conditions, birth 
rates, and mortality rates. The numerators of these statistics can 
be obtained at various points in time throughout the decade. In the 
absence of updated information, calculating these kinds of 
statistics on a static 2000 denominator would be misleading; 
therefore, many federal agencies use post-censal estimates of 
population.
    Demographic survey controls are used by many national sample 
surveys to transform the data they collect into nationally 
representative estimates. The most notable is the Current Population 
Survey, or CPS, which is used to calculate the monthly unemployment 
rate. Sample surveys generally have poorer coverage than a census; 
therefore, in order to improve the accuracy of estimates from a 
sample survey, the survey estimates are controlled to independent 
measures of the number of people in certain age, sex, race, and 
Hispanic origin groups, such as the post-censal estimates.
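    As an illustration only, controlling survey estimates to 
independent totals can be sketched as a simple ratio adjustment within 
each demographic cell. This is an assumed, simplified form; actual 
survey weighting procedures are more elaborate, and the cell labels, 
weights, and control totals below are hypothetical.

```python
from collections import defaultdict


def control_weights(records, controls):
    """Scale survey weights so each cell's weighted total equals its control.

    records:  list of (cell, weight) pairs from a sample survey
    controls: dict mapping cell -> independent population total
    """
    cell_totals = defaultdict(float)
    for cell, weight in records:
        cell_totals[cell] += weight
    return [
        (cell, weight * controls[cell] / cell_totals[cell])
        for cell, weight in records
    ]


# Hypothetical cells; real controls use age, sex, race, and Hispanic
# origin groups.
sample = [("men 18+", 900.0), ("men 18+", 850.0), ("women 18+", 1000.0)]
controls = {"men 18+": 2000.0, "women 18+": 1100.0}
adjusted = control_weights(sample, controls)
```
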
    The ESCAP Committee also considered whether adjusted or 
unadjusted Census 2000 data should be used for the controls for 
estimates based on data from the Census 2000 long form. The long 
form collects more extensive characteristic data from a sample of 
about seventeen percent of the population. Long form data are used 
to provide local communities with data on education, employment, 
housing, and various other social and demographic characteristics 
essential to efficient planning. Additionally, the long form 
provides the detailed local demographic and social characteristics 
used in some federal formula allocation programs. In order to 
produce estimates for the country as a whole from this sample, 
Census 2000 data from the short form items are used as controls.

ESCAP II Research

    In the months since the ESCAP I Report, the Committee embarked 
on a second round of deliberations to address the concerns 
identified in the report and to enable the Census Bureau to 
recommend whether adjusted Census 2000 data should be applied for 
non-redistricting uses. ESCAP II, therefore, directed the 
preparation of a number of evaluation studies, as described in 
detail in Attachment 2. Research centered on four areas: 
demographic analysis, the A.C.E., Census 2000, and synthetic error. 
The results of this research are set forth below.

Demographic Analysis

    ESCAP I's primary concern was that DA estimates were 
inconsistent with A.C.E. estimates. The Census Bureau expected, 
based on past experience, that demographic analysis would posit a 
higher estimate of the total population than the A.C.E. because of 
the presence of correlation bias, and that the two estimates would 
agree generally on the coverage of certain populations. Instead, the 
Base DA estimates were lower than both the Census 2000 population 
counts and the A.C.E. estimates. In response, the Census Bureau 
developed its Alternative DA estimates by doubling the unauthorized 
immigration assumed to have occurred during the 1990's. Doing so 
yielded a number of foreign born for 2000 that was roughly 
consistent with that reported by the March 2000 Current Population 
Survey.\5\ The Alternative DA estimates were, however, still 
significantly lower than the A.C.E. estimates. The Alternative DA 
indicated that Census 2000 undercounted the population by 0.32 
percent, while the A.C.E. produced a net undercount estimate of 1.15 
percent.\6\
---------------------------------------------------------------------------

    \5\ The March Current Population Survey was reweighted using the 
Census 2000 counts by age, race, sex, and Hispanic origin for this 
comparison.
    \6\ This figure differs from the 1.18 percent usually quoted for 
the A.C.E. because the A.C.E. and DA estimate different populations. 
DA estimates the total population, while the A.C.E. estimates the 
household population, which excludes group quarters.
---------------------------------------------------------------------------

    ESCAP I concluded that the inconsistency in the estimates of the 
total national population must have derived from one or more of 
three possible explanations. The first explanation was that all 
available 1990 census data, including the census results, the 1990 
coverage measurement survey, and the 1990 DA estimates, 
significantly understated the Nation's population, but that Census 
2000 found this previously un-enumerated population. The second 
explanation was that DA underestimated population growth between 
1990 and 2000. The third explanation was that the A.C.E. 
overestimated the Nation's population. ESCAP II directed that 
further research on demographic analysis be conducted. It focused on 
two main topics: international migration and measurement of vital 
events like births and deaths.

International Migration

    Assumptions regarding international migration were the most 
uncertain component in the demographic analysis estimates completed 
by March 1, 2001.

[[Page 56011]]

Although the research agenda for the March through October period 
focused primarily on those components of international migration 
that are less well measured (e.g., emigration, temporary migration, 
and unauthorized migration), the work also included research into 
legal immigration and the demographic characteristics of migrants 
used in the March 2001 DA estimates.
    Part of the analysis involved discussions with independent 
experts on demographic analysis and international migration. The 
purpose of a March 20, 2001, meeting was to explain how the DA estimates 
differed from the A.C.E. estimates, and to discuss how to prioritize 
short-term and long-term research activities. Attendees included 
experts from the statistical community, academia, state agencies, 
the Census Bureau's advisory committees, professional organizations, 
and international organizations. A nearly unanimous recommendation 
from these experts was to focus on assumptions and estimates of the 
components of international migration, as these numbers were subject 
to the most uncertainty. Because of scheduling conflicts, two 
smaller meetings with other migration experts were held at the 
annual meeting of the Population Association of America on March 29-
30, 2001.
    Expert advice was sought again, on September 24, 2001, after 
completion of the original research activities (validation of the 
1990 estimates and updated 2000 estimates) that produced the revised 
DA estimates. Although these experts generally agreed with the 
methodology used to calculate components of international migration, 
they had concerns about the assumptions regarding the undercount of 
international migrants. Specifically, they believed the undercount 
assumption of 15 percent for unauthorized migrants, which was 
incorporated in the Revised DA, was probably too high, especially 
given the A.C.E. undercounts for other hard-to-enumerate groups. In 
addition, they urged renaming the residual migrant category as the 
residual foreign-born, or separating the residual foreign born into 
known components (``quasi-legal'' migrants) and the implied 
unauthorized migrant population. Both of these suggestions were 
incorporated into a subsequent sensitivity analysis.
    The sensitivity analysis of assumptions about coverage of 
various components of the foreign-born population showed that the 
total number of foreign born did not vary enough to have much effect 
on the DA estimate of the total population. For example, the lower 
bound assumption of 3.3 percent net undercount of the foreign-born 
equated to a population of 281.3 million, or more than 3 million 
people lower than the A.C.E. total population. The upper bound 
assumption of 6.7 percent was consistent with a population of 282.5 
million--still more than 2 million lower than the A.C.E. total 
population. These results led the Census Bureau to conclude that the 
Revised DA was an appropriate benchmark for assessing Census 2000 
and the A.C.E. estimates.
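
    The gaps cited in the sensitivity analysis can be checked against 
the A.C.E. total reported in Table A. The sketch below uses the rounded 
DA totals quoted in the text, so the results are approximate; the 
variable names are illustrative.

```python
ACE_TOTAL = 284_683_782  # A.C.E. total population, from Table A

# DA totals implied by the foreign-born undercount bounds (rounded in
# the text to the nearest 0.1 million).
da_bounds = {
    "lower bound (3.3 percent)": 281_300_000,
    "upper bound (6.7 percent)": 282_500_000,
}

# Amount by which the A.C.E. total exceeds each DA total.
gaps = {label: ACE_TOTAL - total for label, total in da_bounds.items()}

for label, gap in gaps.items():
    print(f"{label}: A.C.E. exceeds DA by {gap / 1e6:.1f} million")
```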

Measurement of Vital Events

    Other research examined the remaining assumptions underlying the 
DA components of change, including the birth, death, and Medicare 
components. Although estimates of deaths and the size of the elderly 
population did not change much, the estimates of historical births 
changed because of this research. The principal outcome was a 
revision in the assumptions about registration completeness of 
births since 1968. The previous DA estimates assumed that all births 
in years since 1968 (the last year of testing birth registration 
completeness) were registered at the same percent (99.2 percent). 
For the Revised DA estimates, registration completeness gradually 
reached 100 percent by 1985 (the first year natality statistics were 
reported electronically from all the States), and remained at 100 
percent through 2000. This revision lowered the estimated number of 
births for 1968-2000 by 715,000 (which lowered the Revised DA 
estimate of the total population in 2000 by the same amount).\7\
---------------------------------------------------------------------------

    \7\ ESCAP II Report No. 1, ``Demographic Analysis Results.''
---------------------------------------------------------------------------

Results of Revised DA

    The research undertaken between March and October allayed two 
fundamental concerns: first, the possibility that the Alternative DA 
did not capture the full growth of the population between 1990 and 
2000, and second, the possibility that the 1990 DA was lower than 
the true population. In fact, the cumulative effect of the research 
on immigration, births, and deaths led to Revised DA estimates that 
were only slightly different from the Alternative DA. In other 
words, the inconsistency between the Alternative DA and the A.C.E. 
estimates was not the result of unexplained problems in DA. These 
results, in combination with other evidence, led the ESCAP to 
conclude that the A.C.E. overestimated the Nation's total 
population.
    More specifically, the Revised DA lowered the net undercount 
rates from 1.85 to 1.65 percent in 1990, and from 0.32 to 0.12 
percent in 2000, but did not alter the DA finding that the net 
undercount rate in 2000 was substantially lower than in 1990.\8\ The 
Revised DA continued to measure a lower net undercount than the 
A.C.E., and in fact was very close to the Alternative DA estimate 
used by ESCAP I in March. The Revised DA estimated a net undercount 
of 0.3 million, or 0.12 percent, compared with the A.C.E. estimate 
of a net undercount of 3.3 million, or 1.15 percent. Population 
totals from the Base DA, Alternative DA, and Revised DA, along with 
the Census 2000 counts and the A.C.E. estimates, are shown in Table 
A. The corresponding numerical and percentage undercounts are shown 
in Figure 1.
---------------------------------------------------------------------------

    \8\ ESCAP II Report No. 1, ``Demographic Analysis Results.''

   Table A.--Resident Population Totals from Census 2000, Demographic
                 Analysis, and the A.C.E.: April 1, 2000
------------------------------------------------------------------------
                                                               Total
                         Source                             population
------------------------------------------------------------------------
Base DA (March).........................................     279,598,121
Census 2000.............................................     281,421,906
Revised DA (September)..................................     281,759,858
Alternative DA (March)..................................     282,335,711
A.C.E...................................................     284,683,782
------------------------------------------------------------------------
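
    The net undercounts quoted in the text can be derived from the 
Table A totals. The sketch below assumes the rate is expressed as a 
percent of each estimate; under that assumption it reproduces the 0.12, 
0.32, and 1.15 percent figures cited above.

```python
# Population totals from Table A (April 1, 2000).
CENSUS_2000 = 281_421_906
estimates = {
    "Base DA": 279_598_121,
    "Revised DA": 281_759_858,
    "Alternative DA": 282_335_711,
    "A.C.E.": 284_683_782,
}


def net_undercount(estimate, census=CENSUS_2000):
    """Return (persons, percent); negative values denote a net overcount.

    Assumption: the rate is expressed as a percent of the estimate.
    """
    persons = estimate - census
    return persons, 100.0 * persons / estimate


results = {name: net_undercount(total) for name, total in estimates.items()}
for name, (persons, percent) in results.items():
    print(f"{name:15s} {persons:>12,d} {percent:6.2f}")
```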


[[Page 56012]]

[GRAPHIC] [TIFF OMITTED] TN05NO01.011

    As shown in Table B below, the Revised DA implied a greater 
reduction than the A.C.E. in net undercount in Census 2000 compared 
with the 1990 census. Under the Revised DA, the net undercount rate 
was reduced by 1.53 percentage points, from 1.65 percent in 1990 to 
0.12 percent in 2000. In contrast, the A.C.E. estimate of 1.15 
percent net undercount in 2000 was 0.43 percentage points lower than 
the 1.58 percent in 1990. Additionally, both DA and the A.C.E. 
measured a reduction in the net undercount rates of Black and 
nonBlack children compared with 1990. Both methods also measured a 
reduction in the net undercount rates of adult Black men and women.
    The Revised DA and A.C.E. estimates continued to disagree in 
that DA found a reduction in the net undercount rates of nonBlack 
men and women in Census 2000 compared with the rates of previous 
censuses. The A.C.E. indicated no change or a slight increase in 
undercount rates for nonBlack adults as a group.
    Demographic analysis also provided evidence that correlation 
bias was not reduced between 1990 and 2000. Comparisons of the DA 
and A.C.E. sex ratios (men per 100 women) showed that correlation 
bias in the survey estimates was not reduced for Black men between 
1990 and 2000. The A.C.E. sex ratios for Black adults were much 
lower than the expected sex ratios based on DA, implying that the 
A.C.E. did not capture the high undercount rate of Black men 
relative to Black women. The size of this bias was about the same as 
in the 1990 coverage measurement survey.

  Table B.--Estimates of Percent Net Undercount, by Race, Sex, and Age:
                              1990 and 2000
                 [a minus sign denotes a net overcount]
------------------------------------------------------------------------
                                            Revised
                                          demographic
                                           analysis         PES/A.C.E.
              Category                 -----------------  --------------
                                                             PES  A.C.E.
                                           1990     2000    1990    2000
------------------------------------------------------------------------
Total...............................       1.65     0.12    1.58    1.15
Black...............................       5.52     2.78    4.43    2.07
  0-17..............................       5.27     1.30    7.05    2.92
  Male, 18+.........................       9.57     7.67    3.76    2.10
  Female, 18+.......................       2.05     0.75    2.64    1.28
NonBlack............................       1.08    -0.29    1.18    1.01
  0-17..............................       1.12     0.54    2.46    1.27
  Male, 18+.........................       1.74     0.29    1.19    1.43
  Female, 18+.......................       0.44    -1.02    0.34    0.44
------------------------------------------------------------------------
Source: U.S. Census Bureau.
Note: Estimates by race shown for 2000 are based on the ``average'' of
  Model 1 and Model 2, as described in ESCAP II Report No. 1,
  ``Demographic Analysis Results.''
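
    The percentage-point comparisons drawn from Table B can be 
reproduced with a short sketch. Only two rows of the table are 
included, and the dictionary layout is illustrative.

```python
# Percent net undercount from Table B, as (1990, 2000) pairs.
table_b = {
    "Total": {
        "Revised DA": (1.65, 0.12),
        "PES/A.C.E.": (1.58, 1.15),
    },
    "NonBlack male, 18+": {
        "Revised DA": (1.74, 0.29),
        "PES/A.C.E.": (1.19, 1.43),
    },
}


def reduction(rates):
    """Percentage-point drop from 1990 to 2000 (negative means an increase)."""
    rate_1990, rate_2000 = rates
    return round(rate_1990 - rate_2000, 2)


changes = {
    group: {method: reduction(rates) for method, rates in methods.items()}
    for group, methods in table_b.items()
}
```

    For the total population this gives reductions of 1.53 points 
under the Revised DA and 0.43 points under the A.C.E., matching the 
text. For nonBlack adult men the A.C.E. value is negative, that is, an 
increase in the measured undercount rate.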


[[Page 56013]]

Research to Evaluate the A.C.E. and Census 2000

    A number of the studies described more fully in Attachment 2 
evaluate the accuracy of the A.C.E. and Census 2000. The A.C.E. is 
composed of two samples, the E-sample, which measures erroneous 
enumerations, and the P-sample, which measures census omissions. The 
E-sample is also used to estimate the number of census persons who 
do not have sufficient information to be used in A.C.E. matching and 
followup operations. The Dual System Estimates (DSEs) are computed 
by combining E-sample estimates of erroneous enumerations and 
insufficient information with P-sample estimates of omission. 
Therefore, it is critical that the E-sample correctly account for 
erroneous enumerations and that the P-sample correctly account for 
omissions. The evaluations were designed to measure the accuracy of 
both the P- and E-samples.
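    The way the two samples combine can be sketched in simplified 
form. The formula below is an assumption: a commonly cited form of the 
dual system estimator for a single post-stratum, not the production 
A.C.E. estimator, which handled movers, missing data, and 
post-stratification in ways not shown here. All names and input values 
are hypothetical.

```python
def dual_system_estimate(census_count, insufficient_info,
                         e_correct, e_total, p_total, p_matched):
    """Simplified dual system estimate for one post-stratum.

    census_count:      census persons in the post-stratum
    insufficient_info: census persons lacking enough data for matching
    e_correct/e_total: weighted E-sample correct enumerations and total
    p_matched/p_total: weighted P-sample matches and total
    """
    correct_rate = e_correct / e_total  # share of enumerations that are correct
    match_rate = p_matched / p_total    # share of P-sample found in the census
    return (census_count - insufficient_info) * correct_rate / match_rate


# Hypothetical inputs.
dse = dual_system_estimate(1000, 20, 950, 1000, 1000, 920)

# If some erroneous enumerations were wrongly treated as correct, the
# true correct-enumeration rate is lower and so is the estimate.
dse_more_errors = dual_system_estimate(1000, 20, 930, 1000, 1000, 920)
```

    In this simplified form, failing to detect erroneous enumerations 
inflates the correct-enumeration rate and therefore the population 
estimate, which is the direction of error at issue in the studies 
described below.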
    Three studies in particular produced substantial new information 
for ESCAP II: the Matching Error Study, the Evaluation Followup 
(EFU), and the Person Duplication Studies.

Matching Error Study \9\

    The Matching Error Study provided the P-sample matching error 
rate and the E-sample processing error rate. Expert matchers 
clerically rematched all of the people in a one-fifth subsample of 
the A.C.E. clusters to determine the best match code. This 
information was compared to match codes assigned in production of 
the actual A.C.E. estimates.
---------------------------------------------------------------------------

    \9\ ESCAP II Report No. 7, ``Accuracy and Coverage Evaluation 
Matching Error.''
---------------------------------------------------------------------------

Evaluation Followup \10\

    The EFU consisted of a reinterview of households in the same 
one-fifth subsample of A.C.E. clusters used in the Matching Error 
Study, with additional subsampling. EFU results helped determine the 
accuracy of the production data processed and collected in the P- 
and E-samples. The EFU interview results were used to measure the 
accuracy of the classification of correct and erroneous census 
enumerations as determined by the E-sample. The results were also 
used to measure the accuracy of the P-sample data regarding mover 
status and Census Day residence.
---------------------------------------------------------------------------

    \10\ ESCAP II Report No. 3, ``Evaluation Results for Changes in 
A.C.E. Enumeration Status,'' ESCAP II Report No. 4, ``A.C.E. 
Erroneous Enumeration Errors: Analysis of Census Discrepant 
Persons,'' ESCAP II Report No. 16, ``Evaluation Results for Changes 
in Mover and Residence Status in the A.C.E.,'' and ESCAP II Report 
No. 24, ``Results of the Person Followup and Evaluation Followup 
Forms Review.''
---------------------------------------------------------------------------

Person Duplication Studies \11\

    The Person Duplication Studies took advantage of the fact that 
Census 2000 was the first census to record name information in the 
data capture system in a way that permits computer matching. This 
new methodology permitted the Census Bureau to direct a nationwide 
computer matching operation to measure the level of duplication in 
the census. These studies also examined how well the A.C.E. 
accounted for these duplicates. While the A.C.E. matched respondents 
in the same block and surrounding blocks, this new tool permitted 
the Census Bureau to search for duplicates throughout the country. 
The Person Duplication Studies involved only computer matching, as 
the Census Bureau lacked the resources and time to match to the 
entire country using both computer and clerical matching. The 
computer matching thus understated the actual level of duplication. 
These studies also compared the results of the EFU with the Person 
Duplication Studies to determine whether the EFU correctly measured 
these duplications.
---------------------------------------------------------------------------

    \11\ ESCAP II Report No. 6, ``Census Person Duplication and the 
Corresponding A.C.E. Enumeration Status,'' ESCAP II Report No. 9, 
``Evidence of Additional Erroneous Enumerations from the Person 
Duplication Study,'' and ESCAP II Report No. 20, ``Person 
Duplication in Census 2000.''
---------------------------------------------------------------------------

    Some of the error components produced in these studies suggest 
that the A.C.E. overestimated the net undercount while others 
suggest the net undercount was underestimated. The results of these 
studies are discussed below. They are the basis for the 
recommendation that the adjusted data not be used: a significant 
problem in the measurement of erroneous enumerations resulted in an 
overstatement of the net undercount by at least 3 million people.

Measurement of Erroneous Enumerations, Including Duplicates

    The evaluations of the accuracy of the A.C.E. indicated that the 
A.C.E. did not measure a significant portion of the Census 2000 
erroneous enumerations. The measurement of erroneous enumerations is 
critical to both the national net undercount and to sub-national 
estimates. This error resulted in the A.C.E. significantly 
overstating the net Census 2000 undercount by at least 3 million 
people, with an approximate range of 3 to 4 million. The 
significance of this error was such that the ESCAP recommended that 
the unadjusted data be used for Census 2000 non-redistricting 
purposes.
    The EFU and the Person Duplication Studies described above 
provided the most significant information regarding the measurement 
of erroneous enumerations. The initial EFU results gave evidence of 
a significant understatement in the A.C.E. measurement of erroneous 
enumerations. Because of the significance of the understatement, the 
EFU was extensively reviewed. The revised EFU again indicated a 
significant problem with understating the level of erroneous 
enumerations, and resulted in a high level of cases left unresolved 
or conflicting. The Person Duplication Studies found that a 
significant number of duplicate enumerations were not measured by 
the A.C.E., and that the EFU did not pick up significant portions of 
this error. The Person Duplication Studies also resolved a portion 
of the cases left unresolved or conflicting by the EFU Review.
    The EFU initially found a 3.5 percent change in enumeration 
status from that measured by A.C.E. production. A total of about 
2,800,000 production ``correct enumerations'' (SE 223,000) were re-
coded as ``erroneous enumerations,'' while about 900,000 production 
``erroneous enumerations,'' (SE 99,000) were re-coded as ``correct 
enumerations.'' \12\ The net difference found by the EFU was 
1,900,000. The EFU also included about 4,500,000 cases (SE 353,000) 
that could not be resolved. This study indicated that, at a minimum, 
the A.C.E. overstated the level of net undercount by about 2 million 
people.
---------------------------------------------------------------------------

    \12\ ESCAP II Report No. 3, ``Evaluation Results for Changes in 
A.C.E. Enumeration Status.''
---------------------------------------------------------------------------

    Because of the EFU's potentially significant implications for 
the A.C.E. estimates, ESCAP decided that further EFU analysis was 
needed. Accordingly, more highly trained matching analysts from the 
National Processing Center (NPC) directly reviewed a subsample of 
the EFU and production cases. Matching analysts are employees at NPC 
with many years of training in matching, some with over 20 years of 
experience, who supervise and perform quality assurance for all the 
A.C.E. matching operations.
    This additional review confirmed that there were errors in the 
A.C.E.'s identification of erroneous enumerations. A total of about 
1,800,000 enumerations (SE 189,000) that were coded as correct in 
production were subsequently coded erroneous in the evaluation, 
while the number of enumerations coded as erroneous in production 
that were then coded as correct in the review was about 361,000 (SE 
46,000).\13\ Consequently, the net difference in the ``correct 
enumeration'' to ``erroneous enumeration'' and ``erroneous 
enumeration'' to ``correct enumeration'' cells was estimated at 
1,450,000, rather than the initial level of 1,900,000. However, the 
review identified about 4.5 million cases which could not be resolved 
or for which conflicting results were observed. Depending on 
assumptions that could be made regarding the enumeration status of 
these cases, the overstatement of the net undercount could range 
from about 1.45 million to up to 5.9 million people.\14\
---------------------------------------------------------------------------

    \13\ ESCAP II Report No. 24, ``Results of the Person Followup 
and Evaluation Followup Forms Review.''
    \14\ ESCAP II Report No. 24, ``Results of the Person Followup 
and Evaluation Followup Forms Review.''
---------------------------------------------------------------------------
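
    The net figures in the two preceding paragraphs follow from 
simple subtraction. The sketch below, in Python, reproduces that 
arithmetic from the rounded totals quoted in the text. The function 
and variable names are illustrative assumptions and do not come from 
the ESCAP II reports.

```python
def net_recode_difference(correct_to_erroneous, erroneous_to_correct):
    """Net count of production 'correct' codes that became 'erroneous'."""
    return correct_to_erroneous - erroneous_to_correct


# Rounded weighted totals as quoted in the text.
efu_net = net_recode_difference(2_800_000, 900_000)
review_net = net_recode_difference(1_800_000, 361_000)

print(f"EFU net difference:    {efu_net:,}")
print(f"Review net difference: {review_net:,}")
```

    From these rounded inputs the review difference computes to 
1,439,000, close to the 1,450,000 stated above; the small gap 
presumably reflects rounding in the published totals.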

    The Person Duplication Studies found that a significant number 
of duplicate enumerations were not correctly measured by the A.C.E. 
or by the EFU. Furthermore, when the Person Duplication Studies 
results are combined with the EFU results, some of the unresolved 
and conflicting cases can be explained. Based on this work, more 
refined ranges for the level of the A.C.E. overstatement were 
developed. Direct estimates were produced from the Person 
Duplication Studies that indicated that the level of A.C.E. error 
not measured was about 3 million persons. It is also expected that 
further refinements to the treatment of the unresolved and 
conflicting cases would lead to about 800,000 additional errors. 
Thus, the approximate range of the potential overstatement 
of the net undercount was reduced to between 3 and 4 million 
persons.

[[Page 56014]]

    Finally, the EFU provided information regarding whether the 
A.C.E. accurately measured Census 2000 discrepant enumerations.\15\ 
This study showed that the net effect of erroneously identifying 
discrepant persons as correct enumerations in production and vice 
versa is an overstatement of about 6,000 correct enumerations in 
production, with a standard error of about 30,000.\16\ This 
difference is statistically insignificant.
---------------------------------------------------------------------------

    \15\ Discrepant results include falsification (the amount is 
uncertain), but do not include honest mistakes made by the 
interviewers or respondents. A person is classified as discrepant 
during the matching operation if three knowledgeable respondents 
indicate not knowing him or her in either the EFU or production 
interview.
    \16\ ESCAP II Report No. 4, ``A.C.E. Erroneous Enumerations 
Errors: Analysis of Census Discrepant Persons.''
---------------------------------------------------------------------------
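
    A common way to check a statement of this kind is to compare an 
estimate with its standard error. The sketch below assumes a 
two-sided test at the 5 percent level with a normal critical value 
of 1.96. The notice does not say which test or level was used, so 
that choice, like the names in the code, is an assumption.

```python
def z_score(estimate, standard_error):
    """Ratio of an estimate to its standard error."""
    return estimate / standard_error


CRITICAL_VALUE = 1.96  # assumed: two-sided test at the 5 percent level

# Figures from the text: net effect of about 6,000, SE of about 30,000.
z = z_score(6_000, 30_000)
significant = abs(z) > CRITICAL_VALUE

print(f"z = {z:.2f}, significant: {significant}")
```

    The ratio is 0.2, far below the assumed threshold, which is 
consistent with the conclusion that the difference is not 
statistically significant.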

Measurement of Census Omissions

    Measurement of census omissions is based on the P-Sample. 
Therefore, accurate matching of the P-Sample to the census, and the 
correct classification of mover status and Census Day residence, are 
important components of the P-Sample. Information about the accuracy 
of the matching was produced by the Matching Error Study. 
Information about the accuracy of the classification of movers and 
Census Day residence was derived from the EFU.
    The Matching Error Study indicated that the level of matching 
error from the P-Sample would result in about a 385,000 
overstatement of the net undercount.\17\
---------------------------------------------------------------------------

    \17\ ESCAP II Report No. 7, ``Accuracy and Coverage Evaluation 
Matching Error.''
---------------------------------------------------------------------------

    The EFU demonstrated that misclassification of movers in the 
A.C.E. may have resulted in an understatement of about 450,000 in 
the net undercount.\18\ It should be noted that this final effect 
was the result of significant changes in mover status. These changes 
involved a large number of movers becoming nonmovers and vice versa. 
The EFU indicated that about 4.5 million people classified as 
``movers'' in production became ``nonmovers,'' and that about 2.4 
million people classified as ``nonmovers'' in production became 
``movers.'' At the national level there is therefore a small net 
effect of about 65,000 on the accuracy of the measurement of census 
omissions. However, more research must be conducted to further study 
these effects.
---------------------------------------------------------------------------

    \18\ ESCAP II Report No. 16, ``Evaluation Results for Changes in 
Mover and Residence Status in the A.C.E.''
---------------------------------------------------------------------------
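
    One reading of the ``small net effect of about 65,000'' is that 
it is the difference between the two offsetting P-Sample effects 
described in this section. The sketch below shows that arithmetic. 
The interpretation is an assumption drawn from the figures 
themselves, and the variable names are illustrative.

```python
# Figures quoted in the text (national level).
matching_error_overstatement = 385_000  # Matching Error Study
mover_understatement = 450_000          # EFU mover classification

# The two effects act on the net undercount in opposite directions.
net_effect = mover_understatement - matching_error_overstatement

print(f"Net effect on the net undercount: {net_effect:,}")
```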

    The ESCAP was concerned about the EFU measurement of movers who 
became nonmovers, specifically about whether the EFU measured too 
few movers, due to its questionnaire design. To be classified a 
nonmover, the EFU required less detailed information than needed to 
be classified a mover. An examination of the bias caused by mover 
status changes indicates that the effect of mover-to-nonmover 
changes was greater in absolute value than the effect of nonmover-
to-mover changes. Therefore, if there was an overreporting of 
nonmovers in the EFU, the effect would be to lower the measured net 
bias described above. Additional work must clearly be conducted to 
clarify this information. Furthermore, even though the net effects 
of these errors cancel at the national level, assessment of the 
subnational effects also requires further research.

Correlation Bias

    Correlation bias refers to the tendency for people enumerated in 
the census to be more likely to be included in the A.C.E. than those 
missed in the census. Correlation bias usually results in a downward 
bias in the DSE. This type of bias can result from causal 
dependence, that is, the tendency of some people to be more likely 
to be included in the A.C.E. because they had been included in the 
census, or vice versa, or from heterogeneity. Heterogeneity bias can 
arise because different people within poststrata both have different 
chances of being counted in the census and different chances of 
being included in the A.C.E. To cause a bias, these chances must be 
correlated; for example, those likely to be missed by the census are 
also most likely to be missed by the A.C.E. ESCAP I assessed 
possible correlation bias in the A.C.E. estimates by comparing the 
A.C.E. and DA results. Correlation bias estimates available for the 
March ESCAP recommendation used DA estimates as of February 26, 
2001. ESCAP II directed that the correlation bias estimates be 
recomputed to use the Revised DA estimates and other newly available 
data. Revised correlation bias estimates were computed and discussed 
by the Committee.
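
    The downward bias from heterogeneity can be illustrated with the 
textbook form of the dual system estimator (census count times 
survey count, divided by matches). The sketch below is a 
simplification under stated assumptions: it is not the A.C.E. 
production estimator, it works with expected counts rather than 
sample data, and the group sizes and inclusion probabilities are 
hypothetical.

```python
def dual_system_estimate(census_count, survey_count, matches):
    """Textbook dual system (capture-recapture) population estimate."""
    return census_count * survey_count / matches


# Hypothetical post-stratum of 1,000 people in two equal groups.
# Each tuple: (group size, P(in census), P(in survey)).
groups = [(500, 0.9, 0.9), (500, 0.5, 0.5)]

true_total = sum(n for n, _, _ in groups)
census = sum(n * p_census for n, p_census, _ in groups)
survey = sum(n * p_survey for n, _, p_survey in groups)
# Within each group, census and survey inclusion are independent.
matches = sum(n * p_census * p_survey for n, p_census, p_survey in groups)

dse = dual_system_estimate(census, survey, matches)
print(f"True total: {true_total}, DSE: {dse:.1f}")
```

    Because the group that is harder to count in the census is also 
harder to count in the survey, the pooled estimate comes to about 
925 against a true total of 1,000, even though inclusion is 
independent within each group.
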
    Like ESCAP I, ESCAP II was faced with the fact that while 
correlation bias exists, it is difficult to quantify. Correlation 
bias is an important component of assessing the A.C.E.'s accuracy 
because assumptions regarding correlation bias have a large effect. 
ESCAP II considered several models of correlation bias, including 
whether correlation bias should be assumed only for the Black 
population, whether the Hispanic population should be assumed to 
have the same degree of correlation bias as the Black population, 
and whether correlation bias should be assumed to be the same for 
owners and renters. Correlation bias would mean that the A.C.E. 
estimates of total population were too low by about 750,000 to 1.3 
million, depending on which model for correlation bias is 
assumed.\19\ Currently the Census Bureau has no means of 
incorporating these net biases in the production DSEs.
---------------------------------------------------------------------------

    \19\ ESCAP II Report No. 10, ``Estimation of Correlation Bias in 
2000 A.C.E. Estimates Using Revised Demographic Analysis Results.''
---------------------------------------------------------------------------

A.C.E. Missing Data

    Missing data occurs in the A.C.E. if, after all followup 
attempts, there remain households that were not interviewed, or 
households with some portions of the person data missing, such as 
age or race. Sometimes the missing item involves the status of 
whether a person matched, was a resident on Census Day, or was 
correctly enumerated. Statistical models are used to account for 
missing data. ESCAP I viewed the rates of occurrence of unresolved 
A.C.E. cases for match status, correct enumeration status, and mover 
status as low enough to preclude serious biases in the A.C.E. 
results. ESCAP II directed development of additional missing data 
models to assess the effect on the estimates of using alternative 
models.
    The treatment of missing data can have a large effect on the 
A.C.E. estimates under certain assumptions. ESCAP II examined a 
variety of models to predict the effects of missing data. Seven 
basic methods for addressing the components of missing data in the 
A.C.E. estimates were considered in various combinations. Each 
resulting alternative model was used to compute new DSEs. The 
alternatives considered indicated that the choice of missing data 
model can have a significant effect on the resulting estimates of 
coverage error, causing the DSEs to be over- or under-stated. The 
Census Bureau chose to represent the effects of these alternative 
models in the form of increased uncertainty in the A.C.E. estimates.
    The DSEs that resulted from the alternative models were used to 
calculate a measure of variation similar to a sampling error. This 
research found that non-sampling variability from the use of 
alternative missing data models was considerable. At the national 
level, the overall magnitude of the variation resulting from all 
combinations of the alternative missing data models (about 530,000) 
was higher than the DSE sampling error (about 380,000).\20\ When 
some alternative models were excluded, the standard deviation was of 
approximately the same magnitude as the DSE sampling error, but 
there is no evidence to suggest that the measure of variation based 
on all methods is unreasonable. In fact, arguments could be made 
that this measure understates the actual levels of variation due to 
missing data because it assumes that the alternatives considered 
were randomly distributed around an average, that is, each 
alternative was equally likely.
---------------------------------------------------------------------------

    \20\ ESCAP II Report No. 12, ``Analysis of Missing Data 
Alternatives.''
---------------------------------------------------------------------------
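
    The ``measure of variation similar to a sampling error'' can be 
pictured as a standard deviation taken across the DSEs from the 
alternative models, with each model given equal weight. The sketch 
below assumes a population standard deviation; the notice does not 
give the exact formula, and the input DSEs shown are hypothetical 
values for illustration only. The two national figures compared at 
the end are the ones reported in the text.

```python
from statistics import pstdev


def model_variation(dses):
    """Equal-weight standard deviation of DSEs across alternative models."""
    return pstdev(dses)


# Hypothetical DSEs, for illustration only; not values from Report No. 12.
hypothetical_dses = [100.0, 102.0, 98.0, 104.0, 96.0]
spread = model_variation(hypothetical_dses)

# National figures as reported in the text.
reported_model_variation = 530_000
reported_sampling_error = 380_000
ratio = reported_model_variation / reported_sampling_error

print(f"Spread of hypothetical DSEs: {spread:.3f}")
print(f"Reported variation / sampling error: {ratio:.2f}")
```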

    ESCAP II also examined information describing the level and 
distribution of A.C.E. missing data compared to the 1990 coverage 
measurement survey. The purpose of this review was to put the levels 
of missing data in context with 1990, and to add to the 
understanding of the alternative missing data model analysis 
previously described. The 2000 unresolved rates were slightly higher 
than those in 1990, but were not initially viewed as high enough to 
cause major concern. The alternative model analysis indicated that 
missing data had a more significant effect than anticipated, 
possibly due to changes in the methods for incorporating movers into 
the DSE, or to a more diverse set of alternative models.

Balancing Error

    The ESCAP I Report had identified balancing error as a potential 
problem, noting that the A.C.E. found 3 million more matches in 
surrounding blocks than correct enumerations, a result which could 
have affected the accuracy of the estimates. The A.C.E. matching is 
carried out in a defined search area consisting of the A.C.E. sample 
blocks (clusters) and a targeted area of blocks surrounding or 
bordering the A.C.E. blocks. Significant differences were discovered 
between the number of matches and correct

[[Page 56015]]

enumerations found in the surrounding blocks. Various scenarios were 
identified that could explain the difference, and ESCAP II directed 
that evaluations be conducted to investigate the source of this 
difference, identify the scale of any error, and assess whether its 
magnitude could significantly affect the accuracy of the adjusted 
data. This analysis necessitated additional field work.
    The evaluations indicated that the causes of the discrepancies 
were for the most part related to a scenario that does not 
significantly affect the resulting DSEs. That is, most of the 3 
million difference was attributable to the A.C.E. listing housing 
units in the blocks surrounding the sample blocks, which had little, 
if any, effect on the DSE. The evaluations did, however, detect 
about 246,000 A.C.E. people (SE 82,000) located outside the 
surrounding blocks.\21\ The evaluations also estimated that an 
additional 195,000 people (SE 56,000) were incorrectly identified as 
having been correctly enumerated, although they were found to have 
been outside the search area. The effect of these errors is an 
overstatement of the net undercount by about 450,000 
persons. It appeared that a portion of these errors were also 
included in the results of the EFU and Matching Error Study. While 
some additional work is required to completely resolve the potential 
effects of balancing error, the ESCAP believes that most of the 
previous concerns regarding balancing error have been addressed.
---------------------------------------------------------------------------

    \21\ ESCAP II Report No. 2, ``Evaluation of Lack of Balance and 
Geographic Errors Affecting Person Estimates.''
---------------------------------------------------------------------------
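
    The combined balancing effect quoted above can be checked by 
adding the two component estimates. The sketch below does only 
that; it does not attempt to combine the standard errors, and the 
variable names are illustrative.

```python
# Component estimates quoted in the text.
outside_surrounding_blocks = 246_000  # A.C.E. people (SE 82,000)
wrongly_coded_correct = 195_000       # enumerations (SE 56,000)

combined_effect = outside_surrounding_blocks + wrongly_coded_correct

print(f"Combined effect: {combined_effect:,}")
```

    The sum is 441,000, in line with the figure of about 450,000 
given in the text.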

Conditioning

    Conditioning, or contamination bias, refers to the situation 
where the A.C.E. influenced the census. ESCAP I assumed in its 
deliberations that any effects of conditioning or contamination bias 
were minimal, and could be ignored. This assumption was based on 
previous experiences in the 1990 census. Evidence presented to ESCAP 
II confirmed that contamination bias was not a problem in Census 
2000, as research did not identify any evidence of its presence.\22\
---------------------------------------------------------------------------

    \22\ ESCAP II Report No. 14, ``Conditioning of Census 2000 Data 
Collected in Accuracy and Coverage Evaluation Block Clusters.''
---------------------------------------------------------------------------

Reinstated Late Additions

    While ESCAP I did not identify Census 2000 late additions as a 
source of error, levels of these additions were significantly higher 
than in the 1990 census. Late additions refer to persons included in 
the final census count who were excluded from A.C.E. matching and 
dual system estimation because of their late inclusion. For Census 
2000, the late additions consisted exclusively of housing units that 
were temporarily removed from the census because they were suspected 
to duplicate other housing units, but which were later (after the 
A.C.E. matching process started) reinstated into the final census 
after further research. ESCAP I determined that if the reinstated 
people were a small percentage of the correct enumerations in the 
census, or if their A.C.E. coverage rate was similar to the A.C.E. 
coverage rate for census people included in A.C.E., then there would 
be a minimal effect on the DSEs.\23\ To validate this assumption, 
additional research was conducted.
---------------------------------------------------------------------------

    \23\ Howard Hogan (March 2001). ``Accuracy and Coverage 
Evaluation Survey: Effect of Excluding `Late Census Adds,' '' DSSD 
Census 2000 Procedures and Operations Memorandum Series No. Q-43.
---------------------------------------------------------------------------

    Based on this additional work, ESCAP II concluded that the 
effect of excluding reinstated census people from the A.C.E. was 
minimal. The A.C.E. coverage rate may have been overestimated by 
0.034 to 0.082 percentage points.\24\ This result confirmed the 
assumption, previously made in the ESCAP I Report, that the effect 
of the reinstated people on the DSEs would be small.
---------------------------------------------------------------------------

    \24\ ESCAP II Report No. 21, ``Analysis of Census Imputations.''
---------------------------------------------------------------------------

Census 2000 Imputations

    Census 2000 experienced a higher rate of whole person 
imputations than the 1990 census. Whole person imputations were 
excluded from A.C.E. matching activities, but reflected in the 
census coverage error as measured by the A.C.E. ESCAP I was 
concerned that information was not available at the time to validate 
that the whole person imputations were explainable by Census 2000 
design features (and thus should have no discernible impact on the 
A.C.E.). ESCAP II concluded that the kind, level, and pattern of 
whole person imputations in Census 2000 raised no additional issues 
relative to the accuracy of the A.C.E. adjustment.
    Approximately 5.77 million persons had all their characteristics 
(short form data items) imputed in Census 2000, compared to 1.97 
million persons in the 1990 census. Approximately 1.2 million of 
these persons were added to the census count through a count 
imputation process. The remaining 4.6 million persons were counted 
directly through the census enumeration process, but had all their 
person characteristics imputed because information about them was 
substantially missing from the census records.\25\ Research into the 
sources of the whole person imputations identified that changes in 
the census design contributed to the level of housing units 
requiring imputation. Furthermore, the count imputation rate was 
comparable to the rate experienced in the 1970 and 1980 censuses.
---------------------------------------------------------------------------

    \25\ ESCAP II Report No. 21, ``Analysis of Census Imputations.''
---------------------------------------------------------------------------
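
    The imputation totals above can be cross-checked with a few 
lines of arithmetic. The sketch below uses only the rounded figures 
quoted in the text; the variable names are illustrative.

```python
# Rounded figures quoted in the text.
whole_person_2000 = 5_770_000
whole_person_1990 = 1_970_000
count_imputed = 1_200_000         # added through count imputation
characteristics_only = 4_600_000  # enumerated, characteristics imputed

component_sum = count_imputed + characteristics_only
ratio_to_1990 = whole_person_2000 / whole_person_1990

print(f"Sum of components: {component_sum:,}")
print(f"Census 2000 relative to 1990: {ratio_to_1990:.1f} times")
```

    The rounded components sum to 5.8 million, consistent with the 
total of approximately 5.77 million, and the Census 2000 total is 
roughly three times the 1990 figure.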

    Characteristics of the imputed persons were also examined. The 
age, race and sex characteristics of the population requiring some 
form of imputation were similar to those of the data-defined 
population, with the exception of the age category under 18. The 
relatively higher percent of the population under age 18 in the 
imputed population was due to the high proportion of younger people 
in the ``within household'' category and reflected the fact that 
large households (greater than 6) were likely to have children who 
could not be accommodated by the 6-person mail-return form and thus 
required imputation.\26\
---------------------------------------------------------------------------

    \26\ ESCAP II Report No. 22, ``Characteristics of Census 
Imputations.''
---------------------------------------------------------------------------

Total Error Model and Loss Function Analysis

    The total error model is designed to incorporate the results of 
the evaluations to produce a composite estimate of the bias and 
variability (both sampling and non-sampling) in the A.C.E. These 
measures are used to correct the A.C.E., thus producing measures of 
the ``true'' population that can be used to assess the accuracy of 
the adjusted and unadjusted census data. The total error model 
produces measures of this ``true'' population in the form of target 
populations which are based on various assumptions because the truth 
is not known.\27\ The total error model used by ESCAP I relied in 
part on 1990 data, as complete Census 2000 evaluations of the A.C.E. 
were not then available. This preliminary model adapted the 1990 
total error model to the Census 2000 environment. For the current 
deliberations, ESCAP II wanted to base recommendations on 
current data. Therefore, development of a new total error model was 
undertaken to incorporate the results of the Census 2000 
evaluations. The complexities of the revised EFU study and the 
A.C.E. design did not allow for the development and validation of a 
new total error model. Therefore, the ESCAP has had to rely on the 
individual evaluations described above. It is also apparent that a 
significant amount of additional research and development will be 
necessary before a complete total error model is available. ESCAP II 
believes that the information currently available is strong enough 
to preclude the use of adjusted data for any further Census 2000 
purposes, but that future research may lead to improved A.C.E. 
estimates, that could, in turn, be used to improve the post-censal 
estimates.
---------------------------------------------------------------------------

    \27\ Mulry, Mary H. and Spencer, Bruce D. (March 2001), ESCAP II 
Report No. B-19*, ``Overview of Total Error Modeling and Loss 
Function Analysis,'' DSSD Census 2000 Procedures and Operations 
Memorandum Series No. B-19*.
---------------------------------------------------------------------------

Synthetic Estimation

    The A.C.E. estimation methodology produces estimated coverage 
correction factors for each post-stratum. These factors were carried 
down within the post-strata in a process referred to as synthetic 
estimation. The key assumption underlying synthetic estimation is 
that net census coverage is relatively uniform within the post-
strata. Failure of this assumption leads to synthetic error. 
Synthetic error affects both the adjusted and unadjusted census 
results. ESCAP I analyzed the effects of synthetic error by using 
artificial populations, which are populations created with surrogate 
variables to reflect the distribution of net coverage error. 
Additional synthetic estimation analysis for ESCAP II focused on 
expanding the scope of the earlier artificial population work.
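
    The way coverage correction factors are carried down within 
post-strata can be sketched as follows: each small-area census 
count is multiplied by the factor for its post-stratum, and the 
results are summed. The post-stratum labels, factors, and block 
counts below are hypothetical, and the function name is an 
assumption rather than anything defined in the A.C.E. documentation.

```python
def synthetic_estimate(block_counts, coverage_factors):
    """Adjusted block total: each post-stratum count times its factor."""
    return sum(
        count * coverage_factors[post_stratum]
        for post_stratum, count in block_counts.items()
    )


# Hypothetical post-strata, factors, and block counts (illustration only).
coverage_factors = {"owner": 0.99, "renter": 1.02}
block_counts = {"owner": 200, "renter": 100}

unadjusted_total = sum(block_counts.values())
adjusted_total = synthetic_estimate(block_counts, coverage_factors)

print(f"Unadjusted: {unadjusted_total}, adjusted: {adjusted_total:.1f}")
```

    Every block applies the same factor to a given post-stratum, 
which is why the method depends on net coverage being relatively 
uniform within each post-stratum.
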
    ESCAP II continues to be concerned with synthetic error because 
it is not included directly in the total error model. However, the 
synthetic error analysis must be considered in conjunction with loss 
function analysis based on the total error model, and a validated 
total error model is not yet available. There is therefore no need 
to consider the effects of synthetic error at this point.

[[Page 56016]]

Conclusion

    ESCAP II recommends that unadjusted Census 2000 data be used for 
non-redistricting purposes. The Committee was persuaded by new 
evidence indicating that the A.C.E. overstated the net undercount by 
at least 3 million individuals as a result of the survey's failure 
to measure a significant number of census erroneous enumerations. 
However, the Committee believes that, while Census 2000 successfully 
lowered the differential undercount, it did not eliminate it. 
Therefore, the Census Bureau will conduct further research and 
analyses to attempt to produce revised A.C.E. estimates that can be 
used to improve future post-censal estimates.
    The ESCAP II recommendation, if accepted, means that Census 2000 
long form results will be weighted with unadjusted population 
counts, and that post-censal population estimates and survey 
controls will also rely on unadjusted data. The Census Bureau will 
continue research on the issues discovered with the A.C.E., 
particularly the issue of census duplicates and their estimation or 
detection. It is quite possible that this research will develop 
methods to improve future population estimates by combining 
information from the census, A.C.E., and the A.C.E. evaluations, 
including the Person Duplication Studies. Post-censal estimates and 
survey controls are updated annually, offering the opportunity to 
incorporate improvements. Even if the research does not lead to 
improved post-censal estimates, it will still further our 
understanding of the nature of census duplications and other 
erroneous enumerations, and the problems with their estimation by 
the A.C.E. This knowledge will be vitally important to the planning 
of the 2010 census and to the improvement of future coverage 
surveys.
    Both census taking and coverage measurement are processes that 
evolve and improve with each census. The Census 2000 experience will 
help refine both census and coverage measurement processes for 
future censuses.

Attachments

1. List of ESCAP II Reports
2. Analysis Plan for Further ESCAP Deliberations Regarding the 
Adjustment of Census 2000 Data for Future Uses
3. Field Operations to Answer the Concerns about Lack of Balance

                     Attachment 1.--ESCAP II Reports
------------------------------------------------------------------------
          Report No.                     Title          Author/Presenter
------------------------------------------------------------------------
1.............................  ESCAP II: Revised       J. Gregory
                                 Demographic Analysis    Robinson.
                                 Results.
2.............................  ESCAP II: Evaluation    Tamara Adams,
                                 of Lack of Balance      Xijian Liu.
                                 and Geographic Errors
                                 Affecting Person
                                 Estimates.
3.............................  ESCAP II: Evaluation    David A. Raglin,
                                 Results for Changes     Elizabeth A.
                                 in A.C.E. Enumeration   Krejsa.
                                 Status.
4.............................  ESCAP II: A.C.E.        Elizabeth A.
                                 Erroneous               Krejsa.
                                 Enumerations Errors:
                                 Analysis of Census
                                 Discrepant Persons.
5.............................  ESCAP II: E-Sample      Roxanne
                                 Erroneous               Feldpausch.
                                 Enumerations.
6.............................  Census Person           Roxanne
                                 Duplication and the     Feldpausch.
                                 Corresponding A.C.E.
                                 Enumeration Status.
7.............................  ESCAP II: Accuracy and  Susanne L. Bean.
                                 Coverage Evaluation
                                 Matching Error.
8.............................  Accuracy of the 2000    Rita J. Petroni.
                                 Census and A.C.E.
                                 Estimates Based on
                                 Updated Error
                                 Components--Total
                                 Error Model.
9.............................  Evidence of Additional  Robert E. Fay.
                                 Erroneous
                                 Enumerations from the
                                 Person Duplication
                                 Study.
10............................  ESCAP II: Estimation    William R. Bell.
                                 of Correlation Bias
                                 in 2000 A.C.E.
                                 Estimates Using
                                 Revised Demographic
                                 Analysis Results.
11............................  ESCAP II: Analysis of   Xijian Jim Liu,
                                 Unresolved Codes in     John A. Jones,
                                 Person Matching.        Roxanne
                                                         Feldpausch.
12............................  ESCAP II: Analysis of   Don Keathley,
                                 Missing Data            Anne Kearney,
                                 Alternatives for the    William R.
                                 Accuracy and Coverage   Bell.
                                 Evaluation.
13............................  ESCAP II: Effect of     David A. Raglin.
                                 Excluding Reinstated
                                 Census People from
                                 the A.C.E. Person
                                 Process.
14............................  Conditioning of Census  Katie Bench.
                                 2000 Data Collected
                                 in Accuracy and
                                 Coverage Evaluation
                                 Block Clusters.
15............................  ESCAP II: Analysis of   Xijian J. Liu,
                                 Movers.                 Rosemary L.
                                                         Byrne, Lynn M.
                                                         Imel.
16............................  ESCAP II: Evaluation    David A. Raglin,
                                 Results for Changes     Elizabeth A.
                                 in Mover and            Krejsa.
                                 Residence Status in
                                 the A.C.E.
17............................  ESCAP II: Census 2000   Diane F.
                                 Housing Unit Coverage   Barrett,
                                 Study.                  Michael
                                                         Beaghen, Damon
                                                         Smith, Joseph
                                                         Burcham.
18............................  ESCAP II: P-sample      Glenn Wolfgang,
                                 Nonmatch Analysis.      Tamara Adams,
                                                         Peter Davis,
                                                         Xijian Liu,
                                                         Phawn Stallone.
19............................  ESCAP II: Analysis of   Michael Beaghen,
                                 Non-Matches and         Roxanne
                                 Erroneous               Feldpausch,
                                 Enumerations Using      Rosemary Byrne.
                                 Logistic Regression.
20............................  ESCAP II: Person        Thomas Mule.
                                 Duplication in Census
                                 2000.
21............................  ESCAP II: Analysis of   Fay F. Nash.
                                 Census Imputations.
22............................  ESCAP II:               Signe I.
                                 Characteristics of      Wetrogan,
                                 Census Imputations.     Arthur R.
                                                         Cresce.
23............................  ESCAP II: Sensitivity   Richard Griffin,
                                 Analysis for the        Donald Malee.
                                 Assessment of the
                                 A.C.E. Synthetic
                                 Assumption.
24............................  ESCAP II: Results of    Elizabeth A.
                                 the Person Followup     Krejsa, Tamara
                                 and Evaluation          Adams.
                                 Followup Forms Review.
------------------------------------------------------------------------

    July 26, 2001.

Attachment 2--Analysis Plan for Further ESCAP Deliberations Regarding 
the Adjustment of Census 2000 Data for Future Uses

Background

    On March 1, 2001, the Census Bureau issued the Executive 
Steering Committee for A.C.E. Policy (ESCAP) recommendation that the 
Census 2000 Redistricting Data not be adjusted based on the Accuracy 
and Coverage Evaluation (A.C.E.) program data. The ESCAP was unable 
to conclude, based on information available at the time, that the 
adjusted Census 2000 data were more accurate for redistricting.
    By mid-October, the Census Bureau will recommend whether Census 
2000 data

[[Page 56017]]

should be adjusted for future uses, such as the census long form 
data products, post-censal population estimates and Census Bureau 
demographic survey controls. In order to inform this decision, 
further research will be conducted generating data for ESCAP's 
review. The analyses will focus on resolving the concerns that ESCAP 
identified during its deliberations for the redistricting adjustment 
decision. This document describes the research agenda and is 
organized by the topic areas of concern.
    The broad, overarching concern was that the Demographic Analysis 
and the A.C.E. estimates of the population were inconsistent. Even 
though alternative demographic estimates were produced by varying 
the assumptions underlying the Demographic Analysis, the highest 
reasonable estimate indicated that Census 2000 undercounted the 
population by 0.32 percent, while the A.C.E. produced a net 
undercount estimate of 1.15 percent.\28\ In previous censuses since 
1960, the Demographic Analysis estimates were used to evaluate 
decennial census coverage. The estimate derived through the 1990 
coverage measurement survey was reasonably consistent with the 1990 
Demographic Analysis estimate of the total population. When the 
corresponding estimates for Census 2000 were found to reflect 
substantial differences in the population estimates, this concerned 
the ESCAP. Four scenarios were identified that could explain this 
result:

    \28\ The 1.15 percent and 0.32 percent undercount rates 
are based on census counts that include both the housing unit and 
group quarters populations.
---------------------------------------------------------------------------

     The 1990 census coverage measurement survey (Post 
Enumeration Survey), 1990 Demographic Analysis estimates, and the 
1990 census may have understated the Nation's population, while 
Census 2000 included portions of this previously unidentified 
population.
     Demographic Analysis estimates might not have captured 
the full growth between 1990 and 2000, specifically due to static 
assumptions about critical components of international migration 
such as unauthorized migration, temporary migration, and emigration.
     Census 2000, as adjusted by the A.C.E., might 
overestimate the Nation's population. This situation raises the 
possibility of an undiscovered problem with the A.C.E. or Census 
2000 methodology.
     A combination of these explanations.

    To address these possibilities, further research is required 
into the quality of the three independent measures of the 
population--the Demographic Analysis estimate, the A.C.E. estimate 
and the census count itself. Specifically, research will address 
whether the Demographic Analysis estimate was too low and/or whether 
the adjusted estimate was too high. The latter situation could have 
occurred if either the A.C.E. did not measure the coverage error 
accurately or the census count had coverage error reflected by 
components not measured by the A.C.E.
    In addition, the ESCAP was concerned about two other issues 
related to the A.C.E. estimates--balancing error and synthetic 
error. Balancing error occurs in the A.C.E. when cases are handled 
differently in the two independent samples (the P- and E-samples) 
when identifying gross omissions and erroneous enumerations. This is 
explained more fully under section B.1.a below. Synthetic error 
reflects the extent that net census coverage within a post-stratum 
is not relatively uniform. Uniformity of coverage is the underlying 
assumption of the synthetic estimation process of carrying coverage 
correction factors down to the block level. The concerns regarding 
synthetic error are described more fully in section D below.
    The analysis agenda is organized around four basic areas of 
research: 1) recalculation of Demographic Analysis estimates using 
new migration assumptions as well as new birth and death data, 2) 
A.C.E. issues, including balancing error, 3) Census 2000 issues and 
4) synthetic error.

A. Demographic Analysis (DA) Research

    This area of research addresses the discrepancy between the 
demographic analysis data and the A.C.E. adjusted estimates of 
population. Specifically, this area of research will reexamine the 
historic levels of the components of population change to address 
the scenarios dealing with the possibility that the 1990 Demographic 
Analysis estimates understated the Nation's population and that 
demographic analysis did not capture the full growth between 1990 
and 2000. Consultation with demographic experts inside and outside 
the Census Bureau has led to a research program consisting of a 
variety of research projects focused on the methodologies and 
underlying estimates of the components of population change. The 
research activities are concentrated in two areas:

1. International Migration

    Assumptions regarding international migration are the most 
uncertain component of the demographic analysis estimates. The 
international migration component represents a combination of 
several components. Some of these components, e.g. legal 
immigration, are measured through continuous administrative data. 
For other components, e.g. temporary migration, emigration, and 
unauthorized migration, we do not have administrative data to 
provide continuous and current measurements. In the past, we have 
relied upon the most recent decennial data to develop a once a 
decade measure of these components. Thus, for the 1990 to 2000 
decade, we would have relied upon the measurement from the 1990 
census to develop an estimate for the 1990 to 2000 decade.
    This work will involve examining preliminary data from the 
Census 2000 long form and the Census 2000 Supplementary Survey 
(C2SS) to provide information to update the measurement of the 
international migration components. Although the research will focus 
primarily on those components less well measured, e.g. emigration, 
temporary migration, and unauthorized immigration, the work will 
also include research into all of the current assumptions relating 
to the components of international migration. The first goal is to 
validate for the 1990 to 2000 period, the calculation of the 
components of international migration used in previous estimates. 
Then, using the preliminary data from the Census 2000 long form and 
possibly the C2SS, we will develop some updated measures of the 
components of international migration. The second goal is to assess 
whether the documented calculation of the 1990 to 2000 migration 
components affects the DA estimate for 2000 and thus accounts for some 
of the discrepancy with the A.C.E. results. Research to be conducted 
includes the following:

     We will examine the assumptions about international 
migration flows, specifically for unauthorized migration, legal 
immigration, emigration, temporary migration, and migration from 
Puerto Rico. Utilizing preliminary long form data from Census 2000 
and other information sources (including C2SS), we can prepare the 
first set of documentation for our current international migration 
assumptions and we can assess the accuracy of assuming a 
continuation of the estimates developed from the 1990 Census data. 
Specifically, we will estimate migration using available long-form 
data on place of birth, citizenship, and year of entry and compare 
this estimate to the estimates previously used that were developed 
from the 1990 Census long form data. Thus we will evaluate 
differences in size and characteristics of previously implied flows 
based on current data sets. If appropriate, we will recalculate the 
demographic analysis estimates for 2000 employing any revised levels 
of international migration.
     We will assess the quality of the foreign-born and 
Hispanic population data (important because these data are major 
inputs to the setting of assumptions noted above). We will review 
edit and allocation procedures for foreign-born and Hispanic 
populations in the 1990 and 2000 censuses and attempt to quantify 
the effect (or at least address the direction of the effect) of any 
differences. We also will review the impact of any change in the 
edits and allocation procedures on the size and characteristics of 
these population groups.

2. Robustness of Demographic Analysis

    In addition to the research aimed at examining the components of 
international migration used in the demographic analysis estimates, 
we will examine the remaining assumptions underlying the Demographic 
Analysis components of change. These components include the birth, 
death, and Medicare components. This work will entail the following:

     We will examine the consistency of the components by 
cohort and age/sex groups across time (1935 to 2000), including the 
historical international migration components. We will construct DA 
undercount rates for the 1940 to 2000 decennial censuses and examine 
them for consistency. We will examine the consistency of sex ratios 
across cohorts and age/sex groups. Inconsistent or anomalous results 
will be noted, and possible reasons identified.
     We will review the assumptions about the completeness 
of vital statistics registration. Specifically, we will review the 
historic levels of births and deaths used to

[[Page 56018]]

develop existing DA estimates and the assumptions about the 
underregistration of births and registration of infant deaths. We 
will evaluate both the procedures for adjusting births for 
underregistration and the level of historical deaths (both total and 
by age). If appropriate, we will redevelop the historical annual 
levels of births and deaths to 1990 and 2000.
     We will examine the assumptions about the variation and 
coverage of Medicare data. This work will include documenting the 
differences in the sources of Medicare data used in the 1990 and 
2000 DA estimates, evaluating the adjustment rates used for 
underenrollment in the 1990 and 2000 DA estimates, and reconciling 
the differences in the Medicare files for 1990 and 2000.
     If appropriate, we will recalculate the demographic 
analysis estimates for 1990, compare them to the original 1990 
Demographic Analysis estimates, and assess their impact on the DA 
estimates for 2000.
     We will analyze the consistency of DA estimates of the 
population, by race, ethnicity, and nativity, with Census 2000 and 
A.C.E. This work will entail (1) developing DA benchmarks of the 
population, by selected race, ethnicity, and nativity groups, (2) 
obtaining census tabulations of the native and foreign-born 
populations from preliminary Census 2000 and the 1990 Census long 
forms, and (3) comparing to the DA benchmarks to derive coverage 
estimates by selected age, sex, and race groups.
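
    The undercount rates referred to above can be illustrated with a 
short sketch. It assumes the usual convention that a net undercount 
rate is the difference between a benchmark estimate (DA or A.C.E.) 
and the census count, expressed as a percentage of the benchmark. 
The function name and the figures are hypothetical and are not taken 
from this plan.

```python
def net_undercount_rate(benchmark: float, census_count: float) -> float:
    """Percent net undercount, measured relative to the benchmark estimate.

    Assumption: rate = 100 * (benchmark - census) / benchmark.
    A negative value indicates a net overcount.
    """
    if benchmark <= 0:
        raise ValueError("benchmark must be positive")
    return 100.0 * (benchmark - census_count) / benchmark


# Hypothetical figures, not actual census or DA totals.
rate = net_undercount_rate(benchmark=200.0, census_count=197.0)
print(f"net undercount: {rate:.2f} percent")
```

    Comparing the rate computed against a DA benchmark with the rate 
computed against an A.C.E. benchmark, for the same census count, is 
one way to express the inconsistency this research is meant to 
explain.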

B. A.C.E. Issues and Planned Research

1. Major Areas of Research

a. Balancing Error

    The A.C.E. was conducted using a defined area of search, the 
sample blocks and surrounding blocks for clusters selected for 
targeted extended search. This raised concerns because it was a 
change from the 1990 procedure of expanding the search area to 
surrounding blocks for all sample blocks. We found 3 million more 
matches in surrounding blocks than correct enumerations after 
expanding the search area. This difference must be explained in 
terms of its impact on subsequent estimates of total population. 
There are two scenarios:

     The unit is located in the surrounding block. This has no 
effect on estimates of coverage, but would explain the three million 
difference.
     The unit is outside the search area and the 
corresponding people should have been coded erroneous enumerations. 
This would result in an overestimate of the net undercount.

    This may have been compounded by the targeting used in the 
A.C.E. to match in an area of search around the sample blocks, i.e., 
the search area. This targeting to make searching effective may have 
introduced limitations and/or biases into our measurement of 
coverage. There were three specific concerns in our review of the 
2000 A.C.E.

     There were a number of census people that might have 
been coded as correctly enumerated although the housing unit was not 
actually located in the sample block. If we didn't estimate the 
correct number of erroneously enumerated cases, the result would be 
an overestimate of the net undercount.
     The P-sample may have incorrectly included some housing 
units in a neighboring block; then, in the extended search, the 
people would have been recorded as matching to the census in the 
surrounding blocks. Hence, these cases would appear to be balancing 
error when, in fact, the extended search was compensating for the 
original listing error. If the P-sample had more geocoding error 
than expected, the Targeted Extended Search (TES) would have 
compensated for the error, and the impact on final coverage 
estimates would be trivial. This would help explain some of the 
apparent lack of balance of 3 million.\29\
---------------------------------------------------------------------------

    \29\ Assume 2.6 million of the P-sample are listed in the 
surrounding blocks. If 95% of them are in the search area (a 
plausible percentage), and if 90% match (about the overall match 
rate), then we have accounted for 2.2 million matches to the 
surrounding blocks. When we divide this 2.2 million by the P-sample 
coverage of 0.94, we have accounted for about 2.36 million of the 3 
million lack of balance.
---------------------------------------------------------------------------

     Problems in identifying census geocoding errors may 
have affected the sampling used to select people for extended search 
outside the sample blocks. That is, the TES sample could have 
excluded cases it should have included and thus, not matched or 
followed up on them correctly. The effect of their exclusion would 
be an overestimate of the net undercount.
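
    The arithmetic in footnote 29 can be reproduced with the sketch 
below. The inputs are the footnote's own assumed values; the 
function and variable names are illustrative only. Intermediate 
values are not rounded here, which is why the final figure is about 
2.36 million rather than 2.2 million divided by 0.94.

```python
def balance_accounting(listed, share_in_search_area, match_rate, coverage):
    """Return (matches, accounted) following the steps in footnote 29."""
    # P-sample people listed in surrounding blocks who fall inside the
    # search area and who match to the census.
    matches = listed * share_in_search_area * match_rate
    # Scale up by the P-sample coverage to put the figure on the same
    # footing as the 3 million lack of balance.
    accounted = matches / coverage
    return matches, accounted


matches, accounted = balance_accounting(
    listed=2.6e6,               # assumed in footnote 29
    share_in_search_area=0.95,  # "a plausible percentage"
    match_rate=0.90,            # "about the overall match rate"
    coverage=0.94,              # P-sample coverage
)
unexplained = 3.0e6 - accounted
print(f"matches:     {matches / 1e6:.2f} million")
print(f"accounted:   {accounted / 1e6:.2f} million")
print(f"unexplained: {unexplained / 1e6:.2f} million")
```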

    It is likely that all of these errors occurred to some extent. 
What is not yet known is the scale of the error and whether the 
magnitude of the error was such as to significantly affect the 
relative accuracy of the A.C.E. adjusted numbers. The additional 
geographic field work is described in more detail in the attachment.

b. Erroneous Enumerations

    Subsequent to the March 1st decision, a new area of concern was 
identified. In comparing the A.C.E. measures to the comparable 
measures from the 1990 Census, the Census 2000 erroneous 
enumerations were found to differ substantially from the 1990 
measures. These differences raise concerns that the level of 
erroneous enumerations may be understated for Census 2000. 
Therefore, these differences must be explained because an 
understatement of erroneous enumerations results in an overstatement 
of net undercount. Research described below will quantify the 
accuracy of the A.C.E. measures of erroneous enumeration.

     The Analysis of Measurement Error Study will determine 
how well the A.C.E. identified erroneous enumerations and correct 
enumerations. This study is based on a reinterview of a sample of E-
sample records. This is described more fully in section B.1.c below.
     Another evaluation based on results from the ``E-sample 
Erroneous Enumeration Study'' will analyze the erroneous 
enumerations for various characteristics. This evaluation will 
compare the rates of the different types of erroneous enumerations 
for Census 2000 with corresponding 1990 rates. This evaluation will 
also recategorize people with unresolved status into the appropriate 
erroneous enumeration categories by using data from the followup 
forms. The goal of this work is to identify explanations for 
differences between 1990 and 2000 coding of erroneous enumerations.
     The duplication study discussed in Section C.1 will also 
provide information regarding the differences between 1990 and 2000. 
This study will validate whether the A.C.E. process is correctly 
coding Census 2000 duplicate enumerations as erroneous.

c. Total Error Model and Loss Functions

    Loss function analyses, reviewed by the ESCAP during its 
deliberations on whether to adjust the census redistricting data, 
were based on a total error model that corrected the A.C.E. for 
biases, thus producing measures of the ``true'' population that 
could be used as standards for comparing the adjusted and unadjusted 
census results. The 1990 total error model was adapted to the extent 
possible to ``fit'' the 1990 coverage measurement survey error 
components into the 2000 survey design. This model was updated with 
available Census 2000 data, but retained several error component 
measures obtained from the 1990 coverage measurement survey and 1990 
evaluations, because the 2000 A.C.E. evaluation data were not yet 
available. Thus, the error model assumed that the actual A.C.E. 
error rates for these components were similar to those reflected by 
the 1990 coverage measurement survey results. This was viewed as 
conservative because it was expected that the A.C.E. was of higher 
quality than the 1990 coverage measurement survey. Work is underway 
to validate that the assumption above is correct.
    We are conducting studies to revise the 1990 total error model 
to reflect actual A.C.E. error components, as measured by 2000 
evaluations. Because of methodological changes between 1990 and 
2000, there are issues that influence the comparability of this 
updated analysis to the March 2001 analysis. The analysis will 
include a discussion of the comparability.
    The A.C.E. error components that were previously based on 1990 
data, and that will now be measured and input into the revised total 
error model, are:

--P-sample matching error
--P-sample data collection error
--P-sample discrepancy error
--E-sample processing and data collection errors

    Synthetic error is not included in the total error model--this 
component of error is discussed later. A.C.E. error rates for these 
total error model components will be obtained from the following 
evaluation studies.
     The Matching Error Study will provide the A.C.E. P-
sample matching error rate and E-sample processing error rates. The 
methodology consists of the clerical rematching of all of the people 
in a one-fifth subsample of the A.C.E. clusters by expert matchers 
to determine the best match code possible. We will compare that 
match and residence information to the production codes.

[[Page 56019]]

     The Analysis of Measurement Error Study uses the 
results of the Evaluation Followup Interview to provide the error 
components for E-sample and P-sample data collection error relating 
to person coverage, and P-sample discrepancy error. The methodology 
consists of revisiting some of the households in a one-fifth 
subsample of the A.C.E. clusters and using that information to 
rematch the Census and A.C.E. people in those households. The 
results of this study will determine the accuracy of the data going 
into the person matching process, such as the results from Census 
and A.C.E. questionnaires. This can involve reclassification of 
correct and erroneous enumerations. We will determine the accuracy 
of the residence status of A.C.E. people and how well the A.C.E. 
process identified Census erroneous enumerations (EEs) and correct 
enumerations (CEs).

    Once the total error model is updated with current data, new 
loss function analyses will be conducted. The loss function analyses 
will be expanded to analyze the accuracy of governmental units, in 
addition to states and counties. No loss function analyses will be 
run for congressional districts.
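
    As a rough illustration of how a loss function comparison works, 
the sketch below scores an adjusted and an unadjusted set of counts 
against target populations. The weighted squared-error form (each 
squared error divided by the target) is an assumption chosen for 
illustration; this plan does not specify the form of the loss 
functions, and all figures are hypothetical.

```python
def weighted_squared_error_loss(estimates, targets):
    """Sum over areas of (estimate - target)^2 / target.

    Assumption: the 1/target weighting is illustrative only.
    """
    if len(estimates) != len(targets):
        raise ValueError("estimates and targets must have the same length")
    return sum((e - t) ** 2 / t for e, t in zip(estimates, targets))


# Hypothetical targets (the "true" populations from a total error model)
# and hypothetical unadjusted and adjusted counts for three areas.
targets = [1000.0, 2000.0, 500.0]
unadjusted = [980.0, 1990.0, 480.0]
adjusted = [1005.0, 2010.0, 495.0]

loss_unadjusted = weighted_squared_error_loss(unadjusted, targets)
loss_adjusted = weighted_squared_error_loss(adjusted, targets)
print(f"unadjusted loss: {loss_unadjusted:.3f}")
print(f"adjusted loss:   {loss_adjusted:.3f}")
```

    Under this sketch, the set of counts with the smaller loss is 
judged more accurate for the areas considered. The conclusion depends 
on the targets, which is why the error components feeding the total 
error model matter.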

d. Correlation Bias

    Correlation bias in Dual System Estimates (DSEs) results from a 
failure of the general independence assumption underlying DSEs due 
either to causal dependence or heterogeneity. Causal dependence 
occurs when the act of being included in the census makes someone 
more likely or less likely to be included in the A.C.E. 
Heterogeneity occurs when the census and A.C.E. inclusion 
probabilities vary over persons within post-strata. When 
heterogeneity within post-strata exists it is generally suspected to 
be of the form where persons more likely to be missed in the Census 
are also more likely to be missed in the coverage survey (A.C.E.). 
This will lead to underestimation of true population by the DSEs. 
The direction of the effect of causal dependence, if it exists, is 
less certain.
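
    The effect of heterogeneity described above can be illustrated 
with expected counts. The sketch assumes a simplified dual system 
estimate (census count times coverage survey count, divided by 
matches) with no erroneous enumerations and no sampling, and it 
assumes independence within each group. The group sizes and 
inclusion probabilities are hypothetical.

```python
def expected_dse(groups):
    """Expected simplified DSE for a post-stratum made up of groups.

    groups: iterable of (size, p_census, p_ace) tuples, where inclusion
    in the census and in the coverage survey is independent within a group.
    """
    census = sum(n * p for n, p, q in groups)
    ace = sum(n * q for n, p, q in groups)
    matches = sum(n * p * q for n, p, q in groups)
    return census * ace / matches


# Same overall inclusion rates (82.5 percent), true population 1,000.
homogeneous = [(1000, 0.825, 0.825)]
heterogeneous = [(500, 0.95, 0.95), (500, 0.70, 0.70)]

print(f"homogeneous DSE:   {expected_dse(homogeneous):.1f}")
print(f"heterogeneous DSE: {expected_dse(heterogeneous):.1f}")
```

    When the people who are harder to count in the census are also 
harder to count in the coverage survey, matches are more numerous 
than independence across the whole post-stratum would predict, and 
the estimate falls below the true population.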
    Correlation bias in the A.C.E. estimates, whether due to 
heterogeneity or causal dependence, was assessed by comparing A.C.E. 
and DA results. Correlation bias estimates available for the March 
1, 2001 ESCAP recommendation used DA estimates as of February 26, 
2001. If further DA research results in revisions to the DA 
estimates, then the correlation bias estimates will be recomputed. 
The revised correlation bias estimates will then be used as inputs 
for revisions of the total error model and loss function analyses.

2. Auxiliary Areas of Research

    This section describes other areas that did not preclude ESCAP 
from recommending that Census 2000 data should be adjusted for 
redistricting purposes, but for which ESCAP would have preferred 
additional data. Further research in these areas will be conducted 
in order to confirm the ESCAP's conclusions.

a. Missing Data

    Missing data occurs in the A.C.E. if after all followup attempts 
there remain households that were not interviewed or households with 
some portions of the person data missing such as age or race. 
Sometimes the missing item involves the status of whether a person 
matched, was a resident on Census day or was correctly enumerated.
    For a small number of people in the P-Sample, there was not 
enough information available to determine the match status (whether 
or not the person matched to someone in the census in the 
appropriate search area) or the resident status (whether or not the 
person was living in the block cluster on Census Day). Determining 
residence status was important for the P-Sample because Census Day 
residents of the block clusters in the sample were used to estimate 
the proportion of the population who were not counted in the census. 
Similarly, some people in the E-Sample lacked information to 
determine whether the person was correctly enumerated. Generally, for 
cases with missing status, a probability of resident, match, or 
correct enumeration was assigned based on information available 
about the specific case and about cases with similar 
characteristics.
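
    One simple way to assign such probabilities is sketched below: 
each unresolved case receives the average status of the resolved 
cases in the same group of similar cases. This cell-mean approach, 
the field names, and the data are assumptions for illustration; the 
actual A.C.E. missing data model is not described in this plan.

```python
from collections import defaultdict


def assign_status_probabilities(cases):
    """Return one probability per case.

    cases: list of dicts with keys 'cell' (a label grouping similar cases)
    and 'status' (1.0 = yes, 0.0 = no, None = unresolved).
    Resolved cases keep their status; unresolved cases get the mean
    status of the resolved cases in the same cell.
    """
    totals = defaultdict(lambda: [0.0, 0])  # cell -> [sum of status, count]
    for case in cases:
        if case["status"] is not None:
            totals[case["cell"]][0] += case["status"]
            totals[case["cell"]][1] += 1

    probabilities = []
    for case in cases:
        if case["status"] is not None:
            probabilities.append(float(case["status"]))
            continue
        resolved_sum, resolved_n = totals[case["cell"]]
        if resolved_n == 0:
            raise ValueError(f"no resolved cases in cell {case['cell']!r}")
        probabilities.append(resolved_sum / resolved_n)
    return probabilities


# Hypothetical match-status data for one cell.
cases = [
    {"cell": "renter", "status": 1.0},
    {"cell": "renter", "status": 0.0},
    {"cell": "renter", "status": 1.0},
    {"cell": "renter", "status": 1.0},
    {"cell": "renter", "status": None},  # unresolved
]
print(assign_status_probabilities(cases))
```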
    The rates of occurrence of unresolved A.C.E. cases for match 
status, correct enumeration status, and mover status were viewed as 
low enough to preclude serious biases in the A.C.E. results. We are 
now analyzing the missing data model to determine whether the 
assumptions are correct.
    We will develop and apply alternative models for the treatment 
of missing data. These alternative models will be carried through 
the A.C.E. estimation process so that the effect on DSEs can be 
assessed.

b. Late Census 2000 Additions

    The levels of late Census 2000 additions were significantly 
higher than in the 1990 census. Late additions are those persons 
included in the final census counts but who, due to their late 
inclusion, were excluded from the A.C.E. matching and dual system 
estimation processes. For Census 2000, the late additions consisted 
exclusively of housing units that were temporarily removed from the 
census because they were suspected to duplicate other housing units, 
but which were later (after the A.C.E. matching process started) 
reinstated into the final census after further research was 
conducted. This differs from the 1990 Census in which the late 
additions were persons who were enumerated too late in the census 
cycle to be included in the matching and dual system estimation 
processes and were not factored into the coverage ratios. The A.C.E. 
design treated the late census data appropriately in measuring the 
census undercount. Two areas of concern require further 
investigation--whether calculating DSEs without these additions 
resulted in a bias in the estimates and whether these impacted the 
assumptions underlying the synthetic estimation model.
    There is no expectation of a bias in the dual system estimate 
caused by excluding late additions. The dual system estimate can be 
expressed as a product of the (1) number of A.C.E. people and (2) 
the ratio of census complete and correct enumerations to the number 
of people in both systems. Consequently, any effect must come from 
one of these two sources. Excluding the late additions does not 
impact the estimate of the number of A.C.E. people, which comes 
solely from the A.C.E. enumerated sample. Excluding the late 
additions also will not affect the dual system estimate of the true 
population if the number of matches is reduced proportionately to 
the number of census correct enumerations. Given the traditional 
dual system independence assumption, one would expect this result. 
Consequently, there is no expectation of a bias in the dual system 
estimate caused by excluding late additions. Data were not available 
at the time to validate this assumption.
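
    The argument above can be checked numerically. The sketch uses 
the product form stated in the text: the number of A.C.E. people 
times the ratio of census correct enumerations to matches. All 
figures and names are hypothetical; the 2 and 3 percent reductions 
are chosen only to contrast the proportional and non-proportional 
cases.

```python
def dual_system_estimate(ace_people, correct_enumerations, matches):
    """DSE = A.C.E. people * (census correct enumerations / matches)."""
    if matches <= 0:
        raise ValueError("matches must be positive")
    return ace_people * (correct_enumerations / matches)


# Hypothetical figures with the late additions included.
full = dual_system_estimate(900.0, 950.0, 855.0)

# Excluding late additions removes 2 percent of correct enumerations.
# Case 1: matches fall by the same 2 percent (the independence case).
proportional = dual_system_estimate(900.0, 950.0 * 0.98, 855.0 * 0.98)
# Case 2: matches fall by 3 percent, so the ratio changes.
disproportional = dual_system_estimate(900.0, 950.0 * 0.98, 855.0 * 0.97)

print(f"full:            {full:.1f}")
print(f"proportional:    {proportional:.1f}")
print(f"disproportional: {disproportional:.1f}")
```

    The estimate is unchanged only in the proportional case, which 
is the assumption the planned rematch is intended to test.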
    We will now attempt to validate this assumption by performing a 
rematch of the P- and E-samples, with the late additions included in 
the E-sample, to attempt to measure the impact on the rates for 
correct enumerations and duplicates. This rematch will be conducted 
in a one-fifth subsample of A.C.E. clusters. This study has 
limitations because only computer and clerical matching can now be 
performed; that is, no field work will be conducted. Consequently, a 
high rate of unresolved cases is expected.
    The concerns regarding synthetic error are addressed in Section 
D. ``Synthetic Error''.

c. Conditioning

    Conditioning error occurs under two scenarios:
    1. Census data collection affects the A.C.E. This will be 
measured in the correlation bias.
    2. A.C.E. data collection affects the census. This will be 
examined in the evaluation described below.
    The effect of potential conditioning of Census 2000 respondents 
by the A.C.E. operations was assumed to be minimal, similar to the 
1990 findings. The research is necessary to confirm this assumption.
    An evaluation will examine whether census and A.C.E. operations 
were kept operationally independent. The analysis will be based on 
comparing Census 2000 results in A.C.E. and non-A.C.E. blocks.
     Mover Status Analysis
    The match rate portion of the DSE formula (M/P) uses persons 
with all types of mover status (nonmovers, outmovers, and inmovers), 
differentiating between the different types of mover status. 
Therefore, misclassification of mover status could cause the DSEs to 
be overstated, understated, or both, depending on the post-strata.
    The Measurement Error Reinterview Analysis will measure the 
extent of mover misclassification by using the results from the 
Evaluation Followup Interview.
     Housing Unit Coverage
    The coverage of housing units will be available in the late 
summer of 2001. These data will be examined in relation to person 
coverage estimates for 2000. These data from 2000 will be compared 
to the 1990 estimates of person and housing unit coverage.
    In addition, another study will assess the impact of housing 
unit coverage on person coverage. This study looks at the P-sample 
to analyze the effect of housing unit nonmatches on the person 
nonmatches. The E-sample is also examined to help understand the 
relationship of housing unit status to person status. The correctly 
enumerated people in erroneously

[[Page 56020]]

enumerated housing units are of particular interest.
     P-sample Nonmatch Analysis
    The P-sample nonmatches are examined for variables such as race 
domain and age/sex group to see if the nonmatches are different for 
various types of people. This aids in the understanding of the 
components of A.C.E. and also helps explain the differences between 
A.C.E. and DA. In addition, the nonmatches from 2000 are compared to 
the nonmatches from 1990. In conjunction with the analysis of the E-
sample, it helps explain the differences between 1990 and 2000.

C. Census 2000 Issues and Planned Research

    Research will be conducted into two components of the census--
duplication issues and imputation of persons. A high level of 
duplication not measured by the A.C.E. design could cause the 
adjusted census estimate to be too high. The effect of imputed 
person records is also not measured by the A.C.E. The number of 
person records that were imputed in Census 2000 was significantly 
higher than in the 1990 census. The assumption is that the imputed 
persons are no different than the persons included in the A.C.E. 
process and therefore match rates are not impacted.

1. Duplication Not Measured in A.C.E.

    The A.C.E. methodology by design did not measure duplication 
between components of the population living in group quarters and in 
housing units because group quarters were outside the A.C.E. 
universe. The A.C.E. also did not measure duplication within the 
group quarters population. Significant duplication of these types 
could explain some of the differences between demographic analysis 
and the adjusted Census 2000 data.
    The A.C.E. E-sample will be computer matched to the entire 
census to determine the extent of duplicate enumerations that were 
not in scope for the A.C.E. This analysis will potentially explain 
some of the differences between demographic analysis and the A.C.E.
    We also plan an extended computer search within the A.C.E. E-
sample for duplicate census enumerations among housing units and 
also between housing units and group quarters persons (which were 
out-of-scope for A.C.E.). This will help to explain differences 
between the A.C.E. and the 1990 coverage measurement survey.

2. Census Person Imputations

    Compared with the 1990 census, Census 2000 imputed a higher 
number of cases that came through the process with little or no 
information as to the occupancy status, or with an occupied status 
but no definitive population count. In addition, Census 2000 imputed more 
whole person records in cases with known household sizes, but with 
all the person data missing for some or all of the household 
members. Although the A.C.E. handled imputed persons appropriately 
in the estimation process, there was concern about not having 
information as to what census design processes contributed to the 
number of imputed persons when compared to the 1990 census.
    Given the potential impact that this level of imputations may 
have on Census 2000 data, it is essential to understand the 
demographic characteristics of the imputed people and how this may 
help explain the difference between the census and demographic 
analysis, as well as how the imputations affect differences between 
the E-sample in 1990 and the E-sample in 2000.
    There were concerns expressed regarding the effect of whole 
household imputations on the heterogeneity assumption but these 
concerns are studied under the synthetic error analysis in Section 
D.

D. Synthetic Error

    The synthetic assumption states that census net coverage does 
not vary within post-strata. For example, the synthetic assumption 
implies that census counts in Florida in a particular Hispanic post-
stratum have the same net coverage as the census counts in the same 
Hispanic post-stratum but in New York. The synthetic assumption 
within post-strata will permit the Census Bureau to draw conclusions 
from the A.C.E. sample about the population as a whole and then 
apply them to individuals living in geographic areas smaller than 
post-strata. The synthetic assumption is necessary to permit 
correction for small geographic areas based on a sample. This 
adjustment is only correcting for systematic biases and not local 
census errors. The error that is introduced when the synthetic 
assumption does not hold is called synthetic error.
    Synthetic estimation methodology is directed at correcting for a 
systematic under- or overcount in the census. The synthetic 
estimates will not result in the correction of random counting 
errors that occur for any entity (blocks, tracts, counties, etc.). 
Therefore, the synthetic estimate will not result in extreme changes 
in small geographic entities, nor will it correct for extreme 
errors. It is designed to remove the effects of systematic errors so 
that when small entities are aggregated, systematic and differential 
coverage errors are corrected.
    In the assessment of accuracy, the Census Bureau is concerned 
with synthetic error since it is not included directly in the total 
error model. The analysis of the effects of synthetic error was 
based on the construction of ``artificial populations.'' These are 
populations that are created with surrogate variables that are known 
for the entire population, and are developed to reflect the 
distribution of net coverage error. This analysis of synthetic error 
and its effect on the loss functions was limited.
    Our additional analysis will expand the scope of the earlier 
artificial population work and add an approach using direct 
estimates of coverage at lower geographic levels.

1. Using Artificial Populations

    We will do a sensitivity analysis on the results from B-14. B-14 
gave results for weighted and unweighted loss functions using one of 
two methods for distributing targets to post-strata and one of 8 
models for correlation bias and percent of 1990 processing bias. 
This work will concentrate on the weighted loss functions and 
analyze the sensitivity of the B-14 results over both the methods 
for distributing targets to post-strata and all 8 models. Once again 
this analysis will be conducted for states and congressional 
districts.
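    The structure of such a sensitivity analysis can be sketched as 
follows. This is a minimal illustration, not the B-14 procedure: 
the weighted squared-error loss, the choice of weights, the scenario 
labels, and the toy target values are all assumptions made for the 
example. Only the grid of two distribution methods by 8 models 
comes from the text above.

```python
from itertools import product


def weighted_sq_loss(estimates, targets, weights):
    """Weighted squared-error loss summed over areas (assumed form)."""
    return sum(w * (e - t) ** 2 for e, t, w in zip(estimates, targets, weights))


def loss_difference(census, adjusted, targets, weights):
    """Census loss minus adjusted loss; positive values favor adjustment."""
    return (weighted_sq_loss(census, targets, weights)
            - weighted_sq_loss(adjusted, targets, weights))


# Two methods for distributing targets and 8 models: 16 scenarios.
methods = ("method_1", "method_2")
models = tuple(f"model_{k}" for k in range(1, 9))
scenarios = list(product(methods, models))

# Toy data for two areas (hypothetical numbers, for illustration only).
census = [100.0, 200.0]
adjusted = [102.0, 199.0]
weights = [1 / c for c in census]  # assumed weighting, not from B-14


def toy_targets(index):
    """Hypothetical stand-in for the targets implied by one scenario."""
    return [100.0 + 0.25 * index, 200.0 - 0.1 * index]


results = {
    scenario: loss_difference(census, adjusted, toy_targets(i), weights)
    for i, scenario in enumerate(scenarios)
}
favoring_adjustment = [s for s, diff in results.items() if diff > 0]
```

    Reporting the loss difference for every scenario, rather than 
for a single chosen one, shows how far a conclusion depends on the 
method and model selected.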

2. Using Direct Estimates

    We will calculate direct DSEs for census divisions and for 
states having sufficient sample size to produce direct estimates 
with reasonably low variance. Assuming the resulting direct DSE 
population estimates are unbiased, the mean square error of the 
production synthetic estimate of total population will be estimated.
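    One way such an estimate could be formed is sketched below. It 
assumes, in addition to unbiasedness, that the sampling error of the 
direct DSE is uncorrelated with the error of the synthetic estimate. 
Under those assumptions the expected squared difference between the 
two estimates equals the mean square error of the synthetic estimate 
plus the variance of the direct DSE. The function name, the floor at 
zero, and the numbers are hypothetical.

```python
def estimated_mse(synthetic, direct, var_direct):
    """Estimate the MSE of a synthetic estimate as (S - D)^2 - Var(D).

    Assumes the direct estimate D is unbiased and that its sampling
    error is uncorrelated with the error of the synthetic estimate S.
    The result is floored at zero (a choice made for this sketch)
    because the raw difference can be negative in a single sample.
    """
    return max((synthetic - direct) ** 2 - var_direct, 0.0)


# Illustrative values only: synthetic 1010, direct 1000, Var(D) = 36.
mse_example = estimated_mse(1010.0, 1000.0, 36.0)
# When the two estimates differ by less than the sampling noise,
# the floored estimate is zero.
mse_small = estimated_mse(1001.0, 1000.0, 36.0)
```

    If the two errors are correlated, a covariance term would have 
to be added, so this sketch should be read as a simplification.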

E. Schedule

    Some of the A.C.E. evaluation work being undertaken involves 
field work and/or additional computer or clerical matching work. The 
Evaluation Followup Interview was conducted in the field during the 
winter of 2001. The Matching Error Study matching work was completed 
in the spring. Results from these studies are being processed, with 
initial data being available for review in early summer. Field and 
clerical work for the TES2 and TES3 (described in the attachment) 
studies began in the winter and will continue into July. Results 
from these studies will not be available for ESCAP review until 
later in the summer. Matching for the late census adds evaluation is 
scheduled for late July, with data available for review in August. 
Other research is being conducted on a flow basis as data become 
available and analyses are conducted.
    On June 18, the ESCAP began holding weekly (or more frequent) 
meetings to review analyses of data related to the topics of 
concern. It is expected that all of the research and analyses 
described will be completed by the end of September. The ESCAP will 
then discuss how the results impact their concerns and will make a 
recommendation by mid-October as to whether adjusted or non-adjusted 
census data should be used for subsequent purposes.
    During the September through October time frame, analysts will 
document the results of their research in evaluation reports, 
finalizing them in time for release to the public concurrently with 
the ESCAP recommendation.

Attachment 3--Field Operations to Answer the Concerns About Lack of 
Balance

    In order to answer these concerns and to explain the lack of 
balance that is present, or that may be introduced, due to Targeted 
Extended Search (TES), we will be examining the results of Targeted 
Extended Search 2 (TES2) and 
Targeted Extended Search 3 (TES3). TES2 followed up E-sample housing 
units that were coded as erroneous enumerations in the initial 
housing unit phase to determine if the unit was inside or outside 
the block cluster and surrounding rings. TES3 will follow up other 
types of units, both P-sample and E-sample, that may contribute to a 
lack of balance.
    In TES2 we are evaluating the housing units coded during the 
housing unit matching as not existing as housing units within the 
cluster. The block containing the housing unit selected for 
additional geographic work and the surrounding blocks were 
identified on a map. The field representative identified the block 
where the housing unit existed and the housing unit was classified 
as:


     Existing in the surrounding blocks
     Existing outside the surrounding blocks
     Existing within the block cluster
     Not a housing unit
     Unresolved

    Thus, a housing unit may have been coded as in the surrounding 
blocks or outside the search area when it was actually part of the 
block cluster.
    In TES3 we are also sending to the field a sample of census 
housing units classified as correctly enumerated in the block 
cluster. If a housing unit was classified as correctly enumerated in 
the block cluster in error, the housing unit was not eligible for 
targeted extended search in person matching. This could explain more 
of the lack of balance identified in the person matching.
    In addition, we are sending additional types of P-sample cases 
for more geographic field work and a sample of matches in the sample 
block as a control. These types of cases are:

     P-sample people matched in surrounding blocks
     Not matched P-sample housing units
     P-sample people matched in the sample block cluster

[FR Doc. 01-27663 Filed 10-31-01; 12:05 am]
BILLING CODE 3510-07-P