[Congressional Record Volume 144, Number 24 (Tuesday, March 10, 1998)]
[House]
[Pages H912-H917]
From the Congressional Record Online through the Government Publishing Office [www.gpo.gov]




                            THE 2000 CENSUS

  The SPEAKER pro tempore. Under the Speaker's announced policy of 
January 21, 1997, the gentleman from Florida (Mr. Miller) is recognized 
during morning hour debates for 5 minutes.
  Mr. MILLER of Florida. Madam Speaker, today I rise to discuss the 
current status of the 2000 census.
  Most Americans do not realize the size and scope of the decennial 
census. It is the largest peacetime mobilization of the Federal 
Government in history. The Census Bureau will hire and train about 
500,000 Americans to carry out and conduct the 2000 census.
  Under our system of government, we do not consider engaging in such a 
huge operation that spends billions of dollars without involving the 
United States Congress. Unfortunately, that is exactly what this 
administration has decided to do, ignore the Congress.
  Most Americans do not know what the dispute over the 2000 census is 
all about. So let me take a moment to try and explain.
  For 200 years we have conducted the census by trying to count all 
Americans. The fancy term for this is full enumeration. Of course, it 
is a difficult undertaking to count all Americans, but that is what 
we have been doing for 200 years. The administration does not want to 
do that anymore.

  They no longer want to attempt to count all Americans. Instead, with 
the help of experts, they have designed the largest statistical 
experiment in U.S. history. I do not want to bore everyone with the 
details, but let me try and give my colleagues a basic outline of this 
grand experiment.
  There are 60,000, 60,000 separate census tracts in the United States, 
and each contains approximately 4,000 people. Under this new, untested 
theory, the administration wants to count 90 percent of the people in 
each of the 60,000 census tracts. And then they will use 60,000 
simultaneous polls to estimate the other 10 percent in each of the 
census tracts. That is just step one.
  And step two only gets worse. The scope of this experiment is simply 
breathtaking. When you see a poll in the New York Times or CNN or USA 
Today, the pollsters normally talk to about 1,000 or so Americans. What 
this administration is talking about is doing 60,000 separate polls at 
the same time. It has never been tried before and the potential for 
mistakes and errors is quite large.
  The Commerce Department's own Inspector General said in December, 
``We can conclude that although the 2000 census design is risky, the 
Bureau's fundamental problem is that it simply may not have enough time 
to plan and implement a design that achieves its dual goals of 
containing costs and increasing accuracy.''
  The Inspector General goes on to state, ``Because this process is 
long, complex and operating under a tight schedule, there will be many 
opportunities for operational and statistical errors.''
  Madam Speaker, I include for the Record the report, as follows:

                                      U.S. Department of Commerce,


                                        The Inspector General,

                                Washington, DC, December 30, 1997.
     Hon. John McCain,
     Chairman, Committee on Commerce, Science, and Transportation, 
         U.S. Senate, Washington, DC.
       Dear Mr. Chairman: During the Committee's May 14, 1997, 
     oversight hearing on the Department of Commerce, you 
     requested our views on what needs to be accomplished by what 
     dates in order to ensure a successful 2000 decennial census. 
     You planned to use this information as a benchmark to track 
     the progress of the census.
       In response to your request, the enclosed paper discusses 
     decennial census milestones and associated risks. This paper 
     does not take into account the recent decision to include 
     plans for conducting the decennial without the use of 
     sampling. The Census Bureau is currently in the early stages 
     of adjusting its scheduling and cost models to reflect that 
     decision, and we will closely monitor and report on the 
     bureau's progress in making these adjustments.
       We conclude that although the 2000 census design is risky, 
     the bureau's fundamental problem is that it simply may not 
     have enough time to plan and implement a design that achieves 
     its dual goals of containing cost and increasing accuracy. 
     The problem is evidenced by the decennial Master Activity 
     Schedule--the primary decennial program management tool. The 
     schedule's tightness is due to changing design details, 
     lagging progress in some critical activities, less than full 
     implementation of strategies and procedures, and a continuing 
     lack of agreement between the Administration and the Congress 
     on the appropriate use of sampling.
       A recurring theme of this paper is our conclusion that, as 
     a result of its lack of time to complete various aspects of 
     the design, the bureau will need to ask for additional 
     funding, reprogram funds, or accept potential quality 
     shortfalls. To minimize the need for such actions, the bureau 
     should immediately (1) prioritize and assess the readiness of 
     its major design components, (2) simplify the design, (3) 
     realistically reassess costs, (4) communicate results both 
     internally and externally, and (5) redirect the 1998 dress 
     rehearsal accordingly.
       We discussed our findings and recommendations with senior 
     bureau managers who generally concurred. They stated that 
     some planned corrective actions had been delayed by the 
     Fiscal Year 1998 continuing resolution and the recent 
     legislation requiring both a sampling and a non-sampling 1998 
     Dress Rehearsal. However, the bureau has initiated a 
     comprehensive design review to be completed in January 1998 
     that is intended to address our concerns. We look forward to 
     assessing the adequacy of those corrective actions.
       If you have any questions about this paper, your staff may 
     contact either me at (202) 482-4661 or Jessica Rickenbach, 
     our Congressional Liaison Officer, at (202) 482-3052.
           Sincerely,
                                              Francis D. DeGeorge,
                                                Inspector General.
     Enclosure.

U.S. DEPARTMENT OF COMMERCE, OFFICE OF INSPECTOR GENERAL, DECEMBER 1997

       2000 Decennial Census: Key Milestones and Associated Risks


                              introduction

                   History of Decennial Census Design

       The Census Bureau, in consultation with expert advisory 
     panels, ``reengineered'' census-taking methods to meet the 
     challenges of accurately and cost-effectively counting an 
     increasingly hard-to-count population in 2000. An accurate 
     census is crucial because the Constitution requires that it 
     be used to apportion seats in the Congress. Additionally, 
     census data are used for a host of other important 
     activities, including federal and state redistricting, the 
     implementation and enforcement of the Voting Rights Act, and 
     the distribution of billions of dollars of federal and state 
     funds each year. Because of its centrality to decisions that 
     last 10 years, the bureau must address concerns about the 
     content and method of conducting the census raised by its 
     stakeholders--federal, state, and local governments and a 
     myriad of advocacy groups whose constituents are affected by 
     census results.
       The 1990 census was long, expensive, and labor-intensive, a 
     situation exacerbated by a lower-than-expected public 
     response. Because of the low response, the bureau required 
     additional appropriations from the Congress during the census 
     to complete the count. Despite the census' higher cost, post-
     analysis concluded that the count was less accurate than that 
     of the 1980 census. Particularly alarming to the Congress and 
     other stakeholders was the increase over past censuses in the 
     disproportionate undercount of minorities.
       The Congress convened a panel of experts from the National 
     Academy of Sciences to study these problems and recommend 
     actions to address them. In 1994, the panel determined that 
     traditional counting methods alone are no longer sufficient, 
     and recommended that to contain cost and increase accuracy, 
     the bureau use statistical sampling and estimation as an 
     integral part of the 2000 census design. In addition, the 
     panel recommended that the bureau rethink and reengineer the 
     entire census process and operations. The bureau agreed with 
     the panel's recommendations and decided to incorporate 
     sampling and estimation, multiple response modes, updated 
     computing tools, and an improved national address file into 
     the design.
       The dress rehearsal, scheduled to begin in the spring of 
     1998, offers the Census Bureau its first opportunity to test 
     the interrelationships of the various decennial design 
     components. The bureau plans to closely approximate all major 
     decennial components and their supporting automated systems 
     in the dress rehearsal. Only a complete dress rehearsal will 
     allow the bureau and outside

[[Page H913]]

     observers to document the efficacy of the 2000 census design.

               OIG Monitoring of Decennial Census Design

       The OIG has long been concerned about the need for the 
     bureau to develop a sound decennial design. In an inspection 
     report issued two years ago, we concluded that the bureau had 
     not sufficiently refined and optimized a design that was 
     supported by adequate research and analysis and that it 
     lacked a credible cost estimate.\1\ Among our recommendations 
     was that the bureau derive a coherent, substantiated, cost-
     effective design for meeting decennial goals. Since that 
     time, we have continued to monitor the bureau's progress in 
     finalizing its design, offering our views on what actions 
     needed to be taken.
---------------------------------------------------------------------------
     See footnotes at end of article.
---------------------------------------------------------------------------
       This paper was developed in response to a request made by 
     Senator John McCain, Chairman of the Committee on Commerce, 
     Science, and Transportation, at a May 14, 1997, oversight 
     hearing on the Department of Commerce. The Chairman wanted 
     the OIG's perspective on milestones that the Census Bureau 
     needs to meet in order to ensure a successful census, 
     intending to use this information as a benchmark to track the 
     progress of the census.
       To define the requested decennial census milestones and 
     associated risks, we present several analyses of the design 
     using some of the bureau's activities for the dress rehearsal 
     and the census itself. First, we identify the key activities 
     and design components in each of the four phases of the 
     census. Then we briefly describe how the Master Activity 
     Schedule defines relationships between activities and 
     calculates start and finish dates. Based on the body of work 
     done by our office, we next provide a design risk analysis, 
     component by component. Since few dress rehearsal activities, 
     and even fewer decennial activities, have yet occurred, we 
     identify potential future delays in milestone activities.


                               background

                        Decennial Census Phases

       Pre-Enumeration. Before census enumeration can start, the 
     Census Bureau must produce, distribute, and publicize the 
     2000 Census questionnaire. Perhaps the most complex step in 
     this process is creating the Master Address File (MAF)--the 
     list of addresses of all households to be counted in the 
     census. The MAF is being developed from information obtained 
     from the Postal Service, the 1990 census, local governments, 
     and field checks. Rural address capture requires temporary 
     staff to canvass areas that have rural delivery routes or 
     post office boxes. Before the MAF is finalized, it will be 
     sent to local governments for review and correction.
       Enumeration. Once all address information is complete, the 
     bureau will create the address file that will be used to 
     label questionnaires. Questionnaires will then be distributed 
     to households in one of two ways, depending on whether they 
     are in urban or in rural areas. Questionnaires with urban, 
     city-style addresses will be delivered by Postal Service mail 
     carriers. In rural areas, temporary census staff will drop 
     off questionnaires at each household and verify the location 
     of residences in the process.
       There will always be some individuals who do not return a 
     questionnaire or do not receive one in the first place. To 
     allow residents to obtain census forms at locations other 
     than their residences, the bureau will distribute additional 
     census forms, known as ``Be Counted'' forms, at high-profile 
     public places. Distribution sites in each community will be 
     determined through consultation with local officials and 
     community organizations. Additionally, temporary staff will 
     visit shelters and soup kitchens to enumerate transient 
     populations.
       The Census Bureau anticipates that about two-thirds of all 
     households will mail back a census form. To obtain 
     information on the remaining one-third of households, 
     temporary staff will visit them and attempt to conduct in-
     person census enumeration. Interviewers will obtain responses 
     from at least 90 percent of all households in each census 
     tract before terminating their activities. The bureau will 
     use statistical estimation to determine the characteristics 
     of the remaining nonrespondents.
       Processing. As census questionnaires are mailed back, 
     collected through follow-up interviews, or received over the 
     telephone, they are sent to one of several processing 
     centers. The data is then ``captured,'' or translated from 
     paper to electronic format for computer processing. 
     Questionnaires from within a defined geographic area are 
     compared to eliminate any duplicate responses from a single 
     household. The results are compiled into the unedited census 
     file, which is used in the post-enumeration phase to produce 
     final counts.
       Post-Enumeration. After enumeration and processing, the 
     Census Bureau will conduct an independent survey, called the 
     Integrated Coverage Measurement (ICM) survey, during which 
     750,000 households will be re-interviewed by temporary staff. 
     These second interviews serve as a quality check on all 
     preceding census activities. Responses to the ICM survey will 
     then be matched to each household's original census form, if 
     one was obtained, and the data transmitted to census 
     headquarters. The results of the quality check will be used 
     in calculating the final statistical adjustment of the census 
     count.
       At the end of December 2000, the Census Bureau will deliver 
     to the Congress the population counts to be used in 
     reapportionment. By April 2001, the bureau will release the 
     redistricting data to the states. Later, the census database 
     will be formatted for use by other data users--federal 
     agencies, state and local governments, and the general 
     public.

                           Project Management

       To help manage the planning for the 2000 Census, the Census 
     Bureau spent much of 1997 building its Master Activity 
     Schedule (MAS) for the census. The schedule was developed 
     using Primavera Project Planner (P3), a sophisticated project 
     management software tool. P3 allows the bureau to identify 
     relationships among activities in the schedule, such as 
     whether one activity must be completed before another can 
     start, or whether two must end at the same time. Using 
     activity durations developed by the bureau, P3 calculates the 
     earliest date an activity can begin based on its relationship 
     to predecessor activities, as well as the latest date an 
     activity can begin before it delays successor activities. The 
     interval between those two dates is known as ``float'' time.
       The bureau's planned beginning and ending dates for each 
     activity generally fall within the float period. Activities 
     with zero or negative float are considered critical, meaning 
     that they either are delaying or will delay subsequent 
     activities unless their durations are shortened. In part 
     because P3 provides the bureau with the opportunity to vary 
     activity durations or relationships as part of ``what if'' 
     analyses, it is an important tool in determining the cost, 
     schedule, and performance trade-offs inherent in implementing 
     the census.
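
       As an illustration only, the float calculation described 
     above can be sketched in a few lines of Python. This is a 
     minimal sketch, not the bureau's P3 schedule: the activity 
     names, durations, and the optional deadline parameter are 
     hypothetical, and only finish-to-start relationships are 
     modeled.

```python
from graphlib import TopologicalSorter

# Hypothetical activities: name -> (duration in days, predecessor names).
# Names and durations are illustrative and are not taken from the MAS.
ACTIVITIES = {
    "build_address_list": (30, []),
    "print_questionnaires": (20, ["build_address_list"]),
    "recruit_staff": (25, []),
    "deliver_questionnaires": (10, ["print_questionnaires"]),
    "nonresponse_follow_up": (42, ["deliver_questionnaires", "recruit_staff"]),
}


def compute_float(activities, deadline=None):
    """Return {name: float in days}, assuming finish-to-start links only."""
    graph = {name: preds for name, (_, preds) in activities.items()}
    order = list(TopologicalSorter(graph).static_order())

    # Forward pass: an activity can start once all predecessors finish.
    earliest = {}
    for name in order:
        preds = activities[name][1]
        earliest[name] = max(
            (earliest[p] + activities[p][0] for p in preds), default=0
        )

    project_end = max(earliest[n] + activities[n][0] for n in order)
    if deadline is None:
        deadline = project_end  # no imposed deadline: float is never negative

    # Build the successor lists needed for the backward pass.
    successors = {name: [] for name in activities}
    for name, (_, preds) in activities.items():
        for p in preds:
            successors[p].append(name)

    # Backward pass: latest start that does not delay any successor.
    latest = {}
    for name in reversed(order):
        latest_finish = min(
            (latest[s] for s in successors[name]), default=deadline
        )
        latest[name] = latest_finish - activities[name][0]

    return {name: latest[name] - earliest[name] for name in order}


floats = compute_float(ACTIVITIES)
for name, slack in floats.items():
    status = "critical" if slack <= 0 else "has float"
    print(f"{name}: float={slack} days ({status})")
```

       In this hypothetical network only staff recruiting has 
     float. Imposing a deadline earlier than the computed end date, 
     for example compute_float(ACTIVITIES, deadline=95), yields 
     negative float on the critical activities, which is the 
     condition the schedule flags as already delaying successors.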
       The milestones identified throughout this analysis come 
     from the MAS as of late October 1997. For major milestones, 
     we selected important end points from a possible list of 
     several thousand activities in the schedule. Unless otherwise 
     specified, we used the bureau's planned start and finish 
     dates. Appendixes I and II to this paper list key dress 
     rehearsal and decennial milestones from the schedule. 
     Appendix III depicts the interrelationships among those key 
     activities as portrayed in the schedule. Appendix IV provides 
     a summary of our results.


                             risk analysis

                       Phase One: Pre-enumeration

                       Master Address File (MAF)

                               Background

       In 1990, the bureau purchased commercial address lists, 
     available only for metropolitan areas, to begin its address-
     building process. Temporary field staff went door-to-door 
     nationwide in 1989 to develop the 1990 Census Address Control 
     File. Because the address list was the source of millions of 
     errors, it was a good candidate for reengineering. 
     Further, the list was of particular interest to local 
     officials, who believed that they could help to improve 
     it. In October 1994, partially in response to local 
     government requests, the Congress passed Public Law 103-
     430, which requires the bureau to allow local governments 
     to review its address list before the 2000 decennial. 
     Consequently, bureau officials adopted an address-building 
     program that centered on partnerships with the U.S. Postal 
     Service and up to 39,000 local governments to build and 
     review the MAF before the census.
       This program was designed to produce an improved list at a 
     lower cost by assigning a unique geographic code to city-
     style addresses based on the bureau's mapping system. This 
     list is a combination of addresses from the Postal Service, 
     the 1990 census, and local governments. Rural address capture 
     would still require temporary staff to canvass areas that had 
     rural delivery routes or post office boxes. The address list 
     that emerged from both sets of activities would be sent to 
     local governments for review and corrections. In addition to 
     meeting the legal requirement for local government review of 
     the address list before the 2000 census, this review would 
     enable the bureau to obtain the most current information 
     available while receiving early acceptance from local 
     officials to preclude challenges after the census.

                           Activities at risk

       Developing base MAF. Although the MAF program seemed sound 
     in concept, when bureau staff began implementing it, a number 
     of deficiencies became apparent. The quality, currency, and 
     usability of the Postal Service and local government address 
     lists varied greatly. Additionally, few local governments 
     participated in the address-building part of the program. The 
     bureau addressed these deficiencies by planning for targeted 
     canvassing operations, such as a search for hidden units and 
     checks of multi-unit structures. However, as time progressed, 
     bureau analysts became increasingly alarmed about their 
     inability to clearly identify the attributes of areas where 
     errors would be most likely to occur. If it cannot identify 
     such attributes, the bureau will be unable to accurately 
     select the areas in need of the planned targeting, resulting 
     in error-prone areas not being among those checked.
       Acknowledging the MAF program concerns, during this past 
     summer, the bureau's Deputy Director established a team to 
     assess the 2000 decennial address-list building strategy. 
     Finding this strategy to be complex, risky, and incapable of 
     providing an adequate final product, the assessment team 
     concluded that a 1990-style, 100-percent field check was 
     essential and that the local review process needed to be 
     redesigned. Consequently, the bureau has requested an 
     additional $108.7 million to complete the MAF

[[Page H914]]

     building process. Bureau officials say that, if the funding 
     request is denied, they will reprogram the money from other 
     areas to conduct the field operation.
       Conducting local review of MAF. Despite its conclusions and 
     the associated need for additional funds, the assessment team 
     developed performance measures based on the number of local 
     governments participating in MAF building. These 
     participation measures seem to be considered as important as 
     quality measures. This apparent emphasis is troubling since 
     evidence suggests that, in some cases, local lists may 
     contain significant numbers of inappropriate or erroneous 
     addresses.
       Further, the redesign calls for a more interactive 
     process with greater technical assistance from 
     the bureau; as a result, depending on the intensity of the 
     bureau's efforts and the number of local governments 
     participating, the bureau could be facing an enormous 
     unanticipated resource drain. For example, local officials 
     may require detailed geographic assistance to conduct reviews 
     consistent with MAF requirements or technical assistance to 
     match and unduplicate multiple lists using computer software. 
     However, the current program infrastructure calls for staff 
     whose primary skills are in public relations, not technical 
     support. If the emphasis on local participation is not 
     subordinated to quality concerns and the local reviews become 
     unexpectedly numerous and intense, either cost and complexity 
     will further increase or MAF accuracy will decrease.

                               Conclusion

       To deliver the decennial MAF on schedule, the bureau must 
     receive additional funding, reprogram funds, or accept 
     potential quality shortfalls.

                         Phase Two: Enumeration

                         Nonresponse Follow-up

                               Background

       The largest single operation in the decennial census is 
     nonresponse follow-up--repeat mailings, visits, and telephone 
     calls to nonresponding households. In 1990, 35.7 million 
     housing units required follow-up. In 2000, nonresponding 
     housing units will reach nearly 40 million, if the bureau's 
     projections of voluntary mail response are correct.
       After the traditional mail-out/mail-back phase of the 
     census, the 2000 plan calls for applying new methods, such as 
     making questionnaires (known as Be Counted forms) widely 
     available in up to 32 languages, and other coverage 
     improvement programs to further boost participation. Then, 
     the bureau will end the initial enumeration phase, tally the 
     responses in each census tract, and select a sample of the 
     remainder of sufficient size to increase response rates in 
     each tract to at least 90 percent. Using this strategy, 
     according to bureau projections, will reduce the nonresponse 
     workload to about 22 million housing units.
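
       The per-tract arithmetic behind this strategy can be 
     sketched as follows. This is a hedged illustration, not the 
     bureau's sampling procedure: the tract size, the response 
     count, the function names, and the use of simple random 
     selection are all assumptions, and the sketch does not attempt 
     to reproduce the bureau's 22 million projection.

```python
import random


def follow_up_sample_size(housing_units, responded, target_percent=90):
    """Nonresponding units to visit so responses reach target_percent."""
    nonrespondents = housing_units - responded
    # Integer ceiling division avoids floating-point rounding.
    target_responses = -(-target_percent * housing_units // 100)
    return max(0, min(target_responses - responded, nonrespondents))


def select_sample(nonrespondent_ids, size, seed=0):
    """Pick the units to visit; simple random sampling is an assumption."""
    return random.Random(seed).sample(list(nonrespondent_ids), size)


# Hypothetical tract: 1,600 housing units, 1,072 forms returned (67 percent).
units, returned = 1600, 1072
size = follow_up_sample_size(units, returned)
visits = select_sample(range(returned, units), size)
estimated = units - returned - size
print(f"visit {size} of {units - returned} nonrespondents; "
      f"estimate the remaining {estimated}")
```

       For this hypothetical tract, 368 of the 528 nonresponding 
     units would be visited and the remaining 160 would be 
     estimated. A tract whose initial response already meets the 
     target would require no follow-up sample.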
       In addition to using statistical methods, another strategy 
     for the 2000 census is building partnerships at every stage 
     of the process with state, local, and tribal governments; 
     community-based and other organizations; and the private 
     sector. The bureau believes that such partnerships are 
     valuable because local officials and community leaders 
     understand and know their communities, and can therefore help 
     to tailor plans for conducting the census. Local and tribal 
     governments will have the opportunity to review, confirm, and 
     augment the list of neighborhoods identified for targeting 
     methods, including distributing Be Counted forms in multiple 
     languages. Additionally, community-based organizations and 
     local governments will help the bureau to identify strategic 
     and high-visibility locations to serve as Be Counted form 
     distribution sites.
       According to bureau officials, despite the significant 
     reduction in workload under the current sampling strategy, 
     the single biggest threat to a successful census is 
     completing nonresponse follow-up within six weeks so that the 
     ICM survey can be completed in time to meet the December 31, 
     2000, legislative deadline.

                           Activities at risk

       Making Be Counted forms widely available in multiple 
     languages. The 2000 decennial census program to improve 
     coverage of the hard-to-enumerate by targeting questionnaires 
     in multiple languages may not be necessary and may conflict 
     with the bureau's dual goals of increasing accuracy and 
     containing costs.\2\ The program may be unnecessary because 
     the bureau has made sampling an integral part of its 2000 
     design to compensate for ineffective coverage improvement 
     programs used in past censuses. Further, the 1995 Census Test 
     results indicated that targeting areas with blank census 
     questionnaires in multiple languages did not increase 
     response rates for the intended populations.
       Although specific program details are not yet in place, if 
     the program is large and results in an unanticipated increase 
     in the workload, it could hamper the bureau's ability to 
     complete nonresponse follow-up on schedule. According to 
     decennial census managers, the limited period available to 
     complete nonresponse follow-up in time to conduct the ICM 
     survey is the single biggest risk in the census. A delay in 
     the start of the survey could compromise the bureau's ability 
     to deliver the appointment counts to the President by the 
     legal deadline.
       Acknowledging these limitations, bureau managers have 
     identified the goal of promoting partnerships as a 
     justification for expanding the number of languages included, 
     suggesting that measures of cost effectiveness are less 
     important. Given bureau managers' intensive efforts to 
     communicate and implement partnerships, community leaders are 
     likely to expect to play a significant role in determining 
     the program's ultimate scope and nature. In light of past 
     experience, local officials will probably advocate an 
     expansive program. Unless cost-effectiveness is a fundamental 
     criterion, program cost growth is likely.
       Conducting nonresponse follow-up. A long-standing bureau 
     concern has been the difficulty and expense of recruiting, 
     hiring, training, and retaining a qualified, temporary 
     workforce. Even under a sampling scenario, this task involves 
     recruiting millions of people to ensure the hiring of about 
     500,000 staff to maintain a peak workforce. The magnitude of 
     the problem is exacerbated by a number of potential external 
     developments over which the bureau would have little or no 
     control; e.g., a decline in voluntary mail response rates 
     below the projected 67 percent, a booming economy shrinking 
     the available workforce, or a greater-than-expected 
     difficulty in enumerating nonrespondents.
       To help address the workforce problem, the bureau 
     contracted with WESTAT Inc. to devise a formula to calculate 
     the optimal pay rate for each area of the country to minimize 
     staff turnover without unnecessarily increasing wages. WESTAT 
     concluded that the bureau could achieve an 80 percent 
     turnover rate (a significant improvement over 1990) by 
     setting wage rates at 70 percent of locally prevailing rates 
     and by increasing the number of enumerators working at any 
     one time by 50 percent over 1990. Given the nearly 
     unprecedented pace and scale of hiring involved, however, 
     WESTAT's calculations are subject to uncertainty. (For a 
     discussion of some of the estimation issues related to 
     nonresponse follow-up, see the ICM/Estimation section.)

                        Phase Three: Processing

                            Data Processing

                               Background

       Unlike with previous labor-intensive decennial censuses, 
     the bureau's plan for the 2000 decennial depends heavily on 
     technology and automation. In previous censuses, the bureau 
     used internally designed and developed technology for data 
     processing. A prime example is its approach to data capture, 
     the process of translating data from paper questionnaires to 
     an electronic format for computer processing. Because the 
     system that the bureau used in 1990 is expensive, obsolete, 
     and unsupportable, the bureau is acquiring a modern system, called 
     Data Capture System 2000 (DCS 2000), which uses electronic 
     imaging. The bureau is seeking to maximize the use of 
     commercial-off-the-shelf components for DCS 2000, but the 
     unique and stringent decennial census requirements 
     necessitate customizing parts of the system. Further, DCS 
     2000 is a key system for the 2000 census because every 
     response to a census questionnaire or personal visit must be 
     processed through the system in order to become a part of the 
     census.
       Once all census questionnaires are processed, 
     questionnaires potentially from the same address or person 
     must be matched and ``unduplicated.'' In the 1990 census, 
     census questionnaires were tightly controlled, with a unique 
     identification number printed on each, and only one was sent 
     to each household. Conversely, a key strategy for the 2000 
     Census is making questionnaires widely available. The bureau 
     plans to mail two questionnaires to every household in the 
     nation; mail a follow-up questionnaire to large households; 
     place unaddressed questionnaires, called ``Be Counted'' 
     forms, in public places; and allow responses by telephone and 
     possibly over the Internet. The potential for duplication is 
     therefore much greater than in previous censuses.
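
       To make the unduplication problem concrete, the following 
     is a minimal sketch of matching responses by address. It is 
     not the bureau's matching system, whose rules are not 
     described here: the record layout, the normalization steps, 
     and the rule of keeping the first response received are all 
     assumptions made for illustration.

```python
def normalize_address(address):
    """Uppercase, drop periods and commas, and collapse whitespace."""
    cleaned = address.upper().replace(".", " ").replace(",", " ")
    return " ".join(cleaned.split())


def unduplicate(responses):
    """Keep the first response per normalized address (assumed rule)."""
    kept, duplicates = {}, []
    for response in responses:
        key = normalize_address(response["address"])
        if key in kept:
            duplicates.append(response)
        else:
            kept[key] = response
    return list(kept.values()), duplicates


# Hypothetical responses arriving through several response modes.
RESPONSES = [
    {"address": "12 Oak St., Apt 3", "mode": "mail"},
    {"address": "12 OAK ST APT 3", "mode": "be_counted"},
    {"address": "40 Elm Ave.", "mode": "telephone"},
    {"address": "40 elm ave", "mode": "mail"},
    {"address": "7 Pine Rd.", "mode": "mail"},
]

unique, dupes = unduplicate(RESPONSES)
print(f"{len(unique)} unique households, {len(dupes)} duplicate responses")
```

       Exact matching on a normalized address catches only the 
     simplest duplicates. Unaddressed forms and responses that 
     describe the same person at different addresses would require 
     person-level matching, which this sketch does not attempt.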

                           Activities at risk

       Capturing data from census questionnaires. The bureau's 
     plan for testing and implementing DCS 2000 appears feasible, 
     but only if two conditions are met. First, the bureau must 
     fund the contractor at agreed-upon levels. Second, the 
     processing plan cannot be altered significantly to 
     accommodate changes from other decennial census activities. 
     If the bureau fails to meet the first condition, the 
     contractor will be unable to provide full functionality. The 
     DCS 2000 project faces the continuing threat of funding 
     shortfalls. Without needed funds, the contractor will be 
     unlikely to complete the full range of planned testing, which 
     increases the risk of delays during operations.
       If other parts of the decennial census require changes 
     (e.g., in the questionnaire design or to the duration of the 
     Be Counted program), either increased funding will be needed 
     to pay for additional equipment and tasking, or the system 
     will be unable to perform at the required level. For example, 
     the bureau will be unable to process Be Counted forms in 
     languages other than English until they are translated. If 
     large quantities of Be Counted forms are submitted late in 
     the census, the bureau will have to wait for translators to 
     complete their work. To compensate for the delay, the bureau 
     will have to process data in extra shifts, reduce quality 
     assurance procedures, or extend the processing period. If the 
     bureau is unable to process all questionnaires by its ``drop 
     dead date,'' the matching of the census data to the ICM 
     survey will be delayed, jeopardizing timely census 
     completion.
       Conducting matching and unduplication of census 
     questionnaires and concluding all ICM matching. Because 
     limited time is available
     for processing the millions of questionnaires involved in the 
     2000 census, the bureau must rely heavily on automated 
     procedures to match potential duplicate questionnaires. 
     Preparing the algorithms necessary to automate the matching 
     process requires a set of detailed rules indicating what 
     constitutes a match and a duplicate. Those rules cannot be 
     completed until the programs under which questionnaires will 
     be made available are fully defined. The uncertainties 
     associated with the bureau's plan to use the telephone, the 
     Be Counted campaign, and a second questionnaire mailing, as 
     well as each one's interaction with the sample design, have 
     delayed the preparation of the automated matching rules.
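       As a rough illustration of what a "detailed rule" for 
     unduplication might look like, the sketch below treats two 
     responses as duplicates when their normalized address and name 
     agree. The rule, the field names, and the sample records are 
     hypothetical and are not taken from the bureau's 
     specifications, which the report says were not yet complete.

```python
# Hypothetical matching rule: two responses are duplicates when their
# normalized address and name are identical. Not the bureau's actual rule.

def normalize(text):
    """Lower-case the text and collapse runs of whitespace."""
    return " ".join(text.lower().split())

def unduplicate(responses):
    """Keep the first response seen for each (address, name) key.

    Returns a tuple (kept, duplicates).
    """
    seen = {}
    duplicates = []
    for resp in responses:
        key = (normalize(resp["address"]), normalize(resp["name"]))
        if key in seen:
            duplicates.append(resp)
        else:
            seen[key] = resp
    return list(seen.values()), duplicates

# Made-up responses arriving through three of the channels named above.
responses = [
    {"source": "mail", "address": "12 Oak Ave", "name": "Ana Diaz"},
    {"source": "be_counted", "address": "12  oak ave", "name": "ana diaz"},
    {"source": "telephone", "address": "14 Oak Ave", "name": "Lee Park"},
]

kept, duplicates = unduplicate(responses)
print(len(kept), "kept;", len(duplicates), "flagged as duplicate")
```

       Even this toy rule depends on knowing which response 
     channels exist and what fields each one supplies, which is why 
     undefined programs hold up the rule writing.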
       In fact, it appears the bureau's concern about its ability 
     to automate this process caused it to limit to one block the 
     size of the area it will search for potential duplicates for 
     both the census and the ICM survey. Limiting the search area 
     decreases computational complexities and timing constraints, 
     but increases the likelihood of duplication because housing 
     units placed erroneously in adjacent blocks will go 
     undetected. This limitation is particularly problematic for 
     matching the ICM survey and census results because it 
     increases the likelihood that a household could be 
     incorrectly designated as undercounted.
       For example, if a household at 1075 Main Street is 
     mistakenly recorded as 1076 Main Street in the ICM survey, 
     the household will be incorrectly sorted across the street 
     from its actual location and placed in an adjacent block. A 
     matching process that searched nine blocks, as was previously 
     considered, would probably discover that this household had 
     been enumerated in the census. A single-block search would 
     not find this household's census enumeration and would 
     erroneously include the household in the undercounted 
     population. An abbreviated search area would virtually 
     guarantee more errors in the ICM survey.
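       The Main Street example can be sketched in a few lines. The 
     sketch assumes, purely for illustration, that blocks sit on a 
     grid identified by (row, column), so that a nine-block search 
     is the block itself plus its eight neighbors; the household 
     name, the neighboring record, and the matching key are 
     invented.

```python
# Illustrative only: blocks are modeled as (row, col) cells on a grid,
# and a match is an identical household name. Both are assumptions.

def search_blocks(center, radius):
    """Return the set of blocks within `radius` cells of `center`."""
    row, col = center
    return {
        (row + dr, col + dc)
        for dr in range(-radius, radius + 1)
        for dc in range(-radius, radius + 1)
    }

def find_census_match(icm_record, census_by_block, radius):
    """Look for the ICM household among census records in the search area."""
    for block in search_blocks(icm_record["block"], radius):
        for rec in census_by_block.get(block, []):
            if rec["household"] == icm_record["household"]:
                return rec
    return None

# The census enumerated the household correctly at 1075 Main Street.
census_by_block = {
    (0, 0): [{"household": "Smith", "address": "1075 Main Street"}],
    (0, 1): [{"household": "Jones", "address": "1078 Main Street"}],
}

# The ICM survey recorded 1076, which sorts the household across the
# street into the adjacent block (0, 1).
icm_record = {"household": "Smith", "address": "1076 Main Street", "block": (0, 1)}

single_block = find_census_match(icm_record, census_by_block, radius=0)
nine_block = find_census_match(icm_record, census_by_block, radius=1)

print("single-block search:", single_block)
print("nine-block search:", nine_block)
```

       With a radius of zero the household looks unenumerated and 
     would be added to the undercount; with a radius of one the 
     census record is found.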
       Errors in both the census and the ICM matching will be 
     further exacerbated without adequate software development and 
     testing. To date, however, the bureau has not completed 
     defining the matching rules and other procedural requirements 
     needed to develop the specifications to guide software 
     developers. Without adequate software, the matching and 
     unduplication process will ultimately depend more heavily on 
     labor-intensive clerical procedures, which are expensive, 
     time-consuming, and error-prone. A high rate of errors in 
     this arena could result in overcounts for certain groups, 
     which could exacerbate the differential undercount, given 
     that the method used in the ICM survey operates through 
     ``netting out'' over- and undercounts. (See the Post-
     Enumeration Phase for more discussion about issues associated 
     with completing the survey.)

                               Conclusion

       Completing processing of census questionnaires in time to 
     deliver the census unedited file to the ICM survey will 
     require stability in the rest of the design, which appears 
     unlikely. Moreover, to deliver accurate apportionment counts 
     on time, the bureau must have well-defined, automated 
     procedures to match and weed out duplicate questionnaires. 
     Without improvements in this area, quality may suffer.

                      Phase Four: Post-Enumeration

                    Integrated Coverage Measurement

                               Background

       The census has always had an undercount. Since 1940, the 
     Census Bureau has been able to measure the undercount; since 
     1990, methods have been sophisticated enough to 
     consider correcting for it. In the 1990 decennial census, 
     the bureau intentionally produced two sets of numbers: the 
     census counts and the counts ``adjusted'' through a 
     quality check called the Post Enumeration Survey (PES). 
     The PES was a separate operation conducted upon the 
     completion of regular census operations, in order to 
     provide the option of adjusting the census counts for 
     over- and undercounts. The results did not have to be 
     completed as early as the first set of counts. Opposition 
     to the adjustment ranged from technical to parochial, and 
     the adjustment was not made. Bureau statisticians later 
     conducted extensive analysis of the PES design, 
     methodology, and results to help them develop the next-
     generation PES--the 2000 ICM survey.
       The 1990 PES and the 2000 ICM survey differ in size, 
     precision, and function. A major criticism of the PES was the 
     use of indirect state estimates, which were based on samples 
     from several states combined. In response to this criticism, 
     the bureau increased the 2000 ICM sample size fivefold (to 
     750,000 households) to ensure that each state would have a 
     large enough sample to allow for direct state estimates. This 
     increase will provide every state with comparable levels of 
     accuracy, as well as the assurance that corrections to a 
     state's count are derived from residents of that state. 
     Partially as a result of this change, the ICM survey should 
     define the undercounted groups more precisely than the PES 
     did. The survey should also feature improved categorization 
     of subgroups that would share a probability of being counted 
     or missed.
       The most significant difference is that the ICM survey will 
     be integrated into overall census operations, producing a 
     single set of official Census Bureau counts. This ``one-
     number census'' is intended to be a seamless, accurate 
     calculation of the population that will not distinguish 
     between a housing unit determined through the ICM survey and 
     one enumerated in any other manner. The bureau plans to 
     provide data users with a single point estimate of a relevant 
     population count and its combined level of error.

                           Activities at risk

       Conducting ICM Field Interviews: ICM Size and Schedule. 
     Because of its complexity, the ICM survey is highly 
     vulnerable. In particular, the survey's magnitude, quality 
     demands, and tight schedule all present serious challenges. 
     Other than the census itself, the ICM is the largest survey 
     the bureau will ever have undertaken--the bureau must survey 
     750,000 households in 25,000 census tracts nationwide. 
     Because the ICM survey serves as a quality measure and 
     adjustment for the entire census, it must also be extremely 
     accurate. The bureau has stated that the survey must have a 
     98-percent response rate to produce a high-quality, accurate 
     adjustment.
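       To give a sense of scale, the figures in this paragraph can 
     be worked through directly. The sketch below uses only the 
     numbers stated above (750,000 households, 25,000 tracts, a 
     98-percent target); the per-tract average and the comparison 
     with a 95-percent rate are simple derived values, not figures 
     from the report.

```python
# Figures stated in the report.
households = 750_000
tracts = 25_000
target_pct = 98

# Derived values (integer arithmetic keeps the results exact).
per_tract = households // tracts                      # average households per tract
required = households * target_pct // 100             # completed interviews needed
allowed_nonresponse = households - required           # households that may go unanswered
at_95_pct = households * 95 // 100                    # completed interviews at 95 percent
extra_interviews = required - at_95_pct               # added effort to go from 95 to 98

print("average households per tract:", per_tract)
print("interviews required at 98 percent:", required)
print("nonresponding households allowed:", allowed_nonresponse)
print("additional interviews beyond a 95-percent rate:", extra_interviews)
```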
       Perhaps the biggest obstacle facing the implementation of 
     the survey is the time pressure it faces at both ends. At the 
     front end, survey interviews cannot take place until the 
     bureau receives a household's initial census response. 
     Because the survey is one of the last census operations, it 
     is already at risk of delay from lags in earlier projects, 
     like nonresponse follow-up. If the survey begins late, ICM 
     activities themselves could require ad hoc operational 
     shortcuts, sure to compromise quality. At the back end, the 
     bureau must implement a whole host of complex estimation and 
     review steps.
       Interview Mode. As one approach to ensure quality, the 
     bureau plans for its thousands of interviewers to use laptop 
     computers, rather than paper and pencil. Originally, the 
     bureau selected Computer Assisted Personal Interviewing 
     (CAPI) to save time by eliminating the need to process paper 
     questionnaires and to improve quality through standardization 
     of interviews and built-in quality control measures. 
     Unfortunately, this area is subject to cost growth, because 
     the bureau's cost estimates for the ICM survey do not fully 
     capture the costs necessary to successfully manage, 
     implement, and process it. Areas of likely cost growth 
     include better-trained interviewers, a technical support 
     structure, a more complicated field structure to implement 
     laptop use, additional telecommunications to transmit data to 
     headquarters for processing, special contractual arrangements 
     with vendors to ensure the readiness of CAPI software, and 
     hardware delivery nationwide.
       To alleviate time pressures, the bureau recently decided to 
     include in the dress rehearsal some early ICM interviews over 
     the telephone after a household has returned its census 
     questionnaire but before nonresponse follow-up has been 
     completed in the block. Not having been tested, this approach 
     introduces new risks and complications. Using two ICM 
     interview techniques poses methodological concerns, and early 
     enumeration could violate the separation of the census and 
     the ICM survey. The integrity of the ICM design hinges on the 
     assumption that it is fully independent of nonresponse 
     follow-up. If residents or enumerators realize that a block 
     is in the ICM sample before nonresponse follow-up is 
     complete, independence is compromised, error is introduced, and 
     the ICM survey becomes a less effective correction for the 
     undercount. Ultimately, because early telephone ICM 
     interviews only recently became the subject of serious 
     consideration, there has not been enough time to develop a 
     solid understanding of their implications. An attempt will be 
     made to validate this approach during the dress rehearsal.
       Concluding All ICM Matching: Matching. The most sensitive 
     aspects of ICM quality control arise after initial field 
     interviews, when ICM responses are matched to census 
     responses and when interviewers conduct follow-up, or 
     reconciliation, interviews. The two sets of responses must be 
     compared to identify who was missed or erroneously counted in 
     census operations. Households that have not yet been counted 
     in the ICM survey, or that have offered incomplete or 
     inconsistent responses, must then be contacted by expert 
     interviewers. These final steps will be critical to minimize 
     error and to raise response rates to the necessary 98 
     percent.
       Response Rate. Current ICM interview plans propose a 
     response rate of 98 percent, since research has shown that 
     the undercount correction could be imprecise at response 
     rates as high as 95 percent. Raising response rates to 98 
     percent will require exhaustive efforts to contact all 
     households. In fact, some senior decennial census field 
     division managers do not find that goal realistic. If the ICM 
     survey begins late, the probability of achieving such a high 
     response rate is further reduced. Perhaps the only solution 
     involves using statistical methods (imputation) or sampling 
     of ICM nonrespondents (subsampling). The bureau is 
     considering the implications of both of these options. 
     Continued indecision in this area limits the bureau's 
     opportunities to address the ICM survey's quality assurance 
     measures. However, at present, the bureau does not fully 
     understand how the treatment of ICM nonrespondents will 
     interact with other design components, contribute to error, 
     or otherwise influence the results.
       Movers. Further, the bureau has yet to finalize decisions 
     about handling ICM responses from households that move in and 
     out of ICM blocks between census day and ICM enumeration. 
     Since the 1990 census, there have been concerns about 
     accurately enumerating movers in the ICM survey. The
     bureau's decision to select a means for handling movers was 
     expected during the summer of 1997. Instead, the bureau will 
     test different methods for the treatment of movers during the 
     dress rehearsal, and will select an approach after analyzing 
     dress rehearsal results. Because of the delay of this 
     decision, there will be limited time to evaluate the selected 
     method, address any questions arising from the dress 
     rehearsal, and prepare software specifications and quality 
     assurance measures relating to movers. The treatment of 
     movers is yet another example of the questions that remain 
     about the reliability of matching and follow-up and the 
     adequacy of quality control in these operations.
       Combining All Estimation Streams to Produce Final Counts. 
     Census 2000 includes numerous avenues for data collection and 
     statistical adjustment; late in the census, all these 
     elements must be brought together into one file. Nonresponse 
     follow-up will estimate the characteristics of the final 
     nonresponding portion of the population and merge the results 
     into the census data file. Included in nonresponse follow-up 
     are a number of unique treatments for a series of special 
     populations. For example, the bureau must estimate how many 
     housing units in the address file are vacant buildings and 
     adjust census files to include counts for transient 
     populations. Finally, the file will incorporate ICM 
     estimates.
       Estimation Design and Quality Control. Because this process 
     is long and complex and operates under a tight schedule, there 
     will be many opportunities for operational and statistical 
     errors. These conditions heighten the need for procedures to 
     control for sampling and non-sampling error, while also 
     managing the interplay of estimation and software components. 
     Given the importance of ensuring that undiscovered errors do 
     not creep into the final results, the bureau must ensure 
     timely development, refinement, and testing of the software. 
     These activities cannot be undertaken until the bureau 
     solidifies the estimation design.
       However, estimation associated with the ICM survey in 
     particular faces lingering methodological questions. 
     Decennial census managers intend to make all sampling and 
     estimation design decisions by December 31, 1997. Since 
     significant research questions have not yet been answered, 
     the bureau is unlikely to have the information it will need 
     to announce a fully adequate integrated sampling and 
     estimation plan by then.
       Conducting Estimation for Small Areas and Groups. The 
     research yet to be completed includes work on two issues 
     related to the accuracy of the ICM survey. First, ICM 
     estimates have higher error rates for small geographic areas. 
     The survey is intended to increase accuracy by significantly 
     reducing the differential undercount. Although the ICM survey 
     does introduce error, for larger geographic areas it improves 
     the data quality greatly. However, in its current design, the 
     survey introduces increasingly error-prone estimates for 
     small localities and in particular for block-level data.
       Second, the assumption that members of demographic 
     subgroups share a probability of being missed in the census, 
     called the homogeneity assumption, limits the accuracy of the 
     estimates. The ICM survey estimates a person's chances of 
     being undercounted based on only a few characteristics. In 
     reality, a person may be missed for many diverse reasons. 
     Therefore, the survey offers only an approximation of who is 
     undercounted. The bureau examined several techniques for 
     addressing this problem. Only one showed promise, and it has 
     serious unresolved mathematical questions. Therefore, the 
     bureau will be forced to address this important issue with a 
     tool that may not be fully evaluated and tested before 
     implementation.
       Applying Estimation to Blocks. The bureau is reconsidering 
     its initial plan for applying all estimates to individual 
     census blocks. The bureau intended to produce all population 
     estimates in the form of households, making enumerated and 
     estimated households indistinguishable. This approach was 
     designed to address data user concerns about the 1990 PES 
     method, which added an additional ``group quarter'' to each 
     census block to hold all persons estimated as undercounted. 
     This new approach raises fundamental questions about how 
     results will be formatted for the data file and provided to 
     all data users. Because of difficulties in applying the new 
     technique, the bureau is considering reusing the 1990 method.
       Implementing the One-Number Census. To deliver a one-number 
     census that is accurate and credible requires not only 
     mathematically proven sampling and estimation methodologies, 
     but also highly reliable, robust, and confidentiality-assured 
     software programs. Software of this caliber requires a 
     controlled development approach and rigorous testing and 
     retesting. Before the software development begins, decennial 
     census statisticians should produce numerous sampling and 
     estimation requirements specifications, or detailed sets of 
     rules to implement the intended methodology, which can guide 
     software developers. These specifications address selecting 
     households for many applications ranging from receiving a 
     long form to being included in the ICM survey. However, since 
     many design decisions will not be made until December 1997, 
     and the dress rehearsal begins in March 1998, the period 
     available for specification preparation and subsequent 
     software development is extremely limited.
       In fact, even the long form sampling specifications, which 
     are not based on a new technique, are almost a month late. 
     Bureau officials plan to address delays in sampling and 
     estimation specifications by having knowledgeable staff begin 
     programming before the specifications are completed and 
     formally delivered. They will then make software adjustments 
     in an iterative manner as the dress rehearsal progresses. In 
     a recent inspection of the decennial census software 
     development area, we found that (1) software is not being 
     developed in accordance with any well-defined process, (2) 
     estimates of software development schedules and resources are 
     not realistic for the dress rehearsal or the census, and (3) 
     requirements for headquarters processing are immature, 
     volatile, and likely to be late.\3\ These findings call into 
     question the bureau's ability to develop and implement 
     complete, accurate software for the census.
       Bureau managers acknowledged the deficiencies and are 
     taking steps to address them. For example, they have 
     contracted with a recognized software expert to recommend 
     improvements to the software development and testing process 
     that will assist in achieving decennial census goals. 
     However, there is not enough time to make significant changes 
     before the dress rehearsal software development effort 
     begins.


                               footnotes

     \1\ Inadequate Design and Decision-Making Process Could Place 
     2000 Decennial at Risk (OSE-7329-6-0001, November 1995).
     \2\ 2000 Decennial Census: Expanded Targeted Questionnaire 
     Program May Be Unnecessary and Counterproductive (ESD-9610-7-
     0001, September 1997).
     \3\ Headquarters Information Processing Systems for the 2000 
     Decennial Census Require Technical and Management Plans and 
     Procedures (OSE-10034-8-0001, November 1997).

  Mr. MILLER of Florida. Madam Speaker, as the Chairman of the 
Subcommittee on the Census and a member of both the Committee on 
Appropriations and the Committee on the Budget, I have to stop and 
scratch my head. Let me get this straight. This administration has 
unilaterally designed the largest statistical experiment in history. 
Their own Inspector General raises serious concerns about whether it 
will work. The majority of Congress disapproves of the plan. Yet, the 
administration is moving full steam ahead with their theory. They 
continue to stonewall the Congress.
  On November 26, 1997, President Clinton signed the Commerce, State, 
Justice Appropriations bill. The law states, ``that funds appropriated 
under this Act shall be used by the Bureau of the Census to plan, test, 
and become prepared to implement the 2000 decennial census without 
using statistical methods which will result in the percentage of the 
total population enumerated being as close to 100 percent as 
possible.''
  That legislation was signed last November. Secretary Daley testified 
last week before the Subcommittee on Commerce, Justice, State, and 
Judiciary, chaired by the gentleman from Kentucky (Mr. Rogers), and the 
Chairman asked a simple question, ``Do you have an enumeration plan in 
place?'' And Secretary Daley replied, ``If you are asking for a 
physical document, none is available.''
  Let me respond to Secretary Daley with the same words used by 
Chairman Rogers. Why not? We paid for the plan. We need cooperation, 
not stonewalling from this administration.
  The stonewalling continues. Congress, in the exercise of its 
responsibility for oversight, has been repeatedly thwarted by the lack 
of timely and complete responses to requests for information from our 
oversight subcommittees. Last year, Congress had to pass legislation to 
force the administration to give us a status report on their plan. Then 
the report was full of mistakes and had to be resubmitted.
  As recently as last week, the Commerce Department took the position 
that the Subcommittee on the Census staff should not be allowed to 
interview Bureau employees. They are deemed to be the best source of 
oversight information. The National Academy of Sciences is allowed to 
talk to them. The General Accounting Office is allowed to talk to 
them, but not the Congress, not the elected representatives of the 
people, not the branch of government directed by the Constitution to 
carry out the census.
  Our ranking member of the subcommittee maintains that ``the planning 
process for the next Census has been the most open and inclusive ever 
and has been carried out in direct accord with the wishes of Congress. 
. . .'' Certainly the record has shown and continues to demonstrate 
that this is not true.
  Finally, Madam Speaker, I want to quickly change topics. There's a 
growing controversy out at the Census Bureau in Suitland, Maryland 
about a fence around the parking lot. It was put there because of 
repeated car thefts
and vandalism. Now, the junior Senator from Maryland is threatening to 
go out there and cut down the fence. Employees of the Census Bureau are 
busy trying to prepare for the 2000 Census. Is it too much to ask for 
them to have peace of mind that their cars will be protected from 
vandals while they are at work? I mean really. All they want is to keep 
their fence. Doesn't the junior Senator have more pressing issues to 
consider?

                          ____________________