[Senate Hearing 110-2]
[From the U.S. Government Printing Office]

                                                          S. Hrg. 110-2
                          DATA MINING PROGRAMS



                               before the

                       COMMITTEE ON THE JUDICIARY
                          UNITED STATES SENATE

                       ONE HUNDRED TENTH CONGRESS

                             FIRST SESSION


                            JANUARY 10, 2007


                           Serial No. J-110-1


         Printed for the use of the Committee on the Judiciary

33-226                      WASHINGTON : 2007
For Sale by the Superintendent of Documents, U.S. Government Printing Office
Internet: bookstore.gpo.gov  Phone: toll free (866) 512-1800; (202) 512-1800
Fax: (202) 512-2250 Mail: Stop SSOP, Washington, DC 20402-0001

                       COMMITTEE ON THE JUDICIARY

                  PATRICK J. LEAHY, Vermont, Chairman
EDWARD M. KENNEDY, Massachusetts     ARLEN SPECTER, Pennsylvania
JOSEPH R. BIDEN, Jr., Delaware       ORRIN G. HATCH, Utah
HERB KOHL, Wisconsin                 CHARLES E. GRASSLEY, Iowa
DIANNE FEINSTEIN, California         JON KYL, Arizona
CHARLES E. SCHUMER, New York         LINDSEY O. GRAHAM, South Carolina
RICHARD J. DURBIN, Illinois          JOHN CORNYN, Texas
BENJAMIN L. CARDIN, Maryland         SAM BROWNBACK, Kansas
            Bruce A. Cohen, Chief Counsel and Staff Director
      Michael O'Neill, Republican Chief Counsel and Staff Director

                            C O N T E N T S




Feingold, Hon. Russell D., a U.S. Senator from the State of
  Wisconsin
Kennedy, Hon. Edward M., a U.S. Senator from the State of
  Massachusetts, prepared statement
Leahy, Hon. Patrick J., a U.S. Senator from the State of Vermont
    prepared statement and attachment
Specter, Hon. Arlen, a U.S. Senator from the State of
  Pennsylvania


Barr, Robert, Chairman, Patriots to Restore Checks and Balances,
  Washington, D.C.
Carafano, James Jay, Heritage Foundation, Assistant Director,
  Kathryn and Shelby Cullom Davis Institute for International
  Studies, Senior Research Fellow, Douglas and Sarah Allison
  Center for Foreign Policy Studies, Washington, D.C.
Harper, Jim, Director of Information Policy Studies, CATO
  Institute, Washington, D.C.
Harris, Leslie, Executive Director, Center for Democracy and
  Technology, Washington, D.C.
Taipale, Kim A., Founder and Executive Director, Center for
  Advanced Studies in Science and Technology Policy, New York,
  New York

                         QUESTIONS AND ANSWERS

Responses of Robert Barr to questions submitted by Senator
  Kennedy
Responses of James Jay Carafano to questions submitted by Senator
  Specter
Responses of Jim Harper to questions submitted by Senators Leahy,
  Kennedy, and Specter
Responses of Leslie Harris to questions submitted by Senators
  Leahy, Kennedy, and Specter
Responses of Kim Taipale to questions submitted by Senator
  Specter

                       SUBMISSIONS FOR THE RECORD

Barr, Robert, Chairman, Patriots to Restore Checks and Balances,
  Washington, D.C., prepared statement and attachment
Carafano, James Jay, Heritage Foundation, Assistant Director,
  Kathryn and Shelby Cullom Davis Institute for International
  Studies, Senior Research Fellow, Douglas and Sarah Allison
  Center for Foreign Policy Studies, Washington, D.C., prepared
  statement
Harper, Jim, Director of Information Policy Studies, CATO
  Institute, Washington, D.C., prepared statement and attachment
Harris, Leslie, Executive Director, Center for Democracy and
  Technology, Washington, D.C., prepared statement
Hertling, Richard A., Acting Assistant Attorney General,
  Department of Justice, Washington, D.C., letter and attachment
McClatchy Newspapers, Greg Gordon, article
Taipale, Kim A., Founder and Executive Director, Center for
  Advanced Studies in Science and Technology Policy, New York,
  New York, prepared statement
Washington Post, Spencer S. Hsu and Ellen Nakashima, article


Submissions for the record not printed due to voluminous nature,
  previously printed by an agency of the Federal Government, or
  other criteria determined by the Committee, list

                          DATA MINING PROGRAMS


                      WEDNESDAY, JANUARY 10, 2007

                              United States Senate,
                                Committee on the Judiciary,
                                                     Washington, DC
    The Committee met, pursuant to notice, at 9:31 a.m., in 
room 226, Dirksen Senate Office Building, Hon. Patrick J. Leahy 
(chairman of the committee) presiding.
    Also present: Senators Specter, Feingold, and Whitehouse.

  OPENING STATEMENT OF HON. PATRICK J. LEAHY, A U.S. SENATOR FROM
                      THE STATE OF VERMONT

    Chairman Leahy. The Judiciary Committee will be in order.
    Today the Senate Judiciary Committee holds an important 
hearing on the privacy implications of government data mining 
programs. This committee has a special stewardship role in 
protecting our most cherished rights and liberties as 
Americans, including the right of privacy.
    Today's hearing on government data mining programs is our 
first in the new Congress. This hearing is also the first of 
what I plan to be a series of hearings on privacy-related 
issues throughout this Congress.
    The Bush administration has dramatically increased its use 
of data mining technology, namely the collection and monitoring 
of large volumes of sensitive personal data to identify 
patterns or relationships.
    Indeed, in recent years the Federal Government's use of 
data mining technology has exploded, without congressional 
oversight or comprehensive privacy safeguards.
    According to a May 2004 report by the Government 
Accountability Office, at least 52 different Federal agencies 
are currently using data mining technology. There are at least 
199 different government data mining programs.
    Think about that just for a moment. One hundred and ninety-
nine different programs that are operating or are planned 
throughout the Federal Government. Of course, advances in 
technology make data mining and data banks far more powerful 
than ever before.
    Now, these can be valuable tools in our national security 
arsenal, but I think the Congress has a duty to ensure that 
there are proper safeguards so they can be most effective.
    One of the most common and controversial uses of this 
technology is to predict who among our 300 million Americans 
are likely to be involved in terrorist activities.
    According to GAO and a recent study by the CATO Institute, 
there are at least 14 different government data mining programs 
within the Departments of Defense, Justice, Homeland Security, 
and Health. That figure does not include the NSA's programs.
    I think Congress is overdue in taking stock of the 
proliferation of these databases that are increasingly 
collecting information on Americans.
    Now, they are billed, of course, as counterterrorism tools, 
but you wonder why there have to be so many, in so many 
different departments. But the overwhelming majority of them 
use, collect, and analyze personal information about ordinary 
American citizens.
    We have just learned through the media that the Bush 
administration has used data mining technology secretly to 
compile files on the travel habits of millions of law-abiding 
Americans.
    Incredibly, through the Department of Homeland Security's 
Automated Targeting System program, ATS, our government has 
been collecting information on Americans, just average 
    They then share this sensitive, personal information with 
foreign governments. They are shared with private employers. 
There is only one group they will not share it with: the 
American citizens they collected it on.
    So if there is a mistake in there and you suddenly find you 
cannot get into another country, or a mistake in there and you 
find you do not get a promotion in your job because your 
employer has it, you never know why and you never even know 
what the mistake was.
    Following years of denial, the Transportation Security 
Administration, TSA, has finally admitted that its 
controversial Secure Flight data mining program, which collects 
and analyzes airline passenger data obtained from commercial 
data brokers, violated Federal privacy laws by failing to give 
notice to U.S. air travelers that their personal data was being 
collected for government use. I think you find out why they 
denied they were doing it: because they were breaking the law 
in doing it.
    Last month, the Washington Post reported that the 
Department of Justice will expand its OneDOJ program, a 
massive database that would allow State and local law 
enforcement officials to review and search millions of 
sensitive criminal files, following the FBI, DEA, and other 
Federal law enforcement agencies.
    That means sensitive information about thousands of 
individuals, including thousands who have never been charged 
with a crime, will be available to your local law enforcement 
agencies no matter what their own system of protection of that 
data might be.
    So you have to have proper safeguards and oversight of 
these, and other, government data programs, otherwise the 
American people do not have the assurance that these massive 
databases are going to make them safer, nor the confidence 
their privacy rights will be protected.
    And, of course, there are some very legitimate questions 
about whether these data mining programs actually do make us 
safer. It becomes almost humorous. Some of the consequences, I 
have talked about before.
    Senator Kennedy has been stopped 10 times going on a plane, 
a flight he has been taking for 40 years back to Boston, 
because somehow his name got, by mistake, on one of these 
    We had a 1-year-old child who was stopped because their 
name was on as a terrorist. The parents had to go and get a 
passport to prove this 1-year-old was not really a 44-year-old 
    So the CATO Institute study found that data mining is not 
an effective tool for predicting or combatting terrorism 
because of the high risk of false positive results.
    We need look no further than the government's own terrorist 
watch list, which now contains the names of more than 300,000 
individuals, including, as I said, Members of Congress, 
infants, and Catholic nuns, to understand the inefficiencies 
that can result in data mining and government dragnets.
    So let us find out how we can make ourselves safer, but not 
make ourselves the object of a mistake and ruin our lives that 
    I am joined today by Senator Feingold, Senator Sununu, and 
others in a bipartisan attempt to provide congressional 
oversight. We are reintroducing the Federal Agency Data Mining 
Reporting Act, which we have supported since 2003. It would 
require Federal agencies to report to Congress about their data 
mining programs.
    We in Congress have to make sure that our government uses 
technology to detect and deter illegal activity, but do it in a 
way that protects our basic rights.
    I also might say, on a personal note, I want to thank 
Chairman Specter for scheduling this hearing at my request. At 
the beginning of every Congress we have to do various 
reorganizational things, and I understand this is to be 
completed today or early tomorrow, and allowing me to be 
Chairman, even though I am not, technically, yet.
    So, Chairman Specter, it is up to you. You do whatever you 
want to do.

  STATEMENT OF HON. ARLEN SPECTER, A U.S. SENATOR FROM THE STATE
                        OF PENNSYLVANIA

    Senator Specter. Well, thank you very much. I hope you will 
not mind if I address you as ``Mr. Chairman'', Mr. Chairman.
    Chairman Leahy. I can put up with it.
    Senator Specter. The 109th Congress was very productive for 
the Judiciary Committee because of the close cooperation which 
Senator Leahy and I have had, which goes back to a period 
before we were Senators.
    The National District Attorneys' Conference was held in 
1970 in Philadelphia when I was District Attorney, and District 
Attorney Leahy from Burlington, Vermont attended. We formed a 
partnership which has lasted and withstood partisan pressures 
in Washington, DC.
    When Chairman Leahy refers to my scheduling of a hearing at 
his request, I think there were a number of hearings which were 
at Senator Leahy's request when he was only Senator Leahy and 
not Chairman Leahy. We had a very close, coordinated 
relationship and I am sure that will continue.
    Senator Harkin and I have passed the gavel for many years 
in the Subcommittee on Appropriations, and we call it a 
seamless transfer. This is our first transfer of the gavel 
between Chairman Leahy and myself, and I am looking forward to 
a seamless operation.
    In fact, Senator Leahy and I coordinated with the 
introduction of the Personal Data Privacy and Security Act of 
2005, which we reported out of committee and have coordinated 
with the Commerce Committee, which dealt with identity theft 
significantly, but also with data mining.
    There are some very important issues which are raised in 
the collation of all this material. The presence of the 
material in so many contexts led the Supreme Court to observe, 
in the case of U.S. Department of Justice v. Reporters 
Committee for Freedom of the Press, that when information is 
located in so many spots, it is a matter of ``practical 
obscurity'', but when it is all brought together, it is a 
different matter.
    The committee focused on one aspect of this last year when 
we were looking at the telephone company responses to the 
government's request for collection of data. There may be very 
important law enforcement activities which utilized this data 
appropriately, but it is a balancing test of what kind of 
privacy was invaded, and what is the benefit for law 
enforcement, what is the benefit for society.
    I want to start my tenure as the non-chairman by observing 
the time limit, so I yield back a balance of 20 seconds. Thank 
you, Mr. Chairman.
    Chairman Leahy. I thank Chairman Specter. We have tried to 
work together. We have worked together ever since Senator 
Specter came here in 1986.
    Senator Specter. 1980.
    Chairman Leahy. 1980. I am sorry. Time goes by when you are 
having fun. And we did know each other as former prosecutors. 
We worked closely together. We have been on the Appropriations 
Committee together and worked together, and on this committee.
    I think we lowered the level of partisanship in this 
committee during the past 2 years, and I hope to continue that. 
I am hoping that we are going to reach a point where things can 
work the way the Senate should.
    I do note that Senator Feingold of Wisconsin is here. He 
is, as I mentioned, the lead sponsor on this bill. I would 
yield to Senator Feingold if he wished to say anything.

  STATEMENT OF HON. RUSSELL D. FEINGOLD, A U.S. SENATOR FROM THE
                       STATE OF WISCONSIN

    Senator Feingold. Thank you, Mr. Chairman, and thank the 
Ranking Member. It is a pleasure working with you in the 
different capacities, and I look forward to working with both 
of you again on this committee.
    Thanks for holding this hearing. It raises important policy 
questions about the capabilities of data mining technologies 
and the privacy and civil liberties implications for ordinary 
Americans if this type of technology were to be deployed. These 
are questions that Congress has to address.
    This hearing is a critical first step in the process of 
understanding, evaluating, and perhaps regulating this type of 
technology. Many Americans are understandably concerned about 
the specter of secret government programs analyzing vast 
quantities of public and private data about the every-day 
pursuits of mostly innocent people in search of patterns of 
suspicious activity.
    So let me start by reiterating a point that Senator Wyden 
and I made in a recent letter to Director of National 
Intelligence Negroponte. Obviously, protecting our national 
security secrets is essential and the intelligence community 
would not be doing its job if it did not take advantage of new 
    But when it comes to data mining, we must be able to have a 
public discussion, what one of our witnesses has called a 
national conversation, about its potential efficacy and privacy 
implications before our government deploys it domestically.
    We can have that public debate about these policy issues 
without revealing sensitive information that the government has 
developed. The witnesses here today have for years been 
debating a variety of issues related to data mining.
    It is time to get Congress and the executive branch into 
that discussion, not just in reaction to the latest news story, 
which has sort of been the position we have been in in the 
past, but in a proactive, thoughtful, and collaborative way.
    As I have said before, this hearing is an important first 
step. I hope that the next step will be the enactment of the 
Federal Agency Data Mining Reporting Act, which I am reintroducing 
today along with Senator Sununu, Senator Leahy, and others. I 
thank the Chairman for mentioning it, and for his excellent 
support of the bill.
    The bill requires Federal agencies to report on their 
development and use of data mining technologies to discover 
predictive or anomalous patterns indicating criminal or 
terrorist activity, the types of data analysis that raise the 
most serious privacy concerns. It would, of course, allow 
classified information to be provided to Congress separately 
under appropriate security measures.
    Along with this hearing, I hope these reports will help 
Congress, and to the degree appropriate the public, finally 
understand what is going on behind the closed doors of the 
executive branch so we can start to have the policy discussion 
about data mining that is long overdue. I would urge my 
colleagues to support the legislation.
    Mr. Chairman, I also want to note that last night I 
received a response from the Director of National Intelligence 
Negroponte to the letter Senator Wyden and I wrote to him 
regarding the Tangram Data Mining Program.
    In it, ODNI states that Tangram is a research project, and 
acknowledged that it has ``a real risk of failure.'' It also 
assured us that no Tangram tools would be deployed without 
consultation with the DNI's Civil Liberties and Privacy 
    I would just add that I would hope that Congress also would 
be consulted prior to any deployment of the Tangram data mining 
tool. So, I do thank you, Mr. Chairman, very much for the 
opportunity to make this opening statement.
    Chairman Leahy. Thank you.
    Would the panel please rise and raise your right hand?
    [Whereupon, the panel was duly sworn.]
    Chairman Leahy. Following our normal procedure--and I am 
sure you understand this, Mr. Harper--we have a former Member 
of Congress and we will recognize him first. Bob Barr 
represented the Seventh District of Georgia in the U.S. House 
of Representatives from 1995 to 2003. He was on the Judiciary 
Committee. He was Vice Chairman of the Government Reform 
Committee and a member of the Committee on Financial Services.
    He occupies the 21st Century Liberties Chair for Freedom 
and Privacy at the American Conservative Union; serves as a 
board member of the National Rifle Association; is chairman of 
Patriots to Restore Checks and Balances; provides advice to 
several organizations, including--this is interesting--
consulting on privacy issues with the ACLU, serving as a chair 
for youth leadership training at the Leadership Institute in 
Arlington, Virginia; and is a member of the Constitution 
Project's Initiative on Liberty and Security based at 
Georgetown University's Public Policy Institute.
    The Congressman served as a member of the Long-Term 
Strategy Project for Preserving Security and Democratic Norms 
in the War on Terrorism at the Kennedy School of Government at 
Harvard University from 2000 to 2005. A New York Times 
columnist and close personal friend of mine, Mr. Safire, has 
called him ``Mr. Privacy''.
    So with all that, Bob, go ahead.

  STATEMENT OF ROBERT BARR, CHAIRMAN, PATRIOTS TO RESTORE CHECKS
                 AND BALANCES, WASHINGTON, D.C.

    Mr. Barr. Thank you very much, Mr. Chairman. Let me add my 
personal congratulations to the many I know you have received 
since your ascendancy to the chairmanship.
    Let me also congratulate the fine work that Senator Specter 
has been involved in in laying the groundwork for the work that 
I know is coming this Congress with regard to the fundamental 
right to privacy and other civil liberties, particularly vis-a-
vis fighting against acts of terrorism.
    I very much appreciate both the former chairman and the 
current chairman inviting me today to this very important 
    I appreciate very much the attendance of at least two other 
Senators at this time whose presence here today obviously 
indicates a keen interest on their part in the issues before 
this committee, Senator Whitehouse and Senator Feingold, who 
has been a leader in the last Congress, and even before that.
    I very much appreciate the committee indicating, I think 
very clearly, to the American people and to your colleagues 
here in the Congress that the issue of privacy, particularly as 
it relates to government data mining and the secrecy 
surrounding that and the extent thereof, is a top A-1 priority. 
I think that sends a very important message.
    Of course, mindful of the committee's many 
responsibilities, I would ask that my prepared testimony be 
included in full in the record.
    Chairman Leahy. It will.
    [The prepared statement of Mr. Barr appears as a submission 
for the record.]
    Mr. Barr. What I would like to do, simply, in addition to 
that, is indicate to the committee, I think that a very 
appropriate starting point, or at least one of the starting 
points for the 110th Congress' long-term discussion of these 
issues, looking at and laying the groundwork for particular 
pieces of legislation, such as that which the committee has 
indicated will be introduced today.
    I think it is important also to focus on some fundamental 
questions which have given rise because of the extensive secret 
data mining by the government and by private industry in 
conjunction with the government to a culture of suspicion in 
our society.
    Perhaps one of the most fundamental issues, the most 
fundamental questions that really needs to be addressed, is who 
owns all of this data, this private data, this private, 
personal information that is the subject of all of this data 
    The extent of the data mining, Mr. Chairman, you indicated 
is the tip of the iceberg. There have been recent disclosures 
that there are at least some 200 different data mining systems 
in the government.
    You can hardly pick up the paper any day or watch the news 
any day and yet not walk away with new revelations about new 
data mining, whatever agency of the government it is, not just 
the Department of Justice, the Department of Defense, CDC, HUD, 
Homeland Security, Social Security Administration, IRS, SBA. 
They all seem to be enamored of, and have this blind interest 
and faith, in data mining.
    The problem is, there has never been a comprehensive look 
at who owns this data. The fact that over the last several 
years the administration has been treating that data as its 
own--that is, information on private citizens--begins us down 
that slippery slope.
    That slippery slope, we are all aware now, leads not only 
to secret data mining, which includes very personal data on 
American citizens and others in this country who have rights 
equal to those of our citizens under the Bill of Rights, First 
Amendment, Second Amendment, Fourth Amendment, and Fifth 
Amendment, being maintained in these government databases with 
no knowledge thereof, with no way to correct errors or improper 
    But it also leads us down that slippery slope to where we 
now see this administration, and that is viewing private mail 
that Americans and others have sent through the U.S. Postal 
    If, in fact, the government can continue to believe or view 
this data that is the subject of data mining as its own, that 
it owns it, then everything else that it wants to do follows 
from that false premise.
    Certainly, they can read people's mail, they can read 
people's e-mails. I think that is really a fundamental question 
that the committee must look at. There are others on which I 
would be glad to provide whatever information I have in terms 
of questions and follow-up.
    But I really do think there are fundamental issues 
regarding the ownership of that data and the extent to which 
the government already, and should be, engaged in that that 
provide more than fertile ground for this committee to look 
    Chairman Leahy. Thank you, Congressman. In fact, those will 
be among the questions that will be asked of the Attorney 
General when he comes here next week, the mail opening one. 
More and more, we hear about these things only because we read 
about it in the press, and this creates a strong concern for 
    Jim Harper is the Director of Information Policy Studies at 
the CATO Institute. As Director of Information Policy Studies, 
he focuses on the difficult problem of adapting law and policy 
to the unique situation of the information age. He is a member 
of the Department of Homeland Security's Data Privacy and 
Integrity Advisory Committee.
    His work has been cited by USA Today, Associated Press, and 
Reuters. He has appeared on Fox News Channel, CBS, and MSNBC, 
and other media. His scholarly articles appear in the 
Administrative Law Review, the Minnesota Law Review, and the 
Hastings Constitutional Law Quarterly.
    He wrote the book, Identity Crisis: How Identification Is 
Overused and Misunderstood. He is the editor of Privacilla.org, 
a web-based think tank devoted exclusively to privacy. He 
maintains the online Federal spending resource, 
washingtonwatch.com. He holds a J.D. from Hastings College of 
    Mr. Harper, it is yours. Again, I apologize. We have to ask 
you to keep the statement brief--your whole statement will be 
part of the record--because we want to ask questions.
    I should also note that Senator Whitehouse of Rhode Island 
has joined us here, not only today for the hearing, but Senator 
Whitehouse is a former attorney general. I had asked him, 
before he knew all the work that goes on in this committee, if 
he would join the committee. In a moment of weakness, he said 
yes. Senator, I am glad to have you here.
    Senator Whitehouse. I am glad to be with you, Mr. Chairman. 
Delighted to be with the Ranking Member. And it was no moment 
of weakness.
    Chairman Leahy. Thank you.
    Mr. Harper?


    Mr. Harper. Thank you, Mr. Chairman.
    If I can briefly start with a personal note that extends my 
biography just a little bit, my first job here on Capitol Hill 
was working for Senator Biden during the period when he was 
Chairman of this committee. I was an intern at the time.
    It inspired my legal career, including my focus on 
constitutional law. My first paid job when I returned to the 
Hill after that was with Senator Hatch as a legal fellow on 
this committee. So I really appreciate being here before you.
    Chairman Leahy. You covered both sides of the aisle very 
well.
    Mr. Harper. In the spirit of bipartisanship. This committee 
has influenced my life and career a great deal and I hope that, 
in a small way, I will be able to influence you today.
    The questions about data mining are complicated. Questions 
about privacy are complicated. When you combine the two, you 
have a very complex set of issues to deal with.
    So we will obviously start to sort them out, but I think 
the conversation that you are starting with this hearing and 
with the oversight you intend to do this year in this Congress 
is very important.
    My resort is to a document that we produced in the 
Department of Homeland Security Data Privacy Committee, where 
we created a structure, a framework for thinking about problems 
like this.
    The first step in that framework is to ask how a program or 
technology serves a homeland security purpose. What risk does 
it address and how well does it address that risk? Once you 
determine that, you can make decisions about privacy and decide 
whether you want to use this technology, and how you want to 
use it.
    I think in the area of data mining we have not gotten past 
that step yet. What is the theoretical explanation for how data 
mining can catch terrorists, is the major question that is 
before us.
    The positive case for the use of data mining in this 
particular area has not yet been made, so I suppose that my 
colleague, Jeff Jonas, and I laid down something of a marker 
when we issued our paper on the dis-utility of data mining for 
the purpose of finding terrorists.
    We argue that what we call ``predictive data mining'', that 
is, finding a pattern in data and then seeking that pattern 
again in data sets, predictive data mining, cannot catch 
    Data mining can give a lift. There are many good uses to 
data mining. It can give a lift to researchers, their study of 
people, of scientific phenomena. But with the absence of 
terrorism patterns on which to develop a model, you're going to 
have a very hard time finding terrorists in data.
    The result will be that you will get a lot of false 
positives. That is, you will find that many people who are not 
terrorists are suspects. You will waste a lot of resources 
going after these people. You will follow a lot of dead ends. 
And, very importantly, you will threaten the privacy and civil 
liberties of innocent, law-abiding Americans.
    Now, I personally think that this applies equally well to 
developing patterns to search for through red-teaming and in 
searching for anomalies, though this was not the subject of our 
    I think it is important to recognize this is not an 
indictment of data mining in toto. There are many data mining 
programs that may not even use personal information.
    There are data mining programs that use personal 
information that may successfully ferret out fraud, for 
example, in health care payments or areas like that, so it is 
important to be clear about where data mining does not work and 
where it certainly may work.
    I think the proponents of data mining need to make that 
affirmative case. It is not enough to attack nominal opponents 
of data mining. The affirmative case, again, has to be made.
    You on this committee should be able to say to yourselves, 
oh, yes, I get it. I understand how data mining works. Then the 
country will be ready to accept data mining as a law 
enforcement or national security tool.
    Once the benefits of data mining are understood and clear, 
then you can consider the privacy and other costs. Certainly 
there are dollar costs, as there are with any program, and a 
lot of dollars are going into data mining at this point.
    But the privacy costs, which I have articulated, or 
attempted to articulate, in my paper include the lack of 
control that people have over personal information about 
themselves, the questions of fairness, of liberty, and data 
    In this committee, we have referred to some of these things 
as due process, or the Fourth Amendment right to be free of 
unreasonable search and seizure, and equal protection. So the 
thing that I think we need, and the thing that I think we are 
seeing in the bill that is being introduced today--and I am 
quite happy about that--is transparency.
    Transparency should be seen as an opportunity for the 
proponents of data mining to make their case, to make the 
affirmative case for data mining. We need to see how it works, 
where it is being used, what data is being used, what assures 
that the data is of high quality, and so on and so forth.
    You will run into the problem of secrecy, that is, secrecy 
being put forward as a reason why not to share this information 
with you, why not to explain data mining to you. But I think 
you will have to address that at the right point, and I hope 
you will.
    Thanks very much for the opportunity to present to you 
    Chairman Leahy. Thank you, Mr. Harper.
    [The prepared statement of Mr. Harper appears as a 
submission for the record.]
    Chairman Leahy. Leslie Harris is the Executive Director for 
the Center for Democracy and Technology. She joined CDT in the 
fall of 2005, and became Executive Director at the beginning of 
2006. She brings over two decades of experience to CDT as a 
civil liberties lawyer, a lobbyist, and public policy 
    Her areas of expertise include free expression, privacy, 
and intellectual property. Prior to joining CDT, Ms. Harris was 
Founder and President of Leslie Harris & Associates, a public 
interest, public policy, and strategic services firm, 
representing both corporate and nonprofit clients before 
Congress and the executive branch on a broad range of Internet- 
and technology-related issues, including intellectual 
property, online privacy, telecommunications, and spectrum.
    During that time she was involved in the enactment of many 
landmark pieces of legislation, including the landmark e-rate 
amendment to the 1996 Telecommunications Act, the Children's 
Online Privacy Protection Act, and the 2002 Technology, 
Education, and Copyright Harmonization Act, or the TEACH Act, 
which updated copyright law for digital distance learning. I 
would note that Ms. Harris has appeared before this committee 
many times, and I appreciate that.
    Please go ahead.


    Ms. Harris. Thank you so much, Mr. Chairman. I appreciate 
the opportunity to be here. I want to applaud the Chairman, in 
particular, for making this data mining question, and privacy 
in general, a first order of business for this committee.
    From the perspective of CDT, we believe that information 
technology ought to be used to better share and analyze the 
oceans of information that the government has in the digital 
age, but both national security and civil liberties require 
that technology only be used when there is a demonstrable, 
effective impact, and then only within a framework of 
accountability, oversight, and most importantly, protection of 
individual rights.
    Data mining, in the abstract, is neither good nor bad, but 
as Jim Harper has pointed out, there is very little evidence of 
the effectiveness of at least the protective or patterned data 
mining. Yet, frankly, the executive branch is bewitched with 
this technology.
    Unless and until a particular data mining technology can be 
shown to be an effective tool for counterterrorism and appropriate 
safeguards are in place to protect the privacy and due process 
rights of Americans, Congress should simply not permit the 
executive branch to deploy pattern-based data mining tools for 
any terrorism purposes.
    Mr. Chairman, for some time you have sounded the alarm 
about how the legal context for data collection and analysis 
has been far outstripped by technology; at the very time that 
the legal standards for government access to data have been 
lowered and legal safeguards like the Privacy Act have been 
bypassed and the Fourth Amendment requirements for probable 
cause, particularity, and notice have been thrown into doubt, 
we are moving into this very sophisticated and troubling data 
mining era.
    The impact of this perfect storm of technological 
innovation, growing government power, and outdated legal 
protections is well illustrated by the revelation last month 
that the Automatic Targeting System, which is designed to 
screen cargo, is now being used to conduct risk assessments on 
individuals. Those risk assessments, as I read this Privacy Act 
notice, can be used for a wide variety of uses wholly unrelated 
to border security.
    There is much Congress can do. The first step, of course, 
is to pierce this veil of secrecy. We strongly endorse the 
legislation that you, Senators Feingold, Sununu, and others 
have introduced today. We need vigorous oversight. We need 
transparency. Ultimately, we need legislation. We cannot do any 
of that until we are able to get a handle on what is going on.
    We believe that Congress ought to go further and not permit 
any particular data mining applications to be deployed until 
there is a demonstration of effectiveness. We believe research 
should continue, but in terms of deploying these technologies, 
we do not even have to reach the privacy questions until we 
know whether or not they are working.
    While it is the job of the executive branch, in the first 
instance, to develop serious guidelines for the deployment of 
data mining for data sharing and analysis, we do not believe 
that job has been adequately done.
    If necessary, this body needs to impose those guidelines. 
There is much in the Markle recommendations and others to guide 
you in that regard.
    Finally, we have to get our arms around how commercial 
databases are being used for data mining. Those activities fall 
entirely outside of the Privacy Act and all other rules.
    Last year, Mr. Chairman, Mr. Specter, you introduced the 
Personal Data Privacy and Security Act. That bill included 
important provisions to ensure that government use of commercial 
databases for data mining was brought under the Privacy Act. We 
ought to enact that bill and we ought to enact some other 
protections as well.
    I appreciate the opportunity to testify, and am ready for 
your questions.
    Chairman Leahy. Thank you.
    [The prepared statement of Ms. Harris appears as a 
submission for the record.]
    Chairman Leahy. Our next witness is Kim Taipale. Now, have 
I pronounced it right?
    Mr. Taipale. Close enough.
    Chairman Leahy. How do you pronounce it?
    Mr. Taipale. Taipale.
    Chairman Leahy. Taipale. Mr. Taipale is the Founder and 
Executive Director of the Center for Advanced Studies in 
Science and Technology Policy. It is a private, nonpartisan 
research and advisory organization focused on information 
technology and global and national security policy.
    He is a Senior Fellow at the World Policy Institute, where 
he serves as Director of the Global Information Society 
Project, and the Program on Law Enforcement and National 
Security in the Information Age. He is an Adjunct Professor of 
Law at New York Law School, where he teaches cyber crime, cyber 
terrorism, and digital law enforcement.
    He serves on the Markle Task Force on National Security in 
the Information Age, the Science and Engineering for National 
Security Advisory Board of The Heritage Foundation, the Lexis-
Nexis Information Policy Forum, and the Steering Committee of 
the American Law Institute's Digital Information Privacy 
    Thank you for joining us here today.

                    NEW YORK CITY, NEW YORK

    Mr. Taipale. Thank you, Mr. Chairman. Mr. Chairman, Senator 
Specter, members of the committee, thank you for the 
opportunity to testify today on the implications of government 
data mining.
    Data mining technology has raised significant policy and 
privacy issues, and we have heard a lot of them today. I agree 
with all of those. But the discussion about data mining suffers 
from a lot of misunderstandings that have led to a presentation 
of a false dichotomy, that is, that there is a choice between 
security and privacy.
    My testimony today is founded on several beliefs. First, 
that privacy and security are not dichotomous rivals, but dual 
obligations that must be reconciled in a free society. Second, 
we face a future of more data and more powerful tools, and 
those tools will be widely available.
    Therefore, third, political strategies premised on 
outlawing particular technologies or techniques are doomed to 
failure and will result in little security and brittle privacy 
    Fourth, there is no silver bullet. Everybody is right here. 
Data mining technologies alone cannot provide security. 
However, if they are properly employed they can improve 
intelligence gain and they can help better allocate 
intelligence and security resources. If they are properly 
designed, I believe they can still do that while protecting 
    Before getting to my two main points, there are also some 
general policy principles that I think should govern the use of 
any of these technologies if they are implemented.
    First, they should be used only for investigative purposes. 
That is, as a predicate for further investigation, not for 
proof of guilt or to otherwise automatically trigger 
significant adverse consequences.
    Second, any programmatic implementations should be subject 
to strict oversight and review, both congressional and, to the 
extent appropriate, judicial review, consistent with existing 
notions of due process.
    Third, specific technology features and architectures 
should be developed that help enforce these policy rules, 
protect privacy, and ensure accountability. So let me just make 
two main points.
    The first, is a definitional problem. What is data mining? 
Data mining is widely misunderstood, but just defining it 
better is not the solution. If we are talking about some 
undirected massive computer searching through huge databases of 
every individual's private information and intimate secrets, 
and the result of a positive match is that you face a firing 
squad, I think we will all agree that we are opposed to that.
    If, on the other hand, we are talking about uncovering 
evidence of organizational links among unknown conspirators 
from within legally collected intelligence databases in order 
to focus additional analytical resources on those targets, I 
think we will all agree that we are for it. The question is, 
can we draw a line between those two?
    I doubt it if we start by focusing only on trying to define 
data mining. That is precisely the mistake that distracts us 
from the issues we should be focused on, some of which were 
actually raised in your opening statements. Drawing some false 
dichotomy between subject-based and pattern-based analysis is 
sophistry, both technical- and policy-wise.
    The privacy issue in a database society, or to put it the 
other way around, the reasonableness of government access to 
data or use of any particular data, can only be determined 
through a complex calculus that includes looking at the due 
process of a system, the relationships between the particular 
privacy intrusion and security gain, and the threat level. They 
simply cannot be judged in isolation.
    Even privacy concerns, themselves, are a function of scope, 
sensitivity of the data, and method: how much data, how 
sensitive is the data, and how specific is the query? But we 
really need to separate the access question and the decision-
making question--on either side--from the data mining question 
itself and the use of data mining tools.
    More importantly, even the privacy concerns cannot be 
considered away from due process. Due process is a function of 
predicate: alternatives, consequences, and error correction.
    A lot of predicate and you can tolerate severe consequences 
even in a free society, but even ambiguous predicate may be all 
right if there are minor consequences and there is robust error 
correction and oversight.
    While we are on predicate we should note that there is no 
blanket prohibition against probabilistic predicates, such as 
using predicate patterns. We do it all the time. Nor is there a 
requirement for non-individualized suspicion, such as using 
pattern mining.
    My point is not that there are no privacy concerns, only 
that focusing only on data mining, however you define it, is 
not terribly useful. It really needs to be looked at more 
broadly. It is basically the computational automation of the 
intelligence function as a productivity tool that, when 
properly employed, can increase human analytical capacity and 
make better use of limited security resources.
    My second and final point, is that you cannot look at data 
mining in this context through the ``it won't work'' lens and 
simply dismiss potential. First, the popular arguments about 
why it will not work for counterterrorism are simply wrong.
    As I explain in my written testimony, the commercial 
analogy is irrelevant, the training set problem is a red 
herring, and the false positive problem can be significantly 
reduced by using appropriate architectures. In any case, it is 
not unique to data mining. It is fundamental to the 
intelligence function. The intelligence function deals with 
uncertainties and ambiguities.
    Second, you cannot burden technology development with 
proving efficacy before the fact. We need R&D and we need real-
world implementations and experience, done correctly with 
oversight, so we can correct errors.
    Third, you cannot require perfection. To paraphrase 
Voltaire, the perfect ought to not be the enemy of the better.
    Finally, you need to bear in mind that any human and 
technological process will fail under some conditions. Some 
innocent people will be burdened in any preemptive approach to 
terrorism and, unfortunately, some bad guys will get through. 
That is reality.
    The question is, can we use these data mining tools and 
improve intelligence analysis and help better allocate security 
resources on the basis of risk and threat management?
    I think we can, and still protect privacy, but only if 
policy and system designers take the potential for errors into 
account during development and control for them in deployment.
    Chairman Leahy. Thank you.
    [The prepared statement of Mr. Taipale appears as a 
submission for the record.]
    Chairman Leahy. I would note that a number of the Senators 
have expressed a great deal of interest in this subject, both 
on the Republican side and the Democratic side. They are not 
here this morning simply because we have several major 
committees meeting at the same time.
    One of the problems with the Senate is you cannot be in 
more than one place at a time. Senator Feingold, for example, 
is at the Foreign Relations Committee, and several other 
Senators have mentioned they wanted to be here.
    Dr. Carafano, our next witness, is the Assistant Director 
for the Kathryn and Shelby Cullom Davis Institute for 
International Studies. He is a Senior Research Fellow at the 
Douglas and Sarah Allison Center for Foreign Policy Studies. 
Dr. Carafano is one of The Heritage Foundation's leading 
scholars on defense affairs, military operations and strategy, 
and homeland security.
    His research focuses on developing the national security 
that the Nation needs to secure the long-term interests of the 
United States, realizing as we all do that terrorism is going 
to face us for the rest of our lifetimes, and how you protect 
our citizens and provide for economic growth and preserve civil 
    He is an accomplished historian and teacher. He was an 
Assistant Professor at the U.S. Military Academy at West Point, 
served as Director of Military Studies at the Army's Center of 
Military History, taught at Mt. Saint Mary College in New York, 
served as a Fleet Professor at the U.S. Naval War College. He 
is a Visiting Professor at the National Defense University and 
Georgetown University.
    I do not want anybody to think that we have this large 
proliferation of people connected with Georgetown just because 
I went to Georgetown Law School; it is purely coincidence.
    Dr. Carafano, go ahead.


    Mr. Carafano. Thank you, Mr. Chairman. I also got my Ph.D. 
from Georgetown.
    I have submitted my statement for the record.
    Mr. Carafano. I would like to do three things, very 
quickly: place the issue in context, state what I really think 
the problem is, and then argue why it is really essential that 
Congress address the issue and solve it.
    First of all, I come at this not as a lawyer, because I am 
not a lawyer, but as an historian and strategist. One of the 
fundamentals of good, long war strategy for competing well over 
the long term is that you have to have security and the 
preservation of civil liberties, as well as maintaining civil 
    It is not a question of balance. You simply have to do both 
over the long term. I think there is no issue or no security 
tool in which this issue is more important than the one we are 
discussing today.
    The problem is simply this. In the good old days when we 
were kids, technology evolved fairly slowly and policy could 
always keep up. We could look, we could observe, we could 
correct--trial and error.
    But the fact is, today technologies evolve far more quickly 
than policies can be developed. Information proliferates, 
capabilities proliferate, and if the technology evolution has 
to stop for the policy to catch up, it is never going to 
    In fact, it will not stop. You cannot stop it. So what you 
have to do is take a principled approach. You have to have a 
set of fundamental principles at the front end as guidelines to 
guide the development and implementation of the technology.
    Among these, we have argued--some Kim already mentioned--
are a clear definition of what data mining really is, 
addressing the requirements for efficacy, addressing the 
requirements for the protections, putting in appropriate checks 
and balances, and most importantly and often forgotten, is 
addressing the issue of the requirement for human capital and 
programming investments to actually implement these programs 
    The third point that I will make very quickly, is why is 
this really so important? There are really two aspects to that. 
The first, is we do not have infinite resources. What we need 
to do is focus our information and intelligence and law 
enforcement resources where they are going to do the most good.
    And while it is absolutely important that any system 
protect the rights of everyone, we should also have systems 
that inconvenience as few people as possible. That is part of 
keeping a free, open, and healthy civil society. So we should 
be looking for systems which are directing on us on where we 
most live.
    I would argue, for example, that programs like the 
Container Security Initiative and the Automated Targeting 
System--which, by the way, I think you could argue are not data 
mining systems--are good examples of where we try to focus 
scarce resources on things that might be problematic. Contrast 
that, for example, with the bill passed yesterday in the House, 
which argues that we should strip-search every container and 
package that comes into the United States (where you look at 
everything), or the lines that we have at TSA, which look at 
grandmothers and people coming through absolutely equally.
    So we want systems that are going to focus our assets, 
where we inconvenience the least amount of citizens, friends, 
and allies of the United States, and we want to use our law 
enforcement efforts to best effect.
    If we can create reporting requirements and a set of 
principles at the front end that guide the administration in 
doing that and adapting these new technologies, I think it will 
be time well spent by the Congress.
    Thank you, Mr. Chairman.
    [The prepared statement of Mr. Carafano appears as a 
submission for the record.]
    Chairman Leahy. Thank you. I am going to come back to this 
question of which things work best, because we are talking 
about millions of dollars--perhaps billions of dollars--being 
spent. I worry about a shotgun approach as compared to a rifle 
approach where you might actually pick what works.
    When I see 90-year-old people in walkers take their shoes 
off to go onto an airplane and then not physically able to even 
put the shoes back on, I am curious just what happens.
    I have been worried about the lack of privacy safeguards. 
In early 2003, I wrote to former Attorney General Ashcroft to 
inquire about the data mining operations, practices, and 
policies within the Department of Justice.
    I would ask that a copy of my January 10, 2003 letter be 
made a part of the record. I would love to be able to put a 
response in the record too, but of course I never got one.
    In 2003, I joined Senator Wyden in a bipartisan coalition 
of Senators in offering an amendment to the omnibus 
appropriations bill that ended the funding for the 
controversial TIA, Total Information Awareness, program because 
there were no safeguards.
    In April of that year I joined with Senator Feingold in 
introducing the Federal Data Mining Reporting Act, which 
required all Federal agencies to report back to Congress on 
their data mining programs in connection with terrorism and law 
enforcement efforts, and a version of our measure was put on 
the Department of Homeland Security appropriations bill.
    But basically the administration has ignored a lot of the 
bans that Congress, in a bipartisan way, has put on these 
things. Just last month, Representative Martin Sabo, one of the 
leaders in enacting the legal prohibition on developing and 
testing data mining programs, told the Washington Post that the 
law clearly prohibits the testing or development of the 
Department of Homeland Security's ATS data mining program, even 
though that has been used for years to secretly assign so-
called terror scores to law-abiding Americans, I suppose that 
90-year-old person in the walker. I will put the Washington 
Post article in as part of the record.
    All I want is the administration to follow the law. They 
want us to follow the law, they ought to follow the law and let 
us develop what is best. We all want to stop terrorists, but we 
do not want to make our own government treat us, all of us, 
like we are terrorists.
    So, Mr. Harper, I read your article on ``Effective 
Counterterrorism and the Limited Role of Predictive Data 
Mining'' with a great deal of interest because data mining 
becomes more and more a tool to detect terrorist threats.
    In May of 2004, 2 years ago, GAO reported that there were 
at least 14 different government data mining programs in 
existence today. That was back then.
    Now, I favor the use of data mining technology if there are 
safeguards, but we are talking about millions of dollars--
probably billions of dollars by now--in data mining technology 
in order to predict future terrorist threats. I worry about the 
huge amount of stuff coming in that does not do a darned thing.
    Are you aware of any scientific evidence or empirical data 
that shows the government data mining programs are an effective 
tool in predicting future terrorist activity or identifying 
potential terrorists?
    Mr. Harper. I am not aware of any scientific evidence, of 
any studies. Unfortunately, the discussion tends to happen in 
terms of bomb throwing or anecdote, where the ATS system, for 
example, has been defended based on one anecdote of someone who 
was turned away from the U.S. border based on ATS and ended up 
being a bomber in Iraq.
    Now, I recently spoke with a reporter who is apparently 
investigating that story, and it was not necessarily ATS 
signaling that this was a potential terrorist, but rather that 
it was a potential immigration over-stayer. So was that an 
example of the system working or was it not? That is just an 
anecdote. We would be much better off with scientific 
background that justifies this.
    Chairman Leahy. Do you not think we should have a 
scientific study to find out if we are going to spend millions, 
even billions, whether this thing actually works?
    Mr. Harper. Absolutely. I think, along with scientific 
study, allowing technologies like data mining to prove 
themselves in the private sector will give us much more than 
allowing government research to happen.
    Chairman Leahy. Dr. Carafano, are you aware of any 
empirical studies?
    Mr. Carafano. Well, I think, quite frankly, a review of the 
scientific literature does not give you a definitive answer of 
the ultimate potential of data mining technologies to predict 
behavior. But we should also realize, if you look at the state 
of behavioral science--
    Chairman Leahy. I am not asking about the potential that 
someday it may work. Are you aware of any empirical study that 
these millions of dollars--maybe billions of dollars--we are 
spending on all these systems seem to be proliferating? 
Everybody has got to have their own. Are you aware of 
scientific or empirical studies that say they work?
    Mr. Carafano. Senator, somebody would have to specifically 
describe to me the program, then we would have to have a 
discussion about whether it is actually a data mining program 
or not. I am not sure that all the systems that GAO qualifies as 
data mining, or ATS, which I do not believe is a data mining 
system. But the point is, behavioral science modeling is a 
rapidly developing field.
    The combination of computer technology and informatics and 
behavioral science is producing new advances every day, and so 
even if I gave you a definitive answer today that said I can 
guarantee you for a fact that data mining processes cannot 
predict terrorist behavior, that answer may be totally false 6 
months, a year, or 2 years from now. I cannot give you that 
    Chairman Leahy. Might we suggest there are some mistakes 
when Senator Kennedy and Congressman Lewis are told they cannot 
go on an airplane, or a pilot has to lose a lot of his income 
because he gets delayed every single time they go through, even 
though they know it is the wrong guy?
    Mr. Carafano. Yes, sir. But in all those systems you are 
doing one-to-one matches. They have got a data point and they 
are matching a person to that data point. Sometimes those data 
points are incorrect. That is not data mining.
    Chairman Leahy. I could follow up for a couple of hours on 
that one, but we will go back to it.
    Congressman Barr, in November of 2002, the New York Times 
reported that DARPA was developing a tracking system, which 
turned out to be Total Information Awareness.
    Privacy concerns were so abhorrent that a Republican-
controlled Congress cut the funding for it. But October 31st of 
last year, an article in the National Journal reported that the 
Office of the Director of National Intelligence is testing a 
new computerized system to search very large stores of personal 
information, including records of individuals' private 
communications, financial transactions, and everyday activities 
that looks very much like TIA.
    Are you concerned that a system shut down by the Congress 
is now reappearing under another form?
    Mr. Barr. Very concerned, both as a former Federal 
prosecutor, certainly as a former Member of this great 
institution on the House side, and as a citizen concerned about 
the rule of law.
    I think that allowing any administration--and this 
administration has shown itself to favor this, time and again--to 
do what it wants regardless of what Congress says, either 
through an appropriations rider or through specific 
legislation, it breeds contempt for the law, it breeds a lack 
of credibility that cuts across the board in reducing people's 
faith in government, and it leads to this further sort of 
cultural suspicion.
    I think it is extremely problematic and I believe that, so 
long as the Congress allows the administration to do this 
without either providing an overall architecture such as the 
Europeans did over a decade ago, and a number of other 
countries that have shown themselves much more willing than our 
government to establish a framework within which proper privacy 
protections can be employed and shall be employed, and yet not 
harm business at all--the Swiss are a perfect example of that--
until Congress addresses this issue, the administration is 
going to continue to do precisely what you put your finger on, 
Mr. Chairman, and that is essentially to thumb its nose at the 
Congress and do what it wants. They just call it something 
    Chairman Leahy. The concern I have, I mean, you fly on 
commercial flights, as I do, as most of us do. You have to 
assume that you have some kind of a terror index score 
somewhere. You have no way of finding out what that is. I have 
no way of finding out what that is.
    If you are a person working for a bank and you are up for 
vice president or head of one of the branches or something, and 
you are suddenly turned down because the bank has found this 
score, you have no way of knowing what it is, do you?
    Mr. Barr. This is the very pernicious nature of what is 
going on here. You have no way of knowing. You have no way of 
correcting it.
    The particular system that you referred to, Mr. Chairman, 
that has given rise to the absurd situation of the U.S. Senator 
and the U.S. Congressman being halted from boarding a plane 
because their name appears on some list, whether one considers 
that data mining technically or not, the fact of the matter is, 
it points out a major problem and a major shortcoming, a 
fundamental problem in the way we allow government to operate 
to do this without, as Jim correctly put his finger on, the 
transparency that at least provides some knowledge and 
protection for the citizen.
    Chairman Leahy. Thank you. I have further questions of Ms. 
Harris and others, but my time is virtually up. I will yield to 
Senator Specter, then we will go, by the early bird rule, to 
Senator Whitehouse.
    Senator Specter. Thank you very much, Mr. Chairman.
    Congressman Barr, was your privacy violated by the 
interview in Borat?
    Mr. Barr. In what?
    Senator Specter. Borat.
    Mr. Barr. I do not know. Was he an agent of the Federal 
Government or not? It is a very good question that ought to be 
proposed to him.
    Senator Specter. Was your privacy violated?
    Mr. Barr. I believe it was. Information was gathered at 
that interview under false pretenses.
    Senator Specter. It was an extraordinarily moving 
interview. Did you have any right to stop its showing or 
distribution because of the invasion of your privacy?
    Mr. Barr. There may be. I know that some legal actions by 
some other persons involved are being pursued. I elected not to 
pursue it, believing essentially that the more one wastes time 
or engages in those sorts of activities, the more publicity you 
bring to something.
    Senator Specter. I think that is a valid generalization. If 
somebody is a Member of Congress with that kind of a high-
profile position, you sort of have to take your lumps here and 
    Did you see the movie?
    Mr. Barr. I have not. I know folks that have. The movie 
that revels in nude male wrestling is not something that puts 
it high on my priority list to see.
    Senator Specter. Well, I think the record ought to be clear 
that you were not featured in any nude male wrestling.
    Mr. Barr. I was going to, but I appreciate the Ranking 
Member indicating that.
    Senator Specter. It was a sedate interview in your office 
somewhere and it was a most extraordinary movie. I do not want 
to hype it too much or get people to go to see it, but the 
interview with you was about the only part of the movie worth 
seeing, Congressman Barr.
    Mr. Barr. I will take that as a compliment, Senator.
    Senator Specter. Well, you should. You should. It is a 
    There has been a reference made to the situation where the 
Automated Targeting System has been credited with the exclusion 
of an airline passenger. Proponents of ATS point to an 
incident, purportedly, where ATS was used by the Customs and 
Border Patrol agent in Chicago's O'Hare Airport to refuse to 
allow a traveler arriving from Jordan to enter the United 
States, a man named Riyib Al-Bama, who had a Jordanian visa and 
a U.S. business visa when he attempted to enter the United 
States, and 18 months later he reputedly--it is always hard to 
find out the facts in these matters, but this is the report--
killed 125 Iraqis when he drove into a crowd and set off a 
massive car bomb.
    Ms. Harris, are you familiar with that reported incident?
    Ms. Harris. Well, I am familiar with the allegation. 
Obviously, there is no way for me to know. But let us assume 
for the sake of argument that that is true.
    Senator Specter. Well, now, wait a minute. I am asking you 
if you are familiar with it.
    Ms. Harris. Specifically with that case?
    Senator Specter. Yes.
    Ms. Harris. Yes. All I know is what I read. I mean, there 
is no way for me to know.
    Senator Specter. Well, that is about all any of us could 
    Ms. Harris. Right. All I know is what I read.
    Senator Specter. And when we go to top secret briefings, we 
walk out with the same conclusion.
    Ms. Harris. Exactly.
    Senator Specter. All we know is what we read in the 
    Ms. Harris. Right.
    Senator Specter. In your testimony, you state that unless 
and until a particular application can be shown to be an 
effective tool for counterterrorism, the government should not 
deploy pattern-based data mining as an anti-terrorism tool.
    Our hearing today is built on a very, very high level of 
    Ms. Harris. Right.
    Senator Specter. And later, if the Chairman has a second 
round, I want to come back to a question as to, for those who 
like data mining, what can you point to that it has produced? 
For those who do not like data mining, what can you point to 
where there has been an invasion of privacy which has been 
damaging? I would like to get specifics so we can have some 
basis to evaluate it.
    Because we sit here and listen to high-level 
generalizations. You talk about oversight. When you pursue 
oversight--and I am going to be interested in the pursuit of 
the Attorney General next week--it is a heavy line of pursuit 
and diligent prosecutors have a hard time catching up.
    But before my time goes too much further--
    Chairman Leahy. I should note that I was told that there 
was an error on the clock before. I thought I was within the 
time and I went over the time. So, please, take what time you 
need, then we will go to Senator Whitehouse.
    Senator Specter. Well, I will just finish up this one 
question, then yield.
    When you talk about proving it to be an effective tool for 
counter-terrorism, how do we make the determination as to what 
is an effective tool for counter-terrorism?
    Ms. Harris. Well, I think you have to get the facts. At the 
moment, Congress does not have the facts. It is not for me to 
say that a program is corrective because it works once or works 
ten times. At some point there has to be evaluation criteria, 
whether it is set in those agencies or Congress sets them.
    If the information on the effectiveness has to be secret 
and is shared only with Congress to make that determination, 
that is fine. But even if you assume that that program is 
effective, and I do so only for the sake of argument, there is 
nothing that exists in that program to protect the rights of 
the rest of the people, the innocent people.
    There is no way that a program like that is designed where 
we know, because of the level of secrecy, what the impact is. 
You ask the question, what is the impact? There is a potential 
that we may have caught one terrorist, and that would be a good 
thing. We also do not know what the impact is on the millions 
of other people who are in that system because they do not know 
that they are in that system, they have no way to know they are 
in that system.
    So there is no reason for us to deploy these systems and 
leave us in a situation where there is no due process and no 
fair information practices. I mean, there are two different 
questions: one, are they effective and should they deploy it at 
    The second is, if you are going to deploy them, why do we 
have to deploy them without the traditional procedural 
protections that this body has imposed, fair information 
practices, and the Privacy Act, in a variety of other contexts.
    So you have to look at both of them. I do not think you 
address the second, privacy, until you get to the first, 
efficacy. But if there is, in fact, a person out there in 
Senator Leahy's example who is trying to figure out why they 
were fired, that person has no way to know.
    Senator Specter. Well, you sort of lost me along the way.
    Ms. Harris. All I am saying is--
    Senator Specter. Wait a minute.
    Ms. Harris. Yes.
    Senator Specter. You sort of lost me along the way.
    Ms. Harris. All right.
    Senator Specter. Can you point out any specific instance 
where data mining has resulted in somebody's demonstrable 
    Ms. Harris. Well, of course. I mean, there is demonstrable 
prejudice. The only ones that we can see visibly at this point 
are people being searched or people being kept off the plane.
    But you have a privacy notice that specifically said, we 
will share this for any other purpose with the rest of the 
government, down to the local level. So people are walking 
around with a risk assessment that they do not know, that is 
secret, that can be shared all over the government for any 
other purpose.
    If they are prejudiced by that, they do not know because 
nobody is going to say to them, we have now looked at your risk 
assessment and that is why you did not get a security 
clearance, that is why you did not get a job.
    Senator Specter. If they are kept off the plane though, if 
they are challenged--
    Ms. Harris. If they are kept off the plane, they know they 
have been kept off the plane. But nobody has said to them, we 
have identified you as a high risk, and here is how you can get 
out of that. There is no procedure for challenging a risks 
    Senator Specter. But until they are kept off the plane, 
when they have been prejudiced, at that juncture they have a 
right to challenge it until--
    Ms. Harris. They have no right to challenge it.
    Senator Specter. Wait a minute.
    Ms. Harris. They have no right to challenge it.
    Senator Specter. Wait a minute. Wait a minute. The question 
is not posed yet.
    Ms. Harris. Yes, Senator.
    Senator Specter. At what point is there prejudice? If they 
have been kept off the plane, it has been identified, they then 
have a right to challenge it. But until that time, what is 
their prejudice?
    Ms. Harris. Senator, I am not quite sure I agree with you 
about their right to challenge it. We do not have procedures 
set up for people to know their risk assessment and to be able 
to go and challenge it. We do not have those procedures. You 
can kind of go to TSA or whoever and try to get a response.
    I do not mean to be talking past you, but if you are kept 
off a plane you probably have an idea that perhaps you have a 
risk assessment that is high. If that is based on data that is 
inaccurate, I do not know where you go to challenge that data.
    We do not have Privacy Rights Act-like privileges. These 
notices specifically exclude people from those kinds of rights 
in these programs. All we are arguing is, just putting efficacy 
aside, that people do have those rights, that we restore them.
    Chairman Leahy. I might use an example, I alluded to it in 
my opening statement, of an airline pilot. I will identify him. 
It is Kieran O'Dwyer. Having an Irish surname, I kind of 
noticed this, notwithstanding my Italian ancestry.
    But Kieran O'Dwyer of Pittsboro, North Carolina, an airline 
pilot for American Airlines. In 2003, he gets off the plane and 
is detained for 19 minutes on international flight because they 
told him his name matched one on a government terrorist watch 
list, apparently somebody from the IRA.
    Over the next almost 2 years, he was detained 70 to 80 
times. He talked to his Republican Senator and Democratic 
Congressman and they could not get him off the list. It got so 
bad, he said, Customs agents came to greet him by his first 
name. But they still had to detain him because he was on the 
list and he could not get off it.
    So he finally, after missing numerous connecting flights 
where he has to get to the next flight that he is supposed to 
fly, having to pay to stay in hotels because he has missed 
them, he gave up flying internationally, even though he took a 
five-figure drop in his pay. He just could not do it. That is 
one example, and I am sure we have many more.
    Senator Whitehouse?
    Senator Whitehouse. Thank you, Mr. Chairman.
    Just a word on my background. Rhode Island is one of those 
States in which the Attorney General has State-wide criminal 
law enforcement authority, so like the Senator and the Ranking 
Member I was, in effect, the DA. I was also the U.S. Attorney 
for Rhode Island.
    I have led and overseen undercover and confidential 
investigations, so I am well aware of the critical value of 
that, and also well aware of the civil liberties hazard that 
that creates. It is very interesting to me to be seeking to 
apply that balance in this area where there is a new and 
inevitable technology that has arrived upon our society.
    My question to anyone on the panel who would care to answer 
it, is this. Does it make sense to look at the use of the data 
mining capability in different ways depending on the different 
uses of that capability?
    And specifically, can we talk about two different uses 
being one in which a dragnet is run through the data mine based 
on a profile or based on a formulary, and as a result 
individual names are surfaced and then further action ensues 
with respect to those previously unknown or undisclosed names? 
That would be one category of access to the data mining 
    The other would be taking a preexisting identified subject 
of some variety, perhaps a predicated subject of some kind, 
perhaps not, and running that individual name through the data 
mining capability to seek for links, contacts, and other things 
that would be useful in investigating the activities of that 
    Are those two meaningfully distinct uses of the data mining 
capability, and in our deliberations should we be considering 
them separately?
    Ms. Harris. Mr. Whitehouse, at least from our perspective 
we do think that those are differing capabilities. I mean, 
there is a very interesting--I cannot remember if it is a 
footnote or a page in the Markle report that shows how using 
sort of existing data and starting with the two terrorists who 
are on the watch list and looking for links about addresses and 
a variety of things, that you might have been able to identify 
all the terrorists. That, to me, is traditional law 
    Now, I understand from Dr. Carafano's view that the line 
between that as technology advances, and what Mr. Harper and I 
sort of refer to as predictive or pattern-based, is going to 
get more muddled as technology advances. But it does offer, I 
think, a useful place to make a distinction.
    First of all, in the suspicion-based, you are sort of 
engaged in a law enforcement activity. People get identified at 
some point and action is taken that is, if not public, goes 
into the law enforcement realm, procedures attach under our 
    In the predictive realm, we are starting with no predicate. 
We are starting with no suspect. We may be starting with a set 
of hypotheticals that are maybe worth testing, but then we are 
literally moving towards identifying, labeling, perhaps taking 
actions on people and there never is a procedure that attaches. 
I think that that is a very big difference.
    Senator Whitehouse. Does anyone disagree that this is a 
meaningful distinction?
    Ms. Harris. I think these witnesses do.
    Senator Whitehouse. Congressman Barr?
    Mr. Barr. Thank you, Senator. I do not disagree. I think it 
is a very important distinction. I think that if, in fact, 
there is information developed through legitimate intelligence 
operations, for example, that a particular person is a 
legitimate suspect, the government certainly needs to follow up 
on that and run that person's name through in whatever 
permutations there might be.
    But the question or the issue that is the more fundamental 
one to determine what those distinctions are and how to 
proceed, is that whatever the system is, it has to pass Fourth 
Amendment muster.
    Data mining, the way I believe it is being used by the 
government where everybody is a suspect and there is no 
suspicion, reasonable or otherwise, that a person is or has 
done something wrong before evidence is gathered against them, 
put into, manipulated, retained and disseminated through a data 
mining base, is not consistent with the Fourth Amendment and it 
should not hinge, with all due respect to the Ranking Member, 
on whether or not a person can show that, I have in fact been 
    I think the harm is done to society generally where you 
have a government that can treat all of its citizens and all 
other persons lawfully in the country as suspects, gather 
evidence on them, use that data to deny any particular one of 
them or a group of them, a fundamental right. That, I think, 
ought to be the starting point for the analysis.
    Senator Whitehouse. Would you require a warrant for a 
government agent to do a Google search?
    Mr. Barr. No. The government does not need a warrant to do 
a search of publicly available information. But in order to be 
consistent with both existing laws such as the Privacy Act, and 
consistent with the basic edicts of the Fourth Amendment, if 
they in fact take further steps and include information, 
private information on a person in a database that is to be 
mined through algorithms manipulated in some way and then 
potential adverse action taken against a person, I think they 
do need to consider that, and ought to.
    Senator Whitehouse. This will be my last question. So in 
your view, the privacy barrier that is intruded upon by this is 
breached when private information goes into the data mine, not 
when the name emerges from the data mine and the government 
then begins to take action against an individual.
    Mr. Barr. That is correct.
    Senator Whitehouse. All right. Thank you, Mr. Chairman.
    Mr. Taipale. Could I just address it?
    Chairman Leahy. Go ahead. In fact, that is a very good 
question. If anybody else wants to address, briefly, what 
Senator Whitehouse asked, go ahead.
    Mr. Taipale. I think the issue of trying to draw a 
distinction between link and pattern analysis is very 
difficult. Again, let me preface all this by saying, I am 
completely in favor of privacy protection and oversight, and 
all of those things.
    But when you start to get into, actually, the use of these 
technologies, in context, I mean, we are talking about a lot of 
different things and going back and forth. So, for instance, in 
the Ted Kennedy example, that is a one-to-one match. That is a 
problem with watch lists. If we want to talk about watch lists, 
that is a problem. There ought to be procedures to deal with 
    Data mining in that case may actually help solve the 
problem. Here, if Ted Kennedy is stopped because he is on the 
watch list, but his terror score is very low because he is a 
U.S. Senator--I do not know if that is true--but if he does 
have a low score because he is a U.S. Senator, then that ought 
to be the basis for determining--sort of using independent 
models to come up with whether that is someplace to spend 
resources against, as Jim said earlier.
    Again, I am not in favor of any particular government 
program. I am not here endorsing any particular government 
program. I am merely saying that these are tools that can 
allocate investigative and intelligence resources. Going back 
to the premise of your question about using it in law 
enforcement, we do this all the time.
    The difference between looking for John Smith, or the man 
in a black suit, or a man in a blue suit, or a person cashing a 
check under $10,000, or whatever, we do this all the time. We 
used pattern-based analyses in the IRS to select who gets 
audited. We do it in the SEC and NASD to find insider traders. 
We do it in money laundering.
    We do it at the borders with ICE to find drug couriers 
using drug courier profiles. We use hijacker profiles. All of 
those have been upheld and, quite frankly, the issue of using a 
probability-based predicate is something that is not inherently 
contrary to the Fourth or Fifth Amendment.
    Senator Specter. Dr. Carafano, you wanted to add something?
    Mr. Carafano. Yes. I do think that a useful distinction in 
how we address the public policy issues is distinguishing 
between automating traditional law enforcement activities and 
the more exotic knowledge management of information to do 
predictive behaviors.
    But the point I would disagree with your division is, not 
all law enforcement activities begin with a suspect, 
essentially. I come from a long line of cops. When a cop goes 
on the street, he is collecting information every second. He is 
looking for behavior that is out of place. He pulls a car over, 
and everything else. That leads to a whole thing.
    So, no, he is not starting with a suspect, yet he is 
continually gathering freely accessible information. In a 
sense, ATS is automating that. I do think that that belongs in 
a separate discussion because the law there is clear. The 
question is, are the checks and balances in place? Those are 
not science experiments. Knowledge management is.
    Chairman Leahy. Ms. Harris, did you want to add to that?
    Ms. Harris. Well, I wanted to respond to the idea that this 
is no different than sort of the profiling we do that has been 
upheld for, for example, stopping a car under a drug profile. 
That seems to be the basis for this analysis, that this is all 
right under the Fourth Amendment.
    First of all, it is not secret. The police stop you. They 
know they have stopped you. You have an immediate opportunity 
to resolve the situation. If you are an innocent person and 
they have stopped you, and you have consented, there is no 
long-term use of the data.
    Two years later you do not show up for a job with the 
Federal Government and you get a security clearance denied 
because somewhere there is now a file that says they stopped 
you at the Vermont border. That is more like a metal detector.
    I really object to this effort to take these cases that 
involve one-on-one suspicion, one-on-one record analysis from 
20 years ago and try to apply them to this complex technical 
environment we are in. The Supreme Court may have said it is 
fine to do stops for drug profiling, but it has also said we 
have to update the Fourth Amendment to take into account 
technology. That is where we have fallen short. The one thing 
that I hear from everybody on this committee, is that we all 
think we have got to do something about the safeguards, whether 
or not we think predictive data mining works.
    Chairman Leahy. I smiled just briefly. In talking about 
being stopped down at the Vermont border, I was actually 
stopped a few years ago. It was a huge stop. They were stopping 
everybody. I drive back from Vermont about once a year, usually 
after the August recess, my wife and I. About 100-some-odd 
miles from the Canadian border, here is this big stop. I had 
license plate one on the car.
    They asked for identification and I was a little bit 
annoyed and showed them my Senate ID that says I am a U.S. 
Senator. But they asked, do I have proof of citizenship. I 
said, you may want to check the Constitution.
    Anyway, I digress. Not that it annoyed me; I still remember 
it like it was yesterday.
    Today, as you said, Ms. Harris, it is something that could 
be resolved right there. Today we read that the Department of 
Defense has agreed to alter the uses of a database with 
information on high school and college students and they have 
agreed to alter that.
    I wish they had done it because of questions being asked by 
Members of Congress. They did it because they got sued. I will 
include in the record information on that settlement, including 
the filing in the Federal Register yesterday amending this 
government information system.
    Senator Specter, did you have anything further? Otherwise I 
was going to keep the record open so that Senators on both 
sides could submit anything they wanted to.
    Senator Specter. Well, thank you, Mr. Chairman. Just one 
comment. I do not think that I have any disagreement with 
Congressman Barr with respect to probable cause if there is 
going to be, as he puts it, an adverse action. I think that is 
true. But within the range of investigative tools, if there is 
no adverse action, as Congressman Barr says, and there is no 
specific prejudice to the individual, then I think there is 
latitude for law enforcement to look for patterns.
    If you put together the 9/11 hijackers, for example, and 
you have connecting points where they entered about the same 
time, where they used the same banks, where they go to the same 
flight schools and do it in a confidential way where there is 
no disclosure, they have no prejudice and not saying anything 
adverse about it and doing it in a confidential, discreet way--
Congressman Barr used to be a prosecuting attorney. It is a 
popular background. It gives you a lot of insights into 
investigative techniques and protection of civil liberties. 
That is one of the prosecutor's fundamental duties. He is 
quasi-judicial, to be sure that civil rights are not violated.
    But it is a very complex field and it is hard to put your 
arms around it. It is really hard to figure out exactly where 
it is going. When we have open sessions, you see on C-SPAN how 
little we find out, and the sessions you saw which were closed, 
how little we find out, you would be amazed. Congressman Barr 
knows. He has been in a lot of them. This 407 on the Senate 
side, and the House has its own side.
    But we have to pursue the matters and we have to keep 
various Federal agencies on their toes, and give them latitude, 
but expect them to respect rights. So, thank you, Mr. Chairman.
    Chairman Leahy. Well, thank you. Thank you, Senator 
Specter. We will have more hearings on it.
    I also want to thank the panel. I know that you spent a lot 
of time preparing for this. It seems like, kind of zip in, zip 
out. This is important. It is important to this committee.
    I worry very much about this privacy matter. We Vermonters 
just naturally have a sense of privacy, but I think most 
Americans, too. We want to be secure. But at some point, 
especially in an interconnected age of the Internet and 
everything else, when mistakes are made, they are really bad 
    The worst mistakes are those when you do not know a mistake 
has happened, but it affects everything from your credit rating 
to your job. It is not what America is about. We talk about 
connecting the dots with the people in the flight school. 
Unfortunately, the FBI had all that information. They just 
chose not to act on it, and we had 9/11.
    Thank you all very much. We stand in recess.
    [Whereupon, at 10:45 a.m. the hearing was concluded.]
    [Questions and answers and submissions for the record 
follow.]
    [Additional material is being retained in the Committee 
files, see Contents.]

    Hanson et al. vs. Rumsfeld, 2006; Case No. 06 CV 3118; 
Judicial Case Files; United States District Court Southern 
District of New York, Complaint; New York City.

    Hanson et al. vs. Rumsfeld, 2007; Case No. 06 CV 3118; 
Judicial Case Files; United States District Court Southern 
District of New York, Stipulation of Voluntary Dismissal 
Pursuant to F.R.C.P. 41(a)(1)(ii); New York City.

    Taipale, K. A. ``Technology, Security, and Privacy: The 
Fear of Frankenstein, The Mythology of Privacy, and The Lessons 
of King Ludd.'' Yale Journal of Law and Technology. 7 Yale J.L. 
& Tech. 123; 9 INTL. J. Comm. L. & Pol'y 8. (Dec. 2004).

    United States. Office of the Secretary of Defense. The 
Privacy Act of 1974 Notice to Amend Systems of Records. January 
9, 2007.

    United States. Department of Homeland Security. Report of 
the Department of Homeland Security Data Privacy And Integrity 
Advisory Committee: Framework for Privacy Analysis of Programs, 
Technologies, and Applications Report No. 2006-01. Adopted 
March 7, 2006.