[House Hearing, 116 Congress]
[From the U.S. Government Publishing Office]





 
                EQUITABLE ALGORITHMS: EXAMINING WAYS TO
                  REDUCE AI BIAS IN FINANCIAL SERVICES

=======================================================================

                                HEARING

                               BEFORE THE

                 TASK FORCE ON ARTIFICIAL INTELLIGENCE

                                 OF THE

                    COMMITTEE ON FINANCIAL SERVICES

                     U.S. HOUSE OF REPRESENTATIVES

                     ONE HUNDRED SIXTEENTH CONGRESS

                             SECOND SESSION

                               __________

                           FEBRUARY 12, 2020

                               __________

       Printed for the use of the Committee on Financial Services

                           Serial No. 116-87
                           
                           
                           
                           




                            ______                       


               U.S. GOVERNMENT PUBLISHING OFFICE 
42-821 PDF               WASHINGTON : 2021                            
                           
                           
                           
                           
                           
                           

                 HOUSE COMMITTEE ON FINANCIAL SERVICES

                 MAXINE WATERS, California, Chairwoman

CAROLYN B. MALONEY, New York         PATRICK McHENRY, North Carolina, 
NYDIA M. VELAZQUEZ, New York             Ranking Member
BRAD SHERMAN, California             ANN WAGNER, Missouri
GREGORY W. MEEKS, New York           FRANK D. LUCAS, Oklahoma
WM. LACY CLAY, Missouri              BILL POSEY, Florida
DAVID SCOTT, Georgia                 BLAINE LUETKEMEYER, Missouri
AL GREEN, Texas                      BILL HUIZENGA, Michigan
EMANUEL CLEAVER, Missouri            SEAN P. DUFFY, Wisconsin
ED PERLMUTTER, Colorado              STEVE STIVERS, Ohio
JIM A. HIMES, Connecticut            ANDY BARR, Kentucky
BILL FOSTER, Illinois                SCOTT TIPTON, Colorado
JOYCE BEATTY, Ohio                   ROGER WILLIAMS, Texas
DENNY HECK, Washington               FRENCH HILL, Arkansas
JUAN VARGAS, California              TOM EMMER, Minnesota
JOSH GOTTHEIMER, New Jersey          LEE M. ZELDIN, New York
VICENTE GONZALEZ, Texas              BARRY LOUDERMILK, Georgia
AL LAWSON, Florida                   ALEXANDER X. MOONEY, West Virginia
MICHAEL SAN NICOLAS, Guam            WARREN DAVIDSON, Ohio
RASHIDA TLAIB, Michigan              TED BUDD, North Carolina
KATIE PORTER, California             DAVID KUSTOFF, Tennessee
CINDY AXNE, Iowa                     TREY HOLLINGSWORTH, Indiana
SEAN CASTEN, Illinois                ANTHONY GONZALEZ, Ohio
AYANNA PRESSLEY, Massachusetts       JOHN ROSE, Tennessee
BEN McADAMS, Utah                    BRYAN STEIL, Wisconsin
ALEXANDRIA OCASIO-CORTEZ, New York   LANCE GOODEN, Texas
JENNIFER WEXTON, Virginia            DENVER RIGGLEMAN, Virginia
STEPHEN F. LYNCH, Massachusetts      WILLIAM TIMMONS, South Carolina
TULSI GABBARD, Hawaii                VAN TAYLOR, Texas
ALMA ADAMS, North Carolina
MADELEINE DEAN, Pennsylvania
JESUS ``CHUY'' GARCIA, Illinois
SYLVIA GARCIA, Texas
DEAN PHILLIPS, Minnesota

                   Charla Ouertatani, Staff Director
                 TASK FORCE ON ARTIFICIAL INTELLIGENCE

                    BILL FOSTER, Illinois, Chairman

EMANUEL CLEAVER, Missouri            BARRY LOUDERMILK, Georgia, Ranking 
KATIE PORTER, California                 Member
SEAN CASTEN, Illinois                TED BUDD, North Carolina
ALMA ADAMS, North Carolina           TREY HOLLINGSWORTH, Indiana
SYLVIA GARCIA, Texas                 ANTHONY GONZALEZ, Ohio
DEAN PHILLIPS, Minnesota             DENVER RIGGLEMAN, Virginia

                            C O N T E N T S

                              ----------                              
                                                                   Page
Hearing held on:
    February 12, 2020............................................     1
Appendix:
    February 12, 2020............................................    33

                               WITNESSES
                      Wednesday, February 12, 2020

Ghani, Rayid, Distinguished Career Professor, Machine Learning 
  Department and the Heinz College of Information Systems and 
  Public Policy, Carnegie Mellon University......................    12
Henry-Nickie, Makada, David M. Rubenstein Fellow, Governance 
  Studies, Race, Prosperity, and Inclusion Initiative, Brookings 
  Institution....................................................     6
Kearns, Michael, Professor and National Center Chair, Department 
  of Computer and Information Science, University of Pennsylvania     8
Thomas, Philip S., Assistant Professor and Co-Director of the 
  Autonomous Learning Lab, College of Information and Computer 
  Sciences, University of Massachusetts Amherst..................     4
Williams, Bari A., Attorney and Emerging Tech AI & Privacy 
  Advisor........................................................    10

                                APPENDIX

Prepared statements:
    Ghani, Rayid.................................................    34
    Henry-Nickie, Makada.........................................    43
    Kearns, Michael..............................................    49
    Thomas, Philip S.............................................    52
    Williams, Bari A.............................................    55

              Additional Material Submitted for the Record

Foster, Hon. Bill:
    Written statement of BSA/The Software Alliance...............    62
    Written statement of the Future of Privacy Forum.............    71
    Written statement of ORCAA...................................    88
    Student Borrower Protection Center report entitled, 
      ``Educational Redlining,'' dated February 2020.............    90
    Response from Upstart to the Student Borrower Protection 
      Center's February 2020 report..............................   120


                    EQUITABLE ALGORITHMS: EXAMINING

                         WAYS TO REDUCE AI BIAS

                         IN FINANCIAL SERVICES

                              ----------                              


                      Wednesday, February 12, 2020

             U.S. House of Representatives,
             Task Force on Artificial Intelligence,
                           Committee on Financial Services,
                                                   Washington, D.C.
    The task force met, pursuant to notice, at 2:05 p.m., in 
room 2128, Rayburn House Office Building, Hon. Bill Foster 
[chairman of the task force] presiding.
    Members present: Representatives Foster, Cleaver, Porter, 
Casten; Loudermilk, Budd, Hollingsworth, Gonzalez of Ohio, and 
Riggleman.
    Chairman Foster. The Task Force on Artificial Intelligence 
will now come to order. It is my understanding that there is an 
ongoing markup in the Judiciary Committee, which is competing 
for Members' attention, and I suspect they will be coming in 
and out over the course of this hearing.
    Without objection, the Chair is authorized to declare a 
recess of the task force at any time. Also, without objection, 
members of the full Financial Services Committee who are not 
members of this task force are authorized to participate in 
today's hearing, consistent with the committee's practice.
    Today's hearing is entitled, ``Equitable Algorithms: 
Examining Ways to Reduce AI Bias in Financial Services.''
    I will now recognize myself for 5 minutes for an opening 
statement. First, thank you, everyone, for joining us today for 
what should be a very interesting hearing of the task force.
    Today, we are looking to explore what it means to design 
ethical algorithms that are transparent and fair. In short, how 
do we program fairness into our AI models and make sure that 
they can explain their decisions to us? This is an especially 
timely topic. It seems as though every week, we are hearing 
stories and questions about biased algorithms in the lending 
space, from credit cards that discriminate against women, to 
loans that discriminate based on where you went to school.
    I think many of these issues can be a lot more complicated 
and nuanced than how they are portrayed in the media, but it is 
clear that the use of AI is hitting a nerve with a lot of 
folks.
    For us as consumers to understand what is happening, we 
need to take a deeper look under the hood. First off, there are 
literally dozens of definitions of fairness to look at. As 
policymakers, we need to be able to explicitly state what kinds 
of fairness we are looking for, and how to balance multiple 
definitions of fairness against each other. Because, while we 
have fair lending laws in the form of the Equal Credit 
Opportunity Act and the Fair Housing Act, translating these 
analog laws into machine learning models is easier said 
than done. It is incumbent upon us to clearly state what our 
goals are, and to try to quantify the tradeoffs that we are 
willing to accept between accuracy and fairness.
    Equally important to designing ethical algorithms, however, 
is finding ways to ensure that they are working as they are 
intended to work. AI models present novel issues for resource-
strapped regulators that aren't necessarily present in 
traditional lending models. For example, AI models continuously 
train and learn from new data, which means that the models 
themselves must adapt and change.
    Another challenge is in this biased data, and I am reminded 
at this point of the saying from the great sage, Tom Lehrer, 
who said that life is like a sewer, what you get out of it 
depends on what you put into it, and AI algorithms are very 
similar, where the algorithms are like sewers, and sewage in 
will generate sewage out. Maybe our job on this committee is to 
define the correct primary, secondary, and tertiary sewage 
treatment systems to make sure that what comes out of the 
algorithms is of higher quality than what goes into them.
    And because AI models often train on historical data that 
reflects historical biases, which we hope will disappear over 
time, the models must correct for those biases today, and 
hopefully, those corrections will become less important in the 
future.
    But as more alternative data points are added to the 
underwriting models, the risk that a model will use such data 
as a proxy for prohibited characteristics, like race or age, 
only increases. One potential solution that we keep hearing 
about is the idea that these algorithms or their outputs should 
be audited by expert third parties.
    As an analogy, we have all subscribed to the idea that 
companies' financial statements should be audited by qualified 
accountants to ensure that they are in compliance with 
Generally Accepted Accounting Principles (GAAP).
    Another idea is to require companies to regularly self-test 
and perform benchmarking analyses that are submitted to 
regulators for review. This recognizes that model development 
is an iterative process, and we need agile ways to review and 
respond to changing models.
    These are just a few of the many good ideas that have been 
discussed. I am excited to have this conversation, to see how 
we can make AI be the best version of itself, and how to design 
algorithmic models that best capture the ideals of fairness and 
transparency that are reflected in our fair lending laws.
    We want to make sure that the biases of the analog world 
are not repeated in the AI and machine learning world. And with 
that, I now recognize the ranking member of the task force, my 
friend from Georgia, Mr. Loudermilk, for 5 minutes.
    Mr. Loudermilk. Thank you, Mr. Chairman. And I thank all of 
you for being here today as we discuss this very important 
subject. We are going to discuss ways to identify and reduce 
bias in artificial intelligence in financial services. We have 
talked about this issue in concept numerous times, but we have 
not yet gotten deep into what algorithm explainability really 
means. So, I appreciate the chairman holding this hearing.
    Analytical models of AI and machine learning are best 
understood, at least to me, when they are broken into three 
basic models: descriptive analytics, which analyzes past data; 
predictive analytics, which predicts future outcomes based on 
past data; and prescriptive analytics, where the algorithm 
recommends a course of action based on past data.
    There is also a fourth emerging model, which I refer to as 
the ``execution model,'' which automatically takes action based 
on other AI systems' outputs. I believe the execution model 
deserves the most attention from policymakers because it can 
remove the human element in decision-making.
    There are a number of noteworthy recent developments in 
artificial intelligence that I hope we can discuss today. 
First, the White House Office of Science and Technology Policy 
recently released principles for how Federal agencies can 
regulate the development of AI in the private sector. The 
intent of the principles is to govern AI by providing direction 
on the technical and ethical aspects without stifling innovation.
    The principles recommended providing opportunities for 
public feedback during the rulemaking process, considering 
fairness and nondiscrimination regarding the decisions made by 
AI applications, and basing the regulatory approach on 
scientific data.
    The U.S. Chief Technology Officer said the principles are 
designed to ensure public engagement, limit regulatory 
overreach, and promote trustworthy technology. Some private-
sector organizations recommend principles for companies using 
AI, which include designating a lead AI ethics official, making 
sure the customer knows when they are interacting with AI, 
explaining how the AI arrived at its result, and testing AI for 
bias. I believe the latter two, explaining results and testing 
for bias, are important to ensure appropriate use of AI by 
private sector businesses.
    A basic but central part of explainability is making sure 
businesses and their regulators are able to know the building 
blocks of what went into an algorithm when it was being 
constructed. In other words, coders should maintain full 
records of what is going into the model when it is being 
trained, ranked by order of importance. This is also known as 
``logging,'' and can help isolate sources of bias.
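    [The following is a minimal Python sketch of the kind of 
training log described above, recording a model's input features 
ranked by importance. The feature names, the choice of model, and 
the use of scikit-learn are illustrative assumptions, not part of 
the testimony.]

    import json
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier

    # Hypothetical feature names for a credit-style model.
    feature_names = ["credit_history_len", "utilization",
                     "on_time_payments", "inquiries"]
    X, y = make_classification(n_samples=500, n_features=4,
                               random_state=0)

    model = RandomForestClassifier(random_state=0).fit(X, y)

    # Rank the inputs by the model's importance scores and write a
    # training log entry that reviewers can inspect later.
    ranked = sorted(zip(feature_names, model.feature_importances_),
                    key=lambda pair: pair[1], reverse=True)
    log_entry = {
        "model": "RandomForestClassifier",
        "training_rows": len(X),
        "features_by_importance": [
            {"name": name, "importance": round(float(score), 4)}
            for name, score in ranked
        ],
    }
    print(json.dumps(log_entry, indent=2))
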
    A similar concept is present in credit scoring. Credit 
scores are generated by algorithms to create a number that 
predicts a person's ability to repay a loan.
    Importantly, it is easy to figure out what is bringing 
someone's score up or down, because the factors that go into 
the score are transparent. Having a long credit history brings 
the score up, while the use of available credit brings it down. 
Additionally, on-time payments are weighted higher than the 
number of inquiries.
    With that said, recordkeeping is a starting point, and 
certainly is not a silver bullet solution to the explainability 
problem, especially with more complex algorithms.
    With explainability, we also need to define what fairness 
is. There needs to be a benchmark to compare algorithm results 
and evaluate the fairness of an algorithm's decisions. These 
kinds of paper trails can help get to the bottom of suspected 
bias in loan underwriting decisions. It is also important to be 
able to test algorithms to see if there is any bias present.
    If there is suspected bias, companies can take a subset of 
the data based on sensitive features like gender and race to 
see if there is disparate impact on a particular group. Aside 
from testing for bias, testing can also help companies verify 
if the algorithm is arriving at its expected results.
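    [A minimal Python sketch of the subset test described above: 
compare approval rates across a sensitive feature and compute a 
disparate impact ratio. The data, group labels, and the 0.8 
threshold are illustrative assumptions.]

    from collections import defaultdict

    # Hypothetical (group, approved) decision records pulled for an
    # internal bias test.
    decisions = [
        ("group_a", True), ("group_a", True),
        ("group_a", False), ("group_a", True),
        ("group_b", True), ("group_b", False),
        ("group_b", False), ("group_b", True),
    ]

    totals, approvals = defaultdict(int), defaultdict(int)
    for group, approved in decisions:
        totals[group] += 1
        approvals[group] += int(approved)

    # Compare per-group approval rates; a ratio well below 1.0 is a
    # signal to investigate further, not proof of discrimination.
    rates = {g: approvals[g] / totals[g] for g in totals}
    ratio = min(rates.values()) / max(rates.values())
    print("approval rates:", rates)
    print("disparate impact ratio:", round(ratio, 2),
          "flag for review" if ratio < 0.8 else "ok")
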
    It is also important for companies and regulators to verify 
the input data for accuracy, completeness, and appropriateness. 
Flawed data likely results in flawed algorithm outcomes.
    I look forward to the discussion on this issue, and I yield 
back.
    Chairman Foster. Thank you.
    Today, we welcome the testimony of Dr. Philip Thomas, 
assistant professor and co-director of the Autonomous Learning 
Lab, College of Information and Computer Sciences at the 
University of Massachusetts Amherst; Dr. Makada Henry-Nickie, 
the David M. Rubenstein Fellow for the Governance Studies, 
Race, Prosperity, and Inclusion Initiative at the Brookings 
Institution; Dr. Michael Kearns, professor and national center 
chair, the Department of Computer and Information Science at 
the University of Pennsylvania; Ms. Bari A. Williams, attorney 
and emerging tech AI and privacy adviser; and Mr. Rayid Ghani, 
the distinguished career professor in the machine learning 
department at Heinz College of Information Systems and Public 
Policy at Carnegie Mellon University.
    Witnesses are reminded that your oral testimony will be 
limited to 5 minutes. And without objection, your written 
statements will be made a part of the record.
    Dr. Thomas, you are now recognized for 5 minutes to give an 
oral presentation of your testimony.

   STATEMENT OF PHILIP S. THOMAS, ASSISTANT PROFESSOR AND CO-
DIRECTOR OF THE AUTONOMOUS LEARNING LAB, COLLEGE OF INFORMATION 
   AND COMPUTER SCIENCES, UNIVERSITY OF MASSACHUSETTS AMHERST

    Mr. Thomas. Thank you. Chairman Foster, Ranking Member 
Loudermilk, and members of the task force, thank you for the 
opportunity to testify today.
    I am Philip Thomas, an assistant professor at the 
University of Massachusetts Amherst.
    My goal as a machine learning researcher is to ensure that 
systems that use machine learning algorithms are safe and fair, 
properties that may be critical to the responsible use of AI in 
finance.
    Towards this goal, in a recent science paper, my co-authors 
and I proposed a new type of machine learning algorithm which 
we call a Seldonian algorithm. Seldonian algorithms make it 
easier for the people using AI to ensure that the systems they 
create are safe and fair. We have shown how Seldonian 
algorithms can avoid unfair behavior when applied to a variety 
of applications, including optimizing online tutorials to 
improve student performance, influencing criminal sentencing, 
and deciding which loan applications should be approved.
    While our work with loan application data may appear most 
relevant to this task force, that work was in a subfield of 
machine learning called contextual bandits. The added 
complexity of the contextual bandit setting would not benefit 
this discussion, and so I will instead focus on an example in a 
more common and straightforward setting called regression.
    In this example, we used entrance exam scores to predict 
what the GPAs of new university applicants would be if they 
were accepted. The GPA prediction problem resembles many 
problems in finance, for example, rating applications for a job 
or a loan. The fairness issues that I will discuss are the same 
across these applications.
    In the GPA prediction study, we found that three standard 
machine learning algorithms overpredicted the GPAs of male 
applicants on average and underpredicted the GPAs of female 
applicants on average, with a total bias of around 0.3 GPA 
points in favor of male applicants. The Seldonian algorithm 
successfully limited this bias to below 0.05 GPA points, with 
only a small reduction in predictive accuracy.
    The rapidly growing community of machine learning 
researchers studying issues related to fairness has produced 
many similar AI systems that can effectively preclude a variety 
of types of unfair behavior across a variety of applications. 
With the development of these fair algorithms, machine learning 
is reaching the point where it can be applied responsibly to 
financial applications, including influencing hiring and loan 
approval decisions.
    I will now discuss technical issues related to ensuring the 
fairness of algorithms which might inform future regulations 
aimed at ensuring the responsible use of AI in finance.
    First, there are many definitions of fairness. Consider our 
GPA prediction example. One definition of fairness requires the 
average predictions to be the same for each gender. Under this 
definition, a system that tends to predict a lower GPA if you 
are of a particular gender would be deemed unfair.
    Another definition requires the average error of 
predictions to be the same for each gender. Under this 
definition, a system that tends to overpredict the GPAs of one 
gender and underpredict for another would be deemed unfair.
    Although both of these might appear to be desirable 
requirements for a fair system, for this problem, it is not 
possible to satisfy both simultaneously. Any system, human or 
machine, that produces the same average prediction for each 
gender necessarily overpredicts more for one gender and 
underpredicts more for the other. The machine learning community 
has generated more than 
20 possible definitions of fairness, many of which are known to 
be conflicting in this way.
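    [A small worked Python example of the two definitions just 
described, using made-up GPA numbers solely to show that equal 
average predictions and equal average errors can conflict.]

    # Hypothetical true GPAs for two groups of applicants.
    true_gpa = {"group_1": [3.4, 3.6], "group_2": [2.8, 3.0]}

    def avg(values):
        return sum(values) / len(values)

    # Definition 1: give every applicant the same average prediction.
    predictions = {g: [3.2, 3.2] for g in true_gpa}

    # Definition 2 instead asks for the same average error per group.
    for group, preds in predictions.items():
        errors = [p - t for p, t in zip(preds, true_gpa[group])]
        print(group, "avg prediction:", avg(preds),
              "avg error:", round(avg(errors), 2))

    # Both groups receive the same average prediction of 3.2, so
    # definition 1 holds; but group_1 is underpredicted by 0.3 while
    # group_2 is overpredicted by 0.3, so definition 2 is violated.
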
    In any effort to regulate the use of machine learning to 
ensure fairness, a critical first step is to define precisely 
what fairness means. This may require recognizing that certain 
behaviors that appear to be unfair may need to be permitted in 
order to enable the enforcement of a conflicting and more 
appropriate notion of fairness.
    Although the task of selecting the appropriate definition 
of fairness should likely fall to regulators and social 
scientists, machine learning researchers can inform this 
decision by providing guidance with regard to which definitions 
are possible to enforce simultaneously, what unexpected 
behavior might result from a particular definition of fairness, 
and how much or how little different definitions of fairness 
might impact profitability.
    Regulations could also protect companies. Fintech companies 
that make every attempt to be fair using AI systems that 
satisfy a reasonable definition of fairness may still be 
accused of racist or sexist behavior for failing to enforce a 
conflicting definition of fairness. Regulation could protect 
these companies by providing an agreed-upon, appropriate, and 
satisfiable definition of what it means for their systems to be 
fair.
    Once a definition of fairness has been selected, machine 
learning researchers can work on developing algorithms that 
will enforce the chosen definition. For example, our latest 
Seldonian algorithms are already compatible with an extremely 
broad class of fairness definitions and might be immediately 
applicable.
    Still, there is no silver bullet algorithm for remedying 
bias and discrimination in AI. The creation of fair AI systems 
may require use-specific considerations across the entire AI 
pipeline, from the initial collection of data, through 
monitoring the final deployed system.
    Several other questions must be answered for regulations to 
be effective and fair. For example, will fairness requirements 
that appear reasonable for the short term have the long-term 
effect of reinforcing existing social inequalities? How should 
fairness requirements account for the fact that changing 
demographics can result in a system that was fair last month 
not being fair today? And when unfair behavior occurs, how can 
regulators determine whether this is due to the improper use of 
machine learning? Thank you again for the opportunity to 
testify today. I look forward to your questions.
    [The prepared statement of Dr. Thomas can be found on page 
52 of the appendix.]
    Chairman Foster. Thank you.
    Dr. Henry-Nickie, you are now recognized for 5 minutes to 
give an oral presentation of your testimony.

 STATEMENT OF MAKADA HENRY-NICKIE, DAVID M. RUBENSTEIN FELLOW, 
GOVERNANCE STUDIES, RACE, PROSPERITY, AND INCLUSION INITIATIVE, 
                     BROOKINGS INSTITUTION

    Ms. Henry-Nickie. Chairman Foster, Ranking Member 
Loudermilk, and distinguished members of the task force, thank 
you for the opportunity to testify today. I am Makada Henry-
Nickie, a fellow at the Brookings Institution, where my 
research covers issues of consumer financial protection.
    I am pleased to share my perspective on both the 
opportunities and challenges of integrating AI into financial 
services. As this committee knows, market interest in AI is 
soaring. AI technologies have permanently reshaped the 
financial marketplace and altered consumer preferences and 
expectations of banks. I want to point out a few key trends 
that underscore this premise.
    First, layering AI onto the financial value chain is 
unlocking enormous opportunities for banks. Consider that J.P. 
Morgan, for example, just installed contract review software 
that takes mere seconds to review the same number of documents 
that previously required about 360,000 manpower hours to 
complete.
    Second, AI is creating new surface areas for banks to 
cross-sell products to customers, and this means more revenue.
    Finally, consumers are increasingly open to embracing AI in 
banking. According to Adobe Analytics, 44 percent of Gen Z and 
31 percent of millennials have interacted with a chat bot. And 
they prefer, overwhelmingly, to interact with a chat bot as 
opposed to a human representative. Taken together, these trends 
suggest that AI is undoubtedly shaping the future of banking.
    The story of AI in financial services is not all bad, and 
innovative fintechs have made salient contributions that make 
financial services more inclusive and more accessible for 
consumers. Micro savings apps, for example, have empowered 
millions of consumers to save more and to do so automatically.
    Digit has used machine learning to help its clients save 
over $2.5 billion; that is an average of $2,000 annually. In 
credit markets, a combination of machine learning and 
alternative data is slowly showing some early promise. When I 
say, ``alternative data,'' I am not referring to the format of 
your email address. I am talking about practical, alternative 
factors such as rental payment and utility payment histories, 
among others. A 2019 FINRA study showed that these variables 
can reliably predict a consumer's ability to repay.
    Furthermore, the results of CFPB's No Action Letter review 
also support this idea of early promise. According to CFPB, 
Upstart, through its use of machine learning and alternative 
data, was able to increase loan approval by nearly 30 percent 
and lower APRs by as much as 17 percent. Crucially, the CFPB 
reported that the fintech's data showed no evidence of fair 
lending disparities.
    Meanwhile, a UC Berkeley study found that algorithmic 
lending substantially decreased pricing disparities and 
eliminated underwriting discrimination for Black and Hispanic 
borrowers.
    Both research and market evidence show that, despite the 
risks, algorithmic models have potential to provide benefits to 
consumers. However, it is important not to overstate this 
promise. We have all had a front row seat to the movie. 
Algorithms propagate bias. This is not an attempt to 
exaggerate. Numerous cases from various scenes support this 
claim, from Amazon's hiring algorithm shown to be biased 
against women, to Google's insulting association between 
African Americans, like myself, and gorillas.
    And the same Berkeley study I mentioned earlier found that 
algorithmic lenders systematically charged Black and Hispanic 
borrowers higher interest rates. According to the study, 
minorities paid 5.3 basis points more than their white peers.
    In the final analysis, machine learning was not 
sophisticated enough to break the systematic correlation 
between race and credit risk. In the end, these borrowers pay 
an estimated ongoing $765 million in excess interest payments, 
instead of saving or paying down student loan debt.
    Machine bias is not inevitable, nor is it final. This bias, 
though, is not benign. AI has enormous consequences for racial, 
gender, and sexual minorities. This should not be trivialized. 
Technical solutions alone, though, will not reduce algorithmic 
bias or ameliorate its effects.
    Congress should focus on strengthening the resiliency of 
the Federal consumer financial protection framework so that 
consumers are protected. Thank you for your time, and I look 
forward to your questions.
    [The prepared statement of Dr. Henry-Nickie can be found on 
page 43 of the appendix.]
    Chairman Foster. Thank you.
    Dr. Kearns, you are now recognized for 5 minutes to give an 
oral presentation of your testimony.

  STATEMENT OF MICHAEL KEARNS, PROFESSOR AND NATIONAL CENTER 
    CHAIR, DEPARTMENT OF COMPUTER AND INFORMATION SCIENCE, 
                   UNIVERSITY OF PENNSYLVANIA

    Mr. Kearns. Thank you for the opportunity to testify today. 
My name is Michael Kearns, and I am a professor in the Computer 
and Information Science Department at the University of 
Pennsylvania. For more than 3 decades, my research has focused 
on machine learning and related topics. I have consulted 
extensively in the finance and technology sectors, including on 
legal and regulatory matters. I discussed the topics and these 
remarks at greater length in my recent book, ``The Ethical 
Algorithm: The Science of Socially Aware Algorithm Design.''
    The use of machine learning for algorithmic decision-making 
has become ubiquitous in the finance industry and beyond. It is 
applied in consequential decisions for individual consumers 
such as lending and credit scoring, in the optimization of 
electronic trading algorithms at large brokerages, and in 
making forecasts of directional movement of volatility in 
markets and individual assets.
    With major exchanges now being almost entirely electronic 
and with the speed and convenience of the consumer internet, 
the benefits of being able to leverage large-scale, fine-grain, 
historical data sets by machine learning have become apparent.
    The dangers and harms of machine learning have also 
recently alarmed both scientists and the general public. These 
include violations of fairness, such as racial or gender 
discrimination in lending or credit decisions, and privacy, 
such as leaks of sensitive personal information.
    It is important to realize that these harms are generally 
not the result of human malfeasance, such as racist or 
incompetent software developers. Rather, they are the 
unintended consequences of the very scientific principles 
behind machine learning.
    Machine learning proceeds by fitting a statistical model to 
a training data set. In a consumer lending application, such a 
data set might contain demographic and financial information 
derived from past loan applicants, along with the outcomes of 
granted loans.
    Machine learning is applied to find a model that can 
predict loan default probabilities and to make lending 
decisions accordingly. Because the usual goal or objective is 
exclusively the accuracy of the model, discriminatory behavior 
can be inadvertently introduced. For example, if the most 
accurate model overall has a significantly higher false 
rejection rate on Black applicants than on white applicants, 
the standard methodology of machine learning will, indeed, 
incorporate this bias.
    Minority groups often bear the brunt of such discrimination 
since, by definition, they are less represented in the training 
data. Note that such biases routinely occur even if the 
training data itself is collected in an unbiased fashion, which 
is rarely the case.
    Truly unbiased data collection requires a period of what is 
known as exploration in machine learning, which is rarely 
applied in practice because it involves, for instance, granting 
loans randomly, without regard for the properties of 
applicants.
    When the training data is already biased and the basic 
principles of machine learning can amplify such biases or 
introduce new ones, we should expect discriminatory behavior of 
various kinds to be the norm and not the exception.
    Fortunately, there is help on the horizon. There is now a 
large community of machine learning researchers who explicitly 
seek to modify the classical principles of machine learning in 
a way that avoids or reduces sources of discriminatory 
behavior. For instance, rather than simply finding the model 
that maximizes predictive accuracy, we could add the constraint 
that the model must not have significantly different false 
rejection rates across different racial groups.
    This constraint can be seen as forcing a balance between 
accuracy and a particular definition of algorithmic fairness. 
The modified methodology generally requires us to specify what 
groups or attributes we wish to protect and what harms we wish 
to protect them from. These choices will always be 
specific to the context and should be made by key stakeholders.
    There are some important caveats to this agenda. First of 
all, there are bad definitions of fairness that should be 
avoided. One example is forbidding the use of race in lending 
decisions in the hope that it will prevent racial 
discrimination. It doesn't, largely because there are so many 
other variables strongly correlated with race that machine 
learning can discover as proxies.
    Even worse, one can show simple examples where such 
restrictions will, in fact, harm the very group we sought to 
protect. Unfortunately, to the extent that consumer finance law 
incorporates fairness considerations, it usually does so in this 
flawed form that restricts model inputs. It is usually far 
better to explicitly constrain the model's output behavior, as 
in the example of equalizing false rejection rates in lending.
    I note in closing, though all my remarks have focused on 
the potential for designing algorithms that are better-behaved, 
they also point the way to regulatory reform, since most 
notions of algorithmic fairness can be algorithmically audited. 
If we are concerned over false rejection rates, or disparities 
by race, we can systematically test models for such behaviors 
and measure the violations.
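    [A minimal Python sketch of the audit just described: measure 
false rejection rates by group from a model's decisions and the 
eventual outcomes. The records are hypothetical; in practice, 
repayment is only observed for granted loans, which relates to the 
data collection problem noted earlier in the testimony.]

    # Hypothetical audit records: (group, would_have_repaid, approved).
    records = [
        ("white", True, True), ("white", True, True),
        ("white", True, False), ("white", False, False),
        ("black", True, True), ("black", True, False),
        ("black", True, False), ("black", False, False),
    ]

    def false_rejection_rate(rows):
        # Share of applicants who would have repaid but were denied.
        qualified = [r for r in rows if r[1]]
        rejected = [r for r in qualified if not r[2]]
        return len(rejected) / len(qualified)

    groups = sorted({g for g, _, _ in records})
    rates = {g: false_rejection_rate([r for r in records if r[0] == g])
             for g in groups}
    gap = max(rates.values()) - min(rates.values())
    print("false rejection rates:", rates)
    print("disparity:", round(gap, 2))
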
    I believe that the consideration of such algorithmic 
regulatory mechanisms is both timely and necessary, and I have 
elaborated on this in other recent writings. Thank you.
    [The prepared statement of Dr. Kearns can be found on page 
49 of the appendix.]
    Chairman Foster. Thank you.
    Ms. Williams, you are now recognized for 5 minutes to give 
an oral presentation of your testimony.

STATEMENT OF BARI A. WILLIAMS, ATTORNEY AND EMERGING TECH AI & 
                        PRIVACY ADVISOR

    Ms. Williams. Chairman Foster, thank you. Members of the 
task force, thank you for allowing me to be here. My name is 
Bari A. Williams, and I am an attorney and start-up adviser 
based in Oakland, California. I have a B.A. from UC Berkeley, 
an MBA from St. Mary's College of California, an M.A. in 
African-American studies from UCLA, and a J.D. from UC Hastings 
College of the Law.
    Primarily, I work in technology transactions, and that also 
includes writing all of the terms of service, which are what I 
like to call the things that you scroll, scroll, scroll 
through, and then accept. I write all of the things that people 
typically tend not to read. I also focus on privacy and a 
specialization in AI. My previous employer, All Turtles, is akin 
to an AI incubator; it concentrates not just on legal and 
policy, but also helps with product production and 
inclusiveness.
    So, in my work in the tech sector, I have been exposed to 
many different use cases for AI. And the things that you tend 
to see for now, and a lot of the panelists have also referred 
to them--criminal justice, lending, understanding predictive 
behavior--are also responsible for all of the ads that you tend 
to see, to influence consumer behavior.
    So I would say that there are five main issues with AI in 
financial services, in particular. One, what data sets are 
being used? And to me, I distill that down to, who fact-checks 
the fact-checkers? What does it mean to use this particular 
data set, and why are you choosing to use it?
    Two, what hypotheses are set out to be proven by using this 
data? Meaning, is there a narrative that is already being 
written and you are looking for examples in which to prove it 
and to bake that into your code?
    Three, how inclusive is the team that is building and 
helping you test this product? I think that is one thing that 
has yet to be mentioned on the panel, is, also, how inclusive 
is the team that is actually creating this product? So who are 
you building the products with?
    Four, what conclusions are drawn from the pattern 
recognition in the data that the AI provides? That is, who are 
you building the products for? And then, who is harmed and who 
stands to benefit?
    And, five, how do we ensure bias neutrality, and are there 
even good reasons to ensure that there is bias neutrality 
because not all biases are bad?
    Data sets in financial services are used to determine your 
home ownership, your mortgage, and your savings and student 
loan rates, all of the things that the prior panelists also 
noted.
    I also cited the same UC Berkeley study that Dr. Henry-
Nickie did, and she is correct; that 2017 study showed that 19 
percent of Black borrowers and almost 14 percent of Latinx 
borrowers were turned down for a conventional loan, and 
additionally, the bias was not removed whether the decision was 
made face-to-face or by the algorithm. So, in fact, it seems 
that the AI technology simply made it more efficient to deny 
people loans and to increase their interest rates.
    So there are two mechanisms in which you can drive for fair 
outcomes. Again, you can pick your favorite definition of 
``fair.'' I think you will see that there are many to choose 
from. One is to leverage statistical techniques to resample or 
reweigh a data sample to reduce the bias. I would give you a 
visual of, essentially, it is someone standing on a box. 
Imagine someone may be shorter, and you give them a box to 
stand on so that they are the same height as the person next to 
them. That is essentially reweighing the data.
    And the second technique is a fairness regularizer, which is 
essentially a mathematical constraint added to existing 
algorithms to ensure fairness in the model.
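    [A minimal Python sketch of the reweighing idea described 
above: each record gets a weight inversely proportional to how 
often its group appears, so an underrepresented group is not 
drowned out during training. The group labels are hypothetical, 
and production toolkits implement richer versions of both 
reweighing and fairness regularizers.]

    from collections import Counter

    # Hypothetical group labels for a training sample: 6 records from
    # group "a" and 2 from group "b".
    groups = ["a", "a", "a", "a", "a", "a", "b", "b"]
    counts = Counter(groups)
    n = len(groups)

    # weight = n / (number_of_groups * group_count): each group ends
    # up contributing equal total weight, like standing the smaller
    # group on a box.
    weights = [n / (len(counts) * counts[g]) for g in groups]
    print("weight applied to each group's records:",
          dict(zip(groups, weights)))
    print("total weight per group:",
          {g: round(sum(w for gg, w in zip(groups, weights) if gg == g), 2)
           for g in counts})
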
    So what are other emerging methods or ways that you can use 
AI for good? Some emerging methods--there is one in particular, 
that is seen with Zest AI, which is a tech company, and it has 
created a product called ZAML Fair, which reduces bias in 
credit assessment by ranking an algorithm's credit variables by 
how much they lead to biased outcomes. And then, they muffle 
the influence of those variables to produce a better model with 
less biased outcomes.
    So if more banks, or even consumer-facing retailers, credit 
reporting bureaus, used something like this, you may get a 
better outcome that shows better parity.
    What ways can existing laws and regulations help us? It is 
the same as I tell my kids, and what I tell my clients: A rule 
is only as good as its enforcement. So, if you act as if the 
rule doesn't exist, it might as well not exist.
    For example, if a lending model finds that older 
individuals have a higher default rate on their loans, and then 
they decide to reduce lending to those individuals based on 
their age, that can constitute a claim for housing 
discrimination. That is where you could apply the Fair Housing 
Act.
    Additionally, under the U.S. Equal Credit Opportunity Act of 
1974, if you show disparate impact on the basis of any protected 
class, you could use that as a lever as well.
    And I don't abide by the idea that, oh, well, the model did 
it. There are people who are actually creating the models, and 
so that means that there is regulation that could be used to 
actually ensure that the people creating the models are 
inclusive and diverse as well. Thank you.
    [The prepared statement of Ms. Williams can be found on 
page 55 of the appendix.]
    Chairman Foster. Thank you.
    And, Mr. Ghani, you are now recognized for 5 minutes to 
give an oral presentation of your testimony.

   STATEMENT OF RAYID GHANI, DISTINGUISHED CAREER PROFESSOR, 
     MACHINE LEARNING DEPARTMENT AND THE HEINZ COLLEGE OF 
    INFORMATION SYSTEMS AND PUBLIC POLICY, CARNEGIE MELLON 
                           UNIVERSITY

    Mr. Ghani. Thank you. Chairman Foster, members of the task 
force, thanks for giving me the opportunity, and for holding 
this hearing. My name is Rayid Ghani, and I am a professor in 
the machine learning department in the Heinz College of 
Information Systems and Public Policy at Carnegie Mellon 
University.
    I have worked in the private sector, in academia, and 
extensively with governments and nonprofits in the U.S. and 
globally on developing and using machine learning and AI 
systems for public policy problems across health, criminal 
justice, education, public safety, human services, and 
workforce development in a fair and equitable manner.
    AI has a lot of potential in helping tackle critical 
problems we face today, from improving the health and education 
of our children, to reducing recidivism, to improving police-
community relations, to improving health and safety outcomes 
and conditions in workplaces and housing.
    AI systems can help improve outcomes for everyone and 
result in a better and more equitable society. At the same 
time, any AI system affecting people's lives should be 
explicitly built to increase equity and not just optimize for 
efficiency.
    An AI system designed to explicitly optimize for efficiency 
has the potential to leave more difficult or costly people to 
help behind, resulting in increased inequities. It is critical 
for government agencies and policymakers to ensure that AI 
systems are developed in a responsible, ethical, and 
collaborative manner with stakeholders that include, yes, 
developers who build these systems, and decision-makers who use 
these systems, but critically including the communities that 
are being impacted by them.
    Since today's hearing is entitled, ``Equitable 
Algorithms,'' I do want to mention that, contrary to a lot of 
thinking in this space today, simply developing AI algorithms 
that are equitable is not sufficient to achieve equitable 
outcomes. Rather, the goal should be to make entire systems and 
their outcomes equitable.
    Since algorithms are typically not--and shouldn't be--
making autonomous decisions in critical situations, we want 
equity across the entire decision-making process, which 
includes the AI algorithm but also the decisions made by humans 
using inputs from those algorithms and the impact of those 
decisions.
    In some recent preliminary work we did with the Los Angeles 
City attorney's office, we found that we can mitigate the 
disparities that a potentially biased algorithm may create, 
resulting in more equitable criminal justice outcomes across 
racial groups.
    Because an AI system requires us to define exactly what we 
want it to optimize, and which mistakes we think are costlier, 
financially or socially, than others, and by exactly how much, 
it forces us to make some of these ethical and societal values 
explicit. For example, in a system recommending lending 
decisions, we may have to specify the differential costs of 
different errors: flagging somebody as unlikely to pay back a 
loan and being wrong about it, versus predicting someone will 
pay back a loan and being wrong about it. We have to specify 
those costs explicitly in the case of people who may be from 
different gender, race, income, and education backgrounds.
    While that may have happened implicitly in the past, and 
with high levels of variation across different decision-makers, 
loan officers in this case, or banks, with AI-assisted 
decision-making processes, we are forced to define them 
explicitly and, ideally, consistently.
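    [A minimal Python sketch of making those costs explicit: assign 
a cost to each kind of lending mistake and compare two candidate 
models by total cost rather than raw accuracy. The dollar figures 
are placeholders for values stakeholders would have to choose, and 
in the framing above they could further be broken out by 
demographic group.]

    # Hypothetical costs of the two kinds of mistakes.
    COST_FALSE_DENIAL = 300     # denying someone who would have repaid
    COST_FALSE_APPROVAL = 1000  # approving someone who then defaults

    def total_cost(decisions):
        # decisions: list of (approved, would_have_repaid) pairs.
        cost = 0
        for approved, would_repay in decisions:
            if approved and not would_repay:
                cost += COST_FALSE_APPROVAL
            elif not approved and would_repay:
                cost += COST_FALSE_DENIAL
        return cost

    model_a = [(True, True), (True, False), (False, True), (False, False)]
    model_b = [(True, True), (False, False), (False, True), (False, True)]
    print("model A cost:", total_cost(model_a))  # 1000 + 300 = 1300
    print("model B cost:", total_cost(model_b))  # 300 + 300 = 600
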
    In my written testimony, I outline a series of steps to 
create AI systems that are likely to lead to equitable outcomes, 
ranging from coming up with the desired outcomes, to building 
these systems, to validating whether they achieve those 
outcomes. It is important to note that these steps are not 
purely technical; they involve understanding the existing social 
and decision-making processes and require solutions that are 
collaborative in nature.
    I think it is critical and urgent for policymakers to act 
and provide guidelines and regulations for both the public and 
private sector organizations, using AI-assisted decision-making 
processes in order to ensure that these systems are built in a 
transparent and accountable manner and result in fair and 
equitable outcomes for society.
    As initial steps, we recommend, one, expanding the already 
existing regulatory frameworks in different policy areas to 
account for AI-assisted decision-making. A lot of these bodies 
already exist--SEC, FINRA, CFPB, FDA, FEC, you know, pick your 
favorite three-letter acronym. But these bodies typically 
regulate inputs that go into the process--race or gender may 
not be allowed--and sometimes the process, but rarely focus on 
the outcomes produced by these processes.
    We recommend expanding these regulatory bodies to update 
their regulations to ensure they apply to AI-assisted decision-
making.
    We also recommend creating training programs, processes, 
and tools to support these regulatory agencies in their 
expanded responsibilities and roles. It is important to 
recognize that AI can have a massive positive social impact, 
but we need to make sure that we can put guidelines and 
regulations in place to maximize the chances of the positive 
impact, while protecting people who have been traditionally 
marginalized in society and may be affected negatively by these 
new AI systems. Thank you for this opportunity, and I look 
forward to your questions.
    [The prepared statement of Dr. Ghani can be found on page 
34 of the appendix.]
    Chairman Foster. Thank you.
    And I now recognize myself for 5 minutes for questions.
    Dr. Thomas, the Equal Credit Opportunity Act, also known as 
ECOA, prohibits discrimination in lending based on the standard 
factors: race or color, religion, national origin, sex, marital 
status, age, and the applicant's receipt of income from any 
public assistance program. Today, is it technically possible to 
program these explicit constraints? If Congress gives exact 
guidance as to what we think is fair, are there still remaining 
technical problems? I would be interested in--yes, proceed.
    Mr. Thomas. Yes. We could program those into algorithms. 
For example, the Seldonian algorithms we have created, for most 
definitions of fairness, we could encode them now. The 
remaining technical challenge is just to recognize that often 
fairness guarantees are only with high probability, not 
certainty. So it may not be possible to create an algorithm 
that guarantees with certainty it will be fair with respect to 
the chosen definition of fairness, but we can create ones that 
will be fair with high probability, yes.
    Chairman Foster. Any other comments on that general 
problem? Is it just a definitional question we are wrestling 
with, or are there technical issues that are--Dr. Kearns?
    Mr. Kearns. If I understood you correctly, as per my 
remarks, I think all of these definitions that try to get to 
fairness by restricting inputs to models are ill-formed. You 
should specify what behavior you want at the output. So, when 
you forbid the use of race, you forget the fact that 
unfortunately, in the United States, ZIP code is already a very 
good statistical proxy for race. So what you should just do is 
say, ``Don't have racial discrimination in the output behavior 
of this model,'' and let the model use any inputs it wants.
    Ms. Henry-Nickie. I would just add that in optimizing for 
one definition of fairness, sometimes we are actually creating 
a disparate treatment effect within the protected class group. 
One study showed that when they optimized for statistical 
parity, meaning the same outcome for both groups, no 
differences, they actually hurt qualified members of a 
protected class. And so, there is a very costly decision 
involved in constraining for one definition, and hurting people 
in the real world.
    Chairman Foster. Mr. Ghani?
    Mr. Ghani. To that point, you can always achieve some--
whatever definition of fairness in terms of the outcomes you 
care about. The question is, at what cost? There are a lot of 
ways you can make fairly random decisions, and a lot of random 
decisions will be somewhat fair, but the cost, in terms of 
effectiveness of outcomes, will be that you are not going to get 
to the people who need the support, who need the help, who need 
the loans, who need the services.
    So, the question is not whether the algorithms can achieve 
fairness. Yes, they can. But is the cost that comes with it 
acceptable to society and to the values that we care about?
    Chairman Foster. Yes, Ms. Williams?
    Ms. Williams. I would also add that this goes back to the 
point that I made about a narrative looking for facts. We want 
to be careful that, to Dr. Kearns' point, I think solving for 
the outcome is actually probably most effective. The inputs are 
very important, yes, but also you are typically picking those 
inputs because there is a desired outcome that you want, and 
that is why you are choosing the data sets that you are 
choosing.
    There also needs to be an element of making sure that you 
are examining and auditing the human behavior that is 
responsible for the decision-making based off of that output as 
well. It isn't enough to simply look at just the model and the 
inputs, but it is looking at the output, choosing to solve for 
the desired output, and then looking at the human decision-
making behind how that comes to be.
    Chairman Foster. And the issue with black box testing, where 
you cannot look at the details of the algorithms: is that an 
appropriate stance for us to take in regulating this? This is 
something that we run into in things like regulating high 
frequency trading, where they are very protective of the source 
code for their trading, and they say: Just look at the trading 
tapes, and look at our behavior, and don't ask us how we come 
to that behavior.
    Is that going to end up being sufficient here, or would the 
regulators have to look at the guts of the algorithm? Dr. 
Thomas?
    Mr. Thomas. That will depend on the chosen definition of 
fairness. If the definition of fairness is that you don't look 
at a feature like race, which is the kind that Professor Kearns 
is arguing against, if it was that kind of definition, you may 
need to look at the algorithm, because it could be looking at 
some other features that make it act as though it was looking 
at that protected attribute.
    But if you are looking at a definition of fairness, like 
the ones Professor Kearns is promoting, things like equalized 
odds or demographic parity, which require false positive and 
false negative rates to be bounded, those you could test in 
the black box way, looking at the behavior of the system and 
then determine if it is being fair or not, without looking at 
the code for the algorithm.
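    [A minimal Python sketch of such a black-box check: the auditor 
only needs to query the model for decisions, not read its source 
code. The decision rule, applicant fields, and approval-rate 
comparison below are illustrative assumptions.]

    def opaque_model(applicant):
        # Stand-in for a proprietary model the auditor cannot
        # inspect; only its decisions are observed.
        return applicant["income"] >= 40000

    applicants = [
        {"group": "men", "income": 52000},
        {"group": "men", "income": 41000},
        {"group": "women", "income": 45000},
        {"group": "women", "income": 39000},
    ]

    # Query the model and compare approval rates by group.
    approval_rate = {}
    for group in sorted({a["group"] for a in applicants}):
        subset = [a for a in applicants if a["group"] == group]
        approved = sum(opaque_model(a) for a in subset)
        approval_rate[group] = approved / len(subset)

    print("approval rate by group:", approval_rate)
    print("demographic parity gap:",
          abs(approval_rate["men"] - approval_rate["women"]))
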
    Chairman Foster. Yes, Dr. Kearns?
    Mr. Kearns. I think one could go a long way with black box 
testing. It is always better to be able to see source code. I 
think it is also important to remember that sometimes when we 
talk about algorithms or models, we are oversimplifying.
    A good example is advertising results on Google. Underneath 
advertising results on Google is, indeed, a machine learning 
model that tries to predict the likelihood that you would click 
on an ad, and that goes into the process of placing ads. But 
there is also an auction being held for people's eyeballs and 
impressions, and these two things interact.
    For instance, there have been studies showing that 
sometimes gender discrimination in the display of STEM 
advertising in Google is not due to the underlying machine 
learning models of Google but rather to the fact that there is 
a group of advertisers willing to outbid STEM advertisers for 
female impressions.
    Chairman Foster. I will now have to bring the gavel down on 
myself for exceeding my time, and recognize the distinguished 
ranking member, Mr. Loudermilk, for 5 minutes for questions.
    Mr. Loudermilk. Thank you, Mr. Chairman.
    And thank you all for your incredible testimony. Spending 
30 years in the information technology sector, I have learned 
one thing, which is, if you are going to take a scientific 
approach to anything, you can't use your own bias, but you have 
to suspect bias. Many times, I have dealt with cybersecurity 
issues, programming security on physical networks, and it 
doesn't work the way I thought it was supposed to work.
    Several times, I went in to check myself, and found out 
that what I suspected was supposed to happen isn't what was 
happening. In other words, my own bias of, I program it for 
this outcome, but the machine was actually giving me the proper 
outcome.
    The only reason I say that is, if you are going to take a 
scientific approach, you have to check your own bias as well. 
So, when I ask some questions here, don't interpret what I am 
trying to say. I just have--we have to understand that we all 
have bias. And we also have to look in--are there occasions 
when the output isn't what we expected, but it is the right 
output? And the only reason I am going down this path is because 
I want to ask some questions just to try to help us get to where 
we are seeing the bias.
    And I am not going into questioning--or making a statement 
that, yes, AI is perfect, and it is working the way it should 
be, that there is anything wrong with the testimonies. I think 
as a community, we have to come together and we realize that 
this is the future we are going to and we have to get things 
right. And so I just wanted to say that, that if I ask 
questions, don't take it that I am trying to question the 
validity of what you are telling me. I just need to dig a 
little deeper into some of this.
    Ms. Henry-Nickie, as we are all concerned about potential 
bias in algorithms, we know from a scientific approach that 
humans have much more potential for bias than machines, if 
properly utilized and programmed. And I think that is what we 
are getting.
    Ms. Williams said something in her testimony that just 
highlighted--I just kind of want to step through some things to 
see if we can really drive in to where the issue exists. In her 
testimony, she was talking about home mortgage disclosures, and 
it showed that--and I believe, if I am right, Ms. Williams, 
this was AI approving home mortgages, is that correct--and I 
think it was like only 81 percent of Blacks were approved, and 
76 percent of Hispanics were approved.
    So my question, Dr. Henry-Nickie, is, how do we know that 
those numbers weren't correct? In other words, was a 19 percent 
disapproval of Black borrowers and 24 percent of Latinos 
outside of what would normally we see if it wasn't through an 
algorithm?
    Ms. Henry-Nickie. It is difficult to answer that question 
without looking at the algorithms, but I will tell you that it 
is not fair to assess what a proper outcome should be. The 
context matters.
    Mr. Loudermilk. Right.
    Ms. Henry-Nickie. And so, if the market bears an average 
denial rate of 19 percent, then that is the market. And if all 
groups--Hispanics, African Americans, and white borrowers--are 
being denied at systematically similar rates, then that is an 
outcome that I don't think we can argue with. What is 
troublesome or concerning in that kind of example would be a 
model that is systematically denying minority borrowers, and 
having that be based on their race or predicted by their race.
    Mr. Loudermilk. Right.
    Ms. Henry-Nickie. So I think it is--and we have all said it 
on the panel--looking squarely for computational, technical 
solutions is part of the answer, but it is not the complete 
answer. We need a systematic approach to making sure that we 
can understand what is going on in these algorithmic 
applications and, from there, to monitor effects and, most 
importantly, processes.
    Mr. Loudermilk. And so, when it comes to testing AI 
platforms, it is not just the algorithm. There is a whole lot 
of emphasis on the algorithm, which is a mathematical equation. 
That is one part of a four-part test that we need to do: the 
appropriateness of the data, the quality of the data, and the 
availability of the data. You also have cognitive input systems 
that have to be considered if it is using facial identification 
for something--is that actually operating correctly?
    The reason I am asking the questions is to say, are we 
focused on an algorithm when the problem may actually be in the 
data or the appropriateness of the data, if--and we will just 
make that assumption for this argument--the output of the AI 
system is wrong? But I also think we have to have empirical 
data to prove that the output is wrong and that it is not our 
own bias. And I am not suggesting that that is what it is, but 
from a scientific approach, we have to do that. In a forensic 
way, if we are going to find out where the problem is, we have 
to consider all of that.
    If we have a second round, I will have more questions. 
Thank you, Mr. Chairman.
    Chairman Foster. I anticipate that we will. The gentleman 
from Missouri, Mr. Cleaver, who is also the Chair of our 
Subcommittee on National Security, International Development 
and Monetary Policy, is recognized for 5 minutes.
    Mr. Cleaver. Thank you, Mr. Chairman, and thank you for 
holding this hearing. Dr. Thomas, what is AI? Can you, as 
quickly, as short a definition as you--
    Mr. Thomas. Unfortunately, it is a poor definition, but I 
view AI as just a research field that contains a lot of 
different directions toward making machines more intelligent 
so that they can solve problems that we might associate with 
intelligent behavior.
    Mr. Cleaver. Machine intelligence?
    Mr. Thomas. Yes.
    Mr. Cleaver. Okay. So, if Netflix begins to have showings 
for certain viewers, customers, and they know what movies and 
shows that I would most likely enjoy, what determined that? How 
did they get that information? Is that AI?
    Mr. Thomas. Yes. Typically, that would be machine learning, 
which is a subfield of AI, that uses data collected from 
people, for example, to make decisions or predictions about 
what those people will like in the future.
    Mr. Cleaver. Okay. Thank you. For any of our witnesses, I 
was on the committee when we had the economic collapse in 2008, 
and witness after witness testified clearly, unambiguously, 
that there was great intentionality in the discrimination in 
mortgages with Black and Brown people. They admitted it. Can 
AI, Ms. Williams, eliminate that or confuse it even more?
    Ms. Williams. It has the potential to do both. I'm sorry; I 
am giving you a very lawyerly answer, right? It depends. It 
literally can do both. My concern--the ranking member made a 
comment, in regard to his question, around how do we know that 
this isn't the right answer based on the data that is received? 
Well, the answer to that, I would say, which is also analogous 
to your question, is that if you are using historical data, the 
historical data is already biased.
    So, if we are talking about something that is based on 
redlining or something that is based on income of women or 
income of Black people in particular, we know that we are 
historically underpaid, even if we have the same credentials 
and qualifications and experience.
    So, if you are using bad inputs, you are going to get bad 
outputs. It is very akin to what Congressman Foster said: 
Garbage in, garbage out. So it has the potential to solve for 
it if you are also being cognizant of the fact that not all 
biases are bad. There may be some ways to solve for it, 
particularly the human decision-making element at the end, of--
when you get the output. But the inputs also need to be 
completely vetted and understood as well. So, again, if you are 
using something that is based off of old redlining data, that 
is already going to skew your results.
    Mr. Cleaver. And to any of you, one of the most dangerous 
things, I contend, having grown up in the deep South, is 
unconscious bias. There would be people who would, without any 
hesitation or reservation, declare that, I have designed this 
machine and the algorithms are completely unbiased. Is that 
even possible? Anybody? Yes, sir, Mr. Ghani?
    Mr. Ghani. No. I don't think anybody is trained or 
certified today to the level where they can guarantee that an 
algorithm is unbiased. And I think, again, the focus on the 
algorithm is misleading. I think it is important to remember 
the algorithm doesn't do anything by itself.
    Mr. Cleaver. Yes.
    Mr. Ghani. You tell it what to do. So, if you tell it to 
replicate the past, that is exactly what it will do. You can 
take bad data but tell it, ``Don't replicate the past; make it 
fair; here is what I mean by fair.'' And even if that doesn't 
work, as my fellow panelists were saying, we don't have to do 
exactly what the algorithm says when we make decisions based on 
the algorithm's recommendation. We can override it in certain 
cases--when the algorithm gives us the right explanation, which 
we need--and/or reinforce what it is doing, based on what our 
societal outcomes are.
    So we need training for regulators to understand these 
nuances, because today we don't have that capacity inside 
agencies to understand this, implement it, and enforce these 
types of regulations that should exist regardless of AI. What 
we are talking about is not about AI. It is about societal 
values that should exist in every human decision-making 
process.
    We are just talking about it today because the scale and 
the risks might be higher, but it is the same conversation that 
should have been happening continuously.
    Mr. Cleaver. My time has run out, but we had someone before 
this committee once who declared that he had never seen any 
discrimination and didn't know anything about it--and he was 
60 years old--but he said he knew some people who had. Thank 
you.
    Chairman Foster. Thank you.
    And the gentleman from Virginia, Mr. Riggleman, is 
recognized for 5 minutes.
    Mr. Riggleman. Thank you, Mr. Chairman.
    Thank you, everybody, for being here. I had a whole list of 
questions, but now that I have heard you all, I am just going 
to ask some cool things.
    Dr. Thomas, I was really impressed by your thoughtful words 
about contextual bandits. When I did this, I had to worry about 
technical or assumed bandits because we actually tried to 
template human behavior for node linking or information sharing 
and how they actually put that data together, and we had two or 
three people. And, by the way, when we templated each other's 
behavior, it was completely different. It was fantastic. But 
that is the algorithm we tried to do.
    So, I have a question for you on these contextual bandits 
because, as soon as you said that, I thought, oh, goodness, I 
have never heard that term, specifically. We always just called 
them screw-ups.
    Is there a list of contextual bandits that might be 
overlooked or not seen as egregious, and is there a prioritized 
set of rule set errors that you and your team or others have 
identified that we can point to and go look at, because, for 
instance, we had our huge list of errors that we had in our 
algorithmic rule sets that we were building through machine 
learning, but is there any--have you identified this list, or 
is there a list that we can see as far as those contextual 
bandits you are talking about?
    Mr. Thomas. I think we may have a miscommunication on the 
term, ``contextual bandits.'' By contextual bandit, I mean the 
machine learning paradigm where you make a decision based on a 
feature vector and then get a reward in return for it, and you 
optimize.
    Is that the same usage of the phrase that you are using?
    Mr. Riggleman. A little different, nope. You are right, 
because when you said, ``contextual bandit,'' I'm thinking 
about a bandit where you had a faulty piece of data put into 
your rule set, and that faulty piece of data came from 
somebody's context and what that piece of data should do.
    So, let me reframe the question. Is there any way to 
identify or is there a playbook or a technical order on how to 
remove some of those contextual bandits that, say, we as a 
committee can see or we can refer to?
    Mr. Thomas. Unfortunately, I am not particularly familiar 
with the specific definition of contextual bandit that you are 
using, so I apologize. ``Bandits'' in our setting refers to 
kind of like slot machines being called a one-armed bandit.
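    [Illustrative Python sketch: the contextual-bandit setting Dr. Thomas describes--observe a feature vector, choose an action, receive a reward, and improve--shown in a few lines. The feature dimensions, reward rule, and epsilon value are hypothetical assumptions, not material from the hearing record.]

# A minimal contextual-bandit sketch (hypothetical data): each round we see a
# feature vector, choose an action, observe a reward, and update our estimates.
import numpy as np

rng = np.random.default_rng(0)
n_features, n_actions, epsilon = 3, 2, 0.1

# One ridge-regression-style estimator per action.
A = [np.eye(n_features) for _ in range(n_actions)]     # Gram matrices
b = [np.zeros(n_features) for _ in range(n_actions)]   # reward-weighted sums

def choose(context):
    """Epsilon-greedy: usually pick the action with the highest predicted reward."""
    if rng.random() < epsilon:
        return int(rng.integers(n_actions))
    estimates = [np.linalg.solve(A[a], b[a]) @ context for a in range(n_actions)]
    return int(np.argmax(estimates))

def update(action, context, reward):
    """Fold the observed (context, reward) pair into that action's estimator."""
    A[action] += np.outer(context, context)
    b[action] += reward * context

# Simulated interaction loop: the "true" reward rule is known only to the simulator.
for t in range(1000):
    context = rng.normal(size=n_features)
    action = choose(context)
    true_reward = 1.0 if (context[0] > 0) == (action == 1) else 0.0
    update(action, context, true_reward + rng.normal(scale=0.1))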
    Mr. Riggleman. Oh, okay. I thought you were talking about 
pieces of data within it. I am sorry about that because I am 
using ``contextual bandits'' from now on. That is the greatest 
term I have heard in a long time.
    And then, Dr. Kearns, I was listening to what you were 
saying. Where have improvements in removing bias been most 
noticeable when you are looking at building these rule sets? 
Where have you seen that we have made the most improvement 
right now, and, again, is there something that I can go see, 
because I know the issues that we had in the DOD? Where can I 
go see where the most improvements are in removing bias and a 
way forward for us as we do this?
    Mr. Kearns. Yes. I guess, in my opinion, there is quite a 
bit of science on algorithmic fairness, and we sort of broadly 
know how to make things better right now, but it is, in my 
view, early days in terms of actual adoption, and I think one 
of the problems with adoption is that, for instance, even 
though many of the large tech companies have small armies of 
Ph.D.'s who think specifically about fairness and privacy 
issues in machine learning, there have been relatively few 
actual deployments into kind of critical products at those 
companies, and I think that is because of the costs that I and 
my fellow panelists have already mentioned, right?
    If you impose a fairness constraint on Google advertising 
or on lending, that will inevitably come at a cost to overall 
accuracy. And so, in lending, a reduction in overall accuracy 
is either going to be more defaults or fewer loans granted to 
creditworthy people who would have generated revenue.
    I think the next important step is to sort of explain to 
companies, either by coercion or encouragement, that they need 
to think carefully about these tradeoffs, and that we need to 
start talking about making these tradeoffs quantitative and 
kind of acceptable to both the industry and to society.
    Mr. Riggleman. And I think, Dr. Henry-Nickie, when you were 
talking about this--now that we have gotten to tradeoffs--do 
you feel that it can go the other way? Can we have too many 
tradeoffs when it comes to bias? And can we insert things in 
there that might not be real, based on a political decision? I 
think that is the thing that everybody here wants to keep out 
of this. Where is that line between making sure--do we have an 
algorithm writing on an algorithm for fairness, which is what 
we try to do, to write an algorithm to crosscheck our 
algorithms, or do we have to be very careful about what we 
identify as bias or fairness when we are making these rule 
sets? And where is that tradeoff? Can we go too political, 
where it doesn't become fair, based on the fact that we are too 
worried about what fairness looks like?
    Ms. Henry-Nickie. I think it can become too political. When 
the CFPB tried to implement its BISG (Bayesian Improved Surname 
Geocoding) proxy methodology to make auto lending fair, it went 
extremely political and ended up screwing consumers.
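    [Illustrative Python sketch: a heavily simplified, hypothetical version of the BISG proxy idea referenced above, which combines surname-based and geography-based race/ethnicity probabilities. The probability tables and ZIP codes are invented for illustration; actual BISG uses Census surname lists and block-group data with a full Bayesian update.]

# Heavily simplified sketch of a BISG-style proxy: combine surname-based and
# geography-based race/ethnicity probabilities into a single estimate.
# The tables below are hypothetical placeholders.

P_RACE_GIVEN_SURNAME = {
    "GARCIA":     {"hispanic": 0.90, "white": 0.05, "black": 0.05},
    "WASHINGTON": {"hispanic": 0.02, "white": 0.08, "black": 0.90},
}
P_RACE_GIVEN_GEO = {
    "90011": {"hispanic": 0.70, "white": 0.10, "black": 0.20},  # hypothetical ZIP
    "30310": {"hispanic": 0.05, "white": 0.15, "black": 0.80},
}

def bisg_proxy(surname, geo):
    """Return the normalized product of the two probability tables (a rough proxy)."""
    s = P_RACE_GIVEN_SURNAME[surname.upper()]
    g = P_RACE_GIVEN_GEO[geo]
    combined = {race: s[race] * g[race] for race in s}
    total = sum(combined.values())
    return {race: p / total for race, p in combined.items()}

print(bisg_proxy("Garcia", "90011"))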
    Mr. Riggleman. Yes.
    Ms. Henry-Nickie. And so, I think we have to step back 
collectively--regulators, the scientific community, consumer 
advocates, technologists, and public policy scholars--and try 
to think about, how do we create collective gradations of 
fairness that we can all agree with? It is not a hard-and-fast 
issue, and, as Dr. Thomas and Dr. Kearns said, you can have 
more fairness but end up hurting some groups in protected 
classes whom we wanted to be better off anyway, before the 
algorithms were imposed.
    Mr. Riggleman. I thank all of you for your thoughtfulness. 
I'm sorry. I know my time is almost up, but a little bit of 
time? I think it is up, right?
    Chairman Foster. There is an unofficial 40 seconds of slot 
time. So--
    Mr. Riggleman. Thank you.
    Chairman Foster. --you now have 18 seconds.
    Mr. Riggleman. You are a gracious man. Thank you, sir.
    Mr. Kearns. To just make one brief comment to make the 
political realities clear here: Pick your favorite specific 
mathematical definition of fairness and consider two different 
groups that we might want to protect by gender and by race. It 
really might be the case that it is inevitable that, when you 
ask for more fairness by race, you must have less fairness by 
gender, and this is a mathematical truth that we need to get 
used to.
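    [Illustrative Python sketch: a toy calculation, with hypothetical counts, related to Dr. Kearns' point. False-rejection rates among creditworthy applicants can be equal by race and equal by gender while still differing sharply across the crossed race-by-gender subgroups, which is one way fairness constraints on different groupings come into tension.]

# Toy illustration (hypothetical counts): parity by race and parity by gender
# can both hold while the crossed subgroups are treated very differently.
from collections import defaultdict

# (race, gender): (creditworthy applicants, falsely rejected)
counts = {
    ("white", "man"):   (10, 0),
    ("white", "woman"): (10, 2),
    ("black", "man"):   (10, 2),
    ("black", "woman"): (10, 0),
}

def false_rejection_rates(key_fn):
    total, rejected = defaultdict(int), defaultdict(int)
    for group, (n, fr) in counts.items():
        total[key_fn(group)] += n
        rejected[key_fn(group)] += fr
    return {k: rejected[k] / total[k] for k in total}

print("by race:  ", false_rejection_rates(lambda g: g[0]))   # white 0.10, black 0.10
print("by gender:", false_rejection_rates(lambda g: g[1]))   # man 0.10, woman 0.10
print("crossed:  ", false_rejection_rates(lambda g: g))      # ranges from 0.00 to 0.20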
    Mr. Riggleman. Thank you for that clarification. And thank 
you for your thoughtful answers. I appreciate it. And I yield 
back.
    Chairman Foster. The gentleman from Illinois, Mr. Casten, 
is recognized for 5 minutes.
    Mr. Casten. Thank you, Mr. Foster. I am just fascinated by 
this panel, and I find myself thinking that there is--I have 
deep philosophical and ethical questions right now that are 
really best answered in the context of a 5-minute congressional 
hearing, as all of our philosophers have taught us.
    I do, though, think there are some seriously philosophical 
questions here, and so I would like you just to think as big 
picture as you can, and hopefully as briefly as you can.
    First, Dr. Kearns, I was intrigued by your comment to Mr. 
Foster that we shouldn't define bias on the basis of inputs. I 
am just interested: Do any of the panelists disagree with that 
as a proposition?
    Okay. So, then, Dr. Kearns, help me out with the second 
layer. Is it more useful to define the bias in terms of outputs 
or in terms of how the outputs are used? Because I can imagine 
an algorithm that predicts that a crime is likely to occur 
at point X. I can imagine using that for good, to prevent the 
crime. I can imagine using that to trade against in advance of 
the crime and make money off of it.
    How would you define the point of regulation or internal 
control where we should define that bias?
    Mr. Kearns. That is a great question. But it is not an easy 
one.
    First of all, it can't possibly hurt to get the outputs 
right in the first place. Second, there are many situations in 
which the output is the decision. So, criminal sentencing is an 
example where, fortunately, still, the output of predictive 
models is given to human judges as an input to their decision-
making process, but lots of things in lending and other parts 
of consumer finance are entirely automated now. So there is no 
human who is overseeing that the algorithm actually makes the 
lending decision. There, you need to get the outputs right 
because there is no second point of enforcement.
    In general, I think, as per comments that people have made 
here already, it is true that we shouldn't become too 
myopically focused on algorithms and models only because there 
is generally a pipeline, right? There is a process to collect 
data from before, early in the pipeline, and there might be 
many steps that involve human reasoning down the line as well. 
But, to the extent that we can get the outputs fair and 
correct, that is better for the downstream process than not.
    Mr. Casten. So then, the point about these--hold on a 
second, because I have two more meaty questions, and, like I 
said, all of these are like Ph.D. thesis questions.
    Ms. Williams, you said in your comments that not all biases 
are bad. Do you have any really easy definition of how we would 
define good versus bad bias if we are going to go in and 
regulate this?
    Ms. Williams. That is a good question. It is giving me a 
college throwback idea.
    I guess it would be, if you have certain outputs that show 
disparate impact among groups or, let's say, reflect certain 
housing decisions over the course of, let's say, three 
generations, and if you somehow put that into your inputs, or 
if you use that as a human decision-maker who receives an 
output and you decide that is something that you are going to 
try to correct for or solve for, then perhaps that is an 
example of bias for good.
    Mr. Casten. Okay. So my last really meaty one--I am going 
to give you the really hard one, Dr. Henry-Nickie.
    Let us assume, stipulate that people will make decisions 
based on bias, they will make money off of the decisions based 
on bias, because they already have. We already know that is 
going to happen.
    From a regulatory perspective, what do you think is the 
appropriate thing to do after that has happened? Are they 
obligated to disclose? I know of cases where hedge funds have 
found that they were actually trading on horrible things in the 
world and the algorithm got out of control. Should they 
disclose that? Should they return the gains that they have had 
to that? Should they reveal the code?
    If you are the philosopher king or queen, what is the right 
way for us to respond to something, having agreed that it 
should never have happened?
    Ms. Henry-Nickie. Well, I think our current regulatory 
framework allows for that situation, and it allows us to 
revisit the issue, analyze, and understand who the population 
was that was hurt, what they look like, how much disgorgement 
we should go back and get in terms of redress for consumers. So 
I think it is completely appropriate to go back and ask--not 
ask, but right the harms for consumers who have been hurt.
    Mr. Casten. But doesn't that assume that they have already 
disclosed it? In my scenario, where my algorithm is predicting 
a crime and I figure out how to short the crime--
    Ms. Henry-Nickie. Disclosure does not absolve you of 
liability.
    Mr. Casten. But if you are not obligated to disclose, how 
are we ever going to find out as regulators that it happened?
    Ms. Henry-Nickie. I think that is a really good question. 
If you are not obligated to disclose, then we are in a Catch-
22, and then how do we find and identify and detect, and how do 
we hold them accountable? I think it is important for the CFPB, 
the DOJ, the OCC, and the Federal Reserve to have their 
enforcement powers intact and strengthened to be able to hold 
bad actors, regardless of intent, accountable for their 
decisions.
    Mr. Casten. Well, I am out of time. I yield back.
    And I am sorry, Mr. Ghani. I know your hand was up. Feel 
free to submit comments. And, if any of you have thoughts on 
that, feel free to submit them. Thank you so much.
    Chairman Foster. I believe it is likely that, if we don't 
have votes called, we will have a second round of questions.
    The gentleman from North Carolina, Mr. Budd, is recognized 
for 5 minutes.
    Mr. Budd. Thank you, Mr. Chairman. This is a fascinating 
conversation. Professor Ghani, was there something that you 
were--I think your hand was raised earlier. I have other 
questions for you, but if you wanted to clarify?
    Mr. Ghani. Yes, I wanted to go back to the first Ph.D. 
thesis that was talked about: Is it enough to get the outputs 
right, or is it important how those outputs are going to be 
used? And I think that that is probably the most critical 
question that has been asked today, because it doesn't matter 
what your outputs are if you don't act on them appropriately, 
right?
    Here is an example. You are not going to get all the 
outputs right, period. AI will never be good enough to get 
everything perfectly right. It is going to make mistakes. What 
mistakes are more important to guard against really depends on 
how those outputs are going to be used.
    If we predict somebody might commit a crime and the 
intervention we have is going and arresting them, that is a 
punitive intervention. False positives are bad--especially 
disproportionate false positives--much, much, much worse than 
missing people.
    If we predict that somebody is going to commit a crime, but 
they have a mental health need, and we are going to send out a 
mental health outreach team to help them, give them the support 
services they need, then missing people disproportionately, 
false negatives, are much, much, much worse than false 
positives.
    And so the intervention is what really decides how we 
design these algorithms, and it is not the output; we can have 
the same output. Different interventions will require different 
notions of what to optimize for and of the impact of the bias 
on society. So, I want to make that distinction clear--
    Mr. Budd. Thank you.
    Mr. Ghani. --because it does matter quite a bit.
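    [Illustrative Python sketch: a minimal, hypothetical scoring of the tradeoff Mr. Ghani describes. The same predictions are costed differently depending on the intervention--false positives are weighted heavily for a punitive intervention, false negatives for an assistive one. The weights and labels are assumptions.]

# Sketch (hypothetical numbers): the same predictions are scored differently
# depending on the intervention they feed.

def confusion(y_true, y_pred):
    tp = sum(t and p for t, p in zip(y_true, y_pred))
    fp = sum((not t) and p for t, p in zip(y_true, y_pred))
    fn = sum(t and (not p) for t, p in zip(y_true, y_pred))
    return tp, fp, fn

def cost(y_true, y_pred, fp_weight, fn_weight):
    _, fp, fn = confusion(y_true, y_pred)
    return fp_weight * fp + fn_weight * fn

y_true = [1, 1, 1, 0, 0, 0, 0, 1]   # who actually needed the intervention
y_pred = [1, 0, 1, 1, 0, 0, 0, 0]   # who the model flagged

punitive_cost  = cost(y_true, y_pred, fp_weight=10, fn_weight=1)   # e.g., arrest
assistive_cost = cost(y_true, y_pred, fp_weight=1,  fn_weight=10)  # e.g., outreach
print(punitive_cost, assistive_cost)   # the same errors carry different costs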
    Mr. Budd. Thank you.
    And, in terms of using this AI for giving people credit, I 
think we can agree that giving consumers access to credit can 
fundamentally change their lives, and this is one tool that we 
are using that can help them do so. It allows consumers to buy 
a home, a car, pay for college, or start a small business. 
Using alternative data such as education level, employment 
status, rent, or utility payments has the potential to expand 
access to credit for all consumers, especially those on the 
fringes of the credit score range.
    A recent national online survey shows that 61 percent of 
consumers believe that incorporating their payment history into 
their credit files will ultimately improve their scores. The 
same survey also found that more than half of 
consumers felt empowered when able to add their payment history 
into the credit files, and they cited the ability to access 
more favorable credit terms as one of the biggest benefits of 
sharing their financial information.
    So can you further elaborate, Mr. Ghani, on how the use of 
alternative data expands access to credit for low- to moderate-
income consumers who would otherwise be unable to access that 
same credit?
    Mr. Ghani. Yes. I would go back to what Dr. Kearns was 
saying, that it is really not about the inputs, right? The 
sandbox we need to create is to enter those things in and then 
measure the outputs and then look at disparities in the rates 
at which you are going to offer loans or credit to people that 
you wouldn't have before.
    So, imagine our societal goal is that the lending decisions 
we want to make should serve to reduce or eliminate disparities 
in home ownership rates across, let's say, Black and white 
individuals, or minorities and white individuals. If that is 
the societal goal we want to have, then these inputs may or may 
not help us achieve that, and what we want to be able to do is 
to test that out, have a framework for testing it, validating 
it, certifying that it is actually doing that, and then put 
this into place after we have done trials, just like other 
regulatory agencies do.
    Starting with, if we put in these inputs, would it help? We 
don't know, but I think putting the right outcomes in place 
that you want to achieve and then testing it is the right 
approach to take.
    Ms. Henry-Nickie. I would add to that.
    Mr. Budd. I want to add an open question here--if you can, 
comment on the same thing, but then answer the open question--
and that is, what are ways that we can be more encouraging of 
the use of tools like alternative data and AI to raise access 
to credit and lower the overall costs for consumers? If there 
are ways that we can encourage that here, please.
    Ms. Henry-Nickie. I will take that question first.
    I think we have to be careful about experimenting with 
people's--consumers'--financial lives. I think a healthy way to 
discover what new products are out there might be through 
pilots, might be through continued active observation, and also 
vigilant oversight, as in the Upstart case.
    To your question before, how do rental payments, as 
alternative data, help to expand access to credit? For example, 
in some markets, rental payments are as high as a mortgage 
payment or even higher, and, if you, as a first-time home buyer 
about to enter into this process, have only had a rental 
payment history that is consistent, stable, and not late, then 
taking that feature and substituting it to stand in for a 
mortgage payment history--excuse me--could then push you above 
the margin to have the model predict that you were a good 
credit risk.
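    [Illustrative Python sketch: a hypothetical fallback in which a consistent, on-time rental payment history stands in for a missing mortgage payment history, as Ms. Henry-Nickie describes. The field names and weights are invented for illustration and are not an actual scoring model.]

# Sketch (hypothetical fields and weights): let on-time rent history stand in
# for a missing mortgage history when scoring a thin-file applicant.

def payment_history_score(applicant):
    """Use mortgage history if present; otherwise fall back to rent history."""
    if applicant.get("months_mortgage_on_time") is not None:
        on_time = applicant["months_mortgage_on_time"]
    else:
        on_time = applicant.get("months_rent_on_time", 0)   # alternative data
    late = applicant.get("late_payments_24mo", 0)
    return min(on_time, 24) * 2 - late * 15                  # toy weighting

thin_file_renter = {"months_rent_on_time": 24, "late_payments_24mo": 0}
print(payment_history_score(thin_file_renter))   # 48, same as a 24-month mortgage payer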
    Mr. Budd. Thank you, and I yield back.
    Chairman Foster. Thank you.
    The gentleman from Indiana, Mr. Hollingsworth, is 
recognized for 5 minutes.
    Mr. Hollingsworth. Good afternoon. I appreciate everybody 
being here. Certainly, were my wife here, she would tell you 
that I am far outside my circle of competence. So, I am going 
to ask a lot of really stupid questions and let you all give me 
really intelligent answers to those stupid questions.
    Can you clarify--the word ``fairness'' has been thrown 
around a lot. Can you clarify what you mean by fairness, the 
five of you? Have at it. Dr. Kearns, Ms. Williams, Dr. Thomas, 
everybody, anybody?
    Ms. Williams. Okay. I will go first.
    Mr. Hollingsworth. Okay.
    Ms. Williams. For me, I look at fairness as ensuring that 
all groups have equal probability of being assigned favorable 
outcomes.
    Mr. Hollingsworth. All groups have equal probability of 
being assigned outcomes irrespective of their current 
situations, or all individuals similarly situated are assigned 
the same outcome--the same probability of outcomes?
    Ms. Williams. The latter.
    Mr. Hollingsworth. The latter. Okay.
    Dr. Kearns?
    Mr. Kearns. There are too many definitions of fairness, as 
we have already alluded to, but the vast majority of them begin 
with: the user has to identify what group or groups they want 
to protect and what would constitute harm to those groups. So, 
maybe it is a racial minority, and the harm is a false loan 
rejection--rejection for a loan that they would have repaid.
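    [Illustrative Python sketch: a toy comparison, with hypothetical counts, of two of the fairness notions raised in this exchange--equal approval rates across groups versus equal false-rejection rates among creditworthy applicants. The same set of decisions can satisfy the first and violate the second.]

# Toy comparison (hypothetical counts) of two fairness notions on one decision set.
groups = {
    # group: (applicants, creditworthy, creditworthy approved, others approved)
    "A": (10, 6, 5, 0),
    "B": (10, 8, 5, 0),
}

for name, (n, worthy, approved_worthy, approved_other) in groups.items():
    approval_rate = (approved_worthy + approved_other) / n
    false_rejection_rate = (worthy - approved_worthy) / worthy
    print(name, round(approval_rate, 2), round(false_rejection_rate, 2))
# A: approval 0.5, false rejection 0.17
# B: approval 0.5, false rejection 0.38  -> parity by one definition, not the other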
    Mr. Hollingsworth. You have clearly short-circuited to what 
I was getting at, which is, we have a lot of senses of fairness 
and a lot of senses of what we want done, but the requirement 
in AI and algorithms is that we make explicit that which is 
right now implicit, right, and you have to be very good at 
making that explicit because the algorithm itself is going to 
optimize for what you tell it to optimize for, right? And so, 
you are going to have to make very clear what you were trying 
to optimize for in order to get that outcome, and then, to your 
point, what side you were unwilling to live with, right? I am 
unwilling to live with the extra risk on this side or perhaps 
that side depending on what situation you are in.
    So, not only do you have to have a lot of awareness about 
exactly what you want to optimize for, but also a lot of 
awareness about, in the context, what you are really worried 
about and what you are concerned about, the false positives or 
the other side of it.
    Dr. Thomas?
    Mr. Thomas. I absolutely agree. I missed what the precise 
question--
    Mr. Hollingsworth. No. I saw you nodding your head and I 
didn't know if you had a comment to the previous question about 
fairness.
    Mr. Thomas. I am generally just in agreement that you are 
hitting on very good points, and--
    Mr. Hollingsworth. I shall take that back to my wife. Maybe 
my circle of competence is bigger than I thought it was.
    Mr. Thomas. You are hitting on the point that there are 
many different definitions of fairness. The question of which 
one is right and nailing it down is very important.
    Mr. Hollingsworth. Yes.
    Mr. Thomas. And something that I think you might be kind of 
dancing around is this idea that the negative outcomes that are 
consistent with different definitions of fairness can often all 
seem bad. There can be two different definitions of fairness, 
and, if we pick one, it means we are saying that the 
undesirable, unfair behavior of the other is necessarily okay.
    Mr. Hollingsworth. Yes. And I think Dr. Kearns talked a 
little bit about this earlier, and it is something that puzzles 
me a lot because I think, in some places, the tradeoff in 
fairness for one group may mean less fairness in the other. Did 
you say that, Dr. Kearns?
    Mr. Kearns. I did.
    Mr. Hollingsworth. Yes. And this is something that others 
have hit on as well, that we are going to have to grow 
comfortable saying to ourselves that we are going to trade 
fairness here for fairness there--and not just more fairness 
for perhaps less accuracy in the model itself, which is 
something we have had more comfort in. But trading fairness and 
risk to a certain group is something we have been really 
uncomfortable with because we want fairness for everybody in 
every dimension, which seems--I don't want to say impractical, 
but it seems challenging inside an AI algorithm in 
optimization.
    Mr. Kearns. I would say--
    Mr. Hollingsworth. Do you agree with this?
    Mr. Kearns. --that is, in fact, impractical. And let me 
just, while we are in the department of bad news, also point 
out that all of these definitions we are discussing are 
basically only aggregate definitions and only provide 
protections at the group level.
    Mr. Hollingsworth. Right.
    Mr. Kearns. So, for instance, you can be fair, let's say, 
by race in lending. And if you are a Black person who was 
falsely rejected for a loan, your consolation is supposed to be 
the knowledge that white people are also being falsely rejected 
for loans at the same rates. There is literature on individual 
notions of fairness--definitions of fairness that try to make 
promises to particular people--but they are basically 
impractical and infeasible. It is sort of a theoretical 
curiosity, but no more.
    Mr. Hollingsworth. Yes. I appreciate that.
    Each of you has talked a little bit about the pipeline--
that AI algorithms aren't birthed in the ether, right, that 
they rely on data, A; and, B, that individuals craft these. I 
wonder if you might talk a little bit about the biases that we 
are talking about: are they more likely to arise from the 
algorithm itself, or are they likely to arise from the coder or 
the drafter of said algorithm, or are they likely to arise from 
the data that is being input into them? Where should we look 
first if we are going to look through that pipeline? Ms. 
Williams?
    Ms. Williams. I would say look on the human level first, 
because a human is going to discern what is the narrative that 
they are actually solving for, and then therefore, what is the 
data that they are going to use, and they discern the quality 
of data that is used, and they then discern the training set 
that is created and how that is functional. I also want to be 
clear that I don't think that there are a bunch of mad coders 
sitting in a basement somewhere.
    Mr. Hollingsworth. Yes. The fair expectations of society.
    Ms. Williams. I don't think that is it. It is very--you 
don't know what you don't know.
    Mr. Hollingsworth. Yes. I agree.
    Ms. Williams. And I think, oftentimes, if people pick data 
that is available to them, they may not do a ton of due 
diligence to find additional data or data that may even offset 
some of the data that they already have. But I would say, start 
at the human level first, because that is where everything else 
sort of begins in terms of picking the data, the quality of 
data, and then actually doing the coding.
    Mr. Hollingsworth. Yes. Thank you.
    With that, I yield back, Mr. Chairman.
    Chairman Foster. Thank you.
    And now, I guess we have time for a quick second round of 
questions. Votes are at 3:30, and we have nerves of steel here, 
I'm learning, so we will give it a try here.
    So, I will recognize myself for 5 minutes.
    I would like to talk about the competitive situation that 
would happen when you have multiple companies, each running 
their own AI and, say, offering credit to groups of people.
    If you just tell them, ``Okay, maximize profits,'' that is 
a mathematically well-defined way to program your AI, and they 
would all do it identically, and the competition would work out 
in an understandable way.
    Now, if you impose a fairness constraint on these, first 
off, that will reduce the profitability of any firm that you 
impose the fairness constraint on, so they are not simply 
maximizing profits. And then a new competitor may come in and 
say, oh, there is a profitable opportunity to cherry-pick the 
customers that your fairness constraint has caused you to 
exclude. Is that a mathematically stable competition? Has that 
been thought about? Do you understand the problem I am talking 
about?
    Mr. Kearns. If I understand you correctly, there is 
literature in economics on whether, for instance, racial 
discrimination in hiring can actually be a formal equilibrium 
of the Nash variety.
    Chairman Foster. Maybe that is a good description.
    Mr. Kearns. Gary Becker was a very famous economist who did 
a lot of work in the 1960s and 1970s on exactly this topic, and 
it is complicated, but the top-level summary of his work is 
that the argument that you can't have discrimination in hiring 
at equilibrium--because you wouldn't be competitive, because 
you would be irrationally excluding some qualified sectors of 
the job market--doesn't hold. He actually shows that, in fact, 
you can have discrimination even at equilibrium.
    Chairman Foster. So, one of the questions would be whether 
you are better off actually having multiple players here. So, 
if someone is erroneously excluded because of some quirk in 
some model, then it would be to the advantage of society 
overall to have multiple players, so that person could go to a 
second credit provider?
    Mr. Kearns. You are asking kind of the reverse of Becker's 
question, which is, if you don't have sort of regulatory 
conditions on antidiscrimination, for instance, might there be 
arbitrage opportunities for new entrants? I don't know that 
that question has specifically been considered, but it is a 
good question.
    Chairman Foster. Yes, Mr. Ghani?
    Mr. Ghani. I think one thing I would point out is that the 
premise that, if you put those constraints there, the profits 
will go down--that is not a guarantee. We don't know that, and 
here is why, right? I think it was Dr. Kearns who was talking 
about how there are a lot of people who just--we don't know 
what happens with somebody who was never given a loan before, 
the type of person who was never given loans before. What 
happens when you give them a loan, right?
    So it could be that, when you start adding these fairness 
constraints, it turns out that you don't actually lose profits, 
and, in fact, you might increase profits. These are things 
called counterfactuals, where, because you have never given 
loans to people like this, you don't know what the outcomes 
are. The human decision-making process that existed before was 
only giving loans to people they thought were going to pay back 
loans.
    Chairman Foster. That has to do with the exploratory phase 
of programming your neural--
    Mr. Ghani. That is correct.
    Chairman Foster. --to actually do random, crazy stuff 
because you may discover a pocket of consumers--
    Mr. Ghani. Hopefully better than random, crazy stuff, but 
some smarter version of that, yes.
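    [Illustrative Python sketch: one hypothetical form of the exploration discussed here--approving a small randomized slice of applicants the incumbent rule would reject, so that repayment outcomes for that population can be observed and the model improved. The threshold and exploration rate are assumptions.]

# Sketch (hypothetical policy): approve a small randomized slice of otherwise-
# rejected applicants so repayment outcomes ("counterfactuals") can be observed.
import random

random.seed(0)
EXPLORE_RATE = 0.05   # fraction of otherwise-rejected applicants to approve

def decide(score, threshold=0.6):
    if score >= threshold:
        return "approve"
    # Otherwise-rejected applicant: occasionally approve to gather outcome data.
    if random.random() < EXPLORE_RATE:
        return "approve (exploratory)"
    return "reject"

decisions = [decide(s) for s in (0.72, 0.55, 0.40, 0.59, 0.81)]
print(decisions)   # most low scores rejected; a few approved for learning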
    Chairman Foster. Yes. And so, this question of similarly 
situated people, that depends on the scope of the data that you 
are looking at, because two people can look similarly situated 
if you only look at their family and their personal history, 
and then, if you look at a wider set of things--I think this is 
what came up with Apple and Goldman where, if you just looked 
at one-half of a couple's credit information, you would give a 
different credit limit on a credit card, I think it was, 
whereas, if you look holistically at both halves of a couple, 
you get a different answer. And there is no obvious right 
answer to how wide you should spread your field of view here.
    Is that an unresolvable problem that you are going to need 
Congress to weigh in on? Yes?
    Mr. Ghani. I think this is exactly why these systems need 
people in the middle, but, also, these systems need 
collaborative processes upfront, including the people who are 
going to be impacted by them. If you start including those 
communities--and there is actually really good work on this. 
There is a group in New Zealand that has been looking at, how 
do we incorporate community input into designing these types of 
algorithms? What input attributes do we use that best represent 
the differences and similarities across these communities?
    So it is going to be hard to automate that today, but I 
think that is the process we need, which is to include the 
community that is being impacted, and humans in the loop, in 
the system, coming up with some of these things and 
collaborating with the machines.
    Chairman Foster. Well, okay. That sounds ambitious. I am 
just trying to think of assembling groups that are sufficiently 
knowledgeable about the nuts and bolts of this and to have--and 
where you are balancing the people who wind up winning and 
losing according to the tradeoffs you are going to be making.
    Mr. Ghani. The challenge is that the amount of data you 
have on people is also a function of who they are. Some people 
are more reluctant or less reluctant to give data about 
themselves. They may have less of a history. Immigrants are 
coming in who don't have a background or credit history, so 
there is missing information. It is not as though you always 
have the data and can just get it and compare. You might not 
have that data, and that is also a bias in the data collection 
process.
    Chairman Foster. Okay. I will gavel myself down and 
recognize the ranking member for 5 minutes.
    Mr. Loudermilk. Thank you, Mr. Chairman.
    Unfortunately, we only have a few more minutes. I think we 
could all be here all day discussing this.
    Something Ms. Williams and Dr. Kearns said earlier has 
really been resonating: Not all bias is bad. We agree. In fact, 
if we take kind of the model we have been talking about, loan 
applications, whether it is a mortgage or not, the whole 
purpose of the AI platform is to be biased, right? That is the 
actual purpose, to be biased, but what is the bias that we 
want? We want those who are likely to pay a loan back; that is 
really what we are getting at. So I think I see what you are 
saying. There is some bias that you want in there.
    What is the bias we want to eliminate is really the 
question, and that goes to something Dr. Kearns said: if we 
reprogram it to make sure one racial group gets more approvals, 
then you may see a gender group impacted. And so, this is kind 
of a conundrum we are in until we figure out or define what 
bias we want in a particular system, but, more importantly, 
what we do not want.
    When I look at it as what do we not want, if Mr. Budd and 
myself are identical--I know that we are identical in income 
because I know what he does for a living, and that the law 
doesn't allow him to take any other income, right? But if he is 
Hispanic, I am white, and the chairman is Black, and we all 
have the same income, we all have the same assets, we all have 
basically the same biographical data, do we all get the same 
result, whether it is approval or disapproval? That is really 
what I think we are trying to get to.
    It isn't that we weren't happy with the result that came 
out, but we have to go back and find out why. And that is what 
we are getting at.
    Mr. Ghani, if it exists--and since algorithms are really a 
mathematical equation--I think part of the problem is, when you 
get into the machine learning and the algorithm begins to 
rewrite itself, how do we track it?
    We verify the data is good. I think most of the problems we 
have are probably in the data and the appropriateness of the 
data. Let me say not just in the raw data, but the 
appropriateness of the data.
    But if we do want to check the algorithms, is there a way 
of running what I would call in the network world an audit 
trail in the development of the algorithm, throughout the 
operation of the algorithm, and each phase of decision it is 
making, and the actual coding, and is there a way to go back 
and do a forensic audit trail on these algorithms?
    Mr. Ghani. Yes, absolutely, and I think that is the right 
approach. You can audit the data, and that is great, but then--
I think the starting point is you want to tell it what you want 
the system to achieve. Then, you want to turn those goals into 
technical requirements for the system, to tell it what to do, 
and then you want to confirm that it did what you told it to 
do, and then you want to test it and see, does it continue to 
do what it did yesterday, right?
    When you ask what we should require a company to disclose, 
the answer is not the algorithm. It is not the code. It is not 
the data. It is this entire audit trail, and that is what we 
need to look at to figure out where it is happening.
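    [Illustrative Python sketch: a minimal, hypothetical audit trail of the kind Mr. Ghani describes--logging the stated goal, a fingerprint of the data used, the training configuration, and each individual decision so the chain can be reviewed forensically. The schema and field names are assumptions.]

# Minimal sketch (hypothetical schema) of an audit trail spanning goal, data,
# training configuration, and individual decisions.
import hashlib, json, time

AUDIT_LOG = []

def log_event(stage, payload):
    record = {"time": time.time(), "stage": stage, "payload": payload}
    AUDIT_LOG.append(record)
    return record

def fingerprint(dataset_rows):
    """Hash the training data so auditors can confirm which data was used."""
    raw = json.dumps(dataset_rows, sort_keys=True).encode()
    return hashlib.sha256(raw).hexdigest()

log_event("goal", {"objective": "expand access", "fairness_metric": "false rejection rate"})
log_event("data", {"sha256": fingerprint([{"income": 40000, "repaid": True}])})
log_event("training", {"model": "logistic_regression", "threshold": 0.6})
log_event("decision", {"applicant_id": "12345", "score": 0.58, "outcome": "reject"})

print(json.dumps(AUDIT_LOG, indent=2))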
    Mr. Loudermilk. Well, that's an interesting aspect, and 
take it a step further. The difference between software and 
artificial intelligence is we expect software to give us the 
same result every time, right? That is not the case with 
artificial intelligence, correct, because artificial 
intelligence is always looking for other data, and it may give 
us a different outcome the next day based on something that 
changed the day before, and it may rewrite itself to learn new 
things.
    I think that is some of the challenge going forward: if you 
tell the machine that is not the right answer, it is going to 
look for a different answer in the future. This is stuff we 
wrote science fiction about just 10 years ago, right? So, when 
we code the algorithms themselves, can we actually program the 
artificial intelligence platform to do systematic reports 
throughout the process?
    Mr. Ghani. Absolutely.
    Mr. Loudermilk. Okay.
    Mr. Ghani. That should be standard. That should be part of 
our training programs for people who are building these 
systems. It should be part of training for auditors who are 
doing compliance. Absolutely, that is the right approach.
    Mr. Loudermilk. Okay. And the last part is probably more a 
statement than a question. In my opening statement, I talked 
about different analytical models. I think the one that 
concerns us the most is what I call the execution model. We 
have presentation of data. We have predictive analysis. We have 
prescriptive analysis that prescribes, okay, approve or don't 
approve. And we can do that, but, yet, there is a human element 
making the same decision.
    It is like the backup warning on my car that beeps and it 
says something is behind me. It doesn't stop the car. I still 
make the decision. But if you watch the Super Bowl, the Smart 
Park, right, it is actually making the decisions. In this case, 
it is the machine making the decision of go/no-go on the loan. 
It is executing on that, and I think, until we get this fixed, 
we may need to look at, is there an appeal process for that go/
no-go that a human element can go in and work?
    So, thank you, Mr. Chairman. It sounds like our warning 
bell is going off, and my time has expired.
    Chairman Foster. Thank you.
    I would like to also thank our witnesses for their 
testimony today.
    Without objection, the following letters will be submitted 
for the record: the Student Borrower Protection Center; Cathy 
O'Neil of O'Neil Risk Consulting & Algorithmic Auditing; the 
BSA Software Alliance; The Upstart Network, Incorporated; and 
Zest AI.
    The Chair notes that some Members may have additional 
questions for this panel, which they may wish to submit in 
writing. Without objection, the hearing record will remain open 
for 5 legislative days for Members to submit written questions 
to these witnesses and to place their responses in the record. 
Also, without objection, Members will have 5 legislative days 
to submit extraneous materials to the Chair for inclusion in 
the record.
    This hearing is now adjourned.
    [Whereupon, at 3:33 p.m., the hearing was adjourned.]

                            A P P E N D I X



                           February 12, 2020
                           
                           
[GRAPHIC(S) NOT AVAILABLE IN TIFF FORMAT]