b"<html>\n<title> - BIG DATA CHALLENGES AND ADVANCED COMPUTING SOLUTIONS</title>\n<body><pre>[House Hearing, 115 Congress]\n[From the U.S. Government Publishing Office]\n\n\n\n\n \n                        BIG DATA CHALLENGES AND \n                      ADVANCED COMPUTING SOLUTIONS\n\n=======================================================================\n\n                             JOINT HEARING\n\n                               BEFORE THE\n\n                        SUBCOMMITTEE ON ENERGY &\n                SUBCOMMITTEE ON RESEARCH AND TECHNOLOGY\n\n              COMMITTEE ON SCIENCE, SPACE, AND TECHNOLOGY\n                        HOUSE OF REPRESENTATIVES\n\n                     ONE HUNDRED FIFTEENTH CONGRESS\n\n                             SECOND SESSION\n\n                               __________\n\n                             JULY 12, 2018\n\n                               __________\n\n                           Serial No. 115-69\n\n                               __________\n\n Printed for the use of the Committee on Science, Space, and Technology\n \n \n \n \n[GRAPHIC(S) NOT AVAILABLE IN TIFF FORMAT] \n \n\n\n       Available via the World Wide Web: http://science.house.gov\n       \n       \n       \n                         _________ \n\n           U.S. GOVERNMENT PUBLISHING OFFICE\n                   \n30-879 PDF           WASHINGTON : 2018            \n       \n\n              COMMITTEE ON SCIENCE, SPACE, AND TECHNOLOGY\n\n                   HON. LAMAR S. SMITH, Texas, Chair\nFRANK D. LUCAS, Oklahoma             EDDIE BERNICE JOHNSON, Texas\nDANA ROHRABACHER, California         ZOE LOFGREN, California\nMO BROOKS, Alabama                   DANIEL LIPINSKI, Illinois\nRANDY HULTGREN, Illinois             SUZANNE BONAMICI, Oregon\nBILL POSEY, Florida                  AMI BERA, California\nTHOMAS MASSIE, Kentucky              ELIZABETH H. ESTY, Connecticut\nRANDY K. WEBER, Texas                MARC A. 
VEASEY, Texas\nSTEPHEN KNIGHT, California           DONALD S. BEYER, JR., Virginia\nBRIAN BABIN, Texas                   JACKY ROSEN, Nevada\nBARBARA COMSTOCK, Virginia           CONOR LAMB, Pennsylvania\nBARRY LOUDERMILK, Georgia            JERRY McNERNEY, California\nRALPH LEE ABRAHAM, Louisiana         ED PERLMUTTER, Colorado\nGARY PALMER, Alabama                 PAUL TONKO, New York\nDANIEL WEBSTER, Florida              BILL FOSTER, Illinois\nANDY BIGGS, Arizona                  MARK TAKANO, California\nROGER W. MARSHALL, Kansas            COLLEEN HANABUSA, Hawaii\nNEAL P. DUNN, Florida                CHARLIE CRIST, Florida\nCLAY HIGGINS, Louisiana\nRALPH NORMAN, South Carolina\nDEBBIE LESKO, Arizona\n                                 ------                                \n\n                         Subcommittee on Energy\n\n                   HON. RANDY K. WEBER, Texas, Chair\nDANA ROHRABACHER, California         MARC A. VEASEY, Texas, Ranking \nFRANK D. LUCAS, Oklahoma                 Member\nMO BROOKS, Alabama                   ZOE LOFGREN, California\nRANDY HULTGREN, Illinois             DANIEL LIPINSKI, Illinois\nTHOMAS MASSIE, Kentucky              JACKY ROSEN, Nevada\nSTEPHEN KNIGHT, California           JERRY McNERNEY, California\nGARY PALMER, Alabama                 PAUL TONKO, New York\nDANIEL WEBSTER, Florida              BILL FOSTER, Illinois\nNEAL P. DUNN, Florida                MARK TAKANO, California\nRALPH NORMAN, South Carolina         EDDIE BERNICE JOHNSON, Texas\nLAMAR S. SMITH, Texas\n                                 ------                                \n\n                Subcommittee on Research and Technology\n\n                 HON. BARBARA COMSTOCK, Virginia, Chair\nFRANK D. LUCAS, Oklahoma             DANIEL LIPINSKI, Illinois, Ranking \nRANDY HULTGREN, Illinois                 Member\nSTEPHEN KNIGHT, California           ELIZABETH H. 
ESTY, Connecticut\nBARRY LOUDERMILK, Georgia            JACKY ROSEN, Nevada\nDANIEL WEBSTER, Florida              SUZANNE BONAMICI, Oregon\nROGER W. MARSHALL, Kansas            AMI BERA, California\nDEBBIE LESKO, Arizona                DONALD S. BEYER, JR., Virginia\nLAMAR S. SMITH, Texas                EDDIE BERNICE JOHNSON, Texas\n                            C O N T E N T S\n\n                             July 12, 2018\n\n                                                                   Page\nWitness List.....................................................     2\n\nHearing Charter..................................................     3\n\n                           Opening Statements\n\nStatement by Representative Randy K. Weber, Chairman, \n  Subcommittee on Energy, Committee on Science, Space, and \n  Technology, U.S. House of Representatives......................     4\n    Written Statement............................................     6\n\nStatement by Representative Marc A. Veasey, Ranking Member, \n  Subcommittee on Energy, Committee on Science, Space, and \n  Technology, U.S. House of Representatives......................     8\n    Written Statement............................................     9\n\nStatement by Representative Barbara Comstock, Chairwoman, \n  Subcommittee on Research and Technology, Committee on Science, \n  Space, and Technology, U.S. House of Representatives...........    10\n    Written Statement............................................    11\n\nStatement by Representative Lamar Smith, Chairman, Committee on \n  Science, Space, and Technology, U.S. House of Representatives..    12\n    Written Statement............................................    13\n\nWritten Statement by Representative Eddie Bernice Johnson, \n  Ranking Member, Committee on Science, Space, and Technology, \n  U.S. House of Representatives..................................    15\n\nWritten Statement by Representative Daniel Lipinski, 
Ranking \n  Member, Subcommittee on Research and Technology, Committee on \n  Science, Space, and Technology, U.S. House of Representatives..    17\n\n                               Witnesses:\n\nDr. Bobby Kasthuri, Researcher, Argonne National Laboratory; \n  Assistant Professor, The University of Chicago\n    Oral Statement...............................................    19\n    Written Statement............................................    22\n\nDr. Katherine Yelick, Associate Laboratory Director for Computing \n  Sciences, Lawrence Berkeley National Laboratory; Professor, The \n  University of California, Berkeley\n    Oral Statement...............................................    31\n    Written Statement............................................    34\n\nDr. Matthew Nielsen, Principal Scientist, Industrial Outcomes \n  Optimization, GE Global Research\n    Oral Statement...............................................    47\n    Written Statement............................................    49\n\nDr. Anthony Rollett, U.S. Steel Professor of Materials Science \n  and Engineering, Carnegie Mellon University\n    Oral Statement...............................................    57\n    Written Statement............................................    59\n\nDiscussion.......................................................    66\n\n             Appendix I: Answers to Post-Hearing Questions\n\nDr. Bobby Kasthuri, Researcher, Argonne National Laboratory; \n  Assistant Professor, The University of Chicago.................    92\n\nDr. Katherine Yelick, Associate Laboratory Director for Computing \n  Sciences, Lawrence Berkeley National Laboratory; Professor, The \n  University of California, Berkeley.............................    97\n\nDr. Matthew Nielsen, Principal Scientist, Industrial Outcomes \n  Optimization, GE Global Research...............................   104\n\nDr. Anthony Rollett, U.S. 
Steel Professor of Materials Science \n  and Engineering, Carnegie Mellon University....................   113\n\n            Appendix II: Additional Material for the Record\n\nDocument submitted by Representative Neal P. Dunn, Committee on \n  Science, Space, and Technology, U.S. House of Representatives..   120\n\n\n                          BIG DATA CHALLENGES\n\n\n                    AND ADVANCED COMPUTING SOLUTIONS\n\n                              ----------                              \n\n\n                        THURSDAY, JULY 12, 2018\n\n                  House of Representatives,\n                         Subcommittee on Energy and\n           Subcommittee on Research and Technology,\n               Committee on Science, Space, and Technology,\n                                                   Washington, D.C.\n\n    The Subcommittees met, pursuant to call, at 10:15 a.m., in \nRoom 2318, Rayburn House Office Building, Hon. Randy Weber \n[Chairman of the Subcommittee on Energy] presiding.\n\n[GRAPHIC(S) NOT AVAILABLE IN TIFF FORMAT]\n\n\n\n    Chairman Weber. The Committee on Science, Space, and \nTechnology will come to order.\n    Without objection, the Chair is authorized to declare \nrecess of the Subcommittees at any time.\n    Good morning, and welcome to today's hearing entitled ``Big \nData Challenges and Advanced Computing Solutions.'' I now \nrecognize myself for five minutes for an opening statement.\n    Today, we will explore the application of machine-learning-\nbased algorithms to big-data science challenges. 
Born from the \nartificial intelligence--AI--movement that began in the 1950s, \nmachine learning is a data-analysis technique that gives \ncomputers the ability to learn directly from data without being \nexplicitly programmed.\n    Generally speaking--and don't worry; I'll save the detailed \ndescription for you all, our expert witnesses--machine learning \nis used when computers are trained--more than husbands are \ntrained, right, ladies--on large data sets to recognize \npatterns in that data and learn to make future decisions based \non these observations.\n    Today, specialized algorithms termed ``deep learning'' are \nleading the field of machine-learning-based approaches. These \nalgorithms are able to train computers to perform certain tasks \nat levels that can exceed human ability. Machine learning also \nhas the potential to improve computational science methods for \nmany big-data problems.\n    As the Nation's largest federal sponsor of basic research \nin the physical sciences with expertise in big-data science, \nadvanced algorithms, data analytics, and high-performance \ncomputing, the Department of Energy is uniquely equipped to \nfund robust fundamental research in machine learning. The \nDepartment also manages the 17 DOE national labs and 27 world-\nleading scientific user facilities, which are instrumental to \nconnecting basic science and advanced computing.\n    Machine learning and other advanced computing processes \nhave broad applications in the DOE mission space from high \nenergy physics to fusion energy sciences to nuclear weapons \ndevelopment. Machine learning also has important applications \nin academia and industry. In industry, common examples of \nmachine-learning techniques are in automated driving, facial \nrecognition, and automated speech recognition.\n    At Rice University near my home district, researchers seek \nto utilize machine-learning approaches to address challenges in \ngeological sciences. 
In addition, the University of Houston's \nSolutions Lab supports research that will use machine learning \nto predict the behavior of flooding events and aid in \nevacuation planning. This would be incredibly beneficial for my \ndistrict and all areas that are prone to hurricanes and to \nflooding. In fact, in Texas we're still recovering from \nHurricane Harvey, the wettest storm in United States history.\n    The future of scientific discovery includes the \nincorporation of advanced data analysis techniques like machine \nlearning. With the next generation of supercomputers, including \nthe exascale computing systems that DOE is expected to field by \n2021, American researchers utilizing these technologies will be \nable to explore even bigger challenges. With the immense \npotential for machine-learning technologies to answer \nfundamental scientific questions, provide the foundation for \nhigh-performance computing capabilities, and to drive future \ntechnological development, it's clear that we should prioritize \nthis research.\n    I want to thank our accomplished panel of witnesses for \ntheir testimony today, and I look forward to hearing what role \nCongress should play in advancing this critical area of \nresearch.\n    [The prepared statement of Chairman Weber follows:]\n    \n[GRAPHIC(S) NOT AVAILABLE IN TIFF FORMAT]    \n   \n    \n    Chairman Weber. I now recognize the Ranking Member for an \nopening statement.\n    Mr. Veasey. Thank you, Chairman Weber. Thank you, \nChairwoman Comstock, and also, thank you to the distinguished \npanel for being here this morning.\n    As you know, there are a growing number of industries today \nthat are relying on generating and interpreting large amounts \nof data to overcome new challenges. The new--the energy sector \nin particular is making strides in leveraging these new \ntechnologies and techniques. 
Today, we're going to hear more \nabout the advancements that we're going to see in the upcoming \nyears.\n    Sensor-equipped aircraft engines, locomotives, gas, and wind \nturbines are now able to track production efficiency and the \nwear and tear on vital machinery. This enables significant \nreductions in fuel consumption, as well as carbon emissions. \nThe technologies are also significantly improving our ability \nto detect failures before they occur and prevent disasters, and \nby doing so will save money, time, and lives. And by \nusing analytics, sensors, and operational data, we can manage \nand optimize systems ranging from energy storage components to \npower plants and to the electric grid.\n    As digital technologies revolutionize the energy sector, we \nalso must ensure the safe and responsible use of these \nprocesses. With our electric grid always under persistent \nthreat, from cyberattacks to other modes of \nsubterfuge, the security of these connected systems is of the \nutmost importance. Nevertheless, I'm excited to learn more \nabout the value and benefits that these technologies may be \nable to provide for our economy and our environment alike.\n    I'm looking forward to hearing what we can do in Congress \nto help guide and support the responsible development of these \nnew data-driven approaches to the management of these ever more \ncomplex systems that our society is very dependent on.\n    Thank you, and, Mr. Chairman, I yield back the balance of \nmy time.\n    [The prepared statement of Mr. Veasey follows:]\n    \n[GRAPHIC(S) NOT AVAILABLE IN TIFF FORMAT]    \n        \n    Chairman Weber. Thank you, Mr. Veasey.\n    I now recognize the Chairwoman of the Research and \nTechnology Subcommittee, the gentlewoman from Virginia, Mrs. \nComstock, for an opening statement.\n    Mrs. Comstock. 
Thank you, Chairman Weber.\n    A couple of weeks ago, our two Subcommittees joined \ntogether on a hearing to examine the state of artificial \nintelligence and the types of research being conducted to \nadvance this technology. The Committee learned about the \nnuances of the term artificial intelligence, such as the \ndifference between narrow and general AI and implications for a \nworld in which AI is ubiquitous.\n    Today, we delve deeper into disciplines originating from \nthe AI movement of the 1950s that include machine learning, \ndeep learning, and neural networks. Until recently, machine \nlearning and especially deep-learning technologies were only \ntheoretical because deep-learning models require massive \namounts of data and computing power. But advances in high-\nperformance graphics processing units, cloud computing, and \ndata storage have made these techniques possible.\n    Machine learning is pervasive in our day-to-day lives, from \ntagging photos on Facebook to protecting emails with spam \nfilters to using a virtual assistant like Siri or Alexa for \ninformation. Machine-learning-based algorithms have powerful \napplications that ultimately help make our lives more fun, \nsafe, and informative.\n    In the federal government, the Department of Energy stands \nout for its work in high-performance computing and approaches \nto big-data science challenges. Energy Department \nresearchers are using machine-learning approaches to study \nprotein behavior, to understand the trajectories of patient \nhealth outcomes, and to predict biological drug responses. At \nArgonne National Laboratory, for example, researchers are using \nintensive machine-learning-based algorithms to attempt to map \nthe human brain.\n    A program of particular interest to me involves a DOE and \nDepartment of Veterans Affairs venture known as the MVP-\nCHAMPION program. 
This joint collaboration will leverage DOE's \nhigh-performance computing and machine-learning capabilities to \nanalyze health records of more than 20 million veterans \nmaintained by the VA. The goal of this partnership is to arm \nthe VA with data it can use to potentially improve health care \noffered to our veterans by developing new treatments and \npreventive strategies and best practices.\n    The potential for AI to help humans and further scientific \ndiscoveries is obviously immense. I look forward to what our \nwitnesses will testify to today about their work and--which may \ngive us a glimpse into the revolutionary technologies of \ntomorrow that we're here to discuss.\n    So I thank you, Mr. Chairman, and I yield back.\n    [The prepared statement of Mrs. Comstock follows:]\n    \n[GRAPHIC(S) NOT AVAILABLE IN TIFF FORMAT]    \n    \n    \n    Chairman Weber. I thank the gentlelady.\n    And let me introduce our witnesses. Our first witness is \nDr. Bobby--Mr. Chairman, are you going to----\n    Chairman Smith. Mr. Chairman, thank you. In the interest of \ntime, I just ask unanimous consent to put my opening statement \nin the record.\n    Chairman Weber. Without objection.\n    [The prepared statement of Chairman Smith follows:]\n    \n[GRAPHIC(S) NOT AVAILABLE IN TIFF FORMAT]    \n    \n    \n    [The prepared statement of Ranking Member Johnson follows:]\n    \n[GRAPHIC(S) NOT AVAILABLE IN TIFF FORMAT]    \n\n    \n    \n    [The prepared statement of Mr. Lipinski follows:]\n\n[GRAPHIC(S) NOT AVAILABLE IN TIFF FORMAT]    \n    \n    Chairman Weber. Thank you. I appreciate that.\n    Now, I will introduce the witnesses. Our first witness is \nDr. Bobby Kasthuri, the first neuroscience researcher at \nArgonne National Lab and an Assistant Professor in the \nDepartment of Neurobiology at the University of Chicago. You're \nbusy. Dr. 
Kasthuri's current research focuses on innovation and \nnew approaches to brain mapping, including the use of high-\nenergy x-rays from synchrotron sources for mapping brains in \ntheir entirety.\n    He holds a Bachelor of Science from Princeton University, \nan M.D. from Washington University School of Medicine, and a \nPh.D. from Oxford University where he studied as a Rhodes \nscholar. Welcome, Doctor.\n    Our second witness today is Dr. Katherine Yelick, a \nProfessor of Electrical Engineering and Computer Sciences at \nthe University of California, Berkeley, and the Associate \nLaboratory Director for Computing at Lawrence Berkeley National \nLaboratory. Her research is in high-performance computing, \nprogramming languages, compilers, parallel algorithms, and \nautomatic performance tuning.\n    Dr. Yelick received her Bachelor of Science, Master of \nScience, and Ph.D. all in computer science at the Massachusetts \nInstitute of Technology. Welcome, Dr. Yelick.\n    Our next witness is Dr. Matthew Nielsen, Principal \nScientist at the GE Global Research Center. Dr. Nielsen's \ncurrent research focuses on digital twin and computer modeling \nand simulation of physical assets using first-principle physics \nand machine-learning methods.\n    He received a Bachelor of Science in physics at Alma \nCollege in Alma, Michigan, and a Ph.D. in applied physics from \nRensselaer.\n    Dr. Nielsen. Rensselaer.\n    Chairman Weber. Rensselaer, okay, Polytechnic Institute in \nTroy, New York. Welcome, Dr. Nielsen.\n    And our final witness today is Dr. Anthony Rollett, the \nU.S. Steel Professor of Metallurgical Engineering and Materials \nScience at Carnegie Mellon University, a.k.a. CMU. Dr. Rollett \nhas been a Professor of Materials Science Engineering at CMU \nfor over 20 years and is the Co-Director of CMU's \nNextManufacturing Center. Dr. 
Rollett's research focuses on \nmicrostructural evolution and microstructure-property \nrelationships in 3-D.\n    He received a Master of Arts in metallurgy and materials \nscience from Cambridge University and a Ph.D. in materials \nengineering from Drexel University. Welcome, Dr. Rollett.\n    I now recognize Dr. Kasthuri for five minutes to present \nhis testimony. Doctor?\n\n          TESTIMONY OF DR. BOBBY KASTHURI, RESEARCHER,\n\n                  ARGONNE NATIONAL LABORATORY;\n\n                      ASSISTANT PROFESSOR,\n\n                   THE UNIVERSITY OF CHICAGO\n\n    Dr. Kasthuri. Thank you. Chairman Smith, Chairman Weber, \nChairwoman Comstock, Ranking Members Veasey and Lipinski, and \nMembers of the Subcommittees, thank you for this opportunity to \ntalk and appear before you. My name is Bobby Kasthuri. I'm a \nNeuroscientist at Argonne National Laboratory and an Assistant \nProfessor in the Department of Neurobiology at the University \nof Chicago.\n    And the reason I'm here talking to you today is because I \nthink we are at a pivotal moment in our decades-long quest to \nunderstand the brain. And the reason we're at this pivotal \nmoment is that we're actually witnessing in real time the \ncollision of two different disciplines, two different worlds, \nthe worlds of computer science and neuroscience. And if we can \nnurture and develop this union, it could fundamentally change \nmany things about our society.\n    First, it could fundamentally change how we think about \nunderstanding the brain. It could change and revolutionize how \nwe treat mental illness, and perhaps even more significantly, \nit could change how we think and imagine and build our future \ncomputers and our future robots based on how brains solve \nproblems.\n    The major obstacle between us and realizing this vision is \nthat, for many neuroscientists, modern neuroscience is \nextremely expensive and extremely resource-intensive. 
To give \nyou an idea of the scale, I thought it might help to give you \nan example of the enormity of the problem that we're trying to \nsolve.\n    The human brain, your brain, probably contains on the order of 100 \nbillion brain cells or neurons, and the main thing that neurons \ndo is connect with each other. And so in your brain \neach neuron connects, on average, with \n10,000 other neurons. That means in your brain there are orders \nof magnitude more connections between neurons than stars in the \nMilky Way galaxy. And what's even more important for \nneuroscientists is that we believe that this map, this map of \nyou, this map of connections contains all of the things that \nmake us human. Our creativity, our ability to think critically, \nour fears, our dreams are all contained in that map.\n    But unfortunately, that map, if we were to do it, wouldn't \nbe one gigabyte of data; it wouldn't be 100 gigabytes of data. \nIt could be on the order of a billion gigabytes of data, perhaps the \nlargest data set about anything ever collected in the history \nof humanity. The problem is that for many neuroscientists even \nanalyzing a fraction of this map is beyond their resources, the \nresources of their laboratory, the resources of the \nuniversities, and perhaps the resources of even large \ninstitutions. And if we don't address this gap, then what will \nhappen is that only the richest neuroscientists will be able to \nanswer their questions, and we would like every neuroscientist \nto have access to answer the most important questions about \nbrains and ultimately promote this fusion of computer science \nand neuroscience.\n    Luckily, there is a potential solution, and the potential \nsolution is the Department of Energy and the national lab \nsystem, which is part of the Department of Energy. 
As stewards \nof our scientific architecture, as stewards of some of the most \nadvanced technological and computing capabilities available, \nthe Department of Energy and the national labs can address this \ngap, and in fact, they do address this gap in many different \nsciences.\n    If I were a young astrophysicist or a young materials \nscientist, no one would expect me to get money and build my own \nspace telescope. Instead, I would leverage the amazing \nresources of the national lab system to answer my fundamental \nquestions. And although many fields of science have learned how \nto leverage the expertise and the resources available in the \nnational lab system, neuroscientists have not.\n    A national center for brain mapping situated within the DOE \nlab system could actually be a sophisticated clearinghouse to \nensure that the correct physics and engineering and computer \nscience tools are vetted and accessible for measuring brain \nstructure and brain function. Since the national labs are also \nthe stewards of our advanced computing infrastructure, they're \nideally suited to incubate these revolutions in computer and \nneurosciences.\n    As a biologist, I only recently learned \nthat, decades earlier, the DOE and the national labs helped usher in perhaps \nhumanity's greatest scientific achievement of the 20th century, \nthe mapping of the human genome and the understanding of the \ngenetic basis of life. We believe that the DOE and the national \nlab system can make a similar contribution to understanding the \nhuman brain.\n    Other countries like Japan, South Korea, and China, \ncognizant of the remarkable benefits to economic and national \nsecurity of understanding brains and using that understanding to make \ncomputer science better, have already invested in national \nefforts in artificial intelligence and national efforts to \nunderstand the brain. 
The United States has not yet, and I \nthink it's important at the end of my statement for everyone to \nremember that we are the ones who went to the moon, we are the \nones who harnessed the power of nuclear energy, and we are the \nones that led the genomic revolution. And I suspect it's the \nmoment now for the United States to lead again, to map and help \nreverse engineer the physical substrates of human thought, \narguably the most challenging quest of the 21st century and \nperhaps the last great scientific frontier.\n    Thank you for your time and attention today. I welcome any \nquestions you might have.\n    [The prepared statement of Dr. Kasthuri follows:]\n    \n[GRAPHIC(S) NOT AVAILABLE IN TIFF FORMAT]    \n    \n    \n    Chairman Weber. Thank you, Doctor.\n    Dr. Yelick, you're recognized for five minutes.\n\n               TESTIMONY OF DR. KATHERINE YELICK,\n\n                 ASSOCIATE LABORATORY DIRECTOR\n\n                    FOR COMPUTING SCIENCES,\n\n             LAWRENCE BERKELEY NATIONAL LABORATORY;\n\n       PROFESSOR, THE UNIVERSITY OF CALIFORNIA, BERKELEY\n\n    Dr. Yelick. Chairman Smith, Chairman Weber, Chairwoman \nComstock, Ranking Members Veasey and Lipinski, distinguished \nMembers of the Committee, thank you for holding this hearing \nand for the Committee's support for science. And thank you for \ninviting me to testify.\n    My name is Kathy Yelick and I'm the Associate Laboratory \nDirector for Computing Sciences at Lawrence Berkeley National \nLaboratory, a DOE Office of Science laboratory managed by the \nUniversity of California. I'm also Professor of Electrical \nEngineering and Computer Sciences at the University of \nCalifornia, Berkeley.\n    Berkeley Lab is home to five national scientific user \nfacilities serving over 10,000 researchers covering all 50 \nStates. 
The combination of experimental, computational, and \nnetworking facilities puts Berkeley Lab on the cutting edge of \ndata-intensive science.\n    In my testimony today, I plan to do four things: first, \ndescribe some of the large-scale data challenges in the DOE \nOffice of Science; second, examine the emerging role of machine \nlearning; third, discuss some of the incredible opportunities \nfor machine learning in science, which leverage DOE's role as a \nleader in high-performance computing, applied mathematics, \nexperimental facilities, and team-based science; and fourth, \nexplore some of the challenges of machine learning and data-\nintensive science.\n    Big-data challenges are often characterized by the four \n``V's,'' the volume, that is the total size of data; the \nvelocity, the rate at which the data is being produced; \nvariability, the diversity of different types of data; and \nveracity, the noise, errors, and the other quality issues in \nthe data. Scientific data has all of these.\n    Genomic data, for example, has grown by over a factor of \n1,000 in the last decade, but the most abundant form of life, \nmicrobes, are not well-understood. Microbes can fix nitrogen, \nbreak down biomass for fuels, or fight algal blooms. DOE's \nJoint Genome Institute has over 12 trillion bases--that is DNA \ncharacters A, C, T, and G--of microbial DNA, enough to fill the \nLibrary of Congress if you printed them in very boring books \nthat only contain those four characters.\n    But genome sequencers produce only fragments with errors, \nand the DNA of the entire microbial community is all mixed \ntogether. So it's like taking the Library of Congress, \nshredding all of the books, throwing in some junk, and then \nasking somebody to reconstruct the books from them. 
We use \nsupercomputers to do this, to assemble the pieces, to find the \nrelated genes, and to compare the communities.\n    DOE's innovations are actually helping to create some of \nthese data challenges. The detectors used in electron \nmicroscopes, which were developed at Berkeley Lab and since \ncommercialized, now produce data almost 10,000 times \nfaster than they did just ten years ago.\n    Machine learning is an amazingly powerful strategy for \nanalyzing data. Perhaps the most well-known example is \nidentifying images of things such as cats on the internet. A machine-\nlearning algorithm is fed a large set of, say, ten million \nimages, some of which are labeled as having cats, and \nthe algorithm uses those images to build a model, sort of a \nprobability of which images are likely to contain cats. Now, in \nscience we're not looking for cats, but images arise in many \ndifferent scientific disciplines from electron microscopes to \nlight sources to telescopes.\n    Nobel laureate Saul Perlmutter used images of supernovae--\nexploding stars--to measure the accelerating expansion of the \nuniverse. The number of images produced each night from \ntelescopes has grown from tens per night to tens of millions \nper night over the last 30 years. They used to be analyzed \nmanually by scientific experts, and now, much of that work has \nbeen replaced by machine-learning algorithms. The upcoming LSST \ntelescope will produce 15 terabytes of data every night. If you \nwatched one night's worth of data as a movie, it would take \nover ten years, so you can imagine why scientists are \ninterested in using machine learning to help them analyze that \ndata.\n    Machine learning can be used to find patterns that cluster \nsimilar items or approximate complicated experiments. A recent \nsurvey at Berkeley Lab found over 100 projects that are using \nsome form of machine learning. 
They use it to track subatomic \nparticles, analyze light source data, search for new materials \nfor better batteries, improve crop yield, and identify abnormal \nbehavior on the power grid.\n    Machine learning does not replace the need for high-\nperformance computing simulations but adds a complementary tool \nfor science. Recent earthquake simulations of the Bay Area show \nthat just a 3-mile difference in location of an identical \nbuilding makes a significant difference in the safety of that \nbuilding. It really is all about location, location, location. \nAnd the team that did this work is looking at taking data from \nembedded sensors and eventually even from smart meters to give \neven more detailed location-specific results.\n    There is tremendous enthusiasm for machine learning in \nscience but some cautionary notes as well. Machine-learning \nresults are often lacking in explanations, interpretations, or \nerror bars, a frustration for scientists. And scientific data \nis complicated and often incomplete. The algorithms are known \nto be biased by the data that they see. A self-driving car may \nnot recognize voices from Texas if it's only seen data from the \nMidwest.\n    Chairman Weber. Hey, hey.\n    Dr. Yelick. Or we may miss a cosmic event in the southern \nhemisphere if the algorithms have only seen data from telescopes in the \nnorthern hemisphere. Foundational research in machine learning \nis needed, along with the network to move the data to the \ncomputers and share it with the community and make it as easy \nto search for scientific data as it is to find a used car \nonline.\n    Machine learning has revolutionized the field of artificial \nintelligence, and it requires three things: large amounts of \ndata, fast computers, and good algorithms. DOE has all of \nthese. 
Scientific instruments are the eyes, ears, and hands of \nscience, but unlike artificial intelligence, the goal is not to \nreplicate human behavior but to augment it with superhuman \nmeasurement control and analysis capabilities, empowering \nscientists to handle data at unprecedented scales, provide new \nscientific insights, and solve important societal challenges.\n    Thank you.\n    [The prepared statement of Dr. Yelick follows:]\n    \n[GRAPHIC(S) NOT AVAILABLE IN TIFF FORMAT]    \n    \n    \n    Chairman Weber. Thank you, Doctor.\n    Dr. Nielsen, you're recognized for five minutes.\n\n               TESTIMONY OF DR. MATTHEW NIELSEN,\n\n                      PRINCIPAL SCIENTIST,\n\n               INDUSTRIAL OUTCOMES OPTIMIZATION,\n\n                       GE GLOBAL RESEARCH\n\n    Dr. Nielsen. Chairman Smith, Chairman Weber, and Chairwoman \nComstock, Ranking Members Veasey and Lipinski, and Members of \nthe Subcommittee, it is an honor to share General Electric's \nperspective on innovative machine-learning-based approaches to \nbig-data science challenges that promote a more resilient, \nefficient, and sustainable energy infrastructure. I am Matt \nNielsen, a Principal Scientist at GE's Global Research Center \nin upstate New York.\n    The installed asset base of GE's power and renewable \nbusinesses generates roughly 1/3 of the planet's power, and 40 \npercent of the world's electricity is managed by our software. \nGE Energy's assets include everything from gas and steam power, \nnuclear, grid solutions, energy storage, onshore and offshore \nwind, and hydropower.\n    The nexus of physical and digital technologies is \nrevolutionizing what industrial assets can do and how they are \nmanaged. One of the single most important questions industrial \ncompanies such as GE are grappling with is how to most \neffectively integrate the use of AI and machine learning into \ntheir business operations to differentiate the products and \nservices they offer. 
GE has been on this journey for more than \na decade.\n    A key learning for us--and I can attest to this as being a \nphysicist--has been the importance of tying our digital \nsolutions to the physics of our machines and to the extensive \nknowledge on how they are controlled. I'll now highlight a few \nindustrial applications of AI machine learning where GE is \ncollaborating with our customers and federal agencies like the \nU.S. Department of Energy.\n    At GE, digital twins are a chief application of AI and \nmachine learning. Digital twins are living digital models of \nindustrial assets, processes, and systems that use machine \nlearning to see, think, and act on big data. Digital twins \nlearn from a variety of sources, including sensor data from the \nphysical machines or processes, fleet data, and industrial-\ndomain expertise. These computer models continuously update as \nnew data becomes available, enabling a near-real-time view of \nthe condition of the asset.\n    To date, GE scientists and engineers have created nearly \n1.2 million digital twins. Many of the digital twins are \ncreated using machine-learning techniques such as neural \nnetworks. The application of digital twins in the energy sector \nis enabling GE to revolutionize the operation and maintenance \nof our assets and to drive new innovative approaches in \ncritical areas such as services and cybersecurity.\n    Now onto digital ghosts. Cyber threats to industrial \ncontrol systems that manage our critical infrastructure such as \npower plants are growing at an alarming rate. GE is working \nwith the Department of Energy on a cost-shared program to build \nthe world's first industrial immune system for electric power \nplants. 
It can not only detect and localize cyber threats but \nalso automatically act to neutralize them, allowing the system \nto continue to operate safely.\n    This effort engages a cross-disciplinary team of engineers \nfrom Global Research and our power business. They are \npairing the digital twins that I mentioned of the power plant's \nmachines, industrial controls knowledge, and machine learning. \nThe key again for this industrial immune system is the \ncombination of advanced machine learning with a deep \nunderstanding of the machines' thermodynamics and physics.\n    We have demonstrated to date the ability to rapidly and \naccurately detect and even localize simulated cyber threats \nwith nearly 99 percent accuracy using our digital ghost \ntechniques. We're also making significant progress now in \nautomatically neutralizing these threats. It is a great example \nof how public-private research partnerships can advance \ntechnically risky but universally needed technologies.\n    Along with improving cyber resiliency, AI and machine-\nlearning technologies are enabling us to improve GE's energy \nservices portfolio, helping our customers optimize and reduce \nunplanned downtime for their assets. Through GE's asset \nperformance management platform, we help our customers avoid \ndisruptions by providing deep, real-time data insights on the \ncondition and operation of their assets. Using AI, machine \nlearning, and digital twins, we can better predict when \ncritical assets require repair or have a physical fault. This \nallows our customers to move from a schedule-based maintenance \nsystem to a condition-based maintenance system.\n    The examples I have shared and GE's extensive developments \nwith AI and machine learning have given us a first-hand \nexperience into what it takes to successfully apply these \ntechnologies into our Nation's energy infrastructure. 
My full \nrecommendations are in my written testimony, and I'll only \nsummarize them here.\n    Number one, continue to fund opportunities for public-\nprivate partnerships to expand the application and benefits of \nAI and machine learning across the energy sector.\n    Two, encourage the collaboration between AI, machine \nlearning, and subject matter experts, engineers, and \nscientists.\n    And number three, continue to invest in the Nation's high-\nperformance computing assets and expand opportunities for \nprivate industry to work with the national labs.\n    I appreciate the opportunity to offer our perspective on \nhow the development of AI and machine-learning technologies can \nmeet the shared goals of creating a more efficient and \nresilient energy infrastructure.\n    One final thought is to reinforce a theme that I've \nemphasized throughout my testimony, and that is the importance \nof having teams of physical and digital experts involved in \ndriving the future of AI and machine-learning solutions.\n    Thank you, and I look forward to answering any questions.\n    [The prepared statement of Dr. Nielsen follows:]\n    \n[GRAPHIC(S) NOT AVAILABLE IN TIFF FORMAT]    \n   \n    \n    Chairman Weber. Thank you, Dr. Nielsen.\n    Dr. Rollett, you're recognized for five minutes.\n\n               TESTIMONY OF DR. ANTHONY ROLLETT,\n\n                    U.S. STEEL PROFESSOR OF\n\n               MATERIALS SCIENCE AND ENGINEERING,\n\n                   CARNEGIE MELLON UNIVERSITY\n\n    Dr. Rollett. 
So my thanks to Chairman Weber, Chairman \nSmith, Chairwoman Comstock, Ranking Members Veasey and \nLipinski, and all the Members for your interest.\n    Speaking as a metallurgist, it's my pleasure and privilege \nto testify before you because I've found big data and machine \nlearning, which depend on advanced computing, to be a never-\nending source of insight for my research, be it on additive \nmanufacturing or in developing new methods of research on \nstructural materials.\n    My bottom line is that there are pervasive opportunities, \nas you've heard, to benefit from big data and machine learning. \nNevertheless, there are many challenges to be addressed in \nterms of algorithm development, learning how to apply the \nmethods to new areas, transforming data into information, \nupgrading curricula, and developing regulatory frameworks.\n    New and exciting manufacturing technologies such as 3-D \nprinting are coming on stream that generate big data, but they \nneed further development, especially for qualification, in \nother words, the science that underpins the processes and \nmaterials needed to satisfy requirements.\n    So consider that printing a part with a powder bed machine, \nstandard machine, requires 1,000-fold repetition of spreading a \nhair's-breadth layer of powder, writing that desired shape in \neach layer, shifting the part by that same hair's breadth, and \nrepeating. So if you think about taking a part and dividing the \ndimension of that part by a hair's breadth, multiplied by yards \nof laser-melting track, you can easily estimate that each part \ncontains miles and miles of tracks, hence, the big data.\n    The recent successes with machine learning have used data \nthat is already information-rich, as you've heard, cats, dogs, \nand so on. 
And so to advanced manufacturing and basic science, \nhowever, we have to find better ways to transform the data \nstream into a big information stream.\n    Another very important context is that education in all \nSTEM subjects needs to include the use of advanced computing \nfor data analysis and machine learning. And I know that this \nCommittee has focused on expanding computer science education, \nso thank you for that.\n    So for printing, please understand that the machines are \nhighly functional and produce excellent results. Nevertheless, \nif we're going to be able to qualify these machines to produce \nreliable parts that can be used in, for example, commercial \naviation, we've got some work to do.\n    If I might ask for the video, Daniel, if you can manage to \nget that to play. So I'd like to illustrate the challenges in \nmy own research.\n    [Video shown.]\n    Dr. Rollett. I often use the light sources, in other \nwords, x-rays from synchrotrons, most of which are curated by \nthe Department of Energy. I use several modes of \nexperimentation such as computed tomography, diffraction \nmicroscopy, and dynamic x-ray radiography. So this DXR \ntechnique produces movies of the melting of the powder layers \nexactly as it occurs in 3-D printing with the laser. And again, \nat the micrometer scale you can see about a millimeter there. \nAnd you can also see that the dynamic nature of the process \nmeans that one must capture this at the same rate as, say, the \nmore familiar case of a bullet going through armor.\n    Over the last couple of years, we've gotten many deep \ninsights as to how the process works, but again, for the big-\ndata aspect, each of these experiments lasts about a \nmillisecond. That's about 500 times faster than you can blink. \nAnd it provides gigabytes of images, hence, the big data. 
\nStoring and transmitting such large amounts of data, which are \narriving at ever-increasing rates, is a challenge for this \nvital public resource. I should say that the light sources \nthemselves are well aware of this challenge. Giving more \nserious attention to such challenges requires funding agencies \nto adopt the right vision in terms of recognizing the need for \nfusion of data science with the specific applications.\n    I also want to say that cybersecurity is widely understood \nto be an important problem with almost weekly stories about \ndata leaks and hacking efforts. What's not quite so well \nunderstood is exactly how we're going to interface \nmanufacturing with cybersecurity.\n    So, in summary, I suggest that there are three areas of \nopportunity. First, federal agencies should continue to support \nthe application of machine learning to advanced manufacturing, \nparticularly for the qualification of new technologies and \nmaterials. I thank and commend all of my funders for supporting \nthese advances and particularly want to call out the FAA for \nproviding strong motivation here.\n    In the future, research initiatives should also seize the \npotential for moonshot efforts on objectives such as \nintegrating artificial intelligence capabilities directly into \nadvanced manufacturing machines and advancing synergy between \ntechnologies such as additive manufacturing and robotics.\n    Second, we need to continue to energize and revitalize STEM \neducation at all levels to reflect the importance of the data \nin learning and computing with a focus on manufacturing. 
I \nmyself have had to learn these things as I've gone along.\n    Third, based on the evidence that machine learning is being \nsuccessfully applied in many areas, we should encourage \nagencies to seek programs in areas where it's not so obvious \nhow to apply the new tools and to instantiate programs in \ncommunities where data, machine learning, and advanced \ncomputing are not yet prevalent.\n    Having traveled abroad extensively, I can assure you that \nthe competition is serious. Countries that we used to dismiss \nout of hand, they're publishing more than we are and securing \nmore patents than we do.\n    Again, I thank you for the opportunity to testify and share \nmy views on this vital subject. I know that we will all be glad \nto answer your questions.\n    [The prepared statement of Dr. Rollett follows:]\n    \n[GRAPHIC(S) NOT AVAILABLE IN TIFF FORMAT]    \n   \n    \n    Chairman Weber. Thank you, Doctor. I now recognize myself \nfor five minutes.\n    This question is for all the witnesses. You've all used \nsimilar terminology in your testimonies like artificial \nintelligence, machine learning, and deep learning. So that we \ncan all start off on the same page, I'll start with Dr. \nKasthuri. But could you explain what these terms mean and how \nthey relate to each other?\n    In the interest of time, I'm going to divvy these up. Dr. \nKasthuri, you take artificial intelligence. Dr. Yelick, you \ntake machine learning. Dr. Nielsen, you take deep learning. All \nright? Doctor, you're up.\n    Dr. Kasthuri. Thank you, Chairman Weber. That's an \nexcellent question. In the interest of time I'm not going to \nspeak about artificial intelligence. There are clearly experts \nsitting next to me. I'm interested in the idea of finding \nnatural intelligence wherever we can, and I would say that the \nconfusion that exists in these terminologies also exist when we \nthink about intelligence beyond the artificial space. 
And I'm \nhappy to--maybe perhaps after I let the other scientists speak \nto talk about how we define natural intelligence different \nways, which might help elucidate the ways we define artificial \nintelligence.\n    Chairman Weber. All right. Fair enough. Dr. Yelick, do you \nfeel that monkey on your back?\n    Dr. Yelick. Yes. Thank you very much for the question. So \nlet me try to cover a little bit of all three. So artificial \nintelligence is a very long-standing subfield of computer \nscience looking at how to make computers behave with humanlike \nbehavior. And one of the most powerful techniques for some of \nthe subproblems in artificial intelligence such as computer \nvision and speech processing are machine-learning algorithms. \nThese algorithms have been around for a long time, but the \navailability of large amounts of labeled data and large amounts \nof computing have really made them take off in terms of being \nable to solve those artificial intelligence problems in certain \nways.\n    The specific type of machine learning is a broad class of \nalgorithms that come from statistics and computer science, but \nthe specific classes called deep learning algorithms, and I \nwon't go into the details. I will defer that if somebody else \nwants to try to explain deep learning algorithms, but they are \nused for these particular breakthroughs in artificial \nintelligence.\n    I would say that the popular press often equates the word \nartificial intelligence with the term deep learning because the \nalgorithms have been so powerful, and so that can create some \nconfusion.\n    Chairman Weber. All right. Thank you. Dr. Nielsen?\n    Dr. Nielsen. Yes, I'm not an expert in deep learning, but \nwe are practitioners of deep learning at GE. And really it's \ntaken off in, I would say, the last several years as we've seen \na rise in big data. So we have nearly 300,000 assets spread \nglobally and each one generating gigabytes of data. 
Now, \nprocessing that gigabytes of data and trying to make sense of \nit we're using deep learning techniques. It's a subfield, as \nyou mentioned, of machine-learning algorithms but allows us to \nextract more information, more relationships if you will.\n    So, for example, we use deep learning to help us build a \ncomputer model of a combined-cycle power plant, very complex \nsystem, very complex thermodynamics. And it's only because we \nhave been able to collect now years and years of historical \ndata and then process it through a deep-learning algorithm. So, \nfor us, deep learning is a breakthrough enabled by advances in \ncomputing technology, advances in big-data science, and it's \nallowing us to build what we think is more complex models of \nnot only our assets but the processes that they perform.\n    Chairman Weber. And, Dr. Rollett, before you answer, you \nissued a warning quite frankly in your statement that there's \nbeen more patents filed by some of the foreign countries than \nwe are. Do you attribute that to what we're talking about here? \nGo ahead.\n    Dr. Rollett. In very simple terms, I think what I'm calling \nattention to is investment level in the science that underpins \nall kinds of things, so whether it be the biology of the brain, \nthe functioning of the brain or how you make machines work, how \nyou construct machines, control algorithms, so on, and so \nforth. That's really what I'm trying to get at.\n    Chairman Weber. Okay.\n    Dr. Rollett. And I'm trying to give you some support, some \nammunition that what you're doing as a committee, set of \nSubcommittees is really worthwhile.\n    Chairman Weber. Yes, well, thank you. I appreciate that.\n    I'm going to move on to the second question. 
Several of you \nmentioned your reliance on DOE facilities, which is, again, \nwhat you're talking about, particularly light sources and \nsupercomputing which we are focused on, have been to a couple \nof those for the types of big-data research that you all \nperform and my question is how necessary is it for the United \nStates to keep up to date? You've already addressed that with the \npatents statement, a warning that you issued, but what I want \nto know is have any of you all--would you opine on who the \nnearest competitor is? And have you interfaced with any \nscientists or individuals from those companies? And if so, in \nwhat field and in what way? Doctor?\n    Dr. Kasthuri. I would say that, internationally, sort of \nthe nearest two competitors to us are Germany and China. And in \ngeneral in the scientific world there is a tension between \ncollaboration and competition independent of whether the \nscientist lives in America or doesn't live in America.\n    I think the good news is that for us at least in \nneuroscience we realize that the scale of the problem is so \nenormous and has so much opportunity, there's plenty of food \nfor everyone to eat. So right now, we live at the world of \ncooperation between individual scientists where we share data, \nshare problems, and share solutions back and forth. I'm less, of \ncourse, familiar with what happens at levels much higher than \nthat.\n    Chairman Weber. Thank you. Dr. Yelick?\n    Dr. Yelick. Yes, in the area of high-performance computing \nI would say the closest competitor at this point is China. And \nin science we also like to look at derivatives, so what we \nreally see is that China is growing very, very rapidly in terms \nof their leadership. At this point we do have the fastest \ncomputer on the top-500 list in the United States, but of \ncourse until recently that was the top two--the number-one and \n-three machines were from China. 
But perhaps more importantly \nthan that there are actually more machines manufactured in \nChina on that list than there are machines that are manufactured \nin the United States, so there is a huge and growing interest, \nand certainly a lot of research, a lot of funding in China for \nartificial intelligence, machine learning, and all of that \napplied to science and other problems.\n    Chairman Weber. Have you met with anybody from over in \nChina involved in the field?\n    Dr. Yelick. Yes. Last summer, I actually did a tour of all \nof the major supercomputing facilities in China, so I got to \nsee what were the number-one and number-three machines at that \ntime--and was very impressed by the scientists. I think one of \nthe things that you see--and a lot of, by the way, very junior \nscientists, the students that they are training in these areas, \nthey use these machines to also draw talent back to China from \nthe United States or to keep talent that was trained in China \nin the United States. And they have very impressive people in \nterms of the computer scientists and computational scientists.\n    Chairman Weber. And, Dr. Nielsen, very quickly because I'm \nout of time.\n    Dr. Nielsen. Yes, I would just like to echo that, like Dr. \nRollett, we follow publications and patents, and we're seeing a \ngrowing number from China, so I'd like to echo that just from \nthat statement. We're seeing growing interest in the use of \nhigh-performance computing to go look at things like \ncybersecurity from China, so obviously, that's the number-one \nlocation we're looking at.\n    Chairman Weber. Good. Thank you, Dr. Rollett. I'm happy to \nmove on now. So I'm now going to recognize the gentlelady from \nOregon for five minutes.\n    Ms. Bonamici. Thank you very much, Mr. 
Chairman.\n    What an impressive panel and what a great conversation and \nan important one.\n    I represent northwest Oregon where Intel is developing the \nfoundation for the first exascale machines. We know the \npotential of high-performance computing and all energy \nexploration, predicting climate weather, predictive and \npreventive medicine, emergency response, just a tremendous \namount of potential. And we certainly recognize on this \nCommittee that investment in exascale systems and high-\nperformance computing is important for our economic \ncompetitiveness, national security, and many reasons.\n    And we know--I also serve on the Education Committee, and I \nknow that our country has some of the best scientists and \nprogrammers and engineers, but what really sets our country \napart is entrepreneurs and innovation. And those \ncharacteristics require creative and critical thinking, which \nis fostered through a well-rounded education, including the \narts.\n    I don't think anyone on this Committee is going to be \nsurprised to hear me mention the STEAM Caucus, which is--I'm \ncochairing with Representative Stefanik from New York, working \non integrating arts and design into STEM, learning to educate \ninnovators. We have out in Oregon this wonderful organization \ncalled Northwest Noggin, which is a collaboration of our \nmedical school, Oregon Health Sciences University, Portland \nState University, Pacific Northwest College of Art, and the \nRegional Arts and Culture Council. And they go around exciting \nthe public about ongoing taxpayer-supported neuroscience \nresearch. And they're doing great work and expanding the number \nof people who are interested in science and also communicating \nwith all generations and all people about the benefits of \nscience.\n    So, Dr. Rollett, in your testimony you talked about the \nrole of data analytics across manufacturing--the manufacturing \nsector. 
And you noted that it's not necessarily going to be \nimportant for all data analytic workers to have a computer \nscience degree, so what skills are most important for \naddressing the opportunities? You did say in your testimony \nthat technology forces us to think differently about how to \nmake things, so talk about the next manufacturing center at \nCarnegie Mellon and what you're doing to prepare students for \nevolving fields? And we know as technology changes we need \nintellectual flexibility as well, so how do you educate people \nfor that kind of work?\n    Dr. Rollett. So thank you for the opportunity to address \nthat. The way that we're approaching that is telling our \nstudents don't be afraid of these new techniques. Jump in, try \nthem, and lo and behold, almost every time they're trying it--\nsometimes it's a struggle, but almost every time that they try \nit they're discovering, oh, this actually works. Even if it's \nnot big data in quite the sense that, say, Kathy would tell us, \neven small data works.\n    So, for example, in these powder bed machines you spread a \nlayer. Well, if you just take a picture of that layer and then \nanother picture and you keep analyzing it and you use these \ncomputer vision techniques, which are sort of a subset of \nmachine learning, lo and behold, you can figure out whether \nyour part is building properly or not. That's the kind of thing \nthat we've got to transmit to all of our students to say it's \nnot that bad, jump in and try it and little by little, you'll \nget there.\n    Ms. Bonamici. I think over the years many students have \nbeen very risk-averse and they don't want to risk taking \nsomething where they might not get the best grade possible, so \nwe have to work on overcoming that because there's so much \npotential out there until students have the opportunity to get \nin and have some of that hands-on learning.\n    Dr. 
Yelick, I'm in the Northwest and it's not a question of \nif but when we have an earthquake off the Northwest coast, and \na tsunami could be triggered of course by that earthquake along \nthe Cascadia subduction zone. So in your testimony you discuss \nthe research at Berkeley Lab to simulate a large magnitude \nearthquake, and I listened very carefully because you were \ntalking about the effects on an identical building in different \nareas. This data could be really crucial as we are assessing \nthe need for more resilient infrastructure not only in Oregon \nbut across the country. So what technical challenges are you \nfacing and sort of curating, sharing, and labeling and \nsearching that data? And what support can the federal \ngovernment provide to accelerate a resolution of these issues?\n    Dr. Yelick. Well, thank you very much for the question. \nYes, this is very exciting work that's going on, and simulating \nearthquakes is currently at a regional scale. There are \ntechnology challenges to trying to even get that to larger-\nscale simulations, but I think even more importantly the work \nthat I talked about is trying to use information about the \ngeology to try to give you much more precise information about \nthe safety of a particular location.\n    And the challenge is to try to collect this data and then \nto actually invert it, that is turn it into a model so you \ncollect the data and then in some sense you're trying to \ndevelop a set of equations that say how that area--based on the \ndata that's been collected from little tiny seismic events, \nit'll tell you something about how that particular subregion, \neven a yard or a city block or something like that, how that \ncity block is going to behave in an earthquake. And you can use \nthe information from tiny seismic events and then to infer how \nit will behave in a large significant earthquake. 
And so \nthere are technical challenges, mathematical challenges of doing \nthat, as well as the scale of computing for both doing the \ndata, inverting the data but also then doing the simulation.\n    And I think you bring up a very good point about the \ncommunity needs for these community data sets because you \nreally want to make it possible for many groups of people, not \njust, for example, a power company that has smart meter data \nbut for other people to access that kind of data.\n    Ms. Bonamici. Thank you. And I want to follow up with that. \nI'm running out of time, but as we talk about infrastructure \nand investment in infrastructure, we know that by making better \ndecisions at the outset we can save lives and save property, so \nthe more information we have about where we're building and how \nwe're building is going to be a benefit to people across this \ncountry, as well as in northwest Oregon. So thank you again to \nthis distinguished panel. I yield back.\n    Chairman Weber. Thank you, ma'am.\n    The gentlelady from Virginia, Mrs. Comstock, is recognized.\n    Mrs. Comstock. Thank you, Mr. Chairman, and thank all of \nyou here. This has been very interesting once again.\n    Now, I guess I'd ask all of you, what are the unexamined \nbig-data challenges that could benefit from machine learning? \nAnd what are the consequences for the United States for not \nbeing the world leader in that if we aren't going forward in \nthe future? Maybe, Dr. Rollett, if you'd like to start. You \nlook like you had an answer ready to go, so----\n    Dr. Rollett. I'll give you a small example from my own \nfield. So when we deal with materials, then we have to look \ninside the materials. So we typically take a piece of steel and \nwe cut it and we polish it and we take pictures of it. So \ntraditionally, what we've done is play the expert witness as it \nwere. 
You look at these pictures, which I often say resemble \nmore of a Jackson Pollock painting than anything remotely \nas simple as a cat, and so the excitement in our field is \nthat we now have the tools that we can start to tease things \nout of these pictures, that we go from something where we are \ncompletely dependent on sort of gray-bearded experts to let the \ncomputer do a lot of the job for you. And that speeds things up \nand it automates them and it allows companies to detect \nproblems that they're running across. So it's just one example.\n    Dr. Kasthuri. Congresswoman Comstock, thank you for the \nquestion. I have two sort of answers specifically to thinking \nabout brains and then to thinking about education. I think \nthese are the potential things that we can lose. One of the \nthings that I find fascinating about how our brains work is \nthat whether you are Einstein thinking up relativity or Mozart \nmaking a concerto or you're just at home watching reality TV, \nall brains operate at about 20 watts of energy. These light \nbulbs in this room are probably at 60 watts of energy. And \nalthough you might already think some of your colleagues are \ndim bulbs, in this sense, what's amazing about the things that \nthey can accomplish is that they accomplish them at energy \nefficiencies that are currently unheard of for any type of \nalgorithm.\n    So I feel like if we can leverage machine learning, deep \nanalytics, and understand how the brain passes information and \nprocesses information for energies that are really energy \nefficiencies unheard of in our current algorithms and robots, \nthat's a huge benefit to both the national and economic \nsecurities of our country. That's the first.\n    And the second thing I'd like to add, the other reason that \nit's important for us to lead now--and I'll do it by example--\nis that in 1962 at Rice University John F. Kennedy announced \nthat we were going to the moon. 
And he announced it and in his \nspeech he said we're going to go to the moon--and I \nparaphrase--not because it's easy but because it's hard and \nbecause hard things test our mettle and test our capabilities.\n    The other interesting fact about that is that in 1969 when \nwe landed on the moon, the average age of a NASA scientist was \n29 years old, so quick math suggests that when Kennedy \nannounced the moonshot, many of these people were in college. \nThey were students. And there was something inspirational about \npositing something difficult, positing something visionary. And \nI suspect that this has benefited us--in recruiting this \ngeneration of scientists to the moonshot has benefited this \ncountry in ways that we yet haven't calculated. And I suspect \nthat if we don't move now, we lose both of these opportunities, \namong many others.\n    Mrs. Comstock. So it's really a matter of getting that \nfocus and attention and commitment so that you have that next \ngeneration understanding this is really a long-term investment, \nand we have a passion for it, so they will.\n    Dr. Kasthuri. Exactly.\n    Dr. Yelick. I'll just add briefly that I think we really \nwant to--in terms of the threat associated with this is really \nabout continuing to be a leader in computing but also about the \ncontrol and use of information. And you can see the kinds of \nexamples we've given are really important, and you hear about \nit in the news about the control and use of information. We \nneed leaders in understanding how to do that and make sure that \ninformation is used wisely.\n    We teach our freshmen at Berkeley a course in data science, \nso whether they're going to go off and become English majors or \nart majors or engineers, we think it's really important for \npeople to understand data.\n    Dr. Nielsen. And just real briefly, I'd like to build a \nlittle bit on Dr. Rollett's comments. 
For us, we're seeing \ntremendous benefit in big data for things like trying to better \npredict when an aircraft engine part has to be repaired, when \nit needs to be inspected, very critical for the safety of that \nengine. For gas turbines, same thing. Wind parts need to be \ninspected and repaired.\n    So where does big data come in? It comes in with \ncomputational fluid dynamics, which we leverage--actually, the \nhigh-performance computing infrastructure of the United States \nmaterials science, material knowledge, trying to understand \ngrain structure, et cetera. So for us, that nexus of the \ndigital technologies with the physics, understanding the \nthermodynamics of our assets are leading us into what I think \nis just a better place to be from maintenance scheduling, \nsafety, resiliency, et cetera.\n    Mrs. Comstock. Thank you. I really appreciate all of your \nanswers.\n    I yield back, Mr. Chairman.\n    Chairman Weber. The gentleman from Virginia, Mr. Beyer, is \nrecognized for five minutes.\n    Mr. Beyer. Mr. Chairman, thank you very much, and thank you \nall very much for doing this.\n    Dr. Kasthuri, so on the BRAIN Initiative I think obviously \nthe most--maybe the most exciting thing happening in the world \ntoday, I was fascinated by this whole notion of the Connectome, \n1 billion neurons with 1 quadrillion connections, you talk \nabout it being if you took--of all the written material in the \nworld into one data set, it'd just be a small fraction of the \nsize of this brain map. Is it possible that it's simpler than \nthat, that it sort of strains my understanding that there are \nfew things in nature that are as complex as that. Why in \nevolution have we developed something that--and every human \nbeing on the planet has a brain that's already--contains more \nconnections than every bit of written material?\n    Dr. Kasthuri. 
Congressman Beyer, that's a great question, \nand like most scientists I'm going to do a little bit of \nhandwaving and a little bit of conjecture because the question \nthat you're asking is the question that we are trying to \naccomplish. We know reasonably well that there are, as you \nsaid, 100 billion brain cells, neurons, that make on order 1 \nquadrillion connections in the brain. Now, that--when I say the \ndata of that, I'm really talking about the raw image data. What \nwill it take to take a picture of every part of the brain and \nif you added up all the data of all those pictures together, it \nwould be the largest data set ever collected.\n    Now, I suspect we have to do that at least once and then it \nmight be possible that there are patterns within that data that \nthen simplify the next time that we have to map your brain. One \nway to think about this is that before we had a map of DNA, we \ndidn't realize that there was a pattern within DNA, meaning \nevery three nucleotides--A, C, T, et cetera--codes for a \nprotein. And that essentially simplifies the data structure to, \nlet's say, 1/3. I don't need to know, I just need to know that \nthese three things are an internal pattern that then gets \nrepeated again and again and again. And that was a fundamental \ninsight. We have no similar insight into the brain. Is there a \nrepetitive pattern that would actually reduce the amount of \ndata that we had to collect?\n    So, you're right, it might be that the second brain or the \nthird brain isn't going to be that much data, but now let me \ngive you the counter because as a scientist I have to do both \nsides or all sides. The other thing we know is that each human \nbrain is unique, very much like a snowflake. 
Your brain, the \nconnectivity, the connections in your brain at some level have \nto represent your life history, what your brain has \nexperienced.\n    And so the question for me--and I think it's really one of \nthe most important questions--is even within the snowflake \nthere are things that are unique to snowflakes but they're the \nsame. They either have seven arms or eight arms or six arms. I \nget them confused with spiders, but it's one of those is the \nanswer. So there's regularity in a snowflake at the level of \nthe arms, but there is uniqueness at the level of the things \nthat jut out of the seven arms of the snowflake. And the \nfundamental question is what is unique, what is the part that \nmakes each of us a neurological snowflake and what is common \nbetween all of us? And that would be one of the very first \ngoals of doing a map is to discover the answer to your \nquestion.\n    Mr. Beyer. Yes, well, thank you for a very thoughtful \nanswer. And I keep coming back to the Einstein notion that \nalways looking for the simplest answers, things that unify it \nall together. So here's another simple question. You talked in \nyour very first paragraph about reverse engineering human \ncognition into our computers, good idea? At our most recent AI \nhearing here a lot of the controversy was, you know, dealing \nwith Elon Musk and others and their concerns about what happens \nwhen consciousness emerges in machines.\n    Dr. Kasthuri. Again, a fantastic question. Here's my \nversion of an answer. We deal with smarter things every day. \nMany of our children, especially mine, wind up getting \nconsciousness and being smarter than us, certainly smarter than \nme, but yet we don't worry about the fact that this next \ngeneration of children, forever the next generation of children \nwill always be smarter than us because we've developed ways as \na society to instill in them the value systems that we have. 
\nAnd there are multiple avenues for how we can instill in our \nchildren the value systems that we have.\n    I suspect we might use the same things when we make smart \nalgorithms, the same way we make smart children. We won't just \nproduce smart algorithms but we'll instill in them the values \nthat we have the same way that we instill our values in our \nchildren.\n    Now, that didn't answer your question of whether reverse \nengineering the brain is a specific good idea for AI or not. \nThe only thing I would say is that no matter what we can \nimagine AI--artificial intelligence doing, there is a \nbiological system that does that at more energy efficiency and \nits speed for which that AI physical silicon system does not. \nBut I suspect these answers are probably best debated amongst \nyou and then you could tell us.\n    Mr. Beyer. Well, that was a very optimistic thing. I want \nto say one of the things we do is we keep the car keys in those \ncircumstances.\n    Mr. Chairman, I yield back.\n    Chairman Weber. Thank you. The gentleman from Kansas is \nrecognized for five minutes.\n    Mr. Marshall. Well, thank you, Mr. Chairman.\n    Speaking of Kansas, I'm sure you all remember President \nEisenhower is the one who started NASA in 1958, but it was \nPresident Kennedy, as several of you have stated, that, you \nknow, gave us the definitive goal to get to the moon. And as a \nyoung boy I saw that before my eyes, the whole country wrapped \naround that.\n    Each of you get one minute. What's your big, hairy, \naudacious goal, your idea, it took 11 years, '58 to '69 to get \nto the Moon. Where are we going to be in 11 years? Dr. Rollett, \nwe'll start with you and you each get one minute.\n    Dr. Rollett. I think we're going to see that manufacturing \nis a much more clever operation. It understands the materials. \nIt understands how things are going to last, and it draws in a \nmuch wider set of disciplines than it currently does. 
I have to \nadmit I don't exactly have an analogy to going to the moon, but \nthat's a very good challenge.\n    Mr. Marshall. What I like about your idea is that's going \nto add to the GDP. Our GDP grows when we become more efficient, \nnot when federal government sends dollars to States for social \nprojects, so I love adding to GDP.\n    Dr. Nielsen, I guess you're next.\n    Dr. Nielsen. So I would love it if every one of our \nassets--and I mentioned there are about 300,000 globally--had \ntheir own digital twin, so every aircraft engine had its own \ndigital twin. A digital twin is a computer model that when the \nasset is operating, we're collecting data. So imagine an \naircraft engine taking off. As soon as that aircraft engine \ntakes off, we pull the data back from the aircraft engine and \nwe update the computer model. That computer model becomes a \ndigital twin of the physical asset. If every one of our \n300,000-plus assets had a digital twin, we'd be able to know \nwith very good precision when it needed to be maintained, when \nit needed to be pulled off wing, what kind of repairs when it \nwent to a repair shop, what kind of repairs need to occur.\n    Mr. Marshall. You can do that with satellites and a whole \nbunch of things.\n    Dr. Nielsen. We can pull back data from a whole variety of \ndifferent pathways. It's then utilizing that data in the most \nefficient way, which we use machine learning and AI-type \ntechnologies----\n    Mr. Marshall. Maybe get internet to rural places by doing \nthat, right?\n    Dr. Nielsen. Yes.\n    Mr. Marshall. Okay. We better go on. Dr. Yelick?\n    Dr. Yelick. 
So I think one of the biggest challenges is \nunderstanding the microbiome and being able to use that \ninformation about the microbiome in both health applications \nand agriculture, in engineering, materials, and other areas.\n    So I think that we already know that your microbiome, your \nown personal microbiome is associated with things like obesity, \ndiabetes, cardiovascular disease, and many other disorders. We \ndon't understand it as well in agriculture, but we're looking \nat things like taking images of fields, putting biosensors into \nthe fields and putting all this information together to \nunderstand how to make--to improve the microbiome to improve \ncrop yield and reduce other problems. So I think it's about \nboth understanding and controlling the microbiome, which is a \nhuge computational problem.\n    Mr. Marshall. Okay. Dr. Kasthuri?\n    Dr. Kasthuri. The thing I would really like to have done in \n11 years is understand how brains learn. And actually it \nreminds me of something that I should've said earlier about the \ndifferences between artificial intelligence, machine learning, \ndeep learning, and how brains learn. The main difference is \nthat for many of these algorithms you have to provide them \nthousands of examples, millions of examples, billions of \nexamples before they can then produce inferences or predictions \nthat are based on those examples.\n    For those of you with children, you know that that's not \nthe way children learn. They can learn in one example. They can \nlearn in half an example. Sometimes I don't even know where \nthey're learning these things. And when they learn something, \nthey learn not only the very specific details of that thing, \nthey can immediately abstract it to a bunch of other examples.\n    For me, this happened with my son the first time he learned \nwhat a tiger was. 
An image of a tiger he could see, and then as \nsoon as he learned that, he could see a cartoon of a tiger, he \ncould see a tiger upside down, he could see the back of a tiger \nor the side of a tiger, and from the first example be able to \ninfer, learn all of these other general applications.\n    If in 11 years we could understand how the brain does that \nand then reverse engineer that into our algorithms and our \ncomputers and robots, I suspect that will influence our GDP in \nways that we hadn't yet imagined.\n    Mr. Marshall. Okay. Thank you so much. I yield back.\n    Chairman Weber. I thank the gentleman.\n    The gentleman from the great State of Texas is recognized.\n    Mr. Veasey. Thank you, Mr. Chairman.\n    Dr. Rollett, am I pronouncing that right?\n    Dr. Rollett. It'll do.\n    Mr. Veasey. Okay. In your testimony you talk about the huge \namounts of data that are generated by experiments using light \nsources to examine the processes involved in additive \nmanufacturing. You also highlight the need for more advanced \ncomputing algorithms to help researchers extract information \nfrom this data. And you state that we are essentially building \nthe infrastructure for digital engineering and manufacturing. I \nwas hoping that you'd be able to expand on that a little bit \nand tell us also what are the necessary components of such \ninfrastructure.\n    Dr. Rollett. Right. So one of the things that I didn't have \ntime to talk about is where does the data go? And so, you know, \none's generating terabytes, the standard story is you go to a \nlight source, you do an experiment, all of that data has to go \non disk drives, and then you literally carry the disk drives \nback home. So despite the substantial investments in the \ninternet and the data pipe so to speak, from the perspective of \nan experiment, it's still somewhat clumsy. 
So even that \ninfrastructure could do with some attention.\n    It's also the case that the algorithms that exist have been \ndeveloped for a fairly specialized set of applications. So, you \nknow, the deep-learning methods, they exist, and what we're \ndoing at the moment is basically borrowing them and applying \nthem everywhere that we can. But, in other words, we haven't \ngone very far with developing the specialized techniques or the \nspecialized applications.\n    So even that little movie that I showed, to be honest, I \nmean, the furthest that we've got is doing very basic analysis \nso far, and we actually need cleverer, more sophisticated \nalgorithms to analyze all of that information that's latent in \nthose images. I know that sounds like I'm not doing my job, \nbut, I'm just trying to get some idea across of the challenges \nof taking techniques that have been worked up and then taking \nthem to a completely different domain and doing something \nworthwhile.\n    Mr. Veasey. I was also hoping that you'd be able to \ndescribe the progress your group has made in teaching computers \nto recognize different kinds of metal power--powders using----\n    Dr. Rollett. Powders.\n    Mr. Veasey. --additive manufacturing. I think that you----\n    Dr. Rollett. Right.\n    Mr. Veasey. --go on to say that these successes have the \npotential to impact improvements to materials, as well as the \ngeneration of new materials. And I hope--was hoping you could \ntalk about that a little bit more and for the ability of a \ncomputer to recognize different types of metal and improvements \nto materials and how that can impact the development of new \nmaterials.\n    Dr. Rollett. So thank you for the question. So I was trying \nto think of a powder--I mean, think of talcum powder or \nsomething like that. You spread some on a piece of paper and \nyou look at it and you think, well, that powder looks much like \nany other powder. 
It looks like something you would use in the \ngarden or whatever. So the point I'm trying to get across is \nthat when you take these pictures of these materials, one \nmaterial looks much like another. However, when you take \npictures with enough resolution and you allow these machine-\nlearning algorithms to work on them, then what you discover is \nthey can see differences that no human can see.\n    So it turns out that you can use the computer to \ndistinguish powders from different sources, different \nmaterials, so on and so forth. And that's pretty magic. That \nmeans that you can again, if you're a company and you're using \nthese powders, you can detect whether you've got--you know, if \nsomebody's giving you what's supposed to be the same powder, \nyou can analyze it and say, no, it's not the same powder after \nall. So there's considerable power in that.\n    Another example is things break, they fracture, and you \nmight be surprised, but there's quite a substantial business in \nanalyzing failures. You know, bicycles break and somebody has \nto absorb the liability. Bridges crack; somebody has to deal \nwith that. Well, that's another case where the people involved \nlook at pictures of these fracture surfaces and they make \nexpert judgments.\n    So one of the things we're discovering is that we can \nactually, again, use some of the computer vision techniques to \nfigure out if this fracture is a different kind of fracture or \nthis is a different fatigue failure that's occurred. Again, \nit's magic. It opens up--not eliminating the expert, not at \nall. The analogy is with radiography on cancers. It's helping \nthe experts to do a better job, to do a faster job, to be able \nto help the people that they're working for.\n    Mr. Veasey. Thank you very much. I appreciate that.\n    And, Mr. Chairman, I yield back.\n    Chairman Weber. Thank you, sir.\n    The gentlelady from Arizona is now recognized.\n    Mrs. Lesko. Thank you, Mr. 
Chairman.\n    I have to say this Committee is really interesting. I learn \nabout all types of things and people studying the brains. I \nthink we're going to hear about flying cars sometime soon, \nwhich is exciting. I'm from Arizona, and the issues that are \nreally big in my district, which are the suburbs of Phoenix \nmostly, are actually national security and border security. And \nwe have two border ports of entry connecting Mexico and \nArizona, and I have the Luke Air Force Base in my Congressional \ndistrict. And so I was wondering if you had any ideas how \nmachine learning, artificial intelligence are being used in \nborder security and national security. If you have any \nthoughts?\n    Dr. Yelick. Well, I can say generally speaking that in \nnational security, like in science, you're often looking for \nsome signal, some pattern in very noisy data. So whether you're \nlooking at telephones or you're looking at some other kind of \ncollected information, you are looking for patterns. And \nmachine learning is certainly used in that.\n    I'm not aware in border security of the current \napplications of machine learning. I would think that things \nlike face-recognition software would probably be useful there, \nand I just don't know of the current applications.\n    Dr. Nielsen. So I know some of the colleagues at our \nresearch center are exploring things like security, using \nfacial recognition but trying to take it a step further, so \nusing principles of machine learning, et cetera, trying to \ndetect the intent of a person. So they'll use computer vision, \nthey'll watch a group of individuals but try to infer, make \ninferences about the intent of what that group is doing. Is \nthere something going to happen? Who is in charge of this \ngroup? What are they trying to do?\n    And they're working with the Department of Defense on many \nof these applications. 
And I think there's going to be \ntremendous breakthroughs where artificial intelligence and \nmachine learning are going to help us not only recognize people \nbut also trying now to recognize the intent of what that person \nis trying to do.\n    Dr. Rollett. And you mentioned an Air Force Base, so \nsomething that maybe not everybody's aware of is that the \nmilitary operates very old vehicles, and they have to repair \nand replace a lot. And that means that manufacturing is not \njust a matter of delivering a new aircraft; it's also a matter \nof how you keep old aircraft going. I mean, think of the B-52s \nand how old they are.\n    And so there are very important defense applications for \nmachine learning, for manufacturing, and manufacturing in the \nrepair-and-replace sense. And again, when you're running old \nvehicles, you're very concerned about outliers, which hasn't \ncome up very much so far today, but taking data and recognizing \nwhere you've got a case that's just not in the cloud, it's not \nin with everybody else and figuring out what that means and how \nyou're going to deal with it.\n    Mrs. Lesko. Anyone else? There's one person left.\n    Dr. Kasthuri. Of course, yes. It's me. So of course my work \ndoesn't deal directly with either border security or national \nsecurity, but just to echo one other sentiment, one of the \nthings I'm interested in is that, as our cameras get faster, \ninstead of taking 30 shots per second, we can now take 60 shots \nper second, 90 shots per second, 120 frames per second usually, \nand you start watching people's facial features as they are \njust engaging in normal life. 
It turns out that we produce a \nlot of microfacial features that happen so fast and so quick \nthat they often aren't detected consciously by each other but \nconvey a tremendous amount of information about things like \nintent and et cetera.\n    I suspect that, as our technology, as our cameras get \nbetter and of course if you take 120 pictures in a second \nversus 30 pictures in a second, that's already four times more \ndata that you're collecting per second. If we can deal with the \ndata and get better cameras, we will actually be making \ninferences about intentions sooner rather than later.\n    Mrs. Lesko. Very interesting. I'm glad that you all work in \nthese different fields.\n    And I yield back my time, Mr. Chairman.\n    Chairman Weber. Thank you, ma'am.\n    The gentleman from Illinois, Mr. Foster, is recognized.\n    Mr. Foster. Thank you, Mr. Chairman. And thank you to our \nwitnesses.\n    And, let's see, I guess I'll start with some hometown \ncheerleading for Argonne National Lab, which--and I find it \nquite remarkable. Argonne lab has been--they've come out to \nevents that we've had in my district dealing with the opioid \ncrisis, I find it incredible that one single laboratory--we \nhave everything from using the advanced photon source and its \nupgrades to directly image what are called G-coupled protein \nreceptors at the very heart of the chemical interaction with \nthe brain all the way up through modeling the high-level \nfunction of the brain, the Connectome, and everything in \nbetween. And it's really one of the magic things that happens \nat Argonne and at all of the--particularly the multipurpose \nlaboratories, which are really gems of our country.\n    Now, one thing I'd like to talk about--and it relates to \nbig data and supercomputing--is that you have to make a bunch \nof technological bets in a situation where the technology is \nchanging really, really rapidly. 
You know, for example, you \nhave the choice of--for the data pipes, you can do \nconventional, very wide floating point things for partial \ndifferential equations and equations of state, things like \nthat, the way supercomputing has been done for years, and yet \nthere's a lot of movement for artificial intelligence toward \nmuch narrower data paths, you know, 8 bits or even less or 1 \nbit if you're talking about simulating the brain firing or not.\n    You know, you have questions on the storage where you can \nhave--classically, we have huge external data sets, you know, \nlike the full geometry of the brain that you will then use \nsupercomputing to extract the Connectome. Or now we're seeing \nmore and more internally generated data sets like these are \ngames playing each other where you just generate the data, \nthrow it away. You don't care about storage at all. Or \nsimulation of billions of miles of driving where that data \nnever has to be stored at all, and so that really affects the \nhigh-level design of these machines.\n    In Congress, we have to commit to projects, you know, on a \nsort of five-year time cycle when every six months there are \nnew disruptive things. We have to decide are these largely \ngoing to be front ends to quantum computing or not? And so how \ndo you deal with that sort of, you know, internally in your \nplanning? And should we move more toward the commercial model \nof move fast, take risks, and break things, or do we have--are \nour projects that we have to approve in Congress things that \nhave to have no chance of failing? And do you think Congress is \ntoo far on one side or the other of that tradeoff?\n    Dr. Yelick. I guess as a computer scientist maybe I'll \nstart here and I would say that you've asked a very good \nquestion. 
I think this issue of risk and technology is very \nimportant, and we do need to take lots of risks and try lots of \nthings, especially right now as not only are processors not \ngetting any faster because of the end of Dennard scaling, but \nwe're facing the end of Moore's law, which is the end of \ntransistors getting denser on a chip. And we really need to try \na number of different things, including quantum, neuromorphic \ncomputing, and others.\n    The issue of even the design of computers, if we look at \nthe exascale computing program, very important. Of course, the \nfirst machine targeted for Argonne National Lab is in 2021, and \nthe process that is really fundamental to the exascale project \nis this idea of codesign, that is, bringing together people who \nunderstand the applications like Tony and with the people that \nunderstand the applied mathematics, and people that understand \nthe computer architecture design.\n    And the exascale program is looking at both applying \nmachine-learning algorithms for things like the Cancer \nInitiative, as well as the microbiome where you also have these \nvery tiny datatypes, only four characters that you can store in \nmaybe two bits, and putting all of that together. So those \nmachines are being codesigned to try to understand all those \ndifferent applications and work well on the traditional high-\nperformance simulation applications, as well as some of these \nnew data-analysis problems.\n    To answer your question directly, I think that, if \nanything, that project is very focused on that goal of 2021, \nand some other machines will come after that in '22 and '23. \nAnd the application--so it's not just about delivering the \nmachines; it's about delivering 25 applications that are all \nbeing developed at the same time to run on those machines.\n    It is a very exciting project. I actually lead the \nmicrobiome project in exascale, and I think it's a great amount \nof fun. 
But it is a project that doesn't have much room for \nrisk or basic research, and so I do think it's very important \nto rebuild the fundamental research program, for example, the \nDepartment of Energy to make sure that ten years from now we \ncould have some other kind of future program that we would have \nthe people that are trained in order to answer those basic \nquestions and figure out how to build another computing device \nof some kind.\n    Mr. Foster. Well, yes, thank you. That was a very \ncomprehensive answer. But if you could just in my last one \nsecond here just sort of--do you think Congress is being too \nrisk-averse in our expectations or, you know, should we be more \nrisk-tolerant that allow you occasionally to fail because you \nmade a technological bet that is--you know, that has not come \nthrough?\n    Dr. Yelick. You know, I think I'll answer that from the \nscience perspective. As a scientist, I absolutely want to be \nable to take risks and I want to be able to fail. I think the \nCongressional question I will leave to you to debate.\n    Mr. Foster. Thank you. I yield back.\n    Chairman Weber. Thank you.\n    The gentleman from California, Mr. Rohrabacher, is \nrecognized.\n    Mr. Rohrabacher. Thank you very much, Mr. Chairman.\n    I wanted to get into some basics here. This is for the \nwhole panel. Who's going to be put out of work because of the \nchanges that you see coming as we do what's necessary to fully \nunderstand what you're doing scientifically? Who's going to be \nput out of work?\n    Dr. Rollett. I hope very much that nobody's going to be put \nout of work.\n    Mr. Rohrabacher. Oh, you've got to be kidding. I mean, \nwhenever there's a change for the better, I mean, otherwise, \nwe'd have people working in----\n    Buggy whips would still be----\n    Dr. Rollett. Yes. I think the point here is to sustain \nAmerican industry at its most sophisticated and competitive \nlevel.\n    Mr. Rohrabacher. 
What professions are going to be losing \njobs? You're making me--I mean, everybody's afraid to say that. \nCome on, you know?\n    Dr. Rollett. I would say they've mostly been lost. I mean, \nif you look at steel mills, we have steel mills. They used to \nrun with 30,000 people.\n    Mr. Rohrabacher. Right.\n    Dr. Rollett. That's why the population of Pittsburgh was so \nlarge years ago, right? It's decreased enormously----\n    Mr. Rohrabacher. Okay. Well, where can we expect that in \nthe future from this new technology or this new understanding \nof technology? Anybody want to tell me?\n    Dr. Kasthuri. I have a very quick----\n    Mr. Rohrabacher. Don't be afraid now.\n    Dr. Kasthuri. I have a very quick answer. Historically, a \nlot of science is done on getting relatively cheap labor to \nproduce data and to analyze data, by that I mean graduate \nstudents, postdoctoral fellows, young assistant professors, et \ncetera. I suspect----\n    Mr. Rohrabacher. So they're not going to be needed \nprobably?\n    Dr. Kasthuri. Well, I suspect that they should still be \ntrained but then perhaps that they won't be used specifically \nin just laboriously collecting data and analyzing data.\n    Mr. Rohrabacher. Okay. So let's go through that. Where are \nthe new jobs going to be created? What new jobs will be created \nby the advances that you're advocating and want us to focus \nsome resources on?\n    Dr. Kasthuri. I'm hoping that when the people who are \ntrained in science no longer have to do all of that work, they \ndo--they then expand into other fields that could use \nscientific education like the legal system or Congress.\n    Mr. Rohrabacher. But what specifically can we look at, say, \nthat will remind Congressmen always to turn off the ringer even \nwhen it's their wife? Now, I'm in big trouble, okay? Tell me--\nso, what jobs are going to be created? What can we expect from \nwhat your research is in the future? 
Do you have a specific job \nthat you can say this--we're going to be able to do this, and \nthus, people will have a job doing it?\n    Dr. Yelick. Well, I think there will be a lot more jobs in \nbig data and data analysis and things like that and more \ninteresting jobs I think going along with what was already \nsaid, that it's really about replacing--so if we replace taxi \ndrivers with self-driving cars that eliminates a certain class \nof jobs but it'll----\n    Mr. Rohrabacher. Okay. Well, there you go.\n    Dr. Yelick. Right, but it allows people to then spend their \ntime doing something more interesting such as perhaps analyzing \nthe future of the transportation system and things like that.\n    Mr. Rohrabacher. Well, but taxicab driver--finally, I got \nsomebody to admit somebody's going to be hurt and going to have \nto change their life. And let me just note that happens with \nevery bit of progress. Some people are left out and they have \nto form new type of lifestyles, and we need to understand that. \nMaybe we need to prepare for it as we move forward.\n    What diseases do you think that--especially when we're \ntalking about controlling things that are going on in the human \nmind, what diseases do you think that we can bring under \ncontrol that are out of control now? Diabetes, obviously has \nsomething to do with the brain is telling the body what to do, \ndifferent--maybe even cancer? What diseases do you think that \nwe can have a chance of curing with this?\n    Dr. Kasthuri. I think there's a range of neurological \ndiseases that obviously we'll be able to do a better job curing \nor ameliorating once we understand the brain. These range from \nneurodegenerative diseases like Alzheimer's and Parkinson's to \nmore mental illness, psychiatric illnesses and to even early \ndevelopmental diseases like autism. I think all of these will \nabsolutely be benefited by a better understanding----\n    Mr. Rohrabacher. 
Then if we can control the way the brain \nis functioning, the maladies that you're suffering like I say \ndiabetes and et cetera, that maybe we can tell the brain not to \ndo that and once we have that deeper understanding.\n    One last question. I got just a couple seconds. I remember \n2001 Hal got out of control and tried to kill these people. And \nElon Musk is warning us. I understand somebody's already \nbrought that up. But if we do end up with very independent-\nminded robots, which is what I think we're talking about here, \nwhy shouldn't we think of that as a potential danger, as well \nas a potential asset? I mean, Elon Musk is right in that.\n    Dr. Rollett. Well, I was going to throw in that I think one \nopportunity would be in health care and for example, the use of \nrobots as assistants, so not replacing people but having robots \nhelp them. Well, those robots have to be programmed, they have \nto be built.\n    Mr. Rohrabacher. Right.\n    Dr. Rollett. I mean, there's a huge infrastructure that we \ndon't have.\n    Mr. Rohrabacher. Yes, but if you were building robots that \ncan think independently, who knows--you know, and they're \nhelping us in the hospitals or wherever it is, what if Hal gets \nout of control?\n    Dr. Rollett. Right, right. So I think AI is being discussed \nmostly in the context of how do you do something? How do you \nmake something work? When it comes to what these machines \nactually do, you also need supervision. And what I think we \nhave to do is to build in AI that addresses control and \nevaluation, you know, the equivalent of the little guy on your \nshoulder saying don't do that; you're going to get into \ntrouble. So you need something like that, which I haven't heard \npeople talk about much.\n    Mr. Rohrabacher. Okay. Well, thank you very much, Mr. \nChairman. I yield back.\n    Chairman Weber. You've been watching too many \nSchwarzenegger films.\n    Mr. Rohrabacher. That's true.\n    Chairman Weber. 
The gentleman yields back and, Mr. \nMcNerney, you're recognized for five minutes.\n    Mr. McNerney. I thank the Chairman. And I apologize to the \npanel for having to step in and out in the hearing so far.\n    Mr. Nielsen, I'm a former wind engineer. I spent about 20 \nyears in the business. And I understand that the digital twin \ntechnology has allowed GE to produce--to increase production by \nabout 20 percent. Is that right?\n    Dr. Nielsen. About five percent on an average wind turbine, \nyes.\n    Mr. McNerney. Five percent?\n    Dr. Nielsen. Five percent, which is pretty amazing when you \nthink we're not switching any of the hardware. It's just making \nthat control system on a wind turbine much smarter using a----\n    Mr. McNerney. And five percent is believable.\n    Dr. Nielsen. Five percent----\n    Mr. McNerney. Twenty percent for the wind farm----\n    Dr. Nielsen. No--yes, it's five percent for----\n    Mr. McNerney. Okay. Okay. I can believe that. As Chair of \nthe Grid Innovation Caucus, I'm particularly interested in \nusing new technology to create a smarter grid. We have things \nlike the duck curve that are affecting the grid. How can all \nthis technology improve grid stability and reliability and \nefficiency and so on?\n    Dr. Nielsen. Yes, so we're now embarking on research for \nunderstanding how to better integrate disparate power sources \ntogether in regional, so imagine us trying to use AI machine \nlearning, say, okay, I have a single combined-cycle power \nplant. How do I better optimize the efficiency of it, produce \nless emissions, use less fuel, allow more profit from it? But \nwe're taking that now a step further and saying how do I then \nlook regionally and integrating not only that combined-cycle \npower plant but the solar farm, the wind farm, et cetera? How \ndo I balance that and optimize at a grid-scale level versus \njust a microscale level?\n    So that's some of the research that's ongoing now. 
We're \ncontinuing to work on it. But that's our plan is to better \nfigure out that macroscale optimization problem.\n    Mr. McNerney. So, I mean, once you get that figured out, \nthen you need to have some sort of a SCADA or control system \nthat can dispatch and----\n    Dr. Nielsen. Yes, correct.\n    Mr. McNerney. Okay. So that's another product for GE or for \nthe other----\n    Dr. Nielsen. Yes. Correct.\n    Mr. McNerney. Okay.\n    Dr. Nielsen. We're figuring out how to not only build those \noptimization routines but how to then put them in what we call \nedge devices, the SCADA systems, the----\n    Mr. McNerney. Sure.\n    Dr. Nielsen. --unit control systems, et cetera. So it's not \nonly trying to figure out the algorithm but making sure that \nalgorithm can execute in a timescale that can be put into some \nof these, as you mentioned, SCADA systems and control systems.\n    Mr. McNerney. Okay. Well, with the digital ghost, the--a \npower plant can replicate an industrial system and the \ncomponent parts for cyber vulnerability. Is that right?\n    Dr. Nielsen. So we use digital ghost at what we call the \ncyber physical layer. So imagine having a digital twin of a gas \nturbine. So that digital twin tells us how that gas turbine is \nbehaving and should behave. We then compare to what signal is \nbeing generated, what sensors are being--signal's been \ngenerated, and we compare that behavior and say that behavior \ndoesn't look right. Our digital twin says something's not \ncorrect. The thermodynamics aren't correct.\n    Mr. McNerney. Well, I mean, I can see that for mechanical--\n--\n    Dr. Nielsen. Yes.\n    Mr. McNerney. --systems. What about cyber?\n    Dr. Nielsen. So what we're doing is we're not applying it \nat sort of the network layer. We're not watching network \ntraffic. 
We're actually looking at the machine level and \nunderstanding if the machine is behaving as it should be given \nthe inputs, the control signals, as well as the outputs, the \nsensors, et cetera. Some recent attacks look at replicating \nsensors----\n    Mr. McNerney. So the same sort of behavior characteristics \nare going to be monitored--can tell you whether or not there's \na cyber issue or some other sort of mechanical failure----\n    Dr. Nielsen. Yes.\n    Mr. McNerney. --impending?\n    Dr. Nielsen. Perfect. It's a----\n    Mr. McNerney. Very good.\n    Dr. Nielsen. It's an anomaly detection scheme, yes.\n    Mr. McNerney. Dr. Yelick, thank you for coming. And I \nvisited your lab a number of times. It's always a pleasure to \ndo so. I think you guys are doing some really good work out \nthere.\n    One of the things that was striking was the work you did on \nexascale computing, simulating a San Francisco earthquake and \nhow striking that is. Do you think we have the collective use--\nhave we collectively used this information to harden our \nsystems, to harden our communities against an earthquake, or is \nthat something that is yet to happen?\n    Dr. Yelick. That's something that is yet to happen. We're \njust starting to see some of this very detailed information \ncoming from the simulations. And as I mentioned earlier, even \nbringing in more detailed data into the simulations to give you \nbetter geological information about the stability of a certain \nregion or even a certain local area, a city block or whatever, \nand using that information is not something that is happening \nyet but obviously should be.\n    Mr. McNerney. This is sort of a rhetorical question but \nsomebody can answer it if you feel like. I know we hear about \nthe social challenges of digital technology and AI and big \ndata, you know, in terms of job displacement. Does AI tell us \nanything about that, about how we should respond to this \ncrisis?\n    Dr. Yelick. 
I don't know of any studies that have used AI \nto do that. People do use AI to understand the market, \neconomics, and things like that, and I'm sure that people are \nusing large-scale data analytics of various kinds, and they \ncertainly are to understand changes in jobs and what will \nhappen with them.\n    It is, by the way, a very active area of discussion within \nthe computer science community about both the ethics, which you \nheard about I think at previous hearing of AI, but also the \nissues of replacing jobs.\n    Mr. McNerney. Sure. Dr. Rollett?\n    Dr. Rollett. If I might jump in, I would encourage you to \nthink about supporting research in policy and even social \nscience to address that issue because AI displacing people is \nabout education, it's about retraining, it's about how people \nbehave. So we scientists are really at sort of the front end of \nthis, but there's a lot of implications that are much broader \nthan what we've talked about this morning.\n    Mr. McNerney. All right. Thank you. Mr. Chairman, I yield \nback.\n    Chairman Weber. Thank you, sir.\n    The gentleman from Florida, Dr. Dunn, is recognized.\n    Mr. Dunn. Thank you very much, Chairman Weber.\n    And I want to add my thank you to the panel and underscore \nmy personal belief in how important all of your work is. I've \nvisited Dr. Bobby Kasthuri's lab, a great fan of your work and \nyour energy level. Dr. Yelick, we'll be visiting you in the \nnear future, so that'll be fun, too.\n    I want to focus on the niche in big computing, which is \nartificial intelligence, and I apologize I missed that hearing \nearlier, but it was near and dear to my heart.\n    I think we all see many potential benefits of artificial \nintelligence, but there are some potential problems, and I \nthink it serves us to face those as we're having this virtual \nlovefest for artificial intelligence. You know, and we've known \nthis since at least the '60s. 
I mean, the Isaac Asimov robotic \nnovels and the robotic laws, the Three Laws of Robotics, which \nI have in my printout, the copies of in case anybody doesn't \nremember them. I bet this group does.\n    But what I want to do is--I also, by the way, was looking \nfor guides for artificial intelligence and I came up with the \n12 Boy Scout laws, too, so I don't know how that--so I want to \noffer some quotes and then get some thoughts from you, and \nthese are quotes from people who are recognizably smart people. \nStephen Hawking said, ``I think the development of artificial \nintelligence could spell the end of the human race.'' Elon \nMusk, quoted several times here, said, ``I think we should be \nvery careful about artificial intelligence. If I were to guess \nwhat our biggest existential threat is, it's probably that.'' \nBill Gates responded, ``I agree with Elon Musk and I don't \nunderstand why people are concerned.''\n    And then finally, Jaan Tallinn, one of the inventors of \nSkype, said with ``strong and artificial intelligence, planning \nahead is a better strategy than learning from mistakes.'' And \nwent on to say, ``It really sucks to be the number-two \nintelligent species on the planet; just ask the gorillas.''\n    So in everybody's handout you have a very brief summary of \na series of experiments run at MIT on artificial intelligence. \nThe first one was named Norman, which was an artificial \nintelligence educated on biased data, not false data but biased \ndata and turned into a deeply sociopathic intelligence. There \nwas another one Tay, which was really just an artificial \nintelligence Twitterbot, which they turned loose into the \ninternet, and I think it wasn't the intention of the MIT \nresearchers, but people engaged with Tay and tried to provoke \nit to say racist and inappropriate things, which it did. And \nthere are some other experiments from MIT as well.\n    So I want to note, like Dr. 
Kasthuri, I have sons that are \nmore clever than I, but they are not virtual supermen, nor do \nthey operate at the speed of light, so, you know, there's ways \nof working with them. I'm not so sure about that with \nartificial intelligence.\n    My question first, what are the implications of a future \nwhere black-box machine learning, the process can't even be \ninterpreted? You know, once it gets several layers in, we can't \ninterpret it. What's the implications today on that to you, Dr. \nKasthuri and Dr. Yelick, if I could?\n    Dr. Kasthuri. Congressman Dunn, thank you for the kind \nwords to start. And I actually suspect there is a reasonable \nconcern that the things that we develop in artificial \nintelligence are different than the other things like our \nchildren because their ability to change is at the speed of \ncomputers as opposed to the speed of our own. So I agree that \nthere's legitimate cause for concern.\n    I suspect that we will have to come up with lessons and \nsafeguards the same way that we've done with every existential \ncrisis: the discovery of nuclear energy, the application to \nnuclear weapons. As humans, we do have some history of living \non the edge and figuring out how to get the benefit of \nsomething and keep the risk at bay.\n    You're right that if algorithms can change faster than we \ncan think, our existing previous historical safeguards might \nnot work.\n    To the specific question that you asked about the non-\ninterpretability, for me, without knowing what the algorithm is \nproducing, how do you innovate? 
If you don't know the \nfundamental nature of what the algorithm is--its principles for \nhow it comes to a conclusion, I worry that we won't be able to \ninnovate on those results.\n    And this is interestingly perhaps as a thought exercise: \nWhat if a machine-learning algorithm could tell me--could \nmake--could collect enough data to make a prediction about a \nbrain, about your brain or someone else's brain that was \nincredibly accurate? Would we at that moment care how that \nmachine-learning algorithm arrived at its conclusion? Or would \nwe at that moment take the results that the algorithm produces \nand just go on with it, in which case there could be a missed \nopportunity for learning something deeply fundamental and \nprincipled about the brain.\n    Mr. Dunn. And very quickly, Dr. Yelick.\n    Dr. Yelick. Well, I agree with that. I think that these \ndeep learning algorithms which have these multiple layers, \nwhich is why they're deep, they have millions perhaps of \nparameters inside of them. And we don't really understand when \nyou get an answer out why all these parameters put together \ntell you that that's a cat and this one's not a cat. And so \nthat may be okay if we're trying to figure out where to place \nads as long as we give it unbiased data about where to place \nthe ads so the right--so----\n    Mr. Dunn. But it might be more problem if it was flying a \ndrone swarm on attack some place?\n    Dr. Yelick. Well, where it's a problem is if I'm a \nscientist, I want to understand why. It's not enough to say \nthere's a correlation between these two things. And if the, you \nknow, drone is flying in the right place, that's really \nprobably the most important thing about some kind of a \ncontrolled vehicle. But in science, you want to----\n    Mr. Dunn. We're dangerously close to being way, way, way \nover time, so I better yield back here, Mr.--thank you very \nmuch, though. I appreciate the chance.\n    Chairman Weber. All right. 
The gentlelady from Nevada, Ms. \nRosen, is recognized.\n    Ms. Rosen. Thank you. I want to thank you for one of the \nmost interesting, informative, and I want to say this is on the \nbleeding edge of everything that we need to worry about for \nsure.\n    But one thing we haven't talked about is data storage. And \ndata storage specifically is critical infrastructure in this \ncountry, right, because we have tons and tons of data \neverywhere, and where it goes and how we keep it is going to be \nof utmost importance.\n    And so I know that we're trying to focus on that in the \nfuture, and in my district in Nevada we have a major data \nstorage company. It has state-of-the-art reliability. We have \nlots of quality standards to ensure its data is secure, but \nlike I said, we don't consider it critical infrastructure.\n    So right now in this era of unprecedented data breaches, \ndata hacks, every moment they are just pounding on us, in your \nview what are--the data storage centers that house the \ngovernment and private sector, where are their vulnerabilities \nand what are the implications? How should we be sure that we \nclassify them as critical infrastructure?\n    Dr. Yelick. So, clearly, those data centers are storing \nvery important information that should be protected. And, as \nyou said, even at the computing centers that we run in the \nlabs, there's a constant barrage of attacks, although we store \nat NERSC the center at Berkeley lab only scientific data, so it \nis not really critical data. I think that using these kinds of \nmachine-learning techniques to look for patterns is one of the \nbest mechanisms we have to prevent attack, and they do have to \nlearn from these patterns in order to figure out what is--and--\nwhat is abnormal behavior. 
And we're looking at--as we build \nout the next network, even kind of embedding that information \ninto the network so that you can see patterns of attack even \nbefore they get to a particular data set or a particular \ncomputer system.\n    Ms. Rosen. Thank you. I have one other question. And you \nwere talking about using predictive analytics with a digital \ntwin to talk about fatigue in planes. But how can we use that \nto discuss infrastructure fatigue as we talk about the \ninfrastructure failures around this country in bridges, roads, \nports, et cetera, et cetera? So----\n    Dr. Rollett. That's I think a question of recognizing the \nneed and talking to the agencies and finding out whether you \nconsider there are adequate programs to do that. I'm going to \nguess that there is not a huge amount of activity, but I don't \nknow, so that's why I'm being very cautious in my answer.\n    But I suspect it's one of the opportunity areas. It's an \narea where there is data. It's often rather incomplete, but it \nwould definitely benefit from having the techniques applied, \nthe machine-learning techniques to try to find the patterns, to \ntry to identify outliers, particularly trends that are not \ngood.\n    Ms. Rosen. Thank you.\n    Dr. Nielsen. I would just----\n    Ms. Rosen. Oh, please, yes. Yes.\n    Dr. Nielsen. Oh, I'm sorry. I would just second the \ncomments made. I mean, at GE we obviously focus a lot of our \nattention on the commercial assets that we build, but there's \nno reason the technologies, the ideas that are being applied \nthere could be applied to bridges and infrastructure and all \nthat.\n    Ms. Rosen. Right.\n    Dr. Nielsen. It's just, I think, a matter of will and \npolicy to do that, right?\n    Ms. Rosen. So I--do you think that would be well worth our \ntime here in this Committee to promote those kinds of policies \nor research for you all or someone to do the--use the \npredictive analytics? 
Congresswoman Esty and I sit on some \ninfrastructure committees, and really important that we try to \nfind out points of failure before they fail, right?\n    Dr. Rollett. Absolutely. And I would encourage you to bring \nstate and local government into that discussion because they \noften own a lot of those assets.\n    Ms. Rosen. Yes. Thank you. I yield back my time.\n    Chairman Weber. The gentlelady yields back.\n    The gentlelady from Connecticut is recognized.\n    Ms. Esty. Thank you so much. And this is tremendously \nimportant for this Committee and for the U.S. Congress to be \ndealing with, and we really appreciate you taking the time with \nus today.\n    All of you have mentioned somewhat in passing this critical \nimportance of how are the algorithms structured and how are we \ngoing to embed the values if we have AI moving much faster than \nour brains can function or at least on multiple levels \nsimultaneously?\n    So we did have a hearing last month in talking about this, \nand one of the issues that came up that everyone supported--and \nI'd like your thoughts on that--is the critical importance of a \ndiverse workforce in doing that. If you're going to try to \ntrain AI, it needs to represent the diversity of human \nexperience, and therefore, it can't be like my son who did \ncomputer science in astrophysics. If they all look like that, \nif those are--the algorithms are all being developed by, you \nknow, 26-year-olds like my son Thomas, we're not going to have \nthe diversity of life experience.\n    So, first, if you can quickly--because I've got a couple of \nquestions--thoughts on how do we ensure that? Because we're \nlooking at that issue. We talk about that diverse workforce all \nthe time, but when we're looking at AI and algorithms, it \nbecomes vitally important that we do this. It's not about \nchecking the box to say the Department of Labor that we've got \na diverse workforce. 
This is actually vital to what we need to \ndo.\n    Dr. Yelick. So if I can just comment on that. Yesterday, \nbefore I left UC Berkeley, I gave a lecture to the freshman \nsummer class introductory computing class. My title was rather \nostentatious as ``How to Save the World with Computing.'' What \nI find is that when you talk about the applications of \ncomputing and including data analytics and machine learning and \nreal problems that are societal problems, you tend to bring in \na much more diverse workforce. That class in particular has had \nover 50 percent women and a very good representation at least \nrelative to the norm of underrepresented minorities as well.\n    Ms. Esty. Anyone else who--I mean it--MIT has found that \nwhen they change the title of some of their computer science \nclasses to again be applied in sort of more political and \nsocial realms, they had a dramatic change in terms of \ncomposition of classes.\n    Dr. Nielsen. Yes, I would just quickly build upon that, \ntoo. I think to me when you look at AI and machine learning, \nyou have to have a critical eye. You have to always be looking \nat it. And I think a diverse workforce and diverse experience \ncan help just bring more perspectives to help critically \nquestion why are those algorithms doing what they're doing? \nWhat is the outcomes? How can we improve that? So I would \nsupport that supposition, yes.\n    Dr. Yelick. I'll just mention that the name of the course--\nwhich I was not teaching, by the way, I was giving a guest \nlecture--is ``The Beauty and Joy of Computing,'' so maybe that \nhelps.\n    Ms. Esty. Well, that helps. And if I could have you turn \nagain--and some of you have mentioned the important role of \nfederal research. I mean that's what this Committee is looking \nat, what is uniquely the federal role. 
As you see across the \nboard, there's more and more effort and being engaged and we \nsee it in space research and other places to move into the \nprivate sector with the notion the federal government is not \nvery good at picking winners and losers. So if you can all talk \nabout what you think are the most critical tasks for federal \ninvestment in, say, foundational and basic research that then \nwill be developed by the GE's and others and companies not yet \nformed or conceived of because, again, that's part of our job \nis to figure out--I see it as our job to defend putting those \nbasic research dollars in because we don't know where they're \ngoing to go but we do know they're vital to keep us, whether \nit's competitive or frankly just have better research and more \ncare.\n    Dr. Kasthuri. So perhaps I can go really quick. I suspect \nthat there is a model of funding scientific research that's \nthis idea that if you plant a million seeds in the ground, a \nfew flowers will grow, where individual labs and individual \nscientists have the freedom to judge what is the next important \nquestion to address.\n    And I can see why having the federal government decide the \nnext important question to address might not be the most \nefficient way to push science forward. But where I do see the \nfederal government really playing a role is in the level of \nfacilities and resources, that what I imagine is that the \nfederal government establishes large-scale resources and \nfacilities like the national lab system and then allow \nindividual scientists to promote their individual ideas but \nleveraging the federal resources. And I wonder if this is a \ncompromise between allowing these seeds to grow but the federal \ngovernment--maybe this is appropriate but maybe not--providing \nthe fertilizer for those seeds.\n    Ms. Esty. They think we generate a lot of it at least in \nthis place.\n    Dr. Yelick. 
So I would just add I think the importance of \nfundamental research, as well as the facilities and \ninfrastructure and the applied mathematics, the computer \nscience, statistics, very important in machine learning. And, \nas we said, these machine-learning algorithms have been used a \nlot in nonscientific domains. There's a lot of interest in \napplying them in scientific domains. I think the peer-review \nprocess in science will make machine learning better for \neverybody if we really put a lot of scrutiny on it.\n    Dr. Rollett. And very quickly, I wanted to add that I think \nit's important that program managers in the federal government \nhave some discretion over what they fund and take risks. And \nit's also important that the agencies have effective means of \ngetting community input. And I don't want to name names, but \nsome agencies have far more effective mechanisms for that than \nothers.\n    Ms. Esty. Well, we might want to follow up with that last \npoint.\n    And I wanted to just put out for you to help us with--and \nyou mentioned it, Dr. Yelick, with--on peer review, this \nsystematic--because of pressures to publish or perish and show \nsuccess is we are not sharing the failures, which are \nabsolutely essential for science to make progress. It's one of \nthe issues we've touched on a lot in this Committee. We don't \nhave any good answers, and it's gotten worse because of the \npressures to do--to get grant money and to show progress. But I \nam deeply concerned about those pressures both from the private \nsector and the public sector making it harder for us--people \nhoard the, quote, ``bad results,'' but they're absolutely \nessential for us to learn from them.\n    And so I don't know how we change that dynamic, but I think \nthat is something that we could really use your thoughts on \nthat because whether it's--AI can maybe help us with disclosing \nthe dead ends and we learn from the dead ends and we move \nforward. 
But it is something that we have a big issue with in \nhow we deal with the sharing of the not-useful results, which \nmay turn out to be very useful down the line.\n    Dr. Yelick. I completely agree with that. I think the first \nstep in that is sharing the scientific data and allowing people \nto reproduce the successful results but also, as you said, \nexamine the supposed failures to see--there are many examples \nof this in physics and other disciplines where people go back \nto data that may be 10 or 20 years old and find some new \ndiscovery in it.\n    Ms. Esty. Thank you very much. I really appreciate your \nindulgence to keep us here to the bitter end. Thank you. Not \nthe bitter, not you, just the fact that the bell has rung, and \nwe had a lot of questions for you. We appreciate it. Thank you \nso much.\n    Chairman Weber. After failing 1,000 times for the \nlightbulb, Dr. Edison, his staffer said doesn't that frustrate \nyou? He goes, what are you talking about? We're 1,000 ways \ncloser to success.\n    So I thank the witnesses for their testimony and the \nMembers for their questions. The record will remain open for \ntwo weeks for additional written comments and written questions \nfrom the Members.\n    This hearing is adjourned.\n    [Whereupon, at 12:08 p.m., the Subcommittees were \nadjourned.]\n\n                               Appendix I\n\n                              ----------                              \n\n\n                   Answers to Post-Hearing Questions\nResponses by Dr. Bobby Kasthuri\n\n[GRAPHIC(S) NOT AVAILABLE IN TIFF FORMAT]\n\n\nResponses by Dr. Katherine Yelick\n\n[GRAPHIC(S) NOT AVAILABLE IN TIFF FORMAT]\n\n\nResponses by Dr. Matthew Nielsen\n\n[GRAPHIC(S) NOT AVAILABLE IN TIFF FORMAT]\n\nResponses by Dr. 
Anthony Rollett\n\n[GRAPHIC(S) NOT AVAILABLE IN TIFF FORMAT]\n\n\n                              Appendix II\n\n                              ----------                              \n\n\n                   Additional Material for the Record\n\n\n\n\n            Documents submitted by Representative Neal Dunn\n            \n[GRAPHIC(S) NOT AVAILABLE IN TIFF FORMAT]            \n\n\n                                 <all>\n</pre></body></html>\n"