[Federal Register Volume 90, Number 241 (Thursday, December 18, 2025)]
[Notices]
[Pages 59131-59135]
From the Federal Register Online via the Government Publishing Office [www.gpo.gov]
[FR Doc No: 2025-23246]


-----------------------------------------------------------------------

DEPARTMENT OF HEALTH AND HUMAN SERVICES

National Institutes of Health


NIH Controlled-Access Data Policy and Proposed Revisions to NIH 
Genomic Data Sharing Policy

AGENCY: National Institutes of Health, HHS.

ACTION: Notice.

-----------------------------------------------------------------------

SUMMARY: The National Institutes of Health (NIH) is requesting public 
input on its proposal to establish harmonized and transparent policy 
requirements for protecting human participant research data. 
Specifically, NIH proposes (1) establishing policy requirements for 
which data should be controlled-access under NIH data sharing policies, 
and (2) revising the NIH Genomic Data Sharing Policy to simplify and 
harmonize requirements.

DATES: To ensure consideration, comments must be submitted in writing 
by March 18, 2026.

ADDRESSES: Comments may be submitted electronically to https://osp.od.nih.gov/comment-form-draft-nih-controlled-access-data-policy-and-proposed-revisions-to-nih-genomic-data-sharing-policy/.
    Comments are voluntary and may be submitted anonymously. You may 
also voluntarily include your name and contact information with your 
response. Other than your name and contact information, please do not 
include in the response any personally identifiable information or any 
information that you do not wish to make public. Proprietary, 
classified, confidential, or sensitive information should not be 
included in your response. After the NIH Office of Science Policy (OSP) 
has finished reviewing the responses, the responses may be posted to 
the OSP website without redaction.

FOR FURTHER INFORMATION CONTACT: Taunton Paine, Director, Division of 
Scientific Data Sharing, NIH Office of Science Policy, at 
(301) 496-9838 or [email protected].

SUPPLEMENTARY INFORMATION:

Background

    NIH serves as the steward of a wide range of research data and 
continuously works to optimize open sharing with appropriate 
protections throughout the entire data lifecycle. Given its numerous 
established data policies, NIH is proposing a holistic update to its 
data policy framework to strengthen data protections, clarify 
requirements, and reduce duplicative burden.
    First, NIH is proposing a new NIH Controlled-Access Data Policy to 
support the research community in fulfilling NIH data sharing 
expectations. This proposed policy specifies human participant data 
types required to be managed via controlled-access and provides 
criteria for assessing the need for controls for other data types. It 
also provides a standard set of expectations across NIH Institutes, 
Centers and Offices to promote maximal responsible human participant 
data sharing 
through controlled access while simultaneously responding to emergent 
privacy and security risks, including those outlined in the following 
security directives:
    1. Executive Order 14117 and the Department of Justice's final rule 
``Preventing Access to Americans' Bulk Sensitive Personal Data and 
United States Government-Related Data by Countries of Concern or 
Covered Persons'' 28 CFR part 202 (see: https://www.federalregister.gov/documents/2025/01/08/2024-31486/preventing-access-to-us-sensitive-personal-data-and-government-related-data-by-countries-of-concern) identifying specific data types, along with 
associated thresholds, that should be protected to mitigate national 
security risks.
    2. The Consolidated Appropriations Act, 2023, requiring updates to 
genomic data sharing policies and practices to account for national 
security risks, Public Law No. 117-328 (see: https://www.congress.gov/bill/117th-congress/house-bill/2617/text).
    3. The Government Accountability Office report on Human Genomic 
Data, recommending NIH develop and implement procedures to proactively 
and comprehensively monitor researcher compliance with data management 
and security measures for human genomic data, GAO-25-107377 (see: 
https://www.gao.gov/products/gao-25-107377).
    Second, NIH is proposing to revise the NIH Genomic Data Sharing 
(GDS) Policy to reduce duplicative policy requirements and improve 
overall performance. The GDS Policy, issued in 2014, promotes broad, 
responsible, and timely sharing of genomic research data derived from 
NIH research. As a landmark policy, it has played a crucial role in 
facilitating rapid access to valuable genomic data while ensuring 
participant protection through rigorous informed consent and privacy 
safeguards. Since 2014, NIH has issued the NIH Data Management and 
Sharing (DMS) Policy as well as streamlined and strengthened its 
controlled-access practices, including:
     harmonizing the oversight and management of controlled-
access data repositories and access management systems (see: https://grants.nih.gov/grants/guide/notice-files/NOT-OD-25-159.html),
     modernizing security standards for controlled-access data 
subject to the GDS Policy (see: https://grants.nih.gov/grants/guide/notice-files/NOT-OD-24-157.html),
     establishing minimum expectations for access to 
controlled-access data by developers (see: https://grants.nih.gov/grants/guide/notice-files/NOT-OD-24-157.html), and
     prohibiting access to NIH controlled-access data 
repositories and associated data by institutions located in countries 
of concern (see: https://grants.nih.gov/grants/guide/notice-files/NOT-OD-25-083.html).
    NIH is proposing to revise the GDS Policy to enhance efficiency and 
reduce redundancies with these recent directives and policy 
developments.
    NIH requests public input on both the Draft NIH Controlled-Access 
Data Policy and the proposed revisions to the NIH Genomic Data Sharing 
Policy.

NIH Controlled-Access Data Policy

Scope and Applicability

    This Policy applies to all NIH-supported research generating human 
data or deriving data from human data, cell lines, or biospecimens. 
This Policy applies to the NIH Intramural Research Program and all NIH 
funding mechanisms (e.g., grants, cooperative agreements, contracts, 
Other Transactions), regardless of activity code.
    This Policy does not apply to NIH research that only involves:
     Generation and sharing of non-human data
     Collection and sharing of human cell lines and 
biospecimens (see: https://grants.nih.gov/grants/guide/notice-files/NOT-OD-25-160.html)
    Human data or data derived from human cell lines or biospecimens 
already shared prior to the effective date of this Policy should be 
assessed for risk but are not required to be controlled to comport with 
this Policy. Additionally, this Policy is not intended to address 
sharing data for regulatory approval or to comply with regulatory 
requirements (e.g., submission of data to regulatory approval bodies).

Requirements

Human Data Types Required To Be Protected Through Controlled-Access 
(see Appendix for Definitions)
    The following data types must be protected throughout the data 
lifecycle. Institutions conducting NIH-supported research must ensure 
that these data types are protected even if they are not shared through 
a controlled-access data repository. These categories are based on the 
data types provided in 28 CFR part 202, as well as other data commonly 
generated or used in NIH-supported research that warrant additional 
controls. These data types may be shared without access controls only 
if (1) informed consent explicitly states that the data are to be 
shared openly without controls, in which case institutions must still 
review the data to determine that sharing them openly poses very low 
risk; or (2) open sharing is required or authorized by Federal law or 
by international agreements to which the United States is a party. 
Protected data types include:

 Covered personal identifiers
 Precise geolocation data
 Biometric identifiers
 Genomic data
 Epigenomic data
 Proteomic data
 Transcriptomic data
 Personal health data
 Personal financial data
 Individual level clinical trial data
 Imaging data of the human face or head regions
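    The exception test described above can be sketched in Python as 
follows. This is a minimal, non-authoritative illustration: the class 
Dataset, its field names, and the function must_be_controlled are 
hypothetical and do not appear in the draft Policy, and the sketch 
assumes each fact (consent wording, institutional review outcome, legal 
requirement) has already been determined by the institution.

```python
from dataclasses import dataclass

# Protected data types as named in the draft Policy's list.
PROTECTED_DATA_TYPES = frozenset({
    "covered personal identifiers",
    "precise geolocation data",
    "biometric identifiers",
    "genomic data",
    "epigenomic data",
    "proteomic data",
    "transcriptomic data",
    "personal health data",
    "personal financial data",
    "individual level clinical trial data",
    "imaging data of the human face or head regions",
})


@dataclass(frozen=True)
class Dataset:
    """Hypothetical record of the facts the exception test depends on."""
    data_types: frozenset
    consent_states_open_sharing: bool = False
    institution_found_very_low_risk: bool = False
    open_sharing_required_or_authorized_by_law: bool = False


def must_be_controlled(ds: Dataset) -> bool:
    """True when a protected data type is present and no exception applies."""
    if not (ds.data_types & PROTECTED_DATA_TYPES):
        # Other data are not automatically open: the draft Policy expects a
        # separate assessment of the need for controls.
        return False
    # Exception (2): open sharing required or authorized by Federal law or
    # by an international agreement to which the United States is a party.
    if ds.open_sharing_required_or_authorized_by_law:
        return False
    # Exception (1): explicit consent AND an institutional review finding
    # very low risk. Consent alone is not sufficient.
    if ds.consent_states_open_sharing and ds.institution_found_very_low_risk:
        return False
    return True
```

    In this sketch, a genomic dataset with consent for open sharing but 
no institutional very-low-risk determination still evaluates as 
requiring controlled access.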

Requirements for Controlled-Access Data Sharing
    Controlled-access data repositories sharing human participant data 
types identified in this Policy must adhere to security and operational 
standards appropriate for safeguarding human data. NIH Controlled-
Access Data Repositories (CADRs) subject to the ``Required Security and 
Operational Standards for NIH Controlled-Access Data Repositories'' are 
fully compliant with the requirements of this Policy (https://grants.nih.gov/policy-and-compliance/policy-topics/sharing-policies/accessing-data/requirements). Other controlled-access data repositories 
can meet the requirements of this Policy if, at a minimum, their 
security and operational standards include:
     Prospective review of requests to access controlled data;
     Authentication of the identity of data requesters;
     Restrictions on sharing data with countries of concern as 
identified in 28 CFR part 202; and
     Use of security standards for protection of controlled 
data (e.g., NIST SP 800-171 or equivalent).
    NIH does not consider repositories that only require user 
registration and/or provide non-binding guidance on data usage to be 
controlled-access repositories.
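    The four minimum standards listed above lend themselves to a simple 
checklist. The following Python sketch is illustrative only and is not 
an NIH assessment tool: the dictionary keys and function names are 
hypothetical, and whether a repository actually satisfies each standard 
is a judgment the sketch takes as given.

```python
# Hypothetical keys, one per minimum standard named in the draft Policy.
MINIMUM_STANDARDS = (
    "prospective_review_of_access_requests",
    "requester_identity_authentication",
    "country_of_concern_restrictions",   # as identified in 28 CFR part 202
    "security_standards_for_controlled_data",  # e.g., NIST SP 800-171 or equivalent
)


def missing_standards(repo: dict) -> list:
    """Return the minimum standards the repository description does not assert."""
    return [name for name in MINIMUM_STANDARDS if not repo.get(name, False)]


def meets_minimum_standards(repo: dict) -> bool:
    """True only when every minimum standard is asserted for the repository."""
    return not missing_standards(repo)


# A repository that only registers users and posts non-binding usage
# guidance asserts none of the four standards.
registration_only = {"user_registration": True, "nonbinding_guidance": True}
```

    Here meets_minimum_standards(registration_only) evaluates to False, 
matching the statement that registration-only repositories are not 
considered controlled-access repositories.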

Additional Policy Considerations

    Data not explicitly required to be managed via controlled-access 
under this Policy should be assessed for the need for controls. 
Criteria for making these assessments include any of the following:

    1. Explicit limitations on subsequent use, such as those imposed by 
laws, regulations, policies, informed consent, and agreements.
    2. Potential sensitivities, such as information regarding 
potentially stigmatizing traits, illegal behaviors, or other 
information that could be perceived as causing group harm or used for 
discriminatory purposes. Sensitive data may also include data from 
individuals, groups, or populations with unique attributes that 
increase the risk of re-identification.
    3. Lack of adequate data de-identification, or a possibility of 
re-identification that cannot be sufficiently reduced.

Proposed Revisions to the NIH Genomic Data Sharing Policy

    NIH is proposing several revisions to the GDS Policy to reduce 
duplicative policy requirements and improve overall performance. 
Importantly, the core tenets of this policy will remain intact. The 
effective date for changes will align with the target effective date of 
the NIH Controlled-Access Data Policy described above.
    Proposed GDS Policy revisions:
    1. The scope of the GDS Policy is proposed to be revised as 
follows:
     Apply only to human data. The NIH Data Management and 
Sharing Policy (DMS Policy) encompasses all types of scientific data, 
including genomic data. The GDS Policy scope will apply specifically to 
human genomic data to outline the specific protections needed to ensure 
participant autonomy, privacy, and security. Non-human genomic data 
will no longer be subject to the GDS Policy.
     Simplify thresholds for ``large scale'' genomic data. Any 
amount of human genomic data collected from 100 individuals or more 
will be defined as ``large scale'' and required to comply with the GDS 
Policy's consent and data sharing requirements. Studies generating and 
sharing human genomic data below this threshold will be subject to the 
expectations of the DMS Policy and the proposed NIH Controlled-Access 
Data Policy described above.
     Establish consistent requirements across NIH. To reduce 
complexity, NIH Institutes, Centers, and Offices (ICOs) will not be 
permitted to expand the scope of the GDS Policy through individual 
program or policy expectations. While ICOs may request additional data 
protections or submission of data to NIH controlled-access repositories 
in Notices of Funding Opportunity (NOFOs), they may not characterize 
these requests as modifying the expectations of the GDS Policy. These 
data submissions will be governed 
by the DMS Policy, the ``Required Security and Operational Standards 
for NIH Controlled-Access Data Repositories,'' the proposed NIH 
Controlled-Access Data Policy, and other relevant policies.
    2. Timelines for Data Processing: Timing of data release will be 
removed from the GDS Policy as data release should be immediate, 
depending on repository operations. The levels of data processing 
previously provided in supplementary information to the GDS Policy will 
no longer be used. Human genomic data should be submitted to an 
approved NIH controlled-access data repository (see: https://grants.nih.gov/policy-and-compliance/policy-topics/sharing-policies/accessing-data/best-practices) to make the data available within 6 
months of generation, to allow time for data cleaning, quality control, 
and repository release processes. Initial sequence reads and raw data 
do not have to be shared, consistent with previous GDS Policy 
expectations. Data not shared within 6 months of data cleaning and 
quality control should be shared consistent with DMS Policy 
requirements (publication date or end of the award period, whichever 
comes first).
    3. NIH proposes modernizing the following data submission and 
sharing practices by:
     Clarifying expectations for sharing human genomic data 
openly. Expectations for sharing genomic data (and other omics data) 
openly or with controls will be entirely governed by the proposed NIH 
Controlled-Access Data Policy, including expectations for informed 
consent to share data openly.
     Allowing for HIPAA Expert Determination. Identifiers are 
allowed to be submitted under the GDS Policy so long as the dataset is 
de-identified according to Expert Determination and is accepted by the 
controlled-access data repository. Data derived from human research 
participants must be de-identified according to the following 
standards:
    [cir] Identities of research participants cannot be readily 
ascertained or otherwise associated with the data by the repository 
staff or secondary data users (45 CFR 46.102(e); Federal Policy for the 
Protection of Human Subjects); and
    [cir] Either the 18 identifiers enumerated at 45 CFR 
164.514(b)(2) (the HIPAA Privacy Rule) are removed, or the Expert 
Determination standard at 45 CFR 164.514(b)(1) is met.
     Expanding institutional review capacity. Institutions are 
permitted to appoint an individual or institutional body technically 
and legally capable of reviewing Institutional Certifications for data 
submission beyond an Institutional Review Board, Privacy Board, or 
equivalent. For example, Human Research Protection Programs (HRPPs) 
would be permitted to review and approve data submissions.
     Strengthening requirements for participant consent. Human 
genomic data collected under the GDS Policy from biospecimens or cell 
lines created or collected after 2015 must have consent for use and 
sharing, consistent with 45 CFR 46.116(d) (the Common Rule). If consent 
has not been obtained, the data cannot be shared. NIH accepts data when 
collected under informed consent for research use from a Legally 
Authorized Representative, consistent with the Common Rule, as long as 
the consent meets other expectations of the GDS Policy (e.g., consent 
is expected to be for future research use and be opt-in, not opt-out). 
This includes, but is not limited to, the following situations:
    [cir] Consent is obtained from next-of-kin or applicable legal 
authority in cases where the individual is deceased
    [cir] Consent is obtained from a Legally Authorized 
Representative, next-of-kin, or other forms of surrogate or proxy 
decision-making in cases where the individual lacks capacity to consent 
for themselves
    [cir] Assent is obtained from minors, with parental permission
     Updating expectations for imputation servers. NIH 
currently allows Approved Users to use imputation servers for 
imputation with controlled-access data from studies subject to the GDS 
Policy only in limited circumstances, such as the National Heart, Lung, 
and Blood Institute (NHLBI) Trans-Omics in Precision Medicine (TOPMed) 
Imputation Server. Specifically, these servers offer a secure 
environment for users to upload genotypes, the results are encrypted, 
and after 7 days the data are deleted from the server, upholding the 
non-transferability agreement in the Data Use Certification (DUC) 
Agreement. The server operates in an environment consistent with 
security standards in the NIH Security Best Practices for Controlled-
Access Data Repositories and employs countermeasures to reduce the 
risks of certain attacks.
    NIH currently does not allow users to develop their own imputation 
panels or servers. NIH has heard interest from the research community 
in allowing Approved Users to develop their own imputation panels and 
servers using controlled-access data from studies subject to the GDS 
Policy. NIH requests input on clarifying that Approved Users may operate 
imputation servers if they ensure that (1) the controlled-access data 
used to develop the imputation panels are protected from disclosure and 
attacks specific to imputation servers, (2) the imputation server 
operates in an environment consistent with security controls in the NIH 
Security Best Practices for Controlled-Access Data Repositories, and 
(3) the imputation servers are funded or operated by NIH or another 
federal agency.
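    The proposed ``large scale'' threshold in revision 1 and the 
timelines in revision 2 can be sketched in Python as follows. This is 
an informal illustration under stated assumptions, not an NIH 
calculation method: all function names are hypothetical, ``6 months'' 
is assumed to mean six calendar months with the day of month clamped to 
the end of shorter months, and the caller chooses the start date 
because the notice does not specify how the 6-month period is counted.

```python
import calendar
from datetime import date
from typing import Optional

LARGE_SCALE_MIN_INDIVIDUALS = 100  # threshold stated in the proposed revisions


def subject_to_gds(n_individuals: int, is_human: bool = True) -> bool:
    """True when genomic data would be 'large scale' human data under the proposal."""
    if n_individuals < 0:
        raise ValueError("n_individuals must be non-negative")
    # Non-human genomic data would no longer be subject to the GDS Policy.
    return is_human and n_individuals >= LARGE_SCALE_MIN_INDIVIDUALS


def add_months(start: date, months: int) -> date:
    """Add calendar months, clamping the day (assumption, see lead-in)."""
    index = start.month - 1 + months
    year = start.year + index // 12
    month = index % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def six_month_target(start: date) -> date:
    """Target availability date, six months after the chosen start date."""
    return add_months(start, 6)


def dms_fallback_date(publication: Optional[date], award_end: date) -> date:
    """DMS Policy timing: publication date or end of award, whichever comes first."""
    return award_end if publication is None else min(publication, award_end)
```

    For example, under these assumptions a study of 99 individuals 
falls below the threshold, and a dataset with no publication yet falls 
back to the end of the award period.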

Request for Input

    NIH invites public input on any aspect of the Draft NIH Controlled-
Access Data Policy and the proposed revisions to the NIH Genomic Data 
Sharing Policy. Input is specifically requested on:
    1. Availability of established repositories for implementing the 
proposed Controlled-Access Data Policy. NIH has made investments in 
expanding the capacity of controlled-access data repositories (see: 
https://grants.nih.gov/policy-and-compliance/policy-topics/sharing-policies/accessing-data/best-practices) and is interested in additional 
resources that may be needed to meet an anticipated increased demand 
for storing and managing larger amounts of controlled-access data.
    2. Appropriateness of the protected data types designated to be 
controlled-access. The data types subject to the Controlled-Access Data 
Policy, including whether any should be added or removed, or whether 
definitions should be clarified (e.g., whether NIH should consider 
adding thresholds for the number of analytes for particular data 
types). Additionally, any 
factors that should be considered when sharing data openly without 
controls, given the Draft Controlled-Access Data Policy's requirements 
for informed consent and institutional review. NIH may provide FAQs or 
additional guidance on data types that typically should not be 
controlled, such as genomic summary results, summarized result data 
(including from clinical trials), and specific low-risk components of 
controlled-access data.
    3. Proposed Updates to the GDS Policy for Imputation Servers. NIH 
is interested in options or strategies that maintain the privacy of 
imputation servers and reference panels, such as technologies that 
operate servers in secure environments or use privacy enhancing 
technologies (PETs).

Appendix to the Controlled-Access Data Policy: Definitions

     Genomic summary results. The output of analyses of genomic 
data across many individuals included within a dataset. Includes 
systematically computed statistics such as, but not limited to: (1) 
frequency information (e.g., genotype counts and frequencies, or allele 
counts and frequencies); and (2) association information (e.g., effect 
size estimates and standard errors, and p-values). These values may be 
defined and calculated using scientifically relevant subsets of 
research participants included within study populations (e.g., disease, 
trait-based, or control populations).
     Covered personal identifiers. Any listed identifier: (1) 
In combination with any other listed identifier; or (2) In combination 
with other data that is disclosed such that the listed identifier is 
linked or linkable to other listed identifiers or to other sensitive 
personal data. This excludes (1) Demographic or contact data that is 
linked only to other demographic or contact data (such as first and 
last name, birthplace, ZIP code, residential street or postal address, 
phone number, and email address and similar public account 
identifiers); and (2) A network-based identifier, account-
authentication data, or call-detail data that is linked only to other 
network-based identifier, account-authentication data, or call-detail 
data as necessary for the provision of telecommunications, networking, 
or similar service (28 CFR 202.212).
     Listed identifier. Any piece of data in any of the 
following data fields: (a) Full or truncated government identification 
or account number (such as a Social Security number, driver's license 
or State identification number, passport number, or Alien Registration 
Number); (b) Full financial account numbers or personal identification 
numbers associated with a financial institution or financial-services 
company; (c) Device-based or hardware-based identifier (such as 
International Mobile Equipment Identity (``IMEI''), Media Access 
Control (``MAC'') address, or Subscriber Identity Module (``SIM'') card 
number); (d) Demographic or contact data (such as first and last name, 
birth date, birthplace, ZIP code, residential street or postal address, 
phone number, email address, or similar public account identifiers); 
(e) Advertising identifier (such as Google Advertising ID, Apple ID for 
Advertisers, or other mobile advertising ID (``MAID'')); (f) Account-
authentication data (such as account username, account password, or an 
answer to security questions); (g) Network-based identifier (such as 
Internet Protocol (``IP'') address or cookie data); or (h) Call-detail 
data (such as Customer Proprietary Network Information (``CPNI'')) (28 
CFR 202.212).
     Precise geolocation data. Data, whether real-time or 
historical, that identifies the physical location of an individual or a 
device with a precision of within 1,000 meters (28 CFR 202.242).
     Biometric identifiers. Measurable physical characteristics 
or behaviors used to recognize or verify the identity of an individual, 
including facial images, voice prints and patterns, retina and iris 
scans, palm prints and fingerprints, gait, and keyboard usage patterns 
that are enrolled in a biometric system and the templates created by 
the system (28 CFR 202.204).
     Genomic data. Data representing the nucleic acid sequences 
that constitute the entire set or a subset of the genetic instructions 
found in a human cell, including the result or results of an 
individual's ``genetic test'' (as defined in 42 U.S.C. 300gg-91(d)(17)) 
and any related human genetic sequencing data (28 CFR 202.224).
     Epigenomic data. Data derived from a systems-level 
analysis of human epigenetic modifications, which are changes in gene 
expression that do not involve alterations to the DNA sequence itself. 
These epigenetic modifications include modifications such as DNA 
methylation, histone modifications, and non-coding RNA regulation. 
Routine clinical measurements of epigenetic modifications for 
individualized patient care purposes would not be considered epigenomic 
data because such measurements would not entail a systems-level 
analysis of the epigenetic modifications in a sample (28 CFR 202.224).
     Proteomic data. Data derived from a systems-level analysis 
of proteins expressed by a human genome, cell, tissue, or organism. 
Routine clinical measurements of proteins for individualized patient 
care purposes would not be considered proteomic data under this rule 
because such measurements would not entail a systems-level analysis of 
the proteins found in such a sample (28 CFR 202.224).
     Transcriptomic data. Data derived from a systems-level 
analysis of RNA transcripts produced by the human genome under specific 
conditions or in a specific cell type. Routine clinical measurements of 
RNA transcripts for individualized patient care purposes would not be 
considered transcriptomic data under this rule because such 
measurements would not entail a systems-level analysis of the RNA 
transcripts in a sample (28 CFR 202.224).
     Personal health data. Health information that relates to 
disease, diagnosis, or treatment and that indicates, reveals, or describes 
the past, present, or future physical or mental health or condition of 
an individual; the provision of healthcare to an individual; or the 
past, present, or future payment for the provision of healthcare to an 
individual. This term includes basic physical measurements and health 
attributes (such as bodily functions, height and weight, vital signs, 
symptoms, and allergies); social, psychological, behavioral, and 
medical diagnostic, intervention, and treatment history; test results; 
logs of exercise habits; immunization data; data on reproductive and 
sexual health; and data on the use or purchase of prescribed 
medications (28 CFR 202.241).
     Personal financial data. Data about an individual's 
credit, charge, or debit card, or bank account, including purchases and 
payment history; data in a bank, credit, or other financial statement, 
including assets, liabilities, debts, or trades in a securities 
portfolio; or data in a credit report or in a ``consumer report'' (as 
defined in 15 U.S.C. 1681a(d)) (28 CFR 202.240).
     Individual level clinical trial data. Detailed data 
collected from each participant during a clinical trial, excluding 
summary results.
     Imaging data of the human face or head regions. Visual 
representations (including functional imaging, ultrasound imaging, 
photographic images, 3D models, radiological scans, X-rays, and others) 
that depict anatomical or functional details of the human face or head 
regions.

    Dated: December 11, 2025.
Matthew Memoli,
Principal Deputy Director, National Institutes of Health.
[FR Doc. 2025-23246 Filed 12-17-25; 8:45 am]