[Federal Register Volume 70, Number 118 (Tuesday, June 21, 2005)]
[Proposed Rules]
[Pages 35573-35577]
From the Federal Register Online via the Government Publishing Office [www.gpo.gov]
[FR Doc No: 05-12199]


-----------------------------------------------------------------------

DEPARTMENT OF COMMERCE

Patent and Trademark Office

37 CFR Part 1

[Docket No.: 2005-P-062]
RIN 0651-AB91


Acceptance, Processing, Use and Dissemination of Chemical and 
Three-Dimensional Biological Structural Data in Electronic Format

AGENCY: United States Patent and Trademark Office, Commerce.

ACTION: Advance notice of proposed rule making.

-----------------------------------------------------------------------

SUMMARY: This advance notice of proposed rule making is to inform the 
public that the United States Patent and Trademark Office (USPTO) is 
considering amending its rules of practice to require submission of 
chemical and three-dimensional (3-D) biological structural data in 
electronic format. The USPTO anticipates that requiring submission of 
chemical and 3-D biological structural data in electronic format in 
patent applications will improve the processing and examination of 
patent applications that include such data, as well as the 
dissemination of such data to searchable public databases. The purpose 
of this notice is to encourage comments on this topic, in the form of 
responses to the questions posed in this notice, from industry, 
academia, the patent bars, and members of the public.
    Comment Deadline Date: To be ensured of consideration, written 
comments must be received on or before August 22, 2005. No public 
hearing will be held.

ADDRESSES: Comments should be sent by electronic mail message over the 
Internet addressed to [email protected]. Comments may also be 
submitted by mail addressed to: Mail Stop Comments--Patents, 
Commissioner for Patents, P.O. Box 1450, Alexandria, VA, 22313-1450, or 
by facsimile to (571) 273-3373, marked to the attention of Lisa J. 
Hobbs, Ph.D., Search Systems Project Manager, Search and Information 
Resources Administration, Office of the Deputy Commissioner for Patent 
Resources and Planning. Although comments may be submitted by mail or 
facsimile, the Office prefers to receive comments via the Internet. If 
comments are submitted by mail, the Office prefers that the comments be 
submitted on a DOS formatted 3 1/2 inch disk accompanied by a paper 
copy.

[[Page 35574]]

    Comments may also be sent by electronic mail message over the 
Internet via the Federal eRulemaking Portal. See the Federal 
eRulemaking Portal Web site (http://www.regulations.gov) for additional 
instructions on providing comments via the Federal eRulemaking Portal.
    The comments will be available for public inspection at the Office 
of the Commissioner for Patents, located in Madison East, Tenth Floor, 
600 Dulany Street, Alexandria, Virginia, and will be available through 
anonymous file transfer protocol (ftp) via the Internet (http://www.uspto.gov). Because comments will be made available for public 
inspection, information that the submitter does not desire to make 
public, such as an address or phone number, should not be included in 
the comments.

FOR FURTHER INFORMATION CONTACT: Lisa J. Hobbs, Ph.D., Search Systems 
Project Manager, Search and Information Resources Administration, 
Office of the Deputy Commissioner for Patent Resources and Planning, by 
telephone at (571) 272-3373, by mail addressed to: Box 
Comments--Patents, Commissioner for Patents, P.O. Box 1450, Alexandria, 
VA 22313-1450, or by facsimile to (571) 273-3373, marked to the 
attention of Lisa J. Hobbs.

SUPPLEMENTARY INFORMATION:
    1. General Background Information: It is becoming increasingly 
apparent that the USPTO needs to begin investigation of procedures for 
the submission, screening, processing, storing, searching, analysis and 
dissemination of chemical and 3-D biological structural data in 
appropriate electronic formats. The rate at which these data are being 
generated is poised to increase by several orders of magnitude in the 
coming years as significant advances are being made in the ability to 
readily determine structural information. Initiatives to fund research 
in these areas are being supported by both numerous governmental 
agencies and private industry entities. With the advancement of 
capabilities allowed by automation, the number of public and private 
databases hosting these types of data for information exchange is 
growing daily.
    It has yet to be determined whether or not the USPTO will receive 
an increasing number of applications comprising 3-D crystal data and/or 
chemical structure data. However, the USPTO currently receives a 
significant amount of chemical structure data, and has begun to receive 
some very large submissions of 3-D protein crystal data. Consequently, 
the USPTO has decided to begin the planning and coordination of how 
best to provide the capability to manage, process, search, and 
disseminate this information as appropriate.
    Similar to the process involved in the promulgation of the sequence 
rules (37 CFR 1.821-1.825 and WIPO ST.25), the USPTO intends to work 
with other international intellectual property offices in developing 
any new standards for the submission of chemical or 3-D structural data 
in electronic format.
    In an effort to facilitate public comment on the questions set 
forth below, the following additional background information is 
provided:
    2. Background Specific to 3-D Biological Structural Data: X-ray 
crystallographic studies and nuclear magnetic resonance (NMR) 
spectroscopy studies of biological macromolecules provide mechanisms 
for obtaining detailed 3-D structural information. The current 
scientific priorities, and concomitant intellectual property 
priorities, of many laboratories include using 3-D protein crystal data 
to assist in unraveling the complex relationship between sequence, 
structure, and function.
    Knowledge of the 3-D structures of biological macromolecules is an 
essential element for guiding studies and developing an understanding 
of biological processes. Three-dimensional structural coordinate data 
provide essential information that can be exploited for protein 
engineering, rational drug design, and other biotechnology efforts 
(Gilliland, et al. 1996 J. Res. Natl. Inst. Stand. Technol. 101: 309-
320).
    Bioinformatics, the collection and use of scientific database 
entries to predict the structure or behavior or evolutionary 
relatedness of particular biological macromolecules based on sequence 
similarity or structural similarity to known macromolecules, is one of 
the fastest growing scientific disciplines. The ability of the 
scientific community to ``data mine'' known scientific information is 
directly dependent on the public availability of all prior art data.
    The worldwide Protein Data Bank (wwPDB; http://www.wwpdb.org/index.html) is a collection of all publicly available 3-D structure 
data of large molecules of proteins and nucleic acids, experimentally 
determined by X-ray crystallography and NMR, which is freely and 
publicly available to the global community. The PDB, which is under the 
oversight of the Research Collaboratory for Structural Bioinformatics 
(RCSB, USA), the Macromolecular Structure Database (MSD) at the 
European Bioinformatics Institute (EBI) and the Protein Data Bank Japan 
(PDBj) at the Institute for Protein Research, has grown from 7 
structures in 1971 to a database containing over 30,900 structures as 
of May 2005. The PDB's growth has been accompanied by increases in both 
data content and the structural complexity of individual entries. A 
further acceleration in growth is anticipated as the result of 
developments in high-throughput structural determination methodologies 
and worldwide structural genomics efforts (Westbrook, et al. 2003 Nucl. 
Acids Res. 31(1): 489-491).
    There are also many secondary sources of 3-D protein crystal data 
and associated information. One of these is the Molecular Modeling 
Database (MMDB), maintained as part of the Entrez search system by the 
National Center for Biotechnology Information (NCBI; http://www.ncbi.nlm.nih.gov/), which is a compilation of all of the PDB 3-D 
structures of biomolecules and additionally integrates value-added 
chemical, sequence and structural information in order to facilitate 
structure-based homology modeling and protein structure prediction. The 
goal of Entrez's 3-D-structure database is to make protein crystal 
structure information, and the functional annotation MMDB adds, easily 
accessible to molecular biologists (Wang, et al. 2002 Nucl. Acids Res. 
30(1): 249-252).
    All of the major 3-D protein crystal databases use a variant of the 
Crystallographic Information File (CIF) format as the means for 
obtaining data entries with proper annotation. Ratified in 1990 by the 
International Union of Crystallography (IUCr), CIF is a format that 
enables the characterization of small crystal structures. In 1997, the 
CIF format was modified to include information specific to 
macromolecules, resulting in version 1.0 of the macromolecular 
Crystallographic Information File (mmCIF) dictionary (Bourne, et al. 
1997 Meth. Enzymol. 277: 571-590). The PDB database initially accepted 
files in a proprietary pdb format in 1971, but has now moved to 
accepting all files, and converting the backfile, into mmCIF. Some 
databases, especially those involved in secondary, value-added 
information, have further modified the mmCIF format to include more 
data fields and annotations. MMDB uses the ASN.1 format, which is 
specific to the NCBI and addresses structural and functional linkages. 
The ASN.1 format also allows for a 3-D viewer to be used to visualize 
the protein crystal.
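    As a rough illustration of the plain-text, tag-and-value layout that 
CIF-style files use, the following Python sketch reads a tiny subset of 
that layout: one data block, single-line data items, and one loop. It 
is a minimal sketch, not a conforming CIF or mmCIF parser. The function 
name parse_simple_cif and every value in the sample are made up for 
illustration, and multi-line text fields and save frames are not 
handled.

```python
import shlex

# Made-up sample data; the tag names follow the CIF core naming style,
# but every value below is invented for illustration only.
SAMPLE_CIF = """\
data_example
_cell_length_a 10.5
_cell_length_b 12.25
_symmetry_space_group_name_H-M 'P 21 21 21'
loop_
_atom_site_label
_atom_site_fract_x
_atom_site_fract_y
_atom_site_fract_z
C1 0.1000 0.2000 0.3000
O1 0.4000 0.5000 0.6000
"""


def parse_simple_cif(text):
    """Parse one data block with single-line items and simple loops."""
    block = None
    items = {}         # tag -> value for plain data items
    loops = []         # list of (tags, rows) pairs, one per loop_
    loop_tags = None   # tags of the loop currently being read, if any
    loop_rows = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("data_"):
            block = line[len("data_"):]
        elif line == "loop_":
            loop_tags, loop_rows = [], []
            loops.append((loop_tags, loop_rows))
        elif line.startswith("_"):
            tokens = shlex.split(line)  # keeps quoted values together
            if len(tokens) == 1 and loop_tags is not None and not loop_rows:
                loop_tags.append(tokens[0])  # column name of the open loop
            elif len(tokens) >= 2:
                loop_tags = None  # a tag-value pair ends any open loop
                items[tokens[0]] = tokens[1]
        elif loop_tags is not None:
            # A data row: pair each value with its column tag.
            loop_rows.append(dict(zip(loop_tags, shlex.split(line))))
    return block, items, loops


block, items, loops = parse_simple_cif(SAMPLE_CIF)
print(block, items["_cell_length_a"], len(loops[0][1]))
```

    Because the file is plain ASCII text, each coordinate row can be tied 
back to its column tags without any proprietary software, which is the 
property that makes this family of formats convenient for exchange 
between databases.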
    In addition to databases containing information on the crystal 
structures of

[[Page 35575]]

biomolecules, there are major repositories for other types of crystal 
structures. The Cambridge Structural Database (CSD), maintained by the 
Cambridge Crystallographic Data Centre (CCDC; http://www.ccdc.cam.ac.uk/), is a worldwide repository of small molecule 
crystal structures and has over 300,000 organic and metallo-organic 
compound records. The CSD database accepts entries in the CIF data 
format in plain ASCII text. Repositories for other types of crystal 
structures include: the Nucleic Acid Database (NDB; http://ndbserver.rutgers.edu/), which stores oligonucleotides; the Inorganic 
Crystal Structure Database (ICSD; http://www.fiz-informationsdienste.de/en/DB/icsd/); and CRYSTMET(R) (http://www.tothcanada.com/), which stores metals and alloys.
    3. Background Specific to Chemical Structural Data: While the use 
of drawings to denote specific molecular relationships and chemical 
bonds is a very old art, the embodiments and uses of these drawings are 
evolving rapidly as supporting technology evolves. Two main methods for 
handling chemical data are: chemical drawing systems that depend on 
annotations added to unique substance records, in specific electronic 
file-types, and text files that are a compilation of unique data 
determining a canonical representation.
    Electronic files containing drawings created by chemical drawing 
software would provide the most accessible data set for processing, use 
in searching, and public dissemination. However, there is currently no 
single, publicly available software product that has been accepted as the 
standard for this type of drawing. Some publicly available chemical 
data depiction systems are: (1) SMILES (http://www.daylight.com/dayhtml/smiles/); (2) SMARTS/SMIRKS (http://www.daylight.com/dayhtml/doc/theory/theory.rxn.html#RTFrxn18); (3) ACD ChemSketch (http://www.acdlabs.com/download/); and (4) MDL ISIS/Draw (http://www.mdli.com/downloads/downloadable/index.jsp). Some proprietary chemical data 
depiction systems are: (1) ChemDraw (http://www.cambridgesoft.com/products/family.cfm?FID=2); (2) ACD/Name (http://www.acdlabs.com/products/name_lab/); (3) Chemistry 4-D Draw (http://www.cheminnovation.com/products/chem4d.asp); and (4) ChemWindow (http://www.bio-rad.com/ com/).
    One of the difficulties facing the USPTO in moving toward 
acceptance of chemical drawings in electronic format is the 
preponderance of proprietary software and file-types. Prior to filing a 
patent application, many applicants have already created drawings of 
chemical structures of interest for publication or presentation 
purposes; however, these drawings could be in one of many publicly 
available file-types, or in a file-type specific to a particular 
software product. It is not possible to require applicants to purchase 
proprietary drawing software, nor is it possible to accept and handle 
all possible file-types.
    One alternative to requiring a non-standard publicly available 
format, requiring a proprietary format, or accepting a multiplicity of 
drawing file-types would be the use of a standardized text format to 
describe a chemical structure. Two possibilities for this type of file 
are: Chemical Markup Language (CML; http://www.xml-cml.org/), or a 
joint effort currently under way between the International Union of 
Pure and Applied Chemistry and the National Institute of Standards and 
Technology, the IUPAC-NIST Chemical Identifier (INChI; http://www.iupac.org/projects/2000/2000-025-1-800.html). A description of 
INChI states that it would enable an automatic conversion to a 
graphical representation of a chemical substance that could be 
performed anywhere in the world, and could be built into desktop 
chemical structure drawing packages and on-line chemical structure 
drawing applets (A.J. McNaught 2001 http://www.iupac.org/nomenclature/chem_id_project.html).
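    To illustrate what a standardized text description of a chemical 
structure can look like, the following Python sketch writes a small 
XML document in the general style of CML for the heavy atoms of 
ethanol. It is a sketch under stated assumptions, not a validated CML 
file: the element and attribute names used here (molecule, atomArray, 
atom, elementType, bondArray, bond, atomRefs2, order) follow commonly 
seen CML conventions and should be checked against the CML schema, and 
the function name molecule_to_cml and the atom identifiers are 
hypothetical.

```python
import xml.etree.ElementTree as ET


def molecule_to_cml(mol_id, atoms, bonds):
    """Serialize atoms and bonds as a CML-style XML string.

    atoms: list of (atom_id, element_symbol) pairs
    bonds: list of (atom_id_1, atom_id_2, bond_order) triples
    Element and attribute names are assumed CML conventions.
    """
    molecule = ET.Element("molecule", {"id": mol_id})
    atom_array = ET.SubElement(molecule, "atomArray")
    for atom_id, element in atoms:
        ET.SubElement(atom_array, "atom",
                      {"id": atom_id, "elementType": element})
    bond_array = ET.SubElement(molecule, "bondArray")
    for first, second, order in bonds:
        ET.SubElement(bond_array, "bond",
                      {"atomRefs2": f"{first} {second}", "order": str(order)})
    return ET.tostring(molecule, encoding="unicode")


# Heavy atoms of ethanol (C-C-O); hydrogens are left implicit here.
ETHANOL_ATOMS = [("a1", "C"), ("a2", "C"), ("a3", "O")]
ETHANOL_BONDS = [("a1", "a2", 1), ("a2", "a3", 1)]

cml_text = molecule_to_cml("ethanol", ETHANOL_ATOMS, ETHANOL_BONDS)
print(cml_text)
```

    A text description of this kind records the atoms and their 
connections rather than a picture, so any drawing package that reads 
the format could regenerate a graphical depiction from it.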

Rule Making Considerations

    Executive Order 13132: This rule making does not contain policies 
with federalism implications sufficient to warrant preparation of a 
Federalism Assessment under Executive Order 13132 (Aug. 4, 1999).
    Executive Order 12866: This rule making has been determined to be 
not significant for purposes of Executive Order 12866 (Sept. 30, 1993).
    Paperwork Reduction Act: This notice involves information 
collection requirements which are subject to review by the Office of 
Management and Budget (OMB) under the Paperwork Reduction Act of 1995 
(44 U.S.C. 3501 et seq.). The collections of information involved in 
this notice have been reviewed and previously approved by OMB under OMB 
control numbers: 0651-0022, 0651-0024, 0651-0031, and 0651-0032. The 
principal impact of the changes under consideration in this advance 
notice would be to revise the rules of practice to require or provide for 
the submission of chemical and three-dimensional (3-D) biological 
structural data in electronic form. The Office is not resubmitting any 
information collection package to OMB for its review and approval 
because this advance notice does not propose any changes that would 
affect the information collection requirements associated with the 
information collection under these OMB control numbers. If the Office 
proceeds with proposing changes to the rules of practice relating to 
the submission of chemical and three-dimensional (3-D) biological 
structural data in electronic form, the Office will resubmit an 
information collection package to OMB for its review and approval for 
any collections of information whose requirements will be revised as a 
result of the proposed rule changes.
    Interested persons are requested to send comments regarding these 
information collections, including suggestions for reducing this 
burden, to Robert J. Spar, Director, Office of Patent Legal 
Administration, Commissioner for Patents, P.O. Box 1450, Alexandria, VA 
22313-1450, or to the Office of Information and Regulatory Affairs, 
Office of Management and Budget, New Executive Office Building, Room 
10235, 725 17th Street, N.W., Washington, D.C. 20503, Attention: Desk 
Officer for the Patent and Trademark Office.
    Notwithstanding any other provision of law, no person is required 
to respond to nor shall a person be subject to a penalty for failure to 
comply with a collection of information subject to the requirements of 
the Paperwork Reduction Act unless that collection of information 
displays a currently valid OMB control number.
    4. Comments on the following Questions and Any Other Related 
Matters Are Solicited:

A. Questions Pertaining to the Creation of 3-D Structural Data Files

    1. What benefits do you foresee for the applicant if electronic 
filing is adopted? What disadvantages do you foresee?
    2. What types of 3-D data would be best submitted electronically? 
Examples:
     Small organic crystals.
     Macromolecular peptide/protein crystals.
     Inorganic crystals.
     Metallic crystals.
     Other.
    3. Should electronic submission of 3-D data be mandatory, optional, 
or mandatory for some types (e.g., protein crystals) and optional for 
others (e.g., small organic crystals)?
    4. If electronic submission is mandatory, should the USPTO require 
all 3-D information cited in an application to be submitted in electronic 
format, including prior art, or only new data?

[[Page 35576]]

    5. Have tables of 3-D data generally been created for other 
purposes before preparation of a patent application, e.g., for 
publication in a scientific journal or submission to a database? If so,
     What format(s) are used (e.g., mmCIF, pdb, CIF, other)?
     What authoring tool is used to create the files, e.g., 
ADIT (http://pdb.rutgers.edu/mmcif/ADIT/index.html)?
     What software, if any, is used to validate files of 3-D 
data, e.g., ADIT Validation Tool or enCIFer (http://www.ccdc.cam.ac.uk/free_services/encifer/)?
    6. Have most of the 3-D tables been submitted to a database before 
inclusion in a patent application? If so, which one? Examples:
     http://www.ccdc.cam.ac.uk/products/csd/
     http://www.rcsb.org/pdb/
     http://www.fiz-informationsdienste.de/en/DB/icsd/
     http://www.tothcanada.com/
    7. Have most of the 3-D tables been published before inclusion in a 
patent application?
    8. Database providers require certain annotation data. Would any of 
the annotation data currently required by 3-D database providers be 
unknown or proprietary at the time of filing a patent application 
(e.g., method used for crystal creation)?
    9. Database providers often establish a controlled vocabulary for 
annotation or feature description information. Would there be any 
problems created during patent application prosecution if the 
electronic file relied on dynamic controlled dictionaries or 
vocabularies, controlled and maintained by database providers, not the 
USPTO, for the description of features, etc.? What would be the pros and 
cons if the USPTO were to incorporate by reference a public database 
controlled vocabulary into any adopted standard? Examples:
     http://pdb.rutgers.edu/cc_dict_tut.html
     http://ndbserver.rutgers.edu/mmcif/dictionaries/index.html
    10. Is there annotation data specific to a patent application that 
does not appear in public database files but that would be desirable to 
provide for an electronic submission in a patent application (e.g., 
continuing application data, attorney's docket number)?
    11. Do many/most file wrapper submissions with 3-D data contain 
multiple 3-D tables?

B. Questions Pertaining to the USPTO Receipt of 3-D Files

    1. In general, 3-D structure data tables submitted as part of a 
patent application are quite lengthy. Should the USPTO require that all 
3-D files greater than a certain size be submitted in electronic media 
only?
    2. Should the USPTO require submission in electronic format at the 
time of filing, or, if a paper copy is filed, permit the electronic 
submission to be filed later (with a statement indicating that the 
electronic version is the same as the version originally filed)?
    3. Should any statement that comes with an electronic file outline 
the authoring tool and certify the use of a validation tool?
    4. Should the rules be revised to specify that 3-D biological 
structural data, if a paper copy is provided, is to appear in a special 
section, e.g., between the specification and the Sequence Listing?

C. Questions Pertaining to the Use of 3-D Electronic Files by the USPTO 
Examiners/STIC Personnel

    1. If enough patent applications are filed directed to 3-D 
structures to go forward with pursuing search capability (a 3-D file 
search, not the standard sequence search and text search already 
performed) of some sort, what databases should be investigated?
    2. What software viewer would be recommended for visual 
interpretation of the text tables? Examples:
     http://www.ncbi.nlm.nih.gov/Structure/CN3-D/cn3-D.shtml
     http://products.cambridgesoft.com/ProdInfo.cfm?pid=285
     http://www.proteinscope.com/
     http://www.candomultimedia.com/medical/

D. Questions Pertaining to 3-D File Export to a Public Database Partner

    1. If the USPTO receives 3-D structural data in electronic form, 
the USPTO would likely be able to export the data to a searchable 
public database upon publication of the application or patent grant. 
What databases should be investigated for a USPTO export arrangement?
    2. Would public databases be willing to work with the USPTO in 
developing acceptable formats and annotations, if that would be the 
best submission practice for applicants?

E. Questions Pertaining to the USPTO Publication of 3-D Files

    1. Should all 3-D files be posted on the USPTO's Publication Site 
for Issued and Published Sequences (PSIPS; http://seqdata.uspto.gov/)?
    2. Should the files be part of the text or image of the patent 
application publication or patent grant aside from electronic posting 
on PSIPS?

F. Question Pertaining to 3-D File Export to the USPTO Customers

    The USPTO would be exporting in a new file-type; would this have an 
adverse or beneficial impact on the USPTO customers?

G. Questions Pertaining to the Creation of Chemistry Structural Data 
Files

    1. What benefits do you foresee for the applicant if electronic 
filing is adopted? What disadvantages do you foresee?
    2. Has a structural chemistry data file or drawing generally been 
created for other purposes before preparation of a patent application, 
e.g., for publication in a scientific journal or submission to a 
database? If so, in what format: .mol, .cdx, CML, INChI, other?
    3. If drawing tools are used by applicants, which tools are 
generally used to create the files, e.g., ChemDraw, ISIS/Draw, ACD/
Name?
     http://www.cambridgesoft.com/products/family.cfm?FID=2
     http://www.mdli.com/products/framework/isis_draw/index.jsp
     http://www.acdlabs.com/products/name_lab/name/
    4. Is there annotation data that should be added to the drawings? 
What annotations? How would applicants prefer to add additional data?
    5. Would applicants want to cite inventors, attorneys, 
continuing application data, attorney's docket number, etc.?
    6. Should the USPTO require all structures cited in a patent 
application be submitted in electronic format? Only new data (not prior 
art)? Only a representative drawing? Only the ``actual invention'' 
after restriction of the claims and election of an invention?
    7. Would a single representation be deemed a limitation to 
applicant's disclosure?
    8. Do many/most file wrapper submissions with chemical structures 
contain multiple chemical structure drawings?
    9. Have any chemical drawings generally been submitted to a public 
entity (e.g., a database or journal) before the filing of a patent 
application?
    10. Have most of the drawings been published before the filing of a 
patent application?
    11. Would it be a hardship for applicants if the USPTO required 
drawings in a proprietary software format?
    12. Would it be a hardship for applicants if the USPTO required 
drawings in a text format that is not yet supported by the major 
drawing software tools?
     How well known is the CML format?


[[Page 35577]]


     http://www.xml-cml.org/

     How well known is the INChI format?

     http://www.iupac.org/publications/ci/2001/may/project_2000-025-1-050.html

     http://www.iupac.org/projects/2000/2000-025-1-800.html#clip
    13. What is the state of the art for chemical drawings?
     http://www.iupac.org/publications/ci/2002/2404/XML.html

H. Questions Pertaining to the USPTO Receipt of Chemistry Structure 
Files

    1. Chemical structure data received by the USPTO varies widely in 
size. Should the USPTO require that all chemical structure files 
greater than a certain size be submitted in electronic media only?
    2. Should the USPTO require submission in electronic format at the 
time of filing, or, if a paper copy is filed, permit the electronic 
submission to be filed later (with a statement indicating that the 
electronic version is the same as the version originally filed)?
    3. Should the rules be revised to specify that chemical structure 
data, if a paper copy is supplied, is to appear in a special section, 
e.g., between the specification and the Sequence Listing, or as part of 
the drawings?
    4. Chemical structures are often presented in the specification and 
claims in Markush format wherein a basic structure is defined, but 
portions thereof are variable. Are there drawing tools available that 
accurately render these types of structures? If not, what approach 
should the USPTO take to ensure that the data submitted appropriately 
reflects the invention described or claimed in the patent application? 
For example, the USPTO could require: an ``exemplary'' drawing at the 
time of filing; a drawing at the time of a restriction election, e.g., 
a single embodiment of a Markush claim; or, possibly multiple drawings.
    5. The USPTO needs to have certain data associated with files. 
Since there is no annotation data in chemical drawing files, should the 
USPTO require a ``read me'' text file to accompany the drawing file? 
Should the title of the file be the name of the drawing?

I. Question Pertaining to the Use of Chemistry Structure Files by the 
USPTO Examiners/STIC Personnel

    If a chemical structure drawing were required at the time of 
filing, how often might it have so many variables (that may be subject 
to a restriction/election requirement) that it cannot be effectively 
searched? If this is likely to be problematic, how can the USPTO 
effectively require submission of a representative drawing to be 
searched and, possibly, published?

J. Questions Pertaining to Chemistry Structure File Export to a Public 
Database Partner

    1. Should the USPTO send chemical structure data files to a public 
database partner? If so, which one(s)?
    2. Should the USPTO export data to CAS for inclusion in the 
Registry file? What about other private providers?
     http://www.cas.org/EO/regsys.html

K. Question Pertaining to the USPTO Publication of Chemistry Structure 
Files

    1. Should all chemistry structure files be posted on the USPTO's 
Publication Site for Issued and Published Sequences (PSIPS; http://seqdata.uspto.gov/), or should the chemistry drawing be published with 
the TIFF images of the patent application publication or patent grant?

L. Question Pertaining to Chemistry Structure File Export to the USPTO 
Customers

    1. Should we change the drawing files that are sent to the USPTO 
customers?
     Currently, .cdx, .mol, and TIFF versions are present 
(Note: common to Patent and Trademark Applications).

    Dated: June 15, 2005.
Jon W. Dudas,
Under Secretary of Commerce for Intellectual Property and Director of 
the United States Patent and Trademark Office.
[FR Doc. 05-12199 Filed 6-20-05; 8:45 am]
BILLING CODE 3510-16-P