PDB File Format v. 3.2
Page 159
SEQRES (updated)
Overview
SEQRES records contain a listing of the consecutive chemical components covalently linked in a
linear fashion to form a polymer. The chemical components included in this listing may be standard or
modified amino acid and nucleic acid residues. The listing may also include other residues that are
linked to the standard backbone in the polymer. Chemical components or groups covalently linked to
side chains (in peptides) or to sugars and/or bases (in nucleic acid polymers) are not listed here.
Record Format
COLUMNS        DATA TYPE      FIELD        DEFINITION
-------------------------------------------------------------------------------------
 1 -  6        Record name    "SEQRES"
 8 - 10        Integer        serNum       Serial number of the SEQRES record for the
                                           current chain. Starts at 1 and increments
                                           by one each line. Reset to 1 for each chain.
12             Character      chainID      Chain identifier. This may be any single
                                           legal character, including a blank, which
                                           is used if there is only one chain.
14 - 17        Integer        numRes       Number of residues in the chain.
                                           This value is repeated on every record.
20 - 22        Residue name   resName      Residue name.
24 - 26        Residue name   resName      Residue name.
28 - 30        Residue name   resName      Residue name.
32 - 34        Residue name   resName      Residue name.
36 - 38        Residue name   resName      Residue name.
40 - 42        Residue name   resName      Residue name.
44 - 46        Residue name   resName      Residue name.
48 - 50        Residue name   resName      Residue name.
52 - 54        Residue name   resName      Residue name.
56 - 58        Residue name   resName      Residue name.
60 - 62        Residue name   resName      Residue name.
64 - 66        Residue name   resName      Residue name.
68 - 70        Residue name   resName      Residue name.
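The column layout above maps directly onto string slicing. The following is a minimal sketch in Python, not part of the format specification; the names SeqresRecord and parse_seqres are hypothetical, and the sketch assumes the line keeps its original fixed-width spacing.

```python
from typing import List, NamedTuple


class SeqresRecord(NamedTuple):
    """Hypothetical container for the fields of one SEQRES line."""
    ser_num: int
    chain_id: str
    num_res: int
    residues: List[str]


def parse_seqres(line: str) -> SeqresRecord:
    """Slice one SEQRES line using the 1-based columns of the record format."""
    if not line.startswith("SEQRES"):
        raise ValueError("not a SEQRES record")
    line = line.ljust(80)  # pad so that short lines can be sliced safely
    # Residue fields start at columns 20, 24, ..., 68 (13 fields, 3 wide).
    fields = [line[start:start + 3].strip() for start in range(19, 70, 4)]
    return SeqresRecord(
        ser_num=int(line[7:10]),      # columns 8 - 10
        chain_id=line[11],            # column 12 (may be a blank)
        num_res=int(line[13:17]),     # columns 14 - 17
        residues=[name for name in fields if name],
    )


record = parse_seqres("SEQRES   2 A   21  TYR GLN LEU GLU ASN TYR CYS ASN")
print(record.chain_id, record.num_res, record.residues)
```

Python slices are 0-based and end-exclusive, so columns 8 - 10 become line[7:10]. Empty residue fields on the last record of a chain are dropped after stripping.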
Verification/Validation/Value Authority Control
The residues presented in the ATOM records must agree with those on the SEQRES records.
The SEQRES records are checked using sequence databases and information provided by the
depositor.
SEQRES is compared to the ATOM records during processing, and both are checked against the
sequence databases. All discrepancies are either resolved or annotated appropriately in the entry.
The ribo- and deoxyribonucleotides in the SEQRES records are distinguished. The ribo- forms of
these residues are identified with the residue names A, C, G, U and I. The deoxy- forms of these
residues are identified with the residue names DA, DC, DG, DT and DI. Modified nucleotides in the
sequence are identified by separate 3-letter residue codes. The plus character prefix to label
modified nucleotides (e.g. +A, +C, +T) is no longer used.
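The naming rule above can be expressed as a small lookup. This is an illustrative sketch only; the function name classify_nucleotide and its return labels are assumptions, not part of the format.

```python
# Residue names taken from the paragraph above.
RIBO_NAMES = {"A", "C", "G", "U", "I"}
DEOXY_NAMES = {"DA", "DC", "DG", "DT", "DI"}


def classify_nucleotide(res_name: str) -> str:
    """Return a label for a SEQRES residue name (labels are hypothetical)."""
    name = res_name.strip()
    if name in RIBO_NAMES:
        return "ribo"
    if name in DEOXY_NAMES:
        return "deoxy"
    if name.startswith("+"):
        # The plus prefix (e.g. +A) is no longer used in this format version.
        return "obsolete-plus-prefix"
    # Everything else: modified nucleotides with their own 3-letter codes,
    # amino acids, and other components.
    return "other"


print(classify_nucleotide(" DA"), classify_nucleotide("  U"))
```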
Example
         1         2         3         4         5         6         7         8
12345678901234567890123456789012345678901234567890123456789012345678901234567890
SEQRES   1 A   21  GLY ILE VAL GLU GLN CYS CYS THR SER ILE CYS SER LEU
SEQRES   2 A   21  TYR GLN LEU GLU ASN TYR CYS ASN
SEQRES   1 B   30  PHE VAL ASN GLN HIS LEU CYS GLY SER HIS LEU VAL GLU
SEQRES   2 B   30  ALA LEU TYR LEU VAL CYS GLY GLU ARG GLY PHE PHE TYR
SEQRES   3 B   30  THR PRO LYS ALA
SEQRES   1 C   21  GLY ILE VAL GLU GLN CYS CYS THR SER ILE CYS SER LEU
SEQRES   2 C   21  TYR GLN LEU GLU ASN TYR CYS ASN
SEQRES   1 D   30  PHE VAL ASN GLN HIS LEU CYS GLY SER HIS LEU VAL GLU
SEQRES   2 D   30  ALA LEU TYR LEU VAL CYS GLY GLU ARG GLY PHE PHE TYR
SEQRES   3 D   30  THR PRO LYS ALA

SEQRES   1 A    8   DA  DA  DC  DC  DG  DG  DT  DT
SEQRES   1 B    8   DA  DA  DC  DC  DG  DG  DT  DT

SEQRES   1 X   39    U   C   C   C   C   C   G   U   G   C   C   C   A
SEQRES   2 X   39    U   A   G   C   G   G   C   G   U   G   G   A   A
SEQRES   3 X   39    C   C   A   C   C   C   G   U   U   C   C   C   A
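The rules for serNum and numRes in the record format suggest a simple consistency check when assembling a chain from its records. The sketch below is one possible reading, not the validation the wwPDB performs; collect_chains is a hypothetical name, and the function assumes all records come from a single entry (the example listing above reuses chain identifiers A and B with different lengths, so it appears to combine more than one entry).

```python
from collections import defaultdict


def collect_chains(lines):
    """Return {chainID: [resName, ...]} after checking serNum and numRes."""
    chains = defaultdict(list)
    expected_serial = defaultdict(lambda: 1)  # serNum starts at 1 per chain
    declared = {}                             # numRes seen first for each chain
    for raw in lines:
        if not raw.startswith("SEQRES"):
            continue
        line = raw.ljust(80)
        ser_num = int(line[7:10])
        chain_id = line[11]
        num_res = int(line[13:17])
        if ser_num != expected_serial[chain_id]:
            raise ValueError(f"chain {chain_id!r}: serNum {ser_num} out of order")
        expected_serial[chain_id] += 1
        if declared.setdefault(chain_id, num_res) != num_res:
            raise ValueError(f"chain {chain_id!r}: numRes differs between records")
        names = (line[i:i + 3].strip() for i in range(19, 70, 4))
        chains[chain_id].extend(name for name in names if name)
    for chain_id, residues in chains.items():
        if len(residues) != declared[chain_id]:
            raise ValueError(f"chain {chain_id!r}: numRes does not match residue count")
    return dict(chains)


example = [
    "SEQRES   1 A   21  GLY ILE VAL GLU GLN CYS CYS THR SER ILE CYS SER LEU",
    "SEQRES   2 A   21  TYR GLN LEU GLU ASN TYR CYS ASN",
]
print(len(collect_chains(example)["A"]))
```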
Known Problems
Polysaccharides do not lend themselves to being represented in SEQRES.
There is no mechanism provided to describe the sequence order if the starting position is unknown.
For cyclic peptides, a residue is arbitrarily assigned as the N-terminus.
MODRES (updated)
Overview
The MODRES record provides descriptions of modifications (e.g., chemical or post-translational) to
protein and nucleic acid residues. Included are correlations between residue names given in a PDB
entry and standard residues.
Record Format
COLUMNS        DATA TYPE      FIELD        DEFINITION
--------------------------------------------------------------------------------
 1 -  6        Record name    "MODRES"
 8 - 11        IDcode         idCode       ID code of this entry.
13 - 15        Residue name   resName      Residue name used in this entry.
17             Character      chainID      Chain identifier.
19 - 22        Integer        seqNum       Sequence number.
23             AChar          iCode        Insertion code.
25 - 27        Residue name   stdRes       Standard residue name.
30 - 70        String         comment      Description of the residue modification.
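As an illustration of the layout above, a MODRES line can be split by column position. This is a minimal sketch; parse_modres is a hypothetical helper, and the dictionary keys simply reuse the field names from the table.

```python
def parse_modres(line: str) -> dict:
    """Slice one MODRES line using the 1-based columns of the record format."""
    if not line.startswith("MODRES"):
        raise ValueError("not a MODRES record")
    line = line.ljust(80)  # pad so that short lines can be sliced safely
    return {
        "idCode": line[7:11],             # columns 8 - 11
        "resName": line[12:15].strip(),   # columns 13 - 15
        "chainID": line[16],              # column 17
        "seqNum": int(line[18:22]),       # columns 19 - 22
        "iCode": line[22].strip(),        # column 23, empty string if blank
        "stdRes": line[24:27].strip(),    # columns 25 - 27
        "comment": line[29:70].strip(),   # columns 30 - 70
    }


print(parse_modres("MODRES 4ABC MSE B   32  MET  SELENOMETHIONINE"))
```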
Details
* Residues modified post-translationally, enzymatically, or by design are described in MODRES
records. In those cases where the wwPDB has opted to use a non-standard residue name for the
residue, MODRES also correlates the new name to the precursor standard residue name.
* Modified nucleotides in the sequence are now identified by separate 3-letter residue codes. The
plus character prefix to label modified nucleotides (e.g. +A, +C, +T) is no longer used.
* MODRES is mandatory when modified standard residues exist in the entry. Examples of some
modification descriptions:
Glycosylation site
Post-translational modification
Designed chemical modification
Phosphorylation site
D-configuration
* A MODRES record is not required if coordinate records are not provided for the modified residue.
* D-amino acids are given their own residue name (resName), e.g., DAL for D-alanine. This resName
appears in the SEQRES records, and has the associated MODRES, HET, and FORMUL records.
The coordinates are given as HETATMs within the ATOM records and occur in the correct order
within the chain. This ordering is an exception to the stated Order of Records.
* When a standard residue name is used to describe a modified site, resName (columns 13-15) and
stdRes (columns 25-27) contain the same value.
Verification/Validation/Value Authority Control
MODRES is generated by the wwPDB.
Relationships to Other Record Types
MODRES maps ATOM and HETATM records to the standard residue names. HET and
FORMUL records may also appear.
Example
         1         2         3         4         5         6         7         8
12345678901234567890123456789012345678901234567890123456789012345678901234567890
MODRES 2R0L ASN A   74  ASN  GLYCOSYLATION SITE
MODRES 1IL2 1MG D 1937    G  1N-METHYLGUANOSINE-5'-MONOPHOSPHATE
MODRES 4ABC MSE B   32  MET  SELENOMETHIONINE
4. Heterogen Section (updated)
The heterogen section of a PDB formatted file contains the complete description of non-standard
residues in the entry. Detailed chemical definitions of non-polymer chemical components are
described in the Chemical Component Dictionary (ftp://ftp.wwpdb.org/pub/pdb/data/monomers).
HET
HET records are used to describe non-standard residues, such as prosthetic groups, inhibitors,
solvent molecules, and ions for which coordinates are supplied. Groups are considered HET if they
are not part of a biological polymer described in SEQRES and are considered to be molecules bound
to the polymer, or if they are chemical species that constitute part of a biological polymer and are
not one of the following:
* standard amino acids, or
* standard nucleic acids (C, G, A, U, I, DC, DG, DA, DU, DT and DI), or
* unknown amino acid (UNK) or nucleic acid (N), where UNK and N are used to indicate the
unknown residue name.
HET records also describe chemical components for which the chemical identity is unknown, in which
case the group is assigned the hetID UNL (Unknown Ligand).
Record Format
COLUMNS        DATA TYPE      FIELD          DEFINITION
---------------------------------------------------------------------------------
 1 -  6        Record name    "HET   "
 8 - 10        LString(3)     hetID          Het identifier, right-justified.
13             Character      ChainID        Chain identifier.
14 - 17        Integer        seqNum         Sequence number.
18             AChar          iCode          Insertion code.
21 - 25        Integer        numHetAtoms    Number of HETATM records for the group
                                             present in the entry.
31 - 70        String         text           Text describing Het group.
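The HET layout can be read the same way, by slicing on the listed columns. The sketch below is illustrative only; parse_het is a hypothetical name, and the record-name test is written so that HETNAM and HETATM lines are rejected.

```python
def parse_het(line: str) -> dict:
    """Slice one HET line using the 1-based columns of the record format."""
    # The record name occupies columns 1 - 6 and is "HET" padded with blanks,
    # which distinguishes it from HETNAM, HETSYN and HETATM.
    if line[0:6].rstrip() != "HET":
        raise ValueError("not a HET record")
    line = line.ljust(80)  # pad so that short lines can be sliced safely
    return {
        "hetID": line[7:10].strip(),         # columns 8 - 10
        "chainID": line[12],                 # column 13 (may be a blank)
        "seqNum": int(line[13:17]),          # columns 14 - 17
        "iCode": line[17].strip(),           # column 18, empty string if blank
        "numHetAtoms": int(line[20:25]),     # columns 21 - 25
        "text": line[30:70].strip(),         # columns 31 - 70
    }


print(parse_het("HET    UDP  A1457      25"))
```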
Details
* Each HET group is assigned a hetID of not more than three (3) alphanumeric characters. The
sequence number, chain identifier, insertion code, and number of coordinate records are given for
each occurrence of the HET group in the entry. The chemical name of the HET group is given in the
HETNAM record, and synonyms for the chemical name are given in the HETSYN records; see
ftp://ftp.wwpdb.org/pub/pdb/data/monomers.
* There is a separate HET record for each occurrence of the HET group in an entry.
* A particular HET group is represented in the PDB archive with a unique hetID.
* PDB entries do not have HET records for water molecules, deuterated water, or methanol (when
used as solvent).
* Unknown atoms or ions will be represented as UNX with the chemical formula X1. Unknown
ligands are UNL; unknown amino acids are UNK.
Verification/Validation/Value Authority Control
For each het group that appears in the entry, the wwPDB checks that the corresponding HET,
HETNAM, HETSYN, FORMUL, HETATM, and CONECT records appear, if applicable. The HET
record is generated automatically using the Chemical Component Dictionary and information from the
HETATM records.
Each unique hetID represents a unique molecule.
Relationships to Other Record Types
For each het group that appears in the entry, there must be corresponding HET, HETNAM, HETSYN,
FORMUL, HETATM, and CONECT records. LINK records may also be created.
Example
         1         2         3         4         5         6         7         8
12345678901234567890123456789012345678901234567890123456789012345678901234567890
HET    TRS    975       8
HET    UDP  A1457      25
HET    B3P  A1458      19
HET    NAG  Y   3      15
HET    FUC  Y   4      10
HET    NON  Y   5      12
HET    UNK  A 161       1
HETNAM
Overview
This record gives the chemical name of the compound with the given hetID.
Record Format
COLUMNS        DATA TYPE       FIELD           DEFINITION
----------------------------------------------------------------------------
 1 -  6        Record name     "HETNAM"
 9 - 10        Continuation    continuation    Allows concatenation of multiple records.
12 - 14        LString(3)      hetID           Het identifier, right-justified.
16 - 70        String          text            Chemical name.
Details
* Each hetID is assigned a unique chemical name for the HETNAM record; see
ftp://ftp.wwpdb.org/pub/pdb/data/monomers.
* Other names for the group are given on HETSYN records.
* PDB entries follow IUPAC/IUB naming conventions to describe groups systematically.
* The special character “~” is used to indicate superscript in a heterogen name. For example,
N with a superscript 6 will be listed in the HETNAM section as N~6~, with the ~ character
indicating both the start and end of the superscript in the name, e.g.,
N-(BENZYLSULFONYL)SERYL-N~1~-{4-[AMINO(IMINO)METHYL]BENZYL}GLYCINAMIDE
* Continuation of chemical names onto subsequent records is allowed.
* Only one HETNAM record is included for a given hetID, even if the same hetID appears on more
than one HET record.
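The “~” convention described in the list above can be undone when displaying a name. The sketch below converts each ~...~ span to HTML-style <sup> markup; the choice of HTML as the output and the name render_superscripts are assumptions made for illustration.

```python
import re

# A superscript is whatever sits between a pair of "~" characters.
_SUPERSCRIPT = re.compile(r"~([^~]+)~")


def render_superscripts(name: str) -> str:
    """Replace each ~x~ span with <sup>x</sup> (output markup is an assumption)."""
    return _SUPERSCRIPT.sub(r"<sup>\1</sup>", name)


print(render_superscripts("N~6~"))
```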
Verification/Validation/Value Authority Control
For each het group that appears in the entry, the corresponding HET, HETNAM, FORMUL, HETATM,
and CONECT records must appear. The HETNAM record is generated automatically using the
Chemical Component Dictionary and information from HETATM records.
Relationships to Other Record Types
For each het group that appears in the entry, there must be corresponding HET, HETNAM, FORMUL,
HETATM, and CONECT records. HETSYN and LINK records may also be created.
Example
         1         2         3         4         5         6         7         8
12345678901234567890123456789012345678901234567890123456789012345678901234567890
HETNAM     NAG N-ACETYL-D-GLUCOSAMINE
HETNAM     SAD BETA-METHYLENE SELENAZOLE-4-CARBOXAMIDE ADENINE
HETNAM   2 SAD DINUCLEOTIDE
HETNAM     UDP URIDINE-5'-DIPHOSPHATE
HETNAM     UNX UNKNOWN ATOM OR ION
HETNAM     UNL UNKNOWN LIGAND
HETNAM     B3P 2-[3-(2-HYDROXY-1,1-DIHYDROXYMETHYL-ETHYLAMINO)-
HETNAM   2 B3P PROPYLAMINO]-2-HYDROXYMETHYL-PROPANE-1,3-DIOL
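Continuation records such as those for SAD and B3P above need to be joined to recover the full chemical name. The following sketch shows one way to do this; collect_hetnam is a hypothetical name, records are assumed to appear in continuation order, and the joining rule (no separator after a trailing hyphen, otherwise a single space) is a heuristic assumption that the format description does not state.

```python
def collect_hetnam(lines):
    """Return {hetID: chemical name}, joining continuation records in order."""
    parts = {}
    for raw in lines:
        if not raw.startswith("HETNAM"):
            continue
        line = raw.ljust(80)
        het_id = line[11:14].strip()       # columns 12 - 14
        text = line[15:70].strip()         # columns 16 - 70
        parts.setdefault(het_id, []).append(text)
    names = {}
    for het_id, chunks in parts.items():
        name = chunks[0]
        for chunk in chunks[1:]:
            # Assumed heuristic: a name broken after "-" continues directly,
            # any other break is treated as a break between words.
            name += chunk if name.endswith("-") else " " + chunk
        names[het_id] = name
    return names


records = [
    "HETNAM     SAD BETA-METHYLENE SELENAZOLE-4-CARBOXAMIDE ADENINE",
    "HETNAM   2 SAD DINUCLEOTIDE",
]
print(collect_hetnam(records)["SAD"])
```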