Chapter 5. Introduction to programme GRIN
5.1. Abstract
Programme GRIN is used to prepare and check an input file (GRINKOUT) for Programme GRID, which is a computational procedure for determining energetically favourable binding sites on molecules of known structure. Programme GRID is described separately below. GRIN and GRID may be used to study individual molecules such as drugs; molecular arrays such as membranes or crystals; and macromolecules such as proteins, nucleic acids, glycoproteins or polysaccharides. GRIN must be used before GRID.
The overall procedure and some preliminary results obtained with Programmes GRIN and GRID have been described in the Journal of Medicinal Chemistry (1985), Volume 28, pages 849-857. However, the Programmes have been largely rewritten and greatly extended since that paper was submitted for publication.
5.2. Functions of programme GRIN
5.2.1. Error checking
It is not uncommon for computer programs to give faulty results because of mistakes in the original input data. Programme GRIN is used to forestall such errors. GRIN merges two input files (PDB and GRUB), and the merged file (GRINKOUT) is checked and then saved for use as input to the following Programme GRID (see Figure 2, and Appendix A, which contains a list of technical terms and abbreviations). A Lineprinter OUTput (GRINLOUT; Figure 4) is also produced, and this contains warnings of possible errors or problems with the original PDB input data. It may also contain warnings about the current version of datafile GRUB, which is checked as well because a User who edits the file without due care can introduce errors into it.
By preparing and checking the input in this way, the chances of successful computing with the following Programme GRID are increased. This is important because a full GRID run on a large macromolecule can occupy the central processing unit of a computer for some time. The merging and checking by Programme GRIN, on the other hand, may only require a few seconds.
5.2.2. Merging the input files
5.2.2.1. The PDB input file
The first input file is called PDB (Protein Data Bank), and contains details of the molecule to be studied. This molecule or macromolecule is called the Target, and the input file must be in standard Protein Data Bank (PDB) format as used at Brookhaven. The PDB file (Figure 1) contains the name and x,y,z coordinate position of every "heavy atom" (i.e. non-hydrogen atom) in the Target. Hydrogen positions may also be included in the PDB file if they are available, but this is not normally the case with proteins, nucleic acids and polysaccharides.
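As an illustration of what such a file holds, the following minimal Python sketch reads the name and x,y,z position of each atom from the fixed columns of ATOM and HETATM records, and then keeps the heavy atoms. It is not part of Programme GRIN. The names `AtomRecord`, `parse_pdb_line`, `is_hydrogen` and `format_line` are hypothetical, the example coordinates are invented, and the hydrogen test on the atom name is a simplifying assumption.

```python
from typing import NamedTuple, Optional


class AtomRecord(NamedTuple):
    record: str    # "ATOM" or "HETATM"
    name: str      # atom name, e.g. "CA"
    res_name: str  # residue name, e.g. "GLU"
    res_seq: int   # residue sequence number
    x: float
    y: float
    z: float


def parse_pdb_line(line: str) -> Optional[AtomRecord]:
    """Return the atom held on one PDB line, or None for any other record."""
    record = line[0:6].strip()
    if record not in ("ATOM", "HETATM"):
        return None
    return AtomRecord(
        record=record,
        name=line[12:16].strip(),      # columns 13-16
        res_name=line[17:20].strip(),  # columns 18-20
        res_seq=int(line[22:26]),      # columns 23-26
        x=float(line[30:38]),          # columns 31-38
        y=float(line[38:46]),          # columns 39-46
        z=float(line[46:54]),          # columns 47-54
    )


def is_hydrogen(atom: AtomRecord) -> bool:
    """Crude test on the atom name (an assumption; it would misread mercury 'HG')."""
    return atom.name.lstrip("0123456789").startswith("H")


def format_line(record, serial, name, res_name, chain, res_seq, x, y, z):
    """Build a fixed-column line; used here only to make example data."""
    return (f"{record:<6}{serial:>5} {name:<4} {res_name:>3} {chain}"
            f"{res_seq:>4}    {x:8.3f}{y:8.3f}{z:8.3f}")


# Invented example data, not a real structure.
lines = [
    "REMARK example data, not a real structure",
    format_line("ATOM", 1, " N  ", "GLU", "A", 1, 11.104, 13.207, 2.100),
    format_line("ATOM", 2, " H  ", "GLU", "A", 1, 11.900, 13.800, 2.300),
    format_line("HETATM", 3, " O  ", "HOH", "A", 2, -4.250, 0.500, 7.125),
]
atoms = [a for a in map(parse_pdb_line, lines) if a is not None]
heavy_atoms = [a for a in atoms if not is_hydrogen(a)]
for atom in heavy_atoms:
    print(atom.record, atom.res_name, atom.name, atom.x, atom.y, atom.z)
```

Slicing by column rather than splitting on spaces matters because adjacent fields in a PDB record may touch when the numbers are large.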
Programme GREAT may be used to convert other formats into standard PDB format if need be.
A complete Target may consist of more than one molecule. For example it might be an enzyme-cofactor complex, in which the enzyme is a set of "atoms" according to standard Protein Data Bank conventions, but the cofactor is a set of "hetero-atoms" according to the same conventions. Water molecules, solvent molecules, ions, inhibitors, cofactors or other ligands might all be present as hetero-atoms.
Programme GRIN is normally able to deal with any combination of molecules in the Target, so long as the appropriate conventions are followed. If problems are encountered, warnings will usually be printed to the lineprinter output file GRINLOUT.
5.2.2.2. The GRUB datafile
The second input file (GRUB; Figure 5) contains a list of Energy Variables appropriate to each type of atom which might occur in the chosen Target. These variables define the strength of the Lennard-Jones, hydrogen bond and electrostatic interactions made by an atom. The Energy Variables may often relate to an "extended atom"; i.e. a heavy atom together with attached hydrogen atoms. Thus one line in datafile GRUB might relate to the beta -CH2- group of glutamic acid considered as a single entity. The next line would relate to the gamma -CH2- of glutamic acid; the next to the delta carbon atom and so on. Each line contains a set of Energy Variables which will be used in order to evaluate the energy functions in Programme GRID. Datafile GRUB is supplied with Programme GRIN, and may be extended by the User.
ATOMS and HETATMS: Datafile GRUB contains the list of Energy Variables (Van der Waals radius, charge, etc), and special provision is made for hetero-atom variables in GRUB. This distinction between ATOMS and HETATMS is important. It is defined by the conventions of the Brookhaven Data Bank, and is discussed in detail below (see Section 6.3.1.1).
5.2.2.3. Merging the PDB and GRUB files
The two files PDB and GRUB are merged by Programme GRIN in order to add appropriate Energy Variables to every atom in the Target. Checks are also carried out to be sure that all atoms are present; that they are arranged in the right order; that none occurs twice and so on. The N and C terminals and S-S bonds of proteins are identified, and appropriate adjustments are made to the variables associated with these special atoms before they are written to file GRINKOUT (Figure 6). Counter-ions are added to the phosphate groups of nucleic acids.
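The merging and checking described above can be pictured with a short Python sketch. It is only an illustration under stated assumptions: the table `GRUB_SKETCH`, its field names and its numbers are invented and do not reproduce the real layout or values of datafile GRUB, and the function `merge_and_check` is hypothetical. It covers three of the checks (no atom occurs twice, every atom has Energy Variables, no expected atom is absent) and leaves out ordering, terminal residues, S-S bonds and counter-ions.

```python
from collections import Counter, defaultdict

# Hypothetical stand-in for datafile GRUB: (residue, atom name) -> variables.
# Field names and numbers are illustrative only, not real GRUB values.
GRUB_SKETCH = {
    ("GLY", "N"):  {"radius": 1.65, "charge": -0.10},
    ("GLY", "CA"): {"radius": 1.90, "charge": 0.00},
    ("GLY", "C"):  {"radius": 1.90, "charge": 0.10},
    ("GLY", "O"):  {"radius": 1.60, "charge": -0.10},
}


def merge_and_check(atoms, table):
    """Attach variables from `table` to each atom and collect warnings."""
    merged, warnings = [], []

    # Check 1: no atom may occur twice in the same residue.
    seen = Counter((a["res_seq"], a["res_name"], a["name"]) for a in atoms)
    for (res_seq, res_name, name), count in seen.items():
        if count > 1:
            warnings.append(f"{res_name} {res_seq}: atom {name} occurs {count} times")

    # Check 2: every atom must have an entry in the table.
    present = defaultdict(set)
    for atom in atoms:
        key = (atom["res_name"], atom["name"])
        present[(atom["res_seq"], atom["res_name"])].add(atom["name"])
        if key not in table:
            warnings.append(f"{key[0]} {atom['res_seq']}: no variables for atom {key[1]}")
            continue
        merged.append({**atom, **table[key]})

    # Check 3: every atom the table lists for a residue type must be present.
    for (res_seq, res_name), names in present.items():
        expected = {name for (res, name) in table if res == res_name}
        for name in sorted(expected - names):
            warnings.append(f"{res_name} {res_seq}: atom {name} is missing")

    return merged, warnings


# Invented Target fragment: CA is duplicated and O is absent.
target = [
    {"res_seq": 1, "res_name": "GLY", "name": "N"},
    {"res_seq": 1, "res_name": "GLY", "name": "CA"},
    {"res_seq": 1, "res_name": "GLY", "name": "CA"},
    {"res_seq": 1, "res_name": "GLY", "name": "C"},
]
merged, warnings = merge_and_check(target, GRUB_SKETCH)
for message in warnings:
    print("WARNING:", message)
```

Collecting the warnings in a list, instead of stopping at the first problem, mirrors the way a single run reports every suspect record at once.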
5.2.3. Computing the coordinates of hydrogen atoms
The PDB file for a macromolecular Target normally consists of ATOM records, as defined by the conventions of the Brookhaven Data Bank. It does not normally include the x,y,z coordinate positions of any hydrogen atoms, although many hydrogens may be of crucial importance for hydrogen-bonding interactions.
When Programme GRIN is processing a macromolecule, it does not expect to find any hydrogen coordinates in the PDB file, but it will check in case hydrogens are actually given. It will, of course, use the hydrogen coordinates if they are supplied. On the other hand, if there are no hydrogens in the macromolecule file, GRIN will search the Target for atoms such as hydroxy oxygen which are bonded to hydrogen-bonding hydrogen atoms. The coordinates of these hydrogen-bonding hydrogens will then be calculated by standard geometry, and added to the output file GRINKOUT.
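To show what a calculation "by standard geometry" can look like, the Python sketch below places a hydroxy hydrogen from the positions of three heavy atoms, using a bond length, a bond angle and a torsion angle. The manual does not state the values or the method GRIN uses, so everything here is an assumption made for illustration: the function `place_atom`, the O-H length of 0.96 Angstrom, the C-O-H angle of 109.5 degrees, the trans torsion of 180 degrees, and the invented coordinates.

```python
import math


def sub(u, v):
    return tuple(a - b for a, b in zip(u, v))


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def unit(u):
    length = math.sqrt(sum(a * a for a in u))
    return tuple(a / length for a in u)


def place_atom(a, b, c, bond, angle_deg, dihedral_deg):
    """Place atom d so that |c-d| = bond, angle b-c-d = angle_deg
    and torsion a-b-c-d = dihedral_deg. a, b, c must not be collinear."""
    theta = math.radians(angle_deg)
    phi = math.radians(dihedral_deg)
    bc = unit(sub(c, b))               # axis along the b-c bond
    n = unit(cross(sub(b, a), bc))     # normal to the a-b-c plane
    m = cross(n, bc)                   # completes the local frame
    local = (-bond * math.cos(theta),
             bond * math.sin(theta) * math.cos(phi),
             bond * math.sin(theta) * math.sin(phi))
    return tuple(c[i] + local[0] * bc[i] + local[1] * m[i] + local[2] * n[i]
                 for i in range(3))


# Invented heavy-atom positions for a serine-like side chain (Angstrom).
ca = (-0.5, 1.4, 0.0)
cb = (0.0, 0.0, 0.0)
og = (1.43, 0.0, 0.0)

# Assumed geometry: O-H 0.96 A, C-O-H 109.5 degrees, torsion 180 degrees.
hg = place_atom(ca, cb, og, bond=0.96, angle_deg=109.5, dihedral_deg=180.0)
print("hydrogen position:", tuple(round(v, 3) for v in hg))
```

The torsion angle is the one free choice for a hydroxy hydrogen, since the O-H bond can rotate about the C-O bond; the value used here is arbitrary.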
A different procedure is followed when the Target is a small molecule. In this case "HETATMS" ("hetero-atoms" as defined by Brookhaven conventions) are used in the PDB file instead of "ATOMS", and the small molecule is called a "hetero-molecule". Moreover, it often happens that the positions of the hydrogens in a small Target molecule have actually been determined by X-ray crystallography. Programme GRIN will accept any observed hydrogen positions which are included as separate records in the PDB input file for a "hetero-molecule". It will then go on to compute the positions of all the other hydrogens in the molecule, whose positions were not reported.
In summary, therefore, the default procedure for macromolecules is to represent the Target as ATOMS, and only consider the hydrogen-bonding hydrogens. The default procedure for small molecules is to represent the Target as HETATMS, and treat all hydrogens as part of the Target.
5.2.4. Sets of targets
As mentioned above, the primary input for GRIN is normally a single PDB file which is processed by itself in a single GRIN run. However, a different primary input may be used if the User wants to process many PDB files one after the other as a Set. For instance, if the three PDB files PHENOL.PDB, PHENOLATE.PDB and PYRIDINE.PDB were to be processed as a Set, then the primary input to GRIN could be a list of the file names:
Phenol Phenolate Pyridine
This list would be typed into a file which would be called FILE.LIST, and would be used as the primary input for GRIN. The Programme would make three separate runs, giving three separate output files PHENOL.KOUT, PHENOLATE.KOUT and PYRIDINE.KOUT. However, it would make these three runs as one single computing job. This is much quicker and easier than preparing and running three individual jobs, and ensures that each PDB file receives exactly the same treatment as the others.
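The bookkeeping for such a Set can be sketched in a few lines of Python. This is an illustration only and does not call Programme GRIN: the functions `plan_set` and `process_set` are hypothetical, and the rules for turning a list entry into file names (upper case, with .PDB for input and .KOUT for output) are inferred from the example above.

```python
def plan_set(list_text):
    """Map each name in the list to its (input, output) file names."""
    names = list_text.split()  # accepts one name per line or several per line
    return [(f"{name.upper()}.PDB", f"{name.upper()}.KOUT") for name in names]


def process_set(list_text, run_one):
    """Run every Target in the list as one job, treating each the same way."""
    outputs = []
    for pdb_name, kout_name in plan_set(list_text):
        run_one(pdb_name, kout_name)  # placeholder for one individual run
        outputs.append(kout_name)
    return outputs


# Stand-in for a real run: it only records what would be processed.
log = []
outputs = process_set("Phenol Phenolate Pyridine",
                      lambda pdb, kout: log.append((pdb, kout)))
print(outputs)
```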
Note: great.list and file.list
The FILE.LIST for Programme GRIN will normally have been prepared previously as GREAT.LIST for Programme GREAT. See under the heading PROCESSING A SET OF INPUT FILES.
5.2.5. Other uses for programme GRIN
Although Programme GRIN is specially designed to produce an acceptable input file for the Molecular Discovery Programmes, it may often be useful in a wider context for detecting inconsistencies in PDB files. This application can be particularly useful if the PDB files were written by other programs which did not incorporate adequate error checking.
At the end of any GRIN run it is important to assess the error messages printed to the lineprinter file GRINLOUT. It may then be necessary to correct the original PDB file, and perform further GRIN runs until no significant error messages occur. The GRINKOUT file produced at that time, but not before, should be suitable as an input file for use with Programme GRID.