GRID manual
Chapter 4. Programme GREAT
Programme GREAT provides an integrated supervising environment for the Molecular Discovery Programmes. It is also used in order to deal with Targets when the coordinates of the atoms are not available in Brookhaven PDB format. Furthermore, it can check the names of atoms, and can change inappropriate atom names into the correct names for input to GRIN and GRID.
GREAT is particularly recommended for New Users who are just starting to use the Programmes and who prefer not to use the GREATER GUI. Experienced Users often prepare their command files for GRIN and GRID directly, as described in later Sections of this User Manual. However, they may still want to use Programme GREAT in order to prepare their PDB input files.
Note on aliases: On most Unix systems there will be aliases for the names of the main Programmes. However the following examples are designed to work whether you have aliases or not.
To start, type great in order to activate the Command File which controls Programme GREAT:
great
and press the 'RETURN' key. As soon as you have pressed 'RETURN' you should see a menu like this:
WELCOME TO THE MOLECULAR DISCOVERY PROGRAMMES
*********************************************
MENU A
****** Do you want to use:-
(1) Programme GRIN (GRIN)
(2) Programme GRID (GRID)
(3) Reserved for future use (****)
(4) or do you want to prepare a pdb FILE (FILE)
(5) or do you want HELP or (HELP)
(6) or do you want to use the operating system (SPAWN)
(7) or return to the operating system (QUIT)
(8) or do you want to switch off this BELL (BELL)
ANSWER NOW:
Programme GREAT works by asking you questions, and waiting for your answers. Most of the questions are collected into Menus, and you are looking at the first MENU A.
Programme GREAT collects all your instructions for GRIN, and writes them into a Command File called grin.in. It will submit the Command file, or save it and wait for more instructions. The Command File can be printed to form a permanent record of the GRIN run, and you can then use GREAT again in order to prepare and submit Programme GRID.
Experienced Users should have no problems in using Menu A and subsequent Menus in order to prepare and submit a Command File for Programme GRIN. The Command Files for Programmes GRID and GRAB are prepared and submitted similarly. New Users may find this confusing. They should turn to the tutorial demonstrations and work through the examples in those directories. The following points should be noted when you are using Programme GREAT:
The input and output files do not all have to be in your working directory. You can always give a fuller file specification.
Programme GREAT normally works by generating a new command file for Programme GRIN.
The existing command file will always be saved when the new command file is written, unless you explicitly say that you do not want it.
Programme GREAT cannot deal with long directory or long file names. Each of these must be less than 50 characters long, so the file name:
THISISAPARTICULARLYBIGNAME.WITHAVERYBIGEXTENSION
and the default directory name:
DUA1:[VIVELAFRANCE.GODSAVETHEQUEEN.INGODWETRUST]
would each be acceptable as individual entries, since each is less than 50 characters. The same limitation applies when the strings are combined as a complete file specification, so the combined file and directory name:
[THEMOVINGFINGERWRITESANDHAVINGWRIT]MOVESON.QUOTE
would be acceptable since it has 49 characters, but this is the limit. In practice, new Users will generally find that it is best to use shorter file and directory names, since short names are less prone to typing errors! Furthermore, your operating system may limit you to shorter names.
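The three example names quoted in this section can be checked against the 50-character rule with a few lines of shell. This is only a sketch: the function name check_spec and the variable MAX_LEN are invented for the illustration and are not part of the Molecular Discovery Programmes.

```shell
#!/bin/sh
# Hypothetical helper (not part of GREAT): report whether a file
# specification is short enough, i.e. fewer than 50 characters.
MAX_LEN=49   # "less than 50 characters", so 49 is the longest accepted

check_spec() {
    spec=$1
    len=${#spec}                       # length of the string in characters
    if [ "$len" -le "$MAX_LEN" ]; then
        echo "OK       ($len chars): $spec"
    else
        echo "TOO LONG ($len chars): $spec"
        return 1
    fi
}

# The file name, directory name and combined specification from the text.
check_spec 'THISISAPARTICULARLYBIGNAME.WITHAVERYBIGEXTENSION'
check_spec 'DUA1:[VIVELAFRANCE.GODSAVETHEQUEEN.INGODWETRUST]'
check_spec '[THEMOVINGFINGERWRITESANDHAVINGWRIT]MOVESON.QUOTE'
```

All three names pass the check, and the combined specification sits exactly on the limit. The function returns a non-zero status for a name that is too long, so it can also be used inside a loop over many file names.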
There are two main menus (B and C) when you are using Programme GREAT in order to prepare a GRIN run. Both of these menus contain the same item:
(8) Do you want to alter something ELSE
and this item allows you to toggle between the two menus. Similarly, you can toggle between the two main menus for Programme GRID by typing the Keyword ELSE.
Many keywords may be typed when you are using GREAT. You are not restricted to the keywords that are being displayed on the screen at any particular time. Type the keyword: "KEYWORDS" if you want to see the complete list of available keywords.
Type QUIT to return from Programme GREAT to the operating system. You may have to type it two or three times. You should NEVER try to exit from Programme GREAT by typing CTRL/C or CTRL/Y or CTRL/Z, or by hitting the BREAK or ESCAPE keys. Work in progress may be lost, especially if you are working over a network.
Programme GREAT checks the input files, and often asks questions in order to be sure that the files are acceptable. Most questions can be answered with YES or NO. However another response will be needed for a few questions, where experience has shown that Users may answer YES too quickly when they really meant NO.
When dealing with a file of atom coordinates, Programme GREAT will try to convert other formats into PDB format. It does this automatically when preparing to run Programme GRIN, or you can convert formats directly by calling Item 4 in the Main Menu for GREAT (see above).
Programme GREAT can process several files of coordinates one after the other as a Set. The individual Targets must be listed in a special file called "great.list". Users should note that Programme GREAT will always process a file called GREAT.LIST, whenever it finds a file with this name in the working directory. You should therefore rename GREAT.LIST as soon as you have finished using it, or move it to another directory. From version 19 of GRID, an alternative method to process a list of mol2 or sdf files is provided. See the Gmol2 and Gsdf programs.
The list of Targets in GREAT.LIST can be as long as you like. Each Target name must appear on a separate line. Each name must be less than 50 characters long. Detailed information is given below.
There are two distinct stages in the preparation of coordinate files by Programme GREAT. First, the format is changed to PDB format, while the original names of the atoms are retained. The User can use this converted file as it stands as input for some other computation, but not for GRIN and GRID. Alternatively, he or she can continue using Programme GREAT in order to change the atom names, so that they will be fully compatible with the Molecular Discovery Programmes. A simplified flow chart for format conversion by Programme GREAT is shown below.

DIAGRAM 2: SIMPLIFIED FLOW CHART FOR FORMAT CONVERSION BY PROGRAMME GREAT
4.1. Format conversion by programme GREAT
Diagram 2 (above) shows a simple flow-chart for Programme GREAT when it is preparing a Brookhaven PDB file from data in some other format. Programme GREAT can ask many different questions during this format conversion, and it would not be appropriate to describe them all in full. However the general outline is defined by the flow chart above, and selected questions will now be considered in more detail.
If your input file: MYTARGET.DAT is already in PDB format, you may initially be asked:
Are you CERTAIN that your PDB input file is correct?
and the answer CERTAIN will finish the dialogue about format conversion. Notice that YES is not an acceptable answer in this case, because experience has shown that Users may give the answer YES to this particular question when they actually meant to give the answer NO!
If your input file: MYTARGET.DAT is not in PDB format, you may be asked:
Does your coordinate input file contain ONLY the following six common elements:
CARBON HYDROGEN NITROGEN OXYGEN PHOSPHORUS SULPHUR
This is because an equivocal atom name has been detected in the file. For example the atom name CA might mean an alpha-carbon atom, or it might be calcium, and Programme GREAT is not going to guess unless you have asked it to do so.
If your original input file was not in PDB format, it will be converted to PDB format and you may then be asked:
The HETATMS can be renamed for use with Programme GRID. Is this what you want?
This question is also asked when the input file is in PDB format, but some or all of the atoms are listed as HETATM records. You must answer the question with 'NO' if you want to use the PDB file as input for some Programme which is expecting the original atom names. However, if you want to use your PDB file as input for GRID you must answer 'YES'.
ATOMS and HETATMS: there is an important distinction between ATOMS and HETATMS. If you are in any doubt about the distinction, see under Atom Name Conventions.
If your original input file was not in PDB format, you may be asked:
Do you want to give each molecule an appropriate name?
This would be asked if there were several different molecules in your input file (such as the main Target molecule and a counter ion and some solvent molecules), and the original input format did not assign an identifiable name to each individual molecule. You can now assign explicit names to molecules, or let Programme GREAT call them MOL1, MOL2 etc.
NOTE: If you are asked this question about molecule names, you should always consider whether you really do want a Target containing two or more different molecular species. Sometimes a User may not have been aware that his or her file contained more than one molecule! If that has happened to you, it may be necessary to edit the final PDB file and remove any unwanted ions or molecules or ligands from the file before you use Programmes GRIN and GRID.
If your input file does not contain hydrogen atoms, you may be asked a question such as:
Is this an aromatic hydroxyl oxygen (e.g. a phenol)?
Programme GREAT is asking this question because there are no hydrogens in the input file, and it cannot be sure if your molecule is a phenol, or a phenolate, or perhaps even a quinone. Only you know the answer if the hydrogen positions are not shown in the file.
If you started with a file from which the hydrogen atoms were missing, then some of the atom names may still remain unassigned after you have answered all the questions. In order to help you deal with these, Programme GREAT will always type the first few lines of your PDB file after all possible changes have been made. It will also tell you how many atom names still need to be assigned.
You can then make the final assignments on an atom-by-atom basis while Programme GREAT prompts you, or you may prefer to exit from Programme GREAT and use your favourite screen editor for this last important check. If you do use Programme GREAT, it will select the atoms which need consideration, so that you do not need to reassess all the atoms in the file.
Note on temporary "working" files: In some cases an intermediate command file is temporarily produced by Programme GREAT, and is deleted again soon afterwards. This file will be named JUNK or JUNKIE and you should not try to delete one of these "JUNK" files while you are using GREAT.
4.2. Rapid format conversion by programme GREAT
Programme GREAT may ask:
..... Do you want Programme Great to read your pdb input
file and make sensible GUESSES about atom names,
..... or do you want to MANAGE Programme Great, so that
it does not have to make any guesses at all
and the Programme will work differently if you give the answer GUESS instead of the recommended answer MANAGE. These are the main differences:
When Programme GREAT is allowed to GUESS, it will not ask you so many questions. For instance, you will not be asked if you want to change ATOMS into HETATMS, but Programme GREAT will work on the assumption that each of these labels is correct.
You will not be asked about some of the equivocal entries in your file. For instance, Programme GREAT will assume that the atom name NE1 is a protein side-chain amide nitrogen, and will not ask if it might be a Neon atom. Similarly CA will be taken as an alpha-carbon; not Calcium.
Many messages from Programme GREAT will still be written to the screen so that you can check its progress. However, you will not have to answer all the questions because GREAT will make more decisions by itself. This is a simplification for the User, but GREAT may take some time to make its decisions and may not make them so well as you would make them yourself.
Programme GREAT can sometimes be rather pedantic if you do not let it GUESS, but prefer to MANAGE things yourself. For example, you could have assigned the name HOH to the oxygen of a water molecule, and with some Targets GREAT might ask if HOH was a Holmium eta atom (HoH) if guesses were not allowed. Each User will have to decide if GREAT should be allowed to GUESS or not.
We suggest that you turn the BELL OFF, so that it does not ring while GREAT is running without supervision. You can then get on with another job, because the bell should still ring when you actually do have to answer a question.
4.2.1. Processing a set of input files
Programme GREAT may be used to process a lot of input files one after the other as a SET. However, more sophisticated procedures are now available to do this job. Please see the utility programs Gmol2 and Gsdf and the Tutorial05 section.
We suggest you skip the next Section which deals with this feature, when you are reading this User Manual for the first time. Continue instead by reading the following Section which describes how to use Programme GRIN to process one single input file.
4.3. Preparing a set of targets with GREAT
You may have a Set of many coordinate files which you need to process one after the other. This most often happens when several small molecules are being prepared, in order to generate Grid maps that will be used as input for a Partial Least Squares analysis or another statistical treatment. Some of the files may be in PDB format, but they may not have appropriate atom names for input to GRIN and GRID. Other files may be in Cambridge Databank Format, or in various proprietary formats, or your own in-house format.
In every case it is necessary to check the atom names carefully, and convert any other format into Brookhaven PDB format. We shall now describe each individual step for doing this, and then summarize the best overall procedure.
4.3.1. Preparing great.list
Programme GREAT will process a set of coordinate files automatically, if you prepare a list of the file names each on a new line one after the other. The easiest procedure is to make working copies of all the coordinate files in an empty directory (but make sure that you still have the original files safely somewhere else, in a secure directory with secure file protections).
We suggest you start by renaming your working copies so that each file has the same extension .pdb. For instance you might start with four compounds in four files called:
ach.cssr
atropine.xtl
pilocarpine.mol2
lachesine.pdb
and in this case the first three files should be renamed to:
ach.pdb
atropine.pdb
pilocarpine.pdb
Note that it is only the file-names which have been changed, but the actual formats of the data in these files would still be: cssr, xtl and mol2 as they were originally. Remember that each file-name must be less than 50 characters long.
Next prepare a file (which must be called "great.list") listing the directory contents as a single column of names. The list can be as long as you like; there is no limitation on the number of targets in a great.list file. You can type up this file yourself, or prepare it like this:
ls -1 *.pdb > great.list
If you type it yourself, it is not necessary to add the extension ".pdb" to each file name. However you will get this extension if the computer prepared the file by listing the contents of a directory, and in that case your GREAT.LIST file will be like this:
ach.pdb
atropine.pdb
pilocarpine.pdb
lachesine.pdb
4.3.2. Processing great.list
You can now run Programme GREAT. You should be working in the same directory with the PDB coordinate files and with GREAT.LIST. Programme GREAT will find GREAT.LIST and will automatically process the listed files. GREAT may ask if you want it to "GUESS" the names of the atoms in your molecules; see above about Guesses. It may also ask if you need a Password, and may want to know which copy of GRUB.DAT it should use.
GREAT will normally make a new PDB file from each of the starting files in your working directory, and it will give every atom in the new PDB file the correct atom name which is needed for input to Programmes GRIN and GRID. You may be asked questions in order to resolve some equivocal atom names. If GREAT is still unable to resolve a name or if it has any other problems, it will finish by displaying a list of the PDB files in which the problems occurred.
GREAT may ask you for permission to take short cuts, because the detailed processing of many files can take a long time. For instance GREAT would normally check an atom name like CA because this symbol could mean 'Calcium' or 'Carbon Alpha', but it may ask if you are certain that none of your molecules have equivocal symbols such as BA (Boron or Barium), CL (Carbon or Chlorine), FE (Fluorine or Iron) or NE (Nitrogen Epsilon or Neon). Then, if you tell GREAT that you are sure about this, it will skip such questions when dealing with the rest of your molecules.
After GREAT has finished, your files will still have the extension name .pdb and they will now really be in PDB format instead of their various original formats. For instance you might have started with a file called: ach.cssr which was in .cssr format. You renamed it to: ach.pdb but it was actually the same file in the same .cssr format because you only changed the name and not the contents. Now, after processing by Programme GREAT, the file called ach.pdb is really a PDB file in PDB format. Note that GREAT normally makes a new PDB file from each of your starting files, because it normally has to alter atom names and perhaps change the overall format. However, if no alterations are required, then behaviour depends on the operating system.
It is good policy, if you are running GREAT on a workstation, to keep another window open. Then, if GREAT asks questions about your molecules, you can use that other window in order to study the whole PDB file or view the molecular structure. Programme GREAT might ask for example, if a particular nitrogen atom was uncharged or cationic, and you could use the other window in order to find which atom it was dealing with.
4.3.2.1. Processing conformational isomers
Programme GREAT always makes a check in case the files in GREAT.LIST are different conformers of the same molecule (i.e. the same covalent structure and the same ionization pattern, but different torsion angles). It makes this check by comparing how you deal with the first two files in GREAT.LIST. If the molecules in those first two files are the same size, and if you give the same HETATM names to equivalent atoms in each molecule, then GREAT will assume that you may be studying several different conformational isomers of the original molecule.
In order to check this possibility GREAT asks whether you "want all the conformers to be treated as a batch and given the same HETATM names". If you give the answer "YES", then GREAT will start working in "BATCH MODE". It will process each of the following files in the batch like it processes the first, giving each conformer the same HETATM names. This can save a lot of time and vexation, because it can be very frustrating if one has to answer the same questions in the same way for perhaps hundreds of conformers of the same molecule!
The following points should be noted:
If you tell GREAT that it is dealing with a batch of conformers, it will continue to process all the following files as members of that batch until it comes to a blank line in GREAT.LIST. The blank line acts as a flag marking the end of the batch, and GREAT will then check in case conformers of another molecule are being studied as a second batch. If the two files following the blank line are identical with each other, then GREAT will assume that another batch of conformers has started, and will carry on processing conformers of that second molecule until it finds another blank line. Further batches of conformers can follow one after the other, each separated by a blank line.
It is critically important to give the same answers for the first two molecules of each batch, so that GREAT is forced to consider the possibility of conformers. If you tell GREAT that a certain nitrogen is protonated in the first conformer and deprotonated in the second, it will assume that it has to deal with completely different molecules because addition or removal of a proton alters the covalent structure. You should therefore note carefully how you answer questions about the first molecule in each batch.
In some cases you might want to study conformers of a certain molecule with its nitrogen protonated, and other conformers after deprotonation. This is most easily done by dividing the conformers into two batches (protonated and deprotonated) separated from each other by a blank line in GREAT.LIST.
If there is a blank line before and after a filename in GREAT.LIST, then GREAT will not be able to check for conformers because it needs a pair of successive files which can be compared with each other. GREAT will switch out of "BATCH MODE" and may continue to process any remaining files one by one. Therefore, if you want to study some molecules as a batch of conformers, but only have one conformer of certain other molecules, it is best to have the one-conformer molecules at the end of the list.
As mentioned above, it is critically important to give the same answers for the first two molecules of each batch. However, care is required with this procedure, because different HETATM names might be assigned automatically by Programme GREAT itself, without user intervention. Be aware that this could happen unexpectedly if, for example, some torsion angles were twisted in the second conformation so that a conjugated system was realigned, and the protonation of an ionizable group in that conjugated system was thereby altered.
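The blank-line convention for batches of conformers can be illustrated with a short shell function that writes a list in the layout described above. This is a sketch only. It assumes a purely hypothetical naming scheme, molecule_number.pdb (for example ach_001.pdb), which is not something GREAT requires, and the function name make_batched_list is invented for the illustration.

```shell
#!/bin/sh
# Sketch: print file names one per line, with a blank line between
# batches of conformers.
# ASSUMPTION (not from the manual): conformers of one molecule share the
# part of the name before the last underscore, e.g. ach_001.pdb, ach_002.pdb.
make_batched_list() {
    prev=
    first=1
    for f in "$@"; do            # names must arrive sorted, conformers adjacent
        mol=${f%_*}              # molecule part of the name
        if [ "$first" -eq 0 ] && [ "$mol" != "$prev" ]; then
            printf '\n'          # blank line flags the end of the previous batch
        fi
        printf '%s\n' "$f"
        prev=$mol
        first=0
    done
}

# Typical use in the working directory:
#   make_batched_list *_*.pdb > great.list
```

Molecules for which only one conformer is available are not handled by this sketch. As noted above, they are best added by hand at the end of the list.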
4.3.2.2. Programme g2f
When you want to process a lot of files, it is often convenient to use a list of base filenames without the extensions cssr, xtl, mol2 or whatever. Each Programme will then add the appropriate default extension for its own input files. The extension ".pdb" will be added by GREAT, and ".pdb" by GRIN, and ".kout" by GRID. The list of base names is called a FILE.LIST, so in our example the two files would be like this:
GREAT.LIST         FILE.LIST
ach.pdb            ach
atropine.pdb       atropine
pilocarpine.pdb    pilocarpine
lachesine.pdb      lachesine
You can prepare the file of base names by editing GREAT.LIST, or you can use a utility Programme called G2F to do this job. It is easier to use G2F if the list is long.
Programme G2F is run directly from the keyboard, and it will prompt you for the names of its input and output files. The Programme name G2F means "GREAT.LIST to FILE.LIST" and we suggest you use GREAT.LIST and FILE.LIST as the names of the files.
The new FILE.LIST without extensions can be used as input for Programme GRIN and for Programme GRID. It can also be used as input for GREAT, but in that case it must have the special name GREAT.LIST as described above. There is no limitation to the number of Target molecules in a GREAT.LIST file, nor in the number in a FILE.LIST for Programme GRIN. However, the maximum number of Targets in a FILE.LIST for Programme GRID itself is set to 1500 from version 19 of the Programmes.
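If Programme G2F is not to hand, the same kind of list can be produced with standard shell tools. The following is an illustrative stand-in and not Programme G2F itself: the function names strip_extensions and count_targets are invented here, and passing blank lines through unchanged is an assumption about what is wanted in the output list.

```shell
#!/bin/sh
# Illustrative stand-in for the job G2F does; this is NOT Programme G2F.

# Read a GREAT.LIST-style list on standard input and write the same
# names without their final extension to standard output.
strip_extensions() {
    while IFS= read -r name || [ -n "$name" ]; do
        base=${name##*/}             # file name without any directory part
        dir=${name%"$base"}          # directory prefix, possibly empty
        case $base in
            ?*.*) base=${base%.*} ;; # drop the last extension only
        esac
        printf '%s%s\n' "$dir" "$base"   # blank lines pass through unchanged
    done
}

# Print the number of non-blank lines in a list, and warn if it is more
# than the 1500 Targets quoted in the text for Programme GRID.
GRID_MAX_TARGETS=1500
count_targets() {
    n=$(grep -c . "$1" || true)
    if [ "$n" -gt "$GRID_MAX_TARGETS" ]; then
        echo "warning: $n targets is more than the GRID limit of $GRID_MAX_TARGETS" >&2
    fi
    printf '%s\n' "$n"
}

# Typical use:
#   strip_extensions < great.list > file.list
#   count_targets file.list
```

Only the file name is shortened, so a dot inside a directory name is left alone.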
4.3.3. Overall method for a set of targets
Programme GREAT will always process a file called GREAT.LIST whenever it finds one in the current directory. You must therefore rename the file or move it to another directory, as soon as you have finished using it as input to GREAT. Rename it at once, because you do not want Programme GREAT to keep processing the same GREAT.LIST file in the same way, every time you try to run GREAT!
We therefore suggest the following overall procedure:
Open a new directory with working copies of the files of coordinates of the Target molecules that you want to study.
Make sure that these files of coordinates have names with the extension .pdb.
Make a list of those files in your working directory. Either type the list, or prepare it as a "Directory Listing".
Make sure the list is in a file called "great.list" in your working directory. Use this list as input for Programme GREAT.
Remove any extensions from the filenames in your list. Either edit the list or use Programme G2F.
Rename the list to "file.list" and then use it as input for Programmes GRIN and GRID.
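The file handling around this procedure can be sketched as two shell functions, one for the steps before GREAT is run and one for the steps afterwards. This is a minimal sketch under stated assumptions: the function names prepare_set and finish_set are invented, GREAT itself is still run interactively between the two calls, and two originals with the same base name (say ach.cssr and ach.mol2) would overwrite each other in the working directory.

```shell
#!/bin/sh
# Sketch of the file handling only; GREAT is run by hand in between.

# Steps before GREAT: copy the originals into a working directory,
# giving each copy the extension .pdb, then write great.list there.
prepare_set() {
    workdir=$1
    shift
    mkdir -p "$workdir" || return 1
    for src in "$@"; do
        name=${src##*/}              # drop any directory part
        base=${name%.*}              # drop the original extension
        # cp leaves the original file untouched in its own directory
        cp -- "$src" "$workdir/$base.pdb" || return 1
    done
    ( cd "$workdir" && ls -1 *.pdb > great.list )
}

# Steps after GREAT: strip the .pdb extensions into file.list and
# remove great.list so that GREAT does not process it again.
finish_set() {
    workdir=$1
    ( cd "$workdir" || exit 1
      [ -f great.list ] || exit 1
      sed 's/\.pdb$//' great.list > file.list && rm great.list )
}

# Typical use (the paths are placeholders):
#   prepare_set work /secure/originals/*
#   cd work; great      # answer the questions from GREAT
#   cd ..; finish_set work
```

Removing great.list in finish_set is one way of following the advice to rename or move the file at once. Moving it to another directory would serve equally well.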
4.3.4. Using the renamed file.list
The FILE.LIST (without extension) is used as input to Programmes GRIN and GRID. Full details are given below (see contents under STUDYING A SET OF TARGETS WITH GRID). Briefly, it is used like this:
The renamed FILE.LIST can be the main input for Programme GRIN, which will start by adding the Energy Variables (Van der Waals radius, number of hydrogen bonds, etc) to each atom in each of the Targets which is named on the LIST. GRIN will then produce a set of .KOUT files ready for use as input to the next Programme GRID. There will be one .KOUT file for each input file of coordinates.
Programme GRIN will also prepare one lineprinter output file GRINLOUT.DAT. This will contain error messages and warnings about the individual input files, and will finish with a short list of any files that were not correctly processed by GRIN.
The same FILE.LIST can be used again as the main input file for Programme GRID, which will process all the .KOUT files with all the Probes, giving one lineprinter output file GRIDLONT.DAT and one file of energy values (GRIDKONT.DAT).
There is no limitation to the number of Target molecules in a FILE.LIST for Programme GRIN. However, the maximum number of Targets in a FILE.LIST for Programme GRID itself is set to 1500 from version 19 of the Programmes.
You may want to add a "message" to each of the names on the FILE.LIST, before using it as input for GRID. This message might be, for example, the biological activity of the compound.
The GRIDKONT file may finally be post-processed by Programme GCNT to give output for Sybyl CoMFA, or by Programme GSIM to give output for Golpe or SIMCA.
4.3.5. Unprocessed files
If your GREAT.LIST contains many files, it is possible that some of them may not be processed completely by Programme GREAT. A list of partly processed or unprocessed files will be displayed when GREAT has finished, so that you can give them any special treatment which they may still require. We recommend that you rerun each of these files individually through GREAT, starting from the original copy of the original input file which should still be in another secure directory. Recopy that original file into your working directory, and do not allow GREAT to make any guesses during the rerun!
4.3.6. Directory names in great.list
The file GREAT.LIST itself must be in the current directory where you are working, so that Programme GREAT can find it. We recommend that you also have the working copies of your original coordinate files in that same working directory. Keep the original coordinate files safely in another secure directory, and use GREAT to process the working copies. In fact it is not an absolute requirement that you keep the working copies in your working directory. GREAT.LIST can point to working copies in some other directory if that is preferable. In that case, however, the full directory name of each file must be shown in GREAT.LIST. Your GREAT.LIST file would then look something like this:
/usr/people/me/other/ach.pdb
/usr/people/me/other/atropine.pdb
/usr/people/me/other/pilocarpine.pdb
/usr/people/me/other/lachesine.pdb
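A list with full directory names can be generated rather than typed. The sketch below uses an invented function name, list_with_paths. It also warns about any complete file specification of 50 characters or more, since the length limit described earlier applies to the combined directory and file name.

```shell
#!/bin/sh
# Sketch: print the .pdb files of a directory with their full
# directory names, one per line, warning about over-long entries.
list_with_paths() {
    dir=$1
    for f in "$dir"/*.pdb; do
        [ -f "$f" ] || continue      # skip if nothing matched the pattern
        if [ "${#f}" -ge 50 ]; then
            echo "warning: ${#f} characters, too long for GREAT: $f" >&2
        fi
        printf '%s\n' "$f"
    done
}

# Typical use, run from the working directory:
#   list_with_paths /usr/people/me/other > great.list
```

Warnings go to the standard error stream, so they appear on the screen and do not end up inside great.list.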