Programme GLUE

Chapter 41. Programme GLUE

41.1. Introduction to programme GLUE

Programme GLUE fits ligand molecules into a set of GRID maps of a target structure. GLUE is a docking programme that uses all the options and capabilities of the GRID force field.

The input target structure (protein) must be prepared as the Target in the usual way, but the GREATER interface makes this work much simpler. Moreover, the subsequent generation of a set of GRID maps on that Target has been automated.

GLUE can automatically carry out the various steps needed to obtain one or more docked positions of the ligand in the target.

Programme GLUE now requires only two input files: the target protein in kout format, and the ligand molecule(s) to dock into the protein. Optionally the User may specify the location of the docking by supplying a simple grid cage in ASCII format. GLUE then generates the GRID maps for the probes that best simulate the ligand structure. The GRID maps are then processed further and used as input for the docking software. Finally GLUE writes the docking results to a single file, and to some individual files for graphical analysis.

This sort of information can be of particular value for ligand design, because a small structural alteration of the ligand(s) might be made in order to favour one of the binding modes at the expense of the others.

GLUE runs on UNIX and LINUX systems.

41.2. Overall method used by programme GLUE

41.2.1. The initial search

Programme GLUE is used to find possible interaction sites for ligand molecule(s) with another macro-molecule (the "Target" protein). GLUE requires, as input, the structures of both these molecules.

Programme GLUE begins by sorting the polar and hydrophobic heavy atoms of the ligand(s). With a pyranose glucose ligand molecule it would find:

  • Five polar hydroxyl oxygen groups;

  • One (rather less polar) ether oxygen atom; and

  • Six non-polar carbons with their bonded hydrogens.
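The sorting step above can be illustrated with a minimal Python sketch. It is not GLUE's own atom typing, which relies on the GRID force field; the function name `classify_heavy_atoms`, the class labels and the atom names are all hypothetical, and the rules are deliberately crude (an oxygen bonded to hydrogen is a hydroxyl oxygen, an oxygen bonded to two carbons is an ether oxygen, every carbon is non-polar).

```python
from collections import Counter

def classify_heavy_atoms(elements, bonds):
    """Label each heavy atom by a crude polarity class (illustrative rules only).

    elements: dict mapping atom name -> element symbol
    bonds:    iterable of (atom name, atom name) pairs
    """
    neighbours = {name: [] for name in elements}
    for a, b in bonds:
        neighbours[a].append(elements[b])
        neighbours[b].append(elements[a])

    classes = {}
    for name, element in elements.items():
        if element == "H":
            continue  # only heavy atoms are sorted
        bonded = neighbours[name]
        if element == "O" and "H" in bonded:
            classes[name] = "hydroxyl oxygen"
        elif element == "O" and bonded.count("C") == 2:
            classes[name] = "ether oxygen"
        elif element == "C":
            classes[name] = "non-polar carbon"
        else:
            classes[name] = "other"
    return classes

# Glucopyranose heavy-atom skeleton plus the five hydroxyl hydrogens.
# Carbon-bound hydrogens are left out because they do not affect the labels.
elements = {f"C{i}": "C" for i in range(1, 7)}
elements.update({f"O{i}": "O" for i in range(1, 7)})
elements.update({f"HO{i}": "H" for i in (1, 2, 3, 4, 6)})

bonds = [("C1", "C2"), ("C2", "C3"), ("C3", "C4"), ("C4", "C5"), ("C5", "C6"),
         ("C5", "O5"), ("O5", "C1"),                     # ring ether oxygen
         ("C1", "O1"), ("C2", "O2"), ("C3", "O3"), ("C4", "O4"), ("C6", "O6")]
bonds += [(f"O{i}", f"HO{i}") for i in (1, 2, 3, 4, 6)]  # hydroxyl hydrogens

counts = Counter(classify_heavy_atoms(elements, bonds).values())
print(counts)
```

For this glucose skeleton the counts reproduce the list above: five hydroxyl oxygens, one ether oxygen and six carbons.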

Starting randomly at one position of the grid cage, it would place one of the polar atoms (P1) of the ligand at a favourable place on the corresponding Grid map. For instance, it might place a hydroxyl oxygen of glucose at a favourable energy minimum in the Grid map for aliphatic hydroxyl. This position would define a location (T1 on the Target) at which the hydroxyl group (P1 of the ligand) might be able to make favourable non-bonded interactions.

While keeping the first polar atom at T1, the Programme would then search for a second favourable place (T2) on the Target, at which one of the other polar atoms (P2) of the ligand could be placed. There might be several such positions on the Grid map for hydroxyl. Each T2 position would have to be at the appropriate distance from T1, and the four remaining hydroxyl groups of the glucose would be available to try at each of these T2 places. Moreover, there would probably be some suitably-positioned energy minima on the Grid map for ether oxygen, and so the Programme would also try to fit the ether oxygen atom of glucose to favourable places on the ether map.

The third fitting stage begins as soon as two polar atoms (P1 and P2) of the ligand are positioned at two favourable places on the Target. These places (T1 and T2) define an axis round which the ligand can be rotated, until a third polar atom (P3) of the ligand is close to a third appropriate energy minimum (T3) of the Target. Then, as soon as this third suitable position has been found, the six sets of coordinates T1, T2, T3, P1, P2 and P3 are saved as the preliminary definition of one possible binding mode of the glucose ligand to the lysozyme Target. This procedure is called "the pose".
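The three fitting stages can be summarised in a short Python sketch. It is a simplification under stated assumptions, not GLUE's implementation: the search is exhaustive rather than started at random, the rotation about the T1-T2 axis is replaced by an equivalent distance test, and the names `find_triplet_poses`, `map_minima` and the tolerance `tol` are hypothetical.

```python
from itertools import permutations
from math import dist

def find_triplet_poses(ligand_atoms, map_minima, tol=0.5):
    """Enumerate (P1,T1), (P2,T2), (P3,T3) assignments with matching distances.

    ligand_atoms: list of (atom name, probe type, (x, y, z))
    map_minima:   dict mapping probe type -> list of (x, y, z) energy minima
    tol:          allowed mismatch in Angstrom (an assumed value)
    """
    poses = []
    for p1, p2, p3 in permutations(ligand_atoms, 3):
        d12 = dist(p1[2], p2[2])
        d13 = dist(p1[2], p3[2])
        d23 = dist(p2[2], p3[2])
        for t1 in map_minima.get(p1[1], []):          # stage 1: place P1 at T1
            for t2 in map_minima.get(p2[1], []):      # stage 2: T2 at the right distance
                if abs(dist(t1, t2) - d12) > tol:
                    continue
                for t3 in map_minima.get(p3[1], []):  # stage 3: T3 fixes the orientation
                    if (abs(dist(t1, t3) - d13) <= tol
                            and abs(dist(t2, t3) - d23) <= tol):
                        poses.append(((p1[0], t1), (p2[0], t2), (p3[0], t3)))
    return poses

# Toy example: three hydroxyl oxygens and three minima on a hydroxyl map.
ligand = [("O1", "OH", (0.0, 0.0, 0.0)),
          ("O2", "OH", (3.0, 0.0, 0.0)),
          ("O3", "OH", (0.0, 4.0, 0.0))]
minima = {"OH": [(10.0, 10.0, 10.0), (13.0, 10.0, 10.0), (10.0, 14.0, 10.0)]}
poses = find_triplet_poses(ligand, minima)
print(len(poses))
```

Because every ordering of the three atoms is tried, the same geometric fit is reported more than once in this toy example.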

The same procedure is used to fit hydrophobic atoms in hydrophobic locations of the Target. GLUE is therefore not limited in the number and type of bonding atoms, as other docking software can be.

If a part of the ligand is particularly hydrophobic, GLUE automatically starts by placing one of the hydrophobic groups of the ligand on the hydrophobic map of the Target.

An interesting example is provided by camphor as a ligand of cytochrome P450cam. The camphor molecule has one polar atom (a carbonyl oxygen) and three methyl groups. GLUE will use the hydrophobic Probe, and the information from the hydrogen, water, and carbonyl oxygen maps. Each of the methyl carbons is assigned as hydrophobic and GLUE then runs normally using the carbonyl oxygen and the three methyls to guide the ligand to its site.

41.2.2. The elimination stage

The same search process starts again as soon as the previous binding mode has been saved, and this cycle continues until all possible triplet interactions have been stored. The overall number of saved positions is usually rather large, but many can be eliminated very quickly from further consideration and so the process is not inefficient. For example, exactly the same binding mode may be generated several times, either starting with P1 at T1 followed by P2 at T2, or starting with P2 at T2 and then finding P1 at T1. Such replicates can be eliminated almost instantaneously.
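Removing replicates of this kind is cheap if each binding mode is stored in an order-independent form. The sketch below assumes that a binding mode is a tuple of (ligand atom, Target position) pairs; the function name `unique_binding_modes` is hypothetical and the approach is only one plausible way to do it.

```python
def unique_binding_modes(poses):
    """Drop binding modes that pair the same atoms with the same places.

    Each pose is a tuple of (atom name, (x, y, z)) pairs. A frozenset of the
    pairs ignores the order in which the pairs were found.
    """
    seen = set()
    unique = []
    for pose in poses:
        key = frozenset(pose)
        if key not in seen:
            seen.add(key)
            unique.append(pose)
    return unique

t1, t2, t3 = (1.0, 0.0, 0.0), (4.0, 0.0, 0.0), (1.0, 4.0, 0.0)
found = [
    (("P1", t1), ("P2", t2), ("P3", t3)),  # P1 at T1 first, then P2 at T2
    (("P2", t2), ("P1", t1), ("P3", t3)),  # P2 at T2 first, then P1 at T1
    (("P1", t2), ("P2", t1), ("P3", t3)),  # a genuinely different assignment
]
print(len(unique_binding_modes(found)))
```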

The Programme will have now identified all the ways in which three atoms of a ligand could make polar interactions with the Target. However no consideration has yet been given to steric clashes and other effects, and the Grid maps for hydrogen and water are next used in order to assess these aspects. Very many putative binding modes can be eliminated at this stage.

The remaining binding modes are finally reassessed and optimised. All polar, charge and van der Waals interactions of the ligand are taken into account, and this step normally yields a very short list of possible binding modes for the ligand to the Target.

The final results from GLUE are sent to file for graphical display or further analysis. The output file contains one or more sets of ligand coordinates in (multi)mol2 format. It is a list of the possible positions for the ligand, correctly aligned in the coordinate frame of the Target. This output file can be visualized together with the Target's PDB file, and used to display simultaneously all the predicted ligand binding positions. The file can also be split to permit inspection of one ligand at a time, or the coordinates for each binding orientation can be used as the starting points for other computations.
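Splitting a multi-mol2 file needs only a few lines of Python, because each molecule in the Tripos mol2 format starts with a `@<TRIPOS>MOLECULE` record. The sketch below is a generic illustration, not a GLUE utility; the function name `split_multimol2` and the sample pose names are hypothetical.

```python
def split_multimol2(text):
    """Return one string per molecule found in a multi-mol2 text.

    Lines that appear before the first @<TRIPOS>MOLECULE record
    (for example comments) are ignored.
    """
    marker = "@<TRIPOS>MOLECULE"
    blocks = []
    for line in text.splitlines(keepends=True):
        if line.startswith(marker):
            blocks.append([line])      # start a new molecule
        elif blocks:
            blocks[-1].append(line)    # continue the current molecule
    return ["".join(block) for block in blocks]

sample = (
    "# comment line\n"
    "@<TRIPOS>MOLECULE\npose_1\n@<TRIPOS>ATOM\n"
    "@<TRIPOS>MOLECULE\npose_2\n@<TRIPOS>ATOM\n"
)
molecules = split_multimol2(sample)
print(len(molecules))
```

Each returned string can then be written to its own file for inspection of one binding position at a time.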

41.3. User guide to programme GLUE

41.3.1. Structure of programme GLUE

GLUE requires two different kinds of input file, and it produces one file of output:

  1. The input files

    • STRUCTURE FILES: these input files describe the structure and properties of the Target and the ligand molecule. They are regular GRINKOUT files produced by Programme GREATER (or GRIN) for the Target and ligand respectively. They will be called TARGET.KOUT (the protein) and LIGAND.KOUT (the ligand).

    • GRID MAPS: These input files for GLUE are produced automatically.

    NOTE on Directive NPLA: Directive NPLA is automatically set to 1 for the initial GRID runs. No other value is acceptable.

    NOTE ON MINI FILES: GLUE produces some files, called .MINI, which are the direct input to Programmes GLUE or FLAP.

  2. The output files

    The name of the output file can be selected by the User. The OUTPUT file is a Brookhaven PDB file with the coordinates of the ligand in its predicted mode of binding to the Target (the protein). Several alternative binding modes may be predicted, and they will then be printed one after the other to the OUTPUT.

41.3.2. The Target for programme GLUE

The Target for GLUE is normally a protein, glyco-protein, nucleic acid or other biological macromolecule. It is processed through GREATER (or GRIN) in the traditional way to give a file TARGET.KOUT. Programme GREATER would be used to convert the input into PDB format, if the original Target structure was not a Brookhaven PDB file. Programme GREAT would be used for special applications not yet present in GREATER, such as the use of special directives, or the conversion of complex proteins with cofactors and/or hydrated amino acids whose 'special' water molecules must be taken into consideration.

TARGET.KOUT is used as input to GRID in the traditional way. All the normal directives for GRID may be used, but the grid points in GLUE are set to 1 Angstrom apart (Directive NPLA = 1).

41.3.3. The GLUE Ligand

The ligand can also be processed through GREATER (or GRIN) in the traditional way to give a file called LIGAND.KOUT. However, this is not necessary and ligand(s) in standard mol2 format can be directly processed by GLUE.

41.4. Running GLUE starting from FLAP sites

The initial search (the pose) can be more efficient when suitable initial positions P1, ..., P4 are selected by programme FLAP.

41.4.1. Overview of FLAP

FLAP (Fingerprint for Ligands And Proteins) is a new computational procedure to explore the 3D-pharmacophore space of Ligands and Proteins. All the potential 3 and 4 point 3D pharmacophores expressed by ligands and/or receptors are calculated taking conformational flexibility and molecular or receptor shape into account. Starting from the GRID force field parameterisation, FLAP uses a common frame of reference to allow ligand-ligand, ligand-protein or protein-protein comparison.

3D pharmacophores consist of triplets or quadruplets of distances between chemical features. With 4-point pharmacophores chirality is also evaluated, which significantly increases the amount of information on the fundamental requirements for ligand-receptor recognition. Molecular and receptor shape are precisely evaluated "on the fly" and compared only when required. For a (macro)molecule the features are automatically identified. Then all the accessible geometries for all the combinations of four features are calculated.
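The idea of a 4-point pharmacophore can be made concrete with a small Python sketch. It is an illustration only and makes no claim about FLAP's internal data structures: the function names, the feature labels and the use of a signed tetrahedron volume as the chirality indicator are assumptions. Four points give six inter-feature distances, and the sign of the scalar triple product distinguishes a quadruplet from its mirror image.

```python
from itertools import combinations
from math import dist

def signed_volume(a, b, c, d):
    """Scalar triple product (b-a) . ((c-a) x (d-a)); its sign flips on reflection."""
    u = [b[i] - a[i] for i in range(3)]
    v = [c[i] - a[i] for i in range(3)]
    w = [d[i] - a[i] for i in range(3)]
    cross = (v[1] * w[2] - v[2] * w[1],
             v[2] * w[0] - v[0] * w[2],
             v[0] * w[1] - v[1] * w[0])
    return sum(u[i] * cross[i] for i in range(3))

def four_point_pharmacophores(features):
    """features: list of (label, (x, y, z)). Returns one entry per quadruplet."""
    result = []
    for quad in combinations(features, 4):
        labels = tuple(label for label, _ in quad)
        coords = [xyz for _, xyz in quad]
        distances = tuple(round(dist(p, q), 2) for p, q in combinations(coords, 2))
        volume = signed_volume(*coords)
        handedness = (volume > 0) - (volume < 0)   # +1, 0 (planar) or -1
        result.append((labels, distances, handedness))
    return result

features = [("donor", (0.0, 0.0, 0.0)), ("acceptor", (1.0, 0.0, 0.0)),
            ("hydrophobic", (0.0, 1.0, 0.0)), ("acceptor", (0.0, 0.0, 1.0))]
mirror = features[:3] + [("acceptor", (0.0, 0.0, -1.0))]
print(four_point_pharmacophores(features)[0])
print(four_point_pharmacophores(mirror)[0])
```

The two quadruplets share the same six distances and differ only in handedness, which is the extra information that a 3-point pharmacophore cannot carry.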

Please refer to the GLUE tutorial 02 for further explanations.

41.5. Using ligands as probe molecules in GLUE

Water can play several different roles in ligand binding. It competes for polar groups in the cleft and on the ligand. It can force the ligand to bind in an unexpected orientation. It is responsible for Disfavoured Sites (see Tutorial06). It can influence selectivity. It can bridge between ligand and Target. It causes the hydrophobic effect, and Grid can be used to investigate each of these roles when a ligand:target complex is formed.

In fact the affinity of any ligand may be sub-optimal if water satisfies the hydrogen-bonding capacity of the receptor better than the ligand itself, and the establishment of appropriate hydrogen bonds between ligand and receptor is therefore essential. The water sets a standard to which the polar atoms of the ligand must aspire, and mismatched hydrogen bonds may depress the affinity of a ligand disastrously. The exact arrangement of hydrogen bonds can be a critically important determinant of selectivity, but the establishment of good hydrogen bonds does not necessarily lead to high affinity. Strong binding is usually due to good hydrophobic interactions.

There are several advantages in applying Grid to study the role of water in drug binding. The Grid Force Field has been widely used for more than twenty years; it can handle the hydrophobic effect; it is not restricted to any particular class of drug molecule or biological macro-molecule, and Grid runs quickly. Disadvantages are that one needs to know the structure of the macromolecule complex when complete ligand molecules are being studied, and that Grid (like other force-fields) can still be improved in many different ways.

Grid can be used to estimate some enthalpic and entropic components of the interaction between a ligand molecule and its Target. This is a traditional 19th Century equilibrium thermodynamic approach, and could in principle be used to determine the affinity of the ligand for its receptor. However, Grid is not recommended for that job because many other methods are already available (See next note), and because there are serious and well-known problems when the entropic and enthalpic components are individually estimated. These problems arise because both components are big, and the affinity of the ligand depends on the small poorly-defined difference between the two.

Equation 1:

RTLnk = 2.303RTLogk =  dF = dH -TdS    ........(1)

where:

R    is the Gas Constant (1.987 cal/mole.K)
T    is the absolute temperature (assumed 308 K)
Lnk  is the natural logarithm of the equilibrium constant
dF   is the Free Energy change
dH   is the enthalpic component
dS   is the entropy change

and 2.303 converts from Natural logarithms to base 10.

so:

Logk = dF/(2.303*1.987*308) = dF/1409

or converting from small calories to Kcalories:

Logk = dF/1.409

This equation shows that an error of 1.409 Kcal/mole in the estimated Free Energy dF would lead to a ten-fold error in the predicted affinity constant k. In other words, if dH and TdS were each about 30 Kcal/mole (but of opposite sign), then a 10 percent mistake in estimating one of them would lead to an error of more than a hundred-fold in the value of k.
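The arithmetic behind these statements can be checked with a few lines of Python. The constants are those given in the text; the function names are hypothetical.

```python
R_CAL = 1.987      # Gas Constant in cal/mole.K, as given in the text
T_KELVIN = 308.0   # absolute temperature assumed in the text

def kcal_per_log_unit():
    """Free energy (Kcal/mole) that corresponds to one unit of Logk."""
    return 2.303 * R_CAL * T_KELVIN / 1000.0

def fold_error_in_k(energy_error_kcal):
    """Multiplicative error in k caused by an error in dF (Kcal/mole)."""
    return 10 ** (abs(energy_error_kcal) / kcal_per_log_unit())

print(round(kcal_per_log_unit(), 3))   # conversion factor, about 1.409
print(fold_error_in_k(1.409))          # close to a ten-fold error
print(fold_error_in_k(0.10 * 30.0))    # 10 percent of 30 Kcal/mole
```

A 10 percent mistake on a 30 Kcal/mole term is an error of 3 Kcal/mole in dF, which is a little over two units of Logk and therefore more than a hundred-fold in k.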

One only needs to consider a single component of TdS in order to appreciate the serious errors which can occur when this approach is used to estimate the free energy of binding. An incoming drug molecule presumably tumbles in free solution until it reaches its binding cleft, and would have three rotational and three translational degrees of freedom of its own. These would be lost when it binds securely to its receptor, and one may calculate that the six lost degrees of freedom should theoretically depress its affinity by 14 Kcal/mole. However, experimental attempts to measure the size of this effect typically lead to much smaller estimates of between 0.5 and 1.5 Kcal/mole.

The low experimental values might be expected if the drug were not tumbling freely beforehand, or if it was vigorously jumping around in the binding cleft after "binding". Be this as it may, one is left with a discrepancy of more than 12 Kcal/mole between theory and experiment, and this corresponds to more than eight orders of magnitude of uncertainty in the binding constant!

It is clear that any method which depends on the difference between dH and TdS is inherently unstable, and this is why Grid will not provide reliable values for the free energy of drug-receptor interactions. Drug-design, on the other hand, involves changing the structure of a candidate drug so that either dH or TdS (or both) are altered in a way which will lead to improved affinity. One does not need to know the absolute value of k in order to make it better. If the hydrophobic effect or the favourable enthalpic interactions of a ligand with its Target can be increased, or the penalty for pre-bound waters lowered, the new ligand molecule should have greater binding affinity and this is often what one wants to know.

REFERENCES to the estimation of ligand affinities include:

1)   Cramer et al (1988). J.Am.Chem.Soc. 110. 5959-5967
2)   Krystek et al (1993). J.Mol.Biol. 234. 661-679
3)   Miyamoto and Kollman (1993). PNAS. 90. 8402-8406
4)   Aqvist (1994). Protein Eng. 7. 385-391
5)   Bohm (1994). J.Comput-Aided Mol. Des. 8. 243-256
6)   Holloway et al (1995). J.Med.Chem. 38. 305-317
7)   Head et al (1996). J.Am.Chem.Soc. 118. 3959-3969
8)   Kollman (1996). Acc.Chem.Res. 29. 461-469
9)   Helms and Wade (1998). J.Am.Chem.Soc. 120. 2710-2713
10)  Murray et al (1998). J.Comp-Aided Mol.Des. 12. 503-519
11)  Pitera and Kollman (1998). J.Am.Chem.Soc. 120. 7557-7567

41.6. General method to compute the binding energy for ligands as probe molecules in GLUE

If one begins with the structure of the complex, and prepares the input KOUT file with GREATER, one can obtain two separate PDB files: one for the ligand-free protein and another for the ligand by itself. Water molecules are normally omitted from the protein, but if required they can be retained (see GREATER manual for explanation). Counter-ions may be included in the protein, if appropriate, to neutralise the net charge. The ligand file may only contain HETATM records, and we will call it the "Probe Molecule".
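For readers who want to see what such a separation involves, the following Python sketch splits the records of a PDB complex by record type. It is not how GREATER works internally; the function name `split_complex` and the water residue names `HOH` and `WAT` are assumptions, and the sketch puts every non-water HETATM record in the ligand, so counter-ions that should stay with the protein would need separate handling.

```python
def split_complex(pdb_text, water_names=("HOH", "WAT")):
    """Separate a PDB complex into protein ATOM lines and ligand HETATM lines.

    Water HETATM records are dropped, mirroring the default described above.
    """
    protein, ligand = [], []
    for line in pdb_text.splitlines():
        record = line[:6].strip()          # PDB columns 1-6: record name
        if record == "ATOM":
            protein.append(line)
        elif record == "HETATM":
            residue = line[17:20].strip()  # PDB columns 18-20: residue name
            if residue not in water_names:
                ligand.append(line)
    return protein, ligand

def pdb_line(record, serial, atom, residue, seq):
    """Build the leading columns of a PDB coordinate record (coordinates omitted)."""
    return f"{record:<6}{serial:>5} {atom:<4} {residue:>3} A{seq:>4}"

complex_text = "\n".join([
    pdb_line("ATOM", 1, " N", "ALA", 1),
    pdb_line("ATOM", 2, " CA", "ALA", 1),
    pdb_line("HETATM", 1001, " C1", "GLC", 900),
    pdb_line("HETATM", 2001, " O", "HOH", 950),
])
protein_lines, ligand_lines = split_complex(complex_text)
print(len(protein_lines), len(ligand_lines))
```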

The coordinates of the ligand should represent the Probe Molecule correctly located in the binding cleft of the Target, and should naturally be appropriate since the file for the ligand was prepared from the file for the complex. However unexpectedly close contacts are sometimes found between atoms of the Target and HETATMS of the Probe Molecule, and those contacts can seriously bias the results. Grid will normally allow the ligand to move slightly in its binding cleft in order to relax any such contacts (See GLUE Tutorial 01).

The main computation can then be subdivided into the following stages:

  1. The GLUE run is started, and its first job is to relax any close contacts between ligand and Target as described above. The overall movement of the ligand is normally very small, and the relaxed coordinates are printed to the GRIDLONT output file.

  2. It is assumed that the binding cleft of the Target was exposed to water before binding to the Probe Molecule, and that some water molecules might have been held to its surface by hydrogen bonds. The GRINKOUT file for the Target is therefore used as the first input file for Grid, and a Water Probe is used to predict the location of such waters. A list of waters is prepared, and details about each individual water molecule are printed to the GRIDLONT output file for detailed analysis.

  3. The predicted waters are added to the binding cleft, giving a hydrated Target whose interactions with the Probe Molecule can be studied.

  4. GLUE then strips appropriate waters away from the surface of the Target at those places where it will be in contact with the atoms of the Probe Molecule when the complex is finally reassembled. These are the water molecules which would be disturbed or displaced as a result of complex formation, and GLUE estimates the enthalpy associated with their movement or removal. The answer is normally a positive value because the breaking of hydrogen-bonds from the Target to water is an enthalpically unfavourable process, and the contribution made by each individual water is listed in the GRIDLONT file.

  5. It is also assumed that some water molecules would be held by hydrogen bonds to the surface of the Probe Molecule before it entered the binding cleft. The next step is therefore to predict the location of those waters, and decide which ones would be disturbed or displaced from the surface of the Probe Molecule when the complex is formed. The enthalpy changes are again estimated as described above, and are again positive.

  6. At this stage some parts of the ligand surface are still immersed in water, and parts are now dry because waters have been removed where it is going to touch the Target. Similarly parts of the Target are immersed in water, and parts are dry where it is going to touch the ligand. The dry surface of the ligand is now brought up to the dry surface of the cleft, and enthalpic interactions between the two are computed. This is normally a large negative value indicating a favourable attraction due to induction, dispersion and hydrogen bonding interactions.

  7. In Step 5 above, some predicted waters were stripped away from the ligand but others were left attached to it. The residual waters were those making hydrogen-bonds to the ligand which were not displaced on binding, and they were brought with the ligand into the cleft. Similarly some predicted waters were left in the cleft because they would not be displaced by the ligand. An unacceptable clash of predicted water molecules may now occur, because a water coming with the ligand might be brought into almost the same position as a water already in the cleft. Searches are therefore made for any such clashing waters, and one of each pair is displaced with an appropriate energy adjustment.

  8. Many water molecules will still remain round the newly-formed complex, and some may form water bridges between the ligand and the cleft. These bridges are enthalpically favourable and contribute a negative enthalpy. However, the number of bridges is often small, and so their influence may be limited.

  9. Parts of the ligand surface, and parts of the cleft may be hydrophobic, and the displacement of ordered waters from these regions into bulk water is entropically favourable to ligand binding. Such waters are not normally detected by X-ray crystallography, and their positions would not have been predicted by GLUE. However there is a Hydrophobic Probe in GLUE, and hydrophobicity almost always contributes a negative entropy term when the overall ligand-target interaction is being computed. This term may be large or small depending on the structures of the ligand and the cleft, and the way they are assembled together in the complex.

  10. Parts of the ligand molecule may be conformationally flexible, and so may some side-chains of the Target. Degrees of conformational freedom may therefore be lost when the ligand binds, and this loss of flexibility is entropically unfavourable to ligand binding. The associated energy term is therefore positive, but it is sometimes quite small and may be absent altogether.
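Step 7 above, the search for clashing waters, can be sketched in Python as follows. The text does not say which water of a clashing pair is displaced or how the energy adjustment is computed, so the sketch makes loud assumptions: the less favourable water is the one displaced, the adjustment is the loss of that water's interaction energy, and two waters clash when they are closer than an assumed `min_separation` of 2.0 Angstrom. All names are hypothetical.

```python
from math import dist

def resolve_water_clashes(cleft_waters, ligand_waters, min_separation=2.0):
    """Displace one water from each clashing cleft/ligand pair.

    Each water is (interaction energy, (x, y, z)), with negative energies
    favourable. Returns (kept cleft waters, kept ligand waters, penalty).
    """
    kept_cleft = list(cleft_waters)
    kept_ligand = []
    penalty = 0.0
    for water in ligand_waters:
        clash = next((w for w in kept_cleft
                      if dist(w[1], water[1]) < min_separation), None)
        if clash is None:
            kept_ligand.append(water)   # no clash: the water stays
        elif water[0] < clash[0]:
            kept_cleft.remove(clash)    # ligand water is better bound (assumption)
            kept_ligand.append(water)
            penalty += -clash[0]
        else:
            penalty += -water[0]        # cleft water is better bound (assumption)
    return kept_cleft, kept_ligand, penalty

cleft = [(-5.0, (0.0, 0.0, 0.0)), (-3.0, (10.0, 0.0, 0.0))]
ligand = [(-4.0, (0.5, 0.0, 0.0)), (-6.0, (10.5, 0.0, 0.0)), (-2.0, (20.0, 0.0, 0.0))]
kept_cleft, kept_ligand, penalty = resolve_water_clashes(cleft, ligand)
print(kept_cleft, kept_ligand, penalty)
```

The penalty is positive, in line with the sign convention used for the other water-displacement terms.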

41.7. List of energy terms

The following list shows the computed terms which contribute to the interaction energy between Probe Molecule and Target. The sign against each indicates whether it normally favours ligand binding (- sign) or tends to oppose it (+ sign):

Energy balance sheet for Glucose binding to Glycogen Phosphorylase.

Term                                                    Energy
Steric Contacts                                            0
Ligand-Target Enthalpic interaction energy and
  Water bridges between Ligand and Target                -43.7
Penalty for displacing pre-bound waters from Target      +27.7
Penalty for displacing pre-bound waters from Ligand      +15.8
Penalty for mutually incompatible waters                  +0.0
Term contributed by hydrophobic interactions              -3.4
Penalty for restricted rotational torsions                +0.0
OVERALL TOTAL                                             -3.6
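The balance sheet can be checked by adding up its terms. The Python sketch below uses the values listed above; the dictionary keys are shortened labels chosen for illustration.

```python
balance_sheet = {
    "steric contacts": 0.0,
    "ligand-target enthalpy and water bridges": -43.7,
    "pre-bound waters displaced from Target": +27.7,
    "pre-bound waters displaced from Ligand": +15.8,
    "mutually incompatible waters": 0.0,
    "hydrophobic interactions": -3.4,
    "restricted rotational torsions": 0.0,
}

# Terms with a - sign favour binding; terms with a + sign oppose it.
favourable = round(sum(v for v in balance_sheet.values() if v < 0), 1)
opposing = round(sum(v for v in balance_sheet.values() if v > 0), 1)
total = round(sum(balance_sheet.values()), 1)

print(favourable, opposing, total)
```

The overall total is the small difference between two much larger sums, which is why small errors in the individual terms matter so much.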

This approach helps one to assess the role of water molecules when ligands bind in biological systems. However, there are too many uncertainties for worthwhile estimates of the equilibrium constant to be obtained, and this must be considered only as an aid for ranking the Target-ligand interaction energies.

41.8. GLUE limitations

GLUE is based on the GRID force field and works correctly when ligands contain up to 300 atoms and proteins up to 24000 atoms.

When starting from FLAP sites, the threshold values correspond to FLAP parameters: up to 6 probes can be used (one of which is automatically assigned to the DRY probe) and 25 points for each MINI file.

Please refer to GLUE tutorials 01 and 02 for further explanations.
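A pre-flight check against these limits is easy to script. The sketch below is a hypothetical helper, not part of GLUE: the dictionary `GLUE_LIMITS`, its keys and the function `check_limits` are invented for illustration, and only the numbers come from the text.

```python
GLUE_LIMITS = {
    "ligand_atoms": 300,
    "protein_atoms": 24000,
    "flap_probes": 6,        # one of these is assigned to the DRY probe
    "points_per_mini": 25,
}

def check_limits(**counts):
    """Return a list of messages for every count that exceeds its limit."""
    problems = []
    for key, value in counts.items():
        limit = GLUE_LIMITS[key]
        if value > limit:
            problems.append(f"{key} = {value} exceeds the limit of {limit}")
    return problems

print(check_limits(ligand_atoms=24, protein_atoms=24000))
print(check_limits(ligand_atoms=301, flap_probes=7))
```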
