The Calculation Parameters dialog is presented to the User when importing series, importing fields or recomputing.

 

Method Dialog

 

In this dialog, the User can change the different parameters that control the ALMOND encoding engine. These can be grouped under three categories:

 

Probes parameters: Define the identity of the different probes present in the MIFs. They also define whether each probe is used in the analysis and whether cross-interactions are used.

Field filtering parameters: Define the number of nodes to extract and the relative importance of the field values for the node selection.

MACC2 transform parameters: Define the width of the smoothing window, which determines the number of discrete distance ranges to consider in the MACC2 transform. They can also define a fixed number of total variables to extract.

 

Each of these three categories of parameters can be accessed and changed by pressing one of the three buttons shown at the right-hand side of the dialog.

The value of the ALMD directive is also indicated but it cannot be changed.

The true value of the ALMD directive is not contained in the kont files of GRID. When GRID fields are imported, it is only possible to know whether ALMD was greater than 0.0 by looking at the size of the file, since extra information is written when ALMD is greater than 0.0. Therefore, a value of 1.0 for the ALMD directive indicates only that ALMD was set in the grid.in used to generate the kont file, not what its exact value was.

When fields are imported from GOLPE, the value of the ALMD directive is set to 0.0.

 

Define probes

In this dialog the User can define the identity of the probes used in the MIF analysis. When ALMOND imports series, GRID is run by ALMOND itself, so the identity of the probes is known and these are shown as the defaults. When ALMOND imports fields, the program can detect the number of fields generated but not the probes used, and the User must define them.

Probes can be of the following types:

probe code  probe name          old coeff.  new coeff.
OH          OH phenyl           -7.0        -7.9
OH2         water               -7.0        -8.8
C3          C sp3               -2.4        -2.8
DRY         hydrophobic         -2.7        -3.5
O           O carbonyl          -5.6        -6.1
O::         O carboxyl          -8.7        -7.8
OC2         O ether             -3.9        -4.3
N:          N sp3 neutral       -6.4        -6.7
N1          N amide             -6.8        -7.2
N1:         NH sp3 neutral      -8.4        -8.7
N1+         NH sp3 cation       -9.1        -9.9
F           fluorine            -6.8        -3.1
CL          chlorine            -4.5        -4.9
BR          bromine             -4.3        -5.2
NA+         sodium cation       -22.7       -15.7
K+          potassium cation    -17.6       -13.4
CA+2        calcium cation      -43.6       -30.6
FE+2        ferrous cation      -33.2       -41.0

In the above table, the first column gives the probe code as it is used in GRID v.19. The second column gives the identity of the chemical group used in the calculation. The third column shows the coefficient used to scale the field values until version 3.2.0, and the fourth column shows the coefficients that are currently used in ALMOND.

The old coefficients correspond to the most negative interaction energies found in the GRID analysis of a large series of compounds (about 600 drugs or drug-like compounds). Although the number of compounds used to derive this first generation of coefficients was fairly high, it was rather common to find series of compounds with correlogram values significantly above 1.0. Therefore, a more extensive study was performed on a larger database (about 55,000 compounds), which led to the new coefficients implemented in ALMOND since version 3.2.0. The new coefficients correspond to the most negative interaction energies found in the GRID analysis after ordering the compounds by the value of their most negative interaction energy and discarding the first five percentiles of the energy distribution.

The database used for scaling was the full Maybridge HTS database (September 2002). The structures were converted from 2D to 3D with CORINA 2.6. The database contains mostly drug-like compounds, but some of them can be considered outliers because they make very strong interactions with the probe due to their particular conformation. The energy values of such compounds are excluded by removing the first five percentiles of the energy value distribution. Applying these coefficients should guarantee that, for most of the usual compounds (95% of the compounds in the case of the Maybridge database), the field values will vary between 0 and 1. However, with charged or otherwise peculiar compounds, values larger than 1 may be obtained. This is not an error and is not detrimental to the analysis.
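The derivation of a scaling coefficient described above can be illustrated with a short Python sketch. It is not taken from ALMOND: the function names, the sample energies and the simple rank-based percentile cut are assumptions made only for illustration.

```python
def scaling_coefficient(min_energies, discard_percent=5):
    """Most negative energy left after discarding the most negative
    `discard_percent` of the per-compound minimum energies.
    (Assumed rank-based cut; ALMOND's exact procedure may differ.)"""
    ordered = sorted(min_energies)                  # most negative first
    n_discard = len(ordered) * discard_percent // 100
    return ordered[n_discard]


def scale_field(values, coefficient):
    """Divide field values by the (negative) scaling coefficient."""
    return [v / coefficient for v in values]


# Hypothetical series of 20 compounds: one outlier with a very strong
# interaction (-12.0) and 19 ordinary compounds between -8.0 and -6.2.
min_energies = [-12.0] + [-8.0 + 0.1 * i for i in range(19)]

coeff = scaling_coefficient(min_energies)           # the outlier is discarded
scaled = scale_field([-4.0, -8.0, -12.0], coeff)
print(coeff, scaled)
```

In this made-up series the outlier is discarded, so its own scaled value exceeds 1, which mirrors the behaviour described for charged or peculiar compounds.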

There is no pre-defined probe type for MEPs in ALMOND 3.2.0.

Additionally, in this dialog the User can instruct ALMOND to use or ignore the different fields and to use or not cross-interactions in the MACC2 transform.

 

Define Probes Dialog

 

Probe 1

Select a probe from the list if the label shown does not correspond to the probe used for the analysis.

 

TIP (shape probe)

This control is used to activate the shape probe which is a special ALMOND probe (see background section). It is enabled when at least one GRID probe other than DRY is used. If four GRID probes are used, the shape probe is disabled since a maximum of four probes is allowed in ALMOND.

For consistency, the curvature values obtained with the shape probe are scaled in the same way as the energy values obtained with common GRID probes: i.e. using a scaling factor obtained from the same series of compounds (see above).
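The rule that enables or disables the TIP control can be written down as a small Python check. This is only a restatement of the two conditions given above; the function name and the list-of-codes representation are hypothetical.

```python
MAX_PROBES = 4  # maximum number of probes allowed, as stated in the text


def shape_probe_available(grid_probes):
    """Return True if the TIP control would be enabled for these GRID probes.

    grid_probes: list of GRID probe codes, e.g. ["DRY", "O", "N1"].
    """
    if len(grid_probes) >= MAX_PROBES:
        return False                    # no room left for the shape probe
    # At least one GRID probe other than DRY must be present.
    return any(code != "DRY" for code in grid_probes)


print(shape_probe_available(["DRY", "O", "N1"]))         # three probes, two non-DRY
print(shape_probe_available(["DRY"]))                    # only DRY
print(shape_probe_available(["DRY", "O", "N1", "OH2"]))  # four probes already
```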

 

TIP Options

Clicking on this button opens the Shape parameters dialog window. It is enabled when the shape probe chooser has been turned on.

 

use

If this control is marked, the probe will be used in the analysis; if not, it will be ignored. At least one probe should be marked.

 

use cross-correlation

If this control is marked, ALMOND will compute cross-interactions between points of different fields in order to obtain cross-correlograms.

 

Define filtering

The filtering step will extract from the MIF a smaller number of nodes that hopefully represent the most relevant pharmacophoric characteristics of the compounds (see the background section for details).

 

Define Filtering Dialog

 

Number of filtered nodes:

Using the sliding control, the User can define the exact number of nodes that will be extracted from each MIF. These are extracted from nodes with negative values significantly different from 0.000. If the number of nodes defined is too high, some fields might not contain enough nodes to meet the requirement. In this case a warning message will be written to the log. If this message is observed often, we suggest reducing the number of nodes.

The default is 100 nodes. This is usually a good selection for drug-like compounds and a grid spacing of 0.5. Certain circumstances can affect the optimum number of nodes.

This is one of the most important parameters of the method. The best way to fine-tune its value is to visually inspect the filtered points selected for some representative compounds of the series. In particular, if the filtered points do not represent all the potential pharmacophoric points reasonably well, the number of filtered points should be increased.

The number of nodes has a large impact on the speed of the method. Selections over 200 are not recommended and will slow down the analysis significantly.

 

Relative weight of the field (%):

The selection of the nodes is made according to two different criteria (see background for details):

  1. selected nodes should have high negative values
  2. selected nodes should be as far as possible from every other selected node

The User can move the sliding control to change the relative weight of the first criterion (the field values) in the selection. If moved to 100%, the selection will be based only upon the values of the field. On the other hand, if it is moved to 0%, the selection will be based only on the distances between the nodes. Usually the default (50%) is a good choice. However, compounds with formal charges or strong polar groups sometimes tend to concentrate too many filtered nodes around them. In these situations, decreasing the importance of the field values can help.

As in the previous case, the best way to fine tune this value is the visual inspection of the filtered nodes selected on a representative set of compounds.
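To make the trade-off between the two criteria more concrete, the following Python sketch selects nodes greedily with a weighted score. It is not ALMOND's filtering algorithm (see the background section for that); the scoring formula, the normalisation and the choice of starting node are all assumptions made for illustration.

```python
import math


def select_nodes(nodes, n_select, field_weight=0.5):
    """Greedy illustration of weighting field value against spread.

    nodes: list of (x, y, z, energy) tuples; only negative energies are used.
    field_weight: 1.0 = field values only, 0.0 = distances only.
    """
    candidates = [n for n in nodes if n[3] < 0.0]
    if not candidates:
        return []
    e_min = min(n[3] for n in candidates)        # most negative energy
    # Assumed starting point: the node with the most negative energy.
    selected = [min(candidates, key=lambda n: n[3])]
    candidates.remove(selected[0])
    while candidates and len(selected) < n_select:
        # Distance from each candidate to its nearest already selected node.
        dists = [min(math.dist(c[:3], s[:3]) for s in selected)
                 for c in candidates]
        d_max = max(dists) or 1.0                # avoid division by zero
        scores = [field_weight * (c[3] / e_min)
                  + (1.0 - field_weight) * (d / d_max)
                  for c, d in zip(candidates, dists)]
        selected.append(candidates.pop(scores.index(max(scores))))
    return selected


# Hypothetical nodes: two strong ones close together and one weak, distant one.
nodes = [(0, 0, 0, -5.0), (0.5, 0, 0, -4.0), (10, 0, 0, -1.0), (1, 0, 0, 0.5)]
print(select_nodes(nodes, 2, field_weight=1.0))  # second pick: strong neighbour
print(select_nodes(nodes, 2, field_weight=0.0))  # second pick: distant node
```

With the weight at 1.0 the second node picked is the strong neighbour of the first; with the weight at 0.0 it is the distant, weaker node, which is the effect that lowering the slider is meant to achieve around charged groups.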

 

Define MACC2

The auto- and cross-correlograms produced by the MACC2 transform (see background) contain variables representing node-node interactions at certain distance ranges. These ranges constitute a discrete division of space, like the millimeter marks on a ruler: all distances falling between two of these marks are considered similar and indistinguishable. The space separating these marks in MACC2 is called the "width of the smoothing window" and can be changed here.

 

Define MACC2 Dialog

 

Width of the smoothing window:

The User can move the sliding control to change this width. The value is expressed in grid spacing units. If the value is set to a very low value, many distance ranges will be defined, the correlograms will contain a large number of variables and there is the risk that conceptually equal distances will fall in different variables due to small differences in the discrete sampling of the MIF. On the other hand, if it is set to a value too high, conceptually different distances can be assigned to the same variables. In our experience the default value works fine for most cases and no change is usually required.
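The effect of the window width on the distance ranges can be sketched in Python as follows. The sketch assumes that the ranges start at zero and that a distance is assigned by simple integer division; the function name and the width values used are hypothetical and are not ALMOND defaults.

```python
def distance_bin(distance, window_width, grid_spacing):
    """Index of the distance range that contains `distance`.

    window_width is expressed in grid spacing units, so the size of each
    range in length units is window_width * grid_spacing.
    """
    bin_size = window_width * grid_spacing
    return int(distance // bin_size)


grid_spacing = 0.5

# Wide window (2.0 grid units, i.e. ranges 1.0 long): 3.2 and 3.9 are merged.
print(distance_bin(3.2, 2.0, grid_spacing), distance_bin(3.9, 2.0, grid_spacing))

# Narrow window (0.5 grid units, i.e. ranges 0.25 long): 3.2 and 3.3 are split.
print(distance_bin(3.2, 0.5, grid_spacing), distance_bin(3.3, 0.5, grid_spacing))
```

The narrow window separates two nearly identical distances into different variables, while the wide one merges clearly different distances, which are the two risks described above.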

Please note that the number of ALMOND variables depends strictly on this value and is independent of the number of filtered nodes.

 

Autocorrelogram size:

If the control is set to zero, ALMOND runs the MACC2 analysis on the whole series and then defines the size of the correlograms as the minimum size that encloses every variable with at least one non-zero value. The same correlogram size is assigned to all the correlograms, for all the objects in the series.

Please note that, in this case, the number of ALMOND variables extracted depends only on the width of the smoothing window and the series of compounds analyzed.

This dependence on the series is inconvenient when a model has been developed and the analysis has to be repeated for a different set of compounds, since ALMOND can produce a different number of descriptors, making PCA or PLS external predictions on these new compounds impossible. In such cases it is possible to define a fixed number of variables to extract. The number defined will correspond exactly to the number of variables included in each correlogram. In this case, some of the variables may contain only zero values or, on the contrary, some distances found in the series may not be represented in the correlograms.
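The difference between the automatic and the fixed correlogram size can be illustrated with the Python sketch below. It is written under the assumption that a correlogram is a plain list of values ordered by distance range; the function names and the sample values are hypothetical.

```python
def automatic_size(correlograms):
    """Smallest size that keeps every variable holding a non-zero value
    in at least one correlogram of the series."""
    size = 0
    for correlogram in correlograms:
        for index, value in enumerate(correlogram):
            if value != 0.0:
                size = max(size, index + 1)
    return size


def resize(correlogram, size):
    """Truncate or zero-pad a correlogram to exactly `size` variables."""
    return (list(correlogram) + [0.0] * size)[:size]


# Hypothetical series of two correlograms.
series = [[0.0, 0.3, 0.0, 0.0],
          [0.1, 0.0, 0.5, 0.0]]

print(automatic_size(series))      # size chosen when the control is zero
print(resize(series[1], 2))        # fixed size too small: a distance is lost
print(resize(series[0], 6))        # fixed size too large: trailing zeros
```

A fixed size keeps the number of descriptors identical between the training series and any new compounds, at the cost of either trailing all-zero variables or lost long distances.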

 

 

Any change in the parameters is immediately reflected in the text window. When this text window shows acceptable values the User can press the OK button to start the computation or Cancel to abort it.

 
