Prediction palette

The prediction palette is a tool for calculating PCA or PLS scores, or predicting the activity, of series of objects not originally included in the models. Results are printed in the main text window and can be monitored on 2D or 3D plots. Objects considered irrelevant for the prediction can also be excluded, either manually or graphically.
The first step is to select PCA or PLS prediction. In both cases the graphics (2D, 3D or scatter plot) are activated only after the PCA or PLS prediction is completed.
Results are printed in the following format:
- PCA: object sequential number, object name and coordinates of the object in the PC model (PC1 to PC5), followed by SSexp and SSacc. for every PC.
- PLS: object sequential number, external value (e.g. <1.14>), object name and coordinates of the object in the LV model (LV1 to LV5), followed by the external SDEP every time an LV is introduced into the model.
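As an illustration of the external SDEP reported for each LV, the sketch below uses the common definition of SDEP (the square root of the mean squared difference between experimental and predicted values). The function name `sdep`, the variable names and the numbers are hypothetical and are not taken from the program; the exact formula the program applies is assumed, not confirmed by this manual.

```python
from math import sqrt


def sdep(experimental, predicted):
    """Standard deviation of the error of predictions (assumed common definition)."""
    if len(experimental) != len(predicted) or not experimental:
        raise ValueError("need two non-empty sequences of equal length")
    squared_errors = [(y - y_hat) ** 2 for y, y_hat in zip(experimental, predicted)]
    return sqrt(sum(squared_errors) / len(experimental))


# Hypothetical external values and the predictions obtained with 1 and 2 LVs.
experimental = [1.0, 2.0, 3.0]
predictions_by_lv = {1: [2.0, 3.0, 4.0], 2: [1.0, 2.0, 3.0]}

# One external SDEP value per model dimensionality.
external_sdep = {lv: sdep(experimental, pred) for lv, pred in predictions_by_lv.items()}
```

A lower value at a given dimensionality means that the external objects are predicted more closely by the model with that number of LVs.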
2D/3D scores
The User must specify the axes for the PCA or PLS plot. For example, X=1 and Y=2 selects the PC1 vs PC2 space. To show the predicted values in the plot, the show external predictions button must be enabled.
After pressing OK, the 2D or 3D plot appears. The projected points are shown in red. When the spectrum color is activated, the predicted activity of the external compounds is easy to appreciate (see how to activate the spectrum plot in the 2D plot or 3D plot sections of this manual).
scatter
It reports experimental vs calculated activity for the compounds in the library. The external compounds will be reported in red. By clicking on the compounds the calculated activity will be reported at the present dimensionality.
residuals
It reports the experimental activity vs the recalculated residuals at the selected dimensionality. The external compounds will be reported in red. By clicking on the compounds the calculated residual value will be reported at the present dimensionality.
PLS plot
The User must select the component dimensionality (the first component is always the best choice). The show external predictions button must be ON to show the predictions in the plot. The external compounds will be reported in red. The position of the compounds in the model (yellow line) can be used to make a ranking of the compounds.
The Exclude objects button is used to exclude some of the objects of the external series from the prediction. Excluding objects is useful for two reasons:
- Sometimes external predicted compounds are really far from the current chemical space. These compounds often produce a graphic "compression" of the plot thus disturbing the chemical interpretation.
- For virtual screening purposes, many compounds are projected onto the model, whereas only a small subset of relevant objects is of main interest. The fastest and easiest way of selecting these objects is to use the graphical exclusion method.
Clicking the Exclude objects button opens a new List objects window. The exclude object dialog box is described in its own section of this manual. After pressing OK in the List objects window, another prediction will be performed without the disturbing compounds.
The Exit button closes the prediction palette. If the palette was opened after a direct projection (command File>>>Direct projection...), the projected objects will remain in memory for further analysis. If the palette was opened after the command Modeling>>>Project on library model..., the external compounds will be removed from memory and the original series will remain instead.
Similarity

This dialog evaluates the similarity between the compounds in the library model and those in the projected series. Comparisons are based on the Hodgkin index:
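$$S_{AB} = \frac{2\sum_i a_i b_i}{\sum_i a_i^2 + \sum_i b_i^2}$$

This is the standard form of the index, where $a_i$ and $b_i$ are the values of variable $i$ in the correlograms of the two compounds being compared.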
Scoring mode
In principle, all variables in the computed correlograms participate in the computation. However, variables which take a value higher than 0 for one of the compounds and not for the other, often contribute very much to the index and dominate the comparison. In order to soften this effect, it is possible to apply different filtering options:
| Option | Description |
|---|---|
| Global | All variables are used for the similarity index. No filtering is applied. |
| projected<model | Only variables which take a value higher than 0 for the projected compounds are used in the computation of the similarity. This is interesting when the projected molecules are smaller in size than the average molecules in the global model. |
| model<projected | Only variables which take a value higher than 0 for the compounds in the global model are used in the computation of the similarity. This is interesting when the molecules in the global model are smaller in size than the molecules in the projected series. |
| Local | Only variables which take a value higher than 0 for both the projected and the global compounds are used in the computation of the similarity. |
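The four filtering options can be sketched as follows. This is a minimal illustration, assuming the standard form of the Hodgkin index (twice the sum of products divided by the sum of both sums of squares) and treating each correlogram as a plain list of values. The function name `hodgkin_index` and the mode strings are hypothetical labels chosen to mirror the options above; they are not the program's own interface.

```python
def hodgkin_index(model, projected, mode="global"):
    """Hodgkin similarity between two equal-length correlogram vectors.

    `mode` selects which variables take part, mirroring the options above.
    """
    if len(model) != len(projected):
        raise ValueError("correlograms must have the same number of variables")
    filters = {
        "global": lambda m, p: True,              # no filtering
        "projected<model": lambda m, p: p > 0,    # variables > 0 in the projected compound
        "model<projected": lambda m, p: m > 0,    # variables > 0 in the model compound
        "local": lambda m, p: m > 0 and p > 0,    # variables > 0 in both compounds
    }
    keep = filters[mode]
    pairs = [(m, p) for m, p in zip(model, projected) if keep(m, p)]
    numerator = 2.0 * sum(m * p for m, p in pairs)
    denominator = sum(m * m for m, _ in pairs) + sum(p * p for _, p in pairs)
    # Return 0.0 when no variable survives the filter (an assumption of this sketch).
    return numerator / denominator if denominator > 0 else 0.0


# Hypothetical correlograms for one model compound and one projected compound.
model_compound = [1.0, 2.0, 0.0, 3.0]
projected_compound = [1.0, 0.0, 2.0, 3.0]

scores = {mode: hodgkin_index(model_compound, projected_compound, mode)
          for mode in ("global", "projected<model", "model<projected", "local")}
```

In this example the two compounds differ only in variables that are 0 for one of them, so the Global score is the lowest and the Local score reaches 1.0, which illustrates how such variables dominate the unfiltered comparison.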
slice correlograms
For the purpose of the comparison only, the values of the correlograms can also be transformed by subtracting the given value and setting any resulting negative value to 0. The effect of this transform is to reduce the influence on the scoring of variables having very high values for certain compounds. Please note that this transformation only has an effect if the value entered in this field is higher than 0.000.
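The slicing transform described above can be sketched in a few lines. The function name `slice_correlogram` and the example values are hypothetical; the sketch only assumes that a correlogram can be treated as a list of numeric values.

```python
def slice_correlogram(values, slice_value):
    """Subtract `slice_value` from every variable and clip negative results to 0."""
    if slice_value <= 0:
        # As stated above, the transform has no effect unless the value is > 0.
        return list(values)
    return [max(v - slice_value, 0.0) for v in values]


# Hypothetical correlogram values sliced at 1.0.
sliced = slice_correlogram([0.5, 2.0, 5.0], 1.0)
unchanged = slice_correlogram([0.5, 2.0, 5.0], 0.0)
```

Variables at or below the slice value become 0, and all remaining values are lowered by the same amount.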
Output
The resulting scores can be shown in the Main window, dumped to a file (SimilarityScores.txt) or both. For large series, selecting the Main window output is not recommended, because the dump can take a long time to complete.
Pressing the OK button will compute the similarity. Press Cancel to exit without performing any computation.