MoKa manual
Chapter 6. Kibitzer: Extend pKa accuracy
Kibitzer is an automatic and expert tool to expand the MoKa internal database with a corporate database of pKa values. A fully automated training procedure makes it easy to create customized pKa models by using your experimental pKa values. You can also easily import an SD file containing experimental pKa values into the Kibitzer tool. Check the pKa assignments and, if necessary, correct them by using the embedded molecular viewer. Create customized models and verify the enhancements using MoKa. To save your Kibitzer project, export it in an SD file.
6.1. The Interface and the Workflow
Here is the Kibitzer Interface and the three steps used to build a custom pKa prediction model:
Import SD file
Check assigned pKa
Build models

6.2. The Menus
Here are the menus you will see displayed on the Kibitzer menu bar (left to right), and the commands you will find in each.
File menu
Open - Opens a .kib file to import into Kibitzer
Save - Saves your current work
Save as... - Saves your current work in the file specified
Import SD data file - Imports an SD file
Export SD - Exports the current project to an SD file including appropriate fields to store user defined settings
Exit - Quits Kibitzer
Edit menu
Select All - Selects all the molecules listed
Unselect All - Unselects all the molecules listed
Tools menu
Compute - Opens a dialog box to start the model building process
Load Custom Model - Selects the Custom Model to use to assign experimental pKas. Please select this option before importing the SD file
Restore Default Model - Selects the MoKa standard Internal Model to assign experimental pKas. Please select this option before importing the SD file
Note: It is possible to load a custom model at program startup by using the --load-model option, like this: kibitzer --load-model=modelname
View menu
Optimize 3D view - Rotates the structure in the Molecule view window for better visualization
Toggle log window - Opens the log window
Annotate depiction - Annotates the 2D depiction with atom numbers (in parentheses) and experimental pKa
Help menu
Manual - Opens the manual
About - Displays information about the Kibitzer version
6.3. Working with Kibitzer
This chapter provides basic information to get you started working with Kibitzer. Before creating customized pKa prediction models, you need to consider how they will be used. Are you going to add your whole database of pKa values? Are you going to add only a part of your data set and keep the remainder for testing? Are you going to add only one molecular structure? Knowing your requirements will help you choose a suitable data set.
Kibitzer improves the accuracy of MoKa pKa calculations by expanding the chemical space of the internal database. If you want to test the capabilities of the software, bear in mind that Kibitzer works by automatically adding missing parameters to describe a molecular structure. When the structure that you are adding is already known, Kibitzer will account for your pKa and adjust the existing parameters accordingly.

By checking the QP value, you can easily see whether your structure contains structural features not parametrized in the internal database. The higher the absolute value of QP, the further your structure is from the chemical space covered by the model currently in use. To obtain the best results and avoid overfitting, include at least 2-3 structures of the same series when the QP for a pKa is significantly high (> 1.0).
To select a data set suited to your requirements, you need to consider the following:
Kibitzer cannot significantly improve the accuracy of the predictions if QP = 0. However, you will be able to see a shift towards your experimental data.
If you generate a model for structures that have nonzero QP values, Kibitzer will build a model that has a very good fit with your experimental data.
If you add only one pKa, you will be able to see approximately the same pKa shift for structures of the same series.
If the training library and the test library are unrelated, you will not be able to see the effects of the training.
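The QP guidance above can be turned into a quick check on a candidate training set. The following Python sketch is purely illustrative: the record layout (`name`, `series`, `qp`), the function name, and the idea of grouping by a user-supplied series label are assumptions and not part of Kibitzer. Only the 1.0 threshold and the lower bound of 2 structures per series come from the text.

```python
from collections import defaultdict

QP_THRESHOLD = 1.0   # "significantly high" |QP|, as stated in the text
MIN_PER_SERIES = 2   # lower bound of the suggested 2-3 structures per series


def underrepresented_series(records):
    """Return series that contain a high-|QP| pKa but too few structures.

    `records` is a hypothetical list of dicts with the keys
    'name', 'series' and 'qp' (one entry per assigned pKa).
    """
    members = defaultdict(set)   # series label -> distinct structure names
    flagged = set()              # series with at least one |QP| above threshold
    for rec in records:
        members[rec["series"]].add(rec["name"])
        if abs(rec["qp"]) > QP_THRESHOLD:
            flagged.add(rec["series"])
    return sorted(s for s in flagged if len(members[s]) < MIN_PER_SERIES)
```

Any series returned by this check would be a candidate for adding further structures before building a model.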
6.3.1. Step 1: Import SD file
Import an SD file containing experimental pKa values and Kibitzer will automatically assign such pKas to their corresponding ionizable sites.
In the SD file each pKa should be reported in a field containing the pattern <pka> (case insensitive). For example:
> <PKA1> 3.490 > <PKA2> 5.320
> <PKA1> 3.490 5.320
> <PKA1> <3.0 5.320 cosolvent
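To check which fields of an SD file match the pattern before importing it, a small script can help. In the standard SD layout, each data item has a header line (such as `> <PKA1>`), followed by its value lines and a blank line. The Python sketch below collects the fields whose name contains `pka` in any case. The function name is hypothetical, and the tokens are returned raw because the way Kibitzer itself interprets multiple values or qualifiers such as `<3.0` and `cosolvent` is not described here.

```python
import re

# A data header line starts with '>' and carries the field name in <...>.
HEADER = re.compile(r"^>.*?<([^<>]+)>")


def read_pka_fields(record_text):
    """Collect the data items of one SD record whose field name contains 'pka'.

    Returns {field_name: [tokens]}; tokens are the raw whitespace-separated
    strings, so entries such as '<3.0' or 'cosolvent' are kept untouched.
    """
    fields = {}
    current = None
    for line in record_text.splitlines():
        header = HEADER.match(line)
        if header:
            name = header.group(1)
            current = name if "pka" in name.lower() else None
            if current is not None:
                fields[current] = []
        elif line.strip() == "" or line.startswith("$$$$"):
            current = None  # a blank line or the record delimiter ends the item
        elif current is not None:
            fields[current].extend(line.split())
    return fields
```

For example, a record containing the header `> <PKA1>` followed by the line `3.490 5.320` would yield `{"PKA1": ["3.490", "5.320"]}`.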
6.3.2. Step 2: Check assigned pKas
After importing your SD file, the Name of every molecule is listed on the left panel along with the attributes Accuracy, Class and Info. You can sort molecules by clicking on the corresponding attribute. The Accuracy attribute is also displayed by colored arrows:

You should be particularly careful with molecules labeled by a red arrow. Kibitzer assigns experimental pKa values according to MoKa calculations.
A significant difference (> SD + 1.5 pKa units) between the calculated and the experimental pKas of a molecule is highlighted by a red arrow to the left of the molecule's name. If the assignments have a good degree of accuracy, the corresponding molecule is associated with a yellow (better than 1.5 pKa units) or green (within standard deviation) arrow. Molecules that have no experimental pKa value or no ionizable site do not have any arrow to the left of their names.
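One way to read the arrow colors described above is as thresholds on the absolute difference between a calculated and an experimental pKa. The Python sketch below is an interpretation, not Kibitzer's actual code. It assumes that yellow covers the range between SD and SD + 1.5, that the boundaries are inclusive, and that a single pKa pair is being judged, since the text does not say how several pKas of one molecule are combined. The function name and the `sd` argument are hypothetical.

```python
def accuracy_arrow(predicted, experimental, sd):
    """Map one assigned pKa pair to an arrow color, or None for no arrow.

    `sd` stands for the standard deviation referred to in the text;
    its actual value is not given there and must be supplied by the caller.
    """
    if predicted is None or experimental is None:
        return None  # no experimental pKa or no ionizable site: no arrow
    diff = abs(predicted - experimental)
    if diff <= sd:
        return "green"   # within standard deviation
    if diff <= sd + 1.5:
        return "yellow"  # assumed band between SD and SD + 1.5
    return "red"         # difference greater than SD + 1.5 pKa units
```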
The label class classifies molecules according to the relative number of predicted and experimental pKa values reported in the SD file.
To ease the assignment, predicted pKas in the extreme range are filtered out. For example, predicted pKas above 12 of weak acids or predicted pKas below 2 of very weak bases are removed because such pKas cannot be measured in normal conditions.
If for a molecule the number of experimental pKa values exceeds the number of ionizable sites, the experimental pKa values regarded as the least reliable are removed.
When Kibitzer finds more ionizable sites than experimental pKas, all the non-assigned sites are labeled as "N/A" (not available). This does not represent a problem for the computation and these centers are simply not considered.
The attribute Class keeps track of such operations:
CLASS A: n. pred. pKa = n. exp. pKa (before filtering); n. pred. pKa = n. exp. pKa (after filtering)
CLASS B: n. pred. pKa > n. exp. pKa (before filtering); n. pred. pKa = n. exp. pKa (after filtering)
CLASS C: n. pred. pKa < n. exp. pKa (before filtering); n. pred. pKa < n. exp. pKa (after filtering)
CLASS D: n. pred. pKa < n. exp. pKa (before filtering); n. pred. pKa = n. exp. pKa (after filtering)
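The four classes above amount to a lookup on how the number of predicted pKas compares with the number of experimental pKas before and after filtering. The Python sketch below encodes that table. The function name and its arguments are hypothetical, and combinations not listed above return `None` instead of guessing a class.

```python
def assignment_class(pred_before, exp_before, pred_after, exp_after):
    """Return 'A', 'B', 'C' or 'D' from pKa counts, or None if not listed.

    Each comparison is reduced to -1 (fewer predicted than experimental),
    0 (equal) or 1 (more predicted than experimental).
    """
    def compare(n_pred, n_exp):
        return (n_pred > n_exp) - (n_pred < n_exp)

    table = {
        (0, 0): "A",    # equal before, equal after
        (1, 0): "B",    # more predicted before, equal after
        (-1, -1): "C",  # fewer predicted before and after
        (-1, 0): "D",   # fewer predicted before, equal after
    }
    key = (compare(pred_before, exp_before), compare(pred_after, exp_after))
    return table.get(key)
```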
Warning: prevent noise in your custom model. You might find that some of the assigned experimental pKa values are accompanied by a warning. The warning indicates not only that the assigned experimental pKa is very different from the predicted one, but also that the system is well parametrized for the structure that you are submitting. Consequently, this pKa value might not be beneficial to the training.
Typically, the warning stems from one of the following:
The assignment is wrong; you can try to correct it manually
MoKa is not parametrized to predict that particular pKa correctly
The experimental pKa reported conflicts with the structure given
This problem can usually be solved by manually correcting the assignment. If this procedure does not work but you are confident that the experimental pKa is correct, disregard the warning. Otherwise, deselect the corresponding pKa value before building the custom models.
Add weight to your experimental pKa values. Ionizable centers that have QP = 0 are very well parametrized and so adding your experimental pKa values may produce little benefit to your customized model. If you wish to add more weight to your experimental pKa values, you can add replicas of the same structures. While it is not possible to predict the number of replicas to add, a minimum of three structures is necessary to obtain a significant effect.
6.3.3. Step 3: Build Models
If you are happy with the current assignments, you can build a new model, which will be stored in a .mkd file. You can also run a full validation by checking full validation in the dialog box Build model.
The validation process is important to improve the model's predictive ability, and it is increasingly important to run it for large training data (over 3000 pKa values). However, you can safely skip this step if your training library is small.
MoKa can import the .mkd file (Edit->Load custom model) and calculate pKas using a custom model, which is based on the MoKa internal database biased by the imported custom pKa database.
6.4. Tips and Troubleshooting
Here are a few tips for using Kibitzer.
Start building a temporary custom model that includes one or two compounds per series. Then load this temporary custom model to ease the assignment of your whole pKa database.
Remember that pKa value warnings might add noise to your models.
Some useful how-tos:
Merge multiple .kib files: with your currently opened .kib file, click Save as... and select the .kib file with which you want to merge. You will be prompted to either overwrite or append. Click append to merge.
Check the benefits of training: Export your Kibitzer project file in an SD file that will include all the assignments set. After this, load the SD file into Versus and select pKa assignments from Kibitzer SD. Now you can check the results with internal and custom models.
Check the effects of training on the pKa prediction models: Tools > Select Internal Model. Kibitzer now loads the custom model selected and indicates the differences with internal models in terms of standard deviation. Differences of more than 0.05 may indicate inconsistencies in your custom model.
Upgrading from older versions. Please note that models saved in .mkd files are NOT compatible with different versions of the software. You can save your current work for future use in a .kib file or in a .SD file, both of which can be safely transferred from one version of Kibitzer to another.
6.5. Capabilities and Limitations
Kibitzer allows you to expand the chemical space covered by MoKa, and the custom model results in more accurate pKa predictions, but only for molecules within the new chemical space explored. To test the capabilities of Kibitzer, you need to benchmark the predictivity of a custom model against a set of molecules of the same series as those added.
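A benchmark of this kind can be summarized by comparing the error of the internal and the custom model on the same held-out molecules. The Python sketch below uses the root-mean-square error as the metric. That choice, the function names, and the assumption that the predictions have been exported as plain lists of numbers in matching order are illustrative only; the manual does not prescribe a metric.

```python
import math


def rmse(predicted, experimental):
    """Root-mean-square error between two equally long lists of pKa values."""
    if len(predicted) != len(experimental) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((p - e) ** 2 for p, e in zip(predicted, experimental))
    return math.sqrt(squared / len(predicted))


def compare_models(experimental, internal_pred, custom_pred):
    """Return the RMSE of each model against the same experimental values."""
    return {
        "internal": rmse(internal_pred, experimental),
        "custom": rmse(custom_pred, experimental),
    }
```

A lower custom RMSE on molecules of the same series as those added would be consistent with the training having had the intended effect. Unrelated test molecules are not expected to show a difference.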
