Chapter 5. MoKa command-line

The command-line version of MoKa is designed for running large batch computations and for integrating the MoKa prediction engine into an existing workflow.

Example usage:

moka_cli -o output.txt -s FILE_ID --load-model=mymodel.mkd --show-qp input.sdf
This reads the file input.sdf and writes the results to output.txt, using the custom model mymodel.mkd and including the QP value in the output. The names of the structures are taken from the SD field <FILE_ID>.
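When moka_cli is called from a script, the command above can be assembled programmatically. The following Python sketch is only an illustration: the helper names build_moka_command and run_moka are hypothetical, only options listed in this chapter are used, and it assumes that moka_cli is on the PATH and signals failure with a non-zero exit status, which this chapter does not state.

```python
import subprocess


def build_moka_command(input_path, output_path, sd_field=None,
                       model=None, show_qp=False, executable="moka_cli"):
    """Return the argument list for a moka_cli batch run (hypothetical helper)."""
    cmd = [executable, "-o", output_path]
    if sd_field:
        # -s / --sd-field: SD field used to retrieve the structure ID
        cmd += ["-s", sd_field]
    if model:
        # --load-model: custom model database used for predictions
        cmd.append(f"--load-model={model}")
    if show_qp:
        # --show-qp: add QP values to the output
        cmd.append("--show-qp")
    cmd.append(input_path)
    return cmd


def run_moka(cmd):
    """Run the command; check=True raises CalledProcessError on a non-zero
    exit status (assumption: moka_cli reports errors that way)."""
    return subprocess.run(cmd, check=True, capture_output=True, text=True)


if __name__ == "__main__":
    command = build_moka_command("input.sdf", "output.txt",
                                 sd_field="FILE_ID",
                                 model="mymodel.mkd",
                                 show_qp=True)
    print(" ".join(command))
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when file names contain spaces.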

The full set of available options is shown when you run moka_cli without parameters:

usage: ./moka_cli [options] filename
options are:
 -h, --help                     display this help
     --version                  show version info
 -v, --verbose                  show all warning messages
     --input-type=<sd|mol2>     input file type ( autodetect )
 -o, --output=FILE              output file name
     --output-type=<txt|sd>     output file type ( txt )
 -s, --sd-field=NAME            use the specified SD field to retrieve
                                the structure ID
     --load-model=<database>    use a custom model database for predictions
     --acid-values=NUM          output at most NUM acid values ( all )
     --basic-values=NUM         output at most NUM basic values ( all )
     --acid-lo-limit=VALUE      low cutoff value for acid pKa ( -10.0 )
     --acid-hi-limit=VALUE      high cutoff value for acid pKa ( 20.0 )
     --basic-lo-limit=VALUE     low cutoff value for basic pKa ( -10.0 )
     --basic-hi-limit=VALUE     high cutoff value for basic pKa ( 20.0 )
     --show-qp                  add qp values to output
     --hide-sd                  remove SD values from output
     --logp=<value>             use <value> for logP instead of internal
                                prediction engine
     --logd=<pH:logD>           calculate logP from the given logD value.
                                implies:
                                   --show-logp
                                conflicts with:
                                   --logp
                                   --logp-attribute
                                   --show-logd
     --logp-attribute=<name>    read logP value from SD attribute <name>
     --show-logp                add logP prediction to output (sd only)
     --show-logd=<pH list>      add logD prediction to output (sd only)
     --tautomer                 replace input structure with most suitable
                                tautomer

<pH list> is a comma-separated list of the following:
    * single value
    * range ( min-max )
Examples:
    * single value:    --show-logd=6
    * single range:    --show-logd=6-10
    * multiple values: --show-logd=6,7,8
    * mixed:           --show-logd=6,7-10
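A wrapper script may want to validate a <pH list> before passing it to --show-logd. The Python sketch below follows the syntax described above; the function name parse_ph_list is hypothetical, and the sketch assumes non-negative pH values, since a leading minus sign would be ambiguous with the range separator.

```python
def parse_ph_list(spec):
    """Parse a <pH list> string into single values and (min, max) ranges.

    Single values become floats; ranges become (min, max) tuples.
    Assumes non-negative pH values (illustrative helper, not part of MoKa).
    """
    items = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            raise ValueError(f"empty item in pH list: {spec!r}")
        if "-" in part:
            low_text, high_text = part.split("-", 1)
            low, high = float(low_text), float(high_text)
            if low > high:
                raise ValueError(f"range minimum exceeds maximum: {part!r}")
            items.append((low, high))
        else:
            items.append(float(part))
    return items


if __name__ == "__main__":
    print(parse_ph_list("6,7-10"))
```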

5.1. Output format

The output capabilities of MoKa depend on the chosen output format. The plain text (txt) format is useful when only pKa predictions are required, while the SD format can include all other data together with the molecular structure.

5.1.1. Plain text (txt) format

When using the plain text (txt) format, the output file has one line for each input compound. Each line has the following format:

NAME CH UT IC ( a|b PK ATOM SD QP ) * IC

And the fields are:

  • NAME - the molecule name

  • CH - covalent hydration flag: 0 (false) or 1 (true)

  • UT - unstable tautomer flag: 0 (false), 1 (true) or - (not computed)

  • IC - number of ionizable centers

Then the group in parentheses is repeated IC times, once for each ionizable center:

  • a|b - type of ionizable center (acid or basic)

  • PK - predicted pKa value

  • ATOM - atom number

  • SD - standard deviation of prediction

  • QP - quality parameter

Note: By default, the SD value is printed while the QP value is not. Use --hide-sd and --show-qp to change this.

This is what the output .txt file looks like:

Figure 5-1. Results of batch calculation exported to .txt
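For downstream processing, a result line in the format described in this section can be split into its fields. The Python sketch below is a minimal illustration, not part of MoKa: the names IonizableCenter, MokaResult and parse_result_line are hypothetical, it assumes that fields are separated by whitespace and that molecule names contain no spaces, and it keeps QP as raw text because its format is not described here. The has_sd and has_qp arguments must match the --hide-sd and --show-qp options used for the run.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class IonizableCenter:
    kind: str             # "a" (acid) or "b" (basic)
    pka: float            # predicted pKa value
    atom: int             # atom number
    sd: Optional[float]   # standard deviation, None if hidden
    qp: Optional[str]     # quality parameter as raw text, None if not shown


@dataclass
class MokaResult:
    name: str
    covalent_hydration: bool
    unstable_tautomer: Optional[bool]  # None when the flag is "-"
    centers: List[IonizableCenter]


def parse_result_line(line, has_sd=True, has_qp=False):
    """Parse one line: NAME CH UT IC ( a|b PK ATOM SD QP ) * IC."""
    tokens = line.split()
    if len(tokens) < 4:
        raise ValueError(f"too few fields: {line!r}")
    name, ch, ut = tokens[0], tokens[1], tokens[2]
    count = int(tokens[3])
    width = 3 + int(has_sd) + int(has_qp)  # fields per ionizable center
    rest = tokens[4:]
    if len(rest) != count * width:
        raise ValueError(f"expected {count * width} center fields, got {len(rest)}")
    centers = []
    for i in range(count):
        chunk = rest[i * width:(i + 1) * width]
        extras = iter(chunk[3:])
        sd = float(next(extras)) if has_sd else None
        qp = next(extras) if has_qp else None
        centers.append(IonizableCenter(chunk[0], float(chunk[1]), int(chunk[2]), sd, qp))
    unstable = None if ut == "-" else ut == "1"
    return MokaResult(name, ch == "1", unstable, centers)


if __name__ == "__main__":
    print(parse_result_line("Pyridine 0 - 1 b 4.97 5 0.51"))
```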

5.1.2. SD file format

When using the SD format as output, the result line with pKa predictions is added to an SD attribute named "MoKa". The line is formatted in the same way as the txt output format.

Additionally, when the LogP and LogD options are active, two attributes named "MoKa.LogP" and "MoKa.LogD" are associated with the structure; the former has a single value, while the latter consists of one or more [pH: LogD] pairs.

A sample output is shown below:

$ moka_cli --show-logp --show-logd=4-8 --output-type=sd pyridine.sdf 
Pyridine


  6  6  0  0  0              1 V2000
    0.0000    1.5000    0.0000 C   0  0  0  0  0  0
    0.8500    2.0000    0.0000 C   0  0  0  0  0  0
    1.7250    1.5000    0.0000 C   0  0  0  0  0  0
    1.7250    0.5000    0.0000 C   0  0  0  0  0  0
    0.8500    0.0000    0.0000 N   0  0  0  0  0  0
    0.0000    0.5000    0.0000 C   0  0  0  0  0  0
  4  5  2  0
  5  6  1  0
  2  3  2  0
  1  2  1  0
  3  4  1  0
  1  6  2  0
M  END
>  <MoKa> v1.1.0-RC1
Pyridine 0 - 1 b 4.97 5 0.51 

>  <MoKa.LogD> 
4: -0.27
4.5: 0.15
5: 0.46
5.5: 0.63
6: 0.7
6.5: 0.73
7: 0.74
7.5: 0.74
8: 0.74

>  <MoKa.LogP> 
0.74

$$$$
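The attributes in a record like the one above can be read back with a few lines of Python. This is a rough sketch with hypothetical helper names (read_sd_attribute, parse_logd); it assumes the layout seen in the sample, where each attribute header starts with ">" and its values run until the next blank line. For production use, a cheminformatics toolkit's SD reader would be more robust.

```python
def read_sd_attribute(record, name):
    """Return the value lines of SD attribute <name>, or None if absent."""
    lines = record.splitlines()
    for index, line in enumerate(lines):
        if line.startswith(">") and f"<{name}>" in line:
            values = []
            for value_line in lines[index + 1:]:
                stripped = value_line.strip()
                # Values end at a blank line, the next header, or the record end
                if not stripped or stripped.startswith(">") or stripped == "$$$$":
                    break
                values.append(stripped)
            return values
    return None


def parse_logd(record):
    """Map pH to LogD from the "MoKa.LogD" attribute ("pH: LogD" per line)."""
    result = {}
    for entry in read_sd_attribute(record, "MoKa.LogD") or []:
        ph_text, logd_text = entry.split(":", 1)
        result[float(ph_text)] = float(logd_text)
    return result


if __name__ == "__main__":
    sample = (
        ">  <MoKa.LogD> \n"
        "4: -0.27\n"
        "4.5: 0.15\n"
        "\n"
        ">  <MoKa.LogP> \n"
        "0.74\n"
        "\n"
        "$$$$\n"
    )
    print(parse_logd(sample))
    print(read_sd_attribute(sample, "MoKa.LogP"))
```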
