Summary of the results of docking a library to a target by Autodock-4, Autodock-Vina, eHiTS, PLANTS, GOLD, DOCK or Glide
Aug. 14, 2020
The program Dockres scans the results of docking runs performed with Autodock (version 4), Autodock-Vina, eHiTS, PLANTS, GOLD, DOCK or Glide on a series of ligands. It gathers the top binders and displays a variety of statistics, both on the ligand set and on the top binding poses. The ligands selected can be filtered by a number of criteria.
File convention
Dockres uses a one letter code for the screening software used:
Input of the program
Besides the structure file for the target macromolecule (of the form macro.pdb* or, for GOLD, macro.mol2), Dockres assumes the availability of the following files (macro stands for the name of the macromolecule file without the .pdbqs, .pdbqt, .pdb or .mol2 extension):
Format of the file macro_<sw>.dir:
mm.gpf
1 ligx.mm.dlg
2 ligy.mm.dlg
3 ligz.mm.dlg
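The layout shown above can be read with a few lines of Python. This is only a sketch under assumptions: it takes the first whitespace-separated token to be a header file name (mm.gpf in the example) and the remaining tokens to be (index, ligand result file) pairs. The function name parse_dir_listing is hypothetical and not part of Dockres.

```python
def parse_dir_listing(text):
    """Split a macro_<sw>.dir-style listing into a header and ligand entries.

    Assumption: the first token is a header file name and the remaining
    tokens alternate between an integer index and a ligand result file.
    """
    tokens = text.split()
    if not tokens:
        raise ValueError("empty listing")
    header, rest = tokens[0], tokens[1:]
    if len(rest) % 2 != 0:
        raise ValueError("expected (index, file) pairs after the header")
    entries = [(int(rest[i]), rest[i + 1]) for i in range(0, len(rest), 2)]
    return header, entries


# The example listing from the text; line breaks and spaces are treated alike.
example = "mm.gpf 1 ligx.mm.dlg 2 ligy.mm.dlg 3 ligz.mm.dlg"
header, entries = parse_dir_listing(example)
```

Splitting on any whitespace keeps the sketch indifferent to whether the entries sit on one line or on several.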
In addition, Dockres needs
Dockres can be run either interactively from a terminal or in batch mode, where the run parameters are specified as command-line options. The terminal inputs can also be logged to a file that can be used to rerun a job, possibly after editing it. When compiled with the parallel code included, it has to be run in batch mode (with the concomitant restrictions, vide infra).
In interactive mode it starts by asking for (possibly a subset of) the following information:
The result summary starts with printing on the terminal the list of the top-scoring poses, the number of poses in the top-score ranges, and a plot showing the distribution of the location of poses over the macromolecule's residues. The program then gives the user the option to
If no (more) repetition of the calculations is requested, the program proceeds to the last stage, where it offers the user the following options:
In batch mode the following information can be specified:
Ligand type number list:
H :H-C  = 1  C :>C<  =11  N :>N<  =21  O :C-O-H=31  O :C-O-C=41  O-:C-On =51
H :H-N  = 2  C :>C=  =12  N :-N<  =22  O :N-O-H=32  O :C-O-N=42  O-:P-On =52
H :H-O  = 3  C :C=-C =13  N :-N=  =23  O :O-O-H=33  O :C-O-O=43  O-:S-On =53
H :H-P  = 4  C :C=-N =14  N :-N-=C=24  O :P-O-H=34  O :C-O-P=44  O-:*On* =54
H :H-S  = 5  C :Carom=15  N :*N*  =25  O :S-O-H=35  O :C-O-S=45
H :H*   = 6  C :*C*  =16               O :C=O  =36  O :*OX* =46
                                       O :P=O  =37
                                       O :S=O  =38               P :P*   =58
                                                                 S :S*   =59
                                                                 **:*    =60
> dockres -mm hemoglobin -sw eHiTS -np 20 -ol 2 -ib 99
The first two items (-mm and -sw) are compulsory; preferably, they should be the first two items, so that all warning and error messages can be written to the log file macro_<sw>.res. All other inputs that can be specified in interactive mode take their default values.
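When batch runs are launched from a script, the ordering rule above can be enforced when the command line is assembled. The following is a minimal sketch, not part of Dockres: the helper build_dockres_command is hypothetical, and the only options it uses are the ones that appear in the example command (their meaning is not restated here).

```python
import shlex


def build_dockres_command(macro, software, extra_options=()):
    """Return an argument list with -mm and -sw as the first two options."""
    if not macro or not software:
        raise ValueError("-mm and -sw are compulsory")
    args = ["dockres", "-mm", macro, "-sw", software]
    # Remaining (flag, value) pairs follow the two compulsory options.
    for flag, value in extra_options:
        args += [flag, str(value)]
    return args


cmd = build_dockres_command(
    "hemoglobin", "eHiTS", [("-np", 20), ("-ol", 2), ("-ib", 99)]
)
print(shlex.join(cmd))  # dockres -mm hemoglobin -sw eHiTS -np 20 -ol 2 -ib 99
```

The resulting list can be passed to subprocess.run on a system where the dockres executable is on the search path.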
Batch runs with a flexible macromolecule have not yet been implemented.
Some inputs currently can only be provided in the interactive mode (e.g., hydrogen-bond thresholds, filtering options). To use a non-default option for which no command-line input is implemented, an interactive run is required that can be started from the checkpoint file. It will not be CPU intensive since the time-consuming data gathering has been completed already.
Output of the program
Dockres will create the following files:
The file macro_<sw>.res will contain
A typical example of such output (here for the number of rotatable bonds in a ligand) is
Distribution of number of rotatable bonds over 11 ligands
Average= 4.1818 S.D.= 1.6414
       .00  .00  .18  .18  .18  .36  .00  .00  .09  .00
      +----+----+----+----+----+----+----+----+----+----+
 1.00 |                                                 |
  .90 |                                                 |
  .80 |                                                 |
  .70 |                                                 |
  .60 |                                                 |
  .50 |                                                 |
  .40 |                        |****|                   |
  .30 |                        |****|                   |
  .20 |         |****|****|****|****|                   |
  .10 |         |****|****|****|****|         |****|    |
      +----+----+----+----+----+----+----+----+----+----+
         0    1    2    3    4    5    6    7    8    9
Here the X axis is the value of the property for which the distribution is
calculated;
the Y axis is the fraction of ligands having a particular value of the property;
the numbers on the top give the actual fractions.
In this example, the highest column is for 5. This means that the most common
number of rotatable bonds in this library is 5.
The height of the column reaches 0.4, meaning that between 30% and 40% of
the ligands have 5 rotatable bonds; the actual fraction, 0.36 (36%), is shown on top.
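The numbers printed with this example can be checked by a short calculation. The sketch below assumes the per-value ligand counts, which are not printed by the program but are inferred here from the fractions and the total of 11 ligands (for instance, 0.18 x 11 is about 2). With those counts, the printed average and S.D. are reproduced when the population form of the standard deviation (division by n) is used.

```python
import math

# Ligand counts per number of rotatable bonds. These counts are inferred from
# the printed fractions and the total of 11 ligands; they are an assumption.
counts = {2: 2, 3: 2, 4: 2, 5: 4, 8: 1}

n = sum(counts.values())
fractions = {k: c / n for k, c in counts.items()}
mean = sum(k * c for k, c in counts.items()) / n
# Population standard deviation (divide by n, not n - 1).
variance = sum(c * (k - mean) ** 2 for k, c in counts.items()) / n
sd = math.sqrt(variance)

print(n, round(mean, 4), round(sd, 4))  # 11 4.1818 1.6414
print(round(fractions[5], 2))           # 0.36
```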
+---------+---------+---------+---------+---------+-
10| |
| |
| |
| * |
| * |
| * |
| * |
| * * |
|** * * |
1|** * * |
+---------+---------+---------+---------+---------+-
51 100
+---------+---------+---------+---------+---------+-
10| * |
| * |
| * |
| * |
| * |
| * |
| * |
| * |
| * |
1| * |
+---------+---------+---------+---------+---------+-
101 150
The height of the column is proportional to the number of ligands docked to
the residue represented by the X axis.
The residues to which a ligand is docked can be assigned either by using the residue
where the closest contact is or by using all residues that include atoms on the
contact list.
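The two ways of assigning ligands to residues can be illustrated with a small sketch. It is not the Dockres implementation: the data layout (a list of (residue number, contact distance) pairs per ligand), the function names, and the sample values are all hypothetical.

```python
from collections import Counter


def count_by_closest_contact(contacts_per_ligand):
    """Assign each ligand to the single residue with the shortest contact."""
    counts = Counter()
    for contacts in contacts_per_ligand:
        if contacts:
            residue, _ = min(contacts, key=lambda rc: rc[1])
            counts[residue] += 1
    return counts


def count_by_contact_list(contacts_per_ligand):
    """Assign each ligand to every residue that appears on its contact list."""
    counts = Counter()
    for contacts in contacts_per_ligand:
        # A set ensures a residue is counted once per ligand.
        counts.update({residue for residue, _ in contacts})
    return counts


# Hypothetical data: (residue number, contact distance) pairs for each ligand.
ligands = [
    [(57, 3.1), (58, 3.6)],
    [(57, 2.9), (102, 3.4)],
    [(102, 3.0)],
]
closest = count_by_closest_contact(ligands)
all_contacts = count_by_contact_list(ligands)
```

With the first rule every ligand contributes to exactly one column of the plot; with the second rule a ligand can contribute to several columns, so the column heights can add up to more than the number of ligands.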
Largest count= 7
+---------+---------+---------+---------+---------+-
-4.40| 2 |
|1 2 |
|17 |
| 7 |
| |
|1 4 |
| |
| |
| |
-7.15| |
+---------+---------+---------+---------+---------+-
51 100
+---------+---------+---------+---------+---------+-
-4.40| |
| |
| |
| |
| |
| 4 |
| M |
| 1 |
| 1 |
-7.15| 1 |
+---------+---------+---------+---------+---------+-
101 150
Here again the X axis represents the residue number and the Y axis the docking
energy or free energy.
A character at a given position indicates that ligands were docked to
the corresponding residue with the corresponding docking (free) energy.
M (instead of a digit) marks the residue/energy combination
with the largest number of members (the value is given before the plots);
the digits 0 - 9 represent proportionally smaller numbers of members.
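The character shown for each residue/energy cell can be sketched as follows. The text only states that the digits are proportional to the number of members, so the exact rule used here, floor(10 * count / largest), is an assumption, as is the function name cell_symbol; Dockres may scale or round differently.

```python
def cell_symbol(count, largest):
    """Map a cell's ligand count to the character shown in the plot.

    Assumed scaling: floor(10 * count / largest), with 'M' for the largest
    count and a blank for empty cells. Dockres may round differently.
    """
    if count <= 0:
        return " "
    if count >= largest:
        return "M"
    return str((10 * count) // largest)


largest = 7  # value printed as "Largest count" before the plots
symbols = [cell_symbol(c, largest) for c in range(0, 8)]
print(symbols)  # [' ', '1', '2', '4', '5', '7', '8', 'M']
```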
Compilation of the program
The program is written in Fortran 77. Its size is governed by the parameters (the number between the braces is the value set in the source code), established in the first line of the program
The program uses several arrays of size MAXMOL*MAXPOSE, dominating the memory requirement.
It should be compiled at the highest optimization level for maximum speed. For example, using the f77 compiler the compilation can be executed by
f77 -O4 -o dockres.exe dockres.f
Some compilers fail with a so-called 'relocation error' when an optimization
level higher than one is requested.
When using the Intel Fortran compiler (ifort), adding the compiler options
-mcmodel=medium -shared-intel
solved the problem. With some of the other compilers (but not the GNU compiler)
the compiler option
-fpic was found to solve the problem.
The optional parallelization uses the MPI library. Note that this requires running in batch mode, with the concomitant restrictions. Furthermore, the parallel version does not work for DOCK or Glide. To compile Dockres with the parallel code included, first remove the 'C@DM' string from the source code:
> cat dockres.f | sed 's/C@DM//' > dockres_mpi.f
> f77mpi -o dockres -O4 dockres_mpi.f
The name of the MPI-enabled compiler may be different on your system, and additional libraries may also need to be linked.
For parallelized runs, the parameter MAXMOL can be set to less than the total number of ligands: it only has to be large enough to hold the data for Nmolec/NCPU ligands. In this case, however, the program stops after writing the checkpoint file, and a separate single-CPU run, using an executable compiled with MAXMOL set large enough to hold all ligands, should be used to print/save the results. This option is useful for distributed-memory systems where the majority of nodes have relatively small memory.
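The lower bound on MAXMOL for a parallel run is a simple ceiling division. The sketch below assumes that the ligands are split as evenly as possible over the processes, which the text implies but does not state; the function name and the sample numbers are hypothetical.

```python
def min_maxmol_per_process(n_ligands, n_cpus):
    """Smallest MAXMOL that can hold one process's share of the ligands.

    Assumption: ligands are split as evenly as possible over the processes.
    """
    if n_cpus < 1:
        raise ValueError("need at least one process")
    # Integer ceiling of n_ligands / n_cpus.
    return (n_ligands + n_cpus - 1) // n_cpus


# Hypothetical library of 100000 ligands processed on 64 CPUs.
print(min_maxmol_per_process(100000, 64))  # 1563
```

The single-CPU run that reads the checkpoint file still needs MAXMOL to be at least the full number of ligands (100000 in this hypothetical case).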
Note that if Fortran-90 is used for one compilation, then it should be used for the version reading the checkpoint file as well, otherwise the binary files will be incompatible.