Protocol for Partial (e.g., Loop) Optimization with MMC

Mihaly Mezei

Department of Pharmacological Sciences
Icahn School of Medicine at Mount Sinai
New York, NY 10019

The loop optimization described in Ref. 1 represents the field of the bulk of the macromolecule that is kept rigid by potential grids. Other potential application can be the optimization of a peptide-protein complex. Such calculations require the following steps when using MMC:

RUN #0: Sample input file macro.inp:

FILE maxro
PRNT ECHO
!DBUG 2 170 1  53 1
HRDW VC32          ! 32-bit vector
SVVC SPCC 9.05     ! Solvent cutoff
SUVC MIGC 0.0      ! MI on the solute
!PBCN RECT  111.70  85.62  73.47
PBCN SPHR 200
MOVE RAND          ! Random selections
TEMP 5000
SUUC SPGC 20.0
SETC DDIE 78.4 6.02944 0.018733345 213.5782 8.0
NSLV 00   !100 solvent molecules to start with
STEP  0.00   00.0   0.55    40.0  60
! slt, slv stepsize, slt move freq.
SVPT TIP3 TIP3
SUPT AM02
SAMP METC 0.5
FLXR PDB 98
ATOM      3  CA  ILE X   3      -6.344  -9.312   8.947  1.00  0.00           C
ATOM     22  CA  GLN X   5      -0.828  -8.270   4.506  1.00  0.00           C
ATOM     39  CA  GLU X   7       5.317  -9.338   0.624  1.00  0.00           C
ATOM     54  CA  PHE X  18       2.209 -11.037  -2.060  1.00  0.00           C
...
ATOM   1525  CA  VAL X 373      -7.867  -1.737  -0.043  1.00  0.00           C
ATOM   1541  CA  VAL X 374     -10.415  -4.657  -0.060  1.00  0.00           C
ATOM   1557  CA  SER X 375     -13.108  -5.217  -2.770  1.00  0.00           C
ATOM   1568  CA  SER X 376     -16.949  -5.315  -3.172  1.00  0.00           C
ATOM   1579  CA  LEU X 377     -19.113  -6.741  -6.028  1.00  0.00           C
ATOM   1598  CA  ALA X 378     -22.324  -5.609  -7.813  1.00  0.00           C
SLTA SMPL MMC FILE 6088 0 0 0 1 maxro
!BNDL DIST
DSTC NONE
MVRT TVAL
PARD EXBI SHCY 0.05 0.5 30.0 0.4 0.5 30.0 0.4 0.5 30.0 0.2 -231
PART EXBI CYCL SHCY 0.95
LOOP PROX TRNG 1.0
PRMF AM02 parm14ipq.dat
TORD INPT 262
   38   40   90.0000   90.0000    4    1       1.0     180.0 ! iresa=    3 ILE
   40   46   90.0000   90.0000    5    1       1.0     180.0 ! iresa=    3 ILE
   76   78   90.0000   90.0000    8    1       1.0     180.0 ! iresa=    5 GLN
   78   81   90.0000   90.0000    9    1       1.0     180.0 ! iresa=    5 GLN
   81   84   90.0000   90.0000   10    1       1.0     180.0 ! iresa=    5 GLN
  103  105   90.0000   90.0000   11    1       1.0     180.0 ! iresa=    7 GLU
  105  108   90.0000   90.0000   12    1       1.0     180.0 ! iresa=    7 GLU
  108  111   90.0000   90.0000   13    1       1.0     180.0 ! iresa=    7 GLU
  268  270   90.0000   90.0000   33    1       1.0     180.0 ! iresa=   18 PHE
  270  273   90.0000   90.0000   34    1       1.0     180.0 ! iresa=   18 PHE
...
 6052 6055   90.0000   90.0000 1934    1       1.0     180.0 ! iresa=  376 SER
 6057 6059   90.0000   90.0000 1935    1       1.0     180.0 ! iresa=  376 SER
 6059 6061   90.0000   90.0000 1936    1       1.0     180.0 ! iresa=  377 LEU
 6061 6063   90.0000   90.0000 1937    1       1.0     180.0 ! iresa=  377 LEU
 6061 6076   90.0000   90.0000 1938    1       1.0     180.0 ! iresa=  377 LEU
 6063 6066   90.0000   90.0000 1939    1       1.0     180.0 ! iresa=  377 LEU
 6076 6078   90.0000   90.0000 1940    1       1.0     180.0 ! iresa=  377 LEU
 6078 6080   90.0000   90.0000 1941    1       1.0     180.0 ! iresa=  378 ALA
 6080 6086   90.0000   90.0000 1942    1       1.0     180.0 ! iresa=  378 ALA
AROM CHRM 3 CA  CPH1 CPH2
ENHB SMPL SMPL 3.0 1.65 1.0 -1.0 1 1 20
HBMO ADDC D>NH AC=O 29513.0 10628.0
SVIN CARA 24.5 1.5
!---- End of part common to all runs

RUN #0-p: Peptide constraint initialization

CNST INIT NOCT 13 ~
 6049  771  8.7 50.0 5.0 ~
 6035  305 15.1 50.0 5.0 ~
 6028  325 15.0 50.0 5.0 ~
 6011   57 11.5 50.0 5.0 ~
 5991   76  9.7 50.0 5.0 ~
 5985 2986  8.5 50.0 5.0 ~
 5965 2972  9.3 50.0 5.0 ~
 5955 3245 11.5 50.0 5.0 ~
 5941 3269  9.3 50.0 5.0 ~
 5920 3290 12.2 50.0 5.0 ~
 5901 3388 12.4 50.0 5.0 ~
 5887 1070  7.1 50.0 5.0 ~
 5867 1051  6.8 50.0 5.0

RUN #1 - Last part of input for the first test run:

CNFG RANI ASCI  ! Create initial configuration by inserting the molecule
RUNS 000001000 100000 20000000 100000 100000Y

RUN #2 - Last part of input for additional test runs:

CNFG READ ASCI  ! Read the coordinates of the molecule
RUNS 000001000 100000 20000000 100000 100000

RUN #3 - Last part of input to create the grid maps:

CNFG READ ASCI  ! Create the grid map
SPST CMPW 161 99999.0 1.0 50.0 -7.2 -5.2 -2.5  15.0 ! EDIT !

RUN #4 - Last part of input for the high-temperature run to create starting configurations for the annealing runs

CNFG READ ASCI
SPST EMAP 161  2.0
STUN TRAN PICL  0.3 0.999 2000 +10
STUN ROTA PICL  0.3 0.999 2000 +10
STUN TORS PICL 0.01 0.999 2000 +10
STUN LOOP PICL 0.01 0.999 2000 +10
MINE TRAJ
TRAJ ALLB FLEX ALST
RUNS 200000000 1000000 20000000 100000  
!    # of MC steps              Frequency of wrtiting to the .hst file
STOP

RUN #5 - Last part of input for the simulated annealing runs:

CNFG READ ASCI
SPST EMAP 161  100000
!Run without tuning using large step size to escape from
!possible repulsive traps
RUNS 1000000  10000 10000000 40000 100000 100000 100001
RUNS 1000000  10000 10000000 40000 100000 100000 100001
STUN TRAN PICL  0.2 0.999 2000 +10
STUN ROTA PICL  0.2 0.999 2000 +10
STUN TORS PICL  0.2 0.999 2000 +10
STUN LOOP PICL  0,2  .999 2000 +10
MINE ALLC
SANN EXPO 4000000 0.0 4000.0
RUNS 16000000  10000 10000000 40000 100000 100000 100001
RMCK -2
STUN TORS PICL  0.3 0.999 2000 +10
STUN LOOP PICL  0.3 0.999 2000 +10
STUN TRAN PICL  0.3 0.999 2000 +10
STUN ROTA PICL  0.3 0.999 2000 +10
SANN EXPO 4000000 0.0 3000.0
RUNS 20000000  10000 10000000 40000 100000 100000 100001
RMCK -2
STUN TORS PICL  0.3 0.999 2000 +10
STUN LOOP PICL  0.3 0.999 2000 +10
STUN TRAN PICL  0.3 0.999 2000 +10
STUN ROTA PICL  0.3 0.999 2000 +10
SANN EXPO 4000000 0.0 2000.0
RUNS 32000000  10000 10000000 40000 100000 100000 100001
RMCK -2
STUN TORS PICL  0.3 0.999 2000 +10
STUN LOOP PICL  0.3 0.999 2000 +10
STUN TRAN PICL  0.3 0.999 2000 +10
STUN ROTA PICL  0.3 0.999 2000 +10
SANN EXPO 4000000 0.0 1000.0
RUNS 32000000  10000 10000000 40000 100000 100000 100001
RMCK -2
STUN TORS PICL  0.3 0.999 2000 +10
STUN LOOP PICL  0.3 0.999 2000 +10
STUN TRAN PICL  0.3 0.999 2000 +10
STUN ROTA PICL  0.3 0.999 2000 +10
SANN EXPO 4000000 0.0 500.0
RUNS 32000000  10000 10000000 40000 100000 100000 100001
RMCK -2
STUN TORS PICL  0.3 0.999 2000 +10
STUN LOOP PICL  0.3 0.999 2000 +10
STUN TRAN PICL  0.3 0.999 2000 +10
STUN ROTA PICL  0.3 0.999 2000 +10
SANN EXPO 4000000 0.0  10.0
RUNS 32000000  10000 10000000 40000 100000 100000 100001
RMCK -2
STUN TORS PICL  0.3 0.999 2000 +10
STUN LOOP PICL  0.3 0.999 2000 +10
STUN TRAN PICL  0.3 0.999 2000 +10
STUN ROTA PICL  0.3 0.999 2000 +10
SANN LINE 4000000 0.0 0.01
RUNS 20000000  10000 10000000 40000 100000 100000 100001
RMCK -2

RUN #5-p - Last part of input for the simulated annealing runs with constraints:

CNFG READ ASCI
SPST EMAP 161  100000
RUNS 1000000  10000 10000000 40000 100000 100000 100001
STUN TRAN PICL  0.3 0.999 2000 +10
STUN ROTA PICL  0.3 0.999 2000 +10
STUN TORS PICL 0.01 0.999 2000 +10
STUN LOOP PICL 0.01 0.999 2000 +10
CNST UPDT NOCT 13 ~
  8.7 50.0 5.0 ~
...
  6.8 50.0 5.0
MINE ALLC
SANN EXPO 4000000 0.0 4000.0
RUNS 16000000  10000 10000000 40000 100000 100000 100001
RMCK -2
STUN TORS PICL  0.3 0.999 2000 +10
STUN LOOP PICL  0.3 0.999 2000 +10
STUN TRAN PICL  0.3 0.999 2000 +10
STUN ROTA PICL  0.3 0.999 2000 +10
CNST UPDT NOCT 13 ~
  8.7 50.0 4.0 ~
...
  6.8 50.0 4.0
SANN EXPO 4000000 0.0 3000.0
RUNS 20000000  10000 10000000 40000 100000 100000 100001
RMCK -2
STUN TORS PICL  0.3 0.999 2000 +10
STUN LOOP PICL  0.3 0.999 2000 +10
STUN TRAN PICL  0.3 0.999 2000 +10
STUN ROTA PICL  0.3 0.999 2000 +10
CNST UPDT NOCT 13 ~
  8.7 50.0 3.0 ~
...
  6.8 50.0 3.0
SANN EXPO 4000000 0.0 2000.0
RUNS 32000000  10000 10000000 40000 100000 100000 100001
RMCK -2
STUN TORS PICL  0.3 0.999 2000 +10
STUN LOOP PICL  0.3 0.999 2000 +10
STUN TRAN PICL  0.3 0.999 2000 +10
STUN ROTA PICL  0.3 0.999 2000 +10
CNST UPDT NOCT 13 ~
  8.7 50.0 2.0 ~
...
  6.8 50.0 2.0
SANN EXPO 4000000 0.0 1000.0
RUNS 32000000  10000 10000000 40000 100000 100000 100001
RMCK -2
STUN TORS PICL  0.3 0.999 2000 +10
STUN LOOP PICL  0.3 0.999 2000 +10
STUN TRAN PICL  0.3 0.999 2000 +10
STUN ROTA PICL  0.3 0.999 2000 +10
CNST UPDT NOCT 13 ~
  8.7 50.0 2.0 ~
...
  6.8 50.0 2.0
SANN EXPO 4000000 0.0 500.0
RUNS 32000000  10000 10000000 40000 100000 100000 100001
RMCK -2
STUN TORS PICL  0.3 0.999 2000 +10
STUN LOOP PICL  0.3 0.999 2000 +10
STUN TRAN PICL  0.3 0.999 2000 +10
STUN ROTA PICL  0.3 0.999 2000 +10
CNST UPDT NOCT 13 ~
  8.7 50.0 1.0 ~
...
  6.8 50.0 1.0
SANN EXPO 4000000 0.0  10.0
RUNS 32000000  10000 10000000 40000 100000 100000 100001
RMCK -2
STUN TORS PICL  0.3 0.999 2000 +10
STUN LOOP PICL  0.3 0.999 2000 +10
STUN TRAN PICL  0.3 0.999 2000 +10
STUN ROTA PICL  0.3 0.999 2000 +10
SANN LINE 4000000 0.0 0.01
RUNS 20000000  10000 10000000 40000 100000 100000 100001
RMCK -2

C-shell script #1 to submit the annealing jobs in parallel under the LSF queuing system. For other queuing system, change the qsub command accordingly.

#!/bin/csh
echo -n "Name of the MMC input file template for annealing (without the .inp)="
set macro=($<)
echo -n "Number of CPU's to use [100]="
set ncpu=($<)
if ($ncpu == '') then
  set ncpu = 100
endif
echo -n "Name of the que to submit the jobs [premium]="
set quename=($<)
if ($quename == '') then
  set quename = 'premium'
endif
if ($quename == 'premium') then
  set account = 'premium'
else
  echo -n "Name of the account to use="
  set account=($<)
endif
set wt = '24'
echo -n 'Wall time in hours ['$wt']='
set wt =($<)
if ($w != '') then
  if ($w > 24) then
    echo Maximum time for premium runs is 24 hrs
    exit
  endif
  set wt = $w
endif
set walltime = `echo "-W"$wt":00"`
echo -n "Name of the MMC executable with full path="
set mmc=($<)
set sleepsec = 10
set homedir = `pwd`
echo Runing $mmc in directory $homedir  on $ncpu CPUs for maximum $wt hours
echo -n "Number of annealing jobs to run="
set nlist=($<)
set ndone = 0
while ($ndone < $nlist)
@ ndone = $ndone + 1
# Count the number of jobs (with name ANNEAL) running and wait if needed`
  set njobs = `qstat|grep $user|grep -c ANNEAL`
  while ($njobs >= $ncpu)
    sleep $sleepsec
    set njobs = `qstat|grep $user|grep -c ANNEAL`
  end
# submit_anneal.csh is the submit script for one job - see below
  cat $macro.inp | sed 's/FILE '$macro'/FILE '$macro'_'$ndone'/' | sed 's/CLST_R
  set job = $macro'_'$ndone
  echo Submitting job $job
  bsub -q $quename -P $account -J ANNEAL_$ndone $walltime "$homedir/submit_annea
end

C-shell script #2 (anneal.csh) to run one annealing job under the LSF queuing system. It is called by the script above. For other queing system, change the # directives accordingly.

#!/bin/csh
##Job submission script using the LSF queuing system.
#BSUB -o %J.log
#BSUB -e %J.err
#BSUB -n 1
#BSUB -L /bin/csh
#
cd $LS_SUBCWD
set homedir = $1
set mmc = $2
set job = $3
cd $homedir
$mmc < $job.inp >& $job.out

C-shell script #3 (renamepdb.csh) to rename the PDB files for use by Simulaid

#!/bin/csh
echo -n "Root of the PDB file names (i.e, omit the .*.pdb)="
set f0=($<)
set nrun = 8
set ndone = 0
while ($ndone < 100)
@ ndone = $ndone + 1
  set f1 = $f0'_'$ndone.$nrun.pdb
  set f2 = $f0.$ndone.pdb
  echo RENAME $f1 $f2
  mv $f1 $f2
end

REFERENCE:
1. M. Cui, M. Mezei, and R. Osman, Prediction of protein loop structures using a local move Monte Carlo approach and a grid-based force field, Protein Eng. Des. Sel., 21, 729-735 (2008).

Last modified: 09/05/2020 (MM)