Protocol for obtaining solvent sites with Grand-Canonical Ensemble simulations using MMC

Mihaly Mezei

Department of Phatmacological Sciences
Icahn School of Medicine at Mount Sinai
New York, NY 10019

Obtaining solvent sites requires the following steps when using MMC:

  1. Use Simulaid to optimize the orientation of your solute in a bounding box (Note: NOT in a periodic cell!). Let it center the sysytem to (0,0,0) and note the optimized edge lengths from Simulaid.s output:
    === Edges of the smallest box still enclosing the solute:
    === Optimum:   33.37131  48.39131  39.04523 A vol=   63053.4 A^3
                    Ex(opt)   Ey(opt)   Ez(opt)
    
  2. Use Simulaid to generate an .slt file (the solute description with potential codes). This is one of the conversion options. You may have to change PDB atom/residue names to Charmm (or Amber) convention first (a other conversion option of Simulaid). Unless you use a united atom model, make sure your structure contains the hydrogens too.
  3. Set up the GCE input for MMC (following the examples):
    • Add 10-13Å on either side to the edges of the bounding box to get your periodic cell sizes and define the inner cube of the MOND key to be about 5Å away from the simulation box:
      PBCN RECT 57.0 72.0 63.0
      MOND .23.5 23.5 .31 31 .26.5 26.5
      
    • Use the BTUN key to target 0.997 g/ml density in the bulk water region:
      BTUN PICL 0.997
      
    • Use
      NSLV 10
      
      and
      CNFG RANI ASCI
      
      to generate an initial configuration of 10 randomly placed waters (don't use CNFG RANC since that would shift the solute out of the bounding box).
    • Once MMC is running, check the list of charges for the residues (called groups by MMC) - non-integer charges indicate that some atoms (usually involving terminal and/or patch residues (e.g., disulphide bonds) have their charges incorrectly set (this is a known limitation of Simulaid).
  4. Run a first a million MC steps with the B parameter set at the input (with the GCEN key) to a positive number, say, 1.0. This will mostly fill the simulation box.
  5. Repeatedly run 20 million (or more) steps until the number of waters stops increasing and the energy stops decreasing.
  6. Once the density is about to be settled to an acceptable value, save every, say, 10000-th configuration in an .hst file (see the TRAJ and RUNS keys).
  7. Since this trajectory still contains lots of solvent waters, use the FILT key to eliminate the outside waters. The option we like uses the circular variance - a single cutoff can be chosen to include waters only way inside, or near the surface, etc. Alternative filtering can be done by solvation shells and solvation energies (this latter is useful to get rid of waters solvating the membrane-bound side of the protein.
  8. This filtered trajectory can be used to generate the so-called generic sites, using the GENS key. These sites will have RMS and fractional occupancy attached to them so you will have some sense how certain MMC is in each.
  9. From the generic sites MMC will also create a representative configuration (the configuration with low RMS from the sites found and few sites missing). Also, MMC will scan the trajectory and for each site with occupancy above a threshold value (specified on input) it will search for a solvent molecule that is the closest to that site. This way you will have a composite file with actual waters in a reasonable orientation. However, these waters may be inconsistent with each other (e.g., too close). Finally, sites within a threshold distance can be clustered together (these usually represent alternative sites with low occupancy each).

Last modified: 06/04/2019 (MM)