INTOCHAM: conversion from InsightII to Charmm/Amber/Moil

Mihaly Mezei

Department of Pharmacological Sciences,

Icahn School of Medicine at Mount Sinai

New York, NY 10029

Mihaly.Mezei@mssm.edu

Nov. 4, 1998.

The program INTOCHAM converts a structure file (.car file) produced by InsightII into input for any of the following: Charmm (with either the Quanta or the Charmm 2.2 atom types), Amber, or Moil.

Atomic charges assigned by InsightII are simply transferred. The conversion assumes the availability of a Charmm or Amber parameter file (whichever is applicable). The currently built-in atom type conversions are based on the Polygen/Quanta parameter file, the Charmm 2.2 protein parameter file, or the Amber force field distributed with Amber 4.0. The program can also convert a Charmm COOR file back to InsightII, as long as the original InsightII .car file and the resequencing information generated during the conversion (vide infra) are available.

At start, the program will quiz you about the conversion type (Insight to Quanta or to Charmm 2.2, Insight to Amber with or without potential conversion, Insight to Moil, back from Charmm to Insight).

For conversions from Insight, the program will ask for the following information:

The program then sets up the bonding lists and converts the types based on the table below when the CVFF force field is used in the .car file (first 48 conversion types). In some instances, ambiguities will be resolved based on the bond list (see conversion types above 80). The program will inform you if some of the CVFF types can be converted in more than one way or if no atomtype has been assigned to a CVFF type.
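The lookup and the reporting of ambiguous or missing conversions can be pictured with the following minimal sketch. It is not INTOCHAM's own code: the dictionary holds only a few entries copied from the CHARMm/Quanta column of the table below, and the names CVFF_TO_QUANTA and convert_types are hypothetical.

```python
# Hypothetical excerpt of the CVFF -> CHARMm/Quanta column of the table.
# Each CVFF type maps to the list of candidate target types.
CVFF_TO_QUANTA = {
    "h":  ["HA"],        # hydrogen bonded to C
    "hn": ["H", "HC"],   # neutral vs. charged: more than one candidate
    "c'": ["C"],         # carbonyl carbon
    "n=": [],            # the table lists no conversion for this type
}

def convert_types(cvff_types, table=CVFF_TO_QUANTA):
    """Return (new_types, ambiguous, missing) for a list of CVFF atomtypes.

    new_types holds the first candidate (or None when there is none);
    ambiguous and missing hold the indices of atoms the user should review.
    """
    new_types, ambiguous, missing = [], [], []
    for i, cvff in enumerate(cvff_types):
        candidates = table.get(cvff, [])
        if not candidates:
            missing.append(i)
            new_types.append(None)
            continue
        if len(candidates) > 1:
            ambiguous.append(i)
        new_types.append(candidates[0])
    return new_types, ambiguous, missing

new_types, ambiguous, missing = convert_types(["h", "hn", "c'", "n="])
print(new_types)   # ['HA', 'H', 'C', None]
print(ambiguous)   # [1]
print(missing)     # [3]
```

Atoms reported as ambiguous or missing are the ones whose types would have to be reviewed by hand.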

After the type conversion, you have the option to change the new type of any of the atoms (i.e., override the program's 'judgement') - a must when no conversion has been assigned by the program. It is important that any such change be made at this point because the choice of improper torsion terms in Charmm depends on the types of the atoms and changing atomtypes after the improper torsion terms are selected may result in missing some of them. Thus if messages about ambiguous or missing conversions were received, the conversion may have to be repeated. After having examined the structure with this list in hand one can decide about the necessary changes. The new atomnames can be identified with their original sequence numbers by looking at the generated Charmm COOR file.

The program also gives you the option to switch to united atom representation of the carbons (with CVFF input only). In this case hydrogens bonded to carbons are dropped, and the atomtypes of the corresponding carbons are changed to the respective united atom types. Also, the hydrogen charges are added to that of the carbon. Conversions to Charmm convert only the aliphatic carbons. Conversions to Amber convert all aromatic carbons to type CD. Note, however, that the Amber force field contains several different aromatic carbon atomtypes (CD, CE, CF, CG, CI, CJ, CP) depending on the environment of the united atom.
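A minimal sketch of the united atom step for a Charmm (Quanta) conversion is given here for illustration only. The function name to_united_atoms, the dictionary layout of the atoms, and the charges in the example are assumptions, not INTOCHAM internals; the united atom type names are taken from conversions 98-100 of the table below.

```python
# United atom types from the CHARMm/Quanta column (conversions 98-100).
UNITED_TYPES = {1: "CH1E", 2: "CH2E", 3: "CH3E"}

def to_united_atoms(atoms, bonds, aliphatic_types=("CT",)):
    """Fold hydrogens into the aliphatic carbons they are bonded to.

    atoms: list of dicts with keys 'element', 'type', 'charge' (assumed layout).
    bonds: list of (i, j) index pairs into atoms.
    Returns a new atom list without the folded hydrogens.
    """
    neighbors = {i: [] for i in range(len(atoms))}
    for i, j in bonds:
        neighbors[i].append(j)
        neighbors[j].append(i)

    result = [dict(a) for a in atoms]   # copy so the input is not modified
    dropped = set()
    for i, atom in enumerate(atoms):
        if atom["element"] != "C" or atom["type"] not in aliphatic_types:
            continue                    # only aliphatic carbons are converted
        hydrogens = [j for j in neighbors[i] if atoms[j]["element"] == "H"]
        if len(hydrogens) not in UNITED_TYPES:
            continue                    # no matching united atom type
        result[i]["type"] = UNITED_TYPES[len(hydrogens)]
        result[i]["charge"] += sum(atoms[j]["charge"] for j in hydrogens)
        dropped.update(hydrogens)
    return [a for k, a in enumerate(result) if k not in dropped]

# Methanol with made-up charges: C, three H on C, O, hydroxyl H.
atoms = [
    {"element": "C", "type": "CT", "charge": -0.25},
    {"element": "H", "type": "HA", "charge": 0.125},
    {"element": "H", "type": "HA", "charge": 0.125},
    {"element": "H", "type": "HA", "charge": 0.125},
    {"element": "O", "type": "OT", "charge": -0.5},
    {"element": "H", "type": "H",  "charge": 0.375},
]
bonds = [(0, 1), (0, 2), (0, 3), (0, 4), (4, 5)]
for atom in to_united_atoms(atoms, bonds):
    print(atom["type"], atom["charge"])
```

The hydroxyl hydrogen is kept because it is bonded to oxygen, and the total charge of the molecule is unchanged.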

The program will recognize different molecules as separate residues. The residue name of the molecule will be the residue name given by Insight to the first atom (see Amber groupings below, though). Thus, if you want to control the Charmm residue names, you have to edit the Residue ID field of the .car file (columns 52-55). If two molecules have the same residue name and their chemical formulas are probably the same (the sums of the squares of their atomic numbers are equal), then the conversion assumes that the second molecule has the same description as the first and makes no new entry in the Charmm RTF file. The program will give the number of molecules found to be duplicates in this sense. If two different molecules were given the same residue name then the program will ask you to type in a new residue name for the second molecule.
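The duplicate test described above can be sketched as follows. This is an illustration under stated assumptions, not the program's source: the names formula_fingerprint and find_duplicates, the input layout, and the residue name MOH are hypothetical. Note that equal sums of squares make identical formulas likely, not certain.

```python
# Atomic numbers for the elements used in the example.
ATOMIC_NUMBER = {"H": 1, "C": 6, "N": 7, "O": 8, "P": 15, "S": 16}

def formula_fingerprint(elements):
    """Sum of the squared atomic numbers of all atoms in a molecule."""
    return sum(ATOMIC_NUMBER[e] ** 2 for e in elements)

def find_duplicates(molecules):
    """molecules: list of (residue_name, [element symbols]) in input order.

    Returns (number_of_duplicates, conflicts), where conflicts lists the
    indices of molecules that reuse a residue name with a different
    fingerprint and therefore need a new residue name.
    """
    first_seen = {}
    duplicates = 0
    conflicts = []
    for index, (name, elements) in enumerate(molecules):
        fingerprint = formula_fingerprint(elements)
        if name not in first_seen:
            first_seen[name] = fingerprint
        elif first_seen[name] == fingerprint:
            duplicates += 1          # assumed identical to the first one
        else:
            conflicts.append(index)  # same name, different molecule
    return duplicates, conflicts

molecules = [
    ("WTR", ["O", "H", "H"]),
    ("WTR", ["O", "H", "H"]),
    ("MOH", ["C", "O", "H", "H", "H", "H"]),
    ("MOH", ["C", "C", "O", "H", "H", "H", "H", "H", "H"]),
]
print(find_duplicates(molecules))   # (1, [3])
```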

The group concept, important when large molecules are modeled because the cutoffs can be based on the groups and not on the whole molecule, is also implemented. Groups can be defined by Insight (Builder/Forcefield/Groups). Alternatively, you can also request the converter to treat the different Insight-defined residues as different groups. Conversions to Charmm will translate the groups into groupings of the atoms within the residues using the GROUP command in the RTF input. Conversions to Amber or Moil, on the other hand, will create a separate 'molecule' (monomer in the Moil terminology) for each group in the actual molecule and provide the inter-group bonds as 'crosslinks' (for Amber) or additional bonds specified in the addbond file (for Moil). If the Insight-defined grouping is invoked, the .mdf file corresponding to the .car file has to be available as well.

The program assumes that residues named WTR are solvents and treats them accordingly. You will be quizzed if you used another solvent name.

If periodic boundary conditions were used by Insight, the conversion to Charmm will implement them too. In this case, you will be quizzed about the location of the Charmm image file (default provided). The box size is also written to the message file. Note that the Insight box has the origin of the coordinate system at a vertex of the simulation cell.

The atoms will be printed in an order that puts the solvent molecules at the end. If the solute was also found to consist of disconnected parts (separate molecules) or the atoms were grouped according to the Insight residues then the order of the atoms in the Charmm COOR file will put atoms in the same group/molecule consecutively. The program will write the information about this resequencing of the atoms in the structure into a file whose name is generated from the .car file, with .car replaced by .seq : for each input atom it gives its new sequence number. This file is also needed later if a Charmm COOR file is to be converted back to Insight. Also, the Charmm COOR file generated will have the old sequence number in its first column.
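The resequencing can be thought of as a permutation of the input atoms. The sketch below builds one under simple assumptions: each atom is described by a residue name and a group number, solvent is recognized by residue name, and ties keep the input order. The function names and the data layout are hypothetical, and the actual layout of the .seq file is not reproduced here.

```python
def resequence(atoms, solvent_names=("WTR",)):
    """atoms: list of (residue_name, group_id) in input order.

    Returns new_number, where new_number[i] is the 1-based new sequence
    number of input atom i + 1. Solute atoms come first, atoms of the
    same group are consecutive, and solvent atoms are placed at the end.
    """
    # sorted() is stable, so atoms with equal keys keep their input order.
    order = sorted(
        range(len(atoms)),
        key=lambda i: (atoms[i][0] in solvent_names, atoms[i][1]),
    )
    new_number = [0] * len(atoms)
    for new_position, old_index in enumerate(order, start=1):
        new_number[old_index] = new_position
    return new_number

def invert(new_number):
    """Old sequence number for each new one (needed to convert back)."""
    old_number = [0] * len(new_number)
    for old_index, new in enumerate(new_number):
        old_number[new - 1] = old_index + 1
    return old_number

atoms = [("WTR", 9), ("LIG", 2), ("LIG", 1), ("LIG", 2), ("LIG", 1), ("WTR", 9)]
new_number = resequence(atoms)
print(new_number)           # [5, 3, 1, 4, 2, 6]
print(invert(new_number))   # [3, 5, 2, 4, 1, 6]
```

The inverse mapping is what a back-conversion needs in order to restore the original Insight atom order.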

The input file to Charmm will have the same name as the Insight .car file, except that the extension .car will be replaced by .ch . The Amber files will have extensions .prep , .link and .edit (for the unit 5 inputs of the programs PREP, LINK and EDIT, respectively) and .pdbin and .addwat for the coordinate files for the solute and solvents (if any), respectively.
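As a small illustration of the naming scheme, the following sketch derives the output file names from the name of the .car file. The function output_names and the dictionary keys are hypothetical; only the extensions come from the text above.

```python
from pathlib import Path

# Extensions of the Amber files, as listed in the text.
AMBER_EXTENSIONS = (".prep", ".link", ".edit", ".pdbin", ".addwat")

def output_names(car_file):
    """Map each output kind to a file name derived from the .car file name."""
    car = Path(car_file)
    names = {
        "charmm": car.with_suffix(".ch"),       # Charmm input file
        "resequence": car.with_suffix(".seq"),  # resequencing information
    }
    for extension in AMBER_EXTENSIONS:
        names["amber" + extension] = car.with_suffix(extension)
    return names

names = output_names("ligand.car")
print(names["charmm"])        # ligand.ch
print(names["amber.pdbin"])   # ligand.pdbin
```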

In addition, the program writes any conversion-related messages and an annotated list of the improper torsions generated to a message file whose name is derived from the Insight .car file name by replacing the extension with .msg . It is suggested that the user review this list.

The Charmm input file will contain in the title the Insight file name, the list of conversions employed originally and the name of the parameter file used. It will include in the RTF file all improper torsions for which the parameter file contained an entry, but will limit to one the number of improper torsion terms centered on any given atom and will reject a parameter file entry that would make the environment of a four-bonded carbon planar. There will be no IC (internal coordinate) entries, since they are not needed when a complete coordinate file is given. The last Charmm command is a steepest descent minimization with default parameters.

The result of the conversion can be submitted to Charmm as is and should, in principle, run through without fatal error. A possible problem is that potential parameters are missing. This can be either the result of the shortcoming of the parameter file or an incorrect conversion built into the program. Once the converted Charmm input file has been found free of errors, the actual Charmm commands performing calculations and saving the results can be added manually.

The Amber input files are annotated at the end with reference to the conversion program. The PREP input also describes the conversions used.

For conversion from Charmm back to Insight the program asks for the name of the Insight .car file, the name of the Charmm .CRD file and the name to be given to the new Insight .car file.

ERROR messages can indicate problems with the data or insufficient capacity. In case of the latter, the dimensions of the program should be increased. Any message prefixed with PROGRAM ERROR indicates an internal inconsistency likely due to a program error and should be reported to the author.

Charmm and Amber atomtypes assigned to Insight (CVFF) atomtypes
 
Conv.    CVFF ->   CHARMm         CHARMM    AMBER 
 num.              Quanta         V 2.2    
                                
 1        h          HA             HA        HC
 2        d          HA             HA     
 3        hn         H  (a)         HC        H
                     HC (c)         H         
 4        ho         H  (a)         H         HO
                     HC (b)                   
 5        hp         H  (a)         H         
                     HC (b)                
 6        hs         H  (a)         H         HS
                     HC (b)                
 7        h*         HT             HT        HO
 8        c          CH3E           CT1       CT
                                84: CT2
                                85: CT3   
 9        cg         CT             C2        CT
10        c'         C              C         C
11        cp         C6R (d)        CA        CA
                 90: CR66                 94: CR
                 91: CR56 
12        cr         C              CA        CA
13        c-         C   (?)        C   (?)  
14        ca         CT             CT1       CT
15        c3         CT             CT3       CT
16        cn         CT             CT1       CT
                                86: CT2      
17        c2         CT             CT        CT
18        c1         CT             CT2       CT
19        c5         C5R  (d)       CA        C*
                 88: CR55                 92: CC
                 89: CR56                 93: CB
20        c=         CUA1 (e)       C   (?)   
21        ct         CUY1 (f)              
22        ci         C5R  (?)       CA  (?) 
23        n          NP             NH2       N
24        n2         NT                    
25        np         N6R  (g)                 NC
                 87: N5R  (h)               
26        n3         NT             NN5       N3
27        n4         NT                       NT
28        n=                               
29        nt                               
30        n1         NC                    
31        ni         NC   (?)       NR3    
32        o'         O              O1        O
                 81: OA         83: OB        
                 82: OK                  
33        o          OS             OS        OS
34        o-         OC             OC     
35        oh         OT             OH        OH
36        o*         OW (i)         OT        OH
37        s          SE                    
38        s1         ST                       S
39        sh         ST                       SH
40        p          PT                       P
41        si         MSI        
42        f          XF         
43        cl         XCL        
44        br         XBR        
45        Cl         XCL           
46        Na         MNA        
47        c+                    
48        nu                    

98  United atom CH1  CH1E                     CH
99  United atom CH2  CH2E                     C2
100 United atom CH3  CH3E                     C3
-------------------------------------------------------------------
a: neutral
b: purines/pyrimidines
c: charged
d: see also CR55, CR56, CR66, C5RP, C6RP
e: see also CUA2
f: see also CUY2, CUY3
g: 6-membered ring
h: 5-membered ring
i: TIP3P water (OH2 for ST2)
?: weak correspondence

     CVFF force field atom types

Atom Type   Description
 1   h      Hydrogen bonded to C.
 2   d      General Deuterium.
 3   hn     Hydrogen bonded to N.
 4   ho     Hydrogen bonded to O.
 5   hp     Hydrogen bonded to P.
 6   hs     Hydrogen bonded to S.
 7   h*     Hydrogen in water molecule.
 8   c      sp3 aliphatic carbon.
 9   cg     sp3 alpha carbon in glycine.
10   c'     sp2 carbon in carbonyl (C=O) group.
11   cp     sp2 aromatic carbon (partial double bonds).
12   cr     Carbon in guanidinium group (HN=C(NH2)2).
13   c-     Carbon in charged carboxylate (COO-) group.
14   ca     General amino acid alpha carbon (sp3).
15   c3     sp3 carbon in methyl (CH3) group.
16   cn     sp3 carbon bonded to N.
17   c2     sp3 carbon bonded to 2 H's, 2 heavy atoms.
18   c1     sp3 carbon bonded to 1 H, 3 Heavy atoms.
19   c5     sp2 aromatic carbon in five membered ring.
20   c=     sp2 nonaromatic carbon involved in double bond.
21   ct     sp carbon involved in triple bond.
22   ci     Aromatic carbon in a charged imidazole ring (HIS+).
23   n      sp2 nitrogen with 1 H, 2 heavy atoms (amide group).
24   n2     sp2 nitrogen (NH2 in the guanidinium group (HN=C(NH2)2)).
25   np     sp2 aromatic nitrogen (partial double bond)
26   n3     sp3 nitrogen with three substituents.
27   n4     sp3 nitrogen with four substituents.
28   n=     sp2 nitrogen involved in a double bond (non-aromatic).
29   nt     sp nitrogen involved in triple bond.
30   n1     sp2 nitrogen in charged arginine.
31   ni     sp2 nitrogen in a charged imidazole ring (HIS+).
32   o'     Oxygen in carbonyl (C=O) group.
33   o      sp3 oxygen in ether or ester groups.
34   o-     Oxygen in charged carboxylate (COO-) group.
35   oh     Oxygen in hydroxyl (OH) group.
36   o*     Oxygen in water molecule.
37   s      Sulfur in methionine (C-S-C) group.
38   s1     Sulfur involved in S-S disulfide bond.
39   sh     Sulfur in sulfhydryl (-SH) group.
40   p      General phosphorus atom.
41   si     Silicon.
42   f      Fluorine bonded to a carbon.
43   cl     Chlorine bonded to a carbon.
44   br     Bromine bonded to a carbon.
45   Cl     Chloride ion.
46   Na     Sodium ion.
47   c+     Calcium ion-Ca++
48   nu     NULL atom for relative free energy.

   CHARMm atomtypes from Quanta parameter file

   1 H     Hydrogen bonding hydrogen (neutral group)
   2 HC    Hydrogen bonding hydrogen (charged group)
   3 HA    Aliphatic or aromatic hydrogen
   4 HT    TIPS3P water model hydrogen
   5 LP    ST2 lone pair
   6 BE    Beryllium
   7 B     Boron
  10 CT    Aliphatic carbon (tetrahedral)
  11 CH1E  Extended atom carbon with one hydrogen
  12 CH2E  Extended atom carbon with two hydrogens
  13 CH3E  Extended atom carbon with three hydrogens
  14 C     Carbonyl or Guanidinium carbon
  15 CM    Carbon monoxide carbon
  16 CUA1  Carbon in double bond,first pair
  17 CUA2  Carbon in double bond,second pair conjd.to first
  18 CUY1  Carbon in triple bond,first pair
  19 CUY2  Carbon in triple bond,second pair
  21 C5R   Aromatic carbon in a five member ring
  22 C6R   Aromatic carbon in a six  member ring
  23 C5RE  Extended aromatic carbon in five member ring
  24 C6RE  Extended aromatic carbon in six member ring
  25 CR55  Aromatic carbon-merged five member rings
  26 CR56  Aromatic carbon-merged five/six member rings 
  27 CR66  Aromatic carbon-merged six member rings
  28 C5RP  for Aryl-Aryl bond between C5R rings
  29 C6RP  for Aryl-Aryl bond between C6R rings
  30 N5RP  Nitrogen for bridgehead between 5-mem rings
  31 N     Nitrile nitrogen
  32 NP    Peptide/amide nitrogen 
  33 NX    Proline nitrogen
  34 N5R   Nitrogen in a five member aromatic ring
  35 N6R   Nitrogen in a six member aromatic ring
  36 NT    Amine nitrogen (tetrahedral)
  37 NC    Charged guanidinium nitrogen
  38 NO2   Nitro group nitrogen
  40 O     Carbonyl oxygen for amides
  41 OA    Carbonyl oxygen for aldehydes
  42 OK    Carbonyl oxygen for ketones
  43 OC    Charged oxygen
  45 OT    Hydroxyl oxygen (tetrahedral)/Ionizable acid oxygen
  46 OW    TIP3P water model oxygen
  47 OH2   ST2  water model oxygen
  48 OM    Carbon monoxide oxygen
  49 OS    Ester oxygen
  50 OE    Ether oxygen / Acetal oxygen
  51 OAC   Carbonyl oxygen for acids
  52 O5R   Oxygen in five member aromatic ring
  60 PT    Phosphorus (tetrahedral)
  61 PO3   Phosphorus bonded to three oxygens
  62 PO4   Phosphorus bonded to four oxygens
  70 ST    Sulphur (tetrahedral)
  71 SH1E  Extended atom sulphur with one hydrogen
  72 S5R   Sulphur in a five member aromatic ring
  73 S6R   Thioether sulphur
  74 SE    Thiocarbonyl sulphur
  75 SK    Thioketone sulphur
  76 SO1   Sulphur bonded to one oxygen
  77 SO2   Sulphur bonded to two oxygens
  78 SO3   Sulphur bonded to three oxygens
  79 SO4   Sulphur bonded to four oxygens
  80 MLI   Lithium
  81 MNA   Sodium
  82 MMG   Magnesium
  83 MK    Potassium
  84 MCA   Calcium 
  85 MMN   Manganese
  86 MFE   Iron
  87 MZN   Zinc
  88 MRB   Rubidium
  89 MCS   Cesium
  90 MSI   Silicon
  91 MAL   Aluminum
  92 XF    Fluorine
  93 XCL   Chlorine
  94 XBR   Bromine
  95 XI    Iodine
  96 MCU   Copper
  97 MV    Vanadium
  98 MCR   Chromium
  99 MCO   Cobalt
 100 MNI   Nickel
 101 MAS   Arsenic
 102 MSE   Selenium
 103 MSR   Strontium
 104 MY    Yttrium
 105 MZR   Zirconium
 106 MNB   Niobium
 107 MMO   Molybdenum
 108 MRU   Ruthenium
 109 MRH   Rhodium
 110 MPD   Palladium
 111 MAG   Silver
 112 MCD   Cadmium
 113 MSN   Tin
 114 MSB   Antimony
 115 MBA   Barium
 116 MW    Tungsten
 117 MOS   Osmium
 118 MPT   Platinum
 119 MAU   Gold
 120 MHG   Mercury
 121 MPB   Lead
 122 MBI   Bismuth
 123 MLA   Lanthanum
 124 MCE   Cerium
 125 MPR   Praseodymium
 126 MAC   Actinium
 127 MTH   Thorium
 128 MU    Uranium
 129 MTE   Tellurium
 130 MPO   Polonium
 131 AT    Astatine
 132 MTC   Technetium
 133 MSC   Scandium
 134 MTI   Titanium
 135 MGA   Gallium
 136 MGE   Germanium
 137 MIN   Indium
 138 MHF   Hafnium
 139 MTA   Tantalum
 140 MIR   Iridium
 141 MTL   Thallium
 142 MFR   Francium
 143 MRA   Radium
 144 MND   Neodymium
 145 MPM   Promethium
 146 MSM   Samarium
 147 MEU   Europium
 148 MGD   Gadolinium
 149 MTB   Terbium
 150 MDY   Dysprosium
 151 MHO   Holmium
 152 MER   Erbium
 153 MTM   Thulium
 154 MYB   Ytterbium
 155 MLU   Lutetium
 156 MPA   Protactinium
 157 MNP   Neptunium
 158 MPU   Plutonium
 159 MAM   Americium
 160 MCM   Curium
 161 MBK   Berkelium
 162 MCF   Californium
 163 MES   Einsteinium
 164 MFM   Fermium
 165 MMD   Mendelevium
 166 MNO   Nobelium
 167 MLR   Lawrencium             
 168 HE    Helium
 169 NE    Neon
 170 AR    Argon
 171 KR    Krypton
 172 XE    Xenon
 173 RN    Radon

    Charmm 2.2 atomtypes (from the protein parameter file)

   1 H     polar H
   2 HC    N-ter H
   3 HA    nonpolar H
   4 HT    TIPS3P WATER HYDROGEN
   5 HP    aromatic H
   6 HB    backbone H
   7 HR1   his he1, (+) his HG,HD2
   8 HR2   (+) his HE1
   9 HR3   neutral his HG, HD2
  11 C     carbonyl C, peptide backbone
  12 CA    aromatic C
  13 CT1   aliphatic sp3 C for CH
  14 CT2   aliphatic sp3 C for CH2
  15 CT3   aliphatic sp3 C for CH3
  16 CPH1  his CG and CD2 carbons
  17 CPH2  his CE1 carbon
  18 CPT   trp C between rings
  19 CY    TRP C in pyrrole ring
  20 CP1   tetrahedral C (proline CA)
  21 CP2   tetrahedral C (proline CB/CG)
  22 CP3   tetrahedral C (proline CD)
  23 CC    carbonyl C, asn,asp,gln,glu,cter,ct2
  24 CD    carbonyl C, pres aspp,glup,ct1
  25 CPA   heme alpha-C
  26 CPB   heme beta-C
  27 CPM   heme meso-C
  28 CM    heme CO carbon
  29 CS    thiolate carbon
  31 N     proline N
  32 NR1   neutral his protonated ring nitrogen
  33 NR2   neutral his unprotonated ring nitrogen
  34 NR3   charged his ring nitrogen
  35 NH1   peptide nitrogen
  36 NH2   amide nitrogen
  37 NH3   ammonium nitrogen
  38 NC2   guanidinium nitrogen
  39 NY    TRP N in pyrrole ring
  40 NP    Proline ring NH2+ (N-terminal)
  41 NPH   heme pyrrole N
  51 O     carbonyl oxygen
  52 OB    carbonyl oxygen in acetic acid
  53 OC    carboxylate oxygen
  54 OH1   hydroxyl oxygen
  55 OS    ester oxygen
  56 OT    TIPS3P WATER OXYGEN
  57 OM    heme CO/O2 oxygen
  81 S     sulphur
  82 SM    sulfur C-S-S-C type
  83 SS    thiolate sulfur
  90 CAL   calcium 2+
  91 ZN    zinc (II) cation
  92 FE    heme iron 56