The primary advantage of internal-coordinate methods of molecular dynamics is the ability to use larger timesteps than the 1 femtosecond step size typically used in Cartesian molecular dynamics. An important measure of the accuracy of the dynamics calculation is the energy fluctuations. In microcanonical dynamics, the total energy of the system, E, should be constant, even though its two components, the potential energy, V, and kinetic energy, K, fluctuate. The energy fluctuation is defined by
where is the Boltzmann constant and is the temperature of the simulation. Figure shows the value of E during 1 picosecond ( s) simulations of the pentapeptide Met-enkephalin (NH-Tyr-Gly-Gly-Phe-Met-COO) for NEIMO(N) and Cartesian(C) dynamics simulations at timesteps ranging from 1 fs to 20 fs. The Cartesian dynamics simulations had an initial 1 fs equilibration phase, where the fluctuations were significantly higher. These fluctuations were not included in these results. The NEIMO simulations did not require an equilibration phase. Cartesian dynamics simulations were not possible for timesteps greater than 3 fs. When timesteps are too large for the motions being simulated, particle motions from one timestep to the next are exaggerated and the energy quickly ``blows up'' (the energy goes to infinity). For Cartesian dynamics of large systems, timesteps greater than 1 fs are usually unstable. Even for the small peptide Met-enkephalin, a 2 fs timestep gives rise to far larger energy fluctuations than a 1 fs simulation. The NEIMO dynamics simulation is far more stable, with timesteps as large as 18 fs giving small fluctuations, smaller even than the Cartesian dynamics simulation with a 1 fs timestep. A fairer comparison is to divide the energy fluctuations by the number of degrees of freedom. For Met-enkephalin, = 28 (22 dihedral angles plus the six degrees of freedom for the base body), while the number of degrees of freedom in Cartesian dynamics is , or 138. The scaled fluctuations, *, are also shown in Figure and are labeled with an asterisk (N* and C*). NEIMO timesteps as large as 12 fs gave smaller scaled fluctuations than the 1 fs Cartesian simulation.
Similar results were obtained for nine-residue polyalanine, (Ala). Cartesian dynamics were only possible at 1 fs and 2 fs timesteps. The 3 fs simulation did not blow up, but the fluctuations were extremely large. The scaled fluctuations, *, were very similar for 1 fs and 2 fs Cartesian dynamics of both peptides. The NEIMO simulations of (Ala) gave larger values of and * than for Met-enkephalin at almost every timestep, but the fluctuations did not blow up until timesteps larger than 30 fs were used. It is likely that (Ala) is able to tolerate such large timesteps because it has no light sidechain clusters which would be expected to have higher rotational velocities. Since the DREIDING forcefield uses a united-atom approach, the CH units of the Alanine sidechains are treated as a single atom and, therefore, do not form clusters which move independently in the NEIMO model. In contrast, the tyrosine, phenylalanine, and methionine sidechains of Met-enkephalin all contain individual clusters with low moments of inertia. As indicated below in the analysis of Met-enkephalin dihedral angle fluctuations, the long, unbranched methionine sidechain is particularly flexible.
As the simulations are carried out for longer periods of time, the fluctuations gradually increased. For instance, a 1 ps NEIMO simulation of Met-enkephalin using a 5 fs timesteps had a value of less than 0.0001 kcal/mol. The same simulation run for 5 ps had kcal/mol, even though no 0.1 ps stretch of the simulation had kcal/mol, and the average fluctuation over the 50 0.1 ps stretches was only 0.0001 kcal/mol. Run for 25 ps, the simulation has an overall of 0.0360 kcal/mol, even though the average 0.1 ps stretch had kcal/mol. This discrepancy is caused by very slow fluctuations in the total energy which cause to slowly diverge from . The cause of this long-term fluctuation is unknown.
In order to compare NEIMO directly to the matrix method of Mazur et al., the quantity was calculated from simulations of (Ala) at timesteps ranging from 1 fs to 20 fs. For each timestep, the simulation was run for 4.0 ps during which the velocities were rescaled, when necessary, to equilibrate the system. At the end of the 4.0 ps run, 110 additional steps were run. The first ten of these were discarded, but the final 100 steps were used to determine , which is defined by
is the average energy during the 100 steps, and is the root-mean-square deviation in the energy. Mazur et al. reported simulations on (Ala) using a variety of models including some containing explicit hydrogens. The DREIDING/NEIMO calculation corresponds to their third model: united atoms are used rather than explicit hydrogens, and all bond lengths and angles are fixed. Only dihedral degrees of freedom are allowed plus the six degrees of freedom of the base body, for a total of 32 degrees of freedom. Mazur et al. obtained a value of = using timesteps of 0.5 fs. Values of increased linearly with increasing timesteps, but they were able to achieve their desire level of accuracy, , using timesteps as large as 13 fs. NEIMO simulations using a 0.5 fs timestep had a larger value of = , but timesteps as large as 15 fs gave , as can be seen in Figure . These results are very consistent with the results of Mazur et al., even though they used a different forcefield (a combination of CHARMm and ECEPP) and a different integration scheme. It should be noted that the choice of forcefield should not affect the results dramatically, since our average energy (25.402 kcal/mol) is nearly equal, though opposite in sign, to the average energy they report (-26.8 kcal/mol) for their 0.5 fs timestep simulations.
Although timesteps of 15 fs and longer are clearly possible for NEIMO simulations of small peptides such as Met-enkephalin and (Ala), such timesteps are not yet possible for large polypeptides and proteins. Simulations were run on the 36 residue hormone peptide avian pancreatic polypeptide (aPP), a very interesting case because it is one of the smallest known polypeptides to fold into a stable globular form. Figure shows the
polypeptide backbone of aPP, which has two helices, an helix and a collagen-like polyproline helix. Hydrophobic sidechains line the cleft between the two helices, allowing for unusual stability in a peptide this size.
Figure shows and * using different timesteps for 1 ps simulations of aPP. NEIMO simulations of aPP break down when timesteps above 10 fs are used. Although timesteps as large as 9 fs give values of as good or better than the 1 fs Cartesian simulation, the scaled fluctuations, *, are approximately equal for the 6 fs NEIMO and 1 fs Cartesian cases. Several factors may cause folded polypeptides and proteins to have substantially larger fluctuations than small peptides at large timesteps. Complex secondary structure elements such as helices, turns, and beta sheets, are held together by hydrogen bonds, which are very short-range interactions. Large timesteps may cause rapid destabilization of these hydrogen bond networks. Another potential problem, due to the nature of the dynamics algorithm rather than the chemical nature of the system being studied, is the possible buildup of errors as the recursive equations are solved from the base cluster to the tips and the tips to the base. This is a more fundamental problem that may be substantially improved by higher precision in the calculations and a more central choice for the base cluster. Currently, the amino-terminal -NH group is usually chosen as the base cluster, and the carboxy terminal residue tip clusters are maximally distant.
The fastest dynamical modes in the NEIMO model are those with the smallest spatial inertia. In protein systems, these are clusters containing explicit hydrogens, where rotation of the hinge moves only the hydrogen atoms. For instance, the hydroxyl group of Tyrosine forms a two-atom cluster. Rotation of the hinge between the aromatic ring C and the hydroxyl O only modifies the hydroxyl hydrogen. These are the fastest degrees of freedom in the system. These dihedrals can be fixed by including the moiety in its parent cluster. In tyrosine, the hydroxyl and aromatic ring can be treated as a single cluster. This ``Rigid H'' model removes the fastest degrees of freedom of the system and enables even longer timesteps to be used than the standard NEIMO model. This is seen clearly in Figure , where the 18 -OH and -NH groups have been incorporated with their parent clusters. Although the scaled fluctuations, *, are very similar for small timesteps, the standard model blows up when timesteps above 10 fs are used, while the ``Rigid H'' fluctuations increase only slowly above this point. Simulations using even longer timesteps displayed the same gradual increase in fluctuations, without any sharp jump in *, as is seen in the standard model when timesteps above 10 fs are used. The ``Rigid H'' model would be useful for studies primarily interested in large-scale motions, where the hydrogen-bonding interactions of these sidechain groups are less important, and the advantage of longer timesteps is pre-eminent.
Detailed studies of protein systems require the inclusion of solvent, which plays an important role in stabilizing the native conformation of most proteins. Solvent includes both water (and/or lipids in the case of membrane-bound proteins) and ionic charges, which may be present to stabilize charged groups on the protein. In order to test the ability of NEIMO simulations to include such factors, we ran calculations where NEIMO dynamics were used to solve the equations of motion for the protein, while standard Cartesian dynamics were solved simultaneously for counterions. Avian pancreatic polypeptide was used as a test system. Oppositely-charged groups within 10 Å of each other were considered paired. This left eight unpaired charges, which were then neutralized by adding counterions (five Na and three Cl). After the counterion locations were optimized by minimizing their energies, simulations were run for 2 ps using various timesteps. The first picosecond was used for equilibrating the counterion motions, and was determined from 1 ps to 2 ps. The results are shown in Figure , along with the results from standard and ``Rigid H'' simulations of the protein alone. The addition of counterions increases the energy fluctuation substantially, but timesteps as large as 8-10 fs are still practical. This is, nevertheless, a great improvement over simulations where all atoms are treated with Cartesian-space molecular dynamics.
It is common practice to keep the temperature of a microcanonical dynamics simulation roughly constant by periodically scaling the velocities. Other calculations which must be done periodically, such as updating a list of nonbond pairs within a given cutoff distance, can be done at the same time the velocities are rescaled. This is particularly important for large systems, where calculation of all nonbonded interactions for every timestep is prohibitive. Under such conditions, where nonbonds and velocities are updated periodically, the total energy, E, does not remain constant throughout the entire time of the simulation. The energy fluctuation is, therefore, no longer an accurate measure of the dynamics because diverges from . Instead, we have used the average fluctuation, , determined by calculating during each 0.100 ps interval, and averaging. If the total calculation has 0.100 ps intervals, is defined by
where is the energy fluctuation calculated during the -th interval. In such calculations, timesteps should be chosen so that they give an integral number of dynamics steps per 0.100 ps - for instance, a timestep of 3.0303 fs is used, rather than 3.0 fs.
Figure shows the variation in during 5 ps simulations of aPP. In these calculations, the Cell-Multipole Method was used for the nonbond calculations. The was calculated during 0.1 ps intervals, during which the average kinetic energy was calculated and the farfield contribution to the CMM energy was held constant. At the end of each 0.1 ps interval, the velocities were rescaled if necessary, the CMM farfield was recalculated, and the was recorded. At the end of the 5 ps simulations, the values were averaged to give . These values are plotted in Figure . For very short timesteps (1 and 2 fs), the values are much larger than the values from the 1 ps simulations in Figure . At large timesteps, however, the results are very consistent with the shorter simulations.
Figure shows the average value of E* during 5 ps simulations of several of the proteins in Table . Clearly the energy fluctuations in NEIMO dynamics simulations, even when scaled by the number of degrees of freedom, increase linearly with protein size. This is in contrast to the fluctuations during Cartesian dynamics simulations, which are roughly constant when divided by the number of degrees of freedom. Some of the possible causes of this phenomenon have been mentioned above. It should be noted that for NEIMO simulations using a 1 fs timestep, the rise in * is much flatter than at higher timesteps. The very large proteins studied here may have problems with large timesteps simply because the actual number of nonbond interactions is extremely large, being proportional to , and large dynamics steps can cause rearrangement of a substantial proportion of these interactions. Currently, work is being done to improve the results for large timesteps.