The primary advantage of internal-coordinate methods of molecular
dynamics is the ability to use larger timesteps than the 1 femtosecond step
size typically used in Cartesian molecular dynamics. An important
measure of the accuracy of the dynamics calculation is the energy
fluctuations. In microcanonical dynamics, the total energy of the
system, E, should be constant, even though its two components, the
potential energy, V, and kinetic energy, K, fluctuate. The
energy fluctuation is defined by
where is the Boltzmann constant and
is the temperature of
the simulation. Figure
shows the value of E
during 1 picosecond ( s) simulations of the pentapeptide
Met-enkephalin (NH
-Tyr-Gly-Gly-Phe-Met-COO
) for NEIMO(N) and
Cartesian(C) dynamics simulations at timesteps ranging from 1 fs to 20
fs. The Cartesian dynamics simulations had an initial 1 fs
equilibration phase, where the fluctuations were significantly higher.
These fluctuations were not included in these results. The NEIMO
simulations did not require an equilibration phase. Cartesian dynamics
simulations were not possible for timesteps greater than 3 fs. When
timesteps are too large for the motions being simulated, particle
motions from one timestep to the next are exaggerated and the energy
quickly ``blows up'' (the energy goes to infinity). For Cartesian
dynamics of large systems, timesteps
greater than 1 fs are usually unstable. Even for the small peptide
Met-enkephalin, a 2 fs timestep gives rise to far larger energy
fluctuations than a 1 fs simulation. The NEIMO dynamics simulation is
far more stable, with timesteps as large as 18 fs giving small
fluctuations, smaller even than the Cartesian dynamics simulation with
a 1 fs timestep. A fairer comparison is to divide the energy
fluctuations by the number of degrees of freedom. For Met-enkephalin,
= 28 (22 dihedral angles plus the six degrees of freedom
for the base body), while the number of degrees of freedom in
Cartesian dynamics is
, or 138. The scaled fluctuations,
*, are also shown in Figure
and are labeled with
an asterisk (N* and C*). NEIMO timesteps as large as 12 fs gave
smaller scaled fluctuations than the 1 fs Cartesian simulation.
Similar results were obtained for nine-residue polyalanine, (Ala).
Cartesian dynamics were only possible at 1 fs and 2 fs timesteps. The
3 fs simulation did not blow up, but
the fluctuations were extremely large. The scaled fluctuations,
*, were very similar for 1 fs and 2 fs Cartesian dynamics
of both peptides. The NEIMO simulations of (Ala)
gave larger
values of
and
* than for Met-enkephalin at almost
every timestep, but the fluctuations did not blow up until
timesteps larger than 30 fs were used. It is likely that (Ala)
is
able to tolerate such large timesteps because it has no light
sidechain clusters which would be expected to have higher rotational
velocities. Since the DREIDING forcefield uses a united-atom
approach, the CH
units of the Alanine sidechains are treated as a
single atom and, therefore, do not form clusters which move
independently in the NEIMO model. In contrast, the tyrosine,
phenylalanine, and methionine sidechains of Met-enkephalin all contain
individual clusters with low moments of inertia. As indicated below in
the analysis of Met-enkephalin dihedral angle fluctuations, the long,
unbranched methionine sidechain is particularly flexible.
As the simulations are carried out for longer periods of time, the
fluctuations gradually increased. For instance, a 1 ps
NEIMO simulation of Met-enkephalin using a 5 fs timesteps had a value of
less than 0.0001 kcal/mol. The same simulation run for 5
ps had
kcal/mol, even though no 0.1 ps stretch of
the simulation had
kcal/mol, and the average
fluctuation over the 50 0.1 ps stretches was only 0.0001 kcal/mol.
Run for 25 ps, the simulation has an overall
of 0.0360
kcal/mol, even though the average 0.1 ps stretch had
kcal/mol. This discrepancy is caused by very slow
fluctuations in the total energy which cause
to slowly
diverge from
. The cause of this long-term fluctuation is
unknown.
In order to compare NEIMO directly to the matrix method of
Mazur et al.[35], the quantity was calculated
from simulations of (Ala)
at timesteps ranging from 1 fs
to 20 fs. For each timestep, the simulation was run for 4.0 ps during
which the velocities were rescaled, when necessary, to equilibrate the
system. At the end of the 4.0 ps run, 110 additional steps were run.
The first ten of these were discarded, but the final 100 steps were
used to determine
, which is defined by
is the average energy during the 100 steps, and
is the root-mean-square deviation in the energy.
Mazur et al. reported simulations on (Ala)
using a variety
of models including some containing explicit hydrogens. The
DREIDING/NEIMO calculation corresponds to their third model: united
atoms are used rather than explicit hydrogens, and all bond lengths
and angles are fixed. Only dihedral degrees of freedom are allowed plus
the six degrees of freedom of the base body, for a total of 32 degrees
of freedom. Mazur et al. obtained a value of
=
using timesteps of 0.5 fs. Values of
increased
linearly with increasing timesteps, but they were able to achieve
their desire level of accuracy,
, using timesteps as
large as 13 fs. NEIMO simulations using a 0.5 fs timestep had a
larger value of
=
, but timesteps as large as 15 fs
gave
, as can be seen in Figure
.
These results are very consistent with the results of Mazur et al.,
even though they used a different forcefield (a combination of
CHARMm[47] and ECEPP[48]) and a different
integration scheme. It should be noted that the choice of forcefield
should not affect the results dramatically, since our average energy
(25.402 kcal/mol) is nearly equal, though opposite in sign, to the
average energy they report (-26.8 kcal/mol) for their 0.5 fs timestep
simulations.
Although timesteps of 15 fs and longer are clearly possible for NEIMO
simulations of small peptides such as Met-enkephalin and (Ala),
such timesteps are not yet possible for large
polypeptides and proteins. Simulations were run on the 36 residue
hormone peptide avian pancreatic polypeptide (aPP), a very interesting
case because it is one of the smallest known polypeptides to
fold into a stable globular form. Figure
shows the
polypeptide backbone of aPP, which has two helices,
an helix and a collagen-like polyproline helix. Hydrophobic
sidechains line the cleft between the two helices, allowing for
unusual stability in a peptide this size.
Figure
shows and
* using different timesteps for 1 ps simulations of aPP.
NEIMO simulations of aPP break down when timesteps above 10 fs are
used. Although timesteps as large as 9 fs give values of
as
good or better than the 1 fs Cartesian simulation, the scaled
fluctuations,
*, are approximately equal for the 6 fs NEIMO
and 1 fs Cartesian cases. Several factors may cause folded
polypeptides and proteins to have substantially larger fluctuations
than small peptides at large timesteps. Complex secondary structure
elements such as helices, turns, and beta sheets, are held together by
hydrogen bonds, which are very short-range interactions. Large
timesteps may cause rapid destabilization of these hydrogen bond
networks. Another potential problem, due to the nature of the
dynamics algorithm rather than the chemical nature of the system being
studied, is the possible buildup of errors as the recursive equations
are solved from the base cluster to the tips and the tips to the base.
This is a more fundamental problem that may be substantially improved by
higher precision in the calculations and a more central choice for the
base cluster. Currently, the amino-terminal -NH
group is
usually chosen as the base cluster, and the carboxy terminal residue
tip clusters are maximally distant.
The fastest dynamical modes in the NEIMO model are those with the
smallest spatial inertia. In protein systems, these are clusters
containing explicit hydrogens, where rotation of the hinge moves only
the hydrogen atoms. For instance, the hydroxyl group of Tyrosine
forms a two-atom cluster. Rotation of the hinge between the aromatic
ring C and the hydroxyl O
only modifies the hydroxyl
hydrogen. These are the fastest degrees of freedom in the system.
These dihedrals can be fixed by including the moiety in its parent
cluster. In tyrosine, the hydroxyl and aromatic ring can be treated
as a single cluster. This ``Rigid H'' model removes the
fastest degrees of freedom of the system and enables even longer
timesteps to be used than the standard NEIMO model. This is seen
clearly in Figure
, where the 18 -OH and -NH groups
have been incorporated with their parent clusters. Although the
scaled fluctuations,
*, are very similar for small timesteps,
the standard model blows up when timesteps above 10 fs are used, while
the ``Rigid H'' fluctuations increase only slowly above this point.
Simulations using even longer timesteps displayed the same gradual
increase in fluctuations, without any sharp jump in
*, as is
seen in the standard model when timesteps above 10 fs are used.
The ``Rigid H'' model would be useful for studies primarily interested
in large-scale motions, where the hydrogen-bonding interactions of
these sidechain groups are less important, and the advantage of longer
timesteps is pre-eminent.
Detailed studies of protein systems require the inclusion of
solvent, which plays an important role in stabilizing the native
conformation of most proteins. Solvent includes both water (and/or
lipids in the case of membrane-bound proteins) and ionic charges,
which may be present to stabilize charged groups on the protein. In
order to test the ability of NEIMO simulations to include such
factors, we ran calculations where NEIMO dynamics were used to solve
the equations of motion for the protein, while standard Cartesian
dynamics were solved simultaneously for counterions. Avian pancreatic
polypeptide was used as a test system. Oppositely-charged groups
within 10 Å of each other were considered paired. This left eight
unpaired charges, which were then neutralized by adding counterions
(five Na and three Cl
). After the counterion locations were
optimized by minimizing their energies, simulations were run for 2 ps
using various timesteps. The first picosecond was used for
equilibrating the counterion motions, and
was determined from
1 ps to 2 ps. The results are shown in Figure
, along with
the results from standard and ``Rigid H'' simulations of the protein
alone. The addition of counterions increases the energy fluctuation
substantially, but timesteps as large as 8-10 fs are still practical.
This is, nevertheless, a great improvement over simulations where all
atoms are treated with Cartesian-space molecular dynamics.
It is common practice to keep the temperature of a microcanonical
dynamics simulation roughly constant by periodically scaling the
velocities. Other calculations which must be done periodically, such
as updating a list of nonbond pairs within a given cutoff distance,
can be done at the same time the velocities are rescaled. This is
particularly important for large systems, where calculation of all
nonbonded interactions for every timestep is prohibitive. Under such
conditions, where nonbonds and velocities are updated periodically,
the total energy, E, does not remain constant throughout the entire
time of the simulation. The energy fluctuation is,
therefore, no longer an accurate measure of the dynamics because
diverges from
. Instead, we
have used the average fluctuation,
, determined by
calculating
during each 0.100 ps interval, and averaging. If
the total calculation has
0.100 ps intervals,
is
defined by
where is the energy fluctuation calculated during the
-th interval. In such calculations, timesteps should be chosen so that
they give an integral number of dynamics steps per 0.100 ps - for
instance, a timestep of 3.0303 fs is used, rather than 3.0 fs.
Figure
shows the variation in during 5 ps
simulations of aPP. In these calculations, the Cell-Multipole Method
was used for the nonbond calculations. The
was calculated
during 0.1 ps intervals, during which the average kinetic energy was
calculated and the farfield contribution to the CMM energy was held
constant[25]. At the end of each 0.1 ps interval, the
velocities were rescaled if necessary, the CMM farfield was
recalculated, and the
was recorded. At the end of the 5 ps
simulations, the
values were averaged to give
.
These values are plotted in Figure
. For very short
timesteps (1 and 2 fs), the
values are much larger than the
values from the 1 ps
simulations in Figure
. At large timesteps, however, the
results are very consistent with the shorter simulations.
Figure
shows the average value of E* during 5 ps
simulations of several of the proteins in
Table
. Clearly the energy fluctuations in NEIMO dynamics
simulations, even when scaled by the number of degrees of freedom,
increase linearly with protein size. This is in contrast to the
fluctuations during Cartesian dynamics simulations, which are roughly
constant when divided by the number of degrees of freedom. Some of
the possible causes of this phenomenon have been mentioned above. It
should be noted that for NEIMO simulations using a 1 fs timestep, the
rise in * is much flatter than at higher timesteps. The
very large proteins studied here may have problems with large
timesteps simply because the actual number of nonbond interactions is
extremely large, being proportional to
, and large dynamics steps
can cause rearrangement of a substantial proportion of these
interactions. Currently, work is being done to improve the results
for large timesteps.