The values listed in Table
were used in an attempt to
reproduce the structure of crambin using the Cα coordinates from
the crystal structure[77]. Twenty different backbone
conformations were generated by using different random numbers to
control the selection of φ/ψ dihedrals as well as to determine which
conformations would be accepted and which rejected. The
conformational energy of the backbone and the rms deviations in backbone
atoms and φ/ψ dihedrals for each of these structures are listed in
Table
, ranked by energy. The average backbone rms
deviation for these 20 simulations was 0.527 Å, in close agreement
with the previous result of 0.520 Å mentioned in the preceding
section. The average all-atom deviation was 1.696 Å.
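
The rms deviations quoted above can be illustrated with a minimal sketch. It assumes the model and crystal coordinates are already expressed in the same frame (plausible here, since the model is built on the crystal Cα positions) and so performs no superposition. The function name and the toy coordinates are hypothetical and are not taken from the tables.

```python
import math

def rms_deviation(model, reference):
    """RMS deviation (in the units of the input, e.g. Å) between two
    equal-length lists of (x, y, z) coordinates.

    Assumption: both structures share one coordinate frame, so no
    least-squares superposition is performed.
    """
    if len(model) != len(reference) or not model:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for (x1, y1, z1), (x2, y2, z2) in zip(model, reference):
        total += (x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2
    return math.sqrt(total / len(model))

# Toy three-atom example (made-up coordinates, not crambin data).
toy_reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
toy_model = [(0.0, 0.0, 0.3), (1.0, 0.4, 0.0), (0.0, 1.0, 0.0)]
toy_rms = rms_deviation(toy_model, toy_reference)
```

Averaging such values over the 20 runs would give a figure comparable to the 0.527 Å reported above, provided the same atom selection is used for every run.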
It is apparent that there is only a small correlation
between the backbone energy and the rms fit to the crystal structure
backbone. The backbone of the crystal structure itself has an energy
of 759.8 kcal/mol, higher than 12 of the 20 model conformations. This
is likely due both to errors in the crystal structure and to the
limitations of the forcefield approach: no forcefield can be optimal
for every crystal structure, even when such factors as crystal packing
and solvation are considered. Nevertheless, in cases where the
crystal structure is unknown, the backbone energy is the best
criterion for selecting model structures. Other possible selection
criteria, including Cα constraint energy and total energy including
sidechain atoms, had even worse correlation with the deviation in the
backbone coordinates.
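
One way to put a number on the "small correlation" between energy and rms fit is a Pearson correlation coefficient. The sketch below is illustrative only: the text does not say which statistic, if any, was used, and the energy and rms values shown are made-up placeholders rather than the tabulated results.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Placeholder values (NOT the thesis data): backbone energies in kcal/mol
# and backbone rms deviations in Å for five hypothetical conformations.
toy_energies = [700.0, 710.0, 720.0, 730.0, 740.0]
toy_rms_fits = [0.50, 0.56, 0.49, 0.55, 0.52]
toy_r = pearson(toy_energies, toy_rms_fits)
```

A coefficient near zero would correspond to the weak relationship described above; a rank-based statistic could be substituted if the relationship is not expected to be linear.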
The five lowest-energy backbone conformations from Phase 1 were used
as a starting point for Phase 2. For each of the five backbone
conformations, five Phase 2 simulations were carried out, again using
different random numbers to produce different results. Each
simulation involved 1000 Monte Carlo steps using 10 probability
grids and a simulation temperature of 300 K. The 25 conformations
produced are listed in Table
. Again, there is only a
small correlation between energy and rms fit to the crystal structure.
Nevertheless, the fits are quite good, with an average rms
deviation from the crystal structure of 1.323 Å. All five backbone
conformations were represented throughout the list of all-atom
conformations, so the backbone energy was not the determining factor
in the overall energy.
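
The accept/reject step at a simulation temperature of 300 K can be sketched with the standard Metropolis criterion. This is an assumption: the text states only that random numbers determine which conformations are accepted, not the exact rule. The function names are hypothetical, and energies are taken to be in kcal/mol.

```python
import math
import random

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)

def boltzmann_factor(delta_e, temperature):
    """exp(-dE / kT) for an energy change dE in kcal/mol."""
    return math.exp(-delta_e / (K_B * temperature))

def metropolis_accept(delta_e, temperature, rng):
    """Metropolis rule (assumed, not confirmed by the source):
    always accept downhill moves; accept uphill moves with
    probability exp(-dE / kT)."""
    if delta_e <= 0.0:
        return True
    return rng.random() < boltzmann_factor(delta_e, temperature)

# At 300 K, kT is roughly 0.6 kcal/mol, so a 1 kcal/mol uphill move
# is accepted a little under one time in five.
rng = random.Random(0)
p_uphill = boltzmann_factor(1.0, 300.0)
accepted = metropolis_accept(-2.5, 300.0, rng)
```

Under this rule, different random seeds yield different accepted trajectories, which is consistent with the spread of conformations obtained from repeated simulations.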
The lowest-energy conformation from Phase 2 was chosen as the "model"
conformation of crambin for detailed comparison to the "true"
structure, the crystal structure[77]. Table
gives a breakdown of the rms deviation of the crambin model for
different regions of the protein. Some of this information is shown
graphically in Figure
, where the backbone rms
deviation of each residue is shown. The largest deviations occur at the
carboxy terminus, where residues 45 and 46 are very poorly modeled.
If these two residues are excluded, the backbone rms deviation drops
from 0.543 Å to 0.361 Å. The carboxy terminal residues are
generally the worst modeled residues because there are fewer constraints on
the structure: they usually lie on the surface of the protein where
there are fewer inter-residue contacts, and there is no subsequent Cα
to constrain the orientation of the terminal carboxyl group.
In the crambin model, the Asn 46 sidechain and the terminal carboxyl group
have reversed positions, giving rise to a large error even though
the chemical significance is small. The backbone rms deviation is
fairly consistent throughout the rest of the protein, with 34 of the
46 residues having deviations in the 0.1-0.4 Å range. The
lowest backbone deviations are in the residues of the long helix,
Helix 1, where the deviation in atomic coordinates is 0.209 Å
and the deviation in φ and ψ dihedrals is only 13.7°. The
deviations are similarly low (0.232 Å and 13.1°) for the first
seven residues of Helix 2. However, the last residue in the helix
starts a turn and is poorly modeled. In general, the turn regions
before and after helices are the most poorly modeled regions other
than those at the C-terminus. This is very apparent from both the graph in
Figure
and the picture in Figure
.
These regions (particularly residues 5, 20, and 30) have nonstandard
φ/ψ values, which have very low probabilities in the φ/ψ probability
grids. No φ/ψ probability grids were specifically developed for
turn regions, but these might prove very valuable.
The sidechain modeling is not as successful as the backbone modeling,
with the average deviation in atomic coordinates being near 2.0 Å.
This is not at all surprising, since each peptide unit in the
polypeptide backbone is constrained at both ends by the positions of
two consecutive Cα's, while the sidechains are usually constrained only by
their attachment to a single Cα. The constraints on the sidechain
conformations are primarily steric in nature: sidechains in the
interior of a protein can have considerable steric overlap and their
conformations must be correlated to allow for closest packing. The
rms deviation for the atomic coordinates may not be the best
indication of modeling success, since it will be heavily weighted
toward any poorly modeled large sidechain such as arginine. A better
measure is the deviation in sidechain dihedral angles, Δχ, defined
as the absolute value of the difference between the dihedral in the
model and in the crystal structure. The deviations in χ1 are shown for the
crambin model in Figure
. Most χ1 dihedrals have
high probabilities at 60°, -60°, and 180°, so
deviations would be expected to be near 0° or 120°.
Of the 37 χ1's in crambin, 24 have deviations less than
30°, and 11 have deviations between 90° and 150°.
Therefore, only two have deviations between 30° and 90°. It is important to note
that five of the 11 poorly modeled sidechains are cysteine residues
involved in disulfide bridges in the crystal structure. The Cα
Builder does not currently predict the presence of disulfide bridges,
so the disulfide bond is not included in the Monte Carlo energy
evaluations. Such a term could be included and would certainly
improve the results for these residues. RMS deviations for the
different backbone and sidechain dihedrals are shown in Table
.
Although the sidechain dihedrals are not as well modeled as the
backbone, the results are not discouraging with respect to other
methods. As discussed below, our method provides results for
flavodoxin dihedrals as good as or better than those of other methods, and
these results for crambin are even better.
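
The dihedral deviation analysis above can be sketched as follows. The sketch assumes that the absolute difference is wrapped onto the range 0° to 180° to account for the periodicity of dihedral angles, which the text does not state explicitly. The function names and bin labels are hypothetical.

```python
def dihedral_deviation(model_deg, crystal_deg):
    """Absolute difference between two dihedrals in degrees, wrapped to
    [0, 180]. The wrapping is an assumption; the text defines the
    deviation only as the absolute value of the difference."""
    diff = abs(model_deg - crystal_deg) % 360.0
    return min(diff, 360.0 - diff)

def bin_deviations(deviations):
    """Count deviations in the ranges discussed in the text, plus an
    extra bin for values of 150 degrees or more."""
    counts = {"<30": 0, "30-90": 0, "90-150": 0, ">=150": 0}
    for dev in deviations:
        if dev < 30.0:
            counts["<30"] += 1
        elif dev < 90.0:
            counts["30-90"] += 1
        elif dev < 150.0:
            counts["90-150"] += 1
        else:
            counts[">=150"] += 1
    return counts

# A sidechain placed in the wrong one of the three preferred wells
# (60, -60, 180 degrees) is off by 120 degrees after wrapping.
wrong_well = dihedral_deviation(180.0, -60.0)
toy_counts = bin_deviations([5.0, 12.0, 118.0, 45.0])
```

With wrapping, any pair of the three preferred χ1 values differs by 120°, which is why the deviations cluster near 0° and 120°. The counts reported above (24, 2, and 11) account for all 37 χ1 dihedrals.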
The differences between the crambin model and the crystal structure
are shown in detail in Figures
,
, and
. Figure
shows the model and crystal
structure backbones for the entire protein. For most of the protein,
it is very difficult to distinguish between the two structures. Only in the
turn regions after the two helices is the difference readily apparent.
The two following figures show the all-atom structures of the two
helices of crambin. Helix 2, shown in Figure
, is very
well modeled, with an rms deviation of 1.03 Å for all atoms. In
terms of the all-atom deviation, it is the best modeled region of the
protein (see Table
). The picture shows this quite well,
with both sidechain and backbone atoms showing little difference
between the two structures, except for Thr 30 on the C-terminal
(right) end of the helix. As explained above, this residue begins a
turn in the backbone conformation and is poorly sampled during the Phase 1
backbone Monte Carlo. The Helix 1 backbone, in contrast, is modeled
quite well throughout its length, including Pro 19 at its C-terminal
(left) end. However, Helix 1 has many large sidechains which are
difficult to model. Large errors can be seen in Asn 14 and Arg 17.
The latter has a particularly large impact on the rms deviation.
Excluding Arg 17, the crambin model has an rms deviation of 1.207 Å,
rather than 1.386 Å. However, this incorrect conformation of Arg 17
may be energetically more favorable than other conformations more
similar to the crystal structure. Of the next four lowest-energy
conformations listed in Table
, all have more
native-like conformations of Arg 17, but all are higher in energy.
The crambin model illustrates several general findings for simulations
using the PGMC Cα Builder. The lowest-energy structures from
Phases 1 and 2 are usually among the best models built, but are rarely
the very best. Regardless, the backbone models from Phase 1 are
consistently good, and almost any one of them provides an acceptable
model of the true backbone. The model backbones are especially good
in regions of regular secondary structure such as helices and sheets,
but rather poor in turn regions. These results are obtained
consistently in different simulations. There is a much larger
variation among the results from Phase 2. This may be due to the
constraints of time; the limit of 1000 Monte Carlo steps was selected
largely to keep the simulation time below ten minutes, so
that large numbers of different conformations could be evaluated.
Better and more consistent results might be obtained from substantially
longer calculations. Nevertheless, between 40% and 60% of
dihedrals are modeled correctly.