Small peptides are an important class of molecules to study with computational methods because it is extremely difficult to obtain experimental information about them. Polypeptides of less than 50 residues in length rarely assume a single conformation in solution. This flexibility makes them virtually impossible to characterize structurally with either X-ray crystallography or multidimensional NMR. Even if a crystal or solution structure could be determined, its relevance would be questionable since these molecules usually act by binding to a receptor, and it is their bound conformation which is of utmost importance. There is evidence that the bound conformations of such peptides are different from their solution structures. Computational studies of such molecules are extremely important, therefore, because they may produce not only a prediction for the global minimum energy conformation but an ensemble of low energy conformations which may include the receptor-bound conformation.
The small neuropeptide Met-enkephalin has become a standard test case for peptide structure prediction programs. This pentamer, with sequence Tyr-Gly-Gly-Phe-Met, has been studied by a number of groups using different conformational search methodologies . Li and Scheraga believe they have found the global minimum  of Met-enkephalin, and this claim is supported in the work of von Freyberg and Braun, who found the same minimum-energy conformation several times during their simulations. It is important to note that this conformation is merely the lowest found so far using the ECEPP/2 forcefield. It is not computationally possible, currently, to evaluate all possible conformations so it remains a possibility that lower-energy conformations do exist even using the ECEPP/2 forcefield. It is also very possible that different forcefields have different global minima. The ECEPP/2 global minimum is by no means guaranteed to be the DREIDING or AMBER global minimum.
The purpose of our simulations is to determine low-energy conformations of Met-enkephalin using the DREIDING forcefield. To this end, we have carried out numerous simulations on Met-enkephalin using the Probability Grid Monte Carlo method, at different grid spacings and different temperatures. These studies aim to answer the following questions:
The simplest calculation using our probability grids is to build Met-enkephalin using standard geometries, and then rotate each pair and each sidechain to its peak conformation according to the probability grids. We used the ``Peptide Builder'' function of BIOGRAF to build an initial structure from a ``Peptide Library'' of amino acid structures. This starting structure was then modified by rotating the backbone and sidechain dihedrals to their highest-probability conformations. This process was repeated for each grid spacing, and the energy of each conformation was calculated using BIOGRAF and the DREIDING forcefield. The results are shown in Table . We call these conformations ``S'', where is the grid spacing. Except for the 5-peak conformation, 5, the energy decreases as the grid spacing becomes finer - i.e., finer grids give better conformations. The energy of 5 is extremely high because of steric overlap between the sidechain of Tyr 1 and the backbone of Met 5. This overlap is caused by both the tyrosine sidechain conformation, and the fact the the glycines have conformations, thereby forming a turn in the backbone. The 30-peak conformation, 30 has a very similar backbone conformation, but its sidechain conformations, most importantly that of Tyr 1, are completely different, so it avoids the steric overlap problem. The other conformations have completely different backbone conformations than 5 and 30. The conformations 10 and 15 are almost identical -helices.
We used the conjugate-gradients minimizer of BIOGRAF to minimize these conformations in order to find a stable local minimum in the DREIDING potential energy surface. The dihedral angles of the minimized conformations, S, were no longer restricted to lattice values, and the bonds and angles were no longer fixed. The resulting structures were significantly different and had far lower energies than the un-minimized conformations, as can be seen in Table . The minimized conformations differ from their unminimized counterparts by 1.4 Å or more. Except for the 5 - 5 transition, minimization primarily modifies the backbone dihedrals (see Table ). Minimization of 5, however, substantially modifies the sidechain dihedrals, especially and of Met 5. This removes the steric overlap and improves the energy of the 5 conformation to a value similar to the other minimized conformations. Interestingly, 60 and 30 are the best minimized conformations, even though 60 and 30 were substantially higher in energy than 15 and 10. The turn-like conformations of the 30 and 60 are more compact than the helical 10 and 15 conformations. This compactness caused unfavorable overlap initially, but upon minimization led to favorable van der Waals packing as well and electrostatic interactions between the N- and C- termini.
It is interesting to compare these conformations to the ECEPP/2 global minimum conformation of Li and Scheraga. This conformation, LS is described in Table both before and after DREIDING minimization. Clearly, the Li and Scheraga dihedral angles do specify a global minimum using our geometries and the DREIDING forcefield. The energy of LS is substantially higher than many of the unminimized conformations in Table . After minimization, its energy is far lower but the conformation has changed substantially: the angles have changed by an average of 37.4. Nevertheless, the minimization altered the initial conformation less than it altered the grid-peak conformations of Table , indicating that LS lies closer to a local minimum in the conformational energy space. It should be noted that the Li and Scheraga global minimum is the product of minimization using an internal-coordinate minimizer whereas the DREIDING minimizations use full Cartesian-coordinate minimization of all 3n-6 degrees of freedom. It is possible that the changes to its torsions upon minimization are due to the increased dimensionality, i.e., allowing bonds and angles to vary decreases the energy barriers for the torsional degrees of freedom as well, allowing them to move. This factor may be as important as differences between the DREIDING and ECEPP forcefields. In any case, with an energy of 67.7 kcal/mol, the minimized Li and Scheraga conformation, LS, is more than 10 kcal/mol lower in energy than any of the minimized conformations listed in Table .
The goal of our simulations was to sample the conformational space of Met-enkephalin using the PGMC method, and to fully minimize the conformations to yield optimum conformations. Our simulations produced approximately 50,000 conformations per hour of cpu time on a single processor of a Silicon Graphics 4D/380 workstation. The energy of each conformation was calculated using the DREIDING forcefield with no nonbond cutoffs - all van der Waals and electrostatic pairs were included. Short runs of 10,000 Monte Carlo steps were carried out first, in order to optimize the parameters to be used in longer runs.
The key parameter in any Metropolis Monte Carlo simulation is the simulation temperature (Equation ()), which controls the probability of accepting a new conformation generated at random. A higher temperature means that a conformational change which increases the energy is more likely to be accepted than at a lower temperature. The advantage is that energy barriers can be traversed more easily and conformational space can be searched more thoroughly. The corresponding disadvantage of high simulation temperatures is that the computation can spend a lot of time in bad areas of conformational space and not settle near optimal conformations. The effect of temperature can be seen in Figure , where the results from four separate 10,000-step runs are shown. At 100 step intervals during the simulation, both the current energy (Figure a) and the best energy to that point (Figure b) were recorded. The four simulations, run at temperatures of 0 K, 300 K, 1000 K, and 10,000 K, give very different results. As expected, the 10,000 K simulation has a far greater fluctuation in energy than the lower temperature simulations. However, although it samples a wider variety of conformations, it does not sample as many good conformations. As can be seen in (Figure b), it is the low-temperature simulations at 0 K and 300 K which obtain the lowest-energy conformations. All four simulations began with the 183.6 kcal/mol conformation 15 and produced substantially better conformations. The 300 K simulation produced the best conformation, which had an energy of 131.9 kcal/mol before minimization and 70.3 kcal/mol after minimization - nearly as low as the LS. In contrast, the 10,000 K simulation had a best conformation more than 10 kcal/mol higher, with an energy of 145.7 kcal/mol before minimization. After minimization, however, the energy dropped to 73.9 kcal/mol, nearly as low.
Monte Carlo simulations use random numbers both to produce new conformations and to determine whether new conformations are accepted or rejected. The use of random numbers ensures that simulations will proceed differently every time they are run, even when initial conditions are the same. Therefore, it is possible to run several simulations at the same grid spacing, with the same starting structure, and achieve different final conformations. Single runs like those plotted in Figure are not sufficient for establishing optimum parameters. In order to obtain optimal parameters, we carried out numerous simulations at different grid spacings and temperatures. One such series is show in Figure . These calculations used a 15 grid spacing and a temperature of 1000 K. Forty simulations were run, using different random numbers, and the lowest-energy reached during each simulation was recorded. Twenty simulations, labeled (I), began with the same initial conformation, 15 and twenty began with conformations generated at random from the backbone and sidechain 15 probability grids. All the simulations progressed in the same way, with new conformations generated at every step by selecting one sidechain or one pair at random and choosing a new conformation from the probability grids. However, a different series of random numbers was used each time, so different conformations were generated and different conformations were accepted or rejected. Apparently, there is little advantage to starting with the peak conformation rather than one generated at random. The best energy from the twenty simulations (I) begun with 15 was 143.0 kcal/mol. The average best energy from the twenty simulations was kcal/mol. This compares to an overall best of 143.4 kcal/mol and an average best of kcal/mol for the twenty simulations (R) begun with random conformations.
The next series of simulations was carried out in order to determine which combinations of temperature and grid spacing gave the best results. For each grid spacing (5, 10, 15, 30, and 60 degrees), 10 simulations of 10,000 steps were run at each of five temperatures (0 K, 300 K, 600 K, 1000 K, and 5000 K). Each simulation began with and conformations chosen at random from the probability grids. For each of the 250 simulations, the best overall conformation was saved and its energy recorded. We also recorded the overall acceptance rate during each run. This number, the percentage of new conformations which are accepted, depends directly upon the simulation temperature and indirectly upon the probability grids, which determine the conformational sampling and, therefore, the energy range of conformations which are generated. For each of the 25 temperature/spacing combinations, the overall best energy, the average best energy and the acceptance rate were calculated for its 10 10,000-step runs. The acceptance rates are shown in Figure . As expected, the rate is highly temperature-dependent, with roughly a 70%acceptance rate at 5000 K, 33%at 1000 K, 20%at 600 K, 10%at 300 K, and 0.5%at 0 K. The latter figure actually represents the percentage of time a new minimum energy conformation is reached, since at 0 K, there is no probability of accepting a conformation which is higher in energy than the preceding one. At nonzero temperatures, the acceptance rate includes instances when and when , but meets the Metropolis criterion (see Section ).
The energy minima for the 25 temperature/spacing combinations are shown in Figure . Figure a shows the lowest energy obtained during all ten runs combined while Figure b shows the average energy minimum for the ten separate runs. The overall lowest-energy conformation gives a good indication of how effective a set of parameters is, but it only requires one good conformation to be sampled over 100,000 steps, so it depends on good luck as well as on good parameters. A low value for the average minimum energy, on the other hand, requires that a set of parameters give consistently good answers. This is, therefore, the more useful number. The best overall energy obtained during the 250 runs was 127.9 kcal/mol, sampled during a 0 K simulation using 5 probability grids. However, the best average energies were obtained during 300 K simulations. The average over 10 simulations at 300 K using both 5 and 10 probability grids was essentially the same: 131.8 kcal/mol. This is slightly less than the 5/0 K average of 131.9 kcal/mol and significantly less than the next closest, the 15/300 K simulations, which averaged 132.6 kcal/mol.
It is interesting to note that the energies appear to converge as the grid spacing increases. At low temperatures, finer grids give better results, but at high temperatures, the opposite result is obtained. Figure indicates that this is related to the acceptance rate, which is much higher for 60 than 5 grids at 300 K (16.3%vs. 9.3%), but lower for 60 than 5 grids at 5000 K (62.2%vs. 68.3%). Figure and Figure show a strong negative correlation between high acceptance rates and low energies. This is likely to be the case because low acceptance rates require that the simulation take a more downhill path through conformation space. This may prevent sampling of low energy conformations in distant areas of conformation space, but it leads to lower energies on average. The relationship between grid spacing and acceptance rates is more subtle. The fine 5 and 10 grids have far more possible conformations which vary slightly from one another. Therefore, at high temperatures it may be difficult to make progress towards an energy minimum, because so many conformations are nearby in energy and are accepted. The 60 conformations are likely to vary much more widely in energy, and fewer would be acceptable. At low temperatures, however, the far greater flexibility of the fine-grained grids provide much easier pathways to minima in the potential energy surface. Rapid descent into an energy minimum may lock the structure into areas of conformational space where few low-energy alternatives exist, so the acceptance rate drops rapidly. Indeed, acceptance rates for the first few hundred steps are usually much higher than for later steps. The fact that grids play a significant role in the acceptance rate may mean that a grid-based annealing scheme may be effective, as temperature-base annealing has been shown to be. For instance, one might start with 60 grids in order to sample broader regions of conformational space, but then slowly decrease the grid spacing as the simulation proceeds, to give added flexibility in favored subregions of the potential energy surface.
We chose the most successful spacing/temperature combinations from the above study (5/300 K and 10/300 K) to use in a more thorough second set of simulations. In this set of calculations, twenty 50,000-step simulations were run for each of the two spacing/temperature choices. The best conformation sampled during each run was saved and then minimized. The energies of the best conformations, before and after minimization, are shown in Figure . The results are fairly consistent from one simulation to the next, and there is not a large difference between the simulations which used 10 probability grids and those which used 5 grids. The best energy, on average, for the 10-grid simulations, was 129.4 kcal/mol before minimization and 71.4 kcal/mol after minimization. For the 5 simulations, the averages were 128.0 kcal/mol before minimization, and 73.0 kcal/mol afterwards. It is interesting that the 10 conformations are, on average, better than the 5 conformations after minimization, even though they are worse before minimization. This implies that the 5 conformations are closer to their local minima and are optimized less dramatically (by an average of 55 kcal/mol vs. 58 kcal/mol) during the full Cartesian-coordinate minimization.
Although the results are very consistent for all 40 simulations, no two simulations produce the same energy minimum. Several are extremely close, differing by only 5 or 10 at a single dihedral, but in general there is a fairly good diversity of conformations represented. This can be clearly seen in Figure , where the dihedral angles from the various energy minima are plotted. The numbering for the dihedrals is shown in Table and Figure . Although there are
definite trends among the different minima, only the five dihedrals unsampled by the PGMC method are the same in each conformation. These dihedrals are of Tyr 1 (dihedral 4), and the four backbone dihedrals, (6, 9, 12, and 17). All other dihedrals are represented by at least two different conformations and some by as many as 11. The dihedrals which show the greatest variability are the and of the two glycine residues, dihedrals 7, 8, 10, and 11. This is an important factor to consider in understanding the active conformations of Met-enkephalin. The great flexibility of the glycine backbone means that many conformations are possible which differ radically in their backbone conformation but which are near each other in energy. The active conformation may require one particular glycine conformation which, in the absence of a receptor, is not especially favorable.
Of the twenty optimized conformations created by the 10 simulations, six have energies within 1.0 kcal/mol of LS (Table ). Two of the 5 conformations also had energies within 1.0 kcal/mol. None of these eight conformations, however, was actually lower in energy than LS; the closest had an energy of 68.0 kcal/mol, only 0.3 kcal/mol higher. It is not known whether any conformation close to the Li and Scheraga minimum was sampled during any of the simulations, since its energy is so high before minimization. The eight lowest energy conformations predicted here are extremely similar to one another, particularly at residues Phe 4 and Met 5, where they are all virtually identical. Only one of the 5 conformational minima, which had an energy of 68.6 kcal/mol, differed significantly in this region. As can be seen in Figure b, the other seven minima are identical in this region, and are distinguished primarily at of Met 1, the and of Gly 2, and the of Gly 3. The Li and Scheraga minimum, LS, not shown, is also very similar, but differs in the sidechain dihedrals, and of Phe 4 and of Met 5.