toward chemical complexity

Scientific background

Current *ab initio* methods, although very accurate and general, are
computationally extremely demanding. Their application is thus restricted
to relatively small systems and short simulation times. On the other hand,
the use of *classical force fields* allow the dynamical simulation of
millions of atoms, which makes them applicable to the study of a wide variety
of physical processes (e.g., shockwaves,

dislocation dynamics, fracture, oxidation) which require larger system sizes
and longer simulation times. However, force fields are notoriously difficult
to develop and their accuracy must be established for each application.

A main current challenge is to develop methodologies that retain the accuracy
of quantum mechanics (QM) while allowing large-scale

simulations. Current approaches to bridge from electronic structure to atomistics
can be categorized in two groups, described in the next sections.

- Improvement of the techniques for force-field development. The scope is to achieve classical potentials capable of describing chemical reactions. This aims at the production of transferable reactive force fields, normally based on accurate QM calculations which sample the various chemical environments encountered in the applications.
- Development of embedding schemes. Here concurrent, multi-scale models are produced in which high quality QM calculations are performed in a small space-time region (where/when important chemistry is occurring) and a simpler (non reactive) force field description is used to simulate the dynamics of the rest of the system.

**Reactive Force Fields**

In general, the total energy of an atomistic system is written as the sum
of three terms, Eelec, Eval, and

EvdW, where Eelec denotes electrostatic interactions,
Eval denotes valence interactions (bonds, angles, torsions,

etc.), and EvdW denotes van der Waals interactions (short
range repulsion and longer range attraction).

In order to describe chemical reactions and atomic interactions in a changing
chemical environment, most reactive force fields are based on one (or both)
of two main concepts:

- A dependence of Eval on the partial bond order, obtained solely from atomic positions.
- Environment-dependent, self-consistent charges in Eelec (including atomic polarizability if necessary).

These variable charges and partial bond orders are the key concepts that
allow a classical description of chemical reactions,

dramatically improving the transferability of interatomic potentials (e.g.
retaining the same description for oxygen atoms

in O2, H2O and Al2O3).
Concerning the derivation of the partial bond orders, reactive force fields
can be categorized into two main groups:

- Tight-binding derived analytic moment expansions.
- Physically sound functionals parameterized using either high quality QM or experiments.

The two existing models of the latter class of reactive force fields are
the REBO potential (and its further developments,

AIREBO) and the quantum-based force field denoted ReaxFF. Both models were
originally developed for hydrocarbons, but now

include (or will include soon) the extension to other systems (nitrogen,
oxides, and metals). In order to gain accuracy on the

description of chemical reactions, these force fields have to be fitted
on a large variety of QM calculations including bond

dissociation curves, angular and torsional bending, and the decomposition
mechanisms of various molecules. To enable this without

unnecessary duplication of effort, the 2002 CECAM/SIMU workshop community
developed the idea of building a web site hosted by CECAM (currently under
development) for such interatomic potential training sets.

**Embedding schemes**

Another general approach for ``bridging the length scale gap'' between
the quantum and the classical world is based on embedding

strategies in which the accuracy of the description is not uniform throughout
the investigated system.

In the standard embedding approach the system subregion where high accuracy
is needed is described by a QM Hamiltonian,

and is surrounded by (``embedded'' in) a larger zone treated by classical
interatomic potentials. In some applications, a third

level of description (e.g., a finite element description) is used to include
another external comtinuum region [9]. The system

is thus divided into different parts associated to different models (levels
of sophistication), and the main difficulties are

encountered in matching different models at the boundary regions. For instance,
the so-called QM/MM (Quantum Mechanics/ Molecular Mechanics) methods find
their applications in various fields, including biophysics and biochemistry,
where the chemically

interesting parts are treated with {\it ab initio} techniques, while the
surrounding regions are treated by empirical potentials.

Although allowing for ``large-scale'' simulations (several thousand atoms
treated classically, plus typically a hundred

treated by QM), these methods have about the same time limitations of the
pure *ab initio* methods when applied on small system

sizes. This could become a problem when the reaction time of the chemical
processes of interest is much longer than the feasible simulation time. In
these cases, it is necessary to develop special techniques for sampling rare
events [5]. Another important open

problem in these concurrent simulations is how to choose the regions to
be treated by different models, and in particular how these regions should
evolve during the course of the dynamics.

In a different spirit the LOTF (Learn On The Fly) technique uses a unique
(parametrized, classical) Hamiltonian to

describe the atomic interactions. However, when needed, the interatomic
potential parameters associated to the ``chemically

active'' atoms (e.g. atoms in a crack tip) are allowed to vary during the
simulation to better describe their changing environments [6]. The continuously
adjusted parameters, which have a local meaning, can be obtained by ``learning''
from QM

calculations carried out ``on the fly'' on restricted sets of atoms relevant
to the local chemical processes observed during the

simulation, using e.g. a force matching technique. With this procedure,
an arbitrarily accurate description of the dynamical

evolution of a system can be achieved, computing QM information only when/where
needed (time embedding is thus introduced, on the top of the more standard
space embedding). While fitting the parameters ``on the fly'' introduces some
extra complication in the

algorithm, the use of a unique Hamiltonian effectively removes matching
problems in the boundary regions. This method is

similar in some respects to another approach based on classical potentials,
the Optimal Potential method developed by A. Laio and

S. Bernard [7] for metals, in which accuracy is given the priority over
transferability. Here the interaction potential is

characterized by an enlarged set of parameters (not resolved in space) which
are fitted on first principles calculations on the

entire system. As the first principles calculations can be substantially
separated in time, the time limitations of this method are essentially those
of a classical approach.

Most of these hybrid length-scale methods encounter problems concerning
the conservation of energy, since the Hamiltonian is

changed during the simulation when boundary regions move or potential parameters
are changed, and none address the equally

important time-scale problems [5]. An original alternative in this direction,
coupling *in time* and using parallel computers (the ``parareal''
method), has been introduced by G. Zerah [8]. The basic idea of this method
is that a high accuracy dynamical trajectory (e.g. ab initio MD) can be obtained
in a computationally efficient way starting from an approximate trajectory
(e.g. using an empirical

potential) and making a series of iterative corrections that require independent
dynamical runs using the high-accuracy method

but for times much smaller than the total simulation time. If these corrections
are made in parallel, the target accurate

trajectory can be obtained in significantly less wall-clock time (but more
CPU time) than a complete high-accuracy dynamical run

would require. One mathematical property of the relevant equations is that
the convergence of this process is guaranteed.

Multi-scale modeling in material science is a natural application of such
a method (where the fine and coarse grain propagators are

*ab initio* and classical descriptions respectively). This method is
currently being developed and a wide range of applications is envisaged.