
Introduction

Knowledge of the detailed atomic structures of biological macromolecules has accumulated rapidly in recent years (see, e.g., Refs. Deisenhofer88, Henderson90). That progress opens the chance to understand macromolecular biological function in terms of basic physical and chemical notions. Many aspects of protein function, in particular, are known to be connected to dynamical processes within these macromolecules[3,4,5]. Therefore, adequate descriptions of that molecular dynamics (MD) are required and represent essential clues in the attempt to derive function from structure. Owing to the structural complexity of proteins and a corresponding lack of well-founded coarse-grained effective models for the dynamics, the method of MD-simulation[6,7] is currently the only approach to which some reliability can be assigned. That method conceives of a macromolecule as a classical many-body system of `atoms' and describes the quantum-mechanical forces caused by the electronic degrees of freedom, such as the chemical binding forces, by a semi-empirical force field. Accordingly, the molecular dynamics is simulated by integration of the Newtonian equations of motion.
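As a minimal sketch of what integration of the Newtonian equations of motion means in practice, the following Python fragment propagates a single particle in a harmonic potential with the velocity Verlet scheme. The harmonic force, the reduced units, and all names (`harmonic_force`, `velocity_verlet`, `total_energy`) are illustrative assumptions; they are not the force field or the integrator used in this paper.

```python
def harmonic_force(x, k=1.0):
    """Force of a hypothetical 1D harmonic 'bond' with force constant k."""
    return -k * x


def velocity_verlet(x, v, force, mass=1.0, dt=0.01, n_steps=1000):
    """Integrate Newton's equation m * x'' = force(x) with velocity Verlet."""
    f = force(x)
    trajectory = [(x, v)]
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass   # first half-step velocity update
        x = x + dt * v_half                # full-step position update
        f = force(x)                       # force at the new position
        v = v_half + 0.5 * dt * f / mass   # second half-step velocity update
        trajectory.append((x, v))
    return trajectory


def total_energy(x, v, mass=1.0, k=1.0):
    """Kinetic plus potential energy of the harmonic toy system."""
    return 0.5 * mass * v * v + 0.5 * k * x * x


# Start at rest, displaced by one length unit (reduced units, assumed).
traj = velocity_verlet(x=1.0, v=0.0, force=harmonic_force)
e_start = total_energy(*traj[0])
e_end = total_energy(*traj[-1])
```

For a time step that is small compared with the oscillation period, `e_end` should stay close to `e_start`; that near-conservation of the total energy is the usual sanity check for an integrator of this kind.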

The enormous computational task associated with MD-simulation of biological macromolecules entails an upper limit on the time scale of dynamical processes accessible by this method: the MD-simulation of one nanosecond ($10^{-9}$ s) of an average-sized system consisting of 30,000 atoms requires on the order of $10^{16}$ floating point operations if all long-range interactions are taken into account, which corresponds to about 200 days of CPU-time on a 1 GFLOPS supercomputer. Hence, the limit of accessible time scales set by current computer technology is in the nanosecond range.
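The arithmetic behind this estimate can be checked with a few lines of Python. The CPU-time, the processor speed, and the system size are taken from the text; the integration time step of 1 fs is an assumption introduced here only to break the total down into a cost per step and per atom pair.

```python
SECONDS_PER_DAY = 24 * 60 * 60

cpu_days = 200            # from the text
flops_per_second = 1e9    # 1 GFLOPS, from the text
n_atoms = 30_000          # from the text

# Total number of floating point operations implied by the CPU-time.
total_flops = cpu_days * SECONDS_PER_DAY * flops_per_second

# Assumption (not stated in the text): a time step of 1 fs,
# so that 1 ns of dynamics corresponds to 10**6 integration steps.
time_step_fs = 1.0
n_steps = int(1e6 / time_step_fs)

flops_per_step = total_flops / n_steps
n_pairs = n_atoms * (n_atoms - 1) // 2     # all atom pairs, no cut-off
flops_per_pair = flops_per_step / n_pairs

print(f"total operations : {total_flops:.2e}")
print(f"per step         : {flops_per_step:.2e}")
print(f"atom pairs       : {n_pairs:.2e}")
print(f"per pair and step: {flops_per_pair:.1f}")
```

Under the assumed time step, the total of roughly 1.7e16 operations amounts to a few tens of operations per atom pair and step, which is plausible for evaluating a pairwise interaction without a cut-off.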

However, many biochemical processes occur at time scales that are six to twelve orders of magnitude larger than that limit: typical ligand binding reactions as well as quaternary rearrangements occur in the range to seconds; protein aggregation and protein folding processes require up to seconds[8]. Admittedly, considerable effort has been devoted to increasing the computational performance of MD, including efficient implementations of MD-codes on vector machines (e.g. Ref. Mertz91) or, more recently, on parallel computers [10,11,12]. However, assuming that such efforts generate an increase of processing capabilities at about the rate of every ten years, one is forced to the conclusion that computer technology will not allow MD-descriptions of many important biochemical processes before the year 2030 (cf. also Ref. Gunsteren90).

At present, a reduction of the amount of computation involved in the description of protein dynamics is a prerequisite for further extending the range of accessible biochemical processes. Accordingly, various techniques have been developed and employed, among which three main approaches can be distinguished:

(a): Mathematical and numerical methods attempt to reduce the amount of necessary computation essentially without any modification of the physical description. These methods include higher-order integration algorithms[14], the use of generalized internal coordinates[15,16], symplectic integration algorithms[14,17], fast multipole methods[18,19], variable time step methods[20,21], and various multiple time step methods[20,22,23,24,25,26].
(b): Proper approximations modify the molecular model employed in MD-simulations in a way that enables a reduction of the computational task. Here, care has to be taken to ensure that the approximations do not alter the physics of the macromolecular dynamics too seriously[27]. Examples are the neglect of long-range interactions by use of a ``cut-off'' function[23], the suppression of fast degrees of freedom[28], and the so-called ``mass-tensor'' molecular dynamics[29].
(c): Effective models are designed to replace the original MD-model. They rest on a classification of `relevant' vs. `irrelevant' system properties for a given dynamical process. These models reduce the explicit description to the relevant properties and assume that the action of the irrelevant properties can be taken into account implicitly by renormalized interactions or other quantities representing statistical averages. Stochastic models, in particular, are based on the assumption that the detailed dynamics of fast degrees of freedom, such as bond or bond-angle vibrations, is not essential for protein structure and function. Successful applications of stochastic descriptions such as Monte Carlo simulation[30,31], Langevin dynamics[32], generalized Langevin dynamics with memory friction (e.g. Ref. Tuckerman91a), or the use of statistical potentials[34] support that assumption.
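To illustrate the idea behind the stochastic descriptions named in (c), the following Python sketch integrates Langevin dynamics for one particle in a harmonic potential: the eliminated fast degrees of freedom appear only as a friction term and a random force. The Euler-Maruyama scheme, the reduced units, and all parameter values and names (`langevin_trajectory`, `gamma`, `kT`) are assumptions chosen for illustration; they do not reproduce any of the cited methods.

```python
import math
import random


def langevin_trajectory(n_steps, dt=0.01, k=1.0, mass=1.0,
                        gamma=1.0, kT=1.0, seed=0):
    """Euler-Maruyama integration of 1D Langevin dynamics.

    m dv = (-k x - m gamma v) dt + sqrt(2 m gamma kT) dW
    All parameters are in hypothetical reduced units.
    """
    rng = random.Random(seed)
    # Standard deviation of the random velocity increment per step.
    noise_amp = math.sqrt(2.0 * gamma * kT * dt / mass)
    x, v = 0.0, 0.0
    velocities = []
    for _ in range(n_steps):
        force = -k * x
        v_new = (v + dt * (force / mass - gamma * v)
                 + noise_amp * rng.gauss(0.0, 1.0))
        x = x + dt * v          # position update uses the old velocity
        v = v_new
        velocities.append(v)
    return velocities


vels = langevin_trajectory(n_steps=400_000)
mean_v2 = sum(v * v for v in vels) / len(vels)   # expected near kT / mass
```

With friction and noise balanced in this way, the mean squared velocity should settle near `kT / mass`, up to statistical noise and a small discretization bias of the simple integrator.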

Certainly, this classification is not mutually exclusive, which becomes obvious by considering the neglect of the long-range part of the Coulomb interaction as an example: that method can equally be regarded as an approximation and as an effective model implicitly accounting for shielding effects caused by atomic polarizabilities.

Most of the above methods have been designed for a wide class of many-body systems and, therefore, represent general-purpose methods. Application of these methods to proteins typically speeds up MD-simulations by about one order of magnitude. However, in view of the desire to increase accessible time spans by six to twelve orders of magnitude, the efficiency gains achieved by these general-purpose methods represent only a moderate success.

Major efficiency gains can be expected if computational methods and effective models are developed which more specifically take advantage of structural and dynamical properties particular to proteins. That expectation rests on the emerging notion that proteins actually possess unique properties which distinguish these many-body systems from others[35]. In particular, a clear-cut identification of irrelevant degrees of freedom, explicit consideration of which usually is computationally demanding, should allow considerable efficiency gains by development of coarse-grained effective models, which are adjusted to the particular dynamical and structural properties of proteins.

From the above discussion we conclude that a proper characterization of protein dynamics is a prerequisite for the development of efficient protein dynamics descriptions. We note that a separation of relevant observables from irrelevant ones is also required for an application-oriented evaluation of a given MD-method. Usually, such an evaluation is based on a comparison of certain quantities computed from test simulations carried out with the given MD-method with corresponding quantities obtained from simulations employing a reference method, which is assumed to provide more accurate results (cf. Refs. Verlet67, Gibson90). However, for an application-oriented evaluation, the quality of the given method should be evaluated solely with respect to its ability to describe relevant properties accurately. These ideas are discussed and exemplified in detail in a forthcoming paper[38] as well as in Ref. [26].

In the present paper we focus on the question of how knowledge of the very special dynamical properties of proteins can be acquired. Below we will motivate the hypothesis that studies of the dynamics of simplified protein models are well suited to contribute to such knowledge, provided the dynamical properties of the protein model can be shown to be sufficiently similar to those of more realistic protein models.

In contrast to less complex many-body systems, the dynamics of proteins appears to involve a hierarchy of time scales[39,40]. The high-frequency dynamics of protein models has been examined in detail by means of MD-simulation[41,42] as well as by normal mode analysis[43], whereas knowledge about the low-frequency dynamics is sparse. However, many quantities which are important to protein function are defined only at slow time scales well above 100 picoseconds: mean first passage times for transitions between conformational substates[44,4], which are considered elementary steps of `functionally important motions'[45], fall in that region. Computations of corresponding transition rates, e.g. by transition state or activated dynamics [5,46], or of other relevant quantities, like free energies, typically require large statistical ensembles and, therefore, show slow convergence. Accordingly, studies of infrequent conformational motions have been possible only for small polypeptides [47,48,49,50]. Typically, the time scale covered by available sampling techniques like umbrella sampling [51] or various force-bias methods [31,52] is too short to provide an ensemble large enough for accurate results. Hence, a characterization of protein dynamics is required especially in the low-frequency region.

At first glance, this requirement appears to entail a vicious circle which impedes the development of efficient protein dynamics descriptions: on the one hand, owing to the structural complexity of the system as well as to the lack of experimental data, studies of dynamical properties in the low-frequency region have to rely on extended MD-simulations. On the other hand, it is difficult to carry out such simulations unless sufficiently efficient protein dynamics descriptions have been developed.

In view of that problem we note that insights into slow phenomena of protein dynamics, such as the folding process, have been provided by studies of small, oversimplified protein models, such as lattice models[53] or `bead'-models[54]. Of course, the simplified structure of such model systems requires a careful interpretation of results in order to provide information on properties of real proteins. At the same time, their simplicity is the key advantage of such systems: it permits extended simulations covering time spans several orders of magnitude larger than those accessible to simulations of more realistic protein models. Hence, analysis of the dynamics of a simplified protein model by means of extended MD-simulations should enable insights into dynamical properties of proteins and, therefore, should contribute to the development of more efficient protein dynamics descriptions.

The present paper exemplifies that approach by considering a `minimal model' for proteins, which is described in the following Section. Despite its simplified structure, MD-simulations carried out on this protein model reveal dynamical properties similar to those computed from MD-simulations of more realistic, complex protein models or to those obtained from experiments (Section ). In particular, the dynamics of the protein model was found to exhibit conformational transitions at a time scale of several hundred picoseconds. Such conformational dynamics appears to be ubiquitous in protein dynamics[55,56]. As we will argue, these similarities support the assumption that results obtained by our extended MD-simulations of the simplified protein model actually provide information about the low-frequency dynamics of real proteins.

In Section we analyze whether memory effects are present in the dynamics of our model. Memory effects of dynamic quantities are well known to exist in proteins at short time scales up to the picosecond range[33]. There they give rise to non-vanishing autocorrelation functions of atomic positions or velocities[57]. Proper consideration of these short-time correlations is essential for stochastic descriptions of fast degrees of freedom[58]. Accordingly, the development of coarse-grained effective descriptions of slow degrees of freedom requires knowledge about the time scales of correlations. To contribute to such knowledge, we address the question of the extent to which memory effects show up in the low-frequency conformational dynamics of proteins.
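For orientation, a normalized autocorrelation function of a sampled observable can be estimated as sketched below in Python. This is a generic textbook estimator under the assumption of an evenly sampled, stationary time series; the function name `autocorrelation` and the test series are hypothetical, and the estimator is not necessarily the one used in the analysis of this paper.

```python
def autocorrelation(series, max_lag):
    """Normalized autocorrelation C(lag) = <dx(t) dx(t + lag)> / <dx^2>.

    `series` is an evenly sampled time series with non-zero variance;
    dx denotes the deviation from the series mean.
    """
    n = len(series)
    mean = sum(series) / n
    dx = [s - mean for s in series]
    variance = sum(d * d for d in dx) / n
    acf = []
    for lag in range(max_lag + 1):
        # Average the lagged products over all available time origins.
        cov = sum(dx[t] * dx[t + lag] for t in range(n - lag)) / (n - lag)
        acf.append(cov / variance)
    return acf


# Strictly alternating series: perfectly anti-correlated at lag 1.
acf = autocorrelation([1.0, -1.0] * 50, max_lag=2)
```

An autocorrelation function that has decayed to zero beyond some lag indicates that memory of the observable is lost on that time scale; values that remain non-zero signal the kind of memory effects discussed above.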


Helmut Grubmueller
Mon Nov 6 16:25:56 MET 1995