The structure of the apo-form of glutaminyl cyclase from Ixodes scapularis at the end of a 400-nanosecond molecular dynamics simulation

Biomolecular force field validation

Classical molecular dynamics simulations are widely used to model biomolecular systems for drug design, protein structure determination, and biophysics research in general. In classical molecular dynamics, the motion of the atoms and molecules is obtained by numerically integrating Newton’s equations of motion over time. The reliability of such simulations depends on the accuracy of the model used to describe the forces acting on the atoms. These models are commonly referred to as “force fields”. A number of alternative biomolecular force fields have been published. The force fields developed by the group of Professor Alan Mark at UQ are used by thousands of researchers worldwide. Testing and validating these force fields is critical. The question we are trying to address is how much data is needed to reliably determine which of two force fields provides the more accurate description of both protein structure and dynamics. Specifically, how many different proteins need to be tested, and for how long does each need to be simulated?
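As a minimal illustration of what "numerically integrating Newton’s equations of motion" means, the sketch below applies the velocity Verlet scheme, a standard integrator in molecular dynamics, to a single particle on a harmonic spring. It is a toy example only: the function names, the spring constant, the mass, and the time step are all assumptions chosen for illustration and are not taken from any force field or simulation package mentioned here. In a real simulation the force would come from the force field and act on many thousands of atoms in three dimensions.

```python
import math


def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Integrate Newton's equation m*a = F(x) for one particle in 1D.

    Returns a list of (position, velocity) pairs, including the initial state.
    """
    f = force(x)
    trajectory = [(x, v)]
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update
        x = x + dt * v_half                # full-step position update
        f = force(x)                       # force at the new position
        v = v_half + 0.5 * dt * f / mass   # complete the velocity update
        trajectory.append((x, v))
    return trajectory


# Hypothetical toy "force field": a harmonic spring in arbitrary units.
K = 1.0      # spring constant (assumed)
MASS = 1.0   # particle mass (assumed)


def harmonic_force(x):
    return -K * x


def total_energy(x, v):
    return 0.5 * MASS * v * v + 0.5 * K * x * x


DT = 0.01
N_STEPS = 1000
trajectory = velocity_verlet(1.0, 0.0, harmonic_force, MASS, DT, N_STEPS)

# Compare with the exact solution x(t) = cos(omega * t) for this toy system.
omega = math.sqrt(K / MASS)
x_final, v_final = trajectory[-1]
x_exact = math.cos(omega * DT * N_STEPS)
energy_drift = total_energy(x_final, v_final) - total_energy(*trajectory[0])

print(f"final position: {x_final:.5f} (exact {x_exact:.5f})")
print(f"energy drift:   {energy_drift:.2e}")
```

For this toy system the integrated trajectory can be checked against the exact solution and against conservation of total energy. For a protein there is no exact solution to compare with, which is why the accuracy of the force field itself has to be validated against experiment.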

Computational throughput is the major limiting factor. Even localised protein dynamics, such as loop movements, occur on timescales of microseconds to milliseconds. The longer we can simulate and the more systems we can consider, the more reliably we can compare to experiment. To push to longer timescales, I developed a web-based interface to convert files in the GROMOS format into a format compatible with the Amber molecular dynamics package (see the “Topology Converter” tab on the ATB website). The advantage of Amber is that it runs efficiently on Graphics Processing Units. These simulations enable us to test not only the accuracy of force fields but also the reliability of the available structural models. For example, in some cases the simulations have revealed instabilities in the protein that could be traced to errors in the structure deposited in the Protein Data Bank (PDB).