Abstracted Protein Simulator
This section gives a historical introduction to the simulator and describes the aims of the project. It then details the current functionality of the simulator and considers possible future extensions.
The Abstracted Protein Simulator started out as my (Dan Mossop's) fourth-year project as a Computer Science undergraduate at the University of Edinburgh. The original specification, provided by my supervisor, Fred Howell, was to create a software toolkit which enabled proteins to be simulated at a highly abstracted level. The proteins were to be considered as mechanical components with the bare minimum of functional detail: only the functionality which significantly affected their behaviour in the system being studied would be modelled.
Removing large amounts of arguably unnecessary detail simplifies the system. This leads to more efficient implementations, with the end result that much larger systems can be simulated. For instance, the computational complexity of conducting simulations in atomic detail means that supercomputers are generally required to model a single protein, and even then only over timescales far shorter than those at which much of the interesting protein behaviour occurs. By contrast, the Abstracted Protein Simulator is able to simulate hundreds of proteins over much longer timescales while rendering the simulation in 3D, all on a standard desktop PC. The tradeoff is, of course, a marked decrease in accuracy.
Using Java and Java3D, I created the version of the simulator which, along with my dissertation, formed my fourth-year project. The simulator and dissertation are available for download here. This version was force-based. That is, the proteins in the system were updated according to the forces acting on them from other proteins and the surrounding fluid. The system was partially successful: the instantaneous force calculations and the new velocity calculations appeared to be accurate. However, it became clear that the use of a force-based model was probably not the best way to tackle the problem. The calculation of instantaneous forces worked well over small timesteps, when the change in displacement was small, but over the larger timesteps at which much of the behaviour we were interested in occurred, the error resulting from the force sampling became very large. As a result, the modelling capability of this version was quite limited.
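The growth of force-sampling error with timestep can be illustrated with a toy example. The sketch below is not taken from the simulator: it assumes a single unit-mass particle on a unit-stiffness spring (a hypothetical stand-in for a protein under a restoring force) and integrates it with explicit Euler, sampling the force once per step. The class and method names are invented for illustration.

```java
/** Toy illustration of force-sampling error; not the simulator's own code. */
public class ForceSamplingError {

    /**
     * Integrates a unit-mass particle on a unit-stiffness spring (force = -x)
     * with explicit Euler, sampling the force once per step, and returns the
     * absolute position error against the exact solution cos(t).
     */
    static double positionError(double dt, double totalTime) {
        double x = 1.0; // initial displacement
        double v = 0.0; // initial velocity
        int steps = (int) Math.round(totalTime / dt);
        for (int i = 0; i < steps; i++) {
            double force = -x;        // instantaneous force at the start of the step
            double newX = x + v * dt; // position advanced with the old velocity
            v += force * dt;          // velocity advanced with the sampled force
            x = newX;
        }
        return Math.abs(x - Math.cos(steps * dt));
    }

    public static void main(String[] args) {
        for (double dt : new double[] {0.01, 0.1, 0.5}) {
            System.out.printf("dt = %.2f  error = %.4f%n", dt, positionError(dt, 10.0));
        }
    }
}
```

Over the same total time, the error for the largest timestep is far greater than for the smallest, because the force sampled at the start of each step no longer represents the force acting across the whole step.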
After completing my degree I was employed by the University of Edinburgh to carry out further work on the simulator. One of the main areas to deal with was finding a replacement for the force-based update mechanism. Several common modelling techniques were considered, including Dissipative Particle Dynamics (DPD), which is normally used to study fluid dynamics. Unfortunately DPD and most of the others also rely largely on force-based techniques, rendering them unsuitable for large timescales. One technique which offered a real solution to the problem was Monte-Carlo simulation.
Monte-Carlo simulations are probabilistic and their updates are based on the energy of the system, not the forces acting. By choosing randomly altered configurations as the next possible state of the system and favouring those in which the overall energy is lower than in the current state, Monte-Carlo systems create a bias towards low-energy states, as occurs in actual physical systems. This reliance on energies was suited to the study of protein systems as it copes well with large timesteps. One thing it did not seem to cope well with, however, was flexible bonds between proteins. If, for example, several proteins were bonded in a ring formation and the update was made for a large timescale, then the resultant energy would almost always be a lot higher than the initial energy (due to the bonds being stretched), leading to the new state being rejected. To avoid this problem a decision was made to use only rigid bonds in the models.
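The energy-based update described above can be sketched as follows. This is a minimal illustration, not the simulator's code: it assumes the standard Metropolis acceptance rule (always accept a move that lowers the energy, otherwise accept with probability exp(-deltaE / kT)), which may differ from the rule the simulator actually uses, and it uses a hypothetical one-dimensional harmonic energy function in place of a real protein system. All names are invented for illustration.

```java
import java.util.Random;

/** Illustrative Metropolis Monte-Carlo sketch; not the simulator's actual code. */
public class MetropolisSketch {

    /** Hypothetical energy function: a 1D harmonic well centred on zero. */
    static double energy(double x) {
        return 0.5 * x * x;
    }

    /**
     * Metropolis acceptance rule (an assumption here): always accept a move
     * that does not raise the energy, otherwise accept with probability
     * exp(-deltaE / kT).
     */
    static boolean accept(double deltaE, double kT, Random rng) {
        if (deltaE <= 0.0) {
            return true;
        }
        return rng.nextDouble() < Math.exp(-deltaE / kT);
    }

    /** Runs the given number of trial moves and returns the final position. */
    static double run(double start, int steps, double maxMove, double kT, Random rng) {
        double x = start;
        for (int i = 0; i < steps; i++) {
            // Propose a randomly altered configuration.
            double candidate = x + (rng.nextDouble() * 2.0 - 1.0) * maxMove;
            double deltaE = energy(candidate) - energy(x);
            if (accept(deltaE, kT, rng)) {
                x = candidate; // otherwise the current state is kept
            }
        }
        return x;
    }

    public static void main(String[] args) {
        Random rng = new Random(42);
        double finalX = run(10.0, 10_000, 0.5, 1.0, rng);
        System.out.printf("final position %.3f, energy %.3f%n", finalX, energy(finalX));
    }
}
```

The rejection problem with flexible bonds follows directly from this rule: if almost every proposed configuration stretches a bond and so raises the energy sharply, exp(-deltaE / kT) is close to zero and almost every move is rejected.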
Current Functionality
The Abstracted Protein Simulator continues to use Java and Java3D and is capable of performing frame-by-frame 3D rendering of the simulation scene. It now also takes in descriptions of the scene from XML files.
Proteins in the system can be represented by spheres, cylinders or compound objects formed from the two. The proteins can have a number of bond sites whose behaviour is controlled by rules specified in the input file. Updates of the system state happen in two stages. First, the positions of the objects are updated using Monte-Carlo techniques. Then bonds are created or dropped as required to satisfy the bond rules. When bonds are formed, the bonding objects are rotated into alignment. The simulator has a user interface from which new simulations can be loaded and running simulations can be controlled.
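The two-stage update can be sketched in a heavily simplified form. Everything in the example below is an assumption made for illustration: proteins are reduced to hard spheres on a line, the Monte-Carlo stage simply rejects trial moves that would cause an overlap, and the only bond rule is a hypothetical distance threshold. Bond dropping, bond sites, rotation into alignment and the XML-specified rules of the real simulator are all omitted.

```java
import java.util.Arrays;
import java.util.Random;

/** Illustrative two-stage update; every name and rule here is hypothetical. */
public class TwoStageUpdateSketch {

    static final double DIAMETER = 1.0;      // assumed hard-sphere diameter
    static final double BOND_DISTANCE = 1.2; // assumed rule: bond when centres are this close

    final double[] position; // 1D centre of each protein
    final int[] partner;     // index of the bonded partner, or -1 if unbonded

    TwoStageUpdateSketch(double... start) {
        position = start.clone();
        partner = new int[start.length];
        Arrays.fill(partner, -1);
    }

    /** Stage 1: trial moves for unbonded proteins; moves causing overlap are rejected. */
    void updatePositions(double maxMove, Random rng) {
        for (int i = 0; i < position.length; i++) {
            if (partner[i] != -1) {
                continue; // bonded pairs are simply held fixed in this sketch
            }
            double candidate = position[i] + (rng.nextDouble() * 2.0 - 1.0) * maxMove;
            if (!overlaps(i, candidate)) {
                position[i] = candidate;
            }
        }
    }

    /** Returns true if protein i, placed at candidate, would overlap another protein. */
    boolean overlaps(int i, double candidate) {
        for (int j = 0; j < position.length; j++) {
            if (j != i && Math.abs(position[j] - candidate) < DIAMETER) {
                return true;
            }
        }
        return false;
    }

    /** Stage 2: bond unbonded proteins that satisfy the assumed distance rule. */
    void updateBonds() {
        for (int i = 0; i < position.length; i++) {
            for (int j = i + 1; j < position.length; j++) {
                if (partner[i] == -1 && partner[j] == -1
                        && Math.abs(position[i] - position[j]) <= BOND_DISTANCE) {
                    partner[i] = j;
                    partner[j] = i;
                }
            }
        }
    }

    /** One full update: positions first, then bonds. */
    void step(double maxMove, Random rng) {
        updatePositions(maxMove, rng);
        updateBonds();
    }

    public static void main(String[] args) {
        TwoStageUpdateSketch sim = new TwoStageUpdateSketch(0.0, 3.0, 6.0);
        Random rng = new Random(7);
        for (int n = 0; n < 1000; n++) {
            sim.step(0.5, rng);
        }
        System.out.println("positions: " + Arrays.toString(sim.position));
        System.out.println("partners:  " + Arrays.toString(sim.partner));
    }
}
```

Keeping the two stages separate means the position update never has to reason about bond rules, and the bond stage only ever sees a configuration that the position stage has already accepted.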
Future Extensions
The simulator as it currently stands has sufficient functionality to perform a range of interesting simulations (see, for instance, the synapse and vesicle sample applications). There are, however, a number of possible extensions which could increase the range of possible simulations.
Implementing flexible bonds between the proteins, rather than insisting that all bonds be rigid, would significantly increase the power of the simulator. It would allow simulations which represent nature more accurately and would allow the simulation of systems in which bond flexibility plays a crucial role in the outcome.
Another improvement which could be made is to the speed of the simulator. While some work has gone into optimizing the simulator, resulting in a significant speed-up in execution time, there is still some room for improvement. The principal benefit of faster execution times is that more, or more complex, simulations can be carried out in a given period of time.
A PowerPoint presentation by Fred Howell, "Cartoon modeling of protein interactions", covers the aims and rationale of this project.