RNA Folding Simulation

Giff Ransom, COMP 259 Final Project, UNC Chapel Hill

giff AT ransomshire DOT com

Background

RNA molecules are single stranded copies of a segment of a gene in the DNA. Though a DNA strand is locked in a double helix with its partner, making their three dimensional structure predictable and stable, RNA molecules have no such complementary strand. Rather than staying stretched out in a line, RNA molecules tend to fold back on themselves, and the three-dimensional shape formed by a given strand is important for studying its interaction with other molecules.

As an RNA molecule folds, its bases bond with other complementary bases in the strand, similar to DNA. However, since the sequence isn't semetrical, it will not match perfectly and form a double helix. Exactly which bases line up with their pair determines the most stable structure of the RNA molecule.

Most biologists use two-dimensional simulations when they need to find the ideal configuration for a given RNA molecule. However, these do not provide an acurate representation of the three-dimensional dynamics and emerging structure.

Three-dimensional molecular dynamic simulations give a much better glimpse of the RNA molecule's structure, as well as a more accurate prediction of the actual shape. For instance, some configurations may look good in the 2D simulation and yet not work in 3D space. Ideally, each atom of the molecule would be modeled, but RNA strands can be as large as 10,000 bases long - far beyond the scope of all-atom simulations.

Project

In his 2002 dissertation, Kurt Grunberger showed that the molecular model can be simplified while still fairly accurately predicting the RNA tertiary structure.

My model is a further simplification - using only one particle for the backbone sugar and one rigid particle for the neucleotide base. This allows for the simulation of much larger strands, while maintaining most of the essential properties of an RNA molecule.

The spring rest states and constraint distances in my system are based on the bDNA configuration of the double helix. While the aDNA configuration is a more accurate configuration for RNA, it is also somewhat more complex since the bases do not lie perpendicular to the helical axis. I used bDNA to get the system up and running, but changing it to aDNA would require only tweaking the system constants.

Bond Angles

The most basic contribution to the forces of the system come from the bond angles. These five key angles are derived from the bDNA configuration and are maintained using a simple spring force. The spring and damping constants were tweaked for stability for my simulation.

Angle Radians Spring Constant Damping Constant
Bm-Sm-Sl 0.955571 30000.0 100.0
Bm-Sm-Su 1.7832 30000.0 100.0
Su-Sm-Sl 2.71556 50000.0 100.0
Dm-Bm-Sm 2.09212 30000.0 100.0
Nm-Bm-Sm 1.570796327 10000.0 100.0

Tortional Angles

There are only two rods to rotate around in the system - rotation along the bond between sugars, and along the bond between sugar and base. However, having the normal face the correct direction along the RNA strand is crucial, so that tortional angle is defined both in terms of the upper and lower sugars. Tortional forces are implemented using springs.

Angle Radians Spring Constant Damping Constant

Bu-Su-Sm-Bm
(along Su-Sm)

0.369694 10000.0 100.0
Nm-Bm-Sm-Su
(along Bm-Sm)
2.32255 100000.0 1000.0
Nm-Bm-Sm-Su
(along Bm-Sm)
.939544 100000.0 1000.0

Constraints

In her book Molecular Modeling and Simulation, Tamar Schlick notes that the use of fixed bond lengths do not significantly alter the dynamical properties of a molecular dyanamics system, although constraining bond angles does. Constrained bond lengths offer the chance to use much larger timesteps, so I implemented them in my project using the method outlined in the SIGRAPH physically based modelling course notes. The distances are in angstrums.

Additionally, I constrain the angle between the normal and direction vectors of the base to be 90 degrees.

Bond Length

Bm-Sm

5.37
Sm-Su 5.89363

Non-Bonded Interaction

The most important interaction in the system is that between two bases. I took several factors into account - electrostatic attraction and repulsion, and van-der-waals attraction and repulsion. I used the lennard-jones potential for the van-der-walls interactions, with a cap on the repulsion force at short distances for the sake of stability. The electrostatic attraction between bases uses Coulmbs law, using an attractive force for complementary base pairs and a smaller repulsive force for non-complementary ones. The electrostatic and van-der-waals forces affect both rotational and directional velocity.

In addition, complementary base pairs also attempt to point their normals in opposite directions, with a force proportional to the coulumb force. I also implemented van-der-waals repulsion between sugars and sugars, and sugars and bases for the sake of collision resistance.

 

Non-bonded interactions clearly dominate computational time, and can easily become an O(n^2) algorithm if there is no optimization. My program uses a cut-off distance of 100 angstroms and an efficient distance comparrison algorithm. This reduces the time to O(n) for non-bonded calculations and O(n log n) for distance measurements.

Brownian Motion

In order to simulate constant collisions with water molecules, I am applying small random forces to each sugar. Unfortunately, I found the constrained dynamics solver highly unstable when these forces are too strong. To remedy this, I apply many weak random forces, and only a few large ones. This keeps the system stable and creates enough turbulance help the RNA strand explore many more confirmations.

Results

The simulation works very well. Small RNA strands can be simulated interactively, and large strands can be simulated and recorded. Running the simulation for strands of 1000 took several days, but the resulting dynamics look really good.

100 Bases
500 Bases
1000 Bases

Future Work

A fairly straightforward expansion of the project would be to implement the aDNA rather than bDNA configuration for constants. Future expansion might focus on tweaking the behavior to better reflect the underlying molecular dynamics. Running the RNA simulator side by side with an MD simulator for the same molecule would provide error checking, and the ultimate goal would be the RNA simulater yielding similar configurations after many timesteps. I don't know how easily this can be achieved with this extra level of abstraction, but the current behavior looks promising.

Source Code & Executable (Uses SparseLib++, GLUT, and GLUI)