Why are RNA functions so new to us?

This is a question many students have asked me. If RNA is so critical, and non-coding RNAs are so common, including those involved in regulation (RNAi), viral defence (siRNAs, CRISPRs) and the general running of the cell (RNase P, tmRNA), why are we only just learning about them?

The answer is that only recently have we had both the knowledge to recognise that RNA is carrying out these functions and the technology to uncover and study them. Researchers studied protein-coding genes for decades because we knew roughly how they change over time. If the nucleotide sequence (the string of G’s, C’s, A’s and T’s in DNA, with U’s in place of T’s in RNA) changes, the protein sequence can change with it and the protein may not work.
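To make the protein case concrete, here is a minimal Python sketch of how a single nucleotide change can alter the protein. The sequences are made up for illustration, and the codon table is deliberately partial: it holds only the few standard-code codons the example needs.

```python
# Partial codon table (standard genetic code); only the codons used below.
CODON_TABLE = {
    "ATG": "M",  # methionine (start)
    "CCT": "P",  # proline
    "GAG": "E",  # glutamate
    "GAA": "E",  # glutamate (a synonymous codon)
    "GTG": "V",  # valine
}


def translate(dna):
    """Translate a DNA coding sequence into one-letter amino acids, codon by codon."""
    usable = len(dna) - len(dna) % 3  # ignore any trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Hypothetical sequences, invented for this example.
original = "ATGCCTGAGGAG"
silent = "ATGCCTGAAGAG"    # GAG -> GAA: the amino acid stays E
missense = "ATGCCTGTGGAG"  # GAG -> GTG: E becomes V

print(translate(original))
print(translate(silent))
print(translate(missense))
```

Not every change is harmful (the second sequence encodes the same protein as the first), but because the code is read in fixed triplets, the changes a working gene can tolerate are tightly constrained, and that is what makes protein-coding genes recognisable across species.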

With non-coding RNA (ncRNA) the nucleotide sequence is not forced to keep its patterns in the way that a protein-coding sequence is. Instead the string codes for how the RNA folds and binds. This means that large portions of the sequence can be completely different between species, so long as the base pairing is the same and the small protein-binding sequences are present. In RNA there is extra flexibility in the pairing, since G can pair with U as well as with C, and other unusual pairings such as G:A and C:A occur frequently in some molecules. Secondary structure rules also have some flexibility. As you can see in the figure below (I have tried to make it big enough), the RNase P RNA forms a different structure in humans than in one of our parasites, Giardia. The sequence is quite different and the structure is different, but the function is the same. RNA has a lot of flexibility, which means that it has the potential to evolve to take on more and new functions.
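As a rough illustration of the point about base pairing, the Python sketch below checks whether a sequence is compatible with a given secondary structure written in dot-bracket notation (matching brackets are paired positions, dots are unpaired). The hairpin and all three sequences are invented for this example, and the function names are my own. Only Watson-Crick and G:U pairs are allowed here; the rarer G:A and C:A pairings mentioned above are left out to keep it short.

```python
# Pairs accepted in this sketch: Watson-Crick pairs plus the G:U wobble pair.
ALLOWED_PAIRS = {
    ("G", "C"), ("C", "G"),
    ("A", "U"), ("U", "A"),
    ("G", "U"), ("U", "G"),
}


def pairs_from_dot_bracket(structure):
    """Return the (i, j) index pairs described by a dot-bracket string."""
    stack, pairs = [], []
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            pairs.append((stack.pop(), i))
    return pairs


def can_fold(seq, structure):
    """True if every paired position in the structure holds an allowed base pair."""
    if len(seq) != len(structure):
        return False
    return all((seq[i], seq[j]) in ALLOWED_PAIRS
               for i, j in pairs_from_dot_bracket(structure))


def shared_positions(a, b):
    """Count the positions at which two equal-length sequences carry the same base."""
    return sum(x == y for x, y in zip(a, b))


hairpin = "((((....))))"  # a 4-base-pair stem closing a 4-base loop
seq_a = "GGCAUUCGUGCC"    # hypothetical sequence from "species A"
seq_b = "AUGGGAAACUAU"    # hypothetical sequence from "species B", uses one G:U pair
seq_c = "GGCAUUCGAGCC"    # seq_a with one change that leaves an A opposite an A

print(can_fold(seq_a, hairpin), can_fold(seq_b, hairpin), can_fold(seq_c, hairpin))
print(shared_positions(seq_a, seq_b))
```

The first two sequences do not share a single base at any position, yet both fit the same hairpin, while a single unlucky change in the third breaks a pair. A search based on sequence similarity alone would never connect the first two.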

But this flexibility also means that finding these genes can be extremely hard. When we look at a new species we may have no idea what the real structure looks like until we find the gene. We usually have to guess, and then see what our guesses turn up. It becomes less the detective work of other genomics and more a real needle-in-a-haystack exercise. More on that next week in ‘How to find an RNA- 101’.

Adapted from Kikovska et al. 2007, PNAS 104:2062-2067. The RNA sequence starts from the top-left overhang, goes around the molecule and comes back to pair with the beginning again. Pairings are shown by dashes, except for the very long-range and tricky pairing in the P4 region.