Research

The Valadkhan lab is interested in the role of non-protein-coding RNA molecules in higher eukaryotes, especially in humans. It has recently been shown that ~99% of the RNAs transcribed from the human genome do not code for proteins, but instead function as RNA molecules. This intriguing discovery will have a huge impact on the way we think about cellular function. It is becoming increasingly evident that these RNAs play highly critical roles in the cell, and recent data suggest that they form highly complex regulatory pathways that have made the enormous complexity of the human body possible. In fact, some reports suggest that these RNAs are responsible for the differences between humans and other primates! Our research focuses on two main areas of investigation regarding non-protein-coding RNAs:


  • 1) The function of human snRNAs, the RNA components of the spliceosome

  • 2) The function of large non-protein-coding RNAs





What primordial Earth may have looked like. Spliceosomal snRNAs most likely originated from catalytic RNAs that were the first forms of life on Earth. According to the "RNA World" hypothesis, catalysts made entirely of RNA acquired the ability to replicate themselves through Watson-Crick base pairing, thus ensuring the passage of genetic information to their progeny and the genetic continuity of such self-replicating ribozymes. Today, relics of primordial RNA catalysts still play crucial roles in modern living organisms. The snRNAs, the ribosomal RNAs, and RNase P, among others, belong to this category.



1) The snRNAs:

Molecular fossils, gatekeepers of genetic information, and culprits in many human diseases

The snRNAs are the central components of snRNPs, or small nuclear ribonucleoproteins. snRNPs are of considerable medical interest, since they are the major autoantigens in most autoimmune diseases, such as lupus erythematosus and rheumatoid arthritis. They are also the major components of the spliceosome. Beyond the central role of splicing in human gene expression, a huge number of human diseases, including a long list of cancers, certain forms of Alzheimer's disease, some autoimmune diseases, and almost 50% of all human genetic diseases, are caused by splicing mistakes. All of this underscores the urgent need to understand the function of these molecules, which in turn will allow us to understand the mechanism of a wide range of human diseases, and ultimately help in designing cures.

In addition to their critical role in splicing and human disease, snRNAs are most likely molecular fossils, having their origin in very early, pre-cellular forms of life! This adds an intellectually and evolutionarily intriguing dimension to their significance.

It has been estimated that at any given time there are around 150,000 mRNAs in each mammalian cell. Since the average human pre-messenger RNA has about 7-8 introns, and considering the rapid turnover of the cellular mRNA pool, it is hardly surprising that the human spliceosome is considered the largest and most complex cellular machine, consisting of over 200 different components, and that each human cell contains about 1,000,000 copies of each snRNA! This enormous complexity has hampered the study of spliceosome function, which in turn has seriously limited our understanding of the mechanism of a large number of human diseases.
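To make the scale concrete, the two figures quoted above can be multiplied out. This is only a back-of-the-envelope sketch: it assumes every mRNA in the pool came from a pre-mRNA with the average intron count, and it ignores turnover, so the true workload over time would be higher. The function name is our own.

```python
# Figures quoted in the text; everything else is an illustrative assumption.
MRNAS_PER_CELL = 150_000          # estimated mRNAs per mammalian cell
INTRONS_PER_PRE_MRNA = (7, 8)     # average intron count, low and high estimate


def introns_removed(mrnas: int, introns_per_transcript: int) -> int:
    """Introns that must be spliced out to produce the given number of mRNAs."""
    return mrnas * introns_per_transcript


low, high = (introns_removed(MRNAS_PER_CELL, n) for n in INTRONS_PER_PRE_MRNA)
print(f"{low:,} to {high:,} introns removed to build one mRNA pool")
```

Under these assumptions, producing a single standing pool of mRNAs takes on the order of a million splicing events, which is the same order of magnitude as the snRNA copy number quoted above.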

To circumvent this problem, we have recently developed a novel model for the spliceosome that consists of only two spliceosomal snRNAs (U6 and U2). Using molecular nano-engineering, we have been able to show that these two RNAs, without the help of any of the other 200 spliceosomal components, can perform the splicing reaction (Mohammadi, Geisler and Valadkhan, soon to be published). This discovery has put us in a unique position to address a number of fundamental questions on how the spliceosome works, and why it makes disease-causing mistakes.

a) Mechanism of determination of splice sites

Almost all splicing-associated human diseases are caused by mistakes in splice site determination. These mistakes, in turn, result in frameshift, missense or nonsense mutations (if the spliced RNA is an mRNA) or altered function (if the spliced RNA is a non-protein-coding RNA), leading to disease states. We are currently using a battery of mutagenesis and sequencing techniques to work out how splice sites are selected in our minimal spliceosome model. We will next test the hypotheses derived from the minimal spliceosome model in vivo, in the authentic human spliceosome, using RNAs from disease genes when appropriate. These experiments will provide fundamental new insights into the way splice sites are chosen, and the way splicing-related human diseases arise.
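The link between a misplaced splice site and a frameshift can be illustrated with a toy example. The sequences, coordinates and function names below are hypothetical and chosen only for illustration; they are not taken from any real gene or from the minimal spliceosome model.

```python
# Toy pre-mRNA: exon 1 + intron + exon 2 (hypothetical sequences).
EXON1 = "AUGGCC"
INTRON = "GUAAGUUUUCAG"   # begins with GU and ends with AG, like most introns
EXON2 = "AAAUGGUGA"
PRE_MRNA = EXON1 + INTRON + EXON2


def splice(pre_mrna: str, donor: int, acceptor: int) -> str:
    """Join the sequence before `donor` to the sequence from `acceptor` on."""
    return pre_mrna[:donor] + pre_mrna[acceptor:]


def codons(seq: str) -> list[str]:
    """Split a sequence into complete triplets, dropping any trailing bases."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


donor = len(EXON1)                    # first base of the intron
acceptor = len(EXON1) + len(INTRON)   # first base of exon 2

correct = splice(PRE_MRNA, donor, acceptor)
# Mis-splicing: the acceptor site is chosen one base too far downstream.
shifted = splice(PRE_MRNA, donor, acceptor + 1)

print(codons(correct))   # reading frame of the correctly spliced RNA
print(codons(shifted))   # every codon after the junction is altered
```

Because the shift is not a multiple of three bases, every codon downstream of the junction changes. In this toy case the original stop codon is also lost from the reading frame.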

The minimal splicing model. Human U6 and U2 snRNAs, in the absence of all other spliceosomal factors, can catalyze the splicing reaction. Further characterization of this system will allow us to use it as a tool to understand the highly complex human spliceosome.




b) Defining the three-dimensional structure, critical functional elements, and evolutionary history of snRNAs

We have also performed a number of crosslinking and mutagenesis studies that have allowed us to develop a secondary, and soon a tertiary, structure model for the minimal spliceosome. Atomic substitution studies will allow us to develop a working model for how the complex of U6 and U2 catalyzes the splicing reaction, opening the door to a detailed enzymological analysis. In addition, we are interested in learning how the U6 and U2 snRNAs have evolved from yeast, the simplest organism that performs splicing, to human.

c) Understanding the function of spliceosomal proteins, especially Prp8, the largest conserved nuclear protein

One of the spliceosomal proteins, Prp8, is intimately associated with the U6 and U2 snRNAs in activated spliceosomes in vivo. Interestingly, mutations in this protein lead to a hereditary form of blindness, retinitis pigmentosa. We are interested in defining the role of this protein in splicing, and in determining why its mutations cause an eye-specific problem even though the protein is essential in all tissues. Using the minimal spliceosome model and in vivo studies, we are currently characterizing its interactions with the snRNAs and its specific role in the human retina.



2) The large non-protein-coding RNAs:

Digital regulators of the human cellular function

The exciting discovery that most of our genome codes for non-protein-coding RNA molecules suggests that a significant percentage of our cellular function is performed by these RNAs. While structural RNAs, such as the ribosomal RNAs, snRNAs, snoRNAs, etc., have been known for a long time, the recent discovery of regulatory small RNAs (miRNAs, siRNAs, etc.) has shown that RNA molecules have enormous potential as regulators. The most recently discovered, and least understood, category of non-protein-coding RNAs, the large non-coding RNAs, are thought to play highly critical regulatory roles in the cell. What is intriguing is that these RNAs might constitute a totally "hidden" layer of regulatory pathways in the cell. Since the binding affinity of RNA can be easily modulated and fine-tuned by altering its base-pairing ability, RNA molecules have been dubbed the "digital controllers" of the cell. Despite the very small volume of data available on these large RNAs, they have been implicated in a number of developmental processes and diseases, including malignancies. We are currently studying one such RNA, which seems to be involved in determining cellular fate during differentiation. Interestingly, this large RNA is highly expressed in the brain, especially in neurons, and it is possible that it is involved in fine-tuning highly complex neuronal function. We aim to define the role of this RNA in neurons, and in other tissues, using a variety of in vivo techniques, computational techniques, and in vitro biochemistry.

a) Functional Characterization of the large non-protein-coding RNA in vivo and in vitro

There is very little mechanistic data on the function of the large regulatory RNAs. As the first step in the functional analysis of our large RNA, we are currently defining its localization in neuronal and non-neuronal cells. We will also attempt to determine whether it is highly enriched in a particular area of the brain. We are also determining the factors that interact with our large non-protein-coding RNA in vivo in neuronal and non-neuronal cells, and we will confirm each observed interaction both in vitro and in vivo. Further, using shRNA-mediated techniques, we are analyzing how the cellular function of neuronal and non-neuronal cells is affected by the absence of this RNA, using a variety of biochemical, cell biology, and microarray analysis techniques. Taken together, these analyses promise to provide us with a mechanistic framework for the function of this RNA.

b) Generating a whole animal model for the study of this non-protein-coding RNA

Additionally, we plan to generate a mouse knockout model, which will allow us to define the role of this RNA in the whole organism and determine its role in development. Ultimately, we believe that studying this large RNA will provide us with clues to the function of this class of "ribo-regulators" in mammalian cells. Further, the same strategies can be used to characterize other large non-protein-coding RNAs of interest, which will help broaden our view of the strategies used for RNA-mediated regulation of cellular function.