Wednesday, December 22, 2010

genomic dark matter

Sciencemag | It used to seem so straightforward. DNA told the body how to build proteins. The instructions came in chapters called genes. Strands of DNA's chemical cousin RNA served as molecular messengers, carrying orders to the cells' protein factories and translating them into action. Between the genes lay long stretches of “junk DNA,” incoherent, useless, and inert.

That was then. In fact, gene regulation has turned out to be a surprisingly complex process governed by various types of regulatory DNA, which may lie deep in the wilderness of supposed “junk.” Far from being humble messengers, RNAs of all shapes and sizes are actually powerful players in how genomes operate. Finally, there's been increasing recognition of the widespread role of chemical alterations called epigenetic factors that can influence the genome across generations without changing the DNA sequence itself.

The scope of this “dark genome” became apparent in 2001, when the human genome sequence was first published. Scientists expected to find as many as 100,000 genes packed into the 3 billion bases of human DNA; they were startled to learn that there were fewer than 35,000. (The current count is about 21,000.) Protein-coding regions accounted for just 1.5% of the genome. Could the rest of our DNA really just be junk?

The deciphering of the mouse genome in 2002 showed that there must be more to the story. Mice and people turned out to share not only many genes but also vast stretches of noncoding DNA. To have been “conserved” throughout the 75 million years since the mouse and human lineages diverged, those regions were likely to be crucial to the organisms' survival.