In 2002, a group of plant researchers studying legumes at the Max Planck Institute for Plant Breeding Research in Cologne, Germany, discovered that a 679-nucleotide RNA believed to function in a noncoding capacity was in fact a protein-coding messenger RNA (mRNA).1 It had been classified as a long (or large) noncoding RNA (lncRNA) by virtue of being more than 200 nucleotides in length. The RNA, transcribed from a gene called early nodulin 40 (ENOD40), contained short open reading frames (ORFs), putative protein-coding sequences bookended by start and stop codons, but the ORFs were so short that they had previously been overlooked. When the Cologne collaborators examined the RNA more closely, however, they found that two of the ORFs did indeed encode tiny peptides: one of 12 and one of 24 amino acids. Sampling the legumes confirmed that these micropeptides were made in the plant, where they interacted with a sucrose-synthesizing enzyme.
Five years later, another ORF-containing mRNA that had been posing as a lncRNA was discovered in Drosophila.2,3 After performing a screen of fly embryos to find lncRNAs, Yuji Kageyama, then of the National Institute for Basic Biology in Okazaki, Japan, suppressed each transcript’s expression. “Only one showed a clear phenotype,” says Kageyama, now at Kobe University. Because embryos missing this particular RNA lacked certain cuticle features, giving them the appearance of smooth rice grains, the researchers named the RNA “polished rice” (pri).
Turning his attention to how the RNA functioned, Kageyama thought he should first rule out the possibility that it encoded proteins. But he couldn’t. “We actually found it was a protein-coding gene,” he says. “It was an accident—we are RNA people!” The pri gene turned out to encode four tiny peptides—three of 11 amino acids and one of 32—that Kageyama and colleagues showed are important for activating a key developmental transcription factor.4
Since then, a handful of other lncRNAs have switched to the mRNA ranks after being found to harbor micropeptide-encoding short ORFs (sORFs)—those less than 300 nucleotides in length. And given the vast number of documented lncRNAs—most of which have no known function—the chance of finding others that contain micropeptide codes seems high.
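To make the sORF definition concrete, here is a minimal Python sketch of how a transcript might be scanned for candidate short ORFs. It is an illustration under stated assumptions, not the method any of the groups above used: the function name `find_sorfs` is hypothetical, only the forward strand is read, an ORF is taken to run from the first ATG to the next in-frame stop codon, and the 300-nucleotide cutoff is assumed to count the stop codon.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_sorfs(seq, max_nt=300):
    """Return (start, end, n_amino_acids) for each forward-strand sORF.

    Assumptions (not from the article): an ORF starts at ATG, ends at the
    first in-frame stop codon, and its length includes that stop codon.
    """
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA letters
    hits = []
    for frame in range(3):
        i = frame
        while i + 3 <= len(seq):
            if seq[i:i + 3] == "ATG":
                # Walk codon by codon until the first in-frame stop.
                for j in range(i + 3, len(seq) - 2, 3):
                    if seq[j:j + 3] in STOP_CODONS:
                        orf_len = j + 3 - i
                        if orf_len < max_nt:
                            # Peptide length excludes the stop codon.
                            hits.append((i, j + 3, orf_len // 3 - 1))
                        i = j  # resume scanning after this ORF
                        break
            i += 3
    return sorted(hits)


if __name__ == "__main__":
    # Toy sequence: one ORF encoding three amino acids (ATG AAA TTT TAG).
    print(find_sorfs("CCATGAAATTTTAGGG"))
```

By this counting, an ORF encoding a 12-amino-acid peptide spans only 39 nucleotides and one encoding 24 amino acids spans 75, which helps explain how such ORFs could be overlooked. A scan like this only flags candidates; ORFs this short turn up by chance, so showing that a peptide is actually made requires experimental evidence.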
The hunt for these tiny treasures is now on, but it’s a challenging quest. After all, there are good reasons why these itty-bitty peptides and their codes went unnoticed for so long.