Assembling a New Model Plant Genome

Anne Sternberger is a PhD student in Plant Biology and a Research Assistant in the Wyatt Lab. Anne’s research interests include both systematic and molecular biology. Her current project is assembling the genome of the Downy yellow violet (Viola pubescens), a wild yellow violet with black striations. Her faculty advisers, Dr. Harvey Ballard, one of the world’s foremost experts on violets, and Dr. Sarah Wyatt, share an interest in this particular violet because of its mixed breeding system, which produces two different types of flowers. One flower (the chasmogamous flower) has evolved to attract pollinators and allows the plant to reproduce through cross-pollination, providing opportunities for genetic diversity. The second flower (the cleistogamous flower) is self-pollinating. The mixed breeding system, together with the plant’s genetic makeup and other phenotypic and physiological attributes, makes the Downy yellow violet an excellent candidate for a model organism. However, the genome of the plant has not been fully sequenced. Only a rough genomic draft exists, with large gaps in the known sequence, and this barrier has limited the usefulness of the plant for researchers despite its important characteristics.

Figure 1: Downy yellow violet flowers, chasmogamous (left) and cleistogamous (right)


Anne’s project goal is to assemble the Downy yellow violet genome to qualify it as a model organism for plant research, and to use the fully assembled genome to investigate the mixed breeding system found in many violets. To this end, she has enlisted the help of collaborator Dr. Kevin Childs, a faculty member at Michigan State University who has developed a bioinformatic pipeline that assembles pieces of genetic code into genomic contigs (contiguous stretches of sequence pieced together from many smaller genetic sequences). These contigs will allow Anne to fill in the gaps in the Downy yellow violet genome.

Figure 2: Assembling the Genome

Anne’s portion of the project requires her to extract RNA from nine major organs of the plant. The nine organs selected are representative of the entire plant system. To obtain the RNA, Anne collected violets from nearby Sell’s Park and then extracted three sets of RNA from each of the nine targeted organs. Anne will then package the samples and send them to her co-researcher, who will run RNA-Seq on them and enter the resulting data into his bioinformatic pipeline. The researchers will assemble the contigs and annotate them to locate the genes that are expressed along the DNA sequence. By annotating the contigs and correlating them with the representative organs of the Downy yellow violet, Anne’s research will identify the locations in the sequence that code for genes.

Once Anne’s RNA transcriptomic information is loaded into the bioinformatic pipeline, she will travel to her co-researcher’s campus to train on the application he developed. Both collaboration and technology have accelerated the sequencing of the genome of a unique plant with a mixed breeding system, and once its genome is complete, the Downy yellow violet will be a first-of-its-kind model organism for plant biologists.