Polymorphism identification and improved genome annotation of Brassica rapa through Deep RNA sequencing.
Bottom Line: An improved Brassica rapa transcriptome was constructed to detect novel transcripts and to improve the current genome annotation. The deep RNA-Seq reassembled Brassica rapa transcriptome identified 44,239 protein-coding genes. Gaps in the current genome assembly of B. rapa are highlighted by our identification of 780 unmapped transcripts.
Affiliation: Department of Plant Biology, University of California, Davis, California 95616.
Mentions: Functional annotation based on BLASTX and gene ontology allowed the classification of re-annotated transcripts into functional groups. The categorization of all BLASTX results indicates that Arabidopsis thaliana, Vitis vinifera, Populus trichocarpa, Oryza sativa, and Ricinus communis were the top five plant species in terms of number of hits to the revised transcriptome. The annotation step of Blast2GO assigned functions to 32,317 (73%) gene models. In total, 155,618 GO terms were obtained for the 44,239 re-annotated gene models. The distribution of the gene models in different GO categories is shown in Figure 4. Blast2GO can also be used to integrate other information, such as metabolic pathways, using KEGG annotation (Kanehisa and Goto 2000). We mapped all the predicted proteins to the reference canonical pathways in KEGG for functional categorization and annotation (see Materials and Methods). Of the 44,264 gene models, 13,203 had one or more KEGG annotations belonging to 321 different KEGG pathways. These GO and KEGG annotations will be helpful to researchers using the updated transcripts, for example by enabling categorization of transcriptional responses according to the types of enriched GO or KEGG terms. The GO annotated table for B. rapa is saved in File S11.
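As an illustration of how such a GO table might be summarized, the following is a minimal Python sketch that counts distinct gene models per GO category and the fraction of gene models carrying at least one annotation. The three-column, tab-delimited layout (`gene_id`, `go_id`, `category`), the function name `summarize_go`, and the example gene identifiers are assumptions made for this sketch; they are not the actual format of File S11 or Blast2GO output.

```python
import csv
import io
from collections import defaultdict


def summarize_go(table_text, total_gene_models):
    """Count distinct gene models per GO category and the annotated fraction.

    Assumes a hypothetical tab-delimited table with a header row and the
    columns gene_id, go_id, and category.
    """
    genes_by_category = defaultdict(set)
    annotated = set()
    reader = csv.DictReader(io.StringIO(table_text), delimiter="\t")
    for row in reader:
        # A set per category means a gene with several terms is counted once.
        genes_by_category[row["category"]].add(row["gene_id"])
        annotated.add(row["gene_id"])
    counts = {cat: len(genes) for cat, genes in genes_by_category.items()}
    fraction = len(annotated) / total_gene_models
    return counts, fraction


# Toy data with made-up gene identifiers; the GO IDs are the three root terms.
EXAMPLE = (
    "gene_id\tgo_id\tcategory\n"
    "geneA\tGO:0008150\tbiological_process\n"
    "geneA\tGO:0003674\tmolecular_function\n"
    "geneB\tGO:0008150\tbiological_process\n"
    "geneB\tGO:0005575\tcellular_component\n"
    "geneB\tGO:0008150\tbiological_process\n"
)

counts, fraction = summarize_go(EXAMPLE, total_gene_models=4)
print(counts)
print(f"annotated fraction: {fraction:.2f}")
```

Counting distinct gene identifiers rather than rows keeps a gene model with many GO terms from inflating a category, which is the kind of per-category distribution a figure of gene models by GO category would report.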