De novo assembly of the pennycress (Thlaspi arvense) transcriptome provides tools for the development of a winter cover crop and biodiesel feedstock.
Bottom Line: A global comparison of homology between the pennycress and Arabidopsis transcriptomes, along with four other Brassicaceae species, revealed a high level of global sequence conservation within the family. Identification of these genes leads to testable hypotheses concerning their conserved function and to rational strategies to improve agronomic properties in pennycress. Future work to characterize isoform variation between diverse pennycress lines and develop a draft genome sequence for pennycress will further direct trait improvement.
Affiliation: Department of Plant Biology, University of Minnesota, 1445 Gortner Avenue, 250 Biological Sciences Center, Saint Paul, MN 55108, USA.
Mentions: The assembly with a word size of 64, 95% match length, and 95% match percentage was chosen for further analysis and annotation because it had high-quality assembly statistics and a higher proportion of assembled transcripts with significant matches to Arabidopsis genes than the other assemblies. A summary of sequencing reads and assembly statistics is shown in Table 1. A total of 33 874 contigs were assembled using these parameters. This count includes a spiked phiX174 genome sequence that served as a sequencing control, which was subsequently removed from the final assembly and total assembly length. The mean contig length was 1242 bp, with minimum and maximum contig lengths of 215 and 15 516 bp, respectively. The size distribution of contig lengths is shown in Figure 1(a). The N50 was 1729 bp, meaning that contigs of this length or longer together account for at least 50% of the total 42 069 800 bp assembly length. This Transcriptome Shotgun Assembly project has been deposited at DDBJ/EMBL/GenBank under the accession GAKE00000000. The version described in this paper is the first version, GAKE01000000. Approximately 1.5% of the contigs were excluded from the archives due to the number of ambiguous nucleotides in those sequences. The complete, annotated FASTA file is available at http://www.cbs.umn.edu/lab/marks/pennycress/transcriptome.
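The N50 statistic quoted above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names `n50` and `summarize` and the toy contig lengths are hypothetical, chosen only to show how the mean, minimum, maximum, and N50 are derived from a list of contig lengths.

```python
from statistics import mean


def n50(lengths):
    """Return the N50: the length L such that contigs of length >= L
    together cover at least half of the total assembly length."""
    total = sum(lengths)
    running = 0
    # Walk contigs from longest to shortest, accumulating their lengths.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("no contig lengths given")


def summarize(lengths):
    """Collect the basic assembly statistics reported for a contig set."""
    return {
        "contigs": len(lengths),
        "total_bp": sum(lengths),
        "mean_bp": mean(lengths),
        "min_bp": min(lengths),
        "max_bp": max(lengths),
        "n50_bp": n50(lengths),
    }


# Hypothetical toy contig lengths (bp), not data from the paper.
toy_lengths = [10, 9, 8, 7, 6, 5, 4, 3, 2]
stats = summarize(toy_lengths)
print(stats)
```

For the toy list the total is 54 bp, and the three longest contigs (10 + 9 + 8 = 27 bp) reach half of it, so the N50 is 8. Applied to the real assembly, the same calculation would take the lengths of all contigs in the final FASTA file as input.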