RNA-seq analysis of Quercus pubescens Leaves: de novo transcriptome assembly, annotation and functional markers development.

Torre S, Tattini M, Brunetti C, Fineschi S, Fini A, Ferrini F, Sebastiani F - PLoS ONE (2014)

Bottom Line: These annotations and local BLAST searches allowed us to identify genes specifically associated with mechanisms of drought avoidance. We completed a successful global analysis of the Q. pubescens leaf transcriptome using RNA-seq. Our tools enable comparative genomics studies on other Quercus species, taking advantage of large intra-specific ecophysiological differences.

Affiliation: Institute for Plant Protection, Department of Biology, Agricultural and Food Sciences, The National Research Council of Italy (CNR), Sesto Fiorentino, Italy.

ABSTRACT
Quercus pubescens Willd., a species distributed from Spain to southwest Asia, ranks high for drought tolerance among European oaks. Q. pubescens plays a role of outstanding significance in most Mediterranean forest ecosystems, but few mechanistic studies have explored its response to environmental constraints, owing to the lack of genomic resources. In our study, we performed deep transcriptome sequencing of Q. pubescens leaves, including de novo assembly, functional annotation and the identification of new molecular markers. Our results are a prerequisite for undertaking molecular functional studies, and may support population and association genetic studies. In total, 254,265,700 clean reads with an average length of 98 bp were generated on the Illumina HiSeq 2000 platform. De novo assembly using CLC Genomics produced 96,006 contigs with a mean length of 618 bp. Sequence similarity analyses against seven public databases (UniProt, NR, RefSeq and KOGs at NCBI, Pfam, InterPro and KEGG) resulted in 83,065 transcripts annotated with gene descriptions, conserved protein domains, or gene ontology terms. These annotations and local BLAST searches allowed us to identify genes specifically associated with mechanisms of drought avoidance. Finally, 14,202 microsatellite markers and 18,425 single nucleotide polymorphisms (SNPs) were discovered in silico in the assembled and annotated sequences. We completed a successful global analysis of the Q. pubescens leaf transcriptome using RNA-seq. The assembled and annotated sequences, together with the newly discovered molecular markers, provide genomic information for functional genomic studies in Q. pubescens, with special emphasis on response mechanisms to the severe constraints of the Mediterranean climate. Our tools enable comparative genomics studies on other Quercus species, taking advantage of large intra-specific ecophysiological differences.


pone-0112487-g004: Catalytic activity distribution in annotated Q. pubescens transcripts.

Mentions: Gene Ontology (GO) terms and enzyme commission (EC) numbers for Q. pubescens transcripts were retrieved using Blast2GO [30]. Gene Ontology is an international classification system that provides a standardized vocabulary for describing the functions of uncharacterized genes. Our results show that downy oak differs substantially from model plants. Indeed, of the 68,285 most significant BLASTX hits against the NR plant species database, just 19,178 (28.1%) retrieved associated GO terms, and only 8,563 (12.5%) were annotated, for a total of 32,844 GO term annotations (Table 2). All the extracted GO terms were summarized into the three main GO categories: 15,701 terms (47.8%) belong to the Biological Process class, 5,393 terms (16.4%) to the Molecular Function class and 11,750 terms (35.8%) to the Cellular Component class. The major sub-categories reported in Figure 3 come from the GO level 2 classification. Two sub-categories, “cell” (GO:0005623) and “organelle” (GO:0043226), fall in the cellular component cluster; the sub-categories “binding” (GO:0005488) and “catalytic activity” (GO:0003824) fall in the molecular function cluster; and five sub-categories, “metabolic process” (GO:0008152), “cellular process” (GO:0009987), “single-organism process” (GO:0044699), “response to stimulus” (GO:0050896) and “biological regulation” (GO:0065007), fall in the biological process cluster. However, these results assigned only a small percentage of downy oak transcripts to GO terms, possibly because of the large number of uninformative gene descriptions among the protein hits. Of the 8,536 sequences annotated with GO terms, 2,805 were assigned to 114 EC numbers. In detail, transferase activity (40.6%), hydrolase activity (26.5%) and oxidoreductase activity (19.6%) were the most represented enzyme classes (Figure 4).
The large number of annotated enzymes within these three groups suggests the presence of genes associated with pathways of secondary metabolite biosynthesis [47], [48], as we detail below for KEGG pathway mapping.
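
The percentages quoted for the three main GO categories and for the BLASTX hits can be re-derived from the counts given in the text. The short Python sketch below is only an illustrative check and not part of the authors' pipeline; the variable names and the `percent` helper are assumptions introduced here, while all counts are taken from the paragraph above.

```python
# Illustrative check of the GO percentages; not the authors' code.
# All counts come from the text; names and the helper are hypothetical.

# GO term annotations per main category, as quoted in the text.
go_term_counts = {
    "Biological Process": 15_701,
    "Molecular Function": 5_393,
    "Cellular Component": 11_750,
}

# The three categories should sum to the reported total of GO term annotations.
total_terms = sum(go_term_counts.values())


def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Share of each main category among all GO term annotations.
shares = {name: percent(n, total_terms) for name, n in go_term_counts.items()}

# Fractions of the most significant BLASTX hits, as quoted in the text.
blastx_hits = 68_285
with_go_terms = percent(19_178, blastx_hits)
annotated = percent(8_563, blastx_hits)

print(total_terms)
print(shares)
print(with_go_terms, annotated)
```

Working from the raw counts rather than from the rounded percentages makes it easy to confirm that the three categories together account for every GO term annotation in the reported total.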

