Establishment and analysis of a reference transcriptome for Spodoptera frugiperda.

Legeai F, Gimenez S, Duvic B, Escoubas JM, Gosselin Grenet AS, Blanc F, Cousserans F, Séninet I, Bretaudeau A, Mutuel D, Girard PA, Monsempes C, Magdelenat G, Hilliou F, Feyereisen R, Ogliastro M, Volkoff AN, Jacquin-Joly E, d'Alençon E, Nègre N, Fournier P - BMC Genomics (2014)

Bottom Line: We conclude that the Sf_TR2012b transcriptome is a valid reference transcriptome. While its reliability decreases for the detection and annotation of genes under strong transcriptional constraint, we still recover a fair percentage of tissue-specific transcripts. Similarly, we observed an interesting interplay of gene families involved in immunity between fat bodies and antennae.

Affiliation: INRA, UMR1333, DGIMI, Montpellier, France. nicolas.negre@univ-montp2.fr.

ABSTRACT

Background: Spodoptera frugiperda (Noctuidae) is a major agricultural pest throughout the American continent. The highly polyphagous larvae frequently devastate important crops such as corn, sorghum, cotton and grass. In addition, the Sf9 cell line, widely used in biochemistry for in vitro protein production, is derived from S. frugiperda tissues. Many research groups use S. frugiperda as a model organism to investigate questions such as plant adaptation, pest behavior or resistance to pesticides.

Results: In this study, we constructed a reference transcriptome assembly (Sf_TR2012b) of RNA sequences obtained from more than 35 S. frugiperda developmental time-points and tissue samples. We assessed the quality of this reference transcriptome by annotating a ubiquitous gene family--ribosomal proteins--as well as gene families that have a more constrained spatio-temporal expression and are involved in development, immunity and olfaction. We also provide a time-course of expression that we used to characterize the transcriptional regulation of the gene families studied.

Conclusion: We conclude that the Sf_TR2012b transcriptome is a valid reference transcriptome. While its reliability decreases for the detection and annotation of genes under strong transcriptional constraint, we still recover a fair percentage of tissue-specific transcripts. This allowed us to explore the spatial and temporal expression of genes and to observe that some olfactory receptors are expressed in antennae and palps but also in unrelated tissues such as fat bodies. Similarly, we observed an interesting interplay of gene families involved in immunity between fat bodies and antennae.

Figure 4: Clustering of expression. A. Heatmap representing the medoids of expression of the 20 clusters of genes for the 10 Illumina RNAseq experiments. B. Barplot representing the number of genes that are present in each cluster. C. Histogram representing the density of genes in each cluster for all genes (green) or RBP genes (red). Orange represents the intersection. It shows whether RBP genes are over- or under-represented in each cluster compared to the total number of genes in each cluster. D. Same as in C, for Hox-domain genes.
© Copyright Policy - open-access

Mentions: We were interested in using the reference transcriptome to study families of genes involved in larval development, chemosensory reception and immune response. These categories of genes are all necessary for the adaptation of the feeding larva to its environment, whether its host plant or its panoply of pathogens. Specifically, we were interested in investigating the repertoire of genes and their level of expression for each of these categories. Thus, we extracted total RNA corresponding to 6 developmental time-points and 4 dissected tissues to generate a quantitative gene expression dataset by Illumina sequencing (see Methods and Table 1). Total RNA was extracted from the selected samples and sequenced to produce 10 M single-end 50 bp reads (see Additional file 7: Figure S2), representing approximately 1.3× coverage of each nucleotide in the reference transcriptome. Levels of expression for each gene in Sf_TR2012b were measured by 4 indices: total coverage, reads per million (RPM), read counts normalized using limma [35] and read counts normalized using DESeq [36]. These four tables of expression are available for download on the homepage of Lepidodb (http://www.inra.fr/lepidodb/downloads/TR2012b). We calculated the correlation coefficient (Pearson r) between pairs of samples (Figure 3C) and noted that the correlation was much stronger between the L2 early and L3 early stages, and between the L2 late and L3 late stages, than between the two L2 stages or between the two L3 stages, indicating that different transcript expression profiles define the beginning and the end of each ecdysis cycle. Interestingly, the L6 late time-point was not correlated with the other larval time-points, probably reflecting the onset of metamorphosis.
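The two quantities used above, RPM and the pairwise Pearson r, can be illustrated with a minimal Python sketch. The read counts below are invented toy values, the sample names are only labels, and the helper names `reads_per_million` and `pearson_r` are hypothetical; this is not the pipeline used for Sf_TR2012b.

```python
import math

def reads_per_million(counts):
    """Scale the raw read counts of one sample so that they sum to one million."""
    total = sum(counts)
    if total == 0:
        raise ValueError("sample has no mapped reads")
    return [c * 1_000_000 / total for c in counts]

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two vectors of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / math.sqrt(var_x * var_y)

# Invented read counts for four transcripts in three samples (toy data).
counts = {
    "L2_early": [120, 30, 0, 850],
    "L3_early": [100, 40, 5, 855],
    "L2_late": [10, 400, 90, 500],
}

rpm = {name: reads_per_million(c) for name, c in counts.items()}

# Compare a between-stage pair with a within-stage pair.
r_early = pearson_r(rpm["L2_early"], rpm["L3_early"])
r_within_l2 = pearson_r(rpm["L2_early"], rpm["L2_late"])
print(f"L2 early vs L3 early: r = {r_early:.3f}")
print(f"L2 early vs L2 late:  r = {r_within_l2:.3f}")
```

In this toy data the two "early" samples correlate more strongly with each other than the two L2 samples do, which is the kind of pattern described for Figure 3C.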
Less variation was observed between the tracheae, antennae and palps, and fat body samples, all coming from late larval stages. To identify the genes specific to each sample, we used a k-means-style clustering method (the clara function in R, which clusters around medoids) to group together genes with similar expression profiles. We empirically chose 20 as the number of clusters, which produced categories with enough differences between them. We immediately noticed that one category (cluster 3, see Figure 4A-B) comprised most of the transcripts (31,622). This category had a mean expression level close to 0 RPM. It probably represents most of the rare transcripts coming from the large number of samples that we dissected and included in our 454 sequencing assembly, in addition to the set of ESTs previously described. Cluster 1, similarly, had a large number (17,529) of transcripts with low expression and no noticeable specific expression profile. However, most of the other clusters identified groups of genes whose expression was specific to one or two samples (e.g., clusters 2 and 19 in Figure 4A). To test whether those clusters made sense, we checked to which clusters rbp and Hox proteins belonged. Compared to the overall distribution of transcripts in the different clusters, rbp proteins are enriched in clusters 2, 5, 6, 7, 8, 9, 13 and 19 (Figure 4C). Except for clusters 2 and 8, all of these clusters show a median level of expression that is high in all samples, even if this level varies from sample to sample. Hox transcripts, however, are enriched in clusters 2 and 6 (Figure 4D). Cluster 2 in particular corresponds to genes that have a higher level of expression during embryonic stages than in any of the other samples, which makes sense given that Hox-domain genes are often transcription factors involved in embryogenesis.
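The clustering and the per-cluster comparison of a gene family against all genes can be sketched as follows. This is a simplified, CLARA-style illustration in Python, not the R clara function itself: it draws subsamples, runs a basic alternating k-medoids on each (a simpler stand-in for the PAM swap search), and keeps the medoids with the lowest total distance over the full dataset. The Manhattan distance, the function names, the expression profiles and the family membership are all assumptions made for the example.

```python
import random
from collections import Counter

def manhattan(a, b):
    """Manhattan distance between two expression profiles."""
    return sum(abs(x - y) for x, y in zip(a, b))

def nearest(profile, medoids):
    """Index of the medoid closest to a profile."""
    return min(range(len(medoids)), key=lambda i: manhattan(profile, medoids[i]))

def k_medoids(points, k, rng, max_iter=50):
    """Alternating k-medoids; assumes the points are distinct and k <= len(points)."""
    medoids = rng.sample(points, k)
    for _ in range(max_iter):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, medoids)].append(p)
        # New medoid = group member with the smallest summed distance to its group.
        updated = [
            min(g, key=lambda c: sum(manhattan(c, q) for q in g)) if g else m
            for g, m in zip(groups, medoids)
        ]
        if updated == medoids:
            break
        medoids = updated
    return medoids

def clara_like(points, k, n_draws=5, sample_size=10, seed=0):
    """Cluster several subsamples and keep the medoids that best fit all points."""
    rng = random.Random(seed)
    best_cost, best_medoids = None, None
    for _ in range(n_draws):
        sample = rng.sample(points, min(sample_size, len(points)))
        medoids = k_medoids(sample, k, rng)
        cost = sum(manhattan(p, medoids[nearest(p, medoids)]) for p in points)
        if best_cost is None or cost < best_cost:
            best_cost, best_medoids = cost, medoids
    labels = [nearest(p, best_medoids) for p in points]
    return best_medoids, labels

def cluster_density(labels, k):
    """Fraction of the given genes that falls in each cluster."""
    counts = Counter(labels)
    return [counts[i] / len(labels) for i in range(k)]

# Invented RPM-like profiles over three hypothetical samples (toy data).
profiles = [
    (90, 5, 4), (85, 8, 6), (95, 3, 2), (88, 6, 5),   # high in sample 1
    (4, 6, 92), (3, 9, 87), (6, 5, 95), (5, 7, 90),   # high in sample 3
    (0, 1, 0), (1, 0, 1), (0, 0, 1), (1, 1, 0),       # close to zero everywhere
]
family_idx = [0, 1, 2, 8]  # hypothetical members of one gene family

medoids, labels = clara_like(profiles, k=3)
all_density = cluster_density(labels, 3)
family_density = cluster_density([labels[i] for i in family_idx], 3)

for i, (fam, ref) in enumerate(zip(family_density, all_density)):
    ratio = fam / ref if ref > 0 else float("nan")
    print(f"cluster {i}: medoid={medoids[i]} all={ref:.2f} family={fam:.2f} ratio={ratio:.2f}")
```

A ratio above 1 means the family is over-represented in that cluster relative to all genes, which is the comparison shown as densities in Figure 4C-D. Cluster numbering depends on the random draws, so the indices here do not correspond to the cluster numbers in the figure.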

