Large scale full-length cDNA sequencing reveals a unique genomic landscape in a lepidopteran model insect, Bombyx mori.

Suetsugu Y, Futahashi R, Kanamori H, Kadono-Okuda K, Sasanuma S, Narukawa J, Ajimura M, Jouraku A, Namiki N, Shimomura M, Sezutsu H, Osanai-Futahashi M, Suzuki MG, Daimon T, Shinoda T, Taniai K, Asaoka K, Niwa R, Kawaoka S, Katsuma S, Tamura T, Noda H, Kasahara M, Sugano S, Suzuki Y, Fujiwara H, Kataoka H, Arunkumar KP, Tomar A, Nagaraju J, Goldsmith MR, Feng Q, Xia Q, Yamamoto K, Shimada T, Mita K - G3 (Bethesda) (2013)

Bottom Line: The establishment of a complete genomic sequence of silkworm, the model species of Lepidoptera, laid a foundation for its functional genomics. More than 40% of genes expressed in specific tissues mapped in tissue-specific chromosomal clusters. The newly obtained FL-cDNA sequences enabled us to annotate the genome of this lepidopteran model insect more accurately, enhancing genomic and functional studies of Lepidoptera and comparative analyses with other insect orders, and yielding new insights into the evolution and organization of lepidopteran-specific genes.


Affiliation: National Institute of Agrobiological Sciences, Tsukuba 305-8634, Japan.

ABSTRACT
The establishment of a complete genomic sequence of silkworm, the model species of Lepidoptera, laid a foundation for its functional genomics. A more complete annotation of the genome will benefit functional and comparative studies and accelerate extensive industrial applications for this insect. To realize these goals, we embarked upon a large-scale full-length cDNA collection from 21 full-length cDNA libraries derived from 14 tissues of the domesticated silkworm and performed full sequencing by primer walking for 11,104 full-length cDNAs. The large average intron size was 1904 bp, resulting from a high accumulation of transposons. Using gene models predicted by GLEAN and published mRNAs, we identified 16,823 gene loci on the silkworm genome assembly. Orthology analysis of 153 species, including 11 insects, revealed that among three Lepidoptera including Monarch and Heliconius butterflies, the 403 largest silkworm-specific genes were composed mainly of protective immunity, hormone-related, and characteristic structural proteins. Analysis of testis-/ovary-specific genes revealed distinctive features of sexual dimorphism, including depletion of ovary-specific genes on the Z chromosome in contrast to an enrichment of testis-specific genes. More than 40% of genes expressed in specific tissues mapped in tissue-specific chromosomal clusters. The newly obtained FL-cDNA sequences enabled us to annotate the genome of this lepidopteran model insect more accurately, enhancing genomic and functional studies of Lepidoptera and comparative analyses with other insect orders, and yielding new insights into the evolution and organization of lepidopteran-specific genes.


Figure 2. Relationships between genome size and average intron/exon lengths for various model species. To calculate the average intron and exon lengths for 11 species other than B. mori (D. melanogaster, C. elegans, M. musculus, H. sapiens, D. plexippus, H. melpomene, G. gallus, A. aegypti, Acyrthosiphon pisum, Strongylocentrotus purpuratus, and Danio rerio), the GTF-formatted gene annotation files were downloaded from the Ensembl FTP site (ftp://ftp.ensembl.org) and processed with a custom Perl script. Triangles denote the TE content.
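
The custom Perl script mentioned in the caption is not reproduced here. As a rough illustration of the kind of calculation involved, the following Python sketch derives mean exon and intron lengths from the `exon` records of a GTF file. It is written under stated assumptions, not as the authors' method: the function name `mean_exon_intron_lengths` is hypothetical, exons are grouped by `transcript_id`, introns are taken as the gaps between consecutive exons of one transcript, and the means are taken over all exons and all introns rather than per gene.

```python
from collections import defaultdict


def mean_exon_intron_lengths(gtf_lines):
    """Return (mean exon length, mean intron length) in bp.

    Assumption: means are computed over every exon and every intron of
    every transcript; the original analysis may have averaged differently.
    """
    exons_by_transcript = defaultdict(list)

    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue  # only exon features are needed
        start, end = int(fields[3]), int(fields[4])  # 1-based, inclusive

        # Pull transcript_id out of the attribute column (column 9).
        transcript_id = None
        for attr in fields[8].split(";"):
            parts = attr.strip().split(None, 1)
            if len(parts) == 2 and parts[0] == "transcript_id":
                transcript_id = parts[1].strip('"')
                break
        if transcript_id is None:
            continue
        exons_by_transcript[(fields[0], transcript_id)].append((start, end))

    exon_lengths, intron_lengths = [], []
    for exons in exons_by_transcript.values():
        exons.sort()  # order by start coordinate regardless of strand
        exon_lengths.extend(end - start + 1 for start, end in exons)
        # An intron is the gap between two consecutive exons.
        intron_lengths.extend(
            next_start - prev_end - 1
            for (_, prev_end), (next_start, _) in zip(exons, exons[1:])
        )

    def mean(values):
        return sum(values) / len(values) if values else 0.0

    return mean(exon_lengths), mean(intron_lengths)
```

For a real annotation file, the function can be given an open file handle, for example `mean_exon_intron_lengths(open("annotation.gtf"))`, where the file name is a placeholder.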

Comparing the 10,666 mapped FL-cDNA gene structures with previously reported silkworm gene prediction models (The International Silkworm Genome Consortium 2008), we found that 7504 FL-cDNAs matched models, whereas 3162 (30%) showed no match. Among the matched FL-cDNAs, 1666 provided complete matches; however, 5059 structures, comprising approximately three-fourths of the predicted genes, were misannotated. To build a comprehensive silkworm gene set (Table S1), we updated previous annotations with the available FL-cDNA data in place of predicted gene models, and retained the previous predicted gene models where no transcriptome data were available. Pair-wise comparison of transcripts mapped to a given locus indicated that 2072 FL-cDNAs appeared to be derived from alternative splicing. The mean exon number per gene was 4.8, and the mean exon and intron sizes were 353 bp and 1904 bp, respectively (Table 3). Comparison of these values with the genome sizes of 11 other model species showed a good correlation of intron size with genome size (Figure 2; R = 0.942; P < 0.001), indicating that the large introns of silkworm may have contributed to its relatively large genome size. We also compared genome size with the fraction of transposable elements (TEs) in each genome (Figure 2), which gave a rough correlation between genome size and TE content (R = 0.558; P < 0.059). The deviation in this value was considerably larger than the ratio of the mean intron size:genome size, which suggests that the large introns in the silkworm genome may have arisen in part from a high accumulation of repetitive sequences, mainly composed of transposons (Osanai-Futahashi et al. 2008). In contrast to intron size, the average exon size had almost no correlation with genome size (R = −0.487; P = 0.109), indicating that there is very little variation in average exon length among the species examined.
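
The R values quoted above are correlation coefficients across species. The text does not state which test was used, so the sketch below assumes a plain Pearson correlation and is only an illustration: `pearson_r` and `t_statistic` are hypothetical helper names, and the numbers in the usage note are made-up placeholders, not the values behind Figure 2.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    if abs(r) >= 1.0:
        return math.copysign(math.inf, r)
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Placeholder values only (NOT data from the paper).
genome_size = [1.0, 2.0, 3.0, 4.0]
mean_intron = [1.0, 3.0, 2.0, 4.0]
r = pearson_r(genome_size, mean_intron)
t = t_statistic(r, len(genome_size))
```

With the 12 species in Figure 2 (B. mori plus the 11 others), such a test would have n − 2 = 10 degrees of freedom. The P value then follows from the t distribution, which the Python standard library does not provide; `scipy.stats.pearsonr` is one option that returns the coefficient and the P value together.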

