The venom-gland transcriptome of the eastern diamondback rattlesnake (Crotalus adamanteus).

Rokyta DR, Lemmon AR, Margres MJ, Aronow K - BMC Genomics (2012)

Bottom Line: The most diverse toxin classes were the C-type lectins (21 clusters), the snake-venom metalloproteinases (16 clusters), and the serine proteinases (14 clusters). The high-abundance nontoxin transcripts were predominantly those involved in protein folding and translation, consistent with the protein-secretory function of the tissue. We have more than doubled the number of sequenced toxins for this species and created extensive genomic resources for snakes based entirely on de novo assembly of Illumina sequence data.

Affiliation: Department of Biological Science, Florida State University, Tallahassee, FL 32306-4295, USA. drokyta@bio.fsu.edu

ABSTRACT

Background: Snake venoms have significant impacts on human populations through the morbidity and mortality associated with snakebites and as sources of drugs, drug leads, and physiological research tools. Genes expressed by venom-gland tissue, including those encoding toxic proteins, have therefore been sequenced but only with relatively sparse coverage resulting from the low-throughput sequencing approaches available. High-throughput approaches based on 454 pyrosequencing have recently been applied to the study of snake venoms to give the most complete characterizations to date of the genes expressed in active venom glands, but such approaches are costly and still provide a far-from-complete characterization of the genes expressed during venom production.

Results: We describe the de novo assembly and analysis of the venom-gland transcriptome of an eastern diamondback rattlesnake (Crotalus adamanteus) based on 95,643,958 pairs of quality-filtered, 100-base-pair Illumina reads. We identified 123 unique, full-length toxin-coding sequences, which cluster into 78 groups with less than 1% nucleotide divergence, and 2,879 unique, full-length nontoxin coding sequences. The toxin sequences accounted for 35.4% of the total reads, and the nontoxin sequences for an additional 27.5%. The most highly expressed toxin was a small myotoxin related to crotamine, which accounted for 5.9% of the total reads. Snake-venom metalloproteinases accounted for the highest percentage of reads mapping to a toxin class (24.4%), followed by C-type lectins (22.2%) and serine proteinases (20.0%). The most diverse toxin classes were the C-type lectins (21 clusters), the snake-venom metalloproteinases (16 clusters), and the serine proteinases (14 clusters). The high-abundance nontoxin transcripts were predominantly those involved in protein folding and translation, consistent with the protein-secretory function of the tissue.
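The grouping of 123 toxin sequences into 78 clusters at less than 1% nucleotide divergence can be illustrated with a minimal sketch. The abstract does not state the clustering procedure, so the code below assumes pre-aligned, equal-length sequences, an uncorrected p-distance, and single-linkage grouping. The function names and toy sequences are hypothetical and are not taken from the paper.

```python
from itertools import combinations
from typing import Dict, List, Set


def p_distance(a: str, b: str) -> float:
    """Fraction of differing sites between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def cluster_by_divergence(seqs: Dict[str, str], threshold: float = 0.01) -> List[Set[str]]:
    """Single-linkage clusters: link any pair whose divergence is below threshold."""
    parent = {name: name for name in seqs}

    def find(x: str) -> str:
        # Follow parent pointers to the cluster root, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(seqs, 2):
        if p_distance(seqs[a], seqs[b]) < threshold:
            parent[find(a)] = find(b)

    groups: Dict[str, Set[str]] = {}
    for name in seqs:
        groups.setdefault(find(name), set()).add(name)
    return list(groups.values())


# Hypothetical toy sequences (200 sites each), for illustration only.
toy = {
    "tox_a": "ACGT" * 50,
    "tox_b": "ACGT" * 49 + "ACGA",  # 1 difference in 200 sites = 0.5% divergence
    "tox_c": "TTTT" * 50,           # highly divergent from the other two
}
clusters = cluster_by_divergence(toy)
print(len(clusters))
```

With these toy inputs, tox_a and tox_b fall into one cluster and tox_c forms its own. Single linkage is only one possible choice; a stricter rule (for example, requiring every pair within a cluster to be under the threshold) could yield a different number of groups.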

Conclusions: We have provided the most complete characterization of the genes expressed in an active snake venom gland to date, producing insights into snakebite pathology and guidance for snakebite treatment for the largest rattlesnake species and arguably the most dangerous snake native to the United States of America, C. adamanteus. We have more than doubled the number of sequenced toxins for this species and created extensive genomic resources for snakes based entirely on de novo assembly of Illumina sequence data.

Figure 4: Comparison of gene ontology (GO) results for our annotated full-length nontoxin sequences with those of the contigs from a de novo assembly with NGen. Only level 2 GO terms are shown. The distributions of GO terms are similar across data sets, suggesting that the annotated transcripts provided a comprehensive characterization of the genes expressed in the venom gland. (A) The distributions of sequences reaching various stages of identification and annotation are shown. The level 2 GO terms are shown for molecular function (B), biological process (C), and cellular component (D).

Mentions: We characterized the nontoxin genes expressed in the C. adamanteus venom gland by two means. First, we took all of the contigs from one of our four de novo NGen assemblies based on 20 million merged reads and conducted a full Blast2Go [73] analysis on the contigs comprising ≥ 100 reads. Of the 12,746 contigs (assembly 2 in Table 2), we were able to provide gene ontology (GO) annotations for 9,040 of them (Figure 4A). The major functional classes (level 2) represented in these results were binding and catalysis, followed by transcription regulation (Figure 4B). The major biological process GO terms (level 2) were cellular processes and metabolic processes (Figure 4C). Interestingly, viral reproductive function was detected and probably represents the activity of transposable elements or retroviruses like those previously noted in snake venom-gland transcriptomes [34]. The major cellular component GO terms (level 2) were cell and organelle (Figure 4D). For these results, we made no attempt to exclude toxin sequences, because they are necessarily a small minority of the total sequences, and did not require that contigs contain full-length coding sequences.
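As a small worked example of the figures above, the sketch below applies a read-count filter of the kind described (contigs comprising ≥ 100 reads) and computes the share of contigs that received GO annotations from the reported counts (9,040 of 12,746). The function names and toy contig names are hypothetical, and this is not the Blast2Go workflow itself, only the arithmetic around it.

```python
from typing import Dict

MIN_READS = 100  # threshold stated in the text: contigs comprising >= 100 reads


def filter_contigs(read_counts: Dict[str, int], min_reads: int = MIN_READS) -> Dict[str, int]:
    """Keep only contigs supported by at least min_reads reads."""
    return {name: n for name, n in read_counts.items() if n >= min_reads}


def annotated_percent(n_annotated: int, n_total: int) -> float:
    """Percentage of contigs that received at least one GO annotation."""
    return 100.0 * n_annotated / n_total


# Hypothetical toy input, only to show the filter.
toy_counts = {"contig_1": 2500, "contig_2": 99, "contig_3": 100}
kept = filter_contigs(toy_counts)

# Counts reported in the text: 9,040 of 12,746 contigs annotated.
pct = annotated_percent(9040, 12746)
print(sorted(kept), f"{pct:.1f}%")
```

On the reported counts, roughly 70.9% of the contigs received GO annotations, leaving about 29% without an assigned term.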

