The venom-gland transcriptome of the eastern diamondback rattlesnake (Crotalus adamanteus).

Rokyta DR, Lemmon AR, Margres MJ, Aronow K - BMC Genomics (2012)

Bottom Line: The most diverse toxin classes were the C-type lectins (21 clusters), the snake-venom metalloproteinases (16 clusters), and the serine proteinases (14 clusters). The high-abundance nontoxin transcripts were predominantly those involved in protein folding and translation, consistent with the protein-secretory function of the tissue. We have more than doubled the number of sequenced toxins for this species and created extensive genomic resources for snakes based entirely on de novo assembly of Illumina sequence data.


Affiliation: Department of Biological Science, Florida State University, Tallahassee, FL 32306-4295, USA. drokyta@bio.fsu.edu

ABSTRACT

Background: Snake venoms have significant impacts on human populations through the morbidity and mortality associated with snakebites and as sources of drugs, drug leads, and physiological research tools. Genes expressed by venom-gland tissue, including those encoding toxic proteins, have therefore been sequenced but only with relatively sparse coverage resulting from the low-throughput sequencing approaches available. High-throughput approaches based on 454 pyrosequencing have recently been applied to the study of snake venoms to give the most complete characterizations to date of the genes expressed in active venom glands, but such approaches are costly and still provide a far-from-complete characterization of the genes expressed during venom production.

Results: We describe the de novo assembly and analysis of the venom-gland transcriptome of an eastern diamondback rattlesnake (Crotalus adamanteus) based on 95,643,958 pairs of quality-filtered, 100-base-pair Illumina reads. We identified 123 unique, full-length toxin-coding sequences, which cluster into 78 groups with less than 1% nucleotide divergence, and 2,879 unique, full-length nontoxin coding sequences. The toxin sequences accounted for 35.4% of the total reads, and the nontoxin sequences for an additional 27.5%. The most highly expressed toxin was a small myotoxin related to crotamine, which accounted for 5.9% of the total reads. Snake-venom metalloproteinases accounted for the highest percentage of reads mapping to a toxin class (24.4%), followed by C-type lectins (22.2%) and serine proteinases (20.0%). The most diverse toxin classes were the C-type lectins (21 clusters), the snake-venom metalloproteinases (16 clusters), and the serine proteinases (14 clusters). The high-abundance nontoxin transcripts were predominantly those involved in protein folding and translation, consistent with the protein-secretory function of the tissue.

Conclusions: We have provided the most complete characterization of the genes expressed in an active snake venom gland to date, producing insights into snakebite pathology and guidance for snakebite treatment for the largest rattlesnake species and arguably the most dangerous snake native to the United States of America, C. adamanteus. We have more than doubled the number of sequenced toxins for this species and created extensive genomic resources for snakes based entirely on de novo assembly of Illumina sequence data.
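The Results above report that 123 toxin-coding sequences collapse into 78 groups at less than 1% nucleotide divergence. The abstract does not say how that grouping was computed, so the following is only a minimal sketch of one way such a grouping could be done. It assumes the sequences are already aligned to equal length, measures divergence as the fraction of differing sites, and merges sequences by single linkage. The function names and the toy sequences are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Fraction of differing sites between two equal-length aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def cluster_by_divergence(seqs: dict[str, str], threshold: float = 0.01) -> list[set[str]]:
    """Single-linkage clustering: join any pair whose divergence is below threshold.

    Assumption: single linkage and a plain p-distance are illustrative choices,
    not the method confirmed by the source.
    """
    parent = {name: name for name in seqs}

    def find(x: str) -> str:
        # Follow parent links to the cluster root, compressing the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(seqs, 2):
        if p_distance(seqs[a], seqs[b]) < threshold:
            parent[find(a)] = find(b)

    clusters: dict[str, set[str]] = {}
    for name in seqs:
        clusters.setdefault(find(name), set()).add(name)
    return list(clusters.values())


# Hypothetical toy data: tox_b differs from tox_a at 1 of 200 sites (0.5%),
# while tox_c differs from both at well over 1% of sites.
toy = {
    "tox_a": "ACGT" * 50,
    "tox_b": "TCGT" + "ACGT" * 49,
    "tox_c": "T" * 20 + ("ACGT" * 50)[20:],
}
groups = cluster_by_divergence(toy)
```

With the toy input, tox_a and tox_b fall into one group and tox_c stands alone. Real coding sequences would need a proper alignment step first, and a different linkage rule could yield a different number of clusters.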


Figure 7: The cellular-components GO terms identified for the 2,879 annotated full-length nontoxin sequences. Terms specific for the production, processing, and export of proteins are highlighted in black. The inset shows the low-abundance portion of the full distribution.

Mentions: For our second approach, we used only the 2,879 transcripts with full-length coding sequences for nontoxin proteins. We analyzed these sequences with Blast2GO. The distributions of level 2 GO terms for these data were almost identical to those of the full NGen assembly described above (Figure 4), suggesting that our 2,879 annotated nontoxin sequences provide a representative sample of the full venom-gland transcriptome. The full distributions of GO terms for these sequences across all levels are shown in Figures 5, 6, and 7. As expected for a secretory tissue, processes related to protein production and secretion were well represented (e.g., protein transport and protein modification; Figure 5), as were protein-binding functions (Figure 6) and proteins localized to the endoplasmic reticulum (ER) and the Golgi apparatus (Figure 7).
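The passage above describes the level 2 GO term distributions of the two data sets as almost identical, without giving a numerical measure. As a hedged illustration only, one simple way to quantify such a comparison is to convert each list of GO term assignments to proportions and compute the total variation distance between them (0 for identical distributions, 1 for completely disjoint ones). The function names and the example term lists below are hypothetical; this is not the analysis performed with Blast2GO in the source.

```python
from collections import Counter


def proportions(terms: list[str]) -> dict[str, float]:
    """Convert a list of GO term assignments into relative frequencies."""
    counts = Counter(terms)
    total = sum(counts.values())
    return {term: n / total for term, n in counts.items()}


def total_variation(p: dict[str, float], q: dict[str, float]) -> float:
    """Total variation distance between two discrete distributions."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


# Hypothetical term assignments for a full assembly and an annotated subset.
full_assembly = ["cell", "cell", "organelle", "membrane"]
annotated_subset = ["cell", "cell", "organelle", "organelle"]

distance = total_variation(proportions(full_assembly), proportions(annotated_subset))
```

A value near 0 would support the claim that the annotated subset is representative of the full assembly. The choice of distance is an assumption; other measures of distributional similarity would serve the same purpose.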

