Transcriptome sequencing of black grouse (Tetrao tetrix) for immune gene discovery and microsatellite development.

Wang B, Ekblom R, Castoe TA, Jones EP, Kozma R, Bongcam-Rudloff E, Pollock DD, Höglund J - Open Biol (2012)

Bottom Line: A specific BLAST search with an emphasis on immune genes found 308 homologous chicken genes that have immune function, including ten major histocompatibility complex-related genes located on chicken chromosome 16. A preliminary test of the polymorphism of the microsatellites found 10 polymorphic microsatellites of the 102 tested. Genomic resources generated in this study should greatly benefit future ecological, evolutionary and conservation genetic studies on this species.

Affiliation: Population Biology and Conservation Biology, Department of Ecology and Genetics, Evolutionary Biology Centre, Uppsala University, Norbyvägen 18 D, 75236 Uppsala, Sweden. biao.wang@ebc.uu.se

ABSTRACT
The black grouse (Tetrao tetrix) is a galliform bird species that is important for both ecological studies and conservation genetics. Here, we report the sequencing of the spleen transcriptome of black grouse using 454 GS FLX Titanium sequencing. We performed a large-scale gene discovery analysis with a focus on genes that might be related to fitness in this species and also identified a large set of microsatellites. In total, we obtained 182 179 quality-filtered sequencing reads that we assembled into 9035 contigs. Using these contigs and 15 794 length-filtered (greater than 200 bp) singletons, we identified 7762 transcripts that appear to be homologues of chicken genes. A specific BLAST search with an emphasis on immune genes found 308 homologous chicken genes that have immune function, including ten major histocompatibility complex-related genes located on chicken chromosome 16. We also identified 1300 expressed sequence tag microsatellites and were able to design suitable flanking primers for 526 of these. A preliminary test of the polymorphism of the microsatellites found 10 polymorphic microsatellites of the 102 tested. Genomic resources generated in this study should greatly benefit future ecological, evolutionary and conservation genetic studies on this species.
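The abstract does not say how the expressed sequence tag microsatellites were detected. As a rough illustration only, the sketch below scans a sequence for perfect di-, tri- and tetranucleotide tandem repeats with a regular expression. The function name `find_microsatellites` and the minimum repeat counts in `MIN_REPEATS` are assumptions made for this example, not the tool or thresholds used in the study.

```python
import re

# Illustrative thresholds (motif length -> minimum number of copies).
# These are assumptions for the sketch, not the study's settings.
MIN_REPEATS = {2: 6, 3: 5, 4: 4}


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'ACAC', 'AA')."""
    return motif not in (motif + motif)[1:-1]


def find_microsatellites(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, copies) tuples for perfect tandem repeats in seq."""
    seq = seq.upper()
    hits = []
    for motif_len, min_rep in sorted(min_repeats.items()):
        # ([ACGT]{k}) captures one motif; \1{n,} demands at least n further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if not is_primitive(motif):
                continue  # skip homopolymers and motifs like 'ACAC' (really 'AC')
            hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return sorted(hits)


# Hypothetical test sequence: an (AC)8 and a (CAG)5 repeat with short flanks.
example = "TTG" + "AC" * 8 + "GGT" + "CAG" * 5 + "TT"
print(find_microsatellites(example))
```

A real pipeline would also need to handle imperfect repeats and check that enough flanking sequence remains on both sides for primer design, which this sketch does not attempt.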

Figure 1: A summary of sequencing and contig assembly results. (a) Length distribution of the 454 reads that passed quality filtering, before further processing. (b) Length distribution of assembled contigs. Contigs larger than 2000 bp are binned at the end of the x-axis. (c) Distribution of reads per contig (blue) and coverage per nucleotide site (red). Contigs with more than 30 reads are binned at the end of the x-axis. (d) Density scatterplot showing the relationship between reads per contig and contig length. The black line represents the trend of the contig length with increasing reads per contig. Both the x- and y-axes are presented on a log scale.
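The caption describes histograms in which all values above a cut-off are pooled into one final bin. A minimal sketch of that kind of binning follows. The function name `binned_lengths` and the bin width of 100 bp are assumptions for illustration; only the 2000 bp cut-off comes from the caption.

```python
from collections import Counter


def binned_lengths(lengths, bin_width=100, cap=2000):
    """Count positive lengths per bin; anything above `cap` goes into one overflow bin."""
    counts = Counter()
    for n in lengths:
        if n > cap:
            counts[f">{cap}"] += 1
        else:
            # Bins are (lo, lo + bin_width], so a length equal to `cap` stays in the last regular bin.
            lo = ((n - 1) // bin_width) * bin_width
            counts[f"{lo + 1}-{lo + bin_width}"] += 1
    return dict(counts)


# Hypothetical contig lengths in bp.
print(binned_lengths([150, 180, 470, 2000, 2500, 9000]))
```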

Mentions: In total, we sequenced one 1/8 and two 1/16 454 GS FLX Titanium runs, nearly the equivalent of 1/4 of a run. The raw sequences were deposited in the NCBI Short Read Archive under accession number SRA036234. After adapter trimming and quality filtering, we retained a total of 182 179 reads, with a mean length of 320 ± 140 bp (table 1; figure 1a). Of these, 153 065 (84.0%) reads were assembled into 9035 contigs with a length threshold of 100 bp. The mean length of the contigs was 470 ± 250 bp (figure 1b), with 2276 of the contigs being larger than 500 bp. The mean number of reads per contig was 18.81, and the average contig coverage per nucleotide site was 10.01 (figure 1c). Of the trimmed and cleaned reads that were not assembled (the singletons), only those longer than 200 bp were included in downstream analysis. There were 15 794 such singletons, with a mean length of 370 ± 90 bp. To confirm the general quality of the singletons and the contigs, we mapped all of them to the chicken genome (WUGSC 2.1). In total, 19 497 of 24 829 sequences (78.5%), comprising the contigs and the size-filtered singletons, could be mapped to the chicken genome. The failure of the remaining sequences to map could be due to the incompleteness of the chicken genome assembly itself: for example, many of the microchromosomes are under-represented, and many complex regions with copy number variation are absent [39]. Also, by the nature of the identification process, truly novel transcribed sequences are generally ignored [40].
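The totals and percentages quoted above can be re-derived from the reported counts. The sketch below does that arithmetic and also shows a length filter of the kind applied to the singletons. The helper names `percent` and `filter_singletons` and the example read identifiers are hypothetical; only the numbers are taken from the text.

```python
def percent(part, whole):
    """Percentage of `whole` represented by `part`, rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Counts quoted in the text.
reads_total = 182_179      # quality-filtered reads
reads_assembled = 153_065  # reads placed into contigs
contigs = 9_035
singletons = 15_794        # unassembled reads longer than 200 bp
mapped = 19_497            # sequences mapped to the chicken genome

sequences_total = contigs + singletons            # contigs plus size-filtered singletons
assembled_pct = percent(reads_assembled, reads_total)
mapped_pct = percent(mapped, sequences_total)


def filter_singletons(lengths_by_id, min_len=200):
    """Keep only singletons strictly longer than `min_len` bp."""
    return {sid: n for sid, n in lengths_by_id.items() if n > min_len}


print(sequences_total, assembled_pct, mapped_pct)
# Hypothetical read identifiers and lengths.
print(filter_singletons({"read_a": 200, "read_b": 201, "read_c": 95}))
```

The filter uses a strict inequality because the text says only singletons "longer than 200 bp" were kept.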

