Transcriptome analysis and SSR/SNP markers information of the blunt snout bream (Megalobrama amblycephala).

Gao Z, Luo W, Liu H, Zeng C, Liu X, Yi S, Wang W - PLoS ONE (2012)

Bottom Line: A total of 4,952 SSRs were found and 116 polymorphic loci have been characterized. A large number of SNPs (25,697) and indels (23,287) were identified in M. amblycephala based on specific filter criteria. The identified SSR and SNP markers will greatly benefit its breeding program and whole genome association studies.


Affiliation: Key Lab of Freshwater Animal Breeding, College of Fisheries, Ministry of Agriculture, Key Lab of Agricultural Animal Genetics, Breeding and Reproduction of Ministry of Education, Huazhong Agricultural University, Wuhan, People's Republic of China.

ABSTRACT

Background: Blunt snout bream (Megalobrama amblycephala) is a herbivorous freshwater fish species native to China and is recognized as a main aquaculture species of high economic value in the Chinese freshwater polyculture system. To date, only limited EST resources have been available for M. amblycephala. Recent advances in large-scale RNA sequencing provide a fast, cost-effective, and reliable approach to generating large expression datasets for functional genomic analysis, which is especially suitable for non-model species with unsequenced genomes.

Methodology and principal findings: Using 454 pyrosequencing, a total of 1,409,706 high quality reads (total length 577 Mbp) were generated from the normalized cDNA of pooled M. amblycephala individuals. These sequences were assembled into 26,802 contigs and 73,675 singletons. After BLAST searches against the NCBI non-redundant (NR) and UniProt databases with an expectation value (E-value) cutoff of 1e-10, over 40,000 unigenes were functionally annotated and classified using the FunCat functional annotation scheme. A comparative genomics approach revealed that a substantial proportion of the genes expressed in the M. amblycephala transcriptome are shared across the genomes of zebrafish, medaka, tetraodon, fugu, stickleback, human, mouse, and chicken, and identified a substantial number of potentially novel M. amblycephala genes. A total of 4,952 SSRs were found and 116 polymorphic loci have been characterized. A large number of SNPs (25,697) and indels (23,287) were identified in M. amblycephala based on specific filter criteria.

Conclusions: This study is the first comprehensive transcriptome analysis for a fish species belonging to the genus Megalobrama. These large EST resources are expected to be valuable for the development of molecular markers, construction of a gene-based linkage map, and large-scale expression analysis of M. amblycephala, as well as comparative genome analysis of fish species in the genus Megalobrama. The identified SSR and SNP markers will greatly benefit its breeding program and whole genome association studies.


Figure 6 (pone-0042637-g006): Classification of SNPs identified from 454 sequences of M. amblycephala.

Putative SNPs/indels may be false positives arising from sequencing errors or from misassembly of paralogous sequence variants or multisite sequence variants [39]. To select SNPs/indels with high confidence, putative SNPs were therefore screened on several factors: surrounding sequence quality, absence of additional SNPs/indels in the flanking regions, and minor allele frequency. Requiring a minor allele frequency >15% may help reduce false SNP/indel calls caused by sequencing errors. In addition, multiple SNPs/indels located close to one another (<15 bp) often represent sequencing errors and prevent the design of primers and probes for SNP/indel genotyping, so each putative SNP/indel was required to have no additional SNPs in its 15 bp flanking region. After filtering, 25,697 SNPs were identified, comprising 17,272 transitions and 8,425 transversions (Figure 6). The filtered SNP frequency was one SNP per 401 bp of transcribed sequence in M. amblycephala. A total of 23,287 filtered indels were identified, a frequency of one indel per 392 bp of transcribed sequence. Because minor allele frequency is an important consideration in choosing SNPs for SNP arrays, the minor allele frequencies of SNPs in the discovery populations were estimated from the sequence data (Figure 7). The average minor allele frequency of the filtered putative SNPs identified for M. amblycephala was 30.9%.
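
The two explicit filters described above (minor allele frequency >15%, and no other variant within the 15 bp flanking region) and the transition/transversion split can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the `Variant` record, the function names, and the choice to treat a neighbor at exactly 15 bp as "too close" are assumptions made for illustration, and the surrounding-sequence-quality filter is omitted because its threshold is not given in the text.

```python
from dataclasses import dataclass

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


@dataclass(frozen=True)
class Variant:
    """Hypothetical record for one putative SNP (not from the paper)."""
    contig: str   # contig or singleton identifier
    pos: int      # 1-based position on the contig
    ref: str      # major allele
    alt: str      # minor allele
    maf: float    # minor allele frequency estimated from read counts


def classify_snp(ref: str, alt: str) -> str:
    """Return 'transition' (A<->G or C<->T) or 'transversion' (all other changes)."""
    ref, alt = ref.upper(), alt.upper()
    if ref == alt:
        raise ValueError("ref and alt alleles must differ")
    pair = {ref, alt}
    if pair <= PURINES or pair <= PYRIMIDINES:
        return "transition"
    return "transversion"


def filter_variants(variants, min_maf=0.15, flank=15):
    """Keep variants with MAF > min_maf and no other putative variant
    within `flank` bp on the same contig.

    Assumption: neighbors are checked against ALL putative variants,
    including those that themselves fail the MAF filter.
    The pairwise scan is O(n^2), which is fine for a sketch.
    """
    kept = []
    for v in variants:
        if v.maf <= min_maf:
            continue
        crowded = any(
            other is not v
            and other.contig == v.contig
            and abs(other.pos - v.pos) <= flank
            for other in variants
        )
        if not crowded:
            kept.append(v)
    return kept


if __name__ == "__main__":
    # Toy data, invented purely to exercise the filters.
    putative = [
        Variant("contig1", 100, "A", "G", 0.30),  # isolated, MAF ok
        Variant("contig1", 500, "C", "A", 0.10),  # MAF too low
        Variant("contig2", 50, "C", "T", 0.25),   # 10 bp from the next one
        Variant("contig2", 60, "G", "T", 0.45),   # 10 bp from the previous one
        Variant("contig3", 200, "G", "C", 0.20),  # isolated, MAF ok
    ]
    for v in filter_variants(putative):
        print(v.contig, v.pos, classify_snp(v.ref, v.alt))

    # Transition/transversion ratio from the counts reported in the text.
    transitions, transversions = 17_272, 8_425
    print(transitions + transversions, round(transitions / transversions, 2))
```

Applied to the counts reported above, the two classes sum to the 25,697 filtered SNPs and give a transition-to-transversion ratio of about 2.05.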

