Survey of the Applications of NGS to Whole-Genome Sequencing and Expression Profiling.

Lim JS, Choi BS, Lee JS, Shin C, Yang TJ, Rhee JS, Lee JS, Choi IY - Genomics Inform (2012)

Bottom Line: Massive short-read data from the Illumina/Solexa system are sufficient to discover DNA variation, which reduces the cost of DNA sequencing. Coverage of 20× and 50× of the estimated transcriptome sequences using Roche/454 and Illumina/Solexa, respectively, is effective for creating novel expressed reference sequences. However, an average of only 30× transcriptome coverage with Illumina/Solexa short reads is enough to quantify expression against the reference expressed sequence tag sequence.


Affiliation: National Instrumentation Center for Environmental Management, College of Agriculture and Life Sciences, Seoul National University, Seoul 151-921, Korea.

ABSTRACT
Recently, technologies for detecting DNA sequence variation and profiling gene expression have been widely used in genome biology and genetics. Their application to genome studies has advanced particularly with the introduction of the next-generation DNA sequencing (NGS) systems Roche/454 and Illumina/Solexa, along with bioinformatic analysis technologies for whole-genome de novo assembly, expression profiling, DNA variation discovery, and genotyping. Massive whole-genome shotgun paired-end and mate-pair sequencing data are both important for constructing a de novo assembly of a novel genome. Effective de novo assembly requires DNA sequence information from multiplatform NGS with at least 2× and 30× depth of genome coverage using Roche/454 and Illumina/Solexa, respectively. Massive short-read data from the Illumina/Solexa system are sufficient to discover DNA variation, which reduces the cost of DNA sequencing. Whole-genome expression profile data are useful for genome systems biology, providing quantification of the RNAs expressed from a whole-genome transcriptome, depending on the tissue sampled. Hybrid mRNA sequences from Roche/454 and Illumina/Solexa are more powerful for finding novel genes through de novo assembly in any whole-genome sequenced species. Coverage of 20× and 50× of the estimated transcriptome sequences using Roche/454 and Illumina/Solexa, respectively, is effective for creating novel expressed reference sequences. However, an average of only 30× transcriptome coverage with Illumina/Solexa short reads is enough to quantify expression against the reference expressed sequence tag sequence.



Figure 4: A scheme of transcriptome expression analysis through massively parallel signature sequencing (MPSS) technology and bioinformatics: the identification of expressed genes through hybrid de novo assembly with Roche/454 and Illumina/Solexa data (left) and expression-level profiling through mapping the Illumina/Solexa sequences to the expressed sequence tag reference.

Mentions: Hybrid sequence data at 20× and 50× coverage of the estimated transcriptome sequence from Roche/454 and Illumina/Solexa, respectively, are effective for creating novel expressed reference sequences, while short-read Illumina/Solexa data are cost-efficient for expression quantification when comparing exposed samples with natural phenotype samples by mapping reads to the reference genes (Fig. 4). An average of only 30× transcriptome coverage with Illumina/Solexa short reads is enough to quantify expression against reference expressed sequence tag sequences. The expression results can differ depending on the software used, such as CAP3, MIRA, Newbler, SeqMan, and CLC. Therefore, results should be compared across different program options to establish robust expression profiling [41]. ChIP-on-chip is currently a powerful tool for understanding the regulation of gene transcription: combining chromatin immunoprecipitation with two-channel microarray technology allows genome-wide mapping of the binding sites of DNA-interacting proteins [29]. For biologists working with the lowest sequencing budget, transcriptome expression information from any NGS application would be more useful than complete genome information for understanding, in silico, the gene regulation behind related genetic phenotypes. Among in silico methods, the discovery of conserved and novel miRNAs is possible from massive miRNAnome data in any species. In particular, the target genes of the discovered miRNAs could provide robust information for genome biology studies. A transcriptome assembly is smaller than a genome assembly and should therefore be more computationally tractable, but it is often harder because individual contigs can have highly variable read coverage. Among single assemblers, Newbler 2.5 performed best on our trial dataset, but the other assemblers were closely comparable. Combining optimal assemblies from different programs, however, gives a more credible final product, and this strategy is recommended [41].
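
The coverage figures quoted above (20× and 50× for reference building, 30× for quantification) can be turned into rough read counts with the usual mean-depth estimate, coverage = reads × read length / target size. The following is a minimal sketch of that arithmetic only; the 30 Mb transcriptome size and the 400 bp (Roche/454) and 100 bp (Illumina/Solexa) read lengths are illustrative assumptions that do not come from the article, and the function names are hypothetical.

```python
import math


def mean_coverage(num_reads: int, read_length: int, target_size: int) -> float:
    """Mean depth: total sequenced bases divided by the target size in bases."""
    if target_size <= 0 or read_length <= 0:
        raise ValueError("target_size and read_length must be positive")
    return num_reads * read_length / target_size


def reads_needed(coverage: float, read_length: int, target_size: int) -> int:
    """Smallest whole number of reads that reaches the requested mean depth."""
    if target_size <= 0 or read_length <= 0:
        raise ValueError("target_size and read_length must be positive")
    return math.ceil(coverage * target_size / read_length)


# Assumed values for illustration only (not taken from the article).
TRANSCRIPTOME_BP = 30_000_000   # hypothetical estimated transcriptome size
ROCHE_READ_BP = 400             # hypothetical Roche/454 read length
ILLUMINA_READ_BP = 100          # hypothetical Illumina/Solexa read length

# Hybrid de novo reference building: 20x Roche/454 plus 50x Illumina/Solexa.
roche_reads = reads_needed(20, ROCHE_READ_BP, TRANSCRIPTOME_BP)
illumina_assembly_reads = reads_needed(50, ILLUMINA_READ_BP, TRANSCRIPTOME_BP)

# Expression quantification against an existing reference: 30x Illumina/Solexa.
illumina_quant_reads = reads_needed(30, ILLUMINA_READ_BP, TRANSCRIPTOME_BP)

if __name__ == "__main__":
    print("Roche/454 reads for 20x:", roche_reads)
    print("Illumina/Solexa reads for 50x:", illumina_assembly_reads)
    print("Illumina/Solexa reads for 30x:", illumina_quant_reads)
```

This is a mean-depth estimate only. As noted above, read coverage across transcriptome contigs is highly variable, so weakly expressed transcripts will sit well below the average depth.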

