Survey of the Applications of NGS to Whole-Genome Sequencing and Expression Profiling.

Lim JS, Choi BS, Lee JS, Shin C, Yang TJ, Rhee JS, Lee JS, Choi IY - Genomics Inform (2012)

Bottom Line: Massive short-read data from the Illumina/Solexa system are sufficient to discover DNA variation, which reduces the cost of DNA sequencing. Coverage of 20× and 50× of the estimated transcriptome using Roche/454 and Illumina/Solexa, respectively, is effective for creating novel expressed reference sequences. However, an average of only 30× coverage of a transcriptome with Illumina/Solexa short reads is enough to quantify expression against a reference expressed sequence tag sequence.

Affiliation: National Instrumentation Center for Environmental Management, College of Agriculture and Life Sciences, Seoul National University, Seoul 151-921, Korea.

ABSTRACT
Recently, technologies for detecting DNA sequence variation and profiling gene expression have been widely used in genome biology and genetics. Their application to genome studies has advanced particularly with the introduction of the next-generation DNA sequencing (NGS) systems Roche/454 and Illumina/Solexa, along with bioinformatics analysis technologies for whole-genome de novo assembly, expression profiling, DNA variation discovery, and genotyping. Massive whole-genome shotgun paired-end and mate-pair sequencing data are both important for constructing a de novo assembly of a novel genome. For effective de novo assembly, DNA sequence information from multiplatform NGS is needed, with at least 2× and 30× depth of genome coverage from Roche/454 and Illumina/Solexa, respectively. Massive short-read data from the Illumina/Solexa system are sufficient to discover DNA variation, which reduces the cost of DNA sequencing. Whole-genome expression profile data are useful for genome systems biology because they quantify the RNAs expressed from the whole-genome transcriptome in a given tissue sample. Hybrid mRNA sequences from Roche/454 and Illumina/Solexa are more powerful for finding novel genes through de novo assembly in any whole-genome sequenced species. Coverage of 20× and 50× of the estimated transcriptome using Roche/454 and Illumina/Solexa, respectively, is effective for creating novel expressed reference sequences. However, an average of only 30× coverage of a transcriptome with Illumina/Solexa short reads is enough to quantify expression against a reference expressed sequence tag sequence.

Figure 2: Integrated pipeline for de novo assembly of a novel genome. In this scheme, data are filtered to remove low-quality and short reads, initial assemblies are built with various software and their contigs compared, hybrid contigs are built with the MIRA assembler, and contigs are ordered with SSPACE software for scaffold construction.

Mentions: The genome sequence can be associated with predicted genes by using transcriptome sequence data. An ideal method for cost-effective novel genome sequencing using NGS is de novo assembly of diverse shotgun fragment-end sequencing data from multiplatform systems (Fig. 1). The first strategy for sequencing a novel genome is to build contigs and scaffolds from randomly sheared shotgun single-end or paired-end reads generated with Roche/454 or Illumina/Solexa, assembling the NGS data with various assembly software. Recently, a catfish genome was sequenced with multiplatform Roche/454 and Illumina/Solexa technology and assembled from an effective combination of low coverage depths, 18× Roche/454 and 70× Illumina/Solexa, using three assembly programs: Newbler for the 454 reads, the Velvet assembler for the Illumina reads, and the MIRA assembler for the final assembly of contigs and singletons derived from the initial assemblies. This resulted in 193 contigs with an N50 value of 13,123 bp [2]. In another multiplatform assembly, of the 40-Mb eukaryotic genome of the fungus Sordaria macrospora, a combination of 85-fold Illumina/Solexa coverage and 10-fold Roche/454 coverage was assembled with the Velvet assembler into a 40-Mb draft (N50 of 117 kb) as a reference for a model organism of fungal morphogenesis [17]. In these recently reported assembly methods, combinations of multiplatform sequence data led to successful novel genome assemblies using various assembly strategy pipelines.
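The N50 values quoted above summarize assembly contiguity. As a minimal sketch, assuming the common definition of N50 (the contig length at which the cumulative length of contigs, sorted from longest to shortest, first reaches half of the total assembly length), it can be computed as follows. The function name `n50` and the example lengths are illustrative and are not taken from the cited assemblies.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths (in bp).

    Assumed definition: sort contigs from longest to shortest and
    return the length of the contig at which the running total first
    reaches at least half of the total assembly length.
    """
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    raise ValueError("contig_lengths must not be empty")


# Hypothetical contig lengths, for illustration only.
example_contigs = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(example_contigs))  # prints 8 (10 + 9 + 8 = 27, half of 54)
```

A higher N50 at the same total length indicates that more of the assembly sits in long contigs, which is why the two assemblies above are reported with both a contig count or draft size and an N50.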
Comparing these assembly strategy pipelines, we suggest an effective integrated pipeline in which data are filtered to remove low-quality and short reads, initial assemblies are built with various software and their contigs compared, hybrid contigs are built with the MIRA assembler, and finally contigs are ordered with SSPACE software (http://www.baseclear.com/dna-sequencing/data-analysis/) [18] for scaffold construction in the de novo assembly of a novel genome (Fig. 2). Based on the comparison of several de novo assembly approaches, we suggest using DNA sequences from multiplatform NGS, with at least 2× and 30× depth of genome coverage from Roche/454 and Illumina/Solexa, respectively, and performing a hybrid assembly for cost-effective novel genome sequencing.
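To make the suggested 2× and 30× depths concrete, the sketch below uses the standard relation depth = (number of reads × read length) / genome size. The 40-Mb genome size matches the fungal example above; the read lengths (400 bp for Roche/454, 100 bp for Illumina/Solexa) and the function names are assumptions chosen for illustration, not values from the article.

```python
import math


def coverage_depth(n_reads, read_length_bp, genome_size_bp):
    """Average depth of coverage: total sequenced bases / genome size."""
    return n_reads * read_length_bp / genome_size_bp


def reads_needed(target_depth, read_length_bp, genome_size_bp):
    """Smallest number of reads that reaches the target average depth."""
    return math.ceil(target_depth * genome_size_bp / read_length_bp)


GENOME_SIZE_BP = 40_000_000   # 40-Mb genome, as in the fungal example
READ_LEN_454_BP = 400         # assumed Roche/454 read length
READ_LEN_ILLUMINA_BP = 100    # assumed Illumina/Solexa read length

print(reads_needed(2, READ_LEN_454_BP, GENOME_SIZE_BP))        # prints 200000
print(reads_needed(30, READ_LEN_ILLUMINA_BP, GENOME_SIZE_BP))  # prints 12000000
```

These are averages over the genome: actual per-base depth varies, and reads discarded during quality filtering do not count toward the target, so a real run would typically be planned with some margin.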

