Summarizing specific profiles in Illumina sequencing from whole-genome amplified DNA.

Tsai IJ, Hunt M, Holroyd N, Huckvale T, Berriman M, Kikuchi T - DNA Res. (2013)

Bottom Line: Detailed analysis of the reads from amplified libraries revealed characteristics suggesting that the majority of amplified fragment ends are identical but inverted versions of each other. Read coverage in amplified libraries is correlated with both tandem and inverted repeat content, while GC content only influences sequencing in long-insert libraries. To realize the full potential of whole-genome amplification (WGA) in revealing features of real biological interest, this article highlights the importance of recognizing additional sources of error in amplified sequence reads and discusses the potential implications for downstream analyses.


Affiliation: Parasite Genomics, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge CB10 1SA, UK; Faculty of Medicine, Division of Parasitology, Department of Infectious Disease, University of Miyazaki, Miyazaki 889-1692, Japan.


A plot of genome coverage against normalised average depth. Deviation from the theoretical curve (red) indicates less evenness in coverage depth distribution across the genome. Different protocols are plotted with different colours as listed in the legend, and dashed lines indicate read coverage from Replicate 1 of the long-insert libraries.
© Copyright Policy - creative-commons


Mentions: One of the most important criteria for accurate variant calling and assemblies from Illumina reads is even coverage of sequence data genome-wide. We first evaluated the variability in the depth of coverage of short-insert reads [32] by plotting the cumulative fraction of normalized depth of correctly paired read coverage that covers a given cumulative fraction of the genome (Fig. 2). Normalizing read coverage depth allows libraries of different coverage depths to be compared with each other. The theoretical line (Fig. 2) indicates a perfectly uniform distribution of reads, where 100% of the genome is covered by reads at a consistent normalized depth of 1. Figure 2 shows that both replicates of the unamplified short-insert library have the closest fit to the theoretical line, suggesting the most uniform distribution of reads. The remaining samples show some level of deviation, suggesting non-uniform distribution across the genome. Distribution plots of the long-insert libraries deviate further from the theoretical distribution than those of the short-insert libraries. This effect is more evident in the lower tail of the distribution, indicating that a greater proportion of the genome has lower coverage. Inspection of regions of lower coverage across all libraries shows that the most evident patterns are regions enriched in G homopolymer tracts and GGC motifs [33] (Supplementary Fig. S7).
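
The normalization and cumulative curve described above can be illustrated with a minimal Python sketch. It follows one plausible reading of the plot, not the authors' actual pipeline: each per-base depth is divided by the library's mean depth, and for a set of normalized-depth thresholds the fraction of the genome covered at or above that depth is reported. The function names, the thresholds and the toy depth vectors are all hypothetical, and the sketch assumes the per-base depths have already been computed from correctly paired reads.

```python
from bisect import bisect_left


def normalized_depths(depths):
    """Divide each per-base depth by the mean depth so libraries are comparable."""
    mean = sum(depths) / len(depths)
    if mean == 0:
        raise ValueError("mean depth is zero; nothing to normalize")
    return [d / mean for d in depths]


def fraction_covered_at_least(depths, thresholds):
    """For each threshold t, return the fraction of positions whose
    normalized depth is >= t."""
    norm = sorted(normalized_depths(depths))
    n = len(norm)
    # bisect_left counts the positions strictly below t in the sorted list.
    return [(n - bisect_left(norm, t)) / n for t in thresholds]


def theoretical_fraction(thresholds):
    """Perfectly uniform library: every base has normalized depth exactly 1."""
    return [1.0 if t <= 1.0 else 0.0 for t in thresholds]


# Hypothetical toy data: eight genome positions per library.
thresholds = [0.0, 0.5, 1.0, 1.5, 2.0]
uniform = [30] * 8                        # idealized even coverage
skewed = [0, 5, 10, 20, 30, 40, 55, 80]   # uneven coverage, same mean depth

uniform_curve = fraction_covered_at_least(uniform, thresholds)
skewed_curve = fraction_covered_at_least(skewed, thresholds)

print(uniform_curve)  # matches theoretical_fraction(thresholds)
print(skewed_curve)   # falls below 1 before depth 1 and has a tail beyond it
```

In this toy case the uniform library reproduces the theoretical step exactly, while the skewed library drops below full genome coverage well before a normalized depth of 1, which is the kind of lower-tail deviation the text attributes to the long-insert libraries.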

