Limits...
Improving ancient DNA genome assembly

View Article: PubMed Central - HTML - PubMed

ABSTRACT

Most reconstruction methods for genomes of ancient origin that are used today require a closely related reference. In order to identify genomic rearrangements or the deletion of whole genes, de novo assembly has to be used. However, because of inherent problems with ancient DNA, its de novo assembly is highly complicated. In order to tackle the diversity in the length of the input reads, we propose a two-layer approach, where multiple assemblies are generated in the first layer, which are then combined in the second layer. We used this two-layer assembly to generate assemblies for two different ancient samples and compared the results to current de novo assembly approaches. We are able to improve the assembly with respect to the length of the contigs and can resolve more repetitive regions.

No MeSH data available.


Workflow of our two-layer assembly approach.First the reads are preprocessed by removing sequenced adapters and clipping low-quality bases. After that, multiple de novo assemblies are generated using a de Bruijn graph approach with multiple values for k. The reads are then mapped back against each of these resulting contigs and the contigs with no read support are filtered out. In Layer 2, these filtered contigs are then combined and assembled again using an Overlap-Layout-Consensus approach. Very short contigs are removed. The resulting contigs are mapped against a reference genome and contig statistics are calculated in order to assess the quality of the assembly.
© Copyright Policy
Related In: Results  -  Collection

License
getmorefigures.php?uid=PMC5384568&req=5

fig-1: Workflow of our two-layer assembly approach.First the reads are preprocessed by removing sequenced adapters and clipping low-quality bases. After that, multiple de novo assemblies are generated using a de Bruijn graph approach with multiple values for k. The reads are then mapped back against each of these resulting contigs and the contigs with no read support are filtered out. In Layer 2, these filtered contigs are then combined and assembled again using an Overlap-Layout-Consensus approach. Very short contigs are removed. The resulting contigs are mapped against a reference genome and contig statistics are calculated in order to assess the quality of the assembly.

Mentions: The general structure of our two-layer assembly approach is to use multiple assemblies in a first layer with different k-mers, which are then merged in a second layer assembly using an overlap-based assembly program (see Fig. 1).


Improving ancient DNA genome assembly
Workflow of our two-layer assembly approach.First the reads are preprocessed by removing sequenced adapters and clipping low-quality bases. After that, multiple de novo assemblies are generated using a de Bruijn graph approach with multiple values for k. The reads are then mapped back against each of these resulting contigs and the contigs with no read support are filtered out. In Layer 2, these filtered contigs are then combined and assembled again using an Overlap-Layout-Consensus approach. Very short contigs are removed. The resulting contigs are mapped against a reference genome and contig statistics are calculated in order to assess the quality of the assembly.
© Copyright Policy
Related In: Results  -  Collection

License
Show All Figures
getmorefigures.php?uid=PMC5384568&req=5

fig-1: Workflow of our two-layer assembly approach.First the reads are preprocessed by removing sequenced adapters and clipping low-quality bases. After that, multiple de novo assemblies are generated using a de Bruijn graph approach with multiple values for k. The reads are then mapped back against each of these resulting contigs and the contigs with no read support are filtered out. In Layer 2, these filtered contigs are then combined and assembled again using an Overlap-Layout-Consensus approach. Very short contigs are removed. The resulting contigs are mapped against a reference genome and contig statistics are calculated in order to assess the quality of the assembly.
Mentions: The general structure of our two-layer assembly approach is to use multiple assemblies in a first layer with different k-mers, which are then merged in a second layer assembly using an overlap-based assembly program (see Fig. 1).

View Article: PubMed Central - HTML - PubMed

ABSTRACT

Most reconstruction methods for genomes of ancient origin that are used today require a closely related reference. In order to identify genomic rearrangements or the deletion of whole genes, de novo assembly has to be used. However, because of inherent problems with ancient DNA, its de novo assembly is highly complicated. In order to tackle the diversity in the length of the input reads, we propose a two-layer approach, where multiple assemblies are generated in the first layer, which are then combined in the second layer. We used this two-layer assembly to generate assemblies for two different ancient samples and compared the results to current de novo assembly approaches. We are able to improve the assembly with respect to the length of the contigs and can resolve more repetitive regions.

No MeSH data available.