Improving ancient DNA genome assembly

ABSTRACT

Most reconstruction methods for genomes of ancient origin that are used today require a closely related reference. In order to identify genomic rearrangements or the deletion of whole genes, de novo assembly has to be used. However, because of inherent problems with ancient DNA, its de novo assembly is highly complicated. In order to tackle the diversity in the length of the input reads, we propose a two-layer approach, where multiple assemblies are generated in the first layer, which are then combined in the second layer. We used this two-layer assembly to generate assemblies for two different ancient samples and compared the results to current de novo assembly approaches. We are able to improve the assembly with respect to the length of the contigs and can resolve more repetitive regions.


Fig. 3: Distribution of the lengths of the contigs generated by the different assemblies. The results of the second-layer assembly with SGA are shown in white, and the results of one first-layer assembly are shown in dark grey. The light grey part represents the overlap of both methods. (A) shows the results using SOAPdenovo2 in the first layer and (B) shows the results using MEGAHIT in this layer for the leprosy data. (C) and (D) show the same results for the pestis data. To highlight the differences, the data were log-transformed.

Mentions: The length distribution of the resulting leprosy contigs shows a clear shift towards longer contigs (see Fig. 3). Because the contigs generated from the pestis data were very short, we did not filter them for a minimum length of 1,000 bp. Even when all contigs are used, the two-layer assembly method shifts the distribution towards longer contigs.
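
The comparison described above can be illustrated with a minimal sketch; it is not the authors' code. It bins contig lengths on a logarithmic scale, optionally applies a minimum-length filter such as the 1,000 bp cut-off, and computes the per-bin overlap of two histograms, which corresponds to the light grey area in the figure. The function names, the base-10 logarithm, the bin width of 0.25, and the example lengths are all assumptions made for illustration.

```python
import math
from collections import Counter


def log_length_histogram(lengths, min_length=0, bin_width=0.25):
    """Count contigs per log10(length) bin.

    Contigs shorter than min_length are dropped. The bin index is
    floor(log10(length) / bin_width); base 10 and the bin width are
    assumptions, since the source only says the data were logarithmized.
    """
    counts = Counter()
    for length in lengths:
        if length > 0 and length >= min_length:
            counts[math.floor(math.log10(length) / bin_width)] += 1
    return counts


def histogram_overlap(first, second):
    """Per-bin minimum of two histograms (the shared, 'light grey' part)."""
    return {b: min(first[b], second[b]) for b in first.keys() & second.keys()}


if __name__ == "__main__":
    # Hypothetical contig lengths in bp, NOT taken from the paper.
    first_layer = [1200, 1500, 3000, 800]
    second_layer = [1500, 12000, 40000, 900]

    first_hist = log_length_histogram(first_layer, min_length=1000)
    second_hist = log_length_histogram(second_layer, min_length=1000)

    print("first layer: ", dict(first_hist))
    print("second layer:", dict(second_hist))
    print("overlap:     ", histogram_overlap(first_hist, second_hist))
```

Setting `min_length=0` corresponds to the unfiltered case described for the pestis data, where all contigs are kept.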

