Limits...
TE-Tracker: systematic identification of transposition events through whole-genome resequencing.

Gilly A, Etcheverry M, Madoui MA, Guy J, Quadrana L, Alberti A, Martin A, Heitkam T, Engelen S, Labadie K, Le Pen J, Wincker P, Colot V, Aury JM - BMC Bioinformatics (2014)

Bottom Line: As such, they are increasingly recognized as impacting all aspects of genome function.We show that TE-Tracker, while working independently of any prior annotation, bridges the gap between these two approaches in terms of detection power.Indeed, its positive predictive value (PPV) is comparable to that of dedicated TE software while its sensitivity is typical of a generic SV detection tool.

View Article: PubMed Central - PubMed

Affiliation: Commissariat a l'Energie Atomique (CEA), Institut de Genomique (IG), Genoscope, 2 rue Gaston Crémieux, BP5706, 91057, Evry, France. ag15@sanger.ac.uk.

ABSTRACT

Background: Transposable elements (TEs) are DNA sequences that are able to move from their location in the genome by cutting or copying themselves to another locus. As such, they are increasingly recognized as impacting all aspects of genome function. With the dramatic reduction in cost of DNA sequencing, it is now possible to resequence whole genomes in order to systematically characterize novel TE mobilization in a particular individual. However, this task is made difficult by the inherently repetitive nature of TE sequences, which in some eukaryotes compose over half of the genome sequence. Currently, only a few software tools dedicated to the detection of TE mobilization using next-generation-sequencing are described in the literature. They often target specific TEs for which annotation is available, and are only able to identify families of closely related TEs, rather than individual elements.

Results: We present TE-Tracker, a general and accurate computational method for the de-novo detection of germ line TE mobilization from re-sequenced genomes, as well as the identification of both their source and destination sequences. We compare our method with the two classes of existing software: specialized TE-detection tools and generic structural variant (SV) detection tools. We show that TE-Tracker, while working independently of any prior annotation, bridges the gap between these two approaches in terms of detection power. Indeed, its positive predictive value (PPV) is comparable to that of dedicated TE software while its sensitivity is typical of a generic SV detection tool. TE-Tracker demonstrates the benefit of adopting an annotation-independent, de novo approach for the detection of TE mobilization events. We use TE-Tracker to provide a comprehensive view of transposition events induced by loss of DNA methylation in Arabidopsis. TE-Tracker is freely available at http://www.genoscope.cns.fr/TE-Tracker .

Conclusions: We show that TE-Tracker accurately detects both the source and destination of novel transposition events in re-sequenced genomes. Moreover, TE-Tracker is able to detect all potential donor sequences for a given insertion, and can identify the correct one among them. Furthermore, TE-Tracker produces significantly fewer false positives than common SV detection programs, thus greatly facilitating the detection and analysis of TE mobilization events.

Show MeSH

Related in: MedlinePlus

TE-Tracker overview, main steps of the TE-Tracker pipeline.
© Copyright Policy - open-access
Related In: Results  -  Collection

License 1 - License 2
getmorefigures.php?uid=PMC4279814&req=5

Fig1: TE-Tracker overview, main steps of the TE-Tracker pipeline.

Mentions: TE-Tracker is divided into three independent modules: Eris, Leto and Metis (Figure 1). TE-Tracker starts with the Eris module detecting discordant read pairs (Figure 2a), i.e. pairs that map in unexpected orientation or location with respect to the preparation and insert size, which can constitute evidence of a transposition event. First, alignments are filtered based on mapping quality and then a random sample of the read pairs is used to estimate the insert size distribution. Median and median absolute deviation (MAD) thresholds are used to mark as discordant the pairs for which the read mates map with an unexpected insert size (see Methods). Pairs mapping on different chromosomes or in an unexpected orientation with respect to the sequencing library are flagged discordant as well. When multiple mappings are available for either mate of one pair, the pair is considered discordant only if all combinations of mate mappings match the aforementioned discordance criteria; in which case all potential mappings are recorded as if they were unique mappings from separate read pairs (see Methods).Figure 1


TE-Tracker: systematic identification of transposition events through whole-genome resequencing.

Gilly A, Etcheverry M, Madoui MA, Guy J, Quadrana L, Alberti A, Martin A, Heitkam T, Engelen S, Labadie K, Le Pen J, Wincker P, Colot V, Aury JM - BMC Bioinformatics (2014)

TE-Tracker overview, main steps of the TE-Tracker pipeline.
© Copyright Policy - open-access
Related In: Results  -  Collection

License 1 - License 2
Show All Figures
getmorefigures.php?uid=PMC4279814&req=5

Fig1: TE-Tracker overview, main steps of the TE-Tracker pipeline.
Mentions: TE-Tracker is divided into three independent modules: Eris, Leto and Metis (Figure 1). TE-Tracker starts with the Eris module detecting discordant read pairs (Figure 2a), i.e. pairs that map in unexpected orientation or location with respect to the preparation and insert size, which can constitute evidence of a transposition event. First, alignments are filtered based on mapping quality and then a random sample of the read pairs is used to estimate the insert size distribution. Median and median absolute deviation (MAD) thresholds are used to mark as discordant the pairs for which the read mates map with an unexpected insert size (see Methods). Pairs mapping on different chromosomes or in an unexpected orientation with respect to the sequencing library are flagged discordant as well. When multiple mappings are available for either mate of one pair, the pair is considered discordant only if all combinations of mate mappings match the aforementioned discordance criteria; in which case all potential mappings are recorded as if they were unique mappings from separate read pairs (see Methods).Figure 1

Bottom Line: As such, they are increasingly recognized as impacting all aspects of genome function.We show that TE-Tracker, while working independently of any prior annotation, bridges the gap between these two approaches in terms of detection power.Indeed, its positive predictive value (PPV) is comparable to that of dedicated TE software while its sensitivity is typical of a generic SV detection tool.

View Article: PubMed Central - PubMed

Affiliation: Commissariat a l'Energie Atomique (CEA), Institut de Genomique (IG), Genoscope, 2 rue Gaston Crémieux, BP5706, 91057, Evry, France. ag15@sanger.ac.uk.

ABSTRACT

Background: Transposable elements (TEs) are DNA sequences that are able to move from their location in the genome by cutting or copying themselves to another locus. As such, they are increasingly recognized as impacting all aspects of genome function. With the dramatic reduction in cost of DNA sequencing, it is now possible to resequence whole genomes in order to systematically characterize novel TE mobilization in a particular individual. However, this task is made difficult by the inherently repetitive nature of TE sequences, which in some eukaryotes compose over half of the genome sequence. Currently, only a few software tools dedicated to the detection of TE mobilization using next-generation-sequencing are described in the literature. They often target specific TEs for which annotation is available, and are only able to identify families of closely related TEs, rather than individual elements.

Results: We present TE-Tracker, a general and accurate computational method for the de-novo detection of germ line TE mobilization from re-sequenced genomes, as well as the identification of both their source and destination sequences. We compare our method with the two classes of existing software: specialized TE-detection tools and generic structural variant (SV) detection tools. We show that TE-Tracker, while working independently of any prior annotation, bridges the gap between these two approaches in terms of detection power. Indeed, its positive predictive value (PPV) is comparable to that of dedicated TE software while its sensitivity is typical of a generic SV detection tool. TE-Tracker demonstrates the benefit of adopting an annotation-independent, de novo approach for the detection of TE mobilization events. We use TE-Tracker to provide a comprehensive view of transposition events induced by loss of DNA methylation in Arabidopsis. TE-Tracker is freely available at http://www.genoscope.cns.fr/TE-Tracker .

Conclusions: We show that TE-Tracker accurately detects both the source and destination of novel transposition events in re-sequenced genomes. Moreover, TE-Tracker is able to detect all potential donor sequences for a given insertion, and can identify the correct one among them. Furthermore, TE-Tracker produces significantly fewer false positives than common SV detection programs, thus greatly facilitating the detection and analysis of TE mobilization events.

Show MeSH
Related in: MedlinePlus