Next-generation sequencing-based detection of germline L1-mediated transductions.

Tica J, Lee E, Untergasser A, Meiers S, Garfield DA, Gokcumen O, Furlong EE, Park PJ, Stütz AM, Korbel JO - BMC Genomics (2016)

Bottom Line: We employed TIGER to characterize polymorphic transductions in fifteen genomes from non-human primate species (chimpanzee, orangutan and rhesus macaque), as well as in a human genome. We achieved high accuracy as confirmed by PCR and two single molecule DNA sequencing techniques, and uncovered differences in relative rates of transduction between primate species. By enabling detection of polymorphic transductions, TIGER makes this form of relevant structural variation amenable for population and personal genome analysis.


Affiliation: European Molecular Biology Laboratory, Genome Biology Unit, 69117, Heidelberg, Germany.

ABSTRACT

Background: Active LINE-1 (L1) elements can mobilize flanking sequences to different genomic loci through a process termed transduction, thereby influencing genomic content and structure. However, an approach for detecting polymorphic germline non-reference transductions in massively-parallel sequencing data has been lacking.

Results: Here we present the computational approach TIGER (Transduction Inference in GERmline genomes), enabling the discovery of non-reference L1-mediated transductions by combining L1 discovery with detection of unique insertion sequences and detailed characterization of insertion sites. We employed TIGER to characterize polymorphic transductions in fifteen genomes from non-human primate species (chimpanzee, orangutan and rhesus macaque), as well as in a human genome. We achieved high accuracy as confirmed by PCR and two single molecule DNA sequencing techniques, and uncovered differences in relative rates of transduction between primate species.

Conclusions: By enabling detection of polymorphic transductions, TIGER makes this form of relevant structural variation amenable for population and personal genome analysis.


Fig. 3: Experimental verification of TIGER-based L1-mediated 3′ transductions by PCR. (a) General primer design: outer primers (grey arrows) were placed outside of the event in the target locus to amplify the L1-mediated sequence transduction insertion allele and/or the reference genome allele. On the left side of the locus, the corresponding sequence (dotted line) uniquely matches the target site, and subsequently matches to multiple positions in the genome, in line with the presence of an L1 element. Further to the right, the sequence will also match uniquely to the target site and end with a polyA stretch not seen in the reference genome. In order to confirm the presence and origin of the transduced sequence (source locus), we employed a second set of primers (purple arrows) inside the predicted unique transduction sequence. (b) Example PCRs verifying rhesus macaque L1-mediated sequence transductions, based on outer primers, are shown for inferred carrier (C) and non-carrier (NC) samples. In the presence of an L1-mediated transduction sequence insertion, a larger band than the reference band in NC is seen; heterozygotes show both bands, whereas homozygous L1-mediated sequence transduction insertions show only one (i.e. the higher) band. (c) A Circos plot shows the distribution of all inferred rhesus macaque L1-mediated sequence transductions (for orangutan and chimpanzee, see Additional file 1: Figure S6); insertions experimentally validated by PCR and MinION single molecule sequencing are depicted in green. Arrowheads indicate directionality towards the target locus.
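
The band-reading rule described for the outer-primer PCR can be written down as a small lookup. This is only an illustrative sketch: the function name `genotype_from_bands` and its two boolean inputs are hypothetical and not part of TIGER or the authors' protocol, and the "no amplification" outcome is an added assumption for the case where neither band appears.

```python
def genotype_from_bands(reference_band: bool, insertion_band: bool) -> str:
    """Interpret an outer-primer PCR lane (hypothetical helper).

    reference_band: the smaller band, matching the reference allele, is present.
    insertion_band: the larger band, carrying the L1-mediated transduction
                    insertion, is present.
    """
    if reference_band and insertion_band:
        # Both alleles amplified: one copy with and one without the insertion.
        return "heterozygous carrier"
    if insertion_band:
        # Only the larger band: both copies carry the insertion.
        return "homozygous carrier"
    if reference_band:
        # Only the reference-sized band: no insertion at this locus.
        return "non-carrier"
    # Assumption: neither band means the PCR failed and the lane is uninformative.
    return "no amplification"


if __name__ == "__main__":
    for ref, ins in [(True, True), (False, True), (True, False), (False, False)]:
        print(ref, ins, genotype_from_bands(ref, ins))
```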

Mentions: To verify the accuracy of TIGER, we performed validation experiments on 51 randomly chosen 3′ transduction calls (seven in chimpanzee, 28 in orangutan and 16 in macaque), using PCR followed by capillary sequencing (Table 1 and Fig. 3). We employed a combination of an outer and an inner primer pair to specifically amplify the target region, and to overcome the obstacle that the two respective polyA tails pose for validation by capillary sequencing (Fig. 3a). This validation strategy enabled verification of both the presence of the MEI and of the transduced unique sequence. A representative PCR gel picture for macaque, using outer primers, is shown in Fig. 3b. A Circos plot depicting all predicted transductions in macaque, with those supported by PCR validation data highlighted, is provided in Fig. 3c. In total, we verified 43 out of 51 L1-mediated transduction calls, based on which we estimated a False Discovery Rate (FDR) of 15.7 %, with similar FDR estimates across different primate species (Table 1; see Additional file 1: Supplementary Methods for an explanation of the FDR calculation). Investigation of the experimental data on the eight false positive loci revealed that seven lacked MEIs (L1 insertion negative calls) as well as the transduced sequence, whereas the remaining locus presented evidence of an L1 insertion but lacked the inferred transduced sequence (transduction negative call).
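
The reported FDR can be reproduced from the counts in the text. The sketch below assumes the FDR is simply the fraction of tested calls that failed validation; the authors' exact procedure is described in their Supplementary Methods and may differ in detail. The function name `false_discovery_rate` is hypothetical.

```python
# Validation counts as reported in the text.
tested_per_species = {"chimpanzee": 7, "orangutan": 28, "macaque": 16}
tested = sum(tested_per_species.values())  # 51 calls tested in total
verified = 43                              # calls confirmed by PCR and sequencing

# Breakdown of the failed calls given in the text.
l1_negative = 7             # no MEI and no transduced sequence
transduction_negative = 1   # L1 insertion present, transduced sequence absent


def false_discovery_rate(n_tested: int, n_verified: int) -> float:
    """Assumed definition: failed validations divided by tested calls."""
    if n_tested <= 0:
        raise ValueError("n_tested must be positive")
    if not 0 <= n_verified <= n_tested:
        raise ValueError("n_verified must lie between 0 and n_tested")
    return (n_tested - n_verified) / n_tested


fdr = false_discovery_rate(tested, verified)
print(f"false positives: {tested - verified}, FDR: {fdr:.1%}")
```

Under this assumption, the eight failed calls out of 51 give 8/51, which rounds to the 15.7 % stated above.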


