Statistical approaches to detecting and analyzing tandem repeats in genomic sequences.

Anisimova M, Pečerska J, Schaper E - Front Bioeng Biotechnol (2015)

Bottom Line: We present our recent contributions to computational and statistical approaches for TR significance testing, sequence profile-based TR annotation, TR-aware sequence alignment, phylogenetic analyses of TR unit number and order, and TR benchmarks. Importantly, all these methods explicitly rely on the evolutionary definition of a tandem repeat as a sequence of adjacent repeat units stemming from a common ancestor. The discussed work has a focus on protein TRs, yet is generally applicable to nucleic acid TRs, sharing similar features.


Affiliation: Institute of Applied Simulation, School of Life Sciences and Facility Management, Zürich University of Applied Sciences (ZHAW), Wädenswil, Switzerland.

ABSTRACT
Tandem repeats (TRs) are frequently observed in genomes across all domains of life. Evidence suggests that some TRs are crucial for proteins with fundamental biological functions and can be associated with virulence, resistance, and infectious/neurodegenerative diseases. Genome-scale systematic studies of TRs have the potential to unveil core mechanisms governing TR evolution and TR roles in shaping genomes. However, TR-related studies are often non-trivial because TR regions are heterogeneous and sometimes fast-evolving. In this review, we discuss these intricacies and their consequences. We present our recent contributions to computational and statistical approaches for TR significance testing, sequence profile-based TR annotation, TR-aware sequence alignment, phylogenetic analyses of TR unit number and order, and TR benchmarks. Importantly, all these methods explicitly rely on the evolutionary definition of a tandem repeat as a sequence of adjacent repeat units stemming from a common ancestor. The discussed work focuses on protein TRs, yet is generally applicable to nucleic acid TRs, which share similar features.


Figure 1: Tandem repeats in genomic sequences. (A) An example TR with three units and the corresponding MSA of its units. (B) Different parts of a TR motif (R = right and L = left) have different histories after a single duplication with shifted TR units. Shown are these duplication histories as phylogenies of the right and left parts of the TR motif. (C) Five scenarios of overlapping and non-overlapping TR annotations.

Mentions: A tandem repeat (TR) in a genomic sequence is a consecutive recurrence of a single sequence motif. A TR is described by the length of its minimal repeating motif (the unit), the number of units, and the similarity among its units. The similarity of initially identical TR units fades with time through point mutations and indels, masking their shared ancestry. Diverged TR units, even when unrecognizable by eye, can maintain structural similarity over long evolutionary times [e.g., Figure 1 in Kajava (2012)]. Although the mechanisms shaping TRs are poorly understood, TRs can evolve by duplication or loss of TR units, recombination, and gene conversion (Pearson et al., 2005; Richard et al., 2008). TRs can also mutate by replication slippage (Levinson and Gutman, 1987; Ellegren, 2000), whereby slipped-strand mispairing during DNA synthesis causes a loss or gain of units as loops of TR units form hairpin structures (Mirkin, 2006).
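
The three descriptors named above (unit length, number of units, similarity among units) can be illustrated with a minimal sketch that takes an MSA of TR units, as in Figure 1A. The function name describe_tandem_repeat and the use of mean pairwise identity as the similarity measure are assumptions made for illustration only; this is not the model-based significance testing discussed in the review.

```python
from itertools import combinations


def pairwise_identity(a, b, gap="-"):
    """Fraction of identical residues over columns where at least one unit has a residue."""
    compared = 0
    matches = 0
    for x, y in zip(a, b):
        if x == gap and y == gap:
            continue  # column is a gap in both units: not informative
        compared += 1
        if x == y:
            matches += 1
    return matches / compared if compared else 0.0


def describe_tandem_repeat(units, gap="-"):
    """Summarise a TR from its aligned units (equal-length strings).

    Hypothetical helper: returns the aligned unit length, the number of
    units, and the mean pairwise identity among units.
    """
    if not units:
        raise ValueError("at least one unit is required")
    width = len(units[0])
    if any(len(u) != width for u in units):
        raise ValueError("units must be aligned to the same length")

    pairs = list(combinations(units, 2))
    if pairs:
        mean_identity = sum(pairwise_identity(a, b, gap) for a, b in pairs) / len(pairs)
    else:
        mean_identity = 1.0  # a single unit is trivially identical to itself

    return {
        "unit_length": width,          # alignment columns, gaps included
        "n_units": len(units),
        "mean_identity": mean_identity,
    }


if __name__ == "__main__":
    # Toy TR with three units; the second unit carries one point mutation.
    print(describe_tandem_repeat(["ACGT", "ACGA", "ACGT"]))
```

Identical units give a mean identity of 1.0, and the value drops as point mutations and indels accumulate, which is the fading similarity the paragraph describes. Raw identity is a crude proxy; it does not by itself indicate whether the units share a common ancestor.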

