Short tandem repeats in human exons: a target for disease mutations.

Madsen BE, Villesen P, Wiuf C - BMC Genomics (2008)

Bottom Line: In contrast to longer tandem repeats, STRs as defined here are present in exons of most known human genes (92%); 99% of all STR sequences in exons are shorter than 33 base pairs, and 62% of all STR sequences are imperfect repeats. These results are preserved when we limit the analysis to STRs outside known longer tandem repeats. Based on our findings we conclude that STRs represent hypermutable regions in the human genome that are linked to human disease.


Affiliation: Bioinformatics Research Center, University of Aarhus, DK-8000 Aarhus C, Denmark. eskerod@birc.au.dk

ABSTRACT

Background: In recent years it has been demonstrated that structural variations, such as indels (insertions and deletions), are common throughout the genome, but the implications of structural variations are still not clearly understood. Long tandem repeats (e.g. microsatellites or simple repeats) are known to be hypermutable (indel-rich), but are rare in exons and only occasionally associated with diseases. Here we focus on short (imperfect) tandem repeats (STRs) which fall below the radar of conventional tandem repeat detection, and investigate whether STRs are targets for disease-related mutations in human exons. In particular, we test whether they share the hypermutability of the longer tandem repeats and whether disease-related genes have a higher STR content than non-disease-related genes.

Results: We show that validated human indels are extremely common in STR regions compared to non-STR regions. In contrast to longer tandem repeats, STRs as defined here are present in exons of most known human genes (92%); 99% of all STR sequences in exons are shorter than 33 base pairs, and 62% of all STR sequences are imperfect repeats. We also demonstrate that STRs are significantly overrepresented in disease-related genes in both human and mouse. These results are preserved when we limit the analysis to STRs outside known longer tandem repeats.

Conclusion: Based on our findings we conclude that STRs represent hypermutable regions in the human genome that are linked to human disease. In addition, STRs constitute an obvious target when screening for rare mutations, because of the relatively low amount of STRs in exons (1,973,844 bp) and the limited length of STR regions.


Figure 5: STR content in the exons of human disease genes. Absolute STR amount for human reference genes and the four sets of disease genes, with the number of genes shown in parentheses. Genes are ranked by absolute STR content, with the STR-poorest genes to the left. Note that virtually all disease genes harbour STRs in their exons.

Mentions: First, we found that exons of disease-related genes are generally longer than those of reference genes, and also that the amount of STRs in exons is larger in disease genes than in reference genes (Figure 5 and Additional file 1: Figure S2). To compare the amount of STRs in different subsets of genes we therefore used the relative amount of STRs in a gene, i.e. the length of STRs in the gene relative to the length of the gene. We found that all four subsets of disease-related genes had significantly higher relative amounts of STR regions in exons than non-disease-related genes, and that almost all disease-related genes have STRs in their exons (Table 1, Additional file 1: Figure S2). In contrast, this is not true if we consider introns instead of exons (Additional file 1: Table S4). To validate the findings, we replicated the analysis in mouse using data from the Mouse Genome Database (MGD) [17] and obtained similar results (Table 1, Additional file 1: Figure S2).
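
The relative STR amount described above can be illustrated with a minimal Python sketch. This is not the authors' code: the `Gene` record, its field names, and the representation of STR regions as half-open intervals in exon coordinates are assumptions made for illustration, and "length of the gene" is taken here to mean total exonic length. Overlapping STR intervals are merged so that no base is counted twice.

```python
from dataclasses import dataclass


@dataclass
class Gene:
    # Hypothetical record; field names are illustrative, not from the paper.
    name: str
    exon_length: int                       # total exonic bases (assumed denominator)
    str_intervals: list[tuple[int, int]]   # half-open [start, end) STR spans


def merged_length(intervals: list[tuple[int, int]]) -> int:
    """Total number of bases covered by the intervals, counting overlaps once."""
    total = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end:
            # Close the previous run (if any) and open a new one.
            if cur_end is not None:
                total += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlapping or adjacent: extend the current run.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start
    return total


def relative_str_content(gene: Gene) -> float:
    """STR bases divided by exonic bases for one gene."""
    if gene.exon_length <= 0:
        raise ValueError("exon_length must be positive")
    return merged_length(gene.str_intervals) / gene.exon_length


# Made-up example values, purely for illustration.
example = Gene("GENE_A", exon_length=1000,
               str_intervals=[(10, 30), (20, 40), (100, 110)])
print(f"{example.name}: {relative_str_content(example):.3f}")
```

Normalising by length in this way is what makes the comparison between gene sets meaningful, since disease-related genes have longer exons and would otherwise show more STR bases simply because of their size.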

