Tandem repeats modify the structure of human genes hosted in segmental duplications.

De Grassi A, Ciccarelli FD - Genome Biol. (2009)

Bottom Line: Here we specifically investigated the effect of variations in internal tandem repeats (ITRs) on the gene structure of human paralogs located in segmental duplications. We found that around 7% of the primate-specific genes located within duplicated regions of the genome contain variable tandem repeats. The resulting effect is the production of a variety of primate-specific proteins, which mostly differ in number and sequence of amino acid repeats.

Affiliation: Department of Experimental Oncology, European Institute of Oncology, IFOM-IEO Campus, Via Adamello, 20139 Milan, Italy. anna.degrassi@ifom-ieo-campus.it

ABSTRACT

Background: Recently duplicated genes are often subject to genomic rearrangements that can lead to the development of novel gene structures. Here we specifically investigated the effect of variations in internal tandem repeats (ITRs) on the gene structure of human paralogs located in segmental duplications.

Results: We found that around 7% of the primate-specific genes located within duplicated regions of the genome contain variable tandem repeats. These genes are members of large groups of recently duplicated paralogs that are often polymorphic in the human population. Half of the identified ITRs occur within coding exons and may be either kept or spliced out from the mature transcript. When ITRs reside within exons, they encode variable amino acid repeats. When located at exon-intron boundaries, ITRs can generate alternative splicing patterns through the formation of novel introns.

Conclusions: Our study shows that variation in the number of ITRs impacts on recently duplicated genes by modifying their coding sequence, splicing pattern, and tissue expression. The resulting effect is the production of a variety of primate-specific proteins, which mostly differ in number and sequence of amino acid repeats.

Figure 3: Length of variable ITRs compared to all ITRs in SDs and in the human genome. Compared are (a) the total length of the repeats, (b) the length of the repeat unit, and (c) the number of repeat units between the variable ITRs that modify the gene structure (grey) and all other exonic and non-exonic ITRs in SDs (pink) and in the rest of the human genome (light-blue). ITR modifications occur preferentially through the repetition of minisatellites and are depleted in short repeats.

Mentions: Variable ITRs responsible for gene modifications are composed, on average, of 30-bp units that are repeated 4 times for a total length of 160 bp (Table S1 in Additional file 2). When compared to all ITRs within exonic and non-exonic regions hosted in SDs as well as in the whole human genome, variable ITRs affecting the gene structure are significantly longer (Figure 3a) as a consequence of longer units (Figure 3b) rather than of higher numbers of repetitions (Figure 3c). Therefore, ITR-driven modifications of genes hosted in SDs are preferentially mediated by minisatellites. This result can be explained by several concomitant reasons. First, it partly reflects the fact that we focused on almost identical regions, thus favoring the detection of longer repeat units. As a general trend, ITRs lying in SDs have, on average, repeat units significantly longer than ITRs dispersed in the rest of the human genome (Figure 3b). Second, long repeats are more variable than short repeats, probably because they enlarge the target sequence for slippage or unequal crossover [35]. Finally, the absence of variable ITRs with repeat units shorter than 9 bp (Figure 3b) suggests a preferential retention of repeats that can significantly diversify the sequence of the encoded proteins.
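
The three quantities compared in Figure 3 (unit length, number of units, total length) can be illustrated with a minimal Python sketch. Everything in it is hypothetical: the `TandemRepeat` class, the 10-bp cut-off used to label a repeat as a minisatellite, and the example values are assumptions made for illustration, not data or methods from the paper. The sketch also shows a general arithmetic point: when the three averages are taken separately over a set of repeats, the mean total length need not equal the mean unit length multiplied by the mean number of units. Whether this accounts for the figures quoted above is not stated in the source.

```python
from dataclasses import dataclass
from statistics import mean

# Assumed cut-off for illustration only; conventions differ between studies.
MINISATELLITE_MIN_UNIT_BP = 10


@dataclass(frozen=True)
class TandemRepeat:
    """Hypothetical record for one internal tandem repeat (ITR)."""
    unit_bp: int  # length of a single repeat unit, in base pairs
    copies: int   # number of times the unit is repeated

    @property
    def total_bp(self) -> int:
        # Total repeat length = unit length x number of units.
        return self.unit_bp * self.copies

    @property
    def kind(self) -> str:
        # Label by unit length, using the assumed cut-off above.
        if self.unit_bp >= MINISATELLITE_MIN_UNIT_BP:
            return "minisatellite"
        return "microsatellite"


def summarize(repeats):
    """Average each quantity separately over a collection of repeats."""
    return {
        "mean_unit_bp": mean(r.unit_bp for r in repeats),
        "mean_copies": mean(r.copies for r in repeats),
        "mean_total_bp": mean(r.total_bp for r in repeats),
    }


# Made-up values chosen for the demonstration; NOT taken from Table S1.
EXAMPLE = [TandemRepeat(12, 2), TandemRepeat(30, 3), TandemRepeat(48, 7)]
summary = summarize(EXAMPLE)

if __name__ == "__main__":
    for r in EXAMPLE:
        print(r.unit_bp, r.copies, r.total_bp, r.kind)
    print(summary)
```

With these made-up values, the mean unit length is 30 bp and the mean number of units is 4, yet the mean total length is 150 bp rather than 120 bp, because longer units happen to carry more copies in this toy set.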

