Structure and expression analysis of rice paleo duplications.

Throude M, Bolot S, Bosio M, Pont C, Sarda X, Quraishi UM, Bourgis F, Lessard P, Rogowsky P, Ghesquiere A, Murigneux A, Charmet G, Perez P, Salse J - Nucleic Acids Res. (2009)

Bottom Line: Improved sequence alignment criteria were used to characterize 10 major chromosome-to-chromosome duplication relationships associated with 1440 paralogous pairs, covering 47.8% of the rice genome, with 12.6% of genes conserved within sister blocks. Using a micro-array experiment, a genome-wide expression map has been produced, in which 2382 genes show significant differences of expression in root, leaf and grain. On the basis of a Gene Ontology analysis, we have identified and characterized the gene families that have been structurally and functionally preferentially retained in the duplication, showing that the vast majority (>85%) of duplicated genes have been either lost, subfunctionalized or neofunctionalized during 50-70 million years of evolution.


Affiliation: UMR 1095 INRA/UBP, Génétique, Diversité et Ecophysiologie des Céréales (GDEC), Domaine de Crouelle, 234, 63100 Clermont Ferrand, France.

ABSTRACT
Having a well-known history of genome duplication, rice is a good model for studying the structural and functional evolution of paleo duplications. Improved sequence alignment criteria were used to characterize 10 major chromosome-to-chromosome duplication relationships associated with 1440 paralogous pairs, covering 47.8% of the rice genome, with 12.6% of genes conserved within sister blocks. Using a micro-array experiment, a genome-wide expression map has been produced, in which 2382 genes show significant differences of expression in root, leaf and grain. By integrating both structural (1440 paralogous pairs) and functional information (2382 differentially expressed genes), we identified 115 paralogous gene pairs for which at least one copy is differentially expressed in one of the three tissues. A vast majority of the 115 paralogous gene pairs have been neofunctionalized or subfunctionalized, as 88%, 89% and 96% of duplicates expressed in grain, leaf and root, respectively, show distinct expression patterns. On the basis of a Gene Ontology analysis, we have identified and characterized the gene families that have been structurally and functionally preferentially retained in the duplication, showing that the vast majority (>85%) of duplicated genes have been either lost, subfunctionalized or neofunctionalized during 50-70 million years of evolution.

© Copyright Policy - creative-commons

Figure 2: Co-regulation pattern of 32 transcription factors. (A) The average number (±SD) of genes expressed in the same tissue as each of the 32 transcription factors, within a physical window of 100, 300 and 600 genes, is shown for the grain, leaf and root micro-array data. The number of genes expected to be expressed at random in each of the three tissues for the same physical window, based on the whole eMAP, is indicated (closed triangle). (B) Expression pattern of the genes expressed in the grain within a 100-gene window centered on each of the 24 TFs expressed in the grain (five stages). Each of the 24 boxes shows the expression profile of a single TF (red) together with the other genes (blue) expressed in the grain within the 100-gene physical window centered on that TF.

Mentions: Ren et al. (28) recently reported the presence of coexpression domains covering ∼5% of the rice genome, based on a data set of 14 789 differentially expressed genes from Affymetrix experiments. Moreover, several studies have suggested that coexpressed genes may participate in the same biological pathway (29). To test this hypothesis, 32 annotated transcription factors (TFs) [from a total of 373 TFs in TIGR v4 (30)] were selected because they are associated with one or several expression profiles in our data set in root (10 TFs), grain (24 TFs) and leaf (five TFs); they are highlighted with green stars in Figure 1A. We performed a gene expression correlation analysis based on these TFs (cf. Materials and Methods section). Windows of 100, 300 or 600 genes centered on each TF were selected, and the average number of genes within each physical window that were expressed in the same tissue was calculated. For the 24 TFs expressed in grain, 5.5 ± 2.5, 13.9 ± 3.2 and 26.9 ± 6.5 genes were co-regulated, i.e. expressed in the grain, for the three physical windows, respectively. Given that, for the whole rice eMAP, a total of 1770 (4.1%) genes were expressed in grain, a random co-regulation value would be 4.1% of 100 (i.e. 4.1 genes), 300 (i.e. 12.5 genes) and 600 (i.e. 24.6 genes) genes for each physical window considered. For the five TFs expressed in leaf, 2.8 ± 1.1, 7.2 ± 2.6 and 12.8 ± 4.5 genes were co-regulated, respectively, for the three physical windows considered. Since, for the whole rice eMAP, a total of 772 among 42 653 (1.8%) genes were expressed in leaves, a random co-regulation value would be 1.8% of 100 (i.e. 1.8 genes), 300 (i.e. 5.4 genes) and 600 (i.e. 10.8 genes) genes for each physical window. Finally, for the 10 TFs expressed in root, 2.6 ± 1.4, 7.2 ± 3.2 and 12 ± 5.1 genes were co-regulated, respectively, for the three physical windows considered. Since the whole rice eMAP shows a total of 803 among 42 653 (1.8%) genes expressed in roots, a random co-regulation value would be 1.8% of 100 (i.e. 1.8 genes), 300 (i.e. 5.6 genes) and 600 (i.e. 11.2 genes) genes for each physical window considered (cf. Figure 2A).
The co-regulation concept can be formulated as a hypothesis: the phenomenon exists if the average number of genes expressed in the same tissue within a 100-, 300- or 600-gene window centered on the 32 TFs is higher than what could be expected at random (based on the whole eMAP). The number of genes expected to be expressed in the same tissue within a gene window is defined by taking into account that, at the genome-wide level, 1770, 772 and 803 genes are expressed in the grain, the leaf and the root, respectively. The co-regulation effect is visible in Figure 2A, where, for every gene window centered on the 32 TFs, the average number of genes expressed in the same tissue is higher than what could be expected at random. Figure 2B, however, shows the 24 TFs expressed in grain together with the genes expressed in the same tissue within a 100-gene window. Even though a clear co-regulation effect has been identified at the tissue level (cf. Figure 2A), when the detailed expression kinetics are considered, the expression patterns of the genes within each cluster are very different and not correlated (Pearson cut-off value of 0.52), with the exception of clusters #13, #21 and #22. If the co-regulation phenomenon does exist for given plant tissues, based on their developmental kinetics, it is only moderate and has to be considered with caution.
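The random baseline used in the passage above is a simple proportion: the genome-wide fraction of genes expressed in a tissue multiplied by the window size. The sketch below reproduces that arithmetic from the counts quoted in the text and compares it with the reported observed means. The function and variable names are illustrative assumptions, not taken from the article, and the code uses unrounded fractions, so some expected values differ slightly from the rounded figures quoted in the text.

```python
# Minimal sketch of the random-expectation arithmetic described in the text.
# All names are hypothetical; the numbers are the ones quoted in the passage.

TOTAL_GENES = 42_653  # genes in the whole rice eMAP
EXPRESSED = {"grain": 1770, "leaf": 772, "root": 803}  # genes expressed per tissue
WINDOWS = (100, 300, 600)  # physical window sizes, in genes

# Reported mean number of genes expressed in the same tissue as the TF,
# one value per window size (SD values are omitted here).
OBSERVED = {
    "grain": (5.5, 13.9, 26.9),
    "leaf": (2.8, 7.2, 12.8),
    "root": (2.6, 7.2, 12.0),
}


def expected_count(tissue: str, window: int) -> float:
    """Genes expected by chance in a window: genome-wide fraction x window size."""
    return EXPRESSED[tissue] / TOTAL_GENES * window


def enrichment_table():
    """Return (tissue, window, observed, expected, observed/expected) rows."""
    rows = []
    for tissue, observed in OBSERVED.items():
        for window, obs in zip(WINDOWS, observed):
            exp = expected_count(tissue, window)
            rows.append((tissue, window, obs, exp, obs / exp))
    return rows


if __name__ == "__main__":
    for tissue, window, obs, exp, ratio in enrichment_table():
        print(f"{tissue:5s} window={window:3d} "
              f"observed={obs:5.1f} expected={exp:5.1f} ratio={ratio:.2f}")
```

Under these inputs every observed mean lies above its chance expectation, in line with the tissue-level effect described for Figure 2A, but the ratios stay modest (between roughly 1.05 and 1.6). The comparison ignores the reported standard deviations, so it illustrates the baseline calculation only and is not a significance test.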

