A novel bioinformatics pipeline to discover genes related to arbuscular mycorrhizal symbiosis based on their evolutionary conservation pattern among higher plants.

Favre P, Bapaume L, Bossolini E, Delorenzi M, Falquet L, Reinhardt D - BMC Plant Biol. (2014)

Bottom Line: However, genes that are members of functionally redundant gene families, or genes that have a vital function and therefore result in lethal mutant phenotypes, are difficult to identify. As a result we present a list of yet uncharacterized proteins that show a strongly AM-related pattern of sequence conservation, indicating that the respective genes may have been under selection for a function in AM. This strategy can be applied to diverse other biological phenomena if species with established genome sequences fall into distinguished groups that differ in a defined functional trait of interest.


ABSTRACT

Background: Genes involved in arbuscular mycorrhizal (AM) symbiosis have been identified primarily by mutant screens, followed by identification of the mutated genes (forward genetics). In addition, a number of AM-related genes have been identified by their AM-related expression patterns, and their function has subsequently been elucidated by knock-down or knock-out approaches (reverse genetics). However, genes that are members of functionally redundant gene families, or genes that have a vital function and therefore result in lethal mutant phenotypes, are difficult to identify. If such genes are constitutively expressed and therefore escape differential expression analyses, they remain elusive. The goal of this study was to systematically search for AM-related genes with a bioinformatics strategy that is insensitive to these problems. The central element of our approach is based on the fact that many AM-related genes are conserved only among AM-competent species.

Results: Our approach involves genome-wide comparisons at the proteome level of AM-competent host species with non-mycorrhizal species. Using a clustering method, we first established orthologous/paralogous relationships and subsequently identified protein clusters that contain members only of the AM-competent species. Proteins of these clusters were then analyzed in an extended set of 16 plant species and ranked based on their relatedness among AM-competent monocot and dicot species, relative to non-mycorrhizal species. In addition, we combined the information on the protein-coding sequence with gene expression data and with promoter analysis. As a result, we present a list of as yet uncharacterized proteins that show a strongly AM-related pattern of sequence conservation, indicating that the respective genes may have been under selection for a function in AM. Among the top candidates are three genes that encode a small family of similar receptor-like kinases that are related to the S-locus receptor kinases involved in sporophytic self-incompatibility.

Conclusions: We present a new systematic strategy of gene discovery based on conservation of the protein-coding sequence that complements classical forward and reverse genetics. This strategy can be applied to diverse other biological phenomena if species with established genome sequences fall into distinguished groups that differ in a defined functional trait of interest.
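The cluster-selection step described in the Results (keeping only protein clusters whose members all come from AM-competent species) can be illustrated with a minimal sketch. It assumes the clustering has already been done by some external tool and is available as a mapping from cluster ID to (species, protein ID) pairs. The function name, the data layout, and the placeholder species labels are hypothetical and are not taken from the authors' pipeline.

```python
from typing import Dict, List, Set, Tuple

# A cluster is a list of (species, protein_id) pairs (hypothetical layout).
Cluster = List[Tuple[str, str]]


def am_only_clusters(clusters: Dict[str, Cluster],
                     am_species: Set[str]) -> Dict[str, Cluster]:
    """Keep clusters whose members all belong to AM-competent species."""
    kept = {}
    for cluster_id, members in clusters.items():
        species_in_cluster = {species for species, _ in members}
        # Non-empty, and no member from outside the AM-competent set.
        if species_in_cluster and species_in_cluster <= am_species:
            kept[cluster_id] = members
    return kept


# Placeholder labels: AM_* are AM-competent, NM_* non-mycorrhizal.
example_clusters = {
    "c1": [("AM_1", "p1"), ("AM_2", "p2")],
    "c2": [("AM_1", "p3"), ("NM_1", "p4")],
    "c3": [("NM_1", "p5")],
}
selected = am_only_clusters(example_clusters, {"AM_1", "AM_2"})
print(sorted(selected))
```

In this toy input only cluster c1 is retained, because c2 and c3 each contain a protein from a non-mycorrhizal species.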


Figure 6: Conservation ratios of potentially AM-related proteins averaged for relevant plant groups with significant difference between non-AM and AM species. Histograms represent the frequency distributions of the ratios of log10 of the E-values from psi-blast. The query sequences for psi-blast were generated by calculating MSA consensus sequences based on the results of Task3. These query sequences were blasted against AM-competent dicot species (group A), monocot species (group B), and non-mycorrhizal dicot species (group C) (compare with Additional file 12: Table S4). To derive conservation ratios, the E-values were averaged group-wise for groups A, B, and C, respectively, and the following ratios were generated: C/A and C/B. Conservation ratios were included only if the difference between the groups was significant (p < 0.05 for Wilcoxon test, compare with Additional file 12: Table S4). (a) Ratios for log10(group C)/log10(group A). (b) Ratios for log10(group C)/log10(group B). Note that 8 control genes from Additional file 5: Table S1 passed the filter in (a), whereas none passed in (b).

Mentions: The E-values were then compared between the groups by Wilcoxon test (Figure 2, P4; see P4_eval_wilcox.R in Additional file 3: File S2) to identify genes for which the populations of E-values between groups C and A, or between C and B, were significantly different (Additional file 12: Table S4). For proteins that exhibited significant differences, the E-values were averaged within each of the three groups, and the ratios between the log10 of these values for C/A and C/B were calculated as a relative measure of AM-related sequence conservation (conservation ratio). The higher the conservation ratio, the farther the non-mycorrhizal homologues are from the consensus sequences relative to the homologues from AM-competent species, indicative of AM-related conservation. Establishing the frequency distribution of the conservation ratios revealed that several of our test genes, such as SYMRK, VAPYRIN, RAM1, RAM2, and PT4, passed this filter (Figure 6a); hence their pattern of sequence conservation was significantly related to the competence to engage in AM symbiosis. Surprisingly, none of the test genes passed the comparison between non-mycorrhizal dicots and mycorrhizal monocots (Figure 6b), although at least VAPYRIN, RAM1, and PT4 are more closely related between the AM-competent dicots and the monocots than between the AM-competent and the non-mycorrhizal dicots [16] (Figure 1a,b).
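The filter-then-ratio computation described above can be sketched in a few lines. This is not the authors' P4_eval_wilcox.R script: it is a Python stand-in that uses a normal-approximation rank-sum test, which is crude for small groups. The function names and the E-values are hypothetical. The order of operations (average the E-values per group, then take log10, then divide) is one reading of the text. E-values of exactly 0, or a group mean of exactly 1, would need special handling that is omitted here.

```python
import math
from statistics import NormalDist, mean
from typing import Optional, Sequence


def rank_sum_p(x: Sequence[float], y: Sequence[float]) -> float:
    """Two-sided rank-sum p-value (normal approximation, no tie or
    continuity correction); a stand-in for an exact Wilcoxon test."""
    n1, n2 = len(x), len(y)
    values = list(x) + list(y)
    order = sorted(range(n1 + n2), key=values.__getitem__)
    ranks = [0.0] * (n1 + n2)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average 1-based rank of a tied run
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u1 - mu) / sigma
    return 2 * (1 - NormalDist().cdf(abs(z)))


def conservation_ratio(non_am: Sequence[float], am: Sequence[float],
                       alpha: float = 0.05) -> Optional[float]:
    """log10(mean E-value, non-AM group) / log10(mean E-value, AM group),
    or None when the two groups do not differ significantly."""
    if rank_sum_p(non_am, am) >= alpha:
        return None
    return math.log10(mean(non_am)) / math.log10(mean(am))


# Hypothetical E-values for one query consensus sequence.
group_a = [1e-120, 1e-110, 1e-115, 1e-100, 1e-105]  # AM-competent dicots
group_c = [1e-40, 1e-35, 1e-50, 1e-30, 1e-45]       # non-mycorrhizal dicots
ratio_c_a = conservation_ratio(group_c, group_a)
print(ratio_c_a)
```

The same call with the monocot E-values (group B) in place of `group_a` would give the C/B ratio. Queries for which the function returns `None` are the ones that the significance filter excludes from the histograms.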

