A novel bioinformatics pipeline to discover genes related to arbuscular mycorrhizal symbiosis based on their evolutionary conservation pattern among higher plants.

Favre P, Bapaume L, Bossolini E, Delorenzi M, Falquet L, Reinhardt D - BMC Plant Biol. (2014)

Bottom Line: However, genes that are members of functionally redundant gene families, or genes that have a vital function and therefore result in lethal mutant phenotypes, are difficult to identify. As a result we present a list of yet uncharacterized proteins that show a strongly AM-related pattern of sequence conservation, indicating that the respective genes may have been under selection for a function in AM. This strategy can be applied to diverse other biological phenomena if species with established genome sequences fall into distinct groups that differ in a defined functional trait of interest.


ABSTRACT

Background: Genes involved in arbuscular mycorrhizal (AM) symbiosis have been identified primarily by mutant screens, followed by identification of the mutated genes (forward genetics). In addition, a number of AM-related genes have been identified by their AM-related expression patterns, and their function has subsequently been elucidated by knock-down or knock-out approaches (reverse genetics). However, genes that are members of functionally redundant gene families, or genes that have a vital function and therefore result in lethal mutant phenotypes, are difficult to identify. If such genes are constitutively expressed and therefore escape differential expression analyses, they remain elusive. The goal of this study was to systematically search for AM-related genes with a bioinformatics strategy that is insensitive to these problems. The central element of our approach is based on the fact that many AM-related genes are conserved only among AM-competent species.

Results: Our approach involves genome-wide comparisons at the proteome level of AM-competent host species with non-mycorrhizal species. Using a clustering method we first established orthologous/paralogous relationships and subsequently identified protein clusters that contain members only of the AM-competent species. Proteins of these clusters were then analyzed in an extended set of 16 plant species and ranked based on their relatedness among AM-competent monocot and dicot species, relative to non-mycorrhizal species. In addition, we combined the information on the protein-coding sequence with gene expression data and with promoter analysis. As a result we present a list of yet uncharacterized proteins that show a strongly AM-related pattern of sequence conservation, indicating that the respective genes may have been under selection for a function in AM. Among the top candidates are three genes that encode a small family of similar receptor-like kinases that are related to the S-locus receptor kinases involved in sporophytic self-incompatibility.

Conclusions: We present a new systematic strategy of gene discovery based on conservation of the protein-coding sequence that complements classical forward and reverse genetics. This strategy can be applied to diverse other biological phenomena if species with established genome sequences fall into distinct groups that differ in a defined functional trait of interest.
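
The ranking step described in the Results (relatedness among AM-competent species relative to non-mycorrhizal species) can be illustrated with a minimal sketch. The abstract does not specify the scoring formula, so the score used here (mean percent identity to AM-competent hits minus the best identity to any non-mycorrhizal hit) is purely an assumption for illustration, as are the function names and the example values.

```python
from statistics import mean


def conservation_score(am_identities, non_am_identities):
    """Hypothetical score: mean identity to AM-competent hits minus the
    best identity to a non-mycorrhizal hit (0.0 if there is no such hit)."""
    best_non_am = max(non_am_identities, default=0.0)
    return mean(am_identities) - best_non_am


def rank_candidates(candidates):
    """Return candidate names sorted by descending hypothetical score."""
    return sorted(
        candidates,
        key=lambda name: conservation_score(*candidates[name]),
        reverse=True,
    )


# Invented example values (percent identity), not data from the study.
candidates = {
    "protA": ([80.0, 70.0, 75.0], [30.0, 25.0]),
    "protB": ([60.0, 65.0], [55.0]),
    "protC": ([90.0, 85.0], []),
}
print(rank_candidates(candidates))
```

A candidate that is well conserved among AM-competent species but has only weak or no counterparts in non-mycorrhizal species ranks highest under this assumed score.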

Figure 2: Strategy used to identify AM-related genes based on sequence conservation. The flow chart reflects the stepwise identification of potential AM-related proteins based on their pattern of sequence conservation at the protein level, the pattern of gene expression, and predicted regulatory elements in their promoters. Sp: Plant species. P0-P4 correspond to protocols, files, or scripts provided in supplementary materials (Additional file 4: File S3). Blue boxes: Databases; green boxes: Tools and processes; pink boxes: intermediate outputs; red boxes: final outputs.

Mentions: To identify AM-related genes in a systematic way, we first applied the clustering software Hieranoid [25] to the proteome sequences of six AM-competent species, namely Medicago truncatula (Mtr), Glycine max (Gma), Vitis vinifera (Vvi), Solanum lycopersicum (Sly), Solanum tuberosum (Stu) and Populus trichocarpa (Ptr), and three non-mycorrhizal species, namely Arabidopsis thaliana (Ath), Arabidopsis lyrata (Aly), and Brassica rapa (Bra). Pairwise clustering proceeded based on a conceptual phylogenetic tree of the involved plant species (Additional file 2: Figure S1; Additional file 3: File S2, script P0_HieraProcedure.txt). Figure 2 describes the workflow of our strategy. Briefly, the proteomes of the nine species were used for clustering. The resulting trees of orthologous/paralogous proteins were filtered to yield lists of protein clusters that satisfied certain defined criteria referred to as Task3, Task4, Task9 (see next section). These gene lists were further processed to isolate AM-related genes based on gene expression, on conservation of the protein-coding region, and on the promoter sequence, as discussed in detail in the following sections. Scripts involved in the different processes (P0, P1, P2, P3, and P4 in Figure 2) are provided in Additional file 3: File S2.
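
The core filtering idea (keeping clusters whose members come only from AM-competent species) can be sketched as follows. This is an illustration under stated assumptions, not the authors' scripts: the cluster dictionary, the `Mtr|a1`-style protein identifiers, and the `min_am_species` threshold are hypothetical, and the sketch does not reproduce the Task3, Task4 or Task9 criteria, which are defined elsewhere in the paper. Only the species codes are taken from the text.

```python
# Species codes as given in the text; everything else is hypothetical.
AM_COMPETENT = {"Mtr", "Gma", "Vvi", "Sly", "Stu", "Ptr"}
NON_MYCORRHIZAL = {"Ath", "Aly", "Bra"}


def species_of(protein_id):
    """Return the species code, assuming IDs look like 'Mtr|a1' (hypothetical)."""
    return protein_id.split("|", 1)[0]


def am_only_clusters(clusters, min_am_species=2):
    """Keep clusters that have no non-mycorrhizal member and that cover
    at least `min_am_species` AM-competent species (assumed threshold)."""
    selected = {}
    for name, members in clusters.items():
        species = {species_of(m) for m in members}
        if species & NON_MYCORRHIZAL:
            continue  # a non-mycorrhizal species has a member in this cluster
        if len(species & AM_COMPETENT) >= min_am_species:
            selected[name] = sorted(members)
    return selected


# Invented toy clusters, not data from the study.
clusters = {
    "c1": ["Mtr|a1", "Gma|a2", "Sly|a3"],
    "c2": ["Mtr|b1", "Ath|b2", "Bra|b3"],
    "c3": ["Ptr|c1"],
}
print(am_only_clusters(clusters))
```

In this toy input only `c1` survives: `c2` contains Arabidopsis and Brassica members, and `c3` covers a single AM-competent species.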

