Evolutionary origins of Brassicaceae specific genes in Arabidopsis thaliana.

Donoghue MT, Keshavaiah C, Swamidatta SH, Spillane C - BMC Evol. Biol. (2011)

Bottom Line: Over half of the subset of the 958 lineage-specific genes found only in Arabidopsis thaliana have alignments to intergenic regions in Arabidopsis lyrata, consistent with either de novo origination or differential gene loss and retention, with both evolutionary scenarios explaining the lineage-specific status of these genes. This study comprehensively identifies all of the Brassicaceae-specific genes in Arabidopsis thaliana and identifies how the majority of such lineage-specific genes have arisen. Insights regarding the functional roles of lineage-specific genes are further advanced through identification of enrichment for stress responsiveness in lineage-specific genes, highlighting their likely importance for environmental adaptation strategies.


Affiliation: Department of Biochemistry, University College Cork, Cork, Ireland.

ABSTRACT

Background: All sequenced genomes contain a proportion of lineage-specific genes, which exhibit no sequence similarity to any genes outside the lineage. Despite their prevalence, the origins and functions of most lineage-specific genes remain largely unknown. As more genomes are sequenced, opportunities for understanding the evolutionary origins and functions of lineage-specific genes are increasing.

Results: This study provides a comprehensive analysis of the origins of lineage-specific genes (LSGs) in Arabidopsis thaliana that are restricted to the Brassicaceae family. In this study, lineage-specific genes within the nuclear (1761 genes) and mitochondrial (28 genes) genomes are identified. The evolutionary origins of two-thirds of the lineage-specific genes within the Arabidopsis thaliana genome are also identified. Almost a quarter of lineage-specific genes originate from non-lineage-specific paralogs, while the origins of ~10% of lineage-specific genes are partly derived from DNA exapted from transposable elements (twice the proportion observed for non-lineage-specific genes). Lineage-specific genes are also enriched in genes that have overlapping CDS, which is consistent with such novel genes arising from overprinting. Over half of the subset of the 958 lineage-specific genes found only in Arabidopsis thaliana have alignments to intergenic regions in Arabidopsis lyrata, consistent with either de novo origination or differential gene loss and retention, with both evolutionary scenarios explaining the lineage-specific status of these genes. A smaller number of lineage-specific genes with an incomplete open reading frame across different Arabidopsis thaliana accessions are further identified as accession-specific genes, most likely of recent origin in Arabidopsis thaliana. Putative de novo origination for two of the Arabidopsis thaliana-only genes is identified via additional sequencing across accessions of Arabidopsis thaliana and closely related sister species lineages. We demonstrate that lineage-specific genes have high tissue specificity and low expression levels across multiple tissues and developmental stages. Finally, stress responsiveness is identified as a distinct feature of Brassicaceae-specific genes: these LSGs are enriched for genes responsive to a wide range of abiotic stresses.

Conclusion: Improving our understanding of the origins of lineage-specific genes is key to gaining insights regarding how novel genes can arise and acquire functionality in different lineages. This study comprehensively identifies all of the Brassicaceae-specific genes in Arabidopsis thaliana and identifies how the majority of such lineage-specific genes have arisen. The analysis allows the relative importance (and prevalence) of different evolutionary routes to the genesis of novel ORFs within lineages to be assessed. Insights regarding the functional roles of lineage-specific genes are further advanced through identification of enrichment for stress responsiveness in lineage-specific genes, highlighting their likely importance for environmental adaptation strategies.


Figure 1: Summary of evidence for evolutionary origins of Arabidopsis thaliana lineage-specific genes. The number of LSGs that fit each evolutionary scenario tested, plus the number of LSGs without elucidated origins. Support for gene model expression is provided by an EST or cDNA consistent with the gene model (as listed by TAIR). Support for expression at the locus is provided by an EST, cDNA or microarray probeset (TAIR and ATH1 Affymetrix microarray).

Mentions: In total, 105 gene models in the Arabidopsis thaliana genome have overlapping CDS with another gene model (31 of which are LSGs, while 74 are non-LSGs). Twenty-one LSGs overlap with 21 non-LSGs (presented in Additional file 5), while ten LSGs overlap with other LSGs. Twenty-six (out of 1761) nuclear genome LSGs overlap with the CDS of other gene models. In comparison, 68 (out of 25234) non-LSG models overlap with the CDS of other gene models in the nuclear genome, indicating that LSGs are enriched for overlapping CDS in the nuclear genome (hypergeometric test, p < 0.01). In contrast, we found that LSGs are not enriched for overlapping CDS in the mitochondrial genome. Whilst overlapping CDS are enriched in LSGs, this model of gene evolution only accounts for 21 of the LSGs within the Arabidopsis thaliana genome, i.e. only 1.18% of all LSGs (Figure 1), indicating that overprinting of existing CDS sequence is a relatively rare mechanism for generation of novel LSGs in Arabidopsis thaliana.
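
The enrichment claim for the nuclear genome can be illustrated with a short calculation. The sketch below is one plausible way to set up a one-sided hypergeometric test from the counts quoted above (26 of 1761 LSGs and 68 of 25234 non-LSGs with overlapping CDS). The exact formulation the authors used is not given here, so the choice of population (all nuclear gene models), the one-sided tail, and the function and variable names are assumptions made for illustration only.

```python
from fractions import Fraction
from math import comb


def hypergeom_upper_tail(population, successes, draws, observed):
    """Return P(X >= observed) for X ~ Hypergeometric(population, successes, draws)."""
    total = comb(population, draws)
    upper = min(successes, draws)
    tail = sum(
        comb(successes, k) * comb(population - successes, draws - k)
        for k in range(observed, upper + 1)
    )
    # Exact rational arithmetic avoids overflow in the very large binomials.
    return Fraction(tail, total)


# Counts quoted in the text (nuclear genome only).
lsg_total, lsg_overlap = 1761, 26
non_lsg_total, non_lsg_overlap = 25234, 68

# Assumed setup: the population is every nuclear gene model, a "success" is a
# gene model with overlapping CDS, and the LSGs are the sample drawn from it.
population = lsg_total + non_lsg_total
overlapping = lsg_overlap + non_lsg_overlap

expected = overlapping * lsg_total / population
p_value = float(
    hypergeom_upper_tail(population, overlapping, lsg_total, lsg_overlap)
)

print(f"expected overlapping LSGs under no enrichment: {expected:.2f}")
print(f"one-sided hypergeometric p-value: {p_value:.3g}")
```

Under this setup, roughly six overlapping LSGs would be expected by chance, against the 26 observed, which is in line with the reported p < 0.01.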

