Sequencing and analysis of the gene-rich space of cowpea.

Timko MP, Rushton PJ, Laudeman TW, Bokowiec MT, Chipumuro E, Cheung F, Town CD, Chen X - BMC Genomics (2008)

Bottom Line: With a nuclear genome size estimated at ~620 Mb, the cowpea genome is an ideal target for reduced representation sequencing. The availability of extensive publicly available genomic data for cowpea, a non-model legume with significant importance in the developing world, represents a significant step forward in legume research. The micro- and macrosyntenic relationships detected between cowpea and other cultivated and model legumes should simplify the identification of informative markers for marker-assisted trait selection and map-based gene isolation necessary for cowpea improvement.


Affiliation: Department of Biology, University of Virginia, Charlottesville, Virginia 22903, USA. mpt9g@virginia.edu

ABSTRACT

Background: Cowpea, Vigna unguiculata (L.) Walp., is one of the most important food and forage legumes in the semi-arid tropics because of its drought tolerance and ability to grow on poor-quality soils. Approximately 80% of cowpea production takes place in the dry savannahs of tropical West and Central Africa, mostly by poor subsistence farmers. Despite its economic and social importance in the developing world, cowpea remains to a large extent an underexploited crop. Among the major goals of cowpea breeding and improvement programs is the stacking of desirable agronomic traits, such as disease and pest resistance and response to abiotic stresses. Implementation of marker-assisted selection and breeding programs is severely limited by a paucity of trait-linked markers and a general lack of information on gene structure and organization. With a nuclear genome size estimated at ~620 Mb, the cowpea genome is an ideal target for reduced representation sequencing.

Results: We report here the sequencing and analysis of the gene-rich, hypomethylated portion of the cowpea genome selectively cloned by methylation filtration (MF) technology. Over 250,000 gene-space sequence reads (GSRs) with an average length of 610 bp were generated, yielding ~160 Mb of sequence information. The GSRs were assembled, annotated by BLAST homology searches of four public protein annotation databases and four plant proteomes (A. thaliana, M. truncatula, O. sativa, and P. trichocarpa), and analyzed using various domain and gene modeling tools. A total of 41,260 GSR assemblies and singletons were annotated, of which 19,786 have unique GenBank accession numbers. Within the GSR dataset, 29% of the sequences were annotated using the Arabidopsis Gene Ontology (GO), with the largest categories of assigned function being catalytic activity and metabolic processes, groups that include the majority of cellular enzymes and components of amino acid, carbohydrate and lipid metabolism. A total of 5,888 GSRs had homology to genes encoding transcription factors (TFs) and transcription-associated factors (TAFs), representing about 5% of the total annotated sequences in the dataset. Sixty-two (62) of the 64 well-characterized plant transcription factor (TF) gene families are represented in the cowpea GSRs, and these families are of similar size and phylogenetic organization to those characterized in other plants. The cowpea GSRs also provide a rich source of genes involved in photoperiodic control, symbiosis, and defense-related responses. Comparisons to available databases revealed that about 74% of cowpea ESTs and 70% of all legume ESTs were represented in the GSR dataset. As approximately 12% of all GSRs contain an identifiable simple-sequence repeat, the dataset is a powerful resource for the design of microsatellite markers.

Conclusion: The availability of extensive public genomic data for cowpea, a non-model legume with significant importance in the developing world, represents a significant step forward in legume research. Not only does the gene space sequence enable the detailed analysis of gene structure, gene family organization and phylogenetic relationships within cowpea, but it also facilitates the characterization of syntenic relationships with other cultivated and model legumes, and will contribute to determining patterns of chromosomal evolution in the Leguminosae. The micro- and macrosyntenic relationships detected between cowpea and other cultivated and model legumes should simplify the identification of informative markers for marker-assisted trait selection and map-based gene isolation necessary for cowpea improvement.
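
The abstract notes that roughly 12% of GSRs contain an identifiable simple-sequence repeat (SSR). The following is a minimal Python sketch of how such repeats might be flagged in a set of reads. It is an illustration only: the function names and the minimum-repeat thresholds in `MIN_REPEATS` are hypothetical choices and are not the criteria or software used by the authors.

```python
import re

# Hypothetical thresholds: minimum tandem copies required per motif length.
# Illustrative values only; the paper's SSR criteria are not given in this excerpt.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq):
    """Return sorted (start, motif, copies) tuples for SSRs found in seq."""
    seq = seq.upper()
    hits = []
    for motif_len, min_rep in MIN_REPEATS.items():
        # Capture one motif, then require at least (min_rep - 1) further tandem copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are just a shorter unit repeated (e.g. "ATAT" = "AT" x 2).
            if any(
                motif == motif[:k] * (motif_len // k)
                for k in range(1, motif_len)
                if motif_len % k == 0
            ):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return sorted(hits)


def fraction_with_ssr(reads):
    """Fraction of reads containing at least one SSR (0.0 for an empty list)."""
    if not reads:
        return 0.0
    return sum(1 for read in reads if find_ssrs(read)) / len(reads)


if __name__ == "__main__":
    demo_reads = ["ACGTACGTAC", "GGC" + "AT" * 8 + "CCGT"]
    for read in demo_reads:
        print(read, find_ssrs(read))
    print("fraction with SSR:", fraction_with_ssr(demo_reads))
```

Reads flagged this way would be candidates for microsatellite marker design, with primers placed in the flanking sequence; tightening or loosening the thresholds changes the reported fraction.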


Figure 1: Distribution of molecular function assignment for cowpea GSRs by GO annotation. Gene Ontology (GO) annotations of cowpea GSRs were generated by Arabidopsis RefSeq BLAST searches, and GSRs were assigned molecular functions using the complex search function, level 3 in the tree. A total of 77,591 cowpea sequences were annotated. Shown next to each functional category is the percentage of GSRs in each named category, followed in parentheses by the number of annotated GSRs in the group.

Mentions: To determine whether there was any bias in the enrichment for genes using MF, we made putative functional assignments for the individual GSRs based upon the most significant match obtained from database searches against the Arabidopsis GO annotation categories. As shown in Figure 1, the putative annotations were grouped into three top-level ontologies: cellular component, biological process, and molecular function. Approximately 29% (77,591/263,425) of the cowpea GSRs could be annotated in this way. Among those sequences that could be assigned a functional classification, the largest categories were catalytic activity and metabolic processes, groups that include the majority of cellular enzymes and components of amino acid, carbohydrate and lipid metabolism. Cellular binding activities (e.g., receptors) and gene products involved in cellular response to stimuli make up the second-largest group of gene products. Among the GSRs assigned molecular function by GO annotation, 5,888 GSRs (~11%) had homology to genes encoding transcription factors (TFs) and transcription-associated factors (TAFs). This value is similar to what was found by direct annotation of the GSR assemblies, in which ~5% (1,042/19,786) of the total annotated sequences have this putative function assignment.
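
The proportions quoted in the abstract and in the paragraph above can be checked with simple arithmetic. The short Python sketch below recomputes them from the stated counts; the variable names are illustrative. Note that 5,888 out of 77,591 works out to roughly 7.6%, so the ~11% quoted for TF/TAF homologs presumably uses a narrower denominator (GSRs assigned a molecular function) that is not stated in this excerpt.

```python
# Counts quoted in the abstract and the paragraph above.
TOTAL_GSRS = 263_425            # total gene-space sequence reads
MEAN_READ_LEN_BP = 610          # average read length
GENOME_SIZE_MB = 620            # estimated nuclear genome size
GO_ANNOTATED = 77_591           # GSRs annotated via Arabidopsis GO
TF_GSRS = 5_888                 # GSRs with TF/TAF homology
TF_ASSEMBLIES = 1_042           # annotated assemblies with TF/TAF assignment
ANNOTATED_ASSEMBLIES = 19_786   # annotated sequences with unique accessions


def pct(part, whole):
    """Percentage of `whole` represented by `part`."""
    return 100.0 * part / whole


# Raw sequence yield in megabases (reads x mean length).
raw_yield_mb = TOTAL_GSRS * MEAN_READ_LEN_BP / 1e6

# Raw yield relative to genome size; this is not coverage, since reads overlap.
yield_vs_genome = raw_yield_mb / GENOME_SIZE_MB

go_annotated_pct = pct(GO_ANNOTATED, TOTAL_GSRS)
tf_assembly_pct = pct(TF_ASSEMBLIES, ANNOTATED_ASSEMBLIES)
tf_of_go_pct = pct(TF_GSRS, GO_ANNOTATED)

if __name__ == "__main__":
    print(f"raw yield: {raw_yield_mb:.1f} Mb")
    print(f"raw yield / genome size: {yield_vs_genome:.2f}")
    print(f"GO-annotated GSRs: {go_annotated_pct:.1f}%")
    print(f"TF/TAF among annotated assemblies: {tf_assembly_pct:.1f}%")
    print(f"TF/TAF GSRs over all GO-annotated GSRs: {tf_of_go_pct:.1f}%")
```

The raw yield comes to about 160.7 Mb, consistent with the ~160 Mb reported, and the 29% and ~5% figures follow directly from the stated counts.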

