Sequencing and analysis of the gene-rich space of cowpea.

Timko MP, Rushton PJ, Laudeman TW, Bokowiec MT, Chipumuro E, Cheung F, Town CD, Chen X - BMC Genomics (2008)

Bottom Line: With a nuclear genome size estimated at ~620 Mb, the cowpea genome is an ideal target for reduced representation sequencing. The public availability of extensive genomic data for cowpea, a non-model legume with significant importance in the developing world, represents a significant step forward in legume research. The micro- and macrosyntenic relationships detected between cowpea and other cultivated and model legumes should simplify the identification of informative markers for marker-assisted trait selection and map-based gene isolation necessary for cowpea improvement.


Affiliation: Department of Biology, University of Virginia, Charlottesville, Virginia 22903, USA. mpt9g@virginia.edu

ABSTRACT

Background: Cowpea, Vigna unguiculata (L.) Walp., is one of the most important food and forage legumes in the semi-arid tropics because of its drought tolerance and ability to grow on poor quality soils. Approximately 80% of cowpea production takes place in the dry savannahs of tropical West and Central Africa, mostly by poor subsistence farmers. Despite its economic and social importance in the developing world, cowpea remains to a large extent an underexploited crop. Among the major goals of cowpea breeding and improvement programs is the stacking of desirable agronomic traits, such as disease and pest resistance and response to abiotic stresses. Implementation of marker-assisted selection and breeding programs is severely limited by a paucity of trait-linked markers and a general lack of information on gene structure and organization. With a nuclear genome size estimated at ~620 Mb, the cowpea genome is an ideal target for reduced representation sequencing.

Results: We report here the sequencing and analysis of the gene-rich, hypomethylated portion of the cowpea genome selectively cloned by methylation filtration (MF) technology. Over 250,000 gene-space sequence reads (GSRs) with an average length of 610 bp were generated, yielding ~160 Mb of sequence information. The GSRs were assembled, annotated by BLAST homology searches of four public protein annotation databases and four plant proteomes (A. thaliana, M. truncatula, O. sativa, and P. trichocarpa), and analyzed using various domain and gene modeling tools. A total of 41,260 GSR assemblies and singletons were annotated, of which 19,786 have unique GenBank accession numbers. Within the GSR dataset, 29% of the sequences were annotated using the Arabidopsis Gene Ontology (GO), with the largest categories of assigned function being catalytic activity and metabolic processes, groups that include the majority of cellular enzymes and components of amino acid, carbohydrate and lipid metabolism. A total of 5,888 GSRs had homology to genes encoding transcription factors (TFs) and transcription-associated factors (TAFs), representing about 5% of the total annotated sequences in the dataset. Sixty-two of the 64 well-characterized plant TF gene families are represented in the cowpea GSRs, and these families are of similar size and phylogenetic organization to those characterized in other plants. The cowpea GSRs also provide a rich source of genes involved in photoperiodic control, symbiosis, and defense-related responses. Comparisons to available databases revealed that about 74% of cowpea ESTs and 70% of all legume ESTs were represented in the GSR dataset. As approximately 12% of all GSRs contain an identifiable simple-sequence repeat, the dataset is a powerful resource for the design of microsatellite markers.
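
To illustrate how a set of reads might be screened for simple-sequence repeats, the following is a minimal sketch only. The abstract does not state which tool or thresholds were used, so the function names (find_ssrs, fraction_with_ssr) and the minimum copy numbers in MIN_COPIES are assumptions chosen for illustration, and only perfect tandem repeats are detected.

```python
import re

# Assumed minimum number of tandem copies per motif length (bp).
# These thresholds are illustrative and are not taken from the paper.
MIN_COPIES = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def find_ssrs(seq):
    """Return sorted (start, motif, copies) tuples for perfect tandem repeats in seq."""
    seq = seq.upper()
    hits = []
    for motif_len, min_copies in MIN_COPIES.items():
        # One captured motif followed by at least (min_copies - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_copies - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "ATAT" is "AT" x 2),
            # so each repeat is reported once under its shortest motif.
            if any(
                motif == motif[:k] * (motif_len // k)
                for k in range(1, motif_len)
                if motif_len % k == 0
            ):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return sorted(hits)


def fraction_with_ssr(reads):
    """Fraction of reads containing at least one SSR under the assumed thresholds."""
    reads = list(reads)
    if not reads:
        return 0.0
    return sum(1 for read in reads if find_ssrs(read)) / len(reads)
```

Applied to a full read set, fraction_with_ssr would give a figure comparable in kind to the ~12% reported above, although the value depends strongly on the thresholds chosen.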

Conclusion: The public availability of extensive genomic data for cowpea, a non-model legume with significant importance in the developing world, represents a significant step forward in legume research. Not only does the gene space sequence enable the detailed analysis of gene structure, gene family organization and phylogenetic relationships within cowpea, but it also facilitates the characterization of syntenic relationships with other cultivated and model legumes, and will contribute to determining patterns of chromosomal evolution in the Leguminosae. The micro- and macrosyntenic relationships detected between cowpea and other cultivated and model legumes should simplify the identification of informative markers for marker-assisted trait selection and map-based gene isolation necessary for cowpea improvement.


Figure 2: Mapping of cowpea assemblies and singletons to the M. truncatula pseudomolecules. GSR assemblies and singletons were mapped by tblastx searches to the M. truncatula chromosome-scale pseudomolecules available on the TIGR M. truncatula database. The broad green lines represent tblastx alignments; narrow lines connect High-scoring Segment Pairs (HSPs) derived from the same cowpea sequence. An HSP consists of two sequence fragments of arbitrary but equal length whose alignment is locally maximal and for which the alignment score meets or exceeds a threshold or cutoff score. A: An example of mapping cowpea contigs and singletons to a 40 kb region of chromosome 0 (which represents BACs that have not been anchored to the genetic map). B: A closer view of the same region from 396 k to 404 k. C: A region of M. truncatula chromosome 6 where a single cowpea GSR spans and has high-quality tblastx matches to three distinct IMGAG gene models, indicating microsynteny. M. truncatula gene model AC134521_19 has no match in that region of the cowpea genome. D: A region of M. truncatula chromosome 2 where there are several GSR matches, but no M. truncatula gene model.

Mentions: Using tblastx searches, 42,988 GSRs (24,075 GSR assemblies and 18,913 singletons) could be mapped to the M. truncatula chromosome-scale pseudomolecules available on the TIGR M. truncatula database [43]. The cowpea sequences are broadly distributed among the nine M. truncatula pseudomolecules [see Additional file 3]. Several examples of the mapping are shown in Figure 2. We were able to find over 500 cases where GSR assemblies/singletons map to at least 2 adjacent IMGAG genes along the pseudomolecules, indicating a significant level of microsynteny. We also found examples where, along a syntenic region, there appears to be a gene missing in either cowpea or M. truncatula (Figure 2C). This could be due to a gene insertion or deletion in one of the species. It is unlikely to be due to an annotation error in M. truncatula, since tblastx would detect sequence similarity in this region even if no gene model was predicted.
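
The adjacency criterion described above (one cowpea sequence matching at least 2 adjacent gene models) can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the text does not say how adjacency was computed, so the function name microsyntenic_gsrs, the input layout (pairs of sequence and gene identifiers, plus a lookup from gene identifier to chromosome and rank along the pseudomolecule), and the identifiers in the usage lines are all hypothetical.

```python
from collections import defaultdict


def microsyntenic_gsrs(hits, gene_order):
    """Return the GSR ids that match at least two adjacent gene models.

    hits: iterable of (gsr_id, gene_id) pairs, e.g. parsed from filtered
          tblastx results (the filtering itself is not shown here).
    gene_order: dict mapping gene_id -> (chromosome, rank), where rank is the
          gene's position in the ordered list of gene models on that chromosome.
    """
    positions_by_gsr = defaultdict(set)
    for gsr_id, gene_id in hits:
        if gene_id in gene_order:
            positions_by_gsr[gsr_id].add(gene_order[gene_id])

    result = []
    for gsr_id, positions in positions_by_gsr.items():
        # Adjacent means consecutive ranks on the same chromosome.
        if any((chrom, rank + 1) in positions for chrom, rank in positions):
            result.append(gsr_id)
    return sorted(result)


# Hypothetical identifiers, for illustration only.
example_order = {
    "geneA": ("chr6", 0),
    "geneB": ("chr6", 1),
    "geneC": ("chr6", 2),
    "geneD": ("chr2", 1),
}
example_hits = [
    ("gsr1", "geneA"), ("gsr1", "geneB"),  # adjacent genes on chr6
    ("gsr2", "geneA"), ("gsr2", "geneC"),  # same chromosome, not adjacent
    ("gsr3", "geneB"), ("gsr3", "geneD"),  # different chromosomes
]
syntenic = microsyntenic_gsrs(example_hits, example_order)
```

Counting the returned identifiers over the whole dataset would give a tally of the same kind as the "over 500 cases" mentioned above, subject to whatever score cutoff is applied to the tblastx hits beforehand.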

