CATMA, a comprehensive genome-scale resource for silencing and transcript profiling of Arabidopsis genes.

Sclep G, Allemeersch J, Liechti R, De Meyer B, Beynon J, Bhalerao R, Moreau Y, Nietfeld W, Renou JP, Reymond P, Kuiper MT, Hilson P - BMC Bioinformatics (2007)

Bottom Line: To validate the efficacy of GST mapping criteria and design rules, the predicted and experimentally observed hybridization characteristics associated with GST features were correlated in transcript profiling datasets obtained with the CATMAv2 microarray, confirming the reliability of this platform. These latter 1,533 features constitute the CATMAv4 addition. This resource is used both for the production of spotted microarrays and the large-scale cloning of hairpin RNA silencing vectors.


Affiliation: Department of Plant Systems Biology, VIB, Technologiepark 927, 9052 Ghent, Belgium. gert.sclep@tecnoparco.org

ABSTRACT

Background: The Complete Arabidopsis Transcript MicroArray (CATMA) initiative combines the efforts of laboratories in eight European countries [1] to deliver gene-specific sequence tags (GSTs) for the Arabidopsis research community. The CATMA initiative offers the power and flexibility to regularly update the GST collection according to evolving knowledge about the gene repertoire. These GST amplicons can easily be reamplified and shared, subsets can be picked at will to print dedicated arrays, and the GSTs can be cloned and used for other functional studies. This ongoing initiative has already produced approximately 24,000 GSTs that have been made publicly available for spotted microarray printing and RNA interference.

Results: GSTs from the CATMA version 2 repertoire (CATMAv2, created in 2002) were mapped onto the gene models from two independent Arabidopsis nuclear genome annotation efforts, TIGR5 and PSB-EuGène, to consolidate a list of genes that were targeted by previously designed CATMA tags. A total of 9,027 gene models were not tagged by any amplified CATMAv2 GST, and 2,533 amplified GSTs were no longer predicted to tag an updated gene model. To validate the efficacy of GST mapping criteria and design rules, the predicted and experimentally observed hybridization characteristics associated with GST features were correlated in transcript profiling datasets obtained with the CATMAv2 microarray, confirming the reliability of this platform. To complete the CATMA repertoire, all 9,027 gene models for which no GST had yet been designed were processed with an adjusted version of the Specific Primer and Amplicon Design Software (SPADS). A total of 5,756 novel GSTs were designed and amplified by PCR from genomic DNA. Together with the pre-existing GST collection, this new addition constitutes the CATMAv3 repertoire. It comprises 30,343 unique amplified sequences that tag 24,202 and 23,009 protein-encoding nuclear gene models in the TAIR6 and EuGène genome annotations, respectively. To cover the remaining untagged genes, we identified 543 additional GSTs using less stringent design criteria and designed 990 sequence tags matching multiple members of gene families (Gene Family Tags or GFTs). These latter 1,533 features constitute the CATMAv4 addition.

Conclusion: To update the CATMA GST repertoire, we designed 7,289 additional sequence tags, bringing the total number of tagged TAIR6-annotated Arabidopsis nuclear protein-coding genes to 26,173. This resource is used both for the production of spotted microarrays and the large-scale cloning of hairpin RNA silencing vectors. All information about the resulting updated CATMA repertoire is available through the CATMA database http://www.catma.org.
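The tag counts quoted in the Results and Conclusion can be cross-checked with a few lines of arithmetic. The sketch below only re-adds the figures stated in the abstract; the variable names are illustrative and not taken from the CATMA database.

```python
# Counts quoted in the abstract (variable names are illustrative only).
catmav3_new_gsts = 5756      # novel GSTs designed with the adjusted SPADS
relaxed_gsts = 543           # GSTs found with less stringent design criteria
gene_family_tags = 990       # GFTs matching multiple gene family members

# The CATMAv4 addition is the sum of the relaxed GSTs and the GFTs.
catmav4_addition = relaxed_gsts + gene_family_tags

# All additional sequence tags designed for the update.
total_additional_tags = catmav3_new_gsts + catmav4_addition

print(f"CATMAv4 addition: {catmav4_addition:,}")
print(f"Additional tags designed: {total_additional_tags:,}")
```

Both sums agree with the abstract: 1,533 features in the CATMAv4 addition and 7,289 additional sequence tags overall. The total of 26,173 tagged genes cannot be rebuilt this way, because a single GFT can tag several genes.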



Figure 2: Comparative analysis of the CATMAv2 repertoire and the CATMAv3 addition. Quality comparison between the CATMAv2 repertoire (left) and the CATMAv3 repertoire (right). The top and bottom panels show the distribution of the probes with regard to their mapping location on the cognate gene and the cumulative distribution of the probe specificity, measured as the percentage sequence identity of the best non-trivial BLAST hit when comparing the probe against the TIGR5 genome, respectively.

Mentions: Two parameters were analyzed to investigate whether the new GST set (CATMAv3 addition) has properties similar to the previously designed tags (CATMAv2) and to assess the consistency of the successive GST batches: (1) the relative position of the tag within the gene (Figure 2, top), with the 3' location being preferable to 5', because transcript labeling techniques often include oligo-dT priming steps; (2) the distribution of GST specificity measured as the percentage sequence identity of a tag with the next best non-trivial genome BLAST hit (Figure 2, bottom), indicating the likelihood that GST microarray features might yield nonspecific hybridization. The comparative analysis indicated that both sets are similar with regard to the relative GST position. The CATMAv3 addition did not show an overall decrease in specificity. However, Figure 2 shows that a larger fraction of the next best non-trivial BLAST hits has a sequence identity closer to 70% (the absolute overall cut-off) in v3 than in v2. Although it is reasonable to assume that the performance of the majority of the added features would be similar to that of the CATMAv2 repertoire in microarray experiments [4], we examined more closely the cross-hybridization potential for all GSTs. For this purpose, we calculated the melting temperature of the DNA heteroduplex between each GST and its best non-target BLAST hit sequence with the Baldino formula [15], under commonly adopted CATMA microarray hybridization conditions (4×SSC, 50% formamide, 45°C). Conservatively, a GST was flagged for potential cross-hybridization when this predicted melting temperature was equal to or higher than 45°C. This information is available via the CATMA database [14]. The percentages of flagged GSTs were 3.2% and 9.7% for CATMAv2 and CATMAv3, respectively.
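The flagging rule described above can be sketched in a few lines. The coefficients below are the commonly quoted form of the Baldino formula and are an assumption here; they should be checked against reference [15] before any reuse. The sodium concentration of 0.78 M for 4×SSC (taking 1×SSC as roughly 0.195 M Na+), the function names, and the example inputs are also assumptions for illustration, not values taken from the CATMA pipeline.

```python
import math


def baldino_tm(gc_percent, length_nt, mismatch_percent,
               na_molar=0.78, formamide_percent=50.0):
    """Predicted melting temperature (deg C) of a DNA heteroduplex.

    Uses the commonly quoted Baldino-type expression (assumed, see [15]):
    Tm = 81.5 + 16.6*log10[Na+] + 0.41*(%GC) - 675/L
         - 0.65*(%formamide) - (%mismatch)
    na_molar=0.78 is an assumed Na+ concentration for 4xSSC.
    """
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent
            - 675.0 / length_nt
            - 0.65 * formamide_percent
            - 1.0 * mismatch_percent)


def is_flagged(tm_celsius, hybridization_temp=45.0):
    """Flag a GST when the predicted Tm reaches the hybridization temperature."""
    return tm_celsius >= hybridization_temp


# Hypothetical GST: 200 nt duplex with 50% GC against its best non-target hit.
tm_85_identity = baldino_tm(gc_percent=50.0, length_nt=200, mismatch_percent=15.0)
tm_80_identity = baldino_tm(gc_percent=50.0, length_nt=200, mismatch_percent=20.0)

print(f"85% identity: Tm = {tm_85_identity:.1f} C, flagged = {is_flagged(tm_85_identity)}")
print(f"80% identity: Tm = {tm_80_identity:.1f} C, flagged = {is_flagged(tm_80_identity)}")
```

Under these assumed parameters, the hypothetical 200 nt tag is flagged at 85% identity to its best non-target hit but not at 80%, which illustrates why hits just above the 70% cut-off matter less than near-identical ones.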

