Efficient targeted transcript discovery via array-based normalization of RACE libraries.

Djebali S, Kapranov P, Foissac S, Lagarde J, Reymond A, Ucla C, Wyss C, Drenkow J, Dumais E, Murray RR, Lin C, Szeto D, Denoeud F, Calvo M, Frankish A, Harrow J, Makrythanasis P, Vidal M, Salehi-Ashtiani K, Antonarakis SE, Gingeras TR, Guigó R - Nat. Methods (2008)

Bottom Line: Random clone selection from the RACE mixture, however, is an ineffective sampling strategy if the dynamic range of transcript abundances is large. This approach, RACEarray, is superior to direct cloning and sequencing of RACE products because it specifically targets new transcripts and often results in overall normalization of transcript abundance. We show theoretically and experimentally that this strategy leads indeed to efficient sampling of new transcripts, and we investigated multiplexing the strategy by pooling RACE reactions from multiple interrogated loci before hybridization.


Affiliation: Grup de Recerca en Informàtica Biomèdica, Institut Municipal d'Investigació Mèdica/Universitat Pompeu Fabra, Dr. Aiguader 88, 08003 Barcelona, Spain.

ABSTRACT
Rapid amplification of cDNA ends (RACE) is a widely used approach for transcript identification. Random clone selection from the RACE mixture, however, is an ineffective sampling strategy if the dynamic range of transcript abundances is large. To improve sampling efficiency of human transcripts, we hybridized the products of the RACE reaction onto tiling arrays and used the detected exons to delineate a series of reverse-transcriptase (RT)-PCRs, through which the original RACE transcript population was segregated into simpler transcript populations. We independently cloned the products and sequenced randomly selected clones. This approach, RACEarray, is superior to direct cloning and sequencing of RACE products because it specifically targets new transcripts and often results in overall normalization of transcript abundance. We show theoretically and experimentally that this strategy leads indeed to efficient sampling of new transcripts, and we investigated multiplexing the strategy by pooling RACE reactions from multiple interrogated loci before hybridization.
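
The point about random clone selection can be illustrated with a standard sampling argument. The sketch below is not the paper's own theoretical treatment. It assumes each picked clone is an independent draw and that a transcript's share of clones equals its relative abundance p in the RACE mixture. The function names and the abundance values are hypothetical.

```python
import math


def prob_seen(p: float, n: int) -> float:
    """Probability that a transcript with clone fraction p shows up
    at least once among n independently picked clones."""
    return 1.0 - (1.0 - p) ** n


def clones_needed(p: float, target: float = 0.95) -> int:
    """Smallest number of clones n such that prob_seen(p, n) >= target."""
    if not (0.0 < p < 1.0 and 0.0 < target < 1.0):
        raise ValueError("p and target must lie strictly between 0 and 1")
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - p))


# Hypothetical mixture: one dominant isoform and one rare new transcript.
for label, p in [("dominant isoform", 0.5), ("rare transcript", 0.001)]:
    print(label, clones_needed(p), "clones for a 95% chance of one hit")
```

Under these assumptions the number of clones required grows roughly as 1/p, so a transcript a thousand times rarer than the dominant product needs on the order of a thousand times more sequenced clones. That is the inefficiency that normalizing transcript abundance is meant to remove.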


Figure 3: Genomic coverage of RACEfrags originating from different tissues or combinations of tissues. (a) Total number of nucleotides in RACEfrags as a function of the tissue in which the RACE reaction has been performed. (b) Cumulative number of transcribed nucleotides detected by RACEfrags per tissue. The cumulative coverage is obtained iteratively. At each step, a new tissue is included in the carrying combination of tissues. The tissue included at each step is the one for which RACEfrags include the maximum number of nucleotides in the genome, not previously included in the carrying combination of tissues. Correspondingly, tissues are ordered on the X-axis from left to right, so that the tissue at a given position is the one producing more novel RACEfrags with respect to the RACEfrags produced by tissues to its left on the axis. While this is a heuristic approach that does not guarantee optimality, we believe that for this particular problem, it will certainly produce a nearly optimal ranking.

Mentions: We performed both 5' and 3' RACE of 12 genes mapping on human chromosomes 21 and 22 on polyA+ RNA of 48 cell types (see Table 1 and Methods). Both 5' and 3' RACE reactions for three widely spaced genes per chromosome were pooled and hybridized onto a high-density tiling array of human chromosomes 21 and 22 with 17-nucleotide interrogation resolution. Detailed results are provided in Supplementary Results, but Figure 3 summarizes the main findings. Figure 3a plots the genomic coverage of RACEfrags as a function of the tissue in which the RACE reaction was performed. Not surprisingly, tissues exhibit, in general, higher transcriptional diversity than cell lines, but large variations in the amount of transcribed bases are observed between both tissues and cell lines, consistent with previous results (ref. 19). Figure 3b plots the cumulative genomic coverage as a function of the combination of tissues. As shown, a combination of about 16 cell types already captures about 90% of all detected transcribed nucleotides.
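
The iterative ordering described in the Figure 3b caption is a greedy selection: at each step, add the tissue whose RACEfrags cover the most nucleotides not yet covered. The following is a minimal sketch of that procedure, not the authors' code. It assumes each tissue's RACEfrags have already been expanded into a set of genomic positions. The function names, the alphabetical tie-breaking rule, and the toy data are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def greedy_cumulative_coverage(covered: Dict[str, Set[int]]) -> List[Tuple[str, int]]:
    """Order tissues greedily by newly covered nucleotides.

    Returns (tissue, cumulative number of covered nucleotides) pairs
    in the order the tissues are selected.
    """
    remaining = dict(covered)
    seen: Set[int] = set()
    order: List[Tuple[str, int]] = []
    while remaining:
        # Pick the tissue adding the most positions not yet covered;
        # ties go to the alphabetically first name (an arbitrary choice).
        best = max(sorted(remaining), key=lambda t: len(remaining[t] - seen))
        seen |= remaining.pop(best)
        order.append((best, len(seen)))
    return order


def tissues_needed(order: List[Tuple[str, int]], fraction: float) -> int:
    """Number of leading tissues needed to reach `fraction` of total coverage."""
    if not order:
        return 0
    total = order[-1][1]
    for count, (_, cumulative) in enumerate(order, start=1):
        if cumulative >= fraction * total:
            return count
    return len(order)


# Toy data: positions 0-99 stand in for a small genomic region.
toy = {
    "testis": set(range(0, 60)),
    "brain": set(range(40, 100)),
    "cell_line_x": set(range(50, 70)),  # fully contained in the other two
}
order = greedy_cumulative_coverage(toy)
print(order)                       # [('brain', 60), ('testis', 100), ('cell_line_x', 100)]
print(tissues_needed(order, 0.9))  # 2
```

As the caption notes, this kind of greedy ordering is a heuristic and does not guarantee the optimal combination of tissues. For real data, interval arithmetic on RACEfrag coordinates would be more memory-efficient than explicit position sets.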

