Global identification and characterization of transcriptionally active regions in the rice genome.

Li L, Wang X, Sasidharan R, Stolc V, Deng W, He H, Korbel J, Chen X, Tongprasit W, Ronald P, Chen R, Gerstein M, Deng XW - PLoS ONE (2007)

Bottom Line: Genome tiling microarray studies have consistently documented rich transcriptional activity beyond the annotated genes. However, systematic characterization and transcriptional profiling of the putative novel transcripts on the genome scale are still lacking. These results provide a systematic characterization of non-exonic transcripts in rice and thus expand the current view of the complexity and dynamics of the rice transcriptome.


Affiliation: Department of Molecular, Cellular, and Developmental Biology, Yale University, New Haven, Connecticut, United States of America.

ABSTRACT
Genome tiling microarray studies have consistently documented rich transcriptional activity beyond the annotated genes. However, systematic characterization and transcriptional profiling of the putative novel transcripts on the genome scale are still lacking. We report here the identification of 25,352 and 27,744 transcriptionally active regions (TARs) not encoded by annotated exons in the rice (Oryza sativa) subspecies japonica and indica, respectively. The non-exonic TARs account for approximately two-thirds of the total TARs detected by tiling arrays and represent transcripts likely conserved between japonica and indica. Transcription of 21,018 (83%) japonica non-exonic TARs was verified through expression profiling in 10 tissue types using a re-array in which annotated genes and TARs were each represented by five independent probes. Subsequent analyses indicate that about 80% of the japonica TARs that were not assigned to annotated exons can be assigned to various putatively functional or structural elements of the rice genome, including splice variants, uncharacterized portions of incompletely annotated genes, antisense transcripts, duplicated gene fragments, and potential non-coding RNAs. These results provide a systematic characterization of non-exonic transcripts in rice and thus expand the current view of the complexity and dynamics of the rice transcriptome.


Figure 6 (pone-0000294-g006): Analysis of non-coding intergenic TARs. (A) Scatterplot of GC2 versus GC3 in all gene models (n = 46,976), FL-cDNA-supported PASA gene models (n = 11,494), and intergenic TARs (n = 5,256). The intergenic TARs were distal (>1 kb) to a gene model, excluding those with a hit in the ProSite database. (B) Overlapping TARs with putative non-coding transcripts. A 5-kb region represented by the high-resolution tiling array is shown. The interrogating probes are aligned to the chromosomal coordinates, with the fluorescence intensity value depicted as a vertical line. Gene models, non-exonic TARs, and putative non-coding transcripts are depicted as horizontal arrows that point in the direction of transcription. A portion of the region containing four non-coding transcripts and a pair of TARs is enlarged and shown at the bottom. (C) Predicted secondary structure of the TAR Chr10fwd_10524178. The sequence corresponding to the putative nc-RNA ts_342 is highlighted.

Mentions: The intergenic TARs, distal to a gene model and not overlapping with other elements of the genome, numbered ∼8400 in total and did not appear to encode proteins. Two lines of evidence supported this conclusion. First, the linear relationship between the GC content of the second (GC2) and the third (GC3) codon positions of the longest deduced ORF, chosen from the six possible translations of each TAR, deviated from that of known genes [42]. As shown in Figure 6A, the GC3/GC2 correlation of the intergenic TARs aligned along the diagonal, far from that of the PASA gene models. Consistent with this observation, the TAR-deduced peptide sequences were enriched for amino acids encoded by GC-rich codons, such as Pro and Arg (Figure S6). Although it is possible that some of the putative proteins have legitimate biological functions, the biased codon usage suggests that many of the intergenic TARs may not function through their deduced ORFs.
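The GC2/GC3 comparison described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the excerpt does not say how the "longest deduced ORF" was delimited, so the sketch assumes it is the longest stop-free run of codons in any of the six reading frames, with no start codon required. The function names (`reverse_complement`, `longest_orf`, `gc2_gc3`) and the standard stop codons are likewise assumptions made for illustration.

```python
# Illustrative sketch only; ORF definition and function names are assumptions.
STOP_CODONS = {"TAA", "TAG", "TGA"}          # standard genetic code stops
COMPLEMENT = str.maketrans("ACGT", "TGCA")   # other symbols (e.g. N) pass through


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def longest_orf(seq):
    """Return the longest stop-free run of codons over all six reading frames.

    Assumption: an ORF is any uninterrupted stretch of non-stop codons;
    no ATG start is required. On ties, the first run found is kept.
    """
    seq = seq.upper()
    best = []
    for strand in (seq, reverse_complement(seq)):
        for frame in range(3):
            current = []
            for i in range(frame, len(strand) - 2, 3):
                codon = strand[i:i + 3]
                if codon in STOP_CODONS:
                    if len(current) > len(best):
                        best = current
                    current = []
                else:
                    current.append(codon)
            # A run may extend to the end of the sequence without a stop.
            if len(current) > len(best):
                best = current
    return best


def gc2_gc3(codons):
    """Return the G/C fraction at the second and third codon positions."""
    if not codons:
        raise ValueError("no codons to analyse")
    n = len(codons)
    gc2 = sum(c[1] in "GC" for c in codons) / n
    gc3 = sum(c[2] in "GC" for c in codons) / n
    return gc2, gc3


if __name__ == "__main__":
    tar = "GCCGCGTAA"  # toy sequence, not a real TAR
    orf = longest_orf(tar)
    print(orf, gc2_gc3(orf))
```

Plotting the resulting (GC2, GC3) pairs for a set of TARs against those for annotated gene models would give a scatterplot of the kind described for Figure 6A. Under these assumptions, points lying close to the diagonal would indicate that the second and third codon positions share similar GC content.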

