Transcriptome dynamics-based operon prediction and verification in Streptomyces coelicolor.

Charaniya S, Mehra S, Lian W, Jayapal KP, Karypis G, Hu WS - Nucleic Acids Res. (2007)

Bottom Line: Using a compilation of genome-scale temporal transcriptome data for the model organism, Streptomyces coelicolor, under different environmental and genetic perturbations, we have developed a supervised machine-learning method for operon prediction in this microorganism. Based on model predictions for the entire genome, we verified the co-transcription of more than 250 gene pairs by RT-PCR. These results vastly increase the database of known operons in S. coelicolor and provide valuable information for exploring gene function and regulation to harness the potential of this differentiating microorganism for synthesis of natural products.


Affiliation: Department of Chemical Engineering and Materials Science, University of Minnesota, 421 Washington Avenue SE, Minneapolis, MN 55455-0132, USA.

ABSTRACT
Streptomyces spp. produce a variety of valuable secondary metabolites, which are regulated in a spatio-temporal manner by a complex network of inter-connected gene products. Using a compilation of genome-scale temporal transcriptome data for the model organism, Streptomyces coelicolor, under different environmental and genetic perturbations, we have developed a supervised machine-learning method for operon prediction in this microorganism. We demonstrate that, using features dependent on transcriptome dynamics and genome sequence, a support vector machines (SVM)-based classification algorithm can accurately classify >90% of gene pairs in a set of known operons. Based on model predictions for the entire genome, we verified the co-transcription of more than 250 gene pairs by RT-PCR. These results vastly increase the database of known operons in S. coelicolor and provide valuable information for exploring gene function and regulation to harness the potential of this differentiating microorganism for synthesis of natural products.

Figure 3: Comparison of Pearson correlation between transcript levels of adjacent genes in KOPs and NOPs. The microarray experiments were divided into nine sets and correlation between transcript levels of adjacent genes in every pair was calculated for each set. The histogram of the number of sets in which correlation exceeds 0.7 in (a) Known operon pairs (KOPs); (b) Non-operon pairs (NOPs); (c) Randomly selected pairs.

Temporal transcriptome data obtained from 206 cell samples were divided into nine different sets depending on the experimental design, strains and culture conditions used (Supplementary Table S1). For every KOP, the Pearson correlation between the transcript levels of the adjacent genes was calculated for each of the nine sets, and the number of sets in which the correlation exceeds 0.7 was counted. The KOPs were divided into 10 groups according to the number of sets (0, 1, 2, …, 9) in which transcript correlation exceeds 0.7. Figure 3a shows the distribution of the KOPs in the different groups. Only one out of 149 KOPs has transcript correlation r > 0.7 in all nine sets. Measurement error in transcript levels due to noise may have contributed to the relatively low correlation between genes in KOPs. The presence of an as-yet-unidentified site for internal regulation (an internal promoter or transcription terminator), or differential mRNA degradation, could also reduce the similarity in transcript levels of genes in a KOP. Nonetheless, 58 (39%) KOPs have transcript correlation r > 0.7 in four or more sets. In contrast, only six (5%) NOPs have transcript correlation r > 0.7 in four or more sets (Figure 3b). Further, 78 (64%) NOPs do not satisfy the correlation threshold of 0.7 in any of the nine sets, in contrast to only 18 (12%) KOPs. This separation between KOPs and NOPs is evident even at higher correlation thresholds: thirty-two KOPs have transcript correlation r > 0.8 in four or more sets, in contrast to only one NOP.
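The counting procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names (`pearson`, `count_correlated_sets`, `group_pairs`) and the input layout (one list of per-set expression profiles for each gene of a pair) are assumptions made for the example, and the toy profiles are made-up numbers rather than data from the study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles; None if undefined."""
    n = len(x)
    if n != len(y) or n < 2:
        return None
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None  # a constant profile has no defined correlation
    return sxy / sqrt(sxx * syy)


def count_correlated_sets(gene_a, gene_b, threshold=0.7):
    """Count the experiment sets in which r exceeds the threshold.

    gene_a and gene_b are lists of profiles, one profile per experiment set
    (hypothetical layout chosen for this sketch).
    """
    count = 0
    for profile_a, profile_b in zip(gene_a, gene_b):
        r = pearson(profile_a, profile_b)
        if r is not None and r > threshold:
            count += 1
    return count


def group_pairs(pairs, n_sets=9, threshold=0.7):
    """Histogram where index k is the number of pairs with r > threshold in exactly k sets."""
    histogram = [0] * (n_sets + 1)
    for gene_a, gene_b in pairs:
        histogram[count_correlated_sets(gene_a, gene_b, threshold)] += 1
    return histogram


# Toy example with three sets instead of nine (illustrative values only).
rising = [1.0, 2.0, 3.0, 4.0]
falling = [4.0, 3.0, 2.0, 1.0]
pair_1 = ([rising, rising, rising], [rising, rising, falling])     # correlated in 2 sets
pair_2 = ([rising, rising, rising], [falling, falling, falling])   # correlated in 0 sets
toy_hist = group_pairs([pair_1, pair_2], n_sets=3)
print(toy_hist)
```

Applied to the KOPs or NOPs with `n_sets=9`, the returned list would have ten entries, one per group (0 to 9 sets), which is the quantity plotted as a histogram in Figure 3. Raising `threshold` to 0.8 corresponds to the stricter comparison mentioned at the end of the paragraph.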

