Transcriptome dynamics-based operon prediction and verification in Streptomyces coelicolor.

Charaniya S, Mehra S, Lian W, Jayapal KP, Karypis G, Hu WS - Nucleic Acids Res. (2007)

Bottom Line: Using a compilation of genome-scale temporal transcriptome data for the model organism, Streptomyces coelicolor, under different environmental and genetic perturbations, we have developed a supervised machine-learning method for operon prediction in this microorganism. Based on model predictions for the entire genome, we verified the co-transcription of more than 250 gene pairs by RT-PCR. These results vastly increase the database of known operons in S. coelicolor and provide valuable information for exploring gene function and regulation to harness the potential of this differentiating microorganism for synthesis of natural products.


Affiliation: Department of Chemical Engineering and Materials Science, University of Minnesota, 421 Washington Avenue SE, Minneapolis, MN 55455-0132, USA.

ABSTRACT
Streptomyces spp. produce a variety of valuable secondary metabolites, which are regulated in a spatio-temporal manner by a complex network of inter-connected gene products. Using a compilation of genome-scale temporal transcriptome data for the model organism, Streptomyces coelicolor, under different environmental and genetic perturbations, we have developed a supervised machine-learning method for operon prediction in this microorganism. We demonstrate that, using features dependent on transcriptome dynamics and genome sequence, a support vector machines (SVM)-based classification algorithm can accurately classify >90% of gene pairs in a set of known operons. Based on model predictions for the entire genome, we verified the co-transcription of more than 250 gene pairs by RT-PCR. These results vastly increase the database of known operons in S. coelicolor and provide valuable information for exploring gene function and regulation to harness the potential of this differentiating microorganism for synthesis of natural products.


Figure 2: Density distribution of intergenic distance in KOPs and NOPs. (continuous line) KOPs; (dashed line) NOPs.

Mentions: As described in the Materials and Methods section, from a set of known operons we obtained 149 known operon pairs (KOPs) and 122 non-operon pairs (NOPs). The density distribution of the intergenic distances in KOPs and NOPs is shown in Figure 2. For KOPs, the distribution has a sharp peak around an intergenic distance of 0 bp. Sixty-seven (45%) KOPs have an intergenic distance of less than 0 bp, indicative of a translational overlap between the genes. Fifty-seven of these gene pairs have an overlap of 4 bp. Among them, 35 have ATGA as the overlapping sequence, where ATG corresponds to the start codon of the second gene and TGA is the stop codon of the first gene. The overlapping sequence in the other 22 pairs is GTGA. Since S. coelicolor has 72% GC content, GTG is also a commonly observed translational start codon. An overlap of 1 bp between the start and the stop codons of adjacent genes was also observed among five of the KOPs. In contrast, only six (5%) NOPs have a negative intergenic distance, i.e. an overlap between the genes.
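The overlap arithmetic in this excerpt can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the toy sequences, and the coordinate convention (1-based inclusive positions, both genes on the forward strand, distance defined as downstream start minus upstream end minus 1) are assumptions made for illustration. Under that convention a 4 bp ATGA overlap gives a distance of -4 bp, and the percentages quoted above follow from the reported counts.

```python
def intergenic_distance(upstream_end: int, downstream_start: int) -> int:
    """Bases between two adjacent genes (assumed 1-based inclusive coordinates).

    A negative value means the two genes overlap by that many bases.
    """
    return downstream_start - upstream_end - 1


def overlapping_sequence(genome: str, upstream_end: int, downstream_start: int) -> str:
    """Return the bases shared by both genes, or '' if they do not overlap."""
    if intergenic_distance(upstream_end, downstream_start) >= 0:
        return ""
    # Convert the 1-based inclusive interval [downstream_start, upstream_end]
    # into a 0-based half-open Python slice.
    return genome[downstream_start - 1:upstream_end]


# Toy sequence (hypothetical): the upstream stop codon TGA occupies positions 5-7
# and the downstream start codon ATG occupies positions 4-6.
toy = "CCCATGACCC"
print(intergenic_distance(7, 4))        # 4 bp overlap, distance of -4
print(overlapping_sequence(toy, 7, 4))  # the shared bases, ATGA

# 1 bp overlap: the stop codon TGA (positions 1-3) ends on the A that begins ATG (3-5).
toy_one_bp = "TGATGCCC"
print(intergenic_distance(3, 3))        # distance of -1

# Counts reported in the excerpt, expressed as rounded percentages.
kop_overlap_pct = round(100 * 67 / 149)  # KOPs with a negative intergenic distance
nop_overlap_pct = round(100 * 6 / 122)   # NOPs with a negative intergenic distance
```

Genes on the reverse strand would need their coordinates and sequence reverse-complemented first, which this sketch deliberately leaves out.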

