Distinguishing between productive and abortive promoters using a random forest classifier in Mycoplasma pneumoniae.

Lloréns-Rico V, Lluch-Senar M, Serrano L - Nucleic Acids Res. (2015)

Bottom Line: We determined the contribution to transcription events of different genomic features: the -10, extended -10 and -35 boxes, the UP element, the bases surrounding the -10 box and the nearest-neighbor free energy of the promoter region. Using a random forest classifier and the aforementioned features transformed into scores, we could distinguish between true promoters, abortive promoters and non-promoters with good -10 box sequences. The methods used in this characterization of promoters can be extended to other bacteria and have important applications for promoter design in bacterial genome engineering.


Affiliation: EMBL/CRG Systems Biology Research Unit, Centre for Genomic Regulation (CRG), Dr Aiguader 88, 08003 Barcelona, Spain; Universitat Pompeu Fabra (UPF), Dr Aiguader 88, 08003 Barcelona, Spain.


© Copyright Policy - creative-commons
Figure 4: Promoter re-annotation in M. pneumoniae. (A) Promoters detected in a 30 kb region of the M. pneumoniae genome. Blue dots represent data from RNA tiling arrays in M. pneumoniae, the black line represents RNA-seq data, red arrows represent the annotated genes on the plus strand and vertical green lines represent the promoters found by the random forest classifier. These promoters coincide with sharp increases in expression values in both the RNA-seq and the tiling data, validating the prediction. (B) Manual curation and re-annotation of promoters in the genome of M. pneumoniae. A promoter was found on the positive strand at position 494 837 (vertical green line), which did not coincide with any annotated TSS (65). The predicted promoter, which lies inside gene MPN410, coincides with a sharp increase in the RNA-seq data in the three different experiments represented.

Mentions: Using the 0.6 cutoff, the random forest classifier found 709 putative promoters in the genome of M. pneumoniae (Supplementary Table S3). Of these, 576 coincide with steep changes in both RNA-seq and tiling data, indicating a TSS in the vicinity of the predicted promoter (Figure 4A).
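
The two steps described in this passage, applying a score cutoff and then checking each call against expression data, can be illustrated with a minimal Python sketch. It is not the authors' pipeline. The function names, the toy scores and signal, the window size, the jump threshold, the inclusive comparison against the cutoff and the plus-strand orientation are all assumptions made for illustration; only the 0.6 cutoff and the 576 of 709 counts come from the text.

```python
from typing import List, Tuple

CUTOFF = 0.6  # classifier score cutoff quoted in the text


def call_promoters(scores: List[Tuple[int, float]], cutoff: float = CUTOFF) -> List[int]:
    """Return positions whose score reaches the cutoff (inclusive: an assumption)."""
    return [pos for pos, score in scores if score >= cutoff]


def has_steep_increase(signal: List[float], pos: int,
                       window: int = 2, min_jump: float = 1.0) -> bool:
    """Hypothetical check: mean expression just downstream of pos exceeds the
    mean just upstream by at least min_jump (plus strand assumed)."""
    upstream = signal[max(0, pos - window):pos]
    downstream = signal[pos:pos + window]
    if not upstream or not downstream:
        return False
    jump = sum(downstream) / len(downstream) - sum(upstream) / len(upstream)
    return jump >= min_jump


# Toy data, invented for illustration: (position, classifier score) pairs
# and a per-position expression signal.
scores = [(3, 0.82), (6, 0.41), (9, 0.67)]
signal = [1.0, 1.1, 0.9, 4.0, 4.2, 4.1, 4.0, 4.1, 4.0, 4.1, 4.0, 4.2]

predicted = call_promoters(scores)
supported = [p for p in predicted if has_steep_increase(signal, p)]

# Fraction of predictions supported by expression data, using the counts in the text.
reported_fraction = 576 / 709

print(predicted, supported, round(reported_fraction, 3))
```

In the toy data, two positions pass the cutoff but only one sits at a step in the signal, mirroring how a subset of the 709 predictions (576, about 81%) is backed by expression changes.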

