Annotation of genomics data using bidirectional hidden Markov models unveils variations in Pol II transcription cycle.

Zacher B, Lidschreiber M, Cramer P, Gagneur J, Tresch A - Mol. Syst. Biol. (2014)

Bottom Line: To overcome these limitations, we introduce bidirectional HMMs, which infer directed genomic states from occupancy profiles de novo. Application to RNA polymerase II-associated factors in yeast and chromatin modifications in human T cells recovers the majority of transcribed loci, reveals gene-specific variations in the yeast transcription cycle and indicates the existence of directed chromatin state patterns at transcribed, but not at repressed, regions in the human genome. We anticipate bidirectional HMMs to significantly improve the analyses of genome-associated directed processes.


Affiliation: Gene Center and Department of Biochemistry, Center for Integrated Protein Science CIPSM, Ludwig-Maximilians-Universität München, Munich, Germany; Institute for Genetics, University of Cologne, Cologne, Germany.


Principle of bidirectional HMM (bdHMM). Simulated occupancy signal (1st track from the top) for a putative factor with a low level (centered at 0) in untranscribed regions (state U), an intermediate level in the 5' part of genes (state E), and a high level in the 3' part of genes (state L). Arrows (2nd track) depict boundaries and orientation of transcription. Unlike the standard HMM (3rd track), the bdHMM (4th track) assigns strands (+ or −) to the expressed states (E, L). HMM transition graph. Because orientation of transcription is not modeled by the standard HMM, the spurious reverse transitions (E ⇒ U, L ⇒ E and U ⇒ L) are as likely as the correctly oriented transitions (U ⇒ E, E ⇒ L and L ⇒ U). bdHMM transition graph. In contrast to the HMM, the bdHMM, which has explicit strand-specific expressed states (E+/E− and L+/L−), allows inferring only the correctly oriented transitions.
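The two transition graphs in the caption can be sketched as sets of allowed state changes. This is only an illustration under stated assumptions: the names `HMM_EDGES`, `BDHMM_EDGES`, `is_valid` and `reverse_complement_path` are hypothetical, and treating transitions as strictly allowed or forbidden is a simplification of the probabilistic model defined in the article's Materials and Methods.

```python
# Hypothetical encoding of the transition graphs described in the caption.
# Paths are read left to right along the genome coordinate.

HMM_STATES = ["U", "E", "L"]
# A standard HMM has no strand information, so both orders are possible.
HMM_EDGES = {
    ("U", "E"), ("E", "L"), ("L", "U"),  # order seen for + strand genes
    ("E", "U"), ("L", "E"), ("U", "L"),  # reverse order, equally likely
}

BDHMM_STATES = ["U", "E+", "L+", "E-", "L-"]
# With strand-specific states, each strand keeps only its own orientation.
BDHMM_EDGES = {
    ("U", "E+"), ("E+", "L+"), ("L+", "U"),  # + strand genes
    ("U", "L-"), ("L-", "E-"), ("E-", "U"),  # - strand genes
}


def is_valid(path, edges):
    """True if every change of state in `path` is an allowed edge.

    Staying in the same state is always allowed (assumption: a state
    usually spans many consecutive positions).
    """
    return all(a == b or (a, b) in edges for a, b in zip(path, path[1:]))


def reverse_complement_path(path):
    """Read a bdHMM state path from the opposite strand."""
    flip = {"U": "U", "E+": "E-", "E-": "E+", "L+": "L-", "L-": "L+"}
    return [flip[s] for s in reversed(path)]


plus_gene = ["U", "E+", "E+", "L+", "U"]
minus_gene = reverse_complement_path(plus_gene)
print(minus_gene, is_valid(minus_gene, BDHMM_EDGES))
```

Under this encoding, reversing a valid path and flipping its strand labels yields another valid path, which is the sense in which the bdHMM treats both strands alike.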

Mentions: Standard and bidirectional HMMs are best understood with the help of a simulated dataset. A precise definition of the HMM and the bdHMM is given in the Materials and Methods. The example in Fig 1 considers a part of the genome where transcription occurs as a sequence of three different genomic segments. The transcribed regions split into segments of early (E) and late (L) transcription activity, and they are flanked by untranscribed (U) segments. The order of the three segments U, E and L along the genome depends on the orientation of the respective gene (Fig 1A, gray arrows). ChIP measurements $o_0, o_1, \dots, o_T$ for a single protein at genomic positions $t = 0, 1, \dots, T$ were simulated with low (U), medium (E) and high (L) average occupancy in the different segments. Note that these ChIP signals do not contain strand-specific information. An HMM defines a probability distribution on a sequence of observations $o_0, \dots, o_T$. It assumes that each observation $o_t$ is emitted by a corresponding (unobserved) state variable $s_t$, which can assume values from a finite set of hidden states. The value of $s_t$ determines the probability of observing $o_t$, the emission probability $P(o_t \mid s_t)$. The hidden variables form a first-order Markov chain, which means that the probability of observing $s_t$ depends only on $s_{t-1}$, through the transition probability $P(s_t \mid s_{t-1})$. After these probabilities have been learned, the HMM outputs the so-called Viterbi path, which is the most likely state sequence to have generated the observations. In our example, the Viterbi path provides a genome annotation.
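The simulated example and the Viterbi path can be illustrated with a minimal sketch of a standard three-state HMM. All numbers here are assumptions chosen for illustration (Gaussian emissions with means 0, 1 and 2, a shared standard deviation of 0.25, a probability of 0.9 of staying in a state, the segment lengths), and the parameters are fixed by hand rather than learned as in the article.

```python
import numpy as np


def viterbi(obs, log_init, log_trans, means, sd):
    """Most likely state path for an HMM with Gaussian emissions (log space)."""
    T, K = len(obs), len(means)
    # log P(o_t | s_t = k), dropping a constant shared by all states
    log_emit = -0.5 * ((obs[:, None] - means[None, :]) / sd) ** 2
    score = np.empty((T, K))
    back = np.zeros((T, K), dtype=int)
    score[0] = log_init + log_emit[0]
    for t in range(1, T):
        # cand[i, j]: best score ending in state i at t-1, then moving to j
        cand = score[t - 1][:, None] + log_trans
        back[t] = cand.argmax(axis=0)
        score[t] = cand.max(axis=0) + log_emit[t]
    # Trace back from the best final state
    path = np.empty(T, dtype=int)
    path[-1] = score[-1].argmax()
    for t in range(T - 1, 0, -1):
        path[t - 1] = back[t, path[t]]
    return path


# States: 0 = U (low), 1 = E (medium), 2 = L (high); values are assumptions.
means = np.array([0.0, 1.0, 2.0])
sd = 0.25
log_init = np.log(np.full(3, 1.0 / 3.0))
# Symmetric transitions: the standard HMM does not model orientation.
log_trans = np.log(np.array([[0.90, 0.05, 0.05],
                             [0.05, 0.90, 0.05],
                             [0.05, 0.05, 0.90]]))

# One simulated gene on the + strand: U, then E, then L, then U again.
truth = np.repeat([0, 1, 2, 0], [50, 30, 30, 50])
rng = np.random.default_rng(0)
obs = means[truth] + rng.normal(0.0, sd, size=truth.size)

path = viterbi(obs, log_init, log_trans, means, sd)
accuracy = (path == truth).mean()
print(f"fraction of positions annotated correctly: {accuracy:.2f}")
```

Because the occupancy signal carries no strand information, this annotation labels a gene on the opposite strand with the same states E and L, only in reverse order along the genome.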

