Analysis methods for studying the 3D architecture of the genome.

Ay F, Noble WS - Genome Biol. (2015)

Affiliation: Department of Genome Sciences, University of Washington, Seattle, WA, 98195, USA. ferhatay@uw.edu.

ABSTRACT
The rapidly increasing quantity of genome-wide chromosome conformation capture data presents great opportunities and challenges in the computational modeling and interpretation of the three-dimensional genome. In particular, with recent trends towards higher-resolution high-throughput chromosome conformation capture (Hi-C) data, the diversity and complexity of biological hypotheses that can be tested necessitates rigorous computational and statistical methods as well as scalable pipelines to interpret these datasets. Here we review computational tools to interpret Hi-C data, including pipelines for mapping, filtering, and normalization, and methods for confidence estimation, domain calling, visualization, and three-dimensional modeling.

Fig. 1: Overview of Hi-C analysis pipelines. These pipelines start from raw reads and produce raw and normalized contact maps for further interpretation. The colored boxes represent alternative ways to accomplish a given step in the pipeline. RE, restriction enzyme. At each step, commonly used file formats (‘.fq’, ‘.bam’, and ‘.txt’) are indicated. a, The blue, pink and green boxes correspond to pre-truncation, iterative mapping and allowing split alignments, respectively. b, Several filters are applied to individual reads. c, The blue and pink boxes correspond to strand filters and distance filters, respectively. d, Three alternative methods for normalization.

Mentions: Pre-truncation: Pre-process all the reads and truncate the ones that contain potential ligation junctions to keep the longest piece without a junction sequence [46] (Fig. 1a, blue box). For restriction enzymes that leave sticky ends, the ligation junction sequence is a concatenation of two filled-in restriction sites (for example, AAGCTAGCTT for HindIII, which cuts at A/AGCTT, and GATCGATC for MboI, which cuts at /GATC).
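
The pre-truncation step can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used by any particular pipeline: the function names `ligation_junction` and `pretruncate` are hypothetical, the enzyme is assumed to have a palindromic site and leave a 5' overhang, and the read is assumed to be split at the point where the two filled-in ends meet.

```python
def ligation_junction(site: str, cut_offset: int) -> str:
    """Build the junction from two filled-in restriction sites.

    `site` is the recognition sequence and `cut_offset` is the number of
    bases before the cut on the top strand (1 for HindIII A/AGCTT,
    0 for MboI /GATC). Assumes a palindromic site with a 5' overhang.
    """
    left_end = site[: len(site) - cut_offset]   # filled-in end of the upstream fragment
    right_end = site[cut_offset:]               # filled-in end of the downstream fragment
    return left_end + right_end


def pretruncate(read: str, site: str, cut_offset: int) -> str:
    """Return the longest piece of `read` that contains no full junction."""
    junction = ligation_junction(site, cut_offset)
    left_len = len(site) - cut_offset           # where the two ends meet inside the junction
    pieces = []
    start = 0
    while True:
        hit = read.find(junction, start)
        if hit == -1:
            pieces.append(read[start:])         # remainder has no junction
            break
        cut = hit + left_len
        pieces.append(read[start:cut])          # piece ends at the ligation point
        start = cut
    return max(pieces, key=len)


if __name__ == "__main__":
    read = "ACGTACGT" + "AAGCTAGCTT" + "GG"
    print(ligation_junction("AAGCTT", 1))       # HindIII junction
    print(ligation_junction("GATC", 0))         # MboI junction
    print(pretruncate(read, "AAGCTT", 1))
```

Splitting at the ligation point keeps the filled-in restriction site on each piece, so the retained piece still ends at a recognizable restriction site; a stricter variant could drop the junction bases entirely.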