Deciphering the code for retroviral integration target site selection.

Santoni FA, Hartley O, Luban J - PLoS Comput. Biol. (2010)

Bottom Line: ChIPSeq datasets for more than 60 factors were compared with 14 retroviral integration datasets. When compared with MLV, PERV or XMRV integration sites, strong association was observed with STAT1, acetylation of H3 and H4 at several positions, and methylation of H2AZ, H3K4, and K9. The supermarker thus identifies chromosomal features highly favored for retroviral integration, provides clues to the mechanism by which retrovirus integration sites are selected, and offers a tool for predicting cell-type specific proto-oncogene activation by retroviruses.


Affiliation: Department of Microbiology and Molecular Medicine, University of Geneva, Geneva, Switzerland.

ABSTRACT
Upon cell invasion, retroviruses generate a DNA copy of their RNA genome and integrate retroviral cDNA within host chromosomal DNA. Integration occurs throughout the host cell genome, but target site selection is not random. Each subgroup of retrovirus is distinguished from the others by attraction to particular features on chromosomes. Despite extensive efforts to identify host factors that interact with retrovirion components or chromosome features predictive of integration, little is known about how integration sites are selected. We attempted to identify markers predictive of retroviral integration by exploiting Precision-Recall methods for extracting information from highly skewed datasets to derive robust and discriminating measures of association. ChIPSeq datasets for more than 60 factors were compared with 14 retroviral integration datasets. When compared with MLV, PERV or XMRV integration sites, strong association was observed with STAT1, acetylation of H3 and H4 at several positions, and methylation of H2AZ, H3K4, and K9. By combining peaks from ChIPSeq datasets, a supermarker was identified that localized within 2 kB of 75% of MLV proviruses and detected differences in integration preferences among different cell types. The supermarker predicted the likelihood of integration within specific chromosomal regions in a cell-type specific manner, yielding probabilities for integration into proto-oncogene LMO2 identical to experimentally determined values. The supermarker thus identifies chromosomal features highly favored for retroviral integration, provides clues to the mechanism by which retrovirus integration sites are selected, and offers a tool for predicting cell-type specific proto-oncogene activation by retroviruses.
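The abstract does not spell out how the Precision-Recall measure is computed, so the following is only a minimal sketch of one plausible reading, not the authors' method. It assumes that a provirus counts as a hit when it lies within a 2 kB window of the nearest marker peak, that false positives are random control sites falling within the same window, and that the F score is the usual harmonic mean of precision and recall. The function names, the single-chromosome coordinates, and the toy positions are all hypothetical.

```python
from bisect import bisect_left


def nearest_distance(pos, sorted_markers):
    """Distance in nucleotides from pos to the closest marker (inf if none)."""
    i = bisect_left(sorted_markers, pos)
    candidates = []
    if i < len(sorted_markers):
        candidates.append(sorted_markers[i] - pos)   # closest marker at or after pos
    if i > 0:
        candidates.append(pos - sorted_markers[i - 1])  # closest marker before pos
    return min(candidates) if candidates else float("inf")


def f_score(integration_sites, control_sites, markers, window=2000):
    """Harmonic mean of precision and recall (assumed definition, see lead-in).

    recall    = fraction of integration sites within `window` of a marker
    precision = hits among integration sites / hits among integration + control
    """
    if not integration_sites:
        return 0.0
    markers = sorted(markers)
    tp = sum(nearest_distance(p, markers) <= window for p in integration_sites)
    fp = sum(nearest_distance(p, markers) <= window for p in control_sites)
    recall = tp / len(integration_sites)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy positions on a single hypothetical chromosome.
markers = [1_000, 50_000]
integration = [1_500, 49_000, 200_000]
control = [10_000, 100_000, 51_000, 300_000]
score = f_score(integration, control, markers)
```

Using a matched random control set as the source of false positives is one way to cope with the skew the abstract mentions, since markers cover only a small part of the genome; other definitions of precision would give different values.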

Figure 1: Visualization of association between retroviral integration sites and chromosomal markers. (A) Construction of chromosome projection mandalas to visualize the proximity of individual proviruses to the nearest marker on the chromosome. The linear sequence of each human chromosome was linked and circularized. Proviral integration sites were located on the circle according to their position on each chromosome (empty circles) and then a marker (filled circles) was placed towards the center of the circle, at a distance from the perimeter that was equal, in log scale from 0 to 1 megabase, to the distance from the closest marker (empty boxes). Blue filled circles represent proviruses that were within 2 kB of the nearest marker; red circles represent proviruses that are >2 kB from the nearest marker. Examples of chromosome projection mandalas for (B) MLV (Lewinski et al. 2006) versus H3K4me3 (the arrow indicates the chromosomal mapping direction), (C) control versus H3K4me3, (D) MLV versus STAT1, and (E) MLV versus CpG+TSS. The number of MLV proviruses analyzed in this dataset (Lewinski et al. 2006) was 588. The F score and the percentage of proviruses within 2 kB are presented under each mandala.

Mentions: To visualize genome-wide association of proviruses with potential markers, chromosome projection mandalas were developed (Figure 1A, see Methods). Each dot on the mandala represents a retroviral integration site with the following polar coordinates: angular distance corresponds to genomic location on the indicated chromosome; radial distance from the contour of the circle is the distance in nucleotides from the nearest site of the marker in question, log-scaled from 0 to 1 megabase.
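The polar coordinates described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the text says only that the radial distance is log-scaled from 0 to 1 megabase, so the exact scaling used here, log10(1 + d) / log10(1 + 10^6), is an assumption, as are the function name, the chromosome names, and the chromosome lengths.

```python
import math
from bisect import bisect_left

MAX_DISTANCE = 1_000_000   # radial axis is capped at 1 megabase
NEAR_WINDOW = 2_000        # proviruses within 2 kB are drawn in blue


def mandala_coordinates(chrom, pos, markers_by_chrom, chrom_lengths, chrom_order):
    """Return (angle in radians, depth in [0, 1], is_near) for one provirus.

    angle: position on the circle formed by linking all chromosomes end to end.
    depth: log-scaled distance to the nearest marker, measured inward from the
           perimeter (0 = on the perimeter, 1 = 1 Mb or more away).
    """
    genome_length = sum(chrom_lengths[c] for c in chrom_order)
    # Offset of this chromosome's start along the linked genome.
    offset = 0
    for c in chrom_order:
        if c == chrom:
            break
        offset += chrom_lengths[c]
    angle = 2 * math.pi * (offset + pos) / genome_length

    # Nearest marker on the same chromosome.
    markers = sorted(markers_by_chrom.get(chrom, []))
    i = bisect_left(markers, pos)
    distance = float("inf")
    if i < len(markers):
        distance = min(distance, markers[i] - pos)
    if i > 0:
        distance = min(distance, pos - markers[i - 1])

    capped = min(distance, MAX_DISTANCE)
    # Assumed log scaling; the +1 keeps a distance of 0 on the perimeter.
    depth = math.log10(1 + capped) / math.log10(1 + MAX_DISTANCE)
    return angle, depth, distance <= NEAR_WINDOW


# Hypothetical two-chromosome genome for illustration only.
lengths = {"chrA": 2_000_000, "chrB": 2_000_000}
order = ["chrA", "chrB"]
peaks = {"chrB": [0]}
on_marker = mandala_coordinates("chrB", 0, peaks, lengths, order)
no_marker = mandala_coordinates("chrA", 1_000_000, peaks, lengths, order)
```

With a plotting library, each provirus would then be drawn at radius `1 - depth` on a unit circle, so that proviruses close to a marker stay near the perimeter and distant ones move toward the center.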

