Deciphering the code for retroviral integration target site selection.

Santoni FA, Hartley O, Luban J - PLoS Comput. Biol. (2010)

Bottom Line: ChIPSeq datasets for more than 60 factors were compared with 14 retroviral integration datasets. When compared with MLV, PERV or XMRV integration sites, strong association was observed with STAT1, acetylation of H3 and H4 at several positions, and methylation of H2AZ, H3K4, and K9. The supermarker thus identifies chromosomal features highly favored for retroviral integration, provides clues to the mechanism by which retrovirus integration sites are selected, and offers a tool for predicting cell-type specific proto-oncogene activation by retroviruses.


Affiliation: Department of Microbiology and Molecular Medicine, University of Geneva, Geneva, Switzerland.

ABSTRACT
Upon cell invasion, retroviruses generate a DNA copy of their RNA genome and integrate retroviral cDNA within host chromosomal DNA. Integration occurs throughout the host cell genome, but target site selection is not random. Each subgroup of retrovirus is distinguished from the others by attraction to particular features on chromosomes. Despite extensive efforts to identify host factors that interact with retrovirion components or chromosome features predictive of integration, little is known about how integration sites are selected. We attempted to identify markers predictive of retroviral integration by exploiting Precision-Recall methods for extracting information from highly skewed datasets to derive robust and discriminating measures of association. ChIPSeq datasets for more than 60 factors were compared with 14 retroviral integration datasets. When compared with MLV, PERV or XMRV integration sites, strong association was observed with STAT1, acetylation of H3 and H4 at several positions, and methylation of H2AZ, H3K4, and K9. By combining peaks from ChIPSeq datasets, a supermarker was identified that localized within 2 kB of 75% of MLV proviruses and detected differences in integration preferences among different cell types. The supermarker predicted the likelihood of integration within specific chromosomal regions in a cell-type specific manner, yielding probabilities for integration into proto-oncogene LMO2 identical to experimentally determined values. The supermarker thus identifies chromosomal features highly favored for retroviral integration, provides clues to the mechanism by which retrovirus integration sites are selected, and offers a tool for predicting cell-type specific proto-oncogene activation by retroviruses.
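The idea of combining ChIPSeq peaks into a supermarker and asking what fraction of proviruses fall within 2 kB of it can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the function names (`merge_peaks`, `within_window`, `fraction_near`) are hypothetical, the coordinates are made up, peaks are reduced to single positions on one chromosome, and a plain union of peaks stands in for whatever combination rule the authors actually used.

```python
from bisect import bisect_left


def merge_peaks(*peak_lists):
    """Sorted union of peak positions (one chromosome) from several tracks."""
    return sorted(set().union(*peak_lists))


def within_window(site, peaks, window):
    """True if any peak lies within `window` bp of `site`; `peaks` must be sorted."""
    i = bisect_left(peaks, site - window)  # first peak at or after the left edge
    return i < len(peaks) and peaks[i] <= site + window


def fraction_near(sites, peaks, window=2000):
    """Fraction of sites with at least one peak within +/- `window` bp."""
    if not sites:
        return 0.0
    return sum(within_window(s, peaks, window) for s in sites) / len(sites)


# Toy coordinates, invented for this example (not data from the paper).
h3k4me3 = [10_000, 50_000]
stat1 = [120_000]
supermarker = merge_peaks(h3k4me3, stat1)

proviruses = [11_500, 48_100, 90_000, 121_999]
frac_single = fraction_near(proviruses, h3k4me3)    # one marker alone
frac_super = fraction_near(proviruses, supermarker)  # combined marker
print(frac_single, frac_super)
```

In this toy case the combined marker covers more proviruses than the single marker, which is the qualitative behaviour the abstract describes; real data would need per-chromosome handling and peak intervals rather than point positions.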

Figure 4 (pcbi-1001008-g004): Influence of window size on the F score. Association (F score) between MLV proviruses [43] and either H3K4me3 (green dashed line with stars) or the supermarker in HeLa cells (solid blue line with solid circles) as a function of window size in kilobases. The true positive fraction versus the false positive fraction is shown for each point.

Mentions: ChIPSeq datasets for 60 chromatin-associated factors (Table 1) were compared with 14 provirus datasets for MLV, PERV, XMRV, HIV-1, HTLV-1, ASLV, Foamy virus, and HIV/MLV chimeras (Table 2). Acetylation of H3 and H4 at several positions, and methylation of H2AZ, H3K4, and K9, were strongly associated with gammaretroviral integration sites, all with F scores >0.80 (Figures 1 and 2, Table 3 and Tables S1 and S2). H3K4me3 in particular was strongly associated with MLV integration sites (68% within 2 kB; p < 10^-324; F score 0.83) and with the integration sites of PERV (60% within 2 kB; p < 10^-350; F score 0.82) and XMRV (64% within 2 kB; p < 10^-170; F score 0.81) (Figures 1 and 2, Table 3). The effect of window size on the F score was examined for factors strongly associated with MLV and the other gammaretroviruses. Interestingly, the F score was maximal when it was calculated using a window of ±2 kB for proviruses flanking the sites of these chromatin features (Figure 4).
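The window-size analysis can be sketched as a sweep over candidate windows, scoring each one from the true positive fraction (integration sites near a peak) and the false positive fraction (control sites near a peak). This excerpt does not give the paper's exact F score formula, so the sketch below assumes the standard F1 (harmonic mean of precision and recall) and equally sized integration and control sets; it should not be expected to reproduce the reported scores. The helper names and all coordinates are hypothetical.

```python
from bisect import bisect_left


def hit_fraction(sites, peaks, window):
    """Fraction of sites with a peak within +/- `window` bp (`peaks` sorted)."""
    hits = 0
    for s in sites:
        i = bisect_left(peaks, s - window)
        if i < len(peaks) and peaks[i] <= s + window:
            hits += 1
    return hits / len(sites) if sites else 0.0


def f_score(tp_frac, fp_frac):
    """Standard F1 from true/false positive fractions.

    Assumes integration and control sets of equal size, so that
    precision = tp / (tp + fp) and recall = tp (assumption of this sketch).
    """
    if tp_frac == 0:
        return 0.0
    precision = tp_frac / (tp_frac + fp_frac)
    recall = tp_frac
    return 2 * precision * recall / (precision + recall)


# Toy coordinates, invented for this example (not data from the paper).
peaks = [10_000, 50_000, 120_000]
integration = [11_500, 48_100, 90_000, 121_999]
control = [30_000, 70_000, 100_000, 124_000]

curve = {}
for window_kb in (1, 2, 5):
    tp = hit_fraction(integration, peaks, window_kb * 1000)
    fp = hit_fraction(control, peaks, window_kb * 1000)
    curve[window_kb] = f_score(tp, fp)

best_window_kb = max(curve, key=curve.get)
print(curve, best_window_kb)
```

The trade-off is visible even in toy data: a window that is too narrow misses true associations, while a window that is too wide starts capturing control sites and lowers precision.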

