Deciphering the code for retroviral integration target site selection.

Santoni FA, Hartley O, Luban J - PLoS Comput. Biol. (2010)

Bottom Line: ChIPSeq datasets for more than 60 factors were compared with 14 retroviral integration datasets. When compared with MLV, PERV or XMRV integration sites, strong association was observed with STAT1, acetylation of H3 and H4 at several positions, and methylation of H2AZ, H3K4, and K9. The supermarker thus identifies chromosomal features highly favored for retroviral integration, provides clues to the mechanism by which retrovirus integration sites are selected, and offers a tool for predicting cell-type specific proto-oncogene activation by retroviruses.


Affiliation: Department of Microbiology and Molecular Medicine, University of Geneva, Geneva, Switzerland.

ABSTRACT
Upon cell invasion, retroviruses generate a DNA copy of their RNA genome and integrate retroviral cDNA within host chromosomal DNA. Integration occurs throughout the host cell genome, but target site selection is not random. Each subgroup of retrovirus is distinguished from the others by attraction to particular features on chromosomes. Despite extensive efforts to identify host factors that interact with retrovirion components or chromosome features predictive of integration, little is known about how integration sites are selected. We attempted to identify markers predictive of retroviral integration by exploiting Precision-Recall methods for extracting information from highly skewed datasets to derive robust and discriminating measures of association. ChIPSeq datasets for more than 60 factors were compared with 14 retroviral integration datasets. When compared with MLV, PERV or XMRV integration sites, strong association was observed with STAT1, acetylation of H3 and H4 at several positions, and methylation of H2AZ, H3K4, and K9. By combining peaks from ChIPSeq datasets, a supermarker was identified that localized within 2 kB of 75% of MLV proviruses and detected differences in integration preferences among different cell types. The supermarker predicted the likelihood of integration within specific chromosomal regions in a cell-type specific manner, yielding probabilities for integration into proto-oncogene LMO2 identical to experimentally determined values. The supermarker thus identifies chromosomal features highly favored for retroviral integration, provides clues to the mechanism by which retrovirus integration sites are selected, and offers a tool for predicting cell-type specific proto-oncogene activation by retroviruses.
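The "within 2 kB" statistic quoted in the abstract can be illustrated with a minimal sketch. It assumes a hypothetical simplification that is not taken from the paper: each ChIPSeq peak and each provirus is reduced to a single coordinate on one chromosome. The function name `fraction_within` and the example coordinates are invented for illustration.

```python
from bisect import bisect_left


def fraction_within(sites, peaks, window=2000):
    """Return the fraction of integration sites within `window` bp of a peak.

    Assumption (not from the paper): `sites` and `peaks` are plain lists of
    single coordinates on the same chromosome.
    """
    if not sites:
        return 0.0
    peaks = sorted(peaks)
    hits = 0
    for s in sites:
        i = bisect_left(peaks, s)
        # The nearest peak is one of the two neighbours of the insertion point.
        neighbours = peaks[max(i - 1, 0):i + 1]
        if any(abs(s - p) <= window for p in neighbours):
            hits += 1
    return hits / len(sites)


# Invented coordinates: three of the four sites lie within 2 kb of a peak.
example = fraction_within([2500, 5000, 11999, 500], [10000, 1000])
```

A real analysis would work per chromosome and would probably use peak intervals rather than point coordinates; the sorted-list lookup simply keeps the sketch short.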


pcbi-1001008-g010: Influence of dataset matching on the F score. Histograms of the F score (upper panel) and the percentage of associated proviruses within 2 kb of the supermarkers (lower panel) with respect to MLV proviruses, either from Lewinski et al. (MLV HeLa I) or Wu et al. (MLV HeLa II), and the HIVmINmGAG chimera, as indicated. Supermarkers were generated with ChIPSeq data from HeLa cells or from CD4+ T cells and compared with MLV proviruses from either HeLa cells or CD4+ T cells. “Matched” means that the provirus and the supermarker are from the same cell type.

Mentions: To determine whether the F score can discriminate between cell types, MLV provirus datasets from HeLa and CD4+ T cells were compared with the supermarker for each of these cell types, in all combinations. As mentioned above, when an MLV provirus dataset obtained from infection of HeLa cells [43] was compared with the supermarker from HeLa cell ChIPSeq data, very strong association was observed (75% within 2 kb; p < 10^-284; F score 0.87) (Table 5 and Figure 10). When the same provirus dataset was compared with the supermarker derived from CD4+ T cell ChIPSeq data, the strength of the association was much decreased (32% within 2 kb; p < 10^-57; F score 0.61) (Table 5 and Figure 10). The same pattern was seen for the chimera HIVmINmGag, for which association with the supermarker in HeLa cells (70% within 2 kb; p < 10^-263; F score 0.86) (Table 5 and Figure 10) was much greater than association with the supermarker in CD4+ T cells (27% within 2 kb; p < 10^-24; F score 0.56) (Table 5 and Figure 10). The opposite pattern was also seen, in that MLV proviruses cloned from CD4+ T cells [71] were strongly associated with the supermarker derived in these cells (71% within 2 kb; p < 10^-112; F score 0.84) (Table 5 and Figure 10), and less well associated with the supermarker from HeLa cells (39% within 2 kb; p < 10^-42; F score 0.67) (Table 5 and Figure 10).
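For readers unfamiliar with Precision-Recall measures, the sketch below computes the textbook F score (the harmonic mean of precision and recall). This excerpt does not give the paper's exact definition, so the mapping used here is an assumption: proviruses near the supermarker count as true positives, proviruses far from it as false negatives, and random control sites near it as false positives. The counts are invented for illustration and will not necessarily reproduce the values reported above.

```python
def f_score(true_pos, false_pos, false_neg):
    """Textbook F1 score: harmonic mean of precision and recall.

    Assumption: this is the standard definition, which may differ from the
    exact measure used in the paper.
    """
    predicted = true_pos + false_pos
    actual = true_pos + false_neg
    if predicted == 0 or actual == 0:
        return 0.0
    precision = true_pos / predicted
    recall = true_pos / actual
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Invented counts: 75 proviruses near the marker, 25 far from it,
# and 5 random control sites near the marker.
illustrative = f_score(true_pos=75, false_pos=5, false_neg=25)
```

Computing such a score for each combination of provirus dataset and supermarker cell type would give the matched-versus-unmatched comparison described in the text.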

