New methods for finding common insertion sites and co-occurring common insertion sites in transposon- and virus-based genetic screens.
Bottom Line:
Insertional mutagenesis screens in mice are used to identify individual genes that drive tumor formation. In these screens, candidate cancer genes are identified if their genomic location is proximal to a common insertion site (CIS) defined by high rates of transposon or retroviral insertions in a given genomic window. Identification of the biologically relevant set of mutations that occur in a single cell and cause tumor progression will aid in the rational design of single and combinatorial therapies in the upcoming age of personalized cancer therapy.
View Article:
PubMed Central - PubMed
Affiliation: Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN 55455, USA. tracy.l.bergemann@medtronic.com
ABSTRACT
Insertional mutagenesis screens in mice are used to identify individual genes that drive tumor formation. In these screens, candidate cancer genes are identified if their genomic location is proximal to a common insertion site (CIS) defined by high rates of transposon or retroviral insertions in a given genomic window. In this article, we describe a new method for defining CISs based on a Poisson distribution, the Poisson Regression Insertion Model, and show that this new method is an improvement over previously described methods. We also describe a modification of the method that can identify pairs and higher orders of co-occurring common insertion sites. We apply these methods to two data sets, one generated in a transposon-based screen for gastrointestinal tract cancer genes and another based on the set of retroviral insertions in the Retroviral Tagged Cancer Gene Database. We show that the new methods identify more relevant candidate genes and candidate gene pairs than found using previous methods. Identification of the biologically relevant set of mutations that occur in a single cell and cause tumor progression will aid in the rational design of single and combinatorial therapies in the upcoming age of personalized cancer therapy.
Mentions: We assume that the number of insertions falling within a defined interval follows a Poisson process; support for this assumption is provided in Section 2.1 of the Supplementary Data. The Poisson probability mass function is $P(X = x) = e^{-\lambda}\lambda^{x}/x!$, where the parameter $\lambda$ is the rate of insertion and $x$ is the number of insertions residing within a given window. The methods in this section assume that all windows within a single model are of the same size. The rate of insertion can account for other important variables, such as the number of TA sites, using a Poisson regression. For individual regions $R_i$, $i = 1, 2, \ldots, n_w$, where $n_w$ is the number of windows of size $w$, the Poisson regression calculates the expected rate of insertion $\lambda_i$ for region $R_i$ using information about the size of the region, the chromosome it resides on, the number of TA sites within the window and the number of potential recoverable insertions. This last variable, the number of potential recoverable insertions, depends upon the restriction enzymes used during linker-mediated PCR (LM-PCR). A transposon insertion in a TA dinucleotide will not be recoverable if the nearest restriction enzyme cut site is too close to or too distant from the TA dinucleotide. The details for determining the TA dinucleotides and potential recoverable insertions (PRIs) in each window are provided in Section 1 of the Supplementary Data. Figure 1 shows that the variation in insertion rate among chromosomes grows as window size increases. These chromosomal differences are accounted for by the coefficient $\beta_c$. The effect of the number of potential recoverable insertions is estimated with $\beta_1$. The TA sites are accounted for with an offset, $\log(\mathrm{TA}_i)$, such that $\lambda_i/\mathrm{TA}_i$ is roughly the number of insertions divided by the number of TA sites. Consistent with these definitions, the PRIM can be written as $\log(\lambda_i) = \beta_c + \beta_1\,\mathrm{PRI}_i + \log(\mathrm{TA}_i)$, where $c = 1, 2, \ldots, 19$ indexes the chromosome on which $R_i$ resides.
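To make the Poisson probability and the role of the TA offset concrete, here is a minimal sketch. It assumes a log link in which log(TA) enters as an offset, so that the expected count in a window is TA × exp(β_c + β_1·PRI). The function names and all coefficient values are hypothetical and are not estimates from the article.

```python
import math

def poisson_pmf(x: int, lam: float) -> float:
    """P(X = x) = exp(-lam) * lam**x / x!, evaluated in log space for stability."""
    if lam <= 0:
        raise ValueError("lam must be positive")
    return math.exp(-lam + x * math.log(lam) - math.lgamma(x + 1))

def expected_rate(beta_c: float, beta_1: float, pri: float, ta: float) -> float:
    """Expected insertions in one window under the assumed log-link form:
    log(lambda) = beta_c + beta_1 * PRI + log(TA)."""
    return ta * math.exp(beta_c + beta_1 * pri)

# Hypothetical values for one window (illustrative only).
lam = expected_rate(beta_c=-4.0, beta_1=0.02, pri=15, ta=100)
prob_three = poisson_pmf(3, lam)  # probability of exactly 3 insertions in that window
```

Because the offset has a fixed coefficient of 1, exp(β_c + β_1·PRI) can be read as an insertion rate per TA site, which is what lets windows with different TA content share one model.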
The resulting fit from this regression provides the expected rate of insertion for a given chromosome, a given number of PRIs and a given number of TA sites within each window. The Poisson regression above can be extended to account for other important variables such as mouse gender, donor concatemer site, or, when analyzing retroviral insertion data, transcription start site.
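As a rough illustration of how such a fit could be obtained, the sketch below estimates a Poisson regression with chromosome-specific intercepts, a PRI covariate and a log(TA) offset by Newton-Raphson. The solver, the function names `solve` and `fit_poisson`, and the six toy windows are assumptions made for this example and are not taken from the article; in practice a standard statistics package would normally be used.

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fit_poisson(X, y, offset, iters=50, tol=1e-10):
    """Newton-Raphson for the model log(lambda_i) = x_i . beta + offset_i."""
    p = len(X[0])
    beta = [0.0] * p
    for _ in range(iters):
        mu = [math.exp(sum(xj * bj for xj, bj in zip(x, beta)) + o)
              for x, o in zip(X, offset)]
        # Score vector and (negated) Hessian of the Poisson log-likelihood.
        grad = [sum((yi - mi) * x[j] for x, yi, mi in zip(X, y, mu))
                for j in range(p)]
        hess = [[sum(mi * x[j] * x[k] for x, mi in zip(X, mu))
                 for k in range(p)] for j in range(p)]
        step = solve(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    return beta

# Hypothetical toy data: six equal-sized windows on two chromosomes.
chrom = [1, 1, 1, 2, 2, 2]
pri = [10, 20, 15, 12, 25, 18]        # potential recoverable insertions
ta = [100, 120, 90, 110, 130, 100]    # TA sites per window
counts = [2, 4, 1, 3, 6, 2]           # observed insertions

# Design matrix: one indicator per chromosome (beta_c) plus PRI (beta_1).
X = [[1.0 if c == 1 else 0.0, 1.0 if c == 2 else 0.0, float(p)]
     for c, p in zip(chrom, pri)]
offset = [math.log(t) for t in ta]

beta = fit_poisson(X, counts, offset)
fitted = [math.exp(sum(xj * bj for xj, bj in zip(x, beta)) + o)
          for x, o in zip(X, offset)]
```

Here `fitted` holds the expected insertion count for each window. With one indicator per chromosome, the fitted counts sum to the observed counts within each chromosome at the maximum-likelihood estimate, which is a convenient sanity check on any implementation.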