New methods for finding common insertion sites and co-occurring common insertion sites in transposon- and virus-based genetic screens.

Bergemann TL, Starr TK, Yu H, Steinbach M, Erdmann J, Chen Y, Cormier RT, Largaespada DA, Silverstein KA - Nucleic Acids Res. (2012)

Bottom Line: Insertional mutagenesis screens in mice are used to identify individual genes that drive tumor formation. We show that the new methods identify more relevant candidate genes and candidate gene pairs than found using previous methods. Identification of the biologically relevant set of mutations that occur in a single cell and cause tumor progression will aid in the rational design of single and combinatorial therapies in the upcoming age of personalized cancer therapy.


Affiliation: Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN 55455, USA. tracy.l.bergemann@medtronic.com

ABSTRACT
Insertional mutagenesis screens in mice are used to identify individual genes that drive tumor formation. In these screens, candidate cancer genes are identified if their genomic location is proximal to a common insertion site (CIS) defined by high rates of transposon or retroviral insertions in a given genomic window. In this article, we describe a new method for defining CISs based on a Poisson distribution, the Poisson Regression Insertion Model, and show that this new method is an improvement over previously described methods. We also describe a modification of the method that can identify pairs and higher orders of co-occurring common insertion sites. We apply these methods to two data sets, one generated in a transposon-based screen for gastrointestinal tract cancer genes and another based on the set of retroviral insertions in the Retroviral Tagged Cancer Gene Database. We show that the new methods identify more relevant candidate genes and candidate gene pairs than found using previous methods. Identification of the biologically relevant set of mutations that occur in a single cell and cause tumor progression will aid in the rational design of single and combinatorial therapies in the upcoming age of personalized cancer therapy.


Figure 1: For various window sizes, the average rate of insertion for each mouse chromosome, computed from the 15 857 insertions in the Starr et al. (10) data set. Conceptually, the rate parameter reflects the number of insertions per window, adjusted for the TA count. Chromosome 1 was dropped from the plot because, for many mice, this was where the donor transposon concatemer resided. All insertions that appeared on the same chromosome as their donor concatemer were removed to eliminate local-hopping artifacts. The local-hopping phenomenon is explained in more detail in Starr et al. (10).
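The two data-handling steps named in the caption can be sketched in a few lines of Python. This is an illustration under assumptions, not the authors' pipeline: the tuple layout `(mouse, chromosome, position)`, the function names and the dictionary of TA counts per window are all hypothetical, and the plotted rate may come from a fitted model rather than the simple average of count divided by TA count used here.

```python
from collections import defaultdict


def drop_local_hops(insertions, donor_chrom_by_mouse):
    """Drop insertions on the same chromosome as that mouse's donor concatemer.

    insertions: list of (mouse, chromosome, position) tuples (assumed layout).
    donor_chrom_by_mouse: dict mapping mouse id to donor chromosome.
    """
    return [
        (mouse, chrom, pos)
        for mouse, chrom, pos in insertions
        if donor_chrom_by_mouse.get(mouse) != chrom
    ]


def mean_rate_by_chrom(insertions, ta_by_window, window_size):
    """Average per chromosome of (insertions in window) / (TA sites in window).

    ta_by_window: dict mapping (chromosome, window index) to TA count.
    Windows with no TA sites are skipped to avoid division by zero.
    """
    counts = defaultdict(int)
    for _mouse, chrom, pos in insertions:
        counts[(chrom, pos // window_size)] += 1

    rates = defaultdict(list)
    for (chrom, idx), ta in ta_by_window.items():
        if ta > 0:
            rates[chrom].append(counts[(chrom, idx)] / ta)
    return {chrom: sum(r) / len(r) for chrom, r in rates.items()}


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    toy = [("m1", 2, 150), ("m1", 1, 50), ("m2", 2, 160), ("m2", 3, 10)]
    donors = {"m1": 1, "m2": 3}
    kept = drop_local_hops(toy, donors)
    print(mean_rate_by_chrom(kept, {(2, 0): 10, (2, 1): 4}, window_size=100))
```

Repeating the second function for several values of `window_size` would give one curve per window size, which is the kind of comparison the figure describes.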


Mentions: The number of insertions that fall within a defined interval is assumed to follow a Poisson process. Support for this assumption is provided in Section 2.1 of Supplementary Data. The Poisson probability distribution function is $P(X = x) = e^{-\lambda}\lambda^{x}/x!$, where the parameter $\lambda$ is the rate of insertion and $x$ is the number of insertions residing within a given window. The methods in this section assume that all windows within a single model are of the same size.
The rate of insertion can account for other important variables, such as the number of TA sites, using a Poisson regression. For individual regions $R_i$, $i = 1, 2, \ldots, n_w$, where $n_w$ is the number of windows of size $w$, the Poisson regression calculates the expected rate of insertion $\lambda_i$ for region $R_i$ using the size of the region, the chromosome it resides on, the number of TA sites within the window and the number of potential recoverable insertions (PRIs). The number of PRIs depends upon the restriction enzymes used during linker-mediated PCR (LM-PCR): a transposon insertion in a TA dinucleotide is not recoverable if the nearest restriction enzyme cut site is too close to or too distant from the TA dinucleotide. The details for determining the TA dinucleotides and PRIs in each window are provided in Section 1 of Supplementary Data.
Figure 1 shows that the insertion rate varies more across chromosomes as window size increases. These chromosomal differences are accounted for by the coefficient $\beta_c$. The effect of the number of PRIs is estimated with $\beta_1$. The TA sites are accounted for with an offset, such that $\lambda_i$ is roughly the number of insertions divided by the number of TA sites. The Poisson Regression Insertion Model (PRIM) is [equation not reproduced here], where $c = 1, 2, \ldots, 19$.
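As a sketch only, a Poisson regression matching the verbal description above (a chromosome coefficient, a PRI coefficient and a TA offset) could be written as follows. The symbols $X_i$, $\mu_i$, $\mathrm{TA}_i$, $\mathrm{PRI}_i$ and $c(i)$ are notation assumed for illustration, and the published parameterization may differ, for example in whether a separate intercept is included.

```latex
% Assumed notation: X_i = observed insertions in window R_i,
% mu_i = expected count, c(i) = chromosome containing R_i.
\begin{align}
  X_i &\sim \mathrm{Poisson}(\mu_i), \\
  \log \mu_i &= \log(\mathrm{TA}_i) + \beta_{c(i)} + \beta_1\,\mathrm{PRI}_i, \\
  \lambda_i &= \frac{\mu_i}{\mathrm{TA}_i}
             = \exp\!\bigl(\beta_{c(i)} + \beta_1\,\mathrm{PRI}_i\bigr).
\end{align}
```

In this form the $\log(\mathrm{TA}_i)$ term enters with a fixed coefficient of 1, which is what makes $\lambda_i$ a rate per TA site rather than a raw count.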
The resulting fit from this regression provides the expected rate of insertion for a given chromosome, a given number of PRIs and a given number of TA sites within each window. The Poisson regression above can be extended to account for other important variables such as mouse gender, donor concatemer site or, when analyzing retroviral insertion data, transcription start site.
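To make the role of the expected rate concrete, the sketch below evaluates the Poisson probability function and an upper-tail probability for one window. It is an illustration under assumptions: the log-link form with a log(TA) offset, the function names and the coefficient values are hypothetical, and using the tail probability to flag a window as a candidate CIS is one plausible reading rather than a procedure stated in the text above.

```python
import math


def poisson_pmf(x, lam):
    """P(X = x) = exp(-lam) * lam**x / x!, computed in log space (lam > 0)."""
    return math.exp(-lam + x * math.log(lam) - math.lgamma(x + 1))


def poisson_upper_tail(x, lam):
    """P(X >= x): probability of x or more insertions in a window.

    Uses 1 - P(X < x); for very small tails this loses precision,
    which is acceptable for a sketch.
    """
    if x <= 0:
        return 1.0
    return max(0.0, 1.0 - sum(poisson_pmf(k, lam) for k in range(x)))


def expected_count(ta_sites, pri, beta_chrom, beta_pri):
    """Expected insertions in a window under an assumed log-link model
    with offset log(ta_sites): TA * exp(beta_chrom + beta_pri * PRI)."""
    return ta_sites * math.exp(beta_chrom + beta_pri * pri)


if __name__ == "__main__":
    # Hypothetical coefficients and window covariates, not fitted values.
    mu = expected_count(ta_sites=1200, pri=800, beta_chrom=-7.0, beta_pri=0.0005)
    observed = 6
    print(f"expected count: {mu:.3f}")
    print(f"P(X >= {observed}): {poisson_upper_tail(observed, mu):.3g}")
```

In practice the coefficients would come from fitting the regression to all windows, and any threshold on the tail probability would need a multiple-testing correction across windows.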

