Serine integrase chimeras with activity in E. coli and HeLa cells.

Farruggio AP, Calos MP - Biol Open (2014)

Bottom Line: In order to generate information that might be useful for altering the specificity of serine integrases and to improve their efficiency, we tested a hybridization strategy that has been successful with several small serine recombinases. Our work is the first to demonstrate chimeric serine integrase activity. This analysis sheds light on integrase structure and function, and establishes a potentially tractable means to probe the specificity of the thousands of putative large serine recombinases that have been revealed by bioinformatics studies.


Affiliation: Department of Genetics, Stanford University School of Medicine, 300 Pasteur Drive, Stanford, CA 94305-5120, USA.


Histogram of serine recombinase lengths. All proteins with an InterPro (Hunter et al., 2012) serine recombinase catalytic domain (IPR006119; 35,076 entries) were clustered (13,019 clusters). The mean protein length of each cluster was computed, and the distribution of these lengths is presented here as a histogram. The list of putative serine recombinases was assembled with a custom script that scanned the entire InterPro “Protein matched complete” XML flatfile (∼75 GiB uncompressed; downloaded on Feb. 14, 2014) for proteins with an IPR006119 domain (35,076 proteins found). Protein sequences were downloaded from UniProt (UniProt Consortium, 2012) and were validated via CRC64 checksum comparison with InterPro. CD-HIT (Li and Godzik, 2006) version 4.6.1 was used to perform the clustering with the following parameters: 95% identity cutoff, 95% size cutoff, five character word size. Because the smallest characterized serine integrases (A118 and U153, accession numbers Q9T193 and Q8LTD8, respectively) are both 452 residues in length, we estimate that there are at least 4,000 unique putative large serine recombinases in the InterPro database as of February 14, 2014.
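The per-cluster step described in the caption (mean protein length of each cluster, then a count of clusters at or above the 452-residue mark) can be sketched as follows. This is a minimal illustration, not the authors' script: it assumes the clustering result is available in the plain-text ".clstr" layout that CD-HIT writes, the sample records and their identifiers are made up, and the function names `cluster_mean_lengths` and `count_large` are hypothetical. Whether the authors counted clusters by mean length in exactly this way is also an assumption.

```python
import re
from statistics import mean

# Made-up sample in the CD-HIT ".clstr" layout (identifiers are placeholders,
# not real accessions): one ">Cluster N" header per cluster, followed by one
# line per member giving its length in residues.
SAMPLE_CLSTR = """\
>Cluster 0
0\t452aa, >seqA... *
1\t460aa, >seqB... at 96.00%
>Cluster 1
0\t183aa, >seqC... *
>Cluster 2
0\t605aa, >seqD... *
1\t613aa, >seqE... at 97.10%
"""

# Matches a member line and captures the sequence length before "aa,".
MEMBER_RE = re.compile(r"^\d+\s+(\d+)aa,")


def cluster_mean_lengths(clstr_text):
    """Return the mean member length of each cluster, in file order."""
    means = []
    lengths = []
    for line in clstr_text.splitlines():
        if line.startswith(">Cluster"):
            # A new header closes the previous cluster, if any.
            if lengths:
                means.append(mean(lengths))
            lengths = []
        else:
            match = MEMBER_RE.match(line)
            if match:
                lengths.append(int(match.group(1)))
    if lengths:
        means.append(mean(lengths))
    return means


def count_large(means, min_length=452):
    """Count clusters whose mean length is at least min_length residues.

    452 is the length of the smallest characterized serine integrases
    cited in the caption; using it as a cutoff here is an assumption.
    """
    return sum(1 for value in means if value >= min_length)


if __name__ == "__main__":
    means = cluster_mean_lengths(SAMPLE_CLSTR)
    print(means, count_large(means))
```

The list returned by `cluster_mean_lengths` is the quantity the histogram plots; passing it to any plotting routine with fixed-width length bins would reproduce the general shape of the figure for a real clustering file.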

Mentions: Serine integrases mediate recombination between two distinct ∼50 bp phage and bacterial sequences named attP and attB, respectively (Brown et al., 2011; Smith et al., 2010). Without assistance from other proteins, the reaction proceeds in a unidirectional manner to produce the left and right attachment sites – attL and attR (Smith et al., 2010). Because they are ∼200–350 residues larger than the small serine recombinases (Fig. 1), serine integrases are classified as members of the large serine recombinase sub-family (Smith and Thorpe, 2002). All serine integrases characterized to date appear to consist of an ∼120 amino acid N-terminal domain that is connected via a ∼30 residue alpha-helix to a ∼300–450 amino acid C-terminal domain (supplementary material Table S1). The N-terminal domain is principally involved in catalysis, but also imparts some sequence specificity, and the C-terminal domain appears to be primarily responsible for DNA-binding and directionality control (Ghosh et al., 2005; Gordley et al., 2007; Mandali et al., 2013; McEwan et al., 2009; Rowley et al., 2008). At present, there are thousands of putative large serine recombinases in the sequence databases (Fig. 1).

