PlasmoGEM, a database supporting a community resource for large-scale experimental genetics in malaria parasites.
Bottom Line: They can be used for a wide array of applications, including protein localisation, gene interaction studies and high-throughput genetic screens. The PlasmoGEM web interface allows users to search a database of finished knock-out and gene tagging vectors, view details of their designs, download vector sequence in different formats and view available quality control data as well as suggested genotyping strategies. We also make gDNA library clones and intermediate vectors available for researchers to produce vectors for themselves.
Affiliation: Wellcome Trust Sanger Institute, Hinxton Cambridge, CB10 1SA, UK.
Mentions: The highly AT-rich P. berghei genome contains many repetitive sequences and long homopolymeric tracts of adenine and thymine nucleotides, which accumulate mutations at an increased rate when propagated in E. coli. If introduced into the parasite genome by the long homology arms of recombineered vectors, some of these mutations could inadvertently modify neighbouring genes. In the worst case (and in the absence of genetic complementation experiments) this may lead to a phenotype being attributed to the wrong gene. The homology arms and barcodes of vectors are therefore verified at the end of the production pipeline by sequencing four clones per design on an Illumina MiSeq instrument (150 bp paired-end reads from 400–600 bp fragments (11)). This approach proved economical because production plates are laid out to eliminate overlap between the homology arms of different vectors, so that reads from a sequencing library encompassing an entire plate can be mapped back unambiguously to individual vectors. We typically sequence four colonies from each of five plates together, using different Illumina index codes. Read mapping and alignment processing use the open-source tools SMALT (https://www.sanger.ac.uk/resources/software/smalt/) and SAMtools (12). Mutations are called by a module of our custom software suite. Only high-quality alignments (quality score ≥ 20) within the expected insert-size range are used for the analysis. Point mutations and short insertions or deletions, which most often occur in long homopolymeric tracts of A or T nucleotides, are called when at least 25% of the aligned reads in the Binary Alignment/Map (BAM) files (12) carry the mutation. Larger mutations, such as loss of a homology arm or structural rearrangements, are detected as drops in read coverage to below 30% of the average depth (Figure 3).
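The three thresholds described above (alignment quality ≥ 20, at least 25% of reads supporting a small mutation, and coverage below 30% of the average depth) can be illustrated with a minimal Python sketch. This is not the PlasmoGEM software, which is a custom suite whose code is not shown here. All function and class names are hypothetical. The sketch also assumes that the quality score is a mapping quality, that the expected insert-size range is the 400–600 bp fragment size, and that per-position read counts have already been extracted from the BAM files.

```python
from dataclasses import dataclass
from statistics import mean

# Thresholds taken from the text.
MIN_QUALITY = 20               # minimum alignment quality score
MIN_VARIANT_FRACTION = 0.25    # fraction of aligned reads carrying a mutation
LOW_COVERAGE_FRACTION = 0.30   # fraction of the average depth

# Assumption: the "expected insert-size range" equals the 400-600 bp fragments.
MIN_INSERT, MAX_INSERT = 400, 600


@dataclass
class PositionCounts:
    """Hypothetical per-position summary of filtered reads."""
    position: int
    depth: int           # reads covering the position after filtering
    variant_reads: int   # reads supporting a point mutation or short indel


def passes_filters(quality: int, insert_size: int) -> bool:
    """Keep only high-quality alignments within the expected insert-size range."""
    return quality >= MIN_QUALITY and MIN_INSERT <= abs(insert_size) <= MAX_INSERT


def call_small_variants(counts: list[PositionCounts]) -> list[int]:
    """Return positions where at least 25% of aligned reads carry the mutation."""
    return [
        c.position
        for c in counts
        if c.depth > 0 and c.variant_reads / c.depth >= MIN_VARIANT_FRACTION
    ]


def low_coverage_regions(depths: list[int]) -> list[tuple[int, int]]:
    """Return inclusive (start, end) index ranges where depth is below
    30% of the average depth across the vector."""
    if not depths:
        return []
    threshold = LOW_COVERAGE_FRACTION * mean(depths)
    regions, start = [], None
    for i, depth in enumerate(depths):
        if depth < threshold:
            if start is None:
                start = i          # open a new low-coverage run
        elif start is not None:
            regions.append((start, i - 1))
            start = None
    if start is not None:          # run extends to the end of the sequence
        regions.append((start, len(depths) - 1))
    return regions
```

For example, a depth profile with a sharp dip in the middle, such as `[100] * 10 + [10] * 5 + [100] * 5`, yields one flagged region covering the five low-depth positions. A dip of this kind would be consistent with a lost homology arm or a rearrangement.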