Available resources and challenges for the clinical annotation of somatic variations.
Bottom Line: There are a number of such resources that have been designed to catalog and mine a plethora of germline variants or mutations. However, when analyzing tumor specimens in clinical settings, one may need to use different or ancillary resources that are specific for somatic variants or actionable mutations that may have clinical or treatment implications. In addition, the current need for collating various annotation sources into one-stop solutions, to facilitate faster query execution and better integration into existing laboratory information systems, is discussed.
Affiliation: Department of Pathology, Virginia Commonwealth University, Richmond, Virginia.
Mentions: Several algorithms are then applied to the raw data to align these short reads to a reference genome, assign read and mapping quality scores, and assess those loci that differ from the reference, called variants.4 These algorithms generate a variant call format (VCF) file,5 which is a generic format for storing DNA variant data such as single-nucleotide polymorphisms (SNPs), multiple nucleotide polymorphisms (MNPs), insertions (INS), and deletions (DEL), together with quality annotations. VCF files contain variants from a range of positions of the reference genome and are usually stored in compressed form. A typical VCF file does not present the information a physician or researcher would need, such as the transcript and/or gene that contains the variant; the effect, if any, on the encoded protein (for example, a synonymous, missense, or nonsense mutation); the likelihood that the variant is pathogenic; and the effect on response to targeted therapies. In contrast to human whole-exome sequencing (WES) or whole-genome sequencing (WGS), which can yield nearly 100,000 or 3,600,000 variants6 per sample, respectively, targeted sequencing of cancer-related gene panels typically yields fewer than 20 variants per tumor sample. Even though the medical genomicist processing such VCF files does not have to filter thousands of variants down to a manageable subset, he or she has the important task of distinguishing medically important or actionable variants from the others and reporting them to the treating physician in a meaningful manner (Fig. 1). Ideally, this filtering would be a simple operation of intersecting a VCF file with a comprehensive reference database of medically annotated variants.
However, such a resource does not yet exist in a publicly available format, and the medical genomicist has to manually mine a plethora of publicly available and expert-curated databases focused on human variant information, such as the Human Gene Mutation Database,7 the National Center for Biotechnology Information Short Genetic Variations database,8 and the Online Mendelian Inheritance in Man (OMIM; http://omim.org/), among others.9
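To make the idealized "intersection" step concrete, the following is a minimal Python sketch and not a description of any existing tool or database. It assumes a hypothetical in-memory reference table (`REFERENCE`) keyed by chromosome, position, reference allele, and alternate allele; the variants and annotation strings in it are placeholders invented for illustration, and the function names are likewise assumptions.

```python
import io
from typing import Dict, Iterator, List, TextIO, Tuple

# A variant is identified here by (chromosome, position, REF allele, ALT allele).
VariantKey = Tuple[str, int, str, str]

# Hypothetical curated reference table. The entries are placeholders,
# not real clinical annotations.
REFERENCE: Dict[VariantKey, str] = {
    ("chr1", 1000, "A", "T"): "placeholder annotation 1",
    ("chr2", 2000, "G", "A"): "placeholder annotation 2",
}

# A tiny made-up VCF body: meta line, header line, three records.
EXAMPLE_VCF = "\n".join([
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t1000\t.\tA\tT\t50\tPASS\t.",
    "chr2\t2000\t.\tG\tC,A\t60\tPASS\t.",  # two ALT alleles on one line
    "chr3\t3000\t.\tT\tG\t40\tPASS\t.",
])


def iter_vcf_variants(handle: TextIO) -> Iterator[VariantKey]:
    """Yield one key per ALT allele from the tab-separated VCF records."""
    for line in handle:
        # Skip blank lines, "##" meta lines, and the "#CHROM" header line.
        if not line.strip() or line.startswith("#"):
            continue
        chrom, pos, _id, ref, alt = line.rstrip("\n").split("\t")[:5]
        for allele in alt.split(","):
            if allele != ".":  # "." means no alternate allele was called
                yield (chrom, int(pos), ref, allele)


def annotate(handle: TextIO,
             reference: Dict[VariantKey, str]) -> List[Tuple[VariantKey, str]]:
    """Return the variants from the VCF that appear in the reference table."""
    return [(key, reference[key])
            for key in iter_vcf_variants(handle)
            if key in reference]


hits = annotate(io.StringIO(EXAMPLE_VCF), REFERENCE)
for key, note in hits:
    print(key, note)
```

For a gzip-compressed file, the same functions could be given a handle from `gzip.open(path, "rt")`. An exact-match lookup of this kind is only a starting point: the same insertion or deletion can be written in more than one way, so variants would usually need to be normalized before such a comparison is reliable.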