Automated genome mining of ribosomal peptide natural products.

Mohimani H, Kersten RD, Liu WT, Wang M, Purvine SO, Wu S, Brewer HM, Pasa-Tolic L, Bandeira N, Moore BS, Pevzner PA, Dorrestein PC - ACS Chem. Biol. (2014)

Bottom Line: Ribosomally synthesized and posttranslationally modified peptides (RiPPs), especially from microbial sources, are a large group of bioactive natural products that are a promising source of new (bio)chemistry and bioactivity [1]. In light of exponentially increasing microbial genome databases and improved mass spectrometry (MS)-based metabolomic platforms, there is a need for computational tools that connect natural product genotypes predicted from microbial genome sequences with their corresponding chemotypes from metabolomic data sets. RiPPquest uses genomics to limit search space to the vicinity of RiPP biosynthetic genes and proteomics to analyze extensive peptide modifications and compute p-values of peptide-spectrum matches (PSMs). The presented tool is available at cyclo.ucsd.edu.


Affiliation: Department of Electrical and Computer Engineering, University of California San Diego, La Jolla, California 92093, United States.

ABSTRACT
Ribosomally synthesized and posttranslationally modified peptides (RiPPs), especially from microbial sources, are a large group of bioactive natural products that are a promising source of new (bio)chemistry and bioactivity [1]. In light of exponentially increasing microbial genome databases and improved mass spectrometry (MS)-based metabolomic platforms, there is a need for computational tools that connect natural product genotypes predicted from microbial genome sequences with their corresponding chemotypes from metabolomic data sets. Here, we introduce RiPPquest, a tandem mass spectrometry database search tool for identification of microbial RiPPs, and apply it to lanthipeptide discovery. RiPPquest uses genomics to limit search space to the vicinity of RiPP biosynthetic genes and proteomics to analyze extensive peptide modifications and compute p-values of peptide-spectrum matches (PSMs). We highlight RiPPquest by connecting multiple RiPPs from extracts of Streptomyces to their gene clusters and by the discovery of a new class III lanthipeptide from Streptomyces viridochromogenes DSM 40736. We named it informatipeptin to reflect that it is a natural product discovered by mass spectrometry based genome mining using algorithmic tools rather than manual inspection of mass spectrometry data and genetic information. The presented tool is available at cyclo.ucsd.edu.


Figure 1: Workflow implemented in the RiPPquest algorithm for automated peptidogenomics of RiPPs. (a) Prediction of lanthipeptide gene clusters in microbial genome sequence. (b) Generation of 10 kb windows centered at LANC-domain of gene clusters. (c) Prediction of ORFs in each gene cluster. (d) Selection of all candidate precursor peptide ORFs <100 aa. (e) Generation of candidate core peptides via C-terminal half of each selected ORF. (f) Generation of all biosynthetic and gas phase products of each core peptide, exemplified by peptide TFCRS. (g) Generation of MS/MS peptide database of predicted lanthipeptide products. (h) MS/MS analysis of microbial extract. (i) Matching of MS/MS data with MS/MS lanthipeptide spectral database with computed p-values. (j) Molecular network analysis of MS/MS data to identify peptide homologues and to confirm PSMs.
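
Steps (d) to (f) of the workflow can be illustrated with a short Python sketch. It is not the RiPPquest implementation. The function names `candidate_core` and `dehydration_masses` are hypothetical, and the only modification modeled is the dehydration of Ser/Thr residues (each loses one water, 18.010565 Da), which is a simplifying assumption; the caption refers to "all biosynthetic and gas phase products" without listing them. ORFs are assumed to be given as amino acid strings without a stop symbol.

```python
from typing import List, Optional

# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565          # monoisotopic mass of H2O (Da)
MAX_PRECURSOR_LEN = 100    # step (d): keep ORFs shorter than 100 aa


def candidate_core(orf: str) -> Optional[str]:
    """Step (e): return the C-terminal half of a short ORF, else None."""
    if len(orf) >= MAX_PRECURSOR_LEN:
        return None
    # For odd lengths the extra residue goes to the core (an assumption).
    return orf[len(orf) // 2:]


def dehydration_masses(core: str) -> List[float]:
    """Step (f), simplified: neutral masses for 0..n dehydrations,
    where n is the number of Ser/Thr residues in the core peptide."""
    unmodified = sum(RESIDUE_MASS[aa] for aa in core) + WATER
    n_sites = sum(aa in "ST" for aa in core)
    return [unmodified - k * WATER for k in range(n_sites + 1)]


if __name__ == "__main__":
    core = candidate_core("MKAATFCRS")   # toy ORF ending in the caption's TFCRS
    for k, mass in enumerate(dehydration_masses(core)):
        print(f"{core} with {k} dehydration(s): {mass:.4f} Da")
```

A real database would additionally need ring formation, other tailoring reactions, and gas phase products, so the mass list here covers only a small part of the search space described in step (g).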

Mentions: A recent study introduced a peptidogenomic approach for the rapid characterization of RiPPs by MS-guided genome mining [11]. In that approach, de novo tandem MS sequence tags of RiPPs were obtained by manual MS/MS analysis and searched against the 6-frame translation of a target genome under consideration of biosynthetic and gas phase modifications of target core peptide sequences. The manual peptidogenomic approach relies on characterization of long MS/MS sequence tags of 4–5 amino acids (aa) and a reduction of the search space in the 6-frame translation of a microbial genome by consideration of only <100 aa-long ORFs. Since long sequence tags are often not present in macrocyclic RiPPs such as lanthipeptides and since manual peptidogenomic analysis of large metabolomic LC-MS/MS data sets is limited, we implemented the peptidogenomic approach in an MS/MS database search tool, called RiPPquest, with computation of statistical significance for identified Peptide-Spectrum Matches (PSMs) to enable analysis of larger data sets (Figure 1). To overcome the weaknesses of proteomic MS/MS database tools in identifying RiPPs, RiPPquest was specialized in peptide database generation and peptide-spectrum matching to connect lanthipeptide MS/MS data with lanthipeptide gene clusters in microbial genomes.
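
The search space reduction described above (6-frame translation, then keeping only ORFs shorter than 100 aa) can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' ORF predictor: an ORF is taken to be any stop-to-stop stretch in a translated frame, the standard codon assignments are used, and the `min_len` cutoff and all function names are hypothetical.

```python
from itertools import product

# Standard codon assignments, listed in TCAG order (third base varies fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna: str) -> str:
    """Translate complete codons; unknown codons (e.g. with N) become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna: str) -> list:
    """Three forward frames followed by three reverse-complement frames."""
    dna = dna.upper()
    strands = (dna, reverse_complement(dna))
    return [translate(s[offset:]) for s in strands for offset in range(3)]


def short_orfs(dna: str, min_len: int = 10, max_len: int = 100) -> list:
    """Stop-to-stop stretches with min_len <= length < max_len (in aa)."""
    found = []
    for frame in six_frame_translations(dna):
        for segment in frame.split("*"):
            if min_len <= len(segment) < max_len:
                found.append(segment)
    return found


if __name__ == "__main__":
    toy = "ATGACCTTCTGCCGCAGCTAA"   # toy sequence encoding M-T-F-C-R-S-stop
    print(short_orfs(toy, min_len=6))
```

Splitting on stop codons keeps the sketch short but ignores start codons, so it over-generates candidates; a dedicated gene finder would be the more realistic choice for step (c) of Figure 1.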

