A compatible exon-exon junction database for the identification of exon skipping events using tandem mass spectrum data.

Mo F, Hong X, Gao F, Du L, Wang J, Omenn GS, Lin B - BMC Bioinformatics (2008)

Bottom Line: It is estimated that about 74% of multi-exon human genes undergo alternative splicing. We wrote scripts using Perl, BioPerl, MySQL and the Ensembl API and built a theoretical exon-exon junction protein database from the Ensembl Core Database that accounts for all possible combinations of exons for a gene while keeping the frame of translation (i.e., keeping only in-phase exon-exon combinations). This database will be useful in annotating genome structures using rapidly accumulating proteomics data.


Affiliation: Systems Biology Division, Zhejiang-California Nanosystems Institute (ZCNI) of Zhejiang University, Zhejiang University Huajiachi Campus, Hangzhou, PR China. mofan.hz@gmail.com

ABSTRACT

Background: Alternative splicing is an important gene regulation mechanism. It is estimated that about 74% of multi-exon human genes undergo alternative splicing. High-throughput tandem mass spectrometry (MS/MS) provides valuable information for rapidly identifying potentially novel alternatively spliced protein products from experimental datasets. However, the ability to identify alternative splicing events through tandem mass spectrometry depends on the database against which the spectra are searched.

Results: We wrote scripts using Perl, BioPerl, MySQL and the Ensembl API and built a theoretical exon-exon junction protein database from the Ensembl Core Database that accounts for all possible combinations of exons for a gene while keeping the frame of translation (i.e., keeping only in-phase exon-exon combinations). Using our liver cancer MS/MS dataset, we identified a total of 488 non-redundant peptides that represent putative exon skipping events.

Conclusion: Our exon-exon junction database provides the scientific community with an efficient means to identify novel alternatively spliced (exon skipping) protein isoforms using mass spectrometry data. This database will be useful in annotating genome structures using rapidly accumulating proteomics data.
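
The in-phase filter described in the Results can be illustrated with a minimal Python sketch. It is not the authors' Perl pipeline. It assumes each exon carries a start phase and an end phase in the style of Ensembl exon annotation, and that joining an upstream exon to a downstream exon keeps the reading frame when the upstream end phase equals the downstream start phase. The `Exon` class, the `in_phase_junctions` function and the example phases are hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Exon:
    """Hypothetical exon record; phases follow an Ensembl-style 0/1/2 convention."""
    name: str
    phase: int      # codon position at which this exon starts
    end_phase: int  # codon position at which the following exon must start


def in_phase_junctions(exons, skipping_only=True):
    """Return (upstream, downstream) exon name pairs that keep the frame.

    exons must be listed in transcript order. With skipping_only=True,
    directly adjacent exons are ignored, so every returned pair skips
    at least one exon.
    """
    pairs = []
    for (i, up), (j, down) in combinations(enumerate(exons), 2):
        if skipping_only and j == i + 1:
            continue  # adjacent exons form the canonical junction, not a skip
        if up.end_phase == down.phase:  # assumed in-phase criterion
            pairs.append((up.name, down.name))
    return pairs


# Made-up four-exon gene used only to exercise the function.
exons = [
    Exon("E1", 0, 1),
    Exon("E2", 1, 0),
    Exon("E3", 0, 1),
    Exon("E4", 1, 2),
]
junctions = in_phase_junctions(exons)
all_junctions = in_phase_junctions(exons, skipping_only=False)
```

In this toy gene only the E1-E4 junction survives the filter as an exon skipping candidate; in a real database build, the peptide sequence spanning each retained junction would then be translated and stored for searching.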

Figure 3: Flowchart of our pipeline for identifying exon skipping forms using MS/MS data.

Mentions: We used two liver mass spectrometry datasets to perform X!Tandem and SEQUEST searches against the constructed putative exon-exon junction protein database. The two datasets (one from cancer tissue, the other from normal tissue) were generated using multidimensional liquid chromatography (MDLC) coupled to an LTQ-Orbitrap (Thermo Scientific Inc.) mass spectrometer. We converted all RAW mass spectra to mzXML files (160 for liver cancer tissues and 161 for normal adjacent tissues) with ReAdW, then analyzed them using the X!Tandem open source protein identification program [15] or TurboSEQUEST (Bioworks version 3.2, Thermo Electron) [16] (Figure 3). The same search parameters were used for X!Tandem and SEQUEST. Searches were performed using a fragment monoisotopic mass error tolerance of 400 ppm and a parent monoisotopic mass tolerance of +/- 10 ppm. In the search parameters, we included one static post-translational modification for cysteine (+57.022 daltons) and four optional post-translational modifications (+16 daltons for methionine and +80 daltons for serine, threonine and tyrosine). Trypsin was used as the proteolytic cleavage enzyme, and one missed cleavage site was allowed.
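
To make the search settings above concrete, the following Python sketch shows how a ppm mass tolerance and a tryptic digest with one missed cleavage can be computed. It is an illustration only and does not reproduce X!Tandem or SEQUEST internals. The function names and the `SEARCH_PARAMS` dictionary are hypothetical, the numeric values are copied from the paragraph above, and the rule that trypsin does not cleave before proline is a common convention assumed here rather than something stated in the source.

```python
# Search settings as quoted in the text (masses in daltons, tolerances in ppm).
SEARCH_PARAMS = {
    "parent_tol_ppm": 10.0,
    "fragment_tol_ppm": 400.0,
    "static_mods": {"C": 57.022},
    "optional_mods": {"M": 16.0, "S": 80.0, "T": 80.0, "Y": 80.0},
    "missed_cleavages": 1,
}


def within_ppm(observed, theoretical, tol_ppm):
    """True if the observed mass lies within tol_ppm of the theoretical mass."""
    return abs(observed - theoretical) / theoretical * 1e6 <= tol_ppm


def cleavage_fragments(protein):
    """Split after K or R, except when the next residue is P (assumed rule)."""
    fragments, start = [], 0
    for i, residue in enumerate(protein):
        next_residue = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and next_residue != "P":
            fragments.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        fragments.append(protein[start:])  # C-terminal remainder
    return fragments


def tryptic_peptides(protein, missed_cleavages=1):
    """List peptides containing up to missed_cleavages internal cleavage sites."""
    fragments = cleavage_fragments(protein)
    peptides = []
    for i in range(len(fragments)):
        for extra in range(missed_cleavages + 1):
            if i + extra < len(fragments):
                peptides.append("".join(fragments[i:i + extra + 1]))
    return peptides


# Made-up sequence used only to exercise the functions.
peptides = tryptic_peptides("AKRPGRM", SEARCH_PARAMS["missed_cleavages"])
```

With a 10 ppm parent tolerance, a precursor near 1000 daltons must match its theoretical mass to within about 0.01 daltons, which is why a high-accuracy instrument such as the LTQ-Orbitrap is paired with such a narrow window.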

