A draft map of the human proteome.

Kim MS, Pinto SM, Getnet D, Nirujogi RS, Manda SS, Chaerkady R, Madugundu AK, Kelkar DS, Isserlin R, Jain S, Thomas JK, Muthusamy B, Leal-Rojas P, Kumar P, Sahasrabuddhe NA, Balakrishnan L, Advani J, George B, Renuse S, Selvan LD, Patil AH, Nanjappa V, Radhakrishnan A, Prasad S, Subbannayya T, Raju R, Kumar M, Sreenivasamurthy SK, Marimuthu A, Sathe GJ, Chavan S, Datta KK, Subbannayya Y, Sahu A, Yelamanchi SD, Jayaram S, Rajagopalan P, Sharma J, Murthy KR, Syed N, Goel R, Khan AA, Ahmad S, Dey G, Mudgal K, Chatterjee A, Huang TC, Zhong J, Wu X, Shaw PG, Freed D, Zahari MS, Mukherjee KK, Shankar S, Mahadevan A, Lam H, Mitchell CJ, Shankar SK, Satishchandra P, Schroeder JT, Sirdeshmukh R, Maitra A, Leach SD, Drake CG, Halushka MK, Prasad TS, Hruban RH, Kerr CL, Bader GD, Iacobuzio-Donahue CA, Gowda H, Pandey A - Nature (2014)

Bottom Line: However, an equivalent map for the human proteome with direct measurements of proteins and peptides does not exist yet. In-depth proteomic profiling of 30 histologically normal human samples, including 17 adult tissues, 7 fetal tissues and 6 purified primary haematopoietic cells, resulted in identification of proteins encoded by 17,294 genes accounting for approximately 84% of the total annotated protein-coding genes in humans. A unique and comprehensive strategy for proteogenomic analysis enabled us to discover a number of novel protein-coding regions, which include translated pseudogenes, non-coding RNAs and upstream open reading frames.


Affiliation: [1] McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, Maryland 21205, USA [2] Department of Biological Chemistry, Johns Hopkins University School of Medicine, Baltimore, Maryland 21205, USA.

ABSTRACT
The availability of the human genome sequence has transformed biomedical research over the past decade. However, an equivalent map for the human proteome with direct measurements of proteins and peptides does not exist yet. Here we present a draft map of the human proteome using high-resolution Fourier-transform mass spectrometry. In-depth proteomic profiling of 30 histologically normal human samples, including 17 adult tissues, 7 fetal tissues and 6 purified primary haematopoietic cells, resulted in identification of proteins encoded by 17,294 genes accounting for approximately 84% of the total annotated protein-coding genes in humans. A unique and comprehensive strategy for proteogenomic analysis enabled us to discover a number of novel protein-coding regions, which include translated pseudogenes, non-coding RNAs and upstream open reading frames. This large human proteome catalogue (available as an interactive web-based resource at http://www.humanproteomemap.org) will complement available human genome and transcriptome data to accelerate biomedical research in health and disease.



Figure 2: Landscape of the normal human proteome. a, Tissue-supervised hierarchical clustering reveals the landscape of gene expression across the analyzed cells and tissues. Selected tissue-restricted genes are highlighted in boxes to show some well-studied (black) as well as hypothetical proteins of unknown function (red). The color key indicates the normalized spectral counts per gene detected across the tissues. b, A heat map showing tissue expression of fetal tissue-restricted genes ordered by average expression across fetal tissues (left) and a zoom-in of the top 40 most abundant genes (right). The color key indicates the spectral counts per gene. c, An ROC curve showing a comparison of the performance of the current dataset (blue, area under the curve = 0.762) with 111 individual gene expression datasets (orange) and a composite of the 111 individual datasets (red, area under the curve = 0.692). d, Developmental stage-specific differential expression of protein complexes in fetal and adult liver tissues. Heat map shows protein complexes with less than or equal to half of their subunits expressed in one of the tissue types. The darker the color, the greater the number of expressed subunits.

Mentions: We used a label-free method based on spectral counting to quantitate protein expression across cells/tissues. Although more variable than label-based methods, this method is readily applicable to analysis of a large number of samples [8] and has been shown to be reproducible [20]. Supervised hierarchical clustering showed proteins encoded by some genes to be expressed in only a few cells/tissues, while others were more broadly expressed (Fig. 2a). Some proteins detected in only one sample were encoded by well-known genes like CD19 in B cells, SCN1A in frontal cortex and GNAT1 in the retina, while others were encoded by ill-characterized genes. For example, C8orf46 was expressed in adult frontal cortex while C9orf9 was expressed in adult ovary and testis. Overall, we detected proteins encoded by 1,537 genes in only one of the 30 human samples examined in this study (Extended Data Fig. 2c). These may or may not be tissue-specific genes because of the limit of detection of mass spectrometry and because this analysis did not sample every human cell or tissue type. Because antibody-based detection methods can be more sensitive, we performed Western blotting experiments to confirm the tissue-restricted nature of expression of some proteins against which appropriate antibodies were available. Of 32 proteins tested, eight proteins exhibited tissue-specific expression in agreement with mass spectrometry-derived data (Extended Data Fig. 3a). Four proteins exhibited a more widespread expression, although in each of these cases extra bands were detected (Extended Data Fig. 3b). In eighteen cases, the antibody did not recognize a protein in the expected size range at all, while no band was detectable in the remaining two cases.
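As a rough illustration of the spectral-counting idea described above, the following minimal Python sketch flags genes detected in exactly one sample. It is not the authors' pipeline: the counts are invented toy values, ACTB is added only as a hypothetical broadly expressed gene, the per-sample normalization (dividing by the sample's total count) is an assumed scheme, and the function names and the `min_count` threshold are illustrative.

```python
from typing import Dict

# Hypothetical toy data: spectral counts per gene per sample.
# These numbers are invented for illustration and are NOT from the paper.
SPECTRAL_COUNTS: Dict[str, Dict[str, int]] = {
    "CD19":  {"B cells": 12, "Frontal cortex": 0,  "Retina": 0},
    "SCN1A": {"B cells": 0,  "Frontal cortex": 7,  "Retina": 0},
    "GNAT1": {"B cells": 0,  "Frontal cortex": 0,  "Retina": 21},
    "ACTB":  {"B cells": 88, "Frontal cortex": 93, "Retina": 79},
}


def normalize_by_sample(
    counts: Dict[str, Dict[str, int]]
) -> Dict[str, Dict[str, float]]:
    """Divide each count by its sample's total (an assumed normalization)."""
    totals: Dict[str, int] = {}
    for per_sample in counts.values():
        for sample, n in per_sample.items():
            totals[sample] = totals.get(sample, 0) + n
    return {
        gene: {
            sample: (n / totals[sample] if totals[sample] else 0.0)
            for sample, n in per_sample.items()
        }
        for gene, per_sample in counts.items()
    }


def single_sample_genes(
    counts: Dict[str, Dict[str, int]], min_count: int = 1
) -> Dict[str, str]:
    """Map each gene detected in exactly one sample to that sample.

    A gene counts as detected when its spectral count is >= min_count
    (a hypothetical threshold chosen for this sketch).
    """
    restricted: Dict[str, str] = {}
    for gene, per_sample in counts.items():
        detected = [s for s, n in per_sample.items() if n >= min_count]
        if len(detected) == 1:
            restricted[gene] = detected[0]
    return restricted


if __name__ == "__main__":
    for gene, sample in single_sample_genes(SPECTRAL_COUNTS).items():
        print(f"{gene}: detected only in {sample}")
```

As the text cautions, a gene flagged this way is only a candidate for tissue-restricted expression: a zero count may reflect the detection limit of mass spectrometry or an unsampled tissue rather than true absence.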

