Comprehensive comparison of large-scale tissue expression datasets.

Santos A, Tsafou K, Stolte C, Pletscher-Frankild S, O'Donoghue SI, Jensen LJ - PeerJ (2015)

Bottom Line: Several high-throughput technologies have been used to map out which proteins are expressed in which tissues; however, the data have not previously been systematically compared and integrated. We present a comprehensive evaluation of tissue expression data from a variety of experimental techniques and show that these agree surprisingly well with each other and with results from literature curation and text mining. We further found that most datasets support the assumed but not demonstrated distinction between tissue-specific and ubiquitous expression.


Affiliation: Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark.

ABSTRACT
For tissues to carry out their functions, they rely on the right proteins to be present. Several high-throughput technologies have been used to map out which proteins are expressed in which tissues; however, the data have not previously been systematically compared and integrated. We present a comprehensive evaluation of tissue expression data from a variety of experimental techniques and show that these agree surprisingly well with each other and with results from literature curation and text mining. We further found that most datasets support the assumed but not demonstrated distinction between tissue-specific and ubiquitous expression. By developing comparable confidence scores for all types of evidence, we show that it is possible to improve both quality and coverage by combining the datasets. To facilitate use and visualization of our work, we have developed the TISSUES resource (http://tissues.jensenlab.org), which makes all the scored and integrated data available through a single user-friendly web interface.




fig-1: Summary of the tissues and number of proteins present in each dataset. For our analyses, we mapped 9 datasets to 21 major tissues of interest. This figure shows which datasets cover which of these major tissues and how many proteins each dataset identified.

Mentions: We here present the first comparative evaluation of the quality of tissue associations from a variety of datasets and experimental methods, as well as from manual curation (The UniProt Consortium, 2014) and automatic text mining of the biomedical literature (Fig. 1). We show that these datasets, despite the technological differences, agree surprisingly well with each other and can be combined to improve quality and coverage. Finally, as a result of the integration process, we have developed the TISSUES resource (http://tissues.jensenlab.org), which makes the above-mentioned heterogeneous data more easily accessible to researchers by collecting them in a single place and assigning confidence scores.

