Comprehensive comparison of large-scale tissue expression datasets.

Santos A, Tsafou K, Stolte C, Pletscher-Frankild S, O'Donoghue SI, Jensen LJ - PeerJ (2015)

Bottom Line: Several high-throughput technologies have been used to map out which proteins are expressed in which tissues; however, the data have not previously been systematically compared and integrated. We present a comprehensive evaluation of tissue expression data from a variety of experimental techniques and show that these agree surprisingly well with each other and with results from literature curation and text mining. We further found that most datasets support the assumed but not demonstrated distinction between tissue-specific and ubiquitous expression.


Affiliation: Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark.

ABSTRACT
For tissues to carry out their functions, they rely on the right proteins to be present. Several high-throughput technologies have been used to map out which proteins are expressed in which tissues; however, the data have not previously been systematically compared and integrated. We present a comprehensive evaluation of tissue expression data from a variety of experimental techniques and show that these agree surprisingly well with each other and with results from literature curation and text mining. We further found that most datasets support the assumed but not demonstrated distinction between tissue-specific and ubiquitous expression. By developing comparable confidence scores for all types of evidence, we show that it is possible to improve both quality and coverage by combining the datasets. To facilitate use and visualization of our work, we have developed the TISSUES resource (http://tissues.jensenlab.org), which makes all the scored and integrated data available through a single user-friendly web interface.



Figure 2: Distribution of expression breadth of the transcriptome datasets. For each of the five mRNA datasets, the histograms show the number of protein-coding genes expressed at low, medium, and high confidence as a function of the number of tissues. With the exception of UniGene, the distributions are bimodal, with most genes expressed either in few tissues or in most of the tissues measured, supporting the distinction between tissue-specific and ubiquitous expression.

Mentions: Figure 2 shows the expression breadths for five transcriptome datasets, each at the three different confidence levels. Most show a clear bimodal distribution with peaks at the extreme ends, i.e., the vast majority of genes are expressed either in only a few tissues or in most tissues measured. We thus show that data from several sources and technologies robustly support a natural distinction between tissue-specific and ubiquitously expressed genes.
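The expression breadth described above can be illustrated with a minimal sketch: count, for each gene, the number of tissues in which it is detected at or above a confidence cutoff, then tabulate how many genes fall at each breadth. The records, gene names, score scale, and cutoff values below are hypothetical and chosen only for illustration; they are not the scores or thresholds used in the paper.

```python
from collections import Counter

# Hypothetical toy input: (gene, tissue, confidence score) triples.
# Names, scores, and cutoffs are illustrative assumptions, not the paper's data.
records = [
    ("GENE_A", "liver", 4.5),
    ("GENE_A", "heart", 4.0),
    ("GENE_A", "brain", 3.5),
    ("GENE_A", "kidney", 4.2),
    ("GENE_B", "liver", 4.8),
    ("GENE_B", "heart", 1.2),
    ("GENE_C", "brain", 2.5),
    ("GENE_C", "kidney", 2.1),
    ("GENE_C", "liver", 0.5),
]

# Assumed cutoffs standing in for low, medium, and high confidence.
CUTOFFS = {"low": 1.0, "medium": 2.0, "high": 3.0}


def expression_breadth(records, cutoff):
    """Map each gene to the number of distinct tissues with score >= cutoff."""
    tissues_by_gene = {}
    for gene, tissue, score in records:
        if score >= cutoff:
            tissues_by_gene.setdefault(gene, set()).add(tissue)
    return {gene: len(tissues) for gene, tissues in tissues_by_gene.items()}


def breadth_histogram(records, cutoff):
    """Count how many genes are expressed in exactly k tissues, for each k."""
    return Counter(expression_breadth(records, cutoff).values())


if __name__ == "__main__":
    for level, cutoff in CUTOFFS.items():
        hist = breadth_histogram(records, cutoff)
        # Each pair is (number of tissues, number of genes).
        print(level, sorted(hist.items()))
```

On real data, a bimodal histogram would show most genes concentrated at the smallest and largest tissue counts. Genes with no tissue passing the cutoff are simply absent from the histogram in this sketch.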

