Comparative analysis of proteome and transcriptome variation in mouse.

Ghazalpour A, Bennett B, Petyuk VA, Orozco L, Hagopian R, Mungrue IN, Farber CR, Sinsheimer J, Kang HM, Furlotte N, Park CC, Wen PZ, Brewer H, Weitz K, Camp DG, Pan C, Yordanova R, Neuhaus I, Tilford C, Siemers N, Gargalovic P, Eskin E, Kirchgessner T, Smith DJ, Smith RD, Lusis AJ - PLoS Genet. (2011)

Bottom Line: For example, differential splicing clearly affects the analyses for certain genes; but, based on deep sequencing, this does not substantially contribute to the overall estimate of the correlation. Using correlation analysis, we found that a low number of clinical trait relationships are preserved between the protein and mRNA gene products and that the majority of such relationships are specific to either the protein levels or transcript levels. In light of the widespread use of high-throughput technologies in both clinical and basic research, the results presented have practical as well as basic implications.


Affiliation: Department of Medicine/Division of Cardiology, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, California, United States of America. aghazalp@ucla.edu

ABSTRACT
The relationships between the levels of transcripts and the levels of the proteins they encode have not been examined comprehensively in mammals, although previous work in plants and yeast suggests a surprisingly modest correlation. We have examined this issue using a genetic approach in which natural variations were used to perturb both transcript levels and protein levels among inbred strains of mice. We quantified over 5,000 peptides and over 22,000 transcripts in livers of 97 inbred and recombinant inbred strains and focused on the 7,185 most heritable transcripts and 486 most reliable proteins. The transcript levels were quantified by microarray analysis in three replicates, and the proteins were quantified by liquid chromatography-mass spectrometry (LC-MS) using an 18O-reference-based isotope labeling approach. We show that the levels of transcripts and proteins correlate significantly for only about half of the genes tested, with an average correlation of 0.27, and that the correlations of transcripts and proteins varied depending on the cellular location and biological function of the gene. We examined technical and biological factors that could contribute to the modest correlation. For example, differential splicing clearly affects the analyses for certain genes; but, based on deep sequencing, this does not substantially contribute to the overall estimate of the correlation. We also employed genome-wide association analyses to map loci controlling both transcript and protein levels. Surprisingly, little overlap was observed between the protein- and transcript-mapped loci. We have typed numerous clinically relevant traits among the strains, including adiposity, lipoprotein levels, and tissue parameters. Using correlation analysis, we found that few clinical trait relationships are preserved between the protein and mRNA gene products and that the majority of such relationships are specific to either the protein levels or transcript levels. Surprisingly, transcript levels were more strongly correlated with clinical traits than protein levels. In light of the widespread use of high-throughput technologies in both clinical and basic research, the results presented have practical as well as basic implications.
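
The per-gene comparison described in the abstract can be illustrated with a minimal sketch. It assumes a Pearson correlation computed across strains for each gene that has both a transcript and a protein measurement; the abstract does not state which correlation coefficient was used, so Pearson is an assumption here, and the function names, gene names, strain labels, and values below are hypothetical toy data rather than data from the study.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # no variation across strains
    return sxy / math.sqrt(sxx * syy)


def per_gene_correlations(transcripts, proteins, min_strains=3):
    """Correlate transcript and protein levels across strains, gene by gene.

    Both inputs map gene -> {strain: level}. Only genes and strains
    present in both datasets are used.
    """
    result = {}
    for gene in sorted(transcripts.keys() & proteins.keys()):
        t, p = transcripts[gene], proteins[gene]
        shared = sorted(t.keys() & p.keys())
        if len(shared) < min_strains:
            continue
        result[gene] = pearson([t[s] for s in shared], [p[s] for s in shared])
    return result


# Hypothetical toy data (not from the study).
transcripts = {
    "GeneA": {"s1": 1.0, "s2": 2.0, "s3": 3.0, "s4": 4.0},
    "GeneB": {"s1": 1.0, "s2": 2.0, "s3": 3.0, "s4": 4.0},
}
proteins = {
    "GeneA": {"s1": 2.0, "s2": 4.0, "s3": 6.0, "s4": 8.0},
    "GeneB": {"s1": 4.0, "s2": 3.0, "s3": 2.0, "s4": 1.0},
}

r = per_gene_correlations(transcripts, proteins)
mean_r = sum(r.values()) / len(r)  # average correlation over genes
```

In this toy case one gene is perfectly concordant and the other perfectly discordant, so the average over genes is zero even though each gene individually shows a strong relationship; an average such as the reported 0.27 likewise summarizes a distribution of per-gene values.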



pgen-1001393-g001: A schematic representation of the experimental design. 97 inbred and recombinant inbred strains in the HMDP panel were utilized to study the relationships between transcripts, proteins, and clinical traits. The relationships between proteins and transcripts were assessed at the biological level by the overall correlation across datasets, and at the genetic level by comparing the genome-wide association profiles of the two datasets. The biological relationship between the transcripts and proteins was also assessed in the context of the physiological phenotypes by relating these two datasets to the 42 clinical traits measured in the HMDP panel.

Mentions: The experimental design of our study is depicted in Figure 1. To study the relationship between transcript and protein levels globally, we examined 97 inbred strains of mice of the HMDP representing a wide range of genetic diversity, including ∼11,000,000 single nucleotide polymorphisms as well as copy number variations [13], [14]. As we have shown previously, this population includes thousands of expression quantitative trait loci (eQTL) that can be mapped using association analysis, with correction for population structure by a mixed model algorithm [12]. The resolution achieved in this way is, on average, one to two orders of magnitude narrower than that of linkage analysis [12]. Livers from the 97 strains were quantitatively analyzed for global transcript levels using the Affymetrix HT-MG-430A platform and for protein levels using LC-MS, employing the AMT tag approach for identification and 16O/18O labeling for quantification [12], [15]. In the latter, each individually processed and unlabeled sample is spiked with the 18O-labeled “universal” reference pool (i.e., the pool made by mixing together the same amount of isolated proteins from all samples), providing an internal standard for accurate measurement of protein abundance across biological samples. This dual quantification, which combines the label-free and isotope labeling techniques, has been shown to be significantly superior to label-free methods in terms of quantification precision [15] and offers a simple, robust, and more precise alternative to other proteomic techniques for studying variations in protein levels across large sets of biological samples. In the LC-MS dataset, we also included 10 technical replicates from the C57BL/6J strain to measure the reproducibility of the sample preparation and technology, which we describe in detail below.
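
The reference-pool idea in the passage above can be sketched in a few lines. This is only an illustration under stated assumptions: relative abundance is taken as the log2 ratio of the unlabeled (16O) sample intensity to the 18O-labeled reference intensity, and reproducibility across technical replicates is summarized with a coefficient of variation. The excerpt does not specify either choice, and the function names and intensity values are hypothetical.

```python
import math
import statistics


def log2_ratio(light_intensity, heavy_intensity):
    """log2 of sample (16O) intensity over spiked reference (18O) intensity.

    Dividing by the shared reference puts every sample on a common scale.
    """
    if light_intensity <= 0 or heavy_intensity <= 0:
        raise ValueError("intensities must be positive")
    return math.log2(light_intensity / heavy_intensity)


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (mean must be nonzero)."""
    mean = statistics.fmean(values)
    if mean == 0:
        raise ValueError("mean is zero; CV is undefined")
    return statistics.stdev(values) / mean


# Hypothetical peptide intensities: (16O sample, 18O reference) per strain.
measurements = {
    "strain_1": (200.0, 100.0),  # twice the reference pool
    "strain_2": (100.0, 100.0),  # equal to the reference pool
    "strain_3": (50.0, 100.0),   # half the reference pool
}
relative_abundance = {
    strain: log2_ratio(light, heavy)
    for strain, (light, heavy) in measurements.items()
}

# Hypothetical 16O/18O ratios from technical replicates of one strain.
replicate_ratios = [0.9, 1.0, 1.1]
replicate_cv = coefficient_of_variation(replicate_ratios)
```

A small coefficient of variation across technical replicates would indicate that sample preparation and measurement add little noise relative to the differences between strains.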

