Joint analysis of differential gene expression in multiple studies using correlation motifs.

Wei Y, Tenzen T, Ji H - Biostatistics (2014)

Bottom Line: The motifs provide the basis for sharing information among studies and genes. The approach has flexibility to handle all possible study-specific differential patterns. It improves detection of differential expression and overcomes the barrier of exponential model complexity.


Affiliation: Department of Biostatistics, Johns Hopkins University Bloomberg School of Public Health, Baltimore, MD, USA; Department of Statistics, The Chinese University of Hong Kong, Shatin NT, Hong Kong.

Figure 1: (a) A cartoon illustration of the SHH pathway. (b) A numerical example of the data generating model. There are four motifs in the dataset, each with its own abundance. Each row of the motif probability matrix represents a motif and each column corresponds to a study, so each entry gives the probability that genes belonging to the motif in that row are differentially expressed in the study in that column. For example, the probability for genes belonging to motif 1 to be differentially expressed in study 4 is 0.83. The gray scale of the cells in the abundance vector and the probability matrix illustrates the probability value, which increases from 0 to 1 as the color changes from light to dark. Given the abundances and the probability matrix, each gene is assigned a motif indicator. For instance, the fifth gene belongs to motif 2 (indicated by a cell with the number “2”). Next, the configuration of the fifth gene is generated according to the matrix row for motif 2. As a result, the fifth gene is differentially expressed in studies 2, 4, and 5. Finally, the moderated t-statistic for each gene within each study is produced according to the gene's configuration.
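The three generating steps described in the caption of panel (b) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `simulate_genes`, `ABUNDANCE`, and `MOTIF_PROBS` are hypothetical, the number of studies (five) and all numeric values are made up except the 0.83 entry for motif 1 in study 4, and a normal distribution with inflated spread stands in for the distribution of the moderated t-statistic under differential expression.

```python
import random

# Hypothetical abundances of the four motifs (must sum to 1).
ABUNDANCE = [0.4, 0.3, 0.2, 0.1]

# Hypothetical motif-by-study probability matrix (rows: motifs, columns: studies).
# Only the 0.83 entry (motif 1, study 4) is taken from the caption.
MOTIF_PROBS = [
    [0.05, 0.10, 0.90, 0.83, 0.05],  # motif 1
    [0.02, 0.95, 0.03, 0.90, 0.92],  # motif 2
    [0.90, 0.90, 0.90, 0.90, 0.90],  # motif 3
    [0.01, 0.01, 0.01, 0.01, 0.01],  # motif 4
]


def simulate_genes(n_genes, abundance, motif_probs, seed=0, alt_sd=3.0):
    """Simulate (motif, configuration, statistics) for each gene.

    alt_sd is an assumed spread for statistics of differentially expressed
    genes; non-differential statistics are drawn with spread 1.
    """
    rng = random.Random(seed)
    n_motifs = len(abundance)
    n_studies = len(motif_probs[0])
    genes = []
    for _ in range(n_genes):
        # Step 1: draw the motif indicator according to the abundances.
        motif = rng.choices(range(n_motifs), weights=abundance, k=1)[0]
        # Step 2: draw the 0/1 configuration, one independent Bernoulli
        # draw per study, using the row of the matrix for this motif.
        config = [
            1 if rng.random() < motif_probs[motif][study] else 0
            for study in range(n_studies)
        ]
        # Step 3: draw a test statistic per study given the configuration
        # (normal stand-in for the moderated t-statistic).
        stats = [rng.gauss(0.0, alt_sd if is_de else 1.0) for is_de in config]
        genes.append((motif, config, stats))
    return genes


if __name__ == "__main__":
    for motif, config, stats in simulate_genes(5, ABUNDANCE, MOTIF_PROBS):
        print(motif + 1, config, [round(t, 2) for t in stats])
```

Motifs and studies are zero-indexed in the code, so "motif 1, study 4" in the caption corresponds to `MOTIF_PROBS[0][3]`. Using a seeded `random.Random` instance keeps the simulation reproducible without touching the global random state.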
© Copyright Policy - creative-commons


Mentions: One example that motivated this article is a study of the vertebrate sonic hedgehog (SHH) signaling pathway. SHH is a signaling protein that can bind to patched 1 (PTCH1), a receptor protein in the cell membrane (Figure 1(a)). PTCH1 can interact with another membrane protein, smoothened (SMO), to repress its activity. In the absence of SHH, PTCH1 keeps SMO inactive. The presence of SHH represses PTCH1 and activates SMO. The active SMO triggers a signaling cascade that modulates the activities of three transcription factors, GLI1, GLI2, and GLI3, which in turn induce or repress the expression of hundreds of downstream target genes. The SHH pathway is a core signaling pathway in vertebrates (Ingham and McMahon, 2001). To elucidate the underlying mechanisms linking this pathway to development and diseases, multiple studies have been conducted in different contexts to identify genes whose transcriptional activities are modulated by SHH signaling. Some studies perturb the SHH signal in different tissues by knocking out or over-expressing the pathway's key signal transduction components such as SHH, PTCH1, and SMO, while others compare disease samples with corresponding controls. Table 1 contains eight such datasets in mouse originally collected by Tenzen and others (2006) and Mao and others (2006). Each dataset involves a comparison of genome-wide expression profiles between two different sample types. These data were all generated using Affymetrix Mouse Expression Set 430 arrays. The questions of biological interest include (i) which genes are controlled by the SHH signal in each dataset, (ii) which genes are the core targets that respond to the SHH signal irrespective of tissue type and developmental stage, and (iii) which genes are context-specific targets that are modulated by the SHH signal only in certain conditions.

