Integrative Analysis of Gene Networks and Their Application to Lung Adenocarcinoma Studies

View Article: PubMed Central - PubMed

ABSTRACT

The construction of gene regulatory networks (GRNs) is an essential component of biomedical research to determine disease mechanisms and identify treatment targets. Gaussian graphical models (GGMs) have been widely used for constructing GRNs by inferring conditional dependence among a set of gene expressions. In practice, GRNs obtained by the analysis of a single data set may not be reliable due to sample limitations. Therefore, it is important to integrate multiple data sets from comparable studies to improve the construction of a GRN. In this article, we introduce an equivalent measure of partial correlation coefficients in GGMs and then extend the method to construct a GRN by combining the equivalent measures from different sources. Furthermore, we develop a method for multiple data sets with a natural missing mechanism to accommodate the differences among different platforms in multiple sources of data. Simulation results show that this integrative analysis outperforms the standard methods and can detect hub genes in the true network. The proposed integrative method was applied to 12 lung adenocarcinoma data sets collected from different studies. The constructed network is consistent with the current biological knowledge and reveals new insights about lung adenocarcinoma.
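As a rough illustration of how a GGM links conditional dependence to network edges, the sketch below computes full-order partial correlations from the inverse covariance (precision) matrix, using the standard identity ρ_ij|rest = −ω_ij / sqrt(ω_ii ω_jj). It is not the article's equivalent measure or its ψ-learning procedure; the function names `invert` and `partial_correlations` and the three-gene chain covariance are hypothetical, chosen only for illustration.

```python
from math import sqrt


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment the matrix with the identity: [A | I] is reduced to [I | A^-1].
    aug = [
        [float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * c for v, c in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def partial_correlations(cov):
    """Partial correlation of each pair given all remaining variables."""
    omega = invert(cov)  # precision matrix
    p = len(omega)
    return [
        [
            1.0 if i == j else -omega[i][j] / sqrt(omega[i][i] * omega[j][j])
            for j in range(p)
        ]
        for i in range(p)
    ]


# Hypothetical chain g1 - g2 - g3: g1 and g3 are marginally correlated (0.25)
# only through g2, so their partial correlation should vanish.
CHAIN_COV = [
    [1.00, 0.50, 0.25],
    [0.50, 1.00, 0.50],
    [0.25, 0.50, 1.00],
]
PCOR = partial_correlations(CHAIN_COV)
```

In this toy chain the g1-g3 entry of `PCOR` is zero up to rounding, so a GGM would place no edge between them even though their marginal correlation is nonzero. With real expression data the covariance is estimated from limited samples, which is the reliability problem the abstract describes.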



Figure 2: ROC curve and partial ROC curve under FPR < 0.05 for all methods, where the sample and network sizes are n = 100 and p = 83. FPR indicates false-positive rate; JGL, joint graphical lasso; ROC, receiver operating characteristic.


Mentions: Figure 2 displays the receiver operating characteristic (ROC) and partial ROC curves of all methods for one sample data set in the simulation with a small network. The partial ROC curve is the ROC curve restricted to the region where the false-positive rate is less than 0.05. Figure 2 shows that the ψi-learning method performs much better than the other methods. Among the separate ψ-learning methods, each based on a single data set, the ψ1-learning method performs better, as expected, than those based on data sets with larger noise levels. Notably, the ψp-learning method performs worse than the ψ1-learning and ψ2-learning methods, even though it uses more samples. This result might be caused by the poor quality of the third data set. It suggests that simply pooling data sets from multiple sources may not be a good choice, whereas the proposed integrative analysis provides an effective way to improve the performance of the ψ-learning method by using more information.
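The ROC and partial ROC comparison described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the evaluation code behind Figure 2: the helper names `roc_points` and `partial_auc`, the toy edge scores, and the choice of 4 true edges among 24 candidate edges are all hypothetical. Each candidate edge gets a score (larger means stronger evidence for an edge) and a label marking whether it belongs to the true network.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) points obtained by lowering a threshold over the scores."""
    pos = sum(labels)
    neg = len(labels) - pos
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(order):
        current = scores[order[i]]
        # Process tied scores together so they yield a single ROC point.
        while i < len(order) and scores[order[i]] == current:
            if labels[order[i]]:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def partial_auc(points, max_fpr=1.0):
    """Trapezoidal area under the ROC curve on the region FPR <= max_fpr."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 >= max_fpr:
            break
        if x1 > max_fpr:
            # Clip the last segment at max_fpr by linear interpolation.
            y1 = y0 + (y1 - y0) * (max_fpr - x0) / (x1 - x0)
            x1 = max_fpr
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical toy data: 4 true edges (label 1) and 20 non-edges (label 0).
SCORES = [0.9, 0.8, 0.7, 0.6, 0.5, 0.45, 0.4] + [0.30 - 0.01 * k for k in range(17)]
LABELS = [1, 1, 0, 1, 0, 0, 1] + [0] * 17

POINTS = roc_points(SCORES, LABELS)
FULL_AUC = partial_auc(POINTS)           # area over the whole FPR range
PAUC_005 = partial_auc(POINTS, 0.05)     # area on the region FPR <= 0.05
```

Restricting the area to a small false-positive rate matters for network recovery because true edges are sparse: with p = 83 genes there are 3403 candidate edges, so even a 5% false-positive rate among the non-edges can admit on the order of a hundred spurious edges.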

