Integrative Analysis of Gene Networks and Their Application to Lung Adenocarcinoma Studies

View Article: PubMed Central - PubMed

ABSTRACT

The construction of gene regulatory networks (GRNs) is an essential component of biomedical research to determine disease mechanisms and identify treatment targets. Gaussian graphical models (GGMs) have been widely used for constructing GRNs by inferring conditional dependence among a set of gene expressions. In practice, GRNs obtained by the analysis of a single data set may not be reliable due to sample limitations. Therefore, it is important to integrate multiple data sets from comparable studies to improve the construction of a GRN. In this article, we introduce an equivalent measure of partial correlation coefficients in GGMs and then extend the method to construct a GRN by combining the equivalent measures from different sources. Furthermore, we develop a method for multiple data sets with a natural missing mechanism to accommodate the differences among different platforms in multiple sources of data. Simulation results show that this integrative analysis outperforms the standard methods and can detect hub genes in the true network. The proposed integrative method was applied to 12 lung adenocarcinoma data sets collected from different studies. The constructed network is consistent with the current biological knowledge and reveals new insights about lung adenocarcinoma.
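The idea of combining an edge-level measure across studies, while tolerating genes that a platform does not measure, can be illustrated with a small sketch. It is not the authors' procedure: the Fisher z-transformation, the unweighted Stouffer combination, and the names `fisher_z`, `combine_edge`, and `two_sided_p` are assumptions chosen for illustration. A data set in which the gene pair is unavailable is passed as `None` and skipped.

```python
import math
from statistics import NormalDist


def fisher_z(r, n, n_cond=0):
    """Approximate z-score for a (partial) correlation r estimated from
    n samples while conditioning on n_cond other variables."""
    dof = n - n_cond - 3
    if dof <= 0:
        raise ValueError("need n - n_cond - 3 > 0")
    return math.sqrt(dof) * math.atanh(r)


def combine_edge(studies, n_cond=0):
    """Unweighted Stouffer combination of one edge's z-scores.

    `studies` holds (r, n) tuples, or None where the gene pair is not
    measured on that platform. Returns None if no study measured it.
    """
    z_scores = [fisher_z(r, n, n_cond) for s in studies if s is not None
                for r, n in [s]]
    if not z_scores:
        return None
    return sum(z_scores) / math.sqrt(len(z_scores))


def two_sided_p(z):
    """Two-sided p-value of a standard normal z-score."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


# Toy example: the same edge seen in three studies, then with one missing.
z_all = combine_edge([(0.30, 50), (0.25, 60), (0.35, 40)])
z_missing = combine_edge([(0.30, 50), None, (0.35, 40)])
print(z_all, z_missing, two_sided_p(z_all))
```

In this toy case, each study alone gives moderate evidence for the edge and the combined score is larger than any single-study score, which is the motivation for integrating comparable data sets.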



Figure 3. ROC curve and partial ROC curve (FPR < 0.05) for all methods, with sample size n = 100 and network size p = 612. FPR indicates false-positive rate; JGL, joint graphical lasso; ROC, receiver operating characteristic.
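For readers unfamiliar with the partial ROC curve, the following sketch shows how an ROC curve and the area under it up to a false-positive-rate cap (here 0.05) can be computed for edge detection. It is an illustrative assumption about the evaluation, not code from the article; `roc_points` and `partial_auc` are hypothetical helper names, `scores` are per-edge confidence values, and `labels` mark whether each candidate edge is in the true network.

```python
def roc_points(scores, labels):
    """Sweep a threshold from high to low score and return (fpr, tpr) lists.
    Assumes no tied scores and at least one positive and one negative."""
    pos = sum(labels)
    neg = len(labels) - pos
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    for i in order:
        if labels[i]:
            tp += 1
        else:
            fp += 1
        fpr.append(fp / neg)
        tpr.append(tp / pos)
    return fpr, tpr


def partial_auc(fpr, tpr, max_fpr=1.0):
    """Trapezoidal area under the ROC curve for FPR in [0, max_fpr]."""
    area = 0.0
    for k in range(1, len(fpr)):
        x0, x1 = fpr[k - 1], fpr[k]
        y0, y1 = tpr[k - 1], tpr[k]
        if x0 >= max_fpr:
            break
        if x1 > max_fpr:
            # Linearly interpolate the curve at the FPR cap.
            y1 = y0 + (y1 - y0) * (max_fpr - x0) / (x1 - x0)
            x1 = max_fpr
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Toy example: four candidate edges, two of them true.
fpr, tpr = roc_points([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0])
print(partial_auc(fpr, tpr), partial_auc(fpr, tpr, max_fpr=0.05))
```

Restricting the area to a small FPR range is useful for sparse networks, where true edges are vastly outnumbered by non-edges and only the low-FPR part of the curve is of practical interest.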

Figure 3 and Table 2 show the results for the simulations based on a large network. The ψi-learning method again outperforms all other methods overall and works well for high-dimensional data. Figure 4 displays the network paths constructed by the ψi-learning method with various q values. A small q value produces a sparse network, whereas a large q value leads to a dense network, as shown in Figure 4. For a small q value, the resulting network contains fewer false-positive edges (FPEs) but also fewer true-positive edges (TPEs). In contrast, the network constructed with a large q value identifies the TPEs well but also selects more irrelevant edges, as seen in Tables 1 and 2. Figure 4 indicates that the ψi-learning method with an appropriate q value detects the hub genes of the true network well. These results suggest that integrative analysis of multiple data sets improves on the ψ-learning method and enables the construction of more reliable GRNs.
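The trade-off governed by q can be illustrated with a toy sketch. It assumes that q acts as a false-discovery-rate level and uses the Benjamini-Hochberg step-up rule as a stand-in for the article's multiple-testing step, which is not described here; the edge list, the p-values, and the helpers `bh_select` and `degrees` are hypothetical.

```python
def bh_select(p_values, q):
    """Indices of hypotheses rejected by Benjamini-Hochberg at FDR level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= q * rank / m:
            cutoff = rank  # largest rank passing the step-up condition
    return sorted(order[:cutoff])


def degrees(edges, selected):
    """Node degree counted over the selected edges only."""
    deg = {}
    for i in selected:
        for gene in edges[i]:
            deg[gene] = deg.get(gene, 0) + 1
    return deg


# Hypothetical candidate edges among four genes with made-up p-values.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("C", "D"), ("B", "D")]
p = [0.001, 0.004, 0.012, 0.03, 0.20, 0.60]

sparse = bh_select(p, q=0.01)   # stricter level, fewer edges
dense = bh_select(p, q=0.05)    # looser level, more edges
deg = degrees(edges, dense)
hub = max(deg, key=deg.get)     # highest-degree gene as a simple hub proxy
print(len(sparse), len(dense), hub)
```

Raising q admits more edges, which helps recover true edges at the cost of more false positives; taking the highest-degree node is only one simple way to flag a hub gene.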

