AraNet v2: an improved database of co-functional gene networks for the study of Arabidopsis thaliana and 27 other nonmodel plant species.
Bottom Line: Recent advances in high-throughput experimental technology have enabled the generation of an unprecedented amount of data from A. thaliana, which has facilitated data-driven approaches to unravel the genetic organization of plant phenotypes. We previously published a description of a genome-scale functional gene network for A. thaliana, AraNet, which was constructed by integrating multiple co-functional gene networks inferred from diverse data types, and we demonstrated the predictive power of this network for complex phenotypes. To enhance the usability of the network, we implemented an AraNet v2 web server, which generates functional predictions for A. thaliana and 27 nonmodel plant species using an orthology-based projection of nonmodel plant genes on the A. thaliana gene network.
Affiliation: Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, Korea.
Mentions: If the accuracy of the co-functional links in AraNet v2 is equal to or higher than that of AraNet, then we expect the expanded genome coverage of AraNet v2 to lead to enhanced predictions of functions and phenotypes in A. thaliana. To assess the predictive power of each network, we used validation gene pairs that are independent of the gold standard gene pairs used for network training. Assuming that two genes participating in the same pathway are likely to belong to the same protein complex or to be localized within the same subcellular compartment, we generated two distinct validation sets: (i) gene pairs that share subcellular localization annotations in the SUBcellular localization database for Arabidopsis proteins (SUBA3) (12), and (ii) gene pairs that share GO cellular component (GO-CC) annotations (11). To avoid misleading co-functional relationships due to ambiguous subcellular compartments, we ignored the SUBA3 annotations for ‘cytosol’ and ‘plasma membrane’. For the same reason, we excluded 16 GO-CC terms, each with more than 350 annotated genes, from a total of 420 terms. These procedures resulted in 335 641 gene pairs by SUBA3, of which only 418 pairs overlapped with the gold standard gene pairs that we used for network training (∼0.3%), and 394 076 gene pairs by GO-CC, of which less than 9% overlapped with the gold standard gene pairs. Using these validation sets, we compared the accuracy of the network links at a given genome coverage between AraNet v2 and AraNet. AraNet v2 showed substantially higher accuracy over the entire genome coverage range for both SUBA3 and GO-CC annotations (Figure 1a and b). From this result, we conclude that AraNet v2 significantly improves both genome coverage and linkage accuracy.
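The validation procedure described above can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the input layout (a mapping from annotation term to annotated genes, and links as `(gene_a, gene_b, score)` tuples), and the definition of accuracy as the fraction of top-ranked links found in the validation set are all assumptions made for illustration. The excerpt does not state the exact accuracy measure or binning used for Figure 1.

```python
from itertools import combinations


def build_validation_pairs(term_to_genes, max_genes_per_term=350, excluded_terms=()):
    """Collect unordered gene pairs that share at least one retained annotation term.

    Terms listed in `excluded_terms` (e.g. ambiguous compartments) and terms with
    more than `max_genes_per_term` annotated genes are skipped.
    """
    pairs = set()
    for term, genes in term_to_genes.items():
        if term in excluded_terms:
            continue
        unique_genes = sorted(set(genes))
        if len(unique_genes) > max_genes_per_term:
            continue
        # combinations() over a sorted list yields each pair once, as a sorted tuple.
        pairs.update(combinations(unique_genes, 2))
    return pairs


def overlap_fraction(validation_pairs, gold_pairs):
    """Fraction of validation pairs that also occur among the gold standard pairs."""
    if not validation_pairs:
        return 0.0
    gold = {tuple(sorted(pair)) for pair in gold_pairs}
    return len(validation_pairs & gold) / len(validation_pairs)


def accuracy_by_coverage(scored_links, validation_pairs, total_genes, bin_size=1000):
    """Return (genome coverage, accuracy) points for cumulative top-ranked links.

    Assumption: accuracy is the share of the top-k links present in the validation
    set, and coverage is the share of all genes touched by those links.
    """
    ranked = sorted(scored_links, key=lambda link: link[2], reverse=True)
    covered, hits, curve = set(), 0, []
    for k, (gene_a, gene_b, _score) in enumerate(ranked, start=1):
        covered.update((gene_a, gene_b))
        if tuple(sorted((gene_a, gene_b))) in validation_pairs:
            hits += 1
        if k % bin_size == 0 or k == len(ranked):
            curve.append((len(covered) / total_genes, hits / k))
    return curve


# Toy data with hypothetical gene identifiers, for illustration only.
toy_annotations = {
    "nucleus": ["g1", "g2", "g3"],
    "cytosol": ["g1", "g4"],
}
toy_pairs = build_validation_pairs(toy_annotations, excluded_terms=("cytosol",))
toy_links = [("g1", "g2", 3.0), ("g3", "g4", 2.0), ("g2", "g3", 1.0)]
toy_curve = accuracy_by_coverage(toy_links, toy_pairs, total_genes=4, bin_size=1)
```

Computing one such curve per network against the same validation pairs would allow a side-by-side comparison at matched genome coverage, which is the kind of comparison the text describes for AraNet v2 and AraNet.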