Functional gene groups are concentrated within chromosomes, among chromosomes and in the nuclear space of the human genome.
Bottom Line: We find a significant concentration of functional groups both in terms of their distance within the same chromosome and in terms of their dispersal over several chromosomes. The result holds for all three types of functional groups that we tested. Hence, the human genome shows substantial concentration of functional groups within chromosomes and across chromosomes in space.
Affiliation: Genome Informatics, Faculty of Technology and Institute for Bioinformatics, Center for Biotechnology (CeBiTec), Bielefeld University, Bielefeld 33615, Germany; IBM Research-Haifa, Mount Carmel, Haifa 3498825, Israel.
Mentions: The distribution tail test used above checks whether the average inter-chromosomal distance between gene pairs within a group is significantly smaller than expected at random. However, if a proximity tendency exists only between specific genes within a group, it may go undetected after averaging over all pairs in the group. This could explain why the test was significant for PPIs but not for the other types, which have larger groups. To examine whether such a tendency exists, we applied the distribution tail test again, this time using the individual distances between gene pairs instead of the average over these distances. We computed the distance between each pair of genes from the same group that reside on different chromosomes, binned the values obtained from the entire set of such pairs into 20 bins, and tested the concentration at the distribution tail. We found that for all three group types, namely PPIs, pathways and complexes, gene pairs from the same group tend to cluster within a small spatial region even when they lie on different chromosomes. For the three resultant distributions, the first bin in the real genome (the 5% of pairs with the highest spatial inter-chromosomal proximity) was significantly more populated than the same bin in the random genomes, with P-values of 0.004, 10⁻⁴ and 0.02 for PPIs, complexes and pathways, respectively. This result reflects the clustering tendency of genes from the same group. For PPIs and complexes, the cumulative distribution of the histogram tail remained statistically significant beyond the first bin. For PPIs, more than 25% of the pairs displayed a strong clustering tendency (e.g. for the sum of frequencies in bins 1–6, the obtained P-value was ≤0.03). An even more pronounced effect was found for complexes, where about 90% of the pairs, populating bins 1–18 of the cumulative histogram, showed the effect, with all P-values below 0.02.
These results, normalized by the real-genome values for better visibility, are illustrated in Figure 3.
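The pairwise variant of the distribution tail test described above can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation. The function names (`quantile_edges`, `cumulative_fractions`, `tail_test`) are hypothetical, and so are two choices made here: using equal-frequency bin edges taken from the pooled real and random distances, and using a one-sided empirical P-value with an add-one correction. The demo data are synthetic and serve only to exercise the code.

```python
import random
from bisect import bisect_right


def quantile_edges(values, n_bins=20):
    """Upper edges of n_bins equal-frequency bins over the given values.

    Assumption: bins are defined by quantiles of the pooled distances,
    so each bin holds roughly 1/n_bins of all pairs.
    """
    ordered = sorted(values)
    n = len(ordered)
    # Upper edge of bin k is the value at 1-based rank ceil(k * n / n_bins).
    return [ordered[(k * n + n_bins - 1) // n_bins - 1]
            for k in range(1, n_bins + 1)]


def cumulative_fractions(distances, edges):
    """Fraction of pairs at or below each upper edge, i.e. in bins 1..k."""
    ordered = sorted(distances)
    return [bisect_right(ordered, edge) / len(ordered) for edge in edges]


def tail_test(real, random_genomes, n_bins=20):
    """Compare the cumulative tail of the real genome with random genomes.

    Returns the real cumulative fractions and, for each k, an empirical
    one-sided P-value: the share of random genomes whose bins 1..k are
    at least as populated as in the real genome (add-one corrected).
    """
    pooled = list(real) + [d for genome in random_genomes for d in genome]
    edges = quantile_edges(pooled, n_bins)
    real_cum = cumulative_fractions(real, edges)
    rand_cum = [cumulative_fractions(g, edges) for g in random_genomes]
    p_values = []
    for k in range(n_bins):
        exceed = sum(1 for rc in rand_cum if rc[k] >= real_cum[k])
        p_values.append((exceed + 1) / (len(random_genomes) + 1))
    return real_cum, p_values


# Synthetic demo (not data from the study): "real" pairs are drawn closer
# together than pairs in 99 hypothetical random genomes.
rng = random.Random(0)
real = [rng.uniform(0.0, 0.2) for _ in range(200)]
randoms = [[rng.uniform(0.0, 1.0) for _ in range(200)] for _ in range(99)]
real_cum, p_values = tail_test(real, randoms)
print("first-bin fraction:", real_cum[0], "P-value:", p_values[0])
```

Reading `p_values[k - 1]` gives the significance of the summed frequencies in bins 1 to k, which corresponds to the cumulative statements in the text (for instance bins 1–6 for PPIs). With 99 random genomes the smallest attainable P-value in this sketch is 0.01, so resolving values near 10⁻⁴ would require on the order of 10,000 random genomes.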