Positive correlation between gene coexpression and positional clustering in the zebrafish genome.

Ng YK, Wu W, Zhang L - BMC Genomics (2009)

Bottom Line: This paper analyzes the correlation between the proximity of eukaryotic genes and their transcriptional expression pattern in the zebrafish (Danio rerio) genome using available microarray data and gene annotation. The analyses show that neighbouring genes are significantly coexpressed in the zebrafish genome, and the coexpression level is influenced by the intergenic distance and transcription orientation. This fact is further supported by examining the coexpression level of genes within positional clusters in the neighbourhood model.

Affiliation: Department of Mathematics, National University of Singapore, 2 Science Drive 2, Singapore 117543, Singapore. matnyk@nus.edu.sg

ABSTRACT

Background: Co-expressing genes tend to cluster in eukaryotic genomes. This paper analyzes the correlation between the proximity of eukaryotic genes and their transcriptional expression pattern in the zebrafish (Danio rerio) genome using available microarray data and gene annotation.

Results: The analyses show that neighbouring genes are significantly coexpressed in the zebrafish genome, and the coexpression level is influenced by the intergenic distance and transcription orientation. This fact is further supported by examining the coexpression level of genes within positional clusters in the neighbourhood model. There is a positive correlation between gene coexpression and positional clustering in the zebrafish genome.

Conclusion: The study provides another piece of evidence for the hypothesis that coexpressed genes cluster in eukaryotic genomes.

Figure 4: Mean R of neighboring gene pairs in different -lg p-value intervals. All p-values were calculated with D = 25K. Gene pairs grouped into parallel, divergent, and convergent orientations are plotted similarly. In each of the divergent and convergent cases, only one gene pair has a -lg p-value in the interval 5~6; these points are therefore omitted from the plot.

Mentions: Finally, we investigated whether there is a correlation between the mean R of gene pairs and the p-value of a positional cluster. With D = 25K, we considered all pairs of neighbouring genes in the same cluster. We divided the gene pairs into seven categories according to the p-value of the cluster to which each gene pair belongs. These seven categories correspond one-to-one to the following intervals of p-values: 0~10^-6, 10^-6~10^-5, 10^-5~10^-4, 10^-4~10^-3, 10^-3~10^-2, 10^-2~10^-1, 10^-1~1. To simplify presentation, we consider the -lg p-value (negative base-10 logarithm) instead of the p-value, and use the intervals 0~1, 1~2, 2~3, 3~4, 4~5, 5~6, > 6. We calculated the mean R of the neighbouring gene pairs in each category and observed a significant correlation between -lg p-value and the degree of coexpression of neighbouring gene pairs, using either the complete dataset (Figure 4) or the dataset after tandem duplicates are removed (Figure 5). This correlation is extremely significant for gene pairs that are transcribed in the parallel orientation. The mean R value is as high as 0.5088 (with standard error 0.0642) for the complete dataset and 0.3228 (with standard error 0.1330) even after tandem duplicates are removed. We also observed that at low p-value (high -lg p-value), more gene pairs in the identified clusters are transcribed in the parallel orientation, even with tandem duplicates (Table 5). We examined the correlation between -lg p-value and neighbouring gene distance to determine whether such a high correlation can be explained by intergenic distance. No such correlation was found (Figure 6).
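
The binning step described above can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the input layout (one `(cluster p-value, R)` tuple per neighbouring gene pair), the handling of p-values that fall exactly on an interval boundary, and the use of the sample standard deviation divided by the square root of n as the standard error are all assumptions made for illustration. R is taken as an already computed coexpression value for each pair.

```python
import math
from collections import defaultdict

# -lg p interval labels, in the order given in the text.
LABELS = ["0~1", "1~2", "2~3", "3~4", "4~5", "5~6", ">6"]


def neg_lg_bin(p):
    """Return the index of the -lg p interval for a cluster p-value.

    Assumption: a p-value exactly on a boundary goes to the higher
    -lg p interval; the source does not say how ties are handled.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p-value must lie in [0, 1]")
    if p == 0.0:
        return len(LABELS) - 1  # -lg p is unbounded, so use the '>6' bin
    return min(int(-math.log10(p)), len(LABELS) - 1)


def mean_r_by_bin(pairs):
    """Group (cluster_p_value, R) tuples by -lg p interval.

    Returns {label: (n, mean R, standard error of the mean)}.
    The standard error is NaN when a bin holds a single pair.
    """
    groups = defaultdict(list)
    for p, r in pairs:
        groups[neg_lg_bin(p)].append(r)

    summary = {}
    for idx in sorted(groups):
        values = groups[idx]
        n = len(values)
        mean = sum(values) / n
        if n > 1:
            variance = sum((v - mean) ** 2 for v in values) / (n - 1)
            sem = math.sqrt(variance / n)
        else:
            sem = float("nan")
        summary[LABELS[idx]] = (n, mean, sem)
    return summary


if __name__ == "__main__":
    # Made-up values, for illustration only (not data from the paper).
    demo = [(0.5, 0.1), (0.3, 0.3), (0.05, 0.2), (3e-7, 0.6), (2e-7, 0.4)]
    for label, (n, mean, sem) in mean_r_by_bin(demo).items():
        print(label, n, round(mean, 4), sem)
```

A bin that contains a single gene pair yields no meaningful standard error, which is consistent with the figure caption's choice to omit such points from the plot.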

