Orientation, distance, regulation and function of neighbouring genes.

Gherman A, Wang R, Avramopoulos D - Hum. Genomics (2009)

Bottom Line: The sequencing of the human genome has allowed us to observe globally and in detail the arrangement of genes along the chromosomes. We have undertaken a systematic evaluation of the spatial distribution and orientation of known genes across the human genome. We used genome-level information, including phylogenetic conservation, single nucleotide polymorphism density and correlation of gene expression to assess the importance of this distribution.


Affiliation: McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University, School of Medicine, Baltimore, MD 21205, USA.

ABSTRACT
The sequencing of the human genome has allowed us to observe globally and in detail the arrangement of genes along the chromosomes. There are multiple lines of evidence that this arrangement is not random, both in terms of intergenic distances and orientation of neighbouring genes. We have undertaken a systematic evaluation of the spatial distribution and orientation of known genes across the human genome. We used genome-level information, including phylogenetic conservation, single nucleotide polymorphism density and correlation of gene expression to assess the importance of this distribution. In addition to confirming and extending known properties of the genome, such as the significance of gene deserts and the importance of 'head to head' orientation of gene pairs in proximity, we provide significant new observations. These include a smaller average size for intervals separating the 3' ends of neighbouring genes, a correlation of gene expression across tissues for genes as far as 100 kilobases apart, and signatures of increasing positive selection with decreasing interval size that, surprisingly, relax for intervals smaller than approximately 500 base pairs. Further, we provide extensive graphical representations of the genome-wide data to allow for observations and comparisons beyond what we address.


Figure 3: Phylogenetic conservation. Conservation in the three types of intervals arranged by size, measured as the number of bases within conserved elements identified by the PhastCons algorithm per 1,000 base pairs. The y-axis shows the sliding average of the conservation of 100 intervals consecutive in size, while the x-axis shows the average size of the same 100 intervals. This artificially reduces the variance between individual points and is done for illustration only. All reported statistics use the individual values, not these sliding averages.

Mentions: Conservation over evolutionary time is considered to be an indication of functional significance. Conservation by interval type and size is shown in Figure 3 (PhastCons conserved bases per 1,000). Because of the large variability in conservation among intervals, the y-axis shows a sliding average of 100 intervals consecutive in size. For example, the conservation at size 1 kb in Figure 3 corresponds to the average conservation of 100 intervals, of which the median interval has a size of 1 kb. This was done in this and other figures to reduce the variance between individual points and make the trends visible. Although this smoothing is useful and necessary, it can generate some artefacts at the smallest intervals, where the inclusion of a single small interval with, for example, high conservation leads to a sudden increase in the average, which then slowly declines as more intervals without conservation are added (Figure 3). The reported statistical analyses always use individual, not average, values.
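The smoothing described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the function names `conserved_per_kb` and `sliding_average`, the tuple layout of the input and the toy numbers are all hypothetical. Following the figure caption, each plotted point pairs the mean size of a window of intervals (x) with their mean conservation (y).

```python
from statistics import mean


def conserved_per_kb(conserved_bases, size_bp):
    """Conserved bases per 1,000 bp of interval (hypothetical helper)."""
    return 1000.0 * conserved_bases / size_bp


def sliding_average(intervals, window=100):
    """Smooth (size_bp, conservation) pairs for plotting.

    Intervals are sorted by size; every run of `window` intervals that are
    consecutive in size yields one point: (mean size, mean conservation).
    The window length of 100 matches the text; the rest is an assumption.
    """
    ordered = sorted(intervals, key=lambda iv: iv[0])
    points = []
    for start in range(len(ordered) - window + 1):
        chunk = ordered[start:start + window]
        mean_size = mean(size for size, _ in chunk)
        mean_cons = mean(cons for _, cons in chunk)
        points.append((mean_size, mean_cons))
    return points


# Toy data (invented): one small, highly conserved interval among
# intervals with no conservation, smoothed with a window of 2.
toy = [(30, 0.0), (10, 900.0), (20, 0.0), (40, 0.0)]
for size, cons in sliding_average(toy, window=2):
    print(size, cons)
```

In the toy data, the single highly conserved small interval raises the first smoothed point even though every other interval has zero conservation, which illustrates why the smoothed curve is used for display only and the statistics are computed on individual values.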

