Assignment of isochores for all completely sequenced vertebrate genomes using a consensus.

Schmidt T, Frishman D - Genome Biol. (2008)

Affiliation: Department of Genome-Oriented Bioinformatics, Wissenschaftszentrum Weihenstephan, Technische Universität München, D-85350 Freising, Germany.

ABSTRACT
We show that although the currently available isochore mapping methods agree on the isochore classification of about two-thirds of the human DNA, they produce significantly different results with regard to the location of isochore boundaries and isochore length distribution. We present a new consensus isochore assignment method based on majority voting and provide IsoBase, a comprehensive on-line database of isochore maps for all completely sequenced vertebrate genomes.

Figure 5: Isochore assignment confidence of human genes. Each bin of the histogram shows the percentage of genes supported by a given average number of computational methods. Each bin is labeled with its upper border and contains the genes whose isochore assignment confidence c satisfies lower border < c ≤ upper border. For example, 30% of genes have a confidence value of >1.8 and ≤ 2.0. About one-third (29%, the right-most bar) of all genes are classified identically by all four independent methods (BASIO, IsoFinder, GC-Profile and least-squares). Gene classifications with low confidence are rare. For 99.8% of all genes, at least two methods agree completely over the whole coding region. Furthermore, only very few genes have a confidence value between two integers. This can be explained by two observations: genes are usually located completely within a single isochore stretch; and these gene regions are rarely split by any of the segmentation methods. Therefore, usually two, three or all four methods agree for the complete gene. The mean and median support for all genes is 3.0.
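
The binning convention in the caption (lower border < c ≤ upper border, with each bin labeled by its upper border) can be sketched as follows. This is only an illustration: the bin width of 0.2 is an assumption inferred from the caption's ">1.8 and ≤ 2.0" example, and the function names are hypothetical, not taken from the paper or from IsoBase.

```python
import math
from collections import Counter


def bin_upper_border(c, width=0.2):
    """Return the upper border of the bin with lower < c <= upper.

    The width of 0.2 is an assumed value, not stated in the source.
    """
    # Round before ceil so that values lying exactly on a border
    # (e.g. 2.0) are not pushed into the next bin by float error.
    k = math.ceil(round(c / width, 9))
    return round(k * width, 9)


def confidence_histogram(confidences, width=0.2):
    """Map each bin's upper border to the percentage of genes in that bin."""
    counts = Counter(bin_upper_border(c, width) for c in confidences)
    total = len(confidences)
    return {upper: 100.0 * counts[upper] / total for upper in sorted(counts)}


# Made-up confidence values, for illustration only.
example = confidence_histogram([2.0, 1.9, 3.0, 4.0])
```

With this convention a gene with confidence exactly 2.0 falls into the bin labeled 2.0, while a gene with confidence 1.8 falls into the bin labeled 1.8.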

Mentions: Most genes completely reside within a single isochore stretch (Additional data file 2). A comparison of random segmentations that have comparable block lengths shows that more genes are wholly located within an isochore segment than would be expected by chance. This is especially pronounced in isochore segmentations with segments of relatively short average length, such as those determined using IsoFinder and BASIO, and underlines the utility of isochore information for gene prediction. This observation may be related to the structure of chromatin [31] or chromosome break-prone regions [32]. We also found that most genes are classified into the same isochore families by the different methods. As a consequence, the isochore assignment confidence, as defined in Materials and methods, is very good for most genes and hardly any genes are classified with low confidence (Figure 5). One further observation is that most genes are found in regions with integer confidence values. This can be explained by the fact that genes typically reside completely within a single isochore stretch, irrespective of the applied method. For example, if a gene is completely covered by an isochore stretch in all isochore predictions, then the confidence value for this gene will always be two, three or four, depending on the number of methods that agree in their classification. In contrast, non-integer confidence values indicate regions that show a certain agreement for parts of the gene only, usually because an isochore border is located within a given gene. Overall, 99.8% of all genes are assigned to the same isochore families by at least two methods. This provides a sound basis for using isochore classification of genes in experimental studies such as expression analysis.
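
To make the integer versus non-integer behaviour concrete, here is a minimal sketch of one way such a per-gene confidence could be computed. The exact definition is given in the paper's Materials and methods, which is not reproduced here, so the following is an assumption: at each position the support is the number of methods voting for the most frequent isochore family, and the gene's confidence is the length-weighted mean of that support. The function names, the half-open coordinates and the example segmentations are all hypothetical.

```python
from collections import Counter


def family_at(segments, pos):
    """Family label of the segment covering pos, or None if uncovered.

    Segments are (start, end, family) tuples with half-open [start, end).
    """
    for start, end, family in segments:
        if start <= pos < end:
            return family
    return None


def gene_confidence(gene_start, gene_end, methods):
    """Length-weighted mean number of methods agreeing on one family.

    Assumed definition, for illustration only.
    """
    # Split the gene at every isochore border that falls inside it.
    cuts = {gene_start, gene_end}
    for segments in methods.values():
        for start, end, _ in segments:
            for border in (start, end):
                if gene_start < border < gene_end:
                    cuts.add(border)
    cuts = sorted(cuts)

    weighted = 0.0
    for left, right in zip(cuts, cuts[1:]):
        votes = Counter(family_at(s, left) for s in methods.values())
        votes.pop(None, None)  # ignore methods with no segment here
        support = max(votes.values(), default=0)
        weighted += support * (right - left)
    return weighted / (gene_end - gene_start)


# Made-up segmentations for the four methods named in the caption.
example_methods = {
    "BASIO": [(0, 1000, "H1")],
    "IsoFinder": [(0, 150, "H1"), (150, 1000, "H2")],
    "GC-Profile": [(0, 1000, "H1")],
    "least-squares": [(0, 1000, "L2")],
}

split_gene = gene_confidence(100, 200, example_methods)  # border at 150
whole_gene = gene_confidence(300, 400, example_methods)  # no border inside
```

In this toy data the gene spanning 100 to 200 contains an IsoFinder border at 150: three methods agree on the first half and two on the second, giving a non-integer confidence of 2.5. The gene spanning 300 to 400 lies within a single stretch for every method, so its confidence is the integer 2.0.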
