Genomic distribution of AFLP markers relative to gene locations for different eukaryotic species.

Caballero A, García-Pereira MJ, Quesada H - BMC Genomics (2013)

Bottom Line: The high coverage of AFLP markers across the genomes and the high proportion of markers within or close to gene sequences make them suitable for genome scans and detecting large islands of differentiation in the genome. However, for specific traits, the percentage of AFLP markers close to genes can be rather small. Therefore, genome scans directed towards the search of markers closely linked to selected loci can be a difficult task in many instances.


Affiliation: Departamento de Bioquímica, Genética e Inmunología, Facultad de Biología, Universidade de Vigo, 36310 Vigo, Spain. armando@uvigo.es

ABSTRACT

Background: Amplified fragment length polymorphism (AFLP) markers are frequently used for a wide range of studies, such as genome-wide mapping, population genetic diversity estimation, hybridization and introgression studies, phylogenetic analyses, and detection of signatures of selection. An important issue to be addressed for some of these fields is the distribution of the markers across the genome, particularly in relation to gene sequences.

Results: Using in-silico restriction fragment analysis of the genomes of nine eukaryotic species, we characterise the distribution of AFLP fragments across the genome and, particularly, in relation to gene locations. First, we identify the physical position of markers across the chromosomes of all species. An observed accumulation of fragments around (peri)centromeric regions in some species is produced by repeated sequences, and this accumulation disappears when AFLP bands rather than fragments are considered. Second, we calculate the percentage of AFLP markers positioned within gene sequences. For the typical EcoRI/MseI enzyme pair, this ranges between 28 and 87% and is usually larger than that expected by chance because of the higher GC content of gene sequences relative to intergenic ones. In agreement with this, the use of enzyme pairs with GC-rich restriction sites substantially increases the above percentages. For example, using the enzyme system SacI/HpaII, 86% of AFLP markers are located within gene sequences in A. thaliana, and 100% of markers in Plasmodium falciparum. We further find that for a typical trait controlled by 50 genes of average size, if 1000 AFLPs are used in a study, the number of those within 1 kb distance from any of the genes would be only about 1-2, and only about 50% of the genes would have markers within that distance.

Conclusions: The high coverage of AFLP markers across the genomes and the high proportion of markers within or close to gene sequences make them suitable for genome scans and detecting large islands of differentiation in the genome. However, for specific traits, the percentage of AFLP markers close to genes can be rather small. Therefore, genome scans directed towards the search of markers closely linked to selected loci can be a difficult task in many instances.
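
The in-silico restriction fragment analysis summarised in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the midpoint rule for deciding whether a fragment lies within a gene, and the toy sequence are all assumptions made for illustration. The sketch omits selective nucleotides and size filtering, which a real AFLP protocol would include. The only facts it relies on are the recognition sites of EcoRI (G^AATTC) and MseI (T^TAA).

```python
import re

# Recognition site and top-strand cut offset for each enzyme.
ENZYMES = {"EcoRI": ("GAATTC", 1), "MseI": ("TTAA", 1)}


def cut_positions(seq, site, offset):
    """0-based cut coordinates for every occurrence of `site` (overlaps allowed)."""
    return [m.start() + offset for m in re.finditer(f"(?={site})", seq)]


def aflp_fragments(seq, rare="EcoRI", frequent="MseI"):
    """Fragments between adjacent cut sites that belong to different enzymes."""
    cuts = []
    for name in (rare, frequent):
        site, offset = ENZYMES[name]
        cuts += [(pos, name) for pos in cut_positions(seq, site, offset)]
    cuts.sort()
    # Keep only fragments with one EcoRI end and one MseI end.
    return [(p1, p2) for (p1, e1), (p2, e2) in zip(cuts, cuts[1:]) if e1 != e2]


def fraction_within_genes(fragments, genes):
    """Share of fragments whose midpoint falls inside a gene interval [start, end).

    The midpoint rule is an assumption of this sketch.
    """
    if not fragments:
        return 0.0
    hits = sum(
        any(g0 <= (a + b) // 2 < g1 for g0, g1 in genes) for a, b in fragments
    )
    return hits / len(fragments)


# Toy sequence (hypothetical): one EcoRI site followed by two MseI sites.
toy = "AAAGAATTCCCCCTTAAGGGGTTAACCC"
frags = aflp_fragments(toy)
print(frags)
print(fraction_within_genes(frags, genes=[(0, 12)]))
```

In the toy sequence only the EcoRI-MseI fragment is kept; the MseI-MseI fragment is discarded, mirroring the preferential amplification of fragments with two different ends.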


Figure 1: Distribution of the number of AFLP fragments, number of genes, and average GC content, across the different chromosomes of Arabidopsis thaliana, shown in non-overlapping windows of 200 kb. (A) Number of AFLP fragments (EcoRI/MseI) in red, number of genes in blue. (B) Average GC content. The approximate location of the centromeric regions is marked with an arrow.

Mentions: We first focus on the Arabidopsis thaliana genome, as a number of in-silico studies have previously been carried out on this species. The distribution of the number of AFLP fragments (EcoRI/MseI) and the number of genes across the different chromosomes is shown in non-overlapping windows of 200 kb in Figure 1A. AFLP fragments clearly accumulate in or around the centromeric regions, particularly for chromosomes 3 and 5. These increases in the number of fragments can be ascribed to the higher GC content of these genomic areas (Figure 1B). Indeed, although the number of MseI sites is lower in these regions than in others (Figure 2A), the number of EcoRI sites they contain is drastically increased (Figure 2B), leading to an increase in the number of AFLP fragments. Nevertheless, the excess of AFLP fragments around the centromeric regions virtually disappears when AFLP bands rather than fragments are considered in the analysis (Figure 3). The reason is that the centromeric regions contain repeated sequences that produce fragments of the same size, which can be expected to collide in the same electrophoretic band. To check this explanation, we looked in detail at the centromeric regions of chromosomes 3 and 5 as defined by The Arabidopsis Genome Initiative [46]. We found, for example, that an AFLP fragment sequence of 104 bp in the centromeric region of chromosome 3 was repeated 50 times. In chromosome 5 there was an AFLP fragment sequence of 117 bp repeated 63 times and one of 116 bp repeated 9 times.
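
The difference between counting fragments and counting bands can be made concrete with a short sketch. It bins fragments into non-overlapping 200 kb windows and, within each window, treats all fragments of identical length as one band. Collapsing by length within a window is an assumption of this sketch rather than the procedure documented in the paper, and the function name and input coordinates are hypothetical. Only the repeat count (a 104 bp fragment repeated 50 times) is taken from the text.

```python
from collections import defaultdict


def window_counts(fragments, window=200_000):
    """Map window index -> (number of fragments, number of distinct-size bands).

    A fragment is assigned to the window containing its start coordinate.
    """
    n_fragments = defaultdict(int)
    sizes = defaultdict(set)
    for start, end in fragments:
        w = start // window
        n_fragments[w] += 1
        sizes[w].add(end - start)  # same-length fragments share one band
    return {w: (n_fragments[w], len(sizes[w])) for w in sorted(n_fragments)}


# Hypothetical coordinates: 50 copies of a 104 bp repeat plus two unique fragments.
repeats = [(i * 1_000, i * 1_000 + 104) for i in range(50)]
unique = [(60_000, 60_250), (250_000, 250_300)]
print(window_counts(repeats + unique))
```

In this toy input the first window holds 51 fragments but only 2 bands, which is the kind of collapse that removes the apparent centromeric excess.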

