Genome-Wide Computational Analysis of Musa Microsatellites: Classification, Cross-Taxon Transferability, Functional Annotation, Association with Transposons & miRNAs, and Genetic Marker Potential.

Biswas MK, Liu Y, Li C, Sheng O, Mayer C, Yi G - PLoS ONE (2015)

Bottom Line: A high SSR frequency (177 per Mbp) was found in the Musa genome. A significant number of Musa SSRs are associated with pre-miRNAs, and 83% of these SSRs are promising candidates for the development of therapeutic SSR markers. These additional markers could be a valuable resource for marker-assisted breeding, genetic diversity and genomic studies of Musa and related species.


Affiliation: Institution of Fruit Tree Research, Guangdong Academy of Agricultural Sciences, Guangzhou, Guangdong Province, China; Key Laboratory of South Subtropical Fruit Biology and Genetic Resource Utilization, Ministry of Agriculture, Guangzhou, China; The College of Life Science, South China Agricultural University, Guangzhou, China.

ABSTRACT
The development of organized, informative, robust, user-friendly, and freely accessible molecular markers is imperative to the Musa marker-assisted breeding program. Although several hundred SSR markers have already been developed, the number of informative, robust, and freely accessible Musa markers remains inadequate for some breeding applications. In view of this issue, we surveyed SSRs in four different data sets, developed large-scale non-redundant highly informative therapeutic SSR markers, classified them according to their attributes, and analyzed their cross-taxon transferability and utility for the genetic study of Musa and its relatives. A high SSR frequency (177 per Mbp) was found in the Musa genome. AT-rich dinucleotide repeats are predominant, and trinucleotide repeats are the most abundant in transcribed regions. A significant number of Musa SSRs are associated with pre-miRNAs, and 83% of these SSRs are promising candidates for the development of therapeutic SSR markers. Overall, 74% of the SSR markers were polymorphic, and 94% were transferable to at least one Musa spp. Two hundred forty-three markers generated a total of 1047 alleles, with 2-8 alleles each and an average of 4.38 alleles per locus. The PIC values ranged from 0.31 to 0.89 and averaged 0.71. We report the largest set of non-redundant, polymorphic, new SSR markers to be developed in Musa. These additional markers could be a valuable resource for marker-assisted breeding, genetic diversity and genomic studies of Musa and related species.
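The abstract reports PIC (polymorphism information content) values per marker but does not state how they were computed. As a minimal sketch, assuming the widely used formula of Botstein et al. (1980), PIC = 1 - Σ p_i² - Σ_{i<j} 2 p_i² p_j², where p_i is the frequency of allele i at a locus, the calculation could look like this in Python. The function name `pic` and the example frequencies are illustrative and are not taken from the paper.

```python
from itertools import combinations


def pic(allele_freqs):
    """Polymorphism information content for one locus.

    Assumes the Botstein et al. (1980) definition; `allele_freqs` is a
    list of allele frequencies that must sum to 1.
    """
    if abs(sum(allele_freqs) - 1.0) > 1e-6:
        raise ValueError("allele frequencies must sum to 1")
    # Sum of squared frequencies (expected homozygosity).
    sum_sq = sum(p * p for p in allele_freqs)
    # Correction term over all unordered pairs of distinct alleles.
    cross = sum(2 * p * p * q * q for p, q in combinations(allele_freqs, 2))
    return 1.0 - sum_sq - cross


# Hypothetical locus with four equally frequent alleles.
print(round(pic([0.25, 0.25, 0.25, 0.25]), 3))  # 0.703
```

Under this definition a locus with a single allele has a PIC of 0, and the value rises as alleles become more numerous and more evenly distributed.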



pone.0131312.g001: (A) Relative frequency (%) of SSR classes, by number of repeats, in the four different data sets of Musa spp. (B) Detailed investigation of individual repeat motifs for each class of SSRs found in AA, BB, EST and GSS sequences.

Mentions: To facilitate the genome-wide identification, distribution and classification of perfect SSRs according to their attributes, we analyzed the 473 Mbp M. acuminata genome (data set AA), the 403 Mbp M. balbisiana genome (data set BB), 41 Mbp of EST (expressed sequence tag) sequences and 19 Mbp of GSS (genome survey sequence) sequences; the results are presented in Table 1, Figs 1 and 2, and S1–S4 Figs. In total, 87396, 79355, 7479 and 1850 SSRs, comprising different types of desirable repeat motifs (from di- to hexanucleotide repeats), were identified in the AA, BB, EST and GSS data sets, respectively (Table 1). The SSR densities of the A and B genomes are identical, but they are slightly lower than that of the EST data set. Additionally, we found that the GSS SSR density was almost two-fold greater than those of the other data sets studied. Combining the results of the four data sets revealed that 177 microsatellites were identified per megabase of Musa genome (see S1 Table). To compare the SSR density of Musa with other plant species, the whole genome sequences of 23 plant species were searched for SSRs using the same parameters. Surprisingly, Musa had higher microsatellite densities than most of the tested species, with the exceptions of O. sativa, A. chinensis, C. papaya, C. sativus, C. melo, P. persica, F. ananassa and V. vinifera (S1 Table). The relative SSR frequencies (%) and length distributions of various di- to hexanucleotide motifs of the four Musa data sets are presented in Fig 1A. Dinucleotide repeats were the most common SSR class in the AA, BB and GSS data sets, accounting for nearly 64% of SSRs overall, while 44% di- and 47% trinucleotide repeats were estimated for the EST data set. We also found that dinucleotide repeats were the most common repeat class in almost all of the plant genomes tested, with the exceptions of B. distachyon and L. usitatissimum (see S1 Table).
Our results reveal that, for every repeat class from di- to hexanucleotide, SSR frequency decreased as the number of repeat units increased. As shown in Fig 1A, the frequency of dinucleotide repeats declined more gradually with increasing repeat number than that of the longer motifs, and tetra- to hexanucleotide repeats showed the steepest decline.
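The search for perfect di- to hexanucleotide SSRs described above can be illustrated with a short Python sketch. The excerpt does not name the search tool or the minimum repeat thresholds that were used, so the thresholds in `ASSUMED_MIN_REPEATS`, the `SSR` record and the functions `is_primitive`, `find_ssrs` and `density_per_mbp` are all hypothetical choices made for illustration, not the authors' method.

```python
from typing import NamedTuple

# Assumed minimum repeat counts per motif length (not given in the excerpt).
ASSUMED_MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}
VALID_BASES = set("ACGT")


class SSR(NamedTuple):
    start: int    # 0-based start, inclusive
    end: int      # 0-based end, exclusive
    motif: str
    repeats: int


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. ATAT)."""
    k = len(motif)
    return not any(
        k % d == 0 and motif == motif[:d] * (k // d) for d in range(1, k)
    )


def find_ssrs(seq, min_repeats=ASSUMED_MIN_REPEATS):
    """Return perfect SSRs, scanning each motif length independently."""
    seq = seq.upper()
    hits = []
    for k, min_rep in sorted(min_repeats.items()):
        i = 0
        while i + k <= len(seq):
            motif = seq[i:i + k]
            if not set(motif) <= VALID_BASES or not is_primitive(motif):
                i += 1
                continue
            # Extend the run one whole motif copy at a time.
            j = i + k
            while seq.startswith(motif, j):
                j += k
            repeats = (j - i) // k
            if repeats >= min_rep:
                hits.append(SSR(i, j, motif, repeats))
                i = j  # continue after the reported repeat
            else:
                i += 1
    return hits


def density_per_mbp(n_ssrs, n_bases):
    """SSR count normalised to one megabase of sequence."""
    return n_ssrs / (n_bases / 1_000_000)


# Toy sequence: an (AT)7 repeat, a (GAA)5 repeat and a mononucleotide T run.
demo = "GC" + "AT" * 7 + "CG" + "GAA" * 5 + "C" + "T" * 12
for hit in find_ssrs(demo):
    print(hit)
```

The `is_primitive` check keeps a mononucleotide run such as the T stretch from being counted as a dinucleotide or trinucleotide SSR, and keeps an (AT)n repeat from being reported again as (ATAT)n. Changing the assumed thresholds changes the counts and therefore the densities, which is why comparisons across species are only meaningful when, as in the text, the same parameters are used for every genome.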