Next Generation Semiconductor Based Sequencing of the Donkey (Equus asinus) Genome Provided Comparative Sequence Data against the Horse Genome and a Few Millions of Single Nucleotide Polymorphisms.

Bertolini F, Scimone C, Geraci C, Schiavo G, Utzeri VJ, Chiofalo V, Fontanesi L - PLoS ONE (2015)

Bottom Line: The Ion Torrent Personal Genome Analyzer was used to sequence reduced representation libraries (RRL) obtained from a DNA pool including donkeys of different breeds (Grigio Siciliano, Ragusano and Martina Franca). About 4.8 million single nucleotide polymorphisms (SNPs) in the donkey genome were identified and annotated by combining sequencing data from Ion Proton (whole genome sequencing) and Ion Torrent (RRL) runs with Illumina reads. The SNPs we identified constitute a first resource useful to describe variability at the population genomic level in E. asinus and to establish monitoring systems for the conservation of donkey genetic resources.


Affiliation: Department of Agricultural and Food Sciences, Division of Animal Sciences, University of Bologna, Viale Fanin 46, Bologna, Italy; Department of Veterinary Sciences, Animal Production Unit, University of Messina, Polo Universitario dell'Annunziata, Messina, Italy.

ABSTRACT
Few studies have investigated the donkey (Equus asinus) at the whole genome level so far. Here, we sequenced the genomes of two male donkeys using a next generation semiconductor based sequencing platform (the Ion Proton sequencer) and compared the obtained sequence information with the available donkey draft genome (and the Illumina reads from which it originated) and with the EquCab2.0 assembly of the horse genome. Moreover, the Ion Torrent Personal Genome Analyzer was used to sequence reduced representation libraries (RRL) obtained from a DNA pool including donkeys of different breeds (Grigio Siciliano, Ragusano and Martina Franca). The number of next generation sequencing reads aligned with the EquCab2.0 horse genome was larger than the number aligned with the draft donkey genome. This was due to the larger N50 for contigs and scaffolds of the horse genome. Nucleotide divergence between E. caballus and E. asinus was estimated to be ~0.52-0.57%. Regions with low nucleotide divergence were identified in several autosomal chromosomes and in the whole chromosome X. These regions might be evolutionarily important in equids. Comparing Y-chromosome regions, we identified variants that could be useful to track donkey paternal lineages. Moreover, about 4.8 million single nucleotide polymorphisms (SNPs) in the donkey genome were identified and annotated by combining sequencing data from Ion Proton (whole genome sequencing) and Ion Torrent (RRL) runs with Illumina reads. A higher density of SNPs was present in regions homologous to horse chromosome 12, in which several studies reported a high frequency of copy number variants. The SNPs we identified constitute a first resource useful to describe variability at the population genomic level in E. asinus and to establish monitoring systems for the conservation of donkey genetic resources.



pone.0131925.g002: Coverage of the mapped donkey reads on the EquCab2.0 chromosomes. Data are reported for the different chromosomes and for the three sequenced donkeys (Peppe and Pippo with Ion Proton and Willy with Illumina [36]). P+P (the merged sequence output of Peppe and Pippo) almost completely overlaps the output of Willy and is therefore not always directly visible in the figure.

Mentions: The first draft assembly of the donkey genome, reported in Orlando et al. [36] and generated from Illumina sequences of a donkey (Willy) reared in the Copenhagen zoo (Denmark), is based on 90,287 scaffolds and 420,214 shorter contigs. The scaffolds account for a total of about 2.29 Gbp, while the contigs account for 60.48 million nucleotides, for an overall total of 2.35 Gbp. The N50 sizes of the contigs and of the scaffolds are 6.38 kbp and 100.94 kbp, respectively. The EquCab2.0 horse genome has 2.37 Gbp, and the N50 sizes of its contigs (no. = 55,316) and scaffolds (no. = 9,687) are 112.38 kbp and 46 Mbp, respectively [27]. These features affected the number of reads from the Ion Proton runs of the two donkeys that aligned to the two Equus genomes (Table 1). Considering the two animals together, a total of about 214.15 million reads were aligned to the draft donkey genome (scaffolds+contigs), whereas about 8.48 million reads were not mapped using the stringent procedure adopted (see Methods). The same analysis against the horse reference genome (EquCab2.0, including the autosomes and chromosome X) produced 236.67 million mapped Ion Proton reads and only 2,286 unmapped reads. Mean depth of the produced reads was 11.06 X with 94.76% coverage and 12.22 X with 97.54% coverage considering the draft donkey and the EquCab2.0 genome versions, respectively. Orlando et al. [36] had already reported that fewer Illumina donkey reads aligned to the draft donkey scaffolds than to the EquCab2.0 genome, even though the draft E. asinus genome was assembled de novo from those same Illumina reads. This is probably due to the very preliminary assembly attempted for the donkey genome, which was nevertheless useful to deduce evolutionary information [36].
For comparison, we aligned the Illumina reads obtained for the donkey genome [36] against the EquCab2.0 genome and obtained 538,783,052 mapped reads (with a mean depth of coverage of 15.85 X). A summary of the mapped donkey Ion Proton and Illumina reads on the horse autosomes and chromosome X available in the EquCab2.0 genome version and their chromosome coverage are reported in Figs 1 and 2, respectively. Details are reported in S3 Table.
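
The assembly and mapping statistics quoted above can be illustrated with a short sketch. This is not the authors' pipeline. It only shows the usual definition of N50 (the length of the sequence at which the running sum of lengths, sorted from longest to shortest, first reaches half of the total) and the fraction of mapped reads implied by the rounded counts in the text. The function names `n50` and `mapped_fraction` and the toy contig lengths are hypothetical and used for illustration only.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (usual definition)."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # First sequence at which the cumulative length reaches half the total
        if running >= half_total:
            return length


def mapped_fraction(mapped, unmapped):
    """Return the share of reads that were mapped, as a value in [0, 1]."""
    total = mapped + unmapped
    if total == 0:
        raise ValueError("no reads given")
    return mapped / total


# Toy contig lengths (hypothetical, not taken from either assembly)
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
toy_n50 = n50(toy_contigs)

# Rounded Ion Proton read counts quoted in the text
donkey_rate = mapped_fraction(214.15e6, 8.48e6)
horse_rate = mapped_fraction(236.67e6, 2286)

print(f"toy N50: {toy_n50}")
print(f"mapped to draft donkey genome: {donkey_rate:.2%}")
print(f"mapped to EquCab2.0: {horse_rate:.4%}")
```

With the rounded counts from the text, roughly 96% of the Ion Proton reads that were mapped or reported as unmapped aligned to the draft donkey genome, against more than 99.99% for EquCab2.0.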

