Construction and analysis of high-density linkage map using high-throughput sequencing data.

Liu D, Ma C, Hong W, Huang L, Liu M, Liu H, Zeng H, Deng D, Xin H, Song J, Xu C, Sun X, Hou X, Wang X, Zheng H - PLoS ONE (2014)

Bottom Line: HighMap employs an iterative ordering and error correction strategy based on a k-nearest neighbor algorithm and a Monte Carlo multipoint maximum likelihood algorithm. The singleton rate was less than one-ninth of that generated by JoinMap4.1. It will facilitate genome assembling, comparative genomic analysis, and QTL studies.

Affiliation: Biomarker Technologies Corporation, Beijing, China.

ABSTRACT
Linkage maps enable the study of important biological questions. The construction of high-density linkage maps has become more feasible since the advent of next-generation sequencing (NGS), which eases SNP discovery and high-throughput genotyping of large populations. However, the explosion in marker numbers and the genotyping errors in NGS data challenge the computational efficiency and map quality of linkage study methods. Here we report the HighMap method for constructing high-density linkage maps from NGS data. HighMap employs an iterative ordering and error correction strategy based on a k-nearest neighbor algorithm and a Monte Carlo multipoint maximum likelihood algorithm. A simulation study shows that HighMap can create a linkage map with three times as many markers as ordering-only methods while offering more accurate marker orders and stable genetic distances. Using HighMap, we constructed a common carp linkage map with 10,004 markers. The singleton rate was less than one-ninth of that generated by JoinMap4.1. Its total map distance was 5,908 cM, consistent with reports on low-density maps. HighMap is an efficient method for constructing high-density, high-quality linkage maps from high-throughput population NGS data. It will facilitate genome assembly, comparative genomic analysis, and QTL studies. HighMap is available at http://highmap.biomarker.com.cn/.
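
To make the idea of neighbor-based error correction concrete, the following is a minimal sketch of a single correction pass over one individual. It is not HighMap's implementation: the 0/1 genotype coding, the function name `knn_correct`, the unanimity rule, and the default `k` are all assumptions made for illustration. The iterative re-ordering and the Monte Carlo multipoint maximum likelihood step named in the abstract are not shown.

```python
from typing import List, Optional

# Assumed coding: 0 or 1 for the two phased parental alleles, None for a missing call.
Genotype = Optional[int]


def knn_correct(row: List[Genotype], k: int = 2) -> List[Genotype]:
    """Return a copy of one individual's genotypes (in map order) in which a call
    is flipped when its k nearest called neighbours on each side unanimously
    disagree with it. Missing calls and calls at the map ends are left untouched."""
    corrected = list(row)
    for m, call in enumerate(row):
        if call is None:
            continue
        # Up to k nearest non-missing calls on each side, taken from the original row.
        left = [g for g in reversed(row[:m]) if g is not None][:k]
        right = [g for g in row[m + 1:] if g is not None][:k]
        if not left or not right:
            continue  # no flanking evidence on one side
        votes = set(left + right)
        if len(votes) == 1 and call not in votes:
            # Isolated call contradicted by every neighbour: treat it as an error.
            corrected[m] = left[0]
    return corrected


if __name__ == "__main__":
    # The isolated 1 at index 2 is corrected; the real crossover after index 4 is kept.
    print(knn_correct([0, 0, 1, 0, 0, 1, 1, 1], k=2))
```

Requiring a unanimous vote is a deliberately conservative choice in this sketch: next to a genuine recombination breakpoint the neighbors disagree with each other, so no call is changed there.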

Figure 3 (pone-0098855-g003): Changes in linkage map quality as genotyping error increased. The X-axis represents the genotyping error rate. The Y-axis represents the Spearman rank correlation coefficient between estimated marker order and true marker location (A, B and C), the singleton rate (D, E and F), and the estimated genetic map distance (G, H and I). “Integrated”, “Female”, and “Male” indicate integrated, female, and male linkage maps, respectively. JoinMap4.0 failed to construct a linkage map when the error rate exceeded about 14%, owing to its inefficiency in estimating linkage phases.

Mentions: To assess the performance of HighMap in marker order accuracy and map distance stability, simulated data sets were generated for a full-sib family of 200 offspring with 200 markers, containing different proportions of missing observations or genotyping errors. The Spearman correlation coefficient between the true and calculated marker order decreased less with HighMap than with JoinMap4.1 as the marker error rate increased. The difference in correlation coefficient between HighMap and JoinMap4.1 was more pronounced when the error rate exceeded 20% (Figure 3). This result demonstrated that HighMap could produce linkage maps of higher quality than JoinMap4.1 when the data contained a large proportion of erroneous markers. The singleton rate of HighMap grew slowly as error rates increased, whereas that of JoinMap4.1 rose linearly. When the marker data contained 20% error, HighMap yielded a singleton rate of only 3.3%, whereas JoinMap4.1 yielded 14.4%, suggesting that HighMap detected and eliminated most genotyping errors from the data. Both the correlation and singleton analyses revealed that JoinMap4.0 was sensitive not only to erroneous data but also to missing data (Figure 3 and Figure S2). It failed to construct a linkage map when the error rate exceeded about 14%, owing to its inefficiency in estimating linkage phases (Figure 3). Collectively, HighMap remarkably outperformed both JoinMap4.0 and JoinMap4.1 with respect to marker order accuracy.
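
The two quality measures used in this comparison can be sketched in a few lines. The code below is illustrative only and is not taken from HighMap or JoinMap: the function names are hypothetical, genotypes are assumed to be coded 0/1 with None for missing calls, and a singleton is assumed to mean a call that differs from both adjacent markers in the same individual while those two agree with each other (an apparent double crossover). The source does not give its exact definition.

```python
from typing import List, Optional, Sequence


def _ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for t in range(i, j + 1):
            ranks[order[t]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman(true_pos: Sequence[float], est_pos: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the two rank vectors."""
    rx, ry = _ranks(true_pos), _ranks(est_pos)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (vx * vy) ** 0.5


def singleton_rate(genotypes: List[List[Optional[int]]]) -> float:
    """Fraction of scorable interior calls that are singletons (assumed definition).

    Each row holds one individual's calls in map order."""
    singletons = scored = 0
    for row in genotypes:
        for m in range(1, len(row) - 1):
            left, mid, right = row[m - 1], row[m], row[m + 1]
            if None in (left, mid, right):
                continue  # cannot score a call without both neighbours
            scored += 1
            if left == right and mid != left:
                singletons += 1
    return singletons / scored if scored else 0.0


if __name__ == "__main__":
    # Hypothetical positions in cM: the second and third markers are swapped in the estimate.
    print(spearman([0.0, 1.5, 3.0, 4.2, 6.0], [0.0, 3.1, 1.4, 4.0, 6.3]))
    print(singleton_rate([[0, 0, 1, 0, 0], [1, 1, 1, 0, 0], [0, None, 0, 0, 0]]))
```

A correlation near 1 means the estimated order closely follows the true order, while a high singleton rate points to uncorrected genotyping errors or misplaced markers.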

