A general and efficient method for estimating continuous IBD functions for use in genome scans for QTL.

Besnier F, Carlborg O - BMC Bioinformatics (2007)

Bottom Line: Estimation of IBD functions improved the computational efficiency and memory usage in genome scanning for QTL. Storing IBD as polynomial IBD functions was also shown to reduce the amount of memory required in genome scans for QTL. The approaches will be particularly useful in genome scans for multiple interacting QTL, where improvements in both computational and memory efficiency are key to developing efficient optimization algorithms that allow widespread use of this methodology.


Affiliation: Linnaeus Centre for Bioinformatics, Uppsala University, SE-75124 Uppsala, Sweden. francois.besnier@lcb.uu.se

ABSTRACT

Background: Identity by descent (IBD) matrix estimation is a central component in mapping of Quantitative Trait Loci (QTL) using variance component models. A large number of algorithms have been developed for estimation of IBD between individuals in populations at discrete locations in the genome for use in genome scans to detect QTL affecting various traits of interest in experimental animal, human and agricultural pedigrees. Here, we propose a new approach to estimate IBD as continuous functions rather than as discrete values.

Results: Estimation of IBD functions improved the computational efficiency and memory usage in genome scanning for QTL. We have explored two approaches to obtain continuous marker-bracket IBD functions. By re-implementing an existing and fast deterministic IBD-estimation method, we show that this approach results in IBD functions that produce exactly the same IBD as the original algorithm, but with a greater than 2-fold improvement in computational efficiency and a considerably lower memory requirement for storing the resulting genome-wide IBD. By developing a general IBD function approximation algorithm, we show that it is possible to estimate marker-bracket IBD functions from IBD matrices estimated at marker locations by any existing IBD estimation algorithm. The general algorithm provides approximations that lead to QTL variance component estimates that, even in worst-case scenarios, are very similar to the true values. Storing IBD as polynomial IBD functions was also shown to reduce the amount of memory required in genome scans for QTL.

Conclusion: In addition to direct improvements in computational and memory efficiency, estimation of IBD-functions is a fundamental step needed to develop and implement new efficient optimization algorithms for high precision localization of QTL. Here, we discuss and test two approaches for estimating IBD functions based on existing IBD estimation algorithms. Our approaches provide immediately useful techniques for use in single QTL analyses in the variance component QTL mapping framework. They will, however, be particularly useful in genome scans for multiple interacting QTL, where the improvements in both computational and memory efficiency are the key for successful development of efficient optimization algorithms to allow widespread use of this methodology.
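
To make the idea of a polynomial IBD function concrete, the following minimal sketch fits a second-degree polynomial to the IBD of a single pair of individuals and evaluates it at an arbitrary position in a marker bracket. It is not the authors' approximation algorithm: the function names, the choice of exactly three sample positions, and the IBD values are all hypothetical and serve only to illustrate replacing discrete IBD values with a continuous function.

```python
def fit_quadratic(points):
    """Return (a, b, c) of a*x**2 + b*x + c passing through three (x, y) points.

    Uses Newton divided differences; the three x values must be distinct.
    """
    (x0, y0), (x1, y1), (x2, y2) = points
    d01 = (y1 - y0) / (x1 - x0)          # first divided difference
    d12 = (y2 - y1) / (x2 - x1)
    a = (d12 - d01) / (x2 - x0)          # second divided difference
    b = d01 - a * (x0 + x1)
    c = y0 - a * x0 ** 2 - b * x0
    return a, b, c


def ibd_at(coeffs, x):
    """Evaluate the polynomial IBD function at position x (Horner form)."""
    a, b, c = coeffs
    return (a * x + b) * x + c


# Hypothetical IBD values for one pair of individuals at the left marker,
# the bracket midpoint, and the right marker (positions scaled to [0, 1]).
samples = [(0.0, 0.50), (0.5, 0.40), (1.0, 0.35)]
coeffs = fit_quadratic(samples)

# Only the three coefficients need to be stored for this pair; the IBD at
# any tested location in the bracket is then recomputed on demand.
for position in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(position, round(ibd_at(coeffs, position), 4))
```

In a genome scan, one such coefficient triple would be kept per bracket and per pair of individuals with non-zero IBD, rather than one IBD value per pair at every tested location.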


Figure 3: The required memory (in Mb) to store all IBD matrices in a marker bracket as individual matrices or as a single IBD function matrix for increasing numbers of tested locations in the bracket.

Mentions: Storing a typical IBD matrix for the chicken pedigree used in this study at single precision requires approximately $m = 872^2 \times 0.5 \times 2 \times 10^{-5} \approx 7$ Mb, as there are 872 individuals in the pedigree and around 50% non-zero elements. If, on the other hand, the IBD functions for a marker interval are stored as a sparse matrix that contains the three parameters $(a, b, c)$ of a second-degree polynomial $ax^2 + bx + c$ for each bracket and pair of individuals at single precision, then the constant $p$ in [1] works out to be $\approx 4 \times 10^{-5}$, and the memory requirement to store one IBD-function matrix comes to about $m = 872^2 \times 0.5 \times 4 \times 10^{-5} \approx 14$ Mb. The advantage of this storage approach for varying numbers of IBD matrices in a marker bracket is illustrated in Figure 3.
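
The arithmetic above can be reproduced with a short sketch that compares the memory needed for k individually stored IBD matrices against a single IBD-function matrix. The function and constant names are hypothetical, and the sketch assumes that the cost of individual matrices grows linearly with the number of tested locations, as the caption of Figure 3 suggests. With the rounded inputs quoted in the text (872 individuals, 50% non-zero elements), the products come out near 7.6 Mb and 15.2 Mb, slightly above the approximate figures given; the exact inputs behind those figures are not stated here.

```python
def ibd_storage_mb(n_individuals, nonzero_fraction, p):
    """Memory in Mb for one sparse matrix: n^2 * nonzero fraction * p."""
    return n_individuals ** 2 * nonzero_fraction * p


# Values quoted in the text (rounded); the names are illustrative only.
N_INDIVIDUALS = 872
NONZERO_FRACTION = 0.5
P_MATRIX = 2e-5      # constant p for one plain IBD matrix
P_FUNCTION = 4e-5    # constant p for one (a, b, c) IBD-function matrix


def individual_matrices_mb(n_locations):
    """Memory to store one IBD matrix per tested location in the bracket."""
    return n_locations * ibd_storage_mb(N_INDIVIDUALS, NONZERO_FRACTION, P_MATRIX)


def function_matrix_mb():
    """Memory to store one IBD-function matrix, independent of locations."""
    return ibd_storage_mb(N_INDIVIDUALS, NONZERO_FRACTION, P_FUNCTION)


if __name__ == "__main__":
    print("locations  individual_Mb  function_Mb")
    for k in (1, 2, 5, 10, 50):
        print(k, round(individual_matrices_mb(k), 1), round(function_matrix_mb(), 1))
```

Under these assumptions the two storage schemes cost the same at two tested locations per bracket, and the function matrix is cheaper for any denser scan, because its size does not depend on the number of locations.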

