The effects of linkage disequilibrium in large scale SNP datasets for MDR.

Grady BJ, Torstenson ES, Ritchie MD - BioData Min (2011)

Bottom Line: In this study, we examined the effect of LD on the sensitivity of the Multifactor Dimensionality Reduction (MDR) software package. Higher levels of LD begin to confound the MDR algorithm and lead to a drop in sensitivity with respect to the identification of a direct association; it does not, however, affect the ability to detect indirect association. As such, the results of MDR analysis in datasets with LD should be carefully examined to consider the underlying LD structure of the dataset.

Affiliation: Center for Human Genetics Research, Vanderbilt University Medical Center, Nashville, TN 37232, USA. ritchie@chgr.mc.vanderbilt.edu.

ABSTRACT

Background: In the analysis of large-scale genomic datasets, an important consideration is the power of analytical methods to identify accurate predictive models of disease. When trying to assess sensitivity from such analytical methods, a confounding factor up to this point has been the presence of linkage disequilibrium (LD). In this study, we examined the effect of LD on the sensitivity of the Multifactor Dimensionality Reduction (MDR) software package.

Results: Four relative amounts of LD were simulated in multiple one- and two-locus scenarios in which the position of the functional SNP(s) within LD blocks varied. Simulated data were analyzed with MDR to determine the sensitivity of the method in different contexts, where sensitivity was gauged as the number of times out of 100 that the method identified the correct one- or two-locus model as the best overall model. As the amount of LD increases, the sensitivity of MDR to detect the correct functional SNP drops, but the sensitivity to detect the disease signal and find an indirect association increases.

Conclusions: Higher levels of LD begin to confound the MDR algorithm and lead to a drop in sensitivity with respect to the identification of a direct association; they do not, however, affect the ability to detect indirect association. Careful examination of the solution models generated by MDR reveals that MDR can identify loci in the correct LD block, though the locus identified is not always the functional SNP. As such, the results of MDR analysis in datasets with LD should be carefully examined to consider the underlying LD structure of the dataset.

Figure 5: Sensitivity of MDR for two-locus disease models. The detection sensitivity of MDR when analyzing data with 40% LD, 60% LD, 80% LD or 95% LD and attempting to identify purely epistatic two-locus models.

Mentions: Three types of two-locus epistatic interactions were examined: two SNPs in different blocks of LD, two SNPs in the same block of LD, and two SNPs with one inside an LD block and the other outside of any called block of LD. The case in which two SNPs were in the same block of LD could not be simulated from the 40% LD data pool because it lacked sufficiently large LD blocks. For each model, effect sizes of 5%, 10% and 15% broad-sense heritability were simulated. Each disease model was purely epistatic, meaning that there should be no detectable marginal effect from either functional SNP. For all two-locus models, analysis was run as described, followed by a progression of removing one, the other, and finally both functional SNPs. The goal of this series of experiments was to determine whether MDR can detect the underlying genetic signal using LD (i.e. indirect association), or whether it is limited in sensitivity to do so. The results of these MDR analyses are shown in Table 2 and trends in the data are illustrated in Figure 5.

The case in which two functional SNPs were separated in two different LD blocks and the case where only one functional SNP resided in an LD block displayed similar trends. First, the sensitivity of MDR for most models increased proportionally with effect size from 5% to 15% heritability but not significantly between 15% and 25% heritability. MDR had high signal sensitivity in both low and high LD, while exact sensitivity dropped in high LD datasets with two loci in separate LD blocks or only one locus in an LD block. In addition, the ability to detect the disease signal in the absence of the actual functional SNPs increased with the amount of LD in the dataset. Surprisingly, when the SNP not in an LD block (SNP1) was dropped from high LD datasets before analysis, there was still considerable signal sensitivity. This phenomenon is most likely the result of patterns of long-range LD (not necessarily considered part of an LD block).

With the exception of the models with the lowest LD amounts, there was little difference in signal sensitivity between analyses with one or both functional SNPs removed. Interestingly, in some instances there was more sensitivity with both SNPs removed than with the removal of only the second SNP. Once again, in many cases the drop in signal sensitivity after the functional SNPs were removed resulted from the selection of SNPs whose LD with the functional SNPs fell below the threshold.
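The distinction between exact sensitivity (the best model is the functional SNP pair itself) and signal sensitivity (the best model reaches the functional SNPs through LD) can be illustrated with a minimal Python sketch. It is not the authors' code: the SNP names, the r^2 values, the 0.8 LD threshold and the rule used to decide that a model "tags" the signal are all assumptions made for illustration.

```python
from itertools import permutations

def ld(r2, a, b):
    """Pairwise r^2 between two SNPs; a SNP is in perfect LD with itself."""
    if a == b:
        return 1.0
    # Pairs missing from the table are treated as unlinked (assumption).
    return r2.get(frozenset((a, b)), 0.0)

def tags_signal(model, functional, r2, threshold=0.8):
    """True if each functional SNP is matched by a distinct model SNP that is
    either the SNP itself or in LD with it at or above the threshold."""
    if len(model) != len(functional):
        return False
    return any(
        all(ld(r2, m, f) >= threshold for m, f in zip(perm, functional))
        for perm in permutations(model)
    )

def sensitivities(best_models, functional, r2, threshold=0.8):
    """Return (exact, signal) sensitivity as fractions of replicate datasets."""
    n = len(best_models)
    exact = sum(set(m) == set(functional) for m in best_models)
    signal = sum(tags_signal(m, functional, r2, threshold) for m in best_models)
    return exact / n, signal / n

# Hypothetical example: four replicate datasets, one best model per replicate.
functional = ("snpA", "snpB")
r2 = {
    frozenset(("snpA", "snpA_proxy")): 0.9,  # strong LD with snpA
    frozenset(("snpB", "snpB_weak")): 0.5,   # LD below the threshold
}
best_models = [
    ("snpA", "snpB"),        # exact hit
    ("snpA_proxy", "snpB"),  # indirect association via LD
    ("snpA", "snpB_weak"),   # proxy below threshold: not counted
    ("snpX", "snpB"),        # unlinked SNP: not counted
]
exact, signal = sensitivities(best_models, functional, r2)
print(exact, signal)
```

Under these assumptions an exact hit always counts toward signal sensitivity as well, so signal sensitivity can never fall below exact sensitivity; the study itself reports counts out of 100 replicate datasets rather than fractions.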

