A distribution-free multi-factorial profiler for harvesting information from high-density screenings.

Besseris GJ - PLoS ONE (2013)

Bottom Line: Partial effects are sliced off systematically from the investigated response to form individual contrasts using simple robust measures. The main benefits of the method are: 1) easy to grasp, 2) well-explained test-power properties, 3) distribution-free, 4) sparsity-free, 5) calibration-free, 6) simulation-free, 7) easy to implement, and 8) expanded usability to any type and size of multi-factorial screening designs. The method is elucidated with a benchmarked profiling effort for a water filtration process.


Affiliation: Department of Mechanical Engineering, Advanced Industrial & Manufacturing Systems Program, Technological Educational Institute of Piraeus, Aegaleo, Greece. besseris@teipir.gr

ABSTRACT
Data screening is an indispensable phase in initiating the scientific discovery process. Fractional factorial designs offer quick and economical options for engineering highly dense structured datasets. Maximum information content is harvested when a selected fractional factorial scheme is driven to saturation while data gathering is suppressed to no replication. A novel multi-factorial profiler is presented that allows screening of saturated-unreplicated designs by decomposing the examined response into its constituent contributions. Partial effects are sliced off systematically from the investigated response to form individual contrasts using simple robust measures. By isolating, at each step, the disturbance attributed solely to a single controlling factor, Wilcoxon-Mann-Whitney rank statistics are employed to assign significance. We demonstrate that the proposed profiler possesses its own self-checking mechanism for detecting a potential influence due to fluctuations attributed to the remaining unexplainable error. The main benefits of the method are: 1) easy to grasp, 2) well-explained test-power properties, 3) distribution-free, 4) sparsity-free, 5) calibration-free, 6) simulation-free, 7) easy to implement, and 8) expanded usability to any type and size of multi-factorial screening designs. The method is elucidated with a benchmarked profiling effort for a water filtration process.
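
To make the mechanics concrete, below is a minimal Python sketch of the general idea in the abstract, assuming a two-level saturated-unreplicated design coded as -1/+1. The median-based removal of the other factors' partial effects, and the helper screen_factor itself, are illustrative assumptions on our part; they paraphrase, rather than reproduce, the paper's exact robust slicing scheme.

```python
# Illustrative approximation of the rank-based profiler described in the
# abstract -- NOT Besseris' exact residualization, which is only paraphrased
# here using median-based partial effects.
import numpy as np
from scipy.stats import mannwhitneyu

def screen_factor(y, design, j, alpha=0.05):
    """Test factor j of a two-level saturated-unreplicated design.

    y      : (n,) response vector, one observation per run
    design : (n, k) matrix of coded factor levels (-1 / +1)
    j      : index of the factor under test
    """
    # Slice off the median-based partial effect of every *other* factor so
    # that the remaining scatter is attributed, as far as possible, to
    # factor j alone.
    r = y.astype(float)
    for m in range(design.shape[1]):
        if m == j:
            continue
        hi, lo = design[:, m] == 1, design[:, m] == -1
        half_eff = 0.5 * (np.median(r[hi]) - np.median(r[lo]))
        r[hi] -= half_eff
        r[lo] += half_eff
    # Wilcoxon-Mann-Whitney rank-sum comparison of the two groups
    # defined by factor j assigns the significance.
    lo_grp = r[design[:, j] == -1]
    hi_grp = r[design[:, j] == 1]
    stat, p = mannwhitneyu(lo_grp, hi_grp, alternative="two-sided")
    return p, p < alpha
```

In an eight-run saturated design, for instance, each call compares two residualized groups of four runs, which is within reach of the exact Wilcoxon-Mann-Whitney distribution.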


pone-0073275-g005: Half-normal plot (MINITAB 16.0) of the effects for the screening of filtration time at significance levels α = 0.1 (A) and α = 0.05 (B).

Mentions: Our proposed technique dispenses with one of the most controversial assumptions in high-density, low-sampling data screening: the so-called sparsity assumption [21], [28], [30]. The sparsity assumption raises several impediments to interpreting unreplicated-saturated fractional factorial data. First, it precludes from the start the possibility that all studied effects may turn out to be active. In effect, this a priori condition demands that some information always be sacrificed in order to extract any information at all. Consequently, the quality of a data-conversion effort can be assessed only by weighing in sparsity. The crucial and long-unresolved predicament is how much sparsity must be present in the design to generate acceptable results. This persistent source of uncertainty lurks in several sparsity-dependent techniques available through mainstream data-analytics software packages; strikingly, no stochastic indicators of sparsity have ever been reported. A popular commercial combination is the half-normal plot [28] with the Lenth method [30]. In Figure 5 we provide the MINITAB 16.0 output for the filtration-time (FT) screening, which automatically returns a result by synthesizing information from the half-normal plot and the Lenth method. The low detectability at the typical error rate of 0.05 is pronounced in the profiling shown in Figure 5B. Even with the screen set at the coarser error rate of 0.1 (Figure 5A), the three active effects are not fully recovered. In contrast to other permutation-based techniques, the rank-sum approach developed herein is sparsity-free; it therefore eliminates any dependence on empirical calibration [37]. Calibration is known to produce conflicting conclusions because it relies on different cut-off scales devised separately for the individual error rate (IER) and the experiment-wise error rate (EER) [49]. Depending on how the analyst intends to filter a group of contrasts, the two calibrations may not lead to consistent decisions. Non-parametric modifications that refine the critical values of the Lenth test at the two principal error rates have not substantially improved the quality of the prediction [37], [49]. Nevertheless, better performance is attained with the corrections published by Ye and Hamada [49]. For example, with the experiment-wise error rate set at 0.05, the Lenth-test cut-off improves to 23.18; yet, as the Pareto chart of the effects in Figure 6 shows, even with this refinement no influence can be detected. With the experiment-wise error rate set at 0.1, the cut-off drops further to 17.6, but even in this permissive mode two-thirds of the dominant factors are still not recovered. The EER must be raised to 0.30 before all three active effects can be distinguished. The corresponding IER value predicted by the Ye and Hamada [49] method agrees with our result only when their comparison level is raised to at least 0.06. As a side note, the residualization of the original FT response has been reconfigured in our technique so that exchangeability of the residuals is maintained for all possible factorial comparisons [38].
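
For reference, the benchmark criticized in this passage rests on Lenth's pseudo standard error (PSE). The sketch below follows the standard construction of Lenth's method; the IER- and EER-specific critical multipliers of Ye and Hamada [49] mentioned in the text are tabulated in their paper, so the multiplier t_crit is left as an input rather than recomputed here.

```python
# Standard construction of Lenth's pseudo standard error (PSE) for
# screening effect contrasts from an unreplicated factorial design.
import numpy as np

def lenth_pse(contrasts):
    """Lenth's pseudo standard error for a vector of effect contrasts."""
    c = np.abs(np.asarray(contrasts, dtype=float))
    s0 = 1.5 * np.median(c)                  # initial robust scale estimate
    pse = 1.5 * np.median(c[c < 2.5 * s0])   # re-estimate after trimming large effects
    return pse

def lenth_cutoff(contrasts, t_crit):
    """Effect-scale cut-off: effects whose magnitude exceeds it are flagged active.

    t_crit is the critical multiplier for the chosen error rate, e.g. the
    IER- or EER-specific values tabulated by Ye and Hamada [49].
    """
    return t_crit * lenth_pse(contrasts)
```

Because the trimming step presumes that most contrasts reflect only noise, the PSE inflates when many effects are active, which is the sparsity dependence the rank-sum profiler is designed to avoid.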

