Type I error control for tree classification.

Jung SH, Chen Y, Ahn H - Cancer Inform (2014)

Bottom Line: Binary tree classification has been useful for classifying a whole population based on the levels of an outcome variable that is associated with chosen predictors. Nonetheless, there have not been many publications addressing the type I error inflation that this approach incurs. In this paper, we propose a binary tree classification method that keeps the probability of falsely accepting a predictor below a certain level, say 5%.


Affiliation: Department of Biostatistics and Bioinformatics, Duke University, Durham, NC 27710, USA.

ABSTRACT
Binary tree classification has been useful for classifying a whole population based on the levels of an outcome variable that is associated with chosen predictors. Often we start a classification with a large number of candidate predictors, and each predictor can be split at a number of different cutoff values. Because of these two types of multiplicity, the binary tree classification method is subject to a severely inflated type I error probability. Nonetheless, there have not been many publications addressing this issue. In this paper, we propose a binary tree classification method that keeps the probability of falsely accepting a predictor below a certain level, say 5%.
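The multiplicity problem described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' method: it assumes a normally distributed outcome rather than survival data, uses a simple unequal-variance two-sample statistic, and compares the best split against the unadjusted cutoff of 1.96. The function names and default settings are hypothetical, chosen only for illustration.

```python
import random


def max_split_statistic(y, predictors, min_node=5):
    """Largest absolute two-sample statistic over all predictors and cutoffs.

    Assumption: each candidate split sends the k smallest predictor values
    to the left node, and both nodes must hold at least min_node samples.
    """
    n = len(y)
    best = 0.0
    for x in predictors:
        # Order the outcomes by the predictor so each k defines one cutoff.
        y_sorted = [y[i] for i in sorted(range(n), key=lambda i: x[i])]
        total = sum(y_sorted)
        total_sq = sum(v * v for v in y_sorted)
        s = sq = 0.0
        for k in range(1, n):
            s += y_sorted[k - 1]
            sq += y_sorted[k - 1] ** 2
            if k < min_node or n - k < min_node:
                continue
            m = n - k
            var_left = (sq - s * s / k) / (k - 1)
            var_right = ((total_sq - sq) - (total - s) ** 2 / m) / (m - 1)
            se = (var_left / k + var_right / m) ** 0.5
            z = abs(s / k - (total - s) / m) / se
            best = max(best, z)
    return best


def naive_false_positive_rate(n=40, n_predictors=5, n_reps=200, seed=0):
    """Fraction of null data sets in which the best split exceeds 1.96.

    The outcome is generated independently of every predictor, so any
    split declared significant here is a type I error.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_reps):
        y = [rng.gauss(0.0, 1.0) for _ in range(n)]
        predictors = [[rng.random() for _ in range(n)]
                      for _ in range(n_predictors)]
        if max_split_statistic(y, predictors) > 1.96:
            hits += 1
    return hits / n_reps


if __name__ == "__main__":
    print("naive false positive rate:", naive_false_positive_rate())
```

Because the maximum is taken over every predictor and every cutoff, the rate returned in this setup should come out well above the nominal 0.05, which is the inflation that an adjusted critical value is meant to remove.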


Figure 3: Kaplan–Meier survival curves for the samples in the terminal nodes of the tree in Figure 2 for the lung cancer data.

Mentions: Figure 2 shows the tree obtained using SSP. The type I error level α was chosen to be 0.2, and the corresponding critical value ζα was 24.5. The first split occurred on Gene 201303_at at 1620. The adjusted P-value for the split was <0.0001, with a corresponding test statistic value of 39.5. The left child node was split on Gene 215882_at at 10.5 (adjusted P-value of 0.086 and test statistic value of 25.3), and the right child node was split on Gene 219323_s_at at 192.8 (adjusted P-value of 0.191 and test statistic value of 24.5). The median survival time varies considerably among the four terminal nodes: that of the first terminal node is more than quadruple that of the fourth terminal node. Figure 3 compares the Kaplan–Meier survival curves of the four groups. The groups are numbered from the left node to the right. The critical values are 27.1 for a 0.05 significance level and 25.2 for a 0.1 significance level. The tree would therefore have only one split at the 0.05 type I error level and two splits (one more at the left child node of the root) at the 0.1 type I error level.
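The relationship between the significance level and the size of the tree can be checked directly from the numbers reported above. The following is a minimal sketch, not the SSP implementation: it assumes a split is retained when its test statistic is at least the critical value, and that a child split is only considered once the root split is retained. The names `SPLITS`, `CRITICAL_VALUES`, and `retained_splits` are hypothetical.

```python
# Splits reported in the text: (node, gene, cutoff, test statistic).
SPLITS = [
    ("root", "201303_at", 1620.0, 39.5),
    ("left child", "215882_at", 10.5, 25.3),
    ("right child", "219323_s_at", 192.8, 24.5),
]

# Critical values reported in the text for each type I error level.
CRITICAL_VALUES = {0.05: 27.1, 0.1: 25.2, 0.2: 24.5}


def retained_splits(alpha):
    """Return the splits whose statistic reaches the critical value for alpha.

    Assumption: a split is kept when statistic >= critical value, and the
    child nodes only exist if the root split itself is kept.
    """
    critical = CRITICAL_VALUES[alpha]
    kept = []
    for node, gene, cutoff, stat in SPLITS:
        if node != "root" and not kept:
            break  # the root was not split, so there are no child nodes
        if stat >= critical:
            kept.append((node, gene, cutoff, stat))
    return kept


if __name__ == "__main__":
    for alpha in sorted(CRITICAL_VALUES):
        nodes = [node for node, *_ in retained_splits(alpha)]
        print(f"alpha={alpha}: {len(nodes)} split(s) at {nodes}")
```

Under these assumptions the sketch keeps one split at the 0.05 level, two at the 0.1 level, and all three at the 0.2 level, matching the counts stated in the text.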

