A response to information criterion-based clustering with order-restricted candidate profiles in short time-course microarray experiments.

Peddada SD, Umbach DM, Harris SF - BMC Bioinformatics (2009)

Bottom Line: Performance of their proposal was compared with three other methods including the order-restricted inference based methodology of Peddada et al. In this note we point out several inaccuracies in Liu et al. and conclude that the order-restricted inference based methodology of Peddada et al. (programmed in the software ORIOGEN) indeed operates at the desired nominal Type 1 error level, an important feature of a statistical decision rule, while being computationally substantially faster than indicated by Liu et al. Our results on the false positive rate of ORIOGEN suggest some error in Figure three of Liu et al., perhaps due to a programming error.


ABSTRACT

Background: For gene expression data obtained from a time-course microarray experiment, Liu et al. developed a new algorithm for clustering genes with similar expression profiles over time. Performance of their proposal was compared with three other methods including the order-restricted inference based methodology of Peddada et al. In this note we point out several inaccuracies in Liu et al. and conclude that the order-restricted inference based methodology of Peddada et al. (programmed in the software ORIOGEN) indeed operates at the desired nominal Type 1 error level, an important feature of a statistical decision rule, while being computationally substantially faster than indicated by Liu et al.

Results: Application of ORIOGEN to the well-known breast cancer cell line data of Lobenhofer et al. revealed that the ORIOGEN software took only 21 minutes to run (using 100,000 bootstraps with p = 0.0025), substantially faster than the 72 hours found by Liu et al. using Matlab. Also, based on data simulated according to the model and parameters of simulation 1 (σ² = 1, M = 5) in [1], we found that ORIOGEN took less than 30 seconds to run, in stark contrast to Liu et al., who reported that their implementation of the same algorithm in R took 2979.29 seconds. Furthermore, for the simulation studies reported in [1], unlike the claims made by Liu et al., ORIOGEN always maintained the desired false positive rate. According to Figure three in Liu et al., their algorithm had a false positive rate ranging approximately from 0.20 to 0.70 for the scenarios that they simulated.

Conclusions: Our comparisons of run times indicate that the implementations of ORIOGEN's algorithm in Matlab and R by Liu et al. are inefficient compared to the publicly available JAVA implementation. Our results on the false positive rate of ORIOGEN suggest some error in Figure three of Liu et al., perhaps due to a programming error.
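
To put the run times quoted in the Results in proportion, the following minimal sketch computes the implied speed-up ratios. It uses only the figures stated above; the function name `speedup` is an illustrative assumption, and because ORIOGEN is reported to have taken "less than 30 seconds", the second ratio is only a lower bound. The abstract does not say whether the runs used comparable hardware, so the ratios are indicative rather than a controlled benchmark.

```python
def speedup(slow_seconds, fast_seconds):
    """Return how many times faster the fast run is than the slow run."""
    return slow_seconds / fast_seconds


# Breast cancer cell line data: 72 hours (Matlab, Liu et al.) vs 21 minutes (ORIOGEN).
matlab_vs_oriogen = speedup(72 * 3600, 21 * 60)

# Simulation 1: 2979.29 seconds (R, Liu et al.) vs under 30 seconds (ORIOGEN).
# 30 s is an upper bound on ORIOGEN's time, so this ratio is a lower bound.
r_vs_oriogen_lower_bound = speedup(2979.29, 30)

print(f"Matlab vs ORIOGEN: about {matlab_vs_oriogen:.0f}x")
print(f"R vs ORIOGEN: at least {r_vs_oriogen_lower_bound:.0f}x")
```

On the reported figures this works out to roughly a 206-fold difference for the real data set and at least a 99-fold difference for the simulated one.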


Figure 1: Simulation 1: The overall error rate of Peddada's method and the one-stage ORICC algorithm. (due to Liu et al.)

Mentions: The second comment is that the false positive rates reported in Figure three in [7] were incorrect. We thank Peddada et al. for carefully reading our paper and pointing out this mistake. In our paper, we misstated the p-value threshold used for Peddada's method: the threshold was 0.5 instead of 0.025. We repeated all simulations in our paper involving Peddada's method using a p-value threshold of 0.025, and the results are presented in Figures 1, 2, 3 and 4 at the end of this report. Figures 1, 2 and 3 are for Simulation 1 in [7] and were obtained by superimposing the error rate for Peddada's method using threshold 0.025 on Figures 2, 3 and 4 in [7]. Figure 4 is for Simulation 2 in [7] and gives Rand's C statistics for the clusters given by Peddada's method using threshold 0.025. As pointed out by Peddada et al. in their correspondence, Peddada's method controls the false positive rate under the nominal level, i.e. the p-value threshold (see Figure 2). However, lower false positive rates often come at the price of increased false negative rates and higher overall error rates (see Figures 1 and 3). Though a threshold of 0.5 seems unreasonable for p-values, it does offer an overall better clustering result than using 0.025 as the threshold. This is further confirmed by Rand's statistics in Figure 4. Except that the comparison of false positive rates (Figure 2) differs from what we stated in our paper, the other conclusions in our paper remain unchanged. It is worth noting that Peddada's method can achieve any false positive rate by using the corresponding p-value threshold.
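
The statement that the false positive rate tracks the p-value threshold can be illustrated with a small simulation. The sketch below is not ORIOGEN and does not implement the order-restricted bootstrap test; as a stand-in assumption it uses a plain two-sided z-test on genes with no true expression change, and the function names, gene count and replicate count are all hypothetical. Under the null hypothesis p-values are uniformly distributed, so the fraction of genes falsely selected should be close to whatever threshold is applied, whether 0.025 or 0.5.

```python
import math
import random


def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def null_false_positive_rate(threshold, n_genes=20000, n_reps=5, seed=0):
    """Fraction of null genes declared significant at the given threshold.

    Every simulated gene has true mean 0 and known unit variance, so any
    gene that passes the threshold is a false positive.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_genes):
        sample_mean = sum(rng.gauss(0.0, 1.0) for _ in range(n_reps)) / n_reps
        z = sample_mean * math.sqrt(n_reps)  # standardize the sample mean
        if two_sided_p(z) <= threshold:
            hits += 1
    return hits / n_genes


for threshold in (0.025, 0.5):
    rate = null_false_positive_rate(threshold)
    print(f"threshold={threshold}: empirical false positive rate={rate:.3f}")
```

With a threshold of 0.5, about half of the null genes are selected, which is consistent with the high false positive rates originally shown for Peddada's method before the threshold was corrected.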

