Information criterion-based clustering with order-restricted candidate profiles in short time-course microarray experiments.

Liu T, Lin N, Shi N, Zhang B - BMC Bioinformatics (2009)

Bottom Line: Earlier clustering algorithms do not exploit the ordering in a time-course study, explicit use of which should allow more sensitive detection of genes that display a consistent pattern over time. The ORICC algorithm is computationally much faster than the method of Wang et al. [3]. In a real data example, the ORICC algorithm identifies new and interesting genes that previous analyses failed to reveal.


Affiliation: Key Laboratory for Applied Statistics of MOE and School of Mathematics and Statistics, Northeast Normal University, Changchun, PR China. tianqingliu@gmail.com

ABSTRACT

Background: Time-course microarray experiments produce vector-valued gene expression profiles across a series of time points. Clustering genes based on these profiles is important for discovering functionally related and co-regulated genes. Earlier clustering algorithms do not exploit the ordering in a time-course study, explicit use of which should allow more sensitive detection of genes that display a consistent pattern over time. Peddada et al. [1] proposed a clustering algorithm that incorporates the temporal ordering using order-restricted statistical inference. This algorithm is, however, very time-consuming and hence inapplicable to most microarray experiments, which contain a large number of genes. Its computational burden also makes it difficult to assess clustering reliability, which is a very important measure when clustering noisy microarray data.

Results: We propose a computationally efficient information criterion-based clustering algorithm, called ORICC, that also takes the ordering in time-course microarray experiments into account by embedding order-restricted inference into a model selection framework. Each gene is assigned to the candidate profile it best matches, as determined by a newly proposed information criterion for order-restricted inference. In addition, we developed a bootstrap procedure to assess ORICC's clustering reliability for every gene. Simulation studies show that the ORICC method is robust, always gives better clustering accuracy than Peddada's method, and is hundreds of times faster. Under some scenarios, its accuracy is also better than that of other existing clustering methods for short time-course microarray data, such as STEM [2] and the method of Wang et al. [3]. It is also computationally much faster than the method of Wang et al. [3].
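
The assignment step described above can be illustrated with a minimal sketch. It is not the ORICC criterion itself, which is not given in this abstract: the score used here (residual sum of squares plus a penalty per distinct fitted level) is a generic stand-in, and the names pava_increasing, fit_profile, assign_profile and the penalty value are all hypothetical. The sketch only shows the general idea of fitting each order-restricted candidate profile and keeping the one with the smallest penalized score.

```python
def pava_increasing(y):
    """Least-squares fit of y under a non-decreasing constraint
    (pool-adjacent-violators algorithm, equal weights)."""
    blocks = []  # each block is [sum of values, number of values]
    for v in y:
        blocks.append([v, 1])
        # Merge backwards while the previous block mean exceeds the last one.
        while (len(blocks) > 1 and
               blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            s, n = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += n
    fit = []
    for s, n in blocks:
        fit.extend([s / n] * n)
    return fit


def fit_profile(y, profile):
    """Fit one candidate profile to the time-point means y."""
    if profile == "increasing":
        return pava_increasing(y)
    if profile == "decreasing":
        # A non-increasing fit is the negated non-decreasing fit of -y.
        return [-v for v in pava_increasing([-v for v in y])]
    if profile == "flat":
        m = sum(y) / len(y)
        return [m] * len(y)
    raise ValueError("unknown profile: " + profile)


def assign_profile(y, profiles=("flat", "increasing", "decreasing"),
                   penalty=0.5):
    """Return the profile with the smallest penalized score.

    ASSUMPTION: score = RSS + penalty * (number of distinct fitted levels).
    This is a placeholder, not the information criterion from the paper.
    """
    best_name, best_score = None, None
    for name in profiles:
        fit = fit_profile(y, name)
        rss = sum((a - b) ** 2 for a, b in zip(y, fit))
        score = rss + penalty * len(set(fit))
        if best_score is None or score < best_score:
            best_name, best_score = name, score
    return best_name
```

Calling assign_profile on a vector of time-point means returns one profile label per gene. Because each gene is scored independently against a fixed list of candidates, the loop over genes is trivially parallel, which is consistent with the speed advantage claimed for a model selection approach.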

Conclusion: Our ORICC algorithm, which takes advantage of the temporal ordering in time-course microarray experiments, provides good clustering accuracy while being much faster than Peddada's method. Moreover, the clustering reliability of each gene can be assessed, which is not possible with Peddada's method. In a real data example, the ORICC algorithm identifies new and interesting genes that previous analyses failed to reveal.
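
A per-gene reliability assessment of the kind mentioned above can be sketched as follows. The abstract does not describe the resampling scheme, so this sketch assumes that replicates are resampled with replacement within each time point and that reliability is the fraction of bootstrap rounds reproducing the original assignment. The classifier toy_assign is a hypothetical stand-in for any profile-assignment rule, and bootstrap_reliability, n_boot and seed are illustrative names.

```python
import random


def toy_assign(means, tol=0.5):
    """Hypothetical stand-in classifier: compares the last and first
    time-point means. Any profile-assignment rule could be passed instead."""
    diff = means[-1] - means[0]
    if diff > tol:
        return "increasing"
    if diff < -tol:
        return "decreasing"
    return "flat"


def bootstrap_reliability(replicates, assign, n_boot=200, seed=0):
    """Estimate how stable a gene's profile assignment is.

    replicates: one list of replicate measurements per time point.
    ASSUMPTION: replicates are resampled with replacement within each
    time point; the paper's actual bootstrap scheme may differ.
    Returns (original label, fraction of rounds giving the same label).
    """
    rng = random.Random(seed)
    original = assign([sum(r) / len(r) for r in replicates])
    agree = 0
    for _ in range(n_boot):
        means = []
        for r in replicates:
            sample = [rng.choice(r) for _ in r]
            means.append(sum(sample) / len(sample))
        if assign(means) == original:
            agree += 1
    return original, agree / n_boot
```

A reliability close to 1 suggests the assignment is insensitive to replicate noise, while a low value flags a gene whose cluster membership should be treated with caution. The number of bootstrap rounds multiplies the cost of the assignment step, which is why a fast assignment rule matters for this kind of assessment.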


Figure 3: Simulation 1: The false positive rate of Peddada's method and the one-stage ORICC algorithm. The horizontal axis represents the number of replicates, and the vertical axis represents false positive rate. Dashed lines are for the one-stage ORICC algorithm, and solid lines are for Peddada's method.

Mentions: We then use the overall error rate, the false positive rate and the false negative rate to evaluate the accuracy of the two algorithms. Simulation results are summarized in Figures 2, 3 and 4. Figures 2 and 4 show that the overall error rate and the false negative rate of the one-stage ORICC algorithm are always lower than those of Peddada's method. Figure 3 shows that the false positive rate of the one-stage ORICC algorithm is lower than that of Peddada's method in most cases. The one-stage ORICC algorithm not only provides good clustering accuracy but is also much faster than Peddada's method. For example, when σ² = 1 and M = 5, the run time is 2979.29 seconds for Peddada's method versus 25.55 seconds for the one-stage ORICC algorithm, roughly a 117-fold difference.
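
The three accuracy measures named above can be computed from simulated data as in the sketch below. The excerpt does not define them, so the definitions here are assumptions: the overall error rate is the fraction of genes assigned to the wrong profile, a false positive is a truly null (flat) gene assigned to a non-null profile, and a false negative is a truly non-null gene assigned to the null profile. The function name error_rates and the label "flat" are illustrative.

```python
def error_rates(true_labels, assigned, null_label="flat"):
    """Return (overall error rate, false positive rate, false negative rate).

    ASSUMED definitions (not taken from the paper):
      overall: fraction of genes whose assigned profile differs from the truth
      FPR:     among truly null genes, fraction assigned a non-null profile
      FNR:     among truly non-null genes, fraction assigned the null profile
    """
    if len(true_labels) != len(assigned) or not true_labels:
        raise ValueError("label lists must be non-empty and of equal length")
    pairs = list(zip(true_labels, assigned))
    overall = sum(t != a for t, a in pairs) / len(pairs)

    null_assigned = [a for t, a in pairs if t == null_label]
    nonnull_assigned = [a for t, a in pairs if t != null_label]

    # Guard against empty groups, which would otherwise divide by zero.
    fpr = (sum(a != null_label for a in null_assigned) / len(null_assigned)
           if null_assigned else 0.0)
    fnr = (sum(a == null_label for a in nonnull_assigned) / len(nonnull_assigned)
           if nonnull_assigned else 0.0)
    return overall, fpr, fnr


# Run-time ratio from the figures quoted in the text.
speedup = 2979.29 / 25.55
```

Under these definitions a non-null gene assigned to the wrong non-null profile counts toward the overall error rate but toward neither the false positive nor the false negative rate, so the three measures can move independently.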

