Multimodal microscopy for automated histologic analysis of prostate cancer.

Kwak JT, Hewitt SM, Sinha S, Bhargava R - BMC Cancer (2011)

Bottom Line: We were able to achieve very effective fusion of the information from two different images that provide very different types of data with different characteristics. The method is entirely transparent to a user and does not involve any adjustment or decision-making based on spectral data. By combining the IR and optical data, we achieved highly accurate classification.

Affiliation: Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA.

ABSTRACT

Background: Prostate cancer is the single most prevalent cancer in US men, and the gold standard for its diagnosis is histologic assessment of biopsies. Manual assessment of stained tissue from all biopsies limits the speed and accuracy of prostate cancer diagnosis in both clinical practice and research. We sought to develop a fully automated multimodal microscopy method to distinguish cancerous from non-cancerous tissue samples.

Methods: We recorded chemical data from an unstained tissue microarray (TMA) using Fourier transform infrared (FT-IR) spectroscopic imaging. Using pattern recognition, we identified epithelial cells without user input. We fused this cell-type information with the corresponding stained images commonly used in clinical practice. Extracted morphological features, optimized by a two-stage feature selection method using a minimum-redundancy-maximal-relevance (mRMR) criterion and sequential floating forward selection (SFFS), were used to classify tissue samples as cancer or non-cancer.
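
For a concrete picture of the two-stage selection, the following is a minimal sketch, assuming a feature matrix X (tissue samples by morphological features) and binary labels y. It uses scikit-learn's mutual-information estimators and a logistic-regression scorer as stand-ins; none of the names, parameters, or classifier choices below are the authors' actual implementation.

```python
# Minimal sketch of two-stage feature selection: mRMR ranking followed by
# sequential floating forward selection (SFFS). X, y, the classifier, and all
# parameter choices here are illustrative stand-ins, not the paper's code.
import numpy as np
from sklearn.feature_selection import mutual_info_classif, mutual_info_regression
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def mrmr_rank(X, y, n_select):
    """Greedily pick features maximizing relevance to y minus mean redundancy."""
    relevance = mutual_info_classif(X, y, random_state=0)   # I(feature; class)
    selected, remaining = [], list(range(X.shape[1]))
    while remaining and len(selected) < n_select:
        scores = []
        for j in remaining:
            redundancy = np.mean([
                mutual_info_regression(X[:, [j]], X[:, s], random_state=0)[0]
                for s in selected]) if selected else 0.0
            scores.append(relevance[j] - redundancy)
        best = remaining[int(np.argmax(scores))]
        selected.append(best)
        remaining.remove(best)
    return selected

def sffs(X, y, candidates, clf, cv=5):
    """Refine the mRMR shortlist: add features while cross-validated AUC improves,
    with a floating backward step that drops features whose removal helps."""
    current, best_auc = [], 0.0
    improved = True
    while improved:
        improved = False
        for j in [c for c in candidates if c not in current]:      # forward step
            auc = cross_val_score(clf, X[:, current + [j]], y,
                                  cv=cv, scoring="roc_auc").mean()
            if auc > best_auc:
                best_auc, best_j, improved = auc, j, True
        if improved:
            current.append(best_j)
            for j in list(current):                                # floating step
                rest = [c for c in current if c != j]
                if rest:
                    auc = cross_val_score(clf, X[:, rest], y,
                                          cv=cv, scoring="roc_auc").mean()
                    if auc > best_auc:
                        best_auc, current = auc, rest
    return current, best_auc

# Hypothetical usage: shortlist 30 features with mRMR, then refine with SFFS.
# shortlist = mrmr_rank(X, y, n_select=30)
# subset, auc = sffs(X, y, shortlist, LogisticRegression(max_iter=1000))
```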

Results: We achieved high accuracy (area under ROC curve (AUC) >0.97) in cross-validations on each of two data sets that were stained under different conditions. When the classifier was trained on one data set and tested on the other data set, an AUC value of ~0.95 was observed. In the absence of IR data, the performance of the same classification system dropped for both data sets and between data sets.
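
As a rough illustration of this evaluation protocol (not the authors' code), the snippet below computes within-data-set cross-validated AUC and the train-on-one, test-on-the-other AUC. X1/y1 and X2/y2 are synthetic placeholders for the two feature/label sets, and the classifier is an assumed stand-in.

```python
# Illustrative evaluation protocol: cross-validated AUC within each data set and
# AUC when training on one set and testing on the other. Data are synthetic
# placeholders; the classifier choice is an assumption, not the paper's.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold, cross_val_score

rng = np.random.default_rng(0)
X1, y1 = rng.normal(size=(80, 20)), rng.integers(0, 2, size=80)  # stand-in for Data1
X2, y2 = rng.normal(size=(60, 20)), rng.integers(0, 2, size=60)  # stand-in for Data2

clf = LogisticRegression(max_iter=1000)
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)

# Within-data-set performance (the paper reports AUC > 0.97 for both sets)
auc_within_1 = cross_val_score(clf, X1, y1, cv=cv, scoring="roc_auc").mean()
auc_within_2 = cross_val_score(clf, X2, y2, cv=cv, scoring="roc_auc").mean()

# Between-data-set performance (the paper reports AUC ~ 0.95)
auc_1_to_2 = roc_auc_score(y2, clf.fit(X1, y1).predict_proba(X2)[:, 1])
auc_2_to_1 = roc_auc_score(y1, clf.fit(X2, y2).predict_proba(X1)[:, 1])
print(auc_within_1, auc_within_2, auc_1_to_2, auc_2_to_1)
```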

Conclusions: We were able to achieve very effective fusion of the information from two different images that provide very different types of data with different characteristics. The method is entirely transparent to a user and does not involve any adjustment or decision-making based on spectral data. By combining the IR and optical data, we achieved highly accurate classification.

Figure 10: List of features with their maximal relevance and "mRMR rank". In the second column, G and L represent global and local features, respectively. AVG, STD, TOT, and MAX denote the average, standard deviation, total amount, and maximum value of a feature. * In computing local features representing "size of lumen", two options are available: one is to consider only the part of the lumen that lies within the window, and the other is to take the entire lumen into account. An asterisk indicates that the former option was chosen.

Mentions: We examined the importance of each feature by its rank in the first phase of feature selection, based on its "relevance" to the class label (see [Additional file 1: mRMR]). Since different features (e.g., average or standard deviation, global or local) based on the same underlying quantity (e.g., "lumen roundness") generally have similar relevance, we examined the average relevance of the features in each of 17 feature categories (Figure 9) for each data set. The relevance of features is consistent across cross-validation folds (see [Additional file 1: Supplementary Figure S1]). The complete list of individual features, with their relevance and mRMR rank for Data1, is given in Figure 10. For Data1, lumen-related feature categories are generally the most relevant, while epithelium-related feature categories are most important for Data2. It is surprising that the top three feature categories in Data1 (Figure 9, blue bars), namely size of lumen, lumen roundness, and lumen convex hull ratio, have very low relevance in Data2, although this may be due in large part to variations in staining and in tumor malignancy between the two data sets, as well as to their different sizes. The comparable classification results on Data2 (Tables 1 and 2), despite these differences in maximal relevance, may reflect the breadth of our feature set and the accuracy of our feature selection method, and they support applying the same classifier to different data sets. Nevertheless, a larger-scale study may be necessary to examine the differences between data sets and features more precisely. It is, however, noteworthy that examining only the features (or feature categories) with the highest relevance may be slightly misleading, because this does not account for redundancy among features.
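
A minimal sketch of how such a relevance ranking and per-category average could be computed is shown below, assuming mutual information between each feature and the class label as the relevance measure; the feature names and category mapping are hypothetical, not the paper's exact list.

```python
# Sketch of the "maximal relevance" ranking discussed above: mutual information
# between each morphological feature and the cancer/non-cancer label, reported
# per feature and averaged within feature categories (as in Figures 9 and 10).
# Feature names and the category mapping are hypothetical examples.
import pandas as pd
from sklearn.feature_selection import mutual_info_classif

def relevance_tables(X, y, feature_names, category_of):
    """Return per-feature relevance and the mean relevance of each category."""
    relevance = mutual_info_classif(X, y, random_state=0)
    per_feature = pd.Series(relevance, index=feature_names).sort_values(ascending=False)
    per_category = (per_feature.groupby(per_feature.index.map(category_of))
                    .mean().sort_values(ascending=False))
    return per_feature, per_category

# Hypothetical usage: "G_AVG_lumen_roundness" -> category "lumen_roundness"
# category_of = lambda name: name.split("_", 2)[2]
# per_feature, per_category = relevance_tables(X, y, names, category_of)
```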

