Image-level and group-level models for Drosophila gene expression pattern annotation.

Sun Q, Muckatira S, Yuan L, Ji S, Newfeld S, Kumar S, Ye J - BMC Bioinformatics (2013)

Bottom Line: In our experiments, the three pooling functions perform comparably well in feature dimension reduction. Undersampling with majority vote is shown to be effective in tackling the problem of imbalanced data. Moreover, combining sparse coding with the image-level scheme leads to consistent performance improvement in keyword annotation.


Affiliation: Center for Evolutionary Medicine and Informatics, The Biodesign Institute, Arizona State University, Tempe, AZ, 85287, USA. jieping.ye@asu.edu.

ABSTRACT

Background: Drosophila melanogaster is an established model organism for investigating developmental gene interactions. Its spatio-temporal gene expression patterns can be visualized by in situ hybridization and documented as digital images. Automated and efficient tools for analyzing these expression images will provide biological insights into gene functions, interactions, and networks. To facilitate pattern recognition and comparison, many web-based resources have been created to conduct comparative analysis based on body part keywords and the associated images. With the fast accumulation of images from high-throughput techniques, manual inspection of images seriously impedes the pace of biological discovery. It is thus imperative to design an automated system for efficient image annotation and comparison.

Results: We present a computational framework to perform anatomical keyword annotation for Drosophila gene expression images. The spatial sparse coding approach is used to represent local patches of images, and is compared with the well-known bag-of-words (BoW) method. Three pooling functions (max pooling, average pooling, and Sqrt pooling, the square root of the mean squared statistics) are employed to transform the sparse codes into image features. Based on the constructed features, we develop both an image-level scheme and a group-level scheme to tackle the key challenges in annotating Drosophila gene expression pattern images automatically. To deal with the imbalanced data distribution inherent in image annotation tasks, undersampling is applied together with majority vote. Results on Drosophila embryonic expression pattern images verify the efficacy of our approach.
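The three pooling functions named in the Results can be illustrated with a minimal sketch. It assumes the sparse codes of one image are given as a list of rows (one row per local patch, one column per dictionary atom) and that pooling runs over patches. Taking absolute values in max and average pooling is an assumption of this sketch, not something the abstract states, and the function names and the toy `codes` matrix are hypothetical.

```python
import math

def max_pool(codes):
    """Largest absolute coefficient per dictionary atom (abs is an assumption)."""
    return [max(abs(v) for v in column) for column in zip(*codes)]

def average_pool(codes):
    """Mean absolute coefficient per dictionary atom (abs is an assumption)."""
    return [sum(abs(v) for v in column) / len(column) for column in zip(*codes)]

def sqrt_pool(codes):
    """Square root of the mean squared coefficient per dictionary atom."""
    return [math.sqrt(sum(v * v for v in column) / len(column))
            for column in zip(*codes)]

# Toy sparse codes: 4 patches (rows) x 3 dictionary atoms (columns).
codes = [
    [0.0, 0.5, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 0.5, 0.0],
    [1.0, 0.0, 0.0],
]

print(max_pool(codes))      # [1.0, 0.5, 0.0]
print(average_pool(codes))  # [0.5, 0.25, 0.0]
print(sqrt_pool(codes))
```

Each function reduces a variable number of patch codes to one fixed-length vector with one entry per dictionary atom, which is what makes the result usable as an image feature.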

Conclusion: In our experiments, the three pooling functions perform comparably well in feature dimension reduction. Undersampling with majority vote is shown to be effective in tackling the problem of imbalanced data. Moreover, combining sparse coding with the image-level scheme leads to consistent performance improvement in keyword annotation.
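The abstract does not spell out how undersampling and majority vote are combined, so the following is only a sketch of one common arrangement: draw several balanced training sets by sampling the larger class down to the size of the smaller one, train one classifier per set, and let the classifiers vote. The nearest-centroid classifier is a stand-in chosen to keep the example self-contained; it is not the classifier used in the paper, and all names and data here are hypothetical.

```python
import random

def undersample(positives, negatives, rng):
    """Sample both classes down to the size of the smaller class."""
    n = min(len(positives), len(negatives))
    return rng.sample(positives, n), rng.sample(negatives, n)

def centroid(rows):
    """Component-wise mean of a list of feature vectors."""
    return [sum(col) / len(col) for col in zip(*rows)]

def train_centroid_classifier(pos, neg):
    """Stand-in classifier: predict the class whose centroid is closer."""
    cp, cn = centroid(pos), centroid(neg)
    def predict(x):
        dp = sum((a - b) ** 2 for a, b in zip(x, cp))
        dn = sum((a - b) ** 2 for a, b in zip(x, cn))
        return 1 if dp < dn else 0
    return predict

def train_ensemble(positives, negatives, n_models=5, seed=0):
    """Train one classifier per balanced, undersampled training set."""
    rng = random.Random(seed)
    return [train_centroid_classifier(*undersample(positives, negatives, rng))
            for _ in range(n_models)]

def majority_vote(models, x):
    """Label 1 if more than half of the models predict 1, else 0."""
    votes = sum(model(x) for model in models)
    return 1 if 2 * votes > len(models) else 0

# Toy imbalanced data: 2 positive vs 6 negative feature vectors.
positives = [[1.0, 1.0], [0.9, 1.1]]
negatives = [[0.0, 0.0], [0.1, -0.1], [-0.1, 0.1],
             [0.0, 0.2], [0.2, 0.0], [-0.2, -0.1]]

models = train_ensemble(positives, negatives)
print(majority_vote(models, [1.0, 0.9]))
print(majority_vote(models, [0.0, 0.0]))
```

An odd number of models avoids ties in the vote. Because each model sees a different subset of the majority class, the ensemble still uses most of the negative examples even though every individual training set is balanced.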


Figure 5: Image-level scheme with union vs image-level scheme without union for all stage ranges. The y-axis of the figure indicates the AUC.

Mentions: To support our image-level scheme, we implement another image-level scheme without union, which evaluates the predictions at the image level. The results are reported in Figure 5. Under the same conditions, the union operation significantly improves the performance of the image-level scheme over all stage ranges (see Table 3).
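This excerpt does not define the union operation. One plausible reading, taken here as an assumption, is that the keywords predicted for the individual images of a group are merged into a single group-level keyword set before evaluation. A minimal sketch of that reading, with a hypothetical function name and placeholder keywords:

```python
def union_group_prediction(image_predictions):
    """Merge per-image keyword predictions into one group-level set.

    image_predictions: iterable of keyword collections, one per image
    in the group (assumed layout, not taken from the paper).
    """
    group = set()
    for keywords in image_predictions:
        group |= set(keywords)
    return group

# Placeholder keywords for a group of three images.
per_image = [{"keyword_a", "keyword_b"}, {"keyword_b", "keyword_c"}, set()]
print(sorted(union_group_prediction(per_image)))
# ['keyword_a', 'keyword_b', 'keyword_c']
```

Under this reading, a keyword visible in only one image of a group still counts for the whole group, whereas the variant without union scores each image's prediction separately.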

