An automatic method to detect and track the glottal gap from high speed videoendoscopic images.

Andrade-Miranda G, Godino-Llorente JI, Moro-Velázquez L, Gómez-García JA - Biomed Eng Online (2015)

Bottom Line: Thanks to the ROI implementation, the technique is robust to camera shifting, and the objective test demonstrated the effectiveness and performance of the approach in the most challenging scenarios, namely those with an inappropriate closure of the vocal folds. The novelties of the proposed algorithm lie in the use of temporal information to identify an adaptive ROI and in the use of watershed merging combined with active contours for the glottis delimitation. Additionally, an automatic procedure to synthesize multiline VKG by identifying the glottal main axis is developed.


Affiliation: Centro de Tecnología Biomédica, Universidad Politécnica de Madrid, Campus de Montegancedo, Crta. M40 km, 38, Madrid, Spain. gxandrade@ics.upm.es.

ABSTRACT

Background: The image-based analysis of vocal fold vibration plays an important role in the diagnosis of voice disorders. The analysis is based not only on the direct observation of the video sequences, but also on an objective characterization of the phonation process by means of features extracted from the recorded images. However, such analysis relies on a prior accurate identification of the glottal gap, which is the most challenging step toward an automatic assessment of vocal fold vibration.

Methods: In this work, a complete framework to automatically segment and track the glottal area (or glottal gap) is proposed. The algorithm identifies a region of interest that is adapted over time and combines active contours and the watershed transform for the final delineation of the glottis. An automatic procedure to synthesize different videokymograms is also proposed.

Results: Thanks to the ROI implementation, the technique is robust to camera shifting, and the objective test demonstrated the effectiveness and performance of the approach in the most challenging scenarios, namely those with an inappropriate closure of the vocal folds.

Conclusions: The novelties of the proposed algorithm lie in the use of temporal information to identify an adaptive ROI and in the use of watershed merging combined with active contours for the glottis delimitation. Additionally, an automatic procedure to synthesize multiline VKG by identifying the glottal main axis is developed.



Fig. 6: Complete methodology representation. From top to bottom and left to right: input image; segmentation obtained after watershed and first region merging; second region merging (the white part of the image represents the region that correlates with the previous step); overlapping results and initialization for the active contour; and final delimitation of the glottis after 100 iterations.

Mentions: Active contour models, or snakes, fall into two main categories: edge-based and region-based. Edge-based models use image gradients to identify the object's boundaries. Their main limitation is that they usually rely on edge information alone and ignore other image characteristics. A second disadvantage is that the snake must be initialized close to the local minimum of interest to avoid being trapped in other local minima. In region-based models, by contrast, the foreground and background are described statistically, and the model seeks the energy optimum that best fits the image. The advantages of this technique include robustness against initial curve placement and insensitivity to image noise. However, techniques that use global statistics are usually not ideal for segmenting heterogeneous objects: when the object to be segmented cannot be easily distinguished in terms of global statistics, region-based active contours may lead to erroneous segmentations. Glottis detection in laryngeal images is difficult because these images are heterogeneous and noisy at the same time. Heterogeneity and noise can be addressed using the local statistics approach proposed in [32]. The idea is to model the foreground and background in terms of smaller local regions, since foreground and background regions cannot always be represented with global statistics. This framework allows correct convergence in cases of inhomogeneity, which is common in medical images. The analysis of local regions leads to the construction of a family of local energies, one at each point along the initial curve. To optimize the local energies, each point of the curve is considered separately and moves to minimize the energy computed in its own local region. The energy can be modeled in three different ways: the uniform modeling energy, the means separation energy, and the histogram separation energy.
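For reference, the uniform modeling energy is usually written in the piecewise-constant form below. The notation is an assumption made here for illustration and is not taken from the source: I is the image, C the evolving curve, c_1 and c_2 the mean intensities inside and outside C, and μ, λ_1, λ_2 are weighting constants. In a localized variant, the two means would be recomputed inside a small window around each contour point rather than over the whole image.

```latex
E(C, c_1, c_2) = \mu \,\operatorname{Length}(C)
  + \lambda_1 \int_{\text{inside}(C)} \lvert I(x) - c_1 \rvert^2 \, dx
  + \lambda_2 \int_{\text{outside}(C)} \lvert I(x) - c_2 \rvert^2 \, dx
```

The first term penalizes long, irregular contours. The other two terms are small when each region is well described by a single constant intensity.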
In this paper, the Chan-Vese model was chosen, which models the interior and exterior of the region as constant intensities represented by their means. Experiments showed that the anterior and posterior parts of the glottis are not always accurately segmented during the correlation merging step, which in some cases produces a wrong delineation of those regions. The post-processing therefore uses the result of the correlation region merging as initialization for the snake. For instance, the first image of the second row in Fig. 6 shows an example of the initialization used for the active contour, in which the anterior part is not segmented correctly by the watershed and the correlation merging.
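The idea of refining an imperfect initial mask with a two-means region model can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes global rather than local means, omits the contour-length penalty, and lets only pixels on the current boundary change label, so the front advances step by step from the initialization. The names `boundary_band` and `refine_region`, the synthetic image, and the default of 100 iterations (borrowed from the figure caption) are all hypothetical choices for illustration.

```python
import numpy as np

def boundary_band(mask):
    """Pixels with at least one 4-neighbour carrying the opposite label."""
    padded = np.pad(mask, 1, mode="edge")
    centre = padded[1:-1, 1:-1]
    band = np.zeros_like(mask, dtype=bool)
    for shifted in (padded[:-2, 1:-1], padded[2:, 1:-1],
                    padded[1:-1, :-2], padded[1:-1, 2:]):
        band |= shifted != centre
    return band

def refine_region(image, init_mask, max_iter=100):
    """Grow or shrink init_mask using only the two-means data term."""
    image = np.asarray(image, dtype=float)
    mask = np.asarray(init_mask, dtype=bool).copy()
    for _ in range(max_iter):
        if mask.all() or not mask.any():
            break  # one region is empty, so its mean is undefined
        c_in = image[mask].mean()     # mean intensity inside the contour
        c_out = image[~mask].mean()   # mean intensity outside the contour
        # A pixel prefers the region whose mean is closer to its intensity.
        prefers_in = (image - c_in) ** 2 < (image - c_out) ** 2
        # Only pixels on the current front are allowed to switch label.
        flip = boundary_band(mask) & (prefers_in != mask)
        if not flip.any():
            break  # converged: no boundary pixel wants to move
        mask[flip] = prefers_in[flip]
    return mask

# Synthetic example: a dark gap on a bright background.
image = np.full((9, 9), 200.0)
truth = np.zeros((9, 9), dtype=bool)
truth[2:7, 3:6] = True
image[truth] = 20.0

# Under-segmented initialization, standing in for an imperfect merging result.
init = np.zeros((9, 9), dtype=bool)
init[3:6, 4] = True

refined = refine_region(image, init)
print(refined.astype(int))
```

On this noise-free toy image the mask grows outward from the three seed pixels until it covers the whole dark rectangle. Real laryngeal images would need the local statistics and the regularization described in the text.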

