Facial expression recognition and histograms of oriented gradients: a comprehensive study.

Carcagnì P, Del Coco M, Leo M, Distante C - Springerplus (2015)

Bottom Line: This paper proposes a comprehensive study on the application of the histogram of oriented gradients (HOG) descriptor to the FER problem, highlighting how this powerful technique can be effectively exploited for this purpose. The first experimental phase was aimed at proving the suitability of the HOG descriptor to characterize facial expression traits and, to do this, a successful comparison with the most commonly used FER frameworks was carried out. As a final phase, a test on continuous data streams was carried out on-line in order to validate the system in real-world operating conditions that simulated a real-time human-machine interaction.


Affiliation: National Research Council of Italy, Institute of Applied Sciences and Intelligent Systems, Via della Libertà, 3, 73010 Arnesano, LE, Italy.

ABSTRACT
Automatic facial expression recognition (FER) is a topic of growing interest, mainly due to the rapid spread of assistive technology applications such as human-robot interaction, where robust emotional awareness is key to best accomplishing the assistive task. This paper proposes a comprehensive study on the application of the histogram of oriented gradients (HOG) descriptor to the FER problem, highlighting how this powerful technique can be effectively exploited for this purpose. In particular, this paper highlights that a proper setting of the HOG parameters can make this descriptor one of the most suitable to characterize facial expression peculiarities. A large experimental session, which can be divided into three different phases, was carried out exploiting a consolidated algorithmic pipeline. The first experimental phase was aimed at proving the suitability of the HOG descriptor to characterize facial expression traits and, to do this, a successful comparison with the most commonly used FER frameworks was carried out. In the second experimental phase, different publicly available facial datasets were used to test the system on images acquired in different conditions (e.g. image resolution, lighting conditions, etc.). As a final phase, a test on continuous data streams was carried out on-line in order to validate the system in real-world operating conditions that simulated a real-time human-machine interaction.



Fig. 6: Examples of HOG (9 orientations) processing on registered face images (Ne: Neutral, Su: Surprised). CS is the cell size of the processed images.

Mentions: This subsection is aimed at selecting the optimal values for the internal HOG parameters, i.e. the configuration that best captures the most discriminative information for the FER problem. More specifically, the average recall value has been employed for performance evaluation. This choice was driven by the most recent works (e.g. Happy and Routray 2015), where average recall is employed as the main performance value for comparison among different methods. The HOG descriptor is characterized by two main parameters: the cell size and the number of orientation bins. Cell size represents the dimension of the patch involved in a single histogram computation. The importance of this parameter was highlighted in Déniz et al. (2011), where the same grid-like HOG extraction was exploited and the fusion of HOG descriptors at different scales was used to effectively address the face recognition issue. With a large cell size, the appearance information of a significant region of the facial image is squeezed into a single cell histogram, so some details useful for subsequent classification can be lost. On the other hand, a small cell size allows a high-resolution analysis, but the discrimination between useful and useless extracted details is then delegated to the classifier, which may be unable to perform this additional task in the best way. The number of orientation bins refers instead to the quantization levels of the gradient information. A low number of orientations could lead to some loss of information and a consequent reduction in FER performance. Conversely, a high number of quantization levels could spread the information out along the bins, decreasing the FER performance as well. For these reasons, these parameters have to be chosen carefully, taking into consideration the goal to be reached in a particular application context. How this choice was made for FER purposes is described below.
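The average recall used above as the evaluation measure can be illustrated with a minimal sketch. It assumes that "average recall" means the unweighted mean of the per-class recalls (often called macro-averaged recall); the function name and the toy labels are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def average_recall(y_true, y_pred):
    """Unweighted mean of per-class recalls (assumed meaning of 'average recall')."""
    hits = defaultdict(int)    # correctly classified samples per true class
    totals = defaultdict(int)  # ground-truth samples per class
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        if t == p:
            hits[t] += 1
    recalls = [hits[c] / totals[c] for c in totals]
    return sum(recalls) / len(recalls)


# Hypothetical labels for illustration only (not results from the paper).
y_true = ["Ne", "Ne", "Ne", "Ne", "Su", "Su"]
y_pred = ["Ne", "Ne", "Ne", "Su", "Su", "Su"]
score = average_recall(y_true, y_pred)  # (3/4 + 2/2) / 2
```

Unlike plain accuracy, this measure gives every expression class the same weight regardless of how many samples it contains, which is one plausible reason for preferring it on unbalanced expression datasets.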
First, with regard to the cell size, a qualitative assessment can be made: Fig. 6 shows the registered versions of a neutral and a surprised facial expression together with the related processing outcomes obtained by the HOG descriptor with a fixed number of 8 orientations and different values of cell size (3, 8 and 15 pixels). It is quite evident that a cell size of 15 pixels led to a loss of information: no correspondence between facial traits and HOG histograms can be established, since the orientations were accumulated over a large image region. On the contrary, a small cell size (3 pixels) produced a very crowded distribution of the bins, so the information cannot be adequately encoded. From Fig. 6 it can therefore be deduced that the most discriminative representation is given by a middle cell size (8 pixels in the examples).
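The effect of the two parameters on the descriptor can be made concrete with a simplified sketch. It is not the authors' implementation: it computes only unnormalised per-cell histograms of unsigned gradient orientation (no block normalisation or interpolation), and the 48 x 48 synthetic image and all function names are assumptions chosen for illustration.

```python
import math


def cell_histograms(img, cell_size, n_bins):
    """Unnormalised per-cell histograms of unsigned gradient orientation."""
    h, w = len(img), len(img[0])
    rows, cols = h // cell_size, w // cell_size
    hists = [[[0.0] * n_bins for _ in range(cols)] for _ in range(rows)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            r, c = y // cell_size, x // cell_size
            if r >= rows or c >= cols:
                continue  # pixels beyond the last complete cell are skipped
            gx = img[y][x + 1] - img[y][x - 1]  # central differences
            gy = img[y + 1][x] - img[y - 1][x]
            mag = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0  # unsigned angle
            b = min(int(angle / (180.0 / n_bins)), n_bins - 1)
            hists[r][c][b] += mag  # magnitude-weighted vote
    return hists


def flatten(hists):
    """Concatenate all cell histograms into one feature vector."""
    return [v for row in hists for cell in row for v in cell]


SIZE = 48  # hypothetical image side, not the resolution used in the paper
# Synthetic image with a single vertical edge in the middle.
img = [[0.0 if x < SIZE // 2 else 1.0 for x in range(SIZE)] for _ in range(SIZE)]

# Descriptor length for the three cell sizes discussed above, with 8 bins.
lengths = {cs: len(flatten(cell_histograms(img, cs, 8))) for cs in (3, 8, 15)}
```

On this 48 x 48 input the three cell sizes give 2048, 288 and 72 values respectively. This mirrors the trade-off described in the text: small cells produce a long, crowded descriptor, while large cells compress wide image regions into very few histograms.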

