Deep Neural Networks with Multistate Activation Functions.

Cai C, Xu Y, Ke D, Su K - Comput Intell Neurosci (2015)

Bottom Line: Experimental results on the TIMIT corpus reveal that, on speech recognition tasks, DNNs with MSAFs perform better than conventional DNNs, achieving a relative improvement of 5.60% on phoneme error rates. Further experiments also reveal that mean-normalised SGD facilitates the training of DNNs with MSAFs, especially with large training sets. The models can also be trained directly without pretraining when the training set is sufficiently large, which results in a considerable relative improvement of 5.82% on word error rates.


Affiliation: School of Technology, Beijing Forestry University, No. 35 Qinghuadong Road, Haidian District, Beijing 100083, China.

ABSTRACT
We propose multistate activation functions (MSAFs) for deep neural networks (DNNs). These MSAFs are a new class of activation functions capable of representing more than two states, and include the N-order MSAFs and the symmetrical MSAF. DNNs with these MSAFs can be trained via conventional Stochastic Gradient Descent (SGD) as well as mean-normalised SGD. We also discuss how these MSAFs perform when used to resolve classification problems. Experimental results on the TIMIT corpus reveal that, on speech recognition tasks, DNNs with MSAFs perform better than conventional DNNs, achieving a relative improvement of 5.60% on phoneme error rates. Further experiments also reveal that mean-normalised SGD facilitates the training of DNNs with MSAFs, especially with large training sets. The models can also be trained directly without pretraining when the training set is sufficiently large, which results in a considerable relative improvement of 5.82% on word error rates.
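
As an illustrative aid only, the sketch below models an N-order MSAF as a sum of logistic sigmoids with shifted inputs, so that a 2-order MSAF can saturate near 0, 1, or 2 and thus represents three states rather than the two states of a single logistic unit. The offset values and the exact parameterisation are assumptions for illustration and may differ from the paper's definition.

import numpy as np

def logistic(x):
    """Standard logistic (sigmoid) activation, the conventional baseline."""
    return 1.0 / (1.0 + np.exp(-x))

def msaf(x, offsets=(0.0, 4.0)):
    """Illustrative N-order MSAF: a sum of logistic functions with shifted inputs.

    With two offsets (a hypothetical 2-order MSAF) the output can settle near
    0, 1, or 2, i.e. three representable states. The offsets are assumptions.
    """
    return sum(logistic(x - b) for b in offsets)

def msaf_grad(x, offsets=(0.0, 4.0)):
    """Derivative of the sketch MSAF for backpropagation.

    Each logistic term s contributes s * (1 - s) to the gradient.
    """
    return sum(logistic(x - b) * (1.0 - logistic(x - b)) for b in offsets)

# Quick look at the staircase-like shape of the sketch activation.
z = np.linspace(-6.0, 10.0, 9)
print(np.round(msaf(z), 3))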

No MeSH data available.



fig10: Cross entropy losses of the models using the 2-order MSAF on the cross-validation set. The curve of the logistic function is included to make the comparison clearer.

Mentions: Figure 10 shows the cross entropy loss curves of the models that use the 2-order MSAF as their activation function, together with the curve of the logistic-function baseline. The 2-order MSAF clearly achieves better results than the logistic function, and the results are considerably better when mean-normalisation is applied. Although both the logistic function and the 2-order MSAF require 15 training epochs to converge, the final cross entropy loss of the latter is slightly lower. Notably, the 2-order MSAF combined with mean-normalisation reaches the lowest loss while requiring only 12 epochs.
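
For concreteness, a minimal sketch of how the per-epoch cross entropy loss on the cross-validation set could be monitored is given below; the model and data handles (model, train_set, cv_set, train_one_epoch) are placeholders, not the paper's actual tooling.

import numpy as np

def cross_entropy(probs, labels):
    """Mean cross entropy over a held-out (cross-validation) set.

    probs:  (num_frames, num_classes) softmax outputs of the network
    labels: (num_frames,) integer state/phoneme targets
    """
    eps = 1e-12  # guards against log(0)
    return float(-np.mean(np.log(probs[np.arange(len(labels)), labels] + eps)))

# Hypothetical monitoring loop mirroring the curves in Figure 10;
# `model`, `train_set`, and `cv_set` are placeholders.
# for epoch in range(15):
#     train_one_epoch(model, train_set)        # SGD or mean-normalised SGD
#     probs = model.forward(cv_set.features)   # softmax posteriors
#     print(epoch, cross_entropy(probs, cv_set.labels))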

