
Co-Curing Noisy Annotations for Facial Expression Recognition


Affiliations
1 Department of Mathematics and Computer Science, Sri Sathya Sai Institute of Higher Learning, India
     



Abstract

Driven by advances in technology that facilitate the implementation of deep neural networks (DNNs), and by the availability of large-scale datasets, the automatic recognition performance of machines has improved by leaps and bounds. This is also true of facial expression recognition (FER), wherein a machine automatically classifies a given facial image into one of the basic expressions. However, the annotations of large-scale FER datasets suffer from noise due to factors such as crowdsourcing and automatic labelling based on keyword search. Such noisy annotations impede FER performance because of the memorization ability of DNNs. To address this, this paper proposes a learning algorithm called Co-curing: peer training of two joint networks using a supervision loss and a mimicry loss that are balanced dynamically, supplemented with a relabeling module that corrects noisy annotations. Specifically, the peer networks are trained independently using the supervision loss during the early part of training. As training progresses, the mimicry loss is given higher weightage to bring the two networks to consensus. Co-curing does not need to know the noise rate. Samples with wrong annotations are relabeled based on the agreement of the peer networks. Experiments on synthetic as well as real-world noisy datasets validate the effectiveness of our method. State-of-the-art (SOTA) results are reported on benchmark in-the-wild FER datasets such as RAF-DB (89.70%), FERPlus (89.6%) and AffectNet (61.7%).
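The dynamic loss balancing and agreement-based relabeling described above can be sketched in a few lines. The following is a minimal pure-Python illustration, not the paper's implementation: the linear ramp schedule, the confidence threshold, and the function names (`mimicry_weight`, `co_curing_loss`, `maybe_relabel`) are all assumptions made for the sake of the example.

```python
def mimicry_weight(epoch, total_epochs, w_max=0.9):
    """Ramp the mimicry-loss weight from 0 up to w_max.

    Early in training the supervision loss dominates (each peer
    fits the given annotations independently); later, the mimicry
    loss dominates to pull the two peer networks to consensus.
    A linear ramp is assumed here purely for illustration.
    """
    return w_max * min(1.0, epoch / total_epochs)


def co_curing_loss(sup_loss_a, sup_loss_b, mimicry_loss, epoch, total_epochs):
    """Dynamically balanced total loss for the two peer networks."""
    lam = mimicry_weight(epoch, total_epochs)
    supervision = sup_loss_a + sup_loss_b
    return (1.0 - lam) * supervision + lam * mimicry_loss


def maybe_relabel(label, probs_a, probs_b, threshold=0.8):
    """Relabel a sample when both peers confidently agree on a
    class that differs from the current annotation; otherwise
    keep the original annotation."""
    pred_a = max(range(len(probs_a)), key=probs_a.__getitem__)
    pred_b = max(range(len(probs_b)), key=probs_b.__getitem__)
    if (pred_a == pred_b and pred_a != label
            and probs_a[pred_a] > threshold and probs_b[pred_b] > threshold):
        return pred_a  # corrected annotation
    return label       # annotation kept as given
```

Note that the weighting depends only on training progress, which is consistent with the claim that Co-curing does not need to know the noise rate.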

Keywords

Noisy Annotations, Facial Expression Recognition, Co-Curing, Mimicry Loss, Peer Learning

References

  • C. Darwin and P. Prodger, “The Expression of the Emotions in Man and Animals”, Oxford University Press, 1998.
  • S. Li, W. Deng, “Deep Facial Expression Recognition: A Survey”, IEEE Transactions on Affective Computing, Early Access, 2020.
  • P. Ekman and W.V. Friesen, “Constants across Cultures in the Face and Emotion”, Journal of Personality and Social Psychology, Vol. 17, No. 2, pp. 124-129, 1971.
  • P. Ekman, “Strong Evidence for Universals in Facial Expressions: A Reply to Russell’s Mistaken Critique”, Psychological Bulletin, Vol. 115, No. 2, pp. 268-287, 1994.
  • D. Matsumoto, “More Evidence for the Universality of a Contempt Expression”, Motivation and Emotion, Vol. 16, No. 4, pp. 363-368, 1992.
  • X. Fan, Z. Deng, K. Wang, X. Peng and Y. Qiao, “Learning Discriminative Representation for Facial Expression Recognition from Uncertainties”, Proceedings of IEEE International Conference on Image Processing, pp. 903-907, 2020.
  • J. Ma, “Facial Expression Recognition using Hybrid Texture Features based Ensemble Classifier”, International Journal of Advanced Computer Science and Applications, Vol. 6, pp. 1-13, 2017.
  • C. Shan, S. Gong and P.W. McOwan, “Facial Expression Recognition based on Local Binary Patterns: A Comprehensive Study”, Image and Vision Computing, Vol. 27, No. 6, pp. 803-816, 2009.
  • P. Hu, D. Cai, S. Wang, A. Yao and Y. Chen, “Learning Supervised Scoring Ensemble for Emotion Recognition in the Wild”, Proceedings of ACM International Conference on Multimodal Interaction, pp. 553-560, 2017.
  • H. Chun Lo and R. Chung, “Facial Expression Recognition Approach for Performance Animation”, Proceedings of IEEE International Workshop on Digital and Computational Video, pp. 613-622, 2001.
  • T. Kanade, J.F. Cohn and Y. Tian, “Comprehensive Database for Facial Expression Analysis”, Proceedings of 4th IEEE International Conference on Automatic Face and Gesture Recognition, pp. 46-53, 2000.
  • P. Lucey, J.F. Cohn, T. Kanade, J. Saragih, Z. Ambadar and I. Matthews, “The Extended Cohn-Kanade Dataset (CK+): A Complete Dataset for Action Unit and Emotion-Specified Expression”, Proceedings of IEEE International Workshops on Computer Vision and Pattern Recognition, pp. 94-101, 2010.
  • G. Zhao, X. Huang, M. Taini, S.Z. Li and M. Pietikainen, “Facial Expression Recognition from Near-Infrared Videos”, Proceedings of IEEE International Conference on Image and Vision Computing, pp. 607-619, 2011.
  • F.Y. Shih, C.F. Chuang and P.S.P. Wang, “Performance Comparisons of Facial Expression Recognition in Jaffe Database”, International Journal of Pattern Recognition and Artificial Intelligence, Vol. 22, No. 3, pp. 445-459, 2008.
  • A. Mollahosseini, B. Hasani and M.H. Mahoor, “A Database for Facial Expression, Valence, and Arousal Computing in the Wild”, IEEE Transactions on Affective Computing, Vol. 10, No. 1, pp. 18-31, 2017.
  • E. Barsoum, C. Zhang, C.C. Ferrer and Z. Zhang, “Training Deep Networks for Facial Expression Recognition with Crowd-Sourced Label Distribution”, Proceedings of 18th ACM International Conference on Multimodal Interaction, pp. 279-283, 2016.
  • S. Li and W. Deng, “Reliable Crowdsourcing and Deep Locality-Preserving Learning for Unconstrained Facial Expression Recognition”, IEEE Transactions on Image Processing, Vol. 28, No. 1, pp. 356-370, 2018.
  • S. Li, W. Deng and J. Du, “Reliable Crowdsourcing and Deep Locality Preserving Learning for Expression Recognition in the Wild”, Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, pp. 2852-2861, 2017.
  • K. Wang, X. Peng, J. Yang, S. Lu and Y. Qiao, “Suppressing Uncertainties for Large-Scale Facial Expression Recognition”, Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 6897-6906, 2020.
  • D. Arpit, S. Jastrz, N. Ballas, D. Krueger, E. Bengio, M.S. Kanwal, T. Maharaj, A. Fischer, A. Courville and Y. Bengio, “A Closer Look at Memorization in Deep Networks”, Proceedings of International Conference on Machine Learning, pp. 233-242, 2017.
  • C. Zhang, S. Bengio, M. Hardt, B. Recht and O. Vinyals, “Understanding Deep Learning requires Rethinking Generalization”, Proceedings of International Conference on Machine Learning, pp. 1-13, 2017.
  • B. Frenay and M. Verleysen, “Classification in the Presence of Label Noise: A Survey”, IEEE Transactions on Neural Networks and Learning Systems, Vol. 25, No. 5, pp. 845-869, 2013.
  • J. Goldberger and E. Ben-Reuven, “Training Deep Neural-Networks using A Noise Adaptation Layer”, Proceedings of International Conference on Machine Learning, pp. 1-5, 2016.
  • G. Patrini, A. Rozza, A. Menon, R. Nock and L. Qu, “Making Neural Networks Robust to Label Noise: A Loss Correction Approach”, Proceedings of International Conference on Machine Learning, pp. 1-9, 2016.
  • B. Han, Q. Yao, X. Yu, G. Niu, M. Xu, W. Hu, I. Tsang and M. Sugiyama, “Co-Teaching: Robust Training of Deep Neural Networks with Extremely Noisy Labels”, Proceedings of International Conference on Machine Learning, pp. 1-13, 2018.
  • Darshan Gera and S. Balasubramanian, “Landmark Guidance Independent Spatio-Channel Attention and Complementary Context Information based Facial Expression Recognition”, Pattern Recognition Letters, Vol. 145, pp. 58-66, 2021.
  • Samuli Laine and Timo Aila, “Temporal Ensembling for Semi-Supervised Learning”, Proceedings of International Conference on Learning Representations, pp. 1-13, 2017.
  • X. Yu, B. Han, J. Yao, G. Niu, I. Tsang, M. Sugiyama, “How does Disagreement Help Generalization Against Label Corruption?”, Proceedings of International Conference on Machine Learning, pp. 7164-7173, 2019.
  • X. Wang, Y. Hua, E. Kodirov and N.M. Robertson, “IMAE for Noise-Robust Learning: Mean Absolute Error does not Treat Examples Equally and Gradient Magnitude’s Variance Matters”, Proceedings of International Conference on Machine Learning, pp. 1-14, 2019.
  • Y. Wang, X. Ma, Z. Chen, Y. Luo, J. Yi and J. Bailey, “Symmetric Cross Entropy for Robust Learning with Noisy Labels”, Proceedings of IEEE/CVF International Conference on Computer Vision, pp. 322-330, 2019.
  • Ying Zhang, Tao Xiang, Timothy M. Hospedales and Huchuan Lu, “Deep Mutual Learning”, Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, pp. 4320-4328, 2018.
  • Zhilu Zhang and Mert Sabuncu, “Generalized Cross Entropy Loss for Training Deep Neural Networks with Noisy Labels”, Proceedings of IEEE Conference on Neural Information Processing Systems, pp. 8778-8788, 2018.
  • Scott Reed, Honglak Lee, Dragomir Anguelov, Christian Szegedy, Dumitru Erhan and Andrew Rabinovich, “Training Deep Neural Networks on Noisy Labels with Bootstrapping”, Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, pp. 3320-3328, 2015.
  • H. Siqueira, S. Magg and S. Wermter, “Efficient Facial Feature Learning with Wide Ensemble-Based Convolutional Neural Networks”, Proceedings of AAAI Conference on Artificial Intelligence, pp. 5800-5809, 2020.
  • P. Jiang, B. Wan, Q. Wang and J. Wu, “Fast and Efficient Facial Expression Recognition using a Gabor Convolutional Network”, IEEE Signal Processing Letters, Vol. 27, pp. 1954-1958, 2020.
  • P. Ding and R. Chellappa, “Occlusion-Adaptive Deep Network for Robust Facial Expression Recognition”, Proceedings of IEEE International Joint Conference on Biometrics, pp. 1-9, 2020.
  • Y. Li, J. Zeng, S. Shan and X. Chen, “Occlusion Aware Facial Expression Recognition using CNN with Attention Mechanism”, IEEE Transactions on Image Processing, Vol. 28, No. 5, pp. 2439-2450, 2018.
  • E. Malach and S. Shalev-Shwartz, “Decoupling ‘When to Update’ from ‘How to Update’”, Proceedings of Advances in Neural Information Processing Systems, pp. 1-11, 2017.
  • H. Wei, L. Feng, X. Chen and B. An, “Combating Noisy Labels by Agreement: A Joint Training Method with Co-Regularization”, Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13726-13735, 2020.
  • F. Sarfraz, E. Arani and B. Zonooz, “Noisy Concurrent Training for Efficient Learning under Label Noise”, Proceedings of IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 3159-3168, 2021.
  • J. Zeng, S. Shan and X. Chen, “Facial Expression Recognition with Inconsistently Annotated Datasets”, Proceedings of European Conference on Computer Vision, pp. 222-237, 2018.
  • K. Zhang, Z. Zhang, Z. Li and Y. Qiao, “Joint Face Detection and Alignment using Multitask Cascaded Convolutional Networks”, IEEE Signal Processing Letters, Vol. 23, No. 10, pp. 1499-1503, 2016.
  • K. He, X. Zhang, S. Ren and J. Sun, “Deep Residual Learning for Image Recognition”, Proceedings of IEEE International Conference on Computer Vision and Pattern Recognition, pp. 770-778, 2016.
  • Y. Guo, L. Zhang, Y. Hu, X. He and J. Gao, “Ms-Celeb-1m: A Dataset and Benchmark for Large-Scale Face Recognition”, Proceedings of European Conference on Computer Vision, pp. 87-102, 2016.
  • K. Wang, X. Peng, J. Yang, D. Meng and Y. Qiao, “Region Attention Networks for Pose and Occlusion Robust Facial Expression Recognition”, IEEE Transactions on Image Processing, Vol. 29, pp. 4057-4069, 2020.

Authors

Darshan Gera
Department of Mathematics and Computer Science, Sri Sathya Sai Institute of Higher Learning, India
S. Balasubramanian
Department of Mathematics and Computer Science, Sri Sathya Sai Institute of Higher Learning, India
