EXPLAINABILITY 2024, The First International Conference on Systems Explainability


Explainable Facial Emotion Recognition with the use of Vision Transformers

Authors:
Isidoros Perikos
Ioannis Kollias
Vaggelis Kapoulas
Michael Paraskevas

Keywords: Facial Emotion Recognition; Vision Transformers (ViT); Explainability; Temporal Convolutional Network (TCN)

Abstract:
Facial Emotion Recognition (FER) plays an important role in human-computer interaction, helping computer systems interpret and react to human emotions. The analysis of facial expressions and the accurate recognition of their emotional content are valuable across a wide spectrum of domains. In this paper, we present work on recognizing facial expressions using a hybrid framework that combines Vision Transformers (ViT) with Temporal Convolutional Networks (TCN). The ViT component extracts intricate facial features, whereas the TCN component captures temporal relationships and aims to improve the accuracy of facial expression classification. In addition, the LIME technique is used to illustrate the decision-making process of the framework. Our framework achieves an accuracy of 72% on the FER2023 dataset, with a strong emphasis on the explanatory power and generalizability of the model.
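
As a rough illustration of the temporal component mentioned in the abstract, the sketch below shows a dilated causal 1-D convolution, the basic building block of a generic TCN. It is not the authors' implementation: the function name `causal_dilated_conv`, the kernel weights, and the toy per-frame values are assumptions made purely for illustration.

```python
from typing import List, Sequence


def causal_dilated_conv(
    seq: Sequence[float], kernel: Sequence[float], dilation: int = 1
) -> List[float]:
    """Dilated causal 1-D convolution over a sequence of scalars.

    The output at step t depends only on seq[t], seq[t - d], seq[t - 2d], ...
    Positions before the start of the sequence are treated as zero
    (left zero-padding), so the output has the same length as the input.
    """
    if dilation < 1:
        raise ValueError("dilation must be >= 1")
    out = []
    for t in range(len(seq)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation  # look back k * dilation steps, never ahead
            if idx >= 0:
                acc += w * seq[idx]
        out.append(acc)
    return out


# Hypothetical per-frame scores (stand-ins for features a ViT might produce).
frame_features = [1.0, 2.0, 3.0, 4.0, 5.0]

# Two stacked layers with growing dilation, as is typical in a TCN.
layer1 = causal_dilated_conv(frame_features, kernel=[0.5, 0.5], dilation=1)
layer2 = causal_dilated_conv(layer1, kernel=[0.5, 0.5], dilation=2)
```

With a kernel of size 2 and dilations 1 and 2, each value in `layer2` summarizes up to four consecutive input steps, which is how stacking dilated layers widens the temporal receptive field without looking at future frames.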

Pages: 11 to 16

Copyright: Copyright (c) IARIA, 2024

Publication date: November 17, 2024

Published in: conference

ISBN: 978-1-68558-215-9

Location: Valencia, Spain

Dates: from November 17, 2024 to November 21, 2024