Programa de Pós-graduação em Engenharia Elétrica da UFBA


Doctoral Thesis Defense Session No. 57 of PPGEEC - Visual-temporal Estimation Systems of Biosignals and Motion Signatures in Videos for Human-Robot Interaction

STUDENT: JOÃO MARCELO SILVA SOUZA

 

DATE: 03/26/2025

 

TIME: 09:00

 

LOCATION: https://us02web.zoom.us/j/83991196315?pwd=lBjuIQzL6onl3bQoeqrSOBRe0DavbJ.1

 

TITLE: Visual-temporal Estimation Systems of Biosignals and Motion Signatures in Videos for Human-Robot Interaction

 

KEYWORDS: Biosignals; visual-temporal; facial expressions; motion signatures; facial points of interest; spatio-temporal normalization; time series; VT-FER; FBioT; FACS.

 

ABSTRACT: In Human-Robot Interaction (HRI), visual estimation of biosignals over time is essential for extracting human features, interpreting behaviors, and providing diverse cyber-physical feedback and stimuli. In this context, Facial Expression Recognition (FER) systems have been developed to automate the computational analysis of human behavior, a process that requires careful observation and the integrated treatment of complex spatiotemporal correlations. However, current FER systems and datasets predominantly explore spatial, static, or instantaneous aspects, limiting the investigation of facial muscle deformations and movements over time in real-world situations. To overcome this limitation, this work proposes an alternative to the conventional image domain, connecting the visual representation of points of interest to temporal descriptors. To this end, the points are referenced over time, normalized spatiotemporally, and transformed into measurements that generate movement signatures represented as multivariate time series. This work presents: the proposed methodology, called Visual-Temporal FER (VT-FER), and its respective framework; the 22 standardized measurements based on the fundamentals of the Facial Action Coding System (FACS); the pipeline architecture for computational systems; and a new dataset, the Facial Biosignals Time-Series (FBioT), comprising more than 21 thousand seconds of video of real situations, recorded in uncontrolled environments and sourced from public databases. The prototype results validated the temporal hypotheses of the proposal, reaching accuracy levels compatible with benchmarks from the scientific community: 94% for a neural network trained with CK+ reference data for emotion detection in a controlled environment, and 72% for arousal detection in an uncontrolled environment, based on the AFEW reference. Furthermore, the FBioT dataset made it possible to explore the potential of the methodology for developing neural networks, reaching 80% accuracy in the visual-temporal detection of emotions embedded in conversation and 88% in the visual identification of words from temporal observation of the mouth.
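As a rough illustration of the pipeline the abstract describes (facial points of interest tracked over time, spatiotemporal normalization, and conversion into a multivariate time series of measurements), the sketch below uses hypothetical landmark indices and two toy measurements. It is an assumption-laden illustration, not the thesis's VT-FER framework nor its 22 FACS-based measurements.

# Minimal sketch, assuming 2-D facial landmarks are already tracked per frame.
# Landmark indices, the eye-based normalization, and the two mouth measurements
# are illustrative assumptions, not the measurements defined in the thesis.
import numpy as np

def normalize_frame(landmarks, left_eye, right_eye):
    """Translate landmarks to the eye midpoint and scale by inter-ocular distance."""
    origin = (landmarks[left_eye] + landmarks[right_eye]) / 2.0
    scale = np.linalg.norm(landmarks[right_eye] - landmarks[left_eye])
    return (landmarks - origin) / scale

def mouth_measurements(landmarks, top, bottom, left, right):
    """Two toy measurements: vertical mouth opening and horizontal mouth width."""
    opening = np.linalg.norm(landmarks[top] - landmarks[bottom])
    width = np.linalg.norm(landmarks[left] - landmarks[right])
    return np.array([opening, width])

def video_to_time_series(frames_landmarks):
    """frames_landmarks: array of shape (T, N, 2), N landmarks per frame.
    Returns a (T, M) multivariate time series of normalized measurements."""
    series = []
    for lm in frames_landmarks:
        norm = normalize_frame(lm, left_eye=0, right_eye=1)      # assumed indices
        series.append(mouth_measurements(norm, top=2, bottom=3,  # assumed indices
                                          left=4, right=5))
    return np.stack(series)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fake_video = rng.normal(size=(30, 6, 2))   # 30 frames, 6 synthetic landmarks
    ts = video_to_time_series(fake_video)
    print(ts.shape)                            # -> (30, 2): two variables per frame

In a real setting, a time series like this (with the thesis's 22 measurements rather than two) would feed a temporal model for emotion, arousal, or word detection, as reported in the abstract.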

 

BOARD MEMBERS:

 

JES DE JESUS FIAIS CERQUEIRA (PRESIDENT)       UFBA

 

WAGNER LUIZ ALVES DE OLIVEIRA                       UFBA

 

ANTONIO CARLOS LOPES FERNANDES JUNIOR      UFBA

 

EDUARDO FURTADO DE SIMAS FILHO                   UFBA

 

FERNANDO ALBERTO CORREIA DOS SANTOS        PUC - RJ

 

THAMILES RODRIGUES DE MELO                        SENAI/CIMATEC

On 03/20/2025

 

