This online module provides hands-on training in the use of state-of-the-art open-source tools for extracting features of human behaviour. Through tutorials with screen recordings, participants will learn how to apply OpenFace, OpenPose, and openSMILE for behavioural data analysis. The module also includes lectures on multimodal analysis of the extracted data, offering both theoretical background and practical guidance.
Target audience
Postgraduate students and professionals
Prerequisites
Basic programming skills (Python and Bash scripting); familiarity with Docker
Intended learning outcomes
Analysis of Multimodal Input for Emotion Recognition: Students will be able to use state-of-the-art tools to extract features of human behaviour.
Teaching and learning methods
Tutorials on the open-source tools (with screen recordings), complemented by lectures introducing multimodal data analysis.
Assessment
Multiple-choice quizzes at the end of each session
Syllabus
Introduction to Multimodal Interactions and Analysis
- Overview of multimodal emotion recognition and the role of facial, vocal, and body features. Theory on multimodal analysis and integration. Reading materials on multimodal signal processing and emotion recognition.
Session 1: OpenFace for Facial Feature Extraction
- Introduction to OpenFace, installation and setup. Tutorials on extracting facial landmarks, gaze direction, head pose, and Action Units. Discussion of strengths/limitations. Hands-on exercises with sample datasets.
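As a flavour of the hands-on exercises, the sketch below runs OpenFace's FeatureExtraction tool on a video and inspects the per-frame Action Unit intensities with pandas. The video file name is a placeholder, and the exact output column names can differ between OpenFace versions, so treat them as assumptions to adapt to your installation.

```python
# Minimal sketch: run OpenFace on a video and summarise Action Unit intensities.
# Assumes the FeatureExtraction binary is on the PATH and writes a per-frame CSV;
# the sample file name and column names are illustrative.
import subprocess
from pathlib import Path

import pandas as pd

video = Path("sample_video.mp4")          # hypothetical sample file
out_dir = Path("openface_out")

# Extract landmarks, gaze, head pose, and Action Units for every frame.
subprocess.run(
    ["FeatureExtraction", "-f", str(video), "-out_dir", str(out_dir)],
    check=True,
)

# OpenFace writes <video_stem>.csv with one row per frame.
df = pd.read_csv(out_dir / f"{video.stem}.csv")
df.columns = df.columns.str.strip()       # some versions pad column names with spaces

# AU intensity columns typically end in "_r" (e.g. AU06_r, AU12_r).
au_cols = [c for c in df.columns if c.startswith("AU") and c.endswith("_r")]
print(df[au_cols].describe())
```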
Session 2: openSMILE for Vocal Feature Extraction
- Overview of acoustic features relevant to emotion (pitch, energy, spectral features). Introduction to openSMILE configuration files and feature sets (e.g., eGeMAPS). Tutorial on extracting vocal features from speech samples. Lab: compare vocal features across different emotional speech datasets.
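To give a sense of the tutorial, the sketch below extracts eGeMAPS functionals from one speech sample using the audEERING opensmile Python wrapper (assumed installed via `pip install opensmile`); the audio file name is a placeholder.

```python
# Minimal sketch: extract eGeMAPS functionals from a speech sample with the
# opensmile Python wrapper. The file name is a placeholder.
import opensmile

smile = opensmile.Smile(
    feature_set=opensmile.FeatureSet.eGeMAPSv02,    # 88 acoustic functionals
    feature_level=opensmile.FeatureLevel.Functionals,
)

features = smile.process_file("speech_sample.wav")  # returns a pandas DataFrame
print(features.shape)                               # one row of functionals per file
print(features.filter(like="F0").T)                 # pitch-related features
```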
Session 3: OpenPose for Body Movement Analysis
- Introduction to body pose estimation and gesture analysis. Hands-on tutorial with OpenPose to extract skeletal keypoints. Applications in emotion and behaviour recognition. Lab: Analyse differences in body posture/movement in contrasting emotional states.
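The lab works on output such as the per-frame JSON files that OpenPose's demo writes when run with `--write_json`. The sketch below reads those keypoints and computes a crude posture cue; the keypoint indices follow the BODY_25 format, and the "wrist spread" measure is purely illustrative.

```python
# Minimal sketch: read OpenPose per-frame JSON output and compute a simple
# posture cue (horizontal wrist spread) per frame. Assumes BODY_25 keypoints,
# where indices 4 and 7 are the right and left wrists.
import json
from pathlib import Path

def wrist_spread(json_file):
    """Horizontal distance between wrists for the first detected person."""
    data = json.loads(json_file.read_text())
    if not data["people"]:
        return None
    kp = data["people"][0]["pose_keypoints_2d"]   # flat list of x, y, confidence
    r_wrist_x, l_wrist_x = kp[4 * 3], kp[7 * 3]
    return abs(r_wrist_x - l_wrist_x)

spreads = [wrist_spread(f) for f in sorted(Path("out").glob("*_keypoints.json"))]
spreads = [s for s in spreads if s is not None]
print(f"mean wrist spread over {len(spreads)} frames: "
      f"{sum(spreads) / len(spreads):.1f} px")
```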
Session 4: Combining Multimodal Features
- Strategies for synchronising and integrating features across modalities. Data preprocessing, alignment of timestamps, feature selection. Tutorial: building a multimodal dataset from facial, vocal, and body data.
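As an illustration of the alignment step, the sketch below time-aligns per-frame facial features with per-window vocal features using pandas and concatenates them into one table; the file names, column names, and window tolerance are hypothetical placeholders.

```python
# Minimal sketch: align per-frame facial features with per-window vocal features
# on a shared timestamp, then save the fused table. Column names and the 1 s
# tolerance are hypothetical placeholders.
import pandas as pd

face = pd.read_csv("face_features.csv")     # columns: timestamp, AU06_r, AU12_r, ...
voice = pd.read_csv("voice_features.csv")   # columns: timestamp, F0_mean, loudness, ...

face = face.sort_values("timestamp")
voice = voice.sort_values("timestamp")

# For each facial frame, take the most recent vocal window (up to 1 s earlier).
fused = pd.merge_asof(
    face, voice, on="timestamp", direction="backward", tolerance=1.0
)

fused = fused.dropna()                      # drop frames without a matching window
fused.to_csv("multimodal_features.csv", index=False)
print(fused.head())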
Session 5: Data Analysis and Emotion Recognition
- Introduction to statistical analysis and machine learning approaches for multimodal data. Tutorials in Python/R for data exploration, feature analysis, and classification. Case study: train a simple model to recognise emotions using multimodal features.
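The case study could look something like the sketch below: a simple scikit-learn classifier trained on the fused feature table from Session 4. The "emotion" label column and the file name are assumptions about how the dataset was prepared.

```python
# Minimal sketch: train and evaluate a simple emotion classifier on the fused
# multimodal features. The label column and file name are assumptions.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

data = pd.read_csv("multimodal_features.csv")
X = data.drop(columns=["timestamp", "emotion"])
y = data["emotion"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

clf = RandomForestClassifier(n_estimators=200, random_state=42)
clf.fit(X_train, y_train)

print(classification_report(y_test, clf.predict(X_test)))
```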
Instructor
Micol Spitale
Politecnico di Milano