Human-human interaction (HHI) relies on people's ability to understand each other, often by making use of implicit signals that are physiologically embedded in human behaviour and do not require the sender's awareness, so-called honest signals. When we are engrossed in a conversation, we align with our partner: we unconsciously mimic each other, coordinate our behaviours and synchronize positive displays of emotion. This tremendously important skill, which develops spontaneously in HHI, is currently lacking in robots.
This project builds on advances in deep learning, and in particular on the field of Explainable Artificial Intelligence (XAI), which offers approaches for increasing the interpretability and explainability of complex, highly nonlinear deep neural networks. It aims to develop new machine learning-based methods that: (1) automatically analyse and predict alignment in HHI; (2) visualize and interpret the regions of focus, as well as the type of information used (e.g., facial expression, eye movement, body position), in the network's decision making, to aid understanding of alignment in HHI; (3) mine HHI data to bootstrap emotional alignment in HRI, by iteratively transferring the dynamics of behaviours learnt in HHI to HRI, thus enabling robots to align with humans; and (4) interpret and analyse the learned HHI strategies and the derived HRI alignments, both to evaluate the system and to gain knowledge about subtle HHI during co-adaptive emotional alignment.
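To make objective (2) concrete, the minimal sketch below shows one widely used XAI technique, Grad-CAM, which highlights the spatial regions a convolutional network attends to when producing a prediction. It is an illustration only, not the project's method: the resnet18 backbone, the random placeholder input, and the choice of layer4 are assumptions standing in for an alignment-prediction network trained on HHI video.

```python
import torch
import torch.nn.functional as F
from torchvision.models import resnet18

# Stand-in backbone; the project would use its own alignment-prediction
# network trained on HHI recordings (this is an illustrative assumption).
model = resnet18(weights=None)
model.eval()

activations, gradients = {}, {}

def fwd_hook(module, inp, out):
    # Cache the feature maps of the chosen layer on the forward pass.
    activations["feat"] = out.detach()

def bwd_hook(module, grad_in, grad_out):
    # Cache the gradient of the score w.r.t. those feature maps.
    gradients["feat"] = grad_out[0].detach()

layer = model.layer4  # last conv block: coarse spatial regions of focus
layer.register_forward_hook(fwd_hook)
layer.register_full_backward_hook(bwd_hook)

x = torch.randn(1, 3, 224, 224)      # placeholder for an HHI video frame
logits = model(x)
score = logits[0, logits.argmax()]   # prediction whose evidence we explain
model.zero_grad()
score.backward()

# Grad-CAM: weight each feature map by its spatially pooled gradient,
# sum over channels, and keep only positive evidence.
w = gradients["feat"].mean(dim=(2, 3), keepdim=True)  # (1, C, 1, 1)
cam = F.relu((w * activations["feat"]).sum(dim=1))    # (1, H, W)
cam = cam / (cam.max() + 1e-8)                        # normalize to [0, 1]

# Upsample to input resolution so the map can be overlaid on the frame.
heatmap = F.interpolate(cam.unsqueeze(1), size=x.shape[-2:],
                        mode="bilinear", align_corners=False)[0, 0]
print(heatmap.shape)  # torch.Size([224, 224])
```

The resulting heatmap can be overlaid on the original frame to show where the network focused; applying such maps per input stream (face crops, gaze, body pose) is one plausible route to reporting which type of information drove a given alignment prediction.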