Current project
The Emotion AI project investigates novel computer vision and machine learning methodology to study how artificial intelligence (AI) can identify human emotions, and to bring emotion AI to human-computer interaction and computer-mediated human-human interaction in order to boost remote communication and collaboration. In addition to expressed visual cues, the AI technology is expected to identify suppressed and unseen emotional signals while at the same time masking people's identity information to protect privacy. We will work with world-leading experts from different disciplines, e.g., psychology, cognitive science, education and medicine, as well as with industry, to advance the emotional intelligence of AI-based solutions and to improve understanding of the significance of emotions in human-computer interaction. The generated research knowledge can accelerate innovations in real-world applications such as e-teaching, e-services, health and security.
https://www.oulu.fi/blogs/science-with-arctic-attitude/emotion-ai
How EmotionAI is organised
Aims: Acquisition of reliable new dataset; Dynamic descriptors; Temporal modeling; Context analysis.
People: Hanlin Mo, Qianru Xu
Aims: Acquisition of reliable new dataset; Dynamic descriptors; Temporal modeling; Context analysis; Subtle motion detection and magnification
People: Yante Li, Haoyu Chen
Aims: Acquisition of reliable new dataset; Temporal modeling; Context analysis; Subtle motion detection and magnification
People: Marko Savic
Aims: Acquisition of reliable new dataset; Context analysis; Multi-modal learning
People: All
Aims: Acquisition of reliable new dataset; Emotion cue transfer
People: Kevin Ho Man Cheng, Marko Savic, Haoyu Chen
Aims: Acquisition of reliable new dataset; Context analysis; Multi-modal learning; Emotion cue transfer
People: All
Project publications
You can also find all the other publications from our unit on our PI's personal site or her UniOulu page.
| Article | Year |
|---|---|
| From Emotion AI to Cognitive AI. G Zhao, Y Li and Q Xu. International Journal of Network Dynamics and Intelligence (IJNDI) | 2022 |
| Benchmarking 3D Face De-identification with Preserving Facial Attributes. K H M Cheng, Z Yu, H Chen, G Zhao. IEEE International Conference on Image Processing (ICIP), 2022 | 2022 |
| Deep Learning for Micro-expression Recognition: A Survey. Y Li, J Wei, Y Liu, J Kauttonen, G Zhao. IEEE Transactions on Affective Computing, Vol. 13, No. 4 | 2022 |
| Geometry-Contrastive Transformer for Generalized 3D Pose Transfer. H Chen, H Tang, Z Yu, N Sebe, G Zhao. Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2022 | 2022 |
| AniFormer: Data-driven 3D Animation with Transformer. H Chen, H Tang, N Sebe, G Zhao. Proceedings of the British Machine Vision Conference (BMVC), 2021. https://github.com/mikecheninoulu/AniFormer | 2021 |
Our research in action
Other projects
There are several other projects we are working on, such as:
Academy project (2018.09-2022.08): Micro-gesture analysis with machine learning for hidden emotion understanding (MiGA)
The MiGA project aims to understand hidden human emotions via body gestures. Micro-gestures are inconspicuous, spontaneous gestures, most of which are performed beyond our awareness, i.e., unconsciously. Automatically detecting and then amplifying such gestures makes it possible to uncover their symbolic meaning, opening up rich paths for emotional intelligence; a simplified illustration of the amplification idea is sketched after this project list. In this project, we introduced a new dataset for emotional artificial intelligence research: the identity-free video dataset for Micro-Gesture Understanding and Emotion analysis (iMiGUE), which contains 359 short videos of 72 famous tennis players. The project is intended purely for research purposes, to improve algorithms for recognising micro-gestures and emotions.
Academy ICT 2023 project (2020.01-2022.12): Context-aware autonomous neural network learning
Spearhead project (2018.09-2022.08): Towards Reading Micro-Expressions in the Real World
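The MiGA description above mentions detecting and then amplifying subtle micro-gestures. As a rough illustration only, the sketch below applies a simple Eulerian-style temporal band-pass filter to pixel intensities so that small motions become more visible. It is not the MiGA pipeline; the function name, the input file `clip.mp4`, the 0.4-3 Hz band and the amplification factor are assumptions made for the example, and OpenCV, NumPy and SciPy are assumed to be available.

```python
# Illustrative sketch: Eulerian-style amplification of subtle motion in a video.
# Not the MiGA pipeline; the file name, frequency band and alpha are assumptions.
import cv2
import numpy as np
from scipy.signal import butter, filtfilt

def amplify_subtle_motion(path, low_hz=0.4, high_hz=3.0, alpha=10.0, fps=None):
    cap = cv2.VideoCapture(path)
    fps = fps or cap.get(cv2.CAP_PROP_FPS) or 30.0
    frames = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        # work on a spatially smoothed grayscale copy to suppress sensor noise
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY).astype(np.float32)
        frames.append(cv2.GaussianBlur(gray, (9, 9), 0))
    cap.release()
    video = np.stack(frames)  # shape (T, H, W)

    # temporal band-pass filter every pixel's intensity trace
    b, a = butter(2, [low_hz, high_hz], btype="band", fs=fps)
    bandpassed = filtfilt(b, a, video, axis=0)

    # add the amplified band back to the original signal
    return np.clip(video + alpha * bandpassed, 0, 255).astype(np.uint8)

amplified = amplify_subtle_motion("clip.mp4")  # hypothetical input file
```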
Sharing
There are several datasets we share with the research community, which are briefly introduced below. If you are interested, please contact us.
The OuluVS database includes video and audio data of 20 subjects uttering ten phrases: Hello, Excuse me, I am sorry, Thank you, Good bye, See you, Nice to meet you, You are welcome, How are you, Have a good time. Each person spoke each phrase five times. There are also videos with head motion from front to left and from front to right, without utterance, five times for each person. The database can be used, for example, for studying visual speech recognition (lipreading).
The details and the baseline results can be found in:
Zhao G, Barnard M & Pietikäinen M (2009). Lipreading with local spatiotemporal descriptors. IEEE Transactions on Multimedia 11(7):1254-1265
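For readers unfamiliar with the local spatiotemporal descriptors used in the paper above, the snippet below gives a simplified sketch of the LBP-TOP idea: local binary patterns computed on slices along the three orthogonal space-time orientations and pooled into one histogram. The function name, parameters and the per-slice approximation are illustrative assumptions (scikit-image and NumPy assumed), not the exact setup of the cited work.

```python
# Simplified sketch of the LBP-TOP idea (LBP on three orthogonal planes).
# Parameters and per-slice pooling are illustrative, not the cited paper's setup.
import numpy as np
from skimage.feature import local_binary_pattern

def lbp_top_histogram(volume, P=8, R=1, bins=59):
    """volume: (T, H, W) uint8 grayscale block, e.g. a cropped mouth region."""
    hists = []
    for axis in range(3):  # 0: XY slices, 1: XT slices, 2: YT slices
        slices = np.moveaxis(volume, axis, 0)
        codes = np.concatenate([
            local_binary_pattern(sl, P, R, method="nri_uniform").ravel()
            for sl in slices
        ])
        hist, _ = np.histogram(codes, bins=bins, range=(0, bins), density=True)
        hists.append(hist)
    return np.concatenate(hists)  # XY + XT + YT histograms, 3 * 59 values
```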
OuluVS2 is a multi-view audiovisual database for non-rigid mouth motion analysis. It includes more than 50 speakers uttering three types of utterances and, more importantly, thousands of videos simultaneously recorded by six cameras from five different views spanning the range between the frontal and profile views.
The details and the baseline results can be found in:
Anina, I., Zhou, Z., Zhao, G., & Pietikäinen, M. (2015). OuluVS2: A multi-view audiovisual database for non-rigid mouth motion analysis. In 2015 11th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition (FG) (pp. 1-5).
The Oulu-CASIA NIR&VIS facial expression database contains videos of the six typical expressions (happiness, sadness, surprise, anger, fear, disgust) from 80 subjects, captured with two imaging systems, NIR (near infrared) and VIS (visible light), under three different illumination conditions: normal indoor illumination, weak illumination (only the computer display is on) and dark illumination (all lights are off). The database can be used, for example, for studying the effects of illumination variations on facial expressions, cross-imaging-system facial expression recognition, or face recognition.
The details and the baseline results can be found in:
Zhao, G., Huang, X., Taini, M., Li, S. Z., & Pietikäinen, M. (2011). Facial expression recognition from near-infrared videos. Image and Vision Computing, 29(9), 607-619.
The SPOS database includes spontaneous and posed facial expressions of 7 subjects. Emotional movie clips were shown to the subjects to induce spontaneous facial expressions covering the six basic emotion categories (happiness, sadness, anger, surprise, fear, disgust). The subjects were also asked to pose the six kinds of facial expressions after watching the movie clips. The data were recorded with both a visual and a near-infrared camera. Altogether 84 posed and 147 spontaneous facial expression clips were labeled, from the starting point (onset) to the apex.
So far, spontaneous and posed facial expressions have usually been found in different databases. The differences between databases (different experimental settings and different participants) have hindered research that considers both spontaneous and posed facial expressions. This database offers data collected from the same participants under the same recording conditions, and can therefore be used for comparing or distinguishing spontaneous and posed facial expressions.
The details and the baseline results can be found in:
Pfister, T., Li, X., Zhao, G., & Pietikäinen, M. (2011). Recognising spontaneous facial micro-expressions. In 2011 IEEE International Conference on Computer Vision (ICCV) (pp. 1449-1456).
Micro-expressions are important clues for analysing people's deceitful behaviour. The lack of training data has been hindering research on automatic micro-expression recognition, and SMIC was developed to fill this gap. The SMIC database includes spontaneous micro-expressions elicited by emotional movie clips. Movie clips that induce strong emotional reactions were shown to the subjects, who were asked to hide their true feelings while watching; if they failed to do so, they had to fill in a long, boring questionnaire as a punishment. This setting creates a high-stakes lie situation in which micro-expressions can be induced.
There are also extended versions of SMIC (referred to as SMIC-E and SMIC-E-Long), which include long video clips that contain extra non-micro frames before and after the labelled micro frames, as well as many more contextual frames containing various spontaneous movements.
The details can be found in the following publications:
SMIC-sub, SMIC, SMIC-E, SMIC-E-Long
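As a rough illustration of why the long SMIC-E and SMIC-E-Long clips are useful, the sketch below shows a naive frame-difference baseline for spotting candidate micro-expression intervals inside a long clip. The function name, window length and threshold are assumptions made for the example (NumPy assumed), not the protocol used with the dataset.

```python
# Illustrative sketch: naive frame-difference spotting of candidate
# micro-expression intervals in a long clip. Window and threshold are assumptions.
import numpy as np

def spot_candidates(frames, win=15, thresh=3.0):
    """frames: (T, H, W) grayscale array; returns candidate (start, end) frame index pairs."""
    # mean absolute intensity change between consecutive frames
    diffs = np.abs(np.diff(frames.astype(np.float32), axis=0)).mean(axis=(1, 2))
    # smooth with a short moving average comparable to a micro-expression's duration
    kernel = np.ones(win) / win
    smoothed = np.convolve(diffs, kernel, mode="same")
    active = smoothed > thresh
    # collapse consecutive "active" frames into intervals
    candidates, start = [], None
    for i, flag in enumerate(active):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            candidates.append((start, i))
            start = None
    if start is not None:
        candidates.append((start, len(active)))
    return candidates
```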