WECIA Workshop 2023

Program

Workshop Program

All invited talks, oral presentations and the panel discussion will take place in Room E06 of the Paris Convention Center.
ICCV virtual conference link.
The workshop will take place on October 3, 2023, from 8:30 AM to 1:00 PM (Paris time).

08:30

Opening, 08:30 AM - 08:40 AM

Opening Remarks

08:40

Keynote, 08:40 AM - 09:15 AM

Socially Aware Language Understanding in the Age of LLMs

Diyi Yang

Abstract. Despite the remarkable performance of NLP these days, current systems often ignore the social part of language, e.g., who says it, with what goals, and with what social implications, all of which severely limits the functionality of these applications and the growth of the field. This talk will discuss some of our recent efforts towards socially aware NLP via two studies. The first part looks at how large language models work in the context of social understanding, and how human-AI collaboration can reduce costs and improve the efficiency of social science research. The second part looks at linguistic prejudice with a participatory design approach to develop dialect-inclusive language tools and adaptation techniques for low-resource languages and dialects. I conclude by discussing the challenges and hidden risks of building socially aware NLP systems in the age of LLMs.

Bio. Diyi Yang is an assistant professor in the Computer Science Department at Stanford, affiliated with the Stanford NLP Group, Stanford HCI Group, Stanford AI Lab (SAIL), and Stanford Human-Centered Artificial Intelligence (HAI). She is interested in Computational Social Science and Natural Language Processing. Yang's research goal is to better understand human communication in social context and build socially aware language technologies to support human-human and human-computer interaction.

09:15

Keynote, 09:15 AM - 09:50 AM

FAIR labs at Meta AI

Xinlei Chen

Abstract. Self-supervised learning is becoming an inevitable machine learning approach to learn rich representations from unlabeled data. However, in the era of large language models, it is unclear why we should pursue a paradigm that purely learns from visual observations without text. In this talk, I would like to discuss a motivation in relation to emotionally and culturally intelligent AI, and navigate through two key methods as effective pretext tasks for visual representation learning.

Bio. Xinlei Chen is a Research Scientist in FAIR labs at Meta AI, Menlo Park, CA. His current research interest is in pre-training, especially pre-training of visual representations with self-supervision and/or multi-modality. Chen completed his PhD at the Language Technology Institute, Carnegie Mellon University, while working at the Robotics Institute. He graduated with a bachelor's degree in computer science from Zhejiang University, China.

09:50

Keynote, 09:50 AM - 10:20 AM

The Role of Emotions in Fake News and LLMs

Preslav Nakov

Abstract. We will discuss the gradual shift in terminology and focus when it comes to factuality: from "fake news" (focus on factuality), to "disinformation" (factuality + malicious intent), and finally to "infodemic" (focus on harm). Subsequently, there has been a gradual shift in research attention from factuality to trying to understand intent. One direction we proposed in this respect was to detect the use of specific propaganda techniques in text, e.g., appeal to emotions, fear, prejudices, logical fallacies, etc. Another direction has been to understand framing, e.g., COVID-19 can be discussed from a health, an economic, a political, or a legal perspective, among others. Yet another direction has been to better understand the type of text, e.g., objective news reporting vs. opinion piece vs. satire. All these have to do with emotions and were featured in our recent SemEval-2023 Task 3, which further promotes multilinguality, covering English, French, Georgian, German, Greek, Italian, Polish, Russian, and Spanish. Yet another important aspect is multimodality, as Internet memes are much more influential than simple text. We will discuss our work on analyzing memes in terms of propaganda (SemEval-2021 Task 6), harmfulness, harm's target identification, role-labeling in terms of who is portrayed as a hero/villain/victim (CLEF'2024 shared task), and generating natural text explanations for the latter. Finally, we will discuss how emotions play a crucial role when training and aligning large language models (LLMs). In particular, we will share our experience from implementing guardrails for Jais, the best language model for Arabic, which is also on par with LLaMA 2 for English.

Bio. Preslav Nakov is Professor at Mohamed bin Zayed University of Artificial Intelligence. Previously, he was Principal Scientist at the Qatar Computing Research Institute, HBKU, where he led the Tanbih mega-project, developed in collaboration with MIT, which aims to limit the impact of "fake news", propaganda and media bias by making users aware of what they are reading, thus promoting media literacy and critical thinking. He received his PhD degree in Computer Science from the University of California at Berkeley, supported by a Fulbright grant. He is Chair-Elect of the Association for Computational Linguistics (ACL), Secretary of ACL SIGSLAV, and Secretary of the Truth and Trust Online board of trustees. Formerly, he was PC chair of ACL 2022, and President of ACL SIGLEX. He is also a member of the editorial board of several journals including Computational Linguistics, TACL, ACM TOIS, IEEE TASL, IEEE TAC, CS&L, NLE, AI Communications, and Frontiers in AI. He authored a Morgan & Claypool book on Semantic Relations between Nominals, two books on computer algorithms, and 250+ research papers. He received a Best Paper Award at ACM WebSci'2022, a Best Long Paper Award at CIKM'2020, a Best Demo Paper Award (Honorable Mention) at ACL'2020, a Best Task Paper Award (Honorable Mention) at SemEval'2020, a Best Poster Award at SocInfo'2019, and the Young Researcher Award at RANLP'2011. He was also the first to receive the Bulgarian President's John Atanasoff award, named after the inventor of the first automatic electronic digital computer. His research was featured by over 100 news outlets, including Forbes, Boston Globe, Aljazeera, DefenseOne, Business Insider, MIT Technology Review, Science Daily, Popular Science, Fast Company, The Register, WIRED, and Engadget, among others.

10:20

Break, 10:20 AM - 10:50 AM

Break and Poster Session

10:50

Keynote, 10:50 AM - 11:20 AM

Responsible AI in the Context of Culture

Ricardo Baeza-Yates

Abstract. In the first part, to set the stage, we cover irresponsible AI: (1) discrimination (e.g., facial recognition); (2) phrenology (e.g., predicting emotions); (3) bad uses of generative AI (e.g., mental health); and (4) limitations (e.g., human incompetence). These examples do have a personal bias but set the context for the second part, where we address three challenges: (1) principles & governance, (2) regulation and (3) our cognitive biases. We finish by discussing our responsible AI initiatives and the near future.

Bio. Ricardo Baeza-Yates is Director of Research at the Institute for Experiential AI of Northeastern University. Before, he was VP of Research at Yahoo Labs, based in Barcelona, Spain, and later in Sunnyvale, California, from 2006 to 2016. He is co-author of the best-selling textbook Modern Information Retrieval, published by Addison-Wesley in 1999 and 2011 (2nd ed), which won the ASIST 2012 Book of the Year award. From 2002 to 2004 he was elected to the Board of Governors of the IEEE Computer Society and between 2012 and 2016 was elected to the ACM Council. In 2009 he was named ACM Fellow and in 2011 IEEE Fellow, among other awards and distinctions. He obtained a Ph.D. in CS from the University of Waterloo, Canada, in 1989, and his areas of expertise are web search and data mining, information retrieval, bias in AI, data science and algorithms in general.

11:20

Oral presentation, 11:20 AM - 11:50 AM

A Data-Driven, Multicultural Taxonomy of Expressive Behavior

Alan Cowen

Abstract. Guided by semantic space theory, large-scale computational studies have advanced our understanding of the structure and function of emotional behavior. I will integrate findings from large-scale, multicultural experimental studies of facial expression (N=19,656), vocal bursts (N=12,616), speech prosody (N=30,109), multimodal reactions (N=8,056), and more. These studies combine methods from psychology and computer science to yield new insights into what expressive behaviors signal, how they are perceived, and how they interact with language to shape social interaction. Using machine learning to extract cross-cultural dimensions of behavior while minimizing biases due to demographics and context, we arrive at objective measures of the structural dimensions that make up human expression. Expressions are consistently found to be high-dimensional and blended, with their meaning across cultures being efficiently conceptualized in terms of a wide range of specific emotion concepts. In more applied studies, joint embeddings of expressions and language predict real-life outcomes more accurately than language alone. Altogether, these findings support and extend previous findings from the semantic space theory approach, generating a new atlas of expressive behavior. This new taxonomy departs from models such as the basic six and affective circumplex, suggesting a new way forward for training AI models that understand and learn from human expression.

Bio. Alan Cowen is an emotion scientist leading a lab and technology company called Hume AI and a former researcher at U.C. Berkeley and Google. His research uses computational methods to address how emotional behaviors can be evoked, conceptualized, parameterized, predicted, annotated, and translated, and how they influence our social interactions and bring meaning to our lives. He has investigated responses to millions of emotional stimuli by tens of thousands of participants around the world, analyzed brain representations of emotion using fMRI and intracranial electrophysiology, investigated ancient sculptures, and used deep learning to measure facial expressions in millions of naturalistic videos from around the world.

11:50

Panel Discussion, 11:50 AM - 12:25 PM

Panel Discussion

12:25

Announcement, 12:25 PM - 12:30 PM

Awards & Winner Announcement

12:30

Oral presentation, 12:30 PM - 12:40 PM

EmoSet: A Large-scale Visual Emotion Dataset with Rich Attributes

Jingyuan Yang

Project page

12:40

Oral presentation, 12:40 PM - 12:50 PM

Self-regulating Prompts: Foundational Model Adaptation without Forgetting

Salman Khan

Project page

12:50

Oral presentation, 12:50 PM - 1:00 PM

Socratis: Are large multimodal models emotionally aware?

Arijit Ray

Project page

Challenge

🏆 ArtELingo Challenge

Our proposed challenge aims to improve emotion understanding and its connection to vision and language through two tasks: Emotion Recognition and Affective Image Captioning. Emotion Recognition requires models capable of predicting emotions from textual and visual input, while Affective Image Captioning involves generating captions dependent on both an image and an emotion. We use a multi-lingual, cross-cultural dataset, ArtELingo, to encourage solutions that model cultural differences in addition to emotions. Our ultimate objective is to stimulate the development of emotion-culture-aware AI systems within the community.

📦 Dataset

ArtELingo is a large-scale multi-lingual dataset of affective experiences triggered by visual artworks and explained in captions.
It contains 1.4 million annotations in Arabic, English, and Chinese. Emotional experiences are inherently subjective, and existing datasets collected in English from Western cultures fail to represent cultural diversity. ArtELingo captures both similarities and differences between cultures, providing a unique opportunity to explore cross-cultural emotional understanding. While many similarities exist, so do intriguing cultural differences. Our challenge aims to inspire the development of models that can better capture these differences.

📨 Submission and metrics

We use the eval.ai platform to host the challenge.
As mentioned above, we propose two tasks: Emotion Recognition and Affective Image Captioning.

For Emotion Recognition, we use the F1 metric on a private test set that is not available to the contestants.
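As a rough illustration of the F1 metric, the sketch below computes a macro-averaged F1 score over emotion labels. The averaging mode is an assumption (the challenge description does not say whether F1 is macro, micro, or weighted), and the function name `macro_f1` and the example labels are hypothetical, not taken from the official evaluation code.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label seen in y_true or y_pred.

    Assumption: macro averaging; the challenge page only says "the F1 metric".
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, guess in zip(y_true, y_pred):
        if truth == guess:
            tp[truth] += 1
        else:
            fp[guess] += 1  # predicted this class, but it was wrong
            fn[truth] += 1  # missed the true class

    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the denominator is 0
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical gold and predicted emotion labels for four artworks
gold = ["awe", "fear", "awe", "sadness"]
pred = ["awe", "awe", "awe", "sadness"]
score = macro_f1(gold, pred)
print(f"macro F1 = {score:.3f}")
```

Macro averaging weights every emotion equally, so a rare emotion that is never predicted correctly lowers the score as much as a frequent one would.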

For the Affective Image Captioning task, we use standard captioning metrics such as BLEU, METEOR, ROUGE, and CIDEr.

In addition, we use an emotion alignment metric that reflects whether the generated caption expresses a particular emotion or not.
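One plausible way to read the emotion alignment metric is as the fraction of generated captions whose detected emotion matches the emotion the caption was conditioned on. The sketch below follows that reading. It is an assumption, not the official scoring code: `emotion_alignment`, `toy_classifier`, and the keyword table are hypothetical stand-ins, and a real evaluation would use a trained text emotion classifier in place of the keyword lookup.

```python
def emotion_alignment(captions, target_emotions, classify):
    """Fraction of captions whose classified emotion equals the target emotion.

    `classify` is any callable mapping a caption string to an emotion label.
    """
    if len(captions) != len(target_emotions):
        raise ValueError("captions and target_emotions must have the same length")
    if not captions:
        return 0.0
    hits = sum(classify(c) == e for c, e in zip(captions, target_emotions))
    return hits / len(captions)


# Hypothetical keyword-based stand-in for a trained emotion classifier
KEYWORDS = {
    "fear": ("scary", "dark"),
    "contentment": ("calm", "peaceful"),
}


def toy_classifier(caption):
    text = caption.lower()
    for emotion, words in KEYWORDS.items():
        if any(word in text for word in words):
            return emotion
    return "other"


captions = [
    "The dark forest feels scary",
    "A calm lake at sunset",
    "Bright colors everywhere",
]
targets = ["fear", "contentment", "awe"]
alignment = emotion_alignment(captions, targets, toy_classifier)
print(f"emotion alignment = {alignment:.3f}")
```

Passing the classifier as an argument keeps the metric itself independent of whichever emotion model is used to judge the captions.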

For both tasks, we evaluate cross-cultural understanding via a held-out set from each language. Moreover, we evaluate an extreme cross-cultural scenario in which we provide a few shots of annotations collected in Spanish and evaluate the submitted models on a private Spanish test set.

Eval.ai challenge links:

Emotion Recognition Challenge: 👉 🌐 https://eval.ai/web/challenges/challenge-page/2106/overview
Affective Image Captioning Challenge: 👉 🌐 https://eval.ai/web/challenges/challenge-page/2104/overview

Papers

🎯 Accepted papers

We are pleased to present the list of accepted papers for the WECIA workshop at ICCV 2023. These papers showcase the latest research and notable progress made in the field of emotionally-aware AI.

✨
Socratis: Are large multimodal models emotionally aware?

Arijit Ray, Boston University

✨
EmoSet: A Large-scale Visual Emotion Dataset with Rich Attributes

Jingyuan Yang, Shenzhen University

✨
Self-regulating Prompts: Foundational Model Adaptation without Forgetting

Muhammad Uzair Khattak, MBZUAI

EmoTalk: Speech-Driven Emotional Disentanglement for 3D Face Animation

Ziqiao Peng, Renmin University of China

Efficient Emotional Adaptation for Audio-Driven Talking-Head Generation

Yuan Gan, Zhejiang University

Affective Image Filter: Reflecting Emotions from Text to Images

Shuchen Weng, Peking University

Human Preference Score: Better Aligning Text-to-Image Models with Human Preference

Xiaoshi Wu, The Chinese University of Hong Kong

Breaking Common Sense: WHOOPS! A Vision-and-Language Benchmark of Synthetic and Compositional Images

Nitzan Bitton-Guetta, Deutsche Telekom Innovation Labs

We extend our sincere appreciation to all the participants for their valuable contributions to the workshop. The WECIA workshop owes its success to the passion and expertise of these researchers. We cordially invite everyone to explore this curated collection of accepted papers.

Dates

📅 Timeline

Archival track:

Event	Date
Paper submission deadline	August 1, 2023
Notification to authors	August 7, 2023
Camera-ready deadline	August 21, 2023

Non-archival track:

Event	Date
Paper submission deadline	August 20, 2023
Notification to authors	August 30, 2023
Camera-ready deadline	September 14, 2023

Challenge:

Event	Date
Release of training/validation data	May 1, 2023
Challenge online	July 12, 2023
Submission deadline	September 15, 2023
Winners announcement	October 2, 2023

ICCV 2023 - Paris, France

WECIA: 1st Workshop on Emotionally and Culturally Intelligent AI

Workshop

Presentation

Welcome to the Workshop on Emotionally and Culturally Intelligent AI (WECIA)!
