Competitions

  1. The Visual Speech Analysis Competition (VSAC) 
    Authors – Shuang Yang, Shiguang Shan, Xilin Chen (Institute of Computing Technology, Chinese Academy of Sciences).
    Abstract – Advances in computer vision have notably improved visual speech analysis (VSA), yet most research relies on high-quality datasets, overlooking real-world challenges such as low resolution, large pose variations, poor lighting, and blurred captures. To address this gap, our proposed competition aims to draw researchers’ attention to these real-world scenarios by introducing two distinct tracks: Track 1 – Visual Speech Recognition under Low-Quality Conditions, and Track 2 – Visual Keyword Spotting. Both tracks focus on Chinese speech analysis to introduce linguistic diversity into the field of VSA and to further advance research on non-Latin-script languages.

    Specifically, Track 1 explores the practical application of Visual Speech Recognition (VSR) under challenging real-world conditions. It provides participants with both validation and testing datasets, enabling them to rigorously evaluate and compare the performance of their algorithms. Built on our newly curated dataset of real-world challenges, this track assesses the robustness and effectiveness of current VSR technologies on data frequently marred by low resolution, significant pose variations, and suboptimal image quality. Track 2 centers on visual keyword spotting based on visual speech analysis and offers training and validation datasets. It simulates a scenario in which full recognition is not feasible and the system must instead identify key words or phrases, which is particularly relevant for applications where partial understanding can still provide valuable insight.

    We will provide baselines for both tracks. By establishing a common ground for benchmarking, we hope to encourage the research community to push the boundaries of VSR toward more robust and practical solutions that can be effectively deployed in diverse real-world settings.

  2. Common AI Innovator Framework for Automated Research in Face and Gesture Recognition
    Authors – Iddo Drori (Boston University and Columbia University), Avi Shporer (MIT), Nakul Verma (Columbia University), Madeleine Udell (Stanford University).
    Abstract – We propose a live competition at FG 2025 that challenges participants to use a common AI innovator system, one that automatically generates novel research ideas, writes research plans, runs experiments, and produces research papers on topics in face and gesture recognition. This competition will push the boundaries of AI research co-pilots and share with the community an AI system that augments and accelerates the scientific process. Participants will interact with the AI innovator by voice to define their research problems in face and gesture recognition; the AI innovator will then generate ideas, verify their novelty, feasibility, and significance, write research plans, implement selected ideas, and produce and review research papers. The competition will evaluate the quality and impact of AI-generated research and the effectiveness of human-AI collaboration in scientific research.

Competition chairs

Marwa Mahmoud, University of Glasgow, UK

Umur Ciftci, Binghamton University, USA