2nd MUCG @ ECCV 2026

Multimodal Large Language Models for Unified Comprehension and Generation

🗓️ September 8–12, 2026 📍 Malmö, Sweden

Welcome to the ECCV 2026 Workshop on Multimodal Large Language Models for Unified Comprehension and Generation. This workshop aims to consolidate emerging research on unified multimodal intelligence, with a focus on systems that understand, generate, and act across modalities within a coherent framework.

Recent multimodal AI has evolved from vision-language understanding toward broader multimodal foundation models that span image, video, audio, and 3D and that generate content as well as understand it. At the same time, the community is moving from modular pipelines toward unified tokenization, hybrid autoregressive-diffusion designs, shared representations, and synergistic learning between understanding and generation.

Our goal is to bring together researchers from academia and industry to discuss architectural designs, tokenization strategies, training objectives, evaluation protocols, and practical challenges for building general-purpose multimodal systems.

Topics and Themes

We welcome technical, position, and perspective papers related to unified multimodal modeling. Topics of interest include, but are not limited to:

01. Unified Multimodal Understanding

Captioning, VQA, retrieval, grounding, segmentation, reasoning, visual document understanding, long-video understanding, and cross-modal knowledge extraction.

02. Unified Multimodal Content Generation

Text-to-image/video, image-to-image, controllable generation, sequential image generation, visual editing, and cross-modal generative modeling.

03. Unified MLLM Understanding and Generation

Architectures and objectives that jointly model comprehension and generation, including autoregressive, diffusion, flow-based, and hybrid paradigms.

04. Synergistic Learning

How understanding and generation, or different modalities and tasks, can mutually enhance each other during pre-training, instruction tuning, and post-training.

05. Benchmarking and Evaluation

Evaluation protocols and benchmarks for unified multimodal systems, including realism, controllability, generalization, reasoning, and fair comparison.

06. Broader Directions

Reinforcement learning for unified modeling, multimodal chain-of-thought, joint vision-language/audio/3D models, cross-task transfer, and efficient training.

Submission Instructions

The workshop accepts two submission tracks to encourage broader participation:

Regular Archival Submissions
Up to 14 pages

Regular archival papers may be up to 14 pages long, including figures and tables, and should use Springer LNCS formatting. Additional pages containing only cited references are permitted. Submissions must follow the ECCV 2026 template and be uploaded through OpenReview. The review process is double-blind and managed by the workshop organizers and program committee. Conflicts of interest will be handled according to the ECCV 2026 Submission Policy.

Non-Archival Submissions
At least 4 pages

Non-archival submissions are intended for work that is already published or that authors prefer not to include in the proceedings. Eligible papers include those already peer-reviewed at major CV/ML conferences or journals. Previously reviewed or published papers will not be re-reviewed; acceptance is based on topical fit and poster-board availability. Unpublished submissions in this track will undergo double-blind review, following the same review process as regular archival submissions. Authors should submit the paper or a link to the paper via the workshop's OpenReview page.

All accepted papers will be presented as posters.
Best Paper Awards will be selected based on reviewer scores and committee evaluation.
Submissions are handled through the official ECCV 2026 MUCG OpenReview site.

Important Dates (AoE)

Archival submission deadline: Jul 01
Archival reviews due: Jul 14
Archival notification: Jul 18
Non-archival submission deadline: Jul 25
Non-archival notification: Aug 7
Camera-ready deadline for all accepted papers: Aug 8–12
Metadata deadline: Aug 20
Workshop (during ECCV 2026): Sep 8–12

Schedule (tentative)

The workshop will be hybrid, supporting both onsite and online participation. The program consists of invited keynote talks, oral and poster presentations of accepted papers, and a closing panel on future directions in unified multimodal intelligence.

Time | Schedule | Speaker
08:50 – 09:00 | Introduction and Opening Remarks | Organizers
09:00 – 09:30 | Keynote Talk 1 | TBD
09:30 – 10:00 | Keynote Talk 2 | TBD
10:00 – 10:40 | Oral Presentations (Session 1) | Selected authors
10:40 – 11:00 | Coffee Break
11:00 – 11:30 | Keynote Talk 3 | TBD
11:30 – 12:00 | Keynote Talk 4 | TBD
12:00 – 12:30 | Poster Session 1 (Interactive) + Virtual Gallery | Accepted authors
12:30 – 13:30 | Lunch Break
13:30 – 14:00 | Keynote Talk 5 | TBD
14:00 – 14:30 | Keynote Talk 6 | TBD
14:30 – 15:20 | Poster Session 2 (Interactive) + Virtual Gallery | Accepted authors
15:20 – 15:50 | Coffee Break
15:50 – 16:20 | Keynote Talk 7 | TBD
16:20 – 17:20 | Panel Discussion | TBD
17:20 – 17:30 | Closing Remarks + Best Paper Award | Organizers

Invited Speakers

Mike Z. Shou, National University of Singapore
Mohit Bansal, University of North Carolina Chapel Hill
Chen Change Loy, Nanyang Technological University
Rogerio Schmidt Feris, MIT-IBM Watson AI Lab
Jaemin Cho, Allen Institute for AI / Johns Hopkins University
Yuren Cong, Meta

Organizing Committee

Shengqiong Wu, Organizer, University of Oxford
Jinheng Xie, Organizer, National University of Singapore
Haozhe Liu, Organizer, KAUST
Tian Ye, Organizer, HKUST(GZ) & NVIDIA Research
Yanguang Zhao, Executor, National University of Singapore
Enze Xie, Organizer, NVIDIA Research / MIT HAN Lab
Sivan Doveh, Organizer, Stanford University
Anna Kukleva, Organizer, Max Planck Institute for Informatics
Jehanzeb Mirza, Organizer, MIT CSAIL
Mingchen Zhuge, Organizer, KAUST
Hao Fei, Organizer, University of Oxford

Diversity and Inclusion

We are committed to promoting diversity and inclusion across the organizing committee, invited speakers, program committee, accepted papers, and audience. We will proactively encourage submissions from underrepresented groups and institutions, provide inclusive wording in the call for papers, support mentoring opportunities during poster sessions and panels, and ensure virtual access for remote or resource-constrained participants.

Contact

Questions? Please contact the workshop organizers (shengqiongwu@gmail.com).