InfoComm China

NVIDIA Forum - Keynote Speech: Technologies for Multimodal Video Understanding and Generation

15 Apr 2026
ConvergeTech Stage @ Hall C
NVIDIA Forum
Multimodal video understanding and generation technology integrates multi-source information such as text, vision, and audio to understand video content and to enable the creation of multimodal derivative works. Built on deep learning and cross-modal representation learning, it supports the generation of high-quality video, including AI-generated content. Related research is widely applied in video understanding, video content generation (highlights, secondary creation, and derivatives), and 3D media, providing core technical support for next-generation media and intelligent video creation.
Speakers
Shaohui Jiao, Head of 3D Video - Volcengine