Title: Recent results on learning with diffusion models
Speaker: Prof. Ming-Hsuan Yang, University of California, Merced.
Diffusion models have been successfully applied to text-to-image generation with state-of-the-art performance. In this talk, I will discuss how these models can be used for low-level vision tasks and 3D scene understanding. First, I will present our findings on exploiting features from diffusion models and transformers for zero-shot semantic correspondence and other applications. Next, I will describe how we exploit diffusion models as an effective prior for dense prediction tasks, such as surface normal estimation, depth estimation, and segmentation. I will then discuss how diffusion models can facilitate articulated 3D reconstruction, 3D scene generation, and novel view synthesis. Time permitting, I will present additional results on fine-grained text-to-image generation and pixel-wise visual grounding with large multimodal models.
Ming-Hsuan Yang is a Professor at UC Merced and a Research Scientist with Google. He received the Google Faculty Award in 2009 and the CAREER Award from the National Science Foundation in 2012. Yang received paper awards at UIST 2017, CVPR 2018, and ACCV 2018, as well as the Longuet-Higgins Prize at CVPR 2023. He is an Associate Editor-in-Chief of PAMI, Editor-in-Chief of CVIU, and an Associate Editor of IJCV. He served as Program Chair for ACCV 2014 and ICCV 2019, and as Senior Area Chair/Area Chair for CVPR, ICCV, ECCV, NeurIPS, ICLR, ICML, IJCAI, and AAAI. Yang is a Fellow of the IEEE and the ACM.