Yung-Yu Chuang

Deep Learning for Computational Photography
In addition to dealing with computer vision problems, deep learning has also shown great promise for computational photography problems. We have applied deep learning to two different categories of computational photography problems: improving image quality and learning to see the unseen.

Segmentation and Saliency Detection with Fewer Labels
Segmentation and saliency detection are core problems in computer vision. Although deep learning has made significant progress in solving these problems, the most successful methods require labeled training data. Collecting these labels requires a great deal of time and effort. To address these issues, we propose a number of methods that require fewer labels.

Compact Deep Models
Deep models have recently gained a lot of attention for their effectiveness in many computer vision problems. Although effective, these models are often very large, making them difficult to deploy on edge and embedded devices. We have proposed a few compact deep models that take up only a few megabytes of memory and are suitable for those devices.

Variance Reduction for Monte Carlo Rendering
Monte Carlo (MC) integration is a common technique for rendering images with distributed effects such as antialiasing, depth of field, motion blur, and global illumination. It simulates a variety of sophisticated light transport paths in a unified manner, estimating pixel values from stochastic point samples in the integral domain. Despite its generality and simplicity, however, the MC approach converges slowly and produces noisy images because of large variance. There are several categories of methods for variance reduction in MC rendering, such as filtering, importance sampling, caching, and interpolation. Among them, we have developed methods following the paradigms of importance sampling, filtering, and adaptive sampling.
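
The effect of importance sampling on variance can be illustrated with a one-dimensional toy integral. The following is a minimal sketch, not one of the rendering methods described above: the integrand f(x) = 3x^2 on [0, 1] (exact integral 1), the sampling density p(x) = 2x, and all function names are assumptions chosen only for illustration.

```python
import math
import random
import statistics


def f(x):
    """Toy integrand (an assumption for illustration); its integral over [0, 1] is 1."""
    return 3.0 * x * x


def estimate_uniform(n, rng):
    """Plain MC: average f over n uniform samples; returns (mean, sample variance)."""
    samples = [f(rng.random()) for _ in range(n)]
    return statistics.fmean(samples), statistics.variance(samples)


def estimate_importance(n, rng):
    """Importance sampling with density p(x) = 2x, which roughly follows f.

    Samples are drawn by inverting the CDF P(x) = x^2, and each sample is
    weighted by f(x) / p(x) so the estimator stays unbiased.
    """
    samples = []
    for _ in range(n):
        u = 1.0 - rng.random()      # uniform in (0, 1], avoids x == 0
        x = math.sqrt(u)            # inverse CDF of p(x) = 2x
        samples.append(f(x) / (2.0 * x))
    return statistics.fmean(samples), statistics.variance(samples)


if __name__ == "__main__":
    rng = random.Random(0)
    print("uniform    (mean, variance):", estimate_uniform(20000, rng))
    print("importance (mean, variance):", estimate_importance(20000, rng))
```

Both estimators target the same integral, but the per-sample variance is lower when the sampling density resembles the integrand. In this toy setting the analytic per-sample variances are 0.8 for uniform sampling and 0.125 for the importance-sampled estimator.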

Stereoscopic Media Editing
The wide deployment of stereoscopic displays and binocular cameras has made the capture and display of stereoscopic media easy, and has led to more such media. Unfortunately, in spite of the rapid progress in stereoscopic hardware, little progress has been made on the software side, especially for consumer 3D media processing. We attempt to fill the gap by providing users with easy-to-use 3D media editing tools, specifically for display adaptation, image resizing, image cloning, photo slideshows, and video stabilization.

Computational Photography and Videography
As cameras (both still cameras and video camcorders) get cheaper, they become more and more accessible. As a result, computational photography and videography have recently become active research topics. We investigate ways to enhance photos and videos taken by users. For example, we have worked on high dynamic range (HDR) image reconstruction from hand-held cameras, for which we formulated HDR reconstruction and deblurring in a unified framework. Our early work in this field is "animating pictures," which turns a still image into a video texture. We have also proposed a new method for video stabilization.

Video Concept Detection
The broad availability of videos has led to a general and strong demand for effective and efficient video retrieval. Concept-based video retrieval is a promising approach. However, its success greatly depends on the accuracy of concept detection. We proposed several frameworks to take advantage of both contextual correlation and temporal dependency to improve accuracy for video concept detection from user-provided annotations and/or detector-generated predictions.

Multiple Kernel Clustering
Most clustering algorithms assume a single affinity matrix recording pairwise similarity between data points. However, in many applications, there could be multiple potentially useful features and therefore multiple affinity matrices. We applied multiple kernel learning to fuzzy clustering and spectral clustering so that they simultaneously seek an optimal combination of affinity matrices and optimize the clustering results.
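
The idea of clustering on a combination of affinity matrices can be sketched as follows. This is a simplified illustration under stated assumptions, not the proposed method: the combination weights are fixed by hand rather than optimized jointly with the clustering, the grouping step is a basic two-way spectral split computed by power iteration, and the helper names (`block_affinity`, `combine_affinities`, `spectral_bipartition`) and the toy data are hypothetical.

```python
import math


def block_affinity(within, across, groups):
    """Hypothetical toy affinity: `within` for same-group pairs, `across` otherwise."""
    n = len(groups)
    return [[1.0 if i == j else (within if groups[i] == groups[j] else across)
             for j in range(n)] for i in range(n)]


def combine_affinities(matrices, weights):
    """Weighted sum of affinity matrices (weights assumed non-negative, summing to 1)."""
    n = len(matrices[0])
    return [[sum(w * m[i][j] for w, m in zip(weights, matrices))
             for j in range(n)] for i in range(n)]


def spectral_bipartition(affinity, iterations=500):
    """Split items into two clusters using the second eigenvector of the
    normalized affinity M = D^{-1/2} A D^{-1/2}, found by deflated power iteration."""
    n = len(affinity)
    degree = [sum(row) for row in affinity]
    m = [[affinity[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(n)]
         for i in range(n)]
    # The leading eigenvector of M is proportional to sqrt(degree).
    norm = math.sqrt(sum(degree))
    lead = [math.sqrt(d) / norm for d in degree]
    v = [float(i + 1) for i in range(n)]        # deterministic start vector
    for _ in range(iterations):
        # Remove the component along the leading eigenvector.
        dot = sum(a * b for a, b in zip(v, lead))
        v = [a - dot * b for a, b in zip(v, lead)]
        # Apply (M + I) / 2 so all eigenvalues are non-negative.
        v = [0.5 * (sum(m[i][j] * v[j] for j in range(n)) + v[i]) for i in range(n)]
        length = math.sqrt(sum(a * a for a in v))
        v = [a / length for a in v]
    # The sign of each (degree-normalized) entry gives the cluster label.
    return [0 if v[i] / math.sqrt(degree[i]) < 0 else 1 for i in range(n)]


if __name__ == "__main__":
    groups = [0, 0, 0, 1, 1, 1]
    feature_a = block_affinity(0.9, 0.1, groups)   # informative feature
    feature_b = block_affinity(0.6, 0.4, groups)   # weaker feature
    combined = combine_affinities([feature_a, feature_b], [0.5, 0.5])
    print(spectral_bipartition(combined))
```

Which of the two labels each group receives depends on the sign of the eigenvector, so only the partition itself is meaningful. In a multiple kernel setting the weights passed to `combine_affinities` would be learned together with the clustering rather than fixed.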

Content Analysis
We have been working on semantic content analysis for media such as photos and videos. Such analysis enables more efficient and effective utilization of massive media collections, for example in search, organization, summarization, and retrieval. In particular, we have proposed methods for automatically segmenting wedding videos into semantic segments, for exploring geotags to build a geographic ontology, and for a couple of image tagging and reranking applications. In addition, we have collected ROI benchmarks and compared the performance of ROI algorithms.

Multimedia Applications
We investigate better ways for users to utilize multimedia data, including better methods for presenting photos in a spatial order, making slideshows with 3D navigation-style transitions, and displaying photos and music with synchronized emotions.

Real-Time Rendering
Photorealism and interactivity have historically been two conflicting goals of computer graphics. However, thanks to the tremendous advances in graphics processing units, real-time photorealistic rendering has become feasible. In this project, we explore ways to improve the visual quality of rendering under real-time constraints by utilizing the computational power of modern GPUs.

Digital Matting and Compositing
We developed new models and methods for digital matting and compositing, which are crucial for making visual effects. Conventional matting methods either require a carefully controlled studio setup or demand intensive user interaction. Our Bayesian matting algorithm is capable of pulling an alpha matte of a complex silhouette from a natural image with limited user interaction. We also extended this method to handle videos by interpolating user-drawn keyframes using optical flow. The traditional compositing approach models only color blending effects such as anti-aliasing, motion blur, and transparency, but not reflections, refractions, and shadows. Environment matting enables the capture of reflections and refractions with limited accuracy at a modest cost. We further improved this process by developing a more sophisticated sampling scheme to capture environment mattes with higher accuracy, and by developing a technique that requires fewer images to allow for real-time capture. Shadows are yet another effect that the traditional method fails to model correctly. We introduced a novel process called shadow matting and compositing to acquire the photometric and geometric properties of the background for making realistic shadow composites.
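
The color blending that traditional compositing models is usually written as C = alpha * F + (1 - alpha) * B for a foreground color F, background color B, and opacity alpha. The sketch below illustrates that equation per pixel, together with a least-squares recovery of alpha when F and B are already known. It is an illustration only and not the Bayesian matting algorithm, which must also estimate F and B; the function names are hypothetical.

```python
def composite(alpha, fg, bg):
    """Blend one pixel: C = alpha * F + (1 - alpha) * B, per color channel."""
    return tuple(alpha * f + (1.0 - alpha) * b for f, b in zip(fg, bg))


def estimate_alpha(color, fg, bg):
    """Recover alpha for one pixel when F and B are known (an assumption here).

    Projects (C - B) onto (F - B), which is the least-squares solution of the
    compositing equation, and clamps the result to [0, 1].
    """
    diff = [f - b for f, b in zip(fg, bg)]
    denom = sum(d * d for d in diff)
    if denom == 0.0:
        raise ValueError("foreground and background colors must differ")
    alpha = sum((c - b) * d for c, b, d in zip(color, bg, diff)) / denom
    return min(1.0, max(0.0, alpha))


if __name__ == "__main__":
    fg = (1.0, 0.5, 0.0)   # hypothetical foreground color (RGB in [0, 1])
    bg = (0.0, 0.0, 1.0)   # hypothetical background color
    pixel = composite(0.25, fg, bg)
    print(pixel, estimate_alpha(pixel, fg, bg))
```

Matting is the inverse problem of `composite`: given only the observed color, it has to infer alpha and the foreground, which is why additional priors or user input are needed in practice.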

Windows Snoop
Linux Snoop
Mac Snoop
A zoom-in tool for magnifying a portion of your screen on Windows, Linux, and macOS.
