[2017-11-17] Dr. Tsung-Yi Lin, Google Brain, "Learning Multiscale Visual Representations for Dense Object Detection"

Title: Learning Multiscale Visual Representations for Dense Object Detection
Date: 2017-11-17  02:20-03:30
Location: R103, CSIE
Speaker: Dr. Tsung-Yi Lin, Google Brain
Hosted by: Prof. Yung-Yu Chuang


In this talk, I will introduce our work on object detection. Our goal is to design a simple, fast, and accurate convolutional object detector. We identify two major challenges: efficient feature computation and learning under an extreme class imbalance. To address the first issue, we introduce the Feature Pyramid Network (FPN), a generic multiscale feature extractor. The FPN computes multiscale feature representations in one pass from a single-scale input image. The idea is to introduce minimal top-down and lateral connections to efficiently upscale feature representations in ConvNets. Second, we introduce Focal Loss, a novel loss function that addresses the extreme class imbalance by focusing learning on hard examples. With FPN and Focal Loss, we design a one-stage convolutional object detector, RetinaNet, that achieves state-of-the-art performance in both speed and accuracy. In addition to algorithm design, I will also discuss recent developments and future research directions in object detection by summarizing the latest COCO competitions at ICCV 2017.
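The Focal Loss idea mentioned in the abstract can be illustrated with a minimal sketch. This assumes the standard binary formulation FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t), with the commonly cited defaults gamma = 2.0 and alpha = 0.25; the function name and signature below are illustrative, not from the talk.

```python
import math

def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss for a single prediction (illustrative sketch).

    p     : predicted probability of the foreground class, in (0, 1)
    y     : ground-truth label, 1 for foreground, 0 for background
    gamma : focusing parameter; larger values down-weight easy examples more
    alpha : class-balance weight for the foreground class
    """
    p_t = p if y == 1 else 1.0 - p                # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha    # class-balance weight
    # The (1 - p_t)**gamma modulating factor shrinks the loss of
    # well-classified (easy) examples, so training gradients are
    # dominated by hard examples rather than the huge number of
    # easy background anchors.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
```

With gamma = 0 and alpha = 1 this reduces to ordinary cross-entropy; increasing gamma makes a confidently classified background anchor (e.g. p = 0.01, y = 0) contribute almost nothing, which is how a one-stage detector can tolerate the foreground/background imbalance.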


Tsung-Yi Lin is a Research Scientist at Google Brain. He received his B.S. in EE from National Taiwan University in 2009 and his Ph.D. in ECE from Cornell University in 2017. He interned at Microsoft Research and Facebook AI Research during his Ph.D. studies. His research interests include Computer Vision and Machine Learning. In particular, Tsung-Yi is interested in learning visual representations for image matching, object recognition, and instance segmentation. He is the recipient of the Marr Prize student paper award at ICCV 2017.

Last modified: 2017-11-08 11:21 AM
