Winston H. Hsu 
RM 512, CSIE Building,
1 Roosevelt Road, Section 4,
Taipei 106, Taiwan
TEL: +886-2-3366-4888 ext. 512
Dr. Winston Hsu is an active researcher dedicated to novel algorithms
and systems for large-scale image/video retrieval, social media mining,
visual recognition, and machine intelligence – both in mobile and
cloud-based platforms. He is keen on turning advanced research into
business deliverables through academia-industry collaborations and
co-founding startups. As of November 2015, his research publications
have 3000+ citations on Google Scholar (see his self-maintained
list and DBLP).
He is a Professor in the Department
of Computer Science and Information Engineering and the Graduate
Institute of Networking and Multimedia, National Taiwan
University. He is affiliated with the
Communication and Multimedia Lab (CMLab) and is the founder of the MiRA
(Multimedia Indexing, Retrieval, and Analysis)
Research Group. He was a Visiting Researcher at Microsoft
Research Redmond (Summer 2014). He received his Ph.D. degree
(2007) from Columbia University, New York, under the supervision of Prof.
Shih-Fu Chang. Before that, he worked for years as a founding
engineer at the multimedia software company CyberLink
Corp., where he served in Engineer, Project Leader, and R&D roles.
He has been recognized with technical awards and best paper awards in
the multimedia research community (e.g., ACM Multimedia 2006 Best Paper
Runner-Up, First Prize in ACM Multimedia Grand Challenge 2011, First
Place in MSR-Bing Image Retrieval Challenge 2013, and the ACM Multimedia
2013/2014 Grand Challenge Multimodal Award). He is on the
Editorial Board of IEEE Multimedia Magazine and is an Associate Editor
for IEEE Transactions on Multimedia. He has served on the organizing committees
for ACM Multimedia (2010/2012/2013/2016), ICMR 2015/2016, ICME 2009, PCM
2013/2014, and MMM 2013, and on the technical committees of major
conferences (ACM Multimedia 2009-2015, SIGIR 2011/2013/2015, WWW 2011,
ICCV 2015, CVPR 2016, CIKM 2011, ICME 2006-2016, IJCAI 2015). He is a
Senior Member of both the IEEE and the ACM. He was awarded the 2011 Ta-You
Wu Memorial Award (吳大猷先生紀念獎), a prestigious national recognition for
young researchers, and the 2013 National Outstanding IT Elite Award
(資訊月傑出資訊人才獎), for his contributions to advanced research and industry
collaborations. See more details in his CV.
- Openings for 1 postdoc and graduate students (MS/PhD) on Deep
Convolutional Neural Network projects for large-scale image/video analysis and
retrieval, sponsored by leading industry partners (Microsoft
Research, MediaTek, HTC, etc.).
- Gave a very successful presentation to Jen-Hsun
Huang, NVIDIA CEO/Co-founder,
thanks to the many exciting deep-learning-based research projects in
large-scale image/video analytics and machine intelligence that our group
has been working on. We expect closer collaboration with NVIDIA
in the coming months.
- Prof. Hsu will serve as ICME
2017 Technical Program Co-Chair, representing the IEEE
Multimedia Systems & Applications Technical Committee (MSA TC).
- Congratulations to Yin-Hsi Kuo
for her work, "De-Hashing:
Server-Side Context-Aware Feature Reconstruction for Mobile Visual
Search," accepted for
IEEE TCSVT. It proposes a novel method, "De-hashing," a
bandwidth- and mobile-compliant hashing method that leverages the
strengths of both the mobile device and the cloud.
- Congratulations to our PhD alumnus, Yen-Liang
Lin, who will join GE
Global Research, New York, USA, as a Research Scientist working on
core computer vision problems.
- Congratulations to our PhD alumnus, Liang-Chi Hsieh,
who will join the IBM
Spark Technology Center, USA, as a Data Scientist.
- Congratulations to the group for having 3 papers accepted for ICASSP
2016: two on deep convolutional neural networks for image
recognition and latent topic discovery, and one on a novel approach to activity
recognition by Wi-Fi.
- Congratulations to Wen-Yu Lee, whose PhD thesis was accepted for the AAAI
2016 Doctoral Consortium (AAAI is a top AI conference); she will give a
40-min oral presentation.
- Our work, "Who are the Devils Wearing Prada in New York City," on fashion mining
in social media, led by Dr. Kuan-Ting Chen with researchers from the Univ.
of Rochester, was widely reported in social media and by the New
York Post - Fashion
show styles really do translate into everyday trends. It was
also reported in MIT
Technology Review, under the title "How
Machine Vision Is About to Change the Fashion World," and Science
Lin and Yu-Chuan
Su for winning IPPR 2015 Best Thesis Awards, for PhD and Master
- Prof. Hsu serves as Associate Editor (AE) for IEEE
Transactions on Multimedia (since July 2015).
report for our industry-academia collaborative projects (with Quanta
Research Institute) — 先
- Prof. Hsu will be a guest speaker on "Deep
Learning Methods for Image Classification" in Prof. Lee's Deep
Learning course, and will be happy to share our findings from the past
2.5 years and our recent results.
- We have submitted three Deep
Convolutional Neural Network (DCNN) works to top conferences,
focusing on (1) large-scale training data acquisition, (2) a new DCNN
learning architecture, and (3) the cross-domain problems commonly
observed in social media (e.g., photos on Instagram). We also have an
early work on DCNN
for video event detection by addressing the training data
- Congratulations on another work, "Scalable Object Detection by
Filter Compression with Regularized Sparse Coding," accepted for CVPR
- Congratulations to the group for winning the ACM
Multimedia 2014 Grand Challenge Multimodal Award.
- Congratulations to our first work (full paper), "Visually
Interpreting Names as Demographic Attributes by Exploiting
Click-Through Data," accepted for AAAI
- Congratulations to Yan-Ying
Chen and YungChien Hsu for winning the
資訊學會碩博士最佳論文獎 (IICM Thesis Award).
- Congratulations to Yan-Ying
Chen for winning the
中華民國人工智慧學會最佳博士論文佳作獎 (TAAI PhD Thesis Award).
- Prof. Hsu is a Visiting Researcher at Microsoft
Research Redmond (USA) for summer 2014, working on
large-scale image recognition by leveraging social media for automatic
training in a deep neural network framework.
- Prof. Hsu will serve as the Area Chair (AC) for ICME
- Congratulations to the group for having three short papers and one
Grand Challenge finalist accepted for ACM
Multimedia 2014, Orlando, Florida.
- Published the Cross-Age
Celebrity Dataset (CACD), containing 163,446
images of 2,000
celebrities with year labels (2004-2013), collected from the
- Congratulations to the group for having two papers accepted for ECCV
2014, Zurich, Switzerland.
- Congratulations to Liang-Chi Hsieh
(PhD candidate) and Hsin-Fu Huang (master's student) for winning the FIRST
PRIZE in the "Etu Hadoop Cluster Deployment Competition," with a cash
award of NTD 150,000.
- Prof. Hsu was just awarded "ACM Senior Member"
- A SIGIR 2014 full
paper, in collaboration with Microsoft
Research Asia, was accepted.
- Congratulations to our PhD student Yan-Ying
Chen, who after successful job interviews has decided to
join FX Palo Alto Laboratory (CA,
USA) as a researcher, and to master's student Kuan-Yu
Chu for joining Microsoft
Bing, Bellevue, USA.
- Congratulations to Yan-Ying
Chen and Yin-Hsi
Kuo for winning the ACM travel grant for ACM
ICMR 2014, Glasgow, UK.
- Congratulations to the group for having a cool/promising demo paper
accepted for WWW 2014, Seoul, Korea.
- Co-organizer (with Akisato Kimura and Shin'ichi Satoh) of the
"Cross-media Analysis for Social Multimedia (CASM) Workshop," ICME
- Congratulations to the group for having three papers (1 full and 2
short) accepted for ACM ICMR 2014, Glasgow, UK.
- Congratulations to Prof. Hsu for receiving the 2013
National Outstanding IT Elite Award (傑出資訊人才獎).
- Congratulations to the group for winning the ACM
Multimedia 2013 Grand Challenge Multimodal Award.
- Congratulations to Kuan-Ting Chen for being awarded an NSC
Grant for PhD Research Abroad (「補助博士生赴國外研究」) in 2014; she
will make a 12-month research visit to Dr.
Jiebo Luo's group for social media mining in ultra-large-scale
- Congratulations to the group for the paper, "3D
Sub-Query Expansion for Improving Sketch-based Multi-View Image
Retrieval," accepted for ICCV
- Congratulations to the group for winning the FIRST
PRIZE in the MSR-Bing
Image Retrieval Challenge 2013, hosted by Microsoft Research
(Redmond) and Bing, with an award of USD 10,000.
The prevalence of capture devices and the advent of media-sharing
services have drastically increased the volume of image and video
collections. This creates a strong need for effective multimedia
analysis and efficient multimedia retrieval. We have been devoted to
large-scale photo and video retrieval, knowledge discovery through
large-scale social media mining, and distributed computation for multimedia
analysis and retrieval, and have devised novel multimedia applications in
Recently, we have focused on Deep
Convolutional Neural Network (CNN) methods for large-scale image/video
analysis and retrieval projects, sponsored by leading industry
partners (Microsoft Research, MediaTek, HTC, Intel, IBM TJ Watson,
etc.). We investigate effective CNN methods for image/video
applications such as image search, video event detection, face
recognition, facial/clothing attributes, image/video captioning,
and surveillance. In particular, we aim for effective
training strategies that leverage web-scale data and for scalable learning
algorithms suited to mobile or IoT devices.
While we have seen very exciting applications in large-scale
multimedia analysis and retrieval, we also identify several core
challenges and address each in turn:
- Semantic gap – bridging low-level visual
features and high-level semantic needs by proposing semantic ontologies and
learning semantic representations automatically;
- User gap – helping users issue proper queries to
satisfy their intentions in different application scenarios and mobile
devices (e.g., by sketch, attribute, snapshot, speech, and
- Volume gap – learning from ultra-large-scale photo and
video collections via distributed learning
and efficient high-dimensional indexing (e.g., hash-based methods) for
real-time query response over big
photo/video data; balancing the technical strengths of
mobile devices and cloud servers.
- Privacy – conducting privacy-preserving mining over
large-scale photos and videos and addressing the privacy concerns that arise when
sharing sensitive photos and videos (e.g., family albums) in the
- Industry needs – besides rigorous algorithms for
academic research, we also investigate practical methods that meet
likely needs in industrial development (e.g., technology transfer).
facial/clothing attribute detection/search
web-scale indexing & feature learning
large-scale photo/video recognition
web-scale facial image retrieval
mobile visual recognition
multimodal deep neural network
social media mining
big data analytics and visualization
consumer photo retrieval
- We are always recruiting graduate students
(master's or PhD) for exciting projects
of both great theoretical interest and strong industrial demand.
Our students have delivered brilliant results. See our news
or publication sections for more
- We are recruiting students to leverage a cloud
system (with 1000+ cores) for thousand-scale
recognizers and deep learning for photo/video recognition and
- Several image/video/audio retrieval projects are also good for undergraduate
Why do research projects in the MiRA group? (For CSIE undergraduates
and PhD students.) Some quick demos:
- Effective and efficient product query by mobile phone
- Interactive video (Question and Answering)
- Realtime image retrieval over million-scale image collections
- Million-scale image graph construction and clustering by cloud
computation (MapReduce over 18 Hadoop servers).
- Flora Project: Flower Retrieval and Social Network
- Multimodal Fusion for Mobile Annotation – GPS, Compass, or Camera
- Indexing million-scale faces in sub-second response
- Real-time video retrieval and mobile question and answering
- Consumer photo search by facial attributes and face layout on touch
devices (US patent pending #: 13/599,127)
+ FIRST PRIZE in ACM Multimedia Grand Challenge 2011
- Sketch-based image retrieval on mobile devices using compact hash
- Me-link: Link Me to the Media – Fusing Audio and Visual Cues for
Robust and Efficient Mobile Media Interaction
- Image Annotation and Retrieval by Integrating Voice Label and Visual
- Snapshot of Twitter activity around the world: real-time
estimation from sampled tweets; a toy demonstration for our
ongoing project in Big Data