Winston H. Hsu 
RM 512, CSIE Building,
1 Roosevelt Road, Section 4,
Taipei 106, Taiwan
TEL: +886-2-3366-4888 ext. 512
Office Hour: 10am ~ 12pm, Tuesday, @R512
Dr. Winston Hsu is an active researcher dedicated to novel algorithms
and systems for large-scale image/video retrieval, social media mining/
recommendation, multimedia analytics, and mobile and cloud-based
multimedia applications. He has been recognized with technical awards
and best paper awards in the multimedia research community. He is keen
on turning advanced research into business deliverables. Other major
contributions include mobile visual search solutions, image/video
reranking, image classification models (e.g., Columbia374), and
large-scale multimedia ontologies (e.g., LSCOM). See more details in his
Citations. He has been a keynote speaker and has taught several
state-of-the-art tutorials at top conferences. He serves on the
technical committees of top conferences (e.g., ACM Multimedia, SIGIR,
and WWW) and on the organizing committees of ACM Multimedia and ICME.
He is on the Editorial Boards of IEEE Multimedia Magazine and JMM.
He is an Associate Professor in CSIE, National Taiwan University, and
the founder of the MiRA (Multimedia indexing, Retrieval, and Analysis)
Research Group in the Communication and Multimedia Lab (CMLab). He
received his Ph.D. degree (2007) from Columbia University, New York,
under the supervision of Shih-Fu Chang. Before that, he worked for years
as a founding engineer at the multimedia software company CyberLink
Corp., where he served as Engineer, Project Leader, and R&D Manager. He
is experienced in software product/service shipping processes. He is a
Senior Member of both ACM and IEEE. He recently received the 2011
Ta-You Wu Memorial Award, a prestigious national recognition for young
researchers, and the 2013 National Outstanding IT Elite Award
(資訊月傑出資訊人才獎), a governmental recognition of contributions to
industry-academia collaboration. See more details in his CV.
- Prof. Hsu will serve as an Associate Editor (AE) for IEEE
Transactions on Multimedia (since July 2015).
- Invited to be a guest speaker on "Deep
Learning Methods for Image Classification" in Prof. Lee's Deep
Learning course. We will be happy to share our findings from the past
2.5 years and our recent results.
- We have submitted three Deep
Convolutional Neural Network (DCNN) works to top conferences,
focusing on (1) large-scale training data acquisition, (2) a new DCNN
learning architecture, and (3) the cross-domain problems commonly
observed in social media (i.e., photos on Instagram). We also have
early work on DCNNs
for video event detection by addressing the training data
- Congratulations on another work, "Scalable Object Detection by Filter
Compression with Regularized Sparse Coding," accepted for CVPR
- Congratulations to the group for winning the ACM
Multimedia 2014 Grand Challenge Multimodal Award.
- Congratulations on our first work (full paper), "Visually Interpreting
Names as Demographic Attributes by Exploiting Click-Through Data,"
accepted for
Chen and YungChien Hsu for winning the IICM Thesis Award
(資訊學會碩博士最佳論文獎)
Chen for winning the TAAI PhD Thesis Award
(中華民國人工智慧學會最佳博士論文佳作獎)
- Prof. Hsu is a Visiting Researcher at Microsoft
Research Redmond (USA) for summer 2014, working on
large-scale image recognition that leverages social media for automatic
training in a deep neural network framework.
- Prof. Hsu will serve as an Area Chair (AC) for ICME
- Congratulations to the group for having three short papers and one
Grand Challenge finalist accepted for ACM
Multimedia 2014, Orlando, Florida.
- Cross-Age Celebrity Dataset (CACD), containing 163,446
images of 2,000
celebrities annotated with year (2004-2013), collected from the
- Congratulations to the group for having two papers accepted for ECCV
2014, Zurich, Switzerland.
- Congratulations to Liang-Chi Hsieh (PhD candidate)
and Hsin-Fu Huang (master's student) for winning the FIRST
PRIZE in the "Etu Hadoop Cluster Deployment Competition," with a cash
award of NTD 150,000.
- Prof. Hsu was just named an ACM Senior Member.
- A SIGIR 2014 FULL
paper, in collaboration with Microsoft
Research Asia, was accepted.
- Congratulations to our PhD student Yan-Ying
Chen, who after successful job interviews has decided to
join FX Palo Alto Laboratory (CA,
USA) as a researcher, and to master's student Kuan-Yu
Chu for joining Microsoft
Bing, Bellevue, USA.
- Congratulations to Yan-Ying
Chen and Yin-Hsi
Kuo for winning the ACM travel grant for ACM
ICMR 2014, Glasgow, UK.
- Congratulations to the group for having a cool/promising demo paper
accepted for WWW 2014, Seoul, Korea.
- Co-organizer (with Akisato Kimura and Shin'ichi Satoh) of the
"Cross-media Analysis for Social Multimedia (CASM) Workshop," ICME
- Congratulations to the group for having three papers (1 full and 2
short) accepted for ACM ICMR 2014,
- Congratulations to Prof. Hsu for receiving the 2013
National Outstanding IT Elite Award (傑出資訊人才獎)
- Congratulations to the group for winning the ACM
Multimedia 2013 Grand Challenge Multimodal Award.
- Congratulations to Kuan-Ting Chen for being awarded an NSC
Grant for PhD Research Abroad (「補助博士生赴國外研究」) in 2014; she
will make a 12-month research visit to Dr.
Jiebo Luo's group to work on social media mining in ultra-large-scale
- Congratulations to the group for the paper, 3D
Sub-Query Expansion for Improving Sketch-based Multi-View Image
Retrieval, accepted for ICCV
- Congratulations to the group for winning the FIRST
PRIZE in the MSR-Bing
Image Retrieval Challenge 2013, hosted by Microsoft Research
(Redmond) and Bing, with an award of USD 10,000.
The prevalence of capture devices and the advent of media-sharing
services have drastically increased the volume of image and video
collections. This creates a strong need for effective multimedia
analysis and efficient multimedia retrieval. We have been devoted to
large-scale photo and video retrieval, knowledge discovery from
large-scale social media mining, and distributed computation for
multimedia analysis and retrieval, and have devised novel multimedia
applications in
Beyond the exciting applications we have observed in large-scale
multimedia analysis and retrieval, we have identified several core
challenges and address each of them in turn:
- Semantic gap – bridging low-level visual
features and semantic needs by proposing semantic ontologies and
learning semantic representations automatically;
- User gap – helping users issue proper queries that
satisfy their intentions across different application scenarios and on
mobile devices (e.g., by sketch, attribute, snapshot, speech, and
- Volume gap – learning from ultra-large-scale photo and
video collections through distributed learning
and efficient high-dimensional indexing (e.g., hash-based methods) for
real-time query response over big
photo/video data; balancing the technical strengths of
mobile devices and cloud servers.
- Privacy – conducting privacy-preserving mining over
large-scale photos and videos and addressing the privacy concerns that
arise when sharing sensitive photos and videos (e.g., family albums) in
the
- Industry needs – beyond rigorous algorithms for
academic research, we also investigate practical methods that meet
likely needs in industrial development (e.g., technology transfer).
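
As a rough illustration of the hash-based indexing mentioned under the
volume gap, the sketch below uses random-hyperplane hashing, one common
locality-sensitive hashing scheme. It is not the group's actual method:
the names HashIndex, make_hyperplanes, hash_code, and hamming, and all
parameter values, are hypothetical and chosen only for this example.

```python
import random
from collections import defaultdict


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw num_bits random Gaussian hyperplanes in a dim-dimensional space."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def hash_code(vector, hyperplanes):
    """Pack the sign of each hyperplane projection into one integer code."""
    bits = 0
    for plane in hyperplanes:
        dot = sum(p * v for p, v in zip(plane, vector))
        bits = (bits << 1) | (1 if dot >= 0 else 0)
    return bits


def hamming(a, b):
    """Number of differing bits between two integer codes."""
    return bin(a ^ b).count("1")


class HashIndex:
    """Toy index: feature vectors with similar directions share nearby codes."""

    def __init__(self, num_bits, dim, seed=0):
        self.hyperplanes = make_hyperplanes(num_bits, dim, seed)
        self.buckets = defaultdict(list)  # code -> list of item ids

    def add(self, item_id, vector):
        self.buckets[hash_code(vector, self.hyperplanes)].append(item_id)

    def query(self, vector, max_distance=1):
        """Return ids whose bucket code is within max_distance bits of the query.

        For simplicity this scans every bucket; a real system would probe
        only neighboring codes or use several hash tables.
        """
        code = hash_code(vector, self.hyperplanes)
        hits = []
        for bucket_code, ids in self.buckets.items():
            if hamming(code, bucket_code) <= max_distance:
                hits.extend(ids)
        return hits
```

A caller would add one feature vector per photo with add(photo_id,
feature) and retrieve candidates with query(feature); the short binary
codes are what keep comparisons cheap, at the cost of approximate
results.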
facial/clothing attribute detection/search
web-scale indexing & feature learning
large-scale photo/video recognition
web-scale facial image retrieval
mobile visual recognition
multimodal deep neural network
social media mining
big data analytics and visualization
consumer photo retrieval
- We are always recruiting graduate students
(master's or PhD) for exciting projects
of both great theoretical interest and strong industrial demand.
Students have delivered brilliant results. See our news
or publication sections for more
- We are recruiting students to leverage a cloud
system (with 1000+ cores) for thousand-scale
recognizers and deep learning for photo/video recognition and
- Several image/video/audio retrieval projects are also good for undergraduate
Why do a research project in the MiRA group? For CSIE undergraduates
and PhD students. Some quick demos:
- Effective and efficient product query by mobile phone
- Interactive video (Question and Answering)
- Realtime image retrieval over million-scale image collections
- Million-scale image graph construction and clustering by cloud
computation (MapReduce over 18 Hadoop servers).
- Flora Project: Flower Retrieval and Social Network
- Multimodal Fusion for Mobile Annotation – GPS, Compass, or Camera
- Indexing million-scale faces in sub-second response
- Real-time video retrieval and mobile question and answering
- Consumer photo search by facial attributes and face layout on touch
devices (US patent pending #: 13/599,127)
+ FIRST PRIZE in ACM Multimedia Grand Challenge 2011
- Sketch-based image retrieval on mobile devices using compact hash
Link Me to the Media – Fusing Audio and Visual Cues for Robust and
Efficient Mobile Media Interaction
Image Annotation and Retrieval by Integrating Voice Label and Visual
snapshot of Twitter activity around the world -- real-time
estimation from sampled tweets; a toy demonstration for our
ongoing project in Big Data