Image Thumbnailing
As the number of digital images grows explosively, techniques for viewing and
browsing images become increasingly important. Image thumbnailing selects and
crops the most significant portions of an image to form a thumbnail, giving
viewers a better impression of the image and making it more efficient to work
with large image collections.
There are three main approaches to image thumbnailing: saliency analysis, face
detection, and gaze detection. All of them detect the Region of Interest (ROI)
of an image, but with different algorithms and assumptions. For automatic
thumbnailing, saliency analysis and face detection are more suitable, since the
gaze-based approach requires additional work to record the actual gaze of
viewers. However, both the saliency-based and face-based approaches rest on
specific assumptions, so their performance depends on the properties of the
images.
Since most face detection algorithms are weak at detecting non-frontal faces,
we propose a multi-view face detection algorithm that detects five different
views of faces, including frontal, half-profile, and profile ones.
Moreover, we integrate saliency analysis and face detection into a fused
approach for image thumbnailing. It combines the advantages of the
saliency-based and face-based approaches, and is able to produce appropriate
and pleasing thumbnails that take photography aesthetics into account.
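To make the idea of fusing the two cues concrete, here is a minimal sketch in Python. It is not the algorithm from the paper: it assumes a saliency map with values in [0, 1] and a list of face boxes are already available, boosts the importance of face regions by a hypothetical `face_weight`, and crops to the bounding box of all pixels above a hypothetical relative threshold. It ignores photography aesthetics entirely.

```python
import numpy as np

def fused_crop(saliency, faces, face_weight=1.0, rel_threshold=0.5):
    """Return a crop box (top, left, bottom, right), end-exclusive.

    saliency: 2-D array of non-negative saliency values (assumed in [0, 1]).
    faces: list of (top, left, bottom, right) face boxes, end-exclusive.
    face_weight, rel_threshold: illustrative parameters, not from the paper.
    """
    importance = np.asarray(saliency, dtype=float).copy()
    # Raise the importance of every pixel that lies inside a detected face.
    for top, left, bottom, right in faces:
        importance[top:bottom, left:right] += face_weight
    # Nothing stands out: fall back to the whole image.
    if importance.max() <= 0:
        height, width = importance.shape
        return (0, 0, height, width)
    # Keep pixels whose importance is at least a fraction of the peak.
    mask = importance >= rel_threshold * importance.max()
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    return (int(rows[0]), int(cols[0]), int(rows[-1]) + 1, int(cols[-1]) + 1)
```

With this sketch, a face box outside the salient region enlarges the crop so that both the salient area and the face are kept.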
Chih-Chau Ma, Yi-Hsuan Yang, Winston Hsu, “Image Thumbnailing via Multi-View
Face Detection and Saliency Analysis”.
Abstract:
Image thumbnailing is an essential technique to
efficiently visualize large-scale consumer photos or image search results. In
this paper, we propose a hierarchical multi-view face detection algorithm for
image thumbnailing, and a unified framework which aggregates both the low-level
saliency and high-level semantics through detected faces to augment thumbnail
generation with the consideration of photography aesthetics. The approach can
produce more informative and pleasant thumbnails in comparison with the prior
work with limited representations. We also investigate the effective feature
sets and informative feature dimensions for discriminative multi-view face
detection.
Data Sets
This page provides the image data used for image thumbnailing and classifier
training. The data were collected from the web via search engines. The images
are in JPG format, with various sizes for image thumbnailing and a fixed size
of 100×100 pixels for classifier training. The multi-view scheme considers five
views of faces: left (left 90°), lefthalf (left 45°), frontal (0°), righthalf
(right 45°), and right (right 90°). Angles are measured from the image viewer's
point of view.
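The five view labels can be written down as a small lookup table. The sketch below is only an illustration: the sign convention (negative for left, positive for right) and the helper `nearest_view` are assumptions made here, not part of the data set.

```python
# View labels and their nominal angles in degrees, from the viewer's side.
# Sign convention (negative = left) is an assumption for this sketch.
VIEW_ANGLES = {
    "left": -90,
    "lefthalf": -45,
    "frontal": 0,
    "righthalf": 45,
    "right": 90,
}

def nearest_view(angle_deg):
    """Return the view label whose nominal angle is closest to angle_deg."""
    return min(VIEW_ANGLES, key=lambda name: abs(VIEW_ANGLES[name] - angle_deg))
```

For example, a head turned 60° to the left would be assigned to `lefthalf`, since 45° is closer to 60° than 90° is.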
facedata.zip (9.2M)
The images in this archive are used for training and validating two
classifiers: the face/non-face classifier and the view classifier. Most of the
face images were collected from the web, and the head areas in the images were
cropped manually. The non-face images, as well as some of the positive face
images used for the face classifier, were selected from the false and true
positives of a face detector, respectively. All of these images are 100×100
JPG files.
face/: images for the face classifier.
    posface/: positive face images for training, with five different views (left, lefthalf, frontal, righthalf, right).
    nonface/: non-face images for training as the negative data, which are the false positives of a face detector.
    validation/: validation images for the face classifier.
view/: images for the view classifier.
    left/, lefthalf/, frontal/, righthalf/, right/: face images in five different views for training.
    validation/: validation images of the view classifier.
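As a rough illustration of how the view-classifier training folders might be read, the following Python sketch indexes the JPG files per view. It assumes the archive has been extracted to a directory that contains view/ with the five view folders directly inside it, and that the image files sit directly in those folders. The function name `index_view_data` is hypothetical, and the actual layout of the archive should be checked before relying on it.

```python
from pathlib import Path

VIEWS = ("left", "lefthalf", "frontal", "righthalf", "right")
JPG_SUFFIXES = {".jpg", ".jpeg"}

def index_view_data(root):
    """Map each view name to a sorted list of JPG paths in <root>/view/<name>/.

    Views whose folder is missing map to an empty list.
    """
    index = {}
    for view in VIEWS:
        folder = Path(root) / "view" / view
        if folder.is_dir():
            files = [p for p in folder.iterdir()
                     if p.is_file() and p.suffix.lower() in JPG_SUFFIXES]
        else:
            files = []
        index[view] = sorted(files)
    return index
```

The resulting dictionary gives the number of training images per view, which is a quick way to check that the five classes are reasonably balanced.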