International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 06 Issue: 07 | July 2019 www.irjet.net p-ISSN: 2395-0072
© 2019, IRJET | Impact Factor value: 7.211 | ISO 9001:2008 Certified Journal | Page 975
Prediction of Facial Attribute Without Landmark Information
Jayashree S. Somani1, Mrs. V. L. Kolhe2
1Student, Dept. of Computer Engineering, Dr. D. Y. Patil College of Engineering, Akurdi, Pune, Maharashtra, India
2Professor, Dept. of Computer Engineering, Dr. D. Y. Patil College of Engineering, Akurdi, Pune, Maharashtra, India
---------------------------------------------------------------------***----------------------------------------------------------------------
Abstract – Predicting face attributes is very challenging
because of the complex variation in faces. Most existing systems
for face attribute prediction lose precision because they
depend on landmark detection and canonical positions. Landmark
detectors give unsatisfactory results on unconstrained faces
with large pose angles, occlusion or blurriness, which reduces
the performance of attribute prediction. The system explained
here uses the AFFAIR method for face attribute prediction. By
learning a global transformation and an adaptive part
localization, the system finds the most relevant part of the
face for predicting a specific attribute. AFFAIR learns a
good transformation for each input face image directly for
attribute prediction, with greater accuracy.
Key Words: AFFAIR, Landmark Detector, Attribute
Prediction, Global Transformation, Part localization.
1. INTRODUCTION
Describing people by attributes such as gender, age, hair
style and clothing style is an important problem for many
applications in face analysis. Previously, the
Detection-Alignment-Recognition (DAR) method [1] was used to
predict face attributes from images with the help of landmark
detection. Because this method depends on the quality of
landmark detection, its performance drops on unconstrained
faces. Some authors use deep Convolutional Neural Networks
(CNN) [4][6][1] pre-trained for face recognition tasks to
obtain a global face representation, and build binary linear
SVM classifiers on that representation to classify face
attributes. Previous methods are either global methods, which
use the entire object for representation learning and
attribute prediction without part information, or local
methods, which extract features from relevant regions or
parts for attribute prediction.
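The global baseline described above (fixed CNN features plus one binary linear SVM per attribute) can be sketched as follows. This is a minimal illustration, not the cited authors' code: the 2-D feature vectors are made-up stand-ins for CNN embeddings, and `train_linear_svm` and `predict` are hypothetical helpers that minimize the L2-regularized hinge loss with plain subgradient descent.

```python
def train_linear_svm(features, labels, epochs=200, lr=0.1, reg=0.01):
    """Fit w, b for a binary linear SVM (labels are +1 / -1) by
    subgradient descent on the L2-regularized hinge loss."""
    dim = len(features[0])
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            # The regularizer shrinks w slightly on every step.
            w = [wi * (1.0 - lr * reg) for wi in w]
            if margin < 1.0:
                # Hinge loss is active: push the boundary away from x.
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


def predict(w, b, x):
    """Return +1 if the attribute is predicted present, else -1."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1


# Toy stand-ins for global face representations (assumed values,
# not real CNN outputs).
toy_features = [[2.0, 1.5], [1.8, 2.2], [-1.5, -2.0], [-2.2, -1.0]]
toy_labels = [1, 1, -1, -1]  # e.g. attribute present / absent

w, b = train_linear_svm(toy_features, toy_labels)
predictions = [predict(w, b, x) for x in toy_features]
```

One such classifier would be trained per attribute on the same global representation, which is why this baseline has no way to focus on a local part of the face.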
This study investigates an alternative to facial landmark
detection and alignment, which are complicated tasks. We
study the AFFAIR method [1], which aggregates global and
local methods for attribute prediction. AFFAIR provides an
end-to-end learning framework for finding the appropriate
transformation. The global transformation network takes the
detected face and generates transformation parameters
tailored for the original input face. The part LocNet focuses
on the part of the face most relevant for attribute
prediction. Finally, by integrating both global and local
representations, the facial attributes can be predicted with
no requirement of external landmark points for alignment.
2. LITERATURE SURVEY
Jianshu Li et al. [1] published Landmark Free Face Attribute
Prediction, in which they proposed the AFFAIR method, which
learns a global transformation and part localization and then
aggregates both global and local features for robust
attribute prediction.
Jianlong Fu et al. [2] published Look Closer to See Better:
Recurrent Attention Convolutional Neural Network for
Fine-Grained Image Recognition, in which they proposed a
recurrent attention convolutional neural network (RA-CNN) for
recognizing fine-grained categories such as bird species.
Yang Zhong et al. [3] published Face attribute prediction
using off-the-shelf CNN features, in which they explored an
alternative way of employing the power of deep
representations from CNNs. Combined with conventional face
localization techniques, they used off-the-shelf
architectures trained for face recognition to build facial
descriptors.
Max Ehrlich et al. [4] published Facial Attributes
Classification Using Multi-task Representation Learning, in
which they proposed a model that learns a shared feature
representation well suited for multiple attribute
classification. This joint feature representation enables
interaction between the different tasks. To learn it, the
authors used a Restricted Boltzmann Machine (RBM) based
model, enhanced with a factored multi-task component to
become a Multi-Task Restricted Boltzmann Machine (MT-RBM).
Hamdi Dibeklioğlu et al. [5] published Combining Facial
Dynamics with Appearance for Age Estimation, in which they
proposed a method that extracts and uses dynamic features of
a person's smile for age estimation.
Kaiming He et al. [6] published Deep Residual Learning for
Image Recognition, in which they presented a residual
learning framework to ease the training of networks that are
substantially deeper than those used previously. The depth
of representations is of central importance for many visual
recognition tasks. Using these deeper representations, they
obtain a 28% relative improvement on the COCO object
detection dataset.
Andreas Steger et al. [7] published Failure Detection for
Facial Landmark Detectors, in which they studied two recent
top-performing facial landmark detectors and devised
confidence models for their outputs. Validated on standard
benchmarks (AFLW, HELEN), the approach correctly identifies
more than 40% of the failures in the outputs of the landmark
detectors.
Yue Wu et al. [8] published Robust Facial Landmark
Detection under Significant Head Poses and Occlusion, in
which they proposed a unified robust cascade regression
framework that can handle both images with severe occlusion
and images with large head poses. They introduced a
supervised regression method that gradually updates the
landmark visibility probabilities in each iteration to
achieve robustness.
3. SYSTEM DESIGN
Human face attribute estimation has received a large amount
of attention in visual recognition research in recent years,
because face attributes provide a wide variety of salient
information such as age and gender. Fig.1 shows the
architecture of the system. The system works as follows:
Fig.1: System Architecture
• Input image: The user uploads images using a live camera.
The face is detected in the input image and given to the
AFFAIR framework as input. The AFFAIR method aggregates
global and local methods for attribute prediction: it
integrates both global and local representations and
requires no external landmark points for alignment.
• lAndmark Free Face AttrIbute pRediction method: The
lAndmark Free Face AttrIbute pRediction (AFFAIR) method
learns a global transformation and part localizations on
each input face end-to-end. It has two main components:
  • Global Transformation Network: The global
transformation transforms the input face into one with an
optimized configuration for further representation
learning and attribute prediction. It consists of two
parts: the Global TransNet and the Global Representation
Learning Net. The global TransNet is followed by the
global representation learning net, which considers all
the facial attributes simultaneously. The global TransNet
and the global representation learning net are trained
end-to-end for attribute prediction.
The global TransNet in AFFAIR takes the detected face as
input and produces a set of optimized transformation
parameters $T_g$ tailored for the original input face for
attribute representation learning. The transformation
relates the globally transformed face image to the input
image via
(1)
Because of this mapping, the globally transformed face
image is obtained pixel by pixel. The pixel value at
location $(x_i^{g}, y_i^{g})$ of the transformed image is
obtained by bilinearly interpolating the pixel values on
the input face image centered at
$(x_i^{input}, y_i^{input})$.
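The pixel-by-pixel construction can be sketched as below. This is an illustration under stated assumptions rather than the AFFAIR implementation: $T_g$ is taken to be a 2x3 affine matrix that maps homogeneous output coordinates $(x_i^{g}, y_i^{g}, 1)$ to input coordinates $(x_i^{input}, y_i^{input})$ in pixel units, out-of-range locations are clamped to the image border, and the names `bilinear_sample` and `warp_affine` are hypothetical.

```python
import math


def bilinear_sample(image, x, y):
    """Bilinearly interpolate a grayscale image (list of rows) at the
    real-valued location (x, y); coordinates are clamped to the image."""
    height, width = len(image), len(image[0])
    x = min(max(x, 0.0), width - 1.0)
    y = min(max(y, 0.0), height - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
    ax, ay = x - x0, y - y0  # fractional offsets inside the pixel cell
    top = (1 - ax) * image[y0][x0] + ax * image[y0][x1]
    bottom = (1 - ax) * image[y1][x0] + ax * image[y1][x1]
    return (1 - ay) * top + ay * bottom


def warp_affine(image, t_g, out_height, out_width):
    """Build the transformed image pixel by pixel: each output pixel
    (x_g, y_g) is mapped by the 2x3 matrix t_g to a source location
    (x_in, y_in) on the input image, which is sampled bilinearly."""
    output = []
    for y_g in range(out_height):
        row = []
        for x_g in range(out_width):
            x_in = t_g[0][0] * x_g + t_g[0][1] * y_g + t_g[0][2]
            y_in = t_g[1][0] * x_g + t_g[1][1] * y_g + t_g[1][2]
            row.append(bilinear_sample(image, x_in, y_in))
        output.append(row)
    return output


# A 3x3 toy "face" and two hypothetical transformations.
face = [[0.0, 10.0, 20.0],
        [30.0, 40.0, 50.0],
        [60.0, 70.0, 80.0]]
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
half_pixel_shift = [[1.0, 0.0, 0.5], [0.0, 1.0, 0.5]]

same = warp_affine(face, identity, 3, 3)            # unchanged copy
shifted = warp_affine(face, half_pixel_shift, 2, 2)  # averages 2x2 blocks
```

Because every output pixel is a weighted average of input pixels, the warp is differentiable with respect to the entries of the matrix, which is what allows a transformation network of this kind to be trained end-to-end.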
The globally transformed face image is then given as input
to the Global Representation Learning Net, which considers
all the facial attributes simultaneously. The output face
from the global TransNet is denoted by
$f_{\theta_g^{T}}(I)$. The global face representation
learning net, parameterized by $\theta_g^{F}$, then maps
the transformed image from the raw pixel space to a feature
space beneficial for predicting all the facial attributes.
The output of this mapping is denoted by
$f_{\theta_g^{F}, \theta_g^{T}}(I)$.
  • Part Localization Network:
The part LocNet localizes the most relevant and
discriminative part for a specific attribute and makes the
attribute prediction. The LocNet has access to the whole
face.
The part LocNet predicts a set of localization parameters
and focuses on the relevant part of the face through
learned scaling and translating transformations. For
example, the shape of the eyebrows or the appearance of a
goatee are very small attributes on the face that can be
predicted by part localization. The set of part
localization parameters is denoted as $T_p$, and the
correspondence between the part and the globally
transformed face image is modeled by the following
equation, which links the pixel value at
$(x_i^{p}, y_i^{p})$ on the output partial face image to
the pixel
values centered at location $(x_i^{g}, y_i^{g})$ on the
globally transformed face image.
(2)
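As a rough sketch of what a scaling-and-translation transformation does, the snippet below assumes $T_p$ is a 2x3 matrix with zero off-diagonal terms that maps part-image pixel coordinates $(x_i^{p}, y_i^{p}, 1)$ to locations $(x_i^{g}, y_i^{g})$ on the globally transformed face. The matrix layout, the pixel units and the helper names are assumptions made for illustration.

```python
def make_part_transform(scale_x, scale_y, shift_x, shift_y):
    """2x3 matrix with scaling and translation only (no rotation or shear)."""
    return [[scale_x, 0.0, shift_x],
            [0.0, scale_y, shift_y]]


def map_to_global(t_p, x_p, y_p):
    """Map a pixel (x_p, y_p) of the part image to its source location
    (x_g, y_g) on the globally transformed face."""
    x_g = t_p[0][0] * x_p + t_p[0][1] * y_p + t_p[0][2]
    y_g = t_p[1][0] * x_p + t_p[1][1] * y_p + t_p[1][2]
    return x_g, y_g


def focus_window(t_p, part_width, part_height):
    """Return (left, top, right, bottom) of the region on the global
    face that the part image covers."""
    left, top = map_to_global(t_p, 0, 0)
    right, bottom = map_to_global(t_p, part_width - 1, part_height - 1)
    return left, top, right, bottom


# Hypothetical example: a 32x32 part image that zooms in 2x on a region
# whose top-left corner sits at (16, 8) on the global face.
t_p = make_part_transform(0.5, 0.5, 16.0, 8.0)
window = focus_window(t_p, 32, 32)
```

With a scale of 0.5, the 32x32 part image covers a window roughly 16 pixels wide on the global face, so a small region such as an eyebrow is seen at twice the resolution.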
Part LocNet is also end-to-end trainable. It positions the
focus window on a relevant part of the face through
learned scaling and translating transformations. The
locally transformed images are then processed by the Local
Representation Learning Net for one or more attributes of
the face. The additional parameters used to generate the
transformation $T_{p_i}$ in the part LocNet for the
$i$-th attribute are denoted by $\theta_{p_i}^{T}$. Thus
the generated transformation is given by
$T_{p_i} = f_{\theta_{p_i}^{T}, \theta_g^{F}, \theta_g^{T}}(I)$ (3)
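One way to read equation (3) is as a composition of three stages: the global TransNet, the global representation learning net, and an attribute-specific localization head. The expanded form below is an interpretive sketch only; the function symbols $h$ and $\phi$ are introduced here for illustration and do not appear in the source.

```latex
T_{p_i} \;=\; h_{\theta_{p_i}^{T}}\!\Big( \phi_{\theta_{g}^{F}}\big( f_{\theta_{g}^{T}}(I) \big) \Big)
```

Read this way, only $\theta_{p_i}^{T}$ is specific to the $i$-th attribute, while $\theta_g^{F}$ and $\theta_g^{T}$ are shared with the global branch.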
• Global-Local Feature Fusion: In this phase, the global
and local representations are integrated. A good global
transformation rectifies the face scale, location and
orientation, and the most discriminative part of the face
is identified for each specific attribute, without
requiring external landmark points for alignment. The
global and local features are generated by the global
representation learning net and the part representation
learning net, respectively, and are fused for attribute
prediction.
• Attribute Prediction: The global and local features
generated by the global representation learning net and the
part representation learning net are fused, and the
specific attribute is predicted from the combined
global-local features.
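The fusion and prediction steps can be illustrated with a small sketch. The source does not state how the two feature vectors are fused, so concatenation followed by a logistic classifier is assumed here; the feature values, weights and function names are hypothetical.

```python
import math


def fuse(global_feature, local_feature):
    """Fuse by concatenation (one common choice; assumed here)."""
    return list(global_feature) + list(local_feature)


def predict_attribute(fused, weights, bias):
    """Logistic score in (0, 1) for a single binary attribute."""
    logit = sum(w * f for w, f in zip(weights, fused)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical feature vectors from the global and the part branch.
global_feature = [0.2, -0.4, 0.9]
local_feature = [1.5, 0.3]

# Hypothetical classifier parameters for one attribute.
weights = [0.5, 0.5, 0.5, 1.0, -1.0]
bias = -0.5

fused = fuse(global_feature, local_feature)
probability = predict_attribute(fused, weights, bias)
is_present = probability >= 0.5
```

Each attribute would have its own local feature and classifier, while the global feature is shared across attributes.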
4. RESULTS
Fig.2 gives the experimental results of global
transformation and part localization.
A 3 × 3 grid is overlaid on each globally transformed
face box, and the two eyes lie in the center cell. Fig.2
demonstrates that the global TransNet generates good
global transformations, in the sense that the two eyes are
centered in the transformed face images.
Fig.2: Results of Global TransNet[1]
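The "eyes in the center cell" criterion can be expressed as a simple check. The sketch below is only a way to evaluate transformed outputs: the eye coordinates are hypothetical annotations used for inspection, not inputs to AFFAIR, and the function names are made up for this example.

```python
def grid_cell(x, y, width, height, rows=3, cols=3):
    """Return (row, col) of the grid cell containing point (x, y) on a
    width x height image divided into rows x cols equal cells."""
    col = min(int(x * cols / width), cols - 1)
    row = min(int(y * rows / height), rows - 1)
    return row, col


def eyes_centered(left_eye, right_eye, width, height):
    """True when both eye points fall in the central cell of a 3x3 grid."""
    center = (1, 1)
    return (grid_cell(*left_eye, width, height) == center
            and grid_cell(*right_eye, width, height) == center)


# Hypothetical eye positions on a 90x90 transformed face.
well_aligned = eyes_centered((38, 40), (52, 40), 90, 90)
off_center = eyes_centered((20, 40), (52, 40), 90, 90)
```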
The localization results from the part LocNets for all
the 8 facial attribute categories in the CelebA dataset
are shown in Fig.3. Each column shows a facial attribute
category and each row shows one test image, where the
original test face images are displayed in the first
column. The next columns show the localization results
on the globally transformed face images. The boxes
indicate the output from the part LocNets. On top of the
globally transformed faces, the part LocNet indeed
localizes the most discriminative part of the face. For
example, the eye region is localized for predicting the
attributes “Arched Eyebrows”, “Bags Under Eyes” and
“Eyeglasses”. The nose region is localized for the
attributes “Big Nose” and “Pointy Nose”.
Fig.3: Results of Part localization [1]
5. ADVANTAGES AND DISADVANTAGES
• Advantages:
  • The AFFAIR model localizes the specific part needed
for predicting a facial attribute by means of the
transformation-localization network.
  • The AFFAIR model integrates both global and local
representations, removing the need for external
landmark points for alignment.
  • AFFAIR focuses on the local region and learns a more
discriminative representation for better attribute
prediction.
  • AFFAIR does not require face alignment as
preprocessing and provides state-of-the-art results
on the CelebA, LFWA and MTFL datasets.
• Disadvantage:
  • The method performs better in all cases except when
the person wears a mask, in which case the face
attributes cannot be detected.
6. CONCLUSION
This paper describes the lAndmark Free Face AttrIbute
pRediction (AFFAIR) system, which does not
depend on landmarks and hardwired face alignment for
prediction of the attributes. By learning a global
transformation, it generates an optimized transformation
tailored for each input face. By learning part
localization, it locates the most relevant facial part.
Finally, attribute prediction is done by aggregating the
global and local features.
REFERENCES
[1] Jianshu Li, Fang Zhao, Jiashi Feng, Sujoy Roy, Shuicheng
Yan, Terrence Sim, “Landmark Free Face Attribute
Prediction”, IEEE Transactions on Image Processing,
Volume: 27, Issue: 9, pages 4651-4662, Sept. 2018.
[2] Jianlong Fu, Heliang Zheng, Tao Mei, “Look Closer to See
Better: Recurrent Attention Convolutional Neural
Network for Fine-Grained Image Recognition”, 2017
IEEE Conference on Computer Vision and Pattern
Recognition (CVPR), November 2017.
[3] Yang Zhong, Josephine Sullivan, Haibo Li, “Face
attribute prediction using off-the-shelf CNN features”,
2016 International Conference on Biometrics (ICB), June
2016.
[4] Max Ehrlich, Timothy J. Shields, Timur Almaev,
Mohamed R. Amer, “Facial Attributes Classification
Using Multi-task Representation Learning”, 2016 IEEE
Conference on Computer Vision and Pattern Recognition
Workshops (CVPRW), July 2016.
[5] Hamdi Dibeklioğlu, Fares Alnajar, Albert Ali Salah, Theo
Gevers, “Combining Facial Dynamics With Appearance
for Age Estimation”, IEEE Transactions on Image
Processing, Volume: 24, Issue: 6, pages 1928-1943,
June 2015.
[6] Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun,
“Deep Residual Learning for Image Recognition”, 2016
IEEE Conference on Computer Vision and Pattern
Recognition (CVPR), December 2016.
[7] Andreas Steger, Radu Timofte, “Failure Detection for
Facial Landmark Detectors”, Asian Conference on
Computer Vision (ACCV 2016) Workshops, pages 361-376.
[8] Yue Wu, Qiang Ji, “Robust Facial Landmark Detection
under Significant Head Poses and Occlusion”, 2015 IEEE
International Conference on Computer Vision (ICCV),
Dec. 2015.

More Related Content

PDF
Face Recognition based on STWT and DTCWT using two dimensional Q-shift Filters
PDF
IRJET- Multiple Feature Fusion for Facial Expression Recognition in Video: Su...
PDF
An Assimilated Face Recognition System with effective Gender Recognition Rate
PDF
An Accurate Facial Component Detection Using Gabor Filter
PDF
HVDLP : HORIZONTAL VERTICAL DIAGONAL LOCAL PATTERN BASED FACE RECOGNITION
PDF
IRJET- Face Recognition of Criminals for Security using Principal Component A...
PDF
An Efficient Face Recognition Using Multi-Kernel Based Scale Invariant Featur...
PDF
Identifying Gender from Facial Parts Using Support Vector Machine Classifier
Face Recognition based on STWT and DTCWT using two dimensional Q-shift Filters
IRJET- Multiple Feature Fusion for Facial Expression Recognition in Video: Su...
An Assimilated Face Recognition System with effective Gender Recognition Rate
An Accurate Facial Component Detection Using Gabor Filter
HVDLP : HORIZONTAL VERTICAL DIAGONAL LOCAL PATTERN BASED FACE RECOGNITION
IRJET- Face Recognition of Criminals for Security using Principal Component A...
An Efficient Face Recognition Using Multi-Kernel Based Scale Invariant Featur...
Identifying Gender from Facial Parts Using Support Vector Machine Classifier

What's hot (16)

PDF
Fourier mellin transform based face recognition
PDF
H0334749
PDF
IRJET- Facial Expression Recognition: Review
PDF
Facial expression recognition using pca and gabor with jaffe database 11748
PDF
A FACE RECOGNITION USING LINEAR-DIAGONAL BINARY GRAPH PATTERN FEATURE EXTRACT...
PDF
Aa4102207210
PDF
IRJET - Facial Recognition based Attendance System with LBPH
PDF
Reconstruction of partially damaged facial image
PDF
IRJET - Facial In-Painting using Deep Learning in Machine Learning
PDF
IRJET- A Study on Face Recognition based on Local Binary Pattern
PDF
Facial Expression Recognition System Based on SVM and HOG Techniques
PDF
IRJET- A Review on Face Recognition using Local Binary Pattern Algorithm
PDF
Ijetcas14 351
PDF
MSB based Face Recognition Using Compression and Dual Matching Techniques
PDF
IRJET- Facial Expression Recognition
PDF
Implementation of Face Recognition in Cloud Vision Using Eigen Faces
Fourier mellin transform based face recognition
H0334749
IRJET- Facial Expression Recognition: Review
Facial expression recognition using pca and gabor with jaffe database 11748
A FACE RECOGNITION USING LINEAR-DIAGONAL BINARY GRAPH PATTERN FEATURE EXTRACT...
Aa4102207210
IRJET - Facial Recognition based Attendance System with LBPH
Reconstruction of partially damaged facial image
IRJET - Facial In-Painting using Deep Learning in Machine Learning
IRJET- A Study on Face Recognition based on Local Binary Pattern
Facial Expression Recognition System Based on SVM and HOG Techniques
IRJET- A Review on Face Recognition using Local Binary Pattern Algorithm
Ijetcas14 351
MSB based Face Recognition Using Compression and Dual Matching Techniques
IRJET- Facial Expression Recognition
Implementation of Face Recognition in Cloud Vision Using Eigen Faces
Ad

Similar to IRJET- Prediction of Facial Attribute without Landmark Information (20)

PDF
Progression in Large Age-Gap Face Verification
PDF
IRJET- A Review on Various Approaches of Face Recognition
PDF
Face Recognition for Human Identification using BRISK Feature and Normal Dist...
PDF
PR-185: RetinaFace: Single-stage Dense Face Localisation in the Wild
PDF
Development of Real Time Face Recognition System using OpenCV
PDF
IRJET- Face Recognition using Deep Learning
PDF
IRJET- Emotionalizer : Face Emotion Detection System
PDF
IRJET - Emotionalizer : Face Emotion Detection System
PDF
AN EFFICIENT FACE RECOGNITION EMPLOYING SVM AND BU-LDP
PDF
Lecture 10 ming yang - face recognition systems
PDF
IRJET- Persons Identification Tool for Visually Impaired - Digital Eye
PDF
IRJET - A Review on: Face Recognition using Laplacianface
DOC
Facial expression identification by using features of salient facial landmarks
DOC
Facial expression identification by using features of salient facial landmarks
DOCX
Innovative Analytic and Holistic Combined Face Recognition and Verification M...
PDF
Facial recognition based on enhanced neural network
PDF
Cu31632635
PDF
Attendance System using Face Recognition
PDF
Realtime face matching and gender prediction based on deep learning
PDF
IRJET - Real Time Facial Analysis using Tensorflowand OpenCV
Progression in Large Age-Gap Face Verification
IRJET- A Review on Various Approaches of Face Recognition
Face Recognition for Human Identification using BRISK Feature and Normal Dist...
PR-185: RetinaFace: Single-stage Dense Face Localisation in the Wild
Development of Real Time Face Recognition System using OpenCV
IRJET- Face Recognition using Deep Learning
IRJET- Emotionalizer : Face Emotion Detection System
IRJET - Emotionalizer : Face Emotion Detection System
AN EFFICIENT FACE RECOGNITION EMPLOYING SVM AND BU-LDP
Lecture 10 ming yang - face recognition systems
IRJET- Persons Identification Tool for Visually Impaired - Digital Eye
IRJET - A Review on: Face Recognition using Laplacianface
Facial expression identification by using features of salient facial landmarks
Facial expression identification by using features of salient facial landmarks
Innovative Analytic and Holistic Combined Face Recognition and Verification M...
Facial recognition based on enhanced neural network
Cu31632635
Attendance System using Face Recognition
Realtime face matching and gender prediction based on deep learning
IRJET - Real Time Facial Analysis using Tensorflowand OpenCV
Ad

More from IRJET Journal (20)

PDF
Enhanced heart disease prediction using SKNDGR ensemble Machine Learning Model
PDF
Utilizing Biomedical Waste for Sustainable Brick Manufacturing: A Novel Appro...
PDF
Kiona – A Smart Society Automation Project
PDF
DESIGN AND DEVELOPMENT OF BATTERY THERMAL MANAGEMENT SYSTEM USING PHASE CHANG...
PDF
Invest in Innovation: Empowering Ideas through Blockchain Based Crowdfunding
PDF
SPACE WATCH YOUR REAL-TIME SPACE INFORMATION HUB
PDF
A Review on Influence of Fluid Viscous Damper on The Behaviour of Multi-store...
PDF
Wireless Arduino Control via Mobile: Eliminating the Need for a Dedicated Wir...
PDF
Explainable AI(XAI) using LIME and Disease Detection in Mango Leaf by Transfe...
PDF
BRAIN TUMOUR DETECTION AND CLASSIFICATION
PDF
The Project Manager as an ambassador of the contract. The case of NEC4 ECC co...
PDF
"Enhanced Heat Transfer Performance in Shell and Tube Heat Exchangers: A CFD ...
PDF
Advancements in CFD Analysis of Shell and Tube Heat Exchangers with Nanofluid...
PDF
Breast Cancer Detection using Computer Vision
PDF
Auto-Charging E-Vehicle with its battery Management.
PDF
Analysis of high energy charge particle in the Heliosphere
PDF
A Novel System for Recommending Agricultural Crops Using Machine Learning App...
PDF
Auto-Charging E-Vehicle with its battery Management.
PDF
Analysis of high energy charge particle in the Heliosphere
PDF
Wireless Arduino Control via Mobile: Eliminating the Need for a Dedicated Wir...
Enhanced heart disease prediction using SKNDGR ensemble Machine Learning Model
Utilizing Biomedical Waste for Sustainable Brick Manufacturing: A Novel Appro...
Kiona – A Smart Society Automation Project
DESIGN AND DEVELOPMENT OF BATTERY THERMAL MANAGEMENT SYSTEM USING PHASE CHANG...
Invest in Innovation: Empowering Ideas through Blockchain Based Crowdfunding
SPACE WATCH YOUR REAL-TIME SPACE INFORMATION HUB
A Review on Influence of Fluid Viscous Damper on The Behaviour of Multi-store...
Wireless Arduino Control via Mobile: Eliminating the Need for a Dedicated Wir...
Explainable AI(XAI) using LIME and Disease Detection in Mango Leaf by Transfe...
BRAIN TUMOUR DETECTION AND CLASSIFICATION
The Project Manager as an ambassador of the contract. The case of NEC4 ECC co...
"Enhanced Heat Transfer Performance in Shell and Tube Heat Exchangers: A CFD ...
Advancements in CFD Analysis of Shell and Tube Heat Exchangers with Nanofluid...
Breast Cancer Detection using Computer Vision
Auto-Charging E-Vehicle with its battery Management.
Analysis of high energy charge particle in the Heliosphere
A Novel System for Recommending Agricultural Crops Using Machine Learning App...
Auto-Charging E-Vehicle with its battery Management.
Analysis of high energy charge particle in the Heliosphere
Wireless Arduino Control via Mobile: Eliminating the Need for a Dedicated Wir...

Recently uploaded (20)

PPTX
Safety Seminar civil to be ensured for safe working.
PDF
III.4.1.2_The_Space_Environment.p pdffdf
PPTX
introduction to high performance computing
PDF
Level 2 – IBM Data and AI Fundamentals (1)_v1.1.PDF
PPTX
Feature types and data preprocessing steps
PDF
Exploratory_Data_Analysis_Fundamentals.pdf
PDF
Design Guidelines and solutions for Plastics parts
PDF
EXPLORING LEARNING ENGAGEMENT FACTORS INFLUENCING BEHAVIORAL, COGNITIVE, AND ...
PDF
Unit I ESSENTIAL OF DIGITAL MARKETING.pdf
PPTX
Fundamentals of Mechanical Engineering.pptx
PPTX
6ME3A-Unit-II-Sensors and Actuators_Handouts.pptx
PPTX
Current and future trends in Computer Vision.pptx
PDF
SMART SIGNAL TIMING FOR URBAN INTERSECTIONS USING REAL-TIME VEHICLE DETECTI...
PDF
737-MAX_SRG.pdf student reference guides
PDF
Human-AI Collaboration: Balancing Agentic AI and Autonomy in Hybrid Systems
PDF
null (2) bgfbg bfgb bfgb fbfg bfbgf b.pdf
PPTX
"Array and Linked List in Data Structures with Types, Operations, Implementat...
PPTX
Fundamentals of safety and accident prevention -final (1).pptx
PDF
Abrasive, erosive and cavitation wear.pdf
PDF
Influence of Green Infrastructure on Residents’ Endorsement of the New Ecolog...
Safety Seminar civil to be ensured for safe working.
III.4.1.2_The_Space_Environment.p pdffdf
introduction to high performance computing
Level 2 – IBM Data and AI Fundamentals (1)_v1.1.PDF
Feature types and data preprocessing steps
Exploratory_Data_Analysis_Fundamentals.pdf
Design Guidelines and solutions for Plastics parts
EXPLORING LEARNING ENGAGEMENT FACTORS INFLUENCING BEHAVIORAL, COGNITIVE, AND ...
Unit I ESSENTIAL OF DIGITAL MARKETING.pdf
Fundamentals of Mechanical Engineering.pptx
6ME3A-Unit-II-Sensors and Actuators_Handouts.pptx
Current and future trends in Computer Vision.pptx
SMART SIGNAL TIMING FOR URBAN INTERSECTIONS USING REAL-TIME VEHICLE DETECTI...
737-MAX_SRG.pdf student reference guides
Human-AI Collaboration: Balancing Agentic AI and Autonomy in Hybrid Systems
null (2) bgfbg bfgb bfgb fbfg bfbgf b.pdf
"Array and Linked List in Data Structures with Types, Operations, Implementat...
Fundamentals of safety and accident prevention -final (1).pptx
Abrasive, erosive and cavitation wear.pdf
Influence of Green Infrastructure on Residents’ Endorsement of the New Ecolog...

IRJET- Prediction of Facial Attribute without Landmark Information

  • 1. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 06 Issue: 07 | July 2019 www.irjet.net p-ISSN: 2395-0072 © 2019, IRJET | Impact Factor value: 7.211 | ISO 9001:2008 Certified Journal | Page 975 Prediction of Facial Attribute Without Landmark Information Jayashree S. Somani1, Mrs. V. L. Kolhe2 1Student, Dept. of Computer Engineering, Dr. D. Y. Patil College of Engineering, Akurdi, Pune, Maharashtra, India 2Professor, Dept. of Computer Engineering, Dr. D. Y. Patil College of Engineering, Akurdi, Pune, Maharashtra, India ---------------------------------------------------------------------***---------------------------------------------------------------------- Abstract – Prediction of the face attributesisverychallenging because of complex variation in face. Most of the systems which are used for predicting the attributes of face are unable to give the precision as these methods are depending on the landmark detection and canonical positions. The dependency on landmark detector is unable to give satisfactory results on unconstrained faces with large pose angles, occlusion or blurriness which reduces the performance of attribute prediction. The system explained here use an AFFAIR method that gives the face attribute prediction. By learning a global transformation technique and adaptive part localization technique, system provides the most relevant part for predicting a specific attribute on the face. TheAFFAIRlearnsa good transformation for each input face image directly for attribute prediction with greater accuracy. Key Words: AFFAIR, Landmark Detector, Attribute Prediction, Global Transformation, Part localization. 1. INTRODUCTION Describing people depending on their feature points like gender, age, hair style and clothing style is an important problem for many applications in face analysis. 
Previously Detection-Alignment-Recognition (DAR) method [1] is used for detection of the face attributes with landmark detection from images. As this method depends on the quality of landmark detection, it results reduce the performanceofthe method as it unable to give the fine result on unconstrained faces. Some of the author uses a pre-trained deep Convolutional Neural Networks (CNN)[4][6][1] for face recognition tasks to obtain global face representation and binary linear SVM classifiers are built on the global face representations to classify face attributes. The previous methods use global methods in which entire object for representation learning and attribute prediction is done without part information and thelocal methodwhich extract features from relevant regions or parts for attribute prediction. This study aims to investigate the possibility of optimizing facial landmark detection and alignment which are complicated tasks. Here we are studying AFFAIR method[1] which is an aggregation of global and local methods for attribute prediction. AFFAIR provides an end-to-end learning framework for finding the appropriate transformation. The global transformation is used to detect face and generates transformation parameters tailored for the original input face. Part LocNet is used to focus on the most relevant part of the face forattributeprediction.Finally by integrating both global and local representations, we can predict the facial attributes with no requirement of external landmark points for alignment. 2. LITERATURE SURVEY Jianshu Li et al. [1] published Landmark Free Face Attribute Prediction in which he proposed the AFFAIR method which learns global transformation and part localization. Then he aggregates both global and local featuresforrobustattribute prediction. Jianlong Fu et al. 
[2] published Look Closer to See Better: Recurrent Attention Convolutional Neural Network forFine- Grained Image Recognition in which he proposed recurrent attention convolutional neural network (RA-CNN) for recognizing the ne grained categories like bird species, etc. Yang Zhong et al. [3] published Face attribute prediction using off-the-shelf CNN features in which he workedwith an alternative way of employing the power of deep representations from CNNs. Combining with conventional face localization techniques, he used the off-the-shelf architecture strained for face recognition to build facial descriptors. Max Ehrlich et al. [4] published Facial Attributes Classification Using Multi-task Representation Learning in which he proposed a model which learns a shared feature representation that is well suited for multiple attribute classification. Then he learns a joint feature representation which enables interaction between different tasks. For learning this shared feature representation the author has used a Restricted Boltzmann Machine (RBM) based model, enhanced with a factored multi-task component to become Multi-Task Restricted Boltzmann Machine (MT-RBM). Hamdi Dibekliolu et al. [5] published Combining Facial Dynamics with Appearance for Age Estimation in which he proposed a method which extracts and uses dynamic features for age estimation, using a person’s smile. Kaiming He et al. [6] published Deep Residual Learning for Image Recognition in which he has presented a residual learning framework to ease the trainingofnetworksthatare substantially deeper than those used previously. The depth of representations is of central importance for many visual
  • 2. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 06 Issue: 07 | July 2019 www.irjet.net p-ISSN: 2395-0072 © 2019, IRJET | Impact Factor value: 7.211 | ISO 9001:2008 Certified Journal | Page 976 recognition tasks. By doing this, they obtain a 28% relative improvement on the COCO object detection dataset. Andreas Steger et al. [7] published Failure Detection for Facial Landmark Detectors in which he studied two top recent facial landmark detectors (AFLW, HELEN)anddevise confidence models for their outputs. Because of this approach, it correctly identifies more than 40% of the failures in the outputs of the landmark detectors. Yue Wu et al. [8] published Robust Facial Landmark Detection under Significant Head Poses and Occlusion in which he proposed a unified robust cascade regression framework that can handle both images with severe occlusion and images with large head poses.Heintroduceda supervised regression method that gradually updates the landmark visibility probabilities in each iteration to achieve robustness. 3. SYSTEM DESIGN Human face attributeestimationhasreceiveda largeamount of attention in recent years in visual recognition research because a face attribute provides a wide variety of salient information such as age, gender e. t. c. The fig.1 shows the architecture of the system. The working of the system is as follows: Fig.1: System Architecture  Input image: User uploads images using live camera. Face is detected from the input image and given to the AFFAIR framework as an input. The AFFAIR method is an aggregation of both global and local methods for attribute prediction that integratesbothglobal andlocal representations and requires no external landmark points for alignment. 
• lAndmark Free Face AttrIbute pRediction method: The lAndmark Free Face AttrIbute pRediction (AFFAIR) method learns a global transformation and part localizations on each input face end-to-end. It has two main components:
  • Global Transformation Network: The global transformation transforms the input face to one with an optimized configuration for further representation learning and attribute prediction. It consists of two parts: the global TransNet and the global representation learning net. The global TransNet is followed by the global representation learning net, which considers all the facial attributes simultaneously. The two networks are trained end-to-end for attribute prediction. The global TransNet in AFFAIR takes the detected face as input and produces a set of optimized transformation parameters $T_g$, tailored to the original input face, for attribute representation learning. The transformation maps the globally transformed face image to the input image via
  $$\begin{pmatrix} x_i^{input} \\ y_i^{input} \end{pmatrix} = T_g \begin{pmatrix} x_i^{g} \\ y_i^{g} \\ 1 \end{pmatrix} \tag{1}$$
  The globally transformed face image is therefore obtained pixel by pixel: the pixel value at location $(x_i^{g}, y_i^{g})$ of the transformed image is obtained by bilinearly interpolating the pixel values of the input face image centered at $(x_i^{input}, y_i^{input})$. The globally transformed face image is then given as input to the global representation learning net, which considers all the facial attributes simultaneously. The output face from the global TransNet is denoted by $f_{\theta_g^T}(I)$. The global representation learning net, parameterized by $\theta_g^F$, maps the transformed image from the raw pixel space to a feature space beneficial for predicting all the facial attributes. The resulting global representation, shared by all the facial attributes, is denoted by $f_{\theta_g^F, \theta_g^T}(I)$.
  • Part Localization Network: The part LocNet is used to localize the most relevant and discriminative part for a specific attribute and to make the attribute prediction.
The part LocNet can access the whole face. It predicts a set of localization parameters and focuses on the relevant part of the face through learned scaling and translating transformations. For example, the shape of the eyebrows or the appearance of a goatee are very small attributes on the face that can be predicted through part localization. The set of part localization parameters is denoted by $T_p$, and the correspondence between the part and the globally transformed face image is modeled by the following equation, which links the pixel value at $(x_i^{p}, y_i^{p})$ on the output partial face image to the pixel
values centered at location $(x_i^{g}, y_i^{g})$ on the globally transformed face image:
$$\begin{pmatrix} x_i^{g} \\ y_i^{g} \end{pmatrix} = T_p \begin{pmatrix} x_i^{p} \\ y_i^{p} \\ 1 \end{pmatrix} \tag{2}$$
The part LocNet is also end-to-end trainable. It positions the focus window on a relevant part of the face through learned scaling and translating transformations. The locally transformed images are then processed by the local representation learning net for several or all attributes of the face. The additional parameters used to generate the transformation $T_{p_i}$ in the part LocNet for the $i$-th attribute are denoted by $\theta_{p_i}^T$. The generated transformation is thus given by
$$T_{p_i} = f_{\theta_{p_i}^T, \theta_g^F, \theta_g^T}(I) \tag{3}$$
• Global-Local Feature Fusion: In this phase, the global and local representations are integrated. A good global transformation rectifies the face scale, location and orientation, and the most discriminative part of the face is identified for each specific attribute, without requiring external landmark points for alignment. The global and local features, generated by the global representation learning net and the part representation learning net respectively, are fused for attribute prediction.
• Attribute Prediction: Once the global and local features have been fused, the specific attribute is predicted from the combined representation.
4. RESULTS
Fig. 2 gives the experimental results of global transformation and part localization. The globally transformed images are shown with 3 × 3 grids drawn on the transformed face boxes. We can see that the two eyes lie in the center grid cell.
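To illustrate how a globally transformed image such as those in Fig. 2 can be built pixel by pixel, the following is a minimal sketch in plain Python. It is not the implementation of [1]. It assumes that Tg and Tp are 2×3 matrices acting on pixel coordinates (a learned system would more likely use normalized coordinates), that images are grayscale lists of rows, and that samples outside the image reuse border pixels. The function names `bilinear`, `warp` and `part_transform` and the hand-picked matrices are hypothetical; in AFFAIR the parameters are predicted by the global TransNet and the part LocNet.

```python
from typing import List, Sequence

Image = List[List[float]]            # grayscale image stored as rows of pixel values
Affine = Sequence[Sequence[float]]   # assumed 2x3 matrix [[a, b, tx], [c, d, ty]]


def bilinear(img: Image, x: float, y: float) -> float:
    """Bilinearly interpolate img at continuous pixel coordinates (x, y)."""
    h, w = len(img), len(img[0])
    # Assumption: coordinates outside the image are clamped to the border.
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
    bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


def warp(img: Image, t: Affine, out_h: int, out_w: int) -> Image:
    """Fill each output pixel (x, y) by sampling the source at t * (x, y, 1)."""
    out = []
    for y in range(out_h):
        row = []
        for x in range(out_w):
            src_x = t[0][0] * x + t[0][1] * y + t[0][2]
            src_y = t[1][0] * x + t[1][1] * y + t[1][2]
            row.append(bilinear(img, src_x, src_y))
        out.append(row)
    return out


def part_transform(sx: float, sy: float, tx: float, ty: float) -> Affine:
    """Scale-and-translate-only transform, the form assumed here for Tp."""
    return [[sx, 0.0, tx], [0.0, sy, ty]]


if __name__ == "__main__":
    # Toy 6x6 "face" whose pixel value encodes its position (10 * row + col).
    face = [[float(10 * r + c) for c in range(6)] for r in range(6)]
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]       # stand-in for Tg
    global_face = warp(face, identity, 6, 6)
    # Stand-in for Tp: zoom in 2x on the window whose corner is at (2, 2).
    part = warp(global_face, part_transform(0.5, 0.5, 2.0, 2.0), 4, 4)
    print(part[0][0], part[1][1])  # prints: 22.0 27.5
```

Because every output pixel is a weighted average of source pixels, the output varies smoothly with the transformation parameters. This is the property that allows such a sampling step to sit inside a network that is trained end-to-end.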
Fig. 2 demonstrates that the global TransNet is able to generate good global transformations, in the sense that the two eyes are centered in the transformed face images.
Fig. 2: Results of Global TransNet [1]
The localization results from the part LocNets for all 8 facial attribute categories in the CelebA dataset are shown in Fig. 3. Each row shows one test image. The first column displays the original test face image, and each of the remaining columns shows the localization result for one facial attribute category on the globally transformed face image. The boxes indicate the output of the part LocNets. One can see that, on top of the globally transformed faces, the part LocNet indeed localizes the most discriminative part of the face. For example, the eye region is localized for predicting the attributes "Arched Eyebrows", "Bags Under Eyes", "Eyeglasses", etc. The nose region is localized for the attributes "Big Nose" and "Pointy Nose".
Fig. 3: Results of Part localization [1]
5. ADVANTAGES AND DISADVANTAGES
• Advantages:
  • The AFFAIR model is able to localize the specific part needed to predict a facial attribute by using the transformation-localization network.
  • The AFFAIR model integrates both global and local representations while removing the need for external landmark points for alignment.
  • AFFAIR focuses on the local region and learns a more discriminative representation for better attribute prediction.
  • AFFAIR does not require face alignment as preprocessing and provides state-of-the-art results on the CelebA, LFWA and MTFL datasets.
• Disadvantage:
  • The method performs better in all cases except when the person wears a mask, in which case the face attributes cannot be detected.
6. CONCLUSION
The landmark free face attribute prediction (AFFAIR) system presented in this paper does not
depend on landmarks or hardwired face alignment for prediction of the attributes. By learning a global transformation, it generates an optimized transformation tailored to each input face. By learning part localization, it locates the most relevant facial part. Finally, attribute prediction is performed by aggregating the global and local features.
REFERENCES
[1] Jianshu Li, Fang Zhao, Jiashi Feng, Sujoy Roy, Shuicheng Yan, Terrence Sim, "Landmark Free Face Attribute Prediction", IEEE Transactions on Image Processing, Volume: 27, Issue: 9, pages 4651-4662, Sept. 2018.
[2] Jianlong Fu, Heliang Zheng, Tao Mei, "Look Closer to See Better: Recurrent Attention Convolutional Neural Network for Fine-Grained Image Recognition", 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), November 2017.
[3] Yang Zhong, Josephine Sullivan, Haibo Li, "Face attribute prediction using off-the-shelf CNN features", 2016 International Conference on Biometrics (ICB), June 2016.
[4] Max Ehrlich, Timothy J. Shields, Timur Almaev, Mohamed R. Amer, "Facial Attributes Classification Using Multi-task Representation Learning", 2016 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), July 2016.
[5] Hamdi Dibeklioğlu, Fares Alnajar, Albert Ali Salah, Theo Gevers, "Combining Facial Dynamics With Appearance for Age Estimation", IEEE Transactions on Image Processing, Volume: 24, Issue: 6, pages 1928-1943, June 2015.
[6] Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun, "Deep Residual Learning for Image Recognition", 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), December 2016.
[7] Andreas Steger, Radu Timofte, "Failure Detection for Facial Landmark Detectors", Asian Conference on Computer Vision (ACCV) 2016 Workshops, pages 361-376.
[8] Yue Wu, Qiang Ji, "Robust Facial Landmark Detection under Significant Head Poses and Occlusion", 2015 IEEE International Conference on Computer Vision (ICCV), Dec. 2015.