Grigory ANTIPOV
Academic status
Thesis in progress...
Subject: Thesis on machine learning methods for the automatic annotation of audiovisual content
Thesis supervision:
Blue ellipse: doctoral student; yellow ellipse: doctor; green rectangle: permanent staff; yellow rectangle: HDR (accredited to supervise research). Green line: thesis advisor; blue line: thesis director; dotted line: mid-term evaluation committee or thesis committee.
Scientific publications
Learned vs. Hand-Crafted Features for Pedestrian Gender Recognition
International audience
This paper addresses the problem of image feature selection for pedestrian gender recognition. Hand-crafted features (such as HOG) are compared with learned features, which are obtained by training convolutional neural networks. The comparison is performed on a recently created collection of versatile pedestrian datasets, which allows us to evaluate the impact of dataset properties on the performance of features. The study shows that hand-crafted and learned features perform equally well on small-sized homogeneous datasets. However, learned features significantly outperform hand-crafted ones in the case of heterogeneous and unfamiliar (unseen) datasets. Our best model, which is based on learned features, obtains a 79% average recognition rate on completely unseen datasets. We also show that a relatively small convolutional neural network is able to produce competitive features even with little training data.
Proceedings of the 23rd ACM international conference on Multimedia, international conference, Oct 2015, Brisbane, Australia. DOI: 10.1145/2733373.2806332. https://hal.archives-ouvertes.fr/hal-01380429 (2015-10-26)
Minimalistic CNN-based ensemble model for gender prediction from face images
International audience
Despite being extensively studied in the literature, the problem of gender recognition from face images remains difficult when dealing with unconstrained images in a cross-dataset protocol. In this work, we propose a convolutional neural network ensemble model to improve the state-of-the-art accuracy of gender recognition from face images on one of the most challenging face image datasets today, LFW (Labeled Faces in the Wild). We find that convolutional neural networks need significantly less training data than previously proposed methods to obtain state-of-the-art performance. Furthermore, our ensemble model is deliberately designed so that both its memory requirements and running time are minimized. This allows us to envision using the constructed model in embedded devices or in a cloud platform for intensive use on massive image databases.
Pattern Recognition Letters, Elsevier, 2016, 70, pp. 59-65. ISSN: 0167-8655. DOI: 10.1016/j.patrec.2015.11.011. https://hal.archives-ouvertes.fr/hal-01380573 (2016-01-15)
The impact of privacy protection filters on gender recognition
International audience
Deep learning-based algorithms have become increasingly efficient in recognition and detection tasks, especially when they are trained on large-scale datasets. This recent success has led to speculation that deep learning methods are comparable to, or even outperform, the human visual system in their ability to detect and recognize objects and their features. In this paper, we focus on the specific task of gender recognition in images that have been processed by privacy protection filters (e.g., blurring, masking, and pixelization) applied at different strengths. Assuming a privacy protection scenario, we compare the performance of state-of-the-art deep learning algorithms with a subjective evaluation obtained via crowdsourcing to understand how privacy protection filters affect both machine and human vision.
SPIE Optical Engineering + Applications, Aug 2015, San Diego, United States. https://hal.archives-ouvertes.fr/hal-01367561 (2015-08-28)
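As a rough illustration of two of the filters named in the abstract above (pixelization and masking), a minimal NumPy sketch follows. The function names, the block-averaging approach, and the use of block size as the filter "strength" are assumptions made for illustration, not the implementation used in the paper.

```python
import numpy as np


def pixelize(image: np.ndarray, block: int) -> np.ndarray:
    """Hypothetical helper: replace each block x block tile by its mean.

    A larger `block` gives a stronger filter (assumed notion of strength).
    Works for grayscale (H, W) and color (H, W, C) arrays.
    """
    height, width = image.shape[:2]
    out = image.astype(np.float64).copy()
    for y in range(0, height, block):
        for x in range(0, width, block):
            tile = out[y:y + block, x:x + block]  # view into `out`
            tile[...] = tile.mean(axis=(0, 1), keepdims=True)
    return out


def mask_region(image: np.ndarray, top: int, left: int,
                height: int, width: int, value: float = 0.0) -> np.ndarray:
    """Hypothetical helper: overwrite a rectangular region with a constant."""
    out = image.astype(np.float64).copy()
    out[top:top + height, left:left + width] = value
    return out


if __name__ == "__main__":
    demo = np.arange(16, dtype=np.float64).reshape(4, 4)
    print(pixelize(demo, block=2))
    print(mask_region(demo, top=0, left=0, height=2, width=2))
```

Both helpers return a new array and leave the input untouched, so the same source image can be filtered at several strengths for comparison.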
Apparent Age Estimation from Face Images Combining General and Children-Specialized Deep Learning Models
International audience
This work describes our solution in the second edition of the ChaLearn LAP competition on Apparent Age Estimation. We train a VGG-16 convolutional neural network on the huge IMDB-Wiki dataset for biological age estimation and then fine-tune it for apparent age estimation using the relatively small competition dataset. We show that the precise age estimation of children is the cornerstone of the competition. Therefore, we integrate a separate "children" VGG-16 network for apparent age estimation of children between 0 and 12 years old in our final solution. The "children" network is fine-tuned from the "general" one. We employ different age encoding strategies for training the "general" and "children" networks: the soft one (label distribution encoding) for the "general" network and the strict one (0/1 classification encoding) for the "children" network. Our resulting solution wins 1st place in the competition, significantly outperforming the runner-up.
The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, international workshop, Jun 2016, Las Vegas, United States. https://hal.archives-ouvertes.fr/hal-01380587 (2016-06-26)
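The two age encodings mentioned in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions: the Gaussian shape of the soft target, the value of `sigma`, the 0 to 100 age range, and all function names are hypothetical and are not taken from the paper.

```python
import numpy as np


def one_hot_age(age: int, num_ages: int = 101) -> np.ndarray:
    """Strict 0/1 classification encoding: a single 1 at the true age."""
    target = np.zeros(num_ages)
    target[age] = 1.0
    return target


def label_distribution_age(age: float, sigma: float = 2.0,
                           num_ages: int = 101) -> np.ndarray:
    """Soft encoding: probability mass spread around the true age.

    Assumption: a discretized Gaussian centered on `age`, normalized to sum to 1.
    """
    ages = np.arange(num_ages)
    weights = np.exp(-0.5 * ((ages - age) / sigma) ** 2)
    return weights / weights.sum()


def expected_age(probabilities: np.ndarray) -> float:
    """Decode a distribution over ages into a single estimate (its mean)."""
    return float(np.dot(np.arange(len(probabilities)), probabilities))


if __name__ == "__main__":
    strict = one_hot_age(8)
    soft = label_distribution_age(30.0)
    print(strict.argmax(), expected_age(strict))
    print(soft.argmax(), round(expected_age(soft), 3))
```

Either target vector can be paired with a cross-entropy style loss over the age classes; the soft variant penalizes a prediction of 31 for a true age of 30 far less than a prediction of 60, which the strict variant treats identically.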
Effective training of convolutional neural networks for face-based gender and age prediction
International audience
Convolutional Neural Networks (CNNs) have been proven very effective for human demographics estimation by a number of recent studies. However, the proposed solutions vary significantly in different aspects, leaving many open questions on how to choose an optimal CNN architecture and which training strategy to use. In this work, we shed light on some of these questions, improving the existing CNN-based approaches for gender and age prediction and providing practical hints for future studies. In particular, we analyse four important factors of CNN training for gender recognition and age estimation: (1) the target age encoding and loss function, (2) the CNN depth, (3) the need for pretraining, and (4) the training strategy: mono-task or multi-task. As a result, we design state-of-the-art gender recognition and age estimation models according to three popular benchmarks: LFW, MORPH-II and FG-NET. Moreover, our best model won the ChaLearn Apparent Age Estimation Challenge 2016, significantly outperforming the solutions of other participants.
Pattern Recognition, Elsevier, 2017, 72, pp. 15-26. ISSN: 0031-3203. DOI: 10.1016/j.patcog.2017.06.031. https://hal.archives-ouvertes.fr/hal-01556389 (2017-12)