EDITE Thibaut DURAND
Identity
Thibaut DURAND
Academic status
Thesis defended on 2017-09-20
Subject: Weakly supervised learning for visual recognition
Thesis supervision:
Laboratory:
Neighborhood
Blue ellipse: doctoral student; yellow ellipse: doctor; green rectangle: permanent staff member; yellow rectangle: HDR holder. Green line: thesis advisor; blue line: thesis director; dotted line: mid-term evaluation committee or thesis defense committee.
Scientific output
oai:hal.archives-ouvertes.fr:hal-01077046
SEMANTIC POOLING FOR IMAGE CATEGORIZATION USING MULTIPLE KERNEL LEARNING
International audience
In this paper, we propose a new method for taking into account the spatial information in image categorization. More specifically, we remove the loss of spatial information in Bag of Words related methods by computing the image signature over specific regions selected by object detectors. We propose to select the detectors using Multiple Kernel Learning techniques. We carry out experiments on the well-known VOC 2007 dataset, and show our semantic pooling obtains promising results.
IEEE International Conference on Image Processing, Oct 2014, Paris, France. https://hal.archives-ouvertes.fr/hal-01077046 Conference paper, 2014-10-27
oai:hal.archives-ouvertes.fr:hal-01077058
INCREMENTAL LEARNING OF LATENT STRUCTURAL SVM FOR WEAKLY SUPERVISED IMAGE CLASSIFICATION
International audience
Visual learning with weak supervision is a promising research area, since it offers the possibility to build large image datasets at reasonable cost. In this paper, we address the problem of weakly supervised object detection, where the goal is to predict the label of the image using object position as latent variable. We propose a new method that builds upon the Latent Structural SVM (LSSVM) formalism. Specifically, we introduce an original coarse-to-fine approach that limits the evolution of the latent parameter subspace. This incremental strategy drives the learning towards better solutions, providing a model with increased predictive accuracy. In addition, this leads to a significant speedup during learning and inference compared to standard sliding window methods. Experiments carried out on the Mammal dataset validate the good performance and fast training of the method compared to state-of-the-art works.
IEEE International Conference on Image Processing, Oct 2014, Paris, France. https://hal.archives-ouvertes.fr/hal-01077058 Conference paper, 2014-10-27
oai:hal.archives-ouvertes.fr:hal-01078079
Image classification using object detectors
International audience
Image categorization is one of the most competitive topics in computer vision and image processing. In this paper, we propose to use trained object and region detectors to represent the visual content of each image. Compared to similar methods found in the literature, our method encompasses two main areas of novelty: introducing a new spatial pooling formalism and designing a late fusion strategy for combining our representation with state-of-the-art methods based on low-level descriptors, e.g. Fisher Vectors and BossaNova. Our experiments carried out on the challenging PASCAL VOC 2007 dataset reveal outstanding performance. When combined with low-level representations, we reach more than 67.6% in MAP, outperforming recently reported results on this dataset by a large margin.
ICIP 2013: IEEE International Conference on Image Processing, Sep 2013, Melbourne, Australia. pp. 4340-4344, doi:10.1109/ICIP.2013.6738894. https://hal.archives-ouvertes.fr/hal-01078079 Conference paper, 2013-09-15
oai:hal.archives-ouvertes.fr:hal-01343785
WELDON: Weakly Supervised Learning of Deep Convolutional Neural Networks
International audience
In this paper, we introduce a novel framework for WEakly supervised Learning of Deep cOnvolutional neural Networks (WELDON). Our method is dedicated to automatically selecting relevant image regions from weak annotations, e.g. global image labels, and encompasses the following contributions. Firstly, WELDON leverages recent improvements on the Multiple Instance Learning paradigm, i.e. negative evidence scoring and top instance selection. Secondly, the deep CNN is trained to optimize Average Precision, and fine-tuned on the target dataset with efficient computations due to convolutional feature sharing. A thorough experimental validation shows that WELDON outperforms state-of-the-art results on six different datasets.
29th IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2016), Jun 2016, Las Vegas, NV, United States. http://cvpr2016.thecvf.com/ https://hal.archives-ouvertes.fr/hal-01343785 2016-06-26
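The two Multiple Instance Learning ingredients named in the WELDON abstract above (top instance selection and negative evidence scoring) can be illustrated with a small sketch. This is not the authors' implementation: the function name `top_and_negative_pooling`, the parameter `k`, and the choice to add the sum of the k highest region scores to the sum of the k lowest ones are assumptions made here for illustration only.

```python
def top_and_negative_pooling(region_scores, k=1):
    """Aggregate per-region scores into one image-level score.

    Assumption (not taken from the abstract): the image score is the sum
    of the k highest region scores (top instances) plus the sum of the
    k lowest region scores (negative evidence).
    """
    if k < 1 or k > len(region_scores):
        raise ValueError("k must be between 1 and the number of regions")
    ordered = sorted(region_scores)   # ascending order
    lowest = ordered[:k]              # strongest negative evidence
    highest = ordered[-k:]            # top-scoring instances
    return sum(highest) + sum(lowest)


# Hypothetical region scores for one image and one class.
scores = [4.0, -1.0, 0.5, 6.0, -3.0, 1.0]
print(top_and_negative_pooling(scores, k=2))  # (6.0 + 4.0) + (-3.0 - 1.0) = 6.0
```

With `k=1` and the negative term dropped, this reduces to plain max pooling over regions, which is the usual Multiple Instance Learning baseline.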
oai:hal.archives-ouvertes.fr:hal-01343784
MANTRA: Minimum Maximum Latent Structural SVM for Image Classification and Ranking
International audience
In this work, we propose a novel Weakly Supervised Learning (WSL) framework dedicated to learning discriminative part detectors from images annotated with a global label. Our WSL method encompasses three main contributions. Firstly, we introduce a new structured output latent variable model, Minimum mAximum lateNt sTRucturAl SVM (MANTRA), whose prediction relies on a pair of latent variables: $h^+$ (resp. $h^-$) provides positive (resp. negative) evidence for a given output $y$. Secondly, we instantiate MANTRA for two different visual recognition tasks: multi-class classification and ranking. For ranking, we propose efficient solutions to exactly solve the inference and the loss-augmented problems. Finally, extensive experiments highlight the relevance of the proposed method: MANTRA outperforms state-of-the-art results on five different datasets.
IEEE International Conference on Computer Vision (ICCV15), Dec 2015, Santiago, Chile. http://pamitc.org/iccv15/ https://hal.archives-ouvertes.fr/hal-01343784 2015-12-11
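As a hedged illustration of the pair of latent variables mentioned in the MANTRA abstract above, one way to write a score built from positive and negative evidence is shown below. The weight vector $w$, the joint feature map $\Psi$ and the latent space $\mathcal{H}$ are notation assumed here; the abstract itself only names $h^+$, $h^-$ and $y$.

$$
s_w(x, y) = \max_{h \in \mathcal{H}} \langle w, \Psi(x, y, h) \rangle + \min_{h \in \mathcal{H}} \langle w, \Psi(x, y, h) \rangle,
\qquad
h^+ = \arg\max_{h \in \mathcal{H}} \langle w, \Psi(x, y, h) \rangle,
\quad
h^- = \arg\min_{h \in \mathcal{H}} \langle w, \Psi(x, y, h) \rangle
$$

Under this reading, the predicted output would be the $y$ that maximizes $s_w(x, y)$, so a strongly negative $h^-$ can veto an output even when $h^+$ scores highly.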
oai:hal.archives-ouvertes.fr:hal-01515640
WILDCAT: Weakly Supervised Learning of Deep ConvNets for Image Classification, Pointwise Localization and Segmentation
International audience
This paper introduces WILDCAT, a deep learning method which jointly aims at aligning image regions for gaining spatial invariance and learning strongly localized features. Our model is trained using only global image labels and is devoted to three main visual recognition tasks: image classification, weakly supervised pointwise object localization and semantic segmentation. WILDCAT extends state-of-the-art Convolutional Neural Networks at three major levels: the use of Fully Convolutional Networks for maintaining spatial resolution, the explicit design in the network of local features related to different class modalities, and a new way to pool these features to provide a global image prediction required for weakly supervised training. Extensive experiments show that our model significantly outperforms the state-of-the-art methods.
IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2017), Jul 2017, Honolulu, HI, United States. http://cvpr2017.thecvf.com/ https://hal.archives-ouvertes.fr/hal-01515640 2017-07-21
Defense
Thesis: Weakly Supervised Learning for Visual Recognition
Defense date: 2017-09-20
Reviewers: Alain RAKOTOMAMONJY, Patrick PÉREZ