Clement Antoine CALAUZENES
Academic status
Thesis defended on 2012-07-19
Subject: Machine learning and inference in large collaborative networks.
Thesis supervision:
Graph legend: blue ellipse: doctoral student, yellow ellipse: PhD holder, green rectangle: permanent staff member, yellow rectangle: HDR. Green line: thesis co-supervisor, blue line: thesis director, dotted line: mid-term evaluation committee or thesis committee.
Scientific output
Learning Scoring Functions with Order-Preserving Losses and Standardized Supervision
We address the problem of designing surrogate losses for learning scoring functions in the context of label ranking. We extend to ranking problems a notion of order-preserving losses previously introduced for multiclass classification, and show that these losses lead to consistent formulations with respect to a family of ranking evaluation metrics. An order-preserving loss can be tailored for a given evaluation metric by appropriately setting some weights depending on this metric and the observed supervision. These weights, called the standard form of the supervision, do not always exist, but we show that previous consistency results for ranking were proved in special cases where they do. We then evaluate a new pairwise loss consistent with the (Normalized) Discounted Cumulative Gain on benchmark datasets.
International Conference on Machine Learning, article in peer-reviewed journal, 2011-06
Calibration and regret bounds for order-preserving surrogate losses in learning to rank
Learning to rank is usually reduced to learning to score individual objects, leaving the "ranking" step to a sorting algorithm. In that context, the surrogate loss used for training the scoring function needs to behave well with respect to the target performance measure which only sees the final ranking. A characterization of such a good behavior is the notion of calibration, which guarantees that minimizing (over the set of measurable functions) the surrogate risk allows us to maximize the true performance. In this paper, we consider the family of order-preserving (OP) losses which includes popular surrogate losses for ranking such as the squared error and pairwise losses. We show that they are calibrated with performance measures like the Discounted Cumulative Gain (DCG), but also that they are not calibrated with respect to the widely used Mean Average Precision and Expected Reciprocal Rank. We also derive, for some widely used OP losses, quantitative surrogate regret bounds with respect to several DCG-like evaluation measures.
Machine Learning, article in peer-reviewed journal, 2013
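The setting described in this abstract (score each object, sort, then evaluate only the final ranking) can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names are hypothetical, the DCG variant (gain 2^rel - 1, log2 discount) is one common convention assumed here, and the squared error between scores and relevance labels stands in for a simple pointwise surrogate.

```python
import math

def ranking_from_scores(scores):
    """Return item indices sorted by decreasing score (ties broken by index)."""
    return sorted(range(len(scores)), key=lambda i: (-scores[i], i))

def dcg(scores, relevance):
    """DCG of the ranking induced by `scores`.

    Assumed convention: gain 2^rel - 1, discount 1 / log2(rank + 1)
    with ranks starting at 1.
    """
    order = ranking_from_scores(scores)
    return sum((2 ** relevance[i] - 1) / math.log2(position + 2)
               for position, i in enumerate(order))

def squared_error(scores, relevance):
    """A pointwise surrogate: squared error between scores and labels (assumed form)."""
    return sum((s - r) ** 2 for s, r in zip(scores, relevance))

relevance = [2, 0, 1]
good_scores = [0.9, 0.1, 0.5]   # ranks the items in order of relevance
bad_scores = [0.1, 0.9, 0.5]    # ranks the irrelevant item first

print(dcg(good_scores, relevance), dcg(bad_scores, relevance))
print(squared_error(good_scores, relevance), squared_error(bad_scores, relevance))
```

The metric depends on the scores only through the induced ordering, whereas the surrogate depends on the score values themselves; calibration is the property that links minimizing the latter to maximizing the former.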
On the (Non-)existence of Convex, Calibrated Surrogate Losses for Ranking
We study surrogate losses for learning to rank, in a framework where the rankings are induced by scores and the task is to learn the scoring function. We focus on the calibration of surrogate losses with respect to a ranking evaluation metric, where the calibration is equivalent to the guarantee that near-optimal values of the surrogate risk imply near-optimal values of the risk defined by the evaluation metric. We prove that if a surrogate loss is a convex function of the scores, then it is not calibrated with respect to two evaluation metrics widely used for search engine evaluation, namely the Average Precision and the Expected Reciprocal Rank. We also show that such convex surrogate losses cannot be calibrated with respect to the Pairwise Disagreement, an evaluation metric used when learning from pairwise preferences. Our results cast light on the intrinsic difficulty of some ranking problems, as well as on the limitations of learning-to-rank algorithms based on the minimization of a convex surrogate risk.
Neural Information Processing Systems, article in peer-reviewed journal, 2012-12
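For readers unfamiliar with the two search-engine metrics named in this abstract, the following Python sketch computes them for a ranking induced by scores. It is an illustration under stated assumptions, not code from the paper: the function names are hypothetical, Average Precision is taken for a single query with binary relevance, and Expected Reciprocal Rank uses the usual cascade form with stopping probability (2^g - 1) / 2^g_max.

```python
def _order(scores):
    """Indices sorted by decreasing score (ties broken by index)."""
    return sorted(range(len(scores)), key=lambda i: (-scores[i], i))

def average_precision(scores, relevant):
    """Average of precision@k over the ranks k that hold a relevant item."""
    hits, total = 0, 0.0
    for rank, i in enumerate(_order(scores), start=1):
        if relevant[i]:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0

def expected_reciprocal_rank(scores, grades, max_grade):
    """ERR under the cascade model: the user stops at an item with
    probability (2^grade - 1) / 2^max_grade (assumed convention)."""
    err, p_continue = 0.0, 1.0
    for rank, i in enumerate(_order(scores), start=1):
        p_stop = (2 ** grades[i] - 1) / 2 ** max_grade
        err += p_continue * p_stop / rank
        p_continue *= 1 - p_stop
    return err

scores = [0.9, 0.5, 0.1]
print(average_precision(scores, [1, 0, 1]))
print(expected_reciprocal_rank(scores, [1, 0, 1], max_grade=1))
```

In both metrics the contribution of an item depends on which items are ranked above it, unlike the position-wise additive form of the DCG.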
Thesis: De la Consistance des Formulations de Substitution Convexes pour l'Ordonnancement (On the Consistency of Convex Surrogate Formulations for Ranking)
Defense: 2012-07-19
Reviewers (rapporteurs): Eyke HÜLLERMEIER, Francis BACH