EDITE: Melissa AILEM
Identity
Melissa AILEM
Academic status
Thesis defended on 2016-11-18
Subject: Modèles vectoriels de documents pour la fouille de textes bio-médicaux : Application à l'identification de relations gènes-maladies (Vector models of documents for biomedical text mining: application to the identification of gene-disease relations)
Thesis supervision:
Laboratory: permanent staff
Neighborhood
Blue ellipse: doctoral student; yellow ellipse: doctor; green rectangle: permanent staff; yellow rectangle: HDR holder. Green line: thesis advisor; blue line: thesis director; dotted line: mid-term evaluation committee or thesis committee.
Scientific output
oai:hal.archives-ouvertes.fr:hal-01306471
Unsupervised text mining for assessing and augmenting GWAS results.
International audience
ISSN: 1532-0464. Journal of Biomedical Informatics, Elsevier, 2016. DOI: 10.1016/j.jbi.2016.02.008. https://hal.archives-ouvertes.fr/hal-01306471 2016-02-19
oai:hal.archives-ouvertes.fr:hal-01306473
Co-clustering Document-term Matrices by Direct Maximization of Graph Modularity
International audience
We present Coclus, a novel diagonal co-clustering algorithm that effectively co-clusters binary or contingency matrices by directly maximizing an adapted version of the modularity measure traditionally used for networks. Some effective co-clustering algorithms already use network-related measures (normalized cut, modularity), but they rely on spectral relaxations of the discrete optimization problems. In contrast, Coclus obtains even better co-clusters by directly maximizing modularity with an iterative alternating optimization procedure. Extensive comparative experiments on various document-term datasets demonstrate that our algorithm is very effective and stable, and that it outperforms other co-clustering algorithms.
CIKM '15 Proceedings of the 24th ACM International on Conference on Information and Knowledge Management, Oct 2015, Melbourne, Australia. ISBN: 978-1-4503-3794-6. DOI: 10.1145/2806416.2806639. https://hal.archives-ouvertes.fr/hal-01306473 2015-10-23
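The alternating procedure mentioned in the abstract above can be illustrated with a minimal Python sketch. This is not the authors' Coclus implementation: the function name `alternate_coclustering`, the toy matrix, and the starting column labels are assumptions made for illustration, and the criterion is one standard bipartite form of modularity (observed counts minus the counts expected from row and column totals).

```python
def alternate_coclustering(matrix, col_labels, n_clusters, n_iter=10):
    """Alternately reassign rows and columns to increase a modularity-style score.

    Hypothetical helper for illustration only; empty clusters are not handled.
    """
    total = sum(sum(row) for row in matrix)
    row_sums = [sum(row) for row in matrix]
    col_sums = [sum(col) for col in zip(*matrix)]
    # Residuals: observed count minus count expected under independence.
    resid = [[a - row_sums[i] * col_sums[j] / total for j, a in enumerate(row)]
             for i, row in enumerate(matrix)]
    n_rows, n_cols = len(matrix), len(matrix[0])
    col_labels = list(col_labels)
    row_labels = [0] * n_rows
    for _ in range(n_iter):
        # Row step: with column labels fixed, give each row its best cluster.
        new_rows = []
        for i in range(n_rows):
            scores = [0.0] * n_clusters
            for j in range(n_cols):
                scores[col_labels[j]] += resid[i][j]
            new_rows.append(max(range(n_clusters), key=scores.__getitem__))
        # Column step: with the new row labels fixed, do the same for columns.
        new_cols = []
        for j in range(n_cols):
            scores = [0.0] * n_clusters
            for i in range(n_rows):
                scores[new_rows[i]] += resid[i][j]
            new_cols.append(max(range(n_clusters), key=scores.__getitem__))
        if new_rows == row_labels and new_cols == col_labels:
            break  # no label changed, so the procedure has converged
        row_labels, col_labels = new_rows, new_cols
    return row_labels, col_labels


# Toy document-term matrix with two obvious diagonal blocks (made-up data).
toy = [
    [3, 2, 0, 0],
    [2, 3, 0, 0],
    [0, 0, 3, 2],
    [0, 0, 2, 3],
]
rows, cols = alternate_coclustering(toy, col_labels=[0, 0, 0, 1], n_clusters=2)
print(rows, cols)
```

Starting from a deliberately poor column assignment, the sketch recovers the two diagonal blocks on this toy matrix. Like any alternating scheme of this kind, it can stop at a local optimum, so the result depends on the initialization.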
oai:hal.archives-ouvertes.fr:hal-01343616
Graph Modularity Maximization as an Effective Method for Co-clustering Text Data
International audience
In this paper we show how the modularity measure can serve as a useful criterion for co-clustering document-term matrices. We present and investigate the performance of CoClus, a novel, effective block-diagonal co-clustering algorithm which directly maximizes this modularity measure. The maximization is performed using an iterative alternating optimization procedure, in contrast to algorithms that use spectral relaxations of the discrete optimization problems. Extensive comparative experiments performed on various document-term datasets demonstrate that this approach is very effective, stable, and outperforms other block-diagonal co-clustering algorithms devoted to the same task. Another important advantage of using modularity in the co-clustering context is that it provides a novel, simple way of determining the appropriate number of co-clusters. Availability: an implementation of CoClus is included in the recently released coclust Python package, available at: https://pypi.python.org/pypi/coclust
ISSN: 0950-7051. Knowledge-Based Systems, Elsevier, 2016. DOI: 10.1016/j.knosys.2016.07.002. https://hal.archives-ouvertes.fr/hal-01343616 2016-07-04
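To make the idea of using modularity to compare numbers of co-clusters concrete, here is a small Python sketch that scores candidate diagonal co-clusterings of a toy matrix. The function `bipartite_modularity`, the toy data, and the formula used (a common bipartite form of modularity) are assumptions for illustration. This is not the coclust package's API, and the paper's exact selection procedure may differ.

```python
def bipartite_modularity(matrix, row_labels, col_labels):
    """Score a diagonal co-clustering of a non-negative matrix.

    Hypothetical helper: sums, over (row, column) pairs placed in the same
    co-cluster, the observed count minus the count expected from the row
    and column totals, then normalizes by the grand total.
    """
    total = sum(sum(row) for row in matrix)
    row_sums = [sum(row) for row in matrix]
    col_sums = [sum(col) for col in zip(*matrix)]
    score = 0.0
    for i, row in enumerate(matrix):
        for j, a_ij in enumerate(row):
            if row_labels[i] == col_labels[j]:
                score += a_ij - row_sums[i] * col_sums[j] / total
    return score / total


# Toy document-term matrix with two diagonal blocks (made-up data).
toy = [
    [3, 2, 0, 0],
    [2, 3, 0, 0],
    [0, 0, 3, 2],
    [0, 0, 2, 3],
]

# Candidate partitions with 1, 2, and 4 co-clusters.
candidates = {
    1: ([0, 0, 0, 0], [0, 0, 0, 0]),
    2: ([0, 0, 1, 1], [0, 0, 1, 1]),
    4: ([0, 1, 2, 3], [0, 1, 2, 3]),
}
scores = {k: bipartite_modularity(toy, r, c) for k, (r, c) in candidates.items()}
best_k = max(scores, key=scores.get)
print(scores, best_k)
```

On this toy matrix the two-block partition scores 0.5, against 0.0 for a single co-cluster and 0.35 for four, so the score peaks at the number of blocks actually present in the data.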
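Defense
Thesis: Modèles vectoriels de documents pour la fouille de textes bio-médicaux : Applications à l'identification de relations gènes-maladies (Vector models of documents for biomedical text mining: applications to the identification of gene-disease relations)
Defense: 2016-11-18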