EDITE Ahmad ASSAF
Identity
Ahmad ASSAF
Academic status
Thesis defended on 2015-12-18
Subject: A software framework for business intelligence (original title: UN CANEVAS LOGICIEL POUR INFORMATIQUE DECISIONNELLE)
Thesis supervision:
Laboratory:
Neighborhood
Blue ellipse: doctoral student; yellow ellipse: doctor; green rectangle: permanent staff member; yellow rectangle: HDR (habilitation to supervise research). Green line: thesis advisor; blue line: thesis director; dotted line: mid-term evaluation committee or thesis committee.
Scientific output
oai:hal.archives-ouvertes.fr:hal-00758681
Improving Schema Matching with Linked Data
With today's public data sets containing billions of data items, more and more companies are looking to integrate external data with their traditional enterprise data to improve business intelligence analysis. These distributed data sources, however, exhibit heterogeneous data formats and terminologies and may contain noisy data. In this paper, we present a novel framework that enables business users to semi-automatically perform data integration on potentially noisy tabular data. This framework offers an extension to Google Refine with novel schema matching algorithms leveraging Freebase rich types. First experiments show that using Linked Data to map cell values to instances and column headers to types significantly improves the quality of the matching results and should therefore lead to more informed decisions.
preprint 2012-05-15
oai:hal.archives-ouvertes.fr:hal-00823586
Data Quality Principles in the Semantic Web
The increasing size and availability of web data make data quality a core challenge in many applications. Principles of data quality are recognized as essential to ensure that data are fit for their intended use in operations, decision-making, and planning. However, with the rise of the Semantic Web, new data quality issues appear and require deeper consideration. In this paper, we propose to extend the data quality principles to the context of the Semantic Web. Based on our extensive industrial experience in data integration, we identify five main classes suited for data quality in the Semantic Web. For each class, we list the principles that are involved at all stages of the data management process. Following these principles will provide a sound basis for better decision-making within organizations and will maximize long-term data integration and interoperability.
Proceedings of the 2012 IEEE Sixth International Conference on Semantic Computing
conference proceeding 2012
oai:hal.archives-ouvertes.fr:hal-00823583
RUBIX, A Framework for Improving Data Integration with Linked Data
With today's public data sets containing billions of data items, more and more companies are looking to integrate external data with their traditional enterprise data to improve business intelligence analysis. These distributed data sources, however, exhibit heterogeneous data formats and terminologies and may contain noisy data. In this paper, we present RUBIX, a novel framework that enables business users to semi-automatically perform data integration on potentially noisy tabular data. This framework offers an extension to Google Refine with novel schema matching algorithms leveraging Freebase rich types. First experiments show that using Linked Data to map cell values to instances and column headers to types significantly improves the quality of the matching results and should therefore lead to more informed decisions.
WOD '12 Proceedings of the First International Workshop on Open Data
conference proceeding 2012-09-08
oai:hal.archives-ouvertes.fr:hal-00873637
SNARC - An Approach for Aggregating and Recommending Contextualized Social Content
The Internet has created a paradigm shift in how we consume and disseminate information. Data nowadays is spread over heterogeneous silos of archived and live data. People willingly share data on social media by posting news, views, presentations, pictures and videos. SNARC is a service that uses semantic web technology and combines services available on the web to aggregate social news. SNARC brings the user live and archived information that is directly related to the page they are viewing. The key advantage is instantaneous access to complementary information without the need to dig for it. Information appears when it is relevant, enabling the user to focus on what is really important.
ESWC 13
conference proceeding 2013
oai:hal.archives-ouvertes.fr:hal-01082169
What are the Important Properties of an Entity? Comparing Users and Knowledge Graph Point of View
International audience
Entities play a key role in knowledge bases in general and in the Web of Data in particular. Entities are generally described with many properties, as is the case in DBpedia. It is, however, difficult to assess which properties are more "important" than others for particular tasks such as visualizing the key facts of an entity or selecting the ones that will yield better instance matching. In this paper, we reverse engineer the Google Knowledge Graph panel to find out which properties are the most "important" for an entity according to Google. We compare these results with a survey we conducted on 152 users. Finally, we show how this knowledge can be represented and made explicit using the Fresnel vocabulary.
Extended Semantic Web Conference, ESWC 2014 Satellite Events, May 2014, Anissaras, Crete, Greece. https://hal.archives-ouvertes.fr/hal-01082169 DOI: 10.1007/978-3-319-11955-7_16
Poster communications 2014-05-25
Defense
Thesis: "Fourniture automatisée de données par l'enrichissement sémantique" (Automated data provisioning through semantic enrichment)
Defense date: 2015-12-18