Semantic Technologies for Vehicle Data
Subject proposed by
Thesis supervisor:
Research unit
EURECOM research laboratory
Field: Information and communication sciences and technologies
Motivation and Research Challenges
Sensor data collected from connected vehicles and shared through online services is currently limited mostly to speed and location, which can be used for traffic estimation. The next step is to extend the collected sensor data to include pre-defined or recognized speed limits, lanes on the road, slopes, etc. Our vision leads to an autonomous vehicle that is able to build a detailed environment model based on multiple sensors.
The diversity of available sensors across devices and manufacturers, and across different ecosystems including personal devices, represents a first challenge. While the quality of sensors installed in vehicles is increasing, managing different sensor types that have their own strengths and weaknesses (“sensor honesty”), that offer different qualities of measurement, and that operate in various contexts represents an important problem. In particular, we will aim to use semantic technologies to provide a proper interpretation of the raw data that is being captured.
Sensor data can benefit from being integrated with external knowledge bases. However, such semantic integration is a second challenge, in particular when knowledge bases change constantly, as in the case of OpenStreetMap, where crowd-sourced geographic data about new roads and speed limits is continuously updated.
A third challenge consists of bringing together all sensor data coming from multiple vehicles in a central backend. Fusing data from different vehicles, coming from different manufacturers, will increase the need for contextual information. The use of semantic sensor web technologies has the potential to effectively address those challenges.
An aspect of growing importance when dealing with data from sensors is data privacy. The vehicle has become a data-driven machine. Sharing data between different vehicles, and data originating from different drivers, requires: i) user privacy and awareness, such as transparency and opt-in mechanisms, ii) anonymity or pseudonymity of data, as well as technical means to incorporate iii) data provenance and iv) information flow control.
Connected autonomous vehicles will need access to all kinds of data about their immediate surroundings that are up to date, comprehensive and geographically referenced, as well as to cross-OEM crowd-sourced data and real-time data analytics in the cloud. We will investigate methodologies for communicating and processing these data between vehicles and the cloud.
A relevant research topic in this scenario is data integration at the semantic level, using external or crowd-sourced knowledge bases as much as possible. How can such a heterogeneous system, with many different stakeholders and completely different sensor types (LIDAR, radar, heat, weather, ...), work reliably together with data from uncertain sources? How can information about data type, accuracy, freshness and relevance be shared between arbitrary partners within the automotive industry and smart-city data centers? If one sensor is exchanged for a different one, how will the system be able to adapt itself and incorporate a new sensor type? How can knowledge about the interpretation of certain sensor data be shared and exchanged? We propose to use Linked Data technologies, combined with the selection and modelling of suitable ontologies, as key elements for such a connected system. Linking ontologies from different domains (driving, health, navigation, etc.) will be a key enabler for new classes of use cases that exploit the potential of the Internet of Things.
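As a purely illustrative sketch of the questions above, the following Python fragment shows how a machine-readable sensor description carrying data type, unit, accuracy and freshness could let a consumer decide whether a replacement sensor is usable. The class `SensorDescription`, its field names, the helper functions and the unit table are hypothetical and are not taken from any existing ontology or standard.

```python
from dataclasses import dataclass

# Conversion factors to SI (m/s) for the units used in this sketch (assumption).
TO_SI = {"m/s": 1.0, "km/h": 1.0 / 3.6}


@dataclass(frozen=True)
class SensorDescription:
    """Hypothetical metadata a sensor would publish alongside its data."""
    sensor_id: str
    observed_property: str  # what is measured, e.g. "speed"
    unit: str               # unit of the readings, a key of TO_SI
    accuracy: float         # +/- error bound, expressed in `unit`
    max_age_s: float        # how long a reading stays valid, in seconds


def to_si(desc: SensorDescription, value: float) -> float:
    """Convert a reading to SI using the unit declared in the description."""
    return value * TO_SI[desc.unit]


def usable(desc: SensorDescription, prop: str, max_error_si: float,
           age_s: float) -> bool:
    """Check property, accuracy (compared in SI) and freshness of a reading."""
    same_property = desc.observed_property == prop
    accurate = to_si(desc, desc.accuracy) <= max_error_si
    fresh = age_s <= desc.max_age_s
    return same_property and accurate and fresh


# Two interchangeable speed sources described with the same vocabulary.
wheel = SensorDescription("wheel-1", "speed", "km/h", accuracy=1.0, max_age_s=0.5)
gps = SensorDescription("gps-1", "speed", "m/s", accuracy=0.5, max_age_s=2.0)
candidates = [d for d in (wheel, gps) if usable(d, "speed", 0.3, age_s=0.2)]
```

Because the consumer reasons only over the published description, exchanging one sensor for another requires a new description rather than new consumer code; in a Linked Data setting these fields would be expressed with terms from shared ontologies instead of ad hoc attribute names.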
Data privacy and information flow control.
A technical necessity for the use of personal and private data in a distributed system is a communication mechanism that is aware of data provenance and enforces local and system-wide policies. An approach to establishing such communication mechanisms may be borrowed from information security research: information classification and data leakage prevention (Croft and Caesar, 2011), as well as marshalling of networked data between entities of distributed sensor networks (Park and Heidemann, 2008). Enhancing data with provenance and following its flow through intermediate computation nodes may be an enabler for enhancing privacy in complex and distributed systems.
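A minimal sketch of this idea, under assumptions of our own: each data item carries a set of sensitivity labels and an ordered provenance trail, every computation node propagates both, and a release check compares the labels against what a recipient is cleared to receive. The names `Tagged`, `derive` and `may_release`, and the label strings, are hypothetical; this is not the mechanism of the cited works.

```python
from dataclasses import dataclass
from typing import Any, Callable, FrozenSet, Tuple


@dataclass(frozen=True)
class Tagged:
    """A value together with its sensitivity labels and provenance trail."""
    value: Any
    labels: FrozenSet[str]       # e.g. {"location", "speed"}
    provenance: Tuple[str, ...]  # ordered names of the nodes that touched it


def derive(node: str, fn: Callable[..., Any], *inputs: Tagged) -> Tagged:
    """Compute a new item at `node`; labels and provenance are inherited."""
    value = fn(*(i.value for i in inputs))
    # The result is at least as sensitive as every input (label union).
    labels = frozenset().union(*(i.labels for i in inputs))
    # Merge the input trails in order, without duplicates, then add this node.
    trail = tuple(dict.fromkeys(p for i in inputs for p in i.provenance))
    return Tagged(value, labels, trail + (node,))


def may_release(item: Tagged, clearances: FrozenSet[str]) -> bool:
    """Policy check: release only if the recipient is cleared for all labels."""
    return item.labels <= clearances


# Usage: fusing position and speed yields an item that is location-sensitive.
gps = Tagged((43.61, 7.05), frozenset({"location"}), ("vehicle-A/gps",))
speed = Tagged(62.0, frozenset({"speed"}), ("vehicle-A/odometer",))
fused = derive("backend/fusion", lambda pos, v: {"pos": pos, "kmh": v}, gps, speed)
```

The label union is deliberately conservative: a derived value can never be released to a recipient that was not cleared for all of its inputs, and the provenance trail records which nodes contributed to it.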
Linked Data and semantic sensor networks
The emerging Machine-to-Machine (M2M) field enables machines to communicate with each other without human intervention. Existing semantic sensor networks are domain-specific and add semantics to the context. Gyrard et al. propose an architecture that merges heterogeneous sensor networks and adds semantics to the measured data rather than to the context. This architecture makes it possible to: (1) get sensor measurements, (2) enrich sensor measurements with semantic web technologies, domain ontologies and Linked Open Data, and (3) reason on these semantic measurements with semantic tools, machine learning algorithms and recommender systems to provide promising applications.
Complex challenges arise from multiple and varied technological prerequisites: enhanced and cooperative data processing, data security, human-machine interaction concepts, geo-localization, and environment and driver perception. Extending the fully networked vehicle with the Internet of Things (IoT) leads to even further opportunities and research challenges in the area of connected mobility. These include technical aspects such as highly scalable vehicle electronics, dependable communication systems, big data fusion and driver assistance systems. In particular, data carrying individual semantic representations from different sensor domains will have to be aggregated, processed and compiled. This requires a comparative assessment of meta-information and data attributes in order to establish a common semantic framework for data from individual sources. In addition, non-technical challenges have to be addressed from the psychological, socio-economic, legal, and business-oriented fields.
In a perfect world, all ‘things’ of an IoT would transmit information that shares the same structure and can be widely understood. Unfortunately, such a world is far from being real: in a general Internet of Vehicles, data has local meaning, and it is difficult to build a common understanding. The challenge is therefore, first, to structure crowd-sourced data into commonly understood semantics and, second, to derive new knowledge from it.
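The three steps above can be sketched as follows. This is an illustration under our own assumptions, not the implementation of the cited architecture: triples are plain Python tuples rather than a real RDF store, the `http://example.org/vehicle/` namespace and the `SpeedLimitExceeded` class are hypothetical, the observation terms are borrowed from the W3C SOSA vocabulary, and the speed limit is assumed to come from an external knowledge base.

```python
from dataclasses import dataclass

SOSA = "http://www.w3.org/ns/sosa/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "http://example.org/vehicle/"  # hypothetical namespace for this sketch


@dataclass(frozen=True)
class Measurement:
    """Step 1: a raw reading as delivered by a sensor."""
    sensor_id: str
    prop: str
    value: float
    unit: str
    timestamp: str


def enrich(m: Measurement, obs_id: str) -> set:
    """Step 2: describe the measurement itself as subject-predicate-object triples."""
    obs = EX + "obs/" + obs_id
    return {
        (obs, RDF_TYPE, SOSA + "Observation"),
        (obs, SOSA + "madeBySensor", EX + "sensor/" + m.sensor_id),
        (obs, SOSA + "observedProperty", EX + "property/" + m.prop),
        (obs, SOSA + "hasSimpleResult", m.value),
        (obs, EX + "unit", m.unit),  # hypothetical unit predicate
        (obs, SOSA + "resultTime", m.timestamp),
    }


def infer_speeding(triples: set, limit_kmh: float) -> set:
    """Step 3: one hand-written rule; a real system would use a reasoner."""
    speed_obs = {s for (s, p, o) in triples
                 if p == SOSA + "observedProperty" and o == EX + "property/speed"}
    in_kmh = {s for (s, p, o) in triples if p == EX + "unit" and o == "km/h"}
    inferred = set()
    for (s, p, o) in triples:
        if (s in speed_obs and s in in_kmh
                and p == SOSA + "hasSimpleResult" and o > limit_kmh):
            inferred.add((s, RDF_TYPE, EX + "SpeedLimitExceeded"))
    return inferred


# Usage: the limit (50 km/h) stands in for a value from an external source.
reading = Measurement("wheel-1", "speed", 62.0, "km/h", "2015-06-01T12:00:00Z")
graph = enrich(reading, "42")
new_facts = infer_speeding(graph, limit_kmh=50.0)
```

Attaching the semantics to each measurement, rather than only to the sensing context, is what allows the rule in step 3 to be written against properties and units without knowing which sensor or manufacturer produced the reading.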
ASAM, Association for Standardization of Automation and Measuring Systems:
W3C Automotive activity, http://www.w3.org/auto/
• W3C and Automotive Industry Start New Web Standards Work for Connected Cars, http://www.w3.org/2015/02/auto.html.en
• Vehicle Information Access API:
• Vehicle Data: http://rawgit.com/w3c/automotive/master/vehicle_data/data_spec.html
• Vehicle Signal Specification: https://github.com/PDXostc/vehicle_signal_specification
W3C Web of Things activity, http://www.w3.org/WoT/
• Web of Things workshop report: http://www.w3.org/2014/02/wot/report.html
• Web of Things Interest Group: http://www.w3.org/2014/12/wot-ig-charter.html
• Web of Things Community Group: http://www.w3.org/community/wot/
The European Union Location Framework (EULF) project has just released a video on its Transportation pilot scheme. The pilot will facilitate the exchange of safety-related road data attributes, such as speed limits and access restrictions, between public road authorities and commercial map providers.
Croft J, Caesar M. Towards Practical Avoidance of Information Leakage in Enterprise Networks. In: 6th USENIX Workshop on Hot Topics in Security, 2011.
Park U, Heidemann J. Provenance in Sensornet Republishing. In: Provenance and Annotation of Data and Processes, 2008, pp. 280–292.
Gyrard A, Bonnet C, Boudaoud K. A Machine-to-Machine Architecture to Merge Semantic Sensor Measurements. In: 22nd International World Wide Web Conference, Doctoral Consortium, May 13-17, 2013, Rio de Janeiro, Brazil.