Analytic query answering in a Semantic Data Lake / Diamantini, Claudia; Potena, Domenico; Storti, Emanuele. - 3478:(2023), pp. 369-378. (Paper presented at the 31st Symposium of Advanced Database Systems, held in Galzignano Terme, Italy, July 2nd to 5th, 2023).

Analytic query answering in a Semantic Data Lake

Claudia Diamantini;Domenico Potena;Emanuele Storti
2023-01-01

Abstract

Managing Data Lakes is challenging because of the flexibility they offer in data storage and the fast-changing, diverse data they handle. Identifying relevant sources for analysis requires making sense of disparate data, which is especially important in data science applications where users need to analyze statistical measures drawn from multiple heterogeneous sources. This paper presents a knowledge-based approach for a Semantic Data Lake that enables efficient integration of data sources and their alignment to a Knowledge Graph, which represents indicators of interest, their mathematical formulas, and dimensions of analysis. A query-driven discovery approach dynamically identifies, integrates, and ranks the sources needed to answer a given analytical query.

Use this identifier to cite or link to this document: https://hdl.handle.net/11566/321671