SQL2SOLID (paper published, perfect grade)

TL;DR

Developed a pipeline that dynamically exposes relational SQL data as RDF triples via Solid pods, enabling secure, privacy-respecting linked data sharing and real-time interoperability.

Motivation

The SQL2SOLID project was inspired by the growing need for interoperability and secure data sharing across siloed relational databases. It started in a university seminar on Linked Data and the Semantic Web, where my team set out to bridge the gap between traditional relational data (such as SQL tables) and the Semantic Web.

The question was: How can we expose valuable relational data as linked data while preserving strict access control and privacy?

Approach

We designed SQL2SOLID as a lightweight, real-time integration layer that sits on top of existing relational databases - specifically, PostgreSQL for this implementation. The system combined:

  • Ontop (R2RML-like mappings): Used to map relational tables and their relationships into RDF triples dynamically.
  • Solid Server: Provided granular authentication and authorization based on Solid’s decentralized identity system.
  • Dockerized Deployment: The entire stack - Ontop, Solid Server, PostgreSQL - was containerized for ease of deployment and consistent environments.
  • Real-Time Transformation: Unlike static ETL pipelines, our approach ensured that any updates in the relational database were instantly reflected in the linked data, keeping everything current and reliable.
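The idea behind the mapping and the real-time transformation can be illustrated with a minimal sketch. It is not the project's Ontop configuration: it assumes an in-memory SQLite database as a stand-in for PostgreSQL, and the `person` table, the `BASE` IRI, and the `PERSON_MAPPING` dictionary are hypothetical names chosen for illustration. The point it shows is that triples are built from the table at request time, so an update in SQL appears in the next RDF view without any export step.

```python
import sqlite3

# Hypothetical base IRI and vocabulary, used only for this illustration.
BASE = "http://example.org/"

# Hypothetical mapping from column name to predicate IRI
# (a hand-written stand-in for what a mapping file would declare).
PERSON_MAPPING = {
    "name": BASE + "vocab/name",
    "email": BASE + "vocab/email",
}


def person_triples(conn):
    """Build (subject, predicate, object) triples from the current rows."""
    triples = []
    cursor = conn.execute("SELECT id, name, email FROM person ORDER BY id")
    for row_id, name, email in cursor:
        subject = f"{BASE}person/{row_id}"
        values = {"name": name, "email": email}
        for column, predicate in PERSON_MAPPING.items():
            # NULL columns produce no triple.
            if values[column] is not None:
                triples.append((subject, predicate, values[column]))
    return triples


def demo():
    """Show that a SQL update is visible in the next triple view."""
    conn = sqlite3.connect(":memory:")  # stand-in for PostgreSQL
    conn.execute(
        "CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, email TEXT)"
    )
    conn.execute("INSERT INTO person VALUES (1, 'Ada', 'ada@example.org')")
    before = person_triples(conn)
    conn.execute("UPDATE person SET name = 'Ada L.' WHERE id = 1")
    after = person_triples(conn)
    conn.close()
    return before, after


if __name__ == "__main__":
    before, after = demo()
    print(before)
    print(after)
```

Because nothing is materialized, the relational database stays the only place where data is stored, which is the main difference from a static ETL export.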

My Contributions

We were a two-person team and worked closely together on every part of the project:

  • We designed the overall architecture and containerized the system.
  • We implemented and configured the Ontop mappings to ensure accurate RDF generation.
  • We handled system integration testing, ensuring the Solid Server correctly enforced access restrictions down to the triple level.
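To make "down to the triple level" concrete, here is a small sketch of the kind of check an integration test could perform. It does not use the Solid Server's API or its real access control rules: the `POLICY` dictionary, the WebIDs, and the predicate IRIs are hypothetical, and the per-predicate policy is a simplification chosen only to show the filtering step.

```python
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical predicate IRIs for this illustration.
NAME = "http://example.org/vocab/name"
EMAIL = "http://example.org/vocab/email"

# Hypothetical policy: which predicates each agent (WebID) may read.
# In a real deployment this decision would come from the Solid server.
POLICY: Dict[str, Set[str]] = {
    "https://alice.example/profile#me": {NAME, EMAIL},
    "https://bob.example/profile#me": {NAME},
}


def visible_triples(triples: List[Triple], agent: str) -> List[Triple]:
    """Return only the triples whose predicate the agent may read."""
    allowed = POLICY.get(agent, set())  # unknown agents see nothing
    return [t for t in triples if t[1] in allowed]


SAMPLE: List[Triple] = [
    ("http://example.org/person/1", NAME, "Ada"),
    ("http://example.org/person/1", EMAIL, "ada@example.org"),
]

if __name__ == "__main__":
    for agent in list(POLICY) + ["https://eve.example/profile#me"]:
        print(agent, visible_triples(SAMPLE, agent))
```

A test in this style asserts both directions: an authorized agent receives the expected triples, and an agent without permission receives none of the restricted ones.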

Achievements

The project achieved all its key goals:

  • Real-time RDF exposure of relational data, maintaining the relational DB as the single source of truth.
  • Fine-grained access control for each data element, respecting strict data privacy needs.
  • Strong recognition within the academic community - our revised paper was accepted at the Solid Symposium's poster session.

Foundation for future work: The research group has continued the project, led by my former teammate, who is now a member of the group. One extension, which supports writing data back to SQL via Solid, has also been accepted for publication.

Learnings

SQL2SOLID taught me the practical challenges of bridging relational and graph-based data:

  • How to map complex relational schemas to RDF while preserving semantics.
  • How to ensure privacy and security in decentralized data sharing scenarios.
  • The power of real-time linked data for interoperability in research and enterprise applications (e.g., training data for AI).