Castor: a Constraint-based SPARQL Engine with Active Filter Processing
Lundi 16 décembre 2013 à 14h30
Auditoire BARB 94 Place Sainte Barbe, 1 1348 Louvain-la-Neuve (Affiche )
SPARQL is the standard query language for graphs of data in the Semantic
Web. Evaluating queries is closely related to graph matching problems,
and has been shown to be NP-hard. State-of-the-art SPARQL engines solve
queries with traditional relational database technology. Such an
approach works well for simple queries that provide a clearly defined
starting point in the graph. However, queries encompassing the whole
graph and involving complex filtering conditions do not scale well.
In this thesis we propose to solve SPARQL queries with Constraint
Programming (CP). CP solves a combinatorial problem by exploiting the
constraints of the problem to prune the search tree when looking for
solutions. Such technique has been shown to work well for graph matching
problems. We reformulate the SPARQL semantics by means of constraint
satisfaction problems (CSPs). Based on this denotational semantics, we
propose an operational semantics that can be used by off-the-shelf
CP solvers.
Off-the-shelf CP solvers are not designed to handle the huge domains
that come with Semantic Web databases though. To handle large databases,
we introduce Castor, a new SPARQL engine embedding a specialized
lightweight CP solver. Special care has been taken to avoid as much as
possible data structures and algorithms whose time or space complexity
are proportional to the database size.
Experimental evaluations on well-known benchmarks show the feasibility
and efficiency of the approach. Castor is competitive with
state-of-the-art SPARQL engines on simple queries, and outperforms them
on complex queries where filters can be actively exploited during the
search.