Databases and Information Systems

Sempala

Sempala is a SPARQL-over-SQL approach that provides interactive-time SPARQL query processing on Hadoop. It stores RDF data in a columnar layout (Parquet) on HDFS and uses Impala, a massively parallel processing (MPP) SQL query engine for Hadoop, as the execution layer on top of it. SPARQL queries are translated into Impala SQL for execution.
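
To illustrate the general SPARQL-over-SQL idea, the following minimal sketch translates a basic graph pattern into a single SQL query with self-joins. It is not Sempala's translator: the table name `triples`, its columns `s`, `p`, `o`, and the function `bgp_to_sql` are assumptions made for this example only, and Sempala's actual storage schema may differ.

```python
def bgp_to_sql(patterns, table="triples"):
    """Translate (s, p, o) triple patterns into one SQL SELECT.

    Terms starting with '?' are variables; all other terms are constants.
    The table layout (one row per triple, columns s, p, o) is hypothetical.
    """
    select, where, bound = [], [], {}
    for i, pattern in enumerate(patterns):
        alias = f"t{i}"
        for column, term in zip(("s", "p", "o"), pattern):
            ref = f"{alias}.{column}"
            if term.startswith("?"):
                if term in bound:
                    # Variable seen before: emit a join condition.
                    where.append(f"{ref} = {bound[term]}")
                else:
                    # First occurrence: bind the variable and project it.
                    bound[term] = ref
                    select.append(f"{ref} AS {term[1:]}")
            else:
                # Constant: emit an equality filter (quotes doubled for SQL).
                escaped = term.replace("'", "''")
                where.append(f"{ref} = '{escaped}'")
    from_clause = ", ".join(f"{table} t{i}" for i in range(len(patterns)))
    sql = f"SELECT {', '.join(select) or '*'} FROM {from_clause}"
    if where:
        sql += " WHERE " + " AND ".join(where)
    return sql


# Example: names of all students (prefixed names used as plain strings here).
print(bgp_to_sql([("?x", "rdf:type", "ub:Student"), ("?x", "ub:name", "?n")]))
```

Each triple pattern becomes one alias of the table, a shared variable becomes a join condition, and a constant becomes an equality filter.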

Related Publications

  • Alexander Schätzle, Martin Przyjaciel-Zablocki, Antony Neu, Georg Lausen:
    Sempala: Interactive SPARQL Query Processing on Hadoop.
    Proc. of the 13th International Semantic Web Conference (ISWC 2014). Riva del Garda (Italy).

Requirements

  • Apache Hadoop. We recommend Cloudera's Distribution of Hadoop (CDH); the implementation is tested with CDH 4.5.
  • Cloudera Impala, which is included in CDH.
  • RDF data in (extended) N-Triples format. Beyond the standard N-Triples syntax, the loader also accepts the most commonly used prefixes as well as the prefixes used in the SP2Bench, LUBM and BSBM benchmarks.
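
As a rough illustration of what accepting prefixes on top of N-Triples can mean, the sketch below expands prefixed names in a single line to full IRIs. It is not Sempala's loader: the `PREFIXES` table, the assumed `prefix:local` syntax, and the functions `expand_term` and `parse_line` are hypothetical, and literals are passed through unchanged.

```python
import re

# Hypothetical prefix table; the prefixes Sempala actually supports are not listed here.
PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
}

# One term: a full IRI, a quoted literal (optionally typed or language-tagged),
# or any other run of non-whitespace characters (prefixed name, blank node).
TERM = re.compile(r'<[^>]*>|"(?:[^"\\]|\\.)*"(?:\^\^\S+|@[A-Za-z-]+)?|\S+')


def expand_term(term, prefixes=PREFIXES):
    """Return the term with a known prefix expanded to a full IRI."""
    if term.startswith(("<", '"', "_:")):
        return term  # full IRI, literal, or blank node: keep as-is
    prefix, sep, local = term.partition(":")
    if sep and prefix in prefixes:
        return f"<{prefixes[prefix]}{local}>"
    raise ValueError(f"unknown prefix in term: {term!r}")


def parse_line(line, prefixes=PREFIXES):
    """Split one triple line into (subject, predicate, object)."""
    body = line.strip()
    if not body.endswith("."):
        raise ValueError("triple must end with '.'")
    terms = TERM.findall(body[:-1])
    if len(terms) != 3:
        raise ValueError(f"expected 3 terms, got {len(terms)}")
    return tuple(expand_term(t, prefixes) for t in terms)


print(parse_line("<http://example.org/a> rdf:type <http://example.org/B> ."))
```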

Downloads

05/2014: Sempala v1.0 available for download (source and binaries)

Version        Description           File
Sempala v1.0   Readme and Binaries   Sempala_v1.0_bin.tar
Sempala v1.0   Sources               Sempala_v1.0_src.tar

SVN

You can also check out the source code of Sempala from the following Subversion repository.
(username: anonymous, password: anonymous)
https://dbissvn.informatik.uni-freiburg.de/intern/Projekte/DiPoS/Sempala/tags
