- Projects:
- Chase algorithm *NEW* -
The chase algorithm is an important tool for many database applications, such as query
optimization, answering queries using views, data exchange, and data integration. The
basic idea of the chase is quite simple: given a database and a set of integrity
constraints, e.g. functional dependencies or inclusion dependencies, the chase repairs
constraint violations in the database instance.
- GCX - The G(arbage) C(ollected) X(Query) engine is an in-memory XQuery engine. GCX is the first streaming XQuery engine that implements active garbage collection, a novel buffer management strategy in which both static and dynamic analysis are exploited. This technique actively purges main memory buffers at runtime based on the current status of query evaluation, which leads to memory-efficient and fast XQuery processing.
- MapReduce & PigLatin *NEW* -
MapReduce is a distributed computing paradigm for processing large quantities of data on commodity hardware. It was originally
developed at Google for information retrieval and data mining tasks; we are investigating its applicability
to database-related problems, such as query answering over large RDF stores.
- Query Workflows over Web Data sources -
Most of the information needed for daily tasks is available on the Web. The
main problem is often not obtaining the information, but processing it automatically.
It is often easier to design a process for solving
such a problem than to state a single query. Furthermore, most of the data is not
immediately available for querying, but is kept in the Deep Web.
- Service Discovery for Annotated Documents - Embedding annotations into web pages makes it possible to
relate their content automatically to other available data and services. Based on this information,
additional features can be integrated directly into the page, generating individual user interfaces.
An important step in this process is the discovery of matching services.
- SP²Bench - The SP²Bench SPARQL Performance Benchmark has been designed to test the performance of SPARQL engines and detect deficiencies in their evaluation strategy. SP²Bench comes with a data-generator for creating arbitrarily large RDF documents, implementing data correlations encountered in the well-known DBLP scenario, and 17 benchmark queries designed to test typical SPARQL operator constellations and RDF access patterns.
- TEDI - A top-k keyword search query on an RDF graph finds the top-k answers according to some ranking criteria, where each answer is a substructure of the graph containing all query keywords. We propose a top-k keyword search method based on TEDI (TreE Decomposition based Indexing), an indexing and query processing scheme for shortest path computation.
- RDF(S) Reasoning and Query Answering in a Peer-to-Peer Framework - Current centralized RDF databases are limited both in their failure
tolerance and in their scalability, and their capacities will not be
sufficient to handle the anticipated load of Semantic Web data available in the
future. Efficient distributed databases are therefore a necessary precondition for
the acceptance of the Semantic Web. Peer-to-Peer networks can offer a foundation layer
for such distributed databases.
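
To illustrate the basic idea of the chase described in the project list above, here is a minimal sketch in Python. It covers only functional dependencies on a single relation, and it is not taken from any of the projects: the `Null` class, the `chase_fds` function, and the employee table are hypothetical names chosen for this example. Unknown values are modeled as labeled nulls. A violated dependency is repaired by replacing a null with the value it must equal, and the chase fails if two distinct constants would have to be equal.

```python
from itertools import combinations


class Null:
    """A labeled null: a placeholder for an unknown value (illustrative helper)."""

    def __init__(self, label):
        self.label = label

    def __repr__(self):
        return f"_{self.label}"


def chase_fds(rows, fds):
    """Chase a single relation with functional dependencies.

    rows: list of dicts mapping attribute name -> constant or Null
    fds:  list of (lhs_attributes, rhs_attribute) pairs
    Returns a repaired copy of rows, or raises ValueError if the chase fails.
    """
    rows = [dict(r) for r in rows]  # work on a copy, leave the input untouched
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            for r1, r2 in combinations(rows, 2):
                agree_on_lhs = all(r1[a] == r2[a] for a in lhs)
                if not agree_on_lhs or r1[rhs] == r2[rhs]:
                    continue  # this pair does not violate the dependency
                v1, v2 = r1[rhs], r2[rhs]
                if not isinstance(v1, Null) and not isinstance(v2, Null):
                    raise ValueError(f"chase fails: {v1!r} and {v2!r} must be equal")
                # Repair: replace the null by the other value in every row.
                old, new = (v1, v2) if isinstance(v1, Null) else (v2, v1)
                for r in rows:
                    for a in r:
                        if r[a] is old:
                            r[a] = new
                changed = True
    return rows


if __name__ == "__main__":
    employees = [
        {"name": "alice", "dept": "db", "room": Null("r1")},
        {"name": "bob", "dept": "db", "room": "101"},
    ]
    # Dependency: dept -> room (everyone in a department shares a room).
    print(chase_fds(employees, [(("dept",), "room")]))
```

The sketch repeats the repair step until no dependency is violated. Inclusion dependencies, which the chase also handles, would additionally require inserting new tuples with fresh nulls and are left out here.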
- Former Projects: