Database Caching

The ACCache Project: Adaptive Constraint-based Database Caching
Project Leader
Prof. Dr. Theo Härder
AG DBIS Chair
Research Topics
  • Specification
    • Cached content
    • Cache constraints and cache elements
    • Functional requirements
  • Query Processing
    • Efficient probing
    • Predicate completeness
    • Janus plans
  • Adaptation
    • Self-optimization
    • Adaptation of cache constraints and definitions
    • Cost models
  • Updates
    • Concurrency control
    • Conservation of cache constraints
    • Serialization levels
  • Distribution
    • P2P approaches?
    • Data Grids?
Scientific Staff Members and Ph.D. Students
Andreas Bühmann: Core Engine Founder, Query Processing
Joachim Klein: Concurrency Control and Adaptivity
Mayce Ali: Database Caching of XML Data
Students and Former Project Members
Artur Guschakowski: RCC Index Structures
Harald Wonneberger: Reorganisation of cache groups
Martin Tritschler: Concurrency Control
Volker Hudlet: CC and SQL interface design
Mateus Gomes: Probing, Loading
Susanne Braun: Garbage Collection, Concurrency Control
Gustavo Machado: Garbage Collection
Julia Thiele: Concurrency Control, Problems and Concepts
Wolfgang Scholl: Cache Group Design, Optimization
Christian Merker: ACCache Prototype
Christian Bayerlein: Filling behavior

Project Description

Web applications face an ever-increasing number of users as well as the demand to provide each of them with increasingly customized, i.e., dynamically generated, content. This places a high workload on every link of the typical processing chain, that is, on the web servers, the application servers, and the database server. While the former two can often be replicated, the more or less central backend database hinders scaling of the overall application.

The idea of database caching is to disburden the backend by using a number of frontend or cache database servers (caches). These caches, placed close to the application servers, hold frequently used subsets of the backend database, which enables them to answer queries without accessing the backend. In our ACCache project (Adaptive Constraint-based Cache), we implement a full-featured constraint-based cache with important improvements over recent database caching technologies.
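
The basic idea can be illustrated with a minimal sketch. It is not the ACCache implementation: the classes `Backend` and `FrontendCache`, their methods, and the restriction to simple equality predicates are assumptions made purely for illustration. The cache keeps the complete extension of a predicate such as `city = 'KL'`; a query is answered locally only if a probe finds that extension in the cache, and is otherwise forwarded to the backend.

```python
from dataclasses import dataclass, field


@dataclass
class Backend:
    """Hypothetical stand-in for the central backend database."""
    rows: list                 # each row is a dict: column name -> value
    queries_served: int = 0    # counts how often the backend is accessed

    def select_eq(self, column, value):
        self.queries_served += 1
        return [r for r in self.rows if r[column] == value]


@dataclass
class FrontendCache:
    """Hypothetical frontend cache holding complete predicate extensions."""
    backend: Backend
    # (column, value) -> rows; a key is present only if the whole
    # extension of the predicate "column = value" has been loaded.
    extensions: dict = field(default_factory=dict)

    def load(self, column, value):
        # Fetch the complete extension once from the backend.
        self.extensions[(column, value)] = self.backend.select_eq(column, value)

    def select_eq(self, column, value):
        key = (column, value)
        if key in self.extensions:      # probe: is the extension cached?
            return list(self.extensions[key]), "cache"
        # Not cached: forward the query to the backend.
        return self.backend.select_eq(column, value), "backend"


if __name__ == "__main__":
    backend = Backend(rows=[{"id": 1, "city": "KL"}, {"id": 2, "city": "SB"}])
    cache = FrontendCache(backend)
    cache.load("city", "KL")
    print(cache.select_eq("city", "KL"))   # answered by the cache
    print(cache.select_eq("city", "SB"))   # forwarded to the backend
```

In this sketch the cache is only ever filled explicitly via `load`; deciding which predicate extensions are worth loading and keeping is exactly the adaptation problem listed among the research questions.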

Research - Questions and Details

  • Specification
    • How can the cached portions of the backend database be specified?
    • Predicates and their extensions seem to provide a very abstract view on the ‘objects’ being cached. Which kinds of such predicates can be handled (more easily than others) and how? How are their possibly overlapping extensions stored and maintained efficiently?
    • How can approaches such as DBCache or DBProxy be improved, possibly be combined, or be embedded into an overall database caching model?
  • Query Processing
    • How can we decide which queries or which parts of a query we are able to answer by using the cache contents?
    • How can queries be rewritten such that they can be evaluated in a distributed manner among backend and frontend?
  • Adaptation
    • How do we choose predicates whose extensions are worth caching?
    • If the cache contents are to be dynamically adapted to changing workloads, which strategy is appropriate? Which underlying principle of locality can be exploited?
    • How do we know that the maintenance of a given predicate extension in the cache is no longer useful? (There is a cost with every cached predicate extension if we require its freshness.)
  • Updates
    • What about updates occurring in the backend or being initiated by the application through the frontend?
    • Updates to the backend have to be propagated to all frontends. When should this happen? When must this have happened in order to provide reasonable modes of consistency and freshness (to be defined)?
      • Is it possible to specify some time interval δ that limits the age of a database view exposed at the frontend?
      • Is it possible for different ‘freshness spheres’ to exist side by side?
    • Having in mind the goal of disburdening the backend: Can updates be applied within a transaction only to the cache and be propagated to the backend after transaction commit?
      • Which levels of isolation can be achieved, which ones can be tolerated?
      • Similarly, what about transactional semantics in general? Is caching worthwhile or even possible if ACID must strictly be ensured? (What update/read ratio is acceptable before caching no longer pays off?)
  • Distribution
    • Is it possible to exchange data directly between multiple frontends in a P2P-like fashion?
    • Might database caching play a role in a forthcoming data grid?
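
As an illustration of the freshness question raised under Updates, the following sketch shows one possible reading of a time bound δ. It is an assumption, not a result of the project: the names `CachedExtension`, `is_fresh`, and `choose_source` are hypothetical, and freshness is modelled simply as the time elapsed since the cached data was last synchronized with the backend.

```python
from dataclasses import dataclass


@dataclass
class CachedExtension:
    """Hypothetical cached predicate extension with its last sync time."""
    rows: list
    synced_at: float   # time of the last synchronization, in seconds


def is_fresh(ext, now, delta):
    # The cached view is acceptable if its age does not exceed delta.
    return now - ext.synced_at <= delta


def choose_source(ext, now, delta):
    # Answer from the cache only if the extension exists and is fresh enough.
    if ext is not None and is_fresh(ext, now, delta):
        return "cache"
    return "backend"
```

Under this reading, different 'freshness spheres' would correspond to different values of `delta` applied to the same cached data: a request tolerating 60 seconds of staleness could still be served from the cache when one tolerating only 10 seconds has to go to the backend.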