Project Description
Web applications face an ever-increasing number of users as well as the demand to provide each of them with more and more customized, i.e., dynamically generated, content. This places a high workload on every link of the typical processing chain, that is, on the web servers, the application servers, and the database server. While the former two can often be replicated, the more or less central backend database hinders scaling of the overall application.
The idea of database caching is to relieve the backend by using a number of frontend, or cache, database servers (caches). These caches, placed close to the application servers, hold frequently used subsets of the backend database, which enables them to answer queries without accessing the backend. In our ACCache project (Adaptive Constraint-based Cache), we implement a full-featured constraint-based cache with important improvements over recent database caching technologies.
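The basic idea can be illustrated with a minimal sketch. It is not the ACCache implementation: the class `FrontendCache`, its methods, and the restriction to equality predicates of the form `column = value` are assumptions made purely for illustration. The cache stores the complete extension of a predicate and answers a query locally only if the query's predicate is cached; otherwise the query goes to the backend.

```python
from dataclasses import dataclass, field


@dataclass
class FrontendCache:
    """Toy cache holding complete extensions of predicates `column = value`.

    Hypothetical illustration only; names and structure are assumptions.
    """
    backend: list                                    # list of dict rows standing in for a backend table
    extensions: dict = field(default_factory=dict)   # (column, value) -> cached rows
    backend_hits: int = 0                            # number of queries forwarded to the backend

    def load(self, column, value):
        """Cache the complete extension of the predicate column = value."""
        self.extensions[(column, value)] = [
            row for row in self.backend if row[column] == value
        ]

    def query(self, column, value):
        """Answer from the cache if the predicate extension is present, else ask the backend."""
        key = (column, value)
        if key in self.extensions:
            return list(self.extensions[key])
        self.backend_hits += 1
        return [row for row in self.backend if row[column] == value]
```

In this sketch, a query for a cached predicate never touches the backend, while any other query increments `backend_hits`. A real constraint-based cache would have to handle far richer predicates, joins, and overlapping extensions.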
Research - Questions and Details
- Specification
- How can the cached portions of the backend database be specified?
- Predicates and their extensions seem to provide a very abstract view of the ‘objects’ being cached. Which kinds of predicates can be handled (more easily than others), and how? How can their possibly overlapping extensions be stored and maintained efficiently?
- How can approaches such as DBCache or DBProxy be improved, possibly be combined, or be embedded into an overall database caching model?
- Query Processing
- How can we decide which queries or which parts of a query we are able to answer by using the cache contents?
- How can queries be rewritten such that they can be evaluated in a distributed manner among backend and frontend?
- Adaptation
- How do we choose predicates whose extensions are worth caching?
- If the cache contents are to be dynamically adapted to changing workloads, which strategy is appropriate? Which underlying principle of locality can be exploited?
- How do we know that maintaining a given predicate extension in the cache is no longer useful? (Every cached predicate extension incurs a maintenance cost if it must be kept fresh.)
- Updates
- What about updates occurring in the backend or being initiated by the application through the frontend?
- Updates to the backend have to be propagated to all frontends. When should this happen? When must this have happened in order to provide reasonable modes of consistency and freshness (to be defined)?
- Is it possible to specify some time interval δ that limits the age of a database view exposed at the frontend?
- Is it possible for different ‘freshness spheres’ to exist side by side?
- Keeping in mind the goal of relieving the backend: Can updates within a transaction be applied only to the cache and be propagated to the backend after the transaction commits?
- Which levels of isolation can be achieved, which ones can be tolerated?
- Similarly, what about transactional semantics in general? Is caching worthwhile or even possible if ACID must be strictly ensured? (What update/read ratio is acceptable before caching no longer pays off?)
- Distribution
- Is it possible to exchange data directly between multiple frontends in a P2P-like fashion?
- Might database caching play a role in a forthcoming data grid?
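The question above about a time interval δ that limits the age of a view exposed at the frontend can be made concrete with a small sketch. It shows only one possible reading of such a bound, not an answer to the research question: the names `CachedExtension`, `is_fresh`, and `read`, the `refresh` callback, and the use of seconds as the time unit are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class CachedExtension:
    """Cached rows plus the time of the last synchronization with the backend (hypothetical)."""
    rows: list
    synced_at: float  # seconds; time at which the rows last reflected the backend state


def is_fresh(ext, now, delta):
    """Return True if the cached view is at most `delta` seconds old."""
    return now - ext.synced_at <= delta


def read(ext, now, delta, refresh):
    """Serve cached rows if they satisfy the bound delta; otherwise refresh from the backend first.

    `refresh` is an assumed callback that returns the current backend rows.
    """
    if not is_fresh(ext, now, delta):
        ext.rows = refresh()
        ext.synced_at = now
    return ext.rows
```

Under this reading, different ‘freshness spheres’ would amount to different values of `delta` for different cached extensions or different clients. Whether such spheres can coexist consistently is exactly what the questions above leave open.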