
Cache Management for Shared Sequential Data Access


Erhard Rahm

University of Kaiserslautern
Dept. of Computer Science
6750 Kaiserslautern, Germany

Donald Ferguson

IBM Thomas J. Watson Research Center
P.O. Box 704
Yorktown Heights, NY 10598, USA

Full paper (PostScript version, compressed with gzip, or PDF version)


Abstract

This paper presents a new set of cache management algorithms for shared data objects that are accessed sequentially. I/O delays on sequentially accessed data are a dominant performance factor in many application domains, in particular for batch processing. Our algorithms fall into three classes: replacement, prefetching and scheduling strategies. Our replacement algorithms empirically estimate the rate at which the jobs are proceeding through the data. These velocity estimates are used to project the next reference times for cached data objects, and our algorithms replace the data with the longest projected time to re-use. The second type of algorithm performs asynchronous prefetching. This algorithm uses the velocity estimates to predict future cache misses and attempts to pre-load data to avoid these misses. Finally, we present a simple job scheduling strategy that increases locality of reference between jobs. Our new algorithms are evaluated through a detailed simulation study. Our experiments show that the algorithms substantially improve performance compared to traditional algorithms for cache management. The best of our algorithms has been implemented in the new Hiperbatch (High Performance Batch) product of IBM, which is being used at more than 300 commercial data centers worldwide.
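The velocity-based replacement idea described in the abstract can be illustrated with a minimal Python sketch. It is not the algorithm from the paper or from Hiperbatch. It assumes that each job scans forward through pages numbered by integer index, that velocity is measured in pages per second, and that a page already passed by every job will not be referenced again. The names `Job`, `estimate_velocity`, `time_to_reuse` and `choose_victim` are hypothetical and chosen only for illustration.

```python
import math
from dataclasses import dataclass
from typing import Iterable, Sequence


@dataclass
class Job:
    position: int     # index of the next page the job will read (assumed model)
    velocity: float   # estimated pages per second (assumed unit)


def estimate_velocity(pages_read: int, elapsed_seconds: float) -> float:
    """Empirical rate at which a job has proceeded through the data."""
    if elapsed_seconds <= 0:
        return 0.0
    return pages_read / elapsed_seconds


def time_to_reuse(page: int, jobs: Sequence[Job]) -> float:
    """Projected time until some job next references `page`.

    Only jobs that have not yet passed the page can reference it.
    If no such job exists, the page is never re-used (infinite time).
    """
    best = math.inf
    for job in jobs:
        if job.velocity > 0 and job.position <= page:
            best = min(best, (page - job.position) / job.velocity)
    return best


def choose_victim(cached_pages: Iterable[int], jobs: Sequence[Job]) -> int:
    """Pick the cached page with the longest projected time to re-use."""
    return max(cached_pages, key=lambda p: time_to_reuse(p, jobs))


if __name__ == "__main__":
    jobs = [Job(position=10, velocity=2.0), Job(position=40, velocity=10.0)]
    print(choose_victim([12, 30, 45], jobs))
```

In this example, page 12 is one second ahead of the slow job and page 45 is half a second ahead of the fast job, while page 30 is ten seconds away from its next reference, so page 30 is chosen for replacement. The same projected reference times could, under the same assumptions, be used to decide which uncached pages to prefetch first.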

in: Proc. ACM SIGMETRICS Conf., June 1992