Information Systems Group, Dept. of Computer Science
Cache Management for Shared Sequential Data Access

Erhard Rahm
University of Kaiserslautern, Dept. of Computer Science, 6750 Kaiserslautern, Germany

Donald Ferguson
IBM Thomas J. Watson Research Center, P.O. Box 704, Yorktown Heights, NY 10598, USA

Full paper (PostScript version, compressed by gzip, or PDF version)

Abstract

This paper presents a new set of cache management algorithms for shared data objects that are accessed sequentially. I/O delays on sequentially accessed data are a dominant performance factor in many application domains, in particular for batch processing. Our algorithms fall into three classes: replacement, prefetching and scheduling strategies.

Our replacement algorithms empirically estimate the rate at which the jobs are proceeding through the data. These velocity estimates are used to project the next reference times for cached data objects, and our algorithms replace the data with the longest time to re-use. The second type of algorithm performs asynchronous prefetching. This algorithm uses the velocity estimates to predict future cache misses and attempts to pre-load data to avoid these misses. Finally, we present a simple job scheduling strategy that increases locality of reference between jobs.

Our new algorithms are evaluated through a detailed simulation study. Our experiments show that the algorithms substantially improve performance compared to traditional algorithms for cache management. The best of our algorithms has been implemented in the new Hiperbatch (High Performance Batch) product of IBM, which is being used at more than 300 commercial data centers worldwide.

in: Proc. ACM SIGMETRICS Conf., June 1992
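The velocity-based replacement and prefetching ideas summarized in the abstract can be illustrated with a minimal Python sketch. It is not the algorithm from the paper or the Hiperbatch implementation. The names `Job`, `estimate_velocity`, `time_to_reuse`, `choose_victim` and `prefetch_candidates` are hypothetical, and the sketch assumes that each job scans integer-numbered blocks in ascending order at a roughly constant rate.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Job:
    """A sequential reader (hypothetical model, not from the paper)."""
    position: int    # next block the job will read
    velocity: float  # estimated blocks per second


def estimate_velocity(prev_position: int, position: int, elapsed: float) -> float:
    """Empirical velocity: blocks advanced per second over one interval."""
    return (position - prev_position) / elapsed


def time_to_reuse(block: int, jobs: list[Job]) -> float:
    """Projected seconds until any job next references `block`.

    Jobs that have already passed the block never return to it in a
    purely sequential scan, so they contribute nothing.
    """
    times = [
        (block - job.position) / job.velocity
        for job in jobs
        if job.velocity > 0 and job.position <= block
    ]
    return min(times, default=float("inf"))


def choose_victim(cached: list[int], jobs: list[Job]) -> int:
    """Replace the cached block with the longest projected time to re-use."""
    return max(cached, key=lambda block: time_to_reuse(block, jobs))


def prefetch_candidates(cached: list[int], jobs: list[Job], horizon: float) -> list[int]:
    """Uncached blocks that some job is projected to reach within `horizon` seconds."""
    wanted: set[int] = set()
    for job in jobs:
        reach = job.position + int(job.velocity * horizon)
        wanted.update(range(job.position, reach))
    return sorted(wanted - set(cached))


if __name__ == "__main__":
    jobs = [Job(position=10, velocity=2.0), Job(position=40, velocity=1.0)]
    cached = [5, 12, 30, 45, 60]
    print(choose_victim(cached, jobs))             # block 5: behind every job
    print(prefetch_candidates(cached, jobs, 3.0))  # blocks due within 3 seconds
```

In this example, block 5 lies behind both jobs, so its projected time to re-use is infinite and it is evicted first. Among the remaining blocks, block 60 has the longest projected wait (20 seconds, reached first by the second job).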