Abstract

We study several transparent techniques for scaling dynamic content web sites, and we evaluate their relative impact when used in combination. Full transparency implies strong data consistency as perceived by the user, no modifications to existing dynamic content site tiers, and no additional programming effort from the user or site administrator upon deployment. We study strategies for scheduling and load balancing queries on a cluster of replicated database back-ends. We also investigate transparent query caching as a means of enhancing database replication. Our work shows that, on an experimental platform with up to 8 database replicas, the various techniques work in synergy to improve overall scaling for the e-commerce TPC-W benchmark. We rank the techniques necessary for high performance in order of impact as follows. Most important are scheduling strategies, such as conflict-aware scheduling, that minimize consistency maintenance overheads. The choice of load balancing strategy is less important. Transparent query result caching increases performance significantly at any given cluster size for a mostly-read workload. Its benefits are limited for write-intensive workloads, where content-aware scheduling is the only scaling option.
