Scalable Delivery of Stream Query Result

Continuous queries over data streams typically produce large volumes of result streams. To scale up the system, one should carefully study the problem of delivering the result streams to the end users, which, unfortunately, is often overlooked in existing systems. In this paper, we leverage a Distributed Publish/Subscribe System (DPSS), a scalable data dissemination infrastructure, for efficient stream query result delivery. To take advantage of the DPSS's multicast-like data dissemination architecture, one has to exploit the common content among different result streams and maximize the sharing of their delivery. Hence, we propose to merge the user queries into a few representative queries whose results subsume those of the original ones, and to disseminate the result streams of these representative queries through the DPSS. To realize this approach, we study stream query containment theories and propose efficient query grouping and merging algorithms. The proposed approach is non-intrusive and hence can be easily implemented as middleware to be incorporated into existing stream processing systems. A prototype is developed on top of an open-source stream processing system, and results of an extensive performance study on real datasets verify the efficacy of the proposed techniques.
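
The idea of a representative query whose result subsumes those of the original queries can be illustrated with a small sketch. This is not the paper's algorithm: it assumes, purely for illustration, that each query is a conjunction of closed range predicates over numeric attributes with non-empty intervals, and it ignores windows, joins, and the cost-based grouping the abstract mentions. The names `Query`, `contains`, `merge`, and `matches` are hypothetical.

```python
from typing import Dict, Tuple

# Hypothetical query model: attribute name -> closed interval [lo, hi].
# An attribute that is absent from the dict is unconstrained.
Query = Dict[str, Tuple[float, float]]


def contains(outer: Query, inner: Query) -> bool:
    """Return True if every tuple matching `inner` also matches `outer`."""
    for attr, (lo, hi) in outer.items():
        if attr not in inner:
            return False  # inner accepts any value of attr, outer does not
        inner_lo, inner_hi = inner[attr]
        if inner_lo < lo or inner_hi > hi:
            return False  # inner's range sticks out of outer's range
    return True


def merge(a: Query, b: Query) -> Query:
    """Build the tightest range query (in this model) that contains both."""
    merged: Query = {}
    # Only attributes constrained by both queries stay constrained;
    # each keeps the hull of the two intervals.
    for attr in a.keys() & b.keys():
        merged[attr] = (min(a[attr][0], b[attr][0]),
                        max(a[attr][1], b[attr][1]))
    return merged


def matches(q: Query, tup: Dict[str, float]) -> bool:
    """Filter step: does a result tuple satisfy query `q`?"""
    return all(lo <= tup[attr] <= hi for attr, (lo, hi) in q.items())


# Two user queries and their representative query.
q1: Query = {"price": (10, 20), "volume": (100, 500)}
q2: Query = {"price": (15, 30)}
rep = merge(q1, q2)

# A tuple from the representative result stream is re-filtered per user.
t = {"price": 25, "volume": 50}
delivered_to = [name for name, q in (("q1", q1), ("q2", q2)) if matches(q, t)]
```

In this sketch only the result stream of `rep` would be disseminated, and each subscriber's original predicate is re-applied to recover the exact result. The representative stream can carry tuples that no single user wants, which is the trade-off a grouping strategy has to weigh.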


Published in:
Proceedings of 35th International Conference on Very Large Data Bases (VLDB 2009)
Presented at:
35th International Conference on Very Large Data Bases (VLDB 2009), Lyon, France, August 24-28, 2009
Year:
2009
Publisher:
Lyon, France, VLDB
Note:
NCCR-MICS NCCR-MICS/CL4

 Record created 2009-06-10, last modified 2018-03-17
