Generalizing Bulk-Synchronous Parallel Processing for Data Science: From Data to Threads and Agent-Based Simulations
We generalize the bulk-synchronous parallel (BSP) processing model to better support agent-based simulations. Such simulations frequently exhibit hierarchical structure in their communication patterns, which can be exploited to improve performance. We allow the creation of temporary, artificial network partitions, during which agents synchronize only locally within their group, in a way that does not compromise the correctness of the simulation. We have built a distributed engine, CloudCity, that uses this idea to improve the locality of computation, communication, and synchronization in such simulations. We experimentally evaluate the performance of our system on a benchmark of simulation workloads and compare it against other popular BSP-like systems, obtaining insights into the impact of various system design choices and optimizations on simulation engine performance.
Attached file: CloudCity-preprint.pdf (preprint, open access, under copyright; Adobe PDF, 890.24 KB)