Achille Peternier, Daniele Bonetta, Cesare Pautasso, Walter Binder
International Conference on Service-Oriented Computing and Applications (SOCA 2010), Perth, Australia, pp. 1-8
While modern CPUs offer an increasing number of cores with shared caches, prevailing execution engines for business processes, workflows, or Web service compositions have not been optimized to exploit the abundant processing resources of such CPUs. One factor limiting performance is inefficient thread scheduling by the operating system, which can result in suboptimal use of shared caches. In this paper we study the performance of the JOpera business process execution engine on a recent multicore machine. By analyzing the engine's architecture and binding threads that are likely to access shared data to cores with a common cache, we achieve speedups of up to 13x for a variety of workloads. The only change to the engine is the binding of threads to CPUs; its architecture and implementation are otherwise unmodified. As the engine is implemented in Java, we provide a new Java library to manage thread bindings and hardware performance counters. We also use hardware performance counters to explain the observed speedups in our performance analysis.
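The core idea in the abstract, placing threads that share data on cores with a common cache, can be illustrated with a minimal sketch. It is not the library described in the paper, whose API is not shown here. The class name `CacheAwareBindingSketch`, the method `plan`, the core numbering, and the thread group names are all hypothetical. The sketch only computes a binding plan; it does not pin any thread, because the standard Java API has no thread affinity call and the binding itself needs native support from the operating system.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CacheAwareBindingSketch {

    /**
     * Assigns each thread group to one set of cores that share a cache.
     * Groups are spread round-robin over the available cache domains, so
     * threads of the same group always land on cores with a common cache.
     *
     * @param cacheDomains core IDs grouped by shared cache (assumed topology)
     * @param threadGroups names of groups whose threads access common data
     * @return map from group name to the core IDs the group should be bound to
     */
    public static Map<String, List<Integer>> plan(List<List<Integer>> cacheDomains,
                                                  List<String> threadGroups) {
        if (cacheDomains.isEmpty()) {
            throw new IllegalArgumentException("at least one cache domain is required");
        }
        Map<String, List<Integer>> plan = new LinkedHashMap<>();
        for (int i = 0; i < threadGroups.size(); i++) {
            // Pick the next cache domain, wrapping around when groups outnumber domains.
            List<Integer> domain = cacheDomains.get(i % cacheDomains.size());
            plan.put(threadGroups.get(i), new ArrayList<>(domain));
        }
        return plan;
    }

    public static void main(String[] args) {
        // Hypothetical machine: 8 cores, two sets of 4 cores, each set sharing a cache.
        List<List<Integer>> cacheDomains = Arrays.asList(
                Arrays.asList(0, 1, 2, 3),
                Arrays.asList(4, 5, 6, 7));
        // Hypothetical groups of threads that are likely to access shared data.
        List<String> threadGroups = Arrays.asList("group-A", "group-B", "group-C");

        plan(cacheDomains, threadGroups)
                .forEach((group, cores) -> System.out.println(group + " -> cores " + cores));
    }
}
```

A real deployment would read the cache topology from the operating system and apply each entry of the plan through a native affinity call; the round-robin policy here is only one simple choice among many.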
PDF: jopera-soca2010.pdf (551KB)
- Achille Peternier, Daniele Bonetta, Cesare Pautasso, Walter Binder, Exploiting multicores to optimize business process execution, Proc. of the International Conference on Service-Oriented Computing and Applications (SOCA 2010), Perth, Australia, December 2010, pp. 1-8