logo      Department of Mathematical
                    Scienceslogo

Petr Plechac

309 Ewing Hall
University of Delaware
Newark, Delaware 19716

Telephone: 302-831-0637
Fax: 302-831-4511

email:  plechac at math dot udel dot edu

<< Previous                    Main Menu                            Next >>
Giorgos Arampatzis, Markos A. Katsoulakis, Petr Plechac, Michela Taufer, Lifan Xu

Fractional-step kinetic Monte Carlo algorithms and hierarchical parallelization

We present a new framework for constructing parallel algorithms for lattice Kinetic Monte Carlo (KMC) simulations.These algorithms have the capacity to simulate a wide range of spatio-temporal scales of spatially distributed, non-equilibrium physiochemical processes with complex chemistry and transport micro-mechanisms, while they can be tailored to specific hierarchical parallel architectures such as clusters of GPUs. The proposed parallel algorithms are controlled approximations of kinetic Monte Carlo algorithms, departing from the predominant paradigm of creating parallel KMC algorithms with exactly the same master equation as the serial one. Instead, our methodology relies on first developing a spatio-temporal decomposition of the Markov operator underlying the KMC algorithm into a hierarchy of operators corresponding to the processor' structure in the parallel architecture. Based on this operator decomposition, we formulate Fractional Step Approximation schemes by employing the Trotter Theorem; these schemes, (a) determine the communication schedule between processors, and (b) are run independently on each processor through a serial KMC simulation, called a kernel, on each fractional step time-window. The hierarchical structure can be easily derived and implemented for very general physicochemical processes modeled by lattice systems, allowing users to input as the algorithm's KMC kernel their serial algorithm of choice. This flexibility and hierarchical structure are key advantages for tailoring our framework to particular parallel architectures with complex memory and processor hierarchies, e.g. clusters of GPUs. Furthermore, the numerical and statistical consistency of the proposed algorithms is rigorously justified, showing the convergence of our approximating schemes to the original serial KMC algorithm. In this paper we also include detailed benchmarking using available exact solutions, for example, in Ising-type systems and we demonstrate the capabilities of the method to simulate complex spatially distributed reactions at very large scales on GPUs. Finally, we discuss work load balancing between processors and propose a re-balancing scheme based on probabilistic mass transport methods.

Bibliographical note:

submitted to Journal of Computational Physics
preprint on arXiv: arXiv: 1105.4673 [math.NA]