Paweł Rościszewski, Paweł Czarnul, Rafał Lewandowski, Marcel Schally‐Kacprzak
https://link.springer.com/chapter/10.1007/978-3-642-45249-9_5
The paper proposes an approach for parallelizing computations across a collection of clusters whose heterogeneous nodes contain both GPUs and CPUs.
The proposed system partitions input data into chunks and assigns them to particular devices for processing with user-defined OpenCL kernels. The system minimizes the execution time of the application while keeping the power consumption of the utilized GPUs and CPUs below a given threshold. We present real measurements of the performance and power consumption of various GPUs and CPUs used in a modern parallel system. Furthermore, we show, for a parallel application for breaking MD5 passwords, how the execution time of the real application changes with various upper bounds on the power consumption.
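To make the optimization goal concrete, the following is a minimal sketch of choosing devices under a power cap and splitting chunks among them. It is not the authors' algorithm: the exhaustive subset search, the proportional split, the function names `select_devices` and `assign_chunks`, and all throughput and power figures are assumptions introduced here for illustration only.

```python
from itertools import combinations

# Hypothetical devices: (name, throughput in chunks per second, power draw in watts).
# These figures are illustrative placeholders, not measurements from the paper.
DEVICES = [
    ("gpu_a", 40.0, 200.0),
    ("gpu_b", 25.0, 150.0),
    ("cpu_a", 10.0, 100.0),
    ("cpu_b", 8.0, 60.0),
]


def select_devices(devices, power_limit):
    """Return the subset with the highest total throughput whose combined
    power draw does not exceed power_limit (exhaustive search)."""
    best, best_rate = (), 0.0
    for size in range(1, len(devices) + 1):
        for subset in combinations(devices, size):
            power = sum(d[2] for d in subset)
            rate = sum(d[1] for d in subset)
            if power <= power_limit and rate > best_rate:
                best, best_rate = subset, rate
    return list(best)


def assign_chunks(selected, n_chunks):
    """Split n_chunks among the selected devices in proportion to throughput."""
    total_rate = sum(d[1] for d in selected)
    if total_rate == 0:
        return {}
    shares = {d[0]: int(n_chunks * d[1] / total_rate) for d in selected}
    # Rounding down can leave a few chunks unassigned; give one each
    # to the fastest devices until every chunk has an owner.
    leftover = n_chunks - sum(shares.values())
    fastest_first = sorted(selected, key=lambda d: d[1], reverse=True)
    for d in fastest_first[:leftover]:
        shares[d[0]] += 1
    return shares


if __name__ == "__main__":
    chosen = select_devices(DEVICES, power_limit=300.0)
    print([d[0] for d in chosen])
    print(assign_chunks(chosen, n_chunks=1000))
```

With these placeholder figures and a 300 W cap, the search picks `gpu_a` and `cpu_a`, since adding the second GPU would exceed the cap. Exhaustive search is only practical for a handful of devices; a larger system would need a heuristic or a knapsack-style formulation.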
International Conference on Distributed Computing and Networking. ICDCN 2014: Distributed Computing and Networking, pp. 66-80