INFOCOMP 2017, The Seventh International Conference on Advanced Communications and Computation
Authors:
Martin Kuehn
Janis Keuper
Franz-Josef Pfreundt
Keywords: GPI-2; Caffe; DNN; SGD; GASPI.
Abstract:
Deep Neural Networks (DNNs) are currently of great interest in research and application. The training of these networks is a compute-intensive and time-consuming task. To reduce training times to a bearable amount at reasonable cost, we extend the popular Caffe toolbox for DNNs with an efficient distributed-memory communication pattern. To achieve good scalability, we emphasize the overlap of computation and communication and prefer fine-granular synchronization patterns over global barriers. To implement these communication patterns, we rely on the "Global Address Space Programming Interface" version 2 (GPI-2) communication library. This interface provides a light-weight set of asynchronous one-sided communication primitives, supplemented by non-blocking, fine-granular data synchronization mechanisms. Accordingly, we name our parallel version of Caffe CaffeGPI. First benchmarks demonstrate better scaling behavior compared with other extensions, e.g., Intel(TM) Caffe. Even within a single symmetric multiprocessing machine with four graphics processing units, CaffeGPI scales better than the standard Caffe toolbox. These first results demonstrate that the use of standard High Performance Computing (HPC) hardware is a valid cost-saving approach to train large DNNs. I/O is another bottleneck when working with DNNs in a standard parallel HPC setting, which we will consider in more detail in a forthcoming paper.
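The asynchronous one-sided primitives and fine-granular notifications mentioned in the abstract correspond to the GASPI API implemented by GPI-2. The following is a minimal, illustrative sketch (not CaffeGPI's actual communication code): it assumes a hypothetical two-rank job with an arbitrary segment layout (segment id 0, 1 MiB, data at offset 0), where rank 0 pushes a block into rank 1's segment with gaspi_write_notify and rank 1 waits only on that single notification instead of a global barrier.

/* Minimal GASPI/GPI-2 sketch: one-sided write plus notification,
 * illustrating fine-granular synchronization instead of a global barrier.
 * Segment ids, offsets and sizes are arbitrary illustration values.
 * Build (typical GPI-2 install): gcc demo.c -lGPI2 -lpthread
 * Run:   gaspi_run -m machinefile ./a.out                              */
#include <GASPI.h>
#include <stdio.h>
#include <stdlib.h>

#define SEG_ID   0              /* single communication segment               */
#define SEG_SIZE (1 << 20)      /* 1 MiB, e.g., a slice of a gradient buffer  */
#define NOTE_ID  0              /* notification used as a "data ready" flag   */

static void die_on_error(gaspi_return_t ret, const char *what)
{
  if (ret != GASPI_SUCCESS) {
    fprintf(stderr, "%s failed (%d)\n", what, ret);
    exit(EXIT_FAILURE);
  }
}

int main(void)
{
  gaspi_rank_t rank, nranks;

  die_on_error(gaspi_proc_init(GASPI_BLOCK), "gaspi_proc_init");
  gaspi_proc_rank(&rank);
  gaspi_proc_num(&nranks);

  /* Register one RDMA-capable segment on every rank. */
  die_on_error(gaspi_segment_create(SEG_ID, SEG_SIZE, GASPI_GROUP_ALL,
                                    GASPI_BLOCK, GASPI_MEM_INITIALIZED),
               "gaspi_segment_create");

  if (rank == 0 && nranks > 1) {
    /* Producer: asynchronously push the first 4 KiB of the local segment
     * into rank 1's segment and attach notification NOTE_ID.            */
    die_on_error(gaspi_write_notify(SEG_ID, 0,        /* local segment, offset  */
                                    1,                /* target rank            */
                                    SEG_ID, 0,        /* remote segment, offset */
                                    4096,             /* bytes                  */
                                    NOTE_ID, 1,       /* notification id, value */
                                    0, GASPI_BLOCK),  /* queue, timeout         */
                 "gaspi_write_notify");
    /* Wait until the requests posted to this queue are locally complete. */
    die_on_error(gaspi_wait(0, GASPI_BLOCK), "gaspi_wait");
  } else if (rank == 1) {
    /* Consumer: block only on this single notification, then use the data. */
    gaspi_notification_id_t first;
    gaspi_notification_t    value;
    die_on_error(gaspi_notify_waitsome(SEG_ID, NOTE_ID, 1, &first, GASPI_BLOCK),
                 "gaspi_notify_waitsome");
    gaspi_notify_reset(SEG_ID, first, &value);
    printf("rank 1: data arrived (notification value %u)\n", value);
  }

  die_on_error(gaspi_proc_term(GASPI_BLOCK), "gaspi_proc_term");
  return 0;
}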
Pages: 75 to 79
Copyright: Copyright (c) IARIA, 2017
Publication date: June 25, 2017
Published in: conference
ISSN: 2308-3484
ISBN: 978-1-61208-567-8
Location: Venice, Italy
Dates: from June 25, 2017 to June 29, 2017