INFOCOMP 2011, The First International Conference on Advanced Communications and Computation
MPI-based Solution for Efficient Data Access in Java HPC
Authors:
Aidan Fries
Jordi Portell
Yago Isasi
Javier Castañeda
Raül Sirvent
Guillermo L. Taboada
Keywords: Java Communications; Data Cache; F-MPJ; Gaia; GPFS; Myrinet
Abstract:
Efficient data access is critical for many HPC applications. Processes running on one node often need to access data held on another node, as well as data held in a central storage device. In I/O-intensive applications, access to data that is not held on the local node can become a bottleneck, especially when remotely stored data is accessed repeatedly and when the access originates inside a virtual machine, as in Java. To address this issue, we have designed and implemented a data cache system that offers efficient data access to Java applications in HPC. This system, which we call MPJ-Cache, is built on a Java-based message-passing implementation, such as F-MPJ, and provides a high-level API for data access. MPJ-Cache can improve the performance of I/O operations for certain Java applications in HPC by significantly reducing I/O overhead. In this paper, we describe MPJ-Cache, including its data communication layer and its caching features, and we show how it can be used to improve I/O performance for HPC applications. A comparative performance evaluation of this system against the file system of the MareNostrum supercomputer (Barcelona Supercomputing Center) shows significant performance benefits. Finally, we show the impact of this solution on a challenging problem: the data processing system for the ESA Gaia space mission.
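The caching idea in the abstract, serving repeated reads of remotely stored data from local memory, can be illustrated with a minimal read-through cache. This is a sketch under stated assumptions and not the MPJ-Cache API: the class name `ReadThroughCache`, its methods, the least-recently-used eviction policy, and the `remoteLoader` function (a stand-in for a message-passing or file-system read) are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical read-through LRU cache; illustrative only, not the MPJ-Cache API. */
class ReadThroughCache {
    // Stand-in for a slow remote or central-storage read (assumption).
    private final Function<String, byte[]> remoteLoader;
    private final Map<String, byte[]> entries;
    private int remoteReads = 0;

    ReadThroughCache(final int capacity, Function<String, byte[]> remoteLoader) {
        this.remoteLoader = remoteLoader;
        // accessOrder = true keeps entries ordered from least to most recently used.
        this.entries = new LinkedHashMap<String, byte[]>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, byte[]> eldest) {
                // Evict the least recently used entry once capacity is exceeded.
                return size() > capacity;
            }
        };
    }

    /** Returns the cached data if present; otherwise loads it remotely and caches it. */
    synchronized byte[] get(String key) {
        byte[] data = entries.get(key);
        if (data == null) {
            data = remoteLoader.apply(key);
            remoteReads++;
            entries.put(key, data);
        }
        return data;
    }

    /** Number of times the remote loader was invoked. */
    synchronized int remoteReads() {
        return remoteReads;
    }
}
```

In this sketch, a second `get` for the same key is answered from memory without invoking the loader, which is the effect the abstract attributes to caching repeatedly accessed remote data. How MPJ-Cache itself stores, evicts, and transfers data is described in the paper, not here.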
Pages: 149 to 154
Copyright: Copyright (c) IARIA, 2011
Publication date: October 23, 2011
Published in: conference
ISSN: 2308-3484
ISBN: 978-1-61208-161-8
Location: Barcelona, Spain
Dates: from October 23, 2011 to October 29, 2011