CLOUD COMPUTING 2025, The Sixteenth International Conference on Cloud Computing, GRIDs, and Virtualization


Trends for Pulling HPC Containers in Cloud

Authors:
Vanessa Sochat

Keywords: cloud; containers; Kubernetes; HPC; trends

Abstract:
Container technologies are foundational for cloud orchestration and have captured the interest of the High Performance Computing (HPC) community. While much work has demonstrated that using a container technology adds no additional overhead, strategies for building and pulling scientific containers to cloud environments, and the cost implications of those choices, have not been fully studied. Given the importance and predominance of containers in the ecosystem, strategies that minimize the time of operations such as pulling and staging are essential. This understanding, and innovation in the space, are becoming more important as more scientific applications are ported to cloud environments. In this study, we first aim to understand the landscape of containerized scientific applications, assembling a sample of more than 77K Dockerfile recipes discovered from repositories in a research software engineering database and across a set of well-known machine learning organizations. We assess these data for trends in building strategy and resulting containers, and show that applying best practices to a set of 10 application containers can reduce layer redundancy and thus lower the time and cost of using the set. Finally, we develop a simulation tool that creates containers for controlled experiments that vary the total size and the number and size of layers. With this tool, we run an experimental study that varies layer size and count across several scales to better understand the trade-off between layer count and size and the subsequent cost. In this experimental work, we find that total image size is a dominating variable during provisioning, and that strategies to improve I/O and enable lazy loading of images can lead to improvements of 3-15x. This work informs the HPC community moving to the cloud about best practices for building and pulling containers.

Pages: 69 to 80

Copyright: Copyright (c) IARIA, 2025

Publication date: April 6, 2025

Published in: conference

ISSN: 2308-4294

ISBN: 978-1-68558-258-6

Location: Valencia, Spain

Dates: from April 6, 2025 to April 10, 2025