CLOUD COMPUTING 2025, The Sixteenth International Conference on Cloud Computing, GRIDs, and Virtualization
LLM-based Distributed Code Generation and Cost-Efficient Execution in the Cloud
Authors:
Kunal Rao
Giuseppe Coviello
Gennaro Mellone
Ciro Giuseppe De Vita
Srimat Chakradhar
Keywords: Cloud Computing; Large Language Models (LLMs); Distributed systems; Code generation; Cost reduction.
Abstract:
The advancement of Generative Artificial Intelligence (AI), particularly Large Language Models (LLMs), is reshaping the software industry by automating code generation. Many LLM-driven distributed processing systems rely on serial code generation constrained by predefined libraries, limiting flexibility and adaptability. While some approaches enhance performance through parallel execution or optimize edge-cloud distributed processing for specific domains, they often overlook the cost implications of deployment, restricting scalability and economic feasibility across diverse cloud environments. This paper presents DiCE-C, a system that eliminates these constraints by starting directly from a natural language query. DiCE-C dynamically identifies available tools at runtime, programmatically refines LLM prompts, and employs a stepwise approach—first generating serial code and then transforming it into distributed code. This adaptive methodology enables efficient distributed execution without dependence on specific libraries. By leveraging high-level parallelism at the Application Programming Interface (API) level and managing API execution as services within a Kubernetes-based runtime, DiCE-C reduces idle GPU time and facilitates the use of smaller, cost-effective GPU instances. Experiments with a vision-based insurance application demonstrate that DiCE-C reduces cloud operational costs by up to 72% when using smaller GPUs (A6000 and A4000 GPU machines vs. A100 GPU machine) and by 32% when using identical GPUs (A100 GPU machines). This flexible and cost-efficient approach makes DiCE-C a scalable solution for deploying LLM-generated vision applications in cloud environments.
Pages: 114 to 121
Copyright: Copyright (c) IARIA, 2025
Publication date: April 6, 2025
Published in: conference
ISSN: 2308-4294
ISBN: 978-1-68558-258-6
Location: Valencia, Spain
Dates: from April 6, 2025 to April 10, 2025