International Journal On Advances in Software, volume 2, numbers 2 and 3, 2009


A Workflow System for Data Processing on Virtual Resources

Authors:
Rainer Schmidt
Christian Sadilek
Ross King

Keywords: data intensive computing; cloud computing; service-oriented workflow; digital preservation

Abstract:
This paper describes challenges encountered and approaches taken during the development of a workflow environment for digital preservation. The system addresses the general problem of efficiently processing collections of binary data using commodity software tools. We present a prototype implementation of a job execution service that provides access to clusters of virtual machines based on standard grid mechanisms. The service allows clients to specify individual tools and execute them in parallel on large volumes of data. This approach makes it possible to use a cloud infrastructure based on platform virtualization as a scaling environment for the execution of complex workflows. We outline the architecture of the workflow environment, introduce its programming model, and describe the service enactment. This paper extends work previously presented in [1].

Pages: 234 to 244

Copyright: Copyright (c) to authors, 2009. Used with permission.

Publication date: December 1, 2009

Published in: journal

ISSN: 1942-2628