International Journal On Advances in Software, volume 13, numbers 1 and 2, 2020


Data Science as a Service - Prototyping an integrated and consolidated IT infrastructure combining enterprise self-service platform and reproducible research

Authors:
Hans Laser
Steve Guhr
Jan-Hendrik Martenson
Jannes Gless
Branko Jandric
Joshua Görner
Detlef Amendt
Benjamin Schantze
Svetlana Gerbel

Keywords: data science as a service; reproducible research; enterprise information technology; research data infrastructure; self-services; data science platform; cloud infrastructure

Abstract:
A data science process (e.g., Obtain, Scrub, Explore, Model, and iNterpret (OSEMN)) usually consists of several steps and can be understood as an umbrella for combining state-of-the-art techniques and tools for extracting information and knowledge. When developing a suitable IT infrastructure for a self-service platform in an academic environment, scientific requirements for reproducibility and comprehensibility, as well as security aspects such as the availability of services and data, must be taken into account. In this paper, we present a prototypical implementation of a self-service platform, built on enterprise technology, that makes efficient use of available data center resources to support data-driven research.

Pages: 104 to 115

Copyright: Copyright (c) to authors, 2020. Used with permission.

Publication date: June 30, 2020

Published in: journal

ISSN: 1942-2628