Lecture Series:

Workflow Systems for Large-Scale Scientific Data Analysis

Wednesday, 10.07.2024 · 16:00

Speaker: Ulf Leser, HU Berlin

Modern scientific data analysis pipelines consist of multiple interdependent steps arranged in complex workflow structures. When applied to large data sets, these workflows must be executed on distributed compute clusters, which requires the orchestration of several heavyweight infrastructure components such as file systems, resource managers, and container managers. Scientific workflow systems can substantially reduce the effort needed to develop data analyses in such a setting while also improving reproducibility, workflow exchange, maintainability, and often even performance. This talk will introduce the fundamental concepts of workflow systems along with their benefits and current limitations. It presents selected results from our research on improving workflow systems, achieved within the Collaborative Research Center FONDA, and closes with a glimpse into upcoming research challenges.

ECDF, Wilhelmstrasse 67, 10117 Berlin, Conference room, 1st floor. Online participation will be possible. Please register here.