?
An Experimental Study of Data Transfer Strategies for Execution of Scientific Workflows
The paper studies the impact of data transfer strategies on the execution of scientific workflows. Five strategies are described, which define when and in what order data transfers are performed during the workflow execution. The strategies are experimentally evaluated by means of simulation using a realistic network model. It is demonstrated that the execution time of data-intensive workflows significantly depends on the used strategy. In particular, Eager and Lazy strategies, often used in theory and practice of workflow scheduling, demonstrate the poor results in most cases. The alternative strategies provide up to 36% makespan improvement by overlapping communications and computations, prioritizing data transfers and reducing network contention.