
Makeflow = Make + Workflow

Makeflow is a workflow engine for executing large, complex workflows on clusters, clouds, and grids. Makeflow is very similar to traditional Make, so if you can write a Makefile, then you can write a Makeflow. You can be up and running in a matter of minutes.

For example, suppose you want to split a dataset into three pieces, run a simulation on each, and then combine the results. Your Makeflow script would look like this:

part1 part2 part3: input.data split.py
        ./split.py input.data

out1: part1 mysim.exe
        ./mysim.exe part1 >out1

out2: part2 mysim.exe
        ./mysim.exe part2 >out2

out3: part3 mysim.exe
        ./mysim.exe part3 >out3

result: out1 out2 out3 join.py
        ./join.py out1 out2 out3 > result
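The `split.py` and `join.py` scripts are ordinary user programs, not something Makeflow provides. As a purely hypothetical sketch, helpers matching the rules above might distribute the input lines round-robin into three parts and then concatenate the per-part outputs to standard output (which the join rule redirects into `result`):

```python
# Hypothetical split/join logic; the real scripts are whatever your
# application needs. Makeflow only cares about the declared files.
import sys

def split(input_path, n=3):
    """Write part1..partN, distributing the input's lines round-robin."""
    with open(input_path) as f:
        lines = f.readlines()
    for i in range(n):
        with open(f"part{i + 1}", "w") as out:
            out.writelines(lines[i::n])

def join(part_paths):
    """Concatenate the per-part outputs to stdout; the Makeflow rule
    redirects stdout into the final result file."""
    for path in part_paths:
        with open(path) as f:
            sys.stdout.write(f.read())
```

The only contract that matters to Makeflow is that each rule actually produces the output files it declares from the input files it declares.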

Makeflow can be used to drive several different systems, including a single multicore machine, Condor and SGE batch systems, or the bundled Work Queue system. The same specification works for all systems, so you can easily grow your application from one machine up to thousands.
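Concretely, the execution system is chosen at run time with the batch-type option, so the workflow file itself never changes. Assuming the example above is saved as `example.makeflow` and `makeflow` is on your PATH, typical invocations look like:

```shell
# Run locally, using the cores of a single machine:
makeflow example.makeflow

# Dispatch jobs to a Condor pool:
makeflow -T condor example.makeflow

# Dispatch jobs to an SGE cluster:
makeflow -T sge example.makeflow

# Dispatch jobs to the bundled Work Queue system:
makeflow -T wq example.makeflow
```

See the Makeflow User's Manual for the full list of supported batch systems and their options.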

Makeflow differs from other distributed and parallel make tools in that it does not require a distributed filesystem. You can use it to harness whatever machines you have available, and Makeflow handles the data transfer and caching. In addition, Makeflow is highly fault tolerant: it can crash or be killed, and upon resuming, will reconnect to running jobs and continue where it left off.

For More Information

  • Makeflow User's Manual
  • Makeflow Tutorial Slides
  • Download Makeflow
  • Getting Help with Makeflow
  • Publications
