Makeflow = Make + Workflow
Makeflow is a workflow engine for executing large,
complex workflows on clusters, clouds, and grids.
Makeflow is very similar to traditional Make, so if you
can write a Makefile, then you can write a Makeflow.
You can be up and running workflows in a matter of minutes.
For example, suppose you want to split a dataset into
three pieces, run a simulation on each, and then combine
the results. Your Makeflow script would look like this:

part1 part2 part3: input.data split.py
	./split.py input.data

out1: part1 mysim.exe
	./mysim.exe part1 >out1

out2: part2 mysim.exe
	./mysim.exe part2 >out2

out3: part3 mysim.exe
	./mysim.exe part3 >out3

result: out1 out2 out3 join.py
	./join.py out1 out2 out3 > result

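
The three simulation rules above follow one pattern, so for a larger number of pieces the script could be generated rather than typed by hand. The following is a minimal sketch in Python, not part of Makeflow itself: the function name `makeflow_rules` is hypothetical, and it assumes that `split.py` produces exactly `n` files named `part1` through `partN`.

```python
def makeflow_rules(n):
    """Return Makeflow text for a split / simulate / join workflow with n pieces.

    Assumption: split.py writes part1..partN when run on input.data.
    """
    parts = [f"part{i}" for i in range(1, n + 1)]
    outs = [f"out{i}" for i in range(1, n + 1)]
    lines = []

    # One rule that splits the dataset into all of the pieces.
    lines.append(f"{' '.join(parts)}: input.data split.py")
    lines.append("\t./split.py input.data")
    lines.append("")

    # One simulation rule per piece; these rules do not depend on each other.
    for part, out in zip(parts, outs):
        lines.append(f"{out}: {part} mysim.exe")
        lines.append(f"\t./mysim.exe {part} >{out}")
        lines.append("")

    # One rule that combines every simulation output.
    lines.append(f"result: {' '.join(outs)} join.py")
    lines.append(f"\t./join.py {' '.join(outs)} > result")
    return "\n".join(lines) + "\n"


# Print the three-piece workflow, which matches the script shown above.
print(makeflow_rules(3), end="")
```

Redirecting the printed text into a file gives a script that can be handed to Makeflow in the same way as a hand-written one.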
Makeflow can be used to drive several different systems,
including a single multicore machine, Condor and SGE batch
systems, or the bundled Work Queue system. The same
specification works for all systems, so you can easily grow
your application from one machine up to thousands.
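
Because the specification itself does not change, moving between systems comes down to how Makeflow is invoked. The sketch below builds the command line using the `-T` option of the `makeflow` command to select the batch system. The helper name `makeflow_command` and the file name `example.makeflow` are hypothetical, and the list of types is limited to the systems named above.

```python
# Batch system types corresponding to the systems named in the text:
# a local multicore machine, Condor, SGE, and Work Queue.
BATCH_TYPES = ("local", "condor", "sge", "wq")


def makeflow_command(workflow, batch_type="local"):
    """Build the argument list that runs one workflow file on one system."""
    if batch_type not in BATCH_TYPES:
        raise ValueError(f"unsupported batch type: {batch_type!r}")
    return ["makeflow", "-T", batch_type, workflow]


# The same workflow file is used for every system; only -T changes.
for batch_type in BATCH_TYPES:
    print(" ".join(makeflow_command("example.makeflow", batch_type)))
```

Passing one of these lists to `subprocess.run` would launch the workflow, provided `makeflow` is installed and the chosen batch system is available on the machine.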
Makeflow differs from other distributed and parallel make tools
in that it does not require a distributed filesystem.
You can use it to harness whatever machines you have available,
and Makeflow handles the data transfer and caching. In addition,
Makeflow is highly fault tolerant: it can crash or be killed,
and upon resuming, will reconnect to running jobs and continue
where it left off.
For More Information
Makeflow User's Manual
Makeflow Tutorial Slides
Download Makeflow
Getting Help with Makeflow
Publications
- Michael Albrecht, Patrick Donnelly, Peter Bui, and Douglas Thain,
Makeflow: A Portable Abstraction for Data Intensive Computing on Clusters, Clouds, and Grids, Scalable Workflow Enactment Engines and Technologies, May, 2012.
- Rory Carmichael, Patrick Braga-Henebry, Douglas Thain, and Scott Emrich,
Biocompute 2.0: An Improved Collaborative Workspace for Data Intensive Bio-Science, Concurrency and Computation: Practice and Experience, 23(17), pages 2305-2314, December, 2011. DOI: 10.1002/cpe.1782
- Peter Bui, Li Yu, Andrew Thrasher, Rory Carmichael, Irena Lanc, Patrick Donnelly, Douglas Thain,
Scripting distributed scientific workflows using Weaver, Concurrency and Computation: Practice and Experience, November, 2011. DOI: 10.1002/cpe.1871
- Irena Lanc, Peter Bui, Douglas Thain, and Scott Emrich,
Adapting Bioinformatics Applications for Heterogeneous Systems: A Case Study, Emerging Computational Methods for the Life Sciences Workshop at ACM HPDC, pages 7-13, June, 2011. DOI: 10.1145/1996023.1996025
- Andrew Thrasher, Rory Carmichael, Peter Bui, Li Yu, Douglas Thain, and Scott Emrich,
Taming Complex Bioinformatics Workflows with Weaver, Makeflow, and Starch, Workshop on Workflows in Support of Large Scale Science, pages 1-6, November, 2010. DOI: 10.1109/WORKS.2010.5671858
- Li Yu, Christopher Moretti, Andrew Thrasher, Scott Emrich, Kenneth Judd, and Douglas Thain,
Harnessing Parallelism in Multicore Clusters with the All-Pairs, Wavefront, and Makeflow Abstractions, Journal of Cluster Computing, 13(3), pages 243-256, September, 2010. DOI: 10.1007/s10586-010-0134-7
- Douglas Thain and Christopher Moretti,
Abstractions for Cloud Computing with Condor, Syed Ahson and Mohammad Ilyas, Cloud Computing and Software Services: Theory and Techniques, pages 153-171, CRC Press, July, 2010. ISBN: 9781439803158
- Rory Carmichael, Patrick Braga-Henebry, Douglas Thain, and Scott Emrich,
Biocompute: Toward a Collaborative Workspace for Data Intensive Bio-Science, Workshop on Emerging Computational Methods for Life Sciences at ACM HPDC 2010, pages 489-498, June, 2010. DOI: 10.1145/1851476.1851547