Workflow Engine Comparison (First Impressions)




April 13, 2014

I have been looking at different options for workflow engines. I have some experience with Oozie, a little with Luigi, and none with Azkaban. In this post, I will try to give an overview of these engines in terms of their advantages and disadvantages. Take what I say with a grain of salt, though, since it is based only on the experience I have with these tools.

Crons do not scale (Surprise!)

If you have a lot of processes that manipulate, transform, and write data to a database, you will sooner or later face the limitations of cron jobs. You want to be able to handle failures, debug processes, and rerun failed jobs. You want multiple scripts to run based on data availability, data dependencies, and time-based scheduling. You may also want to share the data workflow with many people. Cron jobs give you none of these.
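To make the gap concrete, here is a minimal sketch of the two things cron does not give you out of the box: running tasks in dependency order and retrying failures. It is not how Oozie, Luigi, or Azkaban work internally; the function name `run_workflow` and its parameters are hypothetical, and it assumes Python 3.9+ for the standard-library `graphlib` module.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_workflow(tasks, deps, max_retries=2):
    """Run tasks in dependency order, retrying the ones that fail.

    tasks: dict mapping task name -> zero-argument callable
    deps:  dict mapping task name -> set of task names it depends on
           (assumed to name only tasks present in `tasks`)
    Returns a dict mapping task name -> "ok", "failed", or "skipped".
    """
    graph = {name: deps.get(name, set()) for name in tasks}
    status = {}
    # static_order() yields every task after all of its dependencies.
    for name in TopologicalSorter(graph).static_order():
        # Skip a task if any upstream task did not finish successfully.
        if any(status.get(d) != "ok" for d in deps.get(name, set())):
            status[name] = "skipped"
            continue
        # One initial attempt plus up to max_retries retries.
        for _ in range(1 + max_retries):
            try:
                tasks[name]()
                status[name] = "ok"
                break
            except Exception:
                status[name] = "failed"
    return status
```

Even a toy like this shows why a crontab alone falls short: cron only knows when to start a script, not whether the script it depends on succeeded.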

What is sufficient?

Let’s write our own workflow engine

What do we want from the workflow engines?

Oozie

Advantages

Disadvantages

Luigi

Advantages

Disadvantages

Azkaban

Advantages

Disadvantages
