
Twister4Azure is a distributed, decentralized iterative MapReduce runtime for the Windows Azure cloud, built on Azure cloud infrastructure services. It extends the MapReduce paradigm with extensions and optimizations for iterative MapReduce applications: it supports caching of loop-invariant data, adds a new merge step (map->reduce->merge) to the programming model, and introduces a novel cache-aware task scheduling mechanism. Twister4Azure running in the Azure cloud outperforms Hadoop running on a local cluster by a factor of 2 to 4.
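
The map->reduce->merge loop with cached loop-invariant data can be illustrated with a minimal single-process sketch. This is not the Twister4Azure API: `run_iterative_mapreduce`, the `kmeans_*` functions, and the 1-D k-means example are hypothetical names chosen only to show the shape of the programming model, assuming the merge step both combines the reduce outputs and decides whether another iteration is needed.

```python
from collections import defaultdict


def run_iterative_mapreduce(partitions, broadcast, map_fn, reduce_fn, merge_fn,
                            max_iterations=20):
    """Run map -> reduce -> merge repeatedly until merge_fn reports convergence.

    Hypothetical driver for illustration only; not the Twister4Azure API.
    """
    # Loop-invariant input: stored once and reused by every iteration.
    cache = dict(enumerate(partitions))
    iteration = 0
    for iteration in range(1, max_iterations + 1):
        grouped = defaultdict(list)
        # Map: each cached partition is processed with the current broadcast data.
        for part in cache.values():
            for key, value in map_fn(part, broadcast):
                grouped[key].append(value)
        # Reduce: one result per key.
        reduced = {key: reduce_fn(key, values) for key, values in grouped.items()}
        # Merge: combine all reduce outputs and decide whether to iterate again.
        broadcast, done = merge_fn(reduced, broadcast)
        if done:
            break
    return broadcast, iteration


def kmeans_map(points, centroids):
    """Emit (nearest centroid index, (point, 1)) for every point."""
    for p in points:
        nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
        yield nearest, (p, 1)


def kmeans_reduce(key, values):
    """Average the points assigned to one centroid."""
    total = sum(v[0] for v in values)
    count = sum(v[1] for v in values)
    return total / count


def kmeans_merge(reduced, old_centroids):
    """Build the new centroid list and test for convergence."""
    new = [reduced.get(i, c) for i, c in enumerate(old_centroids)]
    done = all(abs(a - b) < 1e-9 for a, b in zip(new, old_centroids))
    return new, done
```

For example, `run_iterative_mapreduce([[1.0, 2.0], [9.0, 10.0], [1.5, 9.5]], [0.0, 5.0], kmeans_map, kmeans_reduce, kmeans_merge)` keeps the three input partitions in `cache` across iterations, while only the small centroid list changes between them.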

AzureMapReduce Architecture

  • Decentralized architecture for clouds
    • Avoids single points of failure
    • Utilizes highly available and scalable cloud services
  • Efficient execution of Iterative MapReduce applications
    • Extends the MR programming model with iterative extensions
    • Multi-level data caching to overcome data access latencies
    • Cache-aware hybrid scheduling
    • Collective communication primitives for Iterative MapReduce
  • Support for traditional MapReduce and pleasingly parallel applications
  • Ability to execute multiple MR applications inside a single iteration
  • Dynamic scheduling achieving better load balancing
  • Typical MapReduce fault tolerance, ensuring eventual completion of your computation
  • Web-based monitoring console
  • Local testing and debugging using the Azure local emulator
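
The general idea behind cache-aware scheduling in a decentralized setting can be sketched as a small simulation. This is an assumption-laden illustration, not Twister4Azure's actual mechanism: the function `run_round`, the task and data identifiers, and the "prefer a cached task, otherwise take any pending task" rule are all hypothetical. Workers pull tasks themselves rather than being assigned work by a master.

```python
def run_round(pending, worker_caches):
    """Simulate one round in which workers take turns pulling tasks.

    pending:        dict mapping task id -> id of the input data it needs
    worker_caches:  dict mapping worker id -> set of data ids cached there
                    (updated in place as workers load new data)

    Hypothetical illustration only; not the Twister4Azure scheduler.
    """
    pending = dict(pending)  # work on a copy so the caller's dict is untouched
    assignments = {worker: [] for worker in worker_caches}
    while pending:
        for worker, cache in worker_caches.items():
            if not pending:
                break
            # Prefer a task whose input data this worker already holds.
            hit = next((t for t, d in pending.items() if d in cache), None)
            # Otherwise fall back to any remaining task (dynamic scheduling).
            task = hit if hit is not None else next(iter(pending))
            data_id = pending.pop(task)
            cache.add(data_id)  # the worker now has this data cached
            assignments[worker].append(task)
    return assignments
```

Calling `run_round` twice with the same task set and the same `worker_caches` shows the intended effect: in the second round every worker finds its tasks' data already in its own cache, so no input has to be fetched again.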

We are happy to provide support for scientific application development using Twister4Azure.

Please cite Twister4Azure using,

Subscribe to the Twister4Azure mailing list to receive announcements of new releases, get user support, and report issues. Visit the Twister4Azure Google Group.


  • Judy Qiu, Assistant Professor of Computer Science and Informatics, Indiana University, Bloomington, IN.
  • Thilina Gunarathne, PhD candidate, Computer Science, Indiana University, Bloomington, IN.


Last edited Feb 25, 2014 at 12:10 PM by thilina, version 13