Friday, March 22, 2013

Tutorial on Makeflow and Work Queue at CCGrid 2013

Dinesh Rajan will present a tutorial on Building Elastic Applications with Makeflow and Work Queue as part of CCGrid 2013 in Delft, the Netherlands, on May 13th. Come join us and learn how to write applications that scale up to hundreds or thousands of nodes running on clusters, clouds, and grids.

Elastic Apps Paper at CCGrid 2013

Dinesh Rajan will present his paper Case Studies in Designing Elastic Applications at the IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid) in Delft, the Netherlands. This work was done in collaboration with Andrew Thrasher and Scott Emrich from the Notre Dame Bioinformatics Lab, and Badi Abdul-Wahid and Jesus Izaguirre from the Laboratory for Computational Life Sciences.

The paper describes our experience in designing three different elastic applications -- E-MAKER, Elastic Replica Exchange, and Folding at Work -- that run on hundreds to thousands of cores using the Work Queue framework. The paper offers six guidelines for designing similar applications:

  1. Abolish shared writes.
  2. Keep your software close and your dependencies closer.
  3. Synchronize two, you make company; synchronize three, you make a crowd.
  4. Make tasks of a feather flock together.
  5. Seek simplicity, and gain power.
  6. Build a model before scaling new heights.

Thursday, March 21, 2013

Genome Assembly Paper in IEEE TPDS

A recent article in IEEE Transactions on Parallel and Distributed Systems describes our work in collaboration with the Notre Dame Bioinformatics Laboratory on SAND - The Scalable Assembler at Notre Dame.

In this article, we describe how to refactor the standard Celera genome assembly pipeline into a scalable computation that runs on thousands of distributed cores using the Work Queue framework. By explicitly handling the data dependencies between tasks, we are able to significantly improve runtime over Celera on a standard cluster. In addition, this technique allows the user to break free of the shared filesystem and run on hundreds to thousands of nodes drawn from clusters, clouds, and grids.
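To give a rough sense of what explicitly handling data dependencies can look like, here is a minimal sketch in plain Python. It is not the SAND code and does not use the Work Queue API. The `Task` record, the `schedule` function, and all task and file names are hypothetical and chosen only for illustration. Each task declares the files it reads and writes, and a task becomes runnable only once every one of its inputs exists, so no shared filesystem state is assumed.

```python
from collections import namedtuple

# Hypothetical task record: a name, the files it reads, and the files it writes.
Task = namedtuple("Task", ["name", "inputs", "outputs"])


def schedule(tasks, available):
    """Return task names in an order where every input exists before it is used.

    `available` is the set of files that are present before anything runs.
    """
    ready_files = set(available)
    pending = list(tasks)
    order = []
    while pending:
        # A task is runnable once all of its declared inputs have been produced.
        runnable = [t for t in pending if set(t.inputs) <= ready_files]
        if not runnable:
            names = ", ".join(t.name for t in pending)
            raise ValueError("unsatisfiable dependencies: " + names)
        for t in runnable:
            order.append(t.name)
            ready_files.update(t.outputs)
            pending.remove(t)
    return order


# Illustrative three-stage pipeline; the stage and file names are made up.
pipeline = [
    Task("align", ["candidates.txt", "reads.fa"], ["overlaps.txt"]),
    Task("filter", ["reads.fa"], ["candidates.txt"]),
    Task("consensus", ["overlaps.txt"], ["contigs.fa"]),
]

if __name__ == "__main__":
    print(schedule(pipeline, {"reads.fa"}))
```

Because every input and output is declared up front, a real system could ship exactly those files to whichever remote node runs the task, which is the property that makes a shared filesystem unnecessary.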