A recent article in IEEE Transactions on Parallel and Distributed Computing describes our work in collaboration with the Notre Dame Bioinformatics Laboratory on SAND - The Scalable Assembler at Notre Dame.
In this article, we describe how to refactor the standard Celera genome assembly pipeline into a scalable computation that runs on thousands of distributed cores using the Work Queue. By explicitly handling the data dependencies between tasks, we are able to significantly improve runtime over Celera on a standard cluster. In addition this technique allows the user to break free of the shared filesystem and run on hundreds thousands of nodes drawn from clusters, clouds, and grids.
No comments:
Post a Comment