- Casey Robinson presented Automated Packaging of Bioinformatics Workflows for Portability and Durability Using Makeflow at the Workshop on Scientific Workflows (WORKS).
- Patrick Donnelly presented Design of an Active Storage Cluster Filesystem for DAG Workflows at the Workshop on Data Intensive Computing Systems (DISCS).
- Prof. Thain gave the opening talk, Toward a Common Model of Highly Concurrent Applications at the Workshop on Many Task Computing on Clusters, Grids, and Supercomputers (MTAGS).
Monday, November 18, 2013
Friday, October 11, 2013
The workshop is an opportunity for beginners and experts alike to learn about the latest software release, get to know the CCL team, meet other people engaged in large scale scientific computing, and influence the direction of our research and development.
Thursday, October 10th
Afternoon: Introduction and Software Tutorials
Friday, October 11th
All Day: New Technology, Scientific Talks, and Discussion
There is no registration fee; however, space is limited, so please register in advance to reserve your place. We hope to see you there!
Wednesday, August 21, 2013
In the original design of Work Queue, each worker was a sequential process that executed one task at a time. This paper describes the extension of Work Queue in two respects:
- Workers can now run multiple tasks simultaneously, each sharing a local cache directory.
- Workers can be combined into hierarchies, each headed by a foreman, which provides a common disk cache for each sub tree.
The effect of these two changes is to dramatically reduce the network footprint at the master process and at each execution site. The resulting system is more 'friendly' to local clusters and is capable of scaling to even greater sizes.
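As a sketch, a foreman can be placed on a cluster head node between the master and the cluster's workers. The flag names below are illustrative assumptions, not exact 4.0 syntax; consult `work_queue_worker --help` in your release for the actual foreman and multi-slot options.

```shell
# Illustrative sketch only: flag names are assumptions; check your
# version's work_queue_worker --help for the exact foreman options.

# Start a foreman on the cluster head node, connecting to the master
# project "myproject" while listening for its own workers:
work_queue_worker --foreman -M myproject -p 9124

# Start multi-slot workers on the cluster nodes, pointing at the
# nearby foreman instead of the distant master:
work_queue_worker --cores 8 headnode.cluster.example.org 9124
```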
Monday, July 29, 2013
This is the first release of the 4.0 series, with some major changes:
- To support new features in Work Queue, backwards compatibility between pre-4.0 masters and workers is broken. Specifically, workers from 4.0 cannot connect to masters from before 4.0, and masters from 4.0 will not accept connections from workers from before 4.0. The API did not change, so unless you want to take advantage of the new features, you should not need to modify your code.
- All code related to WorkQueue has been consolidated to its own library. When linking work queue applications in C, you will need to use: -lwork_queue -ldttools rather than just -ldttools. If you are using the perl or python bindings, no change is necessary.
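For example, a C application that previously linked only against -ldttools would now be built along these lines (paths are illustrative; adjust the prefix to your cctools install):

```shell
# Illustrative build line; CCTOOLS points at your cctools install prefix.
CCTOOLS=$HOME/cctools
gcc my_wq_app.c -o my_wq_app \
    -I"$CCTOOLS/include/cctools" \
    -L"$CCTOOLS/lib" -lwork_queue -ldttools -lm
```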
- The auto-mode option -a for communicating with the catalog server is now deprecated. It is implied when a master or project name (-M, -N) is specified.
- Most tools now support long options at the command line (e.g., --help).
- [WorkQueue] Support for worker hierarchies, with a master-foreman-workers paradigm. [Michael Albrecht]
- [WorkQueue] Multi-slot workers. A worker is now able to handle more than one task at a time. [Michael Albrecht]
- [WorkQueue] Resource reports. A worker reports its resources (disk, memory, and cpu) to the master, and each task in the master can specify a minimum of such resources. [Michael Albrecht, DeVonte Applewhite]
- [WorkQueue] Authentication between master and workers when using the catalog server [Douglas Thain].
- [WorkQueue] Python bindings now include most C API. [Dinesh Rajan]
- [WorkQueue] Several bug fixes and code reorganization. [Dinesh Rajan, Michael Albrecht]
- [WorkQueue] Policies can be specified to work_queue_pool to submit workers on demand. [Li Yu]
- [Makeflow] Support for task categories. A rule can be labeled with a category, and required computational resources (disk, memory, and cpu) can be specified per category. Makeflow then automatically communicates these requirements to Work Queue or Condor. [Ben Tovar]
- [Parrot/Chirp] Support for a search system call added. Search allows for finding files in a number of directories with a shell pattern. See parrot_search for more information. [Patrick Donnelly, Brenden Kokoszka]
- [Parrot] Several bug fixes for cvmfs support. [Douglas Thain, Ben Tovar, Patrick Donnelly]
- [Monitor] A resource monitor/watchdog for computational resources (e.g. disk, memory, cpu, and io) that can be used standalone, or automatically by Makeflow and Work Queue. [Ben Tovar]
- [Monitor] A visualizer that builds a webpage to show the resources histograms from the reports of the resource monitor. [Casey Robinson]
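As a sketch of the new task categories, a Makeflow file might label rules with a category and attach resource requirements to it. The variable names below (CATEGORY, CORES, MEMORY, DISK) are assumptions based on later Makeflow syntax; see the doc/ directory for the authoritative form.

```
# Illustrative sketch; variable names are assumptions, see doc/ for details.
CATEGORY="analysis"
CORES=1
MEMORY=512     # MB
DISK=1024      # MB

stats.out: data.in
	./analyze data.in > stats.out
```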
Please refer to the doc/ directory in the distribution for the usage of these new features. You can download the software here:
Thanks go to the contributors for this release: Michael Albrecht, DeVonte Applewhite, Peter Bui, Patrick Donnelly, Brenden Kokoszka, Kyle Mulholland, Francesco Prelz, Dinesh Rajan, Casey Robinson, Peter Sempolinski, Douglas Thain, Ben Tovar, and Li Yu.
Friday, July 26, 2013
Congratulations to Dr. Li Yu, who successfully defended his Ph.D. dissertation, Right-sizing Resource Allocations for Scientific Applications in Clusters, Grids, and Clouds!
Monday, July 22, 2013
We will be offering a tutorial titled Building Scalable Scientific Applications using Makeflow and Work Queue as part of XSEDE 2013 in San Diego on July 22.
Saturday, June 1, 2013
- Jesus Izaguirre, University of Notre Dame and Eric Darve, Stanford University
Tuesday, May 21, 2013
Friday, March 22, 2013
Dinesh Rajan will present a tutorial on Building Elastic Applications with Makeflow and Work Queue as part of CCGrid 2013 in Delft, the Netherlands on May 13th. Come join us and learn how to write applications that scale up to hundreds or thousands of nodes running on clusters, clouds, and grids.
Dinesh Rajan will present his paper Case Studies in Designing Elastic Applications at the IEEE International Conference on Clusters, Clouds, and Grids (CCGrid) in Delft, the Netherlands. This work was done in collaboration with Andrew Thrasher and Scott Emrich from the Notre Dame Bioinformatics Lab, and Badi Abdul-Wahid and Jesus Izaguirre from the Laboratory for Computational Life Sciences.
The paper describes our experience in designing three different elastic applications -- E-MAKER, Elastic Replica Exchange, and Folding at Work -- that run on hundreds to thousands of cores using the Work Queue framework. The paper offers six guidelines for designing similar applications:
- Abolish shared writes.
- Keep your software close and your dependencies closer.
- Synchronize two, you make company; synchronize three, you make a crowd.
- Make tasks of a feather flock together.
- Seek simplicity, and gain power.
- Build a model before scaling new heights.
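The first guideline, "abolish shared writes", can be illustrated with a minimal standalone sketch (not Work Queue code itself): each task writes only to its own private output file, and the master merges results in a separate step, so no two tasks ever contend for the same file. All names here are invented for illustration.

```python
import os
import tempfile

# Each "task" writes to its own private output file (no shared writes).
def run_task(task_id, workdir):
    out = os.path.join(workdir, "task_%d.out" % task_id)
    with open(out, "w") as f:
        f.write("result-%d\n" % task_id)
    return out

# The master merges private outputs in a single, serial reduce step.
def merge(outputs, workdir):
    merged = os.path.join(workdir, "merged.out")
    with open(merged, "w") as f:
        for path in sorted(outputs):
            with open(path) as part:
                f.write(part.read())
    return merged

workdir = tempfile.mkdtemp()
outputs = [run_task(i, workdir) for i in range(3)]
merged = merge(outputs, workdir)
```

Because every writer owns its file, tasks can run on any node in any order without locking or a shared filesystem.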
Thursday, March 21, 2013
A recent article in IEEE Transactions on Parallel and Distributed Systems describes our work in collaboration with the Notre Dame Bioinformatics Laboratory on SAND - The Scalable Assembler at Notre Dame.
In this article, we describe how to refactor the standard Celera genome assembly pipeline into a scalable computation that runs on thousands of distributed cores using the Work Queue framework. By explicitly handling the data dependencies between tasks, we are able to significantly improve runtime over Celera on a standard cluster. In addition, this technique allows the user to break free of the shared filesystem and run on hundreds or thousands of nodes drawn from clusters, clouds, and grids.
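A toy sketch (not SAND itself) of what "explicitly handling data dependencies" means: given each task's declared inputs, a scheduler can release a task the moment its dependencies complete, rather than relying on a shared filesystem to serialize the pipeline. The task names below are invented for illustration.

```python
from collections import deque

# Each task lists the tasks whose outputs it consumes (names are invented).
deps = {
    "overlap": [],
    "align_a": ["overlap"],
    "align_b": ["overlap"],
    "assemble": ["align_a", "align_b"],
}

def schedule(deps):
    """Release tasks in dependency order (a simple topological sort)."""
    remaining = {t: set(d) for t, d in deps.items()}
    ready = deque(t for t, d in remaining.items() if not d)
    order = []
    while ready:
        t = ready.popleft()
        order.append(t)
        # A finished task may unblock tasks that were waiting on it.
        for u, d in remaining.items():
            if t in d:
                d.remove(t)
                if not d:
                    ready.append(u)
    return order

order = schedule(deps)
```

Independent tasks (here, the two alignment steps) become eligible at the same time and can run on separate workers.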
Monday, February 18, 2013
The Cooperative Computing Lab is pleased to announce the release of version 3.7.0 of the Cooperative Computing Tools, including Parrot, Chirp, Makeflow, WorkQueue, SAND, All-Pairs, and other software.
The software may be downloaded here.
This is a minor release which adds numerous features and fixes several bugs:
- [WorkQueue] It is now possible to specify chunks (pieces) of an input file to be used as input for worker tasks. [Dinesh Rajan]
- [Chirp] File extended attributes are now supported. [Patrick Donnelly]
- [Makeflow] New -i switch now outputs pre-execution analysis of Makeflow DAG. [Li Yu]
- [WorkQueue/Makeflow] Support for submitting tasks to the PBS batch submission platform added. [Dinesh Rajan]
- [Makeflow] makeflow_log_parser now ignores comments in Makeflow logs. [Andrew Thrasher]
- [Catalog] New catalog_update which reports information to a catalog server. [Peter Bui, Dinesh Rajan]
- [WorkQueue] Various minor tweaks made to the API. [Li Yu, Dinesh Rajan]
- [Catalog/WorkQueue] Support added for querying workers and tasks at run-time. [Douglas Thain]
- [WorkQueue] Many environment variables removed in favor of option manipulation API. [Li Yu]
- [Makeflow] Deprecated -t option (capacity tolerance) removed.
- [WorkQueue] -W (worker status) now has working_dir and current_time fields.
- [WorkQueue] -T (task status) now reports working_dir, current_time, address_port, submit_to_queue_time, send_input_start_time, execute_cmd_start_time. [Li Yu]
- [WorkQueue] -Q (queue status) now reports working_dir.
- [Makeflow] Input file (dependency) renaming supported with new "->" operator. [Michael Albrecht, Ben Tovar]
- [WorkQueue] work_queue_pool now supports a new -L option to specify a log file. [Li Yu]
- [WorkQueue] Tasks are now killed using SIGKILL.
- [WorkQueue] Protocol based keep-alives added to workers. [Dinesh Rajan]
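Of the features above, the new "->" renaming operator can be sketched in a Makeflow rule as follows; the exact syntax is an assumption based on the Makeflow manual, so check the documentation for your version.

```
# Sketch: input.dat is delivered to the execution site as data.in.
output.dat: input.dat->data.in
	./process data.in > output.dat
```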
Thanks go to the contributors for many minor features and bug fixes:
- Michael Albrecht
- Peter Bui
- Patrick Donnelly
- Brian Du Sell
- Kyle Mulholland
- Dinesh Rajan
- Douglas Thain
- Andrew Thrasher
- Ben Tovar
- Li Yu
Please send any feedback to the CCTools discussion mailing list.
Monday, February 11, 2013
The Cooperative Computing Lab is pleased to announce the release of version 3.6.2 of the Cooperative Computing Tools, including Parrot, Chirp, Makeflow, WorkQueue, SAND, All-Pairs, and other software.
This is a bug fix release of version 3.6.1. No new features were added.
The software may be downloaded here:
- [WorkQueue] Corrected memory errors leading to a SEGFAULT. [Li Yu]
- [Makeflow] Properly interpret escape codes in Makeflow files: \n, \t, etc. [Brian Du Sell]
- [Parrot] Watchdog now properly honors minimum wait time. [Li Yu]
- [Parrot] Reports the logical executable name for /proc/self/exe instead of the physical name. [Douglas Thain]
- [WorkQueue] Race conditions in signal handling for workers were corrected. Tasks now have a unique process group to properly kill all task children on abort. [Dinesh Rajan, Li Yu]
- [WorkQueue] Corrected handling of the -C option, where a worker would not use the same catalog server as work_queue_pool. [Li Yu]
Thanks go to the contributors for this release: Patrick Donnelly, Brian Du Sell, Dinesh Rajan, Douglas Thain, and Li Yu.
Tuesday, January 15, 2013
- Ronald J. Nowling and Jesus A. Izaguirre, University of Notre Dame
Tuesday, January 1, 2013
- Eric Lyons, University of Arizona