Monday, June 11, 2018
Early Experience Using Amazon Batch for Scientific Workflows reports some of our practical experience using Amazon Batch for scientific workflows, comparing the performance of plain EC2 virtual machines, Amazon Batch, and the Work Queue system overlaid on top of virtual machines.
Efficient Integration of Containers into Scientific Workflows explores different methods of composing container images into complex workflows, in order to make efficient use of shared filesystems and reduce data movement.
Monday, June 4, 2018
Tuesday, May 29, 2018
Friday, May 25, 2018
VC3 makes it easy for science groups to deploy custom software stacks across existing university clusters and national facilities. For example, if you want to run your own private Condor pool across three clusters and share it with your collaborators, then VC3 is for you.
We are now running VC3 as a "limited beta" for early adopters who would like to give it a try and send us feedback. Check out the instructions and invitation to sign up.
Tuesday, May 22, 2018
It was a busy graduation weekend here at Notre Dame! The CSE department graduated nineteen PhD students, including CCL grads Dr. Peter Ivie and Dr. James Sweet. Prof. Thain gave the graduation address at the CSE department ceremony. Congratulations and good luck to everyone!
Wednesday, April 25, 2018
In Automatic Dependency Management for Scientific Workflows (paper slides) we introduce a tool for deploying software environments on clusters. This tool, called the vc3-builder, has minimal dependencies and a lightweight bootstrap, which allows it to be deployed alongside batch jobs. The vc3-builder then installs any missing software using only user privileges (e.g., no sudo) so that the actual user payload can be executed. The vc3-builder is being developed as part of the DOE-funded Virtual Clusters for Community Computation (VC3) project, in which users can construct custom short-lived virtual clusters across different computational sites.
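To give a flavor of how this looks in practice, here is a minimal sketch of invoking the builder before a payload; the package name and payload are purely illustrative, and the exact flag spellings may differ in your installed version, so check the vc3-builder help output.

# Illustrative only: ask the builder to satisfy a dependency with user
# privileges, then run the payload once the environment is ready.
./vc3-builder --require samtools -- ./my_analysis.sh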
In MAKER as a Service: Moving HPC applications to Jetstream Cloud (paper poster slides) we discussed the lessons learn in migrating MAKER, a traditional HPC application, to the cloud. This focused on issues like recreating the software stack using VC3-Bulder, addressing the lack of shared filesystems and inter-node communications with Work Queue, and building the application focused on user feedback allowing for informed decisions in the cloud. Using WQ-MAKER we were able to run MAKER not only on Jetstream, but also resources from Notre Dame's Condor cluster. Below you can see the systems architecture.
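Because the workers are decoupled from the manager, resources can be attached from anywhere. As a rough sketch (the hostname, port, and worker count below are hypothetical), workers can be started by hand or submitted through Condor:

# Attach a standalone worker from any machine that can reach the manager:
work_queue_worker wq-maker.example.edu 9123

# Or submit a batch of workers through the local Condor pool:
condor_submit_workers wq-maker.example.edu 9123 10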
Monday, March 12, 2018
You can find the slides and tutorial at: CCL at CyVerse Container Camp
Here you can see the active participants:
Monday, January 15, 2018
If you have discovered a new result, published a paper, given a talk, or just done something cool, please take a few minutes to tell us about it on this simple form.
If we accept your submission, we will highlight your story on our website, include a mention in our annual report to the NSF, and send you some neat CCL stickers and swag as a little thank-you.
You can see what others have submitted on our community highlights page.
Monday, December 4, 2017
- Thinking opportunistically
- Overview of the Cooperative Computing Tools
- Makeflow using Work Queue as a batch system
- Makeflow using the Cloud as a batch system
- Specifying and managing resources
- Using containers in and on a Makeflow
- Specifying Makeflow with JSON and JX
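As a small taste of the last topic, here is a minimal sketch of a workflow expressed in JX; the command and file names are purely illustrative, and the exact syntax accepted by your Makeflow version may differ slightly.

{
    # Hypothetical JX workflow: three independent simulation tasks,
    # generated with a list comprehension.
    "rules": [
        {
            "command": format("./simulate %d > output.%d.txt", N, N),
            "inputs":  ["simulate"],
            "outputs": [format("output.%d.txt", N)]
        } for N in range(3)
    ]
}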
The initial solution, which is effective in many cases, is to turn on garbage collection. Garbage collection in Makeflow tracks each created file from its creation until it is no longer needed, at which point it is deleted. This works well to limit the active footprint of the workflow. However, the user is still left in a situation where they do not know ahead of time how much space the workflow needs to execute.
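For reference, garbage collection is enabled from the command line. The sketch below reflects how we recall the option being spelled; the flag and collection types may vary between cctools releases, so consult makeflow --help for the exact form.

# Enable reference-counted garbage collection so intermediate files are
# deleted as soon as no remaining rule needs them (flag spelling may vary).
makeflow -g ref_count example.makeflow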
To resolve this, we added an algorithm that estimates the size of the workflow and the minimum storage needed to execute it. This is done by determining the different paths of execution and finding the resulting minimum path(s) through the workflow. This is most accurately done by estimating and labeling the sizes of the files in the Makeflow:
.SIZE test.dat 1024K
Using this information, Makeflow can statically analyze the workflow and tell you the minimum and maximum storage needed to execute it. This information can then be coupled with a run-time storage manager and garbage collection to stay within a user-specified limit. Instead of actively scheduling in an order that prevents exceeding the limit, nodes are submitted when there is enough space for the node to run and for all of its children. This allows for more concurrency when space permits. Below is an image that shows how this limit can be used at different levels.
This first image shows a bioinformatics workflow running using the minimum required space. We can see several (10) peaks in the workflow. Each of these corresponds to a larger set of intermediate files that can be removed later. In the naive case where we don't track storage, these can all occur at the same time, using more storage than may be available.
- Label the files. A slight over-estimate works just as well, since the exact size is not known ahead of time. The default size is 1G.
.SIZE file.name 5M
- Find the estimated size of the Makeflow.
makeflow --storage-print storage-estimate.dat
- Run Makeflow, setting a mode and limit. The limit can be anywhere between the min and the max. Type 1 indicates minimum tracking, which keeps usage below the set limit.
makeflow --storage-type 1 --storage-limit 10G
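Putting the pieces together, here is a minimal sketch of what a labeled Makeflow might look like; the file names, sizes, and commands are made up for illustration.

# Illustrative Makeflow with size labels (names and sizes are hypothetical).
.SIZE input.dat        2G
.SIZE intermediate.dat 5G
.SIZE output.dat       100M

intermediate.dat: input.dat stage_one
	./stage_one input.dat > intermediate.dat

output.dat: intermediate.dat stage_two
	./stage_two intermediate.dat > output.dat

Running the --storage-print step on such a file reports its estimated minimum and maximum footprint, and running it again with --storage-type 1 and a --storage-limit chosen between those two values keeps the execution under that limit while still allowing concurrency where space permits.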