Tuesday, October 20, 2015

CMS Case Study Paper at CHEP

Our case study on preserving and reproducing a high energy physics (HEP) application with Parrot has been accepted by the Journal of Physics: Conference Series (JPCS 2015).

The HEP application under investigation is called TauRoast and was written by our physics collaborator, Matthias. TauRoast is a complex single-machine application with many implicit and explicit dependencies: CVS, GitHub, the PyYAML website, personal websites, CVMFS, AFS, HDFS, NFS, and PanFS. The total size of these dependencies is about 166.8TB.

To make TauRoast reproducible, we propose a fine-grained dependency management toolkit based on Parrot that tracks the data actually used during execution and creates a reduced package containing none of the unused data. In this way, the original execution environment of 166.8TB is reduced to a package of 21GB. The correctness of the preserved package is demonstrated in three different environments - the original machine, a virtual machine on the Notre Dame Cloud Platform, and a virtual machine on the Amazon EC2 Platform.
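The packaging step can be illustrated with a minimal Python sketch. This is not the toolkit itself: it assumes that a list of accessed file paths has already been recorded by some tracing step, and the function name `build_reduced_package` and its parameters are hypothetical.

```python
import shutil
from pathlib import Path


def build_reduced_package(accessed, source_root, package_root):
    """Copy only the files named in `accessed` into `package_root`.

    `accessed` holds paths relative to `source_root` (assumed to come
    from a separate tracing step). The directory layout is preserved.
    Returns the total number of bytes copied.
    """
    source_root = Path(source_root)
    package_root = Path(package_root)
    copied_bytes = 0
    for rel in sorted(set(accessed)):
        src = source_root / rel
        if not src.is_file():
            # Skip directories and paths that no longer exist.
            continue
        dst = package_root / rel
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copy contents and basic metadata
        copied_bytes += src.stat().st_size
    return copied_bytes
```

Everything that was never accessed is simply left behind, which is where the size reduction comes from.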

 

Monday, October 19, 2015

OpenMalaria Preservation with Umbrella

Haiyan, working with Alexander from CRC, successfully preserved and reproduced a C++ application, openMalaria, using Umbrella. The dependencies of openMalaria include packages from yum repositories, OS images from the CCL website, and software and data dependencies from curateND. Through a JSON-format specification, Umbrella allows users to specify the complete execution environment for their application: hardware, kernel, OS, software, data, packages supported by package managers, environment variables, and command. Each dependency in an Umbrella specification also comes with its metadata - size, format, checksum, download URLs, and mountpoints. At runtime, Umbrella tries to construct the execution environment described in the specification with the help of sandboxing techniques such as Parrot and Docker, and then runs the user's task.
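To give a feel for such a specification, here is a small Python sketch. The section and field names, the SHA-256 checksum, and the example.org URL are illustrative assumptions, not the actual Umbrella schema; the helpers `missing_sections` and `verify_dependency` are hypothetical as well.

```python
import hashlib
import json
from pathlib import Path

# Sections named in the post; the exact key names are assumptions.
REQUIRED_SECTIONS = (
    "hardware", "kernel", "os", "software",
    "data", "package_manager", "environ", "cmd",
)

# A hypothetical specification with one data dependency.
EXAMPLE_SPEC = json.loads("""
{
  "hardware": {"arch": "x86_64"},
  "kernel": {"name": "linux"},
  "os": {"name": "centos"},
  "software": {},
  "data": {
    "input.dat": {
      "size": 5,
      "format": "plain",
      "checksum": "2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824",
      "urls": ["https://example.org/input.dat"],
      "mountpoint": "/tmp/input.dat"
    }
  },
  "package_manager": {"name": "yum", "list": []},
  "environ": {"PWD": "/tmp"},
  "cmd": "cat /tmp/input.dat"
}
""")


def missing_sections(spec):
    """Return the required sections that the specification lacks."""
    return [name for name in REQUIRED_SECTIONS if name not in spec]


def verify_dependency(dep, path):
    """Check a downloaded file against the size and checksum in its metadata."""
    data = Path(path).read_bytes()
    return (len(data) == dep["size"]
            and hashlib.sha256(data).hexdigest() == dep["checksum"])
```

Checking each downloaded dependency against its recorded size and checksum before mounting it is one way to make sure the reconstructed environment matches the one that was preserved.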

Tuesday, October 13, 2015

DAGViz Paper at Visual Performance Analysis Workshop

An Huynh will be presenting a paper on the visualization of task-parallel programs at the Visual Performance Analysis workshop at Supercomputing 2015.  (He is a student of Kenjiro Taura at the University of Tokyo who spent a semester working with us in the Cooperative Computing Lab.)

An developed DAGViz, a tool for exploring the rich structure of task-parallel programs, which requires associating DAG structure with performance measures at multiple levels of detail.  This allows the analyst to zoom in on trouble spots and view exactly where and how each basic block of a program ran on a multi-core machine.