Monday, August 9, 2021

New PythonTask Interface in Work Queue

The most recent version of Work Queue supports two different categories of tasks.  Standard Tasks describe a Unix command line and corresponding files, just as before.  The new PythonTask describes a Python function invocation and corresponding arguments:

def my_sum(x, y):
    import math
    return x+y

task = wq.PythonTask(mysum,10,20)
When a task is returned, the function value is available as t.output:
task = queue.wait(5);
if task:
    print("task {} completed with result {}".format(,task.output))
Underneath, a PythonTask serializes the desired function and arguments, and turns it into a standard task which can be remotely executed, using all of the existing capabilities of Work Queue.  And so, a task can be given a resource allocation, time limits, tags, and everything else needed to manage distributed execution:
Thanks for new CCL graduate student Barry Sly-Delgado for adding this new capability to Work Queue!  See the full documentation.

Monday, August 2, 2021

Harnessing HPC at User Level for High Energy Physics

Ben Tovar presented some recent work at the (virtual) CHEP 2021 conference:  "Harnessing HPC Resources for CMS Jobs Using a Virtual Private Network".

The future computing needs of the Compact Muon Solenoid (CMS) experiment will require making use of national HPC facilities.  These facilities have substantial computational power, but for a variety of reasons, are not set up to allow network access from computational nodes out to the Internet.  This presents a barrier for CMS analysis workloads, which expect to make use of wide area data federations (like XROOTD) and global filesystems (like CVMFS) in order to execute.

In this paper, Ben demonstrates a prototype that creates a user-level virtual private network (VPN) that is dynamically deployed alongside a running analysis application.  This trick here is to make the whole thing work without requiring any root-level privileges, because that simply isn't possible at an HPC facility.  The solution brings together a variety of technologies -- namespaces, openconnect, slirp4netns, ld_preload -- in order to provide a complete user-level solution:

The performance of this setup is a bit surprising: when running a single worker node, the throughput of a single file transfer is substantially lower when tunneled (196 Mbps) compared to native performance (872 Mbps).  However, when running twenty or so workers, the tunneled solution achieves the same aggregate bandwidth as native.  (872 Mbps)  The most likely explanation is that tunneling a TCP connections over another TCP connections results in substantial start-up penalty while both stacks perform slow-start.

You can try the solution yourself here: 

New CCL Swag!

Check out the new CCL swag! Send us a brief note of how you use Work Queue or Makeflow to scale up your computational work, and we'll send you some laptop stickers and other items as a small thank-you.