Friday, November 5, 2021

JX Language: REPL Tool and Dot Operator

Undergraduate student Jack Rundle has been making improvements to the JX language used throughout the CCTools package for expressing workflows, database queries, and other structured information.


First, we added a new command line tool, jx_repl, which provides an interactive REPL environment for working with the JX language.
In addition to standard JX evaluation, the tool reserves a number of symbols in the context that act as commands when entered ("help", "quit", etc.).  A full guide to the REPL is available in the CCTools documentation.  One interesting feature is that both the input expression and the output of each line are stored for the lifetime of the session.  Previous input expressions can be referenced via "in_#" and the associated outputs via "out_#".  Furthermore, JX resolves symbols within input expressions, so those expressions may themselves reference "out_#".
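For instance, a session might look something like this (a hypothetical transcript: the prompt labels and formatting are illustrative, and only the "in_#"/"out_#" names come from the tool itself):

in_0: [1,2,3,4].len()
out_0: 4

in_1: out_0 * out_0
out_1: 16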

Next, we added support for a new operator in JX: the “dot” operator, which resembles anaphoric macros in Lisp.  The dot operator is placed after an expression (A) and before a function call (B); JX evaluates it by inserting the expression as the first argument of the function, giving B(A).  For functions that take multiple parameters, the remaining arguments simply shift over by one.  For example:

BEFORE: len([1,2,3,4]) # 4
AFTER: [1,2,3,4].len() # 4

BEFORE: like("abc", "a.+") # true
AFTER: "abc".like("a.+") # true

BEFORE: format("ceil(%f) -> %d", 9.1, 10) # "ceil(9.1) -> 10"
AFTER: "ceil(%f) -> %d".format(9.1, 10) # "ceil(9.1) -> 10"

BEFORE: len(project(select([{"a": 1}, {"a": 2}], a>0), a)) # 2
AFTER: [{"a": 1}, {"a": 2}].select(a>0).project(a).len() # 2
 

In order to make this work, we did have to swap the parameter order for three functions: project(), select(), and like().  In return, we can now query the global catalog server with database-like queries:

fetch("http://catalog.cse.nd.edu:9097/query.json").select(type=="wq_master").project([name,tasks_total_cores])

Yields:

[
    ["earth.crc.nd.edu",7373],
    ["hallofa.ps.uci.edu",15],
    ["hpc-services1.oit.uci.edu",2],
    ["vm65-195.iplantcollaborative.org",1460],
    ...
]


Thursday, November 4, 2021

PONCHO Toolkit for Portable Python


PONCHO is a lightweight, Python-based toolkit that lets users synthesize environments from a concise, human-readable JSON file containing everything needed to build a self-contained Conda virtual environment for executing scientific applications on distributed systems. PONCHO is composed of three parts: poncho_package_analyze, poncho_package_create, and poncho_package_run.

poncho_package_analyze performs a static analysis of the dependencies used within a Python application. The output is a JSON file listing those dependencies.
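For instance, given a (hypothetical) script like the following, the tool inspects its imports; the module-to-package mapping shown in the comments is our assumption for illustration:

# application.py: a hypothetical input script
import parsl        # resolved to the conda package parsl
import work_queue   # bindings shipped with the conda package ndcctools (assumed mapping)
import topcoffea    # not available on conda, so resolved via pip

Running the tool on this script: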

poncho_package_analyze application.py spec.json

This will give you a dependency file like this:

{
    "conda": {
        "channels": [
            "defaults",
            "conda-forge"
        ],
        "packages": [
            "ndcctools=7.3.0",
            "parsl=1.1.0"
        ]
    },
    "pip": [
        "topcoffea"
    ]
}


Then if needed, you can manually add other kinds of code and data dependencies like this:

{
    "git": {
        "DATA_DIR": {
            "remote": "http://.../repo.git"
        }
    },
    "http": {
        "REFERENCE_DB": {
            "type": "file",
            "url": "https://.../example.dat"
        }
    }
}

poncho_package_create builds an environment from a JSON specification file. The specification may include Conda packages, Pip packages, remote Git repositories, and arbitrary files accessible via HTTPS.  The environment is then packaged into a tarball.

poncho_package_create spec.json env.tar.gz

poncho_package_run unpacks and activates an environment, then runs the given command within it. Any Git repositories or files specified in the environment are exposed through environment variables.

poncho_package_run -e env.tar.gz python application.py  
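Inside the application, those variables can then be read like any others. A minimal sketch, assuming the spec above and that each variable holds a local path (the exact semantics are PONCHO's, not shown here):

import os

# Paths injected by poncho_package_run, named after the keys in spec.json:
data_dir = os.environ["DATA_DIR"]          # local checkout of the Git repository
reference_db = os.environ["REFERENCE_DB"]  # local copy of the downloaded file

print("repo at:", data_dir)
print("data at:", reference_db)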

This programmable interface lets us take a Python application and easily move it from place to place within a cluster. It is in production use with the Coffea data analysis application and with the Parsl workflow system when using Work Queue as the execution system.

The poncho tools can be found in the latest release of the Cooperative Computing Tools.


WORKS Paper: Adaptive Resource Allocation for Heterogeneous Tasks in Dynamic Workflows

CCL graduate student Thanh Son Phung will be presenting his recent work on managing dynamic tasks at the WORKS workshop at Supercomputing 2021:

Dynamic workflows are emerging as the preferred class of workflow management systems because of the flexibility, convenience, and performance they offer users. They allow users to generate tasks automatically and programmatically at run time, abstract away the gory implementation details, and retain the intrinsic benefits of parallelism from the underlying distributed systems. The figure below shows the full picture of the transition from logical task generation to actual task deployment and execution in the Colmena-XTB workflow.

However, from a systems developer's standpoint, the dynamic nature of task generation poses a significant problem in terms of resource management: what quantity of resources should we allocate to a newly generated task? The figure below shows the memory consumption of tasks over time in the Colmena-XTB workflow.

As demonstrated, tasks vary significantly in their memory consumption (from 2 GB to 30 GB). A large allocation decreases the probability of task failure due to under-allocation, but increases the potential waste of resources, since a task may consume only a small portion of its allocation. A small allocation has the opposite effects.

We observe that task allocation can be automated and improved considerably by grouping tasks with similar consumption. A task scheduler can use information from completed tasks to allocate resources to ready tasks. The figure below shows our strategy visually: each task is first given the allocation marked by the blue line and, upon failure due to under-allocation, is retried with the allocation marked by the upper line.
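In pseudocode terms, the strategy amounts to a simple two-step retry. The sketch below is our illustration of that idea, not the paper's implementation; execute(), MemoryExceeded, and the allocation values are all hypothetical:

class MemoryExceeded(Exception):
    """Raised when a task's memory use exceeds its allocation."""

def execute(task, memory_gb):
    # Stand-in for dispatching the task to a worker with a memory limit.
    if task["peak_gb"] > memory_gb:
        raise MemoryExceeded()
    return "%s ok in %d GB" % (task["name"], memory_gb)

def run_with_retry(task, first_gb, max_gb):
    # First try the smaller "blue line" allocation; on failure due to
    # under-allocation, retry once with the larger upper-bound allocation.
    try:
        return execute(task, first_gb)
    except MemoryExceeded:
        return execute(task, max_gb)

print(run_with_retry({"name": "t1", "peak_gb": 12}, first_gb=4, max_gb=30))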

We evaluated our strategies on seven datasets of resource consumption and observed a substantial improvement in resource allocation efficiency. In detail, the average task consumption efficiency under our allocation strategies ranges from 16.1% to 93.9%, with a mean of 62.1%.

Read the full paper here:

Thanh Son Phung, Logan Ward, Kyle Chard, and Douglas Thain, Not All Tasks Are Created Equal: Adaptive Resource Allocation for Heterogeneous Tasks in Dynamic Workflows, WORKS Workshop on Workflows at Supercomputing, November 2021.