Samuel Huang, an exchange student in the iSURE program, recently completed a summer project with the Cooperative Computing Lab at the University of Notre Dame. He developed tools for visualizing the performance and behavior of distributed Work Queue applications. These applications can run on thousands of nodes and may have surprisingly complex behavior over time. Visualization is key to understanding what's going on.
For example, this display shows an application consisting of about 30,000 tasks. Each line segment shows one task from beginning to end, sorted by submission time. (The color indicates the type of each task: preprocessing, processing, or accumulation.) As this display clearly shows, the application goes through several distinct phases, in which tasks of different types take increasing amounts of time. In fact, the last few thousand tasks take much longer, showing the classic "long tail" behavior common to distributed applications.
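To make the layout of such a display concrete, here is a minimal sketch of how the segments could be computed. It is not the tool's actual code: the `Task` record, its field names, the `COLORS` mapping, and the `task_segments` helper are all hypothetical, chosen only to illustrate "one segment per task, sorted by submission time, colored by type."

```python
from typing import NamedTuple


class Task(NamedTuple):
    # Hypothetical record; the real tool's log format may differ.
    submit: float  # time the task was submitted
    start: float   # time the task began running
    end: float     # time the task finished
    kind: str      # "preprocessing", "processing", or "accumulation"


# Hypothetical color choices, one per task type.
COLORS = {"preprocessing": "blue", "processing": "green", "accumulation": "red"}


def task_segments(tasks):
    """Return one (row, start, end, color) tuple per task.

    Rows are assigned in order of submission time, so the earliest
    submitted task lands on row 0.
    """
    ordered = sorted(tasks, key=lambda t: t.submit)
    return [(row, t.start, t.end, COLORS[t.kind]) for row, t in enumerate(ordered)]


# Three made-up tasks, deliberately listed out of submission order.
tasks = [
    Task(submit=2.0, start=2.5, end=9.0, kind="accumulation"),
    Task(submit=0.0, start=0.1, end=1.0, kind="preprocessing"),
    Task(submit=1.0, start=1.2, end=3.0, kind="processing"),
]
segments = task_segments(tasks)
```

Each tuple could then be drawn as a horizontal line with a plotting library, for instance matplotlib's `hlines`. In a plot like this, a long tail would show up as a few segments near the top that stretch far to the right of the rest.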