Two-phase IO Enabling Large-scale Performance Introspection (poster)

End-to-End Framework - Viveka

Authors: Fan, K., Kumar, S.

Publication: SC23 The International Conference for High Performance Computing, Networking, Storage, and Analysis, Denver, CO

Numerous sophisticated profiling and visualization tools have been developed to enable programmers to expose semantic information from their application components. However, effective and interactive exploration of the profiles of large-scale parallel programs remains a challenge due to the high I/O overheads of profiles and the difficulties in scaling downstream visualization tools. In this poster, we present a full-stack approach to a performance introspection framework that tackles key challenges in profiling and visualizing performance data at scale. Our novelty lies in a scalable, compact data model and a two-phase I/O system, which together make the profiler scalable and keep its overhead low (< 5%) even at high process counts. We then build a web-based, visual-analytic dashboard with linked views. Our profiling and visualization tools are both lightweight and easy to use, striking a balance between providing sophisticated features and operating quickly and efficiently at high process counts.
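
The abstract does not describe how the two-phase I/O system works internally. As a rough illustration of the general technique only, the sketch below simulates it in plain Python: in phase one, each rank's profile records are gathered onto a smaller set of aggregator ranks; in phase two, only the aggregators write files. The function name `two_phase_write`, the `ranks_per_aggregator` parameter, the record format, and the output file naming are all assumptions made for this example and are not taken from the poster or from Viveka.

```python
import os


def two_phase_write(per_rank_records, ranks_per_aggregator, out_dir):
    """Simulate two-phase I/O for per-rank profile records (hypothetical sketch).

    per_rank_records: list indexed by rank; each entry is a list of strings.
    ranks_per_aggregator: how many consecutive ranks share one aggregator.
    Returns the list of file paths written, one per aggregator.
    """
    # Phase 1 (aggregation): route each rank's records to its aggregator,
    # taken here to be the first rank of each consecutive group.
    groups = {}
    for rank, records in enumerate(per_rank_records):
        aggregator = (rank // ranks_per_aggregator) * ranks_per_aggregator
        groups.setdefault(aggregator, []).append((rank, records))

    # Phase 2 (I/O): only aggregators touch the file system, so the number
    # of files is roughly num_ranks / ranks_per_aggregator, not num_ranks.
    paths = []
    for aggregator, members in sorted(groups.items()):
        path = os.path.join(out_dir, f"profile_agg{aggregator}.txt")
        with open(path, "w") as f:
            for rank, records in members:
                for record in records:
                    f.write(f"{rank}\t{record}\n")
        paths.append(path)
    return paths
```

In a real MPI program the first phase would be a communication step (for example, a gather within each group) rather than a dictionary update, and the aggregator count would be tuned to the file system.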

This work was funded in part by NSF RII Track-4 award 2132013, NSF PPoSS planning award 2217036, NSF PPoSS large award 2316157, and NSF collaborative research award 2221811. We thank the ALCF's Director's Discretionary (DD) program for offering us the compute hours to run our experiments on the Theta supercomputer.

Date: November 12, 2023 - November 17, 2023

