Flagship
Measuring where the energy goes on the GPU
Energy-measurement tooling for Kokkos, the US Department of Energy's performance-portability framework. Connectors for Kokkos Tools (the sampling daemon merged, the backends in review), plus an analysis dashboard.
The problem
Kokkos lets one C++ source run across NVIDIA, AMD, and Intel GPUs, which is exactly why energy is hard to reason about: the same kernel draws different power on every backend, and application teams had no portable way to see it. On DOE machines, where power is now a first-class constraint, that blind spot matters.
What I built
A set of Kokkos Tools connectors that sample power while kernels run and attribute the integrated energy to the Kokkos regions that caused it: an NVML backend for NVIDIA GPUs, a Variorum backend for node-level power, a background daemon sampling on a fixed interval, and CSV export. On top, a Python dashboard turns that output into per-kernel energy analysis. The connectors hook the Kokkos profiling interface, so application code is untouched.
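To make the attribution step concrete, here is a minimal Python sketch of the arithmetic: fixed-interval power samples are integrated over each region's start and end timestamps with the trapezoidal rule, and the energy is summed per region name. It illustrates the idea only and is not the connectors' code. The function names, the `(time, watts)` sample layout, and the `(name, start, end)` region layout are all assumptions made for this example.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict


def power_at(times, watts, t):
    """Linearly interpolate power at time t; clamp outside the sampled range."""
    if t <= times[0]:
        return watts[0]
    if t >= times[-1]:
        return watts[-1]
    i = bisect_right(times, t)  # times[i - 1] <= t < times[i]
    t0, t1 = times[i - 1], times[i]
    frac = (t - t0) / (t1 - t0)
    return watts[i - 1] + frac * (watts[i] - watts[i - 1])


def energy_between(times, watts, start, end):
    """Trapezoidal integral of power over [start, end], in joules."""
    lo = bisect_right(times, start)  # first sample strictly after start
    hi = bisect_left(times, end)     # first sample at or after end
    ts = [start] + times[lo:hi] + [end]
    ps = ([power_at(times, watts, start)]
          + watts[lo:hi]
          + [power_at(times, watts, end)])
    return sum(0.5 * (ps[k] + ps[k + 1]) * (ts[k + 1] - ts[k])
               for k in range(len(ts) - 1))


def attribute_energy(samples, regions):
    """Sum energy per region name.

    samples: list of (time_s, power_w), sorted by time (hypothetical layout).
    regions: list of (name, start_s, end_s) (hypothetical layout).
    """
    times = [t for t, _ in samples]
    watts = [w for _, w in samples]
    totals = defaultdict(float)
    for name, start, end in regions:
        totals[name] += energy_between(times, watts, start, end)
    return dict(totals)


# Made-up numbers, for illustration only.
samples = [(0.0, 100.0), (1.0, 200.0), (2.0, 200.0), (3.0, 100.0)]
regions = [("axpy", 0.0, 1.0), ("spmv", 1.0, 2.5), ("axpy", 2.5, 3.0)]
print(attribute_energy(samples, regions))
```

Interpolating at the region boundaries matters when kernels are short relative to the sampling interval: a region that falls entirely between two samples still receives a share of the energy rather than zero.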
Where it stands
The periodic-sampling daemon is merged into kokkos-tools, written up in an ORNL report, and presented as a poster, 'Understanding GPU Energy Dynamics in HPC Applications', at the 2025 Smoky Mountains Conference. The NVML and Variorum connectors are in review, with ROCm SMI sketched for AMD.