HPCG Performance Matrix - Galactic Edition


Understanding the Metrics (Weak Scaling Context)

We ran our study with a fixed local problem size per process (weak scaling). Because the instance types differ in how many processes they run, comparing raw, run-wide values is not always fair. The following table explains the metrics available in the "Data Sector" dropdown and how they might be interpreted in this weak scaling scenario, where the local problem size per process is fixed and the total number of MPI processes `N` varies.

| Original Metric (from HPCG Output) | Normalized Metric Name for Heatmap | Normalization Formula (Conceptual) | Expected Trend in Ideal Weak Scaling |
|---|---|---|---|
| `GFLOP/s Summary::Raw SpMV=` (and other raw GFLOP/s) | Per-Process SpMV GFLOP/s (or MG, DDOT, WAXPBY, Total) | V(Raw_Metric_GFLOPs) / N | Constant |
| `Final Summary::HPCG result is VALID with a GFLOP/s rating of=` | Per-Process Official FOM (GFLOP/s) | V(Official_FOM) / N | Constant |
| `Benchmark Time Summary::Total=` | Avg. Time per CG Iteration (approx.) | V(Total_Time) / Total_Optimized_CG_Iterations | Constant |
| `GB/s Summary::Raw Total B/W=` (and read/write) | Per-Process Total Memory BW (GB/s) (or Read/Write) | V(Total_BW) / N | Constant |
| `Memory Use Information::Total memory used for data (Gbytes)=` | Per-Process Memory Used (GB) | V(Memory_Used) / N | Constant |
| `Setup Information::Setup Time=` | Per-Process Setup Time (sec) | V(Setup_Time) / N | Constant / Slow Growth |
| `DDOT Timing Variations::Max DDOT MPI_Allreduce time=` | Max MPI_Allreduce Time (sec) | V(Max_Allreduce_Time) (compare raw values) | Slow Growth |
| `DDOT Timing Variations::Avg DDOT MPI_Allreduce time=` | Avg MPI_Allreduce Time (sec) | V(Avg_Allreduce_Time) (compare raw values) | Slow Growth |
| `Benchmark Time Summary::DDOT=` | Avg. DDOT Time per CG Iteration (approx.) | V(Total_DDOT_Time) / Total_Optimized_CG_Iterations | Constant / Slow Growth |
| `Spectral Convergence Tests::Unpreconditioned::Maximum iteration count=` | Unpreconditioned CG Iterations | V(Iteration_Count) (compare raw values) | Constant |
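The "Original Metric" column above uses flattened `Section::Key=` names for lines in the HPCG output file. A minimal sketch of extracting such lines might look like the following; the helper name and the sample text are illustrative, not the actual heatmap code:

```python
import re

# Match a flattened "Section::Key=value" line, capturing the key
# (everything before the '=') and a numeric value after it.
HPCG_LINE = re.compile(r"^(?P<key>.+?)=(?P<value>[-+0-9.eE]+)\s*$")

def parse_hpcg(text: str) -> dict:
    """Return a dict mapping flattened HPCG keys to float values."""
    metrics = {}
    for line in text.splitlines():
        m = HPCG_LINE.match(line)
        if m:
            metrics[m.group("key")] = float(m.group("value"))
    return metrics

# Made-up example values, in the key format used by the table above.
sample = (
    "GFLOP/s Summary::Raw SpMV=12.5\n"
    "Benchmark Time Summary::Total=60.2\n"
)
metrics = parse_hpcg(sample)
# metrics["GFLOP/s Summary::Raw SpMV"] -> 12.5
```

Keys that hold non-numeric values would need extra handling; this sketch only keeps lines whose right-hand side parses as a number.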

Note: `V(Metric)` denotes the value of a metric, and `N` is the total number of MPI processes used for the run. For metrics compared raw, observe how the value trends as `N` changes. "Total Optimized CG Iterations" is also extracted from the HPCG output. Metrics not listed in the table are reported as-is.
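The two normalization rules in the table reduce to dividing either by `N` or by the optimized CG iteration count. A minimal sketch, assuming the raw values have already been parsed from the HPCG output:

```python
def per_process(value: float, n: int) -> float:
    """V(Metric) / N: per-process form of a run-wide metric.

    Under ideal weak scaling this stays constant as N grows.
    """
    return value / n

def per_cg_iteration(value: float, cg_iters: int) -> float:
    """V(Total_Time) / Total_Optimized_CG_Iterations.

    Average cost per optimized CG iteration.
    """
    return value / cg_iters

# Illustrative numbers only: a 64-process run reporting 512 GFLOP/s
# total gives a per-process FOM of 8 GFLOP/s.
fom_per_process = per_process(512.0, 64)     # -> 8.0
time_per_iter = per_cg_iteration(2.5, 50)    # approx. 0.05 s/iteration
```

Metrics marked "(compare raw values)" in the table, such as the DDOT `MPI_Allreduce` times, skip both helpers and are plotted as-is.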