This study was run with a fixed local problem size per core/process (weak scaling). Instance types vary in their total core/process count (`N`), which also defines the X-axis of the scatter plot. Comparing raw values is therefore not an "apples-to-apples" comparison at a fixed global problem size; it instead shows how performance scales with `N`. The following table explains the available metrics and how they might be interpreted in this context.
| Original Metric (from HPCG Output) | Normalized Metric Name for Plot | Normalization Formula (Conceptual) | Expected Trend in Ideal Weak Scaling |
|---|---|---|---|
| GFLOP/s Summary::Raw SpMV= (and other raw GFLOP/s) | Per-Core SpMV GFLOP/s (or MG, DDOT, WAXPBY, Total) | V(Raw_Metric_GFLOPs) / N | Constant |
| Final Summary::HPCG result is VALID with a GFLOP/s rating of= | Per-Core Official FOM (GFLOP/s) | V(Official_FOM) / N | Constant |
| Benchmark Time Summary::Total= | Avg. Time per CG Iteration (approx.) | V(Total_Time) / Total_Optimized_CG_Iterations | Constant |
| GB/s Summary::Raw Total B/W= (and read/write) | Per-Core Total Memory BW (GB/s) (or Read/Write) | V(Total_BW) / N | Constant |
| Memory Use Information::Total memory used for data (Gbytes)= | Per-Core Memory Used (GB) | V(Memory_Used) / N | Constant |
| Setup Information::Setup Time= | Per-Core Setup Time (sec) | V(Setup_Time) / N | Constant / Slow Growth |
| DDOT Timing Variations::Max DDOT MPI_Allreduce time= | Max MPI_Allreduce Time (sec) | V(Max_Allreduce_Time) (compare raw values) | Slow Growth |
| DDOT Timing Variations::Avg DDOT MPI_Allreduce time= | Avg MPI_Allreduce Time (sec) | V(Avg_Allreduce_Time) (compare raw values) | Slow Growth |
| Benchmark Time Summary::DDOT= | Avg. DDOT Time per CG Iteration (approx.) | V(Total_DDOT_Time) / Total_Optimized_CG_Iterations | Constant / Slow Growth |
| Spectral Convergence Tests::Unpreconditioned::Maximum iteration count= | Unpreconditioned CG Iterations | V(Iteration_Count) (compare raw values) | Constant |
Note: `V(Metric)` refers to the value of the metric, and `N` is the total number of cores/MPI processes for the instance (taken from the JSON `coreMap`). "Total Optimized CG Iterations" is also extracted from the HPCG output. For metrics labeled "Per-Core" in the table, divide by `N` *before* creating the JSON file if you want the Y-axis to show the normalized values; otherwise the Y-axis shows raw values and the table guides interpretation. Metrics not explicitly listed as "Per-Core" are generally plotted as their raw values.
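
As a minimal sketch of how a "Per-Core" value might be computed before the JSON file is written, the snippet below parses a metric by the `Section::Key=` path used in the table's first column, divides by `N` from the `coreMap`, and assembles a scatter point. It is illustrative only: the file names (`hpcg_output.txt`, `coreMap.json`), the `coreMap` layout, the key path for the optimized iteration count, and the `parse_hpcg_value` / `per_core` helpers are assumptions, not part of the actual tooling; the flat `Section::Key=value` line format is assumed to match the table's first column.

```python
import json
import re


def parse_hpcg_value(hpcg_text: str, key_path: str) -> float:
    """Extract a numeric value from HPCG output, assuming flat "Section::Key=value" lines.

    key_path uses the same notation as the table above, e.g. "GFLOP/s Summary::Raw SpMV=".
    """
    match = re.search(rf"^{re.escape(key_path)}\s*([-+0-9.eE]+)",
                      hpcg_text, re.MULTILINE)
    if match is None:
        raise KeyError(f"{key_path!r} not found in HPCG output")
    return float(match.group(1))


def per_core(value: float, n: int) -> float:
    """Normalization used by the "Per-Core" rows of the table: V(Metric) / N."""
    return value / n


# Hypothetical usage: build scatter points for one instance type.
with open("hpcg_output.txt") as f:              # assumed output file name
    hpcg_text = f.read()
with open("coreMap.json") as f:                 # assumed layout: {"instance-type": N, ...}
    core_map = json.load(f)

n = core_map["example-instance-type"]           # N also supplies the X coordinate

raw_spmv = parse_hpcg_value(hpcg_text, "GFLOP/s Summary::Raw SpMV=")

# Total optimized CG iterations (this key path is an assumption; the note above
# only states that the count is extracted from the HPCG output).
cg_iters = parse_hpcg_value(
    hpcg_text, "Iteration Count Information::Total number of optimized iterations=")

points = {
    "per_core_spmv_gflops": {"x": n, "y": per_core(raw_spmv, n)},
    "time_per_cg_iteration": {
        "x": n,
        "y": parse_hpcg_value(hpcg_text, "Benchmark Time Summary::Total=") / cg_iters,
    },
}
print(json.dumps(points, indent=2))
```

The same pattern applies to the other "Per-Core" rows; metrics marked "compare raw values" would be written to the JSON unmodified.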