LayerNorm Benchmarks - Aggregated Results

This document combines benchmark results from multiple LayerNorm implementations.

Combined Summary and Visualization

[Figure (latency.svg): "Attention Implementation Latency". X-axis: Workload (LN_B16_S2048_D4096, LN_B16_S2048_D8192, LN_B16_S4096_D4096, LN_B16_S4096_D8192). Y-axis: Latency P50 (ms). Series: torch_layer_norm, hf_kernels_layer_norm. Generated 2025-10-31T20:13:56.885734 with Matplotlib v3.10.7.]
Cell: combine | 4.28s
======================================================================
LOADING BENCHMARK DATA
======================================================================
✓ PyTorch LayerNorm             : /__w/kernels-benchmarks/kernels-benchmarks/benches/layer_norm/impls/.uvnote/cache/4403c31e9bef6e648597b4fcc9cfdc402678aaa4f90636b74325f12d334214a3
✓ HF Kernels LayerNorm          : /__w/kernels-benchmarks/kernels-benchmarks/benches/layer_norm/impls/.uvnote/cache/bd278151199f29b397d85857b87922edaa39a62623fb28e0465de47d6a3bac74

  ✓ Found PyTorch LayerNorm
     Path: /__w/kernels-benchmarks/kernels-benchmarks/benches/layer_norm/impls/.uvnote/cache/4403c31e9bef6e648597b4fcc9cfdc402678aaa4f90636b74325f12d334214a3/layer_norm.jsonl
  ✓ Found HF Kernels LayerNorm
     Path: /__w/kernels-benchmarks/kernels-benchmarks/benches/layer_norm/impls/.uvnote/cache/bd278151199f29b397d85857b87922edaa39a62623fb28e0465de47d6a3bac74/layer_norm.jsonl

======================================================================
Summary: 2 found, 0 skipped, 0 missing
======================================================================

COMBINED BENCHMARK SUMMARY

impl                     wl                  p50(ms)  ok
hf_kernels_layer_norm    LN_B16_S2048_D4096     0.84  True
hf_kernels_layer_norm    LN_B16_S2048_D8192     1.66  True
hf_kernels_layer_norm    LN_B16_S4096_D4096     1.66  True
hf_kernels_layer_norm    LN_B16_S4096_D8192     3.27  True
torch_layer_norm         LN_B16_S2048_D4096     0.82  True
torch_layer_norm         LN_B16_S2048_D8192     1.68  True
torch_layer_norm         LN_B16_S4096_D4096     1.61  True
torch_layer_norm         LN_B16_S4096_D8192     3.33  True

GENERATING COMBINED VISUALIZATION

Loaded 8 records
✓ Visualization saved as latency.svg
Saved latency.png
✓ Visualization saved as latency.svg
✓ SVG visualization ready!

ANALYSIS COMPLETE
Total implementations analyzed: 2

Implementations included:
  ✓ PyTorch LayerNorm
  ✓ HF Kernels LayerNorm
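
To make the comparison in the combined summary table easier to read, the p50 values can be turned into per-workload ratios. The following is a minimal sketch, not part of the benchmark harness: the dictionary `P50_MS` and the helper `relative_latency` are hypothetical names introduced here, and the numbers are copied by hand from the table rather than read from the `layer_norm.jsonl` files, whose schema is not shown in this output.

```python
# p50 latencies in milliseconds, copied by hand from the combined summary table.
# P50_MS and relative_latency are illustrative names, not part of the benchmark code.
P50_MS = {
    "LN_B16_S2048_D4096": {"torch_layer_norm": 0.82, "hf_kernels_layer_norm": 0.84},
    "LN_B16_S2048_D8192": {"torch_layer_norm": 1.68, "hf_kernels_layer_norm": 1.66},
    "LN_B16_S4096_D4096": {"torch_layer_norm": 1.61, "hf_kernels_layer_norm": 1.66},
    "LN_B16_S4096_D8192": {"torch_layer_norm": 3.33, "hf_kernels_layer_norm": 3.27},
}


def relative_latency(p50_ms, baseline="torch_layer_norm",
                     candidate="hf_kernels_layer_norm"):
    """Return candidate/baseline p50 ratio per workload.

    A ratio below 1.0 means the candidate had the lower p50 latency.
    """
    return {wl: impls[candidate] / impls[baseline] for wl, impls in p50_ms.items()}


ratios = relative_latency(P50_MS)
for wl, ratio in ratios.items():
    # Show the ratio and the signed percentage difference versus the baseline.
    print(f"{wl}: hf/torch = {ratio:.3f} ({(ratio - 1) * 100:+.1f}%)")
```

On these numbers the two implementations stay within roughly 3% of each other: torch_layer_norm has the lower p50 on both D4096 workloads, and hf_kernels_layer_norm has the lower p50 on both D8192 workloads. The table reports a single p50 per workload with no spread, so differences of this size may be within run-to-run noise.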

Artifacts:

latency.svg