lux/benchmarks/RESULTS.md

# Lux Language Benchmark Results

Generated: Feb 16 2026

## Environment
- **Platform**: Linux x86_64 (NixOS)
- **Lux**: Compiled via C backend + gcc -O3
- **Tools**: hyperfine, poop
- **Comparison**: C (gcc), Rust (rustc+LLVM), Zig (LLVM)

## Quick Start

```bash
nix run .#bench        # Full hyperfine comparison
nix run .#bench-poop   # Detailed CPU metrics
nix run .#bench-quick  # Just Lux vs C
```

## CPU Benchmark Results

### hyperfine (Statistical Timing)

```
Summary
  /tmp/fib_lux ran
    1.03 ± 0.08 times faster than /tmp/fib_c
    1.47 ± 0.04 times faster than /tmp/fib_rust
    1.67 ± 0.05 times faster than /tmp/fib_zig
```

| Binary | Mean | Std Dev | vs Lux |
|--------|------|---------|--------|
| **Lux (compiled)** | 28.1ms | ±0.6ms | baseline |
| C (gcc -O3) | 29.0ms | ±2.1ms | 1.03x slower (within noise: ±0.08) |
| Rust | 41.2ms | ±0.6ms | 1.47x slower |
| Zig | 47.0ms | ±1.1ms | 1.67x slower |
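
The ratios and their uncertainties in the Summary can be checked from the means and standard deviations in the table. The sketch below assumes first-order error propagation for a quotient of two independent measurements; `speed_ratio` is a hypothetical helper, not part of hyperfine or the Lux tooling.

```python
import math


def speed_ratio(mean_a, sd_a, mean_b, sd_b):
    """Return (ratio, uncertainty) for mean_b / mean_a.

    Assumes the two measurements are independent and uses first-order
    error propagation: relative errors add in quadrature.
    """
    ratio = mean_b / mean_a
    rel_err = math.sqrt((sd_a / mean_a) ** 2 + (sd_b / mean_b) ** 2)
    return ratio, ratio * rel_err


# Mean and standard deviation in milliseconds, copied from the table above.
lux = (28.1, 0.6)
others = {"C": (29.0, 2.1), "Rust": (41.2, 0.6), "Zig": (47.0, 1.1)}

for name, (mean, sd) in others.items():
    ratio, err = speed_ratio(lux[0], lux[1], mean, sd)
    print(f"{name}: {ratio:.2f} ± {err:.2f}")
```

For C the uncertainty (0.08) is larger than the measured difference (0.03), so the Lux and C timings are not distinguishable in this run.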

### poop (Detailed CPU Metrics)

| Metric | C | Lux | Rust | Zig |
|--------|---|-----|------|-----|
| Wall Time | 29.0ms | 29.2ms | 42.0ms | 48.1ms |
| CPU Cycles | 53.1M | 53.2M | 78.2M | 90.4M |
| Instructions | 293M | 292M | 302M | 317M |
| Cache Misses | 4.39K | 4.62K | 6.47K | 340 |
| Branch Misses | 28.3K | 32.0K | 33.5K | 29.6K |
| Peak RSS | 1.56MB | 1.63MB | 2.00MB | 1.07MB |

## Why Lux Matches C and Beats Rust and Zig

### The Key: gcc's Recursion Transformation

Lux compiles to C, which gcc optimizes aggressively. For the Fibonacci benchmark:

**Rust/Zig (LLVM)** keep recursive calls:
```asm
call   fib    ; actual recursive call in hot path
```

**Lux/C (gcc)** transform to loops:
```asm
; No recursive calls - fully loop-transformed
; Uses registers as accumulators
```
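
To make the idea concrete, here is a hand-written sketch of what turning recursion into loops looks like at the source level. It is an illustration only: the benchmark sources are not reproduced here, the function names are hypothetical, and the code gcc actually emits depends on its version and flags (inspect it with `objdump -d /tmp/fib_c`).

```python
def fib_recursive(n):
    """Naive doubly recursive Fibonacci, the usual benchmark shape."""
    if n < 2:
        return n
    return fib_recursive(n - 1) + fib_recursive(n - 2)


def fib_accumulator(n):
    """One of the two recursive calls replaced by a loop plus accumulator."""
    acc = 0
    while n >= 2:
        acc += fib_accumulator(n - 1)  # remaining recursive call
        n -= 2                         # the fib(n - 2) call becomes the next iteration
    return acc + n


def fib_loop(n):
    """Fully iterative form: two variables act as the accumulators."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


# All three forms compute the same values.
for n in range(15):
    assert fib_recursive(n) == fib_accumulator(n) == fib_loop(n)
print(fib_loop(10))
```

The accumulator form does the same amount of work as the recursive one but with fewer calls and returns; the fully iterative form changes the algorithm's complexity, which is a much stronger rewrite.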

### Instruction Count Tells the Story

- **Lux/C**: 292-293M instructions executed
- **Rust**: 302M instructions (+3%)
- **Zig**: 317M instructions (+8%)

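More instructions mean more work, but the 3-8% gap in instruction count is much smaller than the 47-67% gap in wall time. The cycle counts in the poop table (78.2M for Rust and 90.4M for Zig, against about 53M for Lux and C) show that the LLVM builds also complete fewer instructions per cycle.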
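
Dividing instructions by cycles gives instructions per cycle (IPC), which separates "more work" from "slower work". The sketch below uses only the rounded counter values from the poop table; the `ipc` helper is a hypothetical name.

```python
# (instructions, cycles) copied from the poop table, in raw counts.
counters = {
    "C": (293e6, 53.1e6),
    "Lux": (292e6, 53.2e6),
    "Rust": (302e6, 78.2e6),
    "Zig": (317e6, 90.4e6),
}


def ipc(instructions, cycles):
    """Instructions retired per CPU cycle."""
    return instructions / cycles


for name, (instructions, cycles) in counters.items():
    print(f"{name}: {ipc(instructions, cycles):.2f} IPC")
```

With these inputs, C and Lux come out near 5.5 IPC while Rust and Zig land between 3.5 and 3.9, so most of the wall-time gap comes from lower throughput per cycle rather than from the extra instructions.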

## HTTP Benchmarks

For HTTP server benchmarks, use established tools:

### TechEmpower Framework Benchmarks
The industry standard: https://www.techempower.com/benchmarks/

### Standard HTTP Benchmark Tools

```bash
# wrk - modern HTTP benchmarking
wrk -t4 -c100 -d10s http://localhost:8080/

# ab (Apache Bench) - classic tool
ab -n 10000 -c 100 http://localhost:8080/

# hey - written in Go
hey -n 10000 -c 100 http://localhost:8080/
```

### Reference Implementations

For fair HTTP comparisons, use minimal stdlib servers:

| Language | Command |
|----------|---------|
| Go | `go run` with `net/http` |
| Rust | `cargo run` with `std::net` or hyper |
| Node.js | `node` with `http` module |
| Python | `python -m http.server` |

HTTP benchmarks measure I/O patterns more than language speed, so treat them as a comparison of servers and runtimes rather than of languages. Use established frameworks for meaningful comparisons.

## Reproducing Results

```bash
# Enter dev shell
nix develop

# Compile all
cargo run --release -- compile benchmarks/fib.lux -o /tmp/fib_lux
gcc -O3 benchmarks/fib.c -o /tmp/fib_c
rustc -C opt-level=3 -C lto benchmarks/fib.rs -o /tmp/fib_rust
zig build-exe benchmarks/fib.zig -O ReleaseFast -femit-bin=/tmp/fib_zig

# Run benchmarks
hyperfine --warmup 3 --runs 10 '/tmp/fib_lux' '/tmp/fib_c' '/tmp/fib_rust' '/tmp/fib_zig'
poop '/tmp/fib_c' '/tmp/fib_lux' '/tmp/fib_rust' '/tmp/fib_zig'
```
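
To keep the numbers for later comparison, hyperfine can write its results to a file with `--export-json`. The sketch below assumes such a file was produced (the path `/tmp/fib.json` is only an example) and computes each command's mean relative to the fastest one; `relative_means` and `load_report` are hypothetical helper names.

```python
import json
import sys


def load_report(path):
    """Load a report written by `hyperfine --export-json <path>`."""
    with open(path) as f:
        return json.load(f)


def relative_means(report):
    """Map each command to its mean time divided by the fastest mean."""
    results = report["results"]
    fastest = min(r["mean"] for r in results)
    return {r["command"]: r["mean"] / fastest for r in results}


# Example: python relative.py /tmp/fib.json
if __name__ == "__main__" and len(sys.argv) > 1:
    for command, ratio in relative_means(load_report(sys.argv[1])).items():
        print(f"{command}: {ratio:.2f}x")
```

Note that hyperfine's JSON reports times in seconds, so only the ratios are directly comparable with the millisecond figures in the tables.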

## Caveats

1. **Micro-benchmark**: Fibonacci tests recursion optimization, not general performance
2. **gcc-specific**: Results depend on gcc's aggressive loop transformation
3. **No allocation**: fib doesn't test memory management (Perceus RC)
4. **Single-threaded**: No concurrency testing
5. **Linux-specific**: poop requires Linux perf counters

## When Lux Will and Won't Be Fastest

| Scenario | Likely Winner | Why |
|----------|---------------|-----|
| Simple recursion | **Lux/C** | gcc's strength |
| SIMD/vectorization | Rust/Zig | Explicit intrinsics |
| Async I/O | Rust (tokio) | Mature runtime |
| Memory-heavy | Zig | Allocator control |
| Unsafe operations | C | No safety checks |