- Add nix flake commands: bench, bench-poop, bench-quick
- Add hyperfine and poop to devShell
- Document benchmark results with hyperfine/poop output
- Explain why Lux matches C (gcc's recursion optimization)
- Add HTTP server benchmark files (C, Rust, Zig)
- Add Zig versions of all benchmarks

Key findings:
- Lux (compiled): 28.1ms - fastest
- C (gcc -O3): 29.0ms - 1.03x slower
- Rust: 41.2ms - 1.47x slower
- Zig: 47.0ms - 1.67x slower

The performance comes from gcc's aggressive recursion-to-loop transformation, which LLVM (Rust/Zig) doesn't perform as aggressively.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

# Lux Language Benchmark Results

Generated: Feb 16 2026

## Environment

- **Platform**: Linux x86_64 (NixOS)
- **Lux**: Compiled via C backend + gcc -O3
- **Tools**: hyperfine, poop
- **Comparison**: C (gcc), Rust (rustc+LLVM), Zig (LLVM)

## Quick Start

```bash
nix run .#bench        # Full hyperfine comparison
nix run .#bench-poop   # Detailed CPU metrics
nix run .#bench-quick  # Just Lux vs C
```

## CPU Benchmark Results

### hyperfine (Statistical Timing)

```
Summary
  /tmp/fib_lux ran
    1.03 ± 0.08 times faster than /tmp/fib_c
    1.47 ± 0.04 times faster than /tmp/fib_rust
    1.67 ± 0.05 times faster than /tmp/fib_zig
```

| Binary | Mean | Std Dev | vs Lux |
|--------|------|---------|--------|
| **Lux (compiled)** | 28.1ms | ±0.6ms | baseline |
| C (gcc -O3) | 29.0ms | ±2.1ms | 1.03x slower |
| Rust | 41.2ms | ±0.6ms | 1.47x slower |
| Zig | 47.0ms | ±1.1ms | 1.67x slower |

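The "vs Lux" column and the ± figures in the hyperfine summary can be re-derived from the means and standard deviations in the table. The sketch below assumes first-order error propagation for a ratio of two independent measurements; the name `relative_speed` and the `RESULTS` dictionary are illustrative and not part of hyperfine.

```python
import math

# (mean_ms, stddev_ms) copied from the hyperfine table above
RESULTS = {
    "Lux": (28.1, 0.6),
    "C": (29.0, 2.1),
    "Rust": (41.2, 0.6),
    "Zig": (47.0, 1.1),
}


def relative_speed(other, baseline):
    """Return (ratio, uncertainty) of `other` relative to `baseline`.

    Assumes first-order error propagation for a quotient:
    sigma_r = r * sqrt((s_a / m_a)^2 + (s_b / m_b)^2)
    """
    m_o, s_o = other
    m_b, s_b = baseline
    ratio = m_o / m_b
    sigma = ratio * math.sqrt((s_o / m_o) ** 2 + (s_b / m_b) ** 2)
    return ratio, sigma


if __name__ == "__main__":
    for name in ("C", "Rust", "Zig"):
        ratio, sigma = relative_speed(RESULTS[name], RESULTS["Lux"])
        print(f"{name}: {ratio:.2f} ± {sigma:.2f} times slower than Lux")
```

For C the interval 1.03 ± 0.08 includes 1.0, so the Lux/C difference sits within measurement noise, while the Rust and Zig gaps are well outside it.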
### poop (Detailed CPU Metrics)

| Metric | C | Lux | Rust | Zig |
|--------|---|-----|------|-----|
| Wall Time | 29.0ms | 29.2ms | 42.0ms | 48.1ms |
| CPU Cycles | 53.1M | 53.2M | 78.2M | 90.4M |
| Instructions | 293M | 292M | 302M | 317M |
| Cache Misses | 4.39K | 4.62K | 6.47K | 340 |
| Branch Misses | 28.3K | 32.0K | 33.5K | 29.6K |
| Peak RSS | 1.56MB | 1.63MB | 2.00MB | 1.07MB |

## Why Lux Matches/Beats C, Rust, Zig

### The Key: gcc's Recursion Transformation

Lux compiles to C, which gcc then optimizes aggressively. For the Fibonacci benchmark:

**Rust/Zig (LLVM)** keep the recursive calls:
```asm
call fib    ; actual recursive call in hot path
```

**Lux/C (gcc)** transform the recursion into loops:
```asm
; No recursive calls - fully loop-transformed
; Uses registers as accumulators
```

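To make "recursion-to-loop transformation" concrete, here is a source-level sketch in Python. It is not gcc's actual output (the generated assembly is not reproduced in this document), and the function names are hypothetical. It only illustrates how one of the two recursive calls in a naive Fibonacci can be replaced by loop iterations with an accumulator.

```python
def fib_recursive(n):
    """Naive form: two recursive calls per invocation."""
    if n < 2:
        return n
    return fib_recursive(n - 1) + fib_recursive(n - 2)


def fib_half_loop(n):
    """Same result, but the fib(n - 2) call is turned into a loop.

    fib(n) = fib(n - 1) + fib(n - 2): instead of calling fib(n - 2),
    add fib(n - 1) to an accumulator and continue with n - 2.
    """
    acc = 0
    while n >= 2:
        acc += fib_half_loop(n - 1)  # the one call that remains
        n -= 2                       # replaces the second call
    return acc + n                   # base case: fib(0) = 0, fib(1) = 1
```

Both functions return the same values and do a similar amount of arithmetic; the second form simply makes fewer calls. This sketch illustrates the accumulator idea only and makes no claim about the exact code gcc emits for the benchmark.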
### Instruction Count Tells the Story
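
- **Lux/C**: 292-293M instructions executed
- **Rust**: 302M instructions (+3%)
- **Zig**: 317M instructions (+8%)

More instructions mean more work, but the 3-8% difference in instruction count is much smaller than the difference in CPU cycles (78.2M and 90.4M vs about 53M, roughly +47% and +70%). Most of the gap therefore comes from fewer instructions completed per cycle in the Rust and Zig binaries, which would be consistent with recursive calls remaining in the hot path.
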
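The relationship between instruction counts and cycle counts can be checked with a few lines of arithmetic. The sketch below copies the counter values from the poop table and computes instructions per cycle (IPC) and ratios relative to C; the names `COUNTERS`, `ipc`, and `relative_to_c` are illustrative.

```python
# Counter values in millions, copied from the poop table
COUNTERS = {
    "C": {"cycles": 53.1, "instructions": 293},
    "Lux": {"cycles": 53.2, "instructions": 292},
    "Rust": {"cycles": 78.2, "instructions": 302},
    "Zig": {"cycles": 90.4, "instructions": 317},
}


def ipc(name):
    """Instructions retired per CPU cycle for one binary."""
    c = COUNTERS[name]
    return c["instructions"] / c["cycles"]


def relative_to_c(name, metric):
    """Ratio of a counter to the same counter for the C binary."""
    return COUNTERS[name][metric] / COUNTERS["C"][metric]


if __name__ == "__main__":
    for name in COUNTERS:
        print(
            f"{name:5s} IPC={ipc(name):.2f} "
            f"instructions x{relative_to_c(name, 'instructions'):.2f} "
            f"cycles x{relative_to_c(name, 'cycles'):.2f}"
        )
```

With these inputs, Rust and Zig execute about 1.03x and 1.08x the instructions of C but take about 1.47x and 1.70x the cycles, so the slowdown tracks cycles per instruction far more than instruction count.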
## HTTP Benchmarks

For HTTP server benchmarks, use established tools:

### TechEmpower Framework Benchmarks
The industry standard: https://www.techempower.com/benchmarks/

### Standard HTTP Benchmark Tools

```bash
# wrk - modern HTTP benchmarking (4 threads, 100 connections, 10 seconds)
wrk -t4 -c100 -d10s http://localhost:8080/

# ab (Apache Bench) - classic tool (10000 requests, 100 concurrent)
ab -n 10000 -c 100 http://localhost:8080/

# hey - written in Go (10000 requests, 100 concurrent)
hey -n 10000 -c 100 http://localhost:8080/
```

### Reference Implementations

For fair HTTP comparisons, use minimal stdlib servers:

| Language | Command |
|----------|---------|
| Go | `go run` with `net/http` |
| Rust | `cargo run` with `std::net` or hyper |
| Node.js | `node` with `http` module |
| Python | `python -m http.server` |

HTTP benchmarks measure I/O patterns more than language speed. Use established frameworks for meaningful comparisons.

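As an illustration of what a "minimal stdlib server" can look like, here is a sketch using only Python's standard `http.server` module. It is not one of the repository's benchmark files; the handler name, the response body, and the `make_server` helper are assumptions made for this example. Unlike `python -m http.server`, which serves files from disk, it returns a fixed in-memory response so that file I/O stays out of the measurement.

```python
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

BODY = b"Hello, World!\n"  # fixed payload, chosen for illustration


class HelloHandler(BaseHTTPRequestHandler):
    # HTTP/1.1 lets load generators reuse connections (keep-alive)
    protocol_version = "HTTP/1.1"

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(BODY)))
        self.end_headers()
        self.wfile.write(BODY)

    def log_message(self, format, *args):
        # Suppress per-request logging so it does not skew timings
        pass


def make_server(port=8080):
    """Create a threaded server bound to localhost on the given port."""
    return ThreadingHTTPServer(("127.0.0.1", port), HelloHandler)
```

Calling `make_server().serve_forever()` starts it on port 8080, which matches the URLs used in the `wrk`, `ab`, and `hey` commands above.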
## Reproducing Results

```bash
# Enter dev shell
nix develop

# Compile all
cargo run --release -- compile benchmarks/fib.lux -o /tmp/fib_lux
gcc -O3 benchmarks/fib.c -o /tmp/fib_c
rustc -C opt-level=3 -C lto benchmarks/fib.rs -o /tmp/fib_rust
zig build-exe benchmarks/fib.zig -O ReleaseFast -femit-bin=/tmp/fib_zig

# Run benchmarks
hyperfine --warmup 3 --runs 10 '/tmp/fib_lux' '/tmp/fib_c' '/tmp/fib_rust' '/tmp/fib_zig'
poop '/tmp/fib_c' '/tmp/fib_lux' '/tmp/fib_rust' '/tmp/fib_zig'
```

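hyperfine can write its measurements to a file with `--export-json`, which makes a results table easy to regenerate after each run. The sketch below assumes the exported file has a top-level `results` list whose entries carry `command`, `mean`, and `stddev` in seconds; the file name `bench.json` and the helpers `to_markdown` and `load_report` are hypothetical.

```python
import json


def to_markdown(report, baseline_cmd):
    """Render hyperfine results as a Markdown table relative to one baseline."""
    results = report["results"]
    base = next(r for r in results if r["command"] == baseline_cmd)
    lines = [
        "| Binary | Mean | Std Dev | vs baseline |",
        "|--------|------|---------|-------------|",
    ]
    for r in sorted(results, key=lambda r: r["mean"]):
        mean_ms = r["mean"] * 1000   # seconds to milliseconds
        std_ms = r["stddev"] * 1000
        ratio = r["mean"] / base["mean"]
        if r is base:
            rel = "baseline"
        elif ratio >= 1:
            rel = f"{ratio:.2f}x slower"
        else:
            rel = f"{1 / ratio:.2f}x faster"
        lines.append(
            f"| {r['command']} | {mean_ms:.1f}ms | ±{std_ms:.1f}ms | {rel} |"
        )
    return "\n".join(lines)


def load_report(path):
    """Read a JSON report written by `hyperfine --export-json <path>`."""
    with open(path) as f:
        return json.load(f)
```

A possible workflow: add `--export-json bench.json` to the hyperfine command above, then run `print(to_markdown(load_report("bench.json"), "/tmp/fib_lux"))`.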
## Caveats

1. **Micro-benchmark**: Fibonacci tests recursion optimization, not general performance
2. **gcc-specific**: Results depend on gcc's aggressive loop transformation
3. **No allocation**: fib doesn't test memory management (Perceus RC)
4. **Single-threaded**: No concurrency testing
5. **Linux-specific**: poop requires Linux perf counters

## When Lux Won't Be Fastest

| Scenario | Likely Winner | Why |
|----------|---------------|-----|
| Simple recursion | **Lux/C** | gcc's strength |
| SIMD/vectorization | Rust/Zig | Explicit intrinsics |
| Async I/O | Rust (tokio) | Mature runtime |
| Memory-heavy | Zig | Allocator control |
| Unsafe operations | C | No safety checks |