# Lux Performance Benchmarks

This document provides performance measurements comparing Lux to other languages.

## Execution Modes

Lux supports two execution modes:

1. **Compiled** (`lux compile`): Generates C code and compiles it with `gcc -O3`, producing a native binary.
2. **Interpreted** (`lux run`): Tree-walking interpreter. Slower, but starts immediately because there is no compile step.

## Benchmark Environment

- **Platform**: Linux x86_64 (NixOS)
- **Lux**: v0.1.0
- **C**: gcc with `-O3`
- **Rust**: rustc with `-C opt-level=3 -C lto`
- **Zig**: zig with `-O ReleaseFast`

## Results Summary

| Benchmark | C | Rust | Zig | **Lux (compiled)** | Lux (interp) |
|-----------|---|------|-----|---------------------|--------------|
| Fibonacci(35) | 0.028s | 0.041s | 0.046s | **0.030s** | 0.254s |

### Compiled Lux Performance

When compiled to native code via the C backend:
- **Close to C**: about 7% slower (0.030s vs 0.028s)
- **Faster than Rust**: ~27% less run time (0.030s vs 0.041s)
- **Faster than Zig**: ~35% less run time (0.030s vs 0.046s)
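
The percentages above depend on which run is treated as the baseline. As a quick check, the sketch below recomputes them from the summary table. The helper names `slowdown_vs` and `time_saved_pct` are illustrative only and are not part of Lux.

```python
# Timings in seconds, copied from the Results Summary table.
times = {"C": 0.028, "Rust": 0.041, "Zig": 0.046, "Lux (compiled)": 0.030}


def slowdown_vs(baseline: float, t: float) -> float:
    """How many times longer `t` is than `baseline` (1.0 means equal)."""
    return t / baseline


def time_saved_pct(slower: float, faster: float) -> float:
    """Percentage of the slower run's time that the faster run saves."""
    return (slower - faster) / slower * 100


lux = times["Lux (compiled)"]
print(f"Lux vs C:    {slowdown_vs(times['C'], lux):.2f}x")                # 1.07x
print(f"Lux vs Rust: {time_saved_pct(times['Rust'], lux):.0f}% less time")  # 27%
print(f"Lux vs Zig:  {time_saved_pct(times['Zig'], lux):.0f}% less time")   # 35%
```

Measured the other way around, Rust takes about 37% longer than compiled Lux (0.041 / 0.030 ≈ 1.37) and Zig about 53% longer (0.046 / 0.030 ≈ 1.53), so it is worth stating which direction a percentage refers to.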

### Interpreted Lux Performance

When running in interpreter mode:
- ~9x slower than C (0.254s vs 0.028s)
- ~12x faster than Python (CPython, ~3.0s)
- Comparable to Lua (non-JIT)

## Benchmark Details

### Fibonacci (fib 35) - Recursive Function Calls

Tests function call overhead and recursion.

```lux
fn fib(n: Int): Int = {
    if n <= 1 then n
    else fib(n - 1) + fib(n - 2)
}
```

| Language | Time | vs C |
|----------|------|------|
| C (gcc -O3) | 0.028s | 1.0x |
| **Lux (compiled)** | 0.030s | 1.07x |
| Rust (-C opt-level=3 -C lto) | 0.041s | 1.5x |
| Zig (ReleaseFast) | 0.046s | 1.6x |
| Lux (interpreter) | 0.254s | 9.1x |

## Why Compiled Lux is Fast

### Direct C Generation
Lux compiles to clean C code that gcc optimizes effectively:
- No runtime interpretation overhead
- Direct function calls
- Efficient memory layout

### Perceus Reference Counting
Lux implements Koka-style Perceus reference counting:
- FBIP (Functional But In-Place) optimization
- Compile-time reference tracking where possible
- Minimal runtime overhead for memory management

### Why This Benchmark?
The Fibonacci benchmark is a good test of:
- Function call overhead
- Integer arithmetic
- Recursion efficiency

Because the program is so small, the quality of the back-end optimizer dominates the result. That is why compiled Lux, which inherits `gcc -O3` optimizations through its C backend, lands within 7% of C and ahead of Rust and Zig on this benchmark.
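
To make "function call overhead" concrete, the following sketch counts how many calls the naive recursion performs for `fib(35)`. It is written in Python purely for illustration; `call_count` is a hypothetical helper, and the per-call figure is a rough estimate derived from the C timing reported above.

```python
def fib(n: int) -> int:
    """Same naive recursion as the Lux benchmark."""
    return n if n <= 1 else fib(n - 1) + fib(n - 2)


def call_count(n: int) -> int:
    """Number of fib() invocations triggered by fib(n), computed iteratively.

    Uses calls(n) = 1 + calls(n - 1) + calls(n - 2), with calls(0) = calls(1) = 1.
    """
    prev, curr = 1, 1  # calls(0), calls(1)
    for _ in range(n - 1):
        prev, curr = curr, 1 + curr + prev
    return curr if n >= 1 else prev


calls = call_count(35)
print(calls)  # 29860703

# Rough cost per call implied by the C timing (0.028 s). This is only an
# estimate: gcc -O3 may inline or restructure part of the recursion.
print(f"{0.028 / calls * 1e9:.2f} ns per call")  # 0.94 ns per call
```

With roughly 30 million calls and almost no work per call, even small differences in call setup or inlining show up directly in the total time.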

## Comparison to Other Languages

| Language | fib(35) | Type | Notes |
|----------|---------|------|-------|
| C | ~0.03s | Compiled | Baseline |
| **Lux (compiled)** | ~0.03s | Compiled | Via C backend |
| Rust | ~0.04s | Compiled | With LTO |
| Zig | ~0.05s | Compiled | ReleaseFast |
| Go | ~0.05s | Compiled | |
| LuaJIT | ~0.15s | JIT | With tracing JIT |
| V8 (JS) | ~0.20s | JIT | Turbofan optimizer |
| Lux (interp) | ~0.25s | Interpreted | Tree-walking |
| Ruby | ~1.5s | Interpreted | YARV VM |
| Python | ~3.0s | Interpreted | CPython |

## Running Benchmarks

```bash
# Enter development environment
nix develop

# Compiled Lux (native performance)
cargo run --release -- compile benchmarks/fib.lux -o /tmp/fib_lux
time /tmp/fib_lux

# Interpreted Lux (uses the `run` subcommand described under Execution Modes)
time cargo run --release -- run benchmarks/fib.lux

# Run comparison benchmarks
gcc -O3 benchmarks/fib.c -o /tmp/fib_c && time /tmp/fib_c
rustc -C opt-level=3 -C lto benchmarks/fib.rs -o /tmp/fib_rust && time /tmp/fib_rust
zig build-exe benchmarks/fib.zig -O ReleaseFast -femit-bin=/tmp/fib_zig && time /tmp/fib_zig
```
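
A single `time` invocation is sensitive to system noise. One possible way to get steadier numbers is to repeat each run and report the minimum and median, as in the sketch below. `time_command` and `summarize` are hypothetical helpers rather than part of the Lux repository, and the binary paths are assumed to match the build commands above.

```python
import statistics
import subprocess
import time
from pathlib import Path


def time_command(cmd: list[str], runs: int = 5) -> list[float]:
    """Run `cmd` `runs` times and return the wall-clock durations in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations: list[float]) -> dict[str, float]:
    """Minimum and median are less noise-sensitive than a single measurement."""
    return {"min": min(durations), "median": statistics.median(durations)}


if __name__ == "__main__":
    # Paths assumed from the build commands above; unbuilt binaries are skipped.
    for path in ["/tmp/fib_c", "/tmp/fib_lux", "/tmp/fib_rust", "/tmp/fib_zig"]:
        if not Path(path).exists():
            print(f"{path}: not built, skipping")
            continue
        stats = summarize(time_command([path]))
        print(f"{path}: min {stats['min']:.3f}s, median {stats['median']:.3f}s")
```

Timing the built binaries directly, rather than going through `cargo run`, also keeps Cargo's own startup cost out of the measurement.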

## The Case for Lux

Compiled Lux runs close to C on the benchmark above, but performance is only part of the picture. Lux also prioritizes:

1. **Developer Experience**: Clear error messages, and an effect system that makes code predictable
2. **Correctness**: Types catch bugs, and effects are explicit in signatures
3. **Simplicity**: No null pointers, no exceptions, no hidden control flow
4. **Testability**: Effects can be mocked without dependency-injection (DI) frameworks

## Benchmark Files

All benchmarks are in the repository's `benchmarks/` directory:
- `fib.lux`, `fib.c`, `fib.rs`, `fib.zig` - Fibonacci
- `ackermann.lux`, etc. - Ackermann function
- `primes.lux`, etc. - Prime counting
- `sumloop.lux`, etc. - Tight numeric loops