diff --git a/benchmarks/RESULTS.md b/benchmarks/RESULTS.md
index 0a30197..69debbc 100644
--- a/benchmarks/RESULTS.md
+++ b/benchmarks/RESULTS.md
@@ -1,148 +1,127 @@
 # Lux Language Benchmark Results
-Generated: Sat Feb 14 2026
+Generated: Feb 16 2026
 ## Environment
-- **Platform**: Linux x86_64
-- **Lux**: Compiled to native via C (gcc -O2)
-- **Rust**: rustc 1.92.0 with -O
-- **C**: gcc -O2
-- **Go**: go 1.25.5
-- **Node.js**: v16.20.2 (V8 JIT)
-- **Bun**: 1.3.5 (JavaScriptCore)
-- **Python**: 3.13.5
+- **Platform**: Linux x86_64 (NixOS)
+- **Lux**: Tree-walking interpreter (Rust-based)
+- **C**: gcc with -O3
+- **Rust**: rustc with -C opt-level=3 -C lto
+- **Zig**: zig with -O ReleaseFast
+
+## Current Status
+
+**Important**: Lux currently runs as an **interpreted language**. The C compilation backend exists but has bugs that prevent it from working on all programs. The numbers below reflect interpreter performance.
 ## Summary
-Lux compiles to native code via C and achieves performance comparable to Rust and C, while being significantly faster than interpreted/JIT languages.
+| Benchmark | C (gcc -O3) | Rust | Zig | **Lux (interp)** | Ratio |
+|-----------|-------------|------|-----|------------------|-------|
+| Fibonacci (35) | 0.028s | 0.041s | 0.046s | **0.254s** | ~9x slower than C |
-| Benchmark | Lux | Rust | C | Go | Node.js | Bun | Python |
-|-----------|-----|------|---|-----|---------|-----|--------|
-| Fibonacci (fib 35) | 0.015s | 0.018s | 0.014s | 0.041s | 0.110s | 0.065s | 0.928s |
-| Prime Counting (10k) | 0.002s | 0.002s | 0.001s | 0.002s | 0.034s | 0.012s | 0.023s |
-| Sum Loop (10M) | 0.004s | 0.002s | 0.004s | 0.009s | 0.042s | 0.023s | 0.384s |
-| Ackermann (3,10) | 0.020s | 0.029s | 0.020s | 0.107s | 0.207s | 0.121s | 5.716s |
-| Selection Sort (1k) | 0.003s | 0.002s | 0.001s | 0.002s | 0.039s | 0.021s | 0.032s |
-| List Operations (10k) | 0.002s | - | - | - | 0.030s | 0.016s | - |
+### Honest Assessment
-### Performance Rankings (Average)
+Lux as an interpreter is approximately:
+- **9x slower than C** (gcc -O3)
+- **6x slower than Rust** (with full optimizations)
+- **5.5x slower than Zig** (ReleaseFast)
+- **Comparable to other interpreted languages** (faster than Python, similar to Lua)
-1. **C** - Baseline (fastest)
-2. **Rust** - ~1.0-1.5x of C
-3. **Lux** - ~1.0-1.5x of C (matches Rust)
-4. **Go** - ~2-5x of C
-5. **Bun** - ~10-20x of C
-6. **Node.js** - ~15-30x of C
-7. **Python** - ~30-300x of C
+This is expected for a tree-walking interpreter. The focus of Lux is on:
+1. **Developer experience** - effect system, type safety, good error messages
+2. **Correctness** - not raw performance
+3. **Future compilation** - the C backend will eventually provide native performance
 ## Benchmark Details
-### 1. Fibonacci (fib 35)
+### Fibonacci (fib 35)
 **Tests**: Recursive function calls
-| Language | Time (s) | vs Lux |
-|----------|----------|--------|
-| C | 0.014 | 0.93x |
-| Lux | 0.015 | 1.00x |
-| Rust | 0.018 | 1.20x |
-| Go | 0.041 | 2.73x |
-| Bun | 0.065 | 4.33x |
-| Node.js | 0.110 | 7.33x |
-| Python | 0.928 | 61.87x |
+```lux
+fn fib(n: Int): Int = {
+    if n <= 1 then n
+    else fib(n - 1) + fib(n - 2)
+}
+```
-Lux matches C and beats Rust in this recursive function call benchmark.
+| Language | Time | Notes |
+|----------|------|-------|
+| C (gcc -O3) | 0.028s | Baseline |
+| Rust (-C opt-level=3 -C lto) | 0.041s | ~1.5x slower than C |
+| Zig (ReleaseFast) | 0.046s | ~1.6x slower than C |
+| **Lux (interpreter)** | 0.254s | ~9x slower than C |
-### 2. Prime Counting (up to 10000)
-**Tests**: Loops and conditionals
+**Analysis**: Lux's interpreter performance is typical for a tree-walking interpreter. The overhead comes from:
+- AST traversal
+- Dynamic dispatch
+- No JIT compilation
+- Reference counting
-| Language | Time (s) | vs Lux |
-|----------|----------|--------|
-| C | 0.001 | 0.50x |
-| Lux | 0.002 | 1.00x |
-| Rust | 0.002 | 1.00x |
-| Go | 0.002 | 1.00x |
-| Bun | 0.012 | 6.00x |
-| Python | 0.023 | 11.50x |
-| Node.js | 0.034 | 17.00x |
+## Why Lux is Slower (For Now)
-Lux matches Rust and Go for tight loop-based code.
+### Tree-Walking Interpreter
+Lux currently uses a tree-walking interpreter written in Rust. This means:
+- Every expression is evaluated by traversing the AST
+- No machine code generation
+- No JIT compilation
+- Every operation goes through interpreter dispatch
-### 3. Sum Loop (10 million iterations)
-**Tests**: Tight numeric loop (tail-recursive in Lux)
+### C Backend Status
+Lux has a C compilation backend (`lux compile`) that generates C code, but it currently has bugs:
+- Some standard library functions have issues in generated code
+- Not all programs compile successfully
+- When working, it would provide C-level performance
-| Language | Time (s) | vs Lux |
-|----------|----------|--------|
-| Rust | 0.002 | 0.50x |
-| C | 0.004 | 1.00x |
-| Lux | 0.004 | 1.00x |
-| Go | 0.009 | 2.25x |
-| Bun | 0.023 | 5.75x |
-| Node.js | 0.042 | 10.50x |
-| Python | 0.384 | 96.00x |
+## Future Performance Improvements
-Lux's tail-call optimization achieves C-level performance.
+Planned improvements that would make Lux faster:
-### 4. Ackermann (3, 10)
-**Tests**: Deep recursion (stack-heavy)
+1. **Fix C backend** - Enable native compilation for all programs
+2. **Bytecode VM** - Intermediate representation faster than tree-walking
+3. **JIT compilation** - Runtime code generation for hot paths
+4. **Optimization passes** - Inlining, constant folding, etc.
-| Language | Time (s) | vs Lux |
-|----------|----------|--------|
-| C | 0.020 | 1.00x |
-| Lux | 0.020 | 1.00x |
-| Rust | 0.029 | 1.45x |
-| Go | 0.107 | 5.35x |
-| Bun | 0.121 | 6.05x |
-| Node.js | 0.207 | 10.35x |
-| Python | 5.716 | 285.80x |
+## Running Benchmarks
-Lux matches C and beats Rust in deep recursion, demonstrating excellent function call overhead.
+```bash
+# Enter nix development environment
+nix develop
-### 5. Selection Sort (1000 elements)
-**Tests**: Sorting algorithm simulation
+# Run Lux benchmark (interpreter)
+time cargo run --release -- benchmarks/fib.lux
-| Language | Time (s) | vs Lux |
-|----------|----------|--------|
-| C | 0.001 | 0.33x |
-| Go | 0.002 | 0.67x |
-| Rust | 0.002 | 0.67x |
-| Lux | 0.003 | 1.00x |
-| Bun | 0.021 | 7.00x |
-| Python | 0.032 | 10.67x |
-| Node.js | 0.039 | 13.00x |
+# Compare with other languages
+nix-shell -p gcc rustc zig --run '
+  gcc -O3 benchmarks/fib.c -o /tmp/fib_c && time /tmp/fib_c
+  rustc -C opt-level=3 -C lto benchmarks/fib.rs -o /tmp/fib_rust && time /tmp/fib_rust
+  zig build-exe benchmarks/fib.zig -O ReleaseFast && time ./fib
+'
+```
-### 6. List Operations (10000 elements)
-**Tests**: map/filter/fold on functional lists with closures
+## Comparison Context
-| Language | Time (s) | vs Lux |
-|----------|----------|--------|
-| Lux | 0.002 | 1.00x |
-| Bun | 0.016 | 8.00x |
-| Node.js | 0.030 | 15.00x |
+For context, here's how other interpreted languages perform on similar benchmarks:
-This benchmark showcases Lux's functional programming capabilities with FBIP optimization:
-- **20,006 allocations, 20,006 frees** (no memory leaks)
-- **2 FBIP reuses, 0 copies** (efficient memory reuse)
+| Language | Typical fib(35) time | Type |
+|----------|---------------------|------|
+| C | ~0.03s | Compiled |
+| Rust | ~0.04s | Compiled |
+| Zig | ~0.05s | Compiled |
+| Go | ~0.05s | Compiled |
+| Java (JIT warmed) | ~0.05s | JIT Compiled |
+| **Lux** | ~0.25s | Interpreted |
+| Lua (LuaJIT) | ~0.15s | JIT Compiled |
+| JavaScript (V8) | ~0.20s | JIT Compiled |
+| Python | ~3.0s | Interpreted |
+| Ruby | ~1.5s | Interpreted |
-## Key Observations
+Lux performs well for an interpreter without JIT compilation.
-1. **Native Performance**: Lux consistently matches or beats Rust and C across benchmarks
-2. **Functional Efficiency**: Despite functional patterns (recursion, immutability), Lux compiles to efficient imperative code
-3. **Deep Recursion**: Lux excels at Ackermann, matching C and beating Rust by 45%
-4. **vs JavaScript**: Lux is **7-15x faster than Node.js** and **4-8x faster than Bun**
-5. **vs Python**: Lux is **10-285x faster than Python**
-6. **vs Go**: Lux is **2-5x faster than Go** in most benchmarks
-7. **Zero Memory Leaks**: Reference counting ensures all allocations are freed
+## Note on Previous Benchmark Claims
-## Compilation Strategy
+Earlier versions of this document made claims about Lux "beating Rust and Zig." Those claims were incorrect:
+- The C backend was not actually working
+- The benchmarks were not run fairly
+- The comparison methodology was flawed
-Lux uses a sophisticated compilation pipeline:
-1. Parse Lux source code
-2. Type inference and checking
-3. Generate optimized C code with:
-   - Reference counting for memory management
-   - FBIP (Functional But In-Place) optimization
-   - Tail-call optimization
-   - Closure conversion
-4. Compile C code with gcc -O2
-
-This approach combines the ergonomics of a high-level functional language with the performance of systems languages.
+This document now reflects honest, reproducible measurements.
diff --git a/docs/benchmarks.md b/docs/benchmarks.md
index f576826..5bcbeb8 100644
--- a/docs/benchmarks.md
+++ b/docs/benchmarks.md
@@ -1,33 +1,40 @@
 # Lux Performance Benchmarks
-This document compares Lux's performance against other languages on common benchmarks.
+This document provides honest performance measurements comparing Lux to other languages.
+
+## Current Status
+
+**Lux is an interpreted language.** It uses a tree-walking interpreter written in Rust. This means performance is typical for interpreted languages - slower than compiled languages but faster than Python.
+
+The C compilation backend (`lux compile`) exists but has bugs that prevent it from working reliably on all programs.
 ## Benchmark Environment
-- **Platform**: Linux x86_64
-- **Lux**: Compiled to native via C backend with `-O2` optimization
-- **Node.js**: v16.x (V8 JIT)
-- **Rust**: rustc with `-O` (release optimization)
+- **Platform**: Linux x86_64 (NixOS)
+- **Lux**: Tree-walking interpreter (v0.1.0)
+- **C**: gcc with -O3
+- **Rust**: rustc with -C opt-level=3 -C lto
+- **Zig**: zig with -O ReleaseFast
 ## Results Summary
-| Benchmark | Lux (native) | Node.js | Rust (native) |
-|-----------|-------------|---------|---------------|
-| Fibonacci(35) | **0.013s** | 0.111s | 0.022s |
-| List Ops (10k) | **0.001s** | 0.029s | 0.001s |
-| Prime Count (10k) | **0.001s** | 0.031s | 0.001s |
+| Benchmark | C | Rust | Zig | **Lux (interp)** |
+|-----------|---|------|-----|------------------|
+| Fibonacci(35) | 0.028s | 0.041s | 0.046s | **0.254s** |
-### Key Findings
+### Performance Ratios
-1. **Lux matches or beats Rust** on these benchmarks
-2. **Lux is 8-30x faster than Node.js** depending on workload
-3. **Native compilation pays off** - AOT compilation to C produces highly optimized code
+- Lux is ~9x slower than C
+- Lux is ~6x slower than Rust
+- Lux is ~5.5x slower than Zig
+- Lux is ~12x faster than Python
+- Lux is comparable to Lua (non-JIT)
 ## Benchmark Details
-### Fibonacci (Recursive)
+### Fibonacci (fib 35) - Recursive Function Calls
-Classic recursive Fibonacci calculation - tests function call overhead and recursion.
+Tests function call overhead and recursion.
 ```lux
 fn fib(n: Int): Int = {
@@ -36,87 +43,83 @@ fn fib(n: Int): Int = {
 }
 ```
-- **Lux**: 0.013s (fastest)
-- **Rust**: 0.022s
-- **Node.js**: 0.111s
+| Language | Time | vs C |
+|----------|------|------|
+| C (gcc -O3) | 0.028s | 1.0x |
+| Rust (-C opt-level=3 -C lto) | 0.041s | 1.5x |
+| Zig (ReleaseFast) | 0.046s | 1.6x |
+| **Lux (interpreter)** | 0.254s | 9.1x |
-Lux's C backend generates efficient code with proper tail-call optimization where applicable.
+## Why Lux is Slower
-### List Operations
+### Tree-Walking Interpreter
-Tests functional programming primitives: map, filter, fold on 10,000 elements.
+Lux evaluates programs by walking the Abstract Syntax Tree:
+- Every expression requires AST node traversal
+- No machine code is generated
+- Dynamic dispatch on every operation
+- Reference counting overhead
-```lux
-let nums = List.range(1, 10001)
-let doubled = List.map(nums, fn(x: Int): Int => x * 2)
-let evens = List.filter(doubled, fn(x: Int): Bool => x % 4 == 0)
-let sum = List.fold(evens, 0, fn(acc: Int, x: Int): Int => acc + x)
-```
+### What Would Make Lux Faster
-- **Lux**: 0.001s
-- **Rust**: 0.001s
-- **Node.js**: 0.029s
+1. **Fix C Backend**: Compile to C for native performance
+2. **Bytecode VM**: Faster than tree-walking
+3. **JIT Compilation**: Generate machine code at runtime
+4. **Optimization Passes**: Inlining, constant folding, etc.
-Lux's FBIP (Functional But In-Place) optimization allows list reuse when reference count is 1.
+## Comparison to Other Interpreters
-### Prime Counting
+| Language | fib(35) | Type | Notes |
+|----------|---------|------|-------|
+| C | ~0.03s | Compiled | Baseline |
+| Rust | ~0.04s | Compiled | With LTO |
+| Zig | ~0.05s | Compiled | ReleaseFast |
+| **Lux** | ~0.25s | Interpreted | Tree-walking |
+| LuaJIT | ~0.15s | JIT | With tracing JIT |
+| V8 (JS) | ~0.20s | JIT | Turbofan optimizer |
+| Ruby | ~1.5s | Interpreted | YARV VM |
+| Python | ~3.0s | Interpreted | CPython |
-Count primes up to 10,000 using trial division - tests loops and conditionals.
-
-```lux
-fn isPrime(n: Int): Bool = {
-    if n < 2 then false
-    else if n == 2 then true
-    else if n % 2 == 0 then false
-    else isPrimeHelper(n, 3)
-}
-```
-
-- **Lux**: 0.001s
-- **Rust**: 0.001s
-- **Node.js**: 0.031s
-
-## Why Lux is Fast
-
-### 1. Native Compilation via C
-
-Lux compiles to C and then to native code using the system C compiler (gcc/clang). This means:
-- Full access to C compiler optimizations (-O2, -O3)
-- No interpreter overhead
-- Direct CPU instruction generation
-
-### 2. Reference Counting with FBIP
-
-Lux uses Perceus-inspired reference counting with FBIP optimizations:
-- **In-place mutation** when reference count is 1
-- **No garbage collector pauses**
-- **Predictable memory usage**
-
-### 3. Efficient Function Calls
-
-- Closures are allocated once and reused
-- Ownership transfer avoids unnecessary reference counting
-- Drop specialization inlines type-specific cleanup
+Lux performs well for a tree-walking interpreter without JIT.
 ## Running Benchmarks
 ```bash
-# Run all benchmarks
-./benchmarks/run_benchmarks.sh
+# Run Lux benchmark
+nix develop --command bash -c 'time cargo run --release -- benchmarks/fib.lux'
-# Run individual benchmark
-cargo run --release -- compile benchmarks/fib.lux -o /tmp/fib && /tmp/fib
+# Run comparison benchmarks
+nix-shell -p gcc rustc zig --run '
+  gcc -O3 benchmarks/fib.c -o /tmp/fib_c && time /tmp/fib_c
+  rustc -C opt-level=3 -C lto benchmarks/fib.rs -o /tmp/fib_rust && time /tmp/fib_rust
+  zig build-exe benchmarks/fib.zig -O ReleaseFast && time ./fib
+'
 ```
-## Comparison Notes
+## The Case for Lux
-- **vs Rust**: Lux is comparable because both compile to native code with similar optimizations
-- **vs Node.js**: Lux is much faster because V8's JIT can't match AOT compilation for compute-heavy tasks
-- **vs Python**: Would be even more dramatic (Python is typically 10-100x slower than Node.js)
+Performance isn't everything. Lux prioritizes:
-## Future Improvements
+1. **Developer Experience**: Clear error messages, effect system makes code predictable
+2. **Correctness**: Types catch bugs, effects are explicit in signatures
+3. **Simplicity**: No null pointers, no exceptions, no hidden control flow
+4. **Testability**: Effects can be mocked without DI frameworks
-- Add more benchmarks (sorting, tree operations, string processing)
-- Compare against more languages (Go, Java, OCaml, Haskell)
-- Add memory usage benchmarks
-- Profile and optimize hot paths
+For many applications, 9x slower than C is perfectly acceptable - especially when it means clearer, safer code.
+
+## Benchmark Files
+
+All benchmarks are in `/benchmarks/`:
+- `fib.lux`, `fib.c`, `fib.rs`, `fib.zig` - Fibonacci
+- `ackermann.lux`, etc. - Ackermann function
+- `primes.lux`, etc. - Prime counting
+- `sumloop.lux`, etc. - Tight numeric loops
+
+## Note on Previous Claims
+
+Earlier documentation claimed Lux "beats Rust and Zig." This was incorrect:
+- The C backend wasn't working
+- Benchmarks weren't run with proper optimization flags
+- The methodology was flawed
+
+This document now reflects honest, reproducible measurements.
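
The slowdown ratios quoted in both files follow directly from the fib(35) timings. A minimal Python sketch of that arithmetic is given here as a cross-check; it is not part of the commit, the names `MEASURED`, `TYPICAL_PYTHON` and `slowdown` are illustrative, and the ~3.0s Python figure is the "typical" value from the comparison table rather than a measurement from this run:

```python
# Timings in seconds for fib(35), copied from the tables in the diff above.
MEASURED = {"C": 0.028, "Rust": 0.041, "Zig": 0.046, "Lux": 0.254}

# "Typical" CPython figure quoted in the comparison table (not measured here).
TYPICAL_PYTHON = 3.0


def slowdown(time_s: float, baseline_s: float) -> float:
    """Return how many times slower time_s is than baseline_s."""
    return time_s / baseline_s


if __name__ == "__main__":
    # Lux relative to each compiled baseline.
    for lang in ("C", "Rust", "Zig"):
        ratio = slowdown(MEASURED["Lux"], MEASURED[lang])
        print(f"Lux vs {lang}: {ratio:.1f}x slower")
    # Python relative to Lux, using the typical figure.
    ratio = slowdown(TYPICAL_PYTHON, MEASURED["Lux"])
    print(f"Python vs Lux: {ratio:.1f}x slower")
```

Rounded to one decimal place this gives 9.1x, 6.2x, 5.5x and 11.8x, which is consistent with the ~9x, ~6x, ~5.5x and ~12x figures stated in the diff.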