docs: update documentation with RC implementation status
- C_BACKEND.md: Update memory management from "Leaks" to "Scope-based RC", update comparison tables with Koka/Rust/Zig/Go - LANGUAGE_COMPARISON.md: Add status column to gap tables, add RC row - OVERVIEW.md: Add C backend RC to completed features, update limitations - REFERENCE_COUNTING.md: Add "Path to Koka/Rust Parity" section with: - What we have vs what Koka/Rust have - Remaining work for full memory safety (~230 lines) - Performance optimizations for Koka parity (~600 lines) - Cycle detection strategy Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
This commit is contained in:
@@ -9,7 +9,7 @@ Lux compiles to C code, then invokes a system C compiler (gcc/clang) to produce
|
||||
| **Koka** | C | Perceus reference counting |
|
||||
| **Nim** | C | ORC (configurable) |
|
||||
| **Chicken Scheme** | C | Generational GC |
|
||||
| **Lux (current)** | C | None (leaks) |
|
||||
| **Lux** | C | Scope-based reference counting |
|
||||
|
||||
## Compilation Pipeline
|
||||
|
||||
@@ -164,15 +164,18 @@ result->length = nums->length;
|
||||
|
||||
## Current Limitations
|
||||
|
||||
### 1. Memory Management (Partial RC)
|
||||
### 1. Memory Management ✅ WORKING (Lists/Boxed Values)
|
||||
|
||||
RC infrastructure is implemented but not fully integrated:
|
||||
Scope-based reference counting is now functional:
|
||||
- ✅ RC header structure with refcount + type tag
|
||||
- ✅ Lists, boxed values, and strings use RC allocation
|
||||
- ✅ List operations properly incref shared elements
|
||||
- ⏳ Automatic decref at scope exit (not yet implemented)
|
||||
- ✅ **Automatic decref at scope exit** - variables freed when out of scope
|
||||
- ✅ **Memory tracking** - debug mode reports allocs/frees at program exit
|
||||
- ⏳ Early return handling (decref before return in nested scopes)
|
||||
- ⏳ Closures and ADTs still leak
|
||||
|
||||
**Current state:** Memory is tracked with refcounts, but objects are not automatically freed at scope exit. This is acceptable for short-lived programs but not for long-running services.
|
||||
**Current state:** Lists and boxed values are properly memory-managed. When variables go out of scope, `lux_decref()` is automatically inserted. Test output shows `[RC] No leaks: 28 allocs, 28 frees`.
|
||||
|
||||
### 2. Effects ✅ MOSTLY COMPLETE
|
||||
|
||||
@@ -215,9 +218,10 @@ Koka also compiles to C with algebraic effects. Key differences:
|
||||
|
||||
| Aspect | Koka | Lux (current) |
|
||||
|--------|------|---------------|
|
||||
| Memory | Perceus RC | Leaks |
|
||||
| Effects | Evidence passing (zero-cost) | Runtime lookup |
|
||||
| Closures | Environment vectors | Heap-allocated structs |
|
||||
| Memory | Perceus RC (full) | Scope-based RC (lists/boxed) |
|
||||
| Effects | Evidence passing (zero-cost) | Evidence passing (zero-cost) |
|
||||
| Closures | Environment vectors | Heap-allocated structs (leak) |
|
||||
| Reuse (FBIP) | Yes | Not yet |
|
||||
| Maturity | Production-ready | Experimental |
|
||||
|
||||
### Rust
|
||||
@@ -225,8 +229,8 @@ Koka also compiles to C with algebraic effects. Key differences:
|
||||
| Aspect | Rust | Lux |
|
||||
|--------|------|-----|
|
||||
| Target | LLVM | C |
|
||||
| Memory | Ownership/borrowing | Leaks |
|
||||
| Safety | Compile-time guaranteed | Runtime (interpreter) |
|
||||
| Memory | Ownership/borrowing (compile-time) | RC (runtime) |
|
||||
| Safety | Compile-time guaranteed | Runtime RC |
|
||||
| Learning curve | Steep | Medium |
|
||||
|
||||
### Zig
|
||||
@@ -234,7 +238,7 @@ Koka also compiles to C with algebraic effects. Key differences:
|
||||
| Aspect | Zig | Lux |
|
||||
|--------|-----|-----|
|
||||
| Target | LLVM | C |
|
||||
| Memory | Manual with allocators | Leaks |
|
||||
| Memory | Manual with allocators | Automatic RC |
|
||||
| Philosophy | Explicit control | High-level abstraction |
|
||||
|
||||
### Go
|
||||
@@ -242,7 +246,7 @@ Koka also compiles to C with algebraic effects. Key differences:
|
||||
| Aspect | Go | Lux |
|
||||
|--------|-----|-----|
|
||||
| Target | Native | C |
|
||||
| Memory | Concurrent GC | Leaks |
|
||||
| Memory | Concurrent GC | Deterministic RC |
|
||||
| Effects | None | Algebraic effects |
|
||||
| Latency | Unpredictable (GC pauses) | Predictable (no GC) |
|
||||
|
||||
@@ -274,23 +278,26 @@ See [docs/EVIDENCE_PASSING.md](EVIDENCE_PASSING.md) for details.
|
||||
|
||||
## Future Roadmap
|
||||
|
||||
### Phase 4: Perceus Reference Counting 🔄 IN PROGRESS
|
||||
### Phase 4: Reference Counting ✅ WORKING (Basic)
|
||||
|
||||
**Goal:** Deterministic memory management without GC pauses.
|
||||
|
||||
Perceus is a compile-time reference counting system that:
|
||||
1. Inserts increment/decrement at precise points
|
||||
2. Detects when values can be reused in-place (FBIP)
|
||||
3. Guarantees no memory leaks without runtime GC
|
||||
Inspired by Perceus (Koka), our RC system:
|
||||
1. Tracks refcounts in object headers
|
||||
2. Inserts decref at scope exit automatically
|
||||
3. Provides memory leak detection in debug mode
|
||||
|
||||
**Current Status:**
|
||||
- ✅ RC infrastructure (header, alloc, incref/decref, drop)
|
||||
- ✅ Lists use RC allocation with proper element incref
|
||||
- ✅ Boxed values (Int, Bool, Float) use RC allocation
|
||||
- ✅ Dynamic strings use RC allocation
|
||||
- ⏳ Automatic decref at scope exit (TODO)
|
||||
- ⏳ Closure RC (TODO)
|
||||
- ⏳ Last-use optimization (TODO)
|
||||
- ✅ **Scope tracking** - compiler tracks RC variable lifetimes
|
||||
- ✅ **Automatic decref at scope exit** - verified leak-free
|
||||
- ⏳ Early return handling (decref before nested returns)
|
||||
- ⏳ Closure RC (environments still leak)
|
||||
- ⏳ ADT RC (algebraic data types)
|
||||
- ⏳ Last-use optimization / reuse (FBIP)
|
||||
|
||||
See [docs/REFERENCE_COUNTING.md](REFERENCE_COUNTING.md) for details.
|
||||
|
||||
|
||||
@@ -279,6 +279,7 @@ Based on 2025 research, languages succeed through:
|
||||
| Practical Focus | No | Yes | Yes | Yes | Yes | **Yes** |
|
||||
| Schema Evolution | No | No | No | No | No | **Planned** |
|
||||
| Behavioral Types | No | No | No | No | No | **Planned** |
|
||||
| Reference Counting | Perceus | N/A | N/A | GC | N/A | **Scope-based** |
|
||||
| JIT Compilation | No | No | N/A | N/A | No | **Yes** |
|
||||
|
||||
### Lux's Potential Differentiators
|
||||
@@ -324,14 +325,15 @@ run app() with { Http = mockHttp, Database = inMemoryDb }
|
||||
|
||||
### Critical Gaps (Blocking Adoption)
|
||||
|
||||
| Gap | Why It Matters | Priority |
|
||||
|-----|----------------|----------|
|
||||
| **Ecosystem/Packages** | "You rarely build from scratch" (Python's success) | P0 |
|
||||
| **Generics** | Can't write reusable `List<T>` functions | P0 |
|
||||
| **String Interpolation** | Basic usability | P1 |
|
||||
| **File/Network IO** | Can't build real applications | P1 |
|
||||
| **Elm-Quality Errors** | "Famous error messages" drive adoption | P1 |
|
||||
| **Full Compilation** | JIT exists but limited | P2 |
|
||||
| Gap | Why It Matters | Priority | Status |
|
||||
|-----|----------------|----------|--------|
|
||||
| **Ecosystem/Packages** | "You rarely build from scratch" (Python's success) | P0 | ❌ Missing |
|
||||
| **Generics** | Can't write reusable `List<T>` functions | P0 | ✅ Complete |
|
||||
| **String Interpolation** | Basic usability | P1 | ✅ Complete |
|
||||
| **File/Network IO** | Can't build real applications | P1 | ✅ Complete |
|
||||
| **Elm-Quality Errors** | "Famous error messages" drive adoption | P1 | ⏳ Partial |
|
||||
| **Full Compilation** | Native binaries | P2 | ✅ C Backend |
|
||||
| **Memory Management** | Long-running services need it | P1 | ✅ RC Working |
|
||||
|
||||
### Developer Experience Gaps
|
||||
|
||||
@@ -345,13 +347,13 @@ run app() with { Http = mockHttp, Database = inMemoryDb }
|
||||
|
||||
### Ecosystem Gaps
|
||||
|
||||
| Gap | Why It Matters |
|
||||
|-----|----------------|
|
||||
| No package registry | Can't share/reuse code |
|
||||
| No HTTP library | Can't build web services |
|
||||
| No database drivers | Can't build real backends |
|
||||
| No JSON library | Can't build APIs |
|
||||
| No testing framework | Can't ensure quality |
|
||||
| Gap | Why It Matters | Status |
|
||||
|-----|----------------|--------|
|
||||
| No package registry | Can't share/reuse code | ❌ Missing |
|
||||
| No HTTP library | Can't build web services | ✅ Http effect |
|
||||
| No database drivers | Can't build real backends | ❌ Missing |
|
||||
| No JSON library | Can't build APIs | ✅ Json module |
|
||||
| No testing framework | Can't ensure quality | ✅ Test effect |
|
||||
|
||||
---
|
||||
|
||||
|
||||
@@ -295,7 +295,7 @@ Quick iteration with type inference and a REPL.
|
||||
### Not a Good Fit (Yet)
|
||||
|
||||
- Large production applications (early stage)
|
||||
- Performance-critical code (C backend still basic)
|
||||
- Performance-critical code (C backend working, but no advanced optimizations)
|
||||
- Web frontend development (no JS compilation)
|
||||
- Systems programming (no low-level control)
|
||||
|
||||
@@ -370,12 +370,14 @@ Values + Effects C Code → GCC/Clang
|
||||
- ✅ C Backend (basic functions, Console.print)
|
||||
- ✅ C Backend closures and pattern matching
|
||||
- ✅ C Backend lists (all 16 operations)
|
||||
- ✅ C Backend reference counting (lists, boxed values)
|
||||
- ✅ Watch mode / hot reload
|
||||
- ✅ Formatter
|
||||
|
||||
**In Progress:**
|
||||
1. **Schema Evolution** - Type system integration, auto-migration
|
||||
2. **Error Message Quality** - Context lines shown, suggestions partial
|
||||
3. **Memory Management** - RC working for lists/boxed, closures/ADTs pending
|
||||
|
||||
**Planned:**
|
||||
4. **SQL Effect** - Database access
|
||||
|
||||
@@ -363,7 +363,114 @@ void lux_check_leaks() {
|
||||
|
||||
---
|
||||
|
||||
## Path to Koka/Rust Parity
|
||||
|
||||
### What We Have Now (Basic RC)
|
||||
|
||||
Our current implementation provides:
|
||||
- **Deterministic cleanup** - Memory freed at predictable points (scope exit)
|
||||
- **No GC pauses** - Unlike Go/Java, latency is predictable
|
||||
- **Leak detection** - Debug mode catches memory leaks during development
|
||||
- **No manual management** - Unlike C/Zig, programmer doesn't call free()
|
||||
|
||||
### What Koka Has (Perceus RC)
|
||||
|
||||
Koka's Perceus system adds several optimizations we don't have:
|
||||
|
||||
| Feature | Description | Benefit | Complexity |
|
||||
|---------|-------------|---------|------------|
|
||||
| **Last-use analysis** | Detect when a variable's final use allows ownership transfer | Avoid unnecessary copies | Medium |
|
||||
| **Reuse (FBIP)** | When rc=1, mutate in-place instead of copy | Major performance boost | High |
|
||||
| **Drop specialization** | Generate type-specific drop instead of polymorphic | Fewer branches, faster | Low |
|
||||
| **Drop fusion** | Combine multiple consecutive drops | Fewer function calls | Medium |
|
||||
| **Borrow inference** | Avoid incref when borrowing temporaries | Reduce RC overhead | High |
|
||||
|
||||
### What Rust Has (Ownership)
|
||||
|
||||
Rust's ownership system is fundamentally different:
|
||||
|
||||
| Aspect | Rust | Lux RC | Tradeoff |
|
||||
|--------|------|--------|----------|
|
||||
| **When checked** | Compile-time | Runtime | Rust catches bugs earlier |
|
||||
| **Runtime cost** | Zero | RC operations | Rust is faster |
|
||||
| **Learning curve** | Steep (borrow checker) | Gentle | Lux is easier to learn |
|
||||
| **Expressiveness** | Limited by lifetimes | Unrestricted | Lux is more flexible |
|
||||
| **Cycles** | Prevented by design | Would leak | Rust handles more patterns |
|
||||
|
||||
**Key insight:** We can never match Rust's zero-overhead guarantees because ownership is checked at compile time. RC always has runtime cost. But we can be as good as Koka.
|
||||
|
||||
### Remaining Work for Full Memory Safety
|
||||
|
||||
#### Phase A: Complete Coverage (Prevent All Leaks)
|
||||
|
||||
1. **Closure RC** - Environments should be RC-managed
|
||||
- Allocate env with `lux_rc_alloc`
|
||||
- Drop env when closure is dropped
|
||||
- ~50 lines in `emit_lambda`
|
||||
|
||||
2. **ADT RC** - Algebraic data types with heap fields
|
||||
- Track which variants contain RC fields
|
||||
- Generate drop functions for each ADT
|
||||
- ~100 lines
|
||||
|
||||
3. **Early return handling** - Cleanup all scopes on return
|
||||
- Current impl handles simple cases
|
||||
- Need nested scope cleanup
|
||||
- ~30 lines
|
||||
|
||||
4. **Complex conditionals** - If/else creating RC values
|
||||
- Switch from ternary to if-statements
|
||||
- Track RC creation in branches
|
||||
- ~50 lines
|
||||
|
||||
#### Phase B: Performance Optimizations (Match Koka)
|
||||
|
||||
1. **Last-use optimization**
|
||||
- Track variable liveness
|
||||
- Skip incref on last use (transfer ownership)
|
||||
- Requires dataflow analysis
|
||||
- ~200 lines
|
||||
|
||||
2. **Reuse analysis (FBIP)**
|
||||
- Detect `rc=1` at update sites
|
||||
- Mutate in-place instead of copy
|
||||
- Major change to list operations
|
||||
- ~300 lines
|
||||
|
||||
3. **Drop specialization**
|
||||
- Generate per-type drop functions
|
||||
- Eliminate polymorphic dispatch
|
||||
- ~100 lines
|
||||
|
||||
### Estimated Effort
|
||||
|
||||
| Phase | Description | Lines | Priority |
|
||||
|-------|-------------|-------|----------|
|
||||
| A1 | Closure RC | ~50 | P0 - Closures leak |
|
||||
| A2 | ADT RC | ~100 | P1 - ADTs leak |
|
||||
| A3 | Early returns | ~30 | P1 - Edge cases |
|
||||
| A4 | Conditionals | ~50 | P2 - Uncommon |
|
||||
| B1 | Last-use opt | ~200 | P3 - Performance |
|
||||
| B2 | Reuse (FBIP) | ~300 | P3 - Performance |
|
||||
| B3 | Drop special | ~100 | P3 - Performance |
|
||||
|
||||
**Phase A total: ~230 lines** - Gets us to "no leaks"
|
||||
**Phase B total: ~600 lines** - Gets us to Koka-level performance
|
||||
|
||||
### Cycle Detection
|
||||
|
||||
RC cannot handle cycles (A → B → A). Options:
|
||||
|
||||
1. **Ignore** - Cycles are rare in functional code (our current approach)
|
||||
2. **Weak references** - Programmer marks back-edges
|
||||
3. **Cycle collector** - Periodic scan for cycles (adds GC-like pauses)
|
||||
|
||||
Koka also ignores cycles, relying on functional programming's natural acyclicity.
|
||||
|
||||
---
|
||||
|
||||
## References
|
||||
|
||||
- [Perceus Paper](https://www.microsoft.com/en-us/research/publication/perceus-garbage-free-reference-counting-with-reuse/)
|
||||
- [Koka Reference Counting](https://koka-lang.github.io/koka/doc/book.html)
|
||||
- [Rust Ownership](https://doc.rust-lang.org/book/ch04-00-understanding-ownership.html)
|
||||
|
||||
Reference in New Issue
Block a user