# Reference Counting in Lux C Backend

## Overview

This document describes the reference counting (RC) system for automatic memory management in the Lux C backend. The approach is inspired by Perceus (used in Koka) but starts with a simpler implementation.

## Current Status: WORKING

The RC system is now functional for lists and boxed values.

### What's Implemented

- RC header structure (`LuxRcHeader` with refcount + type tag)
- Allocation function (`lux_rc_alloc`)
- Reference operations (`lux_incref`, `lux_decref`)
- Polymorphic drop function (`lux_drop`)
- Lists, boxed values, strings use RC allocation
- List operations incref shared elements
- **Closures and environments** - RC-managed with automatic cleanup
- **Inline lambda cleanup** - temporary closures freed after use
- **ADT pointer fields** - RC-allocated and cleaned up at scope exit
- **Scope tracking** - compiler tracks RC variable lifetimes
- **Automatic decref at scope exit** - variables are freed when out of scope
- **Memory tracking** - debug mode reports allocs/frees at program exit
- **Early return handling** - variables being returned from blocks/functions are not decref'd
- **Function call RC tracking** - values from RC-returning functions are tracked for cleanup

### Verified Working

```
[RC] No leaks: 14 allocs, 14 frees
```

### What's NOT Yet Implemented

- Conditional branch handling (complex if/else patterns)

## The Problem

Currently generated code looks like this:

```c
void example(LuxEvidence* ev) {
    LuxList* nums = lux_list_new(5);  // rc=1, allocated
    // ... use nums ...
    // MISSING: lux_decref(nums);  <- MEMORY LEAK!
}
```

It should look like this:

```c
void example(LuxEvidence* ev) {
    LuxList* nums = lux_list_new(5);  // rc=1
    // ... use nums ...
    lux_decref(nums);  // rc=0, freed
}
```

---

## Implementation Plan

### Phase 1: Scope Tracking

**Goal:** Track which RC-managed variables are live at each point.

**Data structures needed in CBackend:**

```rust
struct CBackend {
    // ... existing fields ...
    /// Stack of scopes, each containing RC-managed variables
    /// Each scope is a Vec of (var_name, c_type, needs_decref)
    rc_scopes: Vec<Vec<RcVariable>>,
}

struct RcVariable {
    name: String,    // Variable name
    c_type: String,  // C type (for casting in decref)
    is_rc: bool,     // Whether this needs RC management
}
```

**Operations:**

- `push_scope()` - Enter a new scope (function, block, etc.)
- `pop_scope()` - Exit scope, emit decrefs for all live variables
- `register_rc_var(name, type)` - Register a variable that needs RC management

### Phase 2: Identify RC-Managed Types

**Goal:** Determine which types need RC management.

RC-managed types:

- `LuxList*` - Lists
- `LuxString` (when dynamically allocated) - Strings from concat/conversion
- `LuxClosure*` - Closures
- Boxed values (`void*` from `lux_box_*`)
- ADT variants with pointer fields

NOT RC-managed:

- `LuxInt`, `LuxFloat`, `LuxBool` - Stack-allocated primitives
- String literals (`"hello"`) - Static, not heap-allocated
- `LuxUnit` - No data

**Implementation:**

```rust
fn is_rc_managed_type(&self, c_type: &str) -> bool {
    matches!(c_type, "LuxList*" | "LuxClosure*" | "LuxString" | "void*")
        || c_type.ends_with("*")  // Most pointer types are RC
}

fn needs_rc_for_expr(&self, expr: &Expr) -> bool {
    match expr {
        Expr::List { .. } => true,
        Expr::Lambda { .. } => true,
        Expr::StringConcat { .. } => true,
        // Bind the callee so it can be inspected (field name `func` is assumed)
        Expr::Call { func, .. } => {
            // Check if function returns RC type
            self.returns_rc_type(func)
        }
        Expr::Literal(Literal::String(_)) => false,  // Static string
        Expr::Literal(_) => false,  // Primitives
        Expr::Var(_) => false,  // Using existing var, don't double-free
        _ => false,
    }
}
```

### Phase 3: Emit Decrefs at Scope Exit

**Goal:** Insert `lux_decref()` calls when variables go out of scope.

**For function bodies:**

```rust
fn emit_function(&mut self, func: &Function) -> Result<(), CGenError> {
    self.push_scope();

    // ... emit function body ...
    // Before the closing brace, emit decrefs
    self.emit_scope_cleanup();
    self.pop_scope();
    Ok(())
}
```

**The cleanup function:**

```rust
fn emit_scope_cleanup(&mut self) {
    // Clone the names up front: `writeln` needs `&mut self`, so `rc_scopes` cannot stay borrowed
    let names: Vec<String> = match self.rc_scopes.last() {
        // Decref in reverse order (LIFO)
        Some(scope) => scope.iter().rev().filter(|v| v.is_rc).map(|v| v.name.clone()).collect(),
        None => Vec::new(),
    };
    for name in names {
        self.writeln(&format!("lux_decref({});", name));
    }
}
```

### Phase 4: Handle Let Bindings

**Goal:** Register variables when they're bound.

```rust
fn emit_let(&mut self, name: &str, value: &Expr) -> Result<String, CGenError> {
    let c_type = self.infer_c_type(value)?;
    let value_code = self.emit_expr(value)?;

    self.writeln(&format!("{} {} = {};", c_type, name, value_code));

    // Register for cleanup if RC-managed
    if self.is_rc_managed_type(&c_type) && self.needs_rc_for_expr(value) {
        self.register_rc_var(name, &c_type);
    }

    Ok(name.to_string())
}
```

### Phase 5: Handle Early Returns

**Goal:** Decref all live variables before returning.

```rust
fn emit_return(&mut self, value: &Expr) -> Result<String, CGenError> {
    let return_val = self.emit_expr(value)?;

    // Store return value in temp if it's an RC variable we're about to decref
    let temp_needed = self.is_rc_managed_type(&self.infer_c_type(value)?);
    if temp_needed {
        self.writeln(&format!("void* _ret_tmp = {};", return_val));
        self.writeln("lux_incref(_ret_tmp);");  // Keep it alive
    }

    // Decref all scopes from innermost to outermost.
    // Names are collected first so `rc_scopes` is not borrowed while writing.
    let names: Vec<String> = self
        .rc_scopes
        .iter()
        .rev()
        .flat_map(|scope| scope.iter().rev())
        .filter(|var| var.is_rc)
        .map(|var| var.name.clone())
        .collect();
    for name in names {
        self.writeln(&format!("lux_decref({});", name));
    }

    if temp_needed {
        self.writeln("return _ret_tmp;");
    } else {
        self.writeln(&format!("return {};", return_val));
    }

    Ok(String::new())
}
```

### Phase 6: Handle Conditionals

**Goal:** Properly handle if/else where both branches may define variables.

For if/else expressions that create RC values:

```c
// Before (leaks):
LuxList* result = (condition ?
                   create_list_a() : create_list_b());

// After (no leak):
LuxList* result;
if (condition) {
    result = create_list_a();
} else {
    result = create_list_b();
}
// Only one path executed, only one allocation
```

This requires changing if/else from ternary expressions to proper if statements.

### Phase 7: Handle Blocks

**Goal:** Each block `{ ... }` creates a new scope.

```rust
fn emit_block(&mut self, statements: &[Statement]) -> Result<String, CGenError> {
    self.push_scope();
    self.writeln("{");
    self.indent += 1;

    let mut last_value = String::from("NULL");
    for stmt in statements {
        last_value = self.emit_statement(stmt)?;
    }

    // Cleanup before leaving block
    self.emit_scope_cleanup();

    self.indent -= 1;
    self.writeln("}");
    self.pop_scope();

    Ok(last_value)
}
```

---

## Testing Strategy

### Unit Tests

1. **Simple allocation and free:**

   ```lux
   fn test(): Unit = {
       let x = [1, 2, 3]
       // Should be freed at end
   }
   ```

2. **Nested scopes:**

   ```lux
   fn test(): Unit = {
       let outer = [1]
       {
           let inner = [2]
           // Freed here
       }
       // outer still live
   }
   // outer freed here
   ```

3. **Early return:**

   ```lux
   fn test(b: Bool): List<Int> = {
       let x = [1, 2, 3]
       if b then return []  // x must be freed before return
       x
   }
   ```

4. **Conditionals:**

   ```lux
   fn test(b: Bool): List<Int> = {
       let x = if b then [1] else [2]
       // Only one allocated
       x
   }
   ```

### Memory Leak Detection

Use valgrind (if available) or add debug tracking:

```c
static int64_t lux_alloc_count = 0;
static int64_t lux_free_count = 0;

static void* lux_rc_alloc(size_t size, int32_t tag) {
    lux_alloc_count++;
    // ... existing code ...
}

static void lux_drop(void* ptr, int32_t tag) {
    lux_free_count++;
    // ... existing code ...
}

// At program exit:
void lux_check_leaks(void) {
    if (lux_alloc_count != lux_free_count) {
        // Cast for %lld: int64_t is not guaranteed to be long long
        fprintf(stderr, "LEAK: %lld allocations, %lld frees\n",
                (long long)lux_alloc_count, (long long)lux_free_count);
    }
}
```

---

## Comparison with Perceus

| Feature | Perceus (Koka) | Lux RC (Current) |
|---------|----------------|------------------|
| RC header | Yes | Yes ✅ |
| Scope tracking | Yes | Yes ✅ |
| Auto decref | Yes | Yes ✅ |
| Memory tracking | No | Yes ✅ (debug) |
| Early return | Yes | Partial |
| Last-use opt | Yes | No |
| Reuse (FBIP) | Yes | No |
| Drop fusion | Yes | No |

---

## Files to Modify

| File | Changes |
|------|---------|
| `src/codegen/c_backend.rs` | Add scope tracking, emit decrefs |

## Estimated Complexity

- Scope tracking data structures: ~30 lines
- Type classification: ~40 lines
- Scope cleanup emission: ~30 lines
- Let binding registration: ~20 lines
- Early return handling: ~40 lines
- Block scope handling: ~30 lines
- Testing: ~100 lines

**Total: ~300 lines of careful implementation**

---

## Path to Koka/Rust Parity

### What We Have Now (Basic RC)

Our current implementation provides:

- **Deterministic cleanup** - Memory freed at predictable points (scope exit)
- **No GC pauses** - Unlike Go/Java, latency is predictable
- **Leak detection** - Debug mode catches memory leaks during development
- **No manual management** - Unlike C/Zig, programmer doesn't call free()

### What Koka Has (Perceus RC)

Koka's Perceus system adds several optimizations we don't have:

| Feature | Description | Benefit | Complexity |
|---------|-------------|---------|------------|
| **Last-use analysis** | Detect when a variable's final use allows ownership transfer | Avoid unnecessary copies | Medium |
| **Reuse (FBIP)** | When rc=1, mutate in-place instead of copy | Major performance boost | High |
| **Drop specialization** | Generate type-specific drop instead of polymorphic | Fewer branches, faster | Low |
| **Drop fusion** | Combine multiple consecutive drops | Fewer function calls | Medium |
| **Borrow inference** | Avoid incref when borrowing temporaries | Reduce RC overhead | High |

### What Rust Has (Ownership)

Rust's ownership system is fundamentally different:

| Aspect | Rust | Lux RC | Tradeoff |
|--------|------|--------|----------|
| **When checked** | Compile-time | Runtime | Rust catches bugs earlier |
| **Runtime cost** | Zero | RC operations | Rust is faster |
| **Learning curve** | Steep (borrow checker) | Gentle | Lux is easier to learn |
| **Expressiveness** | Limited by lifetimes | Unrestricted | Lux is more flexible |
| **Cycles** | Prevented by design | Would leak | Rust handles more patterns |

**Key insight:** We can never match Rust's zero-overhead guarantees: Rust checks ownership at compile time, while RC always has a runtime cost. But we can be as good as Koka.

### Remaining Work for Full Memory Safety

#### Phase A: Complete Coverage (Prevent All Leaks)

1. ~~**Closure RC**~~ ✅ DONE - Environments are now RC-managed
   - Closures allocated with `lux_rc_alloc(sizeof(LuxClosure), LUX_TAG_CLOSURE)`
   - Environments allocated with `lux_rc_alloc(sizeof(LuxEnv_N), LUX_TAG_ENV)`
   - Inline lambdas freed after use in List operations

2. ~~**ADT RC**~~ ✅ DONE - Algebraic data types with heap fields
   - Track which variants contain RC fields
   - Generate drop functions for each ADT
   - ~100 lines

3. ~~**Early return handling**~~ ✅ DONE - Cleanup all scopes on return
   - Variables being returned are skipped during scope cleanup
   - Function calls returning RC types are tracked for cleanup
   - Blocks properly handle returning RC variables

4. **Complex conditionals** - If/else creating RC values
   - Switch from ternary to if-statements
   - Track RC creation in branches
   - ~50 lines

#### Phase B: Performance Optimizations (Match Koka)

1. **Last-use optimization**
   - Track variable liveness
   - Skip incref on last use (transfer ownership)
   - Requires dataflow analysis
   - ~200 lines

2. **Reuse analysis (FBIP)**
   - Detect `rc=1` at update sites
   - Mutate in-place instead of copy
   - Major change to list operations
   - ~300 lines

3. **Drop specialization**
   - Generate per-type drop functions
   - Eliminate polymorphic dispatch
   - ~100 lines

### Estimated Effort

| Phase | Description | Lines | Priority | Status |
|-------|-------------|-------|----------|--------|
| A1 | Closure RC | ~50 | P0 | ✅ Done |
| A2 | ADT RC | ~150 | P1 | ✅ Done |
| A3 | Early returns | ~30 | P1 | ✅ Done |
| A4 | Conditionals | ~50 | P2 - Uncommon | Pending |
| B1 | Last-use opt | ~200 | P3 - Performance | Pending |
| B2 | Reuse (FBIP) | ~300 | P3 - Performance | Pending |
| B3 | Drop special | ~100 | P3 - Performance | Pending |

**Phase A remaining: ~50 lines** - Gets us to "no leaks"

**Phase B total: ~600 lines** - Gets us to Koka-level performance

### Cycle Detection

RC cannot handle cycles (A → B → A). Options:

1. **Ignore** - Cycles are rare in functional code (our current approach)
2. **Weak references** - Programmer marks back-edges
3. **Cycle collector** - Periodic scan for cycles (adds GC-like pauses)

Koka also ignores cycles, relying on functional programming's natural acyclicity.

---

## References

- [Perceus Paper](https://www.microsoft.com/en-us/research/publication/perceus-garbage-free-reference-counting-with-reuse/)
- [Koka Reference Counting](https://koka-lang.github.io/koka/doc/book.html)
- [Rust Ownership](https://doc.rust-lang.org/book/ch04-00-understanding-ownership.html)
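
---

## Appendix: Minimal RC Runtime Sketch

The runtime pieces listed under "What's Implemented" (an RC header holding a refcount and a type tag, `lux_rc_alloc`, `lux_incref`, `lux_decref`, and the debug alloc/free counters) might fit together roughly as below. This is a sketch under assumptions, not the backend's actual runtime: the header layout, the `lux_header` helper, the tag value `1`, and `rc_demo` are hypothetical, and the polymorphic `lux_drop` is reduced to a plain `free` with no recursive drop of children.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed layout: the document only states that the header holds a
 * refcount and a type tag. */
typedef struct {
    int32_t refcount;
    int32_t tag;
} LuxRcHeader;

static int64_t lux_alloc_count = 0;
static int64_t lux_free_count = 0;

/* Allocate `size` payload bytes preceded by a header; the returned
 * pointer addresses the payload and starts with rc=1. */
static void* lux_rc_alloc(size_t size, int32_t tag) {
    LuxRcHeader* h = malloc(sizeof(LuxRcHeader) + size);
    if (h == NULL) abort();
    h->refcount = 1;
    h->tag = tag;
    lux_alloc_count++;
    return h + 1;
}

/* Hypothetical helper: step back from the payload to its header. */
static LuxRcHeader* lux_header(void* ptr) {
    return (LuxRcHeader*)ptr - 1;
}

static void lux_incref(void* ptr) {
    if (ptr != NULL) lux_header(ptr)->refcount++;
}

/* Simplified: frees the object directly when the count reaches zero.
 * A real drop would dispatch on `tag` and decref child pointers first. */
static void lux_decref(void* ptr) {
    if (ptr == NULL) return;
    LuxRcHeader* h = lux_header(ptr);
    assert(h->refcount > 0);
    if (--h->refcount == 0) {
        lux_free_count++;
        free(h);
    }
}

/* Hypothetical demo: returns the number of outstanding allocations. */
int64_t rc_demo(void) {
    int64_t* boxed = lux_rc_alloc(sizeof(int64_t), 1); /* rc=1 */
    *boxed = 42;
    lux_incref(boxed); /* rc=2: a second owner shares the value */
    lux_decref(boxed); /* rc=1 */
    lux_decref(boxed); /* rc=0, freed */
    return lux_alloc_count - lux_free_count;
}
```

When every incref is matched by a decref, as in `rc_demo`, the two counters agree, which is the same condition the debug-mode leak report checks at program exit.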