lux/docs/REFERENCE_COUNTING.md
Brandon Lucas b2f4beeaa2 docs: update documentation with RC implementation status
- C_BACKEND.md: Update memory management from "Leaks" to "Scope-based RC",
  update comparison tables with Koka/Rust/Zig/Go
- LANGUAGE_COMPARISON.md: Add status column to gap tables, add RC row
- OVERVIEW.md: Add C backend RC to completed features, update limitations
- REFERENCE_COUNTING.md: Add "Path to Koka/Rust Parity" section with:
  - What we have vs what Koka/Rust have
  - Remaining work for full memory safety (~230 lines)
  - Performance optimizations for Koka parity (~600 lines)
  - Cycle detection strategy

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-14 13:05:17 -05:00


Reference Counting in Lux C Backend

Overview

This document describes the reference counting (RC) system for automatic memory management in the Lux C backend. The approach is inspired by Perceus (used in Koka) but starts with a simpler implementation.

Current Status: WORKING

The RC system is now functional for lists, boxed values, and dynamically allocated strings.

What's Implemented

  • RC header structure (LuxRcHeader with refcount + type tag)
  • Allocation function (lux_rc_alloc)
  • Reference operations (lux_incref, lux_decref)
  • Polymorphic drop function (lux_drop)
  • Lists, boxed values, strings use RC allocation
  • List operations incref shared elements
  • Scope tracking - compiler tracks RC variable lifetimes
  • Automatic decref at scope exit - variables are freed when out of scope
  • Memory tracking - debug mode reports allocs/frees at program exit
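
The pieces listed above fit together roughly as follows. This is a minimal sketch, not the actual Lux runtime: the header layout (header placed directly before the payload), the helper `lux_header`, and the simplified `lux_decref` that frees directly instead of dispatching through `lux_drop` are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed layout: refcount + type tag, stored immediately before the payload. */
typedef struct {
    int32_t refcount;
    int32_t tag;
} LuxRcHeader;

/* Debug counters, as used for the "[RC] No leaks" report. */
static int64_t lux_alloc_count = 0;
static int64_t lux_free_count = 0;

/* Hypothetical helper: recover the header from a payload pointer. */
static LuxRcHeader *lux_header(void *ptr) {
    return (LuxRcHeader *)ptr - 1;
}

/* Allocate header + payload; the new object starts with rc=1. */
static void *lux_rc_alloc(size_t size, int32_t tag) {
    LuxRcHeader *h = malloc(sizeof(LuxRcHeader) + size);
    assert(h != NULL);
    h->refcount = 1;
    h->tag = tag;
    lux_alloc_count++;
    return h + 1;  /* payload follows the 8-byte header */
}

static void lux_incref(void *ptr) {
    if (ptr) lux_header(ptr)->refcount++;
}

/* Simplified: frees directly when the count reaches zero.
 * The real runtime would call lux_drop(ptr, tag) here. */
static void lux_decref(void *ptr) {
    if (!ptr) return;
    LuxRcHeader *h = lux_header(ptr);
    if (--h->refcount == 0) {
        lux_free_count++;
        free(h);
    }
}
```

A value allocated with `lux_rc_alloc` and shared once with `lux_incref` needs two `lux_decref` calls before it is freed; after the second call the alloc and free counters match.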

Verified Working

[RC] No leaks: 28 allocs, 28 frees

What's NOT Yet Implemented

  • Early return handling (decref before return in nested scopes)
  • Conditional branch handling (complex if/else patterns)
  • Closure RC (environments still leak)
  • ADT RC

The Problem

Before scope tracking was added, generated code looked like this:

void example(LuxEvidence* ev) {
    LuxList* nums = lux_list_new(5);  // rc=1, allocated
    // ... use nums ...
    // MISSING: lux_decref(nums);  <- MEMORY LEAK!
}

With decrefs emitted at scope exit, it looks like this:

void example(LuxEvidence* ev) {
    LuxList* nums = lux_list_new(5);  // rc=1
    // ... use nums ...
    lux_decref(nums);  // rc=0, freed
}

Implementation Plan

Phase 1: Scope Tracking

Goal: Track which RC-managed variables are live at each point.

Data structures needed in CBackend:

struct CBackend {
    // ... existing fields ...

    /// Stack of scopes, each containing RC-managed variables
    /// Each scope is a Vec of RcVariable entries (name, C type, RC flag)
    rc_scopes: Vec<Vec<RcVariable>>,
}

struct RcVariable {
    name: String,      // Variable name
    c_type: String,    // C type (for casting in decref)
    is_rc: bool,       // Whether this needs RC management
}

Operations:

  • push_scope() - Enter a new scope (function, block, etc.)
  • pop_scope() - Exit scope, after emit_scope_cleanup() has emitted decrefs for its live variables
  • register_rc_var(name, type) - Register a variable that needs RC management

Phase 2: Identify RC-Managed Types

Goal: Determine which types need RC management.

RC-managed types:

  • LuxList* - Lists
  • LuxString (when dynamically allocated) - Strings from concat/conversion
  • LuxClosure* - Closures
  • Boxed values (void* from lux_box_*)
  • ADT variants with pointer fields

NOT RC-managed:

  • LuxInt, LuxFloat, LuxBool - Stack-allocated primitives
  • String literals ("hello") - Static, not heap-allocated
  • LuxUnit - No data

Implementation:

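fn is_rc_managed_type(&self, c_type: &str) -> bool {
    // Explicit RC types. "LuxString" only needs RC when heap-allocated;
    // needs_rc_for_expr() filters out static string literals.
    matches!(c_type,
        "LuxList*" | "LuxClosure*" | "LuxString" | "void*"
    ) || c_type.ends_with("*")  // Catch-all: most pointer types are RC
    // Caveat: the catch-all also matches non-RC pointers such as LuxEvidence*.
}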

fn needs_rc_for_expr(&self, expr: &Expr) -> bool {
    match expr {
        Expr::List { .. } => true,
        Expr::Lambda { .. } => true,
        Expr::StringConcat { .. } => true,
        Expr::Call { func, .. } => {
            // Check if function returns RC type
            // (`func` is assumed to be the callee field of Expr::Call)
            self.returns_rc_type(func)
        }
        Expr::Literal(Literal::String(_)) => false,  // Static string
        Expr::Literal(_) => false,  // Primitives
        Expr::Var(_) => false,  // Using existing var, don't double-free
        _ => false,
    }
}

Phase 3: Emit Decrefs at Scope Exit

Goal: Insert lux_decref() calls when variables go out of scope.

For function bodies:

fn emit_function(&mut self, func: &Function) -> Result<(), CGenError> {
    self.push_scope();

    // ... emit function body ...

    // Before the closing brace, emit decrefs
    self.emit_scope_cleanup();
    self.pop_scope();
    Ok(())
}

The cleanup function:

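fn emit_scope_cleanup(&mut self) {
    // Collect the names first so the borrow of rc_scopes ends before
    // writeln() is called (assuming writeln takes &mut self).
    let names: Vec<String> = self.rc_scopes.last()
        .map(|scope| {
            scope.iter().rev()  // Decref in reverse order (LIFO)
                .filter(|var| var.is_rc)
                .map(|var| var.name.clone())
                .collect()
        })
        .unwrap_or_default();
    for name in names {
        self.writeln(&format!("lux_decref({});", name));
    }
}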

Phase 4: Handle Let Bindings

Goal: Register variables when they're bound.

fn emit_let(&mut self, name: &str, value: &Expr) -> Result<String, CGenError> {
    let c_type = self.infer_c_type(value)?;
    let value_code = self.emit_expr(value)?;

    self.writeln(&format!("{} {} = {};", c_type, name, value_code));

    // Register for cleanup if RC-managed
    if self.is_rc_managed_type(&c_type) && self.needs_rc_for_expr(value) {
        self.register_rc_var(name, &c_type);
    }

    Ok(name.to_string())
}

Phase 5: Handle Early Returns

Goal: Decref all live variables before returning.

fn emit_return(&mut self, value: &Expr) -> Result<String, CGenError> {
    let return_val = self.emit_expr(value)?;

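    // Store return value in temp if it's an RC variable we're about to decref
    let temp_needed = self.is_rc_managed_type(&self.infer_c_type(value)?);

    if temp_needed {
        self.writeln(&format!("void* _ret_tmp = {};", return_val));
        // Keep it alive across the decrefs below. This incref is only
        // balanced when the returned value is a tracked variable; a fresh,
        // untracked allocation would be returned with rc=2.
        self.writeln("lux_incref(_ret_tmp);");
    }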

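    // Decref all scopes from innermost to outermost. Names are collected
    // first so rc_scopes is not borrowed while writeln() runs
    // (assuming writeln takes &mut self).
    let names: Vec<String> = self.rc_scopes.iter().rev()
        .flat_map(|scope| scope.iter().rev())
        .filter(|var| var.is_rc)
        .map(|var| var.name.clone())
        .collect();
    for name in names {
        self.writeln(&format!("lux_decref({});", name));
    }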

    if temp_needed {
        self.writeln("return _ret_tmp;");
    } else {
        self.writeln(&format!("return {};", return_val));
    }

    Ok(String::new())
}
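
For a function that returns one of its own tracked locals early, the C emitted by this scheme would look roughly like the sketch below. The small runtime at the top is a stand-in, not the real Lux runtime (header layout and `lux_header` are assumptions), and `demo_early_return` is a hypothetical hand-written equivalent of generated code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-in runtime (assumed layout: header directly before the payload). */
typedef struct { int32_t refcount; int32_t tag; } LuxRcHeader;
static int64_t lux_alloc_count = 0;
static int64_t lux_free_count = 0;

static LuxRcHeader *lux_header(void *ptr) { return (LuxRcHeader *)ptr - 1; }

static void *lux_rc_alloc(size_t size, int32_t tag) {
    LuxRcHeader *h = malloc(sizeof(LuxRcHeader) + size);
    assert(h != NULL);
    h->refcount = 1;
    h->tag = tag;
    lux_alloc_count++;
    return h + 1;
}

static void lux_incref(void *ptr) {
    if (ptr) lux_header(ptr)->refcount++;
}

static void lux_decref(void *ptr) {
    if (!ptr) return;
    LuxRcHeader *h = lux_header(ptr);
    if (--h->refcount == 0) { lux_free_count++; free(h); }
}

/* Hypothetical generated code: return the tracked local `x` early when
 * `b` is true, otherwise fall through to normal scope cleanup. */
static void *demo_early_return(int b) {
    void *x = lux_rc_alloc(16, 1);   /* tracked local, rc=1 */
    if (b) {
        void *_ret_tmp = x;
        lux_incref(_ret_tmp);        /* rc=2: keep alive across cleanup */
        lux_decref(x);               /* scope cleanup, rc=1 */
        return _ret_tmp;             /* caller now owns the one reference */
    }
    lux_decref(x);                   /* normal scope exit, rc=0, freed */
    return NULL;
}
```

On the early-return path the caller receives the value with a count of 1 and is responsible for the final decref; on the fall-through path the local is freed before the function returns.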

Phase 6: Handle Conditionals

Goal: Properly handle if/else where both branches may define variables.

For if/else expressions that create RC values:

// Before (leaks):
LuxList* result = (condition ? create_list_a() : create_list_b());

// After (no leak):
LuxList* result;
if (condition) {
    result = create_list_a();
} else {
    result = create_list_b();
}
// Only one path executed, only one allocation

This requires changing if/else from ternary expressions to proper if statements.

Phase 7: Handle Blocks

Goal: Each block { ... } creates a new scope.

fn emit_block(&mut self, statements: &[Statement]) -> Result<String, CGenError> {
    self.push_scope();
    self.writeln("{");
    self.indent += 1;

    let mut last_value = String::from("NULL");
    for stmt in statements {
        last_value = self.emit_statement(stmt)?;
    }

    // Cleanup before leaving block
    self.emit_scope_cleanup();

    self.indent -= 1;
    self.writeln("}");
    self.pop_scope();

    Ok(last_value)
}
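
The effect of per-block scopes on the emitted C can be illustrated with a hand-written equivalent. This is a sketch under assumptions: the runtime at the top is a simplified stand-in, and `demo_nested_scopes` is a hypothetical example, not actual compiler output.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-in runtime (assumed layout: header directly before the payload). */
typedef struct { int32_t refcount; int32_t tag; } LuxRcHeader;
static int64_t lux_alloc_count = 0;
static int64_t lux_free_count = 0;

static void *lux_rc_alloc(size_t size, int32_t tag) {
    LuxRcHeader *h = malloc(sizeof(LuxRcHeader) + size);
    assert(h != NULL);
    h->refcount = 1;
    h->tag = tag;
    lux_alloc_count++;
    return h + 1;
}

static void lux_decref(void *ptr) {
    if (!ptr) return;
    LuxRcHeader *h = (LuxRcHeader *)ptr - 1;
    if (--h->refcount == 0) { lux_free_count++; free(h); }
}

/* Hypothetical output for a function with one nested block.
 * Returns the number of live objects observed inside the inner block. */
static int64_t demo_nested_scopes(void) {
    void *outer = lux_rc_alloc(16, 1);      /* function scope */
    int64_t live_inside;
    {
        void *inner = lux_rc_alloc(16, 1);  /* block scope */
        live_inside = lux_alloc_count - lux_free_count;
        lux_decref(inner);                  /* cleanup before leaving block */
    }
    /* outer is still live here */
    lux_decref(outer);                      /* cleanup at function exit */
    return live_inside;
}
```

Two objects are live inside the inner block, `inner` is released at the closing brace of the block, and `outer` is released at function exit.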

Testing Strategy

Unit Tests

  1. Simple allocation and free:
fn test(): Unit = {
    let x = [1, 2, 3]  // Should be freed at end
}
  2. Nested scopes:
fn test(): Unit = {
    let outer = [1]
    {
        let inner = [2]  // Freed here
    }
    // outer still live
}  // outer freed here
  3. Early return:
fn test(b: Bool): List<Int> = {
    let x = [1, 2, 3]
    if b then return []  // x must be freed before return
    x
}
  4. Conditionals:
fn test(b: Bool): List<Int> = {
    let x = if b then [1] else [2]  // Only one allocated
    x
}

Memory Leak Detection

Use valgrind (if available) or add debug tracking:

#include <inttypes.h>  // PRId64
#include <stdint.h>
#include <stdio.h>

static int64_t lux_alloc_count = 0;
static int64_t lux_free_count = 0;

static void* lux_rc_alloc(size_t size, int32_t tag) {
    lux_alloc_count++;
    // ... existing code ...
}

static void lux_drop(void* ptr, int32_t tag) {
    lux_free_count++;
    // ... existing code ...
}

// At program exit:
void lux_check_leaks(void) {
    if (lux_alloc_count != lux_free_count) {
        // PRId64 is the portable format for int64_t; %lld is not guaranteed to match
        fprintf(stderr, "LEAK: %" PRId64 " allocations, %" PRId64 " frees\n",
                lux_alloc_count, lux_free_count);
    }
}

Comparison with Perceus

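| Feature | Perceus (Koka) | Lux RC (Current) |
|---------|----------------|------------------|
| RC header | Yes | Yes |
| Scope tracking | Yes | Yes |
| Auto decref | Yes | Yes |
| Memory tracking | No | Yes (debug) |
| Early return | Yes | Partial |
| Last-use opt | Yes | No |
| Reuse (FBIP) | Yes | No |
| Drop fusion | Yes | No |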

Files to Modify

| File | Changes |
|------|---------|
| src/codegen/c_backend.rs | Add scope tracking, emit decrefs |

Estimated Complexity

  • Scope tracking data structures: ~30 lines
  • Type classification: ~40 lines
  • Scope cleanup emission: ~30 lines
  • Let binding registration: ~20 lines
  • Early return handling: ~40 lines
  • Block scope handling: ~30 lines
  • Testing: ~100 lines

Total: ~300 lines of careful implementation


Path to Koka/Rust Parity

What We Have Now (Basic RC)

Our current implementation provides:

  • Deterministic cleanup - Memory freed at predictable points (scope exit)
  • No GC pauses - Unlike Go/Java, latency is predictable
  • Leak detection - Debug mode catches memory leaks during development
  • No manual management - Unlike C/Zig, programmer doesn't call free()

What Koka Has (Perceus RC)

Koka's Perceus system adds several optimizations we don't have:

| Feature | Description | Benefit | Complexity |
|---------|-------------|---------|------------|
| Last-use analysis | Detect when a variable's final use allows ownership transfer | Avoid unnecessary incref/decref pairs | Medium |
| Reuse (FBIP) | When rc=1, mutate in-place instead of copy | Major performance boost | High |
| Drop specialization | Generate type-specific drop instead of polymorphic | Fewer branches, faster | Low |
| Drop fusion | Combine multiple consecutive drops | Fewer function calls | Medium |
| Borrow inference | Avoid incref when borrowing temporaries | Reduce RC overhead | High |
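
To make the last-use row concrete, the sketch below compares a call that shares a value (incref before the call, decref at scope exit) with a call that transfers ownership on the variable's final use. It assumes a callee-owned argument convention; the stand-in runtime, the `lux_rc_ops` counter, and the functions `consume`, `demo_naive` and `demo_last_use` are all hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-in runtime (assumed layout: header directly before the payload). */
typedef struct { int32_t refcount; int32_t tag; } LuxRcHeader;
static int64_t lux_alloc_count = 0;
static int64_t lux_free_count = 0;
static int64_t lux_rc_ops = 0;   /* counts incref + decref calls */

static void *lux_rc_alloc(size_t size, int32_t tag) {
    LuxRcHeader *h = malloc(sizeof(LuxRcHeader) + size);
    assert(h != NULL);
    h->refcount = 1;
    h->tag = tag;
    lux_alloc_count++;
    return h + 1;
}

static void lux_incref(void *ptr) {
    lux_rc_ops++;
    ((LuxRcHeader *)ptr - 1)->refcount++;
}

static void lux_decref(void *ptr) {
    lux_rc_ops++;
    LuxRcHeader *h = (LuxRcHeader *)ptr - 1;
    if (--h->refcount == 0) { lux_free_count++; free(h); }
}

/* Callee owns its argument: it reads the value and drops the reference. */
static int64_t consume(int64_t *boxed) {
    int64_t v = *boxed;
    lux_decref(boxed);
    return v;
}

/* Without last-use analysis: share, call, then clean up at scope exit. */
static int64_t demo_naive(void) {
    int64_t *x = lux_rc_alloc(sizeof *x, 0);
    *x = 7;
    lux_incref(x);            /* rc=2: extra reference for the callee */
    int64_t v = consume(x);   /* rc=1 */
    lux_decref(x);            /* scope exit, rc=0 */
    return v;
}

/* With last-use analysis: the call is the final use, so ownership moves. */
static int64_t demo_last_use(void) {
    int64_t *x = lux_rc_alloc(sizeof *x, 0);
    *x = 7;
    return consume(x);        /* no incref, and no decref at scope exit */
}
```

Both versions free the box exactly once, but the naive version performs three count updates where the last-use version performs one.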

What Rust Has (Ownership)

Rust's ownership system is fundamentally different:

| Aspect | Rust | Lux RC | Tradeoff |
|--------|------|--------|----------|
| When checked | Compile-time | Runtime | Rust catches bugs earlier |
| Runtime cost | Zero | RC operations | Rust is faster |
| Learning curve | Steep (borrow checker) | Gentle | Lux is easier to learn |
| Expressiveness | Limited by lifetimes | Unrestricted | Lux is more flexible |
| Cycles | Not expressible with plain ownership; `Rc` cycles leak unless `Weak` is used | Would leak | Neither collects cycles automatically |

Key insight: We can never match Rust's zero-overhead guarantees, because Rust resolves ownership at compile time while RC always pays a runtime cost for count updates. But we can be as good as Koka.

Remaining Work for Full Memory Safety

Phase A: Complete Coverage (Prevent All Leaks)

  1. Closure RC - Environments should be RC-managed

    • Allocate env with lux_rc_alloc
    • Drop env when closure is dropped
    • ~50 lines in emit_lambda
  2. ADT RC - Algebraic data types with heap fields

    • Track which variants contain RC fields
    • Generate drop functions for each ADT
    • ~100 lines
  3. Early return handling - Cleanup all scopes on return

    • Current impl handles simple cases
    • Need nested scope cleanup
    • ~30 lines
  4. Complex conditionals - If/else creating RC values

    • Switch from ternary to if-statements
    • Track RC creation in branches
    • ~50 lines
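
The ADT item above amounts to generating a drop routine that releases a variant's RC fields before freeing the variant itself; a closure environment would follow the same pattern. The sketch below is an illustration under assumptions: the `Pair` variant, the tag values, `pair_new`, and the stand-in runtime are hypothetical, and only the `lux_drop(ptr, tag)` shape is taken from this document.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-in runtime (assumed layout: header directly before the payload). */
typedef struct { int32_t refcount; int32_t tag; } LuxRcHeader;
static int64_t lux_alloc_count = 0;
static int64_t lux_free_count = 0;

enum { TAG_BOX = 0, TAG_PAIR = 1 };   /* hypothetical type tags */

/* Hypothetical ADT variant whose two fields are RC-managed pointers. */
typedef struct { void *first; void *second; } Pair;

static void *lux_rc_alloc(size_t size, int32_t tag) {
    LuxRcHeader *h = malloc(sizeof(LuxRcHeader) + size);
    assert(h != NULL);
    h->refcount = 1;
    h->tag = tag;
    lux_alloc_count++;
    return h + 1;
}

static void lux_decref(void *ptr);

/* Polymorphic drop: release RC fields first, then free the object. */
static void lux_drop(void *ptr, int32_t tag) {
    if (tag == TAG_PAIR) {
        Pair *p = ptr;
        lux_decref(p->first);
        lux_decref(p->second);
    }
    lux_free_count++;
    free((LuxRcHeader *)ptr - 1);
}

static void lux_decref(void *ptr) {
    if (!ptr) return;
    LuxRcHeader *h = (LuxRcHeader *)ptr - 1;
    if (--h->refcount == 0) lux_drop(ptr, h->tag);
}

/* Constructor takes ownership of both fields (no incref). */
static Pair *pair_new(void *first, void *second) {
    Pair *p = lux_rc_alloc(sizeof *p, TAG_PAIR);
    p->first = first;
    p->second = second;
    return p;
}

/* Returns the number of objects still live after dropping the pair. */
static int64_t demo_adt_drop(void) {
    void *l = lux_rc_alloc(sizeof(int64_t), TAG_BOX);
    void *r = lux_rc_alloc(sizeof(int64_t), TAG_BOX);
    Pair *p = pair_new(l, r);
    lux_decref(p);   /* frees the pair and, through lux_drop, both fields */
    return lux_alloc_count - lux_free_count;
}
```

Dropping the pair releases all three allocations with a single `lux_decref` at the use site.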

Phase B: Performance Optimizations (Match Koka)

  1. Last-use optimization

    • Track variable liveness
    • Skip incref on last use (transfer ownership)
    • Requires dataflow analysis
    • ~200 lines
  2. Reuse analysis (FBIP)

    • Detect rc=1 at update sites
    • Mutate in-place instead of copy
    • Major change to list operations
    • ~300 lines
  3. Drop specialization

    • Generate per-type drop functions
    • Eliminate polymorphic dispatch
    • ~100 lines
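
The reuse item above can be sketched as an update function that checks the count at the update site: a uniquely owned value is mutated in place, a shared one is copied. This is an illustration only; `IntVec`, `vec_new`, `vec_set`, and the stand-in runtime are hypothetical and much simpler than a real list implementation.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-in runtime (assumed layout: header directly before the payload). */
typedef struct { int32_t refcount; int32_t tag; } LuxRcHeader;
static int64_t lux_alloc_count = 0;
static int64_t lux_free_count = 0;

static LuxRcHeader *lux_header(void *ptr) { return (LuxRcHeader *)ptr - 1; }

static void *lux_rc_alloc(size_t size, int32_t tag) {
    LuxRcHeader *h = malloc(sizeof(LuxRcHeader) + size);
    assert(h != NULL);
    h->refcount = 1;
    h->tag = tag;
    lux_alloc_count++;
    return h + 1;
}

static void lux_incref(void *ptr) { lux_header(ptr)->refcount++; }

static void lux_decref(void *ptr) {
    LuxRcHeader *h = lux_header(ptr);
    if (--h->refcount == 0) { lux_free_count++; free(h); }
}

/* Hypothetical fixed-size vector used in place of a real Lux list. */
typedef struct { int64_t data[4]; } IntVec;

static IntVec *vec_new(void) {
    IntVec *v = lux_rc_alloc(sizeof *v, 1);
    for (int i = 0; i < 4; i++) v->data[i] = 0;
    return v;
}

/* Functional update that takes ownership of `v` and returns the result. */
static IntVec *vec_set(IntVec *v, int i, int64_t value) {
    assert(i >= 0 && i < 4);
    if (lux_header(v)->refcount == 1) {
        v->data[i] = value;   /* unique owner: reuse the memory in place */
        return v;
    }
    IntVec *copy = vec_new(); /* shared: other owners must not see the write */
    *copy = *v;
    copy->data[i] = value;
    lux_decref(v);            /* release the reference we were given */
    return copy;
}
```

When the argument is uniquely owned, `vec_set` returns the same pointer and allocates nothing; when another reference exists, it returns a fresh copy and the original stays unchanged.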

Estimated Effort

| Phase | Description | Lines | Priority |
|-------|-------------|-------|----------|
| A1 | Closure RC | ~50 | P0 - Closures leak |
| A2 | ADT RC | ~100 | P1 - ADTs leak |
| A3 | Early returns | ~30 | P1 - Edge cases |
| A4 | Conditionals | ~50 | P2 - Uncommon |
| B1 | Last-use opt | ~200 | P3 - Performance |
| B2 | Reuse (FBIP) | ~300 | P3 - Performance |
| B3 | Drop special | ~100 | P3 - Performance |

Phase A total: ~230 lines - Gets us to "no leaks"

Phase B total: ~600 lines - Gets us to Koka-level performance

Cycle Detection

RC cannot handle cycles (A → B → A). Options:

  1. Ignore - Cycles are rare in functional code (our current approach)
  2. Weak references - Programmer marks back-edges
  3. Cycle collector - Periodic scan for cycles (adds GC-like pauses)

Koka also ignores cycles, relying on functional programming's natural acyclicity.
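
The A → B → A case can be reproduced in a few lines of C to show why plain RC leaks it. This is a sketch with a stand-in runtime; the `Node` type, `node_new`, and `demo_cycle` are hypothetical, and the mutation used to build the cycle is exactly what purely functional code normally avoids.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-in runtime (assumed layout: header directly before the payload). */
typedef struct { int32_t refcount; int32_t tag; } LuxRcHeader;
static int64_t lux_alloc_count = 0;
static int64_t lux_free_count = 0;

/* Hypothetical node with a single RC-managed field. */
typedef struct Node { struct Node *next; } Node;

static void *lux_rc_alloc(size_t size, int32_t tag) {
    LuxRcHeader *h = malloc(sizeof(LuxRcHeader) + size);
    assert(h != NULL);
    h->refcount = 1;
    h->tag = tag;
    lux_alloc_count++;
    return h + 1;
}

static void lux_incref(void *ptr) {
    ((LuxRcHeader *)ptr - 1)->refcount++;
}

static void lux_decref(void *ptr) {
    if (!ptr) return;
    LuxRcHeader *h = (LuxRcHeader *)ptr - 1;
    if (--h->refcount == 0) {
        lux_decref(((Node *)ptr)->next);  /* drop the field, then the node */
        lux_free_count++;
        free(h);
    }
}

static Node *node_new(void) {
    Node *n = lux_rc_alloc(sizeof *n, 1);
    n->next = NULL;
    return n;
}

/* Builds a two-node cycle, drops both local references, and returns
 * the number of objects that are still allocated afterwards. */
static int64_t demo_cycle(void) {
    Node *a = node_new();
    Node *b = node_new();
    a->next = b; lux_incref(b);   /* a -> b, b has rc=2 */
    b->next = a; lux_incref(a);   /* b -> a, a has rc=2 */
    lux_decref(a);                /* scope exit: rc=1, kept alive by b */
    lux_decref(b);                /* scope exit: rc=1, kept alive by a */
    return lux_alloc_count - lux_free_count;
}
```

Both nodes end with a count of 1 and no remaining external reference, so neither is ever freed. A weak back-edge would avoid the second incref and let the pair be released.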


References