Brandon Lucas b2f4beeaa2 docs: update documentation with RC implementation status

- C_BACKEND.md: Update memory management from "Leaks" to "Scope-based RC",
  update comparison tables with Koka/Rust/Zig/Go
- LANGUAGE_COMPARISON.md: Add status column to gap tables, add RC row
- OVERVIEW.md: Add C backend RC to completed features, update limitations
- REFERENCE_COUNTING.md: Add "Path to Koka/Rust Parity" section with:
  - What we have vs what Koka/Rust have
  - Remaining work for full memory safety (~230 lines)
  - Performance optimizations for Koka parity (~600 lines)
  - Cycle detection strategy

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

2026-02-14 13:05:17 -05:00

Reference Counting in Lux C Backend

Overview

This document describes the reference counting (RC) system for automatic memory management in the Lux C backend. The approach is inspired by Perceus (used in Koka) but starts with a simpler implementation.

Current Status: WORKING

The RC system is now functional for lists and boxed values.

What's Implemented

RC header structure (LuxRcHeader with refcount + type tag)
Allocation function (lux_rc_alloc)
Reference operations (lux_incref, lux_decref)
Polymorphic drop function (lux_drop)
Lists, boxed values, strings use RC allocation
List operations incref shared elements
Scope tracking - compiler tracks RC variable lifetimes
Automatic decref at scope exit - variables are freed when out of scope
Memory tracking - debug mode reports allocs/frees at program exit
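
As a rough illustration of how the header, allocation, and incref/decref items above could fit together, here is a minimal C sketch. The `DemoRcHeader` type, the `demo_*` functions, and the layout (header stored immediately before the payload) are assumptions made for this example; they are not the actual LuxRcHeader or lux_* definitions.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical header: refcount + type tag, placed just before the payload. */
typedef struct {
    int32_t refcount;
    int32_t tag;
} DemoRcHeader;

static int64_t demo_allocs = 0;
static int64_t demo_frees = 0;

/* Recover the header from a payload pointer. */
static DemoRcHeader* demo_header(void* ptr) {
    return (DemoRcHeader*)ptr - 1;
}

/* Allocate header + payload; a new object starts with rc=1. */
static void* demo_rc_alloc(size_t size, int32_t tag) {
    DemoRcHeader* h = malloc(sizeof(DemoRcHeader) + size);
    if (h == NULL) abort();
    h->refcount = 1;
    h->tag = tag;
    demo_allocs++;
    return h + 1;
}

/* Add one reference. */
static void demo_incref(void* ptr) {
    if (ptr != NULL) demo_header(ptr)->refcount++;
}

/* Drop one reference; free the object when the count reaches zero. */
static void demo_decref(void* ptr) {
    if (ptr == NULL) return;
    DemoRcHeader* h = demo_header(ptr);
    if (--h->refcount == 0) {
        demo_frees++;
        free(h);
    }
}
```

Starting from rc=1, one `demo_incref` followed by two `demo_decref` calls frees the object exactly once. A real runtime would also need a per-type drop step to release child references, which this sketch leaves out.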

Verified Working

[RC] No leaks: 28 allocs, 28 frees

What's NOT Yet Implemented

Early return handling (decref before return in nested scopes)
Conditional branch handling (complex if/else patterns)
Closure RC (environments still leak)
ADT RC

The Problem

Before scope-based RC was added, generated code looked like this:

void example(LuxEvidence* ev) {
    LuxList* nums = lux_list_new(5);  // rc=1, allocated
    // ... use nums ...
    // MISSING: lux_decref(nums);  <- MEMORY LEAK!
}

It should look like this:

void example(LuxEvidence* ev) {
    LuxList* nums = lux_list_new(5);  // rc=1
    // ... use nums ...
    lux_decref(nums);  // rc=0, freed
}

Implementation Plan

Phase 1: Scope Tracking

Goal: Track which RC-managed variables are live at each point.

Data structures needed in CBackend:

struct CBackend {
    // ... existing fields ...

    /// Stack of scopes, each containing RC-managed variables
    /// Each scope is a Vec of RcVariable (name, c_type, is_rc)
    rc_scopes: Vec<Vec<RcVariable>>,
}

struct RcVariable {
    name: String,      // Variable name
    c_type: String,    // C type (for casting in decref)
    is_rc: bool,       // Whether this needs RC management
}

Operations:

push_scope() - Enter a new scope (function, block, etc.)
pop_scope() - Exit scope and discard its variable list (emit_scope_cleanup emits the decrefs just before)
register_rc_var(name, type) - Register a variable that needs RC management

Phase 2: Identify RC-Managed Types

Goal: Determine which types need RC management.

RC-managed types:

LuxList* - Lists
LuxString (when dynamically allocated) - Strings from concat/conversion
LuxClosure* - Closures
Boxed values (void* from lux_box_*)
ADT variants with pointer fields

NOT RC-managed:

LuxInt, LuxFloat, LuxBool - Stack-allocated primitives
String literals ("hello") - Static, not heap-allocated
LuxUnit - No data

Implementation:

fn is_rc_managed_type(&self, c_type: &str) -> bool {
    matches!(c_type,
        "LuxList*" | "LuxClosure*" | "LuxString" | "void*"
    ) || c_type.ends_with("*")  // Most pointer types are RC
}

fn needs_rc_for_expr(&self, expr: &Expr) -> bool {
    match expr {
        Expr::List { .. } => true,
        Expr::Lambda { .. } => true,
        Expr::StringConcat { .. } => true,
        Expr::Call { func, .. } => {
            // Check if function returns RC type
            self.returns_rc_type(func)
        }
        Expr::Literal(Literal::String(_)) => false,  // Static string
        Expr::Literal(_) => false,  // Primitives
        Expr::Var(_) => false,  // Using existing var, don't double-free
        _ => false,
    }
}

Phase 3: Emit Decrefs at Scope Exit

Goal: Insert lux_decref() calls when variables go out of scope.

For function bodies:

fn emit_function(&mut self, func: &Function) -> Result<(), CGenError> {
    self.push_scope();

    // ... emit function body ...

    // Before the closing brace, emit decrefs
    self.emit_scope_cleanup();
    self.pop_scope();

    Ok(())
}

The cleanup function:

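fn emit_scope_cleanup(&mut self) {
    // Collect the names first so the borrow of rc_scopes ends before
    // writeln (assumed to take &mut self) is called
    let names: Vec<String> = self.rc_scopes.last()
        .map(|scope| {
            // Decref in reverse order (LIFO)
            scope.iter().rev()
                .filter(|var| var.is_rc)
                .map(|var| var.name.clone())
                .collect()
        })
        .unwrap_or_default();

    for name in names {
        self.writeln(&format!("lux_decref({});", name));
    }
}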

Phase 4: Handle Let Bindings

Goal: Register variables when they're bound.

fn emit_let(&mut self, name: &str, value: &Expr) -> Result<String, CGenError> {
    let c_type = self.infer_c_type(value)?;
    let value_code = self.emit_expr(value)?;

    self.writeln(&format!("{} {} = {};", c_type, name, value_code));

    // Register for cleanup if RC-managed
    if self.is_rc_managed_type(&c_type) && self.needs_rc_for_expr(value) {
        self.register_rc_var(name, &c_type);
    }

    Ok(name.to_string())
}

Phase 5: Handle Early Returns

Goal: Decref all live variables before returning.

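fn emit_return(&mut self, value: &Expr) -> Result<String, CGenError> {
    let return_val = self.emit_expr(value)?;

    // Store return value in temp if it's an RC value that must outlive the decrefs
    let temp_needed = self.is_rc_managed_type(&self.infer_c_type(value)?);

    if temp_needed {
        self.writeln(&format!("void* _ret_tmp = {};", return_val));
        // Incref only when returning an existing variable that is decref'd
        // below; a fresh allocation already carries the caller's reference,
        // so an extra incref would leak it
        if !self.needs_rc_for_expr(value) {
            self.writeln("lux_incref(_ret_tmp);");  // Keep it alive
        }
    }

    // Decref all scopes from innermost to outermost
    // (names are collected first so rc_scopes is not borrowed during writeln)
    let names: Vec<String> = self.rc_scopes.iter().rev()
        .flat_map(|scope| scope.iter().rev())
        .filter(|var| var.is_rc)
        .map(|var| var.name.clone())
        .collect();
    for name in names {
        self.writeln(&format!("lux_decref({});", name));
    }

    if temp_needed {
        self.writeln("return _ret_tmp;");
    } else {
        self.writeln(&format!("return {};", return_val));
    }

    Ok(String::new())
}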
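
To make the refcount arithmetic concrete, the following C sketch traces the sequence such a routine would emit when the returned value is an existing local. It is a hand-written approximation, not actual compiler output; the `demo_*` helpers and the header layout are assumptions introduced for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical RC header and helpers, used only for this illustration. */
typedef struct { int32_t refcount; int32_t tag; } DemoRcHeader;
static int64_t demo_allocs = 0, demo_frees = 0;

static DemoRcHeader* demo_header(void* p) { return (DemoRcHeader*)p - 1; }

static void* demo_rc_alloc(size_t size, int32_t tag) {
    DemoRcHeader* h = malloc(sizeof(DemoRcHeader) + size);
    if (h == NULL) abort();
    h->refcount = 1;
    h->tag = tag;
    demo_allocs++;
    return h + 1;
}

static void demo_incref(void* p) { demo_header(p)->refcount++; }

static void demo_decref(void* p) {
    DemoRcHeader* h = demo_header(p);
    if (--h->refcount == 0) { demo_frees++; free(h); }
}

/* Shape of the emitted code for: let x = ...; let y = ...; return x */
static void* demo_return_existing(void) {
    void* x = demo_rc_alloc(8, 0);   /* x: rc=1 */
    void* y = demo_rc_alloc(8, 0);   /* y: rc=1 */
    void* _ret_tmp = x;
    demo_incref(_ret_tmp);           /* x: rc=2, survives the cleanup below */
    demo_decref(y);                  /* y: rc=0, freed (LIFO order) */
    demo_decref(x);                  /* x: rc=1, now owned by the caller */
    return _ret_tmp;
}
```

The caller receives the object with rc=1 and is responsible for the final decref. The incref/decref pair on `x` cancels out, which is the redundancy that last-use analysis would remove.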

Phase 6: Handle Conditionals

Goal: Properly handle if/else where both branches may define variables.

For if/else expressions that create RC values:

// Before (leaks):
LuxList* result = (condition ? create_list_a() : create_list_b());

// After (no leak):
LuxList* result;
if (condition) {
    result = create_list_a();
} else {
    result = create_list_b();
}
// Only one path executed, only one allocation

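This requires changing if/else from ternary expressions to proper if statements. A ternary can only hold expressions, so a branch that needs its own declarations and scope cleanup has nowhere to put them; an if statement gives each branch its own block.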

Phase 7: Handle Blocks

Goal: Each block { ... } creates a new scope.

fn emit_block(&mut self, statements: &[Statement]) -> Result<String, CGenError> {
    self.push_scope();
    self.writeln("{");
    self.indent += 1;

    let mut last_value = String::from("NULL");
    for stmt in statements {
        last_value = self.emit_statement(stmt)?;
    }

    // Cleanup before leaving block
    self.emit_scope_cleanup();

    self.indent -= 1;
    self.writeln("}");
    self.pop_scope();

    Ok(last_value)
}

Testing Strategy

Unit Tests

Simple allocation and free:

fn test(): Unit = {
    let x = [1, 2, 3]  // Should be freed at end
}

Nested scopes:

fn test(): Unit = {
    let outer = [1]
    {
        let inner = [2]  // Freed here
    }
    // outer still live
}  // outer freed here

Early return:

fn test(b: Bool): List<Int> = {
    let x = [1, 2, 3]
    if b then return []  // x must be freed before return
    x
}

Conditionals:

fn test(b: Bool): List<Int> = {
    let x = if b then [1] else [2]  // Only one allocated
    x
}
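
For reference, here is a hand-written C approximation of what correct generated code for the nested-scope, early-return, and conditional cases could look like, with counters that a test can check. The `demo_*` names, the tags, and the `demo_free_order` log are assumptions made for this sketch; real compiler output will differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical RC header and helpers, used only for this illustration. */
typedef struct { int32_t refcount; int32_t tag; } DemoRcHeader;
static int64_t demo_allocs = 0, demo_frees = 0;
static int32_t demo_free_order[16];   /* tags in the order objects were freed */

static void* demo_rc_alloc(size_t size, int32_t tag) {
    DemoRcHeader* h = malloc(sizeof(DemoRcHeader) + size);
    if (h == NULL) abort();
    h->refcount = 1;
    h->tag = tag;
    demo_allocs++;
    return h + 1;
}

static void demo_decref(void* p) {
    DemoRcHeader* h = (DemoRcHeader*)p - 1;
    if (--h->refcount == 0) {
        if (demo_frees < 16) demo_free_order[demo_frees] = h->tag;
        demo_frees++;
        free(h);
    }
}

/* Nested scopes: inner (tag 2) is released before outer (tag 1). */
static void demo_nested(void) {
    void* outer = demo_rc_alloc(8, 1);
    {
        void* inner = demo_rc_alloc(8, 2);
        demo_decref(inner);           /* inner scope exit */
    }
    demo_decref(outer);               /* function scope exit */
}

/* Early return: x (tag 3) is released before the early return. */
static void* demo_early_return(bool b) {
    void* x = demo_rc_alloc(8, 3);
    if (b) {
        void* ret = demo_rc_alloc(8, 4);  /* stands in for the fresh [] */
        demo_decref(x);
        return ret;
    }
    return x;                         /* ownership of x moves to the caller */
}

/* Conditional: exactly one branch allocates. */
static void* demo_conditional(bool b) {
    void* x;
    if (b) {
        x = demo_rc_alloc(8, 5);
    } else {
        x = demo_rc_alloc(8, 6);
    }
    return x;
}
```

After calling each function and releasing the returned values, the allocation and free counters should match, and the free log shows `inner` being released before `outer`.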

Memory Leak Detection

Use valgrind (if available) or add debug tracking:

#include <stdint.h>
#include <stdio.h>

static int64_t lux_alloc_count = 0;
static int64_t lux_free_count = 0;

static void* lux_rc_alloc(size_t size, int32_t tag) {
    lux_alloc_count++;
    // ... existing code ...
}

static void lux_drop(void* ptr, int32_t tag) {
    lux_free_count++;
    // ... existing code ...
}

// At program exit:
void lux_check_leaks(void) {
    if (lux_alloc_count != lux_free_count) {
        // Cast so the arguments match %lld on every platform
        fprintf(stderr, "LEAK: %lld allocations, %lld frees\n",
                (long long)lux_alloc_count, (long long)lux_free_count);
    }
}

Comparison with Perceus

Feature	Perceus (Koka)	Lux RC (Current)
RC header	Yes	Yes ✅
Scope tracking	Yes	Yes ✅
Auto decref	Yes	Yes ✅
Memory tracking	No	Yes ✅ (debug)
Early return	Yes	Partial
Last-use opt	Yes	No
Reuse (FBIP)	Yes	No
Drop fusion	Yes	No

Files to Modify

File	Changes
`src/codegen/c_backend.rs`	Add scope tracking, emit decrefs

Estimated Complexity

Scope tracking data structures: ~30 lines
Type classification: ~40 lines
Scope cleanup emission: ~30 lines
Let binding registration: ~20 lines
Early return handling: ~40 lines
Block scope handling: ~30 lines
Testing: ~100 lines

Total: ~300 lines of careful implementation

Path to Koka/Rust Parity

What We Have Now (Basic RC)

Our current implementation provides:

Deterministic cleanup - Memory freed at predictable points (scope exit)
No GC pauses - Unlike Go/Java, latency is predictable
Leak detection - Debug mode catches memory leaks during development
No manual management - Unlike C/Zig, programmer doesn't call free()

What Koka Has (Perceus RC)

Koka's Perceus system adds several optimizations we don't have:

Feature	Description	Benefit	Complexity
Last-use analysis	Detect when a variable's final use allows ownership transfer	Avoid unnecessary copies	Medium
Reuse (FBIP)	When rc=1, mutate in-place instead of copy	Major performance boost	High
Drop specialization	Generate type-specific drop instead of polymorphic	Fewer branches, faster	Low
Drop fusion	Combine multiple consecutive drops	Fewer function calls	Medium
Borrow inference	Avoid incref when borrowing temporaries	Reduce RC overhead	High

What Rust Has (Ownership)

Rust's ownership system is fundamentally different:

Aspect	Rust	Lux RC	Tradeoff
When checked	Compile-time	Runtime	Rust catches bugs earlier
Runtime cost	Zero	RC operations	Rust is faster
Learning curve	Steep (borrow checker)	Gentle	Lux is easier to learn
Expressiveness	Limited by lifetimes	Unrestricted	Lux is more flexible
Cycles	Rare with plain ownership; Rc cycles leak unless Weak breaks them	Would leak	Both need care with cyclic data

Key insight: We can never match Rust's zero-overhead guarantees, because Rust checks ownership at compile time while RC always pays a runtime cost for its counter updates. But we can be as good as Koka.

Remaining Work for Full Memory Safety

Phase A: Complete Coverage (Prevent All Leaks)

Closure RC - Environments should be RC-managed
- Allocate env with lux_rc_alloc
- Drop env when closure is dropped
- ~50 lines in emit_lambda
ADT RC - Algebraic data types with heap fields
- Track which variants contain RC fields
- Generate drop functions for each ADT
- ~100 lines
Early return handling - Cleanup all scopes on return
- Current impl handles simple cases
- Need nested scope cleanup
- ~30 lines
Complex conditionals - If/else creating RC values
- Switch from ternary to if-statements
- Track RC creation in branches
- ~50 lines

Phase B: Performance Optimizations (Match Koka)

Last-use optimization
- Track variable liveness
- Skip incref on last use (transfer ownership)
- Requires dataflow analysis
- ~200 lines
Reuse analysis (FBIP)
- Detect rc=1 at update sites
- Mutate in-place instead of copy
- Major change to list operations
- ~300 lines
Drop specialization
- Generate per-type drop functions
- Eliminate polymorphic dispatch
- ~100 lines
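
The reuse item above can be sketched in C as follows: a functional update that mutates in place when the list is uniquely referenced (rc=1) and copies otherwise. The `DemoList` type with its fixed four-element payload and the `demo_list_*` functions are hypothetical and exist only for this example; they do not reflect the real LuxList layout.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical list with an inline refcount and a fixed-size payload. */
typedef struct {
    int32_t refcount;
    int32_t len;
    int64_t items[4];
} DemoList;

/* New zero-filled list with rc=1. */
static DemoList* demo_list_new(void) {
    DemoList* list = calloc(1, sizeof *list);
    if (list == NULL) abort();
    list->refcount = 1;
    list->len = 4;
    return list;
}

/* Functional update: returns a list equal to `list` except items[i] = v.
 * Consumes the caller's reference to `list`. */
static DemoList* demo_list_set(DemoList* list, int32_t i, int64_t v) {
    if (list->refcount == 1) {
        list->items[i] = v;      /* unique owner: reuse the memory in place */
        return list;
    }
    DemoList* copy = malloc(sizeof *copy);
    if (copy == NULL) abort();
    *copy = *list;               /* shared: copy, then update the copy */
    copy->refcount = 1;
    copy->items[i] = v;
    list->refcount--;            /* release the consumed reference (rc stays > 0) */
    return copy;
}
```

With a unique reference, the update returns the same pointer and allocates nothing; once a second owner exists, the update copies and leaves the original untouched.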

Estimated Effort

Phase	Description	Lines	Priority
A1	Closure RC	~50	P0 - Closures leak
A2	ADT RC	~100	P1 - ADTs leak
A3	Early returns	~30	P1 - Edge cases
A4	Conditionals	~50	P2 - Uncommon
B1	Last-use opt	~200	P3 - Performance
B2	Reuse (FBIP)	~300	P3 - Performance
B3	Drop special	~100	P3 - Performance

Phase A total: ~230 lines - Gets us to "no leaks"

Phase B total: ~600 lines - Gets us to Koka-level performance

Cycle Detection

Plain RC cannot reclaim cycles (A → B → A): each object in the cycle keeps the other's count above zero. Options:

Ignore - Cycles are rare in functional code (our current approach)
Weak references - Programmer marks back-edges
Cycle collector - Periodic scan for cycles (adds GC-like pauses)

Koka also ignores cycles, relying on functional programming's natural acyclicity.
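
The small C sketch below shows why a strong cycle is never reclaimed and how an uncounted back-edge avoids the problem. The `DemoNode` type and `demo_*` functions are hypothetical. The "weak" edge here is just a plain pointer that is not counted; a real weak-reference design would also need a way to detect that the target has been freed.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical node with an inline refcount and one strong outgoing edge. */
typedef struct DemoNode {
    int32_t refcount;
    struct DemoNode* next;
} DemoNode;

static int demo_live = 0;   /* nodes currently allocated */

static DemoNode* demo_node_new(void) {
    DemoNode* n = calloc(1, sizeof *n);
    if (n == NULL) abort();
    n->refcount = 1;
    demo_live++;
    return n;
}

static void demo_node_decref(DemoNode* n) {
    if (n == NULL) return;
    if (--n->refcount == 0) {
        DemoNode* next = n->next;
        free(n);
        demo_live--;
        demo_node_decref(next);   /* release the strong edge */
    }
}

/* A -> B -> A with strong edges. Returns how many nodes were still
 * allocated after the external references were dropped. */
static int demo_strong_cycle(void) {
    int before = demo_live;
    DemoNode* a = demo_node_new();
    DemoNode* b = demo_node_new();
    a->next = b; b->refcount++;   /* A holds B: B rc=2 */
    b->next = a; a->refcount++;   /* B holds A: A rc=2 */
    demo_node_decref(a);          /* A rc=1 */
    demo_node_decref(b);          /* B rc=1 */
    int leaked = demo_live - before;
    /* Break the cycle by hand so the demo itself does not leak. */
    b->next = NULL;
    demo_node_decref(a);          /* A rc=0: frees A, then B */
    return leaked;
}

/* Same shape, but the back-edge B -> A is not counted. */
static int demo_weak_back_edge(void) {
    int before = demo_live;
    DemoNode* a = demo_node_new();
    DemoNode* b = demo_node_new();
    a->next = b; b->refcount++;   /* strong edge: B rc=2 */
    DemoNode* back = a;           /* uncounted back-edge, no incref */
    (void)back;
    demo_node_decref(b);          /* B rc=1, still held by A */
    demo_node_decref(a);          /* A rc=0: frees A, then B */
    return demo_live - before;
}
```

In the strong-cycle case both nodes remain allocated after every external reference is gone; with the uncounted back-edge, dropping the external references frees both.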
