Brandon Lucas b1cffadc83 feat: implement automatic RC cleanup at scope exit
Add scope tracking for reference-counted variables in the C backend:

- Add RcVariable struct and rc_scopes stack to CBackend
- Track RC variables when assigned in let bindings
- Emit lux_decref() calls when scopes exit (functions, blocks)
- Add memory tracking counters (alloc/free) for leak detection
- Fix List.filter to incref elements before copying (prevents double-free)
- Handle return values by incref/decref to keep them alive through cleanup

The RC system now properly frees memory at scope exit. Verified with
test showing "[RC] No leaks: 28 allocs, 28 frees".

Remaining work: early returns, complex conditionals, closures, ADTs.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-14 12:55:44 -05:00

Reference Counting in Lux C Backend

Overview

This document describes the reference counting (RC) system for automatic memory management in the Lux C backend. The approach is inspired by Perceus (used in Koka) but starts with a simpler implementation.

Current Status: WORKING

The RC system is now functional for lists and boxed values.

What's Implemented

  • RC header structure (LuxRcHeader with refcount + type tag)
  • Allocation function (lux_rc_alloc)
  • Reference operations (lux_incref, lux_decref)
  • Polymorphic drop function (lux_drop)
  • Lists, boxed values, strings use RC allocation
  • List operations incref shared elements
  • Scope tracking - compiler tracks RC variable lifetimes
  • Automatic decref at scope exit - each tracked variable is decref'd when it goes out of scope, and freed once its count reaches zero
  • Memory tracking - debug mode reports allocs/frees at program exit
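
The runtime pieces listed above can be pictured with a minimal sketch. It reuses the names from this document (LuxRcHeader, lux_rc_alloc, lux_incref, lux_decref, lux_drop), but the header layout, the placement of the header directly before the payload, and the counter handling are assumptions made for illustration, not the actual Lux runtime.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed layout: refcount plus type tag, stored immediately before the payload.
 * The 8-byte header keeps the payload 8-byte aligned, which is enough for
 * pointers, int64_t and double, but not for wider-aligned types. */
typedef struct {
    int32_t refcount;
    int32_t tag;
} LuxRcHeader;

static int64_t lux_alloc_count = 0;
static int64_t lux_free_count = 0;

/* Recover the header from a payload pointer. */
static LuxRcHeader* lux_header(void* ptr) {
    return (LuxRcHeader*)ptr - 1;
}

/* Allocate a payload of `size` bytes with an initial refcount of 1. */
static void* lux_rc_alloc(size_t size, int32_t tag) {
    LuxRcHeader* hdr = malloc(sizeof(LuxRcHeader) + size);
    if (hdr == NULL) abort();
    hdr->refcount = 1;
    hdr->tag = tag;
    lux_alloc_count++;
    return hdr + 1;
}

/* Free the object. A full drop would first decref children based on `tag`. */
static void lux_drop(void* ptr, int32_t tag) {
    (void)tag;
    lux_free_count++;
    free(lux_header(ptr));
}

static void lux_incref(void* ptr) {
    if (ptr != NULL) lux_header(ptr)->refcount++;
}

static void lux_decref(void* ptr) {
    if (ptr == NULL) return;
    LuxRcHeader* hdr = lux_header(ptr);
    assert(hdr->refcount > 0);  /* catches double-decref in debug builds */
    if (--hdr->refcount == 0) lux_drop(ptr, hdr->tag);
}
```

Under these assumptions, a value that is allocated, shared once with lux_incref, and then released twice with lux_decref is freed on the second release only, and the two counters end up equal.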

Verified Working

[RC] No leaks: 28 allocs, 28 frees

What's NOT Yet Implemented

  • Early return handling (decref before return in nested scopes)
  • Conditional branch handling (complex if/else patterns)
  • Closure RC (environments still leak)
  • ADT RC

The Problem

Without scope tracking, generated code looks like this:

void example(LuxEvidence* ev) {
    LuxList* nums = lux_list_new(5);  // rc=1, allocated
    // ... use nums ...
    // MISSING: lux_decref(nums);  <- MEMORY LEAK!
}

With decrefs emitted at scope exit, it looks like this:

void example(LuxEvidence* ev) {
    LuxList* nums = lux_list_new(5);  // rc=1
    // ... use nums ...
    lux_decref(nums);  // rc=0, freed
}

Implementation Plan

Phase 1: Scope Tracking

Goal: Track which RC-managed variables are live at each point.

Data structures needed in CBackend:

struct CBackend {
    // ... existing fields ...

    /// Stack of scopes, each containing RC-managed variables
    /// Each scope is a Vec of RcVariable (name, c_type, is_rc)
    rc_scopes: Vec<Vec<RcVariable>>,
}

struct RcVariable {
    name: String,      // Variable name
    c_type: String,    // C type (for casting in decref)
    is_rc: bool,       // Whether this needs RC management
}

Operations:

  • push_scope() - Enter a new scope (function, block, etc.)
  • pop_scope() - Exit the current scope; decrefs for its live variables are emitted just before, by emit_scope_cleanup()
  • register_rc_var(name, c_type) - Register a variable that needs RC management

Phase 2: Identify RC-Managed Types

Goal: Determine which types need RC management.

RC-managed types:

  • LuxList* - Lists
  • LuxString (when dynamically allocated) - Strings from concat/conversion
  • LuxClosure* - Closures
  • Boxed values (void* from lux_box_*)
  • ADT variants with pointer fields

NOT RC-managed:

  • LuxInt, LuxFloat, LuxBool - Stack-allocated primitives
  • String literals ("hello") - Static, not heap-allocated
  • LuxUnit - No data

Implementation:

fn is_rc_managed_type(&self, c_type: &str) -> bool {
    matches!(c_type,
        "LuxList*" | "LuxClosure*" | "LuxString" | "void*"
    ) || c_type.ends_with("*")  // Most pointer types are RC
}

fn needs_rc_for_expr(&self, expr: &Expr) -> bool {
    match expr {
        Expr::List { .. } => true,
        Expr::Lambda { .. } => true,
        Expr::StringConcat { .. } => true,
        Expr::Call { func, .. } => {
            // Check if function returns RC type
            self.returns_rc_type(func)
        }
        Expr::Literal(Literal::String(_)) => false,  // Static string
        Expr::Literal(_) => false,  // Primitives
        Expr::Var(_) => false,  // Using existing var, don't double-free
        _ => false,
    }
}

Phase 3: Emit Decrefs at Scope Exit

Goal: Insert lux_decref() calls when variables go out of scope.

For function bodies:

fn emit_function(&mut self, func: &Function) -> Result<(), CGenError> {
    self.push_scope();

    // ... emit function body ...

    // Before the closing brace, emit decrefs
    self.emit_scope_cleanup();
    self.pop_scope();

    Ok(())
}

The cleanup function:

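fn emit_scope_cleanup(&mut self) {
    // Collect the names first: writeln needs &mut self, so rc_scopes
    // must not still be borrowed while the decrefs are written.
    let to_decref: Vec<String> = self
        .rc_scopes
        .last()
        .map(|scope| {
            // Decref in reverse order (LIFO)
            scope.iter().rev()
                .filter(|var| var.is_rc)
                .map(|var| var.name.clone())
                .collect()
        })
        .unwrap_or_default();

    for name in to_decref {
        self.writeln(&format!("lux_decref({});", name));
    }
}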

Phase 4: Handle Let Bindings

Goal: Register variables when they're bound.

fn emit_let(&mut self, name: &str, value: &Expr) -> Result<String, CGenError> {
    let c_type = self.infer_c_type(value)?;
    let value_code = self.emit_expr(value)?;

    self.writeln(&format!("{} {} = {};", c_type, name, value_code));

    // Register for cleanup if RC-managed
    if self.is_rc_managed_type(&c_type) && self.needs_rc_for_expr(value) {
        self.register_rc_var(name, &c_type);
    }

    Ok(name.to_string())
}

Phase 5: Handle Early Returns

Goal: Decref all live variables before returning.

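fn emit_return(&mut self, value: &Expr) -> Result<String, CGenError> {
    let return_val = self.emit_expr(value)?;

    // Store return value in temp if it's an RC variable we're about to decref
    let c_type = self.infer_c_type(value)?;
    let temp_needed = self.is_rc_managed_type(&c_type);

    if temp_needed {
        self.writeln(&format!("void* _ret_tmp = {};", return_val));
        // Keep it alive. Note: this incref is only balanced when the returned
        // value is a variable decref'd below; a freshly created value that is
        // not registered in any scope would keep the extra reference.
        self.writeln("lux_incref(_ret_tmp);");
    }

    // Decref all scopes from innermost to outermost.
    // Collect the names first so rc_scopes is not borrowed while writing.
    let to_decref: Vec<String> = self.rc_scopes.iter().rev()
        .flat_map(|scope| scope.iter().rev())
        .filter(|var| var.is_rc)
        .map(|var| var.name.clone())
        .collect();
    for name in to_decref {
        self.writeln(&format!("lux_decref({});", name));
    }

    if temp_needed {
        self.writeln("return _ret_tmp;");
    } else {
        self.writeln(&format!("return {};", return_val));
    }

    Ok(String::new())
}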
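
To make the intended reference counts concrete, here is a hand-written approximation of the C that could be emitted for the early-return test case (`let x = [1, 2, 3]`, then `if b then return []`, else `x`). It is a sketch, not compiler output: LuxList is a simplified stand-in with an inline refcount, test_early_return is a hypothetical name, and the choice to skip the incref for the freshly created `[]` is an assumption about how unregistered values should be treated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified stand-in for the runtime: refcount stored inline. */
typedef struct {
    int32_t refcount;
    int64_t length;
} LuxList;

static int64_t lux_alloc_count = 0;
static int64_t lux_free_count = 0;

static LuxList* lux_list_new(int64_t length) {
    LuxList* list = malloc(sizeof *list);
    if (list == NULL) abort();
    list->refcount = 1;
    list->length = length;
    lux_alloc_count++;
    return list;
}

static void lux_incref(void* ptr) {
    ((LuxList*)ptr)->refcount++;
}

static void lux_decref(void* ptr) {
    LuxList* list = ptr;
    assert(list->refcount > 0);
    if (--list->refcount == 0) {
        lux_free_count++;
        free(list);
    }
}

static LuxList* test_early_return(bool b) {
    LuxList* x = lux_list_new(3);            /* rc(x) = 1, registered in function scope */
    if (b) {
        LuxList* _ret_tmp = lux_list_new(0); /* fresh value, not registered: no incref */
        lux_decref(x);                       /* rc(x) = 0, freed before the early return */
        return _ret_tmp;                     /* caller receives rc = 1 */
    }
    void* _ret_tmp = x;
    lux_incref(_ret_tmp);                    /* rc(x) = 2, survives the cleanup below */
    lux_decref(x);                           /* rc(x) = 1 */
    return _ret_tmp;                         /* caller receives rc = 1 */
}
```

On both paths the caller ends up owning exactly one reference, so a single lux_decref in the caller's scope balances the allocation counters.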

Phase 6: Handle Conditionals

Goal: Properly handle if/else where both branches may define variables.

For if/else expressions that create RC values:

// Before (leaks):
LuxList* result = (condition ? create_list_a() : create_list_b());

// After (no leak):
LuxList* result;
if (condition) {
    result = create_list_a();
} else {
    result = create_list_b();
}
// Only one path executed, only one allocation

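This requires emitting if/else as proper if statements instead of ternary expressions: a C ternary can hold only expressions, so it leaves no place to emit per-branch bindings or cleanup statements.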

Phase 7: Handle Blocks

Goal: Each block { ... } creates a new scope.

fn emit_block(&mut self, statements: &[Statement]) -> Result<String, CGenError> {
    self.push_scope();
    self.writeln("{");
    self.indent += 1;

    let mut last_value = String::from("NULL");
    for stmt in statements {
        last_value = self.emit_statement(stmt)?;
    }

    // Cleanup before leaving block
    self.emit_scope_cleanup();

    self.indent -= 1;
    self.writeln("}");
    self.pop_scope();

    Ok(last_value)
}
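
As an illustration of what block scoping is meant to produce, the sketch below hand-writes C for a function with one outer binding and a nested block with two inner bindings, and records the order in which objects are freed. The `id` field, demo_list_new, free_order, and emit_block_demo are hypothetical helpers that exist only to make the order observable.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct {
    int32_t refcount;
    int id;                 /* hypothetical: used only to record free order */
} LuxList;

static int free_order[8];
static int free_count = 0;

static LuxList* demo_list_new(int id) {
    LuxList* list = malloc(sizeof *list);
    if (list == NULL) abort();
    list->refcount = 1;
    list->id = id;
    return list;
}

static void lux_decref(void* ptr) {
    LuxList* list = ptr;
    assert(list->refcount > 0);
    if (--list->refcount == 0) {
        assert(free_count < 8);
        free_order[free_count++] = list->id;
        free(list);
    }
}

/* Shape of the emitted code: each block decrefs its own variables,
 * in reverse declaration order, right before its closing brace. */
static void emit_block_demo(void) {
    LuxList* outer = demo_list_new(1);
    {
        LuxList* inner_a = demo_list_new(2);
        LuxList* inner_b = demo_list_new(3);
        lux_decref(inner_b);   /* inner scope cleanup, LIFO */
        lux_decref(inner_a);
    }
    /* outer is still live here */
    lux_decref(outer);         /* function scope cleanup */
}
```

With this shape, the inner variables are released before the block closes and the outer variable only at function exit, which matches the nested-scope test case in the testing section.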

Testing Strategy

Unit Tests

  1. Simple allocation and free:
fn test(): Unit = {
    let x = [1, 2, 3]  // Should be freed at end
}
  2. Nested scopes:
fn test(): Unit = {
    let outer = [1]
    {
        let inner = [2]  // Freed here
    }
    // outer still live
}  // outer freed here
  3. Early return:
fn test(b: Bool): List<Int> = {
    let x = [1, 2, 3]
    if b then return []  // x must be freed before return
    x
}
  4. Conditionals:
fn test(b: Bool): List<Int> = {
    let x = if b then [1] else [2]  // Only one allocated
    x
}

Memory Leak Detection

Use valgrind (if available) or add debug tracking:

#include <inttypes.h>  // PRId64
#include <stdint.h>
#include <stdio.h>

static int64_t lux_alloc_count = 0;
static int64_t lux_free_count = 0;

static void* lux_rc_alloc(size_t size, int32_t tag) {
    lux_alloc_count++;
    // ... existing code ...
}

static void lux_drop(void* ptr, int32_t tag) {
    lux_free_count++;
    // ... existing code ...
}

// At program exit:
void lux_check_leaks(void) {
    if (lux_alloc_count != lux_free_count) {
        // int64_t is not necessarily long long, so use PRId64 rather than %lld
        fprintf(stderr, "LEAK: %" PRId64 " allocations, %" PRId64 " frees\n",
                lux_alloc_count, lux_free_count);
    }
}
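
One way to run the check "at program exit" is to register it with the standard atexit function. The sketch below is an assumption about how that wiring could look: lux_format_leak_report, lux_report_at_exit, and lux_install_leak_check are hypothetical names, and the "[RC] LEAK" wording for the unbalanced case is made up to mirror the "[RC] No leaks" line shown earlier.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

static int64_t lux_alloc_count = 0;
static int64_t lux_free_count = 0;

/* Write the report into buf; return 1 if allocs and frees balance, else 0. */
static int lux_format_leak_report(char* buf, size_t len) {
    assert(buf != NULL && len > 0);
    int balanced = (lux_alloc_count == lux_free_count);
    snprintf(buf, len, "[RC] %s: %" PRId64 " allocs, %" PRId64 " frees",
             balanced ? "No leaks" : "LEAK",
             lux_alloc_count, lux_free_count);
    return balanced;
}

/* atexit handlers take no arguments and return nothing. */
static void lux_report_at_exit(void) {
    char buf[96];
    lux_format_leak_report(buf, sizeof buf);
    fprintf(stderr, "%s\n", buf);
}

/* Call once at startup, for example as the first statement of the generated main. */
static void lux_install_leak_check(void) {
    if (atexit(lux_report_at_exit) != 0) {
        fprintf(stderr, "[RC] could not register leak check\n");
    }
}
```

Keeping the formatting separate from the atexit handler makes the report easy to check in a unit test without terminating the process.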

Comparison with Perceus

| Feature | Perceus (Koka) | Lux RC (Current) |
|---|---|---|
| RC header | Yes | Yes |
| Scope tracking | Yes | Yes |
| Auto decref | Yes | Yes |
| Memory tracking | No | Yes (debug) |
| Early return | Yes | Partial |
| Last-use opt | Yes | No |
| Reuse (FBIP) | Yes | No |
| Drop fusion | Yes | No |

Files to Modify

| File | Changes |
|---|---|
| src/codegen/c_backend.rs | Add scope tracking, emit decrefs |

Estimated Complexity

  • Scope tracking data structures: ~30 lines
  • Type classification: ~40 lines
  • Scope cleanup emission: ~30 lines
  • Let binding registration: ~20 lines
  • Early return handling: ~40 lines
  • Block scope handling: ~30 lines
  • Testing: ~100 lines

Total: ~300 lines of careful implementation


References