Reference Counting in Lux C Backend
Overview
This document describes the reference counting (RC) system for automatic memory management in the Lux C backend. The approach is inspired by Perceus (used in Koka) but starts with a simpler implementation.
Current Status: WORKING
The RC system is now functional for lists, boxed values, and strings.
What's Implemented
- RC header structure (LuxRcHeader with refcount + type tag)
- Allocation function (lux_rc_alloc)
- Reference operations (lux_incref, lux_decref)
- Polymorphic drop function (lux_drop)
- Lists, boxed values, strings use RC allocation
- List operations incref shared elements
- Scope tracking - compiler tracks RC variable lifetimes
- Automatic decref at scope exit - variables are freed when out of scope
- Memory tracking - debug mode reports allocs/frees at program exit
Verified Working
[RC] No leaks: 28 allocs, 28 frees
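The pieces listed under "What's Implemented" can be modelled in a few lines. The following is a toy Rust simulation, not the C runtime itself: `Heap`, `Obj`, `Tag` and the `[RC] LEAK` message are hypothetical names made up for illustration, and only the "No leaks" report format is taken from the output above. It sketches how a refcount-plus-tag header, incref/decref, a tag-dispatched drop, and alloc/free counters fit together.

```rust
/// Type tags, standing in for the tag stored in the RC header (hypothetical).
#[derive(Clone, Copy, PartialEq, Debug)]
enum Tag {
    Boxed,
    List,
}

/// One heap object: the two header fields plus the references it owns.
struct Obj {
    refcount: u32,
    tag: Tag,
    children: Vec<usize>, // handles of owned elements (used by lists only)
}

/// A toy heap that counts allocations and frees, like the debug tracker.
#[derive(Default)]
struct Heap {
    slots: Vec<Option<Obj>>,
    allocs: u64,
    frees: u64,
}

impl Heap {
    /// Model of lux_rc_alloc: new objects start with refcount 1.
    fn alloc(&mut self, tag: Tag, children: Vec<usize>) -> usize {
        self.allocs += 1;
        self.slots.push(Some(Obj { refcount: 1, tag, children }));
        self.slots.len() - 1
    }

    /// Model of lux_incref.
    fn incref(&mut self, h: usize) {
        if let Some(obj) = self.slots[h].as_mut() {
            obj.refcount += 1;
        }
    }

    /// Model of lux_decref: drop the object when the count reaches zero.
    fn decref(&mut self, h: usize) {
        let dead = match self.slots[h].as_mut() {
            Some(obj) => {
                obj.refcount -= 1;
                obj.refcount == 0
            }
            None => false,
        };
        if dead {
            self.drop_obj(h);
        }
    }

    /// Model of lux_drop: dispatch on the tag, release children, free.
    fn drop_obj(&mut self, h: usize) {
        let obj = self.slots[h].take().expect("double free");
        self.frees += 1;
        if obj.tag == Tag::List {
            for child in obj.children {
                self.decref(child);
            }
        }
    }

    /// Debug report in the style of the runtime's exit message.
    fn report(&self) -> String {
        if self.allocs == self.frees {
            format!("[RC] No leaks: {} allocs, {} frees", self.allocs, self.frees)
        } else {
            format!("[RC] LEAK: {} allocs, {} frees", self.allocs, self.frees)
        }
    }
}
```

In this model a list takes over the initial reference of each element it is built from, and a second list that shares an element must incref it first. That mirrors the "list operations incref shared elements" rule above.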
What's NOT Yet Implemented
- Early return handling (decref before return in nested scopes)
- Conditional branch handling (complex if/else patterns)
- Closure RC (environments still leak)
- ADT RC
The Problem
Before scope tracking, generated code looked like this:
void example(LuxEvidence* ev) {
    LuxList* nums = lux_list_new(5); // rc=1, allocated
    // ... use nums ...
    // MISSING: lux_decref(nums); <- MEMORY LEAK!
}
It should look like this:
void example(LuxEvidence* ev) {
    LuxList* nums = lux_list_new(5); // rc=1
    // ... use nums ...
    lux_decref(nums); // rc=0, freed
}
Implementation Plan
Phase 1: Scope Tracking
Goal: Track which RC-managed variables are live at each point.
Data structures needed in CBackend:
struct CBackend {
    // ... existing fields ...
    /// Stack of scopes, each containing RC-managed variables
    /// Each scope is a Vec of (var_name, c_type, needs_decref)
    rc_scopes: Vec<Vec<RcVariable>>,
}

struct RcVariable {
    name: String,   // Variable name
    c_type: String, // C type (for casting in decref)
    is_rc: bool,    // Whether this needs RC management
}
Operations:
- push_scope() - Enter a new scope (function, block, etc.)
- pop_scope() - Exit scope, emit decrefs for all live variables
- register_rc_var(name, type) - Register a variable that needs RC management
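A minimal, self-contained sketch of these three operations follows. `ScopeTracker` and its `output` vector are hypothetical stand-ins for the relevant part of CBackend and its writer, and the trailing type comment in the emitted line is only there for illustration. Here pop_scope emits the decrefs itself, as described in the list above; the later phases split that step into a separate cleanup function.

```rust
/// A variable tracked for cleanup (same fields as RcVariable in the plan).
struct RcVariable {
    name: String,
    c_type: String,
    is_rc: bool,
}

/// Hypothetical stand-in for the scope-related part of CBackend.
#[derive(Default)]
struct ScopeTracker {
    rc_scopes: Vec<Vec<RcVariable>>,
    output: Vec<String>, // emitted C lines (assumed sink, not the real writer)
}

impl ScopeTracker {
    /// Enter a new scope (function, block, etc.).
    fn push_scope(&mut self) {
        self.rc_scopes.push(Vec::new());
    }

    /// Register a variable in the innermost scope.
    fn register_rc_var(&mut self, name: &str, c_type: &str) {
        if let Some(scope) = self.rc_scopes.last_mut() {
            scope.push(RcVariable {
                name: name.to_string(),
                c_type: c_type.to_string(),
                is_rc: true,
            });
        }
    }

    /// Exit the innermost scope, emitting decrefs newest-first (LIFO).
    fn pop_scope(&mut self) {
        if let Some(scope) = self.rc_scopes.pop() {
            for var in scope.iter().rev().filter(|v| v.is_rc) {
                self.output
                    .push(format!("lux_decref({}); /* {} */", var.name, var.c_type));
            }
        }
    }
}
```

Popping the scope before iterating means the tracker no longer borrows `rc_scopes` while it writes output, which keeps the borrow checker happy without cloning names.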
Phase 2: Identify RC-Managed Types
Goal: Determine which types need RC management.
RC-managed types:
- LuxList* - Lists
- LuxString (when dynamically allocated) - Strings from concat/conversion
- LuxClosure* - Closures
- Boxed values (void* from lux_box_*)
- ADT variants with pointer fields
NOT RC-managed:
- LuxInt, LuxFloat, LuxBool - Stack-allocated primitives
- String literals ("hello") - Static, not heap-allocated
- LuxUnit - No data
Implementation:
fn is_rc_managed_type(&self, c_type: &str) -> bool {
    matches!(c_type,
        "LuxList*" | "LuxClosure*" | "LuxString" | "void*"
    ) || c_type.ends_with("*") // Most pointer types are RC
}

fn needs_rc_for_expr(&self, expr: &Expr) -> bool {
    match expr {
        Expr::List { .. } => true,
        Expr::Lambda { .. } => true,
        Expr::StringConcat { .. } => true,
        // Bind the callee so it can be inspected (field name `func` assumed)
        Expr::Call { func, .. } => {
            // Check if function returns RC type
            self.returns_rc_type(func)
        }
        Expr::Literal(Literal::String(_)) => false, // Static string
        Expr::Literal(_) => false,                  // Primitives
        Expr::Var(_) => false, // Using existing var, don't double-free
        _ => false,
    }
}
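Because the rule above treats any type ending in `*` as RC-managed, pointers without an RC header would be decref'd too. One way to guard against that is an explicit deny-list. This is a sketch under an assumption: the document does not say whether `LuxEvidence*` carries an RC header, so its entry below is hypothetical, as is the free-function name.

```rust
/// Classify a C type name, excluding pointers known not to be RC-managed.
/// ASSUMPTION: LuxEvidence* has no RC header; adjust the list to the runtime.
fn is_rc_managed_c_type(c_type: &str) -> bool {
    const NON_RC_POINTERS: &[&str] = &["LuxEvidence*"];
    if NON_RC_POINTERS.contains(&c_type) {
        return false;
    }
    // Same rule as the plan: known RC types, then any other pointer type.
    matches!(c_type, "LuxList*" | "LuxClosure*" | "LuxString" | "void*")
        || c_type.ends_with('*')
}
```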
Phase 3: Emit Decrefs at Scope Exit
Goal: Insert lux_decref() calls when variables go out of scope.
For function bodies:
fn emit_function(&mut self, func: &Function) -> Result<(), CGenError> {
    self.push_scope();
    // ... emit function body ...
    // Before the closing brace, emit decrefs
    self.emit_scope_cleanup();
    self.pop_scope();
    Ok(())
}
The cleanup function:
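fn emit_scope_cleanup(&mut self) {
    // Collect names first so the borrow of rc_scopes ends before writeln
    // (assuming writeln takes &mut self). Reverse order gives LIFO decrefs.
    let names: Vec<String> = self.rc_scopes.last()
        .map(|scope| {
            scope.iter().rev()
                .filter(|var| var.is_rc)
                .map(|var| var.name.clone())
                .collect()
        })
        .unwrap_or_default();
    for name in names {
        self.writeln(&format!("lux_decref({});", name));
    }
}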
Phase 4: Handle Let Bindings
Goal: Register variables when they're bound.
fn emit_let(&mut self, name: &str, value: &Expr) -> Result<String, CGenError> {
    let c_type = self.infer_c_type(value)?;
    let value_code = self.emit_expr(value)?;
    self.writeln(&format!("{} {} = {};", c_type, name, value_code));
    // Register for cleanup if RC-managed
    if self.is_rc_managed_type(&c_type) && self.needs_rc_for_expr(value) {
        self.register_rc_var(name, &c_type);
    }
    Ok(name.to_string())
}
Phase 5: Handle Early Returns
Goal: Decref all live variables before returning.
fn emit_return(&mut self, value: &Expr) -> Result<String, CGenError> {
    let return_val = self.emit_expr(value)?;
    // Store return value in temp if it's an RC variable we're about to decref
    let temp_needed = self.is_rc_managed_type(&self.infer_c_type(value)?);
    if temp_needed {
        self.writeln(&format!("void* _ret_tmp = {};", return_val));
        // Keep it alive. NOTE: this balances only when return_val is a tracked
        // variable; a fresh temporary would be left with rc=2.
        self.writeln("lux_incref(_ret_tmp);");
    }
    // Collect names from innermost to outermost scope before writing,
    // so the borrow of rc_scopes ends before writeln is called
    let names: Vec<String> = self.rc_scopes.iter().rev()
        .flat_map(|scope| scope.iter().rev())
        .filter(|var| var.is_rc)
        .map(|var| var.name.clone())
        .collect();
    for name in names {
        self.writeln(&format!("lux_decref({});", name));
    }
    if temp_needed {
        self.writeln("return _ret_tmp;");
    } else {
        self.writeln(&format!("return {};", return_val));
    }
    Ok(String::new())
}
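A possible variation on the early-return logic is to transfer ownership of the returned variable instead of pairing an incref with a decref. The sketch below is a standalone function over plain strings, not the backend's method: the name `emit_return_lines` and the flat `&[Vec<&str>]` scope representation are assumptions, and it only covers returns of RC-typed values (a primitive would need its own C type for the temporary).

```rust
/// Emit the C lines for `return <return_expr>;` with cleanup of all scopes.
/// `scopes` lists tracked RC variable names, outermost scope first.
fn emit_return_lines(scopes: &[Vec<&str>], return_expr: &str) -> Vec<String> {
    let returns_tracked_var = scopes.iter().flatten().any(|v| *v == return_expr);
    let mut out = Vec::new();

    // Anything other than a tracked variable is evaluated into a temporary
    // first, because it may read variables that are about to be decref'd.
    let result = if returns_tracked_var {
        return_expr.to_string()
    } else {
        out.push(format!("void* _ret_tmp = {};", return_expr));
        "_ret_tmp".to_string()
    };

    // Decref innermost scope first, newest variable first. The returned
    // variable is skipped: its reference moves to the caller.
    for scope in scopes.iter().rev() {
        for var in scope.iter().rev() {
            if *var != return_expr {
                out.push(format!("lux_decref({});", var));
            }
        }
    }

    out.push(format!("return {};", result));
    out
}
```

With scopes `[["x"], ["y"]]`, returning `x` emits a decref for `y` only, while returning a fresh `lux_list_new(0)` stores it in `_ret_tmp` and then decrefs both `y` and `x`. In neither case is the returned value left with an extra reference.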
Phase 6: Handle Conditionals
Goal: Properly handle if/else where both branches may define variables.
For if/else expressions that create RC values:
// Before (leaks):
LuxList* result = (condition ? create_list_a() : create_list_b());

// After (no leak):
LuxList* result;
if (condition) {
    result = create_list_a();
} else {
    result = create_list_b();
}
// Only one path executed, only one allocation
This requires changing if/else from ternary expressions to proper if statements.
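On the compiler side, that lowering could look roughly like the helper below. It is a sketch that works on already-emitted C fragments; the function name and its string-based signature are hypothetical, and the real backend would presumably emit each branch through its own expression emitter.

```rust
/// Lower `let name = if cond then a else b` to a C declaration plus an
/// if statement, so the result variable can be registered once for cleanup.
fn lower_if_expr(
    c_type: &str,
    name: &str,
    cond: &str,
    then_expr: &str,
    else_expr: &str,
) -> Vec<String> {
    vec![
        format!("{} {};", c_type, name),
        format!("if ({}) {{", cond),
        format!("    {} = {};", name, then_expr),
        "} else {".to_string(),
        format!("    {} = {};", name, else_expr),
        "}".to_string(),
    ]
}
```

After emitting these lines, the caller would register `name` as a single RC variable, since exactly one branch assigns it.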
Phase 7: Handle Blocks
Goal: Each block { ... } creates a new scope.
fn emit_block(&mut self, statements: &[Statement]) -> Result<String, CGenError> {
    self.push_scope();
    self.writeln("{");
    self.indent += 1;
    let mut last_value = String::from("NULL");
    for stmt in statements {
        last_value = self.emit_statement(stmt)?;
    }
    // Cleanup before leaving block
    self.emit_scope_cleanup();
    self.indent -= 1;
    self.writeln("}");
    self.pop_scope();
    Ok(last_value)
}
Testing Strategy
Unit Tests
- Simple allocation and free:
fn test(): Unit = {
    let x = [1, 2, 3] // Should be freed at end
}
- Nested scopes:
fn test(): Unit = {
    let outer = [1]
    {
        let inner = [2] // Freed here
    }
    // outer still live
} // outer freed here
- Early return:
fn test(b: Bool): List<Int> = {
    let x = [1, 2, 3]
    if b then return [] // x must be freed before return
    x
}
- Conditionals:
fn test(b: Bool): List<Int> = {
    let x = if b then [1] else [2] // Only one allocated
    x
}
Memory Leak Detection
Use valgrind (if available) or add debug tracking:
#include <stdint.h>
#include <stdio.h>

static int64_t lux_alloc_count = 0;
static int64_t lux_free_count = 0;

static void* lux_rc_alloc(size_t size, int32_t tag) {
    lux_alloc_count++;
    // ... existing code ...
}

static void lux_drop(void* ptr, int32_t tag) {
    lux_free_count++;
    // ... existing code ...
}

// At program exit:
void lux_check_leaks(void) {
    if (lux_alloc_count != lux_free_count) {
        // Cast so the arguments match %lld on every platform
        fprintf(stderr, "LEAK: %lld allocations, %lld frees\n",
                (long long)lux_alloc_count, (long long)lux_free_count);
    }
}
Comparison with Perceus
| Feature | Perceus (Koka) | Lux RC (Current) |
|---|---|---|
| RC header | Yes | Yes ✅ |
| Scope tracking | Yes | Yes ✅ |
| Auto decref | Yes | Yes ✅ |
| Memory tracking | No | Yes ✅ (debug) |
| Early return | Yes | Partial |
| Last-use opt | Yes | No |
| Reuse (FBIP) | Yes | No |
| Drop fusion | Yes | No |
Files to Modify
| File | Changes |
|---|---|
| src/codegen/c_backend.rs | Add scope tracking, emit decrefs |
Estimated Complexity
- Scope tracking data structures: ~30 lines
- Type classification: ~40 lines
- Scope cleanup emission: ~30 lines
- Let binding registration: ~20 lines
- Early return handling: ~40 lines
- Block scope handling: ~30 lines
- Testing: ~100 lines
Total: ~300 lines of careful implementation
Path to Koka/Rust Parity
What We Have Now (Basic RC)
Our current implementation provides:
- Deterministic cleanup - Memory freed at predictable points (scope exit)
- No GC pauses - Unlike Go/Java, latency is predictable
- Leak detection - Debug mode catches memory leaks during development
- No manual management - Unlike C/Zig, programmer doesn't call free()
What Koka Has (Perceus RC)
Koka's Perceus system adds several optimizations we don't have:
| Feature | Description | Benefit | Complexity |
|---|---|---|---|
| Last-use analysis | Detect when a variable's final use allows ownership transfer | Avoid unnecessary copies | Medium |
| Reuse (FBIP) | When rc=1, mutate in-place instead of copy | Major performance boost | High |
| Drop specialization | Generate type-specific drop instead of polymorphic | Fewer branches, faster | Low |
| Drop fusion | Combine multiple consecutive drops | Fewer function calls | Medium |
| Borrow inference | Avoid incref when borrowing temporaries | Reduce RC overhead | High |
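The reuse row is the easiest to illustrate in isolation. The sketch below uses Rust's standard `Rc` as an analogy, not the Lux runtime or Koka's implementation: `Rc::make_mut` mutates in place when the strong count is 1 and copies first otherwise, which is the runtime check that reuse analysis relies on. The function name `push_reuse` is made up for this example.

```rust
use std::rc::Rc;

/// Append `x` to a reference-counted list, reusing the existing buffer
/// when this is the only reference and copying when it is shared.
fn push_reuse(mut list: Rc<Vec<i64>>, x: i64) -> Rc<Vec<i64>> {
    // make_mut returns a mutable reference to the inner Vec. If other
    // Rc pointers exist, it clones the Vec into a new allocation first.
    Rc::make_mut(&mut list).push(x);
    list
}
```

A caller holding the only reference gets the same allocation back with the element appended. A caller that kept an alias gets a fresh list, and the alias still sees the old contents.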
What Rust Has (Ownership)
Rust's ownership system is fundamentally different:
| Aspect | Rust | Lux RC | Tradeoff |
|---|---|---|---|
| When checked | Compile-time | Runtime | Rust catches bugs earlier |
| Runtime cost | Zero | RC operations | Rust is faster |
| Learning curve | Steep (borrow checker) | Gentle | Lux is easier to learn |
| Expressiveness | Limited by lifetimes | Unrestricted | Lux is more flexible |
| Cycles | Owning cycles need Rc, which leaks them unless Weak breaks the cycle | Would leak | Both need weak references for cyclic data |
Key insight: We can never match Rust's zero-overhead guarantees, because Rust checks ownership at compile time while RC always pays a runtime cost. But we can be as good as Koka.
Remaining Work for Full Memory Safety
Phase A: Complete Coverage (Prevent All Leaks)
- Closure RC - Environments should be RC-managed
  - Allocate env with lux_rc_alloc
  - Drop env when closure is dropped
  - ~50 lines in emit_lambda
- ADT RC - Algebraic data types with heap fields
  - Track which variants contain RC fields
  - Generate drop functions for each ADT
  - ~100 lines
- Early return handling - Cleanup all scopes on return
  - Current impl handles simple cases
  - Need nested scope cleanup
  - ~30 lines
- Complex conditionals - If/else creating RC values
  - Switch from ternary to if-statements
  - Track RC creation in branches
  - ~50 lines
Phase B: Performance Optimizations (Match Koka)
- Last-use optimization
  - Track variable liveness
  - Skip incref on last use (transfer ownership)
  - Requires dataflow analysis
  - ~200 lines
- Reuse analysis (FBIP)
  - Detect rc=1 at update sites
  - Mutate in-place instead of copy
  - Major change to list operations
  - ~300 lines
- Drop specialization
  - Generate per-type drop functions
  - Eliminate polymorphic dispatch
  - ~100 lines
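For the last-use optimization in the list above, the core decision can be sketched for straight-line code: a use needs an incref only if the same variable is used again later. The function below is a simplified illustration with a hypothetical name. It assumes each use passes the variable to a consumer that takes ownership, and it ignores branches and loops, which is where the dataflow analysis mentioned above would come in.

```rust
use std::collections::HashMap;

/// For a straight-line sequence of variable uses, pair each use with a flag
/// that is true when an incref is needed before it, and false when this is
/// the variable's last use and ownership can be transferred instead.
fn plan_increfs(uses: &[&str]) -> Vec<(String, bool)> {
    // Record the index of the final use of every variable.
    let mut last_use: HashMap<&str, usize> = HashMap::new();
    for (i, var) in uses.iter().enumerate() {
        last_use.insert(*var, i);
    }
    // A use needs an incref exactly when it is not the final one.
    uses.iter()
        .enumerate()
        .map(|(i, var)| ((*var).to_string(), last_use[var] != i))
        .collect()
}
```

For the use sequence `xs, ys, xs`, only the first use of `xs` gets an incref. The use of `ys` and the second use of `xs` are last uses and hand their reference over directly.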
Estimated Effort
| Phase | Description | Lines | Priority |
|---|---|---|---|
| A1 | Closure RC | ~50 | P0 - Closures leak |
| A2 | ADT RC | ~100 | P1 - ADTs leak |
| A3 | Early returns | ~30 | P1 - Edge cases |
| A4 | Conditionals | ~50 | P2 - Uncommon |
| B1 | Last-use opt | ~200 | P3 - Performance |
| B2 | Reuse (FBIP) | ~300 | P3 - Performance |
| B3 | Drop special | ~100 | P3 - Performance |
Phase A total: ~230 lines - Gets us to "no leaks"
Phase B total: ~600 lines - Gets us to Koka-level performance
Cycle Detection
RC cannot handle cycles (A → B → A). Options:
- Ignore - Cycles are rare in functional code (our current approach)
- Weak references - Programmer marks back-edges
- Cycle collector - Periodic scan for cycles (adds GC-like pauses)
Koka also ignores cycles, relying on functional programming's natural acyclicity.
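To make the weak-reference option concrete, here is how a marked back-edge behaves with Rust's standard `Rc` and `Weak`. This is an analogy only: Lux has no weak references today, and the `Node` type and helper functions are invented for the example. The parent owns the child through a strong reference, while the child points back through a weak one, so the A → B → A shape does not keep either node alive on its own.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

struct Node {
    name: &'static str,
    parent: RefCell<Weak<Node>>,      // back-edge: does not keep the parent alive
    children: RefCell<Vec<Rc<Node>>>, // forward edges: owning references
}

/// Build parent "A" with one child "B" that points back to it weakly.
fn make_tree() -> (Rc<Node>, Rc<Node>) {
    let parent = Rc::new(Node {
        name: "A",
        parent: RefCell::new(Weak::new()),
        children: RefCell::new(Vec::new()),
    });
    let child = Rc::new(Node {
        name: "B",
        parent: RefCell::new(Rc::downgrade(&parent)),
        children: RefCell::new(Vec::new()),
    });
    parent.children.borrow_mut().push(Rc::clone(&child));
    (parent, child)
}

/// Follow the back-edge; returns None once the parent has been freed.
fn parent_name(node: &Node) -> Option<&'static str> {
    node.parent.borrow().upgrade().map(|p| p.name)
}
```

Dropping the last strong reference to the parent frees it even though the child still points at it. The child's back-edge then upgrades to `None` instead of dangling.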