ADT values with pointer fields (like recursive Tree types) now properly manage memory:

- Assign unique type tags (starting at 100) to each ADT type
- Track which ADTs have pointer fields that need cleanup
- Generate lux_drop_adt() function with per-ADT drop logic
- Allocate ADT pointer fields with lux_rc_alloc instead of malloc
- Track ADT variables with pointer fields in scope
- Emit field cleanup code at scope exit (switch on tag, decref fields)

Test results:

- ADT test: [RC] No leaks: 6 allocs, 6 frees
- List test: [RC] No leaks: 31 allocs, 31 frees
- Closure test: [RC] No leaks: 8 allocs, 8 frees
- All 263 tests pass

Remaining: early returns, complex conditionals.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

# Reference Counting in Lux C Backend

## Overview

This document describes the reference counting (RC) system for automatic memory management in the Lux C backend. The approach is inspired by Perceus (used in Koka) but starts with a simpler implementation.

## Current Status: WORKING

The RC system is now functional for lists, boxed values, strings, closures, and ADTs with pointer fields.

### What's Implemented

- RC header structure (`LuxRcHeader` with refcount + type tag)
- Allocation function (`lux_rc_alloc`)
- Reference operations (`lux_incref`, `lux_decref`)
- Polymorphic drop function (`lux_drop`)
- Lists, boxed values, strings use RC allocation
- List operations incref shared elements
- **Closures and environments** - RC-managed with automatic cleanup
- **Inline lambda cleanup** - temporary closures freed after use
- **ADT pointer fields** - RC-allocated and cleaned up at scope exit
- **Scope tracking** - compiler tracks RC variable lifetimes
- **Automatic decref at scope exit** - variables are freed when out of scope
- **Memory tracking** - debug mode reports allocs/frees at program exit

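To make the pieces listed above concrete, here is a minimal sketch of a header-prefixed RC allocation. Only the names `LuxRcHeader`, `lux_rc_alloc`, `lux_incref`, `lux_decref`, and `lux_drop` come from this document. The header layout (stored directly before the payload), the field widths, the counters, and the `lux_header` helper are assumptions for illustration, not the backend's actual runtime.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed layout: the header sits immediately before the payload. */
typedef struct {
    int32_t refcount;
    int32_t tag;
} LuxRcHeader;

static int64_t lux_alloc_count = 0;  /* debug counters */
static int64_t lux_free_count = 0;

/* Hypothetical helper: step back from the payload to its header. */
static LuxRcHeader* lux_header(void* ptr) {
    return (LuxRcHeader*)ptr - 1;
}

static void* lux_rc_alloc(size_t size, int32_t tag) {
    LuxRcHeader* h = malloc(sizeof(LuxRcHeader) + size);
    if (h == NULL) abort();
    h->refcount = 1;  /* the creator owns the first reference */
    h->tag = tag;
    lux_alloc_count++;
    return h + 1;     /* callers only ever see the payload */
}

static void lux_incref(void* ptr) {
    if (ptr != NULL) lux_header(ptr)->refcount++;
}

static void lux_drop(void* ptr, int32_t tag) {
    (void)tag;  /* a full drop would switch on tag to release child pointers */
    lux_free_count++;
    free(lux_header(ptr));
}

static void lux_decref(void* ptr) {
    if (ptr == NULL) return;
    LuxRcHeader* h = lux_header(ptr);
    if (--h->refcount == 0) lux_drop(ptr, h->tag);
}
```

In this sketch a value starts at rc=1, every additional owner calls `lux_incref`, and the memory is released when the last `lux_decref` brings the count to zero.
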
### Verified Working

```
[RC] No leaks: 28 allocs, 28 frees
```

### What's NOT Yet Implemented

- Early return handling (decref before return in nested scopes)
- Conditional branch handling (complex if/else patterns)

## The Problem

Without scope-exit cleanup, generated code looks like this:

```c
void example(LuxEvidence* ev) {
    LuxList* nums = lux_list_new(5);  // rc=1, allocated
    // ... use nums ...
    // MISSING: lux_decref(nums);  <- MEMORY LEAK!
}
```

It should look like this:

```c
void example(LuxEvidence* ev) {
    LuxList* nums = lux_list_new(5);  // rc=1
    // ... use nums ...
    lux_decref(nums);  // rc=0, freed
}
```

---

## Implementation Plan

### Phase 1: Scope Tracking

**Goal:** Track which RC-managed variables are live at each point.

**Data structures needed in CBackend:**

```rust
struct CBackend {
    // ... existing fields ...

    /// Stack of scopes, each containing RC-managed variables.
    /// Each scope is a Vec of `RcVariable` (name, c_type, is_rc).
    rc_scopes: Vec<Vec<RcVariable>>,
}

struct RcVariable {
    name: String,    // Variable name
    c_type: String,  // C type (for casting in decref)
    is_rc: bool,     // Whether this needs RC management
}
```

**Operations:**
- `push_scope()` - Enter a new scope (function, block, etc.)
- `pop_scope()` - Exit the scope once decrefs for its live variables have been emitted (see `emit_scope_cleanup()` in Phase 3)
- `register_rc_var(name, type)` - Register a variable that needs RC management

### Phase 2: Identify RC-Managed Types

**Goal:** Determine which types need RC management.

RC-managed types:
- `LuxList*` - Lists
- `LuxString` (when dynamically allocated) - Strings from concat/conversion
- `LuxClosure*` - Closures
- Boxed values (`void*` from `lux_box_*`)
- ADT variants with pointer fields

NOT RC-managed:
- `LuxInt`, `LuxFloat`, `LuxBool` - Stack-allocated primitives
- String literals (`"hello"`) - Static, not heap-allocated
- `LuxUnit` - No data

**Implementation:**

```rust
fn is_rc_managed_type(&self, c_type: &str) -> bool {
    matches!(c_type,
        "LuxList*" | "LuxClosure*" | "LuxString" | "void*"
    ) || c_type.ends_with('*')  // Most pointer types are RC
}

fn needs_rc_for_expr(&self, expr: &Expr) -> bool {
    match expr {
        Expr::List { .. } => true,
        Expr::Lambda { .. } => true,
        Expr::StringConcat { .. } => true,
        // Bind `func` so the callee can be inspected
        Expr::Call { func, .. } => {
            // Check if function returns RC type
            self.returns_rc_type(func)
        }
        Expr::Literal(Literal::String(_)) => false,  // Static string
        Expr::Literal(_) => false,  // Primitives
        Expr::Var(_) => false,  // Using existing var, don't double-free
        _ => false,
    }
}
```

### Phase 3: Emit Decrefs at Scope Exit

**Goal:** Insert `lux_decref()` calls when variables go out of scope.

**For function bodies:**
```rust
fn emit_function(&mut self, func: &Function) -> Result<(), CGenError> {
    self.push_scope();

    // ... emit function body ...

    // Before the closing brace, emit decrefs
    self.emit_scope_cleanup();
    self.pop_scope();
    Ok(())
}
```

**The cleanup function:**
```rust
fn emit_scope_cleanup(&mut self) {
    // Collect the names first so the borrow of `rc_scopes` ends
    // before `writeln` needs `self` (assuming it takes `&mut self`).
    let names: Vec<String> = self
        .rc_scopes
        .last()
        .map(|scope| {
            scope
                .iter()
                .rev() // Decref in reverse order (LIFO)
                .filter(|var| var.is_rc)
                .map(|var| var.name.clone())
                .collect()
        })
        .unwrap_or_default();

    for name in names {
        self.writeln(&format!("lux_decref({});", name));
    }
}
```

### Phase 4: Handle Let Bindings

**Goal:** Register variables when they're bound.

```rust
fn emit_let(&mut self, name: &str, value: &Expr) -> Result<String, CGenError> {
    let c_type = self.infer_c_type(value)?;
    let value_code = self.emit_expr(value)?;

    self.writeln(&format!("{} {} = {};", c_type, name, value_code));

    // Register for cleanup if RC-managed
    if self.is_rc_managed_type(&c_type) && self.needs_rc_for_expr(value) {
        self.register_rc_var(name, &c_type);
    }

    Ok(name.to_string())
}
```

### Phase 5: Handle Early Returns

**Goal:** Decref all live variables before returning.

```rust
fn emit_return(&mut self, value: &Expr) -> Result<String, CGenError> {
    let c_type = self.infer_c_type(value)?;
    let return_val = self.emit_expr(value)?;

    // Evaluate the result before any cleanup so it never reads a freed local
    self.writeln(&format!("{} _ret_tmp = {};", c_type, return_val));

    // A returned variable is about to be decref'd below, so keep it alive.
    // A fresh allocation already has rc=1 and is not registered in any scope.
    if self.is_rc_managed_type(&c_type) && matches!(value, Expr::Var(_)) {
        self.writeln("lux_incref(_ret_tmp);");
    }

    // Decref all scopes from innermost to outermost. Names are collected
    // first so the borrow of `rc_scopes` ends before `writeln` is called.
    let names: Vec<String> = self
        .rc_scopes
        .iter()
        .rev()
        .flat_map(|scope| scope.iter().rev())
        .filter(|var| var.is_rc)
        .map(|var| var.name.clone())
        .collect();
    for name in names {
        self.writeln(&format!("lux_decref({});", name));
    }

    self.writeln("return _ret_tmp;");
    Ok(String::new())
}
```

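As an illustration of the C this phase is aiming for, the sketch below hand-writes a plausible translation of the early-return test case from the Testing Strategy section. It is not actual compiler output: the tiny runtime, the `LUX_TAG_LIST` value, the `lux_header` helper, and the choice to return a fresh allocation without an extra incref are all assumptions.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct { int32_t refcount; int32_t tag; } LuxRcHeader;

static int64_t lux_alloc_count = 0;
static int64_t lux_free_count = 0;

enum { LUX_TAG_LIST = 1 };  /* hypothetical tag value */

static LuxRcHeader* lux_header(void* ptr) { return (LuxRcHeader*)ptr - 1; }

static void* lux_rc_alloc(size_t size, int32_t tag) {
    LuxRcHeader* h = malloc(sizeof(LuxRcHeader) + size);
    if (h == NULL) abort();
    h->refcount = 1;
    h->tag = tag;
    lux_alloc_count++;
    return h + 1;
}

static void lux_incref(void* ptr) { lux_header(ptr)->refcount++; }

static void lux_decref(void* ptr) {
    LuxRcHeader* h = lux_header(ptr);
    if (--h->refcount == 0) {
        lux_free_count++;
        free(h);
    }
}

/* Hand-written stand-in for:
 *   fn test(b: Bool): List<Int> = {
 *       let x = [1, 2, 3]
 *       if b then return []
 *       x
 *   }
 */
void* test_fn(int b) {
    void* x = lux_rc_alloc(3 * sizeof(int64_t), LUX_TAG_LIST);  /* rc=1 */
    if (b) {
        void* _ret_tmp = lux_rc_alloc(0, LUX_TAG_LIST);  /* fresh result, rc=1 */
        lux_decref(x);       /* early return: release every live local first */
        return _ret_tmp;
    }
    void* _ret_tmp = x;
    lux_incref(_ret_tmp);    /* rc=2: keep the result alive across cleanup */
    lux_decref(x);           /* rc=1: normal scope-exit cleanup */
    return _ret_tmp;
}
```

On both paths the caller receives a value with exactly one reference, and `x` does not leak when the early return is taken.
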
### Phase 6: Handle Conditionals

**Goal:** Properly handle if/else where both branches may define variables.

For if/else expressions that create RC values:
```c
// Before (leaks):
LuxList* result = (condition ? create_list_a() : create_list_b());

// After (no leak):
LuxList* result;
if (condition) {
    result = create_list_a();
} else {
    result = create_list_b();
}
// Only one path executed, only one allocation
```

This requires changing if/else from ternary expressions to proper if statements.

### Phase 7: Handle Blocks

**Goal:** Each block `{ ... }` creates a new scope.

```rust
fn emit_block(&mut self, statements: &[Statement]) -> Result<String, CGenError> {
    self.push_scope();
    self.writeln("{");
    self.indent += 1;

    let mut last_value = String::from("NULL");
    for stmt in statements {
        last_value = self.emit_statement(stmt)?;
    }

    // Cleanup before leaving block
    self.emit_scope_cleanup();

    self.indent -= 1;
    self.writeln("}");
    self.pop_scope();

    Ok(last_value)
}
```

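The effect of per-block scopes combined with LIFO cleanup can be sketched in plain C. The code below is a hand-written approximation of what the emitted code might look like for one nested block, not real backend output. The `drop_log` array, the `new_box` helper, and the header layout are hypothetical and exist only to make the drop order observable.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct { int32_t refcount; int32_t tag; } LuxRcHeader;

static int64_t drop_log[8];  /* demo only: ids in the order they were freed */
static int drop_n = 0;

static LuxRcHeader* lux_header(void* ptr) { return (LuxRcHeader*)ptr - 1; }

static void* lux_rc_alloc(size_t size, int32_t tag) {
    LuxRcHeader* h = malloc(sizeof(LuxRcHeader) + size);
    if (h == NULL) abort();
    h->refcount = 1;
    h->tag = tag;
    return h + 1;
}

static void lux_decref(void* ptr) {
    LuxRcHeader* h = lux_header(ptr);
    if (--h->refcount == 0) {
        drop_log[drop_n++] = *(int64_t*)ptr;  /* record the id before freeing */
        free(h);
    }
}

/* Hypothetical helper: a boxed integer used as a recognisable id. */
static int64_t* new_box(int64_t id) {
    int64_t* p = lux_rc_alloc(sizeof *p, 1);
    *p = id;
    return p;
}

/* Shape of the emitted code: each block releases its own variables,
 * in reverse declaration order, before its closing brace. */
void test_fn(void) {
    int64_t* a = new_box(1);
    int64_t* b = new_box(2);
    {
        int64_t* inner = new_box(3);
        lux_decref(inner);  /* inner scope cleanup */
    }
    lux_decref(b);          /* outer scope cleanup, LIFO */
    lux_decref(a);
}
```

After `test_fn()` runs, `drop_log` holds 3, 2, 1: the inner variable is released at its own closing brace, and the outer variables follow in reverse declaration order.
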
---

## Testing Strategy

### Unit Tests

1. **Simple allocation and free:**
   ```lux
   fn test(): Unit = {
       let x = [1, 2, 3]  // Should be freed at end
   }
   ```

2. **Nested scopes:**
   ```lux
   fn test(): Unit = {
       let outer = [1]
       {
           let inner = [2]  // Freed here
       }
       // outer still live
   }  // outer freed here
   ```

3. **Early return:**
   ```lux
   fn test(b: Bool): List<Int> = {
       let x = [1, 2, 3]
       if b then return []  // x must be freed before return
       x
   }
   ```

4. **Conditionals:**
   ```lux
   fn test(b: Bool): List<Int> = {
       let x = if b then [1] else [2]  // Only one allocated
       x
   }
   ```

### Memory Leak Detection

Use valgrind (if available) or add debug tracking:

```c
static int64_t lux_alloc_count = 0;
static int64_t lux_free_count = 0;

static void* lux_rc_alloc(size_t size, int32_t tag) {
    lux_alloc_count++;
    // ... existing code ...
}

static void lux_drop(void* ptr, int32_t tag) {
    lux_free_count++;
    // ... existing code ...
}

// At program exit:
void lux_check_leaks(void) {
    if (lux_alloc_count != lux_free_count) {
        // int64_t is not guaranteed to be long long, so cast to match %lld
        fprintf(stderr, "LEAK: %lld allocations, %lld frees\n",
                (long long)lux_alloc_count, (long long)lux_free_count);
    }
}
```

---

## Comparison with Perceus

| Feature | Perceus (Koka) | Lux RC (Current) |
|---------|----------------|------------------|
| RC header | Yes | Yes ✅ |
| Scope tracking | Yes | Yes ✅ |
| Auto decref | Yes | Yes ✅ |
| Memory tracking | No | Yes ✅ (debug) |
| Early return | Yes | Partial |
| Last-use opt | Yes | No |
| Reuse (FBIP) | Yes | No |
| Drop fusion | Yes | No |

---

## Files to Modify

| File | Changes |
|------|---------|
| `src/codegen/c_backend.rs` | Add scope tracking, emit decrefs |

## Estimated Complexity

- Scope tracking data structures: ~30 lines
- Type classification: ~40 lines
- Scope cleanup emission: ~30 lines
- Let binding registration: ~20 lines
- Early return handling: ~40 lines
- Block scope handling: ~30 lines
- Testing: ~100 lines

**Total: ~300 lines of careful implementation**

---

## Path to Koka/Rust Parity

### What We Have Now (Basic RC)

Our current implementation provides:
- **Deterministic cleanup** - Memory freed at predictable points (scope exit)
- **No GC pauses** - Unlike Go/Java, latency is predictable
- **Leak detection** - Debug mode catches memory leaks during development
- **No manual management** - Unlike C/Zig, programmer doesn't call free()

### What Koka Has (Perceus RC)

Koka's Perceus system adds several optimizations we don't have:

| Feature | Description | Benefit | Complexity |
|---------|-------------|---------|------------|
| **Last-use analysis** | Detect when a variable's final use allows ownership transfer | Avoid unnecessary copies | Medium |
| **Reuse (FBIP)** | When rc=1, mutate in-place instead of copy | Major performance boost | High |
| **Drop specialization** | Generate type-specific drop instead of polymorphic | Fewer branches, faster | Low |
| **Drop fusion** | Combine multiple consecutive drops | Fewer function calls | Medium |
| **Borrow inference** | Avoid incref when borrowing temporaries | Reduce RC overhead | High |

### What Rust Has (Ownership)

Rust's ownership system is fundamentally different:

| Aspect | Rust | Lux RC | Tradeoff |
|--------|------|--------|----------|
| **When checked** | Compile-time | Runtime | Rust catches bugs earlier |
| **Runtime cost** | Zero | RC operations | Rust is faster |
| **Learning curve** | Steep (borrow checker) | Gentle | Lux is easier to learn |
| **Expressiveness** | Limited by lifetimes | Unrestricted | Lux is more flexible |
| **Cycles** | Rare under single ownership; `Rc` cycles still leak without `Weak` | Would leak | Rust's ownership makes cycles harder to create |

**Key insight:** We can never match Rust's zero-overhead guarantees because Rust checks ownership at compile time, while RC always has a runtime cost. But we can be as good as Koka.

### Remaining Work for Full Memory Safety

#### Phase A: Complete Coverage (Prevent All Leaks)

1. ~~**Closure RC**~~ ✅ DONE - Environments are now RC-managed
   - Closures allocated with `lux_rc_alloc(sizeof(LuxClosure), LUX_TAG_CLOSURE)`
   - Environments allocated with `lux_rc_alloc(sizeof(LuxEnv_N), LUX_TAG_ENV)`
   - Inline lambdas freed after use in List operations

2. ~~**ADT RC**~~ ✅ DONE - Algebraic data types with heap fields
   - Track which variants contain RC fields
   - Generate drop functions for each ADT
   - ~100 lines

3. **Early return handling** - Cleanup all scopes on return
   - Current impl handles simple cases
   - Need nested scope cleanup
   - ~30 lines

4. **Complex conditionals** - If/else creating RC values
   - Switch from ternary to if-statements
   - Track RC creation in branches
   - ~50 lines

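For item 2 (ADT RC), the per-ADT drop logic can be sketched for a recursive `Tree` type. This is a simplified illustration under assumptions, not the generated code: only `lux_drop_adt`, `lux_rc_alloc`, `lux_decref`, and the convention that ADT tags start at 100 come from this document. The `Tree` layout, the tag names, the constructors, and the choice to RC-allocate whole nodes are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct { int32_t refcount; int32_t tag; } LuxRcHeader;

static int64_t lux_alloc_count = 0;
static int64_t lux_free_count = 0;

enum { LUX_TAG_ADT_BASE = 100, LUX_TAG_TREE = 100 };  /* hypothetical names */

typedef enum { TREE_LEAF, TREE_NODE } TreeVariant;

typedef struct Tree {
    TreeVariant variant;
    int64_t value;
    struct Tree* left;   /* pointer fields, used only by TREE_NODE */
    struct Tree* right;
} Tree;

static void lux_decref(void* ptr);

static LuxRcHeader* lux_header(void* ptr) { return (LuxRcHeader*)ptr - 1; }

static void* lux_rc_alloc(size_t size, int32_t tag) {
    LuxRcHeader* h = malloc(sizeof(LuxRcHeader) + size);
    if (h == NULL) abort();
    h->refcount = 1;
    h->tag = tag;
    lux_alloc_count++;
    return h + 1;
}

/* Per-ADT drop logic: release the pointer fields of the active variant. */
static void lux_drop_adt(void* ptr, int32_t tag) {
    switch (tag) {
    case LUX_TAG_TREE: {
        Tree* t = ptr;
        if (t->variant == TREE_NODE) {
            lux_decref(t->left);
            lux_decref(t->right);
        }
        break;
    }
    default:
        break;  /* ADTs without pointer fields need no field cleanup */
    }
}

static void lux_decref(void* ptr) {
    if (ptr == NULL) return;
    LuxRcHeader* h = lux_header(ptr);
    if (--h->refcount > 0) return;
    if (h->tag >= LUX_TAG_ADT_BASE) lux_drop_adt(ptr, h->tag);
    lux_free_count++;
    free(h);
}

Tree* tree_leaf(int64_t value) {
    Tree* t = lux_rc_alloc(sizeof(Tree), LUX_TAG_TREE);
    t->variant = TREE_LEAF;
    t->value = value;
    t->left = NULL;
    t->right = NULL;
    return t;
}

/* Takes ownership of both children (no incref needed). */
Tree* tree_node(Tree* left, int64_t value, Tree* right) {
    Tree* t = lux_rc_alloc(sizeof(Tree), LUX_TAG_TREE);
    t->variant = TREE_NODE;
    t->value = value;
    t->left = left;
    t->right = right;
    return t;
}
```

Dropping the root of `tree_node(tree_leaf(1), 2, tree_leaf(3))` releases all three nodes, because the drop logic walks the pointer fields before freeing the parent.
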
#### Phase B: Performance Optimizations (Match Koka)

1. **Last-use optimization**
   - Track variable liveness
   - Skip incref on last use (transfer ownership)
   - Requires dataflow analysis
   - ~200 lines

2. **Reuse analysis (FBIP)**
   - Detect `rc=1` at update sites
   - Mutate in-place instead of copy
   - Major change to list operations
   - ~300 lines

3. **Drop specialization**
   - Generate per-type drop functions
   - Eliminate polymorphic dispatch
   - ~100 lines

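The reuse idea in item 2 can be sketched as a runtime check on the refcount at an update site. The example below assumes a toy fixed-capacity `IntList`, a hypothetical `list_set` that consumes its argument, and a zero-initialising allocator. It illustrates the `rc=1` test only and does not reflect how Lux lists are actually laid out.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct { int32_t refcount; int32_t tag; } LuxRcHeader;

static int64_t lux_alloc_count = 0;
static int64_t lux_free_count = 0;

enum { LUX_TAG_LIST = 1 };  /* hypothetical tag value */

/* Toy list with fixed capacity, for illustration only. */
typedef struct { int64_t len; int64_t items[4]; } IntList;

static LuxRcHeader* lux_header(void* ptr) { return (LuxRcHeader*)ptr - 1; }

static void* lux_rc_alloc(size_t size, int32_t tag) {
    LuxRcHeader* h = calloc(1, sizeof(LuxRcHeader) + size);  /* zeroed payload */
    if (h == NULL) abort();
    h->refcount = 1;
    h->tag = tag;
    lux_alloc_count++;
    return h + 1;
}

static void lux_incref(void* ptr) { lux_header(ptr)->refcount++; }

static void lux_decref(void* ptr) {
    LuxRcHeader* h = lux_header(ptr);
    if (--h->refcount == 0) {
        lux_free_count++;
        free(h);
    }
}

/* Functional update that consumes the caller's reference to `list`
 * and returns a list with items[i] replaced by v. */
IntList* list_set(IntList* list, int64_t i, int64_t v) {
    assert(i >= 0 && i < list->len);
    if (lux_header(list)->refcount == 1) {
        list->items[i] = v;  /* sole owner: nobody can observe the mutation */
        return list;
    }
    IntList* copy = lux_rc_alloc(sizeof(IntList), LUX_TAG_LIST);
    *copy = *list;           /* shared: copy, then update the copy */
    copy->items[i] = v;
    lux_decref(list);        /* give up the reference that was consumed */
    return copy;
}
```

When the list is uniquely owned the update allocates nothing. When it is shared, the other owner keeps seeing the old contents.
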
### Estimated Effort

| Phase | Description | Lines | Priority | Status |
|-------|-------------|-------|----------|--------|
| A1 | Closure RC | ~50 | P0 | ✅ Done |
| A2 | ADT RC | ~150 | P1 | ✅ Done |
| A3 | Early returns | ~30 | P1 - Edge cases | Pending |
| A4 | Conditionals | ~50 | P2 - Uncommon | Pending |
| B1 | Last-use opt | ~200 | P3 - Performance | Pending |
| B2 | Reuse (FBIP) | ~300 | P3 - Performance | Pending |
| B3 | Drop special | ~100 | P3 - Performance | Pending |

**Phase A remaining: ~80 lines** - Gets us to "no leaks"

**Phase B total: ~600 lines** - Gets us to Koka-level performance

### Cycle Detection

RC cannot handle cycles (A → B → A). Options:

1. **Ignore** - Cycles are rare in functional code (our current approach)
2. **Weak references** - Programmer marks back-edges
3. **Cycle collector** - Periodic scan for cycles (adds GC-like pauses)

Koka also ignores cycles, relying on functional programming's natural acyclicity.

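Why a cycle defeats plain RC can be shown with two nodes that point at each other. The sketch below uses a hypothetical `Node` type and a stripped-down runtime, none of which comes from the Lux backend. `break_cycle` clears one edge by hand, which is roughly the job a weak back-edge would make unnecessary.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct { int32_t refcount; int32_t tag; } LuxRcHeader;
typedef struct Node { struct Node* next; } Node;  /* hypothetical type */

static int64_t lux_alloc_count = 0;
static int64_t lux_free_count = 0;

static LuxRcHeader* lux_header(void* ptr) { return (LuxRcHeader*)ptr - 1; }

static void* lux_rc_alloc(size_t size, int32_t tag) {
    LuxRcHeader* h = malloc(sizeof(LuxRcHeader) + size);
    if (h == NULL) abort();
    h->refcount = 1;
    h->tag = tag;
    lux_alloc_count++;
    return h + 1;
}

static void lux_incref(void* ptr) { lux_header(ptr)->refcount++; }

static void lux_decref(void* ptr) {
    if (ptr == NULL) return;
    LuxRcHeader* h = lux_header(ptr);
    if (--h->refcount > 0) return;
    lux_decref(((Node*)ptr)->next);  /* drop: release the outgoing edge first */
    lux_free_count++;
    free(h);
}

static Node* new_node(void) {
    Node* n = lux_rc_alloc(sizeof(Node), 1);
    n->next = NULL;
    return n;
}

/* a -> b: an acyclic chain is fully reclaimed. */
void run_chain(void) {
    Node* a = new_node();
    Node* b = new_node();
    a->next = b;     /* a takes over the only reference to b */
    lux_decref(a);   /* frees a, then b */
}

/* a -> b -> a: both counts stay at 1 after the locals are released.
 * Returns a (still allocated) so the caller can inspect or repair it. */
Node* run_cycle(void) {
    Node* a = new_node();
    Node* b = new_node();
    a->next = b; lux_incref(b);  /* b: rc=2 */
    b->next = a; lux_incref(a);  /* a: rc=2 */
    lux_decref(a);               /* a: rc=1, not freed */
    lux_decref(b);               /* b: rc=1, not freed */
    return a;
}

/* Clearing one edge by hand lets ordinary RC reclaim both nodes. */
void break_cycle(Node* a) {
    Node* b = a->next;
    a->next = NULL;
    lux_decref(b);
}
```

After `run_cycle()` neither node has been freed even though no local variable refers to them any more. Only once `break_cycle` removes one edge do the counts reach zero.
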
---

## References

- [Perceus Paper](https://www.microsoft.com/en-us/research/publication/perceus-garbage-free-reference-counting-with-reuse/)
- [Koka Reference Counting](https://koka-lang.github.io/koka/doc/book.html)
- [Rust Ownership](https://doc.rust-lang.org/book/ch04-00-understanding-ownership.html)