lux/docs/REFERENCE_COUNTING.md
Brandon Lucas 56f0fa4eaa feat: add Perceus-inspired reference counting infrastructure
Implements Phase 1-3 of the RC system for automatic memory management:

- Add LuxRcHeader with refcount and type tag for all heap objects
- Add lux_rc_alloc, lux_incref, lux_decref, and lux_drop functions
- Update list allocation to use RC (lux_list_new uses lux_rc_alloc)
- List operations (concat, reverse, take, drop) now incref shared elements
- Update boxing functions (box_int, box_bool, box_float) to use RC
- String operations (concat, int_to_string, readLine) return RC strings
- File and HTTP operations return RC-managed strings

The infrastructure is ready for automatic decref insertion at scope exit
(Phase 4) and closure RC (Phase 5) in future work.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-14 12:27:54 -05:00

# Reference Counting in Lux C Backend
## Overview
This document describes the reference counting (RC) system for automatic memory management in the Lux C backend. The approach is inspired by Perceus (used in Koka) but starts with a simpler implementation.
## Current Status
**Phases 1-3 Complete**: The RC infrastructure and allocation functions are implemented. Heap-allocated strings, lists, and boxed values are now RC-managed; closures, environments, and ADTs are not yet (see below).
**What's Implemented:**
- RC header with refcount and type tag
- `lux_rc_alloc()` for allocating RC-managed objects
- `lux_incref()` / `lux_decref()` operations
- Polymorphic `lux_drop()` function
- Lists, boxed values, and dynamically-created strings use RC allocation
- List operations properly incref shared elements
**What's NOT Yet Implemented:**
- Automatic decref insertion at scope exit
- Last-use analysis for ownership transfer
- Closure RC (environments still leak)
- ADT RC
## Design
### RC Header
All heap-allocated objects share a common header:
```c
typedef struct {
    int32_t rc;   // Reference count
    int32_t tag;  // Type tag for polymorphic drop
} LuxRcHeader;

// Macro to get header from object pointer
#define LUX_RC_HEADER(ptr) (((LuxRcHeader*)(ptr)) - 1)
```
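The macro relies on the header sitting immediately before the payload, so stepping back one `LuxRcHeader` from the object pointer lands on the header. A minimal sketch of that round trip follows; `header_round_trip` is a hypothetical helper written only for illustration and is not part of the Lux runtime.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct {
    int32_t rc;   // Reference count
    int32_t tag;  // Type tag for polymorphic drop
} LuxRcHeader;

#define LUX_RC_HEADER(ptr) (((LuxRcHeader*)(ptr)) - 1)

// Hypothetical helper: allocates header + payload by hand and returns 1 if
// the macro maps the payload pointer back to the header it was carved from.
static int header_round_trip(size_t payload_size) {
    // Two 32-bit fields, so the header is expected to occupy 8 bytes.
    assert(sizeof(LuxRcHeader) == 2 * sizeof(int32_t));

    LuxRcHeader* hdr = malloc(sizeof(LuxRcHeader) + payload_size);
    if (!hdr) return 0;
    hdr->rc = 1;
    hdr->tag = 1;

    void* payload = hdr + 1;  // the pointer user code would hold
    int ok = LUX_RC_HEADER(payload) == hdr && LUX_RC_HEADER(payload)->rc == 1;

    free(hdr);
    return ok;
}
```

Calling `header_round_trip(16)` should return 1 on any platform where the struct has no padding.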
### Type Tags
```c
typedef enum {
    LUX_TAG_STRING = 1,
    LUX_TAG_LIST = 2,
    LUX_TAG_CLOSURE = 3,
    LUX_TAG_BOXED_INT = 4,
    LUX_TAG_BOXED_BOOL = 5,
    LUX_TAG_BOXED_FLOAT = 6,
    LUX_TAG_ENV = 7,   // Closure environment
    LUX_TAG_ADT = 100  // ADT types start at 100
} LuxTypeTag;
```
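Because ADT tags occupy the range from 100 upward, a tag can be classified with a single comparison. The sketch below shows one way to do that and to print tags while debugging. Both `lux_tag_is_adt` and `lux_tag_name` are hypothetical helpers, not functions from the Lux runtime.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef enum {
    LUX_TAG_STRING = 1,
    LUX_TAG_LIST = 2,
    LUX_TAG_CLOSURE = 3,
    LUX_TAG_BOXED_INT = 4,
    LUX_TAG_BOXED_BOOL = 5,
    LUX_TAG_BOXED_FLOAT = 6,
    LUX_TAG_ENV = 7,
    LUX_TAG_ADT = 100
} LuxTypeTag;

// Hypothetical helper: true for any tag in the ADT range (>= 100).
static int lux_tag_is_adt(int32_t tag) {
    return tag >= LUX_TAG_ADT;
}

// Hypothetical helper: printable name for a tag, e.g. for debug logging.
static const char* lux_tag_name(int32_t tag) {
    if (lux_tag_is_adt(tag)) return "adt";
    switch (tag) {
        case LUX_TAG_STRING:      return "string";
        case LUX_TAG_LIST:        return "list";
        case LUX_TAG_CLOSURE:     return "closure";
        case LUX_TAG_BOXED_INT:   return "boxed_int";
        case LUX_TAG_BOXED_BOOL:  return "boxed_bool";
        case LUX_TAG_BOXED_FLOAT: return "boxed_float";
        case LUX_TAG_ENV:         return "env";
        default:                  return "unknown";
    }
}
```

Leaving a gap between the built-in tags and 100 keeps room for new built-in types without renumbering generated ADT tags.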
### RC Operations
```c
// Forward declaration: lux_decref calls lux_drop, which is defined below
static void lux_drop(void* ptr, int32_t tag);

// Allocate RC-managed memory with initial refcount of 1
// Returns NULL if the underlying malloc fails
static void* lux_rc_alloc(size_t size, int32_t tag) {
    LuxRcHeader* hdr = (LuxRcHeader*)malloc(sizeof(LuxRcHeader) + size);
    if (!hdr) return NULL;
    hdr->rc = 1;
    hdr->tag = tag;
    return hdr + 1;  // Return pointer after header
}

// Increment reference count (NULL is ignored)
static inline void lux_incref(void* ptr) {
    if (ptr) LUX_RC_HEADER(ptr)->rc++;
}

// Decrement reference count, call drop if zero (NULL is ignored)
static inline void lux_decref(void* ptr) {
    if (ptr) {
        LuxRcHeader* hdr = LUX_RC_HEADER(ptr);
        if (--hdr->rc == 0) {
            lux_drop(ptr, hdr->tag);
        }
    }
}
```
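To make the lifecycle concrete, here is a small sketch that allocates a string, shares it, and releases both references. It is self-contained, so it uses a reduced `lux_drop` that only frees the header. The `g_drops` counter and `share_and_release` function are hypothetical and exist only for this demonstration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef struct { int32_t rc; int32_t tag; } LuxRcHeader;
#define LUX_RC_HEADER(ptr) (((LuxRcHeader*)(ptr)) - 1)
enum { LUX_TAG_STRING = 1 };

static int g_drops = 0;  // hypothetical counter, only for this demo

// Reduced drop: strings have no sub-references, so just free the block.
static void lux_drop(void* ptr, int32_t tag) {
    (void)tag;
    g_drops++;
    free(LUX_RC_HEADER(ptr));
}

static void* lux_rc_alloc(size_t size, int32_t tag) {
    LuxRcHeader* hdr = (LuxRcHeader*)malloc(sizeof(LuxRcHeader) + size);
    if (!hdr) return NULL;
    hdr->rc = 1;
    hdr->tag = tag;
    return hdr + 1;
}

static inline void lux_incref(void* ptr) {
    if (ptr) LUX_RC_HEADER(ptr)->rc++;
}

static inline void lux_decref(void* ptr) {
    if (ptr) {
        LuxRcHeader* hdr = LUX_RC_HEADER(ptr);
        if (--hdr->rc == 0) lux_drop(ptr, hdr->tag);
    }
}

// Returns the refcount observed while two references are live.
static int32_t share_and_release(void) {
    char* s = lux_rc_alloc(6, LUX_TAG_STRING);
    assert(s != NULL);             // demo only: abort on allocation failure
    memcpy(s, "hello", 6);         // payload is an ordinary char array

    lux_incref(s);                 // second owner: rc 1 -> 2
    int32_t live = LUX_RC_HEADER(s)->rc;

    lux_decref(s);                 // rc 2 -> 1, object stays alive
    lux_decref(s);                 // rc 1 -> 0, lux_drop frees it
    return live;
}
```

The object is freed exactly once, on the decrement that brings the count to zero; after that point `s` must not be touched.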
### Drop Functions
The polymorphic drop function handles cleanup for each type:
```c
static void lux_drop(void* ptr, int32_t tag) {
    if (!ptr) return;
    switch (tag) {
        case LUX_TAG_STRING:
            // Strings are just char arrays, no sub-references
            break;
        case LUX_TAG_LIST: {
            LuxList* list = (LuxList*)ptr;
            // Decref each element (they're all boxed/RC-managed)
            for (int64_t i = 0; i < list->length; i++) {
                lux_decref(list->elements[i]);
            }
            free(list->elements);
            break;
        }
        case LUX_TAG_CLOSURE: {
            LuxClosure* closure = (LuxClosure*)ptr;
            // Decref the environment if it's RC-managed
            lux_decref(closure->env);
            break;
        }
        case LUX_TAG_BOXED_INT:
        case LUX_TAG_BOXED_BOOL:
        case LUX_TAG_BOXED_FLOAT:
            // Primitive boxes have no sub-references
            break;
        default:
            // ADT types - handled by generated drop functions
            break;
    }
    // Free the object and its RC header
    free(LUX_RC_HEADER(ptr));
}
```
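The list case is where drop becomes recursive: releasing a list releases one reference to each element, and an element is only freed if nobody else holds it. The following sketch illustrates that with a three-element list. The `LuxList` layout is an assumption (only `elements` and `length` appear in this document), and `g_freed` and `drop_list_demo` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct { int32_t rc; int32_t tag; } LuxRcHeader;
#define LUX_RC_HEADER(ptr) (((LuxRcHeader*)(ptr)) - 1)
enum { LUX_TAG_LIST = 2, LUX_TAG_BOXED_INT = 4 };

// Assumed layout: the real LuxList may carry more fields.
typedef struct { void** elements; int64_t length; } LuxList;

static int g_freed = 0;  // hypothetical counter of freed RC objects

static void lux_decref(void* ptr);

// Reduced drop covering only lists and boxed ints.
static void lux_drop(void* ptr, int32_t tag) {
    if (tag == LUX_TAG_LIST) {
        LuxList* list = (LuxList*)ptr;
        for (int64_t i = 0; i < list->length; i++) {
            lux_decref(list->elements[i]);
        }
        free(list->elements);
    }
    g_freed++;
    free(LUX_RC_HEADER(ptr));
}

static void lux_incref(void* ptr) {
    if (ptr) LUX_RC_HEADER(ptr)->rc++;
}

static void lux_decref(void* ptr) {
    if (!ptr) return;
    LuxRcHeader* hdr = LUX_RC_HEADER(ptr);
    if (--hdr->rc == 0) lux_drop(ptr, hdr->tag);
}

static void* lux_rc_alloc(size_t size, int32_t tag) {
    LuxRcHeader* hdr = malloc(sizeof(LuxRcHeader) + size);
    assert(hdr != NULL);  // demo only: abort on allocation failure
    hdr->rc = 1;
    hdr->tag = tag;
    return hdr + 1;
}

// Returns how many objects were freed by dropping the list itself.
static int drop_list_demo(void) {
    LuxList* list = lux_rc_alloc(sizeof(LuxList), LUX_TAG_LIST);
    list->length = 3;
    list->elements = malloc(3 * sizeof(void*));
    assert(list->elements != NULL);
    for (int64_t i = 0; i < 3; i++) {
        int64_t* box = lux_rc_alloc(sizeof(int64_t), LUX_TAG_BOXED_INT);
        *box = i;
        list->elements[i] = box;
    }

    void* shared = list->elements[0];
    lux_incref(shared);             // a second owner outside the list

    lux_decref(list);               // frees the list and elements 1 and 2
    int freed_with_list = g_freed;  // element 0 survives with rc == 1

    lux_decref(shared);             // last owner gone: element 0 is freed
    return freed_with_list;
}
```

Note that deeply nested structures make this recursion proportionally deep, which is one reason production RC runtimes sometimes use an explicit work list instead.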
## Code Generation Rules (Future Work)
### Variable Bindings
When a value is bound to a variable:
```c
// let x = expr
Type x = expr; // expr returns owned reference (rc=1)
```
### Variable Use
When a variable is used (not the last use):
```c
// Using x in expression
lux_incref(x);
some_function(x); // Pass owned reference
```
### Last Use
When a variable is used for the last time:
```c
// Last use of x - no incref needed
some_function(x); // Transfer ownership
```
### Scope Exit
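When a scope ends, decref every local variable that still holds an owned reference (one whose ownership was not transferred by a last use), in reverse order of declaration: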
```c
{
    Type x = ...;
    Type y = ...;
    // ... use x and y ...
    lux_decref(y);
    lux_decref(x);
}
```
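Putting the four rules together, the sketch below shows what generated C might look like for a small function body. It assumes a callee-owns convention in which every function releases the arguments it receives. `demo_box_int`, `consume_box`, `generated_body`, and `g_live` are hypothetical names, and the boxing helper is a simplified stand-in rather than the runtime's own.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct { int32_t rc; int32_t tag; } LuxRcHeader;
#define LUX_RC_HEADER(ptr) (((LuxRcHeader*)(ptr)) - 1)
enum { LUX_TAG_BOXED_INT = 4 };

static int g_live = 0;  // hypothetical count of live RC objects

// Simplified stand-in for a boxing function; returns an owned reference.
static int64_t* demo_box_int(int64_t v) {
    LuxRcHeader* hdr = malloc(sizeof(LuxRcHeader) + sizeof(int64_t));
    assert(hdr != NULL);  // demo only: abort on allocation failure
    hdr->rc = 1;
    hdr->tag = LUX_TAG_BOXED_INT;
    g_live++;
    int64_t* box = (int64_t*)(hdr + 1);
    *box = v;
    return box;
}

static void lux_incref(void* ptr) {
    if (ptr) LUX_RC_HEADER(ptr)->rc++;
}

static void lux_decref(void* ptr) {
    if (ptr && --LUX_RC_HEADER(ptr)->rc == 0) {
        g_live--;
        free(LUX_RC_HEADER(ptr));
    }
}

// Callee owns its argument and releases it when done.
static int64_t consume_box(int64_t* box) {
    int64_t v = *box;
    lux_decref(box);
    return v;
}

// Roughly: let x = 20; let y = 99; consume(x) + consume(x)
static int64_t generated_body(void) {
    int64_t* x = demo_box_int(20);  // binding: owned reference, rc = 1
    int64_t* y = demo_box_int(99);  // binding: owned, never used

    lux_incref(x);                  // x is used again later, so keep our reference
    int64_t a = consume_box(x);     // not the last use

    int64_t b = consume_box(x);     // last use: ownership transfers, no incref

    lux_decref(y);                  // scope exit: y is still owned here
    // x is not decref'd at scope exit: its last use already consumed it
    return a + b;
}
```

After `generated_body()` returns, every box has been released exactly once, which is the invariant the scope-tracking phase needs to preserve.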
## Implementation Phases
### Phase 1: RC Infrastructure ✅ COMPLETE
- Add LuxRcHeader and allocation functions
- Add incref/decref/drop functions
- Type tags for built-in types
### Phase 2: List RC ✅ COMPLETE
- Modify lux_list_new to use RC allocation
- Add drop function for lists
- List operations (concat, reverse, etc.) incref shared elements
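As an illustration of the last point, a concat that shares elements with its inputs has to add one reference per copied pointer; otherwise dropping an input list would free elements the result still uses. The sketch below assumes a two-field `LuxList`; `demo_list_new`, `demo_list_concat`, `demo_box`, and `concat_demo` are hypothetical and are not the runtime's `lux_list_*` functions.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct { int32_t rc; int32_t tag; } LuxRcHeader;
#define LUX_RC_HEADER(ptr) (((LuxRcHeader*)(ptr)) - 1)
enum { LUX_TAG_LIST = 2, LUX_TAG_BOXED_INT = 4 };
typedef struct { void** elements; int64_t length; } LuxList;  // assumed layout

static int g_live = 0;  // hypothetical count of live RC objects

static void* lux_rc_alloc(size_t size, int32_t tag) {
    LuxRcHeader* hdr = malloc(sizeof(LuxRcHeader) + size);
    assert(hdr != NULL);  // demo only: abort on allocation failure
    hdr->rc = 1;
    hdr->tag = tag;
    g_live++;
    return hdr + 1;
}

static void lux_incref(void* ptr) { if (ptr) LUX_RC_HEADER(ptr)->rc++; }

static void lux_decref(void* ptr) {
    if (!ptr || --LUX_RC_HEADER(ptr)->rc != 0) return;
    if (LUX_RC_HEADER(ptr)->tag == LUX_TAG_LIST) {
        LuxList* list = (LuxList*)ptr;
        for (int64_t i = 0; i < list->length; i++) lux_decref(list->elements[i]);
        free(list->elements);
    }
    g_live--;
    free(LUX_RC_HEADER(ptr));
}

static LuxList* demo_list_new(int64_t length) {
    LuxList* list = lux_rc_alloc(sizeof(LuxList), LUX_TAG_LIST);
    list->length = length;
    list->elements = malloc((size_t)(length > 0 ? length : 1) * sizeof(void*));
    assert(list->elements != NULL);
    return list;
}

static int64_t* demo_box(int64_t v) {
    int64_t* box = lux_rc_alloc(sizeof(int64_t), LUX_TAG_BOXED_INT);
    *box = v;
    return box;
}

// The result shares elements with both inputs, so each gains one reference.
static LuxList* demo_list_concat(const LuxList* a, const LuxList* b) {
    LuxList* out = demo_list_new(a->length + b->length);
    for (int64_t i = 0; i < a->length; i++) {
        out->elements[i] = a->elements[i];
        lux_incref(a->elements[i]);
    }
    for (int64_t i = 0; i < b->length; i++) {
        out->elements[a->length + i] = b->elements[i];
        lux_incref(b->elements[i]);
    }
    return out;
}

// Returns the refcount of a shared element while both lists are alive.
static int32_t concat_demo(void) {
    LuxList* a = demo_list_new(1); a->elements[0] = demo_box(1);
    LuxList* b = demo_list_new(1); b->elements[0] = demo_box(2);
    LuxList* c = demo_list_concat(a, b);
    int32_t shared_rc = LUX_RC_HEADER(c->elements[0])->rc;
    lux_decref(a);
    lux_decref(b);  // elements survive: c still references them
    assert(*(int64_t*)c->elements[0] + *(int64_t*)c->elements[1] == 3);
    lux_decref(c);  // last reference: elements are freed with the list
    return shared_rc;
}
```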
### Phase 3: Boxing RC ✅ COMPLETE
- All box functions use lux_rc_alloc
- String operations create RC-managed strings
### Phase 4: Scope Tracking (TODO)
- Track variable lifetimes
- Insert decref at scope exit
- Handle early returns
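One common C pattern for the early-return case is to route every exit through a single cleanup label, so the decrefs are emitted once. Whether the Lux code generator will take this approach is not stated here; the sketch below is only one possible shape, and `demo_box_int`, `first_positive`, and `g_live` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct { int32_t rc; int32_t tag; } LuxRcHeader;
#define LUX_RC_HEADER(ptr) (((LuxRcHeader*)(ptr)) - 1)
enum { LUX_TAG_BOXED_INT = 4 };

static int g_live = 0;  // hypothetical count of live RC objects

static int64_t* demo_box_int(int64_t v) {
    LuxRcHeader* hdr = malloc(sizeof(LuxRcHeader) + sizeof(int64_t));
    assert(hdr != NULL);  // demo only: abort on allocation failure
    hdr->rc = 1;
    hdr->tag = LUX_TAG_BOXED_INT;
    g_live++;
    int64_t* box = (int64_t*)(hdr + 1);
    *box = v;
    return box;
}

static void lux_decref(void* ptr) {
    if (ptr && --LUX_RC_HEADER(ptr)->rc == 0) {
        g_live--;
        free(LUX_RC_HEADER(ptr));
    }
}

// Returns the first positive argument, or -1 if neither is positive.
static int64_t first_positive(int64_t a, int64_t b) {
    int64_t result = -1;
    int64_t* x = demo_box_int(a);
    int64_t* y = NULL;   // not yet bound; decref of NULL is a no-op

    if (*x > 0) {
        result = *x;
        goto cleanup;    // early return: y was never allocated
    }

    y = demo_box_int(b);
    if (*y > 0) result = *y;

cleanup:
    // Single exit point: release locals in reverse order of declaration.
    lux_decref(y);
    lux_decref(x);
    return result;
}
```

Initializing not-yet-bound locals to `NULL` works here because `lux_decref` ignores null pointers.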
### Phase 5: Closure RC (TODO)
- Modify closure allocation to use RC
- Environment structs use RC
- Handle captured variables
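A possible shape for this phase is sketched below: the closure and its environment are both allocated with an RC header, and dropping the environment releases each captured value. The `LuxEnv` and `LuxClosure` layouts are assumptions (this document only shows that `LuxClosure` has an `env` field), and `make_adder`, `adder_body`, `closure_demo`, and `g_live` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct { int32_t rc; int32_t tag; } LuxRcHeader;
#define LUX_RC_HEADER(ptr) (((LuxRcHeader*)(ptr)) - 1)
enum { LUX_TAG_CLOSURE = 3, LUX_TAG_BOXED_INT = 4, LUX_TAG_ENV = 7 };

// Hypothetical layouts for illustration only.
typedef struct { int64_t count; void* captured[]; } LuxEnv;
typedef struct { int64_t (*fn)(void* env, int64_t arg); void* env; } LuxClosure;

static int g_live = 0;  // hypothetical count of live RC objects

static void lux_decref(void* ptr);

static void* lux_rc_alloc(size_t size, int32_t tag) {
    LuxRcHeader* hdr = malloc(sizeof(LuxRcHeader) + size);
    assert(hdr != NULL);  // demo only: abort on allocation failure
    hdr->rc = 1;
    hdr->tag = tag;
    g_live++;
    return hdr + 1;
}

static void lux_drop(void* ptr, int32_t tag) {
    switch (tag) {
        case LUX_TAG_CLOSURE:
            lux_decref(((LuxClosure*)ptr)->env);  // release the environment
            break;
        case LUX_TAG_ENV: {
            LuxEnv* env = (LuxEnv*)ptr;
            for (int64_t i = 0; i < env->count; i++) {
                lux_decref(env->captured[i]);     // release each capture
            }
            break;
        }
        default:
            break;  // boxed ints have no sub-references
    }
    g_live--;
    free(LUX_RC_HEADER(ptr));
}

static void lux_decref(void* ptr) {
    if (!ptr) return;
    LuxRcHeader* hdr = LUX_RC_HEADER(ptr);
    if (--hdr->rc == 0) lux_drop(ptr, hdr->tag);
}

static int64_t adder_body(void* env, int64_t arg) {
    int64_t* n = ((LuxEnv*)env)->captured[0];
    return *n + arg;
}

// Takes ownership of `n`: the environment becomes its owner.
static LuxClosure* make_adder(int64_t* n) {
    LuxEnv* env = lux_rc_alloc(sizeof(LuxEnv) + sizeof(void*), LUX_TAG_ENV);
    env->count = 1;
    env->captured[0] = n;
    LuxClosure* c = lux_rc_alloc(sizeof(LuxClosure), LUX_TAG_CLOSURE);
    c->fn = adder_body;
    c->env = env;
    return c;
}

static int64_t closure_demo(void) {
    int64_t* n = lux_rc_alloc(sizeof(int64_t), LUX_TAG_BOXED_INT);
    *n = 10;
    LuxClosure* add10 = make_adder(n);
    int64_t result = add10->fn(add10->env, 32);
    lux_decref(add10);  // frees closure, then env, then the captured box
    return result;
}
```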
### Phase 6: Last-Use Analysis (Optimization)
- Track last use of variables
- Skip incref on last use (ownership transfer)
- Enable Perceus-style reuse
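The idea behind reuse is that an operation consuming a uniquely owned object (refcount 1) can update it in place instead of allocating a fresh one. The sketch below shows the runtime check on a boxed integer. It is a hand-written illustration, not Perceus's compile-time reuse analysis, and `demo_box_int`, `demo_box_increment`, and the two demo functions are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct { int32_t rc; int32_t tag; } LuxRcHeader;
#define LUX_RC_HEADER(ptr) (((LuxRcHeader*)(ptr)) - 1)
enum { LUX_TAG_BOXED_INT = 4 };

static int64_t* demo_box_int(int64_t v) {
    LuxRcHeader* hdr = malloc(sizeof(LuxRcHeader) + sizeof(int64_t));
    assert(hdr != NULL);  // demo only: abort on allocation failure
    hdr->rc = 1;
    hdr->tag = LUX_TAG_BOXED_INT;
    int64_t* box = (int64_t*)(hdr + 1);
    *box = v;
    return box;
}

static void lux_incref(void* ptr) { if (ptr) LUX_RC_HEADER(ptr)->rc++; }

static void lux_decref(void* ptr) {
    if (ptr && --LUX_RC_HEADER(ptr)->rc == 0) free(LUX_RC_HEADER(ptr));
}

// Consumes `box` and returns an owned box holding *box + 1.
static int64_t* demo_box_increment(int64_t* box) {
    if (LUX_RC_HEADER(box)->rc == 1) {
        *box += 1;          // unique owner: reuse the allocation in place
        return box;
    }
    int64_t* fresh = demo_box_int(*box + 1);
    lux_decref(box);        // shared: give up our reference, leave it intact
    return fresh;
}

// Returns 1 if a uniquely owned box was updated in place.
static int reuse_when_unique(void) {
    int64_t* b = demo_box_int(1);
    int64_t* r = demo_box_increment(b);
    int reused = (r == b) && (*r == 2);
    lux_decref(r);
    return reused;
}

// Returns 1 if a shared box was copied and the original left unchanged.
static int copy_when_shared(void) {
    int64_t* b = demo_box_int(1);
    lux_incref(b);                       // two owners
    int64_t* r = demo_box_increment(b);  // consumes one of them
    int copied = (r != b) && (*r == 2) && (*b == 1);
    lux_decref(r);
    lux_decref(b);
    return copied;
}
```

This only pays off once last-use analysis is in place, since without ownership transfer most values would reach such a function with a refcount above 1.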
## Memory Layout
RC-managed objects have this memory layout:
```
+------------------+
| LuxRcHeader | <- malloc returns this pointer
| int32_t rc |
| int32_t tag |
+------------------+
| Object Data | <- lux_rc_alloc returns this pointer
| ... |
+------------------+
```
## Comparison with Perceus
| Feature | Perceus (Koka) | Lux RC (Current) |
|---------|----------------|------------------|
| RC header | Yes | Yes ✅ |
| RC insertion | Compile-time | Partial (runtime library only; no compiler-inserted decref yet) |
| Last-use opt | Yes | TODO |
| Reuse (FBIP) | Yes | Future |
| Drop fusion | Yes | No |
| Borrow inference | Yes | No |
## References
- [Perceus Paper](https://www.microsoft.com/en-us/research/publication/perceus-garbage-free-reference-counting-with-reuse/)
- [Koka Reference Counting](https://koka-lang.github.io/koka/doc/book.html)