feat: add Perceus-inspired reference counting infrastructure

Implements Phases 1-3 of the RC system for automatic memory management:

- Add LuxRcHeader with refcount and type tag for all heap objects
- Add lux_rc_alloc, lux_incref, lux_decref, and lux_drop functions
- Update list allocation to use RC (lux_list_new uses lux_rc_alloc)
- List operations (concat, reverse, take, drop) now incref shared elements
- Update boxing functions (box_int, box_bool, box_float) to use RC
- String operations (concat, int_to_string, readLine) return RC strings
- File and HTTP operations return RC-managed strings

The infrastructure is ready for automatic decref insertion at scope exit
(Phase 4) and closure RC (Phase 5) in future work.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
224
docs/REFERENCE_COUNTING.md
Normal file
@@ -0,0 +1,224 @@
# Reference Counting in Lux C Backend

## Overview

This document describes the reference counting (RC) system for automatic memory management in the Lux C backend. The approach is inspired by Perceus (used in Koka) but starts with a simpler implementation.

## Current Status

**Phases 1-3 Complete**: The RC infrastructure and allocation functions are implemented. Heap-allocated strings, lists, and boxed values are now RC-managed; closures and ADTs are not yet.

**What's Implemented:**
- RC header with refcount and type tag
- `lux_rc_alloc()` for allocating RC-managed objects
- `lux_incref()` / `lux_decref()` operations
- Polymorphic `lux_drop()` function
- Lists, boxed values, and dynamically-created strings use RC allocation
- List operations properly incref shared elements

**What's NOT Yet Implemented:**
- Automatic decref insertion at scope exit
- Last-use analysis for ownership transfer
- Closure RC (environments still leak)
- ADT RC

## Design

### RC Header

All heap-allocated objects share a common header:

```c
#include <stdint.h>  // int32_t

typedef struct {
    int32_t rc;   // Reference count
    int32_t tag;  // Type tag for polymorphic drop
} LuxRcHeader;

// Macro to get header from object pointer
#define LUX_RC_HEADER(ptr) (((LuxRcHeader*)(ptr)) - 1)
```

### Type Tags

```c
typedef enum {
    LUX_TAG_STRING = 1,
    LUX_TAG_LIST = 2,
    LUX_TAG_CLOSURE = 3,
    LUX_TAG_BOXED_INT = 4,
    LUX_TAG_BOXED_BOOL = 5,
    LUX_TAG_BOXED_FLOAT = 6,
    LUX_TAG_ENV = 7,      // Closure environment
    LUX_TAG_ADT = 100     // ADT types start at 100
} LuxTypeTag;
```

### RC Operations

```c
// Forward declaration: lux_decref calls lux_drop, which is defined later
static void lux_drop(void* ptr, int32_t tag);

// Allocate RC-managed memory with initial refcount of 1
static void* lux_rc_alloc(size_t size, int32_t tag) {
    LuxRcHeader* hdr = (LuxRcHeader*)malloc(sizeof(LuxRcHeader) + size);
    if (!hdr) return NULL;
    hdr->rc = 1;
    hdr->tag = tag;
    return hdr + 1;  // Return pointer after header
}

// Increment reference count
static inline void lux_incref(void* ptr) {
    if (ptr) LUX_RC_HEADER(ptr)->rc++;
}

// Decrement reference count, call drop if zero
static inline void lux_decref(void* ptr) {
    if (ptr) {
        LuxRcHeader* hdr = LUX_RC_HEADER(ptr);
        if (--hdr->rc == 0) {
            lux_drop(ptr, hdr->tag);
        }
    }
}
```
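
As a rough illustration of how these operations interact, the sketch below walks one string through alloc, incref, and two decrefs. It is a self-contained simplification, not the backend's code: `lux_decref` frees directly instead of dispatching through `lux_drop`, and `demo_free_count` and `demo_lifecycle` are hypothetical names added only so the example can observe what happens.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    int32_t rc;
    int32_t tag;
} LuxRcHeader;

#define LUX_RC_HEADER(ptr) (((LuxRcHeader*)(ptr)) - 1)
#define LUX_TAG_STRING 1

/* Hypothetical counter, present only so the example can observe the free. */
int demo_free_count = 0;

static void* lux_rc_alloc(size_t size, int32_t tag) {
    LuxRcHeader* hdr = (LuxRcHeader*)malloc(sizeof(LuxRcHeader) + size);
    if (!hdr) return NULL;
    hdr->rc = 1;
    hdr->tag = tag;
    return hdr + 1;
}

static void lux_incref(void* ptr) {
    if (ptr) LUX_RC_HEADER(ptr)->rc++;
}

/* Simplified: frees directly instead of dispatching through lux_drop. */
static void lux_decref(void* ptr) {
    if (!ptr) return;
    LuxRcHeader* hdr = LUX_RC_HEADER(ptr);
    if (--hdr->rc == 0) {
        demo_free_count++;
        free(hdr);
    }
}

/* Records the refcount after alloc, after incref, and after the first decref.
   Returns the number of frees observed, or -1 if allocation failed. */
int demo_lifecycle(int32_t counts[3]) {
    char* s = (char*)lux_rc_alloc(6, LUX_TAG_STRING);
    if (!s) return -1;
    memcpy(s, "hello", 6);

    counts[0] = LUX_RC_HEADER(s)->rc;  /* fresh object */
    lux_incref(s);
    counts[1] = LUX_RC_HEADER(s)->rc;  /* a second owner exists */
    lux_decref(s);
    counts[2] = LUX_RC_HEADER(s)->rc;  /* back to a single owner */
    lux_decref(s);                     /* count reaches zero: freed here */
    return demo_free_count;
}
```

After the final `lux_decref`, the pointer `s` must not be read again, which is why the function reports the counts through `counts` rather than inspecting the header afterwards.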

### Drop Functions

The polymorphic drop function handles cleanup for each type:

```c
static void lux_drop(void* ptr, int32_t tag) {
    if (!ptr) return;
    switch (tag) {
        case LUX_TAG_STRING:
            // Strings are just char arrays, no sub-references
            break;
        case LUX_TAG_LIST: {
            LuxList* list = (LuxList*)ptr;
            // Decref each element (they're all boxed/RC-managed)
            for (int64_t i = 0; i < list->length; i++) {
                lux_decref(list->elements[i]);
            }
            free(list->elements);
            break;
        }
        case LUX_TAG_CLOSURE: {
            LuxClosure* closure = (LuxClosure*)ptr;
            // Decref the environment if it's RC-managed
            lux_decref(closure->env);
            break;
        }
        case LUX_TAG_BOXED_INT:
        case LUX_TAG_BOXED_BOOL:
        case LUX_TAG_BOXED_FLOAT:
            // Primitive boxes have no sub-references
            break;
        default:
            // ADT types: to be handled by generated drop functions (ADT RC is not yet implemented)
            break;
    }
    // Free the object and its RC header
    free(LUX_RC_HEADER(ptr));
}
```
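
Because dropping a list decrefs every element, any operation that copies element pointers into a new list has to incref them first, or the elements would be freed while another list still points at them. The sketch below illustrates that rule with a concat. The `LuxList` layout (only `elements` and `length`), the boxed-int payload as a bare `int64_t`, and the `demo_*` helpers are assumptions made for the example, not the backend's actual definitions.

```c
#include <stdint.h>
#include <stdlib.h>

typedef struct { int32_t rc; int32_t tag; } LuxRcHeader;
#define LUX_RC_HEADER(ptr) (((LuxRcHeader*)(ptr)) - 1)
enum { LUX_TAG_LIST = 2, LUX_TAG_BOXED_INT = 4 };

/* Assumed layout: only the two fields that lux_drop touches. */
typedef struct { void** elements; int64_t length; } LuxList;

static void* lux_rc_alloc(size_t size, int32_t tag) {
    LuxRcHeader* hdr = (LuxRcHeader*)malloc(sizeof(LuxRcHeader) + size);
    if (!hdr) return NULL;
    hdr->rc = 1;
    hdr->tag = tag;
    return hdr + 1;
}

static void lux_incref(void* ptr) { if (ptr) LUX_RC_HEADER(ptr)->rc++; }
static void lux_decref(void* ptr);

/* Reduced drop: handles lists and leaf objects only. */
static void lux_drop(void* ptr, int32_t tag) {
    if (tag == LUX_TAG_LIST) {
        LuxList* list = (LuxList*)ptr;
        for (int64_t i = 0; i < list->length; i++) lux_decref(list->elements[i]);
        free(list->elements);
    }
    free(LUX_RC_HEADER(ptr));
}

static void lux_decref(void* ptr) {
    if (!ptr) return;
    LuxRcHeader* hdr = LUX_RC_HEADER(ptr);
    if (--hdr->rc == 0) lux_drop(ptr, hdr->tag);
}

/* Hypothetical constructor: a list of `length` NULL slots. */
static LuxList* demo_list_new(int64_t length) {
    LuxList* list = (LuxList*)lux_rc_alloc(sizeof(LuxList), LUX_TAG_LIST);
    if (!list) return NULL;
    list->elements = (void**)calloc(length > 0 ? (size_t)length : 1, sizeof(void*));
    if (!list->elements) { free(LUX_RC_HEADER(list)); return NULL; }
    list->length = length;
    return list;
}

/* Hypothetical concat: the result shares elements with a and b,
   so each copied pointer gets its own reference. */
LuxList* demo_list_concat(const LuxList* a, const LuxList* b) {
    LuxList* out = demo_list_new(a->length + b->length);
    if (!out) return NULL;
    for (int64_t i = 0; i < a->length; i++) {
        out->elements[i] = a->elements[i];
        lux_incref(out->elements[i]);
    }
    for (int64_t j = 0; j < b->length; j++) {
        out->elements[a->length + j] = b->elements[j];
        lux_incref(out->elements[a->length + j]);
    }
    return out;
}

/* Returns the shared element's refcount after the source list is dropped,
   or -1 if an allocation failed. */
int32_t demo_shared_element_rc(void) {
    int64_t* boxed = (int64_t*)lux_rc_alloc(sizeof(int64_t), LUX_TAG_BOXED_INT);
    LuxList* a = demo_list_new(1);
    if (!boxed || !a) { lux_decref(boxed); lux_decref(a); return -1; }
    *boxed = 42;
    a->elements[0] = boxed;                  /* a owns the initial reference */

    LuxList* both = demo_list_concat(a, a);  /* boxed: rc 1 -> 3 */
    lux_decref(a);                           /* drops a; boxed: rc 3 -> 2 */
    if (!both) return -1;
    int32_t rc = LUX_RC_HEADER(boxed)->rc;   /* still alive via `both` */
    lux_decref(both);                        /* drops both and, last, boxed */
    return rc;
}
```

Without the two `lux_incref` calls in `demo_list_concat`, dropping `a` would free the boxed value while `both` still referenced it.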

## Code Generation Rules (Future Work)

### Variable Bindings

When a value is bound to a variable:
```c
// let x = expr
Type x = expr;  // expr returns owned reference (rc=1)
```

### Variable Use

When a variable is used (not the last use):
```c
// Using x in expression
lux_incref(x);
some_function(x);  // Pass owned reference
```

### Last Use

When a variable is used for the last time:
```c
// Last use of x - no incref needed
some_function(x);  // Transfer ownership
```
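
The two rules above imply a calling convention in which the callee owns the reference it receives and releases it when done. The following sketch traces the refcount through one non-last use and one last use under that assumption. `demo_consume`, `demo_two_uses`, and `demo_free_count` are hypothetical names, and `lux_decref` is simplified to free directly.

```c
#include <stdint.h>
#include <stdlib.h>

typedef struct { int32_t rc; int32_t tag; } LuxRcHeader;
#define LUX_RC_HEADER(ptr) (((LuxRcHeader*)(ptr)) - 1)
#define LUX_TAG_BOXED_INT 4

/* Hypothetical counter so the example can confirm exactly one free. */
int demo_free_count = 0;

static void* lux_rc_alloc(size_t size, int32_t tag) {
    LuxRcHeader* hdr = (LuxRcHeader*)malloc(sizeof(LuxRcHeader) + size);
    if (!hdr) return NULL;
    hdr->rc = 1;
    hdr->tag = tag;
    return hdr + 1;
}

static void lux_incref(void* ptr) { if (ptr) LUX_RC_HEADER(ptr)->rc++; }

/* Simplified: boxed ints have no sub-references, so free directly. */
static void lux_decref(void* ptr) {
    if (!ptr) return;
    LuxRcHeader* hdr = LUX_RC_HEADER(ptr);
    if (--hdr->rc == 0) {
        demo_free_count++;
        free(hdr);
    }
}

/* Hypothetical callee: it owns its argument and releases it before returning. */
static int64_t demo_consume(int64_t* boxed) {
    int64_t value = *boxed;
    lux_decref(boxed);
    return value;
}

/* Returns the refcount of x between the two calls, or -1 on allocation failure. */
int32_t demo_two_uses(void) {
    int64_t* x = (int64_t*)lux_rc_alloc(sizeof(int64_t), LUX_TAG_BOXED_INT);
    if (!x) return -1;
    *x = 7;

    lux_incref(x);    /* not the last use: rc 1 -> 2 */
    demo_consume(x);  /* callee releases its reference: rc 2 -> 1 */
    int32_t rc_between = LUX_RC_HEADER(x)->rc;

    demo_consume(x);  /* last use: ownership moves to the callee, rc 1 -> 0 */
    return rc_between;
}
```

Skipping the incref on the last use saves one increment and one decrement, and the caller then has nothing left to release for `x` at scope exit.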

### Scope Exit

When a scope ends, decref each local variable that still holds an owned reference (that is, one whose ownership was not transferred at its last use), in reverse order of declaration:
```c
{
    Type x = ...;
    Type y = ...;
    // ... use x and y ...
    lux_decref(y);
    lux_decref(x);
}
```
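
Early returns complicate this rule, since every exit path has to release whatever is live at that point. One common C pattern, shown here only as a possible shape for the generated code and not as what the backend emits, is to route all exits through a single cleanup label and rely on `lux_decref` being NULL-safe. `demo_scope` and `demo_live_objects` are hypothetical names.

```c
#include <stdint.h>
#include <stdlib.h>

typedef struct { int32_t rc; int32_t tag; } LuxRcHeader;
#define LUX_RC_HEADER(ptr) (((LuxRcHeader*)(ptr)) - 1)
#define LUX_TAG_STRING 1

/* Hypothetical leak counter: allocations minus frees. */
int demo_live_objects = 0;

static void* lux_rc_alloc(size_t size, int32_t tag) {
    LuxRcHeader* hdr = (LuxRcHeader*)malloc(sizeof(LuxRcHeader) + size);
    if (!hdr) return NULL;
    hdr->rc = 1;
    hdr->tag = tag;
    demo_live_objects++;
    return hdr + 1;
}

/* Simplified: strings have no sub-references, so free directly. */
static void lux_decref(void* ptr) {
    if (!ptr) return;
    LuxRcHeader* hdr = LUX_RC_HEADER(ptr);
    if (--hdr->rc == 0) {
        demo_live_objects--;
        free(hdr);
    }
}

/* Returns 1 on the early-exit path, 2 on the normal path, -1 on failure.
   Every path leaves through `cleanup`, so nothing leaks. */
int demo_scope(int exit_early) {
    int result = -1;
    char* x = NULL;  /* locals start as NULL so cleanup is always safe */
    char* y = NULL;

    x = (char*)lux_rc_alloc(16, LUX_TAG_STRING);
    if (!x) goto cleanup;

    if (exit_early) {  /* early return: y was never allocated */
        result = 1;
        goto cleanup;
    }

    y = (char*)lux_rc_alloc(16, LUX_TAG_STRING);
    if (!y) goto cleanup;
    result = 2;

cleanup:
    lux_decref(y);  /* reverse declaration order; NULL is ignored */
    lux_decref(x);
    return result;
}
```

Initializing every RC-managed local to NULL is what lets a single cleanup block serve both paths.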

## Implementation Phases

### Phase 1: RC Infrastructure ✅ COMPLETE
- Add LuxRcHeader and allocation functions
- Add incref/decref/drop functions
- Type tags for built-in types

### Phase 2: List RC ✅ COMPLETE
- Modify lux_list_new to use RC allocation
- Add drop function for lists
- List operations (concat, reverse, etc.) incref shared elements

### Phase 3: Boxing RC ✅ COMPLETE
- All box functions use lux_rc_alloc
- String operations create RC-managed strings

### Phase 4: Scope Tracking (TODO)
- Track variable lifetimes
- Insert decref at scope exit
- Handle early returns

### Phase 5: Closure RC (TODO)
- Modify closure allocation to use RC
- Environment structs use RC
- Handle captured variables

### Phase 6: Last-Use Analysis (Optimization)
- Track last use of variables
- Skip incref on last use (ownership transfer)
- Enable Perceus-style reuse

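
To give a feel for what reuse could mean in Phase 6, the sketch below uses a runtime uniqueness check: if the caller held the only reference, the object is updated in place; otherwise a fresh object is allocated and the caller's reference is released. This is a deliberately simplified stand-in for Perceus's compile-time reuse analysis, and `demo_box_add_one` and `demo_reuse_flags` are hypothetical names.

```c
#include <stdint.h>
#include <stdlib.h>

typedef struct { int32_t rc; int32_t tag; } LuxRcHeader;
#define LUX_RC_HEADER(ptr) (((LuxRcHeader*)(ptr)) - 1)
#define LUX_TAG_BOXED_INT 4

static void* lux_rc_alloc(size_t size, int32_t tag) {
    LuxRcHeader* hdr = (LuxRcHeader*)malloc(sizeof(LuxRcHeader) + size);
    if (!hdr) return NULL;
    hdr->rc = 1;
    hdr->tag = tag;
    return hdr + 1;
}

static void lux_incref(void* ptr) { if (ptr) LUX_RC_HEADER(ptr)->rc++; }

/* Simplified: boxed ints have no sub-references, so free directly. */
static void lux_decref(void* ptr) {
    if (!ptr) return;
    LuxRcHeader* hdr = LUX_RC_HEADER(ptr);
    if (--hdr->rc == 0) free(hdr);
}

/* Hypothetical: takes ownership of `boxed` and returns a box holding value + 1. */
int64_t* demo_box_add_one(int64_t* boxed) {
    if (LUX_RC_HEADER(boxed)->rc == 1) {
        *boxed += 1;   /* unique: nobody else can observe the mutation */
        return boxed;  /* ownership passes straight back to the caller */
    }
    int64_t* fresh = (int64_t*)lux_rc_alloc(sizeof(int64_t), LUX_TAG_BOXED_INT);
    if (fresh) *fresh = *boxed + 1;
    lux_decref(boxed); /* shared: release only our reference */
    return fresh;
}

/* Returns a bitmask: bit 0 set if the unique box was reused in place,
   bit 1 set if the shared box was copied and left unchanged. -1 on failure. */
int demo_reuse_flags(void) {
    int flags = 0;
    int64_t* a = (int64_t*)lux_rc_alloc(sizeof(int64_t), LUX_TAG_BOXED_INT);
    if (!a) return -1;
    *a = 41;

    int64_t* b = demo_box_add_one(a);  /* rc was 1: same memory comes back */
    if (b == a && *b == 42) flags |= 1;

    lux_incref(b);                     /* simulate a second owner: rc 2 */
    int64_t* c = demo_box_add_one(b);  /* shared: must allocate */
    if (!c) { lux_decref(b); return -1; }
    if (c != b && *c == 43 && *b == 42) flags |= 2;

    lux_decref(b);
    lux_decref(c);
    return flags;
}
```

The in-place path avoids both a `malloc` and a `free`, which is the main payoff the Perceus paper attributes to reuse.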
## Memory Layout

RC-managed objects have this memory layout:

```
+------------------+
| LuxRcHeader      | <- malloc returns this pointer
|   int32_t rc     |
|   int32_t tag    |
+------------------+
| Object Data      | <- lux_rc_alloc returns this pointer
|   ...            |
+------------------+
```
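
The layout can be checked with a few lines of pointer arithmetic. The sketch below assumes the header is exactly two `int32_t` fields with no padding, which holds on common platforms, and confirms that `LUX_RC_HEADER` inverts the `hdr + 1` step. `demo_header_offset` is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct {
    int32_t rc;
    int32_t tag;
} LuxRcHeader;

#define LUX_RC_HEADER(ptr) (((LuxRcHeader*)(ptr)) - 1)

/* C11 compile-time check of the assumed header size. */
_Static_assert(sizeof(LuxRcHeader) == 8,
               "header is assumed to be two int32_t fields with no padding");

/* Returns the byte distance from the malloc'd pointer to the object data,
   or -1 if allocation failed or the macro did not round-trip. */
ptrdiff_t demo_header_offset(void) {
    LuxRcHeader* hdr = (LuxRcHeader*)malloc(sizeof(LuxRcHeader) + 32);
    if (!hdr) return -1;

    void* obj = hdr + 1;  /* what lux_rc_alloc hands to callers */
    ptrdiff_t offset = (char*)obj - (char*)hdr;
    int round_trip = (LUX_RC_HEADER(obj) == hdr);

    free(hdr);            /* always free the header pointer, never obj */
    return round_trip ? offset : -1;
}
```

One consequence of the 8-byte header is that object data is only guaranteed 8-byte alignment when `malloc` returns 16-byte-aligned memory, so a payload type that needs 16-byte alignment would likely require a padded header.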

## Comparison with Perceus

| Feature | Perceus (Koka) | Lux RC (Current) |
|---------|----------------|------------------|
| RC header | Yes | Yes ✅ |
| RC insertion | Compile-time | Partial |
| Last-use opt | Yes | TODO |
| Reuse (FBIP) | Yes | Future |
| Drop fusion | Yes | No |
| Borrow inference | Yes | No |

## References

- [Perceus Paper](https://www.microsoft.com/en-us/research/publication/perceus-garbage-free-reference-counting-with-reuse/)
- [Koka Reference Counting](https://koka-lang.github.io/koka/doc/book.html)