# Lux C Backend

## Overview

Lux compiles to C code, then invokes a system C compiler (gcc/clang) to produce native binaries. Several production languages take the same approach:

| Language | Target | Memory Management |
|----------|--------|-------------------|
| **Koka** | C | Perceus reference counting |
| **Nim** | C | ORC (configurable) |
| **Chicken Scheme** | C | Generational GC |
| **Lux (current)** | C | Partial RC (still leaks) |

## Compilation Pipeline

```
┌─────────────┐     ┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│ Lux Source  │ ──► │   Parser    │ ──► │ Type Check  │ ──► │  C Codegen  │
└─────────────┘     └─────────────┘     └─────────────┘     └─────────────┘
                                                                   │
                                                                   ▼
┌─────────────┐     ┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   Binary    │ ◄── │   cc/gcc/   │ ◄── │   Temp .c   │ ◄───│   C Code    │
│             │     │    clang    │     │    File     │     │  (string)   │
└─────────────┘     └─────────────┘     └─────────────┘     └─────────────┘
```

**Usage:**

```bash
lux compile foo.lux          # Produces ./foo binary
lux compile foo.lux -o app   # Produces ./app binary
lux compile foo.lux --run    # Compile and execute
lux compile foo.lux --emit-c # Output C code (for debugging)
```

## Runtime Type Representations

### Primitive Types

```c
typedef int64_t LuxInt;
typedef double LuxFloat;
typedef bool LuxBool;
typedef char* LuxString;
typedef void* LuxUnit;
```

### Closures

Closures are represented as a pair of environment pointer and function pointer:

```c
typedef struct {
    void* env;     // Pointer to captured variables
    void* fn_ptr;  // Pointer to the function
} LuxClosure;
```

**Example - capturing a variable:**

```lux
let multiplier = 3
let triple = fn(x: Int): Int => x * multiplier
```

Generates:

```c
// Environment struct for captured variables
typedef struct {
    LuxInt multiplier;
} Env_triple;

// The lambda function
LuxInt lambda_triple(void* _env, LuxInt x) {
    Env_triple* env = (Env_triple*)_env;
    return x * env->multiplier;
}

// Creating the closure
Env_triple* env = malloc(sizeof(Env_triple));
env->multiplier = multiplier;
LuxClosure* triple =
    malloc(sizeof(LuxClosure));
triple->env = env;
triple->fn_ptr = (void*)lambda_triple;
```

### Algebraic Data Types (ADTs)

ADTs compile to tagged unions:

```lux
type Option =
  | Some(Int)
  | None
```

Generates:

```c
typedef enum {
    Option_TAG_SOME,
    Option_TAG_NONE
} Option_Tag;

typedef struct {
    Option_Tag tag;
    union {
        struct { LuxInt field0; } some;
        // None has no fields
    } data;
} Option;
```

**Pattern matching** compiles to if/else chains:

```lux
match opt {
  Some(x) => x,
  None => 0
}
```

Generates:

```c
if (opt.tag == Option_TAG_SOME) {
    LuxInt x = opt.data.some.field0;
    result = x;
} else if (opt.tag == Option_TAG_NONE) {
    result = 0;
}
```

### Lists

Lists are dynamic arrays with boxed elements:

```c
typedef struct {
    void** elements;  // Array of boxed elements
    int64_t length;
    int64_t capacity;
} LuxList;
```

Elements are boxed/unboxed at access time:

```c
void* lux_box_int(LuxInt n) {
    LuxInt* p = malloc(sizeof(LuxInt));
    *p = n;
    return p;
}

LuxInt lux_unbox_int(void* p) {
    return *(LuxInt*)p;
}
```

**List operations** (map, filter, fold, etc.) generate inline loops:

```c
// List.map(nums, fn(x) => x * 2)
LuxList* result = lux_list_new(nums->length);
for (int64_t i = 0; i < nums->length; i++) {
    void* elem = nums->elements[i];
    LuxInt mapped = ((LuxInt(*)(void*, LuxInt))fn->fn_ptr)(fn->env, lux_unbox_int(elem));
    result->elements[i] = lux_box_int(mapped);
}
result->length = nums->length;
```

## Current Limitations

### 1. Memory Management (Partial RC)

RC infrastructure is implemented but not fully integrated:

- ✅ Lists, boxed values, and strings use RC allocation
- ✅ List operations properly incref shared elements
- ⏳ Automatic decref at scope exit (not yet implemented)
- ⏳ Closures and ADTs still leak

**Current state:** Memory is tracked with refcounts, but objects are not automatically freed at scope exit. This is acceptable for short-lived programs but not for long-running services.

### 2. Effects ✅ MOSTLY COMPLETE

All major effects are now supported:

- `Console` (print, readLine)
- `Random` (int, float, bool)
- `Time` (now, sleep)
- `File` (read, write, append, exists, delete, isDir, mkdir)
- `Http` (get, post, put, delete)

All effects use evidence passing for O(1) handler lookup.

### 3. If/Else Side Effects

The C backend uses ternary operators for if/else:

```c
(condition ? then_value : else_value)
```

**Problem:** If branches contain side effects (like `Console.print`), code generation evaluates both branches, so both side effects execute.

**Workaround:** Use pure expressions in if/else branches, then print the result:

```lux
// Bad - both prints execute
if x > 0 then Console.print("positive") else Console.print("negative")

// Good - only one print
let msg = if x > 0 then "positive" else "negative"
Console.print(msg)
```

---

## Comparison with Other Languages

### Koka (Our Inspiration)

Koka also compiles to C with algebraic effects. How the two compare:

| Aspect | Koka | Lux (current) |
|--------|------|---------------|
| Memory | Perceus RC | Partial RC (still leaks) |
| Effects | Evidence passing (zero-cost) | Evidence passing (O(1) lookup) |
| Closures | Environment vectors | Heap-allocated structs |
| Maturity | Production-ready | Experimental |

### Rust

| Aspect | Rust | Lux |
|--------|------|-----|
| Target | LLVM | C |
| Memory | Ownership/borrowing | Partial RC (still leaks) |
| Safety | Compile-time guaranteed | Runtime (interpreter) |
| Learning curve | Steep | Medium |

### Zig

| Aspect | Zig | Lux |
|--------|-----|-----|
| Target | LLVM | C |
| Memory | Manual with allocators | Partial RC (still leaks) |
| Philosophy | Explicit control | High-level abstraction |

### Go

| Aspect | Go | Lux |
|--------|-----|-----|
| Target | Native | C |
| Memory | Concurrent GC | Partial RC (still leaks) |
| Effects | None | Algebraic effects |
| Latency | Unpredictable (GC pauses) | Predictable (no GC) |

---

## Current Progress

### Evidence Passing (Zero-Cost Effects) ✅ COMPLETE

**Interpreter:** ✅ Complete - O(1) HashMap lookup instead of
O(n) stack search.

**C Backend:** ✅ Complete - Full evidence threading through function calls.

**Generated code example:**

```c
void greet_lux(LuxEvidence* ev) {
    ev->console->print(ev->console->env, "Hello!");
}

int main(int argc, char** argv) {
    greet_lux(&default_evidence);
    return 0;
}
```

See [docs/EVIDENCE_PASSING.md](EVIDENCE_PASSING.md) for details.

---

## Future Roadmap

### Phase 4: Perceus Reference Counting 🔄 IN PROGRESS

**Goal:** Deterministic memory management without GC pauses.

Perceus is a compile-time reference counting system that:

1. Inserts increment/decrement at precise points
2. Detects when values can be reused in-place (FBIP)
3. Guarantees no memory leaks without runtime GC

**Current Status:**

- ✅ RC infrastructure (header, alloc, incref/decref, drop)
- ✅ Lists use RC allocation with proper element incref
- ✅ Boxed values (Int, Bool, Float) use RC allocation
- ✅ Dynamic strings use RC allocation
- ⏳ Automatic decref at scope exit (TODO)
- ⏳ Closure RC (TODO)
- ⏳ Last-use optimization (TODO)

See [docs/REFERENCE_COUNTING.md](REFERENCE_COUNTING.md) for details.

**Example - reuse analysis (future):**

```lux
fn increment(xs: List): List =
  List.map(xs, fn(x) => x + 1)
```

If `xs` has refcount=1, the list can be mutated in-place instead of copied.

### Phase 2: More Effects ✅ COMPLETE

Implemented C versions of:

- `Random` (int, float, bool) - LCG random number generator
- `Time` (now, sleep) - using clock_gettime/nanosleep
- `File` (read, write, append, exists, delete, isDir, mkdir)

All effects use evidence passing for handler customization.

### Phase 3: Http Effect ✅ COMPLETE

HTTP client using POSIX sockets:

- `Http.get(url)` - GET request
- `Http.post(url, body)` - POST request
- `Http.put(url, body)` - PUT request
- `Http.delete(url)` - DELETE request

Self-contained implementation (no external dependencies like libcurl).

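As a rough illustration of what a dependency-free client has to do before it touches a socket, the sketch below splits a URL and formats the request text. It is not the Lux runtime's code: `SketchUrl`, `sketch_parse_url`, and `sketch_build_request` are hypothetical names, and the sketch assumes plain `http://` URLs only. The socket I/O itself (resolve, connect, send, receive) is left out.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

// Hypothetical types and names for illustration; not taken from the Lux runtime.
typedef struct {
    char host[256];
    char port[8];
    char path[1024];
} SketchUrl;

// Split "http://host[:port][/path]" into its parts.
// Returns 0 on success, -1 on an unsupported or oversized URL.
static int sketch_parse_url(const char* url, SketchUrl* out) {
    const char* prefix = "http://";
    size_t prefix_len = strlen(prefix);
    if (strncmp(url, prefix, prefix_len) != 0) return -1;  // assumption: plain HTTP only

    const char* host = url + prefix_len;
    size_t authority_len = strcspn(host, "/");  // "host[:port]", up to the first '/'
    size_t host_len = strcspn(host, ":/");      // host name only
    if (host_len == 0 || host_len >= sizeof out->host) return -1;
    memcpy(out->host, host, host_len);
    out->host[host_len] = '\0';

    if (host[host_len] == ':') {
        size_t port_len = authority_len - host_len - 1;
        if (port_len == 0 || port_len >= sizeof out->port) return -1;
        memcpy(out->port, host + host_len + 1, port_len);
        out->port[port_len] = '\0';
    } else {
        strcpy(out->port, "80");  // default HTTP port
    }

    const char* path = host + authority_len;
    if (*path == '\0') path = "/";  // empty path means the root
    if (strlen(path) >= sizeof out->path) return -1;
    strcpy(out->path, path);
    return 0;
}

// Format an HTTP/1.1 request into buf; body may be NULL for GET/DELETE.
// Returns the number of bytes written, or -1 if buf is too small.
static int sketch_build_request(char* buf, size_t cap, const char* method,
                                const SketchUrl* url, const char* body) {
    assert(buf != NULL && method != NULL && url != NULL);
    size_t body_len = body ? strlen(body) : 0;
    int n = snprintf(buf, cap,
                     "%s %s HTTP/1.1\r\n"
                     "Host: %s:%s\r\n"
                     "Connection: close\r\n"
                     "Content-Length: %zu\r\n"
                     "\r\n"
                     "%s",
                     method, url->path, url->host, url->port,
                     body_len, body ? body : "");
    if (n < 0 || (size_t)n >= cap) return -1;
    return n;
}
```

A client built this way would write the returned bytes to a connected socket and read until the peer closes the connection, which is why the sketch sends `Connection: close`.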
### Phase 5: JavaScript Backend

Compile Lux to JavaScript for browser/Node.js:

- Effects → Direct DOM/API calls
- No runtime needed
- Enables full-stack Lux development

---

## Implementation Details

### Name Mangling

Lux identifiers are mangled for C compatibility:

| Lux | C |
|-----|---|
| `foo` | `foo_lux` |
| `myFunction` | `myFunction_lux` |
| `List.map` | Inline code (not a function call) |

### Generated C Structure

```c
// 1. Includes and type definitions
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdint.h>
#include <stdbool.h>

typedef int64_t LuxInt;
// ... more types ...

// 2. Runtime helpers (string concat, list operations, etc.)
static LuxString lux_string_concat(LuxString a, LuxString b) { ... }
static LuxList* lux_list_new(int64_t capacity) { ... }
// ... more helpers ...

// 3. Forward declarations
void main_lux(void);

// 4. Closure/lambda definitions
static LuxInt lambda_1(void* _env, LuxInt x) { ... }

// 5. User-defined functions
void greet_lux(LuxString name) { ... }

// 6. Main function
void main_lux(void) { ... }

// 7. Entry point
int main(int argc, char** argv) {
    main_lux();
    return 0;
}
```

### Prelude Size

The generated C prelude is approximately 150 lines, including:

- Type definitions (~20 lines)
- String operations (~30 lines)
- List types and operations (~80 lines)
- Boxing/unboxing helpers (~20 lines)

---

## Testing the C Backend

```bash
# Compile and run
lux compile examples/hello.lux --run

# Compile to binary
lux compile examples/hello.lux -o hello
./hello

# View generated C (for debugging)
lux compile examples/hello.lux --emit-c

# Save C to file
lux compile examples/hello.lux --emit-c -o hello.c
```

---

## References

- [Perceus: Garbage Free Reference Counting](https://www.microsoft.com/en-us/research/publication/perceus-garbage-free-reference-counting-with-reuse/) - Microsoft Research
- [Generalized Evidence Passing for Effect Handlers](https://www.microsoft.com/en-us/research/publication/generalized-evidence-passing-for-effect-handlers/) - Koka's effect compilation
- [Koka Language](https://koka-lang.github.io/koka/doc/book.html) - Effect system language that compiles to C
- [Nim Backend Integration](https://nim-lang.org/docs/backends.html) - Another compile-to-C language