feat: add list support to C backend and improve compile workflow

C Backend Lists:
- Add LuxList type (dynamic array with void* boxing)
- Implement all 16 list operations: length, isEmpty, concat, reverse, range, take, drop, head, tail, get, map, filter, fold, find, any, all
- Higher-order operations generate inline loops with closure calls
- Fix unique variable names to prevent redefinition errors

Compile Command:
- `lux compile file.lux` now produces a binary (like rustc, go build)
- Add `--emit-c` flag to output C code instead
- Binary name derived from source filename (foo.lux -> ./foo)
- Clean up temp files after compilation

Documentation:
- Create docs/C_BACKEND.md with full strategy documentation
- Document compilation pipeline, runtime types, limitations
- Compare with Koka, Rust, Zig, Go, Nim, OCaml approaches
- Outline future roadmap (evidence passing, Perceus RC)
- Fix misleading doc comment (remove false Perceus claim)
- Update OVERVIEW.md and ROADMAP.md to reflect list completion

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
399
docs/C_BACKEND.md
Normal file
@@ -0,0 +1,399 @@
# Lux C Backend

## Overview

Lux compiles to C code, then invokes a system C compiler (gcc/clang) to produce native binaries. This approach is used by several production languages:

| Language | Target | Memory Management |
|----------|--------|-------------------|
| **Koka** | C | Perceus reference counting |
| **Nim** | C | ORC (configurable) |
| **Chicken Scheme** | C | Generational GC |
| **Lux (current)** | C | None (leaks) |

## Compilation Pipeline

```
┌─────────────┐     ┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│ Lux Source  │ ──► │   Parser    │ ──► │ Type Check  │ ──► │  C Codegen  │
└─────────────┘     └─────────────┘     └─────────────┘     └─────────────┘
                                                                   │
                                                                   ▼
┌─────────────┐     ┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   Binary    │ ◄── │   cc/gcc/   │ ◄── │  Temp .c    │ ◄── │   C Code    │
│             │     │   clang     │     │   File      │     │  (string)   │
└─────────────┘     └─────────────┘     └─────────────┘     └─────────────┘
```

**Usage:**
```bash
lux compile foo.lux           # Produces ./foo binary
lux compile foo.lux -o app    # Produces ./app binary
lux compile foo.lux --run     # Compile and execute
lux compile foo.lux --emit-c  # Output C code (for debugging)
```

## Runtime Type Representations

### Primitive Types

```c
typedef int64_t LuxInt;
typedef double LuxFloat;
typedef bool LuxBool;
typedef char* LuxString;
typedef void* LuxUnit;
```

### Closures

Closures are represented as a pair of environment pointer and function pointer:

```c
typedef struct {
    void* env;     // Pointer to captured variables
    void* fn_ptr;  // Pointer to the function
} LuxClosure;
```

**Example - capturing a variable:**
```lux
let multiplier = 3
let triple = fn(x: Int): Int => x * multiplier
```

Generates:
```c
// Environment struct for captured variables
typedef struct {
    LuxInt multiplier;
} Env_triple;

// The lambda function
LuxInt lambda_triple(void* _env, LuxInt x) {
    Env_triple* env = (Env_triple*)_env;
    return x * env->multiplier;
}

// Creating the closure
Env_triple* env = malloc(sizeof(Env_triple));
env->multiplier = multiplier;
LuxClosure* triple = malloc(sizeof(LuxClosure));
triple->env = env;
triple->fn_ptr = (void*)lambda_triple;
```
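
The snippet above shows how a closure is built; a call site has to cast `fn_ptr` back to a concrete function type and pass `env` as the first argument. The following is a minimal sketch of that round trip, not the backend's actual output: the helpers `make_triple` and `call_int_closure` are hypothetical names introduced only for illustration.

```c
#include <stdint.h>
#include <stdlib.h>

typedef int64_t LuxInt;

typedef struct {
    void* env;     // Pointer to captured variables
    void* fn_ptr;  // Pointer to the function
} LuxClosure;

// Environment for the `triple` lambda: one captured variable.
typedef struct {
    LuxInt multiplier;
} Env_triple;

static LuxInt lambda_triple(void* _env, LuxInt x) {
    Env_triple* env = (Env_triple*)_env;
    return x * env->multiplier;
}

// Hypothetical helper: builds the closure for a given captured value.
static LuxClosure* make_triple(LuxInt multiplier) {
    Env_triple* env = malloc(sizeof(Env_triple));
    LuxClosure* closure = malloc(sizeof(LuxClosure));
    if (env == NULL || closure == NULL) abort();
    env->multiplier = multiplier;
    closure->env = env;
    closure->fn_ptr = (void*)lambda_triple;
    return closure;
}

// Hypothetical helper: calls a closure of Lux type fn(Int): Int.
// The cast restores the signature that was erased when storing into void*.
static LuxInt call_int_closure(const LuxClosure* closure, LuxInt arg) {
    LuxInt (*fn)(void*, LuxInt) = (LuxInt (*)(void*, LuxInt))closure->fn_ptr;
    return fn(closure->env, arg);
}
```

Storing a function pointer in a `void*` is accepted by gcc and clang on POSIX platforms but is not guaranteed by ISO C, which is worth keeping in mind if the backend ever targets other compilers.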

### Algebraic Data Types (ADTs)

ADTs compile to tagged unions:

```lux
type Option =
  | Some(Int)
  | None
```

Generates:
```c
typedef enum { Option_TAG_SOME, Option_TAG_NONE } Option_Tag;

typedef struct {
    Option_Tag tag;
    union {
        struct { LuxInt field0; } some;
        // None has no fields
    } data;
} Option;
```

**Pattern matching** compiles to if/else chains:

```lux
match opt {
    Some(x) => x,
    None => 0
}
```

Generates:
```c
LuxInt result;  // Receives the value of the match expression
if (opt.tag == Option_TAG_SOME) {
    LuxInt x = opt.data.some.field0;
    result = x;
} else if (opt.tag == Option_TAG_NONE) {
    result = 0;
}
```
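
To see the tagged-union layout and the if/else chain working together, here is a small self-contained sketch. The constructor functions `Option_some` and `Option_none` and the wrapper `unwrap_or_zero` are hypothetical names chosen for this example; the generated code may construct values differently.

```c
#include <stdint.h>

typedef int64_t LuxInt;

typedef enum { Option_TAG_SOME, Option_TAG_NONE } Option_Tag;

typedef struct {
    Option_Tag tag;
    union {
        struct { LuxInt field0; } some;
        // None has no fields
    } data;
} Option;

// Hypothetical constructor for Some(v).
static Option Option_some(LuxInt v) {
    Option o;
    o.tag = Option_TAG_SOME;
    o.data.some.field0 = v;
    return o;
}

// Hypothetical constructor for None; the payload is zeroed so that
// the struct never carries indeterminate bytes.
static Option Option_none(void) {
    Option o;
    o.tag = Option_TAG_NONE;
    o.data.some.field0 = 0;
    return o;
}

// match opt { Some(x) => x, None => 0 }
static LuxInt unwrap_or_zero(Option opt) {
    LuxInt result = 0;
    if (opt.tag == Option_TAG_SOME) {
        LuxInt x = opt.data.some.field0;  // Bind the pattern variable
        result = x;
    } else if (opt.tag == Option_TAG_NONE) {
        result = 0;
    }
    return result;
}
```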

### Lists

Lists are dynamic arrays with boxed elements:

```c
typedef struct {
    void** elements;  // Array of boxed elements
    int64_t length;
    int64_t capacity;
} LuxList;
```

Elements are boxed/unboxed at access time:
```c
void* lux_box_int(LuxInt n) {
    LuxInt* p = malloc(sizeof(LuxInt));
    *p = n;
    return p;
}

LuxInt lux_unbox_int(void* p) {
    return *(LuxInt*)p;
}
```
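
The `capacity` field implies that the array can grow. The sketch below shows one plausible way to allocate a list and append to it with capacity doubling. The body of `lux_list_new` is an assumption (only its name and signature appear in this document), and `lux_list_push` is a hypothetical helper that the real prelude may not have.

```c
#include <stdint.h>
#include <stdlib.h>

typedef int64_t LuxInt;

typedef struct {
    void** elements;  // Array of boxed elements
    int64_t length;
    int64_t capacity;
} LuxList;

static void* lux_box_int(LuxInt n) {
    LuxInt* p = malloc(sizeof(LuxInt));
    if (p == NULL) abort();
    *p = n;
    return p;
}

static LuxInt lux_unbox_int(void* p) {
    return *(LuxInt*)p;
}

// Assumed implementation: an empty list with room for `capacity` elements.
// A minimum capacity of 1 keeps the doubling step below well defined.
static LuxList* lux_list_new(int64_t capacity) {
    LuxList* list = malloc(sizeof(LuxList));
    if (list == NULL) abort();
    if (capacity < 1) capacity = 1;
    list->elements = malloc((size_t)capacity * sizeof(void*));
    if (list->elements == NULL) abort();
    list->length = 0;
    list->capacity = capacity;
    return list;
}

// Hypothetical helper: appends one boxed element, doubling capacity when full.
static void lux_list_push(LuxList* list, void* elem) {
    if (list->length == list->capacity) {
        int64_t new_capacity = list->capacity * 2;
        void** grown = realloc(list->elements, (size_t)new_capacity * sizeof(void*));
        if (grown == NULL) abort();
        list->elements = grown;
        list->capacity = new_capacity;
    }
    list->elements[list->length++] = elem;
}
```

Doubling keeps appends amortized constant time, at the cost of up to half the array being unused.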

**List operations** (map, filter, fold, etc.) generate inline loops:
```c
// List.map(nums, fn(x) => x * 2)
LuxList* result = lux_list_new(nums->length);
for (int64_t i = 0; i < nums->length; i++) {
    void* elem = nums->elements[i];
    LuxInt mapped = ((LuxInt(*)(void*, LuxInt))fn->fn_ptr)(fn->env, lux_unbox_int(elem));
    result->elements[i] = lux_box_int(mapped);
}
result->length = nums->length;
```
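
The same loop shape extends to the other higher-order operations. Below is a sketch of what `List.fold` and `List.filter` loops could look like, wrapped in functions so they can be exercised in isolation. The wrappers `fold_int` and `filter_int`, the helpers `list_with_capacity` and `list_from_ints`, and the sample lambdas are all hypothetical; the backend emits such loops inline rather than as functions.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

typedef int64_t LuxInt;
typedef bool LuxBool;

typedef struct { void* env; void* fn_ptr; } LuxClosure;
typedef struct { void** elements; int64_t length; int64_t capacity; } LuxList;

static void* lux_box_int(LuxInt n) {
    LuxInt* p = malloc(sizeof(LuxInt));
    if (p == NULL) abort();
    *p = n;
    return p;
}

static LuxInt lux_unbox_int(void* p) { return *(LuxInt*)p; }

// Hypothetical helper: empty list with room for at least one element.
static LuxList* list_with_capacity(int64_t capacity) {
    LuxList* list = malloc(sizeof(LuxList));
    if (list == NULL) abort();
    list->capacity = capacity > 0 ? capacity : 1;
    list->elements = malloc((size_t)list->capacity * sizeof(void*));
    if (list->elements == NULL) abort();
    list->length = 0;
    return list;
}

// Hypothetical helper: boxes `n` ints from a C array into a new list.
static LuxList* list_from_ints(const LuxInt* vals, int64_t n) {
    LuxList* list = list_with_capacity(n);
    for (int64_t i = 0; i < n; i++) list->elements[i] = lux_box_int(vals[i]);
    list->length = n;
    return list;
}

// Sample lambdas without captures: fn(acc, x) => acc + x and fn(x) => x % 2 == 0
static LuxInt lambda_add(void* _env, LuxInt acc, LuxInt x) { (void)_env; return acc + x; }
static LuxBool lambda_is_even(void* _env, LuxInt x) { (void)_env; return x % 2 == 0; }

// List.fold(nums, init, fn): thread an accumulator through the loop.
static LuxInt fold_int(const LuxList* nums, LuxInt init, const LuxClosure* fn) {
    LuxInt acc = init;
    for (int64_t i = 0; i < nums->length; i++) {
        LuxInt x = lux_unbox_int(nums->elements[i]);
        acc = ((LuxInt (*)(void*, LuxInt, LuxInt))fn->fn_ptr)(fn->env, acc, x);
    }
    return acc;
}

// List.filter(nums, pred): keep the elements for which pred returns true.
static LuxList* filter_int(const LuxList* nums, const LuxClosure* pred) {
    LuxList* result = list_with_capacity(nums->length);
    for (int64_t i = 0; i < nums->length; i++) {
        void* elem = nums->elements[i];
        LuxBool keep = ((LuxBool (*)(void*, LuxInt))pred->fn_ptr)(pred->env, lux_unbox_int(elem));
        if (keep) {
            // Share the existing box: values are immutable and never freed.
            result->elements[result->length++] = elem;
        }
    }
    return result;
}
```

Sharing boxes in `filter_int` is safe only while nothing is ever freed; once reference counting is introduced, each shared box would need its count incremented.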

## Current Limitations

### 1. Memory Leaks

**Nothing that is allocated is ever freed.** This includes:
- Closure environments
- ADT values
- List elements and arrays
- Strings from concatenation

This is acceptable for short-lived programs but not for long-running services.

### 2. Limited Effects

Only `Console.print` is supported, hardcoded to `printf`:

```c
static void lux_console_print(LuxString msg) {
    printf("%s\n", msg);
}
```

Other effects (File, Http, Random, etc.) are not yet implemented in the C backend.

### 3. If/Else Side Effects

The C backend uses ternary operators for if/else:

```c
(condition ? then_value : else_value)
```

**Problem:** If branches contain side effects (like `Console.print`), the code generator emits the side effects of both branches, so both execute at runtime. (The C ternary operator itself evaluates only one of its two branches; the issue lies in how the surrounding code is generated.)

**Workaround:** Use pure expressions in if/else branches, then print the result:
```lux
// Bad - both prints execute
if x > 0 then Console.print("positive") else Console.print("negative")

// Good - only one print
let msg = if x > 0 then "positive" else "negative"
Console.print(msg)
```
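
One possible fix, sketched here as an assumption rather than as the backend's planned design, is to lower an if/else with effectful branches to a C `if` statement, so that each branch's statements stay inside their own block. The function names `classify_lux` and `sign_label` and the `print_count` counter are hypothetical and exist only to make the behavior observable.

```c
#include <stdint.h>
#include <stdio.h>

typedef int64_t LuxInt;
typedef char* LuxString;

// Counts calls to the print helper; for demonstration only.
static int print_count = 0;

static void lux_console_print(LuxString msg) {
    printf("%s\n", msg);
    print_count++;
}

// Hypothetical statement-form lowering of:
//   if x > 0 then Console.print("positive") else Console.print("negative")
// Each side effect lives inside its branch, so exactly one print runs.
static void classify_lux(LuxInt x) {
    if (x > 0) {
        lux_console_print("positive");
    } else {
        lux_console_print("negative");
    }
}

// Hypothetical lowering of a value-producing if/else through a temporary:
//   let msg = if x > 0 then "positive" else "negative"
static LuxString sign_label(LuxInt x) {
    LuxString result;
    if (x > 0) {
        result = "positive";
    } else {
        result = "negative";
    }
    return result;
}
```

The temporary-variable form costs nothing after optimization and works for branches of any complexity, whereas the ternary form only works when both branches are single expressions.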

---

## Comparison with Other Languages

### Koka (Our Inspiration)

Koka also compiles to C with algebraic effects. Key differences:

| Aspect | Koka | Lux (current) |
|--------|------|---------------|
| Memory | Perceus RC | Leaks |
| Effects | Evidence passing (zero-cost) | Runtime lookup |
| Closures | Environment vectors | Heap-allocated structs |
| Maturity | Production-ready | Experimental |

### Rust

| Aspect | Rust | Lux |
|--------|------|-----|
| Target | LLVM | C |
| Memory | Ownership/borrowing | Leaks |
| Safety | Compile-time guaranteed | Runtime (interpreter) |
| Learning curve | Steep | Medium |

### Zig

| Aspect | Zig | Lux |
|--------|-----|-----|
| Target | LLVM | C |
| Memory | Manual with allocators | Leaks |
| Philosophy | Explicit control | High-level abstraction |

### Go

| Aspect | Go | Lux |
|--------|-----|-----|
| Target | Native | C |
| Memory | Concurrent GC | Leaks |
| Effects | None | Algebraic effects |
| Latency | Unpredictable (GC pauses) | Predictable (no GC) |

---

## Future Roadmap

### Phase 1: Evidence Passing (Zero-Cost Effects)

**Goal:** Eliminate runtime effect handler lookup.

**Current approach (slow):**
```rust
// O(n) search through handler stack
for handler in self.handler_stack.iter().rev() {
    if handler.effect == request.effect {
        return handler.invoke(request);
    }
}
```

**Evidence passing (fast):**
```c
typedef struct {
    Console* console;
    FileIO* fileio;
} Evidence;

void greet(Evidence* ev, const char* name) {
    ev->console->print(ev, name);  // Direct call, no search
}
```
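
The fragment above leaves `Console` undefined. A minimal way to fill it in, assuming each effect is a struct of function pointers, is sketched below. The two handlers (`console_print_stdout`, `console_print_capture`) and the `last_message` variable are hypothetical; they show that swapping a handler means swapping a pointer in the evidence struct, with no stack search at the call site.

```c
#include <stdio.h>

typedef struct Evidence Evidence;

// Assumed layout: one function pointer per effect operation.
typedef struct {
    void (*print)(Evidence* ev, const char* msg);
} Console;

// Left opaque here; only Console is exercised in this sketch.
typedef struct FileIO FileIO;

struct Evidence {
    Console* console;
    FileIO* fileio;
};

static void greet(Evidence* ev, const char* name) {
    ev->console->print(ev, name);  // Direct call, no search
}

// Hypothetical handler 1: write to stdout.
static void console_print_stdout(Evidence* ev, const char* msg) {
    (void)ev;
    printf("%s\n", msg);
}

// Hypothetical handler 2: record the message instead of printing it,
// as a test harness might do.
static const char* last_message = NULL;

static void console_print_capture(Evidence* ev, const char* msg) {
    (void)ev;
    last_message = msg;
}
```

A caller selects a handler by building the evidence once, for example `Console c = { console_print_stdout }; Evidence ev = { &c, NULL };`, and then threads `&ev` through every effectful function.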

**Expected speedup:** 10-20x for effect-heavy code.

### Phase 2: Perceus Reference Counting

**Goal:** Deterministic memory management without GC pauses.

Perceus is a reference counting scheme in which the compiler:
1. Inserts increment/decrement operations at precise points
2. Detects when values can be reused in place (FBIP, "functional but in-place")
3. Leaves no garbage behind for cycle-free data, without a runtime GC

**Example - reuse analysis:**
```lux
fn increment(xs: List<Int>): List<Int> =
    List.map(xs, fn(x) => x + 1)
```

If `xs` has refcount=1, the list can be mutated in-place instead of copied.
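
The refcount=1 check can be illustrated with a deliberately simplified runtime sketch. This is not Perceus itself (which inserts these operations at compile time) and not Lux's list layout: `RcIntList` stores unboxed ints for brevity, and every name here is hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>

typedef int64_t LuxInt;

// Hypothetical reference-counted list of unboxed ints.
typedef struct {
    int64_t rc;      // Number of live references
    int64_t length;
    LuxInt* items;
} RcIntList;

static RcIntList* rc_list_new(const LuxInt* vals, int64_t n) {
    RcIntList* xs = malloc(sizeof(RcIntList));
    if (xs == NULL) abort();
    xs->items = malloc((size_t)(n > 0 ? n : 1) * sizeof(LuxInt));
    if (xs->items == NULL) abort();
    for (int64_t i = 0; i < n; i++) xs->items[i] = vals[i];
    xs->rc = 1;
    xs->length = n;
    return xs;
}

// Take an additional reference.
static RcIntList* rc_list_dup(RcIntList* xs) {
    xs->rc++;
    return xs;
}

// Release one reference; free the list when the last one goes away.
static void rc_list_drop(RcIntList* xs) {
    if (--xs->rc == 0) {
        free(xs->items);
        free(xs);
    }
}

// List.map(xs, fn(x) => x + 1), consuming the caller's reference to xs.
static RcIntList* increment(RcIntList* xs) {
    if (xs->rc == 1) {
        // Unique owner: nobody else can observe the mutation, so reuse in place.
        for (int64_t i = 0; i < xs->length; i++) xs->items[i] += 1;
        return xs;
    }
    // Shared: build a fresh copy, then release our reference to the original.
    RcIntList* out = rc_list_new(xs->items, xs->length);
    for (int64_t i = 0; i < out->length; i++) out->items[i] += 1;
    rc_list_drop(xs);
    return out;
}
```

When the list is uniquely owned, `increment` returns the same pointer it received and allocates nothing; when it is shared, the original stays intact for its other owners.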

### Phase 3: More Effects

Implement C versions of:
- `File` (read, write, exists)
- `Http` (get, post)
- `Random` (int, bool)
- `Time` (now, sleep)

### Phase 4: JavaScript Backend

Compile Lux to JavaScript for browser/Node.js:
- Effects → Direct DOM/API calls
- No runtime needed
- Enables full-stack Lux development

---

## Implementation Details

### Name Mangling

Lux identifiers are mangled for C compatibility:

| Lux | C |
|-----|---|
| `foo` | `foo_lux` |
| `myFunction` | `myFunction_lux` |
| `List.map` | Inline code (not a function call) |

### Generated C Structure

```c
// 1. Includes and type definitions
#include <stdint.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef int64_t LuxInt;
// ... more types ...

// 2. Runtime helpers (string concat, list operations, etc.)
static LuxString lux_string_concat(LuxString a, LuxString b) { ... }
static LuxList* lux_list_new(int64_t capacity) { ... }
// ... more helpers ...

// 3. Forward declarations
void main_lux(void);

// 4. Closure/lambda definitions
static LuxInt lambda_1(void* _env, LuxInt x) { ... }

// 5. User-defined functions
void greet_lux(LuxString name) { ... }

// 6. Main function
void main_lux(void) { ... }

// 7. Entry point
int main(int argc, char** argv) {
    main_lux();
    return 0;
}
```

### Prelude Size

The generated C prelude is approximately 150 lines, including:
- Type definitions (~20 lines)
- String operations (~30 lines)
- List types and operations (~80 lines)
- Boxing/unboxing helpers (~20 lines)

---

## Testing the C Backend

```bash
# Compile and run
lux compile examples/hello.lux --run

# Compile to binary
lux compile examples/hello.lux -o hello
./hello

# View generated C (for debugging)
lux compile examples/hello.lux --emit-c

# Save C to file
lux compile examples/hello.lux --emit-c -o hello.c
```

---

## References

- [Perceus: Garbage Free Reference Counting](https://www.microsoft.com/en-us/research/publication/perceus-garbage-free-reference-counting-with-reuse/) - Microsoft Research
- [Generalized Evidence Passing for Effect Handlers](https://www.microsoft.com/en-us/research/publication/generalized-evidence-passing-for-effect-handlers/) - Koka's effect compilation
- [Koka Language](https://koka-lang.github.io/koka/doc/book.html) - Effect system language that compiles to C
- [Nim Backend Integration](https://nim-lang.org/docs/backends.html) - Another compile-to-C language

@@ -181,7 +181,6 @@ fn processAny(x: Int @latest): Int = x // any version
 
 ### Planned (Not Yet Fully Implemented)
 
-- **C Backend Lists**: Closures and pattern matching work, lists pending
 - **Auto-migration Generation**: Migration bodies stored, execution pending
 
 ---
@@ -234,7 +233,6 @@ Quick iteration with type inference and a REPL.
 
 | Limitation | Description |
 |------------|-------------|
-| **Limited C Backend** | Functions, closures, ADTs work; lists pending |
 | **No Package Manager** | Can't share/publish packages yet |
 | **New Paradigm** | Effects require learning new concepts |
 | **Small Ecosystem** | No community packages yet |
@@ -371,13 +369,13 @@ Values + Effects C Code → GCC/Clang
 - ✅ Console.readLine and Console.readInt
 - ✅ C Backend (basic functions, Console.print)
 - ✅ C Backend closures and pattern matching
+- ✅ C Backend lists (all 16 operations)
 - ✅ Watch mode / hot reload
 - ✅ Formatter
 
 **In Progress:**
-1. **C Backend Lists** - List operations pending
-2. **Schema Evolution** - Type system integration, auto-migration
-3. **Error Message Quality** - Context lines shown, suggestions partial
+1. **Schema Evolution** - Type system integration, auto-migration
+2. **Error Message Quality** - Context lines shown, suggestions partial
 
 **Planned:**
 4. **SQL Effect** - Database access

@@ -223,7 +223,7 @@
 | C backend (basic) | P1 | — | ✅ Complete (functions, Console.print) |
 | Extend C backend (closures) | P1 | — | ✅ Complete |
 | Extend C backend (pattern matching) | P1 | — | ✅ Complete |
-| Extend C backend (lists) | P1 | 1 week | ❌ Missing |
+| Extend C backend (lists) | P1 | — | ✅ Complete |
 | JS backend | P2 | 4 weeks | ❌ Missing |
 | WASM backend | P3 | 4 weeks | ❌ Missing |