v0.1.0-dev

A minimal, hackable systems language for learning explicit low-level programming. Designed as a stepping stone before C, C++, or lower-level systems work.

What is cnegative?

cnegative keeps manual control, reduces hidden behavior, and prefers words over symbolic shortcuts when that improves clarity. It compiles to native code through LLVM.

The compiler ships today with structured diagnostics, a typed IR stage, LLVM IR emission, and object emission plus binary linking through the host Clang toolchain.

Current status

This is v0.1.0-dev. The language and compiler are under active development. The surface is intentionally small.

Core rules at a glance

  • Semicolons required for import declarations, simple statements, and struct fields.
  • Non-void functions must use explicit return on every path.
  • Conditions must be actual bool values — no implicit integer truthiness.
  • Visibility is explicit: pfn and pstruct for public exports.

No implicit truthiness

if x {} is rejected when x is an int. Write if x > 0 {} instead.

Quick example

cneg
// hello.cneg
fn:void main() {
    let name:str = input();
    print("hello, ");
    print(name);
    free name;
}

shell
$ cnegc build hello.cneg build/hello
$ ./build/hello
alice
hello, alice

Platform support

| Feature | Linux x86_64 | macOS arm64 | Windows x86_64 |
|---|---|---|---|
| Compiler (C) | YES | YES | YES |
| LLVM path | YES | YES | YES |
| Lexer hot-path ASM | YES | YES | fallback C |
| Prebuilt release | YES | YES | YES |

Install the prebuilt cnegc binary from GitHub Releases. No build tools required for checking and IR dumping. Native output needs Clang.

Download

Go to github.com/cnegative/cnegative/releases and pick the archive for your platform:

| Platform | Archive |
|---|---|
| Linux | cnegc-<tag>-linux-x86_64.zip |
| macOS | cnegc-<tag>-macos-arm64.zip |
| Windows | cnegc-<tag>-windows-x86_64.zip |

Linux & macOS

shell
# On macOS, unzip the macos-arm64 archive instead
unzip cnegc-v0.1.0-dev-linux-x86_64.zip
mkdir -p "$HOME/.local/bin"
cp release/cnegc/cnegc "$HOME/.local/bin/"
chmod +x "$HOME/.local/bin/cnegc"

If ~/.local/bin isn't on your PATH, add this to your shell config:

shell
export PATH="$HOME/.local/bin:$PATH"

Windows

powershell
# Extract to a stable folder, e.g. C:\tools\cnegc\
# Add that folder to your user PATH
cnegc.exe  # verify it works

What's in the release

  • cnegc / cnegc.exe
  • LICENSE, README.md
  • docs/how-to-run-and-build.md

Native output requires clang

check, ir, and llvm-ir need only cnegc. To emit objects or link binaries, you also need clang-18 or clang on your PATH.

Write, check, and run your first cnegative program from scratch.

Your first function

Create add.cneg:

cneg
fn:int add(a:int, b:int) {
    return a + b;
}

fn:int main() {
    let result:int = add(2, 3);
    print(result);
    return 0;
}

Check it (no Clang needed):

shell
cnegc check add.cneg

Build and run:

shell
cnegc build add.cneg build/add
./build/add
5

Using result types

Fallible operations return result T. The .value field is only accessible after a guard:

cneg
fn:result int divide(a:int, b:int) {
    if b == 0 {
        return err;
    }
    return ok a / b;
}

fn:int main() {
    let r:result int = divide(10, 2);
    if r.ok {
        print(r.value);  // only valid inside this guard
    }
    return 0;
}

Unguarded .value is rejected

Accessing r.value without a preceding if r.ok guard is rejected at compile time with error E3024.
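For readers heading toward C, the guard rule can be pictured with a plain tagged struct. The sketch below is only an analogy: the names ResultInt, divide_checked, and unwrap are hypothetical, and nothing here claims to show how cnegc represents or lowers result values. It illustrates why the guard matters: C accepts an unguarded read of the payload, so the check has to happen at run time instead of at compile time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical C analogue of `result int`: a flag plus a payload. */
typedef struct {
    bool    ok;     /* true when value holds a meaningful result */
    int64_t value;  /* only meaningful when ok is true */
} ResultInt;

/* Mirrors divide() above: a zero divisor yields the err case. */
static ResultInt divide_checked(int64_t a, int64_t b) {
    ResultInt r = { false, 0 };
    if (b == 0) {
        return r;       /* return err; */
    }
    r.ok = true;
    r.value = a / b;    /* return ok a / b; */
    return r;
}

/* C cannot reject an unguarded read at compile time,
   so this helper enforces the guard with a run-time assert. */
static int64_t unwrap(ResultInt r) {
    assert(r.ok);
    return r.value;
}
```

Calling unwrap(divide_checked(10, 2)) yields 5, while unwrap(divide_checked(1, 0)) trips the assert. In cnegative the second case would not compile at all without an if r.ok guard.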

Importing modules

cneg
// shapes.cneg
pstruct Point {
    x:int;
    y:int;
}

pfn:Point make_point(x:int, y:int) {
    return Point { x: x, y: y };
}

cneg
// main.cneg
import shapes as s;

fn:int main() {
    let p:s.Point = s.make_point(3, 4);
    print(p.x);
    return 0;
}

Build cnegc from source or use an existing binary. Covers prerequisites, build systems, all compiler commands, and the lexer benchmark.

Using a prebuilt binary

| Command | Needs Clang? | Output |
|---|---|---|
| cnegc check <file> | No | diagnostics only |
| cnegc ir <file> | No | typed IR text |
| cnegc llvm-ir <file> | No | LLVM IR text |
| cnegc obj <file> [out] | Yes | .o object file |
| cnegc build <file> [out] | Yes | linked binary |
| cnegc bench-lexer <file> N | No | timing output |

Build from source

Prerequisites

  • A C compiler available as cc (for Make) or any C compiler (for CMake)
  • make or CMake 3.20+
  • clang-18 or clang in PATH for the full test suite
  • bash for make test

With make

shell
make          # produces build/cnegc
make test     # runs the full test suite

With CMake

shell
cmake -S . -B out
cmake --build out   # produces out/build/cnegc
ctest --test-dir out --output-on-failure

llvm-as is optional

Smoke tests use llvm-as-18 or llvm-as when available, and fall back to clang -c -x ir otherwise.

Project rule: line cap

No source file may exceed 3000 lines. Run make check-lines to verify before committing.
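How make check-lines is implemented is not shown here, so the following is only a minimal sketch of what such a check could look like in C. The names LINE_CAP, count_lines, and within_line_cap are hypothetical, and counting a final unterminated line as a line is an assumption of this sketch.

```c
#include <assert.h>
#include <stdio.h>

#define LINE_CAP 3000L  /* the project cap stated above */

/* Returns the number of lines in the file at `path`, or -1 if it
   cannot be opened. A final line without a trailing newline still
   counts as one line. */
static long count_lines(const char *path) {
    assert(path != NULL);
    FILE *f = fopen(path, "r");
    if (f == NULL) {
        return -1;
    }
    long lines = 0;
    int c;
    int last = '\n';  /* an empty file has zero lines */
    while ((c = fgetc(f)) != EOF) {
        if (c == '\n') {
            lines++;
        }
        last = c;
    }
    if (last != '\n') {
        lines++;
    }
    fclose(f);
    return lines;
}

/* Nonzero when the file is readable and does not exceed the cap. */
static int within_line_cap(const char *path) {
    long n = count_lines(path);
    return n >= 0 && n <= LINE_CAP;
}
```

A driver would call within_line_cap on each source file and exit with a nonzero status if any call returns 0.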

Complete syntax and semantics reference. cnegative is explicit, readable, and low-level — manual control with reduced hidden behavior.

Functions

The return type comes after the colon in fn:type. Functions are private by default; use pfn to export one.

cneg
fn:int main() {
    return 0;
}

pfn:int add(a:int, b:int) {
    return a + b;
}

Variables

Bindings are immutable by default. Add mut to allow reassignment. An explicit type annotation is required.

cneg
let x:int = 10;
let mut y:int = 20;
y = 30;  // ok — y is mut

Primitive types

| Type | Description |
|---|---|
| int | 64-bit signed integer |
| bool | Boolean (true / false) |
| str | UTF-8 string |
| void | No return value |
| ptr T | Pointer to T |
| result T | Fallible value (ok or err) |

Control flow

Conditions must be bool. No implicit integer truthiness.

cneg
// if / else
if x > 5 {
    print(x);
}

// while
while x < 10 {
    x = x + 1;
}

// range for
for i:int in 0..10 {
    print(i);
}

// infinite loop
loop {
}

Structs

Use pstruct to export. Fields end with ;.

cneg
pstruct Point {
    x:int;
    y:int;
}

fn:int main() {
    let p:Point = Point { x: 1, y: 2 };
    return p.x;
}

Pointers

cneg
let mut x:int = 10;
let p:ptr int = addr x;
p.value = 11;

let heap:ptr int = alloc int;
heap.value = 42;
free heap;
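Since cnegative is meant as a stepping stone toward C, it may help to see the same two snippets written in C. This is a rough side-by-side, not a statement about how cnegc lowers pointers; the function names are hypothetical, int64_t stands in for the 64-bit int, and the malloc failure check is a C concern that the cnegative snippet above does not address.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Stack case: take the address of a local and write through it. */
static int64_t write_through_pointer(void) {
    int64_t x = 10;      /* let mut x:int = 10;       */
    int64_t *p = &x;     /* let p:ptr int = addr x;   */
    *p = 11;             /* p.value = 11;             */
    assert(x == 11);     /* the write went through p  */
    return x;
}

/* Heap case: allocate one int, store into it, then release it. */
static int64_t heap_round_trip(void) {
    int64_t *heap = malloc(sizeof *heap);  /* alloc int */
    if (heap == NULL) {
        return 0;        /* allocation failed */
    }
    *heap = 42;          /* heap.value = 42; */
    int64_t seen = *heap;
    free(heap);          /* free heap; */
    return seen;
}
```

The word-based forms addr, alloc, and .value map onto C's &, malloc, and unary *, which is the kind of symbolic shortcut cnegative spells out.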

Source goes through a multi-stage pipeline. Each stage can be inspected independently with the CLI.

Stages

Source → Lexer → Parser → AST → Sema → Typed IR → LLVM IR → Object → Binary

| Stage | CLI flag | Source dir |
|---|---|---|
| Lexer | | src/lex/ |
| Parser + AST | | src/parse/ |
| Semantic analysis | check | src/sema/ |
| Typed IR | ir | src/sema/ |
| LLVM IR | llvm-ir | backend |
| Object file | obj | via clang |
| Binary | build | via clang |

Directory layout

text
include/cnegative/   — public compiler headers
src/support/         — memory, source, diagnostics
src/lex/             — token and lexer logic
src/parse/           — AST and parser
src/sema/            — semantic checking
src/cli/             — command entry point
src/asm/             — profiled hot-path assembly
cmake/               — build and test scripts
.github/workflows/   — CI workflows

After semantic analysis, checked source is lowered to a structured Typed IR before any LLVM work begins.

Properties

  • Independent IR node types — not reusing the parser AST.
  • Canonical module-qualified function and struct names.
  • Explicit return statements preserved from source.
  • Structured control flow preserved for if, while, loop, and range for.
  • No SSA, basic blocks, or LLVM-specific details at this stage.
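To make "structured control flow, no basic blocks" concrete, here is a small C sketch of a tree-shaped statement node. The real cnegc IR types are not shown in this document, so every name below (IrStmtKind, IrStmt, ir_count) is hypothetical, and conditions are kept as plain text only to keep the example short.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical statement kinds for a structured IR. */
typedef enum { IR_RETURN, IR_IF, IR_WHILE, IR_LOOP } IrStmtKind;

typedef struct IrStmt IrStmt;
struct IrStmt {
    IrStmtKind  kind;
    const char *cond;  /* condition text, NULL for return and loop */
    IrStmt     *body;  /* first statement of the nested block, or NULL */
    IrStmt     *next;  /* next statement in the same block, or NULL */
};

/* Counts statements, nested ones included, by walking the tree
   directly. There are no basic blocks or jump targets to follow. */
static size_t ir_count(const IrStmt *s) {
    size_t n = 0;
    for (; s != NULL; s = s->next) {
        /* In this sketch a return carries no nested block. */
        assert(s->kind != IR_RETURN || s->body == NULL);
        n += 1 + ir_count(s->body);
    }
    return n;
}
```

A while node that contains an if, which in turn contains a return, stays a three-level tree here. Flattening it into blocks and branches would be the job of the later LLVM lowering step.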

Dump the IR

shell
cnegc ir examples/valid_imported_structs.cneg

Example output

text
module valid_imported_structs (...) {
    fn valid_imported_structs.main() -> int {
        let p:shapes.Point = shapes.make_point(3, 4);
        return w.point.y;
    }
}

Purpose of this stage

Typed IR stabilizes typing and symbol resolution before control-flow lowering and LLVM emission. It is a checkpoint, not an optimization stage.

The LLVM backend lowers Typed IR to textual LLVM IR, then uses the host Clang toolchain to emit objects and link binaries.

CLI

shell
cnegc llvm-ir examples/valid_llvm_backend.cneg
cnegc obj     examples/valid_basic.cneg
cnegc build   examples/valid_basic.cneg

Supported lowering

  • int, bool, str, arrays, structs, ptr, and result types.
  • Local bindings with mutable reassignment.
  • Arithmetic and comparison operators.
  • Short-circuit && and ||.
  • if, while, loop, and range for.
  • Local function calls and imported module function calls.
  • Struct literals, array literals, field access, indexing.
  • alloc, addr, deref, free, ok, err, guarded .value.
  • print(...), input(), and string equality via embedded runtime helpers.
  • Host-native target triple — not hardcoded to Linux.

Runtime notes

input() ownership

input() trims the trailing newline and returns a heap-allocated owned copy. Use free to release it. Freeing string literals is a safe no-op.

String equality uses strcmp — content-based, not pointer identity.
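The embedded runtime helpers themselves are not shown in this document. As a hedged illustration of the behavior described above, the C sketch below reads one line, drops the trailing newline, and hands back a heap copy that the caller owns. The names read_line_owned and str_equal are hypothetical, and the sketch does not model the rule that freeing a string literal is a no-op.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Reads one line from `in`, drops the trailing newline, and returns
   a heap-allocated copy that the caller must free. Returns NULL at
   end of input or when allocation fails. */
static char *read_line_owned(FILE *in) {
    assert(in != NULL);
    size_t cap = 64;
    size_t len = 0;
    char *buf = malloc(cap);
    if (buf == NULL) {
        return NULL;
    }
    int c;
    while ((c = fgetc(in)) != EOF && c != '\n') {
        if (len + 1 == cap) {            /* keep room for the terminator */
            char *grown = realloc(buf, cap * 2);
            if (grown == NULL) {
                free(buf);
                return NULL;
            }
            buf = grown;
            cap *= 2;
        }
        buf[len++] = (char)c;
    }
    if (c == EOF && len == 0) {          /* nothing left to read */
        free(buf);
        return NULL;
    }
    buf[len] = '\0';
    return buf;
}

/* Content-based equality: compares characters, not addresses. */
static int str_equal(const char *a, const char *b) {
    return strcmp(a, b) == 0;
}
```

Two separately read lines with the same text compare equal under str_equal even though their pointers differ, which is the distinction the strcmp note draws.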

Unsupported lowering operations report E3021 before any LLVM IR text is printed.

cnegc diagnostics use stable error codes so errors can be documented and referenced consistently. All diagnostics show source path, line, and column.

Parse errors (E1xxx)

| Code | Description |
|---|---|
| E1001 | Expected token missing in current grammar position |
| E1002 | Unexpected token for current grammar rule |
| E1003 | Invalid type syntax |
| E1004 | Invalid character during lexing |
| E1005 | Unterminated string literal |

Semantic errors (E3xxx)

| Code | Description |
|---|---|
| E3001 | Duplicate function name |
| E3002 | Unknown name |
| E3003 | Duplicate local binding in the same scope |
| E3004 | Type mismatch |
| E3005 | Control-flow condition is not bool |
| E3006 | Assignment to immutable binding |
| E3007 | Non-void function does not return on every path |
| E3008 | Incorrect function call arity |
| E3009 | Unknown or invalid field access |
| E3010 | Invalid indexing target |
| E3011 | Array literal size mismatch |
| E3012 | Unknown declared type or module-qualified type without matching import |
| E3013 | Duplicate struct name |
| E3014 | Invalid call target or module-as-value usage |
| E3015 | err used without an expected result type |
| E3016 | Duplicate import alias |
| E3017 | Module file could not be resolved or loaded |
| E3018 | Cyclic module import |
| E3019 | free requires a pointer or string value |
| E3020 | Internal typed IR lowering invariant failed |
| E3021 | LLVM backend does not support the requested feature yet |
| E3022 | External backend toolchain step failed |
| E3023 | Public API exposes a private type |
| E3024 | result.value used without a proven-ok guard |

Diagnostic style

  • Show source path, line, and column.
  • One clear primary sentence per diagnostic.
  • Prefer describing both expected and actual types for mismatches.
  • Reject ambiguous truthiness in conditions with E3005.
  • Report missing struct fields directly at the literal or access site.
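The style rules above can be pictured as a small record plus a formatter. The exact text layout cnegc prints is not shown in this document, so the "path:line:column: CODE: message" shape below is an assumption, as are the names Diagnostic, is_parse_error, and format_diagnostic.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical diagnostic record; field names are illustrative. */
typedef struct {
    const char *code;     /* stable code such as "E3005" */
    const char *path;     /* source path */
    int         line;     /* line number (assumed 1-based) */
    int         column;   /* column number (assumed 1-based) */
    const char *message;  /* one clear primary sentence */
} Diagnostic;

/* Parse errors are the E1xxx family; semantic errors are E3xxx. */
static int is_parse_error(const Diagnostic *d) {
    assert(d != NULL);
    return strncmp(d->code, "E1", 2) == 0;
}

/* Writes "path:line:column: CODE: message" into `out`.
   The layout is an assumption. Returns what snprintf returns. */
static int format_diagnostic(char *out, size_t out_size,
                             const Diagnostic *d) {
    assert(d != NULL);
    return snprintf(out, out_size, "%s:%d:%d: %s: %s",
                    d->path, d->line, d->column, d->code, d->message);
}
```

Keeping the code as a separate field means the message wording can be revised without changing the identifier that users search for.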

Enforced rules that keep the compiler codebase consistent and auditable.

Implementation rules

  • Compiler and tooling code is written in C.
  • Performance-critical hot paths are reserved for assembly once profiling proves they matter.
  • No source file may exceed 3000 lines.
  • Developer-facing memory leak tracking must be enabled from the start.
  • Diagnostics must be specific, stable, and documented.
  • Statement-terminator rules stay explicit: semicolons required for simple statements.

Enforced checks

shell
make check-lines  # rejects files over the 3000-line cap

The compiler uses a tracked allocator and prints live allocations on shutdown if any memory is left unreleased.
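As a rough illustration of the idea, a tracked allocator can be as small as a counter wrapped around malloc and free. This is a sketch under that assumption, not cnegc's allocator: the names tracked_alloc, tracked_free, and report_live_allocations are hypothetical, and a real tracker would likely record call sites rather than only a count.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Number of allocations handed out and not yet released. */
static size_t live_allocations = 0;

/* malloc wrapper that counts successful allocations. */
static void *tracked_alloc(size_t size) {
    void *p = malloc(size);
    if (p != NULL) {
        live_allocations++;
    }
    return p;
}

/* free wrapper that keeps the counter in step; NULL is ignored. */
static void tracked_free(void *p) {
    if (p == NULL) {
        return;
    }
    assert(live_allocations > 0);
    live_allocations--;
    free(p);
}

/* Prints a leak line when anything is still live and returns the
   count, so a shutdown hook or test can fail on a nonzero result. */
static size_t report_live_allocations(FILE *out) {
    if (live_allocations != 0) {
        fprintf(out, "leak: %zu allocation(s) still live\n",
                live_allocations);
    }
    return live_allocations;
}
```

Calling report_live_allocations(stderr) just before exit gives the "print live allocations on shutdown" behavior in its simplest form.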

Near-term compiler work. This list reflects the current priorities, not a release schedule.

Up next

  1. String ownership story — generalize beyond input() so ownership is explicit for more producers.
  2. Module-level constants — add constants and finish visibility rules for exported symbols beyond functions and structs.
  3. Parser recovery — improve recovery so one syntax mistake does not cascade into follow-on errors across the file.
  4. Standard library surface — add more backend and runtime coverage for a richer stdlib.
  5. IR optimization passes — introduce optimization on typed IR before LLVM lowering.

Contributions welcome

Open an issue or pull request on GitHub to discuss ideas or report bugs.