QuinLang (QL)
QuinLang is a tiny, C-style language and compiler that I built to learn about parsing, type checking, and code generation from the ground up.
The project now has two backends:
- An original 8086/DOS backend that spits out real-mode .COM binaries and runs them under DOSBox.
- A newer, self-contained QuinVM bytecode interpreter written in Python, which is the default and requires no external tools.
The language is intentionally small but surprisingly capable:
- int, bool, str, ptr, and fixed-size stack arrays int[N]
- Functions with parameters and int/void returns
- if/else, while
- Arithmetic, comparisons, and short-circuit &&/||
- Built-in print/println
- Pointer intrinsics: load16, store16, memcpy, memset
- Array helpers: array_push, array_pop
This README walks through how to build and run QL code, what the language supports, and how the compilation pipeline is wired up.
Getting started
Prerequisites
- Python 3.10+ (for the QuinVM backend)
The old 8086 backend additionally expects:
- NASM
- DOSBox-X
You don’t need those if you only care about running on QuinVM, which is the default path now.
Running a QL program on the VM
From the project root:
python -m compiler.driver_vm examples/hello.ql
You should see output similar to:
30201043211111789000
Hello
My Name Is Nathan1
2
That program exercises arrays, array_push / array_pop, pointers, memory intrinsics, printing, and boolean logic all in one go.
If you just want a sanity check:
python -m compiler.driver_vm examples/vm_test.ql
# -> 42
(Optional) Running via the 8086/DOS backend
If you have NASM and DOSBox-X installed and want to see real-mode .COM binaries in action, there is still a script for that in the repo (8086 backend). The VM is the preferred path going forward, but the 8086 codegen is kept around as a reference and for fun.
Language overview
QuinLang is intentionally close to “baby C” with a few constraints to keep everything manageable.
Types
Supported types:
- int – 16-bit signed integer
- bool – boolean (true/false), stored in a 16-bit slot
- str – interned string id in the VM; printed via a string table
- ptr – generic 16-bit pointer (used with load16/store16/memcpy/memset)
- void – no value
- int[N] – fixed-size array of N int elements stored on the stack
There’s no heap or dynamic allocation. All arrays are fixed-size and live in the current stack frame.
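To make the 16-bit signed range concrete, here is a minimal Python sketch of two's-complement wraparound. The helper name to_int16 is hypothetical, and the README does not say whether QuinVM actually wraps on overflow, so treat this as an illustration of the value range rather than a description of the VM.

```python
def to_int16(value: int) -> int:
    """Wrap an arbitrary Python int into the signed 16-bit range."""
    value &= 0xFFFF  # keep only the low 16 bits
    # Values with the top bit set represent negative numbers.
    return value - 0x10000 if value & 0x8000 else value


print(to_int16(32767))      # 32767, the largest 16-bit signed value
print(to_int16(32767 + 1))  # -32768, one past the maximum wraps around
print(to_int16(-1))         # -1
```

Real 8086 arithmetic behaves this way in a 16-bit register; a Python-hosted VM only does so if it masks results explicitly.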
Functions
fn name(param1: Type1, param2: Type2, ...): ReturnType {
// statements
}
- name is the function identifier.
- Parameters are name: Type pairs separated by commas.
- ReturnType can be omitted for void functions.
- The entry point must be:
fn main(): int {
// ...
return 0;
}
Variables and statements
Variable declarations:
let x: int;
let x: int = 42;
let msg: str;
let p: ptr;
let a: int[3];
Assignments:
x = 5;
x = x + 1;
a[0] = 10;
a[i] = x;
if / else and while look like you would expect:
if (condition) {
// then
} else {
// else (optional)
}
while (condition) {
// body
}
condition must type-check as bool. Under the hood, true and false are just 1 and 0.
Return statements:
return;
return 0;
return x + 1;
- In void functions, return; just exits.
- In non-void functions, you must return an expression of the right type.
Any bare expression followed by ; is an expression statement:
some_fn();
array_push(a, len, 10);
Expressions
Literals:
123
0xFF // hex int
"Hello" // string literal (VM backend)
true
false
Variables and identifiers:
x
my_var
Unary operators:
-x // arithmetic negation
!flag // logical not
&x // address-of (see pointers below)
Binary arithmetic:
x + y
x - y
x * y
x / y
Binary comparisons (produce bool):
x == y
x != y
x < y
x <= y
x > y
x >= y
Logical operators with short-circuit:
za && zb
za || zb
- && and || are short-circuiting, so the right-hand side is only evaluated when needed.
- Precedence is ! > comparisons > && > ||, roughly C-like.
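The short-circuit rule can be modelled in a few lines of Python. This is only a sketch of the semantics, using a hypothetical tuple-based expression tree and an evaluate helper that are not part of the QuinLang compiler. It records which variables were actually read.

```python
def evaluate(node, env, trace):
    """Evaluate a tiny tuple-based expression tree, logging each variable read."""
    kind = node[0]
    if kind == "var":
        trace.append(node[1])
        return env[node[1]]
    if kind == "not":
        return not evaluate(node[1], env, trace)
    if kind == "and":
        # The right operand is skipped when the left is already false.
        return evaluate(node[1], env, trace) and evaluate(node[2], env, trace)
    if kind == "or":
        # The right operand is skipped when the left is already true.
        return evaluate(node[1], env, trace) or evaluate(node[2], env, trace)
    raise ValueError(f"unknown node kind: {kind}")


# za && zb with za = false: zb is never read.
trace = []
result = evaluate(("and", ("var", "za"), ("var", "zb")),
                  {"za": False, "zb": True}, trace)
print(result, trace)  # False ['za']
```

In generated code the same effect usually comes from a conditional jump over the right-hand side, which is presumably what JZ/JNZ are used for in the VM backend.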
Function calls:
foo();
bar(1, 2);
println(42);
Arrays
Arrays are fixed-size int[N] values stored directly on the stack. There is no resizing or reallocation.
Declare and use an array:
fn main(): int {
let a: int[3];
let i: int;
a[0] = 10;
a[1] = 20;
a[2] = 30;
i = 0;
while (i < 3) {
println(a[i]);
i = i + 1;
}
return 0;
}
Key points:
- Type is always int[N] for some literal N.
- Indexing uses array[index] syntax.
- Index must be an int.
- There’s currently no bounds checking.
array_push / array_pop
Arrays are paired with an explicit len: int that tracks how many slots are “in use”:
let arr: int[3];
let len: int;
let v: int;
len = 0;
len = array_push(arr, len, 10);
len = array_push(arr, len, 20);
len = array_push(arr, len, 30);
v = array_pop(arr, len);
len = len - 1;
println(v); // 30
- array_push(arr, len, value) writes value to arr[len] and returns len + 1.
- array_pop(arr, len) returns arr[len - 1].
- You are responsible for tracking len and staying within the fixed capacity of the array.
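The contract of the two helpers is small enough to restate as a Python model. This mirrors the rules described above and is not the VM's actual implementation. The parameter is called length here only to avoid shadowing Python's built-in len.

```python
def array_push(arr: list[int], length: int, value: int) -> int:
    """Write value at arr[length] and return the new length."""
    arr[length] = value
    return length + 1


def array_pop(arr: list[int], length: int) -> int:
    """Return the last in-use element; the caller decrements the length."""
    return arr[length - 1]


arr = [0, 0, 0]  # stands in for `let arr: int[3];`
length = 0
length = array_push(arr, length, 10)
length = array_push(arr, length, 20)
length = array_push(arr, length, 30)
v = array_pop(arr, length)
length -= 1
print(v, length)  # 30 2
```

One difference at the edges: a Python list raises IndexError when you push past capacity, whereas QL does no bounds checking at all.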
Pointers and memory intrinsics
QuinLang exposes a very small pointer/memory API. Pointers are just 16-bit integers with type ptr.
Taking addresses:
let x: int;
let a: int[3];
let p: ptr;
p = &x; // pointer to scalar
p = &a[1]; // pointer to array element
Low-level memory helpers:
// Read a 16-bit word
load16(p: ptr): int
// Write a 16-bit word
store16(p: ptr, value: int): void
// Copy raw bytes
memcpy(dst: ptr, src: ptr, count: int): void
// Fill raw bytes
memset(dst: ptr, value: int, count: int): void
Example of using them together:
fn main(): int {
let a: int;
let b: int;
let pa: ptr;
let pb: ptr;
a = 1234;
b = 0;
pa = &a;
pb = &b;
store16(pa, 4321);
println(a); // 4321
store16(pb, 1111);
println(b); // 1111
let buf1: int[3];
let buf2: int[3];
buf1[0] = 7;
buf1[1] = 8;
buf1[2] = 9;
// 3 ints * 2 bytes = 6 bytes
memcpy(&buf2[0], &buf1[0], 6);
println(buf2[0]); // 7
println(buf2[1]); // 8
println(buf2[2]); // 9
memset(&buf2[0], 0, 6);
println(buf2[0]); // 0
println(buf2[1]); // 0
println(buf2[2]); // 0
return 0;
}
On the 8086 backend these truly operate on bytes. On QuinVM they operate on a locals array, but the surface semantics are the same.
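For intuition about the byte-level behaviour, here is a rough Python model of the four intrinsics over a flat bytearray. The memory buffer, its size, and the use of plain byte offsets as pointers are assumptions for illustration. Words are stored little-endian, as on the 8086. This is not how runtime/vm.py implements them.

```python
import struct

memory = bytearray(64)  # hypothetical flat memory; a pointer is a byte offset


def store16(p: int, value: int) -> None:
    struct.pack_into("<h", memory, p, value)  # little-endian signed 16-bit


def load16(p: int) -> int:
    return struct.unpack_from("<h", memory, p)[0]


def memcpy(dst: int, src: int, count: int) -> None:
    # Slicing copies first, so overlapping ranges are handled safely here.
    memory[dst:dst + count] = memory[src:src + count]


def memset(dst: int, value: int, count: int) -> None:
    memory[dst:dst + count] = bytes([value & 0xFF]) * count


buf1, buf2 = 0, 16  # two 3-element int buffers at arbitrary offsets
for i, v in enumerate((7, 8, 9)):
    store16(buf1 + 2 * i, v)

memcpy(buf2, buf1, 6)  # 3 ints * 2 bytes = 6 bytes
print([load16(buf2 + 2 * i) for i in range(3)])  # [7, 8, 9]

memset(buf2, 0, 6)
print([load16(buf2 + 2 * i) for i in range(3)])  # [0, 0, 0]
```

As in the QL example, the count arguments are in bytes, not elements.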
Compilation pipeline
At a high level, the compiler does this:
- Lexing – Convert the source text into a stream of tokens (fn, identifiers, numbers, {, }, etc.).
- Parsing – Turn tokens into an abstract syntax tree (AST) of expressions and statements.
- Semantic analysis –
  - Resolve variable and function names.
  - Enforce type rules (e.g. int vs bool, correct function arguments, array element types).
  - Attach types to every expression node.
- Code generation – Lower the typed AST into either:
  - 16-bit 8086 assembly (original backend), or
  - A compact QuinVM bytecode sequence.
- Execution –
  - For the VM backend, the Python QuinVM interpreter runs the bytecode directly.
  - For the 8086 backend, the generated assembly is assembled and run under DOSBox.
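To give a feel for the first stage, here is a small regex-based tokenizer for a QL-like subset. It is a sketch only: the token kinds, the KEYWORDS set, and the tokenize function are assumptions made for this example and do not reflect what compiler/lexer.py does. It also ignores // comments and string escape sequences.

```python
import re

TOKEN_RE = re.compile(r"""
    (?P<ws>\s+)
  | (?P<number>0[xX][0-9a-fA-F]+|\d+)
  | (?P<string>"[^"\n]*")
  | (?P<ident>[A-Za-z_]\w*)
  | (?P<op>&&|\|\||==|!=|<=|>=|[-+*/<>=!&(){}\[\];:,])
""", re.VERBOSE)

# Assumed keyword set; type names such as int are left as identifiers here.
KEYWORDS = {"fn", "let", "if", "else", "while", "return", "true", "false"}


def tokenize(source: str) -> list[tuple[str, str]]:
    """Split source text into (kind, text) pairs, skipping whitespace."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        pos = match.end()
        kind = match.lastgroup
        if kind == "ws":
            continue
        text = match.group()
        if kind == "ident" and text in KEYWORDS:
            kind = "keyword"
        tokens.append((kind, text))
    return tokens


for token in tokenize("let x: int = 0xFF;"):
    print(token)
```

Two-character operators are listed before the single-character class so that && is not split into two & tokens.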
QuinVM bytecode
The VM backend lives in two main modules:
- compiler/bytecode.py – defines the OpCode enum and Instruction objects.
- runtime/vm.py – implements the QuinVM interpreter.
The bytecode is deliberately minimal:
- Stack/locals: PUSH_INT, LOAD_LOCAL, STORE_LOCAL, LOAD_LOCAL_IDX, STORE_LOCAL_IDX
- Arithmetic: ADD, SUB, MUL, DIV, NEG
- Comparisons: CMP_EQ, CMP_NE, CMP_LT, CMP_LE, CMP_GT, CMP_GE
- Logic: NOT, plus control-flow JMP, JZ, JNZ
- Calls: CALL, RET with a simple calling convention (args on stack, return value on stack)
- Pseudo-pointer ops on locals: LOAD_INDIRECT, STORE_INDIRECT, MEMCPY_LOCALS, MEMSET_LOCALS
- I/O: PRINT_INT, PRINT_STR, PRINTLN_INT, PRINTLN_STR
codegen_vm.py walks the AST and emits these instructions. Each QL function gets a FunctionInfo that records its entry point, local count, and parameter count so CALL/RET can set up frames correctly.
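To show how a stack machine with these opcodes might execute a loop, here is a toy interpreter for a handful of them. Everything about the encoding is an assumption for the sake of the sketch: the Op enum is a stand-in for the real OpCode, instructions are plain (opcode, operand) tuples, jump targets are absolute indices, and JZ pops its condition. The actual definitions in compiler/bytecode.py and runtime/vm.py may differ.

```python
from enum import Enum, auto


class Op(Enum):  # stand-in for a small subset of the real OpCode enum
    PUSH_INT = auto()
    LOAD_LOCAL = auto()
    STORE_LOCAL = auto()
    ADD = auto()
    CMP_LT = auto()
    JMP = auto()
    JZ = auto()
    PRINTLN_INT = auto()


def run(program, local_count):
    """Execute (opcode, operand) pairs and return the printed lines."""
    stack, slots, out, pc = [], [0] * local_count, [], 0
    while pc < len(program):
        op, arg = program[pc]
        pc += 1
        if op is Op.PUSH_INT:
            stack.append(arg)
        elif op is Op.LOAD_LOCAL:
            stack.append(slots[arg])
        elif op is Op.STORE_LOCAL:
            slots[arg] = stack.pop()
        elif op is Op.ADD:
            rhs, lhs = stack.pop(), stack.pop()
            stack.append(lhs + rhs)
        elif op is Op.CMP_LT:
            rhs, lhs = stack.pop(), stack.pop()
            stack.append(1 if lhs < rhs else 0)
        elif op is Op.JMP:
            pc = arg
        elif op is Op.JZ:
            if stack.pop() == 0:  # assumed: JZ consumes the condition
                pc = arg
        elif op is Op.PRINTLN_INT:
            out.append(str(stack.pop()))
    return out


# i = 0; while (i < 3) { println(i); i = i + 1; }
program = [
    (Op.PUSH_INT, 0), (Op.STORE_LOCAL, 0),                 # 0-1: i = 0
    (Op.LOAD_LOCAL, 0), (Op.PUSH_INT, 3), (Op.CMP_LT, None),  # 2-4: i < 3
    (Op.JZ, 13),                                           # 5: exit loop if false
    (Op.LOAD_LOCAL, 0), (Op.PRINTLN_INT, None),            # 6-7: println(i)
    (Op.LOAD_LOCAL, 0), (Op.PUSH_INT, 1), (Op.ADD, None),  # 8-10: i + 1
    (Op.STORE_LOCAL, 0),                                   # 11: i = ...
    (Op.JMP, 2),                                           # 12: back to the test
]
print(run(program, local_count=1))  # ['0', '1', '2']
```

Function calls would add a frame per CALL, sized from something like the FunctionInfo local and parameter counts, but that is left out here to keep the sketch short.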
8086 backend (historical)
The 8086 backend shares the same front end (lexer/parser/sema) but generates NASM assembly instead. It uses a small runtime (runtime/*.asm) for printing and string handling.
This backend was the original implementation and served as the reference for the VM. It’s still useful if you want to see how the same language maps down to real hardware, but day-to-day development can live entirely on QuinVM.
Project layout
Rough structure of the repo:
- compiler/
  - lexer.py, tokens.py – lexical analysis
  - parser.py – recursive-descent parser that builds the AST
  - ast.py – node definitions
  - types.py – type objects and helpers
  - sema.py – semantic analysis and type checking
  - builtins.py – builtin function signatures
  - codegen_8086.py – 8086/DOS codegen
  - codegen_vm.py – QuinVM codegen
  - bytecode.py – bytecode opcodes and Instruction
  - driver.py – CLI entry point for the 8086 backend
  - driver_vm.py – CLI entry point for the VM backend
- runtime/
  - vm.py – QuinVM interpreter
  - *.asm – 8086 runtime support (print, strings, etc.)
- examples/
  - Small QL programs demonstrating features (hello.ql, vm_test.ql, vm_arrays*.ql, control_flow.ql, ...)
Limitations and future ideas
QL is intentionally small and has some rough edges:
- No heap or dynamic allocation; only fixed-size stack arrays.
- No bounds checking on array indexing or array_push/array_pop.
- ptr is untyped; you are responsible for pointing at the right thing.
- Only 16-bit int; no 32-bit or 64-bit types yet.
Some obvious future directions:
- A simple module system and separate compilation.
- A tagged enum/union type.
- A tiny standard library layered on top of the VM.
For now, the goal is to keep the compiler and VM simple enough that you can read through them in one sitting and see exactly how each language feature works end-to-end.