>_EXECUTIVE SUMMARY
awkrs is a single-binary AWK implementation written in Rust. The interpreter compiles AWK source to a 95-opcode bytecode, runs it on a stack VM with peephole-fused instructions, and JITs hot chunks via Cranelift to native code with tiered compilation. Parallel record processing rides on rayon work-stealing. First AWK implementation to combine bytecode VM + Cranelift JIT + persistent on-disk bytecode cache (rkyv-backed at ~/.awkrs/scripts.rkyv) — repeat -f script.awk runs skip lex/parse/compile entirely. 2,116 parity files cross-check byte-for-byte against gawk, mawk, and BSD awk; CI runs full corpus against all three references on every push.
>_ARCHITECTURE
Single-pass pipeline: lexer.rs → parser.rs → AST in ast.rs → compiler.rs emits bytecode Chunks with peephole fusion → vm.rs runs the stack VM → jit.rs tiered-compiles hot chunks via Cranelift. script_cache.rs memoizes the CompiledProgram as an rkyv archive so the next run mmaps it zero-copy. runtime.rs holds the central state (fields, $0, vars, slots).
Source layout
| Module | LOC | Tests | Role |
|---|---|---|---|
| vm.rs | 2,680 | 0 | Stack VM main loop, opcode dispatch, fused fast paths, array call-by-reference (tests moved to vm_tests.rs) |
| vm_tests.rs | 3,856 | 509 | Stack VM test suite (split from vm.rs) |
| parser.rs | 5,107 | 510 | Recursive-descent parser, gawk extensions, ^= / **= compound, cmd |& getline |
| runtime.rs | 4,886 | 149 | Field/record state, FS default " ", MPFR (-M) integration, CONVFMT/OFMT |
| compiler.rs | 3,877 | 170 | AST → bytecode, slot allocation, peephole fusion, user-function tracking |
| lexer.rs | 3,598 | 442 | POSIX + gawk tokens, string escapes (\a \b \f \v \xHH \NNN), scientific notation |
| lib.rs | 2,296 | 57 | CLI entry, parallel record driver, file plumbing |
| format.rs | 2,248 | 149 | sprintf / printf formatting (all POSIX specifiers, locale-aware %' grouping, %u 2^64 fallback) |
| cli_effects.rs | 2,029 | 37 | --debug / --lint / --pretty-print / --gen-pot output |
| builtins.rs | 1,801 | 111 | gsub/sub/match/split/asort/strftime/strtonum/gensub (with \N backrefs), unified arity messages |
| namespace.rs | 1,103 | 32 | gawk-style @namespace + qualified identifiers, 61 builtin name table |
| script_cache.rs | 1,015 | 26 | rkyv shard at ~/.awkrs/scripts.rkyv, flock-serialized writes |
| bytecode.rs | 761 | 21 | Op enum (95 variants), Chunk, CompiledProgram |
| gawk_extensions.rs | 674 | 22 | ord/chr/intdiv/readdir/stat/readfile/writea/reada, sleep, statvfs, fts |
| (other modules) | 8,654 | 532 | fusevm_bridge (1,202), ast (1,034), ast_fmt (979), vm_builtins (942), record_io (803), cli (772), bignum (657), source_expand (531), ast/parallel (515), procinfo (362), error (256), cyber_help (226), gettext_util (129), locale_numeric (124), flow (58), limits (32), main (16), bin/ars (16) |
| TOTAL | 45,003 | 2,770 | +929 integration tests = 3,699 total |
>_PERFORMANCE STACK
Three layers compound to produce competitive runtimes vs gawk / mawk / BSD awk on supported workloads (see benchmarks/benchmark-results.md). Each layer carries its own cost recovery threshold; tests pin the boundaries so a regression surfaces as a named test failure.
Bytecode VM with peephole fusion
Compiler emits a flat Vec<Op> per chunk. Peephole pass fuses common idioms into single opcodes that skip stack traffic:
print $N→PrintFieldStdout(N)(direct field write to print_buf)s += $N→AddFieldToSlot(in-place numeric parse)i++/++i→IncrSlot(one numeric add)s "lit"→ConcatPoolStr(no PushStr)- 3 fused field-print variants + array-arithmetic fusion
Cranelift JIT (tiered)
Hot bytecode chunks are JIT-compiled by Cranelift with opt_level=speed. Tiered: chunks run on the interpreter until they cross AWKRS_JIT_MIN_INVOCATIONS (default 3) — avoids compile cost on one-shot BEGIN blocks.
ABI: (vmctx, slots, field_fn, array_field_add, var_dispatch, field_dispatch, io_dispatch, val_dispatch) → f64. Mixed-mode chunks NaN-box Value in slots; non-mixed chunks keep slots in Cranelift SSA. Unsupported opcodes fall back to bytecode.
Persistent bytecode cache (rkyv)
Single archive at ~/.awkrs/scripts.rkyv: rkyv-archived outer shard, bincode-encoded CompiledProgram blobs. Read path is mmap + zero-copy ArchivedHashMap lookup + bincode-decode of the matched entry only. Writes are flock-serialized with atomic rename.
Invalidated automatically on source mtime change OR a newer awkrs binary. Disable with AWKRS_CACHE=0.
Parallel record processing (rayon)
-j N spawns a rayon thread pool. Records are processed in parallel when the program is parallel-safe (no range patterns, no exit, no primary getline, no cross-record assignments). Output is reassembled in order.
Files use RS-aware mmap-and-split; stdin uses chunked line buffering. Worker runtime is a per-thread Runtime with shared Arc<CompiledProgram>.
Inline fast paths
For single-rule programs with one fused opcode (e.g. { print $1 }, { s += $1 }), the record loop bypasses VmCtx entirely and writes fields directly from the mmap'd input. Literal gsub("lit", "repl"); print on absent needles skips the VM and copies lines verbatim.
Direct-to-buffer I/O
Stdout writes accumulate in a 64 KB Vec<u8> flushed at file boundaries — no per-record String, no format!(), no per-record stdout locking. OFS / ORS bytes are cached on the runtime and updated only on assignment.
>_TESTING
3,699 tests, 0 failures. The unit suite is the catcher for behavioral regressions; the parity suite (2,116 files) cross-checks against gawk + mawk + BSD awk. Test-driven divergence hunting has now closed 60+ POSIX/gawk gaps — each was first surfaced by a pinning test, then fixed, then the FIXME marker removed.
| Layer | Count | Style | Catches |
|---|---|---|---|
| Lib unit tests | 2,770 | Per-function pinning | Function-level contracts, POSIX semantics, opcode dispatch shapes |
| Integration tests | 922 | Subprocess via run_awkrs_stdin | End-to-end CLI behavior, exit codes, stdin/stdout/stderr |
| Parity files (cases/ + cases_portable/) | 2,054 | Diff vs gawk, mawk, BSD awk | Cross-reference compatibility across the awk language surface |
| Parity files (cases_gawk/) | 66 | Diff vs gawk only | gawk-only extensions (FPAT, typeof, gensub, **=, namespaces, strftime, bit-op arity) |
| cargo fmt | — | rustfmt --check | Style drift |
| cargo clippy | — | --all-targets -D warnings | Lint regressions: ptr_arg, approx_constant, etc. |
| cargo doc | — | --no-deps with RUSTDOCFLAGS=-D warnings | Stale doc references |
Divergences fixed via the test-then-fix loop (60+ total)
POSIX semantics (8 fixes)
- Array auto-create on
x = a[k]read - CONVFMT honored in concat (was bypassed by ConcatPoolStr peephole)
- CONVFMT honored in array subscripts
typeofreturns "untyped" for unset (gawk 5.x vocab)"0"string literal is truthy (gawk parity)^=/**=compound assignment- Comment-as-statement-terminator (newline preserved after #)
%05dof -42 = "-0042" (zeros after sign, not before)
gawk parity (6 fixes)
- FPAT leftmost-longest semantic (alternation regex)
- gensub
\Ncapture-group backref expansion - CSV mode leading-comma field count (
,,,= NF=4) - Lexer scientific notation (
1e7single token) - Array call-by-reference for user functions (incl. delete)
printof large integers bypasses OFMT past i64::MAX
Lexer / runtime / peephole (6 fixes)
- Escape sequences
\a \b \f \v \/ \xHH \NNNall decode sprintf("%.5d", 42)→ "00042" (precision honored)- 7 dormant peephole fusions activated via
normalize_field_indices()(PushNumDecimalStr-to-PushNum for small integers preceding GetField) - String-vs-number compare: literals stay as STRING (gawk parity, reverted earlier mis-fix)
- Integer-valued floats print exact via
%.0fregardless of magnitude (up to f64::MAX) - Frame-aware
array_elem_get/array_elem_set/for_in_keys/deletefor by-ref array params
Post-v0.4.2 parity batch (40+ fixes)
- Unified arity-error messages across
close/rand/srand/asort/asorti/system/printf/deleteand bitwise/time/typeof builtins; arity panics → runtime errors - Scalar-used-as-array now fatals across
read,in,for-in, assign, with baredeleteon scalar fatal - NaN-sign normalized at math-fn source:
sin/cos/exp/atan2+ arithmeticinf - inf - Regex:
\d/\Dtreated as literal letter (gawk parity);\1..\9no longer hard-errors in POSIX ERE FSdefaults to single space " " (POSIX/gawk parity, not empty)sub/gsubwith non-lvalue 3rd arg returns correct match count- Parser/runtime accept
cmd |& getlineandclose(cmd, "to"|"from") printf %upast 2^64 falls back to %g;%0fatal; locale-aware%'grouping with C-locale fallback- CONVFMT honored in Num-vs-StrLit comparisons; bare
inf/nancoercion; noisy-field strnum asort/asortion unassigned name returns 0 (not fatal);match-arr start/length parity- Ternary-else as assignment lvalue;
mktimeUTC flag;strftimedefault format; PROCINFO strftime - Bit-exact
==on doubles; paragraph RT captures the full run; missing input-file error message corrected
>_COMPETITIVE LANDSCAPE
Based on a survey of major public AWK implementations: awkrs is the first to combine a bytecode VM, a JIT compiler, and a persistent on-disk bytecode cache. frawk has VM + Cranelift/LLVM JIT but recompiles per invocation. gawk's pm-gawk persists script variables (heap), not compiled bytecode.
| Implementation | Bytecode VM | JIT | Persistent bytecode cache |
|---|---|---|---|
| BWK awk (one-true-awk) | — | — | — |
| gawk | ✓ | — | — (pm-gawk is for vars) |
| mawk | ✓ | — | — |
| goawk | ✓ | — | — |
| frawk | ✓ | Cranelift + LLVM | — |
| zawk (frawk fork) | ✓ | Cranelift + LLVM | — |
| awkrs | ✓ | Cranelift | ✓ (rkyv mmap) |
>_LANGUAGE COVERAGE
POSIX core
All patterns (BEGIN, END, regex, expression, range), all builtins (print, printf, sprintf, substr, index, length, split, gsub, sub, match, getline, sprintf, system, …), all numeric/string coercion rules including CONVFMT/OFMT.
gawk extensions
BEGINFILE/ENDFILE- Coprocess
|&two-way pipes --csv/ FPAT (leftmost-longest)PROCINFO/SYMTAB/FUNCTAB@include/@load/@namespace/inet/tcp|udp/...sockets via socket2- MPFR via
-Musingrug gensubwith\Nbackrefs- Typed regex
@/pat/ - Indirect calls
@func(args) typeof(),strtonum(),mkbool()- Bit ops:
and/or/xor/lshift/rshift/compl
awkrs-specific extensions
-j Nparallel record processing--read-aheadstdin chunk sizing-O/-sJIT enable/disable- Persistent bytecode cache (auto)
- Cyberpunk HUD help (
-h) with theme switching - Compatibility flags from mawk (
-W) and BSD awk
>_REPOSITORY
- Source — github.com/MenkeTechnologies/awkrs
- Crate — crates.io/crates/awkrs (
cargo install awkrs) - API docs — docs.rs/awkrs
- User docs — awkrs documentation
- Bytecode cache section — ~/.awkrs/scripts.rkyv layout + invalidation rules
- Parity tests —
parity/directory cross-checks output against gawk + mawk + BSD awk for 2,116 cases (2,054 portable incases/+cases_portable/, 62 gawk-only incases_gawk/) - License — MIT
Stats snapshot: 2026-06-05 · awkrs v0.4.14 · full parity green vs gawk + mawk + BSD awk