// AWKRS — ENGINEERING REPORT

AWK implementation in Rust · bytecode VM + Cranelift JIT · rkyv bytecode cache · parallel records via rayon · POSIX + gawk + mawk compatibility

// Color scheme

>_EXECUTIVE SUMMARY

awkrs is a single-binary AWK implementation written in Rust. The interpreter compiles AWK source to a 95-opcode bytecode, runs it on a stack VM with peephole-fused instructions, and JITs hot chunks via Cranelift to native code with tiered compilation. Parallel record processing rides on rayon work-stealing. First AWK implementation to combine bytecode VM + Cranelift JIT + persistent on-disk bytecode cache (rkyv-backed at ~/.awkrs/scripts.rkyv) — repeat -f script.awk runs skip lex/parse/compile entirely. 2,116 parity files cross-check byte-for-byte against gawk, mawk, and BSD awk; CI runs full corpus against all three references on every push.

45,003
Rust Lines
3,699
Tests (Total)
2,770
Unit Tests (Lib)
95
VM Opcodes
61
Builtin Functions
0
Test Failures
2,116
Parity Files (vs gawk + mawk + BSD)
v0.4.14
crates.io Release

>_ARCHITECTURE

Single-pass pipeline: lexer.rsparser.rs → AST in ast.rscompiler.rs emits bytecode Chunks with peephole fusion → vm.rs runs the stack VM → jit.rs tiered-compiles hot chunks via Cranelift. script_cache.rs memoizes the CompiledProgram as an rkyv archive so the next run mmaps it zero-copy. runtime.rs holds the central state (fields, $0, vars, slots).

Source layout

ModuleLOCTestsRole
vm.rs2,6800Stack VM main loop, opcode dispatch, fused fast paths, array call-by-reference (tests moved to vm_tests.rs)
vm_tests.rs3,856509Stack VM test suite (split from vm.rs)
parser.rs5,107510Recursive-descent parser, gawk extensions, ^= / **= compound, cmd |& getline
runtime.rs4,886149Field/record state, FS default " ", MPFR (-M) integration, CONVFMT/OFMT
compiler.rs3,877170AST → bytecode, slot allocation, peephole fusion, user-function tracking
lexer.rs3,598442POSIX + gawk tokens, string escapes (\a \b \f \v \xHH \NNN), scientific notation
lib.rs2,29657CLI entry, parallel record driver, file plumbing
format.rs2,248149sprintf / printf formatting (all POSIX specifiers, locale-aware %' grouping, %u 2^64 fallback)
cli_effects.rs2,02937--debug / --lint / --pretty-print / --gen-pot output
builtins.rs1,801111gsub/sub/match/split/asort/strftime/strtonum/gensub (with \N backrefs), unified arity messages
namespace.rs1,10332gawk-style @namespace + qualified identifiers, 61 builtin name table
script_cache.rs1,01526rkyv shard at ~/.awkrs/scripts.rkyv, flock-serialized writes
bytecode.rs76121Op enum (95 variants), Chunk, CompiledProgram
gawk_extensions.rs67422ord/chr/intdiv/readdir/stat/readfile/writea/reada, sleep, statvfs, fts
(other modules)8,654532fusevm_bridge (1,202), ast (1,034), ast_fmt (979), vm_builtins (942), record_io (803), cli (772), bignum (657), source_expand (531), ast/parallel (515), procinfo (362), error (256), cyber_help (226), gettext_util (129), locale_numeric (124), flow (58), limits (32), main (16), bin/ars (16)
TOTAL45,0032,770+929 integration tests = 3,699 total

>_PERFORMANCE STACK

Three layers compound to produce competitive runtimes vs gawk / mawk / BSD awk on supported workloads (see benchmarks/benchmark-results.md). Each layer carries its own cost recovery threshold; tests pin the boundaries so a regression surfaces as a named test failure.

Bytecode VM with peephole fusion

Compiler emits a flat Vec<Op> per chunk. Peephole pass fuses common idioms into single opcodes that skip stack traffic:

  • print $NPrintFieldStdout(N) (direct field write to print_buf)
  • s += $NAddFieldToSlot (in-place numeric parse)
  • i++ / ++iIncrSlot (one numeric add)
  • s "lit"ConcatPoolStr (no PushStr)
  • 3 fused field-print variants + array-arithmetic fusion

Cranelift JIT (tiered)

Hot bytecode chunks are JIT-compiled by Cranelift with opt_level=speed. Tiered: chunks run on the interpreter until they cross AWKRS_JIT_MIN_INVOCATIONS (default 3) — avoids compile cost on one-shot BEGIN blocks.

ABI: (vmctx, slots, field_fn, array_field_add, var_dispatch, field_dispatch, io_dispatch, val_dispatch) → f64. Mixed-mode chunks NaN-box Value in slots; non-mixed chunks keep slots in Cranelift SSA. Unsupported opcodes fall back to bytecode.

Persistent bytecode cache (rkyv)

Single archive at ~/.awkrs/scripts.rkyv: rkyv-archived outer shard, bincode-encoded CompiledProgram blobs. Read path is mmap + zero-copy ArchivedHashMap lookup + bincode-decode of the matched entry only. Writes are flock-serialized with atomic rename.

Invalidated automatically on source mtime change OR a newer awkrs binary. Disable with AWKRS_CACHE=0.

Parallel record processing (rayon)

-j N spawns a rayon thread pool. Records are processed in parallel when the program is parallel-safe (no range patterns, no exit, no primary getline, no cross-record assignments). Output is reassembled in order.

Files use RS-aware mmap-and-split; stdin uses chunked line buffering. Worker runtime is a per-thread Runtime with shared Arc<CompiledProgram>.

Inline fast paths

For single-rule programs with one fused opcode (e.g. { print $1 }, { s += $1 }), the record loop bypasses VmCtx entirely and writes fields directly from the mmap'd input. Literal gsub("lit", "repl"); print on absent needles skips the VM and copies lines verbatim.

Direct-to-buffer I/O

Stdout writes accumulate in a 64 KB Vec<u8> flushed at file boundaries — no per-record String, no format!(), no per-record stdout locking. OFS / ORS bytes are cached on the runtime and updated only on assignment.


>_TESTING

3,699 tests, 0 failures. The unit suite is the catcher for behavioral regressions; the parity suite (2,116 files) cross-checks against gawk + mawk + BSD awk. Test-driven divergence hunting has now closed 60+ POSIX/gawk gaps — each was first surfaced by a pinning test, then fixed, then the FIXME marker removed.

LayerCountStyleCatches
Lib unit tests2,770Per-function pinningFunction-level contracts, POSIX semantics, opcode dispatch shapes
Integration tests922Subprocess via run_awkrs_stdinEnd-to-end CLI behavior, exit codes, stdin/stdout/stderr
Parity files (cases/ + cases_portable/)2,054Diff vs gawk, mawk, BSD awkCross-reference compatibility across the awk language surface
Parity files (cases_gawk/)66Diff vs gawk onlygawk-only extensions (FPAT, typeof, gensub, **=, namespaces, strftime, bit-op arity)
cargo fmtrustfmt --checkStyle drift
cargo clippy--all-targets -D warningsLint regressions: ptr_arg, approx_constant, etc.
cargo doc--no-deps with RUSTDOCFLAGS=-D warningsStale doc references

Divergences fixed via the test-then-fix loop (60+ total)

POSIX semantics (8 fixes)

  • Array auto-create on x = a[k] read
  • CONVFMT honored in concat (was bypassed by ConcatPoolStr peephole)
  • CONVFMT honored in array subscripts
  • typeof returns "untyped" for unset (gawk 5.x vocab)
  • "0" string literal is truthy (gawk parity)
  • ^= / **= compound assignment
  • Comment-as-statement-terminator (newline preserved after #)
  • %05d of -42 = "-0042" (zeros after sign, not before)

gawk parity (6 fixes)

  • FPAT leftmost-longest semantic (alternation regex)
  • gensub \N capture-group backref expansion
  • CSV mode leading-comma field count (,,, = NF=4)
  • Lexer scientific notation (1e7 single token)
  • Array call-by-reference for user functions (incl. delete)
  • print of large integers bypasses OFMT past i64::MAX

Lexer / runtime / peephole (6 fixes)

  • Escape sequences \a \b \f \v \/ \xHH \NNN all decode
  • sprintf("%.5d", 42) → "00042" (precision honored)
  • 7 dormant peephole fusions activated via normalize_field_indices() (PushNumDecimalStr-to-PushNum for small integers preceding GetField)
  • String-vs-number compare: literals stay as STRING (gawk parity, reverted earlier mis-fix)
  • Integer-valued floats print exact via %.0f regardless of magnitude (up to f64::MAX)
  • Frame-aware array_elem_get / array_elem_set / for_in_keys / delete for by-ref array params

Post-v0.4.2 parity batch (40+ fixes)

  • Unified arity-error messages across close/rand/srand/asort/asorti/system/printf/delete and bitwise/time/typeof builtins; arity panics → runtime errors
  • Scalar-used-as-array now fatals across read, in, for-in, assign, with bare delete on scalar fatal
  • NaN-sign normalized at math-fn source: sin/cos/exp/atan2 + arithmetic inf - inf
  • Regex: \d/\D treated as literal letter (gawk parity); \1..\9 no longer hard-errors in POSIX ERE
  • FS defaults to single space " " (POSIX/gawk parity, not empty)
  • sub/gsub with non-lvalue 3rd arg returns correct match count
  • Parser/runtime accept cmd |& getline and close(cmd, "to"|"from")
  • printf %u past 2^64 falls back to %g; %0 fatal; locale-aware %' grouping with C-locale fallback
  • CONVFMT honored in Num-vs-StrLit comparisons; bare inf/nan coercion; noisy-field strnum
  • asort/asorti on unassigned name returns 0 (not fatal); match-arr start/length parity
  • Ternary-else as assignment lvalue; mktime UTC flag; strftime default format; PROCINFO strftime
  • Bit-exact == on doubles; paragraph RT captures the full run; missing input-file error message corrected

>_COMPETITIVE LANDSCAPE

Based on a survey of major public AWK implementations: awkrs is the first to combine a bytecode VM, a JIT compiler, and a persistent on-disk bytecode cache. frawk has VM + Cranelift/LLVM JIT but recompiles per invocation. gawk's pm-gawk persists script variables (heap), not compiled bytecode.

ImplementationBytecode VMJITPersistent bytecode cache
BWK awk (one-true-awk)
gawk— (pm-gawk is for vars)
mawk
goawk
frawkCranelift + LLVM
zawk (frawk fork)Cranelift + LLVM
awkrsCranelift✓ (rkyv mmap)

>_LANGUAGE COVERAGE

POSIX core

All patterns (BEGIN, END, regex, expression, range), all builtins (print, printf, sprintf, substr, index, length, split, gsub, sub, match, getline, sprintf, system, …), all numeric/string coercion rules including CONVFMT/OFMT.

gawk extensions

  • BEGINFILE / ENDFILE
  • Coprocess |& two-way pipes
  • --csv / FPAT (leftmost-longest)
  • PROCINFO / SYMTAB / FUNCTAB
  • @include / @load / @namespace
  • /inet/tcp|udp/... sockets via socket2
  • MPFR via -M using rug
  • gensub with \N backrefs
  • Typed regex @/pat/
  • Indirect calls @func(args)
  • typeof(), strtonum(), mkbool()
  • Bit ops: and/or/xor/lshift/rshift/compl

awkrs-specific extensions

  • -j N parallel record processing
  • --read-ahead stdin chunk sizing
  • -O/-s JIT enable/disable
  • Persistent bytecode cache (auto)
  • Cyberpunk HUD help (-h) with theme switching
  • Compatibility flags from mawk (-W) and BSD awk

>_REPOSITORY

Stats snapshot: 2026-06-05 · awkrs v0.4.14 · full parity green vs gawk + mawk + BSD awk