>_EXECUTIVE SUMMARY
strykelang is a single-binary Perl 5 implementation written in Rust. The interpreter compiles source to a 347-opcode bytecode, runs it on a register+stack hybrid VM, and JITs hot blocks through Cranelift to native code. Parallel primitives (pmap, pgrep, pflat_map, pfor) ride on rayon work-stealing. Synchronization spans the full stack: mysync / oursync for lockless atomic shared state, mutex() + semaphore($n) with parking_lot::Mutex + Condvar for explicit blocking critical sections / bounded concurrency, and POSIX flock($fh, OP) for cross-process advisory file locks. Pipe-forward (|>) and the thread macro (~>) extend Perl 5 syntax with Clojure / Racket / Scala threading semantics. Editor tooling rides on the same binary: stryke --lsp for Language Server Protocol and stryke --dap for Debug Adapter Protocol (line breakpoints, step over/into/out, recursive variable drill-down), both consumed by a first-class JetBrains plugin under editors/intellij/. Totals: 361k production Rust lines + 228k .stk lines + 10,459 builtins (11,191 keys in %all) + 1,648 Rosetta programs + 173 Exercism solutions, matching the surface coverage of dynamic languages with multi-decade histories.
Source Distribution — 478,297 lines
Production: 197 files across strykelang/ (including strykelang/pkg/ + strykelang/bins/) excluding *_tests.rs. Tests: 20 inline *_tests.rs modules in strykelang/ + 358 integration modules in tests/. Tooling: build.rs + benches/ + fuzz/.
~SCALE & POSITION
Quantitative comparison against the established dynamic-language reference set. Source-LOC figures are approximate (counted from each project's public source tree, excluding vendored deps and external bindings). Stryke is single-author single-language-binary; the others span tens to hundreds of contributors over multi-decade histories.
| Language | First release | Years | Production source | Builtins (primary) | Native JIT | Rosetta complete? |
|---|---|---|---|---|---|---|
| stryke | 2026 | 0.09 | 361,190 Rust + 228,166 stk | 10,459 | Cranelift | 1,349 / 1,349 (100%) |
| Lua 5.4 | 1993 | ~32 | ~13,000 (C) | ~80 | no (LuaJIT is separate) | partial |
| LuaJIT | 2005 | ~20 | ~85,000 (C+asm) | ~80 | tracing | n/a (rides on Lua) |
| Tcl 8.6 | 1988 | ~37 | ~150,000 (C) | ~90 | no | complete |
| Crystal | 2014 | ~12 | ~250,000 (Crystal) | stdlib classes | LLVM AOT | partial |
| Perl 5 | 1994 | ~31 | ~700,000 (C+Perl) | ~250 keywords | no | complete |
| Python 3 | 2008 | ~17 | ~1,200,000 (C+Py) | ~150 builtins | no (3.13 experimental) | complete |
By Per-Year Output
Since the first public commits in April 2026, the tree has grown to roughly 361,000 lines of production Rust (361,190 in the table above). That is a higher sustained velocity than interpreters that historically averaged hundreds of lines of C per author-year over multi-decade lifetimes. The orders of magnitude differ, and so does the feature scope.
By Test Surface
14,001 #[test] functions + 20,056 Perl-parity cases + 12,955 assert_eq / assert_ok checks in examples/rosetta + examples/project. The cross-implementation parity suite is the critical part: only mature dynamic languages ship one (Perl, Ruby, Python), and stryke is the only one running its parity suite against another implementation (perl) on every CI cycle.
By Built-in Capability
Things in stryke's binary that the reference set doesn't ship in-binary: Cranelift JIT, distributed-compute primitive (cluster/pmap_on), package manager (s pkg), web framework, language-level AI (ai/tool fn/MCP), AOT static-binary deployment (s build). Encyclopedic-stdlib axis: 10,459 builtins (11,191 keys in %all once aliases and keywords are counted) vs ~80–250 for the reference set.
Rosetta Code Completion
Clearing all 1,349 published tasks is a third-party-verifiable completeness bar that takes most languages a decade or more to reach. The differentiator is the encyclopedic-stdlib axis: tasks that other languages need third-party libraries for are one-builtin calls in stryke.
#Builtin Count vs Other Bareword-Fn Languages
Stryke's 10,459 primary builtins exceed every other language whose stdlib is exposed as bareword fns — including Wolfram Language, the only prior holder of "world's largest builtin set" branding. APL/J/K/q have hundreds of primitives, but as glyphs (+/, ⌽, ~:) rather than barewords; R has ~2k+, but split across library()'d packages rather than always-loaded. Stryke is the only bareword-fn language past the 10k threshold.
| Language | Bareword Builtins | Source / Probe |
|---|---|---|
| stryke | 10,459 | len(keys %b) at runtime — dynamic, not hardcoded |
| Wolfram Language v14.1 | ~7,000 | Stephen Wolfram blog ("7,000 or so" at v14.1, +89 net new from v14.0's 6,602) |
| Wolfram Language v14.3 (est.) | 7,100–7,300 | Extrapolated from per-version deltas; exact count not published |
| Ruby Kernel | ~250 | Kernel.methods.size |
| Perl 5 core | ~220 | perlfunc manpage |
| Tcl | ~100 | info commands |
| Lua base + std | ~100 | _G + standard tables |
| Bash | ~80 | enable -a | wc -l |
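| Python builtins | ~70 | built-in functions in the library reference (dir(__builtins__) returns ~150 names once exception classes and constants are included); stdlib is module.fn, not bareword |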
| AWK | ~30 | gawk manual |
Wolfram is the only language that has marketed builtin count as a competitive metric. Stryke's 10,459 clears Wolfram v14.1's stated count (~7,000) by ~3,459 and clears the high end of a plausible v14.3 band (~7,300) by ~3,159. Bareword-fn rivalry past the 10k mark: stryke is alone in the category.
Density: builtins per byte of binary
| Toolchain | Builtins | Install Size | Bytes / Builtin |
|---|---|---|---|
| stryke (single static binary) | 11,191 | 44 MB | ~4.0 KB |
| Wolfram Engine v14 (free tier) | ~7,000 | ~5–6 GB | ~750 KB |
| MATLAB R2025a + toolboxes | thousands | 30+ GB | n/a |
~200× denser than Wolfram per builtin/byte. The single-binary qualifier matters: no kernel boot, no paclet manager, no licensing daemon, no GUI runtime, no library() resolution — ~/.cargo/bin/s is one statically-linked Rust binary that cold-starts in <10 ms with 11,102 callable spellings (10,459 builtins + 643 aliases) plus 89 keywords reflected in %all (11,191 keys total). Caveat: bytes-per-builtin is partly apples-to-oranges — Wolfram's GBs include datasets, image / video native code, and the Wolfram Notebook GUI. The defensible composite claim is "11,102 callable names + 44 MB single binary + sub-10 ms cold start", which no other language clears simultaneously.
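The bytes-per-builtin column can be reproduced with a few lines of arithmetic. The sketch below assumes binary units (1 MB = 1024 × 1024 bytes) and takes the low end of the quoted "~5–6 GB" Wolfram range; the function and constant names are illustrative only.

```rust
const MIB: u64 = 1024 * 1024;
const GIB: u64 = 1024 * MIB;

/// Install footprint divided by builtin count, in bytes.
fn bytes_per_builtin(install_bytes: u64, builtins: u64) -> f64 {
    install_bytes as f64 / builtins as f64
}

/// Ratio of the two per-builtin footprints from the density table.
/// Assumption: 5 GiB for Wolfram Engine, the low end of the quoted range.
fn density_ratio() -> f64 {
    let stryke = bytes_per_builtin(44 * MIB, 11_191); // about 4.0 KiB
    let wolfram = bytes_per_builtin(5 * GIB, 7_000); // about 750 KiB
    wolfram / stryke
}
```

With these inputs the ratio comes out a little under 190; using 6 GiB instead pushes it above 220, so "~200×" sits inside the band implied by the table.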
#SUBSYSTEM BREAKDOWN
Source partitioned by role. Builtins dominate at 32.7% — the language ships with 10,459 primaries in %b (11,191 keys in %all = 10,459 primaries + 643 aliases + 89 keywords) spanning string / array / hash / regex / I/O / process / OO / format / pack / crypto / codec / parallel / sync (mutex / semaphore / flock) / HTTP / database / network / AI / web / math / physics / chemistry / biology / signal / finance domains. Core pipeline (lexer→parser→compiler→VM→JIT) is ~16.8% of the Rust slice.
| Subsystem | Key Files | Lines | % | Description | Share |
|---|---|---|---|---|---|
| Builtins | builtins*, list_builtins, math_wolfram* | 152,367 | 32.7% | 10,459 builtins in %b / 11,191 keys in %all: string, array, hash, regex, math, physics, chemistry, biology, signal, finance, ML, geometry, special functions, I/O, file, process, OO, format, pack, crypto, codec, network, HTTP, JWT | |
| VM & Compiler | vm, compiler, bytecode | 23,234 | 5.0% | 347-opcode bytecode, register+stack VM, AST→bytecode compiler with constant folding / register allocation / peephole, opcode disassembler | |
| VM Runtime State (vm_helper) | vm_helper | 22,581 | 4.8% | Persistent state container for the bytecode VM: scope graph, sub table, error context, debugger hooks, special variables ($_, $!, $/, …), reflection-hash population, package-stash refresh, AOP intercept registry, I/O handle map. Not a tree-walking interpreter — stryke has none. The absence is pinned by tests/tree_walker_absent_aop.rs as an architectural invariant. | |
| Parser | parser, ast, token | 25,043 | 5.4% | Recursive-descent parser, 196 parse_* functions, AST variants across multiple enums, operator precedence climbing (or/and/not at lowest tier per perlop), heredocs, prototype parsing, builtin-name whitelist | |
| AI / Web | ai, ai_sugar, web, web_orm, mcp | 14,483 | 3.1% | Anthropic / OpenAI client, tool-use protocol, AI sugar (ai as native verb), MCP server / client, web framework (route DSL, controllers, ORM, sessions, templates), JSON / form / multipart parsers | |
| Value System | value, nanbox, scope, capture, convert, deconvert | 10,342 | 2.2% | NaN-boxed StrykeValue: int / float / string / array / hash / code / regex / undef. Scoping (pad slots, closures, local/my/our/oursync), type coercion, capture cells | |
| Codecs / Data | native_codec, native_data, map_stream, map_grep_fast, sort_fast, sketches | 10,197 | 2.2% | JSON / YAML / TOML / CSV / MessagePack encode-decode, streaming map/grep iterators, optimized sort, probabilistic sketches (HLL, t-digest, count-min) | |
| Distribution | cluster, controller, agent, remote_wire, pkg/*, script_cache, pcache | 7,748 | 1.7% | Multi-host cluster execution, agent / controller protocol, wire format, package manager (pkg — manifest, lockfile, store, resolver), bytecode cache | |
| LSP | lsp, lsp_symbols, lsp_extras, lsp_docs_domains, static_analysis | 22,342 | 4.8% | Language Server Protocol over stdio (stryke --lsp) — diagnostics (strict-vars on by default with full Perl/AOP/block-param exemption set), hover descriptions (with full key tables for hash-returning builtins, suppressed inside string literals), completion (sigil vars, subs, classes/structs/enums/traits with proper CompletionItemKind, enum variants, constants, loop labels, hash-key completion driven by 20+ builtin return schemas, parse-error recovery during typing, suffix-only insertText for qualified subs), semantic tokens, signature help, code actions (Extract Variable/Constant/Parameter/Function, caret-only, inside-string aware), goto-definition, document symbols, package-aware rename (AST-based for fields/methods, no textual fallback, cross-file via require graph BFS, defensive ::-strip on newName). OOP-aware: $self->X/$obj->X walks extends + impl chains plus a universal-method whitelist; constructor key validation across inheritance; positional vs keyed auto-detection. Cross-file require walks up to a project root with a sibling lib/. Consumed by VS Code, Neovim, Helix, and the JetBrains plugin. | |
| DAP Debugger | dap, debugger | 2,881 | 0.7% | Debug Adapter Protocol server over stdio or TCP (stryke --dap [HOST:PORT]) sharing the same Debugger state machine as the TTY REPL (stryke -d). Line + function breakpoints, step over/into/out, recursive Variables panel with class / struct / enum / set / sketch drill-down, Evaluate dialog with scalar-prelude injection, real-time Console output. Consumed by editors/intellij/. | |
| CLI & REPL | main, repl, cli_runners, getopts | 8,943 | 2.0% | CLI entry: argv parsing, script loading, one-liner mode (-e/-E/-n/-p), reedline-based REPL with utop-style menu, columnar completion, history; subcommand runners and Getopt::Long-compatible flag parser | |
| JIT | jit | 5,174 | 1.2% | Cranelift block-level JIT: hot-path detection, native code emission for tight numeric loops, deoptimization stubs back to bytecode VM | |
| Format / Deparse | fmt, deparse, format, minify | 4,801 | 1.0% | perltidy-compatible code formatter, AST→source deparser (B::Deparse equivalent), Perl format/write system, source minifier | |
| Parallel Runtime | par_lines, par_list, par_pipeline, par_walk, ppool, pchannel, pwatch, pmap_progress, parallel_trace, stress, builtins_sync | 4,803 | 1.0% | Rayon work-stealing: pmap, pgrep, pflat_map, pfor, parallel file walking, async pipelines, progress tracking, channels, stress-testing primitives (heat, fire). Blocking sync primitives backed by parking_lot::Mutex + Condvar: mutex / mutex_lock / mutex_try_lock, semaphore / sem_acquire / sem_release / sem_try_acquire; cross-process flock($fh, OP) for advisory file locks (POSIX flock(2)). | |
| Perl Compat | perl_decode, perl_fs, perl_inc, perl_regex, perl_signal, perl_pty, special_vars, english | 4,062 | 0.9% | @INC, %ENV, %SIG, file ops, Perl regex engine, signal handling, PTY support, magic special variables, use English | |
| Lexer | lexer | 4,187 | 0.9% | Tokenizer for Perl 5 + stryke extensions: heredocs, regex literals, q/qq/qw/qr, string interpolation, Unicode identifiers, sigil disambiguation | |
| Misc / Error | error, mro, lib, bench_fusion, fib_like_tail, pending_destroy, kvstore, banner, doc_render, docs, bins/{gen_docs,s,st}, serialize_normalize, parse_smoke_*, run_semantics_more | 6,165 | 1.3% | Rich error reporter, MRO (C3 / DFS), library entry, fusion-pass optimizers, finalizer queue, key-value store, banner, doc generator + renderer, thin bin shims, smoke tests | |
| Tooling | profiler, perf_recorder, stryke_log | 1,425 | 0.3% | Sampling profiler, perf event recorder, structured logger. (Static analyzer counts under the LSP row; the step debugger lives under the DAP Debugger row above so the TTY -d front-end and the DAP server count together.) | |
| FFI / AOT | rust_ffi, rust_sugar, aot, data_section, aop | 1,823 | 0.4% | Embed-Rust closure FFI, AOT bytecode caching, data-section relocations, AOP intercepts | |
| Crypto / Pack | crypt_util, jwt, pack, secrets | 1,834 | 0.4% | Symmetric / asymmetric encryption, password hashing, JWT encode/decode, pack/unpack templates, secret-store helpers | |
| Tests | tests/suite/*, *_tests.rs | 110,653 | 23.2% | 14,001 tests (cargo test -- --list) across 378 Rust modules (20 inline + 358 integration): parser, lexer, builtins, parallel, sync (21 mutex / semaphore tests in builtins_sync_tests.rs), regex, values, OO, semantics, CLI, LSP, DAP, runtime, fix-regression suite, and 142 behavioral pin suites (incl. reduce_pin, split_pin, zip_pad_pin) | |
| Build / Bench | build.rs, benches/, fuzz/ | 5,464 | 1.2% | Build-time reflection generator (BUILTIN_ARMS, CATEGORY_MAP, DESCRIPTIONS), JIT vs interpreter micro-bench, libfuzzer harnesses | |
| TOTAL | | 478,297 | 100% | | |
$TOP 20 FILES BY SIZE
The 20 largest source files account for 42.3% of the codebase (Rust under strykelang/ + tests/ + tooling slice). builtins.rs is by far the largest single file: every Perl 5 core function plus stryke extensions (HTTP, JSON, parallel, codec, OO, AI, web) lives here, dispatched through a single match table generated at build time.
| File | Lines | Role |
|---|---|---|
| strykelang/builtins.rs | 43,806 | Core builtin dispatch table covering string/array/hash/regex/math/I/O/file/process/OO domains; routes 10,459 builtins in %b (11,191 keys in %all) |
| strykelang/math_wolfram*.rs | 58,672 | 82 batch files (math_wolfram, math_wolfram2…82) include!'d into builtins.rs: 4,643 fns spanning Wolfram-class math, physics, chemistry, biology, signal, finance, ML, geometry, special functions, calendrical, astronomy, BLAS/LAPACK, sabermetrics, Excel/financial, GIS, robotics/control, actuarial, epidemiology, archive/encoding, music theory, geology, logic/SAT/SMT, compilers/parsing, linguistics, Postgres SQL/JSON, Redis, scipy.special, economics/game theory |
| strykelang/vm_helper.rs | 22,595 | Runtime: scope graph, evaluation context, error reporter, debugger hooks (used by both -d TTY REPL and --dap server), special-variable handling, reflection-hash population, package-stash refresh |
| strykelang/parser.rs | 22,790 | Recursive-descent parser; precedence climbing, heredocs; whitelists every builtin spelling |
| strykelang/builtins_extended.rs | 12,643 | Extended builtins: crypto, JWT, codecs (JSON/YAML/TOML/CSV/MsgPack), HTTP, email, QR, compression |
| strykelang/vm.rs | 11,648 | Bytecode VM: register+stack hybrid, 347 opcodes, hot-path detection, exception unwinder, debugger should_stop hook on every op |
| strykelang/compiler.rs | 9,894 | AST→bytecode: constant folding, register allocation, peephole optimization, basic-block formation |
| tests/suite/fix_regressions.rs | 7,654 | Regression test suite — one test per fixed bug; pin behaviors so refactors can’t silently re-break |
| strykelang/lsp.rs | 12,094 | LSP server core — hover, completion (hash-key from builtin schemas, type/enum/constant/loop-label categories, qualified-suffix insertText, parse-error recovery, { trigger), diagnostics, semantic tokens, signature help, code actions (Extract V/C/P/M, caret-only, inside-string aware), goto/refs/rename (package-aware, AST-only field rename, defensive ::-strip on newName, cross-file via require graph), document symbols. Hover suppressed inside string literals. |
| strykelang/static_analysis.rs | 2,850 | Linter / strict-vars: OOP-aware $self->X/$obj->X checks walking extends + impl chains, universal-method whitelist (isa/can/clone/with/to_hash/...), constructor key validation with parent fields, match-arm enum-variant typo detection, positional vs keyed constructor auto-detect, exemptions for Perl special vars ($^X-family, $$, $#arr), AOP intercept context, block-param grammar, open-my-fh declarations, exists(&sub) introspection, _thread_par_run parser-emitted calls. Cross-file require resolution walks up to project root (sibling lib/ detection). |
| strykelang/lsp_symbols.rs | 1,448 | SymbolTable for the LSP server — declarations (sub, type, class, struct, enum, trait, field, our, format, label, package), references, scope chains. Drives rename, goto, references, document outline. Pre-pass collects every Type + Field name so cross-stmt lookups don't need topological ordering. |
| strykelang/ai.rs | 6,631 | Anthropic / OpenAI API client, tool-use protocol, batch / pmap fan-out, MCP integration, vision / audio modalities |
| strykelang/run_semantics_tests.rs | 5,446 | Perl-semantics correctness suite: behaviors that must match perl byte-for-byte |
| strykelang/jit.rs | 5,174 | Cranelift JIT: IR generation, native code emission, deoptimization stubs, calling-convention bridge |
| strykelang/crate_api_tests.rs | 5,164 | Public crate-API test suite: validates the embeddable interface (stryke::eval, VMHelper, ...) |
| strykelang/main.rs | 5,461 | CLI: argv, script loading, one-liner mode (-e/-E/-n/-p), shebang dispatch, builtin-as-command, --lsp / --dap / -d dispatch |
| strykelang/value.rs | 5,050 | StrykeValue: NaN-boxed union, conversions, stringify / numify, magic context handling |
| strykelang/lexer.rs | 4,187 | Tokenizer: heredocs, regex literals, q/qq/qw/qr, string interpolation, sigil disambiguation |
| strykelang/web.rs | 3,567 | Web framework: route DSL, controllers, sessions, cookies, flash, multipart parsing, static / template rendering |
| strykelang/native_codec.rs | 3,602 | Built-in JSON / YAML / TOML / CSV / MessagePack encode-decode |
| strykelang/scope.rs | 3,270 | Lexical scoping: pad slots, closures, my/local/our/oursync, BEGIN/END/INIT/CHECK |
| strykelang/bytecode.rs | 2,474 | Opcode definitions (347 variants), instruction encoding, disassembler |
| TOP 20 SUBTOTAL | 201,637 | 42.3% of 478,297-line Rust slice |
@EXECUTION PIPELINE
Source flows through 5 stages. Tier 1 is the bytecode VM; tier 2 is Cranelift JIT'd native code for hot blocks. Both tiers share the same NaN-boxed value representation, so deoptimization (JIT→VM fallback on type miss) is a frame-pointer swap, not a re-marshal.
Source (.stk / .pl / -e '...')
       │
       ▼
┌─────────────┐     ┌─────────────┐     ┌──────────────┐
│ lexer.rs    │────▶│ parser.rs   │────▶│ compiler.rs  │
│ (4,187)     │     │ (22,790)    │     │ (9,894)      │
│ Tokenizer   │     │ AST node    │     │ 347 opcodes  │
│ heredocs    │     │ variants    │     │ const fold   │
│ regex lit   │     │ prec climb  │     │ reg alloc    │
│ q/qq/qw/qr  │     │ whitelist   │     │ peephole     │
└─────────────┘     └─────────────┘     └──────┬───────┘
                                               │
                          ┌────────────────────┤
                          │                    │
                          ▼                    ▼
                   ┌──────────────┐     ┌──────────────┐
                   │ vm.rs        │     │ jit.rs       │
                   │ (11,648)     │     │ (5,174)      │
                   │ Bytecode VM  │     │ Cranelift    │
                   │ reg + stack  │     │ block JIT    │
                   │ hot detect   │────▶│ native code  │
                   │ exc unwind   │◀────│ deopt stubs  │
                   └──────┬───────┘     └──────────────┘
                          │
                          ▼
                ┌───────────────────┐
                │ builtins.rs       │
                │ (43,806)          │
                │ 10,459 builtins   │
                │ 11,191 spellings  │
                │ + extended        │
                │ (12,643)          │
                │ + math_wolfram*   │
                │ (58,672 / 82 f)   │
                └─────────┬─────────┘
                          │
         ┌────────────────┼─────────────────┐
         │                │                 │
  ┌──────▼──────┐  ┌──────▼──────┐   ┌──────▼──────┐
  │ value.rs    │  │ scope.rs    │   │ parallel    │
  │ (5,050)     │  │ (3,270)     │   │ (4,803)     │
  │ NaN-boxed   │  │ my/local/   │   │ pmap/pgrep  │
  │ StrykeValue │  │ oursync     │   │ rayon work  │
  └─────────────┘  └─────────────┘   └─────────────┘
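Both tiers rely on one NaN-boxed value shape. The sketch below shows the general technique in its smallest form: a single u64 that holds either an ordinary f64 or a tagged 32-bit integer hidden in the payload of a quiet NaN. The tag layout, the `Boxed` type, and the 32-bit integer width are assumptions made for illustration; stryke's actual StrykeValue encoding is not described in this report.

```rust
const QNAN: u64 = 0x7ff8_0000_0000_0000; // exponent all ones plus the quiet bit
const TAG_INT: u64 = 0x0001_0000_0000_0000; // hypothetical tag bit marking an integer payload
const INT_MASK: u64 = QNAN | TAG_INT;

#[derive(Clone, Copy, Debug, PartialEq)]
struct Boxed(u64);

impl Boxed {
    fn from_f64(x: f64) -> Self {
        // Canonicalize NaNs so a float can never carry the integer tag by accident.
        if x.is_nan() { Boxed(QNAN) } else { Boxed(x.to_bits()) }
    }
    fn from_i32(n: i32) -> Self {
        Boxed(INT_MASK | u64::from(n as u32))
    }
    fn is_int(self) -> bool {
        (self.0 & INT_MASK) == INT_MASK
    }
    fn as_i32(self) -> Option<i32> {
        if self.is_int() { Some(self.0 as u32 as i32) } else { None }
    }
    fn as_f64(self) -> Option<f64> {
        if self.is_int() { None } else { Some(f64::from_bits(self.0)) }
    }
    fn to_f64(self) -> f64 {
        match self.as_i32() {
            Some(n) => f64::from(n),
            None => f64::from_bits(self.0),
        }
    }
}

/// Integer fast path; falls back to float on overflow or mixed operands.
fn add(a: Boxed, b: Boxed) -> Boxed {
    if let (Some(x), Some(y)) = (a.as_i32(), b.as_i32()) {
        if let Some(sum) = x.checked_add(y) {
            return Boxed::from_i32(sum);
        }
    }
    Boxed::from_f64(a.to_f64() + b.to_f64())
}
```

Because every value is the same 64-bit word whichever tier produced it, handing a frame from compiled code back to an interpreter needs no conversion step, which is the property the paragraph above leans on.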
&BUILTIN INVENTORY
Builtins are partitioned at build time by build.rs, which scans strykelang/parser.rs and strykelang/builtins.rs to generate CORE_CATEGORY_MAP (Perl 5 core), EXT_CATEGORY_MAP (stryke extensions), and ALL_CATEGORY_MAP (every callable spelling, primaries plus aliases); language keywords (if, my, sub, fn, …) live in a separate hand-curated KEYWORDS table. The reflection tables back the runtime %b / %k / %all / %pc / %e / %a / %d / %c / %p hashes (plus their %stryke::* long-form aliases) under the disjoint-union invariant %all = %a + %b + %k. At runtime stryke also exposes %parameters (zsh-style live binding view) and per-package stashes (%main::, %Foo::) refreshed lazily on read.
118 hot builtins are reachable through dedicated BuiltinId opcodes in bytecode.rs; the rest go through the generic dispatch table. The 643-alias overhead exists so canonical Perl spellings (localtime), abbreviations (tj), and stryke-style snake_case alternatives all coexist without disambiguation. Disjoint union: keys %a + keys %b + keys %k == keys %all.
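The disjoint-union invariant is easy to state as a check. The following is a minimal sketch over plain string sets, not the code build.rs generates; the function name and the sample keys in the usage note are placeholders.

```rust
use std::collections::HashSet;

/// True when `all` is exactly the union of the three sets and no name
/// appears in more than one of them.
fn is_disjoint_union<'a>(
    aliases: &HashSet<&'a str>,
    builtins: &HashSet<&'a str>,
    keywords: &HashSet<&'a str>,
    all: &HashSet<&'a str>,
) -> bool {
    let pairwise_disjoint = aliases.is_disjoint(builtins)
        && aliases.is_disjoint(keywords)
        && builtins.is_disjoint(keywords);
    let union: HashSet<&'a str> = aliases
        .iter()
        .chain(builtins.iter())
        .chain(keywords.iter())
        .copied()
        .collect();
    pairwise_disjoint && union == *all
}
```

When the three sets are pairwise disjoint, the size identity follows directly: 643 aliases + 10,459 primaries + 89 keywords = 11,191 keys.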
// CATEGORIES
String (byte)
Perl 5 layer: length, substr, index, rindex, pos — byte-indexed, kept for compat. chomp, uc, lc, sprintf, reverse, split, join, tr///, s///, m//
String (codepoint)
Stryke layer: len, $s[i], $s[a:b], cindex, crindex — codepoint-indexed, all share a coordinate system. Never mix with byte-layer outputs.
Array
push, pop, shift, unshift, map, grep, sort, reverse, splice, wantarray
Hash
keys, values, each, exists, delete, tied, tie, untie
Numeric
abs, int, sqrt, sin, cos, atan2, exp, log, hex, oct, rand, srand
I/O
open, close, read, write, print, printf, say, readline, seek, tell, eof
File
stat, lstat, chmod, chown, unlink, rename, symlink, readlink, mkdir, rmdir, file-test ops
Process
fork, exec, system, wait, waitpid, kill, getpid, getppid, setsid
Regex
m//, s///, qr//, tr///, pos, named captures, lookahead/behind, fancy-regex backrefs
OO
bless, ref, isa, can, SUPER::, MRO C3 / DFS, roles, traits, class blocks
Format / Pack
pack, unpack, format, write, printf, sprintf with full Perl 5 format strings
Codec
to_json, from_json, to_yaml, from_yaml, to_toml, to_csv, to_msgpack, to_html, to_pdf
Crypto
AES, ChaCha20, RSA, Ed25519, P-256/384, X25519, Argon2, scrypt, bcrypt, PBKDF2, BLAKE3, SHA-2/3, JWT
Network
http_get, http_post, tcp_connect, tcp_listen, DNS lookup, URL parse, IP utilities
Parallel
pmap, pgrep, pflat_map, pfor, par_lines, par_walk, work-stealing iterators, channels
Sync Primitives
mutex() + mutex_lock / mutex_unlock / mutex_try_lock / mutex_is_locked (intra-process, parking_lot::Mutex + Condvar, no busy-wait). semaphore($n) + sem_acquire / sem_release / sem_try_acquire / sem_permits / sem_limit (counting semaphore, same backing). flock($fh, OP) wraps POSIX flock(2) for cross-process advisory locks (LOCK_SH=1, LOCK_EX=2, LOCK_NB=4, LOCK_UN=8; OR LOCK_NB into LOCK_SH or LOCK_EX for a non-blocking attempt, e.g. LOCK_EX|LOCK_NB=6). Pair with defer for RAII.
Filesystem Stream
fr (file rec), fw (file write), fa (file append), slurp, spurt, walk
Reflection
%b / %k / %all / %pc / %e / %a / %d / %c / %p (nine reflection hashes — eight build-time, plus %k hand-curated keyword table), %parameters (live bindings, zsh $parameters), %main:: / %Pkg:: (package stashes), caller, __PACKAGE__, __SUB__
// STRING COORDINATES — BYTES VS CODEPOINTS
Stryke runs string code in two coordinate systems. Perl 5 builtins stay byte-indexed for binary-protocol and .pm-source compat. Stryke extensions are codepoint-indexed so search positions and slice bounds line up. Positions are never auto-converted, so mixing the two systems silently misaligns on any non-ASCII content.
| Operation | Stryke (codepoints) | Perl 5 (bytes) |
|---|---|---|
| Length | len $s | length $s |
| Index char | $s[$i] | substr $s, $i, 1 |
| Slice | $s[$a:$b] (inclusive) | substr $s, $start, $len |
| Search forward | cindex $s, $needle [, $from] | index $s, $needle [, $from] |
| Search backward | crindex $s, $needle [, $from] | rindex $s, $needle [, $from] |
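For "hello ─ world" (15 bytes, 13 codepoints): length = 15, len = 13, index $s, "world" = 10, cindex $s, "world" = 8. ("world" is the last five characters, so it starts at 15 - 5 = 10 in bytes and 13 - 5 = 8 in codepoints.)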
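The same split exists in Rust, where `str` offsets are byte offsets and codepoint positions have to be counted. The sketch below reproduces the two coordinate systems with the standard library only; the helper names are hypothetical and simply mirror the roles of index / cindex and the inclusive `$s[$a:$b]` slice.

```rust
/// Byte offset of `needle` in `s` (what a byte-indexed search reports).
fn byte_index(s: &str, needle: &str) -> Option<usize> {
    s.find(needle)
}

/// Codepoint offset of `needle` in `s`: count the chars before the byte offset.
fn codepoint_index(s: &str, needle: &str) -> Option<usize> {
    s.find(needle).map(|b| s[..b].chars().count())
}

/// Inclusive codepoint slice from position `a` through position `b`.
fn codepoint_slice(s: &str, a: usize, b: usize) -> String {
    s.chars().skip(a).take((b + 1).saturating_sub(a)).collect()
}
```

For the string "hello \u{2500} world" the box-drawing character occupies three bytes but one codepoint, so the byte offset of "world" is 10 while its codepoint offset is 8. Feeding the byte offset into `codepoint_slice` would start two characters too late, which is the misalignment described above.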
!VM & BYTECODE
347 opcodes grouped into 27 functional categories. The VM is a register+stack hybrid: frame-local scalar slots provide O(1) access for hot variables (no string lookup), while the operand stack handles temporaries and complex expressions. Frame-local slots win on tight loops; the stack wins on irregular control flow.
// OPCODE CATEGORIES
Constants
LoadInt, LoadFloat, LoadConst, LoadUndef
Stack
Pop, Dup, Swap, Roll
Scalars
Name-pool indexed get/set, $_ magic, local/my/our binding
Arrays
NewArray, Push, Pop, Shift, Unshift, indexed get/set
Hashes
NewHash, get/set/delete, exists, each, keys, values
Arithmetic
Add, Sub, Mul, Div, Mod, Pow, Neg, integer/float dispatch
String
Concat, Repeat, Length, interpolation, escape decoding
Comparison
Numeric (==, !=, <, ...) + string (eq, ne, lt, ...) + spaceship
Logical / Bitwise
And, Or, Not, BitAnd, BitOr, BitXor, shifts
Control Flow
Absolute jumps, conditional branches, last/next/redo labels
Functions
Call, TailCall, Return, prototype dispatch
Try / Catch
VM exception handling: try_recover_from_exception, finally blocks, error propagation
Frame Slots
O(1) frame-local scalar access — bypasses pad lookup for hot registers
Streaming Map
Lazy iterator opcodes for |> and ~> chains: each stage pulls values on demand, so no intermediate vector is built
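The register+stack split can be illustrated with a toy interpreter: named variables live in indexed frame slots, while temporaries pass through an operand stack. This is a sketch under simplifying assumptions (integers only, eight made-up opcodes); none of the opcode names below are taken from stryke's 347-opcode set.

```rust
#[derive(Clone, Copy, Debug)]
enum Op {
    LoadInt(i64),      // push a constant onto the operand stack
    GetSlot(usize),    // push frame slot N (indexed access, no name lookup)
    SetSlot(usize),    // pop into frame slot N
    Add,               // pop b, pop a, push a + b
    Lt,                // pop b, pop a, push 1 if a < b else 0
    JumpIfZero(usize), // pop; jump to the target when the value is 0
    Jump(usize),       // unconditional jump
    Halt,
}

/// Run `code` with `slot_count` frame slots and return the final slot values.
fn run(code: &[Op], slot_count: usize) -> Vec<i64> {
    let mut slots = vec![0i64; slot_count];
    let mut stack: Vec<i64> = Vec::new();
    let mut pc = 0;
    while pc < code.len() {
        let op = code[pc];
        pc += 1;
        match op {
            Op::LoadInt(n) => stack.push(n),
            Op::GetSlot(i) => stack.push(slots[i]),
            Op::SetSlot(i) => slots[i] = stack.pop().expect("stack underflow"),
            Op::Add => {
                let b = stack.pop().expect("stack underflow");
                let a = stack.pop().expect("stack underflow");
                stack.push(a + b);
            }
            Op::Lt => {
                let b = stack.pop().expect("stack underflow");
                let a = stack.pop().expect("stack underflow");
                stack.push((a < b) as i64);
            }
            Op::JumpIfZero(target) => {
                if stack.pop().expect("stack underflow") == 0 {
                    pc = target;
                }
            }
            Op::Jump(target) => pc = target,
            Op::Halt => break,
        }
    }
    slots
}

/// Bytecode for: sum = 0; i = 0; while i < n { sum += i; i += 1 }
fn sum_below(n: i64) -> i64 {
    let code = [
        Op::LoadInt(0), Op::SetSlot(0),                          // 0-1: sum = 0
        Op::LoadInt(0), Op::SetSlot(1),                          // 2-3: i = 0
        Op::GetSlot(1), Op::LoadInt(n), Op::Lt,                  // 4-6: i < n
        Op::JumpIfZero(17),                                      // 7: leave the loop
        Op::GetSlot(0), Op::GetSlot(1), Op::Add, Op::SetSlot(0), // 8-11: sum += i
        Op::GetSlot(1), Op::LoadInt(1), Op::Add, Op::SetSlot(1), // 12-15: i += 1
        Op::Jump(4),                                             // 16: back to the test
        Op::Halt,                                                // 17
    ];
    run(&code, 2)[0]
}
```

The loop variables never leave their slots, and the stack never grows past two entries, which is the division of labour the section describes: slots for hot variables, stack for expression temporaries.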
~CRANELIFT JIT
Tier-2 compiler. The VM detects hot blocks (loops with stable types) and hands them to jit.rs, which lowers bytecode to Cranelift IR, then to native machine code. NaN-boxed StrykeValue stays the same shape across tiers, so a JIT→VM deopt is frame-pointer-swap-and-go — no value re-boxing.
Cranelift Backend
x86-64 + aarch64 native code generation. Same IR backend as Wasmtime — production-grade register allocator, optimizer, and code emitter.
Block-Level Compilation
Cranelift compiles individual basic blocks of bytecode, not whole functions. Lower latency than method-JIT; warm-up cost amortized across invocations.
Hot-Path Detection
VM tracks per-block invocation counts and type stability. Once a threshold is crossed and types haven’t shifted, the block is compiled.
Deoptimization
Type guard miss in JIT'd code branches back into the VM at the same bytecode offset. Stack and frame state survive the transition.
Numeric Specialization
Tight integer / float loops get specialized arithmetic with no boxing. NaN-tag bits are checked once per iteration, not per op.
Builtin Inlining
The 118 bytecoded builtins reachable through BuiltinId can be inlined directly into JIT code, skipping the dispatch table entirely.
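Hot-path detection as described above combines two signals: how often a block runs and whether its operand types have stayed the same. A minimal sketch of that bookkeeping follows, assuming a single observed type per block and a fixed threshold; the type names, the reset-on-shift policy, and the threshold value are illustrative choices, not details taken from jit.rs.

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Ty {
    Int,
    Float,
}

#[derive(Debug, PartialEq, Eq)]
enum Tier {
    Interpret,
    Compile,
}

struct BlockProfile {
    hits: u32, // consecutive executions seen with the same operand type
    last: Ty,  // operand type seen on the most recent execution
}

struct HotDetector {
    threshold: u32,
    blocks: HashMap<usize, BlockProfile>, // keyed by bytecode offset
}

impl HotDetector {
    fn new(threshold: u32) -> Self {
        HotDetector { threshold, blocks: HashMap::new() }
    }

    /// Record one execution of the block at `offset` and report which tier to use.
    fn record(&mut self, offset: usize, ty: Ty) -> Tier {
        let profile = self
            .blocks
            .entry(offset)
            .or_insert(BlockProfile { hits: 0, last: ty });
        if profile.last == ty {
            profile.hits += 1;
        } else {
            // Type shifted: restart the count so only stable runs reach the threshold.
            profile.last = ty;
            profile.hits = 1;
        }
        if profile.hits >= self.threshold {
            Tier::Compile
        } else {
            Tier::Interpret
        }
    }
}
```

Resetting the counter on a type shift is one simple way to express "types haven't shifted"; a production profiler would more likely track a small set of observed types per operand.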
%PARALLEL RUNTIME
11 modules listed below sum to 2,968 lines; the full parallel subsystem (helpers + stress.rs + orchestration) is 4,803 lines in the roll-up table. Every parallel builtin is a streaming iterator under the hood: pmap, pgrep, and pflat_map produce lazy results that flow into pipe-forward chains without materializing intermediate vectors. builtins_sync.rs adds blocking mutex + counting semaphore primitives backed by parking_lot::Mutex + Condvar for cases where lockless mysync isn't enough (compound multi-statement critical sections, bounded-concurrency throttles).
| File | Lines | Role |
|---|---|---|
| pmap_progress.rs | 755 | Progress reporting for long-running parallel maps; ETA estimation; cancellation tokens |
| par_pipeline.rs | 642 | Multi-stage parallel pipelines: stage fusion, cross-stage work-stealing |
| ppool.rs | 265 | Worker pool wrapper around rayon: thread-count tuning, panic isolation |
| pchannel.rs | 251 | Bounded MPMC channels for cross-stage handoff in par_pipeline |
| par_list.rs | 232 | Parallel list operations: pmap, pgrep, pflat_map, pfor on plain arrays |
| pwatch.rs | 205 | Filesystem-watcher integration: parallel reload on file change |
| builtins_sync.rs | 306 | Blocking sync primitives: mutex() / mutex_lock / mutex_unlock / mutex_try_lock / mutex_is_locked; semaphore($n) / sem_acquire / sem_release / sem_try_acquire / sem_permits / sem_limit. parking_lot::Mutex + Condvar — no busy-wait. Cross-process locking via the Perl-compat flock($fh, OP) builtin in builtins.rs:39185. |
| par_lines.rs | 144 | Parallel line-oriented file processing: chunked reads, line-boundary recovery |
| par_walk.rs | 69 | Parallel filesystem walking: walk, fr (file rec), parallel find |
| parallel_trace.rs | 61 | Tracer that records work-steal events for debugging contention |
| pcache.rs | 38 | Per-worker thread-local result cache; avoids cross-thread coordination on hot reads |
| PARALLEL TOTAL | 2,968 | 11 modules |
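A counting semaphore built from a mutex-guarded permit count and a condition variable is the standard shape for the blocking primitives listed above. The sketch below uses std::sync::Mutex and Condvar rather than parking_lot so it stays dependency-free; the `Semaphore` type and the `peak_concurrency` helper are illustrative, not the contents of builtins_sync.rs.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

struct Semaphore {
    permits: Mutex<usize>,
    available: Condvar,
}

impl Semaphore {
    fn new(n: usize) -> Self {
        Semaphore { permits: Mutex::new(n), available: Condvar::new() }
    }

    /// Block (sleeping, not spinning) until a permit is free, then take it.
    fn acquire(&self) {
        let mut permits = self.permits.lock().unwrap();
        while *permits == 0 {
            permits = self.available.wait(permits).unwrap();
        }
        *permits -= 1;
    }

    /// Take a permit only if one is free right now.
    fn try_acquire(&self) -> bool {
        let mut permits = self.permits.lock().unwrap();
        if *permits == 0 {
            return false;
        }
        *permits -= 1;
        true
    }

    /// Return a permit and wake one waiter.
    fn release(&self) {
        *self.permits.lock().unwrap() += 1;
        self.available.notify_one();
    }
}

/// Run `tasks` threads through a semaphore with `limit` permits and
/// return the highest number that were inside the guarded region at once.
fn peak_concurrency(tasks: usize, limit: usize) -> usize {
    let sem = Arc::new(Semaphore::new(limit));
    let active = Arc::new(AtomicUsize::new(0));
    let peak = Arc::new(AtomicUsize::new(0));
    let handles: Vec<_> = (0..tasks)
        .map(|_| {
            let (sem, active, peak) = (Arc::clone(&sem), Arc::clone(&active), Arc::clone(&peak));
            thread::spawn(move || {
                sem.acquire();
                let now = active.fetch_add(1, Ordering::SeqCst) + 1;
                peak.fetch_max(now, Ordering::SeqCst);
                thread::yield_now();
                active.fetch_sub(1, Ordering::SeqCst);
                sem.release();
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    peak.load(Ordering::SeqCst)
}
```

The `while` loop around `wait` matters: condition variables can wake spuriously, so the permit count is rechecked under the lock before it is decremented. A mutex is the same structure with a single permit.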
^PARSER & AST
Recursive-descent parser with operator-precedence climbing. 196 parse_* functions cover the full Perl 5 grammar plus stryke extensions: pipe-forward (|>), thread macro (~>), match patterns, class blocks, named arguments, retry / backoff syntax, defer, try/catch/finally.
| Enum | Variants | Role |
|---|---|---|
| ExprKind | 191 | Every expression form: literals, ops, calls, slices, regex, formats, captures |
| StmtKind | 47 | Statements: if, while, for, foreach, sub, package, use, BEGIN, ... |
| BinOp | 33 | Binary operators: arithmetic, comparison, logical, bitwise, string, regex bind |
| PerlTypeName | 10 | Type signatures: Int, Str, Array, Hash, Code, Ref, ... |
| UnaryOp | 7 | Unary: -, !, ~, defined, not, file-test, sigil-deref |
| MatchPattern | 6 | Pattern-match arms: literal, range, struct, array, alternation, guard |
| SubSigParam | 5 | Subroutine signature parameters: positional, named, slurpy, default, ... |
| Sigil / StringPart / MatchArrayElem / GrepBuiltinKeyword | 4 each | Lexical detail enums for sigils, interpolation parts, match elements, grep keywords |
| 20 enums total | 343 | Full AST surface area |
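Operator-precedence climbing is compact enough to show whole. The sketch below evaluates integer expressions directly instead of building an AST, and uses a five-operator table with `^` as a right-associative exponent; the token type, the table, and the function names are assumptions for illustration and are unrelated to the 196 parse_* functions in parser.rs.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum Tok {
    Num(i64),
    Op(char),
    LParen,
    RParen,
}

/// Precedence level and right-associativity for each binary operator.
fn prec(op: char) -> Option<(u8, bool)> {
    match op {
        '+' | '-' => Some((1, false)),
        '*' | '/' => Some((2, false)),
        '^' => Some((3, true)),
        _ => None,
    }
}

struct Parser<'a> {
    toks: &'a [Tok],
    pos: usize,
}

impl<'a> Parser<'a> {
    fn peek(&self) -> Option<Tok> {
        self.toks.get(self.pos).copied()
    }

    /// A number or a parenthesized sub-expression.
    fn primary(&mut self) -> i64 {
        match self.peek() {
            Some(Tok::Num(n)) => {
                self.pos += 1;
                n
            }
            Some(Tok::LParen) => {
                self.pos += 1;
                let value = self.expr(1);
                assert_eq!(self.peek(), Some(Tok::RParen), "expected ')'");
                self.pos += 1;
                value
            }
            other => panic!("unexpected token: {:?}", other),
        }
    }

    /// Parse and evaluate operators whose precedence is at least `min_prec`.
    fn expr(&mut self, min_prec: u8) -> i64 {
        let mut lhs = self.primary();
        while let Some(Tok::Op(op)) = self.peek() {
            let (p, right_assoc) = match prec(op) {
                Some(info) if info.0 >= min_prec => info,
                _ => break,
            };
            self.pos += 1;
            // Left-associative operators require strictly tighter binding on the right.
            let next_min = if right_assoc { p } else { p + 1 };
            let rhs = self.expr(next_min);
            lhs = match op {
                '+' => lhs + rhs,
                '-' => lhs - rhs,
                '*' => lhs * rhs,
                '/' => lhs / rhs,
                '^' => lhs.pow(rhs as u32),
                _ => unreachable!(),
            };
        }
        lhs
    }
}

fn eval(toks: &[Tok]) -> i64 {
    let mut parser = Parser { toks, pos: 0 };
    parser.expr(1)
}
```

The single `next_min` line carries the associativity rule: `10 - 4 - 3` groups to the left, `2 ^ 3 ^ 2` groups to the right. Adding a lower tier for word operators such as or/and/not would mean adding rows to the table, not new parsing functions.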
*TESTS & PARITY
Two complementary suites. Unit + integration tests are #[test] functions in Rust; parity tests are .pl scripts that must produce byte-identical output between system perl and stryke. CI runs both on every push.
// TEST SURFACES
Parity vs Perl
20,056 .pl cases under parity/cases/. CI runs each through both perl and stryke; any output mismatch fails the job. Catches semantic drift the unit tests miss.
Fix Regressions
tests/suite/fix_regressions.rs at 7,654 lines. One test per fixed bug. Refactors can’t silently re-break old behavior.
Crate API
strykelang/crate_api_tests.rs at 5,164 lines. Validates the embeddable interface: stryke::eval, VMHelper, public types stay stable across versions.
Run Semantics
strykelang/run_semantics_tests.rs at 5,446 lines. Behaviors that must match perl byte-for-byte: special variables, autovivification, list context.
Rosetta
1,648 .stk programs in examples/rosetta/t/ — canonical cross-language tasks (sort, parse JSON, fetch URL) ported into stryke. 100% of list.txt tasks complete (1,349 / 1,349).
Stryke Project
examples/project/ — full multi-module project with 183 stryke files (24,509 LOC) exercising the package layout, lib/, t/.
>REPL & INTERACTIVE SHELL
The s binary doubles as a Perl 5 interpreter and a Unix shell. Reedline-backed REPL with persistent history, tab-completion against the full 22,071-name corpus, ANSI-colored cyberpunk banner, and 80+ shell-style coreutils builtins (ls, cat, grep, cd, pushd, whoami, curl_get, tree) that are in-process Rust function calls, not subprocess spawns. No fork, no PATH walk, no exec — every result is a stryke value ready for the next pipeline stage.
// STARTUP BANNER
The banner is computed at REPL launch from live reflection — never hard-coded. Numbers shift as builtins land or get removed.
███████╗████████╗██████╗ ██╗   ██╗██╗  ██╗███████╗
██╔════╝╚══██╔══╝██╔══██╗╚██╗ ██╔╝██║ ██╔╝██╔════╝
███████╗   ██║   ██████╔╝ ╚████╔╝ █████╔╝ █████╗
╚════██║   ██║   ██╔══██╗  ╚██╔╝  ██╔═██╗ ██╔══╝
███████║   ██║   ██║  ██║   ██║   ██║  ██╗███████╗
╚══════╝   ╚═╝   ╚═╝  ╚═╝   ╚═╝   ╚═╝  ╚═╝╚══════╝
┌────────────────────────────────────────────────────────────────┐
│ SYSTEM status: ONLINE // os: macos arch: aarch64 pid: 30386    │
│ CORES 18 MEM 56.2 / 64.0 GiB available                         │
├────────────────────────────────────────────────────────────────┤
│ %b builtins    10296   %a aliases      641   %all  11022       │
│ %k keywords       85   %o operators     70   %v       97       │
│ %pc perl5 core   185   %e stryke ext 10111   %d     4255       │
│ %c categories    307   %p primaries    8805                    │
└────────────────────────────────────────────────────────────────┘
>> PARALLEL PERL5 INTERPRETER // RUST-POWERED v0.16.31 <<
// SHELL-LIKE BUILTINS (NO FORK, NO EXEC)
Filesystem (21)
cd pwd ls cat rm mv cp ln mkdir rmdir touch chmod chown realpath basename dirname mktemp mktempdir tree find glob — all wrap std::fs / tempfile in-process. Output is a stryke value, composable with |>.
Directory Stack (3)
pushd / popd / dir_stack backed by a process-wide Mutex<Vec<PathBuf>>. zsh-style swap-on-bare-pushd.
Identity (8)
whoami groups id hostname getpwuid getpwnam getgrgid getgrnam — libc-backed, no subprocess.
Process (10)
ps top kill nice renice time sleep wait fork exec via libc / nix.
Text Processing (16)
grep sort uniq wc head tail cut tr tac rev_lines shuf column comm diff myers_diff patience_diff — pure Rust.
Networking (9)
curl_get curl_post http_fetch http_post fetch_async fetch_json fetch_url_tool openurl xdg_open via ureq + OS launcher.
Archives (6)
tar zip unzip xz gzip zstd — tar / zip / xz2 / flate2 / zstd crates, all already deps.
Terminal (12)
clear cls reset beep ring_bell set_title set_term_title term_size term_width term_height tty_raw tty_cooked — termios ioctl + ANSI escapes.
REPL State (5)
history repl_alias repl_unalias set_alias unset_alias — process-wide tables, reedline history bridge via crate::builtins::repl_history_push().
Search (4)
which whereis type command — $PATH walker.
Encoding (5)
iconv strftime date_iso_format format_bytes format_duration — chrono + hand-rolled.
Inspection (7)
perfview docs lsp_completion_words methods_for which_all fields to_hash — introspect every reflection hash.
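The directory stack described above can be sketched with nothing but the standard library. This is a minimal illustration, not stryke's code: the function names and the choice to pass the current directory in explicitly (instead of calling chdir) are assumptions made so the sketch stays side-effect free.

```rust
use std::path::{Path, PathBuf};
use std::sync::Mutex;

// One stack for the whole process, mirroring the process-wide cwd.
static DIR_STACK: Mutex<Vec<PathBuf>> = Mutex::new(Vec::new());

/// `pushd DIR`: save `cwd` on the stack and return DIR as the new cwd.
/// Bare `pushd` (target = None): swap `cwd` with the top of the stack.
/// Returns None when a bare pushd finds the stack empty.
fn pushd(cwd: &Path, target: Option<&Path>) -> Option<PathBuf> {
    let mut stack = DIR_STACK.lock().unwrap();
    match target {
        Some(dir) => {
            stack.push(cwd.to_path_buf());
            Some(dir.to_path_buf())
        }
        None => {
            let top = stack.pop()?;
            stack.push(cwd.to_path_buf());
            Some(top)
        }
    }
}

/// `popd`: return the most recently saved directory, if any.
fn popd() -> Option<PathBuf> {
    DIR_STACK.lock().unwrap().pop()
}
```

A real builtin would follow each call with `std::env::set_current_dir` on the returned path; the stack itself needs no per-tab state because the working directory it tracks is process-wide.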
// DESIGN DECISIONS
| Decision | Rationale |
|---|---|
| In-process builtins, not subprocess wrappers | A subprocess fork+exec costs ~1-3 ms on Linux/macOS, so composing 5 of them in a pipeline adds 5-15 ms of overhead. Stryke's cat \|> grep \|> sort \|> uniq stays inside the VM: no process boundaries, and the output of one stage is a heap-resident StrykeValue passed to the next. |
| Reedline frontend, not raw readline | Pure-Rust line editor with no libreadline linkage (GNU readline is GPL-licensed, so linking it would mean accepting the GPL or switching to a BSD-licensed alternative). Native multi-line editing, syntax highlighting, history search, completer hooks. |
| Tab complete from %all via lsp_completion_words.txt | 22,071 names cached at build time. Reedline calls into the completer with each prefix and gets back a sorted candidate list. Includes every primary, alias, keyword, special variable, file-test op, and zsh-style glob qualifier. |
| repl_alias instead of alias | Bare alias is reserved for Perl typeglob assignment (*alias = \&original); shadowing it with a builtin would break tests/suite/behavior_pin_2026_05_d.rs::typeglob_assigns_coderef_alias. Same pattern for unalias → repl_unalias. |
| No source builtin | Perl already has do FILE (eval-in-current-scope) and require FILE (eval-once with INC tracking). Adding a third spelling violates the "no dups" rule. |
| Process-wide dir stack, not per-tab | pushd / popd mutate process cwd via chdir(2), which is process-wide on POSIX. The stack mirrors that: one Mutex<Vec<PathBuf>> for the whole interpreter. Tab- or window-scoped stacks would lie about what chdir did. |
| Banner numbers are computed live | Reflection hashes (%b, %a, %all, %k, %d, %c) are populated at compile time by build.rs from the actual try_builtin dispatch arms. The startup banner reads scalar(keys %b) etc. — if a builtin is added, the next REPL launch shows the updated count without touching banner.rs. |
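To make the in-process pipeline rationale concrete, here is a small sketch in which each stage is an ordinary function call and the value handed between stages is a heap-allocated `Vec<String>`. The stage functions are illustrative stand-ins (substring matching instead of regex grep, no file read), not stryke's builtins.

```rust
/// Keep only lines containing `needle` (substring match stands in for grep).
fn grep(lines: Vec<String>, needle: &str) -> Vec<String> {
    lines.into_iter().filter(|l| l.contains(needle)).collect()
}

/// Sort lines lexicographically.
fn sort(mut lines: Vec<String>) -> Vec<String> {
    lines.sort();
    lines
}

/// Drop adjacent duplicates, like uniq on sorted input.
fn uniq(mut lines: Vec<String>) -> Vec<String> {
    lines.dedup();
    lines
}

/// Same shape as `cat |> grep |> sort |> uniq`, minus the file read:
/// each stage takes ownership of the previous stage's output.
fn pipeline(text: &str, needle: &str) -> Vec<String> {
    let lines: Vec<String> = text.lines().map(str::to_owned).collect();
    uniq(sort(grep(lines, needle)))
}
```

No process is spawned at any step, so the per-stage cost is a function call plus whatever work the stage itself does.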
// COMPARISON
| REPL | Builtins in scope | Tab-complete corpus | Cold start | Pipelines in-process |
|---|---|---|---|---|
| bash 5 | ~150 | ~250 | ~5 ms | No (fork+exec per stage) |
| zsh 5 | ~140 + zinit-loaded | ~2k via compinit | ~30 ms (cold) | No |
| fish 3 | ~200 | ~1k | ~25 ms | No |
| nushell 0.x | ~350 | ~400 | ~50 ms | Yes (pipe is in-VM) |
| python -i | 71 | 71 + imported | ~40 ms | No (subprocess module spawns) |
| node | 0 (require-first) | REPL-internal | ~80 ms | No |
| stryke | 10,459 | 22,071 | < 10 ms | Yes (every \|> stage is a heap value) |
Nushell is the only direct architectural peer (in-process pipelines, structured values), but its builtin surface is more than an order of magnitude smaller and its cold start is slower. No other REPL in this comparison combines shell-grade pipelines, a 10k+ builtin surface, and sub-10 ms launch.
:EDITOR TOOLING — LSP / DAP / JETBRAINS
The same stryke binary that runs scripts also serves Language Server Protocol and Debug Adapter Protocol. No separate stryke-lsp or stryke-dap artifacts — editors point their LSP/DAP clients at stryke --lsp / stryke --dap and get hover docs, completion, semantic tokens, breakpoints, step over/into/out, and recursive variable drill-down. The TTY debugger (stryke -d) and the DAP server share a single Debugger state machine.
// LSP SERVER — stryke --lsp
- Hover over any of 4,255 documented builtins → category + signature + description from the live %stryke::descriptions / %stryke::categories hashes
- Completion with the full builtin set (every spelling in %all: 11,191 keys including 643 aliases and 89 keywords)
- Semantic tokens: keywords, builtins, sigils, types, regex flags get distinct categories so themes can paint them independently
- Signature help on function-call cursor; code actions for common diagnostics
- Go-to-definition for user subs, classes, structs, enums
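Completion against a large, pre-sorted name list can be answered with two binary searches instead of a linear scan. The sketch below assumes the names are held as a sorted slice; the function name `complete` and that storage choice are illustrative, not taken from stryke's source.

```rust
/// Return every name in `sorted` that starts with `prefix`.
/// `sorted` must be in ascending byte order, so all matches are contiguous.
fn complete<'a>(sorted: &'a [&'a str], prefix: &str) -> &'a [&'a str] {
    // First index whose name is >= prefix.
    let start = sorted.partition_point(|name| *name < prefix);
    // Length of the run of names that carry the prefix.
    let len = sorted[start..].partition_point(|name| name.starts_with(prefix));
    &sorted[start..start + len]
}
```

Because the result is a sub-slice of the input, the candidate list comes back already sorted and without copying any strings.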
// DAP DEBUGGER — stryke --dap [HOST:PORT]
Standard Debug Adapter Protocol over stdio or TCP. It drives the same Debugger state machine that the TTY debugger (stryke -d) uses: breakpoints, step modes, scope inspection, and call-stack tracking are all shared. Two front-ends on one core.
| Capability | Detail |
|---|---|
| Line breakpoints | Gutter clicks fire on the exact source line. Lexer line-tracking fixed so class declarations don't drift the line counter and silently drop BPs. |
| Function breakpoints | Stop on entry to any named sub. |
| Step over / into / out | Depth-tracking on every UDF entry/exit so step-over stays in the current frame instead of diving into call sites. |
| Pause / Continue / Run-to-Cursor | Standard transport actions. |
| Recursive Variables panel | Every hash, array, struct, enum variant, class instance, set, and sketch (TDigestSketch, BloomFilter, HllSketch, CmsSketch, TopKSketch) expands inline through variablesReference. Class fields show visibility markers (+ public / # protected / - private) plus the __class / __isa chain. |
| Sort order | User my $foo vars first, then topic family ($_, $_0, $_1, …), then $a / $b, then runtime builtins. Compiler-synthetic names (__foreach_i__ etc.) hidden. |
| Evaluate dialog | Scalar prelude injection so $a * $b returns the right value from the paused frame. |
| Console output | p / print / say stream into the IDE Console in real-time (DAP mode forces $\| = 1 autoflush). |
Supported DAP requests: initialize, setBreakpoints (line), setFunctionBreakpoints, setExceptionBreakpoints, configurationDone, launch, threads, stackTrace, scopes, variables, continue, next, stepIn, stepOut, pause, evaluate, terminate, disconnect. Events: initialized, stopped, output, process, thread, exited, terminated.
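The depth-tracking idea behind step over / into / out can be sketched as a small state machine. The type and method names below (`StepMode`, `should_stop`, `enter_sub`, `exit_sub`) are hypothetical and only illustrate the technique; they are not stryke's Debugger API.

```rust
#[derive(Clone, Copy)]
enum StepMode {
    Continue,              // run until a breakpoint
    Into,                  // stop at the next line, at any depth
    Over { depth: usize }, // stop at the next line at this depth or shallower
    Out { depth: usize },  // stop once the frame at this depth has returned
}

struct Debugger {
    depth: usize, // current call depth, updated on every sub entry/exit
    mode: StepMode,
}

impl Debugger {
    fn enter_sub(&mut self) {
        self.depth += 1;
    }

    fn exit_sub(&mut self) {
        self.depth = self.depth.saturating_sub(1);
    }

    /// Called before each source line executes.
    fn should_stop(&self, at_breakpoint: bool) -> bool {
        if at_breakpoint {
            return true;
        }
        match self.mode {
            StepMode::Continue => false,
            StepMode::Into => true,
            StepMode::Over { depth } => self.depth <= depth,
            StepMode::Out { depth } => self.depth < depth,
        }
    }
}
```

Recording the depth at the moment the user presses "step over" is what keeps the debugger from stopping inside a callee: lines executed at a greater depth simply fail the comparison.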
// JETBRAINS PLUGIN — editors/intellij/
First-class JVM plugin built on IntelliJ Platform Gradle Plugin 2.16.0, Kotlin 2.0.21, JDK 17. Supports the entire JetBrains paid lineup: RustRover, IDEA Ultimate, GoLand, PyCharm Pro, WebStorm, RubyMine, PhpStorm, CLion, Rider, DataGrip, Aqua. Community editions are not supported — LSP API is paid-only on the JetBrains side.
- .stk file association with 44-slot color scheme. Keywords, sigils, builtins, types, regex bodies/flags, heredocs, sub names, class fields, all paintable independently.
- LSP client consuming stryke --lsp: semantic tokens, completion, hover, signature help, code actions.
- DAP debugger consuming stryke --dap HOST:PORT (TCP transport keeps OSProcessHandler stdout/stderr free for the program's own p / print output).
- Reflection tool window: 9 tabs on the live %stryke::* hashes (≈11k builtin names, 643 aliases, 89 keywords, 317 categories, 4,391 descriptions, 8,942 primaries, 10,273 extensions, 186 perl_compats). Left single-click opens the docs popup; right-click inserts at cursor.
- Run configurations + gutter run icons: execute, debug, or profile any .stk file with one click.
// OTHER EDITORS
| Editor | Integration |
|---|---|
| Neovim | editors/stryke.lua + ALE / vim-lsp / coc.nvim via stryke --lsp |
| Vim | editors/stryke.vim — syntax + indent + filetype detection |
| Helix | editors/helix-languages.toml drop-in |
| VS Code | coc + editors/vscode-settings.json bridge |
.EXAMPLES & SOLVED PROBLEMS
Stryke ships with the largest curated corpus of cross-language exemplars in any single-author Perl 5 work-alike: 2,957 .stk programs under examples/ totaling 228,166 LOC. All 1,349 Rosetta Code tasks in list.txt are solved (100%) — every test compiles and runs against the current build, so any builtin regression surfaces as a failing test before it reaches the parity suite.
// CORPUS BREAKDOWN
| Corpus | Path | Files | LOC | Status |
|---|---|---|---|---|
| Rosetta Code | examples/rosetta/t/ | 1,648 | 73,643 | All 1,349 of 1,349 problems in list.txt solved (100%). Avg 45 LOC/test. Largest: heavy_light 189, red_black_tree 179, baum_welch 177, strassen 168, delaunay 167. |
| Sample Project | examples/project/ | 183 | 24,509 | Multi-module mini-project: lib/ namespaces + t/ test layout exercising the package manager, use resolution, namespace dispatch. |
| Exercism | examples/exercism/ | 346 | 6,083 | 173 exercises (one solution + tests each). Polyglot .exercism.json compatible. |
| Top-level | examples/*.stk | 458 | 46,439 | Quick-start demos + worked end-to-end programs (TF-IDF search, mini-SQL executor, Kalman filter, Raft simulator, HTTP router with middleware, pub/sub bus, CSV ETL, …). |
| Polyglot ports | examples/ruby/, examples/scheme/ | 2 | 9,077 | Reference ports (not counted in Rosetta / Exercism rows). |
| EXAMPLES TOTAL | | 2,957 | 228,166 | 5 groups under examples/ · ~77 LOC/file avg |
The Rosetta corpus is a complete cross-language reference port at 1,349/1,349 (100%). The full test runner (s test examples/project/t examples/rosetta/t) executes 1,751 .stk test files and 12,211 assert_eq / assert_ok checks across those trees; any builtin drift surfaces as a failing test before it reaches the parity suite. New Rosetta tasks land as fast as the canonical site adds them.
+DEPENDENCIES
135 direct dependencies, 680 transitive crates in Cargo.lock. Every dep is audited against the “will this still build cleanly in 2030” bar — foundational crates (serde, regex, libc, nix) over fashionable ones.
Core Runtime
rayon, crossbeam, parking_lot, dashmap, indexmap, thiserror
JIT
cranelift-codegen, cranelift-frontend, cranelift-jit, cranelift-module
Regex
regex, fancy-regex — standard + lookahead/behind / backref engine
Codec
serde_json, serde_yaml, toml, csv, rmp-serde (MessagePack)
Crypto
aes, chacha20, rsa, ed25519-dalek, x25519-dalek, p256, p384, k256, argon2, scrypt, bcrypt, sha2, sha3, blake3, ripemd, md4, md-5
HTTP / Network
ureq, scraper, roxmltree, uuid, TCP / DNS via libc+nix
Output
svg2pdf, pdf-extract, id3, mp3lame-encoder, qrcode
CLI
clap (no default features), itoa, caseless, glob, libc
LSP
lsp-server, serde, serde_derive
System
sysinfo, nix, libc, rand, lru, itertools, totp-lite
;PUBLIC API SURFACE
strykelang ships as both a binary (stryke, st, s) and an embeddable Rust crate (stryke on docs.rs). The public surface stays small relative to internals — embedders get the high-level entry points; the rest is implementation.
| Surface | Count | Notes |
|---|---|---|
| Public functions | 874 | Across the whole repo — pub fn / pub(crate) fn at the crate root |
| Public structs | 97 | Plus internal structs (count varies as internals grow) |
| Public enums | 42 | Plus internal enums (count varies as internals grow) |
| Impl blocks | 189 | Across both data and trait impls |
| Total fns (any vis) | 18,041 | Including methods, closures, and inline fns |
?KEY DESIGN DECISIONS
Why stryke looks the way it does. Each call-out below is a decision the implementation could have gone either way on, with the rationale for the path taken.
NaN-Boxed StrykeValue
Every value is one 64-bit word. Tag bits live in the unused mantissa of an IEEE-754 NaN. Saves a pointer indirection on every read; keeps the JIT’s register allocator happy with uniform value width.
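A minimal NaN-boxing sketch follows. The bit layout here (sign + exponent + quiet bit as the box marker, a 3-bit tag at bits 48..50, a 48-bit payload) is an assumption chosen for illustration; the source does not state StrykeValue's actual layout.

```rust
// Box marker: sign bit, all-ones exponent, and the quiet-NaN bit.
const BOX_MASK: u64 = 0xFFF8_0000_0000_0000;
// Every real NaN is stored as this positive quiet NaN, which is not boxed.
const CANONICAL_NAN: u64 = 0x7FF8_0000_0000_0000;
const TAG_SHIFT: u32 = 48;
const TAG_MASK: u64 = 0b111;
const PAYLOAD_MASK: u64 = (1 << TAG_SHIFT) - 1;
const TAG_INT: u64 = 1;
const TAG_BOOL: u64 = 2;

#[derive(Clone, Copy, Debug, PartialEq)]
struct Value(u64);

impl Value {
    fn from_f64(f: f64) -> Self {
        // Canonicalize so an arithmetic NaN can never look like a boxed tag.
        if f.is_nan() { Value(CANONICAL_NAN) } else { Value(f.to_bits()) }
    }

    fn from_i32(i: i32) -> Self {
        Value(BOX_MASK | (TAG_INT << TAG_SHIFT) | (i as u32 as u64))
    }

    fn from_bool(b: bool) -> Self {
        Value(BOX_MASK | (TAG_BOOL << TAG_SHIFT) | (b as u64))
    }

    fn is_boxed(self) -> bool {
        (self.0 & BOX_MASK) == BOX_MASK
    }

    fn tag(self) -> u64 {
        (self.0 >> TAG_SHIFT) & TAG_MASK
    }

    fn as_f64(self) -> Option<f64> {
        if self.is_boxed() { None } else { Some(f64::from_bits(self.0)) }
    }

    fn as_i32(self) -> Option<i32> {
        (self.is_boxed() && self.tag() == TAG_INT)
            .then(|| (self.0 & PAYLOAD_MASK) as u32 as i32)
    }

    fn as_bool(self) -> Option<bool> {
        (self.is_boxed() && self.tag() == TAG_BOOL).then(|| (self.0 & 1) == 1)
    }
}
```

Doubles are stored as their own bit pattern, so reading one is a type check plus a reinterpretation with no pointer to chase; every other kind hides inside bit patterns that no ordinary double uses.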
Bytecode VM, Not Tree-Walker
An AST tree-walker is simpler but pays a virtual-call cost per node. Bytecode collapses that into a dense dispatch loop, which the CPU’s branch predictor can actually warm up to.
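For readers who have not seen one, a dense dispatch loop looks roughly like this. The five-opcode, stack-only instruction set is a toy made up for the example; stryke's real VM is a register+stack hybrid with a far larger opcode set.

```rust
#[derive(Clone, Copy)]
enum Op {
    Push(i64),
    Add,
    Mul,
    JumpIfZero(usize), // pop; jump to the given index if the value was 0
    Halt,
}

/// Run `code` and return the value left on top of the stack, if any.
/// Returns None on stack underflow.
fn run(code: &[Op]) -> Option<i64> {
    let mut stack: Vec<i64> = Vec::new();
    let mut pc = 0usize;
    // One loop, one match: every instruction goes through the same dispatch point.
    while let Some(&op) = code.get(pc) {
        pc += 1;
        match op {
            Op::Push(n) => stack.push(n),
            Op::Add => {
                let (b, a) = (stack.pop()?, stack.pop()?);
                stack.push(a.wrapping_add(b));
            }
            Op::Mul => {
                let (b, a) = (stack.pop()?, stack.pop()?);
                stack.push(a.wrapping_mul(b));
            }
            Op::JumpIfZero(target) => {
                if stack.pop()? == 0 {
                    pc = target;
                }
            }
            Op::Halt => break,
        }
    }
    stack.pop()
}
```

The instructions sit in one contiguous slice and the interpreter never recurses, which is the structural difference from walking a tree of heap-allocated nodes.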
Cranelift, Not LLVM
LLVM gives slightly better steady-state code but compiles 10× slower and pulls in a giant C++ dependency. Cranelift is pure Rust, fast to compile, and ships in Wasmtime — production-tested.
Rayon, Not Tokio
Stryke parallelism is data-parallel, not async-IO-parallel. Rayon’s work-stealing fits pmap/pgrep exactly; tokio’s task model would force every callback into a future.
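The data-parallel shape can be shown without any external crate. The sketch below swaps rayon for `std::thread::scope` with static chunking, so it demonstrates the callback-per-item model but not work-stealing; the name `pmap` is borrowed from the text and the signature is an assumption.

```rust
use std::thread;

/// Apply `f` to every item in parallel and return results in input order.
/// Each worker gets one contiguous chunk; rayon would instead rebalance
/// uneven work by stealing.
fn pmap<T, R, F>(items: &[T], workers: usize, f: F) -> Vec<R>
where
    T: Sync,
    R: Send,
    F: Fn(&T) -> R + Sync,
{
    if items.is_empty() {
        return Vec::new();
    }
    let workers = workers.max(1);
    let chunk = (items.len() + workers - 1) / workers;
    let f = &f;
    thread::scope(|s| {
        // Spawn every worker first, then join, so the chunks run concurrently.
        let handles: Vec<_> = items
            .chunks(chunk)
            .map(|part| s.spawn(move || part.iter().map(f).collect::<Vec<R>>()))
            .collect();
        handles
            .into_iter()
            .flat_map(|h| h.join().expect("worker panicked"))
            .collect()
    })
}
```

The callback is a plain `Fn(&T) -> R`; nothing has to be wrapped in a future, which is the property the paragraph above contrasts with an async task model.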
Streaming Iterators
pmap doesn’t materialize a vec before pgrep consumes it. map_stream.rs threads results pull-side through pipe-forward chains; only the final sink pays the allocation.
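Pull-side streaming is easy to observe with Rust's lazy iterators. This sequential sketch is only an analogy for the pipe-forward behavior described above: the counter (a hypothetical probe, not part of any stryke API) records how many items the map stage actually processed.

```rust
use std::cell::Cell;

/// Lazily double each input, then keep multiples of 3.
/// Nothing runs until the caller pulls, and no intermediate Vec is built.
fn stream<'a>(
    input: &'a [i64],
    mapped: &'a Cell<usize>,
) -> impl Iterator<Item = i64> + 'a {
    input
        .iter()
        .map(move |x| {
            mapped.set(mapped.get() + 1); // count work done by the map stage
            x * 2
        })
        .filter(|y| y % 3 == 0)
}
```

Taking two results from a nine-element input drives the map stage only as far as needed to produce them; the only allocation happens in whatever sink finally collects.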
Single Cargo Package
Stryke is one crate, three binaries (stryke/st/s) and a library. No workspace, no internal sub-crates — rebuilds are smaller, dependency graph is shallow, embedders see a flat API.
Build-Time Reflection
build.rs scans the dispatcher source and generates %builtins, %aliases, %descriptions. %keywords is a hand-curated parallel table for syntactic names (if, my, sub, fn, …), keeping %builtins callable-only. Source is the single truth; LSP descriptions can’t drift from the actual dispatcher.
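The kind of scan a build script can run over dispatcher source might look like the following. It assumes, purely for illustration, that dispatch arms are written one per line as `"name" => ...` or `"a" | "b" => ...`; the real dispatcher layout and the real build.rs logic are not shown in the source.

```rust
use std::collections::BTreeSet;

/// Collect quoted names from match arms shaped like `"name" => ...`
/// or `"a" | "b" => ...`. Arms without string patterns are ignored.
fn scan_builtins(dispatcher_src: &str) -> BTreeSet<String> {
    let mut names = BTreeSet::new();
    for line in dispatcher_src.lines() {
        // Only lines that contain an arm arrow are candidates.
        let Some((pattern, _body)) = line.split_once("=>") else {
            continue;
        };
        for alt in pattern.split('|') {
            let alt = alt.trim();
            if let Some(name) = alt.strip_prefix('"').and_then(|s| s.strip_suffix('"')) {
                names.insert(name.to_owned());
            }
        }
    }
    names
}
```

Because the set is derived from the same text the compiler builds, a name cannot appear in the generated table without a matching arm, and vice versa.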
Pipe-Forward + Thread Macro
|> and ~> aren’t cosmetic — they’re first-class parser productions that compile into streaming-iterator opcodes. Reading left-to-right is the point; allocation-free chains are the dividend.
AOP at the Call Site
before / after / around register advice into a single Vec<Intercept> on the interpreter. vm_dispatch_user_call takes a one-line glob-match guard before the existing fast-path, so calls with no intercepts pay only the cost of that guard. Around is AspectJ-style: the block’s value is the call’s return; proceed() invokes the original. Same surface as zshrs intercept: one design across CLI and language.
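A stripped-down version of the call-site guard plus around advice could look like this. The types and the tiny glob matcher (exact match or a trailing `*`) are hypothetical simplifications; only the overall shape, a single intercept list checked before the normal call path, follows the description above.

```rust
/// One registered `around` advice: a glob pattern plus the advice body.
/// The advice receives the argument and a `proceed` handle to the original.
struct Intercept {
    pattern: String,
    around: Box<dyn Fn(i64, &dyn Fn(i64) -> i64) -> i64>,
}

struct Interp {
    intercepts: Vec<Intercept>,
}

/// Minimal glob: `prefix*` matches by prefix, anything else must match exactly.
fn glob_match(pattern: &str, name: &str) -> bool {
    match pattern.strip_suffix('*') {
        Some(prefix) => name.starts_with(prefix),
        None => pattern == name,
    }
}

impl Interp {
    fn call(&self, name: &str, target: &dyn Fn(i64) -> i64, arg: i64) -> i64 {
        // Guard: with no advice registered this is one emptiness check.
        if self.intercepts.is_empty() {
            return target(arg);
        }
        match self.intercepts.iter().find(|i| glob_match(&i.pattern, name)) {
            // The advice's value becomes the call's value; `target` is proceed().
            Some(i) => (i.around)(arg, target),
            None => target(arg),
        }
    }
}
```

The advice decides whether and how often to invoke the original, which is what makes the around form strictly more general than before and after.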
rkyv KV Store (world-first)
First-class CRUD store with rkyv as the on-disk codec. kv_open/kv_put/kv_get/kv_del/kv_exists/kv_keys/kv_scan/kv_len/kv_commit/kv_batch/kv_close/kv_stats ship as native builtins, dispatched via kvstore.rs::builtin_kv_*. Reads are mmap + validate + cast — same primitive script_cache.rs:454 already uses for cached bytecode; no parse, no allocate per row. SQLite-shaped ergonomics, beats SQLite on reads for any store that fits comfortably in RAM. Atomic rewrite on commit (tmp + rename + fsync), versioned format header (STKV magic + format_version) so future stryke versions reject mismatched archives instead of silently mis-deserialising. All-or-nothing kv_batch with in-memory snapshot rollback. World-first: no other scripting language ships a zero-copy archive KV as a core builtin. Phase 2 ships stryke kvd server + remote kv_connect client over the same archive bytes.
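The commit and open paths described above (tmp + fsync + rename, then a magic and version check on read) can be sketched with the standard library alone. This is not the kvstore.rs implementation: the header layout (4-byte STKV magic followed by a little-endian u32 version), the version number, and the function names are assumptions, and the payload is read into a Vec instead of being memory-mapped and validated by rkyv.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;

const MAGIC: &[u8; 4] = b"STKV";
const FORMAT_VERSION: u32 = 1; // hypothetical version number

/// Write header + payload to a sibling temp file, fsync it,
/// then rename it over `path` so readers see old or new, never a mix.
fn commit(path: &Path, payload: &[u8]) -> io::Result<()> {
    let tmp = path.with_extension("tmp");
    let mut file = File::create(&tmp)?;
    file.write_all(MAGIC)?;
    file.write_all(&FORMAT_VERSION.to_le_bytes())?;
    file.write_all(payload)?;
    file.sync_all()?; // data must be durable before the rename publishes it
    fs::rename(&tmp, path)
}

/// Return the payload, or an InvalidData error on a bad magic or version.
fn open(path: &Path) -> io::Result<Vec<u8>> {
    let bytes = fs::read(path)?;
    let bad = |msg: &str| io::Error::new(io::ErrorKind::InvalidData, msg.to_owned());
    if bytes.len() < 8 || !bytes.starts_with(MAGIC) {
        return Err(bad("not a STKV archive"));
    }
    let version = u32::from_le_bytes([bytes[4], bytes[5], bytes[6], bytes[7]]);
    if version != FORMAT_VERSION {
        return Err(bad("unsupported format_version"));
    }
    Ok(bytes[8..].to_vec())
}
```

Rejecting on the header before touching the payload is what turns a format mismatch into a clean error instead of a misread record. A production version would typically also fsync the containing directory after the rename.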
Sketch Algebra (world-first)
Probabilistic data structures are first-class operands for + | & ^ -. $bloom_a + $bloom_b = Bloom union, $hll_a + $hll_b = HyperLogLog union, $cms_a + $cms_b = Count-Min counter sum, $topk_a + $topk_b = SpaceSaving merge, $td_a + $td_b = t-digest centroid merge, $rb_a & $rb_b / | / ^ / - = Roaring set algebra. Operators are functional — operand sketches are never mutated. Dispatch lives in vm.rs’s Op::Add / BitOr / BitAnd / BitXor / Sub arms, falling through to try_sketch_binop in sketches.rs before the default numeric path. Shape mismatch falls back cleanly — no panic, no silent truncation. No other language treats these as syntactic primitives.
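Operator-driven sketch merging can be illustrated with a toy Bloom filter whose `+` is a functional union. Everything here (the `Bloom` struct, its hashing scheme, returning `None` on a shape mismatch) is an assumption for the example and does not reflect sketches.rs or try_sketch_binop.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::ops::Add;

#[derive(Clone, Debug, PartialEq)]
struct Bloom {
    bits: Vec<u64>,
    hashes: u32,
}

impl Bloom {
    fn new(words: usize, hashes: u32) -> Self {
        Bloom { bits: vec![0; words.max(1)], hashes }
    }

    /// Word index and bit mask for `item` under hash number `seed`.
    fn slot<T: Hash>(&self, item: &T, seed: u32) -> (usize, u64) {
        let mut h = DefaultHasher::new();
        seed.hash(&mut h);
        item.hash(&mut h);
        let bit = h.finish() % (self.bits.len() as u64 * 64);
        ((bit / 64) as usize, 1u64 << (bit % 64))
    }

    fn insert<T: Hash>(&mut self, item: &T) {
        for seed in 0..self.hashes {
            let (word, mask) = self.slot(item, seed);
            self.bits[word] |= mask;
        }
    }

    fn contains<T: Hash>(&self, item: &T) -> bool {
        (0..self.hashes).all(|seed| {
            let (word, mask) = self.slot(item, seed);
            (self.bits[word] & mask) != 0
        })
    }
}

/// `&a + &b` is the union. Operands are borrowed, never mutated.
/// Filters with different shapes cannot be merged, so the result is None.
impl Add for &Bloom {
    type Output = Option<Bloom>;

    fn add(self, rhs: Self) -> Option<Bloom> {
        if self.bits.len() != rhs.bits.len() || self.hashes != rhs.hashes {
            return None;
        }
        let bits = self.bits.iter().zip(&rhs.bits).map(|(a, b)| a | b).collect();
        Some(Bloom { bits, hashes: self.hashes })
    }
}
```

Union is a bitwise OR of equally shaped bit arrays, which is why the shape check comes first: merging filters of different sizes or hash counts would yield a filter that gives false negatives.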