// ZSH-MORE-COMPLETIONS — ENGINEERING REPORT

world's largest curated zsh completion corpus · 37,003 _* files · 7 fpath-prefixed source dirs · 6 Python harvest pipelines · 24,017 zunit @test cases

>_EXECUTIVE SUMMARY

zsh-more-completions is the largest curated zsh completion corpus in existence. Its 37,003 _* compsys completion files are partitioned across 7 source directories: src/ (9,253), more_src/ (8,974), more_src2/ (8,546), more_src3/ (5,735), man_src/ (3,398), architecture_src/ (1,087), and override_src/ (10). A single 18-LOC zsh-more-completions.plugin.zsh entrypoint splices all 7 onto $fpath at source time (override prepended, the rest appended).

Files are auto-generated by 6 Python harvest pipelines (scripts/harvest-{apt,brew,crates,gh-releases,nix,web}-completions.py) that pull from Debian Contents indexes, Homebrew bottles, nixpkgs install-shell-completion macros, crates.io's clap_complete reverse-deps, GitHub release tarballs, and GitHub tree/code-search results. Each capture is verified against the upstream argument parser (clap derive, argparse, Getopt::Long, cobra, urfave/cli, Click, Kong, Cmd modules, cmdliner), validated with zsh -nf, header-stamped with a provenance URL, and pinned by an existence test under tests/t-*-existence*.zsh.

The README "R# reconcile cadence" tags (last visible: R1431) anchor the periodic README/index.html sync points where doc claims are reconciled against print-repo-stats.zsh output instead of being hand-edited. The 32,780-entry blacklist.txt is the corpus's negative space: every name a harvester ever rejected, so re-runs converge instead of thrashing.

| Metric | Value |
| --- | --- |
| Completion Files | 37,003 |
| Source Dirs | 7 |
| zunit @test Cases | 24,017 |
| Harvest Pipelines | 6 |
| Harvest Pipeline LOC | 3,287 |
| Blacklist Entries | 32,780 |
| Test Files | 22 |
| Docs/Repo Gates | 21 |
| Git Commits | 5,695 |
| Plugin Entrypoint LOC | 18 |

Corpus Distribution — 37,003 files

9,253 src · 8,974 more_src · 8,546 more_src2 · 5,735 more_src3 · 3,398 man_src · 1,087 arch · 10 override

All counts produced by zsh scripts/print-repo-stats.zsh — the authoritative source of truth. README / docs claims reconcile against this output via the R# cadence tags; no hand-typed totals anywhere in the repo.


~SCALE & POSITION

Quantitative comparison against every other public zsh / fish / bash completion collection that ships as a package. The competing sets are all curated; the difference is corpus size, harvest automation, and verification depth.

| Collection | Shell | Completion files | Curation | Harvest |
| --- | --- | --- | --- | --- |
| zsh-more-completions | zsh | 37,003 | Single-author, multi-round verified | 6 Python pipelines: APT / Brew / Nix / crates.io / GH Releases / web |
| upstream zsh-completions | zsh | ~700 | Community-maintained, hand-curated | Manual PRs |
| fish-completions (in-tree) | fish | ~1,500 | Hand-written + man-page extraction (fish_update_completions) | Per-user man-page scrape |
| bash-completion | bash | ~250 | Hand-written, distro-packaged | Manual PRs |
| Homebrew brew completions | multi | ~3,500 across formulas | Per-formula, shipped by the formula author | n/a (scattered) |

Order-of-magnitude position: zsh-more-completions ships ~50x the upstream zsh-completions corpus, ~25x the fish in-tree set, ~150x the bash-completion baseline. The differential isn't one collection winning a hand-curation race — it's that the harvest pipelines turn distro package indexes, language ecosystems, and GitHub trees into completion files automatically, with each capture verified before commit.
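The multipliers quoted above follow directly from the file counts in the comparison table. A minimal Python sketch of that arithmetic, using the approximate counts exactly as listed there (the constant and function names are illustrative and not part of the repo):

```python
# File counts as quoted in the comparison table; competitor counts are approximate.
ZSH_MORE_COMPLETIONS = 37_003
OTHER_COLLECTIONS = {
    "zsh-completions": 700,
    "fish in-tree": 1_500,
    "bash-completion": 250,
    "Homebrew (scattered)": 3_500,
}


def size_ratio(ours: int, theirs: int) -> float:
    """How many times larger `ours` is than `theirs`."""
    return ours / theirs


if __name__ == "__main__":
    for name, count in OTHER_COLLECTIONS.items():
        print(f"{name:22s} ~{size_ratio(ZSH_MORE_COMPLETIONS, count):.0f}x")
```

The quotients come out at roughly 53x, 25x, 148x, and 11x, which the report rounds to ~50x, ~25x, and ~150x for the first three.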


#SOURCE DIRECTORIES

7 fpath-prefixed source directories. override_src/ is prepended to $fpath so its 10 files win against anything else (system zsh-completions, user-installed plugins, the other 6 dirs in this repo). The remaining 6 dirs are appended in load order — first match wins for any given _NAME.

| Directory | Files | fpath order | Role |
| --- | --- | --- | --- |
| src/ | 9,253 | append | Hand-curated + earliest harvest passes; the "first-class" completions. Mature; rarely re-generated |
| more_src/ | 8,974 | append | Additional completions (basenames starting 0-9, a-h). Split from a single more_src/ to stay under the 10k-files-per-dir ceiling |
| more_src2/ | 8,546 | append | Additional completions (i-r). Same shape and provenance as more_src/ |
| more_src3/ | 5,735 | append | Additional completions (s-z). Same shape and provenance |
| man_src/ | 3,398 | append | Completions extracted from system man pages where no --help + parser exists |
| architecture_src/ | 1,087 | append | Cross-architecture toolchain completions (e.g. _aarch64-alpine-linux-musl-gcc, _x86_64-apple-darwin-clang), one file per (triple, tool) pair |
| override_src/ | 10 | prepend | Completions that must win against everything else: _cheat, _claude, _curl, _express, _gdu, _git, _git-clone, _lftp, _stryke, _whois |
| 7 directories | 37,003 | | Authoritative count via scripts/print-repo-stats.zsh |

The more_src three-way split is a filesystem-performance decision, not a semantic one — some filesystems / tools degrade past 10k entries per directory (find walks, zsh completion compile times, git's index when renaming). Splitting at lexical thirds (0-9 + a-h, i-r, s-z) keeps each dir comfortably under the ceiling while preserving the "everything is one append on fpath" model.
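The lexical split can be expressed as a small routing function. This is a sketch under stated assumptions, not the repo's actual logic: it buckets on the first character after the leading underscore, folds case before comparing, and sends any other leading character to more_src/. The function name is hypothetical.

```python
def more_src_bucket(filename: str) -> str:
    """Map a completion file name such as '_kubectl' to a more_src* directory.

    Assumptions (not confirmed by the report): the bucket is chosen by the
    first character after the leading underscore, compared case-insensitively,
    and anything outside 0-9 / a-z falls back to more_src.
    """
    name = filename[1:] if filename.startswith("_") else filename
    if not name:
        raise ValueError(f"empty completion stem: {filename!r}")
    first = name[0].lower()
    if first in "0123456789" or "a" <= first <= "h":
        return "more_src"
    if "i" <= first <= "r":
        return "more_src2"
    if "s" <= first <= "z":
        return "more_src3"
    return "more_src"  # fallback bucket is an assumption


if __name__ == "__main__":
    for example in ("_7zip", "_helm", "_istioctl", "_ripgrep", "_starship"):
        print(example, "->", more_src_bucket(example))
```

Because all three directories are appended to $fpath, the bucket only affects where a file lives on disk and does not change which completion wins for a given name.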


$PLUGIN ENTRYPOINT

18 LOC of zsh. An inline anonymous function ((){...} ${0:h}) receives the plugin's own directory as $1, then checks each subdirectory against $fpath with the ${fpath[(r)$dir]} reverse-subscript lookup, which expands to the matching element or to nothing, and adds the directory only if it is missing. The override directory is prepended; the rest are appended.

  (){
      local __zsh_more_comp_dirs_first=("$1/override_src")
      local __zsh_more_comp_dirs_last=("$1/src" "$1/more_src"
                                      "$1/more_src2" "$1/more_src3"
                                      "$1/man_src" "$1/architecture_src")
      local __zsh_more_comp_dir

      for __zsh_more_comp_dir in $__zsh_more_comp_dirs_first[@]; do
          if [[ -z ${fpath[(r)$__zsh_more_comp_dir]} ]]; then
              fpath=($__zsh_more_comp_dir $fpath)            # prepend
          fi
      done
      for __zsh_more_comp_dir in $__zsh_more_comp_dirs_last[@]; do
          if [[ -z ${fpath[(r)$__zsh_more_comp_dir]} ]]; then
              fpath=($fpath $__zsh_more_comp_dir)            # append
          fi
      done
  } ${0:h}

Re-source-safe

Each directory is added only if not already in $fpath. Re-sourcing the plugin doesn't grow $fpath; the user can source zsh-more-completions.plugin.zsh in .zshrc and again in an interactive session without poisoning $fpath.

Override-first precedence

override_src/ is prepended; the rest are appended. Override completions win against system completions, against zinit-symlinked completions, and against any of the 6 sibling source dirs in the same plugin.

Anonymous fn scope

The inline anonymous function gives $1-as-plugin-dir without polluting the user's global scope. The three locals vanish when the function returns; no __zsh_more_comp_* globals leak out.

No compinit call

The plugin doesn't run compinit — that's the user's job (and OMZ runs it automatically). The plugin only puts source dirs on $fpath; the user (or zinit / OMZ) chooses when completion is initialised.


@HARVEST PIPELINE

6 Python harvest scripts in scripts/ totalling 3,287 LOC. Each script targets a different upstream registry and produces validated, header-stamped completions that land in more_src/ (or its 2/3 split). Common shape across all 6: (1) enumerate candidates from the registry, (2) dedupe against existing corpus + blacklist.txt, (3) probe / fetch / extract, (4) validate the capture with zsh -nf, (5) strip runtime-dispatch stubs (cobra compdef calls, clap_complete compinit wrappers), (6) write headers + body + record in blacklist, (7) merge into the corresponding existence test file.
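As one concrete instance of step (1), the APT pipeline enumerates candidates from the Debian Contents index. The sketch below is an illustration under assumptions rather than the script's actual code: it relies only on the documented Contents layout (a path without a leading slash, whitespace, then a comma-separated list of section/package entries), and the function names are hypothetical.

```python
import gzip
import re

# Paths the report says the APT harvester looks for inside the Contents index.
COMPLETION_PATH = re.compile(
    r"^usr/share/zsh/(?:vendor-completions|site-functions)/(_[^/\s]+)$"
)


def completions_from_contents(lines):
    """Map completion file name -> set of Debian packages that ship it."""
    found = {}
    for line in lines:
        parts = line.rstrip("\n").rsplit(None, 1)  # path, then package list
        if len(parts) != 2:
            continue
        path, location = parts
        match = COMPLETION_PATH.match(path.strip())
        if not match:
            continue
        # Each location entry looks like "section/package".
        packages = {entry.rsplit("/", 1)[-1] for entry in location.split(",")}
        found.setdefault(match.group(1), set()).update(packages)
    return found


def read_contents(path):
    """Read a gzip-compressed Contents file (e.g. Contents-arm64.gz)."""
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as fh:
        return completions_from_contents(fh)
```

The resulting mapping would feed step (2): any name already present in the corpus or in blacklist.txt is dropped before a package is downloaded.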

| Script | LOC | Source | Mechanism |
| --- | --- | --- | --- |
| harvest-web-completions.py | 1,038 | GitHub tree/code-search | For each repo from harvest-tree-repos.txt: clone or sparse-checkout, scan for _* files under known completion-shipping paths, dedupe, validate with zsh -nf, header + body to more_src/, merge blacklist + existence tests |
| harvest-crates-completions.py | 596 | crates.io | Enumerate via crates.io reverse-deps of clap_complete + clap_complete_command; cargo-binstall to throwaway path; probe a ladder of completion zsh-style subcommands per binary; validate; install. Concurrency: CRATES_WORKERS=8 |
| harvest-gh-releases.py | 481 | GitHub Releases | gh api repos/{slug}/releases/latest, pick smallest archive asset preferring darwin/arm64, curl the asset (off the release CDN, no API quota), extract .tar.gz/.tgz/.zip, walk for _* under standard completion paths, reject cobra runtime-gen, strip compdef call, validate, write |
| harvest-apt-completions.py | 399 | Debian / Ubuntu APT | Fetch Debian sid Contents-arm64.gz (single ~14MB file), grep for files under /usr/share/zsh/{vendor-completions,site-functions}/_*, dedupe, apt-get download in a Debian container, dpkg-deb -x, copy completion |
| harvest-nix-completions.py | 390 | nixpkgs | Inside Docker container with persistent zmc-nix-store volume: scan pkgs/by-name/*/*/package.nix for the install-shell-completion macro, nix-build each (binary-cached), walk result/share/zsh/site-functions/_*, copy capture. Resumable via progress.txt |
| harvest-brew-completions.py | 383 | Homebrew bottles | For each formula: skip if _<formula> already in src/ / more_src/ / blacklist, brew fetch --quiet <formula> to brew cache (no install), tarfile.open + stream-extract members matching site-functions/_*, dedupe, validate, write headers + body |
| harvest-tree-repos.txt | | Input list | The whitelist of GitHub repos the web harvester walks; flat one-slug-per-line file |
| 6 harvest scripts | 3,287 | | Plus print-repo-stats.zsh (33 LOC, the authoritative metric source) |
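The GitHub Releases row says the harvester picks the smallest archive asset while preferring darwin/arm64 builds. One way that selection could look in Python is sketched below. The ranking rule (platform keyword matches first, then size) is an assumption about how "preferring" is resolved, and pick_asset is a hypothetical name; only the name and size fields of a release asset are taken from the GitHub API.

```python
ARCHIVE_SUFFIXES = (".tar.gz", ".tgz", ".zip")


def pick_asset(assets):
    """Pick the smallest archive asset, preferring darwin/arm64 builds.

    `assets` is a list of dicts with at least "name" and "size" keys, as in
    the "assets" array of a GitHub release. Returns None if no archive exists.
    """
    archives = [a for a in assets if a["name"].lower().endswith(ARCHIVE_SUFFIXES)]
    if not archives:
        return None

    def rank(asset):
        name = asset["name"].lower()
        # Assumed preference: count how many platform hints the name carries.
        platform_hits = ("darwin" in name) + ("arm64" in name or "aarch64" in name)
        return (-platform_hits, asset["size"])

    return min(archives, key=rank)


if __name__ == "__main__":
    release_assets = [
        {"name": "tool-linux-amd64.tar.gz", "size": 100},
        {"name": "tool-darwin-arm64.tar.gz", "size": 900},
        {"name": "tool-darwin-arm64.zip", "size": 800},
        {"name": "checksums.txt", "size": 1},
    ]
    print(pick_asset(release_assets)["name"])
```

Choosing the smallest matching archive keeps each probe cheap, since the harvester only needs the completion file and not the binary itself.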

// VALIDATION CONTRACT

Every capture across all 6 pipelines passes through the same gate before landing in the tree:

| Stage | Check | Failure mode |
| --- | --- | --- |
| 1. Dedup | Not already in src/ / more_src*/ / blacklist.txt | Skip; never re-fetch |
| 2. Cobra filter | Reject if file contains the cobra runtime-gen sentinel (__complete dispatcher) | Skip; runtime-gen completions are stubs, not data |
| 3. Strip compdef | Drop the trailing compdef _name name line so the file is pure data | n/a (purely a rewrite) |
| 4. zsh -nf | Parse without sourcing user RC; reject on syntax error | Skip; broken completions don't enter the tree |
| 5. Header stamp | Prepend provenance: source URL + harvest timestamp | n/a (mandatory) |
| 6. Blacklist record | Add the bare stem to blacklist.txt | Prevents re-harvest in future runs |
| 7. Existence test | Append a one-line @test "<stem>" to the corresponding tests/t-*-existence*.zsh | CI fails if the file is deleted without removing the test line |
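Stages 1 through 5 of the gate above can be sketched as a single Python function. This is a minimal illustration, not the pipelines' real code: the sentinel string match, the header layout ("# source:" / "# harvested:"), and all function names are assumptions, and stages 6 and 7 (blacklist and existence-test bookkeeping) are left out. The syntax check is injectable so the gate can be exercised on a machine without zsh.

```python
import re
import subprocess
import tempfile
from datetime import datetime, timezone

COBRA_SENTINEL = "__complete"  # assumed marker for cobra runtime-gen stubs
TRAILING_COMPDEF = re.compile(r"^\s*compdef\s+_\S+\s+\S+.*$")


def zsh_syntax_ok(text):
    """Stage 4: parse with `zsh -nf` (no execution, no user RC files)."""
    with tempfile.NamedTemporaryFile("w", suffix=".zsh") as fh:
        fh.write(text)
        fh.flush()
        proc = subprocess.run(["zsh", "-nf", fh.name], capture_output=True)
        return proc.returncode == 0


def gate(stem, body, source_url, known, syntax_ok=zsh_syntax_ok):
    """Return the stamped file text, or None if the capture is rejected."""
    if stem in known:                      # 1. dedup against corpus + blacklist
        return None
    if COBRA_SENTINEL in body:             # 2. cobra runtime-gen filter
        return None
    lines = body.rstrip("\n").split("\n")
    while lines and (not lines[-1].strip() or TRAILING_COMPDEF.match(lines[-1])):
        lines.pop()                        # 3. strip trailing compdef call
    cleaned = "\n".join(lines) + "\n"
    if not syntax_ok(cleaned):             # 4. zsh -nf
        return None
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    first, _, rest = cleaned.partition("\n")  # keep #compdef on line 1
    header = f"# source: {source_url}\n# harvested: {stamp}\n"  # 5. header stamp
    return f"{first}\n{header}{rest}"
```

A caller would pass the union of existing stems and blacklist entries as known, and write the returned text to the target directory only when the result is not None.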

&COMPLETION ECOSYSTEMS

The 37,003 files cover commands from every major distro package set plus a long tail of language-specific ecosystems. The "ecosystems harvested" list is the explicit upstream coverage matrix — if a distro packages a CLI with a --help flag, the harvest pipeline reaches it.

// PACKAGE INDEXES

Nix / nixpkgs

Every pkgs.<attr> in pkgs/by-name/*/*/package.nix that uses the install-shell-completion macro. Docker-isolated build, persistent binary cache via zmc-nix-store volume.

Homebrew

Every formula in homebrew/core + cask bottles that install CLIs. brew fetch + tarball extract, no install side-effects.

Debian / Ubuntu APT

Every bin/* from apt-cache pkgnames + /usr/share/man/man{1,5,6,8}. Debian sid Contents-arm64.gz is the master index.

Fedora

Every binary from Fedora package repos via parallel apt-style scrape.

macOS system

/usr/bin, /usr/sbin, /usr/libexec — the Apple-shipped CLI surface.

Kali Linux

Every binary in the Kali package set; the security-tool coverage is unique to this distro.

Alpine Linux

The apk package set — musl-libc-targeted toolchains.

FreeBSD

Ports binaries + man pages — BSD-specific tools not in Linux distros.
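The Homebrew entry above describes pulling completions out of a fetched bottle without installing it. A minimal sketch of that extraction step with Python's tarfile module follows. It assumes the bottle is a gzip-compressed tarball whose completions sit under a site-functions/ directory, as the pipeline table states; the function name is hypothetical.

```python
import io
import tarfile
from pathlib import PurePosixPath


def completions_in_bottle(fileobj):
    """Yield (file name, text) for every site-functions/_* member of a tarball.

    The archive is read as a forward-only stream ("r|*"), so nothing is
    unpacked to disk and each member is read as it is encountered.
    """
    with tarfile.open(fileobj=fileobj, mode="r|*") as tar:
        for member in tar:
            if not member.isfile():
                continue
            path = PurePosixPath(member.name)
            if path.parent.name != "site-functions" or not path.name.startswith("_"):
                continue
            data = tar.extractfile(member).read()
            yield path.name, data.decode("utf-8", errors="replace")


if __name__ == "__main__":
    # Build a tiny in-memory archive to demonstrate the filter.
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, text in [
            ("foo/1.0/share/zsh/site-functions/_foo", "#compdef foo\n"),
            ("foo/1.0/bin/foo", "binary"),
        ]:
            payload = text.encode()
            info = tarfile.TarInfo(name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    buf.seek(0)
    print(list(completions_in_bottle(buf)))
```

Each yielded pair would then go through the same validation gate as captures from the other pipelines.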

// LANGUAGE REGISTRIES

crates.io

Every crate with a bin_names + clap_complete reverse-dep. cargo-binstall + probe ladder.

npm / npx

create-* scaffolders, scoped @org/* CLIs, other Node tools — verified against npm view, live npx ... --help, dist tarballs, upstream READMEs.

Hackage (Haskell)

hakyll, shake, taffybar: verified against each cabal binary's optparse-applicative spec.

OPAM (OCaml)

odig, irmin, js_of_ocaml, learn-ocaml, alcotest — cmdliner switches parsed from upstream source.

Hex.pm (Elixir)

phx_new, ex_doc, dialyxir, igniter, mneme — Mix.Task module switches.

CPAN (Perl)

perlimports, perlnavigator: GetOptions definitions.

CRAN (R)

radian, littler: OptionParser specs.

AUR helpers

aurman, pakku — Arch User Repository tooling.
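The crates.io entry above mentions a probe ladder: trying a sequence of completion-generating invocations until one emits a zsh completion. The report does not list the real probes, so the ladder below is purely hypothetical, as are the function names. The sketch only shows the control flow, with the subprocess runner injectable for testing.

```python
import subprocess

# Hypothetical ladder; the actual probes used by the harvester are not documented here.
PROBE_LADDER = [
    ["completions", "zsh"],
    ["completion", "zsh"],
    ["generate-completions", "zsh"],
    ["--completions", "zsh"],
]


def run_probe(binary, args, timeout=10):
    """Run one probe; return stdout on success, None on failure or timeout."""
    try:
        proc = subprocess.run(
            [binary, *args], capture_output=True, text=True, timeout=timeout
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    return proc.stdout if proc.returncode == 0 else None


def probe_completion(binary, run=run_probe):
    """Walk the ladder and return the first output that looks like a completion."""
    for args in PROBE_LADDER:
        output = run(binary, args)
        if output and output.lstrip().startswith("#compdef"):
            return output
    return None
```

Requiring the output to start with #compdef filters out binaries that answer an unknown subcommand with a usage message and a zero exit status.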

// LONG-TAIL ECOSYSTEMS

Cloud-native

kubectl, helm, terraform, pulumi, argocd, flux, istioctl, linkerd, kustomize + every kubectl-* plugin in the krew registry.

Bioinformatics

bcftools, samtools, bwa, bowtie2, blast, fastqc, cutadapt, MethylDackel, NBIA_data_retriever_CLI — the cross-architecture tree shows a fragment of the surface.

ML/AI tooling

torchrun, accelerate, huggingface-cli, ollama, llamafile, vllm, litellm, MCP server CLIs (claude, cursor-cli, MCP filesystem/git/postgres clients), the Anthropic SDK CLIs.

Embedded toolchains

architecture_src/ covers aarch64-alpine-linux-musl-gcc-*, aarch64-apple-darwin{21,22,23}-gcc-*, x86_64-w64-mingw32-*, arm-none-eabi-*, riscv64-linux-gnu-* — one completion per (triple, tool, version) tuple.

Termux (Android)

All 57 termux-* commands from termux-api-package/scripts/ + 13 from termux-tools/scripts/. Flags parsed from upstream show_usage() echo blocks + getopts specs.

Esoteric langs

frege, guile, mit-scheme, pharo, gdc, v-cli, mojo, io-lang, newlisp, bigloo, chicken, gambit, sbcl, clisp, kawa, agda, coq, lean, ocaml, racket, gleam, roc, haxe, purescript, elm, zig, crystal, nim — the per-language coverage runs deep into compilers and their REPL/build siblings.

Modern shells

xonsh, nushell, elvish, murex, ksh, oil, yash, mksh, shellcheck, starship, zellij, wezterm, alacritty, dash, shfmt — the user can complete every alternative shell from inside zsh.

Database / data

psql, mysql, sqlite3, duckdb, clickhouse-client, cockroach, tidb, mongosh, redis-cli, etcdctl, consul, vault.


!README RECONCILE CADENCE

The README and docs/index.html claim a "current" completion count in their front-matter. That number is allowed to drift between reconcile points; periodic R# cadence commits snap doc claims back to print-repo-stats.zsh output. The R# tag in commit subjects is the audit trail.

| Cadence event | What it does |
| --- | --- |
| R# reconcile commit | Read scripts/print-repo-stats.zsh output. Bump README + docs/index.html totals to match. Commit subject includes the R<N> tag for traceability (e.g. docs(html): bump completion count 32,190 → 33,175 to match README reconcile cadence). |
| Cross-ref bump | When sibling repos (zpwr README, MenkeTechnologies.github.io projects.html) cite this corpus's count, the meta repo bumps those sibling-managed cross-refs in lockstep. |
| Inter-cadence drift | Allowed. Between reconciles the harvest pipelines may add hundreds of files; doc claims undercount until the next R# pass. Real count is always print-repo-stats.zsh. |
| Authoritative metric | zsh scripts/print-repo-stats.zsh; never hand-typed. |

Last visible cadence tag in git log: R1431. Last reconcile bumped docs/index from 32,190 to 33,175. This report's authoritative numbers (37,003 files, 24,017 tests, 5,695 commits) are print-repo-stats.zsh output at the current HEAD; the README itself still cites the older 33,175 figure until the next R# cadence commit lands. Drift between this report and the README front-matter is by design.


^OVERRIDE_SRC & ARCHITECTURE_SRC

Two specialised source directories that don't fit the generic harvest model.

// override_src — 10 files, prepended to fpath

Completions that must win against the upstream zsh-completions corpus, OMZ-shipped completions, and any system-installed alternative. These are hand-maintained by the author and re-verified on every release.

| File | Why it overrides |
| --- | --- |
| _git | The upstream zsh _git is the most-used completion in any zsh session. Override ships an extended version with subcommand-level deep coverage. |
| _git-clone | Standalone clone completion with custom URL / shorthand-host handling that upstream doesn't ship. |
| _curl | curl's flag surface is huge and moves; the override stays current ahead of upstream. |
| _whois | Custom TLD list + per-TLD whois server hints. |
| _lftp | Extended for the author's lftp scripting patterns. |
| _cheat | Local cheat CLI cheatsheets; sheets enumerated from ~/.cheat/. |
| _claude | Claude CLI (Anthropic); updated as new flags ship. |
| _stryke | strykelang's own CLI: subcommands, flags, MCP modes, parallel primitives. Tightly coupled to the strykelang release cadence. |
| _express | express-generator scaffolder. |
| _gdu | gdu disk usage CLI with full flag coverage. |

// architecture_src — 1,087 files, cross-arch toolchains

One completion per (triple, tool, version) tuple for cross-compilers. Covers aarch64-alpine-linux-musl-*, aarch64-apple-darwin{21,22,23}-*, x86_64-w64-mingw32-*, arm-none-eabi-*, riscv64-linux-gnu-*, etc. Generated by mirroring the host's gcc-N/clang-N/g++-N/gfortran-N completion structure across every cross-triple a binutils install creates.

| Cross triple family | Tools per triple |
| --- | --- |
| aarch64-alpine-linux-musl-* | gcc, gcc-10.3.1, g++, c++, gcc-ar, gcc-nm, gcc-ranlib |
| aarch64-apple-darwin21-* | gcc-11, gcc-12, g++-11, g++-12, c++-11, c++-12, gfortran-11, gfortran-12, gcc-tmp |
| aarch64-apple-darwin22-* | gcc-13, g++-13, c++-13, gcc-nm-13 |
| x86_64-w64-mingw32-* | Full MinGW cross suite |
| arm-none-eabi-* | Bare-metal ARM toolchain (Cortex-M targets) |
| riscv64-linux-gnu-* | RISC-V Linux toolchain |

.BLACKLIST

32,780 entries in blacklist.txt — flat newline-separated stems. The negative space of the corpus: every name a harvester ever rejected for any reason (no --help, no parser, runtime-dispatch only, duplicate of a hand-curated src/ entry, name conflict with a built-in zsh completion that wins on its own, etc.).

Convergence, not thrashing

Without the blacklist, re-running a harvester would try the same rejected candidates every cycle. With it, each rejection is permanent — the same name never gets probed twice. Harvest run-times stay sub-linear with corpus size.

Cross-pipeline shared state

All 6 harvest scripts read and write the same blacklist.txt. A name rejected by harvest-brew is automatically skipped by harvest-apt, harvest-nix, harvest-crates. The cross-pipeline dedup is what makes the multi-source model scale.

Integrity test

tests/t-repo-invariants.zsh pins blacklist structure: no duplicates, no whitespace-only lines, no entries that contradict a real _NAME file in any source dir. A stem can therefore never be blacklisted and shipped at the same time.

Size as a metric

32,780 blacklist entries vs 37,003 active completions = harvesters explored ~70k candidates, kept ~53%, rejected ~47%. The blacklist size grows monotonically; the corpus size grows when net-new tools enter the upstream registries.
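The integrity rules and the size arithmetic above can both be modelled in a few lines of Python. This is a sketch, not a port of tests/t-repo-invariants.zsh: it assumes blacklist entries and shipped stems are compared in the same bare form, that the kept and rejected sets are disjoint (which the integrity rule implies), and the function names are hypothetical.

```python
from collections import Counter


def audit_blacklist(blacklist_lines, shipped_stems):
    """Return invariant name -> offending entries; empty lists mean the rule holds."""
    entries = [line.rstrip("\n") for line in blacklist_lines]
    counts = Counter(entry for entry in entries if entry.strip())
    return {
        "duplicates": sorted(e for e, n in counts.items() if n > 1),
        "blank_lines": [i + 1 for i, e in enumerate(entries) if not e.strip()],
        "shipped_and_blacklisted": sorted(set(counts) & set(shipped_stems)),
    }


def keep_rate(active, blacklisted):
    """Return (candidates explored, percent kept), assuming disjoint sets."""
    explored = active + blacklisted
    return explored, round(100 * active / explored, 1)


if __name__ == "__main__":
    print(audit_blacklist(["foo\n", "bar\n", "foo\n", "  \n", "git\n"], {"git", "curl"}))
    print(keep_rate(37_003, 32_780))
```

With the report's figures, keep_rate gives 69,783 explored candidates and a 53.0% keep rate, matching the ~70k / ~53% / ~47% summary.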


*TEST COVERAGE

24,017 zunit @test cases across 22 test files. The vast majority are existence pins: one @test per completion file, in the appropriate tests/t-*-existence*.zsh. CI fails the moment a file is deleted without removing the test line — the corpus can't silently shrink. Beyond existence: header conformance (provenance URL + harvest timestamp), bulk quality gates, content-shape rules, structural / contract / convention / syntax pins.

| Test file | @test | Pins |
| --- | --- | --- |
| tests/t-src-existence-2.zsh | 8,922 | One existence pin per file in src/. CI fails on any deletion that doesn't remove the corresponding test line. Split from t-src-existence.zsh when the single file got unwieldy |
| tests/t-more-src-existence.zsh | 7,948 | Existence pins for more_src/ / more_src2/ / more_src3/ (the combined more-src corpus) |
| tests/t-src-existence.zsh | 613 | Older-vintage existence pins (pre-split) |
| tests/t-arch-src-existence-2.zsh | 787 | Existence pins for architecture_src/ cross-toolchain completions (split file 2) |
| tests/t-arch-src-existence.zsh | 513 | Existence pins for architecture_src/ (split file 1) |
| tests/t-man-src-existence-2.zsh | 3,243 | Existence pins for man_src/ (split file 2) |
| tests/t-man-src-existence.zsh | 500 | Existence pins for man_src/ (split file 1) |
| tests/t-bulk-quality.zsh | 479 | Bulk gates across the whole corpus: every file starts with #compdef, no compdef trailing call (would re-register at compile time), header stamp present, etc. |
| tests/t-src-options.zsh | 519 | Per-completion option pins: selected high-traffic completions have their flag inventory pinned so harvest re-runs can't silently regress |
| tests/t-src-commands.zsh | 171 | Per-completion subcommand pins for git-like multi-verb CLIs (kubectl, helm, terraform, ...) |
| tests/t-content.zsh | 86 | Content-shape rules: every file mentions its #compdef name; multi-line definitions parse |
| tests/t-man-src.zsh | 68 | man-src-specific shape rules (man-page extraction artifacts) |
| tests/t-structure.zsh | 40 | Directory structure: 7 source dirs exist, plugin file present, scripts dir present |
| tests/t-conventions.zsh | 37 | Naming conventions: stem matches #compdef name, no _Mixed-Case stems |
| tests/t-plugin.zsh | 32 | Entrypoint contract: all 7 source dirs land on $fpath, override is prepended, no double-append on re-source |
| tests/t-repo-invariants.zsh | 23 | Repo-level invariants: blacklist.txt integrity, .github/workflows/ci.yml contract, license.md presence, absence of tracked .zwc files |
| tests/t-headers.zsh | 15 | Header stamps follow the provenance format |
| tests/t-syntax.zsh | 6 | Every source dir's files parse under zsh -nf |
| tests/t-contract2.zsh | 5 | Plugin entrypoint stem matches plugin dir |
| tests/t-contract3.zsh | 5 | Idempotent re-source ($fpath doesn't grow per source) |
| tests/t-contract4.zsh | 5 | License + workflow file shape |
| 22 zunit files | 24,017 | + 21 .sh doc-hygiene gates |

CI: zunit tests/*.zsh on Ubuntu via .github/workflows/ci.yml. Existence tests must use the exact on-disk basename (case-sensitive on Linux); the macOS host can stat wrong-case names so local-only correctness isn't enough — CI is the gate.
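The case-sensitivity point can be checked locally even on macOS by comparing pinned names against the directory listing as exact strings, instead of asking the filesystem whether a path exists. The sketch below assumes each pin looks like @test "<name>" with the on-disk basename as the test name; that format and the function names are assumptions for illustration.

```python
import os
import re

# Matches lines such as: @test "_kubectl" {   (single or double quotes)
TEST_LINE = re.compile(r"""^\s*@test\s+["']([^"']+)["']""")


def missing_pins(test_file_text, directory_listing):
    """Return pinned names with no exact, case-sensitive match in the listing."""
    names = set(directory_listing)
    pinned = [
        match.group(1)
        for line in test_file_text.splitlines()
        if (match := TEST_LINE.match(line))
    ]
    return [name for name in pinned if name not in names]


def check_existence_file(test_file, directory):
    """Compare one existence-test file against one source directory."""
    with open(test_file, encoding="utf-8") as fh:
        return missing_pins(fh.read(), os.listdir(directory))
```

os.listdir returns names as stored on disk, so a pin for _Helm is reported as missing when the file is _helm, even on a case-insensitive volume where a stat of the wrong-case path would succeed.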


+COMPLETION FILE SHAPE

Every _NAME file follows the same shape: #compdef NAME first line, comment header with provenance URL + harvest timestamp, then the _arguments or _values body. No trailing compdef _NAME NAME call — that would force the file to evaluate at compinit time. The fpath-driven model lazy-loads each file the first time the user types its name.

| Section | Required? | Purpose |
| --- | --- | --- |
| #compdef <name> | required (first line) | Tells zsh which command this completion is for; compinit indexes from this line |
| Provenance header | required | Comment block with source URL + harvest timestamp; survives zsh -nf as comment-only content |
| _arguments body | typical | Flag-and-arg dispatch via zsh's _arguments completion function; arg specs derived from the upstream parser |
| _values body | some | For commands with a fixed value set (vs flag-driven); typically subcommand dispatch |
| Subcommand fn | git-like multi-verb tools | One fn per subcommand, dispatched from the top-level _arguments tail |
| No trailing compdef | required (negative) | The file is data, not a re-registration call. fpath does the indexing |
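The three required rows of the table can be turned into a small lint. This is an illustrative sketch, not the repo's test code: it treats any comment line containing "://" in the leading header block as the provenance URL, which is an assumption about the header format, and shape_problems is a hypothetical name.

```python
import re

COMPDEF_TAG = re.compile(r"^#compdef\s+\S+")
COMPDEF_CALL = re.compile(r"^\s*compdef\s+_\S+")


def shape_problems(text):
    """Return a list of shape violations for one completion file's text."""
    lines = text.splitlines()
    problems = []
    if not lines or not COMPDEF_TAG.match(lines[0]):
        problems.append("first line is not a #compdef tag")
    # The header is the run of comment lines directly after line 1.
    header = []
    for line in lines[1:]:
        if not line.startswith("#"):
            break
        header.append(line)
    if not any("://" in line for line in header):
        problems.append("no provenance URL in the comment header")
    non_blank = [line for line in lines if line.strip()]
    if non_blank and COMPDEF_CALL.match(non_blank[-1]):
        problems.append("trailing compdef call")
    return problems


if __name__ == "__main__":
    sample = (
        "#compdef foo\n"
        "# source: https://example.invalid/foo\n"
        "# harvested: 2024-01-01T00:00:00Z\n"
        "_arguments '--help[show help]'\n"
    )
    print(shape_problems(sample))
```

An empty list means the file has the #compdef tag on line 1, a header carrying a URL, and no trailing compdef call.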

?KEY DESIGN DECISIONS

Each call-out is a decision the implementation could have gone either way on, with the rationale for the path taken.

fpath-prefixed, not symlinked

The plugin adds 7 dirs to $fpath; it does not symlink files into a single ~/.zinit/completions. The symlink approach (zinit default) breaks override_src precedence and forces every completion to live in one flat dir. fpath-prefixed preserves the per-source-dir grouping and lets override win.

One source dir capped at ~10k

more_src/ was split into 3 dirs (more_src + more_src2 + more_src3) when the corpus crossed 10k files. Some filesystems and tools degrade past that ceiling (find walks, git index renames, zsh's own dir scan). The split is silent — users only see one bigger corpus on $fpath.

Header-stamped provenance

Every file's first comment block cites the upstream URL the completion was harvested from. A user (or maintainer) inspecting _xyz can trace it back to the project + commit. No mystery completions.

Existence test per file

Every _NAME has a one-line @test "<name>" in the matching existence-test file. Deletion without test removal fails CI; cross-platform basename case mismatch is caught before merge.

Blacklist as convergence ledger

Harvest rejection is recorded permanently. Re-runs skip rejected names instead of re-probing. Pipelines stay fast as the corpus grows; cross-pipeline sharing prevents one harvester from undoing another's exclusion decision.

print-repo-stats.zsh is the metric

One zsh script, 33 LOC, produces all the corpus-size numbers. README + docs reconcile against its output. No hand-typed counts can drift; the R# cadence tags are the audit trail.

Python for harvesters, zsh for runtime

The harvest pipelines are Python because the upstream registries are HTTP APIs + tarball extraction + concurrent fanout — bash/zsh would be a much worse fit. The runtime is pure zsh (fpath + compinit); the user never needs Python installed.

No compinit in the plugin

The plugin only puts dirs on $fpath. compinit is the user's responsibility (or OMZ's, or zinit's). The plugin works with any completion-init strategy — lazy, eager, cached ~/.zcompdump, etc.

Override goes first

The 10-file override_src is prepended to $fpath. Hand-maintained completions for _git, _curl, _stryke, etc. win against everything else — including upstream zsh-completions, OMZ-shipped completions, and the 6 sibling dirs in this same plugin.

Docker isolation for distro harvests

harvest-apt and harvest-nix run inside Docker containers. The host machine isn't polluted with Debian or Nix state; the container's persistent volume caches expensive bits (nix store, Debian package cache) for resumable runs.

Cobra runtime-gen rejection

Many Go CLIs ship "completions" that are actually runtime-dispatchers calling back into the binary (__complete). Those are stubs, not data — they require the user to have the binary installed at completion-time. The harvest pipelines reject them and prefer static-data completions.

R# reconcile cadence

README front-matter is allowed to drift from print-repo-stats.zsh between cadence commits. Snapping doc claims back to reality on every harvest run would create commit churn; periodic reconciles keep the audit trail readable.


!STRATEGIC POSITION

zsh-more-completions is the corpus underlying the most-completed zsh in existence. The 37,003-file count clears every comparable collection by more than an order of magnitude; the harvest automation is what makes that scale maintainable.

World's largest curated zsh corpus

50x the upstream zsh-completions baseline. 25x fish's hand-curated + man-extracted set. 150x bash-completion. Single-author, multi-round verified.

The MenkeTechnologies CLI substrate

The corpus is the completion layer for the full MenkeTechnologies CLI stack (zpwr, zshrs, strykelang, the smaller zsh-* plugins). Tab-complete on any of those tools, plus 37,000 others, is one plugin away.

Harvest pipelines as moat

The corpus size is downstream of 6 Python scripts that do the work. A would-be competitor must (a) build comparable pipelines and (b) absorb the multi-year backlog of harvest runs encoded in blacklist.txt. The blacklist alone is 32,780 entries of "we already tried this and it didn't work" — non-trivial replay cost.

zshrs primary input

When zshrs (the Rust-rewrite of zsh) ships its compsys runtime, this corpus is the load test — 37k completion files must compile cleanly into the rkyv-backed completion cache. Compat with the corpus is a load-bearing requirement on zshrs's completion engine.


^LOAD MODEL

The plugin's source-time cost is 7 $fpath updates plus 7 reverse-index probes. The 37,003 completion files are not loaded at source time — compinit indexes them by stem, and zsh's autoload machinery defers the actual file source until the user first tabs against the corresponding command.

| Step | When | Cost |
| --- | --- | --- |
| Source zsh-more-completions.plugin.zsh | shell init | 7 ${fpath[(r)$d]} probes (one per directory) + at most 7 array assignments; well under 1ms |
| compinit first run | shell init (user-driven) | Walks every dir on $fpath indexing #compdef lines; expensive (~100ms-1s) but cacheable to ~/.zcompdump |
| compinit -C cached run | shell init (user-driven) | Reads ~/.zcompdump; skips the security check; sub-100ms even with this corpus |
| First tab on kubectl | user-driven | zsh autoloads _kubectl on demand; subsequent tabs use the already-loaded fn |

The corpus is large but the load model is lazy — only completion files for commands the user actually types are sourced. A user who tabs against 50 commands in a session pays the autoload cost on 50 files, not 37,003.


.COMPAT & DEPS

Hard runtime deps: zsh 5.x+ with compsys. The plugin runs entirely within zsh's own completion engine; no Python required at runtime. The Python harvest pipelines are maintainer-only — users never run them.

| Component | Runtime dep | Maintainer dep |
| --- | --- | --- |
| Plugin runtime | zsh 5.x+ | n/a |
| Completion init | zsh compinit | n/a |
| harvest-apt | n/a | Python 3, Docker, Debian sid image |
| harvest-brew | n/a | Python 3, Homebrew installed, network access |
| harvest-nix | n/a | Python 3, Docker, persistent zmc-nix-store volume |
| harvest-crates | n/a | Python 3, cargo-binstall, optional GITHUB_TOKEN |
| harvest-gh-releases | n/a | Python 3, gh CLI authed, curl, archive tools |
| harvest-web | n/a | Python 3, network access |
| Tests | n/a (CI-only) | zunit on PATH |

Install via zinit is the recommended path — the nocompletions ice prevents zinit from symlinking everything into a flat dir, preserving override_src precedence. OMZ requires manual fpath wiring documented in the README. Manual install: clone + copy _* files into a directory already on $fpath.


;GIT POSTURE

5,695 commits on main. The repo is the highest-churn submodule in the MenkeTechnologies meta — harvest runs land in batches of 5-25 new completions per commit, plus periodic R# reconciles, plus blacklist consolidations. Push races against the meta repo's cross-ref bumps are common; the maintainer rebase-cleans on every reject.

| Commit shape | Frequency | Example subject |
| --- | --- | --- |
| 5-tool harvest batch | highest | add 5 modern shell completions: xonsh-cli, nushell-cli2, elvish-cli2, murex-cli, ksh-cli |
| Bulk dedup / removal | periodic | remove 249 -cli duplicate files; rename 26 misnamed -cli files |
| R# cadence reconcile | periodic | docs(html): bump completion count 32,190 → 33,175 to match README reconcile cadence |
| Structural reorg | rare | split more_src/ into more_src + more_src2 + more_src3 (10k file ceiling) |
| Round-milestone commit | per ~100 net-new | add 5 emerging lang completions: ... (round 1200 milestone) |