>_STRYKE-PARQUET
See into parquet without loading it. Parquet file inspector for stryke — schema, footer stats, row-group breakdown, head/tail, recompression. Diagnostic counterpart to stryke-arrow.
Install
# build the helper binary, install as a stryke package
cd ~/projects/stryke-parquet
cargo build --release
s pkg install -g .

# one-liner
make install

# verify
parquet --help
After install, parquet --help works from any directory, provided ~/.stryke/bin/ is on PATH. The stryke library is auto-discoverable by any project that depends on the package, either via [deps] parquet = { path = "..." } or, once published, by name.
CLI: parquet
| Task | Command |
| --- | --- |
| show schema | parquet schema sales.parquet |
| footer + row-group stats | parquet stats sales.parquet |
| first 20 rows | parquet head -n 20 sales.parquet |
| last 5 rows | parquet tail -n 5 sales.parquet |
| recompress with zstd | parquet recompress --codec zstd sales.parquet sales-zstd.parquet |
The full flag matrix lives in the README "CLI" section.
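For ops scripts that call the CLI from another language, the commands above map directly onto argv lists. The following is a minimal Python sketch, not part of the package: `parquet_argv` and `run_parquet` are hypothetical helper names, only the subcommands and flags shown in the table are used, and the sketch assumes the CLI signals failure with a non-zero exit code.

```python
import subprocess


def parquet_argv(subcommand, *paths, **flags):
    """Build an argv list such as ['parquet', 'head', '-n', '20', 'sales.parquet'].

    Single-letter flag names get one dash (-n), longer names get two (--codec),
    matching the forms shown in the table above.
    """
    argv = ["parquet", subcommand]
    for name, value in flags.items():
        prefix = "-" if len(name) == 1 else "--"
        argv += [prefix + name, str(value)]
    argv += list(paths)
    return argv


def run_parquet(subcommand, *paths, **flags):
    """Run the CLI and return its stdout as text.

    Assumption: a failed command exits non-zero, so check=True raises
    subprocess.CalledProcessError in that case.
    """
    result = subprocess.run(
        parquet_argv(subcommand, *paths, **flags),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

With the binary on PATH, `run_parquet("head", "sales.parquet", n=20)` would run the same command as the "first 20 rows" entry. The output is returned as plain text because its format is not specified here.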
Why a package, not a builtin
Schema, footer, and row-group introspection is a different workload from reading parquet into a typed DataFrame. This package is a lightweight tool for ops and debugging.
The stryke side is a thin NDJSON-pipe wrapper; the heavy code lives in the stryke-parquet-helper sidecar binary, which is spawned on demand. Core stryke is never linked against this package's deps.
Helper protocol
The stryke-parquet-helper sidecar speaks newline-delimited JSON over stdin/stdout. The stryke library shells out per call and pipes structured data both ways. This keeps stryke startup small while making the package's surface area available on demand.
# manual invocation (debugging only)
echo '{"op":"version"}' | stryke-parquet-helper
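The framing described above (one JSON object per line on stdin, one JSON value per line on stdout) can be sketched from the caller's side. This is a hedged Python illustration, not the package's actual wrapper: `encode_request`, `decode_lines`, and `call_helper` are hypothetical names, `version` is the only op taken from this page, and the reply shape is not documented here, so the sketch returns whatever JSON values come back.

```python
import json
import subprocess

# Assumption: the helper binary is on PATH under the name used in this page.
HELPER = "stryke-parquet-helper"


def encode_request(op, **fields):
    """Serialize one request as a single NDJSON line (compact JSON plus newline)."""
    return json.dumps({"op": op, **fields}, separators=(",", ":")) + "\n"


def decode_lines(text):
    """Parse NDJSON text: one JSON value per non-empty line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def call_helper(op, helper=HELPER, **fields):
    """Spawn the helper once, write one request to stdin, return the decoded replies.

    Mirrors the "shell out per call" model: one process per request.
    """
    proc = subprocess.run(
        [helper],
        input=encode_request(op, **fields),
        capture_output=True,
        text=True,
        check=True,
    )
    return decode_lines(proc.stdout)


# call_helper("version") sends the same request as the echo example above.
```

Keeping the encode and decode steps as separate functions makes the framing testable without the helper installed.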
Layout
stryke-parquet/
├── Cargo.toml        # bin = stryke-parquet-helper (publish = false)
├── src/
│   └── main.rs       # helper binary entry point
├── lib/              # stryke .stk wrapper(s)
├── stryke.toml       # stryke package manifest
├── t/                # zunit-style tests
├── examples/         # runnable .stk examples
├── Makefile          # `make install` builds + installs
└── docs/             # this site (GitHub Pages)
Sibling packages
Part of the stryke connector family. Browse the others via the MenkeTechnologiesMeta umbrella repo (Tier 2):
- stryke-arrow — Apache Arrow / Parquet / Feather / arrow-CSV/JSON
- stryke-aws — S3, DynamoDB, SQS, Lambda, STS
- stryke-docker — Docker daemon API
- stryke-duckdb — embedded DuckDB
- stryke-gcp — Cloud Storage + Pub/Sub
- stryke-grpc — reflection-based gRPC client
- stryke-k8s — Kubernetes
- stryke-kafka — Apache Kafka
- stryke-mongo — MongoDB
- stryke-mysql — MySQL / MariaDB
- stryke-parquet — Parquet file inspector
- stryke-postgres — PostgreSQL
- stryke-redis — Redis / Valkey
- stryke-spark — Spark Connect (no JVM)