A multi-agent system that makes every piece sound like me, not like AI — with a measurable bar for failure.
Slide Bullets
Scout grounds every piece in real company research before Scribe writes a word — fixes the "generic AI output" problem at the source
Scribe drafts against a voice fingerprint derived from 4 years of corpus: measurable patterns (avg sentence length, hedging rate, claim density), not adjectives
Inspector is pure Python — forbidden words, sentence length, em-dash count — deterministic gate fires before any AI review runs
Critic runs on Gemini reviewing Claude's output: cross-model adversarial review catches the specific tells each model misses in its own drafts
Miller gates persuasive pieces against StoryBrand structure — hero, stakes, guide, plan, CTA — only when intent: persuasive
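The Inspector bullet above describes a deterministic gate, which can be sketched in a few lines of plain Python. This is a minimal illustration only: the function name `inspect_draft`, the forbidden-word list, and both thresholds are assumptions made for the example, since the slides do not state the real values.

```python
import re
from dataclasses import dataclass, field

# Hypothetical rule values: the slides name the checks, not the thresholds.
FORBIDDEN_WORDS = {"delve", "leverage", "synergy"}
MAX_AVG_SENTENCE_WORDS = 22
MAX_EM_DASHES = 0
EM_DASH = "\u2014"


@dataclass
class InspectionResult:
    passed: bool
    failures: list = field(default_factory=list)


def inspect_draft(text: str) -> InspectionResult:
    """Run every deterministic check and collect the reasons for failure."""
    failures = []

    # Check 1: forbidden words (case-insensitive, whole words only).
    words = re.findall(r"[a-z']+", text.lower())
    hits = sorted(set(words) & FORBIDDEN_WORDS)
    if hits:
        failures.append("forbidden words: " + ", ".join(hits))

    # Check 2: average sentence length in words.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if sentences:
        avg = sum(len(s.split()) for s in sentences) / len(sentences)
        if avg > MAX_AVG_SENTENCE_WORDS:
            failures.append(f"average sentence length {avg:.1f} words")

    # Check 3: em-dash count.
    dashes = text.count(EM_DASH)
    if dashes > MAX_EM_DASHES:
        failures.append(f"em-dash count {dashes}")

    return InspectionResult(passed=not failures, failures=failures)
```

Because the checks involve no model call, the same draft always produces the same verdict, which is what lets this gate run before any AI review.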
Proof Points
15 pieces shipped through the full pipeline: 4 cover letters, 7 LinkedIn rewrites, 4 articles and internal docs
Every piece clears 4–5 sequential agent gates before it ships: roughly 60 to 75 gate evaluations across the 15 pieces
Bar is binary: if Jeff rewrites >30% of words, the engine has failed V1. Not "feels better" — a hard number set before Phase 1 began
outcome-architect issued a Phase 1 authorization at 95% confidence after confirming that the architecture addressed all four failure modes
Origin: A NerdWallet cover letter missed Riskalyze and RightCapital entirely — two platforms central to the role. Generic AI output with no contextual grounding. That failure became the spec.
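The 30% rewrite bar from the proof points can be made concrete with a word-level diff. The sketch below is one possible way to measure it, not the engine's actual metric: it assumes "rewritten" means the share of draft words that do not survive into the final version, and the names `rewrite_ratio` and `passes_v1_bar` are hypothetical.

```python
import difflib

REWRITE_LIMIT = 0.30  # from the slides: more than 30% rewritten means V1 failed


def rewrite_ratio(draft: str, final: str) -> float:
    """Fraction of draft words that did not survive into the final text."""
    draft_words = draft.split()
    final_words = final.split()
    if not draft_words:
        return 0.0
    matcher = difflib.SequenceMatcher(a=draft_words, b=final_words, autojunk=False)
    # Matching blocks are runs of words kept in the same order.
    kept = sum(block.size for block in matcher.get_matching_blocks())
    return 1 - kept / len(draft_words)


def passes_v1_bar(draft: str, final: str) -> bool:
    """True when the edited share stays at or under the limit."""
    return rewrite_ratio(draft, final) <= REWRITE_LIMIT
```

Other definitions (characters instead of words, or counting inserted words as well) would give different numbers, so the definition would need to be fixed before the bar is applied.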
Story 2 of 2
Ship Pipeline — 4-Agent Code Quality Gate
A non-developer shipping production code through a structured 4-agent pipeline — with a security gate that literally blocks the deploy command if it doesn't pass.
Slide Bullets
outcome-architect runs an ODF audit before any code is written — issues a /goal authorization block; build cannot start without it (voice-engine cleared at 95% confidence)
gemini-researcher maps the codebase and delivers an architecture plan; gemini-builder handles 80–90% of all implementation — Claude orchestrates, Jeff reviews and approves
gemini-security-reviewer audits every deploy: OWASP Top 10, auth patterns, secrets exposure, injection risks, dependency CVEs
The deploy gate is literal — a PreToolUse(Bash) hook intercepts every modal deploy command and exits with error if pip-audit or evals fail
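The deploy gate in the last bullet could look roughly like the sketch below. It rests on assumptions: that the PreToolUse hook receives the pending tool call as JSON with `tool_name` and `tool_input.command` fields, and that exit code 2 blocks the call. The `decide` function, the regex, and the exact eval command are hypothetical; the slides name pip-audit and evals but not how they are invoked.

```python
import re
import subprocess
import sys

# Matches "modal deploy ..." anywhere in the shell command.
DEPLOY_PATTERN = re.compile(r"\bmodal\s+deploy\b")

# Hypothetical check commands; each must exit 0 for the deploy to proceed.
CHECKS = [
    ["pip-audit"],
    ["python", "-m", "pytest", "evals", "-q"],
]

BLOCK = 2  # assumed "block this tool call" exit code
ALLOW = 0


def decide(event: dict, checks=CHECKS, runner=subprocess.run) -> int:
    """Return the exit code the hook should use for this tool call."""
    command = event.get("tool_input", {}).get("command", "")
    if event.get("tool_name") != "Bash" or not DEPLOY_PATTERN.search(command):
        return ALLOW  # not a deploy, so nothing to gate

    # Run every check and collect the ones that fail.
    failed = [" ".join(cmd) for cmd in checks if runner(cmd).returncode != 0]
    if failed:
        print("Deploy blocked, failing checks: " + "; ".join(failed), file=sys.stderr)
        return BLOCK
    return ALLOW
```

A hook script built on this would end with `sys.exit(decide(json.load(sys.stdin)))`. Passing `runner` as a parameter keeps the decision logic testable without running pip-audit for real.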
Proof Points
15 security vulnerabilities closed on SearchOps in one project: 6 High, 9 Medium — all caught before production
29 CVEs surfaced and cleared by automated dep scan; 7 more resolved in a single session (starlette, python-multipart, requests)
Three-agent quality gate ran clean on SearchOps v1 public launch: outcome-architect PASS, gemini-researcher 42/42 tests passing with no issues, gemini-security-reviewer zero findings
SearchOps shipped solo in ~15 sessions: 4-layer scoring engine, 63 passing tests, full security audit, production-deployed on Modal
232× DB query improvement on hot paths, 19–24× on dashboard counts — surfaced by gemini-researcher before performance became a production problem