PromptBuildShip.dev

A flagship, general-purpose developer productivity suite for Claude Code — every phase
of a real software workflow, with each agent and skill that already knows its lane.

16 Agents · 11 Skills
Full SDLC
Zero ad-hoc prompts

The Pitch

The pack ships 16 specialist agents and 11 skills that
cover every phase of a real software workflow: scoping requirements, writing code,
reviewing it, debugging it, testing it, documenting it, researching the unknowns, and
shipping the result.

It is for any developer using Claude Code who is tired of writing the same long prompt
every time. Instead of re-explaining "act as a senior reviewer and check for X, Y, Z"
on every task, you invoke an agent that already knows its domain — its scope, its
non-goals, its review format, and which agent to hand off to next. A requirements doc
flows into the coding orchestrator. The coding orchestrator hands changed files to the
code reviewer. The reviewer routes test gaps to the test engineer. The technical writer
turns the merged work into release notes. Each piece knows its lane and where the next
lane starts.

The core problem this solves is ad-hoc prompting. One-off prompts produce one-off
quality: inconsistent reviews, forgotten edge cases, no clean handoff between stages,
and no memory of what was decided. The Power Pack replaces that with a set of agents
and skills that enforce a discipline — front-load questions, keep changes surgical,
verify before declaring done, challenge diagnoses before fixing, and capture what was
learned at the end of the session.

What's Included

Agents

Agent What it does
requirements-architect Authors and revises requirement, design, and feature-spec documents that downstream coding agents execute against, coordinating per-section council review and specialist sign-off.
code-reviewer Reviews a single C# project file-by-file for naming, structure, async correctness, complexity, error handling, public API hygiene, and test coverage on changed code.
code-council Runs a coding or architecture decision through five independent advisors who analyze, peer-review, and synthesize a final technical verdict.
debug-investigator Enumerates every plausible root cause for a bug or failure, verifies each with a read-only command, and reports confirmed versus ruled-out causes with evidence.
hostile-reviewer Argues against a proposed diagnosis or fix, surfacing unverified claims, untested coverage gaps, and regression risk before any code changes.
project-manager Acceptance gate that maps every numbered requirement to its implementation, runs the build and tests, flags scope drift, and produces a traceability matrix and verdict.
agent-roster-reviewer Audits a Claude Code agent roster for gaps, duplication, scope drift, broken cross-references, and model-choice consistency, then produces a decision-ready report.
git-engineer Handles all daily git and gh operations — commit, push, branch, merge, rebase, tag, conflict triage, PR ops — and returns concise summaries instead of raw output.
technical-writer Translates finished engineering work into release notes, CHANGELOG entries, migration guides, README sections, and API integration guides for the people who consume the product.
web-designer-ux Writes accessible, responsive, modern HTML/CSS/JavaScript that integrates with C# web stacks, owning visual design, interaction, accessibility, and frontend performance.
llm-integration-engineer Owns the seam between an app and its LLM providers — prompt design and versioning, structured output, evals, cost tracking, rate limiting, and observability.
email-engineer Owns HTML email layout, SMTP integration, deliverability (SPF/DKIM/DMARC/BIMI), transactional versus marketing patterns, and bounce/complaint handling.
document-export-engineer Owns document export pipelines — PDF, Google Docs, .docx, .xlsx — including pagination, font embedding, templating, and accessibility tagging.
web-crawler-engineer Owns crawlers and scrapers — Playwright orchestration, polite rate-limited fan-out, resilient extraction, snapshot persistence, and Firecrawl integration when available.
telemetry-analyst Defines metrics, instrumentation, dashboards, funnels, alerts, and SLOs so a feature's behavior in production is visible and tied to decisions.
human-voice-writer Rewrites short conversational copy — DMs, follow-ups, cold outreach — to strip robotic patterns and read like a real person, without changing substance.

Skills

Skill command What it does
/code Orchestrates the full development lifecycle from requirements through phased implementation, review gates, and verification, with five run modes including unattended overnight execution.
/commit Reviews changes, stages relevant files, writes a conventional-commit message, commits, and always pushes — then reports the SHA.
/deep-dive Runs a scoped, multi-source research pipeline — parallel searches, source triangulation, claim verification — and saves a citation-backed Markdown report to disk.
/design-review Reviews HTML/CSS UI work against shared design laws and anti-patterns, returning critical/warning/pass findings before UI ships.
/handoff Summarizes completed phases, outstanding items, key file paths, and a ready-to-paste prompt for a fresh session, saved to handoff.md.
/overnight-build-handoff Front-loads every question, decides scope, invokes /code to build a nightly schedule, and writes a handoff with decisions, risks, and kickoff instructions.
/prompt-forge Writes or sharpens a prompt for any AI tool. Detects the target, extracts intent, strips waste, and delivers one clean copyable prompt block with a one-line strategy note.
/retrospective Interactive post-session retro that scans the conversation, asks focused questions, and saves approved learnings to memory or skill files.
/security-review Runs an OWASP Top 10 audit on changed or specified code and produces a severity-ranked findings report, with optional auto-fix for Critical/High.
/simplify Reviews changed code for over-engineering and surgical-change violations (Simplicity First and Surgical Changes discipline) and optionally applies the fixes.
/webapp-testing Toolkit for testing local web apps with Playwright — verifying frontend behavior, capturing screenshots, and reading browser logs.

Installation

Prerequisites

  • The Claude Code CLI, installed and authenticated. If you do not have it yet, get it at https://claude.ai/code.
  • A ~/.claude/ directory (Claude Code creates this on first run).

Step 1 — Unzip the pack

unzip Claude-Code-Power-Pack.zip
cd Claude-Code-Power-Pack

Step 2 — Copy the agents

Agents are single .md files. Copy them into your user agents directory:

cp agents/*.md ~/.claude/agents/

Step 3 — Copy the skills

Each skill is a directory (it carries its own SKILL.md plus any reference files, scripts, and templates). Copy the directories, not just the files:

cp -r skills/* ~/.claude/skills/

Step 4 — Verify

Restart Claude Code so it reloads agents and skills. Then:

  • Type @ in the prompt. The picker lists the agents you just installed (@code-reviewer, @debug-investigator, and so on).
  • Invoke /code to confirm skills load. If the coding orchestrator skill responds, the skill directory copied correctly.

Note on overwrites

cp overwrites silently. If you already have agents or skills with the same names in ~/.claude/, this replaces them. Back up first if you need to keep your existing versions:

cp -r ~/.claude/agents ~/.claude/agents.bak
cp -r ~/.claude/skills ~/.claude/skills.bak

Agent Reference

requirements-architect

Authors and revises requirement, design, and feature-spec documents that downstream coding agents execute against without further clarification. Every ambiguity left in the doc becomes a bug, so the writing is declarative, unambiguous, and machine-actionable. It runs each section through code-council before moving on, prompts you about telemetry on every new spec, and emits REVIEW_REQUEST: lines so the right specialists validate the relevant sections before sign-off. Reach for it at the start of any feature, when scoping a PRD, or when an existing spec needs revision.

Example "Write a requirements doc for a user-facing order cancellation feature — cancel-in-flight orders, refund handling, and an audit trail."

code-reviewer

A per-file, per-function C# quality reviewer. It reads the changed code and calls out the things that will hurt the team next month: bad naming, oversized files and methods, mutation where a record fits, swallowed exceptions, missing CancellationToken forwarding, leaked IQueryable, magic values, and untested changed methods. It runs the formatter and analyzers first so it only flags what those miss, then issues an Approved / Revise / Block verdict. Reach for it before merging a single-project feature or whenever a coding agent emits REVIEW_REQUEST: code-reviewer.

Example "Review the files I just changed in the OrderService for code quality before I merge."

code-council

Takes a coding or architecture decision and runs it through five advisors — the Security & Reliability Auditor, the Minimalist, the Architect, the Code Reader, and the Pragmatist. They analyze independently, peer-review each other anonymously, and a chairman synthesizes where they agree, where they clash, and what to actually do. Reach for it on architecture decisions, approach selection, debugging dead ends, and "is this over-engineered?" questions — not factual lookups or tasks with one right answer.

Example "Council this: should I use event sourcing for the audit log or just an append-only audit table?"

debug-investigator

A systematic root-cause investigator. Before anyone writes a fix, it enumerates every plausible cause across all categories (environment, config, code, dependency, state, timing, permissions, network), assigns a read-only verification command to each, runs them, and reports confirmed versus ruled-out causes with the exact evidence. It produces one fix recommendation but does not implement it. This is the first thing to run on any bug, error, service failure, or regression.

Example "The waitlist endpoint started returning 500s after this morning's deploy — investigate the root cause."

hostile-reviewer

The devil's advocate. It reads a proposed diagnosis and fix and actively argues against them: is the evidence causation or correlation, what categories of cause were never checked, is there a simpler explanation, does the fix address the root cause or mask a symptom, what could regress. Every objection comes with the exact verification that would resolve it. Run it after debug-investigator and before committing any fix, or against any significant architectural decision.

Example "Here's the diagnosis and proposed fix for the 500 errors — challenge it before I implement anything."

project-manager

The acceptance gate. It reads the requirements doc and the implementation in the same pass, maps every numbered requirement to the code that satisfies it, runs the build and the test suite, flags any code that traces to no requirement (scope drift), and produces a traceability matrix plus an Approved / Revise / Block verdict. It does not write specs or code — it verifies and reports. Run it when implementation is reported complete or before declaring a milestone done.

Example "Implementation for the waitlist feature is done — verify it against docs/requirements.md and tell me what's actually met."

agent-roster-reviewer

A senior auditor for a Claude Code agent roster. It catalogs every agent, builds a name index, cross-checks delegation references for dangling names, clusters agents by domain to find duplication, and anchors gap analysis to the projects you actually work on. It produces a decision-ready report — it does not author new agent files. Reach for it when you want to audit, rationalize, or deduplicate your roster, or after adding several new agents.

Example "Audit my agents directory against this project and tell me what's duplicated or missing."

git-engineer

The git operator. Its primary job is context savings: verbose git status, git diff, and gh pr view output lives in its context, not your main session's. It commits with conventional messages, pushes (because "commit" means commit and push), opens PRs, triages conflicts, and refuses destructive operations (force-push to main, reset --hard, history rewrite) without explicit confirmation. Reach for it for any commit, push, branch, merge, tag, or PR operation.

Example "Commit the waitlist page changes with a sensible message and push."

technical-writer

Translates finished work into the documents users read: release notes grouped by impact, Keep a Changelog entries, migration guides with before/after examples, README sections, deprecation notices, and API integration guides. It picks an audience before writing (end user, admin, API consumer, contributor), uses active voice and present tense, and refuses to promise behavior the code does not deliver. Reach for it when preparing a release, when a breaking change ships, or when any user-facing doc is needed.

Example "Write release notes for v1.2.0 covering everything merged since the v1.1.0 tag, for end users."

web-designer-ux

Owns everything a user sees and interacts with — markup, styling, motion, accessibility, and frontend performance. It works in Razor, Blazor, or standalone frontends, applies WCAG 2.1 AA, mobile-first responsive layout, OKLCH color, fluid type, and a functional-first JavaScript posture. It checks for a design.md reference before starting new UI and runs /design-review before closing out. Reach for it for any HTML, CSS, JavaScript, responsive, or accessibility surface.

Example "Build an accessible, responsive pricing section for the landing page that matches our existing design tokens."

llm-integration-engineer

Owns the seam between an application and the LLM providers it calls (Anthropic, OpenAI, Gemini, local models). It handles versioned prompt files, structured output via the provider's native schema mechanism, eval pipelines that gate prompt rollouts, cost and token tracking at every call site, retry-with-idempotency, prompt caching, and observability. Reach for it when adding or revising an LLM-backed feature, migrating providers, or fixing a feature that burns budget.

Example "Add a Claude-backed summarization step for crawled pages — versioned prompt, structured output, eval set, cost tracking."

email-engineer

Owns email end to end: table-based HTML layout that survives Outlook, inline CSS via a build step, dark-mode handling, plain-text alternatives, and the deliverability stack (SPF, DKIM, DMARC, BIMI, list hygiene). It distinguishes transactional from marketing patterns and wires bounce/complaint webhooks. Reach for it when a project sends email, when deliverability regresses, or when DKIM/SPF setup is being designed.

Example "Build the waitlist confirmation email — transactional, renders in Gmail and Outlook, plain-text alternative, and tell me the DNS records I need."

document-export-engineer

Owns the pipeline between application data and the formatted documents users download — PDF, Google Docs, .docx, .xlsx. It picks the right engine per scope (Playwright HTML-to-PDF when an HTML view exists, QuestPDF for code-defined docs), handles pagination, font embedding, headers and footers, templating, and accessibility tagging, and tests with realistic data volumes. Reach for it when a project exports reports or an export pipeline regresses.

Example "We need the recap report exported as a paginated, brand-consistent PDF from the existing HTML view — handle the 200-row case."

web-crawler-engineer

Owns crawlers and scrapers, politely and legitimately by default: robots.txt respected, concurrency capped, User-Agent identified, aggressive backoff on 429/503. It writes pure extractor functions tested against snapshot corpora, wires per-target health checks, and uses Firecrawl as a managed layer when it is available. Reach for it when adding a target site, when extraction or rate-limit issues regress, or when designing a high-volume crawl.

Example "Add a new crawl target for this competitor's public pricing page — polite rate limits, resilient selectors, snapshot the raw HTML."

telemetry-analyst

Translates "we want to know if this works" into instrumentation, dashboards, and decisions. It defines metrics tied to decisions, reviews whether the app actually emits the needed signals, authors dashboards that answer one question per panel, sets SLOs and alerts, and stays PII-safe. It is the agent that asks, at spec time, "how will we know this is working after we ship?" Reach for it at the start of a feature spec or when production behavior is unclear.

Example "Define the success metrics and a dashboard for the waitlist feature so we can tell if signups convert after launch."

human-voice-writer

A focused prose editor for short conversational copy — DMs, email follow-ups, cold outreach, replies. It strips throat-clearing openers, over-hedging, vague credential language, and hollow closers, and replaces them with direct, specific, credibly human voice. It does not change the substance or honesty of the message. Reach for it when a short message needs to pass a "did a real person write this?" test.

Example "Rewrite this cold outreach email so it sounds like a person, not a template — don't change the offer."

Skill Reference

/code

The coding orchestrator. It does not write implementation code itself — it owns the workflow: parse the PRD into phases and tasks, confirm model and effort with you, dispatch the right specialist agent per task, run three review gates after every task (spec compliance, code quality, unit tests), debug systematically on failure, and capture memory after each phase. It has five run modes, including building and executing an unattended overnight schedule. Every subagent it dispatches gets the nine coding-discipline rules as a preamble.

Example "/code — build the waitlist signup feature from docs/requirements.md."

/commit

A lightweight commit-and-push. It runs git status and git diff, stages the relevant files, writes a conventional-commit message, commits, always pushes to the current branch's remote, and reports the pushed SHA.

Example "/commit"

/deep-dive

A structured research analyst. It scopes the question into 3–5 sub-questions, plans a targeted search agenda, retrieves and evaluates sources, triangulates every factual claim across independent sources, and delivers a citation-backed Markdown report saved to disk. Every claim has ≥2 sources or is explicitly flagged as single-source. A "What We Don't Know" section is always present.

Example "/deep-dive — compare the deliverability tradeoffs of AWS SES versus Postmark for a low-volume transactional sender."

/design-review

A firm design director for HTML UI. It runs preflight gates, detects whether the surface is a brand register or a product register, and reviews against shared design laws — OKLCH color, 65–75ch line length, spacing scale, motion performance — plus a list of absolute bans (side-stripe borders, gradient text, em dashes in UI copy). Output is critical / warning / pass with a one-sentence verdict. It reviews; it does not edit.

Example "/design-review — critique the hero section I just built."

/handoff

Captures session state for a clean restart. It summarizes completed phases, outstanding items, and key file paths (real project paths, not /tmp), then writes a ready-to-paste prompt for a fresh session into handoff.md.

Example "/handoff — we're out of context, write the handoff."

/overnight-build-handoff

The pre-flight orchestrator for unattended overnight builds. It reads the project silently, front-loads every question into one batched call, decides the right /code mode, invokes /code to write docs/nightly-schedule.md, and writes handoff.md with decisions, risks, environment prerequisites, and the exact phrase to start the run. It plans only — it never writes application code.

Example "/overnight-build-handoff — I'll be away tonight, set up an unattended run for Phases 2 through 4."

/prompt-forge

A prompt engineer. It detects the target AI tool, extracts intent across six dimensions (task, input, output, constraints, context, success criteria), asks at most 3 clarifying questions, selects the right prompt pattern (RTF, RISEN, few-shot, agent brief, visual descriptor, file-scope), runs a token efficiency pass to strip every word that doesn't change the output, and delivers one clean copyable prompt block with a one-line strategy note.

Example "/prompt-forge — write me a prompt for Midjourney to generate a rubber-hose cartoon mascot."

/retrospective

An interactive post-session retro. It scans the conversation for skill failures, user corrections, repeated patterns, and cross-skill workflows, then presents up to five ranked candidate learnings in a single question call. Approved items are written to project memory, skill files, or CLAUDE.md. The whole interaction is two moments: one question, then silent execution.

Example "/retrospective — wrap up the session and capture what we learned."

/security-review

An OWASP Top 10 audit. It detects scope from the git diff or your specified files, works through all ten categories plus hardcoded-secret, path-traversal, deserialization, and shell-injection grep sweeps, and produces a severity-ranked findings report (Critical/High/Medium/Low/Info) with attack vector and remediation per finding. It can apply unambiguous Critical and High fixes when asked.

Example "/security-review — audit the auth and signup code I just changed before I ship it."

/simplify

A code-quality pass that enforces two rules: Simplicity First (no speculative features, no single-use abstractions, no error handling for impossible states) and Surgical Changes (every changed line traces to the request; no drive-by refactors). It produces a findings report with line-level recommendations and can apply the unambiguous fixes, one edit per finding so each can be reverted.

Example "/simplify — review my last change for over-engineering and surgical-change violations."

/webapp-testing

A Playwright toolkit for testing local web apps. It uses a bundled server-lifecycle helper as a black box, follows a reconnaissance-then-action pattern (navigate, wait for networkidle, inspect the rendered DOM, then act), and captures screenshots and browser logs. Use it to verify frontend behavior after a UI change.

Example "/webapp-testing — start the dev server and verify the waitlist form submits and shows the success state."

Power Combinations

The pack is designed for chaining. Each agent or skill produces output the next one consumes.

  1. Full SDLC from idea to shipped PR. requirements-architect produces an unambiguous spec → /code breaks it into phases and dispatches specialist agents per task → code-reviewer issues a per-file verdict on the changed code → unit-test-engineer (dispatched by the review gate) fills coverage gaps → git-engineer opens the PR → /commit lands the final change with a conventional message and a push.
  2. Debug and verify loop. debug-investigator enumerates and verifies every plausible root cause and reports one fix recommendation → hostile-reviewer challenges that diagnosis and names what was never checked → you apply the fix once the surviving objections are resolved → /webapp-testing confirms the fixed behavior in a real browser.
  3. Research, spec, document. /deep-dive produces a citation-tracked report on the unknowns → requirements-architect turns the findings into a machine-actionable spec with acceptance criteria → technical-writer turns the shipped result into release notes and a migration guide for the people who consume it.
  4. Content and outreach. /deep-dive gathers the supporting evidence → human-voice-writer strips the AI tells from the draft → /prompt-forge produces a tool-specific prompt for any image, video, or copy generation the content needs.
  5. Overnight autonomous build. requirements-architect writes the spec while you are present → /overnight-build-handoff front-loads every decision and invokes /code to write the nightly schedule → /code (execute-schedule mode) runs the build unattended through its review gates → on wake-up, project-manager verifies what actually shipped against the spec and flags scope drift.
  6. Pre-merge hardening gauntlet. After a feature is implemented, run /simplify to strip over-engineering and drive-by refactors → /security-review for an OWASP audit on the changed code → code-reviewer for the per-file quality verdict → project-manager as the final acceptance gate before the merge. Four lenses, no single blind spot.
  7. Design a feature you can prove works. requirements-architect drafts the spec and, on its telemetry prompt, you engage telemetry-analyst to define measurable success signals before any code exists → web-designer-ux builds the surface and runs /design-review → after launch, telemetry-analyst reads the dashboard and reports whether the feature actually converts. The "how will we know?" question is answered at spec time, not after.

Tips and Best Practices

  1. Always run debug-investigator before proposing a fix. Confirmation bias makes "obvious" bugs the most dangerous. Diagnose with evidence first; the one-line fix recommendation comes after the verification, not before it.
  2. Use hostile-reviewer as a mandatory gate, not an optional one. After every diagnosis and before every significant architectural change. If a fix has failed more than once, re-run it — the root-cause assumption is probably wrong.
  3. Let requirements-architect own the contract. Downstream coding agents execute against the spec without re-asking. The clearer the acceptance criteria up front, the less rework later. Say yes to its telemetry prompt for anything whose adoption you will need to validate.
  4. Route all git through git-engineer. It keeps verbose diff and log output out of your main session's context, and it refuses destructive operations by default. "Commit" means commit and push — that is intentional; a held commit is the failure mode that rule prevents.
  5. Reach for code-council on decisions, not lookups. Use it for architecture and approach calls — when being wrong is expensive and there are real tradeoffs. It is not for questions with one right answer.
  6. Stack the review skills before merging. /simplify and /security-review answer different questions than code-reviewer and catch different defects. Run them together on anything that touches auth, input handling, or data access.
  7. Close every substantial session with /retrospective. It captures user corrections, repeated workarounds, and workflow patterns into memory and skill files so the next session does not relearn them. Pair it with /handoff when you are out of context mid-task.
  8. Humanize before you ship copy. Run user-facing prose through human-voice-writer as the last step. AI tells in release notes, outreach, and marketing erode trust faster than a missing feature.