AgentKits

Goal Decomposition & Planning Agent

Production Blueprint

Includes Agent Blueprint + Implementation Guide

An agent that takes a high-level goal and turns it into a plan you can trust: an ordered set of subtasks with dependencies, the tools or owners each needs, and explicit approval gates on anything irreversible. It plans and validates; it does not execute. It is defensive by design: it decomposes only what the goal actually specifies, flags ambiguous or underspecified goals instead of inventing scope, marks destructive, external, or irreversible steps as approval-required and never auto-runs them, and surfaces its assumptions and risks so the plan is legible and checkable before anything happens.

Tags: task-orchestration, planning, agent-infrastructure, workflow, autonomous-agent, decomposition, approval-gates, orchestration, agentaz, agent-governance, trust-level, production-readiness
Stack: Claude, LangGraph, OpenAI
Difficulty: Advanced
Setup: 45 min
Version: 2.0.0 · 2026-06-21

Overview

Turns a high-level goal into an ordered subtask plan with dependencies and the tool/owner each step needs.

Marks irreversible, destructive, or external steps as approval-gated — and never executes them itself.

Flags ambiguous or underspecified goals and asks focused questions instead of inventing scope.

Defensive: it plans and validates, surfaces assumptions and risks, and leaves execution and risky steps to humans.

AgentAz™ specification

A lightweight, design-time governance spec for security review. It documents what this agent is authorized to do — and why — and pairs with whatever policy engine you already run. It does not enforce anything at runtime.

Trust Level: A2 (Recommend)
DNA Pattern: Planning (Research → Plan)
Worst-Case Action: Produces a flawed plan or subtask breakdown that a human reviews before acting on. It never auto-executes any step: it proposes a plan and stops, and execution tools are absent from its registry.
Authority Boundary: Takes a goal and decomposes it into an ordered plan of subtasks with dependencies, surfaced for human review. It never executes a step, calls downstream tools, or commits resources. A human approves and runs the plan.
Verification Test: Confirm the agent outputs a plan only and does not execute any subtask; confirm no execution tool exists in its registry.
Production Readiness: 6/6 dimensions passing. Tool isolation: execution tools absent. Human gates: a human approves and runs. Confidence escalation: ambiguous goals clarified, not assumed. Cost ceiling: bounded per plan. Audit trail: plan and rationale logged. Escalation path: under-specified goals flagged.
Last Reviewed: 2026-06-24

Machine-readable contract (agentaz.json), validated against the open AgentAz™ JSON Schema — bundled for offline use and published at a permanent URL:

agentaz.json
{
  "$schema": "./agentaz.schema.json",
  "version": "2.0.0",
  "last_reviewed": "2026-06-24",
  "agent_id": "goal-decomposition-agent",
  "trust_level": "A2",
  "dna_pattern": "Planning",
  "worst_case_action": "Produces a flawed plan for human review. Never auto-executes any step.",
  "authority_boundary": "Decomposes goals into plans for review; execution tools absent.",
  "tags": [
    "task-orchestration",
    "planning",
    "read-only",
    "human-review"
  ],
  "tool_boundary": {
    "allowed_tools": [
      "read_goal",
      "decompose",
      "order_subtasks",
      "map_dependencies"
    ],
    "execution_tools_absent": true
  },
  "output_boundary": {
    "format": "structured_json",
    "never_emits": [
      "execute_step",
      "tool_call",
      "commit_resource"
    ]
  },
  "cost_boundary": {
    "max_usd_per_trace_loop": 0.25,
    "alert_threshold_usd": 0.16
  },
  "loop_boundary": {
    "max_reasoning_turns": 8
  },
  "human_handoff": {
    "triggers": [
      "ambiguous_goal",
      "missing_constraint",
      "low_confidence"
    ],
    "destination": "plan_owner"
  },
  "audit": {
    "append_only": true,
    "logs": [
      "plan",
      "rationale"
    ]
  }
}
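
The Verification Test above can be partly automated. The following is a minimal sketch, not part of the blueprint: it reads a trimmed copy of the contract and checks that no name listed under never_emits appears in allowed_tools, that execution_tools_absent is true, and that the alert threshold sits below the cost ceiling. The function name check_contract and the choice of checks are assumptions made for illustration; the bundled JSON Schema remains the authoritative validator.

```python
import json

# Trimmed copy of the contract above; only the fields this sketch reads.
CONTRACT_JSON = """
{
  "tool_boundary": {
    "allowed_tools": ["read_goal", "decompose", "order_subtasks", "map_dependencies"],
    "execution_tools_absent": true
  },
  "output_boundary": {"never_emits": ["execute_step", "tool_call", "commit_resource"]},
  "cost_boundary": {"max_usd_per_trace_loop": 0.25, "alert_threshold_usd": 0.16}
}
"""


def check_contract(contract: dict) -> list[str]:
    """Return a list of problems found; an empty list means every check passed."""
    problems = []
    tools = contract.get("tool_boundary", {})
    allowed = set(tools.get("allowed_tools", []))
    forbidden = set(contract.get("output_boundary", {}).get("never_emits", []))

    # Hypothetical rule: nothing the agent must never emit may be a registered tool.
    leaked = allowed & forbidden
    if leaked:
        problems.append(f"execution-style tools in registry: {sorted(leaked)}")
    if tools.get("execution_tools_absent") is not True:
        problems.append("execution_tools_absent is not true")

    cost = contract.get("cost_boundary", {})
    ceiling = cost.get("max_usd_per_trace_loop")
    alert = cost.get("alert_threshold_usd")
    if ceiling is None or alert is None or alert >= ceiling:
        problems.append("alert threshold must be set and below the cost ceiling")
    return problems


contract = json.loads(CONTRACT_JSON)
print(check_contract(contract) or "contract checks passed")
```

A check like this could run in CI next to schema validation, so that adding an execution tool to the registry fails review before it ships.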

New to this? Read the AgentAz specification guide — Trust Levels, DNA patterns, and how it complements your runtime.

AgentAz™ is open source under Apache-2.0 — schema (frozen v1.0.0) and source on GitHub.

Governance matrix

A scannable summary of this blueprint's governance coverage, derived from its AgentAz™ specification. It documents the boundaries that already ship — not new functionality.

Agent goal: Bounded by the authority spec above
Trust Level: A2 (Recommend)
Tool access: Least privilege; execution tools absent (read-only)
Context handling: Grounded in provided inputs; cites or flags rather than guessing
Memory strategy: Task-scoped; no persistent cross-session memory
Human approval: Required on ambiguous goal, missing constraint, or low confidence → plan owner
Audit trail: Append-only log (plan, rationale)
Cost & loop bounds: ≤ $0.25 per loop · ≤ 8 reasoning turns
Recovery / escalation: Escalates to plan owner
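
Because the specification itself enforces nothing at runtime, the cost and loop bounds need a guard in your own harness. The sketch below is one way to do that, assuming your runtime can report a per-turn cost in USD. The class name LoopBudget and the "ok" / "alert" / "stop" return values are hypothetical; the numeric limits are the ones from the contract.

```python
from dataclasses import dataclass


@dataclass
class LoopBudget:
    """Tracks spend and turn count for one planning loop."""
    max_usd: float = 0.25      # max_usd_per_trace_loop
    alert_usd: float = 0.16    # alert_threshold_usd
    max_turns: int = 8         # max_reasoning_turns
    spent_usd: float = 0.0
    turns: int = 0

    def record_turn(self, cost_usd: float) -> str:
        """Record one reasoning turn and return 'ok', 'alert', or 'stop'."""
        self.turns += 1
        self.spent_usd += cost_usd
        # Stop once the ceiling is exceeded or the last allowed turn has been used.
        if self.spent_usd > self.max_usd or self.turns >= self.max_turns:
            return "stop"
        if self.spent_usd >= self.alert_usd:
            return "alert"
        return "ok"


budget = LoopBudget()
for cost in (0.1, 0.1, 0.1):
    print(budget.record_turn(cost))
```

On "stop", the harness would end the loop and hand off to the plan owner rather than let the model keep reasoning.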

Agent component mapping

A framework-neutral view of how this blueprint maps to standard agent-architecture components (the vocabulary common to ADK-style frameworks). It describes structure for clarity — not an official integration or certified compatibility.

Agent: Primary reasoner with Recommend authority (A2)
Tools: read goal, decompose, order subtasks, map dependencies; execution tools absent (read-only)
Memory: Task-scoped working context; no persistent cross-session memory
Guardrails: Worst-case classified (A2); no execution tools; ≤ $0.25/loop · ≤ 8 turns
Evaluator: Confidence and authority-boundary checks; low-confidence or out-of-bounds results are flagged, not actioned
Handoff: Escalates to plan owner on ambiguous goal, missing constraint, or low confidence

Failure modes

Specific ways this blueprint can fail, and how it is designed to detect, contain, and recover from each — the boundaries that make it safe to run, stated plainly.

Produces a flawed plan with a wrong step or dependency.

Detection
The plan and its dependencies are surfaced for review; under-specified goals are flagged.
Mitigation
It proposes a plan and stops — it never executes a step or commits resources.
Recovery
A human approves and runs the plan after correcting it.

Misreads an ambiguous goal and plans the wrong thing.

Detection
Ambiguous goals and missing constraints are flagged, not assumed.
Mitigation
It clarifies rather than guessing.
Recovery
The plan owner restates the goal.

Omits a necessary step, leaving a gap in execution.

Detection
Dependency mapping surfaces gaps for review.
Mitigation
The plan is reviewed before any execution.
Recovery
The owner adds the missing step.

Evaluation

Plan validity is primary — a wrong step, a bad dependency, or an omitted necessary step is the failure, since a human runs the plan.

Plan validity: Share of plans where steps and dependencies are correct and sufficient, per expert review.
Step completeness: Of necessary steps, the share included, with no critical gaps.
Dependency accuracy: Share of dependencies ordered correctly.
Ambiguity-flagging: Share of under-specified goals correctly flagged rather than assumed.
Latency: Time to produce a plan.

Recommended approach. Use goals with expert-authored reference plans; measure plan validity, step completeness, and dependency accuracy, and include ambiguous goals to test clarification. It proposes a plan and stops — never executes.
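
As a rough illustration of how these metrics could be computed against a reference plan, here is a minimal sketch. It assumes plan steps have already been matched to reference steps by a shared label, which in practice is the part that needs expert review, and it reads "dependency accuracy" as the share of reference dependency edges the plan reproduces. The function names are hypothetical.

```python
def step_completeness(plan_steps: set[str], reference_steps: set[str]) -> float:
    """Share of necessary (reference) steps that appear in the plan."""
    if not reference_steps:
        return 1.0
    return len(plan_steps & reference_steps) / len(reference_steps)


def dependency_accuracy(plan_edges: set[tuple[str, str]],
                        reference_edges: set[tuple[str, str]]) -> float:
    """Share of reference (prerequisite, dependent) edges present in the plan."""
    if not reference_edges:
        return 1.0
    return len(plan_edges & reference_edges) / len(reference_edges)


def ambiguity_flag_rate(results: list[tuple[bool, str]]) -> float:
    """results holds (goal_is_underspecified, clarity value the agent returned)."""
    flagged = [clarity for underspecified, clarity in results if underspecified]
    if not flagged:
        return 1.0
    return sum(c == "needs_clarification" for c in flagged) / len(flagged)


reference = {"draft", "review", "publish", "email"}
plan = {"draft", "publish", "email"}          # the review step is missing
print(step_completeness(plan, reference))     # 0.75
```

Plan validity itself stays a human judgement; these numbers only narrow down where a reviewer should look.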

When to use

Use it when

  • You want a high-level goal broken into a clear, ordered, dependency-aware plan.
  • You're orchestrating multi-step agent or team workflows and want approval gates on risky steps.
  • You want assumptions and missing information surfaced before work begins.
  • You need a legible plan a human can validate rather than an agent that just runs off.

Avoid it when

  • You want it to execute the plan autonomously — it's a planner, and risky steps are gated by design.
  • Your goal is a single trivial step that needs no decomposition.
  • You can't provide enough context for it to plan without inventing scope (it will ask instead).
  • You need it to make irreversible changes without human approval.

System prompt

system-prompt.md
You are a Goal Decomposition & Planning Agent. You take a high-level goal and produce an ordered, dependency-aware plan of subtasks, with approval gates on risky steps. You PLAN and VALIDATE; you do NOT execute. You are judged on clear, faithful, safe plans — and on never inventing scope or letting an irreversible step run unguarded.

== CORE PRINCIPLES ==
1. Decompose what's actually asked. Break down the stated goal into concrete subtasks. Do not invent requirements, scope, or constraints that weren't given — if something essential is missing, ask, don't assume.
2. Make risk explicit. For each step, assess reversibility and blast radius. Anything destructive, irreversible, external-facing, or consequential (deleting data, sending to customers, spending money, changing production) is marked APPROVAL-REQUIRED.
3. Plan, don't do. You output a plan for a human or an execution layer to run. You never execute steps yourself, and gated steps never run without explicit human approval.

== HARD RULES (NON-NEGOTIABLE) ==
- NO INVENTED SCOPE: Don't add goals, constraints, or assumptions the user didn't state. Surface assumptions explicitly and flag ambiguous goals for clarification.
- GATE IRREVERSIBLE STEPS: Mark every destructive/irreversible/external/spending/production-changing step as requires_approval=true. Never mark such a step auto-runnable.
- NO EXECUTION: You produce a plan only. Even for safe steps, you propose; an execution layer or human runs them.
- VALIDATE THE PLAN: Check for missing prerequisites, circular dependencies, and steps that can't succeed without info that isn't available; flag them.
- SURFACE RISK & ASSUMPTIONS: List the plan's key assumptions, risks, and what would invalidate it.

== METHOD ==
- Parse the goal; if ambiguous/underspecified, ask focused clarifying questions instead of guessing.
- Decompose into ordered subtasks; map dependencies; assign a tool/owner per step; assess each step's risk and set approval gates; validate the whole plan.

== OUTPUT FORMAT (return ONE JSON object) ==
{
  "goal": "<restated goal>",
  "clarity": "clear|needs_clarification",
  "clarifying_questions": ["<only if needs_clarification>"],
  "assumptions": ["<explicit assumptions made, if any>"],
  "plan": [
    { "id": <n>, "task": "<subtask>", "depends_on": [<ids>], "tool_or_owner": "<who/what>", "risk": "low|moderate|high", "requires_approval": <bool>, "reversible": <bool> }
  ],
  "gated_steps": [<ids of approval-required steps>],
  "risks": ["<key risks>"],
  "validation": { "ok": <bool>, "issues": ["<missing prereqs, cycles, blockers>"] },
  "note": "Plan only — not executed. Gated steps require human approval."
}
If the goal is ambiguous, set clarity=needs_clarification and ask, rather than producing a speculative plan. Never set requires_approval=false on an irreversible step.
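
The output format above implies a few invariants that can be checked in code before a plan reaches a reviewer. The sketch below is one possible checker, not the blueprint's own: it verifies that gated_steps matches the steps with requires_approval set, that no irreversible step is left ungated, and that a needs_clarification result carries at least one question. The function name check_plan_output is hypothetical.

```python
def check_plan_output(out: dict) -> list[str]:
    """Return consistency problems in one agent output; empty means none found."""
    issues = []
    steps = out.get("plan", [])

    # gated_steps should list exactly the steps marked requires_approval.
    gated = sorted(s["id"] for s in steps if s.get("requires_approval"))
    if sorted(out.get("gated_steps", [])) != gated:
        issues.append("gated_steps does not match steps with requires_approval=true")

    # An irreversible step must never be auto-runnable.
    for s in steps:
        if s.get("reversible") is False and not s.get("requires_approval"):
            issues.append(f"step {s['id']} is irreversible but not approval-gated")

    # Asking for clarification without any question is not actionable.
    if out.get("clarity") == "needs_clarification" and not out.get("clarifying_questions"):
        issues.append("needs_clarification without clarifying questions")
    return issues


sample = {
    "clarity": "clear",
    "clarifying_questions": [],
    "plan": [
        {"id": 1, "task": "Back up logs", "depends_on": [], "requires_approval": True, "reversible": True},
        {"id": 2, "task": "Delete logs", "depends_on": [1], "requires_approval": False, "reversible": False},
    ],
    "gated_steps": [1],
}
print(check_plan_output(sample))
```

Here the sample deliberately leaves the delete step ungated, so the checker reports it.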

Setup guide

Install and connect (optional execution layer)

Install the planner; connect an execution layer only if you want it to hand off approved plans.

shell
pipx install goal-decomposer-agent
goal-decomposer-agent connect --executor none   # planner-only by default
goal-decomposer-agent doctor

Configure gating rules

Define what always requires approval. These rules are enforced deterministically, not by the model.

shell
cp .env.example .env
# then set these values in .env:
ANTHROPIC_API_KEY=sk-ant-...
EXECUTE=false               # this agent plans only
ALWAYS_GATE='["delete", "send_external", "payment", "prod_change", "irreversible"]'
ASK_WHEN_AMBIGUOUS=true
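
To make "enforced deterministically" concrete, here is a minimal sketch of a post-processing pass that runs after the model returns a plan. It forces requires_approval to true on any step whose task text contains a gate marker or whose reversible flag is false, and it never lowers a gate the model already set. Matching markers against the task text is a naive assumption made for illustration (a real deployment would more likely key on a structured action type), and the function names are hypothetical.

```python
import copy

# Markers taken from the ALWAYS_GATE setting above.
ALWAYS_GATE = ["delete", "send_external", "payment", "prod_change", "irreversible"]


def must_gate(step: dict, markers: list[str]) -> bool:
    """True if the step matches a marker or is flagged as not reversible."""
    text = step.get("task", "").lower()
    # Assumption: compare marker words (underscores as spaces) to the task text.
    hit = any(marker.replace("_", " ") in text for marker in markers)
    return hit or step.get("reversible") is False


def enforce_gates(plan: list[dict], markers: list[str] = ALWAYS_GATE) -> list[dict]:
    """Return a copy of the plan with gates forced on; gates are only ever raised."""
    gated_plan = copy.deepcopy(plan)
    for step in gated_plan:
        if must_gate(step, markers):
            step["requires_approval"] = True
    return gated_plan


model_plan = [
    {"id": 1, "task": "Draft a storage report", "requires_approval": False, "reversible": True},
    {"id": 2, "task": "Delete logs older than 30 days", "requires_approval": False, "reversible": True},
    {"id": 3, "task": "Email subscribers", "requires_approval": False, "reversible": False},
]
print([s["requires_approval"] for s in enforce_gates(model_plan)])  # [False, True, True]
```

Keeping this pass outside the model means a prompt injection or a model mistake cannot mark a destructive step auto-runnable.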

Define your tools/owners catalog

Tell it what tools or owners steps can be assigned to.

shell
# capabilities.yml
tools: [search, draft_doc, run_query_readonly]
owners: [data_team, eng, marketing]
irreversible_markers: [delete, deploy, send_to_customers, charge]

Plan a goal

Generate a plan and review gates, assumptions, and validation.

shell
goal-decomposer-agent plan --goal 'Migrate analytics to the new warehouse' --explain
# prints ordered subtasks, dependencies, gated steps, risks, validation

Wire into orchestration

Feed approved, validated plans to your execution layer; gated steps wait for human approval.

shell
# goal -> plan -> human approves gated steps -> executor runs ungated/approved steps
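
The hand-off in the comment above can be sketched as a small selection function on the executor side. This is an illustration under stated assumptions, not the blueprint's executor: it assumes the execution layer tracks which step ids are completed and which gated ids a human has approved, and the name runnable_steps is hypothetical.

```python
def runnable_steps(plan: list[dict], completed: set[int], approved: set[int]) -> list[int]:
    """Ids of steps the executor may start now, in plan order."""
    ready = []
    for step in plan:
        step_id = step["id"]
        if step_id in completed:
            continue
        # Every prerequisite must have finished first.
        if not all(dep in completed for dep in step["depends_on"]):
            continue
        # Gated steps wait until a human has approved them.
        if step["requires_approval"] and step_id not in approved:
            continue
        ready.append(step_id)
    return ready


# Trimmed version of the blog-post plan from the Examples section.
PLAN = [
    {"id": 1, "depends_on": [], "requires_approval": False},
    {"id": 2, "depends_on": [1], "requires_approval": False},
    {"id": 3, "depends_on": [2], "requires_approval": True},
    {"id": 4, "depends_on": [3], "requires_approval": True},
]
print(runnable_steps(PLAN, completed={1, 2}, approved=set()))  # []
print(runnable_steps(PLAN, completed={1, 2}, approved={3}))    # [3]
```

The planner never calls this; it belongs to the separate execution layer, which keeps its own approval records.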

Architecture

Tools required

get_goal: Take the high-level goal and any provided context to plan against.
clarify_scope: Generate focused clarifying questions when the goal is ambiguous or underspecified.
decompose_subtasks: Break the goal into concrete, ordered subtasks grounded in what was asked.
map_dependencies: Establish prerequisites and ordering and detect circular/impossible dependencies.
assess_step_risk: Evaluate each step's reversibility and blast radius (destructive, external, spending, production).
mark_approval_gates: Set requires_approval on every irreversible/consequential step; such steps can't be auto-runnable.
validate_plan: Check the plan for missing prerequisites, cycles, and unsatisfiable steps.
escalate_ambiguity: Return the plan as needs-clarification with questions rather than guessing scope.

Workflow

  1. Intake & clarity check

    Parse the goal and decide whether it's clear enough to plan or needs clarification first.

  2. Clarify if ambiguous

    If underspecified, ask focused questions instead of inventing scope or constraints.

  3. Decompose

    Break the goal into concrete, ordered subtasks grounded in what was actually asked.

  4. Map dependencies

    Establish prerequisites and ordering and check for circular or impossible dependencies.

  5. Assess risk & gate

    Rate each step's reversibility and blast radius and mark irreversible/consequential steps approval-required.

  6. Validate

    Check the plan for missing prerequisites, cycles, and steps that can't succeed with available info.

  7. Output the plan

    Emit a legible plan with assumptions, risks, and gated steps for a human or execution layer, without ever executing it.
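
The dependency-mapping and validation steps can be backed by a deterministic check. The sketch below uses Python's standard-library graphlib to detect unknown prerequisites and circular dependencies and to produce one valid execution order. The function name validate_dependencies and the shape of its return value are assumptions chosen to mirror the "validation" field of the output format.

```python
from graphlib import CycleError, TopologicalSorter


def validate_dependencies(plan: list[dict]) -> dict:
    """Check depends_on references and cycles; return ok, issues, and an order."""
    ids = {step["id"] for step in plan}
    issues = []

    # A step may only depend on steps that exist in the plan.
    for step in plan:
        for dep in step["depends_on"]:
            if dep not in ids:
                issues.append(f"step {step['id']} depends on unknown step {dep}")

    # Map each step to its known prerequisites, then sort topologically.
    graph = {step["id"]: set(step["depends_on"]) & ids for step in plan}
    order = []
    try:
        order = list(TopologicalSorter(graph).static_order())
    except CycleError as exc:
        issues.append(f"circular dependency: {exc.args[1]}")

    return {"ok": not issues, "issues": issues, "order": order}


linear = [
    {"id": 1, "depends_on": []},
    {"id": 2, "depends_on": [1]},
    {"id": 3, "depends_on": [2]},
]
print(validate_dependencies(linear)["order"])  # [1, 2, 3]
```

Running the check outside the model gives the validate step a result that does not depend on the model noticing its own cycles.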

Examples

Clear goal → ordered plan with dependencies and gates

A well-specified goal. The agent produces an ordered plan, mapping dependencies and gating the two risky steps.

Input

input
Goal: 'Publish the Q2 product update blog post: draft it, get review, then publish to the site and email subscribers.'

Output

output
{
  "goal": "Publish the Q2 product update blog post (draft, review, publish, email subscribers)",
  "clarity": "clear",
  "clarifying_questions": [],
  "assumptions": ["'Subscribers' = the existing marketing email list"],
  "plan": [
    { "id": 1, "task": "Draft the Q2 update post", "depends_on": [], "tool_or_owner": "draft_doc / marketing", "risk": "low", "requires_approval": false, "reversible": true },
    { "id": 2, "task": "Internal review & edits", "depends_on": [1], "tool_or_owner": "marketing_lead", "risk": "low", "requires_approval": false, "reversible": true },
    { "id": 3, "task": "Publish to the website", "depends_on": [2], "tool_or_owner": "cms", "risk": "moderate", "requires_approval": true, "reversible": true },
    { "id": 4, "task": "Email the post to subscribers", "depends_on": [3], "tool_or_owner": "email_platform", "risk": "high", "requires_approval": true, "reversible": false }
  ],
  "gated_steps": [3, 4],
  "risks": ["Emailing subscribers is irreversible — a mistake reaches everyone at once"],
  "validation": { "ok": true, "issues": [] },
  "note": "Plan only — not executed. Gated steps require human approval."
}

Note: A clean ordered plan with correct dependencies. Note the gating: publishing (moderate) and especially emailing subscribers (high, irreversible — you can't unsend) are marked requires_approval, and the one assumption (which subscriber list) is surfaced rather than buried.

Ambiguous goal → ask, don't invent scope

A vague goal with critical details missing. The agent asks focused questions instead of fabricating a plan.

Input

input
Goal: 'Clean up the database.'

Output

output
{
  "goal": "Clean up the database",
  "clarity": "needs_clarification",
  "clarifying_questions": [
    "Which database/tables, and what does 'clean up' mean — remove duplicates, archive old rows, fix schema, or reclaim space?",
    "Is deleting any data acceptable, or should everything be archived/reversible?",
    "Is this production? If so, what's the maintenance window and backup status?"
  ],
  "assumptions": [],
  "plan": [],
  "gated_steps": [],
  "risks": ["'Clean up' could imply irreversible deletion on production — far too ambiguous to plan safely"],
  "validation": { "ok": false, "issues": ["Goal underspecified; planning now would invent scope and risk destructive actions"] },
  "note": "Plan only — not executed. Gated steps require human approval."
}

Note: The defining safety case: 'clean up the database' could mean anything from de-duping to dropping tables in production. Inventing a plan here would manufacture scope and risk a destructive action. The agent instead refuses to guess, asks the three questions that actually determine the plan and its danger, and validates as not-ok until clarified.

Irreversible step in plan → gate it, never auto-run

A goal whose obvious execution includes a destructive step. The agent includes it but hard-gates it and offers a safer ordering.

Input

input
Goal: 'Free up storage by deleting logs older than 30 days from the production bucket.'

Output

output
{
  "goal": "Free up storage by deleting logs older than 30 days from the production bucket",
  "clarity": "clear",
  "clarifying_questions": [],
  "assumptions": ["Logs older than 30 days are not under a legal/compliance retention hold — must be confirmed"],
  "plan": [
    { "id": 1, "task": "Confirm no retention/compliance hold on logs >30d", "depends_on": [], "tool_or_owner": "compliance / data_team", "risk": "moderate", "requires_approval": true, "reversible": true },
    { "id": 2, "task": "Inventory and back up logs >30d (or verify lifecycle archive)", "depends_on": [1], "tool_or_owner": "data_team", "risk": "moderate", "requires_approval": true, "reversible": true },
    { "id": 3, "task": "Delete logs >30d from the production bucket", "depends_on": [2], "tool_or_owner": "ops", "risk": "high", "requires_approval": true, "reversible": false }
  ],
  "gated_steps": [1, 2, 3],
  "risks": ["Deletion is irreversible and on production", "Possible compliance retention requirements", "No backup = permanent loss if logs are needed later"],
  "validation": { "ok": true, "issues": ["Step 3 must not run before steps 1-2 are approved and complete"] },
  "note": "Plan only — not executed. Gated steps require human approval."
}

Note: The agent doesn't just gate the deletion — it restructures the plan to make it safe: confirm there's no compliance hold, back up first, and only then delete, with every step approval-required and the irreversible delete marked reversible=false. It surfaces the retention-hold assumption explicitly. A naive planner would emit 'delete old logs' as step one; this one builds the guardrails into the plan.

Implementation notes

  • Keep the agent execution-free by design: it emits plans, and a separate layer (with its own approvals) runs them. Conflating planning and execution is where autonomous agents cause damage.
  • Force requires_approval on every irreversible/destructive/external/spending/production step in a deterministic layer; never let the model mark such a step auto-runnable.
  • On ambiguous goals, ask rather than assume — inventing scope is how a planner quietly turns a vague request into a dangerous plan.
  • Make assumptions explicit in the output so a human can catch a wrong one before it propagates through the plan.
  • Validate for missing prerequisites and dependency cycles; an unvalidated plan that can't actually run wastes execution and erodes trust.
  • Where a destructive step is genuinely needed, restructure the plan to add safety steps (confirm, back up) ahead of it rather than just flagging it.
  • A cheaper model can draft the decomposition; reserve the strong model for risk assessment, gating, and validation.

Variations

Basic

Goal-to-plan outliner

Decomposes a goal into an ordered subtask list with dependencies and surfaced assumptions for a human to run. No risk gating.

Advanced

Risk-gated planner

Adds per-step risk and reversibility assessment, mandatory approval gates on irreversible steps, plan validation, and clarification on ambiguous goals.

Enterprise

Orchestration planning layer

Adds a tools/owners capability catalog, hand-off to an execution layer with enforced human approvals, plan templates, and audit of plans and approvals at scale.

Download the Agent Blueprint

The complete blueprint, zipped — including a runnable run.py you can execute with one API key (Anthropic or OpenAI).

Download Blueprint (.zip)
Contents: README.md, system-prompt.md, setup-guide.md, tools.json, workflow.md, examples.md, .env.example, kit.json, run.py, LICENSE, NOTICE, starters/

Export

Generate a starter for your stack — all client-side, nothing leaves your browser.

Starters use mock tools — swap in your integrations to deploy.

View the source on GitHub

This blueprint and the AgentAz™ specification live in the central AgentKits registry — open source under Apache-2.0 (code & schema) and CC‑BY‑4.0 (text).

Frequently asked questions