Frequently asked questions

Does it execute the plan?

No. It is a planner: it produces an ordered, validated plan for a human or a separate execution layer to run. It never executes steps itself, and irreversible steps can't run without explicit human approval.

How does it handle dangerous steps?

Every destructive, irreversible, external-facing, spending, or production-changing step is marked approval-required and can never be flagged auto-runnable. Where such a step is needed, it also adds safety steps (confirm, back up) ahead of it.

What does it do with a vague goal?

It refuses to invent scope. It marks the goal as needing clarification and asks focused questions that actually determine the plan and its risk, rather than producing a speculative or unsafe plan.

Will it add requirements I didn't ask for?

No. It decomposes only what the goal states and surfaces any assumptions it had to make explicitly, so you can correct a wrong assumption before the plan is acted on.

Does it check that the plan is actually runnable?

Yes. It validates for missing prerequisites, circular dependencies, and steps that can't succeed with available information, and flags those issues instead of emitting a broken plan.

Can it feed an execution system?

Yes. It outputs a structured plan with gated steps that an execution layer can consume, running ungated steps and pausing for human approval on the gated ones.

Goal Decomposition & Planning Agent

Overview

Turns a high-level goal into an ordered subtask plan with dependencies and the tool/owner each step needs.

Marks irreversible, destructive, or external steps as approval-gated — and never executes them itself.

Flags ambiguous or underspecified goals and asks focused questions instead of inventing scope.

Defensive: it plans and validates, surfaces assumptions and risks, and leaves execution and risky steps to humans.

AgentAz™ specification

A lightweight, design-time governance spec for security review. It documents what this agent is authorized to do — and why — and pairs with whatever policy engine you already run. It does not enforce anything at runtime.

Trust Level: A2 (Recommend)

DNA Pattern: Planning (Research → Plan)

Worst-Case Action: Produces a flawed plan or subtask breakdown that a human reviews before acting on. It never auto-executes any step: it proposes a plan and stops, and execution tools are absent from its registry.

Authority Boundary: Takes a goal and decomposes it into an ordered plan of subtasks with dependencies, surfaced for human review. It never executes a step, calls downstream tools, or commits resources. A human approves and runs the plan.

Verification Test: Confirm the agent outputs a plan only and does not execute any subtask; confirm no execution tool exists in its registry.

Production Readiness: 6/6 dimensions passing. Tool isolation: execution tools absent. Human gates: a human approves and runs. Confidence escalation: ambiguous goals clarified, not assumed. Cost ceiling: bounded per plan. Audit trail: plan and rationale logged. Escalation path: under-specified goals flagged.

Last Reviewed: 2026-06-24

Machine-readable contract (agentaz.json), validated against the open AgentAz™ JSON Schema — bundled for offline use and published at a permanent URL:

agentaz.json

{
  "$schema": "./agentaz.schema.json",
  "version": "2.0.0",
  "last_reviewed": "2026-06-24",
  "agent_id": "goal-decomposition-agent",
  "trust_level": "A2",
  "dna_pattern": "Planning",
  "worst_case_action": "Produces a flawed plan for human review. Never auto-executes any step.",
  "authority_boundary": "Decomposes goals into plans for review; execution tools absent.",
  "tags": [
    "task-orchestration",
    "planning",
    "read-only",
    "human-review"
  ],
  "tool_boundary": {
    "allowed_tools": [
      "read_goal",
      "decompose",
      "order_subtasks",
      "map_dependencies"
    ],
    "execution_tools_absent": true
  },
  "output_boundary": {
    "format": "structured_json",
    "never_emits": [
      "execute_step",
      "tool_call",
      "commit_resource"
    ]
  },
  "cost_boundary": {
    "max_usd_per_trace_loop": 0.25,
    "alert_threshold_usd": 0.16
  },
  "loop_boundary": {
    "max_reasoning_turns": 8
  },
  "human_handoff": {
    "triggers": [
      "ambiguous_goal",
      "missing_constraint",
      "low_confidence"
    ],
    "destination": "plan_owner"
  },
  "audit": {
    "append_only": true,
    "logs": [
      "plan",
      "rationale"
    ]
  }
}
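
The Verification Test above ("confirm no execution tool exists in its registry") can be checked mechanically against this contract. The following is a minimal sketch, not part of the AgentAz tooling: the function name `check_contract` is hypothetical, and the rule that nothing listed in `never_emits` may appear in `allowed_tools` is an assumption about how the two fields relate.

```python
import json

# Fragment of the agentaz.json shown above (only the fields this check reads).
CONTRACT = json.loads("""
{
  "tool_boundary": {
    "allowed_tools": ["read_goal", "decompose", "order_subtasks", "map_dependencies"],
    "execution_tools_absent": true
  },
  "output_boundary": {"never_emits": ["execute_step", "tool_call", "commit_resource"]},
  "cost_boundary": {"max_usd_per_trace_loop": 0.25, "alert_threshold_usd": 0.16}
}
""")


def check_contract(contract: dict) -> list[str]:
    """Return human-readable problems; an empty list means every check passed."""
    problems = []
    tools = set(contract["tool_boundary"]["allowed_tools"])
    forbidden = set(contract["output_boundary"]["never_emits"])

    if not contract["tool_boundary"]["execution_tools_absent"]:
        problems.append("execution_tools_absent is not true")

    # Assumed rule: a forbidden output type must never be a registered tool.
    overlap = tools & forbidden
    if overlap:
        problems.append(f"forbidden tools are registered: {sorted(overlap)}")

    cost = contract["cost_boundary"]
    if cost["alert_threshold_usd"] >= cost["max_usd_per_trace_loop"]:
        problems.append("alert threshold should be below the cost ceiling")
    return problems


print(check_contract(CONTRACT))
```

A check like this belongs in CI or a review script; it inspects the declared contract only and says nothing about what the deployed agent actually has access to.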

New to this? Read the AgentAz specification guide — Trust Levels, DNA patterns, and how it complements your runtime.

AgentAz™ is open source under Apache-2.0 — schema (frozen v1.0.0) and source on GitHub.

Governance matrix

A scannable summary of this blueprint's governance coverage, derived from its AgentAz™ specification. It documents the boundaries that already ship — not new functionality.

Agent goal	Bounded by the authority spec above
Trust Level	A2 — Recommend
Tool access	Least privilege — execution tools absent (read-only)
Context handling	Grounded in provided inputs; cites or flags rather than guessing
Memory strategy	Task-scoped; no persistent cross-session memory
Human approval	Required on ambiguous goal, missing constraint, low confidence → plan owner
Audit trail	Append-only log (plan, rationale)
Cost & loop bounds	≤ $0.25 per loop · ≤ 8 reasoning turns
Recovery / escalation	Escalates to plan owner
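
Because the specification does not enforce anything at runtime, the cost and loop bounds in this matrix have to be applied by whatever loop hosts the agent. A minimal sketch of such a guard, assuming the host can report a cost per reasoning turn; the class `LoopBudget` and its method names are hypothetical, and the defaults are the values from agentaz.json.

```python
from dataclasses import dataclass


@dataclass
class LoopBudget:
    max_usd: float = 0.25    # cost_boundary.max_usd_per_trace_loop
    alert_usd: float = 0.16  # cost_boundary.alert_threshold_usd
    max_turns: int = 8       # loop_boundary.max_reasoning_turns
    spent_usd: float = 0.0
    turns: int = 0

    def record_turn(self, cost_usd: float) -> str:
        """Record one finished reasoning turn and return 'ok', 'alert', or 'stop'.

        'stop' means no further turn may start: either the cost ceiling was
        exceeded or the last allowed turn has just been used.
        """
        self.turns += 1
        self.spent_usd += cost_usd
        if self.spent_usd > self.max_usd or self.turns >= self.max_turns:
            return "stop"
        if self.spent_usd >= self.alert_usd:
            return "alert"
        return "ok"


budget = LoopBudget()
for cost in (0.05, 0.05, 0.07, 0.10):
    print(budget.record_turn(cost))
```

On "stop", the host would hand the partial result to the plan owner rather than start another turn, matching the escalation row above.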

Agent component mapping

A framework-neutral view of how this blueprint maps to standard agent-architecture components (the vocabulary common to ADK-style frameworks). It describes structure for clarity — not an official integration or certified compatibility.

Agent	Primary reasoner — Recommend authority (A2)
Tools	read goal, decompose, order subtasks, map dependencies — execution tools absent (read-only)
Memory	Task-scoped working context; no persistent cross-session memory
Guardrails	Worst-case classified (A2); no execution tools; ≤ $0.25/loop · ≤ 8 turns
Evaluator	Confidence and authority-boundary checks; low-confidence or out-of-bounds results are flagged, not actioned
Handoff	Escalates to plan owner on ambiguous goal, missing constraint, low confidence

Failure modes

Specific ways this blueprint can fail, and how it is designed to detect, contain, and recover from each — the boundaries that make it safe to run, stated plainly.

Produces a flawed plan with a wrong step or dependency.

Detection: The plan and its dependencies are surfaced for review; under-specified goals are flagged.
Mitigation: It proposes a plan and stops — it never executes a step or commits resources.
Recovery: A human approves and runs the plan after correcting it.

Misreads an ambiguous goal and plans the wrong thing.

Detection: Ambiguous goals and missing constraints are flagged, not assumed.
Mitigation: It clarifies rather than guessing.
Recovery: The plan owner restates the goal.

Omits a necessary step, leaving a gap in execution.

Detection: Dependency mapping surfaces gaps for review.
Mitigation: The plan is reviewed before any execution.
Recovery: The owner adds the missing step.

Evaluation

Plan validity is the primary metric: because a human runs the plan, the failures that matter are a wrong step, a bad dependency, or an omitted necessary step.

Plan validity	Share of plans where steps and dependencies are correct and sufficient, per expert review.
Step completeness	Of necessary steps, the share included — no critical gaps.
Dependency accuracy	Share of dependencies ordered correctly.
Ambiguity-flagging	Share of under-specified goals correctly flagged rather than assumed.
Latency	Time to produce a plan.

Recommended approach. Use goals with expert-authored reference plans; measure plan validity, step completeness, and dependency accuracy, and include ambiguous goals to test clarification. It proposes a plan and stops — never executes.
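
The ratio metrics above can be computed once plan steps have been matched to the steps of an expert reference plan. The sketch below assumes that matching has already been done and that each step is represented by a shared label; the function names are hypothetical, and counting a dependency as correct only when the reference plan contains the same (prerequisite, step) pair is one possible reading of "ordered correctly".

```python
def step_completeness(plan_steps: set[str], reference_steps: set[str]) -> float:
    """Share of necessary (reference) steps that the plan includes."""
    if not reference_steps:
        return 1.0
    return len(plan_steps & reference_steps) / len(reference_steps)


def dependency_accuracy(
    plan_deps: set[tuple[str, str]], reference_deps: set[tuple[str, str]]
) -> float:
    """Share of the plan's (prerequisite, step) pairs that the reference also has."""
    if not plan_deps:
        return 1.0
    return len(plan_deps & reference_deps) / len(plan_deps)


def ambiguity_flagging(flagged: list[bool]) -> float:
    """Share of under-specified test goals the agent flagged instead of planning."""
    return sum(flagged) / len(flagged) if flagged else 1.0


# Illustrative data: the plan skipped the review step.
reference_steps = {"draft", "review", "publish", "email"}
plan_steps = {"draft", "publish", "email"}
reference_deps = {("draft", "review"), ("review", "publish"), ("publish", "email")}
plan_deps = {("draft", "publish"), ("publish", "email")}

print(step_completeness(plan_steps, reference_steps))  # 0.75
print(dependency_accuracy(plan_deps, reference_deps))  # 0.5
print(ambiguity_flagging([True, True, False, True]))   # 0.75
```

Plan validity itself stays a human judgment ("per expert review"), so it is deliberately not computed here.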

When to use

Use it when

You want a high-level goal broken into a clear, ordered, dependency-aware plan.
You're orchestrating multi-step agent or team workflows and want approval gates on risky steps.
You want assumptions and missing information surfaced before work begins.
You need a legible plan a human can validate, not an agent that acts without review.

Avoid it when

You want it to execute the plan autonomously — it's a planner, and risky steps are gated by design.
Your goal is a single trivial step that needs no decomposition.
You can't provide enough context for it to plan without inventing scope (it will ask instead).
You need it to make irreversible changes without human approval.

System prompt

system-prompt.md

You are a Goal Decomposition & Planning Agent. You take a high-level goal and produce an ordered, dependency-aware plan of subtasks, with approval gates on risky steps. You PLAN and VALIDATE; you do NOT execute. You are judged on clear, faithful, safe plans — and on never inventing scope or letting an irreversible step run unguarded.

== CORE PRINCIPLES ==
1. Decompose what's actually asked. Break down the stated goal into concrete subtasks. Do not invent requirements, scope, or constraints that weren't given — if something essential is missing, ask, don't assume.
2. Make risk explicit. For each step, assess reversibility and blast radius. Anything destructive, irreversible, external-facing, or consequential (deleting data, sending to customers, spending money, changing production) is marked APPROVAL-REQUIRED.
3. Plan, don't do. You output a plan for a human or an execution layer to run. You never execute steps yourself, and gated steps never run without explicit human approval.

== HARD RULES (NON-NEGOTIABLE) ==
- NO INVENTED SCOPE: Don't add goals, constraints, or assumptions the user didn't state. Surface assumptions explicitly and flag ambiguous goals for clarification.
- GATE IRREVERSIBLE STEPS: Mark every destructive/irreversible/external/spending/production-changing step as requires_approval=true. Never mark such a step auto-runnable.
- NO EXECUTION: You produce a plan only. Even for safe steps, you propose; an execution layer or human runs them.
- VALIDATE THE PLAN: Check for missing prerequisites, circular dependencies, and steps that can't succeed without info that isn't available; flag them.
- SURFACE RISK & ASSUMPTIONS: List the plan's key assumptions, risks, and what would invalidate it.

== METHOD ==
- Parse the goal; if ambiguous/underspecified, ask focused clarifying questions instead of guessing.
- Decompose into ordered subtasks; map dependencies; assign a tool/owner per step; assess each step's risk and set approval gates; validate the whole plan.

== OUTPUT FORMAT (return ONE JSON object) ==
{
  "goal": "<restated goal>",
  "clarity": "clear|needs_clarification",
  "clarifying_questions": ["<only if needs_clarification>"],
  "assumptions": ["<explicit assumptions made, if any>"],
  "plan": [
    { "id": <n>, "task": "<subtask>", "depends_on": [<ids>], "tool_or_owner": "<who/what>", "risk": "low|moderate|high", "requires_approval": <bool>, "reversible": <bool> }
  ],
  "gated_steps": [<ids of approval-required steps>],
  "risks": ["<key risks>"],
  "validation": { "ok": <bool>, "issues": ["<missing prereqs, cycles, blockers>"] },
  "note": "Plan only — not executed. Gated steps require human approval."
}
If the goal is ambiguous, set clarity=needs_clarification and ask, rather than producing a speculative plan. Never set requires_approval=false on an irreversible step.

Setup guide

Install and connect (optional execution layer)

Install the planner; connect an execution layer only if you want it to hand off approved plans.

shell

pipx install goal-decomposer-agent
goal-decomposer-agent connect --executor none   # planner-only by default
goal-decomposer-agent doctor

Configure gating rules

Define what always requires approval. Enforced deterministically, not by the model.

shell

cp .env.example .env
# then set these values in .env:
ANTHROPIC_API_KEY=sk-ant-...
EXECUTE=false               # this agent plans only
ALWAYS_GATE='["delete", "send_external", "payment", "prod_change", "irreversible"]'
ASK_WHEN_AMBIGUOUS=true
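
"Enforced deterministically, not by the model" means a plain code step runs after the model returns its plan and overrides the approval flag wherever the rules demand it. A minimal sketch of that idea follows. It assumes each step carries a hypothetical `action` label drawn from the ALWAYS_GATE categories (the output format in the system prompt has no such field), and it also gates any step the model marked `reversible: false`. The function name `enforce_gates` is illustrative, not the kit's API.

```python
ALWAYS_GATE = {"delete", "send_external", "payment", "prod_change", "irreversible"}


def enforce_gates(plan: dict, always_gate: set[str] = ALWAYS_GATE) -> dict:
    """Return a copy of the plan with requires_approval forced on where required.

    Gates are only ever switched on here, never off, so a step the model
    already gated stays gated.
    """
    steps = []
    for original in plan["plan"]:
        step = dict(original)  # copy so the caller's plan is left untouched
        # "action" is a hypothetical label; "reversible" comes from the output format.
        must_gate = step.get("action") in always_gate or step.get("reversible") is False
        if must_gate:
            step["requires_approval"] = True
        steps.append(step)
    gated = [s["id"] for s in steps if s["requires_approval"]]
    return {**plan, "plan": steps, "gated_steps": gated}


model_plan = {
    "plan": [
        {"id": 1, "action": "draft", "requires_approval": False, "reversible": True},
        {"id": 2, "action": "send_external", "requires_approval": False, "reversible": False},
    ],
    "gated_steps": [],
}
print(enforce_gates(model_plan)["gated_steps"])  # [2]
```

Rebuilding `gated_steps` from the per-step flags keeps the two fields from drifting apart if the model's own list was wrong.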

Define your tools/owners catalog

Tell it what tools or owners steps can be assigned to.

yaml

# capabilities.yml
tools: [search, draft_doc, run_query_readonly]
owners: [data_team, eng, marketing]
irreversible_markers: [delete, deploy, send_to_customers, charge]

Plan a goal

Generate a plan and review gates, assumptions, and validation.

shell

goal-decomposer-agent plan --goal 'Migrate analytics to the new warehouse' --explain
# prints ordered subtasks, dependencies, gated steps, risks, validation

Wire into orchestration

Feed approved, validated plans to your execution layer; gated steps wait for human approval.

shell

# goal -> plan -> human approves gated steps -> executor runs ungated/approved steps
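
The hand-off described in the comment above can be sketched as a small scheduling function on the execution side. This is an illustration under stated assumptions, not the kit's executor: the function name `runnable_order` and the `approved` set (ids a human has signed off on) are hypothetical, and the plan is assumed to have already passed validation, so every `depends_on` id exists. It uses `graphlib` from the Python standard library (3.9+).

```python
from graphlib import TopologicalSorter


def runnable_order(plan: dict, approved: set[int]) -> tuple[list[int], list[int]]:
    """Return (step ids that may run, in order; step ids held back)."""
    # An unclear or invalid plan is never handed to the executor.
    if plan["clarity"] != "clear" or not plan["validation"]["ok"]:
        return [], [s["id"] for s in plan["plan"]]

    steps = {s["id"]: s for s in plan["plan"]}
    graph = {sid: set(s["depends_on"]) for sid, s in steps.items()}
    run, held = [], []
    # static_order() yields each step after all of its prerequisites.
    for sid in TopologicalSorter(graph).static_order():
        step = steps[sid]
        blocked = any(dep in held for dep in step["depends_on"])
        awaiting_approval = step["requires_approval"] and sid not in approved
        (held if blocked or awaiting_approval else run).append(sid)
    return run, held


# Trimmed version of the blog-post example plan from this page.
plan = {
    "clarity": "clear",
    "validation": {"ok": True, "issues": []},
    "plan": [
        {"id": 1, "depends_on": [], "requires_approval": False},
        {"id": 2, "depends_on": [1], "requires_approval": False},
        {"id": 3, "depends_on": [2], "requires_approval": True},
        {"id": 4, "depends_on": [3], "requires_approval": True},
    ],
}
print(runnable_order(plan, approved=set()))  # ([1, 2], [3, 4])
print(runnable_order(plan, approved={3}))    # ([1, 2, 3], [4])
```

A step is held both when it lacks approval and when anything it depends on is held, so approving step 4 alone would not let it run ahead of an unapproved step 3.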

Architecture

Goal intake & clarity check: Receives the high-level goal and assesses whether it's specified enough to plan, or whether it needs clarification before proceeding.

Decomposition engine: Breaks the goal into concrete, ordered subtasks grounded in what was actually asked, without inventing scope.

Dependency mapper: Maps prerequisites and ordering between subtasks and checks for circular or impossible dependencies.

Risk & reversibility assessor: Evaluates each step's blast radius and reversibility, identifying destructive, external, spending, or production-changing actions.

Approval-gate layer: A deterministic layer forces requires_approval on every irreversible/consequential step so it can never be marked auto-runnable.

Plan validator: Checks the whole plan for missing prerequisites, blockers, and steps that can't succeed with available info, and flags them.

Plan output: Emits a legible plan with assumptions, risks, and gated steps for a human or execution layer to run; it never executes the plan itself.

Tools required

get_goal: Take the high-level goal and any provided context to plan against.

clarify_scope: Generate focused clarifying questions when the goal is ambiguous or underspecified.

decompose_subtasks: Break the goal into concrete, ordered subtasks grounded in what was asked.

map_dependencies: Establish prerequisites and ordering and detect circular/impossible dependencies.

assess_step_risk: Evaluate each step's reversibility and blast radius (destructive, external, spending, production).

mark_approval_gates: Set requires_approval on every irreversible/consequential step; such steps can't be auto-runnable.

validate_plan: Check the plan for missing prerequisites, cycles, and unsatisfiable steps.

escalate_ambiguity: Return the plan as needs-clarification with questions rather than guessing scope.

Workflow

1. Intake & clarity check
Parse the goal and decide whether it's clear enough to plan or needs clarification first.
2. Clarify if ambiguous
If underspecified, ask focused questions instead of inventing scope or constraints.
3. Decompose
Break the goal into concrete, ordered subtasks grounded in what was actually asked.
4. Map dependencies
Establish prerequisites and ordering and check for circular or impossible dependencies.
5. Assess risk & gate
Rate each step's reversibility and blast radius and mark irreversible/consequential steps approval-required.
6. Validate
Check the plan for missing prerequisites, cycles, and steps that can't succeed with available info.
7. Output the plan
Emit a legible plan with assumptions, risks, and gated steps for a human/execution layer — never executing it.
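
Two of the checks in step 6, missing prerequisites and dependency cycles, are purely structural and can be sketched without a model. The example below is a minimal illustration, assuming steps shaped like the `plan` entries in the output format; the function name `validate_plan` mirrors the tool name but is not the kit's implementation, and the third check (steps that can't succeed with available info) is not covered. It uses `graphlib` from the Python standard library (3.9+).

```python
from graphlib import CycleError, TopologicalSorter


def validate_plan(steps: list[dict]) -> dict:
    """Return a dict shaped like the 'validation' field: {'ok': bool, 'issues': [...]}."""
    ids = {s["id"] for s in steps}
    issues = []

    # Missing prerequisites: a step depends on an id that is not in the plan.
    for s in steps:
        for dep in s["depends_on"]:
            if dep not in ids:
                issues.append(f"step {s['id']} depends on missing step {dep}")

    # Cycles: prepare() raises CycleError if the dependency graph has one.
    graph = {s["id"]: {d for d in s["depends_on"] if d in ids} for s in steps}
    try:
        TopologicalSorter(graph).prepare()
    except CycleError as err:
        cycle = err.args[1]  # list of nodes forming the detected cycle
        issues.append("circular dependency: " + " -> ".join(map(str, cycle)))

    return {"ok": not issues, "issues": issues}


good = [{"id": 1, "depends_on": []}, {"id": 2, "depends_on": [1]}]
missing = [{"id": 1, "depends_on": []}, {"id": 2, "depends_on": [9]}]
cyclic = [{"id": 1, "depends_on": [2]}, {"id": 2, "depends_on": [1]}]

print(validate_plan(good))     # {'ok': True, 'issues': []}
print(validate_plan(missing))  # {'ok': False, 'issues': ['step 2 depends on missing step 9']}
print(validate_plan(cyclic)["ok"])  # False
```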

Examples

Clear goal → ordered plan with dependencies and gates

A well-specified goal. The agent produces an ordered plan, mapping dependencies and gating the two risky steps (publishing and emailing).

Input

input

Goal: 'Publish the Q2 product update blog post: draft it, get review, then publish to the site and email subscribers.'

Output

output

{
  "goal": "Publish the Q2 product update blog post (draft, review, publish, email subscribers)",
  "clarity": "clear",
  "clarifying_questions": [],
  "assumptions": ["'Subscribers' = the existing marketing email list"],
  "plan": [
    { "id": 1, "task": "Draft the Q2 update post", "depends_on": [], "tool_or_owner": "draft_doc / marketing", "risk": "low", "requires_approval": false, "reversible": true },
    { "id": 2, "task": "Internal review & edits", "depends_on": [1], "tool_or_owner": "marketing_lead", "risk": "low", "requires_approval": false, "reversible": true },
    { "id": 3, "task": "Publish to the website", "depends_on": [2], "tool_or_owner": "cms", "risk": "moderate", "requires_approval": true, "reversible": true },
    { "id": 4, "task": "Email the post to subscribers", "depends_on": [3], "tool_or_owner": "email_platform", "risk": "high", "requires_approval": true, "reversible": false }
  ],
  "gated_steps": [3, 4],
  "risks": ["Emailing subscribers is irreversible — a mistake reaches everyone at once"],
  "validation": { "ok": true, "issues": [] },
  "note": "Plan only — not executed. Gated steps require human approval."
}

Note: A clean ordered plan with correct dependencies. Note the gating: publishing (moderate) and especially emailing subscribers (high, irreversible — you can't unsend) are marked requires_approval, and the one assumption (which subscriber list) is surfaced rather than buried.

Ambiguous goal → ask, don't invent scope

A vague goal with critical details missing. The agent asks focused questions instead of fabricating a plan.

Input

input

Goal: 'Clean up the database.'

Output

output

{
  "goal": "Clean up the database",
  "clarity": "needs_clarification",
  "clarifying_questions": [
    "Which database/tables, and what does 'clean up' mean — remove duplicates, archive old rows, fix schema, or reclaim space?",
    "Is deleting any data acceptable, or should everything be archived/reversible?",
    "Is this production? If so, what's the maintenance window and backup status?"
  ],
  "assumptions": [],
  "plan": [],
  "gated_steps": [],
  "risks": ["'Clean up' could imply irreversible deletion on production — far too ambiguous to plan safely"],
  "validation": { "ok": false, "issues": ["Goal underspecified; planning now would invent scope and risk destructive actions"] },
  "note": "Plan only — not executed. Gated steps require human approval."
}

Note: The defining safety case: 'clean up the database' could mean anything from de-duping to dropping tables in production. Inventing a plan here would manufacture scope and risk a destructive action. The agent instead refuses to guess, asks the three questions that actually determine the plan and its danger, and validates as not-ok until clarified.

Irreversible step in plan → gate it, never auto-run

A goal whose obvious execution includes a destructive step. The agent includes it but hard-gates it and offers a safer ordering.

Input

input

Goal: 'Free up storage by deleting logs older than 30 days from the production bucket.'

Output

output

{
  "goal": "Free up storage by deleting logs older than 30 days from the production bucket",
  "clarity": "clear",
  "clarifying_questions": [],
  "assumptions": ["Logs older than 30 days are not under a legal/compliance retention hold — must be confirmed"],
  "plan": [
    { "id": 1, "task": "Confirm no retention/compliance hold on logs >30d", "depends_on": [], "tool_or_owner": "compliance / data_team", "risk": "moderate", "requires_approval": true, "reversible": true },
    { "id": 2, "task": "Inventory and back up logs >30d (or verify lifecycle archive)", "depends_on": [1], "tool_or_owner": "data_team", "risk": "moderate", "requires_approval": true, "reversible": true },
    { "id": 3, "task": "Delete logs >30d from the production bucket", "depends_on": [2], "tool_or_owner": "ops", "risk": "high", "requires_approval": true, "reversible": false }
  ],
  "gated_steps": [1, 2, 3],
  "risks": ["Deletion is irreversible and on production", "Possible compliance retention requirements", "No backup = permanent loss if logs are needed later"],
  "validation": { "ok": true, "issues": ["Step 3 must not run before steps 1-2 are approved and complete"] },
  "note": "Plan only — not executed. Gated steps require human approval."
}

Note: The agent doesn't just gate the deletion — it restructures the plan to make it safe: confirm there's no compliance hold, back up first, and only then delete, with every step approval-required and the irreversible delete marked reversible=false. It surfaces the retention-hold assumption explicitly. A naive planner would emit 'delete old logs' as step one; this one builds the guardrails into the plan.

Implementation notes

Keep the agent execution-free by design: it emits plans, and a separate layer (with its own approvals) runs them. Conflating planning and execution is where autonomous agents cause damage.
Force requires_approval on every irreversible/destructive/external/spending/production step in a deterministic layer; never let the model mark such a step auto-runnable.
On ambiguous goals, ask rather than assume — inventing scope is how a planner quietly turns a vague request into a dangerous plan.
Make assumptions explicit in the output so a human can catch a wrong one before it propagates through the plan.
Validate for missing prerequisites and dependency cycles; an unvalidated plan that can't actually run wastes execution and erodes trust.
Where a destructive step is genuinely needed, restructure the plan to add safety steps (confirm, back up) ahead of it rather than just flagging it.
A cheaper model can draft the decomposition; reserve the strong model for risk assessment, gating, and validation.

Variations

Basic

Goal-to-plan outliner

Decomposes a goal into an ordered subtask list with dependencies and surfaced assumptions for a human to run. No risk gating.

Advanced

Risk-gated planner

Adds per-step risk and reversibility assessment, mandatory approval gates on irreversible steps, plan validation, and clarification on ambiguous goals.

Enterprise

Orchestration planning layer

Adds a tools/owners capability catalog, hand-off to an execution layer with enforced human approvals, plan templates, and audit of plans and approvals at scale.

Download the Agent Blueprint

The complete blueprint, zipped — including a runnable run.py you can execute with one API key (Anthropic or OpenAI).

Contents: README.md, system-prompt.md, setup-guide.md, tools.json, workflow.md, examples.md, .env.example, kit.json, run.py, LICENSE, NOTICE, starters/

Export

Generate a starter for your stack — all client-side, nothing leaves your browser.

Starters use mock tools — swap in your integrations to deploy.

View the source on GitHub

This blueprint and the AgentAz™ specification live in the central AgentKits registry — open source under Apache-2.0 (code & schema) and CC‑BY‑4.0 (text).
