AgentKits

Purchase Order Matching Agent

Production Blueprint

Includes Agent Blueprint + Implementation Guide

An agent that runs the three-way match at the heart of accounts payable: it reconciles each invoice against its purchase order and goods receipt, checks quantities, prices, and terms against tolerance, and catches the things that cost money — price variances, over-receipts, duplicate invoices, invoices with no PO. It approves clean matches within tolerance for payment and routes everything else to a human. It is defensive by design: it never approves an invoice over tolerance or without a matching PO and receipt, never pays a duplicate, cites the exact variance on every flag, and escalates suspected fraud as an evidence-backed indicator rather than an accusation.

Tags: accounts-payable, three-way-match, procurement, supply-chain, invoice, autonomous-agent, finance-ops, duplicate-detection, agentaz, agent-governance, trust-level, production-readiness
Stack: Claude, LangGraph, OpenAI
Difficulty: Advanced
Setup: 45 min
Version: 2.0.0 · 2026-06-21

Overview

Three-way match done consistently: invoice vs. purchase order vs. goods receipt, with quantities, prices, and terms checked against tolerance.

Catches the costly exceptions: price variances, over-receipts, duplicate invoices, and invoices with no PO — each flag cites the exact variance.

Approves clean matches within tolerance for payment, and routes mismatches and exceptions to a human.

Defensive: never pays a duplicate, never approves over tolerance or without a matching PO and receipt, and flags fraud as evidence, not accusation.

AgentAz™ specification

A lightweight, design-time governance spec for security review. It documents what this agent is authorized to do — and why — and pairs with whatever policy engine you already run. It does not enforce anything at runtime.

Trust Level: A2 (Recommend)
DNA Pattern: Synthesis (Extract → Synthesize → Verify)
Worst-Case Action: Produces an incorrect match or flags a valid one, surfaced for human review. It cannot approve, post, or pay against a purchase order; execution tools are absent from its registry.
Authority Boundary: Performs a three-way match across purchase order, receipt, and invoice, and flags discrepancies for review. It never approves a match, posts to a ledger, or releases payment. A human in finance decides.
Verification Test: Attempt to call an approve, post, or payment tool → confirm it is absent from the agent's registry.
Production Readiness: 6/6 dimensions passing. Tool isolation: approval/payment tools absent. Human gates: finance decides. Confidence escalation: partial or fuzzy matches flagged. Cost ceiling: bounded per match. Audit trail: matches and discrepancies logged. Escalation path: mismatches routed to AP.
Last Reviewed: 2026-06-24

Machine-readable contract (agentaz.json), validated against the open AgentAz™ JSON Schema — bundled for offline use and published at a permanent URL:

agentaz.json
{
  "$schema": "./agentaz.schema.json",
  "version": "2.0.0",
  "last_reviewed": "2026-06-24",
  "agent_id": "po-matching-agent",
  "trust_level": "A2",
  "dna_pattern": "Synthesis",
  "worst_case_action": "Produces an incorrect match for human review. Cannot approve, post, or pay.",
  "authority_boundary": "Three-way matches PO/receipt/invoice and flags discrepancies; approval/payment tools absent.",
  "tags": [
    "supply-chain",
    "procurement",
    "reconciliation",
    "read-only",
    "human-review"
  ],
  "tool_boundary": {
    "allowed_tools": [
      "read_po",
      "read_receipt",
      "read_invoice",
      "match",
      "flag_discrepancy"
    ],
    "execution_tools_absent": true
  },
  "output_boundary": {
    "format": "structured_json",
    "never_emits": [
      "approve_match",
      "ledger_post",
      "payment"
    ]
  },
  "cost_boundary": {
    "max_usd_per_trace_loop": 0.22,
    "alert_threshold_usd": 0.15
  },
  "loop_boundary": {
    "max_reasoning_turns": 8
  },
  "human_handoff": {
    "triggers": [
      "partial_match",
      "quantity_mismatch",
      "price_mismatch"
    ],
    "destination": "ap_review"
  },
  "audit": {
    "append_only": true,
    "logs": [
      "matches",
      "discrepancies"
    ]
  }
}
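
The verification test above ("attempt to call an approve, post, or payment tool → confirm it is absent") can be approximated as a static check of a tool registry against the contract. The following is a minimal sketch, not part of the shipped blueprint: `registry_violations` and the `EXECUTION_TOOLS` set are hypothetical names, and the execution-tool names are borrowed from the contract's `never_emits` list as an assumption about what such tools might be called in your runtime.

```python
import json

# Subset of the agentaz.json contract shown above.
CONTRACT = json.loads("""
{
  "tool_boundary": {
    "allowed_tools": ["read_po", "read_receipt", "read_invoice", "match", "flag_discrepancy"],
    "execution_tools_absent": true
  }
}
""")

# Assumption: names an approve/post/pay tool might have in your runtime.
EXECUTION_TOOLS = {"approve_match", "ledger_post", "payment"}


def registry_violations(contract: dict, registered_tools: set[str]) -> list[str]:
    """Return human-readable violations; an empty list means the registry fits the contract."""
    boundary = contract["tool_boundary"]
    allowed = set(boundary["allowed_tools"])
    violations = []
    # Anything registered but not explicitly allowed breaks least privilege.
    for tool in sorted(registered_tools - allowed):
        violations.append(f"tool not in allowed_tools: {tool}")
    # Execution tools are called out separately because they define the worst case.
    if boundary["execution_tools_absent"]:
        for tool in sorted(registered_tools & EXECUTION_TOOLS):
            violations.append(f"execution tool present: {tool}")
    return violations
```

Running a check like this in CI against the agent's actual registry would catch an execution tool being added by accident; the spec itself, as noted above, enforces nothing at runtime.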

New to this? Read the AgentAz specification guide — Trust Levels, DNA patterns, and how it complements your runtime.

AgentAz™ is open source under Apache-2.0 — schema (frozen v1.0.0) and source on GitHub.

Governance matrix

A scannable summary of this blueprint's governance coverage, derived from its AgentAz™ specification. It documents the boundaries that already ship — not new functionality.

Agent goal: Bounded by the authority spec above
Trust Level: A2 (Recommend)
Tool access: Least privilege; execution tools absent (read-only)
Context handling: Grounded in provided inputs; cites or flags rather than guessing
Memory strategy: Task-scoped; no persistent cross-session memory
Human approval: Required on partial match, quantity mismatch, price mismatch → AP review
Audit trail: Append-only log (matches, discrepancies)
Cost & loop bounds: ≤ $0.22 per loop · ≤ 8 reasoning turns
Recovery / escalation: Escalates to AP review
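
Because the specification documents the cost and loop bounds but does not enforce them, a runtime has to track them itself. Below is a minimal sketch of such a tracker, assuming the caller can report a dollar cost per reasoning turn. `LoopBudget` and `record_turn` are hypothetical names; the defaults mirror `cost_boundary` and `loop_boundary` in agentaz.json.

```python
from dataclasses import dataclass


@dataclass
class LoopBudget:
    """Tracks spend and turns for one match; defaults mirror agentaz.json."""
    max_usd: float = 0.22     # cost_boundary.max_usd_per_trace_loop
    alert_usd: float = 0.15   # cost_boundary.alert_threshold_usd
    max_turns: int = 8        # loop_boundary.max_reasoning_turns
    spent_usd: float = 0.0
    turns: int = 0

    def record_turn(self, cost_usd: float) -> str:
        """Record a finished turn and say whether another one may start.

        Returns "stop" when a bound is reached (escalate with current findings),
        "alert" once spend reaches the alert threshold, otherwise "ok".
        """
        self.turns += 1
        self.spent_usd += cost_usd
        if self.turns >= self.max_turns or self.spent_usd >= self.max_usd:
            return "stop"
        if self.spent_usd >= self.alert_usd:
            return "alert"
        return "ok"
```

On "stop", the natural behavior under this blueprint would be to hand the partial findings to AP review rather than retry.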

Agent component mapping

A framework-neutral view of how this blueprint maps to standard agent-architecture components (the vocabulary common to ADK-style frameworks). It describes structure for clarity — not an official integration or certified compatibility.

Agent: Primary reasoner with Recommend authority (A2)
Tools: read_po, read_receipt, read_invoice, match, flag_discrepancy; execution tools absent (read-only)
Memory: Task-scoped working context; no persistent cross-session memory
Guardrails: Worst-case classified (A2); no execution tools; ≤ $0.22/loop · ≤ 8 turns
Evaluator: Confidence and authority-boundary checks; low-confidence or out-of-bounds results are flagged, not actioned
Handoff: Escalates to AP review on partial match, quantity mismatch, price mismatch

Failure modes

Specific ways this blueprint can fail, and how it is designed to detect, contain, and recover from each — the boundaries that make it safe to run, stated plainly.

False match — approves a PO and invoice pair that doesn't tie out.

Detection
Match confidence is scored and partial or fuzzy matches are flagged.
Mitigation
It never approves, posts, or pays — a match is a recommendation.
Recovery
AP rejects it and the discrepancy re-opens.

Misses a price or quantity mismatch.

Detection
Line-level price and quantity checks run; the residual must reconcile.
Mitigation
All mismatches are surfaced, never hidden.
Recovery
The mismatch is flagged to AP for resolution.

A partial receipt is matched as complete.

Detection
Received quantity is checked against ordered quantity.
Mitigation
Partials are flagged as partial.
Recovery
The match is held until the receipt completes.

Evaluation

Match precision and mismatch recall together are the core metrics — a false match approves a bad pair, a missed mismatch lets a price or quantity error through.

Match accuracy: Share of proposed PO–invoice–receipt matches that are correct.
Precision: Of matches proposed, the share that truly tie out (false-match resistance).
Mismatch recall: Of genuine price or quantity mismatches, the share it surfaces.
Partial-receipt handling: Share of partial receipts correctly flagged as partial rather than complete.
Latency: Time to match a batch.

Recommended approach. Use a labeled set of three-way-match cases with seeded mismatches and partial receipts; measure precision and mismatch recall. The residual must reconcile, and nothing is approved or paid during evaluation.
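
To make the two core metrics concrete, here is a minimal sketch of how they could be computed over a labeled set. It assumes each case reduces to two booleans, whether the documents truly tie out and whether the agent proposed a match, and it treats "surfaced" as "the agent did not propose a match". The `Case` type and both function names are illustrative, not part of the blueprint.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Case:
    """One labeled evaluation case (field names are illustrative)."""
    truly_ties_out: bool   # ground truth: invoice, PO, and receipt reconcile
    proposed_match: bool   # agent output: it proposed the pair as a match


def precision(cases: list[Case]) -> Optional[float]:
    """Of the matches the agent proposed, the share that truly tie out."""
    proposed = [c for c in cases if c.proposed_match]
    if not proposed:
        return None  # undefined when nothing was proposed
    return sum(c.truly_ties_out for c in proposed) / len(proposed)


def mismatch_recall(cases: list[Case]) -> Optional[float]:
    """Of the genuine mismatches, the share the agent surfaced instead of matching."""
    mismatches = [c for c in cases if not c.truly_ties_out]
    if not mismatches:
        return None  # undefined when the set has no seeded mismatches
    return sum(not c.proposed_match for c in mismatches) / len(mismatches)
```

Reporting both numbers side by side matters because either one can be pushed to 1.0 alone: proposing nothing gives perfect recall of mismatches, and proposing only the most obvious pairs gives perfect precision.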

When to use

Use it when

  • Accounts payable processes a high volume of PO-backed invoices and the matching work is repetitive.
  • You have POs, goods receipts, and invoices the agent can reconcile, plus a tolerance policy.
  • You want consistent, documented three-way matching with a clear approval trail.
  • You want to auto-clear clean matches and surface only the genuine exceptions, duplicates, and variances to humans.

Avoid it when

  • You lack structured PO/receipt/invoice data for the agent to match against.
  • You expect it to approve non-PO or over-tolerance invoices autonomously — those require a human.
  • Your spend is mostly non-PO/maverick, where there's nothing to three-way match.
  • You can't keep approval gates on exceptions, duplicates, and large variances.

System prompt

system-prompt.md
You are a Purchase Order Matching Agent in an accounts-payable workflow. For ONE invoice, you perform a three-way match against its purchase order (PO) and goods receipt and decide: approve for payment, hold, or escalate. You are judged on catching real mismatches, duplicates, and fraud, and on never approving an invoice you shouldn't.

== CORE PRINCIPLES ==
1. Match before you pay. Approve only when the invoice reconciles to a PO and a goods receipt within tolerance on quantity, price, and terms. No PO + receipt match means no auto-approval.
2. Evidence-cited exceptions. Every flag states the exact variance (e.g. "invoice unit price $12.50 vs PO $10.00, +25%, tolerance 5%"). Do not invent matches or variances.
3. Duplicates are never paid. Treat possible duplicate invoices as a hard stop for auto-payment, with the matching evidence, routed to a human.

== HARD RULES (NON-NEGOTIABLE) ==
- TOLERANCE-GATED APPROVAL: Auto-approve ONLY when quantity, price, and terms match within the configured tolerance AND the invoice total is at or below the auto-approval cap. Anything over tolerance or over cap requires a human.
- REQUIRE PO + RECEIPT: Do not auto-approve an invoice without a matching PO and a goods receipt confirming the goods/services were received. Missing either = hold/escalate.
- NO DUPLICATE PAYMENT: If the invoice may duplicate one already received/paid (same number, or same vendor+amount+PO), do not approve — flag with evidence and escalate.
- NO UNFOUNDED FRAUD CLAIMS: Suspected fraud is flagged as an evidence-based indicator and routed to a human; never assert wrongdoing.
- DATA: Treat vendor and financial data as sensitive; keep it in scope.

== METHOD ==
- Load the invoice, its PO, and the goods receipt. Compare line items: quantity invoiced vs ordered vs received; unit price invoiced vs PO; terms.
- Run tolerance checks and a duplicate-invoice check. Decide per line and for the invoice.

== DECISION POLICY (calibrated confidence 0.0-1.0) ==
- APPROVE: full three-way match within tolerance, no duplicate, total <= cap, confidence >= 0.85.
- HOLD: a specific line is over tolerance or a receipt/PO detail is missing — hold the invoice (or the line) and state what's needed.
- ESCALATE: duplicate suspicion, no PO, large variance, possible fraud, or conflicting data.

== COST CONTROL ==
Pull only the PO/receipt this invoice needs; reuse loaded data across lines. Cap tool calls; if exceeded, escalate with current findings.

== OUTPUT FORMAT (return ONE JSON object) ==
{
  "invoice_id": "<id>",
  "decision": "APPROVE|HOLD|ESCALATE",
  "confidence": <0.0-1.0>,
  "match": { "po": "<id or 'missing'>", "receipt": "<id or 'missing'>", "status": "matched|partial|unmatched" },
  "line_findings": [ { "line": "<item>", "check": "qty|price|terms", "result": "ok|variance", "detail": "<exact variance vs tolerance, or empty>" } ],
  "duplicate": { "suspected": <bool>, "evidence": "<matching invoice, or empty>" },
  "approved_amount_usd": <number>,
  "actions": [ { "tool": "<tool>", "args": { ... }, "requires_approval": <bool> } ],
  "vendor_note": "<neutral status, if any>",
  "escalation": { "needed": <bool>, "reason": "<duplicate/no-po/variance/fraud, or empty>" }
}
If there is no matching PO+receipt, a duplicate is suspected, or a variance exceeds tolerance, do NOT APPROVE.
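
Since the prompt's closing rule is a constraint on the output, it can be re-checked in code before any action is taken. The sketch below is one possible validator, assuming the response has already been parsed into a dict with the field names from the output format above. `output_violations` is a hypothetical helper; the 0.85 floor comes from the decision policy.

```python
DECISIONS = {"APPROVE", "HOLD", "ESCALATE"}
MIN_APPROVE_CONFIDENCE = 0.85  # from the APPROVE line of the decision policy


def output_violations(out: dict) -> list[str]:
    """Return rule violations in one agent response; an empty list means it passes."""
    problems = []
    decision = out.get("decision")
    if decision not in DECISIONS:
        problems.append("decision must be APPROVE, HOLD, or ESCALATE")

    confidence = out.get("confidence")
    is_number = isinstance(confidence, (int, float)) and not isinstance(confidence, bool)
    if not is_number or not 0.0 <= confidence <= 1.0:
        problems.append("confidence must be a number between 0.0 and 1.0")
        confidence = None

    if decision == "APPROVE":
        match = out.get("match") or {}
        # No matching PO + receipt means no approval.
        if match.get("status") != "matched" or "missing" in (match.get("po"), match.get("receipt")):
            problems.append("APPROVE without a matched PO and receipt")
        # A suspected duplicate is a hard stop.
        if (out.get("duplicate") or {}).get("suspected"):
            problems.append("APPROVE with a suspected duplicate")
        if confidence is not None and confidence < MIN_APPROVE_CONFIDENCE:
            problems.append("APPROVE below the 0.85 confidence floor")
    return problems
```

A response with any violation would be treated as an escalation rather than retried silently, which keeps the model's mistakes visible to AP.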

Setup guide

Install and connect your ERP/AP system

Install the agent and connect it to the system holding POs, receipts, and invoices.

shell
pipx install po-match-agent
po-match-agent connect --erp netsuite
po-match-agent doctor

Configure tolerance and cap

Tolerance, the PO+receipt requirement, and the cap are enforced deterministically, not by the model.

shell
cp .env.example .env
# then set these values in .env:
ANTHROPIC_API_KEY=sk-ant-...
PRICE_TOLERANCE_PCT=5
QTY_TOLERANCE_PCT=0
AUTO_APPROVE_CAP_USD=10000
REQUIRE_PO_AND_RECEIPT=true
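
"Enforced deterministically, not by the model" implies a plain function sitting between the model's recommendation and any approval. The following is a minimal sketch of such a gate driven by the variables above. `Policy`, `load_policy`, and `gate` are hypothetical names, and the variance inputs are assumed to be absolute percentages already computed per invoice.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Policy:
    price_tolerance_pct: float
    qty_tolerance_pct: float
    auto_approve_cap_usd: float
    require_po_and_receipt: bool


def load_policy(env: Mapping[str, str] = os.environ) -> Policy:
    """Read the tolerance policy from the environment; a missing key raises KeyError."""
    return Policy(
        price_tolerance_pct=float(env["PRICE_TOLERANCE_PCT"]),
        qty_tolerance_pct=float(env["QTY_TOLERANCE_PCT"]),
        auto_approve_cap_usd=float(env["AUTO_APPROVE_CAP_USD"]),
        require_po_and_receipt=env["REQUIRE_PO_AND_RECEIPT"].strip().lower() == "true",
    )


def gate(policy: Policy, *, invoice_total_usd: float, max_price_variance_pct: float,
         max_qty_variance_pct: float, has_po: bool, has_receipt: bool,
         duplicate_suspected: bool) -> list[str]:
    """Return blocking reasons; auto-approval is allowed only when the list is empty."""
    reasons = []
    if policy.require_po_and_receipt and not (has_po and has_receipt):
        reasons.append("missing PO or goods receipt")
    if duplicate_suspected:
        reasons.append("suspected duplicate")
    if max_price_variance_pct > policy.price_tolerance_pct:
        reasons.append(f"price variance {max_price_variance_pct:g}% exceeds "
                       f"{policy.price_tolerance_pct:g}% tolerance")
    if max_qty_variance_pct > policy.qty_tolerance_pct:
        reasons.append(f"quantity variance {max_qty_variance_pct:g}% exceeds "
                       f"{policy.qty_tolerance_pct:g}% tolerance")
    if invoice_total_usd > policy.auto_approve_cap_usd:
        reasons.append(f"total ${invoice_total_usd:g} exceeds cap ${policy.auto_approve_cap_usd:g}")
    return reasons
```

The gate ignores the model's own decision field entirely, so a confident but wrong APPROVE still cannot get past a failed check.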

Define matching & duplicate rules

Provide the match keys and duplicate-detection rule.

shell
# match.yml
match_on: [po_number, line_item, quantity, unit_price]
duplicate_keys: [invoice_number, [vendor, amount, po_number]]
block_payment_on_duplicate: true
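
One way to read `duplicate_keys` is as a list of alternatives: an invoice is a suspected duplicate if it equals a past invoice on the invoice number, or on vendor, amount, and PO number together. The sketch below implements that reading over plain dicts. It is an assumption about the rule's semantics, and `duplicate_evidence` is a hypothetical helper rather than the agent's real check.

```python
# Mirrors duplicate_keys in match.yml: a string is a single-field key,
# a nested list is a composite key whose fields must all agree.
DUPLICATE_KEYS = ["invoice_number", ["vendor", "amount", "po_number"]]


def duplicate_evidence(invoice: dict, history: list[dict]) -> list[dict]:
    """Return past invoices that match on any duplicate key, with the fields that matched."""
    hits = []
    for past in history:
        for key in DUPLICATE_KEYS:
            fields = [key] if isinstance(key, str) else key
            # A missing field never counts as equal, so two invoices that both
            # lack a PO number do not match on that key by accident.
            if all(invoice.get(f) is not None and invoice.get(f) == past.get(f) for f in fields):
                hits.append({"invoice_number": past.get("invoice_number"), "matched_on": fields})
                break  # one key is enough evidence for this past invoice
    return hits
```

Under this strict reading, an invoice with no PO reference can never hit the composite key, so catching a no-PO re-send would need an additional, looser rule such as vendor plus amount.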

Backtest on past invoices

Replay matched/paid invoices to compare the agent's decisions to actual outcomes.

shell
po-match-agent backtest --range 90d --explain
# reports match accuracy + any duplicate or over-tolerance approvals (must be 0)

Wire into AP intake

Route incoming invoices to the agent. Start in assist mode, then enable auto-approval for clean within-cap matches once backtests are clean.

shell
# invoice intake -> POST https://your-host/ap/match (HMAC)
# promote MODE=act for clean three-way matches within the cap
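
The intake comment mentions HMAC but not the scheme. As an illustration only, the sketch below assumes HMAC-SHA256 over the raw request body with a hex-encoded signature; the secret, the signature transport (for example a request header), and both function names are assumptions to be replaced by whatever your intake system actually sends.

```python
import hashlib
import hmac


def sign(secret: bytes, body: bytes) -> str:
    """Hex HMAC-SHA256 of the raw request body (assumed scheme)."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def is_authentic(secret: bytes, body: bytes, received_signature: str) -> bool:
    """Compare signatures in constant time to avoid leaking how many characters matched."""
    expected = sign(secret, body)
    return hmac.compare_digest(expected.encode(), received_signature.encode())
```

Verification has to run on the exact bytes received, before any JSON parsing or re-serialization, because a re-encoded body will not reproduce the sender's signature.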

Architecture

Tools required

get_invoice: Fetch the invoice: vendor, line items, quantities, unit prices, totals, and PO reference.
get_po: Retrieve the referenced purchase order: ordered items, quantities, agreed prices, and terms.
get_receipt: Retrieve the goods receipt confirming what was actually received against the PO.
three_way_match: Compare invoice vs PO vs receipt line by line on quantity, price, and terms.
tolerance_check: Apply the tolerance policy to each variance and determine pass/fail.
duplicate_invoice_check: Check whether this invoice may duplicate one already received or paid (number, or vendor+amount+PO).
approve_for_payment: Approve a clean matched invoice for payment. Hard-gated: rejects over-tolerance, over-cap, unmatched, or duplicate invoices.
escalate_exception: Route duplicates, no-PO invoices, large variances, and possible fraud to an AP reviewer with the evidence.

Workflow

  1. Intake the invoice

    Load the invoice and its PO reference and normalize the line items.

  2. Retrieve PO and receipt

    Pull the purchase order and the goods receipt the invoice must match against.

  3. Run the three-way match

    Compare quantity ordered/received/invoiced, unit price vs PO, and terms for each line.

  4. Apply tolerance & duplicate checks

    Test each variance against tolerance and check for a duplicate invoice in history.

  5. Decide per line and invoice

    Approve clean lines, hold lines over tolerance, and decide the invoice outcome within the cap.

  6. Act through the gate

    Auto-approve clean matches for payment; hold variance lines; escalate duplicates, no-PO, and large variances with evidence.

  7. Record the trail

    Log the match result with cited variances for finance controls and audit.
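
Steps 3 and 4 can be sketched for a single line item as follows. This is an illustrative reading, not the blueprint's `three_way_match` or `tolerance_check` tools: it assumes quantity is checked as invoiced versus received, that price tolerance applies in both directions, and that partial or over-receipts are annotated in the detail text. `Line`, `pct_diff`, and `line_findings` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Line:
    item: str
    qty_ordered: float
    qty_received: float
    qty_invoiced: float
    po_unit_price: float
    invoice_unit_price: float


def pct_diff(actual: float, expected: float) -> float:
    """Signed percentage difference of actual relative to expected."""
    if expected == 0:
        return 0.0 if actual == 0 else float("inf")
    return (actual - expected) / expected * 100


def line_findings(line: Line, price_tol_pct: float, qty_tol_pct: float = 0.0) -> list[dict]:
    """Return one qty finding and one price finding, each citing the exact numbers."""
    # Quantity: what was billed must agree with what was actually received.
    qty_pct = pct_diff(line.qty_invoiced, line.qty_received)
    qty_detail = (f"ordered {line.qty_ordered:g} / received {line.qty_received:g} "
                  f"/ invoiced {line.qty_invoiced:g}")
    if line.qty_received < line.qty_ordered:
        qty_detail += " (partial receipt)"
    elif line.qty_received > line.qty_ordered:
        qty_detail += " (over-receipt)"

    # Price: invoiced unit price against the PO price, in either direction.
    price_pct = pct_diff(line.invoice_unit_price, line.po_unit_price)
    price_over = abs(price_pct) > price_tol_pct
    if price_pct == 0:
        price_detail = f"invoice ${line.invoice_unit_price:g} = PO ${line.po_unit_price:g}"
    else:
        verdict = "exceeds" if price_over else "within"
        price_detail = (f"invoice ${line.invoice_unit_price:g} vs PO ${line.po_unit_price:g} "
                        f"= {price_pct:+.1f}%, {verdict} {price_tol_pct:g}% tolerance")

    return [
        {"line": line.item, "check": "qty",
         "result": "variance" if abs(qty_pct) > qty_tol_pct else "ok", "detail": qty_detail},
        {"line": line.item, "check": "price",
         "result": "variance" if price_over else "ok", "detail": price_detail},
    ]
```

For example, `line_findings(Line("40 units widget-B", 40, 40, 40, 120, 135), price_tol_pct=5)` yields an "ok" quantity finding and a price variance whose detail reads "invoice $135 vs PO $120 = +12.5%, exceeds 5% tolerance".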

Examples

Clean three-way match within tolerance → approve for payment

Invoice matches its PO and receipt on quantity and price within tolerance, under the cap. The routine, payable case.

Input

input
Invoice INV-9001 $4,800: 40 units @ $120, PO-551 (40 @ $120), receipt confirms 40 received. Tolerance 5%, cap $10k.

Output

output
{
  "invoice_id": "INV-9001",
  "decision": "APPROVE",
  "confidence": 0.95,
  "match": { "po": "PO-551", "receipt": "GR-551", "status": "matched" },
  "line_findings": [
    { "line": "40 units widget-A", "check": "qty", "result": "ok", "detail": "ordered 40 / received 40 / invoiced 40" },
    { "line": "40 units widget-A", "check": "price", "result": "ok", "detail": "invoice $120 = PO $120" }
  ],
  "duplicate": { "suspected": false, "evidence": "" },
  "approved_amount_usd": 4800,
  "actions": [ { "tool": "approve_for_payment", "args": { "invoice_id": "INV-9001", "amount_usd": 4800 }, "requires_approval": false } ],
  "vendor_note": "Invoice INV-9001 ($4,800) matched to PO-551 and approved for payment.",
  "escalation": { "needed": false, "reason": "" }
}

Note: Full three-way match: quantity ordered/received/invoiced all agree, price equals the PO, under the cap, no duplicate. It approves for payment and documents the match. This is the volume the kit clears so AP only touches exceptions.

Price variance over tolerance → hold the line, cite the variance

Quantities match, but the invoiced unit price exceeds the PO price beyond tolerance. The agent holds and cites the exact variance.

Input

input
Invoice INV-9044 $5,400: 40 units @ $135, PO-560 (40 @ $120), receipt confirms 40. Tolerance 5%.

Output

output
{
  "invoice_id": "INV-9044",
  "decision": "HOLD",
  "confidence": 0.92,
  "match": { "po": "PO-560", "receipt": "GR-560", "status": "partial" },
  "line_findings": [
    { "line": "40 units widget-B", "check": "qty", "result": "ok", "detail": "ordered 40 / received 40 / invoiced 40" },
    { "line": "40 units widget-B", "check": "price", "result": "variance", "detail": "invoice $135 vs PO $120 = +12.5%, exceeds 5% tolerance" }
  ],
  "duplicate": { "suspected": false, "evidence": "" },
  "approved_amount_usd": 0,
  "actions": [ { "tool": "escalate_exception", "args": { "invoice_id": "INV-9044", "reason": "price_variance", "detail": "+12.5% vs PO" }, "requires_approval": false } ],
  "vendor_note": "Invoice INV-9044 is on hold: the unit price ($135) is 12.5% above the PO price ($120), beyond our 5% tolerance. AP will review.",
  "escalation": { "needed": true, "reason": "Price variance +12.5% exceeds tolerance; needs AP review or a PO amendment." }
}

Note: Quantities and receipt are fine, but the price is 12.5% over the PO against a 5% tolerance — so the agent holds rather than pays, and cites the exact variance and threshold. This is the line-level catch that prevents silent overbilling.

Duplicate invoice / no PO → escalate, never pay

An invoice has no PO reference and looks like a duplicate of one already paid. The agent refuses to approve and escalates with evidence.

Input

input
Invoice INV-9100 $7,200, vendor Acme, no PO referenced. duplicate_invoice_check finds INV-8841 (Acme, $7,200, PO-540) already paid last month.

Output

output
{
  "invoice_id": "INV-9100",
  "decision": "ESCALATE",
  "confidence": 0.8,
  "match": { "po": "missing", "receipt": "missing", "status": "unmatched" },
  "line_findings": [],
  "duplicate": { "suspected": true, "evidence": "Matches INV-8841 (Acme, $7,200, PO-540) already paid 2026-05" },
  "approved_amount_usd": 0,
  "actions": [ { "tool": "escalate_exception", "args": { "invoice_id": "INV-9100", "reason": "duplicate_and_no_po", "evidence": ["INV-8841"] }, "requires_approval": false } ],
  "vendor_note": "Invoice INV-9100 is on hold pending review — it has no PO and appears to match a previously paid invoice. AP will follow up; this may be a re-send.",
  "escalation": { "needed": true, "reason": "No PO and a likely duplicate of paid INV-8841 — must not be auto-paid." }
}

Note: The defining defensive case: no PO to match against and strong duplicate evidence (same vendor and amount as an already-paid invoice). The agent treats it as a hard stop on payment, escalates with the specific matching invoice, and keeps the vendor note neutral, since it may be an honest re-send. Duplicate and non-PO invoices are exactly where automated payment goes wrong.

Implementation notes

  • Enforce tolerance, the PO+receipt requirement, the duplicate block, and the approval cap in a deterministic gate; the model matches, the gate controls what can be paid.
  • Cite the exact variance (values, percentage, and the tolerance) on every flag — a finding without numbers isn't auditable.
  • Make duplicate detection a hard stop on auto-payment; duplicate payments are among the most common and recoverable AP losses.
  • Require a goods receipt, not just a PO: paying for goods that were ordered but never received is a classic leakage point.
  • Treat suspected fraud as an evidence-based indicator routed to a human, with neutral vendor-facing language.
  • Backtest against paid-invoice history with duplicate/over-tolerance approvals as a hard-zero metric before enabling auto-approval.
  • Spend the strong model on variance judgment and the escalation decision — a cheaper model can extract and align line items.

Variations

Basic

Match & flag assistant

Performs the three-way match, applies tolerance, runs duplicate detection, and returns flagged exceptions with cited variances for an AP clerk. No auto-approval.

Advanced

Guarded auto-approval

Auto-approves clean within-tolerance, within-cap matches for payment, holds variance lines, and escalates duplicates, no-PO invoices, and large variances.

Enterprise

Governed AP automation

Adds ERP integration, multi-currency and partial-receipt handling, vendor-level analytics, full audit trails and SLAs, fraud-pattern detection, and tuning from AP outcomes.

Download the Agent Blueprint

The complete blueprint, zipped — including a runnable run.py you can execute with one API key (Anthropic or OpenAI).

Download Blueprint (.zip)
README.md, system-prompt.md, setup-guide.md, tools.json, workflow.md, examples.md, .env.example, kit.json, run.py, LICENSE, NOTICE, starters/

Export

Generate a starter for your stack — all client-side, nothing leaves your browser.

Starters use mock tools — swap in your integrations to deploy.

View the source on GitHub

This blueprint and the AgentAz™ specification live in the central AgentKits registry — open source under Apache-2.0 (code & schema) and CC‑BY‑4.0 (text).

Frequently asked questions