Frequently asked questions

Will it pay invoices automatically?

Only when the invoice fully matches its PO and goods receipt within tolerance, isn't a duplicate, and is within your configured cap. Anything over tolerance, over cap, without a PO/receipt, or possibly duplicate is held or escalated to a human.

How does it catch duplicate invoices?

It checks each invoice against received/paid history on the invoice number and on vendor+amount+PO combinations, and treats a likely match as a hard stop on payment, escalating with the specific matching invoice as evidence.

What happens on a price or quantity variance?

It holds the affected line (or invoice) and cites the exact variance against the tolerance — for example, invoiced price 12.5% over the PO against a 5% tolerance — so AP can review or amend the PO.

Does it require a goods receipt?

Yes, for auto-approval. Matching only the PO would let it pay for goods that were ordered but never received, so it requires the receipt confirming delivery.

Does it accuse vendors of fraud?

No. It surfaces evidence-based indicators (like a duplicate) and routes them to a human, keeping vendor-facing language neutral. It never asserts wrongdoing.

How do we roll it out safely?

Start in assist mode where it only recommends, backtest against paid-invoice history to confirm zero duplicate or over-tolerance approvals, then enable auto-approval for clean within-cap matches.

Purchase Order Matching Agent

Overview

Three-way match done consistently: invoice vs. purchase order vs. goods receipt, with quantities, prices, and terms checked against tolerance.

Catches the costly exceptions: price variances, over-receipts, duplicate invoices, and invoices with no PO — each flag cites the exact variance.

Approves clean matches within tolerance for payment, and routes mismatches and exceptions to a human.

Defensive: never pays a duplicate, never approves over tolerance or without a matching PO and receipt, and flags fraud as evidence, not accusation.

AgentAz™ specification

A lightweight, design-time governance spec for security review. It documents what this agent is authorized to do — and why — and pairs with whatever policy engine you already run. It does not enforce anything at runtime.

Trust Level: A2 — Recommend

DNA Pattern: Synthesis (Extract → Synthesize → Verify)

Worst-Case Action: Produces an incorrect match or flags a valid one, surfaced for human review. It cannot approve, post, or pay against a purchase order — execution tools are absent from its registry.

Authority Boundary: Performs a three-way match across purchase order, receipt, and invoice, and flags discrepancies for review. It never approves a match, posts to a ledger, or releases payment. A human in finance decides.

Verification Test: Attempt to call an approve, post, or payment tool → confirm it is absent from the agent's registry.

Production Readiness: 6/6 dimensions passing. Tool isolation: approval/payment tools absent. Human gates: finance decides. Confidence escalation: partial or fuzzy matches flagged. Cost ceiling: bounded per match. Audit trail: matches and discrepancies logged. Escalation path: mismatches routed to AP.

Last Reviewed: 2026-06-24

Machine-readable contract (agentaz.json), validated against the open AgentAz™ JSON Schema — bundled for offline use and published at a permanent URL:

agentaz.json

{
  "$schema": "./agentaz.schema.json",
  "version": "2.0.0",
  "last_reviewed": "2026-06-24",
  "agent_id": "po-matching-agent",
  "trust_level": "A2",
  "dna_pattern": "Synthesis",
  "worst_case_action": "Produces an incorrect match for human review. Cannot approve, post, or pay.",
  "authority_boundary": "Three-way matches PO/receipt/invoice and flags discrepancies; approval/payment tools absent.",
  "tags": [
    "supply-chain",
    "procurement",
    "reconciliation",
    "read-only",
    "human-review"
  ],
  "tool_boundary": {
    "allowed_tools": [
      "read_po",
      "read_receipt",
      "read_invoice",
      "match",
      "flag_discrepancy"
    ],
    "execution_tools_absent": true
  },
  "output_boundary": {
    "format": "structured_json",
    "never_emits": [
      "approve_match",
      "ledger_post",
      "payment"
    ]
  },
  "cost_boundary": {
    "max_usd_per_trace_loop": 0.22,
    "alert_threshold_usd": 0.15
  },
  "loop_boundary": {
    "max_reasoning_turns": 8
  },
  "human_handoff": {
    "triggers": [
      "partial_match",
      "quantity_mismatch",
      "price_mismatch"
    ],
    "destination": "ap_review"
  },
  "audit": {
    "append_only": true,
    "logs": [
      "matches",
      "discrepancies"
    ]
  }
}
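
The Verification Test above can be approximated offline against the contract itself. What follows is a minimal sketch, assuming the agent's registry is the `allowed_tools` list and that execution tools can be recognized by name fragments. `EXECUTION_MARKERS`, `execution_tools_in`, and `verify_tool_boundary` are hypothetical names for illustration, not part of the AgentAz schema.

```python
import json

# Excerpt of the contract above; only the fields this check reads.
CONTRACT = json.loads("""
{
  "tool_boundary": {
    "allowed_tools": ["read_po", "read_receipt", "read_invoice", "match", "flag_discrepancy"],
    "execution_tools_absent": true
  }
}
""")

# Assumption: execution-capable tools contain one of these name fragments.
EXECUTION_MARKERS = ("approve", "post", "pay")


def execution_tools_in(registry: list[str]) -> list[str]:
    """Return registry entries whose names look like execution tools."""
    return [t for t in registry if any(m in t.lower() for m in EXECUTION_MARKERS)]


def verify_tool_boundary(contract: dict) -> bool:
    """True when the contract declares execution tools absent and none are listed."""
    boundary = contract["tool_boundary"]
    offenders = execution_tools_in(boundary["allowed_tools"])
    return bool(boundary["execution_tools_absent"]) and not offenders


if __name__ == "__main__":
    print("tool boundary holds:", verify_tool_boundary(CONTRACT))
```

A name-based heuristic is only a first line of defense; a stricter check would compare the registry against an explicit allowlist.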

New to this? Read the AgentAz specification guide — Trust Levels, DNA patterns, and how it complements your runtime.

AgentAz™ is open source under Apache-2.0 — schema (frozen v1.0.0) and source on GitHub.

Governance matrix

A scannable summary of this blueprint's governance coverage, derived from its AgentAz™ specification. It documents the boundaries that already ship — not new functionality.

Agent goal	Bounded by the authority spec above
Trust Level	A2 — Recommend
Tool access	Least privilege — execution tools absent (read-only)
Context handling	Grounded in provided inputs; cites or flags rather than guessing
Memory strategy	Task-scoped; no persistent cross-session memory
Human approval	Required on partial match, quantity mismatch, price mismatch → AP review
Audit trail	Append-only log (matches, discrepancies)
Cost & loop bounds	≤ $0.22 per loop · ≤ 8 reasoning turns
Recovery / escalation	Escalates to AP review
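
The cost and loop bounds in the matrix could be tracked per trace loop with a small counter. This is a sketch under stated assumptions: the limits mirror `cost_boundary` and `loop_boundary` in agentaz.json, while the `LoopBudget` class and its `ok`/`alert`/`escalate` return values are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class LoopBudget:
    """Tracks spend and turns for one trace loop against the contract limits."""
    max_usd: float = 0.22    # cost_boundary.max_usd_per_trace_loop
    alert_usd: float = 0.15  # cost_boundary.alert_threshold_usd
    max_turns: int = 8       # loop_boundary.max_reasoning_turns
    spent_usd: float = 0.0
    turns: int = 0

    def record_turn(self, cost_usd: float) -> str:
        """Record one reasoning turn and report the budget state."""
        self.turns += 1
        self.spent_usd += cost_usd
        if self.spent_usd > self.max_usd or self.turns > self.max_turns:
            return "escalate"  # assumption: hand off with current findings
        if self.spent_usd >= self.alert_usd:
            return "alert"
        return "ok"


if __name__ == "__main__":
    budget = LoopBudget()
    for cost in (0.05, 0.12, 0.10):
        print(budget.record_turn(cost))
```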

Agent component mapping

A framework-neutral view of how this blueprint maps to standard agent-architecture components (the vocabulary common to ADK-style frameworks). It describes structure for clarity — not an official integration or certified compatibility.

Agent	Primary reasoner — Recommend authority (A2)
Tools	read_po, read_receipt, read_invoice, match, flag_discrepancy — execution tools absent (read-only)
Memory	Task-scoped working context; no persistent cross-session memory
Guardrails	Worst-case classified (A2); no execution tools; ≤ $0.22/loop · ≤ 8 turns
Evaluator	Confidence and authority-boundary checks; low-confidence or out-of-bounds results are flagged, not actioned
Handoff	Escalates to AP review on partial match, quantity mismatch, price mismatch

Failure modes

Specific ways this blueprint can fail, and how it is designed to detect, contain, and recover from each — the boundaries that make it safe to run, stated plainly.

False match — approves a PO and invoice pair that doesn't tie out.

Detection: Match confidence is scored and partial or fuzzy matches are flagged.
Mitigation: It never approves, posts, or pays — a match is a recommendation.
Recovery: AP rejects it and the discrepancy re-opens.

Misses a price or quantity mismatch.

Detection: Line-level price and quantity checks run; the residual must reconcile.
Mitigation: All mismatches are surfaced, never hidden.
Recovery: The mismatch is flagged to AP for resolution.

A partial receipt is matched as complete.

Detection: Received quantity is checked against ordered quantity.
Mitigation: Partials are flagged as partial.
Recovery: The match is held until the receipt completes.
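
The partial-receipt detection described above comes down to comparing received against ordered quantity per line. A minimal sketch follows; the function name and the status labels (`unreceived`, `partial`, `over_receipt`, `complete`) are assumptions for illustration, not identifiers from the blueprint.

```python
def receipt_status(ordered_qty: int, received_qty: int) -> str:
    """Classify a receipt line by comparing received against ordered quantity."""
    if received_qty <= 0:
        return "unreceived"
    if received_qty < ordered_qty:
        return "partial"       # hold the match until the receipt completes
    if received_qty > ordered_qty:
        return "over_receipt"  # surfaced for review, never silently accepted
    return "complete"


if __name__ == "__main__":
    print(receipt_status(ordered_qty=40, received_qty=25))
```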

Evaluation

Match precision and mismatch recall together are the core metrics — a false match approves a bad pair, a missed mismatch lets a price or quantity error through.

Match accuracy	Share of proposed PO–invoice–receipt matches that are correct.
Precision	Of matches proposed, the share that truly tie out — false-match resistance.
Mismatch recall	Of genuine price or quantity mismatches, the share it surfaces.
Partial-receipt handling	Share of partial receipts correctly flagged as partial rather than complete.
Latency	Time to match a batch.

Recommended approach. Use a labeled set of three-way-match cases with seeded mismatches and partial receipts; measure precision and mismatch recall. The residual must reconcile, and nothing is approved or paid during evaluation.
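
The two core metrics can be computed from a labeled set with plain set arithmetic. The sketch below assumes each case is identified by a string id; the function names and the sample ids are hypothetical.

```python
def precision(proposed: set[str], correct: set[str]) -> float:
    """Of matches proposed, the share that truly tie out."""
    return len(proposed & correct) / len(proposed) if proposed else 0.0


def mismatch_recall(flagged: set[str], seeded: set[str]) -> float:
    """Of genuine (seeded) mismatches, the share the agent surfaced."""
    return len(flagged & seeded) / len(seeded) if seeded else 1.0


if __name__ == "__main__":
    proposed = {"case-1", "case-2", "case-3", "case-4"}
    correct = {"case-1", "case-2", "case-3"}
    flagged = {"case-7", "case-8"}
    seeded = {"case-7", "case-8", "case-9", "case-10"}
    print("precision:", precision(proposed, correct))
    print("mismatch recall:", mismatch_recall(flagged, seeded))
```

Reporting the two numbers side by side matters because tightening one tends to loosen the other.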

When to use

Use it when

Accounts payable processes a high volume of PO-backed invoices and the matching work is repetitive.
You have POs, goods receipts, and invoices the agent can reconcile, plus a tolerance policy.
You want consistent, documented three-way matching with a clear approval trail.
You want to auto-clear clean matches and surface only the genuine exceptions, duplicates, and variances to humans.

Avoid it when

You lack structured PO/receipt/invoice data for the agent to match against.
You expect it to approve non-PO or over-tolerance invoices autonomously — those require a human.
Your spend is mostly non-PO/maverick, where there's nothing to three-way match.
You can't keep approval gates on exceptions, duplicates, and large variances.

System prompt

system-prompt.md

You are a Purchase Order Matching Agent in an accounts-payable workflow. For ONE invoice, you perform a three-way match against its purchase order (PO) and goods receipt and decide: approve for payment, hold, or escalate. You are judged on catching real mismatches, duplicates, and fraud, and on never approving an invoice you shouldn't.

== CORE PRINCIPLES ==
1. Match before you pay. Approve only when the invoice reconciles to a PO and a goods receipt within tolerance on quantity, price, and terms. No PO + receipt match means no auto-approval.
2. Evidence-cited exceptions. Every flag states the exact variance (e.g. "invoice unit price $12.50 vs PO $10.00, +25%, tolerance 5%"). Do not invent matches or variances.
3. Duplicates are never paid. Treat possible duplicate invoices as a hard stop for auto-payment, with the matching evidence, routed to a human.

== HARD RULES (NON-NEGOTIABLE) ==
- TOLERANCE-GATED APPROVAL: Auto-approve ONLY when quantity, price, and terms match within the configured tolerance AND the invoice total is at or below the auto-approval cap. Anything over tolerance or over cap requires a human.
- REQUIRE PO + RECEIPT: Do not auto-approve an invoice without a matching PO and a goods receipt confirming the goods/services were received. Missing either = hold/escalate.
- NO DUPLICATE PAYMENT: If the invoice may duplicate one already received/paid (same number, or same vendor+amount+PO), do not approve — flag with evidence and escalate.
- NO UNFOUNDED FRAUD CLAIMS: Suspected fraud is flagged as an evidence-based indicator and routed to a human; never assert wrongdoing.
- DATA: Treat vendor and financial data as sensitive; keep it in scope.

== METHOD ==
- Load the invoice, its PO, and the goods receipt. Compare line items: quantity invoiced vs ordered vs received; unit price invoiced vs PO; terms.
- Run tolerance checks and a duplicate-invoice check. Decide per line and for the invoice.

== DECISION POLICY (calibrated confidence 0.0-1.0) ==
- APPROVE: full three-way match within tolerance, no duplicate, total <= cap, confidence >= 0.85.
- HOLD: a specific line is over tolerance or a receipt/PO detail is missing — hold the invoice (or the line) and state what's needed.
- ESCALATE: duplicate suspicion, no PO, large variance, possible fraud, or conflicting data.

== COST CONTROL ==
Pull only the PO/receipt this invoice needs; reuse loaded data across lines. Cap tool calls; if exceeded, escalate with current findings.

== OUTPUT FORMAT (return ONE JSON object) ==
{
  "invoice_id": "<id>",
  "decision": "APPROVE|HOLD|ESCALATE",
  "confidence": <0.0-1.0>,
  "match": { "po": "<id or 'missing'>", "receipt": "<id or 'missing'>", "status": "matched|partial|unmatched" },
  "line_findings": [ { "line": "<item>", "check": "qty|price|terms", "result": "ok|variance", "detail": "<exact variance vs tolerance, or empty>" } ],
  "duplicate": { "suspected": <bool>, "evidence": "<matching invoice, or empty>" },
  "approved_amount_usd": <number>,
  "actions": [ { "tool": "<tool>", "args": { ... }, "requires_approval": <bool> } ],
  "vendor_note": "<neutral status, if any>",
  "escalation": { "needed": <bool>, "reason": "<duplicate/no-po/variance/fraud, or empty>" }
}
If there is no matching PO+receipt, a duplicate is suspected, or a variance exceeds tolerance, do NOT APPROVE.
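
The final rule of the prompt can also be checked outside the model, on the returned JSON object. The sketch below is one possible validator, assuming a line finding with `"result": "variance"` means the variance exceeds tolerance and using the 0.85 confidence floor from the decision policy. `approval_violations` is a hypothetical helper name.

```python
def approval_violations(result: dict) -> list[str]:
    """List the reasons an APPROVE decision contradicts the prompt's rules."""
    if result.get("decision") != "APPROVE":
        return []
    problems = []
    match = result.get("match", {})
    if match.get("po") == "missing" or match.get("receipt") == "missing":
        problems.append("missing PO or receipt")
    if match.get("status") != "matched":
        problems.append("match status is not 'matched'")
    if result.get("duplicate", {}).get("suspected"):
        problems.append("duplicate suspected")
    if any(f.get("result") == "variance" for f in result.get("line_findings", [])):
        problems.append("line variance present")
    if result.get("confidence", 0.0) < 0.85:
        problems.append("confidence below 0.85")
    return problems


if __name__ == "__main__":
    suspicious = {
        "decision": "APPROVE",
        "confidence": 0.9,
        "match": {"po": "missing", "receipt": "missing", "status": "unmatched"},
        "line_findings": [],
        "duplicate": {"suspected": True},
    }
    print(approval_violations(suspicious))
```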

Setup guide

Install and connect your ERP/AP system

Install the agent and connect it to the system holding POs, receipts, and invoices.

shell

pipx install po-match-agent
po-match-agent connect --erp netsuite
po-match-agent doctor

Configure tolerance and cap

Tolerance, the PO+receipt requirement, and the cap are enforced deterministically, not by the model.

shell

cp .env.example .env
# then set these values in .env:
ANTHROPIC_API_KEY=sk-ant-...
PRICE_TOLERANCE_PCT=5
QTY_TOLERANCE_PCT=0
AUTO_APPROVE_CAP_USD=10000
REQUIRE_PO_AND_RECEIPT=true
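
To illustrate what "enforced deterministically" could look like, here is a sketch of a gate that reads the settings above and returns the reasons auto-approval is blocked. It is not the kit's actual implementation: the invoice dictionary shape, the function names, the fallback defaults, and the choice to treat tolerance as an absolute percentage difference are all assumptions.

```python
import os
from decimal import Decimal


def load_policy(env=None) -> dict:
    """Read the gate settings from environment-style key/value pairs."""
    env = os.environ if env is None else env
    return {
        "price_tol_pct": Decimal(env.get("PRICE_TOLERANCE_PCT", "5")),
        "qty_tol_pct": Decimal(env.get("QTY_TOLERANCE_PCT", "0")),
        "cap_usd": Decimal(env.get("AUTO_APPROVE_CAP_USD", "10000")),
        "require_docs": env.get("REQUIRE_PO_AND_RECEIPT", "true").lower() == "true",
    }


def pct_diff(actual: Decimal, expected: Decimal) -> Decimal:
    """Absolute difference of actual from expected, as a percentage."""
    if expected == 0:
        return Decimal(0) if actual == 0 else Decimal("Infinity")
    return abs(actual - expected) / expected * 100


def gate(invoice: dict, policy: dict) -> list[str]:
    """Return the reasons auto-approval is blocked; an empty list means pass."""
    reasons = []
    if policy["require_docs"] and not (invoice["has_po"] and invoice["has_receipt"]):
        reasons.append("missing PO or receipt")
    if invoice["duplicate_suspected"]:
        reasons.append("possible duplicate")
    if invoice["total_usd"] > policy["cap_usd"]:
        reasons.append("over auto-approval cap")
    for line in invoice["lines"]:
        if pct_diff(line["invoice_price"], line["po_price"]) > policy["price_tol_pct"]:
            reasons.append(f"price over tolerance: {line['item']}")
        if pct_diff(line["invoiced_qty"], line["received_qty"]) > policy["qty_tol_pct"]:
            reasons.append(f"quantity over tolerance: {line['item']}")
    return reasons
```

Because the gate takes plain values and returns plain reasons, it can be unit-tested without calling the model.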

Define matching & duplicate rules

Provide the match keys and duplicate-detection rule.

yaml

# match.yml
match_on: [po_number, line_item, quantity, unit_price]
duplicate_keys: [invoice_number, [vendor, amount, po_number]]
block_payment_on_duplicate: true
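
The two `duplicate_keys` entries above translate into two lookups against invoice history: one on the invoice number, one on the vendor, amount, and PO combination. The sketch below assumes history is a list of dictionaries and normalizes case and whitespace before comparing; the field names and the `find_duplicates` helper are hypothetical.

```python
from decimal import Decimal


def number_key(inv: dict) -> str:
    """Normalized invoice number (assumption: case and spacing are not significant)."""
    return inv["invoice_number"].strip().upper()


def combo_key(inv: dict) -> tuple:
    """Normalized (vendor, amount, po_number) key."""
    return (inv["vendor"].strip().lower(), Decimal(str(inv["amount"])), inv.get("po_number"))


def find_duplicates(invoice: dict, history: list[dict]) -> list[str]:
    """Return invoice numbers in history that collide on either duplicate key."""
    hits = []
    for past in history:
        same_number = number_key(past) == number_key(invoice)
        # The composite key only applies when the incoming invoice carries a PO.
        same_combo = invoice.get("po_number") is not None and combo_key(past) == combo_key(invoice)
        if same_number or same_combo:
            hits.append(past["invoice_number"])
    return hits


if __name__ == "__main__":
    history = [{"invoice_number": "INV-8841", "vendor": "Acme", "amount": 7200, "po_number": "PO-540"}]
    resend = {"invoice_number": "INV-9100", "vendor": "Acme", "amount": "7200.00", "po_number": "PO-540"}
    print(find_duplicates(resend, history))
```

Any hit would block auto-payment and be passed along as evidence for the reviewer.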

Backtest on past invoices

Replay matched/paid invoices to compare the agent's decisions to actual outcomes.

shell

po-match-agent backtest --range 90d --explain
# reports match accuracy + any duplicate or over-tolerance approvals (must be 0)
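
The "must be 0" condition can be expressed as a check over replayed decisions. This sketch does not reproduce the CLI's report format, which the source does not show; the record fields (`agent_decision`, `actual_outcome`, `known_duplicate`, `over_tolerance`) and the `backtest_summary` helper are hypothetical.

```python
def backtest_summary(replayed: list[dict]) -> dict:
    """Summarize a replay: agreement with history plus the hard-zero violations."""
    violations = [
        r["invoice_id"]
        for r in replayed
        if r["agent_decision"] == "APPROVE" and (r["known_duplicate"] or r["over_tolerance"])
    ]
    agreed = sum(1 for r in replayed if r["agent_decision"] == r["actual_outcome"])
    return {
        "agreement_rate": agreed / len(replayed) if replayed else 0.0,
        "hard_zero_violations": violations,
        # Promote only when something was replayed and nothing violated the rule.
        "safe_to_promote": bool(replayed) and not violations,
    }


if __name__ == "__main__":
    sample = [
        {"invoice_id": "INV-1", "agent_decision": "APPROVE", "actual_outcome": "APPROVE",
         "known_duplicate": False, "over_tolerance": False},
        {"invoice_id": "INV-2", "agent_decision": "HOLD", "actual_outcome": "HOLD",
         "known_duplicate": False, "over_tolerance": True},
    ]
    print(backtest_summary(sample))
```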

Wire into AP intake

Route incoming invoices to the agent. Start in assist mode, then enable auto-approval for clean within-cap matches once backtests are clean.

shell

# invoice intake -> POST https://your-host/ap/match (HMAC)
# promote MODE=act for clean three-way matches within the cap
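
The intake endpoint is marked as HMAC-protected. As a sketch of how a receiver might verify such a request, the code below signs the raw body with a shared secret. The source does not specify the hash or the encoding, so SHA-256 with a hex digest is an assumption, as are the function names.

```python
import hashlib
import hmac


def sign(body: bytes, secret: bytes) -> str:
    """Hex HMAC-SHA256 of the raw request body (assumed scheme)."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify(body: bytes, secret: bytes, signature_hex: str) -> bool:
    """Constant-time comparison of the expected and received signatures."""
    return hmac.compare_digest(sign(body, secret), signature_hex)


if __name__ == "__main__":
    secret = b"replace-with-shared-secret"
    body = b'{"invoice_id": "INV-9001"}'
    signature = sign(body, secret)
    print("valid:", verify(body, secret, signature))
    print("tampered:", verify(body + b" ", secret, signature))
```

Using `hmac.compare_digest` rather than `==` avoids leaking timing information about how much of the signature matched.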

Architecture

Invoice intake: Receives the invoice (vendor, lines, amounts, PO reference) and normalizes it for matching.

PO & receipt retrieval: Pulls the referenced purchase order and the goods receipt that confirms what was actually received — the two documents the invoice is matched against.

Three-way match engine: Compares quantity ordered vs received vs invoiced, unit price invoiced vs PO, and terms, line by line.

Tolerance & duplicate checks: Applies the tolerance policy to each variance and runs a duplicate-invoice check against received/paid history.

Approval gate: A deterministic gate enforces tolerance, the PO+receipt requirement, the duplicate block, and the auto-approval cap; only clean matches pass.

Payment routing & holds: Approves clean matched invoices for payment within limits, holds specific variance lines, and routes duplicates/exceptions to AP with the evidence.

Audit trail: Logs every match decision with the cited variances and outcome for finance controls and audit.

Tools required

get_invoice: Fetch the invoice: vendor, line items, quantities, unit prices, totals, and PO reference.

get_po: Retrieve the referenced purchase order: ordered items, quantities, agreed prices, and terms.

get_receipt: Retrieve the goods receipt confirming what was actually received against the PO.

three_way_match: Compare invoice vs PO vs receipt line by line on quantity, price, and terms.

tolerance_check: Apply the tolerance policy to each variance and determine pass/fail.

duplicate_invoice_check: Check whether this invoice may duplicate one already received or paid (number, or vendor+amount+PO).

approve_for_payment: Approve a clean matched invoice for payment. Hard-gated: rejects over-tolerance, over-cap, unmatched, or duplicate invoices.

escalate_exception: Route duplicates, no-PO invoices, large variances, and possible fraud to an AP reviewer with the evidence.

Workflow

1. Intake the invoice
Load the invoice and its PO reference and normalize the line items.
2. Retrieve PO and receipt
Pull the purchase order and the goods receipt the invoice must match against.
3. Run the three-way match
Compare quantity ordered/received/invoiced, unit price vs PO, and terms for each line.
4. Apply tolerance & duplicate checks
Test each variance against tolerance and check for a duplicate invoice in history.
5. Decide per line and invoice
Approve clean lines, hold lines over tolerance, and decide the invoice outcome within the cap.
6. Act through the gate
Auto-approve clean matches for payment; hold variance lines; escalate duplicates, no-PO, and large variances with evidence.
7. Record the trail
Log the match result with cited variances for finance controls and audit.
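
Steps 5 and 6 can be read as a small decision function over the results of the earlier steps. The sketch below is one interpretation, not the kit's logic: it omits the confidence threshold and the "large variance" case (whose cutoff the source does not define), and it assumes over-cap invoices are escalated rather than held.

```python
def decide(has_po: bool, has_receipt: bool, duplicate_suspected: bool,
           line_findings: list[dict], total_usd: float, cap_usd: float) -> str:
    """Map check results to APPROVE, HOLD, or ESCALATE."""
    if duplicate_suspected or not has_po:
        return "ESCALATE"  # hard stops: never auto-paid
    if not has_receipt:
        return "HOLD"      # wait for delivery confirmation
    if any(f["result"] == "variance" for f in line_findings):
        return "HOLD"      # over-tolerance line: AP reviews or the PO is amended
    if total_usd > cap_usd:
        return "ESCALATE"  # assumption: over-cap goes to a human reviewer
    return "APPROVE"


if __name__ == "__main__":
    clean = [{"check": "qty", "result": "ok"}, {"check": "price", "result": "ok"}]
    print(decide(True, True, False, clean, total_usd=4800, cap_usd=10000))
```

Checking the hard stops first means a duplicate or missing PO can never be masked by an otherwise clean match.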

Examples

Clean three-way match within tolerance → approve for payment

Invoice matches its PO and receipt on quantity and price within tolerance, under the cap. The routine, payable case.

Input

input

Invoice INV-9001 $4,800: 40 units @ $120, PO-551 (40 @ $120), receipt confirms 40 received. Tolerance 5%, cap $10k.

Output

output

{
  "invoice_id": "INV-9001",
  "decision": "APPROVE",
  "confidence": 0.95,
  "match": { "po": "PO-551", "receipt": "GR-551", "status": "matched" },
  "line_findings": [
    { "line": "40 units widget-A", "check": "qty", "result": "ok", "detail": "ordered 40 / received 40 / invoiced 40" },
    { "line": "40 units widget-A", "check": "price", "result": "ok", "detail": "invoice $120 = PO $120" }
  ],
  "duplicate": { "suspected": false, "evidence": "" },
  "approved_amount_usd": 4800,
  "actions": [ { "tool": "approve_for_payment", "args": { "invoice_id": "INV-9001", "amount_usd": 4800 }, "requires_approval": false } ],
  "vendor_note": "Invoice INV-9001 ($4,800) matched to PO-551 and approved for payment.",
  "escalation": { "needed": false, "reason": "" }
}

Note: Full three-way match: quantity ordered/received/invoiced all agree, price equals the PO, under the cap, no duplicate. It approves for payment and documents the match. This is the volume the kit clears so AP only touches exceptions.

Price variance over tolerance → hold the line, cite the variance

Quantities match, but the invoiced unit price exceeds the PO price beyond tolerance. The agent holds and cites the exact variance.

Input

input

Invoice INV-9044 $5,400: 40 units @ $135, PO-560 (40 @ $120), receipt confirms 40. Tolerance 5%.

Output

output

{
  "invoice_id": "INV-9044",
  "decision": "HOLD",
  "confidence": 0.92,
  "match": { "po": "PO-560", "receipt": "GR-560", "status": "partial" },
  "line_findings": [
    { "line": "40 units widget-B", "check": "qty", "result": "ok", "detail": "ordered 40 / received 40 / invoiced 40" },
    { "line": "40 units widget-B", "check": "price", "result": "variance", "detail": "invoice $135 vs PO $120 = +12.5%, exceeds 5% tolerance" }
  ],
  "duplicate": { "suspected": false, "evidence": "" },
  "approved_amount_usd": 0,
  "actions": [ { "tool": "escalate_exception", "args": { "invoice_id": "INV-9044", "reason": "price_variance", "detail": "+12.5% vs PO" }, "requires_approval": false } ],
  "vendor_note": "Invoice INV-9044 is on hold: the unit price ($135) is 12.5% above the PO price ($120), beyond our 5% tolerance. AP will review.",
  "escalation": { "needed": true, "reason": "Price variance +12.5% exceeds tolerance — needs AP review or a PO amendment." }
}

Note: Quantities and receipt are fine, but the price is 12.5% over the PO against a 5% tolerance — so the agent holds rather than pays, and cites the exact variance and threshold. This is the line-level catch that prevents silent overbilling.
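
The cited variance in this example is simple arithmetic: (135 − 120) / 120 × 100 = 12.5%. A sketch of a helper that produces such a finding is shown below, using `Decimal` to avoid floating-point noise in currency math. The `price_finding` name and the wording of the within-tolerance detail are assumptions; the over-tolerance wording follows the example output above.

```python
from decimal import Decimal


def price_finding(line: str, invoice_price: Decimal, po_price: Decimal,
                  tolerance_pct: Decimal) -> dict:
    """Build a price line finding that cites the values, percentage, and tolerance."""
    pct = (invoice_price - po_price) / po_price * 100
    within = abs(pct) <= tolerance_pct
    verdict = "within" if within else "exceeds"
    return {
        "line": line,
        "check": "price",
        "result": "ok" if within else "variance",
        "detail": f"invoice ${invoice_price} vs PO ${po_price} = {pct:+.1f}%, "
                  f"{verdict} {tolerance_pct}% tolerance",
    }


if __name__ == "__main__":
    finding = price_finding("40 units widget-B", Decimal("135"), Decimal("120"), Decimal("5"))
    print(finding["detail"])
```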

Duplicate invoice / no PO → escalate, never pay

An invoice has no PO reference and looks like a duplicate of one already paid. The agent refuses to approve and escalates with evidence.

Input

input

Invoice INV-9100 $7,200, vendor Acme, no PO referenced. duplicate_invoice_check finds INV-8841 (Acme, $7,200, PO-540) already paid last month.

Output

output

{
  "invoice_id": "INV-9100",
  "decision": "ESCALATE",
  "confidence": 0.8,
  "match": { "po": "missing", "receipt": "missing", "status": "unmatched" },
  "line_findings": [],
  "duplicate": { "suspected": true, "evidence": "Matches INV-8841 (Acme, $7,200, PO-540) already paid 2026-05" },
  "approved_amount_usd": 0,
  "actions": [ { "tool": "escalate_exception", "args": { "invoice_id": "INV-9100", "reason": "duplicate_and_no_po", "evidence": ["INV-8841"] }, "requires_approval": false } ],
  "vendor_note": "Invoice INV-9100 is on hold pending review — it has no PO and appears to match a previously paid invoice. AP will follow up; this may be a re-send.",
  "escalation": { "needed": true, "reason": "No PO and a likely duplicate of paid INV-8841 — must not be auto-paid." }
}

Note: The defining defensive case: no PO to match against and strong duplicate evidence (same vendor and amount as an already-paid invoice). The agent treats it as a hard stop on payment, escalates with the specific matching invoice, and keeps the vendor note neutral, since it may be an honest re-send. Duplicate and non-PO invoices are exactly where automated payment goes wrong.

Implementation notes

Enforce tolerance, the PO+receipt requirement, the duplicate block, and the approval cap in a deterministic gate; the model matches, the gate controls what can be paid.
Cite the exact variance (values, percentage, and the tolerance) on every flag — a finding without numbers isn't auditable.
Make duplicate detection a hard stop on auto-payment; duplicate payments are among the most common and recoverable AP losses.
Require a goods receipt, not just a PO: paying for goods that were ordered but never received is a classic leakage point.
Treat suspected fraud as an evidence-based indicator routed to a human, with neutral vendor-facing language.
Backtest against paid-invoice history with duplicate/over-tolerance approvals as a hard-zero metric before enabling auto-approval.
Spend the strong model on variance judgment and the escalation decision — a cheaper model can extract and align line items.

Variations

Basic

Match & flag assistant

Performs the three-way match, applies tolerance, runs duplicate detection, and returns flagged exceptions with cited variances for an AP clerk. No auto-approval.

Advanced

Guarded auto-approval

Auto-approves clean within-tolerance, within-cap matches for payment, holds variance lines, and escalates duplicates, no-PO invoices, and large variances.

Enterprise

Governed AP automation

Adds ERP integration, multi-currency and partial-receipt handling, vendor-level analytics, full audit trails and SLAs, fraud-pattern detection, and tuning from AP outcomes.

Download the Agent Blueprint

The complete blueprint, zipped — including a runnable run.py you can execute with one API key (Anthropic or OpenAI).

Contents: README.md, system-prompt.md, setup-guide.md, tools.json, workflow.md, examples.md, .env.example, kit.json, run.py, LICENSE, NOTICE, starters/

Starters use mock tools — swap in your integrations to deploy.

View the source on GitHub

This blueprint and the AgentAz™ specification live in the central AgentKits registry — open source under Apache-2.0 (code & schema) and CC‑BY‑4.0 (text).
