AgentKits

Product Catalog Enrichment Agent

Production Blueprint

Includes Agent Blueprint + Implementation Guide

An agent that turns thin product data into rich, consistent catalog entries (readable descriptions, structured attributes, clean categorization, and SEO fields) built strictly from your source data. It is built defensively: it enriches only from the product information you provide; never fabricates specs, dimensions, materials, or compatibility, safety, or other claims; flags missing or uncertain attributes for review rather than guessing them; and keeps your taxonomy consistent.

Tags: ecommerce, catalog, product-data, enrichment, pim, autonomous-agent, merchandising, seo, agentaz, agent-governance, trust-level, production-readiness
Stack: Claude, LangGraph, OpenAI
Difficulty: Intermediate
Setup: 40 min
Version: 2.0.0 · 2026-06-21

Overview

Turns thin product data into rich descriptions, attributes, categories, and SEO fields.

Builds every entry strictly from the source data you provide.

Flags missing or uncertain specs for review instead of guessing them.

Defensive: never fabricates specs or compatibility/safety claims, and keeps the taxonomy consistent.

AgentAz™ specification

A lightweight, design-time governance spec for security review. It documents what this agent is authorized to do — and why — and pairs with whatever policy engine you already run. It does not enforce anything at runtime.

Trust Level: A2 (Recommend)
DNA Pattern: Synthesis (Extract → Synthesize → Verify)
Worst-Case Action: Proposes an incorrect enrichment that a human reviews before it reaches the catalog. It cannot write to the catalog and never fabricates a spec or claim; it enriches only from provided source data, flagging low-confidence fields.
Authority Boundary: Enriches product records strictly from provided source data (normalizing attributes, filling gaps it can support) and flags low-confidence fields. It never fabricates specs, certifications, or marketing claims, and never writes to the catalog. A human approves before publish.
Verification Test: Confirm enrichments cite source data and unsupported fields are left blank/flagged, not invented; confirm no catalog-write tool exists.
Production Readiness: 6/6 dimensions passing.
  • Tool isolation: catalog-write tools absent.
  • Human gates: a human approves before publish.
  • Confidence escalation: low-confidence fields flagged.
  • Cost ceiling: bounded per product.
  • Audit trail: enrichments and sources logged.
  • Escalation path: unsupported claims flagged.
Last Reviewed: 2026-06-24

Machine-readable contract (agentaz.json), validated against the open AgentAz™ JSON Schema — bundled for offline use and published at a permanent URL:

agentaz.json
{
  "$schema": "./agentaz.schema.json",
  "version": "2.0.0",
  "last_reviewed": "2026-06-24",
  "agent_id": "catalog-enrichment-agent",
  "trust_level": "A2",
  "dna_pattern": "Synthesis",
  "worst_case_action": "Proposes a wrong enrichment for human review. Cannot write to catalog; never fabricates specs.",
  "authority_boundary": "Enriches from source data and flags low confidence; catalog-write tools absent; no fabrication.",
  "tags": [
    "ecommerce",
    "catalog",
    "enrichment",
    "read-only",
    "human-review"
  ],
  "tool_boundary": {
    "allowed_tools": [
      "read_source",
      "normalize_attributes",
      "fill_supported_gaps",
      "flag_low_confidence"
    ],
    "execution_tools_absent": true
  },
  "output_boundary": {
    "format": "structured_json",
    "never_emits": [
      "catalog_write"
    ],
    "never_fabricates": true
  },
  "cost_boundary": {
    "max_usd_per_trace_loop": 0.2,
    "alert_threshold_usd": 0.14
  },
  "loop_boundary": {
    "max_reasoning_turns": 8
  },
  "human_handoff": {
    "triggers": [
      "unsupported_claim",
      "low_confidence_field"
    ],
    "destination": "catalog_reviewer"
  },
  "audit": {
    "append_only": true,
    "logs": [
      "enrichments",
      "sources"
    ]
  }
}
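
Because the contract documents boundaries but does not enforce them at runtime, a host application has to apply the checks itself. The following is a minimal sketch of that idea, assuming the contract has been loaded into a dict. The helper names `check_tool_call` and `check_budget` are hypothetical and not part of AgentAz; only the field names and values come from the contract above.

```python
import json

# A trimmed copy of the contract fields used here (values from agentaz.json).
CONTRACT = json.loads("""
{
  "tool_boundary": {
    "allowed_tools": ["read_source", "normalize_attributes",
                      "fill_supported_gaps", "flag_low_confidence"]
  },
  "cost_boundary": {"max_usd_per_trace_loop": 0.2, "alert_threshold_usd": 0.14},
  "loop_boundary": {"max_reasoning_turns": 8}
}
""")


def check_tool_call(contract: dict, tool_name: str) -> bool:
    """Return True only if the tool is on the contract's allow-list."""
    return tool_name in contract["tool_boundary"]["allowed_tools"]


def check_budget(contract: dict, spent_usd: float, turns: int) -> str:
    """Classify a run as 'ok', 'alert', or 'stop' against the cost and loop bounds."""
    cost = contract["cost_boundary"]
    if spent_usd > cost["max_usd_per_trace_loop"]:
        return "stop"
    if turns > contract["loop_boundary"]["max_reasoning_turns"]:
        return "stop"
    if spent_usd >= cost["alert_threshold_usd"]:
        return "alert"
    return "ok"
```

A wrapper around the agent loop could call `check_tool_call` before dispatching each tool and `check_budget` after each reasoning turn, halting on "stop".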

New to this? Read the AgentAz specification guide — Trust Levels, DNA patterns, and how it complements your runtime.

AgentAz™ is open source under Apache-2.0 — schema (frozen v1.0.0) and source on GitHub.

Governance matrix

A scannable summary of this blueprint's governance coverage, derived from its AgentAz™ specification. It documents the boundaries that already ship — not new functionality.

Agent goal: Bounded by the authority spec above
Trust Level: A2 (Recommend)
Tool access: Least privilege; execution tools absent (read-only)
Context handling: Grounded in provided inputs; cites or flags rather than guessing
Memory strategy: Task-scoped; no persistent cross-session memory
Human approval: Required on an unsupported claim or a low-confidence field; routed to the catalog reviewer
Audit trail: Append-only log (enrichments, sources)
Cost & loop bounds: ≤ $0.20 per loop · ≤ 8 reasoning turns
Recovery / escalation: Escalates to catalog reviewer

Agent component mapping

A framework-neutral view of how this blueprint maps to standard agent-architecture components (the vocabulary common to ADK-style frameworks). It describes structure for clarity — not an official integration or certified compatibility.

Agent: Primary reasoner with Recommend authority (A2)
Tools: read source, normalize attributes, fill supported gaps, flag low confidence; execution tools absent (read-only)
Memory: Task-scoped working context; no persistent cross-session memory
Guardrails: Worst-case classified (A2); no execution tools; ≤ $0.20/loop · ≤ 8 turns
Evaluator: Confidence and authority-boundary checks; low-confidence or out-of-bounds results are flagged, not actioned
Handoff: Escalates to catalog reviewer on an unsupported claim or a low-confidence field

Failure modes

Specific ways this blueprint can fail, and how it is designed to detect, contain, and recover from each — the boundaries that make it safe to run, stated plainly.

Fabricates a spec or claim not in the source, such as a certification.

Detection
Enrichments are tied to source data and unsupported fields are flagged.
Mitigation
It never fabricates: unsupported fields are left blank, and no catalog-write tool exists.
Recovery
A human approves before publish and unsupported claims are dropped.
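
One crude way to tie enrichments to source data is a literal grounding check: every non-null attribute value must appear somewhere in the source text. The sketch below assumes plain-text source and string-valued attributes; `unsupported_values` is a hypothetical helper, and a substring match is only a first-pass filter (derived values such as "Yes" for a boolean attribute would need a separate rule).

```python
def unsupported_values(source_text: str, attributes: dict) -> list[str]:
    """List attributes whose non-null value does not appear in the source text."""
    haystack = source_text.lower()
    return [
        name
        for name, value in attributes.items()
        if value is not None and str(value).lower() not in haystack
    ]


source = "Stainless steel frying pan, 28cm, induction-compatible, dishwasher safe."
proposed = {"material": "Stainless steel", "diameter_cm": "28", "coating": "Ceramic"}

# "coating" has no support in the source, so it would be flagged for review.
flagged = unsupported_values(source, proposed)
```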

Normalizes an attribute incorrectly, with a wrong unit or category.

Detection
Type and unit validation runs.
Mitigation
Ambiguous normalizations are flagged.
Recovery
The reviewer corrects it before publish.
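
As an illustration of type and unit validation, the sketch below normalizes a length string to centimetres and returns a reason instead of a value when the input is ambiguous. The conversion table, the target unit, and the function name `normalize_length_cm` are assumptions made for this example.

```python
import re
from typing import Optional

# Hypothetical conversion table: factor to centimetres for each accepted unit.
TO_CM = {"mm": 0.1, "cm": 1.0, "m": 100.0, "in": 2.54}

_LENGTH = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*([a-zA-Z]+)\s*$")


def normalize_length_cm(raw: str) -> tuple[Optional[float], Optional[str]]:
    """Return (value_in_cm, None) on success, or (None, reason) to flag for review."""
    match = _LENGTH.match(raw)
    if match is None:
        # A bare number or free text has no unit we can trust.
        return None, f"unparseable length: {raw!r}"
    value, unit = float(match.group(1)), match.group(2).lower()
    if unit not in TO_CM:
        return None, f"unknown unit: {unit!r}"
    return round(value * TO_CM[unit], 2), None
```

Returning a reason rather than raising keeps the failure in the data path, so it can be copied straight into the review flags.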

Low-confidence enrichment is treated as fact.

Detection
Each field carries confidence and low confidence is flagged.
Mitigation
Low-confidence fields go to review, not straight to the catalog.
Recovery
The reviewer verifies or discards them.
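
The routing rule described here can be sketched as a simple split on a per-field confidence score. The threshold value, the `Field` type, and `route_fields` are hypothetical; how confidence is scored is left to your pipeline.

```python
from dataclasses import dataclass
from typing import Optional

REVIEW_THRESHOLD = 0.8  # Hypothetical cutoff; tune it on a labeled set.


@dataclass
class Field:
    name: str
    value: Optional[str]
    confidence: float  # 0.0 to 1.0, however your pipeline scores it


def route_fields(fields: list[Field]) -> tuple[dict, list[str]]:
    """Split fields into proposed attributes and names flagged for review."""
    attributes: dict = {}
    flagged: list[str] = []
    for field in fields:
        if field.value is None or field.confidence < REVIEW_THRESHOLD:
            attributes[field.name] = None  # leave blank rather than guess
            flagged.append(field.name)
        else:
            attributes[field.name] = field.value
    return attributes, flagged
```

This version blanks low-confidence values outright. An alternative design keeps the proposed value alongside the flag so the reviewer can accept it with one click; either way, nothing below the threshold is treated as fact.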

Evaluation

The primary measure is enrichment accuracy with zero fabricated claims; an invented spec or certification counts as the critical failure.

Enrichment accuracy: Share of enriched attributes matching source or verified data.
Fabrication rate: Frequency of specs or claims not in the source; should be near zero.
Normalization accuracy: Share of attributes normalized to the correct unit or category.
Low-confidence routing: Share of low-confidence fields correctly sent to review rather than published.
Latency: Time to enrich per product.

Recommended approach. Use a labeled product set with verified attributes; measure enrichment and normalization accuracy and treat any unsupported claim as a critical fabrication. Confirm nothing writes to the catalog without review.
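
A minimal sketch of how two of these metrics could be computed for one product against a verified label set. It assumes exact string matching and treats any emitted value with no verified counterpart as a fabrication; the function name `evaluate` and that scoring rule are assumptions, not part of the kit.

```python
def evaluate(predicted: dict, verified: dict) -> dict:
    """Score one product's enriched attributes against verified ground truth.

    A non-null predicted value whose attribute has no verified value counts
    as a fabrication; a null prediction is treated as correctly withheld.
    """
    emitted = {k: v for k, v in predicted.items() if v is not None}
    correct = sum(1 for k, v in emitted.items() if verified.get(k) == v)
    fabricated = sum(1 for k in emitted if verified.get(k) is None)
    total = len(emitted)
    return {
        "enrichment_accuracy": correct / total if total else 1.0,
        "fabrication_rate": fabricated / total if total else 0.0,
    }


predicted = {"material": "Wool blend", "color": "Navy",
             "fabric_composition": "80% wool", "care": None}
verified = {"material": "Wool blend", "color": "Navy"}
scores = evaluate(predicted, verified)
```

In this example two of the three emitted values match and one (`fabric_composition`) has no verified source, so the run would fail a zero-fabrication gate even though most attributes are correct.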

When to use

Use it when

  • You have thin or inconsistent product data to enrich at scale.
  • You want descriptions and attributes grounded in real source data.
  • You want missing specs flagged, not invented.
  • You want consistent categorization and SEO fields.

Avoid it when

  • You want it to invent specs or claims to fill gaps — it won't.
  • You have no source product data for it to enrich from.
  • You can't review flagged uncertain attributes.
  • You need fabricated reviews or ratings (it won't produce them).

System prompt

system-prompt.md
You are a Product Catalog Enrichment Agent. You enrich product catalog entries (descriptions, attributes, categories, SEO fields) using the PROVIDED source data. You are judged on rich, consistent, accurate entries and on never fabricating a spec, attribute, or claim.

== CORE PRINCIPLES ==
1. Enrich from source only. Build descriptions and attributes from the product data provided. Never invent specs (dimensions, materials, weight, compatibility) or claims that aren't supported by the source.
2. Flag, don't guess. If an attribute is missing or unclear, mark it as missing/uncertain for review. A blank flagged field beats a guessed spec that's wrong on a product page.
3. Faithful and consistent. Don't change the meaning of source data. Keep categorization and attribute naming consistent with the taxonomy.

== HARD RULES (NON-NEGOTIABLE) ==
- NO FABRICATED SPECS: Never invent dimensions, materials, weight, capacity, or technical specs. Source-supported only; missing = flagged.
- NO INVENTED CLAIMS: Never fabricate compatibility ("works with all models"), safety/regulatory ("FDA approved", "UL listed"), performance, or health claims. These carry legal risk; only state what the source supports.
- NO FAKE SOCIAL PROOF: Never invent reviews, ratings, awards, or testimonials.
- FLAG UNCERTAIN: Missing/ambiguous attributes -> flagged for review, not guessed.
- FAITHFUL + CONSISTENT: Preserve source meaning; apply the taxonomy and attribute schema consistently.

== METHOD ==
- Read the source data. Write an accurate description, extract structured attributes, categorize per the taxonomy, and generate SEO fields — all source-grounded. Flag missing/uncertain attributes and any claim needing verification.

== OUTPUT FORMAT (return ONE JSON object) ==
{
  "product_id": "<id>",
  "description": "<source-grounded description>",
  "attributes": { "<attr>": "<value or null>" },
  "category": "<taxonomy path>",
  "seo": { "title": "<...>", "meta": "<...>" },
  "flagged_for_review": ["<missing/uncertain attributes>"],
  "claims_to_verify": ["<any claim needing source confirmation>"],
  "note": "Enriched from provided source data only. No specs or claims were fabricated."
}
Never invent a spec or claim. Flag missing attributes. Keep the taxonomy consistent.
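
Outside the prompt itself, calling code can check that a response follows the output format before it reaches a reviewer. The sketch below is one way to do that with the standard library only. `validate_output` is a hypothetical helper, and it assumes (as in the examples on this page) that each `flagged_for_review` entry is free text that mentions the attribute name.

```python
import json

REQUIRED_KEYS = {
    "product_id", "description", "attributes", "category",
    "seo", "flagged_for_review", "claims_to_verify", "note",
}


def validate_output(raw: str) -> list[str]:
    """Return a list of problems found in the agent's JSON output (empty if none)."""
    try:
        entry = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(entry, dict):
        return ["top-level value is not a JSON object"]
    problems = [f"missing key: {key}" for key in sorted(REQUIRED_KEYS - set(entry))]
    flagged = " ".join(entry.get("flagged_for_review", []))
    for name, value in entry.get("attributes", {}).items():
        # Every blank attribute should be mentioned in flagged_for_review.
        if value is None and name not in flagged:
            problems.append(f"null attribute not flagged: {name}")
    return problems
```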


Setup guide

Install and connect PIM

Install the agent and connect your catalog/PIM source.

shell
pipx install catalog-enrich-agent
catalog-enrich-agent connect --pim shopify --taxonomy ./taxonomy.yml
catalog-enrich-agent doctor

Configure guardrails

This step turns on the guards against fabricated specs and invented claims.

shell
cp .env.example .env
# Then set these values inside .env (they are file contents, not shell commands):
ANTHROPIC_API_KEY=sk-ant-...
NO_FABRICATED_SPECS=true
NO_INVENTED_CLAIMS=true
FLAG_MISSING_ATTRIBUTES=true

Define schema & taxonomy

Declare attributes and categories.

yaml
# taxonomy.yml
attributes: [material, dimensions, weight, color, compatibility]
categories: [Home > Kitchen > Cookware, ...]
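
To show what the declared schema and taxonomy can be used for, here is a small consistency check over an enriched entry. It hard-codes the sample values above rather than parsing the YAML file; `taxonomy_problems` is a hypothetical helper, and accepting deeper paths beneath a declared category is an assumption.

```python
# Values copied from the sample taxonomy.yml; only one category path is shown there.
ALLOWED_ATTRIBUTES = {"material", "dimensions", "weight", "color", "compatibility"}
ALLOWED_CATEGORIES = {"Home > Kitchen > Cookware"}


def taxonomy_problems(entry: dict) -> list[str]:
    """Return schema and taxonomy violations for one enriched entry."""
    problems = []
    for name in entry.get("attributes", {}):
        if name not in ALLOWED_ATTRIBUTES:
            problems.append(f"attribute not in schema: {name}")
    category = entry.get("category", "")
    # Accept a declared path, or any deeper path beneath it (an assumption).
    in_taxonomy = any(
        category == path or category.startswith(path + " > ")
        for path in ALLOWED_CATEGORIES
    )
    if not in_taxonomy:
        problems.append(f"category not in taxonomy: {category}")
    return problems
```

A check like this catches drift such as `colour` versus `color` before it fragments catalog filters.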

Enrich samples

Run a batch and review flags and claims.

shell
catalog-enrich-agent run --products ./batch.csv --explain
# prints enriched entries + flagged_for_review + claims_to_verify

Wire into the catalog

Enrich incoming products; flagged fields go to review.

shell
# product in -> enrich from source -> flagged attributes/claims -> human review

Architecture

Tools required

get_product: Retrieve the product's source data.
extract_attributes: Pull structured attributes from the source.
enrich_description: Write a source-grounded product description.
categorize: Assign the product to the taxonomy consistently.
seo_fields: Generate SEO title and meta fields from grounded content.
flag_uncertain: Flag missing or uncertain attributes for review.
fact_guard: Block fabricated specs and claims; flag those needing verification.
validate_taxonomy: Keep categorization and attribute naming consistent.

Workflow

  1. Take the source

    Receive the product/source data to enrich from.

  2. Extract attributes

    Pull structured attributes and map them to the schema.

  3. Write the description

    Draft an accurate description grounded in the source.

  4. Categorize

    Assign the taxonomy consistently with similar products.

  5. Guard claims

    Block fabricated specs/claims; flag any needing verification.

  6. Flag gaps

    Mark missing or uncertain attributes for review rather than guessing.

  7. Generate SEO

    Produce SEO fields from the grounded content.
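
The seven steps can be sketched as one function over already-structured source data. This is an illustration of the control flow only: the schema, the `enrich` function, and the `claims` field on the source are hypothetical, and the string join stands in for the model call that would write the real description.

```python
SCHEMA = ["material", "color", "sizes"]  # hypothetical attribute schema


def enrich(product_id: str, source: dict, requested_claims: list[str]) -> dict:
    """Run the seven workflow steps over one product's source data."""
    # 1-2. Take the source and map its fields onto the schema (missing -> None).
    attributes = {name: source.get(name) for name in SCHEMA}
    # 3. Describe using only values present in the source (stand-in for the model).
    description = ", ".join(str(v) for v in attributes.values() if v is not None)
    # 4. Categorize from the source's own category field, if it has one.
    category = source.get("category")
    # 5. Guard claims: anything the source does not support is queued, not added.
    supported = set(source.get("claims", []))
    claims_to_verify = [c for c in requested_claims if c not in supported]
    # 6. Flag gaps instead of guessing.
    flagged = [name for name, value in attributes.items() if value is None]
    if category is None:
        flagged.append("category")
    # 7. Generate SEO fields from the grounded description.
    seo = {"title": description[:60], "meta": description[:160]}
    return {
        "product_id": product_id,
        "description": description,
        "attributes": attributes,
        "category": category,
        "seo": seo,
        "flagged_for_review": flagged,
        "claims_to_verify": claims_to_verify,
        "note": "Enriched from provided source data only. No specs or claims were fabricated.",
    }
```

Note that the function only returns a proposal; consistent with the authority boundary, nothing in it writes to a catalog.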

Examples

Good source data → faithful enriched entry

Solid source data produces a rich, accurate entry with structured attributes.

Input

input
Source: 'Stainless steel frying pan, 28cm, induction-compatible, dishwasher safe.'

Output

output
{
  "product_id": "PAN-28",
  "description": "A 28cm stainless steel frying pan suited to everyday cooking. Induction-compatible and dishwasher safe for easy use and cleanup.",
  "attributes": { "material": "Stainless steel", "diameter_cm": "28", "induction_compatible": "Yes", "dishwasher_safe": "Yes" },
  "category": "Home > Kitchen > Cookware > Frying Pans",
  "seo": { "title": "28cm Stainless Steel Frying Pan — Induction Compatible", "meta": "Durable 28cm stainless steel frying pan. Induction-compatible and dishwasher safe." },
  "flagged_for_review": [],
  "claims_to_verify": [],
  "note": "Enriched from provided source data only. No specs or claims were fabricated."
}

Note: Every attribute and claim traces to the source. The description reads well but adds nothing the data doesn't support. This is the routine, high-volume case the agent should automate.

Missing spec → flag for review, don't invent it

A key attribute isn't in the source. The agent leaves it blank and flags it rather than guessing.

Input

input
Source: 'Wool blend sweater, navy.' No size, fabric percentages, or care instructions provided.

Output

output
{
  "product_id": "SWT-NV",
  "description": "A navy wool-blend sweater. (Additional details pending — see flagged attributes.)",
  "attributes": { "material": "Wool blend", "color": "Navy", "fabric_composition": null, "sizes": null, "care": null },
  "category": "Apparel > Sweaters",
  "seo": { "title": "Navy Wool-Blend Sweater", "meta": "Navy wool-blend sweater." },
  "flagged_for_review": ["fabric_composition — not provided (don't guess wool %)", "sizes — not provided", "care instructions — not provided"],
  "claims_to_verify": [],
  "note": "Enriched from provided source data only. No specs or claims were fabricated."
}

Note: The agent enriches what it can ('wool blend', 'navy') and sets the missing fabric percentages, sizes, and care info to null and flags them for review, rather than inventing '80% wool / 20% acrylic' or a care label. A guessed spec on a product page leads to returns and erodes trust.

Asked to add a compatibility/safety claim not in source → refuse to fabricate

The request wants impressive but unsupported claims. The agent won't invent them.

Input

input
Request: 'Add that this charger is compatible with all phones and is FDA approved — sounds better.'

Output

output
{
  "product_id": "CHG-1",
  "description": "<accurate description from the actual source data>",
  "attributes": { "<from source>": "..." },
  "category": "Electronics > Chargers",
  "seo": { "title": "<grounded>", "meta": "<grounded>" },
  "flagged_for_review": [],
  "claims_to_verify": [
    "'Compatible with all phones' — NOT added. The source doesn't support a universal-compatibility claim, and it's likely false (connector/wattage limits). Provide the actual supported devices and I'll list them accurately.",
    "'FDA approved' — NOT added. This is a regulatory claim with legal consequences if untrue, and chargers aren't FDA-regulated this way. I won't fabricate a certification. Provide real certifications (e.g., UL, CE) and I'll include them."
  ],
  "note": "Enriched from provided source data only. No specs or claims were fabricated."
}

Note: This is the defining defensive case: the agent is asked to bolt on a false universal-compatibility claim and a fabricated regulatory certification. It refuses both, because a wrong compatibility claim drives returns and a fake 'FDA approved' is a serious legal and regulatory liability. It flags them, explains why, and offers to add real, source-backed claims instead.

Implementation notes

  • Enrich strictly from source data and never invent specs like dimensions, materials, or weight; a guessed spec on a live product page causes returns and erodes trust.
  • Never fabricate compatibility, safety, or regulatory claims ('works with all', 'FDA approved'); these carry direct legal and returns risk, so flag them for verification instead.
  • Never invent reviews, ratings, awards, or other social proof.
  • Flag missing or uncertain attributes as null for review rather than filling them with plausible guesses.
  • Keep categorization and attribute naming consistent with the taxonomy so the catalog stays clean and filterable.
  • Stay faithful to the source meaning; enrich the wording, not the facts.
  • A cheaper model is usually enough to format and categorize clean entries, so keep the strong model for ambiguous data and claim-guarding.
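
The model-tiering note above could be implemented as a two-pass route: a first pass on the cheaper model, with escalation only when that pass raises flags or claims. The sketch below assumes the first pass returns the output format shown earlier; the tier names and `pick_model_tier` are placeholders to map onto whichever models you run.

```python
def pick_model_tier(first_pass: dict) -> str:
    """Return 'strong' for entries needing judgment, else 'fast'."""
    has_flags = bool(first_pass.get("flagged_for_review"))
    has_claims = bool(first_pass.get("claims_to_verify"))
    return "strong" if has_flags or has_claims else "fast"


clean = {"flagged_for_review": [], "claims_to_verify": []}
ambiguous = {"flagged_for_review": ["sizes"], "claims_to_verify": []}
tiers = [pick_model_tier(clean), pick_model_tier(ambiguous)]
```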

Variations

Basic

Description enricher

Writes source-grounded descriptions and extracts attributes for catalog entries. On demand.

Advanced

Guarded enrichment

Adds taxonomy categorization, SEO fields, no-fabricated-spec/claim guards, and missing-attribute flagging for review.

Enterprise

Catalog operations

Adds PIM integration, batch enrichment, taxonomy governance, claim-verification workflows, and audit trails across the catalog.

Download the Agent Blueprint

The complete blueprint, zipped — including a runnable run.py you can execute with one API key (Anthropic or OpenAI).

Files included: README.md, system-prompt.md, setup-guide.md, tools.json, workflow.md, examples.md, .env.example, kit.json, run.py, LICENSE, NOTICE, starters/

Export

Generate a starter for your stack — all client-side, nothing leaves your browser.


Starters use mock tools — swap in your integrations to deploy.

View the source on GitHub

This blueprint and the AgentAz™ specification live in the central AgentKits registry — open source under Apache-2.0 (code & schema) and CC‑BY‑4.0 (text).
