AgentKits

Literature Synthesis Agent

Production Blueprint

Includes Agent Blueprint + Implementation Guide

An agent that synthesizes a set of sources — papers, reports, articles you provide — into a structured review: what the evidence says, how strong it is, where sources agree, and where they conflict. Its defining discipline is citation honesty. It is built defensively: it cites every claim to a real provided source, never fabricates findings or citations (the classic and dangerous failure mode), weighs evidence strength rather than treating all sources equally, surfaces conflicting findings instead of cherry-picking, marks gaps, and doesn't overgeneralize beyond what the sources support.

Tags: research, literature-review, synthesis, citations, evidence, autonomous-agent, academic, knowledge, agentaz, agent-governance, trust-level, production-readiness
Stack: Claude, LangGraph, OpenAI
Difficulty: Advanced
Setup: 45 min
Version: 2.0.0 · 2026-06-21

Overview

Synthesizes the sources you provide into a structured, cited review.

Cites every claim to a real source and weighs how strong the evidence is.

Surfaces where sources agree and, honestly, where they conflict.

Defensive: never fabricates findings or citations, marks gaps, and doesn't overgeneralize.

AgentAz™ specification

A lightweight, design-time governance spec for security review. It documents what this agent is authorized to do — and why — and pairs with whatever policy engine you already run. It does not enforce anything at runtime.

Trust Level: A1 (Research)
DNA Pattern: Research (Research → Verify)
Worst-Case Action: Includes a wrong or overstated claim in a literature synthesis that a human reviews before relying on it. It only gathers and cites sources; it never fabricates a citation or finding and never takes any action.
Authority Boundary: Synthesizes provided or retrieved literature into a structured summary with citations, distinguishing well-supported findings from weak ones and flagging gaps. It never invents a citation, overstates evidence, or takes action.
Verification Test: Confirm every claim maps to a real cited source and confidence is qualified; confirm no fabricated citation and no action tool in the registry.
Production Readiness: 6/6 dimensions passing. Tool isolation: action tools absent. Human gates: a human reviews. Confidence escalation: weak or conflicting evidence flagged. Cost ceiling: bounded per synthesis. Audit trail: sources and citations logged. Escalation path: thin evidence flagged.
Last Reviewed: 2026-06-24

Machine-readable contract (agentaz.json), validated against the open AgentAz™ JSON Schema — bundled for offline use and published at a permanent URL:

agentaz.json
{
  "$schema": "./agentaz.schema.json",
  "version": "2.0.0",
  "last_reviewed": "2026-06-24",
  "agent_id": "literature-synthesis-agent",
  "trust_level": "A1",
  "dna_pattern": "Research",
  "worst_case_action": "Includes an overstated claim in a synthesis for human review. Never fabricates citations; no actions.",
  "authority_boundary": "Synthesizes literature with citations; never fabricates; action tools absent.",
  "tags": [
    "research",
    "literature",
    "read-only",
    "cited"
  ],
  "tool_boundary": {
    "allowed_tools": [
      "read_sources",
      "synthesize",
      "cite_source",
      "grade_evidence",
      "flag_gap"
    ],
    "execution_tools_absent": true,
    "read_only": true
  },
  "output_boundary": {
    "format": "structured_json",
    "never_emits": [
      "action"
    ],
    "never_fabricates": true
  },
  "cost_boundary": {
    "max_usd_per_trace_loop": 0.3,
    "alert_threshold_usd": 0.2
  },
  "loop_boundary": {
    "max_reasoning_turns": 10
  },
  "human_handoff": {
    "triggers": [
      "weak_evidence",
      "conflicting_findings"
    ],
    "destination": "researcher"
  },
  "audit": {
    "append_only": true,
    "logs": [
      "sources",
      "citations",
      "evidence_grades"
    ]
  }
}

New to this? Read the AgentAz specification guide — Trust Levels, DNA patterns, and how it complements your runtime.

AgentAz™ is open source under Apache-2.0 — schema (frozen v1.0.0) and source on GitHub.
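The Verification Test asks for "no action tool in the registry". One way to check that against the contract might look like the sketch below. It assumes the contract is available as a dict and that the running agent can list its registered tool names; `check_tool_registry`, `ok_registry`, and `bad_registry` are hypothetical names for illustration, not part of the blueprint.

```python
import json

# Subset of the agentaz.json contract shown above, inlined so the example is self-contained.
CONTRACT = json.loads("""
{
  "tool_boundary": {
    "allowed_tools": ["read_sources", "synthesize", "cite_source", "grade_evidence", "flag_gap"],
    "execution_tools_absent": true,
    "read_only": true
  }
}
""")


def check_tool_registry(contract: dict, registered_tools: list[str]) -> list[str]:
    """Return registered tools that the contract does not allow (hypothetical helper)."""
    allowed = set(contract["tool_boundary"]["allowed_tools"])
    return sorted(set(registered_tools) - allowed)


# Hypothetical registries: one compliant, one where an action tool slipped in.
ok_registry = ["read_sources", "cite_source", "flag_gap"]
bad_registry = ["read_sources", "send_email"]

print(check_tool_registry(CONTRACT, ok_registry))   # []
print(check_tool_registry(CONTRACT, bad_registry))  # ['send_email']
```

Because the spec does not enforce anything at runtime, a check like this would typically run in CI or at agent startup.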

Governance matrix

A scannable summary of this blueprint's governance coverage, derived from its AgentAz™ specification. It documents the boundaries that already ship — not new functionality.

Agent goal: Bounded by the authority spec above
Trust Level: A1 (Research)
Tool access: Least privilege; execution tools absent (read-only)
Context handling: Grounded in provided inputs; cites or flags rather than guessing
Memory strategy: Task-scoped; no persistent cross-session memory
Human approval: Required on weak evidence or conflicting findings → researcher
Audit trail: Append-only log (sources, citations, evidence grades)
Cost & loop bounds: ≤ $0.30 per loop · ≤ 10 reasoning turns
Recovery / escalation: Escalates to researcher
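The cost and loop bounds in the matrix come from `cost_boundary` and `loop_boundary` in agentaz.json. Since the spec itself is design-time only, something in the runtime has to apply them. A minimal sketch of such a guard follows; it assumes the caller can supply a per-turn cost in USD (for example from token usage), and `LoopBudget` and `BudgetExceeded` are hypothetical names.

```python
from dataclasses import dataclass
from decimal import Decimal


class BudgetExceeded(RuntimeError):
    """Raised when a turn would cross a limit from the contract."""


@dataclass
class LoopBudget:
    # Defaults mirror cost_boundary and loop_boundary in agentaz.json.
    max_usd: Decimal = Decimal("0.30")
    alert_usd: Decimal = Decimal("0.20")
    max_turns: int = 10
    spent_usd: Decimal = Decimal("0")
    turns: int = 0

    def record_turn(self, cost_usd: Decimal) -> bool:
        """Record one reasoning turn; return True once the alert threshold is reached."""
        if self.turns + 1 > self.max_turns:
            raise BudgetExceeded(f"turn limit {self.max_turns} reached")
        if self.spent_usd + cost_usd > self.max_usd:
            raise BudgetExceeded(f"cost ceiling ${self.max_usd:.2f} would be exceeded")
        self.turns += 1
        self.spent_usd += cost_usd
        return self.spent_usd >= self.alert_usd


budget = LoopBudget()
# Five turns at a hypothetical $0.05 each; the alert fires from the fourth turn on.
alerts = [budget.record_turn(Decimal("0.05")) for _ in range(5)]
print(alerts)  # [False, False, False, True, True]
```

`Decimal` is used instead of `float` so that threshold comparisons on small dollar amounts are exact.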

Agent component mapping

A framework-neutral view of how this blueprint maps to standard agent-architecture components (the vocabulary common to ADK-style frameworks). It describes structure for clarity — not an official integration or certified compatibility.

Agent: Primary reasoner; Research authority (A1)
Tools: read_sources, synthesize, cite_source, grade_evidence, flag_gap; execution tools absent (read-only)
Memory: Task-scoped working context; no persistent cross-session memory
Guardrails: Worst-case classified (A1); no execution tools; ≤ $0.30/loop · ≤ 10 turns
Evaluator: Confidence and authority-boundary checks; low-confidence or out-of-bounds results are flagged, not actioned
Handoff: Escalates to researcher on weak evidence or conflicting findings
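The handoff row can be read as a small rule over the agent's output. The sketch below derives the two `human_handoff` triggers from a synthesis object. It assumes the object follows the JSON output format given in the system prompt; `handoff_triggers` is a hypothetical helper, and the sample data is abbreviated from the conflicting-findings example.

```python
def handoff_triggers(synthesis: dict) -> list[str]:
    """Return the human_handoff triggers that apply to a synthesis (hypothetical helper)."""
    triggers = []
    findings = synthesis.get("findings", [])
    if any(f.get("evidence_strength") == "weak" for f in findings):
        triggers.append("weak_evidence")
    if synthesis.get("conflicts"):
        triggers.append("conflicting_findings")
    return triggers


# Abbreviated version of the supplement example from this blueprint.
synthesis = {
    "findings": [
        {"claim": "Supplement Z improved outcome X", "citation": "Patel 2020", "evidence_strength": "weak"},
        {"claim": "No significant effect", "citation": "Nguyen 2023", "evidence_strength": "moderate"},
    ],
    "conflicts": [{"topic": "Effect of supplement Z on outcome X", "positions": ["...", "..."]}],
}

print(handoff_triggers(synthesis))  # ['weak_evidence', 'conflicting_findings']
```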

Failure modes

Specific ways this blueprint can fail, and how it is designed to detect, contain, and recover from each — the boundaries that make it safe to run, stated plainly.

Overstates a finding the evidence doesn't support.

Detection
Evidence strength is graded per finding and weak evidence is flagged.
Mitigation
It distinguishes well-supported findings from weak ones and never overstates.
Recovery
The researcher reviews against the cited sources.

Fabricates a citation or attributes a claim to the wrong source.

Detection
Every claim maps to a real cited source; unmapped claims are withheld.
Mitigation
It never invents a citation.
Recovery
The researcher verifies the citation.

Misses a contradicting study, implying false consensus.

Detection
Conflicting findings and gaps are flagged.
Mitigation
It surfaces disagreement rather than smoothing it over.
Recovery
The researcher reviews the full body of evidence.
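The second failure mode's detection rule ("unmapped claims are withheld") could be approximated with a mechanical check like the one below. This is a sketch under two assumptions: citations are plain reference strings, and several references in one citation are separated by semicolons. `guard_citations` is a hypothetical name, and "Doe 2019" is a reference invented for this example to stand in for a fabricated citation.

```python
def guard_citations(findings: list[dict], provided_refs: list[str]) -> tuple[list[dict], list[dict]]:
    """Split findings into (kept, withheld) by whether every cited ref was provided."""
    provided = set(provided_refs)
    kept, withheld = [], []
    for finding in findings:
        # Assumption: multiple refs in one citation are separated by ';'.
        refs = [r.strip() for r in finding.get("citation", "").split(";") if r.strip()]
        if refs and all(r in provided for r in refs):
            kept.append(finding)
        else:
            withheld.append(finding)
    return kept, withheld


provided_refs = ["Patel 2020", "Nguyen 2023"]
findings = [
    {"claim": "Supplement Z improved outcome X", "citation": "Patel 2020"},
    {"claim": "Studies show a 40% improvement", "citation": "Doe 2019"},  # not provided
    {"claim": "Uncited assertion", "citation": ""},
]

kept, withheld = guard_citations(findings, provided_refs)
print(len(kept), len(withheld))  # 1 2
```

A check like this only confirms that a cited reference exists in the provided set. Whether the source actually supports the claim is still for the researcher to verify, as the recovery steps above say.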

Evaluation

Evidence-faithful synthesis with real citations is primary — an overstated finding or a fabricated citation is the failure.

Citation validity: Share of claims mapping to a real, correctly-attributed source.
Evidence calibration: Share of findings whose stated strength matches the underlying evidence.
Fabrication rate: Frequency of invented citations or misattributed claims; should be near zero.
Contradiction recall: Of contradicting studies present, the share surfaced rather than smoothed into false consensus.
Latency: Time to synthesize a corpus.

Recommended approach. Use a corpus with annotated findings and citations; verify every claim maps to a real source and measure evidence calibration and contradiction recall. Any invented citation is a critical failure.
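The metrics above can be computed from a reviewed run roughly as follows. The sketch assumes a human reviewer has labelled each output claim with a `citation_status` ("valid", "misattributed", or "invented") and a `strength_ok` flag, and has listed the conflict topics present in the annotated corpus. These field names and the `evaluate` function are assumptions for illustration, not the blueprint's evaluation harness.

```python
def share(flags) -> float:
    """Fraction of truthy values; 0.0 for an empty input."""
    flags = list(flags)
    return sum(flags) / len(flags) if flags else 0.0


def evaluate(reviewed_claims: list[dict], gold_conflicts: set[str], surfaced_conflicts: set[str]) -> dict:
    statuses = [c["citation_status"] for c in reviewed_claims]
    gold, surfaced = set(gold_conflicts), set(surfaced_conflicts)
    return {
        "citation_validity": share(s == "valid" for s in statuses),
        "evidence_calibration": share(c["strength_ok"] for c in reviewed_claims),
        "fabrication_rate": share(s in ("invented", "misattributed") for s in statuses),
        "contradiction_recall": len(gold & surfaced) / len(gold) if gold else 1.0,
        # Any invented citation is treated as a critical failure.
        "critical_failure": "invented" in statuses,
    }


# Hypothetical reviewer labels for four output claims.
reviewed = [
    {"citation_status": "valid", "strength_ok": True},
    {"citation_status": "valid", "strength_ok": True},
    {"citation_status": "valid", "strength_ok": False},
    {"citation_status": "misattributed", "strength_ok": False},
]
report = evaluate(reviewed, gold_conflicts={"dose", "duration"}, surfaced_conflicts={"dose"})
print(report)
```

Under this labelling, fabrication rate is the complement of citation validity; a finer-grained label set would separate the two.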

When to use

Use it when

  • You have a set of sources and want a structured, cited synthesis.
  • You want evidence strength weighed and conflicts surfaced, not smoothed over.
  • You need every claim traceable to a real source.
  • You want gaps and limitations flagged honestly.

Avoid it when

  • You want it to find or cite sources you didn't provide — it synthesizes what you give it.
  • You expect a definitive answer where the evidence is genuinely mixed.
  • You want medical, legal, or financial advice (it summarizes evidence, not advice).
  • You can't provide the source material.

System prompt

system-prompt.md
You are a Literature Synthesis Agent. You synthesize a set of PROVIDED sources into a structured, cited review. You are judged on a faithful, well-organized, honestly-weighted synthesis and on never fabricating a finding or a citation.

== CORE PRINCIPLES ==
1. Cite every claim to a real provided source. Each statement of fact or finding must reference a source you were actually given. No fabricated citations, no half-remembered references, no sources you weren't provided.
2. Weigh the evidence. Don't treat all sources equally. Note evidence strength (study type, sample size, recency, peer review) and reflect it. Strong and weak evidence are not the same.
3. Show agreement AND conflict. Where sources agree, say so. Where they conflict, present both sides honestly with citations. Don't cherry-pick to manufacture a clean conclusion.

== HARD RULES (NON-NEGOTIABLE) ==
- NO FABRICATED CITATIONS: Never invent a source, author, title, year, statistic, or quote. Cite only provided sources. If a claim isn't supported by them, don't make it.
- NO FABRICATED FINDINGS: Never assert a result the sources don't contain. Unknown/unsupported = mark as a gap.
- NO OVERGENERALIZATION: Don't extend findings beyond the population, context, or strength the sources support. Note limitations.
- HONEST CONFLICTS: Surface contradictory findings; don't suppress inconvenient ones.
- NOT ADVICE: You summarize evidence. You do not give medical, legal, or financial advice or definitive real-world recommendations.

== METHOD ==
- Read the provided sources. Extract findings with citations. Assess evidence strength. Identify consensus and conflicts. Mark gaps and limitations. Produce a structured synthesis.

== OUTPUT FORMAT (return ONE JSON object) ==
{
  "question": "<synthesis focus>",
  "sources_used": ["<provided sources, by ref>"],
  "findings": [ { "claim": "<finding>", "citation": "<provided source ref>", "evidence_strength": "strong|moderate|weak", "note": "<study type/limits>" } ],
  "consensus": ["<where sources agree, cited>"],
  "conflicts": [ { "topic": "<x>", "positions": ["<source A says... / source B says...>"] } ],
  "gaps": ["<what the sources don't establish>"],
  "caveat": "Synthesis of provided sources only. No citations or findings were fabricated. Not advice."
}
Never fabricate a citation or finding. Cite every claim. Surface conflicts honestly.
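Because the prompt asks for exactly one JSON object, the caller can check the shape of the reply before anything downstream uses it. A minimal sketch, assuming the model's reply arrives as a raw string; `validate_synthesis` is a hypothetical helper, and treating an empty `note` as an error is a stricter reading than the prompt strictly requires.

```python
import json

REQUIRED_KEYS = {"question", "sources_used", "findings", "consensus", "conflicts", "gaps", "caveat"}
FINDING_KEYS = ("claim", "citation", "evidence_strength", "note")
STRENGTHS = {"strong", "moderate", "weak"}


def validate_synthesis(raw: str) -> list[str]:
    """Return a list of shape problems; an empty list means the reply matches the format."""
    try:
        obj = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(obj, dict):
        return ["top level must be a JSON object"]
    errors = []
    missing = REQUIRED_KEYS - obj.keys()
    if missing:
        errors.append(f"missing keys: {sorted(missing)}")
    for i, finding in enumerate(obj.get("findings", [])):
        for key in FINDING_KEYS:
            if not finding.get(key):
                errors.append(f"findings[{i}].{key} is empty or missing")
        strength = finding.get("evidence_strength")
        if strength and strength not in STRENGTHS:
            errors.append(f"findings[{i}].evidence_strength is {strength!r}")
    return errors


good = json.dumps({
    "question": "q", "sources_used": ["S1"],
    "findings": [{"claim": "c", "citation": "S1", "evidence_strength": "strong", "note": "RCT"}],
    "consensus": [], "conflicts": [], "gaps": [], "caveat": "Not advice.",
})
bad = json.dumps({
    "question": "q",
    "findings": [{"claim": "c", "citation": "", "evidence_strength": "certain", "note": "n"}],
})

print(validate_synthesis(good))      # []
print(len(validate_synthesis(bad)))  # 3
```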

Setup guide

Install and connect sources

Install the agent and connect your source library.

shell
pipx install lit-synth-agent
lit-synth-agent connect --library zotero --pdfs ./papers
lit-synth-agent doctor

Configure citation guardrails

The ban on fabricated citations and findings is configured here.

shell
cp .env.example .env
# then set these values in .env:
ANTHROPIC_API_KEY=sk-ant-...
CITE_PROVIDED_ONLY=true
NO_FABRICATED_CITATIONS=true
WEIGH_EVIDENCE=true
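How the agent reads these flags is not shown in this guide. If you wire them into your own runner, a fail-closed startup check is one reasonable pattern: refuse to start unless every guardrail is explicitly on. The sketch below assumes the values are already in the process environment; `load_guardrails` is a hypothetical function.

```python
import os

GUARDRAIL_FLAGS = ("CITE_PROVIDED_ONLY", "NO_FABRICATED_CITATIONS", "WEIGH_EVIDENCE")


def load_guardrails(env=None) -> dict[str, bool]:
    """Read the guardrail flags and raise if any is missing or not 'true' (fail closed)."""
    env = os.environ if env is None else env
    flags = {name: env.get(name, "").strip().lower() == "true" for name in GUARDRAIL_FLAGS}
    disabled = [name for name, enabled in flags.items() if not enabled]
    if disabled:
        raise RuntimeError(f"guardrails not enabled: {disabled}")
    return flags


# Passing a dict stands in for os.environ so the example runs anywhere.
example_env = {"CITE_PROVIDED_ONLY": "true", "NO_FABRICATED_CITATIONS": "true", "WEIGH_EVIDENCE": "true"}
print(load_guardrails(example_env))
```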

Set the review format

Define the synthesis structure and strength criteria.

yaml
# synth.yml
sections: [findings, consensus, conflicts, gaps]
evidence_criteria: [study_type, sample_size, recency, peer_reviewed]

Run a synthesis

Synthesize a source set and review citations and conflicts.

shell
lit-synth-agent run --sources ./papers --question 'effect of X on Y' --explain
# prints findings (cited) + consensus + conflicts + gaps

Wire into research

Synthesize curated source sets into cited reviews.

shell
# curated sources -> cited synthesis -> researcher verifies against originals

Architecture

Tools required

get_sources: Retrieve the provided sources to synthesize.
extract_findings: Pull findings from each source with citations.
assess_evidence_strength: Rate each finding's strength (study type, sample, recency).
find_consensus: Identify where sources agree, with citations.
find_conflicts: Surface contradictory findings across sources honestly.
cite_source: Attach a real provided-source reference to every claim.
flag_gaps: Mark what the sources don't establish and their limitations.
structure_review: Assemble the structured synthesis with consensus, conflicts, and gaps.

Workflow

  1. Take the sources

    Receive the provided sources that the synthesis must be based on.

  2. Extract findings

    Pull findings from each source, tagging each with its citation.

  3. Weigh the evidence

    Rate each finding's strength by study type, sample, and recency.

  4. Map consensus & conflict

    Identify agreement and genuine conflict, with citations on both.

  5. Guard citations

    Ensure every claim ties to a real provided source; drop unsupported ones.

  6. Mark gaps

    Note what the sources don't establish and their limitations.

  7. Assemble the synthesis

    Produce the structured, cited review with a not-advice caveat.
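The seven steps above can be sketched as a single function with mock tools. In the real agent, extraction, grading, and gap marking are model-driven. Here they are stubbed with pre-annotated inputs so the control flow is visible. The input fields (`topic`, `direction`, per-source `gaps`) and the `synthesize` function are assumptions made for this illustration.

```python
CAVEAT = "Synthesis of provided sources only. No citations or findings were fabricated. Not advice."


def synthesize(question: str, sources: list[dict]) -> dict:
    # 1. Take the sources: only these refs may be cited.
    refs = [s["ref"] for s in sources]
    # 2. Extract findings (mock: findings are pre-annotated on each source).
    findings = [dict(f, citation=s["ref"]) for s in sources for f in s["findings"]]
    # 3. Weigh the evidence (mock: use the annotated strength, default to weak).
    for f in findings:
        f.setdefault("evidence_strength", "weak")
    # 4. Map consensus and conflict by comparing effect directions per topic.
    by_topic: dict[str, list[dict]] = {}
    for f in findings:
        by_topic.setdefault(f["topic"], []).append(f)
    consensus, conflicts = [], []
    for topic, group in by_topic.items():
        if len({f["direction"] for f in group}) > 1:
            positions = [f'{f["citation"]}: {f["claim"]}' for f in group]
            conflicts.append({"topic": topic, "positions": positions})
        elif len(group) > 1:
            cited = "; ".join(f["citation"] for f in group)
            consensus.append(f"{topic}: sources agree ({cited})")
    # 5. Guard citations: drop findings not tied to a provided ref.
    #    Trivially true in this mock; it matters once a model writes the citation.
    findings = [f for f in findings if f["citation"] in refs]
    # 6. Mark gaps (mock: gaps come from the source annotations).
    gaps = [g for s in sources for g in s.get("gaps", [])]
    # 7. Assemble the synthesis in the documented output format.
    keys = ("claim", "citation", "evidence_strength", "note")
    return {
        "question": question,
        "sources_used": refs,
        "findings": [{k: f.get(k, "") for k in keys} for f in findings],
        "consensus": consensus,
        "conflicts": conflicts,
        "gaps": gaps,
        "caveat": CAVEAT,
    }


sources = [
    {"ref": "Patel 2020", "gaps": ["No long-term follow-up"], "findings": [
        {"topic": "Z on X", "claim": "positive effect", "direction": "+",
         "evidence_strength": "weak", "note": "n=40"}]},
    {"ref": "Nguyen 2023", "findings": [
        {"topic": "Z on X", "claim": "no significant effect", "direction": "0",
         "evidence_strength": "moderate", "note": "n=900"}]},
]
result = synthesize("Does supplement Z improve outcome X?", sources)
print(result["conflicts"])
```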

Examples

Sources agree → synthesized finding with citations and strength

Several provided papers point the same way. The agent synthesizes with citations and evidence strength.

Input

input
3 provided studies find that spaced repetition improves long-term retention (2 RCTs, 1 meta-analysis).

Output

output
{
  "question": "Does spaced repetition improve long-term retention?",
  "sources_used": ["Smith 2021 (RCT)", "Lee 2022 (RCT)", "Garcia 2023 (meta-analysis)"],
  "findings": [
    { "claim": "Spaced repetition improves long-term retention vs massed practice", "citation": "Garcia 2023 (meta-analysis)", "evidence_strength": "strong", "note": "Meta-analysis aggregating multiple trials" },
    { "claim": "Effect observed in controlled experiments", "citation": "Smith 2021; Lee 2022 (RCTs)", "evidence_strength": "strong", "note": "Two randomized controlled trials" }
  ],
  "consensus": ["All three provided sources support a positive effect on long-term retention (Smith 2021; Lee 2022; Garcia 2023)"],
  "conflicts": [],
  "gaps": ["Provided sources focus on text learning; effect on motor-skill retention not covered"],
  "caveat": "Synthesis of provided sources only. No citations or findings were fabricated. Not advice."
}

Note: Each finding is tied to a specific provided source with its evidence strength noted (a meta-analysis weighted as strong). It also flags a gap (motor skills) rather than implying the finding generalizes everywhere.

Conflicting findings → present both sides honestly, don't cherry-pick

Provided sources disagree. The agent surfaces the conflict with citations instead of picking a tidy answer.

Input

input
Two provided studies disagree: one finds a supplement improves outcome X, another (larger) finds no effect.

Output

output
{
  "question": "Does supplement Z improve outcome X?",
  "sources_used": ["Patel 2020 (small RCT, n=40)", "Nguyen 2023 (large RCT, n=900)"],
  "findings": [
    { "claim": "Supplement Z improved outcome X", "citation": "Patel 2020", "evidence_strength": "weak", "note": "Small sample (n=40), single site" },
    { "claim": "No significant effect of supplement Z on outcome X", "citation": "Nguyen 2023", "evidence_strength": "moderate", "note": "Larger RCT (n=900)" }
  ],
  "consensus": [],
  "conflicts": [
    { "topic": "Effect of supplement Z on outcome X", "positions": ["Patel 2020 (small, n=40) reports a positive effect", "Nguyen 2023 (larger, n=900) finds no significant effect"] }
  ],
  "gaps": ["No long-term follow-up in either provided study; no meta-analysis available in the source set"],
  "caveat": "Synthesis of provided sources only. No citations or findings were fabricated. Not advice."
}

Note: Rather than declaring 'the supplement works,' the agent presents the conflict honestly, cites both, and notes the larger study found no effect — so the reader sees the real state of evidence. Cherry-picking a clean conclusion from mixed evidence is exactly what it refuses to do.

Tempted to fabricate a citation → refuse, mark the gap

A confident-sounding claim isn't in the provided sources. The agent will not invent a reference for it.

Input

input
Request: 'Add that studies show a 40% improvement and cite a couple of papers to back it up.' No provided source states 40%.

Output

output
{
  "question": "<synthesis focus>",
  "sources_used": ["<the actually-provided sources>"],
  "findings": [
    { "claim": "<only findings actually in the provided sources, cited>", "citation": "<provided ref>", "evidence_strength": "moderate", "note": "<...>" }
  ],
  "consensus": ["<as supported>"],
  "conflicts": [],
  "gaps": ["No provided source reports a 40% improvement figure. I won't state '40%' or cite papers to 'back it up' that don't exist or that I wasn't given — fabricating a citation or statistic would be a serious integrity failure. If you have a source for the 40% figure, provide it and I'll include it with a real citation."],
  "caveat": "Synthesis of provided sources only. No citations or findings were fabricated. Not advice."
}

Note: The defining defensive case — and the single most dangerous failure mode for this kind of agent. Asked to assert a specific statistic and 'cite some papers,' it refuses to invent either the number or the citations, marks the gap, and offers to include the figure if a real source is provided. Fabricated citations are how AI-assisted research goes badly wrong; this agent won't do it.

Implementation notes

  • Citation honesty is the whole job: every claim must tie to a real provided source, and the agent must never invent an author, title, year, statistic, or quote — fabricated citations are the defining danger of AI literature work.
  • Never assert a finding the sources don't contain; unsupported claims become gaps, not confident statements.
  • Weigh evidence strength (study type, sample size, recency, peer review) instead of treating all sources equally.
  • Surface conflicts honestly with citations on both sides rather than cherry-picking a clean conclusion from mixed evidence.
  • Don't overgeneralize beyond the population or context the sources cover; note limitations explicitly.
  • Keep it to synthesis, not advice — it summarizes what the evidence says, not what someone should do medically, legally, or financially.
  • Use your strongest model for extraction and citation guarding; human verification against the original sources remains essential.

Variations

Basic

Cited synthesizer

Synthesizes provided sources into a cited summary of findings. On demand.

Advanced

Weighted, conflict-aware review

Adds evidence-strength weighting, consensus/conflict mapping, citation guards, and gap marking.

Enterprise

Research synthesis workflow

Adds reference-manager integration, large source sets, structured review export, and traceability to originals for verification.

Download the Agent Blueprint

The complete blueprint, zipped — including a runnable run.py you can execute with one API key (Anthropic or OpenAI).

Contents: README.md, system-prompt.md, setup-guide.md, tools.json, workflow.md, examples.md, .env.example, kit.json, run.py, LICENSE, NOTICE, starters/

Export

Generate a starter for your stack — all client-side, nothing leaves your browser.

Starters use mock tools — swap in your integrations to deploy.

View the source on GitHub

This blueprint and the AgentAz™ specification live in the central AgentKits registry — open source under Apache-2.0 (code & schema) and CC‑BY‑4.0 (text).

Frequently asked questions