AgentKits

Resume Screening Agent

Production Blueprint
Includes Agent Blueprint + Implementation Guide

An agent that screens resumes against a specific role's stated requirements and produces a structured, evidence-cited assessment — which qualifications are met, which aren't, and where information is missing — plus interview focus areas and a recommendation to a human. It is built for fair, auditable hiring: it scores only job-relevant criteria, never uses or infers protected characteristics or their proxies, cites evidence from the resume for every claim, refuses to auto-reject, and flags gaps instead of guessing. The goal is to make screening faster and more consistent while keeping a human accountable for every decision.

Tags: recruiting, hr, resume-screening, hiring, fairness, bias-mitigation, autonomous-agent, ats, agentaz, agent-governance, trust-level, production-readiness
Stack: Claude, LangGraph, OpenAI
Difficulty: Advanced
Setup: 45 min
Version: 2.0.0 · 2026-06-21

Overview

Screens each resume against the role's required and preferred qualifications, with evidence cited from the resume for every met/unmet call.

Fairness-first: it scores only job-relevant criteria and never uses or infers protected attributes (age, gender, race, disability, etc.) or proxies like name, photo, or graduation year.

Human-in-the-loop by design: it recommends advance / review / decline-with-reasons but never auto-rejects — a recruiter decides.

Auditable: structured, explainable output with citations supports consistent, defensible screening.

AgentAz™ specification

A lightweight, design-time governance spec for security review. It documents what this agent is authorized to do — and why — and pairs with whatever policy engine you already run. It does not enforce anything at runtime.

Trust Level: A2 (Recommend)
DNA Pattern: Evaluation (Research → Evaluate)
Worst-Case Action: Produces a biased or incorrect screening assessment, surfaced for a recruiter to review. It cannot reject, advance, contact a candidate, or make any hiring decision, because execution tools are absent from its registry.
Authority Boundary: Reviews a resume against job-relevant, documented criteria and surfaces a structured assessment for human review. It never rejects, advances, or contacts candidates, and never makes a hiring decision. A recruiter decides, and protected-characteristic factors are excluded.
Verification Test: Attempt to call a reject, advance, or candidate-contact tool and confirm it is absent; confirm the criteria exclude protected characteristics.
Production Readiness: 6/6 dimensions passing. Tool isolation: decision/contact tools absent. Human gates: a recruiter decides. Confidence escalation: borderline candidates flagged for review, never auto-rejected. Cost ceiling: bounded per resume. Audit trail: assessment and criteria logged for fairness review. Escalation path: ambiguous cases routed to a recruiter.
Last Reviewed: 2026-06-24

Machine-readable contract (agentaz.json), validated against the open AgentAz™ JSON Schema — bundled for offline use and published at a permanent URL:

agentaz.json
{
  "$schema": "./agentaz.schema.json",
  "version": "2.0.0",
  "last_reviewed": "2026-06-24",
  "agent_id": "resume-screening-agent",
  "trust_level": "A2",
  "dna_pattern": "Evaluation",
  "worst_case_action": "Produces a biased/incorrect assessment for recruiter review. Cannot reject, advance, or contact candidates.",
  "authority_boundary": "Assesses resumes against job-relevant criteria for human review; decision/contact tools absent.",
  "tags": [
    "hr",
    "recruiting",
    "resume-screening",
    "fairness",
    "human-review"
  ],
  "tool_boundary": {
    "allowed_tools": [
      "read_resume",
      "match_criteria",
      "summarize_fit",
      "flag_for_review"
    ],
    "execution_tools_absent": true
  },
  "output_boundary": {
    "format": "structured_json",
    "never_emits": [
      "reject",
      "advance",
      "contact_candidate",
      "hiring_decision"
    ],
    "excludes_protected_characteristics": true
  },
  "cost_boundary": {
    "max_usd_per_trace_loop": 0.22,
    "alert_threshold_usd": 0.15
  },
  "loop_boundary": {
    "max_reasoning_turns": 8
  },
  "human_handoff": {
    "triggers": [
      "borderline_fit",
      "insufficient_info",
      "low_confidence"
    ],
    "destination": "recruiter"
  },
  "audit": {
    "append_only": true,
    "logs": [
      "assessment",
      "criteria_used"
    ]
  }
}
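The verification test can be approximated as a design-time check of a tool registry against the contract. The following is a minimal sketch, not part of the blueprint: `registry_violations` is a hypothetical helper, and the contract is an abridged inline copy (in practice you would load agentaz.json from disk).

```python
import json

# Abridged copy of the contract above (assumption: only tool_boundary is needed here).
CONTRACT = json.loads("""
{
  "tool_boundary": {
    "allowed_tools": ["read_resume", "match_criteria", "summarize_fit", "flag_for_review"],
    "execution_tools_absent": true
  }
}
""")


def registry_violations(registry, contract):
    """Return registered tool names that the contract does not allow (sorted)."""
    allowed = set(contract["tool_boundary"]["allowed_tools"])
    return sorted(set(registry) - allowed)


# A conforming registry yields no violations; an execution tool is reported by name.
print(registry_violations({"read_resume", "match_criteria"}, CONTRACT))
print(registry_violations({"read_resume", "reject", "contact_candidate"}, CONTRACT))
```

Because the spec does not enforce anything at runtime, a check like this belongs in CI or a security review, alongside whatever policy engine actually gates tool calls.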

New to this? Read the AgentAz specification guide — Trust Levels, DNA patterns, and how it complements your runtime.

AgentAz™ is open source under Apache-2.0 — schema (frozen v1.0.0) and source on GitHub.

Governance matrix

A scannable summary of this blueprint's governance coverage, derived from its AgentAz™ specification. It documents the boundaries that already ship — not new functionality.

Agent goal: Bounded by the authority spec above
Trust Level: A2 (Recommend)
Tool access: Least privilege; execution tools absent (read-only)
Context handling: Grounded in provided inputs; cites or flags rather than guessing
Memory strategy: Task-scoped; no persistent cross-session memory
Human approval: Required on borderline fit, insufficient info, or low confidence; routed to a recruiter
Audit trail: Append-only log (assessment, criteria used)
Cost & loop bounds: ≤ $0.22 per loop · ≤ 8 reasoning turns
Recovery / escalation: Escalates to recruiter
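One way to hold the cost and loop bounds is a small budget object that the orchestration loop consults before each reasoning turn. This is a sketch under assumptions: `LoopBudget` and `BudgetExceeded` are hypothetical names, and the per-turn cost is supplied by the caller (for example, from token usage).

```python
from dataclasses import dataclass

MAX_USD_PER_LOOP = 0.22  # cost_boundary.max_usd_per_trace_loop
ALERT_USD = 0.15         # cost_boundary.alert_threshold_usd
MAX_TURNS = 8            # loop_boundary.max_reasoning_turns


class BudgetExceeded(Exception):
    """Raised when the next turn would cross a bound; the caller escalates to a recruiter."""


@dataclass
class LoopBudget:
    spent_usd: float = 0.0
    turns: int = 0
    alerted: bool = False

    def record_turn(self, cost_usd: float) -> None:
        """Account for one reasoning turn, refusing it if a bound would be exceeded."""
        if self.turns + 1 > MAX_TURNS:
            raise BudgetExceeded("turn limit reached")
        if self.spent_usd + cost_usd > MAX_USD_PER_LOOP:
            raise BudgetExceeded("cost ceiling reached")
        self.turns += 1
        self.spent_usd += cost_usd
        if self.spent_usd >= ALERT_USD:
            self.alerted = True  # crossed the alert threshold, still under the ceiling
```

Raising before the turn runs, rather than after, keeps the spend at or under the ceiling instead of detecting an overrun after the fact.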

Agent component mapping

A framework-neutral view of how this blueprint maps to standard agent-architecture components (the vocabulary common to ADK-style frameworks). It describes structure for clarity — not an official integration or certified compatibility.

Agent: Primary reasoner with Recommend authority (A2)
Tools: read resume, match criteria, summarize fit, flag for review; execution tools absent (read-only)
Memory: Task-scoped working context; no persistent cross-session memory
Guardrails: Worst-case classified (A2); no execution tools; ≤ $0.22/loop · ≤ 8 turns
Evaluator: Confidence and authority-boundary checks; low-confidence or out-of-bounds results are flagged, not actioned
Handoff: Escalates to recruiter on borderline fit, insufficient info, or low confidence

Failure modes

Specific ways this blueprint can fail, and how it is designed to detect, contain, and recover from each — the boundaries that make it safe to run, stated plainly.

Produces a biased screen using a protected characteristic.

Detection
Criteria exclude protected characteristics by design and outputs are auditable for fairness.
Mitigation
It assesses against job-relevant criteria only and never decides.
Recovery
A recruiter reviews; the assessment and criteria are logged for fairness review.

Auto-rejects a qualified candidate.

Detection
It has no reject or advance tool and borderline cases are flagged.
Mitigation
It surfaces an assessment for review, never a decision.
Recovery
The recruiter decides — nothing is auto-rejected.

Over-weights keyword matching, missing a strong non-standard candidate.

Detection
Borderline and unusual profiles are flagged for review.
Mitigation
Positioned as a screening aid, not a gate.
Recovery
The recruiter reviews flagged profiles.

Evaluation

Job-relevant assessment quality and fairness are primary — a biased screen or an auto-rejection is the failure.

Assessment agreement: Agreement of assessments with recruiter judgments on job-relevant criteria.
Fairness audit: Outcome parity checks across protected groups on matched-qualification sets.
Protected-attribute leakage: Whether protected characteristics influence the output; the target is no influence.
Auditability: Share of assessments with logged, reviewable criteria.
Latency: Time per resume.

Recommended approach. Have recruiters label resumes on job-relevant criteria, measure how often the agent's assessments agree with those labels, and run a fairness audit comparing outcomes across protected groups on matched-qualification pairs. Throughout the evaluation the agent surfaces assessments only and never auto-rejects.
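The two headline metrics can be computed with a few lines of code. This sketch assumes simple percent agreement and a lowest-to-highest advance-rate ratio; the function names are hypothetical, and the group labels are assumed to come from a separate audit dataset that the agent itself never sees.

```python
from collections import defaultdict


def agreement_rate(agent: list[str], recruiter: list[str]) -> float:
    """Share of resumes where the agent's recommendation equals the recruiter's label."""
    if not agent or len(agent) != len(recruiter):
        raise ValueError("need two non-empty label lists of equal length")
    return sum(a == r for a, r in zip(agent, recruiter)) / len(agent)


def advance_rates(records: list[tuple[str, str]]) -> dict[str, float]:
    """Per-group share of 'advance' recommendations from (group, recommendation) pairs."""
    totals: dict[str, int] = defaultdict(int)
    advanced: dict[str, int] = defaultdict(int)
    for group, recommendation in records:
        totals[group] += 1
        if recommendation == "advance":
            advanced[group] += 1
    return {group: advanced[group] / totals[group] for group in totals}


def impact_ratio(rates: dict[str, float]) -> float:
    """Lowest group rate divided by the highest; 1.0 means parity."""
    highest = max(rates.values())
    return 1.0 if highest == 0 else min(rates.values()) / highest
```

A commonly cited heuristic for reading the ratio is the four-fifths rule (a ratio below 0.8 warrants a closer look); which threshold applies to your process is a decision for your compliance team, not something this sketch settles.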

When to use

Use it when

  • You receive high applicant volume for defined roles and want a faster, more consistent first pass.
  • You have clear, job-relevant requirements (required vs. preferred) the agent can screen against.
  • You need explainable, auditable assessments with evidence — for quality and compliance.
  • You want to surface the strongest candidates and structured interview focus areas while a human owns every decision.

Avoid it when

  • You want it to auto-reject or make final hiring decisions — it must not, and those stay with humans.
  • You don't have job-relevant criteria defined, so screening would be subjective.
  • You intend to screen on anything other than job-relevant qualifications — this agent explicitly won't.
  • You can't keep recruiter review on outcomes or maintain an audit trail.

System prompt

system-prompt.md
You are a Resume Screening Agent supporting a hiring team. You assess ONE resume against ONE role's stated, job-relevant requirements and produce an evidence-based, fair, auditable summary for a human recruiter. You do NOT make hiring decisions. You are judged on accuracy, fairness, and never overstepping into auto-rejection or biased reasoning.

== FAIRNESS RULES (NON-NEGOTIABLE) ==
1. Job-relevant criteria ONLY. Evaluate against the role's required/preferred qualifications, skills, and experience. Nothing else.
2. NEVER use or infer protected characteristics: age, gender, race, ethnicity, national origin, religion, disability, pregnancy/family status, sexual orientation, or anything similar. Do not estimate them. If the resume reveals them, ignore them entirely.
3. NO PROXIES. Do not use name, photo, address/neighborhood, graduation years, citizenship (beyond a bona-fide work-authorization requirement), gaps you can't tie to a job-relevant fact, or school 'prestige' as a stand-in for quality. Judge demonstrated, job-relevant evidence.
4. EVIDENCE OR IT DIDN'T HAPPEN. Cite the specific resume content supporting each met/unmet finding. Never credit a qualification the resume doesn't show, and never fabricate one.

== HARD RULES ==
- NO AUTO-REJECTION / NO DECISION: You recommend; a human decides. Never output a final 'reject'/'hire'. Use advance / review / decline_with_reasons as a recommendation only.
- FLAG MISSING INFO: If you can't tell whether a requirement is met, mark it 'unclear' and say what's missing — do not assume present or absent.
- EXPLAINABILITY: Every recommendation must be traceable to job-relevant evidence and the role's criteria, suitable for audit.
- CONSISTENCY: Apply the same criteria to every candidate; do not invent new bars for one resume.

== METHOD ==
- Load the role's requirements (required vs. preferred) and the resume.
- For each requirement, decide met / not_met / unclear, with a cited evidence snippet.
- Note relevant strengths and genuine, job-relevant gaps. Propose interview focus areas to probe unclear items.
- Recommend advance / review / decline_with_reasons — based only on requirement coverage.

== RECOMMENDATION POLICY ==
- advance: meets the required qualifications with evidence.
- review: mixed/borderline or important items unclear — a human should look closely.
- decline_with_reasons: clearly misses required, job-relevant qualifications, with the specific unmet requirements cited. (Still a recommendation, not a rejection.)

== OUTPUT FORMAT (return ONE JSON object) ==
{
  "role": "<title>",
  "recommendation": "advance|review|decline_with_reasons",
  "requirements": [
    { "requirement": "<job-relevant requirement>", "status": "met|not_met|unclear", "evidence": "<cited resume snippet or 'not found'>" }
  ],
  "strengths": ["<job-relevant strengths with evidence>"],
  "gaps": ["<job-relevant gaps or unclear items>"],
  "interview_focus": ["<areas to probe>"],
  "fairness_note": "Assessed only on job-relevant criteria; protected attributes and proxies were not considered.",
  "human_review_required": true
}
If a required qualification is 'unclear', prefer 'review' over 'decline'. Never base any field on a protected attribute or proxy.
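Since the prompt fixes an output shape, the caller can check each response before it reaches a recruiter. The following is a minimal sketch with a hypothetical `validate_assessment` helper; because the output format does not mark which requirements are required versus preferred, it conservatively treats any 'unclear' item as grounds to prefer review over decline.

```python
import json

RECOMMENDATIONS = {"advance", "review", "decline_with_reasons"}
STATUSES = {"met", "not_met", "unclear"}


def validate_assessment(raw: str) -> list[str]:
    """Return a list of problems found in the agent's JSON output (empty if none)."""
    data = json.loads(raw)
    problems = []
    if data.get("recommendation") not in RECOMMENDATIONS:
        problems.append("recommendation is not advance/review/decline_with_reasons")
    if data.get("human_review_required") is not True:
        problems.append("human_review_required must be true")
    requirements = data.get("requirements", [])
    if not requirements:
        problems.append("no requirements were assessed")
    for index, item in enumerate(requirements):
        if item.get("status") not in STATUSES:
            problems.append(f"requirements[{index}]: invalid status")
        if not str(item.get("evidence", "")).strip():
            problems.append(f"requirements[{index}]: missing evidence")
    has_unclear = any(item.get("status") == "unclear" for item in requirements)
    if has_unclear and data.get("recommendation") == "decline_with_reasons":
        problems.append("unclear item present: prefer 'review' over decline")
    return problems
```

An assessment that fails validation can be routed to a recruiter with the problems attached rather than silently repaired.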

Setup guide

Install and connect your ATS

Install the agent and connect it to your applicant tracking system.

shell
pipx install resume-screen-agent
resume-screen-agent connect --ats greenhouse
resume-screen-agent doctor

Configure fairness guardrails

These are on by default and enforced outside the model. Keep human review required.

shell
cp .env.example .env
# then set these values in .env:
ANTHROPIC_API_KEY=sk-ant-...
AUTO_REJECT=false            # never; recommendations only
IGNORE_PROTECTED=true
IGNORE_PROXIES=["name","photo","address","graduation_year","school_prestige"]
HUMAN_REVIEW_REQUIRED=true
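"Enforced outside the model" can be read as a startup check that refuses to run with unsafe settings. The sketch below assumes the variables above have already been loaded into a mapping (for example `os.environ`); `load_guardrails` is a hypothetical helper and is not the blueprint's actual loader.

```python
import json


def load_guardrails(env: dict[str, str]) -> dict:
    """Parse guardrail settings and fail closed if any of them is weakened."""

    def flag(name: str, default: str) -> bool:
        return env.get(name, default).strip().lower() == "true"

    config = {
        "auto_reject": flag("AUTO_REJECT", "false"),
        "ignore_protected": flag("IGNORE_PROTECTED", "true"),
        "human_review_required": flag("HUMAN_REVIEW_REQUIRED", "true"),
        "ignore_proxies": json.loads(env.get("IGNORE_PROXIES", "[]")),
    }
    if config["auto_reject"]:
        raise ValueError("AUTO_REJECT must stay false: recommendations only")
    if not config["ignore_protected"]:
        raise ValueError("IGNORE_PROTECTED must stay true")
    if not config["human_review_required"]:
        raise ValueError("HUMAN_REVIEW_REQUIRED must stay true")
    return config
```

Usage would look like `load_guardrails(dict(os.environ))` at process start, so a misconfigured deployment stops before it screens anything.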

Define job-relevant criteria per role

List required vs. preferred qualifications. This is the only basis for screening.

yaml
# roles/backend-engineer.yml
required:
  - "3+ years building production backend services"
  - "Proficiency in a modern backend language (Go/Java/Python/Node)"
  - "Experience with relational databases"
preferred:
  - "Experience with distributed systems"
  - "Cloud (AWS/GCP/Azure)"
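Once loaded, a role file like this can be held in a small immutable structure so the same criteria are applied to every candidate. This is a sketch only: `RoleCriteria` is a hypothetical type, and the values are copied inline from the file above rather than parsed from YAML.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RoleCriteria:
    """Job-relevant criteria for one role; frozen so they cannot drift mid-run."""
    role: str
    required: tuple[str, ...]
    preferred: tuple[str, ...] = ()

    def __post_init__(self) -> None:
        if not self.required:
            raise ValueError("a role needs at least one required qualification")
        if any(not item.strip() for item in self.required + self.preferred):
            raise ValueError("criteria must not be blank")


BACKEND_ENGINEER = RoleCriteria(
    role="backend-engineer",
    required=(
        "3+ years building production backend services",
        "Proficiency in a modern backend language (Go/Java/Python/Node)",
        "Experience with relational databases",
    ),
    preferred=(
        "Experience with distributed systems",
        "Cloud (AWS/GCP/Azure)",
    ),
)
```

Rejecting an empty required list up front reflects the guidance that screening without defined criteria would be subjective.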

Dry-run and review the audit trail

Screen a few resumes and inspect the cited evidence and fairness note before going live.

shell
resume-screen-agent screen --role backend-engineer --resume ./candidate.pdf --explain
# prints per-requirement status + evidence, gaps, interview_focus, recommendation

Integrate as a first-pass assistant

Surface assessments to recruiters in the ATS. The agent assists; recruiters decide.

shell
# ATS webhook -> POST https://your-host/screen
# assessment attaches to the candidate; recruiter makes the call

Architecture

Tools required

get_requisition: Load the role's job-relevant requirements (required vs. preferred), skills, and experience criteria.
get_resume: Retrieve the candidate's resume/application content for screening.
parse_resume: Structure the resume into skills, roles, durations, and accomplishments for requirement matching.
redact_protected: Remove/neutralize name, photo, and proxy signals so scoring is driven only by job-relevant evidence.
match_requirements: Compare the resume to each requirement and return met/not_met/unclear with a cited evidence snippet.
evidence_cite: Attach the exact resume text supporting a finding so the assessment is auditable.
summarize_assessment: Assemble the cited findings, strengths, gaps, and interview focus into a structured recruiter summary.
route_to_recruiter: Send the assessment and recommendation to a human recruiter; never finalizes a hire/reject decision.
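As an illustration of the redaction step, here is a minimal sketch of what a `redact_protected` tool might do. It assumes the parsed resume is a flat dictionary keyed by field name, which the blueprint does not specify, and it reuses the proxy list from the setup guide's IGNORE_PROXIES setting.

```python
import copy

# Assumption: field names mirror the IGNORE_PROXIES list in the setup guide.
PROXY_FIELDS = ("name", "photo", "address", "graduation_year", "school_prestige")


def redact_protected(parsed_resume: dict, proxy_fields=PROXY_FIELDS):
    """Return (redacted copy, sorted list of removed field names); input is not mutated."""
    redacted = {
        key: copy.deepcopy(value)
        for key, value in parsed_resume.items()
        if key not in proxy_fields
    }
    removed = sorted(key for key in parsed_resume if key in proxy_fields)
    return redacted, removed
```

Returning the removed field names gives the audit log a record of what was withheld from scoring. Field-level removal does not catch proxies embedded in free text, which is why the workflow also runs a separate fairness check on the reasoning.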

Workflow

  1. Load role criteria

    Pull the requisition's job-relevant required and preferred qualifications, the only yardstick used.

  2. Parse the resume

    Structure the resume into skills, experience, and accomplishments for matching.

  3. Neutralize proxies

    Strip/ignore name, photo, and other proxy signals so they can't influence scoring.

  4. Match requirement by requirement

    Decide met / not_met / unclear for each requirement, with a cited evidence snippet. No evidence, no credit; missing info is 'unclear'.

  5. Run the fairness check

    Confirm the reasoning used only job-relevant criteria and no protected attribute or proxy; strip and flag anything that slips in.

  6. Summarize & recommend

    Produce strengths, job-relevant gaps, interview focus areas, and a recommendation (advance/review/decline_with_reasons).

  7. Route to a human

    Send the assessment to a recruiter for the decision, and log it for audit and adverse-impact monitoring.
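The matching and recommendation steps above can be sketched as plain control flow. In this sketch the matcher is injected as a callable standing in for the model call, and the resume text is assumed to be already parsed and neutralized; `screen` and `recommend` are hypothetical names, not the blueprint's run.py.

```python
from typing import Callable

# A matcher returns (status, evidence) for one requirement against the resume text.
Matcher = Callable[[str, str], tuple[str, str]]


def recommend(findings: list[dict]) -> str:
    """Map required-item statuses to a recommendation; this is never a decision."""
    required = [f["status"] for f in findings if f["kind"] == "required"]
    if not required or "unclear" in required:
        return "review"  # missing criteria or missing info goes to a human
    if "not_met" in required:
        return "decline_with_reasons"
    return "advance"


def screen(role: str, criteria: dict, resume_text: str, matcher: Matcher) -> dict:
    """Match requirement by requirement, then summarize for a recruiter."""
    findings = []
    for kind in ("required", "preferred"):
        for requirement in criteria.get(kind, []):
            status, evidence = matcher(requirement, resume_text)
            if status == "met" and not evidence.strip():
                status, evidence = "unclear", "not found"  # no evidence, no credit
            findings.append({"requirement": requirement, "kind": kind,
                             "status": status, "evidence": evidence})
    return {"role": role, "recommendation": recommend(findings),
            "requirements": findings, "human_review_required": True}
```

Checking for 'unclear' before 'not_met' encodes the prompt's rule that an unclear required item leads to review rather than decline, and `human_review_required` is hard-coded so no code path can turn it off.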

Examples

Strong match → advance with cited evidence

A candidate clearly meets the required qualifications. The agent advances them and cites the resume evidence per requirement.

Input

input
Role: Backend Engineer (required: 3+ yrs backend, modern backend language, relational DBs). Resume shows 5 years building Go services at a SaaS company, Postgres, and a high-throughput payments API.

Output

output
{
  "role": "Backend Engineer",
  "recommendation": "advance",
  "requirements": [
    { "requirement": "3+ years building production backend services", "status": "met", "evidence": "'5 years building backend services' + roles dated 2021–2026" },
    { "requirement": "Modern backend language (Go/Java/Python/Node)", "status": "met", "evidence": "'Built and maintained Go microservices'" },
    { "requirement": "Relational databases", "status": "met", "evidence": "'Designed Postgres schemas for a high-throughput payments API'" }
  ],
  "strengths": ["High-throughput payments API experience (relevant to our scale)", "5 years > the 3-year bar"],
  "gaps": [],
  "interview_focus": ["Depth of distributed-systems experience (a preferred item)", "Ownership scope on the payments API"],
  "fairness_note": "Assessed only on job-relevant criteria; protected attributes and proxies were not considered.",
  "human_review_required": true
}

Note: Every 'met' is backed by a specific resume snippet — no credit without evidence. The recommendation is 'advance,' but note human_review_required stays true; the agent surfaces a strong candidate, the recruiter still owns the decision.

Required qualification unclear → review, not decline

The resume is strong but doesn't clearly show one required item. The agent marks it unclear and recommends review instead of guessing.

Input

input
Role: Data Engineer (required: experience with a workflow orchestrator like Airflow/Dagster). Resume lists 'built ETL pipelines' and 'scheduled data jobs' but never names an orchestrator.

Output

output
{
  "role": "Data Engineer",
  "recommendation": "review",
  "requirements": [
    { "requirement": "Experience with a workflow orchestrator (Airflow/Dagster/etc.)", "status": "unclear", "evidence": "'built ETL pipelines' and 'scheduled data jobs' — orchestrator not named" },
    { "requirement": "SQL and data modeling", "status": "met", "evidence": "'Designed dimensional models in Snowflake'" },
    { "requirement": "Python for data engineering", "status": "met", "evidence": "'Python ETL with pandas and PySpark'" }
  ],
  "strengths": ["Strong SQL/modeling and Python evidence"],
  "gaps": ["Orchestrator experience not explicitly stated — may exist but isn't shown"],
  "interview_focus": ["Which orchestrator they've used (Airflow/Dagster?) and at what scale"],
  "fairness_note": "Assessed only on job-relevant criteria; protected attributes and proxies were not considered.",
  "human_review_required": true
}

Note: The candidate 'scheduled data jobs,' which might mean an orchestrator — but the resume doesn't say. Rather than assume met or not_met, the agent marks it 'unclear,' recommends review, and turns it into an interview question. Flagging missing info instead of guessing is what keeps screening fair and accurate.

Proxy bait → ignores protected signals, scores on evidence only

The resume contains signals a biased screener might latch onto (a name suggesting ethnicity/gender, a 1998 graduation year implying age, a 'career gap'). The agent explicitly disregards them and judges only job-relevant evidence.

Input

input
Role: Product Manager (required: 4+ yrs PM experience shipping software products). Resume: name and photo present; 'BA 1998'; a 2015–2018 gap; then 'Senior PM, 2018–2026, led 3 product launches generating $20M revenue.'

Output

output
{
  "role": "Product Manager",
  "recommendation": "advance",
  "requirements": [
    { "requirement": "4+ years PM experience shipping products", "status": "met", "evidence": "'Senior PM 2018–2026, led 3 product launches' (8 years, multiple shipped products)" }
  ],
  "strengths": ["Shipped 3 product launches with quantified $20M revenue impact (job-relevant outcome)"],
  "gaps": [],
  "interview_focus": ["Role and scope on the 3 launches; how success was measured"],
  "fairness_note": "I did not consider the candidate's name, photo, graduation year, or the 2015–2018 gap. Graduation year and gaps are not job-relevant here and can encode age/other protected signals; assessment is based solely on demonstrated PM experience and outcomes.",
  "human_review_required": true
}

Note: The defining fairness example: the resume is full of proxy bait — a name, a photo, a 1998 graduation year (age signal), and an employment gap. The agent explicitly states it ignored all of them and scores only the demonstrated, job-relevant PM experience. The fairness_note makes the non-consideration auditable, which is exactly what defensible, compliant screening requires.

Implementation notes

  • Make 'no auto-rejection' an enforced rule, not a preference: the agent recommends and a human decides every outcome. This is both an ethics and a compliance requirement.
  • Neutralize proxies (name, photo, graduation year, address, school prestige) before scoring, and have the fairness gate strip any protected-attribute reasoning that slips into the output.
  • Require cited evidence for every met/unmet finding; 'unclear' (with what's missing) is the correct answer when the resume doesn't show it — never assume.
  • Apply identical criteria to every candidate for a role; consistency is what makes the process fair and auditable.
  • Log assessments and recruiter outcomes and monitor for adverse-impact patterns; screening tools must be watched, not trusted blindly.
  • Keep the criteria strictly job-relevant and documented per role — vague or non-job-related bars are where bias and legal risk enter.
  • The strong model earns its cost on evidence-based requirement matching, while a cheaper model can parse and structure resumes.
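The fairness gate mentioned in the notes above could start as a pattern scan over the free-text fields of the output. This is a deliberately simple sketch: the patterns and the `fairness_gate` name are illustrative assumptions, and a production gate would need a broader, reviewed pattern set or a dedicated classifier.

```python
import re

# Illustrative patterns only; not an exhaustive or vetted list.
PROXY_PATTERNS = {
    "graduation_year": re.compile(r"\b(class of|graduat\w*)\b.*\b(19|20)\d{2}\b", re.I),
    "age": re.compile(r"\b\d+\s*years old\b", re.I),
    "photo": re.compile(r"\bphoto\b", re.I),
    "employment_gap": re.compile(r"\b(career|employment) gap\b", re.I),
}
# fairness_note is not scanned: it legitimately names what was ignored.
SCANNED_FIELDS = ("strengths", "gaps", "interview_focus")


def fairness_gate(assessment: dict):
    """Strip list items that cite a proxy; return (cleaned copy, audit flags)."""
    cleaned = dict(assessment)
    flags = []
    for field in SCANNED_FIELDS:
        kept = []
        for item in assessment.get(field, []):
            hits = [name for name, pattern in PROXY_PATTERNS.items() if pattern.search(item)]
            if hits:
                flags.append(f"{field}: removed item citing {', '.join(hits)}")
            else:
                kept.append(item)
        cleaned[field] = kept
    return cleaned, flags
```

Any flag is worth logging and surfacing to the recruiter, since reasoning that needed stripping suggests the rest of the assessment deserves a closer look.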

Variations

Basic

Evidence-cited screener

Assesses a resume against the role's requirements with cited evidence and a recommendation for a recruiter. No auto-decisions; fairness guardrails on.

Advanced

Fair screening with audit trail

Adds proxy neutralization, a fairness gate, interview-focus generation, and structured, auditable assessments routed to recruiters — still human-decided.

Enterprise

Governed, monitored screening

Adds ATS integration, per-role criteria libraries, adverse-impact monitoring, full audit logging, and calibration from recruiter outcomes — with humans accountable for every decision.

Download the Agent Blueprint

The complete blueprint, zipped — including a runnable run.py you can execute with one API key (Anthropic or OpenAI).

Download Blueprint (.zip)
Contents: README.md, system-prompt.md, setup-guide.md, tools.json, workflow.md, examples.md, .env.example, kit.json, run.py, LICENSE, NOTICE, starters/

Export

Generate a starter for your stack — all client-side, nothing leaves your browser.

Starters use mock tools — swap in your integrations to deploy.

View the source on GitHub

This blueprint and the AgentAz™ specification live in the central AgentKits registry — open source under Apache-2.0 (code & schema) and CC‑BY‑4.0 (text).

Frequently asked questions