Overview
Screens each resume against the role's required and preferred qualifications, with evidence cited from the resume for every met/unmet call.
Fairness-first: it scores only job-relevant criteria and never uses or infers protected attributes (age, gender, race, disability, etc.) or proxies like name, photo, or graduation year.
Human-in-the-loop by design: it recommends advance / review / decline-with-reasons but never auto-rejects — a recruiter decides.
Auditable: structured, explainable output with citations supports consistent, defensible screening.
AgentAz™ specification
A lightweight, design-time governance spec for security review. It documents what this agent is authorized to do — and why — and pairs with whatever policy engine you already run. It does not enforce anything at runtime.
Machine-readable contract (agentaz.json), validated against the open AgentAz™ JSON Schema — bundled for offline use and published at a permanent URL:
{
"$schema": "./agentaz.schema.json",
"version": "2.0.0",
"last_reviewed": "2026-06-24",
"agent_id": "resume-screening-agent",
"trust_level": "A2",
"dna_pattern": "Evaluation",
"worst_case_action": "Produces a biased/incorrect assessment for recruiter review. Cannot reject, advance, or contact candidates.",
"authority_boundary": "Assesses resumes against job-relevant criteria for human review; decision/contact tools absent.",
"tags": [
"hr",
"recruiting",
"resume-screening",
"fairness",
"human-review"
],
"tool_boundary": {
"allowed_tools": [
"read_resume",
"match_criteria",
"summarize_fit",
"flag_for_review"
],
"execution_tools_absent": true
},
"output_boundary": {
"format": "structured_json",
"never_emits": [
"reject",
"advance",
"contact_candidate",
"hiring_decision"
],
"excludes_protected_characteristics": true
},
"cost_boundary": {
"max_usd_per_trace_loop": 0.22,
"alert_threshold_usd": 0.15
},
"loop_boundary": {
"max_reasoning_turns": 8
},
"human_handoff": {
"triggers": [
"borderline_fit",
"insufficient_info",
"low_confidence"
],
"destination": "recruiter"
},
"audit": {
"append_only": true,
"logs": [
"assessment",
"criteria_used"
]
}
}
New to this? Read the AgentAz specification guide: Trust Levels, DNA patterns, and how it complements your runtime.
AgentAz™ is open source under Apache-2.0 — schema (frozen v1.0.0) and source on GitHub.
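Because the spec is declarative and does not enforce anything at runtime, a reviewer may want a quick consistency check before sign-off. The following is a minimal sketch of such a check, assuming a trimmed copy of the contract above. The function `check_spec` and the rules it applies are hypothetical review heuristics; they are not part of AgentAz™ and do not replace validation against its JSON Schema.

```python
import json

# Trimmed copy of the agentaz.json contract shown above (only the fields checked here).
SPEC_JSON = """
{
  "trust_level": "A2",
  "tool_boundary": {
    "allowed_tools": ["read_resume", "match_criteria", "summarize_fit", "flag_for_review"],
    "execution_tools_absent": true
  },
  "output_boundary": {
    "never_emits": ["reject", "advance", "contact_candidate", "hiring_decision"]
  },
  "cost_boundary": {"max_usd_per_trace_loop": 0.22, "alert_threshold_usd": 0.15},
  "loop_boundary": {"max_reasoning_turns": 8},
  "human_handoff": {
    "triggers": ["borderline_fit", "insufficient_info", "low_confidence"],
    "destination": "recruiter"
  }
}
"""


def check_spec(spec: dict) -> list[str]:
    """Return human-readable problems; an empty list means none were found.

    Hypothetical review heuristics, not AgentAz schema validation.
    """
    problems = []
    tools = set(spec["tool_boundary"]["allowed_tools"])
    forbidden = set(spec["output_boundary"]["never_emits"])
    overlap = tools & forbidden
    if overlap:
        problems.append(f"tools overlap forbidden actions: {sorted(overlap)}")
    cost = spec["cost_boundary"]
    if not cost["alert_threshold_usd"] < cost["max_usd_per_trace_loop"]:
        problems.append("alert threshold should sit below the hard cost cap")
    if spec["loop_boundary"]["max_reasoning_turns"] < 1:
        problems.append("max_reasoning_turns must be at least 1")
    if not spec["human_handoff"]["triggers"]:
        problems.append("no human-handoff triggers defined")
    return problems


spec = json.loads(SPEC_JSON)
problems = check_spec(spec)
```

An empty `problems` list only means these particular heuristics found nothing; it says nothing about whether a runtime actually honours the boundaries.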
Governance matrix
A scannable summary of this blueprint's governance coverage, derived from its AgentAz™ specification. It documents the boundaries that already ship — not new functionality.
| Agent goal | Bounded by the authority spec above |
|---|---|
| Trust Level | A2 — Recommend |
| Tool access | Least privilege — execution tools absent (read-only) |
| Context handling | Grounded in provided inputs; cites or flags rather than guessing |
| Memory strategy | Task-scoped; no persistent cross-session memory |
| Human approval | Required on borderline fit, insufficient info, low confidence → recruiter |
| Audit trail | Append-only log (assessment, criteria used) |
| Cost & loop bounds | ≤ $0.22 per loop · ≤ 8 reasoning turns |
| Recovery / escalation | Escalates to recruiter |
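The cost and loop bounds in the matrix are documented limits, so whatever runs the agent has to apply them. One way a runtime could track them is sketched below, using the limits from the spec ($0.22 cap, $0.15 alert, 8 turns). The class `LoopBudget` and its return values are assumptions for illustration, not an AgentAz™ API.

```python
from dataclasses import dataclass


@dataclass
class LoopBudget:
    """Tracks spend and turns for one trace loop (limits copied from the spec above)."""

    max_usd: float = 0.22
    alert_usd: float = 0.15
    max_turns: int = 8
    spent_usd: float = 0.0
    turns: int = 0

    def record_turn(self, cost_usd: float) -> str:
        """Record one reasoning turn and return 'ok', 'alert', or 'stop'."""
        self.turns += 1
        self.spent_usd += cost_usd
        if self.spent_usd > self.max_usd or self.turns > self.max_turns:
            return "stop"  # caller should escalate to the recruiter instead of continuing
        if self.spent_usd >= self.alert_usd:
            return "alert"
        return "ok"


budget = LoopBudget()
first = budget.record_turn(0.05)
```

On `"stop"`, the natural next step under this blueprint is escalation to the recruiter rather than a retry.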
Agent component mapping
A framework-neutral view of how this blueprint maps to standard agent-architecture components (the vocabulary common to ADK-style frameworks). It describes structure for clarity — not an official integration or certified compatibility.
| Agent | Primary reasoner — Recommend authority (A2) |
|---|---|
| Tools | read resume, match criteria, summarize fit, flag for review — execution tools absent (read-only) |
| Memory | Task-scoped working context; no persistent cross-session memory |
| Guardrails | Worst-case classified (A2); no execution tools; ≤ $0.22/loop · ≤ 8 turns |
| Evaluator | Confidence and authority-boundary checks; low-confidence or out-of-bounds results are flagged, not actioned |
| Handoff | Escalates to recruiter on borderline fit, insufficient info, low confidence |
Failure modes
Specific ways this blueprint can fail, and how it is designed to detect, contain, and recover from each — the boundaries that make it safe to run, stated plainly.
Produces a biased screen using a protected characteristic.
- Detection: Criteria exclude protected characteristics by design, and outputs are auditable for fairness.
- Mitigation: It assesses against job-relevant criteria only and never decides.
- Recovery: A recruiter reviews; the assessment and criteria are logged for fairness review.
Auto-rejects a qualified candidate.
- Detection: It has no reject or advance tool, and borderline cases are flagged.
- Mitigation: It surfaces an assessment for review, never a decision.
- Recovery: The recruiter decides; nothing is auto-rejected.
Over-weights keyword matching, missing a strong non-standard candidate.
- Detection: Borderline and unusual profiles are flagged for review.
- Mitigation: It is positioned as a screening aid, not a gate.
- Recovery: The recruiter reviews flagged profiles.
Evaluation
Job-relevant assessment quality and fairness are the primary measures; a biased screen or an auto-rejection counts as a failure.
| Assessment agreement | Agreement of assessments with recruiter judgments on job-relevant criteria. |
|---|---|
| Fairness audit | Outcome parity checks across protected groups on matched-qualification sets. |
| Protected-attribute leakage | Whether protected characteristics influence the output — should be none. |
| Auditability | Share of assessments with logged, reviewable criteria. |
| Latency | Time per resume. |
Recommended approach. Have recruiters label resumes on job-relevant criteria, measure the agent's agreement with those labels, and run a fairness audit that compares outcomes across protected groups on matched-qualification pairs. The agent surfaces assessments only and never auto-rejects.
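The two headline metrics can be computed with a few lines of Python. The sketch below assumes recommendations are compared as plain labels and that group membership exists only in an offline audit set that the agent never sees. The function names are hypothetical, and the 0.8 threshold mentioned afterwards is the commonly cited "four-fifths" heuristic, not something the blueprint specifies.

```python
from collections import Counter


def agreement_rate(agent: list[str], recruiter: list[str]) -> float:
    """Share of resumes where the agent's recommendation matches the recruiter's label."""
    if len(agent) != len(recruiter) or not agent:
        raise ValueError("need two non-empty lists of equal length")
    return sum(a == r for a, r in zip(agent, recruiter)) / len(agent)


def advance_rates(records: list[tuple[str, str]]) -> dict[str, float]:
    """records: (audit group, recommendation) pairs from a matched-qualification set."""
    totals, advanced = Counter(), Counter()
    for group, recommendation in records:
        totals[group] += 1
        advanced[group] += recommendation == "advance"
    return {group: advanced[group] / totals[group] for group in totals}


def impact_ratio(rates: dict[str, float]) -> float:
    """Lowest group advance rate divided by the highest (1.0 means parity)."""
    highest = max(rates.values())
    return min(rates.values()) / highest if highest else 1.0


# Made-up audit data for illustration only.
audit = [("A", "advance"), ("A", "advance"), ("A", "review"), ("A", "advance"),
         ("B", "advance"), ("B", "review"), ("B", "advance"), ("B", "review")]
rates = advance_rates(audit)
ratio = impact_ratio(rates)
```

A ratio below 0.8 is often treated as a signal worth investigating; on matched-qualification pairs, any sizeable gap deserves a look regardless of the threshold chosen.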
When to use
Use it when
- You receive high applicant volume for defined roles and want a faster, more consistent first pass.
- You have clear, job-relevant requirements (required vs. preferred) the agent can screen against.
- You need explainable, auditable assessments with evidence — for quality and compliance.
- You want to surface the strongest candidates and structured interview focus areas while a human owns every decision.
Avoid it when
- You want it to auto-reject or make final hiring decisions — it must not, and those stay with humans.
- You don't have job-relevant criteria defined, so screening would be subjective.
- You intend to screen on anything other than job-relevant qualifications — this agent explicitly won't.
- You can't keep recruiter review on outcomes or maintain an audit trail.
System prompt
You are a Resume Screening Agent supporting a hiring team. You assess ONE resume against ONE role's stated, job-relevant requirements and produce an evidence-based, fair, auditable summary for a human recruiter. You do NOT make hiring decisions. You are judged on accuracy, fairness, and never overstepping into auto-rejection or biased reasoning.
== FAIRNESS RULES (NON-NEGOTIABLE) ==
1. Job-relevant criteria ONLY. Evaluate against the role's required/preferred qualifications, skills, and experience. Nothing else.
2. NEVER use or infer protected characteristics: age, gender, race, ethnicity, national origin, religion, disability, pregnancy/family status, sexual orientation, or anything similar. Do not estimate them. If the resume reveals them, ignore them entirely.
3. NO PROXIES. Do not use name, photo, address/neighborhood, graduation years, citizenship (beyond a bona-fide work-authorization requirement), gaps you can't tie to a job-relevant fact, or school 'prestige' as a stand-in for quality. Judge demonstrated, job-relevant evidence.
4. EVIDENCE OR IT DIDN'T HAPPEN. Cite the specific resume content supporting each met/unmet finding. Never credit a qualification the resume doesn't show, and never fabricate one.
== HARD RULES ==
- NO AUTO-REJECTION / NO DECISION: You recommend; a human decides. Never output a final 'reject'/'hire'. Use advance / review / decline_with_reasons as a recommendation only.
- FLAG MISSING INFO: If you can't tell whether a requirement is met, mark it 'unclear' and say what's missing — do not assume present or absent.
- EXPLAINABILITY: Every recommendation must be traceable to job-relevant evidence and the role's criteria, suitable for audit.
- CONSISTENCY: Apply the same criteria to every candidate; do not invent new bars for one resume.
== METHOD ==
- Load the role's requirements (required vs. preferred) and the resume.
- For each requirement, decide met / not_met / unclear, with a cited evidence snippet.
- Note relevant strengths and genuine, job-relevant gaps. Propose interview focus areas to probe unclear items.
- Recommend advance / review / decline_with_reasons — based only on requirement coverage.
== RECOMMENDATION POLICY ==
- advance: meets the required qualifications with evidence.
- review: mixed/borderline or important items unclear — a human should look closely.
- decline_with_reasons: clearly misses required, job-relevant qualifications, with the specific unmet requirements cited. (Still a recommendation, not a rejection.)
== OUTPUT FORMAT (return ONE JSON object) ==
{
"role": "<title>",
"recommendation": "advance|review|decline_with_reasons",
"requirements": [
{ "requirement": "<job-relevant requirement>", "status": "met|not_met|unclear", "evidence": "<cited resume snippet or 'not found'>" }
],
"strengths": ["<job-relevant strengths with evidence>"],
"gaps": ["<job-relevant gaps or unclear items>"],
"interview_focus": ["<areas to probe>"],
"fairness_note": "Assessed only on job-relevant criteria; protected attributes and proxies were not considered.",
"human_review_required": true
}
If a required qualification is 'unclear', prefer 'review' over 'decline'. Never base any field on a protected attribute or proxy.
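Since the prompt asks for exactly one JSON object in a fixed shape, a caller can check the response before showing it to a recruiter. The sketch below is one possible check, not the product's validator. `validate_assessment` is a hypothetical name, and because the output format does not mark which requirements are required versus preferred, it treats every listed requirement as required.

```python
ALLOWED_RECOMMENDATIONS = {"advance", "review", "decline_with_reasons"}
ALLOWED_STATUSES = {"met", "not_met", "unclear"}


def validate_assessment(assessment: dict) -> list[str]:
    """Return a list of rule violations; an empty list means the object passed."""
    errors = []
    if assessment.get("recommendation") not in ALLOWED_RECOMMENDATIONS:
        errors.append("recommendation must be advance, review, or decline_with_reasons")
    if assessment.get("human_review_required") is not True:
        errors.append("human_review_required must be true")
    requirements = assessment.get("requirements") or []
    if not requirements:
        errors.append("at least one requirement is needed")
    for i, req in enumerate(requirements):
        if req.get("status") not in ALLOWED_STATUSES:
            errors.append(f"requirements[{i}]: invalid status")
        if not str(req.get("evidence", "")).strip():
            errors.append(f"requirements[{i}]: evidence is empty")
    # Simplification: every listed requirement is treated as required.
    any_unclear = any(req.get("status") == "unclear" for req in requirements)
    if any_unclear and assessment.get("recommendation") == "decline_with_reasons":
        errors.append("unclear requirement present: prefer review over decline")
    return errors


sample = {
    "role": "Data Engineer",
    "recommendation": "review",
    "requirements": [
        {"requirement": "Workflow orchestrator", "status": "unclear", "evidence": "not found"}
    ],
    "human_review_required": True,
}
errors = validate_assessment(sample)
```

A failed check is a reason to flag the assessment for review, in keeping with the rule that out-of-bounds results are flagged rather than actioned.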
Setup guide
Install and connect your ATS
Install the agent and connect it to your applicant tracking system.
pipx install resume-screen-agent
resume-screen-agent connect --ats greenhouse
resume-screen-agent doctor
Configure fairness guardrails
These are on by default and enforced outside the model. Keep human review required.
cp .env.example .env
ANTHROPIC_API_KEY=sk-ant-...
AUTO_REJECT=false # never; recommendations only
IGNORE_PROTECTED=true
IGNORE_PROXIES=["name","photo","address","graduation_year","school_prestige"]
HUMAN_REVIEW_REQUIRED=true
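"Enforced outside the model" suggests a startup check that refuses unsafe settings. How the actual CLI reads its `.env` is not documented here, so the following is only a sketch under that assumption. `load_guardrails` is a hypothetical helper that takes already-parsed environment values (for example `os.environ`).

```python
import json


def load_guardrails(env: dict[str, str]) -> dict:
    """Parse the guardrail settings and refuse unsafe combinations."""

    def flag(name: str, default: str) -> bool:
        return env.get(name, default).strip().lower() == "true"

    settings = {
        "auto_reject": flag("AUTO_REJECT", "false"),
        "ignore_protected": flag("IGNORE_PROTECTED", "true"),
        "human_review_required": flag("HUMAN_REVIEW_REQUIRED", "true"),
        # IGNORE_PROXIES is written as a JSON array in the .env example above.
        "ignore_proxies": json.loads(env.get("IGNORE_PROXIES", "[]")),
    }
    if settings["auto_reject"]:
        raise ValueError("AUTO_REJECT must stay false: the agent only recommends")
    if not settings["human_review_required"]:
        raise ValueError("HUMAN_REVIEW_REQUIRED must stay true")
    if not settings["ignore_protected"]:
        raise ValueError("IGNORE_PROTECTED must stay true")
    return settings


example_env = {
    "AUTO_REJECT": "false",
    "IGNORE_PROTECTED": "true",
    "IGNORE_PROXIES": '["name","photo","address","graduation_year","school_prestige"]',
    "HUMAN_REVIEW_REQUIRED": "true",
}
settings = load_guardrails(example_env)
```

Failing fast at startup means a misconfigured deployment never screens a resume, rather than screening with a guardrail silently off.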
Define job-relevant criteria per role
List required vs. preferred qualifications. This is the only basis for screening.
# roles/backend-engineer.yml
required:
  - "3+ years building production backend services"
  - "Proficiency in a modern backend language (Go/Java/Python/Node)"
  - "Experience with relational databases"
preferred:
  - "Experience with distributed systems"
  - "Cloud (AWS/GCP/Azure)"
Dry-run and review the audit trail
Screen a few resumes and inspect the cited evidence and fairness note before going live.
resume-screen-agent screen --role backend-engineer --resume ./candidate.pdf --explain
# prints per-requirement status + evidence, gaps, interview_focus, recommendation
Integrate as a first-pass assistant
Surface assessments to recruiters in the ATS. The agent assists; recruiters decide.
# ATS webhook -> POST https://your-host/screen
# assessment attaches to the candidate; recruiter makes the call
Architecture
Tools required
Workflow
1. Load role criteria
Pull the requisition's job-relevant required and preferred qualifications — the only yardstick used.
2. Parse the resume
Structure the resume into skills, experience, and accomplishments for matching.
3. Neutralize proxies
Strip/ignore name, photo, and other proxy signals so they can't influence scoring.
4. Match requirement by requirement
Decide met / not_met / unclear for each requirement, with a cited evidence snippet. No evidence, no credit; missing info is 'unclear'.
5. Run the fairness check
Confirm the reasoning used only job-relevant criteria and no protected attribute or proxy; strip and flag anything that slips in.
6. Summarize & recommend
Produce strengths, job-relevant gaps, interview focus areas, and a recommendation (advance/review/decline_with_reasons).
7. Route to a human
Send the assessment to a recruiter for the decision, and log it for audit and adverse-impact monitoring.
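Steps 3, 6, and 7 of the workflow above are mechanical enough to sketch in code. The example below assumes the resume has already been parsed into a dict and that per-requirement findings (step 4) come from the model. The `required` flag on each finding, the field names, and the function names are all hypothetical. The `recommend` function is a simplification: it sends any unclear required item to review and does not model the "mixed/borderline" judgment described in the recommendation policy.

```python
PROXY_FIELDS = {"name", "photo", "address", "graduation_year", "school_prestige"}


def neutralize(resume: dict) -> dict:
    """Step 3: drop proxy fields before anything is scored."""
    return {key: value for key, value in resume.items() if key not in PROXY_FIELDS}


def recommend(findings: list[dict]) -> str:
    """Step 6: derive a recommendation from required-requirement coverage only."""
    statuses = {f["status"] for f in findings if f["required"]}
    if "unclear" in statuses:
        return "review"  # missing info goes to a human, never straight to decline
    if "not_met" in statuses:
        return "decline_with_reasons"
    return "advance"


def screen(resume: dict, findings: list[dict]) -> dict:
    """Steps 3, 6, 7: neutralize, recommend, and always route to a recruiter."""
    clean = neutralize(resume)
    return {
        "resume_fields_used": sorted(clean),
        "recommendation": recommend(findings),
        "human_review_required": True,
    }


resume = {
    "name": "X",
    "photo": "p.png",
    "graduation_year": 1998,
    "experience": ["Senior PM 2018-2026"],
    "skills": ["roadmaps"],
}
result = screen(resume, [{"required": True, "status": "met"}])
```

Preferred items never change the recommendation in this sketch; they would surface only as strengths or interview focus areas.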
Examples
Strong match → advance with cited evidence
A candidate clearly meets the required qualifications. The agent advances them and cites the resume evidence per requirement.
Input
Role: Backend Engineer (required: 3+ yrs backend, modern backend language, relational DBs). Resume shows 5 years building Go services at a SaaS company, Postgres, and a high-throughput payments API.
Output
{
"role": "Backend Engineer",
"recommendation": "advance",
"requirements": [
{ "requirement": "3+ years building production backend services", "status": "met", "evidence": "'5 years building backend services' + roles dated 2021–2026" },
{ "requirement": "Modern backend language (Go/Java/Python/Node)", "status": "met", "evidence": "'Built and maintained Go microservices'" },
{ "requirement": "Relational databases", "status": "met", "evidence": "'Designed Postgres schemas for a high-throughput payments API'" }
],
"strengths": ["High-throughput payments API experience (relevant to our scale)", "5 years > the 3-year bar"],
"gaps": [],
"interview_focus": ["Depth of distributed-systems experience (a preferred item)", "Ownership scope on the payments API"],
"fairness_note": "Assessed only on job-relevant criteria; protected attributes and proxies were not considered.",
"human_review_required": true
}
Note: Every 'met' is backed by a specific resume snippet, with no credit given without evidence. The recommendation is 'advance,' but note human_review_required stays true; the agent surfaces a strong candidate, and the recruiter still owns the decision.
Required qualification unclear → review, not decline
The resume is strong but doesn't clearly show one required item. The agent marks it unclear and recommends review instead of guessing.
Input
Role: Data Engineer (required: experience with a workflow orchestrator like Airflow/Dagster). Resume lists 'built ETL pipelines' and 'scheduled data jobs' but never names an orchestrator.
Output
{
"role": "Data Engineer",
"recommendation": "review",
"requirements": [
{ "requirement": "Experience with a workflow orchestrator (Airflow/Dagster/etc.)", "status": "unclear", "evidence": "'built ETL pipelines' and 'scheduled data jobs' — orchestrator not named" },
{ "requirement": "SQL and data modeling", "status": "met", "evidence": "'Designed dimensional models in Snowflake'" },
{ "requirement": "Python for data engineering", "status": "met", "evidence": "'Python ETL with pandas and PySpark'" }
],
"strengths": ["Strong SQL/modeling and Python evidence"],
"gaps": ["Orchestrator experience not explicitly stated — may exist but isn't shown"],
"interview_focus": ["Which orchestrator they've used (Airflow/Dagster?) and at what scale"],
"fairness_note": "Assessed only on job-relevant criteria; protected attributes and proxies were not considered.",
"human_review_required": true
}
Note: The candidate 'scheduled data jobs,' which might mean an orchestrator, but the resume doesn't say. Rather than assume met or not_met, the agent marks it 'unclear,' recommends review, and turns it into an interview question. Flagging missing info instead of guessing is what keeps screening fair and accurate.
Proxy bait → ignores protected signals, scores on evidence only
The resume contains signals a biased screener might latch onto (a name suggesting ethnicity/gender, a 1998 graduation year implying age, a 'career gap'). The agent explicitly disregards them and judges only job-relevant evidence.
Input
Role: Product Manager (required: 4+ yrs PM experience shipping software products). Resume: name and photo present; 'BA 1998'; a 2015–2018 gap; then 'Senior PM, 2018–2026, led 3 product launches generating $20M revenue.'
Output
{
"role": "Product Manager",
"recommendation": "advance",
"requirements": [
{ "requirement": "4+ years PM experience shipping products", "status": "met", "evidence": "'Senior PM 2018–2026, led 3 product launches' (8 years, multiple shipped products)" }
],
"strengths": ["Shipped 3 product launches with quantified $20M revenue impact (job-relevant outcome)"],
"gaps": [],
"interview_focus": ["Role and scope on the 3 launches; how success was measured"],
"fairness_note": "I did not consider the candidate's name, photo, graduation year, or the 2015–2018 gap. Graduation year and gaps are not job-relevant here and can encode age/other protected signals; assessment is based solely on demonstrated PM experience and outcomes.",
"human_review_required": true
}
Note: This is the defining fairness example. The resume is full of proxy bait: a name, a photo, a 1998 graduation year (age signal), and an employment gap. The agent explicitly states it ignored all of them and scores only the demonstrated, job-relevant PM experience. The fairness_note makes the non-consideration auditable, which is exactly what defensible, compliant screening requires.
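A post-hoc gate can look for proxy or protected-attribute language that slipped into the narrative fields. The sketch below only flags matches; it does not strip them, and it skips `fairness_note`, which legitimately names the signals that were ignored. The term list and the function `fairness_gate` are illustrative assumptions. A keyword scan like this will produce false positives (for example "race condition" on an engineering resume) and will miss paraphrases, so treat hits as prompts for human review.

```python
import re

# Illustrative term list only; a real gate would need a reviewed, much broader list.
PATTERN = re.compile(
    r"\b(?:age|gender|race|ethnicity|religion|disability|pregnan\w*"
    r"|graduation year|photo|career gap)\b",
    re.IGNORECASE,
)


def fairness_gate(assessment: dict) -> list[tuple[str, str]]:
    """Return (field, matched term) pairs found outside fairness_note."""
    texts = []
    for key in ("strengths", "gaps", "interview_focus"):
        texts += [(key, item) for item in assessment.get(key, [])]
    texts += [
        ("requirements.evidence", req.get("evidence", ""))
        for req in assessment.get("requirements", [])
    ]
    flagged = []
    for field, text in texts:
        for match in PATTERN.finditer(text):
            flagged.append((field, match.group(0).lower()))
    return flagged


clean = {
    "requirements": [
        {"status": "met", "evidence": "'Senior PM 2018-2026, led 3 product launches'"}
    ],
    "strengths": ["Shipped 3 product launches with quantified $20M revenue impact"],
    "gaps": [],
    "interview_focus": ["Role and scope on the 3 launches; how success was measured"],
    "fairness_note": "I did not consider the candidate's name, photo, or graduation year.",
}
flags = fairness_gate(clean)
```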
Implementation notes
- Make 'no auto-rejection' an enforced rule, not a preference: the agent recommends and a human decides every outcome. This is both an ethics and a compliance requirement.
- Neutralize proxies (name, photo, graduation year, address, school prestige) before scoring, and have the fairness gate strip any protected-attribute reasoning that slips into the output.
- Require cited evidence for every met/unmet finding; 'unclear' (with what's missing) is the correct answer when the resume doesn't show it — never assume.
- Apply identical criteria to every candidate for a role; consistency is what makes the process fair and auditable.
- Log assessments and recruiter outcomes and monitor for adverse-impact patterns; screening tools must be watched, not trusted blindly.
- Keep the criteria strictly job-relevant and documented per role — vague or non-job-related bars are where bias and legal risk enter.
- Reserve the stronger model for evidence-based requirement matching, where it earns its cost; a cheaper model can parse and structure resumes.
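The logging note above pairs with the spec's append-only audit requirement. One way to make tampering detectable is to chain each entry to the hash of the previous one, as in the in-memory sketch below. Hash chaining is a technique added here for illustration, not something the blueprint prescribes; the `AuditLog` class is hypothetical, and a real deployment would also need durable storage and timestamps.

```python
import hashlib
import json


def _digest(body: dict) -> str:
    """Stable SHA-256 over the JSON form of an entry body."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


class AuditLog:
    """In-memory append-only log; each entry carries the hash of the previous one."""

    def __init__(self) -> None:
        self._entries: list[dict] = []

    def append(self, assessment: dict, criteria_used: list[str]) -> dict:
        prev_hash = self._entries[-1]["hash"] if self._entries else ""
        body = {
            "assessment": assessment,
            "criteria_used": criteria_used,
            "prev_hash": prev_hash,
        }
        entry = {**body, "hash": _digest(body)}
        self._entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Recompute every hash; False means an entry was altered or reordered."""
        prev_hash = ""
        for entry in self._entries:
            body = {k: entry[k] for k in ("assessment", "criteria_used", "prev_hash")}
            if entry["prev_hash"] != prev_hash or entry["hash"] != _digest(body):
                return False
            prev_hash = entry["hash"]
        return True


log = AuditLog()
log.append({"recommendation": "advance"}, ["3+ years building production backend services"])
log.append({"recommendation": "review"}, ["Experience with a workflow orchestrator"])
intact = log.verify()
```

The two logged fields mirror the `audit.logs` entries in the spec (`assessment`, `criteria_used`); recruiter outcomes could be appended the same way for adverse-impact monitoring.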
Variations
Basic
Evidence-cited screener
Assesses a resume against the role's requirements with cited evidence and a recommendation for a recruiter. No auto-decisions; fairness guardrails on.
Advanced
Fair screening with audit trail
Adds proxy neutralization, a fairness gate, interview-focus generation, and structured, auditable assessments routed to recruiters — still human-decided.
Enterprise
Governed, monitored screening
Adds ATS integration, per-role criteria libraries, adverse-impact monitoring, full audit logging, and calibration from recruiter outcomes — with humans accountable for every decision.
Download the Agent Blueprint
This blueprint and the AgentAz™ specification live in the central AgentKits registry — open source under Apache-2.0 (code & schema) and CC‑BY‑4.0 (text).
Frequently asked questions
No. It never auto-rejects or makes a final hiring decision. It produces an evidence-cited assessment and a recommendation (advance / review / decline-with-reasons), and a human recruiter makes every call.
It scores only job-relevant criteria and is built to never use or infer protected characteristics (age, gender, race, disability, etc.) or proxies like name, photo, or graduation year. A fairness gate strips any such reasoning, and it documents that it didn't consider them.
It marks the requirement 'unclear,' says what's missing, and recommends review rather than guessing met or not_met — and turns the gap into an interview question.
Yes. Every finding cites the specific resume evidence and ties to a documented, job-relevant requirement, and assessments and outcomes are logged so the process can be reviewed and monitored for adverse impact.
It surfaces strong candidates with evidence, but a human reviews every assessment. The goal is faster, more consistent screening with accountability — not removing humans from the decision.
Not as proxies for quality. School 'prestige' and unexplained gaps aren't job-relevant and can encode protected signals, so it judges demonstrated, job-relevant experience instead.