Hiring Manager's Guide · 2026

How to Run a Vibe Coding Interview

The format that tests how candidates actually work in 2026 — with AI in the loop, not pretending it doesn't exist.

What This Is

A vibe coding interview lets candidates use Cursor, Claude Code, or Copilot during the assessment. You score on verification (do they catch model errors?), prompt quality (how efficient is their AI use?), ownership (can they defend every line?), and orchestration (do they decompose tasks correctly?). You do not score on raw coding speed or algorithm memorization.

What a Vibe Coding Interview Actually Measures

The term "vibe coding" describes a failure mode — accepting whatever the AI generates without reading it, understanding it, or testing it. A strong vibe coding interview tests the opposite: whether the candidate has the discipline to verify, critique, and own AI-generated output.

In 2026, 91% of US engineers use agentic AI tools daily. The interview format that evaluates whether someone can write binary search from memory is evaluating a skill they'll almost never use. The skill they use every day is: read a repo, scope a task, prompt precisely, verify the output, catch errors the model missed, and ship code they can defend line by line.

That's what a vibe coding interview measures.

How It Differs from Traditional Technical Screens

✗ Traditional interview

  • AI banned or ignored
  • Tests algorithm memorization
  • Evaluates typing speed and recall
  • Take-home: no way to know who wrote it
  • Scores on output quality (correct/incorrect)
  • Selects for candidates who practiced LeetCode

✓ Vibe coding interview

  • AI tools explicitly allowed and required
  • Tests judgment, verification, ownership
  • Evaluates prompting quality and efficiency
  • Live screen-share: you see how they work
  • Scores on process quality (how they got there)
  • Selects for candidates who work like your team

The take-home collapse: Roughly 45% of US employers still send take-home assessments, but trust has broken down. There's no reliable way to know whether the candidate wrote the code or handed the ticket to Claude Code at midnight. Live vibe coding with a screen share solves this: you see the process, not just the artifact.

The Interview Format (60–90 Minutes)

Setup

Format A — Bug hunt (60 min, best signal per minute)

Plant one or two bugs in the repo — a race condition, a swallowed error in a try/catch, an off-by-one in pagination. The bug should be the type your team actually ships and reverts. Give the candidate 60 minutes. The session has three phases:

  1. First 10 min: Read the codebase. Strong candidates open CLAUDE.md, AGENTS.md, or the README before typing a single prompt. Weak candidates start prompting immediately.
  2. Next 35 min: Find and fix the bug using their AI tools. You observe in real time.
  3. Final 15 min: Defense. Remove the AI. Ask them to explain every line in the diff. Ask why the model chose that approach. Ask what would break this fix.
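As an illustration of the kind of bug worth planting, here is a minimal sketch of an off-by-one in pagination. The function names `paginate_buggy` and `paginate`, and the 1-indexed page convention, are assumptions made for this example and do not come from any particular codebase.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def paginate_buggy(items: Sequence[T], page: int, page_size: int) -> List[T]:
    """Planted bug: treats the 1-indexed page number as if it were 0-indexed."""
    start = page * page_size  # off-by-one: page 1 skips the first page_size items
    return list(items[start:start + page_size])


def paginate(items: Sequence[T], page: int, page_size: int) -> List[T]:
    """Reference fix: pages are 1-indexed, so page 1 starts at offset 0."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be >= 1")
    start = (page - 1) * page_size
    return list(items[start:start + page_size])


if __name__ == "__main__":
    rows = list(range(1, 11))          # ten hypothetical records: 1..10
    print(paginate_buggy(rows, 1, 3))  # the first three records are silently skipped
    print(paginate(rows, 1, 3))        # the correct first page
```

A bug like this suits the format because the buggy version still returns plausible-looking data, so a candidate who only checks that the code runs will not catch it.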

Format B — PR review (30 min, good secondary signal)

Hand the candidate a 200–300 line PR that an AI generated. Three changes are subtly wrong: a fabricated import, a null check missing, a logic inversion. Can they find all three in 20 minutes? No AI needed for this one — it's pure verification reflex.
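To make the seeded defects concrete, here is a minimal sketch showing two of the three defect types named above, a missing null check and a logic inversion, in one short function. The function names, the shape of the `user` dictionary, and the ownership rule are all hypothetical. A fabricated import is left out because it would stop the example from running.

```python
from typing import Optional


def can_edit_buggy(user: Optional[dict], owner_id: int) -> bool:
    """Seeded version, as it might appear in the PR under review."""
    # Defect 1 (missing null check): raises TypeError when user is None.
    # Defect 2 (logic inversion): grants access to everyone except the owner.
    return user["id"] != owner_id


def can_edit(user: Optional[dict], owner_id: int) -> bool:
    """What a careful reviewer would expect instead."""
    if user is None:  # anonymous visitors can never edit
        return False
    return user["id"] == owner_id
```

Both defects survive a quick skim because the buggy line is short and type-checks, which is what makes them a fair test of verification reflex.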

Format C — Spec-first build (45 min, strongest prompt quality signal)

Give a small feature spec. Before writing any code, ask the candidate to write their prompt contract — what scope, what constraints, what they'll verify. Score the prompt contract as much as the result. A strong candidate writes a precise brief with edge cases and exit criteria spelled out. A weak candidate writes "build me X."
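One way to picture a prompt contract is as a small structured brief. The sketch below is an assumption about how such a brief could be laid out, not a prescribed template: the `PromptContract` class, its field names, and the CSV export feature are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PromptContract:
    """Hypothetical structure for the brief a candidate writes before prompting."""
    scope: str
    constraints: List[str] = field(default_factory=list)
    edge_cases: List[str] = field(default_factory=list)
    verification: List[str] = field(default_factory=list)
    exit_criteria: List[str] = field(default_factory=list)

    def _sections(self) -> Dict[str, List[str]]:
        return {
            "constraints": self.constraints,
            "edge_cases": self.edge_cases,
            "verification": self.verification,
            "exit_criteria": self.exit_criteria,
        }

    def missing_sections(self) -> List[str]:
        """Names of the sections the candidate left empty."""
        return [name for name, items in self._sections().items() if not items]

    def render(self) -> str:
        """Plain-text brief that could be pasted into an AI tool."""
        lines = [f"Scope: {self.scope}"]
        for name, items in self._sections().items():
            lines.append(f"{name.replace('_', ' ').capitalize()}:")
            lines.extend(f"  - {item}" for item in items)
        return "\n".join(lines)


weak = PromptContract(scope="build me X")
strong = PromptContract(
    scope="Add CSV export to the orders list view",  # hypothetical feature
    constraints=["No new dependencies", "Touch only the export module"],
    edge_cases=["Empty order list", "Commas and quotes inside field values"],
    verification=["Unit test for quoting", "Manual check with 0 and 1 rows"],
    exit_criteria=["All tests pass", "Diff reviewed line by line"],
)
```

Here `weak.missing_sections()` lists every section, while `strong.missing_sections()` is empty. An interviewer could use the same four headings as a quick checklist when scoring the brief by hand.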

Best combination: Format A (60 min) plus a 15-minute transcript review. You get signal on both generation and verification, the two halves of AI-native engineering, in 75 minutes total.

What to Score and How

Score four dimensions, each 1–5. The weighted total determines the hire recommendation.

Threshold: 4.0+ = strong hire | 3.5–3.9 = conditional | below 3.0 = no-hire. A verification score below 3 is a hard no-hire regardless of total.
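The scoring arithmetic can be sketched in a few lines. This is an illustration under stated assumptions. The per-dimension weights are not given here, so the equal weights below are placeholders to be replaced with your own. The dimension keys are hypothetical names for the four dimensions described earlier. The thresholds do not say what happens between 3.0 and 3.4, so the sketch reports that band as undefined instead of guessing.

```python
from typing import Dict, Optional

DIMENSIONS = ("verification", "prompt_quality", "ownership", "orchestration")

# ASSUMPTION: placeholder weights in percent; the real weights are not stated.
PLACEHOLDER_WEIGHTS_PCT: Dict[str, int] = {d: 25 for d in DIMENSIONS}


def weighted_total(scores: Dict[str, int],
                   weights_pct: Optional[Dict[str, int]] = None) -> float:
    """Weighted average of the four 1-5 scores, using integer percent weights."""
    weights = weights_pct or PLACEHOLDER_WEIGHTS_PCT
    if set(scores) != set(DIMENSIONS) or set(weights) != set(DIMENSIONS):
        raise ValueError("scores and weights must cover exactly the four dimensions")
    if sum(weights.values()) != 100:
        raise ValueError("weights must sum to 100 percent")
    if any(not 1 <= s <= 5 for s in scores.values()):
        raise ValueError("each score must be between 1 and 5")
    return sum(weights[d] * scores[d] for d in DIMENSIONS) / 100


def recommend(scores: Dict[str, int],
              weights_pct: Optional[Dict[str, int]] = None) -> str:
    total = weighted_total(scores, weights_pct)
    if scores["verification"] < 3:  # hard gate, applied before the total
        return "no-hire"
    if total >= 4.0:
        return "strong hire"
    if total >= 3.5:
        return "conditional"
    if total < 3.0:
        return "no-hire"
    return "undefined"  # 3.0 to 3.4: not covered by the stated thresholds


if __name__ == "__main__":
    candidate = {"verification": 2, "prompt_quality": 5,
                 "ownership": 5, "orchestration": 5}
    print(weighted_total(candidate), recommend(candidate))
```

Checking the verification gate before the total means a candidate with a high average but weak verification still comes out as a no-hire, which matches the hard rule above.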


Red Flags and Green Flags in Real Time

Green flags (what you want to see)

Red flags (stop and probe when you see these)

Reviewing the Session Transcript After

The AI session transcript is the most underused interview artifact. Claude Code keeps session logs. Cursor Composer shows the full history. Ask the candidate to share it or screen-record the session. Then read it the way you'd read a PR.

FAQ

Is "vibe coding interview" the same as an "AI agent interview"?

They're used interchangeably, but there's a nuance. "Vibe coding" originally described a failure mode — accepting AI output without verification. "Vibe coding interview" is the assessment format that tests whether a candidate does this or its opposite. "AI agent interview" is the broader term for any technical assessment where AI tools are allowed and scored. Same format, slightly different framing depending on your audience.

How do you prevent candidates from just letting the AI do everything?

You can't prevent it — and you shouldn't try. If a candidate lets the AI do everything and can't explain any of it under questioning, that IS the signal. The defense phase (15 minutes at the end where you ask them to explain every line) is where AI-over-reliance becomes immediately visible. There's no way to fake understanding of code you didn't write when someone asks "why did the model choose this approach?" and then "rewrite this section without AI."

What seniority level does this format work for?

All levels, with calibration. For L3/L4: check baseline verification discipline and prompt fundamentals. For L5+: evaluate orchestration judgment, architectural decisions made while directing agents, and the ability to scope and decompose large problems. Senior candidates should be able to run agent tasks in parallel, reviewing one agent's output while another is still working, and explain every decision made.

Can we run this format without building it ourselves?

Yes — Altor conducts vibe coding and AI agent proficiency interviews on behalf of engineering teams. We run the session, review the transcript, score against the rubric, and deliver a hire/no-hire recommendation with written reasoning within 24 hours.

