Now in private beta · for hiring teams

Interview the human,
not the model.

Every candidate uses AI now. Prooftrace runs real, ambitious tasks and shows you whether they actually understand what they're shipping — or just pasted and prayed.

See a sample candidate replay →

What you get back — not a score, the receipts

replay · candidate-A · rate-limiter

Engaged · reads & edits

rateLimiter.js · typed · AI paste · edited

 1  export function rateLimiter({ windowMs, max }) {
 2    const hits = new Map()
 3
 4    return (key) => {                                               // AI
 5      const now = Date.now()                                        // AI
 6      const rec = hits.get(key) ?? { n: 0, reset: now + windowMs }  // AI
 7      if (now > rec.reset) {                                        // edited
 8        rec.n = 0; rec.reset = now + windowMs                       // edited
 9      }                                                             // AI
10      rec.n++
11      hits.set(key, rec)
12      return rec.n <= max
13    }                                                               // AI
14  }

Session timeline

00:38  Prompt: "sliding-window limiter, return boolean"
01:12  Pasted 9 lines from Claude Opus 4.7
01:40  Edited reset logic by hand (lines 7–8)
02:05  Prompt: "won't idle keys leak memory?"
03:18  Rejected a hallucinated .expire() API
04:02  All tests pass: 12 / 12
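
The limiter in the replay keeps one Map entry per key and never deletes it, which is the leak the 02:05 prompt asks about. One possible answer is sketched below. It is not taken from the candidate's session: `createLimiter`, `sweep`, `size`, and the injectable `now` clock are hypothetical names chosen for illustration.

```javascript
// Sketch only: fixed-window limiter with idle-key eviction.
// createLimiter, sweep, size, and the `now` option are hypothetical.
function createLimiter({ windowMs, max, now = Date.now }) {
  const hits = new Map(); // key -> { n: calls in window, reset: window end time }

  function allow(key) {
    const t = now();
    let rec = hits.get(key);
    // Start a fresh window for unknown keys or expired windows.
    if (!rec || t > rec.reset) {
      rec = { n: 0, reset: t + windowMs };
      hits.set(key, rec);
    }
    rec.n++;
    return rec.n <= max;
  }

  // Delete every record whose window has expired; returns the number removed.
  function sweep() {
    const t = now();
    let removed = 0;
    for (const [key, rec] of hits) {
      if (t > rec.reset) {
        hits.delete(key);
        removed++;
      }
    }
    return removed;
  }

  return { allow, sweep, size: () => hits.size };
}
```

Calling `sweep()` on a timer, for example once per `windowMs`, would bound the map to keys seen in the current window.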

02:38 / 04:30

Competitive advantage · engagement signals

See how they think with AI.

Two candidates. Same prompt. Both pass every test. The signals tell you which one you'd actually want shipping at 2am.

Prompt

Implement debounce(fn, ms) — return a function that delays calls until ms have passed since the last call.

Candidate A

Engaged with the model

94

engaged

function debounce(fn, ms) {
  let t
  return (...args) => {
    clearTimeout(t)
    t = setTimeout(() => fn(...args), ms)
  }
}

Edited the AI output by hand

Pushed back on a hallucination

Four prompts, each refined

Keystroke cadence: irregular · read + edit

Candidate B

Pasted and prayed

12

engaged

/**
 * Creates a debounced version of the provided function that delays
 * its execution until `ms` milliseconds have elapsed since the
 * last time the debounced function was invoked.
 *
 * @param {Function} fn - The function to debounce
 * @param {number} ms  - The debounce delay in milliseconds
 * @returns {Function} The debounced function
 */
function debounce(fn, ms) {
  let timeoutId;
  return function(...args) {
    clearTimeout(timeoutId);
    timeoutId = setTimeout(() => {
      fn.apply(this, args);
    }, ms);
  };
}

✕ 412-character blind paste

✕ No edits, no review

✕ One prompt, shipped as-is

Keystroke cadence: flat · paste-and-go

─── Engagement score = prompt refinement + edit ratio + review time − blind-paste events. Not a black box.
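
To make the formula above concrete, here is a minimal sketch of how the four terms could combine into a 0–100 score. The weights, the normalization of each term, and every field name on `session` are assumptions made for illustration; the page does not publish the actual weighting.

```javascript
// Sketch only: weights, normalization, and field names are assumptions,
// not Prooftrace's published scoring.
function clamp01(x) {
  return Math.min(1, Math.max(0, x));
}

function engagementScore(
  session,
  weights = { refinement: 30, edits: 30, review: 40, blindPaste: 25 }
) {
  // Share of prompts that refined an earlier prompt.
  const refinement = clamp01(session.refinedPrompts / Math.max(1, session.totalPrompts));
  // Share of AI-pasted lines the candidate edited by hand.
  const editRatio = clamp01(session.editedAiLines / Math.max(1, session.pastedAiLines));
  // Time spent reading AI output, relative to a target for the task.
  const review = clamp01(session.reviewSeconds / Math.max(1, session.targetReviewSeconds));

  const raw =
    weights.refinement * refinement +
    weights.edits * editRatio +
    weights.review * review -
    weights.blindPaste * session.blindPasteEvents;

  // Keep the result on a 0-100 scale.
  return Math.round(Math.min(100, Math.max(0, raw)));
}
```

Because each term is computed separately before being combined, a reviewer could display the four components next to the total rather than only the final number.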

The cost of getting it wrong

One wrong senior hire costs you a quarter before you notice.

Recruiting time, onboarding, the velocity they drag down, and the months to re-open the role. Every candidate now passes the test with AI — so the test stopped telling you who can actually do the job.

Recruiter + panel hours

sourcing, screens, onsite loops, debriefs

Ramp that never lands

months of salary before real output

Velocity tax on the team

reviews, rework, and trust eroded

Re-opening the role

back to square one, two quarters lost

How it works

A live interview, with a flight recorder.

Candidates code in our browser sandbox. You get an annotated transcript and the evidence behind every signal, not a vibe.

01
Send a link.
Pick a problem from our bank or upload your own. Set rules: internet on/off, models allowed, banned words.
02
Candidate codes.
Browser-native IDE. We record keystrokes, focus changes, paste sources, prompts, model output — locally, encrypted, consented.
03
Assess how they use AI.
See when they pushed back on a hallucination, refined a prompt, or pasted unread. Score the judgment — not just the output.
04
You see the receipts.
Replay the whole session. Hover any line for its origin. Approve, reject or invite to onsite — with evidence.
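
As an illustration of how recorded events could become the per-line origin labels shown in the replay (typed, AI paste, edited), here is a minimal sketch. The event shape (`kind`, `fromLine`, `toLine`) is an assumption, and the sketch ignores line shifts caused by insertions and deletions, which a real recorder would have to track.

```javascript
// Sketch only: the event shape and the labeling rule are assumptions.
// Events are processed in the order they were recorded.
function lineOrigins(events) {
  const origins = new Map(); // line number -> "typed" | "AI paste" | "edited"
  for (const ev of events) {
    for (let line = ev.fromLine; line <= ev.toLine; line++) {
      if (ev.kind === "ai-paste") {
        origins.set(line, "AI paste");
      } else if (ev.kind === "keystroke") {
        // Typing on a pasted line marks it as edited; otherwise it is typed.
        const prev = origins.get(line);
        const touchedPaste = prev === "AI paste" || prev === "edited";
        origins.set(line, touchedPaste ? "edited" : "typed");
      }
    }
  }
  return origins;
}

// Example: two typed lines, a six-line paste, then hand edits on lines 7-8.
const origins = lineOrigins([
  { kind: "keystroke", fromLine: 1, toLine: 2 },
  { kind: "ai-paste", fromLine: 4, toLine: 9 },
  { kind: "keystroke", fromLine: 7, toLine: 8 },
]);
```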

Real tasks · real environments

Bigger tasks.
Real environments.

Because AI does the boilerplate, you can finally ask the questions you actually care about. Spin up real services. Bring your codebase. Test on real data.

60-minute architecture, not 20-minute leetcode
Design a rate limiter, build a sync engine, refactor a hot path. Whiteboard-grade problems with a real environment.
Pre-wired services — Postgres, Redis, S3, queues
Pick the stack. We provision it. Pre-seed fixtures and traffic. Candidates connect from the editor — no install dance.
Bring your codebase, monitoring, fixtures
Drop in a private repo, your Prometheus dashboard, your Sentry. Interview in the stack the candidate would actually ship to.

task.config.yaml

ready

title: "Rate limiter, 90 min"
duration: 90m
repo: github.com/acme/billing
branch: main
services:
  - postgres:15         # pre-seeded
  - redis:7             # flushed
  - stripe-sandbox      # mock keys
fixtures:
  - /seed/users.sql     # 4,200 rows
  - /seed/events.json   # 120k events
monitoring:
  - prometheus          # exposed :9090
  - sentry              # project-acme
ai_allowed:
  - claude-opus-4.7     # on
  - gpt-5.5             # on
  - deepseek-v3.2       # on
  - budget              # $5.00

Controls

You set the rules.
We enforce them.

Not every role bans AI — and for some, prompting is the skill. Configure what's allowed, per question. Budget the model spend so a candidate can't burn $200 looking for the answer.

Cut internet access

Sandbox the browser. No DNS, no models, no Stack Overflow. Just the editor and stdlib docs you whitelist.

internet: blocked

Prohibit prompts & words

Block specific phrases, regex patterns, or whole categories — leetcode answers, API keys, framework names.

"give me the answer" · leetcode.com · /o1-.*/
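
A blocklist like the one above mixes plain phrases with regex patterns. The sketch below shows one way such rules could be checked against a prompt before it reaches the model. `buildPromptFilter` and the shape of its result are hypothetical, and treating phrases as case-insensitive substrings is an assumption.

```javascript
// Sketch only: buildPromptFilter and its return shape are hypothetical.
// String rules match as case-insensitive substrings; RegExp rules are used as-is.
function buildPromptFilter(rules) {
  const matchers = rules.map((rule) =>
    rule instanceof RegExp
      ? {
          label: String(rule),
          test: (text) => {
            rule.lastIndex = 0; // keeps global/sticky patterns stateless
            return rule.test(text);
          },
        }
      : {
          label: rule,
          test: (text) => text.toLowerCase().includes(rule.toLowerCase()),
        }
  );

  // Returns the first rule that matched so the reviewer can see why a prompt was blocked.
  return (prompt) => {
    const hit = matchers.find((m) => m.test(prompt));
    return hit ? { allowed: false, rule: hit.label } : { allowed: true, rule: null };
  };
}

const checkPrompt = buildPromptFilter(["give me the answer", "leetcode.com", /o1-.*/]);
```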

Whitelist models & budget

Choose which assistants are allowed — and cap the spend. Cut a candidate off at $5 or 100k tokens, whichever comes first.

Claude Opus 4.7: allowed
GPT-5.5: allowed
DeepSeek v3.2: allowed
Kimi K2: blocked
GLM 4.6: blocked
budget: $1.84 / $5.00
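
The "whichever comes first" cap described above can be sketched as a small guard that tracks both limits and trips when either is reached. `createBudget` and its methods are hypothetical names, and accounting in integer cents is an assumption made to avoid floating-point drift.

```javascript
// Sketch only: createBudget and its methods are hypothetical.
// Money is tracked in integer cents to avoid floating-point rounding.
function createBudget({ maxCents, maxTokens }) {
  let spentCents = 0;
  let usedTokens = 0;

  return {
    // Record the cost and token count of one model call.
    record(costCents, tokens) {
      spentCents += costCents;
      usedTokens += tokens;
    },
    // True as soon as either limit is reached, whichever comes first.
    exhausted() {
      return spentCents >= maxCents || usedTokens >= maxTokens;
    },
    remaining() {
      return {
        cents: Math.max(0, maxCents - spentCents),
        tokens: Math.max(0, maxTokens - usedTokens),
      };
    },
  };
}

// Example: a $5.00 / 100k-token cap after $1.84 of spend.
const budget = createBudget({ maxCents: 500, maxTokens: 100000 });
budget.record(184, 40000);
```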

Candidate trust · privacy by design

We watch the work,
never the person.

Surveillance-grade hiring scares off your best candidates and puts you on the wrong side of the GDPR and the EU AI Act. Prooftrace is built the other way around.

Consent before the timer starts
Candidates read a plain-English summary of what's captured and agree before any recording begins. No dark patterns, no fine print.
A human makes every call
Prooftrace surfaces evidence; it never auto-rejects. The EU AI Act classifies AI systems used in hiring as high-risk, so the hire / no-hire decision always stays with a person.
Region-pinned & encrypted
EU sessions stay on EU infrastructure, encrypted at rest, deleted on your schedule. We never train third-party models on candidate work.
Right to access & erasure
Candidates can view and delete their own session. DPA, lawful basis and sub-processor list available on request.

consent · before you begin

Here’s exactly what we record.

Nothing starts until you agree. You can request your data or have it deleted, any time.

During the task

Code & keystrokes in the editor
Prompts you send & the AI's output
Pastes and where they came from
Focus within the task tab

Never touched

✕Your webcam or microphone
✕Your screen outside the task
✕Your files or other browser tabs
✕Anything after you submit

A person reviews every result. Prooftrace never makes the hire decision for them.

Encrypted · EU-hosted
deletable on request

Limited beta

Stop using interview tasks
detached from reality.

Request access. We're onboarding teams in the order they sign up, rolling out through summer 2026.

Frequently asked

Questions we get a lot.