See how they reason with the agent.

HotTea provisions a real developer VM, lets your candidate work with Claude or Codex on a practical repo task, and produces one reviewer packet that shows their AI fluency: transcript, terminal trail, git snapshots, hidden-test grading, and an ATS-ready note.

Real VM, not a proctor

Candidates use Claude / Codex on a real repo. No locked browsers, no fake editors.

One packet per candidate

Transcript, terminal, diff, tests, hidden checks, and reviewer note in one place.

Hard cost caps

Per-session, per-message, and monthly token ceilings. No surprise overage.

What you pay for

$199/mo

Founding-team access. Hard caps, no overage.

25 sessions

Candidate sessions included each month.

2M tokens

Estimated monthly token cap. Requests stop before overage.

Why assessment teams care

Practical coding tasks still work. What's missing is an audit trail of the candidate's AI fluency.

GitHub-based assessments already show what a candidate can build. HotTea adds the reviewer packet for agentic engineering — how the candidate directs the agent, what they probe, what they push back on, what they verify. No proctoring, no locked browsers, no theater. The candidate works the way they really work. You read the receipts.

What lands in the packet

Real agentic workspace

Each candidate gets an isolated developer VM with Claude, Codex, tests, git, and normal shell tooling. No mock IDE.

AI fluency evidence

Transcript, terminal output, git snapshots, final diff, tests, and a structured fluency analysis with quoted reasoning moments, probe responses, and follow-up questions.

Cost guardrails

Prompt length, message count, candidate session count, and monthly token caps are hard stops, not soft alerts.
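To make "hard stop" concrete, here is a minimal sketch of a pre-flight check that refuses a request before it can cross a cap. It is an illustration, not HotTea's implementation: every name (`Caps`, `Usage`, `start_session`, `admit_message`, `CapExceeded`) is hypothetical, and only the 25-session and 2M-token figures come from the plan above. The prompt-length and message-count values are placeholders.

```python
from dataclasses import dataclass


class CapExceeded(Exception):
    """Raised when a request would cross a hard cap."""


@dataclass
class Caps:
    monthly_tokens: int = 2_000_000       # from the plan: 2M tokens per month
    monthly_sessions: int = 25            # from the plan: 25 sessions per month
    max_prompt_chars: int = 8_000         # placeholder value (assumption)
    max_messages_per_session: int = 200   # placeholder value (assumption)


@dataclass
class Usage:
    tokens_this_month: int = 0
    sessions_this_month: int = 0


def start_session(caps: Caps, usage: Usage) -> int:
    """Open a candidate session, or refuse if the monthly count is used up."""
    if usage.sessions_this_month >= caps.monthly_sessions:
        raise CapExceeded("monthly session cap reached")
    usage.sessions_this_month += 1
    return 0  # messages sent so far in this session


def admit_message(caps: Caps, usage: Usage, messages_so_far: int,
                  prompt: str, estimated_tokens: int) -> int:
    """Check every cap before the request is sent; return the new message count."""
    if len(prompt) > caps.max_prompt_chars:
        raise CapExceeded("prompt too long")
    if messages_so_far >= caps.max_messages_per_session:
        raise CapExceeded("session message cap reached")
    if usage.tokens_this_month + estimated_tokens > caps.monthly_tokens:
        raise CapExceeded("monthly token cap would be exceeded")
    # Usage is recorded only after all checks pass.
    usage.tokens_this_month += estimated_tokens
    return messages_so_far + 1
```

Because the check runs before the request and raises instead of logging, a rejected message adds nothing to the running total. That is the difference between a hard stop and a soft alert.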
