I Tried the 3-Line Agent Skill grill-me — A Pre-Implementation Harness

Lead

Hi, I’m Pomarano.

In an earlier post, I compared JSON formatting with and without a harness. That harness validated output after generation.
grill-me, which I found recently, is an Agent Skill by Matt Pocock that works differently: it asks questions before implementation until you and the agent share the same understanding.

I read Ryo Nakae’s Zenn article, thought “can three lines really change this much?”, and tried it in Cursor.
This post covers the skill itself, how it relates to my earlier harness posts, what I verified, and when it helps (or doesn’t).


What is grill-me?

grill-me tells the agent to interview you relentlessly about a plan or design until you reach shared understanding.
In Claude Code you invoke /grill-me; in Cursor you load it as a Skill.

Key behaviors:

  • One question at a time
  • Walk each branch of the design decision tree
  • Include a recommended answer with each question
  • Explore the codebase instead of asking you, when the answer is in the repo
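As a mental model only, the first three behaviors resemble a depth-first walk over a decision tree that surfaces one question at a time, each with a suggested answer. The skill itself is plain prompt text and contains no code, so the `Decision` class, the `walk` function, and the example tree below are hypothetical illustrations, not part of grill-me.

```python
from dataclasses import dataclass, field
from typing import Iterator


@dataclass
class Decision:
    """One node in a hypothetical design decision tree."""
    question: str
    recommended: str
    children: list["Decision"] = field(default_factory=list)


def walk(node: Decision) -> Iterator[tuple[str, str]]:
    """Yield (question, recommended answer) pairs depth-first, one at a time."""
    yield node.question, node.recommended
    for child in node.children:
        yield from walk(child)


# Hypothetical tree, loosely based on the feature idea discussed later in this post.
tree = Decision(
    "How often should a blog URL appear?",
    "Once a week",
    [
        Decision("Fixed weekday or rotation?", "Fixed weekday"),
        Decision("Auto-insert or suggest only?", "Suggest only; a human posts"),
    ],
)

for number, (question, recommended) in enumerate(walk(tree), start=1):
    print(f"Q{number}: {question} (recommended: {recommended})")
```

A parent decision is always asked before the decisions that depend on it, which is roughly what "resolving dependencies between decisions one-by-one" asks the agent to do.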

The skill body is effectively three lines, excluding frontmatter (the YAML metadata block at the top of a Markdown file). It is not a long prompt template — just a short instruction on how to behave.


Two kinds of harness

In my JSON harness post, schema validation and retries fixed output shape.
In semi-automated X drafts, character limits and duplicate checks constrained quality after generation.

grill-me works upstream of both.

Kind | When | Examples
Output harness | After generation | JSON schema, character count, retries
Alignment harness (grill-me) | Before implementation | Requirements, boundaries, priorities

If you only say “please build this,” agents tend to rush out a plan or code. grill-me acts as a short leash: the agent stops and asks first.
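For contrast, here is a minimal sketch of what an output harness does: validate the generated text and retry on failure. It is not the code from my earlier post. The required keys, the `validate` and `generate_with_retries` names, and the stand-in generator are all assumptions made for illustration.

```python
import json
from typing import Callable

REQUIRED_KEYS = {"title", "body"}  # hypothetical schema: both keys must be present


def validate(raw: str) -> dict:
    """Parse raw output and check its shape; raise ValueError on any failure."""
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = REQUIRED_KEYS - set(data)
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    return data


def generate_with_retries(generate: Callable[[], str], max_attempts: int = 3) -> dict:
    """Call generate() until its output validates or the attempts run out."""
    last_error = None
    for _ in range(max_attempts):
        try:
            return validate(generate())
        except ValueError as error:
            last_error = error
    raise RuntimeError(f"no valid output after {max_attempts} attempts") from last_error


# Stand-in for a model call: the first reply is truncated, the second is valid.
replies = iter(['{"title": "x"', '{"title": "x", "body": "y"}'])
result = generate_with_retries(lambda: next(replies))
print(result)
```

A loop like this can enforce the shape of the output, but it has no way to notice that the requirement behind the schema was wrong. That gap is what the questioning step targets.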


Full skill text (original)

Source: mattpocock/skills — grill-me by Matt Pocock

---
name: grill-me
description: Interview the user relentlessly about a plan or design until reaching shared understanding, resolving each branch of the decision tree. Use when user wants to stress-test a plan, get grilled on their design, or mentions "grill me".
---

Interview me relentlessly about every aspect of this plan until we reach a shared understanding. Walk down each branch of the design tree, resolving dependencies between decisions one-by-one. For each question, provide your recommended answer.

Ask the questions one at a time.

If a question can be answered by exploring the codebase, explore the codebase instead.

The three instruction lines (after frontmatter) are the entire behavioral spec.
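The three-line claim can be checked mechanically. The sketch below assumes the skill text quoted above is held in a string; `split_frontmatter` is a hypothetical helper written for this post, not something shipped with the skill or with Cursor.

```python
SKILL_MD = """\
---
name: grill-me
description: Interview the user relentlessly about a plan or design until reaching shared understanding, resolving each branch of the decision tree. Use when user wants to stress-test a plan, get grilled on their design, or mentions "grill me".
---

Interview me relentlessly about every aspect of this plan until we reach a shared understanding. Walk down each branch of the design tree, resolving dependencies between decisions one-by-one. For each question, provide your recommended answer.

Ask the questions one at a time.

If a question can be answered by exploring the codebase, explore the codebase instead.
"""


def split_frontmatter(text: str) -> tuple[str, str]:
    """Split a Markdown document into (frontmatter, body) at the '---' fences."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return "", text
    closing = lines.index("---", 1)  # raises ValueError if the block is never closed
    return "\n".join(lines[1:closing]), "\n".join(lines[closing + 1:])


frontmatter, body = split_frontmatter(SKILL_MD)
instructions = [line for line in body.splitlines() if line.strip()]
print(len(instructions))  # prints 3
```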


Setup and how I invoked it

Environment

  • Cursor (Agent mode)
  • Compared against: normal chat and “dump a plan immediately” behavior (similar to Plan mode)

Install (Cursor)

Follow skills.sh:

npx skills add https://github.com/mattpocock/skills --skill grill-me

Or place grill-me/SKILL.md under ~/.cursor/skills/ (personal) or .cursor/skills/ (project).

How to call it

Describe your goal in chat and say you want grill-me. Example:

Use grill-me for a design review.

I want to add "one blog URL per week" to my X draft agent.
Align on requirements before implementation.

What I verified

I ran two passes:

  1. Meta verification — use grill-me-style questioning to decide this article’s structure (before writing)
  2. Pre-implementation verification — stress-test a real feature idea on an existing project

1. Meta verification (article outline)

Before drafting, I answered grill-me-style questions one by one. The session ran about 12 questions and settled on:

# | Example question | Decision
1 | Who is the reader? | People who read my harness and X automation posts
2 | Goal of the post? | My verification log, not evangelism
3 | Relation to prior posts? | Contrast output harness vs alignment harness
4 | Verification topic? | Real feature idea (weekly blog URL on X drafts)
5 | Tools? | Cursor, not Claude Code only
6 | Compare to Plan mode? | Re-test points from the Zenn article in my own words
7 | Downsides? | Be honest about fatigue and time cost
8 | Skill text? | Include original with attribution
9 | Diagrams? | A two-type harness table is enough
10 | Internal links | ai/272, ai/318 (and EN equivalents)
11 | English version? | Separate draft after JP publish
12 | Reproducibility? | Install steps and sample prompt

Takeaway: After 12 questions, the outline and the definition of “what counts as verification” were mostly fixed. Compared with Plan mode, which dumps a full document at once, answering one question at a time kept the outline from drifting.

2. Pre-implementation verification (X draft agent extension)

Topic: Add a rule to the semi-automated X draft agent to include one blog article URL per week in a post.

Item | Result (sample; replace with live log if you re-run)
Question count | (e.g. 18)
Codebase exploration | (e.g. auto-read x-shuuchaku-agent-spec.md)
Decided spec | (e.g. Sundays only / short URL within 140 chars / human must approve)
Still open | (e.g. English post URLs)
Session time | (e.g. ~25 min)

Sample Q&A (replace with your actual session):

  • Q: Fixed weekday or rotation for “once a week”? → Sunday (fits weekly reflection)
  • Q: Conflicts with spec “no URLs”? → Needs an explicit exception in the spec file
  • Q: Auto-insert or suggest only? → Suggest in draft; human posts (keeps semi-auto policy)
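To show how answers like these could turn into a checkable rule, here is a rough sketch that uses the sample values above (Sunday, 140 characters, human approval). None of this is the actual draft agent. `Draft`, `suggest_with_url`, and the constants are hypothetical, and the length check is a plain character count rather than the way X itself counts URLs.

```python
from dataclasses import dataclass
from datetime import date

MAX_CHARS = 140  # sample limit from the table above, not a confirmed spec
URL_WEEKDAY = 6  # date.weekday(): Monday is 0, Sunday is 6


@dataclass
class Draft:
    text: str
    needs_human_approval: bool = True  # semi-auto policy: a human always posts


def suggest_with_url(text: str, url: str, today: date) -> Draft:
    """Append the blog URL on the chosen weekday if the result fits the limit."""
    if today.weekday() != URL_WEEKDAY:
        return Draft(text)
    candidate = f"{text} {url}"
    if len(candidate) > MAX_CHARS:
        return Draft(text)  # too long: keep the draft URL-free
    return Draft(candidate)


sunday = date(2024, 1, 7)
monday = date(2024, 1, 8)
url = "https://example.com/p/1"  # placeholder URL
print(suggest_with_url("Weekly reflection", url, sunday).text)
print(suggest_with_url("Weekly reflection", url, monday).text)
```

The function only ever returns a draft that still needs approval, which mirrors the "suggest in draft; human posts" answer.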

Comparison with Plan mode and JSON harness

Approach | Strengths | Weaknesses
Plan mode | Full picture at once | Small edits often rewrite the whole plan
JSON harness | Pass/fail on output is mechanical | Does not fix wrong requirements
grill-me | Reduces pre-implementation mismatch | Costs time and focus; tiring

As Ryo Nakae’s Zenn post notes, after grill-me the conversation log becomes the implementation plan — easy to hand off with “OK, build this.” I saw the same pattern.


What was awkward

  • Tiring: after about 10 questions, the nonstop decision-making wears you down. The Zenn article’s “brain sizzling” line resonates
  • Overkill for tiny tasks: not worth it for typo fixes
  • Depth depends on the model: reports that Opus-class models branch deeper sound plausible
  • Does not auto-trigger — you must say “use grill-me” explicitly

Summary

  • grill-me is a pre-implementation alignment harness in just a few lines
  • JSON harness is after output; grill-me is before
  • In both runs, the questions cleared up confusion faster than writing code would have, for the article structure and for the feature sketch
  • Downsides are time and fatigue; not every task needs it

Next I may try grill-me → requirements note → implement on the X draft pipeline.

See also Matt Pocock’s grill-me intro on AI Hero and skills.sh.

