I Tried the 3-Line Agent Skill grill-me — A Pre-Implementation Harness

Lead
What is grill-me?
Two kinds of harness
Full skill text (original)
Setup and how I invoked it
What I verified
1. Meta verification (article outline)
2. Pre-implementation verification (X draft agent extension)
Comparison with Plan mode and JSON harness
What was awkward
Summary

Lead

Hi, I’m Pomarano.

In an earlier post, I compared JSON formatting with and without a harness. That harness validated output after generation.
grill-me, which I found recently, is an Agent Skill by Matt Pocock that works differently: it asks questions before implementation until you and the agent share the same understanding.

I read Ryo Nakae’s Zenn article, thought “can three lines really change this much?”, and tried it in Cursor.
This post covers the skill itself, how it relates to my earlier harness posts, what I verified, and when it helps (or doesn’t).

Japanese version: たった3行の Agent Skill「grill-me」を試した

What is grill-me?

grill-me tells the agent to interview you relentlessly about a plan or design until you reach shared understanding.
In Claude Code you invoke /grill-me; in Cursor you load it as a Skill.

Key behaviors:

One question at a time
Walk each branch of the design decision tree
Include a recommended answer with each question
Explore the codebase instead of asking you, when the answer is in the repo

The skill body is effectively three lines, excluding frontmatter (the YAML metadata block at the top of a Markdown file). It is not a long prompt template — just a short instruction on how to behave.

Two kinds of harness

In my JSON harness post, schema validation and retries fixed output shape.
In semi-automated X drafts, character limits and duplicate checks constrained quality after generation.

grill-me works upstream of both.

Kind	When	Examples
Output harness	After generation	JSON schema, character count, retries
Alignment harness (grill-me)	Before implementation	Requirements, boundaries, priorities
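To make the "output harness" row concrete, here is a minimal Python sketch of a validate-and-retry loop. It is not the code from my earlier posts. The `text` field, the 140-character limit, and `fake_model` are assumptions for illustration only.

```python
import json
from typing import Callable, Optional

MAX_CHARS = 140  # assumed limit, for illustration only


def validate(raw: str) -> Optional[dict]:
    """Return the parsed draft if it passes every check, else None."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    text = data.get("text")  # "text" is a hypothetical field name
    if not isinstance(text, str) or not text or len(text) > MAX_CHARS:
        return None
    return data


def generate_with_harness(
    generate: Callable[[int], str], max_attempts: int = 3
) -> Optional[dict]:
    """Call generate(attempt) until the output validates or attempts run out."""
    for attempt in range(1, max_attempts + 1):
        draft = validate(generate(attempt))
        if draft is not None:
            return draft
    return None


def fake_model(attempt: int) -> str:
    # Stand-in for a real model call: broken output first, valid JSON second.
    return "not json" if attempt == 1 else json.dumps({"text": "New post is up"})


print(generate_with_harness(fake_model))
```

Every check here runs after generation. Nothing in this loop can tell whether "New post is up" was the right thing to write in the first place, and that gap is what an alignment harness targets.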

If you only say “please build this,” agents tend to rush out a plan or code. grill-me acts as a short leash: it makes the agent stop and ask first.

Full skill text (original)

Source: mattpocock/skills — grill-me by Matt Pocock

---
name: grill-me
description: Interview the user relentlessly about a plan or design until reaching shared understanding, resolving each branch of the decision tree. Use when user wants to stress-test a plan, get grilled on their design, or mentions "grill me".
---

Interview me relentlessly about every aspect of this plan until we reach a shared understanding. Walk down each branch of the design tree, resolving dependencies between decisions one-by-one. For each question, provide your recommended answer.

Ask the questions one at a time.

If a question can be answered by exploring the codebase, explore the codebase instead.

The three instruction lines (after frontmatter) are the entire behavioral spec.
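If you want to check the "three lines" claim yourself, a small Python sketch like the one below splits the frontmatter from the body and counts the non-blank lines. The helper names are my own, and the frontmatter handling is deliberately naive (it only looks for the two `---` delimiters rather than parsing YAML).

```python
from __future__ import annotations

SKILL_MD = """---
name: grill-me
description: Interview the user relentlessly about a plan or design until reaching shared understanding, resolving each branch of the decision tree. Use when user wants to stress-test a plan, get grilled on their design, or mentions "grill me".
---

Interview me relentlessly about every aspect of this plan until we reach a shared understanding. Walk down each branch of the design tree, resolving dependencies between decisions one-by-one. For each question, provide your recommended answer.

Ask the questions one at a time.

If a question can be answered by exploring the codebase, explore the codebase instead.
"""


def split_frontmatter(text: str) -> tuple[str, str]:
    """Return (frontmatter, body); frontmatter is empty if the file has none."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return "", text
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            return "\n".join(lines[1:i]), "\n".join(lines[i + 1:])
    return "", text  # no closing delimiter: treat everything as body


def instruction_lines(text: str) -> list[str]:
    """Non-blank lines of the body, i.e. the actual instructions."""
    _, body = split_frontmatter(text)
    return [line.strip() for line in body.splitlines() if line.strip()]


for number, line in enumerate(instruction_lines(SKILL_MD), start=1):
    print(number, line[:50])
```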

Setup and how I invoked it

Environment

Cursor (Agent mode)
Compared against: normal chat and “dump a plan immediately” behavior (similar to Plan mode)

Install (Cursor)

Follow skills.sh:

npx skills add https://github.com/mattpocock/skills --skill grill-me

Or place grill-me/SKILL.md under ~/.cursor/skills/ (personal) or .cursor/skills/ (project).
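To confirm the file ended up where Cursor looks for it, a quick check along these lines may help. This is a sketch that only tests the two locations mentioned above; `find_skill` is a hypothetical helper, not part of Cursor or the skills CLI.

```python
from pathlib import Path
from typing import List


def find_skill(name: str, home: Path, project: Path) -> List[Path]:
    """Return the SKILL.md paths that exist for a skill, personal first."""
    candidates = [
        home / ".cursor" / "skills" / name / "SKILL.md",     # personal
        project / ".cursor" / "skills" / name / "SKILL.md",  # project
    ]
    return [path for path in candidates if path.is_file()]


found = find_skill("grill-me", Path.home(), Path.cwd())
print(found if found else "grill-me not found in either location")
```

Run it from the project root so that `Path.cwd()` points at the directory containing `.cursor/`.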

How to call it

Describe your goal in chat and say you want grill-me. Example:

Use grill-me for a design review.

I want to add "one blog URL per week" to my X draft agent.
Align on requirements before implementation.

What I verified

I ran two passes:

Meta verification — use grill-me-style questioning to decide this article’s structure (before writing)
Pre-implementation verification — stress-test a real feature idea on an existing project

1. Meta verification (article outline)

Before drafting, I answered grill-me-style questions one by one. The session ran about 12 questions and settled on:

#	Example question	Decision
1	Who is the reader?	People who read my harness and X automation posts
2	Goal of the post?	My verification log, not evangelism
3	Relation to prior posts?	Contrast output harness vs alignment harness
4	Verification topic?	Real feature idea (weekly blog URL on X drafts)
5	Tools?	Cursor — not Claude Code only
6	Compare to Plan mode?	Re-test points from the Zenn article in my own words
7	Downsides?	Be honest about fatigue and time cost
8	Skill text?	Include original with attribution
9	Diagrams?	A two-type harness table is enough
10	Internal links?	ai/272, ai/318 (and EN equivalents)
11	English version?	Separate draft after JP publish
12	Reproducibility?	Install steps and sample prompt

Takeaway: After 12 questions, the outline and “what counts as verification” were mostly fixed. Compared with Plan mode, which dumps a full document at once, answering one question at a time kept the discussion from drifting.

2. Pre-implementation verification (X draft agent extension)

Topic: Add a rule to the semi-automated X draft agent to include one blog article URL per week in a post.

Item	Result (sample — replace with live log if you re-run)
Question count	(e.g. 18)
Codebase exploration	(e.g. auto-read `x-shuuchaku-agent-spec.md`)
Decided spec	(e.g. Sundays only / short URL within 140 chars / human must approve)
Still open	(e.g. English post URLs)
Session time	(e.g. ~25 min)

Sample Q&A (replace with your actual session):

Q: Fixed weekday or rotation for “once a week”? → Sunday (fits weekly reflection)
Q: Conflicts with spec “no URLs”? → Needs an explicit exception in the spec file
Q: Auto-insert or suggest only? → Suggest in draft; human posts (keeps semi-auto policy)
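To show how decisions like these could turn into code, here is a Python sketch built on the sample values above (Sundays only, 140 characters, human approval). These values are placeholders rather than a confirmed spec, the function names are hypothetical, and plain `len()` is a simplification because the platform's own character counting may differ.

```python
from datetime import date

MAX_CHARS = 140  # sample value from the table above, not a confirmed spec


def is_url_day(day: date) -> bool:
    """True on Sundays (date.weekday() counts Monday as 0, Sunday as 6)."""
    return day.weekday() == 6


def suggest_draft(text: str, url: str, day: date) -> dict:
    """Build a draft for human review; this never posts anything itself."""
    with_url = f"{text} {url}"
    include = is_url_day(day) and len(with_url) <= MAX_CHARS
    return {
        "text": with_url if include else text,
        "includes_url": include,
        "needs_human_approval": True,  # keeps the semi-auto policy
    }


print(suggest_draft("Weekly recap", "https://example.com/a", date(2024, 1, 7)))
```

The sketch only suggests a draft. Posting stays with a human, which matches the third sample answer.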

Comparison with Plan mode and JSON harness

Approach	Strengths	Weaknesses
Plan mode	Full picture at once	Small edits often rewrite the whole plan
JSON harness	Pass/fail on output is mechanical	Does not fix wrong requirements
grill-me	Reduces pre-implementation mismatch	Costs time and focus; tiring

As Ryo Nakae’s Zenn post notes, after grill-me the conversation log becomes the implementation plan — easy to hand off with “OK, build this.” I saw the same pattern.
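One way to make that handoff explicit is to turn the Q&A log into a short requirements note. The sketch below assumes the log uses the `Q: ... → ...` notation from my sample above. That notation is this post's own, not a format grill-me produces, and both function names are hypothetical.

```python
from typing import List, Tuple

SESSION_LOG = """\
Q: Fixed weekday or rotation for “once a week”? → Sunday (fits weekly reflection)
Q: Conflicts with spec “no URLs”? → Needs an explicit exception in the spec file
Q: Auto-insert or suggest only? → Suggest in draft; human posts (keeps semi-auto policy)
"""


def parse_decisions(log: str) -> List[Tuple[str, str]]:
    """Extract (question, decision) pairs from 'Q: ... → ...' lines."""
    decisions = []
    for line in log.splitlines():
        if not line.startswith("Q:") or "→" not in line:
            continue  # skip anything that is not a decided question
        question, answer = line[2:].split("→", 1)
        decisions.append((question.strip(), answer.strip()))
    return decisions


def to_requirements_note(decisions: List[Tuple[str, str]]) -> str:
    """Render the decisions as a plain-text note to hand to the agent."""
    lines = ["Requirements (from grill-me session)"]
    for number, (question, answer) in enumerate(decisions, start=1):
        lines.append(f"{number}. {question}")
        lines.append(f"   Decision: {answer}")
    return "\n".join(lines)


print(to_requirements_note(parse_decisions(SESSION_LOG)))
```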

What was awkward

Tiring: past ~10 questions, the constant decision-making wears you down. The “brain sizzling” line from the Zenn article resonates
Overkill for tiny tasks: not worth it for typo fixes
Depth depends on the model: reports that Opus-class models branch deeper sound plausible
Does not auto-trigger: you must say “use grill-me” explicitly

Summary

grill-me is a pre-implementation alignment harness in just a few lines
The JSON harness acts after output; grill-me acts before implementation
In both passes (article structure and a feature sketch), answering questions cleared up confusion faster than writing code would have
Downsides are time and fatigue; not every task needs it

Next I may try grill-me → requirements note → implement on the X draft pipeline.

See also Matt Pocock’s grill-me intro on AI Hero and skills.sh.