4  Skills for Publication-Ready Research

A researcher’s skills library for the full publication pipeline

5 What Are Skills?

A skill is a reusable set of instructions that tells Claude how to perform a specific task. At its core, a skill is just a folder containing a SKILL.md file — a plain-text document written in markdown that describes what Claude should do, step by step. When Claude encounters a relevant task, it reads the skill and follows the instructions. You can also invoke a skill directly by typing /skill-name in the terminal.

Think of skills as saved expertise. Without a skill, you’d type the same detailed instructions every time you wanted Claude to, say, read a paper carefully or audit your code. With a skill, those instructions are written once and reused automatically.

There is no magic here. A skill is just a prompt saved to disk: you could paste the contents of any SKILL.md directly into a chat and get the same result. The difference is that a skill loads automatically whenever the task is relevant, so you never retype it.

Where skills live. Skills are stored in .claude/skills/ inside your project (available to that project only) or in ~/.claude/skills/ on your machine (available across all projects). Each skill gets its own folder. Note that .claude/ is a hidden folder (the dot makes it invisible by default) — use ls -a in the terminal or press Cmd+Shift+. in macOS Finder to see it:

.claude/skills/read-paper/
├── SKILL.md          ← the instructions Claude reads (required)
└── methodology.md    ← optional supporting files

What SKILL.md contains. The file starts with a short YAML header (name, description) and then markdown instructions. The description helps Claude decide when to use the skill automatically. Here’s a minimal example:

---
name: check-data
description: Validate a dataset for common problems before analysis
---

When asked to check a dataset:

1. Report dimensions (rows × columns)
2. Flag missing values by variable
3. Check for duplicate observations
4. Summarize variable types and ranges

That’s it — no programming required. If you can write a checklist, you can write a skill.
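For intuition, the four checklist items above amount to the following logic. This is an illustrative sketch only (the hypothetical `check_data` helper is not part of the skill, which is prose instructions that Claude executes itself):

```python
def check_data(rows):
    """Run the four checklist items on a dataset given as a list of dicts."""
    cols = list(rows[0].keys())
    return {
        # 1. Dimensions (rows x columns)
        "n_rows": len(rows),
        "n_cols": len(cols),
        # 2. Missing values by variable
        "missing": {c: sum(1 for r in rows if r.get(c) is None) for c in cols},
        # 3. Duplicate observations (identical rows)
        "n_duplicates": len(rows) - len({tuple(r.get(c) for c in cols) for r in rows}),
        # 4. Variable types (type names of non-missing values)
        "types": {c: {type(r[c]).__name__ for r in rows if r.get(c) is not None} for c in cols},
    }
```

The skill's value is not that this logic is hard to write; it is that Claude applies it consistently every time instead of checking whichever items come to mind.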

Note

Additional frontmatter fields. The name and description fields are the essentials. The full specification also supports disable-model-invocation (prevents auto-triggering), user-invocable, allowed-tools, model, context (set to fork for subagent execution), agent, argument-hint, effort, and hooks (skill-scoped hooks), among others. See Anthropic’s skills documentation for the complete reference.

6 When Are Skills Worth Building?

Not all skills are equally valuable. Before writing one, ask: would Claude do this well from a one-off prompt? If yes, you don’t need a skill — just prompt it. Skills earn their keep in three situations:

  1. Guardrail skills prevent mistakes Claude makes inconsistently — like slipping into causal language when interpreting OLS, or reading a 60-page PDF in one pass and hallucinating half the citations. The skill doesn’t teach Claude anything new; it enforces discipline.
  2. Workflow skills impose a systematic process Claude wouldn’t follow on its own — like classifying every referee comment into 5 categories, or running robustness checks across 7 dimensions instead of listing a few and stopping.
  3. Convenience skills just save typing — reminding Claude which R packages you prefer or what your YAML header looks like. These are the weakest candidates. If it’s just formatting preferences, put it in CLAUDE.md instead.

If a skill doesn’t fall into category 1 or 2, you probably don’t need it.

Tip

When you’re ready to build your first custom skill:

“Help me create a Claude Code skill. I want a skill called [name] that does [what]. I’ve been doing this manually — here’s what the process looks like: [describe steps]. Create the SKILL.md file in .claude/skills/[name]/ with frontmatter and instructions.”
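The folder layout is also simple enough to scaffold yourself. A minimal Python sketch (the `scaffold_skill` helper and its arguments are placeholders for illustration, not part of any tool):

```python
from pathlib import Path

def scaffold_skill(root, name, description, instructions):
    """Create .claude/skills/<name>/SKILL.md with YAML frontmatter and a markdown body."""
    skill_dir = Path(root) / ".claude" / "skills" / name
    skill_dir.mkdir(parents=True, exist_ok=True)
    path = skill_dir / "SKILL.md"
    path.write_text(f"---\nname: {name}\ndescription: {description}\n---\n\n{instructions}\n")
    return path
```

Either way, the result is the same: a folder under .claude/skills/ containing a SKILL.md that Claude can load.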

7 Auto-Memory: How Claude Learns From Its Mistakes

Before we look at individual skills, there’s one piece of infrastructure to cover: auto-memory. Claude Code has a built-in memory system that captures corrections and preferences across sessions.

7.1 The problem auto-memory solves

Claude has no memory between sessions. Every time you start a new conversation, it begins from scratch. If you corrected Claude yesterday — told it to use theme_bw() instead of theme_meridian(), or to never modify raw data files — it has already forgotten. Without intervention, you’ll make the same correction again tomorrow, and the day after that.

CLAUDE.md (Tutorial 2) solves part of this problem by giving Claude stable rules at the start of every session. But CLAUDE.md is for rules you know in advance. Many of the most important lessons only emerge through use — the edge cases, the quirks, the project-specific conventions that you discover when Claude gets something wrong.

Auto-memory captures those lessons automatically. When you correct Claude, it saves the correction to ~/.claude/projects/<project>/memory/ as a structured markdown file. An index file (MEMORY.md) lists all memories — the first 200 lines are loaded at every session start.
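The 200-line cap is worth picturing concretely: at session start, only the head of the index is loaded, so older or lower-priority entries past that point are not seen. A rough sketch of the truncation rule (assuming a plain-text MEMORY.md; this is not Claude Code's actual implementation):

```python
from pathlib import Path

def load_memory_index(memory_dir, max_lines=200):
    """Return the first max_lines lines of MEMORY.md, mimicking the session-start load."""
    index = Path(memory_dir) / "MEMORY.md"
    if not index.exists():
        return ""
    return "\n".join(index.read_text().splitlines()[:max_lines])
```

The practical implication: keep the index concise, so the corrections that matter most sit within the first 200 lines.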

7.2 How it works

Claude saves memories as individual markdown files in ~/.claude/projects/<project>/memory/. Each file has a name, description, and content. In practice, memories tend to capture things like:

  • Your preferences — “works primarily in R, uses theme_bw() for ggplot2”
  • Corrections — “don’t use theme_meridian(), use theme_bw()”
  • Project context — ongoing goals, decisions, key constraints

Unlike the older manual [LEARN] tag approach (from Pedro Sant’Anna’s CLAUDE.md template), auto-memory is a built-in Claude Code feature — no Session Startup section or manual logging required.

7.3 How corrections get added

When Claude makes a mistake and you correct it, it automatically saves the lesson. You can also ask Claude to remember something conversationally:

Tip

Prompt:

“Always use theme_bw() for ggplot2 figures in this project”

Claude saves this as a memory that persists across sessions. You can manage memories with the /memory command, which lets you toggle auto-memory on or off and open the memory folder to review or edit files directly.

7.4 Why this matters for skills

Many of the skills in this tutorial produce outputs that depend on project conventions — which plotting theme to use, where to save files, how to name variables. Auto-memory ensures Claude applies your accumulated corrections every time, so skills produce output that fits your workflow without you repeating the same instructions.

8 What This Tutorial Covers

This tutorial walks through a working skills library for the full publication pipeline — from reading papers through submission. The skills borrow from Scott Cunningham’s MixtapeTools, Hugo Sant’Anna’s Clo-Author, and my own experience with the academic workflow. I’ve adapted them for political science and applied economics.

The accompanying popescu_claude/ folder contains both the skills and the project configuration:

popescu_claude/
├── README.md                ← what's in the folder and how to use it
├── .claude/
│   ├── CLAUDE.md            ← project configuration (Tutorial 2)
│   └── skills/
│       ├── read-paper/          ← Structured paper reading (guardrail)
│       ├── search-pdf/          ← Targeted search in large PDFs (convenience)
│       ├── discover-lit/        ← Literature discovery (workflow)
│       ├── strategize/          ← Research design before code (workflow)
│       ├── argument-review/     ← Adversarial paper & argument audit (workflow)
│       ├── methods-review/      ← Adversarial code & methods audit (workflow)
│       ├── presentation-builder/← Academic slide deck generation (workflow)
│       ├── robustness-battery/  ← Systematic specification testing (workflow)
│       ├── regression-interpret/← Regression output interpretation (guardrail)
│       ├── humanizer/           ← AI writing pattern removal (guardrail)
│       └── submit-package/      ← Journal submission package (workflow)

The .claude/ folder has the project configuration from Tutorial 2: the CLAUDE.md that tells Claude who you are and how to behave. Auto-memory (explained above) handles corrections automatically. Skills live in .claude/skills/ — this is where Claude Code looks for them.

I’ll walk through three skills in detail and then briefly describe the rest.

9 Skill 1: read-paper (Guardrail)

9.1 The problem

When you give Claude a full PDF, it tries to process everything at once. For short documents this works fine. For a 40-page economics paper with tables, equations, and appendices, it produces shallow summaries with hallucinated details — the kind where the paper is real, the method is roughly right, but the specific coefficient from Table 3 Column 4 is fabricated.

Literature reviews in good journals require you to actually engage with source material: correct attribution of findings, precise methodological descriptions, honest representation of what prior work does and doesn’t show. If you let Claude read papers in one pass and then write your lit review, you’ll get hallucinated citations, methodological mischaracterization, and overconfident claims about what sources actually found.

9.2 What the skill enforces

Three hard constraints that Claude would not follow on its own:

  1. Never read a full PDF. Split into 4-page chunks first.
  2. Read only 3 chunks at a time (~12 pages). After each batch, update running notes and pause.
  3. Wait for user confirmation before reading the next batch.

The pause-and-confirm protocol is what makes this work. It forces Claude to commit to an interpretation of pages 1–12 before seeing pages 13–24, which makes it unlikely to retroactively revise early claims to sound more coherent with later material. This is closer to how you’d actually read a paper yourself.
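The chunk-and-batch arithmetic behind constraints 1 and 2 can be sketched as follows (a hypothetical helper for illustration; the skill itself is prose instructions, not code):

```python
def chunk_plan(n_pages, chunk_size=4, chunks_per_batch=3):
    """Split an n-page PDF into 4-page chunks, grouped into batches of 3 chunks (~12 pages)."""
    chunks = [(start, min(start + chunk_size - 1, n_pages))
              for start in range(1, n_pages + 1, chunk_size)]
    return [chunks[i:i + chunks_per_batch]
            for i in range(0, len(chunks), chunks_per_batch)]
```

For a 40-page paper this yields ten chunks in four batches: Claude reads batch one (pages 1–12), updates its notes, and waits for confirmation before starting batch two.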

Structured extraction. As Claude reads, it collects information along 8 dimensions:

1. Research question: what and why
2. Audience: which sub-community
3. Method: identification strategy
4. Data: source, unit, sample, period
5. Statistical methods: estimator, key specifications
6. Findings: coefficients, standard errors
7. Contributions: what's new
8. Replication feasibility: public data? code archive?

These extract what you need to build on or replicate the work, not just summarize it.
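A notes skeleton following these dimensions might look like this (an illustrative template; the skill's actual output format may differ):

```markdown
# Notes: <paper>

1. Research question: ...
2. Audience: ...
3. Method / identification strategy: ...
4. Data (source, unit, sample, period): ...
5. Statistical methods (estimator, key specifications): ...
6. Findings (coefficients, standard errors): ...
7. Contributions: ...
8. Replication feasibility (public data? code archive?): ...
```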

9.3 How to use it

Tip

Prompt:

“/read-paper articles/acemoglu_2001.pdf”

Claude splits the PDF, reads the first ~12 pages, presents structured notes, and waits for your confirmation before continuing. The final output is a notes.md file in the split directory with specific data sources, variable names, and coefficient estimates — not vague summaries.

After reading all papers for a literature review, synthesize:

Tip

Prompt:

“I’ve now read 5 papers on colonial institutions and development. Synthesize the structured notes into a literature review identifying: points of agreement, methodological debates, and gaps my study addresses. Do NOT invent any citations.”

Warning

Always audit citations. Even with structured reading, verify that every paper exists, the author actually said what’s claimed, the findings are accurately reported, and table/column/figure references are correct. AI confidently produces citations where the paper is real but the specific numbers are fabricated. These are the hardest hallucinations to spot because they look plausible.

10 Skill 2: argument-review (Workflow)

10.1 The problem

If you ask Claude to “review my paper,” you get a generic list of suggestions. Some are useful, some are hallucinated, and there’s no way to tell which is which.

10.2 What the skill does

The argument-review skill runs an 8-step adversarial intellectual review of your paper. It checks argument gaps, internal consistency, methods-theory alignment, assumptions, notation, and simulates a referee report — then forces an author defense before finalizing.

1. Ruthless Reader: claims asserted but not demonstrated, logical gaps
2. Internal Consistency: numbers in text vs. tables, abstract vs. results
3. Methods-Theory Alignment: does the design identify what the theory claims?
4. Assumption Stress Test: are identifying assumptions stated, testable, tested?
5. Notation & Precision: consistent symbols, defined terms, notation drift
6. Simulated Referee Report: the 3–5 most damaging criticisms, classified by severity
7. Author Defense: a steelman defense of each criticism; does it survive?
8. Synthesis: remove hallucinated criticisms, deduplicate, produce an actionable report

Step 3 (Methods-Theory Alignment) is the most valuable. It forces Claude to write out explicitly what the research design identifies and compare it to the theoretical claim. Many papers have a clean design that identifies something subtly different from what the theory needs — this step catches that gap.

Step 7 (Author Defense) is what makes this honest. After generating criticisms, the skill switches perspective and stress-tests each one. Any criticism that turns out to be hallucinated, factually wrong, or that the author can fully defend gets dropped from the final report.

The final synthesis uses a constructive tone — “here is how to strengthen” rather than “here is what is wrong” — and organizes by importance to referees.

10.3 How to use it

Tip

Prompt:

“/argument-review paper/main.tex”

Run this after /methods-review has verified the code and replication — argument problems are worth fixing only if the implementation is correct.

11 Skill 3: presentation-builder (Workflow)

11.1 The problem

When you ask Claude to turn a paper into slides, you get a bulleted summary of each section — one slide per section, bullet points under each, generic “Results” and “Conclusion” titles. The deck reads like a document reformatted as slides, not something you’d actually want to sit through.

11.2 What the skill does

The presentation-builder skill enforces design constraints that Claude would otherwise ignore. It builds on Scott Cunningham’s Rhetoric of Decks framework, adapted from Beamer to Quarto RevealJS. The rules:

  • Assertion titles — “Colonial institutions explain 75% of income variation,” not “Results”
  • One idea per slide — cognitive load is distributed evenly, not front-loaded
  • Figures carry the argument — every chart answers one question, with direct labels (no legends requiring eye movement)
  • Minimal bullet points — lists are converted to diagrams, tables, or prose (maximum 3 items if truly needed)

After building the deck, the skill runs a multi-agent review: one agent checks argument quality and narrative flow, another checks visual elements (figure sizing, label visibility, numerical accuracy). It also generates a proof PDF via decktape and reads it page by page to catch silent visual errors.

The full presentation philosophy (the Three Laws of slide design, the rhetoric framework, and the principle that each slide should carry equal cognitive load) is in the presentation-builder/methodology.md file. Read it if you want to understand the design theory or customize the skill.

11.3 How to use it

Tip

Prompt:

“/presentation-builder paper/main.tex”

Claude reads the paper, plans assertion titles, generates figures (R with theme_meridian() to match the slide theme — an exception to the theme_bw() default), builds the Quarto RevealJS deck, compiles, runs the multi-agent review, and delivers a polished .html deck. You can customize for context:

Tip

Conference talk (20 min):

“/presentation-builder paper/main.tex — conference presentation for political scientists in comparative politics, 20 minutes, emphasize identification strategy and key results”

Job talk (45 min):

“/presentation-builder paper/main.tex — job talk, include roadmap slide, more time on data and robustness”

12 The Other Skills

The popescu_claude/.claude/skills/ folder contains eight additional skills not covered in detail above. Each is a single SKILL.md file you can read, modify, and install:

  • search-pdf — targeted search in large PDFs (books, dissertations, reports). Extracts full text using pymupdf, builds a reusable page index, searches by keyword and context, and returns page numbers with surrounding passages. You pick which pages to read in detail — it doesn’t auto-read hundreds of pages. Handles scanned PDFs by flagging when OCR is needed.
  • methods-review — adversarial 5-audit project review. Checks code quality, cross-language replication (optional), directory structure and reproducibility, output automation, and methods. Never modifies author code — it only reads, runs, and creates replication scripts. Produces a formal report with a preliminary assessment (Accept / Minor / Major / Reject).
  • discover-lit — structured literature discovery with proximity scoring, BibTeX output, frontier map, and positioning statement. Uses tiered search (reliable web search vs. best-effort citation chains) with a verification pass that re-searches each citation to catch hallucinations. Supplements, but does not replace, a manual database search.
  • strategize — designs a causal identification strategy before any code is written. Proposes ranked candidate strategies, plans robustness checks and falsification tests, then audits the strategy from an adversarial perspective. Includes a user checkpoint between design and audit.
  • robustness-battery — takes your main regression specification and systematically generates alternative specifications across 7 dimensions (samples, controls, fixed effects, standard errors, estimators, functional forms, placebos), producing organized scripts, a summary table, and a specification curve dataset. Classifies specification stability as robust, sensitive, or fragile.
  • regression-interpret — a guardrail skill that checks design credibility before interpreting magnitudes. Prevents common errors: slipping into causal language for OLS, misreading log specifications, overstating significance, ignoring that a large point estimate with a wide confidence interval is not a strong result. Produces a draft results paragraph with proper hedging.
  • humanizer — removes 11 patterns of AI-generated writing from academic text, based on Wikipedia’s “Signs of AI writing” guide. Targets inflated significance, vague attributions, hedging, promotional language, and formulaic structures.
  • submit-package — builds a journal-ready replication package with AEA-format README, master script, dependency documentation, data provenance, and a 10-point verification checklist.
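One of the guardrails in regression-interpret, the log-specification check, is pure arithmetic. A sketch (the `log_lin_effect` helper is hypothetical, not the skill's code):

```python
import math

def log_lin_effect(beta):
    """In a log-linear model log(y) = a + beta*x, a one-unit change in x
    multiplies y by exp(beta); the percent change is exp(beta) - 1,
    which approximates beta itself only when beta is small."""
    return math.exp(beta) - 1
```

A coefficient of 0.05 is roughly a 5.1% increase, but a coefficient of 0.7 is a 101% increase, not 70% — exactly the misreading the skill is designed to catch.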
Note

Why structured audits matter: the Dias & Fontes example

Cunningham applied the methods-review approach to a published paper — Dias & Fontes (“The Effects of a Large-Scale Mental Health Reform: Evidence from Brazil,” AEJ: Economic Policy, 2024) — and the results illustrate why structured verification catches things human review misses (Federal Reserve Board of Governors talk, 2026).

The audit flagged a suspicious variable: the “rural” covariate was defined as population divided by a trend in calendar year, producing a mechanically declining variable. It found that trimming the sample dropped only 2002 rows for treated units, creating an unbalanced panel. It identified severe covariate imbalance: 12 of 31 covariates exceeded the Imbens-Rubin threshold. And it found duplicate estimation — the same Callaway & Sant’Anna model run three times with identical specifications, undocumented.

Most strikingly, the cross-language replication revealed that six implementations of the same estimator (Callaway & Sant’Anna, 2021) — csdid and csdid2 in Stata, did and ddml in R, differences and diff-diff in Python — on the same data with the same covariates gave ATT estimates ranging from 0.0 to 2.38. An ANOVA decomposition attributed 40% of the variation to specification choice, 16% to package choice, and 44% to their interaction. This is not hallucination or coding error — it reflects real implementation differences in how packages handle propensity scores and near-separation.

A human reviewer would need months and trilingual fluency to uncover this. Claude Code wrote 96 scripts across three languages in a single session. The point is not that the original paper is flawed — it is that a structured audit finds issues that conventional review does not, and the skills in this tutorial make that audit routine rather than heroic.

13 Putting It Together: The Pre-Submission Checklist

Before submitting a paper, run these checks in sequence:

1. methods-review: "/methods-review . R" (20–30 min)
2. argument-review: "/argument-review paper/main.tex" (15–20 min)
3. robustness-battery: "/robustness-battery code/main_analysis.R R" (10–15 min)
4. submit-package: "/submit-package ." (10–15 min)
5. presentation-builder: "/presentation-builder paper/main.tex" (15–25 min)

Step 1 catches bugs: coding errors, reproducibility failures, hardcoded numbers. Step 2 catches paper-level weaknesses: argument gaps, methods-theory misalignment, notation drift, and it simulates a referee report with author defense. Step 3 catches fragile results, meaning specifications where the headline finding doesn’t survive reasonable alternatives. Step 4 builds the replication package with AEA-format README and verification checklist. Step 5 produces the conference deck.

After running the first three, you should have reports in quality_reports/ covering code, paper, and robustness. Address every finding before submitting.

14 The Skills Library Folder

The popescu_claude/ folder that comes with this tutorial is meant to be shared. It has everything you need to install the skills library and project configuration into your own Claude Code project. The README.md at the folder root explains what each file does and how to install.

To install:

  1. Copy the .claude/ folder into your project root (it contains skills and CLAUDE.md)
  2. Edit .claude/CLAUDE.md to match your profile
  3. Use Claude Code as usual — most skills require explicit invocation with /skill-name. Two skills (/read-paper and /search-pdf) trigger automatically when relevant. To prevent any skill from auto-triggering, add disable-model-invocation: true to its YAML frontmatter.

15 Further Reading

The skills in this tutorial draw on two open-source projects and Anthropic’s documentation:

  • MixtapeTools by Scott Cunningham — the original methods review protocol, read-paper skill, presentation philosophy, and project scaffolding tools
  • Clo-Author by Hugo Sant’Anna — a full research pipeline with 10 slash commands, worker-critic agent pairs, and quality gates. Documentation at hsantanna.org/clo-author
  • Anthropic skills documentation — the technical reference for building and installing Claude Code skills
  • Panjwani’s AI MBA — practical training on skills, MCP, and Claude Code workflows