1  Agentic AI for Social Science Research

What it is, why it matters, and how to get started

2 What Is Agentic AI?

Most researchers interact with AI through a chat interface — you type a question, get a response, and the conversation stays inside that browser tab. That’s fine for brainstorming or drafting text, but it can’t touch your files, run your code, or remember your project across sessions.

An agent is different. An agent is an AI that can act: it reads files on your computer, writes code, executes it, inspects the output, fixes errors, and iterates, all within your actual project directory. Research is not a conversation. It is a workflow: data cleaning, estimation, verification, writing, revision, submission. An agent can participate in that workflow in ways a chatbot cannot.

“An AI that does things is fundamentally more useful than an AI that says things.”

— Ethan Mollick, “A Guide to Which AI to Use in the Agentic Era”

A useful framework here is the distinction between models, apps, and harnesses:

| Layer | What It Is | Example |
|---|---|---|
| Model | The intelligence | Claude Opus 4.6, GPT-4o |
| App | The interface | claude.ai, ChatGPT |
| Harness | The system enabling autonomy | Claude Code, Codex |

The same model powering a chat interface becomes far more capable when placed inside a harness that gives it tools: file access, code execution, version control, and the ability to plan multi-step workflows. This tutorial series teaches you how to use that harness for empirical social science research.

3 Why This Matters for Research

Research is not product development. The difference matters for how you use AI tools:

| Aspect | Product Development | Research |
|---|---|---|
| Goal | Ship working code | Understand correctly |
| Error cost | Bug in production | Wrong conclusion in paper |
| Iteration | Fast, fix later | Slow, careful, get it right |
| Testing | Unit tests, CI/CD | Cross-language replication |
| Success | Does it run? | Does it mean what we think? |

Most AI coding tools were built for product development: they optimize for speed and working code. Research needs different defaults. Correctness over speed. Verification over confidence. Design before results. The workflow in this tutorial series is built around those priorities.

“Claude Code is not vibe coding. It is agentic coding.”

— Scott Cunningham, MixtapeTools

“Vibe coding” means accepting what the AI writes without scrutiny. “Agentic coding” means treating the AI as a thinking partner that you interrogate, verify, and direct. You use protocols to catch errors that informal review would miss.

But if AI requires verification, who does the verifying? That tension is worth sitting with before touching any tool.

4 The Expertise-Verification Paradox

Ethan Mollick identifies a core problem with AI-assisted work:

“AI proves most useful where expertise already exists to spot errors.”

— Mollick, “15 Times to Use AI, and 5 Not To”

You need domain expertise to verify AI output, but using AI may prevent you from developing that expertise in the first place: every task you delegate is one you never practice. This is especially dangerous in research, where the cost of an error is not a software bug but a published conclusion.

Mollick flags three cases where AI use is actively harmful:

  1. Learning and synthesis — AI bypasses the cognitive struggle that produces understanding
  2. High-accuracy tasks without verification — hallucinations are confident and specific
  3. Tasks where struggle enables breakthrough — the difficulty is the point

So what do you do? You use AI within structured protocols: cross-language replication, adversarial review in fresh terminals, formal audit trails, explicit quality gates. These become more important as AI improves, not less.

This paradox also has an ethical dimension. Before submitting any AI-assisted research, ask yourself three questions:

  1. Would I disclose this AI use to a referee?
  2. Could a colleague reproduce my AI-assisted steps?
  3. Did I verify every AI-generated claim?

If any answer is “no,” fix it first. The audit trail (git history, CLAUDE.md, MEMORY.md, quality reports) exists precisely so you can answer “yes” to all three. The three-question test is really a practical check on the expertise-verification paradox. It forces you to confirm that you actually understood and verified the AI’s contribution rather than trusting it passively.

The tutorials that follow teach protocols that make these answers defensible: CLAUDE.md for configuration and institutional memory, git for audit trails and safe experimentation, and a skills library that bakes verification steps into the workflow.

5 Two Modes of Working with AI

So how closely should you integrate AI into your work? Mollick draws a useful distinction between two modes.

Centaur mode is a clear division of labor. You decide the argument, the identification strategy, the interpretation. The AI handles formatting, compilation, figure generation, and mechanical tasks. Lower risk, and easier to verify.

Cyborg mode is deep integration, a fluid back-and-forth where the AI brainstorms, drafts, and iterates alongside you. Higher risk, harder to separate contributions, harder to verify.

Most of this tutorial series teaches Centaur mode. The protocols (CLAUDE.md, skills, adversarial review, cross-language replication) are about keeping human judgment and AI execution separate. As you get comfortable with the tools and verification protocols, you can start moving toward Cyborg mode for specific tasks like exploratory data analysis or brainstorming research questions.

6 The Research Workflow

Now for the actual process. The workflow combines Cunningham’s five-step research protocol with Blattman’s iterative PPRR loop into a single framework:

Dialogue → Code → Replicate → Verify → Document

Each step follows the same internal rhythm: Prompt → Plan → Review → Revise. You don’t fire off a single request and accept the result. You iterate within each step until the output meets your standards.

  1. Dialogue — start with an observation or question, not a code request. Discuss the substance before asking for implementation. Brain-dump your idea, have Claude structure it into an actionable plan, then stress-test the plan before committing to code.
  2. Code — implement in your primary language (R, Stata, Python). Use structured prompts: context, task, format, constraints. Review Claude’s plan before it writes, then review the code before running it.
  3. Replicate — implement key analyses in a second language. Compare to 6 decimal places for linear models; set tolerance upfront for nonlinear. This is the strongest guard against silent errors.
  4. Verify — run adversarial review in a fresh terminal (so Claude can’t see its own earlier work). Five audits, formal report, iterate until “Accept.” This is the Referee 2 protocol.
  5. Document — commit with meaningful messages, update CLAUDE.md and MEMORY.md, maintain audit trails. If it’s not documented, it didn’t happen.
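The tolerance check in the Replicate step can be sketched in a few lines. In practice the two coefficient vectors come from separate runs in separate languages (say, an R script and a Python script); here both are computed in Python purely to illustrate the comparison logic, with made-up data:

```python
# Illustrative sketch of the cross-language replication tolerance check.
# Both "implementations" live in one script here for demonstration only;
# the real protocol compares output from two independent languages.
import numpy as np

rng = np.random.default_rng(0)
X = np.column_stack([np.ones(200), rng.normal(size=(200, 2))])
y = X @ np.array([1.0, 0.5, -2.0]) + rng.normal(size=200)

# "Implementation A": least-squares solver
beta_a, *_ = np.linalg.lstsq(X, y, rcond=None)
# "Implementation B": normal equations
beta_b = np.linalg.solve(X.T @ X, X.T @ y)

# Linear models should agree to 6 decimal places; flag any term that doesn't.
tol = 1e-6
diffs = np.abs(beta_a - beta_b)
for name, d in zip(["const", "x1", "x2"], diffs):
    status = "OK" if d < tol else "MISMATCH"
    print(f"{name}: |diff| = {d:.2e} -> {status}")
```

For nonlinear models, replace `tol` with whatever tolerance you committed to before running either implementation, so the threshold cannot drift to accommodate a discrepancy.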

AI works best as iteration, not a single query.

7 The Tutorial Series

This tutorial series walks through a complete research workflow using Claude Code, from initial setup through journal submission. Each tutorial is a standalone HTML document you can read and follow independently, but they build on each other:

| Tutorial | Topic | What You’ll Learn |
|---|---|---|
| 1 (this one) | Introduction | What agentic AI is, why it matters, installation |
| 2 | CLAUDE.md | Configuring Claude Code for research (the amnesia solution) |
| 3 | Git | Version control for AI-assisted research workflows |
| 4 | Skills | A skills library for the full publication pipeline |
| 5 | AI Work Hygiene | Using agentic tools without losing your mind |

The tutorials use an accompanying skills_library/ folder containing a working skills library, project configuration (CLAUDE.md), and institutional memory (MEMORY.md). This folder is designed to be shared. You can copy the skills into your own project and customize the configuration for your profile.
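The tutorials don’t fix an exact layout for this folder; a hypothetical sketch, based only on the files named above, might look like:

```
skills_library/
├── CLAUDE.md    # project configuration (Tutorial 2)
├── MEMORY.md    # institutional memory
└── skills/      # individual skill files (Tutorial 4)
```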

8 Sources

This tutorial series draws on five researchers and Anthropic’s documentation. Cunningham provides the research-focused Claude Code workflow and cross-language replication philosophy; Pedro Sant’Anna contributes the “project constitution” approach to CLAUDE.md with quality gates and [LEARN] tags; Hugo Sant’Anna built Clo-Author, a full research pipeline with slash commands and worker-critic agent pairs; Blattman offers the most beginner-friendly installation and prompt-engineering guide for social scientists; and Panjwani provides practical training on skills, MCPs, and context window management. Full references and links appear in the Further Reading section at the end.

9 Installing Claude Code

Enough concepts. Let’s get Claude Code running on your machine.

9.1 Prerequisites

You need a computer (macOS, Linux, or Windows), an internet connection, and a paid Claude subscription or API key. The free plan does not include Claude Code. For current pricing, see claude.ai/pricing.

9.1.1 Required

| Tool | Purpose | Install |
|---|---|---|
| Claude Code | The agentic coding harness | See Step 1 below |
| Claude subscription | Pays for Claude Code usage | Claude Pro, Max, Teams, or Enterprise, or an API key from the Claude Console |
| Git | Version control and audit trail | brew install git (macOS) or apt install git (Linux) — see Tutorial 3 |

9.1.2 Optional

| Tool | Purpose | Install |
|---|---|---|
| Python | Cross-language replication | python.org or brew install python |
| LaTeX | Paper compilation | TeX Live or MacTeX |
Note

You don’t need everything on day one. Claude Code (required) is all you need for Tutorials 1 and 2. Git (required) is introduced in Tutorial 3. R, Quarto, and the optional tools become relevant in Tutorial 4 when you start using skills. Install them as you go.

9.2 Step 1: Install Claude Code

Open a terminal and run the installer for your operating system:

macOS or Linux:

curl -fsSL https://claude.ai/install.sh | bash

Windows (PowerShell):

irm https://claude.ai/install.ps1 | iex

The native installer requires no dependencies and auto-updates in the background.

Note

Alternative: npm installation. If you prefer to install via npm (e.g., for version pinning), you can run npm install -g @anthropic-ai/claude-code. This requires Node.js to be installed first. The npm method still works but is officially deprecated by Anthropic in favor of the native installer.

9.3 Step 2: First Run

Navigate to a project directory and start Claude Code:

cd ~/Dropbox/my-research-project
claude

Your browser will open for Anthropic authentication. Approve access, return to the terminal, and Claude Code is running. Try asking it something about your files:

What files are in my current folder?

Claude can now see your actual project. Type /exit to end the session.

Important

Always navigate to your project folder before running claude. Claude Code sees whatever directory you’re in when you launch it. If you start it from your home directory, it sees everything. If you start it from your project folder, it sees your project and reads your CLAUDE.md automatically.

9.4 Choosing a Model

Claude Code defaults to Sonnet on Pro and Team Standard plans and Opus on Max and Team Premium plans. For research work (analysis, writing, review), Opus is the better choice:

/model

Select Claude Opus from the list. Opus is slower and more expensive but noticeably better at reasoning, nuanced interpretation, and longer output. Use Sonnet for mechanical tasks (reformatting, file operations, simple code edits) and Opus for anything requiring judgment.

9.5 Useful Commands

| Action | How |
|---|---|
| Send a message | Type and press Enter |
| Multi-line message | Press \ then Enter (or Shift+Enter) |
| Queue messages while Claude works | Just type — they’ll be sent when it finishes |
| Scroll history | Up/Down arrow keys |
| Switch model | /model |
| Get help | /help |
| Exit | /exit or Ctrl + C |
| Compact context | /compact |
Note

Managing long sessions. Claude Code automatically compresses earlier conversation as the context window fills. For very long research sessions, you can manually run /compact to summarize the conversation so far and free up space. If a session becomes sluggish or Claude starts forgetting earlier instructions, this usually helps.

Warning

Compaction is lossy. When Claude compacts a conversation, whether automatically or via /compact, it summarizes the history but drops specific details: which approaches failed, constraints you stated three prompts ago, a reviewer’s formatting preferences you mentioned in passing. This is one more reason to keep sessions short, commit often, and put important constraints in CLAUDE.md or MEMORY.md rather than relying on conversation history. Those files are re-read at the start of every session and are immune to compaction.
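What belongs in those files is covered in Tutorial 2; as a flavor of the idea, a compaction-proof constraint file might contain entries like the following (all project details here are invented for illustration):

```markdown
# CLAUDE.md — project constitution (illustrative sketch, hypothetical project)

## Constraints (always apply)
- Primary language: R; replicate key estimates in Python to 1e-6.
- Never modify files in data/raw/; write all outputs to output/.

## Failed approaches (do not retry)
- [LEARN] County-level clustering produced singular fits; cluster at state level.
```

Because Claude Code re-reads this file at the start of every session, a constraint recorded here survives both compaction and the end of the session.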

ChatGPT vs. Claude: which should I use?

You don’t have to choose one exclusively. Each has strengths: Claude Code is the best current agentic harness for working inside your project directory; Claude is strong on long documents and nuanced reasoning; ChatGPT’s Deep Research mode is good for literature surveys; and ChatGPT handles image generation, which Claude does not. If you had to pick one tool for research, Claude with Claude Code is the better option right now, but there’s no reason not to use both. Note that OpenAI’s Codex uses AGENTS.md the way Claude Code uses CLAUDE.md — the concept is converging across platforms.

10 Further Reading

If you want to go deeper into the sources behind this tutorial series:

Scott Cunningham (Baylor University) — MixtapeTools. The original research-focused Claude Code workflow: estimation philosophy, the split-pdf reading protocol, cross-language replication as default verification, the Referee 2 adversarial audit, and a presentation philosophy built on the “Three Laws.” His workflow.md is the fullest statement of the “agentic coding” philosophy for empirical economics.

Pedro Sant’Anna (Emory University) — claude-code-my-workflow. The “project constitution” approach to CLAUDE.md: core principles, quality gates (80/90/95 numeric scoring), [LEARN] tags for institutional memory, and the ~150 line limit.

Hugo Sant’Anna (University of Alabama at Birmingham) — Clo-Author. A full research pipeline with 10 slash commands (/discover, /strategize, /analyze, /write, /review, /submit), worker-critic agent pairs, and an orchestrator loop with dependency graphs. Documentation at hsantanna.org/clo-author.

Chris Blattman (University of Chicago) — claudeblattman.com. The friendliest guide to Claude Code for social scientists: installation walkthroughs, prompt engineering advice, the PPRR loop, voice files for teaching AI your writing style, and the “embarrassment heuristic” for deciding when AI use is appropriate.

Aniket Panjwani — AI MBA. Practical training on skills, MCPs, and Claude Code workflows. Covers the skill-building process, skills-vs-MCPs comparison, git as the foundation of AI-assisted work, and context window management.

Anthropic — Claude Code documentation. The technical reference for skills, CLAUDE.md, and Claude Code configuration.