3 Git for Research

Version control for AI-assisted academic workflows

4 Why Git?

Claude can rewrite your entire analysis script in seconds. That speed is useful — until you want to go back. Without version control, “going back” means trying to remember what your code looked like before Claude changed it. Git solves this: it tracks every change to every file, lets you undo anything, and backs up your work to GitHub.

“The #1 mistake from experienced Claude Code users: not using git.”

— Aniket Panjwani, “AI Agents for Economics Research”

This tutorial covers only what you need for a research workflow with Claude Code. It’s not a comprehensive git course. For deeper coverage, the Pro Git book is free and good.

5 Installing Git

Check if git is already installed:

git --version

If you see a version number, skip to the next section. If not, ask Claude to handle it:

Tip

Prompt:

“Check if git is installed on my machine. If not, install it. Then configure it with my name [Your Name] and email [your.email@example.com]. Verify everything works by running git --version.”

Replace the bracketed placeholders with your own name and email. These are metadata attached to your commits — they don’t create an account anywhere.

Claude will detect your operating system, run the right install command (apt, brew, or the Windows installer), and set your identity with git config --global. Claude Code may ask for your approval before modifying global git config — just approve it when prompted.
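The identity step that Claude performs can be sketched as follows. The name and email are the placeholders from the prompt above, so substitute your own before running it.

```shell
# Set the name and email that git attaches to every commit you make.
# "Your Name" and the example.com address are placeholders.
git config --global user.name "Your Name"
git config --global user.email "your.email@example.com"

# Read the values back to confirm they were stored.
git config --global --get user.name
git config --global --get user.email
```

These settings live in your user-level git configuration, so you only need to do this once per machine.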

6 Your First Project and First Commit

Let’s get your hands on git before explaining how it works. Navigate to a project directory and ask Claude to set everything up:

Tip

Prompt:

“Initialize a git repository in this directory. Create a .gitignore appropriate for a social science research project using R and Python. Set up the project folder structure from my CLAUDE.md. Make a first commit with a descriptive message.”

That’s it. Claude runs git init (creates the repository), writes a .gitignore file (tells git which files to ignore — more on this later), creates the folder structure, and takes your first snapshot with git add and git commit.

You now have a project with one commit. Try asking:

What does my git history look like?

Claude will run git log and show you one commit with the message it wrote. That commit is a snapshot — a permanent record of every file in your project at this moment. You can always come back to it.
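For reference, the commands behind that setup look roughly like the sketch below. It runs in a throwaway directory, and the folder names, .gitignore entries, and commit message are illustrative assumptions rather than what Claude will necessarily choose.

```shell
# Work in a throwaway directory so nothing real is touched.
cd "$(mktemp -d)"

git init -q                                  # create the repository
git config user.name "Your Name"             # placeholder identity, this repo only
git config user.email "your.email@example.com"

# A tiny .gitignore and a hypothetical folder layout.
printf '.Rhistory\n__pycache__/\n.env\n' > .gitignore
mkdir -p code data/raw output                # git records folders only once they hold files
echo "# My research project" > README.md

git add .                                    # stage everything
git commit -q -m "Initialize project structure and .gitignore"
git branch -M main                           # make sure the branch is called main

git log --oneline                            # one line per commit
```

The last command should list a single commit with the message given above.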

7 What Just Happened: The Mental Model

Now that you’ve made a commit, here’s what’s going on under the hood.

Git tracks your project as a series of snapshots. Every time you commit, git takes a picture of every file at that moment. You can go back to any snapshot at any time.

Your files move through three areas:

Git’s three areas: working directory, staging area, and repository

Source: Pro Git, Ch. 1.3 (CC BY-NC-SA 3.0)

  • Working directory — your normal project folder. This is where you edit files.
  • Staging area — a holding pen. You choose which changes to include in the next snapshot. (This is what git add does — it moves changes here.)
  • Repository — the permanent history. Every commit is a snapshot stored here. (This is what git commit does — it saves the staging area as a snapshot.)

Five commands cover most of what you need:

Command | What It Does
git add | Moves changes from working directory → staging area
git commit | Saves staging area → repository (permanent snapshot)
git push | Copies repository → GitHub (remote backup)
git switch | Switches between branches (covered later)
git restore | Restores files from repository → working directory

When you ran Claude’s setup prompt a moment ago, Claude executed git add . (stage everything) and then git commit -m "..." (save the snapshot). That’s the core cycle.
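One way to watch a file move through the three areas is to check git status between steps. This is a minimal sketch in a throwaway directory; the file name analysis.py is a made-up example.

```shell
cd "$(mktemp -d)"
git init -q
git config user.name "Your Name"             # placeholder identity, this repo only
git config user.email "your.email@example.com"

# Working directory: the file exists, but git is not tracking it yet.
echo 'print("hello")' > analysis.py
git status --short      # ?? analysis.py

# Staging area: the file is queued for the next snapshot.
git add analysis.py
git status --short      # A  analysis.py

# Repository: the snapshot is saved.
git commit -q -m "Add analysis script"
git status --short      # prints nothing: the working directory matches the last commit
```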

Note

A note on git checkout: Older tutorials and Stack Overflow answers use git checkout for both switching branches and restoring files. Git 2.23 split this into two clearer commands: git switch (branches) and git restore (files). Both checkout and the newer commands work, and Claude understands either, but switch/restore are less confusing when you’re learning.

8 Good Commit Messages

Every commit has a message describing what changed and why. Over time, these messages become a log of how your project developed.

Why this matters for Claude Code: Claude reads your commit history. Descriptive messages give it better context about what you’ve done, on top of what it picks up from CLAUDE.md and README.md.

Bad Message | Good Message
update script | Add county-level fixed effects to main specification
fixed stuff | Fix merge: drop duplicate observations from DHS wave 3
changes | Restrict sample to Sub-Saharan Africa (N=34 countries)

“Add county-level fixed effects” tells Claude what your current specification looks like. “fixed stuff” tells it nothing.

The goal is that each commit represents one logical change. “Add main regression” is one commit. “Also fix a typo in the README” is a separate commit.

9 The Daily Cycle: Add, Commit, Push

9.1 Committing your work

When you want to save your progress:

Tip

Prompt:

“Commit the changes I just made with a descriptive message explaining what changed and why.”

Claude runs git add to stage your changed files, then git commit -m "..." with a message it writes based on the diff.

If you want to be specific about what to commit:

Tip

Prompt:

“Stage only the R analysis script and the output table, and commit with a message about adding the baseline DiD specification. Don’t include changes to the README.”
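Selective staging might look like the sketch below. The file names (code/analysis.R, output/table1.tex) and their contents are hypothetical stand-ins for your own script and table.

```shell
cd "$(mktemp -d)"
git init -q
git config user.name "Your Name"             # placeholder identity, this repo only
git config user.email "your.email@example.com"

# Starting point: three tracked files.
mkdir -p code output
echo "# baseline" > code/analysis.R
echo "placeholder" > output/table1.tex
echo "# Project" > README.md
git add .
git commit -q -m "Add project skeleton"

# Edit all three files.
echo "did_model <- TRUE" >> code/analysis.R
echo "coef 0.12" >> output/table1.tex
echo "TODO" >> README.md

# Stage and commit only the script and the table.
git add code/analysis.R output/table1.tex
git commit -q -m "Add baseline DiD specification and results table"

git status --short    # README.md is still listed as modified, not committed
```

Naming files explicitly in git add is what keeps the README edit out of the commit; git add . would have staged all three.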

9.2 Auto-commits via CLAUDE.md

You can also have Claude commit automatically as it works. Add this rule to your CLAUDE.md:

## Core Principles
- Commit frequently with meaningful messages

Claude reads this and will commit after each logical change, so you build up a detailed commit log without typing git commit yourself.

Note

Reviewing auto-commits. Claude’s auto-generated commit messages are usually good but not always perfect. Periodically run git log --oneline (or ask Claude to show your recent history) to spot any vague or misleading messages. If you catch a bad one immediately, and before you have pushed it, git commit --amend -m "Better message" rewrites the most recent commit message. For older commits, rewriting history is usually not worth it; just write a clearer message on the next commit.
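A small sketch of fixing a vague message right after the commit, run in a throwaway directory. The file and both messages are examples only.

```shell
cd "$(mktemp -d)"
git init -q
git config user.name "Your Name"             # placeholder identity, this repo only
git config user.email "your.email@example.com"

echo "x <- 1" > analysis.R
git add analysis.R
git commit -q -m "fixed stuff"               # the vague message
git log --oneline

# Replace the message of the most recent commit.
git commit -q --amend -m "Add county-level fixed effects to main specification"
git log --oneline                            # still one commit, now with the clearer message
```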

10 Pushing to GitHub

So far, everything lives on your machine. GitHub gives you:

  • A remote backup that survives hardware failure
  • A URL you can share with coauthors
  • A web interface to browse your history
  • Integration with Claude Code (it can read GitHub repos)

10.1 Setting up GitHub

You need the GitHub CLI (gh) installed and authenticated. Ask Claude:

Tip

Prompt:

“Check if the GitHub CLI (gh) is installed. If not, install it. Then authenticate me with GitHub through the browser. After that, create a private repository on GitHub called ‘my-research-project’ and push all my commits to it.”

Claude will install gh if needed, run gh auth login for browser authentication, gh repo create to make the remote repo, and git push -u origin main to upload your commits. Use private repos for research projects — you can make them public later when you’re ready.
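The push itself can be tried without a GitHub account. In this sketch a local bare repository stands in for GitHub, which is an assumption made purely so the example runs offline; with a real remote, origin would point at your GitHub URL instead.

```shell
work="$(mktemp -d)"
git init -q --bare "$work/remote.git"        # stand-in for the GitHub repository

git init -q "$work/project"
cd "$work/project"
git config user.name "Your Name"             # placeholder identity, this repo only
git config user.email "your.email@example.com"

echo "# My research project" > README.md
git add README.md
git commit -q -m "Add README"
git branch -M main                           # make sure the branch is called main

git remote add origin "$work/remote.git"     # tell git where the remote lives
git push -q -u origin main                   # first push: -u links local main to origin/main

# Later commits need only a plain git push.
echo "Notes" >> README.md
git commit -q -am "Add notes section to README"
git push -q
```

Because of the -u on the first push, every later git push knows where to send commits without further arguments.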

10.2 Subsequent pushes

After any commit:

Tip

Prompt:

“Push all my commits to GitHub.”

Or include pushing in your end-of-session routine: Claude will do this automatically as part of the closing workflow described in Tutorial 2.

11 Branching: Safe Experiments

Now for the concept that makes git indispensable for AI-assisted research.

11.1 What are branches?

When you create a git repository, you start with one branch called main. Think of it as your clean, working version — the code that produces your current results correctly.

A branch is a parallel copy where you can try things without touching main. Make changes, break things, experiment. If it works, you fold it back into main. If it fails, you delete the branch and main is untouched.

  • main (dark dots): your working version. Always safe.
  • experiment (red dots): a parallel copy where you try things.
  • If the experiment works → merge back into main.
  • If it fails → delete the branch. Main is untouched.

11.2 Why this matters for research

Suppose you have a working regression specification. You ask Claude to try an alternative — different fixed effects, different controls, different sample restrictions. Claude rewrites your analysis script. Twenty minutes later, you realize the new specification is wrong.

Without git: you try to remember what the original script looked like. You undo changes manually, hoping you don’t miss anything. You might have a backup copy somewhere, but it’s from last week.

With git: you created a branch before the experiment. You switch back to main and delete the branch. Three seconds. Nothing lost.

Rule of thumb: If there’s a chance you’ll want to undo it, branch first.

Branch | Don’t Branch
Alternative specifications | Fixing a typo
Different sample restrictions | Adding a comment
New identification strategy | Minor formatting
Restructuring project directories |
Trying a new package/method |

11.3 Creating and using branches

Tip

Prompt:

“Create a new git branch called ‘experiment-alt-controls’ and switch to it. I want to try an alternative control set without affecting the main branch.”

Claude runs git switch -c experiment-alt-controls, which creates the branch and switches to it in one step. Everything you do from here happens on the experiment branch — main stays untouched.

Warning

Always commit or stash before switching branches. If you have uncommitted changes and try to switch branches, git will either refuse (if the changes conflict) or carry the uncommitted changes to the new branch (which is usually not what you want). Make it a habit to commit your work before switching.
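If you are not ready to commit, stashing is the alternative. The sketch below shelves a half-finished edit, visits another branch, and then brings the edit back; the file name and branch name are examples.

```shell
cd "$(mktemp -d)"
git init -q
git config user.name "Your Name"             # placeholder identity, this repo only
git config user.email "your.email@example.com"

echo "baseline" > notes.txt
git add notes.txt
git commit -q -m "Add notes"
git branch -M main                           # make sure the branch is called main

echo "half-finished idea" >> notes.txt       # an uncommitted change
git stash                                    # shelve it; the working directory is clean again

git switch -q -c experiment-alt-controls     # safe to switch now
# ... work and commit on the experiment branch ...
git switch -q main

git stash pop                                # bring the shelved change back
cat notes.txt
```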

If the experiment works — merge:

Tip

Prompt:

“The experiment worked. Switch back to the main branch and merge in the experiment-alt-controls branch. Then delete the experiment branch.”

Claude runs git switch main, then git merge experiment-alt-controls, then git branch -d experiment-alt-controls to clean up.

If the experiment fails — discard:

Tip

Prompt:

“The experiment didn’t work. Switch back to main and delete the experiment-alt-controls branch. I want to discard all the changes on that branch.”

Claude runs git switch main and then git branch -D experiment-alt-controls (the capital -D forces deletion of an unmerged branch).
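Both outcomes can be rehearsed end to end in a throwaway directory. In this sketch spec.txt is a hypothetical stand-in for your analysis script, and experiment-bad-idea is a made-up branch name.

```shell
cd "$(mktemp -d)"
git init -q
git config user.name "Your Name"             # placeholder identity, this repo only
git config user.email "your.email@example.com"

echo "y ~ treatment + controls1" > spec.txt
git add spec.txt
git commit -q -m "Add baseline specification"
git branch -M main                           # make sure the branch is called main

# An experiment that works: branch, commit, merge, clean up.
git switch -q -c experiment-alt-controls
echo "y ~ treatment + controls2" > spec.txt
git commit -q -am "Try alternative control set"
git switch -q main
git merge -q experiment-alt-controls         # main now contains the new specification
git branch -d experiment-alt-controls        # -d works because the branch is merged

# An experiment that fails: branch, commit, discard.
git switch -q -c experiment-bad-idea
echo "y ~ treatment" > spec.txt
git commit -q -am "Drop all controls"
git switch -q main                           # spec.txt returns to main's version
git branch -D experiment-bad-idea            # -D is required because it was never merged

cat spec.txt                                 # y ~ treatment + controls2
git branch --list                            # only main remains
```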

12 The Daily Routine

This section adds git-specific steps to the session routines from Tutorial 2.

Start of session:

cd ~/path/to/your/project
claude

Start with the session startup prompt from Tutorial 2 (“Read all the markdowns…”). Claude reads your CLAUDE.md, README.md, MEMORY.md, and git history — it picks up where you left off.

Note

If you work from multiple machines (e.g., office desktop and home laptop), start each session with git pull to sync any commits you pushed from elsewhere. Without this, your local repo and GitHub can diverge, creating unnecessary merge conflicts. You can add “always pull before starting work” to your CLAUDE.md so Claude handles this automatically.
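The two-machine situation can be simulated on one computer. In this sketch a local bare repository stands in for GitHub, and the folders office and laptop stand in for the two machines; all three are assumptions made so the example runs offline.

```shell
work="$(mktemp -d)"
git init -q --bare "$work/remote.git"        # stand-in for GitHub

# "Office" machine: make the first commit and push it.
git init -q "$work/office"
cd "$work/office"
git config user.name "Your Name"             # placeholder identity, this repo only
git config user.email "your.email@example.com"
echo "baseline" > notes.txt
git add notes.txt
git commit -q -m "Add notes"
git branch -M main                           # make sure the branch is called main
git remote add origin "$work/remote.git"
git push -q -u origin main

# "Laptop" machine: get a copy of the project.
git clone -q -b main "$work/remote.git" "$work/laptop"

# Back at the office: more work, pushed at the end of the session.
echo "robustness checks" >> notes.txt
git commit -q -am "Add robustness notes"
git push -q

# On the laptop the next day: pull before starting work.
cd "$work/laptop"
git pull -q
git log --oneline                            # both commits are now here
```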

Before risky experiments — branch:

Tip

Prompt:

“Before we try this alternative specification, create a branch called ‘experiment-bandwidth-50km’ so we can go back if it doesn’t work.”

End of session — commit, document, push:

Tip

Prompt:

“We’re done for today. Update README.md with the decisions we made. Commit everything with a summary of what we accomplished. Push to GitHub.”

Tomorrow you start fresh with a full history of what you did today.

13 When Things Go Wrong

You can handle all of these by asking Claude in plain English. The explanations are here so you understand what’s happening.

13.1 “I committed something I shouldn’t have”

If you accidentally committed a file with sensitive data, you need to remove it from git tracking (while keeping the local file) and add it to .gitignore so it doesn’t happen again.

Tip

Prompt:

“I accidentally committed data/raw/sensitive_file.csv. Remove it from git tracking but keep the local file. Add it to .gitignore so it doesn’t get tracked again.”

Removing a file from tracking only affects future commits: the file is still stored in the earlier commits, and if you already pushed, it is in the history on GitHub too. Ask Claude to help you scrub the file from history using git filter-repo, or, if the repo is small and you don’t mind the nuclear option, delete and recreate the repository. Prevention (a good .gitignore) is much easier than cleanup.
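The untracking step can be sketched as follows, in a throwaway directory with a dummy CSV standing in for the sensitive file.

```shell
cd "$(mktemp -d)"
git init -q
git config user.name "Your Name"             # placeholder identity, this repo only
git config user.email "your.email@example.com"

# The mistake: a sensitive file gets committed.
mkdir -p data/raw
echo "id,income" > data/raw/sensitive_file.csv
git add .
git commit -q -m "Add raw data"

# The fix: stop tracking the file but leave it on disk, then ignore it.
git rm -q --cached data/raw/sensitive_file.csv
echo "data/raw/sensitive_file.csv" >> .gitignore
git add .gitignore
git commit -q -m "Stop tracking sensitive_file.csv and ignore it"

git ls-files          # only .gitignore is tracked now
ls data/raw           # the local file is still there
```

The first commit in this example still contains the CSV, which is why a file that was already pushed needs the history cleanup described above.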

13.2 “I want to undo my last commit”

Under the hood: git reset --soft HEAD~1 removes the last commit but keeps your files unchanged. The snapshot goes away; the edits stay.

Tip

Prompt:

“Undo my last commit but keep all the file changes in my working directory. I want to re-do it differently.”
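A minimal sketch of the soft reset in a throwaway directory, with a made-up file and messages:

```shell
cd "$(mktemp -d)"
git init -q
git config user.name "Your Name"             # placeholder identity, this repo only
git config user.email "your.email@example.com"

echo "x <- 1" > analysis.R
git add analysis.R
git commit -q -m "Add analysis script"

echo "x <- 2" > analysis.R
git commit -q -am "changes"                  # the commit we want to redo

git reset --soft HEAD~1                      # drop the commit, keep the edit

git log --oneline                            # only the first commit remains
git status --short                           # M  analysis.R (the edit is still staged)
cat analysis.R                               # x <- 2
```

From here you can adjust the files and commit again with a better message.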

13.3 “I want to see what changed”

Under the hood: git status shows uncommitted changes; git log --oneline shows the commit history.

Tip

Prompt:

“Show me what files have changed since my last commit. Then show me the last 10 commits as a summary.”
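The commands behind that prompt might look like the sketch below. git diff is not mentioned above; it is included here as one common way to see line-by-line changes. File names are examples.

```shell
cd "$(mktemp -d)"
git init -q
git config user.name "Your Name"             # placeholder identity, this repo only
git config user.email "your.email@example.com"

echo "x <- 1" > analysis.R
git add analysis.R
git commit -q -m "Add analysis script"

echo "x <- 2" > analysis.R                   # edit a tracked file
echo "scratch" > notes.txt                   # create a new, untracked file

git status --short        # lists analysis.R as modified and notes.txt as untracked
git diff                  # line-by-line changes to tracked files
git log --oneline -n 10   # the last 10 commits, one line each
```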

13.4 “I tried to merge and got a conflict”

Most merges just work — git combines the changes automatically. A merge conflict happens when both branches edited the same lines in the same file, and git can’t decide which version to keep. This is normal, not an emergency.

When it happens, git marks the conflicting section inside the file with special markers:

<<<<<<< HEAD
results <- feols(y ~ treatment + controls1, data = df)
=======
results <- feols(y ~ treatment + controls2, data = df)
>>>>>>> experiment-alt-controls

The section between <<<<<<< HEAD and ======= is the version on your current branch (HEAD is git’s name for whatever you have checked out, here main). The section between ======= and >>>>>>> experiment-alt-controls is the incoming version. You (or Claude) need to decide which to keep, or combine them, and then remove the markers.

Tip

Prompt:

“I tried to merge but got a conflict. Show me the conflicting files, explain what each version does, and resolve the conflict by keeping the version from the experiment branch. Then commit the merge.”
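To see a conflict and its resolution without risking real work, the sketch below creates one on purpose in a throwaway directory and then keeps the experiment branch's version. Using git restore --theirs is one possible way to do this, not necessarily the approach Claude will take.

```shell
cd "$(mktemp -d)"
git init -q
git config user.name "Your Name"             # placeholder identity, this repo only
git config user.email "your.email@example.com"

echo 'results <- feols(y ~ treatment, data = df)' > analysis.R
git add analysis.R
git commit -q -m "Add baseline regression"
git branch -M main                           # make sure the branch is called main

# Both branches change the same line.
git switch -q -c experiment-alt-controls
echo 'results <- feols(y ~ treatment + controls2, data = df)' > analysis.R
git commit -q -am "Use alternative control set"
git switch -q main
echo 'results <- feols(y ~ treatment + controls1, data = df)' > analysis.R
git commit -q -am "Use original control set"

git merge experiment-alt-controls || true    # the merge stops with a conflict
cat analysis.R                               # the file now contains conflict markers

# Resolve by taking the incoming (experiment) version, then finish the merge.
git restore --theirs analysis.R
git add analysis.R
git commit -q --no-edit                      # keep git's default merge message

cat analysis.R
```

To keep the current branch's version instead, git restore --ours analysis.R takes the other side.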

Note

Preventing conflicts. Conflicts are most common when you work on the same file across multiple branches, or when collaborators edit the same section. Three habits that reduce them: commit frequently (smaller changes are easier to merge), keep branches short-lived (merge or discard within a session or two), and pull from GitHub before starting work if you share the repo.

14 What .gitignore Does

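When Claude set up your project earlier, it created a .gitignore file. This tells git which files to never track: they become invisible to git and won’t be staged, committed, or pushed. It only applies to files git isn’t already tracking; a file you committed earlier stays tracked until you remove it from tracking.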

What to always ignore:

  • Secrets and credentials (.env, .Renviron, API keys)
  • OS files (.DS_Store, Thumbs.db)
  • R and Python artifacts (.Rhistory, __pycache__/, .Rproj.user/)
  • LaTeX build files (.aux, .log, .bbl, .synctex.gz)

What to ignore selectively — only when files are large or sensitive:

  • Raw microdata in data/raw/ that is too large for GitHub (which warns about files over 50 MB and rejects files over 100 MB) or contains personally identifiable information
  • Large binary outputs (output/figures/*.png) that can be regenerated from code
  • Shapefiles (*.shp, *.shx, *.dbf, *.prj) if they’re large

What to track:

  • Small data files needed for replication (crosswalks, cleaned samples, codebooks)
  • Code (always)
  • Documentation (CLAUDE.md, README.md, MEMORY.md)
  • Output tables (usually small text files)

The rule of thumb: if a file is under a few hundred KB and someone would need it to reproduce your results, track it. If it’s large, sensitive, or regenerable from code, ignore it. You can always add more patterns to .gitignore later. GitHub maintains language-specific .gitignore templates for R, Python, and other languages — ask Claude to pull the relevant ones.
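A starting .gitignore built from the "always ignore" list above might look like this sketch. The test paths (paper/draft.aux, code/analysis.R) are hypothetical, and git check-ignore is used only to confirm which rule matches.

```shell
cd "$(mktemp -d)"
git init -q

# Write a .gitignore covering the "always ignore" categories.
cat > .gitignore <<'EOF'
# Secrets and credentials
.env
.Renviron
# OS files
.DS_Store
Thumbs.db
# R and Python artifacts
.Rhistory
__pycache__/
.Rproj.user/
# LaTeX build files
*.aux
*.log
*.bbl
*.synctex.gz
EOF

# Ask git which rule, if any, ignores a given path.
git check-ignore -v .env paper/draft.aux
git check-ignore -q code/analysis.R || echo "code/analysis.R is not ignored"
```

The same check is useful later when a file unexpectedly refuses to show up in git status.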

Warning

Dropbox and git don’t mix well. Git stores its internal state in a hidden .git folder. Dropbox tries to sync that folder in the background. If Dropbox syncs while git is mid-write, it can corrupt your repository.

What to do:

  • Best option: Keep git projects outside Dropbox. Use GitHub (git push / git pull) to sync between machines.
  • If you must use Dropbox: Treat GitHub as your real backup, not Dropbox. Never open the same repo from two machines at the same time.
  • Note on Tutorial 2’s CLAUDE.md template: The template places projects in ~/Library/CloudStorage/Dropbox/. If you follow that convention and use git, treat GitHub as the authoritative copy and Dropbox as a working location — not a backup mechanism. Run git push at the end of every session.

15 Quick Reference

All of these can be done by asking Claude in natural language. The commands are here for reference.

Task | Command | Claude Prompt
Initialize a repo | git init | “Initialize a git repo here”
Stage changes | git add . | “Stage all my changes”
Commit | git commit -m "msg" | “Commit with a descriptive message”
Push to GitHub | git push | “Push to GitHub”
Create a branch | git switch -c name | “Create a branch called X”
Switch to main | git switch main | “Switch to main”
Merge a branch | git merge name | “Merge branch X into main”
Delete a merged branch | git branch -d name | “Delete branch X”
Delete an unmerged branch | git branch -D name | “Delete branch X and discard its changes”
See what changed | git status | “What files have changed?”
See history | git log --oneline | “Show me the commit history”
Undo last commit | git reset --soft HEAD~1 | “Undo my last commit”
Stash changes | git stash | “Stash my changes”
Pull from GitHub | git pull | “Pull the latest changes from GitHub”
Discard unstaged changes to tracked files | git restore . | “Discard all my uncommitted changes”

16 Further Reading

Cunningham’s MixtapeTools is a CLAUDE.md template and project setup for causal inference research, with opinionated workflow defaults that pair well with this tutorial. Panjwani’s AI MBA covers git and Claude Code workflows at a more advanced level — worth looking at once you have the basics down.

Resource | Link
Pro Git book (free, comprehensive reference) | git-scm.com/book
Panjwani AI MBA (advanced Claude Code + git) | ai-mba.io
Cunningham MixtapeTools (CLAUDE.md templates) | GitHub
GitHub CLI docs (gh command reference) | cli.github.com