3 Git for Research
Version control for AI-assisted academic workflows
4 Why Git?
Claude can rewrite your entire analysis script in seconds. That speed is useful — until you want to go back. Without version control, “going back” means trying to remember what your code looked like before Claude changed it. Git solves this: it tracks every change to every file, lets you undo anything, and backs up your work to GitHub.
“The #1 mistake from experienced Claude Code users: not using git.”
— Aniket Panjwani, “AI Agents for Economics Research”
This tutorial covers only what you need for a research workflow with Claude Code. It’s not a comprehensive git course. For deeper coverage, the Pro Git book is free and good.
5 Installing Git
Check if git is already installed:
git --version
If you see a version number, skip to the next section. If not, ask Claude to handle it:
Prompt:
“Check if git is installed on my machine. If not, install it. Then configure it with my name [Your Name] and email [your.email@example.com]. Verify everything works by running git --version.”
Replace the bracketed placeholders with your own name and email. These are metadata attached to your commits — they don’t create an account anywhere.
Claude will detect your operating system, run the right install command (apt, brew, or the Windows installer), and set your identity with git config --global. Claude Code may ask for your approval before modifying global git config — just approve it when prompted.
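For reference, the identity step Claude performs looks roughly like the sketch below. The name and email are the placeholders from the prompt above, not real values, and the install step is left out because it depends on your operating system.

```shell
git --version                                            # prints the installed git version

# Placeholders: substitute your own name and email.
git config --global user.name "Your Name"
git config --global user.email "your.email@example.com"

# Read the values back to confirm they were saved.
git config --global --get user.name
git config --global --get user.email
```

The --global flag writes to your user-level git configuration, so every repository on this machine picks up the same identity.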
6 Your First Project and First Commit
Let’s get your hands on git before explaining how it works. Navigate to a project directory and ask Claude to set everything up:
Prompt:
“Initialize a git repository in this directory. Create a .gitignore appropriate for a social science research project using R and Python. Set up the project folder structure from my CLAUDE.md. Make a first commit with a descriptive message.”
That’s it. Claude runs git init (creates the repository), writes a .gitignore file (tells git which files to ignore — more on this later), creates the folder structure, and takes your first snapshot with git add and git commit.
You now have a project with one commit. Try asking:
“What does my git history look like?”
Claude will run git log and show you one commit with the message it wrote. That commit is a snapshot — a permanent record of every file in your project at this moment. You can always come back to it.
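If you are curious what that prompt amounts to at the command line, here is a minimal sketch. It assumes a scratch directory (so it never touches a real project), a hypothetical folder layout, and a deliberately short .gitignore; Claude's actual choices will depend on your CLAUDE.md.

```shell
# Work in a scratch directory so nothing real is affected.
repo=$(mktemp -d) && cd "$repo"

git init --quiet                                   # create the repository
git config user.name "Your Name"                   # identity for this repo only (placeholder)
git config user.email "your.email@example.com"

mkdir -p code data/raw output                      # hypothetical folder layout
echo "# My research project" > README.md
printf '.env\n.Rhistory\n__pycache__/\ndata/raw/\n' > .gitignore

git add .                                          # stage everything that is not ignored
git commit --quiet -m "Initialize project structure and .gitignore"
git branch -M main                                 # make sure the branch is called main
git log --oneline                                  # one line per commit: short hash + message
```

Git records files rather than folders, so the empty code/ and output/ directories will only show up in the history once they contain something.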
7 What Just Happened: The Mental Model
Now that you’ve made a commit, here’s what’s going on under the hood.
Git tracks your project as a series of snapshots. Every time you commit, git takes a picture of every file at that moment. You can go back to any snapshot at any time.
Your files move through three areas:
[Figure: the working directory, staging area, and repository. Source: Pro Git, Ch. 1.3 (CC BY-NC-SA 3.0)]
- Working directory: your normal project folder. This is where you edit files.
- Staging area: a holding pen. You choose which changes to include in the next snapshot. (This is what git add does: it moves changes here.)
- Repository: the permanent history. Every commit is a snapshot stored here. (This is what git commit does: it saves the staging area as a snapshot.)
Five commands cover 90% of what you need:
| Command | What It Does |
|---|---|
| git add | Moves changes from working directory → staging area |
| git commit | Saves staging area → repository (permanent snapshot) |
| git push | Copies repository → GitHub (remote backup) |
| git switch | Switches between branches (covered later) |
| git restore | Restores files from repository → working directory |
When you ran Claude’s setup prompt a moment ago, Claude executed git add . (stage everything) and then git commit -m "..." (save the snapshot). That’s the core cycle.
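You can watch a change move through the three areas with git status --short, which prints a two-column code per file. The sketch below assumes a scratch repository and a hypothetical analysis.R file.

```shell
repo=$(mktemp -d) && cd "$repo"
git init --quiet
git config user.name "Your Name"                   # placeholder identity for this repo only
git config user.email "your.email@example.com"

echo "x <- 1" > analysis.R
git add analysis.R
git commit --quiet -m "Add analysis script"

echo "y <- 2" >> analysis.R     # the edit exists only in the working directory
git status --short              # " M analysis.R": modified, not yet staged

git add analysis.R              # move the edit to the staging area
git status --short              # "M  analysis.R": staged for the next commit

git commit --quiet -m "Define y in analysis script"
git status --short              # prints nothing: everything is in the repository
```

The M moves from the second column (working directory) to the first (staging area), then disappears once the commit is saved.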
A note on git checkout: Older tutorials and Stack Overflow answers use git checkout for both switching branches and restoring files. Git 2.23 (2019) split this into two clearer commands: git switch (branches) and git restore (files). Both checkout and the newer commands work, and Claude understands either, but switch/restore are less confusing when you’re learning.
8 Good Commit Messages
Every commit has a message describing what changed and why. Over time, these messages become a log of how your project developed.
Why this matters for Claude Code: Claude reads your commit history. Descriptive messages give it better context about what you’ve done, on top of what it picks up from CLAUDE.md and README.md.
| Bad Message | Good Message |
|---|---|
| update script | Add county-level fixed effects to main specification |
| fixed stuff | Fix merge: drop duplicate observations from DHS wave 3 |
| changes | Restrict sample to Sub-Saharan Africa (N=34 countries) |
“Add county-level fixed effects” tells Claude what your current specification looks like. “fixed stuff” tells it nothing.
The goal is that each commit represents one logical change. “Add main regression” is one commit. “Also fix a typo in the README” is a separate commit.
9 The Daily Cycle: Add, Commit, Push
9.1 Committing your work
When you want to save your progress:
Prompt:
“Commit the changes I just made with a descriptive message explaining what changed and why.”
Claude runs git add to stage your changed files, then git commit -m "..." with a message it writes based on the diff.
If you want to be specific about what to commit:
Prompt:
“Stage only the R analysis script and the output table, and commit with a message about adding the baseline DiD specification. Don’t include changes to the README.”
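Selective staging comes down to naming files in git add instead of using "git add .". A minimal sketch, assuming a scratch repository and hypothetical file names (code/did_baseline.R, output/table1.tex) with placeholder contents:

```shell
repo=$(mktemp -d) && cd "$repo"
git init --quiet
git config user.name "Your Name"                   # placeholder identity for this repo only
git config user.email "your.email@example.com"

# Hypothetical starting point: three tracked files.
mkdir -p code output
echo "# Project notes" > README.md
echo "model <- NULL" > code/did_baseline.R
echo "% empty table" > output/table1.tex
git add .
git commit --quiet -m "Add project skeleton"

# Edit all three files.
echo "More notes" >> README.md
echo "model <- feols(y ~ treat * post, data = df)" > code/did_baseline.R
echo "% baseline DiD table (placeholder)" > output/table1.tex

# Stage and commit only the script and the table.
git add code/did_baseline.R output/table1.tex
git commit --quiet -m "Add baseline DiD specification and results table"

git status --short              # " M README.md": the README edit is still uncommitted
```

The README change stays in the working directory, ready to go into its own commit later.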
9.2 Auto-commits via CLAUDE.md
You can also have Claude commit automatically as it works. Add this rule to your CLAUDE.md:
## Core Principles
- Commit frequently with meaningful messages
Claude reads this and will commit after each logical change, so you build up a detailed commit log without typing git commit yourself.
Reviewing auto-commits. Claude’s auto-generated commit messages are usually good but not always perfect. Periodically run git log --oneline (or ask Claude to show your recent history) to spot any vague or misleading messages. If you catch a bad one before pushing, git commit --amend -m "Better message" rewrites the most recent commit message (check that nothing else is staged, or it will be folded into the amended commit). For older or already-pushed commits, it’s usually not worth rewriting history; just write a clearer message on the next commit.
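Here is what that review-and-amend step looks like in a scratch repository. The file name and both messages are made up for illustration.

```shell
repo=$(mktemp -d) && cd "$repo"
git init --quiet
git config user.name "Your Name"                   # placeholder identity for this repo only
git config user.email "your.email@example.com"

echo "x <- 1" > analysis.R
git add analysis.R
git commit --quiet -m "changes"                    # a vague message slipped through

git log --oneline                                  # review: one commit, message "changes"
git commit --quiet --amend -m "Add analysis script defining x"
git log --oneline                                  # still one commit, now with the clearer message
```

Amending replaces the commit with a new one (the hash changes), which is why it is best reserved for commits that have not been pushed yet.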
10 Pushing to GitHub
So far, everything lives on your machine. GitHub gives you:
- A remote backup that survives hardware failure
- A URL you can share with coauthors
- A web interface to browse your history
- Integration with Claude Code (it can read GitHub repos)
10.1 Setting up GitHub
You need the GitHub CLI (gh) installed and authenticated. Ask Claude:
Prompt:
“Check if the GitHub CLI (gh) is installed. If not, install it. Then authenticate me with GitHub through the browser. After that, create a private repository on GitHub called ‘my-research-project’ and push all my commits to it.”
Claude will install gh if needed, run gh auth login for browser authentication, gh repo create to make the remote repo, and git push -u origin main to upload your commits. Use private repos for research projects — you can make them public later when you’re ready.
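The gh steps need a GitHub account and a network connection, so they are hard to try out safely. The push itself can be rehearsed offline: the sketch below assumes a local bare repository standing in for GitHub. The paths are hypothetical, and with a real remote the origin URL would be the one GitHub gives you.

```shell
scratch=$(mktemp -d)
remote="$scratch/my-research-project.git"
git init --quiet --bare "$remote"                  # local stand-in for the GitHub repository

repo="$scratch/project" && mkdir "$repo" && cd "$repo"
git init --quiet
git config user.name "Your Name"                   # placeholder identity for this repo only
git config user.email "your.email@example.com"
echo "# My research project" > README.md
git add README.md
git commit --quiet -m "Add README"
git branch -M main                                 # make sure the branch is called main

git remote add origin "$remote"                    # tell git where the remote lives
git push --quiet -u origin main                    # upload commits; -u remembers origin/main as upstream
git status --short --branch                        # "## main...origin/main": local and remote agree
```

Because -u records the upstream, later pushes from this branch need only a plain git push.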
10.2 Subsequent pushes
After any commit:
Prompt:
“Push all my commits to GitHub.”
Or include pushing in your session ending routine — Claude will do this automatically as part of the closing workflow described in Tutorial 2.
11 Branching: Safe Experiments
Now for the concept that makes git indispensable for AI-assisted research.
11.1 What are branches?
When you create a git repository, you start with one branch called main. Think of it as your clean, working version — the code that produces your current results correctly.
A branch is a parallel copy where you can try things without touching main. Make changes, break things, experiment. If it works, you fold it back into main. If it fails, you delete the branch and main is untouched.
- main: your working version. Always safe.
- experiment: a parallel copy where you try things.
- If the experiment works → merge back into main.
- If it fails → delete the branch. Main is untouched.
11.2 Why this matters for research
Suppose you have a working regression specification. You ask Claude to try an alternative — different fixed effects, different controls, different sample restrictions. Claude rewrites your analysis script. Twenty minutes later, you realize the new specification is wrong.
Without git: you try to remember what the original script looked like. You undo changes manually, hoping you don’t miss anything. You might have a backup copy somewhere, but it’s from last week.
With git: you created a branch before the experiment. You switch back to main and delete the branch. Three seconds. Nothing lost.
Rule of thumb: If there’s a chance you’ll want to undo it, branch first.
| Branch | Don’t Branch |
|---|---|
| Alternative specifications | Fixing a typo |
| Different sample restrictions | Adding a comment |
| New identification strategy | Minor formatting |
| Restructuring project directories | |
| Trying a new package/method | |
11.3 Creating and using branches
Prompt:
“Create a new git branch called ‘experiment-alt-controls’ and switch to it. I want to try an alternative control set without affecting the main branch.”
Claude runs git switch -c experiment-alt-controls, which creates the branch and switches to it in one step. Everything you do from here happens on the experiment branch — main stays untouched.
Always commit or stash before switching branches. If you have uncommitted changes and try to switch branches, git will either refuse (if the changes conflict) or carry the uncommitted changes to the new branch (which is usually not what you want). Make it a habit to commit your work before switching.
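If you are not ready to commit, git stash shelves your uncommitted edits and gives you a clean working directory; git stash pop brings them back. A minimal sketch in a scratch repository, with a hypothetical analysis.R:

```shell
repo=$(mktemp -d) && cd "$repo"
git init --quiet
git config user.name "Your Name"                   # placeholder identity for this repo only
git config user.email "your.email@example.com"
echo "x <- 1" > analysis.R
git add analysis.R
git commit --quiet -m "Add analysis script"
git branch -M main

echo "y <- 2" >> analysis.R                        # uncommitted edit on main
git stash                                          # shelve the edit; working directory is clean again
git switch --quiet -c experiment-alt-controls      # switching now carries nothing along
git switch --quiet main                            # come back to main
git stash pop                                      # restore the shelved edit
git status --short                                 # " M analysis.R": the edit is back, still uncommitted
```

Committing is usually the simpler habit; stashing is for half-finished work you do not want in the history yet.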
If the experiment works — merge:
Prompt:
“The experiment worked. Switch back to the main branch and merge in the experiment-alt-controls branch. Then delete the experiment branch.”
Claude runs git switch main, then git merge experiment-alt-controls, then git branch -d experiment-alt-controls to clean up.
If the experiment fails — discard:
Prompt:
“The experiment didn’t work. Switch back to main and delete the experiment-alt-controls branch. I want to discard all the changes on that branch.”
Claude runs git switch main and then git branch -D experiment-alt-controls (the capital -D forces deletion of an unmerged branch).
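Both outcomes can be rehearsed end to end in a scratch repository. In this sketch the control variables and the second branch name are hypothetical; the commands are the ones described above.

```shell
repo=$(mktemp -d) && cd "$repo"
git init --quiet
git config user.name "Your Name"                   # placeholder identity for this repo only
git config user.email "your.email@example.com"
echo "controls <- c('age', 'income')" > analysis.R
git add analysis.R
git commit --quiet -m "Add baseline control set"
git branch -M main

# Outcome 1: the experiment works, so merge it and clean up.
git switch --quiet -c experiment-alt-controls
echo "controls <- c('age', 'income', 'education')" > analysis.R
git commit --quiet -am "Add education to control set"
git switch --quiet main
git merge --quiet experiment-alt-controls          # main now contains the experiment's commit
git branch -d experiment-alt-controls              # -d: safe delete, allowed because it is merged

# Outcome 2: the experiment fails, so throw it away.
git switch --quiet -c experiment-bandwidth-50km
echo "bandwidth <- 50" >> analysis.R
git commit --quiet -am "Try 50km bandwidth"
git switch --quiet main
git branch -D experiment-bandwidth-50km            # -D: force delete of an unmerged branch

git branch --list                                  # only main remains
cat analysis.R                                     # merged controls, no bandwidth line
```

The lowercase -d refuses to delete unmerged work, which makes it a useful safety check; reach for -D only when you really mean to discard the branch.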
12 The Daily Routine
This section adds git-specific steps to the session routines from Tutorial 2.
Start of session:
cd ~/path/to/your/project
claude
Start with the session startup prompt from Tutorial 2 (“Read all the markdowns…”). Claude reads your CLAUDE.md, README.md, MEMORY.md, and git history, so it picks up where you left off.
If you work from multiple machines (e.g., office desktop and home laptop), start each session with git pull to sync any commits you pushed from elsewhere. Without this, your local repo and GitHub can diverge, creating unnecessary merge conflicts. You can add “always pull before starting work” to your CLAUDE.md so Claude handles this automatically.
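To see why pulling first matters, here is a sketch that imitates two machines with two folders and uses a local bare repository as a stand-in for GitHub. All paths and file names are hypothetical.

```shell
scratch=$(mktemp -d)
remote="$scratch/remote.git"
git init --quiet --bare "$remote"                      # local stand-in for GitHub
git -C "$remote" symbolic-ref HEAD refs/heads/main     # default branch is main, as on GitHub

# "Office" machine: create the project and push it.
mkdir "$scratch/office" && cd "$scratch/office"
git init --quiet
git config user.name "Your Name"                       # placeholder identity for this repo only
git config user.email "your.email@example.com"
echo "x <- 1" > analysis.R
git add analysis.R
git commit --quiet -m "Add analysis script"
git branch -M main
git remote add origin "$remote"
git push --quiet -u origin main

# "Laptop": clone the project once.
git clone --quiet "$remote" "$scratch/laptop"

# Office again: more work, pushed at the end of the session.
echo "y <- 2" >> analysis.R
git commit --quiet -am "Define y"
git push --quiet

# Laptop, next session: pull before doing anything else.
cd "$scratch/laptop"
git pull --quiet                                       # fetch and apply the office commit
git log --oneline                                      # both commits are now on the laptop
```

Had the laptop committed new work before pulling, the two histories would have diverged and git would have needed a merge to reconcile them.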
Before risky experiments — branch:
Prompt:
“Before we try this alternative specification, create a branch called ‘experiment-bandwidth-50km’ so we can go back if it doesn’t work.”
End of session — commit, document, push:
Prompt:
“We’re done for today. Update README.md with the decisions we made. Commit everything with a summary of what we accomplished. Push to GitHub.”
Tomorrow you start fresh with a full history of what you did today.
13 When Things Go Wrong
You can handle all of these by asking Claude in plain English. The explanations are here so you understand what’s happening.
13.1 “I committed something I shouldn’t have”
If you accidentally committed a file with sensitive data, you need to remove it from git tracking (while keeping the local file) and add it to .gitignore so it doesn’t happen again.
Prompt:
“I accidentally committed data/raw/sensitive_file.csv. Remove it from git tracking but keep the local file. Add it to .gitignore so it doesn’t get tracked again.”
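Removing a file from tracking only affects future commits: the file is still stored in every earlier commit that included it, and if you already pushed to GitHub, it is in the remote history too. Prevention (a good .gitignore) is much easier than cleanup. Ask Claude to help you scrub the file from history using git filter-repo (a separate tool that has to be installed), or, if the repo is small and you don’t mind the nuclear option, delete and recreate the repository. If the file contained credentials, treat them as exposed and rotate them.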
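The untracking step itself is short. This sketch assumes a scratch repository and a placeholder CSV; it also shows that the earlier commit still knows about the file, which is why history scrubbing is a separate job.

```shell
repo=$(mktemp -d) && cd "$repo"
git init --quiet
git config user.name "Your Name"                       # placeholder identity for this repo only
git config user.email "your.email@example.com"

mkdir -p data/raw
echo "id,income" > data/raw/sensitive_file.csv         # placeholder contents
echo "x <- 1" > analysis.R
git add .
git commit --quiet -m "Add analysis script"            # oops: the CSV was committed too

git rm --quiet --cached data/raw/sensitive_file.csv    # stop tracking, keep the local file
echo "data/raw/sensitive_file.csv" >> .gitignore       # keep it from being staged again
git add .gitignore
git commit --quiet -m "Stop tracking sensitive raw data file"

ls data/raw                                            # the file is still on disk
git ls-files                                           # tracked now: .gitignore and analysis.R
git log --oneline -- data/raw/sensitive_file.csv       # the file's history is still in the repo
```

The --cached flag is what keeps the local copy; without it, git rm would also delete the file from your working directory.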
13.2 “I want to undo my last commit”
Under the hood: git reset --soft HEAD~1 removes the last commit but keeps your files unchanged. The snapshot goes away; the edits stay.
Prompt:
“Undo my last commit but keep all the file changes in my working directory. I want to re-do it differently.”
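A quick way to convince yourself that the edits survive is to try it in a scratch repository. The file and messages below are hypothetical.

```shell
repo=$(mktemp -d) && cd "$repo"
git init --quiet
git config user.name "Your Name"                   # placeholder identity for this repo only
git config user.email "your.email@example.com"

echo "x <- 1" > analysis.R
git add analysis.R
git commit --quiet -m "Add analysis script"
echo "y <- 2" >> analysis.R
git commit --quiet -am "fixed stuff"               # the commit we want to redo

git reset --soft HEAD~1                            # drop the commit, keep the edit
git log --oneline                                  # only "Add analysis script" remains
git status --short                                 # "M  analysis.R": the edit is staged, ready to re-commit
cat analysis.R                                     # the file still contains both lines
```

If the commit was already pushed, undoing it locally leaves GitHub with the old version, so this is safest for commits that exist only on your machine.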
13.3 “I want to see what changed”
Under the hood: git status shows uncommitted changes; git log --oneline shows the commit history.
Prompt:
“Show me what files have changed since my last commit. Then show me the last 10 commits as a summary.”
13.4 “I tried to merge and got a conflict”
Most merges just work — git combines the changes automatically. A merge conflict happens when both branches edited the same lines in the same file, and git can’t decide which version to keep. This is normal, not an emergency.
When it happens, git marks the conflicting section inside the file with special markers:
<<<<<<< HEAD
results <- feols(y ~ treatment + controls1, data = df)
=======
results <- feols(y ~ treatment + controls2, data = df)
>>>>>>> experiment-alt-controls
The section between <<<<<<< HEAD and ======= is the version on your current branch (HEAD is git’s name for the commit you have checked out, here the tip of main). The section between ======= and >>>>>>> experiment-alt-controls is the incoming version. You (or Claude) need to decide which to keep, or combine them, and then remove the markers.
Prompt:
“I tried to merge but got a conflict. Show me the conflicting files, explain what each version does, and resolve the conflict by keeping the version from the experiment branch. Then commit the merge.”
Preventing conflicts. Conflicts are most common when you work on the same file across multiple branches, or when collaborators edit the same section. Three habits that reduce them: commit frequently (smaller changes are easier to merge), keep branches short-lived (merge or discard within a session or two), and pull from GitHub before starting work if you share the repo.
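To take the mystery out of conflicts, you can provoke one on purpose in a scratch repository and resolve it. This sketch assumes a hypothetical one-line analysis.R and resolves the conflict by keeping the experiment branch's version; combining the two by hand is the other common choice.

```shell
repo=$(mktemp -d) && cd "$repo"
git init --quiet
git config user.name "Your Name"                   # placeholder identity for this repo only
git config user.email "your.email@example.com"
echo "results <- feols(y ~ treatment, data = df)" > analysis.R
git add analysis.R
git commit --quiet -m "Add baseline specification"
git branch -M main

# Edit the same line on two branches.
git switch --quiet -c experiment-alt-controls
echo "results <- feols(y ~ treatment + controls2, data = df)" > analysis.R
git commit --quiet -am "Use alternative control set"
git switch --quiet main
echo "results <- feols(y ~ treatment + controls1, data = df)" > analysis.R
git commit --quiet -am "Use original control set"

# The merge stops with a conflict and a non-zero exit code.
git merge experiment-alt-controls || echo "merge stopped on a conflict"
cat analysis.R                                     # shows the <<<<<<< / ======= / >>>>>>> markers

# Resolve by keeping the incoming (experiment) version.
git restore --theirs analysis.R                    # write the experiment's version into the file
git add analysis.R                                 # mark the conflict as resolved
git commit --quiet -m "Merge experiment-alt-controls, keeping alternative controls"
```

In a merge, "ours" is the branch you are on and "theirs" is the branch being merged in, so --ours would have kept the controls1 line instead.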
14 What .gitignore Does
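When Claude set up your project earlier, it created a .gitignore file. This tells git which untracked files to leave alone: they become invisible to git and won’t be staged, committed, or pushed. A file that was already committed stays tracked until you remove it from tracking, even if it matches a pattern.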
What to always ignore:
- Secrets and credentials (.env, .Renviron, API keys)
- OS files (.DS_Store, Thumbs.db)
- R and Python artifacts (.Rhistory, __pycache__/, .Rproj.user/)
- LaTeX build files (.aux, .log, .bbl, .synctex.gz)
What to ignore selectively — only when files are large or sensitive:
- Raw microdata in data/raw/ that is too large for git (>50 MB) or contains personally identifiable information
- Large binary outputs (output/figures/*.png) that can be regenerated from code
- Shapefiles (*.shp, *.shx, *.dbf, *.prj) if they’re large
What to track:
- Small data files needed for replication (crosswalks, cleaned samples, codebooks)
- Code (always)
- Documentation (CLAUDE.md, README.md, MEMORY.md)
- Output tables (usually small text files)
The rule of thumb: if a file is under a few hundred KB and someone would need it to reproduce your results, track it. If it’s large, sensitive, or regenerable from code, ignore it. You can always add more patterns to .gitignore later. GitHub maintains language-specific .gitignore templates for R, Python, and other languages — ask Claude to pull the relevant ones.
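As an illustration, the sketch below writes a small .gitignore covering the "always ignore" list plus raw data, then asks git what it can still see. The file names are hypothetical, and a real project will likely want more patterns than this.

```shell
repo=$(mktemp -d) && cd "$repo"
git init --quiet

# Hypothetical project files: some should be tracked, some ignored.
mkdir -p code data/raw
touch .env .DS_Store code/analysis.R data/raw/survey.csv paper.aux

cat > .gitignore <<'EOF'
# Secrets and credentials
.env
.Renviron
# OS files
.DS_Store
Thumbs.db
# R and Python artifacts
.Rhistory
__pycache__/
.Rproj.user/
# LaTeX build files
*.aux
*.log
*.bbl
*.synctex.gz
# Raw microdata (large or sensitive)
data/raw/
EOF

git status --short --untracked-files=all           # lists only .gitignore and code/analysis.R
git check-ignore -v .env data/raw/survey.csv paper.aux   # shows which pattern ignores each file
```

git check-ignore -v is handy when a file is unexpectedly missing from git status: it names the exact .gitignore line responsible.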
Dropbox and git don’t mix well. Git stores its internal state in a hidden .git folder. Dropbox tries to sync that folder in the background. If Dropbox syncs while git is mid-write, it can corrupt your repository.
What to do:
- Best option: Keep git projects outside Dropbox. Use GitHub (git push / git pull) to sync between machines.
- If you must use Dropbox: Treat GitHub as your real backup, not Dropbox. Never open the same repo from two machines at the same time.
- Note on Tutorial 2’s CLAUDE.md template: The template places projects in ~/Library/CloudStorage/Dropbox/. If you follow that convention and use git, treat GitHub as the authoritative copy and Dropbox as a working location, not a backup mechanism. Run git push at the end of every session.
15 Quick Reference
All of these can be done by asking Claude in natural language. The commands are here for reference.
| Task | Command | Claude Prompt |
|---|---|---|
| Initialize a repo | git init | “Initialize a git repo here” |
| Stage changes | git add . | “Stage all my changes” |
| Commit | git commit -m "msg" | “Commit with a descriptive message” |
| Push to GitHub | git push | “Push to GitHub” |
| Create a branch | git switch -c name | “Create a branch called X” |
| Switch to main | git switch main | “Switch to main” |
| Merge a branch | git merge name | “Merge branch X into main” |
| Delete a merged branch | git branch -d name | “Delete branch X” |
| Delete an unmerged branch | git branch -D name | “Delete branch X and discard its changes” |
| See what changed | git status | “What files have changed?” |
| See history | git log --oneline | “Show me the commit history” |
| Undo last commit | git reset --soft HEAD~1 | “Undo my last commit” |
| Stash changes | git stash | “Stash my changes” |
| Pull from GitHub | git pull | “Pull the latest changes from GitHub” |
| Discard uncommitted changes | git restore . | “Discard all my uncommitted changes” |
16 Further Reading
Cunningham’s MixtapeTools is a CLAUDE.md template and project setup for causal inference research, with opinionated workflow defaults that pair well with this tutorial. Panjwani’s AI MBA covers git and Claude Code workflows at a more advanced level — worth looking at once you have the basics down.
| Resource | Link |
|---|---|
| Pro Git book (free, comprehensive reference) | git-scm.com/book |
| Panjwani AI MBA (advanced Claude Code + git) | ai-mba.io |
| Cunningham MixtapeTools (CLAUDE.md templates) | GitHub |
| GitHub CLI docs (gh command reference) | cli.github.com |