git-filter-repo — History Rewriting Done Right

The Problem It Solves

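Git has a built-in tool for rewriting history called git filter-branch. It works, but it has a bad reputation: it is slow on large repositories and full of pitfalls that can silently mangle the rewritten history.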

git filter-repo was written as a replacement that is faster, safer, and far easier to reason about. It is not a drop-in replacement, because its command-line interface is different. The git project now officially recommends it over git filter-branch: the git filter-branch man page warns users away from it and points to git filter-repo instead.


What git-filter-repo Does

git filter-repo rewrites git history by walking every commit and applying a transformation. It does three things:

  1. Rewrites file paths — renames or moves files across all commits in history
  2. Rewrites or drops commits — keeps only commits relevant to the filter you applied, drops the rest entirely
  3. Removes the origin remote, intentionally, as a safety measure so that you cannot accidentally push rewritten history back to the original repository, where it would conflict with the existing history

The remote removal comes with a notice:

NOTICE: Removing 'origin' remote; see 'Why is my origin removed?'
        in the manual if you want to push back there.
        (was https://github.com/original/repo.git)

After running it, you add a new remote manually and push to the intended destination.
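A minimal sketch of that step, assuming a throwaway repo stands in for the freshly filtered one and that the URL is a placeholder for your real destination. It adds origin only when no origin is configured, then prints where a push would go.

```shell
# Throwaway repo standing in for a freshly filtered one (no remotes configured).
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

# Placeholder URL (an assumption for this sketch): replace with the real destination.
NEW_URL="git@github.com:username/new-repo-name.git"

# Add origin only if it is missing, so an existing remote is never overwritten.
if git remote get-url origin >/dev/null 2>&1; then
  echo "origin already set to $(git remote get-url origin)"
else
  git remote add origin "$NEW_URL"
fi

# Confirm where a push would go before running it.
git remote -v
```

Checking `git remote -v` before the first push is a cheap way to make sure the rewritten history ends up in the new repository and not the old one.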


The --subdirectory-filter Flag

The most common use case is extracting a subdirectory into its own standalone repo.

git filter-repo --subdirectory-filter src/sqlite

This command makes src/sqlite the new repository root and removes everything outside it from history:

Before                              After
──────────────────────────          ──────────────────
src/sqlite/                    →    (root)
  pyproject.toml               →    pyproject.toml
  src/mcp_server_sqlite/       →    src/mcp_server_sqlite/
  README.md                    →    README.md
src/github/                    ←    gone
src/postgres/                  ←    gone
... (other directories)        ←    gone
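One way to preview the resulting layout before rewriting anything is to list the subdirectory's tree with plain git. The sketch below builds a throwaway repo that mimics the layout above; the file contents and the `__init__.py` file are made up for illustration.

```shell
# Throwaway repo mimicking the layout in the diagram (contents are made up).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"
mkdir -p src/sqlite/src/mcp_server_sqlite src/github
echo "[project]" > src/sqlite/pyproject.toml
echo "# sqlite" > src/sqlite/README.md
echo "# package" > src/sqlite/src/mcp_server_sqlite/__init__.py
echo "# github" > src/github/README.md
git add .
git commit -q -m "Initial layout"

# List the tree at HEAD:src/sqlite. Paths are printed relative to src/sqlite,
# which is how they would appear at the root after the filter.
new_root=$(git ls-tree -r --name-only HEAD:src/sqlite)
echo "$new_root"
```

This only inspects the tip commit and changes nothing, so it is safe to run in the original repository.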

What happens to mixed commits?

If a commit modified files both inside and outside the target subdirectory, the commit is kept but trimmed — only the changes inside src/sqlite/ survive, the rest are silently dropped from that commit. The commit message stays unchanged, so it may reference files that no longer appear in the diff. This is a minor cosmetic oddity and does not affect functionality.
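To get a rough idea of which commits will survive and what a mixed commit will be trimmed to, you can limit plain git commands to the subdirectory. This is a sketch on a throwaway repo with made-up paths and messages; `git log -- <path>` simplifies history, so treat the result as an approximation of what the filter keeps, not an exact prediction.

```shell
# Throwaway repo with one mixed commit and one commit outside the target.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"
mkdir -p src/sqlite src/github
echo a > src/sqlite/a.txt
echo b > src/github/b.txt
git add .
git commit -q -m "Touch sqlite and github"
echo c > src/github/c.txt
git add .
git commit -q -m "Touch github only"

# Commits that touch src/sqlite: roughly the ones the filter would keep.
kept=$(git log --format=%s -- src/sqlite)
echo "$kept"

# Within the mixed commit, only these paths would survive the trim.
surviving=$(git show --name-only --format= HEAD~1 -- src/sqlite)
echo "$surviving"
```

Here the second commit touches nothing under src/sqlite, so it does not appear in the first listing at all.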


Running It Without Installing

git filter-repo is not a built-in git command; it must be installed separately. If you have uv available, you can run it without installing anything system-wide:

uvx git-filter-repo --subdirectory-filter src/sqlite

uvx runs the tool in a temporary, isolated virtual environment. It does not install anything into your system Python; the downloaded package is kept only in uv's cache. The only thing the command modifies is the git repo directory you run it in.

To install it permanently:

# Ubuntu/Debian
sudo apt install git-filter-repo

# or via pip
pip install git-filter-repo
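A small check can tell you which of these options applies on a given machine. This sketch relies on the fact that git runs `git filter-repo` by finding an executable named `git-filter-repo` on your PATH; the variable name `mode` is just for illustration.

```shell
# Report how git-filter-repo can be run on this machine.
if command -v git-filter-repo >/dev/null 2>&1; then
  mode="installed"   # `git filter-repo ...` will work directly
elif command -v uvx >/dev/null 2>&1; then
  mode="uvx"         # run it as `uvx git-filter-repo ...`
else
  mode="missing"     # install it with one of the commands above
fi
echo "git-filter-repo: $mode"
```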

A Complete Subdirectory Extraction Workflow

Here is the full workflow for extracting a subdirectory into its own repo with preserved history:

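# 1. Make a fresh clone to work in (never modify the original).
#    filter-repo refuses to run on a repo that does not look like a fresh clone
#    unless you pass --force; --no-local keeps a local clone from tripping that check.
git clone --no-local /path/to/original-repo /path/to/new-repo-name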

# 2. Run the filter
cd /path/to/new-repo-name
uvx git-filter-repo --subdirectory-filter src/target-dir

# 3. Add the new remote (filter-repo removed the old one)
git remote add origin git@github.com:username/new-repo-name.git

# 4. Push
git push -u origin main

⚙️ Always work on a copy of the original repo. git filter-repo modifies history in place.
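Step 4 assumes the branch is called main. If you are not sure, you can ask git for the current branch name and build the push command from it. The sketch below uses a throwaway repo and a hypothetical branch name, and it only prints the push command instead of running it.

```shell
# Throwaway repo; "trunk" is a hypothetical branch name for the demo.
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git symbolic-ref HEAD refs/heads/trunk

# Read the current branch name instead of assuming "main".
branch=$(git symbolic-ref --short HEAD)

# Print the push command so it can be reviewed before running it.
echo "git push -u origin $branch"
```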


Why git Doesn’t Include It

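git filter-repo is widely recommended but ships separately, for a few reasons: it is a third-party project, it needs Python at runtime, and it falls outside git's deliberately narrow core scope.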


git filter-repo is not alone. Several widely-used git tools live outside the core for the same reasons:

Tool              What it does
──────────────    ──────────────────────────────────────────────
git-lfs           Stores large binary files outside the repo; common in game dev and data science
git-absorb        Automatically figures out which commit a staged change belongs to and amends it
git-delta         A better diff viewer with syntax highlighting
git-branchless    Suite of tools for stacked commits / stacked PRs workflow
gh                GitHub CLI: opening PRs, reviewing issues, managing releases
lazygit           Full-featured terminal UI for git
tig               Lightweight terminal UI for browsing history

The pattern is consistent: useful, widely adopted, but external because they are third-party, have extra runtime dependencies, or fall outside git’s deliberately narrow core scope.

