Programming

Splitting branches using diff files

Besides using diff files to troubleshoot Git branches, I've also used them to split a branch into multiple branches. When starting exploratory work, it's often hard to know the full breadth of the changes that are needed, so foreseeing tactical cleanups and refactors can be difficult. I'll often make changes as I explore the code, whether they're relevant or not. But as a reviewer, I appreciate when a pull request is small, focused, and contains either non-functional changes or changes that are visible to the end user, not both. So when I've written a branch that's a big bundle of mixed concerns, I try to take the time to split it up. Breaking working changes into smaller sets can feel like starting over, so here's how I use diff files to jumpstart new branches and avoid some rework.

In my oversized branch, I write the full diff to a file with something like git diff main... > changes.diff (the three dots compare the branch against its merge base with main, so anything that has landed on main in the meantime is left out). Then I review the changes, looking for parts that are trivial cleanup and could go in a separate branch and be merged without any of the other work I've done. I copy the diffs for those files out into a new file, say, cleanup.diff, remove irrelevant chunks from the copy, and delete from changes.diff the chunks I kept. Sometimes chunks themselves need to be edited, which can be tricky: Git's diffs record the line each chunk starts at and how many lines it spans, and those numbers have to be updated if I add or remove lines from a chunk.
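As a rough sketch of that first step, the script below builds a throwaway repository and writes out the diff files. The repository, the file names notes.txt and run.sh, and the branch name big are all made up for illustration. It also uses a pathspec to pull out a cleanup that happens to be confined to one file, which is a shortcut rather than the hand-copying described above.

```shell
set -eu

# Throwaway repository so the sketch can run anywhere; every name here is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # name the initial branch "main"
git config user.name "Demo"
git config user.email "demo@example.com"

printf 'alpha\nbeta\ngamma\n' > notes.txt
printf 'echo hello\n' > run.sh
git add . && git commit -q -m "initial"

# The oversized branch: a trivial cleanup mixed with a functional change.
git checkout -q -b big
printf 'alpha\nBeta\ngamma\n' > notes.txt
printf 'echo hello\necho world\n' > run.sh
git commit -q -am "mixed work"

# Full diff of the branch against its merge base with main.
git diff main... > changes.diff

# Shortcut: when a cleanup is confined to whole files, a pathspec extracts it.
git diff main... -- notes.txt > cleanup.diff

# Each chunk starts with a header of the form
#   @@ -old_start,old_count +new_start,new_count @@
# (a count of 1 may be omitted). These are the numbers that need updating
# after a chunk is edited by hand.
grep '^@@' changes.diff
```

If a chunk has been edited by hand and the counts are hard to get right, git apply --recount is an alternative to patch that infers the line counts from the chunk body instead of trusting the header.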

Once I've pulled out all the simple cleanup changes, I look for refactors. Those chunks go in another file, e.g. refactor.diff. Sometimes an oversized branch has multiple refactors, and multiple diff files are needed to untangle them into separate threads. When I'm done, the chunks remaining in changes.diff should reflect functional changes.

Once I have all the diff files, I figure out how to structure my pull requests. I create a new branch for the first set of changes, run cat something.diff | patch -p1, test, and if all is well, open a PR. After that, I create a new branch off the first branch and apply the next diff. It may or may not work. Sometimes the subsequent diff files don't align perfectly with a subset of the original changes and fail to apply. Even if a diff applies cleanly, the code might not function correctly, because the effect of a change is often tough to eyeball from a diff. Tweaking is frequently necessary. Once I've tested and gotten the second branch working, I continue on through any other diffs I have, branching from the appropriate base branches. Usually only two branches are necessary; in extreme cases, three. When the work looks like it will need more than that, I tend to break out cleanups and refactors before finishing the exploratory work, so that the exploration is clearer and has fewer distractions.
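The stacking can be sketched as follows, again in a throwaway repository with made-up file and branch names (big, cleanup, feature). The git apply --check dry run and the final comparison against the oversized branch are additions for illustration, not steps from the workflow above. The diff files are kept outside the repository so they never end up in a commit.

```shell
set -eu

# Throwaway workspace; all names are hypothetical.
work=$(mktemp -d)
mkdir "$work/repo" "$work/diffs"
cd "$work/repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo"
git config user.email "demo@example.com"

printf 'alpha\nbeta\ngamma\n' > notes.txt
printf 'echo hello\n' > run.sh
git add . && git commit -q -m "initial"

# The oversized branch, and two diff files standing in for the hand-split ones.
git checkout -q -b big
printf 'alpha\nBeta\ngamma\n' > notes.txt
printf 'echo hello\necho world\n' > run.sh
git commit -q -am "mixed work"
git diff main... -- notes.txt > "$work/diffs/cleanup.diff"
git diff main... -- run.sh    > "$work/diffs/changes.diff"

# First branch: cleanup only, based on main.
git checkout -q -b cleanup main
git apply --check "$work/diffs/cleanup.diff"   # dry run; nothing is modified
patch -s -p1 < "$work/diffs/cleanup.diff"
git add -A && git commit -q -m "Clean up notes"

# Second branch: functional change, stacked on the first.
git checkout -q -b feature cleanup
git apply --check "$work/diffs/changes.diff"
patch -s -p1 < "$work/diffs/changes.diff"
git add -A && git commit -q -m "Add world output"

# Sanity check: the top of the stack should match the oversized branch.
if git diff --quiet big feature; then
  stack_status="identical"
else
  stack_status="different"
fi
echo "feature vs big: $stack_status"
```

A difference at the end is not necessarily a problem, since tweaks made while testing each branch will show up there, but an unexpected difference can point to a chunk that was dropped during the split.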

Whether I create PRs for each branch immediately depends on team culture. For some teams, I've only opened PRs one at a time, keeping the other branches in reserve until the PRs they depend on get merged. That simplifies the mental landscape for reviewers, since PRs targeting other PRs can be hard to reason about, and it can also be simpler for me as the code author. When reviewers have feedback, I can make changes to the branch and rebase downstream branches without worrying that others have already pulled those branches. If I do open subsequent PRs before the first one is merged, I target the open branch and note in the PR description that it depends on another PR. Then I rely on GitHub to automatically retarget a PR at the main development branch when its base branch is merged.
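Rebasing a downstream branch after review feedback might look like the sketch below. It uses a throwaway repository and hypothetical branch names (cleanup, feature), and it assumes the feedback is addressed with a new commit on the first branch rather than by amending an existing one.

```shell
set -eu

# Throwaway repository; all names are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo"
git config user.email "demo@example.com"

printf 'alpha\nbeta\n' > notes.txt
printf 'echo hello\n' > run.sh
git add . && git commit -q -m "initial"

# First branch (under review) and a second branch stacked on it.
git checkout -q -b cleanup
printf 'alpha\nBeta\n' > notes.txt
git commit -q -am "Clean up notes"

git checkout -q -b feature
printf 'echo hello\necho world\n' > run.sh
git commit -q -am "Add world output"

# Review feedback lands as a new commit on the first branch.
git checkout -q cleanup
printf 'Alpha\nBeta\n' > notes.txt
git commit -q -am "Address review feedback"

# Replay the downstream branch on top of the updated first branch.
git checkout -q feature
git rebase -q cleanup

if git merge-base --is-ancestor cleanup feature; then
  echo "feature is stacked on the updated cleanup branch"
fi
```

If the feedback is handled by amending or otherwise rewriting commits on the first branch, a plain rebase may try to replay the old versions of those commits as well. In that case the three-argument form, git rebase --onto cleanup OLD_TIP feature, replays only the downstream commits, where OLD_TIP stands for the first branch's tip before it was rewritten.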

April 2026