We’re discussing a common version control question – to fork or not to fork a Git repository

We’ve been discussing the use of forks on GitHub, to get an understanding of good practices.

We’ve come to realise that a fork is not a Git feature. It’s a service built by Git hosting providers to create a copy of a Git repository. You fork (copy) a repository to be able to make changes in isolation, without affecting the original project.

At some point, you will probably want to propose changes to the original project, which is where our discussion “to fork or not to fork” a repository began.

It’s a trust boundary

We often call forks a “trust boundary” when discussing them internally. As shown below, teammates are people we trust, so they get to add branches directly to our repo. Outside contributors are people we don’t know, and the only thing we have to go on is their code. We ask them to isolate their code in their own repo, so that we can review their changes before merging them into our repo.

image

Not to fork

If you’re a member of the original project, a trusted teammate, you should create a topic branch and submit a pull request in the original repository. There’s no need to fork, and no value in doing so.
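To make the teammate path concrete, here is a minimal sketch that simulates it with local repositories only. The directory names (`seed`, `original.git`, `teammate`), the branch name `topic/readme-fix` and the `commit` helper are hypothetical stand-ins of ours; on a real hosting service, `original.git` would be the project's hosted repository and the pull request would be opened in its web UI.

```shell
#!/bin/sh
# Sketch only: every path and branch name below is a hypothetical stand-in.
set -e
work=$(mktemp -d)
cd "$work"

# Helper: commit with a fixed identity so the sketch runs on any machine.
commit() { git -c user.name=Dev -c user.email=dev@example.com commit -q "$@"; }

# Stand-in for the hosted original repo, seeded with one commit on master.
git init -q seed
cd seed
git symbolic-ref HEAD refs/heads/master   # name the first branch "master"
echo "hello" > README.txt
git add README.txt
commit -m "A: initial commit"
cd ..
git clone -q --bare seed original.git

# The teammate clones the original directly (no fork) and branches from master.
git clone -q original.git teammate
cd teammate
git checkout -q -b topic/readme-fix
echo "more detail" >> README.txt
commit -am "Expand README"

# Push the topic branch to the original repo; the PR is opened from it there.
git push -q origin topic/readme-fix

# The topic branch now sits next to master in the original repo.
git --git-dir="$work/original.git" branch --list
```

The point of the sketch is the push target: the topic branch lands in the original repository itself, which only works because the teammate has write access.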

To fork

If you’re not a member of the original project, an outside contributor, you should fork the repository, create a topic branch in your copied repo, and submit a pull request with changes back to the original repository.
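The outside-contributor path can be sketched the same way. We assume here that forking is roughly a copy made on the hosting service, so a bare clone stands in for it; `fork.git`, `contributor`, the remote name `upstream`, the branch name `topic/typo-fix` and the `commit` helper are all hypothetical names chosen for illustration.

```shell
#!/bin/sh
# Sketch only: local bare repos stand in for the hosting service.
set -e
work=$(mktemp -d)
cd "$work"

# Helper: commit with a fixed identity so the sketch runs on any machine.
commit() { git -c user.name=Dev -c user.email=dev@example.com commit -q "$@"; }

# Stand-in for the original repo, seeded with one commit on master.
git init -q seed
cd seed
git symbolic-ref HEAD refs/heads/master   # name the first branch "master"
echo "hello" > README.txt
git add README.txt
commit -m "A: initial commit"
cd ..
git clone -q --bare seed original.git

# Stand-in for pressing "Fork": a separate copy of the original repo.
git clone -q --bare original.git fork.git

# The contributor clones the fork and adds the original as a second remote.
git clone -q fork.git contributor
cd contributor
git remote add upstream "$work/original.git"

# Work happens on a topic branch, which is pushed to the fork only.
git checkout -q -b topic/typo-fix
echo "fixed" >> README.txt
commit -am "Fix typo in README"
git push -q origin topic/typo-fix

# origin is the fork, upstream is the original repo.
git remote -v
```

Nothing is written to the original repo in this sketch. The proposed change only reaches it if a project member accepts the pull request, which is the trust boundary in practice.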

Here’s a sticky note from our discussions:

sticky

… which evolved into this loose visual of the associated workflow, based on our understanding to date:

 

image

 

  1. User forks the original repo into a copied repo
  2. User creates a topic branch from master within the copied repo
  3. User issues a pull request (PR) from the topic branch in the copied repo to the master branch of the original repo
    • Rebase multiple commits (D+E) before issuing the PR, and/or squash them (recommended) as part of the PR
      • Always keep the master neat and clean – branches are where the mess is tolerated
      • Rebase your topic branch(es) to provide one proposed change, expressed as one commit
      • NEVER rebase a branch that is shared with other users. It rewrites history they have already pulled, which creates PAIN and is a NO-NO
    • Optionally (recommended) delete the topic branch
    • PRs to the original repo shouldn’t result in merge conflicts right away if they target master and were based on the latest master when work started. If merge conflicts do arise, it’s usually a sign that PRs are too big.
  4. User pulls latest changes from master branch in original repo to the master branch in the copied repo to synchronise repos
    • This action literally brings commit (G) over into the fork’s master
    • Had we worked directly in master in step 2, we would now be bringing in commits that don’t fast-forward onto our master. Merging them puts us ahead of upstream’s master, which requires another PR back to upstream master, which … it’s a messy situation we have probably all struggled with
  5. User either creates a new topic branch (recommended), or pulls changes from master of the copied repo to topic branches 
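The steps above can be sketched end to end with local repositories. This is a simulation under stated assumptions, not a recipe for any particular hosting service: bare clones stand in for the original repo and the fork, a direct fetch into the original's master stands in for the maintainers accepting the PR, and `git reset --soft` is used as one non-interactive way to squash D+E (an interactive rebase would do the same job). All names (`fork.git`, `upstream`, `topic/feature`, the `commit` helper) are hypothetical.

```shell
#!/bin/sh
# Sketch only: local bare repos stand in for the hosting service.
set -e
work=$(mktemp -d)
cd "$work"

# Helper: commit with a fixed identity so the sketch runs on any machine.
commit() { git -c user.name=Dev -c user.email=dev@example.com commit -q "$@"; }

# Stand-ins for the original repo and its fork.
git init -q seed
cd seed
git symbolic-ref HEAD refs/heads/master   # name the first branch "master"
echo "A" > notes.txt
git add notes.txt
commit -m "A: initial commit"
cd ..
git clone -q --bare seed original.git
git clone -q --bare original.git fork.git

# Steps 1-2: clone the fork, link the original, branch from master.
git clone -q fork.git contributor
cd contributor
git remote add upstream "$work/original.git"
git checkout -q -b topic/feature
echo "D" >> notes.txt; commit -am "D: first attempt"
echo "E" >> notes.txt; commit -am "E: fix-up"

# Step 3: collapse D+E into one commit. The topic branch has not been
# shared yet, so rewriting its history is safe.
git reset -q --soft master
commit -m "Add feature notes"
squashed=$(git rev-list --count master..topic/feature)
git push -q origin topic/feature

# Stand-in for the PR being accepted: the original's master fast-forwards
# to the proposed commit (G in the diagram).
git --git-dir="$work/original.git" fetch -q "$work/fork.git" topic/feature:master

# Step 4: bring G into the fork's master, fast-forward only.
git checkout -q master
git fetch -q upstream
git merge -q --ff-only upstream/master
git push -q origin master

# Optional clean-up of the finished topic branch, locally and on the fork.
git branch -q -d topic/feature
git push -q origin --delete topic/feature
```

Using `--ff-only` in step 4 is a deliberate choice: if anything had been committed directly on the fork's master, the merge would stop instead of quietly creating a merge commit that puts the fork ahead of upstream.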

 

What are your thoughts on this?

 


Special thanks to: Derek Keeler, Robert MacLean, Giulio Vian, Chris Boretos, Matthew Mitrik, Matt Cooper and Sara Ford for sharing their experience and thoughts to shape this blog post.