Git Rebase for the Terrified

(brethorsting.com)

48 points | by aaronbrethorst 5 days ago

13 comments

  • xlii 1 hour ago
    Allow me (today) to be that person to propose checking out Jujutsu instead [0]. Not only does it have the superpower of atomic commits (reviewers will love you, peers will hate 8 small PRs that are chained together ;-)), but it's also more consistent than git and works perfectly well as a drop-in replacement.

    In fact, I've been using Jujutsu for ~2 years as a drop-in and nobody complained (outside of the 8 small PRs chained together). Git is great as a backend, but Jujutsu shines as a frontend.

    [0]: https://www.jj-vcs.dev/latest/

    • Zambyte 1 hour ago
      Also been using Jujutsu for about 2 years. I feel like I have learned so much about how git actually works by simply not using git.
    • _flux 1 hour ago
      I think I'd love to use Jujutsu, but I enjoy Magit (for Emacs) too much to entertain the thought of switching :/.

      Besides, Magit rebasing is also pretty sweet.

    • vlovich123 1 hour ago
      How do you handle publishing the stack?
  • thibaut_barrere 2 hours ago
    PSA: I’m not terrified of rebase, yet it’s good to know this:

    https://docs.github.com/en/get-started/using-git/about-git-r...

    > Warning - Because changing your commit history can make things difficult for everyone else using the repository, it's considered bad practice to rebase commits when you've already pushed to a repository.

    A similar warning is in Atlassian docs.

    • ongy 2 hours ago
      I think a large part of this is about how a branch is expected to be used.

      Branches that people are expected to track (i.e. pull from or merge into theirs regularly) should never be rebased/force-pushed.

      Branches that are short-lived or only exist to represent some state can do so quite often.

      • xlii 1 hour ago
        Also branches that are write-only by a single person by consensus. E.g. "personal" PR branches that are not supposed to be modified by anyone but owner.
      • thibaut_barrere 2 hours ago
        It is this, plus more:

        - the web tooling must react properly to this (as GH does mostly)

        - comments done at the commit level are complicated to track

        - and given the reach of tools like GH, people shooting themselves in the foot with this (even experienced ones) most likely generate a decent level of support load for the teams behind these tools

    • mr_mitm 48 minutes ago
      Is there a reason why that recommendation cannot be changed to "don't ever force push unless you are certain no one else has fetched this branch"?
  • flux3125 53 minutes ago
    >the worst case scenario for a rebase gone wrong is that you delete your local clone and start over.

    Wouldn't it be enough to simply back up the branch (eg, git checkout -b current-branch-backup)? Or is there still a way to mess up the backup as well?
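
    A minimal sketch of the backup-branch idea, assuming a throwaway repository in a temp directory (the branch and file names are made up for illustration). It uses `git branch` rather than `git checkout -b`, because the latter also switches you onto the backup.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example"
git config user.email "example@example.invalid"

echo base > base.txt
git add base.txt
git commit -q -m "base"

git checkout -q -b feature
echo work > work.txt
git add work.txt
git commit -q -m "feature work"

git checkout -q main
echo newer > newer.txt
git add newer.txt
git commit -q -m "main moves on"
git checkout -q feature

# `git branch` creates the backup without switching to it.
git branch feature-backup
backup=$(git rev-parse feature-backup)

git rebase -q main
rebased=$(git rev-parse HEAD)

# Changed your mind? Point the current branch back at the backup.
git reset -q --hard feature-backup
restored=$(git rev-parse HEAD)
echo "backup=$backup rebased=$rebased restored=$restored"
```

    The rebase only moves `feature`; the backup ref keeps pointing at the old commits, so they stay reachable.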

    • aurecchia 43 minutes ago
      Yeah, deleting your local clone and starting over should normally not be necessary, unless you really mess things up badly.

      The "local backup branch" is not really needed either because you can still reference `origin/your-branch` even after you messed up a rebase of `your-branch` locally.

      Even if you force-pushed and overwrote `origin/your-branch` it's most likely still possible to get back to the original state of things using `git reflog`.
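
      A rough sketch of the reflog route, assuming a throwaway repository and a branch called `feature` (both hypothetical). After a completed rebase, the branch's own reflog still lists the pre-rebase tip as `feature@{1}`.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example"
git config user.email "example@example.invalid"

echo base > file.txt
git add file.txt
git commit -q -m "base"

git checkout -q -b feature
echo feature > feature.txt
git add feature.txt
git commit -q -m "feature work"
before=$(git rev-parse HEAD)

git checkout -q main
echo more > main.txt
git add main.txt
git commit -q -m "main moves on"

git checkout -q feature
git rebase -q main
after=$(git rev-parse HEAD)

# The branch reflog records every position the branch has pointed at.
git reflog show feature

# Entry {1} is where the branch was before the rebase finished.
git reset -q --hard "feature@{1}"
restored=$(git rev-parse HEAD)
```

      Reflog entries expire eventually, so this is a safety net for recent mistakes rather than a permanent archive.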

  • coffeebeqn 2 hours ago
    I wish rebase was taught as the default - I blame the older inferior version control software. It’s honestly easier to reason about a rebase than a merge since it’s so linear.

    Understanding of local versus origin branch is also missing or mystical to a lot of people and it’s what gives you confidence to mess around and find things out

    • Akranazon 2 hours ago
      The end result of a git rebase is arguably superior. However, I don't do it, because the process of running git rebase is a complete hassle. git merge is one-shot, whereas git rebase replays commits one-by-one.

      Replaying commits one-by-one is like a history quiz. It forces me to remember what was going on a week ago when I did commit #23 out of 45. I'm grateful that git stores that history for me when I need it, but I don't want it to force me to interact with the history. I've long since expelled it from my brain, so that I can focus on the current state of the codebase. "5 commits ago, did you mean to do that, or can we take this other change?" I don't care, I don't want to think about it.

      Of course, this issue can be reduced by the "squash first, then rebase" approach. Or judicious use of "git commit --amend --no-edit" to reduce the number of commits in my branch, therefore making the rebase less of a hassle. That's fine. But what if I didn't do that? I don't want my tools to judge me for my workflow. A user-friendly tool should non-judgmentally accommodate whatever convenient workflow I adopted in the past.

      Git says, "oops, you screwed up by creating 50 lazy commits, now you need to put in 20 minutes figuring out how to cleverly combine them into 3 commits, before you can pull from main!" then I'm going to respond, "screw you, I will do the next-best easier alternative". I don't have time for the judgement.

      • teaearlgraycold 1 hour ago
        This seems crazy to me as a self-admitted addict of “git commit --amend --no-edit && git push --force-with-lease”.

        I don’t think the tool is judgmental. It’s finicky. It requires more from its user than most tools do. Including bending over to make your workflow compliant with its needs.
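
        For illustration, a sketch of what `--force-with-lease` protects against, assuming a local bare repository standing in for the remote and two hypothetical users, alice and bob. Alice amends and force-pushes after Bob has pushed to the same branch without her fetching.

```shell
set -eu
work=$(mktemp -d)
git init -q --bare "$work/origin.git"

# Alice publishes a branch named feature.
git init -q "$work/alice"
cd "$work/alice"
git config user.name "alice"
git config user.email "alice@example.invalid"
git symbolic-ref HEAD refs/heads/feature
echo one > notes.txt
git add notes.txt
git commit -q -m "first draft"
git remote add origin "$work/origin.git"
git push -q origin feature

# Bob clones, adds a commit, and pushes it.
git clone -q --branch feature "$work/origin.git" "$work/bob"
cd "$work/bob"
git config user.name "bob"
git config user.email "bob@example.invalid"
echo two >> notes.txt
git commit -q -a -m "bob's addition"
git push -q origin feature

# Alice has not fetched; she amends and tries to overwrite the branch.
cd "$work/alice"
echo "one, reworded" > notes.txt
git commit -q -a --amend --no-edit
if git push -q --force-with-lease origin feature 2>/dev/null; then
  result=overwritten
else
  result=rejected
fi
echo "push was $result"
```

        The lease compares the remote branch with Alice's `origin/feature`. One caveat: anything that fetches in the background updates that ref, and the push would then go through.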

    • CJefferson 31 minutes ago
      I don't mind rebasing a single commit, but I hate it when people rebase a list of commits, because that makes commits which never existed before, have probably never been tested, and generally never will be.

      I've had failures while git bisecting, hitting commits that clearly never compiled, because I'm probably the first person to ever check them out.
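
      One possible mitigation, sketched under assumptions: `git rebase --exec` runs a command after each replayed commit and stops the rebase if it fails. The `test -f` check below is a stand-in for a real build or test command, and the log file exists only to count how often it ran.

```shell
set -eu
repo=$(mktemp -d)
log=$(mktemp)
export log
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example"
git config user.email "example@example.invalid"

echo base > base.txt
git add base.txt
git commit -q -m "base"

git checkout -q -b feature
echo a > a.txt
git add a.txt
git commit -q -m "step one"
echo b > b.txt
git add b.txt
git commit -q -m "step two"

git checkout -q main
echo newer > newer.txt
git add newer.txt
git commit -q -m "main moves on"
git checkout -q feature

# Stand-in for something like `make test`; it runs once per replayed commit.
git rebase -q --exec 'test -f base.txt && echo checked >> "$log"' main
checked=$(grep -c checked "$log")
echo "commits checked during rebase: $checked"
```

      With a real test command in place of the stand-in, every rewritten commit gets built at least once before anyone bisects into it.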

    • jillesvangurp 1 hour ago
      Rebase your local history, merge collaborative work. It helps to just relabel rebase as "rewrite history". That makes it more clear that it's generally not acceptable to force push your rewritten history upstream. I've seen people trying to force push their changes and overwrite the remote history. If you need to force push, you probably messed up. Maybe OK on your own pull request branches assuming nobody else is working on them. But otherwise a bad idea.

      I tend to rebase my unpushed local changes on top of upstream changes. That's why rebase exists. So you can rewrite your changes on top of upstream changes and keep life simple for consumers of your changes when they get merged. It's a courtesy to them. When merging upstream changes gets complicated (lots of conflicts), falling back to merging gives you more flexibility to fix things.

      The resulting pull requests might get a bit ugly if you merge a lot. One solution is squash merging when you finally merge your pull request. This has as the downside that you lose a lot of history and context. The other solution is to just accept that not all change is linear and that there's nothing wrong with merging. I tend to bias to that.

      If your changes are substantial, conflict resolution caused by your changes tends to be a lot easier for others if they get lots of small commits, a few of which may conflict, rather than one enormous one that has lots of conflicts. That's a good reason to avoid squash merges. Interactive rebasing is something I find too tedious to bother with usually. But some people really like those. But that can be a good middle ground.

      It's not that one is better than the other. It's really about how you collaborate with others. These tools exist because in large OSS projects, like Linux, where they have to deal with a lot of contributions, they want to give contributors the tools they need to provide very clean, easy to merge contributions. That includes things like rewriting history for clarity and ensuring the history is nice and linear.

      • cousin_it 1 hour ago
        Maybe I'm old, but I still think a repository should be a repository: sitting on a server somewhere, receiving clean commits with well written messages, running CI. And a local copy should be a local copy: sitting on my machine, allowing me to make changes willy-nilly, and then clean them up for review and commit. That's just a different set of operations. There's no reason a local copy should have the exact same implementation as a repository, git made a wrong turn in this, let's just admit it.
        • fc417fc802 47 minutes ago
          I agree but I think git got the distributed (ie all nodes the same) part right. I also think what you say doesn't take it far enough.

          I think it should be possible to assign different instances of the repository different "roles" and have the tooling assist with that. For example. A "clean" instance that will only ever contain fully working commits and can be used in conjunction with production and debugging. And various "local" instances - per feature, per developer, or per something else - that might be duplicated across any number of devices.

          You can DIY this using raw git with tags, a bit of overhead, and discipline. Or the github "pull" model facilitates it well. But either you're doing extra work or you're using an external service. It would be nice if instead it was natively supported.

          This might seem silly and unnecessary but consider how you handle security sensitive branches or company internal (proprietary) versus FOSS releases. In the latter case consider the difficulty of collaborating with the community across the divide.

        • pamcake 43 minutes ago
          > I still think a repository should be a repository: sitting on a server somewhere, receiving clean commits with well written messages, running CI. And a local copy should be a local copy: sitting on my machine, allowing me to make changes willy-nilly, and then clean them up for review and commit

          This is one way to see things and work and git supports that workflow. Higher-level tooling tailored for this view (like GitHub) is plentiful.

          > There's no reason a local copy should have the exact same implementation as a repository

          ...Except to also support the many git users who are different from you and in different contexts. Bending git's API to your preferences would make it less useful, harder to use, or not even suitable at all for many others.

          > git made a wrong turn in this, let's just admit it.

          Nope. I prefer my VCS decentralized and flexible, thank you very much. SVN and Perforce are still there for you.

          Besides, it's objectively wrong calling it "a wrong turn" if you consider the context in which git was born and got early traction: Sharing patches over e-mail. That is what git was built for. Had it been built your way (first-class concepts coupled to p2p email), your workflow would most likely not be supported and GitHub would not exist.

          If you are really as old as you imply, you are showing your lack of history more than your age.

    • tjpnz 1 hour ago
      I've had recent interns who've struggled with rebase and they've never known anything but Git. Never understood why that was given they seem ok with basic commits and branching. I would agree that rebase is easier to reason about than merging yet I'm still needing to give what feels like a class on it.
    • echelon 2 hours ago
      git rebase squash as a single commit on a single main branch is the one true way.

      I know a lot of people want to maintain the history of each PR, but you won't need it in your VCS.

      You should always be able to roll back main to a real state. Having incremental commits between two working stages creates more confusion during incidents.

      If you need to consult the work history of transient commits, that can live in your code review software with all the other metadata (such as review comments and diagrams/figures) that never make it into source control.

      • jameshush 2 hours ago
        This is one of the few hills I will die on. After working on a team that used Phabricator for a few years and going back to GitHub when I joined a new company, it really does make life so much nicer to just rebase -> squash -> commit a single PR to `main`
        • fc417fc802 40 minutes ago
          What was stopping you from squash -> merge -> push two new changesets to `main`? Isn't your objection actually to the specifics of the workflow that was mandated by your employer as opposed to anything inherent to merge itself?
      • _flux 1 hour ago
        Merging merge requests as merge commits (rather than fast-forwarding them) gives the same granularity in the main branch, while preserving the option to have bisect dive inside the original MR to actually find the change that made the interesting change in behavior.
      • fc417fc802 1 hour ago
        > You should always be able to roll back main to a real state.

        Well there's your problem. Why are you assuming there are non-working commits in the history with a merge based workflow? If you really need to make an incremental commit at a point where the build is broken you can always squash prior to merge. There's no reason to conflate "non-working commits" and "merge based workflow".

        Why go out of the way to obfuscate the pathway the development process took? Depending on the complexity of the task the merge operation itself can introduce its own bugs as incompatible changes to the source get reconciled. It's useful to be able to examine each finished feature in isolation and then again after the merge.

        > with all the other metadata (such as review comments and diagrams/figures) that never make it into source control.

        I hate that all of that is omitted. It can be invaluable when debugging. More generally I personally think the tools we have are still extremely subpar compared to what they could be.

      • hnarn 1 hour ago
        I completely agree. It also forces better commit messages, because "maintaining the history of each PR" is forced into prose written by the person responsible for the code instead of hand-waving it away into "just check the commits" -- no thanks.
  • aeinbu 30 minutes ago
    > I always use VS Code for this step. Its merge conflict UI is the clearest I’ve found: it shows “Accept Current Change,” “Accept Incoming Change,” “Accept Both Changes,” and “Compare Changes” buttons right above each conflict.

    I still get confused by vscode’s changing the terms used by Git. «Current» vs «incoming» are not clear, and can be understood to mean two different things.

    - Is “current” what is on the branch I am rebasing on? Or is it my code? (During a rebase, it’s the branch I am rebasing onto)

    - Is “incoming” the code I’m adding to the repo? Or is it what I am rebasing onto? (During a rebase, it’s my code being replayed)

    I find that many tools are trying to make Git easier to understand, but changing the terms is not so helpful. Since different tools seldom change to the same words, it just clutters any attempts to search for coherent information.

    • spiffyk 4 minutes ago
      Git's "ours"/"theirs" terminology is often confusing to newcomers, especially when from a certain (incorrect, but fairly common) point of view their meaning may appear to be swapped between merge and rebase. I think in an attempt to make the terminology less confusing UIs tend to reinvent it, but they always fail miserably, ending up with the same problem, just with slightly different words.

      This constant reinvention makes the situation even worse, because now the terminology is not only confusing, but also inconsistent across different tools.

    • afiori 25 minutes ago
      For merges, current is the branch you are on; for rebases it helps to see them as a series of cherry-picks, so current would be the branch you would be on while doing the cherry-pick equivalent to this step of the rebase.
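
      A small sketch to check this empirically, assuming a throwaway repository with a deliberate conflict (file and branch names are made up). During a conflicted rebase, index stage 2 holds "ours" and stage 3 holds "theirs"; as far as I can tell, these are the sides VS Code labels "Current" and "Incoming".

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example"
git config user.email "example@example.invalid"

echo original > greeting.txt
git add greeting.txt
git commit -q -m "base"

git checkout -q -b feature
echo "from feature" > greeting.txt
git commit -q -a -m "feature edit"

git checkout -q main
echo "from main" > greeting.txt
git commit -q -a -m "main edit"

git checkout -q feature
# The rebase stops on the conflict, so tolerate its non-zero exit status.
git rebase main >/dev/null 2>&1 || true

ours=$(git show :2:greeting.txt)     # stage 2: "ours"
theirs=$(git show :3:greeting.txt)   # stage 3: "theirs"
echo "ours/current:    $ours"
echo "theirs/incoming: $theirs"

git rebase --abort
```

      While rebasing `feature` onto `main`, "ours" is the `main` side and "theirs" is the feature commit being replayed, which matches the cherry-pick view.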
  • embedding-shape 1 hour ago
    > The response is often hesitation or outright fear. I get it. Rebase has a reputation for destroying work, and the warnings you see online don’t help.

    The best method to stop being terrified of destructive operations in git, back when I first learned it, was literally "cp -r $original-repo $new-test-repo && go-to-town". Don't know what will happen when you run `git checkout -- $file` or whatever? Copy the entire directory, run the command, look at what happens, then decide if you want to run that in your "real" repository.

    Sounds stupid maybe, but if it works, it works. Been using git for something like a decade now, and I'm no longer afraid of destructive git operations :)
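
    A sketch of that copy-then-experiment routine, assuming temp directories for both the "real" repository and the scratch copy (the names are hypothetical). The uncommitted edit stands in for work that `git checkout -- <file>` would discard.

```shell
set -eu
original=$(mktemp -d)
cd "$original"
git init -q
git config user.name "Example"
git config user.email "example@example.invalid"

echo committed > notes.txt
git add notes.txt
git commit -q -m "initial"
echo "uncommitted edit" > notes.txt

# Copy the whole directory, .git included, and experiment on the copy.
scratch=$(mktemp -d)
cp -r "$original/." "$scratch"
git -C "$scratch" checkout -- notes.txt

in_scratch=$(cat "$scratch/notes.txt")
in_original=$(cat "$original/notes.txt")
echo "scratch copy: $in_scratch"
echo "original:     $in_original"
```

    The command threw away the edit in the copy only, which is exactly the information you wanted before running it for real.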

    • psychoslave 1 hour ago
      One step further, and more in the spirit of the tool, would be to git clone your repository locally.

      And still one step further, just create a new branch to deal with the rebase/merge.

      Yes, there are many UX pain points in using git, but it also has the great benefit of extremely cheap and fast branching for experiments.

      • afiori 22 minutes ago
        in my experience some of the trickiest situations are around gitignore file updates, crlf conversion, case [in]sensitivity, etc., where clones and branches are less useful as a testing ground.
    • bob1029 56 minutes ago
      The fastest way to eliminate fear is to practice. I had the team go through it one day. They didn't get a choice. I locked us on a screen share until everyone was comfortable with how rebasing works. The call lasted maybe 90 minutes. You just have to decide one day that you (or the team) will master this shit, spend a few hours doing it, and move on.

      Rebase is a super power but there are a few ground rules to follow that can make it go a lot better. Doing things across many smaller commits can make rebase less painful downstream. One of the most important things is to learn that sometimes a rebase is not really feasible. This isn't a sign that your tools are lacking. This is a sign that you've perhaps deviated so far that you need to reevaluate your organization of labor.

    • codesnik 1 hour ago
      whoa. well, if it really works for you. The thing is, git has practically zero "destructive" commands: you can almost always (unless you called the garbage collector aggressively) return to the previous state of anything committed to it. `git reflog` is a good starting point.

      I think I've seen someone code a more user-friendly `git undo` front end for it.

    • thunderbong 42 minutes ago
      One of the many things I like about fossil is the 'undo' command [0].

      Also, since you can choose to keep the fossil repo in a separate directory, that's an additional space saver.

      [0] https://www3.fossil-scm.org/home/help/undo

  • ngruhn 1 hour ago
    Maintaining linear history is arguably more work. But excessively non-linear history can be so confusing to reason over.

    Linear history is like reality: One past and many potential futures. With non-linear history, your past depends on "where you are".

        ----- M -----+--- P
                    /
        ----- D ---+
    
    Say I'm at commit P (for present). I got married at commit M and got a dog at commit D. So I got married first and got a Dog later, right? But if I go back in time to commit D where I got the dog, our marriage is not in my past anymore?! Now my wife is sneezing all the time. Maybe she has a dog allergy. I go back in time to commit D but can't reproduce the issue. Guess the dog can't be the problem.
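
    The "what is in my past" question can be asked of git directly. A sketch, assuming a throwaway repository where tags M, D and P mark the commits from the diagram and the merge commit itself stands in for P:

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/marriage
git config user.name "Example"
git config user.email "example@example.invalid"

echo start > life.txt
git add life.txt
git commit -q -m "shared starting point"

git checkout -q -b dog
echo dog > dog.txt
git add dog.txt
git commit -q -m "D: got a dog"
git tag D

git checkout -q marriage
echo married > marriage.txt
git add marriage.txt
git commit -q -m "M: got married"
git tag M

git merge -q -m "P: present" dog
git tag P

# Prints yes if the first commit is in the history of the second.
is_ancestor() {
  if git merge-base --is-ancestor "$1" "$2"; then echo yes; else echo no; fi
}
m_before_p=$(is_ancestor M P)
d_before_p=$(is_ancestor D P)
m_before_d=$(is_ancestor M D)
echo "M in P's past: $m_before_p, D in P's past: $d_before_p, M in D's past: $m_before_d"
```

    Both M and D are in the past of P, but M is not in the past of D, which is the situation the story describes.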
    • fc417fc802 17 minutes ago
      You omitted the merge commit. M is taken so let's go with R. You jump back to M to confirm that the symptoms really don't predate the marriage. Then you jump to R to reproduce and track down the underlying cause of the bad interaction.

      Had you simply rebased you would have lost the ability to separate the initial working implementation of D from the modifications required to reconcile it with M (and possibly others that predate it). At least, unless you still happen to have a copy of your pre-rebase history lying around but I prefer not to depend on happenstance.

    • psychoslave 47 minutes ago
      That’s also because multiple concerns are being squeezed into the same exposed output of a common feature. Having one branch that provides a linear, logical overview of the project’s feature progression is not incompatible with having many other branches with all kinds of messes going back and forth, merging and forking each other and so on.

      In my experience, when there is a bug, it’s often quicker to fix it without having a look at the past commits, even when a regression occurs. If it’s not obvious just looking at the current state of the code, asking whoever touched that part last will generally give a better shortcut, because there is so much more in the person’s mind than in the whole git history.

      Yes, logs and commit history can bring the "aha" insight, and on some rare occasions it’s nice to have git bisect at hand.

      Maybe that’s just me, and the pinnacle of best engineers will always trust the source tree as most important source of information and starting point to move forward. :)

    • hnarn 1 hour ago
      > So I got married first and got a Dog later, right?

      No. In one reality, you got married with no dog, and in another reality you got a dog and didn't marry. Then you merged those two realities into P.

      Going "back in time to commit D" is already incorrect phrasing, because you're implying linear history where one does not exist. It's more like you're switching to an alternate past.

      • ngruhn 1 hour ago
        The point is that it's harder to reason over.
        • hnarn 1 hour ago
          I don't really agree that it's harder to reason over in the sense that it's hard to understand the consequences, but I also agree that a linear history is superior for troubleshooting, just like another comment pointed out that single squashed commits onto a main branch makes it easier to troubleshoot because you go from a working state to a non-working state between two commits.
        • agumonkey 1 hour ago
          there are other tricky time issues with staging/prod parallel branching models too: the most recent merge (to prod) contains older content, so time slips .. maybe for most people it's obvious, but it caused me confusion a few times when comparing various docker images
          • fc417fc802 43 minutes ago
            > the most recent merge (to prod) contains older content

            Can't that also happen with a rebase? Isn't it an (all too easy to make) error any time you have conflicting changes from two different branches that you have to resolve? Or have I misunderstood your scenario?

  • ratchetclank 1 hour ago
    I never understood why rebase is such a staple in the git world. For me, "losing" historical data, like which branch my work was done on, is a real issue.

    In the same vein, commits not recording the branch they were created on as metadata is a real pain point. It is always a mess to find which commits were made for which global feature/bugfix in a global gitflow process...

    I'll probably look into automatically adding a suffix with the current branch name to my commit messages, but it will only work for me, not for other contributors...

    • orwin 49 minutes ago
      Ideally you only rebase your own commits on your own feature branch, just before merging. Having a clean commit history before merging makes the main branch/trunk more readable.

      Also (and especially) it makes it way easier to revert a single feature if all the relevant commits for that feature are already grouped.

      For your issue about not knowing which branch the commits are from: that's why I love merge commits and tree representation (I personally use 'tig', but git log also has a tree representation and GUI tools always have it too).

    • hhjinks 37 minutes ago
      Which branch your work was done on is noise, not signal. There is absolutely zero signal lost by rebasing, and it prunes a lot of noise. If your branch somehow carries information, that information should be in your commit message.
    • chungy 1 hour ago
      Sounds like you'd be a fan of Fossil (https://fossil-scm.org). See for instance: https://fossil-scm.org/home/doc/trunk/www/fossil-v-git.wiki#...
      • smartmic 55 minutes ago
        Let me expand on this with a link to the article "Rebase Considered Harmful" [0].

        I also prefer Fossil to Git whenever possible, especially for small or personal projects.

        [0] https://fossil-scm.org/home/doc/trunk/www/rebaseharm.md

        • fc417fc802 32 minutes ago
          > Surely a better approach is to record the complete ancestry of every check-in but then fix the tool to show a "clean" history in those instances where a simplified display is desirable and edifying

          From your link. The actual issue that people ought to be discussing in this comment section imo.

    • codesnik 58 minutes ago
      it's just that "gitflow" is unnecessarily complex (for most applications). with rebase you can work more or less as with "patches" and a single master, like many projects did in the '90s, just much more comfortably and securely.
  • pabs3 58 minutes ago
    git rebase conflict resolution is a lot less scary with the zdiff3 merge.conflictStyle option.

    Also incremental rebasing with mergify/git-imerge/git-mergify-rebase/etc is really helpful for long-lived branches that aren't merged upstream.

    https://github.com/brooksdavis/mergify https://github.com/mhagger/git-imerge https://github.com/CTSRD-CHERI/git-mergify-rebase https://gist.github.com/nicowilliams/ea2fa2b445c2db50d2ee650...

    I also love git-absorb for automatic fixups of a commit stack.

    https://github.com/tummychow/git-absorb
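
    For reference, a minimal sketch of enabling the zdiff3 style mentioned above, assuming Git 2.35 or newer (older versions only know `merge` and `diff3`). It is set per repository here, in a throwaway directory; adding `--global` would apply it to every repository.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q

# Per-repository setting; conflict hunks will then include the common ancestor.
git config merge.conflictStyle zdiff3

style=$(git config --get merge.conflictStyle)
echo "merge.conflictStyle = $style"
```

    With this style, each conflict shows a third section between the two sides containing the base version, which usually makes it clearer what each side changed.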

  • tomaytotomato 55 minutes ago
    GitHub is not Git, but I find the Squash and Merge functionality in GitHub's Pull Request system means I no longer need to worry about rebasing or squashing my commits locally before merging.

    At work though it is still encouraged to rebase, and I have sometimes forgotten to squash and then had to abort, or just suck it up and resolve conflicts from my many local commits.

    • cdmckay 42 minutes ago
      This

      Rebase only makes sense if you are making huge PRs where you need to break the work down into smaller commits for it to make sense.

      If you keep your PRs small, squashing it works well enough, and is far less work and more consistent in teams.

      Expecting your team to carefully group their commits and have good commit messages for each is a lot of unnecessary extra work.

  • rich_sasha 1 hour ago
    I have lost work ~irretrievably via rebase.

    I was working on a local branch, periodically rebasing it to master. All was well, my git history was beautiful etc.

    Then down the line I realised something was off. Code that should have been there wasn't. In the end I concluded some automatic commit application while rebasing gobbled up my branch changes. Or frankly, I don't even entirely know what happened (this is my best guess), all I know is, suddenly it wasn't there.

    No big deal, right? It's VCS. Just go back in time and get a snapshot of what the repo looked like 2 weeks ago. Ah. Except rebase.

    I like a clean linear history as much as the next guy, but in the end I concluded that the only real value of a git repo is telling the truth and keeping the full history of WTF really happened.

    You could say I was holding it wrong, that if you just follow this one weird old trick doctor hate, rebase is fine. Maybe. But not rebasing and having a few more squiggles in my git history is a small price to pay for the peace of mind that my code change history is really, really all there.

    Nowadays, if something leaves me with a chance that I cannot recreate the repo history at any point in time, I don't bother. Squash commits and keeping the branch around forever are OK in my book, for example. And I always merge with --no-ff. If a commit was never on master, it shouldn't show up in it.
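
    A sketch of what `--no-ff` does, assuming a throwaway repository where `main` has not moved since `feature` branched off (so a plain merge would simply fast-forward). The names are made up for illustration.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example"
git config user.email "example@example.invalid"

echo base > base.txt
git add base.txt
git commit -q -m "base"

git checkout -q -b feature
echo work > work.txt
git add work.txt
git commit -q -m "feature work"
git checkout -q main

# --no-ff records a merge commit even though a fast-forward was possible.
git merge -q --no-ff -m "Merge branch 'feature'" feature

parents=$(git rev-list --parents -n 1 HEAD | awk '{print NF - 1}')
all_commits=$(git rev-list --count main)
mainline=$(git rev-list --count --first-parent main)
echo "merge parents: $parents, all commits: $all_commits, first-parent only: $mainline"
```

    Following only first parents (`git log --first-parent main`) then lists just the commits made directly on `main`, while the feature commit stays reachable through the merge's second parent.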

    • nh2 1 hour ago
      > Just go back in time and get a snapshot of what the repo looked like 2 weeks ago. Ah. Except rebase.

      This is false.

      Any googling of "git undo rebase" will immediately point out that the git reflog stores all rebase history for convenient undoing.

      Shockingly, git, being a VCS, has version control for the... versions of things you create in it, no matter if via merge or rebase or cherry-pick or whatever. You can of course undo all of that.

      • rich_sasha 1 hour ago
        Up to a point - they are garbage collected, right?

        And anyway, I don't want to dig this deep in git internals. I just want my true history.

        Another way of looking at it is that given real history, you can always represent it more cleanly. But without it you can never really piece together what happened.

  • dominicrose 55 minutes ago
    When things get messy I use Sublime Merge with two tabs, one with the code that's open in VS Code and one with the same project but different branch/commit. It works well on Linux. I've managed to make it work with Windows + WSL but I don't recommend it.
  • 708145_ 56 minutes ago
    I see no need to ever rebase manually, just merge on your branch and always fast-forward squash merge (only sane default) with GitHub/GitLab/whatever.