Roy Tang

Programmer, engineer, scientist, critic, gamer, dreamer, and kid-at-heart.


Git vs CVS

A while back we were tasked with helping a client’s internal dev team migrate their repositories from Subversion to Git. A distributed VCS seemed ideal for their situation: they had a very small in-house dev team managing contributions from external subcontractors. The main rationale was that their process for merging contributions from the external developers was extremely complicated and often resulted in conflicts that were hard to resolve. Before this, I hadn’t used Git in any depth myself (aside from cloning stuff from GitHub), and especially not in a team setting, so the training one of our other engineers gave them was a good opportunity for me to become familiar with Git as well. Over the following months we guided them through the first few pull requests and merges; their repository structure and build process were quite complicated, so it took us a while to iron out the kinks. Merging PRs from old forks was often difficult. I mostly let our other engineer, who had more Git experience, handle any merging. Sometimes I wondered whether the transition was really worth it for the client, but it all seemed to end up okay.

I was reminded of this because I recently went through the whole workflow (fork, change code, open a pull request, merge from upstream to resolve conflicts) on a larger, non-trivial project. (I don’t count the few PRs I did for Hacktoberfest last year, as those were relatively simple.) This one was a whole new submodule that I had to merge into the main repository, so there were a lot of changes on my end. I wasn’t expecting any conflicts though, because I was only adding new stuff and not modifying existing code. So I tried to make my PR without merging from upstream first. Unfortunately, it turns out some CR/LF shenanigans between my Windows machine and the repo meant one of my commits had more “lines changed” than I intended, resulting in a merge conflict. Fortunately, it was easy to simply merge from upstream and resolve the offending files manually. It didn’t take me more than 5 minutes to fix the issue.
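For anyone wondering how line endings can inflate a change like that, here is a rough Python sketch. It is not what Git does internally and not the actual files from my PR; the function names and sample text are made up for illustration. It just counts how many lines a line-based comparison would flag, with and without normalizing CRLF to LF first.

```python
import difflib


def changed_line_count(old: str, new: str) -> tuple[int, int]:
    """Return (removed, added) line counts from a line-based comparison.

    Line endings are kept on each line, so "foo\n" and "foo\r\n"
    count as different lines.
    """
    old_lines = old.splitlines(keepends=True)
    new_lines = new.splitlines(keepends=True)
    matcher = difflib.SequenceMatcher(None, old_lines, new_lines, autojunk=False)
    removed = added = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            removed += i2 - i1
            added += j2 - j1
    return removed, added


def normalize_line_endings(text: str) -> str:
    """Convert Windows-style CRLF line endings to LF."""
    return text.replace("\r\n", "\n")


# Hypothetical file contents: upstream uses LF, the local copy was
# saved with CRLF and has one genuinely new line at the end.
upstream = "alpha\nbeta\ngamma\n"
windows_copy = "alpha\r\nbeta\r\ngamma\r\ndelta\r\n"

raw_counts = changed_line_count(upstream, windows_copy)
normalized_counts = changed_line_count(upstream, normalize_line_endings(windows_copy))

print(raw_counts)         # every line looks changed
print(normalized_counts)  # only the real addition remains
```

In practice you would let Git handle this rather than normalizing by hand; the `core.autocrlf` setting and a `.gitattributes` file with `text=auto` exist for exactly this kind of problem.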

My experience with this was a bit amazing in retrospect, mostly because for a large part of my career, until as recently as four years ago, I worked with CVS as a version control system, which is ancient by modern standards. I remember that during my first project we had three branches: DEV, UAT and PROD. Moving changes from any given branch to another was a pain, because the changes to each file had to be manually re-applied on the other branch. So my instinct has always been to dread any large-scale merge (something I have to unlearn, apparently).

Git has a few other advantages, of course. One of my favorites is being able to address specific commits individually by hash and see exactly which files changed in each one. When we were using CVS, if you wanted to roll back to a specific change, you had to have had the foresight to tag that change so that you could check it out again. And CVS for us was always very slow on larger projects with thousands of files, leading to annoyingly long build cycles and wait times. Git seems to be a lot more performant.

I hold no ill will towards my old company for making me use CVS all those years. When I started out it was a reasonable choice, but as time went by better options became available, and as at many companies, it was difficult to leave entrenched systems behind. I did propose the use of a distributed VCS over the years, at first Mercurial and later Git as well, but there was never a good time to migrate without being overly disruptive (this happens when Everything Is Urgent). It probably would have been possible if I had gotten in on a larger project on the ground floor, but the last time I had a chance for that was around 2007, I think, when Git was still in its infancy. Still, I heard from people still there that they’re already migrating to Git, so good for them (even though they waited until I was gone!).
