Source Control and Build Systems

18 07 2007

I’ve been thinking lately about the shortcomings of SCMs (Source Control Management systems) and build systems – mainly, that by integrating a certain kind of SCM with a build system, builds of large projects could be made to go a bit faster, while also gaining some semantic information.

SCMs I’ve used:

  • CVS – About as basic as you can get while still being useful. Basic checkout-modify-checkin model which forms the basis of most SCMs.
  • Subversion – A slightly nicer version of CVS for most intents and purposes. This is what I consider baseline for source control. The only difference from CVS that I’ve ever cared about is that it screws up less frequently with binary files than CVS.
  • Mercurial – A nice distributed SCM.
  • SCCS / Teamware – Sun’s internal SCM. It’s a distributed SCM with very explicit parent-child repository relationships (which are, of course, flexible – more on this below).
  • Perforce – Pretty obnoxious in my experience, and not squarely in any one SCM paradigm.

I like distributed SCMs, largely because they also give a developer their own version control without having to deal with the main repository. There’s no need to make the SCM interoperate with a different tool used locally, or with a second copy of the same SCM – checkins can be made against the local repository without issue. This is important for developers who, for example, are just starting on an existing project and don’t yet have commit access, or who want to work on a laptop or other local machine without access to the main repository. I also like an explicit parent-child relationship between repositories. SCCS/Teamware has this. Mercurial has a notion of a parent repository (by default, the one you cloned from), but it isn’t emphasized in use of the tool, so it isn’t as semantically meaningful – this is a personal nitpick, and I don’t really have a technical reason to back up the preference.

Perforce is mostly just obnoxious, in my opinion*. It is similar to a distributed SCM in that each developer has their own copy of the repository from which they do direct file checkouts, but it keeps checkout state on a central server. And to save space, it requires that all copies of the repository live on the same filesystem, because it clones the repository with hardlinks. This is novel, and it certainly saves space, but in addition to making remote work impossible it creates an enormous problem. Developers do occasionally screw up their umask on UNIX systems, so they occasionally need to chmod files. More commonly, an editor will let users easily (sometimes accidentally) override read-only protection on files by chmod’ing them independently. Then the editing begins. It’s bad enough when someone commits bad code to a main repository, and those who sync up before the change is backed out have their workspaces broken. But when someone accidentally chmods a file in the Perforce setup above, it can simultaneously break the build in every checked-out workspace that hasn’t done a local checkout of that file – a local checkout would remove the hardlink and replace it with an actual copy (the sketch below shows the mechanics). For this reason, I dislike SCMs which make hardlinks for checkouts. One thing I do like about both Perforce and SCCS/Teamware is that each keeps an explicit list of the files being edited, which is the inspiration for the main point of this post.
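A minimal sketch of why the hardlink scheme is fragile. The filenames are hypothetical, but the mechanics are just POSIX hardlinks: both paths name the same inode, so a chmod through either one changes what every workspace sees.

    import os, tempfile

    # Two paths, one inode: 'master' stands in for the shared depot copy,
    # 'ws' for a developer's hardlinked "checkout" of the same file.
    d = tempfile.mkdtemp()
    master = os.path.join(d, 'depot_copy.c')
    ws = os.path.join(d, 'workspace_copy.c')

    with open(master, 'w') as f:
        f.write('int main(void) { return 0; }\n')
    os.chmod(master, 0o444)   # read-only until explicitly checked out
    os.link(master, ws)       # the space-saving hardlink clone

    os.chmod(ws, 0o644)       # one developer forces their copy writable...
    print(oct(os.stat(master).st_mode & 0o777))  # ...and this prints 0o644:
                                                 # the shared copy changed too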

Build systems I’ve used:

  • make (and parallel versions of it) – The baseline build system from a UNIX/*nix perspective.
  • ant – Used for Java.

The main problem with make and ant is that they are stateless – so for partial builds, they still need to do a lot of initial work to determine what needs to be compiled and, for languages like C, linked. Both traverse the filesystem, stat’ing every file the build target depends on. For small projects, this isn’t significant. For larger projects (such as OpenSolaris), it becomes noticeable and increases the build time – particularly on systems where file operations are already slow, such as older systems with slow disks, or systems with sources on network mounts. An incremental x86 build of OpenSolaris, on the build machines at work, takes between 10 and 40 minutes after changing one file, depending on the number of users on the build servers. Recompiling any one of the main files in the genunix module takes less than 30 seconds. Re-linking all of the object files which compose genunix doesn’t take more than a minute or so. And rebuilding the cpio archives for the bfu script only takes about 2 to 3 minutes. Which means that each OS build I do takes between 2 and 8 times as long as it should. This is partly a result of the OS makefile structure – there has been some discussion recently on some OpenSolaris lists about the construction of the makefiles. But either way, the main issue is that hundreds of files are stat’ed which are known not to have changed – the build system simply lacks that information.
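For reference, here is roughly what a stateless build tool has to do per target, sketched in Python (the names are illustrative, not make internals) – one stat(2) for the target and one per dependency, on every incremental build, whether or not anything changed:

    import os

    # Make-style staleness check: stat the target and every one of its
    # sources; rebuild if the target is missing or older than any source.
    def needs_rebuild(target, sources):
        try:
            target_mtime = os.stat(target).st_mtime
        except FileNotFoundError:
            return True
        # One stat(2) per dependency, every build, changed or not.
        return any(os.stat(src).st_mtime > target_mtime for src in sources)

Multiply that by every target in the tree, and on an incremental build where almost nothing changed, the stat traffic alone dominates.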

One possible solution to this (which I might try to code up this weekend) is to combine a distributed SCM that does explicit local checkout (so it holds a full list of all potentially modified files) with a build system. Given a dependency graph, the tools could then avoid even looking at source files which don’t need to be rebuilt – only modified files, and the targets which depend on them, would need to be checked. This could shave some noticeable time off the OS build. Of course, this doesn’t address things like header dependencies unless those are added explicitly to the dependency tree. But for object files, this could be a win.
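A rough sketch of that idea, assuming the SCM can hand over its list of checked-out (potentially modified) files and the build system keeps a target-to-sources graph – all names here are hypothetical:

    # dep_graph maps each target to the list of files it is built from.
    # 'modified' is the SCM's explicit checkout/edit list; nothing else
    # in the tree is ever stat'ed or even visited.
    def affected_targets(dep_graph, modified):
        # Invert the graph: source file -> targets that consume it.
        consumers = {}
        for target, sources in dep_graph.items():
            for src in sources:
                consumers.setdefault(src, []).append(target)

        stale, worklist = set(), list(modified)
        while worklist:
            node = worklist.pop()
            for target in consumers.get(node, []):
                if target not in stale:
                    stale.add(target)
                    worklist.append(target)  # targets can feed other targets
        return stale

    # e.g. affected_targets(graph, scm_edit_list()) would yield only the
    # object files and archives that can possibly be out of date.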


* It is probably worth noting that this is from my experience at one company where Perforce is in use. I haven’t looked into it too much, so these problems could just be artifacts of that particular Perforce configuration.


4 responses

2 08 2007
aleks

a. i have also used perforce, and i also think it sucks.

b. SCons uses md5sums instead of access times, which seems to solve the problem you’re talking about without requiring that you integrate a build system and an SCM. or am i missing the point of your idea?

3 08 2007
csgordon

Well, you’re sort of missing the point about the access times. If you’re doing an md5 sum, that’s nice – you don’t have to recompile files that you just, for example, checked into an SCM which marks such files read-only, even though the mtime on the file has changed. But there are two problems:
1) Computing an md5 sum of a file takes longer than doing a stat(2) – you have to open and read the whole file instead of making one cheap metadata call. That cost is easiest to justify when it saves recompiling a huge source file, but there are all sorts of reasons to avoid too much code in one file anyway.
2) You’re still doing an operation on every file in the tree. The reason integrating the SCM and the build system could be nice is that you don’t waste time walking a huge directory tree and doing some operation on every file; you can just go directly to the files of interest.
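To make the cost difference in (1) concrete, a toy comparison (hypothetical helpers, not SCons internals):

    import hashlib, os

    # One metadata syscall:
    def changed_by_stat(path, recorded_mtime):
        return os.stat(path).st_mtime != recorded_mtime

    # Open and read the file's entire contents:
    def changed_by_md5(path, recorded_digest):
        with open(path, 'rb') as f:
            return hashlib.md5(f.read()).hexdigest() != recorded_digest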

3 08 2007
aleks

Okay. If I’m understanding correctly, the only information you’re passing from the SCM to the build system is a list of which files changed. So what if you just picked your favorite SCM, added an option so that it can print out the list of which files changed in an easy-to-parse format, and then have the build system use that? (The benefit to doing that is that your build tool is mostly SCM-agnostic.)

3 08 2007
csgordon

Yes, that’s all. So perhaps portraying it as ‘combined’ was a bit overeager – a tighter integration along the lines you suggest is certainly what I have in mind.
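Something like this rough sketch, say, using Mercurial (any SCM whose status command prints a parseable list of modified files would do):

    import subprocess

    # 'hg status -mn' prints one locally modified path per line,
    # with the status letter suppressed (-n / --no-status).
    def modified_files():
        out = subprocess.run(['hg', 'status', '-mn'],
                             capture_output=True, text=True, check=True)
        return [line for line in out.stdout.splitlines() if line]

    # The build tool would feed this list into its dependency graph
    # and ignore everything else in the tree.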
