this looks like a job for... tomorrow!

The Power of Moore's Law

One of my personal projects has been writing a buildbot that builds Mono on Windows. Although we have an official one on monobuild, it takes about 4.5 hours per build, so it can't build many revisions. Mine doesn't do everything that the official one does yet, like running tests, but it looks promising as far as cutting down on the build time (primarily by not using cygwin).

I run it on an old desktop that was lying around that I converted to a dedicated builder server. Its stats are roughly:
  • Windows Server 2008 (32-bit)
  • Pentium 4, 2.4 GHz
  • 768 MB RAM
  • 5400 rpm IDE HD
It builds Mono on Windows with the following times:

Step                                          Time
Svn update mono, mcs                          2:25 min
Copy mono and mcs to a fresh directory        6:38 min
Build Mono runtime                            4:38 min
Build 2.0, 3.5 assemblies (101 assemblies)    8:01 min
Remove mono/mcs directory                     1:00 min
Total                                         22:42 min

Not horrible, but it's hard to be satisfied with that knowing I have a spiffy quad core machine and could build most of the managed assemblies in parallel. I had to try it out on my main desktop machine, which is:
  • Windows 7 RC (64-bit)
  • Intel Core 2 Quad Q6600 (2.4 GHz)
  • 4 GB RAM
  • 7200 rpm SATA HD
Unfortunately, my build system isn't set up for parallel builds, so I had to write that first. My first step was to simply get it running serially on my desktop. This yielded:

Step                                          Time
Svn update mono, mcs                          1:03 min
Copy mono and mcs to a fresh directory        4:29 min
Build Mono runtime                            1:57 min
Build 2.0, 3.5 assemblies (101 assemblies)    3:23 min
Remove mono/mcs directory                     0:25 min
Total                                         11:17 min

I of course expected my desktop to be faster, but it was already completing the build in less than half the time of my server. On top of that, it builds the managed assemblies in 3:23 minutes.
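To put numbers on that comparison, here is a small Python sketch that turns the two sets of timings into per-step speedups. The timings are copied from the results above; the function names (`to_seconds`, `speedups`, `totals`) are just ones I made up for illustration, not part of the build system.

```python
# Step timings copied from the two result sets above: (step, server, desktop).
TIMINGS = [
    ("Svn update mono, mcs", "2:25", "1:03"),
    ("Copy mono and mcs to a fresh directory", "6:38", "4:29"),
    ("Build Mono runtime", "4:38", "1:57"),
    ("Build 2.0, 3.5 assemblies", "8:01", "3:23"),
    ("Remove mono/mcs directory", "1:00", "0:25"),
]


def to_seconds(mmss):
    """Convert a 'minutes:seconds' string such as '6:38' to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def speedups(timings):
    """Return (step, server_seconds / desktop_seconds) for each step."""
    return [(step, to_seconds(old) / to_seconds(new)) for step, old, new in timings]


def totals(timings):
    """Return total (server_seconds, desktop_seconds) over all steps."""
    server = sum(to_seconds(old) for _, old, _ in timings)
    desktop = sum(to_seconds(new) for _, _, new in timings)
    return server, desktop


if __name__ == "__main__":
    for step, ratio in speedups(TIMINGS):
        print(f"{step:<42} {ratio:.2f}x")
    server, desktop = totals(TIMINGS)
    print(f"{'Total':<42} {server / desktop:.2f}x")
```

The copy step comes out with the smallest ratio (about 1.5x), while every other step is better than 2x. That is what I would expect if copying is mostly limited by the disk rather than the CPU, though I have not measured that directly.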

Even with excellent parallelization, there isn't a lot of time to chop off of three and a half minutes. And we know we won't get perfect parallelization due to the bootstrap and the common assemblies that have to be built first that everything depends on (corlib, System, System.Xml, etc.). I think I could maybe cut two minutes off of that with a parallel build.
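That guess can be sanity-checked with Amdahl's law. The sketch below is only a rough estimate: the 203 seconds and the four cores come from the numbers above, but the serial fraction (the share of time spent on the bootstrap and the common assemblies) is a value I am assuming, not something I measured.

```python
def parallel_time(serial_seconds, serial_fraction, workers):
    """Amdahl's law: estimated wall time when only part of the work parallelizes.

    serial_fraction is the share of the original time that cannot be
    spread across workers (0.0 = fully parallel, 1.0 = fully serial).
    """
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    parallel_fraction = 1.0 - serial_fraction
    return serial_seconds * (serial_fraction + parallel_fraction / workers)


MANAGED_BUILD_SECONDS = 3 * 60 + 23  # 3:23 on the desktop, from the timings above
CORES = 4                            # Q6600 is a quad core
ASSUMED_SERIAL_FRACTION = 0.25       # ASSUMPTION: a guess, not a measurement

if __name__ == "__main__":
    estimate = parallel_time(MANAGED_BUILD_SECONDS, ASSUMED_SERIAL_FRACTION, CORES)
    saved = MANAGED_BUILD_SECONDS - estimate
    print(f"Estimated parallel build: {estimate:.0f} s (saves {saved:.0f} s)")
```

With the assumed serial fraction of 0.25, the estimate is about 89 seconds, a saving of a little under two minutes. Even with a serial fraction of zero the build would still take about 51 seconds, so roughly two and a half minutes is the ceiling on what four cores could save here.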

For now, I have aborted my plan to parallelize my build system. It turns out that Joel is right that sometimes it's better to throw money at a problem instead of spending time to optimize it.

Of course, that's also a dangerous idea, because now I look at my copying numbers and want an SSD. :)

4 comments:

Anonymous said...

Looks like the 'copy to a fresh directory' step has plenty of low-hanging fruit though. Do you need to copy it (or can you just go through the svn tree after removing unknown files?), and/or is there a Windows equivalent of cp -al?

Anonymous said...

Just an idea:
Why not use a RAM drive in the dedicated build server? It seems at least half of your time is spent on disk I/O in one form or another. With a RAM drive you could effectively reduce that to zero.
Looking at your figures, I'd assume that you could easily get down to 5 minutes just by doing that.
8 GB of RAM costs less than 100€.
And a 7 GB RAM drive should be big enough to hold both the working copy and the compiling copy.

jpobst said...

As you may recall, Pentium IVs originally only accepted RDRAM, which was supposed to be the savior of mankind. In reality, it was just freakishly expensive. I don't know if you can even buy it anymore, but the last time I found some, it was still hundreds of dollars for a 256 MB stick.

In fact, when I started to outgrow this computer, my decision came down to "spend $600 for more RAM for this computer" or "spend $600 for a better, brand new computer with more RAM".

Trying a RAM disk on my desktop, however, is definitely something I plan on doing.

Amber said...

Nice post. Came across it after finding out to my delight that a git clone + ./autogen.sh + make came in at 10m52s on my Q9550 with SSD machine.

That surprised me, so I went looking for other people's build times. Note that I'm using ccontrol and that make is effectively aliased to 'colormake -j5'.

I'm not using ccache or distcc, although it would be easy to change that, even on the fly, using ccontrol.

Highly recommended, and thanks for giving me a reference 'benchmark' figure there.