Thursday, October 15, 2009

Time for a Real World Linux/Apache - IIS Comparison

All my real world experience is with MS infrastructure. This article is an attempt to take shortcuts to broaden my perspective. All of the articles comparing the two leading web environments are outdated, obviously biased, or do comparisons that aren’t what I’m interested in.

A true apples-to-apples comparison pits strength against strength on the same non-trivial web application requirement. There are so many different (and important) scenarios that I have to make compromises. Here goes…

1) For each run of the experiment, use the same two 64-bit servers, one configured as a file server, one as a web server, both on the same subnet. (Expanding the experiment to more and more cores on the web server might also have interesting results.) I am deliberately not including a database server component.

2) Use Windows Server 2008 / IIS 7.0 on one side and the latest and greatest Linux / Apache on the other, each configured as the respective experts deem best.

3) Put a million relatively large (1 MB) XML files on the file server in a folder structure about four levels deep.
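To make the setup step concrete, here's a hedged Java sketch of a corpus generator. The class name, the folder-numbering scheme, and the padding content are all my own invention, not part of the spec; the demo run writes a handful of small files where the real experiment would use a million 1 MB ones.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Spreads generated XML files across a four-level folder tree. */
public class CorpusGenerator {

    /** Creates count XML files of roughly sizeBytes each under root. */
    public static void generate(Path root, int count, int sizeBytes) throws IOException {
        // A repeated <item> element pads each file to approximately the requested size.
        String item = "<item>padding padding padding</item>\n";
        int repeats = Math.max(1, sizeBytes / item.length());
        StringBuilder body = new StringBuilder("<?xml version=\"1.0\"?>\n<root>\n");
        for (int i = 0; i < repeats; i++) body.append(item);
        body.append("</root>\n");
        String content = body.toString();

        for (int i = 0; i < count; i++) {
            // Derive a four-level path from the file index, e.g. 03/02/00/00/file-17.xml
            Path dir = root.resolve(String.format("%02d/%02d/%02d/%02d",
                    i % 7, (i / 7) % 7, (i / 49) % 7, (i / 343) % 7));
            Files.createDirectories(dir);
            Files.writeString(dir.resolve("file-" + i + ".xml"), content);
        }
    }

    public static void main(String[] args) throws IOException {
        // Tiny demo run; the real experiment would use count = 1_000_000, sizeBytes = 1_000_000.
        Path root = Files.createTempDirectory("corpus");
        generate(root, 20, 4_096);
        try (var files = Files.walk(root)) {
            System.out.println("Files written: " + files.filter(Files::isRegularFile).count());
        }
    }
}
```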

4) Build the Windows app in C#, on the latest release of .NET (3.5). Try two different configurations for Linux / Apache, one written in Java, the other in PHP.

5) Here’s the spec for the app:

a) The request causes the app to read a random XML file from the file server. (Each configuration’s knowledge of the available files must be comparable: either the file list is cached at app startup, or each has the same means of discovering a readable file path.)

b) Iterate through the entire XML structure, assigning each node value to a string variable (for no good reason other than to burn CPU cycles).

c) Perform a uniformly defined non-trivial transform on the entire 1 MB XML file.

d) Respond with a smaller fragment of the transformed XML (I don’t want to test throughput of the response, but the rule for determining the fragment must be uniform).
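Since Java is one of the candidate stacks, steps (a)–(d) above might be sketched like this. This is only a sketch under loose assumptions: the class and method names are mine, a tiny embedded XSLT stands in for the "uniformly defined non-trivial transform," and a first-200-characters rule stands in for the uniform fragment rule.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

public class BenchmarkHandler {

    // (a) pick a random file from a list cached at app startup
    static Path pickRandom(List<Path> cachedPaths) {
        return cachedPaths.get(ThreadLocalRandom.current().nextInt(cachedPaths.size()));
    }

    // (b) walk the whole DOM, assigning each node's value to a variable purely to burn CPU
    static void burnCycles(Node node) {
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            String sink = child.getNodeValue(); // assigned and discarded on purpose
            burnCycles(child);
        }
    }

    // (c) placeholder for the non-trivial transform: an embedded XSLT that
    // wraps the first three <item> elements in a <summary>
    static final String XSLT =
            "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
          + "<xsl:output method=\"xml\" omit-xml-declaration=\"yes\"/>"
          + "<xsl:template match=\"/root\">"
          + "<summary><xsl:copy-of select=\"item[position() &lt;= 3]\"/></summary>"
          + "</xsl:template>"
          + "</xsl:stylesheet>";

    // (d) parse, burn cycles, transform, and respond with a fragment
    static String handle(Path xmlFile) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().parse(xmlFile.toFile());
        burnCycles(doc);
        Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
        StringWriter out = new StringWriter();
        t.transform(new DOMSource(doc), new StreamResult(out));
        String transformed = out.toString();
        // stand-in fragment rule: the first 200 characters of the transformed result
        return transformed.substring(0, Math.min(200, transformed.length()));
    }

    public static void main(String[] args) throws Exception {
        Path tmp = Files.createTempFile("sample", ".xml");
        Files.writeString(tmp, "<?xml version=\"1.0\"?><root>"
                + "<item>a</item><item>b</item><item>c</item><item>d</item></root>");
        System.out.println(handle(pickRandom(List.of(tmp))));
    }
}
```

The C# and PHP versions would have to mirror this shape call-for-call so that each configuration does comparable work.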

Measure the throughput, as for any good benchmark, and scale up the request frequency until each configuration “breaks”.
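For the "scale until it breaks" part, a toy Java load ramp might look like the following. Everything here is a stand-in of my own devising: a trivial local server as the target, sequential bursts that double in size, and an arbitrary latency threshold as the "broke" criterion. The real experiment would need a proper load generator driving concurrent requests.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class LoadRamp {

    /** Fires n sequential requests; returns average latency in ms, or -1 on any non-200. */
    static double measure(HttpClient client, URI uri, int n) throws Exception {
        long start = System.nanoTime();
        for (int i = 0; i < n; i++) {
            HttpResponse<String> r = client.send(
                    HttpRequest.newBuilder(uri).build(),
                    HttpResponse.BodyHandlers.ofString());
            if (r.statusCode() != 200) return -1;
        }
        return (System.nanoTime() - start) / 1_000_000.0 / n;
    }

    public static void main(String[] args) throws Exception {
        // Stand-in target: a trivial local server; the real run would aim at the web server.
        HttpServer server = HttpServer.create(new InetSocketAddress(0), 0);
        server.createContext("/", ex -> {
            byte[] body = "<fragment/>".getBytes();
            ex.sendResponseHeaders(200, body.length);
            try (OutputStream os = ex.getResponseBody()) { os.write(body); }
        });
        server.start();
        URI uri = URI.create("http://localhost:" + server.getAddress().getPort() + "/");
        HttpClient client = HttpClient.newHttpClient();

        // Double the burst size until latency degrades past an arbitrary "broke" threshold.
        for (int n = 8; n <= 64; n *= 2) {
            double avgMs = measure(client, uri, n);
            System.out.printf("burst=%d avg=%.2f ms%n", n, avgMs);
            if (avgMs < 0 || avgMs > 50) break;
        }
        server.stop(0);
    }
}
```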

I don’t expect anyone to rush out and perform this experiment. I certainly am not going to, but perhaps something close to this has been done, or someone with real experience may care to speculate on likely outcomes.

Wednesday, October 14, 2009

Some Observations on Real World Software Development Life Cycle, pt. 1

While TFS is a pricey source control solution, it worked well for our agile development team of coders and QA in the shop I manage. We have a big, mature code base, and our typical routine for moving fast is for coders to check source files in and out several times a day, doing new builds for QA.


So we decide to upgrade to VS 2008 for the next round of development projects, and since Corporate has mandated that every shop in its portfolio shall use Subversion for source control, we migrate our source control over as well.


We already knew TFS’s quirks, and knew it would take time to get acquainted with Subversion’s. Now, I can’t speak to the 2008 version of TFS, but we sorely miss good old TFS. Both are rock solid for reliability, but TFS is database based and Subversion is file system based. That doesn’t make much difference if your repository numbers its files in the hundreds, but our repository numbers its folders in the hundreds.


Now doing commits and updates, the equivalent of which in TFS would take seconds, burns minutes…each time. (A small commit will move faster, but updates are hopeless.)


So I talk to the guy who runs a sister shop which has been using Subversion all along. (They don’t do agile development and have more developers, spread out geographically.) Come to find out, they don’t even try to merge staging branches back into the development trunk. That means any ensuing code changes made on the staging branch have to be applied twice, once in each place. And of course they have a full-time guy who manages Subversion.


Subversion has been worth every penny we paid for it.