Speed: Gentoox vs Xebian
An unbiased investigation into compiler flags on the Xbox.

by Thomas Pedley (c) 2004

Contents

Disclaimer
Introduction
About the Test Setup
About the Test Software
Pre-test Analysis
The Test Script (bash)
The Results
Conclusion
Bibliography

Disclaimer

All information contained within this document is for educational purposes only.  Any trademarks used are property of their respective owners.  While I made my best attempt to make sure all information contained within is valid, I make no guarantees as to its level of accuracy.  Use of any information, code or files contained within is done at your own risk; I cannot be held responsible for any harm that may come to you or your system from using it.  If you do not agree to this, you are not permitted to read any further in this document.

 

Introduction

For the 2 years (near enough) that I've been maintaining Gentoox, people have been coming to me saying "Compiling stuff yourself makes no difference, Xebian is just as fast as Gentoox!".  Well, I _finally_ made some time to test this theory out.  This is going to be as unbiased as possible - I am NOT doing this to try and promote or badmouth either distribution, I am doing it purely out of personal (morbid? ;)) curiosity.

This isn't a definitive answer to the infamous "Which is better...?" question; the answer to that question depends on many other factors which are not going to be touched on in this report.  If you are one of the people that has asked (or is thinking of asking) this question... I have a simple question of my own for you: "Which is better: a thermometer or 42 stacked chairs?"... think that's stupid?  Well guess what we think of your question!  The point I'm trying to bring across is "it depends".  If you are playing musical chairs, the 42 chairs might come in handy... but a thermometer wouldn't.  With Linux distributions it's the same.  You have to know what you want Linux for and you have to try out the different distributions for yourself and see which suits your needs best.  No one can (or at least should) decide how you use your computer - it's all down to your experience and personality... so stop asking this question!

Anyway... back to the subject... this report is about speed - nothing else!

This originally started with me wanting to find the "holy grail", and by that I mean the perfect CFLAGs to use for the Gentoox central server.  I read dozens of articles about compiler optimisations, but they didn't give any real conclusive results other than to say that "it depends", with which I agree - it largely does.  The only way to reduce the number of factors that can affect the result was to perform a test myself.  All testing would have to be fair, so the same Xbox would have to be used for both distributions (in case of any subtle hardware differences - RAM speeds etc...).

 

About the Test Setup

Xbox v1.1,
Samsung DVD-ROM,
64MB of RAM,
Xecuter Pro,
Cromwell 2.40,
160GB Samsung Spinpoint HDD

The distributions I used were Gentoox Pro v1.4 and Xebian 1.1.0.  Both were installed to the E: partition inside a loopback "rootfs" of 4000MB in size.  This is the typical environment in which most users prefer to function (rather than fully native).  To make the tests fair with respect to memory usage, XFree86 was closed before execution of the tests on Xebian, since Gentoox Pro does not have X by default.

By default, Gentoox uses the ReiserFS file system, while Xebian uses the Ext3 file system.  Many people will know that Ext3 is slower than ReiserFS, so this immediately put Xebian at a disadvantage.

Needless to say, all tests were done without any compilation accelerators (by which I mean distcc, ccache and the like).

 

About the Test Software

Whilst surfing around for clues on the "best" CFLAGs, I came across an excellent website run by the department of software engineering at the University of Szeged.  They have dedicated a project to researching compiler optimisations and have actually released the tools which they use to perform their research.  The project, named GCC Code-Size Benchmark Environment (hereafter CSiBE), runs a string of compilations and measures the time taken for each step to be completed.  It also records the size of the machine code output by the compilation.  The results are saved into a Comma Separated Values file (.csv) which can be analysed in the likes of Excel or OpenOffice.org.
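If you would rather total a results file from the command line than open a spreadsheet, something along these lines may do.  This is only a rough sketch: the function name csv_total is made up for illustration, and it assumes (the CSiBE file layout is not shown in this report) that the number you care about is the last comma-separated field of each row.

```bash
#!/bin/bash
# csv_total FILE: add up the last comma-separated field of every row.
# ASSUMPTION: the value of interest is the last field; this is not taken
# from the CSiBE documentation. Rows whose last field is not a plain
# number (a header line, for instance) are skipped.
csv_total() {
    awk -F, '$NF ~ /^[0-9]+(\.[0-9]+)?$/ { sum += $NF }
             END { printf "%.2f\n", sum }' "$1"
}
```

You would call it as csv_total followed by the name of one of the .csv files, after checking that the file really has the assumed layout.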

Tests include compilation and execution of the following UNIX programmes and libraries: zlib, gzip, libjpeg, kernel-2.4.23-pre3 (partial compilation only), flex, bzip2, libpng, lwip, mpeg2dec, OpenTCP and more...

As you can judge, the tests are quite varied and include many day-to-day tasks that you are likely to perform on an average system.

There are 3 portions to this test.  The first is compilation time, that is to say "How long does X piece of code take to compile with a specific CFLAG?".  There is then the file size test which gives an indication of the size of the output from compilation (which is an important factor for more reasons than just hard drive space usage - see "Pre-test Analysis" later on).  The final test checks the runtime execution time.  For each of the tests, the lower the result the better. 

The timed portions of the test output the time in the following format:
     0.01user 0.00system 0:00.16elapsed
To make things simpler, this investigation will only deal with the third "elapsed" field, as this is the total wall-clock time taken by the process from start to finish.
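For anyone wanting to pull that field out themselves, here is a minimal sketch.  The function name elapsed_seconds is my own invention, and it assumes the line looks like the sample above, with the elapsed time written as [hours:]minutes:seconds.

```bash
#!/bin/bash
# elapsed_seconds LINE: print the "elapsed" field of a timing line in seconds.
# ASSUMPTION: the line has the layout shown in the sample above, with the
# elapsed time given as [hours:]minutes:seconds.
elapsed_seconds() {
    local field
    # Pick the whitespace-separated word that ends in "elapsed".
    field=$(printf '%s\n' "$1" | tr ' ' '\n' | grep 'elapsed$')
    field=${field%elapsed}
    # Fold the colon-separated parts into seconds, left to right.
    printf '%s\n' "$field" |
        awk -F: '{ s = 0; for (i = 1; i <= NF; i++) s = s * 60 + $i
                   printf "%.2f\n", s }'
}

elapsed_seconds "0.01user 0.00system 0:00.16elapsed"
```

Converting everything to plain seconds first makes the per-test times easy to add up or subtract.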

The tests will be compiled with the most basic of compilation flags, -Os, -O1, -O2 and finally -O3.  -Os is used to produce the smallest code possible, so it is presumed that this will result in the smallest outputted binaries.  -O1 to -O3 enable various optimisations for speed (-O3 being the most aggressive, but having the highest outputted file size cost).  For more details on GCC optimisations, see the GCC Manual.  GCC v3.3.4 and CSiBE v2.0.1 were used for all tests.

 

Pre-test Analysis

For a long time, Gentoo Technologies Inc. believed that -O3 was the fastest compilation flag due to it producing the most "optimised code".  This was, apparently, proved to be wrong since -O3 produces significantly larger code.  Processors only have so much cache in which chunks of programmes are held for execution (specifically, the Xbox has 128KB of level 2 processor cache - which was absolutely pathetic by the standards of the day when it was released, let alone today's standards), meaning that large programmes could take longer to execute, as code that does not fit in the cache may have to be fetched from the much slower main memory more often.  Gentoo then switched to -O2, which it claims is the fastest for x86 systems.

Personally, I would expect the execution time to favour Gentoox, but not by a great deal.  I would also expect the compilation times to favour Gentoox due to the fact that it is running on a faster file system than Xebian.  File sizes are anyone's guess; it all depends on the libc being used, how the compiler was optimised etc...

 

The Test Script (bash)

If you want to execute this yourself, unpack CSiBE, cd ./CSiBE/bin, then place this script inside there and execute it.

#!/bin/bash
###
# CSiBE execution script (c) Thomas "ShALLaX" Pedley 2004.
# Distribute freely - public domain.
###

# Optimisation levels to test: each becomes the flag -O<level> and the
# configuration directory o<level> (-Os/os, -O1/o1, -O2/o2, -O3/o3).
LEVELS="s 1 2 3"

# Pass 1: runtime of the compiled test programmes.
for level in $LEVELS; do
    ./create-config --flags "-O$level" "o$level"
done

for level in $LEVELS; do
    # Run in a subshell so a failed cd cannot leave us in the wrong directory.
    (
        cd "./o$level" || exit 1
        make result-runtime.csv
        cp ./result-runtime.csv "../result-o$level-runtime.csv"
    )
done

# Start the second pass from clean configurations.
for level in $LEVELS; do
    rm -rf "./o$level"
done

# Pass 2: compilation time and output size.
for level in $LEVELS; do
    ./create-config --flags "-O$level" "o$level"
done

for level in $LEVELS; do
    (
        cd "./o$level" || exit 1
        make result
        cp ./result-time.csv "../result-o$level-ctime.csv"
        cp ./result-size.csv "../result-o$level-size.csv"
    )
done

for level in $LEVELS; do
    rm -rf "./o$level"
done

###
# EOF
###

 

The Results

 

The following are tables of the results.  Each one has a brief commentary.  For the "Diff" columns, negative results favour Gentoox, positive favour Xebian and 0 indicates no difference between the test results.
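Since lower is better in every test, the sign convention above works out as the Gentoox value minus the Xebian value.  A minimal sketch of that arithmetic follows; the function names and the numbers in the usage line are purely illustrative and are NOT measured results.

```bash
#!/bin/bash
# diff_result GENTOOX XEBIAN: print the Gentoox value minus the Xebian value.
# Lower is better in every test, so a negative Diff favours Gentoox.
diff_result() {
    awk -v g="$1" -v x="$2" 'BEGIN { printf "%.2f\n", g - x }'
}

# favoured DIFF: name the distribution that a Diff value favours.
favoured() {
    awk -v d="$1" 'BEGIN {
        if (d < 0)      print "Gentoox"
        else if (d > 0) print "Xebian"
        else            print "no difference"
    }'
}

# Made-up example values, not taken from the tables.
favoured "$(diff_result 10.25 12.50)"
```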

Test 1: Compilation Time (in seconds)

Xebian definitely has Gentoox beat here.  This is surprising considering that Gentoox is using a notably faster file system.  However, it's my opinion that compilation times are irrelevant - what the user should be looking for is the end result.  Good things come to those who wait!

 

Test 2: File size (in bytes)

As stated earlier, the results of this test couldn't really be estimated.  There are many factors that could have affected this.  It turns out that Gentoox produces the smaller binaries, which helps on the Xbox considering the pathetic excuse for CPU cache.

 

Test 3: Runtime (in seconds)

As is evident from this set of results, Gentoox does have a lead on Xebian when it comes to execution time of produced code, despite the code being produced by the same compiler.  If analysed closely, Gentoox does have a significant lead in nearly all tasks; however, there are just a few in which Xebian excels, and these heavily weight the TOTALS field for the Diff columns.  For example: Xebian coped with minigzip0 much better than Gentoox for -O3, but did very poorly with pnm2png1, so it balances out the overall result.  What is quite amazing is how poorly Xebian did with -O2, losing a massive 4.14 seconds of execution time to Gentoox.

Overall, it is clear that (on the Xbox, at least) -O3 is the fastest for execution times, with -O2 following a close second for Gentoox, but -O1 being the next closest for Xebian.  What is also conclusive here is that you cannot just say that a certain CFLAG is better for everything.  The problem is dynamic: each programme has its own best CFLAG to use.  enhex1 did best with -Os on Gentoox, but dehex1 did best with -O3.  This goes back to the initial question of "Which is better?"; you can see how "it depends" applies here.  What are you going to be using your system for?  Decide on that first, then pick your CFLAGs.

In my opinion, this is the test that really counts.  The compilation time is a one-off factor and the file size isn't really an issue unless your hard drive is extremely small.  When it comes to it, you want the final output to run as fast as possible, as you will be running it many times.
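To find the best CFLAG for one particular programme, you simply pick the flag with the lowest runtime.  Here is a rough sketch; the function name best_flag is hypothetical and the timings in the usage line are invented for illustration, not taken from the results.

```bash
#!/bin/bash
# best_flag: read "FLAG SECONDS" pairs on standard input (one per line)
# and print the flag with the lowest time. On a tie the first one wins.
best_flag() {
    awk 'NR == 1 || $2 + 0 < best { best = $2 + 0; flag = $1 }
         END { print flag }'
}

# Invented timings for a single programme, one line per flag.
printf '%s\n' "-Os 1.20" "-O1 1.35" "-O2 1.10" "-O3 1.15" | best_flag
```

Running this per programme rather than on the totals is what shows up cases like enhex1 and dehex1 preferring different flags.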

 

Conclusion

Some unexpected results to say the least.  Xebian compiles far faster than Gentoox despite having a slower file system!  I suspect this may be down to the way the compiler was compiled itself as the Gentoox compiler seems to optimise code further.  Is compilation time really what matters?  In Xebian, would you be compiling much anyway?  Xebian seems to produce bigger binaries, which may have an effect on execution time.

Ironically, the whole test is flawed.  What I should actually be comparing is the time taken to execute programmes compiled on Gentoox vs the time taken to execute programmes installed via apt-get on Xebian.  The test as run is really just comparing Glibc and GCC versions against each other, but inside different distribution environments.  The comparison I should have done would, I believe, increase the gap between performances.

It is safe to say that, yes, Gentoox DOES have faster and smaller binaries than Xebian - but so very marginally in most cases that it's not worth even thinking about.  You would probably save about 50MB (and that's being generous) on identical, fully set-up systems.

What you have to weigh up is whether you believe -O3 is worthwhile using in Gentoox over -O2.  The code size produced is significantly bigger, and compilation times are significantly longer, just for an absolutely tiny (under 1 second) increase in execution performance.  It certainly seems that it is worth using in Xebian over -O2.

If you are an absolute ricer, sure... go around boasting that Gentoox is faster.  If you're like the rest of us, you'll realise that for the most part the test results are close enough to be inconclusive.  Just carry on using whichever distribution suits you best.

The results I obtained took the best part of a day to get.  For people interested in compiler optimisations, it may be worth trying this out for yourselves with more aggressive flags than just the -O's.

 

Bibliography

GNU - Producers of GCC.
Gentoo Technologies Inc. - Creators of Gentoo Linux.
Xbox-Linux - Creators of Xebian and all Xbox Linux related software.
CSiBE - The benchmark software.