Compiler flags (+ LLVM GCC)

Posted on December 19, 2010

We’re in crazy pre-Holiday mode at Four Door Lemon right now, and we’ve got a couple of exciting things to mention before getting onto the post!

  • QuizQuizQuiz, our popular trivia app, is currently FREE for the rest of the weekend – please download it now and tell your friends :)
  • Also, Cricket Captain, which is the source of the research for this article, was approved tonight and is now available – if you like cricket this is the management game to have, with 3D highlights and a great simulation engine.

Compiler flags

I’ve worked on pretty much every game platform compiler setup in my time in the industry. For some platforms that actually meant working with 3-4 different variations while the teams found the best code generation for their device.

There are always discussions between developers about what *the* best set of compiler flags is for optimum speed of your game (which, other than executable size on the more RAM-limited devices, is the main goal of this kind of tuning). Normally the gains from tweaking these are very small, but I remember features like LTCG (link-time code generation) on Xbox 1 giving an incredibly big speed-up.

We recently submitted Cricket Captain to the App Store and, other than -O3 and -Os, I didn’t actually do any testing of the various compilation flags. The game isn’t quite as optimised as I’d like it to be yet, although it plays fine and the 3D highlights run at around 30fps. We will be improving this through actual coding work (and talking about the techniques used for this).

It seemed ideal to combine my investigation into the Xcode GCC compiler’s optimisation flags with writing a blog post, so I used Cricket Captain as the test sample.

So, a few notes on the tests:

  • The random number generator has been forced to generate the same sequence each run
  • The game mode / teams are identical so the same logic will occur (I also verified the results came out the same)
  • The phone was in the same state for all tests
  • There will most likely be some variance due to background processes
  • This isn’t intended to be super accurate but to see if we’d get a noticeable difference from changing these flags
  • Our render and matrix libraries are compiled using -O3 into static libs and are not part of this test. I wanted to test the flags on game code, on the basis that people may not have access to middleware source code.
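
As a rough illustration of the setup in these notes, here is a minimal C++ sketch of a repeatable timing run: a fixed seed so every run sees the same random sequence, a checksum to confirm the results came out the same, and a wall-clock measurement in milliseconds. It is not the game’s actual test code. `simulateBowl`, `timeBowls` and `RunResult` are hypothetical names, and the loop inside `simulateBowl` is only a stand-in for the real bowl calculation.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <random>

// Outcome of one timed run: a checksum of the simulated results plus wall time.
struct RunResult {
    std::uint64_t checksum;
    double elapsedMs;
};

// Hypothetical stand-in for one bowl calculation (the real logic is not shown here).
std::uint32_t simulateBowl(std::mt19937& rng) {
    std::uint32_t outcome = 0;
    for (int step = 0; step < 100000; ++step) {
        outcome = outcome * 31u + static_cast<std::uint32_t>(rng() & 0xFFu);
    }
    return outcome;
}

// Runs `bowls` simulated bowls from a fixed seed and times the whole batch.
RunResult timeBowls(std::uint32_t seed, int bowls) {
    assert(bowls > 0);
    std::mt19937 rng(seed);  // same seed gives the same sequence on every run
    const auto start = std::chrono::steady_clock::now();
    std::uint64_t checksum = 0;
    for (int i = 0; i < bowls; ++i) {
        checksum = checksum * 1000003ull + simulateBowl(rng);
    }
    const auto end = std::chrono::steady_clock::now();
    const double ms =
        std::chrono::duration<double, std::milli>(end - start).count();
    return RunResult{checksum, ms};
}
```

Calling `timeBowls(1234u, 5)` under two different flag sets should give identical `checksum` values, so any difference in `elapsedMs` comes from code generation (plus background noise) rather than from the game taking a different path.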

Let’s have a look at the results:

| Compiler / Flags | 5x bowls time (ms) | 3x 3D highlights average FPS |
|---|---|---|
| GCC -Os | 1704 | 24.68666667 |
| GCC -O1 | 1750 | 24.52 |
| GCC -O1, Auto Vectorisation enabled | 1677 | 24.96 |
| GCC -O2 | 1692 | 24.81 |
| GCC -O3 | 1718 | 24.85 |
| GCC -O3, Auto Vectorisation enabled | 1710 | 24.67 |
| GCC -O3, FastFP | 1713 | 24.74 |
| GCC -O3, Compile for Thumb | 1706 | 24.80 |
| GCC (Max) -O3, Unroll Loops, Auto Vectorisation enabled, FastFP | 1727 | 24.78 |

The table above shows the flags that we tried and the total time in milliseconds for the first 5 bowl calculations. This system actually runs through the movement and animation of all the fielders / batsman to calculate the actual output. Note that the first bowl in a game is pretty slow, so it makes up a fair part of this time and possibly skews the results a little. The final column shows the average FPS across three 3D highlights that last 5.5 – 8 seconds each.

For both tests we see that -O1 with auto vectorisation comes out on top. The values are very close, however (within about 4% on bowl time and under 2% on FPS), and the -O3 we shipped with was fairly good. The biggest surprise for me was that Compile for Thumb didn’t produce shocking results, and I’m not really sure if that can be correct. As mentioned above, our middleware was still compiled as ARM, so we’re not paying a Thumb penalty on some of the heavy lifting code.

In summary, though, there isn’t a huge difference between the flags and, as I originally thought, it’s probably not worth spending a huge amount of time tuning them.
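
To put a number on how close the GCC results are, the gap between the best and worst entry in each column can be worked out directly from the table. The sketch below is only an illustration: `spreadPercent` and the two constant names are made up for this example, and the values are copied from the GCC table in table order.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Gap between the smallest and largest value, as a percentage of the largest.
double spreadPercent(const std::vector<double>& values) {
    assert(!values.empty());
    const auto range = std::minmax_element(values.begin(), values.end());
    const double smallest = *range.first;
    const double largest = *range.second;
    return (largest - smallest) / largest * 100.0;
}

// 5x bowls time (ms) for each GCC configuration, in table order.
const std::vector<double> kGccBowlTimesMs = {
    1704.0, 1750.0, 1677.0, 1692.0, 1718.0, 1710.0, 1713.0, 1706.0, 1727.0};

// 3x 3D highlights average FPS for each GCC configuration, in table order.
const std::vector<double> kGccHighlightFps = {
    24.68666667, 24.52, 24.96, 24.81, 24.85, 24.67, 24.74, 24.80, 24.78};
```

`spreadPercent(kGccBowlTimesMs)` comes out at roughly 4.2% (1677 ms against 1750 ms) and `spreadPercent(kGccHighlightFps)` at roughly 1.8% (24.52 against 24.96), which is small enough that run-to-run variance could account for much of it.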

LLVM GCC

I have, however, been playing with the LLVM-GCC compiler in Xcode 4 and, despite a few issues getting 4.2.1 debugging to work correctly with it, got the same tests done (running on the same device). LLVM-GCC uses the GCC frontend to parse source and the LLVM backend to optimise and generate the actual executable code. From what I read, runtime performance increases of 33% were expected.

I did a couple of tests with various flags:

| Compiler / Flags | 5x bowls time (ms) | 3x 3D highlights average FPS |
|---|---|---|
| LLVM -O3, AutoVec, Unroll Loops, LinkTimeOpt | 874 | 27.43 |
| LLVM -Os, AutoVec, Unroll Loops, LinkTimeOpt | 906 | 27.43667 |
| LLVM -Os, AutoVec, Unroll Loops | 907 | 26.66667 |

Wow.. So that’s roughly twice as fast on the processing and a nice FPS / frametime boost. I imagine compiling our middleware with LLVM would result in even better speeds, as that’s where most of the highlight time will be going (which is partially hinted at by the fact that disabling link-time optimisation only changed the highlight speed).
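
For anyone who wants to check the arithmetic behind “roughly twice as fast”, here is a small sketch comparing the GCC -O3 row (1718 ms, 24.85 FPS) with the LLVM -O3 row with link-time optimisation (874 ms, 27.43 FPS). The function names are invented for this example.

```cpp
#include <cassert>

// How many times faster the candidate is than the baseline (times in ms).
double speedup(double baselineMs, double candidateMs) {
    assert(candidateMs > 0.0);
    return baselineMs / candidateMs;
}

// Milliseconds spent per frame at a given frame rate.
double frameTimeMs(double fps) {
    assert(fps > 0.0);
    return 1000.0 / fps;
}

// Percentage increase of candidate over baseline.
double percentGain(double baseline, double candidate) {
    assert(baseline > 0.0);
    return (candidate - baseline) / baseline * 100.0;
}
```

`speedup(1718.0, 874.0)` is about 1.97, so the bowl calculations really are close to twice as fast. `percentGain(24.85, 27.43)` is about 10.4%, and `frameTimeMs(24.85) - frameTimeMs(27.43)` is about 3.8 ms saved per frame (roughly 40.2 ms down to 36.5 ms).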

After the GCC flag results I was worried about how interesting this post would be. The LLVM-GCC results are really exciting though, and I think Xcode 4 is something everyone will be looking forward to!

2 Responses

  1. […] This post was mentioned on Twitter by Simon Barratt. Simon Barratt said: #iDevBlogADay post – http://bit.ly/gE1ncj – Compiler flags (+ LLVM-GCC) – Thanks! […]


  2. RP
    December 23, 2010

    That’s cool data on LLVM. I’m sort of a newb at compiler flags and compilers in general. You should consider doing an intro post to the different compiler flags and what they mean and how they work. For example, you mention the -Os compiler flag, but I have no idea really what that means. And I think I know where to type it in in XCode, but I’m not confident enough to ship with messing around with compiler flags.

