Arduino IDE 1.6.x Compiler Optimisations = Faster Code

Introduction: Arduino IDE 1.6.x Compiler Optimisations = Faster Code

After downloading the latest Arduino IDE (1.6.1) I was rather disappointed that some of my sketches ran significantly slower than the same sketch compiled under IDE 1.0.6. This was particularly noticeable on one of my sketches that drove a TFT display.

The good news, however, was that the 1.6.1 IDE produced a sketch that was 20% smaller. This was great as I was beginning to run out of FLASH space on my UNO for different fonts.

To solve the mysteries of the compiled code sizes and speed differences I decided to investigate further. Ideally I wanted the speed back that the older IDE gave me, whilst still being able to save program FLASH space!

Step 1: Compilers

A search on the internet will tell you much more than I can about compilers, so here is just a brief summary.

The Arduino sketches are written in a high level language, namely C++ or C, but the micro-controller executes machine code instructions. The job of the compiler is therefore to convert (that is, translate) the human-readable code into a sequence of instructions that the micro-controller can execute. Essentially the compiler converts one language to another.

When the compiler is called upon to create the executable code there are a number of options that can be invoked. In the case of the GCC compiler used by the Arduino IDE there are 5 speed/size optimisation options (-O0, -O1, -O2, -O3 and -Os), which are detailed in the GCC documentation on optimisation options. The different options cause the compiler to make more effort to optimise the executable code for size or speed.

The default optimisation used in the Arduino IDE is for size; this is option "-Os" in the command line. The reason the code sizes and speeds generated by the two Arduino IDEs are so different is that a much newer version of the GCC compiler is used in the latest IDE. Clearly this new version creates a significantly smaller executable, but the penalty, it appears, is a significantly slower execution speed.

A few mild words of caution

It is worth noting that, in rare cases, changing the optimisation level can affect the way a program behaves when running. This is because optimisation tries to "rewrite" the software to make a "time" and "size" efficient executable. The probability of the software function being affected depends on how "aggressively" the compiler modifies the code.

The default optimisation level for the Arduino IDE (-Os) is already pretty aggressive so it is very unlikely that you will see new behaviour problems introduced if the optimisation level is changed.

Because the compiler follows a set of rules we sometimes need to include "compiler directives" within the software itself to avoid problems during the optimisation process. A classic problem is the failure to declare variables as "volatile" when they are used by both a main program and an interrupt routine. If you don't know what "volatile" means then Google "why is volatile needed".
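To illustrate the "volatile" point, here is a minimal sketch in ordinary desktop C++ rather than Arduino code. It assumes a C signal handler can stand in for an interrupt routine, and the names dataReady, onSignal and waitForData are made up for this example.

```cpp
#include <cassert>
#include <csignal>

// Flag shared between the main program and an asynchronous handler.
// "volatile" tells the compiler the value can change outside the normal
// program flow, so every read in the loop below must really re-read memory.
volatile std::sig_atomic_t dataReady = 0;

// Stand-in for an interrupt routine (assumption: on an Arduino this role
// would be played by an interrupt handler, not a signal handler).
extern "C" void onSignal(int) {
    dataReady = 1;
}

// Hypothetical helper: installs the handler, triggers it, then waits for
// the flag. Without "volatile" an optimising compiler is allowed to read
// dataReady once, conclude it never changes inside the loop, and spin forever.
int waitForData() {
    auto previous = std::signal(SIGINT, onSignal);
    assert(previous != SIG_ERR);   // the handler must have been installed
    (void)previous;

    dataReady = 0;
    std::raise(SIGINT);            // simulate the interrupt firing

    while (dataReady == 0) {
        // busy-wait until the handler has set the flag
    }
    return dataReady;
}
```

Calling waitForData() returns 1 once the handler has run. The important part is the declaration of dataReady, not the signal plumbing around it.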

Step 2: Size and Speed Differences

The sketch I used for testing was a graphics speed test.

Here are the results of tests for a sketch compiled for an Arduino Mega. Similar results would be expected for an UNO:

IDE 1.0.6 :

  • Compiled size: 26,620 bytes
  • Execution time: 13.3 seconds

IDE 1.6.1:

  • Compiled size: 19,558 bytes
  • Execution time: 17.8 seconds

These results showed why I noticed such a dramatic speed difference...

We can also see the 1.6.1 IDE produces a FLASH image 7,062 bytes smaller. That is significant when you consider it could make the difference between getting a sketch running on an UNO or needing an upgrade to a Mega.
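The figures above can be checked with a few lines of arithmetic. This is only a sketch in plain C++ using the measurements quoted in this step; the constant and function names are my own. It also shows that a change in run time and a change in speed are not the same percentage.

```cpp
// Measurements quoted above (Arduino Mega, graphics speed test).
constexpr long   kSizeOld = 26620;  // bytes, IDE 1.0.6
constexpr long   kSizeNew = 19558;  // bytes, IDE 1.6.1
constexpr double kTimeOld = 13.3;   // seconds, IDE 1.0.6
constexpr double kTimeNew = 17.8;   // seconds, IDE 1.6.1

// Percentage change going from "before" to "after" (negative = decrease).
constexpr double percentChange(double before, double after) {
    return (after - before) / before * 100.0;
}

// FLASH saved by the newer IDE: 7062 bytes, about 26.5 % smaller.
constexpr long   kBytesSaved = kSizeOld - kSizeNew;
constexpr double kSizeChange = percentChange(kSizeOld, kSizeNew);

// Run time went up by about 33.8 %.
constexpr double kTimeChange = percentChange(kTimeOld, kTimeNew);

// Speed is the reciprocal of run time, so it fell by about 25.3 %.
constexpr double kSpeedChange = percentChange(1.0 / kTimeOld, 1.0 / kTimeNew);
```

So the sketch takes roughly a third longer to run, which is the same thing as running at about three quarters of the old speed.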

Unfortunately the execution time has increased by 34% (in other words the sketch runs at about three quarters of its former speed), which is not helpful. The question I wanted to answer was:

Can we have the best of both worlds, a fast execution time and a smaller sketch?

Step 3: Results of Changing the Compiler Optimisation

Bear in mind that I have just tested one sketch and different options may be better in some circumstances.

These are the results I obtained when using the IDE 1.6.1 and changing the compiler optimisation directive:

-Os (Arduino IDE default)

  • Compiled size: 19,558 bytes
  • Execution time: 17.8 seconds

-O0 (no optimisation at all!)

  • Compiled size: 31,382 bytes
  • Execution time: 44.7 seconds

-O1

  • Compiled size: 20,428 bytes
  • Execution time: 17.0 seconds

-O2

  • Compiled size: 20,500 bytes
  • Execution time: 12.7 seconds

-O3

  • Compiled size: 25,550 bytes
  • Execution time: 12.2 seconds

As I am using an Arduino Mega I am not particularly concerned about the FLASH size, so option -O3 gives a better speed (shorter run time) and the sketch is smaller than the IDE 1.0.6 gave me. However I have decided to set the 1.6.1 IDE to optimisation -O2 as that looks like a good compromise between better speed and smaller FLASH code.

The size and speed improvements obtained for your own sketches may well be better or worse than mine, and a different compiler option may turn out to be the best choice.
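One way to weigh up the options is to express each result relative to the -Os default. The following is a small sketch in plain C++ using only the numbers listed in this step; the struct and function names are invented for illustration.

```cpp
// One row per optimisation option, taken from the results above.
struct Result {
    const char* flag;     // compiler optimisation option
    long        bytes;    // compiled size in bytes
    double      seconds;  // execution time in seconds
};

constexpr Result kResults[] = {
    {"-Os", 19558, 17.8},  // Arduino IDE default, used as the baseline
    {"-O0", 31382, 44.7},
    {"-O1", 20428, 17.0},
    {"-O2", 20500, 12.7},
    {"-O3", 25550, 12.2},
};

constexpr Result kBaseline = kResults[0];

// Extra FLASH used compared with -Os.
constexpr long extraBytes(const Result& r) {
    return r.bytes - kBaseline.bytes;
}

// How many times faster than -Os (values below 1.0 mean slower).
constexpr double speedup(const Result& r) {
    return kBaseline.seconds / r.seconds;
}
```

For -O2, extraBytes gives 942 bytes and speedup gives about 1.40. For -O3 the cost is 5,992 bytes for a speedup of about 1.46, so most of the speed gain arrives with -O2 at a fraction of the size cost.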

Step 4: How to Change the Optimisation Level...

The compiler command lines are contained within a text file buried within the Arduino application folders. It is necessary to burrow down through a few directory levels to find a text file called "platform.txt".

In a Windows environment you need to open the folder where arduino.exe is and find the file at this path:

arduino-1.6.1\hardware\arduino\avr\platform.txt

See step 6 of this Instructable if you are using the latest 1.6.x IDE, as the file path to platform.txt has changed!

If you are nervous about messing something up then make a copy of the file somewhere!

Open the platform.txt file in WordPad (Notepad will not display it properly, most likely because the file uses Unix-style line endings). Turn off "Word wrap" so lines can be counted more easily.

Find this line, about 16 lines down from the top:

compiler.c.flags=-c -g -Os -w -ffunction-sections -fdata-sections -MMD

Change the -Os to -O2 as below:

compiler.c.flags=-c -g -O2 -w -ffunction-sections -fdata-sections -MMD

Next find a second line a little further down the file, about 23 lines from the top:

compiler.cpp.flags=-c -g -Os -w -fno-exceptions -ffunction-sections -fdata-sections -fno-threadsafe-statics -MMD

Again, change the -Os to -O2 as below:

compiler.cpp.flags=-c -g -O2 -w -fno-exceptions -ffunction-sections -fdata-sections -fno-threadsafe-statics -MMD

In practice it is just a case of changing the "s" to a "2". Note that it is a letter "O", not a zero, in the command line.

Now save the file and don't worry about any format warning. Next time it will open in Notepad OK!

Changing the compiler options will have no effect if you do it while you have the Arduino IDE open. You must close all the Arduino windows and open up the IDE again for the change to be recognised.
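The edit itself is just a text substitution on two lines. As a rough illustration only, here it is expressed as a plain C++ function that works on one line of text at a time. The helper names isFlagsLine and setOptimisation are hypothetical, and the code does not read or write platform.txt; it only shows which lines are affected and what changes.

```cpp
#include <cassert>
#include <string>

// True if the line sets the C or C++ compiler flags, i.e. it starts with
// one of the two keys edited in this step.
bool isFlagsLine(const std::string& line) {
    return line.rfind("compiler.c.flags=", 0) == 0 ||
           line.rfind("compiler.cpp.flags=", 0) == 0;
}

// Returns the line with " -Os " replaced by the requested level, for
// example "-O2". Any other line is returned unchanged.
std::string setOptimisation(std::string line, const std::string& level) {
    assert(!level.empty());  // an empty level would delete the option

    if (!isFlagsLine(line)) {
        return line;
    }
    const std::string oldFlag = " -Os ";
    const std::size_t pos = line.find(oldFlag);
    if (pos == std::string::npos) {
        return line;         // already edited, or a different layout
    }
    line.replace(pos, oldFlag.size(), " " + level + " ");
    return line;
}
```

Passing the original compiler.c.flags line and "-O2" returns the edited line shown above, with everything else on the line left as it was.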

I found my sketch ran a tiny weeny bit (a few microseconds!) faster with the first line changed to -O1 but the difference was far too small to notice when the sketch is running.

Step 5: Result!

Mission accomplished! Smaller code TICK, faster speed TICK, so it was a win-win for me!

Have fun!

Step 6: Arduino IDE 1.6.2 to 1.6.3 - Platform.txt Location

The latest 1.6.x IDE is now available on the Arduino website. The same methods can be used to change the compiler optimisation but the "platform.txt" file is in a different location.

If you open the Arduino IDE "File" menu and select "Preferences", then at the bottom of the window you will see a file path that helps to find it. On my Windows setup this is:

C:\Users\XXXX\AppData\Roaming\Arduino15\

where XXXX is your user name.

There is already a platform.txt at that directory level but I see no way to change the compiler options in that one. You need to burrow down through a few more directory levels to find this platform.txt file:

C:\Users\XXXX\AppData\Roaming\Arduino15\packages\arduino\hardware\avr\1.6.x\platform.txt

Now open the file, edit the compiler option, and save as described in Step 4. On my copy of the txt file the lines to modify are 17 and 24.

Hopefully in future IDE versions the file will be in a similar location.

Annoyingly the new 1.6.x IDE version over-writes some files used by other older IDE versions that may be resident! So if you load a new IDE you will need to change the compiler options yet again.

Step 7: IDE 1.6.6 and 1.6.7

To quote McCoy from Star Trek "I know engineers, they love to change things". So the platform.txt file is back in the IDE folder for the latest versions (a good move)! For example:

C:\Users\xxx\xxxx\Arduino\arduino-1.6.6\hardware\arduino\avr

I am not sure if it is because my software programming skills have improved or for some other unfathomable reason, but the speed gains from using the -O2 optimisation level instead of -Os seem somewhat lower these days, at a few percent. So it may be that the gains are very dependent on what the software is doing and may not be worth the effort. I suspect that I got lucky on some of my sketches and some tight loops were made significantly faster by -O2.

26 Comments

Very impressive. That works, I finally found it out :-) Thank you!!!

Great, thanks for your feedback. I hope your project is a success.

Instead of having to modify the platform.txt file, an alternate way to get *almost* the same effect is to simply place this at the top of your code: #pragma GCC optimize ("-O2"). Replace -O2 with the optimization level you want. If you are using libraries, place this at the top of the .h file for the library, or else it won't apply this optimization level to the library, even if you have it at the top of your main Arduino sketch.

Sources:

https://gcc.gnu.org/onlinedocs/gcc-4.8.4/gcc/Function-Specific-Option-Pragmas.html

https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html

http://stackoverflow.com/questions/30038172/adding-unused-elements-to-c-c-structure-speeds-up-and-slows-down-code-executio
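A minimal sketch of the #pragma approach, written as plain C++ for GCC rather than as a complete Arduino sketch. The function sumOfSquares is just a made-up example of a tight loop. The push_options and pop_options pragmas are described on the same GCC documentation page as optimize, and they limit the change to the functions in between. The level is written here as "O2", which is the form given in that documentation.

```cpp
#include <cassert>
#include <cstdint>

#pragma GCC push_options      // remember the current optimisation settings
#pragma GCC optimize ("O2")   // applies to functions defined after this line

// A tight loop of the kind that tends to benefit from speed optimisation.
std::uint32_t sumOfSquares(std::uint32_t n) {
    assert(n <= 2000);        // keeps the total within 32 bits
    std::uint32_t total = 0;
    for (std::uint32_t i = 1; i <= n; ++i) {
        total += i * i;
    }
    return total;
}

#pragma GCC pop_options       // back to the level set on the command line
```

Code defined after pop_options is compiled with whatever level the IDE passes on the command line, so this gives a way to try a different level on a few functions without touching platform.txt.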

Thank you! I did not know that was possible. I will update the Instructable with a new step to include this useful information. This is going to be a very convenient method to test different optimisation levels on some sketches and avoids the tedious editing of the platform.txt file. Cheers.

Bodmer, until 12 hrs of research over the last few days trying to solve my own problem, on the stackoverflow link I posted, I didn't know either! Thank you for this instructable or else I never would have figured it out. Until 3 days ago I didn't even know what a pragma was, nor what compiler optimization levels were, and now I'm already sharing that info. I've learned lots this week. This instructable was key, so thanks again. In case you wanted to cite me for how you figured out your additional info for your instructable, I always appreciate link-backs using my name (Gabriel Staples) and website (http://www.electricrcaircraftguy.com/). Also, this #pragma directive is not quite the same thing as editing the platform.txt file, but that's why I searched so hard for it. I really wanted something faster & more convenient, and that could be easily changed (and kept on a fixed setting), for individual codes. Do some comparisons of #pragma vs changing the platform.txt file and you'll see the size results (and probably speed too) are not necessarily identical. This #pragma only modifies functions, so perhaps the "command-line" modification in the txt file also modifies global variables and the #pragma does not... I'm not 100% sure. I plan on using the #pragma option simply for convenience though, but need to do some more experimenting myself.

Thanks for the update, I have had a play and the code sizes did change indicating the optimisation was different, I am a bit busy with work at the moment so have not had much time to look at this further. Thanks again.

Really impressive. Thanks!

Out of interest I took the simplest example sketch "Blink", took out the delays and looked at the LED logic line with an oscilloscope. The toggle rate was 15% faster with option O2 compared to Os!

It would be interesting to see if there are any significant speed differences for computationally intense sketches such as Sine/Cosine, and floating point maths...

The graphics test sketch has a lot of short loops that are run at a high frequency, so even optimising out a few machine code instructions will show significant speed improvements.

Thanks! This is great! Good work!