Arduino is slow? What? This instructable will show just how slow a part of Arduino is, and how to fix it.
 
It’s true – more specifically, Arduino’s digitalWrite command takes a considerable amount of time. If you are just switching on an LED once or something, you won’t notice it. However, I realized how slow it was while I was trying to use a TLC5947 PWM driver. That requires the microcontroller to shift in 288 bytes each time! Each byte required about 12 digitalWrites, for a total of 3456 digitalWrites each time I wanted to shift in new data to the TLC5947.
 
How long did that take? 30 seconds of just digitalWrite!
 
But there is a solution – using “true c” style commands, or what AVR GCC (the GNU C compiler for AVR) uses. The brains behind Arduinos are ATmega168s or ATmega328s. The AVR community typically programs these chips with “true c” commands, using AVR Studio 4. The advantage of these “true c” commands is that they do exactly what you tell them to do.
 
But before we get into these commands, we must get familiar with port and pin definitions in the next step!

(If you predict you will like this instructable, feel free to vote for the Arduino contest!)

Step 1: The Truth about Pins

Arduino users know that the pins are labeled as digital 0-13 and analog 0-5. The makers of Arduino chose this for simplicity. The actual ATmega chip’s pins are labeled differently, however. Each pin has an assigned port letter and a bit number from 0 to 7. For example, pins can be A0-A7, B0-B7, and so on. All 8-bit AVR pins are labeled this way.
 
To help you clarify which digital/analog pin corresponds to which AVR pin, see the chart below.
 
My Seeeduino has an LED built in on digital pin 13. So, looking at the chart, its real pin would be B5.
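The chart's mapping for ATmega168/328 boards can also be sketched in code. This is only an illustration: `arduino_to_avr` and `struct avr_pin` are made-up names, and numbering analog 0-5 as pins 14-19 is an assumption borrowed from the usual Arduino convention.

```c
#include <stdint.h>

/* Port letter and bit number for one Arduino pin (hypothetical helper type). */
struct avr_pin {
    char port;    /* 'B', 'C' or 'D'; 0 if the pin number is out of range */
    uint8_t bit;  /* bit position within the port, 0-7 */
};

/* Map an Arduino pin number on an ATmega168/328 board to its AVR name.
 * Digital 0-7  -> PD0-PD7
 * Digital 8-13 -> PB0-PB5
 * Analog 0-5 (assumed to be numbered 14-19 here) -> PC0-PC5 */
struct avr_pin arduino_to_avr(uint8_t arduino_pin)
{
    struct avr_pin p = { 0, 0 };

    if (arduino_pin <= 7) {
        p.port = 'D';
        p.bit = arduino_pin;
    } else if (arduino_pin <= 13) {
        p.port = 'B';
        p.bit = (uint8_t)(arduino_pin - 8);
    } else if (arduino_pin <= 19) {
        p.port = 'C';
        p.bit = (uint8_t)(arduino_pin - 14);
    }
    return p;
}
```

Calling `arduino_to_avr(13)` gives port 'B' and bit 5, which matches the B5 found in the chart for the built-in LED.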
 
Next I will show you what the actual C command is.
<p>When I'm talking about PIND = _BV(PD7) code, the PIND is usually used to read the port, but if the pin is an OUTPUT pin it will flip it on a write. Typo... thanks</p>
<p>Nice way to stimulate people... However I think the comparisons are apples and oranges. Doing a direct write to memory and then comparing it to a polished function call is apples and oranges. Just the call itself generates more code than a write to memory. For example, I don't have a boot-loader, I write directly to the 328p, which makes looking at the generated assembly a little easier.<br><br>If you code<br><br> sei(); // enable interrupts<br> PORTD = 0;<br> PIND = _BV(PD7); // more on this code later<br> digitalWrite(13, HIGH);<br><br>This is the generated code (from the .lss listing file). The hexadecimal numbers on the left are where it lives in memory when loaded.<br><br><br> PORTD = 0;<br> 178: 1b b8 out 0x0b, r1 ; 11<br> PIND = _BV(PD7);<br> 17a: 80 e8 ldi r24, 0x80 ; 128<br> 17c: 89 b9 out 0x09, r24 ; 9<br> digitalWrite(13, HIGH);<br> 17e: 61 e0 ldi r22, 0x01 ; 1<br> 180: 8d e0 ldi r24, 0x0D ; 13<br> 182: 32 d0 rcall .+100 ; 0x1e8 &lt;digitalWrite&gt;<br> 184: 80 e8 ldi r24, 0x80 ; 128<br><br><br>This isn't tough to follow<br><br>PORTD = 0<br>causes machine code of '1b b8' (hexadecimal); the mnemonic is out 0x0b, r1, which is 'out' to (port) 0x0b (PORTD). For PIND = _BV(PD7), we have to load the value of _BV(PD7) -&gt; (1 &lt;&lt; PD7) -&gt; (1 &lt;&lt; 7) -&gt; 0x80 into r24, which the compiler (and preprocessor) handle creating.<br><br>I'm going to skip PIND = _BV(PD7) for now, as it pertains to other parts of this, but suffice it to say that the normal generated code for these writes is like that generated by the PIND = _BV(PD7) code segment. In other words, it loads a value into a register, then writes it out to the proper port.<br><br>Here is the call to digitalWrite<br><br>It loads the two values to be passed into r22 and r24, then calls the digitalWrite function. Note we already increased the code size from 4 bytes to 6 bytes, a 50% increase in the amount of code you are going to execute, not even counting what happens when it gets to the function itself. 
By the way, my code size was about 334 bytes, adding the one call to digitalWrite increased code size by almost 60% to 607 bytes. Why is this? I'm glad you asked...<br><br>Here is the code that gets executed when you invoke digitalWrite(....)<br><br><br>000001e8 &lt;digitalWrite&gt;:<br> 1e8: 0f 93 push r16<br> 1ea: 1f 93 push r17<br> 1ec: cf 93 push r28<br> 1ee: df 93 push r29<br> 1f0: 1f 92 push r1<br> 1f2: cd b7 in r28, 0x3d ; 61<br> 1f4: de b7 in r29, 0x3e ; 62<br> 1f6: 28 2f mov r18, r24<br> 1f8: 30 e0 ldi r19, 0x00 ; 0<br> 1fa: f9 01 movw r30, r18<br> 1fc: e8 59 subi r30, 0x98 ; 152<br> 1fe: ff 4f sbci r31, 0xFF ; 255<br> 200: 84 91 lpm r24, Z<br> 202: f9 01 movw r30, r18<br> 204: e4 58 subi r30, 0x84 ; 132<br> 206: ff 4f sbci r31, 0xFF ; 255<br> 208: 14 91 lpm r17, Z<br> 20a: f9 01 movw r30, r18<br> 20c: e0 57 subi r30, 0x70 ; 112<br> 20e: ff 4f sbci r31, 0xFF ; 255<br> 210: 04 91 lpm r16, Z<br> 212: 00 23 and r16, r16<br> 214: c1 f0 breq .+48 ; 0x246 &lt;digitalWrite+0x5e&gt;<br> 216: 88 23 and r24, r24<br> 218: 19 f0 breq .+6 ; 0x220 &lt;digitalWrite+0x38&gt;<br> 21a: 69 83 std Y+1, r22 ; 0x01<br> 21c: bc df rcall .-136 ; 0x196 &lt;turnOffPWM&gt;<br> 21e: 69 81 ldd r22, Y+1 ; 0x01<br> 220: e0 2f mov r30, r16<br> 222: f0 e0 ldi r31, 0x00 ; 0<br> 224: ee 0f add r30, r30<br> 226: ff 1f adc r31, r31<br> 228: ec 55 subi r30, 0x5C ; 92<br> 22a: ff 4f sbci r31, 0xFF ; 255<br> 22c: a5 91 lpm r26, Z+<br> 22e: b4 91 lpm r27, Z<br> 230: 9f b7 in r25, 0x3f ; 63<br> 232: f8 94 cli<br> 234: 8c 91 ld r24, X<br> 236: 61 11 cpse r22, r1<br> 238: 03 c0 rjmp .+6 ; 0x240 &lt;digitalWrite+0x58&gt;<br> 23a: 10 95 com r17<br> 23c: 81 23 and r24, r17<br> 23e: 01 c0 rjmp .+2 ; 0x242 &lt;digitalWrite+0x5a&gt;<br> 240: 81 2b or r24, r17<br> 242: 8c 93 st X, r24<br> 244: 9f bf out 0x3f, r25 ; 63<br> 246: 0f 90 pop r0<br> 248: df 91 pop r29<br> 24a: cf 91 pop r28<br> 24c: 1f 91 pop r17<br> 24e: 0f 91 pop r16<br> 250: 08 95 ret<br><br>Notice a call in there to turnOffPWM, this handles 
pulse width modulation, in case you use incompatible pins. The turnOffPWM function is much longer than the digitalWrite function. Since you have the potential to go through all of that code, these types of comparisons are pretty useless. Also notice the overhead of the function itself, where it has to push (store on the stack) 5 register values and restore them before it returns. This is common, as these 5 registers can now be used by the function and are restored to their previous values (popped off the stack) before returning.<br><br>I find the Arduino, and really all the Atmel chips, 'fun' chips to play with. It's neat to find new things such as the code that shows you how to flip a bit. Atmel has designed the hardware in a way that supports bit manipulation, especially on ports.<br><br>I see many code examples that show reading ports like PORTD, where the actual read should be from PIND. Since this 'reads' the port, you normally don't write to it. However, if the pin is an output, you can write to PIND and flip any bits that you wish.<br><br>PIND = _BV(PD7)<br><br> 17a: 80 e8 ldi r24, 0x80 ; 128<br> 17c: 89 b9 out 0x09, r24 ; 9<br><br>PORTD &amp;= ~_BV(PD7) generates a 'cbi' (clear bit) on the port, but it looks complex to people.<br><br>Lots of times you just want to flip it :)<br><br>Bottom line, this comparison is pretty useless unless you want to show the overhead. Don't think digitalWrite just writes to memory. Functions relieve you of dealing with the very detailed parts of programming. If you want speed and small code size, you pay for it by needing to know more about the item you are working on and generally coding it by hand (i.e. assembly). For example, I see many demo code sequences that use a pull-up or pull-down resistor to make a switch or something work properly. The 328p has internal pull-ups, which reduce the component count; you just have to enable them. I encourage people to read the data sheets. 
They may seem intimidating, but you get used to them and get quicker at figuring out what you can do with the device.<br>The level of obfuscation by functions is usually a choice of 'what's easier' to do and how detailed you want to get. The time spent doing assembly is generally not worth it (unless you like doing it :) But it is worth knowing and looking at when you run into some problems. Just be aware.</p><p><br>I use Eclipse and have the listings turned on. If you have the Arduino IDE, I think you have to enable the listing file in the preferences.txt file in the default directory for the IDE. I 'think' the line you need to add is &quot;build.verbose=true&quot;.<br><br>Hope this helps some anyway....</p>
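To make the bit operations in the comment above concrete, here is a small sketch that works on a plain variable instead of a real port register, so it can be tried on any machine. `BIT_VALUE`, `reg_set`, `reg_clear`, and `reg_toggle` are made-up names for illustration; on the AVR itself you would use `_BV()` with `PORTD`, and the XOR in `reg_toggle` is only a software stand-in for the hardware flip you get by writing to `PIND`.

```c
#include <stdint.h>

/* Same expansion the comment walks through: bit 7 -> (1 << 7) -> 0x80.
 * Hypothetical name, chosen so it does not clash with avr-libc's _BV(). */
#define BIT_VALUE(bit) ((uint8_t)(1u << (bit)))

/* Force one bit high and leave the others alone (like PORTD |= _BV(PD7)). */
static inline void reg_set(volatile uint8_t *reg, uint8_t bit)
{
    *reg |= BIT_VALUE(bit);
}

/* Force one bit low and leave the others alone (like PORTD &= ~_BV(PD7)). */
static inline void reg_clear(volatile uint8_t *reg, uint8_t bit)
{
    *reg &= (uint8_t)~BIT_VALUE(bit);
}

/* Flip one bit in software; a stand-in for the PIND write described above. */
static inline void reg_toggle(volatile uint8_t *reg, uint8_t bit)
{
    *reg ^= BIT_VALUE(bit);
}
```

Passing a pointer keeps the helpers usable on a simulated register; on real hardware the compiler can only reduce such writes to single `sbi`/`cbi` instructions when the register and bit are known at compile time.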
<p>Writing directly into the hardware registers, you lose readability and portability.</p><p>I've published on GitHub a tool I called HWA that lets you use an object-oriented interface to the hardware that does not require a C++ compiler and produces high-efficiency binary code.</p><p>It is there: https://github.com/duparq/hwa</p>
<p>Using HWA, can I run an Arduino at speeds similar to those of a low-level-programmed Arduino?</p>
Absolutely.<br><br>Since it is not a library, HWA helps produce highly optimized binary code, most often the same as a clever low-level programmer would have obtained.<br>
<p>The type of variable we use also matters a lot. Obviously it will take longer to handle a long than a short int or an unsigned char, so that should also be taken into account when writing our sketches, so that they run faster without straining our Arduino.</p>
<p>Here is an example of how long the Arduino takes to execute a sketch; I found it in a book. I think it is good for many people to know this, and what <a href="http://www.instructables.com/member/RazorConcepts/" rel="nofollow">RazorConcepts</a> says is true: the Arduino takes less time to execute actions when it is programmed in "Real C programming" mode, which is how AVRs are programmed. But since one of its main purposes is to be usable by people who are sometimes inexperienced, some speed is sacrificed to give a programming environment that is friendlier to people who don't know much (like me).<br>Here is the link so you can try it if you wish; it is on codebender:<br>https://codebender.cc/sketch:339633</p>
<p>Here you can see how fast Arduino boards are (turn on English subtitles):<br>https://www.youtube.com/watch?v=1PxyQubkZzM</p>
<p>Thanks, this is very useful to know.</p><p>In my case, it is 20 times faster. Below are the results when I ran the same code:</p><p>Time for digitalWrite(): 5220</p><p>time for true c command: 252</p>
I used an ATmega328 and I got: <p><em>digitalWrite(): 5224<br>true c : 256</em></p>
<p>How do I write a digitalRead function in C?</p>
<p>digitalRead() already exists in wiring_digital.c; it's buried within the Arduino hardware folders.</p>
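For anyone who wants the "true c" version of a read, the core of it is just picking one bit out of the value read from a PINx register. The sketch below shows that step on a plain byte so it can be tested anywhere; `read_pin_bit` is a made-up name, not the function from wiring_digital.c, and it skips the PWM handling that the real digitalRead() performs.

```c
#include <stdint.h>

/* Return 1 if the given bit is set in a value read from a PINx register,
 * otherwise 0. Hypothetical helper; 'bit' must be in the range 0-7. */
uint8_t read_pin_bit(uint8_t pin_register_value, uint8_t bit)
{
    return (uint8_t)((pin_register_value >> bit) & 1u);
}
```

On the AVR this would be called as `read_pin_bit(PINB, 5)` for digital pin 13, after making the pin an input by clearing bit 5 of `DDRB`.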
<p>For more details on arduino registers about port manipulation and what PORTB really is, refer to:</p><p>https://www.arduino.cc/en/Reference/PortManipulation</p>
<p>Another way is to use single-instruction macros with CBI and SBI...</p><p>See the bottom of this post...</p><p>http://playground.arduino.cc/Main/AVR</p>
<p>An example (assuming an inductor of proper value is connected to the CLKIN pin and corresponding IO pin) ...</p><p>PORTB |= 0b00100000; // if needed repeat this operation for total of 32 cycles<br>PORTB &amp;= ~0b00100000; // if needed repeat for 32 cycles and so on ...</p><p>Then (in theory) after flashing this simple code the uC should be clocked by the RFID's carrier and powered by it as well :D</p>
<p>Does anyone know how many actual clock cycles it takes to set pin states using this 'pure c' method? I found a very interesting post on emulating RFID tags using nothing but a PIC and a single radial coil. The same could be done with an ATtiny. It exploits the uC's internal capacitance and clamping diodes on the IO pins. Essentially, the RF modulation of the RFID reader supplies the oscillator frequency and just enough RF/induced current to even parasitically power the chip! One end of the coil is connected to the GP/CLKIN and the other end to a GP/IO pin. If the right number of clock cycles is used in between switching pin states, it emulates an RFID card (simply switching low/high/low). There must be 32 clock cycles between each state per Manchester encoding. So, if you know how many cycles a pin state operation takes then you can achieve this kind of extremely simple emulation. Here's the article - http://www.t4f.org/projects/open-rfid-tag/the-simplest-possible-rfid-emulator/</p>
If you count down rather than up in the loop it will be even faster! On my Arduino Mega, changing the loop to count down for the True C commands reduced the time taken from 288 microseconds to 192 microseconds. That's a big difference if you need it to be as fast as possible!
<p>Counting down is more efficient, but only if you are comparing to zero. It is always more efficient to compare a value to zero than to another value! Similarly, do{}while(); loops are more efficient than for() loops. There are lots of pre-optimization tricks that you can do to squeeze your code into the smallest of chip spaces.</p>
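Here is a rough sketch of the two loop shapes being compared. Both functions are made-up examples that add up the bytes of an array, so they return the same result; only the loop test differs. Whether the count-down version is actually faster depends on the compiler and optimization level, so treat it as something to measure, not a guarantee.

```c
#include <stdint.h>

/* Counting up: every pass compares the counter against n. */
uint16_t sum_counting_up(const uint8_t *data, uint8_t n)
{
    uint16_t total = 0;
    for (uint8_t i = 0; i < n; i++) {
        total += data[i];
    }
    return total;
}

/* Counting down with do/while: every pass only checks for zero. */
uint16_t sum_counting_down(const uint8_t *data, uint8_t n)
{
    uint16_t total = 0;
    if (n == 0) {
        return 0;  /* a do/while body always runs once, so guard the empty case */
    }
    do {
        n--;
        total += data[n];
    } while (n != 0);
    return total;
}
```

The guard for `n == 0` is the price of the do/while form; if the count is known to be non-zero, it can be dropped.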
<p>I am glad someone else finally realized the dirty secret behind Arduino... all of those libraries you love to use are horribly inefficient because they have to remap function calls to different pins on different devices.<br><br>You also have very little control over how the internal hardware is used (which timer, at what speed, etc). I regularly squeeze my optimized code into 99% of the space available on AVR chips. I want to get everything out of it that I can - no room for fluff.</p>
<p>If you know how to program C on an AVR, then it's likely Arduinos are not for you to begin with. However, I regularly program AVRs, but still find Arduinos useful, inefficient or not. The sheer number of libraries for almost every peripheral device imaginable makes it appealing for getting projects quickly off the ground and/or prototyping. If the project proves fruitful, I'll then port it to an AVR or write my own C library for the Arduino.</p>
<p>There is nothing wrong with that approach... in fact, I'd say that is probably the most intelligent use of Arduino possible - rapid proof of concept with the intent to optimize the code later.<br><br>However, that still doesn't solve the issues many people come across when multiple libraries try to take control of the same hardware peripherals or IO pins, and the user has no idea how any of it works, so they just give up.</p>
<p>&quot;Note that | can be found to the left of the backspace key, on the same key of the backslash.&quot;</p><p>On a Windows PC, with a U.S.-style keyboard.</p><p>Different machines will vary.</p>
<p>&quot;0b10000000 is an 8-bit binary number, you can convert it to hex for a cleaner look. Doing it manually is a pain&quot;</p><p>That's not true, every four bits can be directly translated to hex: 1000 is binary for 8 and 0000 is a 0, so the output will be 0x80. For 0b11110101: 1111 is decimal 15 and hexadecimal F, 0101 is binary for 5 (dec and hex). Putting these digits together gives 0xF5. Once you get the hang of it, it is way faster than looking up every single number.</p>
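The four-bits-at-a-time trick is easy to write down as code. This is just a sketch of the idea, with `byte_to_hex` as a made-up name: the upper four bits pick the first hex digit and the lower four bits pick the second.

```c
#include <stdint.h>

/* Write the two hex digits of 'value' into out[0] and out[1],
 * followed by a terminating '\0'. Hypothetical helper for illustration. */
void byte_to_hex(uint8_t value, char out[3])
{
    static const char digits[] = "0123456789ABCDEF";

    out[0] = digits[value >> 4];    /* upper four bits -> first hex digit */
    out[1] = digits[value & 0x0F];  /* lower four bits -> second hex digit */
    out[2] = '\0';
}
```

With the two values from the comment, 0b10000000 comes out as "80" and 0b11110101 comes out as "F5".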
Thanks for sharing that !
It's better to change just the part after the logic operator:<br><br>PORTB |= 0b00100000;<br>PORTB &amp;= ~0b00100000;<br><br>If you don't use the logic operators, but just the equal &quot;=&quot; sign, all the pins are going to be set like the byte you sent, not just pin B5. In the case where you set B5 high, you would set all other B pins to low.
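A small sketch of the difference, using plain byte values in place of the real PORTB register so it can be checked on any machine. The function names and `B5_MASK` are made up for illustration.

```c
#include <stdint.h>

#define B5_MASK 0x20u  /* 0b00100000, bit 5 of the port */

/* Plain "=": every bit of the port takes the new value, old state is lost. */
uint8_t write_whole_port(uint8_t port, uint8_t value)
{
    (void)port;  /* the previous port contents play no part */
    return value;
}

/* "|=" with the mask: only bit 5 is forced high. */
uint8_t set_b5(uint8_t port)
{
    return (uint8_t)(port | B5_MASK);
}

/* "&= ~" with the mask: only bit 5 is forced low. */
uint8_t clear_b5(uint8_t port)
{
    return (uint8_t)(port & (uint8_t)~B5_MASK);
}
```

Starting from a port value of 0x03 (B0 and B1 high), `set_b5` gives 0x23 and keeps B0 and B1, while `write_whole_port(0x03, 0x20)` gives 0x20 and drops them.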
Part of the reason this works is that digitalWrite() and digitalRead() do some checking before they touch the register bit. In particular, they check for PWM and turn it off if it's enabled for that pin.&nbsp; Directly accessing the bit yourself is fine, but you're also assuming responsibility for skipping the check.<br /> <br /> This is pretty safe if your sketch is simple and doesn't do much multitasking or multiplexing of pins, but can get perilous if you've got a complex one.<br />
&nbsp;I wonder why I get this error:<br /> <br /> In function void loop():<br /> Error: 'PB5' was not declared in this scope.<br /> <br /> I use 0018 software.<br /> <br /> <br /> I used form PORTB = B.....;<br /> <br /> and it took 3808 vs 284 microseconds.<br /> <br /> I'm trying to get TLC5945 communicating with arduino, no luck with digitalWrites, hopefully this version works.<br />
&nbsp;I have the same problem. The example code is obviously missing something. Where and how should PB5 be declared?&nbsp;<br /> <br /> Me, I'm trying to PWM on all the pins of an Arduino Duemilanove to dim some LEDs.&nbsp;
&nbsp;Hmm, compiles fine for me on 0015, it may be something with the new software.<br /> <br /> If anyone has problems, replace<br /> &quot; PORTB |= _BV(PB5);<br /> &nbsp; PORTB &amp;= ~_BV(PB5);&quot;<br /> <br /> with:<br /> &quot;PORTB = 0b00100000;<br /> PORTB = 0b00000000;&quot;<br /> <br /> Which should toggle PB5 like the original code (although this form also writes 0 to every other port B pin). PB5 shouldn't have to be declared, as WinAVR should recognize it as a pin, but Arduino 0018 may be doing something funky with that.<br />
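If the compiler complains that PB5 is not declared, one possible workaround is to define the bit number yourself when the headers don't. This is a guess at the cause, since it isn't confirmed in this thread. What is certain is that PB5 is bit 5 of port B, so the fallback is just the number 5.

```c
/* Fallback for toolchains whose headers do not define PB5.
 * Assumption: the missing name is the only problem. PB5 is bit 5 of port B. */
#ifndef PB5
#define PB5 5
#endif

/* Same idea for the _BV() helper, in case it is missing as well. */
#ifndef _BV
#define _BV(bit) (1 << (bit))
#endif
```

With these in place near the top of the sketch, `PORTB |= _BV(PB5);` and `PORTB &= ~_BV(PB5);` keep the single-pin behavior instead of overwriting the whole port.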
Thanks, this helped a lot because it's the only way to program the 16 extra pins on the Seeeduino Mega<br />
Thanks for this instructable. Not only does this save a lot of time, it also knocks the program size down quite a bit.&nbsp; You can also use the registers DDRx and PINx (where x is the letter of the port) to change a port's input/output status, and to check what the input value is.<br />
&nbsp;So do you then use the AVR Studio 4?&nbsp;<br /> Or how do you program an Arduino in this way?<br /> Can you use the Arduino's USB FTDI connection?<br /> <br /> Please elaborate. Thanks!<br />
I use Arduino with the &quot;true c&quot; style commands.<br /> <br /> You can use AVR Studio 4, however it is quite complicated and you cannot use the USB connection; you will need an ISP programmer to burn the programs.<br />
Maybe using <a href="http://sites.google.com/site/funlw65/electronics/jaluino-pinguino-28-pins-starting-bo/freejalduino">this</a> &quot;Arduino&quot; ?<br /> <br /> <br />
Actually, there is a very nice writeup on &quot;Direct Port Manipulation&quot; (which is the Arduino folk's name for what you are doing) on the Arduino site.&nbsp; Here's the link:<br /> <br /> &nbsp; <a href="http://www.arduino.cc/en/Reference/PortManipulation">http://www.arduino.cc/en/Reference/PortManipulation</a><br /> <br /> The description at the top of the page says &quot;Port registers allow for lower-level and faster manipulation of the i/o pins of the microcontroller on an Arduino board.&quot;<br /> <br /> Wayne<br />
Nice tutorial.&nbsp; You should enter it into the Arduino contest.<br />
Some explanation of C &quot;bitwise&quot; operators might be helpful, too... And maybe break it down without using compound assignment ops first.<br /> <br /> Here's my fav bitwise link:<a href="http://www.avrfreaks.net/index.php?name=PNphpBB2&amp;file=viewtopic&amp;t=37871"> AVR bit tutorial</a><br /> <br /> Other than that, it's a good 'ible...<br />
Thanks! That may be a bit complicated, but I&nbsp;added a final step with that link.&nbsp;
*facepalm*<br /> horrible pun was unintended
How come you didn't use hardware SPI instead of bit banging the commands to the PWM chip? I checked briefly and it seems it uses SPI&nbsp;commands.&nbsp;Looking at the ATmega168 data sheet, it does have support for hardware SPI, and if you used hardware, it would be much, much faster than bit banging SPI commands, since you can usually get up to 20 MHz data transfer depending on your internal clock (or external, but usually it requires you to go with the max clock rate for the chip). It would also free up the CPU to do other things.<br />
SPI uses the USART,&nbsp;doesn't it? And that's pre-wired on the Arduino to support the bootloader<br />
True, actually there is a TLC5940 library for Arduino that uses SPI, however I wanted to see what was actually going on so I just did it the &quot;manual&quot; way by bit banging. <br /> <br /> SPI doesn't use the USART, it uses SS/MOSI/MISO/SCK, the latter 3 are used for ISP programming also.&nbsp;
Does this work for the Arduino Mega? (The chip is an ATmega1280.)<br />
&nbsp;Yes it does, however I&nbsp;cannot find a chart that shows which arduino pin goes to which pin... you may have to take a peek in to the schematic of the Arduino Mega.
reading and writing pwm pins will also be somewhat slower than writing non-pwm pins.<br /> <br /> hmmm, i notice that there's no longer any range checking on pin numbers.&nbsp; although this speeds things up i wonder what havoc a pinMode or digitalWrite to some out of range value might wreak.<br />
Good to know that there is such an overhead!<br /> How does the ShiftOut() command compare to this? Looking at the arduino reference it seems to be dedicated for this task (if the SPI pins are available..)<br /> Would be nice if someone posts some numbers, since i do not own an arduino to test with..<br /> <br />
There's a pretty extensive discussion on pin-toggling on the arduino forums here: http://www.arduino.cc/cgi-bin/yabb2/YaBB.pl?num=1230286016<br /> <br />

About This Instructable


Bio: Check out razorconcepts.net for some projects.