tayaoutdoor.blogg.se

Division speed ns











Division of any data type is quite time-consuming, with int32 being especially bad.


Since you can't view pics as a guest, I thought it would be a good idea to put the results as text instead:

Nanosecond execution times on Arduino Mega2560 (16 MHz)

Operation   uint8   int16   int32   int64   float**

Measured using double nested loops (usually 255*255). The extra time required for looping was estimated at 250 ns and subtracted out. % was also tested but is the same as /, as expected.

** Results varied a bit; highest value reported.
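The arithmetic behind the looping method can be sketched roughly like this. It is only an illustration of the description above, not the code that was actually run; the function names `per_iteration_ns` and `ns_per_op` are made up for this example, and the total time is assumed to come from something like two `micros()` readings around the loops.

```cpp
#include <cstdint>

// Average time per loop iteration in ns, given the total elapsed time in
// microseconds (hypothetical input, e.g. the difference of two micros() reads).
constexpr std::uint64_t per_iteration_ns(std::uint32_t total_us,
                                         std::uint32_t iterations) {
    return (static_cast<std::uint64_t>(total_us) * 1000u) / iterations;
}

// Time attributed to the operation itself: per-iteration time minus the
// estimated loop overhead (the post guesses 250 ns). Clamped at 0 so a bad
// overhead estimate cannot produce a wrapped-around huge value.
constexpr std::uint32_t ns_per_op(std::uint32_t total_us,
                                  std::uint32_t iterations,
                                  std::uint32_t loop_overhead_ns) {
    return per_iteration_ns(total_us, iterations) > loop_overhead_ns
               ? static_cast<std::uint32_t>(
                     per_iteration_ns(total_us, iterations) - loop_overhead_ns)
               : 0u;
}
```

With 255*255 = 65025 iterations, a made-up total of 65025 us would work out to 1000 ns per iteration, or 750 ns for the operation after taking off the 250 ns loop estimate. On the real board the operands would also need to be `volatile` (or otherwise opaque to the compiler) so the loop body isn't optimized away.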


Actually, an (int16_t * int16_t) multiply is definitely more than 2 clock cycles. The only multiply available on the Uno/Mega takes 8-bit operands (it's an 8-bit microcontroller, remember). An 8-bit x 8-bit multiply does produce a 16-bit result, but if the operands are both 16-bit, then it takes multiple instructions to complete.

Regardless, I think your 5 us measurement for 32-bit multiply is more correct than my measurement. I took each of my measurements only once between calls to micros(), which means the time for micros() to execute is included in the measurement. In addition, like micros(), my measurements only have 4 us resolution (well, it's better than that because I did it multiple times and took an average). The reason I didn't loop is that I wanted to be certain the compiler couldn't optimize anything away. However, I just measured the execution time of micros() (with the looping method), and it takes 3774 ns minus the roughly 3-4 clocks for the loops (say 250 ns). So subtract 3.5 us from all my measurements! I think I might go ahead and re-measure all the math with the looping method just to compare.

Yeah, I was worried about the value of the operands making a difference, but not worried enough to spend a lot of time on it. I went ahead and got some more accurate results (by looping 65535 times in most cases). I tried to have the operand change for each calculation as well as I could without adding time, so perhaps it gives a good "average" result, although + and * seemed quite independent of operand values. Anyway, check out the attached image for the new results. And I did subtract out the time that I estimated the looping to take (I guessed 250 ns for each iteration).

Float division is definitely quite a bit faster than int32. The surprising result here is that int8 division is almost as slow as int16! I wonder if int8 division is not really implemented, and it just executes as int16? And I'm not sure if the differences in sqrt() are from differences in operand values or from converting to float (because I suspect sqrt() is only actually implemented for float). Oh, by the way, these times are in ns, not us (typo on top of image file).
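To illustrate the point about a 16-bit multiply needing several instructions when the hardware only multiplies 8-bit operands, here is a minimal sketch that builds the low 16 bits of a 16 x 16 product out of 8 x 8 partial products. The helper names (`lo8`, `hi8`, `mul8x8`, `mul16_low`) are invented for this example, and the routine the compiler really emits may be organized differently; this only shows why more than one multiply is needed.

```cpp
#include <cstdint>

// Split a 16-bit value into its low and high bytes.
constexpr std::uint8_t lo8(std::uint16_t v) {
    return static_cast<std::uint8_t>(v & 0xFFu);
}
constexpr std::uint8_t hi8(std::uint16_t v) {
    return static_cast<std::uint8_t>(v >> 8);
}

// Stand-in for the hardware multiply: 8-bit x 8-bit -> 16-bit result.
constexpr std::uint16_t mul8x8(std::uint8_t a, std::uint8_t b) {
    return static_cast<std::uint16_t>(static_cast<std::uint16_t>(a) * b);
}

// Low 16 bits of a 16 x 16 multiply from three 8 x 8 partial products:
//   a * b = (aH*bH << 16) + ((aH*bL + aL*bH) << 8) + aL*bL
// The aH*bH term only affects bits 16 and up, so it is dropped here.
// Intermediates are widened to 32 bits so nothing overflows before the
// final truncation to 16 bits.
constexpr std::uint16_t mul16_low(std::uint16_t a, std::uint16_t b) {
    return static_cast<std::uint16_t>(
        static_cast<std::uint32_t>(mul8x8(lo8(a), lo8(b))) +
        ((static_cast<std::uint32_t>(mul8x8(hi8(a), lo8(b))) +
          static_cast<std::uint32_t>(mul8x8(lo8(a), hi8(b))))
         << 8));
}
```

For example, `mul16_low(300, 200)` gives 60000, the same as the plain 16-bit product. Even in this simplified form there are three multiplies plus the shifts and adds to combine them, which is why a 16-bit multiply can't finish in the 2 clocks a single 8-bit multiply takes.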











