
Software Engineer Things You Only Encounter In Embedded Programming - Part I

I mentioned in the blog's introduction that I'm working on a closed-source DSP (Digital Signal Processing) application. I have an embedded DSP development board from Texas Instruments with a low-power fixed-point CPU, connected via SPI to an FPGA board from Xilinx, which is in turn connected via USB cable to my PC. I can't say yet what I'm developing with it, mostly because I haven't fully figured it out, but also because I might try to sell it one day. But recently I encountered a bug while programming that I had never seen before, and I thought people might find it interesting.

When (0 > i) is not the same as (i < 0)

Most developers, myself included, take it for granted that you can flip the operands of a comparison operator and as long as you change the greater-than to a less-than the behavior is identical either way. I tend to like to write my "if" statements like this:
if(0 > i) ...
Rather than this:
if(i < 0) ... 
Because I like to see the zero come first in the expression. Then I know immediately the code is comparing something with 0, rather than having to scan the whole line with my eyes to the end to see that it's comparing with 0. In all my years of software development experience this has never made a difference in any context. Whether I'm programming SQL, VHDL, C/C++, Python, .NET, you name it, this has never ever made a difference.
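To make that assumption concrete, here is a minimal sketch that checks both spellings against every 16-bit signed value. The choice of int16_t and the function name count_mismatches are assumptions made for illustration; the post does not say what type the variable had.

```c
#include <stdint.h>

/* Counts the 16-bit signed values for which (0 > i) and (i < 0)
 * give different answers. By the C language rules this should
 * always be zero. */
long count_mismatches(void) {
    long mismatches = 0;
    for (long v = INT16_MIN; v <= INT16_MAX; v++) {
        int16_t i = (int16_t)v;
        if ((0 > i) != (i < 0)) {
            mismatches++;
        }
    }
    return mismatches;
}
```

On a conforming compiler the count is zero, which is exactly why the behavior described next was so surprising.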

...until my embedded DSP application...

It was a normal weekend of hobby coding when I decided to add an FFT (Fast Fourier Transform) calculation to my project. I programmed the DSP to sample a real-time audio signal, compute the FFT every so often, and send the data through the FPGA to my PC, where I plot the spectrum on a bar chart. I got everything working perfectly (after at least a whole day's toil), but at the end of the day I noticed that after a certain non-deterministic amount of runtime the code on my DSP suddenly broke in a very specific way.

What I saw was that my rate-limiting logic, which made sure that I only compute the FFT "every so often" to avoid buffers overflowing, suddenly and for no good reason broke, causing the FFT to compute on every task cycle and overflowing all my buffers. The code looked something like this:
divider_counter = divider_counter - 1;
if(0 > divider_counter){
    compute_fft(...);
    divider_counter = DIVIDER_MAX;
}
So whenever the divider counter falls below 0 it computes the FFT and resets the divider counter back to the max. Super simple, right? But after the code started failing, if I set a debugger breakpoint on the line "if(0 > divider_counter)", I saw that the divider counter was actually greater than 0 and yet execution still entered the 'if' body. This absolutely dumbfounded me. I have never in my life seen an 'if' body execute unless the condition was actually true. So what was happening? Why did my greater-than sign suddenly and randomly switch to a less-than sign after some unknown amount of runtime?
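For reference, the intended behavior of that divider can be sketched as a small self-contained unit. Everything beyond the original snippet is an assumption: the DIVIDER_MAX value of 3, the int16_t counter type, the task_cycle wrapper, and compute_fft_stub, which stands in for the real FFT call and only counts invocations.

```c
#include <stdint.h>

#define DIVIDER_MAX 3            /* hypothetical; the real value is not given */

int16_t divider_counter = DIVIDER_MAX;
unsigned fft_calls = 0;

/* Stand-in for the real FFT routine: just records that it ran. */
void compute_fft_stub(void) {
    fft_calls++;
}

/* One task cycle: count down, and run the FFT only when the
 * counter drops below zero, then reload the counter. */
void task_cycle(void) {
    divider_counter = divider_counter - 1;
    if (0 > divider_counter) {
        compute_fft_stub();
        divider_counter = DIVIDER_MAX;
    }
}
```

With DIVIDER_MAX set to 3, the counter goes 2, 1, 0, -1, so the stub runs on every fourth call to task_cycle. The failure described above corresponds to it running on every call instead.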

This was a puzzle that stumped me for at least a week or two. I kept developing other areas of the system, but in the back of my mind I was furiously trying to understand why that one 'if' statement was working for a while and then suddenly did the opposite. Finally, after trying every possible thing to fix it, I got desperate and entered the rescrog phase of troubleshooting. If you aren't familiar with rescrogging, it's basically where you just start changing your code without really knowing what's wrong with it or how to fix it. One of the things I tried, on a whim, was to switch the operands of the comparison, so now my code looked like this:
...
if(divider_counter < 0){ ... }
And lo and behold that's what fixed it! After beating my head on this for weeks trying all the most intricate ways of fixing it, the thing that finally did it was moving the zero to the end. I let the whole system run for 3 days straight to convince myself that it really was fixed, and indeed the problem never happened again.

But that's not the end of my story, because even though my code works now, I had no idea of any mechanism that explains why that fixed my bug. This was really troubling to a professional software engineer who prides himself on knowing practically everything about software. If someone were to tell me that they fixed a bug by changing (0 > i) to (i < 0) I would tell them to look at it again because according to the C standard that can't be a valid bugfix. So I looked at it again and again, and I was about to go on the TI forums and bounce it off their engineers to see if they know anything about this (assuming they even believed my story), when I discovered what my bug was on my own.

It turns out that the FFT function I'm calling is implemented in a static library provided by TI, whose source code is entirely in assembly. I was perusing the documentation for the FFT function of this library when I found a list of known bugs in the version I was using. It turns out that there is a bug in their assembly code that, due to alignment issues, causes a part of the stack to be clobbered if the stack pointer happens to point to an odd address (they claim it happens 50% of the time, but I think that would depend on a lot of things). After I read that, I grabbed the latest version of the library and rebuilt my application with it, and sure enough the problem went away, even after I changed the 'if' condition back to the original way.
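To picture how an alignment bug like that can trash a caller's variable, here is a toy model in plain C. It is not TI's code and not the actual mechanism of their bug, which the documentation summary above does not spell out. The stack is simulated as an ordinary array, and buggy_routine, caller_local_survives, the 4-word scratch size, and the fill value are all hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_WORDS 16

/* Hypothetical callee that owns 4 scratch words starting at sp but
 * silently rounds sp up to an even index. With an odd sp the write
 * window shifts by one word and spills past its own area. */
void buggy_routine(int16_t *stack, size_t sp) {
    size_t base = (sp + 1u) & ~(size_t)1u;   /* round up to even */
    for (size_t k = 0; k < 4; k++) {
        stack[base + k] = 0x5555;
    }
}

/* Puts a caller-owned value in the first word above the callee's
 * scratch area and reports whether it survives the call.
 * Valid for sp values up to 11 with this array size. */
int caller_local_survives(size_t sp) {
    int16_t stack[STACK_WORDS] = {0};
    size_t local_slot = sp + 4;              /* just above the scratch words */
    stack[local_slot] = 123;                 /* e.g. a divider counter */
    buggy_routine(stack, sp);
    return stack[local_slot] == 123;
}
```

In this model an even starting index such as 4 leaves the caller's value intact, while an odd one such as 5 overwrites it. That matches the general shape of the symptom: a value that looks fine in the source but is silently changed by a call that happens to run with an unlucky stack pointer.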

So that was a very important exercise for me in my journey to become a software expert. This is something you really only encounter in embedded programming, because generally you don't ever see assembly code otherwise, and the C/C++ compiler takes care of saving and restoring contexts during function calls for you. But you should not take for granted that the assembly code is doing everything right. Sometimes the bug really isn't in your code, but in a library you're using. That's why you should always read the documentation, and always use the latest versions as they're available!
