Embedded Software
Colin Walls has over thirty years' experience in the electronics industry, largely dedicated to embedded software. A frequent presenter at conferences and seminars and the author of numerous technical articles and two books on embedded software, Colin is an embedded software technologist with Mentor …

When compilers do magic
August 17th, 2015 by Colin Walls
What is a compiler? Ask an average engineer and you will get an answer something like: "A software tool that translates high-level language code into assembly language or machine code." Although this definition is not incorrect, it is rather incomplete and out of date – so 1970s. A better way to think of a compiler is: "A software tool that translates an algorithm described in a high-level language into a functionally identical algorithm expressed in assembly language or machine code." More words, yes, but a more precise definition. The implications of this definition go beyond placating a pedant like me. They lead to a greater understanding of code generation – just how good a job a modern compiler can do – and of the effect upon debugging the compiled code …
It may be argued that a modern compiler could [under specific circumstances] produce better code than a skilled human assembly language programmer. To illustrate my point, here is an example of that phenomenon. Consider this code:

    #define SIZE 4

    char buffer[SIZE];
    int i;

    for (i = 0; i < SIZE; i++)
        buffer[i] = 0;

This is very straightforward. One would expect a simple loop that counts around four times using the counter variable i. I tried this, generating code for a 32-bit device, and stepped through the code using a debugger. To my surprise, the assignment only seemed to execute once, not four times. Yet the array was cleared correctly. So, what was going on?

A quick look at the underlying assembly language clarified matters. The compiler had generated a single, 32-bit clear instruction, which was considerably more efficient than a loop. The loop variable did not exist at all. I experimented and found that, for different values of SIZE, various combinations of 8-, 16- and 32-bit clear instructions were generated. Only when the array size exceeded something like 12 did the compiler start generating a recognizable loop, and even that was not a byte-by-byte clear; the operation was performed 32 bits at a time.

Of course, such optimized code is tricky to debug. Indeed, even today, some debuggers just do not allow debugging of fully optimized code. They give you an interesting choice: ship optimal code or debugged code. Realistically, I would recommend that initial debugging be performed with optimization wound down, to avoid such confusion and enable this kind of logic to be carefully verified. Later, verify the overall functionality of the code with aggressive optimization activated.
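As an aside, if you need to watch each iteration in the debugger without winding the optimization down globally, one trick is to make the accesses observable. A minimal sketch, using the same loop: the volatile qualifier tells the compiler that every store matters, so it may not collapse the loop into a single 32-bit instruction.

    #define SIZE 4

    /* volatile: each byte store is observable, so the compiler must
       keep all four iterations instead of merging them into one
       32-bit clear instruction */
    volatile char buffer[SIZE];

    void clear_buffer(void)
    {
        int i;
        for (i = 0; i < SIZE; i++)
            buffer[i] = 0;
    }

The qualifier should, of course, be removed for production builds, as it defeats exactly the optimization described above.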
I was very impressed by how smart the compiler was. I know that it is not magic, but it sure looks like it.

4 Responses to "When compilers do magic"

I do not have much appreciation for code produced by such compiler activity. This example shows that the result might be misleading, non-functional, or simply not what the person who wrote the code expected.
No surprise that there is no certified C compiler around that can produce certified code.
This is like going to a restaurant WITH A RECIPE. The CHEF has a better idea and modifies the recipe according to what he/she likes. The result can be OK, better, or just for the bin. Definitely unpredictable.
No wonder there are a few companies around that sell software to verify the code that has been generated.
Not much different from the taster who had to check whether the food was poisoned…
Well, it seems things have not changed much since the Middle Ages …
In this case [at least] the result is indistinguishable from the result of going around a loop – unless you regard the fact that the code was both smaller and faster as a problem.
Hi Colin,
Recently there was a decent discussion in the LinkedIn group "Plain Old C Programming" very much related to this topic. The discussion devolved into good code vs. bad code and what different compilers do with that code. The discussion title was "Why do we get a segmentation fault in the following code?" At issue was the different behaviour of different compilers.
The code in the OP's posting was "bad code", so I guess the discussion converged around one compiler's "good magic" versus another compiler's "bad magic."
https://www.linkedin.com/grp/post/1627067-6046617327905026050
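I don't have that exact listing to hand, but a classic sketch of the kind of "bad code" that provokes such disagreement is writing to a string literal. The behaviour is undefined, so one compiler's output may segfault where another's appears to work:

    #include <stdio.h>

    int main(void)
    {
        char *s = "hello";   /* s points at a string literal           */
        s[0] = 'H';          /* undefined behaviour: may segfault, may
                                silently modify the literal, may vary
                                with compiler and optimization level   */
        printf("%s\n", s);
        return 0;
    }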
I hope that helps,
Good Luck,
Cheers, Richardv.
A way to verify code that is hard to test exhaustively is symbolic simulation; here's a company working on it:
https://galois.com/blog/2013/09/high-assurance-base64/
You can use the unoptimized code as a reference for the optimized code. Since symbolic simulation works quite fast, you could build it into the compiler as a check on aggressive/speculative optimization.
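A minimal sketch of that reference idea, using plain differential testing with a concrete input rather than symbolic simulation: the same clearing routine is built twice in one file, unoptimized as the reference and aggressively optimized as the candidate. The function names and the GCC-specific per-function optimization attributes are illustrative.

    #include <stdio.h>
    #include <string.h>

    #define SIZE 16

    /* With GCC, per-function optimization attributes let both
       builds of the same routine live in one test harness. */
    __attribute__((optimize("O0")))
    static void clear_ref(char *buf, int n)   /* unoptimized reference */
    {
        int i;
        for (i = 0; i < n; i++)
            buf[i] = 0;
    }

    __attribute__((optimize("O3")))
    static void clear_opt(char *buf, int n)   /* optimized candidate */
    {
        int i;
        for (i = 0; i < n; i++)
            buf[i] = 0;
    }

    int main(void)
    {
        char a[SIZE], b[SIZE];

        memset(a, 0xAA, SIZE);   /* identical non-zero starting state */
        memset(b, 0xAA, SIZE);

        clear_ref(a, SIZE);
        clear_opt(b, SIZE);

        /* Whatever instructions the compiler chose, the optimized
           build must be functionally identical to the reference. */
        if (memcmp(a, b, SIZE) != 0) {
            printf("optimized build diverges from reference\n");
            return 1;
        }
        printf("builds agree\n");
        return 0;
    }

Symbolic simulation goes further by covering all inputs rather than one, but the reference-versus-candidate structure is the same.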