MisterBlonde said:
@alephnull

This is the PC discussion forum so I was thinking more like x86. Perhaps this is why we are disagreeing?

From your link:
"Due to the small number of architectural registers, the x86 calling conventions mostly pass arguments on the stack"

As for inline functions, I always thought those were compiler hints, so there is no guarantee that the compiler will actually inline them. I also think of an inlined function as being part of the function it was inlined into, no different from the other lines of code in the calling function. Since no call instruction is executed, there would be no purpose in pushing params onto the stack.

Let me know if any of my assumptions are wrong for x86.

Whether or not a function is inlined, the called function's code still has to execute before the calling function continues execution, assuming this is a synchronous function call.


 

The inline keyword is indeed a compiler hint which the compiler is free to ignore. When a function is "inlined", the effect is equivalent to putting the body of the callee in {} braces at the point where that function was called. If you do that by hand, it is no longer just a "hint".
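To make that concrete, here is a small C sketch. The names (square, sum_sq_call, sum_sq_manual) are made up for illustration and are not from anyone's actual code. The first version calls a helper, which the compiler may or may not inline; the second has the helper's body pasted in braces at each call site, so there is nothing left to call.

```c
/* Hypothetical helper; the name is illustrative only. */
static int square(int x)
{
    return x * x;
}

/* Version 1: ordinary calls. Whether square() gets inlined is up to the compiler. */
int sum_sq_call(int a, int b)
{
    return square(a) + square(b);
}

/* Version 2: the helper's body pasted by hand, in braces, at each call site. */
int sum_sq_manual(int a, int b)
{
    int ra, rb;
    { int x = a; ra = x * x; }
    { int x = b; rb = x * x; }
    return ra + rb;
}
```

Both versions return the same value for the same arguments; the only difference is that the second one cannot contain a call to the helper, no matter what the optimizer decides.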

Even if you don't put in an inline keyword, you will find that gcc on x86 will "inline" functions you didn't ask it to if you compile your program with -O3. Furthermore, I suspect you are assuming that the generated x86 code is the final word on what the processor actually does. The problem with this is that the x86 instructions you are looking at actually get translated into micro-ops inside the processor. For example, the Pentium 4 has more physical registers than you can explicitly reference through 32-bit instructions, and it uses register renaming to compensate for the limitations of the x86 ISA. Just because you see push and pop instructions in your out.s doesn't mean it actually happens like that. It's complicated and irritating, which is why many realtime people stick with 386s: far less of this stuff goes on behind your back.
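If you want to see the -O3 behaviour for yourself, something like the following C sketch should do. The function names are hypothetical, and whether gcc really inlines the helper depends on the gcc version and its heuristics, so treat this as an experiment rather than a guarantee.

```c
/* No inline keyword anywhere in this file. */
static int scale(int x, int factor)
{
    return x * factor;
}

/* Calls scale() in a loop; a natural candidate for the optimizer to inline. */
int scale_all(const int *v, int n, int factor)
{
    int total = 0;
    int i;
    for (i = 0; i < n; i++)
        total += scale(v[i], factor);
    return total;
}
```

Compile it with gcc -O3 -S and look through the generated .s file for a call to scale. I would expect there to be none at -O3, and one at -O0, but the compiler is free to decide otherwise.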

Note that gcc doesn't use nearly as many fancy optimizations as icc or suncc (which amusingly enough seems to beat icc with matrix multiplies for me).

Also note that I am not one of those people who believe the compiler is better at optimizing than a human who knows what they are doing. Only compiler researchers and Java/.NET programmers believe this :P

I guess it depends on what you define "PC" as. I consider my dual-processor POWER machine here to be a PC, but you don't need to go that far. Is an Itanium workstation a PC? What about my old Transmeta laptop? Or amd64 running 64-bit Windows or Linux? All of these have more registers to work with than 32-bit x86.