Board index » cppbuilder » Re: inlining functions

Re: inlining functions


2008-05-20 05:39:27 AM
cppbuilder58
"Mike King" < XXXX@XXXXX.COM >writes:
Quote
class Test {
unsigned int buf[4];
public:
// I know if sz>4 then a memory access exception will occurr (this isn't
production code)
inline unsigned int Comp ( unsigned int sz ) {
unsigned int ret=0;
switch (sz) {
default: do { ret |= buf[--sz]; } while(sz);
case 4: ret |= buf[3];
case 3: ret |= buf[2];
case 2: ret |= buf[1];
ret |= buf[0]; // sz will never be <=1
}
Still, why have the default case if it's certain to be an error? When
it finishes, then cases 4,3,2, and the invisible case 1 will all be
re-applied anyway.
Quote
>Also, in the past some code I'm aware of was found to be more efficiently
>generated if it was coded as *(buf+n) rather than using buf[n]...

How did you find out that it was more efficient?
I remember hearing long ago that incrementing a pointer was faster
than indexing into an array, but optimizers should have improved
beyond that stage years ago. In fact, it was probably like 10 years
ago when I heard from a friend who looked into generated code that the
pointer/indexing code was 100% identical. (But he was using MS's
compiler, and I've never been motivated to see how other compilers do
it.)
But as optimizers go, I'm pretty impressed with the one in g++ 4.x. I
discovered that code like this:
std::string s;
//...
s += "abcdefg";
has a pretty awesome optimization. I expected the operator += to do a
strlen on the string literal, then append n bytes. I was astounded to
see that the generated code was equivalent to this:
s.append("abcdefg", 7);
Apparently,
1) operator+=(char const * str) calls append(str)
2) append(char const * str) figures the length, then forwards it on:
size_t len = strlen(str);
append(str, len);
3) append(str,len) implements the real work.
The inliner was smart enough to trace through the code and know that
operator+= called append() with a string literal, and then the call to
strlen on the string literal was recognized as something that could be
done at compile time, and that value was inlined, and then the call to
append(str, 7) was inlined as well.
I had no idea that compilers would trace string literals through char*
parameter arguments. (In g++ 3.x series, it didn't.)
Anyway, that was my most recent moment of awe. :)
--
Chris (TeamB);
 
 

Re:Re: inlining functions

dhoke wrote:
Quote
>>template <unsigned int T>
>>struct Test {
>>unsigned int buf[T];
>>public:
>>inline unsigned int Comp ( unsigned int sz = T ) {
>>unsigned int ret=0;
>>switch (sz) {
>>case 4: ret |= buf[3];
>>case 3: ret |= buf[2];
>>case 2: ret |= buf[1];
>>case 1: ret |= buf[0]; break;
>>default: return buf[sz-1] | Comp(sz-1);
>
>This is incorrect for Test<4>.
>default: is an error condition.

How is it incorrect for Test<4>??? I can see it being incorrect for a call
of Comp(0), and I can see that Test< (with n>4)>probably would keep it
from inlining most anywhere... but incorrect for Test<4>how, in what sense?
Incorrect in that you should handle errors instead of throwing
(or worse, not throwing) memory access errors.
For Test<4>, buf is buf[4], so when you get to the default:
case ( <=0 or>=5 ), you are accessing beyond the bounds of buf.
 

Re:Re: inlining functions

"Mike King" < XXXX@XXXXX.COM >wrote in message
Quote
>This does not seem to me like it would generate very efficient code
>unless Borland is unrolling loops... But give the possible variance of
>"sz", I think you'd have a lot more jumps than Mike's original switch()
>falling thru the various cases... Although in his switch, I would've
>just returned from the case 0, rather than break'ing and then
>return'ing...

how about this?

class Test {
unsigned int buf[4];
public:
// I know if sz>4 then a memory access exception will occurr (this isn't
production code)
inline unsigned int Comp ( unsigned int sz ) {
unsigned int ret=0;
You hope the compiler does it, but doesn't hurt to tell it just in case...
register unsigned int ret=0 ;
Quote
switch (sz) {
default: do { ret |= buf[--sz]; } while(sz);
That still has the loop (and thus jump, unless unrolled), and doesn't matter
if you're only going to have 2-4 anyway...
But you would need a break on this (default) case unless you otherwise
adjust size and still fall thru the other four cases...
Quote
case 4: ret |= buf[3];
case 3: ret |= buf[2];
case 2: ret |= buf[1];
ret |= buf[0]; // sz will never be <=1
and might change this to simply:
case 2: return ret | buf[1] | buf[2] ;
Quote
}
return ret;
}
};

>Also, in the past some code I'm aware of was found to be more efficiently
>generated if it was coded as *(buf+n) rather than using buf[n]...

How did you find out that it was more efficient?
Well I'm taking someone's word for it, but I believe they actually tried it
both ways in a particular situation, looking at the generated code, and
timing it. That would have been done with BCB6 or earlier, but when I
mentioned this more recently in one of these groups, someone else seemed to
concur.
Quote

 

{smallsort}

Re:Re: inlining functions

<Bob Gonder>wrote in message
Quote
dhoke wrote:

>>>template <unsigned int T>
>>>struct Test {
>>>unsigned int buf[T];
>>>public:
>>>inline unsigned int Comp ( unsigned int sz = T ) {
>>>unsigned int ret=0;
>>>switch (sz) {
>>>case 4: ret |= buf[3];
>>>case 3: ret |= buf[2];
>>>case 2: ret |= buf[1];
>>>case 1: ret |= buf[0]; break;
>>>default: return buf[sz-1] | Comp(sz-1);
>>
>>This is incorrect for Test<4>.
>>default: is an error condition.
>
>How is it incorrect for Test<4>??? I can see it being incorrect for a
>call
>of Comp(0), and I can see that Test< (with n>4)>probably would keep it
>from inlining most anywhere... but incorrect for Test<4>how, in what
>sense?

Incorrect in that you should handle errors instead of throwing
(or worse, not throwing) memory access errors.
For Test<4>, buf is buf[4], so when you get to the default:
case ( <=0 or>=5 ), you are accessing beyond the bounds of buf.
But you're only going to get to default for Comp(sz) when sz<=0, which I did
and do acknowledge would be an error...
But I don't see it for Test<4>, Comp(4)...
With sz == 4, 3, 2, or 1, you'll take one of those branches, break and
return...
With initial sz>4, you will hit default until sz == 4, on which recursion
will then hit the other cases, which will not ever reach the default as they
will take case 4: and fall thru, breaking... ??? I think...
Quote


 

Re:Re: inlining functions

"dhoke" < XXXX@XXXXX.COM >wrote in message
Quote


and might change this to simply:
case 2: return ret | buf[1] | buf[2] ;
duh - edit as required:
case 2: return ret | buf[1] | buf[0] ;
 

Re:Re: inlining functions

dhoke wrote:
Quote
>>>>template <unsigned int T>
>>>>struct Test {
>>>>unsigned int buf[T];
>>>>public:
>>>>inline unsigned int Comp ( unsigned int sz = T ) {
>>>>unsigned int ret=0;
>>>>switch (sz) {
>>>>case 4: ret |= buf[3];
>>>>case 3: ret |= buf[2];
>>>>case 2: ret |= buf[1];
>>>>case 1: ret |= buf[0]; break;
>>>>default: return buf[sz-1] | Comp(sz-1);
>>>
>>>This is incorrect for Test<4>.
>>>default: is an error condition.
>>
With initial sz>4, you will hit default until sz == 4, on which recursion
will then hit the other cases, which will not ever reach the default as they
will take case 4: and fall thru, breaking... ??? I think...
Take another look at sz=5:
default: return buf[4] | Comp(4);
buf[4] does not exist for a Test<4>.
default: should either
1) return 0; //ignore error return minimum
2) return Comp(T); // ignore error, return max
3) return -1; // error code -1 not likely result
4) throw. // error state
 

Re:Re: inlining functions

<Bob Gonder>wrote in message
Quote

Take another look at sz=5:
default: return buf[4] | Comp(4);
buf[4] does not exist for a Test<4>.
Won't argue with that, I just wasn't thinking of calling Comp(5) for an item
defined as Test<4>