
Re: [BUG] bcc32 - optimizing error with inline function


2005-06-13 06:09:06 PM
cppbuilder17
Bob Gonder < XXXX@XXXXX.COM >wrote:
Quote
I always assumed int would go to 64 bit, as it was promoted from 16 to
32, and long would remain 32 where it has been.
If so, it wouldn't be a C or C++ compiler.
(1 == sizeof(char) <= sizeof(short) <= sizeof(int) <= sizeof(long))
Ideally, int would go to 64 if that was the natural word size on the
system (i.e. the fastest integral type). On the other hand, it would
finally open up an inability to name certain sized integers.
Alan Bellingham
--
ACCU Conference 2006 - 19-22 April, Randolph Hotel, Oxford, UK
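
The chain quoted in this post is mathematical shorthand. Written literally in C it would parse as nested comparisons of 0/1 results, so a check has to spell it out with &&. Here is a minimal sketch for seeing what a particular compiler does; the function names are made up for illustration. As far as I know, the C standard states the ordering in terms of value ranges rather than sizeof, so this reports on one implementation rather than proving a guarantee:

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: returns 1 if the size chain holds for the
   compiler building this file, 0 otherwise. */
int size_chain_holds(void)
{
    return sizeof(char) == 1
        && sizeof(char)  <= sizeof(short)
        && sizeof(short) <= sizeof(int)
        && sizeof(int)   <= sizeof(long);
}

/* Aborts via assert if the chain, or the minimum widths the standard
   requires for int (16 bits) and long (32 bits), do not hold. */
void require_size_chain(void)
{
    assert(size_chain_holds());
    assert(sizeof(int)  * CHAR_BIT >= 16);
    assert(sizeof(long) * CHAR_BIT >= 32);
}
```

Calling require_size_chain() once at start-up is enough to document the assumption in code that depends on it.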
 
 

Re:Re: [BUG] bcc32 - optimizing error with inline function

Alan Bellingham wrote:
Quote
Bob Gonder wrote:

If so, it wouldn't be a C or C++ compiler.

(1 == sizeof(char) <= sizeof(short) <= sizeof(int) <= sizeof(long))
Did not realize there was a guarantee int <= long
Some bureaucratic committee must have dreamed that up.
Quote
Ideally, int would go to 64 if that was the natural word size on the
system (i.e. the fastest integral type).
Agreed.
Quote
On the other hand, it would
finally open up an inability to name certain sized integers.
Which is why long should remain 32 as it has (forever)?
I seem to remember this topic was discussed before.
But I can't find(google) it.
There has to be 64bit compilers out by now (Gnu?)...what do they use?
 

Re:Re: [BUG] bcc32 - optimizing error with inline function

Bob Gonder wrote:
Quote
dazbee wrote:

>(my example code is stripped for reporting purposes only
>- it bears little resemblance to the code in use)

That makes it a bit difficult to help, then.

?
It encapsulates the problem. If the compiler was being
misdirected by any of the statements/declarations, I hope
I could then weave the correction into the actual code.
Quote
>[...] but the actual code is
>a nest of macros which don't reduce so well.
>
>if (FIXNUM_P(a) && FIXNUM_P(b)) ...
>
>#define FIXNUM_P(f) (((long)(f))&FIXNUM_FLAG)

Reduces quite easily...
(cast) depends on size of FIXNUM_FLAG

if( (BYTE)a & (BYTE)b & FIXNUM_FLAG )
Pretty fast, too.

That bypasses the FIXNUM_P macro.
If we're testing for a FIXNUM using a macro throughout,
we can't suddenly not use it here and there without
losing integrity.
But I tried it ...
if ((unsigned char)a & (unsigned char)b & 1) {
... and it didn't have any influence on the broken pointer.
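
For anyone following along, a small sketch of why the merged test and the two macro calls give the same answer. It assumes FIXNUM_FLAG is 0x01 (the thread never states its value; the "& 1" above suggests it) and that VALUE is unsigned long. The function names are illustrative only:

```c
#include <assert.h>

typedef unsigned long VALUE;      /* VALUE is unsigned long in daz's code */
#define FIXNUM_FLAG 0x01          /* assumed value; not given in the thread */
#define FIXNUM_P(f) (((long)(f)) & FIXNUM_FLAG)

/* The form used throughout the code base: one macro call per operand. */
int both_fixnum_macro(VALUE a, VALUE b)
{
    return FIXNUM_P(a) && FIXNUM_P(b);
}

/* The merged form: one AND chain, valid because the flag is a single bit. */
int both_fixnum_merged(VALUE a, VALUE b)
{
    return ((unsigned char)a & (unsigned char)b & FIXNUM_FLAG) != 0;
}

/* Compares the two forms over every combination of the low two bits. */
void check_forms_agree(void)
{
    VALUE a, b;
    for (a = 0; a < 4; a++)
        for (b = 0; b < 4; b++)
            assert(both_fixnum_macro(a, b) == both_fixnum_merged(a, b));
}
```

The merged form only works while the flag occupies one bit; with a multi-bit flag, a & b & FLAG could be non-zero when neither operand carries the full flag.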
Quote
>Is there something wrong with the code apart from style ?
>If not, it's a bug, right ?
>Then we need to go with something like:
>
>#if __BORLANDC__ == 0x551
>if (TYPE(a) == T_STRING) if (TYPE(b) == T_STRING) {
>#else
>if (TYPE(a) == T_STRING && TYPE(b) == T_STRING) {
>#endif

Why not just use the first as they are the same?
<private grin>
I can prove that I thought of that (post from last week):
blade.nagaokaut.ac.jp/cgi-bin/vframe.rb/ruby/ruby-core/5152?4925-5195+split-mode-vertical
The bccwin build team are not seeing the problem because
bcc5.5 has a default switch of /v (debugging) which
causes out-of-line expansions (inlines are not inlined -
they're call'd). Seems like an odd default, IMHO.
I'm not sure they're aware of that, yet.
I can understand their reluctance to replace a valid
line with a bogus one just to keep one user happy.
You know, someone might look at it and say:
"Whoa - doesn't this guy know what && is for?"
In the absence of an elegant solution, an #if with
a comment may be more acceptable but I expect I'll
just get ignored. I think I'd do the same.
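
A sketch of what the commented #if might look like in isolation. TYPE and T_STRING here are simplified placeholders, not the real macros, and both_strings is an invented wrapper; only the two if lines and the 0x551 version value come from the post. In standard C the two spellings mean the same thing, which is why the nested one is only needed for the affected compiler:

```c
#include <assert.h>

#define T_STRING 7               /* placeholder tag value */
#define TYPE(x) ((x) & 0x3f)     /* placeholder; the real macro is more involved */

/* Returns 1 when both operands are tagged as strings, else 0. */
int both_strings(unsigned long a, unsigned long b)
{
#if defined(__BORLANDC__) && __BORLANDC__ == 0x551
    /* Nested form: works around the optimizer problem in this compiler. */
    if (TYPE(a) == T_STRING) if (TYPE(b) == T_STRING) {
        return 1;
    }
#else
    /* Ordinary short-circuit form for every other compiler. */
    if (TYPE(a) == T_STRING && TYPE(b) == T_STRING) {
        return 1;
    }
#endif
    return 0;
}

/* Quick self-check that either branch behaves like a logical AND. */
void check_both_strings(void)
{
    assert(both_strings(7, 7) == 1);
    assert(both_strings(7, 1) == 0);
    assert(both_strings(1, 7) == 0);
}
```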
Quote

Yes, it's a bug. The optimizer does some odd things.
At times it's a bit too aggressive. (Like this one)
Thanks; a very fine compiler it is, though.
Thanks also to those who looked and would have posted
if they could have offered a way round this.
It's great that you're all here in times of need :-)
Cheers,
daz
 


Re:Re: [BUG] bcc32 - optimizing error with inline function

Bob Gonder < XXXX@XXXXX.COM >wrote:
Quote
Did not realize there was a guarantee int <= long
Some bureaucratic committee must have dreamed that up.
It's a guarantee as to what things you may rely on.
Quote
>Ideally, int would go to 64 if that was the natural word size on the
>system (i.e. the fastest integral type).

Agreed.

>On the other hand, it would
>finally open up an inability to name certain sized integers.

Which is why long should remain 32 as it has (forever)?
Err, no. Why do you have this strange assumption that a long is 32 bits?
No, if you want something /whose size is guaranteed/, then you want to
state what the size is. e.g, intt_8, uintt_32.
The guarantee in the standard is effectively that long is the longest
type available, and that int is the fastest type available. Those are
valuable guarantees, and I for one would not want to see them being
broken.
Alan Bellingham
--
Me <url:mailto: XXXX@XXXXX.COM ><url:www.doughnut.demon.co.uk/>
ACCU - C, C++ and Java programming <url:accu.org/>
The 2004 Discworld Convention <url:dwcon.org/>
 

Re:Re: [BUG] bcc32 - optimizing error with inline function

Bob Gonder wrote:
Quote
Alan Bellingham wrote:
>
>If so, it wouldn't be a C or C++ compiler.
>
>(1 == sizeof(char) <= sizeof(short) <= sizeof(int) <= sizeof(long))

Did not realize there was a guarantee int <= long
Some bureaucratic committee must have dreamed that up.

>Ideally, int would go to 64 if that was the natural word size on the
>system (i.e. the fastest integral type).

Agreed.

>On the other hand, it would
>finally open up an inability to name certain sized integers.

Which is why long should remain 32 as it has (forever)?

I seem to remember this topic was discussed before.
But I can't find(google) it.

There has to be 64bit compilers out by now (Gnu?)...what do they use?

They'll probably try to follow a standard.
There's a useful table ( ILP32 Versus LP64 ) here:
www.devx.com/cplus/Article/27510/1954
which shows promotions for pointer, 'long' and 'long double'
'int' remains 32-bit
daz
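
Which model a given compiler uses can be read off from three sizeof values. A rough sketch: the enum and function names are my own labels, and the classification assumes 8-bit bytes. LLP64 is not part of the table as summarised in this post; it is included because 64-bit Windows compilers keep long at 32 bits:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

enum data_model { MODEL_ILP32, MODEL_LP64, MODEL_LLP64, MODEL_OTHER };

/* Classifies a data model from the sizes (in bytes) of int, long and
   a pointer. Anything unrecognised is reported as MODEL_OTHER. */
enum data_model classify(size_t int_size, size_t long_size, size_t ptr_size)
{
    if (int_size == 4 && long_size == 4 && ptr_size == 4) return MODEL_ILP32;
    if (int_size == 4 && long_size == 8 && ptr_size == 8) return MODEL_LP64;
    if (int_size == 4 && long_size == 4 && ptr_size == 8) return MODEL_LLP64;
    return MODEL_OTHER;
}

/* Data model of the compiler that builds this file. */
enum data_model this_data_model(void)
{
    assert(CHAR_BIT == 8);   /* the byte counts above assume octets */
    return classify(sizeof(int), sizeof(long), sizeof(void *));
}
```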
 

Re:Re: [BUG] bcc32 - optimizing error with inline function

"dazbee" < XXXX@XXXXX.COM >wrote:
Quote
They'll probably try to follow a standard.

There's a useful table ( ILP32 Versus LP64 ) here:
www.devx.com/cplus/Article/27510/1954

which shows promotions for pointer, 'long' and 'long double'

'int' remains 32-bit
Ah, thanks.
(I wonder whether there /is/ any speed difference between the 32 and 64
bit types.)
And it was int8_t, uint32_t, etc. that I was thinking of.
Alan Bellingham
--
Me <url:mailto: XXXX@XXXXX.COM ><url:www.doughnut.demon.co.uk/>
ACCU - C, C++ and Java programming <url:accu.org/>
The 2004 Discworld Convention <url:dwcon.org/>
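
For reference, those names come from C99's <stdint.h>. A short sketch, assuming a compiler that ships that header (whether the bcc32 discussed here does is not something this thread establishes, so treat it as an assumption). The function names are illustrative:

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Aborts via assert if an exact-width type does not have the width its
   name promises. On a conforming implementation this never fires. */
void require_exact_widths(void)
{
    assert(sizeof(int8_t)   * CHAR_BIT == 8);
    assert(sizeof(uint16_t) * CHAR_BIT == 16);
    assert(sizeof(uint32_t) * CHAR_BIT == 32);
    assert(sizeof(int64_t)  * CHAR_BIT == 64);
}

/* Unsigned 32-bit addition wraps modulo 2^32 whatever size int and
   long happen to be on the target. */
uint32_t wrap_add32(uint32_t a, uint32_t b)
{
    return (uint32_t)(a + b);
}
```

The point is that code needing a particular width names it directly and leaves int and long free to follow the platform.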
 

Re:Re: [BUG] bcc32 - optimizing error with inline function

daz,
On Sat, 11 Jun 2005 22:17:43 +0100, "dazbee" < XXXX@XXXXX.COM >
wrote:
Quote

Bob Gonder wrote:
>Thomas Maeder [TeamB] wrote:
>
>>obj &= 0xff;
>>if (obj==5) return 100;
>>return ((struct st_type*)(obj))->flags & 0x3f;
>>
>>and not
>>
>>>>if ((obj & 0xff)==5) return 100;
>>>>return ((struct st_type*)(obj))->flags & 0x3f;
[snip]
It's a speed issue.
[snip]
You said you've come up with some workarounds. Here are four
suggestions, at the risk of duplicating what you know already:
1. Turn the optimization off just for the in_type function:
#pragma option -Od
__inline int in_type(VALUE obj)
{
    if ((obj & 0xff)==5) return 100;
    return ((struct st_type*)(obj))->flags & 0x3f;
}
#pragma option -O2
2. Cast the constant 0xff:
__inline int in_type(VALUE obj)
{
    if ((obj & (VALUE)0xff)==5) return 100;
    return ((struct st_type*)(obj))->flags & 0x3f;
}
3. Use a pointer to obj instead of a copy of obj (or C's version of
using an argument by reference instead of by value) with the calls
adjusted accordingly (in_type(&a), in_type(&b)):
__inline int in_type(VALUE *obj)
{
    if ((*obj & 0xff)==5) return 100;
    return ((struct st_type*)(*obj))->flags & 0x3f;
}
4. Advance the cast to st_type*:
__inline int in_type(VALUE obj)
{
    struct st_type *stobj = (struct st_type*)(obj);
    if ((obj & 0xff)==5) return 100;
    return stobj->flags & 0x3f;
}
This last one just grabs the value of obj before it can get
mangled by and-ing with 0xFF.
By the way, the successful output on my machine is
Success ! (version 551)
All of this is done with BCB5 Patch 1, operating from the command
line.
It may be worthwhile to remember that traditionally optimization in/by
compilers is always accompanied by compromises, sacrificing a subtle
thing here or there that is legal within the ANSI (or other relevant)
standard. In this case, the failure may be unintentional rather than
deliberate.
Cheers, Jochen
hjtrost at microfab dot com
Nil nimium studeo, Caesar, tibi velle placere,
nec scire ut an sis albus an ater homo.
Catullus
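
For anyone wanting to try workaround 4 outside the original program, here is a compact stand-alone version. It is a sketch, not daz's test case: the layout of struct st_type is guessed (only a flags member is known from the thread), VALUE is declared as uintptr_t from C99's <stdint.h> instead of unsigned long so the conversion also holds where long is narrower than a pointer, and in_type_of is an invented convenience wrapper:

```c
#include <assert.h>
#include <stdint.h>

typedef uintptr_t VALUE;          /* assumption: the thread uses unsigned long */

struct st_type {                  /* guessed layout; only 'flags' is known */
    unsigned long flags;
};

/* Workaround 4: form the pointer first, so it no longer depends on what
   the optimizer does to obj while the tag test is evaluated. The pointer
   is only dereferenced when obj is not a tagged immediate. */
int in_type(VALUE obj)
{
    struct st_type *stobj = (struct st_type *)obj;
    if ((obj & 0xff) == 5) return 100;
    return (int)(stobj->flags & 0x3f);
}

/* Convenience wrapper for passing a real object. */
int in_type_of(struct st_type *p)
{
    assert(p != 0);
    return in_type((VALUE)p);
}
```

A real struct st_type object is aligned to at least the alignment of unsigned long, so its address can never have a low byte of 5, which is what lets the tag test tell the two cases apart.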
 

Re:Re: [BUG] bcc32 - optimizing error with inline function

Alan Bellingham wrote:
Quote
>There's a useful table ( ILP32 Versus LP64 ) here:
>www.devx.com/cplus/Article/27510/1954
Nice.
Quote
(I wonder whether there /is/ any speed difference between the 32 and 64
bit types.)
If there were, I'd expect it to show up in address[long][int]
calculations where the promotion might take an extra clock.
(And, here I've been using for(int x;;) when I didn't really care
about x, but used it for indexing. Shoulda been using long, but long
indexes are wrong for 16bit code. That may be why I thought int should
move up: It's been the index of choice, now not so much.)
Quote
And it was int8_t, uint32_t, etc. that I was thinking of.
That explains why I'd never seen , intt_8, uintt_32. before....
 

Re:Re: [BUG] bcc32 - optimizing error with inline function

Hans-Jochen Trost wrote:
Quote

daz,
Jochen !, ... I was wondering where you'd been :-)
(Sorry - I was close to thinking that the thread had dried up)
Quote

You said you've come up with some workarounds. Here are four
suggestions, at the risk of duplicating what you know already:

1. Turn the optimization off just for the in_type function:
Nice idea ! I never thought to wrap the inline.
I wanted to wrap the problem statement in sort2 with a #pragma
but read that /O could be changed only between functions -
which makes sense now I know that a function is the fundamental
unit of optimisation.
Some points arising: -
a) This solution generates equivalent code to that
generated by wrapping sort2 with the same #pragmas. Hmm.
b) Using this pair instead:
#pragma option push /Od
#pragma option pop
around sort2 has the same action as a), but
around the __inline has no effect at all.
I sense you may have tried push/pop and discovered the same.
Quote

2. Cast the constant 0xff:

__inline int in_type(VALUE obj)
{
if ((obj & (VALUE)0xff)==5) return 100;
return ((struct st_type*)(obj))->flags & 0x3f;
}

I see no effect from this. Outputs are identical.
Did you leave your #pragmas in - or did I screw up ?
Quote
3. Use a pointer to obj instead of a copy of obj (or C's version of
using an argument by reference instead of by value) with the calls
adjusted accordingly (in_type(&a), in_type(&b)):

__inline int in_type(VALUE *obj)
{
if ((*obj & 0xff)==5) return 100;
return ((struct st_type*)(*obj))->flags & 0x3f;
}

Interesting. Good stuff. Hope someone is looking at the
.asm from these alternatives. Enormous variation.
This one *totally* wouldn't be accepted, though :-))
(several hundred macro arguments involved)
Quote
4. Advance the cast to st_type*:

__inline int in_type(VALUE obj)
{
struct st_type *stobj = (struct st_type*)(obj) ;
if ((obj & 0xff)==5) return 100;
return stobj->flags & 0x3f;
}

This last one just grabs the value of obj before it may still get
mangled by and-ing with 0xFF.

Ooh, you were doing so well and now you resort to cheating :-))
Actually there's not much overhead here when the intermediate
variable is assigned a register.
Here's a clip of the .asm (before ->after) files -
www.d10.karoo.net/misc/bcc_asm.gif
(nothing else on that site, btw)
Probably the best of your submissions - and no duplication of my
experimenting -- I don't recall trying any of those.
-------------
Another "fix" I forgot to mention is:
volatile VALUE a = *ap, b = *bp; // at top of sort2
('volatile' prevents use of register variables -
- designed for another purpose, but usable here -
- folks, you should've told me that, even though I knew ;-)
-------------
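
A sketch of that declaration in context. The real sort2 is not shown in this thread, so the body below is invented purely to give the volatile locals something to do; only the declaration line comes from the post. VALUE, struct st_type and in_type are simplified stand-ins, with VALUE declared as C99's uintptr_t rather than unsigned long:

```c
#include <assert.h>
#include <stdint.h>

typedef uintptr_t VALUE;                  /* stand-in for unsigned long */
struct st_type { unsigned long flags; };  /* guessed layout */

static int in_type(VALUE obj)
{
    if ((obj & 0xff) == 5) return 100;
    return (int)(((struct st_type *)obj)->flags & 0x3f);
}

/* Invented body: compares two values by their type code and returns the
   difference, in the style of a comparison callback. */
int sort2(const VALUE *ap, const VALUE *bp)
{
    assert(ap != 0 && bp != 0);
    /* volatile obliges the compiler to perform every read of a and b as
       a real access, so a masked copy cannot be reused in their place. */
    volatile VALUE a = *ap, b = *bp;
    return in_type(a) - in_type(b);
}
```

The cost is that every use of a and b becomes a memory access, which matters if sort2 sits on a hot path.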
Quote
It may be worthwhile to remember that traditionally optimization in/by
compilers is always accompanied by compromises, sacrificing a subtle
thing here or there that is legal within the ANSI (or other relevant)
standard. In this case, the failure may be unintentional rather than
deliberate.

I'll remember, and I'm sure it's unintentional and it isn't one of
those annoying bugs because there are a good few workarounds ...
... even more now :-).
Quote
Cheers, Jochen
Thank you for replying - with crystal clarity.
Kept me so engaged, I forgot to go to bed !
daz
 

Re:Re: [BUG] bcc32 - optimizing error with inline function

daz
On Tue, 14 Jun 2005 08:44:33 +0100, "dazbee" < XXXX@XXXXX.COM >
wrote:
[snip]
Quote
Jochen !, ... I was wondering where you'd been :-)

(Sorry - I was close to thinking that the thread had dried up)
... just hanging out here waiting for a sensible answer in the
graphics forum (hint, hint) ...
[snip]
Quote
>1. Turn the optimization off just for the in_type function:

Nice idea ! I never thought to wrap the inline.
I wanted to wrap the problem statement in sort2 with a #pragma
but read that /O could be changed only between functions -
which makes sense now I know that a function is the fundamental
unit of optimisation.

Some points arising: -
a) This solution generates equivalent code to that
generated by wrapping sort2 with the same #pragmas. Hmm.

b) Using this pair instead:
#pragma option push /Od
#pragma option pop
around sort2 has the same action as a), but
around the __inline has no effect at all.
I sense you may have tried push/pop and discovered the same.
No, I did not try push/pop. I was brash enough to simply assume that
you would never dare to compile without /O2, so I took the straight
route of disabling optimization and forcing /O2 on you at the end.
Also, as the (obj & 0xff) piece in in_type seems to be at the core of
all your trouble, I feel that messing with sort2 is not exactly
desirable. The best solutions for you will act strictly on or in
in_type with no apparent side effects anywhere else, outside of the
bug being fixed, of course.
Quote
>2. Cast the constant 0xff:
>
>__inline int in_type(VALUE obj)
>{
>if ((obj & (VALUE)0xff)==5) return 100;
>return ((struct st_type*)(obj))->flags & 0x3f;
>}
>

I see no effect from this. Outputs are identical.
Did you leave your #pragmas in - or did I screw up ?
I screwed up. I just tried it again and it failed, so I must indeed
have left those pragmas in.
Quote
>3. Use a pointer to obj instead of a copy of obj (or C's version of
>using an argument by reference instead of by value) with the calls
>adjusted accordingly (in_type(&a), in_type(&b)):
>
>__inline int in_type(VALUE *obj)
>{
>if ((*obj & 0xff)==5) return 100;
>return ((struct st_type*)(*obj))->flags & 0x3f;
>}
>

Interesting. Good stuff. Hope someone is looking at the
.asm from these alternatives. Enormous variation.
I haven't looked at the assembler as I am not really conversant in it.
My serious close encounters with assembler occurred way back in times
of yore, on a Telefunken TR440 in the early 1970s and IBM mainframes
in the late 70s and first half of the 80s.
Quote
This one *totally* wouldn't be accepted, though :-))
(several hundred macro arguments involved)
Can't tell that from your reduced demo example, of course ;-) But
this is exactly the reason to try to find different workarounds. The
changes outside of in_type are conceptually an undesirable idea, just
like wrapping sort2 above instead of in_type where the bug really is.
Quote
>4. Advance the cast to st_type*:
>
>__inline int in_type(VALUE obj)
>{
>struct st_type *stobj = (struct st_type*)(obj) ;
>if ((obj & 0xff)==5) return 100;
>return stobj->flags & 0x3f;
>}
>
>This last one just grabs the value of obj before it may still get
>mangled by and-ing with 0xFF.
>

Ooh, you were doing so well and now you resort to cheating :-))
Hey, anything that gets the job done will do, won't it? Also, if the
function is extended with a "real" use of obj as an integer after this
detrimental and-ing, the optimizer might be coaxed into not
misbehaving - remember the senseless add-in lines that you found to do
the trick, too. By the way, I have seen source code (Fortran, not C)
way back when where this was done also - and there was a comment
nearby explaining that a silly useless instruction was there only to
coax the optimizer or other compiler section into getting things
right.
Quote
Actually there's not much overhead here when the intermediate
variable is assigned a register.
Here's a clip of the .asm (before ->after) files -
www.d10.karoo.net/misc/bcc_asm.gif
(nothing else on that site, btw)

Probably the best of your submissions - and no duplication of my
experimenting -- I don't recall trying any of those.

-------------

Another "fix" I forgot to mention is:
volatile VALUE a = *ap, b = *bp; // at top of sort2

('volatile' prevents use of register variables -
- designed for another purpose, but usable here -
- folks, you should've told me that, even though I knew ;-)
Got you moving again, did I? Another fix away from the bug, though.
Quote
>It may be worthwhile to remember that traditionally optimization in/by
>compilers is always accompanied by compromises, sacrificing a subtle
>thing here or there that is legal within the ANSI (or other relevant)
>standard. In this case, the failure may be unintentional rather than
>deliberate.
>

I'll remember, and I'm sure it's unintentional and it isn't one of
those annoying bugs because there are a good few workarounds ...
... even more now :-).


>Cheers, Jochen

Thank you for replying - with crystal clarity.
Kept me so engaged, I forgot to go to bed !
This looked like a good instructive exercise, so I considered it
advanced training on the job, and did it right there =8^O
Cheers, Jochen
hjtrost at microfab dot com
Nil nimium studeo, Caesar, tibi velle placere,
nec scire ut an sis albus an ater homo.
Catullus
 

Re:Re: [BUG] bcc32 - optimizing error with inline function

Hans-Jochen Trost wrote:
Quote
On Tue, 14 Jun 2005 08:44:33 +0100, "dazbee" wrote:
>[...]
>Interesting. Good stuff. Hope someone is looking at the
>.asm from these alternatives. Enormous variation.

I haven't looked at the assembler as I am not really conversant in it.

Ketman - TUTOR86 - 80K (Console app)
www.btinternet.com/~btketman/tutpage.html
(Assembler interpreter !)
Quote
My serious close encounters with assembler occurred way back in times
of yore, on a Telefunken TR440 in the early 1970s and IBM mainframes
in the late 70s and first half of the 80s.
Sys Prog ... IBM mainframes ... mid 70s, early 80s
(Shall resist talking about the days of yore ;)
Quote
Got you moving again, did I?
Vroooosh -------->>>>>>>>>>:-)))
Hope you get your graphics answer.
Quote
Cheers, Jochen
Cheers to you,
daz
 

Re:Re: [BUG] bcc32 - optimizing error with inline function

dazbee < XXXX@XXXXX.COM >wrote:
Quote
There are several ways to work round this, so I'm
not desperate for a solution, but I wonder if the
team here can pin-point whether there's something
in the C that's really misleading the optimiser
or if it's a plain old bug.
It's a bug --- in your source. You're relying on undefined behaviour,
because you're taking a value that was never (as far as the posted
source shows) the address of an actual object. You cast this to a
pointer, and expect the resulting pointer to be valid.
The compiler cannot possibly have a bug here, because it's undefined
behaviour we're talking about --- and that means whatever it may do,
including mailing a complaint about your code style to your boss, or
exploding your machine, is correct.
The C and C++ languages both offer the concept of a "union". And
guess what---there's a reason for its existence. It's code trying to
do what your code does, but *without* causing undefined behaviour.
--
Hans-Bernhard Broeker ( XXXX@XXXXX.COM )
Even if all the snow were burnt, ashes would remain.
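
One way to read the union suggestion, sketched in C. This is an interpretation rather than anything posted in the thread: a discriminated union stores either an immediate integer or a genuine pointer, and a separate tag records which member is live, so no integer is ever converted to a pointer. All names and the struct layout are illustrative:

```c
#include <assert.h>

struct st_type { unsigned long flags; };   /* guessed layout */

enum value_kind { KIND_IMMEDIATE, KIND_OBJECT };

struct value {
    enum value_kind kind;        /* says which union member is live */
    union {
        long immediate;          /* small integer stored directly */
        struct st_type *object;  /* real pointer, never built from an int */
    } u;
};

struct value make_immediate(long n)
{
    struct value v;
    v.kind = KIND_IMMEDIATE;
    v.u.immediate = n;
    return v;
}

struct value make_object(struct st_type *p)
{
    struct value v;
    assert(p != 0);
    v.kind = KIND_OBJECT;
    v.u.object = p;
    return v;
}

/* Same contract as in_type: 100 for immediates, low flag bits otherwise.
   Only the member that was last written is ever read. */
int in_type_tagged(struct value v)
{
    if (v.kind == KIND_IMMEDIATE) return 100;
    return (int)(v.u.object->flags & 0x3f);
}
```

The price is an extra word per value compared with packing the tag into the low bits of a single integer, and presumably that is why code like the original avoids this layout.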
 

Re:Re: [BUG] bcc32 - optimizing error with inline function

Hans-Bernhard Broeker < XXXX@XXXXX.COM >wrote:
Quote
The C and C++ languages both offer the concept of a "union". And
guess what---there's a reason for its existence. It's code trying to
do what your code does, but *without* causing undefined behaviour.
I defy you to actually use a union in C++ to do this without invoking
undefined behaviour.
Alan Bellingham
--
ACCU Conference 2006 - 19-22 April, Randolph Hotel, Oxford, UK
 

Re:Re: [BUG] bcc32 - optimizing error with inline function

Hans-Bernhard,
On 15 Jun 2005 03:49:44 -0700, Hans-Bernhard Broeker
< XXXX@XXXXX.COM >wrote:
[snip]
Quote
It's a bug --- in your source. You're relying on undefined behaviour,
because you're taking a value that was never (as far as the posted
source shows) the address of an actual object. You cast this to a
pointer, and expect the resulting pointer to be valid.
I disagree. In his main program, daz loads a pointer of type struct
st_type into his VALUE type variable (which is typedef'ed to be
unsigned long):
psa = (VALUE)&sta; psb = (VALUE)&stb ;
and passes these by reference into his sort2 function. There he
prepares explicit copies of the VALUE type data:
VALUE a = *ap, b = *bp ;
and passes these to his in_type function. There he ends up converting
the VALUE data back to the same type of pointer he started off with:
((struct st_type*)(obj))->...
There is nothing undefined in this chain of assignments, and the
strategy is also legal, see Kernighan and Ritchie, "The C Programming
Language," 2nd edition, section A6.6, p. 199:
"A pointer may be converted to an integral type large enough to hold
it; the required size is implementation-dependent. The mapping
function is also implementation-dependent.
"An object of integral type may be explicitly converted to a pointer.
The mapping always carries a sufficiently wide integer converted from
a pointer back to the same pointer, but is otherwise
implementation-dependent."
daz's code adheres to these rules strictly. The Borland compiler
thinks the whole source code does not violate any ANSI standard that
it is able to check.
As the analysis of the assembler code produced by the compiler has
shown, the "obj" argument value gets loaded into a register, and-ed
with 0xFF and due to optimization continues to be used from that
register thereafter. That is the bug, and the compiler's optimization
strategy deserves the blame. Telling it to not optimize just the
in_type function provides a successful work-around.
Cheers, Jochen
hjtrost at microfab dot com
Nil nimium studeo, Caesar, tibi velle placere,
nec scire ut an sis albus an ater homo.
Catullus
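
The chain of conversions described in this post can be replayed in a few lines. A sketch under assumptions: VALUE is declared as uintptr_t (C99) so that "large enough to hold it" is satisfied by construction, whereas unsigned long only satisfies it where long is as wide as a pointer; the struct layout is guessed and the helper names are invented. The conversions go through void * because that is the pointer type the uintptr_t round trip is specified for:

```c
#include <assert.h>
#include <stdint.h>

typedef uintptr_t VALUE;                  /* assumption, see above */
struct st_type { unsigned long flags; };  /* guessed layout */

/* Pointer -> integer, as in main: psa = (VALUE)&sta; */
VALUE to_value(struct st_type *p)
{
    return (VALUE)(void *)p;
}

/* Integer -> pointer, as in in_type. */
struct st_type *from_value(VALUE v)
{
    return (struct st_type *)(void *)v;
}

/* Replays main -> sort2 -> in_type for one object and checks that the
   pointer that comes out is the pointer that went in. Returns 1 on success. */
int round_trip_ok(struct st_type *p)
{
    VALUE stored = to_value(p);               /* main */
    VALUE copy = stored;                      /* sort2: VALUE a = *ap; */
    struct st_type *back = from_value(copy);  /* in_type */
    assert(back == p);
    return back == p;
}
```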