
StrToInt - Embedded Zeroes Issue


2006-10-26 11:33:56 PM
delphi268
Hi
I made a new poll because the other one was closed too early. By mistake, I
had set a very early closing date.
Is it a bug that StrToInt does not handle embedded zeroes in the input
string?
tech.groups.yahoo.com/group/fastcodeproject/surveys
Please help me get many votes in.
Best regards
Dennis Kjaer Christensen
 
 

Re:StrToInt - Embedded Zeroes Issue

"Dennis" <XXXX@XXXXX.COM> wrote
Quote
Is it a bug that StrToInt does not handle embedded zeroes
in the input string?
By "embedded zeroes", I assume that you are referring to
characters with an ord value of zero and sometimes
referred to as null characters.
"Does not handle" is rather vague.
Does StrToInt report an exception?
Does StrToInt assume the end of the string's data?
Does StrToInt ignore the null characters?
Does StrToInt give erroneous output when encountering
the null characters?
HTH to clarify things, JohnH
 

Re:StrToInt - Embedded Zeroes Issue

The RTL StrToInt function is coded as:
function StrToInt(const S: string): Integer;
var
  E: Integer;
begin
  Val(S, Result, E);
  if E <> 0 then ConvertErrorFmt(@SInvalidInteger, [S]);
end;
The Val function called (_ValLong in system.pas) is designed to work with
both AnsiStrings and PChars, and assumes the first #0 character found is the
end of the string.
Given an input string of '0'#0, the call to Val from StrToInt returns a
Result of 0 and an error code (E) of 0, so no exception is raised within
StrToInt.
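One way to check the behaviour described above on a particular compiler is a small console program like the following. It is only a sketch: the program name and the Check helper are made up for illustration, and no output is shown because the result depends on which implementation of Val the compiler's RTL uses.

```pascal
program ValNullDemo; // hypothetical name, for illustration only

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

// Hypothetical helper: calls Val directly and reports what came back,
// so the error code that StrToInt tests can be inspected.
procedure Check(const S: string);
var
  Value, ErrorPos: Integer;
begin
  Val(S, Value, ErrorPos);
  // S itself is not printed, because it may contain #0.
  Writeln('Length=', Length(S), '  Value=', Value, '  ErrorPos=', ErrorPos);
end;

begin
  Check('0'#0);     // the input discussed above
  Check('1'#0'2');  // a #0 between two digits
  Check('12');      // ordinary input, for comparison
end.
```

An ErrorPos of 0 means Val accepted the string, which is the only condition StrToInt checks before returning.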
--
regards,
John
The Fastcode Project:
www.fastcodeproject.org/


 

Re:StrToInt - Embedded Zeroes Issue

"John O'Harrow" <XXXX@XXXXX.COM> wrote
Quote
Given an input string of '0'#0, the call to Val from StrToInt returns a
Result of 0 and an error code (E) of 0, so no exception is raised within
StrToInt.
John O, Thanks. Now I know how to vote. Rgds, JohnH
 

Re:StrToInt - Embedded Zeroes Issue

John Herbster writes:
Quote
"Dennis" <XXXX@XXXXX.COM> wrote
>Is it a bug that StrToInt does not handle embedded zeroes
>in the input string?

By "embedded zeroes", I assume that you are referring to
characters with an ord value of zero and sometimes
referred to as null characters.
I took the wording to mean that the function didn't handle leading zeros:
123 - OK
00123 - not OK.
If this is wrong, please clarify the question and start the poll again.
David
 

Re:StrToInt - Embedded Zeroes Issue

Hi David
My wording in this thread was bad. I assumed that you all had read the
previous thread.
Look at the post from John in this thread and also read the thread
"Fastcode StrToInt32 B&V 0.8.0".
Leading zeroes are handled just fine by the RTL function.
Embedded zeroes means a string such as '1'+#0+'2'.
The RTL function converts this string to the integer 1, because it stops at
the #0, which is the zero terminator in PChars.
I think that '1'+#0+'2' is a valid AnsiString that cannot be converted into
an integer, so StrToInt must raise a conversion error exception.
There are two QC reports on similar issues:
AnsiPos does not handle embedded nulls (like 'Pos')
qc.borland.com/wc/qcmain.aspx
SysUtils.StringReplace - strings containing #0 character handled incorrectly
qc.borland.com/wc/qcmain.aspx
They are both open and the conclusion is that #0 is a valid char in an
AnsiString and treating it as a terminator is a bug.
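A caller who wants the stricter behaviour argued for above does not have to wait for an RTL change. The sketch below wraps StrToInt in a hypothetical helper, StrictStrToInt (not part of the RTL), that scans the full Length(S) for #0 and raises EConvertError before Val ever sees the string. The error message text is also made up for illustration.

```pascal
program StrictStrToIntDemo; // hypothetical name, for illustration only

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical helper, not part of the RTL: rejects any string that
// contains #0 within its stored length, then defers to StrToInt.
function StrictStrToInt(const S: string): Integer;
var
  I: Integer;
begin
  for I := 1 to Length(S) do
    if S[I] = #0 then
      raise EConvertError.CreateFmt(
        'Embedded #0 at position %d is not valid in an integer string', [I]);
  Result := StrToInt(S);
end;

begin
  Writeln(StrictStrToInt('123'));
  try
    Writeln(StrictStrToInt('1'#0'2'));
  except
    on E: EConvertError do
      Writeln('EConvertError: ', E.Message);
  end;
end.
```

The first call prints 123. The second raises, reporting the #0 at position 2, instead of quietly returning a value. The extra scan costs one pass over the string, which is the trade-off for not depending on how Val treats #0.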
Best regards
Dennis Kjaer Christensen
 

Re:StrToInt - Embedded Zeroes Issue

Hi
The poll result is that it is not an error that StrToInt treats an AnsiString
as a null-terminated PChar.
Then we cannot use Length(S) to get the length of the input string.
Should we make it a general rule that the length field of an AnsiString does
not represent the string length?
And make a QC report that Length is buggy, because it ought to call StrLen to
search for the zero terminator?
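The two notions of length being contrasted here can be seen side by side in a few lines. This is a minimal sketch; the program name is made up for illustration.

```pascal
program LengthVsStrLen; // hypothetical name, for illustration only

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

var
  S: string;
begin
  S := '1'#0'2';
  // Length reads the stored length field and does not scan the characters.
  Writeln('Length(S)        = ', Length(S));         // 3
  // StrLen scans from the first character until it meets a #0.
  Writeln('StrLen(PChar(S)) = ', StrLen(PChar(S)));  // 1
end.
```

Length is a constant-time read of the length field, while StrLen has to walk the characters, which is why the two disagree as soon as a #0 appears inside the string.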
Best regards
Dennis Kjaer Christensen
 

Re:StrToInt - Embedded Zeroes Issue

"Dennis" <XXXX@XXXXX.COM> writes
Quote
Hi

The poll result is that it is not an error that StrToInt treats an AnsiString
as a null-terminated PChar.

Then we cannot use Length(S) to get the length of the input string.
StrToInt (or rather Val on which it is based) is unusual in that the Val
function is designed to work with PChars. AFAIK no other string functions
are coded like this.
Quote
Should we make it a general rule that the length field of an AnsiString does
not represent the string length?
I would say, NO
Quote
And make a QC report that Length is buggy - it has to call StrLen to search
for the zero terminator?
Again, NO. This would have a major impact on performance.
--
regards,
John
The Fastcode Project:
www.fastcodeproject.org/
 

Re:StrToInt - Embedded Zeroes Issue

Hi
Quote
StrToInt (or rather Val on which it is based) is unusual in that the Val
function is designed to work with PChars. AFAIK no other string functions
are coded like this.
And this is the reason why it is buggy - not an explanation for it not being
buggy.
Quote
>Should we make it a general rule that the length field of an AnsiString
>does not represent the string length?

I would say, NO

>And make a QC report that Length is buggy - it has to call StrLen to
>search for the zero terminator?

Again, NO. This would have a major impact on performance.
But for StrToInt it is OK? I do not see the logic in defining two
different behaviours for AnsiString functions. Sometimes the length is
defined by the length field and sometimes by the zero terminator. Sometimes
#0 is a valid char in an AnsiString and sometimes it has a special
meaning. Very weird, if you ask me!
Best regards
Dennis Kjaer Christensen
 

Re:StrToInt - Embedded Zeroes Issue

Hi
The Pascal version of _ValLong does not treat #0 as a terminator, as the BASM
version does.
function _ValLong(const s: String; var code: Integer): Longint;
{$IFDEF PUREPASCAL}
var
  I: Integer;
  Negative, Hex: Boolean;
begin
  I := 1;
  code := -1;
  Result := 0;
  Negative := False;
  Hex := False;
  while (I <= Length(s)) and (s[I] = ' ') do Inc(I);
  if I > Length(s) then Exit;
  case s[I] of
    '$',
    'x',
    'X': begin
           Hex := True;
           Inc(I);
         end;
    '0': begin
           Hex := (Length(s) > I) and (UpCase(s[I+1]) = 'X');
           if Hex then Inc(I,2);
         end;
    '-': begin
           Negative := True;
           Inc(I);
         end;
    '+': Inc(I);
  end;
  if Hex then
    while I <= Length(s) do
    begin
      if Result > (High(Result) div 16) then
      begin
        code := I;
        Exit;
      end;
      case s[I] of
        '0'..'9': Result := Result * 16 + Ord(s[I]) - Ord('0');
        'a'..'f': Result := Result * 16 + Ord(s[I]) - Ord('a') + 10;
        'A'..'F': Result := Result * 16 + Ord(s[I]) - Ord('A') + 10;
      else
        code := I;
        Exit;
      end;
    end
  else
    while I <= Length(s) do
    begin
      if Result > (High(Result) div 10) then
      begin
        code := I;
        Exit;
      end;
      Result := Result * 10 + Ord(s[I]) - Ord('0');
      Inc(I);
    end;
  if Negative then
    Result := -Result;
  code := 0;
end;
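For comparison, here is a minimal sketch of a length-driven decimal parser that checks every character, so a #0 (or any other non-digit) inside the string makes the conversion fail. TryParseDecimal is a hypothetical name and is not RTL code. Unlike Val, it deliberately accepts no leading spaces and no hex prefixes.

```pascal
program ParseDecimalDemo; // hypothetical name, for illustration only

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

// Hypothetical helper: parses an optionally signed decimal integer by
// walking exactly Length(S) characters. Returns False on any non-digit
// (including #0), on an empty digit sequence, or on 32-bit overflow.
function TryParseDecimal(const S: string; out Value: Integer): Boolean;
var
  I: Integer;
  Negative: Boolean;
  Acc: Int64;
begin
  Result := False;
  Value := 0;
  I := 1;
  Negative := False;
  if (I <= Length(S)) and ((S[I] = '-') or (S[I] = '+')) then
  begin
    Negative := S[I] = '-';
    Inc(I);
  end;
  if I > Length(S) then Exit;                     // no digits at all
  Acc := 0;
  while I <= Length(S) do
  begin
    if (S[I] < '0') or (S[I] > '9') then Exit;    // rejects #0 as well
    Acc := Acc * 10 + (Ord(S[I]) - Ord('0'));
    if Acc > Int64(High(Integer)) + 1 then Exit;  // beyond any 32-bit value
    Inc(I);
  end;
  if Negative then
    Acc := -Acc
  else if Acc > High(Integer) then
    Exit;                                         // positive overflow
  Value := Integer(Acc);
  Result := True;
end;

var
  N: Integer;
begin
  Writeln(TryParseDecimal('123', N), ' ', N);     // succeeds, N = 123
  Writeln(TryParseDecimal('1'#0'2', N), ' ', N);  // fails, N = 0
end.
```

Accumulating in an Int64 keeps the overflow test simple, because the running value can never exceed the 64-bit range before the check rejects it.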
Best regards
Dennis Kjaer Christensen
 

Re:StrToInt - Embedded Zeroes Issue

"Dennis" <XXXX@XXXXX.COM> writes
Quote
Hi

>StrToInt (or rather Val on which it is based) is unusual in that the Val
>function is designed to work with PChars. AFAIK no other string functions
>are coded like this.

And this is the reason why it is buggy - not an explanation for it not being
buggy.

>>Should we make it a general rule that the length field of an AnsiString
>>does not represent the string length?
>
>I would say, NO
>
>>And make a QC report that Length is buggy - it has to call StrLen to
>>search for the zero terminator?
>
>Again, NO. This would have a major impact on performance.

But for StrToInt it is OK? I do not see the logic in defining two
different behaviours for AnsiString functions. Sometimes the length is
defined by the length field and sometimes by the zero terminator. Sometimes
#0 is a valid char in an AnsiString and sometimes it has a special
meaning. Very weird, if you ask me!
I think the general consensus is that StrToInt is indeed buggy, but it will
not get fixed, because it could break backwards compatibility.
--
regards,
John
The Fastcode Project:
www.fastcodeproject.org/
 

Re:StrToInt - Embedded Zeroes Issue

"Dennis" <XXXX@XXXXX.COM> writes
Quote
Hi

The Pascal version of _ValLong does not treat #0 as a terminator, as the BASM
version does.

function _ValLong(const s: String; var code: Integer): Longint;
{$IFDEF PUREPASCAL}
var
  I: Integer;
  Negative, Hex: Boolean;
begin
...
end;
The Pascal version has numerous other bugs and does not work anyway.
--
regards,
John
The Fastcode Project:
www.fastcodeproject.org/
 

Re:StrToInt - Embedded Zeroes Issue

Hi John
Quote
I think the general consensus is that StrToInt is indeed buggy, but it will
not get fixed, because it could break backwards compatibility.
I wish that were so. I agree with that viewpoint and understand why it can
be a bad idea to fix the bug.
But the poll ended with 8-3 in favor of StrToInt not being buggy.
tech.groups.yahoo.com/group/fastcodeproject/surveys
Best regards
Dennis Kjaer Christensen
 

Re:StrToInt - Embedded Zeroes Issue

Hi John
Quote
The Pascal version has numerous other bugs and does not work anyway.
Yes. Lars also pointed it out earlier. Is it QC'ed?
Best regards
Dennis Kjaer Christensen
 

Re:StrToInt - Embedded Zeroes Issue

"Dennis" <XXXX@XXXXX.COM> writes
Quote
Hi John

>I think the general consensus is that StrToInt is indeed buggy, but it will
>not get fixed, because it could break backwards compatibility.

I wish that were so. I agree with that viewpoint and understand why it can
be a bad idea to fix the bug.

But the poll ended with 8-3 in favor of StrToInt not being buggy.
tech.groups.yahoo.com/group/fastcodeproject/surveys
Even if the poll concluded that it was buggy, I do not believe DevCo would
have changed functionality which has existed for 10+ years.
Let's just accept the poll results, treat StrToInt as a special case, and
move on.
--
regards,
John
The Fastcode Project:
www.fastcodeproject.org/