
Why oh Why is TFileStream so slow

Consider the code at the end of this post. All it does is read a
254,263-byte text file one character at a time, first through a TextFile
and then through a TFileStream.

On my Dell Dimension XPS 450, the time to read the file using Read on
a text file is
               20.28 milliseconds if the file is cached (+- 0.5 millisecond)
               41.10 milliseconds if the file is not cached.

For the TFileStream it is
             4,356 milliseconds (+- 100 milliseconds)

That is about 20 times slower!!  

What is really interesting is that if you first read the data into
a TMemoryStream and then read that memory stream one character at a time,
it takes only
               82 milliseconds to read the memory stream.

So for what it is worth, TFileStream is very slow.  I cannot understand
what it is doing.  Any ideas?

-John_Mer...@Brown.EDU

const
  FName: String = 'c:\winnt\system32\ras\modem.inf';

procedure TForm1.Button1Click(Sender: TObject);
var
  HRT: THRTimer;
  x: Real;               // elapsed time reported by the timer
  fp: TextFile;
  Stream: TFileStream;
  c: Char;
  i: Integer;            // number of characters read
begin
  HRT := THRTimer.Create;
  try
    // Test 1: one character at a time through a TextFile.
    AssignFile(fp, FName);
    Reset(fp);
    HRT.StartTimer;
    i := 0;
    while not EOF(fp) do
    begin
      Read(fp, c);
      i := i + 1;
    end;
    x := HRT.ReadTimer;
    ShowMessage(Format('%n %d', [x, i]));
    CloseFile(fp);

    // Test 2: one character at a time through a TFileStream.
    Stream := TFileStream.Create(FName, fmOpenRead or fmShareDenyNone);
    try
      i := 0;
      HRT.StartTimer;
      while Stream.Position < Stream.Size do
      begin
        Stream.Read(c, 1);
        i := i + 1;
      end;
      x := HRT.ReadTimer;
      ShowMessage(Format('%n %d', [x, i]));
    finally
      Stream.Free;
    end;
  finally
    HRT.Free;
  end;
end;

(HRT is a high-resolution timer I wrote.)

 

Re:Why oh Why is TFileStream so slow


TFileStream does unbuffered I/O:  your code asks Windows to read one
character, then asks it to read another, etc.  A text file variable
contains a 128-byte buffer, so your text file code makes 128 times fewer
API calls.  That's why it's faster.
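
To illustrate the buffering point, here is a minimal sketch (not code from
the original post) that reads a file through a TFileStream in 64 KB blocks
and scans each block in memory, so the number of API calls drops from one
per character to one per block. WriteTestFile, CountLineFeeds, the scratch
file name and the block size are all made up for the example. It should
compile as a console program with Delphi or Free Pascal.

```pascal
program BlockRead;
{$IFDEF FPC}{$MODE DELPHI}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

uses
  SysUtils, Classes;

// Hypothetical helper: writes LineCount lines of 'abc' + LF to a scratch file.
procedure WriteTestFile(const FileName: string; LineCount: Integer);
var
  S: TFileStream;
  Line: AnsiString;
  n: Integer;
begin
  Line := 'abc'#10;
  S := TFileStream.Create(FileName, fmCreate);
  try
    for n := 1 to LineCount do
      S.WriteBuffer(Line[1], Length(Line));
  finally
    S.Free;
  end;
end;

// Counts line feeds, asking the stream for BufSize bytes per Read call.
function CountLineFeeds(const FileName: string): Integer;
const
  BufSize = 64 * 1024; // assumed block size; tune as needed
var
  Stream: TFileStream;
  Buf: array of Byte;
  Got, k: Integer;
begin
  Result := 0;
  SetLength(Buf, BufSize);
  Stream := TFileStream.Create(FileName, fmOpenRead or fmShareDenyNone);
  try
    repeat
      // One API call fetches up to BufSize bytes.
      Got := Stream.Read(Buf[0], BufSize);
      // The per-character work happens in memory.
      for k := 0 to Got - 1 do
        if Buf[k] = 10 then
          Inc(Result);
    until Got = 0;
  finally
    Stream.Free;
  end;
end;

const
  TestFile = 'blockread_demo.tmp'; // hypothetical scratch file

begin
  WriteTestFile(TestFile, 1000);
  try
    WriteLn('line feeds: ', CountLineFeeds(TestFile));
  finally
    DeleteFile(TestFile);
  end;
end.
```

The loop stops when Read returns 0, so it never needs to ask the stream
for Position or Size.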

Reading first into a TMemoryStream and then one character at a time
does even better buffering than a text file, but the reads are more
complicated, because the buffer is out on the heap somewhere instead
of being part of the text variable, and the size isn't known in
advance.

Duncan Murdoch

Re:Why oh Why is TFileStream so slow


In article <865ofh$...@cocoa.brown.edu>, John_Mer...@brown.edu (John_Mertus)
writes:

Quote
>On my Dell Dimension XPS 450, the time to read the file using Read and
>a text file is
>               20.28 Milliseconds if file is cached  +- 0.5 millisecond
>               41.10 Millisecond if file is not cached.

>For the TFileStream it is
>             4,356   Millisecond +- 100 Milliseconds

>That is about 20 times slower!!  

It's those damned dots again <gg>

In article <38866a50.1311...@newshost.uwo.ca>, dmurd...@pair.com (Duncan
Murdoch) writes:

Quote
>Reading first into a TMemoryStream and then one character at a time
>does even better buffering than a text file, but the reads are more
>complicated, because the buffer is out on the heap somewhere instead
>of being part of the text variable, and the size isn't known in
>advance.

You can find the file size and assign that value to TMemoryStream.Capacity
before loading the file. But it didn't seem to make much difference for a
250 KB file.
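
A sketch of the TMemoryStream route, with hypothetical names
(WriteTestFile, CountLineFeedsInMemory, the scratch file). As far as I
know, LoadFromFile sizes the stream to the file before reading it in one
go, which would fit the observation that setting the size beforehand made
little difference. Treat that as an assumption and check your Classes
source.

```pascal
program MemStreamRead;
{$IFDEF FPC}{$MODE DELPHI}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

uses
  SysUtils, Classes;

// Hypothetical helper: writes LineCount lines of 'abc' + LF to a scratch file.
procedure WriteTestFile(const FileName: string; LineCount: Integer);
var
  S: TFileStream;
  Line: AnsiString;
  n: Integer;
begin
  Line := 'abc'#10;
  S := TFileStream.Create(FileName, fmCreate);
  try
    for n := 1 to LineCount do
      S.WriteBuffer(Line[1], Length(Line));
  finally
    S.Free;
  end;
end;

// Loads the whole file into memory, then walks the buffer directly.
function CountLineFeedsInMemory(const FileName: string): Integer;
var
  MS: TMemoryStream;
  P: PAnsiChar;
  k: Integer;
begin
  Result := 0;
  MS := TMemoryStream.Create;
  try
    MS.LoadFromFile(FileName);   // the file is read here, not in the loop
    P := PAnsiChar(MS.Memory);   // start of the in-memory copy
    for k := 0 to Integer(MS.Size) - 1 do
      if P[k] = #10 then
        Inc(Result);
  finally
    MS.Free;
  end;
end;

const
  TestFile = 'memstream_demo.tmp'; // hypothetical scratch file

begin
  WriteTestFile(TestFile, 1000);
  try
    WriteLn('line feeds: ', CountLineFeedsInMemory(TestFile));
  finally
    DeleteFile(TestFile);
  end;
end.
```

Walking the Memory pointer avoids even the per-character call to Read,
at the cost of holding the whole file in memory.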

As you found, TFileStream is about 50 times slower than TMemoryStream
here.

Ray Lischner in his "Secrets of Delphi 2" book, gives some code for a
TBufferedStream which can be applied to any stream.

Alan Lloyd
alangll...@aol.com

Re:Why oh Why is TFileStream so slow


Thanks for the comment.  I discovered that it is not so much the
buffered/unbuffered issue as the
    stream.position, stream.size calls

Apparently these are recomputed from the file EACH time they are
requested.  In particular, stream.size is really slow. If I replace the loop

  while (stream.position < stream.size) do

with
  size := stream.size;
  i := 0;
  while (i < size) do
   begin
     stream.Read(c, 1);
     i := i + 1;
   end;

the code runs almost 10 times faster!  Wow. TFileStream.Size should come
with a warning label: "Slow code, should be used only by Microsoft personnel."

-John_Mertus


Re:Why oh Why is TFileStream so slow


In article <867out$...@cocoa.brown.edu>, John_Mer...@brown.edu (John_Mertus)
writes:

Quote
>Thanks for the comment.  I discovered that is not so much the
>buffered/unbuffered issue but the
>    stream.position, stream.size calls

>Apparently these are recomputed from the file EACH time requested.  In
>particular, the stream.size is really slow. If I replace the loop

I don't know whether it has any effect on speed, but one usually uses :-

  CurrentPos := Seek(0, soFromCurrent) to get the present position.

... and when reading until the end one does :-

  ReadBytes := Read(Buff, BytesToRead);

  If ReadBytes <> BytesToRead then (at the end of stream)

... instead of comparing Position with Size.
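
The read-until-short-read idea, applied to the original one-character
loop, might look like this. It is only a sketch: WriteTestFile, CountChars
and the scratch file name are invented for the example. The loop never
touches Position or Size; it stops when Read returns fewer bytes than
requested.

```pascal
program ShortRead;
{$IFDEF FPC}{$MODE DELPHI}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

uses
  SysUtils, Classes;

// Hypothetical helper: writes ByteCount bytes of 'x' to a scratch file.
procedure WriteTestFile(const FileName: string; ByteCount: Integer);
var
  S: TFileStream;
  B: Byte;
  n: Integer;
begin
  B := Ord('x');
  S := TFileStream.Create(FileName, fmCreate);
  try
    for n := 1 to ByteCount do
      S.WriteBuffer(B, 1);
  finally
    S.Free;
  end;
end;

// Counts characters by reading until Read comes up short.
function CountChars(const FileName: string): Integer;
var
  Stream: TFileStream;
  c: AnsiChar;
begin
  Result := 0;
  Stream := TFileStream.Create(FileName, fmOpenRead or fmShareDenyNone);
  try
    // Read returns the number of bytes actually read; anything other
    // than 1 here means the end of the stream has been reached.
    while Stream.Read(c, 1) = 1 do
      Inc(Result);
  finally
    Stream.Free;
  end;
end;

const
  TestFile = 'shortread_demo.tmp'; // hypothetical scratch file

begin
  WriteTestFile(TestFile, 1000);
  try
    WriteLn('characters: ', CountChars(TestFile));
  finally
    DeleteFile(TestFile);
  end;
end.
```

This still makes one API call per character, so it removes the cost of
the Position and Size queries but not the cost of unbuffered reads.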

Alan Lloyd
alangll...@aol.com

Re:Why oh Why is TFileStream so slow


Quote
John_Mertus wrote:
> Thanks for the comment.  I discovered that is not so much the
> buffered/unbuffered issue but the
>     stream.position, stream.size calls

> Apparently these are recomputed from the file EACH time requested.  In
> particular, the stream.size is really slow. If I replace the loop

>   if (stream.position < stream.size) Then

> with
>   size := stream.size;
>   i := 0;
>   if (i < size) then
>    begin
>      i := i + 1;

> The code runs almost 10 times faster!  Wow. Tfilestream.size should come
> with a warning label.  "Slow code, should be use only by Microsoft Personal"

    Function calls are usually slower than accessing variables - probably they
could rewrite the docs to say "often slower than accessing a variable" in
the description of _every_ function in Delphi, eh?

    Have you got a faster implementation for TFilestream.Size than the
one Borland gave? If so you should post it.

    Or is your complaint that the compiler did not do the same as what
your second version did? It _can't_ do that, because it _cannot_ know
that the file size has not changed. If the two versions were equivalent
then the question would be whether the optimizer should have been
able to figure out they were equivalent. But they don't look equivalent
to me.

    Or if all you want is faster code you could use TMemoryStream
instead, as suggested.


Re:Why oh Why is TFileStream so slow


In article <388F4F21.E21DB...@math.okstate.edu>, ullr...@math.okstate.edu
says...
Sorry, I don't understand any of your points.  I was commenting that
Stream.Size and Stream.Position are much slower than expected (most likely
because they are safe calls). It is a very unexpected performance hit.  In
most of my code, accessing a property does not slow down the system 40
times. That's all.  So it's a word of warning to programmers who would not
expect this.

-John_Mer...@Brown.EDU


Re:Why oh Why is TFileStream so slow


Quote
John_Mertus wrote:
> In article <388F4F21.E21DB...@math.okstate.edu>, ullr...@math.okstate.edu
> says...
> Sorry, I don't understand any of your points.

    I may have been a little sarcastic, which may make it tougher, sorry.
Realized later that the hit with TFileStream.Size was much larger than
just "function call", so some of my points may have been invalid - that
would also make them harder to understand. But I believe a valid point
remains:

Quote
> I was commenting Stream.Size,
> Stream.Positions are much slower then expected (most likely because they are
> safe calls). It is a very unexpected performance hit.  In most of my code,
> accessing properties do not slow down the system 40 times.

    Well, although I haven't checked carefully, I believe that it's
not Stream.Size that's so slow, it's TFileStream.Size specifically
(or rather THandleStream.Size). What seems to me like a
still-valid-until-shown-otherwise point is that TFileStream.Size
has to ask the file system for the file's size each time - it
can't know that the file size has not changed, for example.
    Presumably a person would expect asking the file system
for a file's size to be very slow? I _bet_ that if you used a
TMemoryStream as suggested you'd find that Size was
acceptably fast.
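
One way to check this claim is to count the calls that actually reach the
stream. The sketch below subclasses TFileStream and counts Seek and Read
calls for both loop styles; TCountingFileStream, CountCalls, WriteTestFile
and the scratch file name are hypothetical. It assumes a compiler with the
Int64 Seek overload (Delphi 6 or later, or Free Pascal). In the Classes
source of that era, Position was implemented as one Seek and Size as three,
but newer run-time libraries may differ, so the program prints whatever it
measures rather than assuming the numbers.

```pascal
program CountSeeks;
{$IFDEF FPC}{$MODE DELPHI}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

uses
  SysUtils, Classes;

type
  // TFileStream descendant that counts calls reaching Seek and Read.
  TCountingFileStream = class(TFileStream)
  public
    SeekCalls: Integer;
    ReadCalls: Integer;
    function Seek(const Offset: Int64; Origin: TSeekOrigin): Int64; override;
    function Read(var Buffer; Count: Longint): Longint; override;
  end;

function TCountingFileStream.Seek(const Offset: Int64;
  Origin: TSeekOrigin): Int64;
begin
  Inc(SeekCalls);
  Result := inherited Seek(Offset, Origin);
end;

function TCountingFileStream.Read(var Buffer; Count: Longint): Longint;
begin
  Inc(ReadCalls);
  Result := inherited Read(Buffer, Count);
end;

// Hypothetical helper: writes ByteCount bytes of 'x' to a scratch file.
procedure WriteTestFile(const FileName: string; ByteCount: Integer);
var
  S: TFileStream;
  B: Byte;
  n: Integer;
begin
  B := Ord('x');
  S := TFileStream.Create(FileName, fmCreate);
  try
    for n := 1 to ByteCount do
      S.WriteBuffer(B, 1);
  finally
    S.Free;
  end;
end;

// Reads the file one byte at a time and reports how many Read and Seek
// calls were made. CacheSize selects which loop condition is used.
procedure CountCalls(const FileName: string; CacheSize: Boolean;
  out Reads, Seeks: Integer);
var
  S: TCountingFileStream;
  c: AnsiChar;
  Size, i: Int64;
begin
  S := TCountingFileStream.Create(FileName, fmOpenRead or fmShareDenyNone);
  try
    if CacheSize then
    begin
      Size := S.Size;              // queried once, before the loop
      i := 0;
      while i < Size do
      begin
        S.Read(c, 1);
        Inc(i);
      end;
    end
    else
      while S.Position < S.Size do // queried on every pass
        S.Read(c, 1);
    Reads := S.ReadCalls;
    Seeks := S.SeekCalls;
  finally
    S.Free;
  end;
end;

const
  TestFile = 'countseeks_demo.tmp'; // hypothetical scratch file
var
  Reads, Seeks: Integer;

begin
  WriteTestFile(TestFile, 1000);
  try
    CountCalls(TestFile, False, Reads, Seeks);
    WriteLn('Position < Size every pass: ', Reads, ' reads, ', Seeks, ' seeks');
    CountCalls(TestFile, True, Reads, Seeks);
    WriteLn('Size cached once:           ', Reads, ' reads, ', Seeks, ' seeks');
  finally
    DeleteFile(TestFile);
  end;
end.
```

If the seek count for the first loop comes out at several times the read
count, the slowdown is simply more file-system calls per character, which
is consistent with the explanation above.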

Quote
> Thats all.  So its a
> word of warning to programers who would think this.

    Sounded like a little more than just a warning... never mind.
