Re: fgets() vs std::getline() performance


2006-09-20 12:06:33 PM
cppbuilder27
"Chris Uzdavinis (TeamB)" < XXXX@XXXXX.COM >wrote in message
Quote
"crhras" < XXXX@XXXXX.COM >writes:

>I would have no idea what the cause is but a 4000 percent difference
>in performance seems out of line to me and points to a flaw in the
>implementation of getline. Also, when fgets is compared to getline
>using g++ on Linux there is some performance difference but nothing
>to write home about.

Without profiling, it's hard to say where the time is spent. But it's
not a fair comparison between linux/g++ and windows/BCB. Different
os, different compiler (with different levels of optimizations... g++
probably much more aggressive), and last but not least, different
standard library implementations.

Dinkumware (BCB) has a great implementation, but I have heard it
expects the compiler to optimize away and inline things that Borland
might not be doing.

Also, I'm not making excuses for Borland, but based on what you have
said I do disagree with your conclusions that it is a flaw in
getline. It may later turn out you are right, but the supporting
arguments you are currently using are inadequate to draw any
conclusions about anything except that your program takes a lot
longer using getline.

--
Chris (TeamB);

Thanks Chris - good stuff.
Curt
 
 

Re:Re: fgets() vs std::getline() performance

crhras < XXXX@XXXXX.COM >wrote:
Quote
>Exactly this had been explained to crhras over in
>c.l.c++.m already.
>
>Schobi


Schobi,

I don't know who you are but it seems that you are taking it
personally when I say that getline( )'s implementation is flawed.

I'm not. I know the guys who implemented it only from
their newsgroup postings and have absolutely no
personal relationships with them. (One of them once
publicly plonked me, if that's any indication.)
I just find it annoying that you keep asking the same
question over and over despite the fact that you
already got it answered several times.
Quote
A lot has been explained to me in this thread and I am grateful for
everyone's responses. It doesn't seem to me however, that the
explanation that "strings do more" than char buffers or getline( ) writes
into a string rather than a fixed size array should account for a
performance difference of 4000 percent.

(If you don't understand this, then why don't you ask
this question, but instead keep repeating the already
answered one?)
Quote
Do you ?

In 'std::getline()', 'std::string' has to allocate
memory as needed in order to grow. When a string grows
beyond its capacity, it allocates a new (bigger) chunk
of memory, copies its contents into this, and frees
the old memory. Allocating and freeing dynamic memory
is very expensive (and AFAIK, regarding performance,
Borland's memory manager isn't exactly the leader of
the pack). Copying characters isn't free either.

'fgets()', OTOH, only has to copy bytes from the file
into a user-provided buffer, stopping either on newline
or on end-of-buffer, leaving you to deal with the
consequences of the latter.

This /could/ explain the difference you see. But then
it could be many other things. Without profiling the
app, you can't know.
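To make the reallocation argument above concrete, here is a minimal sketch that counts how often a 'std::string' changes capacity while characters are appended one at a time. The function name 'count_reallocations' and its parameters are made up for this illustration; the exact count depends on the standard library's growth policy, so no particular number should be expected.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not from the thread): appends n characters one
// at a time and counts how often the string's capacity changes, which
// is a rough proxy for "allocate new block, copy, free old block".
// If reserve_first is non-zero, that capacity is reserved up front.
std::size_t count_reallocations(std::size_t n, std::size_t reserve_first)
{
    std::string s;
    if (reserve_first > 0) {
        s.reserve(reserve_first);
    }
    std::size_t reallocations = 0;
    std::size_t last_capacity = s.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        s.push_back('x');
        if (s.capacity() != last_capacity) {
            ++reallocations;               // the string had to grow
            last_capacity = s.capacity();
        }
    }
    return reallocations;
}
```

Calling 'count_reallocations(4096, 0)' should report at least one growth step, while 'count_reallocations(4096, 4096)' should report none, because the reserved capacity already covers every append. Whether those growth steps explain a large timing difference is exactly what a profiler would have to confirm.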
Nevertheless, I went and did some tests using VC8,
which is what I have available here. I started with
this program:
#include <iostream>
#include <fstream>
#include <string>
#include <cstdio>
#include <cassert>
#include <windows.h>

const unsigned int num_tests = 10;

// Read the file line by line with std::getline(); the lines are discarded.
void test_getline(const std::string& filename)
{
    std::ifstream ifs(filename.c_str());
    assert(ifs.good());
    std::string line;
    while( std::getline(ifs, line,'\n') ) {
    }
}

// Read the file line by line with fgets() into a fixed 512-byte buffer.
void test_fgets(const std::string& filename)
{
    std::FILE* fp = std::fopen(filename.c_str(), "r");
    assert(fp);
    char buffer[512];
    while( std::fgets(buffer, sizeof(buffer), fp) != NULL ) {
    }
    std::fclose(fp); // close the stream so repeated runs don't leak it
}

// Run func num_tests times; return the total elapsed time in milliseconds.
inline unsigned int test( const std::string& filename
                        , void (*func)(const std::string& filename) )
{
    const DWORD dwStart = ::GetTickCount();
    for( unsigned int u=0; u<num_tests; ++u ) {
        func(filename);
    }
    return ::GetTickCount()-dwStart;
}

int main(int argc, char* argv[])
{
    assert(argc==2);
    const unsigned int u_getline = test( argv[1], test_getline );
    const unsigned int u_fgets = test( argv[1], test_fgets );
    // Print the average time per run (total divided by num_tests).
    std::cout << "reading \"" << argv[1] << "\" took ~"
              << u_getline/num_tests << " using 'std::getline()'\n";
    std::cout << "reading \"" << argv[1] << "\" took ~"
              << u_fgets /num_tests << " using 'std::fgets()'\n";
    return 0;
}
I fed it with a 193k log file that happened to be in my
temp folder and got
reading "<file>" took ~139 using 'std::getline()'
reading "<file>" took ~4 using 'std::fgets()'
for the debug version and
reading "<file>" took ~6 using 'std::getline()'
reading "<file>" took ~1 using 'std::fgets()'
for the release version.
(I checked and found that reversing the order of the
tests doesn't have any notable influence on the result.)
So VC was able to speed up 'fgets()' by a factor of 4,
'std::getline()' by a factor of about 20. To me this seems
to indicate that a good optimizer is crucial for the
C++ version to perform reasonably.
(Good optimization is another thing BCC has not been
very famous for in the last couple of years.)
Of course, these don't have to be realistic figures. VC
might, after all, find out that the results of reading
the files aren't needed and just skip a lot of code.
So I changed my code to output the results:
void test_getline(const std::string& filename)
{
    std::ifstream ifs(filename.c_str());
    assert(ifs.good());
    std::ofstream ofs( (filename+".out").c_str());
    assert(ofs.good());
    std::string line;
    while( std::getline(ifs, line,'\n') ) {
        ofs << line << '\n';
    }
}

void test_fgets(const std::string& filename)
{
    std::FILE* fp = std::fopen(filename.c_str(), "r");
    assert(fp);
    std::ofstream ofs( (filename+".out").c_str());
    assert(ofs.good());
    char buffer[512];
    while( std::fgets(buffer, sizeof(buffer), fp) != NULL ) {
        // fgets() keeps the trailing '\n' in buffer, so this
        // writes an extra newline after each chunk read
        ofs << buffer << '\n';
    }
    std::fclose(fp); // close the stream so repeated runs don't leak it
}
I now get
reading "<file>" took ~203 using 'std::getline()'
reading "<file>" took ~11 using 'std::fgets()'
for a debug build and
reading "<file>" took ~13 using 'std::getline()'
reading "<file>" took ~7 using 'std::fgets()'
for a release build.
VC was able to speed up the C version by a factor
of 1.5, the C++ version by a factor of ~15.5 --
obviously again indicating that optimization is
a very important factor in this.
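Writing an output file is one way to keep the optimizer from discarding the reads, but it also adds output cost to both measurements. A lighter alternative, sketched here as an assumption rather than as what was actually measured, is to have each loop return a value computed from the data. The names 'total_length_getline' and 'total_length_fgets' are hypothetical.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <fstream>
#include <string>

// Sums the lengths of all lines read with std::getline().
// The newline is not stored in line, so it is not counted.
std::size_t total_length_getline(const std::string& filename)
{
    std::ifstream ifs(filename.c_str());
    std::string line;
    std::size_t total = 0;
    while (std::getline(ifs, line)) {
        total += line.size();
    }
    return total;
}

// Sums the lengths of all chunks read with fgets().
// fgets() keeps the '\n' in the buffer, so newlines are counted here.
std::size_t total_length_fgets(const std::string& filename)
{
    std::FILE* fp = std::fopen(filename.c_str(), "r");
    if (fp == NULL) {
        return 0;
    }
    char buffer[512];
    std::size_t total = 0;
    while (std::fgets(buffer, sizeof(buffer), fp) != NULL) {
        total += std::strlen(buffer);
    }
    std::fclose(fp);
    return total;
}
```

The caller would still have to use the returned totals (print them, for example) for them to count as an observable result.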
If we take the optimized builds of the first
version, we see a factor of 7 between the C and C++
versions. That seems a lot -- but then you have to
consider that you didn't compare equal algorithms.
The C version will cut lines that are longer than
511 characters. (Indeed it fails on the log file
I fed it with.)
If you change the code to handle arbitrarily long
lines, you will have to use dynamic memory. I very
much doubt that you would find an algorithm that
is significantly faster than the (considerably
easier to write, to get right, and to read) C++
version.
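For comparison, this is roughly what an 'fgets()'-based reader has to look like once it must cope with lines of any length. It is only a sketch: 'read_line' is a hypothetical helper, it uses 'std::string' as the dynamic buffer (a pure C version would use realloc()), and it does not handle embedded NUL characters.

```cpp
#include <cstdio>
#include <cstring>
#include <string>

// Reads one line of any length from fp into line (without the '\n').
// Returns false when nothing could be read (end of file or error).
bool read_line(std::FILE* fp, std::string& line)
{
    char chunk[512];
    line.clear();
    bool got_data = false;
    while (std::fgets(chunk, sizeof(chunk), fp) != NULL) {
        got_data = true;
        const std::size_t len = std::strlen(chunk);
        if (len > 0 && chunk[len - 1] == '\n') {
            line.append(chunk, len - 1);  // complete line: drop the newline
            return true;
        }
        line.append(chunk, len);          // partial line: keep reading
    }
    return got_data;                      // last line had no trailing '\n'
}
```

Each long line now costs several 'fgets()' calls plus appends into a growing buffer, which is much closer to the work 'std::getline()' does.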
Does this make it any clearer?
Schobi
--
XXXX@XXXXX.COM is never read
I'm Schobi at suespammers dot org
"The sarcasm is mightier than the sword."
Eric Jarvis
 

Re:Re: fgets() vs std::getline() performance

Quote
Does this make it any clearer?
Two points that may help the OP.
If he's profiling in a debug build I would
expect getline()/std::string to take longer.
He needs to profile in a release build with
optimizations to really compare.
Second thing is, regarding your point
of reallocation, he can always use reserve().
I've found that if I have some idea of the size
of the lines, this helps a lot.
To the OP, how do your results look in
a release build if you do:
std::string line;
line.reserve(512);
before the getline() call?
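In context, the suggestion looks something like the sketch below. 'count_lines_reserved' and 'expected_line_length' are names invented for this example. It also assumes that the string keeps its capacity from one getline() call to the next, which is how the implementations I know of behave but is not something the standard promises.

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Counts the lines in a file with std::getline(), reserving capacity
// for the line buffer once, before the loop starts.
std::size_t count_lines_reserved(const std::string& filename,
                                 std::size_t expected_line_length)
{
    std::ifstream ifs(filename.c_str());
    std::string line;
    line.reserve(expected_line_length);  // one allocation up front
    std::size_t count = 0;
    while (std::getline(ifs, line)) {
        ++count;                         // line is overwritten on each pass
    }
    return count;
}
```

If lines are mostly shorter than the reserved size, the loop should not need to grow the buffer at all; lines longer than that still work, they just trigger a reallocation.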
 

Re:Re: fgets() vs std::getline() performance

Quote
To the OP, how do your results look in
a release build if you do:

std::string line;
line.reserve(512);

before the getline() call?

Thanks for your input. I can't believe that I didn't think to test the
release build. I come from an AIX C background and used to use
the debugger maybe twice a year.
I ran the tests using line.reserve( ) and got the following results:
NOTE:
Test 1 is getline( ) with line.reserve( )
Test 2 is fgets( )
Release Version
-----------------------
Test 1 has taken 124 seconds.
Test 2 has taken 4 seconds.
Debug Version
------------------------
Test 1 has taken 159 seconds.
Test 2 has taken 4 seconds.
 

Re:Re: fgets() vs std::getline() performance

crhras < XXXX@XXXXX.COM >wrote:
Quote
>To the OP, how do your results look in
>a release build if you do:
>
>std::string line;
>line.reserve(512);
>
>before the getline() call?
>
Thanks for your input. I can't believe that I didn't think to test the
release build. I come from an AIX C background and used to use
the debugger maybe twice a year.

I ran the tests using line.reserve( ) and got the following results:

NOTE:
Test 1 is getline( ) with line.reserve( )
Test 2 is fgets( )

Release Version
-----------------------
Test 1 has taken 124 seconds.
Test 2 has taken 4 seconds.

Debug Version
------------------------
Test 1 has taken 159 seconds.
Test 2 has taken 4 seconds.
If I compare these numbers to mine, the optimization
question comes up. The debug version numbers are
pretty close to mine. But my release version numbers
are much better for 'std::getline()', while yours
aren't that much different.
Schobi
--
XXXX@XXXXX.COM is never read
I'm Schobi at suespammers dot org
"The sarcasm is mightier than the sword."
Eric Jarvis
 

Re:Re: fgets() vs std::getline() performance

"Hendrik Schober" < XXXX@XXXXX.COM >wrote in message
Quote
crhras < XXXX@XXXXX.COM >wrote:
<snip>
If I compare these numbers to mine, the optimization
question comes up. The debug version numbers are
pretty close to mine. But my release version numbers
are much better for 'std::getline()', while yours
aren't that much different.
With MSVC7.1 my results were similar to yours.
Using reserve() seems to reduce the time by
around 40%.
Although I have BCB6 installed,
I don't have time to test with that.
While fgets() is clearly faster, it requires allocating
a fixed-size buffer or reallocating it. This isn't
needed with std::string/getline() so I find the
tradeoff acceptable. I'm also optimizing
for size and speed. This may be a factor.
I don't remember how to set up optimization with
BCB. But it sounds like its "out of the box"
settings aren't very good.
 

Re:Re: fgets() vs std::getline() performance

"crhras" < XXXX@XXXXX.COM >wrote in message
Quote
>To the OP, how do your results look in
>a release build if you do:
>
>std::string line;
>line.reserve(512);
>
>before the getline() call?
>
Thanks for your input. I can't believe that I didn't think to test the
release build. I come from an AIX C background and used to use
the debugger maybe twice a year.
You typically see a big difference between debug/release
when using templates - especially boost shared ptr
stuff.
Quote
I ran the tests using line.reserve( ) and got the following results:

NOTE:
Test 1 is getline( ) with line.reserve( )
Test 2 is fgets( )

Release Version
-----------------------
Test 1 has taken 124 seconds.
Test 2 has taken 4 seconds.

Debug Version
------------------------
Test 1 has taken 159 seconds.
Test 2 has taken 4 seconds.
Running your test in MSVC7.1,
in release with optimization I get
around 1 second for fgets() and around
4 seconds for getline() without reserve(),
around 2-3 seconds with reserve().
 

Re:Re: fgets() vs std::getline() performance

Duane Hebert < XXXX@XXXXX.COM >wrote:
Quote
"Hendrik Schober" < XXXX@XXXXX.COM >wrote in message
news:451134ce$ XXXX@XXXXX.COM ...
>crhras < XXXX@XXXXX.COM >wrote:

<snip>
>If I compare these numbers to mine, the optimization
>question comes up. The debug version numbers are
>pretty close to mine. But my release version numbers
>are much better for 'std::getline()', while yours
>aren't that much different.

With MSVC7.1 my results were similar to yours.
Using reserve() seems to reduce the time by
around 40%.
I just put a
line.reserve(1024)
into the 'test_getline()' function (first version,
without the output) and it didn't change all that
much.
Quote
[...]
Schobi
--
XXXX@XXXXX.COM is never read
I'm Schobi at suespammers dot org
"The sarcasm is mightier than the sword."
Eric Jarvis
 

Re:Re: fgets() vs std::getline() performance

The machine I'm on right now only goes up to BCB6 and doesn't have BDS2006
on it.
I used the code as shown and, as BCB6 refuses the 'inline' of the function
containing the 'for' loop, commented out the word 'inline' so the warning
doesn't clutter up the screen capture.
Note that 'reserve' was not used.
The results were:
The 1.2M, 25,000 line test file took these times for getline/fgets:
122/20 dynamic linked debug on
36/20 dynamic linked debug off
129/21 static linked debug on
36/20 static linked debug off
------------------------------
C:\Documents and Settings\Edward\My Documents\lookat\q186>dir test
Volume in drive C has no label.
Volume Serial Number is FC8D-A209
Directory of C:\Documents and Settings\Edward\My Documents\lookat\q186
09/20/2006 09:26 AM 1,239,311 test
1 File(s) 1,239,311 bytes
0 Dir(s) 30,942,109,696 bytes free

C:\Documents and Settings\Edward\My Documents\lookat\q186>s8 lc test
S Version 8.0 Copyright 1986-2003 Emdata Co
C:\Documents and Settings\Edward\My Documents\lookat\q186\
25450 lines test
25450 lines 1 files

C:\Documents and Settings\Edward\My Documents\lookat\q186>bcc32 -WCR -v ques186
Borland C++ 5.6.4 for Win32 Copyright (c) 1993, 2002 Borland
ques186.cpp:
Turbo Incremental Link 5.66 Copyright (c) 1997-2002 Borland

C:\Documents and Settings\Edward\My Documents\lookat\q186>ques186 test
reading "test" took ~122 using 'std::getline()'
reading "test" took ~20 using 'std::fgets()'

C:\Documents and Settings\Edward\My Documents\lookat\q186>bcc32 -WCR -v- ques186
Borland C++ 5.6.4 for Win32 Copyright (c) 1993, 2002 Borland
ques186.cpp:
Turbo Incremental Link 5.66 Copyright (c) 1997-2002 Borland

C:\Documents and Settings\Edward\My Documents\lookat\q186>ques186 test
reading "test" took ~36 using 'std::getline()'
reading "test" took ~20 using 'std::fgets()'

C:\Documents and Settings\Edward\My Documents\lookat\q186>bcc32 -WC -v ques186
Borland C++ 5.6.4 for Win32 Copyright (c) 1993, 2002 Borland
ques186.cpp:
Turbo Incremental Link 5.66 Copyright (c) 1997-2002 Borland

C:\Documents and Settings\Edward\My Documents\lookat\q186>ques186 test
reading "test" took ~129 using 'std::getline()'
reading "test" took ~21 using 'std::fgets()'

C:\Documents and Settings\Edward\My Documents\lookat\q186>bcc32 -WC -v- ques186
Borland C++ 5.6.4 for Win32 Copyright (c) 1993, 2002 Borland
ques186.cpp:
Turbo Incremental Link 5.66 Copyright (c) 1997-2002 Borland

C:\Documents and Settings\Edward\My Documents\lookat\q186>ques186 test
reading "test" took ~36 using 'std::getline()'
reading "test" took ~20 using 'std::fgets()'

C:\Documents and Settings\Edward\My Documents\lookat\q186>

----------------------------------------------------------
The code used was
----------------------------------------------------------
C:\Documents and Settings\Edward\My Documents\lookat\q186>type ques186.cpp
#include <iostream>
#include <fstream>
#include <string>
#include <cstdio>
#include <cassert>
#include <windows.h>

const unsigned int num_tests = 10;

void test_getline(const std::string& filename)
{
    std::ifstream ifs(filename.c_str());
    assert(ifs.good());
    std::string line;
    while( std::getline(ifs, line,'\n') ) {
    }
}

void test_fgets(const std::string& filename)
{
    std::FILE* fp = std::fopen(filename.c_str(), "r");
    assert(fp);
    char buffer[512];
    while( std::fgets(buffer, sizeof(buffer), fp) != NULL ) {
    }
}

/* inline */
unsigned int test( const std::string& filename
                 , void (*func)(const std::string& filename) )
{
    const DWORD dwStart = ::GetTickCount();
    for( unsigned int u=0; u<num_tests; ++u ) {
        func(filename);
    }
    return ::GetTickCount()-dwStart;
}

int main(int argc, char* argv[])
{
    assert(argc==2);
    const unsigned int u_getline = test( argv[1], test_getline );
    const unsigned int u_fgets = test( argv[1], test_fgets );
    std::cout << "reading \"" << argv[1] << "\" took ~"
              << u_getline/num_tests << " using 'std::getline()'\n";
    std::cout << "reading \"" << argv[1] << "\" took ~"
              << u_fgets /num_tests << " using 'std::fgets()'\n";
    return 0;
}

C:\Documents and Settings\Edward\My Documents\lookat\q186>

----------------------------------------------------------
. Ed
 

Re:Re: fgets() vs std::getline() performance

"Ed Mulroy" < XXXX@XXXXX.COM >wrote in message
Quote
The machine I'm on right now only goes up to BCB6 and doesn't have BDS2006
on it.

I used the code as shown and, as BCB6 refuses the 'inline' of the function
containing the 'for' loop, commented out the word 'inline' so the warning
doesn't clutter up the screen capture.

Note that 'reserve' was not used.

The results were:

The 1.2M, 25,000 line test file took these times for getline/fgets:

122/20 dynamic linked debug on
36/20 dynamic linked debug off
129/21 static linked debug on
36/20 static linked debug off
So yours is more in line with expectations.
I imagine since you have BCB6 it's Dinkumware.
Which BCB and which STL is the OP using?
 

Re:Re: fgets() vs std::getline() performance

"Hendrik Schober" < XXXX@XXXXX.COM >wrote in message
Quote
I just put a
line.reserve(1024)
into the 'test_getline()' function (first version,
without the output) and it didn't change all that
much.
I changed my optimization to be for speed
and the time was a bit less and reserve didn't
help as much.
FWIW, we do a lot of parsing of flat files
and I've never seen this as a bottleneck -
at least not more than you would expect
for file i/o.
 

Re:Re: fgets() vs std::getline() performance

Quote
Which BCB and which stl is the op using?
As you can see from the screen capture it is BCB 6, bcc32.exe is 5.6.4, the
one which uses STLport.
. Ed
 

Re:Re: fgets() vs std::getline() performance

"Ed Mulroy" < XXXX@XXXXX.COM >wrote in message
Quote
The machine I'm on right now only goes up to BCB6 and doesn't have BDS2006
on it.

I used the code as shown and, as BCB6 refuses the 'inline' of the function
containing the 'for' loop, commented out the word 'inline' so the warning
doesn't clutter up the screen capture.

Note that 'reserve' was not used.

The results were:

The 1.2M, 25,000 line test file took these times for getline/fgets:

122/20 dynamic linked debug on
36/20 dynamic linked debug off
129/21 static linked debug on
36/20 static linked debug off
I ran your code at home on Turbo C++ and the free
MSVC8 (times are getline/fgets):

MSVC: 14/4

Turbo (console project, no VCL):
Debug: 93/14
Release dynamic: 62/4
Release static: 65/4

I was using a 314 KB text file.
I had both set to optimize for speed in the release build.
We can ignore the debug build. I get the same
small boost with the dynamic run in Borland.
But I still see that Borland is 4 times slower using
std::getline.
Admittedly, I've just installed Turbo C++ and
am just using the default release/debug settings,
but I'm also using my default settings in msvc.
To Hendrik, with msvc8 and this code,
reserving the string made nearly no difference.
With BCB reserving 1024 for the string increased
the release build to 68.
 

Re:Re: fgets() vs std::getline() performance

"Ed Mulroy" < XXXX@XXXXX.COM >wrote in message
Quote
>Which BCB and which stl is the op using?

As you can see from the screen capture it is BCB 6, bcc32.exe is 5.6.4,
the one which uses StlPort.
Right. I just posted a test from home using Turbo C++ from
Borland and MSVC8. I believe Borland uses Dinkumware
in this release so both are with the "same" std library.
My results were a bit different than yours but I had a much smaller
file. Ignoring debug runs you seem to get around
2:1 where I get more like 15:1 with Borland.
Maybe STLport is faster. I wonder how BDS fares? Probably
similar to the Turbo C++ that I'm running.
 

Re:Re: fgets() vs std::getline() performance

"Ed Mulroy" < XXXX@XXXXX.COM >wrote in message
Quote
>Which BCB and which stl is the op using?

As you can see from the screen capture it is BCB 6, bcc32.exe is 5.6.4,
the one which uses StlPort.
I was talking about the OP and it looks like he's running
Borland Studio (I imagine that's BDS).
At any rate, I tried your code again with a file size
of 1,256 KB.
Results were:
MSVC: 51/12, both from the IDE with no debugging
and from the exe (still around 4:1)
Borland
From the IDE, release / run with no debugging:
368/17
From the exe:
259/16
Sounds like "run with no debugging"
from the IDE has some problems.
I don't know what all this proves though.
This code doesn't really do anything but
load the files. Maybe things would change
profiling string parsing or something.
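One caveat about the numbers in this thread: they all come from GetTickCount(), whose resolution Microsoft documents as typically 10 to 16 milliseconds, so averages of a few milliseconds over ten runs are close to what the counter can resolve. Below is a sketch of a portable replacement for the test() function using std::chrono::steady_clock. That facility is C++11, so none of the compilers discussed here would have had it, and 'average_microseconds' is a name made up for this example.

```cpp
#include <chrono>
#include <string>

// Calls func(filename) num_tests times and returns the average
// wall-clock time of one call, in microseconds.
template <typename Func>
long long average_microseconds(Func func, const std::string& filename,
                               unsigned int num_tests)
{
    typedef std::chrono::steady_clock clock;
    const clock::time_point start = clock::now();
    for (unsigned int u = 0; u < num_tests; ++u) {
        func(filename);
    }
    const clock::time_point stop = clock::now();
    const long long total =
        std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
    return num_tests ? total / num_tests : 0;  // avoid dividing by zero
}
```

It can be called with the existing test functions, for example average_microseconds(test_getline, argv[1], 10). Raising the repeat count or using a larger file would also push the totals well above the timer's resolution.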