Reading and parsing a text file.


2006-07-25 08:45:23 AM
cppbuilder12
We do it every day. Read a text file and then parse the lines; usually
the line has some delimiter character.
So in a BCB app how would you do it? To add a bit more spice; How would
you do it for MBCS?
John.
 
 

Re:Reading and parsing a text file.

John Grabner < XXXX@XXXXX.COM > writes:
Quote
We do it every day. Read a text file and then parse the lines;
usually the line has some delimiter character.

So in a BCB app how would you do it? To add a bit more spice; How
would you do it for MBCS?
[assuming MBCS means string of multi-byte characters]
You get the simplest programs when the multi-byte encoding is
converted into some fixed-number-of-byte encoding as early as possible
and converted back (if necessary) as late as possible. All the code in
between can then operate on characters of equal size, and hopefully
use std::wstring and its operations.
Leaving the characters in their multi-byte encoding may result in a
more efficient program, in terms of both space and time. I doubt that
this is the case in a line-by-line parser, though.
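
To illustrate the "code in between": once a line has been converted to
std::wstring, splitting at a delimiter is a plain character search. This
matters because in some multi-byte encodings (Shift-JIS, for example) the
second byte of a character can have the same value as an ASCII delimiter
such as '|', so a byte-wise search may split in the middle of a character.
The following is only a minimal sketch; the helper name splitLine is
made up for illustration and is not part of any library.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: split an already decoded line at every occurrence
// of delim. Empty fields are kept, so L"a,,b" yields three fields.
std::vector<std::wstring> splitLine(const std::wstring& line, wchar_t delim)
{
    std::vector<std::wstring> fields;
    std::wstring::size_type start = 0;
    for (;;) {
        const std::wstring::size_type pos = line.find(delim, start);
        if (pos == std::wstring::npos) {
            // No further delimiter: the rest of the line is the last field.
            fields.push_back(line.substr(start));
            break;
        }
        fields.push_back(line.substr(start, pos - start));
        start = pos + 1; // continue after the delimiter
    }
    return fields;
}
```

For example, splitLine(L"name|city|42", L'|') returns the three fields
"name", "city" and "42". Quoting or escaping of the delimiter is not
handled here.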
 

Re:Reading and parsing a text file.

Thomas Maeder [TeamB] wrote:
Quote
John Grabner < XXXX@XXXXX.COM > writes:


>So in a BCB app how would you do it? To add a bit more spice; How
>would you do it for MBCS?


[assuming MBCS means string of multi-byte characters]
I am talking about multi-byte characters.
So how would you load the text file?
Use TStringList & load from file?
Use a vector<string> and getline?
Others?
John.
 

Re:Reading and parsing a text file.

John Grabner < XXXX@XXXXX.COM > writes:
Quote
>>So in a BCB app how would you do it? To add a bit more spice; How
>>would you do it for MBCS?
>[assuming MBCS means string of multi-byte characters]

I am talking about multi-byte characters.

So how would you load the text file?
Use TStringList & load from file?
This may work - I let the VCL specialists answer this part.
Quote
Use a vector<string> and getline?
Why vector?
The streams part of the Standard C++ Library has built-in support for
converting between encodings on input and (reverse conversion) on
output. The component providing that support is the codecvt facet of
the localization machinery.
The required steps are:
1. Create a locale object whose codecvt facet does the appropriate
conversions.
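2. Open a file using a file stream object (std::wifstream or
std::wofstream when converting to and from wchar_t).
3. "Imbue" the fstream object with the locale object, before the first
read or write.
4. Use the fstream object for input and/or output.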
E.g.:
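#include <cwchar>
#include <fstream>
#include <locale>
#include <string>

int main()
{
    // The wide-character facet converts the file's multi-byte encoding to wchar_t.
    typedef std::codecvt_byname<wchar_t, char, std::mbstate_t> Converter;
    std::locale loc(std::locale(), new Converter(SOME_CODECVT_NAME));
    std::wifstream in("myinput");
    in.imbue(loc); // must happen before the first read
    std::wstring line;
    while (std::getline(in, line))
        ; // process line
}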
It is implementation-defined what value of SOME_CODECVT_NAME you have
to use. See the documentation of the library you are using.
And there are of course encodings that your library implementation
will not support; for such encodings, you'd have to derive a
user-defined class from std::codecvt and implement the conversion
yourself.
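
A rough sketch of what such a user-defined facet might look like follows.
To keep it short it handles ISO-8859-1, where every byte maps to the
character with the same value; a real multi-byte encoding would
additionally have to deal with incomplete sequences at the end of the
input buffer. The names Latin1Codecvt and readLatin1Lines are made up for
this example, and the code assumes a C++11 compiler (override, noexcept).

```cpp
#include <cstddef>
#include <cwchar>
#include <fstream>
#include <locale>
#include <string>
#include <vector>

// Hypothetical facet: every external byte is one ISO-8859-1 character.
class Latin1Codecvt : public std::codecvt<wchar_t, char, std::mbstate_t>
{
protected:
    // External bytes -> wide characters (used when reading).
    result do_in(std::mbstate_t&, const char* from, const char* from_end,
                 const char*& from_next, wchar_t* to, wchar_t* to_end,
                 wchar_t*& to_next) const override
    {
        for (; from != from_end && to != to_end; ++from, ++to)
            *to = static_cast<wchar_t>(static_cast<unsigned char>(*from));
        from_next = from;
        to_next = to;
        return from == from_end ? ok : partial;
    }

    // Wide characters -> external bytes (used when writing).
    result do_out(std::mbstate_t&, const wchar_t* from, const wchar_t* from_end,
                  const wchar_t*& from_next, char* to, char* to_end,
                  char*& to_next) const override
    {
        result r = ok;
        for (; from != from_end && to != to_end; ++from, ++to) {
            if (static_cast<unsigned long>(*from) > 0xFFul) {
                r = error; // not representable in ISO-8859-1
                break;
            }
            *to = static_cast<char>(static_cast<unsigned char>(*from));
        }
        from_next = from;
        to_next = to;
        if (r == ok && from != from_end)
            r = partial; // output buffer was too small
        return r;
    }

    // The encoding is stateless, so there is nothing to unshift.
    result do_unshift(std::mbstate_t&, char* to, char*, char*& to_next) const override
    {
        to_next = to;
        return noconv;
    }

    int do_encoding() const noexcept override { return 1; }         // 1 byte per character
    bool do_always_noconv() const noexcept override { return false; }
    int do_max_length() const noexcept override { return 1; }

    int do_length(std::mbstate_t&, const char* from, const char* from_end,
                  std::size_t max) const override
    {
        const std::size_t avail = static_cast<std::size_t>(from_end - from);
        return static_cast<int>(avail < max ? avail : max);
    }
};

// Hypothetical helper: read all lines of a file through the facet above.
std::vector<std::wstring> readLatin1Lines(const char* path)
{
    std::wifstream in;
    in.imbue(std::locale(std::locale(), new Latin1Codecvt)); // locale owns the facet
    in.open(path);
    std::vector<std::wstring> lines;
    std::wstring line;
    while (std::getline(in, line))
        lines.push_back(line);
    return lines;
}
```

The facet is installed in the same way as a codecvt_byname facet: it is
handed to a locale object, which takes ownership of it, and that locale is
imbued into the stream before any input takes place.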