Enhanced text file accessing in TP

Hello!

I'm working with a LARGE text file in one of my programs.  The file is
probably about 300 lines long with each line between 40 and 250 chars long...
So rather than trying to read it all in at once (because of the 64 KB limit),
I've been reading it in a page at a time, as needed.

The problem is, I need to WRITE each page back to the text file before loading
the next one, and using TP 7.0's internal text file handling, I cannot do
that since Append will open the file as read only and Reset will open the file
as write only.

Could someone please supply some code for opening text files in RANDOM access
mode?  Or could someone supply the interrupt information to do this?

Also, if anyone has a ReadLn/WriteLn text file replacement that is faster than
TP's, I'd appreciate that too!

Thanks for any help
James

 

Re:Enhanced text file accessing in TP


I don't think you can.

You could open it and get read and write access to the
file using
Interrupt $21, function $3D.

But when writing back to this kind of file you will damage
the data in it.  You would probably overwrite some of it.
I think it's a better idea to use a temporary file which you
can write to, and later on copy its contents back,
or rename the file.
300 lines of 250 chars is not that much, so time can't be the
problem...
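
A minimal sketch of the temporary-file idea, assuming the edit can be done one line at a time. The file names and the Process routine are hypothetical placeholders, not anything from the original program:

```pascal
program TempCopy;
{ Sketch only: SrcName, TmpName and Process are assumptions. }
const
  SrcName = 'DATA.TXT';
  TmpName = 'DATA.$$$';
var
  Src, Dst : Text;
  Line     : string;

procedure Process(var S : string);
{ Placeholder edit: upper-case the first character of the line. }
begin
  if Length(S) > 0 then S[1] := UpCase(S[1]);
end;

begin
  Assign(Src, SrcName); Reset(Src);     { source is only ever read }
  Assign(Dst, TmpName); Rewrite(Dst);   { temp file is only ever written }
  while not Eof(Src) do
  begin
    ReadLn(Src, Line);
    Process(Line);
    WriteLn(Dst, Line);
  end;
  Close(Src); Close(Dst);
  Erase(Src);                           { remove the old file ... }
  Rename(Dst, SrcName);                 { ... and give the new one its name }
end.
```

Because the source is never written to, a line that grows or shrinks cannot damage its neighbours, and the original survives until the Erase at the very end.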


Re:Enhanced text file accessing in TP


Quote
> I'm working with a LARGE text file in one of my programs.  The file is
> probably about 300 lines long with each line between 40 and 250 chars long...
> So rather than trying to read it all in at once (because of the 64kb limit),
> i've been reading it in page at a time, as needed.

   That's not large at all.  Granted, you can't store that much data in
normal data variables, but you could certainly store that much data on
the Heap and use pointers to access it.

Quote
> The problem is, I need to WRITE each page back to the text file before loading
> the next one, and using TP 7.0's internal text file handling, I cannot do
> that since Append will open the file as read only and Reset will open the file
> as write only.

   You're confused here: Append opens a text file for _adding_ new data
(at the end of existing data).  I don't know that it affects your
problem, but you should get the usage of Append (and Reset) clear before
you start implementing something...

Quote
> Could someone please supply some code for opening text files in RANDOM access
> mode?  Or could someone supply the interrupt information to do this?

   That's not an easy thing to do, and it almost certainly isn't what you
need (based on your problem description).  You can process "chunks" of
your file (reading in some, changing it, and writing it out to a text
file you Append to), in a manner similar to your statement above...OR you
could read the whole file into the Heap (using a pointer array) and do
everything at once, prior to writing the entire changed file data to a
new file.  The latter is what seems best, and I'm attaching code which
will explain and help.

Quote
> Also, if anyone has a ReadLn/WriteLn text file replacement that is faster than
> TPs, i'd appreciate that too!

   Not necessary, more than likely - using large Text file buffers (with
SetTextBuf) will greatly speed up TP's file processing.  Use an 8-16K
buffer for best results.
   A Text file/pointer array program follows (note that it's cobbled from
a program which processes TagLines, and I've deleted a lot of code you
don't need; it's intended only for you to see how the key
processing I mention above is done...):

program Text_File_Viewer;                     { MRCopeland 941231 }
{$M 8192,0,655000}

const TLIM    = 10000;                               { Records Limit }
type  S80     = string[80];     { longer lines are truncated on read }
      LLPTR   = ^S80;
      BigBuf  = array[0..16383] of char;
var   I       : integer;
      PAX     : integer;                       { Pointer Array indeX }
      T,F3    : S80;
      PA      : array[1..TLIM] of LLPTR;             { Pointer Array }
      FV3     : Text;
      Buff    : ^BigBuf;                          { large i/o buffer }

procedure INITIALIZE;                { initialize system & variables }
begin
  if ParamCount > 0 then F3 := ParamStr(1)
  else
    begin
      Write ('Enter filename: '); readln (F3)
    end;
  for I := 1 to TLIM do PA[I] := Nil;
  { pass the buffer itself (Buff^), not the pointer variable }
  Assign (FV3,F3); New (Buff); SetTextBuf (FV3,Buff^);
  {$I-} Reset (FV3); {$I+}
  if IOResult <> 0 then
    begin
      WriteLn ('Cannot Open '+F3+' as input file'); Halt (1)
    end
end;  { INITIALIZE }

procedure SORT_RECS (LEFT,RIGHT : word);       { Lo-Hi QuickSort }
var LOWER,UPPER,MIDDLE : Word;
    PIVOT              : S80;
    TMP                : LLPTR;
begin
  LOWER := LEFT; UPPER := RIGHT; MIDDLE := (LEFT+RIGHT) Shr 1;
  PIVOT := PA[MIDDLE]^;
  repeat
    while PA[LOWER]^ < PIVOT do Inc(LOWER);
    while PIVOT < PA[UPPER]^ do Dec(UPPER);
    if LOWER <= UPPER then
      begin                   { swap the pointers, not the strings }
        TMP := PA[LOWER]; PA[LOWER] := PA[UPPER];
        PA[UPPER] := TMP; Inc (LOWER); Dec (UPPER)
      end;
  until LOWER > UPPER;
  if LEFT < UPPER then SORT_RECS (LEFT, UPPER);
  if LOWER < RIGHT then SORT_RECS (LOWER, RIGHT)
end;                                                { SORT_RECS }

procedure READ_RECS;
begin
  PAX := 0;
  while (not EOF (FV3)) and (PAX < TLIM) do   { stop at array limit }
    begin
      readln (FV3,T); Inc (PAX);
      New (PA[PAX]); PA[PAX]^ := T        { one heap string per line }
    end;
  Close (FV3); Dispose (Buff);
  if PAX > 1 then SORT_RECS (1,PAX)
end;  { READ_RECS }

begin  { MAIN LINE }
  INITIALIZE;                     { initialize system & variables }
  READ_RECS                                { read & store records }
end.

Re:Enhanced text file accessing in TP


Quote
In article <tUnq.24$N21.91...@nnrp2.ptd.net>, Me <cco...@ptd.net> wrote:
>Hello!

>I'm working with a LARGE text file in one of my programs.  The file is
>probably about 300 lines long with each line between 40 and 250 chars long...
>So rather than trying to read it all in at once (because of the 64kb limit),
>i've been reading it in page at a time, as needed.

That is not large. If you were talking about 30000 lines then it would
be somewhat large. You can read the whole file if you define an array:

type Pstring=^string;

var data:array[1..500] of Pstring; { for example }

then write routines to allocate and dispose strings:

function Newstr(const st:string):Pstring;
var p:Pstring;
begin
  getmem(p,length(st)+1);
  p^:=st;
  newstr:=p;
End;

Procedure disposestr(p:Pstring);
begin
  freemem(p,length(p^)+1);   { must match the size passed to getmem }
End;

Do not even attempt to modify those strings in place, since each block
holds exactly length+1 bytes and freemem must be given that same size.
Just dispose the old one and allocate a new one. I.e.

not

var p:pstring;
...
delete(p^,1,1)

but:

var p:pstring;
    st:string;
...
st:=p^;
delete(st,1,1);
Disposestr(p);
p:=newstr(st);

If you need to handle lines longer than 255 then there is a problem. I
personally have solved this by cutting the lines during reading and
marking them for recombination when writing to the file. (#26 as the
first character is a good marker, as it cannot occur in a text file.)
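
Putting the pieces above together, a minimal sketch of the load-everything approach might look like this. The file names IN.TXT and OUT.TXT, the MaxLines bound and the sample edit are assumptions for illustration; DisposeStr takes the pointer as a var parameter here so that it can be set to nil, which is only one way to write it:

```pascal
program LoadAll;
{ Sketch only: file names, MaxLines and the sample edit are assumptions. }
const
  MaxLines = 500;
type
  PString = ^string;
var
  Data  : array[1..MaxLines] of PString;
  Count : Integer;
  I     : Integer;
  F     : Text;
  St    : string;

function NewStr(const S : string) : PString;
var P : PString;
begin
  GetMem(P, Length(S) + 1);      { length byte plus the characters }
  P^ := S;
  NewStr := P;
end;

procedure DisposeStr(var P : PString);
begin
  if P <> nil then
  begin
    FreeMem(P, Length(P^) + 1);  { same size that GetMem was given }
    P := nil;
  end;
end;

begin
  Count := 0;
  Assign(F, 'IN.TXT'); Reset(F);
  while (not Eof(F)) and (Count < MaxLines) do
  begin
    ReadLn(F, St);
    Inc(Count);
    Data[Count] := NewStr(St);
  end;
  Close(F);

  { Edit line 1 the safe way: copy out, change, free, reallocate. }
  if Count > 0 then
  begin
    St := Data[1]^;
    St := St + ' (edited)';
    DisposeStr(Data[1]);
    Data[1] := NewStr(St);
  end;

  { Write everything to a different file, then release the heap. }
  Assign(F, 'OUT.TXT'); Rewrite(F);
  for I := 1 to Count do
  begin
    WriteLn(F, Data[I]^);
    DisposeStr(Data[I]);
  end;
  Close(F);
end.
```

Each line costs only its own length plus one byte on the heap, so 300 to 600 lines of up to 255 chars stay well under the conventional-memory limit.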

Quote
>The problem is, I need to WRITE each page back to the text file before loading
>the next one, and using TP 7.0's internal text file handling, I cannot do
>that since Append will open the file as read only and Reset will open the file
>as write only.

If you choose some kind of file approach, use an intermediate file,
completely different from your source file, or else just use a different
destination file. You do not tell us enough about the problem. Is your
handling sequential (like a filter that, say, capitalizes the file) or
random access, like in a text editor?

In general, the first thing you should tell us is what practical problem
you are trying to solve with the program.

Quote

>Could someone please supply some code for opening text files in RANDOM access
>mode?  Or could someone supply the interrupt information to do this?

Maybe there is a reason why one cannot access text files randomly. Maybe
that has something to do with the records (lines) being of different
sizes.

Quote

>Also, if anyone has a ReadLn/WriteLn text file replacement that is faster than
>TPs, i'd appreciate that too!

Use SetTextBuf() with, say, a 10K buffer.
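
A short sketch of what that might look like; the file name and the exact buffer size are assumptions. The buffer has to stay in scope for as long as the file is open, so it is a global here:

```pascal
program BufferedRead;
{ Sketch only: the file name and buffer size are assumptions. }
var
  F     : Text;
  Buf   : array[0..10239] of Char;   { 10K buffer, must outlive the open file }
  Line  : string;
  Count : LongInt;
begin
  Assign(F, 'DATA.TXT');
  SetTextBuf(F, Buf);                { after Assign, before any I/O on F }
  Reset(F);
  Count := 0;
  while not Eof(F) do
  begin
    ReadLn(F, Line);                 { now served from the 10K buffer }
    Inc(Count);
  end;
  Close(F);
  WriteLn(Count, ' lines read');
end.
```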

Osmo

Re:Enhanced text file accessing in TP


Quote
>> So rather than trying to read it all in at once (because of the 64kb limit),
>> i've been reading it in page at a time, as needed.

>   That's not large at all.  Granted, you can't store that much data in
>normal data variables, but you could certainly store that much data on
>the Heap and use pointers to access it.

Yes, I guess the only thing for me to do would be to read it all in and write
it all back when finished with it.  The only problem is the file is 300 lines
x 255 chars NOW, but it's going to get much larger in the future - probably
double in size at least.

Quote
>> The problem is, I need to WRITE each page back to the text file before loading
>> the next one, and using TP 7.0's internal text file handling, I cannot do
>> that since Append will open the file as read only and Reset will open the file
>> as write only.

>   You're confused here: Append opens a text file for _adding_ new data
>(at the end of existing data).  I don't know that it affects your
>problem, but you should get the usage of Append (and Reset) clear before
>you start implementing something...

Nah, i'm not confused, I just typed wrong. :P  Notice I resevered the two
(Append = write only, Reset = read only).

Quote
>> mode?  Or could someone supply the interrupt information to do this?

>   That's not an easy thing to do, and it almost certainly isn't what you
>need (based on your problem description).  You can process "chunks" of
>your file (reading in some, changing it, and writing it out to a text
>file you Append to), in a manner similar to your statement above...OR you
>could read the whole file into the Heap (using a pointer array) and do
>everything at once, prior to writing the entire changed file data to a
>new file.  The latter is what seems best, and I'm attaching code which
>will explain and help.

I guess it might be possible for me to access the file as a file of char and
use BlockRead / BlockWrite.  See, the problem is that I have to read in a
chunk of the file, modify, and write it BACK to the SAME file in the SAME
position.  I just wrote a Text_Seek function but I haven't tried it on the
program.  I could always save the POS of the text position, close the file,
reopen as append and try to seek back to the original POS and write the data.
Then close it and reopen it with reset and seek back to the original POS
again...  But i'm not too informed on how Pascal handles text files as most of
my programming is done with typed records.

Even if that DOES work, it would probably make the program SLOOOOOOOOW...

Re:Enhanced text file accessing in TP


Quote
In article <guAq.106$N21.423...@nnrp2.ptd.net>, Me <cco...@ptd.net> wrote:
>>> So rather than trying to read it all in at once (because of the 64kb limit),
>>> i've been reading it in page at a time, as needed.

>>   That's not large at all.  Granted, you can't store that much data in
>>normal data variables, but you could certainly store that much data on
>>the Heap and use pointers to access it.

>Yes, I guess the only thing for me to do would be to read it all in and write
>it all back when finished with it.  The only problem is the file is 300 lines
>x 255 chars NOW, but its going to get much larger in the future - probably
>double in size at least.

Maybe you could tell us what on earth you are doing to the file. That is
the most relevant thing. It is hard to help when you hide the problem.

Are all lines 255 chars or just some?

Quote

>I guess it might be possible for me to access the file as a file of char and
>use BlockRead / BlockWrite.  See, the problem is that I have to read in a
>chunk of the file, modify, and write it BACK to the SAME file in the SAME
>position.

That is not the way to do things. One should not mess with the
original file until one is ready to make a save operation. Keep the
issue of using disk to extend memory separate from the file being
processed.

Quote
>I just wrote a Text_Seek function but I haven't tried it on the
>program.  I could always save the POS of the text position, close the file,
>reopen as append and try to seek back to the original POS and write the data.
>Then close it and reopen it with reset and seek back to the original POS
>again...  But i'm not too informed on how Pascal handles text files as most of
>my programming is done with typed records.

If you need that kind of stuff then the best thing is to discard the text type
altogether and just use untyped files.

Osmo

Re:Enhanced text file accessing in TP


Quote
> >   You're confused here: Append opens a text file for _adding_ new data
> >(at the end of existing data).  I don't know that it affects your
> >problem, but you should get the usage of Append (and Reset) clear before
> >you start implementing something...

> Nah, i'm not confused, I just typed wrong. :P  Notice I resevered the two
> (Append = write only, Reset = read only).

In most file types Reset is for read/write access by default. Resever - to
cut off again? If you always reserve Reset only for reading you may be
cutting yourself off from something useful.

Quote
> I guess it might be possible for me to access the file as a file of char and
> use BlockRead / BlockWrite.

"File of char" is _extremely_ slow. Also it is not used with
BlockRead/BlockWrite, the untyped "File" is. You also want to use a second
parameter in Reset (e.g. Reset(f,1)) and blockread/blockwrite as many bytes
in one go as possible, i.e. use a large buffer.

But it's a bad idea. The most significant feature of text files is that the
record length is undefined. So you may end up writing back a longer string
than you read - which would overwrite at least some of the following
record.
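
For reference, a minimal sketch of the untyped-file style described above, with Reset(f,1) and a large buffer. Because of the overwrite hazard it copies to a second file instead of writing back in place; the file names and buffer size are assumptions:

```pascal
program BlockCopy;
{ Sketch only: file names and the buffer size are assumptions. }
const
  BufSize = 16384;
var
  Src, Dst : file;                          { untyped files }
  Buf      : array[1..BufSize] of Byte;
  Got, Put : Word;
begin
  Assign(Src, 'DATA.TXT'); Reset(Src, 1);   { record size 1: counts are bytes }
  Assign(Dst, 'DATA.NEW'); Rewrite(Dst, 1);
  Put := 0;
  repeat
    BlockRead(Src, Buf, BufSize, Got);      { Got = bytes actually read }
    if Got > 0 then
      BlockWrite(Dst, Buf, Got, Put);       { Put = bytes actually written }
  until (Got = 0) or (Put <> Got);          { end of file, or disk full }
  Close(Src); Close(Dst);
end.
```

Any editing would happen on Buf between the read and the write, and the program then has to find the line breaks (#13#10) in the buffer itself.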

You haven't said what you are writing. "Text" may not be the most
appropriate file type for your application. If your file is only a sequence
of strings, whose lengths may vary, and which is not read by any other
program, would File of String be better? 600 strings is still only 150K
which is nothing on a modern hard disc. You would then be back to normal
"file of record" code, and random access becomes trivial.
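
A small sketch of the File of String idea, assuming a hypothetical DATA.DAT that no other program has to read. Every record occupies 256 bytes on disk, so a string can be rewritten longer without touching the records around it:

```pascal
program RandomLines;
{ Sketch only: the file name and record contents are assumptions. }
type
  TLine = string;                { 256 bytes per record on disk }
var
  F : file of TLine;
  S : TLine;
begin
  Assign(F, 'DATA.DAT');
  Rewrite(F);                    { create the file with three records }
  S := 'first';  Write(F, S);
  S := 'second'; Write(F, S);
  S := 'third';  Write(F, S);
  Close(F);

  Reset(F);                      { typed files: Reset gives read/write access }
  Seek(F, 1);                    { records are numbered from 0 }
  Read(F, S);                    { S is now 'second' }
  S := S + ', now longer';
  Seek(F, 1);                    { Read moved on, so step back }
  Write(F, S);                   { overwrite record 1 only }
  Close(F);
end.
```

The price is disk space: a 40-char line still takes a full 256-byte record.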

Quote
> I could always save the POS of the text position, close the file,
> reopen as append and try to seek back to the original POS and write the data.
> Then close it and reopen it with reset and seek back to the original POS
> again...  But i'm not too informed on how Pascal handles text files as most of
> my programming is done with typed records.

FilePos is not for text files. Since the record size is undefined (and
variable) and FilePos gives the position in terms of records, it could not
give a meaningful result.

Quote
> Even if that DOES work, it would probably make the program SLOOOOOOOOW...

Especially if you use Append to open for writing - it has to find the end
of the file and then look back for a char(26) which some text files use as
an end-of-file marker.

FP
