Board index » delphi » Sorting Record Files

Sorting Record Files

In article <370FB129.263F1...@eunet.at>,
  "Ing. Franz Glaser" <meg-gla...@eunet.at> wrote:

Quote
> David wrote:

> > Is there a procedure for the sorting of a file of records ? Any help
> > would be appreciated.

> It is VERY unusual to sort the records on file. Instead you sort
> the index, either in RAM or on a special index file.

Are you serious?

Quote
> Explanation: Records contain several fields, and usually at least
> one field is a key field. You create (in the most simple implementation)
> an array of strings in RAM holding the keys together with the record
> number.

You're not reading the question, the guy wants to sort a file of records.

David,

type the following in into your browser, without the < & >, and take your pick
for the nearest site:

<http://ftpsearch.lycos.com/cgi-bin/search?form=medium&query=rpsrt102&...
insensitive+substring+match&otype=Exact+search&oquery=rpsrt102.zip>

RPSORT by the late Robert Pirko will do what you want to do, and nothing, as
in _ABSOLUTELY!_ nothing, comes even close to its speed. It can handle both
variable (text) and fixed (record) length files, can sort on multiple keys,
keys can be C-strings, Pascal strings, ASCII, FPU datatypes, TP reals,
(un)singned integer, BASICA/GWBASIC reals. It can optionally delete
duplicates, remove (for text files) blank lines, it can merge multiple input
files into one output file, and best of all it comes with the complete
assembler source and 5 builtin help screens... (and still is only 18597
bytes) It's a brilliant program!

Robert
--
Robert AH Prins
prin...@williscorroon.com

-----------== Posted via Deja News, The Discussion Network ==----------
http://www.dejanews.com/       Search, Read, Discuss, or Start Your Own    

 

Re:Sorting Record Files


Quoting a message by Robert AH Prins <prin...@williscorroon.com> in
comp.lang.pascal.borland:

Quote
>> It is VERY unusual to sort the records on file. Instead you sort
>> the index, either in RAM or on a special index file.

>Are you serious?

Very.

In a network situation, sorting the file directly can cost you your job,
as it could cost the company hundreds of thousands of dollars or
hundreds of man hours.

Imagine, if you will, that you're working for a bank or any other large
firm. Your program is editing the employee information file. As you're
in the middle of sorting the file, someone at another workstation tries
to write to the file. File gets chewed, the genious who was sorting it
directly gets fired, the company has to re-input the information for all
25,000 employees from scratch. That's a lot of overtime :>

Problems like this could also occur in a multi-tasking or other
multi-user situation (Linux).

The proper way to sort a file involves three steps;

Old master
Temp master
New master

Copy Old master to Temp master, sort Temp master, replace Old master
with New master.

--

= Stewart Honsberger (AKA Blackdeath)
= Web: http://sprk.com/blackdeath ICQ UIN: 3484915
= Remove 'thir{*word*249}' to reply privately

... Age is only important if you're a cheese
-!- GOPGP/2 v1.23

Re:Sorting Record Files


In article <IBLE3QRK14wM09...@13softhome.net>,

Quote
  blackde...@13softhome.net wrote:
> Quoting a message by Robert AH Prins <prin...@williscorroon.com> in
> comp.lang.pascal.borland:

> >> It is VERY unusual to sort the records on file. Instead you sort
> >> the index, either in RAM or on a special index file.

> >Are you serious?

> Very.

> In a network situation, sorting the file directly can cost you your job,
> as it could cost the company hundreds of thousands of dollars or
> hundreds of man hours.

In a network situation, your access to the file would lock it!

Quote
> Imagine, if you will, that you're working for a bank or any other large
> firm. Your program is editing the employee information file. As you're
> in the middle of sorting the file, someone at another workstation tries
> to write to the file.  File gets chewed, the genious who was sorting it
> directly gets fired, the company has to re-input the information for all
> 25,000 employees from scratch. That's a lot of overtime :>

Banks and other large firms do have decent backup procedures, and if the file
was this important, you would probably not have wrtite permission on it
anyway.

Quote
> Problems like this could also occur in a multi-tasking or other
> multi-user situation (Linux).

> The proper way to sort a file involves three steps;

> Old master
> Temp master
> New master

> Copy Old master to Temp master, sort Temp master, replace Old master
> with New master.

Doesn't work, because while you are sorting temp, updates to old will be lost
when you copy temp to new.

Robert
--
Robert AH Prins
prin...@williscorroon.com

-----------== Posted via Deja News, The Discussion Network ==----------
http://www.dejanews.com/       Search, Read, Discuss, or Start Your Own    

Re:Sorting Record Files


In article <7es7h1$48...@nnrp1.dejanews.com>,
  Robert AH Prins <prin...@williscorroon.com> wrote:

Quote

> > Copy Old master to Temp master, sort Temp master, replace Old master
> > with New master.

> Doesn't work, because while you are sorting temp, updates to old will be lost
> when you copy temp to new.

This and the record lock feature occurred to me when I read his message.  One
other thing:  Sorting important files would probably occur during the hours
when most other activity is shut down anyway.  Otherwise, other users on the
network would be excluded from writing to the file (and possibly reading as
well) during the hours they need to do just that.

----------
WARNING: Anti-spam fanatic.  Keep your ads to yourself!!!

-----------== Posted via Deja News, The Discussion Network ==----------
http://www.dejanews.com/       Search, Read, Discuss, or Start Your Own    

Re:Sorting Record Files


Quote
walt...@my-dejanews.com wrote:

...
Quote
> One
> other thing:  Sorting important files would probably occur during the hours
> when most other activity is shut down anyway.  Otherwise, other users on the
> network would be excluded from writing to the file (and possibly reading as
> well) during the hours they need to do just that.

Hmm... what to do in a 24x7 - company then? Like an online-store?

regards
--
Arno Fehm (af...@bigfoot.de)

------------------------------------------------------------------------
Member of Grey Dreams Design: visit http://GreyDreams.home.pages.de !!!!
He who can destroy a thing has the real control over it. (Frank Herbert)
------------------------------------------------------------------------

Re:Sorting Record Files


Quote
In article <7es7h1$48...@nnrp1.dejanews.com> prin...@williscorroon.com wrote...
> In a network situation, your access to the file would lock it!

Now there's a sweeping statement! File locking depends on (1) the
network software and (2) the way in which you have accsessed the file.

Quote
> Banks and other large firms do have decent backup procedures, and if
> the file was this important, you would probably not have wrtite
> permission on it anyway.

These are 'probably' and 'maybe' statements, neither of which I would
place any faith in.

Mike{*word*106}son, Black Cat Software Factory, Edinburgh, Scotland
fax 0131-271-1551 - Columnated Ruins Domino - Mellotron M400 #996

Re:Sorting Record Files


In article <37123B84.81CA6...@bigfoot.de>,
  Arno Fehm <af...@bigfoot.de> wrote:

Quote
> walt...@my-dejanews.com wrote:
> ...
> > One
> > other thing:  Sorting important files would probably occur during the hours
> > when most other activity is shut down anyway.  Otherwise, other users on the
> > network would be excluded from writing to the file (and possibly reading as
> > well) during the hours they need to do just that.
> Hmm... what to do in a 24x7 - company then? Like an online-store?

Well, where I work (24 hrs but not online store), we still have a time when
transactions slow down.  That's when we do things like sorts and major
reports.

----------
WARNING: Anti-spam fanatic.  Keep your ads to yourself!!!

-----------== Posted via Deja News, The Discussion Network ==----------
http://www.dejanews.com/       Search, Read, Discuss, or Start Your Own    

Re:Sorting Record Files


In article <CheetahPRO_v1.121n8_86...@blackcat.demon.co.uk>,

Quote
  mike@blackcat..demon..co..uk wrote:
> In article <7es7h1$48...@nnrp1.dejanews.com> prin...@williscorroon.com
wrote...

> > In a network situation, your access to the file would lock it!

> Now there's a sweeping statement! File locking depends on (1) the
> network software and (2) the way in which you have accsessed the file.

OK, .... to the file would probably lock it!

Quote
> > Banks and other large firms do have decent backup procedures, and if
> > the file was this important, you would probably not have wrtite
> > permission on it anyway.

> These are 'probably' and 'maybe' statements, neither of which I would
> place any faith in.

In that case I would suggest that you keep your money under your mattress,
because that's the way it happens at banks and insurance companies. (Note the
omission of ALL. I have worked only at a number of them, but they all had such
procedures!)

But let's return to the original question:

Quote
>Is there a procedure for the sorting of a file of records ? Any help would
>be appreciated.

>Thank you all,

>David

We seem to have digressed considerably. As I wrote in my reply to Franz
Glazer's first reply, the guy wants to sort a file. We can start discussions
about the rights or wrongs of sorting a file, file locking, backup
procedures, 24x7 operations, but as we do not know the context in which this
sort operation is placed, we are taking this thread, which, don't get me
wrong, is quite interesting, completely beyond the original purpose.

Maybe you can dip in again David, and tell us in which context you want your
data/file to be sorted?

Robert
--
Robert AH Prins
prin...@williscorroon.com

-----------== Posted via Deja News, The Discussion Network ==----------
http://www.dejanews.com/       Search, Read, Discuss, or Start Your Own    

Re:Sorting Record Files


Quote
In article <7eurho$ce...@nnrp1.dejanews.com> prin...@williscorroon.com wrote...
> > Now there's a sweeping statement! File locking depends on (1) the
> > network software and (2) the way in which you have accsessed the file.

> OK, .... to the file would probably lock it!

'Probably' isn't good enough,is it?!

Quote
> In that case I would suggest that you keep your money under your mattress,
> because that's the way it happens at banks and insurance companies.

By omitting the word 'all' you obviate any use that your statement has.
If it doesn't happen at one such place then you cannot count on it
happening at all. (Yes, I've had first-hand experience of this sort of
thing)

Mike{*word*106}son, Black Cat Software Factory, Edinburgh, Scotland
fax 0131-271-1551 - Columnated Ruins Domino - Mellotron M400 #996

Re:Sorting Record Files


In article <CheetahPRO_v1.121n8_86...@blackcat.demon.co.uk>,

Quote
  mike@blackcat..demon..co..uk wrote:
> In article <7eurho$ce...@nnrp1.dejanews.com> prin...@williscorroon.com
wrote...

> > > Now there's a sweeping statement! File locking depends on (1) the
> > > network software and (2) the way in which you have accsessed the file.

> > OK, .... to the file would probably lock it!

> 'Probably' isn't good enough,is it?!

> > In that case I would suggest that you keep your money under your mattress,
> > because that's the way it happens at banks and insurance companies.

> By omitting the word 'all' you obviate any use that your statement has.
> If it doesn't happen at one such place then you cannot count on it
> happening at all. (Yes, I've had first-hand experience of this sort of
> thing)

<Sigh>

-----------== Posted via Deja News, The Discussion Network ==----------
http://www.dejanews.com/       Search, Read, Discuss, or Start Your Own    

Re:Sorting Record Files


David's origianl problem as sent to me by email:

Quote
> Hello All,

> It is the me, David who asked the original question and caused all
> this problems.

Oh no, you're question has started a nice thread, going slightly off-topic,
but who cares.

Quote
> My problem was to sort a file of records as it was
> asked in an assignment at University. (I mean I wanted a few
> pointers to a procedure .. I don't mean I expected someone to do
> it for me ofcourse) Anyway .. the file to sort was the following:

Guest_Recs = Record
                Name       : String[10];
                Surname    : String[20];
                IDorPass   : string[7];
                Room       : Integer;
                Rate       : String[2];
             end;

Room_Recs  = Record
                No_of_Beds : Byte;
                Date_In    : Str_20;
                Date_out   : Str_20;
                Extra      : Set of Extras;
                RecNo      : Integer;
             end;
Index_Rec  = Record
                Surname    : String[20];
                RecNo      : Integer;
             End;

Quote
> Index rec was the file I wanted sorted to have the surnames listed
> in alphabetical order.
> The procedure I used is as follows (It seems to work but any
> comments would be appreciated)

{****************************************************************}
Function UpCased(s:string):string;

var
  i : Integer;
begin
  for i := 1 to Length(s) do
    s[i] := UpCase(s[i]);
    upcased:=s;
end;

Procedure Sort_Index_File;

Var
  i,x,a,upperb,lastswap:integer;
  Rec1,Rec2,temp:index_rec;

Begin
  clrscr;
  upperb := filesize(indexfile) -1;
  while upperb <>0 do
   begin
      lastswap :=0;
       for i := 0 to upperb do
         begin
          seek(Indexfile,i);
          read(Indexfile,rec1);
          seek(Indexfile,(i+1));
          read(Indexfile,rec2);
           if Upcased(rec1.surname) > Upcased(rec2.surname) then
             begin
               temp:=rec1;
               rec1:=rec2;
               rec2:=temp;
               seek(Indexfile,i);
               write(indexfile,rec1);
               seek(Indexfile,(i+1));
               write(indexfile,rec2);
               lastswap :=i ;
             end;

         end;
        upperb:= lastswap;

   end;
end;

{****************************************************************}

Quote
> For anyone who is interested, please find a copy of my program up to now

<attached binaries snipped for the obvious reasons>

Quote
> (Now remember, I'm still a beginner) and once more, any comments would be
> greatly appreciated.

1) You're using a bubble sort. It works, but it's about the slowest sort
known, however as this is a university assignment, any sort will do.

2) You are doing a bubble sort writing and reading records to and from disk
for every swap and upcasing everything over and over again. Sure, it will
work, but depending on the size of the file and the speed of the disk, this
is a very slow way of doing things. How could you speed it up?

Depending on concurrency requirements, and to keep things simple, let's
assume there are none you could as a first improvement read in every record,
and write out a second file with records containing not only the original
name, but also an upcased version {type Index_rec2}. You could then sort this
file and after the sort copy it back to the original file, minus the extra
field.

Still, despite eliminating the constant calls to Upcased, it will still be
slow.

So let's look at a second improvement, you declare an array[0..N] of
Index_Rec2, where N is a big number. You read in the first N records of your
file, or all of them (=A] if it contains less than N (in which case you will
have to remember that you should sort only the first A elements of the array]
adding the Upcased name to each. You can now sort the array in RAM with
whatever sort you like (have a look at Shell sort, it's only slightly more
complicated than bubble sort, but a whole lot faster), using array[0] as your
temp. If A<=N you finish by writing out the elements minus your upcased
field.

Of course you've got a problem if your file contains more than N records or if
you've got limited memory. (With both the original and the upcased name the
record will have a size of 44 bytes, limiting N to, depending on other global
variables, somewhere arond the 1450)

Any clues? Give it a try and come back if you're really stuck. (And if you do
so in a new thread, cross-post to both borland.public.turbopascal &
comp.lang.pascal.borland - the latter has a rather larger audience)

Quote
> Thank you all very much for the interest shown,

Given thay you are actually doing your own homework you deserve it!

Robert
--
Robert AH Prins
prin...@williscorroon.com

-----------== Posted via Deja News, The Discussion Network ==----------
http://www.dejanews.com/       Search, Read, Discuss, or Start Your Own    

Other Threads