Board index » delphi » how do i sort records in files

how do i sort records in files

just the idea i'll learn myself the technical side of it

--
Diabolic_Preacher
The Evil shall rule
B'Coz The Preacher said so....
to reply thru email convert the 0 to o
and yeehaw to yahoo

---
open with care
Checked by AVG anti-virus system (http://www.grisoft.com).
Version: 6.0.314 / Virus Database: 175 - Release Date: 1/11/2002

 

Re:how do i sort records in files


Quote
Diabolic_Preacher wrote:
> just the idea i'll learn myself the technical side of it

> --
> Diabolic_Preacher
> The Evil shall rule
> B'Coz The Preacher said so....
> to reply thru email convert the 0 to o
> and yeehaw to yahoo

> ---
> open with care
> Checked by AVG anti-virus system (http://www.grisoft.com).
> Version: 6.0.314 / Virus Database: 175 - Release Date: 1/11/2002

Several ways: if enough memory available read the file to memory (array
or linked list) and sort this. Rewrite the whole file afterwards with
the sorted entries. Other thing: use a temporary file to build the
sorted one. Read through the data file and get the next needed value.
Write it to the temp file. Afterwards the temp file will be sorted and
the data file may be erased and the tempfile may be renamed to the name
of the data file.

Btw: there might be other/better solutions!

Greetings

Markus

Re:how do i sort records in files


For really large files where it does NOT all fit into memory at once,
there's something called Polyphase Sort, originally developed back when
"mass storage" meant a tape drive, memory size was measured in K (not
M), and Machines were really Machines (meaning you couldn't pick one
up).

The basic idea is as follows:  Assume you have N+1 "tapes", with the
unsorted data on one of them.  Read as much data as you can into memory,
sort it, then write the sorted "chunk" onto tape 1.  Continue reading,
sorting, and distribute "chunks" to tapes 2 .. N.  If you run out of
tapes (and you should), repeat from tape 1.

That's the "initialization step".  Now proceed to merge the N tapes back
onto a single tape.  By reading the first entry of the N tapes, you can
pick them off in order and write them to the output tape.  In the
beginning, all of the tapes have sorted "chunks", so you'll be merging N
"chunks" into one bigger one.  At some point, one of the input tapes
will "roll over" into its next chunk and stop contributing to output
chunk 1.  When you can't go any more, start merging output chunk 2.
Eventually you will have merged the N tapes into a single tape, making
much larger chunks (by an average factor of N, with the number of chunks
going down by a factor of N).

Now you can start the final repetetive phase.  Distribute the existing
chunks again to the N tapes, then merge them back again to the single
tape.  You again decrease the number of chunks by a factor of N
(approximately).  Repeat until the number of chunks is 1, when you will
then have the entire file sorted.

I think if you find a description of Polyphase Sort, it might differ
slightly from this.  I seem to recall there are some tricks to play
where you can use the tapes even more efficiently and gain improvement
factors bigger than N, but this makes the algorithm much more complex,
and harder to explain.

Bob Schor
Pascal Enthusiast
P.S. -- back when I was working with a 16-bit machine having a 16-bit
directly-addressable memory, I coded Polyphase Sort in Pascal to make
for myself a simple spell checker that looked up words in a dictionary.
Since I was spell-checking, all I wanted was a list of misspelled
(meaning "not in the dictionary") words.  I split my input file into
words, sorted the words, then did a side-by-side comparison with my
(sorted) dictionary.  All were contained in files, much bigger than
memory.

Quote
Diabolic_Preacher wrote:
> just the idea i'll learn myself the technical side of it

> --
> Diabolic_Preacher
> The Evil shall rule
> B'Coz The Preacher said so....
> to reply thru email convert the 0 to o
> and yeehaw to yahoo

> ---
> open with care
> Checked by AVG anti-virus system (http://www.grisoft.com).
> Version: 6.0.314 / Virus Database: 175 - Release Date: 1/11/2002

Re:how do i sort records in files


If the data is sufficiently unbiased, it is also possible to make one pass
through the file to read all the A's. Sort these and write them to output.
Then make another pass for the B's etc.

Of course, if you have the memory you can read all the A's, B's and C's, and
you can use the filesize to estimate how big a chunk to read.

As I mentioned this only really works for unbiased data - that is, data with
about the same number of entries in each chuck.
Chris

Quote
"Diabolic_Preacher" <pintoo...@remyahoo.com> wrote in message

news:a2aamn$lt5$1@news.vsnl.net.in...
Quote
> just the idea i'll learn myself the technical side of it

> --
> Diabolic_Preacher
> The Evil shall rule
> B'Coz The Preacher said so....
> to reply thru email convert the 0 to o
> and yeehaw to yahoo

> ---
> open with care
> Checked by AVG anti-virus system (http://www.grisoft.com).
> Version: 6.0.314 / Virus Database: 175 - Release Date: 1/11/2002

Other Threads