Need some suggestions from the experts here, for a fast algorithm :)


2007-04-30 04:31:00 AM
delphi202
Hello, I was wondering if some of you could share your knowledge here.
I'm trying to write an algorithm to (very) quickly do this:
I have a file (around 10 GB in size) with random letters A->Z (like ASDFSDA).
If the input is "ABD", I find the first "A", then find the distance from A
to the first "B" (say 5 chars). I then need to check whether "D" is the same
distance from B as B is from A (i.e. at B + 5 chars).
What is the best way to go about this? The file is STATIC and constant,
and will not change.
Any ideas/theories about how to go about this would be greatly
appreciated!
 
 

Re: Need some suggestions from the experts here, for a fast algorithm :)

desp writes:
Quote
I have a file (around 10 GB in size) with random letters A->Z (like ASDFSDA).
If the input is "ABD", I find the first "A", then find the distance from A
to the first "B" (say 5 chars). I then need to check whether "D" is the same
distance from B as B is from A (i.e. at B + 5 chars).
So,
AOffset = scan( array, 0, AVal )
BOffset = scan( array, AOffset+1, BVal )
return CVal == array[ BOffset * 2 - AOffset ]
Tricky parts being array[] is a file, and offsets may be more than the
4GB limit.
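
The pseudocode above can be fleshed out into a small runnable sketch. This is only an illustration in Python, not the poster's code: the name check_equal_spacing and the in-memory bytes argument are assumptions made for the example, and the bounds check is an addition, since the computed position can fall past the end of the data. Python integers are unbounded, so the 4GB concern does not show up here; in Delphi the offsets would need to be Int64.

```python
def check_equal_spacing(data, query):
    """Return True if the third letter sits as far past the first B
    as that B sits past the first A (hypothetical helper name)."""
    a_val, b_val, c_val = query[0:1], query[1:2], query[2:3]

    a_off = data.find(a_val)               # first occurrence of the first letter
    if a_off < 0:
        return False
    b_off = data.find(b_val, a_off + 1)    # first occurrence of the second letter after it
    if b_off < 0:
        return False

    c_off = 2 * b_off - a_off              # same as b_off + (b_off - a_off)
    if c_off >= len(data):                 # no wrap-around: position is past the end
        return False
    return data[c_off:c_off + 1] == c_val


# A is at position 1, B at 6 (distance 5), D at 11.
sample = b"XAXXXXBXXXXDXX"
print(check_equal_spacing(sample, b"ABD"))
```

Slicing (data[c_off:c_off + 1]) rather than indexing keeps the comparison bytes-to-bytes, so the same function also works on other bytes-like objects that support find and slicing.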
I would be tempted to make xOffset a two-part value, Sector and
Offset. Maybe decide that a Sector would be 1MB in size, so disk reads
would be 1MB each. Offset would of course be the offset within the 1MB
Sector. (I seem to remember someone mentioning that 32KB is the optimal
read size, rather than 1MB, but then, with a 32-bit Sector number, you'd
be limiting yourself to 128TB.)
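
A rough sketch of the Sector/Offset idea, in Python for illustration only. All names here (SECTOR_SIZE, split_offset, join_offset, max_addressable, scan) are made up for the example, and the 32-bit sector number behind the 128TB figure is an assumption about what the post had in mind.

```python
import io

SECTOR_SIZE = 1 << 20  # 1 MB per sector, as suggested above (adjustable)

def split_offset(pos):
    """Split an absolute file position into (sector, offset within sector)."""
    return divmod(pos, SECTOR_SIZE)

def join_offset(sector, offset):
    """Rebuild the absolute file position from its two parts."""
    return sector * SECTOR_SIZE + offset

def max_addressable(sector_size, sector_bits=32):
    """Largest file size reachable when the sector number has sector_bits bits."""
    return sector_size << sector_bits

def scan(f, start, val):
    """Return the absolute position of the first byte `val` at or after
    `start`, reading one sector at a time, or -1 if it is not found."""
    sector, offset = split_offset(start)
    while True:
        f.seek(sector * SECTOR_SIZE)
        block = f.read(SECTOR_SIZE)       # one sector-sized read per iteration
        if not block:                     # reached end of file
            return -1
        hit = block.find(val, offset)
        if hit >= 0:
            return join_offset(sector, hit)
        sector, offset = sector + 1, 0    # continue at the start of the next sector

# In-memory stand-in for the real file: the "A" sits 5 bytes into sector 1.
f = io.BytesIO(b"X" * (SECTOR_SIZE + 5) + b"A" + b"XXX")
print(scan(f, 0, b"A") == SECTOR_SIZE + 5)
# 32KB sectors with a 32-bit sector number: 2^15 * 2^32 = 2^47 bytes = 128TB.
print(max_addressable(32 * 1024) == 128 * 2**40)
```

With 1MB sectors the same 32-bit sector number reaches 2^52 bytes, so either choice is far beyond a 10GB file; the sector size mainly decides how much is read per disk access.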
 

Re: Need some suggestions from the experts here, for a fast algorithm :)

desp writes:
Quote
I have a file (around 10 GB in size)
This is the main problem. File IO will dominate so make yourself
familiar with CreateFileMapping and MapViewOfFile.
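
To give a feel for the memory-mapping approach, here is a minimal sketch using Python's mmap module, which as far as I know sits on top of the same Windows file-mapping facility. It is not Delphi code and not a tuned implementation; query_mapped is a hypothetical name, and mapping the whole file in one go assumes a 64-bit process (a 32-bit process could not fit a 10GB view in its address space and would have to map the file in pieces).

```python
import mmap

def query_mapped(path, query):
    """Answer one three-letter query against a read-only memory-mapped file
    (hypothetical helper; query is a bytes object such as b"ABD")."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file; ACCESS_READ keeps the mapping read-only.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            a_off = m.find(query[0:1])             # first occurrence of the first letter
            if a_off < 0:
                return False
            b_off = m.find(query[1:2], a_off + 1)  # first second letter after it
            if b_off < 0:
                return False
            c_off = 2 * b_off - a_off              # b_off + (b_off - a_off)
            # Indexing a mapping yields an int, as does indexing bytes.
            return c_off < len(m) and m[c_off] == query[2]
```

The operating system pages the file in on demand, so only the parts actually touched by the two searches and the final lookup are read from disk. Note that an empty file cannot be mapped this way; mmap raises an error in that case.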
 

Re: Need some suggestions from the experts here, for a fast algorithm :)

Bob Gonder writes:
Quote
(I seem to remember someone mentioning that 32KB is the optimal
read size, rather than 1MB, but then you'd be limiting yourself to
128TB.)
I believe that for NTFS, 256KB is the optimal read size, with 64KB coming in
second.
Jon
 

Re: Need some suggestions from the experts here, for a fast algorithm :)

desp writes:
Quote
Hello, I was wondering if some of you could share your knowledge here.
I'm trying to write an algorithm to (very) quickly do this:

I have a file (around 10 GB in size) with random letters A->Z (like ASDFSDA).
If the input is "ABD", I find the first "A", then find the distance from A
to the first "B" (say 5 chars). I then need to check whether "D" is the same
distance from B as B is from A (i.e. at B + 5 chars).

What is the best way to go about this? The file is STATIC and constant,
and will not change.

Any ideas/theories about how to go about this would be greatly
appreciated!
Is it in any way related to DNA analysis? :-?
 

Re: Need some suggestions from the experts here, for a fast algorithm :)

----- Original Message -----
From: "John Herbster" <herb-sci1_AT_sbcglobal.net>
Newsgroups: borland.public.delphi.language.basm
Sent: Monday, April 30, 2007 12:10 AM
Subject: Re: Need some suggestions from the experts here, for a fast
algorithm :)
Quote

"desp" <XXXX@XXXXX.COM> wrote
>I'm trying to write an algorithm to (very) quickly
>do this: I have a file (around 10 GB in size) with
>letters A->Z randomly (like ASDFSDA). If the input is
>"ABD", I find the first "A", then find the distance
>between A and the first "B" (say 5 chars). I then need
>to check if "D" is the same distance from B as A to B
>(like B+5chars). ...

Is the result to be just true or false, or if not, then
what results are needed?
Is the file to be treated as circular?
--JohnH
The result should be true/false for each query.
No, the file will not be treated as circular.
"Is it in any way related to DNA analysis? :-?"
Nope.