
ProcessMessages() in a loop


2005-03-04 01:37:06 AM
cppbuilder19
Hi,
I have a database (with data, of course) whose structure
was not adequate for me. So I decided to transform its
data into the form I want. The database itself is pretty big
(for MS Access ;) - a couple of tables, each with about
400,000 records.
I haven't used the standard DB mechanisms for searching
for matches (Lookup, Locate), because I found working with
TStringList-s significantly faster.
So I made a utility which:
1) transfers the data to TStringList-s
2) processes those TStringList-s
3) transfers the data from the TStringList-s back to the DB.
The problem is that the utility worked only 98% correctly, which
is ridiculous. About 2% of the data was processed wrongly.
I'm wondering whether a code skeleton like this can go wrong
in some circumstances, because of ProcessMessages()
or something else:
for ( int i=0; i<400000; i++ )
{
    Application->ProcessMessages();
    // Here, I had few TStringList objects' manipulations
    // on very big (400,000 items) objects of TStringList
    // like SomeIndex = SL1->IndexOf( ... ),
    // SL2->Strings[ SomeIndex ] = ...,
    // etc, ...
}
At the beginning, the loop processed about 2000 records / second,
but as the destination TStringList became bigger, it processed
only a few records / second (because IndexOf() now had more
work to do).
Once, the destination TStringList was full of duplicated entries,
like 'sometextsometext' instead of 'sometext'. Sometimes, the
destination TStringList was short by maybe 1000 consecutive
records. There were also some other mysterious results.
Can ProcessMessages() be dangerous in this case? Should I use
critical sections in these transformations, like:
for ( int i=0; i<400000; i++ )
{
    Application->ProcessMessages();
    EnterCS
    Processing ...
    LeaveCS
}
--
Best regards,
Vladimir Stefanovic
 
 

Re:ProcessMessages() in a loop

A few questions:
1. Is your loop being performed in a message handler on the main
thread, or on another thread?
2. Using TStringList for this is a bad idea; its search time will
gradually increase as the list grows, because IndexOf() on an unsorted
list does a sequential search through the data. I would consider using
the STL map class (std::map), with the first template parameter being
an AnsiString and the second being a pointer to the record.
3. I must be missing something: what is TStringList-s as opposed to
TStringList?
4. What else is happening in that loop?
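To illustrate point 2, here is a minimal sketch, not the actual utility. It assumes std::string where a VCL project would use AnsiString, and the Record struct, BuildIndex and FindRecord are made-up names for the example:

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical record type; the real table layout is not shown in this thread.
struct Record
{
    std::string key;
    std::string value;
};

// Key -> pointer to the record that owns that key.
typedef std::map<std::string, Record*> RecordIndex;

// Build the index once. The pointers stay valid only as long as
// 'records' is not resized afterwards.
RecordIndex BuildIndex(std::vector<Record>& records)
{
    RecordIndex index;
    for (std::size_t i = 0; i < records.size(); ++i)
        index[records[i].key] = &records[i];
    return index;
}

// Logarithmic lookup instead of a sequential scan; returns 0 if absent.
Record* FindRecord(const RecordIndex& index, const std::string& key)
{
    RecordIndex::const_iterator it = index.find(key);
    if (it == index.end())
        return 0;
    return it->second;
}
```

With 400,000 keys, each lookup in the map needs on the order of 20 comparisons, where a sequential scan may need up to 400,000.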

Re:ProcessMessages() in a loop

Quote
1. Is your loop below being performed in a message
handler on the main thread, or on another thread?
No event, no thread - just Button1Click()
Quote
2. Using TStringList for this is a bad idea; its search time will
gradually increase as the size of the list grows because it's
doing a sequential search through the data.
You are partially right. The destination TStringList object
can be changed in two ways:
1) dest_string_list->Add( Something );
2) dest_string_list->Strings[ SomePosition ] =
dest_string_list->Strings[ SomePosition ] + ";" + Addition;
... so it has to search the whole string list sequentially.
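For illustration, the two updates above could be driven by a key index so that no sequential search is needed. This is only a sketch under assumptions: it supposes each entry is identified by a key, uses std::string in place of AnsiString, and the Destination struct and AddOrAppend function are hypothetical names, not part of the utility:

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical destination container: values plus a key -> position index.
struct Destination
{
    std::vector<std::string> values;
    std::map<std::string, std::size_t> position;
};

// Case 1: the key is new, so append a new entry.
// Case 2: the key exists, so extend that entry with ";" + addition.
// Returns the position that was written.
std::size_t AddOrAppend(Destination& dest,
                        const std::string& key,
                        const std::string& addition)
{
    std::map<std::string, std::size_t>::iterator it = dest.position.find(key);
    if (it == dest.position.end())
    {
        dest.values.push_back(addition);
        dest.position[key] = dest.values.size() - 1;
        return dest.values.size() - 1;
    }
    dest.values[it->second] += ";" + addition;
    return it->second;
}
```

The index has to be updated in the same place as the list, otherwise the stored positions go stale.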
Quote
I would consider using the STL map class, with the first part
of the template being an AnsiString and the second part being a pointer to
the record.
I will look into it more deeply. The utility I made is a one-off app,
of no future importance, so I wanted something simple.
Quote
3. I must be missing something, what is TStringList-s as
opposed to TStringList?
No, it's the same. I just wanted to say that more than one
TStringList is involved in the processing.
Quote
4. What else is happening in that loop?
Sometimes, when the monitor or one of the hard disks went into
standby and I reactivated it, the loop stopped working.
--
Best regards,
Vladimir Stefanovic
 


Re:ProcessMessages() in a loop

Vladimir Stefanovic wrote:
Quote
for ( int i=0; i<400000; i++ )
{
Application->ProcessMessages();
I would not call that on every iteration. Why not every 100th?
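A rough sketch of that idea, written as plain C++ so it compiles without the VCL. RunThrottled and its parameters are invented names; 'pump' stands in for Application->ProcessMessages() and 'work' for the per-record processing:

```cpp
#include <cassert>

// Runs 'count' iterations and calls 'pump' once every 'interval' iterations.
// Returns how many times 'pump' was called.
template <typename Work, typename Pump>
int RunThrottled(int count, int interval, Work work, Pump pump)
{
    int pumps = 0;
    for (int i = 0; i < count; ++i)
    {
        if (i % interval == 0)
        {
            pump();      // keep the UI responsive, but only now and then
            ++pumps;
        }
        work(i);         // process record i
    }
    return pumps;
}
```

With 400,000 records and an interval of 100, the message pump runs 4,000 times instead of 400,000.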
Quote
// like SomeIndex = SL1->IndexOf( ... ),
// SL2->Strings[ SomeIndex ] = ...,
Is SL2 the destination? And SL1 the source?
Quote
At the beginning, the loop processed about 2000 records / second,
but as the destination TStringList became bigger, it processed
only a few records / second (because IndexOf() now had more
work to do).
If you want us to understand what happens, you have to tell us more
about your algorithm.
Quote
Once, the destination TStringList was full of duplicated entries,
like 'sometextsometext' instead of 'sometext'. Sometimes, the
destination TStringList was short by maybe 1000 consecutive
records.
We cannot comment on that as we do not know what should happen.
Quote
Can ProcessMessages() be dangerous in this case? Should I use
critical sections in these transformations,
No, you do not need critical sections. As for whether your
question is relevant: remove the ProcessMessages() call and see if
the result is 100% correct.
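One reason critical sections would not help: everything here runs on the main thread, and a critical section does not block the thread that already owns it. What ProcessMessages() can do is dispatch a second Button1Click while the first one is still inside the loop. The sketch below is a toy simulation in plain C++, not VCL code; ToyApp, OnClick and RunWithSecondClick are invented names. Whether this is what actually happened in the utility is not established in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <vector>

// Toy stand-in for a single-threaded message queue (not the VCL).
struct ToyApp
{
    typedef void (*Handler)(ToyApp&);
    std::deque<Handler> pending;    // queued "clicks"
    std::vector<std::string> dest;  // destination list
    bool busy;                      // set while the handler is running
    bool useGuard;                  // whether nested clicks are ignored

    ToyApp() : busy(false), useGuard(false) {}

    // Like ProcessMessages(): dispatch everything queued, on this thread.
    void ProcessMessages()
    {
        while (!pending.empty())
        {
            Handler h = pending.front();
            pending.pop_front();
            h(*this);
        }
    }
};

// Stand-in for Button1Click: appends three records, pumping in between.
void OnClick(ToyApp& app)
{
    if (app.useGuard && app.busy)
        return;                     // already running: ignore the nested click
    app.busy = true;
    for (int i = 0; i < 3; ++i)
    {
        app.ProcessMessages();      // a queued click re-enters OnClick here
        app.dest.push_back("record");
    }
    app.busy = false;
}

// Simulates one click plus a second click arriving while the loop runs.
std::size_t RunWithSecondClick(bool useGuard)
{
    ToyApp app;
    app.useGuard = useGuard;
    app.pending.push_back(&OnClick);  // the user clicks again mid-run
    OnClick(app);
    return app.dest.size();
}
```

Without the guard the nested call does the whole job a second time in the middle of the first run; with the guard the second click is ignored. In a real form, disabling the button for the duration of the loop has a similar effect.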
Working with a TStringList holding a large amount of data is quite
possible. But yes: avoid IndexOf(), as it needs more and more time
the further towards the end of the list the string is.
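One way to avoid the sequential scan is to keep the keys sorted and use a binary search. TStringList offers this through its Sorted property and Find() method; the sketch below shows the same idea in plain C++ so it can be compiled on its own. FindSorted is a made-up name, and std::string stands in for AnsiString:

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Returns the position of 'key' in a sorted vector, or -1 if it is absent.
// Binary search: about log2(n) comparisons instead of up to n.
int FindSorted(const std::vector<std::string>& sortedKeys,
               const std::string& key)
{
    std::vector<std::string>::const_iterator it =
        std::lower_bound(sortedKeys.begin(), sortedKeys.end(), key);
    if (it == sortedKeys.end() || *it != key)
        return -1;
    return static_cast<int>(it - sortedKeys.begin());
}
```

The catch is that the list must really be sorted: assigning to an arbitrary position can break the order, so this suits lists that are filled once and then only searched.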
By the way, is this the algorithm we discussed lately for your
dictionary?
Hans.
 

Re:ProcessMessages() in a loop

Hi Hans!
Quote
>for ( int i=0; i<400000; i++ )
>{
>Application->ProcessMessages();

I would not call that on every iteration. Why not every 100th?
Yes, I tried even that, but *I think* that the problems were
still there.
Quote
>// like SomeIndex = SL1->IndexOf( ... ),
>// SL2->Strings[ SomeIndex ] = ...,

Is SL2 the destination? And SL1 the source?
There are more than just these two string lists. I just showed
the techniques (IndexOf, Strings[]) I used.
Quote
If you want us to understand what happens, you have to tell us more
about your algorithm.
Yes, I will show you the complete code tomorrow. It's not here
right now. The utility is running at my workplace right now, and
I'll check tomorrow morning whether anything weird has happened.
Quote
>Once, the destination TStringList was full of duplicated entries,
>like 'sometextsometext' instead of 'sometext'. Sometimes, the
>destination TStringList was short by maybe 1000 consecutive
>records.
We cannot comment on that as we do not know what should happen.
I agree.
Quote
>Can ProcessMessages() be dangerous in this case? Should I use
>critical sections in these transformations,

No, you do not need critical sections. As for whether your
question is relevant: remove the ProcessMessages() call and see if
the result is 100% correct.
That's something I definitely should try.
Quote
Working with a TStringList holding a large amount of data is quite
possible. But yes: avoid IndexOf(), as it needs more and more time
the further towards the end of the list the string is.

By the way, is this the algorithm we discussed lately for your
dictionary?
Yes, that's the case.
BTW, the procedure I mentioned then had no string lists
involved, and even then there was something weird:
the complete word list beginning with the letter 'W' was somehow
missing?! And I had *NO CODE* which checked anything
like that. The algorithm was the same for all words.
--
Best regards,
Vladimir Stefanovic
 

Re:ProcessMessages() in a loop

Vladimir,
I would look for a pointer problem. Can you explain how the transfer is
going wrong? Can you enable CodeGuard? Please show the code.
John