
Re: WSAGetLastError return always 6


2007-02-27 09:59:01 AM
cppbuilder88
"Eduardo Jauch" < XXXX@XXXXX.COM >wrote in message
Quote
Because I know the pages I'm accessing, I don't need to look at
these parameters.
Oh yes, you most certainly do!!!! That is the ONLY way to use HTTP
properly. If you do not follow the rules, even the slightest changes
to the HTML could cause your code to fail very badly. You MUST
use the sizes and types that the server actually specifies. This is
EXTREMELY important if you ever want to use keep-alives properly as
well, as you don't want to leave unfinished data in the socket buffers
between requests.
Seriously, the more you push forward on this topic, the more I see how
wrongly you are implementing everything. You really should consider
using pre-made libraries for everything. If you want portability, you
should look at libcurl (curl.haxx.se) and other cross-platform
libraries.
Quote
The page that agendar_recv() accesses (look at my other post)
always returns a chunked page.
You are most certainly NOT handling HTTP chunked data at all !! You
are not even looking at the "Transfer-Encoding" header, which tells
you whether the data is chunked or not in the first place. If it is
NOT chunked, then you must use the "Content-Length" header to know how
many bytes to read.
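The rule above can be sketched as a small header check. This is a hedged illustration only (the function and enum names are mine, not from the post), and a production version should match header names case-insensitively and tolerate extra whitespace:

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// How must the response body be read?  Per RFC 2616, a chunked
// Transfer-Encoding takes precedence over Content-Length; with
// neither header, you read until the server closes the connection.
enum BodyMode { BODY_CHUNKED, BODY_CONTENT_LENGTH, BODY_UNTIL_CLOSE };

BodyMode DetectBodyMode(const char *headers, long *contentLength)
{
    // NOTE: strstr() is case-sensitive; real header matching
    // should be case-insensitive.  Kept simple for illustration.
    if (strstr(headers, "Transfer-Encoding: chunked"))
        return BODY_CHUNKED;

    const char *cl = strstr(headers, "Content-Length:");
    if (cl) {
        *contentLength = strtol(cl + 15, 0, 10); // skip "Content-Length:"
        return BODY_CONTENT_LENGTH;
    }
    return BODY_UNTIL_CLOSE; // read until the socket is closed
}
```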
Quote
So, I read the buffer until I get the final part, like this: "0\r\n\r\n",
which indicates the end of the response (according to RFC 2616)
You are not even close to implementing what the RFC says to do in
regards to handling chunked data.
Quote
And it works.
Only because you are completely ignoring everything.
Quote
If I try to read anything after this, recv stays waiting for data for
many seconds and finally returns 0.
As it should be, because there is no data sent after the final chunk
block. You need to stop reading after you receive that block. But
you are not stopping, because you are not handling chunked reading
correctly to begin with.
Quote
Then I understood the BCB 6 help correctly :)
No, you don't understand, not even close. The code you have shown so
far demonstrates that. You are FAR away from having a working HTTP
implementation.
Quote
At each 91 requests, the server closes the connection... :)
<snip>
Say that to the server. Looks like it doesn't know that ;)
More like you don't know how to use sockets correctly in the first
place. Your code is very wrong for what you are attempting to
do. You seriously need to clean it up A LOT just to get the basic TCP
functionality working correctly, let alone the HTTP functionality on
top of it.
Gambit
 
 

Re:Re: WSAGetLastError return always 6

"Eduardo Jauch" < XXXX@XXXXX.COM >wrote in message
Quote
I modified the code to show the LAST response before recv
returns zero. It is an OK response (HTTP 200), with a Connection: close
Then the server is intentionally closing the connection after sending
the reply. That has nothing to do with the number of requests you
send. Any reply can close the connection for any reason. That is why
keep-alives have to be negotiated in order to work.
Quote
About the cookies, I was wrong. The session cookie changes every
time I start a connection
I would expect that. Then you need to completely re-write your cookie
management code to actually handle cookies properly. That is a whole
other set of RFCs (2109, 2965).
Quote
Seems that the server was really programmed to close the connection
at a predefined number of requests on the same connection.
It has nothing to do with the number of requests.
Gambit
 

Re:Re: WSAGetLastError return always 6

Remy Lebeau (TeamB) wrote:
Quote
Oh yes, you most certainly do!!!! That is the ONLY way to use HTTP
properly. If you do not follow the rules, even the slightest changes
to the HTML could cause your code to fail very badly. You MUST
use the sizes and types that the server actually specifies. This is
EXTREMELY important if you ever want to use keep-alives properly as
well, as you don't want to leave unfinished data in the socket buffers
between requests.

Seriously, the more you push forward on this topic, the more I see how
wrongly you are implementing everything. You really should consider
using pre-made libraries for everything. If you want portability, you
should look at libcurl (curl.haxx.se) and other cross-platform
libraries.

You don't understand, Remy. But this is my fault.
This app that I'm making is for my personal use only.
I don't mind if a change on the page that I access makes me rewrite
the code. All I mind is putting in code to verify whether any changes that
affect the behaviour of the program have been made. So, when I start the
app, it will check whether the pages are OK. If not, I'll use some other app
(like the Ethereal you mentioned) to find the changes that have been made.
Of course, I'll use much of your knowledge to improve my program.
Quote
>The page that agendar_recv() accesses (look at my other post)
>always returns a chunked page.

You are most certainly NOT handling HTTP chunked data at all !! You
are not even looking at the "Transfer-Encoding" header, which tells
you whether the data is chunked or not in the first place. If it is
NOT chunked, then you must use the "Content-Length" header to know how
many bytes to read.

I looked at every header of the pages I access to see whether they were
chunked or not, so the code doesn't need to verify this every time I
access the page. It's bad programming, but it will work until they change
the page, and no problem with that :)
Quote
>So, I read the buffer until I get the final part, like this: "0\r\n\r\n",
>which indicates the end of the response (according to RFC 2616)

You are not even close to implementing what the RFC says to do in
regards to handling chunked data.
Of course not!
To do this, I would have to write practically a whole new "component" to
handle HTTP. That isn't my goal.
Quote

>And it works.

Only because you are completely ignoring everything.
Yes... :)
But it works :)
Quote

>If I try to read anything after this, recv stays waiting for data for
>many seconds and finally returns 0.

As it should be, because there is no data sent after the final chunk
block. You need to stop reading after you receive that block. But
you are not stopping, because you are not handling chunked reading
correctly to begin with.

Oh...
But I'm stopping :)
If you look at the code, you'll see this:

    recv_buffer[bytes] = '\0';
    ptr_recv = &recv_buffer[bytes - 5];
    strcat(html, recv_buffer);
    if(strcmp(ptr_recv, "0\r\n\r\n") == 0)
        break;
}

This is at the end of the recv loop.
The last line (the "if(strcmp") is exactly what verifies whether the chunked
block is the last one (because this page is chunked)
Quote
>Then I understood the BCB 6 help correctly :)

No, you don't understand, not even close. The code you have shown so
far demonstrates that. You are FAR away from having a working HTTP
implementation.

Hum... I really need to learn much more :)
But anyway, besides the fact that EXACTLY every 91 requests the server
closes the connection (a fact I will pursue, to learn whether it is an error
in my code or the behaviour of the server), my app is
working perfectly, returning the results I need :)
Quote
>At each 91 requests, the server closes the connection... :)
<snip>
>Say that to the server. Looks like it doesn't know that ;)

More like you don't know how to use sockets correctly in the first
place. Your code is very wrong for what you are attempting to
do. You seriously need to clean it up A LOT just to get the basic TCP
functionality working correctly, let alone the HTTP functionality on
top of it.

Please, don't say that :)
I'm not really trying to make a BIG app, much less an HTTP app. I only use
sockets to do a few tasks.
I have made some changes in the code, particularly with the cookies. My code
works very well :) But it was misplaced. Now it is out of the recv loop, so
when the recv is successful, I get the new cookies.
Of course I'll change the code a lot. I know I must learn much. :)
But you have helped me much, too.
For example, my code now takes on average 0.55 seconds to read, interpret,
find, and store the information I seek on the page :)
The last version (which uses WinHTTP) takes 0.65 seconds.
At the end of the day, my new code (which I'm certain has much to improve)
accesses the page thousands more times than its predecessor :)
Thanks :)
 


Re:Re: WSAGetLastError return always 6

Remy Lebeau (TeamB) wrote:
Quote
"Eduardo Jauch" < XXXX@XXXXX.COM >wrote in message
news: XXXX@XXXXX.COM ...

>I modified the code to show the LAST response before recv
>returns zero. It is an OK response (HTTP 200), with a Connection: close

Then the server is intentionally closing the connection after sending
the reply. That has nothing to do with the number of requests you
send. Any reply can close the connection for any reason. That is why
keep-alives have to be negotiated in order to work.

I'll continue reading the documents (RFCs) and the help to find out whether
I'm doing something wrong :) Especially the parts on keep-alives (to see
whether I'm forgetting something, some parameter, etc.)
Quote
>About the cookies, I was wrong. The session cookie changes every
>time I start a connection

I would expect that. Then you need to completely re-write your cookie
management code to actually handle cookies properly. That is a whole
other set of RFCs (2109, 2965).
One of them I have already read. :) Because the RFC for HTTP doesn't mention
cookies :) I'll look for the other one too :)
Quote

>Seems that the server was really programmed to close the connection
>at a predefined number of requests on the same connection.

It has nothing to do with the number of requests.


Maybe not, but I'm not certain of this. I'll keep tracking bugs in
my app to find out whether it is an error of mine :)
Thanks :)
 

Re:Re: WSAGetLastError return always 6

Remy Lebeau (TeamB) wrote:
Quote
"Eduardo Jauch" < XXXX@XXXXX.COM >wrote in message
news:45e37c5a$ XXXX@XXXXX.COM ...

>Maybe the cookies are set differently by the server. This is the only
>thing that could be wrong in my request.

That is entirely dependent on what the server is actually expecting you
to send.

Well, the cookies aren't the problem anyway :) But I'll try the app you
mentioned at the end to see if I'm correct.
Quote
>Sorry, but I don't understand... How is that done? What do you mean by
>"write a line, write a list of lines, read a line"?

Write functions that wrap the necessary functionality, and then call
your functions instead of accessing the sockets directly. This is
particularly important with line-based protocols such as HTTP, so you
want to make sure that your core socket-level operations are as solid
and safe as possible. You can then build your protocol-level logic on
top of them. For example:

bool WriteToSocket(SOCKET sock, void *ptr, int size)
{
    BYTE *pb = (BYTE*) ptr;
    int iNumSent;

    while( size > 0 )
    {
        iNumSent = send(sock, (char*) pb, size, 0);
        if( iNumSent > 0 )
        {
            size -= iNumSent;
            pb += iNumSent;
        }
        else
        {
            if( (iNumSent == 0) || (WSAGetLastError() != WSAEWOULDBLOCK) )
                return false;
        }
    }

    return true;
}

bool WriteStrToSocket(SOCKET sock, const char* str)
{
    return WriteToSocket(sock, (void*) str, strlen(str));
}

bool WriteLineToSocket(SOCKET sock, const char *str)
{
    if( WriteStrToSocket(sock, str) )
        return WriteToSocket(sock, (void*) "\r\n", 2);
    return false;
}

bool __fastcall TPesquisa::agendar_recv(void)
{
    WriteLineToSocket(sockfd,
        "GET /agendamento-web/agendamento.do?task=forwardCalendario"
        "&forward=confirmarInformacoesDeVisto"
        "&requerenteAgendado[0].informacoesChecadas=true"
        "&agendamentoVO.agenciadorGrupoVO.seqAgenciadorGrupo="
        "&locale=pt_BR&com.telecom.tipo_usuario=AGENCIA+DE+VIAGEM"
        " HTTP/1.1");
    WriteLineToSocket(sockfd, "Host: www.visto-eua.com.br");
    WriteLineToSocket(sockfd,
        "User-Agent: Mozilla/5.001 (windows; U; NT4.0; en-us) Gecko/25250101");
    WriteStrToSocket(sockfd, "Cookie: ");
    WriteLineToSocket(sockfd, cookies);
    WriteLineToSocket(sockfd, "");
    //...
}
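The write-side helpers above have a natural read-side counterpart for line-based protocols. Below is a hedged sketch of just the line-buffering logic, shown against an in-memory stream so it can be followed (and tested) without a live socket; in real code, Feed() would be called with whatever recv() returned. All names here are illustrative, not from the original post:

```cpp
#include <cassert>
#include <string>

// Accumulates received bytes and hands back complete CRLF-terminated
// lines one at a time, keeping any partial line for the next recv().
class LineReader
{
    std::string pending; // bytes received but not yet consumed
public:
    // In real code: append whatever recv() just returned.
    void Feed(const char *data, size_t len) { pending.append(data, len); }

    // Extract one line, without its CRLF.  Returns false if a full
    // line has not arrived yet (caller should recv() more and retry).
    bool ReadLine(std::string &line)
    {
        size_t pos = pending.find("\r\n");
        if (pos == std::string::npos)
            return false;
        line = pending.substr(0, pos);
        pending.erase(0, pos + 2); // keep the bytes after the CRLF
        return true;
    }
};
```

The point of the buffering is that TCP gives no line boundaries: one recv() may return half a header line, or several lines at once, so the reader must keep leftover bytes between calls.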


I understand now :)
This is really good advice :)
But I'll try to see whether this affects the performance...
If it does, I'll do it, but after making sure that the TCP layer is OK, I'll
put the code "inline". Fewer calls, more speed... I think :)
Quote
>I have an application to do what I'm trying to accomplish using
>sockets, which is working, using WinHTTP. I made the same application
>using TCppWebBrowser. :)
>
>With TCppWebBrowser I have a problem with memory, because IE doesn't
>free it.
>
>The other, using WinHTTP, works perfectly, yet from time to time it
>stops for 10-20 seconds, waiting for the server reply, and then works
>again...

WinHTTP and TCppWebBrowser are not the only HTTP implementations
around. There are plenty of other vendor implementations widely
available that you could try using.
For now, I'll try 3 others: Indy and two others I don't remember. But
I couldn't get them installed. All of them presented some type of problem
during installation...
Soon I'll try Indy again. It has a lot of functionality. But I'll continue
trying to use sockets directly anyway :)
Quote

>html[0] = '\0';
>bytes = send(sockfd, info_checked, strlen(info_checked), 0);
>acessos++;
>if(bytes == SOCKET_ERROR)
>{
>error_code = WSAGetLastError();
>closesocket(sockfd);
>return false;
>}

send() has the same rules as recv(), in that WSAGetLastError() could
return WSAEWOULDBLOCK which is not a failure. As shown further above,
writing should be looped just as reading is.

Hum... I forgot about this... :) I read about it on another forum, but at
that time I could barely get connected...
And this can cause great trouble... I'll change the code to take it into
account.
Quote
>strcpy(info_checked, first);
>strcat(info_checked, cookies); //<<== New cookies
>strcat(info_checked, "\r\n\r\n");

At no point in the code you showed are you parsing the server's reply
to grab the new cookies, if any. Besides that, you are not even close
to handling cookies properly anyway. You must parse the "name=value"
pairs and respect their scope. You can't just send back everything the
server gives you as-is. That is not how cookies work.
You're right, as always ;)
I will use Ethereal to see whether some of the cookies are not being sent
back. If so, I'll implement code to send back only what is needed :)
Quote

>for(ptr_recv = &recv_buffer[0]; *ptr_recv != '\0';)
>{
>if(*ptr_recv == 'S')
>{
>ptr_recv++;
>if(*ptr_recv == 'e')
>{
>ptr_recv++;
>if(*ptr_recv == 't')
>{
>ptr_recv++;
>if(*ptr_recv == '-')
>{
>ptr_recv++;
>if(*ptr_recv == 'C')
>{
>ptr_recv++;
>if(*ptr_recv == 'o')
>{
>ptr_recv++;
>if(*ptr_recv == 'o')
>{
>ptr_recv += 6;

That is very sloppy and inefficient. You could just use strstr()
instead, e.g.:

ptr_recv = strstr(recv_buffer, "Set-Coo");
if( ptr_recv )
//...
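Building on the strstr() suggestion above, here is a hedged sketch of collecting every Set-Cookie value from a response buffer. The function name is mine, not from the post; and note the simplifications: strstr() is case-sensitive while header names are not, and proper cookie handling would also parse each attribute's scope, as discussed above:

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Collect the "name=value" part of every Set-Cookie header found in
// 'response' (everything up to the first ';' or end of line), joined
// with "; " as they would appear in a Cookie request header.
std::string CollectCookies(const char *response)
{
    std::string cookies;
    const char *p = response;
    while ((p = strstr(p, "Set-Cookie:")) != 0) {
        p += 11;                        // skip "Set-Cookie:"
        while (*p == ' ') ++p;          // skip optional whitespace
        size_t n = strcspn(p, ";\r\n"); // value ends at ';' or CRLF
        if (!cookies.empty())
            cookies += "; ";
        cookies.append(p, n);
        p += n;
    }
    return cookies;
}
```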

I tried this. :) But believe it or not, my code works and is much faster
than using strstr. :)
It's not pretty or elegant, and I'm looking for a faster way to do this,
but for now it works :)
I use the same "schema" at many points in the code :)
It gave me some trouble at the beginning, but after some tests I got it to work :)
Quote
>The function agendar_rec is inside a looping.

But it is not reading/writing correctly to begin with. You have a lot
to learn about how to use sockets properly.

I'm sure of that :)
Quote
>I used "zero_on_send" and "zero_on_recv" together with "acessos", and
>found that at each 91 requests the connection is closed...

That is not a guarantee. You can't count on the number of 0 returns
you get. That is not how sockets work.

I'm not relying on it. I only did a test, to see whether my suspicion was
wrong. Really, the close always happens at the same interval...
It could be something wrong with the cookies, or with the buffer overflow
you mentioned. I'll find out whether it's my error or the server really
does it on purpose...
Quote
>Now I save the cookies every time recv is performed...

Not correctly, though.

Indeed, but it's working, for now at least :)
Quote
>The truth is that I find that the cookies are ALWAYS the same.

If I were you, I would use an external packet sniffer, such as
Ethereal (www.ethereal.com), to look at what real web browsers
are actually sending/receiving, and then you can mimic that in your
code as needed.

In the beginning I used the OnBeforeNavigate event of the TCppWebBrowser
to see that... :)
But I will try this app. Looks pretty good :)
Quote

Gambit
Thanks Gambit :)


 

Re:Re: WSAGetLastError return always 6

"Eduardo Jauch" < XXXX@XXXXX.COM >wrote in message
Quote
This app that I'm doing is for my personal use only. I don't mind
if a change on the page that I access makes me rewrite the
code.
You don't have to do that, though. If you implement it properly in
the first place, then the code will automatically adapt to whatever
content it receives, so you won't have to re-write the code each time
a change occurs.
Quote
>You are most certainly NOT handling HTTP chunked data at all !! You
>are not even looking at the "Transfer-Encoding" header, which tells
>you whether the data is chunked or not in the first place. If it is
>NOT chunked, then you must use the "Content-Length" header to know how
>many bytes to read.
>

I looked at every header of the pages I access to see whether they were
chunked or not, so the code doesn't need to verify this every time I
access the page. It's bad programming, but it will work until they change
the page, and no problem with that :)
Chunking is a server-side configuration. Many systems don't use
chunked transfers. It actually produces more overhead than sending
the raw data as-is instead. Chunking tends to be used only on systems
that support partial downloads, such as for resuming broken transfers.
Not all systems support that, though.
Quote
Oh...
But I'm stopping :)
But not as quickly as you could be. You seem to be very fond of
writing fast-running code, but you are not handling the HTTP responses
in a way that allows you to exit as soon as the response is finished.
You are sitting there waiting for a timeout that you do not need to
wait for if you take the true data length into account properly like
the RFCs say to do.
Quote
If you look at the code, you'll see this:
Which is still the completely wrong thing to do.
Quote
This is at the end of the recv loop.
The last line (the "if(strcmp") is exactly what verifies whether the chunked
block is the last one (because this page is chunked)
But you are not reading the individual chunks properly at all. Which
is why you are getting into a scenario that relies on a waiting loop
to time out.
Gambit
 

Re:Re: WSAGetLastError return always 6

Remy Lebeau (TeamB) wrote:
Quote
You don't have to do that, though. If you implement it properly in
the first place, then the code will automatically adapt to whatever
content it receives, so you won't have to re-write the code each time
a change occurs.

I know this. I know that this is a very different approach that would
make my life easier. But for now, I can't afford the time difference I'd
get if I verified every token in the header...
If I find a way to make the code even faster, I'll do it, even if this
makes the code more problematic. In this one particular app, I need
that.
Quote
Chunking is a server-side configuration. Many systems don't use
chunked transfers. It actually produces more overhead than sending
the raw data as-is instead. Chunking tends to be used only on systems
that support partial downloads, such as for resuming broken transfers.
Not all systems support that, though.
Yes... I see. But the page is chunked. I verified the header. And I
executed the code. The page never comes whole. There comes a first part,
which contains the header and a hex number that says the number of bytes
to read. Then on the next loop, I receive a new chunk with a new hex
number that gives the length of the new chunk; a third time this occurs,
and on the fourth, the signal that the sending of chunked data has ended
is present: "0\r\n\r\n". So I quit the loop because there is no more data
to read... Is this wrong? Are the last bytes of the last chunk not "0\r\n\r\n"?
Quote

>Oh...
>But I'm stopping :)

But not as quickly as you could be. You seem to be very fond of
writing fast-running code, but you are not handling the HTTP responses
in a way that allows you to exit as soon as the response is finished.
You are sitting there waiting for a timeout that you do not need to
wait for if you take the true data length into account properly like
the RFCs say to do.

And how do I do that?
Quote
>If you look at the code, you'll see this:

Which is still the completely wrong thing to do.

And what must I do?
Quote
But you are not reading the individual chunks properly at all. Which
is why you are getting into a scenerio that relies on a waiting loop
to time out.

I'm not getting into a waiting loop... I don't understand... What do you mean?
Eduardo.
 

Re:Re: WSAGetLastError return always 6

"Eduardo Jauch" < XXXX@XXXXX.COM >wrote in message
Quote
It comes in a first part, which contains the header and a hex number
that says the number of bytes to read. Then on the next loop, I receive
a new chunk with a new hex number that gives the length of the new
chunk; a third time this occurs, and on the fourth, the signal that the
sending of chunked data has ended is present: "0\r\n\r\n". So I quit the
loop because there is no more data to read...
That is how chunked transfers work, but that is not how your earlier
code was actually reading chunked data. You were not reading the
chunk sizes at all. You were reading the incoming data in arbitrary
byte sizes into a buffer without any care whatsoever to the actual
content of the transfer, and then trying to verify the last chunk
only. A proper transfer should be performing the following sequence
instead:
send the request
read a single line
verify the response code in the line
loop
    read a line
    if line is empty then break
    store the line into a header list
end
if list contains 'Transfer-Encoding: chunked' header then
begin
    loop
        read a line
        extract chunk size from the line
        if chunk size is 0 then break
        read number of bytes specified
        store bytes into buffer
        read an empty line
    end
    read an empty line
end
else if list contains 'Content-Length' header then
begin
    read number of bytes specified
    store bytes into buffer
end
else
begin
    loop
        read as many bytes as are currently available
        if disconnected then break
        store bytes into buffer
    end
end
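The chunk-handling branch of that sequence can be sketched as a small decoder. This is a hedged illustration only, not the poster's code: it runs against a chunked body already held in memory so the parsing is easy to follow and test, whereas a real client would read each size line and each chunk from the socket; the function name is mine:

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <string>

// Decode an HTTP/1.1 chunked body held in 'body' into 'out'.
// Each chunk is "<hex size>\r\n<data>\r\n"; a size of 0 marks the end.
// For simplicity this assumes the full body is present and ignores
// chunk extensions and trailer headers.  Returns false on malformed
// input (missing CRLF after a size line).
bool DecodeChunked(const char *body, std::string &out)
{
    const char *p = body;
    for (;;) {
        char *end;
        long size = strtol(p, &end, 16); // chunk size is hexadecimal
        const char *crlf = strstr(end, "\r\n");
        if (!crlf)
            return false;
        p = crlf + 2;            // start of the chunk data
        if (size == 0)
            return true;         // "0\r\n\r\n": last chunk reached
        out.append(p, size);     // copy exactly 'size' bytes
        p += size + 2;           // skip the data and its trailing CRLF
    }
}
```

Note that the decoder finds the terminating "0" by actually parsing the sizes, so the chunk framing never ends up mixed into the stored data.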
Quote
And what must I do?
See above.
Gambit
 

Re:Re: WSAGetLastError return always 6

Remy Lebeau (TeamB) wrote:
Quote
"Eduardo Jauch" < XXXX@XXXXX.COM >wrote in message
news:45e4284e$ XXXX@XXXXX.COM ...

>It comes in a first part, which contains the header and a hex number
>that says the number of bytes to read. Then on the next loop, I receive
>a new chunk with a new hex number that gives the length of the new
>chunk; a third time this occurs, and on the fourth, the signal that the
>sending of chunked data has ended is present: "0\r\n\r\n". So I quit the
>loop because there is no more data to read...

That is how chunked transfers work, but that is not how your earlier
code was actually reading chunked data. You were not reading the
chunk sizes at all. You were reading the incoming data in arbitrary
byte sizes into a buffer without any care whatsoever to the actual
content of the transfer, and then trying to verify the last chunk
only. A proper transfer should be performing the following sequence
instead:

send the request
read a single line
verify the response code in the line
loop
    read a line
    if line is empty then break
    store the line into a header list
end
if list contains 'Transfer-Encoding: chunked' header then
begin
    loop
        read a line
        extract chunk size from the line
        if chunk size is 0 then break
        read number of bytes specified
        store bytes into buffer
        read an empty line
    end
    read an empty line
end
else if list contains 'Content-Length' header then
begin
    read number of bytes specified
    store bytes into buffer
end
else
begin
    loop
        read as many bytes as are currently available
        if disconnected then break
        store bytes into buffer
    end
end

>And what must I do?

See above.


Gambit


I understand.
I'm not reading the chunk length. But this is because the char[] where I
store the page is far bigger than the page I receive...
I know... bad practice.
But to do this is time-consuming. I first wrote code that did something
similar... But the "speed" was slower than if I "force" it the way I'm
doing now :)
But I'll implement the code with the knowledge and tips that you showed
me, at least to verify whether anything changed...
For now, at least.
But all you said was good advice, of course, and I thank you :)
I'll run tests with the code for a long time anyway, and probably will
implement most of what was discussed here :)
Thanks, Gambit.
 

Re:Re: WSAGetLastError return always 6

"Eduardo Jauch" < XXXX@XXXXX.COM >wrote in message
Quote
I'm not reading the chunk length. But this is because the char[]
where I store the page is far bigger than the page I receive...
Doesn't matter. You still have to process the chunks properly. The
way you are reading the data, you are leaving all of that chunk
information inside your buffer. You will have to strip off all of
that info anyway when you go to process the contents of the downloaded
data later on. If you process the chunks while they are downloading,
as described earlier, then they won't be in your buffer at all when it
comes time to process it.
Quote
But to do this is time-consuming.
Not as much as you think. Besides, you are going to have to take the
time to process the chunk info anyway, so you may as well do it as
soon as they arrive.
Quote
I first wrote code that did something similar... But the "speed" was
slower than if I "force" it the way I'm doing now :)
You probably did not optimize the code once it was running.
Gambit
 

Re:Re: WSAGetLastError return always 6

Remy Lebeau (TeamB) wrote:
Quote
Doesn't matter. You still have to process the chunks properly. The
way you are reading the data, you are leaving all of that chunk
information inside your buffer. You will have to strip off all of
that info anyway when you go to process the contents of the downloaded
data later on. If you process the chunks while they are downloading,
as described earlier, then they won't be in your buffer at all when it
comes time to process it.

What I do is:
Receive all the data first. All of it is loaded into "html", which is
a char[].
What I really do is search for the first "<select " on the page. When I
find it, I look for the second "<option " (if it exists) and so on until
I find the first "</select>".
The only thing I "use" is the "value=xxxxx" in the options...
If I find an interesting value, I execute another GET (not shown in my
previous posts). Then I do the same, but with the second "<select " on
the page.
These selects sit at the end of the HTML.
When I first enter the page (via GET), the server loads all the
information for the 3 selects, but shows only the information for the
first. When I execute the second GET, what I'm doing is asking the server
to fill the second select based on my selection (in the first).
All the information doesn't change, even if one of the options is
discarded or used by another user... So, I can iterate over the values
until I find something useful and send a GET telling the server that I
want that information. If no one asks for it before me, I get it; if not,
the server says that the information does not exist anymore.
The "speed" at which a desired piece of info "vanishes" is sometimes less
than 1 second... Because all the data is loaded by the server for me on
the first GET, only this is time-consuming; the other three GETs to ask
for the desired information together take less time than the first.
So, if I can write code that does this in less than 1 sec, I have more
chances of getting the info.
This is why I write such horrible code... And why I don't mind not
reading the chunked data etc...
And because the information that I want always comes only in the last
chunk, I don't bother with the rest. In fact, I only keep the rest for
safety reasons, because sometimes the data that comes is bigger than
normal, and some of it comes in the chunk right before the last.
It's a page generated automatically. Always the same, except for the
options in the three selects.
Quote
>but to do this is time consuming.

Not as much as you think. Besides, you are going to have to take the
time to process the chunk info anyway, so you may as well do it as
soon as they arrive.

Because I look for specific data that comes only 7000 bytes past the
beginning of the body, and because I don't mind if the page changes and
my app stops working for a few days until I rewrite the code, this is
not so important right now.
Quote
>I first wrote code that did something similar... But the "speed" was
>slower than if I "force" it the way I'm doing now :)

You probably did not optimize the code once it was running.

Probably :)
 

Re:Re: WSAGetLastError return always 6

"Eduardo Jauch" < XXXX@XXXXX.COM >wrote in message
Quote
Receive all the data first. All of it is loaded into "html",
which is a char[].
Which is STILL the wrong thing to do!!! You are not even ensuring
that your char[] really is large enough to hold everything the server
may send to you, and at worst you are overallocating more memory than
you really need right now.
Quote
What I really do is search for the first "<select " on the page. When I
find it, I look for the second "<option " (if it exists) and so on until
I find the first "</select>"
That is very error prone. Because of the nature of chunked transfers,
you are not even guaranteed to have such complete strings in your
buffer because you are not stripping out the chunked information
first. You could have "<sel"+chunk+"ect" and the like, which would
cause your searching algorithm to fail. You MUST handle chunked
transfers correctly in order to ensure the integrity of the data you
want to parse. I don't know how to make this any clearer to you.
Quote
This is why I write such horrible code...
Your code is horrible not because of HOW UGLY it is to look at, but
because how INCOMPLETE it is to begin with. There are so many things
you are doing wrong with it that you are not guaranteed to end up with
the data that you are expecting.
Quote
And why I don't mind not reading the chunked data etc...
The presence of chunking is not an issue. Your handling of chunking,
on the other hand, is very much an issue.
Quote
And because the information that I want always comes only
with the last chunk, I don't bother with the rest.
That is also not a guarantee, either. You really don't seem to be
grasping what HTTP is actually sending you. So you are going to keep
handling it all wrong.
Quote
sometimes the data that comes is bigger than normal, and some of it
comes in the chunk right before the last.
All the more reason why you NEED to handle chunking, and in fact any
HTTP transfer in general, properly from the very beginning. But you
are refusing to do that much. Obviously, you want a broken,
error-prone, dangerous implementation, so that is your choice. I'm
done debating this discussion any further.
Gambit
 

Re:Re: WSAGetLastError return always 6

Remy Lebeau (TeamB) wrote:
Quote
Which is STILL the wrong thing to do!!! You are not even ensuring
that your char[] really is large enough to hold everything the server
may send to you, and at worst you are overallocating more memory than
you really need right now.
I know that.
Quote

>What I really do is search for the first "<select " on the page. When I
>find it, I look for the second "<option " (if it exists) and so on
>until I find the first "</select>"

That is very error prone. Because of the nature of chunked transfers,
you are not even guaranteed to have such complete strings in your
buffer because you are not stripping out the chunked information
first. You could have "<sel"+chunk+"ect" and the like, which would
cause your searching algorithm to fail. You MUST handle chunked
transfers correctly in order to ensure the integrity of the data you
want to parse. I don't know how to make this any clearer to you.
You're right. I forgot about the problem of the hex code of the chunk
being inside the data.
Quote

>This is why I write such horrible code...

Your code is horrible not because of HOW UGLY it is to look at, but
because how INCOMPLETE it is to begin with. There are so many things
you are doing wrong with it that you are not guaranteed to end up with
the data that you are expecting.

I know that.
Quote
>And why I don't mind not reading the chunked data etc...

The presence of chunking is not an issue. Your handling of chunking,
on the other hand, is very much an issue.

I have been rewriting the code since yesterday to take care of many of
the issues that you identified, including stripping the chunk
information from the data, which you pointed out just now. The code I
showed you at the beginning of the discussion has been rewritten many
times since then.
Quote
>And because the information that I want always comes only
>with the last chunk, I don't bother with the rest.

That is also not a guarantee, either. You really don't seem to be
grasping what HTTP is actually sending you. So you are going to keep
handling it all wrong.

>sometimes the data that comes is bigger than normal, and some of it
>comes in the chunk right before the last.

All the more reason why you NEED to handle chunking, and in fact any
HTTP transfer in general, properly from the very beginning. But you
are refusing to do that much. Obviously, you want a broken,
error-prone, dangerous implementation, so that is your choice.
Sometimes, the broken, error-prone, dangerous implementation, even if it
really fails from time to time, is a better choice than the opposite...
Quote
I'm done debating this discussion any further.

OK. It was a very instructive (though you may not believe it) discussion for me. :)
Thanks. :)
 

Re:Re: WSAGetLastError return always 6

"Eduardo Jauch" < XXXX@XXXXX.COM >wrote in message
Quote
I have been rewriting the code since yesterday to take care of many of
the issues that you identified, including stripping the chunk
information from the data, which you pointed out just now. The code I
showed you at the beginning of the discussion has been rewritten many
times since then.
You should have said so earlier and saved us a lot of aggravation.
Quote
Sometimes, the broken, error-prone, dangerous implementation, even
if it really fails from time to time, is a better choice than the
opposite...
Not when the opposite is total failure and crashing.
Quote
OK. It was a very instructive (though you may not believe it) discussion
for me. :)
Glad you got something out of it.
Gambit