Unusual HTTP Response in Basic C++ Socket Programming

I've got a basic HTTP client set up in C++, which works ok so far. It's for a school assignment, so there's lots more to do, but I'm having a problem.
I use the recv() function in a while loop, to repeatedly add pieces of the response to my response buffer, and then output that buffer each time. The problem is, at the end of each piece of the response, the HTTP Request is getting tacked on as well.
For example, the response will be a chunk of the page's source code, followed by "GET / HTTP/1.1...", followed by the next chunk, and then the "GET..." again, and so on.
Here's my relevant code:
// Prepare request
char request[] = "HEAD /index.html HTTP/1.1\r\nHOST: www.google.com\r\nCONNECTION: close\r\n\r\n";
// Send request
len = send(sockfd, request, sizeof(request), 0);
// Write/output response
while (recv(sockfd, buf, sizeof(buf), 0) != 0)
{
// Read & output response
printf("%s", buf);
}

The buffer isn't null-terminated, which printf's %s requires of a C-style string. When you see the "extra GET", you are seeing memory that you shouldn't be, because printf kept reading past the received data and never found a '\0' character.
A quick fix is to force the buffer to be terminated:
int n = 1;
while (n > 0) {
// leave one byte free for the terminator
n = recv(sockfd, buf, sizeof(buf) - 1, 0);
if (n > 0) {
// null terminate the buffer so that we can print it
buf[n] = '\0';
// output response
printf("%s", buf);
}
}

I suspect it's because your buf is allocated in memory just below your request. When you call printf on the buffer, printf will print as much as it can before finding a NUL character (which marks the end of the string). If there isn't one, it'll go right on through into request. And generally, there won't be one, because recv is for receiving binary data and doesn't know that you want to treat its output as a string.
One quick fix would be to limit the receive operation to sizeof(buf)-1, and to explicitly add the NUL terminator yourself, using the size of the returned data:
while ((nr = recv(sockfd, buf, sizeof(buf) - 1, 0)) > 0)
{
buf[nr] = 0;
...
}
Of course, for this to be (marginally) safe you need to be sure that you'll always receive printable data.

recv does not add a \0 string terminator to the buffer received; it just works in raw binary. So your printf is running off the end of your buf buffer (and apparently ending up looking at your request buffer).
Either add a nul-terminator to the end of buf, or print the buffer one character at a time using putchar() (both of these approaches will make it necessary to store the value returned by recv()).

The recv call will not null-terminate buf; instead, it will just provide you with the raw data received from the wire. You need to save the return value of recv, and then add a null-terminating byte yourself into buf before printing it. Consequently, you can only ask for sizeof(buf)-1 bytes.
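An alternative to null-terminating the buffer is to write out exactly the number of bytes recv returned, which also copes with responses that contain zero bytes. What follows is only a minimal sketch, assuming POSIX sockets (on Winsock, recv returns int and takes a SOCKET); print_response is a hypothetical helper name, not something from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <sys/socket.h>
#include <sys/types.h>

// Hypothetical helper: copies everything the peer sends to `out` until the
// peer closes the connection. Returns the total byte count, or -1 on error.
long print_response(int sockfd, std::FILE* out) {
    assert(out != nullptr);
    char buf[4096];
    long total = 0;
    for (;;) {
        ssize_t n = recv(sockfd, buf, sizeof(buf), 0);
        if (n < 0) return -1;  // recv failed; errno holds the reason
        if (n == 0) break;     // orderly shutdown by the peer
        // Write exactly the n bytes received; no terminator is needed.
        std::size_t count = static_cast<std::size_t>(n);
        if (std::fwrite(buf, 1, count, out) != count) return -1;
        total += n;
    }
    return total;
}
```

Calling print_response(sockfd, stdout) in place of the while loop would print the response without ever reading past the received data.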

Related

Sending Large Base64 String over TCP Socket

I am trying to send a Base64-encoded image from a TCP client written in Go to a TCP server written in C++.
Here is the code snippet for C++ Receiver
std::string recieve(int bufferSize=1024,const char *eom_flag = "<EOF>"){
char buffer[bufferSize];
std::string output;
int iResult;
char *eom;
do{
iResult = recv(client, buffer, sizeof(buffer), 0);
//If End OF MESSAGE flag is found.
eom = strstr(buffer,eom_flag);
//If socket is waiting , do dot append the json, keep on waiting.
if(iResult == 0){
continue;
}
output+=buffer;
//Erase null character, if exist.
output.erase(std::find(output.begin(), output.end(), '\0'), output.end());
//is socket connection is broken or end of message is reached.
}while(iResult > -1 and eom == NULL);
//Trim <EOF>
std::size_t eom_pos = output.rfind(eom_flag);
return output.substr(0,eom_pos);}
Idea is to receive the message until End of Message is found, thereafter continue to listen for another message on the same TCP connection.
Golang TCP client code snippet.
//Making connection
connection, _ := net.Dial("tcp", "localhost"+":"+PortNumber)
if _, err := fmt.Fprintf(connection, B64img+"<EOF>"); err != nil {
log.Println(err)
panic(err)
}
Tried approaches:
Increasing the buffer size in the C++ receiver.
Removing the null character from the end of the string in the C++ receiver.
Observations:
Length of string sent by the client is fixed, while the length of the string after receive function is larger and
random. Example: Go client string length is 25243. For the same string, length after receive when i
run send and receive in the loop is 25243, 26743, 53092, 41389, 42849.
On Saving the received string in a file, I see <0x7f> <0x02> character in the string.
I am using winsock2.h for c++ socket.
You are treating the received data as a C string - a sequence of bytes ending with a 0 byte - which is not correct.
recv receives some bytes and puts them in buffer. Let's say it received 200 bytes.
You then do strstr(buffer,eom_flag);. strstr doesn't know that 200 bytes were received. strstr starts from the beginning of the buffer, and keeps looking until it finds either "<EOF>", or a 0 byte. There is a chance that it might find an "<EOF>" in the other 824 bytes of the buffer, even though you didn't receive one.
Then you do output += buffer;. This also treats the buffer as if it ends with a 0 byte. This will look through the whole buffer (not just the first 200 bytes) to find a 0 byte. It will then add everything up to that point into output. Again, it might find a 0 byte in the last 824 bytes of the buffer, and add too much data. It might not find a 0 byte in the buffer at all, and then it will keep on adding extra data from other variables that are stored next to buffer in memory. Or it might find a 0 byte in the first 200 bytes, and stop there (but only if you sent a 0 byte).
What you should do is pay attention to the number of bytes received (which is iResult) and add that many bytes to the output. You could use:
output.insert(output.end(), buffer, buffer+iResult);
Also (as Phillipe Thomassigny has pointed out in a comment), the "<EOF>" might not be received all at once. You might receive, for example, "<EO" and "F>" separately. You should check whether output has an "<EOF>" instead of checking whether buffer has an "<EOF>". (The performance implications of this are left as an exercise to the reader)
By the way, this line doesn't do anything at the moment:
output.erase(std::find(output.begin(), output.end(), '\0'), output.end());
because '\0' never gets added to output, because with output += buffer;, a '\0' tells it where to stop adding.
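Both points above (append only the bytes actually received, and search the accumulated data rather than the latest buffer) can be sketched as one small function. This is a sketch under assumptions: extract_message and the pending parameter are hypothetical names, and the caller is assumed to pass the value returned by recv as n.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: appends exactly n received bytes to `pending`, then
// looks for the end-of-message marker in everything accumulated so far.
// On success, stores the text before the marker in `message`, keeps any bytes
// after the marker in `pending` for the next message, and returns true.
bool extract_message(std::string& pending, const char* data, std::size_t n,
                     const std::string& eom, std::string& message) {
    assert(!eom.empty());
    pending.append(data, n);  // length-based, so stray bytes are never copied
    std::size_t pos = pending.find(eom);
    if (pos == std::string::npos) return false;  // marker not complete yet
    message = pending.substr(0, pos);
    pending.erase(0, pos + eom.size());
    return true;
}
```

In the receive loop this would be called after every successful recv, for example extract_message(pending, buffer, iResult, "<EOF>", output), and the loop would stop once it returns true. Searching the whole of pending each time is simple but rescans old data, which is the performance trade-off mentioned above.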

Garbage values and Buffers differences in TCP

First question: I am confused about buffers in TCP. To explain my problem: I read this documentation, TCP Buffer; the author says a lot about the TCP buffer, which is fine and a really good explanation for a beginner. What I need to know is whether this TCP buffer is the same buffer as the one we use in our basic client-server program (Char *buffer[Some_Size]), or a different buffer held by TCP internally?
My second question: I am sending string data with a length prefix (This is data From me) from the client over a socket to the server. When I print my data at the console, it prints some garbage values along with my string, like this: "This is data From me zzzzzz 1/2 1/2.....". However, I fixed it by right-shifting nlength by 3 bits, char *recvbuf = new char[nlength>>3];, but why do I need to do it this way?
My third question relates to the first: if there is nothing like a TCP buffer and it is only about the Char *buffer[some_size], then what difference will my program notice between using such a statically allocated buffer and a dynamically allocated buffer using char *recvbuf = new char[nlength];? In short, which is best and why?
Client Code
int bytesSent;
int bytesRecv = SOCKET_ERROR;
char sendbuf[200] = "This is data From me";
int nBytes = 200, nLeft, idx;
nLeft = nBytes;
idx = 0;
uint32_t varSize = strlen (sendbuf);
bytesSent = send(ConnectSocket,(char*)&varSize, 4, 0);
assert (bytesSent == sizeof (uint32_t));
std::cout<<"length information is in:"<<bytesSent<<"bytes"<<std::endl;
// code to make sure all data has been sent
while (nLeft > 0)
{
bytesSent = send(ConnectSocket, &sendbuf[idx], nLeft, 0);
if (bytesSent == SOCKET_ERROR)
{
std::cerr<<"send() error: " << WSAGetLastError() <<std::endl;
break;
}
nLeft -= bytesSent;
idx += bytesSent;
}
std::cout<<"Client: Bytes sent:"<< bytesSent;
Server code:
int bytesSent;
char sendbuf[200] = "This string is a test data from server";
int bytesRecv;
int idx = 0;
uint32_t nlength;
int length_received = recv(m_socket,(char*)&nlength, 4, 0);//Data length info
char *recvbuf = new char[nlength];//dynamic memory allocation based on data length info
//code to make sure all data has been received
while (nlength > 0)
{
bytesRecv = recv(m_socket, &recvbuf[idx], nlength, 0);
if (bytesRecv == SOCKET_ERROR)
{
std::cerr<<"recv() error: " << WSAGetLastError() <<std::endl;
break;
}
idx += bytesRecv;
nlength -= bytesRecv;
}
cout<<"Server: Received complete data is:"<< recvbuf<<std::endl;
cout<<"Server: Received bytes are"<<bytesRecv<<std::endl;
WSACleanup();
system("pause");
delete[] recvbuf;
return 0;
}
You send 200 bytes from the client, unconditionally, but in the server you only receive the actual length of the string, and that length does not include the string terminator.
So first of all you don't receive all data that was sent (which means you will fill up the system buffers), and then you don't terminate the string properly (which leads to "garbage" output when trying to print the string).
To fix this, in the client only send the actual length of the string (the value of varSize), and in the receiving server allocate one more character for the terminator, which you of course need to add.
First question: I am confused about buffers in TCP. To explain my
problem: I read this documentation, TCP Buffer; the author says a lot
about the TCP buffer, which is fine and a really good explanation for a
beginner. What I need to know is whether this TCP buffer is the same
buffer as the one we use in our basic client-server program (Char
*buffer[Some_Size]), or a different buffer held by TCP internally?
When you call send(), the TCP stack will copy some of the bytes out of your char array into an in-kernel buffer, and send() will return the number of bytes that it copied. The TCP stack will then handle the transmission of those in-kernel bytes to its destination across the network as quickly as it can. It's important to note that send()'s return value is not guaranteed to be the same as the number of bytes you specified in the length argument you passed to it; it could be less. It's also important to note that send()'s return value does not imply that that many bytes have arrived at the receiving program; rather it only indicates the number of bytes that the kernel has accepted from you and will try to deliver.
Likewise, recv() merely copies some bytes from an in-kernel buffer to the array you specify, and then drops them from the in-kernel buffer. Again, the number of bytes copied may be less than the number you asked for, and generally will be different from the number of bytes passed by the sender on any particular call of send(). (E.g., if the sender called send() and his send() returned 1000, that might result in you calling recv() twice and having recv() return 500 each time, or recv() might return 250 four times, or (1, 990, 9), or any other combination you can think of that eventually adds up to 1000)
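Because send() may accept fewer bytes than requested, a sender usually loops until everything has been handed to the kernel, much like the client code in the question does. A minimal sketch, assuming POSIX sockets and a blocking socket; send_all is a hypothetical helper name:

```cpp
#include <cassert>
#include <cstddef>
#include <sys/socket.h>
#include <sys/types.h>

// Hypothetical helper: keeps calling send() until all len bytes have been
// accepted by the kernel. Returns false if send() reports an error.
bool send_all(int fd, const char* data, std::size_t len) {
    assert(data != nullptr || len == 0);
    std::size_t sent = 0;
    while (sent < len) {
        // send() may accept only part of what is left; continue from there.
        ssize_t n = send(fd, data + sent, len - sent, 0);
        if (n <= 0) return false;
        sent += static_cast<std::size_t>(n);
    }
    return true;
}
```

A true return only means the kernel has accepted the bytes; as noted above, it says nothing about whether the receiving program has read them yet.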
My second question: I am sending string data with a length prefix
(This is data From me) from the client over a socket to the server.
When I print my data at the console, it prints some garbage values
along with my string, like this: "This is data From me zzzzzz 1/2 1/2.....".
However, I fixed it by right-shifting nlength by 3 bits, char *recvbuf = new
char[nlength>>3];, but why do I need to do it this way?
Like Joachim said, this happens because C strings depend on the presence of a NUL-terminator byte (i.e. a zero byte) to indicate their end. You are receiving strlen(sendbuf) bytes, and the value returned by strlen() does not include the NUL byte. When the receiver's string-printing routine tries to print the string, it keeps printing until it finds a NUL byte (by chance) somewhere later on in memory; in the meantime, you get to see all the random bytes that are in memory before that point. To fix the problem, either increase your sent-bytes counter to (strlen(sendbuf)+1), so that the NUL terminator byte gets received as well, or alternatively have your receiver manually place the NUL byte at the end of the string after it has received all of the bytes of the string. Either way is acceptable (the latter way might be slightly preferable as that way the receiver isn't depending on the sender to do the right thing).
Note that if your sender is going to always send 200 bytes rather than just the number of bytes in the string, then your receiver will need to always receive 200 bytes if it wants to receive more than one block; otherwise when it tries to receive the next block it will first get all the extra bytes (after the string) before it gets the next block's send-length field.
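As a rough illustration of the receiver side, the sketch below reads the 4-byte length and then exactly that many bytes, looping because recv() may return less than requested. It assumes POSIX sockets and keeps the length in host byte order as the question's code does; recv_exact and recv_message are hypothetical names. Storing the payload in a std::string is a substitution for the new char[] buffer, chosen because the string tracks its own length, so no manual terminator is needed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <sys/socket.h>
#include <sys/types.h>

// Hypothetical helper: keeps calling recv() until exactly len bytes arrive.
bool recv_exact(int fd, char* dst, std::size_t len) {
    assert(dst != nullptr || len == 0);
    std::size_t got = 0;
    while (got < len) {
        ssize_t n = recv(fd, dst + got, len - got, 0);
        if (n <= 0) return false;  // error, or peer closed before len bytes
        got += static_cast<std::size_t>(n);
    }
    return true;
}

// Reads a 4-byte length prefix (host byte order, as in the question) and then
// that many payload bytes. Real code would also cap the accepted length.
bool recv_message(int fd, std::string& out) {
    std::uint32_t length = 0;
    if (!recv_exact(fd, reinterpret_cast<char*>(&length), sizeof(length)))
        return false;
    out.resize(length);
    return length == 0 || recv_exact(fd, &out[0], length);
}
```

This only stays in sync if the sender transmits exactly the number of bytes announced in the prefix, which is the point made in the note above about always sending 200 bytes.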
My third question relates to the first: if there is nothing like a TCP
buffer and it is only about the Char *buffer[some_size], then what
difference will my program notice between using such a statically
allocated buffer and a dynamically allocated buffer using char *recvbuf
= new char[nlength];? In short, which is best and why?
In terms of performance, it makes no difference at all. send() and recv() don't care a bit whether the pointers you pass to them point at the heap or the stack.
In terms of design, there are some tradeoffs: if you use new, there is a chance that you can leak memory if you don't always call delete[] when you're done with the buffer. (This can particularly happen when exceptions are thrown, or when error paths are taken). Placing the buffer on the stack, on the other hand, is guaranteed not to leak memory, but the amount of space available on the stack is finite so a really huge array could cause your program to run out of stack space and crash. In this case, a single 200-byte array on the stack is no problem, so that's what I would use.

How to read an input from a client via socket in Linux in C++?

My goal is to create a client-server app written in C++.
When the server reads input from the client, it should process the string and give an output.
Basically, I have a simple echo server that sends the same message back.
But if the user types a special string (like "quit"), the program has to do something else.
My problem is that this doesn't happen, because the comparison between strings is not working... I don't know why!
Here is some simple code:
while(1) {
int num = recv(client,buffer,BUFSIZE,0);
if (num < 1) break;
send(client, ">> ", 3, 0);
send(client, buffer, num, 0);
char hello[6] ="hello";
if(strcmp(hello,buffer)==0) {
send(client, "hello dude! ", 12, 0);
}
buffer[num] = '\0';
if (buffer[num-1] == '\n')
buffer[num-1] = '\0';
std::cout << buffer;
strcpy(buffer, "");
}
Why is the comparison not working?
I have tried many solutions...but all failed :(
Your data in buffer may not be null-terminated, because buffer contains random data if not initialized. You only know the content of the first num bytes. Therefore you also have to check how much data you've received before comparing the strings:
const char hello[6] ="hello";
size_t hello_sz = sizeof hello - 1;
if(num == hello_sz && memcmp(hello, buffer, hello_sz) == 0) { ...
As a side note, this protocol will be fragile unless you delimit your messages, so in the event of fragmented reads (receive "hel" on first read, "lo" on the second) you can tell where one message starts and another one ends.
strcmp requires null terminated strings. The buffer you read to might have non-null characters after the received message.
Either right before the read do:
ZeroMemory(buffer, BUFSIZE); //or your compiler defined equivalent
Or right after the read
buffer[num] = '\0';
This will ensure that there is a terminating null at the end of the received message and the comparison should work.
A string is defined to be an array of chars up to and including the terminating \0 byte. Initially your buffer contains arbitrary bytes, and is not even guaranteed to contain a string. You have to set buffer[num] = '\0' to make it a string.
That of course means that recv should not read sizeof buffer bytes but one byte less.
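Putting these suggestions together, one possible length-aware comparison is sketched below. It assumes the client (telnet, for instance) may append "\n" or "\r\n" to what the user typed, which the question's own newline-stripping code suggests but does not confirm; matches_command is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: compares the first num bytes of buffer with command,
// ignoring any trailing line-ending characters in the received data.
bool matches_command(const char* buffer, int num, const std::string& command) {
    assert(buffer != nullptr);
    if (num <= 0) return false;
    // Build the string from an explicit length, never from a terminator.
    std::string received(buffer, static_cast<std::size_t>(num));
    while (!received.empty() &&
           (received.back() == '\n' || received.back() == '\r')) {
        received.pop_back();
    }
    return received == command;
}
```

In the loop from the question this would replace the strcmp call, for example if (matches_command(buffer, num, "hello")). It still assumes a whole command arrives in a single recv, so the caveat about fragmented reads applies.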

C++ , Send() function sends extra bytes

I am having trouble with my Winsock2 wrapper classes (client-server) and after countless hours of scratching-my-head-in-confusion, I decided it would be better if I asked your opinion.
To be more specific, the problem is that every time I use my Send() function, both the client and the server (not always!) send one or two extra bytes!
For example I use SendBytes("Hello") and the Recv function returns "Hello•" with a '•' or other random characters at the end of the character array.
//main.cpp (Client)
#include "Socket.h"
int main()
{
NetworkService::Client cService = NetworkService::Client();
int res = cService.Initialize("127.0.0.1","20248");
if(res == 0){
int local = cService.SendBytes("Hello!");
printf("Bytes Sent: %ld\n", local);
cService.Shutdown();
char* temp = cService.Recv();
printf("String Recieved: %s - Size: %d",temp,strlen(temp));
printf("\nSTRLEN: %d",strlen("X5"));
}
else{
cService.Clean();
}
cService.Close();
while(!kbhit());
return 0;
}
Of course, the server sends the string "X5" and the client prints the strlens ...
//The result with "X5" as the dummy text:
String Recieved: X5? - Size: 3 //Notice the extra '?' character
STRLEN: 2
Send / Receive functions
int NetworkService::Client::SendBytes(char* lData){
int local = send( ConnectSocket, lData, (int)strlen(lData), 0 );
if (local == SOCKET_ERROR) {
Close();
return WSAGetLastError();
}
return local;
}
char* NetworkService::Client::Recv(){
recv(ConnectSocket, recvbuf , recvbuflen, 0);
return recvbuf;
}
Help would be appreciated ^_^.
Excuse me, but
int local;
(...)
return (int*)local;
What were you trying to achieve? There are many serious problems in your code.
This is not the way you send data over the network. There are too many errors.
If you want to send null-terminated strings over the network:
int local = send( ConnectSocket, lData, (int)strlen(lData), 0 );
as everyone said, you don't actually send the null terminator. You would have sent it if you added 1 to the length. Moreover, with long strings, send() is not guaranteed to send the whole string at once. You have to check for that and send the missing part.
recv(ConnectSocket, recvbuf , recvbuflen, 0);
You don't check the return value, so you can't know the length of the received string. As you don't send the null byte, the received data is not null-terminated. Also, if the null terminator is the only delimiter between the messages you send, you'll have to read byte-by-byte (not efficient) so that you don't read past the terminator and know when to finish. An alternative would be to make your own buffering scheme (so the next read would partially return the result of the previous), or change the protocol to make the length of the transported data known beforehand. Also, the same remark about partial reads as with the send function applies here.
BTW returning a static/global buffer is not a sign of good code.
You don't really check the return value of recv.
There is a do-while but it doesn't do anything. You return from the function without proper error handling even when recv fails, but you will never know it.
Also, you don't send the terminating \0, which isn't necessarily bad; it depends on what you're trying to do. For example, you can add it after receiving.
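To make the return-value point concrete, here is one way a receive wrapper could hand back exactly the bytes that arrived instead of a shared buffer. It is a sketch only: it uses POSIX types (with Winsock the socket would be a SOCKET and recv returns int), and recv_chunk is a hypothetical free function, not part of the NetworkService class.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <sys/socket.h>
#include <sys/types.h>

// Hypothetical helper: performs one recv() and stores exactly the bytes that
// arrived in `out`. Returns false on error or when the peer has closed.
bool recv_chunk(int fd, std::string& out) {
    assert(fd >= 0);
    char buf[512];
    ssize_t n = recv(fd, buf, sizeof(buf), 0);
    if (n <= 0) {
        out.clear();
        return false;
    }
    // The length comes from recv's return value, so no terminator is needed.
    out.assign(buf, static_cast<std::size_t>(n));
    return true;
}
```

The caller can then print out.c_str() and out.size() safely. One call still returns only whatever has arrived so far, so a complete protocol needs a length prefix or a delimiter on top of this.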

Reading socket reply in loop

I have:
char buf[320];
read(soc, buf, sizeof(buf));
//print buf;
However, sometimes the reply is much bigger than 320 characters, so I'm trying to run the read in a loop to avoid taking up too much memory space. I tried read(soc, buf, sizeof(buf)) but that only prints the same first x characters over again. How would I print the leftover characters that did not fit into the first 320 characters in a loop?
Thanks
Change your loop to something like:
int numread;
while(1) {
if ((numread = read(soc, buf, sizeof(buf) - 1)) == -1) {
perror("read");
exit(1);
}
if (numread == 0)
break;
buf[numread] = '\0';
printf("Reply: %s\n", buf);
}
for the reasons Nikola states.
Every time you call read( s, buf, buf_size ) the kernel copies min( buf_size, bytes_available ) into the buf, where bytes_available is the number of bytes already received and waiting in socket receive buffer. The read(2) system call returns the number of bytes placed into application buffer, or -1 on error, or 0 to signal EOF, i.e. a close(2) of the socket on the sending end. Thus when you reuse the buffer, only part of it might be overwritten with new data. Also note that -1 evaluates to true in C and C++. This is probably the case you are hitting.
printf(3) expects a zero-terminated string for the %s format specifier. The bytes read from the socket might not contain the '\0' byte, thus letting printf(3) print until it finds a zero somewhere further down in memory. This might lead to a buffer overrun.
The points here are:
Always check the value returned from read(2)
If you print strings read from a socket - always zero-terminate them manually.
Hope this helps.