I am using the read function to read data from a socket, but when the data is larger than 4k, read returns only part of it, for example less than 4k. Here is the key code:
mSockFD = socket(AF_INET, SOCK_STREAM, 0);
if (connect(mSockFD, (const sockaddr*)(&mSockAdd), sizeof(mSockAdd)) < 0)
{
cerr << "Error connecting in Crawl" << endl;
perror("");
return false;
}
n = write(mSockFD, httpReq.c_str(), httpReq.length());
bzero(mBuffer, BUFSIZE);
n = read(mSockFD, mBuffer, BUFSIZE);
Note that BUFSIZE is much larger than 4k.
When the data is just a few hundred bytes, read works as expected.
This is by design and to be expected.
The short answer to your question is you should continue calling "read" until you get all the data you expect. That is:
int total_bytes = 0;
int expected = BUFSIZE; // replace with the number of bytes you actually expect
char *buffer = (char *)malloc(BUFSIZE + 1); // +1 for null at the end
while (total_bytes < expected)
{
    int bytes_read = read(mSockFD, buffer + total_bytes, expected - total_bytes);
    if (bytes_read <= 0)
        break; // 0: peer closed the connection, < 0: error
    total_bytes += bytes_read;
}
buffer[total_bytes] = 0; // null terminate - good for debugging as a string
From my experience, one of the biggest misconceptions (resulting in bugs) is that you'll receive as much data as you ask for. I've seen shipping code in real products written with the expectation that sockets work this way (with no one certain as to why it doesn't work reliably).
When the other side sends N bytes, you might get lucky and receive it all at once. But you should plan for receiving N bytes spread out across multiple recv calls. With the exception of a real network error, you'll eventually get all N bytes. Segmentation, fragmentation, TCP window size, MTU, and the socket layer's data chunking scheme are the reasons for all of this. When partial data is received, the TCP layer doesn't know about how much more is yet to come. It just passes what it has up to the app. It's up to the app to decide if it got enough.
Likewise, the data from separate "send" calls can be coalesced into the same packet.
There may be ioctls and such that will make a socket block until all the expected data is received. But I don't know of any off hand.
Also, don't use read and write for sockets. Use recv and send.
Read this book. It will change your life with regards to sockets and TCP:
I'm writing a C++ program. I need to receive a file and I'm using recv() function over a TCP socket to do that.
download_file() {
while (left_bytes != 0 && !connection_closed) {
if (left_bytes >= buffer_max_size)
bytes_to_download = buffer_max_size;
else
bytes_to_download = left_bytes;
if (request.conn->read_data(buffer, bytes_to_download))
{
left_bytes -= buffer->get_size();
temporary_file.write_data(buffer);
} else connection_closed = true;
}
}
read_data() {
while (bytes_received < size && alive_) {
bytes_read = recv(sock_, read_buffer, size, 0);
if (bytes_read == SOCKET_ERROR) {
delete[] local_buffer;
throw SocketException(WSAGetLastError());
}
// the connection is closed
if (bytes_read == 0) alive_ = false;
else {
bytes_received += bytes_read;
buffer->add(local_buffer, bytes_read);
}
}
}
The problem is that recv never returns. It receives the whole file except for a few KB and then freezes on recv(). The buffer size is 1460.
I receive the whole file only if I print something to the console with cout every time recv is called.
Alternatively, if I set the WAITALL socket option and the client closes the connection after the file is sent, I also receive the whole file.
Here's the code for the Client side that sends the file:
TransmitFile(file_request->connection_->get_handle_socket(), file_handler.get_file_handle(), file_request->file_size_, 65535, nullptr, nullptr, TF_USE_SYSTEM_THREAD)
EDIT
Here's how I send and read the file size between the Client and Server.
std::stringstream stream_;
stream_.str(std::string());
// append the file size
const __int64 file_size = htonll(GetFileSize(file_handle_, nullptr));
stream_ << ' ' << file_size << ' ';
Then I use send() to send this string.
Here's how I read the file size:
// Within stream_ there is all the content of the received packet
std::string message;
std::getline(stream_, message, ' ');
this->request_body_.file_size_ = ntohll(strtoll(message.c_str(), nullptr, 0));
EDIT
I cleaned up the code and found out that read_data() is obviously called once and that I was updating the buffer variable incorrectly. Hence I was tracking the size of the content within the buffer wrongly, which made me call recv() once more.
First thing: recv() will block if there are no bytes left to read but the connection is still open. So whatever you might say about what your code is doing, that must be what is happening here.
That could be for any of the following reasons:
the sender lied about the size of the file, or did not send the promised number of bytes
the file size was not interpreted correctly at the receiving end for whatever reason
the logic that 'counts down' the number of bytes left in the receiver is somehow flawed
Trouble is, looking at the code samples you have posted, it's hard to say which because the code is a bit muddled and, in my eyes, more complicated than it needs to be. I'm going to recommend you sort that out.
Sending the size of the file.
Don't mess about sending this as a string. Send it instead in binary, using (say) htonll() at the sending end and ntohll() at the receiving end. Then, the receiver knows to read exactly 8 bytes to figure out what's coming next. It's hard to get that wrong.
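As a rough illustration of the binary size prefix, here is a minimal sketch that assumes the size travels as 8 bytes in big-endian (network) order. The helper names encode_u64_be and decode_u64_be are hypothetical; they do by hand what htonll() and ntohll() do, so the sketch does not depend on a platform-specific header.

```cpp
#include <array>
#include <cstdint>

// Hypothetical helper: write a 64-bit value as 8 bytes, most significant first.
std::array<unsigned char, 8> encode_u64_be(std::uint64_t value) {
    std::array<unsigned char, 8> out{};
    for (int i = 7; i >= 0; --i) {
        out[i] = static_cast<unsigned char>(value & 0xFF);
        value >>= 8;
    }
    return out;
}

// Hypothetical helper: rebuild the value from 8 big-endian bytes.
std::uint64_t decode_u64_be(const std::array<unsigned char, 8>& in) {
    std::uint64_t value = 0;
    for (unsigned char byte : in) {
        value = (value << 8) | byte;
    }
    return value;
}
```

The sender would hand the 8 encoded bytes to send(), and the receiver would read exactly 8 bytes (in a loop, since recv() may return fewer) before decoding them.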
Sending the file itself.
TransmitFile() looks to be a good choice here. Stick with it.
Receiving the file and counting down how many bytes are left.
Take a closer look at that code and consider rewriting it. It's a bit of a mess.
What to do if it still doesn't work.
Check with Wireshark that the expected data is being sent and then walk through the code in the receiver in the debugger. There is absolutely no excuse for not doing this unless you don't have a debugger for some reason, in which case please say so and somebody will try to help you. The fact that logging to cout fixes your problems is a red herring. That just changes the timing and then it just happens to work right.
That's all. Best of luck.
This is more of a request for confirmation than a question, so I'll keep it brief. (I am away from my PC and so can't simply implement this solution to test).
I'm writing a program to send an image file taken via webcam (along with meta data) from a raspberryPi to my PC.
I've worked out that the image is roughly 130 KB, the packet header is 12 bytes, and the associated metadata another 24 bytes, though I may increase the image size in future once I have a working prototype.
At the moment I am not able to retrieve this whole packet successfully: after sending it to the PC, I only ever get approximately 64 KB received in the buffer.
I have assumed that this is because for whatever reason the default buffer size for a socket declared like:
SOCKET sock = socket(PF_INET, SOCK_STREAM, 0);
is 64 KB (please could someone clarify this if you're 'in the know').
So, to fix this problem, I intend to increase the socket buffer size to 1024 KB via the setsockopt(x..) command.
Please could someone confirm that my diagnosis of the problem and my proposed solution are correct?
I ask this question as I am away from my PC right now and am unable to try it until I get back home.
This most likely has nothing to do with the socket buffers, but with the fact that recv() and send() do not have to receive and send all the data you want. Check the return value of those function calls; it indicates how many bytes have actually been sent or received.
The best way to deal with "short" reads/writes is to put them in a loop, like so:
char *buf; // pointer to your data
size_t len; // length of your data
int fd; // the socket filedescriptor
size_t offset = 0;
ssize_t result;
while (offset < len) {
    result = send(fd, buf + offset, len - offset, 0);
    if (result < 0) {
        if (errno == EINTR)
            continue; // interrupted by a signal, retry
        // Deal with other errors here
        break;
    }
    offset += result;
}
Use a similar construction for receiving data. Note that one possible error condition is that the function call was interrupted by a signal (errno = EINTR); in that case you should retry the send command. On a non-blocking socket, EAGAIN or EWOULDBLOCK means the call could not proceed right now, so wait until the socket is writable (for example with poll()) and then retry. In all other cases you should exit the loop.
I want to use the function recv(socket, buf, len, flags) to receive an incoming packet. However, I do not know the length of this packet prior to runtime, so the first 8 bytes are supposed to tell me its length. I don't want to just allocate an arbitrarily large len to accomplish this, so is it possible to set len = 8 and have buf be of type uint64_t? Then afterwards
memcpy(dest, &buf, buf)?
Since TCP is stream-based, I'm not sure what type of packages you mean. I will assume that you are referring to application level packages. I mean packages which are defined by your application and not by underlying protocols like TCP. I will call them messages instead to avoid confusion.
I will show two possibilities. First I will show how you could read a message without knowing its length before you have finished reading. The second example makes two calls: first it reads the size of the message, then it reads the whole message at once.
Read data until the message is complete
Since TCP is stream-based, you will not lose any data when your buffer is not big enough. So you can read a fixed number of bytes, and if something is missing, you can call recv again. Here is an extensive example. I just wrote it without testing, but I hope it works.
std::size_t offset = 0;
std::vector<char> buf(512);
std::vector<char> readMessage() {
    while (true) {
        ssize_t ret = recv(fd, buf.data() + offset, buf.size() - offset, 0);
        if (ret < 0) {
            if (errno == EINTR) {
                // Interrupted, just try again ...
                continue;
            } else {
                // Error occurred. Throw exception.
                throw IOException(strerror(errno));
            }
        } else if (ret == 0) {
            // No data available anymore.
            if (offset == 0) {
                // Client did just close the connection
                return std::vector<char>(); // return empty vector
            } else {
                // Client did close connection while sending package?
                // It is not a clean shutdown. Throw exception.
                throw ProtocolException("Unexpected end of stream");
            }
        }
        offset += ret; // count the bytes just received
        buf.resize(offset); // drop the unused tail before inspecting the data
        if (isMessageComplete(buf)) {
            // Message is complete.
            std::vector<char> msg = std::move(buf);
            buf.clear(); // a moved-from vector is in an unspecified state
            std::size_t msgLen = getSizeOfMessage(msg);
            if (msg.size() > msgLen) {
                // msg already contains the beginning of the next message.
                // write it back to buf
                buf.assign(msg.begin() + msgLen, msg.end());
                msg.resize(msgLen);
            }
            offset = buf.size(); // leftover bytes stay at the front of buf
            buf.resize(std::max<std::size_t>(2 * offset, 512)); // prepare buffer for next message
            return msg;
        } else {
            // Message is not complete right now. Read more...
            buf.resize(std::max<std::size_t>(512, 2 * offset)); // double available memory
        }
    }
}
You have to define bool isMessageComplete(std::vector<char>) and std::size_t getSizeOfMessage(std::vector<char>) yourself.
Read the header and check the length of the package
The second possibility is to read the header first: just the 8 bytes which contain the size of the package in your case. After that, you know the size of the package. This means you can allocate enough storage and read the whole message at once:
/// Reads n bytes from fd.
bool readNBytes(int fd, void *buf, std::size_t n) {
std::size_t offset = 0;
char *cbuf = reinterpret_cast<char*>(buf);
while (true) {
ssize_t ret = recv(fd, cbuf + offset, n - offset, MSG_WAITALL);
if (ret < 0) {
if (errno != EINTR) {
// Error occurred
throw IOException(strerror(errno));
}
} else if (ret == 0) {
// No data available anymore
if (offset == 0) return false;
else throw ProtocolException("Unexpected end of stream");
} else if (offset + ret == n) {
// All n bytes read
return true;
} else {
offset += ret;
}
}
}
/// Reads message from fd
std::vector<char> readMessage(int fd) {
std::uint64_t size;
if (readNBytes(fd, &size, sizeof(size))) {
std::vector<char> buf(size);
if (readNBytes(fd, buf.data(), size)) {
return buf;
} else {
throw ProtocolException("Unexpected end of stream");
}
} else {
// connection was closed
return std::vector<char>();
}
}
The flag MSG_WAITALL requests that the function block until the full amount of data is available. However, you cannot rely on that. You have to check the result and read again if something is missing, as I did above.
readNBytes(fd, buf, n) reads n bytes. As long as the connection is not closed by the other side, the function will not return without reading n bytes. If the connection was closed by the other side before any byte was read, the function returns false. If the connection was closed in the middle of a message, an exception is thrown. If an I/O error occurred, another exception is thrown.
readMessage reads 8 bytes [sizeof(std::uint64_t)] and uses them as the size of the next message. Then it reads the message.
If you want platform independence, you should convert size to a defined byte order. Computers with x86 architecture use little endian, while it is common to use big endian in network traffic.
Note: With MSG_PEEK it is possible to implement this functionality for UDP. You can request the header while using this flag. Then you can allocate enough space for the whole package.
A fairly common technique is to read leading message length field, then issue a read for the exact size of the expected message.
HOWEVER! Do not assume that the first read will give you all eight bytes (see Note), or that the second read will give you the entire message/packet.
You must always check the number of bytes read and issue another read (or two (or three, or...)) to get all the data you want.
Note: Because TCP is a streaming protocol and because the packet size "on the wire" varies in accordance with a very arcane algorithm designed to maximize network performance, you could easily issue a read for eight bytes and have it return after reading only three (or seven or ...) bytes. The guarantee is that, unless there is an unrecoverable error or the peer closes the connection, a blocking read returns at least one byte and at most the number of bytes you requested. Because of this, you must be prepared to do byte address arithmetic and issue all reads in a loop that repeats until the desired number of bytes has been received.
Since TCP is streaming there isn't really any end to the data you receive, not until the connection is closed or there is an error.
Instead you need to implement your own protocol on top of TCP, one that either contains a specific end-of-message marker, a length-of-data header field, or possibly a command-based protocol where the data of each command is of a well-known size.
That way you can read into a small fixed-sized buffer and append to a larger (possibly expanding) buffer as needed. The "possibly expanding" part is ridiculously easy in C++, what with std::vector and std::string (depending on the data you have)
There is another important thing to remember, that since TCP is stream-based, a single read or recv call may not actually fetch all the data you request. You need to receive the data in a loop until you have received everything.
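To make the end-of-message-marker variant concrete, here is a minimal sketch that assumes messages are terminated by a single delimiter byte (a newline by default). The helper name extract_messages is hypothetical; each chunk returned by recv would be fed to it, and the std::string acts as the expanding buffer.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: append a freshly received chunk to `pending` and
// return every complete message, leaving any partial tail in `pending`.
std::vector<std::string> extract_messages(std::string& pending,
                                          const char* chunk, std::size_t len,
                                          char delimiter = '\n') {
    pending.append(chunk, len);
    std::vector<std::string> messages;
    std::size_t start = 0;
    std::size_t end;
    while ((end = pending.find(delimiter, start)) != std::string::npos) {
        messages.push_back(pending.substr(start, end - start));
        start = end + 1;  // skip the delimiter itself
    }
    pending.erase(0, start);  // keep only the incomplete remainder
    return messages;
}
```

Because the leftover bytes stay in pending, it does not matter whether one recv call delivers half a message or several messages at once.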
In my personal opinion, I suggest receiving the size of the message (a fixed 4-byte integer) first:
recv(socket, "size of message written in integer" , "size of integer")
Then receive the real message:
recv(socket, " real message" ,"size of message written in integer")
This technique can also be used for sending files, images, and long messages.
Currently, I'm learning how to build a transparent HTTP proxy in C++. There are two issues on the proxy client side that I haven't been able to resolve for a long time. I hope someone can point out the root causes based on the following scenarios. Thanks a lot. :D
The HTTP proxy I have built so far only works partially. For example, I can access Google's main page through the proxy, but I can't get any search results after typing a keyword (Google Instant doesn't work at all either). On the other hand, YouTube works perfectly, including searching, loading videos, and commenting. What's more, some websites, such as Yahoo, can't even display their main page after I key in the URL.
The reason I said at the beginning that the issues are on the proxy client side is that I traced the data flow of my program. I found that the written size returned by the socket function write() was smaller than the data size I passed to my write-back function. The weirdest observation for me was that the data loss is independent of the size of the data: write() works properly for YouTube video data, which is nearly 2MB, yet loses data for a Google search request, which is just 20KB.
Furthermore, there is another situation in which the browser displays a blank page even though the data size I passed to my write-back function and the written size returned by write() are the same. I used Wireshark to trace the communication and compared it with a direct connection without the proxy. I found that, compared with the direct flow, the browser stopped sending further HTTP requests after it received certain HTTP responses. I couldn't find out why the browser didn't send the rest of the HTTP requests.
The following is the code for my write-back function:
void Proxy::get_data(char* buffer, size_t length)
{
cout<<"Length:"<<length<<endl;
int connfd;
size_t ret;
// get connfd from buffer
memset(&connfd, 0, sizeof(int));
memcpy(&connfd, buffer, sizeof(int));
cout<<"Get Connection FD:"<<connfd<<endl;
// get receive data size
size_t rData_length = length-sizeof(int);
cout<<"Data Size:"<<rData_length<<endl;
// create receive buffer
char* rBuf = new char[rData_length];
// allocate memory to receive buffer
memset(rBuf, 0, rData_length);
// copy data to buffer
memcpy(rBuf, buffer+sizeof(int), rData_length);
ret = write(connfd, rBuf, rData_length);
if(ret < 0)
{
cout<< "received data failed"<< endl;
close(connfd);
delete[] rBuf;
exit(1);
}
else
{
printf("Write Data[%d] to Socket\n", ret);
}
close(connfd);
delete[] rBuf;
}
Maybe you could try this:
size_t curr = 0;
while( curr < rData_length ) {
    ssize_t n = write( connfd, rBuf + curr, rData_length - curr );
    if( n == -1 ) { /* ERROR */ break; }
    else
        curr += n;
}
instead of
ret = write(connfd, rBuf, rData_length);
In general, the number of bytes written by write() can differ from the number you asked it to write. It is worth reading the manual, for example http://linux.die.net/man/2/write
Copying bytes between an input socket and an output socket is much simpler than this. You don't need to dynamically allocate buffers according to how much data was read by the last read. You just need to read into a char[] array and write from that array to the target, taking due account of the length value returned by the read.
I am working on network programming using epoll and I have this code...
int read = read(socket, buf, bufsize);
I have a huge buffer size and I assumed it would receive everything clients sent.
However, I started facing problems like packet segmentation.
One example is that if a client sent 500 bytes but it somehow got split into two 250-byte packets, then there is no way to handle this situation.
I looked online and found this code:
int handle_read(client *cli, struct epoll_event *ev) {
size_t len = 4096;
char *p;
ssize_t received;
cli->state = 1;
if (cli->buffer != NULL) {
//free(cli->buffer);
//printf("Buff not null %s\n", cli->buffer);
}
//allocate space for data
cli->buffer = (char*)malloc( (size_t)(sizeof(char) * 4096) );
p = cli->buffer;
do { //read until loop conditions
received = recv(ev->data.fd, p, len, 0);
if (received < 0 && errno != EAGAIN && errno != EWOULDBLOCK) {
//if error, remove from epoll and close socket
printf("Handle error!!!\nClient disconnected!\n");
epoll_ctl(epollfd, EPOLL_CTL_DEL, ev->data.fd, ev);
close(ev->data.fd);
}
p = &cli->buffer[received];
} while (received >= len && errno != EAGAIN && errno != EWOULDBLOCK);
return received;
}
Do you think it handles all the errors that might happen while receiving? Also, could you please point me to tutorials or examples that handle socket errors? Sample code online doesn't cover the details. Thanks in advance.
recv can return any of three things, and your code needs to handle each one correctly:
1) Positive number. This means it read some bytes.
2) Negative number. This means an "error" occurred.
3) Zero. This means the other end of the connection performed a successful shutdown() (or close()) on the socket. (In general, a return of 0 from read() or recv() means EOF.)
The "error" case further breaks down into "EAGAIN or EWOULDBLOCK" and "everything else". The first two just mean that it is a non-blocking socket and there was no data to give you at this time. You probably want to go back and call poll() (or select() or epoll_wait()) again to avoid busy waiting...
"Everything else" means a real error. You need to handle those too; see the POSIX spec for recv() for a complete list.
Given all this, I would say your sample code is bad for several reasons. It does not handle 0 (closed connection) properly. It does not handle any errors. It does a busy-loop when the recv() returns EAGAIN/EWOULDBLOCK.
Oh, and it uses sizeof(char), which is a sure sign it was written by somebody who is not familiar with the C or C++ programming languages.
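The three cases can be spelled out in code. The following is only a sketch: RecvOutcome and classify_recv are hypothetical names, and EINTR is added as a separate outcome on the assumption that a signal may interrupt the call.

```cpp
#include <cerrno>
#include <sys/types.h>

// Hypothetical names for the outcomes described above.
enum class RecvOutcome { Data, Closed, WouldBlock, Interrupted, Error };

// Classify the return value of recv() together with the errno it left behind.
RecvOutcome classify_recv(ssize_t ret, int err) {
    if (ret > 0) return RecvOutcome::Data;      // ret bytes were read
    if (ret == 0) return RecvOutcome::Closed;   // orderly shutdown by the peer (EOF)
    if (err == EAGAIN || err == EWOULDBLOCK)
        return RecvOutcome::WouldBlock;         // nothing to read now; wait for readiness again
    if (err == EINTR) return RecvOutcome::Interrupted;  // call recv() again
    return RecvOutcome::Error;                  // real error: clean up the socket
}
```

A handler would call recv(), pass the result and errno to this function right away, append the bytes on Data, and remove the descriptor from epoll and close it on Closed or Error.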
Normally you can't know how much data the client sent. You should use a self-describing data format (one that carries the data length in a header) or a separator between data tokens. For example, you may add \xff between one piece of data and the next. Alternatively, use a fixed-size data format.