Infinite read from socket - c++

What is the right way to read chunked data (from an HTTP request) from a socket?
sf::TcpSocket socket;
socket.connect("0.0.0.0", 80);
std::string message = "GET /address HTTP/1.1\r\n";
socket.send(message.c_str(), message.size() + 1);
// Receive an answer from the server
char buffer[128];
std::size_t received = 0;
socket.receive(buffer, sizeof(buffer), received);
std::cout << "The server said: " << buffer << std::endl;
But the server sends an endless stream of data and socket.receive never returns control. What is the right way to read chunked data part by part? (The response is chunked data.)

The right way to process HTTP requests is to use a higher-level library that manages the socket connections for you. In C++ one example would be pion-net; there are others too like Mongoose (which is C, but fine to use in C++).

Well, infinite data is theoretically possible, while the practical implementation differs from process to process.
Approach 1 - Many protocols send the size in the first few bytes (for example, 4 bytes), so you can read it in a while loop:
{
    std::size_t i = 0, received = 0;
    unsigned char buffer[4];
    // receive() may deliver fewer bytes than requested, so loop until all 4 have arrived
    while (i < 4 && socket.receive(buffer + i, 4 - i, received) == sf::Socket::Done)
        i += received;
    // have a while loop to read the amount of data you need. Malloc the buffer accordingly
}
Approach 2 - Or, in your case, where you don't know the length (infinite):
{
    char *buffer = (char *)malloc(TCP_MAX_BUF_SIZE);
    std::size_t total = 0, received = 0;
    // append each piece after the previous one; stop when the buffer is full or the socket fails
    while (total < TCP_MAX_BUF_SIZE &&
           socket.receive(buffer + total, TCP_MAX_BUF_SIZE - total, received) == sf::Socket::Done) {
        total += received;
    }
    // do something with your data
    free(buffer);
}
You will have to break at some point and process your data: dispatch it to another thread or release the memory.
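Approach 1 can be sketched on an in-memory byte buffer, independent of any socket library. The helper names `frameMessage` and `tryUnframe` and the 4-byte big-endian prefix are assumptions made for illustration; a real protocol defines its own layout.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: prepend a 4-byte big-endian length to a payload.
std::string frameMessage(const std::string& payload)
{
    const std::uint32_t n = static_cast<std::uint32_t>(payload.size());
    std::string out;
    out += static_cast<char>((n >> 24) & 0xFF);
    out += static_cast<char>((n >> 16) & 0xFF);
    out += static_cast<char>((n >> 8) & 0xFF);
    out += static_cast<char>(n & 0xFF);
    out += payload;
    return out;
}

// Hypothetical helper: if `pending` holds a complete frame, move its payload
// into `payload`, drop the frame from `pending`, and return true.
// Returns false when more bytes must be received first.
bool tryUnframe(std::string& pending, std::string& payload)
{
    if (pending.size() < 4)
        return false; // length prefix not complete yet
    std::uint32_t n = 0;
    for (std::size_t i = 0; i < 4; ++i)
        n = (n << 8) | static_cast<unsigned char>(pending[i]);
    if (pending.size() - 4 < n)
        return false; // payload not complete yet
    payload = pending.substr(4, n);
    pending.erase(0, 4 + static_cast<std::size_t>(n));
    return true;
}
```

The caller would append every received piece to `pending` and call `tryUnframe` until it returns false, so a message split across several reads is handled the same way as several messages arriving in one read.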

If by "chunked data" you are referring to the Transfer-Encoding: chunked HTTP header, then you need to read each chunk and parse the chunk headers to know how much data to read in each chunk and to know when the last chunk has been received. You cannot just blindly call socket.receive(), as chunked data has a defined structure to it. Read RFC 2616 Section 3.6.1 for more details.
You need to do something more like the following (error handling omitted for brevity - DON'T omit it in your real code):
std::string ReadALine(sf::TcpSocket &socket)
{
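std::string result;
char ch;
std::size_t received = 0;
// read from socket until a LF is encountered, then
// return everything up to, but not including, the
// LF, stripping off CR if one is also present...
while (socket.receive(&ch, 1, received) == sf::Socket::Done && received == 1)
{
if (ch == '\n') break;
result += ch;
}
if (!result.empty() && result[result.size() - 1] == '\r')
result.erase(result.size() - 1);
return result;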
}
void ReadHeaders(sf::TcpSocket &socket, std::vector<std::string> &headers)
{
std::string line;
do
{
line = ReadALine(socket);
if (line.empty()) return;
headers.push_back(line);
}
while (true);
}
std::string UpperCase(const std::string &s)
{
std::string result = s;
// uppercase each character in place (needs <algorithm> and <cctype>)
std::transform(result.begin(), result.end(), result.begin(), ::toupper);
return result;
}
std::string GetHeader(const std::vector<std::string> &headers, const std::string &s)
{
std::string prefix = UpperCase(s) + ":";
for (std::vector<std::string>::const_iterator iter = headers.begin(), end = headers.end(); iter != end; ++iter)
{
// compare the header name case-insensitively
if (UpperCase(*iter).compare(0, prefix.length(), prefix) == 0)
return iter->substr(prefix.length());
}
return std::string();
}
sf::TcpSocket socket;
socket.connect("0.0.0.0", 80);
std::string message = "GET /address HTTP/1.1\r\nHost: localhost\r\n\r\n";
socket.send(message.c_str(), message.length());
std::vector<std::string> headers;
std::string statusLine = ReadALine(socket);
ReadHeaders(socket, headers);
// Refer to RFC 2616 Section 4.4 for details about how to properly
// read a response body in different situations...
int statusCode;
sscanf(statusLine.c_str(), "HTTP/%*d.%*d %d %*s", &statusCode);
if (
((statusCode / 100) != 1) &&
(statusCode != 204) &&
(statusCode != 304)
)
{
std::string header = GetHeader(headers, "Transfer-Encoding");
if (UpperCase(header).find("CHUNKED") != std::string::npos)
{
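std::string line, extensions;
std::string::size_type pos;
unsigned long chunkSize;
do
{
line = ReadALine(socket);
pos = line.find(";");
if (pos != std::string::npos)
{
extensions = line.substr(pos+1);
line.resize(pos);
}
else
extensions.clear();
// the chunk size is a hexadecimal number (needs <cstdlib>)
chunkSize = std::strtoul(line.c_str(), NULL, 16);
if (chunkSize == 0)
break;
// receive() may return fewer bytes than requested, so loop
// until the whole chunk is in someBuffer...
std::size_t got = 0, received = 0;
while (got < chunkSize && socket.receive(someBuffer + got, chunkSize - got, received) == sf::Socket::Done)
got += received;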
ReadALine(socket);
// process extensions as needed...
// copy someBuffer into your real buffer...
}
while (true);
std::vector<std::string> trailer;
ReadHeaders(socket, trailer);
// merge trailer into main header...
}
else
{
header = GetHeader(headers, "Content-Length");
if (!header.empty())
{
// parse the decimal length (needs <cstdlib>)
uint64_t contentLength = std::strtoull(header.c_str(), NULL, 10);
// read from socket until contentLength number of bytes have been read...
}
else
{
// read from socket until disconnected...
}
}
}
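To make the chunk structure concrete without involving a socket, here is a minimal sketch that decodes a complete chunked body already held in memory. The function name `decodeChunked` is hypothetical, and the sketch skips chunk extensions and trailer fields instead of interpreting them.

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>

// Decode a complete "Transfer-Encoding: chunked" body held in memory.
// Returns false if the input is truncated or malformed.
bool decodeChunked(const std::string& raw, std::string& body)
{
    std::size_t pos = 0;
    body.clear();
    for (;;)
    {
        // chunk-size line: hex digits, optional ";extensions", then CRLF
        const std::size_t eol = raw.find("\r\n", pos);
        if (eol == std::string::npos)
            return false;
        const std::string sizeLine = raw.substr(pos, eol - pos);
        char* end = 0;
        const unsigned long size = std::strtoul(sizeLine.c_str(), &end, 16);
        if (end == sizeLine.c_str())
            return false; // no hex digits at all
        pos = eol + 2;
        if (size == 0)
            return true; // last chunk; any trailer is ignored here
        // chunk data must be followed by its own CRLF
        const std::size_t remaining = raw.size() - pos;
        if (size > remaining || remaining - size < 2)
            return false;
        if (raw.compare(pos + size, 2, "\r\n") != 0)
            return false;
        body.append(raw, pos, size);
        pos += size + 2;
    }
}
```

For example, the input `"4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n"` decodes to `Wikipedia`. Reading from a live socket follows the same steps, with each line and each chunk fetched on demand.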

Related

Why recv blocks before receiving all Content-Length?

I'm trying to build an HTTP server in C++. One of the conditions I use to decide how to extract the body entity is whether a Content-Length header is present. Here is a minimal version of how I extract the body using Content-Length:
req_t *Webserver::_recv(int client_fd, bool *closed)
{
string req;
static string rest;
// string extracted_req;
char buff[1024];
// while (true) {
// std::cout << "client_fd: " << client_fd << std::endl;
int n = recv(client_fd, buff, 1024, 0);
// std::cout << "n: " << n << std::endl;
if (n == -1)
{
_set_error_code("500", "Internal Server Error");
return NULL;
}
if (n == 0)
{
*closed = true;
return NULL;
}
buff[n] = '\0';
req += buff;
req_t *extracted_req = _extract_req(client_fd, req, rest, closed);
return extracted_req;
}
...
else if (headers.find("Content-Length") != string::npos) {
string body = extract_body_len(client_fd, rest_of_req, content_length);
}
req_t is a simple struct that contains three strings status_line, headers, body.
req_t *Webserver::_extract_req(int client_fd, const string &req, string &rest, bool *closed)
{
req_t *ret;
try
{
ret = new req_t;
}
catch (std::bad_alloc &e)
{
std::cerr << "\033[1;31mError:\033[0m " << e.what() << std::endl;
exit(1);
}
string status_line = req.substr(0, req.find("\r\n"));
string headers = req.substr(req.find("\r\n") + 2, req.find("\r\n\r\n") - req.find("\r\n") - 2);
rest = req.substr(req.find("\r\n\r\n") + 4, req.size() - req.find("\r\n\r\n") - 4);
ret->status_line = status_line;
ret->headers = headers;
// if the method is GET, the request body is empty
// if the headers contain a Content-Length, extract that number of bytes for the body
if (headers.find("Content-Length") != string::npos)
{
long long content_length = _get_content_len(headers);
if (content_length == -1)
{
_set_error_code("400", "Bad Request");
return NULL;
}
// subtracting the length of the body from the length of the request
ret->body = _extract_body_len(client_fd, rest, content_length, closed);
// if body is not complete, return an error
...
string extract_body_len(int client_fd, string& rest, unsigned long long len) {
string body;
unsigned long long total = 0;
body = rest;
// starting total with first bytes of body
total += rest.size();
// if we have it all that's it
if (total >= len) {
body = rest.substr(0, len);
rest = rest.substr(len);
return body;
}
else
{
while (total < len)
{
char buf[1024];
int ret = recv(client_fd, buf, 1024, 0);
// after a lot of debugging , i've noticed that recv starts to read less than 1024 only when total is closer to len, so i added this condition naively.
if (ret != 1024)
{
if ((total + ret) >= len)
{
body += string(buf).substr(0, len - total);
rest = string(buf).substr(len - total);
break;
}
}
if (ret == 0)
{
if (total == len)
{
rest = "";
break;
}
// client closed connection and it's still incomplete: 400
else
{
res->status_code = "400";
res->status_message = "Bad Request";
return NULL;
}
}
else if (ret == -1)
{
res->status_code = "500";
res->status_message = "Internal Server Error";
return body;
}
total += ret;
body += string(buf, ret);
}
}
return body;
}
Now, the problem is that I've tested requests with body entities of varying sizes (8 MB, 1.9 MB, 31 MB), and I never receive the whole body (as per Content-Length). The pattern is as follows:
recv keeps reading the full 1024 bytes until total gets close to len, then it starts reading smaller amounts. Once the difference between total and len is around 400 to 600 bytes, recv blocks (there's nothing more to read) before total == len.
That really confused me. I tried different API clients (Postman, Insomnia) with the same results. I suspected that Content-Length might not be accurate, but it obviously should be. What do you think the problem is? Why am I reading fewer bytes than Content-Length?
int n = recv(client_fd, buff, 1024, 0);
The above code appears to assume that this recv call returns exactly the header portion of the HTTP request. Not one byte more, not one byte less.
Unfortunately, you will not find anything in your textbook on network programming that gives you any such guarantee.
Your only guarantee (presuming that there is no socket-level error) is that recv() will return a value between 1 and 1024, representing however many bytes were already received on the socket, or arrived in the first packet that it blocked and waited for.
Using an example of a completely made up HTTP request that looks something like this:
POST /cgi-bin/upload.cgi HTTP/1.0<CR><LF>
Host: www.example.com<CR><LF>
Content-Type: application/octet-stream<CR><LF>
Content-Length: 4000<CR><LF>
<CR><LF>
[4000 octets follow]
When your web browser, or a simulated browser, sends this request, this recv call can return any value between 1 and 1024 (excluding the case of network errors).
This means that this recv call can produce anything between:
a return value of 1, placing just the letter "P" into buff;
a return value of 1024, placing the entire HTTP header into buff, plus as much of the initial part of the HTTP content as is needed to produce 1024 bytes total.
The shown logic is completely incapable of correctly handling all of these possibilities, and that's why it fails. It will need to be reimplemented, pretty much from scratch, using the correct logic.
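One way to structure the corrected logic is to accumulate every received piece in a single buffer and only act once a complete request is present. The sketch below works on that buffer alone; the name `tryExtractRequest` is hypothetical, and matching the `Content-Length` header name with exact case is a simplification that a real server would replace with a case-insensitive comparison.

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>

// Hypothetical helper: look for one complete request at the front of
// `pending`. On success, fill `head` (request line plus headers) and `body`,
// erase the request from `pending`, and return true. Return false when more
// bytes are needed.
bool tryExtractRequest(std::string& pending, std::string& head, std::string& body)
{
    const std::size_t headEnd = pending.find("\r\n\r\n");
    if (headEnd == std::string::npos)
        return false; // headers not complete yet

    // Simplification: exact-case match of the header name.
    std::size_t contentLength = 0;
    const std::string name = "\r\nContent-Length:";
    const std::size_t at = pending.find(name);
    if (at != std::string::npos && at < headEnd)
        contentLength = std::strtoul(pending.c_str() + at + name.size(), 0, 10);

    const std::size_t bodyStart = headEnd + 4;
    if (pending.size() - bodyStart < contentLength)
        return false; // body not complete yet

    head = pending.substr(0, headEnd);
    body = pending.substr(bodyStart, contentLength);
    pending.erase(0, bodyStart + contentLength);
    return true;
}
```

The receive loop would then call `pending.append(buff, n)` after each successful `recv` and retry `tryExtractRequest`, so it no longer matters how the bytes happen to be split across calls.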

Socket Communication Data Corruption on Write/Read

I've got a C++ server that communicates with multiple clients. It uses a vector to store the handles to the sockets for those clients (playerSockets in the code below). At the end of the "game" I want the server to loop through that vector and write the same string to each client. However, sometimes the data that the client reads (and then displays) is "corrupted" as you can see in the screenshot, but this doesn't happen for the first client, only the second. I can't figure out why this is happening! I use this same technique (looping and writing) earlier in the program and it always works fine in that instance.
Here is what is supposed to be shown:
And here's what I get:
Here is the server code that writes:
std::string announcement = "";
if (playerWon) {
...
}
} else {
announcement = "?No one won the game!\nGAME BOARD: " + cn.getGameBoard();
for (int player : gameData->playerSockets) {
write(player, announcement.c_str(), announcement.size() + 1);
}
}
And here's the client code that reads. Keep in mind that more than one client is running and connected to the server, and this issue only happens with a client OTHER THAN the first client in the server's loop:
static bool readMyTurn(int clientSd) {
...
char buf[BUFSIZE];
read(clientSd, buf, BUFSIZE);
string myTurn(buf);
cout << "MYMYMYMY: " << myTurn << endl;
myTurn.erase(0, 1);
cout << myTurn << endl;
...
}
UPDATE
Here is my current code to read until encountering the null-terminator character.
string readOneStringFromServer(int clientSd, string &leftovers) {
ssize_t nullTerminatorPosition = 0;
std::string stringToReturn = "";
do {
char buf[BUFSIZE];
ssize_t bytesRead = read(clientSd, buf, BUFSIZE);
nullTerminatorPosition = findPositionOfNullTerminator(buf, bytesRead);
// found a null terminator
if (nullTerminatorPosition != -1) {
// create a buffer to hold all of the chars from buf1 up to and including the null terminator
char upToNullTerminator[nullTerminatorPosition + 1];
// get those chars from buf1 and put them into buf2 (including the null terminator)
for (int i = 0; i < nullTerminatorPosition + 1; ++i) {
upToNullTerminator[i] = buf[i];
}
// use buf2 to create a string
stringToReturn += upToNullTerminator;
// check if there are leftover bytes after the null terminator
int leftoverBytes = bytesRead - nullTerminatorPosition - 1;
if (leftoverBytes != 0) {
// if there are, create a char array of that size
char leftoverChars[leftoverBytes];
// loop through buf1 and add the leftover chars to buf3
for (int i = nullTerminatorPosition + 1; i < bytesRead; ++i) {
leftoverChars[i - (nullTerminatorPosition + 1)] = buf[i];
}
// make a string out of those leftover chars
leftovers = leftoverChars;
} else {
// if there are no leftover bytes, then we want to "erase" what is currently held in leftovers so that
// it doesn't get passed to the next function call
leftovers = "";
}
// didn't find one
} else {
stringToReturn += buf;
}
} while (nullTerminatorPosition == -1);
return stringToReturn;
}
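The read-until-terminator idea above can also be expressed with a single `std::string` that holds everything received so far, which avoids relying on any char array being null-terminated. This is only a sketch under that assumption; the helper names `appendReceived` and `popMessage` are hypothetical.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: append exactly the bytes that one read() returned.
// The (pointer, length) form does not treat `buf` as a C string.
void appendReceived(std::string& pending, const char* buf, std::size_t bytesRead)
{
    pending.append(buf, bytesRead);
}

// Hypothetical helper: if `pending` contains a '\0' terminator, move the
// message before it into `message`, keep whatever follows in `pending`,
// and return true. Return false when the terminator has not arrived yet.
bool popMessage(std::string& pending, std::string& message)
{
    const std::size_t end = pending.find('\0');
    if (end == std::string::npos)
        return false;
    message.assign(pending, 0, end);
    pending.erase(0, end + 1);
    return true;
}
```

A reader would call `read`, pass the result to `appendReceived`, and then call `popMessage` in a loop, since one read can deliver part of a message, exactly one message, or several messages at once.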

C++ Sending PNG File via HTTP

I'm trying to create a web server to learn how HTTP functions. I am trying to send a PNG file to the browser; however, the image never successfully makes it.
Here is my png sending code:
std::ifstream in("P:/server"+location, std::ios::binary);
if(!in.is_open()) {
std::cout << "failed to open file" << std::endl;
in.close();
}
in.seekg(0, std::ios::end);
int length = in.tellg();
in.seekg(0, std::ios::beg);
char *data = new char[length];
in.read(data, length);
in.close();
std::string headers = "HTTP/1.1 200 OK\n\rContent-Length: " + std::to_string(length) + "\n\rConnection: keep-alive\n\rContent-Type: image/png\n\r\n\r";
int totalLength = headers.length() + length;
char *allData = new char[totalLength];
std::strcpy(allData, headers.data());
std::strcat(allData, data);
int bytes = send(socket, data, totalLength, NULL);
Once the server should have sent the image, it shows up as the missing image icon.
I have checked to make sure that all the bytes are being sent, and that the image is being loaded.
Any help would be very much appreciated!
There are quite a few mistakes in your code.
You are still trying to process the file if is_open() returns false.
You are using \n\r when you need to use \r\n instead.
You are using strcat() to append data to allData. When appending binary data, like a PNG, strcat() will truncate on the first 0x00 byte encountered. You need to use memcpy() (or equivalent) instead.
Your call to send() is sending data when it should be sending allData instead. So you are not sending the HTTP headers at all, and you are sending data using the wrong length.
You are assuming send() will send all of the data you give it in a single operation. That is not guaranteed, especially for large amounts of data. send() returns the number of bytes it actually accepted for sending, so you need to call it in a loop until all of the data has been accepted.
You are not sending an error message to the client if something goes wrong while preparing the file.
That being said, allData is actually unnecessary, and is a waste of memory. TCP is a byte stream, so you can send headers and data individually one after the other. How many times you call send() doesn't affect how the other party receives the data.
You might also consider minimizing memory usage further by changing data to be a fixed-sized buffer so you can send() the content of in while reading from it in smaller chunks.
Try something more like this instead:
int sendData(int sckt, const void *data, int datalen)
{
const char *ptr = static_cast<const char*>(data);
while (datalen > 0) {
int bytes = send(sckt, ptr, datalen, 0);
if (bytes <=0) return -1;
ptr += bytes;
datalen -= bytes;
}
return 0;
}
int sendStr(int sckt, const std::string &s)
{
return sendData(sckt, s.c_str(), s.size());
}
...
std::string filename = "P:/server"+location;
if (!fileExists(filename)) // <- you need to implement this
{
if (sendStr(socket, "HTTP/1.1 404 Not Found\r\nContent-Length: 0\r\nConnection: keep-alive\r\n\r\n") == -1) {
close(socket);
}
}
else
{
std::ifstream in(filename, std::ios::binary);
if (!in.is_open())
{
std::cout << "failed to open file" << std::endl;
if (sendStr(socket, "HTTP/1.1 500 Error\r\nContent-Length: 0\r\nConnection: keep-alive\r\n\r\n") == -1) {
close(socket);
}
}
else
{
in.seekg(0, std::ios::end);
std::size_t length = in.tellg();
in.seekg(0, std::ios::beg);
if (in.fail())
{
std::cout << "failed to get size of file" << std::endl;
if (sendStr(socket, "HTTP/1.1 500 Error\r\nContent-Length: 0\r\nConnection: keep-alive\r\n\r\n") == -1) {
close(socket);
}
}
else if (sendStr(socket, "HTTP/1.1 200 OK\r\nContent-Length: " + std::to_string(length) + "\r\nConnection: keep-alive\r\nContent-Type: image/png\r\n\r\n") == -1)
{
close(socket);
}
else if (length > 0)
{
char data[1024];
do
{
if (!in.read(data, std::min(length, sizeof(data))))
{
close (socket);
break;
}
int bytes = in.gcount();
if (sendData(socket, data, bytes) == -1)
{
close(socket);
break;
}
length -= bytes;
}
while (length > 0);
}
}
}
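The strcat() point can be illustrated in isolation. The sketch below joins text headers and a binary body with memcpy(), which copies an explicit byte count and so keeps any 0x00 bytes. The function name `joinHeadersAndBody` is hypothetical, and as noted above a combined buffer is not actually required for sending.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Join text headers and a binary body into one buffer.
// memcpy copies an explicit byte count, so 0x00 bytes in `body` survive.
std::vector<char> joinHeadersAndBody(const std::string& headers,
                                     const char* body, std::size_t bodyLen)
{
    std::vector<char> out(headers.size() + bodyLen);
    if (!headers.empty())
        std::memcpy(out.data(), headers.data(), headers.size());
    if (bodyLen > 0)
        std::memcpy(out.data() + headers.size(), body, bodyLen);
    return out;
}
```

By contrast, any str* function applied to the same body would stop at the first zero byte, which is why the browser received a truncated image.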

How can I send all data over a socket?

I am trying to send large amounts of data over a socket. Sometimes when I call send (on Windows) it won't send all the data I requested, as expected. So I wrote a little function that should have solved my problem, but the data isn't being sent correctly and the images end up corrupted. I'm making a simple chat room where you can send images (screenshots) to each other.
Why is my function not working?
How can I make it work?
void _internal_SendFile_alignment_512(SOCKET sock, BYTE *data, DWORD datasize)
{
Sock::Packet packet;
packet.DataSize = datasize;
packet.PacketType = PACKET_FILETRANSFER_INITIATE;
DWORD until = datasize / 512;
send(sock, (const char*)&packet, sizeof(packet), 0);
unsigned int pos = 0;
while( pos != datasize )
{
pos += send(sock, (char *)(data + pos), datasize - pos, 0);
}
}
My receive side is:
public override void OnReceiveData(TcpLib.ConnectionState state)
{
if (state.fileTransfer == true && state.waitingFor > 0)
{
byte[] buffer = new byte[state.AvailableData];
int readBytes = state.Read(buffer, 0, state.AvailableData);
state.waitingFor -= readBytes;
state.bw.Write(buffer);
state.bw.Flush();
if (state.waitingFor == 0)
{
state.bw.Close();
state.hFile.Close();
state.fileTransfer = false;
IPEndPoint ip = state.RemoteEndPoint as IPEndPoint;
Program.MainForm.log("Ended file transfer with " + ip);
}
}
else if( state.AvailableData > 7)
{
byte[] buffer = new byte[8];
int readBytes = state.Read(buffer, 0, 8);
if (readBytes == 8)
{
Packet packet = ByteArrayToStructure<Packet>(buffer);
if (packet.PacketType == PACKET_FILETRANSFER_INITIATE)
{
IPEndPoint ip = state.RemoteEndPoint as IPEndPoint;
String filename = getUniqueFileName("" + ip.Address);
if (filename == null)
{
Program.MainForm.log("Error getting filename for " + ip);
state.EndConnection();
return;
}
byte[] data = new byte[state.AvailableData];
readBytes = state.Read(data, 0, state.AvailableData);
state.waitingFor = packet.DataSize - readBytes;
state.hFile = new FileStream(filename, FileMode.Append);
state.bw = new BinaryWriter(state.hFile);
state.bw.Write(data);
state.bw.Flush();
state.fileTransfer = true;
Program.MainForm.log("Initiated file transfer with " + ip);
}
}
}
}
It receives all the data. When I debug my code, I see that send() does not return the total data size (i.e. it has to be called more than once), and the image ends up with yellow or purple lines in it. I suspect there's something wrong with how the data is sent.
I misunderstood the question and the intent of the solution. Thanks @Remy Lebeau for the comment clarifying that. Based on that, you can write a sendall() function as given in section 7.3 of http://beej.us/guide/bgnet/output/print/bgnet_USLetter.pdf
int sendall(int s, char *buf, int *len)
{
int total = 0; // how many bytes we've sent
int bytesleft = *len; // how many we have left to send
int n = 0;
while(total < *len) {
n = send(s, buf+total, bytesleft, 0);
if (n == -1) {
/* print/log error details */
break;
}
total += n;
bytesleft -= n;
}
*len = total; // return number actually sent here
return n==-1?-1:0; // return -1 on failure, 0 on success
}
You need to check the return value of send(). In particular, you can't simply assume that it is the number of bytes sent; it can also indicate an error. Try this instead:
while(datasize != 0)
{
    int n = send(...);
    if(n == SOCKET_ERROR)
        throw std::runtime_error("send() failed with error code #" + std::to_string(WSAGetLastError()));
    // adjust pointer and remaining number of bytes
    datasize -= n;
    data += n;
}
BTW:
Make that BYTE const* data, you're not going to modify what it points to.
The rest of your code seems too complicated, in particular you don't solve things by aligning to magic numbers like 512.

How to read exactly one line?

I have a Linux file descriptor (from socket), and I want to read one line.
How to do it in C++?
If you are reading from a TCP socket, you can't assume where the end of the line will fall within the data you receive.
Therefore you'll need something like this:
std::string line;
char buf[1024];
ssize_t n = 0;
while ((n = read(fd, buf, sizeof(buf))) > 0)
{
    // append only the bytes that were actually read
    line.append(buf, n);
    // stop once the delimiter has arrived; it may span two reads
    if (line.find("\n\n") != std::string::npos)
        break;
}
Assuming you are using "\n\n" as a delimiter. (I didn't test that code snippet ;-) )
On a UDP socket, that is another story. The sender may send a packet containing a whole line, and the receiver is guaranteed to receive the packet as a single unit, if it receives it at all, since UDP is not as reliable as TCP.
One character at a time:
std::string line;
char c;
// read a single byte per call until a newline, end of file, or error
while (read(fd, &c, 1) == 1 && c != '\n') {
    line += c;
}
Something like that.
Here is a tested, quite efficient code:
bool ReadLine (int fd, string* line) {
// We read-ahead, so we store in static buffer
// what we already read, but not yet returned by ReadLine.
static string buffer;
// Do the real reading from fd until buffer has '\n'.
string::iterator pos;
while ((pos = find (buffer.begin(), buffer.end(), '\n')) == buffer.end ()) {
char buf [1024];
int n = read (fd, buf, sizeof buf);
if (n <= 0) { // error or end of file: return what we have
*line = buffer;
buffer = "";
return false;
}
// append exactly the bytes read, so embedded zero bytes are kept
buffer.append (buf, n);
}
// Split the buffer around '\n' found and return first part.
*line = string (buffer.begin(), pos);
buffer = string (pos + 1, buffer.end());
return true;
}
It's also useful to ignore the SIGPIPE signal when reading and writing (and to handle errors as shown above):
signal (SIGPIPE, SIG_IGN);
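ReadLine above keeps its read-ahead in a function-level static, so that buffer is shared by every descriptor passed to it. If several descriptors are read in the same program, one possible variation is to give each descriptor its own buffer. The class name `LineReader` below is hypothetical, and the sketch assumes a blocking POSIX descriptor.

```cpp
#include <cstddef>
#include <string>
#include <unistd.h>

// Hypothetical per-descriptor line reader: each instance owns its own
// read-ahead buffer, so several descriptors can be read independently.
class LineReader
{
public:
    explicit LineReader(int fd) : fd_(fd) {}

    // Stores the next line (without '\n') in `line` and returns true.
    // On end of file or error, stores any unterminated remainder in `line`
    // and returns false.
    bool readLine(std::string& line)
    {
        std::string::size_type pos;
        while ((pos = pending_.find('\n')) == std::string::npos)
        {
            char buf[1024];
            const ssize_t n = read(fd_, buf, sizeof buf);
            if (n <= 0)
            {
                line.swap(pending_);
                pending_.clear();
                return false;
            }
            pending_.append(buf, static_cast<std::size_t>(n));
        }
        line.assign(pending_, 0, pos);
        pending_.erase(0, pos + 1);
        return true;
    }

private:
    int fd_;
    std::string pending_;
};
```

Each connection would create one `LineReader` and call `readLine` repeatedly; the reader does not close the descriptor, so ownership stays with the caller.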
Using C++ sockets library:
class LineSocket : public TcpSocket
{
public:
LineSocket(ISocketHandler& h) : TcpSocket(h) {
SetLineProtocol(); // enable OnLine callback
}
void OnLine(const std::string& line) {
std::cout << "Received line: " << line << std::endl;
// send reply here
{
Send( "Reply\n" );
}
}
};
And using the above class:
int main()
{
try
{
SocketHandler h;
LineSocket sock(h);
sock.Open( "remote.host.com", port );
h.Add(&sock);
while (h.GetCount())
{
h.Select();
}
}
catch (const Exception& e)
{
std::cerr << e.ToString() << std::endl;
}
}
The library takes care of all error handling.
Find the library using google or use this direct link: http://www.alhem.net/Sockets/