Remove string from read() function - socket programming [duplicate] - c++

I want to know: is there a way to find out where in the response stream the header ends?
The background of the question is as follows: I am using sockets in C to get content from a website, and the content is gzip-encoded. I would like to read the content directly from the stream and decode the gzip content with zlib. But how do I know where the gzip content starts and the HTTP header ends?
I roughly tried two ways, which give me some, in my opinion, strange results. First, I read in the whole stream and print it out in the terminal; my HTTP header ends with "\r\n\r\n" like I expected. But the second time, I just retrieve the response once to get the header and then read the content with a while loop; here the header ends without "\r\n\r\n".
Why? And which way is the right way to read in the content?
I'll just give you the code so you can see how I'm getting the response from the server.
//first way (gives rnrn)
char *output, *output_header, *output_content, **output_result;
size_t size;
FILE *stream;
stream = open_memstream (&output, &size);
char BUF[BUFSIZ];
while(recv(socket_desc, BUF, (BUFSIZ - 1), 0) > 0)
{
fprintf (stream, "%s", BUF);
}
fflush(stream);
fclose(stream);
output_result = str_split(output, "\r\n\r\n");
output_header = output_result[0];
output_content = output_result[1];
printf("Header:\n%s\n", output_header);
printf("Content:\n%s\n", output_content);
//second way (doesn't give rnrn)
char *content, *output_header;
size_t size;
FILE *stream;
stream = open_memstream (&content, &size);
char BUF[BUFSIZ];
if(recv(socket_desc, BUF, (BUFSIZ - 1), 0) > 0)
{
output_header = BUF;
}
while(recv(socket_desc, BUF, (BUFSIZ - 1), 0) > 0)
{
fprintf (stream, "%s", BUF); //i would just use this as input stream to zlib
}
fflush(stream);
fclose(stream);
printf("Header:\n%s\n", output_header);
printf("Content:\n%s\n", content);
Both give the same result when printed to the terminal, but the second one should print out some more line breaks, at least I'd expect so, because they get lost when splitting the string.
I am new to C, so I might just be overlooking some easy stuff.

You are calling recv() in a loop until the socket disconnects or fails (and writing the received data to your stream the wrong way), storing all of the raw data into your char* buffer. That is not the correct way to read an HTTP response, especially if HTTP keep-alives are used (in which case no disconnect will occur at the end of the response). You must follow the rules outlined in RFC 2616. Namely:
Read until the "\r\n\r\n" sequence is encountered. This terminates the response headers. Do not read any more bytes past that yet.
Analyze the received headers, per the rules in RFC 2616 Section 4.4. They tell you the actual format of the remaining response data.
Read the remaining data, if any, per the format discovered in #2.
Check the received headers for the presence of a Connection: close header if the response is using HTTP 1.1, or the lack of a Connection: keep-alive header if the response is using HTTP 0.9 or 1.0. If detected, close your end of the socket connection because the server is closing its end. Otherwise, keep the connection open and re-use it for subsequent requests (unless you are done using the connection, in which case do close it).
Process the received data as needed.
In short, you need to do something more like this instead (pseudo code):
string headers[];
byte data[];
string statusLine = read a CRLF-delimited line;
int statusCode = extract from status line;
string responseVersion = extract from status line;
do
{
string header = read a CRLF-delimited line;
if (header == "") break;
add header to headers list;
}
while (true);
if ( !((statusCode in [1xx, 204, 304]) || (request was "HEAD")) )
{
if (headers["Transfer-Encoding"] ends with "chunked")
{
do
{
string chunk = read a CRLF delimited line;
int chunkSize = extract from chunk line;
if (chunkSize == 0) break;
read exactly chunkSize number of bytes into data storage;
read and discard until a CRLF has been read;
}
while (true);
do
{
string header = read a CRLF-delimited line;
if (header == "") break;
add header to headers list;
}
while (true);
}
else if (headers["Content-Length"] is present)
{
read exactly Content-Length number of bytes into data storage;
}
else if (headers["Content-Type"] begins with "multipart/")
{
string boundary = extract from Content-Type header;
read into data storage until terminating boundary has been read;
}
else
{
read bytes into data storage until disconnected;
}
}
if (!disconnected)
{
if (responseVersion == "HTTP/1.1")
{
if (headers["Connection"] == "close")
close connection;
}
else
{
if (headers["Connection"] != "keep-alive")
close connection;
}
}
check statusCode for errors;
process data contents, per info in headers list;
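And here is roughly what step 1 (reading up to the terminating "\r\n\r\n") can look like in real code. This is only a rough C++ sketch, assuming a connected TCP socket like the socket_desc in your snippets; note that whatever arrives after the blank line already belongs to the body and must be kept:
#include <string>
#include <sys/socket.h>
// Sketch: read until "\r\n\r\n" is seen. Bytes received after the blank line
// are the start of the body, so they are handed back instead of discarded.
bool read_headers(int socket_desc, std::string &headers, std::string &body_start)
{
    std::string buffer;
    char chunk[512];
    for (;;)
    {
        std::string::size_type pos = buffer.find("\r\n\r\n");
        if (pos != std::string::npos)
        {
            headers = buffer.substr(0, pos);      // headers, without the blank line
            body_start = buffer.substr(pos + 4);  // bytes already read past the headers
            return true;
        }
        ssize_t n = recv(socket_desc, chunk, sizeof(chunk), 0);
        if (n <= 0)
            return false;                              // disconnect or error before the headers ended
        buffer.append(chunk, static_cast<size_t>(n));  // append exactly n bytes, not a C string
    }
}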

Related

Cannot serve png files and other binary files in hobby HTTP server

I am writing an HTTP server in C++, and serving static files mostly works; however, when reading .PNG files or other binaries, every method I have tried fails. My main problem is that when I open up Dev tools, requesting an example image gives a transferred size of 29.56 kB and a size of 29.50 kB with my current method. The sizes given also do not match what du -sh gives, which is 32 kB.
My first method was to push the contents of the file onto a string and call a function to serve that. However, this would also only serve ~6 kB, if memory serves correctly.
My current method is to read the file using std::ifstream in binary mode. I get the size of the file using C++17's filesystem header and std::filesystem::file_size. I read the contents into a buffer and then call a function to send the buffer contents one byte at a time.
void WebServer::sendContents(std::string contents) {
if (send(this->newFd, contents.c_str(), strlen(contents.c_str()), 0) == -1) {
throw std::runtime_error("Server accept: " + std::string(strerror(errno)));
}
}
void WebServer::sendFile(std::string path) {
path = "./" + path;
std::string fileCont; //File contents
std::string mimeType; //Mime type of the file
std::string contLength;
std::string::size_type idx = path.rfind('.');
if (idx != std::string::npos) mimeType = this->getMimeType(path.substr(idx + 1));
else mimeType = "text/html";
std::filesystem::path reqPath = std::filesystem::path("./" + path).make_preferred();
std::filesystem::path parentPath = std::filesystem::path("./");
std::filesystem::path actualPath = std::filesystem::canonical(parentPath / reqPath);
if (!this->isSubDir(actualPath, parentPath)) { this->sendRoute("404"); return; }
std::ifstream ifs;
ifs.open(actualPath, std::ios::binary);
if (ifs.is_open()) {
//Get the size of the static file being served
std::filesystem::path staticPath{path};
std::size_t length = std::filesystem::file_size(staticPath);
char* buffer = new char[length];
*buffer = { 0 }; //Initialize the buffer that will send the static file
ifs.read(buffer, sizeof(char) * length); //Read the buffer
std::string resp = "HTTP/1.0 200 OK\r\n"
"Server: webserver-c\r\n"
"Content-Length" + std::to_string(length) + "\r\n"
"Content-type: " + mimeType + "\r\n\r\n";
if (!ifs) std::cout << "Error! Only " << ifs.gcount() << " could be read!" << std::endl;
this->sendContents(resp); //Send the headers
for (size_t i=0; i < length; i++) {
std::string byte = std::string(1, buffer[i]);
this->sendContents(byte);
}
delete[] buffer; //We do not need megs of memory stacking up, that shit will grow quick
buffer = nullptr;
} else {
this->sendContents("HTTP/1.1 500 Error\r\nContent-Length: 0\r\nConnection: keep-alive\r\n\r\n"); return;
}
ifs.close();
}
It should be noted that this->newFd is a socket descriptor
It should also be noted that I have tried to take a look at this question here, however the same problem still occurs for me
if (send(this->newFd, contents.c_str(), strlen(contents.c_str()), 0) == -1) {
There are two bugs for the price of one, here.
sendContents gets used, apparently, to send the contents of the binary file one byte at a time. This is horribly inefficient, but it's not the bug. The first bug is as follows.
Your binary file has plenty of bytes that are 00.
In that case, contents will proudly contain this 00 byte. c_str() returns a pointer to it. strlen() then reaches the conclusion that it is receiving an empty string as input, and makes a grandiose announcement that the string contains 0 characters.
In the end, send's third parameter will be 0.
No bytes will get sent, at all, instead of the famous 00 byte.
The second bug will come into play once the inefficient algorithm gets fixed, and sendContents gets used to send more than one byte at a time.
send() holds a secret: this system call may return values other than -1 for failure, such as the actual number of bytes that were sent. So, if send() was asked to send, say, 100 bytes, it may decide to send only 30 of them, return 30, and leave you holding the bag with the remaining 70 unsent bytes.
This is actually already an existing bug in the shown code. sendContents() also gets used to send the entire resp string, which is, eh, in the neighborhood of 100 bytes. Give or take a dozen.
You are relying on a house of cards: send() always doing its complete job, in this particular case, never slacking off, and actually sending the entire HTTP/1.0 response string.
But, send() is a famous slacker, and you have no guarantees, whatsoever, that this will actually happen. And I have it on good authority that an upcoming Friday the 13th your send() will decide to slack off, all of a sudden.
So, to fix the shown code:
Implement the appropriate logic to handle the return value from send().
Do not use c_str() followed by strlen(), because: A) it's broken for strings containing binary data, B) this elaborate routine simply reinvents a wheel called size(). You will be happy to know that size() does exactly what its name claims. A sketch combining both fixes follows.
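Something along these lines, for example (just a sketch; fd would be this->newFd in your code, and the caller decides what to do when it returns false):
#include <string>
#include <sys/socket.h>
// Sketch: send the whole string, looping because send() may transmit fewer
// bytes than asked. size() is used instead of strlen(), so embedded 00 bytes
// are sent like any other byte.
bool sendAll(int fd, const std::string &contents)
{
    const char *p = contents.data();
    size_t remaining = contents.size();
    while (remaining > 0)
    {
        ssize_t sent = send(fd, p, remaining, 0);
        if (sent == -1)
            return false;  // a real failure; errno tells you why
        p += sent;         // send() may have pushed out only part of the buffer
        remaining -= static_cast<size_t>(sent);
    }
    return true;
}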
One other bug:
char* buffer = new char[length];
It is possible for an exception to get thrown from the subsequent code. This memory gets leaked, because delete never gets called.
C++ gurus know a weird trick: they rarely use new, but instead use containers, like std::vector, and they don't have to worry about leaking memory, because of that.
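For example, the whole read-the-file-into-memory part can shrink to something like this (a sketch only; the vector frees its memory automatically, even if an exception is thrown later):
#include <filesystem>
#include <fstream>
#include <vector>
// Sketch: read an entire file into a std::vector<char>. No new, no delete,
// nothing to leak.
std::vector<char> readFile(const std::filesystem::path &p)
{
    std::vector<char> buffer(std::filesystem::file_size(p));
    std::ifstream ifs(p, std::ios::binary);
    ifs.read(buffer.data(), static_cast<std::streamsize>(buffer.size()));
    return buffer;
}
You then hand buffer.data() and buffer.size() to the send loop in one call, instead of one byte at a time.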

How to publish JSON to a web server?

I am playing around with freeboard.io and trying to make a widget that pulls JSON data from a URL [TBD]. My original data source is from an iMX6-based Wandboard running Linux that is connected to the internet. I want to write a C++ program on the Wandboard that opens a socket to [TBD] and sends UDP packets, for example, containing my sensor data. My JSON data structure is like this:
{
"sensor_a": 1100,
"sensor_b": 247,
"sensor_c": 0
}
Can you help me put my JSON data structure into an IP packet using C++ on Ubuntu Linux? I know how to serialize the data structure in ASCII, for example, and build a buffer to stuff into an IP packet, but I'm wondering if there is a standard way to do this for cloud services, or will it be different for Azure vs AWS? Is some type of header info needed to "put" the data?
This is a very simple problem; like all simple problems, there is no need for external libraries for serializing etc. As @Galik said above, your problem is how to send a string from client to server. Additionally, for your case you need a JSON parser on the server (any C or C++ parser from the JSON page will do; I use gason because it's fast and simple).
In TCP/IP socket programming you have to make the other side know how many bytes (characters in your case) to read.
I faced a similar case: send JSON over the web.
here's the example, a JSON "message"
https://github.com/pedro-vicente/lib_netsockets/blob/master/examples/json_message.cc
in this case, the size of the message has this header format
nbr_bytes#json_string
where "json_string" is the JSON text, "nbr_bytes" is the number of characters "json_string" has and "#" is a separator character.
how does the server parse this?
By reading 1 character at a time until the "#" separator is found, then converting that string into a number;
then make the socket API read "nbr_bytes" characters and exit
example
100#{json_txt....}
in this case "json_txt" has 100 characters
here's the code for the parser
std::string read_response(socket_t &socket)
{
int recv_size; // size in bytes received or -1 on error
size_t size_json = 0; //in bytes
std::string str_header;
std::string str;
//parse the header, one character at a time, looking for the separator #
//assume the size header length is less than 20 digits
for (size_t idx = 0; idx < 20; idx++)
{
char c;
if ((recv_size = recv(socket.m_socket_fd, &c, 1, 0)) == -1)
{
std::cout << "recv error: " << strerror(errno) << std::endl;
return str;
}
if (c == '#')
{
break;
}
else
{
str_header += c;
}
}
//get size
size_json = static_cast<size_t>(atoi(str_header.c_str()));
//read from socket with known size
char *buf = new char[size_json];
if (socket.read_all(buf, size_json) < 0)
{
std::cout << "recv error: " << strerror(errno) << std::endl;
return str;
}
std::string str_json(buf, size_json);
delete[] buf;
return str_json;
}
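The sending side only has to build the same nbr_bytes#json_string framing before writing. Here is a rough sketch over a plain POSIX socket (not tied to any particular socket wrapper):
#include <string>
#include <sys/socket.h>
// Sketch: prefix the JSON text with its length in bytes and the '#' separator,
// then loop until every byte has been sent (send() may send less than asked).
bool write_message(int fd, const std::string &json_text)
{
    std::string framed = std::to_string(json_text.size()) + "#" + json_text;
    size_t off = 0;
    while (off < framed.size())
    {
        ssize_t n = send(fd, framed.data() + off, framed.size() - off, 0);
        if (n == -1)
            return false;
        off += static_cast<size_t>(n);
    }
    return true;
}
You would call it with your structure serialized as text, e.g. write_message(fd, "{\"sensor_a\": 1100, \"sensor_b\": 247, \"sensor_c\": 0}").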

How to send image data over linux socket

I have a relatively simple web server I have written in C++. It works fine for serving text/html pages, but the way it is written it seems unable to send binary data and I really need to be able to send images.
I have been searching and searching but can't find an answer specific to this question that is written in real C++ (fstream as opposed to using file pointers, etc.). Whilst this kind of thing is necessarily low level and may well require handling bytes in a C-style array, I would like the code to be as C++ as possible.
I have tried a few methods, this is what I currently have:
int sendFile(const Server* serv, const ssocks::Response& response, int fd)
{
// some other stuff to do with headers etc. ........ then:
// open file
std::ifstream fileHandle;
fileHandle.open(serv->mBase + WWW_D + resource.c_str(), std::ios::binary);
if(!fileHandle.is_open())
{
// error handling code
return -1;
}
// send file
ssize_t buffer_size = 2048;
char buffer[buffer_size];
while(!fileHandle.eof())
{
fileHandle.read(buffer, buffer_size);
status = serv->mSock.doSend(buffer, fd);
if (status == -1)
{
std::cerr << "Error: socket error, sending file\n";
return -1;
}
}
return 0;
}
And then elsewhere:
int TcpSocket::doSend(const char* message, int fd) const
{
if (fd == 0)
{
fd = mFiledes;
}
ssize_t bytesSent = send(fd, message, strlen(message), 0);
if (bytesSent < 1)
{
return -1;
}
return 0;
}
As I say, the problem is that when the client requests an image it won't work. I get in std::cerr "Error: socket error sending file"
EDIT : I got it working using the advice in the answer I accepted. For completeness and to help those finding this post I am also posting the final working code.
For sending I decided to use a std::vector rather than a char array. Primarily because I feel it is a more C++ approach and it makes it clear that the data is not a string. This is probably not necessary but a matter of taste. I then counted the bytes read for the stream and passed that over to the send function like this:
// send file
std::vector<char> buffer(SEND_BUFFER);
while(!fileHandle.eof())
{
fileHandle.read(&buffer[0], SEND_BUFFER);
status = serv->mSock.doSend(&buffer[0], fd, fileHandle.gcount());
if (status == -1)
{
std::cerr << "Error: socket error, sending file\n";
return -1;
}
}
Then the actual send function was adapted like this:
int TcpSocket::doSend(const char* message, int fd, size_t size) const
{
if (fd == 0)
{
fd = mFiledes;
}
ssize_t bytesSent = send(fd, message, size, 0);
if (bytesSent < 1)
{
return -1;
}
return 0;
}
The first thing you should change is the while (!fileHandle.eof()) loop, because it will not work as you expect: it will iterate one time too many, since the eof flag isn't set until after you try to read beyond the end of the file. Instead do e.g. while (fileHandle.read(...)).
The second thing you should do is check how many bytes were actually read from the file, and only send that many bytes (a sketch of the combined loop is at the end of this answer).
Lastly, you read binary data, not text, so you can't use strlen on the data you read from the file.
A little explanation of the binary file problem: as you should hopefully know, C-style strings (the ones you use strlen to get the length of) are terminated by a zero character '\0' (in short, a zero byte). Random binary data can contain lots of zero bytes anywhere inside it; there it is a valid byte without any special meaning.
When you use strlen to get the length of binary data there are two possible problems:
There's a zero byte in the middle of the data. This will cause strlen to terminate early and return the wrong length.
There's no zero byte in the data. That will cause strlen to go beyond the end of the buffer to look for the zero byte, leading to undefined behavior.
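Put together, the inner loop of your sendFile could look something like this (a sketch reusing the names from your code, with the size-taking doSend from your edit):
// Sketch: the read itself is the loop condition; gcount() reports how many
// bytes the last read actually produced (the final chunk is usually short).
std::vector<char> buffer(2048);
while (fileHandle.read(buffer.data(), static_cast<std::streamsize>(buffer.size())) || fileHandle.gcount() > 0)
{
    if (serv->mSock.doSend(buffer.data(), fd, static_cast<size_t>(fileHandle.gcount())) == -1)
    {
        std::cerr << "Error: socket error, sending file\n";
        return -1;
    }
}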

Concatenating strings into own protocol

I'm writing networking programs using socket.h for my studies. I have written simple server and client programs that can transfer files between them using a buffer size given by the user.
Server
void transfer(string name)
{
char *data_to_send;
ifstream myFile;
myFile.open(name.c_str(),ios::binary);
if(myFile.is_open())
{
while(!myFile.eof())
{
data_to_send = new char [buffer_size];
myFile.read(data_to_send, buffer_size);
send(data_to_send,buffer_size);
delete [] data_to_send;
}
myFile.close();
send("03endtransmission",buffer_size);
}
else
{
send("03error",buffer_size);
}
}
Client
void download(string name)
{
char *received_data;
fstream myFile;
myFile.open(name.c_str(),ios::out|ios::binary);
if(myFile.is_open())
{
while(1)
{
received_data = new char[buffer_size];
if((receivedB = recv(sockfd, received_data, buffer_size,0)) == -1) {
perror("recv");
close(sockfd);
exit(1);
}
if(strcmp(received_data,"03endoftransmission") == 0)
{
cout<<"End of transmission"<<endl;
break;
}
else if (strcmp(received_data,"03error") == 0)
{
cout<<"Error"<<endl;
break;
}
myFile.write(received_data,buffer_size);
}
myFile.close();
}
The problem occurs when I want to implement my own protocol: two chars (control), a 32-char hash, and the rest of the package is data. I tried a few times to split it up and ended up with this code:
Server
#define PAYLOAD 34
void transfer(string name)
{
char hash[] = "12345678901234567890123456789012"; //32 chars
char *data_to_send;
ifstream myFile;
myFile.open(name.c_str(),ios::binary);
if(myFile.is_open())
{
while(!myFile.eof())
{
data_to_send = new char [buffer_size-PAYLOAD];
myFile.read(data_to_send, buffer_size-PAYLOAD);
concatenation = new char[buffer_size];
strcpy(concatenation,"02");
strcat(concatenation,hash);
strcat(concatenation,data_to_send);
send(concatenation,buffer_size);
delete [] data_to_send;
delete [] concatenation;
}
myFile.close();
send("03endtransmission",buffer_size);
}
else
{
send("03error",buffer_size);
}
}
Client
void download(string name)
{
char *received_data;
fstream myFile;
myFile.open(name.c_str(),ios::out|ios::binary);
if(myFile.is_open())
{
while(1)
{
received_data = new char[buffer_size];
if((receivedB = recv(sockfd, received_data, buffer_size,0)) == -1) {
perror("recv");
close(sockfd);
exit(1);
}
if(strcmp(received_data,"03endoftransmission") == 0)
{
cout<<"End of transmission"<<endl;
break;
}
else if (strcmp(received_data,"03error") == 0)
{
cout<<"Error"<<endl;
break;
}
control = new char[3];
strcpy(control,"");
strncpy(control, received_data,2);
control[2]='\0';
hash = new char[33];
strcpy(hash,"");
strncpy(hash,received_data+2,32);
hash[32]='\0';
data = new char[buffer_size-PAYLOAD+1];
strcpy(data,"");
strncpy(data,received_data+34,buffer_size-PAYLOAD);
myFile.write(data,buffer_size-PAYLOAD);
}
myFile.close();
}
But this one writes some ^# characters to the file instead of the real data. Displaying "data" on the console looks the same on server and client. If you know how I can split it up, I would be very grateful.
You have some issues which may or may not be your problem.
(1) send/recv can return less than you requested. You may ask to receive 30 bytes but only get 10 from the recv call, so all of these calls have to be coded in loops, buffering somewhere until you actually get the number of bytes you wanted (see the sketch at the end of this answer). Your first set of programs was lucky to work in this regard, probably only because you tested on a limited amount of data. Once you start to push through more data, your assumptions about what you are reading (and comparing) will fail.
(2) There is no need to keep allocating char buffers in the loops; allocate them before the loop or just use a local buffer rather than the heap. What you are doing is inefficient and in the second program you have memory leaks because you don't delete them.
(3) You can get rid of the strcpy/strncpy statements and just use memmove()
Your specific problem is not jumping out at me but maybe this will push in the right direction. More information what is being transmitted properly and exactly where in the data you are seeing problems would be helpful.
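To illustrate point (1), receiving a fixed-size record usually ends up as a small helper like this (a sketch):
#include <sys/types.h>
#include <sys/socket.h>
#include <cstddef>
// Sketch: keep calling recv() until exactly len bytes have arrived, or the
// peer disconnects (0) or an error occurs (-1). Without such a loop you cannot
// assume a whole buffer_size-sized packet arrived in a single recv() call.
bool recv_exact(int sockfd, char *buf, std::size_t len)
{
    std::size_t got = 0;
    while (got < len)
    {
        ssize_t n = recv(sockfd, buf + got, len - got, 0);
        if (n <= 0)
            return false;
        got += static_cast<std::size_t>(n);
    }
    return true;
}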
But this one inputs to file some ^# instead of real data. Displaying
"data" to console looks the same on server and client. If you know how
I can split it up, I would be very grateful.
You say that the data (I presume the complete file rather than the '^#') is the same on both client and server? If this is the case, then your issue is likely writing the data to file, rather than the actual transmission of the data itself.
If this is the case, you'll probably want to check your assumptions about how the program writes to the file - for example, are you passing in text data to be written, or binary data? If you're writing binary data but handling it as a NUL-terminated string, chances are the write will quit early, treating a valid binary byte as a terminator.
If it's text mode, you might want to consider initialising all strings with memset to a default character (other than NUL) to see if it's garbage data being output.
If both server and client display the '^#' (or whatever data), binary char data is incompatible with the strcpy/strcat functions, as these rely on NUL termination (whereas binary data has to be delimited by explicit sizes instead).
I can't track down the specific problem, but maybe this might offer an insight or two that helps.
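As a concrete illustration of the size-versus-NUL point, the split on the client can be done with offsets and explicit byte counts instead of str* calls. A sketch, reusing the names from your download() and assuming a full header actually arrived (see point (1) in the other answer):
#include <cstring>
// Sketch: split the packet by position and length, never by looking for '\0'.
// receivedB is what recv() actually returned.
char control[3] = {0};
char hash[33] = {0};
std::memcpy(control, received_data, 2);
std::memcpy(hash, received_data + 2, 32);
myFile.write(received_data + 34, receivedB - 34); // write only the bytes that arrived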

How to send a file with http with c++

I want to write some server-side code. It should work with popular browsers and wget. My server checks whether a file exists; if it does, the browser can download it. But I have some problems.
Honestly, I have read lots of questions and answers (for example: Send binary file in HTTP response using C sockets) but I didn't figure it out. My browser (Chrome) can get text, but I cannot send any binary data, images, etc. I am changing the header according to the file being downloaded, but I cannot send downloadable files yet.
I have some questions.
void *clientWorker(void * acceptSocket) {
int newSocket = (int) acceptSocket;
char okStatus[] = "HTTP/1.1 200 OK\r\n"
"Content-Type: text/html\r\n"
"Connection: close\r\n"
"Content-Length: 20\r\n"
"\r\n"
"s";
writeLn(newSocket, okStatus);
const char * fileName = "/home/tyra/Desktop/example.txt";
sendF(newSocket, fileName);
}
1- If I don't write "s" or something else in okStatus, my message is not sent. I don't understand why this happens.
This is writeLn function :
void writeLn(int acceptSocket, const char * buffer) {
int n = write(acceptSocket, buffer, strlen(buffer) - 1);
if (n < 0) {
error("Error while writing");
}
}
This is sendF function :
string buffer;
string line;
ifstream myfile(fileName);
struct stat filestatus;
stat(fileName, &filestatus);
int fsize = filestatus.st_size;
if (myfile.is_open()) {
while (myfile.good()) {
getline(myfile, line);
buffer.append(line);
}
cout << buffer << endl;
}
writeLn(acceptSocket, buffer.c_str());
cout << fsize << " bytes\n";
It's a little messy. I haven't used the file size yet; once I can send a file, I will rearrange these things.
2- I can send text and the browser displays it, but the browser doesn't understand new lines.
If the text file contains (123\n456\n789), the browser displays (123456789). I think I should change the Content-Type header, but I couldn't figure it out.
Actually, I don't want the browser to display text files; it should download them. How can I send downloadable files?
Sorry, my explanation is pretty complicated.
As to your first question, you should find out the exact size of the file and specify it in your "Content-Length: xxxx\r\n" header. And, of course, you should ensure that the data is sent completely out.
Indeed, in your sendF function you use a std::string as a buffer:
string buffer;
this is not appropriate for binary data. You should allocate a raw char array of the right size:
file.seekg(0, ios::end);   // make sure tellg() reports the end-of-file position
int fsize = file.tellg();
char* buffer = new char[fsize];
file.seekg(0, ios::beg);
file.read(buffer, fsize);
file.close();
As to the second point, when your data is not HTML, specify Content-Type: text/plain;
otherwise, your line breaks should be represented by <br> instead of "\r\n".
In case of binary downloads, to have the data download as a file (and not shown in the browser), you should specify
Content-Type: application/octet-stream
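Putting it together, a response header for a file download could look something like this (a sketch only; fsize is the real byte count of the file, and Content-Disposition is the usual way to suggest "save as a file" rather than display):
// Sketch: headers for a binary download. The raw file bytes follow the blank
// line and must be written with their explicit size, never strlen().
std::string header =
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: application/octet-stream\r\n"
    "Content-Disposition: attachment; filename=\"example.txt\"\r\n"
    "Content-Length: " + std::to_string(fsize) + "\r\n"
    "Connection: close\r\n"
    "\r\n";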
The issue is strlen here. strlen terminates when it hits a '\0' character, and in a binary file you will have a number of '\0' characters.
While reading the file you should find out the file size, and that size should be used in int n = write(acceptSocket, buffer, strlen(buffer) - 1); in place of strlen(buffer) - 1.
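In other words, give the write helper the byte count explicitly instead of letting it guess (a sketch; error() is your existing helper):
#include <unistd.h>
#include <cstddef>
// Sketch: write exactly `size` bytes; no strlen(), so '\0' bytes inside binary
// data are sent like any other byte. (A robust version would also loop,
// because write() can return a short count.)
void writeLn(int acceptSocket, const char *buffer, std::size_t size)
{
    ssize_t n = write(acceptSocket, buffer, size);
    if (n < 0)
    {
        error("Error while writing");
    }
}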
Change the writeLn(acceptSocket, buffer.c_str()); to writeLn(acceptSocket, buffer.c_str(), buffer.size()); and try...
For the case of 123\n456\n789 you need to send <PRE>123\n456\n789</PRE>, as the browser will parse this text as HTML rather than showing it the way a terminal would. The other option is to replace every \n with <BR> ...
Regarding question 1 - if you don't want to send any content back, then remove the s from the end of okStatus and specify Content-Length: 0\r\n in the header.