I am trying to build a web server for videos, and my client is a web browser. I am sending HTTP responses with a Content-Disposition: attachment header:
#include <cstdint>
#include <sstream>
#include <string>

std::string make_header(const std::string& filename, std::uintmax_t file_size)
{
    std::ostringstream header;
    header << "HTTP/1.1 200 OK\r\n";
    header << "Content-Type: " << get_contenttype(filename) << "\r\n";
    // No space around '='; the parameter syntax is filename="..."
    header << "Content-Disposition: attachment; filename=\"" << filename << "\"\r\n";
    header << "Connection: close\r\n";
    // 64-bit size type, since video files easily exceed INT_MAX
    header << "Content-Length: " << file_size << "\r\n\r\n";
    return header.str();
}
I am able to send small files, but as soon as large files are requested the server takes a long time. I am using the following to copy the video file into a local variable:
std::string content((std::istreambuf_iterator<char>(file)),
std::istreambuf_iterator<char>());
Is there any way to send a file over HTTP without actually loading the entire file into memory?
The short answer is: Yes, of course. In the end, HTTP goes over TCP, which goes over IPv4/v6. IP is a sequence of small packets, and TCP is a byte-stream protocol on top of that. The other side won't notice that you are still reading bytes from disk while you're sending the first ones.
In practice, that means the client can't see, and won't care, how many calls to send() you made. One big call or a thousand small calls are equivalent.
I wouldn't bother with istreambuf_iterator. I'd just use fread, but I'd still use a std::vector<char> for the buffer. Just read 1 MB chunks and send those. Your OS isn't going to choke on either a 1 MB disk read or a 1 MB send call.
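A minimal sketch of that chunked loop, assuming a connected POSIX socket client_fd and that the response header has already been sent; the function name and helper logic are illustrative, not from the question:
#include <cstddef>
#include <cstdio>
#include <vector>
#include <sys/types.h>
#include <sys/socket.h>

bool send_file_chunked(int client_fd, const char* path)
{
    std::FILE* f = std::fopen(path, "rb");
    if (!f) return false;

    std::vector<char> buf(1 << 20); // 1 MB chunk, as suggested above
    std::size_t n;
    while ((n = std::fread(buf.data(), 1, buf.size(), f)) > 0) {
        // send() may accept fewer bytes than requested, so loop per chunk.
        std::size_t sent = 0;
        while (sent < n) {
            ssize_t rc = send(client_fd, buf.data() + sent, n - sent, 0);
            if (rc <= 0) { std::fclose(f); return false; }
            sent += static_cast<std::size_t>(rc);
        }
    }
    std::fclose(f);
    return true;
}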
For a higher-end solution I'd use Boost.Asio on Windows, or sendfile() on Linux.
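And a hedged sketch of the sendfile() route on Linux, under the same client_fd assumption; sendfile(2) copies file pages to the socket inside the kernel, so no user-space buffer is needed:
#include <fcntl.h>
#include <sys/sendfile.h>
#include <sys/stat.h>
#include <unistd.h>

bool send_file_zero_copy(int client_fd, const char* path)
{
    int file_fd = open(path, O_RDONLY);
    if (file_fd < 0) return false;

    struct stat st;
    if (fstat(file_fd, &st) < 0) { close(file_fd); return false; }

    off_t offset = 0;
    while (offset < st.st_size) {
        // The kernel pushes file pages straight into the socket buffer;
        // nothing is copied through user space.
        ssize_t rc = sendfile(client_fd, file_fd, &offset,
                              static_cast<size_t>(st.st_size - offset));
        if (rc <= 0) { close(file_fd); return false; }
    }
    close(file_fd);
    return true;
}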
I am writing a server-client application using Winsock in C++ to send a file line by line, and I have a problem sending a huge string: the lines are very long.
On the server side, I use the code below to receive the message from the client:
int result;
char message[200];
while (true)
{
recv(newSd, (char*)&message, sizeof(message), 0);
cout << "The Message from client: " << message << ";";
}
The above code works fine if I send a short message. But what I want is to send an unknown number of lines from a file.
How can I send a big string of unknown size instead of char message[200]?
TCP is a byte stream, it knows nothing about messages or lines or anything like that. When you send data over TCP, all it knows about is raw bytes, not what the bytes represent. It is your responsibility to implement a messaging protocol on top of TCP to delimit the data in some meaningful way so the receiver can know when the data is complete. There are two ways to do that:
send the data length before sending the actual data. The receiver reads the length first, then reads however many bytes the length says.
send a unique terminator after sending the data. Make sure the terminator never appears in the data. The receiver can then read until the terminator is received.
You are not handling either of those in your recv() code, so I suspect you are not handling them in your send() code either (which you did not show).
Since you are sending a text file, you can either:
send the file size, such as in a uint32_t or uint64_t (depending on how large the file is), then send the raw file bytes.
send each text line individually as-is, terminated by a CRLF or bare-LF line break after each line, and then send a final terminator after the last line.
You are also ignoring the return value of recv(), which tells you how many bytes were actually received. It can, and usually does, return fewer bytes than requested, so you must be prepared to call recv() multiple times, usually in a loop, to receive data completely. Same with send().
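To make the length-prefix option concrete, here is a hedged sketch of the receiving side with Winsock; recv_all and recv_message are illustrative helpers (not from the question), and the sender is assumed to write the length with htonl() first:
#include <winsock2.h>
#include <cstdint>
#include <string>

// Read exactly `len` bytes, looping because recv() may return fewer.
static bool recv_all(SOCKET sock, char* buf, size_t len)
{
    size_t got = 0;
    while (got < len) {
        int rc = recv(sock, buf + got, static_cast<int>(len - got), 0);
        if (rc <= 0) return false; // error, or the peer closed the connection
        got += static_cast<size_t>(rc);
    }
    return true;
}

// Read a 4-byte big-endian length, then exactly that many payload bytes.
static bool recv_message(SOCKET sock, std::string& out)
{
    uint32_t len_be;
    if (!recv_all(sock, reinterpret_cast<char*>(&len_be), sizeof(len_be)))
        return false;
    const uint32_t len = ntohl(len_be); // sender assumed to use htonl()
    out.resize(len);
    return len == 0 || recv_all(sock, &out[0], len);
}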
I am trying to use boost::asio to make a synchronous HTTP POST request to localhost:55001/predict with a large body (around 8784000 characters). I am able to do this just fine in Python using the requests package. My server handles the large body just fine, so I know the issue is probably not on the server side.
The Problem:
I set up my request based on other Stack Overflow posts about boost::asio POST requests. Here is the main chunk of code for how I set up my POST request. EDIT: I need to use Boost because my client cannot use C++11.
The features variable contains a very long string (360 JSON objects represented as strings, delimited by &).
boost::asio::streambuf request;
std::ostream request_stream(&request);
request_stream << "POST /predict HTTP/1.1\r\n";
request_stream << "Host: localhost:55001\r\n";
request_stream << "Content-Type: text/plain; charset=utf-8\r\n";
request_stream << "Content-Length: " << features.length() << "\r\n";
request_stream << "Connection: close\r\n\r\n";
request_stream << features;
// Send the request
boost::asio::write(socket, request);
// Get the response
boost::asio::streambuf response;
boost::asio::read_until(socket, response, "\r\n");
I get the following error/response from the server:
Response returned with status code 413
libc++abi.dylib: terminating with uncaught exception of type boost::exception_detail::clone_impl<...>: (1): expected value
The Jetty server I'm sending the request to complains of a HttpParser:HttpParser Full error, which implies that the request is too large. Considering I was able to send the full request using Python's requests package, I know the Jetty server is definitely able to handle requests of this size. This means I am packaging my request incorrectly with boost::asio.
When features just contained 2 JSONs (still delimited by &) everything works fine and the response contains the results I expect.
I suspect that this issue is because I am writing too much data to the buffer and that I need to send several buffers in a single request. Furthermore, I imagine that Python's requests package handles these issues internally, which is why the Python code works just fine. Here is the line I use to send the request in Python:
response = requests.post('http://localhost:55001/predict', data=features.encode('utf-8'))
1) Can someone explain how to send a lot of data in the request body of a synchronous POST request using boost::asio? I am unfamiliar with C++ so an example would be helpful.
2) Is Python's requests package abstracting these issues away from me?
Let me know if there is additional information I can provide to help you answer this. Thanks in advance!
Boost is great, but if all you're trying to do is HTTP GET and POST (and you are using C++11), then consider something like cpr (C++ Requests). It's a lightweight library that wraps libcurl and is modeled after the Python Requests project.
From the documentation, here is a quick example of making a POST request:
#include <cpr/cpr.h>
auto r = cpr::Post(cpr::Url{"http://www.httpbin.org/post"},
cpr::Body{"This is raw POST data"},
cpr::Header{{"Content-Type", "text/plain"}});
std::cout << r.text << std::endl;
/*
* {
* "args": {},
* "data": "This is raw POST data",
* "files": {},
* "form": {},
* "headers": {
* ..
* "Content-Type": "text/plain",
* ..
* },
* "json": null,
* "url": "http://www.httpbin.org/post"
* }
*/
Currently I am receiving data synchronously in the following manner
boost::array<char, 2000> buf;
while (true)
{
    std::string dt;
    // Let asio size the buffer from the array itself; the original passed
    // 3000 bytes for a 2000-byte array.
    size_t len = connect_sock->receive(boost::asio::buffer(buf));
    std::copy(buf.begin(), buf.begin() + len, std::back_inserter(dt));
    std::cout << dt;
}
My question is whether this method is efficient enough to receive data that exceeds the buffer size. Is there any way to know exactly how much data is available, so that I could adjust the buffer size accordingly? (The reason for this is that my server sends a particular response to a request that should be processed only once the entire response has been stored in a string variable.)
If you are sending data using TCP, you have to take care of this at the application protocol level.
For example, you could prefix each request with a header that would include the number of bytes that make up the request. The receiver, having read and parsed the header, would know how many more bytes it would need to read to get the rest of the request. Then it could repeatedly call receive() until it gets the correct amount of data.
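As a rough sketch of that idea with boost::asio (which the question uses), assuming a connected boost::asio::ip::tcp::socket named sock and a sender that writes a 4-byte big-endian length first; note that boost::asio::read(), unlike a bare receive(), already loops until the buffer is full:
#include <boost/asio.hpp>
#include <cstdint>
#include <string>
#include <vector>

std::string read_length_prefixed(boost::asio::ip::tcp::socket& sock)
{
    // Read the 4-byte length header; read() blocks until all 4 bytes arrive.
    unsigned char hdr[4];
    boost::asio::read(sock, boost::asio::buffer(hdr));
    uint32_t len = (uint32_t(hdr[0]) << 24) | (uint32_t(hdr[1]) << 16)
                 | (uint32_t(hdr[2]) << 8)  |  uint32_t(hdr[3]);

    // Read exactly `len` payload bytes (throws boost::system::system_error
    // if the connection fails first).
    std::vector<char> body(len);
    boost::asio::read(sock, boost::asio::buffer(body));
    return std::string(body.begin(), body.end());
}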
Symptom
I think I messed something up, because both Mozilla Firefox and Google Chrome produce the same error: they don't receive the whole response the webserver sends them. curl never misses; the last line of the quick-scrolling response is always "</html>".
Reason
The reason is that I send the response in several parts:
sendHeaders(); // calls sendResponse() with a fixed header
sendResponse(html_opening_part);
for ( ...scan some data... ) {
    sendResponse(the_data);
} // for
sendResponse(html_closing_part);
The browsers stop receiving data between sendResponse() calls. The webserver does not close() the socket until the very end.
(Why I'm doing it this way: the program is designed for a non-Linux system; it will run on an embedded computer. It does not have much memory, and most of that is occupied by the lwIP stack. So, to avoid assembling the relatively huge webpage in memory, I send it in parts. Browsers like it there; none of the broken HTML I now see under Linux.)
Environment
The platform is GNU/Linux (Ubuntu 32-bit with a 3.0 kernel). My small webserver sends the stuff back to the client the standard way:
int sendResponse(char* data, int length) {
    int x = send(fd, data, length, MSG_NOSIGNAL);
    if (x == -1) {
        perror("this message never printed, so there's no error \n");
        if (errno == EPIPE) return 0;
        if (errno == ECONNRESET) return 0;
        ... panic() ... (never happened) ...
    } // if send()
    return x; // the original fell off the end without returning a value
} // sendResponse()
And here's the fixed header I am using:
sendResponse(
"HTTP/1.0 200 OK\n"
"Server: MyTinyWebServer\n"
"Content-Type: text/html; charset=UTF-8\n"
"Cache-Control: no-store, no-cache\n"
"Pragma: no-cache\n"
"Connection: close\n"
"\n"
);
Question
Is this normal? Do I have to send the whole response with a single send()? (Which is what I'm working on now, until a quick solution arrives.)
If you read RFC 2616, you'll see that you should be using CR+LF for the ends of lines.
Aside from that, open the browser developer tools to see the exact requests they are making. Use a tool like Netcat to duplicate the requests, then eliminate each header in turn until it starts working.
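For reference, the fixed header from the question rewritten with the CR+LF line endings that RFC 2616 requires (same sendResponse() call shape as in the question):
sendResponse(
    "HTTP/1.0 200 OK\r\n"
    "Server: MyTinyWebServer\r\n"
    "Content-Type: text/html; charset=UTF-8\r\n"
    "Cache-Control: no-store, no-cache\r\n"
    "Pragma: no-cache\r\n"
    "Connection: close\r\n"
    "\r\n"
);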
Gotcha!
As #Jim advised, I tried sending the same headers with curl that Mozilla sends: fail, broken pipe, etc. I deleted half of the headers: okay. I added them back one by one: fail. Deleted another half of the headers: okay... So, there is an error only if the header is too long. Bingo.
As I've said, there's a very small amount of memory in the embedded device. So, I don't read the whole request header, only 256 bytes of it. I only need the GET params and the "Host" header (I don't really even need that, just to perform redirects with the same "Host" instead of the IP address).
So, if I don't recv() the whole request header, I cannot send() back the whole response.
Thanks for your advice, dudes!
I'm serving some files locally over HTTP using QTcpSocket. My problem is that only wget downloads the file properly; Firefox adds four extra bytes to the end. This is the header I send:
HTTP/1.0 200 Ok
Content-Length: 382917;
Content-Type: application/x-shockwave-flash;
Content-Disposition: attachment; filename=file.swf;
This is the code used to send the response:
QTextStream os(socket);
os.setAutoDetectUnicode(true);
QString name = tokens[1].right(tokens[1].length() - 1);
QString resname = ":/" + name; // the served file is a Qt resource
QFile f(resname); f.open(QIODevice::ReadOnly);
os << "HTTP/1.0 200 Ok\r\n" <<
"Content-Length: " << f.size() << ";\r\n" <<
"Content-Type: application/x-shockwave-flash;\r\n" <<
"Content-Disposition: attachment; filename=" << name <<
";\r\n\r\n";
os.flush();
QDataStream ds(socket);
ds << f.readAll();
socket->close();
if (socket->state() == QTcpSocket::UnconnectedState)
{
delete socket;
}
As I stated above, wget gets it right and downloads the file properly. The problem is that Firefox (and my target application, a Flash ActiveX instance) doesn't.
The four extra bytes are always the same: 4E E9 A5 F4
Hex dump: http://www.freeimagehosting.net/uploads/a5711fd7af.gif
My question is: what am I doing wrong, and what should I change to get it right? Thanks in advance.
You should not be terminating the lines with semicolons. At first glance this seems like the most likely problem.
I don't know much about QDataStream (or Qt in general); however, a quick look at the QDataStream documentation mentions operator<<(char const*). If you are passing a null-terminated string to QDataStream, you are almost certainly going over the end of the final buffer.
Try using QDataStream::writeRawData().
If you remove the semicolons, then the clients should at least read the correct number of bytes for the response and ignore the last four bytes.
I'd leave out "Content-Disposition" too. That's a MIME thing, not an HTTP thing.
So I found the whole solution to the question, and I think someone might need it, so here it is:
The first problem was the four extra bytes. The reason for this is that, according to the QDataStream documentation, "each item written to the stream is written in a predefined binary format that varies depending on the item's type". And as QFile::readAll() returned a QByteArray, QDataStream::operator<< wrote that object in the following format:
If the byte array is null: 0xFFFFFFFF (quint32)
Otherwise: the array size (quint32) followed by the array bytes, i.e. size bytes
So, the four extra bytes were the four bytes of quint32 that denoted the array size.
The solution, according to janm's answer, was to use the writeRawData() function:
QDataStream ds(socket);
// Keep the byte array alive for the call and write its bytes verbatim,
// with no size prefix this time.
QByteArray bytes = f.readAll();
ds.writeRawData(bytes.constData(), bytes.size());
wget probably got it right the first time because it strictly enforces the Content-Length field of the HTTP header, while Firefox apparently does not.
The second problem was that despite the correct header and working sockets, the Flash player did not display the desired content at all. I experimented with various fields to make it work, and noticed that when uploading to a real server, it works all right. I copied the header from that server, and tadaa! it works. This is the header:
HTTP/1.1 200 OK
Server: Apache/2.2.15 (Fedora)
Accept-Ranges: bytes
Content-Length: 382917
Content-Type: application/x-shockwave-flash
Keep-Alive: timeout=15, max=100
Connection: Keep-Alive
At first I tried only setting the version to 1.1, but that didn't help. It's probably the keep-alive thing, but honestly, I don't care at all as long as it works :).
There shouldn't be any semicolons at the ends of the header lines.
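For reference, here is the header-writing code from the question with those semicolons removed (everything else unchanged):
os << "HTTP/1.0 200 Ok\r\n" <<
    "Content-Length: " << f.size() << "\r\n" <<
    "Content-Type: application/x-shockwave-flash\r\n" <<
    "Content-Disposition: attachment; filename=" << name <<
    "\r\n\r\n";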