Apache module - get request body - c++

I am creating a simple Apache module to capture all HTTP traffic for real-time processing by security software. My goal is to get the headers and body from both request and response. So far I have managed to get everything I need except the request body. What's the best way to get the request body in an output filter, or in any other hook/handler, so that I get a request-response "tuple" with all related information?
static apr_status_t ef_output_filter(ap_filter_t *f, apr_bucket_brigade *bb)
{
    apr_status_t rv;
    request_rec *r = f->r;
    apr_bucket *e = APR_BRIGADE_FIRST(bb);
    const char *data;
    apr_size_t length;
    std::ofstream outfile;
    outfile.open("/var/log/apache2/test.txt", std::ios_base::app);
    outfile << r->method << r->unparsed_uri << std::endl;
    apr_table_do(loop_table, &outfile, r->headers_in, NULL);
    //!!! READ REQUEST BODY HERE !!!!
    outfile << r->status << std::endl;
    apr_table_do(loop_table, &outfile, r->headers_out, NULL);
    outfile << std::endl;
    while (e != APR_BRIGADE_SENTINEL(bb)) {
        // Bucket data is not NUL-terminated, so write exactly `length` bytes
        rv = apr_bucket_read(e, &data, &length, APR_BLOCK_READ);
        if (rv == APR_SUCCESS) {
            outfile.write(data, length);
        }
        e = APR_BUCKET_NEXT(e);
    }
    outfile.flush();
    outfile.close();
    return ap_pass_brigade(f->next, bb);
}
Any help appreciated.

You can read the body from the request_rec pointer you're deriving from the ap_filter_t pointer variable.
As a first step, you should tell Apache you want to read data from the client, by calling ap_setup_client_block, passing the request_rec pointer and a "read policy" as arguments.
Second, you call ap_should_client_block (passing the request_rec pointer as argument) to check everything is OK, especially on the client side (expecting true as result).
Then you call (as many times as needed) ap_get_client_block, with the request_rec as argument, a buffer where the data will go, and the size of your buffer. You should get as a response the number of bytes read, and the data should be in your buffer; the call returns 0 once the body is exhausted and a negative value on error. If you tried to read X bytes maximum and got X bytes returned, you should call again to get the remaining bytes. Note that the "Content-Length" header should be used to avoid trying to read too much data, which might cause crashes...
So you'd go for something along the lines of:
char buffer[SOME_BUFFER_SIZE];
int ret_code = ap_setup_client_block(r, REQUEST_CHUNKED_ERROR);
if (ret_code == OK) {
    if (ap_should_client_block(r)) {
        // Returns the number of bytes read, 0 at end of body, negative on error
        long dataBytesRead;
        while ((dataBytesRead = ap_get_client_block(r, buffer, SOME_BUFFER_SIZE)) > 0) {
            ...
        }
    }
}
As of writing, you can find more info here: https://docstore.mik.ua/orelly/apache_mod/139.htm or here: http://byteandbits.blogspot.com/2013/09/example-apache-module-for-reading.html
Hope it helps...

Related

Cannot serve png files and other binary files in hobby HTTP server

I am writing an HTTP server in C++, and serving static files is mostly OK. However, when reading .PNG files or other binaries, every method I have tried fails. My main problem is that when I open up Dev Tools, an example image shows a transferred size of 29.56 kB and a size of 29.50 kB with my current method. Neither size matches what du -sh gives, which is 32 kB.
My first method was to push the contents of a file onto a string, and call a function to serve that. However, this would also serve ~6 kB, if memory serves correctly.
My current method is to read the file using std::ifstream in binary mode. I am getting the size of the file using C++17's filesystem header and std::filesystem::file_size. I read the contents into a buffer and then call a function to send the buffer contents 1 byte at a time.
void WebServer::sendContents(std::string contents) {
if (send(this->newFd, contents.c_str(), strlen(contents.c_str()), 0) == -1) {
throw std::runtime_error("Server accept: " + std::string(strerror(errno)));
}
}
void WebServer::sendFile(std::string path) {
path = "./" + path;
std::string fileCont; //File contents
std::string mimeType; //Mime type of the file
std::string contLength;
std::string::size_type idx = path.rfind('.');
if (idx != std::string::npos) mimeType = this->getMimeType(path.substr(idx + 1));
else mimeType = "text/html";
std::filesystem::path reqPath = std::filesystem::path("./" + path).make_preferred();
std::filesystem::path parentPath = std::filesystem::path("./");
std::filesystem::path actualPath = std::filesystem::canonical(parentPath / reqPath);
if (!this->isSubDir(actualPath, parentPath)) { this->sendRoute("404"); return; }
std::ifstream ifs;
ifs.open(actualPath, std::ios::binary);
if (ifs.is_open()) {
//Get the size of the static file being server
std::filesystem::path staticPath{path};
std::size_t length = std::filesystem::file_size(staticPath);
char* buffer = new char[length];
*buffer = { 0 }; //Initalize the buffer that will send the static file
ifs.read(buffer, sizeof(char) * length); //Read the buffer
std::string resp = "HTTP/1.0 200 OK\r\n"
"Server: webserver-c\r\n"
"Content-Length" + std::to_string(length) + "\r\n"
"Content-type: " + mimeType + "\r\n\r\n";
if (!ifs) std::cout << "Error! Only " << std::string(ifs.gcount()) << " could be read!" << std::endl;
this->sendContents(resp); //Send the headers
for (size_t i=0; i < length; i++) {
std::string byte = std::string(1, buffer[i]);
this->sendContents(byte);
}
delete buffer; //We do not need megs of memory stack up, that shit will grow quick
buffer = nullptr;
} else {
this->sendContents("HTTP/1.1 500 Error\r\nContent-Length: 0\r\nConnection: keep-alive\r\n\r\n"); return;
}
ifs.close();
}
It should be noted that this->newFd is a socket descriptor
It should also be noted that I have tried to take a look at this question here, however the same problem still occurs for me
if (send(this->newFd, contents.c_str(), strlen(contents.c_str()), 0) == -1) {
There are two bugs for the price of one here.
sendContents gets used, apparently, to send the contents of the binary file one byte at a time. This is horribly inefficient, but it's not the bug. The first bug is as follows.
Your binary file has plenty of bytes that are 00.
In that case, contents will proudly contain this 00 byte. c_str() returns a pointer to it. strlen() then reaches the conclusion that it is receiving an empty string for input, and makes a grandiose announcement that the string contains 0 characters.
In the end, send's third parameter will be 0.
No bytes will get sent at all, instead of the famous 00 byte.
The second bug will come into play once the inefficient algorithm gets fixed, and sendContents gets used to send more than one byte at a time.
send() holds a secret: this system call may return values other than the -1 that indicates failure. Such as the actual number of bytes that were sent. So, if send() was called to send, say, 100 bytes, it may decide to send only 30 bytes, return 30, and leave you holding the bag with the remaining 70 unsent bytes.
This is actually, already, an existing bug in the shown code. sendContents() also gets used to send the entire resp string. Which is, eh, in the neighborhood of 100 bytes. Give or take a dozen.
You are relying on this house of cards: send() always doing its complete job, in this particular case, not slacking off, and actually sending the entire HTTP/1.0 response string.
But send() is a famous slacker, and you have no guarantees whatsoever that this will actually happen. And I have it on good authority that on an upcoming Friday the 13th your send() will decide to slack off, all of a sudden.
So, to fix the shown code:
Implement the appropriate logic to handle the return value from send().
Do not use c_str() followed by strlen(), because: A) it's broken for strings containing binary data, B) this elaborate routine simply reinvents a wheel called size(). You will be happy to know that size() does exactly what its name claims.
One other bug:
char* buffer = new char[length];
It is possible for an exception to get thrown from the subsequent code. This memory then gets leaked, because delete does not get called.
C++ gurus know a weird trick: they rarely use new, but instead use containers, like std::vector, and they don't have to worry about leaking memory, because of that.
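A minimal sketch of the container approach, under the assumption that the whole file fits comfortably in memory. read_file is a hypothetical name, not something from the question; the vector releases its memory on every exit path, including when an exception is thrown.

```cpp
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: loads a whole file, byte for byte, into a vector.
// Throws std::runtime_error if the file cannot be opened.
std::vector<char> read_file(const std::string& path) {
    std::ifstream ifs(path, std::ios::binary);
    if (!ifs) {
        throw std::runtime_error("cannot open " + path);
    }
    // istreambuf_iterator reads raw bytes, so 00 bytes are kept as-is
    return std::vector<char>(std::istreambuf_iterator<char>(ifs),
                             std::istreambuf_iterator<char>());
}
```

The result's size() can then be used for Content-Length, and data() together with size() can be handed to the sending code in one call.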

Segmentation fault on the second call of the function?

Edit (solution)
I followed the advice to debug with -fsanitize=address and valgrind. I only used -fsanitize (which I had never heard of before) and found the problem: there was a leftover call to a destructor in another function, so the object was being destroyed twice. The memory was completely corrupted at this point.
Thanks a lot for the help and the other recommendations too.
I'm writing code in C++ to talk with CouchDB using sockets (CouchDB is a database by Apache that has an HTTP API). I've created a whole class to deal with it; it's basically a socket client that connects and closes.
One of my functions sends an HTTP request, then reads the response and works with it. It works fine on the first call, but fails when I call it a second time.
But it's inconsistent where it fails: sometimes it's a SEGFAULT inside one of the string functions, other times it's a SIGABRT in the return. I've marked the lines where it crashed with ->
And the worst part is that it only fails when it runs for the "second" time, which is actually the 10th time. Explanation: when the class is instantiated a socket is created, sendRequest is called 8 times (all work, always), and I close the socket. Then I have another class that controls a socket server, which receives commands and creates a remote user object that executes the command; the remote user command then calls the CouchDB class to manipulate the DB. The first time a command is requested it works, but the second time it fails and crashes the program.
Extra info: for the short int httpcode line, the gdb trace shows a crash in substr; for the SIGABRT crash, the trace shows a problem in free().
I've already debugged many times and made some changes as to where and how to instantiate the string and the buffer, and I'm lost. Does anyone know why it would work fine many times but crash on a subsequent call?
CouchDB::response CouchDB::sendRequest(std::string req_method, std::string req_doc, std::string msg)
{
std::string responseBody;
char buffer[1024];
// zero message buffer
memset(buffer, 0, sizeof(buffer));
std::ostringstream smsg;
smsg << req_method << " /" << req_doc << " HTTP/1.1\r\n"
<< "Host: " << user_agent << "\r\n"
<< "Accept: application/json\r\n"
<< "Content-Length: " << msg.size() << "\r\n"
<< (msg.size() > 0 ? "Content-Type: application/json\r\n" : "")
<< "\r\n"
<< msg;
/*std::cout << "========== Request ==========\n"
<< smsg.str() << std::endl;*/
if (sendData((void*)smsg.str().c_str(), smsg.str().size())) {
perror("#CouchDB::sendRequest, Error writing to socket");
std::cerr << "#CouchDB::sendRequest, Make sure CouchDB is running in " << user_agent << std::endl;
return {-1, "ERROR"};
}
// response
int len = recv(socketfd, buffer, sizeof(buffer), 0);
if (len < 0) {
perror("#CouchDB::sendRequest, Error reading socket");
return {-1, "ERROR"};
}
else if (len == 0) {
std::cerr << "#CouchDB::sendRequest, Connection closed by server\n";
return {-1, "ERROR"};
}
responseBody.assign(buffer);
// HTTP code is the second thing after the protocol name and version
-> short int httpcode = std::stoi(responseBody.substr(responseBody.find(" ") + 1));
bool chunked = responseBody.find("Transfer-Encoding: chunked") != std::string::npos;
/*std::cout << "========= Response =========\n"
<< responseBody << std::endl;*/
// body starts after two CRLF
responseBody = responseBody.substr(responseBody.find("\r\n\r\n") + 4);
// chunked means that the response comes in multiple packets
// we must keep reading the socket until the server tells us it's over, or an error happen
if (chunked) {
std::string chunkBody;
unsigned long size = 1;
while (size > 0) {
while (responseBody.length() > 0) {
// chunked requests start with the size of the chunk in HEX
size = std::stoi(responseBody, 0, 16);
// the chunk is on the next line
size_t chunkStart = responseBody.find("\r\n") + 2;
chunkBody += responseBody.substr(chunkStart, size);
// next chunk might be in this same request, if so, there must have something after the next CRLF
responseBody = responseBody.substr(chunkStart + size + 2);
}
if (size > 0) {
len = recv(socketfd, buffer, sizeof(buffer), 0);
if (len < 0) {
perror("#CouchDB::sendRequest:chunked, Error reading socket");
return {-1, "ERROR"};
}
else if (len == 0) {
std::cerr << "#CouchDB::sendRequest:chunked, Connection closed by server\n";
return {-1, "ERROR"};
}
responseBody.assign(buffer);
}
}
// move created body from chunks to responseBody
-> responseBody = chunkBody;
}
return {httpcode, responseBody};
}
The function that calls the above and that sometimes gets the SIGABRT:
bool CouchDB::find(Database::db db_type, std::string keyValue, std::string &value)
{
if (!createSocket()) {
return false;
}
std::ostringstream doc;
std::ostringstream json;
doc << db_name << db_names[db_type] << "/_find";
json << "{\"selector\":{" << keyValue << "},\"limit\":1,\"use_index\":\"index\"}";
-> CouchDB::response status = sendRequest("POST", doc.str(), json.str());
close(socketfd);
if (status.httpcode == 200) {
value = status.body;
return true;
}
return false;
}
Some bits that you might have questions about:
CouchDB::response is a struct {httpcode: int, body: std::string}
CouchDB::db is an enum to choose different databases
sendData only sends anything as bytes until all bytes are sent
The call int len = recv(socketfd, buffer, sizeof(buffer), 0); might be overwriting the last '\0' in your buffer, in which case responseBody.assign(buffer) reads past the end of the buffer looking for a terminator. One might be tempted to use sizeof(buffer) - 1, but this would be wrong as you might be getting null bytes in your stream. So, do this instead: responseBody.assign(buffer, len);. Only do this, of course, after you've made sure len >= 0, which you do in your error checks.
You have to do that every place where you call recv. Though, why you're using recv instead of read is beyond me, since you aren't using any of the flags.
Also, your buffer memset is pointless if you do it my way. You should also declare your buffer right before you use it. I had to read through half your function to figure out if you did anything with it. Though, of course, you do end up using it a second time.
Heck, since your error handling is basically identical in both cases, I would just make a function that did it. Don't repeat yourself.
Lastly, you play fast and loose with the result of find. You might not actually find what you're looking for and might get string::npos back instead, and that'd also cause you interesting problems.
Another thing: try -fsanitize=address (or one of the other sanitizer options) if you're using gcc or clang. And/or run it under valgrind. Your memory error may be far from the code that's crashing. Those might help you get close to it.
And, a very last note. Your logic is totally messed up. You have to separate out your reading data and your parsing and keep a different state machine for each. There is no guarantee that your first read gets the entire HTTP header, no matter how big that read is. And there is no guarantee that your header is less than a certain size either.
You have to keep reading until you've either read more than you're willing to accept for the header, and consider that an error, or until you get the CR LF CR LF at the end of the header.
Those last bits won't cause your code to crash, but will cause you to get spurious errors, especially in certain traffic scenarios, which means that they will likely not show up in testing.

How to publish JSON to a web server?

I am playing around with freeboard.io and trying to make a widget that pulls JSON data from a URL [TBD]. My original data source is from an iMX6-based Wandboard running Linux that is connected to the internet. I want to write a C++ program on the Wandboard that opens a socket to [TBD] and sends UDP packets, for example, containing my sensor data. My JSON data structure is like this:
{
"sensor_a": 1100,
"sensor_b": 247,
"sensor_c": 0
}
Can you help me put my JSON data structure into an IP packet using C++ on Ubuntu Linux? I know how to serialize the data structure in ASCII, for example, and build a buffer to stuff an IP packet, but I'm wondering if there is a standard way to do this for cloud services, or whether it will be different for Azure vs AWS. Is some type of header info needed to "put" the data?
This is a very simple problem and, like all simple problems, it needs no external libraries for serializing etc. Like @Galik said above, your problem is how to send a string from client to server. Additionally, for your case you need a JSON parser on the server (any C or C++ parser from the JSON page will do; I use gason because it's fast and simple).
In TCP/IP socket programming you have to let the other side know how many bytes (characters in your case) to read.
I faced a similar case: send JSON over the web.
Here's the example, a JSON "message":
https://github.com/pedro-vicente/lib_netsockets/blob/master/examples/json_message.cc
In this case, the message is preceded by its size, in this header format:
nbr_bytes#json_string
where "json_string" is the JSON text, "nbr_bytes" is the number of characters "json_string" has and "#" is a separator character.
How does the server parse this?
By reading 1 character at a time until the "#" separator is found, then converting that string into a number,
then making the socket API read "nbr_bytes" characters and exit.
Example:
100#{json_txt....}
In this case "json_txt" has 100 characters.
Here's the code for the parser:
std::string read_response(socket_t &socket)
{
    int recv_size; // size in bytes received, 0 if the peer closed, or -1 on error
    size_t size_json = 0; // in bytes
    std::string str_header;
    std::string str;
    // parse the header one character at a time, looking for the separator #
    // assume the size header is shorter than 20 digits
    for (size_t idx = 0; idx < 20; idx++)
    {
        char c;
        if ((recv_size = recv(socket.m_socket_fd, &c, 1, 0)) == -1)
        {
            std::cout << "recv error: " << strerror(errno) << std::endl;
            return str;
        }
        if (recv_size == 0)
        {
            // connection closed before the separator arrived
            return str;
        }
        if (c == '#')
        {
            break;
        }
        else
        {
            str_header += c;
        }
    }
    // get size
    size_json = static_cast<size_t>(atoi(str_header.c_str()));
    // read from the socket with a known size, directly into the string,
    // so that nothing is leaked on the error path
    std::string str_json(size_json, '\0');
    if (socket.read_all(&str_json[0], size_json) < 0)
    {
        std::cout << "recv error: " << strerror(errno) << std::endl;
        return str;
    }
    return str_json;
}
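For completeness, the sending side of the nbr_bytes#json_string format can be sketched in a few lines. frame_message is a hypothetical name and is not taken from lib_netsockets; it only builds the framed string, which would then be written to the socket in full.

```cpp
#include <string>

// Hypothetical helper: prefixes the JSON text with its length and the '#'
// separator, producing the "nbr_bytes#json_string" layout described above.
std::string frame_message(const std::string& json) {
    return std::to_string(json.size()) + '#' + json;
}
```

The length is counted in bytes, so it matches what the receiver passes to its fixed-size read even if the JSON contains multi-byte UTF-8 characters.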

How to send image data over linux socket

I have a relatively simple web server I have written in C++. It works fine for serving text/html pages, but the way it is written it seems unable to send binary data and I really need to be able to send images.
I have been searching and searching but can't find an answer specific to this question that is written in real C++ (fstream as opposed to file pointers etc.). Whilst this kind of thing is necessarily low level and may well require handling bytes in a C-style array, I would like the code to be as C++ as possible.
I have tried a few methods, this is what I currently have:
int sendFile(const Server* serv, const ssocks::Response& response, int fd)
{
// some other stuff to do with headers etc. ........ then:
// open file
std::ifstream fileHandle;
fileHandle.open(serv->mBase + WWW_D + resource.c_str(), std::ios::binary);
if(!fileHandle.is_open())
{
// error handling code
return -1;
}
// send file
ssize_t buffer_size = 2048;
char buffer[buffer_size];
while(!fileHandle.eof())
{
fileHandle.read(buffer, buffer_size);
status = serv->mSock.doSend(buffer, fd);
if (status == -1)
{
std::cerr << "Error: socket error, sending file\n";
return -1;
}
}
return 0;
}
And then elsewhere:
int TcpSocket::doSend(const char* message, int fd) const
{
if (fd == 0)
{
fd = mFiledes;
}
ssize_t bytesSent = send(fd, message, strlen(message), 0);
if (bytesSent < 1)
{
return -1;
}
return 0;
}
As I say, the problem is that when the client requests an image it won't work. I get in std::cerr "Error: socket error sending file"
EDIT : I got it working using the advice in the answer I accepted. For completeness and to help those finding this post I am also posting the final working code.
For sending I decided to use a std::vector rather than a char array. Primarily because I feel it is a more C++ approach and it makes it clear that the data is not a string. This is probably not necessary but a matter of taste. I then counted the bytes read for the stream and passed that over to the send function like this:
// send file
std::vector<char> buffer(SEND_BUFFER);
while(!fileHandle.eof())
{
fileHandle.read(&buffer[0], SEND_BUFFER);
status = serv->mSock.doSend(&buffer[0], fd, fileHandle.gcount());
if (status == -1)
{
std::cerr << "Error: socket error, sending file\n";
return -1;
}
}
Then the actual send function was adapted like this:
int TcpSocket::doSend(const char* message, int fd, size_t size) const
{
if (fd == 0)
{
fd = mFiledes;
}
ssize_t bytesSent = send(fd, message, size, 0);
if (bytesSent < 1)
{
return -1;
}
return 0;
}
The first thing you should change is the while (!fileHandle.eof()) loop, because it will not work as you expect: it can iterate once too often, since the eof flag isn't set until after you try to read beyond the end of the file. Instead do e.g. while (fileHandle.read(...)), and remember that the final, shorter read also fails that test, so the bytes it reports through gcount() still need to be sent.
The second thing you should do is check how many bytes were actually read from the file, and only send that many bytes.
Lastly, you read binary data, not text, so you can't use strlen on the data you read from the file.
A short explanation of the binary file problem: as you should hopefully know, C-style strings (the ones you use strlen to get the length of) are terminated by a zero character '\0' (in short, a zero byte). Random binary data can contain lots of zero bytes anywhere inside it, and each one is a valid byte without any special meaning.
When you use strlen to get the length of binary data there are two possible problems:
There's a zero byte in the middle of the data. This will cause strlen to terminate early and return the wrong length.
There's no zero byte in the data. That will cause strlen to go beyond the end of the buffer to look for the zero byte, leading to undefined behavior.
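A minimal sketch of a read loop that follows the points above: it stops when the stream stops delivering data, passes on only the bytes actually read, and never calls strlen. for_each_chunk and its callback are hypothetical names used for illustration; in the question's code the callback would be whatever sends the bytes to the socket.

```cpp
#include <cstddef>
#include <fstream>
#include <functional>
#include <ios>
#include <string>
#include <vector>

// Hypothetical helper: reads `path` in binary mode and hands each chunk,
// together with its real length, to `sink`. Stops early if `sink` returns false.
bool for_each_chunk(const std::string& path, std::size_t chunk_size,
                    const std::function<bool(const char*, std::size_t)>& sink) {
    if (chunk_size == 0) {
        return false; // a zero-sized read would never reach end of file
    }
    std::ifstream file(path, std::ios::binary);
    if (!file) {
        return false;
    }
    std::vector<char> buffer(chunk_size);
    // read() fails on the last, shorter chunk, so gcount() is checked as well
    while (file.read(buffer.data(), static_cast<std::streamsize>(buffer.size())) ||
           file.gcount() > 0) {
        if (!sink(buffer.data(), static_cast<std::size_t>(file.gcount()))) {
            return false;
        }
    }
    return true;
}
```

The callback receives a pointer and a length rather than a C string, which keeps zero bytes in the image data intact.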

C++ Read From Socket into std::string

I am writing a program in C++ that uses C sockets. I need a function that receives data and returns it as a string. I know this will not work:
std::string Communication::recv(int bytes) {
std::string output;
if (read(this->sock, output, bytes)<0) {
std::cerr << "Failed to read data from socket.\n";
}
return output;
}
Because the read()* function takes a char array pointer for an argument. What is the best way to return a string here? I know I could theoretically read the data into a char array then convert that to a string but that seems wasteful to me. Is there a better way?
*I don't actually mind using something other than read() if there is a more fitting alternative.
Here is all of the code on pastebin which should expire in a week. If I don't have an answer by then I will re-post it: http://pastebin.com/HkTDzmSt
[UPDATE]
I also tried using &output[0], but the output contained the following:
jello!
[insert a billion bell characters here]
"jello!" was the data sent back to the socket.
Here are some functions that should help you accomplish what you want. They assume you'll only receive ASCII characters from the other end of the socket.
std::string Communication::recv(int bytes) {
std::string output(bytes, 0);
if (read(this->sock, &output[0], bytes-1)<0) {
std::cerr << "Failed to read data from socket.\n";
}
return output;
}
or
std::string Communication::recv(int bytes) {
std::string output;
output.resize(bytes);
int bytes_received = read(this->sock, &output[0], bytes-1);
if (bytes_received<0) {
std::cerr << "Failed to read data from socket.\n";
return "";
}
output[bytes_received] = 0;
return output;
}
When printing the string, be sure to use cout << output.c_str(), since std::string's overloaded operator<< writes all size() characters, including the unprintable zero bytes left after the received data, whereas c_str() is printed only up to the first zero byte. Alternatively, you could resize the string at the end of the function to the number of bytes received and then use cout normally.
As pointed out in comments, sending the size first would also be a great idea to avoid possible unnecessary memory allocation by the string class.
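The resize-to-received-size idea mentioned above can be sketched as follows. read_some is a hypothetical name, and throwing on failure is one possible design choice among several; the point is that the returned string holds exactly the bytes that read() delivered, so it prints normally and may also contain zero bytes.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: reads up to max_bytes from fd straight into a string
// and shrinks the string to the number of bytes actually received.
std::string read_some(int fd, std::size_t max_bytes) {
    std::string output(max_bytes, '\0');
    const ssize_t n = read(fd, &output[0], max_bytes);
    if (n < 0) {
        throw std::runtime_error("Failed to read data from socket.");
    }
    output.resize(static_cast<std::size_t>(n)); // drop the unused tail
    return output;
}
```

An empty result then means the peer closed the connection (or max_bytes was 0), which the caller can tell apart from an error.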