Curlpp, incomplete data from request - c++

I am using Curlpp to send requests to various webservices to send and receive data.
So far this has worked fine, since I have only used it for sending and receiving JSON data.
Now I have a situation where a webservice returns a zip file in binary form, and the data I receive is incomplete.
I first had Curl write the data to an ostringstream using the WriteStream option, but this proved not to be the correct approach: the data contains null characters, and it stopped at the first null char.
After that, instead of WriteStream I used WriteFunction with a callback function.
The problem in this case is that this function is always called 2 or 3 times, regardless of the amount of data.
This results in always having a few chunks of data that don't seem to be the first part of the file, although the data always contains PK as the first 2 characters, indicating a zip file.
I used several tools to verify that the data is being sent to my application in full, so this is not a problem with the webservice.
Here is the code. Do note that options like hostname, port, headers and postfields are set elsewhere.
string requestData;
size_t WriteStringCallback(char* ptr, size_t size, size_t nmemb)
{
    requestData += ptr;
    int totalSize = size*nmemb;
    return totalSize;
}
const string CurlRequest::Perform()
{
    curlpp::options::WriteFunction wf(WriteStringCallback);
    this->request.setOpt( wf );
    this->request.perform();
    return requestData;
}
I hope someone can help me out with this issue, because I have run out of leads on how to fix it, also because curlpp is poorly documented (and even worse since the curlpp website disappeared).

The problem with the code is that the data is put into a std::string even though it is in binary (ZIP) format. I'd recommend putting the data into a stream (or a binary array).
You can also register a callback to retrieve the response headers and act in the WriteCallback according to the "Content-Type".
Use curlpp::options::HeaderFunction to register the callback that receives the response headers.

std::string is not a problem, but the concatenation is:
requestData += ptr;
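The += operator treats ptr as a C string, which ends at the first zero byte, so if the input contains any zero bytes it will be truncated. libcurl also does not null-terminate the buffer it hands to the callback, so without a zero byte the copy can run past the end of the chunk. You should wrap it into a string which knows the length of its data: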
requestData += std::string(ptr, size*nmemb);
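For reference, here is a minimal sketch of what the whole callback might look like with that fix applied. The names responseData and WriteBinaryCallback are made up for this example, and the curlpp setup from the question is assumed to stay as it is.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Accumulates the response body. std::string can hold zero bytes as long as
// the length is always passed explicitly.
static std::string responseData;

// Same parameter list as the callback in the question (ptr, size, nmemb).
std::size_t WriteBinaryCallback(char* ptr, std::size_t size, std::size_t nmemb)
{
    assert(ptr != nullptr);
    const std::size_t totalSize = size * nmemb;
    // append(ptr, n) copies exactly n bytes and does not stop at '\0'.
    responseData.append(ptr, totalSize);
    // Returning a different value would signal an error to libcurl.
    return totalSize;
}
```

Being called 2 or 3 times per response is normal: each call delivers the next chunk, and the chunks only need to be appended in order.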

Related

Issue with boost serialization not dearchiving when data length is over a specific value

Okay, so I am trying to send a struct with boost asio. The send on the client side works fine and the read_until also seems fine. However, when it tries to deserialize the data back to the struct, it won't work when the size of the archive is greater than about 475 in length. The rest of the struct gets ignored for some reason and only the data field gets printed. I also added screenshots of the output. Basically, when the whole struct is not received there is an input stream error on the line ba >> frame. I also tested with a larger file and got the same error. I even tried serializing a vector as well, so I am not sure where my error is.
EDIT:
I figured out the issue. When I was reading from the socket I had something like this...
boost::asio::read_until(socket, buf, "\0");
This was causing weird issues reading in all the data from the boost binary archive. To fix this issue I made a custom delimiter that I appended to the archive I was sending over the socket like...
boost::asio::read_until(socket, buf, "StopReadingHere");
This fixed the weird issue of the entire boost archive string not being read into the streambuf.
First issue:
ostringstream oss;
boost::archive::text_oarchive ba(oss);
ba << frame;
string archived_data = oss.str();
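Here you take the string while the archive object is still alive, so nothing ensures that the archive is complete. Let the archive go out of scope before reading the stream. Fix:
ostringstream oss;
{
    boost::archive::text_oarchive ba(oss);
    ba << frame;
}
string archived_data = oss.str();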
Second issue:
boost::asio::read_until(socket, buf, "\0");
string s((istreambuf_iterator<char>(&buf)), istreambuf_iterator<char>());
Here you potentially read too much into s: buf may contain additional data after the '\0'. Use the return value n from read_until to copy only that many bytes (e.g. with std::copy_n), followed by buf.consume(n).
If you then keep the buf instance for subsequent reads, the remaining data that was already read will still be in the buffer. If you discard it, that will lead to problems deserializing the next message.
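The bookkeeping can be illustrated without Boost. The following is only a sketch: a plain std::string stands in for the streambuf, and takeMessage is a hypothetical helper, not an Asio function. It shows the idea of taking exactly the bytes up to the delimiter and leaving the rest buffered for the next message.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: removes the first message (up to and including the
// delimiter) from `pending` and stores it, without the delimiter, in
// `message`. Bytes after the delimiter stay in `pending` for the next call.
// Returns false when no complete message is buffered yet.
bool takeMessage(std::string& pending, const std::string& delimiter,
                 std::string& message)
{
    assert(!delimiter.empty());
    const std::size_t pos = pending.find(delimiter);
    if (pos == std::string::npos) {
        return false;  // need more data from the socket
    }
    // n plays the role of read_until's return value: bytes up to and
    // including the delimiter.
    const std::size_t n = pos + delimiter.size();
    message.assign(pending, 0, pos);  // copy only this message
    pending.erase(0, n);              // the equivalent of buf.consume(n)
    return true;
}
```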
Risky Code?
void write(tcp::socket& socket, string data, int timeout) {
    auto time = std::chrono::seconds(timeout);
    async_write(socket, boost::asio::buffer(data), transfer_all(), [&] (error_code error, size_t bytes_transferred) {
    });
    service.await_operation(time, socket);
}
You're using an async operation, but passing a local variable (data) as the buffer. The risk is that data becomes invalid as soon as write returns.
Are you making sure that async_write has always completed before exiting from write? (It is possible that await_operation achieves this for you.)
Perhaps you are even using await_operation from my own old answer, "How to simulate boost::asio::write with a timeout". It's possible that, since things were added, some assumptions no longer hold. I can always review a larger piece of code to check.

Cannot read .jpg binary data, buffer only has 4 bytes of data

My question is almost exactly the same as this one, which is unanswered. I am trying to read the binary data of a .jpg to send as an HTTP response on a simple web server using C++. The code for reading the data is below.
FILE *f = fopen(file.c_str(),"rb");
if(f){
    fseek(f,0,SEEK_END);
    int length = ftell(f);
    fseek(f,0,SEEK_SET);
    char* buffer = (char*)malloc(length+1);
    if(buffer){
        int b = fread(buffer,1,length,f);
        std::cout << "bytes read: " << b << std::endl;
    }
    fclose(f);
    buffer[length] = '\0';
    return buffer;
}
return NULL;
When the request for the image is made and this code runs, fread() returns 25253 bytes being read, which seems correct. However, when I perform strlen(buffer) I get only 4. Of course, this gives an error on a browser when the image tries to display. I have also tried manually setting the HTTP content length to 25253, but I then receive a curl error 18, indicating the transfer ended early (as only 4 bytes exist).
As the other poster mentioned in their question, the 5th byte of the image (and I assume most .jpg images) is 0x00, but I am unsure if this has an effect on saving to the buffer.
I have verified the .jpg images I am loading are in the directory, valid, and display properly when opened normally. I have also tried 2 different methods of loading the binary data, and both also give only 4 bytes, so I am really at a loss. Any help is much appreciated.
"When the request for the image is made and this code runs, fread() returns 25253 bytes being read, which seems correct. However, when I perform strlen(buffer) I get only 4."
Well, there is your problem: you are reading binary data, not text. Special characters like newline or the null character do not indicate any text structure there; they are simply numbers.
strlen counts the characters before the first '\0' (that is, the first zero byte). A binary file like a JPEG usually contains plenty of zero bytes, and because of the binary header structure there seems to always be one at position 5, so strlen stops there and returns 4.
The browser complains because the response body was treated as text and therefore cut off after 4 bytes. HTTP can carry binary data in the body, as long as you set the Content-Length header and actually send that many bytes, using the count returned by fread rather than strlen; the alternative is to encode the data, base64 being very popular. Of course you also have to tell the HTTP client the type by setting the proper MIME type header.
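As a rough sketch of carrying the length along with the bytes, the file could be read into a std::vector<char>, whose size() then feeds Content-Length. The function names readBinaryFile and jpegResponseHeader are invented for this example, and the header shown is a bare minimum rather than a complete response.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Reads a whole file as raw bytes. The vector carries its own size, so the
// caller never needs strlen() or a terminating '\0'.
std::vector<char> readBinaryFile(const std::string& path)
{
    assert(!path.empty());
    std::ifstream in(path, std::ios::binary);
    if (!in) {
        return {};  // could not open; an empty file looks the same here
    }
    return std::vector<char>(std::istreambuf_iterator<char>(in),
                             std::istreambuf_iterator<char>());
}

// Builds a minimal response header from the real byte count.
std::string jpegResponseHeader(std::size_t bodySize)
{
    return "HTTP/1.1 200 OK\r\n"
           "Content-Type: image/jpeg\r\n"
           "Content-Length: " + std::to_string(bodySize) + "\r\n\r\n";
}
```

The body would then be sent with the explicit length, i.e. data.data() and data.size(), never through a function that expects a C string.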

Filter data on pr_write after hooking

I am having trouble filtering data that is passed to PR_Write, the Mozilla function used to send all sorts of data to the server. I managed to hook it using an (extremely basic) DLL, with the code from Wikipedia on hooking.
The following is the declaration of the PR_Write function referenced from the Mozilla website.
PRInt32 PR_Write(PRFileDesc *fd, const void *buf, PRInt32 amount);
The second parameter, buf, is what I am logging by casting it to a const char*. That works fine, but I don't know how to filter the data, since it logs everything from start to end.
The code below is what I tried, but the loop is too heavy and crashes Mozilla.
char *p=(char*)buf; // get pointer to beginning of the buffer
while (*p!='\00')
{
    // do some data filtering
    *p++;
}
The idea, from the book Gray Hat Python, was to iterate through the buffer and filter data as needed, but the loop is too much since the buffer is always extremely large.
Overall, I need a way to filter the data that is passed to the second parameter.
Thanks for any suggestions in advance :)
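One thing that may be worth checking: the third parameter of PR_Write, amount, is the number of bytes in buf, and nothing in the declaration promises a terminating '\0'. A loop that waits for a zero byte can therefore run past the end of the buffer. Below is a minimal sketch of a bounded search; bufferContains is a hypothetical helper, and the hook itself is assumed to pass buf and amount through unchanged.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Hypothetical filter: reports whether `token` occurs within the first
// `amount` bytes of `buf`. It never reads past buf + amount and does not
// rely on a terminating '\0'.
bool bufferContains(const void* buf, int amount, const std::string& token)
{
    assert(amount >= 0);
    if (buf == nullptr || token.empty()) {
        return false;
    }
    const char* begin = static_cast<const char*>(buf);
    const char* end = begin + amount;
    return std::search(begin, end, token.begin(), token.end()) != end;
}
```

Inside the hook this could be called as bufferContains(buf, amount, "some marker") before deciding whether to log the call.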

Cannot Send Image File (image/jpg) Using Winsock WSABUF

I'm stuck and I need help.
I'm trying to write the correct code for sending back an image file so the web browser can render it. It can send back text/html just fine, but image/* is not working.
You can see the code at the URL below.
https://github.com/MagnusTiberius/iocphttpd/blob/master/iocphttpl/SocketCompletionPortServer.cpp
What the browser is receiving is just a few bytes of image data.
I tried vector, std::string and const char* to set the values of WSABUF, but still the same few bytes are sent over.
Please let me know what the missing piece is to make this work.
Thanks in advance.
Here's your problem:
PerIoData->LPBuffer = _strdup(str.c_str());
The _strdup function only copies up until the first null, so it cannot be used to copy binary data. Consider using malloc and memcpy if you don't want to use the C++ library.
The alternate implementation (in the false branch) is also incorrect, because it saves the data in an object (vc) that goes out of scope before the I/O is completed. You could instead do something like
vector<char> * vc = new vector<char>;
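For the malloc and memcpy route, a sketch could look like the following. duplicateBytes is a made-up name; the caller is assumed to store the length separately (for example in the len member of the WSABUF) and to free the copy only after the I/O has completed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical replacement for _strdup: duplicates exactly `length` bytes,
// including embedded zeros. Release the result with std::free.
char* duplicateBytes(const char* data, std::size_t length)
{
    assert(data != nullptr || length == 0);
    // malloc(0) may return nullptr, so always request at least one byte.
    char* copy = static_cast<char*>(std::malloc(length > 0 ? length : 1));
    if (copy != nullptr && length > 0) {
        std::memcpy(copy, data, length);
    }
    return copy;
}
```

With the code from the answer this would be used along the lines of PerIoData->LPBuffer = duplicateBytes(str.data(), str.size()), with str.size() kept as the number of bytes to send.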

How to send large, frequent xml data from javascript to a c++ http server

In my project I want to send possibly large and frequent XML data to a custom server written in C++. I don't want to use Apache and CGI because the data arrives too frequently to start a CGI process for every request. I would prefer the data to be received directly in the C++ code that will process it and send a reply.
I started out by using libmicrohttpd for the C++ server, but now I believe it won't be possible because it doesn't give access to the raw POST data. I started looking for another library, but I can't seem to find a C++ library that does this. Can anyone suggest a C++ HTTP server library that has access to the raw POST data?
Here is the code I intended to start with. It is one of the example files provided in the source code of libmicrohttpd: the post example from the libmicrohttpd library.
Edit:
A little more context.
From what I understand, to access the POST data in libmicrohttpd you create an MHD_PostProcessor function that gets called incrementally as the POST data is received in chunks. But the example below only shows how to get POST data in the form of key-value pairs; I can't see how to get the raw data from a POST.
The example implements the MHD_PostProcessor as post_iterator. See the definition of
static int post_iterator(void *cls,
                         enum MHD_ValueKind kind,
                         const char *key,
                         const char *filename,
                         const char *content_type,
                         const char *transfer_encoding,
                         const char *data, uint64_t off, size_t size) {
    ...
in the example posted above. You will see it only shows how to iterate the key value pairs.
MHD does give you access to the raw POST data: just grab it from "upload_data" directly instead of passing it to the MHD_PostProcessor. MHD delivers the uploaded POST stream incrementally by calling your main request-processing callback repeatedly, each time with the next raw, unprocessed chunk in "upload_data".
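To make the pattern concrete without pulling in the library, here is a sketch that only models the upload-related part of the request callback. RequestBody and collectUpload are hypothetical names; the two parameters upload_data and upload_data_size mirror the ones MHD passes to the callback, and everything else (connection, url, method, queuing the response) is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Per-request state; with MHD this would be kept through the con_cls
// pointer that the request callback receives.
struct RequestBody {
    std::string data;
    bool complete = false;
};

// Hypothetical helper to be called from the real callback on every
// invocation after the first. The first invocation for a request (where
// *con_cls is still NULL) carries no data and should only allocate the
// RequestBody and return.
void collectUpload(RequestBody& body, const char* upload_data,
                   std::size_t* upload_data_size)
{
    assert(upload_data_size != nullptr);
    if (*upload_data_size > 0) {
        // Raw, unparsed POST bytes delivered with this call.
        body.data.append(upload_data, *upload_data_size);
        // Mark the chunk as consumed; otherwise MHD offers it again.
        *upload_data_size = 0;
    } else {
        // A later call without upload data marks the end of the body.
        body.complete = true;
    }
}
```

Once complete is set, body.data holds the full XML payload and the real callback can build and queue its response.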