URL encoding in C++ and decoding in Node.js

I am transferring data from a C++ client to a Node.js server.
I compress the string using zlib deflate first, then I use curl_easy_escape to URL-encode the compressed string.
std::string s = zlib_compress(temp.str());
std::cout << s <<"\n";
CURL *handle = curl_easy_init();
char* o = curl_easy_escape(handle, s.data(), s.size());
std::cout << o <<"\n";
Then I send it using:
std::string bin(o);
curl_easy_setopt(handle, CURLOPT_POSTFIELDSIZE, bin.size());
curl_easy_setopt(handle, CURLOPT_POSTFIELDS, bin.data());
curl_easy_perform(handle);
When I run this, I get the output:
x??с??Ҵ4?
x%DA%D3%D1%81%80%E2%92%D2%B44%1D%03%00%1BW%03%E5
Now, I receive the second (URL-encoded) string on my Node.js server as is.
I now try to decode it.
var x = req.params;
for (var key in req.body)
{
    console.log(key);
    var x = unescape(key);
    var buffer = new Buffer(x);
    console.log(x);
    zlib.inflate(buffer, function(err, buffer) {
        console.log(err + " here");
    });
}
Which outputs:
x%DA%D3%D1%81%80%E2%92%D2%B44%1D%03%00%1BW%03%E5
xÚÓÑâÒ´4å
Error: incorrect header check here
What is the problem here? How do I debug it?

You can debug it by printing the decimal value of each byte of the compressed string in both the C++ and the Node.js code. For C++ that code would be:
for (size_t i = 0; i < s.size(); i++) {
    // cast through unsigned char so bytes above 0x7F do not print as negative numbers
    std::cout << static_cast<int>(static_cast<unsigned char>(s[i])) << " ";
}
In the Node.js code you would need to print the decimal value of each byte contained in the variable buffer.
If the decimal values of each byte are identical on the C++ and Node.js sides, then the zlib libraries are incompatible or the functions do not match: e.g. zlib_compress in C++ may produce a format that zlib.inflate in Node.js does not expect (gzip output would need zlib.gunzip, raw deflate output would need zlib.inflateRaw).
The root cause may be that characters are 1 byte in a C++ std::string but 2 bytes (UTF-16) in Node.js strings. Specifying the encoding when constructing the Buffer in Node.js may solve the problem if that is the case:
var buffer = new Buffer(x, 'binary');
See https://nodejs.org/api/buffer.html#buffer_new_buffer_str_encoding
Since the data is zlib-compressed here (and compressed data is binary in general), the encoding should be binary.
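For completeness on the C++ side, here is a minimal sketch (not the asker's exact code; the function name and URL are placeholders) of the same escape-and-POST sequence with the cleanup calls libcurl expects, i.e. curl_free for the escaped buffer and curl_easy_cleanup for the handle:
#include <curl/curl.h>
#include <string>

void post_compressed(const std::string &compressed)
{
    CURL *handle = curl_easy_init();
    char *escaped = curl_easy_escape(handle, compressed.data(),
                                     static_cast<int>(compressed.size()));
    std::string body(escaped); // copy so the libcurl buffer can be released
    curl_free(escaped);        // curl_easy_escape allocates; release it with curl_free

    curl_easy_setopt(handle, CURLOPT_URL, "http://example.com/upload"); // placeholder URL
    curl_easy_setopt(handle, CURLOPT_POSTFIELDSIZE, static_cast<long>(body.size()));
    curl_easy_setopt(handle, CURLOPT_POSTFIELDS, body.data());
    curl_easy_perform(handle);
    curl_easy_cleanup(handle);
}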

Related

Cannot serve png files and other binary files in hobby HTTP server

I am writing an HTTP server in C++, and serving static files is mostly OK; however, when reading .PNG files or other binaries, every method I have tried fails. My main problem is that when I open up Dev tools, reading an example image gives a transferred size of 29.56 kB, and a size of 29.50 kB for my current method. The sizes given also do not match up with what du -sh gives, which is 32 kB.
My first method was to push the contents of a file onto a string and call a function to serve that. However, this would also only serve ~6 kB, if memory serves correctly.
My current method is to read the file using std::ifstream in binary mode. I am getting the size of the file using C++17's filesystem header and std::filesystem::file_size. I read the contents into a buffer and then call a function to send the buffer contents one byte at a time.
void WebServer::sendContents(std::string contents) {
    if (send(this->newFd, contents.c_str(), strlen(contents.c_str()), 0) == -1) {
        throw std::runtime_error("Server accept: " + std::string(strerror(errno)));
    }
}
void WebServer::sendFile(std::string path) {
    path = "./" + path;
    std::string fileCont; //File contents
    std::string mimeType; //Mime type of the file
    std::string contLength;
    std::string::size_type idx = path.rfind('.');
    if (idx != std::string::npos) mimeType = this->getMimeType(path.substr(idx + 1));
    else mimeType = "text/html";
    std::filesystem::path reqPath = std::filesystem::path("./" + path).make_preferred();
    std::filesystem::path parentPath = std::filesystem::path("./");
    std::filesystem::path actualPath = std::filesystem::canonical(parentPath / reqPath);
    if (!this->isSubDir(actualPath, parentPath)) { this->sendRoute("404"); return; }
    std::ifstream ifs;
    ifs.open(actualPath, std::ios::binary);
    if (ifs.is_open()) {
        //Get the size of the static file being served
        std::filesystem::path staticPath{path};
        std::size_t length = std::filesystem::file_size(staticPath);
        char* buffer = new char[length];
        *buffer = { 0 }; //Initialize the buffer that will send the static file
        ifs.read(buffer, sizeof(char) * length); //Read the buffer
        std::string resp = "HTTP/1.0 200 OK\r\n"
                           "Server: webserver-c\r\n"
                           "Content-Length" + std::to_string(length) + "\r\n"
                           "Content-type: " + mimeType + "\r\n\r\n";
        if (!ifs) std::cout << "Error! Only " << ifs.gcount() << " could be read!" << std::endl;
        this->sendContents(resp); //Send the headers
        for (size_t i = 0; i < length; i++) {
            std::string byte = std::string(1, buffer[i]);
            this->sendContents(byte);
        }
        delete buffer; //We do not need megs of memory stacking up, that will grow quick
        buffer = nullptr;
    } else {
        this->sendContents("HTTP/1.1 500 Error\r\nContent-Length: 0\r\nConnection: keep-alive\r\n\r\n"); return;
    }
    ifs.close();
}
It should be noted that this->newFd is a socket descriptor.
It should also be noted that I have tried to take a look at this question here; however, the same problem still occurs for me.
if (send(this->newFd, contents.c_str(), strlen(contents.c_str()), 0) == -1) {
There are two bugs for the price of one, here.
This is used to send the contents of the binary file, one byte at a time; sendContents gets used, apparently, for each individual byte. This is horribly inefficient, but it's not the bug. The first bug is as follows.
Your binary file has plenty of bytes that are 00.
In that case, contents will proudly contain this 00 byte, here. c_str() returns a pointer to it. strlen() then reaches the conclusion that it is receiving an empty string for input, and makes a grandiose announcement that the string contains 0 characters.
In the end, send's third parameter will be 0.
No bytes will get sent, at all, instead of the famous 00 byte.
The second bug will come into play once the inefficient algorithm gets fixed, and sendContents gets used to send more than one byte at a time.
send() holds a secret: this system call may return values other than -1 (which indicates failure), such as the actual number of bytes that were sent. So, if send() was called to send, say, 100 bytes, it may decide to send only 30 bytes, return 30, and leave you holding the bag with the remaining 70 unsent bytes.
This is actually, already, an existing bug in the shown code. sendContents() also gets used to send the entire resp string. Which is, eh, in the neighborhood of a 100 bytes. Give or take a dozen.
You are relying on this house of cards: on send() always doing its complete job, in this particular case, not slacking off, and actually sending the entire HTTP/1.0 response string.
But send() is a famous slacker, and you have no guarantees, whatsoever, that this will actually happen. And I have it on good authority that on an upcoming Friday the 13th your send() will decide to slack off, all of a sudden.
So, to fix the shown code:
Implement the appropriate logic to handle the return value from send() (a sketch of such a send-all loop follows below).
Do not use c_str() followed by strlen(), because: A) it's broken for strings containing binary data, B) this elaborate routine simply reinvents a wheel called size(). You will be happy to know that size() does exactly what its name claims it does.
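Not the question's code, just a minimal sketch of such a send-all loop, assuming the POSIX send() and the this->newFd descriptor from the question passed in as fd; using data()/size() also sidesteps the strlen() problem with embedded 00 bytes:
#include <sys/socket.h>
#include <cstddef>
#include <string>

bool sendAll(int fd, const std::string &contents) {
    std::size_t sent = 0;
    while (sent < contents.size()) {
        // send() may transmit fewer bytes than requested; keep going until everything is out
        ssize_t n = send(fd, contents.data() + sent, contents.size() - sent, 0);
        if (n == -1) return false; // caller can inspect errno
        sent += static_cast<std::size_t>(n);
    }
    return true;
}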
One other bug:
char* buffer = new char[length];
It is possible for an exception to get thrown from the subsequent code. This memory gets leaked, because delete does not get called.
C++ gurus know a weird trick: they rarely use new, but instead use containers, like std::vector, and they don't have to worry about leaking memory, because of that.
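A minimal sketch of that approach; readFile is a hypothetical helper, not part of the question's class, and the vector's memory is released automatically even if an exception is thrown:
#include <filesystem>
#include <fstream>
#include <vector>

std::vector<char> readFile(const std::filesystem::path &path)
{
    std::vector<char> buffer(std::filesystem::file_size(path)); // sized to the file, zero-initialized
    std::ifstream ifs(path, std::ios::binary);
    ifs.read(buffer.data(), static_cast<std::streamsize>(buffer.size()));
    return buffer;
}
The buffer can then be handed to a send-all loop in one call instead of one byte at a time.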

How to publish JSON to a web server?

I am playing around with freeboard.io and trying to make a widget that pulls JSON data from a URL [TBD]. My original data source is from an iMX6-based Wandboard running Linux that is connected to the internet. I want to write a C++ program on the Wandboard that opens a socket to [TBD] and sends UDP packets, for example, containing my sensor data. My JSON data structure is like this:
{
"sensor_a": 1100,
"sensor_b": 247,
"sensor_c": 0
}
Can you help me put my JSON data structure into an IP packet using C++ on Ubuntu Linux? I know how to just serialize the data structure in ASCII, for example, and build a buffer to stuff into an IP packet, but I'm wondering if there is a standard way to do this for cloud services, or will it be different for Azure vs AWS? Is some type of header info needed to "put" the data?
This is a very simple problem; like all simple problems, it needs no external libraries for serializing, etc. As @Galik said above, your problem is how to send a string from client to server. Additionally, for your case you need a JSON parser on the server (any C or C++ parser from the JSON page will do; I use gason because it's fast and simple).
In TCP/IP socket programming you have to make the other party know how many bytes (characters in your case) to read.
I faced a similar case: send JSON over the web.
here's the example, a JSON "message"
https://github.com/pedro-vicente/lib_netsockets/blob/master/examples/json_message.cc
in this case, the message has this header format:
nbr_bytes#json_string
where "json_string" is the JSON text, "nbr_bytes" is the number of characters "json_string" has and "#" is a separator character.
How does the server parse this?
By reading one character at a time until the "#" separator is found, then converting that string into a number;
then making the socket API read "nbr_bytes" characters and returning.
example
100#{json_txt....}
in this case "json_txt" has 100 characters
here's the code for the parser
std::string read_response(socket_t &socket)
{
    int recv_size; // size in bytes received or -1 on error
    size_t size_json = 0; // in bytes
    std::string str_header;
    std::string str;

    // parse the header one character at a time and look for the separator '#'
    // (assume the size header is less than 20 digits)
    for (size_t idx = 0; idx < 20; idx++)
    {
        char c;
        if ((recv_size = recv(socket.m_socket_fd, &c, 1, 0)) == -1)
        {
            std::cout << "recv error: " << strerror(errno) << std::endl;
            return str;
        }
        if (c == '#')
        {
            break;
        }
        else
        {
            str_header += c;
        }
    }

    // get size
    size_json = static_cast<size_t>(atoi(str_header.c_str()));

    // read from socket with known size
    char *buf = new char[size_json];
    if (socket.read_all(buf, size_json) < 0)
    {
        std::cout << "recv error: " << strerror(errno) << std::endl;
        delete[] buf; // avoid leaking the buffer on the error path
        return str;
    }

    std::string str_json(buf, size_json);
    delete[] buf;
    return str_json;
}

Winsock - read integer from Java client in C++

I have a client-server application, with the server part written in C++ (Winsock) and the client part in Java.
When sending data from the client, I first send its length followed by the actual data. For sending the length, this is the code:
clientSender.print(text.length());
where clientSender is of type PrintWriter.
On the server side, the code that reads this is
int iDataLength;
if(recv(client, (char *)&iDataLength, sizeof(iDataLength), 0) != SOCKET_ERROR)
//do something
I tried printing the value of iDataLength within the if and it always turns out to be some random large integer. If I change iDataLength's type to char, I get the correct value. However, the actual value could well exceed a char's capacity.
What is the correct way to read an integer passed over a socket in C++ ?
I think the problem is that PrintWriter is writing text and you are trying to read a binary number.
Here is what PrintWriter does with the integer it sends:
http://docs.oracle.com/javase/7/docs/api/java/io/PrintWriter.html#print%28int%29
Prints an integer. The string produced by String.valueOf(int) is
translated into bytes according to the platform's default character
encoding, and these bytes are written in exactly the manner of the
write(int) method.
Try something like this:
#include <sys/socket.h>
#include <cstring> // for std::strerror()
// ... stuff

char buf[1024]; // buffer to receive text
int len;
if ((len = recv(client, buf, sizeof(buf), 0)) == -1)
{
    std::cerr << "ERROR: " << std::strerror(errno) << std::endl;
    return 1;
}
std::string s(buf, len);
int iDataLength = std::stoi(s); // convert text back to integer
// use iDataLength here (after sanity checks)
Are you sure the endianness is not the issue? (Maybe Java encodes it as big endian and you read it as little endian).
Besides, you might need to implement a receive-all function (similar to sendall, as here), to make sure you receive the exact number of bytes specified, because recv may receive fewer bytes than it was told to.
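A minimal sketch of such a receive-all loop, assuming a blocking socket (this is not the linked answer's code):
// Needs <winsock2.h> on Windows (Winsock) or <sys/socket.h> on POSIX.
bool recv_all(int sock, char *buf, int len)
{
    int got = 0;
    while (got < len) {
        int n = recv(sock, buf + got, len - got, 0);
        if (n <= 0) return false; // error, or the peer closed the connection
        got += n;
    }
    return true;
}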
You have a confusion between numeric values and their ASCII representation.
When in Java you write clientSender.print(text.length()); you are actually writing an ASCII string: if the length is 15, you will send the characters '1' (ASCII code 0x31) and '5' (ASCII code 0x35).
So you must either:
send a binary length in a portable way (in C or C++ you have hton and ntoh; on the Java side, DataOutputStream.writeInt already writes in network byte order); a sketch of the receiving side for this option is shown after the code below
add a separator (newline) after the textual length on the Java side and decode that in C++:
char buffer[1024]; // a size big enough to read the packet
int iDataLength, l;
l = recv(client, buffer, sizeof(buffer) - 1, 0); // read the text into the buffer, not into the int
if (l != SOCKET_ERROR) {
    buffer[l] = 0;
    sscanf(buffer, "%d", &iDataLength); // parse the textual length
    char *ptr = strchr(buffer, '\n');
    if (ptr == NULL) {
        // should never happen : peer does not respect protocol
        ...
    }
    ptr += 1; // ptr now points after the length
    //do something
}
Java part should be : clientSender.println(text.length());
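For the first option (a binary length), a minimal sketch of the receiving side, assuming client is the same socket as in the question and that the Java side writes the 4-byte length with DataOutputStream.writeInt (network byte order); short reads are ignored here for brevity:
// Needs <winsock2.h> on Windows (or <arpa/inet.h> on POSIX) and <cstdint>.
uint32_t netLength;
int l = recv(client, (char *)&netLength, sizeof(netLength), 0);
if (l == (int)sizeof(netLength)) {
    uint32_t iDataLength = ntohl(netLength); // network byte order -> host byte order
    // next, read iDataLength bytes of payload (looping, since recv may return less)
}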
EDIT :
From Remy Lebeau's comment: there is no 1-to-1 relationship between sends and reads in TCP. recv() can and does return arbitrary amounts of data, so you cannot assume that a single recv() will read the entire line of text.
The above code should not do a simple recv but should be ready to concatenate multiple reads until the separator is found (left as an exercise for the reader :-) ).
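Purely for illustration (this was not part of the original answer), a rough sketch of that read loop, assuming a blocking socket; it collects one character at a time until the newline separator arrives:
#include <string>
// Needs <winsock2.h> on Windows or <sys/socket.h> on POSIX.

bool read_length_line(int client, std::string &line)
{
    char c;
    for (;;) {
        int n = recv(client, &c, 1, 0);
        if (n <= 0) return false;   // error or connection closed
        if (c == '\n') return true; // full textual length received
        line += c;                  // keep accumulating digits
    }
}
// afterwards: int iDataLength = std::stoi(line);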

cURL for UTF-8 requests: it always comes out like �abe when it's supposed to be çabe

Well, I'm using libcurl in C++ on Visual Studio 2008 and Windows 7 Professional 32-bit to send a request with UTF-8 characters, but the problem is that I get that encoding error:
�abe instead of çabe.
One thing I noticed while testing on my localhost server: if I receive it in a PHP server like this, echo $_POST['post'];, it comes out as �abe, but if I encode it, echo utf8_encode($_POST['post']);, it comes out correctly as çabe, which is what I want.
But the thing is that I don't have control over the server that receives the data; I would like to send it already encoded in UTF-8.
How could I do that?
here is my post part
struct curl_httppost *formpost = NULL;
struct curl_httppost *lastptr = NULL;
std::string post = "çabe";
curl_formadd(&formpost,
             &lastptr,
             CURLFORM_COPYNAME, "post",
             CURLFORM_COPYCONTENTS, post.c_str(),
             CURLFORM_END);
EDIT:
According to DietrichEpp, to see whether I have UTF-8 on, I can simply test the length of a character such as "ç",
so I tried printf("%d\n", (int) strlen("ç"));
It should print out 2 or 3 for UTF-8, or 1 for something else.
It did print out 1, so that may be the reason. How can I fix this?
I want it to be UTF-8, or at least be able to call a function that converts the string to UTF-8 before using it in the post field of the cURL request.
If you want to send UTF-8 encoded data you have to encode it in UTF-8. For example, "ç" encoded in UTF-8 can portably be spelled \xC3\xA7, as in
std::string post = "\xC3\xA7abe";
Visual Studio 2008 should be able to automatically save the file in UTF-8, saving you the trouble of doing this encoding yourself. If you are stuck working in ISO 8859-1, this specific transcoding to UTF-8 can easily be achieved with (optimization left as an exercise):
std::string utf8_from_iso8859_1(std::string str)
{
    std::string res;
    for (std::string::iterator i = str.begin(); i < str.end(); i++) {
        unsigned char c = static_cast<unsigned char>(*i); // work on unsigned bytes to avoid sign issues
        if (c < 0x80)
            res += static_cast<char>(c);
        else {
            res += static_cast<char>(0xC0 | ((c >> 6) & 0x03));
            res += static_cast<char>(0x80 | (c & 0x3F));
        }
    }
    return res;
}
and then use
std::string post = "çabe";
std::string encoded = utf8_from_iso8859_1(post);
curl_formadd(&formpost,
             &lastptr,
             CURLFORM_COPYNAME, "post",
             CURLFORM_COPYCONTENTS, encoded.c_str(),
             CURLFORM_END);
Transcoding from other encodings will require a specific mapping, and your best bet will be to use a specialized library like libiconv.
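For illustration, a minimal sketch of the libiconv route; error handling and partial-conversion handling are omitted, and a real caller should check whether iconv() returns (size_t)-1:
#include <iconv.h>
#include <string>

// Convert `in` from the given source encoding to UTF-8.
std::string to_utf8(const std::string &in, const char *from_encoding)
{
    iconv_t cd = iconv_open("UTF-8", from_encoding);
    std::string out(in.size() * 4, '\0'); // generous upper bound for the UTF-8 output
    char *inbuf = const_cast<char *>(in.data());
    char *outbuf = &out[0];
    size_t inleft = in.size(), outleft = out.size();
    iconv(cd, &inbuf, &inleft, &outbuf, &outleft);
    iconv_close(cd);
    out.resize(out.size() - outleft); // trim to the bytes actually produced
    return out;
}
Called as to_utf8(post, "ISO-8859-1"), it does the same job as the hand-written loop above.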

Do I need to encode using Base 64 in my web service?

I am transferring messages to mobile devices via a web service. The data is an XML string, which I compress using GZipStream and then encode using Base64.
I am getting out-of-memory exceptions in the emulator, so I am looking to optimise the process; I have stopped passing the string around by value and removed unnecessary copies of byte arrays.
Now I'm wondering about the Base64 encoding. It increases the size of the message, the processing, and the memory requirements. Is it strictly necessary?
Edit: Here is how I decompress:
public static byte[] ConvertMessageStringToByteArray(ref string isXml)
{
    return fDecompress(Convert.FromBase64String(isXml));
}

public static byte[] fDecompress(byte[] ivBytes)
{
    const int INT_BufferSize = 2048;

    using (MemoryStream lvMSIn = new MemoryStream(ivBytes))
    using (GZipInputStream lvZipStream = new GZipInputStream(lvMSIn, ivBytes.Length))
    using (MemoryStream lvMSOut = new MemoryStream())
    {
        byte[] lvBuffer = new byte[INT_BufferSize];
        int liSize;
        while (true)
        {
            liSize = lvZipStream.Read(lvBuffer, 0, INT_BufferSize);
            if (liSize <= 0)
                break;
            lvMSOut.Write(lvBuffer, 0, liSize);
        }
        return lvMSOut.ToArray();
    }
}
gzip (which is what the GZipStream implements) produces binary data; it won't fit into a 7-bit text message (SOAP is a text message) unless you do something like Base64-encode it.
Perhaps the solution is to not gzip/encode (decode/ungzip) a whole buffer, but use streams for that - connect a gzipping stream to an encoding stream and read the result from the output of the latter (or connect the decoding stream to the ungzipping stream). This way you have a chance to consume less memory.