How to set maximum buffer size in Node.js? - c++

I was creating a server application, which uses socket.write() to send data back to the clients.
Here, I got the problem of managing buffer size. Suppose the connection between server and client broke down, and server was not aware of the problem. So, it keeps writing to the buffer. In such situation, I want to limit the maximum buffer size, so that it can throw errors before it consumes lots of resources.
I know kMaxLength in node_buffer.h controls the size of buffer, but I do not think it's a good idea to change its value via some self-defined methods. Is there any other method to manage the size?

kMaxLength is the upper limit for a single memory allocation; it is much bigger than socket.bufferSize:
http://nodejs.org/docs/v0.11.5/api/smalloc.html
You can read about socket.bufferSize here:
http://nodejs.org/api/net.html#net_socket_buffersize
You can also check socket.bytesRead and socket.bytesWritten to monitor your data transmission.
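The underlying idea, refusing new writes once too many bytes are still waiting to be flushed, is not specific to Node.js. Below is a minimal C++ sketch of such an application-level cap. BoundedWriteQueue and its member names are hypothetical helpers invented for illustration; they are not part of Node.js or of any library.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <utility>

// Hypothetical helper: tracks bytes that were queued for a peer but have not
// yet been flushed to the socket, and refuses to grow past a fixed cap.
class BoundedWriteQueue {
public:
    explicit BoundedWriteQueue(std::size_t max_pending_bytes)
        : max_pending_(max_pending_bytes) {}

    // Queue a message. Returns false (and queues nothing) if accepting it
    // would push the pending byte count above the cap.
    bool try_push(std::string message) {
        if (message.size() > max_pending_ - pending_) return false;
        pending_ += message.size();
        queue_.push_back(std::move(message));
        return true;
    }

    // Call once the oldest queued message has been flushed to the socket.
    void pop_flushed() {
        assert(!queue_.empty());
        pending_ -= queue_.front().size();
        queue_.pop_front();
    }

    std::size_t pending_bytes() const { return pending_; }

private:
    std::size_t max_pending_;
    std::size_t pending_ = 0;
    std::deque<std::string> queue_;
};
```

A server would treat a false return from try_push as the signal to report an error or drop the connection instead of buffering more data.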

Related

Libevent bufferevent's evbuffer_add

I am using Libevent library 2.0 for socket communication.
In order to add data to evbuffer, I am using evbuffer_add. The bufferevent stores the data in its internal buffer and transfers the data via socket using some predefined timeout and watermark settings.
My question is, is there any way to control the data transfer? Can we transfer the data explicitly any time and after any random number of bytes being written?
The idea behind this function is fire-and-forget. However, you can add a callback so that when the send finally happens, you can do some things:
evbuffer_add_cb
This doesn't give you much control, but you can use it for some behaviors, such as appending to the buffer.

Boost Asio sample HTTP Server -- taking this example and making it "production ready"

In my search for a clean, simple, modern, and cross-platform HTTP server, I settled on the Boost.Asio C++11 example HTTP Server. You may find it here, and in the boost source directory boost_1_55_0/doc/html/boost_asio/example/cpp11/http/server.
I have reviewed the code a little, and it looks to be quite clean and very very well documented, so it's nearly ideal. I just have a few small questions which probably only have an impact on performance, which for now is a secondary priority (the primary being stability), as I do intend to use the same portable code on mobile and embedded platforms.
This magic number 512 appears in request_handler::handle_request() in request_handler.cpp:
// Fill out the reply to be sent to the client.
rep.status = reply::ok;
char buf[512];
while (is.read(buf, sizeof(buf)).gcount() > 0)
  rep.content.append(buf, is.gcount());
rep.headers.resize(2);
rep.headers[0].name = "Content-Length";
rep.headers[0].value = std::to_string(rep.content.size());
rep.headers[1].name = "Content-Type";
rep.headers[1].value = mime_types::extension_to_type(extension);
And also in connection.hpp the connection class has this member:
/// Buffer for incoming data.
std::array<char, 8192> buffer_;
I'm not sure why these sizes, 512 bytes and 8K bytes are used. It seems pretty clear to me that the local file to be served is being read and dumped into the response's std::string 512 bytes at a time. I do wonder if 4K or 8K would be a more appropriate chunking size.
As for the 8K buffer_, it seems to be used for holding the data arriving over the network. This is a little harder for me to figure out since I find myself inside of asio's guts. I'm primarily worried about 8K not being enough. While a single packet will never exceed this length (I think... there is a theoretical max packet size of 64K, though.), I just don't know why this has to be 8K.
The 512-byte buffer is used to read data from the file and then append it to the body being constructed. This is certainly an unnecessary copy operation, but it is just an example. Copying a whole file into process-local memory and sending it as one message is certainly not optimal either.
This code is just an example, and before you give others access to your filesystem (even if it is read-only access), you should make sure that no confidential information can be read this way.
To handle really large files, you would probably want to use chunked transfer encoding and read and send files chunk by chunk, so that waiting for the disk overlaps with waiting for the network.
For a typical HTTP request, 8K seems enough to me, provided request bodies are handled separately.
But keep in mind that this is only an example. If you want an HTTP server that aims to serve nearly all possible purposes, you should look somewhere else. Just don't expect such an implementation to be as simple as this Boost example.
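As a rough illustration of the chunk-by-chunk idea, here is a minimal sketch of HTTP/1.1 chunked transfer encoding in standard C++. The function names (encode_chunk, last_chunk, send_chunked, chunk_encode_string) and the send callback are assumptions made for this sketch and are not part of the Boost example; in a real server the callback would be a socket write.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Encode one non-empty chunk as "<size in hex>\r\n<data>\r\n".
std::string encode_chunk(const char* data, std::size_t size) {
    assert(size > 0);  // a zero-length chunk would terminate the body
    char header[32];
    std::snprintf(header, sizeof(header), "%zx\r\n", size);
    std::string out(header);
    out.append(data, size);
    out.append("\r\n");
    return out;
}

// Terminating chunk with no trailer fields.
std::string last_chunk() { return "0\r\n\r\n"; }

// Read `in` block by block and hand each encoded chunk to `send`, so only
// one block of the file is held in memory at a time.
template <typename Send>
void send_chunked(std::istream& in, Send send, std::size_t block_size = 8192) {
    std::vector<char> buf(block_size);
    while (in.read(buf.data(), static_cast<std::streamsize>(buf.size())).gcount() > 0) {
        send(encode_chunk(buf.data(), static_cast<std::size_t>(in.gcount())));
    }
    send(last_chunk());
}

// Convenience for experimenting: chunk-encode an in-memory body.
std::string chunk_encode_string(const std::string& body, std::size_t block_size) {
    std::istringstream in(body);
    std::string wire;
    send_chunked(in, [&wire](const std::string& chunk) { wire += chunk; }, block_size);
    return wire;
}
```

The response would also need a "Transfer-Encoding: chunked" header in place of Content-Length.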
I agree with @Torsten Robitzki, it's just the size of the receive buffer.
Like you I was looking for clean, simple, cross-platform HTTP server for an embedded project over a year ago. I tried a few of the asio based HTTP libraries around at the time, but got frustrated with them and ended up writing my own. You can find it here: via-httplib
I'd be glad of any feedback, especially negative, although positive is always welcome. ;)
Currently Apache and nginx also use an 8KB limit (nginx used a 4KB limit until recently). However they do make it configurable:
http://wiki.nginx.org/HttpCoreModule#large_client_header_buffers
http://httpd.apache.org/docs/2.2/mod/core.html#limitrequestfieldsize
The increase from 4KB to 8KB would have been because cookies are getting larger and larger. But some limit is a good idea, to prevent connections from evil or broken clients. Make sure you send back a 4xx error if a client does send too big a request.
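A minimal sketch of such a limit check follows, assuming the server accumulates received bytes in a std::string until the blank line that ends the request head. HeadState, check_request_head and kTooLargeStatusLine are hypothetical names for this sketch; the 8192 default mirrors the 8KB figure above, and 431 is the status code RFC 6585 defines for oversized header fields.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

enum class HeadState { NeedMore, Complete, TooLarge };

// Status line a server could send back when the head is too large.
const char* const kTooLargeStatusLine =
    "HTTP/1.1 431 Request Header Fields Too Large\r\n";

// Decide whether the bytes received so far contain a complete request head
// (request line plus headers, ending in a blank line) within `limit` bytes.
HeadState check_request_head(const std::string& received, std::size_t limit = 8192) {
    assert(limit >= 4);  // must at least fit the terminating "\r\n\r\n"
    const std::size_t end = received.find("\r\n\r\n");
    const bool found = (end != std::string::npos);
    if (found && end + 4 <= limit) return HeadState::Complete;
    // Either the head ended beyond the limit, or the limit was reached
    // without seeing the end of the head at all.
    if (found || received.size() >= limit) return HeadState::TooLarge;
    return HeadState::NeedMore;
}
```

On TooLarge the server would send the 4xx response and close the connection rather than keep reading.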

Add delay to raw data before sending it?

I want to know how I can add a delay (e.g. 200 ms) to some received raw data before sending it on again through the network.
Is it possible to store the bits (8000 of them) in memory before sending them?
Yes, but it is really beyond the scope of this site to give you a full implementation. However, here are some tips.
Storing data in memory is basic enough. To store 8000 bits you could use std::bitset, or you could implement it manually in 1000 bytes on a regular 8-bits-per-byte system. If you need to send it across a network as 8000 bits then the latter form is what you would use, but you can get the raw data out of std::bitset, so you could still use that class internally.
The delaying is simply a matter of writing a scheduler and std::priority_queue could be used potentially to implement that.
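A minimal sketch of such a scheduler built on std::priority_queue is shown below. DelayQueue, Packet and the member names are hypothetical; the caller passes in the current time explicitly, so the class itself never sleeps and the polling or timer mechanism is left to the application.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <queue>
#include <utility>
#include <vector>

using Clock = std::chrono::steady_clock;
using Packet = std::vector<std::uint8_t>;

// Holds packets until a fixed delay has elapsed since they were received.
class DelayQueue {
public:
    explicit DelayQueue(std::chrono::milliseconds delay) : delay_(delay) {}

    // Store a packet; it becomes due `delay` after `received_at`.
    void push(Packet data, Clock::time_point received_at) {
        heap_.push(Entry{received_at + delay_, next_seq_++, std::move(data)});
    }

    // True if the earliest stored packet is due at time `now`.
    bool ready(Clock::time_point now) const {
        return !heap_.empty() && heap_.top().due <= now;
    }

    // Remove and return the earliest stored packet.
    Packet pop() {
        assert(!heap_.empty());
        Packet data = heap_.top().data;  // top() is const, so this copies
        heap_.pop();
        return data;
    }

private:
    struct Entry {
        Clock::time_point due;
        std::uint64_t seq;  // tie-breaker: keeps arrival order for equal due times
        Packet data;
    };
    struct Later {
        bool operator()(const Entry& a, const Entry& b) const {
            if (a.due != b.due) return a.due > b.due;
            return a.seq > b.seq;
        }
    };

    std::chrono::milliseconds delay_;
    std::uint64_t next_seq_ = 0;
    std::priority_queue<Entry, std::vector<Entry>, Later> heap_;
};
```

A forwarding loop would call ready(Clock::now()) periodically and send every packet that pop() returns while ready stays true.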
You do not store or send 8000 bits to cause a delay. Either use the usleep()/nanosleep() functions to pause the program for 200 ms before sending the data,
or use the Win32 Timer API SetTimer/KillTimer: add the data you want to delay to a queue and then start a timer for the number of milliseconds you want to delay the data. When the timer goes off, remove the data from the queue and send it.

Sending large chunks of data over Boost TCP?

I have to send mesh data via TCP from one computer to another... These meshes can be rather large. I'm having a tough time thinking about what the best way to send them over TCP will be as I don't know much about network programming.
Here is my basic class structure that I need to fit into buffers to be sent via TCP:
class PrimitiveCollection
{
std::vector<Primitive*> primitives;
};
class Primitive
{
PRIMTYPES primType; // PRIMTYPES is just an enum with values for fan, strip, etc...
unsigned int numVertices;
std::vector<Vertex*> vertices;
};
class Vertex
{
float X;
float Y;
float Z;
float XNormal;
float ZNormal;
};
I'm using the Boost library and their TCP stuff... it is fairly easy to use. You can just fill a buffer and send it off via TCP.
However, of course this buffer can only be so big and I could have up to 2 megabytes of data to send.
So what would be the best way to get the above class structure into the buffers needed and sent over the network? I would need to deserialize on the receiving end also.
Any guidance in this would be much appreciated.
EDIT: I realize after reading this again that this really is a more general problem that is not specific to Boost... It's more of a problem of chunking the data and sending it. However I'm still interested to see if Boost has anything that can abstract this away somewhat.
Have you tried it with Boost's TCP? I don't see why 2MB would be an issue to transfer. I'm assuming we're talking about a LAN running at 100mbps or 1gbps, a computer with plenty of RAM, and don't have to have > 20ms response times? If your goal is to just get all 2MB from one computer to another, just send it, TCP will handle chunking it up for you.
I have a TCP latency checking tool that I wrote with Boost, that tries to send buffers of various sizes, I routinely check up to 20MB and those seem to get through without problems.
I guess what I'm trying to say is don't spend your time developing a solution unless you know you have a problem :-)
--------- Solution Implementation --------
Now that I've had a few minutes on my hands, I went through and made a quick implementation of what you were talking about: https://github.com/teeks99/data-chunker There are three big parts:
The serializer/deserializer: Boost has its own, but it's not much better than rolling your own, so I did.
Sender - Connects to the receiver over TCP and sends the data
Receiver - Waits for connections from the sender and unpacks the data it receives.
I've included the .exe(s) in the zip, run Sender.exe/Receiver.exe --help to see the options, or just look at main.
More detailed explanation:
Open two command prompts, and go to DataChunker\Debug in both of them.
Run Receiver.exe in one of them.
Run Sender.exe in the other one (possibly on a different computer, in which case add --remote-host=IP.ADD.RE.SS after the executable name; if you want to try sending more than once, add --num-sends=10 to send ten times).
Looking at the code, you can see what's going on: the receiver and sender ends of the TCP socket are created in the respective main() functions. The sender creates a new PrimitiveCollection and fills it in with some example data, then serializes and sends it. The receiver deserializes the data into a new PrimitiveCollection, at which point the primitive collection could be used by someone else, but I just wrote to the console that it was done.
Edit: Moved the example to github.
Without anything fancy, from what I remember in my network class:
Send a message to the receiver asking what size data chunks it can handle
Take a minimum of that and your own sending capabilities, then reply saying:
What size you'll be sending, how many you'll be sending
After you get that, just send each chunk. You'll want to wait for an "Ok" reply, so you know you're not wasting time sending to a client that's not there. This is also a good time for the client to send a "I'm canceling" message instead of "Ok".
Send until all packets have been replied with an "Ok"
The data is transferred.
This works because TCP guarantees in-order delivery. UDP would require packet numbers (for ordering).
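The size negotiation and splitting steps above can be sketched as follows. ChunkPlan, plan_chunks and get_chunk are hypothetical names; the network exchange itself (asking for the receiver's limit, waiting for each "Ok") is left out, so this only covers the arithmetic of choosing a chunk size and slicing the data.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// What the sender announces: the chunk size and how many chunks will follow.
struct ChunkPlan {
    std::size_t chunk_size;
    std::size_t chunk_count;
};

// Take the minimum of what the receiver and the sender can handle,
// then compute how many chunks are needed for `total` bytes.
ChunkPlan plan_chunks(std::size_t total, std::size_t receiver_max, std::size_t sender_max) {
    const std::size_t chunk_size = std::min(receiver_max, sender_max);
    assert(chunk_size > 0);
    return ChunkPlan{chunk_size, (total + chunk_size - 1) / chunk_size};
}

// Return chunk number `index` (0-based); the last chunk may be shorter.
Bytes get_chunk(const Bytes& data, const ChunkPlan& plan, std::size_t index) {
    assert(index < plan.chunk_count);
    const std::size_t begin = index * plan.chunk_size;
    const std::size_t end = std::min(begin + plan.chunk_size, data.size());
    return Bytes(data.begin() + static_cast<std::ptrdiff_t>(begin),
                 data.begin() + static_cast<std::ptrdiff_t>(end));
}
```

The sender would loop over index from 0 to chunk_count - 1, sending each chunk and waiting for the reply before moving on.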
Compression is the same, except you're sending compressed data. (Data is data, it all depends on how you interpret it). Just make sure you communicate how the data is compressed :)
As for examples, all I could dig up was this page and this old question. I think what you're doing would work well in tandem with Boost.Serialization.
I would like to add one more point to consider - setting TCP socket buffer size in order to increase socket performance to some extent.
There is a utility, Iperf, that lets you test the exchange speed over a TCP socket. I ran a few tests on Windows in a 100 Mbps LAN. With the default 8 KB TCP window size the speed was 89 Mbits/sec, and with a 64 KB TCP window size it was 94 Mbits/sec.
In addition to how to chunk and deliver the data, another issue you should consider is platform differences. If the two computers are the same architecture, and the code running on both sides is the same version of the same compiler, then you should, probably, be able to just dump the raw memory structure across the network and have it work on the other side. If everything isn't the same, though, you can run into problems with endianness, structure padding, field alignment, etc.
In general, it's good to define a network format for the data separately from your in-memory representation. That format can be binary, in which case numeric values should be converted to standard forms (mainly, changing endianness to "network order", which is big-endian), or it can be textual. Many network protocols opt for text because it eliminates a lot of formatting issues and because it makes debugging easier. Personally, I really like JSON. It's not too verbose, there are good libraries available for every programming language, and it's really easy for humans to read and understand.
One of the key issues to consider when defining your network protocol is how the receiver knows when it has received all of the data. There are two basic approaches. First, you can send an explicit size at the beginning of the message, then the receiver knows to keep reading until it's gotten that many bytes. The other is to use some sort of an end-of-message delimiter. The latter has the advantage that you don't have to know in advance how many bytes you're sending, but the disadvantage that you have to figure out how to make sure the end-of-message delimiter can't appear in the message.
Once you decide how the data should be structured as it's flowing across the network, then you should figure out a way to convert the internal representation to that format, ideally in a "streaming" way, so you can loop through your data structure, converting each piece of it to network format and writing it to the network socket.
On the receiving side, you just reverse the process, decoding the network format to the appropriate in-memory format.
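For the binary option with an explicit size up front, a minimal sketch could look like the following. It assumes a 4-byte big-endian ("network order") length prefix; the function names put_u32_be, get_u32_be, frame and try_unframe are hypothetical, and encoding of the floats inside the payload is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Append a 32-bit value in big-endian byte order.
void put_u32_be(Bytes& out, std::uint32_t v) {
    out.push_back(static_cast<std::uint8_t>(v >> 24));
    out.push_back(static_cast<std::uint8_t>(v >> 16));
    out.push_back(static_cast<std::uint8_t>(v >> 8));
    out.push_back(static_cast<std::uint8_t>(v));
}

// Read a big-endian 32-bit value starting at `offset`.
std::uint32_t get_u32_be(const Bytes& in, std::size_t offset) {
    assert(offset + 4 <= in.size());
    return (std::uint32_t(in[offset]) << 24) | (std::uint32_t(in[offset + 1]) << 16) |
           (std::uint32_t(in[offset + 2]) << 8) | std::uint32_t(in[offset + 3]);
}

// Sender side: prefix the payload with its length.
Bytes frame(const Bytes& payload) {
    assert(payload.size() <= 0xFFFFFFFFu);
    Bytes out;
    out.reserve(4 + payload.size());
    put_u32_be(out, static_cast<std::uint32_t>(payload.size()));
    out.insert(out.end(), payload.begin(), payload.end());
    return out;
}

// Receiver side: if `received` starts with a complete frame, copy its payload
// into `payload`, remove the frame from `received`, and return true.
// Otherwise leave both untouched and return false (more bytes are needed).
bool try_unframe(Bytes& received, Bytes& payload) {
    if (received.size() < 4) return false;
    const std::size_t len = get_u32_be(received, 0);
    if (received.size() - 4 < len) return false;
    const auto body_begin = received.begin() + 4;
    const auto body_end = body_begin + static_cast<std::ptrdiff_t>(len);
    payload.assign(body_begin, body_end);
    received.erase(received.begin(), body_end);
    return true;
}
```

The receiver appends whatever each socket read returns to its buffer and calls try_unframe until it returns false, since TCP may deliver a frame in several pieces or several frames at once.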
My recommendation for your case is to use JSON. 2 MB is not a lot of data, so the overhead of generating and parsing won't be large, and you can easily represent your data structure directly in JSON. The resulting text will be self-delimiting, human-readable, easy to stream, and easy to parse back into memory on the destination side.

Webservice protection against big messages

I am developing a WebService in Java upon the jax-ws stack and glassfish.
Now I am a bit concerned about a couple of things.
I need to pass in an unknown amount of binary data that will be processed by an MDB. It is written this way to be asynchronous (so the user does not have to wait for the calculation to take place), somewhat fault tolerant, and very scalable.
The input message can however be split into chunks and sent to the MDB or split in the client and sent to the WS itself in chunks.
What I am looking for is a way to specify the maximum size of the input so I won't blow the heap even if someone deliberately tries to send too big a message. I have noticed that things tend to be a bit unstable once you hit the ceiling, and I must be able to keep running.
Is it possible to be safe against big messages or should I try to use another method instead of WS. Which options do I have?
Well I am rather new to Java EE..
If you're passing binary data take a look at enabling MTOM for endpoint. It utilizes streaming and has 'threshold' parameter.