ZeroMQ: Too many open files. Number of open fds growing continuously from the same object - C++

I create objects of the same class, each containing two ZeroMQ subscriber sockets and one ZeroMQ request socket, in different threads. I use inproc ZeroMQ sockets that all belong to the same ZContext.
Each time I create one of these objects, the number of open files (lsof | wc -l) on the server (running CentOS 7) increases: creating the first object raises the open-file count by about 300, the second raises it by 304, and it keeps growing.
Since my program can use many of these objects during runtime, this can result in a "too many open files" error from ZeroMQ even though I raised the limit to 524288 (ulimit -n). As the number of objects grows, each object consumes ever more of the open-file limit; some of them use around 1500.
During runtime my program crashes with the "too many open files" error once many objects have been created and their threads are doing their work (sending messages to another server or to clients).
How can I overcome this?
example code:
void Agent::run(void *ctx) {
    zmq::context_t *_context = (zmq::context_t *) ctx;
    zmq::socket_t dataSocket(*(_context), ZMQ_SUB);
    zmq::socket_t orderRequestSocket(*(_context), ZMQ_REQ); // REQ
    std::string bbpFilter = "obprice.1";
    std::string bapFilter = "obprice.2";
    std::string orderFilter = "order";
    dataSocket.connect("inproc://ordertrade_publisher");
    dataSocket.connect("inproc://orderbook_prices_pub");
    orderRequestSocket.connect("inproc://frontend_oman_agent");
    int rc;
    try {
        zmq::message_t filterMessage;
        zmq::message_t orderMessage;
        rc = dataSocket.recv(&filterMessage);
        dataSocket.recv(&orderMessage);
        // CALCULATION AND SEND ORDER
        // end:
        return;
    }
    catch (std::exception &e) {
        std::cerr << "Exception:" << e.what() << std::endl;
        Order.cancel_order(orderRequestSocket);
        return;
    }
}

I'm running into this as well. I'm not sure I have a solution, but I see that a context (zmq::context_t) has a maximum number of sockets. See zmq_ctx_set for more detail. This limit defaults to ZMQ_MAX_SOCKETS_DFLT which appears to be 1024.
You might just need to increase the number of sockets your context can have, although I suspect there might be some leaking going on (at least in my case).
UPDATE:
I was able to fix my leak through a combination of socket options:
ZMQ_RCVTIMEO - I was already using this to avoid waiting forever if the other end wasn't there. My system handles this by only making one request on a socket, then closing it.
ZMQ_LINGER - set to 0 so the socket doesn't wait around trying to send the failed message. The default behavior is infinite linger. This is probably the key to your problem
ZMQ_IMMEDIATE - this option restricts the queueing of messages to only completed connections. Without a queue, there's no need for the socket to linger.
I can't say for sure if I need both linger and immediate, but they both seemed appropriate to my use case; they might help yours. With these options set, my number of open files does not grow infinitely.
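A hedged sketch of how these options, plus the per-context socket cap mentioned earlier, might be applied with the classic cppzmq API (newer cppzmq versions spell setsockopt differently; the function names and the values 4096 and 1000 are illustrative):

```cpp
#include <zmq.hpp>

// Raise the per-context socket cap. Note: this must run before the first
// socket is created on this context, or it has no effect.
void raiseSocketCap(zmq::context_t &context) {
    zmq_ctx_set((void *) context, ZMQ_MAX_SOCKETS, 4096);
}

// Apply the leak-avoiding options to a freshly created socket.
void configureSocket(zmq::socket_t &socket) {
    int linger = 0;            // don't keep the fds alive to flush unsent messages
    socket.setsockopt(ZMQ_LINGER, &linger, sizeof(linger));

    int rcv_timeout_ms = 1000; // fail a recv instead of blocking forever
    socket.setsockopt(ZMQ_RCVTIMEO, &rcv_timeout_ms, sizeof(rcv_timeout_ms));

    int immediate = 1;         // only queue messages on completed connections
    socket.setsockopt(ZMQ_IMMEDIATE, &immediate, sizeof(immediate));
}
```

With ZMQ_LINGER at 0, a socket's file descriptors are released as soon as the socket is closed (or destroyed at scope exit) instead of being held open to flush pending messages, which is the mechanism behind the fd growth described in the question.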

Related

Qt QTcpSocket Reading Data Overlap Causes Invalid TCP Behavior During High Bandwidth Reading and Writing

Summary: some of the data read from the TCP socket is overwritten by other incoming data.
Application:
A client/server system that uses TCP within Qt (QTcpSocket and QTcpServer). The client requests a frame from the server (just a simple string message), and the server responds with that frame (614400 bytes for testing purposes). Frame sizes are established in advance and are fixed.
Implementation Details:
From the guarantees of the TCP protocol (Server -> Client), I know that I should be able to read the 614400 bytes from the socket and that they arrive in order. If either of these two things fails, the connection must have failed.
Important Code:
Assuming the socket is connected.
This code requests a frame from the server. Known as the GetFrame() function.
// Prompt the server to send a frame over
if (socket->isWritable() && !is_receiving) { // Validate that socket is ready
    is_receiving = true; // Forces only one request to go out at a time
    qDebug() << "Getting frame from socket..." << image_no;
    int written = SafeWrite((char*)"ReadyFrame"); // Writes then flushes the write buffer
    if (written == -1) {
        qDebug() << "Failed to write...";
        return temp_frame.data();
    }
    this->SocketRead();
    is_receiving = false;
}
qDebug() << image_no << "- Image Received";
image_no++;
return temp_frame.data();
This code waits for the frame just requested to be read. This is the SocketRead() function
size_t byte_pos = 0;
qint64 bytes_read = 0;
do {
    if (!socket->waitForReadyRead(500)) { // If it timed out, return existing frame
        if (!(socket->bytesAvailable() > 0)) {
            qDebug() << "Timed Out" << byte_pos;
            break;
        }
    }
    bytes_read = socket->read((char*)temp_frame.data() + byte_pos, frame_byte_size - byte_pos);
    if (bytes_read < 0) {
        qDebug() << "Reading Failed" << bytes_read << errno;
        break;
    }
    byte_pos += bytes_read;
} while (byte_pos < frame_byte_size && is_connected); // While we still have more pixels
qDebug() << "Finished Receiving Frame: " << byte_pos;
As shown in the code above, I read until the frame is fully received (where the number of bytes read is equal to the number of bytes in the frame).
The issue that I'm having is that the QTcpSocket read operation is skipping bytes in ways that are not in line with the guarantees of the TCP protocol. Since I skip bytes I end up not reaching the end of the while loop and just "Time Out". Why is this happening?
What I have done so far:
The data that the server sends is directly converted into uint16_t (short) integers, which are used in other parts of the client. I changed the server to simply output data that counts up, adding one for each number sent. Since the data type is uint16_t and the number of bytes exceeds the maximum for that integer type, the values wrap around every 65536.
This is data-visualization software, so this debugging configuration (on the client side) produces a graphic of the received values.
I have determined (as you can see a little at the bottom of the graphic) that some bytes are being skipped. In the memory of temp_frame it is possible to see the exact point at which the data skips.
Under correct circumstances, this should count up sequentially.
From Wireshark, following this specific TCP connection, I have determined that all of the bytes are in fact arriving (all 614400) and that all the numbers are in order (I used a Python script to verify the counting was sequential).
This is work on an open source project so this is the whole code base for the client.
Overall, I don't see how I could be doing something wrong in this solution, all I am doing is reading from the socket in the standard way.
Caveat: This isn't a definitive answer to your problem, but some things to try (it's too large for a comment).
With (e.g.) GigE, your data rate is ~100MB/s. With a [total] amount of kernel buffer space of 614400, this will be refilled ~175 times per second. IMO, this is still too small. When I've used SO_RCVBUF [for a commercial product], I've used a minimum of 8MB. This allows a wide(er) margin for task switch delays.
Try setting something huge like 100MB to eliminate this as a factor [during testing/bringup].
First, it's important to verify that the kernel and NIC driver can handle the throughput/latency.
You may be getting too many interrupts per second, and the ISR prolog/epilog overhead may be too high. The NIC driver can implement polled vs. interrupt-driven operation with NAPI for Ethernet cards.
See: https://serverfault.com/questions/241421/napi-vs-adaptive-interrupts
See: https://01.org/linux-interrupt-moderation
Your process/thread may not have high enough priority to be scheduled quickly.
You can use the R/T scheduler with sched_setscheduler, SCHED_RR, and a priority of (e.g.) 8. Note: going higher than 11 kills the system because at 12 and above you're at a higher priority than most internal kernel threads--not a good thing.
You may need to disable IRQ balancing and set the IRQ affinity to a single CPU core.
You can then set your input process/thread locked to that core [with sched_setaffinity and/or pthread_setaffinity].
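A hedged sketch of the scheduling advice above under Linux (tryRealtime is an illustrative name; sched_setscheduler typically fails with EPERM unless the process has root or CAP_SYS_NICE, so failure is treated as non-fatal here):

```cpp
#include <sched.h>
#include <cerrno>
#include <cstdio>
#include <cstring>

// Query the legal SCHED_RR priority range, then try to switch the calling
// process to SCHED_RR at the requested priority. Returns false (without
// crashing) if the priority is out of range or the process lacks privilege.
bool tryRealtime(int priority) {
    int lo = sched_get_priority_min(SCHED_RR);
    int hi = sched_get_priority_max(SCHED_RR);
    if (priority < lo || priority > hi)
        return false;

    struct sched_param sp{};
    sp.sched_priority = priority;
    if (sched_setscheduler(0, SCHED_RR, &sp) != 0) {
        std::fprintf(stderr, "sched_setscheduler: %s\n", std::strerror(errno));
        return false;  // usually EPERM without root/CAP_SYS_NICE
    }
    return true;
}
```

On Linux the valid range is typically 1..99; per the advice above, staying well below the priorities of internal kernel threads (e.g. around 8) is the safer choice.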
You might need some sort of "zero copy" to bypass the kernel copying from its buffers into your userspace buffers.
You can mmap the kernel socket buffers with PACKET_MMAP. See: https://sites.google.com/site/packetmmap/
I'd be careful about the overhead of your qDebug output. It looks like an iostream-type implementation, and the overhead may be significant enough to slow things down noticeably.
That is, you're not measuring the performance of your system. You're measuring the performance of your system plus the debugging code.
When I've had to debug/trace such things, I've used a [custom] "event" log implemented with an in-memory ring queue with a fixed number of elements.
Debug calls such as:
eventadd(EVENT_TYPE_RECEIVE_START,some_event_specific_data);
Here eventadd populates a fixed-size "event" struct with the event type, event data, and a hi-res timestamp (e.g. a struct timespec from clock_gettime(CLOCK_MONOTONIC, ...)).
The overhead of each such call is quite low. The events are just stored in the event ring. Only the last N are remembered.
At some point, your program triggers a dump of this queue to a file and terminates.
This mechanism is similar to [and modeled on] a H/W logic analyzer. It is also similar to dtrace.
Here's a sample event element:
struct event {
    long long evt_tstamp;   // timestamp
    int evt_type;           // event type
    int evt_data;           // type specific data
};
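A minimal, hedged sketch of such a ring around that struct (the names eventadd, eventdump, and EVENT_RING_SIZE are illustrative; this version assumes a single producer and simply lets the ring wrap, overwriting the oldest entries):

```cpp
#include <atomic>
#include <cstdio>
#include <ctime>

struct event {
    long long evt_tstamp;   // timestamp (ns since an arbitrary monotonic epoch)
    int evt_type;           // event type
    int evt_data;           // type specific data
};

constexpr int EVENT_RING_SIZE = 4096;   // power of two keeps wraparound cheap
static event event_ring[EVENT_RING_SIZE];
static std::atomic<unsigned> event_head{0};

// Record one event; cheap enough to leave in hot paths.
void eventadd(int type, int data) {
    timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    unsigned slot = event_head.fetch_add(1, std::memory_order_relaxed) % EVENT_RING_SIZE;
    event_ring[slot] = { ts.tv_sec * 1000000000LL + ts.tv_nsec, type, data };
}

// Dump the ring at shutdown. Slots are printed in array order; after a
// wraparound, reconstruct chronological order from the timestamps.
void eventdump(FILE *fp) {
    unsigned n = event_head.load();
    unsigned count = n < EVENT_RING_SIZE ? n : EVENT_RING_SIZE;
    for (unsigned i = 0; i < count; ++i)
        std::fprintf(fp, "%lld %d %d\n", event_ring[i].evt_tstamp,
                     event_ring[i].evt_type, event_ring[i].evt_data);
}
```

The per-call cost is one clock read, one atomic increment, and one struct store, which is why this kind of logging distorts timing far less than qDebug-style streams.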

how to handle half open socket in middle of transfer?

/* Basic things have been done, */
/* like setting up the connection and receiving. */
namespace bap = boost::asio::placeholders;

void doWrite(char *buffer, size_t size_) {
    socket.async_write_some(boost::asio::buffer(buffer, size_),
        boost::bind(&Handler, this, bap::error, bap::bytes_transferred));
}

/* handler */
void handler(/* parameters */)
{
}
While my server is continuously transferring data, sometimes the client crashes (on purpose, for testing).
errorCode.message() reports boost::asio::error::bad_descriptor and the whole program crashes.
I copied the program from the Boost chat server example.
If the server is transmitting, say, 1024 bytes and the client closes in the middle of those 1024 bytes being written, the whole program crashes.
More Technical Wording:
how to handle half open socket in middle of transfer?
Be sure to do all operations either catching system_error or receiving (and checking) the error_code.
As you say, you already receive the boost::asio::error::bad_descriptor code, so I expect the program termination results from a subsequent action on the same socket that throws.
Look for the overloads that take a boost::system::error_code& parameter (like e.g. http://www.boost.org/doc/libs/1_66_0/doc/html/boost_asio/reference/basic_stream_socket/close.html).
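For illustration, a hedged sketch of tearing such a socket down with the non-throwing overloads (shutdownQuietly is an invented name): every operation receives an error_code to inspect, so a dead peer surfaces as a checked code instead of a system_error exception that escapes and terminates the program.

```cpp
#include <boost/asio.hpp>

// Close a possibly half-open socket without letting any exception escape.
void shutdownQuietly(boost::asio::ip::tcp::socket &socket) {
    boost::system::error_code ec;
    socket.shutdown(boost::asio::ip::tcp::socket::shutdown_both, ec);
    if (ec) { /* peer already gone (e.g. bad_descriptor): just log ec.message() */ }
    socket.close(ec);
    if (ec) { /* log and move on; never touch this socket again */ }
}
```

The same applies to writes: after a handler reports an error, stop issuing operations on that socket and dispose of it through a path like this one.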

How would one avoid race conditions from multiple threads of a server sending data to a client? C++

I was following a tutorial on YouTube on building a chat program using Winsock and C++. Unfortunately the tutorial never bothered to consider race conditions, and this causes many problems.
The tutorial had us open a new thread every time a new client connected to the chat server, which would handle receiving and processing data from that individual client.
void Server::ClientHandlerThread(int ID) // ID = the index in the SOCKET Connections array
{
    Packet PacketType;
    while (true)
    {
        if (!serverptr->GetPacketType(ID, PacketType)) // Get packet type
            break; // If there is an issue getting the packet type, exit this loop
        if (!serverptr->ProcessPacket(ID, PacketType)) // Process packet (packet type)
            break; // If there is an issue processing the packet, exit this loop
    }
    std::cout << "Lost connection to client ID: " << ID << std::endl;
}
When the client sends a message, the thread will process it and send it by first sending packet type, then sending the size of the message/packet, and finally sending the message.
bool Server::SendString(int ID, std::string &_string)
{
    if (!SendPacketType(ID, P_ChatMessage))
        return false;
    int bufferlength = _string.size();
    if (!SendInt(ID, bufferlength))
        return false;
    int RetnCheck = send(Connections[ID], _string.c_str(), bufferlength, NULL); // Send string buffer
    if (RetnCheck == SOCKET_ERROR)
        return false;
    return true;
}
The issue arises when two threads (two separate clients) simultaneously try to send a message to the same ID (the same third client). One thread may send the client the int packet type, so the client is now prepared to receive an int, but then the second thread sends a string (because that thread assumes the client is waiting for its string). The client cannot process this correctly, and the program becomes unusable.
How would I solve this issue?
One solution I had:
Rather than allow each thread to execute server commands on their own, they would set an input value. The main server thread would loop through all the input values from each thread and then execute the commands one by one.
However I am unsure this won't have problems of its own... If a client sends multiple messages within a single server loop, only one of the messages will be sent (since the new message would overwrite the previous one). Of course there are ways around this, such as arrays of input or faster loops, but it still poses a problem.
Another issue that I thought of was that a client with a lower ID would always end up having their message sent first each loop. This isn't that big of a deal but if there was a situation, say, a trivia game, where two clients entered the correct answer in the same loop then the client with the lower ID would end up saying the answer "first" every time.
Thanks in advance.
If all I/O is being handled through a central server, a simple (but certainly not elegant) solution is to create a barrier around the I/O mechanisms to each client. In the simplest case this can just be a mutex. Associate that barrier with each client and anytime someone wants to send that client something (a complete message), lock the barrier. Unlock it when the complete message is handled. That way only one client can actually send something to another client at a time. In C++11, see std::mutex.
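A minimal C++11 sketch of that per-client barrier (the names ClientChannel and SendWholeMessage, and the wire vector standing in for the real socket, are illustrative, not from the tutorial): each client owns a mutex, and a sender holds it for the entire type/length/payload sequence, so parts from two senders cannot interleave.

```cpp
#include <map>
#include <mutex>
#include <string>
#include <vector>

struct ClientChannel {
    std::mutex send_mutex;            // barrier: one complete message at a time
    std::vector<std::string> wire;    // stands in for the real socket stream
};

// Entries should be created before handler threads start; inserting into the
// map itself is not protected by the per-client mutexes.
std::map<int, ClientChannel> clients;

void SendWholeMessage(int id, const std::string &msg) {
    ClientChannel &ch = clients[id];
    std::lock_guard<std::mutex> lock(ch.send_mutex); // held for all three parts
    ch.wire.push_back("type:chat");                  // packet type
    ch.wire.push_back(std::to_string(msg.size()));   // length prefix
    ch.wire.push_back(msg);                          // payload
}
```

Because the lock covers the whole sequence, a receiver always sees type, then length, then exactly that many payload bytes, which is the invariant the tutorial's code breaks.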

multiple boost::asio ssl clients running on same system

I have a simple Boost ASIO SSL client which calls a web API. The client is a slight modification of the Boost SSL documentation example.
// http.h
class Http {
public:
    static void WebApiCall(...);
};

// http.cpp
void Http::WebApiCall(...) {
    try {
        // .......
        boost::asio::io_service io_service;
        tcp::resolver resolver(io_service);
        tcp::resolver::query query(serverip, serverport);
        tcp::resolver::iterator endpoint_iterator = resolver.resolve(query);
        boost::asio::ssl::context ctx(io_service, boost::asio::ssl::context::tlsv1); // ERROR # 1
        // ....
        // Setting SSL Context Properties Here
        // ....
        boost::shared_ptr<boost::asio::ssl::stream<tcp::socket> > ssocket(new boost::asio::ssl::stream<tcp::socket>(io_service, ctx));
        boost::asio::ip::tcp::endpoint endpoint = *endpoint_iterator;
        ssocket->lowest_layer().connect(endpoint);
        boost::system::error_code er;
        ssocket->handshake(boost::asio::ssl::stream_base::client, er);
        boost::asio::streambuf request;
        std::ostream request_stream(&request);
        // ....
        // Set Headers & Body of HTTP Request here
        // ....
        size_t written = 0;
        written = boost::asio::write(*ssocket, request); // ERROR # 2
        // .....
        // Read server response
        boost::asio::streambuf response;
        boost::system::error_code error;
        int read_bytes = 0;
        std::string TempBuf = "";
        std::ostringstream responseStringstream;
        std::stringstream response_stream;
        while (boost::asio::read(*ssocket, response, boost::asio::transfer_at_least(1), error)) {
            read_bytes = read_bytes + response.size();
            responseStringstream << &response;
        }
        // Do some stuff with server response....
        // ....
    } catch (const boost::system::system_error &error) {
        // Print the exception ..
    }
}

// client.cpp
Http::WebApiCall(<api_to_call>);
You can see its a simple HTTP client with one static function which implements the actual SSL enabled HTTP Client using ASIO.
Use Case:
1000 processes of this client are running on one machine. All processes make a POST request periodically (e.g. every minute) to one resource at approximately the same time. The machine runs Ubuntu and I do not seem to be out of memory (I have around 6 GB free).
This client works perfectly, except in one case: to simulate load on my server I launched 1000 processes of this client, all on one machine, all calling the same API on the same server using the same public certificates, except that every client has its own OAuth token¹. In this situation I am getting two types of exceptions:
Errors:
ERROR # 2: Some clients (NOT ALL) get an error while writing (write: short read). From different forums and the Boost sources it seems the server is sending SSL_Shutdown, causing ASIO to throw this error, which, as far as I can tell, is normal behavior. My question is: why is the server sending SSL_Shutdown at this point? Does this have anything to do with multiple processes calling the same resource from the same machine? According to the ASIO docs, ASIO SSL is not thread safe, but in this case I am running only one thread per process (which I believe is perfectly safe); besides, the above code is itself thread safe. Is the underlying OpenSSL behaving erratically?
ERROR # 1: Sometimes I get an exception while creating the Boost ASIO SSL context, simply saying "context: ssl error". Again the same thoughts: why does it behave like this? Does this have something to do with multiple processes; is OpenSSL mixing things up in this scenario?
My client has been running perfectly for the last year as one process per machine, and I have never seen these errors before. Any thoughts are appreciated.
¹ (Just mentioning OAuth, but I don't think it has anything to do with this.)
Q: Some clients (NOT ALL) get an error while writing (write: short read). From different forums and the Boost sources it seems the server is sending SSL_Shutdown, causing ASIO to throw this error, which, as far as I can tell, is normal behavior.
The most likely cause is that you're pushing the system beyond a resource limit; e.g. the client or the server may run out of file handles. On my Linux box the number of open files is limited to 1024 by default; ulimit -a outputs:
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 256878
max locked memory (kbytes, -l) unlimited
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 95
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 256878
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
Q: My question is: why is the server sending SSL_Shutdown at this point?
Most likely because of the above.
Q: Does this have anything to do with multiple processes calling the same resource from the same machine?
No.
Q: According to the ASIO docs, ASIO SSL is not thread safe, but in this case I am running only one thread per process (which I believe is perfectly safe); besides, the above code is itself thread safe. Is the underlying OpenSSL behaving erratically?
Neither thread safety nor the underlying SSL library is the issue here.
Q: Sometimes I get an exception while creating the Boost ASIO SSL context, simply saying "context: ssl error". Again the same thoughts: why does it behave like this? Does this have something to do with multiple processes; is OpenSSL mixing things up in this scenario?
It's unlikely, but possible, that each instance of ssl::context incurs overhead. You might try allocating it statically/outside the loop.
That said, it's more likely that the initialization of the SSL context simply runs into (the same) resource limit, as it will likely open some system configuration files and/or check for the existence of well-known paths (e.g. the CApath etc.)
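One hedged way to do the static allocation suggested above (sharedSslContext is an illustrative name; newer Boost versions construct ssl::context without an io_service, so adjust to your Boost version): build the context once per process and hand out a reference, so N requests per process open the CA files once rather than N times.

```cpp
#include <boost/asio/ssl.hpp>

// Process-wide SSL context, built lazily on first use. The function-local
// static makes initialization thread-safe in C++11 and later.
boost::asio::ssl::context &sharedSslContext() {
    static boost::asio::ssl::context ctx = [] {
        boost::asio::ssl::context c(boost::asio::ssl::context::tlsv12_client);
        c.set_default_verify_paths();   // reads CA files/paths once, not per call
        return c;
    }();
    return ctx;
}
```

WebApiCall would then take sharedSslContext() instead of constructing its own ctx, removing both the per-call file opens and the per-call "ERROR # 1" construction site.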

C++ non blocking socket select send too slow?

I have a program that maintains a list of "streaming" sockets. These sockets are configured to be non-blocking.
Currently I store these streaming sockets in a list. I have some data that I need to send to all of them, so I iterate over the list and call the send_TCP_NB function below for each one.
The issue is that my program's own buffer, which stores the data before it is handed to send_TCP_NB, slowly runs out of free space, indicating that sending is slower than the rate at which data is put into the buffer. Data is put into the buffer at about 1000 items per second, each item quite small, about 100 bytes.
Hence, I am not sure whether my send_TCP_NB function is working efficiently or correctly.
int send_TCP_NB(int cs, char data[], int data_length) {
    bool sent = false;
    FD_ZERO(&write_flags); // initialize the writer socket set
    FD_SET(cs, &write_flags); // set the write notification for the socket based on the current state of the buffer
    int status;
    int err;
    struct timeval waitd; // set the time limit for waiting
    waitd.tv_sec = 0;
    waitd.tv_usec = 1000;
    err = select(cs+1, NULL, &write_flags, NULL, &waitd);
    if (err == 0)
    {
        // time limit expired
        printf("Time limit expired!\n");
        return 0; // send failed
    }
    else
    {
        while (!sent)
        {
            if (FD_ISSET(cs, &write_flags))
            {
                FD_CLR(cs, &write_flags);
                status = send(cs, data, data_length, 0);
                sent = true;
            }
        }
        int nError = WSAGetLastError();
        if (nError != WSAEWOULDBLOCK && nError != 0)
        {
            printf("Error sending non blocking data\n");
            return 0;
        }
        else
        {
            if (nError == WSAEWOULDBLOCK)
            {
                printf("%d\n", nError);
            }
            return 1;
        }
    }
}
One thing that would help is if you thought out exactly what this function is supposed to do. What it actually does is probably not what you wanted, and has some bad features.
The major features of what it does that I've noticed are:
Modify some global state
Wait (up to 1 millisecond) for the write buffer to have some empty space
Abort if the buffer is still full
Send 1 or more bytes on the socket (ignoring how much was sent)
If there was an error (including the send decided it would have blocked despite the earlier check), obtain its value. Otherwise, obtain a random error value
Possibly print something to screen, depending on the value obtained
Return 0 or 1, depending on the error value.
Comments on these points:
Why is write_flags global?
Did you really intend to block in this function?
This is probably fine
Surely you care how much of the data was sent?
I do not see anything in the documentation that suggests that this will be zero if send succeeds
If you cleared up what the actual intent of this function was, it would probably be much easier to ensure that this function actually fulfills that intent.
That said
I have some data that I need to send to all these streaming sockets
What precisely is your need?
If your need is that the data must be sent before proceeding, then using a non-blocking write is inappropriate*, since you're going to have to wait until you can write the data anyways.
If your need is that the data must be sent sometime in the future, then your solution is missing a very critical piece: you need to create a buffer for each socket which holds the data that needs to be sent, and then you periodically need to invoke a function that checks the sockets to try writing whatever it can. If you spawn a new thread for this latter purpose, this is the sort of thing select is very useful for, since you can make that new thread block until it is able to write something. However, if you don't spawn a new thread and just periodically invoke a function from the main thread to check, then you don't need to bother. (just write what you can to everything, even if it's zero bytes)
*: At least, it is a very premature optimization. There are some edge cases where you could get slightly more performance by using the non-blocking writes intelligently, but if you don't understand what those edge cases are and how the non-blocking writes would help, then guessing at it is unlikely to get good results.
EDIT: as another answer implied, this is something the operating system is good at anyways. Rather than try to write your own code to manage this, if you find your socket buffers filling up, then make the system buffers larger. And if they're still filling up, you should really give serious thought to the idea that your program needs to block anyways, so that it stops sending data faster than the other end can handle it. i.e. just use ordinary blocking sends for all of your data.
Some general advice:
Keep in mind you are multiplying data: if you get 1 MB/s in, you output N MB/s with N clients. Are you sure your network card can take it? It gets worse with smaller packets, since you incur more overhead overall. You may want to consider broadcasting.
You are using non-blocking sockets, but you block while they are not ready. If you want to be non-blocking, it is better to discard the packet immediately if the socket is not ready.
What would be better is to select more than one socket at once. Do everything that you are doing, but for all the sockets that are available. You'll write to each "ready" socket, then repeat while there are sockets that are not ready. This way, you proceed with the available sockets first, and with some luck the busy sockets will become available in the meantime.
The while (!sent) loop is useless and probably buggy: since you are checking only one socket, FD_ISSET will always be true, and it is wrong to check FD_ISSET again after an FD_CLR.
Keep in mind that your OS has internal buffers for the sockets and that there are ways to extend them (not easy on Linux, though; to get large values you need to do some configuration as root).
There are some socket libraries that will probably work better than what you can implement in a reasonable time (boost::asio and zmq for the ones I know).
If you need to implement it yourself, (i.e. because for instance zmq has its own packet format), consider using a threadpool library.
EDIT:
Sleeping 1 millisecond is probably a bad idea. Your thread will probably get descheduled and it will take much more than that before you get some CPU time again.
This is just a horrible way to do things. The select serves no purpose but to waste time. If the send is non-blocking, it can mangle data on a partial send. If it's blocking, you still waste arbitrarily much time waiting for one receiver.
You need to pick a sensible I/O strategy. Here is one: Set all sockets non-blocking. When you need to send data to a socket, just call write. If all the data writes, lovely. If not, save the portion of data that wasn't sent for later and add the socket to your write set. When you have nothing else to do, call select. If you get a hit on any socket in your write set, write as many bytes as you can from what you saved. If you write all of them, remove that socket from the write set.
(If you need to write to a data that's already in your write set, just add the data to the saved data to be sent. You may need to close the connection if too much data gets buffered.)
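That strategy can be sketched in portable POSIX terms (the names OutChannel, queueSend, and flushPending are illustrative; a Winsock port would swap errno/EWOULDBLOCK for WSAGetLastError/WSAEWOULDBLOCK): write what the kernel accepts now, buffer the rest per socket, and flush when select reports writability.

```cpp
#include <sys/socket.h>
#include <fcntl.h>
#include <unistd.h>
#include <cerrno>
#include <cstddef>
#include <string>

struct OutChannel {
    int fd;               // non-blocking socket
    std::string pending;  // bytes not yet accepted by the kernel
};

// Queue data for a socket: send what we can immediately, buffer the rest.
void queueSend(OutChannel &ch, const char *data, size_t len) {
    if (ch.pending.empty()) {            // preserve ordering: only send directly
        ssize_t n = send(ch.fd, data, len, 0);  // if nothing is already queued
        if (n < 0) {
            if (errno != EWOULDBLOCK && errno != EAGAIN)
                return;                  // real error: drop data, close elsewhere
            n = 0;
        }
        data += n;
        len -= static_cast<size_t>(n);
    }
    ch.pending.append(data, len);        // leftovers wait for writability
}

// Called when select() reports ch.fd writable: flush as much as possible.
void flushPending(OutChannel &ch) {
    while (!ch.pending.empty()) {
        ssize_t n = send(ch.fd, ch.pending.data(), ch.pending.size(), 0);
        if (n <= 0)
            break;                       // would block again (or error): retry later
        ch.pending.erase(0, static_cast<size_t>(n));
    }
}
```

A socket goes into the select write set exactly when its pending buffer is non-empty and leaves it when the buffer drains; as the answer notes, a cap on pending size guards against a receiver that never catches up.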
A better idea might be to use a library that already does all these things. Boost::asio is a good one.
You are calling select() before calling send(). Do it the other way around. Call select() only if send() reports WSAEWOULDBLOCK, eg:
int send_TCP_NB(int cs, char data[], int data_length)
{
    int status;
    int err;
    struct timeval waitd;
    char *data_ptr = data;

    while (data_length > 0)
    {
        status = send(cs, data_ptr, data_length, 0);
        if (status > 0)
        {
            data_ptr += status;
            data_length -= status;
            continue;
        }

        err = WSAGetLastError();
        if (err != WSAEWOULDBLOCK)
        {
            printf("Error sending non blocking data\n");
            return 0; // send failed
        }

        FD_ZERO(&write_flags);
        FD_SET(cs, &write_flags); // set the write notification for the socket based on the current state of the buffer

        waitd.tv_sec = 0;
        waitd.tv_usec = 1000;

        status = select(cs+1, NULL, &write_flags, NULL, &waitd);
        if (status > 0)
            continue;

        if (status == 0)
            printf("Time limit expired!\n");
        else
            printf("Error waiting for time limit!\n");
        return 0; // send failed
    }
    return 1;
}