The scenario is the following:
I have an XMLRPC-C++ application, listening for connections on PORT=8081. It implements an Abyss server, using the xmlrpc-c library like this:
xmlrpc_c::serverAbyss myAbyssServer(
    myRegistry,    // registry of method handlers
    port,          // 8081
    "xmlrpc_log"   // log file name
);
When I create multiple connections from a script calling many XMLRPC methods, it works fine.
The script is something like this:
Script1:
rpc.method1(parameters);
rpc.method2(parameters);
rpc.methodN(parameters);
If I check connections on the server with netstat and the xmlrpc_log while this script is executing, the output is something like XMLRPC-SERVER:8081 XMLRPC-CLIENT:<some port> TIME_WAIT. Though the XMLRPC-CLIENT IP is the same, every rpc.method call creates a new connection.
The problem appears when I execute two of these scripts on the same client. That is, the call rpc.methodM(parameters) in one script executes simultaneously with rpc.methodN(parameters) in the other script, on the same client.
This crashes the server, and XMLRPC-SERVER stays down until I restart the process.
I read the Abyss help, and the runOnce() method will not help. Calling the constructor as above, the Abyss server defaults are MaxConnections=30 and a timeout of 15 seconds.
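For what it's worth, if I understand the help correctly, the constrOpt() form of the constructor would let me change those defaults; a minimal sketch (the value 64 is just an example):

#include <xmlrpc-c/registry.hpp>
#include <xmlrpc-c/server_abyss.hpp>

int main() {
    xmlrpc_c::registry myRegistry;
    // ... add the method handlers to myRegistry here ...

    // constrOpt() exposes the knobs the plain constructor hides,
    // e.g. maxConn (default 30).
    xmlrpc_c::serverAbyss myAbyssServer(
        xmlrpc_c::serverAbyss::constrOpt()
            .registryP(&myRegistry)
            .portNumber(8081)
            .logFileName("xmlrpc_log")
            .maxConn(64));   // example value

    myAbyssServer.run();     // serves until killed
    return 0;
}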
Is there some configuration to avoid this crash? I need to support more than one client at a time and many simultaneous connections.
Thanks for any help related to this,
Sincerely,
Luchux.
Well, apparently the server is handling the multiple connections and supporting multithreading with pthreads. The problem had to be in my code executed by the RPC calls; I guessed a reentrancy/thread-safety problem.
After a break working on another project, I came back to this code, and the problem was indeed in a Natural Language library with some non-reentrant methods. They solved it, I solved it :)
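For anyone hitting the same class of bug: the usual workaround, until the library itself is fixed, is to serialize every call into the non-reentrant code. A minimal sketch with pthreads (nl_parse is a placeholder standing in for the real library call):

#include <pthread.h>

// Stub standing in for the real, non-reentrant library call.
char* nl_parse(const char* /*text*/) { return 0; }

static pthread_mutex_t nlLock = PTHREAD_MUTEX_INITIALIZER;

// All RPC worker threads funnel through this wrapper, so the
// non-thread-safe code never runs concurrently with itself.
char* safe_nl_parse(const char* text) {
    pthread_mutex_lock(&nlLock);
    char* result = nl_parse(text);
    pthread_mutex_unlock(&nlLock);
    return result;
}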
Related
I am using boost::asio for network programming and am running into timing issues. The issue is currently mostly on the client side.
The protocol begins with the server returning a date-time string to the client, which reads it. Up to that point it works fine. But I also want to be able to write commands to the server, which then processes them. To accomplish this I use the io_service.post() function, as shown below.
io_service.post(boost::bind(...)); // the bound function calls async_write()
For some reason the write attempt happens before the initial client/server communication, when the socket has not been created yet, and I get a bad socket descriptor error.
The io_service's run() method is indeed called in another thread.
When I place a sleep(2) call before the post, it works fine.
Is there a way to synchronize this, so that the socket is created before any posted calls are executed?
When creating the socket and establishing the connection with boost::asio, you can specify a handler to be called when those operations have either completed or failed. So you should trigger your "posted call" from the success callback.
Relevant methods and classes are:
boost::asio::ip::tcp::resolver::async_resolve(...)
boost::asio::ip::tcp::socket::async_connect(...)
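A minimal sketch of that callback chain (the class and handler names are placeholders, and the host/port/command are arbitrary): the write is issued only from the connect handler, so it can never race the socket creation.

#include <boost/asio.hpp>
#include <boost/bind.hpp>
#include <string>

using boost::asio::ip::tcp;

class Client {
public:
    explicit Client(boost::asio::io_service& io)
        : resolver_(io), socket_(io), command_("STATUS\n") {}

    void start(const std::string& host, const std::string& port) {
        resolver_.async_resolve(tcp::resolver::query(host, port),
            boost::bind(&Client::on_resolve, this,
                        boost::asio::placeholders::error,
                        boost::asio::placeholders::iterator));
    }

private:
    void on_resolve(const boost::system::error_code& ec,
                    tcp::resolver::iterator it) {
        if (!ec)
            socket_.async_connect(*it,
                boost::bind(&Client::on_connect, this,
                            boost::asio::placeholders::error));
    }

    // Only once we are here does the socket descriptor exist, so this
    // is the earliest safe place to issue writes.
    void on_connect(const boost::system::error_code& ec) {
        if (!ec)
            boost::asio::async_write(socket_, boost::asio::buffer(command_),
                boost::bind(&Client::on_write, this,
                            boost::asio::placeholders::error,
                            boost::asio::placeholders::bytes_transferred));
    }

    void on_write(const boost::system::error_code&, std::size_t) {}

    tcp::resolver resolver_;
    tcp::socket socket_;
    std::string command_;
};

int main() {
    boost::asio::io_service io;
    Client c(io);
    c.start("localhost", "12345");  // placeholder host/port
    io.run();                       // dispatches all the handlers above
}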
I think the link below will give you some help:
http://www.boost.org/doc/libs/1_42_0/doc/html/boost_asio/reference/io_service.html
In every single tutorial and example I have seen on the internet for Linux/Unix socket programming, the server-side code always involves an infinite loop that keeps checking for client connections.
Example:
http://www.thegeekstuff.com/2011/12/c-socket-programming/
http://tldp.org/LDP/LG/issue74/tougher.html#3.2
Is there a more efficient way to structure the server-side code so that it does not involve an infinite loop, or a way to code the infinite loop so that it takes up fewer system resources?
The infinite loop in those examples is already efficient. The call to accept() is a blocking call: the function does not return until there is a client connecting to the server. Execution of the thread which called accept() is halted, and does not take any processing power.
Think of accept() as a call to join() or a wait on a mutex/lock/semaphore.
Of course, there are many other ways to handle incoming connections, but those other ways deal with the blocking nature of accept(). This function is difficult to cancel, so there exist non-blocking alternatives which allow the server to perform other actions while waiting for an incoming connection. One such alternative is select(). Other alternatives are less portable, as they involve low-level operating system calls to signal the connection through a callback function, an event, or some other asynchronous mechanism handled by the operating system.
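For illustration, a minimal sketch of the select() approach (single thread; the port and the echo handling are placeholders, and error checks are omitted):

#include <sys/select.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>

int main() {
    int lfd = socket(AF_INET, SOCK_STREAM, 0);
    sockaddr_in addr = {};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = INADDR_ANY;
    addr.sin_port = htons(5000);
    bind(lfd, (sockaddr*)&addr, sizeof addr);
    listen(lfd, 16);

    fd_set master;
    FD_ZERO(&master);
    FD_SET(lfd, &master);
    int maxfd = lfd;

    for (;;) {                                 // still a loop, but it sleeps in select()
        fd_set ready = master;
        select(maxfd + 1, &ready, 0, 0, 0);    // blocks until something happens

        for (int fd = 0; fd <= maxfd; ++fd) {
            if (!FD_ISSET(fd, &ready)) continue;
            if (fd == lfd) {                   // new connection
                int cfd = accept(lfd, 0, 0);
                FD_SET(cfd, &master);
                if (cfd > maxfd) maxfd = cfd;
            } else {                           // data from an existing client
                char buf[512];
                ssize_t n = read(fd, buf, sizeof buf);
                if (n <= 0) { close(fd); FD_CLR(fd, &master); }
                else write(fd, buf, n);        // echo, standing in for real work
            }
        }
    }
}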
For C++ you could look into boost.asio. You could also look into e.g. asynchronous I/O functions. There is also SIGIO.
Of course, even when using these asynchronous methods, your main program still needs to sit in a loop, or the program will exit.
The infinite loop is there to maintain the server's running state, so that when a client connection is accepted, the server won't quit immediately afterwards but will instead go back to listening for another client connection.
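Concretely, the loop those tutorials build is essentially this sketch (error handling omitted; the port is arbitrary):

#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>

int main() {
    int lfd = socket(AF_INET, SOCK_STREAM, 0);
    sockaddr_in addr = {};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = INADDR_ANY;
    addr.sin_port = htons(5000);
    bind(lfd, (sockaddr*)&addr, sizeof addr);
    listen(lfd, 16);

    for (;;) {                        // "infinite" loop...
        int cfd = accept(lfd, 0, 0);  // ...but the thread sleeps here, using no CPU
        // handle the client, then go back to waiting
        close(cfd);
    }
}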
The accept() call is a blocking one - that is to say, it waits until a connection arrives. It does this in an extremely efficient way, using zero system resources (until a connection is made, of course), by making use of the operating system's network drivers, which trigger an event (or hardware interrupt) that wakes the listening thread up.
Here's a good overview of what techniques are available - The C10K problem.
When you are implementing a server that listens for a possibly unbounded number of connections, there is IMO no way around some sort of infinite loop. Usually this is not a problem at all, because when your socket is not marked as non-blocking, the call to accept() will block until a new connection arrives. Due to this blocking, no system resources are wasted.
Other libraries that provide an event-based system are ultimately implemented in the way described above.
In addition to what has already been posted, it's fairly easy to see what is going on with a debugger. You will be able to single-step through until you execute the accept() line, upon which the 'single-step' highlight will disappear and the app will run on - the next line is not reached. If you put a breakpoint on the next line, it will not fire until a client connects.
We need to follow best practice when writing client-server programs. The best guide I can recommend at this time is The C10K Problem. There is specific stuff we need to follow in this case: we can go for select, poll, or epoll. Each has its own advantages and disadvantages.
If you are running your code on a recent kernel version, then I would recommend going for epoll. A sample program is the quickest way to understand it; see the sketch below.
If you are using select, poll, or epoll, you will be blocked until you get an event/trigger, so your server will not spin in an infinite loop consuming your system's time.
From my personal experience, I feel epoll is the best way to go: I observed that the load on my server machine with 80k ACTIVE connections was much lower than with select and poll. The load average of my server machine was just 3.2 with 80k active connections :)
Testing with poll, my server's load average went up to 7.8 on reaching 30k active client connections :(.
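A minimal single-threaded epoll sketch (Linux-only; the port is arbitrary, error handling is omitted, and the echo stands in for real request handling):

#include <sys/epoll.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>

int main() {
    int lfd = socket(AF_INET, SOCK_STREAM, 0);
    sockaddr_in addr = {};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = INADDR_ANY;
    addr.sin_port = htons(5000);
    bind(lfd, (sockaddr*)&addr, sizeof addr);
    listen(lfd, SOMAXCONN);

    int epfd = epoll_create1(0);
    epoll_event ev = {};
    ev.events = EPOLLIN;
    ev.data.fd = lfd;
    epoll_ctl(epfd, EPOLL_CTL_ADD, lfd, &ev);

    epoll_event events[64];
    for (;;) {
        int n = epoll_wait(epfd, events, 64, -1);   // blocks; no busy loop
        for (int i = 0; i < n; ++i) {
            int fd = events[i].data.fd;
            if (fd == lfd) {                        // new client
                int cfd = accept(lfd, 0, 0);
                if (cfd < 0) continue;
                ev.events = EPOLLIN;
                ev.data.fd = cfd;
                epoll_ctl(epfd, EPOLL_CTL_ADD, cfd, &ev);
            } else {
                char buf[4096];
                ssize_t r = read(fd, buf, sizeof buf);
                if (r <= 0) close(fd);              // closed fds leave the set automatically
                else write(fd, buf, r);             // echo, standing in for real work
            }
        }
    }
}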
I have a server program which should run 24 hours a day. If I want to change some of its parameters, is there any way to do so other than shutting it down and restarting it?
There are quite a few ways of doing this, including, but almost certainly not limited to:
1. You can maintain the parameters in a separate file so that the program periodically checks that file and updates its internal information.
2. Similar to (1), but you can send some sort of signal to the application to get it to immediately re-read the file.
3. You can do either (1) or (2) but using shared memory rather than a configuration file.
4. You can have your program sit at the server end of an IPC conversation, so that a client can open up a connection to it to provide new parameters. Anything from a simple message queue to a full-blown HTTP server and associated pages.
Of course, all of these tend to need a fair amount of work in your program to get it to look for the new information.
You should take that into account when making your decision. By far the quickest solution to implement is to just (cleanly) kill off the process at something like 11:55pm then immediately restart it. It's simpler because your code probably already has the ability to load the information on startup, so this could be a simple cron one-liner.
Some people speak of laziness as a bad thing but that's not always the case :-)
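As a minimal sketch of option (2) above, assume the parameters live in a file and SIGHUP is the re-read trigger (the config path and load_config() are placeholders):

#include <csignal>
#include <unistd.h>

static volatile std::sig_atomic_t reload_requested = 0;

void on_sighup(int) { reload_requested = 1; }    // async-signal-safe: just set a flag

// Placeholder: however your server actually parses its parameters.
void load_config(const char* path) { (void)path; /* ... */ }

int main() {
    std::signal(SIGHUP, on_sighup);
    load_config("/etc/myserver.conf");           // hypothetical path

    for (;;) {
        if (reload_requested) {                  // picked up on the next pass
            reload_requested = 0;
            load_config("/etc/myserver.conf");
        }
        sleep(1);                                // stands in for the real serving loop
    }
}

Then kill -HUP <pid> makes the running server pick up the new parameters without a restart.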
If the server maintains many live connections from clients, restarting the server process is the last thing you should consider. Besides reloading configuration files, inserting a proxy process between the clients and the server is another option.
The proxy process is responsible for two things:
a. Maintaining the connections from clients and forwarding packets to the server for handling.
b. Judging whether the current server process (Server A) is alive and, if it is not, switching to another server (Server B) automatically.
Then you can change parameters by restarting one server at a time, without worrying about interrupting clients, since there are always two (or more) servers running.
I have a very specific question about server programming in UNIX (Debian, kernel 2.6.32). My goal is to learn how to write a server which can handle a huge number of clients. My target is more than 30,000 concurrent clients (even though my colleague mentions that 500,000 are possible, which seems QUIIITEEE a huge amount :-)), but I really don't know what's possible, and that is why I ask here. So my first question: how many simultaneous clients are possible? Clients can connect whenever they want, get in contact with other clients, and form a group (one group contains a maximum of 12 clients). They can chat with each other, so the TCP/IP packet size varies depending on the message sent.
Clients can also send mathematical formulas to the server. The server will solve them and broadcast the answer back to the group. This is a quite heavy operation.
My current approach is to start up the server, then use fork to create a daemon process. The daemon process binds the socket fd_listen and starts listening; it is a while(1) loop. I use accept() to get incoming calls.
Once a client connects, I create a pthread for that client which runs the communication. Clients get added to a group and share some memory together (needed to keep the group running), but still every client runs on a different thread. Getting the access to that memory right was quite a hassle, but it works fine now.
At the beginning of the program I read the /proc/sys/kernel/threads-max file, and according to that I create my threads. The number of possible threads according to that file is around 5000 - far from the number of clients I want to be able to serve.
Another approach I am considering is to use select() and create fd sets. But the time to find a socket within a set is O(N). This can get quite long if I have more than a couple of thousand clients connected. Please correct me if I am wrong.
Well, I guess I need some ideas :-)
Greetings,
Markus
P.S. I tagged it C++ and C because it applies to both languages.
The best approach as of today is an event loop like libev or libevent.
In most cases you will find that one thread is more than enough, but even if it isn't, you can always have multiple threads with separate loops (at least with libev).
Libev[ent] uses the most efficient polling solution for each OS (and anything is more efficient than select or a thread per socket).
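To make that concrete, a minimal libev sketch (assumes libev 4.x for ev_run; the port is arbitrary, error handling is omitted, and echoing stands in for real work):

#include <ev.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>
#include <cstdlib>

// One ev_io watcher per connected client; the loop wakes us only
// when a descriptor is actually readable.
static void client_cb(struct ev_loop* loop, ev_io* w, int) {
    char buf[4096];
    ssize_t n = read(w->fd, buf, sizeof buf);
    if (n <= 0) {                       // client gone: stop watching, free
        ev_io_stop(loop, w);
        close(w->fd);
        free(w);
    } else {
        write(w->fd, buf, n);           // echo, standing in for real work
    }
}

static void accept_cb(struct ev_loop* loop, ev_io* w, int) {
    int cfd = accept(w->fd, 0, 0);
    if (cfd < 0) return;
    ev_io* cw = (ev_io*)malloc(sizeof *cw);
    ev_io_init(cw, client_cb, cfd, EV_READ);
    ev_io_start(loop, cw);
}

int main() {
    int lfd = socket(AF_INET, SOCK_STREAM, 0);
    sockaddr_in addr = {};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = INADDR_ANY;
    addr.sin_port = htons(5000);
    bind(lfd, (sockaddr*)&addr, sizeof addr);
    listen(lfd, SOMAXCONN);

    struct ev_loop* loop = EV_DEFAULT;
    ev_io lw;
    ev_io_init(&lw, accept_cb, lfd, EV_READ);
    ev_io_start(loop, &lw);
    ev_run(loop, 0);                    // libev picks epoll/kqueue/... internally
}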
You'll run into a couple of limits:
fd_set size: this is changeable at compile time, but has quite a low limit by default; this affects select-based solutions.
Thread-per-socket will run out of steam far earlier - I suggest putting the long calculations in separate threads (with pooling if required), but otherwise a single-threaded approach will probably scale.
To reach 500,000 you'll need a set of machines, and round-robin DNS I suspect.
TCP ports shouldn't be a problem, as long as the server doesn't connect back to the clients. I always seem to forget this, and have to be reminded.
File descriptors themselves shouldn't be too much of a problem, I think, but getting them into your polling solution may be more difficult - certainly you don't want to be passing them in each time.
I think you can use the event model (epoll + a pool of worker threads) to solve this problem.
First listen and accept in the main thread; when a client connects to the server, the main thread distributes the client_fd to one worker thread and adds it to that worker's epoll list, and then that worker thread handles the requests from the client.
The number of worker threads can be configured to suit the problem, and it must stay well below the ~5000 thread limit.
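A condensed sketch of that layout: the main thread only accepts, and hands each client fd round-robin to a worker that runs its own epoll loop (the worker count, port, and echo handling are placeholder choices; error handling is omitted):

#include <sys/epoll.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>
#include <pthread.h>

static const int WORKERS = 4;
static int epfds[WORKERS];

static void* worker(void* arg) {
    int epfd = *(int*)arg;
    epoll_event events[64];
    char buf[4096];
    for (;;) {
        int n = epoll_wait(epfd, events, 64, -1);   // each worker blocks on its own set
        for (int i = 0; i < n; ++i) {
            int fd = events[i].data.fd;
            ssize_t r = read(fd, buf, sizeof buf);
            if (r <= 0) close(fd);        // peer gone: epoll drops the fd automatically
            else write(fd, buf, r);       // echo, standing in for real work
        }
    }
    return 0;
}

int main() {
    int lfd = socket(AF_INET, SOCK_STREAM, 0);
    sockaddr_in addr = {};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = INADDR_ANY;
    addr.sin_port = htons(5000);
    bind(lfd, (sockaddr*)&addr, sizeof addr);
    listen(lfd, SOMAXCONN);

    pthread_t tid[WORKERS];
    for (int i = 0; i < WORKERS; ++i) {
        epfds[i] = epoll_create1(0);
        pthread_create(&tid[i], 0, worker, &epfds[i]);
    }

    for (int next = 0;; next = (next + 1) % WORKERS) {
        int cfd = accept(lfd, 0, 0);               // main thread only accepts
        if (cfd < 0) continue;
        epoll_event ev = {};
        ev.events = EPOLLIN;
        ev.data.fd = cfd;
        epoll_ctl(epfds[next], EPOLL_CTL_ADD, cfd, &ev);  // round-robin hand-off
    }
}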
I have a simple C++ application that generates reports on the back end of my web app (simple LAMP setup). The problem is that the back end loads a data file that takes about 1.5GB in memory. This won't scale very well if multiple users are running it simultaneously, so my thought is to split it into several programs:
Program A is the main executable that is always running on the server, and always has the data loaded, and can actually run reports.
Program B is spawned from PHP, makes a simple request to program A to get the info it needs, and returns the data.
So my questions are these:
What is a good mechanism for B to ask A to do something?
How should it work when A has nothing to do? I don't really want to be polling for tasks or otherwise spinning my tires.
Use a named mutex/event. Basically, what this does is allow one thread (process A in your case) to sit there hanging out, waiting. Then process B comes along, needing something done, and signals the mutex/event; this wakes up process A, and you proceed.
If you are on Microsoft Windows:
Mutex, Event
IPC on Linux works differently, but has the same capability:
Linux Stuff
Alternatively, for the C++ portion you can use one of the Boost IPC libraries, which are multi-platform. I'm not sure what PHP has available, but it will no doubt have something equivalent.
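For instance, a minimal sketch with Boost.Interprocess's message_queue (the queue name, sizes, and request format are arbitrary choices): process A blocks in receive(), so it uses no CPU until B sends something. Run the A side first so the queue exists.

#include <boost/interprocess/ipc/message_queue.hpp>
#include <cstring>
#include <iostream>

using namespace boost::interprocess;

int main(int argc, char** argv) {
    // Same program run as "a" (server) or "b" (client), for brevity.
    if (argc > 1 && argv[1][0] == 'b') {
        message_queue mq(open_only, "report_requests");        // arbitrary name
        const char* req = "run_report_42";                     // placeholder request
        mq.send(req, std::strlen(req) + 1, 0);
        return 0;
    }

    message_queue::remove("report_requests");
    message_queue mq(create_only, "report_requests", 100, 256); // 100 msgs, 256 bytes each

    char buf[256];
    std::size_t recvd;
    unsigned int prio;
    for (;;) {
        mq.receive(buf, sizeof buf, recvd, prio);   // blocks: no polling, no spinning
        std::cout << "got request: " << buf << "\n";
        // ... run the report against the already-loaded 1.5GB data set ...
    }
}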
Use TCP sockets running on localhost.
Make the C++ application a daemon.
The PHP front-end creates a persistent connection to the daemon. pfsockopen
When a request is made, the PHP side sends a request to the daemon, which then processes it and sends it all back. PHP Sockets C++ Sockets
EDIT
Added some links for reference. I might have some really bad C code that uses sockets for interprocess communication somewhere, but nothing handy.
IPC is easy in C++; just call the POSIX C API.
But what you're asking for would be much better served by a queue manager. Make the background daemon wait for a message on the queue, and have the front-end PHP just add the specification of the task it wants processed. Some queue managers allow the result of the task to be added to the same object, or you can define a new queue for the completion messages.
One of the best-known high-performance queue managers is RabbitMQ. Another one that is very easy to use is MemcacheQ.
Or you could just add a table to MySQL for tasks, and have the background process query periodically for unfinished ones. This works and can be very reliable (such setups are sometimes called ghetto queues), but it breaks down at high tasks-per-second rates.