I am currently writing a server backend for my iOS game. The server is written in C++ and compiled & run on a remote Ubuntu server. I start the server through SSH using
sudo nohup ./mygameserver &
The server communication is over TCP. The main function with the read loop is written using the standard C socket.h/netdb.h/in.h headers, and uses select() to accept many users on a nonblocking listening socket.
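Roughly, the listening socket is set up like this (a simplified sketch of the idea, not my exact code; the names and port handling are illustrative):

// Sketch: create a nonblocking listening socket for the select() loop.
#include <sys/socket.h>
#include <netinet/in.h>
#include <fcntl.h>
#include <cstdint>

int make_listener(uint16_t port) {
    int lfd = socket(AF_INET, SOCK_STREAM, 0);
    if (lfd < 0) return -1;

    int yes = 1;
    setsockopt(lfd, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof(yes));

    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    if (bind(lfd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) < 0) return -1;
    if (listen(lfd, SOMAXCONN) < 0) return -1;

    // Nonblocking, so accept() never stalls the select() loop.
    fcntl(lfd, F_SETFL, fcntl(lfd, F_GETFL, 0) | O_NONBLOCK);
    return lfd;
}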
When I run the server in the foreground through SSH, everything seems to work fine: it receives all the packets I send, in the right order and with correct header info. When I use nohup and disconnect from SSH, however, everything seems to crash. A typical log from a user connect when the server runs detached under nohup looks like this:
CONNECTED TO SQL
Server starting...
Second thread initialized
Entering order loop
Got a new message of type: 0
Got a new message of type: 0
Got a new message of type: 0
Got a new message of type: 0
<this line continues ad infinitum>
I really have no idea why. I've made sure every print goes to nohup.out instead of std::cout, and the server sends an update to MySQL every 10 seconds to avoid timeouts.
Any ideas or input on what's wrong here? Let me know if you want some code samples, I just don't know which ones are interesting to this problem in particular. Thanks in advance.
I found out what was wrong.
In my server program I have a readSockets() function which is called after a call to select() in the main server loop. On startup, nohup pushed a newline character to stdin (for reasons I don't know), and since stdin is, in fact, also a FILE* connected to a file descriptor, my readSockets() function treated stdin as a connecting client.
This is what brought the server down: stdin was never drained, so it was "read" again every time select() returned, which in turn blocked the thread for the other users.
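The practical fix was to make sure that only socket descriptors ever end up in the read set, so fd 0 can never be mistaken for a client. A minimal sketch of what I mean (simplified, not my actual readSockets() code):

// Build the fd_set from the listening socket and accepted clients only.
#include <sys/select.h>
#include <vector>

int build_read_set(int listen_fd, const std::vector<int>& client_fds, fd_set* readfds) {
    FD_ZERO(readfds);
    FD_SET(listen_fd, readfds);            // the nonblocking listening socket
    int maxfd = listen_fd;
    for (int fd : client_fds) {            // fds previously returned by accept()
        FD_SET(fd, readfds);
        if (fd > maxfd) maxfd = fd;
    }
    // Note: stdin (fd 0) is deliberately never added to the set.
    return maxfd;                          // pass maxfd + 1 to select()
}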
First off, sorry that I am not able to provide a reduced example; at the moment that is beyond my ability, especially since the code that passes the file descriptors around isn't working cleanly. I only have a fair understanding of how the code works at a high level.
The question is essentially: in the following complicated example, if the end user presses Ctrl + C, which process receives the SIGINT, and why does it happen that way?
The application works on the command line interface (CLI, going forward). The user starts a client, which effectively sends a command to the server, prints some responses, and terminates. The server, upon request, finds a proper worker executable, fork-and-execs it, and waits for it. Then the server constructs the response and sends it back to the client.
There are, however, some complications. The client starts the server if the server process is not already running -- there is one server process per user. When the server is fork-and-exec'ed, the code just after fork() looks like this:
if (pid == 0) {
    daemon(0, 0);
    // do more set up and exec
}
Another complication, which might be more important, is that when the client sends a request over a unix socket (which looks like #server_name), the client appears to send the three file descriptors for its standard I/O, using techniques like this.
When the server fork-and-execs the worker executable, the server redirects the worker's standard I/O to the three file descriptors received from the client:
// just after fork(), in the child process' code
auto new_fd = fcntl(received_fd, F_DUPFD_CLOEXEC, 3);
dup2(new_fd, channel); // channel seems to be 0, 1, or 2
That piece of code runs for each of the three file descriptors. (The worker executable yet again creates a bunch of processes, but it does not pass STDIN on to its children.)
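For reference, my understanding is that the descriptor passing on the receiving side follows the usual SCM_RIGHTS recipe, roughly like this (a sketch, not the actual code; recv_fd is a made-up name):

// Sketch: receive one file descriptor over a unix socket via SCM_RIGHTS.
#include <sys/socket.h>
#include <sys/uio.h>
#include <cstring>

int recv_fd(int sock) {
    char dummy;
    iovec iov{ &dummy, 1 };                // at least one data byte must travel along
    char ctrl[CMSG_SPACE(sizeof(int))];
    msghdr msg{};
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl;
    msg.msg_controllen = sizeof(ctrl);

    if (recvmsg(sock, &msg, 0) <= 0) return -1;

    cmsghdr* cmsg = CMSG_FIRSTHDR(&msg);
    if (!cmsg || cmsg->cmsg_level != SOL_SOCKET || cmsg->cmsg_type != SCM_RIGHTS)
        return -1;

    int fd;
    std::memcpy(&fd, CMSG_DATA(cmsg), sizeof(fd));
    return fd;                             // e.g. the client's stdin/stdout/stderr
}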
The question is what happens if the end user presses Ctrl + C in the terminal. I thought the Bash shell takes it and generates & sends SIGINT to the processes that have a particular session ID, perhaps the same as the Bash shell's direct child process or Bash itself: the client, in this example.
However, it looks like the worker executable receives the signal, and I cannot confirm whether the client receives it. I do not think the server process receives the signal, but I cannot confirm that either. How could this happen?
If Bash takes the Ctrl + C first and delivers it to whatever processes, I would have thought the server had been detached from Bash (i.e. via daemon(0, 0)) and has nothing to do with the Bash process. I thought the server, and thus the worker processes, have different session IDs, and that is what it looked like when I ran ps -o.
It's understandable that user keyboard input (yes or no, etc.) could be delivered to the worker process. What I am not sure about is how Ctrl + C could be delivered to the worker process just by effectively sharing standard input. I would like to understand how this works.
P.S. Thank you for the answers and comments! The answer was really helpful. It sounded like the client must be getting the signal, and the worker process must be stopped by some other mechanism. Based on that, I looked into the code more deeply. It turned out that the client indeed catches the signal and dies, which breaks the socket connection. The server detects when the fd is broken and signals the corresponding worker process. That is why the worker process looked like it was getting the signal from the terminal.
It's not Bash that sends the signal, but the tty driver. It sends it to the foreground process group, meaning all processes in the foreground group receive it.
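If you want to see this for yourself, a small sketch like the one below prints each process's pid, process group, session, and the terminal's current foreground group, then waits for SIGINT; run it in the foreground and press Ctrl + C, and only members of the foreground group will report the signal (names here are illustrative):

// Demo: who is in the foreground process group, and who receives SIGINT.
#include <csignal>
#include <cstdio>
#include <unistd.h>

static volatile sig_atomic_t got_sigint = 0;
static void on_sigint(int) { got_sigint = 1; }

int main() {
    std::signal(SIGINT, on_sigint);
    std::printf("pid=%d pgid=%d sid=%d foreground-pgid=%d\n",
                (int)getpid(), (int)getpgrp(), (int)getsid(0),
                (int)tcgetpgrp(STDIN_FILENO));
    while (!got_sigint)
        pause();                         // Ctrl + C wakes us via the handler
    std::printf("pid=%d received SIGINT\n", (int)getpid());
    return 0;
}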
I've written a C application that grabs some sensor data and puts it into a string. This string gets passed to gammu-smsd-inject for transmission by SMSD. For reference, my application launches gammu-smsd-inject using fork() & wait(). The program waits for gammu-smsd-inject to terminate and then exits itself.
My program works just fine: if I run it manually from a bash prompt it grabs the sensor data, calls gammu-smsd-inject and quits. The sms appears in the database outbox and shortly after I receive an sms on my phone.
I've added the absolute path to my program into the RunOnReceive directive of SMSD. When I send a text to SMSD, it is received in the inbox, and from the log file I can see the daemon running my program. The log file then states that the process (my program) exited successfully (0), but I never receive any SMS and nothing is added to the database's outbox or sentitems tables.
Any idea what might be going on? I haven't posted a code listing as it's quite long, but it is available.
The only thing I can think of is that gammu-smsd-inject is perhaps being terminated (by a parent process somewhere up the tree) BEFORE it gets a chance to do any SQL work. Wouldn't that produce a non-zero exit code, though?
So the problem was which user was running the program. When I ran my application manually from bash, it was launched with my user ID, but when the SMSD daemon ran it, it was launched with a different ID, which was causing issues for some reason. I thought it was a problem with the user ID being used to access the MySQL database, but apparently not. In short, I don't actually know what the underlying problem was, but by assigning my login's UID to the child process, everything suddenly worked.
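For reference, what I ended up with looks roughly like the sketch below: after fork(), the child switches to my login's UID/GID before exec'ing gammu-smsd-inject, and the parent waits. The ids, the function name, and the message arguments are placeholders, and switching users this way only works if the parent has the privilege to do so:

// Sketch: run gammu-smsd-inject under a specific user id.
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int send_sms(const char* number, const char* text, uid_t target_uid, gid_t target_gid) {
    pid_t pid = fork();
    if (pid < 0) return -1;

    if (pid == 0) {                          // child
        setgid(target_gid);                  // set group first, then user
        setuid(target_uid);
        execlp("gammu-smsd-inject", "gammu-smsd-inject",
               "TEXT", number, "-text", text, (char*)nullptr);
        _exit(127);                          // only reached if exec fails
    }

    int status = 0;
    waitpid(pid, &status, 0);                // wait for the inject to finish
    return status;
}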
The scenario is the following:
I have an XMLRPC C++ application, listening for connections on PORT=8081. It implements an Abyss server, using the xmlrpc-c library as follows:
xmlrpc_c::serverAbyss myAbyssServer(
    myRegistry,   // handler of methods
    port,         // 8081
    "xmlrpc_log"
);
When I create multiple connections from a script calling many XMLRPC methods, it works fine.
The script is something like this:
Script1:
rpc.method1(parameters);
rpc.method2(parameters);
rpc.methodN(parameters);
If I check connections on the server with netstat and the xmlrpc_log while this script is executing, the output is something like XMLRPC-SERVER:8081 XMLRPC-CLIENT:SOME TIME_WAIT. Though the XMLRPC-CLIENT IP is the same, for every rpc.method call it creates a new connection.
The problem appears when I execute two of these scripts on the same client. That is, the rpc.methodM(parameters) call in one script executes simultaneously with the rpc.methodN(parameters) call in the other script, on the same client.
This crashes the server, and XMLRPC-SERVER stays down until I restart the process.
I read the Abyss help, and the runOnce() method will not help. By default, calling the constructor as above, the Abyss server's MaxConnections is 30 and the timeout is 15 seconds.
Is there some configuration to avoid this crash? I will need to support more than one client at the same time, and many simultaneous connections.
Thanks for any help related to this,
Sincerely,
Luchux.
Well, apparently the server is handling the multiple connections and supporting multithreading with pthreads. The problem must be in my code executed by the RPC calls; I'm guessing a reentrancy/thread-safety problem.
After a break working on another project, I came back to this code, and the problem was in a natural language library with some non-reentrant methods. They fixed it, I fixed it :)
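For anyone hitting something similar: until the library itself is fixed, serializing the non-reentrant calls with a mutex inside the RPC method is usually enough. A sketch of the idea (the method class and nlp_analyze() are made-up names, not the real library):

// Sketch: serialize calls into a non-reentrant library from a multithreaded Abyss server.
#include <xmlrpc-c/base.hpp>
#include <xmlrpc-c/registry.hpp>
#include <mutex>
#include <string>

std::string nlp_analyze(const std::string& input);    // hypothetical non-reentrant call

static std::mutex nlpMutex;                            // guards the library

class analyzeMethod : public xmlrpc_c::method {
public:
    void execute(xmlrpc_c::paramList const& params,
                 xmlrpc_c::value* const retvalP) override {
        std::string const input(params.getString(0));
        params.verifyEnd(1);

        std::lock_guard<std::mutex> lock(nlpMutex);    // one call at a time
        *retvalP = xmlrpc_c::value_string(nlp_analyze(input));
    }
};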
I'm trying to write an experimental server program that accepts a connection and sends a message to the client. I got the client to connect, but I can't seem to send the message without doing really odd things.
For example, in this snippet, conn is a connected socket:
int sendRes;
char buf[1024];
strcpy_s(buf,"Testing!");
sendRes = send(conn,buf,strlen(buf),0);
Well, when I connect to it via Telnet, it displays nothing and just quits. However, when I add the line cout << sendRes to the end of this snippet, it suddenly works and displays Testing! in Telnet, just like it should.
And so, I would like to ask anyone who knows, why is it acting like so?
Could it be that the telnet client itself is waiting for an end of line marker to display the incoming buffer?
Try writing your own client and using recv to see if anything is incoming.
Then again, new line might not have anything to do with it since the cout is on the local side.
Try checking RFC854 for the full telnet specification (or, again, simply write your own client).
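If you do want to rule out telnet quirks, a bare-bones test client along these lines will show exactly which bytes arrive (a sketch using the POSIX socket API; the address and port are placeholders, and on Windows the Winsock equivalents apply):

// Bare-bones test client: connect, then print whatever bytes arrive.
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>
#include <cstdio>

int main() {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) { std::perror("socket"); return 1; }

    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_port = htons(12345);                       // placeholder port
    inet_pton(AF_INET, "127.0.0.1", &addr.sin_addr);    // placeholder address

    if (connect(fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) < 0) {
        std::perror("connect");
        return 1;
    }

    char buf[1024];
    ssize_t n;
    while ((n = recv(fd, buf, sizeof(buf), 0)) > 0)
        std::fwrite(buf, 1, static_cast<size_t>(n), stdout);

    close(fd);
    return 0;
}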
I'm writing an application that is split into two parts for Mac OS X - a daemon and an agent. I'm using a standard unix socket to communicate between the daemon and the agents. That is, the socket is created with PF_UNIX and SOCK_STREAM.
When agents are created (whenever a user logs in), one of the first things it does is to connect to the socket. This seems to work perfectly for the first agent. However, when the second agent connects, the daemon experiences the following issue:
I'm using select() to check for data that can be read. The select() call succeeds and indicates that there is data to be read. However, when I call recv() it returns -1, and errno is set to 35, or "Resource temporarily unavailable".
Now, I would expect this for a non-blocking socket, but I have triple-checked - I never set the socket to be non-blocking.
As far as I can tell, this only happens when a second agent connects to the same unix socket. If I limit myself to one daemon and one agent then everything seems to work perfectly. What could be causing this odd behaviour?
It sounds a bit like you're trying to read from the wrong client fd. It's hard to tell without seeing your code, but it also sounds a bit that way from your description.
So just in case, here's how it works. Your server ends up with three file descriptors: the socket it first starts listening on, plus one file descriptor for each connected client. When there's something to read on the original socket, that means there's a new client; it sounds like you have this part right. Each connected client then gives you its own independent fd to read from and write to. Calling select() will return if any of these is ready to read; you then have to check each fd in the readfds variable from select() with FD_ISSET() to see if it actually has data to read.
You can see a basic example of this type of code here.
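In case that link ever goes away, here's the rough shape of the loop I mean (a sketch only, with error handling kept to a minimum):

// Sketch of the select() pattern: one listening fd, plus one fd per connected client.
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>
#include <vector>

void serve(int listen_fd) {
    std::vector<int> clients;

    for (;;) {
        fd_set readfds;
        FD_ZERO(&readfds);
        FD_SET(listen_fd, &readfds);
        int maxfd = listen_fd;
        for (int fd : clients) {
            FD_SET(fd, &readfds);
            if (fd > maxfd) maxfd = fd;
        }

        if (select(maxfd + 1, &readfds, nullptr, nullptr, nullptr) < 0)
            continue;

        // Readable listening socket means a brand-new client.
        if (FD_ISSET(listen_fd, &readfds)) {
            int c = accept(listen_fd, nullptr, nullptr);
            if (c >= 0) clients.push_back(c);
        }

        // Each connected client must be checked (and read) individually.
        for (auto it = clients.begin(); it != clients.end();) {
            if (FD_ISSET(*it, &readfds)) {
                char buf[512];
                ssize_t n = recv(*it, buf, sizeof(buf), 0);
                if (n <= 0) {                         // disconnect or error
                    close(*it);
                    it = clients.erase(it);
                    continue;
                }
                // ... handle n bytes from buf here ...
            }
            ++it;
        }
    }
}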