Winsock uncontrollably spawns several persistent threads - C++

I'm designing a networking framework which uses WSAEventSelect for asynchronous operations. I spawn one thread for every 64 sockets because of the limit of 64 events per wait, and everything works as expected except for one thing:
Threads keep getting spawned uncontrollably by Winsock during connect and disconnect, threads that won't go away.
With the current design of the framework, two threads should be running when only a few sockets are active. And as expected, two threads are running in total. However, when I connect with a few sockets (1-5 sockets), an additional 3 threads are spawned which persist until I close the application. Also, when I lose the connection on any of the sockets, 2 more threads are spawned (also persisting until closure). That's 7 threads in total, 5 of which I have no idea why they are there.
If Winsock needed them for connecting or whatever and they then disappeared, that would be fine. But it bothers me that they persist until I close my application.
Is there anyone who could shed some light on this? Possibly a solution to avoid these threads or force them to close when no connections are active?
(Application is written in C++ with Win32 and Winsock 2.2)
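For context, each network thread runs what is essentially the standard WSAEventSelect wait loop. A stripped-down sketch (not the actual framework code; per-event handling is omitted):

    // One network thread services up to 64 sockets; events[i] was created with
    // WSACreateEvent() and bound to sockets[i] via
    // WSAEventSelect(sockets[i], events[i], FD_READ | FD_WRITE | FD_CONNECT | FD_CLOSE).
    #include <winsock2.h>

    void NetworkThreadLoop(SOCKET* sockets, WSAEVENT* events, DWORD count)
    {
        for (;;) {
            DWORD wait = WSAWaitForMultipleEvents(count, events, FALSE, WSA_INFINITE, FALSE);
            if (wait == WSA_WAIT_FAILED)
                break;

            DWORD i = wait - WSA_WAIT_EVENT_0;                // index of the signalled socket
            WSANETWORKEVENTS ne;
            WSAEnumNetworkEvents(sockets[i], events[i], &ne); // also resets the event

            if (ne.lNetworkEvents & FD_READ)    { /* read the pending data */ }
            if (ne.lNetworkEvents & FD_CONNECT) { /* connect completed */ }
            if (ne.lNetworkEvents & FD_CLOSE)   { /* peer closed / connection lost */ }
        }
    }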
Information from Process Explorer:
Expected threads:
MyApp.exe!WinMainCRTStartup
MyApp.exe!Netfw::NetworkThread::ThreadProc
Unexpected threads:
ntdll.dll!RtlpUnWaitCriticalSection+0x2dc
mswsock.dll+0x7426
ntdll.dll!RtlGetCurrentPeb+0x155
ntdll.dll!RtlGetCurrentPeb+0x155
ntdll.dll!RtlGetCurrentPeb+0x155
All of the unexpected threads have call stacks with calls to functions such as ntkrnlpa.exe!IoSetCompletionRoutineEx+0x46e, which probably means they are part of the notification mechanism.

Download the Sysinternals tool Process Explorer. Install the appropriate Debugging Tools for Windows. In Process Explorer, set Options -> Symbols path to:
SRV*C:\Websymbols*http://msdl.microsoft.com/download/symbols
Where C:\Websymbols is just a place to store the symbol cache (I'd create a new empty directory for it.)
Now, you can inspect your program with process explorer. Double click the process, go to the threads tab, and it will show you where the threads started, how busy they are, and what their current callstack is.
That usually gives you a very good idea of what the threads are. If they're Winsock internal threads, I wouldn't worry about them, even if there are hundreds.

One direction to look in (just a guess): If these are TCP connections, these may be background threads to handle internal TCP-related timers. I don't know why they would use one thread per connection, but something has to do the background work there.

Related

Is there a cross-platform way to make a process based socket server in C++?

It's something that seems deceptively simple, but comes with a lot of nasty details and compatibility problems. I have some code that kinda works on Linux and... sorta works on Windows, but it's having various problems, for what seems like a common and simple task. I know async is all the rage these days, but I have good reasons to want a process per connection.
I'm writing a server that hosts simulation processes. So each connection is long-running and CPU intensive. But more importantly, these simulators (Ngspice, Xyce) have global state and sometimes segfault or reach unrecoverable errors. So it is essential that each connection has its own process so they can run/crash in parallel and not mess with each other's state.
Another semi-important detail is that the protocol is based on Capnp RPC, which has a nice cross-platform async API, but not a blocking one. So what I do is have my own blocking accept loop that forks a new process and then starts the Capnp event loop in the new process.
So I started with a simple accept loop, added a ton of ifdefs to support Windows, then added fork to make it multiprocess, and then added a SIGCHLD handler to try to avoid zombie processes. But Windows doesn't have fork, and if many clients disconnect simultaneously I still get zombies.
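For reference, the POSIX side of the pattern looks roughly like this (a simplified sketch, not my actual code; runCapnpEventLoop is a hypothetical placeholder for handing the fd to Capnp). The SIGCHLD handler reaps in a loop with WNOHANG because the signal is not queued, so a handler that waits only once can miss children that exit at the same time:

    #include <sys/wait.h>
    #include <sys/socket.h>
    #include <signal.h>
    #include <unistd.h>

    // Reap *all* exited children; SIGCHLD delivered while the handler runs is
    // coalesced, so a single wait() call can leave zombies behind.
    static void reap_children(int)
    {
        while (waitpid(-1, nullptr, WNOHANG) > 0) {}
    }

    void serve(int listen_fd)
    {
        struct sigaction sa {};
        sa.sa_handler = reap_children;
        sa.sa_flags = SA_RESTART | SA_NOCLDSTOP;   // restart accept() after the signal
        sigaction(SIGCHLD, &sa, nullptr);

        for (;;) {
            int client = accept(listen_fd, nullptr, nullptr);
            if (client < 0)
                continue;
            if (fork() == 0) {            // child: run the simulator / RPC loop
                close(listen_fd);
                // runCapnpEventLoop(client);  // hypothetical: pass the raw fd to the Capnp loop
                _exit(0);
            }
            close(client);                // parent keeps only the listening socket
        }
    }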
My current code lives here: https://github.com/NyanCAD/SimServer/blob/1ba47205904fe57196498653ece828c572579717/main.cpp
I'm fine with either some more ifdefs and hacks to make Windows work and avoid zombies, or some sort of library that either offers a ready made multiprocess socket server or functionality for writing such a thing. The important part is that it can accept a socket in a new process and pass the raw FD to the Capnp event loop.

Multithreaded application often hangs with signal 1

I have an application that uses pthreads and a pre-C++11 toolchain. We have several worker threads assigned to various purposes, and tasks are distributed in a producer-consumer way through a shared circular pool of task data. POSIX semaphores are used for inter-thread synchronization, both in wait/notify mode and as mutex locks on shared data to ensure mutual exclusion.
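The synchronization is essentially the textbook semaphore-guarded ring buffer, something like this simplified sketch (not the real code; the names are made up):

    #include <pthread.h>
    #include <semaphore.h>

    static const int POOL_SIZE = 64;
    struct Task { int payload; };

    static Task pool[POOL_SIZE];
    static int head = 0, tail = 0;
    static sem_t slots_free;    // counts empty slots
    static sem_t tasks_ready;   // counts queued tasks
    static pthread_mutex_t pool_lock = PTHREAD_MUTEX_INITIALIZER;

    void init_pool()
    {
        sem_init(&slots_free, 0, POOL_SIZE);
        sem_init(&tasks_ready, 0, 0);
    }

    void produce(const Task& t)
    {
        sem_wait(&slots_free);              // block while the pool is full
        pthread_mutex_lock(&pool_lock);
        pool[tail] = t;
        tail = (tail + 1) % POOL_SIZE;
        pthread_mutex_unlock(&pool_lock);
        sem_post(&tasks_ready);             // wake one consumer
    }

    Task consume()
    {
        sem_wait(&tasks_ready);             // block while the pool is empty
        pthread_mutex_lock(&pool_lock);
        Task t = pool[head];
        head = (head + 1) % POOL_SIZE;
        pthread_mutex_unlock(&pool_lock);
        sem_post(&slots_free);              // release the slot
        return t;
    }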
Recently I've noticed a strange problem with large volumes of data: the program seems to hang after receiving signal 1. Signal 1 is SIGHUP, which means hang-up; this signal is usually used to report that the user's terminal was disconnected, perhaps because a network or telephone connection was broken.
Could this be caused by the parent terminal timing out? If so, would nohup help?
This occurs only with large volumes of data (I didn't notice it with smaller volumes), and the application is run from the command line in a Solaris terminal (telnet session).
Thoughts, welcome.

QProcess ProcessState sufficient for Blocked Processes?

I want to know if a process (started with a QProcess class) doesn't respond anymore. For instance, my process is an application that only prints 1 every second.
My problem is that I want to know if (for some mystical reason), that process is blocked for a short period of time (more than 1 second, something noticeable by a human).
However, the different states of a QProcess (Not Running, Starting, Running) don't include a "Blocked" state.
I mean blocked as "Don't Answer to the OS" when we got the "Non Responding" message in the Task Manager. Such as when a Windows MMI (like explorer.exe) is blocked and becomes white.
But : I want to detect that "Not Responding" state for ANY processes. Not just MMI.
Is there a way to detect such a state?
Qt doesn't provide any API for that. You'd need to use platform-specific mechanisms. On some platforms (Windows!), there is no notion of a hung application, merely that of a hung window. You can have one application that has both responsive and unresponsive windows :)
On Windows, you'd enumerate all windows using EnumWindows, check if they belong to your process by comparing the pid from GetWindowThreadProcessId to process->pid(), and finally checking if the window is hung through IsHungAppWindow.
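A minimal sketch of that check (assuming you already have the target process id, e.g. from QProcess::processId(); error handling omitted):

    #include <windows.h>

    struct HungCheck {
        DWORD pid;           // process id of the QProcess child
        bool  hung;
    };

    static BOOL CALLBACK CheckWindowProc(HWND hwnd, LPARAM lParam)
    {
        HungCheck* check = reinterpret_cast<HungCheck*>(lParam);
        DWORD windowPid = 0;
        GetWindowThreadProcessId(hwnd, &windowPid);
        if (windowPid == check->pid && IsHungAppWindow(hwnd)) {
            check->hung = true;
            return FALSE;    // found a hung window, stop enumerating
        }
        return TRUE;         // keep enumerating
    }

    bool processHasHungWindow(DWORD pid)
    {
        HungCheck check = { pid, false };
        EnumWindows(CheckWindowProc, reinterpret_cast<LPARAM>(&check));
        return check.hung;
    }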
Caveats
Generally, there is no such thing as an all-encompassing notion of a "non-responding" process.
Suppose you have a web server. What does it mean that it's not responding? It's under heavy load, so it may deny some incoming connections. Is that "non responding" from your perspective? It may be, but there's nothing you can do about it - killing and restarting the process won't fix it. If anything, it will make things worse for the already connected clients.
Suppose you have a process that is blocking on a filesystem read because the particular drive it tries to access is slow, or under heavy load. Does it mean that it's not responding? Will killing and restarting it always fix this? If the process then retries the read from the beginning of the file, it may well make things worse.
Suppose you have a poorly designed process with a GUI. It's doing blocking serial port reads in the GUI thread. The read it's doing takes a long time, and the GUI is nonresponsive for several seconds. You kill the process, it restarts and tries that long read again - you've only made things worse.
You have to tread very carefully here.
Solution Ideas
There are multiple approaches to determining what is a "responsive" process. It was already mentioned that processes with a GUI are monitored by the operating system on both Windows and OS X. Thus one can use native APIs that can query whether a window or a process is hung or not. This makes sense for applications that offer a UI, and subject to caveats above.
If the process is providing a service, you may periodically use the service to determine if it's still available, subject to some deadlines. Any decisions as to what to do with a "hung" process should take into account the CPU and I/O load of the system.
It may be worthwhile to keep a history of the latency of the service's response to the service request. Only "large" changes to the latency should be taken as an indication of a problem. Suppose you're keeping track of the average latency. One could set an ultimate deadline at 50x the previous average latency; missing this deadline, the service is presumed dead and up for forced recycling. An "action flag" deadline may be set at 5-10x the average latency. A human would then be given the option to restart the service in an orderly fashion. The flag would be automatically removed when latency drops back to, say, 30% below the deadline that triggered the flag.
If you are the developer of the monitored process, then you can invert the monitoring aspect and become a passive watchdog of the monitored process. The monitored process must then periodically, actively "wake" the watchdog to indicate that it's alive. The emission of the wake signal (in generic terms) should be performed in strategic location(s) in the code. Periodic reception of wake "signals" should allow you to reason that the process is still alive. You may have multiple wake signals, tagged with the location in the watched process. Everything depends on how many threads the process has, what is it doing, etc.
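A bare-bones version of such a watchdog (hypothetical names, C++11 used for brevity): the watched process calls kick() from strategic locations, and the monitoring side calls expired() periodically.

    #include <atomic>
    #include <chrono>

    class Watchdog {
    public:
        explicit Watchdog(std::chrono::milliseconds timeout)
            : timeout_(timeout), last_(now()) {}

        // Called by the watched code: "I'm still alive."
        void kick() { last_.store(now(), std::memory_order_relaxed); }

        // Called by the monitor: has the watched code gone quiet for too long?
        bool expired() const {
            return now() - last_.load(std::memory_order_relaxed) > timeout_.count();
        }

    private:
        static long long now() {
            using namespace std::chrono;
            return duration_cast<milliseconds>(steady_clock::now().time_since_epoch()).count();
        }
        std::chrono::milliseconds timeout_;
        std::atomic<long long> last_;
    };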

Best approach for writing a Linux server in C (pthreads, select or fork?)

I've got a very specific question about server programming in UNIX (Debian, kernel 2.6.32). My goal is to learn how to write a server which can handle a huge number of clients. My target is more than 30,000 concurrent clients (even though my colleague mentions that 500,000 are possible, which seems QUIIITEEE a huge amount :-)), but I really don't know what's possible, and that is why I ask here. So my first question: how many simultaneous clients are possible? Clients can connect whenever they want and get in contact with other clients and form a group (1 group contains a maximum of 12 clients). They can chat with each other, so the TCP/IP packet size varies depending on the message sent.
Clients can also send mathematical formulas to the server. The server will solve them and broadcast the answer back to the group. This is a quite heavy operation.
My current approach is to start up the server, then use fork to create a daemon process. The daemon process binds the socket fd_listen and starts listening. It is a while(1) loop. I use accept() to get incoming calls.
Once a client connects I create a pthread for that client which will run the communication. Clients get added to a group and share some memory together (needed to keep the group running), but still every client is running on a different thread. Getting the access to the memory right was quite a hassle but works fine now.
In the beginning of the program I read the /proc/sys/kernel/threads-max file and create my threads accordingly. The number of possible threads according to that file is around 5000 - far away from the number of clients I want to be able to serve.
Another approach I'm considering is to use select() and create sets. But the access time to find a socket within a set is O(N). This can be quite long if I have more than a couple of thousand clients connected. Please correct me if I am wrong.
Well, I guess I need some ideas :-)
Groetjes
Markus
P.S. I tag it for C++ and C because it applies to both languages.
The best approach as of today is an event loop like libev or libevent.
In most cases you will find that one thread is more than enough, but even if it isn't, you can always have multiple threads with separate loops (at least with libev).
Libev[ent] uses the most efficient polling solution for each OS (and anything is more efficient than select or a thread per socket).
You'll run into a couple of limits:
fd_set size: This is changeable at compile time, but has quite a low limit by default; this affects select-based solutions.
Thread-per-socket will run out of steam far earlier - I suggest putting the long calculations in separate threads (with pooling if required), but otherwise a single-threaded approach will probably scale.
To reach 500,000 you'll need a set of machines, and round-robin DNS I suspect.
TCP ports shouldn't be a problem, as long as the server doesn't connect back to the clients. I always seem to forget this, and have to be reminded.
File descriptors themselves shouldn't be too much of a problem, I think, but getting them into your polling solution may be more difficult - certainly you don't want to be passing them in each time.
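To make that concrete, a minimal libev accept/read loop might look roughly like this (a sketch only: no error handling, a single loop, and the heavy formula work would be pushed to a worker pool):

    // Build with something like: g++ server.cpp -lev
    #include <ev.h>
    #include <netinet/in.h>
    #include <sys/socket.h>
    #include <unistd.h>

    static void client_cb(EV_P_ ev_io *w, int /*revents*/)
    {
        char buf[4096];
        ssize_t n = read(w->fd, buf, sizeof buf);
        if (n <= 0) {                       // disconnect or error
            ev_io_stop(EV_A_ w);
            close(w->fd);
            delete w;
            return;
        }
        // ... parse the chat message / formula and reply ...
    }

    static void accept_cb(EV_P_ ev_io *w, int /*revents*/)
    {
        int fd = accept(w->fd, nullptr, nullptr);
        if (fd < 0) return;
        ev_io *client = new ev_io;          // one small watcher per connection
        ev_io_init(client, client_cb, fd, EV_READ);
        ev_io_start(EV_A_ client);
    }

    int main()
    {
        int listen_fd = socket(AF_INET, SOCK_STREAM, 0);
        sockaddr_in addr {};
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = INADDR_ANY;
        addr.sin_port = htons(5000);
        bind(listen_fd, reinterpret_cast<sockaddr*>(&addr), sizeof addr);
        listen(listen_fd, SOMAXCONN);

        struct ev_loop *loop = EV_DEFAULT;
        ev_io accept_watcher;
        ev_io_init(&accept_watcher, accept_cb, listen_fd, EV_READ);
        ev_io_start(loop, &accept_watcher);
        ev_run(loop, 0);
    }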
I think you can use the event model (epoll + a worker thread pool) to solve this problem.
First listen and accept in the main thread. When a client connects to the server, the main thread hands the client_fd to one worker thread and adds it to that worker's epoll list; that worker thread then handles the requests from the client.
The number of worker threads can be configured to suit the problem, and it must be no more than the ~5000 mentioned above.
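A rough sketch of that layout, where the main thread accepts and round-robins new connections onto per-worker epoll instances (error handling and the actual request processing are omitted):

    #include <sys/epoll.h>
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <unistd.h>
    #include <thread>
    #include <vector>

    static const int kWorkers = 4;          // tune to core count, not client count

    static void worker(int epfd)
    {
        epoll_event events[256];
        char buf[4096];
        for (;;) {
            int n = epoll_wait(epfd, events, 256, -1);
            for (int i = 0; i < n; ++i) {
                int fd = events[i].data.fd;
                ssize_t r = read(fd, buf, sizeof buf);
                if (r <= 0) { close(fd); continue; }   // peer gone; close() also drops it from epoll
                // ... parse the request; push heavy math to a separate pool ...
            }
        }
    }

    int main()
    {
        int listen_fd = socket(AF_INET, SOCK_STREAM, 0);
        sockaddr_in addr {};
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = INADDR_ANY;
        addr.sin_port = htons(5000);
        bind(listen_fd, reinterpret_cast<sockaddr*>(&addr), sizeof addr);
        listen(listen_fd, SOMAXCONN);

        std::vector<int> epfds;
        std::vector<std::thread> workers;
        for (int i = 0; i < kWorkers; ++i) {
            epfds.push_back(epoll_create1(0));
            workers.emplace_back(worker, epfds.back());
        }

        for (int next = 0;; next = (next + 1) % kWorkers) {
            int client = accept(listen_fd, nullptr, nullptr);
            if (client < 0) continue;
            epoll_event ev {};
            ev.events = EPOLLIN;
            ev.data.fd = client;
            epoll_ctl(epfds[next], EPOLL_CTL_ADD, client, &ev);  // hand the fd to a worker
        }
    }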

Why can't my MFC app exit completely?

I made an MFC application which basically has two threads: one for receiving data from a socket using the UDP protocol, and one is the main thread of the MFC app. Whenever data is received, some objects created in the main thread with the new operator are notified to fetch the data, using the observer design pattern. The problem is that sometimes after I click the close system button, the GUI of the app disappears, but its process can still be found in the Task Manager. If I stop the data source (the UDP client) this problem never happens. Other important and maybe helpful information is listed below:
The observer design pattern was implemented with the STL list container. I use critical-section protection in the Attach, Detach and Notify functions.
I deleted the observer objects before closing the UDP socket.
The data may arrive a little faster than it can be processed, because after closing the data source the data processing keeps working for a while.
I can't figure out why my app cannot exit completely. Please give me some clues.
This is usually caused by a thread that you created but did not exit programmatically when you exit the application. There is probably a while loop in your thread. The way to find where it is still running is:
Start your application in debug mode and click the exit button in the top right corner to exit it.
Check in Task Manager whether it is still running.
If it is, execute Debug -> Break All.
Open the Threads window and double-click each thread; you will find where your code is still looping.
Typically a process won't terminate because there's still a foreground thread running somewhere. You must ensure that your socket library isn't running any thread when you want to close your application.
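One common fix, assuming the receive thread is blocked in recvfrom: close the socket from the main thread (which makes the blocking call fail), let the thread see a stop flag and return, and only then let the app finish closing. A sketch with made-up member names:

    #include <winsock2.h>
    #include <atomic>

    class CReceiver {
    public:
        void Stop();
        UINT ReceiveLoop();                       // body of the worker thread
    private:
        std::atomic<bool> m_stop { false };
        SOCKET  m_socket  = INVALID_SOCKET;
        HANDLE  m_hThread = nullptr;              // handle of the receive thread
    };

    void CReceiver::Stop()                        // call this before the app exits
    {
        m_stop = true;                            // ask the loop to finish
        closesocket(m_socket);                    // unblocks a recvfrom() in progress
        m_socket = INVALID_SOCKET;
        WaitForSingleObject(m_hThread, 5000);     // wait for the thread to return
        CloseHandle(m_hThread);
    }

    UINT CReceiver::ReceiveLoop()                 // invoked from the thread entry point
    {
        char buf[2048];
        while (!m_stop) {
            int n = recvfrom(m_socket, buf, sizeof buf, 0, nullptr, nullptr);
            if (n == SOCKET_ERROR)                // socket was closed, or a real error
                break;
            // ... notify the observers with (buf, n), i.e. your existing Notify() ...
        }
        return 0;                                 // thread exits; the process can now terminate
    }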
First of all, with MFC, please use the notification-based methods to get notified of message arrivals, connections, etc. That way you can get rid of the threads if you have any.
It's quite easy to attach a debugger, break, and see which threads exist and what they are waiting for.
Alternatively you can use Process Explorer with a proper symbol configuration to see the call stacks of the threads of the particular process.
The application can have two kinds of issues preventing exit: one is an infinite loop, and the other is waiting/deadlock (e.g. a socket read is a blocking call). You can easily track down the problem by attaching a debugger.
Otherwise, please provide further information about the threads and a code snippet if possible.