I have been stuck on this obscure problem for 2 days: I created a launch-at-boot application in C++ on a Debian system, which worked flawlessly until I integrated some multithreading elements.
There are only 2 threads (1 main and 1 child).
I included -lpthread and -pthread in the makefile.
I tried both the /.config/autostart and the .desktop file methods (same result).
The program is launched with sudo.
There is no error or crash anywhere; the main thread works OK, but the child thread runs only 1 iteration and then stops for some reason.
I even tried adding some sleep to the lxsession boot sequence.
If I launch the same command line as in the autostart file from a terminal (with sudo or not), it works perfectly.
It's been 2 days and I just have NO CLUE!
If someone has experienced this before or can find some logic in it, I'll be forever grateful.
It appears to me that you simply have ... a bug in your new logic. You have made an error in the design of your multi-threading logic, such that the child thread only runs one iteration. (Or, much more likely, it stalls in an infinite wait: it waits for an event that is never signaled, a semaphore that is never raised, a queue that runs dry and is never refilled, and so on.)
We can help you further if you post excerpts of the code in question ... only illustrating how the child thread is launched and how it interacts with the parent. (Condition-variables, semaphores, and so-forth, which is probably where the crux of your error lies.)
I would suggest that "all the other stuff is irrelevant." You don't need "a sleep in the boot-sequence" (if the sequence waits for your program to complete, and if it needs to). It seems to me that you simply have ... a bug in the new code that introduces multi-threading.
And you might wish to contemplate whether multi-threading is advantageous, given that you had a non-threaded version of the same thing that worked properly. If the processing that is to be done used to be done (successfully) by a single thread, such processing might or might not be more-advantageously processed by "n threads." Should you find-and-fix this bug, or is it just as well to abandon the change and revert back to what worked? Only you can decide that ...
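To make the "infinite wait" scenario above concrete, here is a minimal sketch of a child thread fed through a queue. It is not the asker's code: WorkQueue, push, close, pop and run_child_sum are hypothetical names chosen for illustration. The two details that usually prevent a stall are waiting with a predicate (so a notification sent before the wait starts is not lost) and having an explicit "closed" state (so the child can leave its loop).

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>

// Hypothetical work queue shared by the main thread and one child thread.
struct WorkQueue {
    std::mutex m;
    std::condition_variable cv;
    std::queue<int> items;
    bool closed = false;

    // Adds one item and wakes the consumer.
    void push(int v) {
        {
            std::lock_guard<std::mutex> lock(m);
            items.push(v);
        }
        cv.notify_one();
    }

    // Tells the consumer that no more items will arrive.
    void close() {
        {
            std::lock_guard<std::mutex> lock(m);
            closed = true;
        }
        cv.notify_all();
    }

    // Blocks until an item is available or the queue is closed.
    // Returns false once the queue is closed and fully drained.
    bool pop(int& out) {
        std::unique_lock<std::mutex> lock(m);
        // The predicate is re-checked under the lock, so a notify that
        // happened before this call cannot be missed.
        cv.wait(lock, [this] { return closed || !items.empty(); });
        if (items.empty()) return false;
        out = items.front();
        items.pop();
        return true;
    }
};

// The main thread pushes 1..n, the child thread sums what it receives.
int run_child_sum(int n) {
    WorkQueue q;
    int sum = 0;  // written only by the child, read after join()
    std::thread child([&] {
        int v;
        while (q.pop(v)) sum += v;
    });
    for (int i = 1; i <= n; ++i) q.push(i);
    q.close();
    child.join();
    return sum;
}
```

If the child in your program waits without a predicate, or nothing ever plays the role of close(), it can look exactly like "runs one iteration, then stops".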
Thank you all for your suggestions.
I found a "fix": running the startup program in a terminal ('#lxterminal -e url/to/program &' in the lxsession autostart file) instead of in the background seems to fix it SOMEHOW. There is no GUI though ... it is a service.
The multithreaded logic isn't at fault here (this is not my first attempt at it), and I really want to keep this feature (#Mike Robinson).
I will also reconsider the use of sudo as suggested, which seems sketchy all things considered. That might get it running in the background. Thanks # datenwolf.
Using tf::taskflow, I have some tasks (OpenGL) that need to execute in the main thread.
How can I make the library support this? Or is there a workaround?
My research
The issues were mentioned in https://github.com/taskflow/taskflow/issues/303
According to its developers (wiki link of the library), my issue can be solved by setting a property of the worker, but I don't understand how to apply it.
Another comment on Reddit suggests a workaround, but I don't think it is applicable to my case:
Unfortunately, this is not possible but you can always have an
executor with one worker thread and run your task graphs synchronously
with the master thread.
My poor workaround
First, I will sort the tasks that need to run in the main thread by "precede".
I will get an array {Task1,Task5}. Let the main thread run it.
Then, std::condition_variable objects will be used as gatekeepers to block/trigger execution.
In the example, I will use 3 condition_variables (red, orange and brown).
In my imagination, it looks nice.
But can it really work? I want to avoid significant performance loss.
In my speculation and fear, might the condition_variables conflict with, or be redundant with, what tf::taskflow already manages?
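The gatekeeper idea can be sketched with the standard library alone. This does not use the taskflow API at all: a plain std::thread stands in for the executor's worker, the Gate class and run_gated_tasks are hypothetical names, and the graph is reduced to Task1 (main thread), Task2 (worker), Task5 (main thread) with two gates named red and orange.

```cpp
#include <condition_variable>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// One-shot gate: wait() blocks until open() has been called at least once.
class Gate {
public:
    void open() {
        {
            std::lock_guard<std::mutex> lock(m_);
            open_ = true;
        }
        cv_.notify_all();
    }
    void wait() {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return open_; });
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    bool open_ = false;
};

// Hypothetical dependency chain: Task1 (main) -> Task2 (worker) -> Task5 (main).
// Returns the order in which the tasks actually ran.
std::vector<std::string> run_gated_tasks() {
    std::vector<std::string> order;
    std::mutex order_mutex;
    auto record = [&](const std::string& name) {
        std::lock_guard<std::mutex> lock(order_mutex);
        order.push_back(name);
    };

    Gate red;     // opened when Task1 is done
    Gate orange;  // opened when Task2 is done

    std::thread worker([&] {
        red.wait();      // Task2 must not start before Task1
        record("Task2");
        orange.open();   // allow the main thread to continue with Task5
    });

    record("Task1");     // runs on the calling (main) thread
    red.open();
    orange.wait();       // main thread blocks here until Task2 finished
    record("Task5");     // runs on the calling (main) thread

    worker.join();
    return order;
}
```

Each gate costs one mutex lock and one notify per edge that crosses between the main thread and a worker, so the overhead grows with the number of such edges rather than with the total number of tasks. Whether it interferes with taskflow's own scheduling is not something this sketch can show.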
I have a performance-sensitive program that I would like to run as stably as possible, so I want to disable or suspend MsMpEng.exe, among a few others, on Windows 10 when my program starts. When the program finishes, I'd like to restore normal operation.
I have tried directly suspending the process using resmon.exe (Resource Monitor), and it suspends... but 20-30 seconds later, the entire system just stops. I assume this is some form of self-protect... so at the very least, I'd have to suspend and resume in a timed loop.
Thoughts? Is it even worth the trouble?
EDIT: I gave it some thought and ran some test cases: just adjusting process priority isn't quite enough, but it's better than nothing. Unless anyone has other suggestions, I'll just recommend that people disable their virus protection if they encounter slowdowns.
I have built my first application using glibmm. I'm using a lot of threads as it does heavy processing. I have tried to follow the guidelines concerning multithreading, i.e. not doing any GUI updates from other threads than the one where g_main_loop is running.
I do a lot of graphics rendering in worker threads, but I only ever update a PixBuf, which is later drawn by the widget's on_draw() from the main loop.
All was fine as long as the data I render was read from files. When I started streaming data from a server which I render at regular intervals then the problems started.
Every now and then, especially when executing multiple instances of my application simultaneously, I see that the main thread takes 100% CPU time. Running strace on the process shows that g_main_loop has ended up in an endless loop calling poll:
poll([{fd=3, events=POLLIN}, {fd=4, events=POLLIN}, {fd=10, events=POLLIN}, {fd=8, events=POLLIN}], 4, 100) = 1 ([{fd=10, revents=POLLIN}])
In proc I get this for file-descriptor 10: 10 -> socket:[1132750]
The poll always returns immediately because file descriptor 10 has something to offer. This goes on forever, so I assume that the file descriptor is never read. The odd thing is that running 5 instances will almost always lead to all 5 ending up in the infinite poll loop after just a couple of minutes, while running only one instance seems to work for more than 30 minutes most of the time.
Why is this happening and is there any way to debug this?
My mistake was that I called queue_draw() from one of my worker threads. Given that the function is called "queue", I assumed it would queue a redraw which would later be executed by the g_main_loop. As it turned out, this was what broke the g_main_loop. I wish the libgtkmm reference manual had a little more detail about these multithreading restrictions.
My solution to the problem was adding a Glib::Dispatcher named queueRedraw to my widget and connecting it to the queue_draw() function:
queueRedraw.connect(sigc::mem_fun(*this, &MyWidgetClass::queue_draw));
Calling queueRedraw() from a worker thread signals the main thread to call the queue_draw() function.
I don't know if this is the best approach, but it solves the problem.
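For readers who want to see the hand-off idea in isolation, here is a library-free analogue. It is not Glib::Dispatcher and uses no gtkmm calls: MainThreadQueue, post, drain and demo_redraw_requests are hypothetical names, and unlike a real dispatcher this sketch does not wake the main loop by itself, the main thread has to call drain(). The point it illustrates is only that worker threads record a request, and the thread that owns the GUI state is the one that executes it.

```cpp
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical stand-in for a main-loop dispatcher:
// any thread may post(), only the owning thread calls drain().
class MainThreadQueue {
public:
    // Safe to call from worker threads.
    void post(std::function<void()> fn) {
        std::lock_guard<std::mutex> lock(m_);
        pending_.push(std::move(fn));
    }

    // Call from the main thread only. Runs every pending callback
    // outside the lock and returns how many were executed.
    int drain() {
        std::queue<std::function<void()>> batch;
        {
            std::lock_guard<std::mutex> lock(m_);
            std::swap(batch, pending_);
        }
        int ran = 0;
        while (!batch.empty()) {
            batch.front()();
            batch.pop();
            ++ran;
        }
        return ran;
    }

private:
    std::mutex m_;
    std::queue<std::function<void()>> pending_;
};

// A worker requests three redraws; the calling thread performs them.
int demo_redraw_requests() {
    MainThreadQueue q;
    int redraws = 0;  // touched only by the calling ("main") thread
    std::thread worker([&] {
        for (int i = 0; i < 3; ++i) {
            q.post([&redraws] { ++redraws; });  // stands in for queue_draw()
        }
    });
    worker.join();
    q.drain();  // the callbacks run here, on the calling thread
    return redraws;
}
```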
I'm trying to debug a custom thread pool implementation that rarely deadlocks. So I cannot simply use a debugger like gdb, because I would have to launch it something like 100 times before hitting a deadlock.
Currently, I'm running the thread pool test in an infinite loop in a shell script, but that means I cannot inspect variables and so on. I'm trying to print data with std::cout, but that slows down the threads and reduces the chance of a deadlock, meaning that I can wait an hour or so with my infinite loop before getting any messages. Then the messages don't reveal the error and I need more of them, which means waiting one more hour...
How can I efficiently debug the program so that it restarts over and over until it deadlocks? (Or should I maybe open another question with all the code to get some help?)
Thank you in advance !
Bonus question: how can I check that everything goes fine with a std::condition_variable? You cannot really tell which threads are asleep or whether a race condition occurs on the wait condition.
There are 2 basic ways:
Automate running the program under the debugger. Using gdb program -ex 'run <args>' -ex 'quit' should run the program under the debugger and then quit. If the program is still alive in one form or another (it segfaulted, or you interrupted it manually), you will be asked for confirmation.
Attach the debugger after reproducing the deadlock. For example, gdb can be run as gdb <program> <pid> to attach to a running program: just wait for the deadlock and then attach. This is especially useful when an attached debugger changes the timing so that you can no longer reproduce the bug.
In this way you can just run it in loop and wait for result while you drink coffee. BTW - I find the second option easier.
If this is some kind of homework - restarting again and again with more debug will be a reasonable approach.
If somebody pays money for every hour you wait, they might prefer to invest in software that supports replay-based debugging, that is, software that records everything a program does, every instruction, and allows you to replay it again and again, debugging back and forth. Thus, instead of adding more debug output, you record a session during which a deadlock happens, and then start debugging just before the deadlock happened. You can step back and forth as often as you want, until you finally find the culprit.
The software mentioned in the link actually supports Linux and multithreading.
Mozilla rr open source replay based debugging
https://github.com/mozilla/rr
Hans mentioned replay based debugging, but there is a specific open source implementation that is worth mentioning: Mozilla rr.
First you do a record run, and then you can replay the exact same run as many times as you want, and observe it in GDB, and it preserves everything, including input / output and thread ordering.
The official website mentions:
rr's original motivation was to make debugging of intermittent failures easier
Furthermore, rr enables GDB reverse debugging commands such as reverse-next to go to the previous line, which makes it much easier to find the root cause of the problem.
Here is a minimal example of rr in action: How to go to the previous line in GDB?
You can run your test case under GDB in a loop using the command shown in https://stackoverflow.com/a/8657833/341065: gdb --eval-command=run --eval-command=quit --args ./a.out.
I have used this myself: (while gdb --eval-command=run --eval-command=quit --args ./thread_testU ; do echo . ; done).
Once it deadlocks and does not exit, you can just interrupt it by CTRL+C to enter into the debugger.
An easy quick debug to find deadlocks is to have some global variables that you modify where you want to debug, and then print it in a signal handler. You can use SIGINT (sent when you interrupt with ctrl+c) or SIGTERM (sent when you kill the program):
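#include <csignal>
#include <cstdlib>
#include <iostream>

int dbg;  // debug variable: written by the threads, printed by the handler

// Prints the debug state, then exits with a failure status.
// Note: std::cout and std::exit are not async-signal-safe; this is a quick debug hack only.
void dbg_sighandler(int)
{
    std::cout << dbg << std::endl;
    std::exit(EXIT_FAILURE);
}

int multithreaded_function()
{
    std::signal(SIGINT, dbg_sighandler);  // ctrl+c now dumps the debug state
    ...
    dbg = someVar;  // record whatever you want to inspect
    ...
}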
That way you see the state of all your debug variables when you interrupt the program with ctrl+c.
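One caveat with the handler above: printing through std::cout inside a signal handler is not async-signal-safe, so on a bad day the dump itself can hang. Below is a hedged sketch of a variant that restricts itself to a lock-free atomic and the POSIX write() call. The names g_dbg, format_int and dbg_dump_handler are made up for this example, and it assumes std::atomic<int> is lock-free on your platform (true on mainstream desktop targets).

```cpp
#include <atomic>
#include <csignal>
#include <unistd.h>  // write, STDERR_FILENO (POSIX)

// Debug value updated by the worker threads, e.g. g_dbg.store(step).
std::atomic<int> g_dbg{0};

// Converts value to decimal text in buf (no terminating '\0') without
// calling any library function. buf must hold at least 24 chars.
// Returns the number of characters written.
int format_int(long value, char* buf) {
    char digits[24];
    int n = 0;
    const bool negative = value < 0;
    unsigned long v = negative ? 0UL - static_cast<unsigned long>(value)
                               : static_cast<unsigned long>(value);
    do {
        digits[n++] = static_cast<char>('0' + v % 10);
        v /= 10;
    } while (v != 0);
    int len = 0;
    if (negative) buf[len++] = '-';
    while (n > 0) buf[len++] = digits[--n];
    return len;
}

// Signal handler: dumps g_dbg to stderr using only async-signal-safe calls.
extern "C" void dbg_dump_handler(int) {
    char buf[32];
    int len = format_int(g_dbg.load(), buf);
    buf[len++] = '\n';
    ssize_t ignored = write(STDERR_FILENO, buf, static_cast<size_t>(len));
    (void)ignored;  // nothing sensible to do about a failed write here
}
```

It would be installed with std::signal(SIGINT, dbg_dump_handler). This version only prints; to also terminate with a failure status, as the original handler does, _exit(1) from unistd.h is the async-signal-safe counterpart of std::exit.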
In addition you can run it in a shell while loop:
$> while [ $? -eq 0 ]
do
./my_program
done
which will run your program again and again until it fails ($? is the exit status of your program, and you exit with EXIT_FAILURE in your signal handler).
It worked well for me, especially for finding out how many threads had passed before and after which locks.
It is quite rustic, but you do not need any extra tool and it is fast to implement.
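In the same low-tech spirit, and touching on the bonus question about std::condition_variable: a wait that can time out turns a silent stall into something you can log. The following is only a sketch under assumed names (ReadyFlag, set_ready, wait_ready_for are not from the question's thread pool). It replaces an unbounded cv.wait with wait_for plus a predicate, and reports whether the condition was actually observed.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical shared state: a flag guarded by a mutex and announced
// through a condition variable.
struct ReadyFlag {
    std::mutex m;
    std::condition_variable cv;
    bool ready = false;
};

// Sets the flag while holding the lock, then wakes all waiters.
void set_ready(ReadyFlag& f) {
    {
        std::lock_guard<std::mutex> lock(f.m);
        f.ready = true;
    }
    f.cv.notify_all();
}

// Waits until the flag is set or the timeout expires.
// Returns true if the flag was observed, false on timeout,
// which is the point where a debug build could log or abort.
bool wait_ready_for(ReadyFlag& f, std::chrono::milliseconds timeout) {
    std::unique_lock<std::mutex> lock(f.m);
    return f.cv.wait_for(lock, timeout, [&f] { return f.ready; });
}
```

With a generous timeout (several seconds for work that normally takes milliseconds) a false return value tells you which wait got stuck, without the constant slowdown of printing on every iteration.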
I want to run a process that checks my key-press state in parallel with my existing infinite loop (from the pcap header). I was looking for something very similar to GetAsyncKeyState on Windows.
I tried for a whole week and found it hard to program something similar to GetAsyncKeyState. So I have been using a termination signal, like ctrl+c, to trigger a certain operation.
I would like to know whether there are other, similar signals that my program can catch in order to perform an operation of my own.
P.S. I'm a beginner for Linux and C++. Sorry, if my question is stupid.
POSIX makes SIGUSR1 and SIGUSR2 available for application use. Additionally, there is the set of realtime signals. A close reading of man 7 signal should provide the basics, and ample reference material is available on the web.
That said, it sounds like you are headed toward expanding what is already an awkward hack. Perhaps you should ask a separate question detailing exactly what you are doing and someone can help you with a more appropriate path toward solving your primary problem rather than improvements on a work-around.
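As a rough illustration of catching SIGUSR1, here is a minimal sketch using the POSIX sigaction call. The names g_user_request, on_sigusr1, install_sigusr1_handler and consume_request are invented for this example; the handler only sets a flag, and the existing loop is assumed to poll that flag between iterations.

```cpp
#include <signal.h>  // sigaction, SIGUSR1 (POSIX)

// Set by the handler, cleared by the main loop.
volatile sig_atomic_t g_user_request = 0;

// Keep the handler trivial: just record that the signal arrived.
extern "C" void on_sigusr1(int) {
    g_user_request = 1;
}

// Installs the SIGUSR1 handler; returns true on success.
bool install_sigusr1_handler() {
    struct sigaction sa = {};
    sa.sa_handler = on_sigusr1;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;  // restart interrupted system calls where possible
    return sigaction(SIGUSR1, &sa, nullptr) == 0;
}

// Called from the main loop: returns true once per received request.
bool consume_request() {
    if (g_user_request) {
        g_user_request = 0;
        return true;
    }
    return false;
}
```

From another terminal the signal can then be sent with kill -USR1 {pid}. Signals that arrive before the loop checks the flag are merged into a single request, which is usually acceptable for a "do the operation now" trigger.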
You can find the pid (the process identifier of your program) and, from another terminal, run
kill -9 {pid}
Note that signal 9 (SIGKILL) cannot be caught by the program; it always terminates it.
To get the pid, just type ps -u {username} in a terminal,
or you can open the system monitor application (it's like the Task Manager on Windows).