How to handle infinite loop in threads and avoid memory leak - c++

I have a project that runs several infinite loops in threads. I simplified it to the following code:
#include <iostream>
#include <vector>
#include <thread>
#include <boost/fiber/algo/round_robin.hpp>
#include <boost/thread.hpp>
#include <chrono>
#include <boost/thread.hpp>
#include <string>
void foo(){
std::cout<<"thread a"<<std::endl;
while(true){
std::this_thread::sleep_for(std::chrono::seconds{5});
}
return;
}
void foo2(){
std::cout<<"thread b"<<std::endl;
while(true){
std::this_thread::sleep_for(std::chrono::seconds{5});
}
return;
}
int main(){
std::thread a(foo);
std::thread b(foo2);
while(true){
std::this_thread::sleep_for(std::chrono::seconds{5});
}
return 0;
}
It works as expected.
I used valgrind to check for memory leaks, and it reports one (I guess an infinite loop never releases its memory because it never stops). I considered using join(), but it doesn't make sense here. I tried adding
a.detach();
b.detach();
before the while loop in the main function, but that doesn't solve the memory leak.
Would somebody please give me some advice on how to avoid the memory leak here?

It's a long answer, so I'll start with a summary: The leak in your example code is not an issue. Nevertheless, you should fix it. And the way to fix it is to turn the infinite loops into non-infinite loops and to join the threads.
A memory leak is for example this:
void bar() {
int * x = new int;
}
An object is dynamically allocated and when the function returns all pointers to the object are lost. The memory is still allocated to the process but you cannot free it. Calling bar many times will pile up memory until the process runs out of memory and gets killed. This is to be avoided.
Then there is a less severe type of memory leak:
int main() {
bar();
}
Here some memory is allocated, but next the process terminates. When the process terminates all memory is reclaimed by the OS. The missing delete is not such a big issue here.
There are other ways of leaking memory and I am not trying to enumerate them all, but rather use the examples to get a point across.
Still, there are good reasons to worry about this second type of leak too, the one I called "less severe", because it is typically not just memory that is leaked. Consider (don't write code like this! It is only for illustrating a point):
int main() {
A* a = new A();
}
A is some class. In main some memory is allocated and an A is constructed. The memory is the lesser problem here. The real problem is any other resource that A claimed in its constructor. It might have opened a file. It might have opened a connection to a database. Such resources must be cleaned up in a destructor. If the A object is not properly destroyed, critical data might get lost.
Conclusion: Leaking memory when returning from main isn't a big issue. Leaking other resources is a big issue. And the memory leak is a good indication that other resources are not cleaned up properly either.
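To make the point about destructors concrete, here is a minimal sketch. The struct A below is a hypothetical stand-in that only counts "open resources"; a real class would hold a file or a database connection. The idea it shows: as soon as the object has an owner (here a std::unique_ptr), its destructor runs and the resource is released.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for the class A discussed above: it "acquires" a
// resource in its constructor and releases it in its destructor.
struct A {
    static int open_resources;  // number of resources currently held
    A() { ++open_resources; }
    ~A() { --open_resources; }
};
int A::open_resources = 0;

// The unique_ptr owns the A, so ~A() runs when the function returns.
// With a raw `new A()` and no delete, the counter would stay at 1.
void use_owned_resource() {
    std::unique_ptr<A> a(new A());
    assert(A::open_resources == 1);
}
```

After use_owned_resource() returns, A::open_resources is back to 0. A plain local variable (A a;) would work just as well; the smart pointer is only needed when the object really has to live on the heap.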
In your toy example there is no problem but only a small change makes your approach problematic:
void foo(){
A a;
while(true){
std::this_thread::sleep_for(std::chrono::seconds{5});
}
}
A is again the class that acquires some resource in its constructor, and that resource must be properly released in the destructor. Even when the program terminates, you want the data to be in the database, the last log message in the log file, etc.
Rather than while(true) and detach, you should use an atomic flag or a condition variable to signal the threads that they should stop. Something along the lines of:
std::atomic<bool> foo_runs;
void foo(){
A a;
while(foo_runs.load()){
std::this_thread::sleep_for(std::chrono::seconds{5});
}
}
int main() {
foo_runs.store(true);
std::thread a(foo);
// do something else
foo_runs.store(false);
a.join();
}
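The atomic flag above is only checked once per sleep, so join() may block for up to 5 seconds. The condition variable variant mentioned in the text avoids that. The following is a rough sketch under my own assumptions: StopSignal, wait_for, request_stop and run_and_stop are names I made up for illustration, not anything from the question.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>

// Hypothetical helper: a stop request that can interrupt a timed wait.
struct StopSignal {
    std::mutex m;
    std::condition_variable cv;
    bool stop = false;

    // Waits up to `d`; returns true if a stop was requested.
    bool wait_for(std::chrono::milliseconds d) {
        std::unique_lock<std::mutex> lock(m);
        return cv.wait_for(lock, d, [this] { return stop; });
    }

    void request_stop() {
        {
            std::lock_guard<std::mutex> lock(m);
            stop = true;
        }
        cv.notify_all();
    }
};

// Worker loop: one unit of "work" every 5 seconds until asked to stop.
void foo(StopSignal& signal, int& iterations) {
    while (!signal.wait_for(std::chrono::seconds{5})) {
        ++iterations;
    }
}

// Starts the worker, asks it to stop right away, and joins it.
// Returns the elapsed time in milliseconds.
long long run_and_stop() {
    StopSignal signal;
    int iterations = 0;
    auto start = std::chrono::steady_clock::now();
    std::thread a(foo, std::ref(signal), std::ref(iterations));
    signal.request_stop();
    a.join();  // returns promptly instead of waiting out the 5 s sleep
    assert(iterations == 0);
    auto elapsed = std::chrono::steady_clock::now() - start;
    return std::chrono::duration_cast<std::chrono::milliseconds>(elapsed).count();
}
```

The predicate passed to wait_for also covers the case where the stop request arrives before the worker starts waiting, so the request cannot be missed.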

Whatever you do, you have to call join() or detach() on a and b. If you call join() before the main loop, you'll never get to the main loop. If you get to the end of main() without join()/detach(), the std::thread destructors will call std::terminate() (which by default calls std::abort()).
I don't see a leak, but both threads write to std::cout without any coordination, so their output may interleave. A potential leak can happen if detached thread a or b outlives main() and keeps running its never-ending function. In that case the thread itself is leaked: once it is detached, no std::thread object owns it anymore, so nothing is left to join or destroy it. If that's the story, try calling join() on both a and b after the main loop.
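One way to make sure join() is never forgotten is a small guard object that joins in its destructor. This is a sketch of that well-known idiom; JoinGuard and run_guarded are names I picked for illustration, not standard library classes.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper (not a standard class): joins the guarded thread
// when the guard goes out of scope.
class JoinGuard {
public:
    explicit JoinGuard(std::thread& t) : t_(t) {}
    ~JoinGuard() {
        if (t_.joinable()) {
            t_.join();
        }
    }
    JoinGuard(const JoinGuard&) = delete;
    JoinGuard& operator=(const JoinGuard&) = delete;

private:
    std::thread& t_;
};

// Runs a short task on a thread and returns its result after the join.
int run_guarded() {
    std::atomic<int> result{0};
    {
        std::thread worker([&result] { result.store(42); });
        JoinGuard guard(worker);  // declared after worker, destroyed before it
        assert(worker.joinable());
    }  // guard joins worker here, then worker is destroyed safely
    return result.load();
}
```

This only helps if the thread function actually returns at some point; with a while(true) loop the guard would simply block forever in its destructor.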

Related

What happens to a local pointer if the thread is terminated?

What happens to data created in the local scope of a thread if the thread is terminated? Is it a memory leak?
void MyThread()
{
auto* ptr = new int[10];
while (true)
{
// stuff
}
// thread is interrupted before this delete
delete[] ptr;
}
Okay, my perspective.
If the program exits, the threads exit wherever they are. They don't clean up. But in this case you don't care. You might care if it's an open file and you want it flushed.
However, I prefer a way to tell my threads to exit cleanly. This isn't perfect, but instead of while (true) you can do while (iShouldRun) and set the field to false when it's time for the thread to exit. Make the field a std::atomic<bool> so that the write from the other thread is not a data race.
You can also set a flag that says, iAmExiting at the end, then myThread.join() once the flag is set. That gives your exit code a chance to clean up nicely.
Coding this from the beginning helps when you write your unit tests.
The other thing -- as someone mentioned in comments -- use RAII. Pretty much if you're using raw pointers, you're doing something you shouldn't do in modern C++.
That's not an absolute. You can write your own RAII classes. For instance:
class MyIntArray {
public:
MyIntArray(int sizeIn) { ... }
~MyIntArray() { delete[] array; } // delete[] to match new int[]
private:
int * array = nullptr;
int size = 0;
};
You'll need a few more methods to actually get to the data, like an operator[]. Now, this isn't any different than using std::vector, so it's only an example of how to implement RAII for your custom data, for instance.
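For illustration, here is what a filled-in version of such a class might look like. This is my own sketch, not the code the answer elided: the constructor body, the deleted copy operations, length() and the sum_of_squares helper are all assumptions.

```cpp
#include <cassert>
#include <cstddef>

class MyIntArray {
public:
    explicit MyIntArray(std::size_t sizeIn)
        : array(new int[sizeIn]()), size(sizeIn) {}  // elements start at 0
    ~MyIntArray() { delete[] array; }

    // Copying would lead to a double delete, so this sketch forbids it.
    MyIntArray(const MyIntArray&) = delete;
    MyIntArray& operator=(const MyIntArray&) = delete;

    int& operator[](std::size_t i) {
        assert(i < size);
        return array[i];
    }
    std::size_t length() const { return size; }

private:
    int* array = nullptr;
    std::size_t size = 0;
};

// Example use: the array is released automatically on every return path.
int sum_of_squares(std::size_t n) {
    MyIntArray values(n);
    for (std::size_t i = 0; i < n; ++i) {
        values[i] = static_cast<int>(i * i);
    }
    int sum = 0;
    for (std::size_t i = 0; i < values.length(); ++i) {
        sum += values[i];
    }
    return sum;
}
```

In real code std::vector<int> does all of this already, including safe copying and moving.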
But your functions should NEVER call new like this. It's old-school. If your method pukes somehow, you have a memory leak. If it pukes on exit(), no one cares. But if it pukes for another reason, it's a problem. RAII is a much, much better solution than the other patterns.
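Putting the two suggestions together for the thread from the question, a sketch could look like this. iShouldRun is the flag name used above; the loops counter and run_and_shutdown are additions of mine so that the example can be checked.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>
#include <vector>

std::atomic<bool> iShouldRun{true};  // flag name taken from the answer above

void MyThread(std::atomic<int>& loops) {
    std::vector<int> data(10);  // replaces `new int[10]`
    assert(data.size() == 10);
    while (iShouldRun.load()) {
        // stuff
        loops.fetch_add(1);
        std::this_thread::sleep_for(std::chrono::milliseconds{5});
    }
    // no delete[] needed: data is freed when the function returns
}

// Starts the thread, waits until it has done some work, then asks it to
// exit and joins it. Returns the number of loop iterations that ran.
int run_and_shutdown() {
    std::atomic<int> loops{0};
    iShouldRun.store(true);
    std::thread t(MyThread, std::ref(loops));
    while (loops.load() == 0) {
        std::this_thread::yield();
    }
    iShouldRun.store(false);
    t.join();
    return loops.load();
}
```

Because the loop exits normally, the function returns and the vector's destructor frees the buffer. That is the part that never happens when the thread is simply killed at process exit.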

Can the thread object be deleted after std::thread::detach?

I have a question about std::thread::detach(). In cplusplus.com it says 'After a call to this function, the thread object becomes non-joinable and can be destroyed safely', by which it seems to mean that the destructor ~thread() may be called safely.
My question is, does this mean that it is ok & safe to delete a thread object immediately after calling detach(), as in the following sample code? Will the function my_function continue safely, and safely use its thread_local variables and variables that are global to the program?
#include <thread>
#include <unistd.h>
void my_function(int t)
{
sleep(t);
}
int main()
{
std::thread *X = new std::thread(my_function, 10);
X->detach();
delete X;
sleep(30);
return 0;
}
The code 'runs' ok, I just want to know if this is safe from the point of view of memory ownership. My motivation here is to have a program that runs 'forever', and spawns a few child threads from time to time (e.g. every 30 seconds.) Each child thread then does something, and dies: I do not want to have to somehow keep track of the children in the parent thread, call join() and then delete.
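For what it's worth, the same pattern does not need new/delete at all. In the sketch below (names and the finished counter are mine), the std::thread object is a local variable that is destroyed right after detach(). That is fine because a detached std::thread object no longer refers to the running thread.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

std::atomic<int> finished{0};  // illustrative: counts completed children

void my_function(int ms) {
    std::this_thread::sleep_for(std::chrono::milliseconds(ms));
    finished.fetch_add(1);
}

// Spawns a child thread and lets go of it immediately.
void spawn_detached(int ms) {
    std::thread x(my_function, ms);
    x.detach();
    assert(!x.joinable());
}  // x is destroyed here; the child thread keeps running
```

One caveat: a detached thread that is still running when main returns is cut off at process exit, so this only suits work that is allowed to be abandoned at shutdown.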

Vector of thread pointers, push_back causes crash

I am new to C++, and I was lately looking into concurrency.
I tried to run this simple program demonstrating threads:
#include <vector>
#include <thread>
#include <stdio.h>
void threadWork()
{
printf("WORKER THREAD EXECUTING");
}
int main()
{
printf("INSTANTIATE THREAD\n");
std::vector<std::thread *> threadList;
threadList.push_back(&std::thread(&threadWork));
printf("THREAD INSTANTIATED JOINING...\n");
for (std::thread* t : threadList)
{
t->join();
}
printf("THREADS JOINED\n");
return 0;
}
After the call to push_back() the program crashes with a call to abort() (I am running this on Windows)
Why is this code crashing at runtime?
std::vector<std::thread *> threadList;
threadList.push_back(&std::thread(&threadWork));
Here you take the address of a temporary.
This is not permitted in C++, but Visual Studio lets you do it anyway.
That's a shame because the temporary is destroyed at the end of the full expression (right after the push_back call), so your vector contains nothing but a dangling pointer.
Furthermore, since you are not join()ing or detach()ing the thread, destroying its std::thread (which happens when it goes out of scope) causes your program to std::terminate() (commonly considered a crash).
I recommend re-reading the chapter in your C++ book about thread management, so that you can be assured of effecting it in a safe manner.
Forget about pointers here; you don't need them. Instead, just have a nice simple vector of threads, and directly construct your elements using emplace_back, like this:
std::vector<std::thread> threadList;
threadList.emplace_back(&threadWork);
Then ensure you join() your threads before program completion.
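The program aborts because the std::thread destructor calls std::terminate() if the thread is still joinable, that is, if it has been neither joined nor detached, regardless of whether it is still running or has already finished.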
This happens because you are storing a vector of pointers and taking the address of temporary objects. The first is a bad idea, the second is illegal in C++ but MSVC permits it by default.
In C++ you should default to using values, especially with std library types. And when you don't use values, use smart pointers. Using raw pointer should usually be either only for some function args, sometimes in a struct/class with reference semantics, or when doing C interop.
So here,
std::vector<std::thread> threadList;
is a vector of threads. No pointless pointers.
threadList.emplace_back(&threadWork);
that creates a new thread running threadWork.
for(auto& thread:threadList){
thread.join();
}
that joins the threads; this will wait until the thread is done. Failure to do something like this will make your program abort.
All together:
printf("INSTANTIATE THREAD\n");
std::vector<std::thread> threadList;
threadList.emplace_back(&threadWork);
printf("THREAD INSTANTIATED\n");
for(auto& thread:threadList){
thread.join();
}
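As a complete unit, the corrected version might look like the sketch below. The workDone counter and the run_workers wrapper are my additions so that the result can be checked; the rest follows the snippets above.

```cpp
#include <atomic>
#include <cassert>
#include <cstdio>
#include <thread>
#include <vector>

std::atomic<int> workDone{0};  // illustrative counter, not in the original

void threadWork() {
    std::printf("WORKER THREAD EXECUTING\n");
    workDone.fetch_add(1);
}

// Starts `count` worker threads, joins them all, and returns how many ran.
int run_workers(int count) {
    workDone.store(0);
    std::vector<std::thread> threadList;
    for (int i = 0; i < count; ++i) {
        threadList.emplace_back(&threadWork);  // constructs the thread in place
    }
    for (auto& thread : threadList) {
        thread.join();
        assert(!thread.joinable());  // joined threads can be destroyed safely
    }
    return workDone.load();
}
```

Calling run_workers(3) from main starts three threads and returns only after all of them have finished, so the vector is destroyed without any joinable thread in it.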

Why does my code that interrupts a thread leak?

This is a very simplified example, so please bear with me for a moment....
#include <boost/thread/thread.hpp>
#include <iostream>
struct foo {
boost::thread t;
void do_something() { std::cout << "foo\n"; }
void thread_fun(){
try {
boost::this_thread::sleep(boost::posix_time::seconds(2));
{
boost::this_thread::disable_interruption di;
do_something();
}
} catch (boost::thread_interrupted& e) {}
}
void interrupt_and_restart() {
t.interrupt();
//if (t.joinable()) t.join(); // X
t = boost::thread(&foo::thread_fun,this);
}
};
int main(){
foo f;
for (int i=0;i<1000;i++){
f.interrupt_and_restart();
boost::this_thread::sleep(boost::posix_time::seconds(3));
}
}
When I run this code on Linux and look at the memory consumption with top, I see a constant increase in virtual memory used (and my actual code crashes at some point). Only if I join the thread after interrupting it does the memory usage stay constant. Why is that?
You are not joining the thread: because of this, some resources needed to keep track of the thread stay allocated.
A non joined thread still uses some system resources even if it has been terminated (e.g. its thread id is still valid).
Also, the system may impose a limit on the number of threads simultaneously allocated, and non joined threads count toward that limit.
Using cat /proc/sys/kernel/threads-max on my Linux VM gives me 23207 threads.
Recent versions of Boost should actually terminate the program when you destroy a joinable thread object, while older versions are happy to comply with the destruction request.
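The join-before-restart pattern can be sketched with the standard library as well. Note the swap: std::thread has no interrupt(), so this sketch uses an atomic stop flag in place of Boost's interruption mechanism. stopRequested, stop_and_join, threadsFinished and restart_n_times are names I made up.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

std::atomic<int> threadsFinished{0};  // illustrative counter

struct foo {
    std::thread t;
    std::atomic<bool> stopRequested{false};

    void thread_fun() {
        // Sleep in short slices so that a stop request is noticed quickly.
        for (int i = 0; i < 200 && !stopRequested.load(); ++i) {
            std::this_thread::sleep_for(std::chrono::milliseconds{10});
        }
        threadsFinished.fetch_add(1);
    }

    void stop_and_join() {
        stopRequested.store(true);
        if (t.joinable()) {
            t.join();  // lets the system release the old thread's resources
        }
    }

    void interrupt_and_restart() {
        stop_and_join();
        stopRequested.store(false);
        t = std::thread(&foo::thread_fun, this);
    }

    ~foo() { stop_and_join(); }
};

// Restarts the worker n times and returns how many threads ran to the end.
int restart_n_times(int n) {
    threadsFinished.store(0);
    {
        foo f;
        for (int i = 0; i < n; ++i) {
            f.interrupt_and_restart();
        }
        assert(n == 0 || f.t.joinable());
    }  // the destructor stops and joins the last thread
    return threadsFinished.load();
}
```

Every thread that is started is also joined, either by the next restart or by the destructor, so no thread bookkeeping is left behind.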

Prevent destruction of self after main?

I'm writing some asynchronous I/O stuff in C++, and I need to prevent an object from being destructed until its handler for the asynchronous I/O is called. I'm trying to use shared_ptr and create my object with a static constructor so I can be sure that it is using reference counting. Then I save that in a weak_ptr until I start the asynchronous I/O, when I store it into another shared_ptr to be sure it doesn't become invalid during that time. Finally, I reset it when the callback completes. Here's an example:
#pragma once
#include <memory>
#include <functional>
using namespace std;
class SomeIO {
std::weak_ptr<SomeIO> self;
std::shared_ptr<SomeIO> savingSelf;
void myCallback() {
// do my callback stuff here
savingSelf.reset();
}
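SomeIO() {} // private: objects are only created through create()
public:
~SomeIO() {}
static shared_ptr<SomeIO> create() {
shared_ptr<SomeIO> self(new SomeIO()); // make_shared cannot call a private constructor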
self->self = self;
return self;
}
void start() {
savingSelf = self.lock();
//startSomeAsyncIO(bind(self, SomeIO::myCallback));
}
};
int main() {
auto myIO = SomeIO::create();
myIO->start();
return 0;
}
My question is, what is going to happen after main returns? Will it stay alive until the final reference is released, or is this going to cause a memory leak? If this does cause a memory leak, how do I handle this situation so the asynchronous I/O can be canceled and the program can end without a memory leak? I would think that shared_ptr protects me from memory leaks, but I'm not so sure about this situation.
Thanks!
In C++ (as opposed to Java), the program ends when main ends, and all other threads are terminated. Memory leaks are not your problem, since the program ends anyway and all of its memory is reclaimed by the OS.
You can use std::thread with std::thread::join to prevent your program from exiting too early:
int main (void){
std::thread myAsyncIOThread ([]{
auto myIO = SomeIO::create();
myIO->start();
});
//other things your program needs to do
myAsyncIOThread.join();
return 0;
}
You might also be interested in having a thread pool in your program.
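As for keeping the object alive until its callback has run, a common alternative to storing a shared_ptr inside the object itself is to let the callback hold the reference. The sketch below is an assumption-laden illustration: std::enable_shared_from_this replaces the hand-written weak_ptr member, and a plain std::thread stands in for the asynchronous I/O, which is not shown in the question.

```cpp
#include <cassert>
#include <memory>
#include <thread>

class SomeIO : public std::enable_shared_from_this<SomeIO> {
public:
    static std::shared_ptr<SomeIO> create() {
        return std::shared_ptr<SomeIO>(new SomeIO());
    }

    // Starts the stand-in "asynchronous operation" and returns its thread.
    std::thread start(bool& callbackRan) {
        auto self = shared_from_this();  // extra reference for the callback
        return std::thread([self, &callbackRan] {
            self->myCallback();
            callbackRan = true;
        });  // the object is released when the lambda is destroyed
    }

private:
    SomeIO() = default;
    void myCallback() { /* do my callback stuff here */ }
};

// Drops the caller's reference early, then waits for the callback.
bool run_io() {
    bool callbackRan = false;
    std::thread io;
    {
        auto myIO = SomeIO::create();
        io = myIO->start(callbackRan);
    }  // myIO is gone; the lambda's copy keeps the object alive
    assert(io.joinable());
    io.join();  // wait for the callback before leaving, as in the answer
    return callbackRan;
}
```

The join() at the end is still what stops the program from ending before the callback has run; the shared_ptr only takes care of the object's lifetime.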