Multithreading using pthread in C++ with shared variables - c++

I'm new to threading (and C/C++ for that matter), and I'm attempting to use multiple threads to access shared variables.
In the main, I've created a variable char inputarray[100];
Thread 1: This thread will be reading data from stdin in 2 byte bursts, and appending them to the inputarray. (input by feeding a file in)
Thread 2: This thread will be reading data 1 byte at a time, performing a calculation, and putting its data into an output array.
Thread 3: This thread will be outputting data from the output array in 2 byte bursts. (stdout)
I've attempted the input part and got it working by passing a struct, but would like to do it without using a struct, but it has been giving me problems.
If I can get input down, I'm sure I'll be able to use a similar strategy to complete output. Any help would be greatly appreciated.
Below is a rough template for the input thread.
#include <stdio.h>
#include <pthread.h>
using namespace std;
void* input(void* arg) {
char reading[3];
fread(reading,1,2,stdin);
//append to char inputarray[]..???
}
int main() {
char inputarray[100];
pthread_t t1;
pthread_create(&t1, NULL, &input, &inputarray);
void *result;
pthread_join(t1,&result);
return 0;
}

Several issues:
A stack array is a poor choice for the shared variable: it has a fixed size, and Threads 2 and 3 cannot tell where to put new elements or where to read them from. I would propose std::vector or std::deque instead.
Initially the container is empty. Then Thread 2 pushes some elements into it.
Thread 3 polls the container (or waits on a condition variable) and, once it finds new elements, prints them.
You have to synchronize access to the shared variable with a mutex (consider a pthread mutex, std::mutex, or boost::mutex). You might also want a condition variable to notify Thread 3 about new elements in the queue, but an initial implementation does not need it.
Do you really have to use pthread primitives? It is normally much easier and safer (e.g. for exception safety) to use std::thread and std::mutex if you have a modern compiler, or boost::thread and boost::mutex otherwise.
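The suggestions above (a std::deque guarded by a mutex, plus a condition variable to wake the consumer) could be sketched roughly as follows. This is only a minimal illustration, not the asker's code: SharedQueue, produce, consume and run_pipeline are hypothetical names, and the input comes from a string rather than stdin so the example stays self-contained.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <string>
#include <thread>

// Hypothetical shared queue; every name in this sketch is illustrative.
struct SharedQueue {
    std::deque<char> data;
    std::mutex mtx;
    std::condition_variable ready;
    bool done = false;  // set by the producer when no more data will arrive
};

// Producer: pushes each byte under the lock, then signals end of input.
void produce(SharedQueue& q, const std::string& input) {
    for (char c : input) {
        {
            std::lock_guard<std::mutex> lock(q.mtx);
            q.data.push_back(c);
        }
        q.ready.notify_one();
    }
    {
        std::lock_guard<std::mutex> lock(q.mtx);
        q.done = true;
    }
    q.ready.notify_one();
}

// Consumer: waits for data, drains it into `out`, stops once the producer is done.
void consume(SharedQueue& q, std::string& out) {
    std::unique_lock<std::mutex> lock(q.mtx);
    for (;;) {
        q.ready.wait(lock, [&q] { return !q.data.empty() || q.done; });
        while (!q.data.empty()) {
            out.push_back(q.data.front());
            q.data.pop_front();
        }
        if (q.done) return;
    }
}

// Runs one producer and one consumer and returns what the consumer received.
std::string run_pipeline(const std::string& input) {
    SharedQueue q;
    std::string out;
    std::thread consumer(consume, std::ref(q), std::ref(out));
    std::thread producer(produce, std::ref(q), std::cref(input));
    producer.join();
    consumer.join();
    assert(out.size() == input.size());
    return out;
}
```

The predicate passed to wait() is what makes the consumer robust against spurious wakeups and against a notification that arrives before it starts waiting.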

You are on the correct track:
As a note, the pthreads library is a C library, so you need to declare the callbacks as C functions:
extern "C" void* input(void* arg);
Personally I would pass the address of the first element:
pthread_create(&t1, NULL, &input, &inputarray[0]);
This then makes your code look like this:
void* input(void* arg) {
    try
    {
        char* inputarray = (char*)arg;
        size_t inputLocation = 0;
        // Need to make sure you don't over run the buffer etc...
        while(!finished())
        {
            // Stop on end of input or a read error.
            if (fread(&inputarray[inputLocation],1,2,stdin) != 2)
                break;
            inputLocation += 2;
        }
    }
    catch(...){} // Must not let exceptions escape a thread.
    return NULL;
}
The trouble with this style is that you are putting the responsibility for coordination into each individual thread. The writer thread has to check for the end of the buffer, the reader thread has to check that data is available, and so on. All of this needs coordination, so now you need some shared mutexes and condition variables.
A better choice is to move that responsibility into the object that does the communication. So I would create a class that has the basic operations needed for communication, then make its methods do the appropriate checks.
class Buffer
{
public:
void write(......); //
void read(.....); //
private:
// All the synchronization and make sure the two threads
// behave nicely inside the object.
};
int main()
{
pthread_t threads[3];
std::pair<Buffer, Buffer> comms;
// comms.first inputToRead
// comms.second processesToOutput
pthread_create(&threads[0], NULL, &readInput, &comms.first); // Input
pthread_create(&threads[1], NULL, &procInput, &comms); // Processing
pthread_create(&threads[2], NULL, &genOutput, &comms.second); // Output
void *result;
pthread_join(threads[0],&result);
pthread_join(threads[1],&result);
pthread_join(threads[2],&result);
}
As a side note:
Unless there is something very strange about your processing of data, this would probably be faster written as a single-threaded application.
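One way the Buffer class outlined above might be filled in is sketched below, using a pthread mutex and condition variable. The answer leaves the method signatures open, so the byte-at-a-time write()/read(), the close() method, and the demo functions writer_thread and read_all_demo are all assumptions made for illustration.

```cpp
#include <pthread.h>
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>

// Hypothetical Buffer: a byte queue that hides its own synchronization.
class Buffer {
public:
    Buffer() : closed_(false) {
        pthread_mutex_init(&mutex_, NULL);
        pthread_cond_init(&notEmpty_, NULL);
    }
    ~Buffer() {
        pthread_cond_destroy(&notEmpty_);
        pthread_mutex_destroy(&mutex_);
    }
    Buffer(const Buffer&) = delete;
    Buffer& operator=(const Buffer&) = delete;

    // Append one byte and wake a waiting reader.
    void write(char c) {
        pthread_mutex_lock(&mutex_);
        data_.push_back(c);
        pthread_cond_signal(&notEmpty_);
        pthread_mutex_unlock(&mutex_);
    }
    // Signal that no more data will be written.
    void close() {
        pthread_mutex_lock(&mutex_);
        closed_ = true;
        pthread_cond_broadcast(&notEmpty_);
        pthread_mutex_unlock(&mutex_);
    }
    // Block until a byte is available; return false once closed and empty.
    bool read(char& c) {
        pthread_mutex_lock(&mutex_);
        while (data_.empty() && !closed_)
            pthread_cond_wait(&notEmpty_, &mutex_);
        const bool ok = !data_.empty();
        if (ok) {
            c = data_.front();
            data_.pop_front();
        }
        pthread_mutex_unlock(&mutex_);
        return ok;
    }

private:
    pthread_mutex_t mutex_;
    pthread_cond_t notEmpty_;
    std::deque<char> data_;
    bool closed_;
};

// Thread entry point with C linkage, as recommended earlier in the answer.
extern "C" void* writer_thread(void* arg) {
    Buffer* buf = static_cast<Buffer*>(arg);
    const char msg[] = "abc";
    for (std::size_t i = 0; i + 1 < sizeof msg; ++i) buf->write(msg[i]);
    buf->close();
    return NULL;
}

// Demo: one writer thread, the calling thread acts as the reader.
std::string read_all_demo() {
    Buffer buf;
    pthread_t t;
    const int rc = pthread_create(&t, NULL, &writer_thread, &buf);
    assert(rc == 0);
    std::string out;
    char c;
    while (buf.read(c)) out.push_back(c);
    pthread_join(t, NULL);
    return out;
}
```

With this shape, neither thread function touches a mutex directly; the reader simply loops on read() until it returns false.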

Related

Using Global array in multithreaded application

I am using a Toradex Colibri iMX7 to run our embedded software (C, C++). Our application acquires data from two sensors and plots it in real time. We have two threads: one for data acquisition (appending data to a global array) and another for plotting the values in the same global array at a fixed interval (100 ms). With this setup our application crashes after some time. I know some kind of thread synchronization is necessary but don't know exactly how to handle this. Any suggestions or examples would be helpful.
Here is a dummy example of how to use a mutex for thread synchronization with the pthread library.
#include <pthread.h>
pthread_mutex_t _mutex;
int globalArray[5];
void Write()
{
pthread_mutex_lock (&_mutex);
// Write to global array
globalArray[0] = 0;
pthread_mutex_unlock (&_mutex);
}
int Read( )
{
int i;
pthread_mutex_lock (&_mutex);
// read from global array
i = globalArray[0];
pthread_mutex_unlock (&_mutex);
return i;
}
Before you start using the mutex object, a one-time initialization is needed, e.g. at the beginning of your program.
pthread_mutex_init(&_mutex, NULL);
When you no longer need it, you must destroy it, e.g. before the program ends.
pthread_mutex_destroy(&_mutex);
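For the acquisition/plot scenario in the question, the same idea could be sketched as below. This is only an illustration under assumptions: g_samples, ScopedLock, append_sample, snapshot_samples and the demo thread are hypothetical names, a growable std::vector stands in for the global array, and the mutex is statically initialized with PTHREAD_MUTEX_INITIALIZER instead of pthread_mutex_init.

```cpp
#include <pthread.h>
#include <cassert>
#include <cstddef>
#include <vector>

static pthread_mutex_t g_mutex = PTHREAD_MUTEX_INITIALIZER;
static std::vector<int> g_samples;  // hypothetical shared sample store

// Small RAII guard so the mutex is released on every exit path.
class ScopedLock {
public:
    explicit ScopedLock(pthread_mutex_t& m) : m_(m) { pthread_mutex_lock(&m_); }
    ~ScopedLock() { pthread_mutex_unlock(&m_); }
    ScopedLock(const ScopedLock&) = delete;
    ScopedLock& operator=(const ScopedLock&) = delete;
private:
    pthread_mutex_t& m_;
};

// Acquisition side: append one value under the lock.
void append_sample(int value) {
    ScopedLock lock(g_mutex);
    g_samples.push_back(value);
}

// Plot side: copy under the lock, then plot the copy without holding it.
std::vector<int> snapshot_samples() {
    ScopedLock lock(g_mutex);
    return g_samples;
}

// Stand-in for a sensor thread: appends 1000 values.
extern "C" void* acquire_thread(void*) {
    for (int i = 0; i < 1000; ++i) append_sample(i);
    return NULL;
}

// Demo: two acquisition threads write concurrently; returns the final count.
std::size_t run_acquisition_demo() {
    {
        ScopedLock lock(g_mutex);
        g_samples.clear();
    }
    pthread_t a, b;
    int rc = pthread_create(&a, NULL, &acquire_thread, NULL);
    assert(rc == 0);
    rc = pthread_create(&b, NULL, &acquire_thread, NULL);
    assert(rc == 0);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return snapshot_samples().size();
}
```

Taking a snapshot keeps the lock held only for the copy, so a slow plotting step does not stall acquisition.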

Multithreading & shared resource: Periodically copy data from buffer (data structure) to a file using C++

My code has a data structure, say a vector of vectors of strings. I have 2 threads:
Thread 1 writes data to this data structure (a buffer in RAM).
Thread 2 runs in parallel and should copy data from the above buffer to a file every "X" milliseconds.
How would I achieve this in C++? It should address the key points of the problem statement:
a) The copy from buffer to file should happen only once every "X" milliseconds.
b) Synchronization between both threads.
EDITING THE QUERY WITH MORE DETAILS PER THE ASK
I want to build a library (*.lib) & this library exposes some APIs hence it gets input data from EXE or any entity which uses my library through these APIs.
Say the data received by my library is in the form of a vector of strings.
FillLibraryDataStructure(std::vector<std::string>); // is the API of this library. Any app can call this API & pass a vector of string to this library.
Example app code:
for(int i=0; i<100; i++)
{
std::vector<std::string> vec = GetVectorOfString(); // GetVectorOfString from business logic
FillLibraryDataStructure(vec);
}
Library code having a shared resource:
// Within the library I have a 2D vector, i.e. a vector of vectors of strings; each vector of strings passed by the application to this library is added as a new row in vecofvecofstr.
SHARED RESOURCE:
std::vector<std::vector<string>> vecofvecofstr;
THREAD 1: is copying the data it receives from API to the data structure i.e. vector of vector of strings.
vecofvecofstr.push_back(vec);
THREAD 2: copies the contents of this vector of vectors of strings (written by the 1st thread) to files (XML, HTML etc.) every "X" milliseconds.
A few more points:
1) Thread 1 should run continuously, i.e. whenever the application calls the API, the data received should be put into the data structure vecofvecofstr.
2) "X" milliseconds after data is copied to the buffer, the 2nd thread should start and copy everything that has been dumped to the buffer so far. Thread 2 should then pause and wait for another "X" ms.
How do I achieve this using C++? Here the 1st thread is the default one, in which my library code would be running.
You could use std::mutex and std::condition_variable to your advantage, and a double buffer would keep locking to a minimum.
std::condition_variable is the nearest thing to an event that the standard library has to offer; its use is a bit contrived, but it works.
The example below uses a double buffer so you can keep on buffering data while thread 2 is saving to file, without locking.
The std::condition_variable is used so your application can exit without waiting. This is only needed if you want your application to exit promptly; otherwise you can use a timer. The call to notify_all() wakes the writing thread before wait_for() times out, so it can exit immediately instead of waiting for the timeout. See the reference at http://en.cppreference.com/w/cpp/thread/condition_variable
Header:
#include <mutex>
#include <utility>
#include <vector>
// a generic double buffer
template<typename Data>
class DoubleBuffer
{
private:
    std::mutex mutex_;
    std::vector<std::vector<Data>> storeBuffer_;
    std::vector<std::vector<Data>> saveBuffer_;
public:
    // lock()/unlock() let std::lock_guard<DoubleBuffer> guard the store buffer
    void lock() { mutex_.lock(); }
    void unlock() { mutex_.unlock(); }
    auto& GetStoreBuffer() { return storeBuffer_; }
    auto& GetSaveBuffer() { return saveBuffer_; }
    // swaps the two buffers under the lock; returns nothing
    void Swap()
    {
        std::lock_guard<std::mutex> lock(mutex_);
        std::swap(storeBuffer_, saveBuffer_);
    }
};
In your library:
#include <condition_variable>
#include <thread>
#include <chrono>
#include <mutex>
#include <vector>
#include <string>
// As an example, could be inside a class, or struct
static std::condition_variable exiting;
static std::mutex lk_exiting;
static DoubleBuffer<std::string> yourBuffer;
void FillLibraryDataStructure(std::vector<std::string> strings)
{
// the lock is only for the duration of a swap - very short at worst.
std::lock_guard<DoubleBuffer<std::string>> lock(yourBuffer);
yourBuffer.GetStoreBuffer().emplace_back(strings);
}
void StoreLoop()
{
for(;;)
{
{ // wait_for() releases the mutex while waiting; the lock must be
// held when wait_for() is called.
std::unique_lock<std::mutex> lk(lk_exiting);
// note: a spurious wakeup is also reported as no_timeout
if (std::cv_status::no_timeout == exiting.wait_for(lk, std::chrono::seconds(60)))
break; // app is exiting
}
yourBuffer.Swap();
auto& stringsToSave = yourBuffer.GetSaveBuffer();
// save... You do have plenty of time.
}
}
// as an example. A destructor would be a good place for this
void Exit_Application()
{
// stops the wait_for operation in StoreLoop()
exiting.notify_all();
}
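A wait on a condition variable without a predicate can return early on a spurious wakeup, and a notify_all() issued before the wait starts is lost. A possible refinement, sketched here under the assumption that a small helper class is acceptable, pairs the condition variable with a flag. StopSignal and run_store_loop_demo are hypothetical names, not part of the answer's code.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical helper: a stop flag that a worker can wait on with a timeout.
class StopSignal {
public:
    // Returns true if stop was requested, false if the interval elapsed.
    template <class Rep, class Period>
    bool wait_for(std::chrono::duration<Rep, Period> interval) {
        std::unique_lock<std::mutex> lock(mutex_);
        return cv_.wait_for(lock, interval, [this] { return stop_; });
    }
    // Sets the flag under the lock, then wakes every waiter.
    void request_stop() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stop_ = true;
        }
        cv_.notify_all();
    }
private:
    std::mutex mutex_;
    std::condition_variable cv_;
    bool stop_ = false;
};

// Demo: the worker runs one "save" pass every 10 ms until asked to stop.
// Returns how many passes ran (the exact number depends on timing).
int run_store_loop_demo() {
    StopSignal stop;
    int passes = 0;
    std::thread worker([&] {
        while (!stop.wait_for(std::chrono::milliseconds(10)))
            ++passes;  // swap the buffers and write the file here
    });
    std::this_thread::sleep_for(std::chrono::milliseconds(100));
    stop.request_stop();
    worker.join();
    assert(passes >= 0);
    return passes;
}
```

Because the flag is checked under the same mutex that protects it, a stop request made before the worker starts waiting is still seen on the next wait_for() call.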
As a general answer to your general question, there are two possible options:
1- use wait and signal commands to put threads to sleep and wake them up
2- use a sleep in the reading thread to provide the X millisecond interval
If you need a better answer, give more details.
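The second option (a plain sleep in the reading thread) might look roughly like the sketch below. All names here (g_buffer, add_row, take_rows, flush_loop) are made up for illustration, and the file write is reduced to counting rows so the example stays self-contained.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <mutex>
#include <string>
#include <thread>
#include <utility>
#include <vector>

using Rows = std::vector<std::vector<std::string>>;

std::mutex g_bufMutex;
Rows g_buffer;  // hypothetical shared buffer

// Writer side: append one row under the lock.
void add_row(std::vector<std::string> row) {
    std::lock_guard<std::mutex> lock(g_bufMutex);
    g_buffer.push_back(std::move(row));
}

// Take everything buffered so far, leaving the shared buffer empty.
Rows take_rows() {
    Rows taken;
    std::lock_guard<std::mutex> lock(g_bufMutex);
    taken.swap(g_buffer);
    return taken;
}

// Reader thread body: every `period`, drain the buffer and "save" the rows.
// Returns the total number of rows drained.
std::size_t flush_loop(std::atomic<bool>& running, std::chrono::milliseconds period) {
    std::size_t saved = 0;
    while (running.load()) {
        std::this_thread::sleep_for(period);
        saved += take_rows().size();  // write the rows to a file here
    }
    saved += take_rows().size();      // final drain after the stop request
    assert(take_rows().empty());
    return saved;
}
```

The drawback compared with a condition variable is that a stop request is only noticed after the current sleep ends, so shutdown can be delayed by up to one period.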

pthread - accessing multiple objects with a thread

I'm trying to get my hands on multi threading and it's not working so far. I'm creating a program which allows serial communication with a device and it's working quite well without multi threading. Now I want to introduce threads, one thread to continuously send packets, one thread to receive and process packets and another thread for a GUI.
The first two threads need access to four classes in total, but using pthread_create() I can only pass one argument. I then stumbled upon a post here on Stack Overflow (pthread function from a class) where Jeremy Friesner presents a very elegant way. I then figured that it's easiest to create a Core class which contains all the objects my threads need access to as well as all functions for the threads. So here's a sample from my class Core:
/** CORE.CPP **/
#include "SerialConnection.h" // Class for creating a serial connection using termios
#include "PacketGenerator.h" // Allows to create packets to be transfered
#include <pthread.h>
#define NUM_THREADS 4
class Core{
private:
SerialConnection serial; // One of the objects my threads need access to
pthread_t threads[NUM_THREADS];
pthread_t _thread;
public:
Core();
~Core();
void launch_threads(); // Supposed to launch all threads
static void *thread_send(void *arg); // See the linked post above
void thread_send_function(); // See the linked post above
};
Core::Core(){
// Open serial connection
serial.open_connection();
}
Core::~Core(){
// Close serial connection
serial.close_connection();
}
void Core::launch_threads(){
pthread_create(&threads[0], NULL, thread_send, this);
cout << "CORE: Killing threads" << endl;
pthread_exit(NULL);
}
void *Core::thread_send(void *arg){
cout << "THREAD_SEND launched" << endl;
((Core *)arg)->thread_send_function();
return NULL;
}
void Core::thread_send_function(){
generator.create_hello_packet();
generator.send_packet(serial);
pthread_exit(NULL);
}
The problem now is that my serial object crashes with a segmentation fault (that pointer stuff going on in Core::thread_send(void *arg) makes me suspicious). Even when it does not crash, no data is transmitted over the serial connection, even though the program executes without any errors. Execution from main:
/** MAIN.CPP (extract) VARIANT 1 **/
int main(){
Core core;
core.launch_threads(); // No data is transferred
}
However, if I call the thread_send_function directly (the one the thread is supposed to execute), the data is transmitted over the serial connection flawlessly:
/** MAIN.CPP (extract) VARIANT 2 **/
int main(){
Core core;
core.thread_send_function(); // Data transfer works
}
Now I'm wondering what the proper way of dealing with this situation is. Instead of that trickery in Core.cpp, should I just create a struct holding pointers to the different classes I need and then pass that struct to the pthread_create() function? What is the best solution for this problem in general?
The problem you have is that your main thread exits the moment it created the other thread, at which point the Core object is destroyed and the program then exits completely. This happens while your newly created thread tries to use the Core object and send data; you either see absolutely nothing happening (if the program exits before the thread ever gets to do anything) or a crash (if Core is destroyed while the thread tries to use it). In theory you could also see it working correctly, but because the thread probably takes a bit to create the packet and send it, that's unlikely.
You need to use pthread_join to block the main thread just before quitting, until the thread is done and has exited.
And anyway, you should be using C++11's thread support or at least Boost's. That would let you get rid of the low-level mess you have with the pointers.
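With C++11 threads, the join-before-destruction advice could be sketched as follows. This Core is a simplified stand-in, not the asker's class: the serial connection and packet generator are replaced by a counter, and wait(), packets_sent() and run_core_demo() are hypothetical names.

```cpp
#include <cassert>
#include <thread>

// Simplified stand-in for the question's Core class.
class Core {
public:
    ~Core() { wait(); }  // never destroy Core while the thread still uses it

    // Start the sender thread on a member function; no void* casts needed.
    void launch_threads() {
        sender_ = std::thread(&Core::thread_send_function, this);
    }

    // Block until the sender thread has finished.
    void wait() {
        if (sender_.joinable()) sender_.join();
    }

    // Only meaningful after wait() has returned.
    int packets_sent() const { return packetsSent_; }

private:
    void thread_send_function() {
        ++packetsSent_;  // create and send a packet here
    }

    int packetsSent_ = 0;
    std::thread sender_;
};

// Demo: launch, join, then read the result.
int run_core_demo() {
    Core core;
    core.launch_threads();
    core.wait();
    assert(core.packets_sent() == 1);
    return core.packets_sent();
}
```

Joining in the destructor means the object cannot go out of scope while its thread is still running, which is the situation described in the answer above.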

Threading in C++ to keep two functions running in parallel

I have code containing two functions, func1 and func2. Both functions have the same role: keep reading a directory continuously and write the names of the files present to their respective log files. Both functions call a common log function to write the logs. I want to introduce threading in my code so that both keep running in parallel, but they should not access the log function at the same time. How do I achieve that?
This is a classic case of needing a mutex.
void WriteToLog(const char *msg)
{
acquire(mutex);
logfile << msg << endl;
release(mutex);
}
The above code won't "copy and paste" into your system, since mutexes are system specific - pthread_mutex would be the choice if you are using pthreads. C++11 has its own mutex and thread functionality, and Windows has another variant.
From Sajal's comments:
tried pthread_create(&thread1, NULL, start_opca, &opca); pthread_join( thread1, NULL); pthread_create(&thread2, NULL, start_ggca, &ggca); pthread_join( thread2, NULL);
But the problem with this is that it will wait for one thread to finish before starting next. I don't want that.
The join function blocks the calling thread until the thread you call join for finishes. In your case, calling join on the first thread before creating the second guarantees that the first thread will end before the second one begins.
You should create the two threads first, then join them both (instead of interleaving the creation and joining of each).
Additionally, access to the log should be extracted into common code for both (a logging function, a logging class, etc.). Within the extracted code, the log access should be guarded by a mutex.
If you have an implementation (partially) supporting c++11, you should use std::thread and std::mutex for this. Otherwise, you should use boost::thread. If you have access to neither, use pthreads under linux.
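A rough sketch of that advice with std::thread and std::mutex is shown below: both threads are created first, then both are joined, and all log access goes through one guarded function. The names g_log, worker and run_two_loggers are assumptions for illustration, and an in-memory stream stands in for the real log file.

```cpp
#include <algorithm>
#include <cassert>
#include <mutex>
#include <sstream>
#include <string>
#include <thread>

std::mutex g_logMutex;
std::ostringstream g_log;  // stand-in for the real log file stream

// The single place where the log is touched, guarded by the mutex.
void WriteToLog(const std::string& msg) {
    std::lock_guard<std::mutex> lock(g_logMutex);
    g_log << msg << '\n';
}

// Stand-in for func1/func2: writes `count` entries to the shared log.
void worker(const std::string& name, int count) {
    for (int i = 0; i < count; ++i)
        WriteToLog(name + ": entry " + std::to_string(i));
}

// Starts both workers before joining either; returns the number of log lines.
int run_two_loggers(int count) {
    g_log.str("");  // start from an empty log
    std::thread t1(worker, std::string("func1"), count);
    std::thread t2(worker, std::string("func2"), count);
    t1.join();
    t2.join();
    const std::string text = g_log.str();
    const int lines = static_cast<int>(std::count(text.begin(), text.end(), '\n'));
    assert(lines == 2 * count);
    return lines;
}
```

The order of the lines from the two workers varies between runs, but each line is written whole because the stream is only touched while the lock is held.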
On linux, you will need to use pthreads
Since both threads are reading/writing from/to I/O (reading dirs and writing log files) there's no need for multi-threading: you gain no speed improvement parallelizing the task since every I/O access is enqueued at lower levels.
This C code may give you some hints. To answer your question:
You should use a pthread mutex to make sure that the log file can only be accessed by one thread at a time.
#include <pthread.h>
#include <stdio.h>
pthread_mutex_t LogLock = PTHREAD_MUTEX_INITIALIZER;
const char* LogFileName = "test.log";
void* func_tid0(void* a) {
    int i;
    for (i = 0; i < 50; i++) {
        pthread_mutex_lock(&LogLock);
        fprintf((FILE*)a, "write to log by thread0:%d\n", i);
        pthread_mutex_unlock(&LogLock);
    }
    return NULL; /* a void* thread function must return a value */
}
void* func_tid1(void* a) {
    int i;
    for (i = 0; i < 50; i++) {
        pthread_mutex_lock(&LogLock);
        fprintf((FILE*)a, "write to log by thread1:%d\n", i);
        pthread_mutex_unlock(&LogLock);
    }
    return NULL;
}
int main() {
    pthread_t tid0, tid1;
    FILE* fp = fopen(LogFileName, "wb+");
    if (fp == NULL) return 1; /* could not open the log file */
    /* create both threads first, then join them */
    pthread_create(&tid0, NULL, func_tid0, (void*) fp);
    pthread_create(&tid1, NULL, func_tid1, (void*) fp);
    void* ret;
    pthread_join(tid0, &ret);
    pthread_join(tid1, &ret);
    fclose(fp);
    return 0;
}
The other problem you describe does not arise here.
In the code above the main thread is suspended at the first pthread_join, but that does not stop the second thread from running: each thread starts at its pthread_create call, not at pthread_join.
Note also that the mutex serializes the logging section, so only one thread at a time writes to the log.

Accessing and modifying automatic variables on another thread's stack

I want to pass some data around threads but want to refrain from using global variables if I can manage it. The way I wrote my thread routine has the user passing in a separate function for each "phase" of a thread's life cycle: For instance this would be a typical usage of spawning a thread:
void init_thread(void *arg) {
graphics_init();
}
void process_msg_thread(message *msg, void *arg) {
if (msg->ID == MESSAGE_DRAW) {
graphics_draw();
}
}
void cleanup_thread(void *arg) {
graphics_cleanup();
}
int main () {
threadCreator factory;
factory.createThread(init_thread, 0, process_msg_thread, 0, cleanup_thread, 0);
// even indexed arguments are the args to be passed into their respective functions
// this is why each of those functions must have a fixed function signature is so they can be passed in this way to the factory
}
// Behind the scenes: in the newly spawned thread, the first argument given to
// createThread() is called, then a message pumping loop which will call the third
// argument is entered. Upon receiving a special exit message via another function
// of threadCreator, the fifth argument is called.
The most straightforward way to do it is using globals. I'd like to avoid doing that though because it is bad programming practice because it generates clutter.
A certain problem arises when I try to refine my example slightly:
void init_thread(void *arg) {
GLuint tex_handle[50]; // suppose I've got 50 textures to deal with.
graphics_init(&tex_handle); // fill up the array with them during graphics init which loads my textures
}
void process_msg_thread(message *msg, void *arg) {
if (msg->ID == MESSAGE_DRAW) { // this message indicates which texture my thread was told to draw
graphics_draw_this_texture(tex_handle[msg->texturehandleindex]); // send back the handle so it knows what to draw
}
}
void cleanup_thread(void *arg) {
graphics_cleanup();
}
I am greatly simplifying the interaction with the graphics system here but you get the point. In this example code tex_handle is an automatic variable, and all its values are lost when init_thread completes, so will not be available when process_msg_thread needs to reference it.
I can fix this by using globals but that means I can't have (for instance) two of these threads simultaneously since they would trample on each other's texture handle list since they use the same one.
I can use thread-local globals but is that a good idea?
I came up with one last idea. I can allocate storage on the heap in my parent thread and send a pointer to it to the children to mess with. Then I can just free it when the parent thread exits, since I intend for it to clean up its child threads before it exits anyway. So, something like this:
void init_thread(void *arg) {
GLuint *tex_handle = (GLuint*)arg; // my storage space passed as arg
graphics_init(tex_handle);
}
void process_msg_thread(message *msg, void *arg) {
GLuint *tex_handle = (GLuint*)arg; // same thing here
if (msg->ID == MESSAGE_DRAW) {
graphics_draw_this_texture(tex_handle[msg->texturehandleindex]);
}
}
int main () {
threadCreator factory;
GLuint *tex_handle = new GLuint[50];
factory.createThread(init_thread, tex_handle, process_msg_thread, tex_handle, cleanup_thread, 0);
// do stuff, wait etc
...
delete[] tex_handle;
}
This looks more or less safe because my values go on the heap, my main thread allocates it then lets children mess with it as they wish. The children can use the storage freely since the pointer was given to all the functions that need access.
So this got me thinking why not just have it be an automatic variable:
int main () {
threadCreator factory;
GLuint tex_handle[50];
factory.createThread(init_thread, &tex_handle, process_msg_thread, &tex_handle, cleanup_thread, 0);
// do stuff, wait etc
...
} // tex_handle automatically cleaned up at this point
This means the child threads directly access the parent's stack. I wonder if this is kosher.
I found this on the internets: http://software.intel.com/sites/products/documentation/hpc/inspectorxe/en-us/win/ug_docs/olh/common/Problem_Type__Potential_Privacy_Infringement.htm
It seems Intel Inspector XE detects this behavior. So maybe I shouldn't do it? Is it simply a warning of potential privacy infringement as suggested by the URL, or are there other potential issues that may arise that I am not aware of?
P.S. After thinking through all this I realize that maybe this architecture of splitting a thread into a bunch of functions that get called independently wasn't such a great idea. My intention was to remove the complexity of requiring coding up a message handling loop for each thread that gets spawned. I had anticipated possible problems, and if I had a generalized thread implementation that always checked for messages (like my custom one that specifies the thread is to be terminated) then I could guarantee that some future user could not accidentally forget to check for that condition in each and every message loop of theirs.
The problem with my solution to that is that those individual functions are now separate and cannot communicate with each other. They may do so only via globals and thread local globals. I guess thread local globals may be my best option.
P.P.S. This got me thinking about RAII and how the concept of the thread at least as I have ended up representing it has a certain similarity with that of a resource. Maybe I could build an object that represents a thread more naturally than traditional ways... somehow. I think I will go sleep on it.
Put your thread functions into a class. Then they can communicate using instance variables. This requires your thread factory to be changed, but is the cleanest way to solve your problem.
Your idea of using automatic variables will work too, as long as you can guarantee that the function whose stack frame contains the data will never return before your child threads exit. This is not easy to achieve: even after main() returns, child threads can still be running.
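The first suggestion (thread functions as members of a class, sharing instance variables) could be sketched like this. It does not use the asker's threadCreator; GraphicsWorker, Message and the texture values are hypothetical, and a plain std::thread runs the three phases in order.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical message type; only the field used below is modelled.
struct Message {
    std::size_t textureIndex;
};

// Each worker owns its own state, so two workers never share a handle list.
class GraphicsWorker {
public:
    // Runs init, the message loop, and cleanup on a new thread, then joins it.
    void run(const std::vector<Message>& messages) {
        std::thread t([this, &messages] {
            init();
            for (const Message& m : messages) process(m);
            cleanup();
        });
        t.join();
    }

    // Handles that were "drawn"; valid to read after run() has returned.
    const std::vector<unsigned>& drawn() const { return drawn_; }

private:
    // Stand-in for loading textures: fill the handle table.
    void init() {
        for (std::size_t i = 0; i < texHandle_.size(); ++i)
            texHandle_[i] = static_cast<unsigned>(100 + i);
    }
    // Looks up the handle written by init(); at() checks the index.
    void process(const Message& m) {
        drawn_.push_back(texHandle_.at(m.textureIndex));
    }
    void cleanup() {}

    std::array<unsigned, 50> texHandle_{};
    std::vector<unsigned> drawn_;
};
```

Since the handle table is a member, it lives exactly as long as the worker object, which sidesteps both the globals and the question of whose stack the data sits on.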