Inconsistent output from c++ multithreaded program - c++

I have the following program in C++:
// multithreading01.cpp : Defines the entry point for the console application.
//
#include "stdafx.h"
#include <string>
#include <iostream>
#include <process.h>
using namespace std;
bool threadFinished = false;
struct params {
string aFile;
bool tf;
};
void WriteToFile(void *p)
{
params* a = (params*)p;
cout<<a->aFile<<endl;
a->tf = true;
_endthread();
}
int main(int argc, char* argv[])
{
params *param01 = new params;
params *param02 = new params;
param01->aFile = "hello from p1";
param01->tf = false;
param02->aFile = "hello from p2";
param02->tf = false;
_beginthread(WriteToFile,0,(void *) param01);
_beginthread(WriteToFile,0,(void *) param02);
while(!param01->tf || !param02->tf)
{
}
cout << "Main ends" << endl;
system("pause");
return 0;
}
However, I am getting inconsistent outputs such as
output 1:
hello from p1
hello from p2
output 2:
hello from p1hello from p2
output 3:
hhello from p2ello from p1
How can I get a consistent output from this code? I am using Visual C++ 6.0 Standard Edition.

Read this small writeup
As everyone mentioned in the comments, the general idea behind creating threads is to separate tasks and thereby increase performance on modern multicore CPUs, which can run one thread per core.
If you want to access the same resource (the same output stream in your case) from two different threads, you need to make sure the two threads never access it simultaneously; otherwise you get the problem you are seeing.
You provide safe concurrent access by protecting the shared resource with a lock (e.g. POSIX locks, or your platform-specific lock implementation).
A common beginner mistake is to lock the "code" rather than the "resource".
Don't do this:
void WriteToFile(void *p)
{
pthread_mutex_lock(var); //for example only
params* a = (params*)p;
cout<<a->aFile<<endl;
a->tf = true;
_endthread();
pthread_mutex_unlock(var); //for example only
}
You should instead put a lock in your resource
struct params {
lock_t lock; //for example only not actual code
string aFile;
bool tf;
};
void WriteToFile(void *p)
{
params* a = (params*)p;
pthread_mutex_lock(a->lock); //Locking params here not the whole code.
cout<<a->aFile<<endl;
a->tf = true;
pthread_mutex_unlock(a->lock); //Unlocking params
_endthread();
}
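A minimal sketch of the locking idea using standard C++11 threads instead of `_beginthread` (an assumption, since Visual C++ 6.0 predates `std::thread` and `std::mutex`). In the question, the resource both threads share is the output stream, while each thread has its own `params` object, so the sketch uses one mutex that every writer goes through. The names `outputMutex`, `writeLine` and `runWriters` are illustrative, and the in-memory stream stands in for `std::cout` so the result can be inspected.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <ostream>
#include <sstream>
#include <string>
#include <thread>
#include <vector>

// One mutex for the one shared resource (the output stream).
std::mutex outputMutex;

// Writes a whole line while holding the lock, so lines from different
// threads cannot interleave character by character.
void writeLine(std::ostream& out, const std::string& text)
{
    std::lock_guard<std::mutex> guard(outputMutex); // released at scope exit
    out << text << '\n';
}

// Runs one thread per message against an in-memory stream and returns the
// combined text. Passing std::cout to writeLine would print the lines instead.
std::string runWriters(const std::vector<std::string>& messages)
{
    std::ostringstream buffer;
    std::ostream& stream = buffer;
    std::vector<std::thread> threads;
    for (const std::string& message : messages) {
        threads.emplace_back(writeLine, std::ref(stream), message);
    }
    for (std::thread& t : threads) {
        t.join(); // wait for completion instead of spinning on a flag
    }
    return buffer.str();
}
```

The two lines may still come out in either order, since nothing orders the threads relative to each other, but each line stays intact.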

Related

Thread with expensive operations slows down UI thread - Windows 10, C++

The Problem: I have two threads in a Windows 10 application I'm working on, a UI thread (called the render thread in the code) and a worker thread in the background (called the simulate thread in the code). Every couple of seconds or so, the background thread has to perform a very expensive operation that involves allocating a large amount of memory. For some reason, when this operation happens, the UI thread lags for a split second and becomes unresponsive (this is seen in the application as the camera not moving for a second while camera movement input is being given).
Maybe I'm misunderstanding something about how threads work on Windows, but I wasn't aware that this was something that should happen. I was under the impression that you use a separate UI thread for this very reason: to keep it responsive while other threads do more time intensive operations.
Things I've tried: I've removed all communication between the two threads, so there are no mutexes or anything of that sort (unless there's something implicit that Windows does that I'm not aware of). I have also tried setting the UI thread to be a higher priority than the background thread. Neither of these helped.
Some things I've noted: While the UI thread lags for a moment, other applications running on my machine are just as responsive as ever. The heavy operation seems to only affect this one process. Also, if I decrease the amount of memory being allocated, it alleviates the issue (however, for the application to work as I want it to, it needs to be able to do this allocation).
The question: My question is two-fold. First, I'd like to understand why this is happening, as it seems to go against my understanding of how multi-threading should work. Second, do you have any recommendations or ideas on how to fix this and get it so the UI doesn't lag.
Abbreviated code: Note the comment about epochs in timeline.h
main.cpp
#include "Renderer/Headers/Renderer.h"
#include "Shared/Headers/Timeline.h"
#include "Simulator/Simulator.h"
#include <iostream>
#include <Windows.h>
unsigned int __stdcall renderThread(void* timelinePtr);
unsigned int __stdcall simulateThread(void* timelinePtr);
int main() {
Timeline timeline;
HANDLE renderHandle = (HANDLE)_beginthreadex(0, 0, &renderThread, &timeline, 0, 0);
if (renderHandle == 0) {
std::cerr << "There was an error creating the render thread" << std::endl;
return -1;
}
SetThreadPriority(renderHandle, THREAD_PRIORITY_HIGHEST);
HANDLE simulateHandle = (HANDLE)_beginthreadex(0, 0, &simulateThread, &timeline, 0, 0);
if (simulateHandle == 0) {
std::cerr << "There was an error creating the simulate thread" << std::endl;
return -1;
}
SetThreadPriority(simulateHandle, THREAD_PRIORITY_IDLE);
WaitForSingleObject(renderHandle, INFINITE);
WaitForSingleObject(simulateHandle, INFINITE);
return 0;
}
unsigned int __stdcall renderThread(void* timelinePtr) {
Timeline& timeline = *((Timeline*)timelinePtr);
Renderer renderer = Renderer(timeline);
renderer.run();
return 0;
}
unsigned int __stdcall simulateThread(void* timelinePtr) {
Timeline& timeline = *((Timeline*)timelinePtr);
Simulator simulator(timeline);
simulator.run();
return 0;
}
simulator.cpp
// abbreviated
void Simulator::run() {
while (true) {
// abbreviated
timeline->push(latestState);
}
}
// abbreviated
timeline.h
#ifndef TIMELINE_H
#define TIMELINE_H
#include "WorldState.h"
#include <mutex>
#include <vector>
class Timeline {
public:
Timeline();
bool tryGetStateAtFrame(int frame, WorldState*& worldState);
void push(WorldState* worldState);
private:
// The concept of an Epoch was introduced to help reduce mutex conflicts, but right now since the threads are disconnected, there should be no mutex locks at all on the UI thread. However, every 1024 pushes onto the timeline, a new Epoch must be created. The amount of slowdown largely depends on how much memory the WorldState class takes. If I make WorldState small, there isn't a noticeable hiccup, but when it is large, it becomes noticeable.
class Epoch {
public:
static const int MAX_SIZE = 1024;
void push(WorldState* worldstate);
int getSize();
WorldState* getAt(int index);
private:
int size = 0;
WorldState states[MAX_SIZE];
};
Epoch* pushEpoch;
std::mutex lock;
std::vector<Epoch*> epochs;
};
#endif // !TIMELINE_H
timeline.cpp
#include "../Headers/Timeline.h"
#include <iostream>
#include <stdexcept> // std::out_of_range
Timeline::Timeline() {
pushEpoch = new Epoch();
}
bool Timeline::tryGetStateAtFrame(int frame, WorldState*& worldState) {
if (!lock.try_lock()) {
return false;
}
if (frame >= epochs.size() * Epoch::MAX_SIZE) {
lock.unlock();
return false;
}
worldState = epochs.at(frame / Epoch::MAX_SIZE)->getAt(frame % Epoch::MAX_SIZE);
lock.unlock();
return true;
}
void Timeline::push(WorldState* worldState) {
pushEpoch->push(worldState);
if (pushEpoch->getSize() == Epoch::MAX_SIZE) {
lock.lock();
epochs.push_back(pushEpoch);
lock.unlock();
pushEpoch = new Epoch();
}
}
void Timeline::Epoch::push(WorldState* worldState) {
if (this->size == this->MAX_SIZE) {
throw std::out_of_range("Pushed too many items to Epoch without clearing");
}
this->states[this->size] = *worldState;
this->size++;
}
int Timeline::Epoch::getSize() {
return this->size;
}
WorldState* Timeline::Epoch::getAt(int index) {
if (index >= this->size) {
throw std::out_of_range("Tried accessing nonexistent element of epoch");
}
return &(this->states[index]);
}
Renderer.cpp: loops to call Presenter::update() and some OpenGL rendering tasks.
Presenter.cpp
// abbreviated
void Presenter::update() {
camera->update();
// timeline->tryGetStateAtFrame(Time::getFrames(), worldState); // Normally this would cause a potential mutex conflict, but for now I have it commented out. This is the only place that anything on the UI thread accesses timeline.
}
// abbreviated
Any help/suggestions?
I ended up figuring this out!
So as it turns out, the new operator in C++ is thread-safe, which means that once it starts, it has to finish before any other threads can do anything. Why was that a problem in my case? Well, when an Epoch was being initialized, it had to initialize an array of 1024 WorldStates, each of which has 10,000 CellStates that need to be initialized, and each of those had an array of 16 items that needed to be initialized, so we ended up with over 100,000,000 objects needing to be initialized before the new operator could return. That was taking long enough that it caused the UI to hiccup while it was waiting.
The solution was to create a factory function that would build the pieces of the Epoch piecemeal, one constructor at a time and then combine them together and return a pointer to the new epoch.
timeline.h
#ifndef TIMELINE_H
#define TIMELINE_H
#include "WorldState.h"
#include <mutex>
#include <vector>
class Timeline {
public:
Timeline();
bool tryGetStateAtFrame(int frame, WorldState*& worldState);
void push(WorldState* worldState);
private:
class Epoch {
public:
static const int MAX_SIZE = 1024;
static Epoch* createNew();
void push(WorldState* worldstate);
int getSize();
WorldState* getAt(int index);
private:
Epoch();
int size = 0;
WorldState* states[MAX_SIZE];
};
Epoch* pushEpoch;
std::mutex lock;
std::vector<Epoch*> epochs;
};
#endif // !TIMELINE_H
timeline.cpp
Timeline::Epoch* Timeline::Epoch::createNew() {
Epoch* epoch = new Epoch();
for (unsigned int i = 0; i < MAX_SIZE; i++) {
epoch->states[i] = new WorldState();
}
return epoch;
}

No need for a mutex: race conditions are not always bad, are they?

I have this crazy idea that mutex synchronization can be omitted in some cases where most of us would normally use it.
OK, suppose you have this case:
Buffer *buffer = new Buffer(); // Initialized by main thread;
...
// The call to buffer's `accumulateSomeData` method is thread-safe
// and is heavily executed by many workers from different threads simultaneously.
buffer->accumulateSomeData(data); // While the code inside is equivalent to vector->push_back()
...
// All lines of code below are executed by a totally separate timer
// thread that executes once per second until the program is finished.
auto bufferPrev = buffer; // A temporary pointer to previous instance
// Switch buffers, put old one offline
buffer = new Buffer();
// As of this line of code all the threads will switch to new instance
// of buffer. Which yields that calls to `accumulateSomeData`
// are executed over new buffer instance. Which also means that old
// instance is kinda taken offline and can be safely operated from a
// timer thread.
bufferPrev->flushToDisk(); // Ok, so we can safely flush
delete bufferPrev;
It's obvious that during buffer = new Buffer(); there can still be uncompleted operations adding data to the previous instance, but since disk operations are slow we get a natural kind of barrier.
So how do you estimate the risk of running such code without mutex synchronisation?
Edit
It's so hard these days to ask a question on SO without getting mugged by a couple of angry guys for no reason.
Here is my code, correct in all respects:
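#include <cassert>
#include "leveldb/db.h"
#include "leveldb/filter_policy.h"
#include "leveldb/write_batch.h" // leveldb::WriteBatch
#include <atomic> // std::atomic
#include <iostream>
#include <thread> // std::thread::hardware_concurrency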
#include <boost/asio.hpp>
#include <boost/chrono.hpp>
#include <boost/thread.hpp>
#include <boost/filesystem.hpp>
#include <boost/lockfree/stack.hpp>
#include <boost/lockfree/queue.hpp>
#include <boost/uuid/uuid.hpp> // uuid class
#include <boost/uuid/uuid_io.hpp> // streaming operators etc.
#include <boost/uuid/uuid_generators.hpp> // generators
#include <CommonCrypto/CommonDigest.h>
using namespace std;
using namespace boost::filesystem;
using boost::mutex;
using boost::thread;
enum FileSystemItemType : char {
Unknown = 1,
File = 0,
Directory = 4,
FileLink = 2,
DirectoryLink = 6
};
// Structure packing optimizations are used in the code below
// http://www.catb.org/esr/structure-packing/
class FileSystemScanner {
private:
leveldb::DB *database;
boost::asio::thread_pool pool;
leveldb::WriteBatch *batch;
std::atomic<int> queue_size;
std::atomic<int> workers_online;
std::atomic<int> entries_processed;
std::atomic<int> directories_processed;
std::atomic<uintmax_t> filesystem_usage;
boost::lockfree::stack<boost::filesystem::path*, boost::lockfree::fixed_sized<false>> directories_pending;
void work() {
workers_online++;
boost::filesystem::path *item;
if (directories_pending.pop(item) && item != NULL)
{
queue_size--;
try {
boost::filesystem::directory_iterator completed;
boost::filesystem::directory_iterator iterator(*item);
while (iterator != completed)
{
bool isFailed = false, isSymLink, isDirectory;
boost::filesystem::path path = iterator->path();
try {
isSymLink = boost::filesystem::is_symlink(path);
isDirectory = boost::filesystem::is_directory(path);
} catch (const boost::filesystem::filesystem_error& e) {
isFailed = true;
isSymLink = false;
isDirectory = false;
}
if (!isFailed)
{
if (!isSymLink) {
if (isDirectory) {
directories_pending.push(new boost::filesystem::path(path));
directories_processed++;
boost::asio::post(this->pool, [this]() { this->work(); });
queue_size++;
} else {
filesystem_usage += boost::filesystem::file_size(iterator->path());
}
}
}
int result = ++entries_processed;
if (result % 10000 == 0) {
cout << entries_processed.load() << ", " << directories_processed.load() << ", " << queue_size.load() << ", " << workers_online.load() << endl;
}
++iterator;
}
delete item;
} catch (boost::filesystem::filesystem_error &e) {
}
}
workers_online--;
}
public:
FileSystemScanner(int threads, leveldb::DB* database):
pool(threads), queue_size(), workers_online(), entries_processed(), directories_processed(), directories_pending(0), database(database)
{
}
void scan(string path) {
queue_size++;
directories_pending.push(new boost::filesystem::path(path));
boost::asio::post(this->pool, [this]() { this->work(); });
}
void join() {
pool.join();
}
};
int main(int argc, char* argv[])
{
leveldb::Options opts;
opts.create_if_missing = true;
opts.compression = leveldb::CompressionType::kSnappyCompression;
opts.filter_policy = leveldb::NewBloomFilterPolicy(10);
leveldb::DB* db;
leveldb::DB::Open(opts, "/temporary/projx", &db);
FileSystemScanner scanner(std::thread::hardware_concurrency(), db);
scanner.scan("/");
scanner.join();
return 0;
}
My question is: can I omit synchronization for batch, which I'm not using yet, since it's thread-safe and it should be enough to just switch buffers before actually committing any results to disk?
You have a serious misunderstanding. You think that when you have a race condition, there is some specific list of things that can happen. This is not true. A race condition can cause any kind of failure, including crashes. So absolutely, definitely not. You absolutely cannot do this.
That said, even with this misunderstanding, this is still a disaster.
Consider:
buffer = new Buffer();
Suppose this is implemented by first allocating memory, then setting buffer to point to that memory, and then calling the constructor. Other threads may operate on the unconstructed buffer. Boom.
Now, you can fix this. But it's just one of the many ways I can imagine this screwing up. And it can screw up in ways that we're not clever enough to imagine. So, for all that is holy, do not even think of doing this ever again.
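For contrast, here is a minimal sketch of the same double-buffering idea done with a lock. The `Accumulator` class, its `std::vector<int>` payload and the method names are hypothetical stand-ins for the question's `Buffer`. The only point is that every append and the buffer switch happen under the same mutex, while the slow flush runs outside it on a buffer no worker can reach any more.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

class Accumulator {
public:
    // Called concurrently by worker threads.
    void add(int value)
    {
        std::lock_guard<std::mutex> guard(mutex_);
        current_.push_back(value);
    }

    // Called by the timer thread: takes the filled buffer offline.
    // Only the cheap swap happens under the lock; the caller can then
    // flush the returned vector without blocking the workers.
    std::vector<int> takeOffline()
    {
        std::vector<int> offline;
        {
            std::lock_guard<std::mutex> guard(mutex_);
            offline.swap(current_);
        }
        return offline;
    }

private:
    std::mutex mutex_;
    std::vector<int> current_;
};

// Starts `workers` threads that each add `perWorker` values while the calling
// thread repeatedly drains the buffer. Returns how many values were collected.
std::size_t runDemo(int workers, int perWorker)
{
    assert(workers >= 0 && perWorker >= 0);
    Accumulator acc;
    std::vector<std::thread> threads;
    for (int w = 0; w < workers; ++w) {
        threads.emplace_back([&acc, perWorker] {
            for (int i = 0; i < perWorker; ++i) acc.add(i);
        });
    }
    std::size_t collected = 0;
    for (int round = 0; round < 10; ++round) {
        collected += acc.takeOffline().size(); // "flush" an offline buffer
    }
    for (std::thread& t : threads) t.join();
    collected += acc.takeOffline().size();     // final flush after the workers finish
    return collected;
}
```

Because the swap is O(1), the lock is held only briefly, so the cost the question hoped to avoid is small, and no value is lost or written to a buffer that is being flushed.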

NET-SNMP and multithreading

I am writing a C++ SNMP server using the NET-SNMP library. I read the documentation and still have one question: can multiple threads share a single SNMP session and use it simultaneously in functions like snmp_sess_synch_response(), or must I initialize and open a new session in each thread?
Well, when I call snmp_sess_synch_response() from two different threads simultaneously using the same opaque session pointer, one of three errors always occurs. The first is a memory access violation, the second is an endless WaitForSingleObject() in both threads, and the third is a heap allocation error.
I suppose I can treat this as an answer: sharing a single session between multiple threads is unsafe, because using it simultaneously in functions like snmp_sess_synch_response() causes errors.
P.S. Here is the code described above:
void* _opaqueSession;
boost::mutex _sessionMtx;
std::shared_ptr<netsnmp_pdu> ReadObjectValue(Oid& objectID)
{
netsnmp_pdu* requestPdu = snmp_pdu_create(SNMP_MSG_GET);
netsnmp_pdu* response = 0;
snmp_add_null_var(requestPdu, objectID.GetObjId(), objectID.GetLen());
void* opaqueSessionCopy;
{
//Locks the _opaqueSession, wherever it appears
boost::mutex::scoped_lock lock(_sessionMtx);
opaqueSessionCopy = _opaqueSession;
}
//Errors here!
snmp_sess_synch_response(opaqueSessionCopy, requestPdu, &response);
std::shared_ptr<netsnmp_pdu> result(response);
return result;
}
void ExecuteThread1()
{
Oid sysName(".1.3.6.1.2.1.1.5.0");
try
{
while(true)
{
boost::this_thread::interruption_point();
ReadObjectValue(sysName);
}
}
catch(...)
{}
}
void ExecuteThread2()
{
Oid sysServices(".1.3.6.1.2.1.1.7.0");
try
{
while(true)
{
boost::this_thread::interruption_point();
ReadObjectValue(sysServices);
}
}
catch(...)
{}
}
int main()
{
std::string community = "public";
std::string ipAddress = "127.0.0.1";
snmp_session session;
{
SNMP::snmp_sess_init(&session);
session.timeout = 500000;
session.retries = 0;
session.version = SNMP_VERSION_2c;
session.remote_port = 161;
session.peername = (char*)ipAddress.c_str();
session.community = (u_char*)community.c_str();
session.community_len = community.size();
}
_opaqueSession = snmp_sess_open(&session);
boost::thread thread1 = boost::thread(&ExecuteThread1);
boost::thread thread2 = boost::thread(&ExecuteThread2);
boost::this_thread::sleep(boost::posix_time::seconds(30));
thread1.interrupt();
thread1.join();
thread2.interrupt();
thread2.join();
return 0;
}
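Note that the mutex in the code above only protects the copy of the pointer; the call that actually uses the session runs unlocked. If a single session has to be shared, the lock needs to cover the whole call. The following is a minimal sketch of that pattern using a stand-in `FakeSession` type rather than the NET-SNMP API. Everything in it is hypothetical apart from the locking idea, and it does not establish whether serialized access is sufficient for NET-SNMP itself. Opening one session per thread is the other option suggested by the observations above.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Stand-in for a session object that must not be used by two threads at once.
struct FakeSession {
    int inFlight = 0;     // calls currently inside request()
    int maxInFlight = 0;  // highest overlap ever observed
    int completed = 0;

    int request(int value)
    {
        ++inFlight;
        if (inFlight > maxInFlight) maxInFlight = inFlight;
        const int reply = value * 2; // pretend to query an agent
        --inFlight;
        ++completed;
        return reply;
    }
};

// Keeps the session and the mutex that guards it together.
class SharedSession {
public:
    int request(int value)
    {
        // Held for the whole call, not just while reading the pointer.
        std::lock_guard<std::mutex> guard(mutex_);
        return session_.request(value);
    }

    int completed()
    {
        std::lock_guard<std::mutex> guard(mutex_);
        return session_.completed;
    }

    int maxInFlight()
    {
        std::lock_guard<std::mutex> guard(mutex_);
        return session_.maxInFlight;
    }

private:
    std::mutex mutex_;
    FakeSession session_;
};

// Issues perThread requests from each of threadCount threads and returns
// the number of completed requests.
int runRequests(int threadCount, int perThread)
{
    SharedSession shared;
    std::vector<std::thread> threads;
    for (int t = 0; t < threadCount; ++t) {
        threads.emplace_back([&shared, perThread] {
            for (int i = 0; i < perThread; ++i) shared.request(i);
        });
    }
    for (std::thread& th : threads) th.join();
    assert(shared.maxInFlight() <= 1); // no two calls ever overlapped
    return shared.completed();
}
```

The trade-off is that requests from different threads now run one at a time, so a slow agent or a long timeout stalls every thread that shares the session.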

Writing (logging) into the same file from different threads, different functions?

In C++, is there any way to make writing into a file thread-safe in the following scenario?
void foo_one(){
lock(mutex1);
//open file abc.txt
//write into file
//close file
unlock(mutex1);
}
void foo_two(){
lock(mutex2);
//open file abc.txt
//write into file
//close file
unlock(mutex2);
}
In my (multi-threaded) application, it is likely that foo_one() and foo_two() are executed by two different threads at the same time.
Is there any way to make the above thread-safe?
I have considered using file locks (fcntl and/or lockf), but I am not sure how to use them, because fopen() is used in the application (for performance reasons) and it was stated somewhere that those file locks should not be used with fopen (because it is buffered).
PS: The functions foo_one() and foo_two() are in two different classes, there is no way to share data between them, and sadly the design is such that one function cannot call the other.
Add a function for logging.
Both functions call the logging function (which does the appropriate locking).
mutex logMutex;
void log(std::string const& msg)
{
RAIILock lock(logMutex);
// open("abc.txt");
// write msg
// close
}
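A runnable version of that idea might look like the following sketch. It assumes C++11, with `std::lock_guard` in place of the `RAIILock` placeholder. The function is named `logLine` and takes the file path as a parameter purely for illustration (the original hard-codes `abc.txt`), and `runBoth` is a hypothetical helper that reproduces the two-thread scenario.

```cpp
#include <cassert>
#include <fstream>
#include <mutex>
#include <string>
#include <thread>

std::mutex logMutex; // shared by every caller of logLine()

// Appends one line to the file at `path`. The lock is held across
// open/write/close, so concurrent callers cannot interleave their output.
void logLine(const std::string& path, const std::string& msg)
{
    assert(!path.empty());
    std::lock_guard<std::mutex> lock(logMutex);
    std::ofstream file(path, std::ios::app);
    if (file) {
        file << msg << '\n';
    }
} // the file is closed first, then the mutex is released

// foo_one() and foo_two() can live in unrelated classes; the only thing
// they share is the logging function.
void foo_one(const std::string& path) { logLine(path, "from foo_one"); }
void foo_two(const std::string& path) { logLine(path, "from foo_two"); }

// Runs both functions on separate threads, as in the question's scenario.
void runBoth(const std::string& path)
{
    std::thread a(foo_one, path);
    std::thread b(foo_two, path);
    a.join();
    b.join();
}
```

This sidesteps the question's constraint that the two classes cannot share data: the mutex belongs to the logging function, not to either caller.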
If you really need a logger, don't build it by writing to files directly; consider a dedicated logging library, which keeps that concern out of the code you're writing. There are a number of thread-safe loggers. The first one that comes to mind is g2log; searching further turns up log4cplus and even minimalist ones.
If the essence of functions foo_one() and foo_two() is only to open the file, write something to it, and close it, then use the same mutex to keep them from messing each other up:
void foo_one(){
lock(foo_mutex);
//open file abc.txt
//write into file
//close file
unlock(foo_mutex);
}
void foo_two(){
lock(foo_mutex);
//open file abc.txt
//write into file
//close file
unlock(foo_mutex);
}
Of course, this assumes these are the only writers. If other threads or processes write to the file, a lock file might be a good idea.
You should do this: have a struct with a mutex and an ofstream:
struct parser {
ofstream myfile;
mutex lock;
};
Then you can pass this struct (a) to foo_one and foo_two as a void*:
parser * a = new parser();
Initialise the mutex lock, then pass the struct to both functions.
void foo_one(void * a){
parser * b = static_cast<parser *>(a);
lock(b->lock);
b->myfile.open("abc.txt");
//write into file
b->myfile.close();
unlock(b->lock);
}
You can do the same for the foo_two function. This will provide a thread-safe means to write to the same file.
Try this code. I've done this with an MFC console application.
#include "stdafx.h"
#include <mutex>
#include <thread>
CWinApp theApp;
using namespace std;
const int size_ = 100; //thread array size
std::mutex mymutex;
void printRailLock(int id) {
printf("#ID :%d", id);
lock_guard<std::mutex> lk(mymutex); // <- this is the lock
CStdioFile lastLog;
CString logfiledb{ "_FILE_2.txt" };
CString str;
str.Format(L"%d\n", id);
bool opend = lastLog.Open(logfiledb, CFile::modeCreate | CFile::modeReadWrite | CFile::modeNoTruncate);
if (opend) {
lastLog.SeekToEnd();
lastLog.WriteString(str);
lastLog.Flush();
lastLog.Close();
}
}
int main()
{
int nRetCode = 0;
HMODULE hModule = ::GetModuleHandle(nullptr);
if (hModule != nullptr)
{
if (!AfxWinInit(hModule, nullptr, ::GetCommandLine(), 0))
{
wprintf(L"Fatal Error: MFC initialization failed\n");
nRetCode = 1;
}
else
{
std::thread threads[size_];
for (int i = 0; i < size_; ++i) {
threads[i] = std::thread(printRailLock, i + 1);
Sleep(1000);
}
for (auto& th : threads) { th.join(); }
}
}
else
{
wprintf(L"Fatal Error: GetModuleHandle failed\n");
nRetCode = 1;
}
return nRetCode;
}
References:
http://www.cplusplus.com/reference/mutex/lock_guard/
http://www.cplusplus.com/reference/mutex/mutex/
http://devoptions.blogspot.com/2016/07/multi-threaded-file-writer-in-c_14.html

Physical memory increases continuously in Visual C++ with CryptMsgClose and CryptReleaseContext

I am using C++ and Visual Studio 2005.
I have a project whose memory usage increases abnormally. While debugging the code I realized that several parts contribute to it, for example this one:
// crypt32.lib has to be added to the linker inputs
#include <windows.h>
#define MY_ENCODING_TYPE (PKCS_7_ASN_ENCODING | X509_ASN_ENCODING)
void memoryUP( const unsigned char *pData, int cData )
{
HCRYPTMSG msg = NULL;
HCRYPTPROV hProv = NULL;
CryptAcquireContext(&hProv,NULL,NULL,PROV_RSA_FULL,0);
msg = CryptMsgOpenToDecode(MY_ENCODING_TYPE,0,0,hProv,NULL,NULL);
if(!(CryptMsgUpdate( msg, pData, cData, TRUE)))
{
if(msg != NULL)
{
CryptMsgClose(msg);
msg = NULL;
}
}
if (hProv != NULL)
CryptReleaseContext(hProv,0);
if (msg != NULL)
{
CryptMsgClose(msg);
msg = NULL;
}
}
int main(int argc, char** argv)
{
MyFile myfile = myReadFile("c:\\file.p7s");
{
for(int i=0; i<100000; ++i)
{
memoryUP( myfile._data, myfile._length );
}
}
delete myfile;
return 0;
}
When I run this code, memory usage goes up continuously when CryptMsgUpdate is called. Am I deallocating incorrectly?
I tried enabling CRT memory leak detection to find the leak, but nothing shows up:
#define _CRTDBG_MAP_ALLOC
#include <stdlib.h>
#include <crtdbg.h>
and
_CrtDumpMemoryLeaks();
Thanks in Advance
You have to release resources in reverse order of their acquisition:
CryptAcquireContext();
if (success)
{
CryptMsgOpenToDecode();
if (success)
{
CryptMsgClose();
}
// else: nothing to close, opening failed
CryptReleaseContext();
}
// else: nothing to release, acquisition failed
The deeper nested constructions depend on the outer ones and may lock up resources, so you can only release the prerequisite resources after you've released the dependent ones.
Since you tagged this C++, I would be remiss not to mention that this sort of thing should be handled with RAII: you should make a class that takes responsibility for the resource. As you can see even in this simple example, writing the correct error-checking paths very quickly becomes onerous, so it'd be much better and more modular to have a class that cleans up after itself, which automatically happens in the correct order.
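A minimal sketch of that approach. To stay portable it uses a hypothetical `ScopedResource` wrapper with a caller-supplied release function rather than the real CryptoAPI handles; with the actual API, the release functions would wrap `CryptReleaseContext` and `CryptMsgClose`. Because C++ destroys local objects in reverse order of construction, the message is closed before the provider context is released, with no explicit cleanup code.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Owns one resource and releases it exactly once, in the destructor.
class ScopedResource {
public:
    explicit ScopedResource(std::function<void()> release)
        : release_(std::move(release))
    {
        assert(release_); // a release function must be supplied
    }
    ~ScopedResource() { release_(); }

    // Non-copyable, so the resource cannot be released twice.
    ScopedResource(const ScopedResource&) = delete;
    ScopedResource& operator=(const ScopedResource&) = delete;

private:
    std::function<void()> release_;
};

// Simulates one acquire/use/release cycle and records the release order.
std::vector<std::string> decodeOnce()
{
    std::vector<std::string> releases;
    {
        // Acquired first, therefore released last.
        ScopedResource provider([&releases] { releases.push_back("provider"); });
        // Depends on the provider, therefore released first.
        ScopedResource message([&releases] { releases.push_back("message"); });
        // ... use the message here; an early return or an exception
        // would still run both destructors ...
    }
    return releases;
}
```

Calling `decodeOnce()` yields the release order `message`, then `provider`, which is the reverse-of-acquisition order described above.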
I think you should call CryptMsgClose before CryptReleaseContext.