How to control reading from file using performance counters?


Several operations are running on drive G. My program should read data from a file. When the disk usage is very high (>90%), the program should slow down its reading so that it doesn't interfere with other processes that use the disk. I assume that checking the Disk Time right after calling get_data_from_file() will make the counter return a very high percentage, because the disk was just used. You can see that in the image.
Any suggestions on how I can check the Disk Time correctly?
PDH_HQUERY query;
PDH_HCOUNTER counter;
PdhOpenQuery(NULL, 0, &query);
PdhAddCounterA(query, "\\LogicalDisk(G:)\\% Disk Time", 0, &counter);
PdhCollectQueryData(query);
auto getDiskTime = [&]()->double
{
    PDH_FMT_COUNTERVALUE fmtCounter;
    PdhCollectQueryData(query);
    PdhGetFormattedCounterValue(counter, PDH_FMT_DOUBLE, 0, &fmtCounter);
    return fmtCounter.doubleValue;
};
for(...)
{
    get_data_from_file();
    print_done_percentage();
    double diskUsage = getDiskTime();
    if(diskUsage >= 90)
    {
        std::cout << "The disk usage is over " << diskUsage << "%. I'll wait...\n";
        while(diskUsage >= 90)
        {
            diskUsage = getDiskTime();
            Sleep(500);
        }
    }
}

A separate monitoring thread could help you measure disk usage more independently of your own reading.
The function executed by the thread would look like this:
// getDiskTime() must be visible here, e.g. as a free function
void diskmonitor(atomic<double>& du, const atomic<bool>& finished) {
    while (!finished) { // stop looping as soon as main process has finished job
        du = getDiskTime(); // measure disk
        this_thread::sleep_for(chrono::milliseconds(500)); //wait
    }
}
It communicates with the main thread through atomic variables passed by reference (atomic, to avoid data races).
Your processing loop would look as follows:
atomic<bool> finished{false}; // tell diskmonitor that the processing is ongoing
atomic<double> diskusage{0}; // last disk usage read by diskmonitor
thread t(diskmonitor, ref(diskusage), ref(finished)); // launch monitor
for (int i = 0; i < 1000; i++)
{
    ...
    print_done_percentage();
    while (diskusage >= 90) { // disk usage is filled in background
        std::cout << "The disk usage is over " << diskusage << "%. I'll wait...\n";
        this_thread::sleep_for(chrono::milliseconds(500));
    }
    ...
}
finished = true; // tell diskmonitor that it's finished, so that it ends the loop
t.join(); // wait until diskmonitor is finished.
This example is with standard C++ threads. Of course you could code something similar with OS specific threads.
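A minimal, self-contained sketch of the same idea, assuming C++11 or later. The DiskMonitor class and its sample callback are hypothetical names introduced here for illustration. In the real program the callback would wrap the PDH query from the question; the sketch only needs some function that returns a percentage.

```cpp
#include <atomic>
#include <chrono>
#include <functional>
#include <thread>
#include <utility>

// Hypothetical helper: calls `sample` on a background thread every `period`
// and publishes the latest reading through an atomic.
class DiskMonitor {
public:
    DiskMonitor(std::function<double()> sample, std::chrono::milliseconds period)
        : sample_(std::move(sample)), period_(period),
          worker_([this] { run(); }) {}

    ~DiskMonitor() {
        finished_ = true;  // ask the sampling loop to stop
        worker_.join();    // wait until it has stopped
    }

    // Last value produced by the sampler (0 until the first sample).
    double usage() const { return usage_.load(); }

private:
    void run() {
        while (!finished_) {
            usage_ = sample_();                    // measure
            std::this_thread::sleep_for(period_);  // wait
        }
    }

    std::function<double()> sample_;
    std::chrono::milliseconds period_;
    std::atomic<bool> finished_{false};
    std::atomic<double> usage_{0.0};
    std::thread worker_;  // declared last so the other members exist before it starts
};
```

The reading loop would construct one DiskMonitor before it starts and poll monitor.usage() >= 90 between reads. Because the destructor sets the flag and joins, the stop signal cannot be forgotten or set to the wrong value.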

Related

How can one thread do something instead of waiting on a condition variable?

I searched for a long time before asking this question and couldn't find how to solve my problem.
I have five threads (workers). These workers mine gold, transport it to the avant poste, and unload it there.
My problem is that while a worker is mining gold, the user can input b to check whether there is enough gold and, if so, to build a barrack.
While a worker is mining gold there is a 2-second sleep, which is why I use pthread_cond_timedwait().
I have global variables that store the number of barracks, the gold on the map, and the gold in the avant poste.
Here is the pseudo code.
void makeBarrack(size_t data) {
timespec waitTime = { 2, 0 };
pthread_mutex_lock(&check_mutex);
while (wantBarrack) {
pthread_cond_timedwait(&condp, &gold_mutex, &waitTime);
}
std::cout << "Worker" << data << "is making barrack" << std::endl;
wantBarrack = false;
pthread_mutex_lock(&unload_mutex);
avantPoste -= 100;
pthread_mutex_unlock(&unload_mutex);
barracks++;
pthread_mutex_unlock(&check_mutex);
}
void *work(void *data, char input) {
size_t thread_num = (size_t) data;
pthread_mutex_lock(&gold_mutex);
timespec waitTime = { 2, 0 };
if ((input == 'B' || input == 'b') && avantPoste >= 100) {
wantBarrack = true;
input = 0;
} else if ((input == 'B' || input == 'b') && avantPoste < 100) {
std::cout << "There is " << avantPoste << " gold" << std::endl;
}
while (wantBarrack) {
pthread_cond_timedwait(&condp, &gold_mutex, &waitTime);
}
makeBarrack(data);
}
I am trying to make something like producer-consumer, but in my task a thread needs to do something (mine gold) instead of waiting for other threads to mine.
Another question: do I need to use the same mutex in these two functions?
P.S.
I am a novice in multithreading, so it would be good if someone edited my question if there is something wrong.

The solution was that I learnt I can use a condition variable in a simple if. The main reason to use a condition variable is that we can block our thread without blocking other threads (it unlocks the mutex while waiting on the condition variable). We just need to signal that the condition is met, and then the thread is unblocked (released) and can run the function we want. I am using pthread_cond_timedwait() because it lets me block my thread for the time I want.
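The behaviour described above (wait up to a fixed time, but wake early when signalled) can be sketched with the standard C++ counterparts of the pthread calls. This swaps pthread_cond_timedwait() for std::condition_variable::wait_for. The Outpost struct, requestBarrack and mineOnce are hypothetical names, and the 10 gold per mining step is an arbitrary assumption.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical shared state, loosely modelled on the question's globals.
// A single mutex guards all of it.
struct Outpost {
    std::mutex m;
    std::condition_variable cv;
    int gold = 0;
    int barracks = 0;
    bool wantBarrack = false;
};

// Called by the input handler when the user presses 'b'.
void requestBarrack(Outpost& o) {
    {
        std::lock_guard<std::mutex> lock(o.m);
        o.wantBarrack = true;
    }
    o.cv.notify_one();
}

// One mining step: block for up to `mineTime`, but wake early on a request.
// The mutex is released while waiting, so other workers are not blocked.
// Returns true if a barrack was built during this step.
bool mineOnce(Outpost& o, std::chrono::milliseconds mineTime) {
    std::unique_lock<std::mutex> lock(o.m);
    const bool requested =
        o.cv.wait_for(lock, mineTime, [&o] { return o.wantBarrack; });
    if (!requested) {
        o.gold += 10;  // timed out: the mining step completed (assumed yield)
        return false;
    }
    o.wantBarrack = false;
    if (o.gold < 100) {
        return false;  // not enough gold for a barrack
    }
    o.gold -= 100;
    ++o.barracks;
    return true;
}
```

Using one mutex for the flag and the gold keeps the check and the build atomic with respect to other workers, which also addresses the "same mutex" part of the question in this simplified model.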

Experiencing audio dropouts with OS X core audio playback/output

I'm doing playback using core audio (OS X, 10.11.4 Beta, older mac mini) using a simple output audio unit configured for input and output (though all of my problems seem to be with output). This is a streaming audio source from socket/internet feeding into a boost lockless queue, which then feeds into the output AU. I'm getting audio dropouts that appear to be a result of the AU render callback not being called by core audio intermittently.
Here is a graph. There were ~10 seconds of flawless audio before this section.
black: sample audio, simple sine wave
blue: wall clock duration of render callback (OutputProc) in ms, point off the chart above is ~120ms
orange: size of lockless queue (playback_buf) in samples/1000 to fit it in graph nicely
x-axis: time in ms
Everything is logged in OutputProc, so if it isn't called, nothing gets logged, but the graphing tool connects the dots across those periods. There are always enough samples in the buffer. It seems that from ~22475ms to ~22780ms, OutputProc is only called once, at 22640. That call does have a long wall clock time, but this seems to be due to pre-emption. Later, in the 22800 to 23000 range, there are still dropouts, but OutputProc doesn't last any longer than normal and certainly doesn't overrun the real-time window (~6ms here; the HW sample rate is 96kHz). So I'm thinking some other thread is pre-empting it somehow. I would expect the core audio thread to have very high priority, though. I do have some boost asio socket input/output going on in parallel (e.g. boost::asio::io_service io_service), but I would expect that to always lose priority to core audio. Pointers to the actual problem are always welcome, but I can make progress if I can just find out which thread(s) are executing during those times of interest. Is there something in Xcode that shows a scheduler history or thread history, possibly per CPU core?
The OutputProc if it helps:
OSStatus AudioStream::OutputProc(void *inRefCon,
AudioUnitRenderActionFlags *ioActionFlags,
const AudioTimeStamp *TimeStamp,
UInt32 inBusNumber,
UInt32 inNumberFrames,
AudioBufferList *ioData)
{
AudioStream *This = (AudioStream *) inRefCon;
playback_cb_dur_log.StartTime();
static bool first_call = true;
if (first_call)
{
std::cout << TIME(timer) << " playback starting\n";
This->playback_state = PLAYBACK_ACTIVE;
first_call = false;
}
int playback_buf_avail = (int) This->playback_buf.read_available();
playback_buf_size_log.AddPoint(playback_buf_avail/1000.);
if (playback_buf_avail >= This->playback_buf_thresh)
{
std::cout << TIME() << " audio, thresh: " << This->playback_buf_thresh << ", buf_size: " << playback_buf_avail << std::endl;
// new threshold just one frame of data
This->playback_buf_thresh = This->frames_total;
for(int i = 0; i < This->num_channels; i++)
{
float *temp = (float *) ioData->mBuffers[i].mData;
This->playback_buf.pop(temp, inNumberFrames);
playback_sample_log.AddData(ioData->mBuffers[i].mData, inNumberFrames, This->chan_params.sample_rate);
}
}
else
{
std::cout << TIME() << " silence, thresh: " << This->playback_buf_thresh << ", buf_size: " << This->playback_buf.read_available() << std::endl;
for(int i = 0; i < This->num_channels; i++)
{
memset(ioData->mBuffers[i].mData, 0, inNumberFrames * sizeof(Float32));
playback_sample_log.AddData(ioData->mBuffers[i].mData, inNumberFrames, This->chan_params.sample_rate);
}
}
playback_cb_dur_log.StopAndCaptureTime();
return noErr;
}

Your logging mechanism might be interfering with the real-time thread. Any call that can take a lock or manage memory (such as string creation or stdout file IO) can cause dropouts and other failures in Audio Unit callbacks.
If that's the case, you might try stuffing your time stamps in a lock-free circular logging FIFO, and doing any file IO in another thread.
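A minimal sketch of such a logging FIFO, assuming exactly one producer (the render callback) and one consumer (a logging thread). SpscLog and LogEntry are hypothetical names. A ready-made single-producer/single-consumer queue such as the one already used for playback_buf could serve the same purpose.

```cpp
#include <array>
#include <atomic>
#include <cstddef>

// Hypothetical fixed-size log record: plain data only, no strings.
struct LogEntry {
    double timeMs;
    int bufAvail;
};

// Single-producer/single-consumer ring buffer holding up to N-1 entries.
template <std::size_t N>
class SpscLog {
public:
    // Called from the render callback: no locks, no allocation, no I/O.
    // Returns false (and drops the entry) when the buffer is full.
    bool push(const LogEntry& e) {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        const std::size_t next = (head + 1) % N;
        if (next == tail_.load(std::memory_order_acquire)) {
            return false;
        }
        slots_[head] = e;
        head_.store(next, std::memory_order_release);
        return true;
    }

    // Called from the logging thread, which is free to use std::cout or files.
    // Returns false when there is nothing to read.
    bool pop(LogEntry& out) {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        if (tail == head_.load(std::memory_order_acquire)) {
            return false;
        }
        out = slots_[tail];
        tail_.store((tail + 1) % N, std::memory_order_release);
        return true;
    }

private:
    std::array<LogEntry, N> slots_{};
    std::atomic<std::size_t> head_{0};  // next slot to write
    std::atomic<std::size_t> tail_{0};  // next slot to read
};
```

In OutputProc, each std::cout line would become a push of a small record, and a separate thread would pop the records and format them. Dropping entries when the buffer is full is a deliberate choice here so that the callback never blocks.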

The condition variable is not working, but after adding std::cout it works

My project consists of two threads: the main thread and another thread that handles the content of another window. When the main thread wants to ask the other window to update itself, it calls the draw function, which is as follows:
void SubApplicationManager::draw() {
// Zero number of applications which has finished the draw counter
{
boost::lock_guard<boost::mutex> lock(SubApplication::draw_mutex);
SubApplication::num_draws = 0;
}
// Draw the sub applications.
for (size_t i = 0; i < m_subApplications.size(); i++)
m_subApplications[i].signal_draw();
// Wait until all the sub applications finish drawing.
while (true){
boost::lock_guard<boost::mutex> lock(SubApplication::draw_mutex);
std::cout << SubApplication::num_draws << std::endl;
if (SubApplication::num_draws >= m_subApplications.size()) break;
}
}
The signal_draw function just signals the other thread that a new task has been received.
void SubApplication::signal_draw() {
task = TASK::TASK_DRAW;
{
boost::lock_guard<boost::mutex> lock(task_received_mutex);
task_received = true;
}
task_start_condition.notify_all();
}
The body of other thread is as follows. It waits for the task to arrive and then start to process:
void SubApplication::thread() {
clock_t start_time, last_update;
start_time = last_update = clock();
//! Creates the Sub Application
init();
while (!done) // Loop That Runs While done=FALSE
{
// Draw The Scene. Watch For ESC Key And Quit Messages From DrawGLScene()
if (active) // Program Active?
{
// Wait here, until a update/draw command is received.
boost::unique_lock<boost::mutex> start_lock(task_start_mutex);
while (!task_received){
task_start_condition.wait(start_lock);
}
// Task received is set to false, for next loop.
{
boost::lock_guard<boost::mutex> lock(task_received_mutex);
task_received = false;
}
clock_t frame_start_time = clock();
switch (task){
case TASK_UPDATE:
update();
break;
case TASK_DRAW:
draw();
swapBuffers();
break;
case TASK_CREATE:
create();
break;
default:
break;
}
clock_t frame_end_time = clock();
double task_time = static_cast<float>(frame_end_time - frame_start_time) / CLOCKS_PER_SEC;
}
}
}
The problem is that if I run the code as it is, it never runs the other thread with task = TASK::TASK_DRAW; but if I add a std::cout << "Draw\n"; to the beginning of SubApplication::draw(), it works as it should. What is the reason for this, and what is the usual way to fix it?

boost::lock_guard<boost::mutex> lock(task_received_mutex);
task_received = true;
Okay, the task_received_mutex protects task_received.
boost::unique_lock<boost::mutex> start_lock(task_start_mutex);
while (!task_received){
task_start_condition.wait(start_lock);
}
Oops, we're reading task_received without holding the mutex that protects it. What prevents a race where one thread reads task_received while another thread is modifying it? This could immediately lead to deadlock.
Also, you have code that claims to "Wait until all the sub applications finish drawing" but there's no call to any wait function. So it actually spins rather than waiting, which is awful.
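A minimal sketch of waiting for the draw count with a condition variable instead of spinning. It uses the standard library rather than Boost, and DrawCounter is a hypothetical helper that is not part of the question's code.

```cpp
#include <condition_variable>
#include <cstddef>
#include <mutex>

// Hypothetical helper: counts finished draws and lets one thread block
// until a given number has been reached.
class DrawCounter {
public:
    // Called by the main thread before signalling the sub applications.
    void reset() {
        std::lock_guard<std::mutex> lock(m_);
        done_ = 0;
    }

    // Called by each sub application thread when its draw has finished.
    void markDone() {
        {
            std::lock_guard<std::mutex> lock(m_);
            ++done_;
        }
        cv_.notify_all();
    }

    // Blocks (without spinning) until at least `expected` draws are done.
    void waitFor(std::size_t expected) {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this, expected] { return done_ >= expected; });
    }

    std::size_t count() {
        std::lock_guard<std::mutex> lock(m_);
        return done_;
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    std::size_t done_ = 0;
};
```

In the question's draw(), the while (true) loop would become a single waitFor(m_subApplications.size()) call, so the main thread sleeps until the last sub application reports in.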

For a start, signal the task_start_condition under the task_start_mutex lock.
Second, consider locking that mutex during thread creation to avoid obvious races.
Third, it seems you have several mutexes named for "logical tasks" (draw, start). In reality, however, mutexes guard resources, not "logical tasks", so it's good practice to name them after the shared resource they guard. (In this case I get the impression that a single mutex could be enough, or even better, but we can't tell for sure from the code shown.)
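A minimal sketch of the single-mutex variant, again with the standard library instead of Boost. TaskSlot and the Task enum are hypothetical stand-ins for the question's task, task_received and task_start_condition. The flag is written, read and signalled under the same mutex.

```cpp
#include <condition_variable>
#include <mutex>

enum class Task { None, Update, Draw, Create };

// Hypothetical helper: one mutex guards both the task and the
// "received" flag, and the condition variable waits on that mutex.
class TaskSlot {
public:
    // Called by the main thread to hand over a task.
    void post(Task t) {
        std::lock_guard<std::mutex> lock(m_);
        task_ = t;
        received_ = true;
        cv_.notify_one();  // signalled while holding the lock
    }

    // Called by the worker thread: blocks until a task has been posted,
    // then clears the flag for the next round.
    Task take() {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return received_; });
        received_ = false;
        return task_;
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    bool received_ = false;
    Task task_ = Task::None;
};
```

Because post() and take() lock the same mutex, the worker can no longer check the flag between the main thread's write and its notification, which is the window in which a wake-up can be lost.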

Multithreaded not efficient: Debugging False Sharing?

I have the following code, which starts multiple threads (a thread pool) at the very beginning (startWorkers()). Later, at some point, I have a container full of myWorkObject instances, which I want to process using multiple worker threads simultaneously. The myWorkObject instances are completely isolated from one another in terms of memory usage. For now, let's assume myWorkObject has a method doWorkIntenseStuffHere() which takes some CPU time to calculate.
When benchmarking the following code, I noticed that it does not scale well with the number of threads, and the overhead for initializing/synchronizing the worker threads exceeds the benefit of multithreading unless there are 3-4 threads active. I've looked into this issue and read about the false-sharing problem, and I assume my code suffers from it. However, I'd like to debug/profile my code to see whether there is some kind of starvation/false sharing going on. How can I do this? Please feel free to criticize anything about my code, as I'm still learning a lot about memory/CPU and multithreading in particular.
#include <boost/thread.hpp>
#include <boost/bind.hpp>
#include <cassert>
#include <queue>
#include <vector>
class MultiThreadedFitnessProcessingStrategy
{
public:
MultiThreadedFitnessProcessingStrategy(unsigned int numWorkerThreads):
_startBarrier(numWorkerThreads + 1),
_endBarrier(numWorkerThreads + 1),
_started(false),
_shutdown(false),
_numWorkerThreads(numWorkerThreads)
{
assert(_numWorkerThreads > 0);
}
virtual ~MultiThreadedFitnessProcessingStrategy()
{
stopWorkers();
}
void startWorkers()
{
_shutdown = false;
_started = true;
for(unsigned int i = 0; i < _numWorkerThreads;i++)
{
boost::thread* workerThread = new boost::thread(
boost::bind(&MultiThreadedFitnessProcessingStrategy::workerTask, this,i)
);
_threadQueue.push_back(new std::queue<myWorkObject::ptr>());
_workerThreads.push_back(workerThread);
}
}
void stopWorkers()
{
_startBarrier.wait();
_shutdown = true;
_endBarrier.wait();
for(unsigned int i = 0; i < _numWorkerThreads;i++)
{
_workerThreads[i]->join();
}
}
void workerTask(unsigned int id)
{
//Wait until all worker threads have started.
while(true)
{
//Wait for any input to become available.
_startBarrier.wait();
bool queueEmpty = false;
std::queue<SomeClass::ptr >* myThreadq(_threadQueue[id]);
while(!queueEmpty)
{
SomeClass::ptr myWorkObject;
//Make sure queue is not empty,
//Caution: this is necessary if start barrier was triggered without queue input (e.g., shutdown) , which can happen.
//Do not try to be smart and refactor this without knowing what you are doing!
queueEmpty = myThreadq->empty();
if(!queueEmpty)
{
myWorkObject = myThreadq->front();
assert(myWorkObject);
myThreadq->pop();
}
if(myWorkObject)
{
myWorkObject->doWorkIntenseStuffHere();
}
}
//Wait until all worker threads have synchronized.
_endBarrier.wait();
if(_shutdown)
{
return;
}
}
}
void doWork(const myWorkObject::chromosome_container &refcontainer)
{
if(!_started)
{
startWorkers();
}
unsigned int j = 0;
for(myWorkObject::chromosome_container::const_iterator it = refcontainer.begin();
it != refcontainer.end();++it)
{
if(!(*it)->hasFitness())
{
assert(*it);
_threadQueue[j%_numWorkerThreads]->push(*it);
j++;
}
}
//Start Signal!
_startBarrier.wait();
//Wait for workers to be complete
_endBarrier.wait();
}
unsigned int getNumWorkerThreads() const
{
return _numWorkerThreads;
}
bool isStarted() const
{
return _started;
}
private:
boost::barrier _startBarrier;
boost::barrier _endBarrier;
bool _started;
bool _shutdown;
unsigned int _numWorkerThreads;
std::vector<boost::thread*> _workerThreads;
std::vector< std::queue<myWorkObject::ptr >* > _threadQueue;
};

Sampling-based profiling can give you a pretty good idea whether you're experiencing false sharing. Here's a previous thread that describes a few ways to approach the issue. I don't think that thread mentioned Linux's perf utility. It's a quick, easy and free way to count cache misses that might tell you what you need to know (am I experiencing a significant number of cache misses that correlates with how many times I'm accessing a particular variable?).
If you do find that your threading scheme might be causing a lot of conflict misses, you could try declaring your myWorkObject instances or the data contained within them that you're actually concerned about with __attribute__((aligned(64))) (alignment to 64 byte cache lines).
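A minimal sketch of the padding idea in standard C++, using alignas(64) in place of the compiler-specific attribute. It assumes 64-byte cache lines and C++17 (so that std::vector allocates over-aligned elements correctly). PaddedCounter and countInParallel are hypothetical names. The sketch illustrates the memory layout only and is not a benchmark.

```cpp
#include <cstddef>
#include <thread>
#include <vector>

// Each counter occupies its own 64-byte cache line (assumed line size),
// so threads updating neighbouring counters do not invalidate each other.
struct alignas(64) PaddedCounter {
    unsigned long value = 0;
};

// Every thread increments only its own counter; the results are summed
// after all threads have been joined.
unsigned long countInParallel(std::size_t numThreads, unsigned long itersPerThread) {
    std::vector<PaddedCounter> counters(numThreads);
    std::vector<std::thread> threads;
    threads.reserve(numThreads);
    for (std::size_t i = 0; i < numThreads; ++i) {
        threads.emplace_back([&counters, i, itersPerThread] {
            for (unsigned long n = 0; n < itersPerThread; ++n) {
                ++counters[i].value;
            }
        });
    }
    for (std::thread& t : threads) {
        t.join();
    }
    unsigned long total = 0;
    for (const PaddedCounter& c : counters) {
        total += c.value;
    }
    return total;
}
```

Comparing the cache-miss counts of this layout against one with plain unsigned long counters packed next to each other is one way to check whether padding makes a difference on a given machine.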

If you're on Linux, there is a tool called valgrind, with one of the modules doing cache effects simulation (cachegrind). Please take a look at
http://valgrind.org/docs/manual/cg-manual.html

Can Detaching A Thread Lead To A Torn Write

Good day,
I am new to threading, and I am wondering if I have something like (context of C++, and X threading library):
//Pseudo code...//
void OnThread() {
    someGlobalVar = 2;
    someGlobalVar += 4;
}
int main()
{
    ThreadHandle someThreadHandle = MakeThread( &OnThread );
    //Can a torn write occur?//
    someThreadHandle.Detach();
    //Can "someGlobalVar" be trusted?//
    std::cerr << someGlobalVar << "\n";
    return ( 0 );
}
Could someGlobalVar have a torn write applied to it, can it be considered "safe" after the detach?

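It is safe to detach a thread as long as your program is still running; the thread will keep running after you detach it. However, detaching does not wait for the thread, so main may read someGlobalVar before, during, or after the thread's writes, and the value cannot be trusted at that point. It would be safer to use a join, which blocks until the thread is done executing.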
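A minimal sketch of the join variant with std::thread, standing in for the unspecified "X threading library" of the question. runAndRead is a hypothetical wrapper used only to keep the example self-contained.

```cpp
#include <thread>

int someGlobalVar = 0;

void OnThread() {
    someGlobalVar = 2;
    someGlobalVar += 4;
}

// Starts the thread, waits for it to finish, then reads the result.
int runAndRead() {
    std::thread t(OnThread);
    t.join();              // everything OnThread wrote is visible after join() returns
    return someGlobalVar;  // no concurrent writer exists any more
}
```

If the thread really has to be detached, making the variable a std::atomic<int> would rule out torn reads and writes, but the reader could still observe 0, 2 or 6 depending on timing.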
