CPU consumption with Serial Port Thread - C++

I am writing a professional application and I have a problem with the serial port thread.
The problem is CPU consumption. When I add SerialCtrl.h (from the SerialCtrl project, http://www.codeproject.com/Articles/99375/CSerialIO-A-Useful-and-Simple-Serial-Communication ) to my project, CPU usage climbs to 100% or more; without it, usage is around 40%.
I use VS C++ 2012 Professional (ANSI, 32-bit, MFC, MT).
const unsigned short MAX_MESSAGE = 300;
SerialThread::SerialThread() :m_serialIO(NULL)
m_serialIO = NULL;
BOOL SerialThread::InitInstance()
return TRUE;
int SerialThread::Run()
// Check signal controlling and status to open serial communication.
if ((serialCtrl().GetPortStatus()==FALSE)&&m_serialIO->GetPortActivateValue()==TRUE)
else if (m_serialIO->GetPortActivateValue()==TRUE)
char message[MAX_MESSAGE]={0};
unsigned int lenBuff = MAX_MESSAGE;
unsigned long lenMessage;
if (m_serialIO->GetSendActivateValue()==TRUE)
unsigned long nWritten;
if (m_serialIO->m_bClosePort==TRUE)
if (serialCtrl().ClosePort()==TRUE)
return 0;
void SerialThread::ClosePort()
I guess that SerialThread::Run() is what causes the issue, but I have not found how to solve it
(even after using a performance profiler and other tools).
Do you have any ideas?
Thank you

I took a look at your code, and unfortunately the problem comes from the library/project you are using. Basically the all-in-one thread is just looping and never waiting anywhere, and this leads to 100% CPU consumption.
What you can do:
Add a Sleep() of 1 to 10 ms at the end of the inner while loop in the Run() method. This is the worst option: it only patches over the underlying problem.
Use another, better designed library.
Make your own library suited to your use.
Some advice for writing your own serial communication wrapper:
Everything you need to know about serial ports on Windows is here: Serial Communications.
An IO thread should always wait somewhere. It can be on a blocking IO call like ReadFile(), or on a Windows waitable object.
If you can, use overlapped IO, even if you don't use asynchronous calls. It will enable simultaneous read and write, and make the reads and writes cancellable (cleanly).
You only need a separate thread to read, and optionally another one to write via a message queue if you want a completely asynchronous library.
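The last two points can be sketched in portable C++11. This is only an illustration under stated assumptions: `WriterThread`, `post` and `stop` are hypothetical names, and a `std::condition_variable` stands in for the Windows waitable object (or blocking I/O call) that real serial code would wait on. The point is that the thread sleeps inside `wait()` while the queue is empty instead of spinning.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>

// Hypothetical writer thread: blocks while there is nothing to send.
class WriterThread {
public:
    // sink is whatever actually writes to the port (an assumption here).
    explicit WriterThread(std::function<void(const std::string&)> sink)
        : sink_(std::move(sink)) {
        assert(sink_ != nullptr);
        thread_ = std::thread([this] { run(); });
    }

    ~WriterThread() { stop(); }

    // Queue a message and wake the writer; never blocks on the port.
    void post(std::string msg) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(std::move(msg));
        }
        cv_.notify_one();
    }

    // Ask the thread to finish; queued messages are written first.
    void stop() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stop_ = true;
        }
        cv_.notify_one();
        if (thread_.joinable()) thread_.join();
    }

private:
    void run() {
        std::unique_lock<std::mutex> lock(mutex_);
        for (;;) {
            // The thread waits here and uses no CPU while idle.
            cv_.wait(lock, [this] { return stop_ || !queue_.empty(); });
            if (queue_.empty()) return;  // stop requested, nothing left
            std::string msg = std::move(queue_.front());
            queue_.pop();
            lock.unlock();               // do not hold the lock during I/O
            sink_(msg);
            lock.lock();
        }
    }

    std::function<void(const std::string&)> sink_;
    std::mutex mutex_;
    std::condition_variable cv_;
    std::queue<std::string> queue_;
    bool stop_ = false;
    std::thread thread_;
};
```

With a sink that forwards to the port's write call, `post()` returns immediately and the writer consumes CPU only while there is something to send.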


Runtime issues on ARM but runs fine on x86

I recently created an application that I developed on my x86 Linux machine. Basically it's just two threads that communicate over a pipe() with each other. Thread 0 listens on the read end and Thread 1 writes into that pipe. That program worked perfectly fine.
But when I copied the sources over to a Raspberry Pi and built them, there were some runtime issues (although everything compiled without errors). It seems that thread 0 never gets anything out of the pipe; it just blocks.
Since pipes are made for interprocess communication, I thought they would also be thread safe (especially since there are two different file descriptors for the read and write ends).
BUT: stepping through the program in the Qt Creator debugger on the RPi, everything seemed to work fine! I know that a debugger initializing certain variables differently can lead to such behavior, but I couldn't find any use of uninitialized variables etc. in my code.
thread 1:
void *midiThread(void *fds)
midiDevice = ((int*)fds)[0]; // device file for midi input
midiBuffer = ((int*)fds)[1]; // write end of the pipe
unsigned char rawBuffer[MIDI_MSG_LENGTH];
while (read(midiDevice, rawBuffer, MIDI_MSG_LENGTH)
struct midievent_t currentEvent;
unsigned char *rawBuffer = (unsigned char *)buffer;
currentEvent.channel = rawBuffer[0] & 0x0f;
// ....
write(midiBuffer, &currentEvent, sizeof(struct midievent_t));
return NULL;
main thread:
void MidiInput::createMidiThread()
if (pipe(_midiBufferPipe) < 0)
// error
int fds[2];
fds[0] = _midiFileDescriptor;
fds[1] = _midiBufferPipe[1];
pthread_create(&_midiThreadId, NULL,
midiThread, fds);
bool MidiInput::read(midievent_t *event)
if (!_initialized)
return false;
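if (::read(_midiBufferPipe[0], event, sizeof(struct midievent_t)) // ::read selects the system call, not MidiInput::read
!= (ssize_t)sizeof(struct midievent_t)) // a signed comparison also catches the -1 error return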
// some error
return _initialized = false;
return true;
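One thing the snippets above suggest, assuming `fds` really is a local variable of `createMidiThread()`: the array stops existing when `createMidiThread()` returns, but the new thread may only read `((int*)fds)[0]` and `((int*)fds)[1]` some time later. Reading it then is undefined behavior, which can appear to work on one machine or under a debugger and fail elsewhere. Below is a minimal sketch of one way to avoid this by handing the thread a heap-allocated argument block that it owns. `MidiThreadArgs`, `midiThreadSketch` and `startMidiThread` are hypothetical names, and the loop just forwards bytes instead of parsing MIDI events.

```cpp
#include <cassert>
#include <pthread.h>
#include <unistd.h>

// Hypothetical argument block that outlives the function creating the thread.
struct MidiThreadArgs {
    int deviceFd;     // descriptor to read raw input from
    int pipeWriteFd;  // write end of the pipe
};

void* midiThreadSketch(void* raw) {
    // Take ownership of the heap-allocated arguments and copy them out.
    MidiThreadArgs* args = static_cast<MidiThreadArgs*>(raw);
    const int in = args->deviceFd;
    const int out = args->pipeWriteFd;
    delete args;

    // Stand-in for the real parsing loop: forward bytes one at a time.
    unsigned char byte;
    while (::read(in, &byte, 1) == 1) {
        if (::write(out, &byte, 1) != 1) break;
    }
    return nullptr;
}

bool startMidiThread(pthread_t* tid, int deviceFd, int pipeWriteFd) {
    assert(tid != nullptr);
    MidiThreadArgs* args = new MidiThreadArgs{deviceFd, pipeWriteFd};
    if (pthread_create(tid, nullptr, midiThreadSketch, args) != 0) {
        delete args;  // the thread never started, so we still own the block
        return false;
    }
    return true;
}
```

Storing the two descriptors in members of the object, instead of a local array, would have the same effect as long as the object outlives the thread.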

libuv - Limiting callback rate of idle event without blocking thread without multithreading

I'm using libsourcey which uses libuv as its underlying I/O networking layer.
Everything is set up and seems to run (I haven't tested anything yet, since I'm only prototyping and experimenting). However, besides the application loop (the one that comes with libsourcey, which relies on libuv's loop), I also need an "idle function" to be called. As it is now, the idle callback is called on every loop iteration, which is very CPU-intensive. I need a way to limit the call rate of the uv_idle_cb without blocking the calling thread, which is the same thread the application uses to process I/O data (I'm not sure about this last statement; correct me if I'm mistaken).
The idle function will manage several different aspects of the application, and it needs to run only x times per second. Also, everything needs to run on the same thread (I'm planning to upgrade the network infrastructure of an older application that runs entirely single-threaded).
This is the code I have so far. It also includes the test I did with sleeping the thread inside the callback, but that blocks everything, so even the second idle callback I set up ends up with the same call rate as the first one.
struct TCPServers
CTCPManager<scy::net::SSLSocket> ssl;
int counter = 0;
void idle_cb(uv_idle_t *handle)
std::cout << "Idle callback " << counter << " TID " << std::this_thread::get_id() << "\n"; // std::thread::id cannot be passed to printf's %d; needs <iostream>
std::this_thread::sleep_for(std::chrono::milliseconds(1000 / 25));
int counter2 = 0;
void idle_cb2(uv_idle_t *handle)
std::cout << "Idle callback2 " << counter2 << " TID " << std::this_thread::get_id() << "\n"; // std::thread::id cannot be passed to printf's %d; needs <iostream>
std::this_thread::sleep_for(std::chrono::milliseconds(1000 / 50));
class CApplication : public scy::Application
CApplication() : scy::Application(), m_uvIdleCallback(nullptr), m_uvIdleCallback2(nullptr), m_bUseSSL(false) // both callbacks are tested in start(), so both must be initialized
void start()
if (m_uvIdleCallback)
uv_idle_start(&m_uvIdle, m_uvIdleCallback);
if (m_uvIdleCallback2)
uv_idle_start(&m_uvIdle2, m_uvIdleCallback2);
void stop()
if (m_bUseSSL)
void bindIdleEvent(uv_idle_cb cb)
m_uvIdleCallback = cb;
uv_idle_init(loop, &m_uvIdle);
void bindIdleEvent2(uv_idle_cb cb)
m_uvIdleCallback2 = cb;
uv_idle_init(loop, &m_uvIdle2);
void initSSL(const std::string& privateKeyFile = "", const std::string& certificateFile = "")
scy::net::SSLManager::instance().initNoVerifyServer(privateKeyFile, certificateFile);
m_bUseSSL = true;
uv_idle_t m_uvIdle;
uv_idle_t m_uvIdle2;
uv_idle_cb m_uvIdleCallback;
uv_idle_cb m_uvIdleCallback2;
bool m_bUseSSL;
int main()
CApplication app;
TCPServers srvs;
srvs.ssl.start("", 9000);
app.waitForShutdown([&](void*) {
return 0;
Thanks in advance if anyone can help out.
Solved the problem by using uv_timer_t and uv_timer_cb (I hadn't dug into libuv's docs yet). CPU usage went down drastically and nothing gets blocked.
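Why a repeating timer helps where sleeping inside the idle callback did not can be shown with a small model. The sketch below is not libuv code: `PeriodicTimers` and its methods are hypothetical names, and the current time is passed in explicitly in milliseconds. Each callback keeps its own deadline, and the loop is told how long it may block until the earliest one, so a 25 Hz callback and a 50 Hz callback run at their own rates on a single thread. In libuv itself, the corresponding calls would be `uv_timer_init()` and `uv_timer_start()` with a repeat interval.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <limits>
#include <utility>
#include <vector>

// Hypothetical single-threaded model of repeating timers.
class PeriodicTimers {
public:
    // Register cb to fire every period_ms, first at now_ms + period_ms.
    void add(std::uint64_t now_ms, std::uint64_t period_ms,
             std::function<void()> cb) {
        assert(period_ms > 0);
        timers_.push_back(Timer{now_ms + period_ms, period_ms, std::move(cb)});
    }

    // Fire every timer whose deadline has passed and re-arm it.
    // A timer that fell behind fires once per missed period.
    void runDue(std::uint64_t now_ms) {
        for (Timer& t : timers_) {
            while (t.due_ms <= now_ms) {
                t.cb();
                t.due_ms += t.period_ms;
            }
        }
    }

    // How long the loop may block (for example in its I/O poll)
    // before the next timer becomes due.
    std::uint64_t nextTimeout(std::uint64_t now_ms) const {
        std::uint64_t best = std::numeric_limits<std::uint64_t>::max();
        for (const Timer& t : timers_) {
            const std::uint64_t wait = t.due_ms > now_ms ? t.due_ms - now_ms : 0;
            if (wait < best) best = wait;
        }
        return best;
    }

private:
    struct Timer {
        std::uint64_t due_ms;
        std::uint64_t period_ms;
        std::function<void()> cb;
    };
    std::vector<Timer> timers_;
};
```

Because the loop blocks for `nextTimeout()` rather than spinning, it uses almost no CPU between deadlines, and no callback delays another by sleeping.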

Multithreaded not efficient: Debugging False Sharing?

I have the following code, which starts multiple threads (a thread pool) at the very beginning (startWorkers()). Subsequently, at some point I have a container full of myWorkObject instances, which I want to process using multiple worker threads simultaneously. The myWorkObject instances are completely isolated from one another in terms of memory usage. For now, let's assume myWorkObject has a method doWorkIntenseStuffHere() which takes some CPU time to calculate.
When benchmarking the following code, I noticed that it does not scale well with the number of threads, and that the overhead for initializing/synchronizing the worker threads exceeds the benefit of multithreading unless there are 3-4 threads active. I've looked into this issue and read about the false-sharing problem, and I assume my code suffers from it. However, I'd like to debug/profile my code to see whether there is some kind of starvation/false sharing going on. How can I do this? Please feel free to criticize anything about my code, as I'm still learning a lot about memory/CPU and multithreading in particular.
#include <boost/thread.hpp>
class MultiThreadedFitnessProcessingStrategy
MultiThreadedFitnessProcessingStrategy(unsigned int numWorkerThreads):
_startBarrier(numWorkerThreads + 1),
_endBarrier(numWorkerThreads + 1),
assert(_numWorkerThreads > 0);
virtual ~MultiThreadedFitnessProcessingStrategy()
void startWorkers()
_shutdown = false;
_started = true;
for(unsigned int i = 0; i < _numWorkerThreads;i++)
boost::thread* workerThread = new boost::thread(
boost::bind(&MultiThreadedFitnessProcessingStrategy::workerTask, this,i)
_threadQueue.push_back(new std::queue<myWorkObject::ptr>());
void stopWorkers()
_shutdown = true;
for(unsigned int i = 0; i < _numWorkerThreads;i++)
void workerTask(unsigned int id)
//Wait until all worker threads have started.
//Wait for any input to become available.
bool queueEmpty = false;
std::queue<SomeClass::ptr >* myThreadq(_threadQueue[id]);
SomeClass::ptr myWorkObject;
//Make sure queue is not empty,
//Caution: this is necessary if start barrier was triggered without queue input (e.g., shutdown) , which can happen.
//Do not try to be smart and refactor this without knowing what you are doing!
queueEmpty = myThreadq->empty();
chromosome = myThreadq->front();
//Wait until all worker threads have synchronized.
void doWork(const myWorkObject::chromosome_container &refcontainer)
unsigned int j = 0;
for(myWorkObject::chromosome_container::const_iterator it = refcontainer.begin();
it != refcontainer.end();++it)
//Start Signal!
//Wait for workers to be complete
unsigned int getNumWorkerThreads() const
return _numWorkerThreads;
bool isStarted() const
return _started;
boost::barrier _startBarrier;
boost::barrier _endBarrier;
bool _started;
bool _shutdown;
unsigned int _numWorkerThreads;
std::vector<boost::thread*> _workerThreads;
std::vector< std::queue<myWorkObject::ptr >* > _threadQueue;
Sampling-based profiling can give you a pretty good idea whether you're experiencing false sharing. Here's a previous thread that describes a few ways to approach the issue. I don't think that thread mentioned Linux's perf utility. It's a quick, easy and free way to count cache misses that might tell you what you need to know (am I experiencing a significant number of cache misses that correlates with how many times I'm accessing a particular variable?).
If you do find that your threading scheme might be causing a lot of cache misses of this kind, you could try declaring your myWorkObject instances, or the data within them that you're actually concerned about, with __attribute__((aligned(64))) (alignment to 64-byte cache lines), so that data written by different threads does not end up on the same cache line.
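As a rough illustration of that alignment idea, here is a minimal sketch using the standard C++11 `alignas` specifier in place of the GCC attribute. It assumes a 64-byte cache line, and `PaddedCounter`, `kMaxThreads` and `countInParallel` are hypothetical names that are not part of the code in the question. Each thread writes only to its own slot, and each slot occupies its own cache line.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>

constexpr std::size_t kCacheLine = 64;  // assumption: 64-byte cache lines
constexpr std::size_t kMaxThreads = 8;  // hypothetical upper bound

// Each counter is aligned (and therefore padded) to a full cache line,
// so neighbouring array elements never share one.
struct alignas(kCacheLine) PaddedCounter {
    std::uint64_t value;
};
static_assert(sizeof(PaddedCounter) == kCacheLine,
              "one counter per cache line");

// Every thread increments only its own slot; the slots are summed at the end.
std::uint64_t countInParallel(std::size_t numThreads, std::uint64_t iterations) {
    assert(numThreads > 0 && numThreads <= kMaxThreads);
    std::array<PaddedCounter, kMaxThreads> counters{};  // zero-initialized
    std::array<std::thread, kMaxThreads> threads;

    for (std::size_t i = 0; i < numThreads; ++i) {
        threads[i] = std::thread([&counters, i, iterations] {
            for (std::uint64_t k = 0; k < iterations; ++k) {
                counters[i].value += 1;
            }
        });
    }

    std::uint64_t total = 0;
    for (std::size_t i = 0; i < numThreads; ++i) {
        threads[i].join();
        total += counters[i].value;
    }
    return total;
}
```

Timing this against a version that uses a plain `std::uint64_t` array is one way to see whether false sharing matters on a given machine; the result depends on the hardware and on compiler optimizations.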
If you're on Linux, there is a tool called valgrind, with one of the modules doing cache effects simulation (cachegrind). Please take a look at

Multiple threads writing to same socket causing issues

I have written a client/server application where the server spawns multiple threads depending upon the request from client.
These threads are expected to send some data (a string) to the client.
The problem is that the data gets overwritten on the client side. How do I tackle this issue?
I have already read some other threads on similar issues but was unable to find an exact solution.
Here is my client code to receive data.
char buff[MAX_BUFF];
int bytes_read = read(sd,buff,MAX_BUFF);
if(bytes_read == 0)
else if(bytes_read > 0)
Server Thread code :
void send_data(int sd,char *data)
void *calcWordCount(void *arg)
tdata *tmp = (tdata *)arg;
string line = tmp->line;
string s = tmp->arg;
int sd = tmp->sd_c;
int line_no = tmp->line_no;
int startpos = 0;
int finds = 0;
while ((startpos = line.find(s, startpos)) != std::string::npos)
int t=wcount[s];
char buff[MAX_BUFF];
sprintf(buff+strlen(buff),"%s"," occurred ");
sprintf(buff+strlen(buff),"%s"," times on line ");
delete (tdata*)arg;
On the server side, make sure the shared resource (the socket, along with its associated internal buffer) is protected against concurrent access.
Define and implement an application-level protocol that the server uses so that the client can distinguish what the different threads sent.
As an additional note: one cannot rely on read()/write() reading or writing as many bytes as they were asked to. It is essential to check their return values to learn how many bytes were actually read or written, and to loop around them until all the data that was meant to be transferred has been transferred.
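The protocol and looping points above could look roughly like the following sketch. It assumes POSIX descriptors and a very simple framing scheme (a 4-byte length in network byte order followed by the payload). `writeAll`, `readAll`, `sendFrame`, `recvFrame` and `kMaxFrame` are hypothetical names chosen for illustration, not part of the code in the question.

```cpp
#include <arpa/inet.h>
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unistd.h>

// Write exactly len bytes, looping over short writes.
bool writeAll(int fd, const void* data, std::size_t len) {
    const char* p = static_cast<const char*>(data);
    while (len > 0) {
        const ssize_t n = ::write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR) continue;  // interrupted, try again
            return false;
        }
        p += n;
        len -= static_cast<std::size_t>(n);
    }
    return true;
}

// Read exactly len bytes; fails on error or if the peer closes early.
bool readAll(int fd, void* data, std::size_t len) {
    char* p = static_cast<char*>(data);
    while (len > 0) {
        const ssize_t n = ::read(fd, p, len);
        if (n == 0) return false;          // end of stream
        if (n < 0) {
            if (errno == EINTR) continue;
            return false;
        }
        p += n;
        len -= static_cast<std::size_t>(n);
    }
    return true;
}

constexpr std::uint32_t kMaxFrame = 1u << 20;  // hypothetical sanity limit

// One message = 4-byte big-endian length + payload.
bool sendFrame(int fd, const std::string& payload) {
    assert(payload.size() <= kMaxFrame);
    const std::uint32_t netLen = htonl(static_cast<std::uint32_t>(payload.size()));
    return writeAll(fd, &netLen, sizeof netLen) &&
           writeAll(fd, payload.data(), payload.size());
}

bool recvFrame(int fd, std::string& out) {
    std::uint32_t netLen = 0;
    if (!readAll(fd, &netLen, sizeof netLen)) return false;
    const std::uint32_t len = ntohl(netLen);
    if (len > kMaxFrame) return false;     // refuse implausible lengths
    out.resize(len);
    return len == 0 || readAll(fd, &out[0], len);
}
```

If several threads call `sendFrame()` on the same descriptor, the two writes that make up one frame still have to be serialized, for example by holding a mutex for the duration of the call.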
You should protect your socket with a mutex.
When a thread uses the socket, it should lock the mutex so that no other thread can write to the socket at the same time.
Some mutex example.
I can't help you more without the server code, because the problem is probably in the server.
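A minimal sketch of the mutex idea, assuming C++11 and a POSIX descriptor: `LockedSender` and `sendFromThreads` are hypothetical names, and the server in the question may be structured differently. The lock is held for the whole message, including any retries after short writes, so output from different threads cannot interleave.

```cpp
#include <atomic>
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <mutex>
#include <string>
#include <thread>
#include <unistd.h>
#include <vector>

// Hypothetical wrapper: one mutex per shared descriptor.
class LockedSender {
public:
    explicit LockedSender(int fd) : fd_(fd) { assert(fd >= 0); }

    // Sends the whole message while holding the lock.
    bool send(const std::string& msg) {
        std::lock_guard<std::mutex> lock(mutex_);
        const char* p = msg.data();
        std::size_t left = msg.size();
        while (left > 0) {
            const ssize_t n = ::write(fd_, p, left);
            if (n < 0) {
                if (errno == EINTR) continue;  // interrupted, retry
                return false;
            }
            p += n;
            left -= static_cast<std::size_t>(n);
        }
        return true;
    }

private:
    int fd_;
    std::mutex mutex_;
};

// Usage sketch: each thread sends perThread lines made of its own letter.
bool sendFromThreads(LockedSender& sender, int numThreads, int perThread) {
    std::atomic<bool> ok{true};
    std::vector<std::thread> workers;
    for (int t = 0; t < numThreads; ++t) {
        workers.emplace_back([&sender, &ok, t, perThread] {
            const std::string line = std::string(7, static_cast<char>('A' + t)) + "\n";
            for (int i = 0; i < perThread; ++i) {
                if (!sender.send(line)) ok = false;
            }
        });
    }
    for (std::thread& w : workers) w.join();
    return ok;
}
```

The same object has to be shared by all threads that write to that socket; a separate mutex per thread would protect nothing.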

Boost Thread Hanging on _endthreadex

I think I am making a simple mistake, but since I noticed there are many boost experts here, I thought I would ask for help.
I am trying to use Boost threads (1_40) on Windows XP. The main program loads a DLL and starts the thread like so (note that this is not in a class; the static does not mean static to a class but private to the file).
static boost::thread network_thread;
static bool quit = false;
HANDLE quitEvent;
//some code omitted for clarity, ask if you think it would help
void network_start()
HANDLE *waitHandles = (HANDLE*)malloc(3 * sizeof(HANDLE));
waitHandles[0] = quitEvent;
waitHandles[1] = recvEvent;
waitHandles[2] = pendingEvent;
do {
//read network stuff, or quit event
dwEvents =WaitForMultipleObjects(3, waitHandles, FALSE, timeout);
} while (!quit);
network_thread = boost::thread(boost::bind<void>(network_start));
//signal quit (which works)
quit = true;
//the following code is slightly verbose because I'm trying to figure out what's wrong
try {
if (network_thread.joinable() ) {
} else {
TRACE("Too late!");
} catch (boost::thread_interrupted&) {
The problem is that the main thread is hanging on the join, and the network thread is hanging at the end of _endthreadex. What am I misunderstanding?
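You are not supposed to create or end threads in InitInstance/ExitInstance of a DLL. These are called from DllMain while the loader lock is held, and a thread that is exiting needs that same lock to deliver its DLL_THREAD_DETACH notifications, so join() ends up waiting for a thread that can never finish exiting.
See http://support.microsoft.com/default.aspx?scid=kb;EN-US;142243 for more info. Also, see http://msdn.microsoft.com/en-us/library/ms682583%28VS.85%29.aspx about DllMain in general.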
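One common way to follow this advice is to give the DLL explicit start and stop functions that the host calls after loading the library and before unloading it, so that no thread is created or joined from inside DllMain. The sketch below is a portable C++11 illustration under stated assumptions: `network_init`, `network_shutdown` and `network_running` are hypothetical entry points, `std::thread` replaces `boost::thread`, and a condition variable with a timeout stands in for the `WaitForMultipleObjects()` call in the question.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

namespace {
std::thread g_networkThread;
std::mutex g_mutex;
std::condition_variable g_quitCv;
bool g_quit = false;  // guarded by g_mutex, unlike a plain static bool

void networkLoop() {
    std::unique_lock<std::mutex> lock(g_mutex);
    while (!g_quit) {
        // Stand-in for waiting on the quit/recv/pending events with a timeout.
        g_quitCv.wait_for(lock, std::chrono::milliseconds(50));
        // ... handle network work here ...
    }
}
}  // namespace

// Hypothetical entry point: call after the library has finished loading.
bool network_init() {
    if (g_networkThread.joinable()) return false;  // already running
    {
        std::lock_guard<std::mutex> lock(g_mutex);
        g_quit = false;
    }
    g_networkThread = std::thread(networkLoop);
    return true;
}

bool network_running() { return g_networkThread.joinable(); }

// Hypothetical entry point: call before the library is unloaded,
// from ordinary code rather than from DllMain/ExitInstance.
void network_shutdown() {
    {
        std::lock_guard<std::mutex> lock(g_mutex);
        g_quit = true;
    }
    g_quitCv.notify_all();
    if (g_networkThread.joinable()) g_networkThread.join();
}
```

Both functions are meant to be called from the same host thread; the join happens in normal code, where no loader lock is held.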