CSocket::OnReceive called concurrently - c++

I have a strange problem.
void MySocket::OnReceive( int nErrorCode )
{
static CMutex mutex;
static int depth=0;
static int counter=0;
CSingleLock lock(&mutex, true);
Receive(pBuff, iBuffSize-1);
counter++;
depth++; //<-- Breakpoint
log("Onreceive: enter %d %d %d", GetCurrentThreadId(), counter, depth);
.....
Code handling data
depth--;
log("Onreceive: exit %d %d %d", GetCurrentThreadId(), counter, depth);
}
This results in the following log output:
02/19/2014 08:33:14:982 [DEBUG] Onreceive Enter: 3200 1 2
02/19/2014 08:34:13:726 [DEBUG] Onreceive Exit : 3200 2 1
02/19/2014 08:32:34:193 [DEBUG] Onreceive Enter: 3200 0 1 <- Log statement was created but interrupted before it was written to disk
02/19/2014 08:34:13:736 [DEBUG] Onreceive Exit : 3200 2 0
Here is what happens:
I start the program and the debugger stops at the breakpoint.
I step into the log call.
Somewhere inside the log call the debugger jumps back to the breakpoint.
This is the second entry into OnReceive.
The second call completes.
The first call continues.
My questions:
How is it possible to get two concurrent calls to OnReceive?
Why does the mutex not work (because both calls have the same thread ID)?
How can I have two execution paths with the same thread ID?
And of course, how can I fix this?
Note that this only happens if I send a lot of small messages (<50 bytes) until Send blocks. In total it's around 500 KB/s. If I put a Sleep(1) after each send it doesn't happen, but that of course kills my transfer speed.

OK, I found the root cause. The log function uses a Win32 mutex and waits on it like this:
DWORD dwResult = MsgWaitForMultipleObjects(nNoOfHandle, handle, FALSE, dwTimeout, QS_POSTMESSAGE|QS_ALLPOSTMESSAGE|QS_SENDMESSAGE|QS_TIMER);
if (dwResult == WAIT_OBJECT_0 + nNoOfHandle) // new message is in the queue, let's clear
{
MSG Msg;
while (PeekMessage(&Msg, NULL, 0, 0, PM_REMOVE))
{
::TranslateMessage(&Msg);
::DispatchMessage(&Msg);
}
}
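This waits for the mutex to be released OR for a message to arrive in the thread's queue. CSocket posts a message to the thread when it receives data, and dispatching that message calls OnReceive. So while waiting for the mutex, the log code handled incoming messages and effectively called OnReceive again. Because the nested call runs on the same thread, which already owns the mutex in OnReceive, that lock is granted again immediately instead of blocking.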
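The effect can be reproduced without MFC. The following is a minimal sketch in standard C++, not the original code: `MessageQueue`, `ReentrancyDemo` and `runDemo` are hypothetical names, and `std::recursive_mutex` stands in for the Win32 mutex, which the owning thread can also acquire repeatedly.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <mutex>
#include <queue>
#include <utility>

// Hypothetical stand-in for the thread's message queue (not MFC code).
struct MessageQueue {
    std::queue<std::function<void()>> pending;

    // Dispatch everything queued so far, like the PeekMessage/DispatchMessage loop.
    void pump() {
        while (!pending.empty()) {
            auto handler = std::move(pending.front());
            pending.pop();
            handler();
        }
    }
};

struct ReentrancyDemo {
    MessageQueue queue;
    std::recursive_mutex mutex;  // the owning thread may lock it again
    int depth = 0;
    int maxDepth = 0;

    // Stand-in for the logger whose wait also dispatches messages.
    void log() { queue.pump(); }

    void onReceive() {
        std::lock_guard<std::recursive_mutex> lock(mutex);  // succeeds again on the same thread
        ++depth;
        maxDepth = std::max(maxDepth, depth);
        log();  // may re-enter onReceive() before this call returns
        --depth;
    }
};

// Returns the deepest nesting seen when a second notification is already queued.
int runDemo() {
    ReentrancyDemo demo;
    demo.queue.pending.push([&demo] { demo.onReceive(); });  // second notification, already queued
    demo.onReceive();                                        // first notification
    assert(demo.depth == 0);  // every entry was matched by an exit
    return demo.maxDepth;
}
```

In this sketch `runDemo()` reports a nesting depth of 2 on a single thread, which matches the depth values in the log above: the calls are nested, not concurrent.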
One way to solve this is to stop CSocket from posting further read notifications while the handler runs:
void MySocket::OnReceive(int nErrorCode)
{
/* Remove FD_READ notifications */
VERIFY(AsyncSelect(FD_WRITE | FD_OOB | FD_ACCEPT | FD_CONNECT | FD_CLOSE));
OldOnReceive(nErrorCode);
/* Restore default notifications */
VERIFY(AsyncSelect());
}
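A different technique, not taken from the answer above, is to leave the notifications enabled and guard the handler against re-entry. The sketch below is standard C++ under the assumption that all calls arrive on one thread, as in the question; `ReceiveGuard` and `runGuardDemo` are hypothetical names, not MFC API.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>

// Hypothetical single-threaded re-entrancy guard.
class ReceiveGuard {
public:
    // Runs body() unless a call is already active. A nested call only records
    // that more work is pending, and the outer call then runs body() once more.
    template <typename Body>
    void run(Body body) {
        if (active_) {
            pending_ = true;
            return;
        }
        active_ = true;
        do {
            pending_ = false;
            body();
        } while (pending_);
        active_ = false;
    }

    bool active() const { return active_; }

private:
    bool active_ = false;
    bool pending_ = false;
};

struct GuardDemoResult {
    int passes;    // how often the handler body ran
    int maxDepth;  // deepest nesting of the handler body
};

// Simulates a handler whose first pass triggers one nested notification.
GuardDemoResult runGuardDemo() {
    ReceiveGuard guard;
    GuardDemoResult result{0, 0};
    int depth = 0;
    std::function<void()> onReceive = [&] {
        guard.run([&] {
            ++depth;
            result.maxDepth = std::max(result.maxDepth, depth);
            ++result.passes;
            if (result.passes == 1) onReceive();  // nested notification arrives mid-handler
            --depth;
        });
    };
    onReceive();
    assert(!guard.active());  // the guard is released once the outer call returns
    return result;
}
```

The nested notification is not lost: the body runs a second time after the first pass finishes, but never nested inside it. This sketch does not reset the guard if the body throws; a real handler would need to cover that case.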


Windows Named Semaphore Not Getting Locked

I am developing a C++ class that calls the Windows API.
I am using semaphores for a task. Let's say I have two processes:
ProcessA has two semaphores:
Global\processA_receiving_semaphore
Global\processA_waiting_semaphore
ProcessB has two semaphores:
Global\processB_receiving_semaphore
Global\processB_waiting_semaphore
I have two threads in each process:
Sending thread in processA:
Wait on "Global\processB_waiting_semaphore"
// do something
Signal "Global\processB_receiving_semaphore"
Receiving thread on processB:
Wait on "Global\processB_receiving_semaphore"
// do something
Signal "Global\processB_waiting_semaphore"
I removed ALL code that releases "Global\processB_waiting_semaphore", but it can still be acquired. Calling WaitForSingleObject on that semaphore always returns immediately with what looks like a successful wait. I tried setting the timeout period to 0 and it still acquires the semaphore while NOTHING is releasing it.
The receiving semaphore has initial count = 0 and max count = 1 while the waiting semaphore has initial count = 1 and max count = 1.
Calling WaitForSingleObject on the receiving semaphore works great and blocks until it is released by the other process. The problem is with the waiting semaphore and I cannot figure out why. The code is very big and I have made sure the names of the semaphores are set correctly.
Is this a common issue? If you need more explanation please comment and I will modify the post.
EDIT: CODE ADDED:
Receiver semaphores:
bool intr_process_comm::create_rcvr_semaphores()
{
std::cout << "\n Creating semaphore: " << "Global\\" << this_name << "_rcvr_sem";
rcvr_sem = CreateSemaphore(NULL, 0, 1, ("Global\\" + this_name + "_rcvr_sem").c_str());
std::cout << "\n Creating semaphore: " << "Global\\" << this_name << "_wait_sem";
wait_sem = CreateSemaphore(NULL, 1, 1, ("Global\\" + this_name + "_wait_sem").c_str());
return (rcvr_sem && wait_sem);
}
Sender semaphores:
// this sender connects to the wait semaphore in the target process
sndr_sem = OpenSemaphore(SEMAPHORE_MODIFY_STATE, FALSE, ("Global\\" + target_name + "_wait_sem").c_str());
// this target connects to the receiver semaphore in the target process
trgt_sem = OpenSemaphore(SEMAPHORE_MODIFY_STATE, FALSE, ("Global\\" + target_name + "_rcvr_sem").c_str());
DWORD intr_process_locking::wait(unsigned long period)
{
return WaitForSingleObject(sndr_sem, period);
}
void intr_process_locking::signal()
{
ReleaseSemaphore(trgt_sem, 1, 0);
}
Receiving thread function:
void intr_process_comm::rcvr_thread_proc()
{
while (conn_state == intr_process_comm::opened) {
try {
// wait on rcvr_semaphore for an infinite time
WaitForSingleObject(rcvr_sem, INFINITE);
if (inner_release) // if the semaphore was released within this process
return;
// once signaled by another process, get the message
std::string msg_str((LPCSTR)hmf_mapview);
// signal one of the waiters that want to put messages
// in this process's memory area
//
// this doesn't change ANYTHING in execution, commented or not..
//ReleaseSemaphore(wait_sem, 1, 0);
// put this message in this process's queue
Msg msg = Msg::from_xml(msg_str);
if (msg.command == "connection")
process_connection_message(msg);
in_messages.enQ(msg);
//std::cout << "\n Message: \n"<< msg << "\n";
}
catch (const std::exception& e) { // catch by const reference to avoid slicing
std::cout << "\n Ran into trouble getting the message. Details: " << e.what();
}
}
}
Sending thread function:
void intr_process_comm::sndr_thread_proc()
{
while (conn_state == intr_process_comm::opened ||
(conn_state == intr_process_comm::closing && out_messages.size() > 0)
) {
// pull a message out of the queue
Msg msg = out_messages.deQ();
if (connections.find(msg.destination) == connections.end())
connections[msg.destination].connect(msg.destination);
if (connections[msg.destination].connect(msg.destination)
!= intr_process_locking::state::opened) {
blocked_messages[msg.destination].push_back(msg);
continue;
}
// THIS ALWAYS GETS A WAIT_OBJECT_0 RESULT
DWORD wait_result = connections[msg.destination].wait(wait_timeout);
if (wait_result == WAIT_TIMEOUT) { // <---- THIS IS NEVER TRUE
out_messages.enQ(msg);
continue;
}
// do things here
// release the receiver semaphore in the other process
connections[msg.destination].signal();
}
}
To clarify some things:
trgt_sem in a sender is the rcvr_sem in the receiver.
'sndr_sem' in the sender is the 'wait_sem' in the receiver.
To call WaitForSingleObject on a handle, the handle must have the SYNCHRONIZE access right.
But you open the semaphore with SEMAPHORE_MODIFY_STATE access only. With that access you can call ReleaseSemaphore (which requires the SEMAPHORE_MODIFY_STATE access right), but a call to WaitForSingleObject fails and returns WAIT_FAILED; GetLastError() then returns ERROR_ACCESS_DENIED. Because your code only compares the result against WAIT_TIMEOUT, WAIT_FAILED is treated as a successful wait.
So to call both ReleaseSemaphore and any wait function on the same handle, the handle needs SEMAPHORE_MODIFY_STATE | SYNCHRONIZE access. Open it like this:
OpenSemaphore(SEMAPHORE_MODIFY_STATE|SYNCHRONIZE, )
And of course, always checking API return values and error codes can save a lot of time.
If you set the timeout to 0, WaitForSingleObject always returns immediately. A successful WaitForSingleObject returns WAIT_OBJECT_0, which happens to have the value 0; WFSO is not like most APIs, where success is indicated by a non-zero return.
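To make the point about return values concrete, here is a small sketch that classifies a wait result explicitly. It is an illustration, not code from the question: the `k...` constants are hypothetical names that mirror the documented values of WAIT_OBJECT_0, WAIT_TIMEOUT and WAIT_FAILED so that the snippet builds without <windows.h>; real code would use the Win32 macros directly.

```cpp
#include <cassert>
#include <cstdint>

// Mirrors of the documented Win32 values (assumption: used here only so the
// sketch is portable; include <windows.h> and use the real macros in practice).
constexpr std::uint32_t kWaitObject0 = 0x00000000;
constexpr std::uint32_t kWaitTimeout = 0x00000102;
constexpr std::uint32_t kWaitFailed  = 0xFFFFFFFF;

enum class WaitOutcome { Acquired, TimedOut, Failed, Unexpected };

// Maps a WaitForSingleObject-style result to an explicit outcome.
WaitOutcome classifyWait(std::uint32_t result) {
    switch (result) {
        case kWaitObject0: return WaitOutcome::Acquired;
        case kWaitTimeout: return WaitOutcome::TimedOut;
        case kWaitFailed:  return WaitOutcome::Failed;  // this is where GetLastError() belongs
        default:           return WaitOutcome::Unexpected;
    }
}

// The check used in the question: anything that is not a timeout counts as acquired.
bool questionTreatsAsAcquired(std::uint32_t result) {
    return result != kWaitTimeout;
}

// True when the question's check accepts a result that is really a failure.
bool failureIsMisread() {
    assert(classifyWait(kWaitFailed) == WaitOutcome::Failed);
    return questionTreatsAsAcquired(kWaitFailed);
}
```

With a check like `classifyWait`, the sender thread could re-queue the message on a timeout and log the error code on a failure instead of proceeding as if it owned the semaphore.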

How to exit a program at a particular time while the main loop is listening for messages?

In the program's main loop, I'm listening on an EMS topic by calling tibemsMsgConsumer_Receive. Meanwhile, I want to exit the program at a specific time, say 5PM. How can I implement this?
I tried the following code, but it doesn't work properly when no message is received.
Is there a way to exit the program while the 'while' loop is blocked there?
while (1)
{
status = tibemsMsgConsumer_Receive(m_CmbsSpreadMatrixSubscriber, &msg);
if (status == TIBEMS_OK)
{
DoSomething();
}
if (atoi(getRunTime("hour").c_str()) >= 18)
{
exit(0);
}
}
Use tibemsMsgConsumer_ReceiveTimeout() and set an appropriate timeout to check your exit condition repeatedly.
From the description on that page:
This function consumes the next message from the consumer’s destination. When the destination does not have any messages ready, this function blocks:
If a message arrives at the destination, this call immediately consumes that message and returns.
If the (non-zero) timeout elapses before a message arrives, this call returns TIBEMS_TIMEOUT.
If another thread closes the consumer, this call returns TIBEMS_INTR.
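The loop structure this suggests can be sketched independently of EMS. In the code below, `ReceiveStatus`, `receiveUntil` and `demoReceiveUntil` are hypothetical names; the `receiveWithTimeout` callback stands in for a wrapper around tibemsMsgConsumer_ReceiveTimeout, and `shouldStop` for the time-of-day check.

```cpp
#include <cassert>
#include <functional>

// Hypothetical result of one timed receive attempt.
enum class ReceiveStatus { Ok, Timeout, Interrupted };

// Receives until shouldStop() returns true or the receive is interrupted.
// Returns the number of messages handled.
int receiveUntil(const std::function<ReceiveStatus()>& receiveWithTimeout,
                 const std::function<void()>& handleMessage,
                 const std::function<bool()>& shouldStop) {
    int handled = 0;
    while (!shouldStop()) {
        switch (receiveWithTimeout()) {
            case ReceiveStatus::Ok:
                handleMessage();
                ++handled;
                break;
            case ReceiveStatus::Timeout:
                break;  // nothing arrived; the loop re-checks the stop condition
            case ReceiveStatus::Interrupted:
                return handled;  // e.g. the consumer was closed elsewhere
        }
    }
    return handled;
}

// Scripted run: two messages, then only timeouts; the stop condition
// becomes true on its fifth evaluation.
int demoReceiveUntil() {
    int calls = 0;
    int checks = 0;
    const int handled = receiveUntil(
        [&] { return ++calls <= 2 ? ReceiveStatus::Ok : ReceiveStatus::Timeout; },
        [] {},
        [&] { return ++checks > 4; });
    assert(calls == 4);  // one receive attempt per loop iteration
    return handled;
}
```

The timeout bounds how late the exit can be: with a timeout of a few seconds, the stop condition is re-evaluated at least that often even when no message ever arrives.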
Before starting the main loop that listens for messages, I start a thread:
boost::thread aThread(&threadFunc);
In the thread function I simply check the time and exit the program. I'm not sure whether this is safe or correct.
void threadFunc()
{
while (true)
{
wait(60);
if (atoi(getRunTime("hour").c_str()) >= 18)
{
Log("Now it's 6PM, let's stop and get back tomorrow.");
exit(0);
}
}
}

How can I clean up properly when recv is blocking?

Consider the example code below (I typed it up quickly as an example, if there are errors it doesn't matter - I'm interested in the theory).
bool shutDown = false; //global
int main()
{
CreateThread(NULL, 0, &MessengerLoop, NULL, 0, NULL);
//do other programmy stuff...
}
DWORD WINAPI MessengerLoop( LPVOID lpParam )
{
zmq::context_t context(1);
zmq::socket_t socket (context, ZMQ_SUB);
socket.connect("tcp://localhost:5556");
socket.setsockopt(ZMQ_SUBSCRIBE, "10001 ", 6);
while(!shutDown)
{
zmq_msg_t getMessage;
zmq_msg_init(&getMessage);
zmq_msg_recv (&getMessage, socket, 0); //This line will wait forever for a message
processMessage(getMessage);
}
}
A thread is created to wait for incoming messages and to handle them appropriately. The thread is looping until shutDown is set to true.
In ZeroMQ the Guide specifically states what must be cleaned up, namely the messages, socket and context.
My issue is: Since recv will wait forever for a message, blocking the thread, how can I shut down this thread safely if a message is never received?
The blocking call will exit in a few ways. First, and this depends on your language and binding, an interrupt (Ctrl-C, SIGINT, SIGTERM) will exit the call. You'll get back (again, depending on your binding) an error or a null message (libzmq returns an EINTR error).
Second, if you terminate the context in another thread, the blocking call will also exit (libzmq returns an ETERM error).
Thirdly, you can set timeouts on the socket so it will return in any case after some timeout, if there's no data. We don't often do this but it can be useful in some cases.
Finally, what we do in practice is never do blocking receives but use zmq_poll to find out when sockets have messages waiting, then receive from those sockets. This is how you scale out to handling more sockets.
You can use the non-blocking flag ZMQ_DONTWAIT:
while(!shutDown)
{
zmq_msg_t getMessage;
zmq_msg_init(&getMessage);
while(-1 == zmq_msg_recv(&getMessage, socket, ZMQ_DONTWAIT))
{
if (EAGAIN != errno || shutDown)
{
break;
}
Sleep(100);
}
processMessage(getMessage);
}
Whenever the zmq context is destroyed, zmq_msg_recv returns -1. I use this as the terminating condition in all of my code.
while (!shutdown)
{
..
..
int rc = zmq_msg_recv (&getMessage, socket, 0);
if (rc != -1)
{
processMessage(getMessage);
}
else
break;
}
Remember to destroy the zmq context at the end of your main() for a proper clean-up.
zmq_ctx_destroy(zctx);
Let's say you have a class, say SUB (subscriber), that manages receiving your ZMQ messages. In the destructor or exit function of your main function/class, call the following:
pub->close();
///
/// Close the publish context
///
void PUB::close()
{
zmq_close (socket);
zmq_ctx_destroy (context);
}
This makes the blocking 'recv' return with an error that you can ignore, so the application exits cleanly. This is the right method. Good luck!

COM port read - Thread remains alive after timeout occurs

I have a DLL, written in C/C++, which includes a function called ReadPort that reads data from a serial COM port. This function is run in an extra thread started from another WINAPI function using _beginthreadex. When the COM port has data to be read, the worker thread returns the data and ends normally, the calling thread closes the worker thread's handle, and the DLL works fine.
However, if ReadPort is called without data pending on the COM port, then when the timeout occurs WaitForSingleObject returns WAIT_TIMEOUT but the worker thread never ends. As a result, virtual memory grows by about 1 MB every time, physical memory grows by some KB, and the application that calls the DLL becomes unstable. I also tried to use TerminateThread() but I got the same results.
I have to admit that although I have enough development experience, I am not familiar with C/C++. I did a lot of research before posting but unfortunately I didn't manage to solve my problem.
Does anyone have a clue how I could solve this problem? I really want to stick to this kind of solution. Also, I want to mention that I think I can't use any global variables for extra events, because each of the DLL's functions may be called many times for every COM port.
I post some parts of my code below:
The Worker Thread:
unsigned int __stdcall ReadPort(void* readstr){
DWORD dwError; int rres;DWORD dwCommModemStatus, dwBytesTransferred;
int ret;
char szBuff[64] = "";
ReadParams* params = (ReadParams*)readstr;
ret = SetCommMask(params->param2, EV_RXCHAR | EV_CTS | EV_DSR | EV_RLSD | EV_RING);
if (ret == 0)
{
_endthreadex(0);
return -1;
}
ret = WaitCommEvent(params->param2, &dwCommModemStatus, 0);
if (ret == 0)
{
_endthreadex(0);
return -2;
}
ret = SetCommMask(params->param2, EV_RXCHAR | EV_CTS | EV_DSR | EV_RLSD| EV_RING);
if (ret == 0)
{
_endthreadex(0);
return -3;
}
if (dwCommModemStatus & EV_RXCHAR||dwCommModemStatus & EV_RLSD)
{
rres = ReadFile(params->param2, szBuff, 64, &dwBytesTransferred,NULL);
if (rres == 0)
{
switch (dwError = GetLastError())
{
case ERROR_HANDLE_EOF:
_endthreadex(0);
return -4;
}
_endthreadex(0);
return -5;
}
else
{
strcpy(params->param1,szBuff);
_endthreadex(0);
return 0;
}
}
else
{
_endthreadex(0);
return 0;
}
_endthreadex(0);
return 0;}
The Calling Thread:
int WINAPI StartReadThread(HANDLE porthandle, HWND windowhandle){
HANDLE hThread;
unsigned threadID;
ReadParams readstr;
DWORD ret, ret2;
readstr.param2 = porthandle;
hThread = (HANDLE)_beginthreadex( NULL, 0, ReadPort, &readstr, 0, &threadID );
ret = WaitForSingleObject(hThread, 500);
if (ret == WAIT_OBJECT_0)
{
CloseHandle(hThread);
if (readstr.param1 != NULL)
// Send message to GUI
return 0;
}
else if (ret == WAIT_TIMEOUT)
{
ret2 = CloseHandle(hThread);
return -1;
}
else
{
ret2 = CloseHandle(hThread);
if (ret2 == 0)
return -2;
}}
Thank you in advance,
Sna.
Don't use WaitCommEvent. You can call ReadFile even when there is no data waiting.
Use SetCommTimeouts to make ReadFile itself timeout, instead of building a timeout on the inter-thread communications.
Change the delay in the WaitForSingleObject call to 5000 or 10000 and I bet your problem frequency goes way down.
Edwin's answer is also valid. The spawned thread does not die because you closed the thread handle.
There is no guarantee that the ReadPort thread has even started by the time you are timing out. Windows takes a LONG time to start a thread.
Here are some suggestions:
You never check the return value of _beginthreadex. How do you know the thread started?
Use whatever synchronization method you are comfortable with to sync the ReadPort thread startup with StartReadThread. It could be as simple as an integer flag that ReadPort sets to 1 when it's ready to work. Then the main thread can start its true waiting at that point. Otherwise, short of using a debugger, you'll never know what's happening between the two threads. Do not time out from the call to WaitForSingleObject in StartReadThread until your sync method indicates that ReadPort is working.
You should not use strcpy to copy the bytes received from the serial port with ReadFile. ReadFile tells you how many bytes it read. Use that value and memcpy to fill the buffer.
Look here and here for info on how to have ReadFile time out so your reads are not indefinite. Blocking forever on Windows is a recipe for disaster as it can cause zombie processes you cannot kill, among other problems.
You communicate no status to StartReadThread about what happened in the ReadPort thread. How do you know how many bytes ReadPort placed into szBuff? To get the thread's exit code, use GetExitCodeThread. Documented here. Note that you cannot use GetExitCodeThread if you've closed the thread handle.
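The strcpy suggestion above can be sketched as a small helper. `copyReceived` is a hypothetical name, not part of the question's code; it copies exactly the number of bytes the read reported, clamps to the destination size, and terminates the result. This matters in the question because ReadFile may fill all 64 bytes of szBuff, leaving no terminator for strcpy to stop at.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Copies `received` bytes from src into dst (capacity dstSize, including the
// terminator) and NUL-terminates the result. Truncates if dst is too small.
// Returns the number of data bytes actually copied.
std::size_t copyReceived(char* dst, std::size_t dstSize,
                         const char* src, std::size_t received) {
    assert(dst != nullptr && src != nullptr && dstSize > 0);
    const std::size_t n = received < dstSize - 1 ? received : dstSize - 1;
    std::memcpy(dst, src, n);  // byte count comes from the read, not from a terminator
    dst[n] = '\0';
    return n;
}
```

In the worker thread this would be called with dwBytesTransferred as the byte count, and the returned length could be passed back to the caller alongside the data, since serial data may legitimately contain zero bytes.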
In your calling thread you close the thread handle after a timeout. This only stops you from using the handle; the worker thread is still running. You should use a loop that waits again.

Overlapped IO and ERROR_IO_INCOMPLETE

I have had overlapped IO working for 2 years now, but I've used it in a new application and it's throwing this error at me (when I hide the main form).
I have googled, but I fail to understand what the error means and how I should handle it.
Any ideas?
I'm using this over named pipes and the error happens after calling GetOverlappedResult:
DWORD dwWait = WaitForMultipleObjects(getNumEvents(), m_hEventsArr, FALSE, 500);
//check result. Get correct data
BOOL fSuccess = GetOverlappedResult(data->hPipe, &data->oOverlap, &cbRet, FALSE);
// error happens here
ERROR_IO_INCOMPLETE is an error code that means that the Overlapped operation is still in progress; GetOverlappedResult returns false as the operation hasn't succeeded yet.
You have two options - blocking and non-blocking:
Block until the operation completes: change your GetOverlappedResult call to:
BOOL fSuccess = GetOverlappedResult(data->hPipe, &data->oOverlap, &cbRet, TRUE);
This ensures that the Overlapped operation has completed (i.e. succeeds or fails) before returning the result.
Poll for completion: if the operation is still in progress, you can return from the function, and perform other work while waiting for the result:
BOOL fSuccess = GetOverlappedResult(data->hPipe, &data->oOverlap, &cbRet, FALSE);
if (!fSuccess) {
if (GetLastError() == ERROR_IO_INCOMPLETE) return; // operation still in progress
/* handle error */
} else {
/* handle success */
}
Generally, the second option is preferable to the first, as it does not cause your application to stop and wait for a result. (If the code is running on a separate thread, however, the first option may be preferable.)