Overlapped IO and ERROR_IO_INCOMPLETE - c++

I have had overlapped IO working for 2 years now but ive used it with a new application and its chucking this error at me (when i hide the main form).
I have googled but i fail to understand what the error means and how i should handle it?
Any ideas?
Im using this over NamedPipes and the error happens after calling GetOverlappedResult
DWORD dwWait = WaitForMultipleObjects(getNumEvents(), m_hEventsArr, FALSE, 500);
//check result. Get correct data
BOOL fSuccess = GetOverlappedResult(data->hPipe, &data->oOverlap, &cbRet, FALSE);
// error happens here

ERROR_IO_INCOMPLETE is an error code that means that the Overlapped operation is still in progress; GetOverlappedResult returns false as the operation hasn't succeeded yet.
You have two options - blocking and non-blocking:
Block until the operation completes: change your GetOverlappedResult call to:
BOOL fSuccess = GetOverlappedResult(data->hPipe, &data->oOverlap, &cbRet, TRUE);
This ensures that the Overlapped operation has completed (i.e. succeeds or fails) before returning the result.
Poll for completion: if the operation is still in progress, you can return from the function, and perform other work while waiting for the result:
BOOL fSuccess = GetOverlappedResult(data->hPipe, &data->oOverlap, &cbRet, FALSE);
if (!fSuccess) {
if (GetLastError() == ERROR_IO_INCOMPLETE) return; // operation still in progress
/* handle error */
} else {
/* handle success */
}
Generally, the second option is preferable to the first, as it does not cause your application to stop and wait for a result. (If the code is running on a separate thread, however, the first option may be preferable.)

Related

WriteFileEx completion routine succeeds, but bytes transferred is incorrect

I'm communicating between two processes on different machines via a pipe, using IO completion routines.
Occasionally, when the completion routine for WriteFileEx gets called, the completion routine parameter dwErrorCode is 0 (i.e. no error), GetOverlappedResult returns true (i.e. no error), but dwNumberOfBytesTransfered does not match nNumberOfBytesToWrite in the call to WriteFileEx. I only see this on the client end of the pipe however.
If the number of bytes transferred does not match the number of bytes that was requested to transfer, how can this be deemed a success?
This is how the client's handle to the pipe is created:
mHPipe = CreateFile(pipeName, // pipe name
GENERIC_READ | // read and write access
GENERIC_WRITE,
0, // no sharing
NULL, // default security attributes
OPEN_EXISTING, // opens existing pipe
FILE_FLAG_OVERLAPPED | // overlapped
FILE_FLAG_WRITE_THROUGH, // write through mode
NULL); // no template file
// do some checking...
// The pipe connected; change to message-read mode.
DWORD dwMode = PIPE_READMODE_MESSAGE;
BOOL fSuccess = SetNamedPipeHandleState(mHPipe, // pipe handle
&dwMode, // new pipe mode
NULL, // don't set maximum bytes
NULL); // don't set maximum time
Can anyone see why this would happen?
Thanks
EDIT:
The relevant WriteFileEx code is as follows:
void WINAPI CompletedWriteRoutine(DWORD dwErrorCode, DWORD dwNumberOfBytesTransfered, LPOVERLAPPED lpOverLap)
{
BOOL fWrite = FALSE;
LPPIPEINST lpPipeInst = (LPPIPEINST)lpOverLap;
//
// ! 99.9% of the time, dwNumberOfBytesTransfered == lpPipeInst->cbDataSize
// but 0.1% of the time, they do not match
//
// Some stuff
// Copy next message to send
memcpy_s(lpPipeInst->chData, sizeof(lpPipeInst->chData), pMsg->msg, pMsg->size);
lpPipeInst->cbDataSize = pMsg->size;
// Some other stuff
fWrite = WriteFileEx(lpPipeInst->hPipeInst,
lpPipeInst->chData,
lpPipeInst->cbDataSize,
(LPOVERLAPPED) lpPipeInst,
(LPOVERLAPPED_COMPLETION_ROUTINE)CompletedWriteRoutine);
// Some other, other stuff
}
Where LPPIPEINST is declared as:
typedef struct
{
OVERLAPPED oOverlap; // must remain first item
HANDLE hPipeInst;
TCHAR chData[BUFSIZE];
DWORD cbDataSize;
} PIPEINST, *LPPIPEINST;
And the initial call to CompletedWriteRoutine is given the lpOverlap parameter declared thusly:
PIPEINST pipeInstWrite = {0};
pipeInstWrite.hPipeInst = client.getPipeHandle();
pipeInstWrite.oOverlap.hEvent = hEvent[eventWriteComplete];
EDIT:
After trying re-initializing the overlapped structure as Harry suggested, I noticed something peculiar.
I memset the OVERLAPPED structure to zero before each WriteFileEx, and roughly 1/5000 completion routine callbacks, the cbWritten parameter and the OVERLAPPED structure's InternalHigh member was now set to the size of the previous message, instead of the most recent message. I added some logging to file on both the client and server ends of the pipe inside the completion routines, and the data sent and received at both ends was an exact match (and the correct, expected data). This then unveiled that in the time taken to write the data to a file, the InternalHigh member in the OVERLAPPED structure had changed to now reflect the size of the message I was expecting (cbWritten remains the old message size). I removed the file logging, and am now able to reproduce the issue like clockwork with this code:
void WINAPI CompletedWriteRoutine(DWORD dwErr, DWORD cbWritten, LPOVERLAPPED lpOverLap)
{
LPPIPEINST lpPipeInst = (LPPIPEINST)lpOverLap;
// Completion routine says it wrote the amount of data from the previous callback
if (cbWritten != lpPipeInst->cbDataSize)
{
// Roughly 1 in 5000 callbacks ends up in here
OVERLAPPED ovl1 = lpPipeInst->oOverlap; // Contains size of previous message, i.e. cbWritten
Sleep(100);
OVERLAPPED ovl2 = lpPipeInst->oOverlap; // Contains size of most recent message, i.e lpPipeInst->cbDataSize
}
...
}
It seems that sometimes, the completion routine is being called before the OVERLAPPED structure and the completion routine input parameter is updated. I'm using MsgWaitForMultipleObjectsEx(eventLast, hEvent, INFINITE, QS_POSTMESSAGE, MWMO_ALERTABLE); for the completion routines to be called on Windows 7 64 bit.
This MSDN page says:
"The system does not use the OVERLAPPED structure after the completion routine is called, so the completion routine can deallocate the memory used by the overlapped structure."
...so apparently, what this code can reproduce should never happen?
Is this a WINAPI bug?
Added FILE_FLAG_NO_BUFFERING to the CreateFile call - haven't seen the problem since. Thanks everyone who commented for your time.

How to check if WriteFile function is done

I want to check if the WriteFile function is done writing to UART so that i can call ReadFile on the same ComDev without causing an Exception.
It seems the WriteFile function can return before writing is done.
BOOL WriteCommBlock(HANDLE * pComDev, char *pBuffer , int BytesToWrite)
{
while(fComPortInUse){}
fComPortInUse = 1;
BOOL bWriteStat = 0;
DWORD BytesWritten = 0;
COMSTAT ComStat = {0};
OVERLAPPED osWrite = {0,0,0};
if(WriteFile(*pComDev,pBuffer,BytesToWrite,&BytesWritten,&osWrite) == FALSE)
{
short Errorcode = GetLastError();
if( Errorcode != ERROR_IO_PENDING )
short breakpoint = 5; // Error
Sleep(1000); // complete write operation TBD
fComPortInUse = 0;
return (FALSE);
}
fComPortInUse = 0;
return (TRUE);
}
I used Sleep(1000) as an workaround, but how can i wait for an appropriate time?
You can create a Event, store it in your overlapped structure and wait for it to be signalled. Like this (untested):
BOOL WriteCommBlock(HANDLE * pComDev, char *pBuffer , int BytesToWrite)
{
while(fComPortInUse){}
fComPortInUse = 1;
BOOL bWriteStat = 0;
DWORD BytesWritten = 0;
COMSTAT ComStat = {0};
OVERLAPPED osWrite = {0,0,0};
HANDLE hEvent = CreateEvent(NULL, TRUE, FALSE, NULL);
if (hEvent != NULL)
{
osWrite.hEvent = hEvent;
if(WriteFile(*pComDev,pBuffer,BytesToWrite,&BytesWritten,&osWrite) == FALSE)
{
short Errorcode = GetLastError();
if( Errorcode != ERROR_IO_PENDING )
short breakpoint = 5; // Error
WaitForSingleObject(hEvent, INFINITE);
fComPortInUse = 0;
return (FALSE);
}
CloseHandle(hEvent);
}
fComPortInUse = 0;
return (TRUE);
}
Note that depending on what else you are trying to do simply calling WaitForSingleObject() might not be the best idea. And neither might an INFINITE timeout.
Your problem is the incorrect use of the overlapped I/O, regardless to the UART or whatever underlying device.
The easiest (though not necessarily the most optimal) way to fix your code is to use an event to handle the I/O completion.
// ...
OVERLAPPED osWrite = {0,0,0};
osWrite.hEvent = CreateEvent(FALSE, NULL, NULL, FALSE);
if(WriteFile(*pComDev,pBuffer,BytesToWrite,&BytesWritten,&osWrite) == FALSE)
{
DWORD Errorcode = GetLastError();
// ensure it's ERROR_IO_PENDING
WaitForSingleObject(osWrite.hEvent, INFINITE);
}
CloseHandle(osWrite.hEvent);
Note however that the whole I/O is synchronous. It's handles by the OS in an asynchronous way, however your code doesn't go on until it's finished. If so, why do you use the overlapped I/O anyway?
One should use it to enable simultaneous processing of several I/Os (and other tasks) within the same thread. To do this correctly - you should allocate the OVERLAPPED structure on heap and use one of the available completion mechanisms: event, APC, completion port or etc. Your program flow logic should also be changed.
Since you didn't say that you need asynchronous I/O, you should try synchronous. It's easier. I think if you just pass a null pointer for the OVERLAPPED arg you get synchronous, blocking, I/O. Please see the example code I wrote in the "Windows C" section of this document:
http://www.pololu.com/docs/0J40/
Your Sleep(1000); is of no use, it will only execute after the writefile completes its operation.You have to wait till WriteFile is over.
if(WriteFile(*pComDev,pBuffer,BytesToWrite,&BytesWritten,&osWrite) == FALSE)
{}
You must be knowing that anything inside conditionals will only execute if the result is true.
And here the result is sent to the program after completion(whether complete or with error) of WriteFile routine.
OK, I missed the overlapped I/O OVL parameter in the read/write code, so It's just as well I only replied yesterday as a comment else I would be hammered with downvotes:(
The classic way of handling overlapped I/O is to have an _OVL struct as a data member of the buffer class that is issued in the overlapped read/write call. This makes it easy to have read and write calls loaded in at the same time, (or indeed, multiple read/write calls with separate buffer instances).
For COM posrts, I usually use an APC completion routine whose address is passed in the readFileEx/writeFileEx APIs. This leaves the hEvent field of the _OVL free to use to hold the instance pointer of the buffer so it's easy to cast it back inside the completion routine, (this means that each buffer class instance contains an _OVL memebr that contains an hEvent field that points to the buffer class instance - sounds a but weird, but works fine).

COM port read - Thread remains alive after timeout occurs

I have a dll which includes a function called ReadPort that reads data from serial COM port, written in c/c++. This function is called within an extra thread from another WINAPI function using the _beginthreadex. When COM port has data to be read, the worker thread returns the data, ends normaly, the calling thread closes the worker's thread handle and the dll works fine.
However, if ReadPort is called without data pending on the COM port, when timeout occurs then WaitForSingleObject returns WAIT_TIMEOUT but the worker thread never ends. As a result, virtual memory grows at about 1 MB every time, physical memory grows some KBs and the application that calls the dll becomes unstable. I also tryied to use TerminateThread() but i got the same results.
I have to admit that although i have enough developing experience, i am not familiar with c/c++. I did a lot of research before posting but unfortunately i didn't manage to solve my problem.
Does anyone have a clue on how could i solve this problem? However, I really want to stick to this kind of solution. Also, i want to mention that i think i can't use any global variables to use some kind of extra events, because each dll's functions may be called many times for every COM port.
I post some parts of my code below:
The Worker Thread:
unsigned int __stdcall ReadPort(void* readstr){
DWORD dwError; int rres;DWORD dwCommModemStatus, dwBytesTransferred;
int ret;
char szBuff[64] = "";
ReadParams* params = (ReadParams*)readstr;
ret = SetCommMask(params->param2, EV_RXCHAR | EV_CTS | EV_DSR | EV_RLSD | EV_RING);
if (ret == 0)
{
_endthreadex(0);
return -1;
}
ret = WaitCommEvent(params->param2, &dwCommModemStatus, 0);
if (ret == 0)
{
_endthreadex(0);
return -2;
}
ret = SetCommMask(params->param2, EV_RXCHAR | EV_CTS | EV_DSR | EV_RLSD| EV_RING);
if (ret == 0)
{
_endthreadex(0);
return -3;
}
if (dwCommModemStatus & EV_RXCHAR||dwCommModemStatus & EV_RLSD)
{
rres = ReadFile(params->param2, szBuff, 64, &dwBytesTransferred,NULL);
if (rres == 0)
{
switch (dwError = GetLastError())
{
case ERROR_HANDLE_EOF:
_endthreadex(0);
return -4;
}
_endthreadex(0);
return -5;
}
else
{
strcpy(params->param1,szBuff);
_endthreadex(0);
return 0;
}
}
else
{
_endthreadex(0);
return 0;
}
_endthreadex(0);
return 0;}
The Calling Thread:
int WINAPI StartReadThread(HANDLE porthandle, HWND windowhandle){
HANDLE hThread;
unsigned threadID;
ReadParams readstr;
DWORD ret, ret2;
readstr.param2 = porthandle;
hThread = (HANDLE)_beginthreadex( NULL, 0, ReadPort, &readstr, 0, &threadID );
ret = WaitForSingleObject(hThread, 500);
if (ret == WAIT_OBJECT_0)
{
CloseHandle(hThread);
if (readstr.param1 != NULL)
// Send message to GUI
return 0;
}
else if (ret == WAIT_TIMEOUT)
{
ret2 = CloseHandle(hThread);
return -1;
}
else
{
ret2 = CloseHandle(hThread);
if (ret2 == 0)
return -2;
}}
Thank you in advance,
Sna.
Don't use WaitCommEvent. You can call ReadFile even when there is no data waiting.
Use SetCommTimeouts to make ReadFile itself timeout, instead of building a timeout on the inter-thread communications.
Change the delay in the WaitForSingleObject call to 5000 or 10000 and I bet your problem frequency goes way down.
Edwin's answer is also valid. The spawned thread does not die because you closed the thread handle.
There is no guarantee that the ReadPort thread has even started by the time you are timing out. Windows takes a LONG time to start a thread.
Here are some suggestions:
You never check the return value of beginthreadex. How do you know the thread started?
Use whatever synchronization method with which you are comfortable to sync the ReadPort thread startup with StartReadThread. It could be as simple as an integer flag that ReadPort sets to 1 when its ready to work. Then the main thread can start its true waiting at that point. Otherwise you'll never know short of using a debugger what's happening between the 2 threads. Do not time out from the call to WaitForSingleObject in StartReadThread until your sync method indicates that ReadPort is working.
You should not use strcpy to copy the bytes received from the serial port with ReadFile. ReadFile tells you how many bytes it read. Use that value and memcpy to fill the buffer.
Look here and here for info on how to have ReadFile time out so your reads are not indefinite. Blocking forever on Windows is a recipe for disaster as it can cause zombie processes you cannot kill, among other problems.
You communicate no status to StartReadThread about what happened in the ReadPort thread. How do you know how many bytes ReadPort placed into szBuff? To get the theads exit code, use GetExitCodeThread. Documented here. Note that you cannot use GetExitCodeThread if you've closed the thread handle.
In your calling thread after a timeout you close the threadhandle. This will only stop you from using the handle. The worker thread however is still running. You should use a loop which waits again.

out of process rendering

Im trying to implement out of process rendering for my application (like what chrome does). I have the ipc (interprocess communication) all set up and working however it just deadlocks when trying to init a new form on the other process.
I have started the process with inherit handles as true is there any thing else i need to do?
I happy to provide sample code if needed.
Edit: it deadlocks in window api calls. Runs fine when in the same process
It is very easy to couple two threads if they own windows with any kind of relationship.
The effective result of this is, your IPC calls cannot block when waiting for a reply - your IPC reads always need to use MsgWaitForMultipleObjects so that you can process window messages from the other process/thread while waiting for the IPC message indicating completion.
What you do is replace your current call to WaitForMultipleObjects with MSGWaitForMultipleObjects. When it returns, you check the return value. If nCount is the number of IPC handles you are waiting to be signalled:
// Pump messages while waiting on 0 or more handles.
for(;;)
{
while(PeekMessage(&msg,0,0,0,PM_REMOVE))
{
TranslateMessage(&msg);
DispatchMessage(&msg);
}
DWORD ret = MsgWaitForMultipleObjects(nCount,pHandles,FALSE,dwTimeout,QS_ALLEVENTS);
if(ret >= WAIT_OBJECT_0 && ret < (WAIT_OBJECT_0 + nCount))
{
// one of the handles was signalled.
return ret;
}
else if(ret == WAIT_OBJECT_0 + nCount)
{
// The wait was aborted because there is at least one message,
// go back to pumping messages
continue;
}
else
{
// test for WAIT_OBJECT_ABANDONED_0, WAIT_TIMEOUT etc. as appropriate
}
}

MsgWaitForMultipleObjects sometimes returns WAIT_FAILED with no GetLastError value

I have a thread that creates COM objects that require a single threaded apartment.
Originally, this thread's main function put it into a WaitForMultipleObjects loop. Evidently this is a problem, because it prevents the COM message pump from doing it's job.
I replaced this with MsgWaitForMultipleObjects as a solution, but now I'm running into an issue: MsgWaitForMultipleObjects sometimes (often) returns WAIT_FAILED, but doesn't set an error.
The code handles a WAIT_FAILED return value by just continuing and trying to call MsgWaitForMultipleObjects again. The call to MsgWaitForMultipleObjects may return WAIT_FAILED a few times (the most I've seen is 9), but then it suddenly works with no problem.
The code is written so that this could potentially go into an infinite loop if the function were returning WAIT_FAILED for a valid reason. I know I should fix this, but at the moment I'm considering it a 'work around' because the MsgWaitForMultipleObjects call will eventually succeed.
This code is being tested on Windows 7, Vista, and XP (all 32-bit, Windows 7 32- and 64-bit).
Does anyone have any idea why this is happening?
The relevant code:
bool run = true;
while (run)
{
DWORD ret = MsgWaitForMultipleObjects(2, events, FALSE, INFINITE,
QS_ALLINPUT);
switch (ret)
{
case WAIT_OBJECT_0:
{
ResetEvent(events[0]);
ProcessWorkQueue();
break;
}
case WAIT_OBJECT_0 + 1:
{
run = false;
break;
}
case WAIT_OBJECT_0 + 2:
{
MSG msg;
while (PeekMessage(&msg, NULL, 0, 0, PM_REMOVE))
DispatchMessage(&msg);
break;
}
case WAIT_FAILED:
{
Logger::Output(L"Wait failed in Notify::Run(), error is "
+ boost::lexical_cast<std::wstring>(GetLastError()));
}
}
}
Example output would be:
Wait failed in Notify::Run(), error is 0
Wait failed in Notify::Run(), error is 0
Wait failed in Notify::Run(), error is 0
Wait failed in Notify::Run(), error is 0
Wait failed in Notify::Run(), error is 0
Wait failed in Notify::Run(), error is 0
Wait failed in Notify::Run(), error is 0
Wait failed in Notify::Run(), error is 0
Wait failed in Notify::Run(), error is 0
// At this point, the wait succeeds
I believe the WAIT_FAILED return value is only occurring after the wait has been broken by a message.
That shouldn't be happening, and I sure can't explain exactly why it does. I do have a few pointers, however.
First, you're not calling TranslateMessage() before DispatchMessage() in your message pump. That's bad juju, and you don't want bad juju anywhere near MsgWaitForMultipleObjects().
You also might want to try explicitly calling MsgWaitForMultipleObjectsEx(), just in case it doesn't exhibit the same problem:
DWORD ret = MsgWaitForMultipleObjectsEx(2, events, INFINITE, QS_ALLINPUT, 0);
Finally, it might be far-fetched but consider what's happening after MsgWaitForMultipleObjects() returns and before GetLastError() is called. Disregarding the assignment to ret, I see an implicit call to std::wstring's constructor.
Can you guarantee std::wstring's constructor doesn't have some side-effect that clears the thread's last error code? I sure can't, so I'd move the call to GetLastError() into an good, old-fashioned, atomic assignment to a DWORD variable in the first line of the case statement.