I have a program reading from a file, doing some work on the input, then outputting it to a socket. It had been running fine for over a month when suddenly I started to get error 11 (EAGAIN?) errors that kill the program. When I start 32 instances of the program, more than half die within a few minutes after receiving EAGAIN. I never set the file as non-blocking, and besides, how would an input file block; the data is always there, isn't it? The only change I made to this code was to disable the SIGPIPE signal to avoid the program dying when its socket connection is lost.
Forgive me for not posting code, but I can't copy and paste and the code is sort of spread out anyway. It's really as simple as opening a file on one line and calling readline(file, inputString) later on, though.
Thanks.
EAGAIN normally means a non-blocking read had no data available; a blocking read that is interrupted by a signal fails with EINTR instead. In either case, the read should simply be restarted.
I assume it is SIGPIPE, which used to kill your programs outright and is now handled (even if the handler does nothing).
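For illustration, a minimal sketch of that advice (read_retry is just a name made up for this example; it assumes a plain POSIX descriptor):

#include <cerrno>
#include <unistd.h>

// Restart the read when it is interrupted (EINTR) or would block (EAGAIN).
// On a non-blocking descriptor this would busy-wait; poll()/select() would
// be the better tool there.
ssize_t read_retry(int fd, char *buf, size_t len)
{
    for (;;) {
        ssize_t n = read(fd, buf, len);
        if (n >= 0)
            return n;                          // data, or 0 at end of file
        if (errno == EINTR || errno == EAGAIN)
            continue;                          // restart the call
        return -1;                             // a real error
    }
}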
Related
I can receive and send data as long as I don't use fd_set(..)/select.
After that, I can't send data to the client. The data is sent "after" killing the process (pressing Ctrl+C).
For example if I run that snippet:
http://www.binarytides.com/multiple-socket-connections-fdset-select-linux/
I get the "welcome client-connected message" (line 126) but after the next loop, the new client is added via fd_set and select. Line 171 should send the received message back to the client, but I only get it back after killing the process.
Maybe it's because the "OS running the server" thinks that the connection is busy and buffers the output. And that could be the reason why killing the process causes the buffer to be send to client.
If I use write() instead of send() the behavoir doesn't change.
int count = write()
count is fine and the code doesn't throw any error.
I tried it on two Ubuntu 14.04 systems (one LTS and another built from source).
If you need more source code I will upload it; I just think that the example in the link is well documented and shows the problem.
I have already found a lot of material on the topic, but I can't figure out what I am doing wrong, as all tutorials and docs do it that way.
Unfortunately I am not that familiar with C++/Linux and don't know what to investigate next, so any help is appreciated.
Thanks :)
My suspicion is that what you are seeing is not a network problem at all, but rather a buffering problem with your program's stdout stream. In particular, characters your program sends to stdout won't actually become visible in the terminal window until either (a) a newline character ('\n') is printed, or (b) you manually flush the stream (e.g. via fflush(stdout) or cout.flush()), or (c) the program terminates (as happens when you press CTRL-C).
So most likely your client program did receive and print the message, but you aren't seeing it because the program is waiting for the newline character before printing anything to the terminal. (It makes sense to do that in cases where the program is printing out a line of text one small substring at a time, but it can be confusing.)
The easy fix, then (assuming this is indeed the problem), would be to call fflush(stdout) (or printf("\n")) after you call printf() to print the received text. (Or, if you are using C++ streams, call cout.flush() or cout << endl after your cout << theText.)
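For example (a tiny self-contained sketch; the message text is made up):

#include <cstdio>

int main()
{
    const char *received = "text from the server, no newline yet";
    printf("%s", received);   // stays in the stdout buffer: no '\n' was printed
    fflush(stdout);           // force it to appear in the terminal immediately
    return 0;
}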
Found the error, thanks to Jeremy Friesner, who pointed to the client. I read until a "\n" occurs, then parse the message. For testing my C++ server I had sent messages without a "\n". Thank you.
I am building an application that intercepts a serial communication line by receiving the transmission, modifying the data, and echoing the changed result.
The transmitted data consists of status sentences at a high baud rate, with a lot of data.
I have created two threads: one reads the sentences and pushes a pointer to each new sentence into a queue, and the other pops the pointers out of the queue, manipulates them, sends them to the serial port, and deletes the pointer.
The queue operations are in external functions with CriticalSection locks, so that part works fine.
To make sure the queue doesn't overflow quickly, I need to send the messages quickly and not wait for the receiving to end.
To my understanding, serial ports can receive and transmit simultaneously, but trying to do so gives access-restriction errors.
The other solution is to split the system onto two different ports, but I am trying to avoid that because of the hardware changes and the need for another USB port and converter.
I read about OVERLAPPED structures but didn't fully understand their usage; as I understand it, they manage asynchronous operation, whereas my issue is parallel operation.
Sorry for my poor English; any help or explanation will help.
I used this class for the serial communication, setting overlapped to enabled when opening the COM port to allow wait-event timeouts:
http://www.codeproject.com/Articles/992/Serial-library-for-C
Thanks in advance.
Roman.
Clarification:
I'm not opening the port twice, just once in the main program, and I pass the handle to both threads (writing it down now, I see this approach magnifies the problem).
More details:
The error comes from the Cserial library:
"Cserial::read overlapped complete without result." Commenting the send back to serial command in the sending thread will not raise an error and the queue is filled and displays correctly–
Im on a classified system without internet access so i cant upload the sample, writing from my tablet. The error accures after I get the first sentace, which triggers the first send command ss soon as queues size changes, and then the recieving thread exits because recieve failes, so the queue stops to fill and nothing sends out.
Probbly because both use same serial handler but whats the alternative to access the same port simultaniosly without locking one thread or the other
Ignoring error 996, which is the error id of the "read overlapped completed without results" and not exiting the thread when its detected makes both recieve an transmited data wrong (missing bytes)
At the buttom line, after asking alot of questions:
Why a read operation is interrupted by a write operation if these are two seperate comunication lines?can i use two handlers one for each task on the same port?
Is the D+/- in usb is transmit/recieve or both line used for transmit and recieve?
":read overlapped complete without result"
Are you preventing the read from being interrupted by the OS switching execution to the write thread? You need to protect this from happening by using a mutex or similar.
The real solution is to switch to an asynchronous library, such as boost::asio.
Why is a read operation interrupted by a write operation if these are two separate communication lines?
Here is a possible hand-waving visualization of what happens if you use synchronous operations in two threads without locking them against each other. (I am guessing at the details of how you arranged your software.)
1. Your app receives a read request from the port.
2. Your app requests the OS to start the read thread.
3. The OS agrees, and your read thread completes the read.
4. Your app does its processing.
5. Your app asks the OS to start the write thread.
6. The OS agrees, and your write thread starts a write.
7. A second read request arrives on the port. This does not interrupt anything; it just waits.
8. The write is not yet finished, but the OS decides that the write thread has had enough time. It decides to switch context to the read thread, which is waiting.
9. The read thread starts reading.
10. Again the OS decides that the running thread (read) has had a fair crack at the CPU. It switches context back to the write thread. This crashes the unfinished read. Note that this happens in your software, not in the hardware or the hardware driver.
This should give you a general insight into the sort of problems that occur unless you keep the OS from running the reads and writes over the top of each other. It is a matter of opinion whether it is better to use multithreading with mutexes (or equivalent) or asynchronous event-driven designs.
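As a rough sketch of the mutex option (std::mutex here; readSentence() and writeSentence() are hypothetical stand-ins for whatever CSerial calls you actually make, and the read should use a short timeout so the writer regularly gets its turn):

#include <mutex>
#include <string>

std::mutex portMutex;   // serializes every access to the shared serial handle

// Hypothetical stand-ins for the real CSerial read/write calls.
std::string readSentence()                      { return std::string(); }
void        writeSentence(const std::string &)  {}

void receiverLoop()
{
    for (;;) {
        std::string s;
        {
            std::lock_guard<std::mutex> lock(portMutex);  // no write can start mid-read
            s = readSentence();                           // should time out quickly if idle
        }
        if (!s.empty()) {
            // ... push s into the queue ...
        }
    }
}

void sendSentence(const std::string &modified)
{
    std::lock_guard<std::mutex> lock(portMutex);          // no read can start mid-write
    writeSentence(modified);
}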
Two threads can't operate on a single port / file descriptor. Depending on what library you use, you should either do this asynchronously or check how many bytes can be read/written without blocking the thread. (If it is a raw Linux file descriptor, you should look at poll/select.)
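On Linux, for instance, the poll() check could look roughly like this (a sketch; error handling is omitted):

#include <poll.h>

// Report whether 'fd' is readable and/or writable within 'timeout_ms',
// without blocking the calling thread on the actual read or write.
void check_ready(int fd, int timeout_ms, bool &can_read, bool &can_write)
{
    struct pollfd pfd;
    pfd.fd = fd;
    pfd.events = POLLIN | POLLOUT;
    pfd.revents = 0;

    can_read = can_write = false;
    if (poll(&pfd, 1, timeout_ms) > 0) {
        can_read  = (pfd.revents & POLLIN)  != 0;
        can_write = (pfd.revents & POLLOUT) != 0;
    }
}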
When my client sends a file to the server, should I Sleep(100) or so before sending the next chunk to ensure the server has enough time to download + write the data?
Does that just seem completely unnecessary?
Also, I'm getting would-block errors (#10035) when sending a chunk, so I'm just looping the send until it succeeds (if send == SOCKET_ERROR goto SendAgain;). Is that OK?
If you're sending your file via TCP, then it's the protocol that ensures everything is received; I wouldn't put a sleep between chunks.
The would-block error means either that you're sending more data than your output buffer can hold, or that you're sending it too quickly and the buffers fill up. It is fine to send it again, because the data that didn't fit was simply not queued and has to be retried.
Here is a small article about your error: Winsock error 10035
In my opinion, using a sleep function to wait for something to be done is the wrong way 99% of the time.
You'll never know how long a process will need or how long you should expect it to take (it can be interrupted by e.g. load spikes, other I/O problems, or whatever).
If you want to make sure something important is executed completely, you should read about semaphores or similar mechanisms, where you lock/release on start/end.
Taken from a man-page:
When the message does not fit into the send buffer of the socket,
send() normally blocks, unless the socket has been placed in
nonblocking I/O mode. In nonblocking mode it would fail with the error
EAGAIN or EWOULDBLOCK in this case. The select(2) call may be
used to determine when it is possible to send more data.
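Put together, a hedged sketch of that approach for a non-blocking socket (BSD-style calls; with Winsock the check would be WSAGetLastError() == WSAEWOULDBLOCK rather than errno, and the descriptor type would be SOCKET):

#include <cerrno>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/types.h>

// Send the whole buffer, waiting with select() whenever the send buffer
// is full instead of spinning in a goto loop.
bool send_all(int sock, const char *data, size_t len)
{
    size_t sent = 0;
    while (sent < len) {
        ssize_t n = send(sock, data + sent, len - sent, 0);
        if (n > 0) {
            sent += (size_t)n;
            continue;
        }
        if (n < 0 && (errno == EWOULDBLOCK || errno == EAGAIN)) {
            fd_set wfds;
            FD_ZERO(&wfds);
            FD_SET(sock, &wfds);
            if (select(sock + 1, NULL, &wfds, NULL, NULL) < 0)
                return false;              // select failed
            continue;                      // socket is writable again, retry
        }
        return false;                      // a real send error
    }
    return true;
}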
I'm writing a program that should control a piece of scientific hardware over a COM port. The program itself is written with wxWidgets and uses the ctb library. To test it before I connect it to the 300 k€ equipment, I use com0com (a null-modem emulator) to forward the COM2 port. To emulate my hardware I use wxTerminal (COM3). Altogether it works nicely: one can debug not only in VS or DB but also see the whole data transfer in wxTerminal.
Now to my problem. To send data to the COM port I use the ctb::SerialPort::Write() function.
device->Write( (char*)line.c_str(), line.size() );
However, if I disconnect the connection on the wxTerminal side (i.e. COM2->NULL), the program hangs in this function.
It's obvious that I should add some function to test whether my equipment is still there, but to do that I need to send a data packet to it and expect some answer. So I'm back to Write().
"Just in case", I've also tried ctb::IOBase::Writev(char * buf, size_t len, unsigned int timeout_in_ms) with the timeout set to 100 ms, and I still get the program hanging on the same line. That's actually expected behavior, as in this case the timeout only means that the line is blocked until the whole buffer is transferred or the timeout is reached.
Connecting wxTerminal to COM3 un-freezes the debugger or the stand-alone program. The Sun is shining, the birds are singing.
Can somebody give me a hint on how to overcome this problem? I'd appreciate it if comments stayed within the wxWidgets world - I really do not want to rewrite the whole program with another toolkit.
If your COM port library does not provide effective timeouts on a blocked write (presumably because of hardware flow control), you could implement your own by threading off the write. You could use a couple of events/semaphores/condvars/whatever: one to signal to the thread that there is something in a buffer to send, and another, signaled by the thread after it has sent the buffer, that you can wait on with a timeout. If the 'ack' wait times out, your COM port is stuck and you can pop up a 'Check cable' message box. I don't know what other calls your port library supports, so I don't know how you could implement flushes/retries.
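A rough sketch of that idea with C++11 primitives (portWrite() below is a hypothetical stand-in for the blocking device->Write() call; the names and timeout handling are made up for illustration):

#include <chrono>
#include <condition_variable>
#include <mutex>
#include <string>
#include <thread>

// Hypothetical stand-in for the blocking ctb write call.
void portWrite(const std::string & /*data*/) { /* device->Write(...) */ }

std::mutex              m;
std::condition_variable cv;
std::string             pending;          // buffer handed off to the writer thread
bool                    haveData = false;
bool                    done     = false;

// Start once, e.g. std::thread(writerThread).detach();
void writerThread()
{
    for (;;) {
        std::unique_lock<std::mutex> lock(m);
        cv.wait(lock, [] { return haveData; });   // wait for something to send
        std::string data = pending;
        haveData = false;
        lock.unlock();

        portWrite(data);                          // may hang if the line is stuck

        lock.lock();
        done = true;
        cv.notify_all();                          // 'ack' back to the caller
    }
}

// Called from the GUI thread: returns false if the write did not complete
// in time, i.e. the COM port looks stuck and a 'Check cable' box can be shown.
bool writeWithTimeout(const std::string &data, std::chrono::milliseconds timeout)
{
    std::unique_lock<std::mutex> lock(m);
    pending  = data;
    haveData = true;
    done     = false;
    cv.notify_all();                              // wake the writer thread
    return cv.wait_for(lock, timeout, [] { return done; });
}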
I am trying to write two server/client programs under Linux that communicate through named pipes. The problem is that sometimes when I try to write from the server into a pipe that doesn't exist anymore (the client has stopped), I get a "Resource temporarily unavailable" error and the server stops completely.
I understand that this is caused by using the O_NONBLOCK flag when opening the FIFO channel, marking the point where the program would usually wait until it could write to the file again. But is there a way to stop this behavior and not halt the entire program when a problem occurs (shouldn't the write call just return -1 and the program continue normally)?
Another strange thing is that this error only occurs when running the programs outside the IDE (Eclipse). If I run both programs inside Eclipse, then on error the write function just returns -1 and the program continues normally.
If you wish write() to return -1 on error (and set errno to EPIPE) instead of stopping your server completely when the read end of your pipe is no longer connected, you must ignore the SIGPIPE signal with signal(SIGPIPE, SIG_IGN).
The difference in behaviour between Eclipse and the shell is strange; you could have a memory problem somewhere or a missing test. (Or does Eclipse do something special to handle signals?)
To quote the section 2 man page for write:
"[errno=]EPIPE An attempt is made to write to a pipe or a FIFO that is not open for reading by any process, or that has only one end open (or to a file descriptor created by socket(3SOCKET), using type SOCK_STREAM that is no longer connected to a peer endpoint). A SIGPIPE signal will also be sent to the thread. The process dies unless special provisions were taken to catch or ignore the signal." [Emphasis mine].
As Platypus said, you'll need to ignore the SIGPIPE signal:
signal(SIGPIPE, SIG_IGN). You could also catch the signal and handle the pipe disconnection in a different way in your server.
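For instance (a minimal sketch; the FIFO path and message are placeholders):

#include <cerrno>
#include <csignal>
#include <cstdio>
#include <fcntl.h>
#include <unistd.h>

int main()
{
    signal(SIGPIPE, SIG_IGN);                  // don't get killed when the reader goes away

    int fd = open("/tmp/myfifo", O_WRONLY | O_NONBLOCK);   // placeholder path
    if (fd == -1) {
        perror("open");
        return 1;
    }

    const char msg[] = "hello\n";
    if (write(fd, msg, sizeof msg - 1) == -1) {
        if (errno == EPIPE)
            fprintf(stderr, "client went away, dropping message\n");
        else
            perror("write");
        // the server keeps running instead of dying on SIGPIPE
    }

    close(fd);
    return 0;
}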
maybe you can just wrap it into a "try..catch" statement?