The scenario is this: one process is using epoll on several sockets, all of them non-blocking and edge-triggered. When an EPOLLIN event occurs on one socket, we start reading from its fd, but so much data keeps arriving that the return value of recv is always greater than 0 inside the read loop. The application is stuck there reading data and cannot move on.
Any idea how I should deal with this?
constexpr int max_events = 10;
constexpr int buf_len = 8192;
....
epoll_event events[max_events];
char buf[buf_len];
int n;
auto fd_num = epoll_wait(...);
for(auto i = 0; i < fd_num; i++) {
    if(events[i].events & EPOLLIN) {
        for(;;) {
            n = ::read(events[i].data.fd, buf, sizeof(buf));
            if (n == -1 && errno == EAGAIN)
                break;              // socket drained for now
            if (n <= 0)
            {
                on_disconnect_(events[i].data.fd);
                break;
            }
            else
            {
                on_data_(events[i].data.fd, buf, n);
            }
        }
    }
}
When using edge-triggered mode the data should be read in one recv call, otherwise you risk starving the other sockets. This issue has been written about in numerous blogs, e.g. Epoll is fundamentally broken.
Make sure that your user-space receive buffer is at least the same size as the kernel receive socket buffer. That way you read the entire kernel buffer in one recv call.
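One way to get that size is to ask the kernel directly with getsockopt(SO_RCVBUF); a minimal sketch (error handling kept short, fd is the connected socket):
int rcvbuf = 0;
socklen_t optlen = sizeof(rcvbuf);
// Ask the kernel how big the receive buffer for this socket is.
if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &rcvbuf, &optlen) == -1)
    rcvbuf = 8192; // fall back to some sane default
// Note: on Linux the value returned is double what was set with
// setsockopt(SO_RCVBUF), which only errs on the safe side here.
std::vector<char> user_buf(static_cast<size_t>(rcvbuf));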
Also, you can process the ready sockets in a round-robin fashion, so that the control flow does not get stuck in a recv loop for one socket. That works best when the user-space receive buffer has the same size as the kernel one. E.g.:
auto n = epoll_wait(...);
// Count the events that are not readable up front, so that the loop
// below ends once every readable socket has been drained.
int dry = 0;
for(auto i = 0; i < n; i++)
    if(!(events[i].events & EPOLLIN))
        ++dry;
for(; dry < n;) {
    for(auto i = 0; i < n; i++) {
        if(events[i].events & EPOLLIN) {
            // Do only one recv call for each ready socket
            // before moving on to the next ready socket.
            auto r = recv(...);
            if(-1 == r) {
                if(EAGAIN == errno) {
                    // This socket is drained, skip it from now on.
                    events[i].events &= ~EPOLLIN;
                    ++dry;
                }
                else {
                    // Handle error (clear EPOLLIN and ++dry here too,
                    // so the socket does not keep the loop spinning).
                }
            }
            else if(!r) {
                // Process client disconnect (same remark as above).
            }
            else {
                // Process data received so far.
            }
        }
    }
}
This version can be further improved to avoid scanning the entire events array on each iteration.
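For instance, one possible way to do that (a sketch, not part of the original answer; it reuses buf and the on_data_/on_disconnect_ handlers from the question): keep the indices of the still-readable events in a small array and swap-remove an index once its socket is drained, so each pass only touches sockets that are still ready.
int ready[max_events];
int ready_count = 0;
for (int i = 0; i < n; i++)
    if (events[i].events & EPOLLIN)
        ready[ready_count++] = i;

while (ready_count > 0) {
    for (int k = 0; k < ready_count; ) {
        auto &ev = events[ready[k]];
        // One recv per ready socket, then move on to the next one.
        auto r = recv(ev.data.fd, buf, sizeof(buf), 0);
        if (r > 0) {
            on_data_(ev.data.fd, buf, r);
            ++k;                              // still readable, keep it
        } else {
            if (r == 0 || errno != EAGAIN)
                on_disconnect_(ev.data.fd);   // disconnect or hard error
            ready[k] = ready[--ready_count];  // drained/done: swap-remove
        }
    }
}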
In your original post do {} while(n > 0); is incorrect and leads to an endless loop. I assume it is a typo.
I am implementing MPI non-blocking communication inside my program. The MPI_Isend man page says:
A nonblocking send call indicates that the system may start copying data out of the send buffer. The sender should not modify any part of the send buffer after a nonblocking send operation is called, until the send completes.
My code works like this:
// send messages
if(s > 0){
    MPI_Request s_requests[s];
    MPI_Status s_status[s];
    for(int i = 0; i < s; ++i){
        // some code to form the message to send
        std::vector<double> send_info;
        // non-blocking send
        MPI_Isend(&send_info[0], ..., &s_requests[i]);
    }
    MPI_Waitall(s, s_requests, s_status);
}

// recv info
if(n > 0){ // s and n will match
    for(int i = 0; i < n; ++i){
        MPI_Status status;
        // allocate the space to recv info
        std::vector<double> recv_info;
        MPI_Recv(&recv_info[0], ..., &status);
    }
}
My question is: am I modifying the send buffers, given that they are declared inside the inner braces (each send_info vector is destroyed when its loop iteration finishes)? Does that make this an unsafe communication pattern? My program works fine now, but I am still suspicious. Thank you for your reply.
There are two points I want to emphasize in this example.
The first is the problem I asked about: the send buffer gets modified (in fact destroyed) before MPI_Waitall, for the reason Gilles gave. The solution is to allocate a big buffer before the for loop and call MPI_Waitall after the loop finishes, or to put MPI_Wait inside the loop; the latter, however, is equivalent to using MPI_Send in terms of performance.
However, I found that if you simply switch to blocking send and receive, a communication scheme like this can cause a deadlock. It is similar to the classic deadlock:
if (rank == 0) {
    MPI_Send(..., 1, tag, MPI_COMM_WORLD);
    MPI_Recv(..., 1, tag, MPI_COMM_WORLD, &status);
} else if (rank == 1) {
    MPI_Send(..., 0, tag, MPI_COMM_WORLD);
    MPI_Recv(..., 0, tag, MPI_COMM_WORLD, &status);
}
The explanation can be found here.
My program can end up in a similar situation: all the processes call MPI_Send first, and that is a deadlock.
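For completeness, with blocking calls the usual fix is to break the symmetry, for example with MPI_Sendrecv, which posts the send and the matching receive together (just a sketch with dummy one-double messages; it is not what I ended up using):
double send_val = 1.0, recv_val = 0.0;
int peer = (rank == 0) ? 1 : 0;
// Combined send+receive: neither rank blocks waiting for the other
// one to post its receive, so the classic deadlock cannot happen.
MPI_Sendrecv(&send_val, 1, MPI_DOUBLE, peer, tag,
             &recv_val, 1, MPI_DOUBLE, peer, tag,
             MPI_COMM_WORLD, MPI_STATUS_IGNORE);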
So my solution is to use a large buffer that outlives the loop and stick to the non-blocking communication scheme.
#include <vector>
#include <unordered_map>

// send messages
if(s > 0){
    MPI_Request s_requests[s];
    MPI_Status s_status[s];
    // keep every send buffer alive until MPI_Waitall has completed
    std::unordered_map<int, std::vector<double>> send_info;
    for(int i = 0; i < s; ++i){
        // some code to form the message to send
        send_info[i] = std::vector<double>();
        // non-blocking send
        MPI_Isend(&send_info[i][0], ..., &s_requests[i]);
    }
    MPI_Waitall(s, s_requests, s_status);
}

// recv info
if(n > 0){ // s and n will match
    for(int i = 0; i < n; ++i){
        MPI_Status status;
        // allocate the space to recv info
        std::vector<double> recv_info;
        MPI_Recv(&recv_info[0], ..., &status);
    }
}
This is a section of the server code. When the client exits, the server program dies without producing a core dump after the following line:
n = send(s, buf+total, bytesleft, 0);
Here is the function:
static ssize_t conn_Send(int s, u8* buf, ssize_t len)
{
    ssize_t total = 0;        // how many bytes we've sent
    ssize_t bytesleft = len;  // how many we have left to send
    ssize_t n;

    while(total < len)
    {
        n = send(s, buf+total, bytesleft, 0);
        //abort();
        if (n < 0)
        {
            return n;
        }
        total += n;
        bytesleft -= n;
    }
    return total;
}
I have looked in the obvious places: /proc/sys/kernel/core_pattern is set correctly, and indeed the program does exit with a core file if the abort(); above is un-commented.
Any ideas? I'm at wit's end.
When the client exits, the server program dies
n = send(s, buf+total, bytesleft, 0);
This is typically the result of getting killed by SIGPIPE when the other end closes the connection. There are quite a few ways to get around this (a short sketch follows the list), including:
You can use setsockopt with SO_NOSIGPIPE (BSD/macOS; not available on Linux)
You can use MSG_NOSIGNAL as a send flag
You can ignore SIGPIPE
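For example, the last two options look roughly like this on Linux (a minimal sketch reusing the variables from conn_Send above):
#include <signal.h>
#include <errno.h>
#include <sys/socket.h>

// Ignore SIGPIPE once at program start-up; send() then fails with
// errno == EPIPE instead of killing the process.
signal(SIGPIPE, SIG_IGN);

// Or suppress the signal per call with MSG_NOSIGNAL (Linux-specific).
n = send(s, buf + total, bytesleft, MSG_NOSIGNAL);
if (n == -1 && errno == EPIPE)
{
    // The peer closed the connection; clean up instead of dying.
}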
I am writing a wrapper around generic file operations and do not know how to handle the case where write reports fewer bytes written than I asked for.
The man page for write says:
The number of bytes written may be less than count if, for example, there is insufficient space on the underlying physical medium, or the RLIMIT_FSIZE resource limit is encountered (see setrlimit(2)), or the call was interrupted by a signal handler after having written less than count bytes. (See also pipe(7).)
From my understanding of the above, it's a mixture of genuine errors (medium full) and invitations to come back later (interrupted call). If my file descriptors are all non-blocking, I should not get the interrupted case, and then the only remaining reason would be an error. Am I right?
Code example:
ssize_t size_written = write(fd, str, count);
if (size_written == -1) {
    if (errno == EAGAIN) {
        // poll on fd and come back later
    } else {
        // throw an error
    }
} else if (size_written < count) {
    // ***************
    // what should I do here ?
    // throw an error ?
    // ***************
}
You need to use the raw I/O functions in a loop:
ssize_t todo = count;

for (ssize_t n; todo > 0; )
{
    n = write(fd, str, todo);

    if (n == -1)
    {
        if (errno == EINTR)
            continue;   // interrupted before anything was written, retry

        // error
        break;
    }

    str += n;
    todo -= n;
}

if (todo != 0) { /* error */ }
The special condition concerning EINTR allows the write call to be interrupted by a signal without causing the entire operation to fail. Otherwise, we expect to be able to write all data eventually.
If you can't finish writing all data because your file descriptor is non-blocking and cannot accept any data at the moment, you have to store the remaining data and try again later when the file descriptor has signalled that it's ready for writing again.
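For a non-blocking descriptor that could look roughly like this (a sketch; pending_out is an assumed per-descriptor std::vector<char> holding the bytes that did not fit, and the surrounding function is assumed to return -1 on a real error):
ssize_t n = write(fd, str, count);
if (n == -1) {
    if (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR)
        n = 0;      // nothing written this time, but not a real error
    else
        return -1;  // genuine error, report it to the caller
}
if (static_cast<size_t>(n) < count) {
    // Short write: stash the unwritten tail and arm EPOLLOUT/POLLOUT
    // for fd, then retry from pending_out once it is writable again.
    pending_out.assign(str + n, str + count);
}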
I have a simple TCP/IP server written in C++ on Linux. I'm using asynchronous sockets and epoll. Is it possible to find out how many bytes are available for reading when I get the EPOLLIN event?
From man 7 tcp:
int value;
int error = ioctl(sock, FIONREAD, &value);
Or alternatively SIOCINQ, which is a synonym of FIONREAD.
Anyway, I'd recommend just using recv in non-blocking mode in a loop until it fails with EWOULDBLOCK.
UPDATE:
From your comments below I think that this is not the appropriate solution for your problem.
Imagine that your header is 8 bytes and you receive just 4; then poll/select will report EPOLLIN, you will check FIONREAD, see that the header is not yet complete, and wait for more bytes. But these bytes never arrive, so you keep getting EPOLLIN on every call to poll/select and you end up in a no-op busy loop. That is, poll/select are level-triggered. Not that an edge-triggered interface solves your problem either.
In the end you are far better off doing a bit of work: add a buffer per connection and queue the bytes until you have enough. It is not as difficult as it seems, and it works far better. For example, something like this:
struct ConnectionData
{
    int sck;
    std::vector<uint8_t> buffer;
    size_t offset, pending;
};

void OnPollIn(ConnectionData *d)
{
    ssize_t res = recv(d->sck, d->buffer.data() + d->offset, d->pending, 0);
    if (res <= 0)
    {
        handle_error();   // res == 0 means the peer disconnected
        return;
    }
    d->offset += res;
    d->pending -= res;

    if (d->pending == 0)
        DoSomethingUseful(d);
}
And whenever you want to receive a given number of bytes:
void PrepareToRecv(ConnectionData *d, size_t size)
{
    d->buffer.resize(size);
    d->offset = 0;
    d->pending = size;
}
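A typical use, following the 8-byte header example above (sketch; ParseHeader is a hypothetical helper that extracts the payload length from the completed header):
PrepareToRecv(d, 8);                  // first collect the fixed-size header
// ... later, inside DoSomethingUseful(d), once those 8 bytes are complete:
size_t payload_size = ParseHeader(d->buffer);  // hypothetical helper
PrepareToRecv(d, payload_size);       // now collect the payload it announces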
I'm working on a network project and I know the select() function (together with the FD_XXX macros) returns the total number of socket handles that are ready and contained in the fd_set structures, but how do we know which sockets those are (as SOCKET or int)? Is a for loop that checks each socket with FD_ISSET the only way to get the list of ready sockets, or is there another way?
Despite what others say about the return value of select(), I use it this way when dealing with a lot of sockets. It does not guarantee that you avoid processing the whole list (the only ready socket may happen to be the last one), but it saves some work when the ready sockets come early in the list.
int i;
int biggest = 0;
int nReady;
fd_set sfds;
struct timeval timeout;

FD_ZERO(&sfds);
for (i = 0; i < NumberOfSockets; i++)
{
    FD_SET(SocketList[i], &sfds);
    if (SocketList[i] > biggest) biggest = SocketList[i];
}

timeout.tv_sec = 30;
timeout.tv_usec = 0;

// biggest is only necessary when dealing with Berkeley sockets;
// Visual Studio C++ (and others) ignore this parameter.
if ((nReady = select(biggest + 1, &sfds, NULL, NULL, &timeout)) > 0)
{
    for (i = 0; i < NumberOfSockets && nReady > 0; i++)
    {
        if (FD_ISSET(SocketList[i], &sfds)) {
            // SocketList[i] got data to be read
            ... your code to process the socket when it's readable ...
            nReady--;
        }
    }
}