How can we get the list of ready sockets from select()? - c++

I'm working on a network project and I know that select() (together with the FD_XXX macros) returns the total number of socket handles that are ready and contained in the fd_set structures, but how do we find out which sockets those are (as SOCKET or int)? Is looping over the list and checking each socket with FD_ISSET() the only way, or is there another?

Despite what others say about the return value of select(), I use it this way when dealing with a lot of sockets. It does not guarantee that you won't have to scan the whole list (in the worst case the only ready socket happens to be the last one), but it saves some work when the ready sockets are near the front.
int i;
int nReady;
int biggest = 0;
fd_set sfds;
struct timeval timeout = {0, 0};

FD_ZERO(&sfds);
for (i = 0; i < NumberOfSockets; i++)
{
    FD_SET(SocketList[i], &sfds);
    if (SocketList[i] > biggest) biggest = SocketList[i];
}

timeout.tv_sec = 30;
timeout.tv_usec = 0;

// biggest+1 is only necessary when dealing with Berkeley sockets;
// Winsock (Visual Studio C++ and others) ignores this parameter.
if ((nReady = select(biggest + 1, &sfds, NULL, NULL, &timeout)) > 0)
{
    for (i = 0; i < NumberOfSockets && nReady > 0; i++)
    {
        if (FD_ISSET(SocketList[i], &sfds)) {
            // SocketList[i] has data to be read
            ... your code to process the socket when it's readable ...
            nReady--;
        }
    }
}


Why isn't the for loop entered in release mode, when in debug mode it works fine?

My problem is that I coded a server/client application on Windows. In debug mode everything works as it should, but in release mode the server doesn't receive or send messages.
I think this is because the inner for loop (the one inside the infinite for loop) is never entered. I also tried to implement it with a while loop, but that didn't work either. I suspect the problem is that I call a function in the loop's condition, because when I compare i against a plain integer the loop is entered. Also interesting: when I std::cout something right before the inner for loop, the loop is entered as well, despite the function call in the condition.
#include <iostream>
#include <thread>
#include "server.cpp"

// defined in server.cpp
void server::acceptConn() {
    u_long mode = 1;
    for (;;) {
        int len = sizeof(incAddr[connectedSocks]);
        if ((inc_sock[connectedSocks] = accept(serv, (struct sockaddr*)&incAddr[connectedSocks], &len)) != SOCKET_ERROR) {
            std::cout << "client connected : " << inet_ntoa(incAddr[connectedSocks].sin_addr) << std::endl;
            ioctlsocket(inc_sock[connectedSocks], FIONBIO, &mode);
            connectedSocks++;
        }
    }
}
int main() {
    server ser;
    ser.init_Server();
    std::thread t(&server::acceptConn, &ser);
    char buf[1024];
    for (;;) {
        for (int i = 0; ser.getCounter() > i; i++) {
            if (recv(ser.getInc_Sock()[i], buf, sizeof(buf), 0) == SOCKET_ERROR) {
            } else {
                for (int x = 0; x < ser.getCounter(); x++) {
                    if (x == i) {
                        // just so the message doesn't get sent back to the sender
                    } else {
                        if (send(ser.getInc_Sock()[x], buf, sizeof(buf), 0) == SOCKET_ERROR) {
                            std::cout << "send failed : " << WSAGetLastError() << std::endl;
                        }
                    }
                }
            }
        }
    }
}
int getCounter() { return connectedSocks; } // in server.h
The result should be that the server keeps a list of connected sockets and distributes each message to everyone else. In debug mode it works exactly like that. What could be the problem in release mode?
Your code lacks any form of synchronization between the two threads. The worker thread writes data, and the main thread reads data. But because there is no synchronization between them (mutex/atomics/etc), every read from the main thread is a data race with the writes from the worker thread.
And data races are undefined behavior. So the compiler is permitted to compile the main thread as if nobody ever writes data.
When you put a std::cout in your loops, this "works" because std::cout requires some form of synchronization to keep multiple threads from writing to stdout at the same time. So you're effectively piggybacking off of that internal synchronization. But you shouldn't rely on that; you should instead use proper synchronization primitives.
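To illustrate, here is a minimal sketch, not your exact server: the inc_sock/connectedSocks names and the fixed-size array are assumptions. It publishes each accepted socket through an atomic counter, so the main thread's reads are no longer racy:

#include <atomic>

int inc_sock[64];                    // slots written only by the accept thread
std::atomic<int> connectedSocks{0};  // shared connection counter

// Accept thread: store the new socket first, then publish the count with
// release ordering, so a reader that sees the new count also sees the slot.
void publish_socket(int s) {
    int n = connectedSocks.load(std::memory_order_relaxed);
    inc_sock[n] = s;
    connectedSocks.store(n + 1, std::memory_order_release);
}

// Main thread: acquire-load the counter before reading the slots.
int snapshot_count() {
    return connectedSocks.load(std::memory_order_acquire);
}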

Too much data on a socket using epoll edge trigger [closed]

The scenario is this: one process is using epoll on several sockets, all set non-blocking and edge-triggered. When an EPOLLIN event occurs on one socket, we start reading from its fd, but so much data keeps arriving that in the read loop the return value of recv stays greater than 0. So the application is stuck there reading and cannot move on.
Any idea how I should deal with this?
constexpr int max_events = 10;
constexpr int buf_len = 8192;
....
epoll_event events[max_events];
char buf[buf_len];
int n;
auto fd_num = epoll_wait(...);
for (auto i = 0; i < fd_num; i++) {
    if (events[i].events & EPOLLIN) {
        for (;;) {
            n = ::read(events[i].data.fd, buf, sizeof(buf));
            if (n == -1 && errno == EAGAIN)  // errno is only meaningful when read() fails
                break;
            if (n <= 0) {
                on_disconnect_(events[i].data.fd);
                break;
            }
            on_data_(events[i].data.fd, buf, n);
        }
    }
}
When using edge triggered mode the data must be read in one recv call, otherwise it risks starving other sockets. This issue has been written about in numerous blogs, e.g. Epoll is fundamentally broken.
Make sure that your user-space receive buffer is at least the same size as the kernel receive socket buffer. This way you read the entire kernel buffer in one recv call.
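For example, a sketch of sizing the user-space buffer from the kernel's SO_RCVBUF, assuming a POSIX socket descriptor fd (note that Linux reports a doubled value to account for its bookkeeping overhead):

#include <sys/socket.h>
#include <vector>

std::vector<char> make_recv_buffer(int fd) {
    int rcvbuf = 0;
    socklen_t optlen = sizeof(rcvbuf);
    getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &rcvbuf, &optlen);
    // The returned size is at least as large as one full kernel receive
    // buffer, so a single recv() into it can drain the socket.
    return std::vector<char>(rcvbuf > 0 ? rcvbuf : 8192);
}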
Also, you can process ready sockets in a round-robin fashion, so that the control flow does not get stuck in a recv loop for one socket. That works best when the user-space receive buffer is the same size as the kernel one. E.g.:
auto n = epoll_wait(...);
// Count how many sockets reported EPOLLIN; keep looping until each of
// them has been drained (EAGAIN), closed, or failed.
int pending = 0;
for (auto i = 0; i < n; i++)
    if (events[i].events & EPOLLIN)
        ++pending;
while (pending > 0) {
    for (auto i = 0; i < n; i++) {
        if (events[i].events & EPOLLIN) {
            // Do only one recv call for each ready socket
            // before moving to the next ready socket.
            auto r = recv(...);
            if (r == -1) {
                if (errno == EAGAIN) {
                    // Drained: stop visiting this socket.
                    events[i].events &= ~EPOLLIN;
                    --pending;
                } else {
                    // Handle error, then stop visiting this socket.
                    events[i].events &= ~EPOLLIN;
                    --pending;
                }
            } else if (r == 0) {
                // Process client disconnect, then stop visiting it.
                events[i].events &= ~EPOLLIN;
                --pending;
            } else {
                // Process the data received so far.
            }
        }
    }
}
This version can be further improved to avoid scanning the entire events array on each iteration.
In your original post, do {} while(n > 0); is incorrect and leads to an endless loop. I assume it is a typo.

How to accommodate timing variability in writing to a TCP socket?

As a test, I'm writing a series of byte arrays to a TCP socket from an Android application, and reading them in a C++ application.
Java
InetAddress address = InetAddress.getByName("192.168.0.2");
Socket socket = new Socket(address, 1300);
DataOutputStream out = new DataOutputStream(socket.getOutputStream());
...
if (count == 0) {
    out.write(first, 0, first.length);
} else if (count == 1) {
    out.write(second, 0, second.length);
}
C++
do {
    iResult = recv(ClientSocket, recvbuf, 3, 0);
    for (int i = 0; i < 3; i++) {
        std::cout << (int)(signed char)recvbuf[i] << std::endl;
    }
} while (iResult > 0);
As it stands, on the first receipt recvbuf[2] is -52, which I assume to be a junk value, as the output stream has not yet written the second byte array by the time I've received the first segment.
However, when I pause after the ListenSocket has accepted the connection:
ClientSocket = accept(ListenSocket, NULL, NULL);
std::cin.ignore();
...giving the sender time to do both writes to the stream, recvbuf[2] is 3, which is the first value of the second byte array.
If I ultimately want to send and receive a constant stream of discrete arrays, how can I determine, after I've received the last value of one array, whether the next value in the buffer is the first value of the next array or just a junk value?
I've considered that UDP is more suitable for sending a series of discrete data sets, but I need the reliability of TCP. I imagine TCP is used this way regularly, but it's not clear to me how to mitigate this issue.
EDIT:
In the actual application for which I'm writing this test, I do implement length prefixing. I don't think that's relevant, though; even if I know I'm at the end of a data set, I need to know whether the next value in the buffer is junk or the beginning of the next set.
for (int i = 0; i < 3; i++)
The problem is here. It should be:
for (int i = 0; i < iResult; i++)
You're printing out data that you may not have received. This is the explanation of the 'junk value'.
You can't assume that recv() fills the buffer.
You must also check iResult for both -1 and zero before this loop, and take the appropriate actions, which are different in each case.
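For illustration, a sketch of those checks, reusing the ClientSocket/recvbuf names from your snippet:

iResult = recv(ClientSocket, recvbuf, 3, 0);
if (iResult == -1) {
    // error: inspect WSAGetLastError() (Windows) or errno (POSIX)
} else if (iResult == 0) {
    // the peer closed the connection; stop reading
} else {
    // only the first iResult bytes of recvbuf are valid
    for (int i = 0; i < iResult; i++) {
        std::cout << (int)(signed char)recvbuf[i] << std::endl;
    }
}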
As you point out, TCP is stream-based, so there's no built-in way to say "here's a specific chunk of data". What you want to do is add your own "message framing". A simple way to do that is called "length prefixing": you first send the size of the data packet, and then the packet itself. Then the receiver will know when they've gotten all the data (see the sketch after the link below).
Sending side
send length of packet (as a known size -- say a 32-bit int)
send packet data
Receiving side
read length of packet
read that many bytes of data
process fully-received packet
Check out this article for more information: http://blog.stephencleary.com/2009/04/message-framing.html
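For illustration, here is a sketch of the receiving side under that scheme, assuming POSIX sockets and a 32-bit big-endian length prefix (the recv_all/recv_message helpers are made up for this example):

#include <arpa/inet.h>   // ntohl
#include <sys/socket.h>
#include <sys/types.h>
#include <cstdint>
#include <vector>

// Keep calling recv() until exactly len bytes have arrived;
// recv() may return fewer bytes than requested.
static bool recv_all(int fd, char* buf, size_t len) {
    while (len > 0) {
        ssize_t n = recv(fd, buf, len, 0);
        if (n <= 0) return false;   // error or peer closed
        buf += n;
        len -= static_cast<size_t>(n);
    }
    return true;
}

static bool recv_message(int fd, std::vector<char>& msg) {
    uint32_t net_len;
    if (!recv_all(fd, reinterpret_cast<char*>(&net_len), sizeof(net_len)))
        return false;
    msg.resize(ntohl(net_len));     // the prefix gives the exact payload size
    return recv_all(fd, msg.data(), msg.size());
}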

How to iterate through a fd_set

I'm wondering if there's an easy way to iterate through an fd_set. The reason I want to do this is to avoid looping through all connected sockets, since select() alters these fd_sets to include only the ones I'm interested in. I also know that accessing the implementation of a type that is not meant to be accessed directly is generally a bad idea, since it may vary across systems. However, I need some way to do this, and I'm running out of ideas. So, my question is:
How do I iterate through an fd_set? If this is really bad practice, are there any other ways to solve my "problem" except looping through all connected sockets?
Thanks
You have to fill in an fd_set struct before calling select(); you cannot pass in your original std::set of sockets directly. select() then modifies the fd_set accordingly, removing any sockets that are not "set", and returns how many sockets remain. You then only have to loop through the resulting fd_set, not your std::set. On Windows (whose fd_set definition this technique relies on) there is no need to call FD_ISSET(), because the resulting fd_set contains only the "set" sockets that are ready, eg:
fd_set read_fds;
FD_ZERO(&read_fds);
int max_fd = 0;
read_fds.fd_count = connected_sockets.size();
for (int i = 0; i < read_fds.fd_count; ++i)
{
    read_fds.fd_array[i] = connected_sockets[i];
    if (read_fds.fd_array[i] > max_fd)
        max_fd = read_fds.fd_array[i];
}
if (select(max_fd + 1, &read_fds, NULL, NULL, NULL) > 0)
{
    for (int i = 0; i < read_fds.fd_count; ++i)
        do_socket_operation(read_fds.fd_array[i]);
}
Where FD_ISSET() comes into play more often is when using error checking with select(), eg:
fd_set read_fds;
FD_ZERO(&read_fds);
fd_set error_fds;
FD_ZERO(&error_fds);
int max_fd = 0;
read_fds.fd_count = connected_sockets.size();
for (int i = 0; i < read_fds.fd_count; ++i)
{
    read_fds.fd_array[i] = connected_sockets[i];
    if (read_fds.fd_array[i] > max_fd)
        max_fd = read_fds.fd_array[i];
}
error_fds.fd_count = read_fds.fd_count;
for (int i = 0; i < read_fds.fd_count; ++i)
{
    error_fds.fd_array[i] = read_fds.fd_array[i];
}
if (select(max_fd + 1, &read_fds, NULL, &error_fds, NULL) > 0)
{
    for (int i = 0; i < read_fds.fd_count; ++i)
    {
        if (!FD_ISSET(read_fds.fd_array[i], &error_fds))
            do_socket_operation(read_fds.fd_array[i]);
    }
    for (int i = 0; i < error_fds.fd_count; ++i)
    {
        do_socket_error(error_fds.fd_array[i]);
    }
}
select() sets the bit corresponding to each ready file descriptor in the set, so if you are interested in only a few descriptors (and can ignore the others) you need not iterate through all of them; just test the ones you care about:
if (select(fdmax + 1, &read_fds, NULL, NULL, NULL) == -1) {
    perror("select");
    exit(4);
}
if (FD_ISSET(fd0, &read_fds))
{
    // do things
}
if (FD_ISSET(fd1, &read_fds))
{
    // do more things
}
EDIT
Here is the fd_set struct (as defined by Winsock):
typedef struct fd_set {
    u_int  fd_count;               /* how many are SET? */
    SOCKET fd_array[FD_SETSIZE];   /* an array of SOCKETs */
} fd_set;
Here fd_count is the number of sockets set (so you can add an optimization using this) and fd_array holds the socket handles themselves. On Berkeley-style systems, by contrast, fd_set is a bit vector of FD_SETSIZE bits (commonly 1024).
So on those systems your question essentially becomes: what is the most efficient way to find the positions of the 1s in a bit vector of FD_SETSIZE bits?
I want to make one thing clear here:
"looping through all the connected sockets" doesn't mean that you are actually reading from, or doing anything to, a connection. FD_ISSET() only checks whether the bit positioned at the connection's file descriptor number is set in the fd_set. If efficiency is your aim, isn't this already about as efficient as it gets, short of heuristics?
Please tell us what's wrong with this method, and what you are trying to achieve with an alternative.
It's fairly straightforward (note that max_fd itself must be included):
for (int fd = 0; fd <= max_fd; fd++)
    if (FD_ISSET(fd, &my_fd_set))
        do_socket_operation(fd);
This looping is a limitation of the select() interface. The underlying implementations of fd_set are usually a bit set, which obviously means that looking for a socket requires scanning over the bits.
It is for precisely this reason that several alternative interfaces have been created - unfortunately, they are all OS-specific. For example, Linux provides epoll, which returns a list of only the file descriptors that are active. FreeBSD and Mac OS X both provide kqueue, which accomplishes the same result.
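For example, a minimal sketch of the epoll pattern (Linux-only): epoll_wait() fills the events array with only the ready descriptors, so there is no scan over every connected socket. The poll_ready name and the epfd parameter (a descriptor from epoll_create1()) are assumptions for this example:

#include <sys/epoll.h>

void poll_ready(int epfd) {
    epoll_event events[64];
    int n = epoll_wait(epfd, events, 64, -1);   // blocks until something is ready
    for (int i = 0; i < n; i++) {
        int fd = events[i].data.fd;             // only ready fds appear here
        // ... read from fd ...
    }
}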
See section 7.2 of Beej's Guide to Network Programming, "7.2. select()—Synchronous I/O Multiplexing", which uses FD_ISSET.
In short, you must iterate through the fd_set in order to determine whether each file descriptor is ready for reading/writing...
I don't think what you are trying to do is a good idea.
Firstly, it's system-dependent, but I believe you already know that.
Secondly, at the internal level these sets are stored as an array of integers, with fds stored as set bits. According to the select man pages, FD_SETSIZE is 1024.
Even if you wanted to extract only the fds you are interested in, you would have to loop over that entire range, along with the mess of bit manipulation.
So unless you are waiting on more than FD_SETSIZE fds in one select, which I don't think is possible, it's not a good idea. Oh wait, in any case it's not a good idea.
I don't think you can do much with the select() call efficiently. The information at "The C10K problem" is still valid.
You will need a platform-specific solution:
Linux => epoll
FreeBSD => kqueue
Or you could use an event library such as libev to hide the platform details for you.
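For example, a sketch of a libev read watcher (the read_cb/run_loop names are made up for this example; libev picks epoll, kqueue, etc. under the hood):

#include <ev.h>

static void read_cb(struct ev_loop* loop, ev_io* w, int revents) {
    // w->fd is ready for reading here
}

void run_loop(int fd) {
    struct ev_loop* loop = EV_DEFAULT;
    ev_io watcher;
    ev_io_init(&watcher, read_cb, fd, EV_READ);  // watch fd for readability
    ev_io_start(loop, &watcher);
    ev_run(loop, 0);                             // dispatch events forever
}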
ffs() may be used on POSIX or 4.3BSD systems for iterating over the set bits, though it expects an int (the long and long long versions, ffsl() and ffsll(), are glibc extensions). Of course, you have to check whether your ffs() is optimized as well as, e.g., strlen() and strchr().
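A heavily hedged sketch of that idea: it assumes glibc's internal fd_set layout (an array of long words named __fds_bits) and uses ffsl(), a GNU extension. This is exactly the kind of implementation detail that varies across systems:

#define _GNU_SOURCE
#include <string.h>      // ffsl (GNU extension)
#include <sys/select.h>

void for_each_ready(fd_set* set, int max_fd, void (*fn)(int)) {
    const int bits_per_word = 8 * (int)sizeof(long);
    const int words = max_fd / bits_per_word + 1;
    for (int w = 0; w < words; w++) {
        long word = set->__fds_bits[w];   // glibc-internal member
        while (word != 0) {
            int b = ffsl(word) - 1;       // index of the lowest set bit
            fn(w * bits_per_word + b);    // recover the descriptor number
            word &= word - 1;             // clear that bit and continue
        }
    }
}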

Speeding up non-blocking Unix Sockets (C++)

I've implemented a simple socket wrapper class. It includes a function for switching the socket into non-blocking mode:
void Socket::set_non_blocking(const bool b) {
    mNonBlocking = b;  // class member for reference elsewhere
    int opts = fcntl(m_sock, F_GETFL);
    if (opts < 0) return;
    if (b)
        opts |= O_NONBLOCK;
    else
        opts &= ~O_NONBLOCK;
    fcntl(m_sock, F_SETFL, opts);
}
The class also contains a simple receive function:
int Socket::recv(std::string& s) const {
    char buffer[MAXRECV + 1];
    s = "";
    memset(buffer, 0, MAXRECV + 1);
    int status = ::recv(m_sock, buffer, MAXRECV, 0);
    if (status == -1) {
        if (!mNonBlocking)
            std::cout << "Socket, error receiving data\n";
        return 0;
    } else if (status == 0) {
        return 0;
    } else {
        s = buffer;
        return status;
    }
}
In practice, there seems to be a ~15 ms delay when Socket::recv() is called. Is this delay avoidable? I've seen some non-blocking examples that use select(), but I don't understand how that might help.
It depends on how you're using the sockets. If you have multiple sockets and you loop over all of them checking for data, that may account for the delay.
With non-blocking recv you are depending on data already being there. If your application needs more than one socket, you have to constantly poll each socket in turn to find out whether any of them has data available.
This is bad for system resources, because it means your application is constantly running even when there is nothing to do.
You can avoid that with select. You basically set up your sockets, add them to a group, and select on the group. When anything happens on any of the selected sockets, select returns, specifying what happened and on which socket.
For some code showing how to use select, look at Beej's Guide to Network Programming.
select() will let you specify a timeout and can test whether the socket is ready to be read from, so you can use something much smaller than 15 ms. Incidentally, be careful with the code you have: if the data on the wire can contain embedded NULs, s won't contain all the read data. You should use something like s.assign(buffer, status);.
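For example, a sketch of waiting for readability before calling recv(), assuming m_sock is the wrapper's descriptor (the wait_readable name is made up for this example):

#include <sys/select.h>
#include <sys/time.h>

bool wait_readable(int sock, int timeout_usec) {
    fd_set readfds;
    FD_ZERO(&readfds);
    FD_SET(sock, &readfds);
    timeval tv = {0, timeout_usec};  // seconds, microseconds
    // select() returns > 0 as soon as the socket is readable, 0 on timeout
    return select(sock + 1, &readfds, NULL, NULL, &tv) > 0;
}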
In addition to stefanB's point: I see that you are zeroing out your buffer every time. Why bother? recv returns how many bytes were actually read. Just zero-terminate after that (buffer[status] = '\0').
How big is your MAXRECV? It might just be that you incur a page fault from the stack growth. As others already mentioned, zeroing out the receive buffer is completely unnecessary. You also take a memory-allocation and copy hit when you create a std::string out of the received character data.