Boost UDP socket issue on unix - bind: address already in use

First of all, I know there are several other threads on the same theme, but I was unable to find anything in them that helped, so I'll try to be very specific about my situation.
I have set up a simple UDP client / UDP server pair that is responsible for sending data between several parallel simulations. That is, every instance of the simulator runs in a separate thread and sends data on a UDP socket. The server runs in the master thread and routes the messages between the simulations.
The parts of the server code that matter for this problem look like this:
UDPServer::UDPServer(boost::asio::io_service &m_io_service) :
    m_socket(m_io_service, udp::endpoint(udp::v4(), PORT_NUMBER)),
    m_endpoint(boost::asio::ip::address::from_string("127.0.0.1"), PORT_NUMBER)
{
    this->start_receive();
}

void UDPServer::start_receive() {
    // Set SO_REUSEADDR to true
    boost::asio::socket_base::reuse_address option(true);
    this->m_socket.set_option(option);
    // Specify what happens when a message is received (it should call the handle_receive function)
    this->m_socket.async_receive_from(boost::asio::buffer(this->recv_buffer),
        this->m_endpoint,
        boost::bind(&UDPServer::handle_receive, this,
            boost::asio::placeholders::error,
            boost::asio::placeholders::bytes_transferred));
}
This works fine on my Windows workstation.
However, I want to be able to run this on a Linux cluster, so I compiled it and tried to run it on a cluster node. The code compiled without a hitch, but when I try to run it I get the error
bind: address already in use
I use a port number above 1024 and have verified that it is not in use by another program. As seen above, I also set the reuse_address option, so I really don't know what else could be wrong.

To portably use SO_REUSEADDR you need to set the option before binding the socket to the wildcard address:
UDPServer::UDPServer(boost::asio::io_service &m_io_service) :
    m_socket(m_io_service, udp::v4()),
    m_endpoint()
{
    boost::asio::socket_base::reuse_address option(true);
    this->m_socket.set_option(option);
    this->m_socket.bind(udp::endpoint(udp::v4(), PORT_NUMBER));
    this->start_receive();
}
In your original code, the constructor that takes an endpoint constructs, opens and binds the socket in a single line - it's concise but not very flexible. Here we're constructing and opening the socket in the constructor call, and then binding it later after we set the option.
As an aside, there's not much point initialising m_endpoint if you're just going to use it as the out argument of async_receive_from anyway.
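For comparison, here is the same open, set option, then bind ordering written against plain POSIX sockets. This is only a minimal sketch: the helper name open_reusable_udp is made up for illustration and is not part of Boost.Asio or of the original code.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cstdint>

// Hypothetical helper: opens a UDP socket, sets SO_REUSEADDR, and only then
// binds it to the wildcard address. Returns the descriptor, or -1 on failure
// (errno then holds the cause).
int open_reusable_udp(std::uint16_t port) {
    int fd = ::socket(AF_INET, SOCK_DGRAM, 0);  // open only, no bind yet
    if (fd < 0) return -1;

    int on = 1;
    if (::setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on)) < 0) {
        ::close(fd);
        return -1;
    }

    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);
    if (::bind(fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) < 0) {
        ::close(fd);
        return -1;
    }
    return fd;
}
```

Setting the option after bind() has already run is too late to influence that bind, which is why the ordering matters.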

Try running the following command on Linux to see if the port is already being used by another program.
netstat -antup | grep 1024
If you are getting "address already in use", then the port is most likely being used by some other program. If the above command yields some result (substitute your own port number for 1024), kill the process whose ID the command reports. If this does not work, try changing the port number to some other arbitrary port and check whether the problem persists.

Related

Boost Asio SSL not able to receive data for 2nd time onwards (1st time OK)

I'm working with Boost Asio and Boost Beast on a simple RESTful server. With a plain HTTP/TCP socket it works perfectly; I put it under a load test with JMeter and everything works fine.
I tried to add an SSL socket. I set up the 'ssl::context' and also call 'async_handshake()', which are the additional steps for SSL compared to a normal socket. It works the first time only: the client can connect to me (the server) and I am able to receive the data via 'boost::beast::http::async_read()'.
Because this is RESTful, the connection is dropped after the request and response. I call 'SSL_Socket.shutdown()' followed by 'SSL_Socket.lowest_layer().close()' to close the SSL socket.
On the next incoming request, the client is able to connect to me (the server). I call 'SSL_Socket.async_handshake()' followed by 'boost::beast::http::async_read()', but this time I am not able to receive any data, even though the connection is successfully established.
Does anyone have a clue what I missed?
Thank you very much!

If you want to reuse the stream instance, you need to manipulate SSL_Socket.native_handle() with OpenSSL library functions. After the SSL shutdown, call SSL_clear() before starting a new SSL handshake.
Please read the SSL_clear() documentation for details, and pay attention to the warnings:
SSL_clear() resets the SSL object to allow for another connection. The reset operation however keeps several settings of the last sessions (some of these settings were made automatically during the last handshake)
.........
WARNINGS
SSL_clear() resets the SSL object to allow for another connection. The reset operation however keeps several settings of the last sessions (some of these settings were made automatically during the last handshake). It only makes sense for a new connection with the exact same peer that shares these settings, and may fail if that peer changes its settings between connections. Use the sequence SSL_get_session(3); SSL_new(3); SSL_set_session(3); SSL_free(3) instead to avoid such failures (or simply SSL_free(3); SSL_new(3) if session reuse is not desired).
Regarding the SSL shutdown issue, this is how the Boost.Asio SSL shutdown works:
In Boost.Asio, the shutdown() operation is considered complete upon error or if the party has sent and received a close_notify message.
If you look at the Boost.Asio (1.68) source code in boost\asio\ssl\detail\impl\engine.ipp, you can see how Boost.Asio performs the SSL shutdown, and that stream_truncated is reported when there is still data to be read or when the expected SSL shutdown from the peer has not been received.
int engine::do_shutdown(void*, std::size_t)
{
  int result = ::SSL_shutdown(ssl_);
  if (result == 0)
    result = ::SSL_shutdown(ssl_);
  return result;
}

const boost::system::error_code& engine::map_error_code(
    boost::system::error_code& ec) const
......
  // If there's data yet to be read, it's an error.
  if (BIO_wpending(ext_bio_))
  {
    ec = boost::asio::ssl::error::stream_truncated;
    return ec;
  }
......
  // Otherwise, the peer should have negotiated a proper shutdown.
  if ((::SSL_get_shutdown(ssl_) & SSL_RECEIVED_SHUTDOWN) == 0)
  {
    ec = boost::asio::ssl::error::stream_truncated;
  }
}
You can also see that the Boost.Asio SSL shutdown routine may call OpenSSL's SSL_shutdown() twice if the first call returns 0. The OpenSSL documentation allows this, but advises calling SSL_read() to complete a bidirectional shutdown if the first SSL_shutdown() returns 0.
See the SSL_shutdown() documentation for details.

I had a similar issue: from the second time onward, my asynchronous accept always failed with "session id uninitialized".
I solved this problem by calling SSL_CTX_set_session_id_context on the context, or by
setting the context cache mode to SSL_SESS_CACHE_OFF and SSL_OP_NO_TICKET in the context options.
These are my two cents on someone else's problem.

I managed to resolve the problem by wrapping the 'ssl::stream' socket in a 'boost::optional' and calling 'SSL_Socket.emplace(io_context, oSSLContext)' each time the socket is shut down and closed.
Big credit to sehe at 'Can't implement boost::asio::ssl::stream<boost::asio::ip::tcp::socket> reconnect to server'. His statement "the purest solution would be to not reuse the stream/socket objects" rocks! It saved me time.
Thanks.
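To illustrate the "do not reuse the stream object" idea without pulling in Boost and OpenSSL, here is a minimal sketch that uses std::optional (C++17) in place of boost::optional and a made-up Stream struct as a stand-in for ssl::stream<tcp::socket>. The names Stream, Connection and reset are hypothetical; only the emplace pattern itself is the point.

```cpp
#include <optional>
#include <string>
#include <utility>

// Stand-in for ssl::stream<tcp::socket> (hypothetical): it carries
// per-connection state that is awkward to reset after a session ends.
struct Stream {
    explicit Stream(std::string ctx) : context(std::move(ctx)) {}
    std::string context;          // plays the role of the ssl::context
    bool handshake_done = false;  // state left behind by the last session
    bool shut_down = false;
};

class Connection {
public:
    explicit Connection(std::string ctx) : ctx_(std::move(ctx)) {
        stream_.emplace(ctx_);
    }

    Stream& stream() { return *stream_; }

    // Call after shutdown/close: destroys the old stream and constructs a
    // brand-new one in place, so no state from the last session survives.
    void reset() { stream_.emplace(ctx_); }

private:
    std::string ctx_;
    std::optional<Stream> stream_;
};
```

With the real types, reset() would correspond to the SSL_Socket.emplace(io_context, oSSLContext) call described above.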

SSH local port forwarding using libssh

Problem
I am trying to do local port forwarding using libssh with the libssh C++ wrapper. My intention is to forward port localhost:3306 on a server to localhost:3307 on my machine via SSH, so that I can connect with MySQL to localhost:3307.
void ssh_session::forward() {
    ssh::Channel channel(this->session);
    // remotehost, remoteport, localhost, localport
    channel.openForward("localhost", 3306, "localhost", 3307);
    std::cout << "Channel is " << (channel.isOpen() ? "open!" : "closed!") << std::endl;
}
with session in the constructor of ssh::Channel being of type ssh::Session.
The code above prints Channel is open!. If I try to connect to localhost:3307 using the MySQL Connector/C++ I get
ERROR 2003 (HY000): Can't connect to MySQL server on '127.0.0.1' (61)
Observations
If I use the shell command $ ssh -L 3307:localhost:3306 me@myserver.com everything works fine and I can connect.
If I use the ssh::Session session (the one used in the constructor) or an ssh::Channel channel to execute remote shell commands, everything works, so the session is fine!
The documentation of libssh (which is total crap for the C++ wrapper libsshpp.hpp, since a lot of public member functions are not documented and you have to look into the source code) shows that ssh::Channel::openForward() is a wrapper for the C function ssh_channel_open_forward()
The documentation of ssh_channel_open_forward() states
Warning
This function does not bind the local port and does not automatically forward the content of a socket to the channel. You still have to use channel_read and channel_write for this.
I think that could be causing the problem. I have no problem reading from and writing to the ssh::Channel, but that's not how the MySQL Connector/C++ works.
Question
How can I achieve the same behaviour produced by the common shell command
$ ssh -L 3307:localhost:3306 me@myserver.com
using libssh?

Warning
This function does not bind the local port and does not automatically forward the content of a socket to the channel. You still have to use channel_read and channel_write for this.
This is telling you that you need to write your own local socket code. Unfortunately, it doesn't do it for you.
The simplest implementation would be to bind a local socket and use ssh_select to listen for events (e.g. a new connection to accept, socket or channel events). You can keep your socket fds and ssh_channels in a vector for easy management.
When you get any event, just loop over all the operations in a non-blocking way, i.e.
- try to accept a new connection, and append the fd and a new ssh_channel (created as in your question) to your vectors.
- try to read all the socket fds, and forward anything to the corresponding ssh channel using ssh_channel_write (make sure the reads cannot block, e.g. with O_NONBLOCK or MSG_DONTWAIT; a SO_RCVTIMEO of 0 means "no timeout", not "return immediately").
- try to read all the channels, using ssh_channel_read_nonblocking, and forward to the socket fd using write.
You also need to handle errors everywhere, and close the corresponding fd and ssh_channel.
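As a small piece of that loop, here is a sketch of the "read a socket fd without blocking" step using MSG_DONTWAIT. The helper name drain_socket is made up; in a real forwarder the collected bytes would then be handed to ssh_channel_write for the matching channel, which is left out here so the example does not depend on libssh.

```cpp
#include <sys/socket.h>
#include <sys/types.h>
#include <cerrno>
#include <cstddef>
#include <string>

// Hypothetical helper: appends whatever is currently readable on fd to out
// without ever blocking. Returns false once the peer has closed or a real
// error occurred, so the caller can close the fd and its ssh_channel.
bool drain_socket(int fd, std::string& out) {
    char buf[4096];
    for (;;) {
        ssize_t n = ::recv(fd, buf, sizeof(buf), MSG_DONTWAIT);
        if (n > 0) {
            out.append(buf, static_cast<std::size_t>(n));
            continue;
        }
        if (n == 0) return false;  // orderly shutdown by the peer
        if (errno == EAGAIN || errno == EWOULDBLOCK) return true;  // nothing more for now
        if (errno == EINTR) continue;  // interrupted, retry
        return false;  // genuine error
    }
}
```

A return value of true with an empty buffer simply means there was nothing to forward on this pass.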
Overall it's probably going to be too much code for a StackOverflow answer, but I may come back and add it in if I get time.
The tempting alternative to all that would be to just run ssh -L ... as a subprocess using fork & exec, avoiding all that boilerplate socket code, and benefitting from an efficient, bug-free implementation.
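A minimal sketch of that subprocess route with fork and execvp follows. The helper names spawn and wait_for are made up for illustration, and the ssh arguments in the usage note are just the ones from the question plus -N (no remote command).

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <string>
#include <vector>

// Hypothetical helper: starts args[0] (looked up via PATH) with the given
// arguments and returns the child's pid, or -1 if nothing could be started.
pid_t spawn(const std::vector<std::string>& args) {
    if (args.empty()) return -1;

    std::vector<char*> argv;
    for (const std::string& a : args) argv.push_back(const_cast<char*>(a.c_str()));
    argv.push_back(nullptr);  // execvp expects a null-terminated array

    pid_t pid = ::fork();
    if (pid == 0) {
        ::execvp(argv[0], argv.data());
        ::_exit(127);  // only reached if exec itself failed
    }
    return pid;  // parent: child's pid, or -1 if fork failed
}

// Hypothetical helper: waits for the child and returns its exit code,
// or -1 if it did not exit normally.
int wait_for(pid_t pid) {
    int status = 0;
    if (::waitpid(pid, &status, 0) < 0) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For the tunnel in the question this would be called as spawn({"ssh", "-N", "-L", "3307:localhost:3306", "me@myserver.com"}), keeping the returned pid so the tunnel can be terminated with kill() when it is no longer needed.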

Libnodave - daveStart() Error using TCP Connection

I have established connection to a Siemens S7-300 PLC (simulated via PlcSIM) using the libnodave library. There are no issues connecting and writing data to the PLC. However, I am unable to change the status of the PLC from Start/Stop. I am attempting to use the following libnodave methods for such actions:
int daveStatus = daveStart(dc);
int daveStatus = daveStop(dc);
Both function calls return the same Error: 33794
nodave.c describes the error as follows:
case 0x8402: return "CPU already in RUN or already in STOP ?";
The use of the daveStart() and daveStop() functions can be viewed in the example testS7online.c:
if(doStop) {
    daveStop(dc);
}
if(doRun) {
    daveStart(dc);
}
In the examples the start/stop functions are only called when MPI connections to the PLC are made. Does anyone know if the start/stop functions are supported for use with TCP connections? If so, any suggestions as to what may be causing my error?

I have just tried dc.start() and dc.stop() using libnodave 8.4 and the NetToPlcSim tool, and it worked perfectly. Possibly you are not using NetToPlcSim, which provides the TCP/IP connection to PLCSim (on 127.0.0.1, port 102), in which case dc can't even connect. If your lines still don't work, then you must be doing something wrong elsewhere.

close on socket not releasing file descriptor

When conducting a stress test on some server code I wrote, I noticed that even though I am calling close() on the descriptor handle (and verifying the result for errors) that the descriptor is not released which eventually causes accept() to return an error "Too many open files".
Now I understand that this is because of the ulimit, what I don't understand is why I am hitting it if I call close() after each synchronous accept/read/send cycle?
I am validating that the descriptors are in fact there by running a watch with lsof:
ctsvr 9733 mike 1017u sock 0,7 0t0 3323579 can't identify protocol
ctsvr 9733 mike 1018u sock 0,7 0t0 3323581 can't identify protocol
...
And sure enough, there are about 1000 or so of them. Furthermore, checking with netstat I can see that there are no hanging TCP states (no WAIT or STOPPED or anything).
If I simply do a single connect/send/recv from the client, I do notice that the socket does stay listed in lsof; so this is not even a load issue.
The server is running on an Ubuntu Linux 64-bit machine.
Any thoughts?

So using strace (thanks Gearoid), which I have no idea how I ever lived without, I noted I was in fact closing the descriptors.
However. And for the sake of posterity I lay bare my foolish mistake:
Socket::Socket() : impl(new Impl) {
    impl->fd = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
    ....
}

Socket::ptr_t Socket::accept() {
    auto r = ::accept(impl->fd, NULL, NULL);
    ...
    ptr_t s(new Socket);
    s->impl->fd = r;
    return s;
}
As you can see, my constructor allocated a socket immediately, and then I replaced the descriptor with the one returned by accept - creating a leak. I had refactored the accept code from a standalone Acceptor class into the Socket class without changing this.
Using strace I could easily see socket() being run each time, which led to my light bulb moment.
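For anyone hitting the same thing, one way to rule out this kind of leak is to give the class a second constructor that adopts an existing descriptor, so accept() never triggers a fresh socket() call. This is only a simplified sketch without the Impl indirection, not the code I actually ended up with:

```cpp
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>
#include <memory>

class Socket {
public:
    using ptr_t = std::shared_ptr<Socket>;

    // Creates a new TCP socket (for listening or connecting sockets).
    Socket() : fd_(::socket(AF_INET, SOCK_STREAM, IPPROTO_TCP)) {}

    // Adopts an existing descriptor, e.g. the one returned by ::accept().
    explicit Socket(int fd) : fd_(fd) {}

    // Non-copyable, so exactly one object owns (and closes) each descriptor.
    Socket(const Socket&) = delete;
    Socket& operator=(const Socket&) = delete;

    ~Socket() {
        if (fd_ >= 0) ::close(fd_);
    }

    ptr_t accept() {
        int r = ::accept(fd_, nullptr, nullptr);
        if (r < 0) return nullptr;
        return std::make_shared<Socket>(r);  // no extra socket() call here
    }

    int fd() const { return fd_; }

private:
    int fd_;
};
```

Because the accepted descriptor goes straight into the adopting constructor, there is never a second descriptor to overwrite and forget.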
Thanks all for the help!

Have you ever called perror() after close()?
I think the message it prints will give you some help.

You are most probably hanging on a recv() or send() call. Consider setting a timeout using setsockopt.
I noticed a similar output on lsof when the socket was closed on the other end but my thread was keeping the socket open hanging on the recv() command waiting for data.
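A receive timeout can be set roughly like this. It is a minimal POSIX sketch and the helper name set_recv_timeout is made up:

```cpp
#include <sys/socket.h>
#include <sys/time.h>

// Hypothetical helper: makes blocking recv() calls on fd give up after the
// given time instead of waiting forever. Note that a timeout of zero means
// "never time out". Returns true on success.
bool set_recv_timeout(int fd, long seconds, long microseconds) {
    timeval tv{};
    tv.tv_sec = seconds;
    tv.tv_usec = microseconds;
    return ::setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv)) == 0;
}
```

Once the timeout expires, recv() returns -1 with errno set to EAGAIN or EWOULDBLOCK, which gives the thread a chance to notice that the connection is dead and close the descriptor.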

Socket in use error when reusing sockets

I am writing an XMLRPC client in C++ that is intended to talk to a Python XMLRPC server.
Unfortunately, at this time, the Python XMLRPC server is only capable of fielding one request per connection before it shuts the connection down. I discovered this thanks to mhawke's response to my previous query about a related subject.
Because of this, I have to create a new socket connection to my python server every time I want to make an XMLRPC request. This means the creation and deletion of a lot of sockets. Everything works fine, until I approach ~4000 requests. At this point I get socket error 10048, Socket in use.
I've tried sleeping the thread to let winsock fix its file descriptors, a trick that worked when a python client of mine had an identical issue, to no avail.
I've tried the following
int err = setsockopt(s_,SOL_SOCKET,SO_REUSEADDR,(char*)TRUE,sizeof(BOOL));
with no success.
I'm using Winsock 2.0, so WSADATA::iMaxSockets shouldn't come into play, and either way, I checked and it's set to 0 (I assume that means infinity).
4000 requests doesn't seem like an outlandish number of requests to make during the run of an application. Is there some way to use SO_KEEPALIVE on the client side while the server continually closes and reopens?
Am I totally missing something?

The problem is being caused by sockets hanging around in the TIME_WAIT state which is entered once you close the client's socket. By default the socket will remain in this state for 4 minutes before it is available for reuse. Your client (possibly helped by other processes) is consuming them all within a 4 minute period. See this answer for a good explanation and a possible non-code solution.
Windows dynamically allocates port numbers in the range 1024-5000 (3977 ports) when you do not explicitly bind the socket address. This Python code demonstrates the problem:
import socket

sockets = []
while True:
    s = socket.socket()
    try:
        s.connect(('some_host', 80))
    except socket.error:
        # connect() starts failing once the dynamic ports are used up
        s.close()
        break
    sockets.append(s.getsockname())
    s.close()

print(len(sockets))
sockets.sort()
print("Lowest port: ", sockets[0][1], " Highest port: ", sockets[-1][1])
# on Windows you should see something like this...
# 3960
# Lowest port:  1025  Highest port:  5000
If you try to run this again immediately, it should fail very quickly since all dynamic ports are in the TIME_WAIT state.
There are a few ways around this:
1. Manage your own port assignments and use bind() to explicitly bind your client socket to a specific port that you increment each time you create a socket. You'll still have to handle the case where a port is already in use, but you will not be limited to dynamic ports. e.g.
port = 5000
while True:
    s = socket.socket()
    s.bind(('your_host', port))
    s.connect(('some_host', 80))
    s.close()
    port += 1
2. Fiddle with the SO_LINGER socket option. I have found that this sometimes works in Windows (although not exactly sure why):
s.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, 1)
3. I don't know if this will help in your particular application, however, it is possible to send multiple XMLRPC requests over the same connection using the multicall method. Basically this allows you to accumulate several requests and then send them all at once. You will not get any responses until you actually send the accumulated requests, so you can essentially think of this as batch processing - does this fit in with your application design?
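Since the client in the question is written in C++, here is what the SO_LINGER suggestion looks like when the option is passed as a proper struct linger. This is a sketch against the POSIX headers with a made-up helper name; on Winsock the same struct exists, but the header is winsock2.h and the option pointer has to be cast to const char*.

```cpp
#include <sys/socket.h>

// Hypothetical helper: enables SO_LINGER with a zero timeout, so that a
// later close() aborts the connection instead of going through the normal
// shutdown sequence that leaves the socket in TIME_WAIT.
// Returns true on success.
bool set_abortive_close(int fd) {
    struct linger lg{};
    lg.l_onoff = 1;   // linger handling enabled
    lg.l_linger = 0;  // zero seconds: reset the connection on close
    return ::setsockopt(fd, SOL_SOCKET, SO_LINGER, &lg, sizeof(lg)) == 0;
}
```

Be aware that an abortive close discards any data that has not been sent yet, so it only makes sense once the full response has been received.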

Update:
I tossed this into the code and it seems to be working now.
if(::connect(s_, (sockaddr *) &addr, sizeof(sockaddr)))
{
    int err = WSAGetLastError();
    if(err == 10048) // on a "socket in use" error, force kill and reopen the socket
    {
        closesocket(s_);
        WSACleanup();
        WSADATA info;
        WSAStartup(MAKEWORD(2,0), &info);
        s_ = socket(AF_INET,SOCK_STREAM,0);
        setsockopt(s_,SOL_SOCKET,SO_REUSEADDR,(char*)&x,sizeof(BOOL));
    }
}
Basically, if you encounter the 10048 error (socket in use), you can simply close the socket, call cleanup, restart WSA, and then recreate the socket and set its sockopt again
(the last sockopt may not be necessary).
I must have been missing the WSACleanup/WSAStartup calls before, because closesocket() and socket() were definitely being called.
This error only occurs once every 4000 or so calls.
I am curious as to why this may be, even though this seems to fix it.
If anyone has any input on the subject, I would be very curious to hear it.

Do you close the sockets after using them?
