I simply want to connect to a server and get a response. My program is written in C++, and the code is below:
if((interpreterSocket = socket(AF_INET, SOCK_STREAM, 0)) < 0)
{
return SOCKETERR;
}
fcntl(interpreterSocket, F_SETFD, FD_CLOEXEC);
setsockopt(interpreterSocket, SOL_SOCKET, SO_REUSEADDR, (char *) &flag, sizeof(flag));
setsockopt(interpreterSocket, IPPROTO_TCP, TCP_NODELAY, (char *) &flag, sizeof(flag));
setsockopt(interpreterSocket, SOL_SOCKET, SO_REUSEADDR , (char *) &flag, sizeof(flag));
address.sin_family = AF_INET;
address.sin_addr.s_addr = inet_addr(INTERPRETERADDR);
address.sin_port = htons(INTERPRETERPORT);
adlen = sizeof(address);
if((rc = connect(interpreterSocket, (struct sockaddr *) &address, adlen)) < 0)
{
close(interpreterSocket);
return SOCKETERR;
}
The problem is that the program sometimes misbehaves when I run it, so I have to kill the process. When I run it again afterwards, connect() does not return and the program hangs at the if line. I think this problem must be related to a socket that was not closed properly.
I should mention that I run this program in CentOS.
Thanks in advance.
EDIT
I had one more problem that caused the program to stop at connect(): a routing problem at the server.
You are using a blocking socket, so connect() blocks until it connects to the server or the timeout expires. To set a socket to non-blocking mode, use the ioctlsocket function with the FIONBIO command as the second argument.
Note that once a socket is non-blocking, you should use select() to find out when it is ready for reading or writing.
EDIT
Sorry, I forgot that this is about Linux. There you can do non-blocking I/O on a socket by setting the O_NONBLOCK flag on its file descriptor with fcntl.
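As a rough illustration of the O_NONBLOCK approach, here is a minimal sketch of a connect with a timeout. The names connect_with_timeout and fail_with are hypothetical, and it waits with poll(2) rather than the select() mentioned above, which serves the same purpose here. Treat it as a starting point under those assumptions, not a drop-in replacement for the code in the question.

```cpp
#include <arpa/inet.h>
#include <cassert>
#include <cerrno>
#include <fcntl.h>
#include <netinet/in.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical helper: close fd and report failure with errno set to err.
static int fail_with(int fd, int err)
{
    close(fd);
    errno = err;
    return -1;
}

// Hypothetical helper: connect to ip:port, giving up after timeout_ms.
// Returns a connected descriptor (still non-blocking) or -1 with errno set.
int connect_with_timeout(const char* ip, unsigned short port, int timeout_ms)
{
    assert(ip != nullptr && timeout_ms >= 0);

    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    // Put the descriptor into non-blocking mode so connect() returns at once.
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
        return fail_with(fd, errno);

    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1)
        return fail_with(fd, EINVAL);

    if (connect(fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) == 0)
        return fd;                       // connected immediately
    if (errno != EINPROGRESS)
        return fail_with(fd, errno);

    // Writability signals that the connection attempt has finished,
    // whether it succeeded or failed.
    pollfd pfd{};
    pfd.fd = fd;
    pfd.events = POLLOUT;
    int ready = poll(&pfd, 1, timeout_ms);
    if (ready == 0)
        return fail_with(fd, ETIMEDOUT);
    if (ready < 0)
        return fail_with(fd, errno);

    // SO_ERROR holds the outcome of the asynchronous connect.
    int err = 0;
    socklen_t len = sizeof(err);
    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) < 0)
        return fail_with(fd, errno);
    if (err != 0)
        return fail_with(fd, err);
    return fd;
}
```

A caller could then write something like connect_with_timeout(INTERPRETERADDR, INTERPRETERPORT, 5000) and treat a -1 result with errno == ETIMEDOUT as "server unreachable" instead of hanging. The 5000 ms value is only an example.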
Related
I am writing a C++ multicasting application on Linux Ubuntu.
In my C++ multicast sender class I do this:
uint16_t port = 5678;
const char* group = "239.128.128.128";
int fd = socket(AF_INET, SOCK_DGRAM, 0);
struct sockaddr_in addr;
memset(&addr, 0, sizeof(addr));
addr.sin_family = AF_INET;
addr.sin_addr.s_addr = inet_addr(group);
addr.sin_port = htons(port);
const char* buf = "Hi there";
size_t bytes_to_write = 8;
ssize_t bytes_sent = sendto(fd, buf, bytes_to_write, 0, (struct sockaddr*) &addr, sizeof(addr)); // sendto() returns ssize_t; -1 signals an error
Is there any way to configure the file descriptor so that I can call write() rather than sendto()? I would have thought there would be a setsockopt option or similar to do this?
Yes.
Per the documentation man 7 udp
When
connect(2) is called on the socket, the default destination address
is set and datagrams can now be sent using send(2) or write(2)
without specifying a destination address.
and, for generality, the POSIX spec for connect says
If the initiating socket is not connection-mode, then connect() shall set the socket's peer address, and no connection is made. For SOCK_DGRAM sockets, the peer address identifies where all datagrams are sent on subsequent send() functions, and limits the remote sender for subsequent recv() functions.
It's always worth checking the documentation for these things; it isn't that impenetrable. FWIW, I couldn't remember immediately whether you need connect() or bind() for this, and it took me a few seconds to find out.
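To make the quoted behaviour concrete, here is a minimal sketch. The helper name make_connected_udp is hypothetical. It calls connect() on a SOCK_DGRAM socket so that later write() or send() calls need no destination address.

```cpp
#include <arpa/inet.h>
#include <cassert>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical helper: returns a UDP socket whose default destination is
// ip:port, or -1 on failure. No packets are exchanged by connect() here;
// it only records the peer address.
int make_connected_udp(const char* ip, unsigned short port)
{
    assert(ip != nullptr);

    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1 ||
        connect(fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

With the values from the question, this would be used as int fd = make_connected_udp("239.128.128.128", 5678); followed by write(fd, "Hi there", 8);. Each write() sends one datagram to the stored address.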
I have a C++ program, using mpi, that follows a typical client server model. Each mpi instance of the client connects to a corresponding mpi instance of the server. This has worked relatively well until I have had to do some testing with added latency (1 second of added latency to be precise).
Problem:
Sometimes one of the server processes does not think the client has connected, but the client thinks it has. That is, gdb shows the server still waiting at accept() while the client has continued past connect().
My best guess is that I need to set a socket option somewhere, but talking to fellow programmers and googling have not yielded any helpful results.
EDIT:
There are two sets of MPI processes (so two different calls to mpirun), the accept() and connect() calls are for the sockets, which are between the two sets of MPI processes. It is openmpi.
The code (from someone else's code, actually) [reduced]:
Client (connect code): (m_socket is the actual socket)
if (-1 == m_socket)
{
perror("cannot create socket");
exit(EXIT_FAILURE);
}
memset(&addr, 0, sizeof(addr));
addr.sin_family = AF_INET;
addr.sin_port = htons(port);
res = inet_pton(AF_INET, host_ip, &addr.sin_addr);
if (0 > res)
{
perror("error: first parameter is not a valid address family");
close(m_socket);
exit(EXIT_FAILURE);
}
else if (0 == res)
{
perror("error: second parameter does not contain valid IP address");
close(m_socket);
exit(EXIT_FAILURE);
}
//backoff
for (int sec = 1; sec < 20000; sec++ )
{
int ret;
if (0 == (ret = connect(m_socket, (struct sockaddr *)&addr, sizeof(addr))))
{
return;
}
sleep(1);
close(m_socket);
m_socket = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
}
perror("connect failed");
close(m_socket);
exit(EXIT_FAILURE);
Server: (m_socket is the actual socket)
int socket = ::accept(m_socket, NULL, NULL);
if(socket < 0)
{
fprintf(stderr, "accept() failed: %s\n", strerror(errno));
close(m_socket);
exit(EXIT_FAILURE);
}
It looks like you're trying to do your connect/accept manually rather than with MPI. You might take a look at the example on Deino (http://mpi.deino.net/mpi_functions/MPI_Comm_accept.html) if you're trying to use MPI for your connections.
Alternatively, you might just need to look at a more general MPI tutorial (some are available here: http://www.mcs.anl.gov/research/projects/mpi/tutorial/) to get a feel for how communication works. Most of the time an application doesn't use Connect/Accept to communicate, but uses MPI Communicators to set up communication between processes. It's a different model (SPMD as opposed to MPMD).
I am developing a C++ app on openSUSE 12.3, and one of its parts is responsible for sending data to a device over a socket (on the LAN). I am using this code:
int sockfd, portno, n;
struct sockaddr_in serv_addr;
struct hostent *printer;
portno = 9100;
sockfd = socket(AF_INET, SOCK_STREAM, 0);
if(sockfd < 0) error("ERROR opening socket\n");
printer = gethostbyname("100.0.69.23");
if(printer == NULL) error("No such device on 100.0.69.23\n");
//set bit set to zero
bzero((char *) &serv_addr, sizeof(serv_addr));
serv_addr.sin_family = AF_INET;
bcopy((char *) printer->h_addr, (char *) &serv_addr.sin_addr.s_addr, printer->h_length);
serv_addr.sin_port = htons(portno);
if(connect(sockfd, (struct sockaddr *) & serv_addr, sizeof(serv_addr)) < 0)
{error("ERROR connecting");
return;
}
n = write(sockfd, data, datalenght);
if(n < 0) error("ERROR sending command to printer");
n = read(sockfd, buffer, 200);
I think the code is correct, but connect() returns -1, so it seems it cannot connect to the device (a printer). This code was written on openSUSE 11, where it worked fine and I could send and receive data, but after I copied it to the new system (openSUSE 12.3) the connection fails. Pinging the IP address in question shows that the device is reachable over the LAN.
Consider the possibility that the hostent returned by gethostbyname has the AF_INET6 address family, in which case it holds an IPv6 address instead of an IPv4 one.
http://linux.die.net/man/3/gethostbyname
You can either use the GNU extension gethostbyname2, which lets you specify the address family:
printer = gethostbyname2("100.0.69.23", AF_INET);
Or you can use getaddrinfo instead, since the documentation describes gethostbyname as obsolete.
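For the getaddrinfo route, a minimal sketch might look like the following. The helper name resolve_ipv4 is hypothetical. Setting ai_family to AF_INET in the hints restricts the result to IPv4, which plays the same role as the AF_INET argument to gethostbyname2.

```cpp
#include <arpa/inet.h>
#include <cassert>
#include <cstring>
#include <netdb.h>
#include <netinet/in.h>
#include <sys/socket.h>

// Hypothetical helper: resolves host and port (both as strings) to an IPv4
// TCP address. Returns true and fills `out` on success.
bool resolve_ipv4(const char* host, const char* port, sockaddr_in& out)
{
    assert(host != nullptr && port != nullptr);

    addrinfo hints{};
    hints.ai_family = AF_INET;        // IPv4 results only
    hints.ai_socktype = SOCK_STREAM;  // TCP

    addrinfo* res = nullptr;
    if (getaddrinfo(host, port, &hints, &res) != 0 || res == nullptr)
        return false;

    // Take the first result; with AF_INET hints it is a sockaddr_in.
    std::memcpy(&out, res->ai_addr, sizeof(out));
    freeaddrinfo(res);
    return true;
}
```

With the address from the question, resolve_ipv4("100.0.69.23", "9100", serv_addr) would fill serv_addr ready to be passed to connect(), replacing the gethostbyname and bcopy steps.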
As already mentioned, you are checking for printer == NULL before initializing it. I think you meant the following instead:
sockfd = socket(AF_INET, SOCK_STREAM, 0);
if (sockfd < 0) error("ERROR opening socket\n");
printer = gethostbyname("100.0.69.23");
...
Also the structure of the code seems to indicate that when you want to send a command to the printer you connect(), write() then read(), which is OK if you are only ever sending one command, but suboptimal if you are sending multiple commands. In the latter case you want to separate the connect() from the write() as it's fairly expensive to connect so you want to do it just once.
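One way to separate the two steps is to wrap the socket in a small class that connects once and is then reused for every command. The sketch below is only an illustration: the class name DeviceConnection and its methods are hypothetical, and reading the device's reply is left out.

```cpp
#include <arpa/inet.h>
#include <cassert>
#include <cstddef>
#include <netinet/in.h>
#include <string>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical wrapper: connects once in the constructor, then reuses the
// same socket for every command until it is destroyed.
class DeviceConnection {
public:
    DeviceConnection(const char* ip, unsigned short port)
    {
        assert(ip != nullptr);
        fd_ = socket(AF_INET, SOCK_STREAM, 0);
        if (fd_ < 0)
            return;
        sockaddr_in addr{};
        addr.sin_family = AF_INET;
        addr.sin_port = htons(port);
        if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1 ||
            connect(fd_, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) < 0) {
            close(fd_);
            fd_ = -1;
        }
    }
    ~DeviceConnection() { if (fd_ >= 0) close(fd_); }
    DeviceConnection(const DeviceConnection&) = delete;
    DeviceConnection& operator=(const DeviceConnection&) = delete;

    bool connected() const { return fd_ >= 0; }

    // Sends the whole command over the already-open connection,
    // looping because send() may accept fewer bytes than requested.
    bool send_command(const std::string& cmd)
    {
        if (!connected())
            return false;
        std::size_t sent = 0;
        while (sent < cmd.size()) {
            ssize_t n = send(fd_, cmd.data() + sent, cmd.size() - sent, MSG_NOSIGNAL);
            if (n <= 0)
                return false;
            sent += static_cast<std::size_t>(n);
        }
        return true;
    }

private:
    int fd_ = -1;
};
```

A caller would construct DeviceConnection printer("100.0.69.23", 9100); once and then call printer.send_command(...) for each command, paying the connection cost a single time.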
I have a loop which keeps writing data to a client through TCP/IP. The connection is opened as follows:
newsockfd = accept(sockfd,
(struct sockaddr *) &cli_addr,
&clilen);
The following line is executed continuously in a loop (with sleep of 0.1 sec) in order to write the data to the client:
n = write(newsockfd,data.c_str(),data.length()+1); //+1 to include NULL in null terminated string
if(n>=0)
{
cout<<"success"<<endl;
}
else
{
cout<<"Fail"<<endl;
close(newsockfd);
newsockfd = -1;
}
I want the server to become ready to receive a new connection if the current one is broken for any reason. So if writing fails, I get ready to accept a new connection with the first command.
My problem is the following: the method works the first time. If the client breaks the connection, write() returns a negative number, so I know immediately that the connection has a problem; I close it and wait for a new one. The server receives the new connection, but the next time write() is called, the program crashes immediately.
Why does this happen? Please help, I'm new to TCP/IP.
Please ask for more information if you require it.
Requested from helpers:
Stack trace:
Error: signal 13:
/mnt/hgfs/Dropbox/common_src/LinuxTCP/Server/ServerLinux-build-desktop-Qt_4_8_1_in_PATH__System__Release/ServerLinux[0x402155]
/lib/x86_64-linux-gnu/libc.so.6(+0x364a0)[0x7ffc57ac04a0]
/lib/x86_64-linux-gnu/libpthread.so.0(write+0x10)[0x7ffc5836dcb0]
/mnt/hgfs/Dropbox/common_src/LinuxTCP/Server/ServerLinux-build-desktop-Qt_4_8_1_in_PATH__System__Release/ServerLinux[0x4023b6]
/mnt/hgfs/Dropbox/common_src/LinuxTCP/Server/ServerLinux-build-desktop-Qt_4_8_1_in_PATH__System__Release/ServerLinux[0x401b54]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xed)[0x7ffc57aab76d]
/mnt/hgfs/Dropbox/common_src/LinuxTCP/Server/ServerLinux-build-desktop-Qt_4_8_1_in_PATH__System__Release/ServerLinux[0x402081]
Variable definitions: it's a class:
Body:
int sockfd, portno, n;
struct sockaddr_in serv_addr;
struct hostent *server;
Constructor starts the stuff:
LinuxTCPServer::LinuxTCPServer(int port, bool nonblocking)
{
if(nonblocking)
sockfd = socket(AF_INET, SOCK_NONBLOCK | SOCK_STREAM, 0);
else
sockfd = socket(AF_INET, SOCK_STREAM, 0);
if (sockfd < 0)
error("ERROR opening socket");
bzero((char *) &serv_addr, sizeof(serv_addr));
portno = port;
serv_addr.sin_family = AF_INET;
serv_addr.sin_addr.s_addr = INADDR_ANY;
serv_addr.sin_port = htons(portno);
if (bind(sockfd, (struct sockaddr *) &serv_addr,
sizeof(serv_addr)) < 0)
error("ERROR on binding");
listen(sockfd,5);
clilen = sizeof(cli_addr);
}
Assuming Linux >= 2.2, replace this:
n = write(newsockfd,data.c_str(),data.length()+1);
with this:
n = send(newsockfd, data.c_str(), data.length()+1, MSG_NOSIGNAL);
send(2) will then return -1 with errno set to EPIPE, rather than generating a fatal SIGPIPE. Alternatively, ignore SIGPIPE.
When you receive the SIGPIPE, the connection behind newsockfd has been broken. We don't have enough code to reproduce the problem, client and server, so it's rather moot to say what might actually be wrong. However, converting SIGPIPEs to EPIPEs will at least give your server a chance to handle the broken connection.
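To show how the EPIPE path could be handled, here is a minimal sketch. The helper name send_or_drop is hypothetical. It sends with MSG_NOSIGNAL and, on any failure, closes the descriptor and resets it to -1 so the caller knows to accept a new connection.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: sends len bytes without ever raising SIGPIPE.
// Returns true on success. On failure it closes fd, sets fd to -1,
// leaves the send() error in errno, and returns false.
bool send_or_drop(int& fd, const char* data, std::size_t len)
{
    assert(data != nullptr);

    std::size_t sent = 0;
    while (sent < len) {
        ssize_t n = send(fd, data + sent, len - sent, MSG_NOSIGNAL);
        if (n < 0) {
            if (errno == EINTR)
                continue;            // interrupted by a signal, retry
            int saved = errno;       // e.g. EPIPE when the peer is gone
            close(fd);
            fd = -1;
            errno = saved;
            return false;
        }
        sent += static_cast<std::size_t>(n);
    }
    return true;
}
```

In the loop from the question this would be called as send_or_drop(newsockfd, data.c_str(), data.length() + 1), and a false result would lead back to accept().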
Your stack trace indicates that the program is crashing with signal 13, which means you have a broken pipe.
That would indicate that your connection is broken, but you are still trying to write to it. See this thread for why that might cause the broken pipe error: What causes the Broken Pipe Error?
As for how to solve the issue, I suspect you're not actually getting a properly set up connection from your accept() call. Make sure you check the return value of accept() before calling write().
I think the problems that are causing your accept() call to fail are likely on the other side of the connection.
I have a connection protocol that has been defined by our customer. Data are sent between two linux computers using UDP and TCP protocols. The IP addresses and ports are fixed on startup.
We are sending messages at 200 Hz and I have been using connect to save some time on the transmissions.
My problem is that if there is a communication error, I need to tear down the connections and reinitialise.
I have a problem with one of the UDP connections as it will not rebind to the required address and returns errno 22.
The code I am using is something like:
int
doConnect(int& sock, int local_port, char *local_ip, int remote_port, char *remote_ip)
{
sock = socket(PF_INET, SOCK_DGRAM, IPPROTO_UDP);
struct sockaddr_in addr;
memset(&addr, 0, sizeof(addr));
addr.sin_family = AF_INET;
addr.sin_port = htons(local_port);
inet_pton(AF_INET, local_ip, &addr.sin_addr);
if (0 > bind(sock, (struct sockaddr*)&addr, sizeof(addr)))
{
printf("Bind Error errno = %d\n", errno);
return ERR_BIND;
}
memset(&addr, 0, sizeof(addr));
addr.sin_family = AF_INET;
addr.sin_port = htons(remote_port);
inet_pton(AF_INET, remote_ip, &addr.sin_addr);
if (0 > connect(sock, (struct sockaddr*)&addr, sizeof(addr)))
{
printf("Connect Error errno = %d\n", errno);
return ERR_CONNECT;
}
return ERR_OK;
}
The way that this is used is like this:
int s1(-1), s2(-1);
doConnect(s1, 31003, "172.17.21.255", 31006, "172.17.21.1");
doConnect(s2, 31001, "172.17.21.3", 31004, "172.17.21.1");
When an error occurs
close(s1);
close(s2);
doConnect(s1, 31003, "172.17.21.255", 31006, "172.17.21.1");
doConnect(s2, 31001, "172.17.21.3", 31004, "172.17.21.1");
Here the local address is 172.17.21.3 and I am connecting to 172.17.21.1. s1 listens to a broadcast message.
s1 successfully reconnects to the remote machine, but s2 fails with error 22 from the call to bind.
I have tried explicitly calling bind and connect to an AF_UNSPEC address immediately before I close the socket. This doesn't solve the problem.
Are there any options that I should be using?
Perhaps you could try:
int val = 1;
setsockopt(sock, SOL_SOCKET, SO_REUSEADDR, &val, sizeof(val));
I also suggest you double check that you're not passing the same socket to the two consecutive doConnect() calls (as errno 22 = EINVAL, which in the case of bind() appears to mean that the socket is already bound to an address).
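To put the setsockopt call in context, here is a minimal sketch. The helper name bind_udp_reuse is hypothetical. It creates a fresh UDP socket, sets SO_REUSEADDR before bind(), and checks the result of inet_pton. Whether this cures the errno 22 in the question is not certain; it only shows the order of the calls.

```cpp
#include <arpa/inet.h>
#include <cassert>
#include <cerrno>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical helper: creates a UDP socket with SO_REUSEADDR set and binds
// it to ip:port. Returns the descriptor, or -1 with errno set.
int bind_udp_reuse(const char* ip, unsigned short port)
{
    assert(ip != nullptr);

    int fd = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);
    if (fd < 0)
        return -1;

    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1) {
        close(fd);
        errno = EINVAL;              // the text was not a valid IPv4 address
        return -1;
    }

    // The option must be set before bind() to have any effect on it.
    int val = 1;
    if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &val, sizeof(val)) < 0 ||
        bind(fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) < 0) {
        int saved = errno;
        close(fd);
        errno = saved;
        return -1;
    }
    return fd;
}
```

Because the helper always starts from a new socket, it cannot hit the "already bound" case. Calling bind() a second time on a descriptor that is already bound is what the Linux bind(2) man page documents as EINVAL.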
The underlying socket layer might hold the port & IP address still open, even after your call to close. Try some of the following:
do a sleep(10) (or more) between the close and the call to doConnect again
configure the sockets using setsockopt with the SO_LINGER set to off
This actually happens more commonly with TCP connections, but I see no reason UDP can't have this problem as well.
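For the SO_LINGER suggestion, a minimal sketch could look like this. The helper name disable_linger is hypothetical. Lingering is already off by default on Linux, so this mostly makes the setting explicit, and it may well not change the behaviour described in the question.

```cpp
#include <cassert>
#include <sys/socket.h>

// Hypothetical helper: explicitly turns SO_LINGER off for fd, so close()
// returns immediately instead of waiting for unsent data.
// Returns true if the option was set.
bool disable_linger(int fd)
{
    assert(fd >= 0);

    struct linger lg{};
    lg.l_onoff = 0;    // 0 = lingering disabled
    lg.l_linger = 0;   // timeout in seconds, ignored while disabled
    return setsockopt(fd, SOL_SOCKET, SO_LINGER, &lg, sizeof(lg)) == 0;
}
```

It would be called right after socket() inside doConnect(), for example disable_linger(sock);.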