I am using gsoap for Symbian S60 3rd Edition FP2 in a Qt application. I am making several requests to a WS every 5 seconds. After 2 hours the application stops being able to connect to the WS and I get this Error 28: SOAP_TCP_ERROR from gsoap. If I stop the application and start it again it is able to connect to the WS again. Why is this happening?
I've put the gsoap WS call in a for loop and it stops connecting to the WS at the 892nd iteration, every time I run it.
You can do a few things as preliminary steps:
enable DBGLOG in gsoap
use soap_faultdetail on the client side.
I'm 99% sure it will give you a TCP connection timeout error, which means the connection handshake failed.
If so, it means the WS did not accept the connection for some reason. The source of the problem might lie somewhere among the proxy, firewall, OS, a buggy WS, or the driver, to name just a few. Because of that, you can use a reconnection attempt. I'm not familiar with Symbian, but on Windows reconnection is performed behind the scenes:
see: https://technet.microsoft.com/en-us/library/cc938209.aspx
By default, the reconnection attempt is made twice, but this behaviour can be changed via a registry parameter, the driver, or Winsock.
I think you have to write an explicit connection-retry subroutine at your application level and force gSOAP to use it (see the hooks section in the gSOAP documentation), or just call soap_connect a couple of times if it returns an error.
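For illustration, a minimal retry sketch. The operation name soap_call_ns__myMethod, its empty parameter list, and the endpoint argument are placeholders for whatever soapcpp2 generated for your service:

#include <cstdio>
#include "soapH.h"   // generated by soapcpp2 for your service (assumed)

// Hypothetical wrapper: retry the call a few times before giving up.
int callWithRetry(struct soap *soap, const char *endpoint, int maxAttempts)
{
    int rc = SOAP_ERR;
    for (int attempt = 1; attempt <= maxAttempts; ++attempt) {
        rc = soap_call_ns__myMethod(soap, endpoint, NULL /* SOAPAction */);
        if (rc == SOAP_OK)
            break;
        soap_print_fault(soap, stderr);   // shows the TCP/SOAP fault detail
        soap_closesock(soap);             // make sure the failed socket is released
    }
    return rc;
}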
NOTE: introducing connection_timeout at the gSOAP level may be confusing.
If you decide to put this in your code (if you do not already have it), run some tests to check whether the reconnection attempt is really performed within this timeout or not.
What I'm trying to say is that your application could set the timeout to 30 minutes, but the OS will only send a SYN packet to the WS host a couple of times within, say, the first few seconds. If the WS host does not respond with SYN-ACK for some reason, gsoap's tcp_connect subroutine will fall into a 30-minute waste-of-time loop.
Multiple clients are connected to a single ZMQ_PUSH socket. When a client is powered off unexpectedly, the server does not get an alert and keeps sending messages to it. Despite using ZMQ_NOBLOCK and setting ZMQ_HWM to 5 (queue at most 5 messages), my server doesn't get an error until the client reconnects and all the messages in the queue are received at once.
I recently ran into a similar problem when using ZMQ. We would cut power to interconnected systems, and the subscriber would be unable to reconnect automatically. It turns out that a heartbeat mechanism has recently (in the past year or so) been implemented over ZMTP, the underlying protocol used by ZMQ sockets.
If you are using ZMQ version 4.2.0 or greater, look into setting the ZMQ_HEARTBEAT_IVL and ZMQ_HEARTBEAT_TIMEOUT socket options (http://api.zeromq.org/4-2:zmq-setsockopt). These will set the interval between heartbeats (ZMQ_HEARTBEAT_IVL) and how long to wait for the reply until closing the connection (ZMQ_HEARTBEAT_TIMEOUT).
EDIT: You must set these socket options before connecting.
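A minimal sketch of what that looks like with the plain libzmq C API (version 4.2+ assumed; the endpoint and interval values are only examples):

#include <zmq.h>

int main()
{
    void *ctx = zmq_ctx_new();
    void *sock = zmq_socket(ctx, ZMQ_PUSH);

    int ivl_ms = 1000;      // send a ZMTP PING every second
    int timeout_ms = 3000;  // consider the peer dead if no reply within 3 s
    zmq_setsockopt(sock, ZMQ_HEARTBEAT_IVL, &ivl_ms, sizeof(ivl_ms));
    zmq_setsockopt(sock, ZMQ_HEARTBEAT_TIMEOUT, &timeout_ms, sizeof(timeout_ms));

    // The options only take effect if set before connect/bind.
    zmq_connect(sock, "tcp://127.0.0.1:5555");

    // ... send as usual; a dead peer is now detected by the heartbeat ...

    zmq_close(sock);
    zmq_ctx_term(ctx);
    return 0;
}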
There is nothing in zmq explicitly to detect the unexpected termination of a program at the other end of a socket, or the gratuitous and unexpected failure of a network connection.
There has been historical talk of adding some kind of underlying ping-pong are-you-still-alive internal messaging to zmq, but last time I looked (quite some time ago) it had been decided not to do this.
This does mean that crashes, network failures, etc. aren't necessarily handled very cleanly, and your application will not necessarily know what is going on or whether messages have been successfully sent. It is the Actor model, after all. As you're finding, your program may eventually determine that something had previously gone wrong: timeouts in ZMTP will spot the failure, and eventually the consequences bubble back up to your program.
To do anything better you'd have to layer something like a ping-pong on top yourself (e.g. have a separate socket just for that, so that you can track the reachability of clients), but that starts making it very hard to use the nice parts of ZMQ such as push/pull. Which is probably why the (excellent) ZMQ authors decided not to put it in themselves.
When faced with a similar problem I ended up writing my own transport library. I couldn't find one off the shelf that gave nice behaviour in the face of network failures, crashes, etc. It implemented CSP, not the Actor model, wasn't terribly fast (an inevitability), and didn't do patterns in the ZMQ sense, but it did mean that programs knew exactly where messages were at all times, and knew whether clients were alive or unreachable at all times. The CSP-ness also meant message transfers were an execution rendezvous, so programs knew what each other was doing too.
I have to connect with a Socket.io 0.9 server (for legacy compatibility reasons) from my C++ code. socket.io-poco looks like the only library that provides this functionality, so I have taken the plunge and pulled in Poco in order to support that. Things mostly work, until they do not.
My process seems to stall on a send call. The call inside SocketImpl.cpp does not return, but it takes around half an hour of disconnected execution to get to that state. I am not sure how to prevent and/or recover from the program getting into this bad state.
The program runs on Windows 2012 R2. It connects to the server and converses successfully, but the connection can become volatile. Sometimes I will come back and find the service is no longer visible to the server. This can take hours or days to occur. My test scenario is artificially disconnecting the server and seeing what happens. That normally results in the program getting into this non-returning state in about half an hour.
Any ideas for how to mitigate or resolve this issue?
A different C++ library capable of speaking Socket.io 0.9x
Something I can do to the stale socket.io-poco code to make it more defensive
Guesses as to what I or any of the layers in between have messed up?
Any other ideas?
I decided I needed to learn more about Winsock, so I found a guide. That told me to look at setting SO_SNDTIMEO with setsockopt. After searching through SocketImpl.html I found setSendTimeout and found that I can call it on the WebSocket used by the socket.io-poco code.
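For reference, a minimal sketch of that setup, assuming ws is the Poco::Net::WebSocket held by the socket.io-poco client and 5 seconds is just an example value:

#include <Poco/Net/WebSocket.h>
#include <Poco/Timespan.h>

// Hypothetical helper: bound how long a send may block so a dead peer
// surfaces as an exception from sendFrame() instead of hanging forever.
void configureSendTimeout(Poco::Net::WebSocket &ws)
{
    ws.setSendTimeout(Poco::Timespan(5, 0));   // 5 seconds, 0 microseconds
}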
I then just had to catch the exception and call a new reconnect function when the timeout occurred:
void SIOClientImpl::reconnect() {
    // Disconnect
    _heartbeatTimer->stop();
    _ws->close();

    // Connect
    if ((handshake(_queryArgs)) && (openSocket())) {
        connectToEndpoint(_uri.getPath());
    } else {
        Poco::Thread::sleep(100);
    }
}
I don't know which to hope for: that this answer is helpful or that nobody else has to try and do this!
The scenario is the following:
I have an XML-RPC C++ application, listening for connections on PORT=8081. It implements an Abyss server, using the xmlrpc-c library as follows:
xmlrpc_c::serverAbyss myAbyssServer(
    myRegistry,   // handler of methods
    port,         // 8081
    "xmlrpc_log"
);
When I create multiple connections from a script calling many XMLRPC methods, it works fine.
The script is something like this:
Script1:
rpc.method1(parameters);
rpc.method2(parameters);
rpc.methodN(parameters);
If I check connections on the server with netstat and in the xmlrpc_log while this script is executing, the output is something like XMLRPC-SERVER:8081 XMLRPC-CLIENT:SOME TIME_WAIT. Though the XMLRPC-CLIENT IP is the same, for every rpc.method call it creates a new connection.
The problem appears when I execute two of these scripts on the same client. That is, the call rpc.methodM(parameters) in one script executes simultaneously with rpc.methodN(parameters) in the other script, on the same client.
This produces a crash in the server, and XMLRPC-SERVER stays down until I restart the process.
I read the Abyss help, and the runOnce() method will not help. Calling the constructor as above, MaxConnections defaults to 30 and the timeout to 15 seconds for the Abyss server.
Is there some configuration to avoid this crash? I need to support more than one client at the same time and many simultaneous connections.
Thanks for any help related to this,
Sincerely,
Luchux.
Well, apparently the server is handling the multiple connections and supporting multithreading with pthreads. The problem must be in my code executed by the RPC calls; I guess it is a reentrancy/thread-safety problem.
After a break working on another project, I came back to this code and found the problem was in a natural-language library with some non-reentrant methods. They solved it, I solved it :)
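For what it's worth, if you ever do need to raise the Abyss limits mentioned above, newer xmlrpc-c releases have an option-style serverAbyss constructor. A rough sketch, assuming your version provides these options (the values are only examples):

#include <xmlrpc-c/registry.hpp>
#include <xmlrpc-c/server_abyss.hpp>

int main()
{
    xmlrpc_c::registry myRegistry;
    // ... register your methods on myRegistry here ...

    xmlrpc_c::serverAbyss myAbyssServer(
        xmlrpc_c::serverAbyss::constrOpt()
            .registryP(&myRegistry)
            .portNumber(8081)
            .logFileName("xmlrpc_log")
            .maxConn(64)            // raise the simultaneous-connection limit
            .keepaliveTimeout(15)); // seconds

    myAbyssServer.run();
    return 0;
}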
I recently started diving into HTTP programming in C and have a functioning server that can handle GET and POST. My question concerns my site's load times and how I should send the response headers and response message.
I notice in Chrome's resource tracking tool that there is almost no (a few ms) connecting/sending/proxy/blocking/waiting time in most cases (on the same network as the server), but the receive time can vary wildly. I'm not entirely sure what the receive time includes. I mostly see a long receive time (40 to 140 ms or more) on the PNG files and sometimes the JavaScript files, and rarely on other files, but it's not really consistent.
Could anyone shed some light on this for me?
I haven't done much testing yet, but I was wondering whether changing the method I use to send the header/message would help. I currently have every file for the site cached in server memory along with its header (all in the same char*). When I send the file that was requested, I just do one send() call with the header/file combo (it does not involve any string operations because it is all done in advance at server start-up).
Would it be better to break it into multiple small send() calls?
Just some stats that I get with the Chrome dev tools (again, on the local network through a wireless router connection): the site loads in 120 ms to 570 ms. It's 19 files at a total of 139.85 KB. The computer it's on is an Asus 901 netbook (Atom 1.6 GHz, 2 GB DDR2) running TinyCore Linux. I know there are some optimizations I could be doing with how threads start up and a few other things, but I'm not sure that's affecting it much at the moment.
If you're sending the entire response in one send(), you should set the TCP_NODELAY socket option.
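A quick sketch of that, assuming client_fd is the accepted socket you call send() on:

#include <netinet/in.h>    /* IPPROTO_TCP */
#include <netinet/tcp.h>   /* TCP_NODELAY */
#include <sys/socket.h>
#include <stdio.h>

/* Disable Nagle's algorithm so the fully prepared response is pushed out
   immediately instead of waiting on ACKs of earlier small segments. */
void disable_nagle(int client_fd)
{
    int flag = 1;
    if (setsockopt(client_fd, IPPROTO_TCP, TCP_NODELAY, &flag, sizeof(flag)) < 0)
        perror("setsockopt(TCP_NODELAY)");
}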
If that doesn't help, you may want to try using a packet capturing tool like Wireshark to see if you can spot where the delay is introduced.
The software I'm working on needs to be able to connect to many servers in a short period of time, using TCP/IP. The software runs under Win32. If a server does not respond, I want to be able to quickly continue with the next server in the list.
Sometimes when a remote server does not respond, I get a connection timeout error after roughly 20 seconds. Often the timeout comes quicker.
My problem is that these 20 seconds hurts the performance of my software, and I would like my software to give up sooner (after say 5 seconds). I assume that the TCP/IP stack (?) in Windows automatically adjusts the timeout based on some parameters?
Is it sane to override this timeout in my application, and close the socket if I'm unable to connect within X seconds?
(It's probably irrelevant, but the app is built using C++ and uses I/O completion ports for asynchronous network communication)
If you use I/O completion ports and async operations, why do you need to wait for a connect to complete before continuing with the next server on the list? Use ConnectEx and pass in an overlapped structure. This way the individual server connect times will not add up; the total connect time is the maximum server connect time, not the sum.
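A rough sketch of kicking off one such overlapped connect (Winsock 2; error handling and the IOCP association are omitted, and WSAStartup is assumed to have been called already):

#include <winsock2.h>
#include <ws2tcpip.h>
#include <mswsock.h>

// Starts a non-blocking connect; completion is posted to the IOCP the socket
// has been associated with (association not shown here).
SOCKET startAsyncConnect(const sockaddr_in &remote, OVERLAPPED *ov)
{
    SOCKET s = WSASocket(AF_INET, SOCK_STREAM, IPPROTO_TCP,
                         NULL, 0, WSA_FLAG_OVERLAPPED);

    // ConnectEx requires the socket to be bound first.
    sockaddr_in local = {};
    local.sin_family = AF_INET;
    bind(s, reinterpret_cast<sockaddr *>(&local), sizeof(local));

    // The ConnectEx function pointer is only reachable through WSAIoctl.
    LPFN_CONNECTEX pConnectEx = NULL;
    GUID guid = WSAID_CONNECTEX;
    DWORD bytes = 0;
    WSAIoctl(s, SIO_GET_EXTENSION_FUNCTION_POINTER,
             &guid, sizeof(guid), &pConnectEx, sizeof(pConnectEx),
             &bytes, NULL, NULL);

    // Returns immediately; GetQueuedCompletionStatus later tells you whether
    // (and when) the connect actually completed.
    pConnectEx(s, reinterpret_cast<const sockaddr *>(&remote), sizeof(remote),
               NULL, 0, NULL, ov);
    return s;
}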
On Linux you can
int syncnt = 1;                  /* allow at most one SYN retransmission */
int syncnt_sz = sizeof(syncnt);
setsockopt(sockfd, IPPROTO_TCP, TCP_SYNCNT, &syncnt, syncnt_sz);
to reduce (or increase) the number of SYN retries per connect per socket. Unfortunately, it's not portable to Windows.
As for your proposed solution: closing a socket while it is still in connecting state should be fine, and it's probably the easiest way. But since it sounds like you're already using asynchronous completions, can you simply try to open four connections at a time? If all four time out, at least it will only take 20 seconds instead of 80.
All configurable TCP/IP parameters for Windows are here
See TcpMaxConnectRetransmissions
You might consider trying to open many connections at once (each with its own socket), and then work with the one that responds first. The others can be closed.
You could do this with non-blocking open calls, or with blocking calls and threads. Then the lag waiting for a connection to open shouldn't be any more than is minimally necessary.
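A POSIX-flavoured sketch of that idea (the same approach maps onto Winsock with minor changes; the addresses and the 5-second limit are placeholders):

#include <fcntl.h>
#include <netinet/in.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>
#include <vector>

// Start non-blocking connects to every candidate, wait up to 5 seconds,
// and keep the first socket that completes; close everything else.
int connectToFirstResponder(const std::vector<sockaddr_in>& servers)
{
    std::vector<int> pending;
    for (size_t i = 0; i < servers.size(); ++i) {
        int fd = socket(AF_INET, SOCK_STREAM, 0);
        fcntl(fd, F_SETFL, fcntl(fd, F_GETFL) | O_NONBLOCK);
        connect(fd, (const sockaddr*)&servers[i], sizeof(servers[i]));  // EINPROGRESS expected
        pending.push_back(fd);
    }

    fd_set writable;
    FD_ZERO(&writable);
    int maxfd = -1;
    for (size_t i = 0; i < pending.size(); ++i) {
        FD_SET(pending[i], &writable);
        if (pending[i] > maxfd) maxfd = pending[i];
    }

    timeval deadline = { 5, 0 };                 // our own 5-second limit
    select(maxfd + 1, NULL, &writable, NULL, &deadline);

    int winner = -1;
    for (size_t i = 0; i < pending.size(); ++i) {
        int err = 0;
        socklen_t len = sizeof(err);
        bool ok = FD_ISSET(pending[i], &writable) &&
                  getsockopt(pending[i], SOL_SOCKET, SO_ERROR, &err, &len) == 0 &&
                  err == 0;
        if (ok && winner == -1)
            winner = pending[i];                 // first usable connection
        else
            close(pending[i]);                   // timed out, failed, or redundant
    }
    return winner;                               // -1 if nobody answered in time
}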
You have to be careful when you override the socket timeout. If you are too aggressive and attempt to connect to many servers very quickly, the Windows TCP/IP stack will assume your application is an internet worm and throttle it down. If this happens, the performance of your application will become even worse.
The details of when exactly the throttling occurs are not advertised, but the timeout you propose (5 seconds) should be OK, in my experience.
The details that are available about this can be found here