On Atmosphere 2.2.0 running on Jetty, with a Python websocket-client-0.20.0 client, we sometimes (though rarely) get into a situation where our logs fill up with the following:
12:18:01.105 WARN org.atmosphere.websocket.WebSocket: Unable to write 501 Websocket protocol not supported
12:18:01.106 WARN o.a.w.protocol.SimpleHttpProtocol: Status code higher or equal than 400. Unable to deliver the websocket messages to installed component. Status 501 Message Websocket protocol not supported
This spamming happens very fast, at roughly 5 ms intervals, and if it occurs in production there is no remedy other than taking the system down. So we really need to avoid it.
The root cause may be in our client, but what I'm wondering is: is there a way to recognize these kinds of errors at the server back end and simply close the connection, in case we can't fix the root cause?
... edit, some six months later: This issue seems to arise from an earlier Jetty exception:
11:05:54.363 ERROR o.a.container.Jetty9WebSocketHandler: {}
org.eclipse.jetty.websocket.api.WebSocketTimeoutException: Timeout on Read
    at org.eclipse.jetty.websocket.common.io.AbstractWebSocketConnection.onReadTimeout(AbstractWebSocketConnection.java:521) [device-interaction-service-1.0.jar:na]
    at org.eclipse.jetty.io.AbstractConnection.onFillInterestedFailed(AbstractConnection.java:258) [device-interaction-service-1.0.jar:na]
    at org.eclipse.jetty.websocket.common.io.AbstractWebSocketConnection.onFillInterestedFailed(AbstractWebSocketConnection.java:497) [device-interaction-service-1.0.jar:na]
    at org.eclipse.jetty.io.AbstractConnection$ReadCallback$1.run(AbstractConnection.java:420) [device-interaction-service-1.0.jar:na]
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:601) [device-interaction-service-1.0.jar:na]
    at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:532) [device-interaction-service-1.0.jar:na]
    at java.lang.Thread.run(Thread.java:745) [na:1.8.0_51]
11:05:54.365 DEBUG o.a.w.DefaultWebSocketProcessor: Unable to properly complete 2d48f7ce-6150-42c2-b815-93acf94bdb93
11:05:55.156 WARN org.atmosphere.websocket.WebSocket: Unable to write 501 Websocket protocol not supported
11:05:55.157 WARN o.a.w.protocol.SimpleHttpProtocol: Status code higher or equal than 400. Unable to deliver the websocket messages to installed component. Status 501 Message Websocket protocol not supported
11:05:55.201 WARN org.atmosphere.websocket.WebSocket: Unable to write 501 Websocket protocol not supported
11:05:55.202 WARN o.a.w.protocol.SimpleHttpProtocol: Status code higher or equal than 400. Unable to deliver the websocket messages to installed component. Status 501 Message Websocket protocol not supported
11:05:55.203 WARN org.atmosphere.websocket.WebSocket: Unable to write 501 Websocket protocol not supported
...
It also seems to me that the connection is left hanging, and no @Disconnect method gets called on my @ManagedService.
I wonder how to intercept this and just shut down the connection, forcing the client to reconnect.
Write your own WebSocketProtocol, or extend SimpleHttpProtocol and override its onError method: https://github.com/Atmosphere/atmosphere/wiki/Writing-WebSocket-Sub-Protocol
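For illustration, a minimal sketch of such a sub-protocol, assuming Atmosphere 2.x where WebSocketProtocol exposes onError(WebSocket, WebSocketProcessor.WebSocketException); the class name is hypothetical, and you would register it via the org.atmosphere.websocket.WebSocketProtocol init-param:

    import org.atmosphere.websocket.WebSocket;
    import org.atmosphere.websocket.WebSocketProcessor;
    import org.atmosphere.websocket.protocol.SimpleHttpProtocol;

    // Hypothetical sub-protocol: on any error, close the connection instead
    // of letting the 501 warnings repeat indefinitely.
    public class ClosingHttpProtocol extends SimpleHttpProtocol {

        @Override
        public void onError(WebSocket webSocket, WebSocketProcessor.WebSocketException t) {
            super.onError(webSocket, t);
            try {
                webSocket.close(); // force the client to reconnect
            } catch (Exception ignored) {
                // connection is already gone; nothing left to do
            }
        }
    }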
Related
I'm working with WSO2 APIM 3.1.0, and some of my endpoints keep ending up in the SUSPENDED state.
The endpoint was built to return HTTP codes 400, 403, and 404 as part of its business logic.
In the Advanced Configurations for a given endpoint, we can set the error codes that move the endpoint into the suspension or timeout state.
The error codes below are available for selection:
101000 Receiver input/output error sending
101001 Receiver input/output error receiving
101500 Sender input/output error sending
101501 Sender input/output error receiving
101503 Connection failed
101504 Connection timed out (no input was detected on this connection over the maximum period of inactivity)
101505 Connection closed
101506 NHTTP protocol violation
101507 Connection canceled
101508 Request to establish new connection timed out
101509 Send abort
101510 Response processing failed
Are HTTP codes such as 400/403/404 returned by the endpoint mapped to some of those WSO2 error codes?
Yes. If you set a suspension code, it will be internally mapped to the response code.
Why do Pub/Sub requests seem to trigger such a high number of 503 errors? Is this something common? It seems other people see something similar, but a majority of my requests end up that way.
Similar to
Google Pubsub: UNAVAILABLE: The service was unable to fulfill your request
Catch error code from GCP pub/sub
This is expected behavior. Streaming pull, which is used by the client libraries, creates a bidirectional stream for receiving messages and sending back acknowledgements. These streams stay open for long periods of time and don't close with a successful response code when messages are received; they terminate with an error condition when the stream disconnects, perhaps due to a restart on the part of the server receiving the request or because of a brief network blip. Therefore, even if you are receiving messages successfully, you'll still see error response codes for all of the streams themselves. The new streaming pull docs address this question directly.
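For illustration, a minimal sketch with the Java client library (API details vary by version; the project and subscription names are placeholders): messages keep arriving in the receiver even while individual streams terminate with UNAVAILABLE/503 behind the scenes.

    import com.google.api.core.ApiService;
    import com.google.cloud.pubsub.v1.AckReplyConsumer;
    import com.google.cloud.pubsub.v1.MessageReceiver;
    import com.google.cloud.pubsub.v1.Subscriber;
    import com.google.common.util.concurrent.MoreExecutors;
    import com.google.pubsub.v1.ProjectSubscriptionName;
    import com.google.pubsub.v1.PubsubMessage;

    public class StreamingPullSketch {
        public static void main(String[] args) {
            // Placeholder project/subscription names.
            ProjectSubscriptionName subscription =
                    ProjectSubscriptionName.of("my-project", "my-subscription");

            // Delivered messages arrive here, independent of stream-level errors.
            MessageReceiver receiver = (PubsubMessage message, AckReplyConsumer consumer) -> {
                System.out.println("Received: " + message.getData().toStringUtf8());
                consumer.ack();
            };

            Subscriber subscriber = Subscriber.newBuilder(subscription, receiver).build();

            // This listener only fires if the subscriber gives up entirely;
            // routine stream disconnects are retried by the library itself.
            subscriber.addListener(new ApiService.Listener() {
                @Override
                public void failed(ApiService.State from, Throwable failure) {
                    System.err.println("Subscriber terminated: " + failure);
                }
            }, MoreExecutors.directExecutor());

            subscriber.startAsync().awaitRunning();
        }
    }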
We run a website that obtains location data through the Google Places API. We have 150k daily searches available, which we haven't reached yet, as the website has been live for only a few weeks. We have suddenly received a 502 error. A notification in the Console says: "The server encountered a temporary error and could not complete your request." Is this a temporary error? Are there any suggestions on what we can do? The website hasn't been available for 40 minutes.
When you receive a 5xx status or UNKNOWN_ERROR in the response, you should implement retry logic. Google has the following recommendation in their web services documentation:
In rare cases something may go wrong serving your request; you may receive a 4XX or 5XX HTTP response code, or the TCP connection may simply fail somewhere between your client and Google's server. Often it is worthwhile re-trying the request as the followup request may succeed when the original failed. However, it is important not to simply loop repeatedly making requests to Google's servers. This looping behavior can overload the network between your client and Google causing problems for many parties.
A better approach is to retry with increasing delays between attempts. Usually the delay is increased by a multiplicative factor with each attempt, an approach known as Exponential Backoff.
https://developers.google.com/maps/documentation/directions/web-service-best-practices#exponential-backoff
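A minimal sketch of that pattern in plain Java (the request URL, delays, and attempt limit are placeholders, not part of any Google client library); a dropped TCP connection could be caught and retried the same way:

    import java.io.IOException;
    import java.net.HttpURLConnection;
    import java.net.URL;
    import java.util.Random;
    import java.util.Scanner;

    public class BackoffRetry {
        // Retries on 5xx responses, doubling the delay each time.
        // Assumes maxAttempts >= 1.
        public static String fetchWithBackoff(String url, int maxAttempts)
                throws IOException, InterruptedException {
            long delayMs = 500;            // initial delay
            Random jitter = new Random();  // randomize to avoid synchronized retries
            IOException last = null;
            for (int attempt = 1; attempt <= maxAttempts; attempt++) {
                HttpURLConnection conn =
                        (HttpURLConnection) new URL(url).openConnection();
                int status = conn.getResponseCode();
                if (status >= 200 && status < 300) {
                    try (Scanner s = new Scanner(conn.getInputStream(), "UTF-8")) {
                        return s.useDelimiter("\\A").hasNext() ? s.next() : "";
                    }
                }
                if (status < 500) {
                    throw new IOException("HTTP " + status); // 4xx: don't retry
                }
                last = new IOException("HTTP " + status);
                Thread.sleep(delayMs + jitter.nextInt(250)); // back off, then retry
                delayMs *= 2;                                // exponential growth
            }
            throw last;
        }
    }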
However, if retry logic with exponential backoff doesn't help and the error persists for a long time, you should file a bug in the Google issue tracker.
I hope this addresses your doubt!
UPDATE
There was an issue on Google side yesterday (November 6, 2017), you can refer to the following bug that explains the issue:
https://issuetracker.google.com/issues/68938173
Test cluster of two brokers, WKA membership scheme, PostgreSQL message store; working fine for a couple of days, then throwing the following errors:
TID: [] [] [2016-07-19 12:09:24,738] ERROR {org.wso2.andes.server.protocol.MultiVersionProtocolEngine} - Error establishing session {org.wso2.andes.server.protocol.MultiVersionProtocolEngine}
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
at sun.nio.ch.IOUtil.read(IOUtil.java:197)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
at org.apache.mina.transport.socket.nio.SocketIoProcessor.read(SocketIoProcessor.java:218)
at org.apache.mina.transport.socket.nio.SocketIoProcessor.process(SocketIoProcessor.java:198)
at org.apache.mina.transport.socket.nio.SocketIoProcessor.access$400(SocketIoProcessor.java:45)
at org.apache.mina.transport.socket.nio.SocketIoProcessor$Worker.run(SocketIoProcessor.java:485)
at org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:51)
at java.lang.Thread.run(Thread.java:745)
Startup of the Message Broker looks fine, no errors; the JDBC connection to the PostgreSQL DB is OK, and the Registry mount looks OK. After that, the error appears in wso2carbon.log several times per minute.
Does anyone have any ideas? As far as I know nothing has changed, and I don't know what it's trying to connect to.
This usually happens when clients connected to MB create a connection per message. A JMS connection is heavyweight, and it is not recommended to create a connection for each message. Therefore, please go through the client implementation and verify that connections are not created per message.
If by any chance you are using WSO2 ESB to publish/subscribe to queues/topics on MB, there is a connection-caching property, "transport.jms.CacheLevel", in the ESB's axis2.xml. Read the documentation and use the appropriate caching level for your use case.
There was a bug in ESB 4.8.1 that caused the connection-caching property to be ignored; it has been fixed in 4.9.0.
These are the possible cases I can think of with the given information. If you need more info, please provide a detailed use case.
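As an illustration of the first point, a rough sketch of a publisher that opens one JMS connection up front and reuses it for every message; the class is hypothetical, and only the javax.jms calls are standard:

    import javax.jms.Connection;
    import javax.jms.ConnectionFactory;
    import javax.jms.MessageProducer;
    import javax.jms.Queue;
    import javax.jms.Session;

    // Open the connection once, send many messages, close once.
    public class ReusingPublisher implements AutoCloseable {
        private final Connection connection;
        private final Session session;
        private final MessageProducer producer;

        public ReusingPublisher(ConnectionFactory factory, Queue queue) throws Exception {
            connection = factory.createConnection();  // expensive: do this once
            session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
            producer = session.createProducer(queue);
            connection.start();
        }

        public void send(String text) throws Exception {
            producer.send(session.createTextMessage(text));  // cheap: per message
        }

        @Override
        public void close() throws Exception {
            connection.close();  // also closes the session and producer
        }
    }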
Where can I go look to find the source of a connection reset error? Here are the details:
I have a Clojure applet that uses clj-http.client.
I need to track down what is causing the following error:
Feb 14, 2013 5:16:04 PM org.apache.http.impl.client.DefaultRequestDirector execute
INFO: I/O exception (java.net.SocketException) caught when processing request: Connection reset
Feb 14, 2013 5:16:04 PM org.apache.http.impl.client.DefaultRequestDirector execute
INFO: Retrying request
We have looked through the server's IIS logs, and cannot find any error indicating a connection reset. We've also looked at the server's Event Logs, and cannot find an error that matches the error I'm getting in the client. As a matter of fact, the IIS logs look OK. I can see my address verification "GET" requests right in the log.
It's just a guess, but I often get that error message when the web server is configured to respond to a different host name. If it is serving www.example.com/my/service and I open a connection to 1.2.3.4/my/service, then it hangs up with "connection reset".
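If it helps to narrow it down, one way to get more context on the client side is to install a custom retry handler in the underlying Apache HttpClient 4.x (which clj-http wraps) and log each reset before the silent retry. This is a diagnostic sketch, not a fix; DefaultHttpClient was current at the time of the question but is deprecated in later HttpClient versions:

    import java.io.IOException;
    import java.net.SocketException;
    import org.apache.http.client.HttpRequestRetryHandler;
    import org.apache.http.impl.client.DefaultHttpClient;
    import org.apache.http.protocol.HttpContext;

    public class LoggingRetryClient {
        public static DefaultHttpClient create() {
            DefaultHttpClient client = new DefaultHttpClient();
            client.setHttpRequestRetryHandler(new HttpRequestRetryHandler() {
                @Override
                public boolean retryRequest(IOException ex, int count, HttpContext ctx) {
                    // Log the reset that DefaultRequestDirector would
                    // otherwise swallow with "Retrying request".
                    if (ex instanceof SocketException) {
                        System.err.println("Attempt " + count + " failed: " + ex.getMessage());
                    }
                    return count <= 3;  // give up after three retries
                }
            });
            return client;
        }
    }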