Redis PUBSUB connection issue after idle period - C++

I am using nelikelov/redisclient version 0.5.0, and my code is the same as the PUBSUB example provided with the library. My application subscribes to a channel and receives messages.
What I am facing is that every Monday the application is unable to receive messages from Redis.
Is there any timeout that I should handle in case the connection remains idle over the weekend? Should I configure something extra in my application or in Redis to avoid this?

I'm not familiar with the client you're using, but Redis itself doesn't close idle connections (PubSub or not) by default and keeps them alive. You can verify that your Redis server is configured to maintain idle connections and keep them alive by examining the values of the timeout and tcp-keepalive directives (0 and 300 by default, respectively).
Other than the above, and given the periodic nature of the disconnects, I'd investigate the network settings of the client application server.
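If an intermediate firewall or NAT device turns out to be dropping the idle connection over the weekend, one client-side workaround is to enable TCP keepalive on the subscriber socket so the OS sends periodic probes while the channel is quiet. Below is a minimal Linux sketch using setsockopt; how you obtain the native socket descriptor (the fd parameter) from redisclient is an assumption that depends on that library's API.

```cpp
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

// Enable TCP keepalive on an already-connected socket descriptor.
// 'fd' is a placeholder: obtaining it from the redisclient connection
// depends on the library and is not shown here.
bool enableTcpKeepAlive(int fd)
{
    int on = 1;
    int idle = 60;      // seconds of inactivity before the first probe
    int interval = 30;  // seconds between probes
    int count = 5;      // unanswered probes before the connection is dropped

    return setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) == 0 &&
           setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle, sizeof(idle)) == 0 &&
           setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &interval, sizeof(interval)) == 0 &&
           setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &count, sizeof(count)) == 0;
}
```

With these values the kernel starts probing after one minute of silence, which should keep stateful middleboxes from expiring the connection during a long idle weekend.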

Related

Google Cloud - IoT Core - Config sent every hour reboots the device

I have an ESP8266 with a relay to turn a light on and off.
Everything works great, but IoT Core sends a configuration every hour, and that makes the device reboot. When the device starts again, there is no guarantee that the initial state is the desired one.
Is there any way to avoid this automatic configuration?
Thanks.
IoT Core sends the latest configuration to the device each time the device (re)connects, to make sure it is up to date, even if new configuration was sent to it while it was disconnected. This is expected IoT Core behaviour.
As mentioned in the other answer, what is probably happening is that your device is not sending data during that period of time, which makes the connection time out after one hour. The device tries to reconnect, receives the latest configuration, and that causes it to reboot.
You have many options to avoid this:
Implement keep-alive to keep the connection open (see the sketch after this list).
Refresh the JWT before it expires (this effectively restarts the timer for timeout too).
If you are not expecting configuration sent from IoT Core to the device, do not subscribe to the configuration mqtt topic.
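For the keep-alive option, the idea is to set the MQTT keep-alive interval low enough that the client sends PINGREQ packets while idle, so the broker never sees the connection as dead. Here is a minimal sketch using the Eclipse Paho MQTT C++ client purely for illustration; the ESP8266 itself would use a different MQTT library, and the server URI, client ID, and JWT handling are assumptions, not IoT Core specifics.

```cpp
#include <chrono>
#include <string>
#include <mqtt/async_client.h>

int main()
{
    // Placeholder URI and client ID; IoT Core expects its own formats here.
    mqtt::async_client client("ssl://mqtt.googleapis.com:8883", "<client-id>");

    mqtt::connect_options opts;
    // Send an MQTT PINGREQ at least every 60 seconds while idle,
    // so the connection is not dropped between telemetry messages.
    opts.set_keep_alive_interval(std::chrono::seconds(60));
    opts.set_user_name("unused");             // IoT Core ignores the username
    opts.set_password(std::string("<JWT>"));  // the JWT goes in the password field

    client.connect(opts)->wait();
    // ... publish / subscribe as usual ...
    client.disconnect()->wait();
    return 0;
}
```

The same keep-alive interval concept applies to whichever MQTT client runs on the device; it simply has to be shorter than the broker's idle timeout.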
I was able to solve it by changing the logic inside the device. Every hour the JWT needed to be refreshed, so that made IoT Core send a message to the device with the new status.

Got an error reading communication packets in Google Cloud SQL

Since 31st March I have been getting the following error in Google Cloud SQL:
Got an error reading communication packets.
I have been using Google Cloud SQL for 2 years, but have never faced such a problem.
I'm very worried about it.
This is the detailed error message:
textPayload: "2019-04-29T17:21:26.007574Z 203385 [Note] Aborted connection 203385 to db: {db_name} user: {db_username} host: 'cloudsqlproxy~{private ip}' (Got an error reading communication packets)"
While it is true that this error message often occurs after a maintenance period, it isn't necessarily a cause for concern, as this is known MySQL behavior.
Possible explanations for why this issue is happening are:
A large increase in connection requests to the instance, with the number of active connections increasing over a short period of time.
Freezing / unavailability of the instance, which can occur due to a burst of connections happening in a very short time interval. This freezing is observed to always happen with an increase in connection requests: the increase in connections overloads the instance, making it unavailable to respond to further connection requests until the number of connections decreases or the instance stabilizes.
The server was too busy to accept new connections.
There were high rates of previous connections that were not closed correctly.
The client terminated the connection abnormally.
The readTimeout setting in the MySQL driver is set too low.
In an excerpt from the documentation, it is stated that:
"There are many reasons why a connection attempt might not succeed. Network communication is never guaranteed, and the database might be temporarily unable to respond. Make sure your application handles broken or unsuccessful connections gracefully."
A low Cloud SQL Proxy version can also be the reason for such issues; upgrading to the latest version (v1.23.0) can be a troubleshooting step.
The IP address from which you are trying to connect may not be added to the Authorized Networks of the Cloud SQL instance.
Some possible workarounds for this issue, depending on your case, are the following:
If the issue is related to a high load, you could retry the connection using exponential backoff to avoid sending too many simultaneous connection requests. The best practice here is to exponentially back off your connection requests and add randomized backoff to avoid throttling and potentially overloading the instance. As a way to mitigate this issue in the future, connection requests should be spaced out to prevent overloading. Depending on how you are connecting to Cloud SQL, exponential backoff may already be used by default by certain ORM packages. (A sketch of this pattern appears after this list.)
If the issue is related to an accumulation of long-running inactive connections, you can check whether this is your case by running show full processlist on your database and looking for connections with a high Time, or connections where Command is Sleep.
If this is your case, you have a few possible options:
If you are not using a connection pool you could try to update the client application logic to properly close connections immediately at the end of an operation or use a connection pool to limit your connections lifetime. In particular, it is ideal to manage the connection count by using a connection pool. This way unused connections are recycled and also the number of simultaneous connection requests can be limited through the use of the maximum pool size parameter.
If you are using a connection pool, you could return idle connections to the pool immediately at the end of an operation and set a shorter timeout by adjusting the wait_timeout or interactive_timeout flag values. Setting the Cloud SQL wait_timeout flag to 600 seconds forces connections to be refreshed.
To check the network and port connectivity:
Step 1. Confirm TCP connectivity on port 3306 with tcptraceroute or netcat.
Step 2. If Step 1 succeeded, use the mysql client to check for timeouts or errors.
When the client might be terminating the connection abruptly, you could check the following:
If the MySQL client or the mysqld server receives a packet bigger than max_allowed_packet bytes, or the client receives a "packet too large" message, you could send smaller packets or increase the max_allowed_packet flag value on both client and server.
If there are transactions that are not being properly committed (using both "begin" and "commit"), the client application logic needs to be updated to properly commit the transaction.
There are several utilities that can be helpful here: if possible, install the mtr and tcpdump utilities to monitor the packets during these connection spikes.
It is strongly recommended to enable general_log in the database flags. Another suggestion is to also enable the slow_query database flag and output it to a file. Also have a look at this GitHub issue comment and go through the list of additional solutions proposed for this issue here.
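As an illustration of the retry-with-backoff suggestion above, here is a small, self-contained C++ sketch of exponential backoff with randomized (full) jitter. The tryConnect callback, attempt limit, and delay values are placeholders for illustration, not Cloud SQL APIs.

```cpp
#include <algorithm>
#include <chrono>
#include <functional>
#include <random>
#include <thread>

// Retry 'tryConnect' with exponential backoff and full jitter.
// 'tryConnect' is whatever opens your database connection and
// returns true on success (a placeholder, not a Cloud SQL API).
bool connectWithBackoff(const std::function<bool()>& tryConnect, int maxAttempts = 8)
{
    std::mt19937 rng{std::random_device{}()};
    auto delay = std::chrono::milliseconds{250};             // initial delay
    const auto maxDelay = std::chrono::milliseconds{30000};  // cap at 30 s

    for (int attempt = 1; attempt <= maxAttempts; ++attempt) {
        if (tryConnect())
            return true;

        // Full jitter: sleep a random amount in [0, delay] so that many
        // clients retrying at once do not hammer the instance in lock-step.
        std::uniform_int_distribution<long long> jitter(0, delay.count());
        std::this_thread::sleep_for(std::chrono::milliseconds(jitter(rng)));

        delay = std::min(delay * 2, maxDelay);  // exponential growth
    }
    return false;  // give up after maxAttempts
}
```

The cap and attempt count should be tuned so that retries back off quickly under sustained overload instead of adding to it.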
This error message indicates a connection issue, either because your application doesn't terminate connections properly or because of a network issue.
As suggested in these troubleshooting steps for MySQL or PostgreSQL instances from the GCP docs, you can start debugging by checking that you follow best practices for managing database connections.

Are there any known Linux inetd/"wait"-capable web servers with graceful idle shutdown?

I would like to start a web server on-demand as an inetd "tcp/wait" service which shuts itself down after a programmable period of inactivity.
Many web servers already support inetd "tcp/nowait" mode, but this mode has the disadvantage that a new process needs to be forked for every new connection. It is therefore slower and more resource-intensive than running a dedicated server daemon.
A web server supporting inetd's "tcp/wait" would only be launched by inetd for the first request, then serve any number of requests using the same server instance until no requests occurred for some period of idle time, in which case the server instance automatically terminates and lets inetd start it again once the next period of activity starts.
Such a tcp/wait inetd web server should have approximately the same efficiency as a dedicated web server (i.e. one running permanently) during times of activity. However, it will automatically shut down in times of inactivity, saving system resources.
Irregular "anti-demand"-driven shutdowns will also clean up any memory leaks from the web server and possibly associated FastCGI services (which would terminate together with the web server).
I know that it is already possible to use systemd's socket activation in combination with lighttpd's -i option to implement what I want.
However, I want a solution that also works without systemd, depending on nothing else than a running Internet superserver no matter how the latter one has been started (inetd/xinetd started by sysvinit, runit, manually, or systemd's socket activation replacing inetd/xinetd).
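For reference, the mechanics described above are simple to sketch: in inetd's "wait" mode the listening socket is handed to the service on file descriptor 0, the process accepts connections itself, and exiting hands the socket back to inetd. The following is a minimal, illustrative C++ skeleton of the idle-shutdown loop only; the request handling is a placeholder, not a real web server.

```cpp
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

int main()
{
    const int listenFd = 0;          // inetd "wait" mode: fd 0 is the listening socket
    const int idleTimeoutMs = 60000; // exit after 60 s without a new connection

    for (;;) {
        pollfd pfd{listenFd, POLLIN, 0};
        int ready = poll(&pfd, 1, idleTimeoutMs);

        if (ready == 0)
            return 0;                // idle timeout: exit, inetd resumes listening
        if (ready < 0)
            return 1;                // poll error

        int conn = accept(listenFd, nullptr, nullptr);
        if (conn < 0)
            continue;

        // Placeholder: a real server would parse the HTTP request here and
        // write a proper response before closing the connection.
        const char reply[] = "HTTP/1.0 503 Service Unavailable\r\n\r\n";
        write(conn, reply, sizeof(reply) - 1);
        close(conn);
    }
}
```

Any server built this way gets the desired behavior for free: it serves all traffic from one process during busy periods and disappears after the configured idle time.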

How to address RdKafka::ERR__TIMED_OUT and RdKafka::ERR__MSG_TIMED_OUT in librdkafka?

I am working with the C++ Kafka client librdkafka. Looking at the examples https://github.com/edenhill/librdkafka/blob/master/src-cpp/rdkafkacpp.h and https://github.com/edenhill/librdkafka/blob/master/examples/rdkafka_example.cpp, it seems that there is no explicit step for connecting to the broker. How do I handle reconnecting on these connection errors? How do I check the connection status?
librdkafka abstracts all broker connectivity from the application; it will always attempt to keep a connection to each known broker (either learned through metadata.broker.list or from the broker list returned by the initial bootstrap brokers).
Upon a connection error, librdkafka will attempt to connect again, forever.
If none of the brokers can be connected to, the ALL_BROKERS_DOWN event is triggered, but there is currently no corresponding event for when brokers come back online.
The application doesn't need to worry, though, since librdkafka takes care of all reconnects and message retransmissions in the background, and it will keep trying to get messages produced until either message.timeout.ms or message.send.max.retries is exceeded.
There's more information on this in the introduction guide:
https://github.com/edenhill/librdkafka/blob/master/INTRODUCTION.md
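If you want to observe these conditions from the application, the C++ API lets you register an event callback (which receives ERR__ALL_BROKERS_DOWN, among other events) and a delivery report callback (which receives ERR__MSG_TIMED_OUT for messages that exhausted message.timeout.ms). A minimal sketch that only logs these conditions, with the broker address as a placeholder:

```cpp
#include <iostream>
#include <string>
#include <librdkafka/rdkafkacpp.h>

class MyEventCb : public RdKafka::EventCb {
public:
    void event_cb(RdKafka::Event &event) override {
        if (event.type() == RdKafka::Event::EVENT_ERROR &&
            event.err() == RdKafka::ERR__ALL_BROKERS_DOWN) {
            // All brokers are currently unreachable; librdkafka keeps retrying.
            std::cerr << "All brokers down: " << event.str() << std::endl;
        }
    }
};

class MyDeliveryReportCb : public RdKafka::DeliveryReportCb {
public:
    void dr_cb(RdKafka::Message &message) override {
        if (message.err() == RdKafka::ERR__MSG_TIMED_OUT) {
            // message.timeout.ms / message.send.max.retries exhausted.
            std::cerr << "Message timed out: " << message.errstr() << std::endl;
        }
    }
};

int main() {
    std::string errstr;
    MyEventCb eventCb;
    MyDeliveryReportCb drCb;

    RdKafka::Conf *conf = RdKafka::Conf::create(RdKafka::Conf::CONF_GLOBAL);
    conf->set("metadata.broker.list", "localhost:9092", errstr);
    conf->set("event_cb", &eventCb, errstr);
    conf->set("dr_cb", &drCb, errstr);

    RdKafka::Producer *producer = RdKafka::Producer::create(conf, errstr);
    if (!producer) {
        std::cerr << "Failed to create producer: " << errstr << std::endl;
        return 1;
    }
    // ... produce() as in rdkafka_example, then poll() regularly to serve callbacks ...
    delete producer;
    delete conf;
    return 0;
}
```

Remember that the callbacks are only invoked from poll(), so the application must call it regularly.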

ActiveMQCPP connection.start() hangs up

I'm using ActiveMQ CPP 5.2.3 if it matters.
I have a JMS producer that connects to a JMS network of brokers using the failover transport.
When I call connection->start() it hangs up (see AMQ-2114).
If I skip connection->start() and call connection->createSession(), then that call blocks too.
The requirement is that my application will try forever to connect to broker(s).
Any suggestions/workarounds?
NOTE:
This is not a duplicate of the question linked here, since I'm talking about C++, and solutions such as an embedded broker or Spring are not available in C++.
This is normal when the connection is waiting for a transport to connect to the broker. The start method must send the client's ID info to the broker before any other operation, so if no connection is present it must block. You can set options on the failover transport, such as startupMaxReconnectAttempts, to control how long it will try to connect before reporting a failure. See the URI configuration page:
http://activemq.apache.org/cms/configuring.html
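As an illustration, the failover options are passed on the broker URI when creating the connection factory. This is a minimal sketch assuming a recent ActiveMQ-CPP release (the library initialization calls and option values shown are assumptions; older versions may differ):

```cpp
#include <activemq/library/ActiveMQCPP.h>
#include <activemq/core/ActiveMQConnectionFactory.h>
#include <cms/Connection.h>

int main()
{
    activemq::library::ActiveMQCPP::initializeLibrary();

    // Limit the number of attempts made on the *initial* connection so that
    // start() can report a failure instead of blocking indefinitely; later
    // reconnects after an established connection keep retrying as configured.
    activemq::core::ActiveMQConnectionFactory factory(
        "failover:(tcp://broker1:61616,tcp://broker2:61616)"
        "?startupMaxReconnectAttempts=10&initialReconnectDelay=100");

    cms::Connection* connection = factory.createConnection();
    connection->start();  // should fail rather than hang if no broker is reachable

    // ... create sessions, producers, etc. ...

    connection->close();
    delete connection;
    activemq::library::ActiveMQCPP::shutdownLibrary();
    return 0;
}
```

If the requirement is truly to retry forever, you can instead handle the startup failure in a loop in your own code, which keeps the rest of the application responsive while no broker is available.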