Repeated log statements in EC2 syslog - amazon-web-services

In my EC2 instance's syslog, I am seeing a lot of lines like:
Aug 19 07:42:01 ip-172-31-0-40 CRON[6465]: (root) CMD (/var/awslogs/bin/awslogs-nanny.sh > /dev/null 2>&1)
being printed to the CloudWatch logs (more than a few dozen per second). Is there any way to turn this off so that only minimal log messages are produced?

That log line is generated by the cron daemon itself; it writes to syslog by default on some distros. How to change this behavior is also distribution-dependent, but, for example, the top answer to https://unix.stackexchange.com/questions/212355/where-is-my-logfile-of-crontab explains how to configure cron to use its own log file on Debian-based systems. This would stop the messages from flooding your syslog.
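For reference, a minimal sketch of that approach on a Debian/Ubuntu-based image (the file name and the default rule contents are assumptions; check your own /etc/rsyslog.d/ layout before editing):
# /etc/rsyslog.d/50-default.conf
# 1. Give cron messages their own file by uncommenting:
cron.*                          /var/log/cron.log
# 2. Exclude the cron facility from the main syslog rule:
*.*;auth,authpriv.none,cron.none        -/var/log/syslog
Then restart rsyslog with sudo systemctl restart rsyslog. If the CloudWatch agent ships /var/log/syslog, the cron chatter will no longer appear in the CloudWatch log group.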

Related

How to enable logging in openconnect vpn

I am new to openconnect (https://github.com/openconnect/openconnect.git). Can someone please tell me how I can redirect all the logs to a file in openconnect, and how to change the log level?
Thanks in advance.
This is working for me. I am adding --timestamp for
Prepend a timestamp to each progress message
and --syslog for:
After tunnel is brought up, use syslog for further progress messages
export vpn_server="<YOUR IP ADDRESS>"
export vpn_username="<YOUR USERNAME>"
sudo openconnect --syslog --timestamp --servercert <SERVER-CERT-HASH> --protocol=anyconnect -u $vpn_username $vpn_server
Then in another terminal tab, tail the messages
tail -f /var/log/syslog
This bit was taken from https://askubuntu.com/a/1062368
More info about the other parameters is available at https://www.infradead.org/openconnect/manual.html
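If you want a dedicated log file rather than syslog, a simple sketch (the log path is just an example) is to raise the verbosity with -v and capture the output directly:
sudo openconnect --timestamp -v --protocol=anyconnect -u $vpn_username $vpn_server 2>&1 | sudo tee -a /var/log/openconnect.log
Repeating -v increases the log level further; --quiet goes the other way.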

GCP VM time sync issue after resuming from suspension (in both Linux and Windows)

A GCP VM doesn't update the system date/time after resuming from suspension.
It keeps the system date/time the same as it was at the moment of suspension. Because of this, my scripts that fetch gcloud resources are failing with an auth-token-expiry error.
According to the Google documentation https://cloud.google.com/compute/docs/instances/managing-instances#linux_1,
NTP is already configured, but on my VMs I get a "command not found" error for ntpq -p.
$ sudo timedatectl status
Local time: Wed 2020-08-05 15:31:34 EDT
Universal time: Wed 2020-08-05 19:31:34 UTC
RTC time: Wed 2020-08-05 19:31:34
Time zone: America/New_York (EDT, -0400)
System clock synchronized: yes
NTP service: inactive
RTC in local TZ: no
gcloud auth activate-service-account in my script is failing with the error below:
(gcloud.compute.instances.describe) There was a problem refreshing your current auth tokens: invalid_grant: Invalid JWT: Token must be a short-lived token (60 minutes) and in a reasonable timeframe. Check your iat and exp values in the JWT claim.
OS - Windows/Linux
After resuming, the hardware clock of the VM instance is set correctly because it gets the time from the hypervisor. You can check it with sudo hwclock.
The problem is with the time service of the operating system.
For Windows, it can take a few minutes to sync the system time with the time source. If you can't wait for the time-sync cycle to complete, you can log on to Windows and force time synchronization manually:
net stop W32Time
net start W32Time
w32tm /resync /force
In Linux, NTP cannot handle a time offset of more than 1000 seconds (see http://doc.ntp.org/4.1.0/ntpd.htm). Therefore you have to force time synchronization manually. There are various ways to do that (some of them are deprecated, but they may still work):
netdate timeserver1
ntpdate -u -s timeserver1
hwclock --hctosys
service ntp restart
systemctl restart ntp.service
If you run into this issue on Google Cloud Platform, note that their images replace ntpd and systemd-timesyncd with chronyd.
I had to use systemctl start chrony to get my time back in working order. I tried hwclock --hctosys, but it ignored time zones and therefore set the wrong time.
This happened because I was accidentally suspending the VM every minute. A permanent fix would be to modify the systemd unit definition and ask it to keep retrying to start the service.
The reason it stopped was this error: Can't synchronise: no selectable sources
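A sketch of both of those fixes on a chrony-based image (the service name is chrony on Debian/Ubuntu and chronyd on RHEL-style distros; the retry values are just examples):
sudo systemctl restart chrony      # bring the time service back up after resume
sudo chronyc makestep              # force an immediate clock step instead of slewing
# Ask systemd to keep retrying if chrony dies, e.g. right after a resume:
sudo systemctl edit chrony
# then add in the override:
[Service]
Restart=on-failure
RestartSec=30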

AWS-EC2 Redis-server RDB snapshot write error

I have a web application running on the Laravel 5.2 framework, with the session driver set to redis, and the following AWS setup.
Instance-1: runs the web application, with the Redis configuration in the .env file as follows:
Redis-host: aws-private-ip-of-instance-2
Redis-password: NULL
Redis-port: 6379
Instance-2: runs the Redis server with the following configuration:
Bind aws-private-ip-of-instance-2 and 127.0.0.1
Working directory /var/lib/redis with 775 permissions; owner and group are redis.
RDB snapshot name dump.rdb with 660 permissions; owner and group are redis.
NOTE: The AWS inbound rule for port 6379 is configured on Instance-2.
Everything works fine until Redis tries to write the data to the RDB file. The following error shows up on the front end:
MISCONF Redis is configured to save RDB snapshots, but is currently
not able to persist on disk. Commands that may modify the data set are
disabled. Please check Redis logs for details about the error.
Meanwhile, the Redis server logs contain the following:
4873:M 23 Sep 10:08:15.028 * 1 changes in 900 seconds. Saving...
4873:M 23 Sep 10:08:15.028 * Background saving started by pid 7392
7392:C 23 Sep 10:08:15.028 # Failed opening .rdb for saving: Read-only file system
4873:M 23 Sep 10:08:15.128 # Background saving error
Things I have tried:
Add vm.overcommit_memory = 1 to /etc/sysctl.conf, as suggested in the Redis administration blog.
Change the path of the dump.rdb file to the tmp folder and change its permissions to 777.
This other Stack Exchange thread might help, since you are using a custom /tmp dir for data:
The simple way to do this is to run systemctl edit redis. This will create an override drop-in file /etc/systemd/system/redis.service.d/override.conf, in which you can place your changes (and the proper section):
[Service]
ReadWriteDirectories=-/my/custom/data/dir
You may also create that directory and place files ending in .conf in it manually. But do not leave the directory empty, as this will disable the service.
In either case, run systemctl daemon-reload and you are ready to restart your service.
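A sketch of the manual variant described above, assuming a systemd-based distro (the custom data directory is the placeholder from the drop-in):
sudo mkdir -p /etc/systemd/system/redis.service.d
sudo tee /etc/systemd/system/redis.service.d/override.conf <<'EOF'
[Service]
ReadWriteDirectories=-/my/custom/data/dir
EOF
sudo systemctl daemon-reload
sudo systemctl restart redis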
Many threads also point to filesystem inconsistency as root cause. Since you are using EC2, check this AWS forums post:
To fix this, you will have to:
Stop the instance
Detach the root volume of your instance
Attach the volume as a data volume to any running Linux instance in the same availability zone
Perform a filesystem check (fsck) on the volume and fix any issues (see the sketch after this list)
Detach the volume and attach it back to your instance as its root volume
Boot the instance back up and verify that the volume mounts successfully
As a last resort, terminate the instance if possible.
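A rough sketch of the check itself once the volume is attached to the rescue instance (the device name below is an assumption; confirm it with lsblk first):
lsblk                      # identify the attached data volume, e.g. /dev/xvdf with partition /dev/xvdf1
sudo fsck -fy /dev/xvdf1   # check the filesystem and repair issues automatically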
Hope it helps!
Well, it is very embarrassing to post an answer to my own question, which was a really stupid mistake, but I hope new folks here learn from my mistake too.
The first thing I did was enable detailed logs for redis-server in the /etc/redis/redis.conf file by changing the loglevel option to debug.
Observing the logs, I realized that my Redis port 6379 was open to everyone on the internet.
From the logs I saw that someone else's server was connecting to my Redis server and making it a slave of theirs. And as my Redis server was configured so that a slave is read-only, any attempt to write to my redis-server threw the read-only error.
After putting a firewall in front of the Redis server port, I have not encountered this issue anymore.
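For anyone hitting the same thing, a sketch of locking the port down (the source address is a placeholder; on AWS the cleaner fix is a security group rule that only allows Instance-1 to reach port 6379):
# Allow only the web instance, then drop everything else on 6379
sudo iptables -A INPUT -p tcp --dport 6379 -s <instance-1-private-ip>/32 -j ACCEPT
sudo iptables -A INPUT -p tcp --dport 6379 -j DROP
# And in /etc/redis/redis.conf, require authentication and keep protected mode on:
requirepass <a-long-random-password>
protected-mode yes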

rsyslog stale file handle with catalina.out

Problem:
After deploying a microservice as a WAR via the AWS EBS Tomcat 7 container, I noticed that the log rotation, which occurs at the UTC day boundary, leaves a stale inode behind.
The log rotation is more of a copy-and-truncate, which leaves rsyslog, which is listening for changes to catalina.out, with a stale file handle. What's the best way to prevent stale inode descriptors? Should I specify a rollover policy in logback.xml, in logrotate, or ...?
Output of sudo lsof /var/log/tomcat7/catalina.out (sudo stat reports the latest inode):
rsyslogd 18970 root 2r REG 202,1 1250 134754 /var/log/tomcat7/catalina.out
but this doesn't match the log output of rsyslog in debug mode:
4638.114765354:7fc839b8c700: stream checking for file change on '/var/log/tomcat7/catalina.out', inode 135952/135952file 7 read 0 bytes
Workaround
Stop Tomcat, remove catalina.out, then restart Tomcat. This allowed rsyslog to continue streaming new records.
However, after a few hours rsyslog fails to stream newer log records to the rsyslog destination server. The debug log of rsyslog contains the same inode as the output of stat and lsof. If you run
sudo stat /var/log/tomcat7/catalina.out
rsyslog starts streaming again.
Have you noticed rsyslog stop streaming intermittently outside of the log rollover use case?
Why would a sudo stat /var/log/tomcat7/catalina.out cause rsyslog to stream again?
I also have a problem with rsyslog not sending new records as soon as logrotate rotates catalina.out. Issuing a stat doesn't quite cut it for me, as the problem is that Tomcat stops writing to catalina.out (!). After perusing various forums and blogs, I was able to address this through the following steps:
Make sure that $WorkDirectory is defined in the rsyslog configuration; this allows rsyslog to write the "state file" for catalina.out (or for any other log file(s) it watches, for that matter). A sketch of this is shown after the logrotate configuration below.
As noted on Loggly's blog, you need to stop rsyslog, delete this state file, and then restart rsyslog in a postrotate entry.
My logrotate configuration for catalina.out (the state files are in /var/lib/rsyslog):
/opt/tomcat/logs/catalina.out {
    rotate 7
    size 50M
    notifempty
    missingok
    postrotate
        service rsyslog stop
        rm /var/lib/rsyslog/*
        service rsyslog start
    endscript
}
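For completeness, a sketch of the matching rsyslog side mentioned in the first step (rsyslog v8 RainerScript syntax; the file name, tag and facility are assumptions):
# /etc/rsyslog.d/30-catalina.conf
# $WorkDirectory is where imfile keeps its per-file state
$WorkDirectory /var/lib/rsyslog
module(load="imfile")
input(type="imfile"
      File="/opt/tomcat/logs/catalina.out"
      Tag="catalina:"
      Severity="info"
      Facility="local0")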

How to figure out why ssh session does not exit sometimes?

I have a C++ application that uses ssh to summon a connection to the server. I find that sometimes the ssh session is left lying around long after the command to summon the server has exited. Looking at the CentOS 4 man page for ssh, I see the following:
The session terminates when the command or shell on the remote machine
exits and all X11 and TCP/IP connections have been closed. The exit
status of the remote program is returned as the exit status of ssh.
I see that the command has exited, so I imagine not all of the X11 and TCP/IP connections have been closed. How can I figure out which of these ssh is waiting for, so that I can fix my C++ application to clean up whatever is being left behind that keeps the ssh session open?
I also wonder why this failure only occurs some of the time and not on every invocation; it seems to happen approximately 50% of the time. What could my C++ application be leaving around to trigger this?
More background: the server is a daemon; when launched, it forks and the parent exits, leaving the child running. The client summons it using:
popen("ssh -n -o ConnectTimeout=300 user@host \"serverApp argsHere\""
      " 2>&1 < /dev/null", "r")
Use libssh or libssh2, rather than calling popen(3) from C only to invoke ssh(1), which is itself another C program. If you want my personal experience, I'd say try libssh2; I've used it in a C++ program and it works.
I found some hints here:
http://www.snailbook.com/faq/background-jobs.auto.html
This problem is usually due to a feature of the OpenSSH server. When writing an SSH server, you have to answer the question, "When should the server close the SSH connection?" The obvious answer might seem to be: close it when the server-side user program started by client request (shell or remote command) exits. However, it's actually a bit more complicated; this simple strategy allows a race condition which can cause data loss (see the explanation below). To avoid this problem, sshd instead waits until it encounters end-of-file (eof) on the pipes connecting to the stdout and stderr of the user program.
@sienkiew: If you really want to execute a command or script via ssh and exit, have a look at the daemon tool from the libslack package. (Similar tools that can detach a command from its standard streams are screen, tmux or detach.)
To inspect the stdin, stdout & stderr of the command executed via ssh on the command line, you can, for example, use lsof.
# sample code to figure out why ssh session does not exit
# sleep keeps its stdout open, so sshd only sees EOF after command completion
ssh localhost 'sleep 10 &' # blocks
ssh localhost 'sleep 10 1>&- &' # does not block
# the lsof variants show which descriptors the backgrounded sleep still holds open
ssh localhost 'sleep 10 & lsof -p ${!}'
ssh localhost 'sleep 10 1>&- & lsof -p ${!}'
ssh localhost 'sleep 10 1>/dev/null & lsof -p ${!}'
ssh localhost 'sleep 10 1>/dev/null 2>&1 & lsof -p ${!}'
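Applying that to the original popen() command: the summoned daemon inherits ssh's stdout/stderr, so sshd never sees EOF on its pipes and the session lingers. A sketch of the fix, keeping the question's serverApp/argsHere placeholders, is to redirect the daemon's output on the remote side (the same redirection can go inside the string passed to popen()):
ssh -n -o ConnectTimeout=300 user@host 'serverApp argsHere > /dev/null 2>&1 < /dev/null'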