Issue with awslogs service and CloudWatch Logs Agent on Ubuntu 16.04

On one of my AWS EC2 instances running Ubuntu 16.04, my /var/log/syslog is filling up with the following errors:
Jul 17 18:11:21 Mysql-Slave systemd[1]: Stopped The CloudWatch Logs agent.
Jul 17 18:11:21 Mysql-Slave systemd[1]: Started The CloudWatch Logs agent.
Jul 17 18:11:26 Mysql-Slave systemd[1]: awslogs.service: Main process exited, code=exited, status=255/n/a
Jul 17 18:11:26 Mysql-Slave systemd[1]: awslogs.service: Unit entered failed state.
Jul 17 18:11:26 Mysql-Slave systemd[1]: awslogs.service: Failed with result 'exit-code'.
Jul 17 18:11:26 Mysql-Slave systemd[1]: awslogs.service: Service hold-off time over, scheduling restart.
Jul 17 18:11:26 Mysql-Slave systemd[1]: Stopped The CloudWatch Logs agent.
Jul 17 18:11:26 Mysql-Slave systemd[1]: Started The CloudWatch Logs agent.
Jul 17 18:11:32 Mysql-Slave systemd[1]: awslogs.service: Main process exited, code=exited, status=255/n/a
Jul 17 18:11:32 Mysql-Slave systemd[1]: awslogs.service: Unit entered failed state.
Jul 17 18:11:32 Mysql-Slave systemd[1]: awslogs.service: Failed with result 'exit-code'.
Jul 17 18:11:32 Mysql-Slave systemd[1]: awslogs.service: Service hold-off time over, scheduling restart.
Jul 17 18:11:32 Mysql-Slave systemd[1]: Stopped The CloudWatch Logs agent.
Jul 17 18:11:32 Mysql-Slave systemd[1]: Started The CloudWatch Logs agent.
The /var/log/awslogs.log contains these messages:
database is locked
2018-07-17 20:59:01,055 - cwlogs.push - INFO - 27074 - MainThread - Missing or invalid value for use_gzip_http_content_encoding config. Defaulting to using gzip encoding.
2018-07-17 20:59:01,055 - cwlogs.push - INFO - 27074 - MainThread - Using default logging configuration.
database is locked
2018-07-17 20:59:06,549 - cwlogs.push - INFO - 27104 - MainThread - Missing or invalid value for use_gzip_http_content_encoding config. Defaulting to using gzip encoding.
2018-07-17 20:59:06,549 - cwlogs.push - INFO - 27104 - MainThread - Using default logging configuration.
database is locked
2018-07-17 20:59:12,054 - cwlogs.push - INFO - 27110 - MainThread - Missing or invalid value for use_gzip_http_content_encoding config. Defaulting to using gzip encoding.
2018-07-17 20:59:12,054 - cwlogs.push - INFO - 27110 - MainThread - Using default logging configuration.
Any pointers in troubleshooting this will be of great help.

A similar issue was posted at the following link: https://forums.aws.amazon.com/thread.jspa?threadID=165134
I did the following:
a) Stopped the awslogs service
$ sudo service awslogs stop ## Amazon Linux
OR
$ sudo service awslogsd stop ## Amazon Linux 2
b) Moved the agent-state file out of the way (deleting it also works; I renamed it in my case)
$ cd /var/awslogs/state; sudo mv agent-state agent-state.old ## Amazon Linux
OR
$ cd /var/lib/awslogs; sudo mv agent-state agent-state.old ## Amazon Linux 2
c) Restarted the awslogs service
$ sudo service awslogs start ## Amazon Linux
OR
$ sudo systemctl start awslogsd ## Amazon Linux 2
A new agent-state file was created as a result, and the errors mentioned in my post disappeared after this.
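The rename step above can be wrapped in a small helper so the old state file is kept rather than deleted. This is only a sketch: backup_agent_state is a made-up name, and the directory is passed in because it differs between distributions (/var/awslogs/state or /var/lib/awslogs in the steps above).

```shell
# Move the agent's state file aside so the agent creates a fresh one on start.
# $1: directory that holds the state file
# $2: state file name (assumed to default to agent-state)
backup_agent_state() {
    state_dir=$1
    state_name=${2:-agent-state}
    state_file="$state_dir/$state_name"

    if [ ! -f "$state_file" ]; then
        echo "no state file at $state_file" >&2
        return 1
    fi

    # Rename instead of deleting, so the old state can be restored if needed.
    mv -- "$state_file" "$state_file.old" && echo "$state_file.old"
}

# Typical use, as root and with the agent stopped:
#   service awslogs stop
#   backup_agent_state /var/awslogs/state
#   service awslogs start
```

Run it only while the agent is stopped, otherwise the running process may still hold the old file open.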

Please try the following command, depending on your Linux version:
sudo service awslogs start
If you are running Amazon Linux 2, try the command below instead:
sudo systemctl start awslogsd
It took me 2 hours to figure this out.

In my case, I found duplicate entries for some properties in the /etc/awslogs/awslogs.conf file.
(Not all were duplicates: some of the properties were commented out, and I had uncommented them to set values.)
Setting those values did not work. Then I scrolled to the bottom of the file
and found the following entries. I set the values on these properties instead, and it worked.
[/var/log/messages]
datetime_format = %b %d %H:%M:%S
file = /home/ec2-user/application.log
buffer_duration = 5000
log_stream_name = {instance_id}
initial_position = start_of_file
log_group_name = MyProject
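Duplicates like the ones described above are easy to miss in a long file. A rough way to spot them, assuming the file follows the usual [section] / key = value layout, is the awk sketch below (find_duplicate_keys is a made-up name):

```shell
# Print every key that is set more than once inside the same [section].
# Commented lines (# or ;) and blank lines are ignored.
find_duplicate_keys() {
    awk '
        /^[ \t]*([#;]|$)/ { next }            # skip comments and blank lines
        /^[ \t]*\[.*\][ \t]*$/ {              # section header: remember it
            section = $0
            gsub(/^[ \t]+|[ \t]+$/, "", section)
            next
        }
        index($0, "=") {                      # key = value line
            key = substr($0, 1, index($0, "=") - 1)
            gsub(/^[ \t]+|[ \t]+$/, "", key)
            if (seen[section, key]++)
                printf "%s %s (line %d)\n", section, key, NR
        }
    ' "$1"
}

# Usage: find_duplicate_keys /etc/awslogs/awslogs.conf
```

It prints nothing when every key appears once per section. It does not say which of the duplicate values the agent actually uses.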

Related

How to make Jenkins accessible on an AWS EC2 instance

I have installed OpenJDK 8 for Jenkins. Jenkins is installed and running, as shown by:
● jenkins.service - LSB: Start Jenkins at boot time
Loaded: loaded (/etc/init.d/jenkins; generated)
Active: active (exited) since Thu 2021-10-21 19:22:55 UTC; 20min ago
Docs: man:systemd-sysv-generator(8)
Process: 437 ExecStart=/etc/init.d/jenkins start (code=exited, status=0/SUCCESS)
Oct 21 19:22:52 ip-172-31-30-187 systemd[1]: Starting LSB: Start Jenkins at boot time...
Oct 21 19:22:53 ip-172-31-30-187 jenkins[437]: Correct java version found
Oct 21 19:22:53 ip-172-31-30-187 jenkins[437]: * Starting Jenkins Automation Server jenkins
Oct 21 19:22:54 ip-172-31-30-187 su[619]: (to jenkins) root on none
Oct 21 19:22:54 ip-172-31-30-187 su[619]: pam_unix(su-l:session): session opened for user jenkins by (u>
Oct 21 19:22:54 ip-172-31-30-187 su[619]: pam_unix(su-l:session): session closed for user jenkins
Oct 21 19:22:55 ip-172-31-30-187 jenkins[437]: ...done.
Oct 21 19:22:55 ip-172-31-30-187 systemd[1]: Started LSB: Start Jenkins at boot time.
However, browsing to serverip:8080 brings up nothing.
I used this tutorial: https://www.youtube.com/watch?v=B6K1IF-489M&t=36s
Port 8080 has also been added to the security group.
This problem was never solved, but creating a fresh EC2 instance and installing Jenkins by following that tutorial did the trick.

Can't stop ClamAV daemon in Linux

I am trying to stop the ClamAV service in Linux, but I am not able to.
I have installed ClamAV in a separate directory.
When I run the command below:
$ systemctl stop clamav-daemon
I get this error message:
Warning: Stopping clamav-daemon.service, but it can still be activated by:
clamav-daemon.socket
When running:
$ systemctl status clamav-daemon
I get:
clamav-daemon.service - Clam AntiVirus userspace daemon
Loaded: loaded (/usr/lib/systemd/system/clamav-daemon.service; disabled; vendor preset: disabled)
Active: active (running) since Wed 2020-04-29 13:23:33 IST; 7s ago
Docs: man:clamd(8)
man:clamd.conf(5)
https://www.clamav.net/documents/
Main PID: 32213 (clamd)
Tasks: 1
CGroup: /system.slice/clamav-daemon.service
└─32213 /usr/local/sbin/clamd --foreground=true
Any help will be appreciated. Thanks.
The $ sign in your command shows that you are logged in as a normal user, and as a normal user ClamAV won't stop. Run the following command with sudo instead.
It stops ClamAV only for now; it does not prevent the daemon from being started again later:
$ sudo systemctl stop clamav-daemon
To see the status
$ sudo systemctl status clamav-daemon
It will return:
● clamav-daemon.service - Clam AntiVirus userspace daemon
Loaded: loaded (/lib/systemd/system/clamav-daemon.service; disabled; vendor preset: enabled)
Drop-In: /etc/systemd/system/clamav-daemon.service.d
└─extend.conf
Active: inactive (dead)
Docs: man:clamd(8)
man:clamd.conf(5)
https://www.clamav.net/documents/
Aug 20 08:58:53 machine clamd[808]: Thu Aug 20 08:58:53 2020 -> HTML support enabled.
Aug 20 08:58:53 machine clamd[808]: Thu Aug 20 08:58:53 2020 -> XMLDOCS support enabled.
Aug 20 08:58:53 machine clamd[808]: Thu Aug 20 08:58:53 2020 -> HWP3 support enabled.
Aug 20 08:58:53 machine clamd[808]: Thu Aug 20 08:58:53 2020 -> Self checking every 3600 seconds.
Aug 20 09:58:53 machine clamd[808]: Thu Aug 20 09:58:53 2020 -> SelfCheck: Database status OK.
Aug 20 10:57:51 machine systemd[1]: Stopping Clam AntiVirus userspace daemon...
Aug 20 10:57:52 machine clamd[808]: Thu Aug 20 10:57:52 2020 -> --- Stopped at Thu Aug 20 10:57:52 >
Aug 20 10:57:52 machine clamd[808]: Thu Aug 20 10:57:52 2020 -> Socket file removed.
Aug 20 10:57:52 machine systemd[1]: clamav-daemon.service: Succeeded.
Aug 20 10:57:52 machine systemd[1]: Stopped Clam AntiVirus userspace daemon.
If you have enabled the ClamAV daemon to start automatically at boot (which creates a symbolic link to its unit file), remove that link so that ClamAV no longer starts automatically:
$ sudo systemctl disable clamav-daemon
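One more thing worth checking: the warning in the question says the service can still be activated by clamav-daemon.socket. That is systemd socket activation, so stopping only the service leaves the socket listening and systemd may start the daemon again on the next connection. The sketch below pulls the triggering units out of that warning text and prints a stop command covering all of them. The function names are made up, and it only prints the command rather than running it.

```shell
# Read systemctl's warning text on stdin and print the units listed after
# "can still be activated by:", one per line.
triggering_units() {
    awk 'found && NF { print $1 }
         /can still be activated by:/ { found = 1 }'
}

# Print (not run) a stop command for the service plus its triggering units.
# $1: service unit name; the warning text is read from stdin.
stop_command_for() {
    printf 'systemctl stop %s' "$1"
    triggering_units | while read -r unit; do printf ' %s' "$unit"; done
    printf '\n'
}
```

For the warning shown in the question this prints systemctl stop clamav-daemon.service clamav-daemon.socket, which can then be run with sudo. If the daemon should also stay off after a reboot, the socket unit probably needs sudo systemctl disable as well.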

Failed to start Redis In-Memory Data Store. Ubuntu 18.04

I am trying to install Redis on my AWS server, which runs Ubuntu 18.04. I am following the installation steps from a DigitalOcean article.
When I run the sudo systemctl status redis command, I get the error below.
[screenshot]
I tried editing the /etc/systemd/system/redis.service file and added Type=forking under the [Service] section, but I still get the same error.
Can anyone suggest how I can fix it?
Thanks in advance.
Following the same DigitalOcean tutorial, I found that Redis is actually running fine.
Run sudo systemctl restart redis.service and we get the following (showing "failed" on the last line):
● redis.service - Redis In-Memory Data Store
Loaded: loaded (/etc/systemd/system/redis.service; enabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Mon 2021-06-28 12:03:11 +03; 1min 0s ago
Process: 20428 ExecStart=/usr/local/bin/redis-server /etc/redis/redis.conf (code=exited, status=
Main PID: 20428 (code=exited, status=203/EXEC)
Jun 28 12:03:11 XYZ systemd[1]: redis.service: Service hold-off time over, scheduling restar
Jun 28 12:03:11 XYZ systemd[1]: redis.service: Scheduled restart job, restart counter is at
Jun 28 12:03:11 XYZ systemd[1]: Stopped Redis In-Memory Data Store.
Jun 28 12:03:11 XYZ systemd[1]: redis.service: Start request repeated too quickly.
Jun 28 12:03:11 XYZ systemd[1]: redis.service: Failed with result 'exit-code'.
Jun 28 12:03:11 XYZ systemd[1]: Failed to start Redis In-Memory Data Store.
But if we run sudo service redis-server status, we get the following (showing "running" on the third line):
● redis-server.service - Advanced key-value store
Loaded: loaded (/lib/systemd/system/redis-server.service; enabled; vendor preset: enabled)
Active: active (running) since Mon 2021-06-28 11:50:13 +03; 19min ago
Docs: http://redis.io/documentation,
man:redis-server(1)
Process: 19278 ExecStop=/bin/kill -s TERM $MAINPID (code=exited, status=0/SUCCESS)
Process: 19371 ExecStart=/usr/bin/redis-server /etc/redis/redis.conf (code=exited, status=0/SUCC
Main PID: 19382 (redis-server)
Tasks: 4 (limit: 4915)
CGroup: /system.slice/redis-server.service
└─19382 /usr/bin/redis-server 127.0.0.1:6379
Jun 28 11:50:13 XYZ systemd[1]: Starting Advanced key-value store...
Jun 28 11:50:13 XYZ systemd[1]: redis-server.service: Can't open PID file /var/run/redis/red
Jun 28 11:50:13 XYZ systemd[1]: Started Advanced key-value store.
After searching for hours, it seems to come down to some difference between systemctl and service and nothing more; the actual Redis server is running fine. Correct me if that's not the case. Here's the link: https://askubuntu.com/questions/903354/difference-between-systemctl-and-service-commands
You can also check that Redis is working by running redis-cli ping, which should print PONG.
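A note on the two outputs above: they describe two different units. redis.service is loaded from /etc/systemd/system/redis.service and starts /usr/local/bin/redis-server, while redis-server.service is loaded from /lib/systemd/system/redis-server.service and starts /usr/bin/redis-server. The failing one exits with status=203/EXEC, which systemd reports when it could not execute the ExecStart program at all. One possible reading, and it is only an assumption, is that the hand-written unit points at a binary that is not there while the packaged unit works. A quick sketch for checking that (both function names are made up):

```shell
# Print the executable named by the first ExecStart= line of a unit file.
# Leading systemd prefix characters such as "-" or "@" are stripped.
exec_start_binary() {
    sed -n 's/^ExecStart=[-@:+!]*\([^ ]*\).*/\1/p' "$1" | head -n 1
}

# Report whether that executable exists and is executable by the current user.
check_exec_start() {
    bin=$(exec_start_binary "$1")
    if [ -z "$bin" ]; then
        echo "no ExecStart found in $1"
        return 2
    elif [ -x "$bin" ]; then
        echo "ok: $bin"
    else
        echo "missing or not executable: $bin"
        return 1
    fi
}

# Usage: check_exec_start /etc/systemd/system/redis.service
```

If the binary is reported missing, either fix the path in the custom unit or drop that unit and use the packaged redis-server.service.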
I also ran into this problem and went back over my steps.
In the end, I found that I had entered the wrong command when setting ownership of /var/lib/redis, which left the redis account without access to /var/lib/redis.
sudo chown redis:redis /var/lib/redis
sudo systemctl restart redis
After running these, it succeeded.
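To confirm the ownership before and after the chown, something like this can help. It is a sketch that assumes GNU coreutils stat (the -c option), and the helper names are made up:

```shell
# Print "user:group" for a path (GNU coreutils stat).
owner_of() {
    stat -c '%U:%G' "$1"
}

# Succeed (exit 0) when the path is NOT owned by the expected user:group,
# i.e. when a chown would be needed.
needs_chown() {
    [ "$(owner_of "$1")" != "$2" ]
}

# Usage:
#   needs_chown /var/lib/redis redis:redis && echo "ownership is wrong"
```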

docker-machine create with digitalocean driver: ssh command error

I'm using Docker tools on Windows.
The create command was working perfectly last week, and I managed to create a number of machines on DigitalOcean. Today I tried again with no success. I repeated the same command with different regions and I always get the same result:
λ docker-machine create -d digitalocean --digitalocean-access-token=MYTOKEN --digitalocean-region=ams2 vmname
Running pre-create checks...
Creating machine...
(fernu) Creating SSH key...
(fernu) Creating Digital Ocean droplet...
(fernu) Waiting for IP address to be assigned to the Droplet...
Waiting for machine to be running, this may take a few minutes...
Detecting operating system of created instance...
Waiting for SSH to be available...
Detecting the provisioner...
Provisioning with ubuntu(systemd)...
Installing Docker...
Copying certs to the local machine directory...
Copying certs to the remote machine...
Setting Docker configuration on the remote daemon...
Error creating machine: Error running provisioning: ssh command error:
command : sudo systemctl -f start docker
err : exit status 1
output : Job for docker.service failed because the control process exited with error code. See "systemctl status docker.service" and "journalctl -xe" for details.
If I execute the suggested command:
root@fernu:~# systemctl status docker.service
● docker.service - Docker Application Container Engine
Loaded: loaded (/lib/systemd/system/docker.service; enabled; vendor preset: enabled)
Drop-In: /etc/systemd/system/docker.service.d
└─10-machine.conf
Active: inactive (dead) (Result: exit-code) since Fri 2017-06-30 20:56:13 UTC; 8min ago
Docs: https://docs.docker.com
Process: 4943 ExecStart=/usr/bin/docker daemon -H tcp://0.0.0.0:2376 -H unix:///var/run/docker.sock --storage-driver aufs --tlsverify --tlscacert /etc/docker/ca.pem --tlscert /etc/docker/server.pem --tlskey /etc/docker/server-key.pem --label provider=digitalocean (code=exited, status=1/FAILURE)
Main PID: 4943 (code=exited, status=1/FAILURE)
Jun 30 20:56:13 fernu systemd[1]: docker.service: Main process exited, code=exited, status=1/FAILURE
Jun 30 20:56:13 fernu systemd[1]: Failed to start Docker Application Container Engine.
Jun 30 20:56:13 fernu systemd[1]: docker.service: Unit entered failed state.
Jun 30 20:56:13 fernu systemd[1]: docker.service: Failed with result 'exit-code'.
Jun 30 20:56:13 fernu systemd[1]: docker.service: Service hold-off time over, scheduling restart.
Jun 30 20:56:13 fernu systemd[1]: Stopped Docker Application Container Engine.
Jun 30 20:56:13 fernu systemd[1]: docker.service: Start request repeated too quickly.
Jun 30 20:56:13 fernu systemd[1]: Failed to start Docker Application Container Engine.
Any help would be appreciated.
Update
It's working with Ubuntu 14.04:
--digitalocean-image=ubuntu-14-04-x64, so it seems like a problem with the default image (ubuntu-16-04-x64).
This seems to be hitting a lot of people. TL;DR: There is a bug in docker-machine v0.12.0 and this issue can be resolved by upgrading.
Logging in to the DigitalOcean instance and running journalctl -xe provides more information:
-- Unit docker.service has begun starting up.
Jul 07 20:03:52 docker-sandbox docker[4930]: `docker daemon` is not supported on Linux. Please run `do
Jul 07 20:03:52 docker-sandbox systemd[1]: docker.service: Main process exited, code=exited, status=1/
Jul 07 20:03:52 docker-sandbox systemd[1]: Failed to start Docker Application Container Engine.
-- Subject: Unit docker.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
The key here is the message that "docker daemon" is not supported on Linux. A bug in docker-machine's version comparison code caused an incorrect systemd unit file to be produced (located at /etc/systemd/system/docker.service.d/10-machine.conf) on certain versions of Ubuntu.
A fix has been committed and a new release (v0.12.1) was made.
You can grab the latest release at: https://github.com/docker/machine/releases/tag/v0.12.1
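To check whether an existing machine was provisioned with the broken drop-in, you can look at its ExecStart line. The status output in the question shows ExecStart=/usr/bin/docker daemon ..., whereas Docker releases since 1.12 ship the daemon as a separate dockerd binary. A minimal sketch (uses_legacy_daemon_command is a made-up name):

```shell
# Succeed if the unit or drop-in file still starts the daemon through the
# legacy "docker daemon" subcommand.
uses_legacy_daemon_command() {
    grep -Eq '^ExecStart=[^ ]*docker[[:space:]]+daemon([[:space:]]|$)' "$1"
}

# Usage, on the droplet:
#   uses_legacy_daemon_command /etc/systemd/system/docker.service.d/10-machine.conf \
#       && echo "drop-in uses the old command"
```

If it matches, the simplest route is probably to upgrade docker-machine and recreate or re-provision the machine rather than edit the file by hand.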

nginx cannot start on Red Hat server

I am trying to install nginx on RHEL 7, and it says the process doesn't start. The log follows.
Nov 13 06:36:42 ip-10-0-0-10.ec2.internal systemd[1]: Starting nginx - high performance web server...
Nov 13 06:36:42 ip-10-0-0-10.ec2.internal nginx[30974]: nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
Nov 13 06:36:42 ip-10-0-0-10.ec2.internal nginx[30974]: nginx: [emerg] open() "/mnt/nginx_logs/pubstore/access.log" failed (13: Permission denied)
Nov 13 06:36:42 ip-10-0-0-10.ec2.internal nginx[30974]: nginx: configuration file /etc/nginx/nginx.conf test failed
Nov 13 06:36:42 ip-10-0-0-10.ec2.internal systemd[1]: nginx.service: control process exited, code=exited status=1
Nov 13 06:36:42 ip-10-0-0-10.ec2.internal systemd[1]: Failed to start nginx - high performance web server.
Nov 13 06:36:42 ip-10-0-0-10.ec2.internal systemd[1]: Unit nginx.service entered failed state.
The permissions of the access log file are as follows. I have granted full permissions, but nginx still doesn't start.
-rwxrwxrwx. 1 nginx nginx 0 Nov 13 02:07 access.log
-rwxrwxrwx. 1 nginx nginx 0 Nov 13 02:07 error.log
The installation is done on a Puppet agent running on an Amazon EC2 instance.
This line:
Nov 13 06:36:42 ip-10-0-0-10.ec2.internal nginx[30974]: nginx: [emerg] open() "/mnt/nginx_logs/pubstore/access.log" failed (13: Permission denied)
tells you that the user nginx is running as does not have permission to write to the log file it's configured to use.
Since the logs are stored in a non-standard location, you will likely have to make sure that the directory you want to store logs in is writable by the same user that nginx runs as.
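Keep in mind that the file's own mode is not the whole story: every directory on the way to the file must also be searchable (x) by the nginx user. The sketch below lists the permissions of each directory from / down to the log directory. It assumes GNU stat and an absolute path, and list_path_permissions is a made-up name; namei -l from util-linux gives a similar listing.

```shell
# Print mode, owner, group and name of every directory leading to a file,
# starting at / and ending at the file's own directory.
# $1: absolute path of the file (the file itself need not exist)
list_path_permissions() {
    dir=$(dirname -- "$1")
    chain=
    # Collect the directory and each of its parents, deepest last.
    while [ "$dir" != "/" ] && [ "$dir" != "." ]; do
        chain="$dir
$chain"
        dir=$(dirname -- "$dir")
    done
    printf '/\n%s' "$chain" | while IFS= read -r d; do
        stat -c '%A %U %G %n' -- "$d"
    done
}

# Usage: list_path_permissions /mnt/nginx_logs/pubstore/access.log
```

Also, the trailing dot in -rwxrwxrwx. means the file carries an SELinux security context. If SELinux is enforcing on this RHEL 7 box, the label on /mnt/nginx_logs might be what blocks nginx even with 777 permissions; ls -Z shows the label. That is a guess from the listing, not something the log confirms.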