Zabbix Agent fails to start every 10 seconds - CentOS 7

I am using CentOS 7.
This is how I checked the log:
journalctl -xe
This is what I got from the log (the same entries repeat every 10 seconds):
Oct 02 10:19:51 lp01.localdomain systemd[1]: zabbix-agent.service holdoff time over, scheduling restart.
Oct 02 10:19:51 lp01.localdomain systemd[1]: Starting Zabbix Agent...
-- Subject: Unit zabbix-agent.service has begun start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit zabbix-agent.service has begun starting up.
Oct 02 10:19:51 lp01.localdomain zabbix_agentd[8985]: zabbix_agentd [8987]: cannot open "/var/log/zabbix/zabbix_agentd.log": [13] Permission denied
Oct 02 10:19:51 lp01.localdomain systemd[1]: PID file /run/zabbix/zabbix_agentd.pid not readable (yet?) after start.
Oct 02 10:19:51 lp01.localdomain systemd[1]: zabbix-agent.service never wrote its PID file. Failing.
Oct 02 10:19:51 lp01.localdomain systemd[1]: Failed to start Zabbix Agent.
-- Subject: Unit zabbix-agent.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit zabbix-agent.service has failed.
--
-- The result is failed.
Oct 02 10:19:51 lp01.localdomain systemd[1]: Unit zabbix-agent.service entered failed state.
Oct 02 10:19:51 lp01.localdomain systemd[1]: zabbix-agent.service failed.
So I checked the "/var/log/zabbix/zabbix_agentd.log" file first.
ll /var/log/zabbix/zabbix_agentd.log
But it said No such file or directory.
ls: cannot access /var/log/zabbix/zabbix_agentd.log: No such file or directory
Then I checked the "/run/zabbix/zabbix_agentd.pid" file.
ll /run/zabbix/zabbix_agentd.pid
It also said No such file or directory.
ls: cannot access /run/zabbix/zabbix_agentd.pid: No such file or directory
You have new mail in /var/spool/mail/root
I checked whether SELinux is running:
getenforce
It reported that SELinux is Disabled.
My questions are:
How can I start Zabbix?
If I can't start Zabbix, can I stop it from repeatedly trying and failing to start every 10 seconds?
Thank you.

Add write permission on the log directories /var/log/zabbix/ and /var/log/zabbix-agent/ (note that mode 707 makes them world-writable):
chmod 707 /var/log/zabbix/
chmod 707 /var/log/zabbix-agent/
Or change the owner of the directories instead:
chown zabbix:zabbix /var/log/zabbix/
chown zabbix:zabbix /var/log/zabbix-agent/
To stop the service so that it no longer keeps restarting:
systemctl stop zabbix-agent
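The journal reports "Permission denied" for a log file that does not exist yet, which suggests the agent cannot create it in its parent directory. As a rough sketch (the helper name first_existing_ancestor is made up for illustration), this finds the nearest directory that actually exists so you can inspect its owner and mode before running chmod or chown:

```shell
# first_existing_ancestor PATH: print the nearest existing path at or above PATH.
# Hypothetical helper, not part of Zabbix or systemd.
first_existing_ancestor() (
    p=$1
    while [ ! -e "$p" ] && [ "$p" != "/" ] && [ "$p" != "." ]; do
        p=$(dirname "$p")
    done
    printf '%s\n' "$p"
)

# Show owner and mode of the directory the agent would have to write into.
# The path is the one from the journal output above.
ls -ld "$(first_existing_ancestor /var/log/zabbix/zabbix_agentd.log)"
```

If /var/log/zabbix itself turns out to be missing, something like install -d -o zabbix -g zabbix -m 0755 /var/log/zabbix should create it with the right owner, assuming the agent runs as the zabbix user (as the chown commands above imply). The same check applies to /run/zabbix for the PID file.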

Related

Tomcat Server Loaded error

I tried to install and start Tomcat on my AWS EC2 instance.
However, I could not run the Tomcat server after installing it.
I followed this article:
https://blog.devops4me.com/aws-tutorial-how-to-install-tomcat-in-aws-ec2-install/
After finishing step 6 and reloading the Tomcat server, I tried to restart the Tomcat service as in step 7. However, my terminal showed the following:
Feb 14 20:17:07 ip-172-31-54-104.ec2.internal systemd[1]: tomcat.service: control process exited, code=exited status=203
Feb 14 20:17:07 ip-172-31-54-104.ec2.internal systemd[1]: Failed to start Tomcat Server.
Feb 14 20:17:07 ip-172-31-54-104.ec2.internal systemd[1]: Unit tomcat.service entered failed state.
Feb 14 20:17:07 ip-172-31-54-104.ec2.internal systemd[1]: tomcat.service failed.
Feb 14 20:17:10 ip-172-31-54-104.ec2.internal systemd[1]: [/etc/systemd/system/tomcat.service:22] Missing '='.
Feb 14 20:17:22 ip-172-31-54-104.ec2.internal systemd[1]: tomcat.service holdoff time over, scheduling restart.
Feb 14 20:17:22 ip-172-31-54-104.ec2.internal systemd[1]: tomcat.service failed to schedule restart job: Unit is not loaded properly: Bad message.
Feb 14 20:17:22 ip-172-31-54-104.ec2.internal systemd[1]: Unit tomcat.service entered failed state.
Feb 14 20:17:22 ip-172-31-54-104.ec2.internal systemd[1]: tomcat.service failed.
I don't really understand how this works. I searched for the error on the internet but still have no clue how to solve the problem. If anyone can give me a hint, that would be great. Thank you!
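The log line "[/etc/systemd/system/tomcat.service:22] Missing '='." means systemd could not parse line 22 of the unit file, and that is why the unit is "not loaded properly". A rough way to spot such lines is sketched below. The helper name find_bad_unit_lines is made up for illustration, and it only approximates systemd's own parser:

```shell
# find_bad_unit_lines FILE: print "N: text" for every line that is not blank,
# a comment, a [Section] header, a continuation line, or a Key=Value setting.
# Hypothetical helper; a simplified approximation of systemd's unit parser.
find_bad_unit_lines() {
    awk '
        { was_cont = cont; cont = ($0 ~ /\\$/) }
        was_cont            { next }   # continues the previous line
        /^[ \t]*$/          { next }   # blank line
        /^[ \t]*[#;]/       { next }   # comment
        /^\[.*\][ \t]*$/    { next }   # section header such as [Service]
        !/=/                { printf "%d: %s\n", NR, $0 }
    ' "$1"
}

unit=/etc/systemd/system/tomcat.service   # path taken from the log above
if [ -r "$unit" ]; then
    find_bad_unit_lines "$unit"
fi
```

After fixing the reported line, run systemctl daemon-reload before restarting the service. The earlier status=203 usually means systemd could not execute the ExecStart command at all, so the path in ExecStart is worth checking as well.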

Failed to start Apache after configuring SSL on Amazon Linux 2

I'm trying to install an SSL certificate on AWS EC2 (Amazon Linux 2). Apache started properly until I configured SSL following "Step 1: Enable TLS on the server" in https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/SSL-on-amazon-linux-ami.html
After making those changes, Apache cannot start at all:
Job for httpd.service failed because the control process exited with error code
I got the following log by running "journalctl -xe":
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit session-26.scope has begun starting up.
Jan 27 04:30:01 ip-172-31-18-58.ap-southeast-2.compute.internal CROND[10239]: (root) CMD (/usr/lib64/sa/sa1 1 1)
Jan 27 04:30:01 ip-172-31-18-58.ap-southeast-2.compute.internal systemd[1]: Removed slice User Slice of root.
-- Subject: Unit user-0.slice has finished shutting down
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit user-0.slice has finished shutting down.
Jan 27 04:30:01 ip-172-31-18-58.ap-southeast-2.compute.internal systemd[1]: Stopping User Slice of root.
-- Subject: Unit user-0.slice has begun shutting down
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit user-0.slice has begun shutting down.
Jan 27 04:30:47 ip-172-31-18-58.ap-southeast-2.compute.internal sshd[10245]: Received disconnect from 221.181.185.135 port 21146:11: [preauth]
Jan 27 04:30:47 ip-172-31-18-58.ap-southeast-2.compute.internal sshd[10245]: Disconnected from 221.181.185.135 port 21146 [preauth]
Jan 27 04:31:22 ip-172-31-18-58.ap-southeast-2.compute.internal dhclient[2893]: XMT: Solicit on eth0, interval 120170ms.
Jan 27 04:33:22 ip-172-31-18-58.ap-southeast-2.compute.internal dhclient[2893]: XMT: Solicit on eth0, interval 119290ms.
Jan 27 04:34:55 ip-172-31-18-58.ap-southeast-2.compute.internal sshd[10251]: Received disconnect from 218.93.208.150 port 29589:11: [preauth]
Jan 27 04:34:55 ip-172-31-18-58.ap-southeast-2.compute.internal sshd[10251]: Disconnected from 218.93.208.150 port 29589 [preauth]
Jan 27 04:35:17 ip-172-31-18-58.ap-southeast-2.compute.internal sudo[10253]: ec2-user : TTY=pts/2 ; PWD=/etc/pki/tls/certs ; USER=root ; COMMAND=/bin/systemct
Jan 27 04:35:17 ip-172-31-18-58.ap-southeast-2.compute.internal sudo[10253]: pam_unix(sudo:session): session opened for user root by ec2-user(uid=0)
Jan 27 04:35:17 ip-172-31-18-58.ap-southeast-2.compute.internal systemd[1]: Starting The Apache HTTP Server...
-- Subject: Unit httpd.service has begun start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit httpd.service has begun starting up.
Jan 27 04:35:17 ip-172-31-18-58.ap-southeast-2.compute.internal systemd[1]: httpd.service: main process exited, code=exited, status=1/FAILURE
Jan 27 04:35:17 ip-172-31-18-58.ap-southeast-2.compute.internal systemd[1]: Failed to start The Apache HTTP Server.
-- Subject: Unit httpd.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit httpd.service has failed.
--
-- The result is failed.
Jan 27 04:35:17 ip-172-31-18-58.ap-southeast-2.compute.internal systemd[1]: Unit httpd.service entered failed state.
Jan 27 04:35:17 ip-172-31-18-58.ap-southeast-2.compute.internal systemd[1]: httpd.service failed.
Jan 27 04:35:17 ip-172-31-18-58.ap-southeast-2.compute.internal sudo[10253]: pam_unix(sudo:session): session closed for user root
Jan 27 04:35:22 ip-172-31-18-58.ap-southeast-2.compute.internal dhclient[2893]: XMT: Solicit on eth0, interval 130780ms.
I have been stuck here for a long time. Can anyone help me, please? Thank you in advance.
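The journal output above only shows that httpd exited with status=1, not why. One thing that can be checked without guessing at the cause is whether the certificate and key files named in the SSL configuration actually exist. The sketch below assumes the configuration lives in /etc/httpd/conf.d/ssl.conf and that the paths are unquoted. The helper name missing_ssl_files is made up for illustration:

```shell
# missing_ssl_files CONF: print every SSLCertificate* path in CONF that does
# not exist on disk. Hypothetical helper; commented-out directives are ignored.
missing_ssl_files() {
    awk '$1 ~ /^SSLCertificate(File|KeyFile|ChainFile)$/ { print $2 }' "$1" |
    while IFS= read -r path; do
        [ -e "$path" ] || printf 'missing: %s\n' "$path"
    done
}

conf=/etc/httpd/conf.d/ssl.conf   # assumed location of the mod_ssl config
if [ -r "$conf" ]; then
    missing_ssl_files "$conf"
fi
```

Running sudo apachectl configtest and sudo journalctl -u httpd should also show httpd's own error message without the unrelated sshd and dhclient entries.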

Problem installing RStudio onto a GCP cluster

I'm trying to follow this tutorial but keep getting an error when I try to install RStudio on the main cluster. (see section Installing RStudio Server..., item 3).
When I run the line
$ sudo gdebi rstudio-server-1.2.1335-amd64.deb
The installation starts but then fails with
Jun 03 03:10:16 cluster-c141-m systemd[1]: Starting RStudio Server...
Jun 03 03:10:16 cluster-c141-m rserver[20313]: /usr/lib/rstudio-server/bin/rserver: error while loading shar…ectory
Jun 03 03:10:16 cluster-c141-m systemd[1]: rstudio-server.service: Control process exited, code=exited status=127
Jun 03 03:10:16 cluster-c141-m systemd[1]: Failed to start RStudio Server.
Jun 03 03:10:16 cluster-c141-m systemd[1]: rstudio-server.service: Unit entered failed state.
Jun 03 03:10:16 cluster-c141-m systemd[1]: rstudio-server.service: Failed with result 'exit-code'.
Thanks for any suggestions.
You are installing the rstudio-server for Debian 8. Install the Debian 9 version.
The tutorial could use an update.
There is an initialization action to take care of installation of rstudio (in case you want to create another cluster later): https://github.com/GoogleCloudPlatform/dataproc-initialization-actions/tree/master/rstudio
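The journal line is truncated ("error while loading shar…ectory"), so it does not show which library is involved. As a rough sketch, ldd can list the shared libraries the rserver binary cannot resolve, which is what you would expect to see if the package was built for a different Debian release. The helper name missing_libs is made up for illustration:

```shell
# missing_libs BINARY: print the shared libraries that ldd cannot resolve.
# Hypothetical helper; prints nothing if ldd is unavailable or all libs resolve.
missing_libs() {
    command -v ldd >/dev/null 2>&1 || return 0
    ldd "$1" 2>/dev/null | grep 'not found' || true
}

rserver=/usr/lib/rstudio-server/bin/rserver   # path taken from the journal output
if [ -x "$rserver" ]; then
    missing_libs "$rserver"
fi
```

Running systemctl status rstudio-server -l should also print the full, untruncated error line.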

starting celery workers as daemons

I am trying to set up celery to run in production. I have been following the instructions here:
https://www.linode.com/docs/development/python/task-queue-celery-rabbitmq/#start-the-workers-as-daemons
I am currently up to step #7, i.e. 'sudo systemctl start celeryd'. When I run this, I am told that celeryd.service has failed. I ran 'journalctl -xe' to find the log details, which I have copied below.
I'm very new to Celery, so I'm finding it difficult to interpret the log and figure out what's going wrong. Any help would be much appreciated. If more information is needed, please ask and I'll do my best to provide it.
Apr 05 10:44:47 user-admin systemd[6477]: celeryd.service: Failed to determine user credentials: No such process
Apr 05 10:44:47 user-admin systemd[6477]: celeryd.service: Failed at step USER spawning /bin/sh: No such process
-- Subject: Process /bin/sh could not be executed
-- Defined-By: systemd
-- Support: http://www.ubuntu.com/support
--
-- The process /bin/sh could not be executed and failed.
--
-- The error number returned by this process is 3.
Apr 05 10:44:47 user-admin systemd[1]: celeryd.service: Control process exited, code=exited status=217
Apr 05 10:44:47 user-admin systemd[1]: celeryd.service: Failed with result 'exit-code'.
Apr 05 10:44:47 user-admin systemd[1]: Failed to start Celery Service.
-- Subject: Unit celeryd.service has failed
-- Defined-By: systemd
-- Support: http://www.ubuntu.com/support
--
-- Unit celeryd.service has failed.
--
-- The result is RESULT.
Apr 05 10:44:47 user-admin sudo[6472]: pam_unix(sudo:session): session closed for user root
Apr 05 10:45:01 user-admin CRON[6481]: pam_unix(cron:session): session opened for user root by (uid=0)
Apr 05 10:45:01 user-admin CRON[6482]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
Apr 05 10:45:01 user-admin CRON[6481]: pam_unix(cron:session): session closed for user root
Apr 05 10:45:05 user-admin sudo[6485]: djangoadmin : TTY=pts/1 ; PWD=/var/log/celery ; USER=root ; COMMAND=/bin/journalctl -xe
Remove the /bin/sh -c from ExecStart, ExecStop and ExecReload (in your celeryd.service).
Assuming you have a virtual environment in /home/celery/venv, and Celery installed in this environment, then your ExecStart (and other Exec* lines) should look like:
ExecStart=/home/celery/venv/bin/celery multi start ${CELERYD_NODES} \
-A ${CELERY_APP} --pidfile=${CELERYD_PID_FILE} \
--logfile=${CELERYD_LOG_FILE} --loglevel=${CELERYD_LOG_LEVEL} \
${CELERYD_OPTS}
To create virtual environment do something like: python3 -m venv /home/celery/venv
If the celery user is created in a different path, then change the /home/celery in the code above to the appropriate "home" of the celery user...
UPDATE: If you used the same config file as on the Linode page, then you may use ExecStart=${CELERY_BIN} multi start...
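Separately, the journal lines "Failed to determine user credentials" and "Failed at step USER" (status=217) point at the User= setting of the unit, so it may be worth confirming that the account it names exists. A minimal sketch, assuming the unit file is /etc/systemd/system/celeryd.service. The helper names unit_user and user_exists are made up for illustration:

```shell
# unit_user FILE: print the value of the last User= line in a unit file
# (empty if there is none). Hypothetical helper.
unit_user() {
    sed -n 's/^User=//p' "$1" | tail -n 1
}

# user_exists NAME: succeed if the account can be resolved on this host.
user_exists() {
    id -u "$1" >/dev/null 2>&1
}

unit=/etc/systemd/system/celeryd.service   # assumed location of the unit file
if [ -r "$unit" ]; then
    user=$(unit_user "$unit")
    if [ -n "$user" ] && ! user_exists "$user"; then
        echo "User '$user' from $unit does not exist on this host"
    fi
fi
```

If the message is printed, either create that account or change User= (and Group=) to an existing one, then run systemctl daemon-reload and start the service again.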

docker-machine create with digitalocean driver: ssh command error

I'm using Docker tools on Windows.
The create command was working perfectly last week, and I managed to create a number of machines on DigitalOcean. Today I tried again with no success. I repeated the same command with different regions and always get the same result:
λ docker-machine create -d digitalocean --digitalocean-access-token=MYTOKEN --digitalocean-region=ams2 vmname
Running pre-create checks...
Creating machine...
(fernu) Creating SSH key...
(fernu) Creating Digital Ocean droplet...
(fernu) Waiting for IP address to be assigned to the Droplet...
Waiting for machine to be running, this may take a few minutes...
Detecting operating system of created instance...
Waiting for SSH to be available...
Detecting the provisioner...
Provisioning with ubuntu(systemd)...
Installing Docker...
Copying certs to the local machine directory...
Copying certs to the remote machine...
Setting Docker configuration on the remote daemon...
Error creating machine: Error running provisioning: ssh command error:
command : sudo systemctl -f start docker
err : exit status 1
output : Job for docker.service failed because the control process exited with error code. See "systemctl status docker.service" and "journalctl -xe" for details.
If I execute the suggested command:
root#fernu:~# systemctl status docker.service
● docker.service - Docker Application Container Engine
Loaded: loaded (/lib/systemd/system/docker.service; enabled; vendor preset: enabled)
Drop-In: /etc/systemd/system/docker.service.d
└─10-machine.conf
Active: inactive (dead) (Result: exit-code) since Fri 2017-06-30 20:56:13 UTC; 8min ago
Docs: https://docs.docker.com
Process: 4943 ExecStart=/usr/bin/docker daemon -H tcp://0.0.0.0:2376 -H unix:///var/run/docker.sock --storage-driver aufs --tlsverify --tlscacert /etc/docker/ca.pem --tlscert /etc/docker/server.pem --tlskey /etc/docker/server-key.pem --label provider=digitalocean (code=exited, status=1/FAILURE)
Main PID: 4943 (code=exited, status=1/FAILURE)
Jun 30 20:56:13 fernu systemd[1]: docker.service: Main process exited, code=exited, status=1/FAILURE
Jun 30 20:56:13 fernu systemd[1]: Failed to start Docker Application Container Engine.
Jun 30 20:56:13 fernu systemd[1]: docker.service: Unit entered failed state.
Jun 30 20:56:13 fernu systemd[1]: docker.service: Failed with result 'exit-code'.
Jun 30 20:56:13 fernu systemd[1]: docker.service: Service hold-off time over, scheduling restart.
Jun 30 20:56:13 fernu systemd[1]: Stopped Docker Application Container Engine.
Jun 30 20:56:13 fernu systemd[1]: docker.service: Start request repeated too quickly.
Jun 30 20:56:13 fernu systemd[1]: Failed to start Docker Application Container Engine.
Any help would be appreciated
Update
It's working with Ubuntu 14:
--digitalocean-image=ubuntu-14-04-x64, so it seems like a problem with the default image (ubuntu-16-04-x64).
This seems to be hitting a lot of people. TL;DR: There is a bug in docker-machine v0.12.0 and this issue can be resolved by upgrading.
Logging in to the DigitalOcean instance and running journalctl -xe provides more information:
-- Unit docker.service has begun starting up.
Jul 07 20:03:52 docker-sandbox docker[4930]: `docker daemon` is not supported on Linux. Please run `do
Jul 07 20:03:52 docker-sandbox systemd[1]: docker.service: Main process exited, code=exited, status=1/
Jul 07 20:03:52 docker-sandbox systemd[1]: Failed to start Docker Application Container Engine.
-- Subject: Unit docker.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
The key here is the message "`docker daemon` is not supported on Linux". A bug in docker-machine's version comparison code caused an incorrect systemd unit file to be produced (located at /etc/systemd/system/docker.service.d/10-machine.conf) on certain versions of Ubuntu.
A fix has been committed and a new release (v0.12.1) was made.
You can grab the latest release at: https://github.com/docker/machine/releases/tag/v0.12.1
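To check whether an already-created machine is affected, you can look at whether the generated drop-in still starts the daemon through the docker daemon subcommand. This is only a sketch, to be run on the droplet. The helper name uses_docker_daemon_subcommand is made up for illustration:

```shell
# uses_docker_daemon_subcommand FILE: succeed if an ExecStart= line in FILE
# launches ".../docker daemon ...". Hypothetical helper.
uses_docker_daemon_subcommand() {
    grep -Eq '^ExecStart=[^ ]*/docker[ ]+daemon( |$)' "$1"
}

# Drop-in path as given in the answer above.
dropin=/etc/systemd/system/docker.service.d/10-machine.conf
if [ -r "$dropin" ] && uses_docker_daemon_subcommand "$dropin"; then
    echo "$dropin still uses the 'docker daemon' form"
fi
```

On the client side, docker-machine version shows which release is installed, so you can confirm the upgrade before recreating the machine.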