GitHub self-hosted runner fails to run as a service on RH Linux - github-actions-self-hosted-runners

I installed a GitHub self-hosted runner on a RH Linux EC2 instance. It runs fine in interactive mode: ./run.sh
But when trying to run it as a service (sudo ./svc.sh start), it fails to start.
Active: failed
runsvc.sh (code=exited, status=203/EXEC)
Any ideas on how to get around this?

I am running an Oracle Linux 8 instance on Oracle Cloud and had the same issue, with exactly the same error output. Very likely you have SELinux running, which is blocking your service from starting. This command helped me solve the issue:
chcon system_u:object_r:usr_t:s0 runsvc.sh
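To verify that SELinux really is what's blocking the unit (and to see the denial itself), here is a quick sketch; it assumes auditd is installed and that you run it from the runner's install directory:
# Should print "Enforcing" if SELinux is active
getenforce
# Show the current SELinux label on the script the service executes
ls -Z ./runsvc.sh
# List recent SELinux denials; look for entries mentioning runsvc.sh
sudo ausearch -m avc -ts recent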

After running the chcon command, I could start the service, but it crashed instantly with status=1/failure.
So I ran the chcon command on run.sh as well, and in the systemd service file I put run.sh instead of svc.sh start.
Now it starts without problems and I can get the status of the service (the steps are sketched below the status output).
Active: active (running) since Fri 2023-02-10 11:59:14 CET; 6s ago
Main PID: 2065992 (run.sh)
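Put together, the workaround amounts to something like this sketch. The unit name shown is a placeholder (svc.sh generates it from your org/repo and runner name) and the runner directory /home/ec2-user/actions-runner is an assumption, so adjust both:
# Relabel the scripts SELinux was denying (directory is an assumption)
cd /home/ec2-user/actions-runner
sudo chcon system_u:object_r:usr_t:s0 ./runsvc.sh ./run.sh
# Open the generated unit and point ExecStart at run.sh, as described above
sudo systemctl edit --full 'actions.runner.<org>-<repo>.<name>.service'
# Reload systemd and restart the service
sudo systemctl daemon-reload
sudo systemctl restart 'actions.runner.<org>-<repo>.<name>.service'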

Related

Can not find NVIDIA driver after stop and start a deep learning VM

[TL;DR] First, wait for a couple of minutes and check if the Nvidia driver starts to work properly. If not, stop and start the VM instance again.
I created a Deep Learning VM (Google Click to Deploy) with an A100 GPU. After stopping and starting the instance, when I run nvidia-smi I get the following error message:
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.
But if I type which nvidia-smi, I get
/usr/bin/nvidia-smi
It seems the driver is there but cannot be used. Can someone suggest how to enable the NVIDIA driver after stopping and starting a deep learning VM? The first time I created and opened the instance, the driver was installed automatically.
The system information is (using uname -m && cat /etc/*release):
x86_64
PRETTY_NAME="Debian GNU/Linux 10 (buster)"
NAME="Debian GNU/Linux"
VERSION_ID="10"
VERSION="10 (buster)"
VERSION_CODENAME=buster
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"
I tried the installation script from GCP. First I ran
curl https://raw.githubusercontent.com/GoogleCloudPlatform/compute-gpu-installation/main/linux/install_gpu_driver.py --output install_gpu_driver.py
and then ran
sudo python3 install_gpu_driver.py
which gave the following message:
Executing: which nvidia-smi
/usr/bin/nvidia-smi
Already installed.
After posting the question, the NVIDIA driver started to work properly after a couple of minutes of waiting.
In the following days, I tried stopping/starting the VM instance multiple times. Sometimes nvidia-smi works right away; sometimes it still does not work after more than 20 minutes of waiting. My current best answer to this question is to first wait for several minutes. If nvidia-smi still does not work, stop and start the instance again.
What worked for me (though I'm not sure it will hold up across future restarts) was to remove all drivers with sudo apt remove --purge '*nvidia*', and then force the installation with sudo python3 install_gpu_driver.py.
In install_gpu_driver.py, change line 230, inside the check_driver_installed function, to return False so the installer does not skip the reinstall. Then run the script.
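For reference, the full force-reinstall flow described above boils down to the following commands, collected in order (no guarantee it survives the next stop/start):
# Remove every existing NVIDIA driver package
sudo apt remove --purge '*nvidia*'
# Fetch the GCP installer again and run it
curl https://raw.githubusercontent.com/GoogleCloudPlatform/compute-gpu-installation/main/linux/install_gpu_driver.py --output install_gpu_driver.py
sudo python3 install_gpu_driver.py
# Check that the driver can talk to the GPU again
nvidia-smi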
Anyone using Docker may also face the error docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]] and have to reinstall Docker too. This thread helped me.

How to run bash script on Google Cloud VM?

I found this auto shutdown script for VM instances on GCP and tried to add that into the VM's metadata.
Here's a link to that shutdown script.
The config sets it so that after 20 mins the idle VM will shut down, but it's been a few hours and it never shut down. Are there any more steps I have to do after adding the script to the VM metadata?
The metadata script I added:
Startup scripts are executed while the VM starts. If you execute your "shutdown script" at boot, there will be nothing for it to do. Additionally, for this to work a proper service has to be created, and that service will use the script to detect idling and shut down the VM.
So even if the main script ashutdown was executed at boot and there was no idling, it did nothing. And since the service wasn't there to run it again, your instance will run indefinitely.
For this to work you need to install everything on the VM in question:
Download all three files to some directory on your VM, for example with curl:
curl -LJO https://raw.githubusercontent.com/GoogleCloudPlatform/ai-platform-samples/master/notebooks/tools/auto-shutdown/ashutdown
curl -LJO https://raw.githubusercontent.com/GoogleCloudPlatform/ai-platform-samples/master/notebooks/tools/auto-shutdown/ashutdown.service
curl -LJO https://raw.githubusercontent.com/GoogleCloudPlatform/ai-platform-samples/master/notebooks/tools/auto-shutdown/install.sh
Make install.sh executable: sudo chmod +x install.sh
Run it: sudo ./install.sh
This should install & run the ashutdown service in your system.
You can check if it's running with service ashutdown status.
These instructions are for a Debian system, so if you're running CentOS or another flavour of Linux they may differ.
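On a systemd-based image you can also check the installed service directly; a small sketch, assuming install.sh registered a unit named ashutdown:
# Is the auto-shutdown service loaded and running?
sudo systemctl status ashutdown
# Make sure it comes back after a reboot
sudo systemctl enable ashutdown
# Watch what it is doing
sudo journalctl -u ashutdown -f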

How can I use kubernetes cluster in Windows WSL2?

I am trying to create a cluster by following this article in my WSL Ubuntu, but it returns some errors.
Errors:
yusuf@DESKTOP-QK5VI8R:~/aws/kubs2$ sudo systemctl daemon-reload
System has not been booted with systemd as init system (PID 1). Can't operate.
Failed to connect to bus: Host is down
yusuf@DESKTOP-QK5VI8R:~/aws/kubs2$ sudo systemctl restart kubelet
System has not been booted with systemd as init system (PID 1). Can't operate.
Failed to connect to bus: Host is down
yusuf@DESKTOP-QK5VI8R:~/aws/kubs2$ sudo kubeadm init --pod-network-cidr=10.244.0.0/16
[init] Using Kubernetes version: v1.21.1
[preflight] Running pre-flight checks
[WARNING Service-Docker]: docker service is not enabled, please run 'systemctl enable docker.service'
[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
error execution phase preflight: [preflight] Some fatal errors occurred:
[ERROR Port-6443]: Port 6443 is in use
[ERROR Service-Docker]: docker service is not active, please run 'systemctl start docker.service'
[ERROR Swap]: running with swap on is not supported. Please disable swap
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher
I don't understand the reason. When I use sudo systemctl restart kubelet, an error like this occurs:
docker service is not enabled, please run 'systemctl enable docker.service'
When I use:
yusuf@DESKTOP-QK5VI8R:~/aws/kubs2$ systemctl enable docker.service
Failed to enable unit, unit docker.service does not exist.
But I still have Docker images running:
What is wrong when creating a Kubernetes cluster in WSL? Is there any good tutorial for creating a cluster in WSL?
The tutorial you're following is designed for cloud virtual machines running Linux (this is important, since WSL works a bit differently).
For example, systemd is not present in WSL; support for the behaviour you're facing is currently in development.
What you need is to follow a tutorial designated for WSL (WSL2 in this case). Also make sure Docker is set up on the Windows machine and shares its features with WSL through the WSL integration. See the Kubernetes on Windows desktop tutorial (it uses KinD or minikube, which is enough for development and testing).
There is also a part about enabling systemd, which can potentially resolve your issue from the state you are in (I didn't test this, as I don't have a Windows machine).
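On recent WSL builds (roughly 0.67.6 and newer, i.e. the Microsoft Store version of WSL), enabling systemd inside the distro looks like the sketch below; I haven't verified it on your exact setup, so treat it as an assumption:
# Inside the Ubuntu distro: turn on systemd at boot
sudo tee -a /etc/wsl.conf <<'EOF'
[boot]
systemd=true
EOF
# From Windows (PowerShell or CMD): restart WSL so the setting takes effect
wsl --shutdown
# Back in the distro, systemctl should now be able to talk to PID 1
systemctl is-system-running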

How do I restart my cron service on my Ubuntu AWS instance? [duplicate]

Do I have to restart cron after changing the crontab file?
No.
From the cron man page:
...cron will then examine the modification time on all crontabs
and reload those which have changed. Thus cron need not be restarted
whenever a crontab file is modified
But if you just want to make sure it's done anyway,
sudo service cron reload
or
/etc/init.d/cron reload
On CentOS with cPanel sudo /etc/init.d/crond reload does the trick.
On CentOS7: sudo systemctl start crond.service
I had a similar issue on a 16.04 DigitalOcean VPS. If you are changing crontabs, make sure to run
sudo service cron restart
Commands for RHEL/Fedora/CentOS/Scientific Linux user
Start cron service
To start the cron service, use: /etc/init.d/crond start
OR RHEL/CentOS 5.x/6.x user: service crond start
OR RHEL/CentOS Linux 7.x user: systemctl start crond.service
Stop cron service
To stop the cron service, use: /etc/init.d/crond stop
OR RHEL/CentOS 5.x/6.x user: service crond stop
OR RHEL/CentOS Linux 7.x user: systemctl stop crond.service
Restart cron service
To restart the cron service, use: /etc/init.d/crond restart
OR RHEL/CentOS 5.x/6.x user: service crond restart
OR RHEL/CentOS Linux 7.x user: systemctl restart crond.service
Commands for Ubuntu/Mint/Debian based Linux distro
Debian Start cron service
To start the cron service, use: /etc/init.d/cron start
OR sudo /etc/init.d/cron start
OR sudo service cron start
Debian Stop cron service
To stop the cron service, use: /etc/init.d/cron stop
OR sudo /etc/init.d/cron stop
OR sudo service cron stop
Debian Restart cron service
To restart the cron service, use: /etc/init.d/cron restart
OR sudo /etc/init.d/cron restart
OR sudo service cron restart
Source: https://www.cyberciti.biz/faq/howto-linux-unix-start-restart-cron/
Depending on the distribution, using "cron reload" might do nothing. Here is a snippet out of init.d/cron (Debian Squeeze):
reload|force-reload) log_daemon_msg "Reloading configuration files for periodic command scheduler" "cron"
# cron reloads automatically
log_end_msg 0
;;
Some developer/maintainer relied on it reloading, but it doesn't, and in this case there is no way to force a reload. I'm generating my crontab files as part of a deploy, and unless the length of the file somehow changes, the changes are not reloaded.
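Since cron decides by modification time (per the man page quoted earlier), a deploy that generates crontab files can nudge it explicitly. A hedged sketch, where deploy/myapp.cron and /etc/cron.d/myapp are hypothetical paths:
# Install the generated file, then bump its mtime so cron re-reads it
sudo install -m 644 deploy/myapp.cron /etc/cron.d/myapp
sudo touch /etc/cron.d/myapp
# For a per-user crontab, loading it through crontab notifies cron as well
crontab deploy/myapp.cron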
Try this one for CentOS 7: service crond reload
If the file /var/spool/cron/crontabs/root is edited via an SFTP client, service cron restart is needed; reloading the service does not work.
If the file /var/spool/cron/crontabs/root is edited from the Linux console (nano, mc), a restart is NOT needed.
If cron is edited via crontab -e, a restart is NOT needed.
Try this out: sudo cron reload
It works for me on Ubuntu 12.10.
Try this: service crond restart. Note that it's crond, not cron.
There are instances wherein cron needs to be restarted in order for a startup script to work. There's nothing wrong with restarting cron.
sudo service cron restart
On CentOS (my version is 6.5), when editing the crontab you must close the editor for your changes to be reflected in cron.
crontab -e
After that command you can see that a new entry appears in /var/log/cron
Sep 24 10:44:26 ***** crontab[17216]: (*****) BEGIN EDIT (*****)
But merely saving in the crontab editor after making some changes does not work. You must leave the editor for the changes to be reflected in cron. After exiting, a new entry appears in the log:
Sep 24 10:47:58 ***** crontab[17216]: (*****) END EDIT (*****)
From this point the changes you made are visible to cron.
Ubuntu 18.04
* Usage: /etc/init.d/cron {start|stop|status|restart|reload|force-reload}

Stop detached strongloop application

I installed LoopBack on my server (Ubuntu), then created an app and used the command slc run to run it... everything works as expected.
Now I have one question and also one issue I am facing:
The question: I need to use the slc run command but keep the app "alive" after I close the terminal. For that I used the --detach option and it works. What I wanted to know is whether the --detach option is best practice or whether I need to do it in a different way.
The issue: after I use --detach I don't really know how to stop it. Is there a command that I can use to stop the process from running?
To stop a --detached process, go to the same directory it was run from and do slc runctl stop. There are a number of runctl commands, but stop is probably the one you are most interested in.
Best practice is a longer answer. The short version is: don't ever use --detach, and do use an init script to run your app and keep it running (probably Upstart, since you're on Ubuntu).
Using slc run
If you want to run slc run as an Upstart job you can install strong-service-install with npm install -g strong-service-install. This will give you sl-svc-install, a utility for creating Upstart and systemd services.
You'll end up running something like sudo sl-svc-install --name my-app --user youruser --cwd /path/to/app/root -- slc run . which should create an Upstart job named my-app that runs your app as your uid from the app's root. Your app's stdout/stderr will be sent to /var/log/upstart/my-app.log. If you are using a version of Ubuntu older than 12.04 you'll need to specify --upstart 0.6, and your logs will end up going to syslog instead.
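Once the job is installed you manage it like any other Upstart job; a sketch, assuming the name my-app from the example above:
# Start, check and stop the job
sudo start my-app
sudo status my-app
sudo stop my-app
# Follow the app's output
tail -f /var/log/upstart/my-app.log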
Using slc pm
Another, possibly easier, route is to use slc pm, which operates at a level above slc run and happens to be easier to install as an OS service. For this route you already have everything installed. Run sudo slc pm-install and a strong-pm Upstart service will be installed, along with a strong-pm user to run it as, with a $HOME of /var/lib/strong-pm.
Where the PM approach gets slightly more complicated is that you have to deploy your app to it. Most likely this is just a matter of going to your app root and running slc deploy http://localhost:8701/, but the specifics will depend on your app. You can configure environment variables for your app, deploy new versions, and your logs will show up in /var/log/upstart/strong-pm.log.
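A minimal deploy round-trip against the local PM might look like the sketch below; slc deploy and slc ctl status appear elsewhere in this thread, while env-set and the service name default are assumptions based on the strong-pm documentation:
# From the app root: package and deploy to the local strong-pm (default port 8701)
slc deploy http://localhost:8701/
# Set an environment variable for the deployed service (subcommand is an assumption)
slc ctl env-set default NODE_ENV=production
# See what the PM is now running
slc ctl status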
General Best Practices
For either of the options above, I recommend not doing npm install -g strongloop on your server since it includes things like yeoman generators and other tools that are more useful on a workstation than a server.
If you want to go the slc run route, you would do npm install -g strong-supervisor strong-service-install and replace your slc run with sl-run.
If you want to go the slc pm route, you would do npm install -g strong-pm and replace slc pm-install with sl-pm-install.
Disclaimer
I work at StrongLoop and primarily work on these tools.
View the status of running apps using:
slc ctl status
Example output:
Service ID: 1
Service Name: app
Environment variables:
No environment variables defined
Instances:
Version Agent version Debugger version Cluster size Driver metadata
5.2.1 2.0.3 n/a 1 N/A
Processes:
ID PID WID Listening Ports Tracking objects? CPU profiling? Tracing? Debugging?
1.1.2708 2708 0
1.1.5836 5836 1 0.0.0.0:3001
Service ID: 2
Service Name: default
Environment variables:
No environment variables defined
Instances:
Version Agent version Debugger version Cluster size Driver metadata
5.2.1 2.0.3 n/a 1 N/A
Processes:
ID PID WID Listening Ports Tracking objects? CPU profiling? Tracing? Debugging?
2.1.2760 2760 0
2.1.1676 1676 1 0.0.0.0:3002
To kill the first app, use slc ctl stop:
slc ctl stop app
Service "app" hard stopped
What if I have to run the application as a cluster? Can I still do it via the Upstart job that was created?
Like
sudo sl-svc-install --name my-app --user youruser --cwd /path/to/app/root -- slc run --cluster 4 .
I tried doing this but /etc/init/my-app.conf does not show any information about the cluster.