Django channels elastic beanstalk (linux 2) hooks postdeploy: -b command not found - django

For deployment of Django Channels on Elastic Beanstalk (Linux 2 AMI) I tried implementing this blog. It requires creating .platform/hooks/postdeploy/ with two files in it, 01_set_env.sh and 02_run_supervisor_daemon.sh. On eb deploy it fails, and checking eb-engine.log shows this error:
2021/09/28 05:05:44.382229 [ERROR] An error occurred during execution of command [app-deploy] - [RunAppDeployPostDeployHooks]. Stop running the command. Error: Command .platform/hooks/postdeploy/02_run_supervisor_daemon.sh failed with error exit status 2. Stderr:.platform/hooks/postdeploy/02_run_supervisor_daemon.sh: line 13: -b: command not found
02_run_supervisor_daemon.sh
#!/bin/bash
# Get system environment variables
systemenv=`cat /opt/elasticbeanstalk/deployment/custom_env_var | tr '\n' ',' | sed 's/%/%%/g' | sed 's/export //g' | sed 's/$PATH/%(ENV_PATH)s/g' | sed 's/:$PYTHONPATH//g' | sed 's/$LD_LIBRARY_PATH//g'`
systemenv=${systemenv%?}
systemenv=`echo $systemenv | sed 's/,/",/g' | sed 's/=/="/g'`
systemenv="$systemenv""
# Create daemon configuration script
daemonconf="[program:daphne]
command=daphne -b :: -p 5000 backend.asgi:application
directory=/var/app
user=ec2-user
numprocs=1
stdout_logfile=/var/log/stdout_daphne.log
stderr_logfile=/var/log/stderr_daphne.log
autostart=true
autorestart=true
startsecs=10
; Need to wait for currently executing tasks to finish at shutdown.
; Increase this if you have very long running tasks.
stopwaitsecs = 600
; When resorting to send SIGKILL to the program to terminate it
; send SIGKILL to its whole process group instead,
; taking care of its children as well.
killasgroup=true
environment=$systemenv"
# Create the Supervisor conf script
echo "$daemonconf" | sudo tee /etc/supervisor/conf.d/daemon.conf
# Reread the Supervisor config
supervisorctl reread
# Update Supervisor in cache without restarting all services
supervisorctl update
# Start/restart processes through Supervisor
supervisorctl restart daphne
01_set_env.sh
#!/bin/bash
#Create a copy of the environment variable file.
cp /opt/elasticbeanstalk/deployment/env /opt/elasticbeanstalk/deployment/custom_env_var
#Set permissions to the custom_env_var file so this file can be accessed by any user on the instance. You can restrict permissions as per your requirements.
chmod 644 /opt/elasticbeanstalk/deployment/custom_env_var
#Remove duplicate files upon deployment.
rm -f /opt/elasticbeanstalk/deployment/*.bak
It would also be great if you could suggest some resources for deploying Django Channels on Elastic Beanstalk (Linux 2, since most blogs cover Linux 1).
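For anyone hitting the same error, the likely culprit is the quoting on the systemenv="$systemenv"" line: it contains three double quotes, so the last one opens a string that only terminates at the opening quote of daemonconf=". Bash then parses the line command=daphne -b :: -p 5000 backend.asgi:application as a command of its own, i.e. the variable assignment command=daphne followed by -b as the program to execute, hence -b: command not found. The line appears intended to append a literal closing quote, balancing the opening quote that sed 's/=/="/g' adds after the last =. A minimal sketch of the fix, assuming that intent:
# append one literal " to close the quote opened after the last '='
systemenv="$systemenv\""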

Related

ssh tunnel script hangs forever on beanstalk deployment

I'm attempting to create an ssh tunnel when deploying an application to AWS Beanstalk. I want the tunnel to run as a background process that is always connected on application deploy. The script hangs forever during deployment and I can't see why.
"/home/ec2-user/eclair-ssh-tunnel.sh":
mode: "000500" # u+rx
owner: root
group: root
content: |
cd /root
eval $(ssh-agent -s)
DISPLAY=":0.0" SSH_ASKPASS="./askpass_script" ssh-add eclair-test-key </dev/null
# we want this command to keep running in the backgriund
# so we add & at then end
nohup ssh -L 48682:localhost:8080 ubuntu#[host...] -N &
and here is the output I'm getting from /var/log/eb-activity.log:
[2019-06-14T14:53:23.268Z] INFO [15615] - [Application update suredbits-api-root-0.37.0-testnet-ssh-tunnel-fix-port-9#30/AppDeployStage1/AppDeployPostHook/01_eclair-ssh-tunnel.sh] : Starting activity...
The ssh tunnel is spawned, and I can find it by doing:
[ec2-user@ip-172-31-25-154 ~]$ ps aux | grep 48682
root 16047 0.0 0.0 175560 6704 ? S 14:53 0:00 ssh -L 48682:localhost:8080 ubuntu@ec2-34-221-186-19.us-west-2.compute.amazonaws.com -N
If I kill that process, the deployment continues as expected, which indicates that the bug is in the tunnel script. I can't seem to find out where though.
You need to add the -n option to ssh when running it in the background, so it stops reading from stdin.
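For example, the tunnel line from the script above would become (host placeholder kept as in the original):
# -n redirects ssh's stdin from /dev/null, so the backgrounded
# process never blocks the deployment hook waiting for input
nohup ssh -n -L 48682:localhost:8080 ubuntu@[host...] -N &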

Thousands of extraneous gunicorn workers

I'm using gunicorn 19.7.1 appserver with nginx reverse proxy for a Django project (Ubuntu 14.04 machine).
ps aux | grep gunicorn | grep -v grep | wc -l yields 3043 at the moment.
Whereas in /etc/init/gunicorn.conf, I've always had -w 33. Yet these extra workers persist even if I do sudo service gunicorn stop and sudo service gunicorn start.
How do I kill the extraneous workers?
How did this happen?
The worker count of 33 has always been properly configured on my busy production system.
However a few hours ago, I was trying python's multiprocessing on the server and things went south. Gunicorn workers ate up all the memory and took out the resident redis instances as well.
I reverted the change and have managed to get everything back online, except the memory hasn't been released and I've had to cope with these legacy gunicorn workers. What's going on?
Yet these extra workers persist even if I do sudo service gunicorn stop and sudo service gunicorn start.
service only manages service-initiated processes, so if Gunicorn workers were started outside the service framework, they will continue to live even after a stop.
How do I kill the extraneous workers?
The fast way:
Terminate all gunicorn processes, then restart Gunicorn:
$ pkill gunicorn
$ sudo service gunicorn start
The better way:
Identify your "desired" Gunicorn workers by finding the parent:
$ sudo service gunicorn status
Note the parent process ID. Let's say it's 123.
Save a list of all the "desired" workers' PIDs:
$ echo 123 > desired_workers
$ pgrep -P 123 >> desired_workers
Save a list of all workers' PIDs:
$ pgrep gunicorn > all_workers
Terminate the "undesired" workers:
$ cat desired_workers all_workers | sort | uniq -u | xargs kill
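The sort | uniq -u step works because uniq -u prints only lines that occur exactly once: every desired PID appears in both files and is filtered out, leaving just the undesired PIDs for kill. A quick way to convince yourself, with made-up PIDs:
printf '123\n124\n' > desired_workers               # desired: master 123, worker 124
printf '123\n124\n666\n' > all_workers              # all workers, including stray 666
cat desired_workers all_workers | sort | uniq -u    # prints only: 666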

Best way to find out when old nginx configuration dropped all requests

I'm writing a zero-downtime deploy script for Django.
When a deploy is made, it starts a new Django server with the most recent code version, writes a new nginx config pointing to the right socket, and reloads nginx.
How do I find out when the old nginx workers have finished serving all their connections, so I can drop the old Django server?
Is watching the worker PIDs the best option? I can't use the nginx status URL because the old config stops receiving connections.
Additionally, there's another problem. Django is my backend; nginx also proxies to a Node server that serves the client. Is it possible to look at the active connections of a single upstream? Otherwise I will have to wait for all connections to finish on the frontend too.
Well, in case anyone comes looking for it, I ended up with this solution:
[...]
# Run API
echo ":: Starting new server"
pm2 start ./server.sh --name api-$PRJ > /dev/null
# Copy nginx config
#echo ":: Swap nginx config"
rm ~/nginx/api.atados.com.br.*
cp ~/deploy/api/nginx.conf ~/nginx/api.atados.com.br.$PRJ.conf
perl -pi -e 's/{PRJ}/$PRJ/g' ~/nginx/api.atados.com.br.$PRJ.conf
# Grab the old workers' PIDs (sed "$ d" drops the trailing grep process itself)
workers=`ps aux | grep "nginx: worker" | sed "$ d" | awk '{print $2}'`
# Reload nginx
echo ":: Reloading nginx and waiting for old connections to drop."
sudo service nginx reload > /dev/null
# Wait for workers to die
for job in $workers
do
    while [ -e /proc/$job ]
    do
        sleep 1
    done
done
# Close old server
pm2 list | grep api | awk '{print $2}' | grep -v $PRJ | xargs pm2 delete > /dev/null
[...]

How do I restart gunicorn with HUP? I don't know the master PID or the location of the PID file

I want to restart a Django server which is running using gunicorn.
I know how to use gunicorn on my own system, but now I need to restart a remote server that was not set up by me.
I don't know the master PID, which I need in order to restart the server; usually I HUP gunicorn with sudo kill -s HUP masterpid.
I tried ps aux | grep gunicorn, and I did not find a gunicorn.pid file anywhere.
How can I get the master PID?
The one-liner below gets the job done:
kill -HUP `ps -C gunicorn fch -o pid | head -n 1`
Explanation
ps -C gunicorn lists only processes with the gunicorn command, i.e. the workers and the master process. The workers are children of the master, as can be seen with ps -C gunicorn fc -o ppid,pid,cmd. We only need the PID of the master, so the h flag is used to drop the header line; the f flag ensures the master is printed above its workers.
The correct procedure is to send HUP only to the master. That way gunicorn is gracefully restarted: only the workers are recreated, not the master.
You can run gunicorn with option '-p', so you can get the pid of the master process from the pid file.
For example:
gunicorn -p app.pid your_app.wsgi.app
You can get the pid of the master by:
cat app.pid
This should also work to restart gunicorn:
ps aux |grep gunicorn |grep yourapp | awk '{ print $2 }' |xargs kill -HUP
Step 1:
Open /etc/systemd/system/gunicorn.service and add the lines below (PIDFile= in the [Service] section, --pid on the ExecStart command):
PIDFile=/run/gunicorn/gunicorn.pid
--pid /run/gunicorn/gunicorn.pid
Example:
[Service]
PIDFile=/run/gunicorn/gunicorn.pid
WorkingDirectory=/home/django/django_project
ExecStart=/usr/bin/gunicorn --pid /run/gunicorn/gunicorn.pid --name=django_project.....
User=django
Group=django
Step 2:
Go to /etc/tmpfiles.d/ and create new file gunicorn.conf if not exist
add Bellow line
d /run/gunicorn 0755 django django -
where django = user and group name
Step 3:
Reboot your server, or restart gunicorn with /etc/init.d/gunicorn restart, for the change to take effect.
Your PID file location is now /run/gunicorn/gunicorn.pid.
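On a systemd host the same restart can be done without a reboot; an edited unit file only takes effect after systemd re-reads it:
sudo systemctl daemon-reload        # pick up the edited unit file
sudo systemctl restart gunicorn     # restart under the new settings
cat /run/gunicorn/gunicorn.pid      # the master PID is now available here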
Building on krizex's answer: when your master PID is stored in a file, you can gracefully reload your app in one command like this
$ cat app.pid |xargs kill -HUP
I would have liked to comment on the answer itself but I don't have enough reputation to comment yet 😢.

How to stop gunicorn properly

I'm starting gunicorn with the Django command python manage.py run_gunicorn. How can I stop gunicorn properly?
Note: I have a semi-automated server deployment with fabric. Thus using something like ps aux | grep gunicorn to kill the process manually by pid is not an option.
To see the processes, use ps ax | grep gunicorn; to stop gunicorn_django, use pkill gunicorn.
One option would be to use Supervisor to manage Gunicorn.
Then again, I don't see why you can't kill the process via Fabric.
Assuming you let Gunicorn write a PID file, you could easily read that file in a Fabric command.
Something like this should work:
run("kill `cat /path/to/your/file/gunicorn.pid`")
pkill gunicorn
or
pkill -P1 gunicorn
should kill all running gunicorn processes
pkill gunicorn stops all gunicorn daemons. So if you are running multiple instances of gunicorn on different ports, try this shell script:
#!/bin/bash
Port=5000
pid=`ps ax | grep gunicorn | grep $Port | awk '{split($0,a," "); print a[1]}' | head -n 1`
if [ -z "$pid" ]; then
    echo "no gunicorn daemon on port $Port"
else
    kill $pid
    echo "killed gunicorn daemon on port $Port"
fi
ps ax | grep gunicorn | grep $Port shows the daemons running on that specific port.
Here is the command that worked for me:
pkill -f gunicorn
It will kill any process whose command line matches gunicorn.
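Because -f matches the full command line, it can catch more than you expect (anything with gunicorn anywhere in its arguments). Previewing the matches first with pgrep is cheap:
pgrep -af gunicorn    # list PID + full command line of everything pkill -f would signal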
Start:
gunicorn --pid PID_FILE APP:app
Stop:
kill $(cat PID_FILE)
The --pid flag of gunicorn takes a single parameter: a file where the process ID will be stored. This file is also deleted automatically when the service is stopped.
I have used PID_FILE for simplicity, but you should use something like /tmp/MY_APP_PID as the file name.
If the PID file exists, the service is running; if it is not there, the service is not running. To stop the service, just kill it as mentioned.
You may also want to include the --daemon flag in order to detach the process from the current shell.
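Putting the two flags together, a minimal start/stop pair might look like this (a sketch only; APP:app and the file name are the placeholders used above):
# start detached; gunicorn writes the master PID to the file
# and deletes the file again on clean shutdown
gunicorn --pid /tmp/MY_APP_PID --daemon APP:app
# stop: signal the master recorded in the PID file
kill $(cat /tmp/MY_APP_PID)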
To start a service running under gunicorn:
sudo systemctl enable myproject
sudo systemctl start myproject
or
sudo systemctl restart myproject
And to stop the service running under gunicorn:
sudo systemctl stop myproject
To learn more about hosting Python applications using gunicorn, please refer here.
kill -9 `ps -eo pid,command | grep 'gunicorn.*${moduleName:appName}' | grep -v grep | sort | head -1 | awk '{print $1}'`
ps -eo pid,command fetches only the process ID, the command, and its args
grep -v grep gets rid of output like 'grep --color=auto xxx'
sort | head -1 sorts ascending and takes the first line
awk '{print $1}' extracts the PID
One more thing you may need to pay attention to: where is gunicorn installed, and which one are you using?
Ubuntu 16 has gunicorn installed by default; the executable is gunicorn3, located at /usr/bin/gunicorn3. If you installed it with pip, it's located at /usr/local/bin/gunicorn. Use which gunicorn and gunicorn -v to find out.
In your terminal, do:
ps ax|grep gunicorn
Then, to kill the Gunicorn process, just do:
kill -9 <gunicorn pid number>
In my case I dealt with many processes at once, for example:
kill -9 398 399 4225 4772
The above solutions do not remove the PID file when the process is killed.
cat <pid-file> | xargs kill -2
This reads the PID file and sends an interrupt signal. It closes gunicorn properly, and the PID file is removed as well.
PID file can be generated by
gunicorn --pid PID-FILE
or by adding the following to the config file:
pidfile = "pid_file"
If we run:
pkill gunicorn
we stop all gunicorn services. To free up just one port where gunicorn runs, we only need to stop the parent process of the service listening on that port.
The following script searches for that process (pid) and, if it exists, kills it:
#!/bin/bash
# ---------------------
stop_unicorn_on_port() {
    pid=$(lsof -w -t -i "TCP:${1}" | head -1)
    if [ -z "${pid}" ]; then
        echo "🦄 no service daemon on port ${1}"
    else
        kill -9 "${pid}"
        echo "🦄 killed service daemon(${pid}) on port ${1}"
    fi
}
# Example/Testing
stop_unicorn_on_port 5000
stop_unicorn_on_port 5001
stop_unicorn_on_port 5002
For more info, check man lsof:
-t specifies that lsof should produce terse output with process identifiers only and no header - e.g., so that the output may be piped to kill(1). -t selects the -w option.
-i selects the listing of files any of whose Internet address matches the address specified in i. If no address is specified, this option selects the listing of all Internet and x.25 (HP-UX) network files...
Here are some sample addresses:
-i6 - IPv6 only
TCP:25 - TCP and port 25
#1.2.3.4 - Internet IPv4 host address 1.2.3.4
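To preview what the function above would kill on a given port, the same lsof flags can be run by hand:
lsof -w -t -i TCP:5000    # terse output: PIDs only, ready to pipe to kill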
I built upon @David's recommendation to use --pid (PID_FILE) to fix the problem I faced: killing the parent PID didn't kill the worker processes.
import os
import sys
import psutil


def stop_pid(pid):
    if sys.platform == 'win32':
        p = psutil.Process(pid)
        p.terminate()  # or p.kill()
    else:
        os.system('kill -9 {0}'.format(pid))


def get_child_pids(ppid):
    pid_list = []
    for process in psutil.process_iter():
        _ppid = process.ppid()
        if _ppid == ppid:
            _pid = process.pid
            pid_list.append(_pid)
    return pid_list


def send_kill_cmd(ppid, cpids):
    stop_pid(ppid)  # Killing the parent proc first
    for pid in cpids:
        stop_pid(pid)


if __name__ == '__main__':
    parent_pid = int(sys.argv[1])
    child_pids = get_child_pids(parent_pid)
    send_kill_cmd(parent_pid, child_pids)
Then I finally executed the above Python script with the commands below:
#!/bin/bash
FILE_NAME=PID_FILE
if [ -f "$FILE_NAME" ]; then
    pypy stop_gunicorn.py "$(cat PID_FILE)"
    echo "killed - $(cat PID_FILE) and its child processes."
    sleep 2
fi
echo 'Starting gunicorn'
nohup gunicorn --workers 1 --bind 0.0.0.0:5050 app:app --thread 50 --worker-class eventlet --reload --pid PID_FILE > nohup_outs/nohup_process.out &