I'm trying to run the Celery worker after my application has been deployed to AWS Elastic Beanstalk.
99_celery_start.sh
#!/usr/bin/env bash
#make dirs
sudo mkdir -p /usr/etc/
sudo chmod 755 /usr/etc/
sudo touch /usr/etc/celery.conf
sudo touch /usr/etc/supervisord.conf
# Get django environment variables
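# (The pipeline below joins the newline-separated "export KEY=value" lines
# from .env into supervisord's comma-separated KEY=value,... format, and
# doubles % to %% because supervisord treats % as an expansion character.)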
celeryenv=`cat /var/app/rootfolder/myprject/.env | tr '\n' ',' | sed 's/export //g' | sed 's/$PATH/%(ENV_PATH)s/g' | sed 's/$PYTHONPATH//g' | sed 's/$LD_LIBRARY_PATH//g' | sed 's/%/%%/g'`
celeryenv=${celeryenv%?}
# Create celery configuration script
celeryconf="[program:celery-worker]
user=root
directory=/var/app/rootfolder
; Set full path to celery program if using virtualenv
command=/var/app/venv/*/bin/celery -A myprject worker -P solo --loglevel=INFO
.
.
.
.
celery.config
container_commands:
  01_celery_configure:
    command: "mkdir -p /.platform/hooks/postdeploy/ && cp .ebextensions/99_celery_start.sh /.platform/hooks/postdeploy/ && chmod 774 /.platform/hooks/postdeploy/99_celery_start.sh"
  02_run_celery:
    command: "sudo /.platform/hooks/postdeploy/99_celery_start.sh"
So what I'm trying to do is copy the celery-worker script from the .ebextensions folder into the post-deploy hook folder, so that the script runs after the application is deployed on the instance. But the command 02_run_celery executes before the application is extracted and deployed on the instance. Since the script requires the application folder /var/app/rootfolder/myprjct/.env, the deployment process fails with the error cat: /var/app/rootfolder/myprjct/.env: No such file or directory.
If your assumption is right and this is a race condition (the second command executes before the application is deployed), what about waiting for the application deployment? Or waiting for the specific file you're interested in?
...
# Wait for the app deployment
while [ ! -f /var/app/rootfolder/myprject/.env ]
do
sleep 1
echo "Waiting for the application deployment"
done
# Get django environment variables
celeryenv=`cat /var/app/rootfolder/myprject/.env | tr '\n' ',' | sed 's/export //g' | sed 's/$PATH/%(ENV_PATH)s/g' | sed 's/$PYTHONPATH//g' | sed 's/$LD_LIBRARY_PATH//g' | sed 's/%/%%/g'`
celeryenv=${celeryenv%?}
...
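One refinement worth considering (my sketch, not part of the original answer): a bare wait loop hangs the hook forever if the file never appears, e.g. on a failed deploy, so a bounded wait is safer:
# Wait up to ~5 minutes for the .env file, then fail instead of hanging
for i in $(seq 1 300); do
  [ -f /var/app/rootfolder/myprject/.env ] && break
  echo "Waiting for the application deployment"
  sleep 1
done
[ -f /var/app/rootfolder/myprject/.env ] || { echo "Timed out waiting for .env"; exit 1; }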
If anyone is still looking for the answer, this is how I did it:
Folder structure:
|-- .ebextensions/
| |-- celery.config # Option settings
| `-- cloudwatch.config # Other .ebextensions sections, for example files and container commands
`-- .platform/
|-- nginx/ # Proxy configuration
| |-- nginx.conf
| `-- conf.d/
| `-- custom.conf
|-- hooks/ # Application deployment hooks
| `-- postdeploy/
| `-- 99_celery_start.sh
Now add permissions for the 99_celery_start.sh script in celery.config:
container_commands:
  01_celery_perm:
    command: "sudo chmod +x .platform/hooks/postdeploy/99_celery_start.sh"
  02_dos2unix:
    command: "perl -i -pe's/\r$//;' .platform/hooks/postdeploy/99_celery_start.sh"
IMPORTANT: Make sure the script is saved with LF line endings instead of CRLF.
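If the project lives in git, one way to pin the hook to LF at the source (my addition, not part of the original answer) is a .gitattributes rule:
# .gitattributes: force LF endings for everything under the platform hooks
.platform/hooks/** text eol=lf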
I am very new to gcloud command line and new to scripting altogether. I'm cleaning up a GCP org with multiple stray projects. I am trying to run a gcloud command to find the creator of all my projects so I can reach out to each project creator and ask them to clean up a few things.
I found a command to search logs for a project and find the original project creator, provided the project isn't older than 400 days.
gcloud logging read --project [PROJECT] \
--order=asc --limit=1 \
--format='table(protoPayload.methodName, protoPayload.authenticationInfo.principalEmail)'
My problem is this: I have over 300 projects in my org currently. I have a .csv of all project names and IDs via (gcloud projects list).
Using the above command, how can I make [PROJECT] a variable and read the project name field from my .csv as that variable?
What I hope to accomplish is this: the gcloud command produces the output for each project name in the .csv file and writes it all to another .csv file. I hope this all made sense.
Thanks.
I haven't tried anything yet. I don't want to run the same command for each of the 300 projects manually.
I have put together this bash script; however, I've been unable to test it properly as I don't currently have access to any GCP project, but hopefully it will work.
Input:
This is what the CSV file should look like:
| ids |
|------|
| 1234 |
| 4567 |
| 7890 |
| 0987 |
Output: what the script will generate
| project_id | owner |
|------------|-------|
| 1234 | john |
| 4567 | doe |
| 7890 | test |
| 0987 | user |
#!/bin/bash
# Start the output file with a header row.
echo "project_id, owner;" > output.csv
# Read project IDs from the first column of input.csv, skipping the header.
while IFS="," read -r data
do
  echo "Fetching project creator for: $data"
  creator=$(gcloud logging read --project "${data}" --order=asc --limit=1 --format='table(protoPayload.methodName, protoPayload.authenticationInfo.principalEmail)')
  echo "${data},${creator};" >> output.csv
done < <(cut -d "," -f1 input.csv | tail -n +2)
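A caveat I could not verify end-to-end: with table(...) the captured $creator also contains the table's header row, so each CSV cell holds a small table rather than a single value. gcloud's value(...) format prints only the field itself; a possible variant of the line inside the loop:
# 'value(...)' prints the bare field, so the CSV cell is just the email
creator=$(gcloud logging read --project "${data}" --order=asc --limit=1 --format='value(protoPayload.authenticationInfo.principalEmail)')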
How do I pass my AWS credentials when I am running a Jenkins job?
Taking this as an example: https://github.com/PaulMaddox/amazon-eks-kubectl
$ docker run -v ~/.aws:/home/kubectl/.aws -e CLUSTER=demo maddox/kubectl get services
The above works on my laptop, but I want to pass the AWS credentials in the file. I have AWS configured in my Jenkins --> Credentials. I also have a Bitbucket repo which contains a Jenkinsfile and a YAML file for the "service" and "deployment".
The way I do it now is to run kubectl create -f filename.yaml, and it deploys to EKS. I just want to do the same thing but automate it with a Jenkinsfile; any suggestions on how to do it, either with kubectl or with Helm?
In your Jenkinsfile you should include a similar section:
stage('Deploy on Dev') {
    node('master') {
        withEnv(["KUBECONFIG=${JENKINS_HOME}/.kube/dev-config", "IMAGE=${ACCOUNT}.dkr.ecr.us-east-1.amazonaws.com/${ECR_REPO_NAME}:${IMAGETAG}"]) {
            sh "sed -i 's|IMAGE|${IMAGE}|g' k8s/deployment.yaml"
            sh "sed -i 's|ACCOUNT|${ACCOUNT}|g' k8s/service.yaml"
            sh "sed -i 's|ENVIRONMENT|dev|g' k8s/*.yaml"
            sh "sed -i 's|BUILD_NUMBER|01|g' k8s/*.yaml"
            sh "kubectl apply -f k8s"
            DEPLOYMENT = sh (
                script: 'cat k8s/deployment.yaml | yq -r .metadata.name',
                returnStdout: true
            ).trim()
            echo "Creating k8s resources..."
            sleep 180
            DESIRED = sh (
                script: "kubectl get deployment/$DEPLOYMENT | awk '{print \$2}' | grep -v DESIRED",
                returnStdout: true
            ).trim()
            CURRENT = sh (
                script: "kubectl get deployment/$DEPLOYMENT | awk '{print \$3}' | grep -v CURRENT",
                returnStdout: true
            ).trim()
            if (DESIRED.equals(CURRENT)) {
                currentBuild.result = "SUCCESS"
                return
            } else {
                currentBuild.result = "FAILURE"
                error("Deployment Unsuccessful.")
            }
        }
    }
}
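As a side note, kubectl has a built-in readiness wait that could replace the fixed sleep 180 and the DESIRED/CURRENT comparison above; an untested sketch, run as another sh step:
# Blocks until the rollout finishes; exits non-zero on failure or timeout,
# which fails the surrounding Jenkins sh step
kubectl rollout status deployment/$DEPLOYMENT --timeout=180s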
which will be responsible for automating the deployment process.
I hope it helps.
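As for the AWS-credentials half of the question: assuming the access key and secret are exposed to the job as environment variables (for example via a Jenkins credentials binding; the variable names below are the standard AWS ones, not something from the original post), the docker run from the question can receive them with -e flags instead of a mounted ~/.aws directory:
# The AWS SDK and aws-iam-authenticator pick these variables up automatically
docker run \
  -e AWS_ACCESS_KEY_ID="$AWS_ACCESS_KEY_ID" \
  -e AWS_SECRET_ACCESS_KEY="$AWS_SECRET_ACCESS_KEY" \
  -e CLUSTER=demo \
  maddox/kubectl get services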
I've installed Celery[sqs] and django-celery-beat in my Django 1.10 project.
I'm trying to run them both (worker and beat) using Supervisor on an Elastic Beanstalk instance.
The Supervisor config is being created dynamically with the following script:
#!/usr/bin/env bash
# get django environment variables
celeryenv=`cat /opt/python/current/env | tr '\n' ',' | sed 's/export //g' | sed 's/$PATH/%(ENV_PATH)s/g' | sed 's/$PYTHONPATH//g' | sed 's/$LD_LIBRARY_PATH//g' | sed 's/%/%%/g'`
celeryenv=${celeryenv%?}
# create celery beat config script
celerybeatconf="[program:celery-beat]
; Set full path to celery program if using virtualenv
command=/opt/python/run/venv/bin/celery beat -A phsite --loglevel=DEBUG --workdir=/tmp -S django --pidfile /tmp/celerybeat.pid
directory=/opt/python/current/app
user=nobody
numprocs=1
stdout_logfile=/var/log/celery-beat.log
stderr_logfile=/var/log/celery-beat.log
autostart=false
autorestart=true
startsecs=10
; Need to wait for currently executing tasks to finish at shutdown.
; Increase this if you have very long running tasks.
stopwaitsecs = 10
; When resorting to send SIGKILL to the program to terminate it
; send SIGKILL to its whole process group instead,
; taking care of its children as well.
killasgroup=true
; if rabbitmq is supervised, set its priority higher
; so it starts first
priority=998
environment=$celeryenv"
# create celery worker config script
celeryworkerconf="[program:celery-worker]
; Set full path to celery program if using virtualenv
command=/opt/python/run/venv/bin/celery worker -A phsite --loglevel=INFO
directory=/opt/python/current/app
user=nobody
numprocs=1
stdout_logfile=/var/log/celery-worker.log
stderr_logfile=/var/log/celery-worker.log
autostart=true
autorestart=true
startsecs=10
; Need to wait for currently executing tasks to finish at shutdown.
; Increase this if you have very long running tasks.
stopwaitsecs = 600
; When resorting to send SIGKILL to the program to terminate it
; send SIGKILL to its whole process group instead,
; taking care of its children as well.
killasgroup=true
; if rabbitmq is supervised, set its priority higher
; so it starts first
priority=999
environment=$celeryenv"
# create files for the scripts
echo "$celerybeatconf" | tee /opt/python/etc/celerybeat.conf
echo "$celeryworkerconf" | tee /opt/python/etc/celeryworker.conf
# add configuration script to supervisord conf (if not there already)
if ! grep -Fxq "[include]" /opt/python/etc/supervisord.conf
then
echo "[include]" | tee -a /opt/python/etc/supervisord.conf
echo "files: celerybeat.conf celeryworker.conf" | tee -a /opt/python/etc/supervisord.conf
fi
# reread the supervisord config
/usr/local/bin/supervisorctl -c /opt/python/etc/supervisord.conf reread
# update supervisord in cache without restarting all services
/usr/local/bin/supervisorctl -c /opt/python/etc/supervisord.conf update
After which the following ebextension runs:
container_commands:
  01_create_celery_beat_configuration_file:
    command: "cat .ebextensions/files/celery_configuration.sh > /opt/elasticbeanstalk/hooks/appdeploy/post/run_supervised_celeryd.sh && chmod 744 /opt/elasticbeanstalk/hooks/appdeploy/post/run_supervised_celeryd.sh && sed -i 's/\r$//' /opt/elasticbeanstalk/hooks/appdeploy/post/run_supervised_celeryd.sh"
  02_chmod_supervisor_sock:
    command: "chmod 777 /opt/python/run/supervisor.sock"
  03_create_logs:
    command: "touch /var/log/celery-beat.log /var/log/celery-worker.log"
  04_chmod_logs:
    command: "chmod 777 /var/log/celery-beat.log /var/log/celery-worker.log"
  05_start_celery_worker:
    command: "/usr/local/bin/supervisorctl -c /opt/python/etc/supervisord.conf restart celery-worker"
  06_start_celery_beat:
    command: "/usr/local/bin/supervisorctl -c /opt/python/etc/supervisord.conf start celery-beat"
When logging in to the instance and running
/usr/local/bin/supervisorctl -c /opt/python/etc/supervisord.conf status
celery-beat is already "not started" (with an empty log file) while the celery-worker is running.
The weirdest part is that if I run it manually, e.g.
/usr/local/bin/supervisorctl -c /opt/python/etc/supervisord.conf start celery-beat
it runs without errors.
Does anyone have any idea how to debug this?
Why would it not load within the eb_extension while it does load later?
Maybe it has to do with the fact that Django is not up yet and I am using the django_celery_beat.schedulers:DatabaseScheduler configuration?
So the simple reason is that the shell script created in the eb_extension:
container_commands:
  01_create_celery_beat_configuration_file:
    command: "cat .ebextensions/files/celery_configuration.sh > /opt/elasticbeanstalk/hooks/appdeploy/post/run_supervised_celeryd.sh && chmod 744 /opt/elasticbeanstalk/hooks/appdeploy/post/run_supervised_celeryd.sh && sed -i 's/\r$//' /opt/elasticbeanstalk/hooks/appdeploy/post/run_supervised_celeryd.sh"
is created in the appdeploy/post directory and therefore runs post-deployment, i.e. after the container commands that follow it are executed.
The start/restart commands don't do a thing because the shell script hasn't registered those services yet. 🤦‍♂️
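A way out, following the same reasoning (a sketch, not part of the original answer): drop the 05/06 supervisorctl container commands and start the programs at the end of the post-deploy script itself, right after the reread/update calls, when supervisord finally knows about them:
# At this point the celery configs are registered, so starting them works
/usr/local/bin/supervisorctl -c /opt/python/etc/supervisord.conf start celery-worker
/usr/local/bin/supervisorctl -c /opt/python/etc/supervisord.conf start celery-beat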
There are a few solutions for configuring .ebextensions container commands for cronjobs, but none of them are working for me.
I am concerned that the reason they aren't working is that .ebextensions is not in the root directory. This messy code was handed over to me, and I've tried to move .ebextensions to where it needs to be, but that seems to break everything else.
This app is a streaming video application currently in production, and I can't afford to break it, so I ended up just leaving it where it is.
Can someone tell me if I am doing this right and I just need to find a way to move .ebextensions, or is the problem in my cronjob configuration?
app1/.ebextensions/02_python.config
container_commands:
  ...
  cronjob:
    command: "echo .ebextensions/cronjobs.txt > /etc/cron.d/cronjobs && 644 /etc/cron.d/cronjobs"
    leader_only: true
  ...
app1/.ebextensions/cronjobs.txt
***** root source /opt/python/run/venv/bin/activate && python3 manage.py runcrons > /var/log/cronjobs.log
app1/settings.py
INSTALLED_APPS = [
    ...
    'django_cron',
    ...
]
CRON_CLASSES = [
    'app2.crons.MyCronJob',
]
app2/crons.py
from django_cron import CronJobBase, Schedule

class MyCronJob(CronJobBase):
    RUN_EVERY_MINS = 1
    schedule = Schedule(run_every_mins=RUN_EVERY_MINS)

    def do(self):
        # calculate stuff
        # update variables
        pass
This deploys to AWS Elastic Beanstalk without error, and the logs show it ran, but the work doesn't get done, and it only runs the command once, on deploy. The logs show this:
Command cronjob] : Starting activity...
[2018-02-15T12:58:41.648Z] INFO [24604] - [Application update ingest16#207/AppDeployStage0/EbExtensionPostBuild/Infra-EmbeddedPostBuild/postbuild_0_api_backend/Test for Command 05_cronjob] :
Completed activity. Result:
Completed successfully
This does the job but only once on deploy.
container_commands:
  ...
  cronjob:
    command: "source /opt/python/run/venv/bin/activate && python3 manage.py runcrons"
    leader_only: true
  ...
This doesn't work at all.
container_commands:
  ...
  cronjob:
    command: "echo /app1/.ebextensions/cronjobs.txt > /etc/cron.d/cronjobs && 644 /etc/cron.d/cronjobs"
    leader_only: true
  ...
Hi, why use django_cron when you only need cron?
Here is my .ebextensions config:
container_commands:
  ...
  0.0.1.cron.mailing:
    command: "cat .ebextensions/mailing.txt > /etc/cron.d/mailing && chmod 644 /etc/cron.d/mailing"
    leader_only: true
Here is my mailing.txt:
# Every morning at 05:00
# * * * * * user command
# | | | | |
# | | | | +---- Day of the Week (range: 0-7, 0 and 7 standing for Sunday)
# | | | +------ Month of the Year (range: 1-12)
# | | +-------- Day of the Month (range: 1-31)
# | +---------- Hour (range: 0-23)
# +------------ Minute (range: 0-59)
# m h dom mon dow user command
0 5 * * * root source /opt/python/run/venv/bin/activate && source /opt/python/current/env && cd /opt/python/current/app/ && python manage.py my_command >> /home/ec2-user/cron-mailing.log 2>&1
And here is how to create a custom management command: https://docs.djangoproject.com/en/2.0/howto/custom-management-commands/#module-django.core.management
Hope this helps,
You need spaces in your cron file between the asterisks.
Your cron file:
***** root source /opt/python/run/venv/bin/activate && python3 manage.py runcrons > /var/log/cronjobs.log
Fix it like this:
* * * * * root source /opt/python/run/venv/bin/activate && python3 manage.py runcrons > /var/log/cronjobs.log
I would like to download the latest source code of a piece of software (WRF) from a URL and automate the installation process thereafter. A sample URL is given below:
http://www2.mmm.ucar.edu/wrf/src/WRFV3.6.1.TAR.gz
In the above URL, the version number may change from time to time as the developers release new versions. Now I would like to download the latest available version from the main script. I tried the following:
wget -k -l 0 "http://www2.mmm.ucar.edu/wrf/src/" -O index.html ; cat index.html | grep -o 'http:[^"]*.gz' | grep 'WRFV'
With the above code, I can pull all available versions of the software. The output of the above code is below:
http://www2.mmm.ucar.edu/wrf/src/WRFV2.0.3.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV2.1.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV2.1.2.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV2.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV2.2.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV2.2.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3-Chem-3.0.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3-Chem-3.0.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3-Chem-3.1.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3-Chem-3.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3-Chem-3.2.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3-Chem-3.2.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3-Chem-3.3.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3-Chem-3.3.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3-Chem-3.4.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3-Chem-3.4.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3-Chem-3.5.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3-Chem-3.5.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3-Chem-3.6.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3-Chem-3.6.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3-Var-do-not-use.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3.0.1.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3.0.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3.1.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3.2.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3.2.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3.3.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3.3.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3.4.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3.4.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3.5.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3.5.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3.6.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3.6.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3_OVERLAY_3.0.1.1.TAR.gz
However, I am unable to go further and filter out only the latest version from the list.
Usually I recommend some Perl tools for processing HTML pages, but because this is a directory-index output, it can (probably) be done with bash tools like grep, sed and such...
The following code is divided into several smaller bash functions, for easy changes.
#!/bin/bash
#getdata - should output html source of the page
getdata() {
#use wget with output to stdout or curl or fetch
curl -s "http://www2.mmm.ucar.edu/wrf/src/"
#cat index.html
}
#filter_rows - get the filename and the date columns
filter_rows() {
sed -n 's:<tr><td.*href="\([^"]*\)">.*>\([0-9].*\)</td>.*</td>.*</td></tr>:\2#\1:p' | grep "${1:-.}"
}
#sort_by_date - probably don't need comment... sorts the lines by date... ;)
sort_by_date() {
while IFS=# read -r date file
do
echo "$(date --date="$date" +%s)#$file"
done | sort -gr
}
#MAIN
file=$(getdata | filter_rows WRFV | sort_by_date | head -1 | cut -d# -f2)
echo "You want download: $file"
prints
You want download: WRFV3-Chem-3.6.1.TAR.gz
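Note that the date-based sort picks the Chem build here because that file carries the newest timestamp; if you only want the core WRFV releases, pass a stricter pattern to filter_rows (my guess at a suitable regex, matching names like WRFV3.6.1.TAR.gz but not WRFV3-Chem-... or WRFV3_OVERLAY...):
file=$(getdata | filter_rows 'WRFV[0-9.]*\.TAR' | sort_by_date | head -1 | cut -d# -f2)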
What about adding a numeric sort and taking the top line:
wget -k -l 0 "http://www2.mmm.ucar.edu/wrf/src/" -O index.html ; cat index.html | grep -o 'http:[^"]*.gz' | grep 'WRFV[0-9]*[0-9]\.[0-9]' | sort -r -n | head -1
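If you have GNU coreutils, a version sort is probably more robust than a plain numeric sort on URLs (an untested sketch; sort -V understands multi-part versions, so WRFV3.6.1 orders after WRFV3.6):
wget -k -l 0 "http://www2.mmm.ucar.edu/wrf/src/" -O index.html ; grep -o 'http:[^"]*.gz' index.html | grep 'WRFV[0-9]*[0-9]\.[0-9]' | sort -V | tail -1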