I want to enable SSL (using Let's Encrypt) for my Django project running on AWS Elastic Beanstalk.
TL;DR:
Unfortunately, when Let's Encrypt connects to my website to check for the token, it gets a 404 error instead.
During secondary validation: Invalid response from
http://sub.example.com/.well-known/acme-challenge/Gzo8gzkIEbLmtvGkSDhnNheml9XxNsctHJA3ufA0FYI
[107.20.106.65]: "<!DOCTYPE html>\n<html lang=\"en\">\n<head>\n
<meta http-equiv=\"content-type\" content=\"text/html;
charset=utf-8\">\n <title>Page not "
Now I don't know whether this problem is caused by the Django configuration, the nginx configuration, Elastic Beanstalk, my subdomain, Certbot, or something else.
What steps should I take next to debug it?
(Of course, the sub.example.com stands for an existing subdomain that I own.)
My domain, let's say example.com, was registered through an external domain registrar. I then created a subdomain, sub.example.com, which points to the EB CNAME (foo-bar-foo-bar.bar-foo.us-east-1.elasticbeanstalk.com.).
The site is available via HTTP at both addresses (sub.example.com and foo-bar-foo-bar.bar-foo.us-east-1.elasticbeanstalk.com) and displays the Django welcome page with an image of a green rocket.
Here is the script I created to create the project and environment (following the official tutorial):
VAR_MYDOMAIN=sub.example.com
VAR_NUMBER=7
VAR_PROJECT_DIRNAME=project-foo-$VAR_NUMBER
VAR_DJANGO_PROJECT_NAME=project_foo_$VAR_NUMBER
VAR_EB_APP_NAME=project_foo_app_$VAR_NUMBER
VAR_EB_ENV_NAME=project-foo-env-$VAR_NUMBER
VAR_AWS_KEYNAME=aws_keys_name
mkdir $VAR_PROJECT_DIRNAME
cd $VAR_PROJECT_DIRNAME
py -m venv eb-virt
source eb-virt/Scripts/activate
pip install django==2.1.1
django-admin startproject $VAR_DJANGO_PROJECT_NAME
cd $VAR_DJANGO_PROJECT_NAME
pip freeze > requirements.txt
mkdir .ebextensions
echo "option_settings:
  aws:elasticbeanstalk:container:python:
    WSGIPath: $VAR_DJANGO_PROJECT_NAME.wsgi:application" > .ebextensions/django.config
deactivate
eb init -p python-3.7 $VAR_EB_APP_NAME -r us-east-1 -k $VAR_AWS_KEYNAME
eb create $VAR_EB_ENV_NAME
ls
sed -i -e "s|ALLOWED_HOSTS = |ALLOWED_HOSTS = \['`eb status | grep "CNAME" | cut -f 2 -d : | xargs`\',\'$VAR_MYDOMAIN\']#|g" $VAR_DJANGO_PROJECT_NAME/settings.py && eb deploy
eb open
echo "done"
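The sed one-liner in the script above rewrites the ALLOWED_HOSTS line of settings.py so that it lists both the EB CNAME and the custom domain. As a rough illustration only, the same substitution could be sketched in Python. The function name set_allowed_hosts and the sample host names are assumptions made for this example and are not part of the original script.

```python
import re


def set_allowed_hosts(settings_text, hosts):
    """Return settings_text with its ALLOWED_HOSTS line replaced by `hosts`."""
    new_line = "ALLOWED_HOSTS = " + repr(list(hosts))
    # A function replacement avoids backslash/group interpretation in new_line.
    return re.sub(
        r"^ALLOWED_HOSTS = .*$",
        lambda match: new_line,
        settings_text,
        flags=re.MULTILINE,
    )


if __name__ == "__main__":
    # Hypothetical settings content and host names, for illustration only.
    sample = "DEBUG = True\nALLOWED_HOSTS = []\n"
    print(set_allowed_hosts(sample, ["env.example.elasticbeanstalk.com", "sub.example.com"]))
```

Unlike the sed version, this replaces the whole line instead of commenting out the old value, which keeps settings.py a little easier to read.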
Then I followed this tutorial in order to:
Install Certbot
Open port 443
Configure the certificate for Nginx
Add certificate renewal to cron
So I created this script:
VAR_MYDOMAIN=sub.example.com
VAR_NUMBER=7
VAR_PROJECT_DIRNAME=project-foo-$VAR_NUMBER
VAR_DJANGO_PROJECT_NAME=project_foo_$VAR_NUMBER
VAR_TEST_CERT=--test-cert
VAR_MYDOMAIN_EMAIL=validaddress@example.com
cd $VAR_PROJECT_DIRNAME/$VAR_DJANGO_PROJECT_NAME
mkdir .platform
mkdir .platform/hooks
mkdir .platform/hooks/postdeploy
echo "container_commands:
  00_download_epel:
    command: \"sudo wget -r --no-parent -A 'epel-release-*.rpm' http://dl.fedoraproject.org/pub/epel/7/x86_64/Packages/e/\"
    ignoreErrors: true
    test: test ! -d \"/etc/letsencrypt/\"
  10_install_epel_release:
    command: \"sudo rpm -Uvh dl.fedoraproject.org/pub/epel/7/x86_64/Packages/e/epel-release-*.rpm\"
    ignoreErrors: true
    test: test ! -d \"/etc/letsencrypt/\"
  20_enable_epel:
    command: \"sudo yum-config-manager --enable epel*\"
    ignoreErrors: true
    test: test ! -d \"/etc/letsencrypt/\"
  30_install_certbot:
    command: \"sudo yum install -y certbot python3-certbot-nginx python2-certbot-nginx python-certbot-nginx\"
    ignoreErrors: true
    test: test ! -d \"/etc/letsencrypt/\"" > .ebextensions/00_install_certbot.config
echo "Resources:
  sslSecurityGroupIngress:
    Type: AWS::EC2::SecurityGroupIngress
    Properties:
      GroupId: {\"Fn::GetAtt\" : [\"AWSEBSecurityGroup\", \"GroupId\"]}
      IpProtocol: tcp
      ToPort: 443
      FromPort: 443
      CidrIp: 0.0.0.0/0" > .ebextensions/01_open_https_port.config
echo "#!/bin/sh
sudo certbot -n $VAR_TEST_CERT -d $VAR_MYDOMAIN --nginx --agree-tos --email $VAR_MYDOMAIN_EMAIL" > .platform/hooks/postdeploy/00_get_certificate.sh
echo "container_commands:
  00_permission_hook:
    command: \"chmod +x .platform/hooks/postdeploy/00_get_certificate.sh\"" > .ebextensions/02_grant_executable_rights.config
echo "files:
  /tmp/renew_cert_cron:
    mode: \"000777\"
    owner: root
    group: root
    content: |
      0 1,13 * * * certbot renew --no-self-upgrade" > .ebextensions/03_renew_ssl_certificate_cron_job.config
eb deploy
eb open
Unfortunately, during the deployment I get the following errors:
Upload Complete.
2022-01-30 17:57:02 INFO Environment update is starting.
2022-01-30 17:57:42 INFO Deploying new version to instance(s).
2022-01-30 17:57:46 INFO Instance deployment successfully generated a 'Procfile'.
2022-01-30 17:58:54 ERROR Instance deployment failed. For details, see 'eb-engine.log'.
2022-01-30 17:58:57 ERROR [Instance: i-xxxxxxxxxxxxxxxxx] Command failed on instance. Return code: 1 Output: Engine execution has encountered an error..
2022-01-30 17:58:57 INFO Command execution completed on all instances. Summary: [Successful: 0, Failed: 1].
2022-01-30 17:58:57 ERROR Unsuccessful command execution on instance id(s) 'i-xxxxxxxxxxxxxxxxx'. Aborting the operation.
2022-01-30 17:58:57 ERROR Failed to deploy application.
ERROR: ServiceError - Failed to deploy application.
And in the logs I see the following information:
----------------------------------------
/var/log/eb-hooks.log
----------------------------------------
2022/01/30 17:58:18.723761 [INFO] Running command .platform/hooks/postdeploy/00_get_certificate.sh
2022/01/30 17:58:54.348928 [INFO] Account registered.
Requesting a certificate for sub.example.com
IMPORTANT NOTES:
- The following errors were reported by the server:
Domain: sub.example.com
Type: dns
Detail: During secondary validation: Invalid response from
http://sub.example.com/.well-known/acme-challenge/Gzo8gzkIEbLmtvGkSDhnNheml9XxNsctHJA3ufA0FYI
[107.20.106.65]: "<!DOCTYPE html>\n<html lang=\"en\">\n<head>\n
<meta http-equiv=\"content-type\" content=\"text/html;
charset=utf-8\">\n <title>Page not "
----------------------------------------
/var/log/nginx/access.log
----------------------------------------
172.31.14.185 - - [30/Jan/2022:17:58:21 +0000] "GET /.well-known/acme-challenge/Gzo8gzkIEbLmtvGkSDhnNheml9XxNsctHJA3ufA0FYI HTTP/1.1" 404 2162 "-" "Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)" "18.196.102.134"
172.31.14.185 - - [30/Jan/2022:17:58:22 +0000] "GET /.well-known/acme-challenge/Gzo8gzkIEbLmtvGkSDhnNheml9XxNsctHJA3ufA0FYI HTTP/1.1" 200 87 "-" "Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)" "18.236.228.243"
172.31.14.185 - - [30/Jan/2022:17:58:22 +0000] "GET /.well-known/acme-challenge/Gzo8gzkIEbLmtvGkSDhnNheml9XxNsctHJA3ufA0FYI HTTP/1.1" 200 87 "-" "Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)" "66.133.109.36"
172.31.14.185 - - [30/Jan/2022:17:58:31 +0000] "GET /.well-known/acme-challenge/Gzo8gzkIEbLmtvGkSDhnNheml9XxNsctHJA3ufA0FYI HTTP/1.1" 200 87 "-" "Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)" "18.222.145.89"
----------------------------------------
/var/log/nginx/error.log
----------------------------------------
2022/01/30 17:58:20 [notice] 4486#4486: signal process started
2022/01/30 17:58:22 [warn] 4487#4487: *9 using uninitialized "year" variable while logging request, client: 172.31.14.185, server: sub.example.com, request: "GET /.well-known/acme-challenge/Gzo8gzkIEbLmtvGkSDhnNheml9XxNsctHJA3ufA0FYI HTTP/1.1", host: "sub.example.com"
2022/01/30 17:58:22 [warn] 4487#4487: *9 using uninitialized "month" variable while logging request, client: 172.31.14.185, server: sub.example.com, request: "GET /.well-known/acme-challenge/Gzo8gzkIEbLmtvGkSDhnNheml9XxNsctHJA3ufA0FYI HTTP/1.1", host: "sub.example.com"
2022/01/30 17:58:22 [warn] 4487#4487: *9 using uninitialized "day" variable while logging request, client: 172.31.14.185, server: sub.example.com, request: "GET /.well-known/acme-challenge/Gzo8gzkIEbLmtvGkSDhnNheml9XxNsctHJA3ufA0FYI HTTP/1.1", host: "sub.example.com"
2022/01/30 17:58:22 [warn] 4487#4487: *9 using uninitialized "hour" variable while logging request, client: 172.31.14.185, server: sub.example.com, request: "GET /.well-known/acme-challenge/Gzo8gzkIEbLmtvGkSDhnNheml9XxNsctHJA3ufA0FYI HTTP/1.1", host: "sub.example.com"
2022/01/30 17:58:22 [warn] 4487#4487: *11 using uninitialized "year" variable while logging request, client: 172.31.14.185, server: sub.example.com, request: "GET /.well-known/acme-challenge/Gzo8gzkIEbLmtvGkSDhnNheml9XxNsctHJA3ufA0FYI HTTP/1.1", host: "sub.example.com"
2022/01/30 17:58:22 [warn] 4487#4487: *11 using uninitialized "month" variable while logging request, client: 172.31.14.185, server: sub.example.com, request: "GET /.well-known/acme-challenge/Gzo8gzkIEbLmtvGkSDhnNheml9XxNsctHJA3ufA0FYI HTTP/1.1", host: "sub.example.com"
2022/01/30 17:58:22 [warn] 4487#4487: *11 using uninitialized "day" variable while logging request, client: 172.31.14.185, server: sub.example.com, request: "GET /.well-known/acme-challenge/Gzo8gzkIEbLmtvGkSDhnNheml9XxNsctHJA3ufA0FYI HTTP/1.1", host: "sub.example.com"
2022/01/30 17:58:22 [warn] 4487#4487: *11 using uninitialized "hour" variable while logging request, client: 172.31.14.185, server: sub.example.com, request: "GET /.well-known/acme-challenge/Gzo8gzkIEbLmtvGkSDhnNheml9XxNsctHJA3ufA0FYI HTTP/1.1", host: "sub.example.com"
2022/01/30 17:58:31 [warn] 4487#4487: *11 using uninitialized "year" variable while logging request, client: 172.31.14.185, server: sub.example.com, request: "GET /.well-known/acme-challenge/Gzo8gzkIEbLmtvGkSDhnNheml9XxNsctHJA3ufA0FYI HTTP/1.1", host: "sub.example.com"
2022/01/30 17:58:31 [warn] 4487#4487: *11 using uninitialized "month" variable while logging request, client: 172.31.14.185, server: sub.example.com, request: "GET /.well-known/acme-challenge/Gzo8gzkIEbLmtvGkSDhnNheml9XxNsctHJA3ufA0FYI HTTP/1.1", host: "sub.example.com"
2022/01/30 17:58:31 [warn] 4487#4487: *11 using uninitialized "day" variable while logging request, client: 172.31.14.185, server: sub.example.com, request: "GET /.well-known/acme-challenge/Gzo8gzkIEbLmtvGkSDhnNheml9XxNsctHJA3ufA0FYI HTTP/1.1", host: "sub.example.com"
2022/01/30 17:58:31 [warn] 4487#4487: *11 using uninitialized "hour" variable while logging request, client: 172.31.14.185, server: sub.example.com, request: "GET /.well-known/acme-challenge/Gzo8gzkIEbLmtvGkSDhnNheml9XxNsctHJA3ufA0FYI HTTP/1.1", host: "sub.example.com"
2022/01/30 17:58:53 [notice] 4491#4491: signal process started
----------------------------------------
/var/log/eb-engine.log
----------------------------------------
2022/01/30 17:58:17.585504 [INFO] Running command /bin/sh -c systemctl daemon-reload
2022/01/30 17:58:17.680658 [INFO] Running command /bin/sh -c systemctl reset-failed
2022/01/30 17:58:17.685474 [INFO] Register application processes...
2022/01/30 17:58:17.685486 [INFO] Registering the proc: web
2022/01/30 17:58:17.685498 [INFO] Running command /bin/sh -c systemctl show -p PartOf web.service
2022/01/30 17:58:17.691588 [INFO] Running command /bin/sh -c systemctl daemon-reload
2022/01/30 17:58:17.778134 [INFO] Running command /bin/sh -c systemctl reset-failed
2022/01/30 17:58:17.782568 [INFO] Running command /bin/sh -c systemctl is-enabled eb-app.target
2022/01/30 17:58:17.786244 [INFO] Running command /bin/sh -c systemctl enable eb-app.target
2022/01/30 17:58:17.881674 [INFO] Running command /bin/sh -c systemctl start eb-app.target
2022/01/30 17:58:17.887119 [INFO] Running command /bin/sh -c systemctl enable web.service
2022/01/30 17:58:17.984848 [INFO] Running command /bin/sh -c systemctl show -p PartOf web.service
2022/01/30 17:58:17.990266 [INFO] Running command /bin/sh -c systemctl is-active web.service
2022/01/30 17:58:17.993666 [INFO] Running command /bin/sh -c systemctl start web.service
2022/01/30 17:58:18.412552 [INFO] Executing instruction: start X-Ray
2022/01/30 17:58:18.412570 [INFO] X-Ray is not enabled.
2022/01/30 17:58:18.412576 [INFO] Executing instruction: start proxy with new configuration
2022/01/30 17:58:18.412613 [INFO] Running command /bin/sh -c /usr/sbin/nginx -t -c /var/proxy/staging/nginx/nginx.conf
2022/01/30 17:58:18.438413 [INFO] Running command /bin/sh -c cp -rp /var/proxy/staging/nginx/* /etc/nginx
2022/01/30 17:58:18.444085 [INFO] Running command /bin/sh -c systemctl show -p PartOf nginx.service
2022/01/30 17:58:18.459610 [INFO] Running command /bin/sh -c systemctl daemon-reload
2022/01/30 17:58:18.596722 [INFO] Running command /bin/sh -c systemctl reset-failed
2022/01/30 17:58:18.601333 [INFO] Running command /bin/sh -c systemctl show -p PartOf nginx.service
2022/01/30 17:58:18.612251 [INFO] Running command /bin/sh -c systemctl is-active nginx.service
2022/01/30 17:58:18.618702 [INFO] Running command /bin/sh -c systemctl start nginx.service
2022/01/30 17:58:18.696121 [INFO] Executing instruction: configureSqsd
2022/01/30 17:58:18.696138 [INFO] This is a web server environment instance, skip configure sqsd daemon ...
2022/01/30 17:58:18.696143 [INFO] Executing instruction: startSqsd
2022/01/30 17:58:18.696147 [INFO] This is a web server environment instance, skip start sqsd daemon ...
2022/01/30 17:58:18.696152 [INFO] Executing instruction: Track pids in healthd
2022/01/30 17:58:18.696157 [INFO] This is an enhanced health env...
2022/01/30 17:58:18.696171 [INFO] Running command /bin/sh -c systemctl show -p ConsistsOf aws-eb.target | cut -d= -f2
2022/01/30 17:58:18.711442 [INFO] nginx.service healthd.service cfn-hup.service
2022/01/30 17:58:18.711474 [INFO] Running command /bin/sh -c systemctl show -p ConsistsOf eb-app.target | cut -d= -f2
2022/01/30 17:58:18.723246 [INFO] web.service
2022/01/30 17:58:18.723613 [INFO] Executing instruction: RunAppDeployPostDeployHooks
2022/01/30 17:58:18.723662 [INFO] Executing platform hooks in .platform/hooks/postdeploy/
2022/01/30 17:58:18.723737 [INFO] Following platform hooks will be executed in order: [00_get_certificate.sh]
2022/01/30 17:58:18.723752 [INFO] Running platform hook: .platform/hooks/postdeploy/00_get_certificate.sh
2022/01/30 17:58:54.348954 [ERROR] An error occurred during execution of command [app-deploy] - [RunAppDeployPostDeployHooks]. Stop running the command. Error: Command .platform/hooks/postdeploy/00_get_certificate.sh failed with error exit status 1. Stderr:Saving debug log to /var/log/letsencrypt/letsencrypt.log
Plugins selected: Authenticator nginx, Installer nginx
Performing the following challenges:
http-01 challenge for sub.example.com
Waiting for verification...
Challenge failed for domain sub.example.com
http-01 challenge for sub.example.com
Cleaning up challenges
Some challenges have failed.
2022/01/30 17:58:54.348964 [INFO] Executing cleanup logic
2022/01/30 17:58:54.349077 [INFO] CommandService Response: {"status":"FAILURE","api_version":"1.0","results":[{"status":"FAILURE","msg":"Engine execution has encountered an error.","returncode":1,"events":[{"msg":"Instance deployment successfully generated a 'Procfile'.","timestamp":1643565466,"severity":"INFO"},{"msg":"Instance deployment failed. For details, see 'eb-engine.log'.","timestamp":1643565534,"severity":"ERROR"}]}]}
2022/01/30 17:58:54.349260 [INFO] Platform Engine finished execution on command: app-deploy
2022/01/30 18:00:32.199383 [INFO] Starting...
2022/01/30 18:00:32.199429 [INFO] Starting EBPlatform-PlatformEngine
2022/01/30 18:00:32.199445 [INFO] reading event message file
2022/01/30 18:00:32.199571 [INFO] no eb envtier info file found, skip loading env tier info.
2022/01/30 18:00:32.199632 [INFO] Engine received EB command cfn-hup-exec
----------------------------------------
/var/log/web.stdout.log
----------------------------------------
Jan 30 17:55:30 ip-172-31-7-79 web: [2022-01-30 17:55:30 +0000] [3495] [INFO] Starting gunicorn 20.1.0
Jan 30 17:55:30 ip-172-31-7-79 web: [2022-01-30 17:55:30 +0000] [3495] [INFO] Listening at: http://127.0.0.1:8000 (3495)
Jan 30 17:55:30 ip-172-31-7-79 web: [2022-01-30 17:55:30 +0000] [3495] [INFO] Using worker: gthread
Jan 30 17:55:30 ip-172-31-7-79 web: [2022-01-30 17:55:30 +0000] [3551] [INFO] Booting worker with pid: 3551
Jan 30 17:56:11 ip-172-31-7-79 web: [2022-01-30 17:56:11 +0000] [3495] [INFO] Handling signal: term
Jan 30 17:56:12 ip-172-31-7-79 web: [2022-01-30 17:56:12 +0000] [3551] [INFO] Worker exiting (pid: 3551)
Jan 30 17:56:12 ip-172-31-7-79 web: [2022-01-30 17:56:12 +0000] [3495] [INFO] Shutting down: Master
Jan 30 17:56:13 ip-172-31-7-79 web: [2022-01-30 17:56:13 +0000] [3900] [INFO] Starting gunicorn 20.1.0
Jan 30 17:56:13 ip-172-31-7-79 web: [2022-01-30 17:56:13 +0000] [3900] [INFO] Listening at: http://127.0.0.1:8000 (3900)
Jan 30 17:56:13 ip-172-31-7-79 web: [2022-01-30 17:56:13 +0000] [3900] [INFO] Using worker: gthread
Jan 30 17:56:13 ip-172-31-7-79 web: [2022-01-30 17:56:13 +0000] [3958] [INFO] Booting worker with pid: 3958
Jan 30 17:56:27 ip-172-31-7-79 web: Not Found: /static/admin/css/fonts.css
Jan 30 17:56:28 ip-172-31-7-79 web: Not Found: /favicon.ico
Jan 30 17:58:17 ip-172-31-7-79 web: [2022-01-30 17:58:17 +0000] [3900] [INFO] Handling signal: term
Jan 30 17:58:18 ip-172-31-7-79 web: [2022-01-30 17:58:18 +0000] [3958] [INFO] Worker exiting (pid: 3958)
Jan 30 17:58:18 ip-172-31-7-79 web: [2022-01-30 17:58:18 +0000] [3900] [INFO] Shutting down: Master
Jan 30 17:58:18 ip-172-31-7-79 web: [2022-01-30 17:58:18 +0000] [4422] [INFO] Starting gunicorn 20.1.0
Jan 30 17:58:18 ip-172-31-7-79 web: [2022-01-30 17:58:18 +0000] [4422] [INFO] Listening at: http://127.0.0.1:8000 (4422)
Jan 30 17:58:18 ip-172-31-7-79 web: [2022-01-30 17:58:18 +0000] [4422] [INFO] Using worker: gthread
Jan 30 17:58:18 ip-172-31-7-79 web: [2022-01-30 17:58:18 +0000] [4479] [INFO] Booting worker with pid: 4479
Jan 30 17:58:21 ip-172-31-7-79 web: Not Found: /.well-known/acme-challenge/Gzo8gzkIEbLmtvGkSDhnNheml9XxNsctHJA3ufA0FYI
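In the logs above, the nginx access log records one 404 for the challenge path at 17:58:21, followed by three 200 responses from 17:58:22 onward, and web.stdout.log shows Django answering "Not Found" for the same path at 17:58:21. One way to make that pattern visible is to group the challenge requests by status code. The following is only a sketch: the regular expression assumes the access log layout shown in the question, with the last quoted field being the forwarded client address, and the name challenge_results is made up for this example.

```python
import re
from collections import defaultdict

# Matches the nginx access log layout shown in the question; the final quoted
# field is assumed to be the forwarded client address.
LINE_RE = re.compile(
    r'\[(?P<time>[^\]]+)\] "GET (?P<path>\S+) [^"]*" (?P<status>\d{3}) \d+ '
    r'"[^"]*" "[^"]*" "(?P<client>[^"]*)"'
)


def challenge_results(lines):
    """Group ACME challenge requests by HTTP status code."""
    by_status = defaultdict(list)
    for line in lines:
        match = LINE_RE.search(line)
        if match and match.group("path").startswith("/.well-known/acme-challenge/"):
            by_status[int(match.group("status"))].append(
                (match.group("time"), match.group("client"))
            )
    return dict(by_status)
```

Feeding it the lines of /var/log/nginx/access.log returns a dictionary keyed by status code, so a lone early 404 among later 200s stands out. That would be consistent with the first validation request arriving before the challenge response was being served, although the logs alone do not prove it.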
You might want to check the security group attached to your Elastic Beanstalk environment. Try adding an inbound rule that allows all traffic from all IP ranges. (Not an ideal approach, but it could help.)
In my case, I had limited the IP ranges that could connect to the website, so Certbot was unable to complete its challenges.
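To check whether a restricted inbound rule would shut out the validation servers, the allowed CIDR ranges can be compared against the client addresses seen in the access log. This is a minimal sketch using only the standard library. The function name blocked_clients and the 203.0.113.0/24 range are hypothetical, and the client addresses are copied from the log in the question, so they should not be treated as a stable list of Let's Encrypt addresses.

```python
import ipaddress


def blocked_clients(allowed_cidrs, client_ips):
    """Return the client IPs not covered by any allowed CIDR range."""
    networks = [ipaddress.ip_network(cidr) for cidr in allowed_cidrs]
    return [
        ip
        for ip in client_ips
        if not any(ipaddress.ip_address(ip) in net for net in networks)
    ]


if __name__ == "__main__":
    # Forwarded client addresses taken from the access log in the question.
    validators = ["18.196.102.134", "18.236.228.243", "66.133.109.36", "18.222.145.89"]
    # Hypothetical restricted rule versus an open rule.
    print(blocked_clients(["203.0.113.0/24"], validators))
    print(blocked_clients(["0.0.0.0/0"], validators))
```

With an open 0.0.0.0/0 rule nothing is blocked. With a narrow rule, every validation address falls outside the allowed range, which is the situation described in this answer.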
Related
Well, I've deployed my Django application on DigitalOcean and used a domain that I bought. Now, instead of the default application page, it shows 502 Bad Gateway
nginx/1.14.0 (Ubuntu). The nginx error log shows this error:
*4 connect() to unix:/home/username/project.sock failed (111: Connection refused) while connecting to upstream, client: 82.194.22.116, server: challenge.com, request: "GET / HTTP/1.1", upstream: "http://unix:/home/username/project.sock:/", host: "challenge.com"
My nginx configuration:
server {
    listen 80;
    server_name challenge.com;
    location = /favicon.ico { access_log off; log_not_found off; }
    location /static/ {
        root /home/username;
    }
    location / {
        include proxy_params;
        proxy_pass http://unix:/home/username/ccproject.sock;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-Host $server_name;
    }
}
My settings in settings.py:
ALLOWED_HOSTS = ['64.225.1.249', 'challenge.com']
And my socket file is in /home/username/
gunicorn status:
(env) progbash@challengers:~/ccproject$ sudo systemctl status gunicorn
● gunicorn.service - gunicorn daemon
Loaded: loaded (/etc/systemd/system/gunicorn.service; enabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Thu 2019-12-12 10:27:02 UTC; 1min 56s ago
Process: 29262 ExecStart=/home/username/ccproject/env/bin/gunicorn --access-logfile - --workers 3 --bind unix:/home/username/ccproject.sock
Main PID: 29262 (code=exited, status=1/FAILURE)
Dec 12 10:26:57 challenge systemd[1]: Started gunicorn daemon.
Dec 12 10:26:57 challenge gunicorn[29262]: [2019-12-12 10:26:57 +0000] [29262] [INFO] Starting gunicorn 20.0.4
Dec 12 10:26:57 challengers gunicorn[29262]: [2019-12-12 10:26:57 +0000] [29262] [ERROR] Retrying in 1 second.
Dec 12 10:26:58 challenge gunicorn[29262]: [2019-12-12 10:26:58 +0000] [29262] [ERROR] Retrying in 1 second.
Dec 12 10:26:59 challengers gunicorn[29262]: [2019-12-12 10:26:59 +0000] [29262] [ERROR] Retrying in 1 second.
Dec 12 10:27:00 challenge gunicorn[29262]: [2019-12-12 10:27:00 +0000] [29262] [ERROR] Retrying in 1 second.
Dec 12 10:27:01 challenge gunicorn[29262]: [2019-12-12 10:27:01 +0000] [29262] [ERROR] Retrying in 1 second.
Dec 12 10:27:02 challenge gunicorn[29262]: [2019-12-12 10:27:02 +0000] [29262] [ERROR] Can't connect to /home/username/ccproject.sock
Dec 12 10:27:02 challenge systemd[1]: gunicorn.service: Main process exited, code=exited, status=1/FAILURE
Dec 12 10:27:02 challenge systemd[1]: gunicorn.service: Failed with result 'exit-code'.
How did your Unix socket come to life? Do you have a /etc/systemd/system/gunicorn.socket unit, as described here: https://docs.gunicorn.org/en/stable/deploy.html
I've been trying to deploy for 2 days now, and it seems like I can't get it to work even though I went through many articles, Stack Overflow questions, and DigitalOcean tutorials.
My main tutorial is this one: https://www.digitalocean.com/community/tutorials/how-to-set-up-django-with-postgres-nginx-and-gunicorn-on-ubuntu-16-04?comment=47694#create-and-configure-a-new-django-project
When I bind gunicorn to a port (see the command below) and go to my_ip_address:8001, everything works fine:
gunicorn --bind 0.0.0.0:8001 vp.wsgi:application
But at the part where I created and edited my gunicorn.service file:
sudo nano /etc/systemd/system/gunicorn.service
[Unit]
Description=gunicorn daemon
After=network.target
[Service]
User=tony
Group=www-data
WorkingDirectory=/home/tony/vp/vp/
ExecStart=/home/tony/vp/vpenv/bin/gunicorn --workers 3 --bind unix:/home/tony/vp/vp/vp.sock vp.wsgi:application
[Install]
WantedBy=multi-user.target
And my nginx file (I replaced my ip address with my_ip_address for privacy)
sudo nano /etc/nginx/sites-available/vp
server {
    listen 80;
    server_name my_ip_address;
    location = /facivon.ico { access_log off; log_not_found off; }
    location /static/ {
        root /home/tony/vp;
    }
    location / {
        include proxy_params;
        proxy_pass http://unix:/home/tony/vp/vp/vp.sock;
    }
}
I get a bad gateway 502 error.
Even after reloading everything:
(vpenv) ~/vp/vp$ sudo systemctl daemon-reload
(vpenv) ~/vp/vp$ sudo systemctl start gunicorn
(vpenv) ~/vp/vp$ sudo systemctl enable gunicorn
(vpenv) ~/vp/vp$ sudo systemctl restart nginx
So I checked the status of gunicorn:
(vpenv) ~/vp/vp$ sudo systemctl status gunicorn
And get the error:
gunicorn.service - gunicorn daemon
Loaded: loaded (/etc/systemd/system/gunicorn.service; enabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Sun 2017-04-23 13:41:09 UTC; 18s ago
Main PID: 15438 (code=exited, status=3)
Apr 23 13:41:09 vp-first gunicorn[15438]: SECRET_KEY = os.environ["VP_SECRET_KEY"]
Apr 23 13:41:09 vp-first gunicorn[15438]: File "/home/tony/vp/vpenv/lib/python3.5/os.py", line 7
Apr 23 13:41:09 vp-first gunicorn[15438]: raise KeyError(key) from None
Apr 23 13:41:09 vp-first gunicorn[15438]: KeyError: 'VP_SECRET_KEY'
Apr 23 13:41:09 vp-first gunicorn[15438]: [2017-04-23 13:41:09 +0000] [15445] [INFO] Worker exitin
Apr 23 13:41:09 vp-first gunicorn[15438]: [2017-04-23 13:41:09 +0000] [15438] [INFO] Shutting down
Apr 23 13:41:09 vp-first gunicorn[15438]: [2017-04-23 13:41:09 +0000] [15438] [INFO] Reason: Worke
Apr 23 13:41:09 vp-first systemd[1]: gunicorn.service: Main process exited, code=exited, status=3/
Apr 23 13:41:09 vp-first systemd[1]: gunicorn.service: Unit entered failed state.
Apr 23 13:41:09 vp-first systemd[1]: gunicorn.service: Failed with result 'exit-code'.
I have placed my secret key both in ~/.bashrc (and ran source ~/.bashrc) and in my virtualenv activate file (and ran source vpenv/bin/activate).
The .sock file is nowhere to be found!
Some notes:
Before, I was getting another error saying that gunicorn could not boot, and my gunicorn and nginx config paths looked like this:
Gunicorn:
WorkingDirectory=/home/tony/vp/
ExecStart=/home/tony/vp/vpenv/bin/gunicorn --workers 3 --bind unix:/home/tony/vp/vp.sock vp.wsgi:application
Nginx:
location / {
    include proxy_params;
    proxy_pass http://unix:/home/tony/vp/vp.sock;
}
As you can see the paths were vp/vp.sock not vp/vp/vp.sock as they are now.
When I do:
$ ps -aux | grep gunicorn
I get:
tony 15624 0.0 0.1 12944 976 pts/3 S+ 13:57 0:00 grep --color=auto gunicorn
Which means that no gunicorn process is running, so there is an error.
My nginx error log file:
2017/04/23 13:41:19 [crit] 15491#15491: *2 connect() to unix:/home/tony/vp/vp/vp.sock failed (2: No such file or directory) while connecting to upstream, client: Client.IP, server: Server.IP, request: "GET / HTTP/1.1", upstream: "http://unix:/home/tony/vp/vp/vp.sock:/", host: "Server.IP"
2017/04/23 13:41:19 [crit] 15491#15491: *2 connect() to unix:/home/tony/vp/vp/vp.sock failed (2: No such file or directory) while connecting to upstream, client: Client.IP, server: Server.IP, request: "GET /favicon.ico HTTP/1.1", upstream: "http://unix:/home/tony/vp/vp/vp.sock:/favicon.ico", host: "Server.IP", referrer: "http://Server.IP/"
Here is my wsgi.py file:
import os
from django.core.wsgi import get_wsgi_application
os.environ.setdefault("DJANGO_SETTINGS_MODULE", "config.settings.production")
application = get_wsgi_application()
And yes I use multiple settings files.
I have to say that this is my first time deploying but I do my best to understand everything.
Hope you can help!!!
The new user I created did not have permission to access .bashrc.
What I did was place my environment variables inside my gunicorn.service file, like this:
[Service]
Environment=VP_SECRET_KEY=<value>
Then I restarted everything:
sudo systemctl daemon-reload
sudo systemctl start gunicorn
sudo systemctl enable gunicorn
sudo systemctl restart nginx
And done!
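For background, a systemd service does not source a user's shell startup files such as ~/.bashrc, so a variable exported there is not visible to gunicorn unless it is set in the unit file as shown above. To make this kind of failure easier to spot than a bare KeyError, settings.py could read the variable through a small helper. This is only a sketch, and the helper name require_env is an assumption for this example, not something from the original project.

```python
import os


def require_env(name, environ=None):
    """Return the value of an environment variable or fail with a clear message."""
    environ = os.environ if environ is None else environ
    try:
        return environ[name]
    except KeyError:
        raise RuntimeError(
            f"{name} is not set in this process's environment"
        ) from None


# In settings.py this would replace os.environ["VP_SECRET_KEY"]:
# SECRET_KEY = require_env("VP_SECRET_KEY")
```

The behaviour is the same as before when the variable is present. When it is missing, the message in the gunicorn log names the variable and points at the process environment rather than at the shell.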
I checked out a few other similar issues here, but I can't diagnose the problem.
My site has been occasionally going down with a 502 Bad Gateway error.
I found the following in my error log. Note that I'm using a VPS running Ubuntu 16.04 with Gunicorn and nginx.
2017/02/21 01:08:29 [crit] 1247#1247: *1 connect() to unix:/home/django/chrisblog/chrisblog.sock failed (2: No such file or directory) while connecting to upstream, client: 173.48.32.62, server: 45.32.201.31, request: "GET /redditclone/ HTTP/1.1", upstream: "http://unix:/home/django/chrisblog/chrisblog.sock:/redditclone/", host: "pythoncreate.com"
2017/02/21 01:10:36 [crit] 1575#1575: *1 connect() to unix:/home/django/chrisblog/chrisblog.sock failed (2: No such file or directory) while connecting to upstream, client: 173.48.32.62, server: 45.32.201.31, request: "GET / HTTP/1.1", upstream: "http://unix:/home/django/chrisblog/chrisblog.sock:/", host: "pythoncreate.com"
2017/02/21 01:48:04 [crit] 2342#2342: *2 connect() to unix:/home/django/chrisblog/chrisblog.sock failed (2: No such file or directory) while connecting to upstream, client: 173.48.32.62, server: 45.32.201.31, request: "GET / HTTP/1.1", upstream: "http://unix:/home/django/chrisblog/chrisblog.sock:/", host: "pythoncreate.com"
When I check the gunicorn status I get the following, so it looks like it may be failing for some reason:
gunicorn.service - gunicorn daemon
Loaded: loaded (/etc/systemd/system/gunicorn.service; enabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Tue 2017-02-21 01:46:08 UTC; 7min ago
Main PID: 2245 (code=exited, status=203/EXEC)
Feb 21 01:46:08 mydjangoblog systemd[1]: Started gunicorn daemon.
Feb 21 01:46:08 mydjangoblog systemd[1]: gunicorn.service: Main process exited, code=exited, status=203/EXEC
Feb 21 01:46:08 mydjangoblog systemd[1]: gunicorn.service: Unit entered failed state.
Feb 21 01:46:08 mydjangoblog systemd[1]: gunicorn.service: Failed with result 'exit-code'.
And here is the output of ps aux | grep nginx:
root 2341 0.0 0.1 125104 1480 ? Ss 01:47 0:00 nginx: master process /usr/sbin/nginx -g daemon on; master_process on;
www-data 2342 0.0 0.4 125444 3152 ? S 01:47 0:00 nginx: worker process
django 2461 0.0 0.2 16576 2000 pts/0 S+ 01:57 0:00 grep --color=auto nginx
Any help here is hugely appreciated.
You just need to check whether "/home/django/chrisblog/chrisblog.sock" exists.
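A quick way to run that check is to look at the path's file type, since nginx reports "No such file or directory" when the path is missing and cannot use a regular file as an upstream. Below is a minimal sketch using only the standard library; the function name socket_status is made up for this example.

```python
import os
import stat


def socket_status(path):
    """Report whether `path` is missing, a Unix socket, or some other file."""
    try:
        mode = os.stat(path).st_mode
    except FileNotFoundError:
        return "missing"
    return "socket" if stat.S_ISSOCK(mode) else "not a socket"


if __name__ == "__main__":
    # Path taken from the nginx error log in the question.
    print(socket_status("/home/django/chrisblog/chrisblog.sock"))
```

If the result is "missing", the process that should create the socket is most likely not running, which matches the failed gunicorn.service status shown in the question.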
I tried to move an instance from the Tokyo region (where, by the way, it is working normally) to the São Paulo region. I followed these basic steps to do so, but when I launch the instance from the generated AMI and start it, the browser shows a "502 Bad Gateway" message.
The main components on this relocated server are: nginx, uwsgi, django, supervisor, new relic.
All configuration is the same on the relocated server, so I restarted all services. nginx seems to be working well, although there may be an issue in how it applies the following config file for my site:
nginx/sites-available/mysite:
server {
    listen 80;
    server_name mysite.com;
    access_log /var/log/nginx/site_access.log;
    error_log /var/log/nginx/site_error.log;
    location /static {
        alias /home/ubuntu/apps/site/static/;
    }
    location /media/ {
        alias /home/ubuntu/apps/site/media/;
    }
    location / {
        client_max_body_size 400M;
        proxy_read_timeout 120;
        proxy_connect_timeout 120;
        proxy_set_header Host $http_host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header Client-IP $remote_addr;
        proxy_set_header X-Forwarded-For $remote_addr;
        proxy_pass http://127.0.0.1:8888;
        proxy_buffering off;
    }
}
To be honest, I expected it to operate normally since http://127.0.0.1:8888 is working, but I don't understand why the nginx connection is broken. I need some help so that I can investigate further. What else am I forgetting?
UPDATE:
Well... Following @Michael - sqlbot's suggestion, I checked the log files. According to this file:
/var/log/nginx/site_error.log
2015/04/06 15:34:31 [error] 832#0: *12 connect() failed (111: Connection refused)
while connecting to upstream, client:
190.233.157.2, server: mysite.com, request: "GET /favicon.ico HTTP/1.1", upstream:
"http://127.0.0.1:8888/favicon.ico", host: "54.207.136.99"
So I went to verify the connection again, and this is what it shows me:
$ ping 127.0.0.1
PING 127.0.0.1 (127.0.0.1) 56(84) bytes of data.
64 bytes from 127.0.0.1: icmp_req=1 ttl=64 time=0.035 ms
64 bytes from 127.0.0.1: icmp_req=2 ttl=64 time=0.028 ms
64 bytes from 127.0.0.1: icmp_req=3 ttl=64 time=0.028 ms
64 bytes from 127.0.0.1: icmp_req=4 ttl=64 time=0.026 ms
--- 127.0.0.1 ping statistics ---
And then I tried it with curl and after about 30 seconds, it prints the following:
$ curl 127.0.0.1:8888
curl: (56) Recv failure: Connection reset by peer
I'm getting this strange error. What does it really mean?
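As general background on the two checks above: ping only shows that the loopback address answers ICMP and says nothing about whether a process is listening on port 8888. The "111: Connection refused" in the nginx log means nothing was accepting connections on 127.0.0.1:8888 at that moment, whereas curl's "Connection reset by peer" means a TCP connection was established and the other side then closed it abruptly without sending a response. A small probe can tell the first two cases apart. This is only a sketch, and the function name probe is made up for this example.

```python
import socket


def probe(host, port, timeout=2.0):
    """Try a TCP connection and describe what happened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "accepted"
    except ConnectionRefusedError:
        return "refused"
    except socket.timeout:
        return "timed out"


if __name__ == "__main__":
    # Upstream address taken from the nginx config in the question.
    print(probe("127.0.0.1", 8888))
```

A result of "accepted" only proves that something is listening; whether it then answers HTTP requests properly is a separate question, which is where the application's own log becomes relevant.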
UPDATE 2:
Here are the uwsgi config file for mysite and its log file. However, the messages are of the same kind as those on the Tokyo server (which is working normally), so I ruled out uwsgi as the cause:
/etc/uwsgi/apps-enabled/mysite.ini
[uwsgi]
vhost = true
plugins = python
socket = /tmp/mysite.sock
master = true
enable-threads = true
processes = 2
wsgi-file = /home/ubuntu/apps/mysite/mysite/wsgi.py
virtualenv = /home/ubuntu/.venv/mysite
chdir = /home/ubuntu/apps/mysite
touch-reload = /home/ubuntu/apps/mysite/reload
/var/log/uwsgi/app/mysite.log
[uWSGI] getting INI configuration from /usr/share/uwsgi/conf/default.ini
[uWSGI] getting INI configuration from /etc/uwsgi/apps-enabled/mysite.ini
Sun Apr 12 18:29:55 2015 - *** Starting uWSGI 1.0.3-debian (64bit) on [Sun Apr 12 18:29:55 2015] ***
Sun Apr 12 18:29:55 2015 - compiled with version: 4.6.3 on 17 July 2012 02:26:54
Sun Apr 12 18:29:55 2015 - current working directory: /
Sun Apr 12 18:29:55 2015 - writing pidfile to /run/uwsgi/app/mysite/pid
Sun Apr 12 18:29:55 2015 - detected binary path: /usr/bin/uwsgi-core
Sun Apr 12 18:29:55 2015 - setgid() to 33
Sun Apr 12 18:29:55 2015 - setuid() to 33
Sun Apr 12 18:29:55 2015 - your memory page size is 4096 bytes
Sun Apr 12 18:29:55 2015 - VirtualHosting mode enabled.
Sun Apr 12 18:29:55 2015 - uwsgi socket 0 bound to UNIX address /run/uwsgi/app/mysite/socket fd 5
Sun Apr 12 18:29:55 2015 - uwsgi socket 1 bound to UNIX address /tmp/mysite.sock fd 6
Sun Apr 12 18:29:55 2015 - Python version: 2.7.3 (default, Aug 1 2012, 05:25:23) [GCC 4.6.3]
Sun Apr 12 18:29:55 2015 - Set PythonHome to /home/ubuntu/.venv/mysite
Sun Apr 12 18:29:55 2015 - Python main interpreter initialized at 0x916120
Sun Apr 12 18:29:55 2015 - threads support enabled
Sun Apr 12 18:29:55 2015 - your server socket listen backlog is limited to 100 connections
Sun Apr 12 18:29:55 2015 - *** Operational MODE: preforking ***
Sun Apr 12 18:29:57 2015 - WSGI application 0 (mountpoint='') ready on interpreter 0x916120 pid: 1137 (default app)
Sun Apr 12 18:29:57 2015 - *** uWSGI is running in multiple interpreter mode ***
Sun Apr 12 18:29:57 2015 - spawned uWSGI master process (pid: 1137)
Sun Apr 12 18:29:57 2015 - spawned uWSGI worker 1 (pid: 1236, cores: 1)
Sun Apr 12 18:29:57 2015 - spawned uWSGI worker 2 (pid: 1237, cores: 1)
Sun Apr 12 18:29:57 2015 - unable to stat() /home/ubuntu/apps/mysite/reload, reload will be triggered as soon as the file is created
UPDATE 3:
I typed netstat -nap -p | grep 8888 and it shows me:
tcp 0 0 127.0.0.1:8888 0.0.0.0:* LISTEN 7550/python
Then I ran ps aux | grep 7550 and got:
ubuntu 7550 2.4 0.4 65752 15568 ? S 21:44 0:00 /home/ubuntu/.venv/mysite/bin/python /home/ubuntu/.venv/mysite/bin/gunicorn_django -w 3 --user=ubuntu --group=ubuntu --log-level=debug --timeout 120 --log-file=/var/log/gunicorn/mysite.log -b 127.0.0.1:8888
ubuntu 7585 0.0 0.0 8104 924 pts/1 S+ 21:44 0:00 grep --color=auto 7550
Well, I checked with cat /var/log/gunicorn/mysite.log and I got this:
Traceback (most recent call last):
File "/home/ubuntu/.venv/mysite/bin/gunicorn_django", line 8, in <module>
load_entry_point('gunicorn==0.14.6', 'console_scripts', 'gunicorn_django')()
File "/home/ubuntu/.venv/mysite/local/lib/python2.7/site-packages/gunicorn/app/djangoapp.py", line 132, in run
DjangoApplication("%prog [OPTIONS] [SETTINGS_PATH]").run()
File "/home/ubuntu/.venv/mysite/local/lib/python2.7/site-packages/gunicorn/app/base.py", line 124, in run
Arbiter(self).run()
File "/home/ubuntu/.venv/mysite/local/lib/python2.7/site-packages/gunicorn/arbiter.py", line 185, in run
self.halt(reason=inst.reason, exit_status=inst.exit_status)
File "/home/ubuntu/.venv/mysite/local/lib/python2.7/site-packages/gunicorn/arbiter.py", line 280, in halt
self.stop()
File "/home/ubuntu/.venv/mysite/local/lib/python2.7/site-packages/gunicorn/arbiter.py", line 328, in stop
self.reap_workers()
File "/home/ubuntu/.venv/mysite/local/lib/python2.7/site-packages/gunicorn/arbiter.py", line 419, in reap_workers
raise HaltServer(reason, self.WORKER_BOOT_ERROR)
gunicorn.errors.HaltServer: <HaltServer 'Worker failed to boot.' 3>
2015-04-12 21:44:36 [7550] [INFO] Starting gunicorn 0.14.6
2015-04-12 21:44:36 [7550] [DEBUG] Arbiter booted
2015-04-12 21:44:36 [7550] [INFO] Listening at: http://127.0.0.1:8888 (7550)
2015-04-12 21:44:36 [7550] [INFO] Using worker: sync
2015-04-12 21:44:36 [7558] [INFO] Booting worker with pid: 7558
2015-04-12 21:44:36 [7559] [INFO] Booting worker with pid: 7559
2015-04-12 21:44:36 [7560] [INFO] Booting worker with pid: 7560
Production environment is up!
Production environment is up!
Production environment is up!
Well, Gunicorn seems to be failing (it runs inside a virtualenv), so I ran it in debug mode:
gunicorn mysite.wsgi:application --preload --debug --log-level debug
2015-04-12 22:32:42 [9085] [DEBUG] Current configuration:
2015-04-12 22:32:42 [9085] [DEBUG] access_log_format: "%(h)s %(l)s %(u)s %(t)s "%(r)s" %(s)s %(b)s "%(f)s" "%(a)s"
2015-04-12 22:32:42 [9085] [DEBUG] accesslog: None
2015-04-12 22:32:42 [9085] [DEBUG] backlog: 2048
2015-04-12 22:32:42 [9085] [DEBUG] bind: 127.0.0.1:8000
2015-04-12 22:32:42 [9085] [DEBUG] check_config: False
2015-04-12 22:32:42 [9085] [DEBUG] config: None
2015-04-12 22:32:42 [9085] [DEBUG] daemon: False
2015-04-12 22:32:42 [9085] [DEBUG] debug: True
2015-04-12 22:32:42 [9085] [DEBUG] default_proc_name: mysite.wsgi:application
2015-04-12 22:32:42 [9085] [DEBUG] django_settings: None
2015-04-12 22:32:42 [9085] [DEBUG] errorlog: -
2015-04-12 22:32:42 [9085] [DEBUG] graceful_timeout: 30
2015-04-12 22:32:42 [9085] [DEBUG] group: 1000
2015-04-12 22:32:42 [9085] [DEBUG] keepalive: 2
2015-04-12 22:32:42 [9085] [DEBUG] limit_request_field_size: 8190
2015-04-12 22:32:42 [9085] [DEBUG] limit_request_fields: 100
2015-04-12 22:32:42 [9085] [DEBUG] limit_request_line: 4094
2015-04-12 22:32:42 [9085] [DEBUG] logconfig: None
2015-04-12 22:32:42 [9085] [DEBUG] logger_class: simple
2015-04-12 22:32:42 [9085] [DEBUG] loglevel: debug
2015-04-12 22:32:42 [9085] [DEBUG] max_requests: 0
2015-04-12 22:32:42 [9085] [DEBUG] on_reload: <function on_reload at 0x7f6f421e9320>
2015-04-12 22:32:42 [9085] [DEBUG] on_starting: <function on_starting at 0x7f6f421e91b8>
2015-04-12 22:32:42 [9085] [DEBUG] pidfile: None
2015-04-12 22:32:42 [9085] [DEBUG] post_fork: <function post_fork at 0x7f6f421e9758>
2015-04-12 22:32:42 [9085] [DEBUG] post_request: <function post_request at 0x7f6f421e9b18>
2015-04-12 22:32:42 [9085] [DEBUG] pre_exec: <function pre_exec at 0x7f6f421e98c0>
2015-04-12 22:32:42 [9085] [DEBUG] pre_fork: <function pre_fork at 0x7f6f421e95f0>
2015-04-12 22:32:42 [9085] [DEBUG] pre_request: <function pre_request at 0x7f6f421e9a28>
2015-04-12 22:32:42 [9085] [DEBUG] preload_app: True
2015-04-12 22:32:42 [9085] [DEBUG] proc_name: None
2015-04-12 22:32:42 [9085] [DEBUG] pythonpath: None
2015-04-12 22:32:42 [9085] [DEBUG] secure_scheme_headers: {'X-FORWARDED-PROTOCOL': 'ssl', 'X-FORWARDED-SSL': 'on'}
2015-04-12 22:32:42 [9085] [DEBUG] spew: False
2015-04-12 22:32:42 [9085] [DEBUG] timeout: 30
2015-04-12 22:32:42 [9085] [DEBUG] tmp_upload_dir: None
2015-04-12 22:32:42 [9085] [DEBUG] umask: 0
2015-04-12 22:32:42 [9085] [DEBUG] user: 1000
2015-04-12 22:32:42 [9085] [DEBUG] when_ready: <function when_ready at 0x7f6f421e9488>
2015-04-12 22:32:42 [9085] [DEBUG] worker_class: sync
2015-04-12 22:32:42 [9085] [DEBUG] worker_connections: 1000
2015-04-12 22:32:42 [9085] [DEBUG] worker_exit: <function worker_exit at 0x7f6f421e9c80>
2015-04-12 22:32:42 [9085] [DEBUG] workers: 1
2015-04-12 22:32:42 [9085] [DEBUG] x_forwarded_for_header: X-FORWARDED-FOR
2015-04-12 22:32:42 [9085] [WARNING] debug mode: app isn't preloaded.
2015-04-12 22:32:42 [9085] [INFO] Starting gunicorn 0.14.6
2015-04-12 22:32:42 [9085] [DEBUG] Arbiter booted
2015-04-12 22:32:42 [9085] [INFO] Listening at: http://127.0.0.1:8000 (9085)
2015-04-12 22:32:42 [9085] [INFO] Using worker: sync
2015-04-12 22:32:42 [9088] [INFO] Booting worker with pid: 9088
^[[A^C2015-04-12 22:34:38 [9088] [INFO] Worker exiting (pid: 9088)
2015-04-12 22:34:38 [9085] [INFO] Handling signal: int
2015-04-12 22:34:38 [9085] [INFO] Shutting down: Master
So far I know there is a problem with Gunicorn: it fails, restarts itself and fails again, but these messages don't show me a clear error. Are there any other ideas? I'm starting to feel very confused :S
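When Gunicorn only reports "Worker failed to boot", one way to look for the hidden error is to import the WSGI module yourself, in the same virtualenv and with the same environment Gunicorn gets. A minimal sketch follows; the helper name try_import is hypothetical. Depending on the Django version, settings may only be loaded on the first request, so a clean import does not prove everything is fine.

```python
import importlib
import traceback


def try_import(module_name):
    """Return None if the module imports cleanly, else the traceback text."""
    try:
        importlib.import_module(module_name)
    except Exception:
        # Return the full traceback so the real cause is visible.
        return traceback.format_exc()
    return None
```

Run from the project directory, print(try_import("mysite.wsgi") or "imported cleanly") would show any exception raised at import time.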
Indeed, environment variables were the culprit (and me, for not noticing). They were not configured properly, so Django crashed when Gunicorn tried to run it.
I solved the problem by checking every environment variable and setting each one correctly for my EC2 instance. Thanks so much to @Serj Zaharchenko for the simple but powerful clue.
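A check like the following, run before starting Gunicorn, is one way to catch this kind of problem early. It is only a sketch: the variable names in REQUIRED_VARS are placeholders I made up, so replace them with whatever your settings module actually reads.

```python
import os

# Hypothetical names; list the variables your own settings module reads.
REQUIRED_VARS = ["DJANGO_SETTINGS_MODULE", "SECRET_KEY", "DATABASE_URL"]


def missing_env_vars(required, environ=None):
    """Return the names in `required` that are unset or empty."""
    environ = os.environ if environ is None else environ
    return [name for name in required if not environ.get(name)]
```

For example, missing_env_vars(REQUIRED_VARS) returns an empty list when everything is set, and otherwise the names that still need a value.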
I found this; I don't know whether it solves your problem.
The first line of my gunicorn_django file was "#!/opt/django/env/mysite/bin/python", which is the path to the Python interpreter in my virtualenv. Replacing it with "#!/usr/bin/env python" solved the problem.
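To check whether a script's shebang line still points at an interpreter that exists (for instance after a virtualenv was moved), something like this sketch could help. Both function names are my own, chosen for illustration.

```python
import os


def shebang_interpreter(script_path):
    """Return the interpreter path named on the first line, or None."""
    with open(script_path, "rb") as fh:
        first = fh.readline().decode("utf-8", "replace").strip()
    if not first.startswith("#!"):
        return None
    parts = first[2:].split()
    return parts[0] if parts else None


def shebang_is_valid(script_path):
    """True if the shebang names an existing, executable file."""
    interp = shebang_interpreter(script_path)
    return (
        interp is not None
        and os.path.isfile(interp)
        and os.access(interp, os.X_OK)
    )
```

Pointing shebang_is_valid at the gunicorn_django script inside the virtualenv's bin directory would show whether the path on its first line is stale.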
Over the past few weeks we've been getting more and more 502 errors. Our current stack is nginx + Gunicorn + Django on an m1.large EC2 instance backed by a small RDS instance.
They seem to become more frequent as the request load increases. I see the occasional 502 while using a browser, but our command-line scripts that hit the API (Tastypie) usually fail on their second or third request. However, if I add a sleep call to the script right before it makes a request, that request succeeds but the next one gets a 502. Note that we are using digest auth with the requests library and the slumber wrapper, hence the 401, 200 pattern.
To make debugging even trickier, the issue goes away when Gunicorn is run with the --debug option. The error persists if I remove the --debug option but explicitly limit Gunicorn to 1 worker.
My nginx.conf:
user www-data;
worker_processes 4;
pid /var/run/nginx.pid;
events {
worker_connections 768;
# multi_accept on;
}
http {
##
# Basic Settings
##
sendfile on;
tcp_nopush on;
tcp_nodelay on;
keepalive_timeout 65;
types_hash_max_size 2048;
# server_tokens off;
# server_names_hash_bucket_size 64;
# server_name_in_redirect off;
include /etc/nginx/mime.types;
default_type application/octet-stream;
##
# Logging Settings
##
access_log /var/log/nginx/access.log;
error_log /var/log/nginx/error.log;
##
# Gzip Settings
##
gzip on;
gzip_disable "msie6";
gzip_proxied any;
gzip_types application/x-ghi-packedschemafeatures-v1;
gzip_http_version 1.1;
gzip_comp_level 1;
gzip_min_length 500;
proxy_buffering on;
proxy_http_version 1.1;
##
# Virtual Host Configs
##
include /etc/nginx/conf.d/*.conf;
include /etc/nginx/sites-enabled/*;
}
Virtual Host File:
server {
listen 80;
server_name pipeline.ourdomain.com;
location / {
rewrite ^ https://$server_name$request_uri permanent;
}
}
server {
listen 443;
server_name pipeline.ourdomain.com;
ssl on;
ssl_protocols SSLv3 TLSv1;
ssl_ciphers ALL:-ADH:+HIGH:+MEDIUM:-LOW:-SSLv2:-EXP;
ssl_session_cache shared:SSL:10m;
ssl_certificate /etc/ssl/certs/ourdomain.com.combined.crt;
ssl_certificate_key /etc/ssl/private/ourdomain.com.key;
root /var/www/;
location /static/ {
alias /var/www/production/pipeline/public/;
}
location / {
proxy_pass_header Server;
proxy_set_header Host $http_host;
proxy_redirect off;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Scheme $scheme;
proxy_set_header X-Forwarded-Protocol https;
proxy_connect_timeout 240;
proxy_read_timeout 280;
proxy_pass http://localhost:8000/;
}
error_page 500 502 503 504 /static/50x.html;
}
Gunicorn Command
#!/bin/bash
set -e
LOGFILE=/var/log/gunicorn/ea_pipeline.log
LOGDIR=$(dirname $LOGFILE)
SETTINGS=production_settings
# user/group to run as
USER=ubuntu
GROUP=ubuntu
DJANGO_PATH=$(dirname $(readlink -f $0))/../
cd $DJANGO_PATH
echo $(pwd)
. ../env/bin/activate
test -d $LOGDIR || mkdir -p $LOGDIR
exec ../env/bin/gunicorn_django \
--user=$USER --group=$GROUP --log-level=debug \
--preload \
--workers=4 \
--timeout=90 \
--settings=$SETTINGS \
--limit-request-line=8190 \
--limit-request-field_size 0 \
--pythonpath=$DJANGO_PATH \
--log-file=$LOGFILE production_settings.py 2>>$LOGFILE
Sample of the Access Log:
67.134.170.194 - - [24/Aug/2012:00:28:17 +0000] "GET /api/v1/storage/ HTTP/1.1" 401 5 "-" "python-requests/0.13.8 CPython/2.7.3 Linux/3.2.0-29-generic"
67.134.170.194 - - [24/Aug/2012:00:28:18 +0000] "GET /api/v1/storage/ HTTP/1.1" 200 326 "-" "python-requests/0.13.8 CPython/2.7.3 Linux/3.2.0-29-generic"
67.134.170.194 - - [24/Aug/2012:00:28:18 +0000] "GET /api/v1/customer/?client_id=lamb_01 HTTP/1.1" 502 18 "-" "python-requests/0.13.8 CPython/2.7.3 Linux/3.2.0-29-generic"
67.134.170.194 - - [24/Aug/2012:00:29:41 +0000] "GET /api/v1/storage/ HTTP/1.1" 502 18 "-" "python-requests/0.13.8 CPython/2.7.3 Linux/3.2.0-29-generic"
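To see how often the 502s occur relative to the 401/200 pairs, the access log can be tallied by status code. The following is a rough sketch that assumes the log uses the same layout as the sample lines above; the regular expression and the function name status_counts are my own.

```python
import re
from collections import Counter

# Matches the quoted request line and the status code that follows it.
LINE_RE = re.compile(r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) ')


def status_counts(lines):
    """Count HTTP status codes across access-log lines."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            counts[match.group("status")] += 1
    return counts
```

Feeding it the lines of /var/log/nginx/access.log (the path set in the nginx.conf above) gives one count per status code, which makes it easier to tell whether the 502 rate really follows the request load.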
Nginx Error log:
2012/08/24 00:28:18 [error] 16490#0: *3 connect() failed (111: Connection refused) while connecting to upstream, client: 67.134.170.194, server: pipeline.ourdomain.com, request: "GET /api/v1/customer/?client_id=lamb_01 HTTP/1.1", upstream: "http://127.0.0.1:8000/api/v1/customer/?client_id=lamb_01", host: "pipeline.ourdomain.com"
2012/08/24 00:29:41 [error] 16490#0: *7 connect() failed (111: Connection refused) while connecting to upstream, client: 67.134.170.194, server: pipeline.ourdomain.com, request: "GET /api/v1/storage/ HTTP/1.1", upstream: "http://127.0.0.1:8000/api/v1/storage/", host: "pipeline.ourdomain.com"
Sample of the Gunicorn log:
2012-08-24 17:03:13 [8716] [INFO] Starting gunicorn 0.14.3
2012-08-24 17:03:13 [8716] [DEBUG] Arbiter booted
2012-08-24 17:03:13 [8716] [INFO] Listening at: http://127.0.0.1:8000 (8716)
2012-08-24 17:03:13 [8716] [INFO] Using worker: sync
2012-08-24 17:03:13 [8735] [INFO] Booting worker with pid: 8735
2012-08-24 17:03:13 [8736] [INFO] Booting worker with pid: 8736
2012-08-24 17:03:13 [8737] [INFO] Booting worker with pid: 8737
2012-08-24 17:03:13 [8738] [INFO] Booting worker with pid: 8738
2012-08-24 17:03:21 [8738] [DEBUG] GET /api/v1/storage/
Assertion failed: ok (mailbox.cpp:84)
2012-08-24 17:03:21 [8738] [INFO] Parent changed, shutting down: <Worker 8738>
2012-08-24 17:03:21 [8738] [INFO] Worker exiting (pid: 8738)
Error in sys.exitfunc:
2012-08-24 17:03:21 [8737] [DEBUG] GET /api/v1/storage/
2012-08-24 17:03:22 [8838] [INFO] Starting gunicorn 0.14.3
2012-08-24 17:03:22 [8838] [ERROR] Connection in use: ('127.0.0.1', 8000)
2012-08-24 17:03:22 [8838] [ERROR] Retrying in 1 second.
2012-08-24 17:03:22 [8737] [INFO] Parent changed, shutting down: <Worker 8737>
2012-08-24 17:03:22 [8737] [INFO] Worker exiting (pid: 8737)
Error in sys.exitfunc:
2012-08-24 17:03:22 [8736] [DEBUG] GET /api/v1/customer/
2012-08-24 17:03:23 [8736] [INFO] Parent changed, shutting down: <Worker 8736>
2012-08-24 17:03:23 [8736] [INFO] Worker exiting (pid: 8736)
Error in sys.exitfunc:
2012-08-24 17:03:23 [8838] [ERROR] Connection in use: ('127.0.0.1', 8000)
2012-08-24 17:03:23 [8838] [ERROR] Retrying in 1 second.
2012-08-24 17:03:24 [8735] [DEBUG] GET /api/v1/upload_action/
2012-08-24 17:03:24 [8838] [ERROR] Connection in use: ('127.0.0.1', 8000)
2012-08-24 17:03:24 [8838] [ERROR] Retrying in 1 second.
2012-08-24 17:03:24 [8735] [INFO] Parent changed, shutting down: <Worker 8735>
2012-08-24 17:03:24 [8735] [INFO] Worker exiting (pid: 8735)
Error in sys.exitfunc:
2012-08-24 17:03:25 [8838] [DEBUG] Arbiter booted
2012-08-24 17:03:25 [8838] [INFO] Listening at: http://127.0.0.1:8000 (8838)
2012-08-24 17:03:25 [8838] [INFO] Using worker: sync
2012-08-24 17:03:25 [8907] [INFO] Booting worker with pid: 8907
2012-08-24 17:03:25 [8908] [INFO] Booting worker with pid: 8908
2012-08-24 17:03:25 [8909] [INFO] Booting worker with pid: 8909
2012-08-24 17:03:25 [8910] [INFO] Booting worker with pid: 8910
This is a very old post, but I had exactly the same problem with an nginx + Gunicorn + Flask setup: a 502 error with the same log as yours on roughly every 300th request. Changing the Gunicorn worker type from the default sync worker to an asynchronous one solved the problem for me (I picked gthread, the threaded worker). I hope this answer helps someone.
How to change the setting: http://docs.gunicorn.org/en/stable/settings.html#worker-class
How to choose your worker type:
http://docs.gunicorn.org/en/latest/design.html#choosing-a-worker-type
And here a good explanation why:
How many concurrent requests does a single Flask process receive?
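As a concrete illustration of the worker change, here is a sketch of a Gunicorn config file (Gunicorn config files are plain Python). The numbers are illustrative assumptions, not tuned values, and the gthread worker needs a much newer Gunicorn than the 0.14.x releases seen in the question's log.

```python
# gunicorn.conf.py: a sketch with illustrative values, not tuned settings.
bind = "127.0.0.1:8000"    # the address nginx proxies to in the question
worker_class = "gthread"   # threaded worker instead of the default "sync"
workers = 4                # number of worker processes
threads = 2                # threads per worker process
timeout = 90               # seconds before an unresponsive worker is restarted
```

It would be loaded with gunicorn -c gunicorn.conf.py myapp.wsgi:application, where myapp is a placeholder for your own project. The same choice can be made on the command line with --worker-class gthread --threads 2.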