C++ gRPC client to Nginx SSL - c++

I have a gRPC server behind nginx, and I'm trying to write clients for it in different languages.
The Python version works well:
cred = grpc.ssl_channel_credentials()
channel = grpc.secure_channel(NAME, cred)
stub = MyServiceStub(channel)
But the analogous C++ code doesn't:
auto channel = grpc::CreateChannel(NAME, grpc::SslCredentials(grpc::SslCredentialsOptions()));
auto stub = MyService::NewStub(channel);
nginx version 1.16.1 (built with OpenSSL 1.0.2k-fips 26 Jan 2017)
server {
    listen 80;
    server_name NAME;
    location / {
        return 302 grpc://NAME$request_uri;
    }
}
server {
    listen 443 ssl http2;
    server_name NAME;
    ssl_certificate chain.pem;
    ssl_certificate_key privkey.pem;
    location / {
        grpc_pass grpc://IP;
        grpc_read_timeout 3600;
    }
}
The nginx access log prints "PRI * HTTP/2.0" 400 157 "-" "-" "-", and the debug log for a bad request shows:
[debug] 24255#0: *103047 http check ssl handshake
[debug] 24255#0: *103047 http recv(): 1
[debug] 24255#0: *103047 plain http
[debug] 24255#0: *103047 http wait request handler
but a good request gets:
[debug] 30192#0: *105029 http check ssl handshake
[debug] 30192#0: *105029 http recv(): 1
[debug] 30192#0: *105029 https ssl handshake: 0x16
[debug] 30192#0: *105029 tcp_nodelay
[debug] 30192#0: *105029 ssl get session: 5F263490:3
The gRPC version is 1.34.0, and the C++ client outputs these errors:
Failed parsing HTTP/2
Expected SETTINGS frame as the first frame, got frame type 80
failed to connect to all addresses
Trying to connect an http1.x server
I found a similar Go problem, How to find out why Nginx return 400 while use it as http/2 load balancer?, but I don't know how to pass "h2" in C++.
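A side note on decoding those errors (a Python sketch for illustration, not gRPC or nginx internals): both messages are the fingerprints of an HTTP/2 speaker talking to an HTTP/1.x parser and vice versa, which suggests the failing connection never completes the TLS/ALPN negotiation, matching the "plain http" line in the nginx debug log.

```python
# Why the C++ client reports "got frame type 80": an HTTP/2 client parses the
# first 9 bytes it receives as a frame header (3-byte length, 1-byte type,
# 1-byte flags, 4-byte stream id). If the peer actually answers with plaintext
# HTTP/1.x, byte 3 of "HTTP/1.1 400 Bad Request" is 'P' (0x50 == 80).
response = b"HTTP/1.1 400 Bad Request\r\n"
frame_type = response[3]
assert frame_type == 80 == ord("P")

# And why nginx logs "PRI * HTTP/2.0" 400: the HTTP/2 connection preface,
# read by an HTTP/1 parser, looks like a request line for method "PRI".
preface = b"PRI * HTTP/2.0\r\n\r\nSM\r\n\r\n"
request_line = preface.split(b"\r\n", 1)[0].decode()
assert request_line == "PRI * HTTP/2.0"
print(frame_type, request_line)
```

In other words, the C++ client appears to be speaking cleartext while nginx answers with an HTTP/1.x error, so it is worth double-checking that the channel target includes the TLS port explicitly (e.g. NAME:443) and that the gRPC C++ build actually has SSL support; with grpc::SslCredentials, ALPN "h2" is offered automatically.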

Related

AWS EC2 can't access my ec2 public domain, tried many web solutions none worked

I don't think this is a very common question; I'm only asking because I've already started EC2 instances using the method I'll explain below and succeeded. Maybe EC2 changed something about the right way to connect to an instance over HTTP using its public DNS. Here are the steps I've always done, and I don't know why it isn't working anymore.
public dns: ec2-23-22-52-143.compute-1.amazonaws.com
1 - Set up the default security group, which is open to all traffic
2 - Add an IAM policy to this EC2 instance
3 - SSH in and configure nginx. I used PuTTY and could log in to the instance. The nginx configuration is in /etc/nginx/sites-available/default
## default nginx config
server {
    listen 80 default_server;
    server_name _;
    # front-end
    location / {
        root /var/www/html;
        try_files $uri /index.html;
    }
    # node api
    location /api/ {
        proxy_pass http://localhost:3000/;
    }
}
4 - Clone both my front-end and back-end repositories from GitHub
5 - Build for production and move all frontend dist files to /var/www/html
6 - Start my Node.js server using pm2
7 - Start nginx
sudo nginx -t
sudo systemctl start nginx
sudo netstat -plant | grep 80
tcp 0 0 0.0.0.0:80 0.0.0.0:* LISTEN 21159/nginx: master
As you can see, nginx is listening on port 80.
I have no idea why I can't access the public DNS of this instance. I did everything exactly as I've done in the past, and these steps have always worked. If anything has changed with AWS EC2 Ubuntu 20 instances, let me know. Thanks a lot; I'm getting a headache trying to figure this out.
The last step I tried was checking the nginx logs:
cd /var/log/nginx
2022/04/05 09:42:02 [error] 8216#8216: *1 directory index of "/var/www/html/" is forbidden, client: 103.178.236.40, server: _, request: "GET http://example.>
But even doing this did not solve the issue:
sudo chmod -R 777 /var/www/html
You are accessing the site via https (443) while it's running on http (80).
Here is the result of curl.
root@MSI:~# curl -vk https://ec2-23-22-52-143.compute-1.amazonaws.com
* Rebuilt URL to: https://ec2-23-22-52-143.compute-1.amazonaws.com/
* Trying 23.22.52.143...
* TCP_NODELAY set
* connect to 23.22.52.143 port 443 failed: Connection refused
* Failed to connect to ec2-23-22-52-143.compute-1.amazonaws.com port 443: Connection refused
* Closing connection 0
curl: (7) Failed to connect to ec2-23-22-52-143.compute-1.amazonaws.com port 443: Connection refused
root@MSI:~# curl -vk http://ec2-23-22-52-143.compute-1.amazonaws.com
* Rebuilt URL to: http://ec2-23-22-52-143.compute-1.amazonaws.com/
* Trying 23.22.52.143...
* TCP_NODELAY set
* Connected to ec2-23-22-52-143.compute-1.amazonaws.com (23.22.52.143) port 80 (#0)
> GET / HTTP/1.1
> Host: ec2-23-22-52-143.compute-1.amazonaws.com
> User-Agent: curl/7.58.0
> Accept: */*
>
< HTTP/1.1 200 OK
< Server: nginx/1.18.0 (Ubuntu)
< Date: Tue, 05 Apr 2022 13:02:08 GMT
< Content-Type: text/html
< Content-Length: 1676
< Last-Modified: Tue, 05 Apr 2022 09:54:30 GMT
< Connection: keep-alive
< ETag: "624c11d6-68c"
< Accept-Ranges: bytes
<
* Connection #0 to host ec2-23-22-52-143.compute-1.amazonaws.com left intact
<!DOCTYPE html><html class="bg-image" lang="en"><head><meta charset="utf-8"><meta http-equiv="X-UA-Compatible" content="IE=edge"><meta name="viewport" content="width=device-width,initial-scale=1"><link rel="my icon" href="/assets/icone.ico" type="image/x-icon"><title>Lab301mktdigital</title><link rel="preconnect" href="https://fonts.googleapis.com"><link rel="preconnect" href="https://fonts.gstatic.com" crossorigin><link href="https://fonts.googleapis.com/css2?family=Comfortaa:wght#300;400;500;600;700&display=swap" rel="stylesheet"><link href="https://fonts.googleapis.com/css2?family=Dancing+Script:wght#400;500;600;700&display=swap" rel="stylesheet"><link href="https://fonts.googleapis.com/css2?family=Playfair+Display:ital,wght#0,400;0,500;0,600;0,700;0,800;0,900;1,400;1,500;1,600;1,700;1,800;1,900&display=swap" rel="stylesheet"><link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/#mdi/font#latest/css/materialdesignicons.min.css"><link rel="stylesheet" href="./assets/styles/general.css"><link href="/css/app.9ba0b389.css" rel="preload" as="style"><link href="/css/chunk-vendors.f754c4c0.css" rel="preload" as="style"><link href="/js/app.5380592f.js" rel="preload" as="script"><link href="/js/chunk-vendors.1ab5dd1a.js" rel="preload" as="script"><link href="/css/chunk-vendors.f754c4c0.css" rel="stylesheet"><link href="/css/app.9ba0b389.css" rel="stylesheet"></head><body><noscript><strong>We're sorry but freelancer-front-end doesn't work properly without JavaScript enabled. Please enable it to continue.</strong></noscript><div id="app"></div><script src="/js/chunk-vendors.1ab5dd1a.js"></script><script src="/js/app.5380592f.js"></script></body></html>
From the browser: (screenshot not included)
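If HTTPS access is actually desired, the server block would also need a 443 listener with a certificate, and the security group must allow inbound 443. A sketch of such a block (the certificate paths are placeholders, e.g. from Let's Encrypt):

```nginx
server {
    listen 443 ssl default_server;
    server_name _;

    # Placeholder certificate paths - substitute your own
    ssl_certificate     /etc/letsencrypt/live/example.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/example.com/privkey.pem;

    # front-end
    location / {
        root /var/www/html;
        try_files $uri /index.html;
    }

    # node api
    location /api/ {
        proxy_pass http://localhost:3000/;
    }
}
```

Until then, the "Connection refused" on port 443 in the curl output is exactly what one would expect: nothing is listening there.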

AWS API gateway returns 403 for some and 200 for others

I have an API Gateway endpoint that returns 200 for me, but when it's called by a third party they get 403.
I make requests via curl and Python requests and get 200 for both.
Bash:
curl -X POST -v --http1.1 https://939pd1ndql.execute-api.us-east-1.amazonaws.com/default/bitbucket-events
Python
requests.post('https://939pd1ndql.execute-api.us-east-1.amazonaws.com/default/bitbucket-events',
I get 200 response for each request.
However when a third party calls the endpoint they get
HTTPSConnectionPool(host='939pd1ndql.execute-api.us-east-1.amazonaws.com', port=443): Max retries exceeded with url: /default/bitbucket-events (Caused by ProxyError('Cannot connect to proxy.', error('Tunnel connection failed: 403 Forbidden',)))
The third party is Bitbucket; I am trying to create a Bitbucket app (really just a JSON payload telling Bitbucket to create a webhook).
I do not have control over how Bitbucket performs the requests, and the request is very opaque, but I pointed it at ngrok and intercepted the request it makes:
POST /default/bitbucket-events HTTP/1.1
Host: 939pd1ndql.execute-api.us-east-1.amazonaws.com
User-Agent: python-requests/2.22.0
Content-Length: 2292
Accept: */*
Accept-Encoding: gzip, deflate
Content-Type: application/json
Sentry-Trace: 00-41043c2935294252aa25ac44716a2300-86324af91ef0493e-00
X-Forwarded-For: 104.192.142.247
X-Forwarded-Proto: https
X-Newrelic-Id: VwMGVVZSGwQJVFVXDwcPXg==
X-Newrelic-Transaction: PxQPB1daXQMHVwRWAQkDUQUIFB8EBw8RVU4aWl4JDVcDUgoEBVcLVlNXDkNKQQoBBlZRAAQHFTs=
{LOTS OF JSON HERE}
Nothing in the request that Bitbucket sends looks like it could cause this problem.
The response I get from the curl command is:
* Trying 3.84.56.177...
* TCP_NODELAY set
* Connected to 939pd1ndql.execute-api.us-east-1.amazonaws.com (3.84.56.177) port 443 (#0)
* ALPN, offering http/1.1
* successfully set certificate verify locations:
* CAfile: /etc/ssl/certs/ca-certificates.crt
CApath: /etc/ssl/certs
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Client hello (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
* ALPN, server accepted to use http/1.1
* Server certificate:
* subject: CN=*.execute-api.us-east-1.amazonaws.com
* start date: Jul 22 00:00:00 2021 GMT
* expire date: Aug 20 23:59:59 2022 GMT
* subjectAltName: host "939pd1ndql.execute-api.us-east-1.amazonaws.com" matched cert's "*.execute-api.us-east-1.amazonaws.com"
* issuer: C=US; O=Amazon; OU=Server CA 1B; CN=Amazon
* SSL certificate verify ok.
> POST /default/bitbucket-events HTTP/1.1
> Host: 939pd1ndql.execute-api.us-east-1.amazonaws.com
> User-Agent: curl/7.58.0
> Accept: */*
>
< HTTP/1.1 200 OK
< Date: Tue, 12 Apr 2022 22:00:39 GMT
< Content-Type: application/json
< Content-Length: 0
< Connection: keep-alive
< x-amzn-RequestId: 78585bb0-5db4-4273-9333-45ef8b44952d
< Access-Control-Allow-Origin: *
< x-amz-apigw-id: QfN1IHrSoAMFrMw=
I have now reduced the API Gateway to just a mock endpoint that returns a 200 response, and I have set the logging to be very verbose.
But I only see log entries as a result of the curl and Python requests I make. The Bitbucket request does not result in a log line.
Could this mean the Bitbucket request is being rejected by AWS before my API Gateway handles it? I have no WAF enabled.
As you can tell, I am running out of ideas.
I replicated your setup, but with my own API Gateway. I was able to install the app, though, so I strongly suspect it is something to do with your API Gateway setup.
I am using the exact same app descriptor, with only the URL being different.
{
  "key": "codereview.doctor.staging",
  "name": "Code Review Doctor Staging",
  "description": "Target lambdas with 'staging' version alias",
  "vendor": {
    "name": "Code Review Doctor",
    "url": "https://codereview.doctor"
  },
  "baseUrl": "https://fj7987nlx3.execute-api.ap-southeast-1.amazonaws.com",
  "authentication": {
    "type": "jwt"
  },
  "lifecycle": {
    "installed": "/default/bitbucket-events",
    "uninstalled": "/default/bitbucket-events"
  },
  "modules": {
    "webhooks": [
      {
        "event": "pullrequest:created",
        "url": "/default/bitbucket-events"
      },
      {
        "event": "pullrequest:updated",
        "url": "/default/bitbucket-events"
      },
      {
        "event": "pullrequest:fulfilled",
        "url": "/default/bitbucket-events"
      }
    ]
  },
  "scopes": ["account", "repository", "pullrequest"],
  "contexts": ["account"]
}
My API GW POST configuration looks exactly like yours, so the difference may be somewhere else.
Note that I have deleted my API GW stage, so you will not be able to test using mine for now.

Load testing throws Server 502 error: Bad Gateway after 700 users. Gunicorn, Gevent, Nginx, Django

I'm trying to reach 2000 concurrent users with my benchmarking tool. I'm using locust to simulate them.
My server has 24vCPUs, 128GB RAM, 25SSD.
I want to be able to serve 2000 concurrent users without errors but after only 700 users I run into trouble.
Gunicorn
I installed gevent to be able to serve requests asynchronously, but this didn't change anything in my load test (could gevent not be working?).
My systemd file is as follows:
mysite-production.conf
[Unit]
Description=mysite production daemon
After=network.target
[Service]
User=www-data
Group=www-data
WorkingDirectory=/var/www/mysite/production/src
ExecStart=/var/www/mysite/production/venv/bin/gunicorn --worker-class=gevent --worker-connections=1000 --workers=49 --bind unix:/var/www/mysite/production/sock/gunicorn --log-level DEBUG --log-file '/var/www/mysite/production/log/gunicorn.log' mysite.wsgi:application
ExecReload=/bin/kill -s HUP $MAINPID
ExecStop=/bin/kill -s TERM $MAINPID
[Install]
WantedBy=multi-user.target
According to my calculations, 49 workers × 1000 worker connections = 49,000 concurrent connections that I could be serving (worker_connections caps concurrency, not requests per second).
Instead, at about 700 users I get the following error in my locust failures tab:
# fails Method Name Type
1227 GET // HTTPError('500 Server Error: Internal Server Error for url: http://my.site.com//')
It basically says I have an internal server error.
When I open my gunicorn.log I see Ignoring EPIPE:
gunicorn.log
[2020-01-27 20:22:30 +0000] [13121] [DEBUG] Ignoring EPIPE
[2020-01-27 20:22:31 +0000] [13121] [DEBUG] Ignoring EPIPE
[2020-01-27 20:22:31 +0000] [13121] [DEBUG] Ignoring EPIPE
[2020-01-27 20:22:31 +0000] [13121] [DEBUG] Ignoring EPIPE
[2020-01-27 20:22:31 +0000] [13121] [DEBUG] Ignoring EPIPE
Nginx
My nginx access.log shows some 499 and 500 errors:
access.log
185.159.126.246 - - [27/Jan/2020:19:22:25 +0000] "GET // HTTP/1.1" 200 30727 "-" "python-requests/2.22.0"
185.159.126.246 - - [27/Jan/2020:19:22:25 +0000] "GET // HTTP/1.1" 499 0 "-" "python-requests/2.22.0"
185.159.126.246 - - [27/Jan/2020:19:22:25 +0000] "GET // HTTP/1.1" 499 0 "-" "python-requests/2.22.0"
185.159.126.246 - - [27/Jan/2020:19:22:25 +0000] "GET // HTTP/1.1" 200 30727 "-" "python-requests/2.22.0"
185.159.126.246 - - [27/Jan/2020:19:22:25 +0000] "GET // HTTP/1.1" 500 2453 "-" "python-requests/2.22.0"
185.159.126.246 - - [27/Jan/2020:19:22:25 +0000] "GET // HTTP/1.1" 500 2453 "-" "python-requests/2.22.0"
185.159.126.246 - - [27/Jan/2020:19:22:25 +0000] "GET // HTTP/1.1" 499 0 "-" "python-requests/2.22.0"
185.159.126.246 - - [27/Jan/2020:19:22:25 +0000] "GET // HTTP/1.1" 499 0 "-" "python-requests/2.22.0"
The nginx error.log doesn't show anything, but in the previous test it showed Resource temporarily unavailable:
error.log
2020/01/27 19:15:32 [error] 1514#1514: *57151 connect() to unix:/var/www/mysite/production/sock/gunicorn failed (11: Resource temporarily unavailable) while connecting to upstream, client: 185.159.126.246, server: my.site.com, request: "GET // HTTP/1.1", upstream: "http://unix:/var/www/mysite/production/sock/gunicorn://", host: "my.site.com"
2020/01/27 19:15:32 [error] 1514#1514: *56133 connect() to unix:/var/www/mysite/production/sock/gunicorn failed (11: Resource temporarily unavailable) while connecting to upstream, client: 185.159.126.246, server: my.site.com, request: "GET // HTTP/1.1", upstream: "http://unix:/var/www/mysite/production/sock/gunicorn://", host: "my.site.com"
Here is my nginx.conf:
nginx.conf
user www-data;
worker_processes auto;
pid /run/nginx.pid;
include /etc/nginx/modules-enabled/*.conf;
worker_rlimit_nofile 40000;
events {
    worker_connections 4096;
    # multi_accept on;
}
http {
    ##
    # Basic Settings
    ##
    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;
    keepalive_timeout 65;
    types_hash_max_size 2048;
    # server_tokens off;
    # server_names_hash_bucket_size 64;
    # server_name_in_redirect off;
    include /etc/nginx/mime.types;
    default_type application/octet-stream;
    ##
    # SSL Settings
    ##
    ssl_protocols TLSv1 TLSv1.1 TLSv1.2; # Dropping SSLv3, ref: POODLE
    ssl_prefer_server_ciphers on;
    ##
    # Logging Settings
    ##
    access_log /var/log/nginx/access.log;
    error_log /var/log/nginx/error.log;
    ##
    # Gzip Settings
    ##
    gzip on;
    # gzip_vary on;
    # gzip_proxied any;
    # gzip_comp_level 6;
    # gzip_buffers 16 8k;
    # gzip_http_version 1.1;
    # gzip_types text/plain text/css application/json application/javascript text/xml application/xml application/xml+rss text/javascript;
    ##
    # Virtual Host Configs
    ##
    include /etc/nginx/conf.d/*.conf;
    include /etc/nginx/sites-enabled/*;
}
and here is my server block:
my.site.com (server-block)
upstream mysite-production {
    server unix:/var/www/mysite/production/sock/gunicorn;
}
server {
    listen [::]:80;
    listen 80;
    server_name my.site.com;
    # set client body size to 100M #
    client_max_body_size 100M;
    location / {
        include proxy_params;
        proxy_pass http://unix:/var/www/mysite/production/sock/gunicorn;
    }
    location /static/ {
        root /var/www/mysite/production/;
        expires 30d;
        add_header Vary Accept-Encoding;
        access_log off;
        gzip on;
        gzip_comp_level 6;
        gzip_vary on;
        gzip_types text/plain text/css application/json application/x-javascript application/javascript text/xml application/xml application/rss+xml text/javascript image/svg+xml application/vnd.ms-fontobject application/x-font-ttf font/opentype;
    }
    location /media/ {
        root /var/www/mysite/production/;
        expires 30d;
        add_header Vary Accept-Encoding;
        access_log off;
    }
}
Questions
Questions that come up when I look at all this:
Where does the 502 error come from? It does not show me enough information to understand what is happening... Are all my workers occupied after 700 users, causing the errors?
Where does the 499 error come from?
Does it have anything to do with my PostgreSQL database? If so, how do I find out?
How can I tell whether gevent is working or not? The results of the load tests with ab and locust haven't really changed.
What does "Ignoring EPIPE" mean and why is it happening?
I appreciate you taking the time to look at this. Thank you in advance for all your comments and answers!
If you need more info please let me know and I will add it to my question.
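On the "Ignoring EPIPE" question: gunicorn logs this when it writes a response to a client that has already hung up, which is the same event nginx records as a 499 (client closed the request). A minimal sketch with plain sockets (not gunicorn code) that reproduces the error:

```python
import socket

# A connected pair standing in for gunicorn (server side) and an impatient
# load-test client that gives up mid-request.
server_side, client_side = socket.socketpair()
client_side.close()  # the client hangs up

err = None
try:
    for _ in range(100):  # keep writing until the kernel objects
        server_side.sendall(b"HTTP/1.1 200 OK\r\n")
except OSError as exc:  # BrokenPipeError (errno EPIPE) is the expected outcome
    err = exc
finally:
    server_side.close()

print(type(err).__name__)
```

So the EPIPE entries are a symptom of clients timing out or disconnecting, not the root cause.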
EDIT:
Based on the ServerFault comments, I checked how much memory gets drained while the load test happens. Apparently nothing really significant:
MemTotal: 132027088 kB
MemFree: 127013484 kB
MemAvailable: 126458072 kB
Buffers: 57788 kB
Cached: 401088 kB
After running netstat -s (I don't have the knowledge to understand what everything means):
Ip:
Forwarding: 2
6956023 total packets received
8 with invalid addresses
0 forwarded
0 incoming packets discarded
6953503 incoming packets delivered
6722961 requests sent out
20 outgoing packets dropped
Icmp:
107 ICMP messages received
0 input ICMP message failed
ICMP input histogram:
destination unreachable: 45
echo requests: 62
968 ICMP messages sent
0 ICMP messages failed
ICMP output histogram:
destination unreachable: 906
echo replies: 62
IcmpMsg:
InType3: 45
InType8: 62
OutType0: 62
OutType3: 906
Tcp:
222160 active connection openings
136492 passive connection openings
78293 failed connection attempts
161239 connection resets received
2 connections established
6245053 segments received
6487229 segments sent out
11978 segments retransmitted
0 bad segments received
185530 resets sent
Udp:
667101 packets received
41158 packets to unknown port received
0 packet receive errors
727164 packets sent
0 receive buffer errors
0 send buffer errors
UdpLite:
TcpExt:
2939 SYN cookies sent
28 SYN cookies received
215 invalid SYN cookies received
113 resets received for embryonic SYN_RECV sockets
18751 TCP sockets finished time wait in fast timer
175 packetes rejected in established connections because of timestamp
61997 delayed acks sent
54 delayed acks further delayed because of locked socket
Quick ack mode was activated 3475 times
2687504 packet headers predicted
1171485 acknowledgments not containing data payload received
2270054 predicted acknowledgments
TCPSackRecovery: 921
Detected reordering 22500 times using SACK
Detected reordering 596 times using time stamp
139 congestion windows fully recovered without slow start
490 congestion windows partially recovered using Hoe heuristic
TCPDSACKUndo: 14
413 congestion windows recovered without slow start after partial ack
TCPLostRetransmit: 47
TCPSackFailures: 5
7 timeouts in loss state
2416 fast retransmits
123 retransmits in slow start
TCPTimeouts: 5279
TCPLossProbes: 4284
TCPLossProbeRecovery: 22
TCPSackRecoveryFail: 1
TCPDSACKOldSent: 3482
TCPDSACKRecv: 1996
TCPDSACKOfoRecv: 4
52968 connections reset due to unexpected data
54079 connections reset due to early user close
51 connections aborted due to timeout
TCPDSACKIgnoredOld: 25
TCPDSACKIgnoredNoUndo: 822
TCPSpuriousRTOs: 81
TCPSackShifted: 4787
TCPSackMerged: 5128
TCPSackShiftFallback: 33199
TCPReqQFullDoCookies: 2939
TCPRcvCoalesce: 84434
TCPOFOQueue: 26
TCPChallengeACK: 3
TCPAutoCorking: 7
TCPFromZeroWindowAdv: 125
TCPToZeroWindowAdv: 125
TCPWantZeroWindowAdv: 1376
TCPSynRetrans: 4695
TCPOrigDataSent: 5023492
TCPHystartTrainDetect: 2403
TCPHystartTrainCwnd: 53596
TCPHystartDelayDetect: 346
TCPHystartDelayCwnd: 9883
TCPACKSkippedSynRecv: 6
TCPACKSkippedPAWS: 10
TCPACKSkippedSeq: 15
TCPWinProbe: 15
IpExt:
InOctets: 6090026491
OutOctets: 7098177391
InNoECTPkts: 6966279
InECT0Pkts: 11
InCEPkts: 1
Edit 2:
After creating a static HTML page and serving it without spawning Gunicorn, nginx was able to easily serve 4000 users without any delays. After exceeding worker_connections I received:
/loadtest/ ConnectionError(MaxRetryError("HTTPConnectionPool(host='ll-my.site.com', port=80): Max retries exceeded with url: /loadtest/ (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x....>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known'))"))
What I understand from this is that NGINX is not the issue here.
Edit 3:
I checked my PostgreSQL logs and found out that they're full of this error:
FATAL: remaining connection slots are reserved for non-replication superuser connections
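That last error ties the numbers together: each gevent greenlet that touches the Django ORM can open its own database connection, so 49 workers × 1000 connections of theoretical concurrency dwarfs PostgreSQL's stock connection limit. A rough sanity check (the PostgreSQL values below are the defaults, an assumption; check your postgresql.conf):

```python
workers = 49               # gunicorn --workers
worker_connections = 1000  # gevent greenlets allowed per worker
max_connections = 100      # PostgreSQL default (assumption - check postgresql.conf)
superuser_reserved = 3     # PostgreSQL default superuser_reserved_connections

# Upper bound on simultaneous DB connections the app could try to open,
# versus the slots PostgreSQL actually offers to non-superusers.
potential_db_connections = workers * worker_connections
slots_for_applications = max_connections - superuser_reserved

print(potential_db_connections, "potential vs", slots_for_applications, "available")
assert potential_db_connections > slots_for_applications
```

Once those ~97 slots fill up, new connections fail with exactly the logged FATAL message. Typical mitigations are a server-side pooler such as PgBouncer, a higher max_connections, or simply less Gunicorn concurrency.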

Why root returns 403 error in API Gateway

I have a very simple lambda function that facilitates short URL redirection. Like so...
var env = process.env.NODE_ENV
exports.handler = async function (event) {
    var mappings = {
        "": "https://example.com",
        "/": "https://example.com",
        "/article1": "https://example.com/articles/article-title",
        "/podcasts": "https://example.com/podcasts"
    }
    return {
        body: null,
        headers: {
            "Location": mappings[event.path] || "https://example.com/four-oh-four"
        },
        isBase64Encoded: false,
        statusCode: 301
    }
}
The URL redirects just fine for all routes except the homepage (with or without a slash). Instead of the homepage, I get a "Missing Authentication Token" error from API Gateway (or rather CloudFront).
Curling doesn't appear to reveal anything... (I updated the curl output; my bad, I had left the redirect in.)
$ curl -v https://short.url/
* Trying xxx.xx.xxx.xx...
* TCP_NODELAY set
* Connected to short.url (xxx.xx.xxx.xx) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
* CAfile: /path/to/ca-certificates.crt
CApath: /path/to/certs
* (304) (OUT), TLS handshake, Client hello (1):
* (304) (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Client hello (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / xxxxxxxxxxxx-SHA256
* ALPN, server accepted to use h2
* Server certificate:
* subject: CN=*.ib.run
* start date: Apr 5 00:00:00 2019 GMT
* expire date: May 5 12:00:00 2020 GMT
* subjectAltName: host "short.url" matched cert's "short.url"
* issuer: xxx; O=xxx; OU=xxx; CN=xxx
* SSL certificate verify ok.
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* Using Stream ID: 1 (easy handle xxxxxxxx)
> GET / HTTP/2
> Host: short.url
> User-Agent: curl/7.58.0
> Accept: */*
>
* Connection state changed (MAX_CONCURRENT_STREAMS updated)!
< HTTP/2 403
< content-type: application/json
< content-length: 42
< date: Sat, 20 Jul 2019 03:51:44 GMT
< x-amzn-requestid: xxxxxxxxxx-xxxxxxxxxx-xxxxxxxxxx
< x-amzn-errortype: MissingAuthenticationTokenException
< x-amz-apigw-id: xxxxxxxxxxxxxx_
< x-cache: Error from cloudfront
< via: 1.1 xxxxxxxxxxxxxxxxxxxxxx.cloudfront.net (CloudFront)
< x-amz-cf-pop: xxxxx-xx
< x-amz-cf-id: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx===
<
* Connection #0 to host short.url left intact
{"message":"Missing Authentication Token"}
The response "Missing Authentication Token" is misleading.
It suggests that you need to provide a token.
The real error is that your routes in API Gateway are not set up properly.
So it is basically a "route not found" from API Gateway.
You need to provide a route for "/" with a specific method (or the ANY method) and point it at the Lambda function. You probably set up a subroute but no route for "/".
At the moment curl is hitting the URL "/" with the method GET, and API Gateway does not know how to route this call, so it answers with "Missing Authentication Token".
You can reproduce this behavior with every non-existent route. Try /sdfsdfsdf for example. You will get the same error.
Set up the route and you should be fine.
I hope I could help you!
Dominik
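The behavior this answer describes can be sketched as follows (an illustration of what the client observes, not AWS code; the route table is hypothetical):

```python
# Routes configured in API Gateway; note there is no entry for "/".
configured_routes = {"/article1", "/podcasts"}

def api_gateway(path: str) -> int:
    """Return the HTTP status a client would see for a given path."""
    if path in configured_routes:
        return 301  # forwarded to the Lambda, which answers with a redirect
    # Unmatched route -> 403 with the misleading "Missing Authentication Token"
    return 403

print(api_gateway("/article1"))   # 301
print(api_gateway("/"))           # 403: the asker's symptom
print(api_gateway("/sdfsdfsdf"))  # 403: any non-existent route behaves the same
```

The Lambda's mapping dict already handles "" and "/", so only the gateway-side route for "/" is missing.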

HTTP 500 Deploying Elixir/Phoenix to AWS Elastic Beanstalk

I'm having trouble with the elixir/phoenix config that I need for a deployment to AWS/Elastic Beanstalk. (Following the guide found here: https://thoughtbot.com/blog/deploying-elixir-to-aws-elastic-beanstalk-with-docker - my Dockerfile looks similar except for updated libraries).
I can run eb local run, but am having trouble pushing to production.
However, when I try to deploy to EB, I get the following warning, and it crashes:
Environment health has transitioned from Degraded to Severe.
100.0 % of the requests are failing with HTTP 5xx.
Command failed on all instances.
Incorrect application version found on all instances. Expected version "app-8412-171116_115503" (deployment 5).
ELB processes are not healthy on all instances.
100.0 % of the requests to the ELB are erroring with HTTP 4xx.
Insufficient request rate (0.5 requests/min) to determine application health (5 minutes ago).
ELB health is failing or not available for all instances.
I was wondering if someone could let me know if my configs look right.
I've been trying a bunch of things, but I think I've gotten confused, as I'm just guessing at this point.
config.exs
use Mix.Config
config :newsly,
ecto_repos: [Newsly.Repo]
config :logger, :console,
format: "$time $metadata[$level] $message\n",
metadata: [:request_id]
import_config "#{Mix.env}.exs"
prod.exs
use Mix.Config
config :logger, :console, format: "[$level] $message\n"
config :phoenix, :stacktrace_depth, 5
import_config "prod.secret.exs"
prod.secret.exs
use Mix.Config
config :ex_aws,
access_key_id: System.get_env("AWS_ACCESS_KEY_ID"),
secret_access_key: System.get_env("AWS_SECRET_ACCESS_KEY"),
bucket_name: System.get_env("BUCKET_NAME"),
s3: [
scheme: "https://",
host: System.get_env("BUCKET_NAME"),
region: "us-west-2"
]
config :newsly, Newsly.Repo,
adapter: Ecto.Adapters.Postgres,
username: System.get_env("USERNAME"),
password: System.get_env("PASSWORD"),
database: System.get_env("DATABASE"),
hostname: System.get_env("DBHOST"),
# sometimes hostname is db (like in the docker-compose method - play with this one)
pool_size: 10
config :newsly, Newsly.Endpoint,
http: [port: 4000],
debug_errors: true,
code_reloader: false,
url: [scheme: "http", host: System.get_env("HOST"), port: 4000],
secret_key_base: System.get_env("SECRET_KEY_BASE"),
pubsub: [adapter: Phoenix.PubSub.PG2, pool_size: 5, name: Newsly.PubSub],
check_origin: false,
watchers: [node: ["node_modules/brunch/bin/brunch", "watch", "--stdin",
cd: Path.expand("../", __DIR__)]]
And in my Dockerfile I set my environment variables like the following:
ENV AWS_ACCESS_KEY_ID=nottelling
ENV AWS_SECRET_ACCESS_KEY=nottelling
ENV BUCKET_NAME=s3 storage bucket (not eb related)
ENV SECRET_KEY_BASE=nottelling
ENV HOST=host name of my eb instance im uploading to
ENV DBHOST=AWS rds host that holds postgres
ENV USERNAME=nottelling
ENV PASSWORD=nottelling
My health report on the instance goes red with the following warning:
Environment health has transitioned from Warning to Severe. 100.0 % of the requests are failing with HTTP 5xx. ELB processes are not healthy on all instances. ELB health is failing or not available for all instances.
NGINX seems to be choking on lines like
2017/11/16 17:59:46 [error] 28815#0: *99 connect() failed (113: No route to host) while connecting to upstream, client: 172.31.20.108, server: , request: "GET / HTTP/1.1", upstream: "http://172.17.0.2:4000/", host: "172.31.38.244"
in the nginx logs.
If I look at the eb-activity log, I see
duplicate MIME type "text/html" in /etc/nginx/sites-enabled/elasticbeanstalk-nginx-docker-proxy.conf:11
which seems to be killing nginx:
[2017-11-16T18:02:33.927Z] INFO [29355] - [Application update app-8412-171116_115503#5/AppDeployStage1/AppDeployEnactHook/01flip.sh] : Completed activity. Result:
nginx: [warn] duplicate MIME type "text/html" in /etc/nginx/sites-enabled/elasticbeanstalk-nginx-docker-proxy.conf:11
Stopping nginx: [ OK ]
Starting nginx: nginx: [warn] duplicate MIME type "text/html" in /etc/nginx/sites-enabled/elasticbeanstalk-nginx-docker-proxy.conf:11
[ OK ]
iptables: Saving firewall rules to /etc/sysconfig/iptables: [ OK ]
Stopping current app container: e0161742ee69...
Error response from daemon: No such image: aws_beanstalk/current-app:latest
Making STAGING app container current...
Untagged: aws_beanstalk/staging-app:latest
eb-docker start/running, process 1398
Docker container e25f2b562f4f is running aws_beanstalk/current-app.
Does anyone have any ideas?
EDIT:
Digging through the nginx configuration, I found
map $http_upgrade $connection_upgrade {
    default "upgrade";
    "" "";
}
server {
    listen 80;
    gzip on;
    gzip_comp_level 4;
    gzip_types text/html text/plain text/css application/json application/x-javascript text/xml application/xml application/xml+rss text/javascript;
    if ($time_iso8601 ~ "^(\d{4})-(\d{2})-(\d{2})T(\d{2})") {
        set $year $1;
        set $month $2;
        set $day $3;
        set $hour $4;
    }
    access_log /var/log/nginx/healthd/application.log.$year-$month-$day-$hour healthd;
    access_log /var/log/nginx/access.log;
    location / {
        proxy_pass http://docker;
        proxy_http_version 1.1;
        proxy_set_header Connection $connection_upgrade;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
So,
duplicate MIME type "text/html" in /etc/nginx/sites-enabled/elasticbeanstalk-nginx-docker-proxy.conf:11
seems to be referring to this line:
gzip_types text/html text/plain text/css application/json application/x-javascript text/xml application/xml application/xml+rss text/javascript;
But at this point I would find it surprising if nginx were choking simply because it defines text/html twice. So now I'm not sure...
EDIT EDIT:
I should mention that my nginx error logs look like the following (the last lines repeat ad infinitum):
2017/11/16 17:19:22 [warn] 18445#0: duplicate MIME type "text/html" in /etc/nginx/sites-enabled/elasticbeanstalk-nginx-docker-proxy.conf:11
2017/11/16 17:19:22 [warn] 18460#0: duplicate MIME type "text/html" in /etc/nginx/sites-enabled/elasticbeanstalk-nginx-docker-proxy.conf:11
2017/11/16 17:20:06 [error] 18467#0: *11 connect() failed (113: No route to host) while connecting to upstream, client: 172.31.32.139, server: , request: "GET / HTTP/1.1", upstream: "http://172.17.0.2:4000/", host: "172.31.38.244"
2017/11/16 17:20:15 [error] 18467#0: *13 connect() failed (113: No route to host) while connecting to upstream, client: 172.31.20.108, server: , request: "GET / HTTP/1.1", upstream: "http://172.17.0.2:4000/", host: "172.31.38.244"
2017/11/16 17:20:21 [error] 18467#0: *15 connect() failed (113: No route to host) while connecting to upstream, client: 172.31.32.139, server: , request: "GET / HTTP/1.1", upstream: "http://172.17.0.2:4000/", host: "172.31.38.244"
2017/11/16 17:20:30 [error] 18467#0: *17 connect() failed (113: No route to host) while connecting to upstream, client: 172.31.20.108, server: , request: "GET / HTTP/1.1", upstream:
THIS IS THE HEART OF THE PROBLEM
NGINX fundamentally can't connect the entrypoint to the application, and I don't know why!
UPDATE:
Following Kevin Johnson's advice, I successfully pushed my Docker image to AWS ECR, and it built correctly when I ran eb deploy with a good Dockerrun.aws.json. This is in fact the preferred way to do it. HOWEVER, I still get the same error! I don't know what is going on, but I think I can safely say that my Dockerfile builds successfully.
I think there is something broken in AWS and I'm not sure what.
UPDATE
The problem is related to an NGINX routing issue. More information here, in a cleaner question: How Do I modify NGINX routing on Elastic Beanstalk AWS?
I highly suspect that the issue you are dealing with is one of the following:
1 - Your Dockerrun.aws.json file points to a non-existent Docker image in your ECR repository. This is indicated by the error message:
Error response from daemon: No such image: aws_beanstalk/current-app:latest
Making STAGING app container current...
When EB fails to replace the instance with the latest configuration, it falls back to the old one, which could be the AWS Hello World app you may have used when setting up EB. That container does not have a web service running on port 4000, whereas your Dockerrun.aws.json specifies port 4000.
(I would be surprised, though, if EB did not also revert your Dockerrun.aws.json to the previous working version.)
So check Dockerrun.aws.json and ensure that the image mentioned therein actually exists. Try pulling it locally and running it. First clean up your local environment of all images and running Docker containers if need be.
2 - Your application running within Docker crashes immediately upon startup. Again, EB will detect this and replace the crashed container with the previous container, which again does not have port 4000 open.
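For reference, a minimal single-container Dockerrun.aws.json of the shape described above (the registry path is a placeholder; the port matches the Phoenix endpoint config):

```json
{
  "AWSEBDockerrunVersion": "1",
  "Image": {
    "Name": "123456789012.dkr.ecr.us-east-1.amazonaws.com/newsly:latest",
    "Update": "true"
  },
  "Ports": [
    {
      "ContainerPort": "4000"
    }
  ]
}
```

After pushing the image, confirm the same tag actually exists in ECR and that the EB instance role has permission to pull it.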