ELB health check failing - amazon-web-services

An instance was taken out of service in response to an ELB system health check failure.
When I hit the health check endpoint with my browser it returns fine, but I'm still getting the above message.
How can I debug this?
I've looked at Instance Settings => Get System Log and at the nginx logs.
Edit:
The nginx log has
- [27/Mar/2020:05:35:42 +0000] "GET /littlehome/heartbeat/ HTTP/1.1" 200 2 2.920 2.920 "-" "ELB-HealthChecker/2.0"
- [27/Mar/2020:05:35:42 +0000] "GET /littlehome/heartbeat/ HTTP/1.1" 200 2 2.858 2.856 "-" "ELB-HealthChecker/2.0"
It definitely returned 200, and yet AWS thinks it received a 502:
{
"Target": {
"Id": "i-085e8dffe8781f876",
"Port": 80
},
"HealthCheckPort": "80",
"TargetHealth": {
"State": "unhealthy",
"Reason": "Target.ResponseCodeMismatch",
"Description": "Health checks failed with these codes: [502]"
}
},

Based on the comments, the issue was that the health check grace period in the Auto Scaling group was too short. The solution was to increase it.
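A minimal sketch of how the grace period could be raised from Python, assuming boto3 is installed and credentials are configured. The group name "my-asg" and the value of 300 seconds are placeholders, not values from the question; pick a value longer than the time the application needs to start answering the health check.

```python
def grace_period_update(asg_name, grace_seconds):
    """Build the arguments for changing an Auto Scaling group's grace period."""
    if grace_seconds < 0:
        raise ValueError("grace period must be non-negative")
    return {
        "AutoScalingGroupName": asg_name,
        "HealthCheckGracePeriod": grace_seconds,
    }


def apply_grace_period(asg_name, grace_seconds):
    """Send the update to AWS (needs boto3 and valid credentials)."""
    # Imported lazily so the helper above can be used without boto3 installed.
    import boto3

    client = boto3.client("autoscaling")
    client.update_auto_scaling_group(**grace_period_update(asg_name, grace_seconds))


# Example call (placeholder values, not run automatically):
# apply_grace_period("my-asg", 300)
```

The same setting is also available in the console under the Auto Scaling group's health check settings.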

Related

Can not create shared-domain in Cloud Foundry - 504 Gateway Time-out

I cannot create a shared domain in Cloud Foundry, and every pushed app fails its health check with "connection refused".
I had a working Cloud Foundry environment based on OpenStack IaaS, and everything worked as expected. I took my deployment files and, some time later, deployed them successfully on VMware vSphere 7. The problem is that every app I push fails its health check:
2020-10-29T16:55:01.43+0000 [CELL/0] OUT Cell 938b869c-5a68-40cc-9486-c5bc0d53a73a successfully destroyed container for instance 44e9c2a6-b54d-4fc4-4118-6d6d
2020-10-29T16:55:36.55+0000 [CELL/0] OUT Cell 938b869c-5a68-40cc-9486-c5bc0d53a73a creating container for instance 17f161a2-9788-426d-414d-6c33
2020-10-29T16:55:37.18+0000 [CELL/0] OUT Cell 938b869c-5a68-40cc-9486-c5bc0d53a73a successfully created container for instance 17f161a2-9788-426d-414d-6c33
2020-10-29T16:55:37.47+0000 [CELL/0] OUT Downloading droplet...
2020-10-29T16:55:37.75+0000 [CELL/0] OUT Downloaded droplet
2020-10-29T16:55:37.75+0000 [CELL/0] OUT Starting health monitoring of container
2020-10-29T16:56:38.45+0000 [HEALTH/0] ERR Failed to make TCP connection to port 8080: connection refused
2020-10-29T16:56:38.45+0000 [CELL/0] ERR Timed out after 1m0s: health check never passed.
2020-10-29T16:56:38.46+0000 [CELL/SSHD/0] OUT Exit status 0
2020-10-29T16:56:38.48+0000 [APP/PROC/WEB/0] OUT Exit status 143
I am also not able to create any shared domains:
bash-5.0# cf create-shared-domain tcp.cf.test-env.net --router-group default-tcp -v
REQUEST: [2020-10-29T17:03:33Z]
GET /v2/info HTTP/1.1
Host: api.cf.test-env.net
Accept: application/json
User-Agent: cf/6.47.2+d526c2cb3.2019-11-05 (go1.12.12; amd64 linux)
RESPONSE: [2020-10-29T17:03:33Z]
HTTP/1.1 200 OK
Content-Length: 561
Content-Type: application/json;charset=utf-8
Date: Thu, 29 Oct 2020 17:03:33 GMT
Server: nginx
X-Content-Type-Options: nosniff
X-Vcap-Request-Id: 4badb79b-2faf-4623-6c3c-ce5fa3223cd5::dc43d2c9-c902-4429-9d65-d9a0060983c5
{
"api_version": "2.144.0",
"app_ssh_endpoint": "ssh.cf.test-env.net:2222",
"app_ssh_host_key_fingerprint": "ae:a3:ed:ad:37:d3:8a:7b:ed:b4:e5:d2:25:e5:8c:d0",
"app_ssh_oauth_client": "ssh-proxy",
"authorization_endpoint": "https://login.cf.test-env.net",
"build": "",
"description": "",
"doppler_logging_endpoint": "wss://doppler.cf.test-env.net:443",
"min_cli_version": null,
"min_recommended_cli_version": null,
"name": "",
"osbapi_version": "2.15",
"routing_endpoint": "https://api.cf.test-env.net/routing",
"support": "",
"token_endpoint": "https://uaa.cf.test-env.net",
"version": 0
}
REQUEST: [2020-10-29T17:03:33Z]
GET /login HTTP/1.1
Host: login.cf.test-env.net
Accept: application/json
Connection: close
User-Agent: cf/6.47.2+d526c2cb3.2019-11-05 (go1.12.12; amd64 linux)
RESPONSE: [2020-10-29T17:03:34Z]
HTTP/1.1 200 OK
Cache-Control: no-store
Content-Language: en-US
Content-Length: 384
Content-Type: application/json;charset=UTF-8
Date: Thu, 29 Oct 2020 17:03:34 GMT
Set-Cookie: X-Uaa-Csrf=NJlSPAjspn7m8oWuQdKsVD; Max-Age=86400; Expires=Fri, 30-Oct-2020 17:03:34 GMT; Path=/; Secure; HttpOnly
Strict-Transport-Security: max-age=31536000
X-Content-Type-Options: nosniff
X-Frame-Options: DENY
X-Vcap-Request-Id: 577d4d31-ec30-477e-6f44-c0dd9306270d
X-Xss-Protection: 1; mode=block
{
"app": {
"version": "74.12.0"
},
"commit_id": "7311e68",
"entityID": "login.cf.test-env.net",
"idpDefinitions": {},
"links": {
"login": "https://login.cf.test-env.net",
"passwd": "/forgot_password",
"register": "/create_account",
"uaa": "https://uaa.cf.test-env.net"
},
"prompts": {
"password": "[PRIVATE DATA HIDDEN]",
"username": [
"text",
"Email"
]
},
"timestamp": "2019-12-02T22:53:03+0000",
"zone_name": "uaa"
}
Creating shared domain tcp.cf.test-env.net as admin...
REQUEST: [2020-10-29T17:03:34Z]
GET /routing/v1/router_groups?name=default-tcp HTTP/1.1
Host: api.cf.test-env.net
Accept: application/json
Authorization: [PRIVATE DATA HIDDEN]
Connection: close
Content-Type: application/json
User-Agent: cf/6.47.2+d526c2cb3.2019-11-05 (go1.12.12; amd64 linux)
[application/json Content Hidden]
RESPONSE: [2020-10-29T17:03:34Z]
HTTP/1.1 200 OK
Content-Length: 114
Content-Type: application/json
Date: Thu, 29 Oct 2020 17:03:34 GMT
X-Vcap-Request-Id: 9459b068-0987-4f5e-7dee-1efdb5ca6fb8
[
{
"guid": "343ba1e8-88a7-4003-6db6-4feabedd072b",
"name": "default-tcp",
"reservable_ports": "1024-2048",
"type": "tcp"
}
]
REQUEST: [2020-10-29T17:03:34Z]
POST /v2/shared_domains HTTP/1.1
Host: api.cf.test-env.net
Accept: application/json
Authorization: [PRIVATE DATA HIDDEN]
Content-Type: application/json
User-Agent: cf/6.47.2+d526c2cb3.2019-11-05 (go1.12.12; amd64 linux)
{
"internal": false,
"name": "tcp.cf.test-env.net",
"router_group_guid": "343ba1e8-88a7-4003-6db6-4feabedd072b"
}
RESPONSE: [2020-10-29T17:04:04Z]
HTTP/1.0 504 Gateway Time-out
Cache-Control: no-cache
Connection: close
Content-Type: text/html
<html><body><h1>504 Gateway Time-out</h1>
The server didn't respond in time.
</body></html>
Error unmarshalling the following into a cloud controller error: <html><body><h1>504 Gateway Time-out</h1>
The server didn't respond in time.
</body></html>
FAILED
I suspect a network configuration issue that blocks some internal CF components from connecting to each other. I found no firewall or other rules in VMware. I can also ping and open SSH connections between the BOSH-created VMs.
Any ideas what else I can do?

The problem was with the DNAT and SNAT rules on VMware NSX-T. When an internal VM resolved the DNS name "api.cf.test-env.net", it got the remote (public) IP address as the answer. The internal VM therefore tried to connect to api.cf.test-env.net via the public IP address, but in the second stage of the TCP three-way handshake the reply came back from the local address, which caused a TCP RST. After creating the DNAT and SNAT rules correctly, everything works as expected. I am still wondering why "api.cf.test-env.net" is not resolved by bosh-dns to the internal address. Does anyone know why that is and how it can be changed?
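To spot this kind of hairpin problem, it can help to check from inside a VM whether the API hostname resolves to a public or an internal address. The following is a rough sketch using only the Python standard library; the function names are made up for illustration, and "internal" here simply means an address in a private range.

```python
import ipaddress
import socket


def resolve_ipv4(hostname):
    """Return the distinct IPv4 addresses the local resolver gives for hostname."""
    infos = socket.getaddrinfo(
        hostname, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    return sorted({info[4][0] for info in infos})


def is_internal(address):
    """True if the address lies in a private (non-routable) range."""
    return ipaddress.ip_address(address).is_private


def describe(hostname):
    """Label each resolved address as internal or public."""
    return {
        addr: "internal" if is_internal(addr) else "public"
        for addr in resolve_ipv4(hostname)
    }


# Example, run from a BOSH-created VM:
# print(describe("api.cf.test-env.net"))
```

If the name resolves to a public address from inside the deployment, traffic to it depends on the NAT rules being correct in both directions.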

read event trigger on image downloading from s3

I use Amazon services. I have a task to track the IP address and user agent of everyone who downloads an image from S3.
I use Amazon API Gateway, AWS Lambda and Amazon S3. Is it possible? I only found triggers for uploading or deleting a file in S3.

As of now, S3 doesn't have an object read event trigger. What you can do is use CloudTrail to track the API calls that read objects in the S3 bucket, and create a CloudWatch Events rule that triggers a Lambda function.
ex: S3 -> CloudTrail -> CloudWatch Event -> Rule -> Lambda
Another simple solution would be to serve the object download through Lambda.
ex: API Gateway -> Lambda -> S3
This returns the Lambda output, which can be the blob itself (be aware of the size limit) or, preferably, a pre-signed URL for the object.
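The API Gateway -> Lambda -> S3 route could look roughly like the sketch below. It assumes a REST API with Lambda proxy integration (where the caller's IP and user agent appear under requestContext.identity), a hypothetical bucket name, and a hypothetical path parameter called "key"; adjust all three to your setup.

```python
BUCKET = "my-image-bucket"  # hypothetical bucket name


def client_info(event):
    """Pull the caller's IP and user agent out of a proxy-integration event."""
    identity = event.get("requestContext", {}).get("identity", {})
    return {
        "source_ip": identity.get("sourceIp"),
        "user_agent": identity.get("userAgent"),
    }


def handler(event, context):
    """Log who asked for the image, then redirect to a short-lived S3 URL."""
    # Imported lazily; boto3 is available in the Lambda Python runtime.
    import boto3

    info = client_info(event)
    key = event["pathParameters"]["key"]  # assumed route: /images/{key}
    print({"key": key, **info})  # printed output ends up in CloudWatch Logs

    url = boto3.client("s3").generate_presigned_url(
        "get_object",
        Params={"Bucket": BUCKET, "Key": key},
        ExpiresIn=60,  # seconds the link stays valid
    )
    return {"statusCode": 302, "headers": {"Location": url}, "body": ""}
```

Redirecting to a pre-signed URL keeps the image bytes out of the Lambda response, which avoids the payload size limit mentioned above.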

You mention that your goal is "I want to track an IP address and user agent for each request of an image from s3".
To obtain this information, you should activate Amazon S3 server access logging:
Server access logging provides detailed records for the requests that are made to a bucket.
The Amazon S3 Server Access Log Format includes IP address, User Agent and other standard web log information:
79a59df900b949e55d96a1e698fbacedfd6e09d98eacf8f8d5218e7cd47ef2be awsexamplebucket1 [06/Feb/2019:00:00:38 +0000] 192.0.2.3 79a59df900b949e55d96a1e698fbacedfd6e09d98eacf8f8d5218e7cd47ef2be 3E57427F3EXAMPLE REST.GET.VERSIONING - "GET /awsexamplebucket1?versioning HTTP/1.1" 200 - 113 - 7 - "-" "S3Console/0.4" - s9lzHYrFp76ZVxRcpX9+5cjAnEH2ROuNkd2BHfIa6UkFVdtjf5mKR3/eTPFvsiP/XV/VLi31234= SigV2 ECDHE-RSA-AES128-GCM-SHA256 AuthHeader awsexamplebucket1.s3.us-west-1.amazonaws.com TLSV1.1
79a59df900b949e55d96a1e698fbacedfd6e09d98eacf8f8d5218e7cd47ef2be awsexamplebucket1 [06/Feb/2019:00:00:38 +0000] 192.0.2.3 79a59df900b949e55d96a1e698fbacedfd6e09d98eacf8f8d5218e7cd47ef2be 891CE47D2EXAMPLE REST.GET.LOGGING_STATUS - "GET /awsexamplebucket1?logging HTTP/1.1" 200 - 242 - 11 - "-" "S3Console/0.4" - 9vKBE6vMhrNiWHZmb2L0mXOcqPGzQOI5XLnCtZNPxev+Hf+7tpT6sxDwDty4LHBUOZJG96N1234= SigV2 ECDHE-RSA-AES128-GCM-SHA256 AuthHeader awsexamplebucket1.s3.us-west-1.amazonaws.com TLSV1.1
79a59df900b949e55d96a1e698fbacedfd6e09d98eacf8f8d5218e7cd47ef2be awsexamplebucket1 [06/Feb/2019:00:00:38 +0000] 192.0.2.3 79a59df900b949e55d96a1e698fbacedfd6e09d98eacf8f8d5218e7cd47ef2be A1206F460EXAMPLE REST.GET.BUCKETPOLICY - "GET /awsexamplebucket1?policy HTTP/1.1" 404 NoSuchBucketPolicy 297 - 38 - "-" "S3Console/0.4" - BNaBsXZQQDbssi6xMBdBU2sLt+Yf5kZDmeBUP35sFoKa3sLLeMC78iwEIWxs99CRUrbS4n11234= SigV2 ECDHE-RSA-AES128-GCM-SHA256 AuthHeader awsexamplebucket1.s3.us-west-1.amazonaws.com TLSV1.1
79a59df900b949e55d96a1e698fbacedfd6e09d98eacf8f8d5218e7cd47ef2be awsexamplebucket1 [06/Feb/2019:00:01:00 +0000] 192.0.2.3 79a59df900b949e55d96a1e698fbacedfd6e09d98eacf8f8d5218e7cd47ef2be 7B4A0FABBEXAMPLE REST.GET.VERSIONING - "GET /awsexamplebucket1?versioning HTTP/1.1" 200 - 113 - 33 - "-" "S3Console/0.4" - Ke1bUcazaN1jWuUlPJaxF64cQVpUEhoZKEG/hmy/gijN/I1DeWqDfFvnpybfEseEME/u7ME1234= SigV2 ECDHE-RSA-AES128-GCM-SHA256 AuthHeader awsexamplebucket1.s3.us-west-1.amazonaws.com TLSV1.1
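Once the access logs are delivered, the IP address and user agent can be pulled out of each record with a regular expression. This is a sketch based on the field order visible in the sample records above; the pattern and field names are illustrative, only the leading fields up to the user agent are parsed, and records with unusual quoting may not match.

```python
import re

# Leading fields of an S3 server access log record, up to the user agent.
ACCESS_LOG = re.compile(
    r'^(?P<owner>\S+) (?P<bucket>\S+) \[(?P<time>[^\]]+)\] (?P<remote_ip>\S+) '
    r'(?P<requester>\S+) (?P<request_id>\S+) (?P<operation>\S+) (?P<key>\S+) '
    r'"(?P<request>[^"]*)" (?P<status>\S+) (?P<error>\S+) (?P<bytes_sent>\S+) '
    r'(?P<object_size>\S+) (?P<total_time>\S+) (?P<turnaround>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<user_agent>[^"]*)"'
)


def parse_record(line):
    """Return selected fields of one log record, or None if it does not match."""
    match = ACCESS_LOG.match(line)
    if match is None:
        return None
    return {
        "remote_ip": match.group("remote_ip"),
        "user_agent": match.group("user_agent"),
        "operation": match.group("operation"),
        "key": match.group("key"),
        "status": match.group("status"),
    }
```

For image downloads you would typically keep only records whose operation is a GET on an object and read remote_ip and user_agent from those.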

CloudWatch Custom logs are not rendering properly

When I push my IIS or MSSQL logs to CloudWatch, the logs do appear, but in CW they show up as a single line, whereas on the server they are separate events with different timestamps.
I've tried using "multi_line_start_pattern": "yyyy-MM-dd HH:mm:ss", but this doesn't solve my problem.
CloudWatch JSON file:
{
"FullName": "AWS.EC2.Windows.CloudWatch.CustomLog.CustomLogInputComponent,AWS.EC2.Windows.CloudWatch",
"Id": "IISLogs",
"Parameters": {
"CultureName": "en-US",
"Encoding": "UTF-8",
"Filter": "",
"LineCount": "5",
"LogDirectoryPath": "C:\\logfiles",
"TimeZoneKind": "UTC",
"TimestampFormat": "\\%Y-%m-%d %H:%M:%S\\" (also tried "yyyy-MM-dd HH:mm:ss" format)
}
},
{
"FullName": "AWS.EC2.Windows.CloudWatch.CloudWatchLogsOutput,AWS.EC2.Windows.CloudWatch",
"Id": "CloudWatchIISLogs",
"Parameters": {
"LogGroup": "/application/iis",
"LogStream": "{instance_id}",
"Region": "eu-west-1",
"multi_line_start_pattern": "yyyy-MM-dd HH:mm:ss"
}
}
under flows:
"(IISLogs),CloudWatchIISLogs",
Logs I see in CW: it does not detect the line break at the end of each entry, although on the IIS server the entries are separated onto their own lines. The same is happening for MSSQL.
I would expect the logs to be pushed to CW exactly as they appear on the server/instance, unlike below.
Under Time I have the timestamp.
Under Message everything arrives as a single message, although it consists of multiple messages (3 events of user1):
2019-05-31 12:19:42 ::1 GET / - 80 user ::1 Mozilla/5.0+(Windows+NT+10.0;+WOW64;+Trident/7.0;+rv:11.0)+like+Gecko - 200 0 0 2032019-05-31 12:19:43 ::1 GET / - 80 user1 ::1 Mozilla/5.0+(Windows+NT+10.0;+WOW64;+Trident/7.0;+rv:11.0)+like+Gecko - 200 0 0 152019-05-31 12:19:43 ::1 GET /libs/jquery-1.7.1.min.js - 80 user1 ::1 Mozilla/5.0+(Windows+NT+10.0;+WOW64;+Trident/7.0;+rv:11.0)+like+Gecko http://localhost/ 304 0 0 02019-05-31 12:19:43 ::1 GET /libs/canvg/canvg.js - 80 user1 ::1
The trailing status fields of each entry merge with the date/time at the start of the next entry, which is why the logs are not split up properly.
Any help would be appreciated.
Thanks

I got an answer to this: the cause was the agent we were using (SSM). After migrating to the CloudWatch agent, it is resolved.

Django-allauth facebook callback (/accounts/facebook/login/callback/) error without trace

I have successfully logged in to Facebook:
"GET /accounts/facebook/login/ HTTP/1.0" 302 0
"GET /accounts/facebook/login/callback/?code=AQCMYR8By_NW2ArWZ63Kq00twt4mSUiQ9BBApbvwt7eLWYyiMxYJkOXuRlbXzb9tq4lS-QunyFUlKxgVc0P6D3K-rl6AVkuMUZ2o7XjJi1LNmiaiUdzT6WHzWhyAbRm2SLIkm6mwgOPMI_g47h_yRE4tra1qLMZikfWq9npXC2QWybHQ9XeaFv3zS13EqaG8H9rJ-RMKmZXb9Ti4uSzK3-Vlzk1ORLWEIbIZw3YiEpqg18fSf4hb3PEB-ro7C5FmflEdoxwaig3Vdmddvl9wOyqmx1czE4bIwqtYR3yFilZ2h0o8uEj0g03rbBY5e5GGAcNyjFmgQj1zGsgMJIQDjFXO&state=fBRf1vERs3PD HTTP/1.0" 200 4362
On the front end I receive the message:
Social Network Login Failure
An error occurred while attempting to login via your social network account.
VK auth is working.
What can be wrong, and how do I debug it?
I can provide SOCIALACCOUNT_PROVIDERS and any other variable values.

Redirect based on "Accept-Language" request header leads to error on Google Cloud CDN

I am currently setting up an Nginx server on a "Google Compute Engine" behind Google's Load Balancer/CDN combo:
Website visitor <---> CDN <---> Load Balancer <---> Nginx on Google Compute Engine
I would like to redirect the visitor from https://www.example.org/ to either https://www.example.org/de/ or https://www.example.org/en/ depending on the value of the "Accept-Language" HTTP-Header in the client's request. For this purpose, I am using the following code in the nginx.conf configuration file:
set $language_suffix "en";
if ($http_accept_language ~* "^de") {
set $language_suffix "de";
}
location = / {
add_header Vary "Accept-Language";
return 303 https://www.example.org/$language_suffix/;
}
But the above config leads to a 502 error:
~> curl -I https://www.example.org/
HTTP/2 502
content-type: text/html; charset=UTF-8
referrer-policy: no-referrer
content-length: 332
date: Mon, 11 Jun 2018 09:57:55 GMT
alt-svc: clear
How can I fix this?
UPDATE:
XXX.XXX.XXX.XXX - "HEAD https://www.XXXXXXX.com/" 502 106 "curl/7.60.0" {
httpRequest: {
cacheLookup: true
remoteIp: "XXX.XXX.XXX.XXX"
requestMethod: "HEAD"
requestSize: "38"
requestUrl: "https://www.XXXXXXX.com/"
responseSize: "106"
status: 502
userAgent: "curl/7.60.0"
}
insertId: "XXXXXXXXXXXXX"
jsonPayload: {
@type: "type.googleapis.com/google.cloud.loadbalancing.type.LoadBalancerLogEntry"
statusDetails: "failed_to_pick_backend"
}
logName: "projects/crack-triode-XXXXXXXX/logs/requests"
receiveTimestamp: "2018-06-11T03:33:10.864056419Z"
resource: {
labels: {
backend_service_name: ""
forwarding_rule_name: "XXX-werbserver-ipv4-https"
project_id: "crack-triode-XXXXXXXX"
target_proxy_name: "XXX-werbserver-loadbalancer-target-proxy-2"
url_map_name: "XXX-werbserver-loadbalancer"
zone: "global"
}
type: "http_load_balancer"
}
severity: "WARNING"
spanId: "XXXXXXXXXXXXXX"
timestamp: "2018-06-11T03:33:10.088466141Z"
trace: "projects/crack-triode-XXXXXXXX/traces/XXXXXXXXXXXXXXX"
}

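You have to change the health check's request path from / to something else that returns HTTP status 200, because / now answers with the 303 redirect and the load balancer treats the backend as unhealthy. I am now using /robots.txt. The setting can be changed at: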
https://console.cloud.google.com/compute/healthChecks
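Before changing the health check path, you can confirm what each path returns on the backend without following redirects. The sketch below uses Python's standard http.client, which never follows redirects on its own; the function names are illustrative, and the host and port in the example are placeholders for your Nginx instance.

```python
import http.client


def probe_status(host, port, path, use_tls=False, timeout=5.0):
    """Send one GET request and return the raw status code (no redirects followed)."""
    conn_cls = http.client.HTTPSConnection if use_tls else http.client.HTTPConnection
    conn = conn_cls(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        return conn.getresponse().status
    finally:
        conn.close()


def passes_health_check(status):
    """The health check only counts a 200 response as healthy."""
    return status == 200


# Example against the backend (placeholder address):
# for path in ("/", "/robots.txt"):
#     status = probe_status("10.0.0.2", 80, path)
#     print(path, status, passes_health_check(status))
```

With the config from the question, / would report 303 and fail, while a static file such as /robots.txt should report 200.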
