Failed to start Kibana on AWS machine - amazon-web-services

I'm following a blog post about using the ELK stack. The machine for installation is a small Amazon Ubuntu instance.
I got to the point when I need to install Kibana service so I run:
sudo apt-get install kibana
Then I changed the following in /etc/kibana/kibana.yml:
server.port: 5601
elasticsearch.url: "0.0.0.0:9200"
since I can get a response from Elasticsearch with sudo curl 0.0.0.0:9200.
Then I run:
sudo service kibana start
And after running sudo service kibana status I receive:
x#ip-xx-xx-xx-xx:/$ sudo service kibana status
● kibana.service - Kibana
Loaded: loaded (/etc/systemd/system/kibana.service; disabled; vendor preset: enabled)
Active: active (running) since Fri 2016-12-02 13:52:55 UTC; 13ms ago
Main PID: 5921 (node)
Tasks: 6
Memory: 1.1M
CPU: 3ms
CGroup: /system.slice/kibana.service
└─5921 /usr/share/kibana/bin/../node/bin/node --no-warnings /usr/share/kibana/bin/../src/cli -c /etc/kibana/kibana.yml
Dec 02 13:52:55 ip-xx-xx-xx-xx systemd[1]: Started Kibana.
x#ip-xx-xx-xx-xx:/$ sudo service kibana status
● kibana.service - Kibana
Loaded: loaded (/etc/systemd/system/kibana.service; disabled; vendor preset: enabled)
Active: inactive (dead)
Dec 02 13:52:56 ip-xx-xx-xx-xx kibana[5921]: buildSha: '8f2ace746d1b84702bb618308efa65dc0c3f8a34' },
Dec 02 13:52:56 ip-xx-xx-xx-xx kibana[5921]: dev: { basePathProxyTarget: 5603 },
Dec 02 13:52:56 ip-xx-xx-xx-xx kibana[5921]: pid: { exclusive: false },
Dec 02 13:52:56 ip-xx-xx-xx-xx systemd[1]: kibana.service: Main process exited, code=exited, status=1/FAILURE
Dec 02 13:52:56 ip-xx-xx-xx-xx systemd[1]: kibana.service: Unit entered failed state.
Dec 02 13:52:56 ip-xx-xx-xx-xx systemd[1]: kibana.service: Failed with result 'exit-code'.
Dec 02 13:52:57 ip-xx-xx-xx-xx systemd[1]: kibana.service: Service hold-off time over, scheduling restart.
Dec 02 13:52:57 ip-xx-xx-xx-xx systemd[1]: Stopped Kibana.
Dec 02 13:52:57 ip-xx-xx-xx-xx systemd[1]: kibana.service: Start request repeated too quickly.
Dec 02 13:52:57 ip-xx-xx-xx-xx systemd[1]: Failed to start Kibana.
Unfortunately, no log is created under the directory /var/log/kibana, even after setting ownership with chown kibana:kibana /var/log/kibana:
ll /var/log/kibana/
total 8
drwxr-xr-x 2 kibana kibana 4096 Dec 2 10:20 ./
drwxrwxr-x 9 root syslog 4096 Dec 2 09:50 ../
First of all, I'd like to see the Kibana log (a resolution of the whole problem would be even better :) )
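For reference: Kibana writes to stdout by default, so with a systemd unit the error output usually ends up in the journal (sudo journalctl -u kibana) rather than in /var/log/kibana. Below is a minimal sketch of kibana.yml settings that would send logging to a file instead, assuming Kibana 5.x and that the kibana user can write to /var/log/kibana; the server.host value and the http:// scheme on elasticsearch.url (a missing scheme is one possible cause of the immediate exit) are assumptions, not taken from the original config:

server.port: 5601
server.host: "0.0.0.0"
elasticsearch.url: "http://localhost:9200"    # note the http:// scheme
logging.dest: /var/log/kibana/kibana.log      # write logs to a file instead of stdout

After changing the file, restart with sudo service kibana restart and check the journal or the log file for the actual startup error.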

Related

Docker redis sync fell into error loop on remote server

I am running the Django dev server in Docker with Celery and Redis on my remote host machine.
Everything works fine for about 30 minutes, and then Redis falls into an infinite loop of MASTER <-> REPLICA sync attempts.
Here's the console output:
redis_1 | 1:S 16 Feb 2023 17:42:37.119 * Non blocking connect for SYNC fired the event.
redis_1 | 1:S 16 Feb 2023 17:42:37.805 # Failed to read response from the server: No error information
redis_1 | 1:S 16 Feb 2023 17:42:37.805 # Master did not respond to command during SYNC handshake
redis_1 | 1:S 16 Feb 2023 17:42:38.057 * Connecting to MASTER 194.40.243.205:8886
redis_1 | 1:S 16 Feb 2023 17:42:38.058 * MASTER <-> REPLICA sync started
redis_1 | 1:S 16 Feb 2023 17:42:38.111 * Non blocking connect for SYNC fired the event.
redis_1 | 1:S 16 Feb 2023 17:42:39.194 * Master replied to PING, replication can continue...
redis_1 | 1:S 16 Feb 2023 17:42:39.367 * Partial resynchronization not possible (no cached master)
redis_1 | 1:S 16 Feb 2023 17:42:39.449 * Full resync from master: ZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZ:1
redis_1 | 1:S 16 Feb 2023 17:42:39.449 * MASTER <-> REPLICA sync: receiving 54992 bytes from master to disk
redis_1 | 1:S 16 Feb 2023 17:42:39.607 * MASTER <-> REPLICA sync: Flushing old data
redis_1 | 1:S 16 Feb 2023 17:42:39.608 * MASTER <-> REPLICA sync: Loading DB in memory
redis_1 | 1:S 16 Feb 2023 17:42:39.621 # Wrong signature trying to load DB from file
redis_1 | 1:S 16 Feb 2023 17:42:39.623 # Failed trying to load the MASTER synchronization DB from disk: Invalid argument
redis_1 | 1:S 16 Feb 2023 17:42:39.624 * Reconnecting to MASTER 194.40.243.205:8886 after failure
redis_1 | 1:S 16 Feb 2023 17:42:39.625 * MASTER <-> REPLICA sync started
redis_1 | 1:S 16 Feb 2023 17:42:39.709 * Non blocking connect for SYNC fired the event.
redis_1 | 1:S 16 Feb 2023 17:42:40.891 # Failed to read response from the server: No error information
redis_1 | 1:S 16 Feb 2023 17:42:40.891 # Master did not respond to command during SYNC handshake
redis_1 | 1:S 16 Feb 2023 17:42:41.069 * Connecting to MASTER 194.40.243.205:8886
redis_1 | 1:S 16 Feb 2023 17:42:41.069 * MASTER <-> REPLICA sync started
redis_1 | 1:S 16 Feb 2023 17:42:41.128 * Non blocking connect for SYNC fired the event.
redis_1 | 1:S 16 Feb 2023 17:42:42.167 # Failed to read response from the server: No error information
redis_1 | 1:S 16 Feb 2023 17:42:42.167 # Master did not respond to command during SYNC handshake
redis_1 | 1:S 16 Feb 2023 17:42:43.074 * Connecting to MASTER 194.40.243.205:8886
And this is only a few seconds of output.
My docker-compose file:
services:
  redis:
    image: redis:latest
    restart: always
    ports:
      - "6379:6379"
  django:
    image: docker-app:django
    container_name: django-app
    build: .
    volumes:
      - .:/app/
    ports:
      - "8000:8000"
  celery:
    image: celery-app
    restart: on-failure
    build: .
    command:
      - celery
      - -A
      - myapp.celery_app
      - worker
      - -B
      - --loglevel=INFO
    volumes:
      - .:/app/
    container_name: celery
    depends_on:
      - django
I have no idea what's wrong here, because on my local machine everything works fine. I also tried apk-get update && apk-get upgrade, but nothing changed.
EDIT
Output on redis start
redis_1 | 1:C 16 Feb 2023 18:02:56.934 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
redis_1 | 1:C 16 Feb 2023 18:02:56.935 # Redis version=7.0.4, bits=64, commit=00000000, modified=0, pid=1, just started
redis_1 | 1:C 16 Feb 2023 18:02:56.935 # Warning: no config file specified, using the default config. In order to specify a config file use redis-server /path/to/redis.conf
redis_1 | 1:M 16 Feb 2023 18:02:56.936 * Increased maximum number of open files to 10032 (it was originally set to 1024).
redis_1 | 1:M 16 Feb 2023 18:02:56.936 * monotonic clock: POSIX clock_gettime
redis_1 | 1:M 16 Feb 2023 18:02:56.943 * Running mode=standalone, port=6379.
redis_1 | 1:M 16 Feb 2023 18:02:56.943 # Server initialized
redis_1 | 1:M 16 Feb 2023 18:02:56.945 * Loading RDB produced by version 7.0.8
redis_1 | 1:M 16 Feb 2023 18:02:56.945 * RDB age 1201 seconds
redis_1 | 1:M 16 Feb 2023 18:02:56.946 * RDB memory usage when created 1.44 Mb
redis_1 | 1:M 16 Feb 2023 18:02:56.946 * Done loading RDB, keys loaded: 0, keys expired: 0.
redis_1 | 1:M 16 Feb 2023 18:02:56.946 * DB loaded from disk: 0.001 seconds
redis_1 | 1:M 16 Feb 2023 18:02:56.946 * Ready to accept connections
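One observation: the compose file never configures replication, yet the log shows Redis connecting to MASTER 194.40.243.205:8886. A common explanation (an assumption here, not something proven by the logs) is that the "6379:6379" mapping publishes Redis to the public internet on the remote host and an outside party issued a REPLICAOF/SLAVEOF command against the unprotected instance. A minimal sketch of a more restrictive service entry, assuming only containers on the same host need to reach Redis and that REDIS_PASSWORD is an environment variable you define yourself:

  redis:
    image: redis:latest
    restart: always
    ports:
      - "127.0.0.1:6379:6379"   # publish only on the host's loopback interface
    command: ["redis-server", "--requirepass", "${REDIS_PASSWORD}"]   # require a password

Containers on the same compose network would still reach it as redis:6379; only external access through the published port is restricted.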

loop over a list of instances to do yum update failed with exit status 126

I need to automate yum update across a list of instances. I tried something like aws ssm send-command --document-name "AWS-RunShellScript" --parameters 'commands=["sudo yum -y update"]' --targets "Key=instanceids,Values=<target instance id>" --timeout-seconds 600 in my local terminal (MFA enabled, logged in as an IAM user that can list all EC2 instances in all regions via aws ec2 describe-instances). I got output with "StatusDetails": "Pending" and the update never took place.
I checked the SSM agent log after starting an SSM session on the target instance:
2021-12-08 00:03:32 INFO [ssm-agent-worker] [MessagingDeliveryService] Sending reply {
    "additionalInfo": {
        "agent": {
            "lang": "en-US",
            "name": "amazon-ssm-agent",
            "os": "",
            "osver": "1",
            "ver": ""
        },
        "dateTime": "2021-12-08T00:03:32.061Z",
        "runId": "",
        "runtimeStatusCounts": {
            "Failed": 1
        }
    },
    "documentStatus": "InProgress",
    "documentTraceOutput": "",
    "runtimeStatus": {
        "aws:runShellScript": {
            "status": "Failed",
            "code": 126,
            "name": "aws:runShellScript",
            "output": "\n----------ERROR-------\nsh: /var/lib/amazon/ssm/i-074cfdd5be7fe517b/document/orchestration/2d917bcc-fc6e-4e4b-b500-cc2e2b7bd4d6/awsrunShellScript/0.awsrunShellScript/_script.sh: Permission denied\nfailed to run commands: exit status 126",
            "startDateTime": "2021-12-08T00:03:32.024Z",
            "endDateTime": "2021-12-08T00:03:32.061Z",
            "outputS3BucketName": "",
            "outputS3KeyPrefix": "",
            "stepName": "",
            "standardOutput": "",
            "standardError": "sh: /var/lib/amazon/ssm/i-074cfdd5be7fe517b/document/orchestration/2d917bcc-fc6e-4e4b-b500-cc2e2b7bd4d6/awsrunShellScript/0.awsrunShellScript/_script.sh: Permission denied\nfailed to run commands: exit status 126"
        }
    }
}
I checked the directory permissions:
ls -al /var/lib/amazon/
total 4
drwxr-xr-x 3 root root 17 Jul 26 23:53 .
drwxr-xr-x 32 root root 4096 Aug 6 18:49 ..
drwxr-xr-x 6 root root 80 Aug 7 00:03 ssm
and one level further down:
ls -al /var/lib/amazon/ssm
total 0
drwxr-xr-x 6 root root 80 Aug 7 00:03 .
drwxr-xr-x 3 root root 17 Jul 26 23:53 ..
drw------- 2 root root 6 Aug 7 00:03 daemons
drw------- 8 root root 111 Dec 8 00:03 i-074cfdd5be7fe517b
drwxr-x--- 2 root root 39 Aug 7 00:03 ipc
drw------- 3 root root 23 Aug 7 00:03 localcommands
I also tried more basic commands like echo HelloWorld and got the same 126 error.
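Exit status 126 means the script file existed but could not be executed. Since the directory permissions above look normal and the agent runs as root, one thing worth checking (an assumption, not something visible in the output above) is whether the filesystem backing /var/lib/amazon/ssm is mounted with the noexec option, which produces exactly this "Permission denied" even for root. A quick check:

# show the mount point and options for the filesystem holding the SSM orchestration directory
findmnt -T /var/lib/amazon/ssm
# or list all mounts and look for a noexec flag on /var or /
mount | grep noexec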

CI/CD deployment of Prisma ORM to Elastic Beanstalk through CodePipeline

I'm trying to deploy a simple ExpressJS app with Prisma ORM.
Here is the full project: https://github.com/oxyn/aws-codebuild/settings
I'm trying to build through AWS CodePipeline. For the build, I selected Ubuntu (aws/codebuild/standard:5.0). As far as I know, Elastic Beanstalk runs on Amazon Linux.
I specified binaryTargets:
generator client {
  provider      = "prisma-client-js"
  binaryTargets = ["rhel-openssl-1.0.x"]
}
but still, I can't get data.
"error": {
"clientVersion": "3.0.2"
},
This is the error I'm getting inside a try/catch, and this is the log from Elastic Beanstalk:
----------------------------------------
/var/log/web.stdout.log
----------------------------------------
Sep 12 04:59:50 ip-172-31-12-243 web: /var/app/current/node_modules/.prisma/client
Sep 12 04:59:50 ip-172-31-12-243 web: To solve this problem, add the platform "rhel-openssl-1.0.x" to the "binaryTargets" attribute in the "generator" block in the "schema.prisma" file:
Sep 12 04:59:50 ip-172-31-12-243 web: generator client {
Sep 12 04:59:50 ip-172-31-12-243 web: provider = "prisma-client-js"
Sep 12 04:59:50 ip-172-31-12-243 web: binaryTargets = ["native"]
Sep 12 04:59:50 ip-172-31-12-243 web: }
Sep 12 04:59:50 ip-172-31-12-243 web: Then run "prisma generate" for your changes to take effect.
Sep 12 04:59:50 ip-172-31-12-243 web: Read more about deploying Prisma Client: https://pris.ly/d/client-generator
Sep 12 04:59:50 ip-172-31-12-243 web: at LibraryEngine.getLibQueryEnginePath (/var/app/current/node_modules/#prisma/client/runtime/index.js:25285:15)
Sep 12 04:59:50 ip-172-31-12-243 web: at async LibraryEngine.loadEngine (/var/app/current/node_modules/#prisma/client/runtime/index.js:24947:35)
Sep 12 04:59:50 ip-172-31-12-243 web: at async LibraryEngine.instantiateLibrary (/var/app/current/node_modules/#prisma/client/runtime/index.js:24913:7)
I'm generating the Prisma client in the post_build section:
post_build:
  commands:
    - echo Build completed on `date`
    - npm run generate
What am I doing wrong?
I had the same problem.
I added the command "npx prisma generate" after npm install.
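For illustration, a minimal sketch of how that could look in the buildspec, assuming an install phase exists and that the project's npm run generate simply wraps prisma generate; the phase layout other than post_build is an assumption, not taken from the original file:

phases:
  install:
    commands:
      - npm install
      - npx prisma generate   # regenerate the client using the binaryTargets from schema.prisma
  post_build:
    commands:
      - echo Build completed on `date`

It can also help to list both "native" and "rhel-openssl-1.0.x" in binaryTargets, so the same schema works both locally and on the Amazon Linux hosts used by Elastic Beanstalk.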

"Unable to load RELATIVE ref" for openapi-run.yaml in GCP Endpoints

I use Google Cloud Build to build GCP Endpoints, which relies on openapi-run.yaml to define the API. I have a large JSON schema file that I want to refer to in openapi-run.yaml:
/testRef:
  get:
    summary: test for Reference
    operationId: testRef
    responses:
      '200':
        description: A successful response
        schema:
          $ref: "/workspace/src/models/myLargeSchema.json#/Account"
But when I run gcloud endpoints services deploy openapi-run.yaml --project myProject, it gives me an error saying
"OpenAPI spec in file {openapi-run.yaml} is ill formed and cannot be parsed: Unable to load RELATIVE ref: /workspace/src/models/myLargeSchema.json"
I double-checked using the ls command and confirmed that the schema file is present:
- name: 'gcr.io/cloud-builders/gcloud'
  entrypoint: 'bash'
  args: [ '-c', 'ls ./src/models -la']
Output
Step #0: drwxr-xr-x 2 501 dialout 4096 Jan 27 13:12 .
Step #0: drwxr-xr-x 15 501 dialout 4096 Dec 6 22:53 ..
Step #0: -rw-r--r-- 1 501 dialout 2912 Dec 31 11:30 JSONSchemaValidator.ts
Step #0: -rw-r--r-- 1 501 dialout 3370813 Jan 27 13:12 myLargeSchema.json
Step #0: -rw-r--r-- 1 501 dialout 539 Dec 19 19:58 user.ts
Question
How can I write my openapi-run.yaml so that it refers to my JSON schema in another file?
I have found out that external references are NOT supported in GCP Endpoints: https://cloud.google.com/endpoints/docs/openapi/openapi-limitations#external_type_references
You are using an absolute filename, not a relative one. Try dropping the leading /workspace/ from the path in $ref.
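Given the limitation linked above, another option is to inline the needed definitions into openapi-run.yaml and reference them locally. A sketch, assuming the spec is OpenAPI/Swagger 2.0 (which Cloud Endpoints requires) and that the Account definition is copied or generated from myLargeSchema.json before deploying:

definitions:
  Account:
    type: object
    # ... properties copied or generated from myLargeSchema.json#/Account ...
paths:
  /testRef:
    get:
      summary: test for Reference
      operationId: testRef
      responses:
        '200':
          description: A successful response
          schema:
            $ref: "#/definitions/Account"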

AWS: Simple cfn-init fails on Amazon Linux 2 for no apparent reason

I am provisioning a CloudFormation stack. I am just trying to run the simplest possible cfn-init ever on an instance started from a custom AMI based on Amazon Linux 2:
EC2ESMasterNode1:
  Type: AWS::EC2::Instance
  Metadata:
    Comment: ES Cluster Master 1 instance
    AWS::CloudFormation::Init:
      config:
        commands:
          01_template_elastic:
            command:
              !Sub |
                echo "'Hello World'"
  Properties:
    ImageId: ami-09693313102a30b2c
    InstanceType: !Ref MasterInstanceType
    SubnetId: !Ref Subn1ID
    SecurityGroupIds: [!Ref SGES]
    KeyName: mykey
    UserData:
      "Fn::Base64":
        !Sub |
          #!/bin/bash -xe
          # Start cfn-init
          /opt/aws/bin/cfn-init -s ${AWS::StackName} --resource EC2ESMasterNode1 --region ${AWS::Region}
          # Send the respective signal to Cloudformation
          /opt/aws/bin/cfn-signal -e 0 --stack ${AWS::StackName} --resource EC2ESMasterNode1 --region ${AWS::Region}
    Tags:
      - Key: "Name"
        Value: !Ref Master1NodeName
/var/log/cloud-init-output.log contains the following output:
No packages needed for security; 15 packages available
Resolving Dependencies
Cloud-init v. 18.2-72.amzn2.0.6 running 'modules:final' at Wed, 02 Jan 2019 12:41:26 +0000. Up 14.42 seconds.
+ /opt/aws/bin/cfn-init -s test-elastic --resource EC2ESMasterNode1 --region eu-west-1
+ /opt/aws/bin/cfn-signal -e 0 --stack test-elastic --resource EC2ESMasterNode1 --region eu-west-1
ValidationError: Stack arn:aws:cloudformation:eu-west-1:248059334340:stack/test-elastic/9fc79150-0e8b-11e9-b135-503ac9e74cfd is in CREATE_COMPLETE state and cannot be signaled
Jan 02 12:41:27 cloud-init[2575]: util.py[WARNING]: Failed running /var/lib/cloud/instance/scripts/part-001 [1]
Jan 02 12:41:27 cloud-init[2575]: cc_scripts_user.py[WARNING]: Failed to run module scripts-user (scripts in /var/lib/cloud/instance/scripts)
Jan 02 12:41:27 cloud-init[2575]: util.py[WARNING]: Running module scripts-user (<module 'cloudinit.config.cc_scripts_user' from '/usr/lib/python2.7/site-packages/cloudinit/config/cc_scripts_user.pyc'>) failed
Cloud-init v. 18.2-72.amzn2.0.6 finished at Wed, 02 Jan 2019 12:41:27 +0000. Datasource DataSourceEc2. Up 15.30 seconds
The /var/log/cloud-init.log has the following errors:
Jan 02 12:41:26 cloud-init[2575]: handlers.py[DEBUG]: start: modules-final/config-scripts-user: running config-scripts-user with frequency once-per-instance
Jan 02 12:41:26 cloud-init[2575]: util.py[DEBUG]: Writing to /var/lib/cloud/instances/i-0c10a5ff1be475b99/sem/config_scripts_user - wb: [644] 20 bytes
Jan 02 12:41:26 cloud-init[2575]: helpers.py[DEBUG]: Running config-scripts-user using lock (<FileLock using file '/var/lib/cloud/instances/i-0c10a5ff1be475b99/sem/config_scripts_user'>)
Jan 02 12:41:26 cloud-init[2575]: util.py[DEBUG]: Running command ['/var/lib/cloud/instance/scripts/part-001'] with allowed return codes [0] (shell=True, capture=False)
Jan 02 12:41:27 cloud-init[2575]: util.py[WARNING]: Failed running /var/lib/cloud/instance/scripts/part-001 [1]
Jan 02 12:41:27 cloud-init[2575]: util.py[DEBUG]: Failed running /var/lib/cloud/instance/scripts/part-001 [1]
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/cloudinit/util.py", line 860, in runparts
subp(prefix + [exe_path], capture=False, shell=True)
File "/usr/lib/python2.7/site-packages/cloudinit/util.py", line 2053, in subp
cmd=args)
ProcessExecutionError: Unexpected error while running command.
Command: ['/var/lib/cloud/instance/scripts/part-001']
Exit code: 1
Reason: -
Stdout: -
Stderr: -
Jan 02 12:41:27 cloud-init[2575]: cc_scripts_user.py[WARNING]: Failed to run module scripts-user (scripts in /var/lib/cloud/instance/scripts)
Jan 02 12:41:27 cloud-init[2575]: handlers.py[DEBUG]: finish: modules-final/config-scripts-user: FAIL: running config-scripts-user with frequency once-per-instance
Jan 02 12:41:27 cloud-init[2575]: util.py[WARNING]: Running module scripts-user (<module 'cloudinit.config.cc_scripts_user' from '/usr/lib/python2.7/site-packages/cloudinit/config/cc_scripts_user.pyc'>) failed
Jan 02 12:41:27 cloud-init[2575]: util.py[DEBUG]: Running module scripts-user (<module 'cloudinit.config.cc_scripts_user' from '/usr/lib/python2.7/site-packages/cloudinit/config/cc_scripts_user.pyc'>) failed
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/cloudinit/stages.py", line 798, in _run_modules
freq=freq)
File "/usr/lib/python2.7/site-packages/cloudinit/cloud.py", line 54, in run
return self._runners.run(name, functor, args, freq, clear_on_fail)
File "/usr/lib/python2.7/site-packages/cloudinit/helpers.py", line 187, in run
results = functor(*args)
File "/usr/lib/python2.7/site-packages/cloudinit/config/cc_scripts_user.py", line 45, in handle
util.runparts(runparts_path)
File "/usr/lib/python2.7/site-packages/cloudinit/util.py", line 867, in runparts
% (len(failed), len(attempted)))
RuntimeError: Runparts: 1 failures in 1 attempted commands
Jan 02 12:41:27 cloud-init[2575]: stages.py[DEBUG]: Running module ssh-authkey-fingerprints (<module 'cloudinit.config.cc_ssh_authkey_fingerprints' from '/usr/lib/python2.7/site-packages/cloudinit/config/cc_ssh_authkey_fingerprints.pyc'>) with frequency once-per-instance
cat /var/log/cfn-init-cmd.log
2019-01-02 12:50:54,777 P2582 [INFO] ************************************************************
2019-01-02 12:50:54,777 P2582 [INFO] ConfigSet default
2019-01-02 12:50:54,778 P2582 [INFO] ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
2019-01-02 12:50:54,778 P2582 [INFO] Config config
2019-01-02 12:50:54,778 P2582 [INFO] ============================================================
2019-01-02 12:50:54,778 P2582 [INFO] Command 01_template_elastic
2019-01-02 12:50:54,782 P2582 [INFO] -----------------------Command Output-----------------------
2019-01-02 12:50:54,782 P2582 [INFO] 'Hello World'
2019-01-02 12:50:54,783 P2582 [INFO] ------------------------------------------------------------
2019-01-02 12:50:54,783 P2582 [INFO] Completed successfully.
Does anyone have a clue what the error is about?
Furthermore, why is the stack created successfully? (and the specific resource as well?)
The error message in /var/log/cloud-init.log means that your UserData script exited with error status 1 rather than the expected 0.
Meanwhile, your /var/log/cloud-init-output.log contains this line:
ValidationError: Stack arn:aws:cloudformation:eu-west-1:248059334340:stack/test-elastic/9fc79150-0e8b-11e9-b135-503ac9e74cfd
is in CREATE_COMPLETE state and cannot be signaled
To your other question:
Furthermore, why is the stack created successfully? (and the specific resource as well?)
It is normal behaviour for the stack to go into the CREATE_COMPLETE state once its resources are created. Running the UserData script does not, by default, delay this state.
Because you are using cfn-signal, I assume you want the CREATE_COMPLETE state to be deferred until you send the signal from UserData.
There is a good blog post on how to set this all up here.
But tl;dr -
You probably just need to add a CreationPolicy to your EC2 instance resource like this:
Resources:
  EC2ESMasterNode1:
    ...
    CreationPolicy:
      ResourceSignal:
        Count: 1
        Timeout: PT10M
That says wait for 1 signal and time out after 10 minutes. Set those according to your requirements obviously.
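As a side note, the exit status 1 in cloud-init.log most likely comes from cfn-signal itself failing (the stack was already in CREATE_COMPLETE) while the script runs under bash -xe. With the CreationPolicy in place, a common pattern, sketched below, is to signal whatever exit code cfn-init returned rather than a hard-coded 0:

#!/bin/bash -xe
# Run cfn-init to apply the AWS::CloudFormation::Init configuration
/opt/aws/bin/cfn-init -s ${AWS::StackName} --resource EC2ESMasterNode1 --region ${AWS::Region}
# Signal CloudFormation with the actual result of cfn-init
/opt/aws/bin/cfn-signal -e $? --stack ${AWS::StackName} --resource EC2ESMasterNode1 --region ${AWS::Region}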