We are trying to create an OpenShift cluster by following this guide:
https://keithtenzer.com/openshift/openshift-4-aws-ipi-installation-getting-started-guide/
We run "create cluster" but installation is failed.
The Error is following.
ERROR Attempted to gather ClusterOperator status after installation failure: listing ClusterOperator objects:
Get "https://api.xxxxx.openshift.yyyyy.private:6443/apis/config.openshift.io/v1/clusteroperators":
dial tcp: lookup api.xxxxx.openshift.yyyyy.private on aa.bbb.c.d:ee: no such host
ERROR Bootstrap failed to complete: Get "https://api.xxxxx.openshift.yyyyy.private:6443/version":
dial tcp: lookup api.xxxxx.openshift.yyyy.private on aa.bbb.c.d:ee: no such host
ERROR Failed waiting for Kubernetes API. T
We created openshift.yyyyy.private as a public hosted zone in Route 53 before the installation, but it looks like name resolution for api.xxxxx.openshift.yyyyy.private fails.
What should we do to complete the installation?
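For completeness, the failing lookup can be reproduced from the machine running the installer with dig (assuming dig is installed; the placeholder names are the ones from the log above):

$ dig +short api.xxxxx.openshift.yyyyy.private
$ dig +short NS openshift.yyyyy.private

An empty answer for the first query means the api record does not resolve from that host, which matches the "no such host" error in the installer output.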
Related
I am using MicroK8s on my local Windows 10 machine and am unable to get pods or use most kubectl commands.
At the time of the MicroK8s installation, multipass selected VirtualBox for VM virtualization.
I get the error below:
C:\Users\owner>microk8s kubectl get pods
Unable to connect to the server: dial tcp 10.0.2.15:16443: connectex: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
VirtualBox Version 6.1
I ensured that multipass list shows the VM in the Running state.
There was a prior installation of VirtualBox and minikube, both of which I have now uninstalled before installing VirtualBox alone. I also cleaned up the %USER_HOME%/.kube directory and the %USER_HOME%/AppData/local/minikube and %USER_HOME%/AppData/local/VirtualBox folders.
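For reference, a few commands that show where the wrapper's kubeconfig points versus where the VM actually is (command names as shipped with multipass and the MicroK8s Windows installer; microk8s-vm is the usual default VM name and may differ on your machine):

C:\Users\owner>multipass list
C:\Users\owner>microk8s config
C:\Users\owner>multipass shell microk8s-vm

multipass list shows the VM's IPv4 address, microk8s config prints the kubeconfig that microk8s kubectl uses (compare its server: address with that IP), and multipass shell opens a shell inside the VM for further checks.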
I am trying to set up an HA GlusterFS (version 7.7) distributed replicated storage cluster on CentOS 7. I have read some docs and tutorials, but I am stuck at the moment with my CTDB setup. Does anyone have experience with such configurations? I could use your help!
This is the error I get when starting the smb service:
ctdbd init connection: ctdbd_init_connection_internal failed (Input/output error)
clustering=yes but ctdbd connect failed: Input/output error
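In case the state of CTDB itself is the problem, a short checklist of commands to run first (standard CentOS 7 service name and file locations assumed; adjust if your packages differ):

$ systemctl status ctdb              # ctdbd must be running before the smb service starts
$ ctdb status                        # node and recovery state; run on every node
$ cat /etc/ctdb/nodes                # one private IP per node, identical on all nodes
$ testparm -s | grep -i clustering   # smb.conf must contain clustering = yes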
The OpenShift AWS installer fails while waiting for the Kubernetes API to become available, with the fatal error "waiting for Kubernetes API: context deadline exceeded":
$ openshift-install create cluster --dir=$HOME/openshift --log-level debug
...
DEBUG Still waiting for the Kubernetes API: Get https://api.cluster-name.'IP_ADDRESS'.nip.io:6443/version?timeout=32s: dial tcp 'IP_ADDRESS':6443: i/o timeout
DEBUG Still waiting for the Kubernetes API: Get https://api.cluster-name.'IP_ADDRESS'.nip.io:6443/version?timeout=32s: dial tcp 'IP_ADDRESS':6443: i/o timeout
DEBUG Still waiting for the Kubernetes API: Get https://api.cluster-name.'IP_ADDRESS'.nip.io:6443/version?timeout=32s: dial tcp 'IP_ADDRESS':6443: i/o timeout
DEBUG Still waiting for the Kubernetes API: Get https://api.cluster-name.'IP_ADDRESS'.nip.io:6443/version?timeout=32s: dial tcp 'IP_ADDRESS':6443: i/o timeout
DEBUG Fetching "Install Config"...
DEBUG Loading "Install Config"...
DEBUG Loading "SSH Key"...
DEBUG Using "SSH Key" loaded from state file
DEBUG Loading "Base Domain"...
DEBUG Loading "Platform"...
DEBUG Using "Platform" loaded from state file
DEBUG Using "Base Domain" loaded from state file
DEBUG Loading "Cluster Name"...
DEBUG Loading "Base Domain"...
DEBUG Using "Cluster Name" loaded from state file
DEBUG Loading "Pull Secret"...
DEBUG Using "Pull Secret" loaded from state file
DEBUG Loading "Platform"...
DEBUG Using "Install Config" loaded from state file
DEBUG Reusing previously-fetched "Install Config"
INFO Use the following commands to gather logs from the cluster
...
FATAL waiting for Kubernetes API: context deadline exceeded
The problem is also described here.
In my case the installer tried to connect to a Kubernetes API linked to a non-existent endpoint. One indication of this is that the oc client hangs when running a simple command like oc whoami - it tries to connect to the same endpoint (given that KUBECONFIG is set).
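A quick way to reproduce that hang, assuming the auth/kubeconfig the installer writes under the --dir directory used above:

$ export KUBECONFIG=$HOME/openshift/auth/kubeconfig
$ oc whoami    # hangs if the api.<cluster>.<domain> endpoint cannot be resolved or reached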
It turned out to be related to the Route 53 hosted zone, and in particular to a subdomain. When OpenShift is installed against a subdomain (as in my case), a record set referencing the subdomain needs to be created in the parent domain's zone. So, for openshift.example.com, do the following in the AWS console:
Go to Route 53 -> Hosted zones -> click openshift.example.com. (if it's not there, create the hosted zone) -> copy the NS records, e.g.:
ns-711.awsdns-24.net.
ns-126.awsdns-15.com.
ns-1274.awsdns-31.org.
ns-1556.awsdns-02.co.uk.
Back to Hosted Zones -> example.com. -> Create Record Set:
Create a record set for openshift.example.com with Type: NS - Name server and Value: the NS records copied above.
After this change the installation went through successfully.
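For anyone scripting this instead of using the console, the same NS delegation can be created with the AWS CLI. This is only a sketch: the hosted zone ID is a placeholder, and the name server values must be the ones copied from the openshift.example.com hosted zone as described above.

$ aws route53 change-resource-record-sets \
    --hosted-zone-id <ID-of-the-example.com-zone> \
    --change-batch '{
      "Changes": [{
        "Action": "UPSERT",
        "ResourceRecordSet": {
          "Name": "openshift.example.com",
          "Type": "NS",
          "TTL": 300,
          "ResourceRecords": [
            {"Value": "ns-711.awsdns-24.net."},
            {"Value": "ns-126.awsdns-15.com."},
            {"Value": "ns-1274.awsdns-31.org."},
            {"Value": "ns-1556.awsdns-02.co.uk."}
          ]
        }
      }]
    }'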
I was able to successfully deploy BOSH and CF on GCP. I installed the cf CLI on my worker machine and could cf login to the API endpoint without any issues. Now I am attempting to deploy a Python and a Node.js hello-world style application (cf push), but I am running into the following errors:
Python:
**ERROR** Could not install python: Get https://buildpacks.cloudfoundry.org/dependencies/python/python-3.5.4-linux-x64-5c7aa3b0.tgz: dial tcp: lookup buildpacks.cloudfoundry.org on 169.254.0.2:53: read udp 10.255.61.196:36513->169.254.0.2:53: i/o timeout
Failed to compile droplet: Failed to run all supply scripts: exit status 14
Node.js:
-----> Nodejs Buildpack version 1.6.28
-----> Installing binaries
engines.node (package.json): unspecified
engines.npm (package.json): unspecified (use default)
**WARNING** Node version not specified in package.json. See: http://docs.cloudfoundry.org/buildpacks/node/node-tips.html
-----> Installing node 6.14.3
Download [https://buildpacks.cloudfoundry.org/dependencies/node/node-6.14.3-linux-x64-ae2a82a5.tgz]
**ERROR** Unable to install node: Get https://buildpacks.cloudfoundry.org/dependencies/node/node-6.14.3-linux-x64-ae2a82a5.tgz: dial tcp: lookup buildpacks.cloudfoundry.org on 169.254.0.2:53: read udp 10.255.61.206:34802->169.254.0.2:53: i/o timeout
Failed to compile droplet: Failed to run all supply scripts: exit status 14
I am able to download and ping the buildpack URLs manually on the worker machine, the jumpbox, and the BOSH VMs, so I believe DNS is working properly on each of those machine types.
As part of the default deployment, I believe a SOCKS5 tunnel is created to allow communication from my worker machine to the jumpbox, so this is where I believe the issue lies. https://docs.cloudfoundry.org/cf-cli/http-proxy.html
Running bbl print-env exports BOSH_ALL_PROXY=ssh+socks5://jumpbox#35.192.140.0:22?private-key=/tmp/bosh-jumpbox725514160/bosh_jumpbox_private.key. However, when I export https_proxy=socks5://jumpbox#35.192.140.0:22?private-key=/tmp/bosh-jumpbox389236516/bosh_jumpbox_private.key and do a cf push, I receive the following error:
Request error: Get https://api.cloudfoundry.costub.com/v2/info: proxy: SOCKS5 proxy at 35.192.140.0:22 has unexpected version 83
TIP: If you are behind a firewall and require an HTTP proxy, verify the https_proxy environment variable is correctly set. Else, check your network connection.
FAILED
Am I on the right track? Is my https_proxy variable formatted correctly? I also tried https_proxy=socks5://jumpbox#35.192.140.0:22 with the same result.
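One thing that stands out: https_proxy=socks5://...:22 makes the cf CLI speak the SOCKS5 protocol directly to port 22, which is an SSH server, hence "unexpected version 83" (83 is ASCII "S", the first byte of the SSH banner). The ssh+socks5:// form with a private-key query parameter is understood by the BOSH CLI via BOSH_ALL_PROXY, not by the generic https_proxy variable. A sketch of an alternative, assuming the jumpbox address shown above, the key path reported by the current bbl print-env run, and an arbitrary local port 1080:

$ ssh -f -N -D 1080 \
    -i /tmp/bosh-jumpbox725514160/bosh_jumpbox_private.key \
    jumpbox@35.192.140.0
$ export https_proxy=socks5://localhost:1080
$ cf push

Here ssh -D opens a local SOCKS5 listener that forwards traffic through the jumpbox, and https_proxy then points at that local listener instead of at the SSH port itself.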
I've been having issues reaching containers from within CodeBuild. I have an exposed GraphQL service with a downstream auth service and a PostgreSQL database, all started through Docker Compose. Running and testing them works fine locally; however, I cannot get the right combination of host names in CodeBuild.
It looks like my test is able to run if I hit the GraphQL endpoint at 0.0.0.0:8000; however, once my GraphQL container attempts to reach the downstream service, I get a connection refused error. I've tried reaching the auth service from inside the GraphQL service at auth:8001 and 0.0.0.0:8001, with port 8001 exposed, and by setting up a bridged network. I always get a connection refused error.
I've attached part of my codebuild logs.
Any ideas what I might be missing?
Container 2018/08/28 05:37:17 Running command docker ps
CONTAINER ID   IMAGE                    COMMAND                  CREATED         STATUS                  PORTS                    NAMES
6c4ab1fdc980   docker-compose_graphql   "app"                    1 second ago    Up Less than a second   0.0.0.0:8000->8000/tcp   docker-compose_graphql_1
5c665f5f812d   docker-compose_auth      "/bin/sh -c app"         2 seconds ago   Up Less than a second   0.0.0.0:8001->8001/tcp   docker-compose_auth_1
b28148784c04   postgres:10.4            "docker-entrypoint..."   2 seconds ago   Up 1 second             0.0.0.0:5432->5432/tcp   docker-compose_psql_1

Container 2018/08/28 05:37:17 Running command go test ; cd ../..
Register panic: [{"message":"rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = \"transport: Error while dialing dial tcp 0.0.0.0:8001: connect: connection refused\"","path":
From the "host" machine my exposed GraphQL service could only be reached using the IP address 0.0.0.0. The internal networking was set up correctly and each service could be reached at <NAME>:<PORT> as expected, however, upon error the IP address would be shown (172.27.0.1) instead of the host name.
My problem was that not all internal connections were ready yet, which led to the "connection refused" error. Adding sleep 5 after docker-compose up gave my services time to fully initialize before testing.
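For reference, the relevant part of the build step then looks roughly like this (the test command and paths are placeholders, not the exact buildspec from the question):

docker-compose up -d
sleep 5          # give the containers time to finish initializing
go test ./...    # run the tests only after the services accept connections

A fixed sleep is the simplest fix; polling each service's port or health endpoint until it responds would be more robust, but the idea is the same.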