kubectl wait - error: no matching resources found

I am installing metallb, but need to wait for resources to be created.
kubectl wait --for=condition=ready --timeout=60s -n metallb-system --all pods
But I get:
error: no matching resources found
If I don't wait, I get:
Error from server (InternalError): error when creating "STDIN": Internal error occurred: failed calling webhook "ipaddresspoolvalidationwebhook.metallb.io": failed to call webhook: Post "https://webhook-service.metallb-system.svc:443/validate-metallb-io-v1beta1-ipaddresspool?timeout=10s": dial tcp 10.106.91.126:443: connect: connection refused
Do you know how to wait for the resources to be created, so that I can then wait for the condition?
Info:
kubectl version
WARNING: This version information is deprecated and will be replaced with the output from kubectl version --short. Use --output=yaml|json to get the full version.
Client Version: version.Info{Major:"1", Minor:"25", GitVersion:"v1.25.4", GitCommit:"872a965c6c6526caa949f0c6ac028ef7aff3fb78", GitTreeState:"clean", BuildDate:"2022-11-09T13:36:36Z", GoVersion:"go1.19.3", Compiler:"gc", Platform:"linux/arm64"}
Kustomize Version: v4.5.7
Server Version: version.Info{Major:"1", Minor:"25", GitVersion:"v1.25.4", GitCommit:"872a965c6c6526caa949f0c6ac028ef7aff3fb78", GitTreeState:"clean", BuildDate:"2022-11-09T13:29:58Z", GoVersion:"go1.19.3", Compiler:"gc", Platform:"linux/arm64"}

For the error “no matching resources found”:
Wait a minute and try again. kubectl wait reports this error when no pods exist yet in the namespace, so it resolves itself once the pods have been created.
You can find an explanation of that error in Setting up Config Connector.
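Rather than waiting a fixed minute, the "try again" step can be scripted. The following is a minimal sketch, not a definitive implementation: retry_until and pods_exist are hypothetical helper names, and the 120-second and 2-second values in the usage comment are arbitrary assumptions. It polls until at least one pod exists, after which the original kubectl wait has something to match.

```shell
#!/usr/bin/env bash
# retry_until TIMEOUT INTERVAL COMMAND [ARGS...]
# Hypothetical helper: re-runs COMMAND every INTERVAL seconds until it
# succeeds or TIMEOUT seconds have passed. Returns 0 on success, 1 on timeout.
retry_until() {
  local timeout=$1 interval=$2 deadline
  shift 2
  deadline=$(( $(date +%s) + timeout ))
  until "$@"; do
    if [ "$(date +%s)" -ge "$deadline" ]; then
      return 1
    fi
    sleep "$interval"
  done
  return 0
}

# pods_exist NAMESPACE
# Succeeds once kubectl lists at least one pod in NAMESPACE.
pods_exist() {
  [ -n "$(kubectl get pods -n "$1" -o name 2>/dev/null)" ]
}

# Usage (timeout and interval values are assumptions):
#   retry_until 120 2 pods_exist metallb-system &&
#     kubectl wait --for=condition=ready --timeout=60s -n metallb-system --all pods
```

Splitting "does it exist yet" from "is it ready" keeps the kubectl wait command from the question unchanged.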
For the error when creating "STDIN":
You are getting this error because the API server is not able to connect to the webhook.
Follow the steps below:
1) Check whether your firewall rules allow TCP port 443.
2) Temporarily disable the operator
kubectl -n config-management-system scale deployment config-management-operator --replicas=0
deployment.apps/config-management-operator scaled
Delete the deployment
kubectl delete deployments.apps -n <namespace>-system <namespace>-controller-manager
deployment.apps "namespace-controller-manager" deleted
3) Create a ConfigMap in the default namespace
kubectl create configmap foo
configmap/foo created
4) Check that creating the ConfigMap fails when the label is set on the object
cat <<EOF | kubectl create -f -
apiVersion: v1
kind: ConfigMap
metadata:
  labels:
    configmanagement.gke.io/debug-force-validation-webhook: "true"
  name: foo
EOF
Error from server (InternalError): error when creating "STDIN": Internal error occurred: failed calling webhook "debug-validation.namespace.sh": failed to call webhook: Post "https://namespace-webhook-service.namespace-system.svc:443/v1/admit?timeout=3s": no endpoints available for service "namespace-webhook-service"
5) Finally, clean up with the commands below:
kubectl delete configmap foo
configmap "foo" deleted
kubectl -n config-management-system scale deployment config-management-operator --replicas=1
deployment.apps/config-management-operator scaled
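For the MetalLB webhook in the question, "connection refused" on webhook-service usually means no ready pod is backing the Service yet. As a rough sketch (service_has_endpoints is a hypothetical helper, and ipaddresspool.yaml in the usage comment is a made-up file name), you can check that the Service named in the error message has at least one ready endpoint before applying the resource. A ready endpoint does not strictly guarantee that the webhook is already serving TLS, so treat this as a heuristic.

```shell
#!/usr/bin/env bash
# service_has_endpoints NAMESPACE SERVICE
# Hypothetical helper: succeeds if the Endpoints object for SERVICE lists
# at least one ready address, i.e. some pod is backing the Service.
service_has_endpoints() {
  local ips
  ips=$(kubectl get endpoints "$2" -n "$1" \
    -o jsonpath='{.subsets[*].addresses[*].ip}' 2>/dev/null) || return 1
  [ -n "$ips" ]
}

# Usage (names taken from the error message in the question):
#   until service_has_endpoints metallb-system webhook-service; do sleep 2; done
#   kubectl apply -f ipaddresspool.yaml   # hypothetical file name
```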

Related

Helm hook for post-install, post-upgrade using busybox wget is failing

I am trying to deploy a Helm post-install, post-upgrade hook which will create a simple pod with busybox and perform a wget on an app's application port to ensure the app is reachable.
I cannot get the hook to pass, even though I know the sample app is up and available.
Here is the manifest:
apiVersion: v1
kind: Pod
metadata:
  name: post-install-test
  annotations:
    "helm.sh/hook": post-install,post-upgrade
    "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
spec:
  containers:
  - name: wget
    image: busybox
    imagePullPolicy: IfNotPresent
    command: ["/bin/sh","-c"]
    args: ["sleep 15; wget {{ include "sampleapp.fullname" . }}:{{ .Values.service.applicationPort.port }}"]
  restartPolicy: Never
As you can see in the manifest in the args, the name of the container is in Helm's template syntax. A developer will input the desired name of their app in a Jenkins pipeline, so I can't hardcode it.
I see from kubectl logs -n namespace post-install-test, this result:
Connecting to sample-app:8080 (172.20.87.74:8080)
wget: server returned error: HTTP/1.1 404 Not Found
But when I check the EKS resources I see the pod running the sample app that I'm trying to test with the added suffix of what I've determined is the pod-template-hash.
sample-app-7fcbd52srj9
Is this suffix making my Helm hook fail? Is there a way I can account for this template hash?
I've tried different syntaxes on the command, but I can confirm with the kubectl logs the helm hook is attempting to connect but keeps getting a 404.
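One reading of the log above, offered as an assumption rather than a confirmed diagnosis: wget resolved sample-app to 172.20.87.74 and got an HTTP response, so the name it is targeting is reachable. The pod-template-hash suffix is part of the pod name, not of the name wget connects to. A 404 suggests the app simply serves nothing at the root path. A minimal sketch of a probe that requests a specific path and retries instead of sleeping a fixed 15 seconds could look like this. probe_url is a hypothetical helper and /healthz is a made-up path; substitute whatever path the app actually serves.

```shell
#!/bin/sh
# probe_url URL ATTEMPTS DELAY
# Hypothetical helper: requests URL with wget up to ATTEMPTS times,
# sleeping DELAY seconds after each failure. Succeeds on the first
# successful response (wget exits non-zero on HTTP errors such as 404).
probe_url() {
  local url=$1 attempts=$2 delay=$3 i=1
  while [ "$i" -le "$attempts" ]; do
    if wget -q -O /dev/null "$url"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Usage inside the hook container (the /healthz path is an assumption):
#   probe_url "http://sample-app:8080/healthz" 10 3
```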

pods are stuck in CrashLoopBackOff after updating my eks to 1.16

I just updated my EKS cluster from 1.15 to 1.16 and I can't get the deployments in my namespaces up and running. When I run kubectl get po to list my pods, they are all stuck in the CrashLoopBackOff state. I tried describing one pod, and this is what I get in the events section:
Events:
Type     Reason   Age                  From     Message
----     ------   ----                 ----     -------
Normal   Pulling  56m (x8 over 72m)    kubelet  Pulling image "xxxxxxx.dkr.ecr.us-west-2.amazonaws.com/xxx-xxxx-xxxx:master.697.7af45fff8e0"
Warning  BackOff  75s (x299 over 66m)  kubelet  Back-off restarting failed container
Kubernetes version:
Client Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.5", GitCommit:"6b1d87acf3c8253c123756b9e61dac642678305f", GitTreeState:"clean", BuildDate:"2021-03-18T01:10:43Z", GoVersion:"go1.15.8", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"16+", GitVersion:"v1.16.15-eks-e1a842", GitCommit:"e1a8424098604fa0ad8dd7b314b18d979c5c54dc", GitTreeState:"clean", BuildDate:"2021-07-31T01:19:13Z", GoVersion:"go1.13.15", Compiler:"gc", Platform:"linux/amd64"}
It seems like your container is stuck in the image pull state. Here are some things that you can check:
Ensure the image is present in ECR.
Ensure the EKS cluster is able to connect to ECR. If it's a private repo, it will require credentials.
Run a docker pull and see if it's able to pull the image directly (most likely it will fail or ask for credentials if they have not already been passed).
So the problem was that I was trying to deploy x86 containers on ARM node instances. Everything worked once I changed the launch template image for my node group.
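An architecture mismatch like this can be spotted from the cluster side. Here is a small sketch, assuming kubectl access to the cluster. node_architectures and nodes_support_arch are hypothetical helper names. It lists the CPU architectures the nodes report so they can be compared with the architecture the image was built for. When the two differ, the previous container's logs often show an "exec format error".

```shell
#!/usr/bin/env bash
# node_architectures
# Hypothetical helper: prints the distinct CPU architectures reported by
# the cluster's nodes (for example amd64 or arm64), one per line.
node_architectures() {
  kubectl get nodes \
    -o jsonpath='{range .items[*]}{.status.nodeInfo.architecture}{"\n"}{end}' |
    sort -u
}

# nodes_support_arch ARCH
# Succeeds if at least one node reports exactly ARCH.
nodes_support_arch() {
  node_architectures | grep -qx "$1"
}

# Usage:
#   nodes_support_arch amd64 || echo "no amd64 nodes; x86 images will not start"
```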

The request is invalid: patch: Invalid value:... cannot convert int64 to string and Error from server (BadRequest): json: cannot unmarshal string

I'm sure the YAML is valid for Kubernetes (AWS EKS): it passes validation with Kubeval and Yamllint.
The following is an aws-auth-patch.yml file.
However, when I executed this in CMD: kubectl patch configmap/aws-auth -n kube-system --patch "$(cat aws-auth-patch.yml)"
error: Error from server (BadRequest): json: cannot unmarshal string into Go value of type map[string]interface {}
Also, in Windows PowerShell: kubectl patch configmap/aws-auth -n kube-system --patch $(Get-Content aws-auth-patch.yml -Raw)
error: The request is invalid: patch: Invalid value: "map[apiVersion:v1 data:map[....etc...": cannot convert int64 to string
I think the YAML file format is fine.
What is causing this error?
I solved it by switching from Windows 10 to WSL (Windows Subsystem for Linux, Ubuntu 20.04 LTS), and now the command below executes successfully.
kubectl patch configmap/aws-auth -n kube-system --patch "$(cat aws-auth-patch.yml)"
and result is:
configmap/aws-auth patched
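Since both failures happen when the file contents are passed through the shell, another option that may avoid the quoting differences is to let kubectl read the patch file itself. This is only a sketch, assuming a kubectl release recent enough to support the --patch-file flag. patch_aws_auth is a hypothetical wrapper name.

```shell
#!/usr/bin/env bash
# patch_aws_auth FILE
# Hypothetical wrapper: patches the aws-auth ConfigMap from FILE without
# passing the file contents through shell quoting.
# Assumes a kubectl version that supports --patch-file.
patch_aws_auth() {
  local file=$1
  if [ ! -r "$file" ]; then
    echo "patch file not readable: $file" >&2
    return 1
  fi
  kubectl patch configmap/aws-auth -n kube-system --patch-file "$file"
}

# Usage:
#   patch_aws_auth aws-auth-patch.yml
```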

Internal error occurred: failed calling webhook "v1.vseldondeployment.kb.io" while deploying Seldon yaml file on minikube

I am trying to follow the instructions from Seldon to build and deploy the iris model on minikube.
https://docs.seldon.io/projects/seldon-core/en/latest/workflow/github-readme.html#getting-started
I am able to install Seldon with Helm, and Knative using a YAML file. But when I try to apply this YAML file to deploy the Iris model, I get the following error:
Internal error occurred: failed calling webhook "v1.vseldondeployment.kb.io": Post "https://seldon-webhook-service.seldon-system.svc:443/validate-machinelearning-seldon-io-v1-seldondeployment?timeout=30s": dial tcp 10.107.97.236:443: connect: connection refused
I used kubectl apply on other YAML files, such as the Knative and broker installation, and they don't have this problem. But whenever I kubectl apply any SeldonDeployment YAML file, this error comes up. I also tried cifar10.yaml for the cifar10 model and mnist-model.yaml for the mnist model, and they have the same problem.
Has anyone experienced a similar problem, and what are the best ways to troubleshoot and solve it?
My Seldon version is 1.8.0-dev, minikube is v1.19.0, and the kubectl server version is v1.20.2.
Here is the YAML file:
kind: SeldonDeployment
metadata:
  name: iris-model
  namespace: seldon
spec:
  name: iris
  predictors:
  - graph:
      implementation: SKLEARN_SERVER
      modelUri: gs://seldon-models/sklearn/iris
      name: classifier
    name: default
    replicas: 1
Error Code
Make sure that the Seldon core manager in seldon-system is running ok: kubectl get pods -n seldon-system.
In my case, the pod was in CrashLoopBackOff status and was constantly restarting.
It turned out the problem occurred while installing Seldon. Instead of:
helm install seldon-core seldon-core-operator \
— repo https://storage.googleapis.com/seldon-charts \
— set usageMetrics.enabled=true \
— set istio.enabled=true \
— namespace seldon-system
try once:
helm install seldon-core seldon-core-operator \
--repo https://storage.googleapis.com/seldon-charts \
--set usageMetrics.enabled=true \
--namespace seldon-system \
--set ambassador.enabled=true
Reference
P. S.
When reinstalling, you can just delete all the namespaces (which shouldn't be a problem, since you're just doing a tutorial) with kubectl delete --all namespaces.
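The first check in this answer (is the manager pod in seldon-system healthy?) can be scripted as a rough sketch. unhealthy_pods is a hypothetical helper name. It relies on STATUS being the third column of kubectl get pods output, and a pod reported as Running is not necessarily Ready, so an empty result is a hint rather than proof.

```shell
#!/usr/bin/env bash
# unhealthy_pods NAMESPACE
# Hypothetical helper: prints "NAME STATUS" for every pod in NAMESPACE
# whose STATUS column is neither Running nor Completed.
unhealthy_pods() {
  kubectl get pods -n "$1" --no-headers 2>/dev/null |
    awk '$3 != "Running" && $3 != "Completed" { print $1, $3 }'
}

# Usage:
#   unhealthy_pods seldon-system
#   # any line ending in CrashLoopBackOff points at a broken install
```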

istio upgrade from 1.4.6 -> 1.5.0 throws istiod errors: remote error: tls: error decrypting message

I just upgraded Istio from 1.4.6 (helm) to 1.5.0 (istioctl) [purged Istio and installed with istioctl], but the istiod logs keep throwing the following:
2020-03-16T18:25:45.209055Z info grpc: Server.Serve failed to complete security handshake from "10.150.56.111:56870": remote error: tls: error decrypting message
2020-03-16T18:25:46.792447Z info grpc: Server.Serve failed to complete security handshake from "10.150.57.112:49162": remote error: tls: error decrypting message
2020-03-16T18:25:46.930483Z info grpc: Server.Serve failed to complete security handshake from "10.150.56.160:36878": remote error: tls: error decrypting message
2020-03-16T18:25:48.284122Z info grpc: Server.Serve failed to complete security handshake from "10.150.52.230:44758": remote error: tls: error decrypting message
2020-03-16T18:25:48.288180Z info grpc: Server.Serve failed to complete security handshake from "10.150.57.149:56756": remote error: tls: error decrypting message
2020-03-16T18:25:49.108515Z info grpc: Server.Serve failed to complete security handshake from "10.150.57.151:53970": remote error: tls: error decrypting message
2020-03-16T18:25:49.111874Z info Handling event update for pod contentgatewayaidest-7f4694d87-qmq8z in namespace djin-content -> 10.150.53.50
2020-03-16T18:25:49.519861Z info grpc: Server.Serve failed to complete security handshake from "10.150.57.91:59510": remote error: tls: error decrypting message
2020-03-16T18:25:50.133664Z info grpc: Server.Serve failed to complete security handshake from "10.150.57.203:59726": remote error: tls: error decrypting message
2020-03-16T18:25:50.331020Z info grpc: Server.Serve failed to complete security handshake from "10.150.57.195:59970": remote error: tls: error decrypting message
2020-03-16T18:25:52.110695Z info Handling event update for pod contentgateway-d74b44c7-dtdxs in namespace djin-content -> 10.150.56.215
2020-03-16T18:25:53.312761Z info Handling event update for pod dysonpriority-b6dbc589b-mk628 in namespace djin-content -> 10.150.52.91
2020-03-16T18:25:53.496524Z info grpc: Server.Serve failed to complete security handshake from "10.150.56.111:57276": remote error: tls: error decrypting message
As a result, no sidecars launch successfully; they fail with:
2020-03-16T18:32:17.265394Z info Envoy proxy is NOT ready: config not received from Pilot (is Pilot running?): cds updates: 16 successful, 0 rejected; lds updates: 0 successful, 0 rejected
2020-03-16T18:32:19.269334Z info Envoy proxy is NOT ready: config not received from Pilot (is Pilot running?): cds updates: 16 successful, 0 rejected; lds updates: 0 successful, 0 rejected
2020-03-16T18:32:21.265214Z info Envoy proxy is NOT ready: config not received from Pilot (is Pilot running?): cds updates: 16 successful, 0 rejected; lds updates: 0 successful, 0 rejected
2020-03-16T18:32:23.266159Z info Envoy proxy is NOT ready: config not received from Pilot (is Pilot running?): cds updates: 16 successful, 0 rejected; lds updates: 0 successful,
Weirdly, other clusters that I upgraded went through fine. Any idea where this error might be coming from? istioctl analyze works fine.
The error goes away after killing (recreating) the nodes, but the istio-proxies still fail with:
info Envoy proxy is NOT ready: config not received from Pilot (is Pilot running?): cds updates: 1 successful, 0 rejected; lds updates: 0 successful, 0 rejected
As far as I know, istioctl upgrade was added in version 1.4.4, and it should be used when you want to upgrade Istio from 1.4.x to 1.5.0.
The istioctl upgrade command performs an upgrade of Istio. Before performing the upgrade, it checks that the Istio installation meets the upgrade eligibility criteria. Also, it alerts the user if it detects any changes in the profile default values between Istio versions.
The upgrade command can also perform a downgrade of Istio.
See the istioctl upgrade reference for all the options provided by the istioctl upgrade command.
istioctl upgrade --help
The upgrade command checks for upgrade version eligibility and, if eligible, upgrades the Istio control plane components in-place. Warning: traffic may be disrupted during upgrade. Please ensure PodDisruptionBudgets are defined to maintain service continuity.
I ran a test on a GCP cluster with Istio 1.4.6 installed with istioctl, then ran istioctl upgrade from the 1.5.0 release, and everything works fine.
kubectl get pods -n istio-system
NAME                                    READY   STATUS    RESTARTS   AGE
istio-ingressgateway-598796f4d9-lvzdb   1/1     Running   0          12m
istiod-7d9c7bdd6-mggx7                  1/1     Running   0          12m
prometheus-b47d8c58c-7spq5              2/2     Running   0          12m
I checked the logs and ran some simple examples, and no errors like the ones in your example occur in istiod.
Upgrade prerequisites for istioctl upgrade
Ensure you meet these requirements before starting the upgrade process:
Istio version 1.4.4 or higher is installed.
Your Istio installation was installed using istioctl.
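The version prerequisite can be checked with a small comparison before running the upgrade. This is just a sketch. version_ge is a hypothetical helper, the installed version is supplied by hand rather than parsed from istioctl output, and it assumes a sort that supports -V (GNU coreutils or BusyBox).

```shell
#!/bin/sh
# version_ge INSTALLED REQUIRED
# Hypothetical helper: succeeds if INSTALLED is greater than or equal to
# REQUIRED, comparing dotted versions numerically via sort -V.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Usage (the installed version is typed in by hand here):
#   version_ge 1.4.6 1.4.4 && echo "version prerequisite met"
```

Note that this only covers the first prerequisite; whether the installation was done with istioctl still has to be confirmed separately.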
I assume that, because of the differences between 1.4.x and 1.5.0, there might be some issues when you mix both installation methods, helm and istioctl. The best option here would be to install Istio 1.4.6 with istioctl and then upgrade it to 1.5.0.
I hope this answers your question. Let me know if you have any more questions.