`kubectl patch` each element of array - kubectl

I would like to patch all container templates in a Kubernetes deployment with a single kubectl patch command, without having to know their name. Is that possible?
I know I am able to achieve the replacement through awk, sed, jq and kubectl replace, but I would favour something like a [*] in the expression...
Patch command for a certain container spec
kubectl patch deployment mydeployment -p '{"spec":{"template":{"spec":{"containers":[{"name":"myname","imagePullPolicy":"Always"}]}}}}'
Example Deployment
apiVersion: extensions/v1beta1
kind: Deployment
spec:
  replicas: 1
  template:
    spec:
      containers:
      - image: example.com/my/fancyimage:latest
        imagePullPolicy: Never
        name: myname
      dnsPolicy: ClusterFirst
      restartPolicy: Always
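For reference, the jq plus kubectl replace route mentioned above looks roughly like this (a sketch only; mydeployment matches the name used in the patch example):
kubectl get deployment mydeployment -o json \
  | jq '.spec.template.spec.containers[].imagePullPolicy = "Always"' \
  | kubectl replace -f -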

It's not exactly what you asked, since I use command-line tools here, but if it had appeared here it would have saved me some time, so I'm posting it for others who come here from search engines.
kubectl \
  --username=USERNAME \
  --password=PASSWORD \
  --server="https://EXAMPLE.COM" \
  --insecure-skip-tls-verify=true \
  --namespace=MY_NAMESPACE \
  get deployments | \
  grep -v NAME | cut -f1 -d" " | \
  xargs kubectl \
    --username=USERNAME \
    --password=PASSWORD \
    --server="https://EXAMPLE.COM" \
    --insecure-skip-tls-verify=true \
    --namespace=MY_NAMESPACE \
    patch deployment \
    -p='{"spec":{"template":{"spec":{"containers":[{"name":"myname","imagePullPolicy":"Always"}]}}}}'
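If your kubeconfig already points at the right cluster and namespace, a slightly tidier variant of the same idea (a sketch, not tested against this setup) uses -o name so the grep/cut step is not needed:
kubectl get deployments -o name \
  | xargs kubectl patch \
      -p '{"spec":{"template":{"spec":{"containers":[{"name":"myname","imagePullPolicy":"Always"}]}}}}'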

The best-practice way of achieving this, assuming you want it to be an ongoing requirement, would be to use the AlwaysPullImages admission controller.
AlwaysPullImages
This admission controller modifies every new Pod to force the image pull policy to Always...
You would apply this to your API server by appending it to the --enable-admission-plugins argument.
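A minimal sketch of that flag change on the API server (the other plugin names here are common defaults used only for illustration, not taken from your cluster):
# append AlwaysPullImages to whatever admission plugins you already enable
kube-apiserver \
  --enable-admission-plugins=NamespaceLifecycle,LimitRanger,ServiceAccount,AlwaysPullImages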

Related

Helm hook for post-install, post-upgrade using busybox wget is failing

I am trying to deploy a Helm post-install, post-upgrade hook which will create a simple pod with busybox and perform a wget against an app's application port to ensure the app is reachable.
I can not get the hook to pass, even though I know the sample app is up and available.
Here is the manifest:
apiVersion: v1
kind: Pod
metadata:
  name: post-install-test
  annotations:
    "helm.sh/hook": post-install,post-upgrade
    "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
spec:
  containers:
  - name: wget
    image: busybox
    imagePullPolicy: IfNotPresent
    command: ["/bin/sh","-c"]
    args: ["sleep 15; wget {{ include "sampleapp.fullname" . }}:{{ .Values.service.applicationPort.port }}"]
  restartPolicy: Never
As you can see in the manifest, the name of the target app in the args is written in Helm's template syntax. A developer will input the desired name of their app in a Jenkins pipeline, so I can't hardcode it.
I see from kubectl logs -n namespace post-install-test, this result:
Connecting to sample-app:8080 (172.20.87.74:8080)
wget: server returned error: HTTP/1.1 404 Not Found
But when I check the EKS resources, I see that the pod running the sample app I'm trying to test has an added suffix, which I've determined is the pod-template-hash.
sample-app-7fcbd52srj9
Is this suffix making my Helm hook fail? Is there a way I can account for this template hash?
I've tried different syntaxes for the command, but I can confirm from the kubectl logs that the Helm hook is attempting to connect and keeps getting a 404.

Internal error occurred: failed calling webhook "v1.vseldondeployment.kb.io" while deploying Seldon yaml file on minikube

I am trying to follow the instructions on Seldon to build and deploy the iris model on minikube.
https://docs.seldon.io/projects/seldon-core/en/latest/workflow/github-readme.html#getting-started
I am able to install Seldon with Helm and Knative using a YAML file. But when I try to apply the YAML file below to deploy the iris model, I get the following error:
Internal error occurred: failed calling webhook "v1.vseldondeployment.kb.io": Post "https://seldon-webhook-service.seldon-system.svc:443/validate-machinelearning-seldon-io-v1-seldondeployment?timeout=30s": dial tcp 10.107.97.236:443: connect: connection refused
kubectl apply worked fine on other files, such as the Knative and broker installations, but whenever I kubectl apply any SeldonDeployment YAML file this error comes up. I also tried cifar10.yaml for the CIFAR-10 model deploy and mnist-model.yaml for the MNIST model deploy, and they have the same problem.
Has anyone experienced similar kind of problem and what are the best ways to troubleshoot and solve the problem?
My Seldon version is 1.8.0-dev, minikube is v1.19.0, and the kubectl server version is v1.20.2.
Here is the YAML file:
kind: SeldonDeployment
metadata:
  name: iris-model
  namespace: seldon
spec:
  name: iris
  predictors:
  - graph:
      implementation: SKLEARN_SERVER
      modelUri: gs://seldon-models/sklearn/iris
      name: classifier
    name: default
    replicas: 1
Make sure that the Seldon core manager in seldon-system is running ok: kubectl get pods -n seldon-system.
In my case, the pod was in CrashLoopBackOff status and was constantly restarting.
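A quick way to see why it is crashing (a sketch; seldon-controller-manager is the usual name of the operator deployment and is an assumption here):
kubectl get pods -n seldon-system
kubectl logs -n seldon-system deployment/seldon-controller-manager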
It turned out the problem had occurred while installing Seldon. Instead of running:
helm install seldon-core seldon-core-operator \
  --repo https://storage.googleapis.com/seldon-charts \
  --set usageMetrics.enabled=true \
  --set istio.enabled=true \
  --namespace seldon-system
try this:
helm install seldon-core seldon-core-operator \
  --repo https://storage.googleapis.com/seldon-charts \
  --set usageMetrics.enabled=true \
  --namespace seldon-system \
  --set ambassador.enabled=true
Reference
P.S. When reinstalling, you can just delete all the namespaces (which shouldn't be a problem since you're just doing a tutorial) with kubectl delete --all namespaces.

Google Cloud Function fail to build

I'm trying to update a cloud function that has been working for over a week now.
But when I try to update the function today, I get a BUILD FAILED: BUILD HAS TIMED OUT error.
Build fail error
I am using the Google Cloud Console to deploy the Python function, not Cloud Shell. I even tried to make a new copy of the function, and that fails too.
Looking at the logs, it says INVALID_ARGUMENT. But I'm just using the console and haven't changed anything apart from the Python code compared with the previous build that I successfully deployed last week.
Error logs
{
  insertId: "fjw53vd2r9o"
  logName: " my log name "
  operation: {…}
  protoPayload: {
    #type: "type.googleapis.com/google.cloud.audit.AuditLog"
    authenticationInfo: {…}
    methodName: "google.cloud.functions.v1.CloudFunctionsService.UpdateFunction"
    requestMetadata: {…}
    resourceName: " my function name"
    serviceName: "cloudfunctions.googleapis.com"
    status: {
      code: 3
      message: "INVALID_ARGUMENT"
    }
  }
  receiveTimestamp: "2020-02-05T18:04:18.269557510Z"
  resource: {…}
  severity: "ERROR"
  timestamp: "2020-02-05T18:04:18.241Z"
}
I even tried to increase the timeout parameter to 540 seconds and I still get the build error.
Timeout parameter setting
Can someone help, please?
In future, please copy and paste the text from errors and logs rather than referencing screenshots; it's easier to parse and possibly more permanent.
It's possible that there's an intermittent issue with the service (in your region) that is causing you problems. Does this issue continue?
You may check the status dashboard for service issues (there are currently none listed for Functions):
https://status.cloud.google.com/
I just deployed and updated a Golang Function in us-central1 without issues.
Which language/runtime are you using?
Which region?
Are you confident that your updates to the Function are correct?
A more effective albeit dramatic way to test this would be to create a new (temporary) project and try to deploy the function there (possibly to a different region too).
NB The timeout setting applies to the Function's invocations, not to the deployment.
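For completeness, that invocation timeout is set at deploy time via the --timeout flag, e.g. when updating an existing function (the name my-function below is hypothetical):
gcloud functions deploy my-function --timeout=540s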
Example (using gcloud)
PROJECT=[[YOUR-PROJECT]]
BILLING=[[YOUR-BILLING]]
gcloud projects create ${PROJECT}
gcloud beta billing projects link ${PROJECT} --billing-account=${BILLING}
gcloud services enable cloudfunctions.googleapis.com --project=${PROJECT}
touch function.go go.mod
# Deploy
gcloud functions deploy fred \
--region=us-central1 \
--allow-unauthenticated \
--entry-point=HelloFreddie \
--trigger-http \
--source=${PWD} \
--project=${PROJECT} \
--runtime=go113
# Update
gcloud functions deploy fred \
--region=us-central1 \
--allow-unauthenticated \
--entry-point=HelloFreddie \
--trigger-http \
--source=${PWD} \
--project=${PROJECT} \
--runtime=go113
# Test
curl \
--request GET \
$(\
gcloud functions describe fred \
--region=us-central1 \
--project=${PROJECT} \
--format="value(httpsTrigger.url)")
Hello Freddie
Logs:
gcloud logging read "resource.type=\"cloud_function\" resource.labels.function_name=\"fred\" resource.labels.region=\"us-central1\" protoPayload.methodName=(\"google.cloud.functions.v1.CloudFunctionsService.CreateFunction\" OR \"google.cloud.functions.v1.CloudFunctionsService.UpdateFunction\")" \
--project=${PROJECT} \
--format="json(protoPayload.methodName,protoPayload.status)"
[
  {
    "protoPayload": {
      "methodName": "google.cloud.functions.v1.CloudFunctionsService.CreateFunction"
    }
  },
  {
    "protoPayload": {
      "methodName": "google.cloud.functions.v1.CloudFunctionsService.CreateFunction",
      "status": {}
    }
  },
  {
    "protoPayload": {
      "methodName": "google.cloud.functions.v1.CloudFunctionsService.UpdateFunction"
    }
  },
  {
    "protoPayload": {
      "methodName": "google.cloud.functions.v1.CloudFunctionsService.UpdateFunction",
      "status": {}
    }
  }
]

Horizontal Pod Autoscaler (HPA): Current utilization: <unknown> with custom namespace

UPDATE: I'm deploying on AWS cloud with the help of kops.
I'm in the process of applying HPA to one of my Kubernetes deployments.
While testing with a sample app deployed to the default namespace, I can see the metrics being exposed as below (showing a current utilisation of 0%):
$ kubectl run busybox --image=busybox --port 8080 -- sh -c "while true; do { echo -e 'HTTP/1.1 200 OK\r\n'; \
env | grep HOSTNAME | sed 's/.*=//g'; } | nc -l -p 8080; done"
$ kubectl get hpa
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
busybox Deployment/busybox 0%/20% 1 4 1 14m
But when I deploy to a custom namespace (for example, test), the current utilisation shows as unknown:
$ kubectl get hpa --namespace test
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
busybox Deployment/busybox <unknown>/20% 1 4 1 25m
Can someone please suggest what's wrong here?
For the future: a few conditions need to be met for HPA to work. You need to have metrics-server or Heapster running on your cluster, and it is important to set resource requests on a per-namespace basis.
You did not say what environment your cluster is running in, but in GKE a default CPU request (100m) is set for you, while you need to specify it yourself for new namespaces:
Please note that if some of the pod’s containers do not have the
relevant resource request set, CPU utilization for the pod will not be
defined and the autoscaler will not take any action for that metric.
In your case I am not sure why it does work after a redeploy, as there is not enough information. But for the future, remember to:
1) keep the object that you want to scale and the HPA in the same namespace;
2) set CPU resource requests per namespace, or simply add --requests=cpu=value so the HPA will be able to scale based on that.
UPDATE:
for your particular case:
1) kubectl run busybox --image=busybox --port 8080 -n test --requests=cpu=200m -- sh -c "while true; do { echo -e 'HTTP/1.1 200 OK\r\n'; \
env | grep HOSTNAME | sed 's/.*=//g'; } | nc -l -p 8080; done"
2) kubectl autoscale deployment busybox --cpu-percent=50 --min=1 --max=10 -n test
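If you would rather set a default CPU request for the whole namespace instead of passing --requests on every workload, a LimitRange is one way to do it (a sketch; only the test namespace name is taken from the question):
apiVersion: v1
kind: LimitRange
metadata:
  name: cpu-defaults
  namespace: test
spec:
  limits:
  - type: Container
    defaultRequest:
      cpu: 100m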
Try running the commands below in the namespace where you experience this issue and see if you get any pointers.
kubectl get --raw /apis/metrics.k8s.io/ - This should display valid JSON.
Also, do a kubectl describe hpa name_of_hpa_deployment - This may indicate if there are any issues with your hpa deployment in that namespace.

Kubernetes 1.0.1 External Load Balancer on GCE with CoreOS

Using a previous version of Kubernetes (0.16.x) I was able to create a cluster of CoreOS based VMs on GCE that were capable of generating external network load balancers for services. With the release of v1 of Kubernetes the configuration necessary for this functionality seems to have changed. Could anyone offer any advice or point me in the direction of some documentation that might help me out with this issue?
I suspect that the problem has to do with IP/naming, as I was previously using kube-register to handle this and that component no longer seems necessary. My current configuration will create internal service load balancers without issue, and will even create external service load balancers, but they are only visible through the gcloud UI and are not registered or displayed in kubectl output. Unfortunately, the external IPs generated do not actually proxy the traffic through either.
The kube-controller-manager service log looks like this:
Aug 05 12:15:42 europe-west1-b-k8s-master.c.staging-infrastructure.internal hyperkube[1604]: I0805 12:15:42.516360 1604 gce.go:515] Firewall doesn't exist, moving on to deleting target pool.
Aug 05 12:15:42 europe-west1-b-k8s-master.c.staging-infrastructure.internal hyperkube[1604]: E0805 12:15:42.516492 1604 servicecontroller.go:171] Failed to process service delta. Retrying: googleapi: Error 404: The resource 'projects/staging-infrastructure/global/firewalls/k8s-fw-a4db9328c3b6b11e5ab9f42010af0397' was not found, notFound
Aug 05 12:15:42 europe-west1-b-k8s-master.c.staging-infrastructure.internal hyperkube[1604]: I0805 12:15:42.516539 1604 servicecontroller.go:601] Successfully updated 2 out of 2 external load balancers to direct traffic to the updated set of nodes
Aug 05 12:16:07 europe-west1-b-k8s-master.c.staging-infrastructure.internal hyperkube[1604]: E0805 12:16:07.620094 1604 servicecontroller.go:171] Failed to process service delta. Retrying: failed to create external load balancer for service default/autobot-cache-graph: googleapi: Error 400: Invalid value for field 'resource.targetTags[0]': 'europe-west1-b-k8s-node-0.c.staging-infrastructure.int'. Must be a match of regex '(?:[a-z](?:[-a-z0-9]{0,61}[a-z0-9])?)', invalid
Aug 05 12:16:12 europe-west1-b-k8s-master.c.staging-infrastructure.internal hyperkube[1604]: I0805 12:16:12.804512 1604 servicecontroller.go:275] Deleting old LB for previously uncached service default/autobot-cache-graph whose endpoint &{[{146.148.114.97 }]} doesn't match the service's desired IPs []
Here is the config I am using (download, chmod, etc. omitted for clarity).
On the master:
- name: kube-apiserver.service
  command: start
  content: |
    [Unit]
    Description=Kubernetes API Server
    Requires=setup-network-environment.service etcd.service generate-serviceaccount-key.service
    After=setup-network-environment.service etcd.service generate-serviceaccount-key.service
    [Service]
    EnvironmentFile=/etc/network-environment
    ExecStart=/opt/bin/hyperkube apiserver \
      --cloud-provider=gce \
      --service_account_key_file=/opt/bin/kube-serviceaccount.key \
      --service_account_lookup=false \
      --admission_control=NamespaceLifecycle,NamespaceAutoProvision,LimitRanger,SecurityContextDeny,ServiceAccount,ResourceQuota \
      --runtime_config=api/v1 \
      --allow_privileged=true \
      --insecure_bind_address=0.0.0.0 \
      --insecure_port=8080 \
      --kubelet_https=true \
      --secure_port=6443 \
      --service-cluster-ip-range=10.100.0.0/16 \
      --etcd_servers=http://127.0.0.1:2379 \
      --bind-address=${DEFAULT_IPV4} \
      --logtostderr=true
    Restart=always
    RestartSec=10
- name: kube-controller-manager.service
  command: start
  content: |
    [Unit]
    Description=Kubernetes Controller Manager
    Requires=kube-apiserver.service
    After=kube-apiserver.service
    [Service]
    ExecStart=/opt/bin/hyperkube controller-manager \
      --cloud-provider=gce \
      --service_account_private_key_file=/opt/bin/kube-serviceaccount.key \
      --master=127.0.0.1:8080 \
      --logtostderr=true
    Restart=always
    RestartSec=10
- name: kube-scheduler.service
  command: start
  content: |
    [Unit]
    Description=Kubernetes Scheduler
    Requires=kube-apiserver.service
    After=kube-apiserver.service
    [Service]
    ExecStart=/opt/bin/hyperkube scheduler --master=127.0.0.1:8080
    Restart=always
    RestartSec=10
And on the node:
- name: kubelet.service
  command: start
  content: |
    [Unit]
    Description=Kubernetes Kubelet
    Requires=setup-network-environment.service
    After=setup-network-environment.service
    [Service]
    EnvironmentFile=/etc/network-environment
    WorkingDirectory=/root
    ExecStart=/opt/bin/hyperkube kubelet \
      --cloud-provider=gce \
      --address=0.0.0.0 \
      --port=10250 \
      --api_servers=<master_ip>:8080 \
      --allow_privileged=true \
      --logtostderr=true \
      --cadvisor_port=4194 \
      --healthz_bind_address=0.0.0.0 \
      --healthz_port=10248
    Restart=always
    RestartSec=10
- name: kube-proxy.service
  command: start
  content: |
    [Unit]
    Description=Kubernetes Proxy
    Requires=setup-network-environment.service
    After=setup-network-environment.service
    [Service]
    ExecStart=/opt/bin/hyperkube proxy \
      --master=<master_ip>:8080 \
      --logtostderr=true
    Restart=always
    RestartSec=10
To me it looks like a mismatch in naming and IP, but I'm not sure how to adjust my config to resolve it. Any guidance is greatly appreciated.
How did you create the nodes in your cluster? We've seen another instance of this issue caused by bugs in the cluster bootstrapping script that was used, which didn't apply the expected node names and tags.
If you recreate your cluster using the following two commands, as recommended on the issue linked above, creating load balancers should work for you:
export OS_DISTRIBUTION=coreos
cluster/kube-up.sh
Otherwise, you may need to wait for the issues to be fixed.