How can I change config of istiod deployment using istio-operator?

I am setting up the Istio control plane using istio-operator on an EKS cluster with the Calico CNI. After installing Istio on the cluster, I found that new pods were not coming up. After some searching, I found the reason described here:
Istio Installation successful but not able to deploy POD
Now I want to apply hostNetwork: true under spec.template.spec in the istiod Deployment, using only the istio-operator.
I searched further for ways to change or override the values of the istiod Deployment and found the following YAML files:
https://github.com/istio/istio/tree/ca541df418d0902ebeb9506c84d24c6bd9743801/operator/cmd/mesh/testdata/manifest-generate/input
But these did not work either. Below is the last configuration I applied:
kind: IstioOperator
metadata:
  namespace: istio-system
  name: zeta-zone-istiocontrolplane
spec:
  profile: minimal
  values:
    pilot:
      resources:
        requests:
          cpu: 222m
          memory: 333Mi
      hostNetwork: true
  unvalidatedValues:
    hostNetwork: true
Can anybody help me to add hostNetwork: true under spec.template.spec to istiod deployment using the istio-operator only?

I was able to achieve that with the following YAML, after a lot of trial and error and checking the istio-operator logs:
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
metadata:
  namespace: istio-system
  name: istiocontrolplane
spec:
  profile: minimal
  hub: docker.io/istio
  tag: 1.10.3
  meshConfig:
    rootNamespace: istio-system
  components:
    base:
      enabled: true
    pilot:
      enabled: true
      namespace: istio-system
      k8s:
        overlays:
        - kind: Deployment
          name: istiod
          patches:
          - path: spec.template.spec.hostNetwork
            value: true # OVERRIDDEN
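For orientation, the overlay patch above targets a single field in the pod template of the generated istiod Deployment. The excerpt below is only an illustrative sketch of what the relevant part of the rendered Deployment should look like once the patch is applied. It is not output captured from a cluster, and every other field is omitted. The commented-out dnsPolicy line is an assumption, not part of the original answer: Kubernetes pods on the host network only resolve cluster-internal DNS names when dnsPolicy is ClusterFirstWithHostNet, so a second patch on that path may be worth considering if istiod needs in-cluster name resolution.

```yaml
# Illustrative excerpt only: relevant part of the istiod Deployment after the overlay.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: istiod
  namespace: istio-system
spec:
  template:
    spec:
      hostNetwork: true                      # set by the overlay patch
      # dnsPolicy: ClusterFirstWithHostNet   # assumption: optional extra patch, not in the original answer
```

One way to check the result is to read the field back from the live object, for example with kubectl get deploy istiod -n istio-system -o yaml.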

Related

install istio 1.14.1 with cni on GKE 1.24

I am currently installing Istio 1.14.1 on a Google Kubernetes Engine (GKE) cluster, and I have written the following manifest file:
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  components:
    base:
      enabled: true
    cni:
      enabled: true
      namespace: kube-system
  values:
    cni:
      cniBinDir: /home/kubernetes/bin
      cniConfDir: /etc/cni/net.d
      chained: false
      cniConfFileName: istio-cni.conf
  meshConfig:
    enableAutoMtls: true
I would like to know whether this is correct, and whether someone can point me toward how it should be configured.

Create Istio Ingress-gateway POD without creating istiod

I am a bit new to Istio and still learning. I have a use case in which Istio is already deployed in the istio-system namespace, but I need to deploy an Istio ingress-gateway Pod in the test-ns namespace using IstioOperator. I am using Istio 1.6.7.
The Istio docs say to run this command:
istioctl manifest apply --set profile=default --filename=istio-ingress-values.yaml
But this will create istiod Pods in istio-system, which I do not want since they already exist.
So I ran the command below to create just the ingress gateway Pod, but I can't see any Pods or Services created in test-ns. Kindly help if this is possible.
kubectl apply -f istio-ingress-values.yaml
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
metadata:
  namespace: test-ns
  name: testoperator
ingressGateways:
- enabled: true
  name: istio-ingressgateway
  namespace: test-ns
  k8s:
    env:
    - name: ISTIO_META_ROUTER_MODE
      value: sni-dnat
    hpaSpec:
      maxReplicas: 5
      metrics:
      - resource:
          name: cpu
          targetAverageUtilization: 80
        type: Resource
      minReplicas: 1
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: istio-ingressgateway
    resources: {}
    service:
      ports:
      - name: http2
        port: 80
        targetPort: 80
      - name: https
        port: 443
        targetPort: 443

In Istio it is possible to tune configuration profiles.
As I can see, you are using the default profile, so I will describe how you can tune this configuration profile to create istio-ingressgateway in the test-ns namespace.
We can display the default profile settings by running the istioctl profile dump default command.
First, I saved these default settings in the default_profile_dump.yml file:
# istioctl profile dump default > default_profile_dump.yml
And then I modified this file:
NOTE: I only added one line: namespace: test-ns.
...
    ingressGateways:
    - enabled: true
      name: istio-ingressgateway
      namespace: test-ns
...
After modifying default settings of the ingressGateways, I applied these new settings:
# istioctl manifest apply -f default_profile_dump.yml
This will install the Istio 1.9.1 default profile with ["Istio core" "Istiod" "Ingress gateways"] components into the cluster. Proceed? (y/N) y
✔ Istio core installed
✔ Istiod installed
✔ Ingress gateways installed
- Pruning removed resources
Removed HorizontalPodAutoscaler:istio-system:istio-ingressgateway.
Removed PodDisruptionBudget:istio-system:istio-ingressgateway.
Removed Deployment:istio-system:istio-ingressgateway.
Removed Service:istio-system:istio-ingressgateway.
Removed ServiceAccount:istio-system:istio-ingressgateway-service-account.
Removed RoleBinding:istio-system:istio-ingressgateway-sds.
Removed Role:istio-system:istio-ingressgateway-sds.
✔ Installation complete
Finally, we can check where istio-ingressgateway was deployed:
# kubectl get pod -A | grep ingressgateway
test-ns istio-ingressgateway-7fc7c7c-r92tw 1/1 Running 0 33s
The istiod Deployment remained intact in the istio-system namespace:
# kubectl get deploy,pods -n istio-system
NAME                     READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/istiod   1/1     1            1           51m

NAME                          READY   STATUS    RESTARTS   AGE
pod/istiod-64675984c5-xl97n   1/1     Running   0          51m
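A possible alternative to editing the full default profile dump is a small IstioOperator that starts from the built-in empty profile and enables only the ingress gateway. This is a sketch, not what the answer above did. It assumes the empty profile is available in your Istio version, and the resource name gateway-only is hypothetical. Because the install output above shows that istioctl prunes resources, it seems prudent to give this resource its own metadata.name, try it on a non-production cluster first, and read the pruning output before relying on it.

```yaml
# Sketch under stated assumptions: install only an ingress gateway into test-ns.
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
metadata:
  name: gateway-only          # hypothetical name, chosen for illustration
spec:
  profile: empty              # assumption: built-in profile that enables no components
  components:
    ingressGateways:
    - name: istio-ingressgateway
      namespace: test-ns      # target namespace from the question
      enabled: true
```

Note that the components sit under spec.components, whereas the manifest in the question has ingressGateways at the top level with no spec key.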

AWS EKS Worker Nodes Going "NotReady"

I'm creating a new EKS Kubernetes Cluster on AWS.
When I deploy my workloads (migrating from an existing cluster), the kubelet stops posting node status and all worker nodes become "NotReady" within a minute.
I was assuming that a misconfiguration within my cluster should not make the nodes crash, but apparently it does.
Can a misconfiguration within my cluster really make the AWS EKS worker nodes "NotReady"? Are there rules of thumb for the circumstances under which this can happen? CPU load too high? Pods in kube-system crashing?

This is a community wiki answer based on the solution from comments and posted for better visibility. Feel free to expand it.
As suggested by @gusto2, the problem was that the kubelet was unable to call the API server. @stackoverflowjakob later confirmed that the connection between the worker and master nodes was broken due to a VPC misconfiguration, which was discovered by checking AWS Console -> EKS status.

Did you change the default PSP (pod security policy)? In my case, I added a new eks.restricted PSP, and new nodes ended up in NotReady status. My solution was to restore the eks.privileged PSP:
---
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: eks.privileged
  annotations:
    kubernetes.io/description: 'privileged allows full unrestricted access to
      pod features, as if the PodSecurityPolicy controller was not enabled.'
    seccomp.security.alpha.kubernetes.io/allowedProfileNames: '*'
  labels:
    kubernetes.io/cluster-service: "true"
    eks.amazonaws.com/component: pod-security-policy
spec:
  privileged: true
  allowPrivilegeEscalation: true
  allowedCapabilities:
  - '*'
  volumes:
  - '*'
  hostNetwork: true
  hostPorts:
  - min: 0
    max: 65535
  hostIPC: true
  hostPID: true
  runAsUser:
    rule: 'RunAsAny'
  seLinux:
    rule: 'RunAsAny'
  supplementalGroups:
    rule: 'RunAsAny'
  fsGroup:
    rule: 'RunAsAny'
  readOnlyRootFilesystem: false
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: eks:podsecuritypolicy:privileged
  labels:
    kubernetes.io/cluster-service: "true"
    eks.amazonaws.com/component: pod-security-policy
rules:
- apiGroups:
  - policy
  resourceNames:
  - eks.privileged
  resources:
  - podsecuritypolicies
  verbs:
  - use
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: eks:podsecuritypolicy:authenticated
  annotations:
    kubernetes.io/description: 'Allow all authenticated users to create privileged pods.'
  labels:
    kubernetes.io/cluster-service: "true"
    eks.amazonaws.com/component: pod-security-policy
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: eks:podsecuritypolicy:privileged
subjects:
- kind: Group
  apiGroup: rbac.authorization.k8s.io
  name: system:authenticated

You can try kubectl describe node $BAD_NODE, or ssh into the node and run sudo dmesg -T.
You can also try restarting the kubelet on the node:
/etc/init.d/kubelet restart
or
systemctl restart kubelet
Or delete the node (drain it first):
kubectl drain <node-name>
kubectl delete node <node-name>

kubernetes get host dns name to container for kafka advertise host

I want to deploy kafka on kubernetes.
Because I will be streaming with high bandwidth from the internet to kafka I want to use the hostport and advertise the hosts "dnsName:hostPort" to zookeeper so that all traffic goes directly to the kafka broker (as opposed to using nodeport and a loadbalancer where traffic hits some random node which redirects it creating unnecessary traffic).
I have setup my kubernetes cluster on amazon. With kubectl describe node ${nodeId} I get the internalIp, externalIp, internal and external Dns name of the node.
I want to pass the externalDns name to the kafka broker so that it can use it as advertise host.
How can I pass that information to the container? Ideally I could do this from the deployment yaml but I'm also open to other solutions.

How can I pass that information to the container? Ideally I could do this from the deployment yaml but I'm also open to other solutions.
The first thing I would try is env: valueFrom: fieldRef: and see if it will let you reach into the Pod's spec: field to grab the nodeName. I deeply appreciate that isn't the ExternalDnsName you asked about, but if fieldRef works, it could be a lot less typing and thus could be a good tradeoff.
But, with "I'm also open to other solutions" in mind: don't forget that, unless instructed otherwise, each Pod is able to interact with the kubernetes API, and with the correct RBAC permissions it can request the very information you're seeking. You can do that either as a command: override, to do setup work before launching the kafka broker, or you can do that work in an init container, write the external address into a shared bit of filesystem (with volumes: emptyDir: {} or similar), and then add any glue code for slurping that value into your kafka broker.
I am 100% certain that the env: valueFrom: fieldRef: construct that I mentioned earlier can acquire the metadata.name and metadata.namespace of the Pod, at which point the Pod can ask the kubernetes API for its own PodSpec, extract the nodeName from it, then ask the kubernetes API for the Node info, and voilà, you have all the information kubernetes knows about that Node.
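To make the fieldRef idea concrete, here is a minimal sketch of a Pod that exposes a few downward API fields as environment variables and prints them. The Pod name, container name, and the busybox image are placeholders chosen for illustration. The field paths themselves (metadata.name, metadata.namespace, spec.nodeName, status.hostIP) are ones the Kubernetes downward API supports for environment variables. The node's external DNS name is not among them, which is why the API lookup described above is still needed.

```yaml
# Hypothetical demo Pod: names and image are placeholders, not from the answer.
apiVersion: v1
kind: Pod
metadata:
  name: fieldref-demo
spec:
  restartPolicy: Never
  containers:
  - name: show-fields
    image: busybox
    # Print the values injected through the downward API, then exit.
    command: ["sh", "-c", "echo node=$KUBE_NODE_NAME pod=$POD_NAME ns=$POD_NAMESPACE hostIP=$HOST_IP"]
    env:
    - name: KUBE_NODE_NAME
      valueFrom:
        fieldRef:
          fieldPath: spec.nodeName       # name of the node the Pod is scheduled on
    - name: POD_NAME
      valueFrom:
        fieldRef:
          fieldPath: metadata.name
    - name: POD_NAMESPACE
      valueFrom:
        fieldRef:
          fieldPath: metadata.namespace
    - name: HOST_IP
      valueFrom:
        fieldRef:
          fieldPath: status.hostIP       # primary IP of the node
```

After applying it, kubectl logs fieldref-demo should show the injected values.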

Matthew L Daniels' answer describes a valid approach: querying the kubernetes API using the node name, which is obtained from an env var. The difficulty lies in giving the pod the proper RBAC access and setting up an init container.
Here is the kubernetes YAML that implements this with an init container using the python kubernetes client:
### This serviceAccount gives the kafka sidecar the permission to query the kubernetes API for node information so that it can find out the advertise host (node public dns name) for the kafka which uses hostPort to be as efficient as possible.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: node-reader-service-account
  namespace: default
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: node-reader-cluster-role
rules:
- apiGroups: [""] # The API group "" indicates the core API Group.
  resources: ["nodes"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: read-nodes-rolebinding
subjects:
- kind: ServiceAccount # May be "User", "Group" or "ServiceAccount"
  name: node-reader-service-account
  namespace: default
roleRef:
  kind: ClusterRole
  name: node-reader-cluster-role
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  annotations:
  creationTimestamp: null
  labels:
    io.kompose.service: kafka
  name: kafka
spec:
  replicas: 1
  strategy:
    type: Recreate
  template:
    metadata:
      creationTimestamp: null
      labels:
        io.kompose.service: kafka
    spec:
      serviceAccountName: node-reader-service-account
      containers:
      - name: kafka
        image: someImage
        resources: {}
        command: ["/bin/sh"]
        args: ["-c", "export KAFKA_ADVERTISED_LISTENERS=$(cat '/etc/sidecar-data/dnsName') && env | grep KAFKA_ADVERTISED_LISTENERS && /start-kafka.sh"]
        volumeMounts:
        - name: sidecar-data
          mountPath: /etc/sidecar-data/
      initContainers:
      - name: kafka-sidecar
        image: sidecarImage
        command: ["python"]
        args: ["/script/getHostDnsName.py", "$(KUBE_NODE_NAME)", "/etc/sidecar-data/dnsName"]
        env:
        - name: KUBE_NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        volumeMounts:
        - name: sidecar-data
          mountPath: /etc/sidecar-data/
      volumes:
      - name: sidecar-data
        emptyDir: {}
      restartPolicy: Always
status: {}

kube-controller-manager outputs an error "cannot change NodeName"

I use kubernetes on AWS with CoreOS & flannel VLAN network.
(followed this guide https://coreos.com/kubernetes/docs/latest/getting-started.html)
k8s version is 1.4.6.
And I have the following node-exporter daemon-set.
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: node-exporter
  labels:
    app: node-exporter
    tier: monitor
    category: platform
spec:
  template:
    metadata:
      labels:
        app: node-exporter
        tier: monitor
        category: platform
      name: node-exporter
    spec:
      containers:
      - image: prom/node-exporter:0.12.0
        name: node-exporter
        ports:
        - containerPort: 9100
          hostPort: 9100
          name: scrape
      hostNetwork: true
      hostPID: true
When I run this, kube-controller-manager outputs an error repeatedly as below:
E1117 18:31:23.197206 1 endpoints_controller.go:513]
Endpoints "node-exporter" is invalid:
[subsets[0].addresses[0].nodeName: Forbidden: Cannot change NodeName for 172.17.64.5 to ip-172-17-64-5.ec2.internal,
subsets[0].addresses[1].nodeName: Forbidden: Cannot change NodeName for 172.17.64.6 to ip-172-17-64-6.ec2.internal,
subsets[0].addresses[2].nodeName: Forbidden: Cannot change NodeName for 172.17.80.5 to ip-172-17-80-5.ec2.internal,
subsets[0].addresses[3].nodeName: Forbidden: Cannot change NodeName for 172.17.80.6 to ip-172-17-80-6.ec2.internal,
subsets[0].addresses[4].nodeName: Forbidden: Cannot change NodeName for 172.17.96.6 to ip-172-17-96-6.ec2.internal]
For information, despite this error message, node_exporter is accessible at, e.g., 172-17-96-6:9100. My nodes, including the k8s master, are in a private network.
But these log lines are so numerous that they make it difficult to spot other logs in our log console. How can I resolve this error?
Because I built my k8s cluster from scratch, the cloud-provider=aws flag was not enabled at first. I turned it on recently, but I am not sure whether it is related to this issue.

It looks like this is caused by another of my manifest files:
apiVersion: v1
kind: Service
metadata:
  name: node-exporter
  labels:
    app: node-exporter
    tier: monitor
    category: platform
  annotations:
    prometheus.io/scrape: 'true'
spec:
  clusterIP: None
  ports:
  - name: scrape
    port: 9100
    protocol: TCP
  selector:
    app: node-exporter
  type: ClusterIP
I thought this was necessary to expose the node-exporter DaemonSet above, but it may instead introduce some sort of conflict when hostNetwork: true is set in a DaemonSet (actually, a Pod) manifest. I'm not 100% certain, but after I deleted this Service the error disappeared, and I can still access 172-17-96-6:9100 from outside the k8s cluster.
I followed this post when setting up Prometheus and node-exporter:
https://coreos.com/blog/prometheus-and-kubernetes-up-and-running.html
I'm leaving this here in case others face the same problem.
