Team,
I've been playing around with Istio 1.7 and outlier detection; here are some odd things I found.
vs-dr.yaml
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
name: recommendation
spec:
hosts:
- "recommendation-demo.com"
gateways:
- istio-system/monitoring-gateway
http:
- name: "other-account-route"
route:
- destination:
host: recommendation
subset: v2
weight: 100
- destination:
host: recommendation
subset: v1
weight: 0
---
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
name: recomm-dr
spec:
host: recommendation
subsets:
- name: v2
labels:
version: v2
trafficPolicy:
loadBalancer:
simple: ROUND_ROBIN
connectionPool:
tcp: {}
http: {}
outlierDetection:
consecutiveErrors: 2
interval: 1s
baseEjectionTime: 30s
maxEjectionPercent: 10
- name: v1
labels:
version: v1
If outlier detection is not configured in the DestinationRule, load balancing works as expected:
kubectl -n micro exec -it $CLIENT_POD -c istio-proxy -- sh -c 'while true; do curl -L recommendation-demo.com; sleep 1; done'
recommendation v2 from 'recommendation-v2-57ddf9cd95-wb7rj': 45
recommendation v2 from 'recommendation-v2-57ddf9cd95-skkgd': 851
recommendation v2 from 'recommendation-v2-57ddf9cd95-jtkrz': 44
recommendation v2 from 'recommendation-v2-57ddf9cd95-wb7rj': 46
recommendation v2 from 'recommendation-v2-57ddf9cd95-skkgd': 852
recommendation v2 from 'recommendation-v2-57ddf9cd95-jtkrz': 45
recommendation v2 from 'recommendation-v2-57ddf9cd95-wb7rj': 47
recommendation v2 from 'recommendation-v2-57ddf9cd95-skkgd': 853
recommendation v2 from 'recommendation-v2-57ddf9cd95-jtkrz': 46
recommendation v2 from 'recommendation-v2-57ddf9cd95-wb7rj': 48
recommendation v2 from 'recommendation-v2-57ddf9cd95-jtkrz': 47
recommendation v2 from 'recommendation-v2-57ddf9cd95-skkgd': 854
But after I add this part:
outlierDetection:
consecutiveErrors: 2
interval: 1s
baseEjectionTime: 30s
maxEjectionPercent: 50
the only results I get are from a single pod:
recommendation v2 from 'recommendation-v2-57ddf9cd95-skkgd': 1321
recommendation v2 from 'recommendation-v2-57ddf9cd95-skkgd': 1322
recommendation v2 from 'recommendation-v2-57ddf9cd95-skkgd': 1323
recommendation v2 from 'recommendation-v2-57ddf9cd95-skkgd': 1324
recommendation v2 from 'recommendation-v2-57ddf9cd95-skkgd': 1325
recommendation v2 from 'recommendation-v2-57ddf9cd95-skkgd': 1326
recommendation v2 from 'recommendation-v2-57ddf9cd95-skkgd': 1327
And by the way, after I add the outlier config and then scale out the deployment, the newly added pod does get routed to successfully:
recommendation v2 from 'recommendation-v2-57ddf9cd95-xhq4n': 32
recommendation v2 from 'recommendation-v2-57ddf9cd95-xhq4n': 33
recommendation v2 from 'recommendation-v2-57ddf9cd95-skkgd': 1364
recommendation v2 from 'recommendation-v2-57ddf9cd95-xhq4n': 34
recommendation v2 from 'recommendation-v2-57ddf9cd95-skkgd': 1365
recommendation v2 from 'recommendation-v2-57ddf9cd95-xhq4n': 35
recommendation v2 from 'recommendation-v2-57ddf9cd95-skkgd': 1366
recommendation v2 from 'recommendation-v2-57ddf9cd95-xhq4n': 36
So my questions are:
Is this expected behaviour? In this case, let's say we have 3 pods in one ReplicaSet and apply the outlier configs; will requests then only be routed to the youngest pod, recommendation-v2-57ddf9cd95-skkgd?
With the ReplicaSet and outlier configs in place, if we then add extra pods to the ReplicaSet, can they be load-balanced successfully?
Does anyone have working configs for outliers?
Any replies much appreciated!
I have created the example below based on this video on YouTube and this GitHub repository.
It's based on 1 Deployment with a Service and the appropriate Gateway, VirtualService and DestinationRule.
Tested on GKE with Istio 1.7.4.
Example YAMLs:
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: recommendation
version: v2
name: recommendation-v2
spec:
replicas: 2
selector:
matchLabels:
app: recommendation
version: v2
template:
metadata:
labels:
app: recommendation
version: v2
annotations:
sidecar.istio.io/inject: "true"
spec:
containers:
- env:
- name: JAVA_OPTIONS
value: -Xms15m -Xmx15m -Xmn15m
name: recommendation
image: quay.io/rhdevelopers/istio-tutorial-recommendation:v2.2
imagePullPolicy: IfNotPresent
ports:
- containerPort: 8080
name: http
protocol: TCP
- containerPort: 8778
name: jolokia
protocol: TCP
- containerPort: 9779
name: prometheus
protocol: TCP
resources:
requests:
memory: "80Mi"
cpu: "200m" # 1/5 core
limits:
memory: "120Mi"
cpu: "500m"
livenessProbe:
exec:
command:
- curl
- localhost:8080/health/live
initialDelaySeconds: 5
periodSeconds: 4
timeoutSeconds: 1
readinessProbe:
exec:
command:
- curl
- localhost:8080/health/ready
initialDelaySeconds: 6
periodSeconds: 5
timeoutSeconds: 1
securityContext:
privileged: false
---
apiVersion: v1
kind: Service
metadata:
name: recommendation
labels:
app: recommendation
spec:
ports:
- name: http
port: 8080
selector:
app: recommendation
---
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
name: my-gateway
spec:
selector:
istio: ingressgateway
servers:
- port:
number: 80
name: http
protocol: HTTP
hosts:
- "*"
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
name: recommendation
spec:
hosts:
- "*"
gateways:
- "my-gateway"
http:
- name: "other-account-route"
route:
- destination:
host: recommendation
subset: v2
weight: 100
---
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
name: recomm-dr
spec:
host: recommendation
subsets:
- name: v2
labels:
version: v2
trafficPolicy:
loadBalancer:
simple: ROUND_ROBIN
outlierDetection:
consecutiveErrors: 1
interval: 1s
baseEjectionTime: 60s
maxEjectionPercent: 100
1. Is this expected behaviour? In this case, let's say we have 3 pods in one ReplicaSet and apply the outlier configs; will requests then only be routed to the youngest pod, recommendation-v2-57ddf9cd95-skkgd?
No, after you apply this outlierDetection it should work as before, unless the pods start returning 503s.
2. With the ReplicaSet and outlier configs in place, if we then add extra pods to the ReplicaSet, can they be load-balanced successfully?
Yes, they should be load-balanced successfully.
Here is a test with the above YAMLs.
I added the outlierDetection below:
outlierDetection:
consecutiveErrors: 1
interval: 10s
baseEjectionTime: 90s
maxEjectionPercent: 100
With 2 replicas and outlierDetection.
recommendation v2 from 'recommendation-v2-7f76b4c8cc-6tvmj': 1
recommendation v2 from 'recommendation-v2-7f76b4c8cc-htz56': 1
recommendation v2 from 'recommendation-v2-7f76b4c8cc-6tvmj': 2
recommendation v2 from 'recommendation-v2-7f76b4c8cc-htz56': 2
recommendation v2 from 'recommendation-v2-7f76b4c8cc-6tvmj': 3
recommendation v2 from 'recommendation-v2-7f76b4c8cc-htz56': 3
recommendation v2 from 'recommendation-v2-7f76b4c8cc-6tvmj': 4
recommendation v2 from 'recommendation-v2-7f76b4c8cc-htz56': 4
recommendation v2 from 'recommendation-v2-7f76b4c8cc-6tvmj': 5
recommendation v2 from 'recommendation-v2-7f76b4c8cc-htz56': 5
recommendation v2 from 'recommendation-v2-7f76b4c8cc-6tvmj': 6
With 2 replicas, outlierDetection, and 2 more replicas added with kubectl scale deployment recommendation-v2 --replicas=4:
recommendation v2 from 'recommendation-v2-7f76b4c8cc-htz56': 15
recommendation v2 from 'recommendation-v2-7f76b4c8cc-6tvmj': 17
recommendation v2 from 'recommendation-v2-7f76b4c8cc-htz56': 16
recommendation v2 from 'recommendation-v2-7f76b4c8cc-6tvmj': 18
recommendation v2 from 'recommendation-v2-7f76b4c8cc-htz56': 17
recommendation v2 from 'recommendation-v2-7f76b4c8cc-6tvmj': 19
recommendation v2 from 'recommendation-v2-7f76b4c8cc-htz56': 18
recommendation v2 from 'recommendation-v2-7f76b4c8cc-6tvmj': 20
recommendation v2 from 'recommendation-v2-7f76b4c8cc-htz56': 19
recommendation v2 from 'recommendation-v2-7f76b4c8cc-htz56': 20
recommendation v2 from 'recommendation-v2-7f76b4c8cc-ml9m7': 1
recommendation v2 from 'recommendation-v2-7f76b4c8cc-6tvmj': 21
recommendation v2 from 'recommendation-v2-7f76b4c8cc-htz56': 21
recommendation v2 from 'recommendation-v2-7f76b4c8cc-ml9m7': 2
recommendation v2 from 'recommendation-v2-7f76b4c8cc-htz56': 22
recommendation v2 from 'recommendation-v2-7f76b4c8cc-6tvmj': 22
recommendation v2 from 'recommendation-v2-7f76b4c8cc-ml9m7': 3
recommendation v2 from 'recommendation-v2-7f76b4c8cc-htz56': 23
recommendation v2 from 'recommendation-v2-7f76b4c8cc-kvqjk': 1
recommendation v2 from 'recommendation-v2-7f76b4c8cc-6tvmj': 23
recommendation v2 from 'recommendation-v2-7f76b4c8cc-htz56': 24
recommendation v2 from 'recommendation-v2-7f76b4c8cc-ml9m7': 4
recommendation v2 from 'recommendation-v2-7f76b4c8cc-ml9m7': 5
recommendation v2 from 'recommendation-v2-7f76b4c8cc-kvqjk': 2
recommendation v2 from 'recommendation-v2-7f76b4c8cc-kvqjk': 3
recommendation v2 from 'recommendation-v2-7f76b4c8cc-6tvmj': 24
recommendation v2 from 'recommendation-v2-7f76b4c8cc-htz56': 25
The 2 new replicas, ml9m7 and kvqjk, have been added and are receiving traffic.
Does anyone have working configs for outliers? Any replies much appreciated!
If I understand correctly how this should work, then the above example works correctly: if you manually make 1 of the pods return 503, it's ejected from the pool and added back after 90s.
The video above shows a way to make a recommendation replica return 503:
kubectl exec -ti recommendation-v2-7f76b4c8cc-6tvmj -c recommendation -- /bin/bash
bash-4.4# curl localhost:8080/misbehave
Following requests to / will return a 503
If you then start sending traffic, you can check the logs of the replica that should be returning 503 with:
kubectl logs recommendation-v2-7f76b4c8cc-6tvmj -c recommendation --tail 10
There are a few requests every 90s: after Istio detects the 503s, the endpoint is ejected by outlierDetection; after 90s Istio will try to send traffic to it again, and so on.
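To confirm an ejection is actually happening, you can also look at Envoy's outlier-detection stats on the calling proxy (the client sidecar, or the ingress gateway pod if you send traffic through the gateway). This is a sketch, not from the original test; the pod name is a placeholder and the stat names are the standard Envoy outlier-detection counters:

kubectl exec -it <calling-proxy-pod> -c istio-proxy -- pilot-agent request GET stats | grep recommendation | grep outlier_detection

outlier_detection.ejections_enforced_total increases each time an endpoint is ejected, and outlier_detection.ejections_active goes back to 0 once the endpoint is re-added after baseEjectionTime.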
Additional resources:
https://dzone.com/articles/istio-circuit-breaker-with-outlier-detection
https://www.citrix.com/blogs/2020/07/15/outlier-detection-using-citrix-adc-in-istio-service-mesh/
https://istio.io/latest/docs/tasks/traffic-management/circuit-breaking/
So this issue seems to be related to the locality load balancer.
When outlierDetection is not defined, locality failover is disabled, so locality is not used. That is why the load balancer works correctly.
But after setting outlierDetection, locality failover is enabled by default, so requests are load-balanced within a locality.
If you want to be sure, disable it explicitly:
loadBalancer:
simple: RANDOM
localityLbSetting:
enabled: false
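Put together with the yaml from the original question, the v2 subset's trafficPolicy would then look roughly like this (a sketch based on the questioner's DestinationRule, keeping their ROUND_ROBIN; only localityLbSetting is new, everything else is unchanged):

trafficPolicy:
  loadBalancer:
    simple: ROUND_ROBIN
    localityLbSetting:
      enabled: false   # locality failover is turned on by default once outlierDetection is set
  outlierDetection:
    consecutiveErrors: 2
    interval: 1s
    baseEjectionTime: 30s
    maxEjectionPercent: 50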
Related
On GKE, I have a Deployment working fine: status Running and health checks passing.
Here it is:
apiVersion: apps/v1
kind: Deployment
metadata:
name: erp-app
labels:
app: erp-app
switch: app
spec:
replicas: 1
selector:
matchLabels:
app: erp-app
template:
metadata:
labels:
app: erp-app
spec:
containers:
- name: erp-container
# Extract this from Google Container Registry
image: gcr.io/project/project:latest
imagePullPolicy: Always
env:
ports:
- containerPort: 8080
livenessProbe:
failureThreshold: 10
httpGet:
path: /
port: 8080
scheme: HTTP
initialDelaySeconds: 150
periodSeconds: 30
successThreshold: 1
timeoutSeconds: 30
readinessProbe:
failureThreshold: 10
httpGet:
path: /
port: 8080
scheme: HTTP
initialDelaySeconds: 150
periodSeconds: 30
successThreshold: 1
timeoutSeconds: 20
Then I created a Service to map port 8080 to 80:
apiVersion: v1
kind: Service
metadata:
labels:
app: erp-app
name: erp-loadbalancer
spec:
ports:
- port: 80
protocol: TCP
targetPort: 8080
selector:
app: erp-app
sessionAffinity: None
type: NodePort
And then a GKE Ingress:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: app-ingress
annotations:
networking.gke.io/managed-certificates: managed-cert
kubernetes.io/ingress.class: "gce"
spec:
defaultBackend:
service:
name: erp-loadbalancer
port:
number: 80
Thing is, the Ingress does not want to work because the backend health check does not pass. If I check the health check in the console (https://console.cloud.google.com/compute/healthChecks), it has been created for HTTP on port 80 at / (on this path, the app is serving a 200).
If I force it to be TCP, then the health check passes. But Google automatically switches it back to HTTP, which leads to a 404.
My question here would be: what's wrong in my configuration such that my server is available behind an external load balancer but not when using an Ingress (backend unhealthy state)?
ANSWER:
My page / was sending a 301 redirect to /login.jsp.
301 is not a valid status code for GCP health checks.
So I changed the health check path and my readinessProbe to /login.jsp, and that made the whole Ingress work fine.
Cross sharing in case someone has the same issue:
https://stackoverflow.com/a/74564439/4547221
TL;DR
GCP will automatically create the health checks based on the readinessProbe; otherwise, the root path / is used.
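Concretely, the fix boils down to the readinessProbe from the Deployment above with only the path changed (a sketch; all other values kept from the original manifest, and the GCP health check is then derived from it):

readinessProbe:
  failureThreshold: 10
  httpGet:
    path: /login.jsp   # returns 200, unlike / which 301-redirects
    port: 8080
    scheme: HTTP
  initialDelaySeconds: 150
  periodSeconds: 30
  successThreshold: 1
  timeoutSeconds: 20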
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
name: rabbitmq
namespace: staging
spec:
chart:
spec:
chart: rabbitmq
sourceRef:
kind: HelmRepository
name: bitnami
namespace: kube-system
version: "8.22.3"
interval: 5m
releaseName: rabbitmq
values:
priorityClassName: system-node-critical
replicaCount: 4
auth:
username: staging
password: rabbitmq-secret
existingPasswordSecret: rabbitmq-secret
erlangCookie: **************************
extraConfiguration: |-
default_vhost=use2-mmc-1021
consumer_timeout=86400000
memoryHighWatermark:
enabled: "true"
type: absolute
value: 1536MB
resources:
limits:
cpu: 500m
memory: 2048Mi
requests:
cpu: 500m
memory: 2048Mi
livenessProbe:
initialDelaySeconds: 120
timeoutSeconds: 20
periodSeconds: 30
failureThreshold: 6
successThreshold: 1
readinessProbe:
initialDelaySeconds: 10
timeoutSeconds: 20
periodSeconds: 30
failureThreshold: 3
successThreshold: 1
service:
type: ClusterIP
serviceAccount:
create: true
This is my code for the rabbitmq.yaml file.
I want to prevent my pod from restarting, as restarting the pod takes some time, which results in server errors.
The cluster autoscaler should not affect my RabbitMQ pod.
I've already tried using system-node-critical in my code; it didn't make any difference.
Please help me out with some answers to my problem:
how do I make sure the cluster autoscaler does not affect my RabbitMQ pod, or how do I make my pod system-node-critical?
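This wasn't answered in the thread, but one commonly used knob for exactly this is the cluster-autoscaler safe-to-evict annotation on the pods. A sketch, assuming the Bitnami chart version above exposes podAnnotations in its values (worth verifying against the chart):

values:
  podAnnotations:
    cluster-autoscaler.kubernetes.io/safe-to-evict: "false"   # ask the cluster autoscaler not to evict these pods on scale-down

Note this only stops autoscaler-driven evictions; it does not protect against node failures or voluntary disruptions such as rolling upgrades (a PodDisruptionBudget helps with the latter).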
Here is my setup.
I have 2 AWS accounts.
Applications account
Monitoring account
The Applications account has EKS + Istio + application-related microservices + Promtail agents.
The Monitoring account has a centralized logging system within EKS + Istio (Grafana, Prometheus and Loki pods running).
From the Applications account, I want to push logs to Loki in the Monitoring account. I tried exposing the Loki service outside the Monitoring account, but I am facing issues setting the Loki URL as https://<DNS_URL>/loki. I tried this change using the suggestions here and here, but that is not working for me. I have installed loki-stack from this url.
The question is: how can I access the Loki URL from the Applications account so that it can be configured in Promtail in the Applications account?
Please note both accounts run these as pods within EKS, not standalone Loki or Promtail.
Thanks and regards.
apiVersion: v1
kind: Service
metadata:
annotations:
meta.helm.sh/release-name: loki
meta.helm.sh/release-namespace: monitoring
creationTimestamp: "2021-10-25T14:59:20Z"
labels:
app: loki
app.kubernetes.io/managed-by: Helm
chart: loki-2.5.0
heritage: Helm
release: loki
name: loki
namespace: monitoring
resourceVersion: "18279654"
uid: 7eba14cb-41c9-445d-bedb-4b88647f1ebc
spec:
clusterIP: 172.20.217.122
clusterIPs:
- 172.20.217.122
ports:
- name: metrics
port: 80
protocol: TCP
targetPort: 3100
selector:
app: loki
release: loki
sessionAffinity: None
type: ClusterIP
status:
loadBalancer: {}
---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
generation: 14
name: grafana-vs
namespace: monitoring
resourceVersion: "18256422"
uid: e8969da7-062c-49d6-9152-af8362c08016
spec:
gateways:
- my-gateway
hosts:
- '*'
http:
- match:
- uri:
prefix: /grafana/
name: grafana-ui
rewrite:
uri: /
route:
- destination:
host: prometheus-operator-grafana.monitoring.svc.cluster.local
port:
number: 80
- match:
- uri:
prefix: /loki
name: loki-ui
rewrite:
uri: /loki
route:
- destination:
host: loki.monitoring.svc.cluster.local
port:
number: 80
---
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
annotations:
kubectl.kubernetes.io/last-applied-configuration: |
{"apiVersion":"networking.istio.io/v1alpha3","kind":"Gateway","metadata":{"annotations":{},"name":"my-gateway","namespace":"monitoring"},"spec":{"selector":{"istio":"ingressgateway"},"servers":[{"hosts":["*"],"port":{"name":"http","number":80,"protocol":"HTTP"}}]}}
creationTimestamp: "2021-10-18T12:28:05Z"
generation: 1
name: my-gateway
namespace: monitoring
resourceVersion: "16618724"
uid: 9b254a22-958c-4cc4-b426-4e7447c03b87
spec:
selector:
istio: ingressgateway
servers:
- hosts:
- '*'
port:
name: http
number: 80
protocol: HTTP
---
apiVersion: v1
items:
- apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
annotations:
alb.ingress.kubernetes.io/scheme: internal
alb.ingress.kubernetes.io/target-type: ip
kubectl.kubernetes.io/last-applied-configuration: |
{"apiVersion":"networking.k8s.io/v1beta1","kind":"Ingress","metadata":{"annotations":{"alb.ingress.kubernetes.io/scheme":"internal","alb.ingress.kubernetes.io/target-type":"ip","kubernetes.io/ingress.class":"alb"},"name":"ingress-alb","namespace":"istio-system"},"spec":{"rules":[{"http":{"paths":[{"backend":{"serviceName":"istio-ingressgateway","servicePort":80},"path":"/*"}]}}]}}
kubernetes.io/ingress.class: alb
finalizers:
- ingress.k8s.aws/resources
generation: 1
name: ingress-alb
namespace: istio-system
resourceVersion: "4447931"
uid: 74b31fba-0f03-41c6-a63f-6a10dee8780c
spec:
rules:
- http:
paths:
- backend:
service:
name: istio-ingressgateway
port:
number: 80
path: /*
pathType: ImplementationSpecific
status:
loadBalancer:
ingress:
- hostname: internal-k8s-istiosys-ingressa-25a256ef4d-1368971909.us-east-1.elb.amazonaws.com
kind: List
metadata:
resourceVersion: ""
selfLink: ""
The Ingress is associated with an AWS ALB.
I want to access Loki via the ALB URL, like http(s)://my-alb-url/loki.
I hope I have provided the required details now.
Let me know. Thanks.
...how can I access the Loki URL from the Applications account so that it can be configured in Promtail in the Applications account?
You didn't describe what issue you hit with the external LB above, which should work. In any case, since that method goes over the Internet, the security risk is higher and there is egress cost, given the volume of logging. You can use PrivateLink in this case; see page 16, Shared Services. Your Promtail will then use the endpoint's ENI DNS name as the Loki target.
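On the Promtail side the change is then just the client URL (a sketch; the hostname is a placeholder for the PrivateLink endpoint's DNS name, and the port/path depend on how Loki is exposed, /loki/api/v1/push being Loki's standard push endpoint). However your Promtail chart exposes it, the rendered Promtail config ends up with a clients section like this:

clients:
  - url: http://<privatelink-endpoint-dns-name>:3100/loki/api/v1/push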
I have created a GKE Ingress setup as follows:
apiVersion: cloud.google.com/v1beta1 #tried cloud.google.com/v1 as well
kind: BackendConfig
metadata:
name: backend-config
namespace: prod
spec:
healthCheck:
checkIntervalSec: 30
port: 8080
type: HTTP #case-sensitive
requestPath: /healthcheck
connectionDraining:
drainingTimeoutSec: 60
---
apiVersion: v1
kind: Service
metadata:
name: web-engine-service
namespace: prod
annotations:
cloud.google.com/neg: '{"ingress": true}' # Creates a NEG after an Ingress is created.
cloud.google.com/backend-config: '{"ports": {"web-engine-port":"backend-config"}}' #https://cloud.google.com/kubernetes-engine/docs/how-to/ingress-features#associating_backendconfig_with_your_ingress
spec:
selector:
app: web-engine-pod
ports:
- name: web-engine-port
protocol: TCP
port: 8080
targetPort: 5000
---
apiVersion: apps/v1
kind: Deployment
metadata:
annotations:
deployment.kubernetes.io/revision: "1"
labels:
app: web-engine-deployment
environment: prod
name: web-engine-deployment
namespace: prod
spec:
progressDeadlineSeconds: 600
replicas: 1
revisionHistoryLimit: 10
selector:
matchLabels:
app: web-engine-pod
strategy:
rollingUpdate:
maxSurge: 25%
maxUnavailable: 25%
type: RollingUpdate
template:
metadata:
name: web-engine-pod
labels:
app: web-engine-pod
environment: prod
spec:
containers:
- image: my-image:my-tag
imagePullPolicy: Always
name: web-engine-1
resources: {}
ports:
- name: flask-port
containerPort: 5000
protocol: TCP
readinessProbe:
httpGet:
path: /healthcheck
port: 5000
initialDelaySeconds: 30
periodSeconds: 100
restartPolicy: Always
terminationGracePeriodSeconds: 30
---
apiVersion: networking.gke.io/v1beta2
kind: ManagedCertificate
metadata:
name: my-certificate
namespace: prod
spec:
domains:
- api.mydomain.com #https://cloud.google.com/load-balancing/docs/ssl-certificates/google-managed-certs#renewal
---
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
name: prod-ingress
namespace: prod
annotations:
kubernetes.io/ingress.allow-http: "false"
kubernetes.io/ingress.global-static-ip-name: load-balancer-ip
networking.gke.io/managed-certificates: my-certificate
spec:
rules:
- http:
paths:
- path: /model
backend:
serviceName: web-engine-service
servicePort: 8080
I don't know what I'm doing wrong, because my health checks are not OK. And based on the perimeter logging I added to the app, nothing is even trying to hit that pod.
I've tried the BackendConfig with both 8080 and 5000.
By the way, it's not 100% clear from the docs whether the load balancer should be configured against the targetPorts of the corresponding Pods or of the Services.
The health check is registered with the HTTP load balancer and Compute Engine.
It seems that something is not right with the Backend service IP.
The corresponding backend service configuration:
$ gcloud compute backend-services describe k8s1-85ef2f9a-prod-web-engine-service-8080-b938a707
...
affinityCookieTtlSec: 0
backends:
- balancingMode: RATE
capacityScaler: 1.0
group: https://www.googleapis.com/compute/v1/projects/wnd/zones/europe-west3-a/networkEndpointGroups/k8s1-85ef2f9a-prod-web-engine-service-8080-b938a707
maxRatePerEndpoint: 1.0
connectionDraining:
drainingTimeoutSec: 60
creationTimestamp: '2020-08-01T11:14:06.096-07:00'
description: '{"kubernetes.io/service-name":"prod/web-engine-service","kubernetes.io/service-port":"8080","x-features":["NEG"]}'
enableCDN: false
fingerprint: 5Vkqvg9lcRg=
healthChecks:
- https://www.googleapis.com/compute/v1/projects/wnd/global/healthChecks/k8s1-85ef2f9a-prod-web-engine-service-8080-b938a707
id: '2233674285070159361'
kind: compute#backendService
loadBalancingScheme: EXTERNAL
logConfig:
enable: true
sampleRate: 1.0
name: k8s1-85ef2f9a-prod-web-engine-service-8080-b938a707
port: 80
portName: port0
protocol: HTTP
selfLink: https://www.googleapis.com/compute/v1/projects/wnd/global/backendServices/k8s1-85ef2f9a-prod-web-engine-service-8080-b938a707
sessionAffinity: NONE
timeoutSec: 30
(port 80 looks really suspicious but I thought maybe it's just left there as default and is not in use when NEGs are configured).
Figured it out. By default, even the latest GKE clusters are created without IP alias support (also called VPC-native). I didn't even bother to check that initially because:
NEGs are supported out of the box and, what's more, they seem to be the default, with no need for an explicit annotation on the GKE version I had (1.17.8-gke.17). It doesn't make sense not to enable IP aliases by default then, because it basically means the cluster is in a non-functional state by default.
I didn't check VPC-native support initially because the name of the feature is simply misleading. I had extensive prior experience with AWS, and my faulty assumption was that VPC-native is like EC2-VPC, as opposed to EC2-Classic, which is legacy.
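If you want to check this on an existing cluster, the describe output shows whether IP aliases (VPC-native) are on; a sketch with placeholder cluster name and zone:

gcloud container clusters describe CLUSTER_NAME --zone ZONE --format="value(ipAllocationPolicy.useIpAliases)"

It prints True for VPC-native clusters. As far as I know you cannot switch an existing routes-based cluster over; a new cluster has to be created with --enable-ip-alias.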
I'm trying to configure an Istio VirtualService / DestinationRule so that a gRPC call to the service from a pod labeled datacenter=chi5 is routed to a gRPC server on a pod labeled datacenter=chi5.
I have Istio 1.4 installed on a cluster running Kubernetes 1.15.
A route is not getting created in the Istio sidecar's Envoy config for the chi5 subset, and traffic is being routed round-robin between all service endpoints regardless of pod label.
Kiali is reporting an error in the DestinationRule config: "this subset's labels are not found in any matching host".
Do I misunderstand the functionality of these Istio traffic management objects, or is there an error in my configuration?
I believe my pods are correctly labeled:
$ (dev) kubectl get pods -n istio-demo --show-labels
NAME READY STATUS RESTARTS AGE LABELS
ticketclient-586c69f77d-wkj5d 2/2 Running 0 158m app=ticketclient,datacenter=chi6,pod-template-hash=586c69f77d,run=client-service,security.istio.io/tlsMode=istio
ticketserver-7654cb5f88-bqnqb 2/2 Running 0 158m app=ticketserver,datacenter=chi5,pod-template-hash=7654cb5f88,run=ticket-service,security.istio.io/tlsMode=istio
ticketserver-7654cb5f88-pms25 2/2 Running 0 158m app=ticketserver,datacenter=chi6,pod-template-hash=7654cb5f88,run=ticket-service,security.istio.io/tlsMode=istio
The port name on my k8s Service object is correctly prefixed with the grpc protocol:
$ (dev) kubectl describe service -n istio-demo ticket-service
Name: ticket-service
Namespace: istio-demo
Labels: app=ticketserver
Annotations: <none>
Selector: run=ticket-service
Type: ClusterIP
IP: 10.234.14.53
Port: grpc-ticket 10000/TCP
TargetPort: 6001/TCP
Endpoints: 10.37.128.37:6001,10.44.0.0:6001
Session Affinity: None
Events: <none>
I've deployed the following Istio objects to Kubernetes:
Kind: VirtualService
Name: ticket-destinationrule
Namespace: istio-demo
Labels: app=ticketserver
Annotations: <none>
API Version: networking.istio.io/v1alpha3
Kind: DestinationRule
Spec:
Host: ticket-service.istio-demo.svc.cluster.local
Subsets:
Labels:
Datacenter: chi5
Name: chi5
Labels:
Datacenter: chi6
Name: chi6
Events: <none>
---
Name: ticket-virtualservice
Namespace: istio-demo
Labels: app=ticketserver
Annotations: <none>
API Version: networking.istio.io/v1alpha3
Kind: VirtualService
Spec:
Hosts:
ticket-service.istio-demo.svc.cluster.local
Http:
Match:
Name: ticket-chi5
Port: 10000
Source Labels:
Datacenter: chi5
Route:
Destination:
Host: ticket-service.istio-demo.svc.cluster.local
Subset: chi5
Events: <none>
I have made a reproduction of your issue with 2 nginx pods.
What you want can be achieved with sourceLabels; check the example below, I think it explains everything.
To start, I made 2 Ubuntu pods, 1 with the label app: ubuntu and 1 without any labels.
apiVersion: v1
kind: Pod
metadata:
name: ubu2
labels:
app: ubuntu
spec:
containers:
- name: ubu2
image: ubuntu
command: ["/bin/sh"]
args: ["-c", "apt-get update && apt-get install curl -y && sleep 3000"]
apiVersion: v1
kind: Pod
metadata:
name: ubu1
spec:
containers:
- name: ubu1
image: ubuntu
command: ["/bin/sh"]
args: ["-c", "apt-get update && apt-get install curl -y && sleep 3000"]
Then 2 Deployments with a Service.
apiVersion: apps/v1
kind: Deployment
metadata:
name: nginx1
spec:
selector:
matchLabels:
run: nginx1
replicas: 1
template:
metadata:
labels:
run: nginx1
app: frontend
spec:
containers:
- name: nginx1
image: nginx
ports:
- containerPort: 80
lifecycle:
postStart:
exec:
command: ["/bin/sh", "-c", "echo Hello nginx1 > /usr/share/nginx/html/index.html"]
apiVersion: apps/v1
kind: Deployment
metadata:
name: nginx2
spec:
selector:
matchLabels:
run: nginx2
replicas: 1
template:
metadata:
labels:
run: nginx2
app: frontend
spec:
containers:
- name: nginx2
image: nginx
ports:
- containerPort: 80
lifecycle:
postStart:
exec:
command: ["/bin/sh", "-c", "echo Hello nginx2 > /usr/share/nginx/html/index.html"]
apiVersion: v1
kind: Service
metadata:
name: nginx
labels:
app: frontend
spec:
ports:
- port: 80
protocol: TCP
selector:
app: frontend
Next is the VirtualService, attached to the mesh gateway so it applies only inside the mesh, with 2 matches: 1 with sourceLabels, which routes requests from pods carrying the app: ubuntu label to the nginx pods in the v1 subset, and a default match which routes to the nginx pods in the v2 subset.
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
name: nginxvirt
spec:
gateways:
- mesh
hosts:
- nginx.default.svc.cluster.local
http:
- name: match-myuid
match:
- sourceLabels:
app: ubuntu
route:
- destination:
host: nginx.default.svc.cluster.local
port:
number: 80
subset: v1
- name: default
route:
- destination:
host: nginx.default.svc.cluster.local
port:
number: 80
subset: v2
And the last thing is the DestinationRule, which takes the subsets from the VirtualService and sends traffic to the proper nginx pod, labeled either run: nginx1 or run: nginx2.
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
name: nginxdest
spec:
host: nginx.default.svc.cluster.local
subsets:
- name: v1
labels:
run: nginx1
- name: v2
labels:
run: nginx2
kubectl get pods --show-labels
NAME READY STATUS RESTARTS AGE LABELS
nginx1-5c5b84567c-tvtzm 2/2 Running 0 23m app=frontend,run=nginx1,security.istio.io/tlsMode=istio
nginx2-5d95c8b96-6m9zb 2/2 Running 0 23m app=frontend,run=nginx2,security.istio.io/tlsMode=istio
ubu1 2/2 Running 4 3h19m security.istio.io/tlsMode=istio
ubu2 2/2 Running 2 10m app=ubuntu,security.istio.io/tlsMode=istio
Results from the Ubuntu pods.
Ubuntu with the label:
curl nginx/
Hello nginx1
Ubuntu without the label:
curl nginx/
Hello nginx2
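If you want to double-check what each sidecar actually received, istioctl can dump the routes for port 80 (a sketch, assuming istioctl is installed; the pod names are the ones from the listing above):

istioctl proxy-config routes ubu2 --name 80 -o json | grep cluster
istioctl proxy-config routes ubu1 --name 80 -o json | grep cluster

On the labeled pod (ubu2) the match-myuid route pointing at outbound|80|v1|nginx.default.svc.cluster.local should be present, while on ubu1 that route is dropped and only the default route to outbound|80|v2|nginx.default.svc.cluster.local remains, since Pilot evaluates sourceLabels per workload when generating each sidecar's configuration.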
Let me know if that helps.