Django/Google Kubernetes Intermittent 111: Connection refused to upstream services

I've done quite a bit of searching and cannot seem to find anyone who shows a resolution to this problem.
I'm getting intermittent 111 Connection refused errors on my Kubernetes clusters. About 90% of my requests succeed and the other 10% fail; if you refresh the page, a previously failed request will then succeed. I have two different Kubernetes clusters with the exact same setup, and both show the errors.
This looks to be very close to what I am experiencing. I did install my setup onto a new cluster, but the same problem persisted:
Kubernetes ClusterIP intermittent 502 connection refused
Setup
Kubernetes Cluster Version: 1.18.12-gke.1206
Django Version: 3.1.4
Helm to manage kubernetes charts
Cluster Setup
Kubernetes nginx ingress controller that serves web traffic into the cluster:
https://kubernetes.github.io/ingress-nginx/deploy/#gce-gke
From there, I have 2 Ingresses defined that route traffic based on the referrer URL.
Stage Ingress
Prod Ingress
Ingress
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: potr-tms-ingress-{{ .Values.environment }}
  namespace: {{ .Values.environment }}
  labels:
    app: potr-tms-{{ .Values.environment }}
  annotations:
    kubernetes.io/ingress.class: "nginx"
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    nginx.ingress.kubernetes.io/from-to-www-redirect: "true"
    # this line below doesn't seem to have an effect
    # nginx.ingress.kubernetes.io/service-upstream: "true"
    nginx.ingress.kubernetes.io/proxy-body-size: "100M"
    cert-manager.io/cluster-issuer: "letsencrypt-{{ .Values.environment }}"
spec:
  rules:
    - host: {{ .Values.ingress_host }}
      http:
        paths:
          - path: /
            backend:
              serviceName: potr-tms-service-{{ .Values.environment }}
              servicePort: 8000
  tls:
    - hosts:
        - {{ .Values.ingress_host }}
        - www.{{ .Values.ingress_host }}
      secretName: potr-tms-{{ .Values.environment }}-tls
These ingresses route to 2 services that I have defined for prod and stage:
Service
apiVersion: v1
kind: Service
metadata:
  name: potr-tms-service-{{ .Values.environment }}
  namespace: {{ .Values.environment }}
  labels:
    app: potr-tms-{{ .Values.environment }}
spec:
  type: ClusterIP
  ports:
    - name: potr-tms-service-{{ .Values.environment }}
      port: 8000
      protocol: TCP
      targetPort: 8000
  selector:
    app: potr-tms-{{ .Values.environment }}
These 2 services route to deployments that I have for both prod and stage:
Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: potr-tms-deployment-{{ .Values.environment }}
  namespace: {{ .Values.environment }}
  labels:
    app: potr-tms-{{ .Values.environment }}
spec:
  replicas: {{ .Values.deployment_replicas }}
  selector:
    matchLabels:
      app: potr-tms-{{ .Values.environment }}
  strategy:
    type: RollingUpdate
  template:
    metadata:
      annotations:
        rollme: {{ randAlphaNum 5 | quote }}
      labels:
        app: potr-tms-{{ .Values.environment }}
    spec:
      containers:
        - command: ["gunicorn", "--bind", ":8000", "config.wsgi"]
          # - command: ["python", "manage.py", "runserver", "0.0.0.0:8000"]
          envFrom:
            - secretRef:
                name: potr-tms-secrets-{{ .Values.environment }}
          image: gcr.io/potrtms/potr-tms-{{ .Values.environment }}:latest
          name: potr-tms-{{ .Values.environment }}
          ports:
            - containerPort: 8000
          resources:
            requests:
              cpu: 200m
              memory: 512Mi
      restartPolicy: Always
      serviceAccountName: "potr-tms-service-account-{{ .Values.environment }}"
status: {}
Error
This is the error that I'm seeing inside of my ingress controller logs:
This seems pretty clear: if my deployment pods were failing or showing errors, they would be "unavailable" and the service would not be able to route requests to them. To try and debug this, I increased my deployment resources and replica counts. The amount of web traffic to this app is pretty low, though: ~10 users.
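Commands along these lines can confirm the service actually has ready endpoints behind it (a sketch; assumes the prod environment/namespace from the templates above):

kubectl get endpoints potr-tms-service-prod -n prod
kubectl describe service potr-tms-service-prod -n prod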
What I've Tried
I tried using a completely different ingress controller: https://github.com/kubernetes/ingress-nginx
Increasing deployment resources / replica counts (seems to have no effect)
Installing my whole setup on a brand new cluster (same results)
Restarting the ingress controller / deleting and reinstalling it
It sounded like this could potentially be a Gunicorn problem. To test that, I tried starting my pods with python manage.py runserver; the problem remained.
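One more thing on my list is a readinessProbe, so traffic only reaches a pod once Gunicorn is actually listening. A minimal sketch of what that would look like in the container spec above (the probe is an assumption on my part, not something I have verified fixes this):

# hypothetical addition to the potr-tms container spec;
# port 8000 matches the gunicorn --bind above
readinessProbe:
  tcpSocket:
    port: 8000
  initialDelaySeconds: 5
  periodSeconds: 10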
Update
Raising the pod counts seems to have helped a little bit.
deployment replicas: 15
cpu request: 200m
memory request: 512Mi
Some requests still fail, though.

Did you find a solution to this? I am seeing something very similar on a minikube setup.
In my case, I believe I also see the nginx controller restarting after the 502. The 502 is intermittent: frequently the first access fails, then a reload works.
The best idea I've found so far is to increase the Nginx timeout parameter, but I have not tried that yet; I'm still trying to search out all the options.
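For reference, the standard ingress-nginx timeout annotations I was planning to try (the values here are just example numbers of seconds, not tested on my setup):

nginx.ingress.kubernetes.io/proxy-connect-timeout: "30"
nginx.ingress.kubernetes.io/proxy-read-timeout: "120"
nginx.ingress.kubernetes.io/proxy-send-timeout: "120"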

I was not able to figure out why these connection errors happen, but I did find a workaround that seems to solve the problem for our users.
Inside your ingress config, add the annotation:
nginx.ingress.kubernetes.io/proxy-next-upstream-tries: "10"
I set it to 10 just to make sure it retried, as I was fairly confident our services were working. You could probably get away with 2 or 3.
Here's my full ingress.yaml:
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: potr-tms-ingress-{{ .Values.environment }}
  namespace: {{ .Values.environment }}
  labels:
    app: potr-tms-{{ .Values.environment }}
  annotations:
    kubernetes.io/ingress.class: "nginx"
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    nginx.ingress.kubernetes.io/from-to-www-redirect: "true"
    # nginx.ingress.kubernetes.io/service-upstream: "true"
    nginx.ingress.kubernetes.io/proxy-body-size: "100M"
    nginx.ingress.kubernetes.io/client-body-buffer-size: "100m"
    nginx.ingress.kubernetes.io/proxy-max-temp-file-size: "1024m"
    nginx.ingress.kubernetes.io/proxy-next-upstream-tries: "10"
    cert-manager.io/cluster-issuer: "letsencrypt-{{ .Values.environment }}"
spec:
  rules:
    - host: {{ .Values.ingress_host }}
      http:
        paths:
          - path: /
            backend:
              serviceName: potr-tms-service-{{ .Values.environment }}
              servicePort: 8000
  tls:
    - hosts:
        - {{ .Values.ingress_host }}
        - www.{{ .Values.ingress_host }}
      secretName: potr-tms-{{ .Values.environment }}-tls
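To confirm the annotation actually made it into the rendered nginx config, you can check inside the controller pod (a sketch; the pod name and namespace will differ per install):

kubectl -n ingress-nginx get pods -l app.kubernetes.io/name=ingress-nginx
kubectl -n ingress-nginx exec <controller-pod> -- grep proxy_next_upstream /etc/nginx/nginx.conf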

Related

argocd login redirect loop for SSO

I'm using argocd with JumpCloud for SSO. For some reason, it periodically gets into a weird state where clicking on the login button redirects back to the login page, to a "not found" error, or into an infinite redirect loop. It's intermittent, and can sometimes (but not always) be fixed by deleting cookies, clearing the cache, and/or restarting or deleting the argocd-server pods.
charts/argocd/values.yaml
environment: ""
argo-cd:
redis-ha:
enabled: true
controller:
replicas: 1
metrics:
enabled: true
serviceMonitor:
enabled: true
namespace: kube-system
additionalLabels:
release: kube-prometheus-stack
nodeSelector:
eks.amazonaws.com/capacityType: ON_DEMAND
args:
appResyncPeriod: 20
resources:
requests:
cpu: 1000m
memory: 768Mi
server:
metrics:
enabled: true
serviceMonitor:
enabled: true
namespace: kube-system
additionalLabels:
release: kube-prometheus-stack
nodeSelector:
eks.amazonaws.com/capacityType: ON_DEMAND
resources:
requests:
cpu: 50m
memory: 128Mi
rbacConfigCreate: false
config:
dex.config: |
connectors:
- type: ldap
name: jumpcloud.com
id: ad
config:
# Ldap server address
host: ldap.jumpcloud.com:636
insecureNoSSL: false
insecureSkipVerify: true
# Variable name stores ldap bindDN in argocd-secret
bindDN: "uid=dexservice,ou=Users,o=myorganizationid,dc=jumpcloud,dc=com"
# Variable name stores ldap bind password in argocd-secret
bindPW: "mysecurepassword"
usernamePrompt: Jumpcloud Username
userSearch:
baseDN: ou=Users,o=myorganizationid,dc=jumpcloud,dc=com
username: uid
idAttr: entryDN
emailAttr: mail
nameAttr: cn
groupSearch:
baseDN: ou=Users,o=myorganizationid,dc=jumpcloud,dc=com
filter: "(|(cn=K8S-Admin)(cn=K8S-ReadOnly))"
userMatchers:
- userAttr: entryDN
groupAttr: member
nameAttr: cn
repositories: |
- url: git#github.com:myorgname/myreponame
sshPrivateKeySecret:
name: ssh-private-key-secret
key: id_ed25519
service:
type: NodePort
dex:
nodeSelector:
eks.amazonaws.com/capacityType: ON_DEMAND
redis:
nodeSelector:
eks.amazonaws.com/capacityType: ON_DEMAND
repoServer:
replicas: 2
nodeSelector:
eks.amazonaws.com/capacityType: ON_DEMAND
applicationSet:
replicaCount: 2
configs:
params:
server.insecure: true
charts/argocd/templates/argocd-rbac-cm.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-rbac-cm
  labels:
    helm.sh/chart: {{ (index .Chart.Dependencies 0).Version }}
    app.kubernetes.io/component: server
    app.kubernetes.io/instance: argocd
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: argocd-rbac-cm
    app.kubernetes.io/part-of: argocd
    argocd.argoproj.io/instance: argocd
data:
  policy.default: role:none
  scopes: '[groups, email]'
  policy.csv: |
    p, role:none, *, *, */*, deny
    g, K8S-Admin, role:admin
    g, K8S-ReadOnly, "role:admin"
charts/argocd/templates/ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: {{ .Chart.Name }}
  annotations:
    "kubernetes.io/ingress.class": "alb"
    alb.ingress.kubernetes.io/group.name: {{ .Values.environment }}
    "alb.ingress.kubernetes.io/scheme": "internal"
    "alb.ingress.kubernetes.io/target-type": "instance"
    "alb.ingress.kubernetes.io/listen-ports": '[{"HTTP":80},{"HTTPS":443}]'
    "alb.ingress.kubernetes.io/backend-protocol": HTTP
    "alb.ingress.kubernetes.io/healthcheck-path": "/"
    "alb.ingress.kubernetes.io/actions.ssl-redirect": '{"Type": "redirect", "RedirectConfig": { "Protocol": "HTTPS", "Port": "443", "StatusCode": "HTTP_301"}}'
    "external-dns.alpha.kubernetes.io/hostname": argocd.{{ .Values.hostName }}
spec:
  rules:
    - host: argocd.{{ .Values.hostName }}
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: ssl-redirect
                port:
                  name: use-annotation
          - path: /
            pathType: Prefix
            backend:
              service:
                name: argocd-server
                port:
                  number: 80
We're currently experiencing similar (weird) issues with our Argo CD stack, which uses Dex as an LDAP connector. The Dex logs show successful authentication, yet when returning to Argo CD, it intermittently seems to switch back to an OIDC-based login method and denies access with varying error messages completely unrelated to LDAP.
Recent communication on this topic in Argo's Slack channel showed that this is not a known issue, but that it might have to do with running Argo CD in HA mode, i.e. using two or more argocd-server instances, which was exactly our setup.
For the time being, I've scaled argocd-server down to a single instance. So far, the issue hasn't resurfaced.
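For reference, the scale-down itself was just (assuming the chart's default namespace and deployment name):

kubectl -n argocd scale deployment argocd-server --replicas=1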
This might not be an exact answer, but rather (hopefully) a helpful hint as to where to look for a possible solution.

Service can't be registered in Target groups

I'm new to using Kubernetes and AWS, so there are a lot of concepts I may not understand yet. I hope you can help me with this problem I am having.
I have 3 services, frontend, backend and auth, each with its corresponding NodePort service, and an ingress that maps one host to each service. Everything runs on EKS, and for the ingress deployment I am using the AWS ingress controller. Once everything is deployed, the node group is registered in the target groups; the frontend and auth services work correctly, but backend stays in an unhealthy state. I thought it could be a port problem, but if you look at auth and backend, they have almost the same deployment defined, and both are APIs created with .NET Core. One thing to note is that I can do kubectl port-forward <backend-pod> 80:80 and it runs without problems. And when I run the kubectl describe ingresses command I get this:
Name:             ingress
Labels:           app.kubernetes.io/managed-by=Helm
Namespace:        default
Address:          xxxxxxxxxxxxxxxxxxxxxxxxxxx.xxxxx.elb.amazonaws.com
Ingress Class:    <none>
Default backend:  <default>
Rules:
  Host             Path  Backends
  ----             ----  --------
  domain.com
                   /     front-service:default-port (10.0.1.183:80,10.0.2.98:80)
  back.domain.com
                   /     backend-service:default-port (<none>)
  auth.domain.com
                   /     auth-service:default-port (10.0.1.30:80,10.0.1.33:80)
Annotations:       alb.ingress.kubernetes.io/listen-ports: [{"HTTPS":443}, {"HTTP":80}]
                   alb.ingress.kubernetes.io/scheme: internet-facing
                   alb.ingress.kubernetes.io/ssl-redirect: 443
                   kubernetes.io/ingress.class: alb
Events:
  Type    Reason                  Age                   From     Message
  ----    ------                  ---                   ----     -------
  Normal  SuccessfullyReconciled  8m20s (x15 over 41h)  ingress  Successfully reconciled
Frontend
apiVersion: apps/v1
kind: Deployment
metadata:
  name: front
  labels:
    name: front
spec:
  replicas: 2
  selector:
    matchLabels:
      name: front
  template:
    metadata:
      labels:
        name: front
    spec:
      containers:
        - name: frontend
          image: {{ .Values.image }}
          imagePullPolicy: Always
          ports:
            - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: wrfront-{{ .Values.namespace }}-service
spec:
  type: NodePort
  ports:
    - port: 80
      targetPort: 80
      name: default-port
      protocol: TCP
  selector:
    name: front
---
Auth
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: pvc-wrauth-keys
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 200Gi
---
apiVersion: "v1"
kind: "ConfigMap"
metadata:
  name: "auth-config-ocpm"
  labels:
    app: "auth"
data:
  ASPNETCORE_URL: "http://+:80"
  ASPNETCORE_ENVIRONMENT: "Development"
  ASPNETCORE_LOGGINGCONSOLEDISABLECOLORS: "true"
---
apiVersion: "apps/v1"
kind: "Deployment"
metadata:
  name: "auth"
  labels:
    app: "auth"
spec:
  replicas: 2
  strategy:
    type: Recreate
  selector:
    matchLabels:
      app: "auth"
  template:
    metadata:
      labels:
        app: "auth"
    spec:
      volumes:
        - name: auth-keys-storage
          persistentVolumeClaim:
            claimName: pvc-wrauth-keys
      containers:
        - name: "api-auth"
          image: {{ .Values.image }}
          imagePullPolicy: Always
          ports:
            - containerPort: 80
          volumeMounts:
            - name: auth-keys-storage
              mountPath: "/app/auth-keys"
          env:
            - name: "ASPNETCORE_URL"
              valueFrom:
                configMapKeyRef:
                  key: "ASPNETCORE_URL"
                  name: "auth-config-ocpm"
            - name: "ASPNETCORE_ENVIRONMENT"
              valueFrom:
                configMapKeyRef:
                  key: "ASPNETCORE_ENVIRONMENT"
                  name: "auth-config-ocpm"
            - name: "ASPNETCORE_LOGGINGCONSOLEDISABLECOLORS"
              valueFrom:
                configMapKeyRef:
                  key: "ASPNETCORE_LOGGINGCONSOLEDISABLECOLORS"
                  name: "auth-config-ocpm"
---
apiVersion: v1
kind: Service
metadata:
  name: auth-service
spec:
  type: NodePort
  selector:
    app: auth
  ports:
    - name: default-port
      protocol: TCP
      port: 80
      targetPort: 80
Backend (Service with problem)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: backend
  labels:
    app: backend
spec:
  replicas: 2
  selector:
    matchLabels:
      app: backend
  template:
    metadata:
      labels:
        app: backend
    spec:
      containers:
        - name: backend
          image: {{ .Values.image }}
          imagePullPolicy: Always
          ports:
            - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: backend-service
spec:
  type: NodePort
  selector:
    name: backend
  ports:
    - name: default-port
      protocol: TCP
      port: 80
      targetPort: 80
Ingress
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: ingress
  annotations:
    alb.ingress.kubernetes.io/scheme: internet-facing
    kubernetes.io/ingress.class: alb
    # SSL Settings
    alb.ingress.kubernetes.io/listen-ports: '[{"HTTPS":443}, {"HTTP":80}]'
    alb.ingress.kubernetes.io/ssl-redirect: '443'
    alb.ingress.kubernetes.io/certificate-arn: {{ .Values.certificate }}
spec:
  rules:
    - host: {{ .Values.host }}
      http:
        paths:
          - path: /
            backend:
              service:
                name: front-service
                port:
                  name: default-port
            pathType: Prefix
    - host: back.{{ .Values.host }}
      http:
        paths:
          - path: /
            backend:
              service:
                name: backend-service
                port:
                  name: default-port
            pathType: Prefix
    - host: auth.{{ .Values.host }}
      http:
        paths:
          - path: /
            backend:
              service:
                name: auth-service
                port:
                  name: default-port
            pathType: Prefix
I've tried deploying other services, and they work correctly. I've also tried running only the backend, or only another service, but the same thing always happens, and always with the backend.
What could be happening? Is it a configuration problem? Some error in the Ingress or Deployment? Or is it just the backend service?
I would be very grateful for any help.
domain.com
                 /     front-service:default-port (10.0.1.183:80,10.0.2.98:80)
back.domain.com
                 /     backend-service:default-port (<none>)
auth.domain.com
                 /     auth-service:default-port (10.0.1.30:80,10.0.1.33:80)
This is saying that your backend service is not registered to the Ingress.
One thing to remember is that the Ingress registers Services by the pods' ClusterIP, like the "10.0.1.30:80" in your Ingress output, not by NodePort. And according to the docs, I don't know why you can have multiple NodePort services with the same port. But when you do port-forward, you actually open that port on all your instances (I assume you have 2 instances), and then your ALB health-checks that port and returns healthy.
But I think your issue is that your Ingress cannot locate your backend service.
My suggestions are:
Try with only the backend-service, with its port changed, and maybe without the auth and frontend services. The default NodePort range is 30000-32767.
Go inside that pod, or create a new pod, and make a request to the service using its URL to check what it returns. By default, the ALB only accepts status 200 from its homepage.
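One more thing worth checking, based on the manifests above: the backend Service selector is name: backend, while the backend pods are labeled app: backend. A Service whose selector matches no pod labels gets no endpoints, which would produce exactly the (<none>) you see. A quick way to verify (a sketch):

# endpoints will be <none> if the selector matches no pods
kubectl get endpoints backend-service

# compare the Service selector with the actual pod labels
kubectl get svc backend-service -o jsonpath='{.spec.selector}'
kubectl get pods --show-labels | grep backend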

GCP GKE Ingress Health Checks

I have a deployment and service running in GKE, created with Deployment Manager. Everything about my service works correctly, except that the ingress I am creating reports the service in a perpetually unhealthy state.
To be clear, everything about the deployment works except the health check (and, as a consequence, the ingress). This was working previously (circa late 2019); apparently about a year ago GKE added some additional requirements for health checks on ingress target services, and I have been unable to make sense of them.
I have put an explicit health check on the service, and it reports healthy, but the ingress does not recognize it. The service is using a NodePort, and the deployment also has containerPort 80 open; it does respond with HTTP 200 to requests on :80 locally, but clearly that is not helping the deployed service.
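From what I can piece together from the GKE docs, the ingress-facing health check can also be customized with a BackendConfig attached to the Service; a sketch of my understanding (the resource name here is mine, and I have not gotten this working yet):

apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: rester-backendconfig  # hypothetical name
spec:
  healthCheck:
    type: HTTP
    requestPath: /healthz
    port: 80

# referenced from the Service via the annotation:
#   cloud.google.com/backend-config: '{"default": "rester-backendconfig"}'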
The cluster itself is a nearly identical copy of the Deployment Manager example.
Here is the deployment:
- name: {{ DEPLOYMENT }}
  type: {{ CLUSTER_TYPE }}:{{ DEPLOYMENT_COLLECTION }}
  metadata:
    dependsOn:
      - {{ properties['clusterType'] }}
  properties:
    apiVersion: apps/v1
    kind: Deployment
    namespace: {{ properties['namespace'] | default('default') }}
    metadata:
      name: {{ DEPLOYMENT }}
      labels:
        app: {{ APP }}
        tier: resters
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: {{ APP }}
          tier: resters
      template:
        metadata:
          labels:
            app: {{ APP }}
            tier: resters
        spec:
          containers:
            - name: rester
              image: {{ IMAGE }}
              resources:
                requests:
                  cpu: 100m
                  memory: 250Mi
              ports:
                - containerPort: 80
              env:
                - name: GCP_PROJECT
                  value: {{ PROJECT }}
                - name: SERVICE_NAME
                  value: {{ APP }}
                - name: MODE
                  value: rest
                - name: REDIS_ADDR
                  value: {{ properties['memorystoreAddr'] }}
... the service:
- name: {{ SERVICE }}
  type: {{ CLUSTER_TYPE }}:{{ SERVICE_COLLECTION }}
  metadata:
    dependsOn:
      - {{ properties['clusterType'] }}
      - {{ APP }}-cluster-nodeport-firewall-rule
      - {{ DEPLOYMENT }}
  properties:
    apiVersion: v1
    kind: Service
    namespace: {{ properties['namespace'] | default('default') }}
    metadata:
      name: {{ SERVICE }}
      labels:
        app: {{ APP }}
        tier: resters
    spec:
      type: NodePort
      ports:
        - nodePort: {{ NODE_PORT }}
          port: {{ CONTAINER_PORT }}
          targetPort: {{ CONTAINER_PORT }}
          protocol: TCP
      selector:
        app: {{ APP }}
        tier: resters
... the explicit healthcheck:
- name: {{ SERVICE }}-healthcheck
  type: compute.v1.healthCheck
  metadata:
    dependsOn:
      - {{ SERVICE }}
  properties:
    name: {{ SERVICE }}-healthcheck
    type: HTTP
    httpHealthCheck:
      port: {{ NODE_PORT }}
      requestPath: /healthz
      proxyHeader: NONE
    checkIntervalSec: 10
    healthyThreshold: 2
    unhealthyThreshold: 3
    timeoutSec: 5
... the firewall rules:
- name: {{ CLUSTER_NAME }}-nodeport-firewall-rule
  type: compute.v1.firewall
  properties:
    name: {{ CLUSTER_NAME }}-nodeport-firewall-rule
    network: projects/{{ PROJECT }}/global/networks/default
    sourceRanges:
      - 130.211.0.0/22
      - 35.191.0.0/16
    targetTags:
      - {{ CLUSTER_NAME }}-node
    allowed:
      - IPProtocol: TCP
        ports:
          - 30000-32767
          - 80
You could try to define a readinessProbe on your container in your Deployment.
This is also a metric that the ingress uses to create its health checks (note that these health check probes come from outside of GKE).
In my experience, these readiness probes work pretty well to get the ingress health checks working.
To do this, you create something like the following. This is a TCP probe; I have seen better performance with TCP probes.
readinessProbe:
  tcpSocket:
    port: 80
  initialDelaySeconds: 10
  periodSeconds: 10
So this probe will check port 80, which is the one I see used by the pod in this service, and it will also help configure the ingress health check for a better result.
Here is some helpful documentation on how to create the TCP readiness probes on which the ingress health check can be based.
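In the Deployment Manager template above, the probe would sit under the rester container spec, something like this (a sketch; only the readinessProbe lines are new):

containers:
  - name: rester
    image: {{ IMAGE }}
    ports:
      - containerPort: 80
    readinessProbe:
      tcpSocket:
        port: 80
      initialDelaySeconds: 10
      periodSeconds: 10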

istio - using VirtualService and Gateway instead of LoadBalancer not working

I have the following application, which I'm able to run in K8S successfully using a service of type LoadBalancer. It's a very simple app with two routes:
/ - you should see 'hello application'
/api/books - should provide a list of books in JSON format
This is the service
apiVersion: v1
kind: Service
metadata:
  name: go-ms
  labels:
    app: go-ms
    tier: service
spec:
  type: LoadBalancer
  ports:
    - port: 8080
  selector:
    app: go-ms
This is the deployment
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: go-ms
  labels:
    app: go-ms
spec:
  replicas: 2
  template:
    metadata:
      labels:
        app: go-ms
        tier: service
    spec:
      containers:
        - name: go-ms
          image: rayndockder/http:0.0.2
          ports:
            - containerPort: 8080
          env:
            - name: PORT
              value: "8080"
          resources:
            requests:
              memory: "64Mi"
              cpu: "125m"
            limits:
              memory: "128Mi"
              cpu: "250m"
After applying both YAMLs and calling the URL:
http://b0751-1302075110.eu-central-1.elb.amazonaws.com/api/books
I was able to see the data in the browser as expected, and also the root app using just the external IP.
Now I want to use Istio, so I followed the guide and installed it successfully via Helm
using https://istio.io/docs/setup/kubernetes/install/helm/, and verified that all 53 CRDs are there and that the istio-system
components (such as istio-ingressgateway,
istio-pilot, etc.; all 8 deployments) are up and running.
I've changed the service above from LoadBalancer to NodePort
and created the following Istio config according to the Istio docs:
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: http-gateway
spec:
  selector:
    istio: ingressgateway
  servers:
    - port:
        number: 8080
        name: http
        protocol: HTTP
      hosts:
        - "*"
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: virtualservice
spec:
  hosts:
    - "*"
  gateways:
    - http-gateway
  http:
    - match:
        - uri:
            prefix: "/"
        - uri:
            exact: "/api/books"
      route:
        - destination:
            port:
              number: 8080
            host: go-ms
In addition, I've run
kubectl label namespace books istio-injection=enabled
in the namespace where the application is deployed.
Now, to get the external IP I used the command
kubectl get svc -n istio-system -l istio=ingressgateway
and got this as the external IP:
b0751-1302075110.eu-central-1.elb.amazonaws.com
When trying to access the URL
http://b0751-1302075110.eu-central-1.elb.amazonaws.com/api/books
I got the error:
This site can't be reached
ERR_CONNECTION_TIMED_OUT
If I run the Docker image rayndockder/http:0.0.2 via
docker run -it -p 8080:8080 httpv2
the paths work correctly!
Any idea/hint what the issue could be?
Is there a way to trace the Istio configs to see whether something is missing, or whether we have some collision with a port or network policy?
By the way, the deployment and service can run on any cluster for testing, if someone could help...
If I change everything to port 80 (in all the YAML files, the application, and the Docker image), I am able to get the data for the root path, but not for /api/books.
I tried your config, with the modification of the gateway port from 8080 to 80, in my local minikube setup of Kubernetes and Istio. This is the command I used:
kubectl apply -f - <<EOF
apiVersion: v1
kind: Service
metadata:
  name: go-ms
  labels:
    app: go-ms
    tier: service
spec:
  ports:
    - port: 8080
  selector:
    app: go-ms
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: go-ms
  labels:
    app: go-ms
spec:
  replicas: 2
  template:
    metadata:
      labels:
        app: go-ms
        tier: service
    spec:
      containers:
        - name: go-ms
          image: rayndockder/http:0.0.2
          ports:
            - containerPort: 8080
          env:
            - name: PORT
              value: "8080"
          resources:
            requests:
              memory: "64Mi"
              cpu: "125m"
            limits:
              memory: "128Mi"
              cpu: "250m"
---
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: http-gateway
spec:
  selector:
    istio: ingressgateway
  servers:
    - port:
        number: 80
        name: http
        protocol: HTTP
      hosts:
        - "*"
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: go-ms-virtualservice
spec:
  hosts:
    - "*"
  gateways:
    - http-gateway
  http:
    - match:
        - uri:
            prefix: /
        - uri:
            exact: /api/books
      route:
        - destination:
            port:
              number: 8080
            host: go-ms
EOF
The reason I changed the gateway port to 80 is that the Istio ingress gateway by default opens up a few ports, such as 80, 443 and a few others. In my case, as minikube doesn't have an external load balancer, I used the node port, which is 31380 in my case.
I was able to access the app at http://$(minikube ip):31380.
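To find that node port on your own cluster, something like this should work (a sketch):

kubectl -n istio-system get svc istio-ingressgateway \
  -o jsonpath='{.spec.ports[?(@.port==80)].nodePort}'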
There is no point in changing the ports of the services and deployments, since those are application specific.
Maybe this question specifies the ports opened by the Istio ingress gateway.

Can't get access to service

I have the following problem: a Cloud SQL proxy pod, wrapped in a service, must provide access to the database.
And I have a job which must create a new database for every branch.
But when this job runs, the error below appears: I can't get access to the cloudsql-proxy-service.
I can't understand why this happens. Thanks.
E psql: could not connect to server: Connection timed out
E       Is the server running on host "cloudsql-proxy-service" (10.43.254.123) and accepting
E       TCP/IP connections on port 5432?
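One way to check whether the service actually resolves and has a ready pod behind it at the time the job runs (a sketch):

kubectl get endpoints cloudsql-proxy-service
kubectl get pods -l name=cloudsql-proxy

Here are the manifests: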
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cloudsql-proxy
  labels:
    type: backend
    name: app
  annotations:
    "helm.sh/created": {{ .Release.Time.Seconds | quote }}
    "helm.sh/hook": pre-install
    "helm.sh/hook-weight": "-20"
spec:
  replicas: 1
  selector:
    matchLabels:
      name: cloudsql-proxy
  template:
    metadata:
      labels:
        name: cloudsql-proxy
    spec:
      containers:
        - name: cloudsql-proxy
          image: gcr.io/cloudsql-docker/gce-proxy:1.11
          command:
            - "/cloud_sql_proxy"
            - "-instances={{ .Values.testDatabaseInstanceConnectionName }}=tcp:5432"
            - "-credential_file=/secrets/cloudsql/credentials.json"
          securityContext:
            runAsUser: 2 # non-root user
            allowPrivilegeEscalation: false
          volumeMounts:
            - name: cloudsql-instance-credentials
              mountPath: /secrets/cloudsql
              readOnly: true
          ports:
            - containerPort: 5432
      volumes:
        - name: cloudsql-instance-credentials
          secret:
            secretName: {{ .Values.cloudSqlProxySecretName }}
---
apiVersion: v1
kind: Service
metadata:
  name: cloudsql-proxy-service
  labels:
    type: backend
    name: app
  annotations:
    "helm.sh/created": {{ .Release.Time.Seconds | quote }}
    "helm.sh/hook": pre-install
    "helm.sh/hook-weight": "-20"
spec:
  selector:
    name: cloudsql-proxy
  ports:
    - port: 5432
apiVersion: batch/v1
kind: Job
metadata:
  name: create-test-database
  labels:
    type: backend
    name: app
  annotations:
    "helm.sh/created": {{ .Release.Time.Seconds | quote }}
    "helm.sh/hook": pre-install
    "helm.sh/hook-weight": "-10"
spec:
  template:
    metadata:
      name: create-test-database
    spec:
      containers:
        - name: postgres-client
          image: kalumkalac/postgresql-client
          env:
            - name: PGUSER
              value: {{ .Values.testDatabaseCredentials.username }}
            - name: PGPASSWORD
              value: {{ .Values.testDatabaseCredentials.password }}
            - name: PGDATABASE
              value: {{ .Values.testDatabaseCredentials.defaultDatabaseName }}
            - name: PGHOST
              value: cloudsql-proxy-service
          command:
            - psql
            - -q
            - -c CREATE DATABASE {{ .Values.testDatabaseCredentials.name|quote }}
      restartPolicy: Never
  backoffLimit: 0 # Deny retry job