NGINX Ingress times out with large file downloads - amazon-web-services

We are running a Spring Boot app in a k8s pod hosted behind an NGINX ingress with an EC2 load balancer. Our app occasionally needs to send a very large file (10-20 GB). We have observed that this operation occasionally times out when querying through the ingress, but does not time out when the app is queried directly. To reproduce this more easily, we created a simple endpoint that returns a file of arbitrary size (/files/SIZE). That is what you can see below.
When a request times out, the ingress controller does not seem to log anything. From the HTTP client, here is what we are given when the request times out:
{ [3744 bytes data]
100 16.4G 0 16.4G 0 0 22.7M 0 --:--:-- 0:12:23 --:--:-- 23.9M* TLSv1.2 (IN), TLS alert, close notify (256):
{ [2 bytes data]
100 16.5G 0 16.5G 0 0 22.7M 0 --:--:-- 0:12:23 --:--:-- 23.6M
* Connection #0 to host INGRESS_URL left intact
* Closing connection 0
curl INGRESS_URL/files/21474836480 -v 31.47s user 26.92s system 7% cpu 12:23.81 total
Here is the configuration of our ingress:
kind: Ingress
apiVersion: extensions/v1beta1
metadata:
  name: USER
  namespace: NAMESPACE
  selfLink: /apis/extensions/v1beta1/namespaces/NAMESPACE/ingresses/USER
  uid: d84f3ab2-7f2c-42c1-a44f-c6a7d432f03e
  resourceVersion: '658287365'
  generation: 1
  creationTimestamp: '2021-06-29T13:21:45Z'
  labels:
    app.kubernetes.io/instance: USER
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: APP
    helm.sh/chart: CHART
  annotations:
    kubernetes.io/ingress.class: nginx-l4-ext
    meta.helm.sh/release-name: USER
    meta.helm.sh/release-namespace: NAMESPACE
    nginx.ingress.kubernetes.io/client-max-body-size: '0'
    nginx.ingress.kubernetes.io/proxy-body-size: '0'
    nginx.ingress.kubernetes.io/proxy-buffering: 'off'
    nginx.ingress.kubernetes.io/proxy-max-temp-file-size: '0'
    nginx.ingress.kubernetes.io/proxy-read-timeout: '1800'
    nginx.ingress.kubernetes.io/proxy-send-timeout: '1800'
    nginx.ingress.kubernetes.io/websocket-services: core-service
    nginx.org/websocket-services: core-service
  managedFields:
    - manager: Go-http-client
      operation: Update
      apiVersion: networking.k8s.io/v1beta1
      time: '2021-06-29T13:21:45Z'
      fieldsType: FieldsV1
      fieldsV1:
        'f:metadata':
          'f:annotations':
            .: {}
            'f:kubernetes.io/ingress.class': {}
            'f:meta.helm.sh/release-name': {}
            'f:meta.helm.sh/release-namespace': {}
            'f:nginx.ingress.kubernetes.io/client-max-body-size': {}
            'f:nginx.ingress.kubernetes.io/proxy-body-size': {}
            'f:nginx.ingress.kubernetes.io/proxy-buffering': {}
            'f:nginx.ingress.kubernetes.io/proxy-max-temp-file-size': {}
            'f:nginx.ingress.kubernetes.io/proxy-read-timeout': {}
            'f:nginx.ingress.kubernetes.io/proxy-send-timeout': {}
            'f:nginx.ingress.kubernetes.io/websocket-services': {}
            'f:nginx.org/websocket-services': {}
          'f:labels':
            .: {}
            'f:app.kubernetes.io/instance': {}
            'f:app.kubernetes.io/managed-by': {}
            'f:app.kubernetes.io/name': {}
            'f:helm.sh/chart': {}
        'f:spec':
          'f:rules': {}
    - manager: nginx-ingress-controller
      operation: Update
      apiVersion: networking.k8s.io/v1beta1
      time: '2021-06-29T13:21:59Z'
      fieldsType: FieldsV1
      fieldsV1:
        'f:status':
          'f:loadBalancer':
            'f:ingress': {}
spec:
  rules:
    - host: HOST_URL.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              serviceName: SERVICE_NAME
              servicePort: 9081
status:
  loadBalancer:
    ingress:
      - hostname: LOAD_BALANCER_URL
We are running ingress-nginx v0.46.0.
If anyone has any suggestions for why our large downloads are timing out, that would be great!
Testing Already Done:
Verified the params are actually appearing in the generated nginx.conf.
Tried changing client-body-timeout - this had no effect.
Recreated the whole environment on my local minikube instance. The application works there. Is it possible this is an Amazon ELB issue? (One way to test that is sketched after this list.)
Changing spring.mvc.async.request-timeout does not fix the issue.
The issue only occurs when making HTTPS calls. HTTP calls run totally fine.
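To test the ELB theory from the point above, one knob worth checking is the load balancer's idle timeout. For a classic ELB created by the in-tree AWS cloud provider, it can be raised through an annotation on the ingress controller's Service; this is only a sketch, with placeholder names, ports, and selector rather than our actual manifest:
apiVersion: v1
kind: Service
metadata:
  name: ingress-nginx-controller   # placeholder: the Service that fronts the NGINX controller
  namespace: ingress-nginx         # placeholder namespace
  annotations:
    # default is 60 seconds; only honoured for classic ELBs managed by the AWS cloud provider
    service.beta.kubernetes.io/aws-load-balancer-connection-idle-timeout: "3600"
spec:
  type: LoadBalancer
  selector:
    app.kubernetes.io/name: ingress-nginx   # placeholder selector
  ports:
    - name: https
      port: 443
      targetPort: 443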

I had a similar issue with one of my Spring Boot apps, and the issue was with the Spring Boot configuration in the application properties file (shown here in application.yml form).
spring:
  mvc:
    async:
      request-timeout: 3600000
Reference: https://stackoverflow.com/a/43496244/2777988

Related

Redis deployed in AWS - Connection time out from localhost SpringBoot app

A small question regarding Redis deployed in AWS (not AWS ElastiCache) and an issue connecting to it.
Here is the setup of the Redis deployed in AWS (pasting only the Kubernetes StatefulSet and Service):
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: redis
spec:
  serviceName: redis
  replicas: 3
  selector:
    matchLabels:
      app: redis
  template:
    metadata:
      labels:
        app: redis
    spec:
      initContainers:
        - name: config
          image: redis:7.0.5-alpine
          command: [ "sh", "-c" ]
          args:
            - |
              cp /tmp/redis/redis.conf /etc/redis/redis.conf
              echo "finding master..."
              MASTER_FDQN=`hostname -f | sed -e 's/redis-[0-9]\./redis-0./'`
              if [ "$(redis-cli -h sentinel -p 5000 ping)" != "PONG" ]; then
                echo "master not found, defaulting to redis-0"
                if [ "$(hostname)" = "redis-0" ]; then
                  echo "this is redis-0, not updating config..."
                else
                  echo "updating redis.conf..."
                  echo "slaveof $MASTER_FDQN 6379" >> /etc/redis/redis.conf
                fi
              else
                echo "sentinel found, finding master"
                MASTER="$(redis-cli -h sentinel -p 5000 sentinel get-master-addr-by-name mymaster | grep -E '(^redis-\d{1,})|([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})')"
                echo "master found : $MASTER, updating redis.conf"
                echo "slaveof $MASTER 6379" >> /etc/redis/redis.conf
              fi
          volumeMounts:
            - name: redis-config
              mountPath: /etc/redis/
            - name: config
              mountPath: /tmp/redis/
      containers:
        - name: redis
          image: redis:7.0.5-alpine
          command: ["redis-server"]
          args: ["/etc/redis/redis.conf"]
          ports:
            - containerPort: 6379
              name: redis
          volumeMounts:
            - name: data
              mountPath: /data
            - name: redis-config
              mountPath: /etc/redis/
      volumes:
        - name: redis-config
          emptyDir: {}
        - name: config
          configMap:
            name: redis-config
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: [ "ReadWriteOnce" ]
        storageClassName: nfs-1
        resources:
          requests:
            storage: 50Mi
---
apiVersion: v1
kind: Service
metadata:
  name: redis
spec:
  ports:
    - port: 6379
      targetPort: 6379
      name: redis
  selector:
    app: redis
  type: LoadBalancer
The pods are healthy; I can exec into them and perform operations fine. Here is the output of get all:
NAME READY STATUS RESTARTS AGE
pod/redis-0 1/1 Running 0 22h
pod/redis-1 1/1 Running 0 22h
pod/redis-2 1/1 Running 0 22h
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/redis LoadBalancer 192.168.45.55 10.51.5.2 6379:30315/TCP 26h
NAME READY AGE
statefulset.apps/redis 3/3 22h
Here is the describe of the service:
Name: redis
Namespace: Namespace
Labels: <none>
Annotations: <none>
Selector: app=redis
Type: LoadBalancer
IP Family Policy: SingleStack
IP Families: IPv4
IP: 192.168.22.33
IPs: 192.168.22.33
LoadBalancer Ingress: 10.51.5.2
Port: redis 6379/TCP
TargetPort: 6379/TCP
NodePort: redis 30315/TCP
Endpoints: 192.xxx:6379,192.xxx:6379,192.xxx:6379
Session Affinity: None
External Traffic Policy: Cluster
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal IPAllocated 68s metallb-controller Assigned IP ["10.51.5.2"]
Normal nodeAssigned 58s (x5 over 66s) metallb-speaker announcing from node "someaddress.com" with protocol "bgp"
Normal nodeAssigned 58s (x5 over 66s) metallb-speaker announcing from node "someaddress.com" with protocol "bgp"
I then try to connect to it, i.e. insert some data, with a very straightforward Spring Boot application. The application has no business logic; it just tries to insert data.
Here are the relevant parts:
@Configuration
public class RedisConfiguration {

    @Bean
    public ReactiveRedisConnectionFactory reactiveRedisConnectionFactory() {
        // LoadBalancer external IP combined with the NodePort
        return new LettuceConnectionFactory("10.51.5.2", 30315);
    }
}

@Repository
public class RedisRepository {

    private final ReactiveRedisOperations<String, String> reactiveRedisOperations;

    public RedisRepository(ReactiveRedisOperations<String, String> reactiveRedisOperations) {
        this.reactiveRedisOperations = reactiveRedisOperations;
    }

    public Mono<RedisPojo> save(RedisPojo redisPojo) {
        return reactiveRedisOperations.opsForValue().set(redisPojo.getInput(), redisPojo.getOutput()).map(__ -> redisPojo);
    }
}
Each time I try to write data, I get this exception:
2022-12-02T20:20:08.015+08:00 ERROR 1184 --- [ctor-http-nio-3] a.w.r.e.AbstractErrorWebExceptionHandler : [8f16a752-1] 500 Server Error for HTTP POST "/save"
org.springframework.data.redis.RedisConnectionFailureException: Unable to connect to Redis
at org.springframework.data.redis.connection.lettuce.LettuceConnectionFactory$ExceptionTranslatingConnectionProvider.translateException(LettuceConnectionFactory.java:1602) ~[spring-data-redis-3.0.0.jar:3.0.0]
Suppressed: reactor.core.publisher.FluxOnAssembly$OnAssemblyException:
Error has been observed at the following site(s):
*__checkpoint ⇢ Handler com.redis.controller.RedisController#test(RedisRequest) [DispatcherHandler]
*__checkpoint ⇢ HTTP POST "/save" [ExceptionHandlingWebHandler]
Original Stack Trace:
at org.springframework.data.redis.connection.lettuce.LettuceConnectionFactory$ExceptionTranslatingConnectionProvider.translateException(LettuceConnectionFactory.java:1602) ~[spring-data-redis-3.0.0.jar:3.0.0]
Caused by: io.lettuce.core.RedisConnectionException: Unable to connect to 10.51.5.2/<unresolved>:30315
at io.lettuce.core.RedisConnectionException.create(RedisConnectionException.java:78) ~[lettuce-core-6.2.1.RELEASE.jar:6.2.1.RELEASE]
at io.lettuce.core.RedisConnectionException.create(RedisConnectionException.java:56) ~[lettuce-core-6.2.1.RELEASE.jar:6.2.1.RELEASE]
at io.lettuce.core.AbstractRedisClient.getConnection(AbstractRedisClient.java:350) ~[lettuce-core-6.2.1.RELEASE.jar:6.2.1.RELEASE]
at io.lettuce.core.RedisClient.connect(RedisClient.java:216) ~[lettuce-core-6.2.1.RELEASE.jar:6.2.1.RELEASE]
Caused by: io.netty.channel.ConnectTimeoutException: connection timed out: /10.51.5.2:30315
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe$1.run(AbstractNioChannel.java:261) ~[netty-transport-4.1.85.Final.jar:4.1.85.Final]
at io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98) ~[netty-common-4.1.85.Final.jar:4.1.85.Final]
This is particularly puzzling, because I am quite sure the code of the Spring Boot app works. When I point new LettuceConnectionFactory("10.51.5.2", 30315); at:
a regular Redis on my laptop ("localhost", 6379),
a dockerized Redis on my laptop,
a dockerized Redis on prem,
all of them work fine.
Therefore, I am quite puzzled about what I did wrong with the setup of this Redis in AWS.
What should I do in order to connect to it properly?
May I get some help please?
Thank you
By default, Redis binds itself to the IP addresses 127.0.0.1 and ::1 and does not accept connections on non-local interfaces. Chances are high that this is your main issue, and you may want to review your redis.conf file to bind Redis to the interface you need, or to the generic * -::*, as explained in the comments of the config file itself.
That being said, Redis also does not accept connections on non-local interfaces if the default user has no password - a security layer named protected mode. Thus you should either give your default user a password or disable protected mode in your redis.conf file.
Not sure if this applies to your case but, as a side note, I would suggest always avoiding exposing Redis to the Internet.
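To illustrate the bind and protected-mode settings mentioned above, here is a sketch of what the redis-config ConfigMap referenced by the StatefulSet could contain; the exact lines are illustrative only, not the poster's actual configuration:
apiVersion: v1
kind: ConfigMap
metadata:
  name: redis-config      # mounted by the StatefulSet's "config" volume at /tmp/redis/
data:
  redis.conf: |
    # listen on all interfaces instead of only 127.0.0.1 / ::1 (illustrative)
    bind * -::*
    # either disable protected mode ...
    protected-mode no
    # ... or, preferably, keep it enabled and set a password for the default user:
    # requirepass CHANGE_ME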
You are mixing 2 things.
To make this service reachable from pods in different namespaces you do not need an external load balancer; you can just use the redis.namespace-name:6379 DNS name and it will just work. Such a DNS name exists for every service you create (but it only works inside Kubernetes).
Kubernetes will make sure that your traffic is routed to the proper pods (assuming there is more than one).
If you want to expose Redis from outside of Kubernetes, then you need to make sure there is connectivity from the outside, and then you need a network load balancer that forwards traffic to your Kubernetes service (in your case the NodePort, so you need an NLB with the EKS worker nodes on port 30315 as targets).
If your worker nodes have public IPs and their SecurityGroups allow connecting to them directly, you could try connecting to a worker node's IP directly just to test things out (without an LB).
And regardless of your setup, you can always create a proxy via kubectl:
kubectl port-forward -n redisNS svc/redis 6379:6379
and connect from the Spring Boot app to localhost:6379.
How do you want to connect from the app to Redis in the final setup?
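If you go with the in-cluster DNS option suggested above, one way to wire it up is through configuration instead of the hard-coded factory; a sketch, assuming Spring Boot 3 property names, that the custom LettuceConnectionFactory bean is removed so auto-configuration takes over, and with REDIS_NAMESPACE as a placeholder:
# application.yml
spring:
  data:
    redis:
      host: redis.REDIS_NAMESPACE.svc.cluster.local   # the Service's in-cluster DNS name
      port: 6379                                      # the Service port, not the NodePort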

Istio: Health check / sidecar fails when I enable the JWT RequestAuthentication

OBSOLETE:
I keep this post for further reference, but you can find a better diagnosis (not solved yet, but worked around) in
Istio: RequestAuthentication jwksUri does not resolve internal services names
UPDATE:
In the Istio log we see the following error. uaa is a Kubernetes pod serving OAuth authentication/authorization. It is reached under the name uaa by the normal services. I do not know why istiod cannot find the uaa host name. Do I have to use a specific name? (Remember, standard services find the uaa host perfectly.) A sketch of one idea is shown after the log below.
2021-03-03T18:39:36.750311Z error model Failed to fetch public key from "http://uaa:8090/uaa/token_keys": Get "http://uaa:8090/uaa/token_keys": dial tcp: lookup uaa on 10.96.0.10:53: no such host
2021-03-03T18:39:36.750364Z error Failed to fetch jwt public key from "http://uaa:8090/uaa/token_keys": Get "http://uaa:8090/uaa/token_keys": dial tcp: lookup uaa on 10.96.0.10:53: no such host
2021-03-03T18:39:36.753394Z info ads LDS: PUSH for node:product-composite-5cbf8498c7-jd4n5.chp18 resources:29 size:134.3kB
2021-03-03T18:39:36.754623Z info ads RDS: PUSH for node:product-composite-5cbf8498c7-jd4n5.chp18 resources:14 size:14.2kB
2021-03-03T18:39:36.790916Z warn ads ADS:LDS: ACK ERROR sidecar~10.1.1.56~product-composite-5cbf8498c7-jd4n5.chp18~chp18.svc.cluster.local-10 Internal:Error adding/updating listener(s) virtualInbound: Provider 'origins-0' in jwt_authn config has invalid local jwks: Jwks RSA [n] or [e] field is missing or has a parse error
2021-03-03T18:39:55.618106Z info ads ADS: "10.1.1.55:41162" sidecar~10.1.1.55~review-65b6886c89-bcv5f.chp18~chp18.svc.cluster.local-6 terminated rpc error: code = Canceled desc = context canceled
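One thing the lookup error above suggests (not verified here): istiod fetches the JWKS itself, so the short host name uaa may need to be fully qualified before istiod can resolve it. A sketch of the jwtRules with a fully qualified jwksUri, where NAMESPACE is a placeholder for the namespace uaa runs in:
jwtRules:
  - issuer: "http://uaa:8090/uaa/oauth/token"
    jwksUri: "http://uaa.NAMESPACE.svc.cluster.local:8090/uaa/token_keys"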
Original question
I have a service that works fine after injecting the Istio sidecar into a standard Kubernetes pod.
I'm trying to add JWT authentication, and for this I'm following the official guide, Authorization with JWT.
My problem is:
If I create the JWT resources (RequestAuthentication and AuthorizationPolicy) AFTER injecting the Istio dependencies, everything seems to work fine.
But if I create the JWT resources (RequestAuthentication and AuthorizationPolicy) and then inject Istio, the pod doesn't start. After checking the logs, it seems the sidecar is not able to work (maybe the health check?).
My code:
1- JWT resources
apiVersion: "security.istio.io/v1beta1"
kind: "RequestAuthentication"
metadata:
name: "ra-product-composite"
spec:
selector:
matchLabels:
app: "product-composite"
jwtRules:
- issuer: "http://uaa:8090/uaa/oauth/token"
jwksUri: "http://uaa:8090/uaa/token_keys"
---
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
name: "ap-product-composite"
spec:
selector:
matchLabels:
app: "product-composite"
action: ALLOW
# rules:
# - from:
# - source:
# requestPrincipals: ["http://uaa:8090/uaa/oauth/token/faf5e647-74ab-42cc-acdb-13cc9c573d5d"]
# b99ccf71-50ed-4714-a7fc-e85ebae4a8bb
2- I use destination rules as follows
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: dr-product-composite
spec:
  host: product-composite
  trafficPolicy:
    tls:
      mode: ISTIO_MUTUAL
3- My service deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: product-composite
spec:
  replicas: 1
  selector:
    matchLabels:
      app: product-composite
  template:
    metadata:
      labels:
        app: product-composite
        version: latest
    spec:
      containers:
        - name: comp
          image: bthinking/product-composite-service
          imagePullPolicy: Never
          env:
            - name: SPRING_PROFILES_ACTIVE
              value: "docker"
            - name: SPRING_CONFIG_LOCATION
              value: file:/config-repo/application.yml,file:/config-repo/product-composite.yml
          envFrom:
            - secretRef:
                name: rabbitmq-client-secrets
          ports:
            - containerPort: 80
          resources:
            limits:
              memory: 350Mi
          livenessProbe:
            httpGet:
              scheme: HTTP
              path: /actuator/info
              port: 4004
            initialDelaySeconds: 10
            periodSeconds: 10
            timeoutSeconds: 2
            failureThreshold: 20
            successThreshold: 1
          readinessProbe:
            httpGet:
              scheme: HTTP
              path: /actuator/health
              port: 4004
            initialDelaySeconds: 10
            periodSeconds: 10
            timeoutSeconds: 2
            failureThreshold: 3
            successThreshold: 1
          volumeMounts:
            - name: config-repo-volume
              mountPath: /config-repo
      volumes:
        - name: config-repo-volume
          configMap:
            name: config-repo-product-composite
---
apiVersion: v1
kind: Service
metadata:
  name: product-composite
spec:
  selector:
    app: "product-composite"
  ports:
    - port: 80
      name: http
      targetPort: 80
    - port: 4004
      name: http-mgm
      targetPort: 4004
4- Error log in the pod (combined service and sidecar)
2021-03-02 19:34:41.315 DEBUG 1 --- [undedElastic-12] o.s.s.w.s.a.AuthorizationWebFilter : Authorization successful
2021-03-02 19:34:41.315 DEBUG 1 --- [undedElastic-12] .b.a.e.w.r.WebFluxEndpointHandlerMapping : [0e009bf1-133] Mapped to org.springframework.boot.actuate.endpoint.web.reactive.AbstractWebFluxEndpointHandlerMapping$ReadOperationHandler#e13aa23
2021-03-02 19:34:41.316 DEBUG 1 --- [undedElastic-12] ebSessionServerSecurityContextRepository : No SecurityContext found in WebSession: 'org.springframework.web.server.session.InMemoryWebSessionStore$InMemoryWebSession#48e89a58'
2021-03-02 19:34:41.319 DEBUG 1 --- [undedElastic-15] .s.w.r.r.m.a.ResponseEntityResultHandler : [0e009bf1-133] Using 'application/vnd.spring-boot.actuator.v3+json' given [*/*] and supported [application/vnd.spring-boot.actuator.v3+json, application/vnd.spring-boot.actuator.v2+json, application/json]
2021-03-02 19:34:41.320 DEBUG 1 --- [undedElastic-15] .s.w.r.r.m.a.ResponseEntityResultHandler : [0e009bf1-133] 0..1 [java.util.Collections$UnmodifiableMap<?, ?>]
2021-03-02 19:34:41.321 DEBUG 1 --- [undedElastic-15] o.s.http.codec.json.Jackson2JsonEncoder : [0e009bf1-133] Encoding [{}]
2021-03-02 19:34:41.326 DEBUG 1 --- [or-http-epoll-3] r.n.http.server.HttpServerOperations : [id: 0x0e009bf1, L:/127.0.0.1:4004 - R:/127.0.0.1:57138] Detected non persistent http connection, preparing to close
2021-03-02 19:34:41.327 DEBUG 1 --- [or-http-epoll-3] o.s.w.s.adapter.HttpWebHandlerAdapter : [0e009bf1-133] Completed 200 OK
2021-03-02 19:34:41.327 DEBUG 1 --- [or-http-epoll-3] r.n.http.server.HttpServerOperations : [id: 0x0e009bf1, L:/127.0.0.1:4004 - R:/127.0.0.1:57138] Last HTTP response frame
2021-03-02 19:34:41.328 DEBUG 1 --- [or-http-epoll-3] r.n.http.server.HttpServerOperations : [id: 0x0e009bf1, L:/127.0.0.1:4004 - R:/127.0.0.1:57138] Last HTTP packet was sent, terminating the channel
2021-03-02T19:34:41.871551Z warn Envoy proxy is NOT ready: config not received from Pilot (is Pilot running?): cds updates: 1 successful, 0 rejected; lds updates: 0 successful, 1 rejected
5- Istio injection
kubectl get deployment product-composite -o yaml | istioctl kube-inject -f - | kubectl apply -f -
NOTICE: I have checked a lot of posts on SO, and it seems that health checking creates a lot of problems with sidecars and other configurations. I have checked the guide Health Checking of Istio Services with no success. Specifically, I tried disabling the probe rewrite with sidecar.istio.io/rewriteAppHTTPProbers: "false", but that is worse (in that case, neither the sidecar nor the service starts). A sketch of where that annotation goes is shown below.
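For reference, that annotation goes on the pod template rather than on the Deployment's own metadata; a minimal sketch against the Deployment above (whether "true" or "false" actually helps in this setup is not settled here):
apiVersion: apps/v1
kind: Deployment
metadata:
  name: product-composite
spec:
  template:
    metadata:
      labels:
        app: product-composite
      annotations:
        # controls whether the injected sidecar rewrites HTTP liveness/readiness probes
        sidecar.istio.io/rewriteAppHTTPProbers: "true"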

Istio 1.5.2: how to apply an AuthorizationPolicy with HTTP-conditions to a service?

Problem
We run Istio on our Kubernetes cluster and we're implementing AuthorizationPolicies.
We want to apply a filter on email address, an HTTP-condition only applicable to HTTP services.
Our Kiali service should be an HTTP service (it has an HTTP port, an HTTP listener, and even has HTTP conditions applied to its filters), and yet the AuthorizationPolicy does not work.
What gives?
Our setup
We have a management namespace with an ingressgateway (port 443), and a gateway+virtual service for Kiali.
These latter two point to the Kiali service in the kiali namespace.
Both the management and kiali namespace have a deny-all policy and an allow policy to make an exception for particular users.
(See AuthorizationPolicy YAMLs below.)
Authorization on the management ingress gateway works.
The ingress gateway has 3 listeners, all HTTP, and HTTP conditions are created and applied as you would expect.
You can visit its backend services other than Kiali if you're on the email list, and you cannot do so if you're not on the email list.
Authorization on the Kiali service does not work.
It has 99 listeners (!), including an HTTP listener on its configured 20001 port and its IP, but it does not work.
You cannot visit the Kiali service (due to the default deny-all policy).
The Kiali service has port 20001 enabled and named 'http-kiali', so the VirtualService should be OK with that. (See the YAMLs for the service and virtual service below.)
EDIT: it was suggested that the syntax of the email values matters.
I think that has been taken care of:
in the management namespace, the YAML below works as expected;
in the kiali namespace, the same YAML fails to work as expected.
The empty brackets in the 'property(map[request.auth.claims[email]:{[brackets@test.com] []}])' message are, I think, the Values (present) and NotValues (absent), respectively, as per 'constructed internal model: &{Permissions:[{Properties:[map[request.auth.claims[email]:{Values:[brackets@test.com] NotValues:[]}]]}]}'.
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: testpolicy-brackets
  namespace: kiali
spec:
  action: ALLOW
  rules:
    - when:
        - key: source.namespace
          values: ["brackets"]
        - key: request.auth.claims[email]
          values: ["brackets@test.com"]
---
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: testpolicy-yamllist
  namespace: kiali
spec:
  action: ALLOW
  rules:
    - when:
        - key: source.namespace
          values:
            - list
        - key: request.auth.claims[email]
          values:
            - list@test.com
debug rbac found authorization allow policies for workload [app=kiali,pod-template-hash=5c97c4bb66,security.istio.io/tlsMode=istio,service.istio.io/canonical-name=kiali,service.istio.io/canonical-revision=v1.16.0,version=v1.16.0] in kiali
debug rbac constructed internal model: &{Permissions:[{Services:[] Hosts:[] NotHosts:[] Paths:[] NotPaths:[] Methods:[] NotMethods:[] Ports:[] NotPorts:[] Constraints:[] AllowAll:true v1beta1:true}] Principals:[{Users:[] Names:[] NotNames:[] Group: Groups:[] NotGroups:[] Namespaces:[] NotNamespaces:[] IPs:[] NotIPs:[] RequestPrincipals:[] NotRequestPrincipals:[] Properties:[map[source.namespace:{Values:[brackets] NotValues:[]}] map[request.auth.claims[email]:{Values:[brackets@test.com] NotValues:[]}]] AllowAll:false v1beta1:true}]}
debug rbac generated policy ns[kiali]-policy[testpolicy-brackets]-rule[0]: permissions:<and_rules:<rules:<any:true > > > principals:<and_ids:<ids:<or_ids:<ids:<metadata:<filter:"istio_authn" path:<key:"source.principal" > value:<string_match:<safe_regex:<google_re2:<> regex:".*/ns/brackets/.*" > > > > > > > ids:<or_ids:<ids:<metadata:<filter:"istio_authn" path:<key:"request.auth.claims" > path:<key:"email" > value:<list_match:<one_of:<string_match:<exact:"brackets@test.com" > > > > > > > > > >
debug rbac ignored HTTP principal for TCP service: property(map[request.auth.claims[email]:{[brackets@test.com] []}])
debug rbac role skipped for no principals found
debug rbac found authorization allow policies for workload [app=kiali,pod-template-hash=5c97c4bb66,security.istio.io/tlsMode=istio,service.istio.io/canonical-name=kiali,service.istio.io/canonical-revision=v1.16.0,version=v1.16.0] in kiali
debug rbac constructed internal model: &{Permissions:[{Services:[] Hosts:[] NotHosts:[] Paths:[] NotPaths:[] Methods:[] NotMethods:[] Ports:[] NotPorts:[] Constraints:[] AllowAll:true v1beta1:true}] Principals:[{Users:[] Names:[] NotNames:[] Group: Groups:[] NotGroups:[] Namespaces:[] NotNamespaces:[] IPs:[] NotIPs:[] RequestPrincipals:[] NotRequestPrincipals:[] Properties:[map[source.namespace:{Values:[list] NotValues:[]}] map[request.auth.claims[email]:{Values:[list@test.com] NotValues:[]}]] AllowAll:false v1beta1:true}]}
debug rbac generated policy ns[kiali]-policy[testpolicy-yamllist]-rule[0]: permissions:<and_rules:<rules:<any:true > > > principals:<and_ids:<ids:<or_ids:<ids:<metadata:<filter:"istio_authn" path:<key:"source.principal" > value:<string_match:<safe_regex:<google_re2:<> regex:".*/ns/list/.*" > > > > > > > ids:<or_ids:<ids:<metadata:<filter:"istio_authn" path:<key:"request.auth.claims" > path:<key:"email" > value:<list_match:<one_of:<string_match:<exact:"list@test.com" > > > > > > > > > >
debug rbac ignored HTTP principal for TCP service: property(map[request.auth.claims[email]:{[list@test.com] []}])
debug rbac role skipped for no principals found
(Follows: a list of YAMLs mentioned above)
# Cluster AuthorizationPolicies
## Management namespace
Name: default-deny-all-policy
Namespace: management
API Version: security.istio.io/v1beta1
Kind: AuthorizationPolicy
Spec:
---
Name: allow-specified-email-addresses
Namespace: management
API Version: security.istio.io/v1beta1
Kind: AuthorizationPolicy
Spec:
Action: ALLOW
Rules:
When:
Key: request.auth.claims[email]
Values:
my.email@my.provider.com
---
## Kiali namespace
Name: default-deny-all-policy
Namespace: kiali
API Version: security.istio.io/v1beta1
Kind: AuthorizationPolicy
Spec:
---
Name: allow-specified-email-addresses
Namespace: kiali
API Version: security.istio.io/v1beta1
Kind: AuthorizationPolicy
Spec:
Action: ALLOW
Rules:
When:
Key: request.auth.claims[email]
Values:
my.email@my.provider.com
---
# Kiali service YAML
apiVersion: v1
kind: Service
metadata:
  labels:
    app: kiali
    version: v1.16.0
  name: kiali
  namespace: kiali
spec:
  clusterIP: 10.233.18.102
  ports:
    - name: http-kiali
      port: 20001
      protocol: TCP
      targetPort: 20001
  selector:
    app: kiali
    version: v1.16.0
  sessionAffinity: None
  type: ClusterIP
---
# Kiali VirtualService YAML
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: kiali-virtualservice
  namespace: management
spec:
  gateways:
    - kiali-gateway
  hosts:
    - our_external_kiali_url
  http:
    - match:
        - uri:
            prefix: /
      route:
        - destination:
            host: kiali.kiali.svc.cluster.local
            port:
              number: 20001
Marking as solved: I had forgotten to apply a RequestAuthentication to the Kiali namespace.
The problematic situation, with the fix in the third point below:
RequestAuthentication on the management namespace adds a user JWT (through an EnvoyFilter that forwards requests to an authentication service).
AuthorizationPolicy on the management namespace checks request.auth.claims[email]. These fields exist in the JWT and all is well.
RequestAuthentication on the Kiali namespace was missing. I fixed the problem by adding a RequestAuthentication for the Kiali namespace, which populates the user information and allows the AuthorizationPolicy to perform its checks on fields that actually exist (a sketch is shown after this list).
AuthorizationPolicy on the Kiali namespace also checks the request.auth.claims[email] field, but since there was no authentication, there was no JWT with these fields. (There are some fields populated, e.g. source.namespace, but nothing like a JWT.) Hence, user validation on that field fails, as you would expect.
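For completeness, a minimal sketch of the kind of RequestAuthentication that was missing in the kiali namespace; the name, issuer, and jwksUri values are placeholders, not the ones actually used:
apiVersion: security.istio.io/v1beta1
kind: RequestAuthentication
metadata:
  name: kiali-request-authentication   # placeholder name
  namespace: kiali
spec:
  jwtRules:
    - issuer: "https://YOUR_ISSUER"         # placeholder
      jwksUri: "https://YOUR_ISSUER/jwks"   # placeholder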
According to the Istio documentation:
Unsupported keys and values are silently ignored.
In your debug log there is:
debug rbac ignored HTTP principal for TCP service: property(map[request.auth.claims[email]:{[my.email@my.provider.com] []}])
As you can see, there are []}] characters there, which might suggest that the value was parsed the wrong way and got ignored as an unsupported value.
Try putting your values inside [""], as suggested in the documentation:
request.auth.claims
Claims from the origin JWT. The actual claim name is surrounded by brackets. HTTP only.
key: request.auth.claims[iss]
values: ["*@foo.com"]
Hope it helps.
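If it helps to see that in manifest form, here is the allow policy from the describe output above written as YAML, with the value quoted inside a list (the email address is the placeholder from the question):
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: allow-specified-email-addresses
  namespace: kiali
spec:
  action: ALLOW
  rules:
    - when:
        - key: request.auth.claims[email]
          values: ["my.email@my.provider.com"]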

AWS ALB not resolving

So I have an EKS cluster, and I have set up the AWS ALB Ingress Controller:
https://github.com/kubernetes-sigs/aws-alb-ingress-controller
I'm trying to set up Grafana here, and the Ingress is created but it doesn't seem to resolve at all.
I have the following Ingress:
$ kubectl describe ingress grafana
Name: grafana
Namespace: orbix-mvp
Address: 4ae1e4ba-orbixmvp-grafana-fd7d-993303634.eu-central-1.elb.amazonaws.com
Default backend: default-http-backend:80 (<none>)
Rules:
Host Path Backends
---- ---- --------
grafana-orbix.orbixpay.com
/ grafana:80 (<none>)
Annotations:
alb.ingress.kubernetes.io/scheme: internet-facing
alb.ingress.kubernetes.io/ssl-policy: ELBSecurityPolicy-2016-08
alb.ingress.kubernetes.io/subnets: subnet-08431d96168e36c30,subnet-0e2a7e2766852bf8a
alb.ingress.kubernetes.io/success-codes: 302
kubernetes.io/ingress.class: alb
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal CREATE 45m alb-ingress-controller LoadBalancer 4ae1e4ba-orbixmvp-grafana-fd7d created, ARN: arn:aws:elasticloadbalancing:eu-central-1:109153834985:loadbalancer/app/4ae1e4ba-orbixmvp-grafana-fd7d/4b98cb7027b71697
Normal CREATE 45m alb-ingress-controller rule 1 created with conditions [{ Field: "host-header", Values: ["grafana-orbix.orbixpay.com"] },{ Field: "path-pattern", Values: ["/"] }]
The backend for it is the following service:
$ kubectl describe service grafana
Name: grafana
Namespace: orbix-mvp
Labels: app=grafana
chart=grafana-1.25.1
heritage=Tiller
release=grafana
Annotations: <none>
Selector: app=grafana,release=grafana
Type: NodePort
IP: 172.20.11.232
Port: service 80/TCP
TargetPort: 3000/TCP
NodePort: service 30772/TCP
Endpoints: 10.0.0.180:3000
Session Affinity: None
External Traffic Policy: Cluster
Events: <none>
It does have a proper endpoint:
$ kubectl get endpoints | grep grafana
grafana 10.0.0.180:3000 46m
The pod itself is properly tagged and has the correct IP that's the endpoint above:
$ kubectl describe pod grafana-bdc977fd4-ptzhg
Name: grafana-bdc977fd4-ptzhg
Namespace: orbix-mvp
Priority: 0
PriorityClassName: <none>
Node: ip-10-0-0-230.eu-central-1.compute.internal/10.0.0.230
Start Time: Mon, 11 Feb 2019 13:24:43 +0200
Labels: app=grafana
pod-template-hash=687533980
release=grafana
Annotations: <none>
Status: Running
IP: 10.0.0.180
My AWS account has the LoadBalancer listed as Active, the subnets are in the same VPC as the cluster, and security groups are being generated by the Ingress Controller.
Everything seems to be set up properly; however, when I access the LoadBalancer address, it just times out.
$ kubectl get ingresses
NAME HOSTS ADDRESS PORTS AGE
grafana grafana-orbix.orbixpay.com 4ae1e4ba-orbixmvp-grafana-fd7d-993303634.eu-central-1.elb.amazonaws.com 80 49m
I actually figured it out: the Ingress configuration was allowing traffic for the domain only. That excludes traffic to the load balancer address (which I had assumed was allowed by default).
Basically, it needs to allow any host (*) in order for the load balancer URL to work too. Also, if the app redirects to /login like in my case, all paths need to be allowed as well, since that redirect doesn't work if the only path specified is /. A sketch of such a rule is below.
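A minimal sketch of what that can look like for the Grafana Ingress above; dropping the host field matches any host (including the ALB's own DNS name), and the /* path is an assumption based on the ALB ingress controller's path-pattern matching (annotations trimmed for brevity):
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: grafana
  namespace: orbix-mvp
  annotations:
    kubernetes.io/ingress.class: alb
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/success-codes: '302'
spec:
  rules:
    - http:              # no host field, so the ALB DNS name is matched as well
        paths:
          - path: /*     # allow all paths so the /login redirect also works
            backend:
              serviceName: grafana
              servicePort: 80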

using sourceLabel in istio is not working

I am trying to filter access to external resources. I have created a ServiceEntry:
apiVersion: networking.istio.io/v1alpha3
kind: ServiceEntry
metadata:
  name: bbc-ext
spec:
  hosts:
    - "www.bbc.co.uk"
  ports:
    - number: 443
      name: https
      protocol: HTTPS
I am using sourceLabels to filter which source apps are allowed to access external resources.
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: bbc-ext
spec:
  hosts:
    - "www.bbc.co.uk"
  http:
    - match:
        - sourceLabels:
            envir: "production"
      route:
        - destination:
            host: "www.bbc.co.uk"
          weight: 100
    - route:
        - destination:
            host: "www.bbc.co.uk"
      fault:
        abort:
          percent: 100
          httpStatus: 400
My pod is labeled envir=development but is still allowed access to the resource.
kubectl get pods --show-labels
NAME READY STATUS RESTARTS AGE LABELS
sleep-d7bfccf65-ws6t6 2/2 Running 0 16m app=sleep,envir=development,pod-template-hash=836977921
But when I log into the container and run a curl request, it still succeeds. What am I doing wrong here?
kubectl exec -it sleep-d7bfccf65-ws6t6 -c sleep bash
root@sleep-d7bfccf65-ws6t6:/# curl -v -sL https://www.bbc.co.uk -w "%{http_code}\n" -o /dev/null
[...]
< Cache-Control: private, max-age=0, must-revalidate
< Vary: Accept-Encoding, X-CDN, X-BBC-Edge-Scheme
<
{ [data not shown]
* Connection #0 to host www.bbc.co.uk left intact
200
Still the same.
I also noticed that sync does not seem to be working for routes:
istioctl proxy-status
PROXY CDS LDS EDS RDS PILOT
istio-egressgateway-6cb5b78857-cvqfz.istio-system SYNCED SYNCED SYNCED (100%) NOT SENT istio-pilot-56f6487cdb-qlhzr
istio-ingressgateway-5766b9cc69-64bgd.istio-system SYNCED SYNCED SYNCED (100%) NOT SENT istio-pilot-56f6487cdb-qlhzr
sleep-86f6b99f94-n8l8r.production SYNCED SYNCED SYNCED (100%) SYNCED istio-pilot-56f6487cdb-qlhzr
sleep-d7bfccf65-qbs7v.development SYNCED SYNCED SYNCED (100%) SYNCED istio-pilot-56f6487cdb-qlhzr