How to enable user-level rate limiting in Istio

I saw the Istio site mention rate-limiting support, but I can only find a global rate-limit example.
Is it possible to do this at the user level? For example, if my user is logged in but sends more than 50 requests within a second, then I'd like to block said user, etc. Similarly, if a user isn't logged in, then that device cannot send more than 30 requests per second.

Yes, it is possible to conditionally apply rate limits based on arbitrary attributes using a match condition in the quota rule:
apiVersion: config.istio.io/v1alpha2
kind: rule
metadata:
  name: quota
  namespace: istio-system
spec:
  match: source.namespace != destination.namespace
  actions:
  - handler: handler.memquota
    instances:
    - requestcount.quota
The quota will only apply when the source namespace is not equal to the destination namespace. In your case you probably want to set a match like this:
match:
  request:
    headers:
      cookie:
        regex: "^(.*?;)?(user=jason)(;.*)?$"
I made a PR to improve the rate-limiting docs; you can find it here: https://github.com/istio/istio.github.io/pull/1109

Related

Kiali is not working with VictoriaMetrics

Was anyone able to make Kiali visualise the mesh using VictoriaMetrics instead of Prometheus?
When I use the Prometheus and Kiali setup from the Istio samples, mesh visualisation works.
But when I replace Prometheus with Victoria Metrics (agent, select, insert, storage), Kiali simply shows an empty graph.
I have checked that both Prometheus and Victoria Metrics have the same istio_requests_total metric.
But when I use the Victoria Metrics select URL in spec.external_services.prometheus.url, the graph comes up empty.
apiVersion: kiali.io/v1alpha1
kind: Kiali
metadata:
  name: kiali
  namespace: istio-system
spec:
  auth:
    strategy: anonymous
  external_services:
    istio:
      config_map_name: istio-1-14
      url_service_version: http://istiod-1-14:15014/version
    prometheus:
      url: http://vmselect-example-vmcluster-persistent.poc.svc.cluster.local:8481/select/0/prometheus/
In the logs I see two errors related to the fact that vmselect does not have the corresponding endpoints:
2022-07-15T19:25:13Z ERR Failed to fetch Prometheus configuration: bad_response: readObjectStart: expect { or n, but found r, error found in #1 byte of ...|remoteAddr:|..., bigger context ...|remoteAddr: "10.4.34.83:57468"; requestURI: /select|...
2022-07-15T19:25:13Z ERR Failed to fetch Prometheus flags: bad_response: readObjectStart: expect { or n, but found r, error found in #1 byte of ...|remoteAddr:|..., bigger context ...|remoteAddr: "10.4.34.83:57468"; requestURI: /select|...
and multiple warnings
2022-07-15T19:35:28Z WRN Skipping {destination_canonical_revision="v1", destination_canonical_service="microservice", destination_cluster="Kubernetes", destination_service="microservice.poc.svc.cluster.local", destination_service_name="microservice", destination_service_namespace="poc", destination_workload="microservice", destination_workload_namespace="poc", request_protocol="http", response_code="200", response_flags="-", source_canonical_revision="latest", source_canonical_service="istio-ingressgateway-internal", source_cluster="Kubernetes"}, missing expected TS labels
Here is my VMPodScrape, which I expect to scrape all metrics from all pods:
apiVersion: operator.victoriametrics.com/v1beta1
kind: VMPodScrape
metadata:
  name: all-scrape
  namespace: poc
spec:
  podMetricsEndpoints:
  - scheme: http
    path: /stats/prometheus
    targetPort: 15090
  selector: {}
  namespaceSelector:
    any: true
The error messages don't look critical; when they occur, Kiali should fall back to default values. As far as I understand, it only tries to detect the scrape interval and retention from the Prometheus configuration file and flags.
I think you have an issue with your relabeling config: it drops labels required by Kiali.
The docs list the labels each metric needs:
https://kiali.io/docs/faq/general/#which-istio-metrics-and-attributes-are-required-by-kiali
I'd recommend checking the scrape config at VMAgent; the relabeling rules are probably outdated.
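As a hedged illustration (the field names follow the Prometheus Operator PodMonitor conventions that VMPodScrape mirrors, so verify them against your operator version), you could make the endpoint keep the labels the Envoy sidecar exposes and check that no relabeling entry drops the labels Kiali expects:
apiVersion: operator.victoriametrics.com/v1beta1
kind: VMPodScrape
metadata:
  name: all-scrape
  namespace: poc
spec:
  podMetricsEndpoints:
  - scheme: http
    path: /stats/prometheus
    targetPort: 15090
    # keep the labels already present on the scraped series instead of
    # overwriting them with target labels (assumption: supported by your
    # VictoriaMetrics operator version, as in the PodMonitor spec)
    honorLabels: true
    # leave this empty, or at least avoid dropping the source_*, destination_*,
    # reporter, request_protocol and response_code labels
    metricRelabelConfigs: []
  selector: {}
  namespaceSelector:
    any: true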

GCP url-map with cookie regex matching rule

I am setting up a GCP url-map to route requests to backend services based on cookie values. Since the Cookie header can contain multiple key-value pairs, I am trying to use a regex matcher.
I need to route requests to backends based on region value from cookie.
A typical cookie would look like this: foo=bar;region=eu;variant=beta;
defaultService: https://www.googleapis.com/compute/v1/projects/<project_id>/global/backendServices/multi-region-1
kind: compute#urlMap
name: regex-url-map
hostRules:
- hosts:
  - '*'
  pathMatcher: path-matcher-1
pathMatchers:
- defaultService: https://www.googleapis.com/compute/v1/projects/<project_id>/global/backendServices/multi-region-1
  name: path-matcher-1
  routeRules:
  - matchRules:
    - prefixMatch: /
      headerMatches:
      - headerName: Cookie
        regexMatch: (region=us)
    priority: 0
    service: https://www.googleapis.com/compute/v1/projects/<project_id>/global/backendServices/multi-region-1
  - matchRules:
    - prefixMatch: /
      headerMatches:
      - headerName: Cookie
        regexMatch: (region=eu)
    priority: 1
    service: https://www.googleapis.com/compute/v1/projects/<project_id>/global/backendServices/multi-region-2
However, this url-map fails validation with this error:
$ gcloud compute url-maps validate --source regex-url-map.yaml
result:
  loadErrors:
  - HttpHeaderMatch has no predicates specified
  loadSucceeded: false
  testPassed: false
Please note that an exact match on the cookie passes validation and matches correctly if the cookie value is just something like region=us. The headerMatches section for an exact match would look like this:
headerMatches:
- headerName: Cookie
  exactMatch: region=us
Any pointers on what I am doing wrong here?
Thanks!
Your way of reasoning is correct, but the feature you're trying to use is unsupported for external load balancing in GCP; it works only with internal load balancing.
Look at the last phrase from the documentation:
Note that regexMatch only applies to Loadbalancers that have their loadBalancingScheme set to INTERNAL_SELF_MANAGED.
I know it isn't the answer you're looking for, but you can always file a new feature request on Google's IssueTracker and explain in detail what you want and how it could work.
You can always try to pass the region value in the HTTP request itself: instead of requesting https://myhost.com all the time, add a path suffix, for example https://myhost.com/region1. That would allow the GCP load balancer rules to process it and direct the traffic to the backend you wish.
Have a look at this example of what you can and can't do with forwarding rules in GCP. Another example is here. And another one (mine) explains how to use pathMatcher to direct traffic to different backend services.
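To illustrate that path-based alternative, a sketch of the pathMatcher could route on path prefixes instead of the Cookie header. The /us/ and /eu/ prefixes are hypothetical stand-ins for the region value; the backend service URLs are reused from the question:
pathMatchers:
- name: path-matcher-1
  defaultService: https://www.googleapis.com/compute/v1/projects/<project_id>/global/backendServices/multi-region-1
  routeRules:
  # hypothetical path prefixes standing in for the region cookie
  - matchRules:
    - prefixMatch: /us/
    priority: 0
    service: https://www.googleapis.com/compute/v1/projects/<project_id>/global/backendServices/multi-region-1
  - matchRules:
    - prefixMatch: /eu/
    priority: 1
    service: https://www.googleapis.com/compute/v1/projects/<project_id>/global/backendServices/multi-region-2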

istio setting request size limits - lookup failed: 'request.size'

I am attempting to limit traffic by request size using Istio. Given that the VirtualService does not provide this, I am trying to do it via a Mixer policy.
I set up the following:
---
apiVersion: "config.istio.io/v1alpha2"
kind: handler
metadata:
  name: denylargerequest
spec:
  compiledAdapter: denier
  params:
    status:
      code: 9
      message: Request Too Large
---
apiVersion: "config.istio.io/v1alpha2"
kind: instance
metadata:
  name: denylargerequest
spec:
  compiledTemplate: checknothing
---
apiVersion: "config.istio.io/v1alpha2"
kind: rule
metadata:
  name: denylargerequest
spec:
  match: destination.labels["app"] == "httpbin" && request.size > 100
  actions:
  - handler: denylargerequest
    instances: [ denylargerequest ]
Requests are not denied, and I see the following error from istio-mixer:
2020-01-07T15:42:40.564240Z warn input set condition evaluation error: id='2', error='lookup failed: 'request.size''
If I remove the request.size portion of the match, I get the expected behavior, which is a 400 HTTP status with a message about the request size. Of course, I then get it on every request, which is not desired. But that, along with the above error, makes it clear that the request.size attribute is the problem.
I do not see anywhere in Istio's docs which attributes are available to the Mixer rules.
I am running Istio 1.3.0.
Any suggestions on the Mixer rule? Or an alternative way to enforce request size limits via Istio?
The rule match mentioned in the question:
match: destination.labels["app"] == "httpbin" && request.size > 100
will not work because of mismatched attribute types.
According to istio documentation:
Match is an attribute based predicate. When Mixer receives a request
it evaluates the match expression and executes all the associated
actions if the match evaluates to true.
A few example matches:
- an empty match evaluates to true
- true, a boolean literal; a rule with this match will always be executed
- match(destination.service.host, "ratings.*") selects any request targeting a service whose name starts with “ratings”
- attr1 == "20" && attr2 == "30"; logical AND, OR, and NOT are also available
This means that request.size > 100 compares against an integer value, which is not supported.
However, it is possible to do this with the help of the Common Expression Language (CEL).
You can enable CEL in Istio by using the policy.istio.io/lang annotation (set it to CEL).
Then, using the Type Values from the List of Standard Definitions, you can use functions to parse values into different types.
This is just a suggestion for a solution.
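A hedged sketch of what that could look like, assuming the annotation is set on the rule resource as the docs describe; the match expression is unchanged from the question, and whether the integer comparison behaves as expected should be verified on your Istio version:
apiVersion: "config.istio.io/v1alpha2"
kind: rule
metadata:
  name: denylargerequest
  annotations:
    # switch the expression language for this rule to CEL
    # (assumption: supported by the Mixer version in use)
    policy.istio.io/lang: CEL
spec:
  match: destination.labels["app"] == "httpbin" && request.size > 100
  actions:
  - handler: denylargerequest
    instances: [ denylargerequest ]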
An alternative way would be to use an EnvoyFilter, as in this GitHub issue.
According to another related GitHub issue about Envoy's per-connection buffer limit:
The current resolution is to use envoyfilter, reopen if you feel this is a must feature
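For illustration, a minimal sketch of that EnvoyFilter approach could insert Envoy's buffer filter, which rejects request bodies larger than a configured size. This uses the newer configPatches syntax and Envoy v3 type names, so the exact schema and filter names will differ on Istio 1.3; the httpbin selector and 100-byte limit mirror the question and are otherwise assumptions:
apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: limit-request-size
spec:
  workloadSelector:
    labels:
      app: httpbin
  configPatches:
  - applyTo: HTTP_FILTER
    match:
      context: SIDECAR_INBOUND
      listener:
        filterChain:
          filter:
            name: envoy.filters.network.http_connection_manager
    patch:
      operation: INSERT_BEFORE
      value:
        # Envoy's buffer filter replies with 413 once the request body
        # exceeds max_request_bytes
        name: envoy.filters.http.buffer
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.http.buffer.v3.Buffer
          max_request_bytes: 100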
Hope this helps.

Setting a custom call source header with Istio

I have a setup using Kubernetes and Istio where we run a set of services. Each of our services has an Istio sidecar and a REST API. What we would like is that whenever a service within our setup calls another, the called service knows which service is the caller (preferably through a header).
Looking at the example architecture image from the Bookinfo sample, this would mean that in the source code for the ratings service I would like to be able to, for example, read a header telling me the request came from e.g. Reviews-v2.
My intuition tells me that I should be able to handle this in the Istio sidecars, but I fail to see exactly how.
Until now I have looked especially at Envoy filters in the hope that they could help me. I see that with an Envoy filter I would be able to set a header, but what I don't see is how I would get the information about which service made the call in order to set it in the header.
Envoy automatically sets the X-Forwarded-Client-Cert header, which contains the SPIFFE ID of the caller. The SPIFFE ID in Istio is a URI of the form spiffe://cluster.local/ns/<namespace>/sa/<service account>; practically, it designates the Kubernetes service account of the caller. You may want to test it by using the Istio httpbin sample and sending a request to httpbin:8000/headers.
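For example, assuming the httpbin and sleep samples from the Istio distribution are deployed in the same namespace, something like this should echo the header back:
kubectl exec deploy/sleep -c sleep -- curl -s http://httpbin:8000/headers
# look for "X-Forwarded-Client-Cert" in the returned JSON; its URI= part carries
# the caller's SPIFFE ID, e.g. spiffe://cluster.local/ns/default/sa/sleep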
I ended up finding another solution by using a "rule". We made sure that policy enforcement was enabled and then added the rule:
apiVersion: config.istio.io/v1alpha2
kind: rule
metadata:
  name: header-rule
  namespace: istio-system
spec:
  actions: []
  requestHeaderOperations:
  - name: serviceid
    values:
    - source.labels["app"]
    operation: REPLACE
We achieved what we were attempting to do.

How can I confirm whether Circuit Breaking (via DestinationRule) is at work or not for external service (ServiceEntry & VirtualService)

Summary of Problem
I'm trying to impose circuit-breaker parameters for an external endpoint outside of my mesh, hosted somewhere else. However, the parameters I have set don't seem to be enforced, because I am still getting successful HTTP 200 responses when I expect requests to start failing with HTTP 503.
Tools versions are:
Istio-1.2.4
Kubernetes: v1.10.11
Docker Desktop Version 2.0.0.3
Notable config:
global.outboundTrafficPolicy.mode is REGISTRY_ONLY.
Within the mesh, mTLS is enabled. For external traffic, TLS is DISABLED.
Related Resources
ServiceEntry
apiVersion: networking.istio.io/v1alpha3
kind: ServiceEntry
metadata:
  name: external-service
spec:
  hosts:
  - external-service.sample.com
  location: MESH_EXTERNAL
  exportTo:
  - "*"
  ports:
  - number: 80
    name: http
    protocol: HTTP
  resolution: DNS
VirtualService
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: external-service-vs
spec:
  hosts:
  - external-service.sample.com
  http:
  - timeout: 200ms
    retries:
      attempts: 1
      perTryTimeout: 200ms
    route:
    - destination:
        host: external-service.sample.com
        port:
          number: 80
DestinationRule
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: external-service-dr
spec:
  host: external-service.sample.com
  trafficPolicy:
    tls:
      mode: DISABLE
    connectionPool:
      tcp:
        maxConnections: 1
        connectTimeout: 200ms
      http:
        http2MaxRequests: 1
        http1MaxPendingRequests: 1
        maxRequestsPerConnection: 1
        maxRetries: 1
        idleTimeout: 200ms
    outlierDetection:
      consecutiveErrors: 1
      interval: 1s
      baseEjectionTime: 10s
      maxEjectionPercent: 100
Testing
I have an application inside the mesh, injected with an Envoy proxy. The app basically just runs concurrent load of HTTP/1.1 GET requests against external-service.sample.com/endpoint.
I adjust the number of concurrent users in the app (1 to 10) and the requests per second per user (1 to 20).
I was expecting the responses to start failing as the load ramps up, but that's not the case; I get successes throughout.
Key Asks
If you see something very glaring, please point it out.
I already checked the logs and /stats from my Envoy proxy (outgoing request and response). What other Istio logs do I need to check to understand whether the request was actually subjected to and evaluated against the DestinationRule?
Besides the statistical data gathered by Istio Mixer from the nested Envoy instances, you might consider fetching circuit-breaker events from the Envoy access logs.
Since access logging is enabled across the Istio mesh plane, you can extract the relevant circuit-breaker log entries, distinguished by a specific response flag:
UO: Upstream overflow (circuit breaking), in addition to the 503 response code.
A record fetched from the container's envoy-proxy access log looks like this:
[2019-09-18T09:49:56.716Z] "GET /headers HTTP/1.1" 503 UO "-" "-" 0 81 0 - "-"
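On top of the access logs, the sidecar's own statistics expose circuit-breaker counters. As a sketch (the pod name is a placeholder, and exact stat names depend on the Envoy version), you could inspect the overflow and outlier-detection counters for the external service's cluster:
kubectl exec <app-pod> -c istio-proxy -- pilot-agent request GET stats \
  | grep external-service.sample.com \
  | grep -E 'upstream_rq_pending_overflow|upstream_cx_overflow|outlier_detection.ejections'
# non-zero *_overflow counters indicate requests rejected by circuit breaking,
# and the ejection counters show hosts removed by outlier detection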
I have not really addressed the issue directly.
But I redid the whole setup from a clean slate, starting again from installing Istio, and after that it was already throwing the expected HTTP 503.
It was more challenging than it should be to know the state of the circuit breakers. There was supposed to be a ticket logged for it, but it seems development of such a feature is not yet on the horizon.
Nevertheless, when verifying, I did take a look at some telemetry metrics to understand the circuit-breaker state. I think this way is better, because I only want to know whether the circuit is closed or open at a given moment, not analyze multiple input data.
Thanks.