How to bypass a JWT policy within the mesh using Istio

I am setting up end-user authentication using Istio. I have service A and service B in my mesh, and a JWT policy is applied to service B so that requests from outside the cluster need an authorization token to access it.
However, I found that when service A needs to access service B, it also gets a 401, meaning a token is required. How can I bypass the authentication within the mesh and apply it only to traffic coming from outside the mesh?

I don't think it is feasible to arrange access to the same Kubernetes target service with different authentication approaches, where external visitors to the mesh must present a JWT for this service while internal mesh clients are exempt from that policy rule.
Looking deeper into the origin authentication (end-user authentication) design, you will find that an Istio Policy is bound to target workloads representing a particular Kubernetes service object. Therefore, you can't change the JWT policy's behavior depending on how the initial request reaches the target Kubernetes service.
I would say that you can apply the JWT policy at the istio-ingressgateway level instead, so that all requests entering the mesh from outside are authenticated at that stage, while mesh service-to-service communication discovered by the Kubernetes system can be secured over an mTLS transport channel. But this solution requires reconsidering the current microservice design, since it affects the authentication and authorization methods of all mesh services.

Related

How to build an AWS serverless Apollo Federation API which is constructed from subgraphs in separate microservices?

The Task:
Imagine a large-ish company or product that has many microservices run by separate teams. Each microservice exists in a separate repo. You want to build a single unified Apollo GraphQL API which collates all the subgraphs from the separate microservice APIs. And you want to build it using Serverless technologies. The Unified API should be authenticated using Cognito but the underlying subgraph APIs shouldn't be exposed to the public internet.
Ideal scenario: AWS AppSync would support federation natively.
Reality: Since this can't be done easily, we have to run the Apollo Federation server in a Lambda fronted by API Gateway. The federation server's setup knows where the subgraph endpoints are.
Question: How do we construct the subgraphs: using AppSync, additional Lambda servers for each microservice, or something else? What techniques have people used to deploy Apollo Federation within AWS?
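For concreteness, here is a minimal, hedged TypeScript sketch of the "reality" setup described above: an Apollo Federation gateway running in a Lambda behind API Gateway. It assumes the apollo-server-lambda and @apollo/gateway packages; the subgraph names, URLs, and the SUBGRAPH_API_KEY variable are placeholders, not a prescribed design.

```typescript
// Hedged sketch: an Apollo Federation gateway running in a Lambda behind API Gateway.
// Assumes the apollo-server-lambda and @apollo/gateway packages; the subgraph names,
// URLs, and the SUBGRAPH_API_KEY variable are placeholders. (serviceList is the
// older-style gateway config; newer @apollo/gateway versions use IntrospectAndCompose
// or a supergraph SDL instead.)
import { ApolloServer } from 'apollo-server-lambda';
import { ApolloGateway, RemoteGraphQLDataSource } from '@apollo/gateway';

const gateway = new ApolloGateway({
  serviceList: [
    { name: 'users', url: 'https://example.com/users/graphql' },   // placeholder subgraph
    { name: 'orders', url: 'https://example.com/orders/graphql' }, // placeholder subgraph
  ],
  buildService({ name, url }) {
    // Attach an internal credential to every subgraph request so the subgraphs
    // can reject traffic that did not come through the federation gateway.
    return new RemoteGraphQLDataSource({
      url,
      willSendRequest({ request }) {
        request.http?.headers.set('x-api-key', process.env.SUBGRAPH_API_KEY ?? '');
      },
    });
  },
});

const server = new ApolloServer({ gateway });

// API Gateway (fronted by a Cognito authorizer) invokes this handler.
export const handler = server.createHandler();
```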
Design Considerations:
Resource-Based Policies:
What I would prefer is for the API Gateway to be authenticated with Cognito and for the AppSync subgraphs to grant full access to the Lambda. However, because AppSync doesn't support resource-based policies, this isn't possible.
API-Key:
I can use API-key auth on the subgraphs, but since AppSync has publicly accessible endpoints this feels like a security risk.
Cognito:
A possibility - would need to pass the Cognito auth through from API Gateway to the Lambda, then to the subgraphs. Feels icky.
Lambda Authorization:
Add Lambda authorization on the subgraphs and use the request context(?) to determine that the request came from inside. A hack to stand in for resource-based policies (see the sketch after this list).
Out to Internet and Back Subgraph:
AppSync provides a public URL for the endpoint, and composing the federated graph pulls the schemas to build the supergraph schema. This feels like internal services going out to the internet and then back in. The best solution would be internal IP addresses/URLs and hosting all the subgraphs within a private VPC.
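As referenced in the Lambda Authorization item above, here is a hedged sketch of what that workaround could look like. The event and response shapes follow AppSync's Lambda authorizer contract (authorizationToken in, isAuthorized out), but the shared-secret scheme and the INTERNAL_CALLER_SECRET variable are hypothetical; this variant checks a shared secret sent as the authorization token rather than inspecting the request context.

```typescript
// Hedged sketch of an AppSync Lambda authorizer that only admits calls carrying a
// shared secret known to the internal federation Lambda. The INTERNAL_CALLER_SECRET
// variable and the resolverContext contents are hypothetical; the event/response
// shapes follow AppSync's Lambda authorizer contract.
interface AppSyncAuthorizerEvent {
  authorizationToken: string;
  requestContext: { apiId: string; accountId: string; requestId: string };
}

interface AppSyncAuthorizerResult {
  isAuthorized: boolean;
  resolverContext?: Record<string, string>;
  ttlOverride?: number;
}

export const handler = async (
  event: AppSyncAuthorizerEvent
): Promise<AppSyncAuthorizerResult> => {
  // The federation Lambda sends this value as its authorization token.
  const internalSecret = process.env.INTERNAL_CALLER_SECRET;
  const isInternal = !!internalSecret && event.authorizationToken === internalSecret;

  return {
    isAuthorized: isInternal,
    // Optionally pass caller info through to the resolvers.
    resolverContext: isInternal ? { caller: 'federation-gateway' } : undefined,
    ttlOverride: 300, // cache the decision for 5 minutes
  };
};
```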
Conclusion:
Building a secure federated graph feels hacky with serverless technologies. It feels like I should avoid AppSync altogether and use subgraphs (private API Gateways in a private VPC, powered by Lambdas) feeding into a public API Gateway authenticated by Cognito.
Interested in thoughts.

AWS Cognito JWT attribute-based routing

I'm new to AWS and its services. What I want to achieve is a multi-tenant SaaS application. Here is what my concept looks like:
I use Cognito for user authentication. All users, no matter which tenant they belong to, should use one frontend to log in. For tenant recognition I use a custom attribute "custom:tenant", which I read from the JWT after a successful login.
For the application itself I want to use VPCs, and to ensure encapsulation each tenant should have their own VPC.
Example:
User A of tenant 1 logs in and gets back a JWT with the claim "custom:tenant":"1"; they should be routed to VPC 1.
User B of tenant 2 logs in and gets back a JWT with the claim "custom:tenant":"2"; they should be routed to VPC 2.
Now my question is: how do I achieve this routing to the appropriate VPC after a successful login? Do I need additional services for that, or where do I find these settings?
There is a standard content-based routing technique for routing on the contents of JWTs. This type of thing is usually managed by a reverse proxy or API gateway placed in front of the APIs, which runs some custom logic to read the JWT and route accordingly. This also keeps the plumbing outside of the application components.
EXAMPLE
Here is an NGINX example coded in Lua, a high-level scripting language, that reads the JWT and extracts a claim. In the example the claim is a zone, whereas in your case it is a tenant ID:
NGINX Configuration
NGINX Plugin Code
Architecture Article
PREREQUISITES
Not all middleware supports this type of routing though. For example, you won't be able to do it in a simple load balancer. One option might be to run NGINX as a cloud-managed service, though it will cost money. A good gateway in front of your APIs is an important architectural component, so see whether your company feels it is worth investing in.
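The linked example does this in NGINX and Lua. Purely as an illustration of the same content-based routing idea, here is a TypeScript sketch of a small Node proxy that decodes the JWT payload, reads custom:tenant, and picks an upstream; the upstream URLs and the http-proxy dependency are assumptions, and it deliberately does not verify the token signature (your gateway or the APIs behind it still must).

```typescript
// Illustrative TypeScript sketch of content-based routing on a JWT claim, in the
// spirit of the NGINX/Lua example linked above (not that plugin itself).
import http from 'node:http';
import httpProxy from 'http-proxy'; // assumed dependency (npm install http-proxy)

// Hypothetical mapping of tenant ID -> VPC-internal upstream.
const upstreams: Record<string, string> = {
  '1': 'http://tenant1.internal.example.com',
  '2': 'http://tenant2.internal.example.com',
};

// Decode the JWT payload and read the custom:tenant claim.
// NOTE: this does NOT verify the signature; that must still happen downstream.
function tenantFromJwt(authHeader?: string): string | undefined {
  if (!authHeader?.startsWith('Bearer ')) return undefined;
  const parts = authHeader.slice('Bearer '.length).split('.');
  if (parts.length !== 3) return undefined;
  try {
    const payload = JSON.parse(Buffer.from(parts[1], 'base64url').toString('utf8'));
    return payload['custom:tenant'];
  } catch {
    return undefined;
  }
}

const proxy = httpProxy.createProxyServer();

http
  .createServer((req, res) => {
    const tenant = tenantFromJwt(req.headers.authorization);
    const target = tenant !== undefined ? upstreams[tenant] : undefined;
    if (!target) {
      res.writeHead(401).end('Unknown or missing tenant claim');
      return;
    }
    // Route the request to the upstream that serves this tenant's VPC.
    proxy.web(req, res, { target });
  })
  .listen(8080);
```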

How to secure an API behind Kong Gateway for both public and internal traffic

We currently have multiple APIs that are not behind a gateway. The APIs that are exposed publicly use OpenID Connect for authentication and claims authorization. Some of the APIs are internal only and are network secured behind a firewall.
We plan to setup Kong Gateway Enterprise in front of our APIs. We should be able to centralize token validation from public clients at the gateway. We could possibly centralize some basic authorization as well (e.g. scopes). Some logic will probably still need to happen in the upstream API. So, those APIs will still need to know the context of the caller (client and user).
Ideally, we would like APIs that can be exposed publicly and also called internally, to avoid duplicating logic. I'd like to understand some secure approaches for making this happen with Kong. Exactly how to set up the system behind the gateway is still unclear to me.
Some questions I have are:
Should we have both an internal gateway and an external? Is there guidance on how to choose when to create separate gateways?
If we have multiple upstream services in a chain, how do you pass along the auth context?
Custom headers?
Pass along the original JWT?
How can we make a service securely respond to both internal and external calls?
We could set up a mesh and use mTLS, but wouldn't the method of passing the auth context be different between mTLS and the gateway?
We could set custom headers from Kong and have other internal services set them as well. But since this isn't in a JWT, aren't we losing authenticity of the claims?
We could have every caller, including internal services, get their own token, but that could make the number of clients and secrets difficult to manage. Plus, it doesn't handle the situation where those services are still acting on behalf of the user as part of an earlier request.
Or we could continue to keep separate internal and external services but duplicate some logic.
Some other possibly helpful notes:
There is no other existing PKI other than our OIDC provider.
Services will not all be containerized. Think IIS on EC2.
Services are mostly REST-ish.
There is a lot to unpack here, and the answer is: it depends.
Generally, you should expose the bare minimum API externally, so use a separate gateway in the DMZ with only the API endpoints required by external clients. You're generally going to be making more internal changes, so you don't want to expose a sensitive endpoint by accident.
Don't be too concerned about duplication when it comes to APIs; it's quite common to have multiple API gateways, and even egress gateways for external communication. There are patterns like BFF (backend for frontend) where each client has its own gateway for orchestration, security, routing, logging, and auditing. The more the clients are isolated from each other, the easier and less risky it is to make API changes.
Regarding propagation of the auth context, it really comes down to trust and how secure your network and internal actors are. If you're using a custom header, you have to consider the "confused deputy" problem. Using a signed JWT solves that, but if the token gets leaked it can be used maliciously against any service in the chain.
You can use RFC 8693 token exchange to mitigate that, and even combine it with mTLS, but again that could be overkill for your app. If the JWT is handled by an external client, it becomes even riskier. In that case the token should ideally be opaque and only accepted by the external-facing gateway, which can then exchange it for a new token for all internal communication.
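For reference, a minimal sketch of what an RFC 8693 exchange looks like on the wire, assuming your OIDC provider's token endpoint supports the token-exchange grant; the endpoint URL, client credentials, and audience below are placeholders.

```typescript
// Hedged sketch of an RFC 8693 token exchange: the external-facing gateway swaps the
// (possibly opaque) external token for a new JWT scoped to internal services. The
// token endpoint, client credentials, and audience are placeholders, and your OIDC
// provider must actually support the token-exchange grant type. Node 18+ assumed for
// the built-in fetch.
async function exchangeToken(externalToken: string): Promise<string> {
  const response = await fetch('https://idp.example.com/oauth2/token', {
    method: 'POST',
    headers: { 'Content-Type': 'application/x-www-form-urlencoded' },
    body: new URLSearchParams({
      grant_type: 'urn:ietf:params:oauth:grant-type:token-exchange',
      subject_token: externalToken,
      subject_token_type: 'urn:ietf:params:oauth:token-type:access_token',
      audience: 'internal-apis',        // placeholder audience for internal services
      client_id: 'kong-gateway',        // placeholder client registered at the IdP
      client_secret: process.env.IDP_CLIENT_SECRET ?? '',
    }),
  });
  if (!response.ok) {
    throw new Error(`Token exchange failed: ${response.status}`);
  }
  const body = (await response.json()) as { access_token: string };
  // Forward this token upstream instead of the original external token.
  return body.access_token;
}
```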

Auth between AWS API Gateway and Elastic Cloud hosted Elasticsearch

We're configuring an AWS API Gateway proxy in front of Elasticsearch deployed on Elastic Cloud (for throttling, usage plans, and various other reasons). In order to authenticate between the Gateway and ES, one idea is to configure an integration request on the API Gateway resource to add an Authorization header with creds created in ES. Is this the best strategy? It seems inferior to IAM roles, but that option isn't available as they're not accessible for the ES instance (Elastic Cloud hosts our deployment on AWS, but it's not a resource under our control). The API Gateway itself will require an API key.
I am not an expert at Elasticsearch, but it sounds like you want to securely forward a request from API Gateway to another REST web service. Because Elasticsearch is a REST web service external to AWS, you will not have access to IAM roles. I had a similar integration with another cloud REST service (not Elasticsearch) and will do my best to review the AWS tools that are available to complete the request.
One idea is to configure an integration request on the API Gateway resource to add an Authorization header with creds created in ES. Is this the best strategy?
This is the most straightforward strategy. In API Gateway, you can map custom headers in the Integration Request. This is where you would map your Authorization header for Elasticsearch.
Similarly, you can store the Authorization header value in a "stage variable", which makes it easier to maintain if the header changes across different Elasticsearch environments.
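If you define the API as code, both strategies boil down to an integration request parameter mapping. Here is a hedged AWS CDK (TypeScript) sketch under assumed names: the resource names, the Elastic Cloud URL, and the credential values are placeholders, and the commented line shows the hard-coded-header variant.

```typescript
// Hedged CDK sketch of an API Gateway HTTP proxy integration to Elasticsearch that
// adds an Authorization header, sourced either from a stage variable or a literal.
import { App, Stack } from 'aws-cdk-lib';
import * as apigateway from 'aws-cdk-lib/aws-apigateway';

class EsProxyStack extends Stack {
  constructor(scope: App, id: string) {
    super(scope, id);

    const api = new apigateway.RestApi(this, 'EsProxyApi', {
      deployOptions: {
        // Strategy 2: keep the credential in a stage variable per environment.
        variables: { esAuthHeader: 'Basic cmVwbGFjZS1tZQ==' }, // placeholder value
      },
    });

    const esIntegration = new apigateway.Integration({
      type: apigateway.IntegrationType.HTTP_PROXY,
      integrationHttpMethod: 'POST',
      uri: 'https://my-deployment.es.example.com/my-index/_search', // placeholder URL
      options: {
        requestParameters: {
          // Strategy 1 would hard-code a quoted literal instead:
          // 'integration.request.header.Authorization': "'Basic cmVwbGFjZS1tZQ=='",
          'integration.request.header.Authorization': 'stageVariables.esAuthHeader',
        },
      },
    });

    // The gateway itself still requires an API key, as in the question.
    api.root.addResource('search').addMethod('POST', esIntegration, {
      apiKeyRequired: true,
    });
  }
}

const app = new App();
new EsProxyStack(app, 'EsProxyStack');
```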
In both strategies, you are storing your Authorization Header in API Gateway. Since the request to Elasticsearch should be HTTPS, the data will be secure in transit. This thread has more information about storing credentials in API Gateway.
From MikeD@AWS: There are currently no known issues with using stage variables to manage credentials; however, stage variables were not explicitly designed to be a secure mechanism for credentials management. Like all API Gateway configuration information, stage variables are protected using standard AWS permissions and policies and they are encrypted when transmitted over the wire. Internally, stage variables are treated as confidential customer information.
I think this applies to your question. You can store the Authorization header in the API Gateway proxy; however, you have to acknowledge that API Gateway configuration information was not explicitly designed for sensitive information. That being said, there are no known issues with doing so. This approach is the most straightforward to configure if you are willing to accept that risk.
What is a more "AWS" Approach?
An "AWS" approach would be to use the services designed for the function. For example, using the Key Management Service to store your Elasticsearch Authorization Header.
Similarly to the tutorial referenced in the comments, you will want to forward your request from API Gateway to Lambda. You will be responsible for creating the HTTPS request to Elasticsearch in the language of your choice. There are several tutorials on this, but this is the official AWS documentation. AWS provides blueprints as templates to start a Lambda function; the https-request blueprint will work.
Once the request is being forwarded from API Gateway to Lambda, store the Elasticsearch Authorization header as a Lambda environment variable and implement environment variable encryption. This is a secure, recommended way to store sensitive data such as the Elasticsearch Authorization header.
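A hedged sketch of such a Lambda, assuming Node 18+ (for the built-in fetch) and placeholder names for the endpoint and environment variables; if you use the console's encryption helpers, you would additionally decrypt the value with KMS in code.

```typescript
// Hedged sketch of the Lambda that API Gateway forwards to: it calls the Elastic
// Cloud endpoint over HTTPS and takes the Authorization header from an (encrypted)
// environment variable. The endpoint URL and variable names are placeholders.
import type { APIGatewayProxyEvent, APIGatewayProxyResult } from 'aws-lambda';

export const handler = async (
  event: APIGatewayProxyEvent
): Promise<APIGatewayProxyResult> => {
  const esUrl = process.env.ES_ENDPOINT ?? 'https://my-deployment.es.example.com'; // placeholder
  const authHeader = process.env.ES_AUTH_HEADER; // e.g. "Basic <base64 creds>", stored encrypted

  if (!authHeader) {
    return { statusCode: 500, body: 'Elasticsearch credentials are not configured' };
  }

  // Forward the search request body from API Gateway to Elasticsearch.
  const response = await fetch(`${esUrl}/my-index/_search`, {
    method: 'POST',
    headers: {
      Authorization: authHeader,
      'Content-Type': 'application/json',
    },
    body: event.body ?? '{}',
  });

  return {
    statusCode: response.status,
    headers: { 'Content-Type': 'application/json' },
    body: await response.text(),
  };
};
```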
This approach will require more configuration but uses AWS services for intended purposes.
My opinion: I initially used the first approach (Authorization headers in API Gateway) to authenticate with a dev instance because it was quick and easy, but as I learned more I decided the second approach was more aligned with the AWS Well-Architected Framework.

WSO2 API Manager v1.8.0 - Clustering

I have a question on WSO2 API Manager clustering. I have gone through the deployment documentation in detail and understand the distributed deployment concept, wherein one can segregate the publisher, store, key manager, and gateway. But as per my assessment, that makes the deployment architecture pretty complex to maintain, so I would like a simpler deployment.
What I have tested is simply having two different instances of WSO2 API Manager running on two different boxes, pointing to the same underlying data sources in MySQL. What I have seen is that the API calls work perfectly, and the tokens obtained from one WSO2 instance work for API invocation on the other API Manager instance. The only issue with this model is that we need to deploy the APIs from the individual publisher components on as many WSO2 API Manager instances as are running. I am fine with that, since publishing will be done by one single small team. We will have a hardware load balancer in front, holding the API endpoint URLs and token endpoint URLs for both API Managers, and the hardware LB will do the load balancing.
So my question is: are there any problems in following this simple approach from the RUNTIME perspective? Does clustering add any benefit from the RUNTIME perspective for WSO2 API Manager?
Thank you.
Your approach has the following drawbacks (there may be more that I am not aware of):
It is not scalable, meaning you can't independently scale (add more instances of) the store, publisher, gateway, or key manager.
Distributed throttling won't work. It will lead to throttling inconsistencies, since throttling state isn't replicated if you don't enable clustering. Let's say you define a 'Gold' tier for an API: no matter how many gateway instances you are using, a user should be restricted to no more than 20 req/min for this API. This is implemented with a distributed counter (I'm not sure of the exact implementation details). If you don't enable clustering, one gateway node doesn't know how many requests the other gateway nodes have served, so each gateway node keeps its own throttle counter, meaning a user might be able to access your API at more than 20 req/min. That is one throttling inconsistency (a toy sketch at the end of this answer illustrates it). Further, suppose one gateway node has throttled a user out but the other has not: if your LB routes the request to the first gateway node, the user cannot access the API, but if it routes to the second, the user can. That is another throttling inconsistency. To overcome all these issues, you just need to replicate the throttling state across all gateway nodes by enabling clustering.
Distributed caching won't work. For example, API key validation information is cached. If you revoke a token on one API Manager node, the cache is cleared on that node, so a user can't use the revoked token via that node, BUT they can still use it via the other API Manager node until the cache is invalidated (15 minutes by default, I believe). This is just one instance where things can go wrong if you don't cluster your API Manager instances. To solve these issues, you just need to enable clustering, and the cache will be kept in sync across the cluster. Read this doc for more details on the various caches available in WSO2 API Manager.
You will have several issues if you don't have the above features. WSO2 highly recommends a distributed deployment in production.
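As mentioned under the throttling point, here is a toy TypeScript illustration (not WSO2 code) of why independent per-node counters break a shared 20 req/min limit.

```typescript
// Toy illustration of the throttling inconsistency: each un-clustered gateway keeps
// its own count, so a user spread across N gateways can get up to N * 20 requests.
class ThrottleCounter {
  private counts = new Map<string, number>();
  constructor(private readonly limitPerMinute: number) {}

  allow(user: string): boolean {
    const used = this.counts.get(user) ?? 0;
    if (used >= this.limitPerMinute) return false;
    this.counts.set(user, used + 1);
    return true;
  }
}

// Two independent (un-clustered) gateway nodes, each with its own counter.
const gateway1 = new ThrottleCounter(20);
const gateway2 = new ThrottleCounter(20);

let served = 0;
for (let i = 0; i < 40; i++) {
  // A round-robin load balancer alternates between the nodes.
  const node = i % 2 === 0 ? gateway1 : gateway2;
  if (node.allow('alice')) served++;
}
console.log(served); // 40 -- twice the intended 'Gold' tier limit of 20 req/min
```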