GCP Uptime Checks to check private / internal endpoints

I was wondering if anyone else has come across this problem, and if so, what solution was chosen.
We currently use GCP Uptime Checks for our internet-facing endpoints; however, we also want to use them to check non-internet-facing endpoints (e.g. endpoints locked down by a security group).
Already considered:
Whitelist the GCP uptime check IPs - dismissed because we do not want to maintain a large set of IPs, which may change over time, on our security groups.
Use GCP's new private uptime checks - these only support checking endpoints residing in GCP (we also need to check private endpoints in AWS).
Set up an NGINX reverse proxy VM, which we can lock down to the GCP uptime check IPs in one place rather than many; our services can then just whitelist the proxy VM.
Try to achieve the same logic using the NGINX Ingress Controller in GKE.
Has anyone else faced this problem?
Thanks in advance!

You could create private uptime checks for your GCP endpoints (a sketch follows the steps below) and use an AWS connector project to import the metrics from your AWS endpoints.
Follow these steps provided in the Quickstart for Amazon EC2 guide to monitor your AWS endpoints:
Create a Google Cloud project
Connect an AWS account
Authorize AWS applications
Use Monitoring services with AWS
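For the GCP side, here is a minimal, hedged sketch of creating an uptime check with the google-cloud-monitoring client. The project ID, host, and path are placeholders, and a true private uptime check additionally requires the target to be registered in Service Directory and referenced as the monitored resource; this sketch only shows the basic config shape.

```python
# Hedged sketch: creates a basic HTTP uptime check via the Cloud Monitoring API.
# project_id, host and path are placeholders; private checks need extra
# Service Directory configuration not shown here.
from google.cloud import monitoring_v3


def create_uptime_check(project_id: str, host: str) -> None:
    client = monitoring_v3.UptimeCheckServiceClient()
    config = monitoring_v3.UptimeCheckConfig(
        display_name="internal-endpoint-check",
        # "uptime_url" is the resource type for plain HTTP(S) URL checks.
        monitored_resource={
            "type": "uptime_url",
            "labels": {"project_id": project_id, "host": host},
        },
        http_check=monitoring_v3.UptimeCheckConfig.HttpCheck(
            path="/healthz", port=443, use_ssl=True
        ),
        timeout={"seconds": 10},
        period={"seconds": 60},
    )
    created = client.create_uptime_check_config(
        request={
            "parent": f"projects/{project_id}",
            "uptime_check_config": config,
        }
    )
    print(f"Created uptime check: {created.name}")
```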

Related

Is Serverless VPC Access connector a single point of failure?

I want to connect my Cloud Run app to a Cloud SQL for PostgreSQL instance without assigning the instance a public IP. It seems like the only way to do this is with a Serverless VPC Access connector.
The docs indicate that the Serverless VPC Access connector is billed as 1 e2-micro instance per 100Mbps. Does this indicate that the connector is simply a single e2-micro VM? Is there any redundancy/automated-failover configured behind the scenes?
I can't find any SLA for the Serverless VPC Access and am worried that it could be a single point of failure for my app that brings down all DB connections.
The VPC Access connector is backed by Compute Engine instances privately managed by Google Cloud. You are billed per 100 Mbps of capacity. The connector can scale up but not back down. Is this a single point of failure? Yes, but the service will auto-recover. Fault tolerance, recovery time, and an SLA are not published (AFAIK).
Additional information:
The images for the VPC Access Connector instances are from the project serverless-vpc-access-image.
These instances use RFC 1918 addresses that must not overlap with your VPC subnets.
These instances are essentially NAT gateways and require that IP forwarding be allowed (the constraints/compute.vmCanIPForward organization policy).
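As a side note, if you want to see the connector in action rather than reason about it, a minimal sketch (assumed names, not part of the question) is a small probe you could run from the Cloud Run service against the Cloud SQL instance's private IP:

```python
# Hedged sketch: plain TCP probe from Cloud Run to the Cloud SQL private IP.
# 10.20.0.3 and 5432 are placeholders for your instance's private address and
# PostgreSQL port; success only proves routing through the connector works.
import socket


def probe(host: str = "10.20.0.3", port: int = 5432, timeout: float = 3.0) -> bool:
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print("reachable" if probe() else "unreachable")
```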

Why do VPC endpoints not support Amazon RDS?

I want to execute AWS CLI commands for RDS via a VPC network rather than over the internet, mainly to create manual RDS snapshots.
However, VPC endpoints only support the RDS Data API, according to the following document:
VPC endpoints - Amazon Virtual Private Cloud
Why? I need to execute the command within a closed network to satisfy our security rules.
Just to reiterate: you can still connect to your RDS database over the normal private network, using whichever library you choose, to run any DDL, DML, DCL, and TCL commands. In your case, though, you want to create a snapshot, which goes via the service endpoint.
VPC endpoints connect to the service APIs that power AWS (think of the interactions you perform in the console, SDK, or CLI). At the moment this means that to create, modify, or delete RDS resources you need to use the API over the public internet (with HTTPS for encrypted traffic).
VPC endpoints are added over time; just because a specific API is not supported now does not mean it never will be. The team behind each AWS service has to carry out an integration to make VPC endpoints work.
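For context, the snapshot operation in question is a management-plane call like the following boto3 sketch (identifiers and region are placeholders); it goes to the RDS service endpoint over HTTPS rather than over the database's private connection:

```python
# Hedged sketch: creating a manual RDS snapshot via the service API with boto3.
# This call travels over the public RDS service endpoint (HTTPS), not the
# private network path used for SQL connections.
import boto3

rds = boto3.client("rds", region_name="ap-northeast-1")  # placeholder region

response = rds.create_db_snapshot(
    DBInstanceIdentifier="my-db-instance",          # placeholder
    DBSnapshotIdentifier="my-manual-snapshot-001",  # placeholder
)
print(response["DBSnapshot"]["Status"])  # typically "creating" at first
```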

How to connect AppEngine Standard Gen2 to local resource using Serverless VPC and Cloud VPN?

I have a project set up where I can connect to a local resource from App Engine Flexible instances launched on a VPC network that has a Cloud VPN connection to my local firewall.
With the release of Serverless VPC Access for the us-east1 region, I wanted to change my setup to use App Engine Standard Gen2 instances instead of Flexible for the cost savings. I set up a Serverless VPC connector for the region/network my App Engine app is hosted on and my Cloud VPN connection is configured for, updated my app.yaml accordingly, and pushed a new version.
I keep getting timeout errors for the new version that is trying to use Serverless VPC to connect to my local resource.
Some context:
The VPC network is named "portal" and set to "Auto" mode (automatic creation of subnets for each region)
Cloud VPN is setup as a Classic VPN in the "portal" network with Route-based routing in the us-east1 region, connecting to my remote local 192.168.11.0/24 subnet.
A route exists on the VPC network for destinations 192.168.11.0/24 to use the Cloud VPN I have setup as the next hop (automatically created)
With the above, App Engine Flexible deployments on the "portal" network can connect to my local resource, as can any other Compute Engine VM on the "portal" network
I set up the Serverless VPC connector in the us-east1 region with the subnet 10.8.0.0/28
I'm not too clear on how Serverless VPC Access works, so I'm not sure how to even begin troubleshooting. When I click on the route rule for the 192.168.11.0/24 destination, I can see the App Engine Flexible instances listed along with some "serverless-vpc-access" tagged instances that appear to be on a different subnetwork but use 10.8.0.0/28 IPs.
Should this configuration be working? If not, what changes do I need to make in order to support this?
Your problem is most likely caused by static routing. Do you have a route for return traffic coming from your VPN back to the VPC connector? Look at the routes defined for the VPN.
The purpose of a Serverless VPC connector is to allow connections from App Engine Standard to your VPC network, since the App Engine Standard environment is hosted and managed by Google and is not part of your VPC network.
More details can be found here: https://cloud.google.com/vpc/docs/configure-serverless-vpc-access
That being said, you should verify the following:
Make sure that you've added the new subnet (/28) to your local on-premises routes, with your VPN gateway as the next hop. Since you're using route-based routing, there is nothing to do regarding the traffic selectors on the VPN.
Make sure your local firewall is configured to accept traffic to and from the new connector range (/28).
While this probably won't apply to you, I just wanted to point out that communication through the Serverless VPC connector back to the App Engine Standard environment is only possible on the same TCP connection that App Engine originally opened (TCP established).
Your configuration as described is definitely achievable. As mentioned, there are only a few things you need to verify to make sure it works (see the sketch below).
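If it helps with troubleshooting, here is a minimal sketch (placeholder IP and port, not part of the original setup) of a debug handler you could add to the App Engine Standard service to confirm whether traffic makes it through the connector and the VPN to the on-prem resource:

```python
# Hedged sketch: Flask debug route for App Engine Standard that opens a plain
# TCP connection to an on-prem host through the Serverless VPC connector.
# 192.168.11.10:443 is a placeholder for the local resource.
import socket

from flask import Flask

app = Flask(__name__)


@app.route("/debug/vpn-probe")
def vpn_probe():
    try:
        with socket.create_connection(("192.168.11.10", 443), timeout=5):
            return "connected through connector and VPN", 200
    except OSError as exc:
        # A timeout here usually means the return route from the on-prem side
        # back to the 10.8.0.0/28 connector range is missing or filtered.
        return f"failed: {exc}", 504
```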

AWS & Azure Hybrid Cloud Setup - is this configuration at all possible (Azure Load Balancer -> AWS VM)?

We have all of our cloud assets currently inside Azure, which includes a Service Fabric cluster containing many applications and services that communicate with Azure VMs through Azure Load Balancers. The VMs have both public and private IPs, and the load balancers' frontend IP configurations point to the private IPs of the VMs.
What I need to do is move my VMs to AWS. Service Fabric has to stay put on Azure, though. I don't know if this is possible or not. The Service Fabric services communicate with the Azure VMs through the load balancers using the VMs' private IP addresses. So the only ways I could see achieving this are:
Keep the load balancers in Azure and direct the traffic from them to AWS VMs.
Point Azure Service Fabric to AWS load balancers.
I don't know if either of the above are technologically possible.
For #1, if I used Azure's load balancing, I believe the load balancer frontend IP config would have to use the public IP of the AWS VM, right? Is that not less secure? If I set it up to go through a VPN (if even possible), is that as secure as using internal private IPs as in the current load balancer config?
For #2, again, not sure if this is technologically achievable - can we even have Service Fabric Services "talk" to AWS load balancers? If so, what is the most secure way to achieve this?
I'm not new to the cloud engineering game, but very new to the idea of using two cloud services as a hybrid solution. Any thoughts would be appreciated.
As far as I know, creating a multi-region / multi-datacenter cluster in Service Fabric is possible.
Here is a brief list of requirements to get an initial sense of how this would work, and here is a sample (not approved by Microsoft) of a cross-region Service Fabric cluster configuration. (I know these are different regions in Azure rather than different cloud providers, but the sample can be useful to see how some of the pieces are configured.)
Hope this helps.
Based on the details provided in the comments on your own question:
SF is cloud-agnostic; you could deploy your entire cluster without any dependencies on Azure at all.
The cluster you see in your Azure portal is just an Azure resource screen used to describe the details of your cluster.
You are better off creating the entire cluster in AWS than taking the requested approach, because in the end the only thing left in Azure would be that resource screen.
Extending Oleg's answer, "creating a multi-region / multi-datacenter cluster in Service Fabric is possible." I would add that it is also possible to create an Azure-agnostic cluster that you can host on AWS, Google Cloud, or on premises.
The detail that is not made clear is that any option not hosted in Azure requires an extra level of management, because you have to manage the resources yourself (VMs, load balancers, autoscaling, OS updates, and so on) to keep the cluster updated and running.
Also, multi-region and multi-zone clusters were left aside for a long time on the SF roadmap because they are very complex to do, which is why Microsoft avoids recommending them, but they are possible.
If you want to go with the AWS approach, I'd point you to this tutorial: Create AWS infrastructure to host a Service Fabric cluster
This is the first of a four-part tutorial with guidance on how to set up an SF cluster on AWS infrastructure.
Regarding the other resources hosted on Azure, you can still access them from AWS without any problems.

How to expose API endpoints from a private AWS ALB

We have several microservices on AWS ECS. We have a single ALB with a different target group for each microservice. We want to expose some endpoints externally, while others are just for internal communication.
The problem is that if we put our load balancer in a public subnet, we expose all registered endpoints externally. If we move the load balancer to a private subnet, we have to use some sort of proxy in the public subnet, which requires additional infrastructure/cost and a custom implementation of security concerns like DDoS protection.
What possible approaches can we take, or does AWS provide some sort of out-of-the-box solution for this?
I would strongly recommend running two ALBs for this. Sure, it will cost more (not double, because the traffic costs won't be doubled), but it's much more straightforward to have an internal load balancer and an external load balancer. Work hours cost money too! Running two ALBs will be the least admin work and probably the cheapest overall.
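To illustrate how little differs between the two, here is a hedged boto3 sketch (subnet and security-group IDs are placeholders); the only functional difference is the Scheme:

```python
# Hedged sketch: creating an internet-facing and an internal ALB with boto3.
import boto3

elbv2 = boto3.client("elbv2")

common = {
    "Subnets": ["subnet-aaa111", "subnet-bbb222"],  # placeholders
    "SecurityGroups": ["sg-0123456789abcdef0"],     # placeholder
    "Type": "application",
}

external = elbv2.create_load_balancer(
    Name="api-external", Scheme="internet-facing", **common
)
internal = elbv2.create_load_balancer(
    Name="api-internal", Scheme="internal", **common
)

print(external["LoadBalancers"][0]["DNSName"])
print(internal["LoadBalancers"][0]["DNSName"])
```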
Check out WAF. It stands for Web Application Firewall and is available as an AWS service. Follow these steps as guidance (a sketch follows the list):
Create a WAF ACL.
Add "String and regex matching" condition for your private endpoints.
Add "IP addresses" condition for your IP list/range that are allowed to access private endpoints.
Create a rule in your ACL to Allow access if both conditions above are met.
Assign ALB to your WAF ACL.
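The steps above describe classic WAF conditions; here is a hedged sketch of the same idea with the newer WAFv2 API (names, the /internal path prefix, the CIDR, and the ALB ARN are placeholders). It inverts the rule so the default action can stay Allow: requests to the private path prefix are blocked unless the source IP is in the allow list.

```python
# Hedged sketch using the WAFv2 API: block /internal* paths unless the caller
# is in an allowed IP set, then attach the web ACL to the internet-facing ALB.
import boto3

wafv2 = boto3.client("wafv2", region_name="us-east-1")  # placeholder region

ip_set = wafv2.create_ip_set(
    Name="private-endpoint-callers",
    Scope="REGIONAL",                 # REGIONAL scope is what ALBs use
    IPAddressVersion="IPV4",
    Addresses=["10.0.0.0/16"],        # placeholder allow list
)["Summary"]

acl = wafv2.create_web_acl(
    Name="private-endpoint-acl",
    Scope="REGIONAL",
    DefaultAction={"Allow": {}},      # public endpoints stay reachable
    VisibilityConfig={
        "SampledRequestsEnabled": True,
        "CloudWatchMetricsEnabled": True,
        "MetricName": "private-endpoint-acl",
    },
    Rules=[{
        "Name": "block-private-paths-from-outside",
        "Priority": 0,
        "Action": {"Block": {}},
        "VisibilityConfig": {
            "SampledRequestsEnabled": True,
            "CloudWatchMetricsEnabled": True,
            "MetricName": "block-private-paths",
        },
        "Statement": {
            "AndStatement": {
                "Statements": [
                    {   # path starts with the private prefix...
                        "ByteMatchStatement": {
                            "SearchString": b"/internal",
                            "FieldToMatch": {"UriPath": {}},
                            "PositionalConstraint": "STARTS_WITH",
                            "TextTransformations": [{"Priority": 0, "Type": "NONE"}],
                        }
                    },
                    {   # ...and the source IP is NOT in the allow list
                        "NotStatement": {
                            "Statement": {
                                "IPSetReferenceStatement": {"ARN": ip_set["ARN"]}
                            }
                        }
                    },
                ]
            }
        },
    }],
)["Summary"]

wafv2.associate_web_acl(
    WebACLArn=acl["ARN"],
    ResourceArn="arn:aws:elasticloadbalancing:...:loadbalancer/app/my-alb/abc",  # placeholder
)
```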
UPDATE:
In this case you have to use an external-facing ALB in a public subnet, as mentioned by Dan Farrell in the comment below.
I would suggest doing it like this:
one internal ALB
one target group per microservice, as limited by ECS.
one Network Load Balancer (NLB) with one IP-based target group.
The IP-based target group will contain the internal ALB's IP addresses. Since the ALB's private IP addresses are not static, you will need to set up a CloudWatch scheduled rule with this Lambda function (forked from the AWS documentation and modified to work on public endpoints as well; a simplified sketch follows the link):
https://github.com/talal-shobaita/populate-nlb-tg-withalb/
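As a rough idea of what that Lambda does (a simplified, assumption-laden sketch, not the linked code; the DNS name and target group ARN are placeholders):

```python
# Hedged sketch: resolve the internal ALB's current private IPs and register
# them in the NLB's IP-based target group. A real version, like the linked
# Lambda, would also deregister IPs that have disappeared.
import socket

import boto3

ALB_DNS_NAME = "internal-my-alb-123456.eu-west-1.elb.amazonaws.com"            # placeholder
TARGET_GROUP_ARN = "arn:aws:elasticloadbalancing:...:targetgroup/alb-ips/abc"  # placeholder


def handler(event, context):
    elbv2 = boto3.client("elbv2")
    # The ALB's nodes (and thus its private IPs) change over time, which is
    # why this runs on a schedule rather than once.
    _, _, alb_ips = socket.gethostbyname_ex(ALB_DNS_NAME)
    elbv2.register_targets(
        TargetGroupArn=TARGET_GROUP_ARN,
        Targets=[{"Id": ip, "Port": 80} for ip in alb_ips],
    )
    return {"registered": alb_ips}
```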
Both ALB and NLB are scalable and protected from DDoS by AWS; AWS WAF is another great tool that can be attached directly to your ALB listener for extended protection.
Alternatively, you can wait for AWS to support registering multiple target groups per service; it is already on their roadmap:
https://github.com/aws/containers-roadmap/issues/104
This is how we eventually solved it.
Two LBs: one in a private and one in a public subnet.
Some APIs are meant to be public, so they are exposed directly through the public LB.
For the private API endpoints that need to be exposed, we added a proxy behind the public LB and routed those particular paths from the public LB to the private LB through that proxy.
These days API Gateway is the best way to do this. You can have your API serve a number of different endpoints while exposing only the public ones via API Gateway, which proxies back to the API.
I don't see it mentioned yet, so I'll note that we use AWS Cloud Map for internal routing and an ALB for "external" (in our case simply intra/inter-VPC) communication. I didn't read it in depth, but I think this article describes it.
AWS Cloud Map is a managed solution that lets you map logical names to the components/resources for an application. It allows applications to discover the resources using one of the AWS SDKs, RESTful API calls, or DNS queries. AWS Cloud Map serves registered resources, which can be Amazon DynamoDB tables, Amazon Simple Queue Service (SQS) queues, any higher-level application services that are built using EC2 instances or ECS tasks, or using a serverless stack.
...
Amazon ECS is tightly integrated with AWS Cloud Map to enable service discovery for compute workloads running in ECS. When you enable service discovery for ECS services, it automatically keeps track of all task instances in AWS Cloud Map.
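For a concrete picture of the internal-routing side, here is a hedged boto3 sketch of a caller discovering another service by name through Cloud Map (namespace and service names are placeholders):

```python
# Hedged sketch: service discovery lookup against AWS Cloud Map.
import boto3

sd = boto3.client("servicediscovery")

resp = sd.discover_instances(
    NamespaceName="internal.local",   # placeholder namespace
    ServiceName="orders-service",     # placeholder service
    HealthStatus="HEALTHY",           # only instances ECS reports as healthy
)
for inst in resp["Instances"]:
    attrs = inst["Attributes"]
    print(attrs.get("AWS_INSTANCE_IPV4"), attrs.get("AWS_INSTANCE_PORT"))
```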
You want to look at AWS Security Groups.
A security group acts as a virtual firewall for your instance to control inbound and outbound traffic.
For each security group, you add rules that control the inbound traffic to instances, and a separate set of rules that control the outbound traffic.
Even more specific to your use-case though might be their doc on ELB Security Groups. These are, as you may expect, security groups that are applied at the ELB level rather than the Instance level.
Using security groups, you can specify who has access to which endpoints.
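For example, here is a hedged boto3 sketch of restricting the (internal) load balancer's security group so only a trusted CIDR can reach the non-public endpoints (the group ID and CIDR are placeholders):

```python
# Hedged sketch: allow HTTPS to the load balancer's security group only from
# an internal CIDR range.
import boto3

ec2 = boto3.client("ec2")

ec2.authorize_security_group_ingress(
    GroupId="sg-0123456789abcdef0",   # security group attached to the ELB/ALB (placeholder)
    IpPermissions=[{
        "IpProtocol": "tcp",
        "FromPort": 443,
        "ToPort": 443,
        "IpRanges": [{"CidrIp": "10.0.0.0/16", "Description": "internal callers only"}],
    }],
)
```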