Vertex AI Managed Notebook, get subnet/IP - google-cloud-platform

How can I find the IP for a Vertex AI managed notebook instance? The service differs from user-managed notebooks in a certain sense: creating an instance doesn't create a Compute Engine instance in my project, so it's all managed by Google.
My goal is to whitelist a set of IPs in MongoDB Atlas, the set being the IPs of all the notebooks in that region. I'm using Google-managed networks in this case.
I have a few doubts here:
Since within a managed notebook I can change the CPU allocation, will this re-instantiate a new cluster with an entirely new IP, or will it be one from among a group of IPs?
Is it possible to add a custom init script?

If you want to connect to a database service on GCP, create a network (or use the default), instantiate the notebook using this network (under Advanced options), and create the whitelist for this entire network. This is required because the managed notebook creates a peering on the network you use; you can check this under VPC Network ➞ VPC Network Peering.
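A quick way to check that peering (a hedged sketch; default is just an example, use the network you picked for the notebook):
gcloud compute networks peerings list --network=default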
If you want an external IP, it will not work. Google-managed notebooks do not use external IPs; they access the internet via NAT gateways (regardless of whether you use a Google-managed or your own network), so you will not be able to do what you want. Move to user-managed notebooks (where you can assign a fixed external IP), or whitelist any IP on your MongoDB service if you are not in a production environment.
About your doubts:
Since within a managed notebook I can change the CPU allocation, will this re-instantiate a new cluster with an entirely new IP, or will it be one from among a group of IPs?
For the internal network, the IP may change when you restart or recreate the notebook instance. An external IP does not exist, as explained above.
Is it possible to add a custom init script?
Basically, no. But you can provide a custom Docker image for the notebook.

Related

SageMaker: Unable to create network interface because subnet 'subnet-xxxx' does not have enough free addresses to satisfy the request

When you set up SageMaker you specify the VPC that it runs in, and any corresponding subnets. If no subnets are specified it uses 2 by default.
But during the course of architecture creation it's easy to have different resources use the same subnets, causing errors such as this:
Failed to change to instance xx.8xlarge Failed to launch app [xxxx]: LimitExceededError: Unable to create network interface because subnet 'subnet-xxxx' does not have enough free addresses to satisfy the request. Free up addresses or add more addresses for the subnet to use, or create a new domain with a new subnet.
It would be nice to change the subnets that SageMaker uses without having to tear down the entire setup and start over. But the only documentation I can see on configuring the VPC/subnets for SageMaker is in the setup stage.
So, what is SageMaker’s relationship to subnets, where is this configured, and can this be modified after deployment?
For both SageMaker notebook instances and Studio, you can specify a VPC in which you would like the instances to run. When you run a notebook inside a VPC, you need interface endpoints to the SageMaker API, the SageMaker runtime API, and any additional resources you might use (S3, ECR, CloudWatch, etc.).
These are associated with your VPC and the subnets you specify. For that reason, it is currently not possible to update the VPC/subnet configuration dynamically; it has to be done at setup. The only way to update it is to tear down your current notebooks (or domain) and create a new notebook (or domain) with a different configuration.
There's more documentation here and here. Note that your training jobs, processing jobs etc. will also need IP addresses to run in the VPC.
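As an illustration of the kind of interface endpoint involved (a hedged sketch, not a command taken from the question; the VPC, subnet, security group IDs and region are placeholders):
# Hypothetical example: create an interface endpoint for the SageMaker API inside the VPC
aws ec2 create-vpc-endpoint \
  --vpc-id vpc-xxxx \
  --vpc-endpoint-type Interface \
  --service-name com.amazonaws.us-east-1.sagemaker.api \
  --subnet-ids subnet-aaaa subnet-bbbb \
  --security-group-ids sg-xxxx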

Alternative to AWS's Security groups in GCP?

Is there an alternative to AWS's security groups in the Google Cloud Platform?
Following is the situation which I have:
A Basic Node.js server running in Cloud Run as a docker image.
A Postgres SQL database at GCP.
A Redis instance at GCP.
What I want to do is make a 'security group' of sorts so that my PostgreSQL DB and Redis instance can only be accessed from my Node.js server and nowhere else. I don't want them to be publicly accessible via an IP.
What we do in AWS is that only services that are part of a security group can access each other.
I'm not very sure but I guess in GCP I need to make use of Firewall rules (not sure at all).
If I'm correct could someone please guide me as to how to go about this? And if I'm wrong could someone suggest the correct method?
GCP has firewall rules for its VPCs that work similarly to AWS Security Groups. More details can be found here. You can place your PostgreSQL database, Redis instance and Node.js server inside a GCP VPC.
Make the Node.js server available to the public via DNS.
Set the default-allow-internal rule, so that only the services inside the VPC can access each other (blocking public access to the DB and Redis).
As an alternative approach, you may also keep all three servers public and only allow the Node.js server's IP address to access the DB and Redis servers, but the above solution is recommended.
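As a rough illustration of such a rule (a hedged sketch; my-vpc is a placeholder and the source range should match your subnets):
# Hypothetical example: only allow traffic originating inside the VPC's subnet range
gcloud compute firewall-rules create allow-internal \
  --network=my-vpc \
  --allow=tcp,udp,icmp \
  --source-ranges=10.0.0.0/24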
Security groups inside AWS are instance-attached firewall-like components. So for example, you can have a SG on an instance level, similar to configuring IP-tables on regular Linux.
On the other hand, Google firewall rules operate more at the network level. I guess, for the same level of "granularity", Security Groups would map to instance-level controls, so your alternatives are to use one of the following:
firewalld
nftables
iptables
The thing is that in AWS you can also attach security groups to subnets. SGs attached to subnets are also somewhat similar to Google firewall rules; still, security groups provide a bit more granularity, since you can have different security groups per subnet, while in GCP you need a firewall rule per network. At that level, protection should come from firewalls in the subnets.
Thanks @amsh for the solution to the problem. But there were a few more things that needed to be done, so I'll list them here in case anyone needs them in the future (a rough gcloud sketch follows the steps):
Create a VPC network and add a subnet for a particular region (e.g. us-central1).
Create a VPC connector from the Serverless VPC Access section for the created VPC network in the same region.
In Cloud Run add the created VPC connector in the Connection section.
Create the PostgreSQL and Redis instance in the same region as that of the created VPC network.
In the Private IP section of these instances, select the created VPC network. This will create a Private IP for the respective instances in the region of the created VPC network.
Use this Private IP in the Node.js server to connect to the instance and it'll be good to go.
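A hedged gcloud sketch of the steps above (my-vpc, my-subnet, my-connector, my-service, my-image and my-redis are placeholder names, and the IP ranges are only examples):
# 1. VPC network and a subnet in us-central1
gcloud compute networks create my-vpc --subnet-mode=custom
gcloud compute networks subnets create my-subnet \
  --network=my-vpc --region=us-central1 --range=10.0.0.0/24
# 2. Serverless VPC Access connector (its range must not overlap the VPC's subnets)
gcloud compute networks vpc-access connectors create my-connector \
  --network=my-vpc --region=us-central1 --range=10.8.0.0/28
# 3. Attach the connector to the Cloud Run service
gcloud run deploy my-service --image=gcr.io/my-project/my-image \
  --region=us-central1 --vpc-connector=my-connector
# 4./5. Redis instance with a private IP on the same network and region
gcloud redis instances create my-redis --region=us-central1 --network=my-vpc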
Common Problems you might face:
Error while creating the VPC Connector: Ensure the IP range of the VPC connector and the VPC network do not overlap.
Different regions: Ensure all instances are in the same region of the VPC network, else they won't connect via the Private IP.
Avoid changing the firewall rules: don't change the firewall rules unless you actually need them to behave differently than they normally do.
Instances in different regions: If the instances are spread across different regions, use VPC network peering to establish a connection between them.

Need Variable Numbers of Isolated Docker-based Applications with Distinct Public IPs from AWS

Does anyone know the correct AWS services needed to launch multiple instances of a Docker image on unique publicly accessible IP addresses?
Every path I have tried with Amazon's ECS seems to be set up for scaling instances locked away in a private network and / or behind a single IP.
The container has instances of a web application running on port 8080, but ideally the end user will connect via port 80.
The objective is to be able to launch around 20 identical copies of the container at once, with each accessible via its own public IP.
There is no need for the public IP to be known in advance, as on startup, I patch the data as needed with the current IP address.
The containers live in Amazon's ECR, and there are a couple of unique instances running on standalone EC2 machines. I was trying to use ECS to launch multiple instances at will, but I can successfully launch only one at a time before getting errors about conflicting ports, because things are not isolated enough.
You can do this with ECS:
Change your task definition to use the awsvpc networking mode.
Change your service network configuration to auto-assign a public IP.
If you're deploying onto EC2 instances, I think you may be limited in the number of either network interfaces or public IP addresses that you can use. Fargate does not have this restriction.
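A hedged sketch of the service side with the AWS CLI (the cluster, task definition, subnet and security group are placeholders; the task definition is assumed to already use the awsvpc network mode, and this assumes Fargate):
# Hypothetical example: run 20 copies, each task getting its own public IP
aws ecs create-service \
  --cluster my-cluster \
  --service-name my-service \
  --task-definition my-task \
  --desired-count 20 \
  --launch-type FARGATE \
  --network-configuration "awsvpcConfiguration={subnets=[subnet-xxxx],securityGroups=[sg-xxxx],assignPublicIp=ENABLED}"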

cannot connect to Redis Instance in GCP

I created an instance on GCP, but I am not able to access it.
This is similar to this one, but the proposed solution isn't working for me:
Unable to telnet to GCP MemoryStore
I have tried to telnet to it. I am in the same project and region, but apparently I need to be in the same network since it's a private IP. But what if you want to connect using Cloud Shell? Also, how would an application running on my local machine access it?
I also included a firewall rule to make sure incoming connections are allowed.
To connect a client to a Cloud Memorystore for Redis instance, the client and the instance must be located in the same region, in the same project and in the same VPC network. Please check the “Networking” document, where you'll find information on basic network settings, limited and unsupported networks, network peering, and IP address ranges.
You can connect to Redis from different GCP products like a Compute Engine VM, a Google Kubernetes Engine cluster or a Google Kubernetes Engine pod, but you can't connect directly from Cloud Shell or from your local machine, since they are not in your VPC network.
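As a quick connectivity test from inside the VPC (a hedged sketch; my-vm, the zone and 10.0.0.3 are placeholders for a VM on the authorized network and the Redis instance's private IP):
# SSH into a Compute Engine VM that is on the instance's authorized VPC network
gcloud compute ssh my-vm --zone=us-central1-a
# From inside the VM, test the connection to the instance's private IP and port
telnet 10.0.0.3 6379
# or, if redis-cli is installed:
redis-cli -h 10.0.0.3 -p 6379 ping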
It may also have to do with a missing peering connection to your network. Check in your console at https://console.cloud.google.com/networking/peering/ to see if the peering is set up properly.
If you are using Terraform, you can use the following docs: https://www.terraform.io/docs/providers/google/r/redis_instance.html

Permanently binding static IP to preemptible google cloud VM

For our project we need a static IP binding to our Google Cloud VM instance due to IP whitelisting.
Since it's a managed group preemptible, the VM will terminate once in a while.
However, when it terminates I see compute.instances.preempted in the operations log, directly followed by compute.instances.repair.recreateInstance with the note:
Instance Group Manager 'xxx' initiated recreateInstance on instance 'xxx'. Reason: instance's intent is RUNNING but instance's status is STOPPING.
After that, a delete and an insert operation follow in order to restore the instance.
The documentation states:
You can simulate an instance preemption by stopping the instance.
In which case the IP address will stay attached when the VM is started again.
A) So my question: is it possible to have the instance group manager stop and start the VM in the event of preemption, instead of recreating it? Recreating means that the static IP is detached and needs to be manually attached each time.
B) If option A is not possible, how can I attach the static IP address automatically so that I don't have to attach it manually when the VM is recreated? I'd rather not have an extra NAT VM instance to take care of this problem.
Thanks in advance!
I figured out a workaround to this (specifically, keeping a static IP address assigned to a preemptible VM instance between recreations), with the caveat that your managed instance group has the following properties:
Not autoscaling.
Max group size of 1 (i.e. there is only ever meant to be one VM in this group).
Autohealing is the default (i.e. VMs are only recreated after they are terminated).
The steps you need to follow are:
Reserve a static IP.
Create an instance template, configured as preemptible.
Create your managed group, assigning your template to the group.
Wait for the group to spin up your VM.
After the VM has spun up, assign the static IP that you reserved in step 1 to the VM.
Create a new instance template derived from the VM instance via gcloud (see https://cloud.google.com/compute/docs/instance-templates/create-instance-templates#gcloud_1).
View the newly created instance template in the Console, and note that you see your external IP assigned to the template.
Update the MiG (Managed Instance Group) to use the new template, created in step 6.
Perform a proactive rolling update on the MiG using the Replace method.
Confirm that your VM was recreated with the same name, the disks were preserved (or not, depending on how you configured the disks in your original template), and the VM has maintained its IP address.
Regarding step 6, my gcloud command looked like this:
gcloud compute instance-templates create vm-template-with-static-ip \
  --source-instance=source-vm-id \
  --source-instance-zone=us-east4-c
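For steps 1 and 5, the equivalent commands would look roughly like this (a hedged sketch; my-static-ip and my-vm are placeholders, and the access-config name assumes the default "External NAT"):
# Step 1: reserve a regional static external IP
gcloud compute addresses create my-static-ip --region=us-east4
# Step 5: swap the VM's ephemeral external IP for the reserved one
gcloud compute instances delete-access-config my-vm \
  --zone=us-east4-c --access-config-name="External NAT"
gcloud compute instances add-access-config my-vm \
  --zone=us-east4-c \
  --address=$(gcloud compute addresses describe my-static-ip \
    --region=us-east4 --format='value(address)')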
It almost goes without saying that this sort of setup is only useful if you want to:
Minimize your costs by using a single preemptible VM.
Not have to deal with the hassle of turning a VM back on after it's been preempted, ensuring as much uptime as possible.
If you don't mind turning the VM back on manually (and possibly not being aware it's been shut down for who knows how long) after it has been preempted, then do yourself a favor: don't bother with the MiG and just stand up the single VM.
Answering your questions:
(A) It is not possible at the moment, and I am not sure if it will ever be possible. By design preemptible VMs are deleted to make space for normal VMs (if there are capacity constraints in the given zone) or regularly to differentiate them from normal VMs. In the latter case preemption might seem like a start/stop event, but in the former it may take a substantial amount of time before the VM is recreated.
(B) At the moment there is no good way to achieve this in general.
If you have a special case where your group has only one instance, you can hardcode the IP address in the instance template.
Otherwise, at the moment the only solution I can think of (other than using a load balancer) is to write a startup script that would attach the NAT IP.
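A hedged sketch of such a startup script (STATIC_IP is a placeholder for your reserved address, the access-config name assumes the default "External NAT", and the VM's service account needs permission to modify instances):
#!/bin/bash
# Look up this instance's name and zone from the metadata server
NAME=$(curl -s -H "Metadata-Flavor: Google" \
  "http://metadata.google.internal/computeMetadata/v1/instance/name")
ZONE=$(curl -s -H "Metadata-Flavor: Google" \
  "http://metadata.google.internal/computeMetadata/v1/instance/zone" | awk -F/ '{print $NF}')
# Drop the ephemeral external IP and attach the reserved static one
gcloud compute instances delete-access-config "$NAME" --zone "$ZONE" \
  --access-config-name="External NAT"
gcloud compute instances add-access-config "$NAME" --zone "$ZONE" \
  --address="STATIC_IP"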
I've found one way that ensures that all VMs in your network have the same outgoing IP address: using Cloud NAT you can assign a static IP which all VMs will use. There is a downside though:
GCP forwards traffic using Cloud NAT only when there are no other matching routes or paths for the traffic. Cloud NAT is not used in the following cases, even if it is configured:
You configure an external IP on a VM's interface. If you configure an external IP on a VM's interface, IP packets with the VM's internal IP as the source IP will use the VM's external IP to reach the Internet. NAT will not be performed on such packets. However, alias IP ranges assigned to the interface can still use NAT because they cannot use the external IP to reach the Internet. With this configuration, you can connect directly to a GKE VM via SSH, and yet have the GKE pods/containers use Cloud NAT to reach the Internet.
Note that making a VM accessible via a load balancer external IP does not prevent a VM from using NAT, as long as the VM network interface itself does not have an external IP address.
Removing the VM's external IP also prevents direct SSH access to the VM, even SSH access from the Cloud Console itself. The quote above shows an alternative with a load balancer; another way is a bastion host, but neither directly solves access from, for example, Kubernetes/kubectl.
If that's no problem for you, this is the way to go.
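A hedged sketch of setting this up (my-static-ip, my-router, my-nat, my-vpc and the region are placeholders):
# Reserve the static IP that the NAT gateway will use
gcloud compute addresses create my-static-ip --region=us-central1
# Cloud NAT requires a Cloud Router on the network
gcloud compute routers create my-router --network=my-vpc --region=us-central1
# NAT all subnet ranges through the reserved address
gcloud compute routers nats create my-nat --router=my-router --region=us-central1 \
  --nat-external-ip-pool=my-static-ip --nat-all-subnet-ip-ranges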
One solution is to let the instances have dynamically chosen ephemeral IPs, but set the group as the target of a load balancer with a static IP. This way, even when instances are created or destroyed, the LB acts as a frontend, keeping the IP continuous over time.