GCP - issues with connecting Vertex AI to shared VPC

We are trying to create a training job in Vertex AI and need to connect to resources in our shared VPC. The project in which we are creating this job is a service project. We already have a VPC with private services access configured (as described in https://cloud.google.com/vertex-ai/docs/general/vpc-peering).
When we try to create a job and use this host network, we get a very generic error message:
Unable to start training due to the following error: Internal error encountered.
Everything seems alright, and the peering connection with private services (servicenetworking) is in an active state.
Does anyone have an idea where we can look for more information about this problem, or any guides or pointers that could help us?

A few points should be verified in this particular setup:
The Compute Engine and Service Networking APIs should be enabled for both the host and service projects, and the Vertex AI API should be enabled for the service project.
The VPC peering connection between your VPC and the Google services network should be created in the host project.
You must specify the name of the network that you want Vertex AI to have access to (the shared VPC), as described in the VPC peering document linked above.
Verify that the service/user account used has the proper role (Compute Network User); see the gcloud sketch after this list.
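A rough sketch of those checks with gcloud (the project IDs, network name, region, and container image are placeholders, and the job flags are only an example of a custom training job):
# Enable the required APIs (Vertex AI only on the service project)
gcloud services enable compute.googleapis.com servicenetworking.googleapis.com --project=host-project-id
gcloud services enable compute.googleapis.com servicenetworking.googleapis.com aiplatform.googleapis.com --project=service-project-id
# Confirm the private services access connection exists in the host project
gcloud services vpc-peerings list --network=shared-vpc-name --project=host-project-id
# Grant the service project's Vertex AI service agent the Compute Network User role on the host project
gcloud projects add-iam-policy-binding host-project-id \
  --member="serviceAccount:service-SERVICE_PROJECT_NUMBER@gcp-sa-aiplatform.iam.gserviceaccount.com" \
  --role="roles/compute.networkUser"
# Reference the shared VPC by the host project number when creating the job
gcloud ai custom-jobs create --region=us-central1 --display-name=test-job \
  --worker-pool-spec=machine-type=n1-standard-4,replica-count=1,container-image-uri=gcr.io/my-project/my-training-image \
  --network=projects/HOST_PROJECT_NUMBER/global/networks/shared-vpc-name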

Related

What is an ideal place for creating GCP Private Service Connect Endpoint, Publish Service etc. in case of "Shared VPC" setup?

I understand that GCP Private Service Connect (PSC) is an effective solution for enabling service-centric private network connectivity to GCP APIs and other hosted services within and across VPCs, projects, organizations, and on-prem setups, based on the producer/consumer project model.
The GCP documentation on Private Service Connect explains the purpose and the configuration of PSC endpoints and published services; however, I can't find details on the ideal location (or best practice) for creating and configuring a PSC endpoint and published service when you have a Shared VPC based network setup.
IMO, the PSC endpoint and published service are network resources, so the ideal place to create them is the host project of the Shared VPC network, since the host project is meant for centralized network management.
Also, having the PSC endpoint and published service in the host project allows a single PSC endpoint to be shared by all the service project resources (which would otherwise require a PSC endpoint per service project). However, I would like to hear from anyone who has come across and/or implemented such a scenario.
Update: I tried a Shared VPC setup in which PSC creation is allowed in the service project, which means GCP doesn't restrict creating PSC resources in a service project.
(Screenshots: Host Project, Service Project, and Service Project PSC setup.)
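For reference, a minimal sketch of the two sides with gcloud (all names, regions, and subnets are hypothetical placeholders; putting the endpoint in the host project follows the reasoning above rather than any official guidance):
# Producer side: publish the service behind an internal load balancer via a service attachment
gcloud compute service-attachments create my-published-service \
  --region=us-central1 \
  --producer-forwarding-rule=my-ilb-forwarding-rule \
  --connection-preference=ACCEPT_AUTOMATIC \
  --nat-subnets=my-psc-nat-subnet \
  --project=producer-project-id
# Consumer side (host project of the Shared VPC): reserve an internal address and create the PSC endpoint
gcloud compute addresses create my-psc-endpoint-ip \
  --region=us-central1 \
  --subnet=my-shared-subnet \
  --project=host-project-id
gcloud compute forwarding-rules create my-psc-endpoint \
  --region=us-central1 \
  --network=my-shared-vpc \
  --address=my-psc-endpoint-ip \
  --target-service-attachment=projects/producer-project-id/regions/us-central1/serviceAttachments/my-published-service \
  --project=host-project-id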

How can I deploy and connect to a PostgreSQL instance in AlloyDB without utilizing a VM?

Currently, I have followed the Google quickstart docs for deploying a simple Cloud Run web server that is connected to AlloyDB. However, the docs all seem to point towards having to use a VM for a PostgreSQL client, which then connects to my AlloyDB cluster instance. I believe a connection can only be made from within the same VPC and/or through a proxy service on the VM (please correct me if I'm wrong).
I was wondering: if I only want to give access to services within the same VPC, is having a VM a must, or is there another way?
You're correct. AlloyDB currently only allows connecting via private IP, so the only way to talk directly to the instances is from within the same VPC. The reason all the tutorials (e.g. https://cloud.google.com/alloydb/docs/quickstart/integrate-cloud-run, which is likely the quickstart you mention) talk about a VM is that in order to create the databases themselves within the AlloyDB cluster, set user grants, etc., you need to be able to talk to it from inside the VPC. Another option, for example, would be to set up Cloud VPN to connect your LAN to the VPC directly, but that's slow, costly, and kind of a pain.
Cloud Run itself does not require the VM piece; the quickstart I linked above walks through setting up the Serverless VPC Access connector, which is the piece required to connect Cloud Run to AlloyDB. The VM in those instructions is only for configuring the Postgres database itself. So once you've done all the configuration you need, you can shut down the VM so it's not costing you anything. If you need to step back in to make configuration changes, you can spin the VM back up, but it's not something that needs to be running for the Cloud Run -> AlloyDB connection.
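For illustration, a hedged sketch of those two pieces with gcloud (the connector name, region, IP range, image, and the DB_HOST value are placeholders; passing the AlloyDB private IP as an environment variable is just one option):
# Create a Serverless VPC Access connector in the VPC that is peered with AlloyDB
gcloud compute networks vpc-access connectors create my-connector \
  --region=us-central1 \
  --network=my-vpc \
  --range=10.8.0.0/28
# Deploy the Cloud Run service through the connector and hand it the instance's private IP
gcloud run deploy my-service \
  --image=gcr.io/my-project/my-image \
  --region=us-central1 \
  --vpc-connector=my-connector \
  --set-env-vars=DB_HOST=10.1.2.3,DB_PORT=5432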
Providing public IP functionality for AlloyDB is on the roadmap, but I don't have any kind of timeframe for when it will be implemented.

Private service to service communication for Google Cloud Run

I'd like to have my Google Cloud Run services privately communicate with one another over non-HTTP and/or without having to add bearer authentication in my code.
I'm aware of this documentation from Google which describes how you can do authenticated access between services, although it's obviously only for HTTP.
I think I have a general idea of what's necessary:
Create a custom VPC for my project
Enable the Serverless VPC Connector
What I'm not totally clear on is:
Is any of this necessary? Can Cloud Run services within the same project already see each other?
How do services address one another after this?
Do I gain the ability to use simpler, by-convention DNS names? For example, could I have each service in Cloud Run appear on my VPC under a single first-level DNS name like apione and apitwo, rather than a longer DNS name that I'd then have to inject into my deployments?
If not, is there any kind of mechanism for services to discover names?
If I put my managed Cloud SQL postgres database on this network, can I control its DNS name?
Finally, are there any other gotchas I might want to be aware of? You can assume my use case is very simple, two or more long lived services on Cloud Run, doing non-HTTP TCP/UDP communications.
I also found a potentially related Google Cloud Run feature request that is worth upvoting if this isn't currently possible.
Cloud Run services are only reachable through HTTP requests; you can't use other network protocols (SSH to log into instances, for example, or raw TCP/UDP communication).
However, Cloud Run can initiate these kinds of connections to external services (for instance, Compute Engine instances deployed in your VPC, thanks to the Serverless VPC Connector).
The Serverless VPC Connector lets you bridge the Google-managed environment (where the Cloud Run, Cloud Functions, and App Engine instances live) and the VPC of your project, where you have your own resources (Compute Engine instances, GKE node pools, ...).
Thus you can have a Cloud Run service that reaches a Kubernetes pod on GKE through a TCP connection, if that's your requirement.
As for service discovery, it doesn't exist yet, but Google is actively working on it, and Ahmet (a Google Cloud Developer Advocate for Cloud Run) recently released a tool for that. But nothing is really built in.
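In the meantime, calling one Cloud Run service from another still means an HTTPS request to its run.app URL with an identity token; a rough sketch using the metadata server (the service URL here is a hypothetical placeholder):
# Inside the calling service: fetch an ID token whose audience is the target service's URL
TOKEN=$(curl -s -H "Metadata-Flavor: Google" \
  "http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/identity?audience=https://apitwo-abc123-uc.a.run.app")
# Call the target service with the token
curl -H "Authorization: Bearer ${TOKEN}" https://apitwo-abc123-uc.a.run.app/some/endpoint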

AWS EC2 for QuickBooks

AWS and network noob. I've been asked to migrate QuickBooks Desktop Enterprise to AWS. This seems easy in principle but I'm finding a lot of conflicting and confusing information on how best to do it. The requirements are:
Setup a Windows Server using AWS EC2
QuickBooks will be installed on the server, including a file share that users will map to.
Configure VPN connectivity so that the EC2 instance appears and behaves as if it were on prem.
Allow additional off site VPN connectivity as needed for ad hoc remote access
Cost is a major consideration, which is why I am doing this instead of getting someone who knows this stuff.
The on-prem network is very small - one Win2008R2 server (I know...) that hosts QB now and acts as a file server, 10-15 PCs/printers and a Netgear Nighthawk router with a static IP.
My approach was to first create a new VPC with a private subnet that will contain the EC2 instance and set up a site-to-site VPN connection with the Nighthawk for the on-prem users. I'm unclear on whether I also need to create security group rules to only allow inbound traffic (UDP/TCP file-sharing ports) from the static IP, or whether the VPN negates that need.
I'm trying to test this one step at a time and have an instance set up now. I am remote and am using my current IP address in the security group rules for the test (no VPN yet). I set up the file share but I am unable to access it from my computer. I can RDP and ping it and have turned on the firewall rules to allow NetBIOS and SMB, but still nothing. I just read another thread that says I need to set up a storage gateway, but before I do that, I wanted to see if that is really required or if there's another/better approach. I have to believe this is a common requirement, but I seem to be missing something.
This is a bad approach for QuickBooks. Intuit explicitly recommends against using QuickBooks with a file share via VPN:
Networks that are NOT recommended
Virtual Private Network (VPN) Connects computers over long distances via the Internet using an encrypted tunnel.
From here: https://quickbooks.intuit.com/learn-support/en-us/configure-for-multiple-users/recommended-networks-for-quickbooks/00/203276
The correct approach here is to host QuickBooks on the EC2 instance, and let people RDP (remote desktop) into the EC2 Windows server to use QuickBooks. Do not let them install QuickBooks on their client machines and access the QuickBooks data file over the VPN link. Make them RDP directly to the QuickBooks server and access it from there.
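If you take that approach, the instance's security group only needs to allow RDP from the office's static IP (plus any ad hoc remote IPs); a hedged sketch with placeholder IDs and addresses:
# Allow RDP (TCP 3389) only from the office's static IP
aws ec2 authorize-security-group-ingress \
  --group-id sg-0123456789abcdef0 \
  --protocol tcp \
  --port 3389 \
  --cidr 203.0.113.10/32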

How can I prompt Google to set up VPC peering from servicenetworking.googleapis.com?

I have some Cloud SQL instances that currently have public IPs. It would make certain security-minded people happy if I changed them to have private IPs.
I am following the instructions documented here: https://cloud.google.com/sql/docs/mysql/private-ip
A summary of those instructions:
Ensure your shared VPC host has servicenetworking.googleapis.com enabled
Ensure your project has servicenetworking.googleapis.com enabled
Allocate an IP address range for your new private IPs
Configure VPC network peering (https://cloud.google.com/sql/docs/mysql/configure-private-services-access); see the gcloud sketch after this list
Create cloud sql instance without public IP
Expect the new instance's private IP to be in the allocated range
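For reference, steps 3 and 4 with gcloud look roughly like this (the range name is a placeholder, and the commands should be run against the shared VPC host project):
# Allocate an IP range for private services access
gcloud compute addresses create google-managed-services-range \
  --global \
  --purpose=VPC_PEERING \
  --prefix-length=16 \
  --network=my-network-name \
  --project=host-project-id
# Create (or recreate) the private connection to servicenetworking.googleapis.com
gcloud services vpc-peerings connect \
  --service=servicenetworking.googleapis.com \
  --ranges=google-managed-services-range \
  --network=my-network-name \
  --project=host-project-id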
I've completed these through step 4, but the console shows the peering as 'Inactive'.
My interpretation of that page is that I've done my part and now it's Google's turn, but that was several days ago. Do I have to do something to prompt Google to create the connection?
I think I'm looking in the right place, because if I try to use a private IP, gcloud tells me to go create the network connection that I'm waiting on:
❯ gcloud --project=my-project-name beta \
sql instances patch foo \
--network=my-network-name --no-assign-ip
The following message will be used for the patch API method.
{"name": "foo", "project": "my-project-name", "settings": {"ipConfiguration": {"ipv4Enabled": false, "privateNetwork": "https://compute.googleapis.com/compute/v1/projects/my-project-name/global/networks/my-network-name"}}}
Patching Cloud SQL instance...failed.
ERROR: (gcloud.beta.sql.instances.patch) [INTERNAL_ERROR] Failed to create subnetwork. Please create Service Networking connection with service 'servicenetworking.googleapis.com' from consumer project '11111111111' network 'my-network-name' again.
In general, private services access is implemented as a VPC peering connection between your VPC network and the Google services VPC network where your Cloud SQL instance resides. As @JohnHanley pointed out, the VPC peering should be created within minutes, so you are not expected to have to wait longer than that.
To check the peering creation in Stackdriver Logging, you can use the following advanced filter:
jsonPayload.event_subtype="compute.networks.addPeering"
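Equivalently, a rough sketch of running the same filter from the CLI (the project ID is a placeholder):
gcloud logging read 'jsonPayload.event_subtype="compute.networks.addPeering"' \
  --project=my-project-name \
  --limit=10 \
  --format=json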
That said, the error you are observing when trying to patch your SQL instance makes sense, as the peering hasn't been created. Instead of 'Inactive', the peering should show 'Peer VPC network is connected'.
To sum up, in your scenario the Cloud SQL instance cannot get an IP on the aforementioned network as it cannot reach it.
At this specific point I would suggest you focus on the peering creation. As you mentioned that you tried recreating it and the status remains the same, it's possible that something in your project is preventing the peering from being established.
I would also suggest you check the peering limits quota in case it has been reached:
gcloud compute networks peerings list --network='your network'
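It can also help to check the state of the Service Networking connection itself (network and project names are placeholders):
gcloud services vpc-peerings list \
  --network=my-network-name \
  --service=servicenetworking.googleapis.com \
  --project=my-project-name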
Also it would be good to review the VPC Peering Restrictions.
All that being said, if you still experience the same issue when creating the VPC peering, an internal investigation may be required, and I would suggest you report this using this link.
I hope this helps.