GCP instance group rolling update fails with error "Invalid Fingerprint" - google-cloud-platform

Rolling update for an instance group fails with an "Invalid Fingerprint" error message in the console. Earlier rollouts had no issues, but I recently started seeing this error and updates are now failing; at times the Instance Groups section of the console even becomes unresponsive.
Already tried:
Creating a new image and using it in a new template to roll out the update to the instance group
Appreciate any clues or help.
Thanks

When trying to roll out an update to my instance group with the "ROLLING RESTART/REPLACE" button, I got an "Invalid Fingerprint" error message in the notifications. The "ROLLING UPDATE" button might trigger the same issue. (The capital letters are as shown in the GCE interface.)
My instance group size was set to 1 instance, and I was getting the error:
"Invalid fingerprint"
To solve the problem, I changed the Instance Group size from 1 to 2, and then rolled the update.
After the update was done, I changed the group size back to 1.
To change the group size, edit the instance group by clicking the edit button, then update the number of instances.

Our internal Compute Engine team is currently working on the issue. The current workaround is to use the gcloud command, which should also fix the issue in the Cloud Console afterwards. You can do a rolling replace using:
gcloud beta compute instance-groups managed rolling-action replace [instance group]
You can find the details of the command at this link. You can also watch for a complete resolution of the issue at this public issue tracker link, where other users have filed a defect report. I should also mention that updating managed instance groups is a beta feature as of now.
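As a rough sketch of scripting this workaround, the Python snippet below only builds the gcloud command lines and leaves running them to the caller. The group name, zone, and the optional resize steps (borrowed from the other answer's 1 -> 2 -> 1 size change) are assumptions for illustration; whether resizing is needed at all when going through gcloud is not confirmed here.

```python
import subprocess
from typing import List, Optional


def rolling_replace_commands(group: str, zone: str,
                             temp_size: Optional[int] = None,
                             final_size: Optional[int] = None) -> List[List[str]]:
    """Build gcloud invocations for a rolling replace of a zonal managed group.

    Hypothetical helper: if temp_size / final_size are given, a resize
    command is added before / after the replace command.
    """
    managed = ["compute", "instance-groups", "managed"]
    commands = []
    if temp_size is not None:
        commands.append(["gcloud"] + managed +
                        ["resize", group, f"--size={temp_size}", f"--zone={zone}"])
    # The rolling replace itself, using the beta command from the answer.
    commands.append(["gcloud", "beta"] + managed +
                    ["rolling-action", "replace", group, f"--zone={zone}"])
    if final_size is not None:
        commands.append(["gcloud"] + managed +
                        ["resize", group, f"--size={final_size}", f"--zone={zone}"])
    return commands


def run_all(commands: List[List[str]], dry_run: bool = True) -> None:
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)
```

Calling run_all(rolling_replace_commands("my-group", "us-central1-a")) prints the single replace command without executing it. If you do add the resize steps, check that the rollout has actually finished before shrinking the group again, since the replace command may return while the update is still in progress.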

Related

Error cloning database unable to update the following flags: cloudsql.enable_password_validation

I am attempting to clone a database. I was previously able to clone it in the console, but now I want to create a small script to automate this, and it fails with the following error message:
(gcloud.sql.instances.clone) [ERROR_RDBMS] unable to update the following flags: cloudsql.enable_password_validation
If I attempt to clone it in the console, I get the same error shown above.
I looked up the documentation and enable_password_validation does not seem to be in the list of supported flags, which would explain why it can't update it.
If I run gcloud sql instances describe my-instance, I don't see the flag in question.
But running on the source instance:
SELECT * FROM pg_settings
yields this row in particular:
name: cloudsql.enable_password_validation
setting: off
unit: NULL
category: Customized Options
short_desc: Sets whether to enable Cloud SQL password validation.
extra_desc: NULL
context: superuser
vartype: bool
source: configuration file
min_val: NULL
max_val: NULL
enumvals: NULL
boot_val: on
reset_val: off
sourcefile: /pgsql/data/postgresql.auto.conf
sourceline: 3
pending_restart: False
Any advice on how to solve this?
There is currently an ongoing issue with password validation in Cloud SQL Postgres instances. The issue involves the exact flag that is giving you problems, cloudsql.enable_password_validation:
Diagnosis: Affected postgres instances from a recent release have the following flag set and are unable to remove or disable this flag: cloudsql.enable_password_validation=on. This flag does not appear in Cloud Console, and attempting to disable flag via gcloud returns error where the flag is not recognized or supported. Password validation occurs on every new client connection but is limited to 50 QPS, and thus higher rates will return errors.
When did this issue start occurring? Have you attempted to clone the database again since then? I ask because the issue has received several updates. If you continue experiencing issues, you could open a support case with GCP, as the status page recommends.
EDIT (2/24/2022)
I wanted to update this answer. The issue seems to be resolved as shown in the status page of Google Cloud:
The issue with Cloud SQL has been resolved for all affected instances as of Tuesday, 2022-02-22 14:30 US/Pacific. We thank you for your patience while we worked on resolving the issue.
If you still see this error, you can update the question confirming that it was not resolved as part of the outage resolution.

Replicating data from SQL Server to BigQuery

I've been trying to follow the instructions from Google on replicating data from SQL Server to BigQuery, available here: https://cloud.google.com/data-fusion/docs/tutorials/replicating-data/sqlserver-to-bigquery. Following the instructions to the letter, step by step, always results in this odd error when creating the Cloud Data Fusion instance:
Invalid argument (HTTP 400): retry budget exhausted (3 attempts): cloud-control2-saas::GCE_BAD_REQUEST: Invalid value for field 'networkPeering.name': '*******'. Must be a match of regex '(?:[a-z](?:[-a-z0-9]{0,61}[a-z0-9])?)'.
**** is the project ID with the VPC network suffix after a dash and it looks a bit like this (I've changed values)
website.com:api-project-0000000000-default
This value is being assigned somewhere by Google, I am not given a choice to select this or enter this through the instructions when creating the Instance.
Googling the error doesn't show me anything useful, and sadly I do not have the budget to acquire GCP support in this instance to ask them why their instructions appear not to work.
I've already checked quotas, billing, service account permissions, etc. I've also tried both a new VPC as well as a shared VPC with all the settings from the guide.
I would appreciate it if someone more experienced in this area could point me in the right direction or suggest where else to check for what could be wrong.
The instructions do point at creating a peering connection, but they require the Cloud Data Fusion instance to be created before configuring the peering connection, and since I can't create the instance I am unsure what exactly I am supposed to do.
Appreciate the help!
According to this documentation, before creating a private instance I assume you're creating a VPC network.
networkPeering.name is a combination of your project ID and VPC network name. The error you're getting is caused by an invalid networkPeering name: the value does not match the regular expression (?:[a-z](?:[-a-z0-9]{0,61}[a-z0-9])?), which in your case is due to the project ID website.com:api-project-xxxxxxxxx. A domain-scoped project ID like this contains a dot and a colon, while the pattern allows only lowercase letters, digits, and hyphens.
Also note that, per the regular expression, the networkPeering name must be at most 63 characters long.
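To see why the generated name is rejected, here is a minimal Python sketch that applies the pattern from the error message to a candidate name. It assumes the API requires the whole value to match the pattern (hence fullmatch); the helper names are made up for illustration.

```python
import re

# Pattern copied from the error message.
NAME_PATTERN = re.compile(r"(?:[a-z](?:[-a-z0-9]{0,61}[a-z0-9])?)")


def is_valid_peering_name(name: str) -> bool:
    """Return True if the entire string matches the pattern."""
    return NAME_PATTERN.fullmatch(name) is not None


def invalid_characters(name: str) -> list:
    """List distinct characters outside [-a-z0-9], in order of appearance."""
    seen = []
    for ch in name:
        if not re.fullmatch(r"[-a-z0-9]", ch) and ch not in seen:
            seen.append(ch)
    return seen


print(is_valid_peering_name("website.com:api-project-0000000000-default"))  # False
print(invalid_characters("website.com:api-project-0000000000-default"))     # ['.', ':']
print(is_valid_peering_name("api-project-0000000000-default"))              # True
```

The second call shows that the dot and the colon from the domain-scoped project ID are the offending characters; the same name without the "website.com:" prefix passes the check.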

Adding user to group chrome-remote-desktop - Failed to access group. Is the user a member?

I created an instance with Debian 9 and was following the instructions on Google's site here. I have done this before successfully. All was going fine, but now when I do this part:
DISPLAY= /opt/google/chrome-remote-desktop/start-host \
--code="4/xxxxxxxxxxxxxxxxxxxxxxxx" \
--redirect-url="https://remotedesktop.google.com/_/oauthredirect" \
--name=
I get the error
Adding user newuser_gmail_com to group chrome-remote-desktop
ERROR:Failed to access chrome-remote-desktop group. Is the user a member?
Can anyone help me out here? I notice that when I did this previously, the username created was not newuser_gmail_com, but simply newuser. Any suggestions you have would be much appreciated. Many thanks!
I found the answer, but this raises a possible bug for the Google Cloud team. The bug occurs if I add enable-oslogin = TRUE as a metadata entry. This causes chrome-remote-desktop to fail.
When a user is added to a group (chrome-remote-desktop in this case), the change is not reflected in existing sessions until the user logs out and back in. To work around this limitation, Chrome Remote Desktop attempts to use sg to access the new group from the existing session. It looks like this isn't working for some reason on this system (apparently OS Login related?), so starting the host fails.
It should be sufficient to log out and back in. Once logged back in, verify that the output of groups contains chrome-remote-desktop, then try running the headless setup flow again. (Make sure you generate a new command, as the --code argument is one-time-use only.)
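If you want to script the check on the groups output, a small Python sketch could look like the following. The function names are made up for illustration, and the parsing assumes the usual output shapes of the groups command ("g1 g2" or "user : g1 g2").

```python
import subprocess


def group_listed(groups_output: str, group: str = "chrome-remote-desktop") -> bool:
    """Return True if `group` appears as a whole word in `groups` output."""
    # Drop an optional "user :" prefix, then compare whole group names.
    _, _, tail = groups_output.rpartition(":")
    return group in tail.split()


def current_session_groups() -> str:
    """Run `groups` for the current session and return its raw output."""
    result = subprocess.run(["groups"], capture_output=True, text=True, check=True)
    return result.stdout
```

After logging back in, group_listed(current_session_groups()) should return True before you retry the start-host command.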

What is the reason for error "Resource in project is the subject of a conflict" while trying to recreate a cloudsql instance?

I am trying to create a cloudsql instance with the following command:
gcloud beta sql instances create sql-instance-1 --tier=db-f1-micro --region=asia-south1 --network=default --storage-type=HDD --storage-size=10GB --authorized-networks=XX.XXX.XX.XX/XX
The instance sql-instance-1 is something I need not running all the time. So I create an sqldump file and when I need the database I create it. When I run this command it fails with the following error
ERROR: (gcloud.beta.sql.instances.create) Resource in project [my-project-id] is the subject of a conflict: The instance or operation is not in an appropriate state to handle the request.
From what I understand, gcloud is complaining that the instance name was used before, even though that instance has already been deleted. When I change the name to a new, unused name, the command works fine. The problem is that I then need to give a new name every time I re-create the instance from the dump.
My questions are:
Is this expected behavior, i.e. must the name of a Cloud SQL instance be unique and not previously used within a project?
I also found that --network option is not recognized with gcloud. Seems to work only with gcloud beta as explained here. When is this expected to become GA?
This is indeed expected behaviour. From the documentation:
You cannot reuse an instance name for up to a week after you have deleted an instance.
Regarding the --network flag and its schedule for GA, there is no ETA for its release outside of beta. However, its release will be listed in the Google Cloud SDK Release Notes, which you can get updates from by subscribing to the google-cloud-sdk-announce group.
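Since the old name stays reserved for a while, one way to keep the re-create script automated is to derive a fresh name on every run. The Python sketch below appends a UTC timestamp to a base name; the function name and the suffix format are assumptions for illustration, and you should still check the result against Cloud SQL's instance naming rules.

```python
from datetime import datetime, timezone
from typing import Optional


def unique_instance_name(base: str, now: Optional[datetime] = None) -> str:
    """Return base plus a UTC timestamp suffix, e.g. base-YYYYMMDD-HHMMSS."""
    now = now or datetime.now(timezone.utc)
    return f"{base}-{now.strftime('%Y%m%d-%H%M%S')}"


# Fixed timestamp so the result is reproducible.
print(unique_instance_name("sql-instance-1",
                           datetime(2022, 2, 24, 9, 5, 0, tzinfo=timezone.utc)))
# sql-instance-1-20220224-090500
```

The generated value can then be passed to the create command in place of the fixed sql-instance-1 name.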

Groups API - someGroup.setShowInGroupDirectory(true) - "List this group in the directory" is checked - group not appearing in "Browse All"

Google Apps for Business account here.
SETUP
I am creating a new group using the Directory API -> all ok.
I am then doing the following:
get the Group I just created using the Groups API and assign it to "someGroup"
invoke "someGroup.setShowInGroupDirectory(true)"
patch "someGroup" using the Groups API
No issues on the execution - everything comes back with no complaints.
VERIFY
I go to the Google Apps Admin console and search for the group I created. All ok - it appears.
I go to the Google Groups homepage for my domain and click "Browse All". The Group I created does not appear here.
I go to the Google Groups / Information / Directory settings page for the Group I created (https://groups.google.com/a/MY_DOMAIN.com/forum/#!groupsettings/MY_GROUP/directory) and observe that "List this group in the directory" is checked.
However, if at this stage I manually uncheck "List this group in the directory", save, recheck it, and save again, it does appear in the "Browse All" view. I am trying to build an automated solution and can't really depend on my users to execute this manual step for every group they create.
I've waited 24+ hours for any background sync to occur and still the group is not appearing in the Browse All view unless I manually toggle as described.
Anyone seen anything similar?
On the off-chance someone finds this one day: it actually took roughly 24 hours for the groups to start appearing.
The first test apparently took just under 25 hours; a subsequent one took around 23.
The painful thing is that if you manually uncheck, save, recheck and save, they appear immediately.