Cannot deploy any revisions (old or first) to Cloud Run - google-cloud-platform

"Cloud Run error: Internal system error."
This error, and only this error, keeps appearing when I try to deploy a revision (new or first).
What is going on with Cloud Run?
The Cloud Run page won't load from the GCP console (though I can reach it via Google search), and I cannot deploy any revision without getting this error.
The container works locally.

It seems to be a temporary issue in the platform. You can check it on the Google Cloud status page:
"We've received a report of an issue with Cloud Run."
Both Cloud Run and Cloud Source Repositories seem to be affected.
Google's team is usually quick to fix whatever happened; you can find more info here as soon as something starts to move :)

Change the Docker image ENTRYPOINT value or set those fields:

Related

Customised Google Cloudshell image does not launch

A customised Google Cloud Shell image fails to launch with the error 'Cloud Shell is experiencing some issues provisioning a VM to you. Please try again in a few minutes'. Repeated attempts to launch also fail.
I created a custom Google Cloud Shell image with an Ansible lab environment and setup tutorial. When it was tested approximately 10 days ago, it seemed to work as expected. Setup was performed using the following guide
Project is hosted with the 'Open in Google Cloud Shell' button here
For convenience, this is the launch button as a link
The customised Cloud Shell image is hosted at gcr.io/diveintoansible/diveintoansible-lab-gcp-cloudshell
I've checked the permissions and these appear to be open to the public as desired.
Any advice on resolving this would be greatly appreciated.
This usually happens because the base image is out of date. If your image worked a few weeks ago, you probably just need to rebuild it.
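A minimal sketch of such a rebuild, assuming the Dockerfile sits in the current directory and that you are already authenticated to push to the registry (neither detail is stated in the question). The image name is the one given above; the `run` helper is a hypothetical wrapper that only prints each command unless `DRY_RUN=0` is set.

```shell
set -eu

# Image name taken from the question; the build context "." is an assumption.
IMAGE="gcr.io/diveintoansible/diveintoansible-lab-gcp-cloudshell"

# Hypothetical helper: print each command instead of executing it
# unless DRY_RUN=0 is set in the environment.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then echo "+ $*"; else "$@"; fi
}

rebuild_image() {
  # --pull fetches the newest version of the base image named in FROM;
  # --no-cache prevents stale cached layers from being reused.
  run docker build --pull --no-cache -t "$IMAGE" .
  run docker push "$IMAGE"
}

rebuild_image
```

Run it once as-is to review the commands, then again with `DRY_RUN=0` to actually build and push.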

Error: Asset 'webhooks/ActionsOnGoogleFulfillment' cannot be deployed

I wanted to build a Google Assistant action with custom actions using the Actions SDK. Since I am new to this, I followed the steps in the tutorial "Build Actions for Google Assistant using Actions SDK (Level 1)" exactly as written, in order to build a sample assistant. However, in step 5 (Implement fulfillment), when I try to test the fulfillment by running the command
gactions deploy preview
I get the following output in the terminal, ending in an error:
Sending configuration files...
Sending resources...
Waiting for server to respond. It could take up to 1 minute if your cloud function needs to be redeployed.
[ERROR] Server did not return HTTP 200.
{
  "error": {
    "code": 400,
    "message": "Asset 'webhooks/ActionsOnGoogleFulfillment' cannot be deployed. [An operation on function cf-_CcGD8lKs_F_LHmFYfJZsQ-name in region us-central1 in project <my-project-id> is already in progress. Please try again later.]"
  }
}
When I checked "Google Cloud Platform -> Cloud Functions Console" for this project, I saw the following.
Image 1 (screenshot): Cloud Platform Cloud Functions Console
It shows a failed Cloud Function deployment marked with an exclamation mark. If I delete that function, a new function is immediately deployed automatically, but with a spinning wheel (loading/still deploying) instead of the exclamation mark. I cannot delete the function while it is still loading/deploying. After 10-15 minutes the spinning wheel changes to an exclamation mark. If I delete the function then, a new one automatically appears again, and the cycle repeats.
Image 2 (screenshot): Cloud Platform Cloud Functions Console
This problem arises only when implementing a webhook/fulfillment (Step 5). For static Actions responses, "gactions deploy preview" deploys successfully for testing (Steps 1 to 4 were implemented successfully).
I followed the tutorial exactly, so the code and directory structure are the same as in the tutorial (only the project ID or Actions console project name differs).
GitHub repository for the code
Since this is only for the tutorial, I am not using a billing account at present. Instead, I made the following change in package.json (changed the Node version from 10 to 8):
"engines": {
  "node": "8"
},
Because of this continuous automatic failed deployment, explicitly deploying the project as described above produces this error:
"An operation on function cf-_CcGD8lKs_F_LHmFYfJZsQ-name in region us-central1 in project <my-project-id> is already in progress. Please try again later".
Can anyone please suggest how to stop this continuous automatic failed deployment of the cloud functions, so that the function I deploy will be successfully deployed? Would really appreciate your help.
(Note: This is the first time I have posted a question on Stack Overflow, so please let me know if I have made any mistakes or missed any Stack Overflow question conventions. I will improve it.)
Posting this as Community Wiki as it's based on the comments.
As clarified, the issue seems to be the billing account: the tutorial mentions that one must be set for the Cloud Functions to be deployed correctly. Beyond that, it's not possible to deploy Cloud Functions (webhooks) without a billing account, so even though you are not using Node.js 10, you will need a billing account configured for your project.
To summarize, a billing account is needed to avoid deployment failures, even if you are not using Node.js 10, as explained in the tutorial you followed.
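One way to check and set this from the command line is sketched below. The project ID and billing account ID are hypothetical placeholders, and the `run` helper is an illustrative wrapper that only prints each command unless `DRY_RUN=0` is set. Older Cloud SDK releases may expose these commands only under `gcloud beta billing`.

```shell
set -eu

# Hypothetical placeholders: substitute your own project and billing account IDs.
PROJECT_ID="${PROJECT_ID:-my-project-id}"
BILLING_ACCOUNT="${BILLING_ACCOUNT:-000000-000000-000000}"

# Hypothetical helper: print each command instead of executing it
# unless DRY_RUN=0 is set in the environment.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then echo "+ $*"; else "$@"; fi
}

# Report the billingEnabled field of the project's billing info.
check_billing() {
  run gcloud billing projects describe "$PROJECT_ID" --format="value(billingEnabled)"
}

# Link the project to a billing account so Cloud Functions can be deployed.
link_billing() {
  run gcloud billing projects link "$PROJECT_ID" --billing-account="$BILLING_ACCOUNT"
}

check_billing
link_billing
```

Once billing is enabled, rerunning `gactions deploy preview` is the natural next step.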

What causes 'Cloud Run error: Internal system error, system will retry later'? Suggestions for troubleshooting?

I'm attempting to deploy a Cloud Run Service as part of tests for my open source project. This is done via our automated CI/CD system and has worked successfully hundreds of times previously.
The Cloud Run Service gets created but the first revision never gets deployed. When I look at the newly created Service in the GCP Console, it shows "Cloud Run error: Internal system error, system will retry later." as the main status message for the Service.
The command line that is failing is:
gcloud --configuration=adapt-cloud-gcloud-testing --quiet run deploy cloud-run-gen-name-a179e65d6fdfc19abc57e15df563d8cb --platform=managed --format=json --no-allow-unauthenticated --memory=128M --cpu=1 --image=gcr.io/adapt-ci/http-echo --region=us-central1 --port=5678 --set-env-vars=ADAPT_TEST_DEPLOY_ID=MockDeploy-aymb --args="-text,Adapt Test"
The output from that command (note: the dots after Creating Revision just keep going):
Deploying container to Cloud Run service [cloud-run-gen-name-a179e65d6fdfc19abc57e15df563d8cb] in project [adapt-ci] region [us-central1]
Deploying new service...
Creating Revision....................................................................................................................
The YAML tab in the Console also shows the same message for each of the three status conditions (see below).
To troubleshoot, I have also tried:
Using the GCP Console to create the most basic Cloud Run Service using the example container from the getting started docs manually, while logged in as the project and organization owner. I see the same failure. I have created Services manually this way previously, with this account and project, with no issues.
Using the GCP Console to create the same example Service as above in a different project, but with the same user and in the same org. This works successfully, so the issue is specific to the project.
I tried two different US regions with the same results.
Since this is typically automated, I attempted to look for any exceeded quotas. On the Cloud Run quotas page and the overall quotas page, I don't see any exceeded quotas now or historically. However, this is an area I'm not super familiar with, so may have missed something.
Retrying dozens of times over the course of two days.
The GCP status page shows no outages.
What are additional troubleshooting steps I should take to investigate & fix this issue?
Partial info from the YAML tab in the GCP Console for the failing Service:
status:
  observedGeneration: 1
  conditions:
  - type: Ready
    status: Unknown
    message: 'Cloud Run error: Internal system error, system will retry later.'
    lastTransitionTime: '2020-10-08T21:07:20.844314Z'
  - type: ConfigurationsReady
    status: Unknown
    message: 'Cloud Run error: Internal system error, system will retry later.'
    lastTransitionTime: '2020-10-08T21:07:20.755212Z'
  - type: RoutesReady
    status: Unknown
    message: 'Cloud Run error: Internal system error, system will retry later.'
    lastTransitionTime: '2020-10-08T21:07:20.844314Z'
  latestCreatedRevisionName: cloud-run-gen-name-3bab80f75cfd57cf87ad89d9d2c18ba3-00001-fus
After quite a bit of trial and error, I got everything working again.
The first thing I did that made some progress was to disable the Cloud Run Admin API and re-enable it. After that change, I was able to create a service using the example container from the Console, logged in as the project owner. I was also able to create a service using the example container from the CLI, logged in as the CI service account. However, the original command from my question still had identical behavior as before. I have no idea how the project got in this state, such that the project owner couldn't use Cloud Run.
The second thing I did was to re-push the container image I was trying to use (gcr.io/adapt-ci/http-echo) to GCR. I pushed the exact same image as was there previously. This finally allowed the CI system to successfully create the Service.
As part of my earlier troubleshooting, I had looked at Google Container Registry for this project and had confirmed that the needed image was still present. However, we had somewhat recently enabled a lifecycle policy on the Cloud Storage bucket to delete items older than a certain amount of time. So my best guess is that policy deleted some, but not all of the files associated with the gcr.io/adapt-ci/http-echo image and this resulted in the internal error instead of an error saying that the container image couldn't be found.
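The two steps above can be sketched as follows. The project and image names come from the post. The `run` helper is a hypothetical wrapper that only prints each command unless `DRY_RUN=0` is set, and the push assumes a good copy of the image still exists in the local Docker cache. Disabling an API on a project that serves live traffic may be disruptive, so treat this as something for a test project like the one described.

```shell
set -eu

# Names taken from the post above.
PROJECT="adapt-ci"
IMAGE="gcr.io/adapt-ci/http-echo"

# Hypothetical helper: print each command instead of executing it
# unless DRY_RUN=0 is set in the environment.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then echo "+ $*"; else "$@"; fi
}

# Step 1: disable and re-enable the Cloud Run Admin API.
reset_cloud_run_api() {
  run gcloud services disable run.googleapis.com --project="$PROJECT"
  run gcloud services enable run.googleapis.com --project="$PROJECT"
}

# Step 2: push the same image again so that any missing layers are re-uploaded.
# Assumes the image is available locally under this exact name.
repush_image() {
  run docker push "$IMAGE"
}

reset_cloud_run_api
repush_image
```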

Error when trying to connect to a Cloud SQL instance using the Cloud Shell

I've had a Cloud SQL instance for about a year now.
I always accessed it the same way:
I would go to my project on the Cloud Console.
Click on the Cloud Shell icon at the top right (a small right pointing arrow).
A black shell screen would pop up where I would type
gcloud sql connect <my instance> --user=root.
Enter my password.
Now, all of a sudden, I am getting an error message saying:
There was no instance found at projects//instances/ or you are not authorized to connect to it.
I am the owner of the project and also have Admin rights to the Cloud SQL instance. The project and instance are still there, and my app that accesses the data stored in the instance's database is working fine, so I know the database is also present; otherwise my app wouldn't work.
I didn't touch or change anything in the Cloud SQL instance. Suddenly, I simply can't access my database using the exact same procedure I have been using almost every day over the past year now.
I am able to access the database using a local Python script on my laptop and the Cloud SQL Proxy, but I would like to access it from the Cloud Shell again.
Any ideas on what the problem could be?
gcloud components update - updates all of your installed components to the latest version
gcloud init - reinitializes gcloud. It performs the following setup steps:
Authorizes gcloud and other SDK tools to access Google Cloud Platform using your user account credentials, or from an account of your choosing whose credentials are already available.
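After reinitializing, a quick sanity check along these lines may help confirm that the shell is pointed at the expected project and can see the instance. This is only a sketch: the instance name is a hypothetical placeholder, and the `run` helper simply prints each command unless `DRY_RUN=0` is set.

```shell
set -eu

# Hypothetical placeholder: substitute the name of your Cloud SQL instance.
INSTANCE="${INSTANCE:-my-instance}"

# Hypothetical helper: print each command instead of executing it
# unless DRY_RUN=0 is set in the environment.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then echo "+ $*"; else "$@"; fi
}

check_cloud_sql_access() {
  # Show which project gcloud is currently configured to use.
  run gcloud config get-value project
  # List the Cloud SQL instances visible in that project.
  run gcloud sql instances list
  # Retry the original connection command.
  run gcloud sql connect "$INSTANCE" --user=root
}

check_cloud_sql_access
```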
It seems there was a problem with the GCP Cloud Shell (even though there was no mention of it on the GCP status page). When I logged back in today and followed the same process as above, everything worked well.
It looks like GCP Cloud Shell can occasionally go rogue and start producing errors. A word of advice: don't panic when this happens (like I did) and start resetting, rebooting and messing things up. Just wait a day and check back again.

Deploying cloud foundry with BOSH - what does a "bosh delete deployment" clean up?

We've just gone through the process of deploying a multi-node (34 nodes) Cloud Foundry using BOSH, with a few hiccups along the way. One in particular was that it took us several "bosh deploy" runs to get through the initial compilation steps. We'd start the bosh deploy, it would start compiling, get through a few components and then fail. There is no doubt that we have some configuration issues with our VMware-based infrastructure, and I suspect we are running out of resources. But here is my main question for now.
We were able to get through the compiles by issuing a "bosh delete deployment ourcloud --force" after a failure.
What does this command clear out? It obviously left successfully compiled stuff in place, but what is cleaned? Temporary storage? Anything else?
Thanks.
bosh delete deployment deletes an entire deployment: it deletes the VMs in vCenter, clears the deployment's info from its database and deletes the manifest. After it's done there should be no trace (except logs) of the deployment.
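A rough way to see what remains afterwards, assuming the legacy BOSH CLI whose syntax the question uses (the newer CLI spells the delete as `bosh -d ourcloud delete-deployment`). The `run` helper is a hypothetical wrapper that only prints each command unless `DRY_RUN=0` is set.

```shell
set -eu

# Deployment name taken from the question.
DEPLOYMENT="ourcloud"

# Hypothetical helper: print each command instead of executing it
# unless DRY_RUN=0 is set in the environment.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then echo "+ $*"; else "$@"; fi
}

inspect_after_delete() {
  run bosh delete deployment "$DEPLOYMENT" --force
  # The deleted deployment should no longer appear in this list.
  run bosh deployments
  # Uploaded releases are managed separately from deployments and stay listed here.
  run bosh releases
}

inspect_after_delete
```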