Google Cloud Run runs over and over again in error - google-cloud-platform

I'm using Cloud Run, triggered by a Pub/Sub message.
But when the Cloud Run code hits an error, the application is re-run over and over again.
This seems unnecessary while testing, because I can see the error in the log and don't need the code to re-run.
Where can I turn this off?
I'm using Node JS.

You can purge your PubSub push subscription, or delete it.

Solved it short term by enclosing the whole code block in a try/catch and making sure every failure throws an error, so that the catch block always handles it.
Then, instead of returning a 400 status in the catch block, I returned 200, so the Pub/Sub message got acked as if everything had worked (even if it had not).
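A minimal sketch of that workaround in Node.js, assuming the standard Pub/Sub push envelope (a JSON body with a base64-encoded `message.data` field). The names `handlePush` and `processMessage` are hypothetical and stand in for the real handler and business logic. Note that this acks failed messages, so they are not retried; that may be fine while testing but it hides failures in production.

```javascript
// Hypothetical business logic; replace with the real work.
async function processMessage(payload) {
  if (!payload) throw new Error('empty payload');
}

// Returns the HTTP status code to send back to Pub/Sub.
// Any 2xx acknowledgement status (such as 200 or 204) stops redelivery.
async function handlePush(body, work = processMessage) {
  try {
    const envelope = JSON.parse(body);
    const payload = Buffer.from(envelope.message.data || '', 'base64').toString('utf8');
    await work(payload);
    return 204; // processed successfully, message is acked
  } catch (err) {
    // Log the failure so it is still visible in Cloud Logging,
    // but ack anyway so Pub/Sub does not re-run the code.
    console.error('Processing failed, acking anyway:', err);
    return 200;
  }
}
```

In an HTTP handler, read the raw request body, pass it to `handlePush`, and set the response status to the returned value.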

Related

Troubleshooting error 503 on Google Cloud Run

I am running a container on Google Cloud Run. For each request a new instance is started. The requests need around 15 minutes to get processed. I modified the default timeout and everything is working fine. But sometimes, for around 10% of the requests, I get this error:
The request failed because either the HTTP response was malformed or
connection to the instance had an error. Additional troubleshooting
documentation can be found at:
https://cloud.google.com/run/docs/troubleshooting#timeout-503
When I re-run the exact same request, I get no errors. I tried putting try/catch everywhere, but I am not able to figure out what is happening. I checked the CPU and memory usage, and everything looks fine; the maximum reached is 50%. Any advice on how I can get more information about the problem?

Mysterious 500 error with AWS Lambda; unable to debug

I have an API that I host using Lambda (nodejs), with API-gateway. I'm using serverless to deploy.
Generally things have been fine, but while I was working on a specific function today, I started to receive HTTP 500 errors when hitting the endpoint. However, while there were still API Gateway access logs for the endpoint, there were no CloudWatch logs for the Lambda functions getting hit. I was able to verify that the Authorizer was getting hit successfully and not returning any issue (if it was, it would have been a 401). After using CLI tools to invoke the function from the command line, the 500 error went away and I was able to successfully hit the endpoints again.
Has anyone ever run into this before? If I'm missing a debug step, I would really like to know. It was really concerning that my API could be generating 500 errors with no paper trail to help me understand what was happening.
You can check your role and permissions; this link could help you: https://aws.amazon.com/premiumsupport/knowledge-center/api-gateway-lambda-stage-variable-500/
You can also debug further with X-Ray: https://docs.aws.amazon.com/lambda/latest/dg/services-xray.html

Flux webhook receiver giving 400 error code

Hi,
I am trying to enable a webhook for GitHub with Flux as described in this link: https://toolkit.fluxcd.io/guides/webhook-receivers/. GitHub fails to push the event and gets a 400 error code. This is on a GCP cluster.
Any pointers to debug this would be of great help.
On the cluster, I cross-checked that the controllers are all up and running.
-Prashanth
It looks like the issue was with some setting in the GitRepository names. Now I see the following when the push event is sent:
{"level":"info","ts":"2021-02-22T05:56:23.346Z","logger":"receiver-server","msg":"handling request","digest":"cfc7a9a5cd337d1c0f58d5d790eb1f988c3c91ed1201a1b692cf8b479abc62a2"}
{"level":"info","ts":"2021-02-22T05:56:23.347Z","logger":"receiver-server","msg":"handling GitHub event: push","receiver":"github-receiver"}
{"level":"info","ts":"2021-02-22T05:56:23.347Z","logger":"receiver-server","msg":"found matching receiver","receiver":"github-receiver"}
{"level":"info","ts":"2021-02-22T05:56:23.354Z","logger":"receiver-server","msg":"resource annotated","receiver":"github-receiver","resource":"gitrepoinfo"}
{"level":"info","ts":"2021-02-22T05:56:23.580Z","logger":"event-server","msg":"Discarding event, no alerts found for the involved object","object":"flux-system/gitrepoinfo","kind":"GitRepository"}
{"level":"info","ts":"2021-02-22T05:56:24.321Z","logger":"event-server","msg":"Discarding event, no alerts found for the involved object","object":"flux-system/podinfo","kind":"Kustomization"}
I added Alert and Provider configuration to see the error. Even though events are getting pushed, Flux is not picking up changes from the repo.
Any pointers would be of great help.
-Prashanth

Data unavailable error message while running script from Runner

I am trying to run my Postman collection with the Runner option. While running the collection I get a "Data unavailable" error message, and my script stops right there.
Can anyone please guide me on this?
This is an ongoing issue on Postman's end. As a workaround, you can delete your history and then run your collection.

How do you prevent a continuous webjob from restarting after an error occurred?

I have a website with four continuous webjobs listening on different topics of a service bus.
If an error occurs during the execution of one of these webjobs and the process exits, how do I prevent the webjob from starting up again (which in most cases would simply hit the same error again)?
I tried keeping a disable.job file in the root of each webjob folder, thinking that running the webjob manually would override it, but instead the webjob shuts down almost immediately after detecting that the file is present (I thought it would only be checked on an automatic restart).
There is no mechanism today to achieve that. If a continuous WebJob is not disabled, the WebJob engine will always try to restart if it crashes for any reason. That is what most users expect.
If you don't want that, one thing you could do is catch the exception in your WebJob, and simply do nothing (i.e. get in a Sleep loop). However, I would suggest getting to the bottom of the error, and seeing whether it can be avoided.
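The catch-and-sleep idea can be sketched as follows. The answer does not say which language the WebJob uses, so this is a JavaScript illustration only; `runWithoutRestart` and `listen` are hypothetical names, and `maxIdleTicks` exists only so the loop can be bounded when trying it out.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// `listen` is a hypothetical function that runs the job's message loop
// and rejects when an unrecoverable error occurs.
async function runWithoutRestart(listen, idleMs = 60000, maxIdleTicks = Infinity) {
  try {
    await listen();
    return 'completed';
  } catch (err) {
    // Log the error, then keep the process alive doing nothing,
    // so the WebJob engine sees no crash and does not restart the job.
    console.error('WebJob failed; idling instead of exiting:', err);
    for (let tick = 0; tick < maxIdleTicks; tick++) {
      await sleep(idleMs);
    }
    return 'idled';
  }
}
```

With the default `maxIdleTicks`, the function never returns after a failure, which is the point: the process stays up until someone stops or redeploys the job.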