NATS Error while developing echo service - cloud-foundry

I'm trying to develop a system service, so I'm using the echo service as a test.
I developed the service by following the directions in the CF docs.
The echo node now runs, but the echo gateway fails with the error "echo_gateway - pid=15040 tid=9321 fid=290e ERROR -- Exiting due to NATS error: Could not connect to server on nats://localhost:4222/"

I ran into this issue and was stuck for almost a week before someone finally helped me resolve it. The underlying problem is something else; because errors are not trapped properly, the gateway reports a misleading message. You need to go to GitHub and get the latest code base. The fix for this issue is http://reviews.cloudfoundry.org/#/c/8891 . Once you have that fix, you will most likely hit a timeout field issue. The solution for that is to define the timeout field in gateway.yml.
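Since the NATS message can be misleading, one quick way to rule out a real connectivity problem is a plain TCP probe against the address from the error. This is only a rough sketch: the helper names are hypothetical, and a successful TCP connect does not prove that NATS itself is healthy, only that something is listening on that port.

```python
import socket
from urllib.parse import urlparse


def parse_nats_url(url, default_port=4222):
    """Split a nats:// URL into (host, port)."""
    parsed = urlparse(url)
    return parsed.hostname, parsed.port or default_port


def can_connect(host, port, timeout=2.0):
    """Return True if a plain TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Address taken from the error message in the question.
    host, port = parse_nats_url("nats://localhost:4222/")
    state = "reachable" if can_connect(host, port) else "not reachable"
    print(f"{host}:{port} is {state}")
```

If the port is reachable and the gateway still exits with the NATS error, that points towards the mis-reported error described above rather than a network problem.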

A few additional properties became required in the echo_gateway.yml.erb file - specifically, the latest were default_plan and timeout, under the service group. The properties have been added to the appropriate file in the vcap-services-sample-release repo.
It looks like the fix for the misleading error has been merged into GitHub. I haven't updated and verified this myself yet, but the Gerrit comments indicate the solution is the same as what the node base has had for some time. I did previously run into that error handling, and it was far more helpful.
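To catch a missing property before starting the gateway, a small check along these lines may help. It is a sketch under assumptions: the config is taken to be already loaded into a Python dict (for example by a YAML parser) with a top-level service mapping, and everything in the example other than default_plan and timeout is made up for illustration.

```python
# Properties the answer above says are now required under the service group.
REQUIRED_SERVICE_KEYS = ("default_plan", "timeout")


def missing_service_keys(config, required=REQUIRED_SERVICE_KEYS):
    """Return the required keys that are absent from config['service']."""
    service = config.get("service") or {}
    return [key for key in required if key not in service]


# Hypothetical, already-parsed gateway config that lacks default_plan.
example_config = {"service": {"name": "echo", "timeout": 15}}
print(missing_service_keys(example_config))
```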

Related

GCP Cloud functions health check failing

I have been trying to deploy a function in GCP for 2 days and receive the following error each time.
OperationError: code=13, message=Function deployment failed due to a health check failure. This usually indicates that your code was built successfully but failed during test execution. Examine the logs to determine the cause. Try deploying again in a few minutes if it appears to be transient.
The Log viewer doesn't give a proper explanation of the problem. It shows the following entry repeatedly until the deployment fails.
"Error: function terminated. Recommended action: inspect logs for termination reason. The function cannot be initialized."
The interesting part is that the same code was working a couple of weeks ago.
This makes me wonder whether it is a bug in GCP after the recent Cloud Functions upgrade.
I've been having the same issue. I found that part of it might be a dependency problem. In my specific case it was related to the slackclient library, but it seems to come from the yarl library, and the recommended fix is to pin it as yarl==1.4.2.
Reference:
https://github.com/slackapi/python-slackclient/issues/764
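To confirm that the pin actually made it into the requirements file before redeploying, something like the following can be used. It is a minimal sketch: pinned_version is a hypothetical helper, the sample requirements text is illustrative, and it only understands simple name==version lines.

```python
def pinned_version(requirements_text, package):
    """Return the version pinned with '==' for package, or None if unpinned."""
    for raw_line in requirements_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if "==" not in line:
            continue
        name, version = line.split("==", 1)
        if name.strip().lower() == package.lower():
            return version.strip()
    return None


# Illustrative requirements.txt contents.
sample = "slackclient\nyarl==1.4.2  # pinned per the linked issue\n"
print(pinned_version(sample, "yarl"))
```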

Random “upstream connect error or disconnect/reset before headers” between services with Istio 1.3

This problem seems to happen randomly, and between different services.
For example, we have a service A that needs to talk to service B, and sometimes we get this error, but after a while it goes away. The error doesn't happen very often.
When this happens, we see the “upstream connect error” message in service A's logs, but nothing in service B. So we think it might be related to the sidecars.
One thing we noticed is that in service B we get a lot of these error messages in the istio-proxy container:
[src/istio/mixerclient/report_batch.cc:109] Mixer Report failed with: UNAVAILABLE:upstream connect error or disconnect/reset before headers. reset reason: connection failure
According to the documentation, when a request comes in, Envoy asks Mixer whether everything is good (authorization and other things), and if Mixer doesn’t reply, the request does not succeed. That is why an option called policyCheckFailOpen exists.
We have it set to false, which I guess is a sane default: we don’t want the request to go through if Mixer cannot be reached. But why can’t it be reached?
disablePolicyChecks: true
policyCheckFailOpen: false
controlPlaneSecurityEnabled: false
NOTE: istio-policy is running with the istio-proxy sidecar. Is that correct?
We don’t see that error in some of the other services that can also fail.
Another log entry that I see a lot, in all the services not running as root with fsGroup defined in the YAML files, is:
watchFileEvents: "/etc/certs": MODIFY|ATTRIB
watchFileEvents: "/etc/certs/..2020_02_10_09_41_46.891624651": MODIFY|ATTRIB
watchFileEvents: notifying
One of the leads I'm chasing is the default circuitBreakers values. Could that be related to this?
Thanks
The error you are seeing is caused by a failure to establish a connection to istio-policy.
Based on this GitHub issue, community members added two answers that could help with your issue:
If mTLS is enabled globally, make sure you set controlPlaneSecurityEnabled: true.
I was facing the same issue, then I read about protocol selection. I realised the name of the port in the service definition should start with a protocol prefix, for example http-. This fixed the issue for me. If you still face the issue, you might need to look at the tls-check for the pods and resolve it using DestinationRules and policies.
istio-policy is running with the istio-proxy sidecar. Is that correct?
Yes, I just checked, and it runs with the sidecar.
Let me know if that helps.
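The port-naming advice above can be checked mechanically. The sketch below flags Service ports whose name does not start with a protocol prefix. It assumes the Service manifest is already loaded into a dict, and the protocol list is an assumption based on Istio's name: <protocol>[-<suffix>] convention, so check it against the documentation for your Istio version. The function names are hypothetical.

```python
# Assumed list of protocol prefixes; verify against your Istio version's docs.
KNOWN_PROTOCOLS = (
    "grpc", "http", "http2", "https", "mongo",
    "mysql", "redis", "tcp", "tls", "udp",
)


def has_protocol_prefix(port_name, protocols=KNOWN_PROTOCOLS):
    """True if the port name is '<protocol>' or '<protocol>-<suffix>'."""
    prefix = port_name.split("-", 1)[0]
    return prefix in protocols


def unprefixed_ports(service):
    """Return names of ports in a Service dict that lack a protocol prefix."""
    ports = service.get("spec", {}).get("ports", [])
    names = [p.get("name") or "<unnamed>" for p in ports]
    return [name for name in names if not has_protocol_prefix(name)]


# Illustrative Service manifest, already parsed into a dict.
service_b = {
    "spec": {
        "ports": [
            {"name": "http-web", "port": 80},
            {"name": "web", "port": 8080},
            {"port": 9090},
        ]
    }
}
print(unprefixed_ports(service_b))
```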

Google Cloud Functions - Deployment hangs for 5-10 minutes, then gives error "Deployment failure: Operation interrupted"

I'm getting errors when I try to deploy a Google Cloud Function. The process hangs for about 5-10 minutes and then an error appears:
"Deployment failure:
Operation interrupted."
I tried creating a new test function with nothing in it in two different projects of mine; both time out with that same error.
Is anyone experiencing anything similar?
There was an incident related to Cloud Functions and Cloud Build that began at 2019-09-24 13:00 and ended at 2019-09-24 18:15 (all times are US/Pacific).
It should be all good now. Please try to deploy your function again.
If it still doesn't work for you, please update your question with more information: minimal reproducible code, dependencies, timestamp.
Yes, having the same issue here. I checked the status on their dashboard; they mark it as OK, but it's not.

App Engine Flexible: Timed out waiting for the app infrastructure to become healthy

I've tried several times to deploy a new version of a service on my App Engine flexible instance using the SDK and the command gcloud app deploy, but all I get is this error:
"ERROR: (gcloud.app.deploy) Error Response: [4] Timed out waiting for the app infrastructure to become healthy."
I couldn't find any answer about it on the GCP issue tracker.
On this question, the asker had the same problem, but no one could answer it.
Any guidance would be very helpful.
According to the GCP team, this particular error occurred because we reached the "In-use IP addresses" quota limit.
They also said that they are working on improving the error messages:
"The engineering team has just created a fix for better quota error
details. There is no ETA for when the fix will be released, but I
would guess it would be in the next version of gcloud."
More than half a year after Loneck's answer, I got the same error. I guessed it could be anything, so I chose to delete the project, create a new one in a new zone, and deploy there. Then it worked for me. It might have been some other limitation in the zone I had chosen at the beginning.
I ran into this and solved it by retrying the deploy 3 times.
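If retrying is the workaround, it can be scripted rather than done by hand. The following is a sketch only: retry and deploy are hypothetical helpers, the attempt count and delay are arbitrary, and retrying will only help with transient failures, not with a quota limit like the one described above.

```python
import subprocess
import time


def retry(action, attempts=3, delay_seconds=0.0):
    """Call action() until it returns True or attempts run out.

    Returns the 1-based attempt number that succeeded, or 0 if none did.
    """
    for attempt in range(1, attempts + 1):
        if action():
            return attempt
        if attempt < attempts:
            time.sleep(delay_seconds)
    return 0


def deploy():
    """Run `gcloud app deploy` once; True if it exits with code 0."""
    result = subprocess.run(["gcloud", "app", "deploy", "--quiet"])
    return result.returncode == 0
```

A call such as retry(deploy, attempts=3, delay_seconds=60) would then make up to three deploy attempts with a minute between them.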

Multiple Instance Issue with ColdFusion 10: "Bad Gateway Error"

I am creating a multiple instance setup on my Developer Edition of ColdFusion. I am running on Mavericks. My guide to the process is this article by Rob Brooks-Bilson.
I did everything right. However, I get the 'Bad Gateway Error' when I try to ping the ColdFusion Administrator.
I think you might have any of the following issues:
The workermap.properties file for your particular instance (cf10/config/wsconfig/1/) has the instance name spelled wrong.
Recheck that you have added the content to the worker.properties file properly. This step is very prone to copy-paste errors. There are two places where you need to add your instance name: in the list, and then in the port configuration (copied from the existing one).
There is some glitch in your mod_jk file.
Last but not least, please re-check that your server.xml (cf10//runtime/conf/) has been edited properly. Also check that the values of the port attribute of the SERVER tag and the CONNECTOR tag are different. Due to some glitch, they can be generated with the same value.
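The last check in the list above (the Server and Connector ports must differ) can be automated. This is a rough sketch that assumes a Tomcat-style server.xml with a Server root element and nested Connector elements; duplicate_ports is a hypothetical helper and the sample XML and port numbers are made up.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def duplicate_ports(server_xml_text):
    """Return port values used more than once by the Server and Connector tags."""
    root = ET.fromstring(server_xml_text)
    ports = []
    # The root <Server> element carries the shutdown port.
    if root.get("port"):
        ports.append(root.get("port"))
    for connector in root.iter("Connector"):
        if connector.get("port"):
            ports.append(connector.get("port"))
    counts = Counter(ports)
    return sorted(port for port, n in counts.items() if n > 1)


# Illustrative server.xml in which both tags ended up with the same port.
SAMPLE = """<Server port="8012" shutdown="SHUTDOWN">
  <Service name="Catalina">
    <Connector port="8012" protocol="AJP/1.3" />
  </Service>
</Server>"""
print(duplicate_ports(SAMPLE))
```

An empty list means no clash was found; any value in the list is a port that needs to be changed in one of the tags.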