Cloud foundry: ERR Timed out after 1m0s: health check never passed - cloud-foundry

After I deployed my app into cloud foundry got the following error message:
ERR Timed out after 1m0s: health check never passed.
Of course on my local machine works perfectly.

You should change your health check type.
if the application does not expose a Web interface you need to change the healthcheck type to process.
Valid values are port, process, and http.
To configure a health check while creating or updating an application, use the cf push command:
$ cf push YOUR-APP -u process
See the Health Check doc for more information:
https://docs.cloudfoundry.org/devguide/deploy-apps/healthchecks.html

Based on the discussion in the comments and my own testing of the actual application you are deploying, it appears that this particular app takes an age to start. Possibly related to individual Java service timeouts (as you have not bound any CF services to the application).
Anyway, while I'm not sure what the actual problem is (possibly an issue with PWS itself), this can be sort-of worked around by specifying the -t option when doing a push, or adding the timeout: <int> attribute to the manifest (see the manifest documentation.
OLD ANSWER
Need more details to be sure, but I imagine one of two things are happening:
You are not using the correct port. Cloud Foundry exposes the port it expects the application to be deployed using the PORT (or, pre-Diego, VCAP_APP_PORT) environment variable. This defaults to 8080, so if your application is not listening on 8080 (or is bound to 127.0.0.1 instead of 0.0.0.0), then the health check will fail.
Your application does not expose any API endpoints, and should be deployed with the --no-route option on CF, and (starting with Diego) needs to have cf set-health-check [app-name] executed against it. This should only be done if your application genuinely does not need a health check.
Some build packs can take care of the first automatically for you. Which build pack are you using? Or, alternatively, which language are you using?

You can disable the health with the below command
(Short term solution)
cf push app_name -p target/app.jar -u none

Related

What keeps accessing Google Cloud metadata on my instance

I have a Google Cloud compute instance running with Ubuntu 18. We had wireshark running tracking another problem and we noticed that every minute something is accessing the meta data server. Three requests every minute:
GET /computeMetadata/v1/instance/virtual-clock/drift-token?alt=json&last_etag=XXXXXXXXXXXXXXXX&recursive=False&timeout_sec=60&wait_for_change=True
GET /computeMetadata/v1/instance/network-interfaces/?alt=json&last_etag=XXXXXXXXXXXXXXXX&recursive=True&timeout_sec=60&wait_for_change=True
GET /computeMetadata/v1/?alt=json&last_etag=XXXXXXXXXXXXXXXX&recursive=True&timeout_sec=77&wait_for_change=True
In call cases, the wireshark says the source is the IP of my instance, and the destination is the 169.254.169.254 which is the Google metadata server.
I don't have any code we have written that is accessing the server. The first one makes me think that this is some Google specific software that is accessing the meta data? But I haven't been able to prove that. What is worrisome is that the response for the third one contains ssh keys. Also, every minute seem excessive.
I see another post talking about scripts in /usr/share/google, but I don't have that directory. I do see that google-fluent is installed. I also see a installed snap for google-cloud-sdk. Could one of those be it? I don't recall installing them, AFAIK, I am not using it, so if that is it, what is the harm in uninstalling it?
You do not have a problem to worry about. The metadata server is private to your instance. The Google VM guest environment software and Stackdriver (fluentd) are making requests to the metadata server to get credentials, detect changes (new SSH keys), set the clock, etc.
The IP address 169.254.169.254 is an IPv4 Link Local Address. Only your VM has a route to that network.
Compute Engine Guest Environment
Do not attempt to uninstall the Guest Environment. You can remove Stackdriver, but I do not recommend that. Stackdriver provides logging and monitoring features that are very useful.

How can I scale CloudFoundry applications "down" without the risk of restarting all of them?

This is a question regarding the Swisscom Application Cloud.
I have implemented a strategy to restart already deployed CloudFoundry applications without using cf restart APP_NAME. I wanted to be able to:
restart running applications without needing access the app manifest and
avoid them suffering any down-time.
The general concept looks like this:
cf scale APP_NAME -I 2
increasing the instance count of the app from 1 to 2
wait for all app instances to be running
cf restart-app-instance APP_NAME 0
restart the "old" app instance
wait for all app instances to be running again
cf scale easyasset-repower-staging -I 1
decrease the instance count of the app back from 2 to 1
This generally works and usually does what I expect it to do. The problem I am having occurs at Stage (3), where sometimes instead of just scaling the instance count back, CloudFoundry will also restart all (remaining) instances.
I do not understand:
Why does this happen only sometimes (all apps restart when scaling down)?
Shouldn't CloudFoundry keep the the remaining instances up and running?
If cf scale is not able to keep perfectly fine running app instances alive - when is it useful?
Please Note:
I am well aware of the Bluegreen / Autopilot plugins for zero-down-time deployment of applications in CloudFoundry and I am actually using them for our deployments from our build server, but they require me to provide a manifest (and additional credentials), which in this case I don't have access to (unless I can somehow extract it from a running app via cf create-app-manifest?).
Update 1:
Looking at the plugins again I found bg-restage, which apparently does approximately what I want, but I have no idea how reliable that one is.
Update 2:
I have concluded that it's probably an obscure issue (or bug) in CloudFoundry and that there are no guarantees given by cf scale that existing instances are to remain running. As pointed out above, I have since realised that it is indeed very much possible to generate the app manifest on the fly (cf create-app-manifest) and even though I couldn't use the bg-restage plugin without errors, I reverted back to the blue-green-deploy plugin, which I can now hand a freshly generated manifest to avoid this whole cf scale exercise.
Comment Questions:
Why do you have the need to restart instances of your application?
We are caching some values from persistent storage on start-up. This restart is happening when changes to that data was detected.
information about the health-check
We are using all types of health checks, depending on which app is to be re-started (http, process and port). I have observed this issue only for apps with health checkhttp. I also have ahttp-endpoint` defined for the health check.
Are you trying to alter the memory with cf scale as well?
No, I am trying to keep all app configuration the same during this process.
When you have two running instances, the command
cf scale <APP> -i 1
will kill instance #1 and instance #0 will not be affected.

Promise Technology VTrak configure webserver

I inherited management of some Promise VTrak disk array servers. They recently had to be transferred to a different location. We've got them set up and networking is all configured, and even have a linux server mounting to it. Before they were transferred I was trained with the web gui it comes with. However, since the move we have not been able to connect to the web gui interface.
I can ssh into the system and really do everything from there, but I would love to figure out why webserver is not coming up.
The VTRAK system does not allow for much configuration it seams. From the CLI I can start, stop, or restart the webserver, and the only thing I can configure is the amount of time someone can be logged into the gui for. I don't see anywhere where you can configure http or anything like that.
We're pretty sure it's not a firewall issue as well.
I got the same issue this morning, and resolved it as follows:
It appears there are conditions that can render the WebPam webserver with invalid configuration HttpPort and HttpsPort values. One condition that causes this is if the sessiontimeout parameter exceeds 1439 set in the GUI. Apparently the software then freaks and creates invalid configuration which locks out the GUI because the http ports have been changed to 25943 and 29538.
To fix this login through CLI:
swmgt -a mod -n webserver -s "sessiontimeout=29"
swmgt -a mod -n webserver -s "HttpPort=80"
swmgt -a mod -n webserver -s "HttpsPort=443"
swmgt -a restart -n Webserver
You should now be able to access the WebPam webserver interface again through your web browser.
After discussing with my IT department we found it was a firewall issue.

How to set up Micro CloudFoundry on Windows

tldr; This question was to get help setting up Micro Cloud Foundry on Windows XP behind a corporate firewall as an innovation-demonstration project for a Fortune 500 IT departent. Basically, the project stalled, despite this stackoverflow page - the magic wasn't strong enough. I am accepting #DanHigman answer below, but if anyone sees this and can provide a simple straight-forward answer, by all means...
Can anyone provide a clear step-by-step on setting up MCF on a Windows (XP in my case) machine behind a corporate firewall, for demostrating the feasibility of PaaS in the corporate IT world?
My VM is installed and running and I can use the menu ok. I have vmc working. I have a test Node.js server app, that works on local, ready to push. But I can't get past that stage.
The firewall gave me trouble so I lowered my goal to just work offline. I followed the instructions noted below as best I could, but often the instructions are mac oriented - I would like them for a Windows command line (especially SSH tunneling):
http://blog.cloudfoundry.com/2011/09/08/working-offline-with-micro-cloud-foundry/
http://support.cloudfoundry.com/entries/20332921-micro-cloud-foundry-trouble-shooting-help
This blogger may have half-way covered my problem doing the SSH tunnel settings, but all it gives is "use Putty" - more detail would help:
http://support.cloudfoundry.com/entries/20419943-using-micro-cloud-locally
Also, whenever the vmc obviously gets an error or other message, it only outputs the following in the command line:
vmc target http://api.vcap.me
<<<
[200, "<html><body>SNP/2.0/102/Unknown Command 'info'</body></html>\r\n\r\n", {}
]
>>>
Thanks for any help. BTW - I know I could do this on my mac, the big obstacle is the windows and firewall environment.
Update:
#Dan and #ebottard: Thanks to your help, I'm almost there. ping is working now, hosts file seems right, but the vmc target api.vcap.me still does not find the VM at that 192.168.253.128 IP - even tho ping does. In the first link above, Martin wrote the following, but assuming we are doing it on a mac:
After the update is complete, you will need to make some changes on your local system. What you will need to do is to set up an SSH tunnel to access your Micro Cloud Foundry VM (note that you will need to supply the IP address in the command below with the actual IP of your VM, which is displayed in the console).
sudo ssh -L 80:192.168.168.149:80 vcap#192.168.168.149
Password:
vcap#192.168.168.149's password:Â
The first password being prompted is the sudo password for your machine, as it is needed to open port 80 which requires root privileges. The second password is the vcap user password which you entered during the initial configuration of your Micro Cloud Foundry.
I need to have these instructions translated into Windows, and all I have to go on is that I might use puTTy (which I have downloaded) to do it. Any more ideas?
Looks like you're running an application on your Windows machine called "Snarl" (a poor Windows-based clone of the OS 10 app Growl :-p). It looks like it's interfering with communication to the MCF intstance, close it and have another try.

Jrun ColdFusion service intermittently fails to start

We occasionaly have a problem where we attempt to start the Jrun service and it fails with the following two errors:
error JRun Naming Service unable to start on port 2902
java.net.BindException: Port in use by another service or process: 2902
info No JDBC data sources have been configured for this server (see jrun-resources.xml)
error java.net.BindException: Port in use by another service or process: 8300
We then have to reboot the machine and Jrun comes up with no problem. This is very intermittent - happens perhaps one out of every 10 times we restart Jrun services.
I saw another reference on StackOverflow that if Windows Services take longer than 30 seconds to restart Windows shuts down the startup proccess. Perhaps that is the issue here? The logs indeed indicate that these errors are thrown about 37+ seconds after the restart command is issued.
We are on a 64bit platform on WinServer 2008.
Thanks!
We've been experiencing a similar problem on some of our servers. Unfortunately, netstat never indicated any sort of actual port conflict for us. My suspicion is that it's related to our recent deployment of a ColdFusion "cumulative hotfix" to our servers. We use the multi-server edition of CF 8.0.1 enterprise with a large number of instances on each machine -- each with its own JVM and its own distinct set of ports. Each CF instance is attached to its own IIS website and runs as its own Windows Service.
Within the past few weeks, we started getting similar "port in use" exceptions on startup, on our 32-bit machines as well as our 64-bit machines, all of which are running Windows Server 2003. I found several possible culprits and tried the following:
In jrun-jms.xml for each CF instance, there's an entry for the RMI transport layer that reads <port>0</port> -- which, according to the JRun documentation, means "choose a random port." I made that non-random and distinct per instance (in the 2600-2650 range) and restarted each instance. Things improved temporarily, perhaps coincidentally.
In the same file, under the entry for the TCPIP transport later, every instance defaulted to <port>2522</port> -- so I changed those to distinct ports per instance in the 2500-2550 range and restarted each instance. That didn't seem to help at all.
I tried researching whether ports in the 2500-3000 range might be used for any other purpose, and I couldn't find anything obvious, and besides, netstat wasn't telling me that any of my choices were in use.
I found something online about Windows designating ports from 1024 to 5000 as the "dynamic port" range, so I added 10000 to the port numbers I had set in jrun-jms.xml and restarted each instance again. Still didn't help.
I tried changing the port in jndi.properties, also by adding 10000 to the port numbers. Unfortunately this meant wiping out all my wsconfig connections to IIS and creating them again from scratch. I had to edit wsconfig_jvm.config as well, adding -DWSConfig.PortScanStartPort=12900 to java.args, so it could detect my CF instances. (By default it only scans ports 2900-3000. See bpurcell.org for details. It's an old post but still relevant.) So far so good!
My best guess is that Adobe (or MS Windows) changed the way some of its code grabs "random" ports. But all I know for sure so far is that the steps outlined above appear to have fixed the problem.
Have you verified that the services are in fact stopping? Task manager should show no instances of jrun.exe. You can also check to see what is bound to that port by opening a command window and running
netstat -a -b
This will list all your open ports, plus what program is using them. You can also use
netstat -a -o
Which does the same thing as the above, but will list the process id instead of the program name. You can then cross-reference those with task manager. You'll need to enable showing the PIDs in task manager by going to View->Select Columns and making sure PID is checked. My guess would be that the jrun processes are not shutting down in a timely fashion.