Currently I have a microservice running in Cloud Foundry, and I am trapping SIGTERM and SIGHUP. I'm trying to verify which signal is sent when a cf restage is performed. I've seen the termination signals documented for a lot of other commands, but not for this one. I would appreciate it if somebody could point me to any documentation, or share what they know, about the signal sent to the application on a cf restage. Thank you.
The signal your app receives shouldn't differ between cf actions (i.e. stop, restart, restage, or even if your app is restarted due to foundation maintenance): it should always get a SIGTERM, ten seconds to shut down gracefully, and then a SIGKILL.
https://docs.pivotal.io/pivotalcf/2-6/devguide/deploy-apps/app-lifecycle.html#shutdown
I ran a little test on Pivotal Web Services to confirm this for cf restage, using an app that catches and logs SIGTERM. You can see right in the middle of the log output below where the SIGTERM is caught by the app. It's just a little harder to spot in this case because the staging logs are coming through at the same time.
Hope that helps!
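For reference, the test app was just a small script that traps and logs SIGTERM; a minimal sketch of that kind of handler (an assumption on my part, not the exact app used for the log below) looks like this:

#!/bin/bash
# Minimal sketch: log when SIGTERM arrives, then exit cleanly.
trap 'echo "SIGTERM caught, exiting"; exit 0' TERM

echo "running"
# Keep the process in the foreground so the platform has something to signal.
while true; do
  sleep 1
done

If the handler takes longer than the ten-second grace period mentioned above, the platform follows up with a SIGKILL.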
2019-08-25T22:02:02.90-0400 [CELL/0] OUT Cell 65a71ce1-e630-4765-8f60-adebfa730268 stopping instance a91e593b-d9b6-42aa-7021-b8cd
2019-08-25T22:02:02.98-0400 [API/9] OUT Creating build for app with guid f58e6aae-783d-4a28-bd30-54c20d314ef4
2019-08-25T22:02:03.87-0400 [STG/0] OUT Downloading binary_buildpack...
2019-08-25T22:02:03.91-0400 [APP/PROC/WEB/0] OUT running
2019-08-25T22:02:03.94-0400 [STG/0] OUT Downloaded binary_buildpack
2019-08-25T22:02:03.94-0400 [STG/0] OUT Cell 9aa90abe-6a8f-4485-90d1-71da907de9a3 creating container for instance 4cd508ee-3ce3-4e61-a9b7-5a997ca5583e
2019-08-25T22:02:05.36-0400 [STG/0] OUT Cell 9aa90abe-6a8f-4485-90d1-71da907de9a3 successfully created container for instance 4cd508ee-3ce3-4e61-a9b7-5a997ca5583e
2019-08-25T22:02:05.72-0400 [STG/0] OUT Downloading app package...
2019-08-25T22:02:05.72-0400 [STG/0] OUT Downloading build artifacts cache...
2019-08-25T22:02:05.77-0400 [STG/0] ERR Downloading build artifacts cache failed
2019-08-25T22:02:05.92-0400 [STG/0] OUT Downloaded app package (651.6K)
2019-08-25T22:02:06.57-0400 [STG/0] OUT -----> Binary Buildpack version 1.0.33
2019-08-25T22:02:06.83-0400 [STG/0] OUT Exit status 0
2019-08-25T22:02:06.83-0400 [STG/0] OUT Uploading droplet, build artifacts cache...
2019-08-25T22:02:06.83-0400 [STG/0] OUT Uploading droplet...
2019-08-25T22:02:06.83-0400 [STG/0] OUT Uploading build artifacts cache...
2019-08-25T22:02:06.97-0400 [STG/0] OUT Uploaded build artifacts cache (215B)
2019-08-25T22:02:07.02-0400 [API/2] OUT Creating droplet for app with guid f58e6aae-783d-4a28-bd30-54c20d314ef4
2019-08-25T22:02:08.12-0400 [APP/PROC/WEB/0] OUT SIGTERM caught, exiting
2019-08-25T22:02:08.13-0400 [CELL/SSHD/0] OUT Exit status 0
2019-08-25T22:02:08.20-0400 [APP/PROC/WEB/0] OUT Exit status 134
2019-08-25T22:02:08.28-0400 [CELL/0] OUT Cell 65a71ce1-e630-4765-8f60-adebfa730268 destroying container for instance a91e593b-d9b6-42aa-7021-b8cd
2019-08-25T22:02:08.91-0400 [PROXY/0] OUT Exit status 137
2019-08-25T22:02:09.16-0400 [CELL/0] OUT Cell 65a71ce1-e630-4765-8f60-adebfa730268 successfully destroyed container for instance a91e593b-d9b6-42aa-7021-b8cd
2019-08-25T22:02:10.07-0400 [STG/0] OUT Uploaded droplet (653.1K)
2019-08-25T22:02:10.07-0400 [STG/0] OUT Uploading complete
2019-08-25T22:02:11.24-0400 [STG/0] OUT Cell 9aa90abe-6a8f-4485-90d1-71da907de9a3 stopping instance 4cd508ee-3ce3-4e61-a9b7-5a997ca5583e
2019-08-25T22:02:11.24-0400 [STG/0] OUT Cell 9aa90abe-6a8f-4485-90d1-71da907de9a3 destroying container for instance 4cd508ee-3ce3-4e61-a9b7-5a997ca5583e
2019-08-25T22:02:11.68-0400 [CELL/0] OUT Cell e9fa9dcc-6c6e-4cd4-97cd-5781aa4c64e6 creating container for instance f2bc9aaa-64cf-4331-53b5-bd5f
2019-08-25T22:02:11.95-0400 [STG/0] OUT Cell 9aa90abe-6a8f-4485-90d1-71da907de9a3 successfully destroyed container for instance 4cd508ee-3ce3-4e61-a9b7-5a997ca5583e
2019-08-25T22:02:13.28-0400 [CELL/0] OUT Cell e9fa9dcc-6c6e-4cd4-97cd-5781aa4c64e6 successfully created container for instance f2bc9aaa-64cf-4331-53b5-bd5f
2019-08-25T22:02:14.43-0400 [CELL/0] OUT Downloading droplet...
2019-08-25T22:02:14.78-0400 [CELL/0] OUT Downloaded droplet (653.1K)
2019-08-25T22:02:16.07-0400 [APP/PROC/WEB/0] OUT running
Related
I am trying to set up PCF on Ubuntu 20. While setting it up, the deployment is not able to complete and fails with:
x509: certificate has expired or is not yet valid
The deploy bash log file is as follows. Can someone help me out?
deploy-bosh.log
Starting registry... Finished (00:00:00)
Uploading stemcell 'bosh-warden-boshlite-ubuntu-xenial-go_agent/170.16'... Skipped [Stemcell already uploaded] (00:00:00)
Started deploying
Deleting VM '654e6637-3333-4879-a5a7-26a6066585ab'... Finished (00:00:14)
Creating VM for instance 'bosh/0' from stemcell '211465a3-381f-4fdd-83ba-9591803442f9'... Finished (00:00:05)
Waiting for the agent on VM '7091e293-6518-4008-b337-cbbf2d273eae' to be ready... Failed (00:00:04)
Failed deploying (00:00:23)
Stopping registry... Finished (00:00:00)
Cleaning up rendered CPI jobs... Finished (00:00:00)
Deploying:
Creating instance 'bosh/0':
Waiting until instance is ready:
Post https://mbus:<redacted>@10.144.0.2:6868/agent: x509: certificate has expired or is not yet valid
Exit code 1
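A quick check I could run (a sketch, assuming openssl is available and 10.144.0.2:6868 is reachable from this machine) to compare the agent certificate's validity window against the host clock:

# Print the notBefore/notAfter dates of the certificate served on the mbus endpoint.
echo | openssl s_client -connect 10.144.0.2:6868 2>/dev/null | openssl x509 -noout -dates
# Compare against the current UTC time; "not yet valid" usually points at a skewed clock or stale certificates.
date -u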
I'm trying to deploy a container to Cloud Run, but my deploy fails because of this error:
Cloud Run error: Container failed to start. Failed to start and then listen on the port defined by the PORT environment variable. Logs for this revision might contain more information.
Locally my container is able to start and I can see this log (phoenix app):
19:54:51.487 [info] Running ProjectWeb.Endpoint with cowboy 2.7.0 at 0.0.0.0:8080 (http)
When I add to my docker run invocation -p 8080:8080, I can see that curl localhost:8080/health returns a 200 response.
curl localhost:8080/health
[{"error":null,"healthy":true,"name":"NOOP","time":12}]
What's strange is that I don't see any of my container logs in Cloud Run or Cloud Logging, even though I see them locally and I know the app writes to stdout and stderr on startup, so debugging is very hard.
What could be causing the logging issue? And why isn't Cloud Run able to talk to my container's server?
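For reference, Cloud Run sets the PORT environment variable itself and expects the container to listen on 0.0.0.0:$PORT, so a closer local reproduction of that contract (a sketch; the image name is a placeholder) would be:

# Run the image the way Cloud Run does: inject PORT and expect the app to bind 0.0.0.0:$PORT.
docker run --rm -e PORT=8080 -p 8080:8080 my-phoenix-image
# Then hit the same health endpoint shown above.
curl localhost:8080/health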
I was able to successfully deploy BOSH and CF on GCP, install the cf CLI on my worker machine, and cf login to the API endpoint without any issues. Now I am attempting to deploy a Python and a Node.js hello-world style application (cf push), but I am running into the following errors:
Python:
**ERROR** Could not install python: Get https://buildpacks.cloudfoundry.org/dependencies/python/python-3.5.4-linux-x64-5c7aa3b0.tgz: dial tcp: lookup buildpacks.cloudfoundry.org on 169.254.0.2:53: read udp 10.255.61.196:36513->169.254.0.2:53: i/o timeout
Failed to compile droplet: Failed to run all supply scripts: exit status 14
Node.js:
-----> Nodejs Buildpack version 1.6.28
-----> Installing binaries
engines.node (package.json): unspecified
engines.npm (package.json): unspecified (use default)
**WARNING** Node version not specified in package.json. See: http://docs.cloudfoundry.org/buildpacks/node/node-tips.html
-----> Installing node 6.14.3
Download [https://buildpacks.cloudfoundry.org/dependencies/node/node-6.14.3-linux-x64-ae2a82a5.tgz]
**ERROR** Unable to install node: Get https://buildpacks.cloudfoundry.org/dependencies/node/node-6.14.3-linux-x64-ae2a82a5.tgz: dial tcp: lookup buildpacks.cloudfoundry.org on 169.254.0.2:53: read udp 10.255.61.206:34802->169.254.0.2:53: i/o timeout
Failed to compile droplet: Failed to run all supply scripts: exit status 14
I am able to download from and ping the buildpack URLs manually on the worker machine, the jumpbox, and the BOSH VMs, so I believe DNS is working properly on each of those machine types.
As part of the default deployment, I believe a SOCKS5 tunnel is created to allow communication from my worker machine to the jumpbox, so this is where I believe the issue lies. https://docs.cloudfoundry.org/cf-cli/http-proxy.html
Running bbl print-env reports export BOSH_ALL_PROXY=ssh+socks5://jumpbox@35.192.140.0:22?private-key=/tmp/bosh-jumpbox725514160/bosh_jumpbox_private.key. However, when I export https_proxy=socks5://jumpbox@35.192.140.0:22?private-key=/tmp/bosh-jumpbox389236516/bosh_jumpbox_private.key and do a cf push, I receive the following error:
Request error: Get https://api.cloudfoundry.costub.com/v2/info: proxy: SOCKS5 proxy at 35.192.140.0:22 has unexpected version 83
TIP: If you are behind a firewall and require an HTTP proxy, verify the https_proxy environment variable is correctly set. Else, check your network connection.
FAILED
Am I on the right track? Is my https_proxy variable formatted correctly? I also tried https_proxy=socks5://jumpbox@35.192.140.0:22 with the same result.
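One variant worth sketching (reusing the jumpbox address and the temporary key path that bbl print-env reports above; that path changes between invocations): open a local SOCKS5 tunnel over SSH, since port 22 on the jumpbox speaks SSH rather than raw SOCKS5, and point https_proxy at the local end of the tunnel:

# Start a local SOCKS5 proxy that forwards over SSH to the jumpbox.
ssh -f -N -D 1080 -i /tmp/bosh-jumpbox725514160/bosh_jumpbox_private.key jumpbox@35.192.140.0
# Point the cf CLI at the local SOCKS5 endpoint instead of port 22 on the jumpbox.
export https_proxy=socks5://localhost:1080
cf push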
We are facing an issue while starting the Informatica cluster service.
When starting Informatica cluster services, some scripts install the Ambari server on infabde, bdemaster, and bdeslave.
The script keeps trying to install Ambari on infabde again and again in a loop, so the cluster service fails to start, reporting that Ambari is already installed on infabde. It does not try to install on the other two nodes.
Error Log:
2017-01-12 17:10:30,763 [localhost-startStop-1] INFO com.infa.products.ihs.service.ambari.ScriptLauncher- Waiting for Script's streams to end.
2017-01-12 17:10:41,210 [localhost-startStop-1] ERROR com.infa.products.ihs.beans.application.ClusterListener- [InfaHadoopServiceException_00047] The launch of Ambari server on host [infabde.lucidtechsol.com] failed because the host already has an installed Ambari server. You can add the host to another cluster.
com.infa.products.ihs.service.exception.InfaHadoopServiceException: [InfaHadoopServiceException_00047] The launch of Ambari server on host [infabde.hostname.com] failed because the host already has an installed Ambari server. You can add the host to another cluster.
Run the reset script
./ResetScript.sh true user@server.com user@server.com
./ResetScript.sh false user@server.com user@client.com
and then enable IHS.
ResetScript.sh can be found in services/Infahadoopserveice/binaries
I installed the Cloudera VM and started trying some basic stuff. First I just wanted to ls the HDFS directories, so I issued the command below.
[cloudera@quickstart ~]$ hadoop fs -ls /
ls: Failed on local exception: java.net.SocketException: Network is unreachable; Host Details : local host is: "quickstart.cloudera/10.0.2.15"; destination host is: "quickstart.cloudera":8020;
However, ps -fu hdfs says both the namenode and datanode are running. I checked the status using the service command.
[cloudera@quickstart ~]$ sudo service hadoop-hdfs-namenode status
Hadoop namenode is not running [FAILED]
Thinking all the problems would be resolved if I restarted all the services, I executed the command below.
[cloudera@quickstart conf]$ sudo /home/cloudera/cloudera-manager --express --force
[QuickStart] Shutting down CDH services via init scripts...
[QuickStart] Disabling CDH services on boot...
[QuickStart] Starting Cloudera Manager daemons...
[QuickStart] Waiting for Cloudera Manager API...
[QuickStart] Configuring deployment...
Submitted jobs: 92
[QuickStart] Deploying client configuration...
Submitted jobs: 93
[QuickStart] Starting Cloudera Management Service...
Submitted jobs: 101
[QuickStart] Enabling Cloudera Manager daemons on boot...
Now I thought all the services would be up, so I checked the status of the namenode service again. Again it came back as failed.
[cloudera@quickstart ~]$ sudo service hadoop-hdfs-namenode status
Hadoop namenode is not running [FAILED]
Now I decided to manually stop and start the namenode service. Again, not much use.
[cloudera@quickstart ~]$ sudo service hadoop-hdfs-namenode stop
no namenode to stop
Stopped Hadoop namenode: [ OK ]
[cloudera@quickstart ~]$ sudo service hadoop-hdfs-namenode status
Hadoop namenode is not running [FAILED]
[cloudera@quickstart ~]$ sudo service hadoop-hdfs-namenode start
starting namenode, logging to /var/log/hadoop-hdfs/hadoop-hdfs-namenode-quickstart.cloudera.out
Failed to start Hadoop namenode. Return value: 1 [FAILED]
I checked the file /var/log/hadoop-hdfs/hadoop-hdfs-namenode-quickstart.cloudera.out. It just contained the following:
log4j:ERROR Could not find value for key log4j.appender.RFA
log4j:ERROR Could not instantiate appender named "RFA".
I also checked /var/log/hadoop-hdfs/hadoop-cmf-hdfs-NAMENODE-quickstart.cloudera.log.out and found the entries below when I searched for errors. Can anyone please suggest the best way to get the services back on track? Unfortunately I am not able to access Cloudera Manager from the browser. Is there anything I can do from the command line?
2016-02-24 21:02:48,105 WARN com.cloudera.cmf.event.publish.EventStorePublisherWithRetry: Failed to publish event: SimpleEvent{attributes={ROLE_TYPE=[NAMENODE], CATEGORY=[LOG_MESSAGE], ROLE=[hdfs-NAMENODE], SEVERITY=[IMPORTANT], SERVICE=[hdfs], HOST_IDS=[quickstart.cloudera], SERVICE_TYPE=[HDFS], LOG_LEVEL=[WARN], HOSTS=[quickstart.cloudera], EVENTCODE=[EV_LOG_EVENT]}, content=Only one image storage directory (dfs.namenode.name.dir) configured. Beware of data loss due to lack of redundant storage directories!, timestamp=1456295437905} - 1 of 17 failure(s) in last 79302s
java.io.IOException: Error connecting to quickstart.cloudera/10.0.2.15:7184
at com.cloudera.cmf.event.shaded.org.apache.avro.ipc.NettyTransceiver.getChannel(NettyTransceiver.java:249)
at com.cloudera.cmf.event.shaded.org.apache.avro.ipc.NettyTransceiver.<init>(NettyTransceiver.java:198)
at com.cloudera.cmf.event.shaded.org.apache.avro.ipc.NettyTransceiver.<init>(NettyTransceiver.java:133)
at com.cloudera.cmf.event.publish.AvroEventStorePublishProxy.checkSpecificRequestor(AvroEventStorePublishProxy.java:122)
at com.cloudera.cmf.event.publish.AvroEventStorePublishProxy.publishEvent(AvroEventStorePublishProxy.java:196)
at com.cloudera.cmf.event.publish.EventStorePublisherWithRetry$PublishEventTask.run(EventStorePublisherWithRetry.java:242)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.SocketException: Network is unreachable
You can try this:
Check which process is using port 7184 of the namenode (e.g. with the Linux netstat command), kill that process, and then restart (see the sketch below).
Or
Change your namenode port in the configuration and restart Hadoop.
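A rough sketch of the first option, using the same service script as in the question (the PID value is a placeholder to fill in from the netstat output):

# Find which process is holding port 7184 (run as root; the last column shows PID/program name).
sudo netstat -tulpn | grep 7184
# Kill that process using the PID reported above (placeholder).
sudo kill <pid>
# Restart the namenode with the same init script used earlier.
sudo service hadoop-hdfs-namenode stop
sudo service hadoop-hdfs-namenode start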