OpenDaylight bundles stuck in GracePeriod and cluster not coming up - akka

We are using the ODL Nitrogen version. When we perform a warm start (i.e., restart the Karaf servers without deleting the "KARAF_HOME/data" folder), the following bundles stay in the "GracePeriod" state for a long time, so other application bundles that depend on them fail. However, when we start Karaf in a clean state (without the data folder), all bundles come up fine.
We also noticed that netty.tcp port 2550 does not get bound when the bundles go into the failure state. We have confirmed that this port is not being used by any other process.
ID | State | Lvl | Version | Name
349 | GracePeriod | 80 | 2.3.3 | mdsal-eos-binding-adapter
350 | Active | 80 | 2.3.3 | mdsal-eos-binding-api
351 | Active | 80 | 2.3.3 | mdsal-eos-common-api
352 | Active | 80 | 2.3.3 | mdsal-eos-common-spi
376 | GracePeriod | 80 | 2.3.3 | mdsal-singleton-dom-impl
142 | Active | 80 | 2.4.20 | akka-actor
143 | Active | 80 | 2.4.20 | akka-cluster
144 | Active | 80 | 2.4.20 | akka-osgi
145 | Active | 80 | 2.4.20 | akka-persistence
146 | Active | 80 | 2.4.20 | akka-protobuf
147 | Active | 80 | 2.4.20 | akka-remote
148 | Active | 80 | 2.4.20 | akka-slf4j
149 | Active | 80 | 2.4.20 | akka-stream
310 | Active | 80 | 1.6.3 | org.opendaylight.controller.sal-akka-raft
We also observe the following log messages rolling continuously; only this message appears, and very frequently. It seems to be preventing the other bundles from coming up.
2018-07-02 22:52:47,299 | WARN | saction-25-27'}} | 298 - org.opendaylight.controller.config-manager - 0.7.3 | DeadlockMonitor$DeadlockMonitorRunnable | ModuleIdentifier{factoryName='binding-broker-impl', instanceName='binding-broker-impl'} did not finish after 84984 ms
2018-07-02 22:52:50,717 | ERROR | rint Extender: 3 | 325 - org.opendaylight.controller.sal-distributed-datastore - 1.6.3 | AbstractDataStore | Shard leaders failed to settle in 90 seconds, giving up
Diag output of the GracePeriod bundles:
karaf#virtuora>diag 349
mdsal-eos-binding-adapter (349)
-------------------------------
Status: GracePeriod
Blueprint
7/3/18 6:17 PM
Missing dependencies:
(objectClass=org.opendaylight.mdsal.binding.dom.codec.api.BindingNormalizedNodeSerializer) (objectClass=org.opendaylight.mdsal.eos.dom.api.DOMEntityOwnershipService)
karaf#virtuora>diag 376
mdsal-singleton-dom-impl (376)
------------------------------
Status: GracePeriod
Blueprint
7/3/18 6:22 PM
Missing dependencies:
(objectClass=org.opendaylight.mdsal.eos.dom.api.DOMEntityOwnershipService)
Please let us know:
why akka is unable to open the netty TCP port
why the DOMEntityOwnershipService and BindingNormalizedNodeSerializer dependencies are reported as missing

You need to set SO_REUSEADDR to enable the port to be reused directly after it is closed. See https://docs.oracle.com/javase/7/docs/api/java/net/StandardSocketOptions.html#SO_REUSEADDR
If you do not set this option, the port will stay blocked for a while, depending on the operating system.
You should also avoid forcefully killing a process where possible, as this does not cleanly shut down its ports.
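For illustration, here is a minimal sketch of the option in plain Java NIO (port 2550 is taken from the question; the class name is made up for this example):
import java.net.InetSocketAddress;
import java.net.StandardSocketOptions;
import java.nio.channels.ServerSocketChannel;

public class ReuseAddrDemo {
    public static void main(String[] args) throws Exception {
        ServerSocketChannel server = ServerSocketChannel.open();
        // Must be set before bind(): it allows binding while connections
        // from a previous owner of the port are still in TIME_WAIT.
        server.setOption(StandardSocketOptions.SO_REUSEADDR, true);
        server.bind(new InetSocketAddress(2550));
        System.out.println("Bound to " + server.getLocalAddress());
        server.close();
    }
}
Whether akka's netty transport exposes this option is a matter of its remoting configuration, so treat this as an illustration of the socket option itself rather than a ready-made akka fix.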

Related

Problem connecting an App Engine app to MySQL in Cloud SQL

I've configured a Cloud SQL second generation instance and an App Engine application (Python 2.7) in one project, and I've made the necessary settings according to that page.
app.yaml
runtime: python27
api_version: 1
threadsafe: true

env_variables:
  CLOUDSQL_CONNECTION_NAME: coral-heuristic-215610:us-central1:db-basic-1
  CLOUDSQL_USER: root
  CLOUDSQL_PASSWORD: xxxxxxxxx

beta_settings:
  cloud_sql_instances: coral-heuristic-215610:us-central1:db-basic-1

libraries:
- name: lxml
  version: latest
- name: MySQLdb
  version: latest

handlers:
- url: /main
  script: main.app
Now, when I try to connect from the app (inside Cloud Shell), I get this error:
OperationalError: (2002, 'Can\'t connect to local MySQL server through socket \'/var/run/mysqld/mysqld.sock\' (2 "No such file or directory")')
Direct connection works:
$ gcloud sql connect db-basic-1 --user=root
was successful...
MySQL [correction_dict]> SHOW PROCESSLIST;
+--------+------+----------------------+-----------------+---------+------+----------+------------------+
| Id | User | Host | db | Command | Time | State | Info |
+--------+------+----------------------+-----------------+---------+------+----------+------------------+
| 9 | root | localhost | NULL | Sleep | 4 | | NULL |
| 10 | root | localhost | NULL | Sleep | 4 | | NULL |
| 112306 | root | 35.204.173.246:59210 | correction_dict | Query | 0 | starting | SHOW PROCESSLIST |
| 112357 | root | localhost | NULL | Sleep | 4 | | NULL |
| 112368 | root | localhost | NULL | Sleep | 0 | | NULL |
+--------+------+----------------------+-----------------+---------+------+----------+------------------+
I've authorized my IP address to connect to the Cloud SQL instance.
Any hints or help?
Google App Engine standard provides a unix socket at /cloudsql/[INSTANCE_CONNECTION_NAME] that automatically connects you to your Cloud SQL instance. All you need to do is connect to it at that address. For the MySQLdb library, that looks like this:
import os
import MySQLdb

cloudsql_unix_socket = '/cloudsql/' + os.environ['CLOUDSQL_CONNECTION_NAME']
db = MySQLdb.connect(
    unix_socket=cloudsql_unix_socket,
    user=os.environ['CLOUDSQL_USER'],
    passwd=os.environ['CLOUDSQL_PASSWORD'])
(If you are running AppEngine Flexible, connecting is different and can be found here)

Cloud Foundry router cannot find api.xx.xxxx.com/info (AWS)

I finally managed to successfully deploy Cloud Foundry to AWS, mostly following the instructions from http://docs.cloudfoundry.org/deploying/ec2/bootstrap-aws-vpc.html
It is failing at the validation step, which is to get a success response for the following:
curl api.subdomain.domain/info
Of course I have substituted the subdomain and domain appropriately.
I am getting the error:
404 Not Found: Requested route ('api.XX.XXXXX.com') does not exist.
The request gets as far as the Cloud Foundry router (router_z1), and I can see this error in the logs for router_z1.
Here is the output of my bosh vms command:
+------------------------------------+---------+---------------+--------------+
| Job/index | State | Resource Pool | IPs |
+------------------------------------+---------+---------------+--------------+
| unknown/unknown | running | medium_z1 | 10.10.16.254 |
| unknown/unknown | running | medium_z2 | 10.10.81.4 |
| unknown/unknown | running | small_errand | 10.10.17.1 |
| unknown/unknown | running | small_errand | 10.10.17.0 |
| api_worker_z1/0 | running | small_z1 | 10.10.17.20 |
| api_z1/0 | running | large_z1 | 10.10.17.18 |
| clock_global/0 | running | medium_z1 | 10.10.17.19 |
| etcd_z1/0 | running | medium_z1 | 10.10.16.20 |
| hm9000_z1/0 | running | medium_z1 | 10.10.17.21 |
| loggregator_trafficcontroller_z1/0 | running | small_z1 | 10.10.16.34 |
| loggregator_z1/0 | running | medium_z1 | 10.10.16.31 |
| login_z1/0 | running | medium_z1 | 10.10.17.17 |
| nats_z1/0 | running | medium_z1 | 10.10.16.11 |
| router_z1/0 | running | router_z1 | 10.10.16.15 |
| runner_z1/0 | running | runner_z1 | 10.10.17.22 |
| stats_z1/0 | running | small_z1 | 10.10.17.15 |
| uaa_z1/0 | running | medium_z1 | 10.10.17.16 |
+------------------------------------+---------+---------------+--------------+
The only change I made in the CF deployment manifest was to eliminate the instances from zone 2, since AWS's default limit on the number of EC2 instances in a particular region is 20.
Any pointers on how to resolve this issue would be appreciated.
Figured out the problem. There were a couple of issues:
1. In the CF deployment manifest, make sure the system_domain property is <BOSH_VPC_SUBDOMAIN>.<BOSH_VPC_DOMAIN>. That is, if you have reserved cf.example.com for the Cloud Foundry PaaS, make sure cf.example.com is what the system_domain property in your deployment manifest refers to. In fact, example.com should not appear anywhere in your deployment manifest without the cf. prefix; throughout the manifest it is always cf.example.com (see the sketch after this list).
2. Do not use '#' in any of the passwords within the deployment manifest. I have logged a bug for this in cf-release: https://github.com/cloudfoundry/cf-release/issues/527
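For reference, a minimal sketch of the relevant manifest fragment (cf.example.com is a placeholder, and the exact property layout varies across cf-release versions):
properties:
  system_domain: cf.example.com
  app_domains:
  - cf.example.com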

How to view Cloud Foundry logs when cf login fails

I have used bosh-lite to deploy a single-node Cloud Foundry in my development environment. After the deployment, I ran bosh vms and it returned the list of VMs:
+------------------------------------+---------+---------------+--------------+
| Job/index | State | Resource Pool | IPs |
+------------------------------------+---------+---------------+--------------+
| api_z1/0 | running | large_z1 | 10.244.0.138 |
| etcd_leader_z1/0 | running | medium_z1 | 10.244.0.38 |
| ha_proxy_z1/0 | running | router_z1 | 10.244.0.34 |
| hm9000_z1/0 | running | medium_z1 | 10.244.0.142 |
| loggregator_trafficcontroller_z1/0 | running | small_z1 | 10.244.0.10 |
| loggregator_z1/0 | running | medium_z1 | 10.244.0.14 |
| login_z1/0 | running | medium_z1 | 10.244.0.134 |
| nats_z1/0 | running | medium_z1 | 10.244.0.6 |
| postgres_z1/0 | running | medium_z1 | 10.244.0.30 |
| router_z1/0 | running | router_z1 | 10.244.0.22 |
| runner_z1/0 | running | runner_z1 | 10.244.0.26 |
| uaa_z1/0 | running | medium_z1 | 10.244.0.130 |
+------------------------------------+---------+---------------+--------------+
But when I try to use "cf api https://api.10.244.0.34.xip.io --skip-ssl-validation" to connect to Cloud Foundry, it returns an error:
ConnectEx tcp: No connection could be made because the target machine
actively refused it.
The log information is very general (this is actually the exception from the CF client, which is written in .NET) and doesn't provide anything useful.
My question is: which VM handles the api command? And where can I find the detailed logs on that VM?
api_z1/0 is handling the command. You can get its logs via the BOSH CLI itself: bosh logs api_z1 0 --all.
You probably also need to add a route to your local route table so that traffic to the HAProxy container at 10.244.0.34 knows to go through the BOSH-lite VM at 192.168.50.4. Run bin/add-route or bin/add-route.bat from the root of your BOSH-lite repo; a sketch of what that amounts to follows.
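For reference, on Linux the script boils down to a single static route along these lines (an assumption based on the standard BOSH-lite setup: 10.244.0.0/19 covers the container addresses listed above, and 192.168.50.4 is the default BOSH-lite VM address):
sudo ip route add 10.244.0.0/19 via 192.168.50.4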

rails 4 mysql2 gem Incorrect MySQL client library version! This gem was compiled for 5.5.30 but the client library is 5.6.19

Upon deployment in production, I get this error. I don't understand where this 5.5.30 is coming from... but I uninstalled the gem locally (OS X) and remotely (Debian) and reinstalled it, so it should be compiled against the latest libraries (5.6.19).
Here are both installed MySQL versions...
on Debian
mysql -u root -p -e 'SHOW VARIABLES LIKE "%version%";'
Enter password:
+-------------------------+-------------------+
| Variable_name | Value |
+-------------------------+-------------------+
| innodb_version | 5.6.19 |
| protocol_version | 10 |
| slave_type_conversions | |
| version | 5.6.19-1~dotdeb.1 |
| version_comment | (Debian) |
| version_compile_machine | x86_64 |
| version_compile_os | debian-linux-gnu |
+-------------------------+-------------------+
on OSX
yves$ mysql -u root -p -e 'SHOW VARIABLES LIKE "%version%";'
Enter password:
+-------------------------+------------------------------+
| Variable_name | Value |
+-------------------------+------------------------------+
| innodb_version | 5.6.19 |
| protocol_version | 10 |
| slave_type_conversions | |
| version | 5.6.19 |
| version_comment | MySQL Community Server (GPL) |
| version_compile_machine | x86_64 |
| version_compile_os | osx10.7 |
+-------------------------+------------------------------+

NoHttpResponseException on uploading file to S3 (camel-aws)

I am trying to upload an approximately 10 GB file from my local machine to S3 (inside a Camel route). Although the file gets uploaded in around 3-4 minutes, it also throws the following exception:
2014-06-26 13:53:33,417 | INFO | ads.com/outbound | FetchRoute | 167 - com.ut.ias - 2.0.3 | Download complete to local. Pushing file to S3
2014-06-26 13:54:19,465 | INFO | manager-worker-6 | AmazonHttpClient | 144 - org.apache.servicemix.bundles.aws-java-sdk - 1.5.1.1 | Unable to execute HTTP request: The target server failed to respond
org.apache.http.NoHttpResponseException: The target server failed to respond
at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:95)[142:org.apache.httpcomponents.httpclient:4.2.5]
at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:62)[142:org.apache.httpcomponents.httpclient:4.2.5]
at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:254)[141:org.apache.httpcomponents.httpcore:4.2.4]
at org.apache.http.impl.AbstractHttpClientConnection.receiveResponseHeader(AbstractHttpClientConnection.java:289)[141:org.apache.httpcomponents.httpcore:4.2.4]
at org.apache.http.impl.conn.DefaultClientConnection.receiveResponseHeader(DefaultClientConnection.java:252)[142:org.apache.httpcomponents.httpclient:4.2.5]
at org.apache.http.impl.conn.ManagedClientConnectionImpl.receiveResponseHeader(ManagedClientConnectionImpl.java:191)[142:org.apache.httpcomponents.httpclient:4.2.5]
at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:300)[141:org.apache.httpcomponents.httpcore:4.2.4]
.......
at java.util.concurrent.FutureTask.run(FutureTask.java:262)[:1.7.0_55]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)[:1.7.0_55]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)[:1.7.0_55]
at java.lang.Thread.run(Thread.java:744)[:1.7.0_55]
2014-06-26 13:55:08,991 | INFO | ads.com/outbound | FetchRoute | 167 - com.ut.ias - 2.0.3 | Upload complete.
Because of this, the Camel route doesn't stop, and it continuously throws InterruptedException:
2014-06-26 13:55:11,182 | INFO | ads.com/outbound | SftpOperations | 110 - org.apache.camel.camel-ftp - 2.12.1 | JSCH -> Disconnecting from cxportal.integralads.com port 22
2014-06-26 13:55:11,183 | INFO | lads.com session | SftpOperations | 110 - org.apache.camel.camel-ftp - 2.12.1 | JSCH -> Caught an exception, leaving main loop due to Socket closed
2014-06-26 13:55:11,183 | WARN | lads.com session | eventadmin | 139 - org.apache.felix.eventadmin - 1.3.2 | EventAdmin: Exception: java.lang.InterruptedException
java.lang.InterruptedException
at EDU.oswego.cs.dl.util.concurrent.LinkedQueue.offer(Unknown Source)[139:org.apache.felix.eventadmin:1.3.2]
at EDU.oswego.cs.dl.util.concurrent.PooledExecutor.execute(Unknown Source)[139:org.apache.felix.eventadmin:1.3.2]
at org.apache.felix.eventadmin.impl.tasks.DefaultThreadPool.executeTask(DefaultThreadPool.java:101)[139:org.apache.felix.eventadmin:1.3.2]
at org.apache.felix.eventadmin.impl.tasks.AsyncDeliverTasks.execute(AsyncDeliverTasks.java:105)[139:org.apache.felix.eventadmin:1.3.2]
at org.apache.felix.eventadmin.impl.handler.EventAdminImpl.postEvent(EventAdminImpl.java:100)[139:org.apache.felix.eventadmin:1.3.2]
at org.apache.felix.eventadmin.impl.adapter.LogEventAdapter$1.logged(LogEventAdapter.java:281)[139:org.apache.felix.eventadmin:1.3.2]
at org.ops4j.pax.logging.service.internal.LogReaderServiceImpl.fire(LogReaderServiceImpl.java:134)[50:org.ops4j.pax.logging.pax-logging-service:1.7.1]
at org.ops4j.pax.logging.service.internal.LogReaderServiceImpl.fireEvent(LogReaderServiceImpl.java:126)[50:org.ops4j.pax.logging.pax-logging-service:1.7.1]
at org.ops4j.pax.logging.service.internal.PaxLoggingServiceImpl.handleEvents(PaxLoggingServiceImpl.java:180)[50:org.ops4j.pax.logging.pax-logging-service:1.7.1]
at org.ops4j.pax.logging.service.internal.PaxLoggerImpl.inform(PaxLoggerImpl.java:145)[50:org.ops4j.pax.logging.pax-logging-service:1.7.1]
at org.ops4j.pax.logging.internal.TrackingLogger.inform(TrackingLogger.java:86)[18:org.ops4j.pax.logging.pax-logging-api:1.7.1]
at org.ops4j.pax.logging.slf4j.Slf4jLogger.info(Slf4jLogger.java:476)[18:org.ops4j.pax.logging.pax-logging-api:1.7.1]
at org.apache.camel.component.file.remote.SftpOperations$JSchLogger.log(SftpOperations.java:359)[110:org.apache.camel.camel-ftp:2.12.1]
at com.jcraft.jsch.Session.run(Session.java:1621)[109:org.apache.servicemix.bundles.jsch:0.1.49.1]
at java.lang.Thread.run(Thread.java:744)[:1.7.0_55]
Please see my code below and let me know where I am going wrong:
TransferManager tm = new TransferManager(
        S3Client.getS3Client());
// TransferManager processes all transfers asynchronously,
// so this call returns immediately.
Upload upload = tm.upload(
        Utils.getProperty(Constants.BUCKET),
        getS3Key(file.getName()), file);
try {
    upload.waitForCompletion();
    logger.info("Upload complete.");
} catch (AmazonClientException amazonClientException) {
    logger.warn("Unable to upload file, upload was aborted.");
    amazonClientException.printStackTrace();
} catch (InterruptedException e) {
    // waitForCompletion() also throws InterruptedException;
    // restore the interrupt flag instead of swallowing it.
    Thread.currentThread().interrupt();
}
The stack trace doesn't contain any reference to my code, so I couldn't determine where the issue is.
Any help or pointers would be really appreciated.
Thanks
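A hedged sketch of one mitigation worth trying (an assumption on the editor's part, not something stated in the post above): forcing TransferManager onto smaller multipart chunks and giving the HTTP client more retries often smooths over transient NoHttpResponseException failures on large uploads, because a dropped response then costs one part rather than the whole transfer. S3Client.getS3Client() in the question hides the client construction, so the construction and credentials below are illustrative:
import com.amazonaws.ClientConfiguration;
import com.amazonaws.auth.BasicAWSCredentials;
import com.amazonaws.services.s3.AmazonS3Client;
import com.amazonaws.services.s3.transfer.TransferManager;
import com.amazonaws.services.s3.transfer.TransferManagerConfiguration;

public class TunedS3TransferManager {
    public static TransferManager create() {
        // More retries and a generous timeout for flaky long-haul links.
        ClientConfiguration clientConfig = new ClientConfiguration();
        clientConfig.setMaxErrorRetry(5);
        clientConfig.setConnectionTimeout(50000); // milliseconds

        // Placeholder credentials; substitute your own provider.
        AmazonS3Client s3 = new AmazonS3Client(
                new BasicAWSCredentials("accessKey", "secretKey"), clientConfig);

        TransferManager tm = new TransferManager(s3);
        TransferManagerConfiguration tmConfig = new TransferManagerConfiguration();
        tmConfig.setMultipartUploadThreshold(16 * 1024 * 1024); // go multipart above 16 MB
        tmConfig.setMinimumUploadPartSize(8 * 1024 * 1024);     // 8 MB parts
        tm.setConfiguration(tmConfig);
        return tm;
    }
}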