We have a Cassandra cluster up and running in a test environment. The cluster is reachable from the command line, but MATLAB is unable to connect.
Connection code:
contactPoints = "172.31.61.211";
conn = cassandra(contactPoints)
cassandra with properties:
Cluster: "RQ_1"
HostAddresses: "172.31.61.211"
LocalDataCenter: "DC_1"
Keyspaces: ["dse_insights"]
Error message:
Error using cassandra (line 130)
Java exception occurred:
java.lang.NoSuchMethodError: com.google.common.base.Objects.firstNonNull(Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object;
at com.datastax.driver.core.policies.Policies$Builder.build(Policies.java:285)
at com.datastax.driver.core.Cluster$Builder.getConfiguration(Cluster.java:1246)
at com.datastax.driver.core.Cluster.<init>(Cluster.java:116)
at com.datastax.driver.core.Cluster.buildFrom(Cluster.java:181)
at com.datastax.driver.core.Cluster$Builder.build(Cluster.java:1264)
at com.mathworks.toolbox.cassandra.CassandraConnection.<init>(CassandraConnection.java:43)
To me this looks like a conflict between a library version used by MATLAB and the Cassandra Java driver, specifically between Guava versions: MATLAB bundles either an older or a newer Guava than the one the driver was built against.
Also, the MATLAB docs recommend switching to the new integration based on the C++ driver (doc)
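For what it's worth, the stack trace points at a Guava copy from which Objects.firstNonNull has been removed; as far as I can tell that method was removed in newer Guava releases (around 21+), where the replacement is MoreObjects.firstNonNull. A minimal Java illustration of the API difference (a sketch, not MATLAB code, assuming Guava 21 or later on the classpath):

// Sketch: the Cassandra Java driver calls com.google.common.base.Objects.firstNonNull(a, b),
// which exists in Guava 20 and earlier but not in later releases. Against a newer Guava only
// the MoreObjects spelling below exists, which is what the NoSuchMethodError above is
// complaining about at runtime.
import com.google.common.base.MoreObjects;

public class FirstNonNullDemo {
    public static void main(String[] args) {
        String value = MoreObjects.firstNonNull(null, "default"); // Guava 21+ replacement
        System.out.println(value); // prints "default"
    }
}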
I'm getting this error when trying to run a Logstash pipeline with a configuration that uses google_pubsub, on a Docker container running in my production environment:
2021-09-16 19:13:25 FATAL runner:135 - The given configuration is invalid. Reason: Unable to configure plugins: (PluginLoadingError) Couldn't find any input plugin named 'google_pubsub'. Are you sure this is correct? Trying to load the google_pubsub input plugin resulted in this error: Problems loading the requested plugin named google_pubsub of type input. Error: RuntimeError
you might need to reinstall the gem which depends on the missing jar or in case there is Jars.lock then resolve the jars with `lock_jars` command
no such file to load -- com/google/cloud/google-cloud-pubsub/1.37.1/google-cloud-pubsub-1.37.1 (LoadError)
2021-09-16 19:13:25 ERROR Logstash:96 - java.lang.IllegalStateException: Logstash stopped processing because of an error: (SystemExit) exit
This seems to happen randomly when re-installing the plugin. I thought it was a proxy issue, but I have the Google domain enabled in the whitelist; maybe it's the wrong one, or I'm missing something. Still, that doesn't explain the random failures.
Also, when I run the pipeline on my machine I get GCP events, but when I run it on a VM no Pub/Sub messages are pulled. Could a firewall rule be blocking them?
The error message suggests there is a problem loading the 'google_pubsub' input plugin. This error generally occurs when the Pub/Sub input plugin is not installed properly. Kindly ensure that you are installing the Logstash plugin for Pub/Sub correctly.
For example, installing the Logstash plugin for Pub/Sub on a VM:
sudo -u root sudo -u logstash bin/logstash-plugin install logstash-input-google_pubsub
For a detailed demo refer to this community tutorial.
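Separately, on the question of the VM not pulling any messages: it can help to rule out connectivity and permissions independently of Logstash. The snippet below is only a sketch using the google-cloud-pubsub Java client; the project and subscription names are hypothetical placeholders for whatever your pipeline is configured with.

import com.google.cloud.pubsub.v1.AckReplyConsumer;
import com.google.cloud.pubsub.v1.MessageReceiver;
import com.google.cloud.pubsub.v1.Subscriber;
import com.google.pubsub.v1.ProjectSubscriptionName;
import com.google.pubsub.v1.PubsubMessage;

public class PubsubPullCheck {
    public static void main(String[] args) throws Exception {
        // Hypothetical project and subscription; use the ones your Logstash input points at.
        ProjectSubscriptionName subscription =
            ProjectSubscriptionName.of("my-project", "my-subscription");

        // Print every message we manage to pull. If nothing arrives here either,
        // the problem is connectivity or permissions rather than the Logstash plugin.
        MessageReceiver receiver = (PubsubMessage message, AckReplyConsumer consumer) -> {
            System.out.println("Received: " + message.getData().toStringUtf8());
            consumer.ack();
        };

        Subscriber subscriber = Subscriber.newBuilder(subscription, receiver).build();
        subscriber.startAsync().awaitRunning();
        Thread.sleep(60_000); // pull for a minute, then shut down
        subscriber.stopAsync().awaitTerminated();
    }
}

If this pulls messages on your machine but not on the VM, a firewall or egress rule is a reasonable suspect.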
I'm currently trying to add a process in EMR 6.1.0 that will use Spark to store aggregated data in MySQL.
However, when I actually run Spark, I get the following error.
Exception in thread "main" java.lang.RuntimeException: Failed to load class of driverClassName com.mysql.jdbc.
This error did not occur in EMR 6.0.0.
In the process of updating from EMR 6.0.0 to 6.1.0, I changed the Spark version from 2.4.4 to 3.0.0.
The code itself has not changed significantly, and we know that it is not a network problem.
I've spent a lot of time looking through the AWS documentation and can't seem to find any hints.
Can anyone help me?
Place the MySQL connector jar under the $SPARK_HOME/jars folder, or pass the MySQL connector jar path to the spark-shell/spark-submit command using the --jars flag.
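Once the connector jar is on the classpath, a minimal Spark 3.x job writing to MySQL might look like the following. This is only a sketch in Java: the JDBC URL, table name, and credentials are hypothetical, and the driver class name shown is the Connector/J 8.x one (com.mysql.cj.jdbc.Driver); older 5.x connectors use com.mysql.jdbc.Driver.

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class WriteAggregatesToMysql {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder().appName("write-to-mysql").getOrCreate();

        // Stand-in for the real aggregation; any Dataset<Row> works here.
        Dataset<Row> aggregated = spark.range(10).toDF("id");

        aggregated.write()
            .format("jdbc")
            .option("url", "jdbc:mysql://db-host:3306/mydb") // hypothetical endpoint
            .option("dbtable", "aggregates")                 // hypothetical table
            .option("user", "app_user")
            .option("password", "secret")
            .option("driver", "com.mysql.cj.jdbc.Driver")    // Connector/J 8.x class name
            .mode("append")
            .save();

        spark.stop();
    }
}

Submitted, for example, with spark-submit --jars /path/to/mysql-connector-java.jar so the driver class is visible at runtime.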
Spark 3.x depends on HikariCP.
https://github.com/apache/spark/blob/v3.0.0/dev/deps/spark-deps-hadoop-3.2-hive-2.3#L1
The preloaded HikariCP cannot load classes from your application jar because of the class loader it uses:
https://github.com/brettwooldridge/HikariCP/blob/HikariCP-2.5.1/src/main/java/com/zaxxer/hikari/HikariConfig.java#L318
this.getClass().getClassLoader().loadClass(driverClassName)
You should add shade settings if you use the sbt-assembly plugin:
assembly / assemblyShadeRules := {
Seq("com.zaxxer.hikari").map { packageName =>
ShadeRule.rename(s"${packageName}.**" -> s"my_app_shade_package.${packageName}.#1").inAll
}
}
I am trying to fetch a GCP Secret Manager secret from a Dataproc Spark job, but I am getting the error "Exception in thread "main" java.lang.NoClassDefFoundError: com/google/cloud/secretmanager/v1/AccessSecretVersionResponse".
I have added the jars "google-cloud-secretmanager-1.4.2.jar" and "gax-1.62.0.jar" as dependencies of the Dataproc Spark job.
I am using the code from the GCP link below.
https://cloud.google.com/secret-manager/docs/reference/libraries
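The access pattern from that page looks roughly like this (a sketch; the project and secret names are hypothetical placeholders):

import com.google.cloud.secretmanager.v1.AccessSecretVersionResponse;
import com.google.cloud.secretmanager.v1.SecretManagerServiceClient;
import com.google.cloud.secretmanager.v1.SecretVersionName;

public class AccessSecret {
    public static void main(String[] args) throws Exception {
        // Hypothetical project/secret; "latest" resolves to the newest enabled version.
        SecretVersionName name = SecretVersionName.of("my-project", "my-secret", "latest");

        try (SecretManagerServiceClient client = SecretManagerServiceClient.create()) {
            AccessSecretVersionResponse response = client.accessSecretVersion(name);
            System.out.println(response.getPayload().getData().toStringUtf8());
        }
    }
}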
Am I missing something here?
The 2.0-debian10 image has Python >= 3.0 installed. google-cloud-secretmanager 1.4.2 does not support Python >= 3.0 (https://pypi.org/project/google-cloud-secret-manager/1.0.0/). Please use a later version of google-cloud-secretmanager.
With:
Java 1.8.0_231
Hadoop 3.2.1
Flume 1.8.0
I have created an HDFS service on port 9000.
jps:
11688 DataNode
10120 Jps
11465 NameNode
11964 SecondaryNameNode
12621 NodeManager
12239 ResourceManager
Flume conf:
agent1.channels.memory-channel.type=memory
agent1.sources.tail-source.type=exec
agent1.sources.tail-source.command=tail -F /var/log/nginx/access.log
agent1.sources.tail-source.channels=memory-channel
#hdfs sink
agent1.sinks.hdfs-sink.channel=memory-channel
agent1.sinks.hdfs-sink.type=hdfs
agent1.sinks.hdfs-sink.hdfs.path=hdfs://cluster01:9000/system.log
agent1.sinks.hdfs-sink.hdfs.fileType=DataStream
agent1.channels=memory-channel
agent1.sources=tail-source
agent1.sinks=log-sink hdfs-sink
Then I start Flume:
./bin/flume-ng agent --conf conf --conf-file conf/test1.conf --name agent1 -Dflume.root.logger=INFO,console
Then I get this error:
Info: Including Hadoop libraries found via (/usr/local/hadoop-3.2.1/bin/hadoop) for HDFS access
...
2019-11-04 14:48:24,818 (lifecycleSupervisor-1-1) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.start(MonitoredCounterGroup.java:95)] Component type: SINK, name: hdfs-sink started
2019-11-04 14:48:28,823 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.hdfs.HDFSDataStream.configure(HDFSDataStream.java:57)] Serializer = TEXT, UseRawLocalFileSystem = false
2019-11-04 14:48:28,836 (SinkRunner-PollingRunner-DefaultSinkProcessor) [ERROR - org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:447)] process failed
java.lang.NoSuchMethodError: com.google.common.base.Preconditions.checkArgument(ZLjava/lang/String;Ljava/lang/Object;)V
at org.apache.hadoop.conf.Configuration.set(Configuration.java:1357)
at org.apache.hadoop.conf.Configuration.set(Configuration.java:1338)
at org.apache.hadoop.conf.Configuration.setBoolean(Configuration.java:1679)
at org.apache.flume.sink.hdfs.BucketWriter.open(BucketWriter.java:226)
at org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:541)
at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:401)
at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:67)
at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:145)
at java.lang.Thread.run(Thread.java:748)
Exception in thread "SinkRunner-PollingRunner-DefaultSinkProcessor" java.lang.NoSuchMethodError: com.google.common.base.Preconditions.checkArgument(ZLjava/lang/String;Ljava/lang/Object;)V
at org.apache.hadoop.conf.Configuration.set(Configuration.java:1357)
at org.apache.hadoop.conf.Configuration.set(Configuration.java:1338)
at org.apache.hadoop.conf.Configuration.setBoolean(Configuration.java:1679)
at org.apache.flume.sink.hdfs.BucketWriter.open(BucketWriter.java:226)
at org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:541)
at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:401)
at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:67)
at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:145)
at java.lang.Thread.run(Thread.java:748)
I have searched for a while but haven't found the same error on the net. Is there any advice on how to solve this problem?
That may be caused by lib/guava.
I removed lib/guava-11.0.2.jar, restarted Flume, and found that it works.
Output:
2019-11-04 16:52:58,062 (hdfs-hdfs-sink-call-runner-0) [WARN - org.apache.hadoop.util.NativeCodeLoader.<clinit>(NativeCodeLoader.java:60)] Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2019-11-04 16:53:01,532 (Thread-9) [INFO - org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.checkTrustAndSend(SaslDataTransferClient.java:239)] SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
But I still don't know which version of Guava it is using now.
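One way to check which Guava actually wins at runtime is a small diagnostic that prints where the Preconditions class from the stack trace was loaded from. This is just a sketch; compile it and run it with the same classpath Flume ends up using.

public class WhichGuava {
    public static void main(String[] args) {
        // Prints the jar that provides Guava's Preconditions on the current classpath.
        System.out.println(
            com.google.common.base.Preconditions.class
                .getProtectionDomain()
                .getCodeSource()
                .getLocation());
    }
}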
I had the same issue. It seems to be a bug in Flume: it references a method signature that does not exist in that version of Guava.
Replace the guava-11.x.x.jar file with guava-27.x.x.jar from the Hadoop 3 common library; this will work.
Copy hadoop-3.3.0/share/hadoop/common/lib/guava-27.0-jre.jar into your Flume lib directory, and don't forget to delete the older version from the Flume lib directory first.
As others said, there is a clash between guava-11 (Hadoop 2 / Flume 1.8.0/1.9.0) and guava-27 (Hadoop 3).
Other answers don't explain the root cause of the issue: the script at $FLUME_HOME/bin/flume-ng puts all the jars of your Hadoop distribution on the Flume classpath if the $HADOOP_HOME environment variable is set.
A few words on why the suggested actions "fix" the problem: deleting $FLUME_HOME/lib/guava-11.0.2.jar leaves only guava-27.0-jre.jar on the classpath, so there is no more clash.
So there is no need to copy it under $FLUME_HOME/lib, and it is not a bug in Flume, just a version incompatibility: Flume did not upgrade Guava, while Hadoop 3 did.
I did not dig into the details of the changes between those Guava versions; it might happen that everything works fine until it doesn't (for instance, if there is a backward-incompatible change between the two).
So, before using this "fix" in a production environment, I suggest testing extensively to reduce the risk of unexpected problems.
The best solution would be to wait for (or contribute) a new Flume version where Guava is upgraded to v27.
I agree with Alessandro S.
Flume communicates with HDFS via the HDFS APIs, and it doesn't matter which version the Hadoop platform runs as long as those APIs do not change, which is the case most of the time. Flume is actually built against a specific version of the Hadoop libraries; the problem is that you are using the wrong Hadoop library version to run Flume.
So just use the Hadoop 2.x.x libraries to run your Flume 1.8.0.
I have a MapReduce job which uses an HBase table as the output destination of my reduce job. My reducer class implements the TableMap interface from the package org.apache.hadoop.hbase.mapred, and I used the initTableReduceJob() function of the TableMapReduceUtil class from the same package to configure my job.
But when I run my job, I get the following error at the reduce stage:
java.lang.NullPointerException
at org.apache.hadoop.mapred.Task.getFsStatistics(Task.java:1099)
at org.apache.hadoop.mapred.ReduceTask$OldTrackingRecordWriter.<init>(ReduceTask.java:442)
at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:490)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:420)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1093)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
My HBase version is 0.94.0 and my Hadoop version is 1.0.1.
I found a post similar to my question at:
https://forums.aws.amazon.com/thread.jspa?messageID=394846
Could anyone give me some hint about why this happened? Should I just stick with the org.apache.hadoop.hbase.mapreduce package?
This error suggests that you may be running HBase on the local filesystem without HDFS. Try installing or running Hadoop HDFS. The org.apache.hadoop.mapred API appears to require HDFS.
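A quick way to confirm that diagnosis is to print which filesystem the job configuration resolves to; this is only a sketch, run with the same configuration files the job uses. If it prints a file:/// URI, the job is talking to the local filesystem rather than HDFS.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class WhichFs {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Prints the default filesystem, e.g. hdfs://namenode:9000 or file:///.
        System.out.println(FileSystem.get(conf).getUri());
    }
}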
As a possible convenience, you may try the Cloudera development VM, which packages both.