HDFS IO error in Flume - hdfs

I am trying to load a file from my Windows machine to HDFS using Flume.
I am getting the following error:
12:42:02 WARN hdfs.HDFSEventSink: HDFS IO error
java.io.IOException: Incomplete HDFS URI, no host: hdfs://10.74.xxx.217:9000:/user/urmi/FlumeData.1374649892113
at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:118)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2150)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:80)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2184)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2166)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:302)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:194)
at org.apache.flume.sink.hdfs.BucketWriter.open(BucketWriter.java:123)
at org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:183)
at org.apache.flume.sink.hdfs.HDFSEventSink$1.doCall(HDFSEventSink.java:432)
at org.apache.flume.sink.hdfs.HDFSEventSink$1.doCall(HDFSEventSink.java:429)
at org.apache.flume.sink.hdfs.HDFSEventSink$ProxyCallable.call(HDFSEventSink.java:164)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)

There is an extra colon (:) after the port number in your HDFS URI.
So the correct URI should be
hdfs://10.74.162.217:9000/
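For reference, a minimal sketch of what the corresponding HDFS sink configuration might look like, assuming an agent named a1 with a sink k1 and channel c1 (all hypothetical names):

a1.sinks.k1.type = hdfs
a1.sinks.k1.channel = c1
# note: no colon between the port and the path
a1.sinks.k1.hdfs.path = hdfs://10.74.162.217:9000/user/urmi/
a1.sinks.k1.hdfs.filePrefix = FlumeData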

Pyspark read jdbc giving errors. How to fix?

I am connecting to RDS MySQL using JDBC in PySpark. I have tried almost everything I found on Stack Overflow for debugging, but I am still unable to make it work.
from pyspark.sql import SparkSession

spark = SparkSession.builder.config("spark.jars", mysql_jar) \
    .master("local[*]").appName("PySpark_MySQL_test").getOrCreate()
df = spark.read.format("jdbc") \
    .option("url", "jdbc:mysql://hostname.amazonaws.com:1150/dbname?user=user_name&password=password") \
    .option("driver", "com.mysql.cj.jdbc.Driver").option("dbtable", "table_name").load()
I have tried using the same connection details with the pymysql library in Python; it connects and brings back results.
But here I am getting the below error and am unable to solve it.
raise Py4JJavaError(
py4j.protocol.Py4JJavaError: An error occurred while calling o38.load.
: com.mysql.cj.jdbc.exceptions.CommunicationsException: Communications link failure
The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.
at com.mysql.cj.jdbc.exceptions.SQLError.createCommunicationsException(SQLError.java:174)
at com.mysql.cj.jdbc.exceptions.SQLExceptionsMapping.translateException(SQLExceptionsMapping.java:64)
at com.mysql.cj.jdbc.ConnectionImpl.createNewIO(ConnectionImpl.java:827)
at com.mysql.cj.jdbc.ConnectionImpl.<init>(ConnectionImpl.java:447)
at com.mysql.cj.jdbc.ConnectionImpl.getInstance(ConnectionImpl.java:237)
at com.mysql.cj.jdbc.NonRegisteringDriver.connect(NonRegisteringDriver.java:199)
at org.apache.spark.sql.execution.datasources.jdbc.connection.BasicConnectionProvider.getConnection(BasicConnectionProvider.scala:49)
at org.apache.spark.sql.execution.datasources.jdbc.connection.ConnectionProvider$.create(ConnectionProvider.scala:68)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.$anonfun$createConnectionFactory$1(JdbcUtils.scala:62)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$.resolveTable(JDBCRDD.scala:56)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCRelation$.getSchema(JDBCRelation.scala:226)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:35)
at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:355)
at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:325)
at org.apache.spark.sql.DataFrameReader.$anonfun$load$3(DataFrameReader.scala:307)
at scala.Option.getOrElse(Option.scala:189)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:307)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:225)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:282)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:238)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: com.mysql.cj.exceptions.CJCommunicationsException: Communications link failure
I have experienced the same issue, and it is now resolved. The core reason is that Spark uses the driver (master) node to open the JDBC connection but the worker nodes to execute the tasks, so the driver may connect to MySQL while the workers still raise a communications error. Based on this, open the security rules on the MySQL instance so that all Spark nodes can connect to it.
For anyone coming here for an answer while using Docker, give the solution below a try.
Use the following configuration:
source_df = spark.read.format('jdbc').options(
    url='jdbc:mysql://host.docker.internal:3306/superset?useSSL=false&allowPublicKeyRetrieval=true',
    driver='com.mysql.cj.jdbc.Driver',
    dbtable='table',
    user='root',
    password='root').load()
I tried the host with localhost, 127.0.0.1, and even the IP address from docker inspect, but none of them worked; changing it to host.docker.internal made it work.

Developing Glue ETL script locally - java.lang.IllegalStateException: Connection pool shut down

I'm currently developing ETL scripts locally using the AWS Glue ETL library.
I'm facing an issue when extracting data from an S3 bucket as a DynamicFrame.
When I want to convert it to a DataFrame using toDF(), it always triggers this exception:
py4j.protocol.Py4JJavaError: An error occurred while calling o52.toDF
...
ERROR Executor: Exception in task 5.0 in stage 3.0 (TID 29)
java.lang.IllegalStateException: Connection pool shut down
at org.apache.http.util.Asserts.check(Asserts.java:34)
at org.apache.http.pool.AbstractConnPool.lease(AbstractConnPool.java:191)
at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.requestConnection(PoolingHttpClientConnectionManager.java:267)
at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.amazonaws.http.conn.ClientConnectionManagerFactory$Handler.invoke(ClientConnectionManagerFactory.java:76)
at com.amazonaws.http.conn.$Proxy15.requestConnection(Unknown Source)
at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:176)
at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:185)
at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
at com.amazonaws.http.apache.client.impl.SdkHttpClient.execute(SdkHttpClient.java:72)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1330)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1145)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:802)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:770)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:744)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:704)
at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:686)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:550)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:530)
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5062)
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5008)
at com.amazonaws.services.s3.AmazonS3Client.getObject(AmazonS3Client.java:1490)
at org.apache.hadoop.fs.s3a.S3AInputStream.reopen(S3AInputStream.java:148)
at org.apache.hadoop.fs.s3a.S3AInputStream.lazySeek(S3AInputStream.java:281)
at org.apache.hadoop.fs.s3a.S3AInputStream.read(S3AInputStream.java:364)
at java.io.DataInputStream.read(DataInputStream.java:149)
at org.apache.hadoop.io.compress.DecompressorStream.getCompressedData(DecompressorStream.java:179)
at org.apache.hadoop.io.compress.DecompressorStream.decompress(DecompressorStream.java:163)
at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:105)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:284)
at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
at com.amazonaws.services.glue.readers.BufferedStream.read(DynamicRecordReader.scala:91)
at com.fasterxml.jackson.core.json.ByteSourceJsonBootstrapper.ensureLoaded(ByteSourceJsonBootstrapper.java:489)
at com.fasterxml.jackson.core.json.ByteSourceJsonBootstrapper.detectEncoding(ByteSourceJsonBootstrapper.java:126)
at com.fasterxml.jackson.core.json.ByteSourceJsonBootstrapper.constructParser(ByteSourceJsonBootstrapper.java:215)
I tried the same code on an AWS Glue DevEndpoint and it works fine. Any idea how to resolve this?
Please go with Java 8 and your issue will be resolved.
Check with java -version.
I had the same issue when running the dev environment with the following specs:
Scala version 2.11.12
Spark version 2.4.3
Glue 1.0.0
To fix it, add the following line to the Spark configuration in $SPARK_HOME/conf/spark-defaults.conf:
spark.master local
Alternatively, depending on how you are running your job, you can configure this dynamically if you are in control of the Spark context, e.g.:
from pyspark.conf import SparkConf
from pyspark.context import SparkContext

conf = SparkConf()
# run Spark single-threaded locally instead of local[*]
conf.setMaster("local").setAppName("My app")
sc = SparkContext(conf=conf)
I have found this happens when running in local mode with multiple threads. Increasing fs.s3.connection.maximum or fs.s3a.connection.maximum did not fix the issue for me, although this post indicates that it should: https://kb.databricks.com/jobs/job-fails-connection-pool.html
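That said, if you do want to try raising the S3A connection-pool limit, a minimal sketch of setting it programmatically (the app name and the value 100 are arbitrary assumptions) would be:

from pyspark.conf import SparkConf
from pyspark.context import SparkContext

conf = SparkConf()
conf.setMaster("local").setAppName("glue-local-test")
# spark.hadoop.* properties are forwarded to the Hadoop configuration,
# so this raises fs.s3a.connection.maximum for the S3A filesystem
conf.set("spark.hadoop.fs.s3a.connection.maximum", "100")
sc = SparkContext(conf=conf)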

Amazon EC2 to AWS Elasticache Redis connection problem

I am connecting to AWS ElastiCache Redis via Redisson from my Amazon EC2 instance. After lots of Redis connection requests, I get the following issue, which halts my program execution. The problem doesn't occur for the first few requests to Redis, but it eventually happens after many requests.
2018-10-11 11:02:38,363 ERROR org.redisson.client.handler.CommandsQueue - Exception occured. Channel: [id: 0x46c06a6a, L:0.0.0.0/0.0.0.0:49308 ! R:redis-pa-qc-001.redis-pa-qc.yzmnbg.use1.cache.amazonaws.com/10.0.24.226:6379]
io.netty.handler.codec.DecoderException: java.lang.RuntimeException: Unexpected error: java.security.InvalidAlgorithmParameterException: the trustAnchors parameter must be non-empty
at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:459)
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:265)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1412)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:943)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:141)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:645)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:580)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:497)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459)
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:886)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: Unexpected error: java.security.InvalidAlgorithmParameterException: the trustAnchors parameter must be non-empty
at sun.security.ssl.Handshaker.checkThrown(Handshaker.java:1429)
at sun.security.ssl.SSLEngineImpl.checkTaskThrown(SSLEngineImpl.java:535)
at sun.security.ssl.SSLEngineImpl.readNetRecord(SSLEngineImpl.java:813)
at sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:781)
at javax.net.ssl.SSLEngine.unwrap(SSLEngine.java:624)
at io.netty.handler.ssl.SslHandler$SslEngineType$3.unwrap(SslHandler.java:292)
at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1248)
at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1159)
at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1194)
at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:489)
at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:428)
... 16 common frames omitted
Caused by: java.lang.RuntimeException: Unexpected error: java.security.InvalidAlgorithmParameterException: the trustAnchors parameter must be non-empty
at sun.security.validator.PKIXValidator.<init>(PKIXValidator.java:90)
at sun.security.validator.Validator.getInstance(Validator.java:179)
at sun.security.ssl.X509TrustManagerImpl.getValidator(X509TrustManagerImpl.java:312)
at sun.security.ssl.X509TrustManagerImpl.checkTrustedInit(X509TrustManagerImpl.java:171)
at sun.security.ssl.X509TrustManagerImpl.checkTrusted(X509TrustManagerImpl.java:239)
at sun.security.ssl.X509TrustManagerImpl.checkServerTrusted(X509TrustManagerImpl.java:136)
at sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1493)
at sun.security.ssl.ClientHandshaker.processMessage(ClientHandshaker.java:216)
at sun.security.ssl.Handshaker.processLoop(Handshaker.java:979)
at sun.security.ssl.Handshaker$1.run(Handshaker.java:919)
at sun.security.ssl.Handshaker$1.run(Handshaker.java:916)
at java.security.AccessController.doPrivileged(Native Method)
at sun.security.ssl.Handshaker$DelegatedTask.run(Handshaker.java:1369)
at io.netty.handler.ssl.SslHandler.runDelegatedTasks(SslHandler.java:1408)
at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1316)
... 20 common frames omitted
Caused by: java.security.InvalidAlgorithmParameterException: the trustAnchors parameter must be non-empty
at java.security.cert.PKIXParameters.setTrustAnchors(PKIXParameters.java:200)
at java.security.cert.PKIXParameters.<init>(PKIXParameters.java:120)
at java.security.cert.PKIXBuilderParameters.<init>(PKIXBuilderParameters.java:104)
at sun.security.validator.PKIXValidator.<init>(PKIXValidator.java:88)
The error basically means the JVM is not able to find the required SSL certificates (trust anchors) to establish the connection. Check your Java version and update it,
and update the Java CA certificates.
Run:
sudo apt-get install ca-certificates-java (on an Ubuntu system) or
update-ca-certificates -f
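To verify that the trust store is actually populated afterwards, a quick check (assuming a standard Java 8 JRE layout and the default changeit store password) is:

keytool -list -keystore "$JAVA_HOME/jre/lib/security/cacerts" -storepass changeit | head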

Error using flume while fetching twitter data to hdfs

While fetching Twitter data into HDFS using Flume, I am getting this error again and again, even though I have changed the versions of the twitter4j jar files. Please tell me why this error is occurring. Can anyone suggest the next step for fetching the data into HDFS?
(conf-file-poller-0) [DEBUG - org.apache.flume.source.DefaultSourceFactory.getClass(DefaultSourceFactory.java:60)] Source type org.apache.flume.source.twitter.TwitterSource is a custom type
2017-11-01 15:29:12,648 (conf-file-poller-0) [ERROR - org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:150)] Unhandled error
java.lang.NoSuchMethodError: twitter4j.TwitterStream.addListener(Ltwitter4j/StatusListener;)V
at org.apache.flume.source.twitter.TwitterSource.configure(TwitterSource.java:114)
at org.apache.flume.conf.Configurables.configure(Configurables.java:41)
at org.apache.flume.node.AbstractConfigurationProvider.loadSources(AbstractConfigurationProvider.java:326)
at org.apache.flume.node.AbstractConfigurationProvider.getConfiguration(AbstractConfigurationProvider.java:101)
at org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:141)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Resolved by adding Twitter4j Stream 3.0.3 and Twitter4j Core 3.0.3.
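For reference, the corresponding coordinates (assuming a Maven-based build) would be:

org.twitter4j:twitter4j-core:3.0.3
org.twitter4j:twitter4j-stream:3.0.3

Alternatively, with a stock Flume agent, dropping twitter4j-core-3.0.3.jar and twitter4j-stream-3.0.3.jar into Flume's lib/ directory and removing any conflicting twitter4j 4.x jars should let the bundled TwitterSource resolve the addListener method it expects.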

WSO2-CEP KafkaReceiver deployment

I have defined a KafkaReceiver using WSO2 CEP in the management console.
It doesn't work, and I see this error in the server console log:
[2016-02-26 10:17:25,412] INFO {org.wso2.carbon.event.input.adapter.core.internal.InputAdapterRuntime} - Connecting receiver kafkaReceiver
[2016-02-26 10:17:25,415] ERROR {org.wso2.carbon.event.input.adapter.core.internal.InputAdapterRuntime} - Error initializing Input Adapter 'kafkaReceiver, hence this will be suspended indefinitely, Cannot access kafka context due to missing jars
org.wso2.carbon.event.input.adapter.core.exception.InputEventAdapterRuntimeException: Cannot access kafka context due to missing jars
at org.wso2.carbon.event.input.adapter.kafka.KafkaEventAdapter.createConsumerConfig(KafkaEventAdapter.java:114)
at org.wso2.carbon.event.input.adapter.kafka.KafkaEventAdapter.createKafkaAdaptorListener(KafkaEventAdapter.java:132)
at org.wso2.carbon.event.input.adapter.kafka.KafkaEventAdapter.connect(KafkaEventAdapter.java:66)
at org.wso2.carbon.event.input.adapter.core.internal.InputAdapterRuntime.start(InputAdapterRuntime.java:72)
at org.wso2.carbon.event.input.adapter.core.internal.InputAdapterRuntime.startPolling(InputAdapterRuntime.java:62)
at org.wso2.carbon.event.input.adapter.core.internal.CarbonInputEventAdapterService.startPolling(CarbonInputEventAdapterService.java:187)
at org.wso2.carbon.event.receiver.core.internal.CarbonEventReceiverService.startPolling(CarbonEventReceiverService.java:620)
at org.wso2.carbon.event.receiver.core.internal.CarbonEventReceiverManagementService.startPolling(CarbonEventReceiverManagementService.java:99)
at org.wso2.carbon.event.processor.manager.core.internal.CarbonEventManagementService$2.run(CarbonEventManagementService.java:184)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NoClassDefFoundError: kafka/consumer/ConsumerConfig
at org.wso2.carbon.event.input.adapter.kafka.KafkaEventAdapter.createConsumerConfig(KafkaEventAdapter.java:112)
... 15 more
Caused by: java.lang.ClassNotFoundException: kafka.consumer.ConsumerConfig cannot be found by org.wso2.carbon.event.input.adapter.kafka_5.0.3
at org.eclipse.osgi.internal.loader.BundleLoader.findClassInternal(BundleLoader.java:501)
at org.eclipse.osgi.internal.loader.BundleLoader.findClass(BundleLoader.java:421)
at org.eclipse.osgi.internal.loader.BundleLoader.findClass(BundleLoader.java:412)
at org.eclipse.osgi.internal.baseadaptor.DefaultClassLoader.loadClass(DefaultClassLoader.java:107)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
... 16 more
If you read the class source at this link https://github.com/wso2/carbon-analytics-common/blob/master/components/event-receiver/event-input-adapters/org.wso2.carbon.event.input.adapter.kafka/src/main/java/org/wso2/carbon/event/input/adapter/kafka/KafkaEventAdapter.java
you can see that the error occurs during the ConsumerConfig creation.
In order to use Kafka, you need to download Kafka and copy the following client JAR files into the /repository/components/lib/ directory of your CEP installation:
kafka_2.10-0.8.2.1.jar
zkclient-0.3.jar
scala-library-2.10.4.jar
zookeeper-3.4.6.jar
metrics-core-2.2.0.jar
kafka-clients-0.8.2.1.jar
[1]. https://docs.wso2.com/display/CEP400/Supporting+Different+Transports#SupportingDifferentTransports-KafkaTransport