How to save parquet to S3 from AWS SageMaker?

I would like to save a Spark DataFrame from AWS SageMaker to S3. In a notebook, I ran
myDF.write.mode('overwrite').parquet("s3a://my-bucket/dir/dir2/")
and I get:
Py4JJavaError: An error occurred while calling o326.parquet.
: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.s3native.NativeS3FileSystem not found
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2195)
at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2654)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2667)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:94)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2703)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2685)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:373)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
at org.apache.spark.sql.execution.datasources.DataSource.writeInFileFormat(DataSource.scala:394)
at org.apache.spark.sql.execution.datasources.DataSource.write(DataSource.scala:471)
at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:50)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:74)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:138)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:135)
at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:116)
at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:92)
at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:92)
at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:609)
at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:233)
at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:217)
at org.apache.spark.sql.DataFrameWriter.parquet(DataFrameWriter.scala:508)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:280)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:214)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.s3native.NativeS3FileSystem not found
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2101)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2193)
How should I do this correctly from the notebook? Many thanks!

The SageMaker notebook instance is not running Spark, and it doesn't have the Hadoop or other Java classes that you are trying to invoke.
The Jupyter notebook in SageMaker usually comes with Python libraries such as pandas, which you can use to write the parquet file (for example, https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.to_parquet.html ).
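Here is a minimal sketch of the pandas route, assuming the data fits in the notebook's memory and that s3fs is installed so pandas can write directly to S3 (the output file name data.parquet is a placeholder):

import pandas as pd

# If myDF is a (small) Spark DataFrame, collect it into pandas first;
# this pulls every row into the notebook's memory.
pandas_df = myDF.toPandas()

# pandas hands "s3://" paths to s3fs; the notebook's IAM role must be
# allowed to write to the bucket.
pandas_df.to_parquet("s3://my-bucket/dir/dir2/data.parquet")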
Another option is to connect from the Jupyter notebook to an existing (or new) Spark cluster and execute the command remotely there. See here for documentation on how to set this connection up: https://aws.amazon.com/blogs/machine-learning/build-amazon-sagemaker-notebooks-backed-by-spark-in-amazon-emr/
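Once the notebook is connected to an EMR-backed Spark cluster (for example via SparkMagic/Livy, as in the linked post), the original write should work when executed remotely, since EMR ships the S3 filesystem classes. A sketch, noting that on EMR the native "s3://" scheme (EMRFS) is the usual choice:

# Runs on the remote Spark cluster, not in the notebook-local kernel
myDF.write.mode('overwrite').parquet("s3://my-bucket/dir/dir2/")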

Related

UnsupportedClassVersionError with mysql jdbc driver in AWS Data Pipeline

I am trying to run a Data Pipeline job in AWS. I added the field "Jdbc Driver Jar Uri" and placed the jar file in my S3 bucket, per the instructions here, because the "Connector/J" driver that AWS Data Pipeline installs does not seem to work.
I'm using mysql-connector-java-8.0.23, and my MySQL database is the same version.
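For reference, the relevant database object in my pipeline definition looks roughly like this (a sketch; the bucket path and the "..." values are placeholders, and jdbcDriverJarUri is, as far as I can tell, the JSON name of the "Jdbc Driver Jar Uri" field):

{
  "id": "rds_mysql",
  "type": "RdsDatabase",
  "username": "...",
  "*password": "...",
  "rdsInstanceId": "...",
  "jdbcDriverJarUri": "s3://my-bucket/mysql-connector-java-8.0.23.jar"
}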
java.lang.UnsupportedClassVersionError: com/mysql/jdbc/Driver : Unsupported major.minor version 52.0
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:808)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:443)
at java.net.URLClassLoader.access$100(URLClassLoader.java:65)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.net.URLClassLoader$1.run(URLClassLoader.java:349)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:348)
at java.lang.ClassLoader.loadClass(ClassLoader.java:430)
at java.lang.ClassLoader.loadClass(ClassLoader.java:363)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:278)
at amazonaws.datapipeline.database.JdbcDriverInitializer.getDriver(JdbcDriverInitializer.java:75)
at amazonaws.datapipeline.database.ConnectionFactory.getRdsDatabaseConnection(ConnectionFactory.java:158)
at amazonaws.datapipeline.database.ConnectionFactory.getConnection(ConnectionFactory.java:74)
at amazonaws.datapipeline.database.ConnectionFactory.getConnectionWithCredentials(ConnectionFactory.java:302)
at amazonaws.datapipeline.connector.SqlDataNode.createConnection(SqlDataNode.java:100)
at amazonaws.datapipeline.connector.SqlDataNode.getConnection(SqlDataNode.java:94)
at amazonaws.datapipeline.connector.SqlDataNode.prepareStatement(SqlDataNode.java:162)
at amazonaws.datapipeline.connector.SqlInputConnector.open(SqlInputConnector.java:49)
at amazonaws.datapipeline.connector.SqlInputConnector.<init>(SqlInputConnector.java:26)
at amazonaws.datapipeline.connector.SqlDataNode.getInputConnector(SqlDataNode.java:79)
at amazonaws.datapipeline.activity.copy.SingleThreadedCopyActivity.processAll(SingleThreadedCopyActivity.java:47)
at amazonaws.datapipeline.activity.copy.SingleThreadedCopyActivity.runActivity(SingleThreadedCopyActivity.java:35)
at amazonaws.datapipeline.activity.CopyActivity.runActivity(CopyActivity.java:22)
at amazonaws.datapipeline.objects.AbstractActivity.run(AbstractActivity.java:16)
at amazonaws.datapipeline.taskrunner.TaskPoller.executeRemoteRunner(TaskPoller.java:136)
at amazonaws.datapipeline.taskrunner.TaskPoller.executeTask(TaskPoller.java:105)
at amazonaws.datapipeline.taskrunner.TaskPoller$1.run(TaskPoller.java:81)
at private.com.amazonaws.services.datapipeline.poller.PollWorker.executeWork(PollWorker.java:76)
at private.com.amazonaws.services.datapipeline.poller.PollWorker.run(PollWorker.java:53)
at java.lang.Thread.run(Thread.java:748)
I've looked at this question for a solution, but I wasn't able to figure out how to adapt those answers to AWS Data Pipeline.
Can someone explain what steps need to be taken to fix this UnsupportedClassVersionError?

AWS Glue spark job - Snowflake connection error

I am getting the below error while running an AWS Glue job using Spark
(Glue version 2.0, Spark 2.4, Python 3).
Could you please let me know if anyone has encountered a similar issue using AWS Glue and Snowflake.
2021-04-27 13:59:23,858 ERROR [main] glue.ProcessLauncher (Logging.scala:logError(91)): Exception in User Class
java.lang.reflect.UndeclaredThrowableException
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1862)
at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:64)
at org.apache.spark.executor.CoarseGrainedExecutorBackend$.run(CoarseGrainedExecutorBackend.scala:188)
at org.apache.spark.executor.CoarseGrainedExecutorBackend$.main(CoarseGrainedExecutorBackend.scala:281)
at org.apache.spark.executor.CoarseGrainedExecutorBackendPlugin$class.launch(CoarseGrainedExecutorBackendWrapper.scala:10)
at org.apache.spark.executor.CoarseGrainedExecutorBackendWrapper$$anon$1.launch(CoarseGrainedExecutorBackendWrapper.scala:15)
at org.apache.spark.executor.CoarseGrainedExecutorBackendWrapper.launch(CoarseGrainedExecutorBackendWrapper.scala:19)
at org.apache.spark.executor.CoarseGrainedExecutorBackendWrapper$.main(CoarseGrainedExecutorBackendWrapper.scala:5)
at org.apache.spark.executor.CoarseGrainedExecutorBackendWrapper.main(CoarseGrainedExecutorBackendWrapper.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.amazonaws.services.glue.SparkProcessLauncherPlugin$class.invoke(ProcessLauncher.scala:44)
at com.amazonaws.services.glue.ProcessLauncher$$anon$1.invoke(ProcessLauncher.scala:75)
at com.amazonaws.services.glue.ProcessLauncher.launch(ProcessLauncher.scala:114)
at com.amazonaws.services.glue.ProcessLauncher$.main(ProcessLauncher.scala:26)
at com.amazonaws.services.glue.ProcessLauncher.main(ProcessLauncher.scala)
Caused by: org.apache.spark.SparkException: Exception thrown in awaitResult:
at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:226)
at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:101)
at org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$run$1.apply$mcV$sp(CoarseGrainedExecutorBackend.scala:201)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$2.run(SparkHadoopUtil.scala:65)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$2.run(SparkHadoopUtil.scala:64)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1844)
... 17 more
Caused by: java.io.IOException: Failed to connect to /172.36.143.34:41447

NSS/PKCS11 errors in docker alpine wildfly on AWS GovCloud

I am using the woahbase/alpine-wildfly image. I keep receiving the following errors when trying to connect to AWS endpoints for S3 and/or SQS:
Caused by: java.security.ProviderException: Could not initialize NSS
and Caused by: java.io.IOException: NSS initialization failed. The errors seem similar to this bug, https://bugs.openjdk.java.net/browse/JDK-8023434 , but that was for a Windows deployment.
Here is the entire error message:
Exception in thread "main" java.lang.ExceptionInInitializerError
at sun.security.ssl.SSLSessionImpl.<init>(SSLSessionImpl.java:188)
at sun.security.ssl.SSLSessionImpl.<init>(SSLSessionImpl.java:152)
at sun.security.ssl.SSLSessionImpl.<clinit>(SSLSessionImpl.java:79)
at sun.security.ssl.SSLSocketImpl.init(SSLSocketImpl.java:598)
at sun.security.ssl.SSLSocketImpl.<init>(SSLSocketImpl.java:566)
at sun.security.ssl.SSLSocketFactoryImpl.createSocket(SSLSocketFactoryImpl.java:110)
at org.apache.http.conn.ssl.SSLConnectionSocketFactory.createLayeredSocket(SSLConnectionSocketFactory.java:363)
at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.upgrade(DefaultHttpClientConnectionOperator.java:192)
at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.upgrade(PoolingHttpClientConnectionManager.java:369)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.amazonaws.http.conn.ClientConnectionManagerFactory$Handler.invoke(ClientConnectionManagerFactory.java:76)
at com.amazonaws.http.conn.$Proxy2.upgrade(Unknown Source)
at org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:415)
at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:236)
at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:184)
at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:184)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:55)
at com.amazonaws.http.apache.client.impl.SdkHttpClient.execute(SdkHttpClient.java:72)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1190)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1030)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:742)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:716)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:699)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:667)
at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:649)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:513)
at com.amazonaws.services.sqs.AmazonSQSClient.doInvoke(AmazonSQSClient.java:1740)
at com.amazonaws.services.sqs.AmazonSQSClient.invoke(AmazonSQSClient.java:1716)
at com.amazonaws.services.sqs.AmazonSQSClient.executeCreateQueue(AmazonSQSClient.java:718)
at com.amazonaws.services.sqs.AmazonSQSClient.createQueue(AmazonSQSClient.java:695)
at com.amazonaws.services.sqs.AmazonSQSClient.createQueue(AmazonSQSClient.java:730)
at com.mycompany.ck.aws.credentials.test.Main.main(Main.java:54)
Caused by: java.security.ProviderException: Could not initialize NSS
at sun.security.pkcs11.SunPKCS11.<init>(SunPKCS11.java:223)
at sun.security.pkcs11.SunPKCS11.<init>(SunPKCS11.java:103)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at sun.security.jca.ProviderConfig$2.run(ProviderConfig.java:224)
at sun.security.jca.ProviderConfig$2.run(ProviderConfig.java:206)
at java.security.AccessController.doPrivileged(Native Method)
at sun.security.jca.ProviderConfig.doLoadProvider(ProviderConfig.java:206)
at sun.security.jca.ProviderConfig.getProvider(ProviderConfig.java:187)
at sun.security.jca.ProviderList.getProvider(ProviderList.java:233)
at sun.security.jca.ProviderList.getIndex(ProviderList.java:263)
at sun.security.jca.ProviderList.getProviderConfig(ProviderList.java:247)
at sun.security.jca.ProviderList.getProvider(ProviderList.java:253)
at java.security.Security.getProvider(Security.java:503)
at sun.security.ssl.SignatureAndHashAlgorithm.<clinit>(SignatureAndHashAlgorithm.java:415)
... 36 more
Caused by: java.io.IOException: NSS initialization failed
at sun.security.pkcs11.Secmod.initialize(Secmod.java:234)
at sun.security.pkcs11.SunPKCS11.<init>(SunPKCS11.java:218)
... 52 more
I am running a RHEL 7.7 host with Docker 1.13.1, build 4ef4b30.
Any help would be appreciated. Thanks!
Looks like the image might be missing the cryptographic libraries needed for SSL algorithms. Try installing the openssl and nss related packages.
Build a custom Dockerfile with these packages and try running it, e.g. (base image taken from the question):
FROM woahbase/alpine-wildfly
RUN apk add --no-cache nss openssl
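Then build and launch it (a sketch; the tag alpine-wildfly-nss is arbitrary):

docker build -t alpine-wildfly-nss .
docker run --rm alpine-wildfly-nss   # plus whatever options you normally pass to the image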

NoSuchMethodError for com.google.protobuf.AbstractMessage.newBuilderForType on Spanner Java Client

I get this error while trying to experiment with Google's Cloud Spanner by running the sample code:
https://github.com/GoogleCloudPlatform/java-docs-samples/blob/master/spanner/cloud-client/src/main/java/com/example/spanner/SpannerSample.java
The stack trace is as follows:
Exception in thread "main" java.lang.NoSuchMethodError: com.google.protobuf.AbstractMessage.newBuilderForType(Lcom/google/protobuf/AbstractMessage$BuilderParent;)Lcom/google/protobuf/Message$Builder;
at com.google.protobuf.SingleFieldBuilderV3.getBuilder(SingleFieldBuilderV3.java:142)
at com.google.spanner.v1.Mutation$Builder.getInsertBuilder(Mutation.java:3227)
at com.google.cloud.spanner.Mutation.toProto(Mutation.java:377)
at com.google.cloud.spanner.SpannerImpl$TransactionContextImpl.commit(SpannerImpl.java:1223)
at com.google.cloud.spanner.SpannerImpl$TransactionRunnerImpl.run(SpannerImpl.java:1148)
at com.google.cloud.spanner.SpannerImpl$SessionImpl.write(SpannerImpl.java:704)
at com.google.cloud.spanner.SessionPool$PooledSession.write(SessionPool.java:201)
at com.google.cloud.spanner.DatabaseClientImpl.write(DatabaseClientImpl.java:31)
at spanner_test.SpannerSample.writeExampleData(SpannerSample.java:164)
at spanner_test.SpannerSample.run(SpannerSample.java:423)
at spanner_test.SpannerSample.main(SpannerSample.java:501)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:144)
I am using Gradle, and my dependencies are as follows:
spanner:
com.google.cloud:google-cloud-spanner:0.9.4-beta
protobuf:
com.google.protobuf:protobuf-java:3.1.0
The project has been migrated to a different repository, java-spanner, and is currently on version 6.36.0, using protobuf-java 3.21.10. I tried your sample writeExampleData with the latest version and did not face the same issue.
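A hedged sketch of the corresponding Gradle upgrade (Groovy DSL assumed; the client pulls in a compatible protobuf-java transitively, so the explicit protobuf pin can usually be dropped):

dependencies {
    // version taken from the java-spanner repository mentioned above
    implementation 'com.google.cloud:google-cloud-spanner:6.36.0'
}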

SAP HANA Vora: Unable to create Vora table on Cloudera

I am trying to create a Vora table using Spark-Vora but am unable to. Please find the full error log below:
com.sap.spark.vora.CatalogException$SystemErrorException: System error
at com.sap.spark.vora.catalog.VoraCatalog.exists(VoraCatalog.scala:122)
at com.sap.spark.vora.SchemaCatalog.load(SchemaCatalog.java:463)
at com.sap.spark.vora.SchemaCatalog.loadTable(SchemaCatalog.java:454)
at com.sap.spark.vora.SchemaCatalog.loadTable(SchemaCatalog.java:122)
at com.sap.spark.vora.client.catalog.VoraCatalogClient$class.getTableMetadata(VoraCatalogClient.scala:180)
at com.sap.spark.vora.client.VoraClient.getTableMetadata(VoraClient.scala:58)
at com.sap.spark.vora.DefaultSource.createRelation(DefaultSource.scala:165)
at org.apache.spark.sql.execution.datasources.CreateTableUsingTemporaryAwareCommand.resolveDataSource(CreateTableUsingTemporaryAwareCommand.scala:73)
at org.apache.spark.sql.execution.datasources.CreateTableUsingTemporaryAwareCommand.run(CreateTableUsingTemporaryAwareCommand.scala:31)
at org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult$lzycompute(commands.scala:57)
at org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult(commands.scala:57)
at org.apache.spark.sql.execution.ExecutedCommand.doExecute(commands.scala:69)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:140)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:138)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:138)
at org.apache.spark.sql.SQLContext$QueryExecution.toRdd$lzycompute(SQLContext.scala:933)
at org.apache.spark.sql.SQLContext$QueryExecution.toRdd(SQLContext.scala:933)
at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:144)
at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:129)
at org.apache.spark.sql.DataFrame$.apply(DataFrame.scala:51)
at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:725)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:30)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:35)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:37)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:39)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:41)
at $iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:43)
at $iwC$$iwC$$iwC$$iwC.<init>(<console>:45)
at $iwC$$iwC$$iwC.<init>(<console>:47)
at $iwC$$iwC.<init>(<console>:49)
at $iwC.<init>(<console>:51)
at <init>(<console>:53)
at .<init>(<console>:57)
at .<clinit>(<console>)
at .<init>(<console>:7)
at .<clinit>(<console>)
at $print(<console>)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065)
at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1340)
at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840)
at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871)
at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819)
at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:857)
at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:902)
at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:814)
at org.apache.spark.repl.SparkILoop.processLine$1(SparkILoop.scala:657)
at org.apache.spark.repl.SparkILoop.innerLoop$1(SparkILoop.scala:665)
at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$loop(SparkILoop.scala:670)
at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply$mcZ$sp(SparkILoop.scala:997)
at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$process(SparkILoop.scala:945)
at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1059)
at org.apache.spark.repl.Main$.main(Main.scala:31)
at org.apache.spark.repl.Main.main(Main.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:674)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:120)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: com.sap.hl.catalog.VoraCatalogException$ConnectionTimeoutException: Failure in connecting to the catalog within 2 SECONDS
at com.sap.hl.catalog.commands.Utils.handleResult(Utils.java:82)
at com.sap.hl.catalog.commands.Utils.getTransaction(Utils.java:26)
at com.sap.hl.catalog.commands.Exists.call(Exists.java:24)
at com.sap.hl.catalog.commands.Exists.call(Exists.java:10)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
------------------------------------------------------------------------
Please help me to resolve this issue.
Thanks,
Akash
The error indicates that the Vora Catalog is either not installed or not running correctly.
Since the Vora Catalog relies on the Vora DLog, the DLog also needs to be installed and running. For install instructions, see the Vora Installation and Administration Guide.
You can check the Vora Discovery Server UI (:8500/ui) for 'green' entries for 'vora-catalog' and 'dlog'. In case of issues, you can check the log files /var/log/vora-catalog and /var/log/vora-dlog.
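A quick hedged check from a cluster node, assuming the Discovery Server is the Consul instance behind port 8500 (so its HTTP health API is available) and using the log paths from above:

# Ask the discovery service whether vora-catalog is registered and healthy
curl http://localhost:8500/v1/health/service/vora-catalog

# Inspect recent catalog and dlog output (adjust the paths if these are directories)
tail -n 50 /var/log/vora-catalog /var/log/vora-dlog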