SAP Hana Vora : Unable create vora table on cloudera - vora

I am trying to create a vora table using Spark-Vora but unable to create it. Please find below full error log
com.sap.spark.vora.CatalogException$SystemErrorException: System error
at com.sap.spark.vora.catalog.VoraCatalog.exists(VoraCatalog.scala:122)
at com.sap.spark.vora.SchemaCatalog.load(SchemaCatalog.java:463)
at com.sap.spark.vora.SchemaCatalog.loadTable(SchemaCatalog.java:454)
at com.sap.spark.vora.SchemaCatalog.loadTable(SchemaCatalog.java:122)
at com.sap.spark.vora.client.catalog.VoraCatalogClient$class.getTableMetadata(VoraCatalogClient.scala:180)
at com.sap.spark.vora.client.VoraClient.getTableMetadata(VoraClient.scala:58)
at com.sap.spark.vora.DefaultSource.createRelation(DefaultSource.scala:165)
at org.apache.spark.sql.execution.datasources.CreateTableUsingTemporaryAwareCommand.resolveDataSource(CreateTableUsingTemporaryAwareCommand.scala:73)
at org.apache.spark.sql.execution.datasources.CreateTableUsingTemporaryAwareCommand.run(CreateTableUsingTemporaryAwareCommand.scala:31)
at org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult$lzycompute(commands.scala:57)
at org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult(commands.scala:57)
at org.apache.spark.sql.execution.ExecutedCommand.doExecute(commands.scala:69)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:140)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:138)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:138)
at org.apache.spark.sql.SQLContext$QueryExecution.toRdd$lzycompute(SQLContext.scala:933)
at org.apache.spark.sql.SQLContext$QueryExecution.toRdd(SQLContext.scala:933)
at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:144)
at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:129)
at org.apache.spark.sql.DataFrame$.apply(DataFrame.scala:51)
at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:725)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:30)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:35)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:37)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:39)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:41)
at $iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:43)
at $iwC$$iwC$$iwC$$iwC.<init>(<console>:45)
at $iwC$$iwC$$iwC.<init>(<console>:47)
at $iwC$$iwC.<init>(<console>:49)
at $iwC.<init>(<console>:51)
at <init>(<console>:53)
at .<init>(<console>:57)
at .<clinit>(<console>)
at .<init>(<console>:7)
at .<clinit>(<console>)
at $print(<console>)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065)
at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1340)
at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840)
at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871)
at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819)
at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:857)
at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:902)
at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:814)
at org.apache.spark.repl.SparkILoop.processLine$1(SparkILoop.scala:657)
at org.apache.spark.repl.SparkILoop.innerLoop$1(SparkILoop.scala:665)
at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$loop(SparkILoop.scala:670)
at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply$mcZ$sp(SparkILoop.scala:997)
at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$process(SparkILoop.scala:945)
at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1059)
at org.apache.spark.repl.Main$.main(Main.scala:31)
at org.apache.spark.repl.Main.main(Main.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:674)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:120)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) Caused by: com.sap.hl.catalog.VoraCatalogException$ConnectionTimeoutException: Failure in connecting to the catalog within 2 SECONDS
at com.sap.hl.catalog.commands.Utils.handleResult(Utils.java:82)
at com.sap.hl.catalog.commands.Utils.getTransaction(Utils.java:26)
at com.sap.hl.catalog.commands.Exists.call(Exists.java:24)
at com.sap.hl.catalog.commands.Exists.call(Exists.java:10)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
------------------------------------------------------------------------
Please help me to resolve this issue.
Thanks,
Akash

The error indicates that the Vora Catalog is either not installed or not running correctly.
Since the Vora Catalog relies on the Vora DLog, the DLog also needs to be installed and running. For install instructions see Vora Installation and Administration Guide.
You can check the Vora Discovery Server UI (:8500/ui) for a 'green' entry for 'vora-catalog' and 'dlog'. In case of issues, you can check the log files /var/log/vora-catalog and /var/log/vora-dlog.

Related

Getting a "java.lang.reflect.InvocationTargetException" WARNING in AWS-ECS and can't figure out why?

I am getting the following warning in Elasticsearch in AWS ECS
java.lang.reflect.InvocationTargetException
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at java.instrument/sun.instrument.InstrumentationImpl.loadClassAndStartAgent(InstrumentationImpl.java:513)
at java.instrument/sun.instrument.InstrumentationImpl.loadClassAndCallAgentmain(InstrumentationImpl.java:535)
Caused by: java.security.AccessControlException: access denied ("java.lang.RuntimePermission" "accessClassInPackage.jdk.internal.org.objectweb.asm")
at java.base/java.security.AccessControlContext.checkPermission(AccessControlContext.java:472)
at java.base/java.security.AccessController.checkPermission(AccessController.java:897)
at java.base/java.lang.SecurityManager.checkPermission(SecurityManager.java:322)
at java.base/java.lang.SecurityManager.checkPackageAccess(SecurityManager.java:1238)
at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:174)
at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:522)
at Log4jHotPatch.asmVersion(Log4jHotPatch.java:71)
at Log4jHotPatch.agentmain(Log4jHotPatch.java:93)
... 6 more
Exception in thread "Attach Listener" Agent failed to start!
Can anyone suggest why this warning is coming up and how to resolve this?

AWS Glue spark job - Snowflake connection error

I am getting the below error while running the AWS glue job using spark
Glue version 2.0 spark 2.4 python 3
Could you please let me know if anyone encountered similar issue using AWS glue and snowflake.
2021-04-27 13:59:23,858 ERROR [main] glue.ProcessLauncher (Logging.scala:logError(91)): Exception in User Class
java.lang.reflect.UndeclaredThrowableException
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1862)
at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:64)
at org.apache.spark.executor.CoarseGrainedExecutorBackend$.run(CoarseGrainedExecutorBackend.scala:188)
at org.apache.spark.executor.CoarseGrainedExecutorBackend$.main(CoarseGrainedExecutorBackend.scala:281)
at org.apache.spark.executor.CoarseGrainedExecutorBackendPlugin$class.launch(CoarseGrainedExecutorBackendWrapper.scala:10)
at org.apache.spark.executor.CoarseGrainedExecutorBackendWrapper$$anon$1.launch(CoarseGrainedExecutorBackendWrapper.scala:15)
at org.apache.spark.executor.CoarseGrainedExecutorBackendWrapper.launch(CoarseGrainedExecutorBackendWrapper.scala:19)
at org.apache.spark.executor.CoarseGrainedExecutorBackendWrapper$.main(CoarseGrainedExecutorBackendWrapper.scala:5)
at org.apache.spark.executor.CoarseGrainedExecutorBackendWrapper.main(CoarseGrainedExecutorBackendWrapper.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.amazonaws.services.glue.SparkProcessLauncherPlugin$class.invoke(ProcessLauncher.scala:44)
at com.amazonaws.services.glue.ProcessLauncher$$anon$1.invoke(ProcessLauncher.scala:75)
at com.amazonaws.services.glue.ProcessLauncher.launch(ProcessLauncher.scala:114)
at com.amazonaws.services.glue.ProcessLauncher$.main(ProcessLauncher.scala:26)
at com.amazonaws.services.glue.ProcessLauncher.main(ProcessLauncher.scala)
Caused by: org.apache.spark.SparkException: Exception thrown in awaitResult:
at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:226)
at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:101)
at org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$run$1.apply$mcV$sp(CoarseGrainedExecutorBackend.scala:201)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$2.run(SparkHadoopUtil.scala:65)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$2.run(SparkHadoopUtil.scala:64)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1844)
... 17 more
Caused by: java.io.IOException: Failed to connect to /172.36.143.34:41447

NSS/PKCS11 errors in docker alpine wildfly on AWS GovCloud

I am using the woahbase/alpine-wildfly image. I keep receiving the following errors when trying to connect to AWS endpoints for S3 and/or SQS:
Caused by: java.security.ProviderException: Could not initialize NSS
and Caused by: java.io.IOException: NSS initialization failed. The errors seem similar to this bug https://bugs.openjdk.java.net/browse/JDK-8023434 but this was for a windows deployment.
Here is the entire error message:
Exception in thread "main" java.lang.ExceptionInInitializerError
at sun.security.ssl.SSLSessionImpl.<init>(SSLSessionImpl.java:188)
at sun.security.ssl.SSLSessionImpl.<init>(SSLSessionImpl.java:152)
at sun.security.ssl.SSLSessionImpl.<clinit>(SSLSessionImpl.java:79)
at sun.security.ssl.SSLSocketImpl.init(SSLSocketImpl.java:598)
at sun.security.ssl.SSLSocketImpl.<init>(SSLSocketImpl.java:566)
at sun.security.ssl.SSLSocketFactoryImpl.createSocket(SSLSocketFactoryImpl.java:110)
at org.apache.http.conn.ssl.SSLConnectionSocketFactory.createLayeredSocket(SSLConnectionSocketFactory.java:363)
at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.upgrade(DefaultHttpClientConnectionOperator.java:192)
at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.upgrade(PoolingHttpClientConnectionManager.java:369)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.amazonaws.http.conn.ClientConnectionManagerFactory$Handler.invoke(ClientConnectionManagerFactory.java:76)
at com.amazonaws.http.conn.$Proxy2.upgrade(Unknown Source)
at org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:415)
at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:236)
at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:184)
at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:184)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:55)
at com.amazonaws.http.apache.client.impl.SdkHttpClient.execute(SdkHttpClient.java:72)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1190)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1030)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:742)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:716)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:699)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:667)
at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:649)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:513)
at com.amazonaws.services.sqs.AmazonSQSClient.doInvoke(AmazonSQSClient.java:1740)
at com.amazonaws.services.sqs.AmazonSQSClient.invoke(AmazonSQSClient.java:1716)
at com.amazonaws.services.sqs.AmazonSQSClient.executeCreateQueue(AmazonSQSClient.java:718)
at com.amazonaws.services.sqs.AmazonSQSClient.createQueue(AmazonSQSClient.java:695)
at com.amazonaws.services.sqs.AmazonSQSClient.createQueue(AmazonSQSClient.java:730)
at com.mycompany.ck.aws.credentials.test.Main.main(Main.java:54)
Caused by: java.security.ProviderException: Could not initialize NSS
at sun.security.pkcs11.SunPKCS11.<init>(SunPKCS11.java:223)
at sun.security.pkcs11.SunPKCS11.<init>(SunPKCS11.java:103)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at sun.security.jca.ProviderConfig$2.run(ProviderConfig.java:224)
at sun.security.jca.ProviderConfig$2.run(ProviderConfig.java:206)
at java.security.AccessController.doPrivileged(Native Method)
at sun.security.jca.ProviderConfig.doLoadProvider(ProviderConfig.java:206)
at sun.security.jca.ProviderConfig.getProvider(ProviderConfig.java:187)
at sun.security.jca.ProviderList.getProvider(ProviderList.java:233)
at sun.security.jca.ProviderList.getIndex(ProviderList.java:263)
at sun.security.jca.ProviderList.getProviderConfig(ProviderList.java:247)
at sun.security.jca.ProviderList.getProvider(ProviderList.java:253)
at java.security.Security.getProvider(Security.java:503)
at sun.security.ssl.SignatureAndHashAlgorithm.<clinit>(SignatureAndHashAlgorithm.java:415)
... 36 more
Caused by: java.io.IOException: NSS initialization failed
at sun.security.pkcs11.Secmod.initialize(Secmod.java:234)
at sun.security.pkcs11.SunPKCS11.<init>(SunPKCS11.java:218)
... 52 more
I am running a RHEL 7.7 host with docker 1.13.1, build 4ef4b30.
Any help would be appreciated. thanks!
Looks like the image might be missing cryptographic libraries to work with SSL algos. Try installing openssl & nss related packages
Build a custom dockerfile with these packages and try executing it.
RUN apk add --no-cache nss openssl

How to save parquet in S3 from AWS SageMaker?

I would like to save a Spark DataFrame from AWS SageMaker to S3. In Notebook, I ran
myDF.write.mode('overwrite').parquet("s3a://my-bucket/dir/dir2/")
I get
Py4JJavaError: An error occurred while calling o326.parquet. :
java.lang.RuntimeException: java.lang.ClassNotFoundException: Class
org.apache.hadoop.fs.s3native.NativeS3FileSystem not found at
org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2195)
at
org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2654)
at
org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2667)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:94) at
org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2703)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2685)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:373) at
org.apache.hadoop.fs.Path.getFileSystem(Path.java:295) at
org.apache.spark.sql.execution.datasources.DataSource.writeInFileFormat(DataSource.scala:394)
at
org.apache.spark.sql.execution.datasources.DataSource.write(DataSource.scala:471)
at
org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:50)
at
org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58)
at
org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56)
at
org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:74)
at
org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)
at
org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)
at
org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:138)
at
org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at
org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:135)
at
org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:116)
at
org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:92)
at
org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:92)
at
org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:609)
at
org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:233)
at
org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:217)
at
org.apache.spark.sql.DataFrameWriter.parquet(DataFrameWriter.scala:508)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498) at
py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244) at
py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357) at
py4j.Gateway.invoke(Gateway.java:280) at
py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79) at
py4j.GatewayConnection.run(GatewayConnection.java:214) at
java.lang.Thread.run(Thread.java:745) Caused by:
java.lang.ClassNotFoundException: Class
org.apache.hadoop.fs.s3native.NativeS3FileSystem not found at
org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2101)
at
org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2193)
How should I do it correctly in Notebook? Many thanks!
The SageMaker notebook instance is not running Spark code, and it doesn't have the Hadoop or other Java classes that you are trying to invoke.
You usually have in the Jupyter notebook in SageMaker python libraries such as Pandas, and you can use it to write the parquet file (for example, https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.to_parquet.html ).
Another option is to connect from the Jupyter notebook to an existing (or new) Spark cluster and execute the command remotely there. See here for documentation on how to set this connection up: https://aws.amazon.com/blogs/machine-learning/build-amazon-sagemaker-notebooks-backed-by-spark-in-amazon-emr/

NoSuchMethodError for com.google.protobuf.AbstractMessage.newBuilderForType on Spanner Java Client

Get this error while trying to experiment on google's cloud spanner by running the sample code
https://github.com/GoogleCloudPlatform/java-docs-samples/blob/master/spanner/cloud-client/src/main/java/com/example/spanner/SpannerSample.java
The stacktrace is as follows:
Exception in thread "main" java.lang.NoSuchMethodError: com.google.protobuf.AbstractMessage.newBuilderForType(Lcom/google/protobuf/AbstractMessage$BuilderParent;)Lcom/google/protobuf/Message$Builder;
at com.google.protobuf.SingleFieldBuilderV3.getBuilder(SingleFieldBuilderV3.java:142)
at com.google.spanner.v1.Mutation$Builder.getInsertBuilder(Mutation.java:3227)
at com.google.cloud.spanner.Mutation.toProto(Mutation.java:377)
at com.google.cloud.spanner.SpannerImpl$TransactionContextImpl.commit(SpannerImpl.java:1223)
at com.google.cloud.spanner.SpannerImpl$TransactionRunnerImpl.run(SpannerImpl.java:1148)
at com.google.cloud.spanner.SpannerImpl$SessionImpl.write(SpannerImpl.java:704)
at com.google.cloud.spanner.SessionPool$PooledSession.write(SessionPool.java:201)
at com.google.cloud.spanner.DatabaseClientImpl.write(DatabaseClientImpl.java:31)
at spanner_test.SpannerSample.writeExampleData(SpannerSample.java:164)
at spanner_test.SpannerSample.run(SpannerSample.java:423)
at spanner_test.SpannerSample.main(SpannerSample.java:501)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:144)
I am using gradle and my dependency is as follows:
spanner:
com.google.cloud:google-cloud-spanner:0.9.4-beta'
protobuf:
com.google.protobuf:protobuf-java:3.1.0
The project has been migrated to a different repository java-spanner and currently on the version 6.36.0 and using protobuf-java#3.21.10. I tried your sample writeExampleData with the latest version and did not face the same issue.