java.io.FileNotFoundException Error on New EMR Cluster

Failing to understand how to resolve the error below, which I receive when running the Hive CLI on a new EMR server. I've already confirmed that the user being used has permission to write to /var/log/hive/user/hadoop.
log4j:ERROR setFile(null,true) call failed.
java.io.FileNotFoundException: /var/log/hive/user/hadoop/hive.log (No such file or directory)
at java.io.FileOutputStream.open(Native Method)
at java.io.FileOutputStream.<init>(FileOutputStream.java:221)
at java.io.FileOutputStream.<init>(FileOutputStream.java:142)
at org.apache.log4j.FileAppender.setFile(FileAppender.java:294)
at org.apache.log4j.FileAppender.activateOptions(FileAppender.java:165)
at org.apache.log4j.DailyRollingFileAppender.activateOptions(DailyRollingFileAppender.java:223)
at org.apache.log4j.config.PropertySetter.activate(PropertySetter.java:307)
at org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:172)
at org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:104)
at org.apache.log4j.PropertyConfigurator.parseAppender(PropertyConfigurator.java:842)
at org.apache.log4j.PropertyConfigurator.parseCategory(PropertyConfigurator.java:768)
at org.apache.log4j.PropertyConfigurator.configureRootCategory(PropertyConfigurator.java:648)
at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:514)
at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:580)
at org.apache.log4j.PropertyConfigurator.configure(PropertyConfigurator.java:415)
at org.apache.hadoop.hive.common.LogUtils.initHiveLog4jDefault(LogUtils.java:127)
at org.apache.hadoop.hive.common.LogUtils.initHiveLog4jCommon(LogUtils.java:77)
at org.apache.hadoop.hive.common.LogUtils.initHiveLog4j(LogUtils.java:58)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:586)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:570)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
log4j:ERROR Either File or DatePattern options are not set for appender [DRFA].

It's due to the missing folder /var/log/hive/user/hadoop/.
Run the following commands:
Change the owner of /var/log/hive/ to the hadoop user:
sudo chown hadoop -R /var/log/hive
Create the /var/log/hive/user/hadoop/ folder:
mkdir /var/log/hive/user
mkdir /var/log/hive/user/hadoop
Run hive again and things should be fine.
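Put together, a minimal sketch of the fix (assuming you are on the EMR master node and Hive runs as the default hadoop user):
sudo mkdir -p /var/log/hive/user/hadoop   # create the missing log directory tree in one step
sudo chown -R hadoop /var/log/hive        # give the hadoop user ownership so hive.log can be written
hive                                      # the CLI should now start without the log4j FileNotFoundException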

Related

Jenkins reported an SSLException: Read timed out error when installing the AWS SDK plugin

I installed Jenkins with Docker:
docker run -u root --rm -d -p 8080:8080 -p 50000:50000 -v jenkins-data:/var/jenkins_home -v /var/run/docker.sock:/var/run/docker.sock jenkinsci/blueocean
When I tried to download the AWS plugin, it reported a Java HTTPS error, as follows:
java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
at java.net.SocketInputStream.read(SocketInputStream.java:171)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at sun.security.ssl.SSLSocketInputRecord.read(SSLSocketInputRecord.java:457)
at sun.security.ssl.SSLSocketInputRecord.decodeInputRecord(SSLSocketInputRecord.java:237)
at sun.security.ssl.SSLSocketInputRecord.decode(SSLSocketInputRecord.java:190)
at sun.security.ssl.SSLTransport.decode(SSLTransport.java:109)
Caused: javax.net.ssl.SSLException: Read timed out
at sun.security.ssl.Alert.createSSLException(Alert.java:127)
at sun.security.ssl.TransportContext.fatal(TransportContext.java:324)
at sun.security.ssl.TransportContext.fatal(TransportContext.java:267)
at sun.security.ssl.TransportContext.fatal(TransportContext.java:262)
at sun.security.ssl.SSLTransport.decode(SSLTransport.java:138)
at sun.security.ssl.SSLSocketImpl.decode(SSLSocketImpl.java:1386)
at sun.security.ssl.SSLSocketImpl.readApplicationRecord(SSLSocketImpl.java:1354)
at sun.security.ssl.SSLSocketImpl.access$300(SSLSocketImpl.java:73)
at sun.security.ssl.SSLSocketImpl$AppInputStream.read(SSLSocketImpl.java:948)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:284)
at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
at sun.net.www.MeteredStream.read(MeteredStream.java:134)
at java.io.FilterInputStream.read(FilterInputStream.java:133)
at sun.net.www.protocol.http.HttpURLConnection$HttpInputStream.read(HttpURLConnection.java:3454)
at sun.net.www.protocol.http.HttpURLConnection$HttpInputStream.read(HttpURLConnection.java:3447)
at org.apache.commons.io.input.ProxyInputStream.read(ProxyInputStream.java:81)
at hudson.model.UpdateCenter$UpdateCenterConfiguration.download(UpdateCenter.java:1283)
Caused: java.io.IOException: Failed to load https://updates.jenkins.io/download/plugins/aws-java-sdk/1.11.995/aws-java-sdk.hpi to /var/jenkins_home/plugins/aws-java-sdk.jpi.tmp
at hudson.model.UpdateCenter$UpdateCenterConfiguration.download(UpdateCenter.java:1288)
Caused: java.io.IOException: Failed to download from https://updates.jenkins.io/download/plugins/aws-java-sdk/1.11.995/aws-java-sdk.hpi (redirected to: https://mirrors.tuna.tsinghua.edu.cn/jenkins/plugins/aws-java-sdk/1.11.995/aws-java-sdk.hpi)
at hudson.model.UpdateCenter$UpdateCenterConfiguration.download(UpdateCenter.java:1322)
at hudson.model.UpdateCenter$DownloadJob._run(UpdateCenter.java:1870)
at hudson.model.UpdateCenter$InstallationJob._run(UpdateCenter.java:2162)
at hudson.model.UpdateCenter$DownloadJob.run(UpdateCenter.java:1844)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at hudson.remoting.AtmostOneThreadExecutor$Worker.run(AtmostOneThreadExecutor.java:118)
at java.lang.Thread.run(Thread.java:748)
Has anyone encountered the same problem? Can you give me some help? Thanks in advance.
Download the aws-java-sdk.hpi file directly from this URL:
https://updates.jenkins.io/download/plugins/aws-java-sdk/1.11.995/aws-java-sdk.hpi
Then copy that file into your Jenkins plugins folder and restart Jenkins; it will install the plugin automatically.
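A minimal sketch of those steps for the Docker setup above; the container name jenkins-blueocean is hypothetical, so substitute whatever docker ps shows for the jenkinsci/blueocean container:
# download the .hpi from the update site URL above
curl -L -o aws-java-sdk.hpi https://updates.jenkins.io/download/plugins/aws-java-sdk/1.11.995/aws-java-sdk.hpi
# copy it into the plugins directory under Jenkins home (backed by the jenkins-data volume)
docker cp aws-java-sdk.hpi jenkins-blueocean:/var/jenkins_home/plugins/aws-java-sdk.hpi
# restart Jenkins so it picks up the plugin, e.g. via http://localhost:8080/restart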

Stackdriver Debugger: Failed to initialize service account authentication

I have my service running on a private instance (outside of Google Cloud), with Tomcat as the web server. In the cdbg_java_agent.ERROR log file I see the following error stack while starting the Java server.
E1025 19:56:10.922689 17636 jni_utils.h:372] GcpHubClient.<init>: java.lang.RuntimeException: Failed to initialize service account authentication
at com.google.devtools.cdbg.debuglets.java.GcpEnvironment.getMetadataQuery(Unknown Source)
at com.google.devtools.cdbg.debuglets.java.GcpHubClient.<init>(Unknown Source)
Caused by: java.lang.ClassNotFoundException: com.google.devtools.cdbg.debuglets.java.ServiceAccountAuth
at java.net.URLClassLoader$1.run(URLClassLoader.java:359)
at java.net.URLClassLoader$1.run(URLClassLoader.java:348)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:347)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at com.google.devtools.cdbg.debuglets.java.InternalsClassLoader.loadClass(Unknown Source)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:278)
... 2 more
E1025 19:56:10.923075 17636 jni_bridge.cc:50] Failed to instantiate HubClient Java class
E1025 19:56:10.923120 17636 worker.cc:145] HubClient not available: debugger thread can't continue.
I followed the setup instructions from: https://cloud.google.com/debugger/docs/setup/java
I have run:
mkdir /opt/cdbg
wget -qO- https://storage.googleapis.com/cloud-debugger/compute-java/debian-wheezy/cdbg_java_agent_gce.tar.gz | \
tar xvz -C /opt/cdbg
And I have a file with my credentials at /opt/cdbg/gcp-svc.json
export GOOGLE_APPLICATION_CREDENTIALS="/opt/cdbg/gcp-svc.json"
The JAVA_OPTS I have added:
-agentpath:/opt/cdbg/cdbg_java_agent.so
-Dcom.google.cdbg.module=javadebug
-Dcom.google.cdbg.version=1
-Dcom.google.cdbg.auth.serviceaccount.enable=true
-Dcom.google.cdbg.auth.serviceaccount.jsonfile=/opt/cdbg/gcp-svc.json
I know GOOGLE_APPLICATION_CREDENTIALS and -Dcom.google.cdbg.auth.serviceaccount.jsonfile are redundant, but keeping only one of them does not change the result.
Download the right package, the one that supports service account authentication:
cdbg_java_agent_service_account.tar.gz
Following what Erez Haba suggested, I tried this:
sudo wget -qO- https://storage.googleapis.com/cloud-debugger/compute-java/debian-wheezy/cdbg_java_agent_service_account.tar.gz | \
sudo tar xvz -C /opt/cdbg
which worked as expected. +1 to Erez Haba
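For Tomcat, one way to pass the agent flags from the question to the JVM is through CATALINA_OPTS, for example in bin/setenv.sh. This is only a sketch that reuses the paths from the question (/opt/cdbg and /opt/cdbg/gcp-svc.json):
# bin/setenv.sh -- append the Cloud Debugger agent options to Tomcat's JVM flags
export CATALINA_OPTS="$CATALINA_OPTS \
  -agentpath:/opt/cdbg/cdbg_java_agent.so \
  -Dcom.google.cdbg.module=javadebug \
  -Dcom.google.cdbg.version=1 \
  -Dcom.google.cdbg.auth.serviceaccount.enable=true \
  -Dcom.google.cdbg.auth.serviceaccount.jsonfile=/opt/cdbg/gcp-svc.json"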

How to save parquet in S3 from AWS SageMaker?

I would like to save a Spark DataFrame from AWS SageMaker to S3. In the notebook, I ran:
myDF.write.mode('overwrite').parquet("s3a://my-bucket/dir/dir2/")
I get:
Py4JJavaError: An error occurred while calling o326.parquet. :
java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.s3native.NativeS3FileSystem not found
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2195)
at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2654)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2667)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:94)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2703)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2685)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:373)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
at org.apache.spark.sql.execution.datasources.DataSource.writeInFileFormat(DataSource.scala:394)
at org.apache.spark.sql.execution.datasources.DataSource.write(DataSource.scala:471)
at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:50)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:74)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:138)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:135)
at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:116)
at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:92)
at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:92)
at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:609)
at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:233)
at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:217)
at org.apache.spark.sql.DataFrameWriter.parquet(DataFrameWriter.scala:508)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:280)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:214)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.s3native.NativeS3FileSystem not found
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2101)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2193)
How should I do this correctly in the notebook? Many thanks!
The SageMaker notebook instance is not running Spark code, and it doesn't have the Hadoop or other Java classes that you are trying to invoke.
The Jupyter notebook in SageMaker usually has Python libraries such as pandas available, and you can use them to write the parquet file (for example, https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.to_parquet.html ).
Another option is to connect from the Jupyter notebook to an existing (or new) Spark cluster and execute the command remotely there. See here for documentation on how to set this connection up: https://aws.amazon.com/blogs/machine-learning/build-amazon-sagemaker-notebooks-backed-by-spark-in-amazon-emr/

why is Python looking for dev_appserver.py in the google_appengine folder

When going through the tutorial Build an Android App Using Firebase and the App Engine Flexible Environment, I get a Python error.
Why is Python looking for dev_appserver.py in the google_appengine folder?
dev_appserver.py exists in the google-cloud-sdk\bin folder.
How can I resolve this?
Here is the Exception:
[INFO] python.exe: can't open file 'C:\GoogleCloudSDK\Cloud SDK\google-cloud-sdk/platform/google_appengine/dev_appserver.py': [Errno 2] No such file or directory
Exception in thread "standard-out-redirection-devappserver" java.lang.RuntimeException: The Java Dev Server has stopped.
at com.google.appengine.gcloudapp.AbstractGcloudMojo$1.run(AbstractGcloudMojo.java:346)
[ERROR] Error: gcloud app command with exit code : 2
[ERROR]
org.apache.maven.plugin.MojoExecutionException: Error: gcloud app command exit code is: 2
at com.google.appengine.gcloudapp.AbstractGcloudMojo.startCommand(AbstractGcloudMojo.java:380)
at com.google.appengine.gcloudapp.GCloudAppRun.execute(GCloudAppRun.java:302)
at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:134)
at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:207)
at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:153)
at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:145)
at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:116)
at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:80)
at org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build(SingleThreadedBuilder.java:51)
at org.apache.maven.lifecycle.internal.LifecycleStarter.execute(LifecycleStarter.java:128)
at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:307)
at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:193)
at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:106)
at org.apache.maven.cli.MavenCli.execute(MavenCli.java:863)
at org.apache.maven.cli.MavenCli.doMain(MavenCli.java:288)
at org.apache.maven.cli.MavenCli.main(MavenCli.java:199)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced(Launcher.java:289)
at org.codehaus.plexus.classworlds.launcher.Launcher.launch(Launcher.java:229)
at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode(Launcher.java:415)
at org.codehaus.plexus.classworlds.launcher.Launcher.main(Launcher.java:356)

Amazon Product API

I'm trying to use the Amazon API. I followed their PDF. I have created a directory called build, and inside it a file named jaxws-custom.xml with the provided content.
However, when I run the command:
wsimport -d ./build -s ./src -p com.ECS.client.jax http://ecs.amazonaws.com/AWSECommerceService/AWSECommerceService.wsdl -b jaxws-custom.xml
I get the error:
Exception in thread "main" com.sun.xml.internal.ws.streaming.XMLReaderException: Unable to create StAX reader or writer
at com.sun.xml.internal.ws.api.streaming.XMLStreamReaderFactory.create(XMLStreamReaderFactory.java:137)
at com.sun.tools.internal.ws.wscompile.WsimportOptions.parseBindings(WsimportOptions.java:430)
at com.sun.tools.internal.ws.wscompile.WsimportTool.run(WsimportTool.java:162)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at com.sun.tools.internal.ws.Invoker.invoke(Invoker.java:120)
at com.sun.tools.internal.ws.WsImport.main(WsImport.java:42)
Caused by: java.io.FileNotFoundException: /a/fr-05/vol/stud/home/zimchoni/Downloads/amazon/jaxws-custom.xml (No such file or directory)
at java.io.FileInputStream.open(Native Method)
at java.io.FileInputStream.<init>(FileInputStream.java:146)
at java.io.FileInputStream.<init>(FileInputStream.java:101)
at sun.net.www.protocol.file.FileURLConnection.connect(FileURLConnection.java:90)
at sun.net.www.protocol.file.FileURLConnection.getInputStream(FileURLConnection.java:188)
at java.net.URL.openStream(URL.java:1037)
at com.sun.xml.internal.ws.api.streaming.XMLStreamReaderFactory.create(XMLStreamReaderFactory.java:135)
... 8 more
Any idea what it can be?
Thanks
It's simple: remove -b jaxws-custom.xml
wsimport -d e:/build -s e:/src -p com.ECS.client.jax http://webservices.amazon.com/AWSECommerceService/AWSECommerceService.wsdl
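Alternatively, if you want to keep the bindings customization: the FileNotFoundException in the stack trace shows that wsimport looked for jaxws-custom.xml in the current working directory and did not find it there, so passing the file's actual path to -b should also work (the path below is a placeholder):
wsimport -d ./build -s ./src -p com.ECS.client.jax -b /path/to/jaxws-custom.xml http://webservices.amazon.com/AWSECommerceService/AWSECommerceService.wsdl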