Hadoop S3 configuration file missing - amazon-web-services

I'm using the Hadoop library to upload files to S3. Because a metrics configuration file is missing, I'm getting this exception:
MetricsConfig - Could not locate file hadoop-metrics2-s3a-file-system.properties org.apache.commons.configuration2.ex.ConfigurationException:
Could not locate: org.apache.commons.configuration2.io.FileLocator#77f46cee[fileName=hadoop-metrics2-s3a-file-system.properties,basePath=<null>,sourceURL=,encoding=<null>,fileSystem=<null>,locationStrategy=<null>]
My current configuration is:
configuration.set("fs.s3a.access.key", "accessKey")
configuration.set("fs.s3a.secret.key", "secretKey")
Where should I add this configuration file, and what should it contain?
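For reference, a minimal sketch of how such an upload is typically wired up with the Hadoop FileSystem API; the class name, bucket name, and paths are placeholders, not taken from the question:
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import java.net.URI;

public class S3AUpload {
    public static void main(String[] args) throws Exception {
        Configuration configuration = new Configuration();
        configuration.set("fs.s3a.access.key", "accessKey");  // placeholder credentials
        configuration.set("fs.s3a.secret.key", "secretKey");

        // "my-bucket" and both paths are placeholder values
        FileSystem fs = FileSystem.get(URI.create("s3a://my-bucket/"), configuration);
        fs.copyFromLocalFile(new Path("/tmp/local-file.txt"),
                new Path("s3a://my-bucket/remote-file.txt"));
    }
}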

Don't worry about it; it's just an irritating warning. It's only relevant when you have the S3A or ABFS connectors running in a long-lived app where the metrics are being collected and fed to some management tooling.
Set the log level to WARN in the log4j.properties file in your Spark conf dir:
log4j.logger.org.apache.hadoop.metrics2=WARN

I just placed an empty file on the classpath and it stopped complaining:
touch /opt/spark/conf/hadoop-metrics2-s3a-file-system.properties
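As for what the file could contain: if you actually want the metrics rather than just silencing the lookup, the file follows the standard hadoop-metrics2 properties format. A hedged guess at minimal contents, assuming the metrics prefix matches the file name (the sink name and output file are illustrative):
# hadoop-metrics2-s3a-file-system.properties
# write S3A connector metrics to a local file, polling every 60 seconds
s3a-file-system.sink.file.class=org.apache.hadoop.metrics2.sink.FileSink
s3a-file-system.sink.file.filename=s3a-metrics.out
s3a-file-system.period=60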

Related

AWS Lambda /var/task classpath

I am having a problem: it's not possible to save files into the /var/task filesystem.
If I understand it right, the classpath of a Lambda is /var/task. I am running a GWT agent in Lambda, but it looks for a specific file on the classpath. The problem is that the files that need to be compiled are saved in /var/tmp, so the GWT agent gives an error:
"Unable to find file.xml.gwt on your classpath; could be a typo, or maybe you forgot to include a classpath entry for source?"

AWS EMR step doesn't find jar imported from S3

I am attempting to run a Spark application on AWS EMR in client mode. I have set up a bootstrap action to import the needed files and the jar from S3, and I have a step to run a single Spark job.
However, when the step executes, the jar I imported isn't found. Here is the stderr output:
19/12/01 13:42:05 WARN DependencyUtils: Local jar /mnt/var/lib/hadoop/steps/s-2HLX7KPZCA07B/~/myApplicationDirectory does not exist, skipping.
I am able to successfully import the jar and the other files the application needs from my S3 bucket to the master instance; I simply import them to /home/ec2-user/myApplicationDirectory/myJar.jar via a bootstrap action.
However, I don't understand why the step is looking for the jar at /mnt/var/lib/hadoop/… etc.
Here are the relevant parts of the CLI configuration:
--steps '[{"Args":["spark-submit",
"--deploy-mode","client",
"--num-executors","1",
“--driver-java-options","-Xss4M",
"--conf","spark.driver.maxResultSize=20g",
"--class”,”myApplicationClass”,
“~/myApplicationDirectory”,
“myJar.jar",
…
application specific arguments and paths to folders here
…],
”Type":"CUSTOM_JAR",
Thanks for any help.
It looks like it doesn't understand the ~ as referring to the home directory. Try changing "~/myApplicationDirectory" to "/home/ec2-user/myApplicationDirectory".
A little warning: in the sample in your question, straight quotation marks " are mixed with "smart" ones “. Make sure the "smart" quotation marks don't end up in your configuration file, or you will get very confusing error messages.
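Putting both fixes together, the fragment might look like the sketch below. Joining the directory and jar name into one path for spark-submit's primary resource is an assumption, not something stated in the question; the elided application arguments stay elided:
--steps '[{"Args":["spark-submit",
"--deploy-mode","client",
"--num-executors","1",
"--driver-java-options","-Xss4M",
"--conf","spark.driver.maxResultSize=20g",
"--class","myApplicationClass",
"/home/ec2-user/myApplicationDirectory/myJar.jar",
…],
"Type":"CUSTOM_JAR",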

Command line interface (CLI) not working after mounting lb3 to lb4 as documented

I mounted an lb3 app into an lb4 app as documented, but now I can't use the lb CLI and I'm getting the following error: "Warning: Found no data sources to attach model. There will be no data-access methods available until datasources are attached.".
It's because the CLI looks for the JSON file in the root directory and not in the lb3app directory, as advised in the doc above.
How can I tell the CLI that the configuration files are inside the subdirectory lb3app instead of the parent directory newlb4app?
I tried to execute lb from newlb4app and from the subdirectory lb3app, with no success.
I removed the file .yo-rc.json and it solved the problem. It seems that the CLI looks for that file in parent directories and, if it exists, treats that location as the project root directory.
Once I deleted the file, the current directory became the project root.
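In shell terms, the fix amounts to something like this (directory names follow the question; lb model is just an example CLI command):
cd newlb4app
rm .yo-rc.json      # the marker file the CLI treats as the project root
cd lb3app
lb model            # CLI commands now resolve against the current directory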

AWS Elastic Beanstalk .ebextensions/nginx/nginx.conf not overriding AWS' default nginx.conf

I have an AWS environment and I'm trying to override the nginx.conf that is used.
According to their documentation, this can be done by including your own file at .ebextensions/nginx/nginx.conf
To override Elastic Beanstalk's default nginx configuration completely, include a configuration in your source bundle at .ebextensions/nginx/nginx.conf
I've done that to no avail. I've tried creating an entirely new application environment to ensure it's not due to the instance not fully restarting, but the original nginx.conf is still being used. I have one other .ebextensions/ configuration file, and it is creating a file as expected.
Any clues as to why my nginx.conf isn't taking effect? Any details I could provide that might grant some insight? I searched for errors within eb-activity.log but did not see any. The log does show that it inflated the .ebextensions/nginx/ directory and created the .ebextensions/nginx/nginx.conf file, right where it does so for the rest of the files and directories in the source bundle. Nowhere does it indicate that it tried to use my nginx.conf, though.
The documentation I was looking at was specifically for the Java environment. The method works for several other environments, but the Node environment's startup process is different and ignores that file. I imagine it's because the server directive is within 00_elastic_beanstalk_proxy.conf, rather than nginx.conf.
However, you can still override the nginx.conf by instead using an .ebextensions config to create the file /etc/nginx/nginx.conf, as I found from the reply in this AWS Forum post.
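That approach uses the documented files: key of an .ebextensions config; a minimal sketch (the config file name is arbitrary, and the content block is where your full nginx.conf goes):
# .ebextensions/nginx.config
files:
  "/etc/nginx/nginx.conf":
    mode: "000644"
    owner: root
    group: root
    content: |
      # full nginx.conf contents go here
      ...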

Uploading files to a bluemix app and pointing to them from configuration files

I am trying to upload files to my Bluemix app and I am having problems using and understanding the file system. After I have successfully uploaded files, I want to reference their paths in my configuration files.
Specifically, I want to upload a jar file to the server and later use it as a javaagent.
I have tried approaching this issue from several directions.
I see that I can create a folder in the liberty_buildpack and place the files inside it; I can then access them during the compile-release phases from the tmp folder:
/tmp/buildpacks/ibm-websphere-liberty-buildpack/lib/liberty_buildpack/my_folder
Also, in the file system I see when building and deploying the app, I can only copy to the folder located at:
/app
So I copied the JAR file to the /app folder and set it as a javaagent using two methods:
1. Manually setting the environment variable JAVA_OPTS with a javaagent option pointing to /app/myjar.jar, using cf set-env
2. Deploying a WAR file of the app using cf push from the wlp server, and setting the javaagent inside the server.xml file via the genericJvmArguments attribute
Neither of those methods worked; either the deploy phase of the application failed or my features simply didn't work.
So I tried searching the application file system using cf files and came upon the app folder, but strangely it didn't have the same files as the folder I deploy, and I couldn't find any connection to the deployed folder or the buildpack.
Can someone explain how this should be done correctly? Namely, how do I upload the file, and how should I point to it from the environment variable/server file?
I mean, should it be /app/something, or maybe some other path?
I have also seen the use of relative paths like #droplet.sandbox; maybe that's the way to address those files? And how should I access those folders from cf files?
Thanks.
EDIT:
As instructed in the comments, I have added the jar file to the system. The problem is that when I add the javaagent option to the environment variable JAVA_OPTS, the deploy stage fails with a timeout error:
payload: {... "reason"=>"CRASHED", "exit_status"=>32, "exit_description"=>"failed to accept connections within health check timeout", "crash_timestamp"=>1433864527}
The way I am assigning the javaagent is as follows:
cf set-env myApp JAVA_OPTS "path/agent.jar"
I have tried several locations:
1. I found that if I add the jar files to my WebContent folder, I can find them in: /app/wlp/usr/servers/defaultServer/apps/myapp.war/resources/
2. I copied the jar file from the /tmp location in the compilation phase to /home/vcap/app/agent.jar
3. I placed the jar file in /app/.java/jre/lib
None of those three paths worked.
I found out that if I give a wrong path the system behaves the same, so it may be a path problem.
Any ideas?
Try this:
1. Put your agent jars in a folder called ".profile.d" inside your WAR package;
2. cf se your-app JAVA_OPTS -javaagent:/home/vcap/app/.profile.d/your.jar;
3. Push the WAR to Bluemix.
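Spelled out as commands, that might look like this (app, WAR, and jar names are placeholders; cf se is the short alias for cf set-env):
# set the agent path; /home/vcap/app is where the droplet is unpacked
cf set-env myApp JAVA_OPTS "-javaagent:/home/vcap/app/.profile.d/your.jar"
# push the WAR that contains .profile.d/your.jar at its root
cf push myApp -p myapp.war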
Not sure if this is exactly the right answer, but I am using additional jar files in my Liberty application, so maybe this will help.
I push a myapp.war file up to Bluemix. Within the war file, inside the WEB-INF folder, I have a lib folder that contains a number of jar files. The classes in those jar files are then used within the Java code of my application.
myapp.war/WEB-INF/lib/myPlugin.jar
You could try doing something like that with the jar file(s) you need, building them into the war file.
Other than that, you could try the section Overlaying the JRE from the Bluemix Liberty documentation to add jars to the JRE.