I am using AWS EMR 6.5.0 with Hadoop 3.2.1.
I'm following this guide to launch the streaming job: https://levelup.gitconnected.com/map-reduce-with-python-hadoop-on-aws-emr-341bdd07b804
When I run the command:
$ hadoop jar /usr/lib/hadoop/hadoop-streaming.jar -files mapper.py,reducer.py -mapper mapper.py -reducer reducer.py -input books-input -output books-output
I get the error:
ERROR streaming.StreamJob: Error Launching job : Not a file: hdfs://ip-172-31-55-89.ec2.internal/172.31.55.89:8032/user/hadoop/books-input/1340.txt
Streaming Command Failed!
Complete log:
2022-08-26 15:55:12,295 INFO client.RMProxy: Connecting to ResourceManager at ip-172-31-55-89.ec2.internal/172.31.55.89:8032
2022-08-26 15:55:12,592 INFO client.AHSProxy: Connecting to Application History server at ip-172-31-55-89.ec2.internal/172.31.55.89:8032
2022-08-26 15:55:12,653 INFO client.RMProxy: Connecting to ResourceManager at ip-172-31-55-89.ec2.internal/172.31.55.89:8032
2022-08-26 15:55:12,654 INFO client.AHSProxy: Connecting to Application History server at ip-172-31-55-89.ec2.internal/172.31.55.89:8032
2022-08-26 15:55:13,083 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /tmp/hadoop-yarn/staging/hadoop/.staging/job_1661529292338_0001
2022-08-26 15:55:14,507 INFO lzo.GPLNativeCodeLoader: Loaded native gpl library
2022-08-26 15:55:14,518 INFO lzo.LzoCodec: Successfully loaded & initialized native-lzo library [hadoop-lzo rev 049362b7cf53ff5f739d6b1532457f2c6cd495e8]
2022-08-26 15:55:14,690 INFO mapred.FileInputFormat: Total input files to process : 49
2022-08-26 15:55:14,691 INFO mapreduce.JobSubmitter: Cleaning up the staging area /tmp/hadoop-yarn/staging/hadoop/.staging/job_1661529292338_0001
2022-08-26 15:55:14,769 ERROR streaming.StreamJob: Error Launching job : Not a file: hdfs:/ip-172-31-55-89.ec2.internal/172.31.55.89:8032/user/hadoop/books-input/1340.txt
Streaming Command Failed!
I don't know why it says "Not a file" for a .txt file.
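For context, in Hadoop's old-API `mapred.FileInputFormat` the "Not a file" message is raised when an entry under the input path is a directory rather than a regular file, so `1340.txt` may well be a directory despite its extension. Below is a minimal sketch for checking that, assuming the text output of `hdfs dfs -ls books-input` has been captured into a string. The function name `find_directories` and the sample listing are hypothetical, for illustration only.

```python
def find_directories(ls_output):
    """Return the paths that an `hdfs dfs -ls` listing marks as directories.

    Each entry line starts with a permission string, and a leading 'd'
    denotes a directory. Header lines such as 'Found 2 items' are skipped.
    """
    directories = []
    for line in ls_output.splitlines():
        fields = line.split()
        # An entry has 8 columns: permissions, replication, owner, group,
        # size, date, time, path. This assumes paths contain no spaces.
        if len(fields) == 8 and fields[0].startswith("d"):
            directories.append(fields[-1])
    return directories


# Hypothetical listing, not taken from the cluster in the question.
sample = """Found 2 items
drwxr-xr-x   - hadoop hadoop          0 2022-08-26 15:40 books-input/1340.txt
-rw-r--r--   1 hadoop hadoop     594933 2022-08-26 15:40 books-input/1342.txt
"""
print(find_directories(sample))
```

If the listing does show directories, the input files probably need to be uploaded again so that each `.txt` entry is a plain file.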
I have two almost identical CDH 5.8 clusters, Lab and Production. A MapReduce job runs fine in Lab but fails in Production. I have already spent over 10 hours on this. I made sure I am running exactly the same code, and I compared the configurations of the two clusters without finding any difference.
The only difference I can see is that the Production run prints the warnings below. Note also that the path of each cached file starts with "file://null/":
17/08/16 10:13:14 WARN util.MRApps: cache file (mapreduce.job.cache.files) file://null/opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/commons-httpclient-3.1.jar conflicts with cache file (mapreduce.job.cache.files) file://null/opt/cloudera/parcels/CDH/lib/hadoop/client/commons-httpclient-3.1.jar This will be an error in Hadoop 2.0
17/08/16 10:13:14 WARN util.MRApps: cache file (mapreduce.job.cache.files) file://null/opt/cloudera/parcels/CDH/lib/hadoop/client/hadoop-yarn-server-common.jar conflicts with cache file (mapreduce.job.cache.files) file://null/opt/cloudera/parcels/CDH/lib/hadoop-yarn/hadoop-yarn-server-common.jar This will be an error in Hadoop 2.0
17/08/16 10:13:14 WARN util.MRApps: cache file (mapreduce.job.cache.files) file://null/opt/cloudera/parcels/CDH/lib/hadoop-yarn/lib/stax-api-1.0-2.jar conflicts with cache file (mapreduce.job.cache.files) file://null/opt/cloudera/parcels/CDH/lib/hadoop/client/stax-api-1.0-2.jar This will be an error in Hadoop 2.0
17/08/16 10:13:14 WARN util.MRApps: cache file (mapreduce.job.cache.files) file://null/opt/cloudera/parcels/CDH/lib/hbase/lib/snappy-java-1.0.4.1.jar conflicts with cache file (mapreduce.job.cache.files) file://null/opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/snappy-java-1.0.4.1.jar This will be an error in Hadoop 2.0
17/08/16 10:13:14 INFO impl.YarnClientImpl: Submitted application application_1502835801144_0005
17/08/16 10:13:14 INFO mapreduce.Job: The url to track the job: http://myserver.com:8088/proxy/application_1502835801144_0005/
17/08/16 10:13:14 INFO mapreduce.Job: Running job: job_1502835801144_0005
17/08/16 10:13:15 INFO mapreduce.Job: Job job_1502835801144_0005 running in uber mode : false
17/08/16 10:13:15 INFO mapreduce.Job: map 0% reduce 0%
17/08/16 10:13:15 INFO mapreduce.Job: Job job_1502835801144_0005 failed with state FAILED due to: Application application_1502835801144_0005 failed 2 times due to AM Container for appattempt_1502835801144_0005_000002 exited with exitCode: -1000
For more detailed output, check application tracking page:http://myserver.com:8088/proxy/application_1502835801144_0005/Then, click on links to logs of each attempt.
Diagnostics: java.io.FileNotFoundException: File file:/var/cdr-ingest-mapreduce/lib/mail-1.4.7.jar does not exist
Failing this attempt. Failing the application.
17/08/16 10:13:15 INFO mapreduce.Job: Counters: 0
17/08/16 10:13:16 INFO client.ConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0x25ba0c30a33ea46
17/08/16 10:13:16 INFO zookeeper.ZooKeeper: Session: 0x25ba0c30a33ea46 closed
17/08/16 10:13:16 INFO zookeeper.ClientCnxn: EventThread shut down
As we can see, the job tries to start but fails because a jar file is reported as not found. I made sure the jar file exists on the local filesystem with ample permissions. I suspect the failure happens when the job tries to copy the jar files into the distributed cache.
Here is the shell script that starts the MR job:
#!/bin/bash
LIBJARS=`ls -m /var/cdr-ingest-mapreduce/lib/*.jar |tr -d ' '|tr -d '\n'`
LIBJARS="$LIBJARS,`ls -m /opt/cloudera/parcels/CDH/lib/hbase/lib/*.jar |tr -d ' '|tr -d '\n'`"
LIBJARS="$LIBJARS,`ls -m /opt/cloudera/parcels/CDH/lib/hadoop/client/*.jar |tr -d ' '|tr -d '\n'`"
LIBJARS="$LIBJARS,`ls -m /opt/cloudera/parcels/CDH/lib/hadoop-yarn/*.jar |tr -d ' '|tr -d '\n'`"
LIBJARS="$LIBJARS,`ls -m /opt/cloudera/parcels/CDH/lib/hadoop-yarn/lib/*.jar |tr -d ' '|tr -d '\n'`"
LIBJARS="$LIBJARS,`ls -m /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/*.jar |tr -d ' '|tr -d '\n'`"
job_start_timestamp=''
if [ -n "$1" ]; then
job_start_timestamp="-overridedJobStartTimestamp $1"
fi
export HADOOP_CLASSPATH=`echo ${LIBJARS} | sed s/,/:/g`
yarn jar `ls /var/cdr-ingest-mapreduce/cdr-ingest-mapreduce-core*.jar` com.blah.CdrIngestor \
-libjars ${LIBJARS} \
-zookeeper 44.88.111.216,44.88.111.220,44.88.111.211 \
-yarnResourceManagerHost 44.88.111.220 \
-yarnResourceManagerPort 8032 \
-yarnResourceManagerSchedulerHost 44.88.111.220 \
-yarnResourceManagerSchedulerPort 8030 \
-mrClientSubmitFileReplication 6 \
-logFile '/var/log/cdr_ingest_mapreduce/cdr_ingest_mapreduce' \
-hdfsTempOutputDirectory '/cdr/temp_cdr_ingest' \
-versions '3' \
-jobConfigDir '/etc/cdr-ingest-mapreduce' \
${job_start_timestamp}
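The `util.MRApps` warnings shown earlier concern jars that share a file name but come from different directories, which is a natural result of concatenating several `lib` directories into one `-libjars` list. Below is a minimal sketch for spotting such collisions before submitting the job. `find_conflicting_jars` is a hypothetical helper, and its input is assumed to be the same comma-separated string the script builds in `LIBJARS`.

```python
import posixpath
from collections import defaultdict


def find_conflicting_jars(libjars):
    """Map each jar file name to its paths when it occurs under more than one path."""
    by_name = defaultdict(list)
    for path in libjars.split(","):
        path = path.strip()
        if not path:
            continue
        name = posixpath.basename(path)
        if path not in by_name[name]:
            by_name[name].append(path)
    return {name: paths for name, paths in by_name.items() if len(paths) > 1}


# Paths copied from the warnings and the error message in the question.
libjars = ",".join([
    "/opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/commons-httpclient-3.1.jar",
    "/opt/cloudera/parcels/CDH/lib/hadoop/client/commons-httpclient-3.1.jar",
    "/var/cdr-ingest-mapreduce/lib/mail-1.4.7.jar",
])
for name, paths in find_conflicting_jars(libjars).items():
    print(name, "->", paths)
```

These duplicates only produce warnings here; the job failure itself is the `FileNotFoundException` reported for the `file:` path.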
Node Manager Log:
2017-08-16 18:34:28,438 ERROR org.apache.hadoop.yarn.server.nodemanager.NodeManager: RECEIVED SIGNAL 15: SIGTERM
2017-08-16 18:34:28,551 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl is interrupted. Exiting.
2017-08-16 18:34:31,638 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: The Auxilurary Service named 'mapreduce_shuffle' in the configuration is for class class org.apache.hadoop.mapred.ShuffleHandler which has a name of 'httpshuffle'. Because these are not the same tools trying to send ServiceData and read Service Meta Data may have issues unless the refer to the name in the config.
2017-08-16 18:34:31,851 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: container_1502835801144_0006_01_000001 has no corresponding application!
2017-08-16 18:36:08,221 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl: rollingMonitorInterval is set as -1. The log rolling mornitoring interval is disabled. The logs will be aggregated after this application is finished.
2017-08-16 18:36:08,364 WARN org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=hdfs OPERATION=Container Finished - Failed TARGET=ContainerImpl RESULT=FAILURE DESCRIPTION=Container failed with state: LOCALIZATION_FAILED APPID=application_1502933671610_0001 CONTAINERID=container_1502933671610_0001_01_000001
More NodeManager logs, showing that the jars were not copied to the cache (I am not sure what the fourth field, "null", in each resource request means):
2017-08-15 15:20:09,876 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Created localizer for container_1502835577753_0001_01_000001
2017-08-15 15:20:09,876 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1502835577753_0001_01_000001 transitioned from LOCALIZING to LOCALIZATION_FAILED
2017-08-15 15:20:09,877 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourcesTrackerImpl: Container container_1502835577753_0001_01_000001 sent RELEASE event on a resource request { file:/var/cdr-ingest-mapreduce/lib/mail-1.4.7.jar, 1502740240000, FILE, null } not present in cache.
2017-08-15 15:20:09,877 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourcesTrackerImpl: Container container_1502835577753_0001_01_000001 sent RELEASE event on a resource request { file:/var/cdr-ingest-mapreduce/lib/commons-lang3-3.4.jar, 1502740240000, FILE, null } not present in cache.
2017-08-15 15:20:09,877 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourcesTrackerImpl: Container container_1502835577753_0001_01_000001 sent RELEASE event on a resource request { file:/var/cdr-ingest-mapreduce/lib/cdr-ingest-mapreduce-core-1.0.3-SNAPSHOT.jar, 1502740240000, FILE, null } not present in cache.
2017-08-15 15:20:09,877 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourcesTrackerImpl: Container container_1502835577753_0001_01_000001 sent RELEASE event on a resource request { file:/var/cdr-ingest-mapreduce/lib/opencsv-3.8.jar, 1502740240000, FILE, null } not present in cache.
2017-08-15 15:20:09,877 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Failed to download resource { { file:/var/cdr-ingest-mapreduce/lib/dataplatform-common-1.0.7.jar, 1502740240000, FILE, null },pending,[(container_1502835577753_0001_01_000001)],31900834426583787,DOWNLOADING}
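The `file:` prefix in these resource requests is worth noticing: it means the NodeManager was asked to localize the jars from a local filesystem path rather than from HDFS. Below is a minimal sketch for pulling the scheme and path out of such log lines. The regular expression and the name `parse_resource_requests` are illustrative assumptions, not part of Hadoop.

```python
import re
from urllib.parse import urlparse

# Matches the "{ <uri>, <timestamp>, <type>, <fourth field> }" group printed in the log.
RESOURCE_RE = re.compile(r"\{\s*(\S+),\s*(\d+),\s*(\w+),\s*(\w+)\s*\}")


def parse_resource_requests(log_text):
    """Return (scheme, path, resource_type) for every resource request found."""
    results = []
    for uri, _timestamp, resource_type, _fourth in RESOURCE_RE.findall(log_text):
        parsed = urlparse(uri)
        results.append((parsed.scheme, parsed.path, resource_type))
    return results


# Fragment copied from the NodeManager log above.
line = ("sent RELEASE event on a resource request "
        "{ file:/var/cdr-ingest-mapreduce/lib/mail-1.4.7.jar, 1502740240000, FILE, null } "
        "not present in cache.")
for scheme, path, resource_type in parse_resource_requests(line):
    print(scheme, path, resource_type)
```

A scheme of `file` for every requested jar suggests that the submitting client resolved paths against the local filesystem, so each node would need the jar at that exact local path.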
Any help is appreciated.
Basically, the mapper/reducer was trying to read the dependent jar files from the NodeManager's local filesystem. I confirmed this by comparing the configurations of the two clusters: "fs.defaultFS" was set to "file:///" on the cluster that wasn't working. It looks like that value comes from /etc/hadoop/conf/core-site.xml on the server (an edge server) where my MapReduce job was started. The file contained no configuration because I had no service or role deployed on that edge server. I deployed HDFS/HttpFS on the edge server and redeployed the client configurations across the cluster, which populated /etc/hadoop/conf/core-site.xml and fixed my problem. Alternatively, one could deploy a gateway role on the server to pull the configurations without having to run any role. Thanks to @tk421 for the tip.
For those who don't want to deploy any service or role on the edge server, you could copy the file contents from one of your data nodes.
I added this little code snippet before starting the job to print the configuration values:
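// Requires: import java.util.Map.Entry;
// `config` is assumed to be the job's org.apache.hadoop.conf.Configuration,
// which is iterable over its key/value pairs.
for (Entry<String, String> entry : config) {
    System.out.println(entry.getKey() + "-->" + entry.getValue());
}
// Start the job and wait for it to finish
jobStatus = job.waitForCompletion(true);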
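The same check can be made outside the job by inspecting the client configuration file directly. Below is a minimal sketch, assuming the contents of /etc/hadoop/conf/core-site.xml are passed in as a string. The function names and both sample documents (including the host `namenode.example.com`) are hypothetical. It relies on the fact that Hadoop falls back to `file:///` when `fs.defaultFS` is not set.

```python
import xml.etree.ElementTree as ET


def read_default_fs(core_site_xml):
    """Return the fs.defaultFS value from core-site.xml text, or None if it is not set."""
    root = ET.fromstring(core_site_xml)
    for prop in root.iter("property"):
        if prop.findtext("name", default="").strip() == "fs.defaultFS":
            return prop.findtext("value", default="").strip()
    return None


def uses_local_filesystem(core_site_xml):
    """True when fs.defaultFS is unset (Hadoop then defaults to file:///) or is a file: URI."""
    value = read_default_fs(core_site_xml)
    return value is None or value.startswith("file:")


# Hypothetical file contents, for illustration only.
empty_conf = "<configuration></configuration>"
cluster_conf = """<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://namenode.example.com:8020</value>
  </property>
</configuration>"""

print(uses_local_filesystem(empty_conf))    # True
print(uses_local_filesystem(cluster_conf))  # False
```

An empty `<configuration>` element, as on the edge server described here, is reported as using the local filesystem.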
I have created a runnable jar file and am executing it in Hadoop, but I am not getting any output. The code works fine: I tested it in Eclipse with the Hadoop jar files added, and the output was exactly right.
hduser@Strawhats:~$ hadoop jar /home/hduser/Desktop/project.jar /user/hduser/input /user/hduser/output
17/02/20 19:18:04 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/02/20 19:18:04 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
17/02/20 19:18:04 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
17/02/20 19:18:05 INFO mapred.FileInputFormat: Total input paths to process : 1
17/02/20 19:18:05 INFO mapreduce.JobSubmitter: number of splits:2
17/02/20 19:18:05 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1487596891791_0003
17/02/20 19:18:06 INFO impl.YarnClientImpl: Submitted application application_1487596891791_0003
17/02/20 19:18:06 INFO mapreduce.Job: The url to track the job: http://Strawhats:8088/proxy/application_1487596891791_0003/
17/02/20 19:18:06 INFO mapreduce.Job: Running job: job_1487596891791_0003
I am trying to simulate a Hadoop environment using the latest Hadoop version, 2.6.0, and Java SDK 1.7.0 on my Ubuntu desktop. I configured Hadoop with the necessary environment parameters, and all of its processes are up and running, as the following jps command shows:
nandu@nandu-Desktop:~$ jps
2810 NameNode
3149 SecondaryNameNode
3416 NodeManager
3292 ResourceManager
2966 DataNode
4805 Jps
I can also see the above information, plus the DFS files, through the Firefox browser. However, when I try to run a simple WordCount MapReduce job, it hangs: it produces no output and shows no error messages. After a while I killed the job using the "hadoop job -kill" command. Can you please guide me in finding the cause of this issue and how to resolve it? The job start and kill (end) output is given below.
If you need additional information, please let me know.
Your help will be highly appreciated.
Thanks,
===================================================================
nandu@nandu-Desktop:~/dev$ hadoop jar wc.jar WordCount /user/nandu/input /user/nandu/output
15/02/27 10:35:20 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/02/27 10:35:20 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
15/02/27 10:35:21 WARN mapreduce.JobSubmitter: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
15/02/27 10:35:21 INFO input.FileInputFormat: Total input paths to process : 2
15/02/27 10:35:21 INFO mapreduce.JobSubmitter: number of splits:2
15/02/27 10:35:22 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1425048764581_0003
15/02/27 10:35:22 INFO impl.YarnClientImpl: Submitted application application_1425048764581_0003
15/02/27 10:35:22 INFO mapreduce.Job: The url to track the job: http://nandu-Desktop:8088/proxy/application_1425048764581_0003/
15/02/27 10:35:22 INFO mapreduce.Job: Running job: job_1425048764581_0003
==================== at this point the job was killed ===================
15/02/27 10:38:23 INFO mapreduce.Job: Job job_1425048764581_0003 running in uber mode : false
15/02/27 10:38:23 INFO mapreduce.Job: map 0% reduce 0%
15/02/27 10:38:23 INFO mapreduce.Job: Job job_1425048764581_0003 failed with state KILLED due to: Application killed by user.
15/02/27 10:38:23 INFO mapreduce.Job: Counters: 0
I encountered a similar problem while running the MapReduce sample provided in the Hadoop package. In my case it was hanging because of low disk space on my VM (only about 1.5 GB was free). Once I freed some disk space, it ran fine. Also check that the other system resource requirements are met.
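A quick way to rule this out is to look at free space on the filesystem that holds Hadoop's local directories. One plausible mechanism, stated here as an assumption rather than a diagnosis, is the NodeManager's disk health checker, which by default stops using a local directory once its disk is more than 90% full. Below is a minimal sketch. The `disk_report` name, the 5 GB threshold and the path checked are arbitrary illustrative choices.

```python
import shutil


def disk_report(path, min_free_gb=5.0):
    """Return (free_gb, used_percent, ok) for the filesystem holding `path`."""
    usage = shutil.disk_usage(path)
    free_gb = usage.free / 1024 ** 3
    used_percent = 100.0 * usage.used / usage.total
    return free_gb, used_percent, free_gb >= min_free_gb


# "." is a placeholder; point this at the directory Hadoop actually writes to.
free_gb, used_percent, ok = disk_report(".")
print(f"free: {free_gb:.1f} GB, used: {used_percent:.0f}%, enough: {ok}")
```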
I am trying to run the wordcount example in C++ on Hadoop 1.0.4, on Ubuntu 12.04, but I am getting the following error:
Command:
hadoop pipes -D hadoop.pipes.java.recordreader=true -D hadoop.pipes.java.recordwriter=true -input bin/input.txt -output bin/output.txt -program bin/wordcount
Error message:
13/06/14 13:50:11 WARN mapred.JobClient: No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
13/06/14 13:50:11 INFO util.NativeCodeLoader: Loaded the native-hadoop library
13/06/14 13:50:11 WARN snappy.LoadSnappy: Snappy native library not loaded
13/06/14 13:50:11 INFO mapred.FileInputFormat: Total input paths to process : 1
13/06/14 13:50:11 INFO mapred.JobClient: Running job: job_201306141334_0003
13/06/14 13:50:12 INFO mapred.JobClient: map 0% reduce 0%
13/06/14 13:50:24 INFO mapred.JobClient: Task Id : attempt_201306141334_0003_m_000000_0, Status : FAILED
java.io.IOException
    at org.apache.hadoop.mapred.pipes.OutputHandler.waitForAuthentication(OutputHandler.java:188)
    at org.apache.hadoop.mapred.pipes.Application.waitForAuthentication(Application.java:194)
    at org.apache.hadoop.mapred.pipes.Application.<init>(Application.java:149)
    at org.apache.hadoop.mapred.pipes.PipesMapRunner.run(PipesMapRunner.java:71)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
attempt_201306141334_0003_m_000000_0: Server failed to authenticate. Exiting
13/06/14 13:50:24 INFO mapred.JobClient: Task Id : attempt_201306141334_0003_m_000001_0, Status : FAILED
I haven't found a solution, and I've been trying for quite a while to make it work.
I appreciate your help,
Thanks.
I found this SO question ("hadoop not running in the multinode cluster") where the user got similar errors; according to the top answer, the cause was that they did not "Set a class". That was Java, however.
I also found this tutorial about running the C++ wordcount example in Hadoop. Hopefully it helps you out.
http://cs.smith.edu/dftwiki/index.php/Hadoop_Tutorial_2.2_--_Running_C%2B%2B_Programs_on_Hadoop