Oozie cannot find workflow file - hdfs

File exists in HDFS:
hdfs dfs -ls hdfs://nameservice/user/user123/workflow.xml
-rw-r--r-- 3 user123 user123 662 2016-10-24 11:25 hdfs://nameservice/user/user123/workflow.xml
Trying the same with Oozie:
oozie validate hdfs://nameservice/user/user123/workflow.xml
Oozie can't find it:
App definition [File does not exist, hdfs://nameservice/user/user123/workflow.xml] does not exist
I get the same error if I try to submit the workflow using oozie run.
What are the possible causes/things to check?

You need to provide the local file system path, not the HDFS location. This is client-side validation:
oozie validate <ARGS> : validate a workflow XML file
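For example, you can pull the definition down from HDFS and validate the local copy (the /tmp path here is just an illustration):
hdfs dfs -get hdfs://nameservice/user/user123/workflow.xml /tmp/workflow.xml
oozie validate /tmp/workflow.xml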

Related

Permission issue with AWS lambda layer

I am trying to add a javaagent to an AWS Lambda function. I created a layer and uploaded a zip file which contains the jar file. When I test the function I get:
Error opening zip file or JAR manifest missing : /opt/xxxxx.jar
But I can clearly see the file inside the layer:
-rwxrwxrwx 1 root root 1219876 Nov 30 05:30 xxxxx.jar
The only issue I see is that it is owned by root. How do I upload it so that it has the current user's permissions?
It works when I select the runtime as Java 8 (Corretto) or Java 11 (Corretto); there seems to be some issue with the plain Java 8 runtime.
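If you do want explicit control over the file mode inside the layer, one possible approach (not part of the answer above; the layer name is a placeholder) is to set the permissions before zipping and publish the zip with the AWS CLI:
chmod 644 xxxxx.jar        # read access is enough for a -javaagent jar
zip layer.zip xxxxx.jar    # keep the jar at the zip root so it lands at /opt/xxxxx.jar
aws lambda publish-layer-version --layer-name my-agent-layer \
    --zip-file fileb://layer.zip --compatible-runtimes java8.al2 java11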

How to give back hdfs permission to super group?

In order to access HDFS, I unknowingly ran the following commands as the root user (I had been trying to resolve the error below):
sudo su - hdfs
hdfs dfs -mkdir /user/root
hdfs dfs -chown root:hdfs /user/root
exit
Now when I try to access HDFS it says:
Call From headnode.name.com/192.168.21.110 to headnode.name.com:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
What can I do to resolve this issue? It would also be great if you could explain what the command 'hdfs dfs -chown root:hdfs /user/root' does.
I am using HDP 3.0.1.0 (Ambari)
It seems like your HDFS is down. Check whether your NameNode is up.
The command hdfs dfs -chown root:hdfs /user/root changes the ownership of the HDFS directory /user/root (if it exists) to user root and group hdfs. User hdfs should be able to perform this command (or any other HDFS command, for that matter); the "root" user of HDFS is hdfs.
If you want to make user root an HDFS superuser, you can change the group of the root user to hdfs by running (as root) usermod -g hdfs root and then running (as user hdfs) hdfs dfsadmin -refreshUserToGroupsMappings, as sketched below. This syncs the user-group mappings on the server with HDFS, making user root a superuser.
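A rough sketch of those checks and commands, assuming an HDP/Ambari-managed cluster (service management may differ on your setup):
# "Connection refused" on port 8020 usually means the NameNode is not running.
ps -ef | grep -i namenode                                 # is the NameNode process up at all?
# If it is down, start HDFS from Ambari, then confirm the filesystem answers:
sudo -u hdfs hdfs dfsadmin -report
# Optionally make the OS user 'root' an HDFS superuser, as described above:
usermod -g hdfs root                                      # run as root
sudo -u hdfs hdfs dfsadmin -refreshUserToGroupsMappings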

Pyspark error reading file. Flume HDFS sink imports file with user=flume and permissions 644

I'm using Cloudera Quickstart VM 5.12
I have a Flume agent moving CSV files from a spooldir source into an HDFS sink. The operation works OK, but the imported files have:
User=flume
Group=cloudera
Permissions=-rw-r--r--
The problem starts when I use Pyspark and get:
PriviledgedActionException as:cloudera (auth:SIMPLE)
cause:org.apache.hadoop.security.AccessControlException: Permission denied:
user=cloudera, access=EXECUTE,
inode=/user/cloudera/flume/events/small.csv:cloudera:cloudera:-rw-r--r--
(Ancestor /user/cloudera/flume/events/small.csv is not a directory).
If I use "hdfs dfs -put ..." instead of Flume, the user and group are "cloudera", the permissions are 777, and there is no Spark error.
What is the solution? I cannot find a way to change the file's permissions from Flume. Maybe my approach is fundamentally wrong.
Any ideas?
Thank you

Hadoop dfsadmin -report command is not working in mapr

I need to get the DFS report of the MapR cluster, but when I execute the following command I get an error:
hadoop dfsadmin -report
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.
report: FileSystem maprfs:/// is not an HDFS file system
Usage: java DFSAdmin [-report] [-live] [-dead] [-decommissioning]
Is there any way to do it in MapR?
I tried this link as well but it didn't provide the needed information.
Try the commands below:
maprcli node list
maprcli dashboard info
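Roughly what each one reports, assuming a standard MapR install (the output format differs from hadoop dfsadmin -report):
maprcli dashboard info     # cluster-wide summary: capacity and space used, services, YARN
maprcli node list          # per-node details: services running, disks, health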

How to move file from local to HDFS using oozie?

I am trying to move data from a local file system to the Hadoop Distributed File System, but I am not able to move it through Oozie.
Can we move or copy data from a local filesystem to HDFS using Oozie?
I found a workaround for this problem. The ssh action will always execute from the Oozie server. So if your files are located on the local file system of the Oozie server, you will be able to copy them to HDFS.
The ssh action will always be executed as the 'oozie' user. So the host in your ssh action should look like this: myUser@oozie-server-ip, where myUser is a user with read rights on the files on the Oozie server.
Next, you need to set up passwordless ssh between the oozie user and myUser, on the Oozie server. Generate a public key for the 'oozie' user and copy the generated key in the authorized_keys file of 'myUser'. This is the command for generating the rsa key:
ssh-keygen -t rsa
When generating the key, you need to be logged in as the oozie user. Usually on a Hadoop cluster this user will have its home in /var/lib/oozie, and the public key will be generated as id_rsa.pub in /var/lib/oozie/.ssh.
Next, copy this key into the authorized_keys file of 'myUser'. You will find it in that user's home directory, in the .ssh folder.
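A sketch of that key exchange on the Oozie server, assuming myUser's home is /home/myUser (adjust paths for your system):
sudo su - oozie                        # become the 'oozie' user
ssh-keygen -t rsa                      # accept the defaults; the key ends up in /var/lib/oozie/.ssh/id_rsa.pub
cat /var/lib/oozie/.ssh/id_rsa.pub >> /home/myUser/.ssh/authorized_keys   # append as a user allowed to write that file
ssh myUser@oozie-server-ip hostname    # verify passwordless login works before wiring up the action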
Now that you have set up passwordless ssh, it is time to set up the Oozie ssh action. This action will execute the hadoop fs command with the arguments '-copyFromLocal', '${local_file_path}' and '${hdfs_file_path}', roughly as sketched below.
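A minimal sketch of such an ssh action (the action name is arbitrary; the host and parameters come from the description above):
<action name="copy-to-hdfs">
    <ssh xmlns="uri:oozie:ssh-action:0.1">
        <host>myUser@oozie-server-ip</host>
        <command>hadoop</command>
        <args>fs</args>
        <args>-copyFromLocal</args>
        <args>${local_file_path}</args>
        <args>${hdfs_file_path}</args>
    </ssh>
    <ok to="end"/>
    <error to="kill"/>
</action>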
No, Oozie isn't aware of a local filesystem, because it runs on Map-Reduce cluster nodes. You should use Apache Flume to move data from a local filesystem to HDFS.
Oozie does not support a copy action from local to HDFS or vice versa, but you can call a Java program to do the same. A shell action will also work, but if you have more than one node in the cluster, then all of the nodes must have that local mount point available, mounted with read/write access.
You can do this using an Oozie shell action by putting the copy command in a shell script.
https://oozie.apache.org/docs/3.3.0/DG_ShellActionExtension.html#Shell_Action
Example:
<workflow-app name="reputation" xmlns="uri:oozie:workflow:0.4">
    <start to="shell"/>
    <action name="shell">
        <shell xmlns="uri:oozie:shell-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <exec>run.sh</exec>
            <file>run.sh#run.sh</file>
            <capture-output/>
        </shell>
        <ok to="end"/>
        <error to="kill"/>
    </action>
    <kill name="kill">
        <message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
In your run.sh you can use the hadoop fs -copyFromLocal command.
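A minimal run.sh sketch (the paths are placeholders; per the caveat above, the local file must exist on whichever node ends up running the shell action):
#!/bin/bash
# Copy a file from the executing node's local filesystem into HDFS.
hadoop fs -copyFromLocal /local/path/to/file.csv /user/myUser/target/dir/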