hadoop 3.3.1 fs -mkdir No such file or directory - hadoop3

$ hadoop fs -mkdir /home/hadoop/hadoop_input
mkdir: `hdfs://localhost:9000/home/hadoop': No such file or directory
Even after I create the /home/hadoop/hadoop_input folder, the error still occurs.

OK, I know why. By default, only the '/' directory exists in HDFS. So I have to create /home first, then /home/hadoop, and finally /home/hadoop/hadoop_input.
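As a rough sketch of the step-by-step creation described above, the helper below (the name `ancestors` is hypothetical, not part of Hadoop) prints each directory that has to exist, shortest first. The `-p` option of `hadoop fs -mkdir`, which creates missing parents in one call, is shown in a comment as the usual shortcut.

```shell
#!/bin/sh
# ancestors PATH: print every directory along an absolute path,
# shortest first. Hypothetical helper, for illustration only.
ancestors() {
  rest=${1#/}               # drop the leading slash
  acc=
  while [ -n "$rest" ]; do
    part=${rest%%/*}        # next path component
    acc="$acc/$part"
    printf '%s\n' "$acc"
    case $rest in
      */*) rest=${rest#*/} ;;   # more components remain
      *)   rest= ;;             # that was the last one
    esac
  done
}

# One mkdir per level, as in the post:
#   ancestors /home/hadoop/hadoop_input | while read -r d; do hadoop fs -mkdir "$d"; done
# Or let Hadoop create the missing parents itself:
#   hadoop fs -mkdir -p /home/hadoop/hadoop_input
```

The loop variant fails on a level that already exists, so `-p` is normally the simpler choice.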

Related

Cannot use regular expression in hadoop in Linux command line

I have a folder that contains a large number of subfolders that are dates from 2018. In my HDFS I have created a folder of just December dates (formatted 2018-12-) and I need to delete specifically days 21 - 25. I copied this folder from my HDFS to my docker container and used the command
rm -r *[21-25]
in the folder, and it worked as expected. But when I run the same command adapted to HDFS
hdfs dfs -rm -r /home/cloudera/logs/2018-Dec/*[21-25]
it gives me the error
rm: `/home/cloudera/logs/2018-Dec/*[21-25]': No such file or directory
If you need something to be explained in more detail leave a comment. I'm brand new to all of this and I don't 100% understand how to say some of these things.
I figured it out with the help of @Barmer. I was pointing at my local system's base directory (/home/cloudera) instead of the HDFS one (/user/cloudera), and I also had to change the pattern to 2[1-5]. So the command ended up being hdfs dfs -rm -r /user/cloudera/logs/2018-Dec/*2[1-5].
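To see why the pattern had to change: a bracket expression matches exactly one character, so [21-25] means "one of 2, the range 1-2, or 5", not "a number from 21 to 25". The sketch below checks this with the local shell's own glob matching; `glob_match` is a hypothetical helper, and it assumes HDFS treats bracket expressions the same way the shell does.

```shell
#!/bin/sh
# glob_match NAME PATTERN: print "yes" if NAME matches the glob, else "no".
# Hypothetical helper; uses the shell's case-pattern matching.
glob_match() {
  case $1 in
    $2) echo yes ;;   # unquoted, so $2 is treated as a pattern
    *)  echo no ;;
  esac
}

# [21-25] is a single character from the set {1, 2, 5}:
#   glob_match 2018-12-23 '*[21-25]'   # no  (ends in 3)
#   glob_match 2018-12-15 '*[21-25]'   # yes (ends in 5)
# 2[1-5] is a literal 2 followed by one character from 1 to 5:
#   glob_match 2018-12-23 '*2[1-5]'    # yes
#   glob_match 2018-12-15 '*2[1-5]'    # no
```

Quoting the pattern on the hdfs command line, for example '/user/cloudera/logs/2018-Dec/*2[1-5]', keeps the local shell from expanding it against local files first.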

HDFS dfs -ls path/filename

I have copied a few files to the path, but when I run the command hdfs dfs -ls path/filename it returns "no file found".
hdfs dfs -ls works up to the directory, but when I add the file name it returns "no files found". For one of the files, I copied and pasted the file name from Ambari, and then hdfs dfs -ls path/filename started returning the file.
What is causing this issue?
Because when you execute hdfs dfs -ls path/filename, you are asking HDFS to show all the files in that directory, and if the end of the path is a file, nothing is listed. You must point to a directory, not a file.
@saravanan, it looks like a permission issue if the file shows up only after using Ambari. Make sure the files are owned by the correct user. According to the documentation, the ls command lists both files and folders.
Here is the full documentation for the ls command:
[root@hdp ~]# hdfs dfs -help ls
-ls [-C] [-d] [-h] [-q] [-R] [-t] [-S] [-r] [-u] [-e] [<path> ...] :
  List the contents that match the specified file pattern. If path is not
  specified, the contents of /user/<currentUser> will be listed. For a directory a
  list of its direct children is returned (unless -d option is specified).

  Directory entries are of the form:
        permissions - userId groupId sizeOfDirectory(in bytes)
        modificationDate(yyyy-MM-dd HH:mm) directoryName

  and file entries are of the form:
        permissions numberOfReplicas userId groupId sizeOfFile(in bytes)
        modificationDate(yyyy-MM-dd HH:mm) fileName

  -C  Display the paths of files and directories only.
  -d  Directories are listed as plain files.
  -h  Formats the sizes of files in a human-readable fashion
      rather than a number of bytes.
  -q  Print ? instead of non-printable characters.
  -R  Recursively list the contents of directories.
  -t  Sort files by modification time (most recent first).
  -S  Sort files by size.
  -r  Reverse the order of the sort.
  -u  Use time of last access instead of modification for
      display and sorting.
  -e  Display the erasure coding policy of files and directories.
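One possible cause that the thread does not confirm: the stored file name may contain a trailing space or a non-printable character, which would explain why a name pasted from Ambari works while a typed one does not. The sketch below mimics what the -q option does, on a plain string, so such characters become visible; `visible_name` is a hypothetical helper.

```shell
#!/bin/sh
# visible_name NAME: print NAME inside brackets, with every
# non-printable byte replaced by '?'. Hypothetical helper.
visible_name() {
  printf '[%s]\n' "$(printf '%s' "$1" | LC_ALL=C tr -c '[:print:]' '?')"
}

# A trailing space shows up against the closing bracket:
#   visible_name 'data.csv '           # prints [data.csv ]
# An embedded tab is replaced:
#   visible_name "$(printf 'a\tb')"    # prints [a?b]
# Against HDFS, list the directory with -q and inspect the names:
#   hdfs dfs -ls -q path
```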

How to restore a deleted folder from HDFS

I deleted a folder from HDFS and found it under
/user/hdfs/.Trash/Current/
but I can't restore it. I searched the forum but did not find a good solution.
Does anyone have a solution? How can I restore my folder to the right directory?
Thank you very much.
Did you try cp or mv? For example (cp copies directories recursively and takes no -r option):
hdfs dfs -cp /user/hdfs/.Trash/Current/ /hdfs/Current
Before moving your directory back, you should locate where it sits in the trash:
hadoop fs -ls -R /user/<user-name>/.Trash | less
For example, you may find:
-rw-r--r-- 3 <user-name> supergroup 111792 2020-06-28 13:17 /user/<user-name>/.Trash/200630163000/user/<user-name>/dir1/dir2/file
If dir1 is your deleted dir, move it back:
hadoop fs -mv /user/<user-name>/.Trash/200630163000/user/<user-name>/dir1 <destination>
To restore from
/user/hdfs/.Trash/Current/<your file>
use the -cp command, like this:
hdfs dfs -cp /user/hdfs/.Trash/Current/<your file> <destination>
You may also find that your directory or file name has changed. You can rename it to whatever you want with -mv, like this:
hdfs dfs -mv <Your deleted filename with its path> <Your new filename with its path>
Example:
hdfs dfs -mv /hdfs/weirdName1613730289428 /hdfs/normalName
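The trash paths in the answers above keep the original absolute path underneath the checkpoint directory (Current or a timestamp). As a small sketch, assuming that layout, the hypothetical helper `original_path` strips the trash prefix to recover where the item used to live; the user name alice is made up for the example.

```shell
#!/bin/sh
# original_path TRASH_PATH: strip /user/<user>/.Trash/<checkpoint>
# and print the remaining original path. Hypothetical helper.
original_path() {
  printf '%s\n' "$1" | sed -E 's#^/user/[^/]+/\.Trash/[^/]+##'
}

# original_path /user/alice/.Trash/200630163000/user/alice/dir1
#   prints /user/alice/dir1
#
# Restoring in place could then look like:
#   trash=/user/alice/.Trash/200630163000/user/alice/dir1
#   hadoop fs -mv "$trash" "$(original_path "$trash")"
```

The parent of the destination must already exist, and the move fails if something with the same name has been created there since the deletion.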

How to copy file to HDFS in case insensitive way

I have to copy certain CSV files to HDFS. They are named like
ABCDWXYZ.csv, viz. PERSONDETAILS.csv, and each has to go to an HDFS directory named AbcdWxyz, viz. PersonDetails.
The problem is that I don't have the exact HDFS directory name. I derive it by trimming the CSV file name and then run put:
hadoop fs -put $localRootDir/$Dir/*.csv $HDFSRootDir/$Dir
but it throws an error, as there is no directory in HDFS with an all-uppercase name.
How can I copy the file to HDFS? Is there a way to make the Hadoop put command case-insensitive, using a regex or natively?
Or is there a way to convert the string to the required CamelCase?
You should be able to use
hadoop fs -find / -iname $Dir -print
to get the path name in the correct spelling as it exists in HDFS. Then feed that back into your copy command.
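An alternative sketch of the same idea, in case the target directories all sit directly under one root: list the directory names and pick the one that equals the uppercase name when case is ignored. `match_dir_ci` is a hypothetical helper that reads candidate names on stdin; the variables are the ones from the question.

```shell
#!/bin/sh
# match_dir_ci NAME: read candidate names on stdin and print the first
# one equal to NAME ignoring case. Hypothetical helper.
match_dir_ci() {
  grep -i -x -F -- "$1" | head -n 1
}

# printf 'AbcdWxyz\nPersonDetails\n' | match_dir_ci PERSONDETAILS
#   prints PersonDetails
#
# Against HDFS (-C prints paths only; sed keeps the last component):
#   hdfs_dir=$(hdfs dfs -ls -C "$HDFSRootDir" | sed 's#.*/##' | match_dir_ci "$Dir")
#   hadoop fs -put "$localRootDir/$Dir"/*.csv "$HDFSRootDir/$hdfs_dir"
```

If nothing matches, the helper prints nothing, so it is worth checking that `hdfs_dir` is non-empty before running put.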

Hadoop Put command for two files

A local file named records.txt can be copied to HDFS with the command below:
hadoop dfs -put /home/cloudera/localfiles/records.txt /user/cloudera/inputfiles
With this command, records.txt is copied into HDFS under the same name.
But I want to store two files (records1.txt and demo.txt) in HDFS.
I know that we can use something like this:
hadoop dfs -put /home/cloudera/localfiles/records* /user/cloudera/inputfiles
but is there a command that copies one or two files with different names into HDFS?
The put command accepts one or more source files as arguments, as described in the shell command documentation. So try something like:
hadoop dfs -put /home/cloudera/localfiles/records* /home/cloudera/localfiles/demo* /user/cloudera/inputfiles
From hadoop shell command usage:
put
Usage: hadoop fs -put <localsrc> ... <dst>
Copy single src, or multiple srcs from local file system to the destination filesystem. Also reads input from stdin and writes to destination filesystem.
hadoop fs -put localfile /user/hadoop/hadoopfile
hadoop fs -put localfile1 localfile2 /user/hadoop/hadoopdir
hadoop fs -put localfile hdfs://nn.example.com/hadoop/hadoopfile
hadoop fs -put - hdfs://nn.example.com/hadoop/hadoopfile
Reads the input from stdin.
Exit Code:
Returns 0 on success and -1 on error.
It can also be done using the copyFromLocal command, as follows:
hduser@ubuntu:/usr/local/pig$ hadoop dfs -copyFromLocal /home/user/Downloads/records1.txt /home/user/Downloads/demo.txt /user/pig/output
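Since put takes any number of sources followed by one destination, a thin wrapper can check that every local file exists before anything is uploaded. This is only a sketch: `put_files` and the `HADOOP_CMD` variable are hypothetical names, and setting `HADOOP_CMD=echo` turns the call into a dry run that prints the arguments instead of contacting HDFS.

```shell
#!/bin/sh
# put_files DEST SRC...: verify each local SRC exists, then upload them
# all to DEST with a single put. Hypothetical wrapper.
# HADOOP_CMD (hypothetical variable) defaults to "hadoop".
put_files() {
  dest=$1
  shift
  for src in "$@"; do
    if [ ! -e "$src" ]; then
      echo "missing local file: $src" >&2
      return 1
    fi
  done
  "${HADOOP_CMD:-hadoop}" fs -put "$@" "$dest"
}

# put_files /user/cloudera/inputfiles \
#   /home/cloudera/localfiles/records1.txt /home/cloudera/localfiles/demo.txt
```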