Get the list of files hierarchically with their absolute path - list

I want output like the example below.
root@aklinux139:~/.atom# du -sh */* | awk '{print $2}'
blob-store/BLOB
blob-store/INVKEYS
blob-store/MAP
compile-cache/less
compile-cache/root
compile-cache/style-manager
packages/README.md
storage/application.json
root@aklinux139:~/.atom#
But ls does not give this output with any of its options or arguments.
'ls -R' prints each directory path followed by its contents, not each filename together with its path.
I need this very often while writing scripts. Can someone help me with this? Thanks in advance.

You can use find. An example for a depth of 2:
find . -maxdepth 2 -mindepth 2 -printf '%P\n'
If you want to exclude dot files:
find . -maxdepth 2 -mindepth 2 -not -path '*/\.*' -printf '%P\n'
If you want to sort the result (as in the du -sh example):
find . -maxdepth 2 -mindepth 2 -not -path '*/\.*' -printf '%P\n' | sort
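Since the question title asks for absolute paths and -printf is a GNU extension, here is a minimal sketch of the same listing without -printf. The function names list_depth2 and list_depth2_abs are hypothetical, and the sketch assumes a find that supports -mindepth and -maxdepth (GNU and BSD find both do).

```shell
# list_depth2 DIR: print entries exactly two levels below DIR,
# relative to DIR, sorted, skipping hidden files and directories.
list_depth2() {
    (
        cd "$1" || exit 1
        # sed strips the leading "./" that find prints, mimicking '%P'.
        find . -mindepth 2 -maxdepth 2 ! -path '*/.*' | sed 's|^\./||' | sort
    )
}

# list_depth2_abs DIR: same selection, with DIR's absolute path
# prepended to every line.
list_depth2_abs() {
    abs=$(cd "$1" && pwd) || return 1
    list_depth2 "$1" | while IFS= read -r rel; do
        printf '%s/%s\n' "$abs" "$rel"
    done
}

# Example call (hypothetical directory): list_depth2_abs "$HOME/.atom"
```

Both functions read paths line by line, so file names containing newlines are not handled.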

Try this (find prints absolute paths when the starting point is absolute):
find / -name "*" |head
/
/var
/var/games
/var/yp
/var/kerberos
/var/kerberos/krb5
/var/kerberos/krb5/user
/var/.updated
/var/nis
/var/account

Related

Using regex OR with find to list and delete files

I have a folder with these files:
sample.jpg
sample.ods
sample.txt
sample.xlsx
Now, I need to find and remove files that end with either .ods or .xlsx.
To fish them out I initially use:
ls | grep -E "*.ods|*.xlsx"
This gives me:
sample.ods
sample.xlsx
Now, I don't want to parse ls so I use find:
find . -type f -regextype grep -regex '.*/*.ods\|*.xlsx' | wc -l
But that gives me the output of 1 while I expect to have 2 files before I extend the command to:
find . -type f -regextype grep -regex '.*/*.ods\|*.xlsx' | xargs -d"\n" rm
Which works but removes only the .ods file but not the .xlsx one.
What am I missing here?
I'm on ubuntu 18.04 and my find version is find (GNU findutils) 4.7.0-git.
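You don't need a regex here. Just use -name with -or, grouping the two tests in escaped parentheses so that -type f and -delete apply to both of them:
find . -type f \( -name "*.ods" -or -name "*.xlsx" \) -delete
This finds files ending in either .ods or .xlsx and deletes them.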
If you really want to use a regex, you could use the following:
find . -maxdepth 1 -regextype posix-extended -regex "(.*\.ods)|(.*\.xlsx)" -delete
Here each alternative is enclosed in parentheses. Note that -regex has to match the whole path, hence the leading .* in each alternative.
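One thing worth checking with commands like these is operator precedence: find's implicit AND binds tighter than -o, so an action placed after an ungrouped OR applies only to the last test. A minimal sketch with a dry-run helper, assuming a find that supports -delete (GNU and BSD find do); the function names are hypothetical:

```shell
# preview_spreadsheets DIR: dry run, print the regular files under DIR
# that end in .ods or .xlsx without removing anything.
preview_spreadsheets() {
    find "$1" -type f \( -name '*.ods' -o -name '*.xlsx' \) -print
}

# delete_spreadsheets DIR: same tests, but remove the matches.
# The escaped parentheses make -type f and -delete apply to both names.
delete_spreadsheets() {
    find "$1" -type f \( -name '*.ods' -o -name '*.xlsx' \) -delete
}

# Example call: preview_spreadsheets . && delete_spreadsheets .
```

Running the -print variant first shows exactly what the -delete variant would remove.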

Grep yyyy-mm from directories name

Quick question: I have this find command that finds my backup directories on my external hard drive.
pi@raspberrypi:/media/pi/WD/HS_BACKUP $ find . -depth -maxdepth 2 -type d -name "20*"
/media/pi/WD/HS_BACKUP/2019-12-26_22-30-01
/media/pi/WD/HS_BACKUP/2019-12-27_22-30-01
/media/pi/WD/HS_BACKUP/2020-01-29_23-00-02
/media/pi/WD/HS_BACKUP/2020-02-05_23-00-01
/media/pi/WD/HS_BACKUP/2020-02-12_23-00-01
/media/pi/WD/HS_BACKUP/2020-02-19_23-00-01
/media/pi/WD/HS_BACKUP/2020-02-26_23-00-01
I need to extract the yyyy-mm part of each directory name (e.g. 2020-02). The desired result would be:
2019-12
2019-12
2020-01
2020-02
2020-02
2020-02
How would I do that? I tried awk with [/_] as the delimiter, but it doesn't do the job right.
The grep expression you need is:
find ... | grep -E --only-matching '[[:digit:]]{4}-[[:digit:]]{2}'
Could you please try the following? I haven't tested it, since I don't have the same directory structure as the OP, but it should work.
find . -depth -maxdepth 2 -type d -name "20*" | awk 'BEGIN{FS="/"} match($NF,/^[0-9]{4}-[0-9]{2}/){print substr($NF,RSTART,RLENGTH)}'
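Another way to sketch this is with two sed steps: first cut the path down to its last component, then keep only a leading yyyy-mm. The function name year_month is hypothetical; it only uses basic regular expressions, so it should not depend on a particular awk or grep variant.

```shell
# year_month: read paths on stdin and print the leading yyyy-mm of each
# path's last component. Lines whose last component does not start with
# four digits, a dash and two digits are dropped.
year_month() {
    sed 's|.*/||' | sed -n 's/^\([0-9]\{4\}-[0-9]\{2\}\).*/\1/p'
}

# Example call: find . -maxdepth 2 -type d -name "20*" | year_month
```

Piping the result through sort | uniq -c would additionally count the backups per month.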

Regex in Bash: not wanting to include directories

I have a list of images, collected using the following line:
# find . -mindepth 1 -type f -name "*.JPG" | grep "MG_[0-9][0-9][0-9][0-9].JPG"
output:
./DCIM/103canon/IMG_0039.JPG
./DCIM/103canon/IMG_0097.JPG
./DCIM/103canon/IMG_1600.JPG
./DCIM/103canon/IMG_2317.JPG
./DCIM/IMG_0042.JPG
./DCIM/IMG_1152.JPG
./DCIM/IMG_1810.JPG
./DCIM/IMG_2564.JPG
./images/IMG_0058.JPG
./images/IMG_0079.JPG
./images/IMG_1233.JPG
./images/IMG_1959.JPG
./images/IMG_2012/favs/IMG_0039.JPG
./images/IMG_2012/favs/IMG_1060.JPG
./images/IMG_2012/favs/IMG_1729.JPG
./images/IMG_2012/favs/IMG_2013.JPG
./images/IMG_2012/favs/IMG_2317.JPG
./images/IMG_2012/IMG_0079.JPG
./images/IMG_2012/IMG_1403.JPG
./images/IMG_2012/IMG_2102.JPG
./images/IMG_2013/IMG_0060.JPG
./images/IMG_2013/IMG_1311.JPG
./images/IMG_2013/IMG_1729.JPG
./images/IMG_2013/IMG_2013.JPG
./IMG_0085.JPG
./IMG_1597.JPG
./IMG_2288.JPG
However, I only want the very last portion, the IMG_\d\d\d\d.JPG. I have tried hundreds of regular expressions and this is the one that gives me the best result. Is there a way to print only the filename without the directory tree before it, or is it solely down to the regex?
Thanks
It should be
find . -mindepth 1 -type f -name "*MG_[0-9][0-9][0-9][0-9].JPG" -printf "%f\n"
If the -printf option is not available with your implementation of find (as in current versions of Mac OS X),
then you can use -execdir echo {} \; instead (if that's available):
find . -mindepth 1 -type f -name "*MG_[0-9][0-9][0-9][0-9].JPG" -execdir echo {} \;
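If neither -printf nor -execdir is available, a portable sketch is to let find call basename for each match. Because -name is tested against the last path component only, the pattern needs no directory part. The function name image_names is hypothetical, and the sketch assumes the files are named exactly IMG_ plus four digits plus .JPG, which is narrower than the *MG_ pattern above.

```shell
# image_names DIR: print only the file names (no directories) of regular
# files under DIR called IMG_<four digits>.JPG.
image_names() {
    find "$1" -type f -name 'IMG_[0-9][0-9][0-9][0-9].JPG' -exec basename {} \;
}

# Example call: image_names . | sort
```

This starts one basename process per file, which is fine for a photo folder but slow for very large trees.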

find file with numeric values greater than a specified number

When I run the following command, I get a list of files
find . -type f -name '*_duplicate_[0-9]*.txt'
./prefix_duplicate_001.txt
./prefix_duplicate_002.txt
./prefix_duplicate_003.txt
./prefix_duplicate_004.txt
./prefix_duplicate_005.txt
Now I'm only interested in files which have the numbers greater than or equal to 003. How can I get this done?
Thank you in advance.
Using find's -regex option, you can write a regex that matches all files with a value of 3 or higher after _duplicate_, allowing for leading zeroes:
find . -regextype posix-extended -type f \
-regex '.*_duplicate_0*([3-9]|[1-9][0-9])[0-9]*\.txt'
On OSX use this find:
find -E . -type f -regex '.*_duplicate_0*([3-9]|[1-9][0-9])[0-9]*\.txt'
./prefix_duplicate_003.txt
./prefix_duplicate_004.txt
./prefix_duplicate_005.txt
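Use this pattern (it relies on a negative lookahead, so it needs a PCRE-capable tool such as grep -P; find's -regex does not support lookahead):
.*_duplicate_(?!00[1-2])\d{3}\.txt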
As much as I like to use single commands when possible, I think maybe this is what you need here:
find . -type f -name '*_duplicate_[0-9]*.txt' | awk -F '[_.]' '$4 >= 3 { print $0 }'
This assumes paths shaped like ./prefix_duplicate_NNN.txt, so that the number ends up in the fourth field.
There are variations on that, for example:
find . -type f -name "*.txt" | awk -F '[_.]' '$0 ~ /_duplicate_[0-9]*\.txt/ && $4 >= 3 { print $0 }'
But I'm not sure it really makes a difference from an efficiency standpoint...
00[3-9]|(([1-9]\\d\\d)|(\\d[1-9]\\d))
only for the number part.
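When the threshold may change, it can be easier to compare the number arithmetically than to encode it in a regex. A minimal sketch in plain POSIX shell, assuming the file names end in _duplicate_<digits>.txt; the function name duplicates_at_least is hypothetical:

```shell
# duplicates_at_least MIN DIR: print files under DIR named
# *_duplicate_<digits>.txt whose number is greater than or equal to MIN.
duplicates_at_least() {
    min=$1
    find "$2" -type f -name '*_duplicate_[0-9]*.txt' | sort |
    while IFS= read -r path; do
        num=${path##*_duplicate_}   # drop everything up to the number
        num=${num%.txt}             # drop the extension
        case $num in
            ''|*[!0-9]*) continue ;;   # skip if the middle part is not all digits
        esac
        # Strip leading zeros so the comparison sees a plain decimal number.
        num=${num#"${num%%[!0]*}"}
        if [ "${num:-0}" -ge "$min" ]; then
            printf '%s\n' "$path"
        fi
    done
}

# Example call: duplicates_at_least 3 .
```

The loop reads one path per line, so names containing newlines are not supported.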

Find directory with regular expression: white spaces, command find

I had to write a bash script that finds the directories in the current directory whose names start with a letter of the alphabet [A-Za-z]. In the shell I wrote:
find . -maxdepth 1 -name '[[:alpha:]]*' -type d
and it's ok. But in the script I wrote:
#! /bin/bash
files=$(find . -maxdepth 1 -name '[[:alpha:]]*' -type d)
for FILE in $files; do echo 'you are in', $FILE; done;
But when it finds a directory with whitespace in its name (e.g. ./New Directory), the output is
./New
Directory
as if they were 2 different directories. Why? How can I resolve this problem?
Let find run the command itself, so the shell never gets a chance to split the names:
find . -maxdepth 1 -type d -regextype posix-awk -regex ".*/[A-Za-z].*" -exec echo "you are in" {} \;
This may work for you:
find . -maxdepth 1 -name '[[:alpha:]]*' -type d | sed 's/^/You are in /'
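As for the why: the unquoted $files in the for loop undergoes word splitting, so the shell breaks the list at every space, tab and newline, and ./New Directory becomes two words. If the loop body needs to do more than print, a sketch that stays whitespace-safe is to hand the names to a small inline sh script through -exec ... {} +. The function name announce_dirs is hypothetical.

```shell
# announce_dirs DIR: print "you are in <path>" for each directory directly
# inside DIR whose name starts with a letter. find passes every name as a
# separate argument, so names with spaces stay intact.
announce_dirs() {
    find "$1" -mindepth 1 -maxdepth 1 -type d -name '[[:alpha:]]*' \
        -exec sh -c 'for dir in "$@"; do printf "you are in %s\n" "$dir"; done' sh {} +
}

# Example call: announce_dirs .
```

The body of the inline for loop can be replaced with whatever the script needs to do per directory, as long as "$dir" stays quoted.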