Move files starting with number and of type pdf [closed] - regex

Closed 9 years ago.
I am just a beginner in regex, so please forgive me if this question is too easy.
I have a bunch of files in a directory, and I want to move the ones that start with a number and are of type pdf. How do I use a regex with the mv command, and what would the regex be?

If you're typing this at the Linux command line, you're not actually using regex; the shell uses glob notation, which is different and worth reading up on. Globs cannot express every pattern a regex can, so for more complex requirements you need a real regex.
For your case, you can run grep on the output of ls to find the files meeting your requirement, then call mv on them. Something like this:
while read -r fileName; do mv "$fileName" destination_folder; done < <(ls -1 | grep -E '^[0-9].*\.pdf$')
Let's break it up:
while read -r fileName; do
    mv "$fileName" destination_folder
done < <(ls -1 | grep -E '^[0-9].*\.pdf$')
So basically you read through the directory listing with a while loop, which gets its input from the output of the last line, ls -1 | grep -E '^[0-9].*\.pdf$'. Using a while loop with read (instead of a simpler for loop) and quoting "$fileName" is necessary to cater for filenames containing spaces.
The command ls -1 | grep -E '^[0-9].*\.pdf$' simply lists the filenames one per line and keeps only those matching the regex: a leading digit, anything in between, and a .pdf extension.
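If the filenames can be truly arbitrary (leading whitespace, backslashes, even embedded newlines), a null-delimited pipeline is more robust than parsing ls; a sketch, assuming GNU find (for -regextype) and bash (for read -d ''):
find . -maxdepth 1 -type f -regextype posix-extended -regex '\./[0-9][^/]*\.pdf' -print0 |
    while IFS= read -r -d '' fileName; do
        mv "$fileName" destination_folder/
    done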

You could use find too:
find . -maxdepth 1 -name "[0-9]*.pdf" -exec mv {} destination \;
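Since find's -name pattern is itself a glob, the same glob also works directly in the shell when the PDFs sit in the current directory; a minimal sketch, assuming bash:
# If nothing matches, bash passes the literal pattern to mv, which will complain;
# enable nullglob (shopt -s nullglob) to make the expansion empty instead.
mv -- [0-9]*.pdf destination/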


How to extract pattern with sed [duplicate]

This question already has answers here:
How to grep for contents after pattern?
(8 answers)
Closed 4 years ago.
I have tried many times with different Unix tools to extract a pattern from a file, but I can't seem to get them to do what I want. I have a file like this:
[blah]
project=abc123
ON#IRUjdi2ujnq!De
And I want to capture the project name.
I've tried using grep, egrep, awk, and sed, but I haven't been able to get them to work. Here's my current attempt.
cat file | sed -n "s/project = \(.*\)/\1/p"
and here's the current output
abc123
ON#IRUjdi2ujnq!De
For some reason it's considering the last 2 lines to match. I thought my regex would require a literal match on project = but it seems this is not the case.
Can some unix wizard help me out here? Any standard unix tool is fine.
------ EDIT ------
So actually my problem was slightly different. I was actually doing
gcloud config list project | sed -n "s/project = \(.*\)/\1/p"
Sorry I thought that it wouldn't make a difference, but apparently this is the issue.
If you do gcloud config list project >> file
it will actually only write the listing of your projects to the file and then print
Your active configuration is: [default]
to the terminal, and this is what was messing things up. If I manually write the whole output of the gcloud command to a file and then run sed on that, it actually works. So it's something strange about how gcloud outputs its data.
With grep you could do:
grep -o '^project=.*' file | cut -f2- -d=
Result:
abc123
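If you would rather stay with sed or awk, anchoring the pattern works just as well; a sketch against the sample file shown in the question (note it assumes no spaces around the =, unlike the gcloud output described in the edit):
sed -n 's/^project=\(.*\)/\1/p' file
# or
awk -F= '$1 == "project" {print $2}' file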

Extract JSON value in Bash [duplicate]

This question already has answers here:
Read JSON data in a shell script [duplicate]
(4 answers)
Closed 6 years ago.
In Bash I saved the response data into a variable.
The result looks like this:
{"token_type":"Bearer","access_token":"022-8baa5324-f57b-445d-c5ec-821c63a5fd35","expires_in":3600,"scope":"any-website.com"}
Now I want to extract the value of the access token into another variable.
In Linux I solved that in this way and it works:
echo "$response_json" | grep -oP '(?<="access_token":")[^"]*'
As a result I get:
022-8baa5324-f57b-445d-c5ec-821c63a5fd35
My problem is that macOS grep does not support the -P option (Perl regular expressions), and -E does not work with that lookbehind expression.
I would appreciate any help with a solution that does not require installing additional tools.
Everyone says they don't want to install new tools, but really, line-oriented tools like grep simply weren't designed to cope with structured text like JSON. If you are going to work with JSON, get tools designed to process it.
jq is one such option:
$ echo "$response_json" | jq -r '.access_token'
022-8baa5324-f57b-445d-c5ec-821c63a5fd35
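If installing jq really is off the table, macOS still ships a usable escape hatch; a rough sketch, assuming python3 is available (it comes with the Xcode command line tools) and, for the sed fallback, that the JSON stays on a single line:
# Proper JSON parsing with the bundled python3:
printf '%s' "$response_json" | python3 -c 'import sys, json; print(json.load(sys.stdin)["access_token"])'
# Fragile but dependency-free, with BSD sed -E:
printf '%s' "$response_json" | sed -E 's/.*"access_token":"([^"]*)".*/\1/'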

Remove lines from a file which has a matching regex from another file [duplicate]

This question already has answers here:
How to remove the lines which appear on file B from another file A?
(12 answers)
Closed 7 years ago.
I have this shell script:
AVAIL_REMOVAL=$(grep -oPa '^.*(?=(\.com))' $HOME/dcheck/files/available.txt) | sed -i "/$AVAIL_REMOVAL/d" $HOME/dcheck/files/domains.txt
$HOME/dcheck/files/available.txt
unregistereddomain1.com available 15/12/28_14:05:27
unregistereddomain3.com available 15/12/28_14:05:28
$HOME/dcheck/files/domains.txt
unregistereddomain1
registereddomain2
unregistereddomain3
I want to remove the unregistereddomain1 and unregistereddomain3 lines from domains.txt. How can I do that?
Also, is there a faster solution than grep? This benchmark showed that grep needed the most time to execute: Deleting lines from one file which are in another file
EDIT:
This works with one-line files, but not with multi-line ones:
sed -i "/$(grep -oPa '^.*(?=(\.com))' $HOME/dcheck/files/available.txt)/d" $HOME/dcheck/files/domains.txt
EDIT 2:
Just copy here to have a backup. This solution needed for a domain checker bash script which if terminating some reason, at the next restart, it will remove the lines from the input file:
grep -oPa --no-filename '^.*(?=(\.com))' $AVAILABLE $REGISTERED > $GREPINPUT \
&& awk 'FNR==NR { a[$0]; next } !($0 in a)' $GREPINPUT $DOMAINS > $DOMAINSDIFF \
&& cat $DOMAINSDIFF > $DOMAINS \
&& rm -rf $GREPINPUT $DOMAINSDIFF
Most of the domain checker scripts here try to do this removal at the end of the script, but they don't consider what happens when the script is terminated without a graceful shutdown: on restart it checks every single line of the input file again, including the ones already checked. This approach solves that problem, so the script (with proper service management, such as docker-compose, systemd, or supervisord) can run for years on input files with millions of lines, until it has eaten up the input file entirely.
from man grep:
-f file
--file=file
Obtain patterns from file, one per line. The empty file contains
zero patterns, and therefore matches nothing. (-f is specified by POSIX.)
Regarding speed: performance can differ drastically depending on the regexp. The one you use looks suspicious; fixed-string matches are almost always the fastest.
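Putting -f to work on this question, a sketch (the /tmp scratch paths are placeholders; the lookahead still needs GNU grep's -P, and -vxF makes the second grep remove only exact, whole-line, fixed-string matches):
# Pull the bare domain names (everything before ".com") out of available.txt ...
grep -oPa '^.*(?=\.com)' "$HOME/dcheck/files/available.txt" > /tmp/patterns.txt
# ... then drop every line of domains.txt that exactly matches one of them.
grep -vxFf /tmp/patterns.txt "$HOME/dcheck/files/domains.txt" > /tmp/domains.new &&
    mv /tmp/domains.new "$HOME/dcheck/files/domains.txt"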

Filter specific lines from directory tree listing [closed]

Closed 9 years ago.
I have the following directory listing:
/home/a/b/c/d/5089/294265
/home/a/b/c/d/5089/79783
/home/a/b/c/d/41630
/home/a/b/c/d/41630/293520
/home/a/b/c/d/41630/293520/293520
...
I want to filter only the lines that go 7 directories deep. In this example I would need only the line: /home/a/b/c/d/41630/293520/293520
Please suggest.
Thanks
You could use grep. Saying:
grep -P '(/[^/]*){8}' inputfile
would return
/home/a/b/c/d/41630/293520/293520
Not sure how you are generating this listing, but if you were using find you could control the depth by specifying the -mindepth and -maxdepth options.
You can try:
find /home/x/y/z/ -print | awk -F/ 'NF>8'
or you could try
find /home/x/y/z/ -mindepth 7 -print
YourInput | sed 's|/.|&|8;t
d'
This deletes any line with fewer than 8 "/" characters each followed by something, leaving only the 7-directory-deep paths.
echo /home/a/b/c/d/*/*/*
should do the trick.
Using awk:
find /home | awk -F/ 'NF==9'
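The awk-based answers work because -F/ produces one field per path component plus a leading empty field, so the 8-component path the question wants has NF of 9; a quick check:
$ echo '/home/a/b/c/d/41630/293520/293520' | awk -F/ '{print NF}'
9
$ echo '/home/a/b/c/d/5089/294265' | awk -F/ '{print NF}'
8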

Add and Sort numbers in files [closed]

Closed 9 years ago.
I have directories like
./2012/NY/F/
./2012/NJ/M/
....
Under these directories, there are files with names like Zoe etc...
Each file contains a number.
I'd like to sum the numbers in files with the same file name across the different directories and then find the maximum of those sums. How should I write this?
To locate the files, use a glob such as specified in this question.
To do the actual summing, there are quite a few possibilities depending on the number of files and range of the numbers, but a reasonably general-purpose way would be with awk:
awk '{sum += $1} END { print sum }' file1 file2 ...
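For example, to total every file named Zoe under ./2012 in one shot (a sketch; the name is taken from the question, and each file is assumed to contain a plain number):
find ./2012 -type f -name Zoe -exec awk '{sum += $1} END {print sum}' {} +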
Suppose that your ./2012/NY/F, ./2012/sfs/XXS directories are all under some directory, say, /home/yourusername/data/.
You can try this if you are using *nix or have Cygwin installed on Windows:
cd /home/yourusername/data ; find ./ -name yourfile_name_to_lookup.txt | xargs awk 'BEGIN {sum=0} ; {sum+=$1} ; END {print sum} '
I assume the number starts in the first column of each file ($1).
If you know the unique names of the files and the file names don't have spaces in them, then the following may work.
cd 2012/
for i in "Zoe" "file2" "file3"
do
    k=$(cat $(find . -type f -name "$i"))
    echo $k | awk '{for(i=t=0;i<NF;) t+=$++i; $0=t}1'
done | sort -rn
This will sum up files with the same name from the subdirectories under 2012, and sort -rn will return the sums in max-to-min order.
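The awk one-liner above just sums every whitespace-separated number on the echoed line; an equivalent, arguably easier-to-read sketch (the file names are the question's hypothetical ones, and each file is assumed to hold plain numbers):
for name in Zoe file2 file3; do
    find ./2012 -type f -name "$name" -exec cat {} + |
        awk -v n="$name" '{sum += $1} END {print sum, n}'
done | sort -rn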
I assume that the entire content of each file is a single integer. This requires bash 4 for the associative array.
declare -A sum_for_file
for path in ./2012/*/*/*; do
    (( sum_for_file["$(basename "$path")"] += $(< "$path") ))
done

max=0
for file in "${!sum_for_file[@]}"; do
    if (( ${sum_for_file["$file"]} > max )); then
        max=${sum_for_file["$file"]}
        maxfile=$file
    fi
    # you didn't say you needed to print it, but if you do
    printf "%d\t%s\n" ${sum_for_file["$file"]} "$file"
done
echo "the maximum sum is $max, found in files named $maxfile"