Script to place files in folder by extracting date from filename - regex

I know this has been asked many times, but I am terrible with bash and do not understand its regex format, so I figured I'd ask for help.
I have a security camera which writes files to a folder in this format:
MDalarm_20170320_084514.mkv
so it goes -- MDalarm_yearmonthday_hourminutesecond.mkv
I want to create a cronjob that will run a script to clean this up, by doing the following:
Taking the files and placing them in a folder for year/month/day, then renaming each file to the time only, e.g. 08_26_15.mkv; even 082615.mkv would be fine if that is too much of a hassle.
So in the example of MDalarm_20170320_084514.mkv
it should produce
/2017/03/20/08_45_14.mkv
or similar.
The files will be placed in the root folder as they come and the script will run once/twice a day on the folder for cleanup.
I'm decent with regex in PHP/JS/etc., but I do not understand the bash flavour well enough to get this done. I sincerely appreciate the help.
Cheers!

Use this to make the desired file name
$ echo MDalarm_20170320_084514.mkv | sed -E "s/^MDalarm_[[:digit:]]{8}_//"
084514.mkv
and this to make the desired folder name
$ echo MDalarm_20170320_084514.mkv | sed -E "s/^MDalarm_([[:digit:]]{4})([[:digit:]]{2})([[:digit:]]{2})_.*$/\/\1\/\2\/\3/"
/2017/03/20
Use them in shell commands to make folder (if needed) and copy/rename/move file.
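A minimal sketch of how the two expressions might be combined into one step. The function name sort_clip is hypothetical, the leading slash is dropped so that the tree is created relative to the current directory, and using mv rather than cp is an assumption:

```shell
#!/bin/bash
# Hypothetical helper: file one camera clip under YYYY/MM/DD/HHMMSS.ext.
# Assumes names of the form MDalarm_YYYYMMDD_HHMMSS.ext, as in the question.
sort_clip() {
    local f=$1 name dir
    # Refuse names that do not look like camera clips, so that sed
    # cannot hand back the unchanged name as a "directory".
    case $f in
        MDalarm_[0-9]*_[0-9]*.*) ;;
        *) return 1 ;;
    esac
    # Same substitutions as above, minus the leading slash.
    name=$(printf '%s\n' "$f" | sed -E 's/^MDalarm_[[:digit:]]{8}_//')
    dir=$(printf '%s\n' "$f" | sed -E 's/^MDalarm_([[:digit:]]{4})([[:digit:]]{2})([[:digit:]]{2})_.*$/\1\/\2\/\3/')
    mkdir -p "$dir" && mv -- "$f" "$dir/$name"
}
```

It could be called from a loop such as: for f in MDalarm_*.mkv; do sort_clip "$f"; done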

This is what I ended up with, and it works. Thank you, Yunnosch, for the regex.
#!/bin/bash
if [[ $(ls | grep -c mkv) == 0 ]]
then
    echo "NO MKV FILES"
else
    for f in *.mkv; do
        name=$(echo "$f" | sed -E "s/^MDalarm_[[:digit:]]{8}_([[:digit:]]{2})([[:digit:]]{2})([[:digit:]]{2})(.*)$/\1h-\2m-\3s\4/")
        dir=$(echo "$f" | sed -E "s/^MDalarm_([[:digit:]]{4})([[:digit:]]{2})([[:digit:]]{2})_.*$/\1\/\2\/\3/")
        mkdir -p "$dir"
        mv "$f" "$name"
        mv "$name" "$dir"
    done
fi
Once someone wrote the regex out, I figured out the format: different from what I know, yet similar.
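For readers coming from PHP or JS regexes, the same split can probably also be done without sed, using bash's own =~ operator and the BASH_REMATCH array. This is only a sketch; the function name clip_target is hypothetical, and it produces the 08_45_14.mkv style from the question rather than the 08h-45m-14s style of the script above:

```shell
#!/bin/bash
# Hypothetical helper: print the target path for one clip name, or fail
# if the name does not match MDalarm_YYYYMMDD_HHMMSS.ext.
clip_target() {
    local re='^MDalarm_([0-9]{4})([0-9]{2})([0-9]{2})_([0-9]{2})([0-9]{2})([0-9]{2})(\..*)$'
    [[ $1 =~ $re ]] || return 1
    # BASH_REMATCH[1..7]: year, month, day, hour, minute, second, extension
    printf '%s/%s/%s/%s_%s_%s%s\n' "${BASH_REMATCH[@]:1:7}"
}
```

Inside the loop this might be used as: target=$(clip_target "$f") && mkdir -p "${target%/*}" && mv -- "$f" "$target"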

Related

how to change files names using pattern in Mac OS?

I have 55 files named:
result_fresh_1.txt
...
result_fresh_55.txt
I want to rename them to:
result_bl_1.txt
...
result_bl_55.txt
how can I do this automatically?
for file in result_fresh_*.txt
do
    mv "$file" "$(echo "$file" | sed 's/_fresh_/_bl_/')"
done
I have a Perl-based rename command (also available on Linux machines, so that equivalent source must be available elsewhere too) that would reduce it to:
rename 's/_fresh_/_bl_/' result_fresh_*.txt
There aren't any spaces in the names shown, but the code would work sanely unless the names included newlines.
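If neither sed nor a Perl-based rename is wanted, bash's pattern replacement in parameter expansion can likely do the same job. A minimal sketch; the function name rename_fresh and its directory argument are hypothetical:

```shell
#!/bin/bash
# Hypothetical helper: rename every result_fresh_N.txt in directory $1
# to result_bl_N.txt using ${var/pattern/replacement} (a bash feature).
rename_fresh() {
    local file base
    for file in "$1"/result_fresh_*.txt; do
        [ -e "$file" ] || continue      # the glob matched nothing
        base=${file##*/}                # strip the directory part
        mv -- "$file" "$1/${base/_fresh_/_bl_}"
    done
}
```

For the files in the current directory this would be called as: rename_fresh .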
Try:
for f1 in {1..55}; do
    if [ -f "result_fresh_$f1.txt" ]; then
        mv "result_fresh_$f1.txt" "result_bl_$f1.txt"
    fi
done

Find and replace string that includes quotes within files in multiple directories - unix aix

So here's the scenario. I'd like to change the following value from true to false in hundreds of files in an installation, but I can't figure out the command and have been working on this for a few days now. What I have is a simple script which looks for all instances of a file and stores the results in a file. I'm using this command to find the files I need to modify:
find /directory -type f \( -name 'filename' \) > file_instances.txt
Now what I'd like to do is run the following command, or a variation of it, to modify the following value:
sed 's/directoryBrowsingEnabled="false"/directoryBrowsingEnabled="true"/g' $i > $i
When I tested the above command in the script, it blanked out the file when it attempted to replace the string, but if I run the command against a single file, the change is made correctly.
Can someone please shed some light on to this?
Thank you in advance
What has semi-worked for me is the following:
You can call sed with the -i option instead of doing > $i. You can even do a backup of the old file just in case you have a problem by adding a suffix.
sed -e 'command' -i.backup myfile.txt
This will execute command inplace on myfile.txt and save the old file in myfile.txt.backup.
EDIT:
Not using -i may indeed result in blank files. With > $i the shell opens and truncates the output file before sed even starts, so by the time sed reads $i there is nothing left in it.
You can convince yourself of this by some simple cat commands:
$ echo "This is a test" > test.txt
$ cat test.txt > test.txt # This will return an error cat being smart
$ cat <test.txt >test.txt # This will blank the file, cat being not that smart
On AIX you might be missing the -i option of sed. Sad. You could make a script that moves each file to a tmp file and redirects (with sed) to the original file or try using a here-construction with vi:
cat file_instances.txt | while read file; do
vi ${file}<<END >/dev/null 2>&1
:1,$ s/directoryBrowsingEnabled="false"/directoryBrowsingEnabled="true"/g
:wq
END
done
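The temporary-file route mentioned above might look like the following sketch. The function name sed_inplace is hypothetical, and the availability of mktemp on the target system is an assumption (a fixed scratch path could be substituted). The result is copied back with cat rather than mv so that the original file keeps its permissions:

```shell
#!/bin/sh
# Hypothetical helper: apply a sed script to one file without relying on -i.
# $1 = sed script, $2 = file to edit.
sed_inplace() {
    script=$1
    file=$2
    tmp=$(mktemp) || return 1          # assumption: mktemp is installed
    # Write the result to the temporary file first; only overwrite the
    # original once sed has finished reading it successfully.
    sed "$script" "$file" > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return "$status"
}
```

It could then be driven from the list of files: while IFS= read -r f; do sed_inplace 's/directoryBrowsingEnabled="false"/directoryBrowsingEnabled="true"/g' "$f"; done < file_instances.txt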

Safe search&replace on linux

Let's consider I have files located in different subfolders and I would like to search, test and replace something into these files.
I would like to do it in three steps:
Search of a specific pattern (with or without regexp)
Test to replace it with something (with or without regexp)
Apply the changes only to the concerned files
My current solution is to define some aliases in my .bashrc in order to easily use grep and sed:
alias findsrc='find . -name "*.[ch]" -or -name "*.asm" -or -name "*.inc"'
alias grepsrc='findsrc | xargs grep -n --color '
alias sedsrc='findsrc | xargs sed '
Then I use
grepsrc <pattern> to search my pattern
(no solution found yet)
sedsrc -i 's/<pattern>/replace/g'
Unfortunately this solution does not satisfy me. The first issue is that sed touches all the files, even those with no changes. Also, the need to use aliases does not look very clean to me.
Ideally I would like have a workflow similar to this one:
Register a new context:
$ fetch register 'mysrcs' --recurse *.h *.c *.asm *.inc
Context list:
$ fetch context
1. mysrcs --recurse *.h *.c *.asm *.inc
Extracted from ~/.fetchrc
Find something:
$ fetch files mysrcs /0x[a-f0-9]{3}/
./foo.c:235 Yeah 0x245
./bar.h:2 Oh yeah 0x2ac hex
Test a replacement:
$ fetch test mysrcs /0x[a-f0-9]{3}/0xabc/
./foo.c:235 Yeah 0xabc
./bar.h:2 Oh yeah 0xabc hex
Apply the replacement:
$ fetch subst --backup mysrcs /0x[a-f0-9]{3}/0xabc/
./foo.c:235 Yeah 0xabc
./bar.h:2 Oh yeah 0xabc hex
Backup number: 242
Restore in case of mistake:
$ fetch restore 242
This kind of tool looks pretty standard to me. Everybody needs to search and replace. What alternative can I use that is standard in Linux?
#!/bin/ksh
# Call the script with the two patterns (search, then replace) as arguments,
# assuming both are sed-compatible regexes
SearchStr="$1"
ReplaceStr="$2"
# Starts the search from the current folder and considers every file;
# if more filtering is needed, put a find in front with a pipe
grep -l -r "$SearchStr" . | while IFS= read -r ThisFile
do
    sed -i -e "s/${SearchStr}/${ReplaceStr}/g" "${ThisFile}"
done
This should be a base script to adapt to your needs.
I often have to perform such maintenance tasks. I use a mix of find, grep, sed, and awk.
And instead of aliases, I use functions.
For example:
# i. and ii.
function grepsrc {
    # Group the -name tests so that -exec applies to all of them
    find . \( -name "*.[ch]" -or -name "*.asm" -or -name "*.inc" \) -exec grep -Hn "$1" {} +
}
# iii.
function sedsrc {
    grepsrc "$1" | awk -F: '{print $1}' | uniq | while read -r f; do
        sed -i s/"$1"/"$2"/g "$f"
    done
}
Usage example:
sedsrc "foo[bB]ar*" "polop"
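The question's second step, testing a replacement before applying it, is not covered by the functions above. A minimal sketch of one way to preview it, assuming a sed-compatible pattern that contains no slash; the function name previewsrc is hypothetical:

```shell
#!/bin/bash
# Hypothetical helper: show what a substitution would change, without writing.
# $1 = sed-compatible pattern, $2 = replacement, remaining args = files.
previewsrc() {
    local pat=$1 rep=$2 f line
    shift 2
    for f in "$@"; do
        # -n together with the p flag prints only the lines on which
        # the substitution actually applied; the file is not modified.
        sed -n "s/$pat/$rep/gp" "$f" | while IFS= read -r line; do
            printf '%s: %s\n' "$f" "$line"
        done
    done
}
```

For example, previewsrc '0x[a-f0-9]\{3\}' '0xabc' foo.c bar.h would list the rewritten lines, after which the real replacement can be run on the same files.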
for F in $(grep -Rl <pattern>) ; do sed 's/search/replace/' "$F" | sponge "$F" ; done
grep with the -l argument just lists files that match
We then use an iterator to just run those files which match through sed
We use the sponge program from the moreutils package to write the processed stream back to the same file
This is simple and requires no additional shell functions or complex scripts.
If you want to make it safe as well... check the folder into a Git repository. That's what version control is for.
Yes, there is a tool that does exactly what you are looking for: Git. Why manage the backup of your files yourself in case of mistakes when a specialized tool can do that job for you?
You split your request in 3 subquestions:
How quickly search into a subset of my files?
How to apply a substitution temporarily, then go back to the original state?
How to substitute into your subset of files?
We first need to do some jobs in your workspace. You need to init a Git repository then add all your files into this repository:
$ cd my_project
$ git init
$ git add **/*.h **/*.c **/*.inc
$ git commit -m "My initial state"
Now, you can quickly get the list of your files with:
$ git ls-files
To do a replacement, you can either use sed, perl or awk. Here the example using sed:
$ git ls-files | xargs sed -i -e 's/search/replace/'
If you are not happy with this change, you can roll-back anytime with:
$ git checkout HEAD -- .
This allows you to test your change and step-back anytime you want to.
Now, we have not simplified the commands yet. So I suggest adding an alias to your Git configuration file, usually located at ~/.gitconfig. Add this:
[alias]
sed = ! git grep -z --full-name -l '.' | xargs -0 sed -i -e
So now you can just type:
$ git sed s/a/b/
It's magic...

Remove duplicate filename extensions

I have thousands of files named something like filename.gz.gz.gz.gz.gz.gz.gz.gz.gz.gz.gz
I am using the find command like this find . -name "*.gz*" to locate these files and either use -exec or pipe to xargs and have some magic command to clean this mess, so that I end up with filename.gz
Someone please help me come up with this magic command that would remove the unneeded instances of .gz. I had tried experimenting with sed 's/\.gz//' and sed 's/(\.gz)//' but they do not seem to work (or to be more honest, I am not very familiar with sed). I do not have to use sed by the way, any solution that would help solve this problem would be welcome :-)
one way with find and awk:
find $(pwd) -name '*.gz'|awk '{n=$0;sub(/(\.gz)+$/,".gz",n);print "mv",$0,n}'|sh
Note:
I assume there are no special characters (like spaces) in your filenames. If there are, you need to quote the filenames in the mv command.
I added $(pwd) to get the absolute path of each found name.
You can remove the trailing |sh to check that the generated mv commands are correct.
If everything looks good, add the |sh back to execute the mv commands.
You may use
ls a.gz.gz.gz |sed -r 's/(\.gz)+/.gz/'
or without the regex flag
ls a.gz.gz.gz |sed 's/\(\.gz\)\+/.gz/'
ls *.gz | perl -ne '/((.*?\.gz).*)/; print "mv $1 $2\n"'
It will print shell commands to rename your files, it won't execute those commands. It is safe. To execute it, you can save it to file and execute, or simply pipe to shell:
ls *.gz | ... | sh
sed is great for replacing text inside files.
You can do that with bash string substitution:
for file in *.gz.gz; do
mv "${file}" "${file%%.*}.gz"
done
This might work for you (GNU sed):
printf '%s\n' *.gz | sed -r 's/^([^.]*)(\.gz){2,}$/mv -v & \1\2/e'
find . -name "*.gz.gz" |
while read f; do echo mv "$f" "$(sed -r 's/(\.gz)+$/.gz/' <<<"$f")"; done
This only previews the renaming (mv) command; remove the echo to perform actual renaming.
Processes matching files in the current directory tree, as in the OP (and not just files located directly in the current directory).
Limits matching to files that end in at least 2 .gz extensions (so as not to needlessly process files that end in just one).
When determining the new name with sed, makes sure that substring .gz doesn't just match anywhere in the filename, but only as part of a contiguous sequence of .gz extensions at the end of the filename.
Handles filenames with special chars. such as embedded spaces correctly (with the exception of filenames with embedded newlines.)
Using bash string substitution:
for f in *.gz.gz; do
mv "$f" "${f%%.gz.gz*}.gz"
done
This is a slight modification of jaypal's nice answer (which would fail if any of your files had a period as part of its name, such as foo.c.gz.gz). (Mine is not perfect, either) Note the use of double-quotes, which protects against filenames with "bad" characters, such as spaces or stars.
If you wish to use find to process an entire directory tree, the variant is:
find . -name \*.gz.gz | \
while read f; do
mv "$f" "${f%%.gz.gz*}.gz"
done
And if you are fussy and need to handle filenames with embedded newlines, change the while read to while IFS= read -r -d $'\0', and add a -print0 to find; see How do I use a for-each loop to iterate over file paths output by the find utility in the shell / Bash?.
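Put together, the newline-safe variant described above might look like this sketch. The function name collapse_gz is hypothetical, -d '' is used as an equivalent spelling of -d $'\0', and -type f is an added assumption; like the loops above, it still misbehaves if a directory in the path itself contains .gz.gz:

```shell
#!/bin/bash
# Hypothetical helper: collapse repeated trailing .gz extensions for every
# file under directory $1, tolerating any character in the file names.
collapse_gz() {
    local f
    # -print0 and read -d '' both use NUL as the separator, the one
    # byte that cannot appear inside a path.
    while IFS= read -r -d '' f; do
        mv -- "$f" "${f%%.gz.gz*}.gz"
    done < <(find "$1" -type f -name '*.gz.gz' -print0)
}
```

Calling collapse_gz . would process the current directory tree.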
But is this renaming a good idea? How was your filename.gz.gz created? gzip has guards against accidentally doing so. If you circumvent these via something like gzip -c $1 > $1.gz, buried in some script, then renaming these files will give you grief.
Another way with rename:
find . -iname '*.gz.gz' -exec rename -n 's/(\.\w+)\1+$/$1/' {} +
When happy with the results remove -n (dry-run) option.

Problem using "find" in shell scripting

A while back I wrote a shell script that automatically runs a Python script over any C++ files it can find in a specified directory. I tested it, it worked fine, and I saved it and forgot about it; the problem is that I've come back to use it and encountered a problem (turns out I didn't test it enough, eh?).
Anyway, the source directory paths I was testing before had no spaces in their names, e.g.
/somedirectory/subfolder/src/
But when I try and run the script using a path with spaces in it, e.g.
/Documents\ and\ Settings/subfolder/src/
It doesn't work.
I've located where the problem is, but I'm not sure how to fix it. Here's the code causing the problem:
names=( $(find "${SOURCE_ROOT_DIRECTORY}" -regex "[A-Za-z0-9]*.*\(cpp\|h\|cc\)$"))
The regular expression works with paths with no spaces, so I'm not sure if there's a problem with the regular expression, or if the "find" command stops when it encounters a space.
Can anyone help?
find doesn't "stop" when it hits files with spaces in their names. The problem occurs when you try to store them as elements in an array: the unquoted command substitution is split on every character in IFS.
Change IFS to the newline character (by default it is space, tab, and newline):
#change IFS
OLDIFS=$IFS
IFS=$'\n'
#run find
names=($(find . -regex "[A-Za-z0-9]*.*\(cpp\|h\|cc\)$"))
#restore IFS
IFS=$OLDIFS
#test out the array
echo "size: ${#names[@]}"
for i in "${names[@]}"
do
echo "$i"
done
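The effect of IFS on the array assignment can be checked with a small experiment. This is only a sketch with hypothetical function names; it also shows that declaring IFS local inside a function is one way to avoid the manual save and restore:

```shell
#!/bin/bash
# Count the array elements produced from one path, with the default IFS.
count_default_ifs() {
    local a=( $(printf '%s\n' "$1") )
    echo "${#a[@]}"
}

# Same, but splitting on newlines only; "local IFS" is undone on return.
count_newline_ifs() {
    local IFS=$'\n'
    local a=( $(printf '%s\n' "$1") )
    echo "${#a[@]}"
}

count_default_ifs "/Documents and Settings/subfolder/src/main.cpp"   # prints 3
count_newline_ifs "/Documents and Settings/subfolder/src/main.cpp"   # prints 1
```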
The canonical usage pattern is:
find subfolder/ -type f -name '*.cpp' -print0 |
xargs -0rn1 myscript.py
This has all the bells and whistles, you can probably do without -type f and perhaps some of the xargs flags
you can use read
while read -r file; do
names+=("$file")
done < <(find "${SOURCE_ROOT_DIRECTORY}" -regex "[A-Za-z0-9]*.*\(cpp\|h\|cc\)$")
a small test
$ mkdir -p /tmp/test && cd $_
$ touch foo bar "ab cd"
$ ls
ab cd bar foo
$ while read -r file; do names+=("$file"); done < <(find /tmp/test -type f);
$ echo ${#names[@]}
3
$ for file in "${names[@]}"; do echo "$file"; done;
/tmp/test/ab cd
/tmp/test/bar
/tmp/test/foo
$ unset names file