Regex help for replacing directories under certain conditions

I need to write a regex to update some registry files for work ahead of an upcoming migration.
I have to dynamically update every string in the files that meets these conditions:
1. It has a directory path that starts with C:
2. The path does not contain "xyz" or "abc"
3. If the string contains a file name [EX: *.ext], only the directory path is updated; the file name and extension are left alone
4. Any trailing characters in the string [EX: " " " or ")"] that close out the string parameters in the files are kept, not replaced
I have it doing 1 and 2, but not 3 and 4. I'm using PowerShell 4.0, recursively checking files in a directory and producing a new file with the changes for each file found in each subdirectory.
The conversion should look like:
FROM: setting1="\"C:\\\\this\\\\is\\\\\adirectory\""
TO: setting1="\"O:\\\\\New_Directory_Path\""
FROM: othersetting="\"C:\\\\this\\\\abc\\\\directory""
TO: UNCHANGED
FROM: thisfile="\"(C:\\\\this\\\\directory\\\\somefile.some_ext)""
TO: thisfile="\"(O:\\\\New_Directory_Path\\\\somefile.some_ext)""
I'm at my wits' end on this one. This is what I have so far, in terms of regex:
gc $file | % {$_ -replace "C:(?!.*abc.*)(?!.*xyz.*)(?!.*\..*).*","O:\\\\New_Directory_Path\"} | out-file "$filePath\$newFileName"
Hoping someone on here can lend a hand. Also, this is my first time posting on here, sorry for not putting my code in tags.

Try this pattern (edited):
C:(?!.*(abc|xyz)).*?(?=[^\\]+\.[^\\]+|$)

Related

remove some character string from multiple filenames

My OS is Windows 10. I have a folder containing almost 470 files with names like '1.JPG.JPG.jpg', '99.JPG.jpg', or '335.JPG.JPG.jpg'.
Notice that the file names look like the values of an ID column used for reference. Some files have '.JPG' twice and others once. Some files also have 'jpg' instead of 'JPG' (both without quotes) in the middle of the file name.
I want to rename every file to the number at the start of its name followed by .jpg, such as 1.jpg, 99.jpg, or 335.jpg. All of the files are definitely JPEGs; there are no .png, .bmp, etc.
How can I do this?
EDIT: Can a script take the digit part of each file name, hard-code .jpg, and rename all the files at once? If so, please explain how it can be done.
It's done using the following PowerShell commands, found on the internet:
step 1) ls | Rename-Item -NewName {$_.name -replace ".JPG.JPG",""}
step 2) ls | Rename-Item -NewName { [io.path]::ChangeExtension($_.name, "jpg") }
I hope other novices like me will benefit from this.
Link: https://www.windowscentral.com/how-rename-multiple-files-bulk-windows-10#:~:text=You%20can%20press%20and%20hold,file%20to%20select%20a%20group.

Iterating over directory with specified path in Bash

pathToBins=$1
bins="${pathToBins}contigs.fa.metabat-bins-*"
for fileName in $bins
do
echo $fileName
done
My goal is to attach a path to my file name. I can iterate over a folder and get the file names when I don't attach the path. The problem is that when I add the path, my pattern no longer works: echo $fileName prints "/home/erikrasmussen/Desktop/Script/realLargeMetaBatBinscontigs.fa.metabat-bins-*", where the '*' is treated as a literal string. How can I get the path and also the full file name while iterating over a folder of files?
Although I don't really know how your files are arranged on your hard drive, a casual glance at "/home/erikrasmussen/Desktop/Script/realLargeMetaBatBinscontigs.fa.metabat-bins-*" suggests that it is missing a / before contigs. If that is the case, then you should change your definition of bins to:
bins="${pathToBins}/contigs.fa.metabat-bins-*"
However, it is much more robust to use bash arrays instead of relying on filenames to not include whitespace and metacharacters. So I would suggest:
bins=("${pathToBins}"/contigs.fa.metabat-bins-*)
for fileName in "${bins[@]}"
do
echo "$fileName"
done
Bash normally does not expand a pattern which doesn't match any file, so in that case you will see the original pattern. If you use the array formulation above, you could set the bash option nullglob, which will cause the unmatched pattern to vanish instead, leaving an empty array.
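A minimal sketch of the array-plus-nullglob approach described above, wrapped in a hypothetical function named list_bins. The function name, the trailing-slash handling, and the restoring of the previous nullglob setting are assumptions added for illustration, not part of the original answer:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every bin file under the given directory,
# one per line, with its path attached. Prints nothing if none match.
list_bins() {
    local pathToBins=${1%/}                   # drop one trailing slash, if present
    local restore bins fileName
    restore=$(shopt -p nullglob || true)      # remember the current nullglob setting
    shopt -s nullglob                         # an unmatched pattern expands to nothing
    bins=("$pathToBins"/contigs.fa.metabat-bins-*)
    eval "$restore"                           # put nullglob back the way it was
    for fileName in "${bins[@]}"; do
        printf '%s\n' "$fileName"
    done
}
```

Calling list_bins "$1" from the script would then work whether or not the argument ends in a slash, and the loop body is simply skipped when no file matches.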

bulk file renaming in bash, to remove name with spaces, leaving trailing digits

Can a bash/shell expert help me with this? Each time I use Adobe PDF to split a large PDF file (say its name is X.pdf) into separate pages, where each page is one PDF file, it creates files with this pattern
"X 1.pdf"
"X 2.pdf"
"X 3.pdf" etc...
The file name "X" above is the original file name, which can be anything. It then adds one space after the name, then the page number. Page numbers always start at 1 and go up to the number of pages. There is no option in Adobe PDF to change this.
I need to run a shell command to simply remove/strip out all the "X " part, and just leave the digits, like this
1.pdf
2.pdf
3.pdf
....
100.pdf ...etc..
Not being good at pattern matching, I'm not sure what regular expression I need.
I know I need something like
for i in *.pdf; do mv "$i" ........; done
And it is the ........ part I do not know how to do.
This only needs to run on Linux/Unix system.
Use sed:
for i in *.pdf; do mv "$i" "$(sed 's/.*[[:blank:]]//' <<< "$i")"; done
It would be even simpler with rename:
rename 's/.*\s//' *.pdf
You can remove everything up to (including) the last space in the variable with this:
${i##* }
That's "star space" after the double hash, meaning "anything followed by space". ${i#* } would remove up to the first space.
So run this to check:
for i in *.pdf; do echo mv -i -- "$i" "${i##* }" ; done
and remove the echo if it looks good. The -i suggested by Gordon Davisson will prompt you before overwriting, and -- signifies end of options, which prevents things from blowing up if you ever have filenames starting with -.
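As a sketch of the difference between the two expansions, and of how the loop might be wrapped up, here is a small example. The function names and the sample file name are hypothetical, and skipping names that contain no space is an added assumption, since such names would otherwise be moved onto themselves:

```shell
#!/usr/bin/env bash
# ${1##* } removes the longest prefix ending in a space (up to the last space);
# ${1#* } removes the shortest one (up to the first space).
strip_to_last_space()  { printf '%s\n' "${1##* }"; }
strip_to_first_space() { printf '%s\n' "${1#* }"; }

strip_to_last_space  "My Report 12.pdf"    # prints 12.pdf
strip_to_first_space "My Report 12.pdf"    # prints Report 12.pdf

# Hypothetical wrapper: rename every "NAME N.pdf" in a directory to "N.pdf".
rename_to_page_numbers() {
    local dir=$1 f base new
    for f in "$dir"/*.pdf; do
        if [ ! -e "$f" ]; then continue; fi            # pattern matched nothing
        base=${f##*/}                                  # file name without the directory
        new=${base##* }                                # drop everything up to the last space
        if [ "$new" = "$base" ]; then continue; fi     # no space in the name: leave it alone
        mv -i -- "$f" "$dir/$new"
    done
}
```

Running rename_to_page_numbers . would apply the rename to the current directory; as above, mv -i asks before overwriting an existing file.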
If you just want to do bulk renaming of files (or directories) and don't mind using external tools, then here's mine: rnm
The command to do what you want would be:
rnm -rs '/.*\s//' *.pdf
.*\s selects the part up to and including the last white space, and it is replaced with an empty string.
Note:
It doesn't overwrite any existing files (it throws a warning if it finds an existing file with the target name).
This operation is also failsafe: you can undo the changes made by the last rnm command with rnm -u.
Here's a list of documents for rnm.

Combine Bash Regex Expressions

I have a server that has some pages written in LESS. I have a launch.sh script that essentially builds all the CSS files from LESS, puts them in a directory, and starts the server (written in Node.js).
Here is what the script looks like currently:
# Searches the CSS directory for LESS files
for file in views/less/*.less
do
FROM=$file
A=${file/.*/.css}
B=${A/less/css}
TO=${B/views/resources}
echo "$FROM -> $TO"
# Compiles each LESS file into a CSS file of the same name with minified output
lessc --clean-css $FROM $TO
done
Everything works fine, but I was wondering if I could condense the regex expressions, denoted as A and B. Essentially the script takes the entire build path, let's say:
/views/less/style.less
and replaces less to css and replaces views to resources. So, the final path (after the conversion process) becomes:
/resources/css/style.css
Any help would be greatly appreciated!
You can replace all occurrences of less in a variable by doubling the slash:
A=${file//less/css}
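A sketch of how the condensed version might look, using the path from the question. The function names are hypothetical. Note that the double-slash form replaces every occurrence of less, including one inside the file name itself, so a second variant that rebuilds the path from the base name is shown as an alternative; that variant hard-codes resources/css as the target directory, which is an assumption based on the example paths:

```shell
#!/usr/bin/env bash
# Variant 1: replace every "less" with "css", then "views" with "resources".
less_to_css_path() {
    local swapped=${1//less/css}
    printf '%s\n' "${swapped/views/resources}"
}

# Variant 2: keep only the base name and swap its extension, so a file
# name that happens to contain "less" is left intact.
less_to_css_path_by_basename() {
    local base=${1##*/}                        # strip the directory part
    printf '%s\n' "resources/css/${base%.less}.css"
}

less_to_css_path "views/less/style.less"                  # prints resources/css/style.css
less_to_css_path "views/less/endless.less"                # prints resources/css/endcss.css
less_to_css_path_by_basename "views/less/endless.less"    # prints resources/css/endless.css
```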

Reorganizing large amount of files with regex?

I have a large amount of files organized in a hierarchy of folders and particular file name notations and extensions. What I need to do, is write a program to walk through the tree of files and basically rename and reorganize them. I also need to generate a report of the changes and information about the transformed organization along with statistics.
The solution that I can see is to walk through the tree of files just like any other tree data structure, and use regular expressions on the path names of the files. This seems very doable and not a huge amount of work. My questions are: are there tools I should be using other than just C# and regex? Perl comes to mind since I know it was originally designed for report generation, but I have no experience with the language. Also, is using regex viable in this situation? I have only used it on file CONTENTS, not on file names and organization.
Yes, Perl can do this. Here's something pretty simple:
#! /usr/bin/env perl
use strict;
use warnings;
use File::Find;
my $directory = "."; #Or whatever directory tree you're looking for...
find (\&wanted, $directory);
sub wanted {
    print "Full File Name = <$File::Find::name>\n";
    print "Directory Name = <$File::Find::dir>\n";
    print "Basename = <$_>\n";
    # Using tests to see various things about the file
    if (-f $File::Find::name) {
        print "File <$File::Find::name> is a file\n";
    }
    if (-d $File::Find::name) {
        print "Directory <$File::Find::name> is a directory\n";
    }
    # Using regular expressions on the file name
    if ($File::Find::name =~ /beans/) {
        print "The file <$File::Find::name> contains the string <beans>\n";
    }
}
The find command takes the directory, and calls the wanted subroutine for each file and directory in the entire directory tree. It is up to that subroutine to figure out what to do with that file.
As you can see, you can do various tests on the file, and use regular expressions to parse the file's name. You can also move, rename, or delete the file to your heart's content.
Perl will do exactly what you want. Now, all you have to do is learn it.
If you can live with glob patterns instead of regular expressions, mmv might be an option.
> ls
a1.txt a2.txt b34.txt
> mmv -v "?*.txt" "#2 - #1.txt"
a1.txt -> 1 - a.txt : done
a2.txt -> 2 - a.txt : done
b34.txt -> 34 - b.txt : done
Directories at any depth can be reorganized, too. Check out the manual. If you run Windows, you can find the tool in Cygwin.