How do I match a filename without quotes to a name with quotes? - regex

If somebody knows a better title, please do tell me.
Given these examples:
Three files called: Jacobs.Ladder.txt, Emma.Unplugged.2020.txt & Emma.txt
Three folders called: Jacobs.Ladder.2019, Emma.Unplugged & Emma
Three database-entries called: Jacob's Ladder, Emma: Unplugged & Emma
My PowerShell script is presented with the file, then needs to find the right folder, copy it there, then find the entry in the database and update it there. I need a script that catches all three cases.
So far, what I've done is this:
I take the file's BaseName property:
Jacobs.Ladder, Emma.Unplugged, Emma
I use Get-ChildItem to put all directories in a variable, do:
$Directories | Where-Object -Property 'Name' -Contains $File.BaseName
This finds exact matches, so it will only find the folder Emma
Next, if nothing was found, I do:
$Directories | Where-Object -Property 'Name' -Like ('{0}*' -f $File.BaseName)
This finds the folder Jacobs.Ladder.2019. Emma was already found in the previous step, so that case has been handled. It will not, however, find Emma.Unplugged, because the file's BaseName is Emma.Unplugged.2020.
So that's my first challenge/question: how do I write a conditional that matches Emma.Unplugged.2020 to Emma.Unplugged?
Next, to match the file name with the database entries, I start with a replace to turn the dots into spaces.
I then search for an almost exact match:
$Database | Where-Object -Property 'Name' -Like $FileNameWithSpaces.Replace(' ','*')
I know, I could have just replaced the dots with asterisks. But I don't know yet if the name with spaces is going to be useful somewhere else.
Although this does compensate for the colon in Emma.Unplugged, this once again only finds Emma.
There is my second challenge: how do I form my query to match (again) Emma.Unplugged.2020 to Emma: Unplugged but also Jacobs.Ladder to Jacob's Ladder?
Of course, if you can think of more scenarios that I didn't catch here, feel free to add those in your code.
As for the topic title, my biggest challenge is that single quote in the word. Right now the best way I can think of is if I on-the-fly turn all database-names into their filename equivalent. Then Jacob's Ladder would become Jacobs.Ladder and I would have a match. Which would leave the 2020 as my biggest challenge.

Related

Transferring files based on their names using PowerShell

Our source location holds two types of files, distinguished by name. The first is named DM_psedocharge_<currentdate in YYYYMMDD format>.csv and the other Monthly_Extract_<currentdate in YYYYMMDD format>.csv.
The requirement states that the first file shall be transferred to the target folder C:\DMRelated and the second moved to the C:\MonthlyExtract folder. In both cases the source folder is the same.
I tried to capture the right file with a regular expression inside an if condition. If the file name matches "^DM_" then MOVE to C:\DMRelated, else MOVE to C:\MonthlyExtract. But the expression that I have chosen does not seem to be right. Can anyone please help me with this? I am new to PowerShell scripting.
Since you're always looking for the current date, you don't need pattern matching - you already know the exact name of the file!
Set-Location "C:\source\location"
Move-Item "DM_psedocharge_$(Get-Date -Format yyyyMMdd).csv" -Destination C:\DMRelated
Move-Item "Monthly_Extract_$(Get-Date -Format yyyyMMdd).csv" -Destination C:\MonthlyExtract
In the example above we use the Get-Date cmdlet (which defaults to the current date and time) with the -Format parameter to get a formatted date string in the form yyyyMMdd.

Check if string is like regex

I have a folder of many applications that are used for deployment on laptops or workstations. Now this folder is becoming a big mess because multiple people use this folder and everyone uses a different storage method. Therefore I wanted to write a script that helps manage the files in a way we can always find them.
In PowerShell I want to:
1. List all the files (.msi, .exe, and maybe more).
2. Determine if the files are correctly stored already (Developer\Application\Application_version_architecture.extension, i.e. Adobe\Flashplayer\Flashplayer_22_x64.msi).
3. If true, leave it.
4. If not, ask the user of the script some things about the application so the script can then rename it and move it to the correct folder.
Currently I'm stuck on step 2. I want to use a regex where I determine what the standard should be. However, it keeps ruling out applications that are correctly named.
I use the following command to retrieve the filenames:
Get-ChildItem -Path $path -Recurse -Name
This retrieves the files in the application folder with complete path like
"Adobe\Flashplayer\Flashplayer_22_x64.msi"
Or when incorrect
"Adobe\flashplayeractivex.msi"
I then use the following regex to check if they are correct or incorrect
\w*\\\w*\\[a-zA-Z]*\_[0-9a-zA-Z\.]*\_(([x][6][4])|([x][8][6])|([b|B][o][t][h]))\.(([m|M][s|S][i|I])|([e|E][x|X][e|E]))
Which I have confirmed working on Rubular.
However, I cannot get it working with powershell. I've tried the following:
if ($file -match '\w*\\\w*\\[a-zA-Z]*\_[0-9a-zA-Z\.]*\_(([x][6][4])|([x][8][6])|([b|B][o][t][h]))\.(([m|M][s|S][i|I])|([e|E][x|X][e|E]))') {......commands...}
Which does not seem to work because of the escapes (Powershell threw some errors at me). I then tried:
$pattern = [regex]::Escape('\w*\\\w*\\[a-zA-Z]*\_[0-9a-zA-Z\.]*\_(([x][6][4])|([x][8][6])|([b|B][o][t][h]))\.(([m|M][s|S][i|I])|([e|E][x|X][e|E]))')
if ($file -match $pattern) {......commands...}
Which didn't give me errors, but did not work because it didn't "match" "Apple\iTunes\iTunes_12.3_x64.exe", which does match on Rubular.
Does anyone recognize this problem or see what I do wrong?
I wouldn't try all this in a single regex. Instead I would check each policy individually:
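$path = 'C:\tmp'
$validExtensions = @('.msi', '.exe')
# Name policy: Application_version_architecture (x86, x64 or both)
$filenameRegex = '\w+_[0-9a-zA-Z\.]+_(?:x86|x64|[bB]oth)'
# -File (PowerShell 3.0+) skips directories, which have no extension to check
Get-ChildItem -Path $path -Recurse -File | ForEach-Object {
    # Extension policy; -cin is case-sensitive, so '.MSI' is reported as invalid
    if (-not ($_.Extension -cin $validExtensions))
    {
        Write-Host "$($_.FullName) has an invalid extension."
    }
    # File name policy
    if (-not ($_.BaseName -match $filenameRegex))
    {
        Write-Host "$($_.FullName) doesn't match the filename policy."
    }
    # Directory policy: exactly Developer\Application\file below $path
    if (3 -ne ($_.FullName.Split([System.IO.Path]::DirectorySeparatorChar).Length `
        - $path.Split([System.IO.Path]::DirectorySeparatorChar).Length))
    {
        Write-Host "$($_.FullName) doesn't match the directory policy."
    }
}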

egrep not behaving as expected when looking for one of three desired values

I'm writing a function that is supposed to automatically delete directories whose names meet a few prerequisites. One of these prerequisites is that directories with a datestamp of today, yesterday or the day before yesterday are not deleted even if they otherwise fulfill the conditions. To that end, I fill three variables with the datestamps of today, yesterday and the day before that and plan to use them with "egrep -v" to exclude them from my for loop that is going to delete them.
The directory I am using as a testing directory contains the following files:
FFFA72U_20160513
FFFF11F_20160404
FFFF12F
FFFF13F
FFFF17F
FFFF21F_20130230
FFFF99F_20160511
I've tried a lot of different combinations, but I can't seem to get the egrep part right. My code currently looks like this:
currentDate=`date +%Y%m%d`
yesterday=`date --date yesterday +%Y%m%d`
bYesterday=`date --date="2 days ago" +%Y%m%d`
for i in `ls ./*targetdir* | egrep -i "^[A-Z]{4}[0-9]{2}[A-Z]_[0-9]{8}$" | egrep -iv "(${currenDate}|${yesterday}|${bYesterday})$"`
do
*actions here*
done
When the above executes, I expect it to return the two files that have the 20160404 and the 20130230 datestamps, but I get no matches whatsoever.
Removing the double-quotes around the egrep string gives me an error that the ( is unexpected, so that does not help. Replacing the double-quotes with single quotes also generates no output.
When I prefix both parentheses with a / or a \, it returns all four directories with a datestamp while I expect it to exclude the ones with a May 2016 datestamp.
I've tried many more small tweaks (e.g., escaping the pipes) that I can't perfectly recall/repeat here, but it boils down to the fact that I have no clue whatsoever why it is not generating the desired output.
At this point I'm a bit flabbergasted by it all and I'd really appreciate any pointers because even after all my attempts and reading several topics on this matter I don't really see a simple way to get the script to do what I want.
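Typo: the egrep pattern uses ${currenDate} instead of ${currentDate}. The misspelled variable is unset and expands to an empty string, so the pattern becomes (|yesterday|bYesterday)$. The empty alternative matches every line, and with -v the inverse matches none.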
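The effect of the empty alternative can be reproduced in isolation. The following is a minimal sketch, assuming GNU grep (other grep implementations may treat an empty alternative differently); the fixed date values and the three sample names are chosen only to make the demonstration repeatable.

```shell
#!/usr/bin/env bash
# Sample directory names taken from the question's listing.
names='FFFA72U_20160513
FFFF11F_20160404
FFFF99F_20160511'

# Fixed dates (assumption, for a repeatable demo) instead of calling date.
currentDate=20160513
yesterday=20160512
bYesterday=20160511

# Buggy: ${currenDate} is unset, so the pattern is "(|20160512|20160511)$".
# The empty alternative matches every line, so -v leaves nothing.
buggy=$(printf '%s\n' "$names" | grep -Ev "(${currenDate}|${yesterday}|${bYesterday})$" || true)

# Fixed: the correctly spelled variable excludes only the three recent dates.
fixed=$(printf '%s\n' "$names" | grep -Ev "(${currentDate}|${yesterday}|${bYesterday})$" || true)

echo "buggy: [$buggy]"
echo "fixed: [$fixed]"
```

Running the script with set -u would have flagged the unset variable right away instead of silently expanding it to nothing.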

bulk file renaming in bash, to remove name with spaces, leaving trailing digits

Can a bash/shell expert help me with this? Each time I use Adobe PDF to split a large PDF file (say its name is X.pdf) into separate pages, where each page is one PDF file, it creates files with this pattern
"X 1.pdf"
"X 2.pdf"
"X 3.pdf" etc...
The file name "X" above is the original file name, which can be anything. It then adds one space after the name, then the page number. Page numbers always start from 1 and up to how many pages. There is no option in adobe PDF to change this.
I need to run a shell command to simply remove/strip out all the "X " part, and just leave the digits, like this
1.pdf
2.pdf
3.pdf
....
100.pdf ...etc..
Not being good in pattern matching, not sure what regular expression I need.
I know I need something like
for i in *.pdf; do mv "$i$" ........; done
And it is the ....... part I do not know how to do.
This only needs to run on Linux/Unix system.
Use sed:
for i in *.pdf; do mv -- "$i" "$(sed 's/.*[[:blank:]]//' <<< "$i")"; done
And it would be simple through rename
rename 's/.*\s//' *.pdf
You can remove everything up to (including) the last space in the variable with this:
${i##* }
That's "star space" after the double hash, meaning "anything followed by space". ${i#* } would remove up to the first space.
So run this to check:
for i in *.pdf; do echo mv -i -- "$i" "${i##* }" ; done
and remove the echo if it looks good. The -i suggested by Gordon Davisson will prompt you before overwriting, and -- signifies end of options, which prevents things from blowing up if you ever have filenames starting with -.
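To see the difference between the two expansions without touching any files, here is a small sketch; the file name "My Report 12.pdf" is a made-up example.

```shell
#!/usr/bin/env bash
# Hypothetical file name containing two spaces.
i='My Report 12.pdf'

# '##' removes the longest prefix matching "* " (up to the last space).
longest=${i##* }

# '#' removes the shortest prefix matching "* " (up to the first space).
shortest=${i#* }

echo "longest:  $longest"
echo "shortest: $shortest"
```

This prints 12.pdf for the longest match and Report 12.pdf for the shortest, which is why the double hash is needed when the original name itself contains spaces.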
If you just want to do bulk renaming of files (or directories) and don't mind using external tools, then here's mine: rnm
The command to do what you want would be:
rnm -rs '/.*\s//' *.pdf
.*\s selects the part before (and with) the last white space and replaces it with empty string.
Note:
It doesn't overwrite any existing files (throws warning if it finds an existing file with the target name).
And this operation is failsafe. You can get back the changes made by last rnm command with rnm -u.
Here's a list of documents for rnm.

Sed isn't replacing all occurrences of string in file

EDIT: I am using Cygwin. I am unsure whether this is of relevance and it was a detail I missed during writing this question.
EDIT2: Have tried replacing the "TAB" char people pointed out with the RegEx \s which covers spacing chars (spaces and tabs primarily) and this did not affect the expression at all, meaning that it is not the tabs causing the issue, especially since the expression runs once without errors anyway.
So far this script has been causing me a ton of trouble.
I DID have an issue before but I resolved that while I was writing a question here (lucky imo) but this one I've been stuck on for at least an hour now and I've tried varying solutions, none of which actually work or told me something I didn't already try.
I have a rather cool seeming FTP log fetcher script and part of this script replaces the 600MB of errors in this logfile to nothing, essentially removing them. Unfortunately this script also gets rid of parts of other errors too, so I've had to edit it. This is where I'm getting stuck.
Through base research I managed to find out that sed could do what I want, and through three hours of playing so far it does most of what I tell it to, minus one thing. One, and ONLY one, of the sed statements I have built only replaces the first instance of the string I've given it despite having the g modifier attached to the end.
I am working with a test script right now as to avoid potential permanent damage to my original FTP script, and the test script copies over an example file with a few of the errors I need replacing.
Walkthrough of the script's INTENDED behaviour before showing:
1. Sets a prefix which happens on ALL lines in the file, pretty important part of the script.
2. Copies the example file to a file named test2.log
3. Replace all instances of the UNIX newline char \n with [loll] (first thing that came to my mind)
4. Remove all instances of battle error type 1 and 2.
5. Return all [loll] strings with the UNIX \n for newlines, therefore returning the logfile to its original state minus the errors.
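The placeholder round trip in steps 3 to 5 can be sketched on a tiny input. This is an illustration only, assuming GNU sed (for \n in the replacement text); the four sample lines and the "ERR start" marker are invented and stand in for the real log and error text.

```shell
#!/usr/bin/env bash
# Invented four-line input with a two-line "error" in the middle.
input=$(printf 'keep one\nERR start\n  at frame\nkeep two\n')

# Step 3: join all lines into one, marking each newline with [loll].
joined=$(printf '%s\n' "$input" | sed ':a;N;$!ba;s/\n/[loll]/g')

# Step 4: remove the multi-line error, now matchable as a single string.
cleaned=$(printf '%s\n' "$joined" | sed 's/\[loll\]ERR start\[loll\]  at frame//g')

# Step 5: turn the remaining placeholders back into newlines.
restored=$(printf '%s\n' "$cleaned" | sed 's/\[loll\]/\n/g')

printf '%s\n' "$restored"
```

Printing $joined and $cleaned between the steps shows exactly which string each later sed expression is matched against.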
Script:
#DTP="\[([0-9]+-[0-9]+-[0-9]+-[0-9]+|latest)\.log\] \[[0-9]+:[0-9]+:[0-9]+\] \[Server thread/(INFO|WARN)\]: "
echo "${DTP}"
DTP1="\[[0-9]*:[0-9]*:[0-9]*\]\s\[Server\sThread\/\(WARN\|INFO\)\]:\s"
DTP="\[loll\]\[[0-9]*:[0-9]*:[0-9]*\]\s\[Server\sThread\/\(WARN\|INFO\)\]:\s"
echo "${DTP}"
echo "1"
cp test.log test2.log
#cat test.log >test2.log
sed -i ':a;N;$!ba;s/\n/\[loll\]/g' test2.log #| egrep -i "" >test2.log
sed -i 's/'${DTP1}'Caught error in battle. Continuing...'${DTP}'java.lang.NullPointerException'${DTP}' at com.pixelmonmod.pixelmon.battles.controller.participants.PixelmonWrapper.useAttack(PixelmonWrapper.java:173)'${DTP}' at com.pixelmonmod.pixelmon.battles.controller.participants.PixelmonWrapper.takeTurn(PixelmonWrapper.java:330)'${DTP}' at com.pixelmonmod.pixelmon.battles.controller.BattleControllerBase.takeTurn(BattleControllerBase.java:276)'${DTP}' at com.pixelmonmod.pixelmon.battles.controller.BattleControllerBase.update(BattleControllerBase.java:157)'${DTP}' at com.pixelmonmod.pixelmon.battles.BattleRegistry.updateBattles(BattleRegistry.java:63)'${DTP}' at com.pixelmonmod.pixelmon.battles.BattleTickHandler.tickStart(BattleTickHandler.java:12)'${DTP}' at cpw.mods.fml.common.eventhandler.ASMEventHandler_20_BattleTickHandler_tickStart_WorldTickEvent.invoke(.dynamic)'${DTP}' at cpw.mods.fml.common.eventhandler.ASMEventHandler.invoke(ASMEventHandler.java:51)'${DTP}' at cpw.mods.fml.common.eventhandler.EventBus.post(EventBus.java:122)'${DTP}' at cpw.mods.fml.common.FMLCommonHandler.onPostWorldTick(FMLCommonHandler.java:255)'${DTP}' at net.minecraft.server.MinecraftServer.func_71190_q(MinecraftServer.java:929)'${DTP}' at net.minecraft.server.dedicated.DedicatedServer.func_71190_q(DedicatedServer.java:429)'${DTP}' at net.minecraft.server.MinecraftServer.func_71217_p(MinecraftServer.java:776)'${DTP}' at net.minecraft.server.MinecraftServer.run(MinecraftServer.java:639)'${DTP}' at java.lang.Thread.run(Thread.java:745)//gI' test2.log
echo "2"
sed -i 's/'${DTP1}'Caught error in battle. Continuing...'${DTP}'java.lang.NullPointerException\[loll\]//gI' test2.log
echo "3"
sed -i 's/\[loll\]/\n/g' test2.log
I've set them to also run case insensitive checks on the provided strings as sometimes I write with all lower case, however for most of this I copied and pasted it directly.
Sample input:
http://pastebin.com/3KPB33X2
Outputs:
Expected:
meow
Test message
WOOF MEOWLOL
Actual: http://pastebin.com/pnvDwkxz
It's been killing my mind for a while now because I had this issue even before the other one, except I barely noticed it. I can't find any predictable behaviour in the script, and as far as I am aware it SHOULD be working perfectly fine and giving me the output I expect.
Any help would be appreciated, because as soon as I can get this bug sorted out I'll be able to enter in the rest of the script and replace this with the existing battle-error replacement script in my log-fetcher.
Knowing me it's something small and stupid but I've tried literally everything I came across, including adding the :a;N;$!ba; to the start of the bit which isn't working properly (and realising that failed horribly).
Thanks.
~BAI1
Are you looking for something like this:
sed -n ':a;s/\[.*Server thread\/\(INFO\|WARN\).*//i;/^$/!p;n;b a' battle.log