Import-Csv with regex - regex

I want to import a CSV and filter out the data with certain regex's. Example of input CSV:
John;Doe;18
Peter;Test;21
I want to filter out individual names, for example: I only want to see the persons with the lastname starting with a 'd' and those who are 20 years and above.
How can I write the filtered data into a new array? Here's the piece of code I've written so far:
$persons = import-csv -Header "surname", "lastname", "age" -delimiter ";" data.csv
$persons | ForEach-Object {
if $_.lastname -match "^d" {
#?
}
Thanks very much

Try the following:
$persons = Import-Csv -Header "surname", "lastname", "age" -Delimiter ";" data.csv | where {
$_.lastname -match '^d' -and $_.age -ge 20
}
It filters out the records with lastname starting with "d" and have an age of 20 or greater. Finally, it saves the results in an array called $persons

$persons | Where-Object {$_.LastName -like "d*" -and $_.Age -gt 20}

try this regular expression
^([^;]*);([^;]*);([^;]*)$
Tried it with perl
#!/usr/bin/perl
use strict;
while (<>) {
if ( /^([^;]*);([^;]*);([^;]*)$/ ){
print "first =". $1 . "\n";
print "last =". $2 . "\n";
print "age =". $3 . "\n";
}
}
which gives this output
first =John
last =Doe
age =18
first =Peter
last =Test
age =21
however focus on the regular expression

Related

I want to split a string from : to \n in Powershell script

I am using a config file that contains some information as shown below.
User1:xyz#gmail.com
User1_Role:Admin
NAME:sdfdsfu4343-234324-ffsdf-34324d-dsfhdjhfd943
ID:xyz#abc-demo-test-abc-mssql
Password:rewrfsdv34354*fds*vdfg435434
I want to split each value from*: to newline* in my Powershell script.
I am using -split '[: \n]' it matches perfectly until there is no '' in the value. If there is an '*' it will fetch till that. For example, for Password, it matches only rewrfsdv34354. Here is my code:
$i = 0
foreach ($keyOrValue in $Contents -split '[: *\n]') {
if ($i++ % 2 -eq 0) {
$varName = $keyOrValue
}
else {
Set-Variable $varName $keyOrValue
}
}
I need to match all the chars after : to \n. Please share your ideas.
It's probably best to perform two separate splits here, it makes things easier to work out if the code is going wrong for some reason, although the $i % 2 -eq 0 part is a neat way to pick up key/value.
I would go for this:
# Split the Contents variable by newline first
foreach ($line in $Contents -split '[\n]') {
# Now split each line by colon
$keyOrValue = $line -split ':'
# Then set the variables based on the parts of the colon-split
Set-Variable $keyOrValue[0] $keyOrValue[1]
}
You could also convert to a hashmap and go from there, e.g.:
$h = #{}
gc config.txt | % { $key, $value = $_ -split ' *: *'; $h[$key] = $value }
Or with ConvertFrom-StringData:
$h = (gc -raw dims.txt) -replace ':','=' | ConvertFrom-StringData
Now you have convenient access to keys and values, e.g.:
$h
Output:
Name Value
---- -----
Password rewrfsdv34354*fds*vdfg435434
User1 xyz#gmail.com
ID xyz#abc-demo-test-abc-mssql
NAME sdfdsfu4343-234324-ffsdf-34324d-dsfhdjhfd943
User1_Role Admin
Or only keys:
$h.keys
Output:
Password
User1
ID
NAME
User1_Role
Or only values:
$h.values
Output:
rewrfsdv34354*fds*vdfg435434
xyz#gmail.com
xyz#abc-demo-test-abc-mssql
sdfdsfu4343-234324-ffsdf-34324d-dsfhdjhfd943
Admin
Or specific values:
$h['user1'] + ", " + $h['user1_role']
Output:
xyz#gmail.com, Admin
etc.

Powershell script using RegEx to look for a pattern in one .txt and find line in a second .txt

I have a real "headsmasher" on my plate.
I have this piece of script:
$lines = Select-String -List -Path $sourceFile -Pattern $pattern -Context 20
foreach ($id in $lines) {
if (Select-String -Quiet -LiteralPath export.txt -Pattern "$($Matches[1]).+$($id.Pattern)") {
}
else {
Select-String -Path $sourceFile -Pattern $pattern -Context 20 >> $duplicateTransactionsFile
}
}
but it is not working for me as I wanted it to.
I have two .txt files: "$sourcefile = source.txt" and "export.txt"
The source.txt looks like something like this:
Some text here ***********
------------------------------------------------
F I N A L C O U N T 1 9 , 9 9
**************
** [0000123456]
ID Number:0000123456
Complete!
****************!
***********
Some other text here*******
------------------------------------------------
F I N A L C O U N T 9 , 9 9
**********
** [0000789000]
ID Number:0000789000
Complete!
******************!
************
The export.txt is like this:
0000123456 19,99
0000555555 ,89
0000666666 3,05
0000777777 31,19
0000789000 9,99
What I am trying to do is look into source.txt and search for the number that I enter (spaced out in my case)
*e.g: "9,99" but only that. As you can see, the next number in the source.txt is "19,99" and it also contains "9,99" but I do not want it to be matched.
and once I find the number, look for the next line in the source.txt that contains the text "ID Number:" then get the numbers right after the ":" Once I get those numbers after the ":", I want to now look into the export.txt and see if the numbers after the ":" are there and whether it has the "9,99" on the same line next to it but exactly "9,99" and nothing else lie "19,99", "29,99", and so on.
Then the rest is easy:
if (*true*) {
do this
}
else {
do that
}
Could you guys give me some love here and help a brother out?
I very much appreciate any help or hint you could share.
Best of wishes!
You could approach this like below:
# read the export.txt file and convert to a Hashtable for fast lookup
$export = ((Get-Content -Path 'D:\Test\export.txt').Trim() -replace '\s+', '=') -join "`r`n" | ConvertFrom-StringData
# read the source file and split into multiline data blocks
$source = ((Get-Content -Path 'D:\Test\source.txt' -Raw) -split '-{2,}').Trim() | Where-Object { $_ -match '(?sm)^\s?F I N A L C O U N T' }
# make sure the number given is spaced-out
$search = (((Read-Host "Search for Final Count number") -replace '\s' -split '') -join ' ').Trim()
Write-Host "Looking for a matching item using Final Count '$search'"
# see if we can find a data block that matches the $search
$blocks = $source | Where-Object { $_ -match "(?sm)^F I N A L C O U N T\s+$search\s?$" }
if (!$blocks) {
Write-Host "No item in source.txt could be found with Final Count '$search'" -ForegroundColor Red
}
else {
# loop over the data block(s) and pick the one that matches the search count
$blocks | ForEach-Object {
# parse out the ID
$id = $_ -replace '(?sm).*ID Number:(\d+).*', '$1'
# check if the $export Hashtable contains a key with that ID number
if ($export.Contains($id)) {
# check if that item has a value of $search without the spaces
if ($export[$id] -eq ($search -replace '\s')) {
# found it; do something
Write-Host "Found a match in the export.txt" -ForegroundColor Green
}
else {
# found ID with different FinalCount
Write-Host "An item with ID '$id' was found, but with different Final Count ($($export[$id]))" -ForegroundColor Red
}
}
else {
# ID not found
Write-Host "No item with ID '$id' could be found in the export.txt" -ForegroundColor Red
}
}
}
If as per your comment, you would like the code to loop over the Final Count numbers found in the source.txt file instead of a user typing in a number to search for, you can shorten the above code to:
# read the export.txt file and convert to a Hashtable for fast lookup
$export = ((Get-Content -Path 'D:\Test\export.txt').Trim() -replace '\s+', '=') -join "`r`n" | ConvertFrom-StringData
# read the source file and split into multiline data blocks
$blocks = ((Get-Content -Path 'D:\Test\source.txt' -Raw) -split '-{2,}').Trim() |
Where-Object { $_ -match '(?sm)^\s?F I N A L C O U N T' }
if (!$blocks) {
Write-Host "No item in source.txt could be found with Final Count '$search'" -ForegroundColor Red
}
else {
# loop over the data block(s)
$blocks | ForEach-Object {
# parse out the FINAL COUNT number to look for in the export.txt
$search = ([regex]'(?sm)^F I N A L C O U N T\s+([\d,\s]+)$').Match($_).Groups[1].Value
# remove the spaces, surrounding '0' and trailing comma (if any)
$search = ($search -replace '\s').Trim('0').TrimEnd(',')
Write-Host "Looking for a matching item using Final Count '$search'"
# parse out the ID
$id = $_ -replace '(?sm).*ID Number:(\d+).*', '$1'
# check if the $export Hashtable contains a key with that ID number
if ($export.Contains($id)) {
# check if that item has a value of $search without the spaces
if ($export[$id] -eq $search) {
# found it; do something
Write-Host "Found a match in the export.txt with ID: $($export[$id])" -ForegroundColor Green
}
else {
# found ID with different FinalCount
Write-Host "An item with ID '$id' was found, but with different Final Count ($($export[$id]))" -ForegroundColor Red
}
}
else {
# ID not found
Write-Host "No item with ID '$id' could be found in the export.txt" -ForegroundColor Red
}
}
}
There are surely multiple valid ways to accomplish this. Here is my approach:
(See comments for explanations. Let me know if you have any questions)
param (
# You can provide this when calling the script using "-Search 9,99"
# If not privided, powershell will prompt to enter the value
[Parameter(Mandatory)]
$Search,
$Source = "source.txt",
$Export = "export.txt"
)
# insert spaces
$pattern = $Search.ToCharArray() -join " "
# Search for the value in the source file.
$found = $false
switch -Regex -File $Source {
# This regex looks for something that is not a number,
# followed by only whitespace, and then your (spaced) search value.
# This makes sure "19,99" is not matched with "9,99".
# You could use a more elaborate regex here, but for your example,
# this one should work fine.
"\D\s+$pattern" {
$found = $true
}
"ID Number:(\d+)" {
# Get the ID number from the match.
$id = $Matches[1]
# If the search value was found
# (that means, this ID number is immediately followed by the search value)
# we can stop looking.
if ($found) {
break
}
}
}
# quick check if the value was actually found
if (-not $found) {
throw "Value $Search not found in $Source."
}
# Search for the id in the export file.
switch -Regex -File $Export {
"$id\s+(\S+)" {
# Get the amount value from the match
$value = $Matches[1]
# If the value matches your search...
if ($value -eq $search) {
# do this
}
else {
# otherwise do that
}
break
}
}
Note: You could additionally convert the values to decimal to account for different text representations when searching and comparing.

Extract Key Value from string

I have the following code:
$myString = "Name=Tony;Fee=10;Account=Premium"
$splitString = "$($myString)".Split(";")
$name = $splitString -match "Name=(?<content>.*)"
$acct = $splitString -match "Account=(?<content>.*)"
$name
$acct
Result:
Name=Tony
Account=Premium
How can I get it to return just the value? Example:
Tony
Premium
Thanks in advance
Combine -split, the string splitting operator, with a switch statement and its -Regex switch:
$name, $acct =
switch -Regex ("Name=Tony;Fee=10;Account=Premium" -split ';') {
'^(?:Name|Account)=(?<content>.*)' { $Matches['content'] }
}
Here's another approach using -Split and taking advantage of the text's format using ConvertFrom-StringData
$name, $acct = "Name=Tony;Fee=10;Account=Premium" -split ';' |
ConvertFrom-StringData | ForEach-Object {$_.name,$_.Account}
I would use -split here to to construct a hashtable of key value pairs:
$myString = "Name=Tony;Fee=10;Account=Premium"
# Create hashtable
$ht = #{}
# Spit string by ';' and loop over each item
foreach ($item in $myString -split ';') {
# Split item by '=' to get key value pair
$pair = $item -split '='
# Set key value pair into hashtable
$ht[$pair[0]] = $pair[1]
}
# Output values you want
$ht.Name
$ht.Account
# Tony
# Premium
Which also is beneficial if you need to lookup other values.

Reading list style text file into powershell array

I am provided a list of string blocks in a text file, and i need this to be in an array in powershell.
The list looks like this
a:1
b:2
c:3
d:
e:5
[blank line]
a:10
b:20
c:30
d:
e:50
[blank line]
...
and i want this in a powershell array to further work with it.
Im using
$output = #()
Get-Content ".\Input.txt" | ForEach-Object {
$splitline = ($_).Split(":")
if($splitline.Count -eq 2) {
if($splitline[0] -eq "a") {
#Write-Output "New Block starting"
$output += ($string)
$string = "$($splitline[1])"
} else {
$string += ",$($splitline[1])"
}
}
}
Write-Host $output -ForegroundColor Green
$output | Export-Csv ".\Output.csv" -NoTypeInformation
$output | Out-File ".\Output.txt"
But this whole thing feels quite cumbersome and the output is not a csv file, which at this point is i think because of the way i use the array. Out-File does produce a file that contains rows that are separated by commas.
Maybe someone can give me a push in the right direction.
Thx
x
One solution is to convert your data to an array of hash tables that can be read into a custom object. Then the output array object can be exported, formatted, or read as required.
$hashtables = (Get-Content Input.txt) -replace '(.*?):','$1=' | ConvertFrom-StringData
$ObjectShell = "" | Select-Object ($hashtable.keys | Select-Object -Unique)
$output = foreach ($hashtable in $hashtable) {
$obj = $ObjectShell.psobject.Copy()
foreach ($n in $hashtable.GetEnumerator()) {
$obj.($n.key) = $n.value
}
$obj
}
$output
$output | Export-Csv Output.csv -NoTypeInformation
Explanation:
The first colons (:) on each line are replaced with =. That enables ConvertFrom-StringData to create an array of hash tables with values on the LHS of the = being the keys and values on the RHS of the = being the values. If you know there is only one : on each line, you can make the -replace operation simpler.
$ObjectShell is just an object with all of the properties your data presents. You need all of your properties present for each line of data whether or not you assign values to them. Otherwise, your CSV output or table view within the console will have issues.
The first foreach iterates through the $hashtables array. Then we need to enumerate through each hash table to find the keys and values, which is performed by the second foreach loop. Each key/value pair is stored as a copy of $ObjectShell. The .psobject.Copy() method is used to prevent references to the original object. Updating data that is a reference will update the data of the original object.
$output contains the array of objects of all processed data.
Usability of output:
# Console Output
$output | format-table
a b c d e
- - - - -
1
2
3
5
10
20
30
50
# Convert to CSV
$output | ConvertTo-Csv -NoTypeInformation
"a","b","c","d","e"
"1",,,,
,"2",,,
,,"3",,
,,,"",
,,,,"5"
,,,,
"10",,,,
,"20",,,
,,"30",,
,,,"",
,,,,"50"
# Accessing Properties
$output.b
2
20
$output[0],$output[1]
a : 1
b :
c :
d :
e :
a :
b : 2
c :
d :
e :
Alternative Conversion:
$output = ((Get-Content Input.txt -raw) -split "(?m)^\r?\n") | Foreach-Object {
$data = $_ -replace "(.*?):(.*?)(\r?\n)",'"$1":"$2",$3'
$data = $data.Remove($data.LastIndexOf(','),1)
("{1}`r`n{0}`r`n{2}" -f $data,'{','}') | ConvertFrom-Json
}
$output | ConvertTo-Csv -NoType
Alternative Explanation:
Since ConvertFrom-StringData does not guarantee hash table key order, this alternative readies the file for a JSON conversion. This will maintain the property order listed in the file provided each group's order is the same. Otherwise, the property order of the first group will be respected.
All properties and their respective values are divided by the first : character on each line. The property and value are each surrounded by double quotes. Each property line is separated by a ,. Then finally the opening { and closing } are added. The resulting JSON-formatted string is converted to a custom object.
You can split by \n newline, see example:
$text = #"
a:1
b:2
c:3
d:
e:5
a:10
b:20
c:30
d:
e:50
e:50
e:50
e:50
"#
$Array = $text -split '\n' | ? {$_}
$Array.Count
15
if you want to exclude the empty lines, add ? {$_}
With your example:
$Array = (Get-Content ".\Input.txt") -split '\n' | ? {$_}

get first letters from each word in a string with powershell

Have a powersshell to return a list of folders from DFS, I want to reduce long folder names to their resultant acronym. This is what I have so far, it replaces space with underscores...
$folders = Get-DfsnFolder -Path "\\dfs\path\*"| %{$folder = $_.path.split("\"); $folder[4].replace(" ","_")}
foreach ($folder in $folders) { if ($folder.length -gt 24) {
if ($_.length -gt 20) { $_.split("?<=\_)[\_]+").substring(0,1);
<Do something here to put the letters back into $folders>
}
}
Essentially I want an acronym creator
I don't fully understand why you want the underscores, is that a separate requirement for strings shorter than 20 characters?
Anyway, for splitting a string in to words and returning only the first letter of each word, you can combine split and join like this:
("this is a test" -split " " |% { $_[0] }) -join ""
Combined with your code, you can use something like:
$folders |% {
if($_.length -gt 20) {
( $_ -split " " |% { $_[0] } ) -join ""
} else {
$_.replace(" ","_")
}
}
you can use the Rename-Item cmdlet:
Rename-Item takes a [destination path] and a [source path]
Rename-Item $_.FullName $_.Name.Replace(' ','_') -Force #this is for the space to underscore `Replace`
Rename-Item $_.FullName $_.Name.Substring(0,1) -Force #this is for the first letter `Substring`