replace not working with regex - regex

I'm trying to replace a string input by a user. I have the following input (as a firstname, lastname)...
John, Doe
I am using the following code:
$userInput = $userInput -replace '\s',''
$firstName = $userInput -replace ",*$",""
$lastName = $userInput -replace "^*,",""
Output looks like the following:
$userInput = John,Doe
$firstName = John,Doe
$lastName = JohnDoe
I need the output to look like this:
$userInput = John,Doe
$firstName = John
$lastName = Doe
What am I doing wrong?

,*$ says to find 0 or more commas at the very end of the string (not what you want).
^*, is.. well, I'm not really sure it would be considered valid regex. I guess it would mean find 0 or more "beginning of string" followed by a comma (it's a weird thing to specify).
So for first name, you would really want something like this:
$firstName = $userInput -replace ',.*$',''
So that says, find a comma followed by 0 or more of any character followed by the end of the string (then replace it with nothing).
For last name:
$lastName = $userInput -replace '^.*?,',''
And this says, find the beginning of the string, followed by 0 or more of any character (non-greedy, that's what the ? after the * means), followed by a comma, then replace it with nothing.
Aaaand as I'm writing this, @PetSerAl commented what my last solution was going to be, which is to use a split:
$firstName, $lastName = $userInput -split ',\s*'
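Putting the corrected patterns side by side, here is a minimal sketch using the sample input from the question. The `$first`/`$last` names are only illustrative, and `\s*` is added to the last-name pattern on the assumption that the whitespace has not been stripped beforehand:

```powershell
$userInput = 'John, Doe'

# Remove the first comma and everything after it
$firstName = $userInput -replace ',.*$', ''

# Remove everything up to and including the first comma, plus any spaces after it
$lastName = $userInput -replace '^.*?,\s*', ''

# Or split on the comma (and optional whitespace) and assign both parts at once
$first, $last = $userInput -split ',\s*'

"$firstName / $lastName"
"$first / $last"
```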

Related

match regex and replace bug with special characters

I've built a script to read all Active Directory Group Memberships and save them to a file.
Problem is, the Get-ADPrincipalGroupMembership cmdlet outputs all groups like this:
CN=Group_Name,OU=Example Mail,OU=Example Management, DC=domain,DC=de
So I need to do a bit of a regex and/or replacement magic here to replace the whole line with just the first string beginning from "CN=" to the first ",".
The result would be like this:
Group_Name
So, there is one AD group that's not gonna be replaced. I already got an idea why tho, but I don't know how to work around this. In our AD there is a group with a special character, something like this:
CN=AD_Group_Name+up,OU=Example Mail,OU=Example Management, DC=domain,DC=de
So, because of the little "+" sign, the whole line doesn't even get touched.
Does anyone know why this is happening?
Import-Module ActiveDirectory
# Get Username
Write-Host "Please enter the Username you want to export the AD-Groups from."
$UserName = Read-Host "Username"
# Set Working-Dir and Output-File Block:
$WorkingDir = "C:\Users\USER\Desktop"
Write-Host "Working directory is set to " + $WorkingDir
$OutputFile = $WorkingDir + "\" + $UserName + ".txt"
# Save Results to File
Get-ADPrincipalGroupMembership $UserName |
select -Property distinguishedName |
Out-File $OutputFile -Encoding UTF8
# RegEx-Block to find every AD-Group in Raw Output File and delete all
# unnecessary information:
[regex]$RegEx_mark_whole_Line = "^.*"
# The ^ matches the start of a line (in Ruby) and .* will match zero or more
# characters other than a newline
[regex]$RegEx_mark_ADGroup_Name = "(?<=CN=).*?(?=,)"
# This regex matches everything behind the first "CN=" in line and stops at
# the first "," in the line. Then it should jump to the next line.
# Replace-Block (line by line): Replace whole line with just the AD group
# name (distinguishedName) of this line.
foreach ($line in Get-Content $OutputFile) {
    if ($line -like "CN=*") {
        $separator = "CN=",","
        $option = [System.StringSplitOptions]::RemoveEmptyEntries
        $ADGroup = $line.Split($separator, $option)
        (Get-Content $OutputFile) -replace $line, $ADGroup[0] |
            Set-Content $OutputFile -Encoding UTF8
    }
}
Your group name contains a character (+) that has a special meaning in a regular expression (one or more times the preceding expression). To disable special characters escape the search string in your replace operation:
... -replace [regex]::Escape($line), $ADGroup[0]
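A quick way to see the difference, using the sample DN from the question (a sketch only; the variable name is just for illustration):

```powershell
$line = 'CN=AD_Group_Name+up,OU=Example Mail,OU=Example Management, DC=domain,DC=de'

# Used as a pattern, "e+" means "one or more e", so the literal "+" in the
# text is never matched and the line does not match itself.
$line -match $line                      # False

# Escaped, every character is taken literally.
$line -match [regex]::Escape($line)     # True

# The replacement from the question now goes through.
$line -replace [regex]::Escape($line), 'AD_Group_Name+up'
```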
However, I fail to see what you need that replacement for in the first place. Basically you're replacing a line in the output file with a substring from that line that you already extracted before. Just write that substring to the output file and you're done.
$separator = 'CN=', ','
$option = [StringSplitOptions]::RemoveEmptyEntries
(Get-Content $OutputFile) | ForEach-Object {
    $_.Split($separator, $option)[0]
} | Set-Content $OutputFile
Better yet, use the Get-ADObject cmdlet to expand the names of the group members:
Get-ADPrincipalGroupMembership $UserName |
Get-ADObject |
Select-Object -Expand Name
First off, depending on what you're doing here this might or might not be a good idea. The CN is /not/ immutable so if you're storing it somewhere as a key you're likely to run into problems down the road. The objectGUID property of the group is a good primary key, though.
As far as getting this value, I think you can simplify this a lot. The name property that the cmdlet outputs will always have your desired value:
Get-ADPrincipalGroupMembership <username> | select name
Ansgar's answer is much better in terms of using the regex, but I believe that in this case you could do a dirty workaround with the IndexOf function. In your if-statement you could do the following:
if ($line -like "CN=*") {
    $ADGroup = $line.Substring(3, $line.IndexOf(',')-3)
}
The reason this works here is that you know the output will begin with CN=YourGroupName meaning that you know that the string you want begins at the 4th character. Secondly, you know that the group name will not contain any comma, meaning that the IndexOf(',') will always find the end of that string so you don't need to worry about the nth occurrence of a string in a string.

Trim More than 20 Characters

I am working on a script that will generate AD usernames based off of a csv file. Right now I have the following line working.
Select-Object @{n='Username';e={$_.FirstName.ToLower() + $_.LastName.ToLower() -replace "[^a-zA-Z]" }}
As of right now this takes the name and combines it into an AD-friendly name. However, I need the name to be shortened to no more than 20 characters. I have tried a few different methods to shorten the username, but I haven't had any luck.
Any ideas on how I can get the username shortened?
Probably the most elegant approach is to use a positive lookbehind in your replacement:
... -replace '(?<=^.{20}).*'
This expression matches the remainder of the string only if it is preceded by 20 characters at the beginning of the string (^.{20}).
Another option would be a replacement with a capturing group on the first 20 characters:
... -replace '^(.{20}).*', '$1'
This captures the first 20 characters at the beginning of the string and replaces the whole string with just the captured group ($1). A string shorter than 20 characters doesn't match the pattern and is left unchanged.
$str[0..19] -join ''
e.g.
PS C:\> 'ab'[0..19]
ab
PS C:\> 'abcdefghijklmnopqrstuvwxyz'[0..19] -join ''
abcdefghijklmnopqrst
Which I would try in your line as:
Select-Object @{n='Username';e={(($_.FirstName + $_.LastName) -replace "[^a-z]").ToLower()[0..19] -join '' }}
([a-z] because PowerShell regex matches are case-insensitive by default, and moving .ToLower() so you only need to call it once).
And if you are using strict mode (Set-StrictMode), then why not check the length to avoid going outside the bounds of the array with the delightful:
$str[0..[math]::Min($str.Length - 1, 19)] -join ''
To truncate a string in PowerShell, you can use the .NET String::Substring method. The following line will return the first $targetLength characters of $str, or the whole string if $str is shorter than that.
if ($str.Length -gt $targetLength) { $str.Substring(0, $targetLength) } else { $str }
If you prefer a regex solution, the following works (thanks to @PetSerAl):
$str -replace "(?<=.{$targetLength}).*"
A quick measurement shows the regex method to be about 70% slower than the substring method (942ms versus 557ms on a 200,000 line logfile).
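For reference, a small sketch that runs the substring and regex approaches side by side on a long and a short string. The sample strings are made up, and clamping the length with `[math]::Min` is a common alternative to the if/else above rather than something proposed in this thread:

```powershell
$targetLength = 20
$samples = 'abcdefghijklmnopqrstuvwxyz', 'ab'

$results = foreach ($str in $samples) {
    # Substring with a clamped length: never asks for more characters than exist
    $viaSubstring = $str.Substring(0, [math]::Min($str.Length, $targetLength))

    # Anchored lookbehind: drops whatever follows the first $targetLength characters
    $viaRegex = $str -replace "(?<=^.{$targetLength}).*"

    "{0} | {1}" -f $viaSubstring, $viaRegex
}
$results
```

Both columns should agree for every input; short strings pass through unchanged in either variant.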

Powershell Find String Between Characters and Replace

In a PowerShell script, I have a hashtable that contains personal information. The hashtable looks like
{first = "James", last = "Brown", phone = "12345"...}
Using this hashtable, I would like to replace strings in a template text file. For each string that matches the #key# format, I want to replace it with the value that corresponds to that key in the hashtable. Here is a sample input and output:
input.txt
My first name is #first# and last name is #last#.
Call me at #phone#
output.txt
My first name is James and last name is Brown.
Call me at 12345
Could you advise me how to return the "key" string between the "#"s so I can find its value for the string replacement function? Any other ideas for this problem are welcome.
You could do this with pure regex, but for the sake of readability, I like doing this as more code than regex:
$tmpl = 'My first name is #first# and last name is #last#.
Call me at #phone#'
$h = @{
    first = "James"
    last = "Brown"
    phone = "12345"
}
$new = $tmpl
foreach ($key in $h.Keys) {
    $escKey = [Regex]::Escape($key)
    $new = $new -replace "#$escKey#", $h[$key]
}
$new
Explanation
$tmpl contains the template string.
$h is the hashtable.
$new will contain the replaced string.
We enumerate through each of the keys in the hash.
We store a regex escaped version of the key in $escKey.
We replace $escKey surrounded by # characters with the hashtable lookup for the particular key.
One of the nice things about doing this is that you can change your hashtable and your template, and never have to update the regex. It will also gracefully handle the cases where a key has no corresponding replaceable section in the template (and vice-versa).
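For comparison, the "pure regex" route mentioned at the top could look roughly like the sketch below. It assumes keys consist only of word characters (`\w+`) and uses the `[regex]::Replace` overload that takes a match evaluator, passed here as a script block. The `#fax#` placeholder is added only to show a key that is missing from the hashtable; leaving such placeholders untouched is a choice of this sketch, not something the question asked for.

```powershell
$tmpl = "My first name is #first# and last name is #last#.`nCall me at #phone# or #fax#"
$h = @{ first = 'James'; last = 'Brown'; phone = '12345' }

# One pass over the template: each #key# is looked up in the hashtable.
$new = [regex]::Replace($tmpl, '#(\w+)#', {
    param($match)
    $key = $match.Groups[1].Value
    # Known key: substitute its value. Unknown key: keep the placeholder as is.
    if ($h.ContainsKey($key)) { [string]$h[$key] } else { $match.Value }
})
$new
```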
You can create a template using an expandable (double-quoted) here-string. The string is expanded at the point of assignment, so define the hashtable first:
$hash = @{first = "James"; last = "Brown"; phone = "12345"}
$Template = @"
My first name is $($hash.first) and last name is $($hash.last).
Call me at $($hash.phone)
"@
$Template
My first name is James and last name is Brown.
Call me at 12345

Parsing custom arguments in Powershell "-" delimited

I have a string
-car:"Nissan" -Model:"Dina" -Color:"Light-blue" -wheels:"4"
How can I extract the arguments? My initial thought was to use the '-' as the delimiter, however that's not going to work.
Use of a regular expression is probably the easiest solution of the task. This can be done in PowerShell:
$text = @'
-car:"Nissan" -Model:"Dina" -Color:"Light-blue" -wheels:"4" -windowSize.Front:"24"
'@
# assume parameter values do not contain ", otherwise this pattern should be changed
$pattern = '-([\.\w]+):"([^"]+)"'
foreach($match in [System.Text.RegularExpressions.Regex]::Matches($text, $pattern)) {
    $param = $match.Groups[1].Value
    $value = $match.Groups[2].Value
    "$param is $value"
}
Output:
car is Nissan
Model is Dina
Color is Light-blue
wheels is 4
windowSize.Front is 24
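If the values are needed later rather than just printed, the same matches can be collected into a hashtable. A minimal sketch, reusing the pattern above; `$params` is a name chosen for this example:

```powershell
$text = '-car:"Nissan" -Model:"Dina" -Color:"Light-blue" -wheels:"4"'
$pattern = '-([\.\w]+):"([^"]+)"'

# Build a name -> value lookup from every match in the string
$params = @{}
foreach ($match in [regex]::Matches($text, $pattern)) {
    $params[$match.Groups[1].Value] = $match.Groups[2].Value
}

# The hyphen inside "Light-blue" is not mistaken for a new argument,
# because the pattern requires a name followed by :" after the dash.
$params['Color']
$params['wheels']
```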

Perl RegEx to find the portion of the email address before the @

I have the issue below in Perl. I have a file from which I get a list of emails as input.
I would like to parse the string before '@' of all email addresses. (Later I will store all the strings before @ in an array.)
For example, in abcdefgh@gmail.com, I would like to parse the email address and extract abcdefgh.
My intention is to get only the string before '@'. Now the question is how to check it using a regular expression. Or is there any other method using substr?
When I use the regular expression $mail =~ "\@" in Perl, it's not giving me the result.
Also, how do I find at which index of the string $mail the character '@' is?
I'd appreciate it if anyone can help me out.
#!/usr/bin/perl
$mail = "abcdefgh@gmail.com";
if ($mail =~ "\@" ) {
    print("my name = You got it!");
}
else
{
    print("my name = Try again!");
}
In the above code $mail =~ "\@" is not giving me the desired output, but ($mail =~ "abc" ) does.
$mail =~ "@" will work only if the given string is $mail = "abcdefgh\@gmail.com";
But in my case, I will be getting the input with the email address as is,
not with an escape character.
Thanks,
Tom
Enabling warnings would have pointed out your problem:
#!/usr/bin/perl
use warnings;
$mail = "abcdefgh@gmail.com";
__END__
Possible unintended interpolation of @gmail in string at - line 3.
Name "main::gmail" used only once: possible typo at - line 3.
and enabling strict would have prevented it from even compiling:
#!/usr/bin/perl
use strict;
use warnings;
my $mail = "abcdefgh@gmail.com";
__END__
Possible unintended interpolation of @gmail in string at - line 4.
Global symbol "@gmail" requires explicit package name at - line 4.
Execution of - aborted due to compilation errors.
In other words, your problem wasn't the regex working or not working, it was that the string you were matching against contained "abcdefgh.com", not what you expected.
The @ sign is a metacharacter in double-quoted strings (it introduces array interpolation). If you put your email address in single quotes, you won't get that problem.
Also, I should add the obligatory comment that this is fine if you're just experimenting, but in production code you should not parse email addresses using regular expressions, but instead use a module such as Mail::Address.
What if you tried this:
my $email = 'user@email.com';
$email =~ /^(.+?)@/;
print $1
$1 will be everything before the @.
If you want the index of a string, you can use the index() function. ie.
my $email = 'foo@bar';
my $index = index($email, '@');
If you want to return the former half of the email, I'd use split() over regular expressions.
my $email = 'foo@bar';
my @result = split '@', $email;
my $username = $result[0];
Or even better with substr
my $username = substr($email, 0, index($email, '@'))
$mail = 'abcdefgh@gmail.com';
$mail =~ /^([^@]*)@/;
print "$1\n"