PowerShell - How to replace a number with a variable in a string?
I am trying to replace a number (20) with the value of a variable ($cntr=120) in a string using the -replace operator, but the literal text $cntr ends up in the output. What am I doing wrong? Is there a better solution?
Input string
myurl.com/search?project=ABC&startAt=**20**&maxResults=100&expand=log
Desired Output string
myurl.com/search?project=ABC&startAt=**120**&maxResults=100&expand=log
Actual Output string
myurl.com/search?project=ABC&startAt=**$cntr**&maxResults=100&expand=log
Code:
$str='myurl.com/search?project=ABC&startAt=20&maxResults=100&expand=log'
$cntr=120
$str = $str -replace '^(.+&startAt=)(\d+)(&.+)$', '$1$cntr$3'
$str
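You need to:
Use double quotes so that PowerShell interpolates $cntr into the replacement string.
Use the unambiguous backreference syntax, ${n}, where n is the group ID, and escape the $ of each backreference with a backtick so that PowerShell leaves it for the regex engine.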
In this case, you can use
PS C:\Users\admin> $str='myurl.com/search?project=ABC&startAt=20&maxResults=100&expand=log'
PS C:\Users\admin> $cntr=120
PS C:\Users\admin> $str = $str -replace '^(.+&startAt=)(\d+)(&.+)$', "`${1}$cntr`$3"
PS C:\Users\admin> $str
myurl.com/search?project=ABC&startAt=120&maxResults=100&expand=log
See the .NET regex "Substituting a Numbered Group" documentation:
All digits that follow $ are interpreted as belonging to the number group. If this is not your intent, you can substitute a named group instead. For example, you can use the replacement string ${1}1 instead of $11 to define the replacement string as the value of the first captured group along with the number "1".
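For comparison, the same idea can be sketched in Python, where the unambiguous group reference is written \g<1>. This is only an analogue of the PowerShell answer, not PowerShell code; the names url and cntr mirror the question, and the lookbehind variant is an alternative (not from the answer above) that avoids group references altogether.

```python
import re

url = "myurl.com/search?project=ABC&startAt=20&maxResults=100&expand=log"
cntr = 120

# \g<1> is Python's unambiguous group reference, so the digits of cntr
# cannot be mistaken for part of the group number.
with_groups = re.sub(r"^(.+&startAt=)(\d+)(&.+)$", rf"\g<1>{cntr}\g<3>", url)

# Alternative: match only the digits that follow "&startAt=",
# so no group references are needed in the replacement at all.
with_lookbehind = re.sub(r"(?<=&startAt=)\d+", str(cntr), url)

print(with_groups)
print(with_lookbehind)
```

Both calls produce the desired URL with startAt=120.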
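A couple of things here:
If you just concatenate the "12", you end up with $112$3, which is read as a reference to group 112 rather than group 1 followed by the digits 12. What I did was insert a backslash in front of the number and then remove it afterwards, so the replacement becomes $1\12$3. Note that the final .Replace("\", "") strips every backslash from the string, so this only works when the input contains none.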
$str='myurl.com/search?project=ABC&startAt=20&maxResults=100&expand=log'
$cntr=12
$str = ($str -replace '^(.+&startAt=)(\d+)(&.+)$', ('$1\' + $cntr.ToString() +'$3')).Replace("\", "")
$str
I'm still looking for a way to insert the literal "12" in the replacement without the extra character, but this does work.
Here's another way to do it where you have a literal string between the $1 and $3 and then replace that at the end.
$str='myurl.com/search?project=ABC&startAt=20&maxResults=100&expand=log'
$cntr=12
$str = ($str -replace '^(.+&startAt=)(\d+)(&.+)$', ('$1REPLACECOUNTER$3')).Replace("REPLACECOUNTER", "$cntr")
$str
Related
Trim More than 20 Characters
I am working on a script that generates AD usernames from a CSV file. Right now I have the following line working:
Select-Object @{n='Username';e={$_.FirstName.ToLower() + $_.LastName.ToLower() -replace "[^a-zA-Z]" }}
This takes the name and combines it into an AD-friendly name. However, I need the name to be shortened to no more than 20 characters. I have tried a few different methods to shorten the username, but I haven't had any luck. Any ideas on how I can shorten it?
Probably the most elegant approach is to use a positive lookbehind in your replacement:
... -replace '(?<=^.{20}).*'
This expression matches the remainder of the string only if it is preceded by 20 characters at the beginning of the string (^.{20}).
Another option would be a replacement with a capturing group on the first 20 characters:
... -replace '^(.{20}).*', '$1'
This captures the first 20 characters of the string and replaces the whole string with just the captured group ($1). Strings shorter than 20 characters do not match and are left unchanged.
$str[0..19] -join ''
e.g.
PS C:\> 'ab'[0..19] -join ''
ab
PS C:\> 'abcdefghijklmnopqrstuvwxyz'[0..19] -join ''
abcdefghijklmnopqrst
Which I would try in your line as:
Select-Object @{n='Username';e={(($_.FirstName + $_.LastName) -replace "[^a-z]").ToLower()[0..19] -join '' }}
([a-z] because PowerShell regex matches are case-insensitive by default, and .ToLower() is moved so you only need to call it once.)
And if you are using strict mode (Set-StrictMode), you can check the length to avoid indexing outside the bounds of the string with the delightful (assuming $str is not empty):
$str[0..[math]::Min($str.Length - 1, 19)] -join ''
To truncate a string in PowerShell, you can use the .NET String.Substring method. The following line returns the first $targetLength characters of $str, or the whole string if $str is shorter than that:
if ($str.Length -gt $targetLength) { $str.Substring(0, $targetLength) } else { $str }
If you prefer a regex solution, the following works (thanks to @PetSerAl):
$str -replace "(?<=.{$targetLength}).*"
A quick measurement shows the regex method to be about 70% slower than the substring method (942ms versus 557ms on a 200,000-line logfile).
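As a rough cross-check of the two truncation strategies discussed in this thread, here is a minimal Python sketch. The function names truncate_slice and truncate_regex and the default limit of 20 are made up for illustration; the regex variant mirrors the capturing-group replacement, and the slice plays the role of Substring.

```python
import re

def truncate_slice(text: str, limit: int = 20) -> str:
    # Slicing never raises for short strings; it simply returns them whole.
    return text[:limit]

def truncate_regex(text: str, limit: int = 20) -> str:
    # Capture exactly `limit` leading characters and drop the rest.
    # A shorter string does not match and is returned unchanged.
    return re.sub(rf"^(.{{{limit}}}).*", r"\1", text)

print(truncate_slice("abcdefghijklmnopqrstuvwxyz"))
print(truncate_regex("abcdefghijklmnopqrstuvwxyz"))
print(truncate_regex("ab"))
```

The regex version assumes single-line input, since . does not match a newline by default.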
regular expression that matches any word that starts with pre and ends in al
The following regular expression gives me correct results when tried in the Notepad++ editor, but the Perl program below gives wrong results. Please provide the right answer with an explanation. The file I used for testing my pattern is here: (http://sainikhil.me/stackoverflow/dictionaryWords.txt)
Regular expression: ^Pre(.*)al(\s*)$
Perl program:
use strict;
use warnings;
sub print_matches {
    my $pattern = "^Pre(.*)al(\s*)\$";
    my $file = shift;
    open my $fp, $file;
    while(my $line = <$fp>) {
        if($line =~ m/$pattern/) {
            print $line;
        }
    }
}
print_matches @ARGV;
A few thoughts:
You should not escape the dollar sign.
The capturing group around the whitespace is useless.
Same for the capturing group around .*
which leads to:
^Pre.*al\s*$
If you don't want lines like precious final to match (because of the whitespace in the middle), change the regex to:
^Pre\S*al\s*$
Included in your code:
while(my $line = <$fp>) { if($line =~ /^Pre\S*al\s*$/m) { print $line; } }
You're getting messed up by assigning the pattern to a variable before using it as a regex and putting it in a double-quoted string when you do so. This is why you need to escape the $, because, in a double-quoted string, a bare $ indicates that you want to interpolate the value of a variable. (e.g., my $str = "foo$bar";) The reason this is causing you a problem is because the backslash in \s is treated as escaping the s - which gives you just plain s: $ perl -E 'say "^Pre(.*)al(\s*)\$";' ^Pre(.*)al(s*)$ As a result, when you go to execute the regex, it's looking for zero or more ses rather than zero or more whitespace characters. The most direct fix for this would be to escape the backslash: $ perl -E 'say "^Pre(.*)al(\\s*)\$";' ^Pre(.*)al(\s*)$ A better fix would be to use single quotes instead of double quotes and don't escape the $: $ perl -E "say '^Pre(.*)al(\s*)$';" ^Pre(.*)al(\s*)$ The best fix would be to use the qr (quote regex) operator instead of single or double quotes, although that makes it a little less human-readable if you print it out later to verify the content of the regex (which I assume to be why you're putting it into a variable in the first place): $ perl -E "say qr/^Pre(.*)al(\s*)$/;" (?^u:^Pre(.*)al(\s*)$) Or, of course, just don't put it into a variable at all and do your matching with if($line =~ m/^Pre(.*)al(\s*)$/) ...
Try removing the trailing newline character(s):
while(my $line = <$fp>) { $line =~ s/[\r\n]+$//s;
And, to match only words that begin with Pre and end with al, try this regular expression:
/^Pre\w*al$/
(\w matches any word character, i.e. a letter, digit, or underscore, rather than any character at all.)
And, if you want to match both Pre and pre, do a case-insensitive match:
/^Pre\w*al$/i
Powershell variable in replacement string with named groups
The following PowerShell replace operation with named groups s1 and s2 in the regex (just for illustration, not correct syntax) works fine:
$s -Replace "(?<s1>....)(?<s2>...)", '${s2}xxx${s1}'
My question is: how do I replace with a variable $x instead of the literal xxx, that is, something like:
$s -Replace "(?<s1>....)(?<s2>...)", '${s2}$x${s1}'
That doesn't work because PowerShell doesn't expand variables in a single-quoted string, but the named group resolution stops working if the replacement string is put in double quotes like this: "${s2}$x${s1}".
@PetSerAl's comment is correct; here is code to test it:
$sep = ","
"AAA BBB" -Replace '(?<A>\w+)\s+(?<B>\w+)',"`${A}$sep`${B}"
Output:
AAA,BBB
Explanation: PowerShell evaluates the double-quoted string first. Escaping the $ sign of each group reference with a backtick ensures it is not expanded as a variable, so a valid replacement string reaches the -Replace operator.
See the MSDN documentation on the replace operator and on escape characters, or use Get-Help about_escape and Get-Help about_comparison_operators.
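The same separation between "variable expanded by the language" and "group reference expanded by the regex engine" can be sketched in Python as an analogue. Here the named groups use Python's (?P<name>...) syntax and \g<name> references; the callable variant is an extra option (not from the answer above) that sidesteps escaping concerns if the inserted value could contain backslashes.

```python
import re

sep = ","
pattern = r"(?P<A>\w+)\s+(?P<B>\w+)"

# The f-string expands {sep}; \g<A> and \g<B> are left for the regex engine.
templated = re.sub(pattern, rf"\g<A>{sep}\g<B>", "AAA BBB")

# A callable replacement builds the result directly from the match object,
# so the value of sep is never parsed as a replacement template.
with_callable = re.sub(pattern, lambda m: m["A"] + sep + m["B"], "AAA BBB")

print(templated)
print(with_callable)
```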
Extract GUID from line via regular expression in perl
I have problems testing for and when successful extracting a GUID from a line of a textfile. Given a guid 12345678-1234-1234-1234-123456789abc, where all hexadecimal characters are allowed (I use digits and abc to display the number of characters each substring contains), I tried it like this my $myline = " set whatevervariable = \'aaaaaaaa-bbbb-cccc-1234-aaaaaaaabbbb\'"; my $guid =($myline =~ m/[A-Fa-f0-9]{8}[\-][A-Fa-f0-9]{4}[\-][A-Fa-f0-9]{4}[\-][A-Fa-f0-9]{4}[\-]([A-Fa-f0-9]){12}/gi) The test works fine, but how can I extract the GUID afterwards and use it as a string in my perl script? The [] operator does not work... Thanks for help, G.
Use a capturing group around the whole pattern, and assign in list context so that $guid receives the captured text rather than the success flag of the match:
my ($guid) = ($myline =~ m/([A-Fa-f0-9]{8}[\-][A-Fa-f0-9]{4}[\-][A-Fa-f0-9]{4}[\-][A-Fa-f0-9]{4}[\-]([A-Fa-f0-9]){12})/gi);
Note the added parentheses around the entire pattern.
You could also simplify the regex:
my ($guid) = ($myline =~ m/([a-f\d]{8}-[a-f\d]{4}-[a-f\d]{4}-[a-f\d]{4}-([a-f\d]){12})/gi);
According to daxim's comment, you'd add the /a modifier so that \d matches only ASCII digits:
my ($guid) = ($myline =~ m/([a-f\d]{8}-[a-f\d]{4}-[a-f\d]{4}-[a-f\d]{4}-([a-f\d]){12})/gia);
or use
my ($guid) = ($myline =~ m/([[:xdigit:]]{8}-[[:xdigit:]]{4}-[[:xdigit:]]{4}-[[:xdigit:]]{4}-([[:xdigit:]]){12})/gia);
my $myline = " set whatevervariable = \'aaaaaaaa-bbbb-cccc-1234-aaaaaaaabbbb\'";
my ($guid) =($myline =~ m/([A-Fa-f0-9]{8}[\-][A-Fa-f0-9]{4}[\-][A-Fa-f0-9]{4}[\-][A-Fa-f0-9]{4}[\-][A-Fa-f0-9]{12})/gi);
There are two changes:
Your parentheses were in the wrong place.
When you do $variable = $some_other =~ m/.../g; the match runs in scalar context, so $variable only tells you whether the match succeeded. If you want the matched text, you should do: my ( $variable ) = ...
If there is a chance of actually having more than one GUID in the file, use:
my @guids = $myline =~ m/..../g;
Additionally, since you are using the /i flag, there is no need to use [A-Fa-f]; a simple [a-f] is good enough. What's more, you don't have to do the [-] thing: the - character is not special in a regex outside a character class. This all sums up to:
my @guids = $myline =~ m/([a-f0-9]{8}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{12})/gi;
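To make the "first match versus all matches" distinction concrete, here is a small Python sketch of the same extraction. It is an analogue rather than Perl; the names GUID_RE, line, guid and guids are chosen for illustration, and the sample line is the one from the question.

```python
import re

# Case-insensitive, so [a-f] also covers A-F.
GUID_RE = re.compile(
    r"[a-f0-9]{8}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{12}",
    re.IGNORECASE,
)

line = " set whatevervariable = 'aaaaaaaa-bbbb-cccc-1234-aaaaaaaabbbb'"

# First match only (roughly the role of my ($guid) = ... in Perl).
first = GUID_RE.search(line)
guid = first.group(0) if first else None

# Every match on the line (roughly the role of my @guids = ... in Perl).
guids = GUID_RE.findall(line)

print(guid)
print(guids)
```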
How do I use Perl to intersperse characters between consecutive matches with a regex substitution?
The following lines of comma-separated values contain several consecutive empty fields:
$rawData = "2008-02-06,8:00 AM,14.0,6.0,59,1027,-9999.0,West,6.9,-,N/A,,Clear\n 2008-02-06,9:00 AM,16,6,40,1028,12,WNW,10.4,,,,\n"
I want to replace these empty fields with 'N/A' values, which is why I decided to do it via a regex substitution. I tried this first of all:
$rawData =~ s/,([,\n])/,N\/A/g; # RELABEL UNAVAILABLE DATA AS 'N/A'
which returned
2008-02-06,8:00 AM,14.0,6.0,59,1027,-9999.0,West,6.9,-,N/A,N/A,Clear\n 2008-02-06,9:00 AM,16,6,40,1028,12,WNW,10.4,N/A,,N/A,\n
Not what I wanted. The problem occurs when more than two consecutive commas occur. The regex gobbles up two commas at a time, so it starts at the third comma rather than the second when it rescans the string.
I thought this could be something to do with lookahead vs. lookbehind assertions, so I tried the following regex out:
$rawData =~ s/(?<=,)([,\n])|,([,\n])$/,N\/A$1/g; # RELABEL UNAVAILABLE DATA AS 'N/A'
which resulted in:
2008-02-06,8:00 AM,14.0,6.0,59,1027,-9999.0,West,6.9,-,N/A,,N/A,Clear\n 2008-02-06,9:00 AM,16,6,40,1028,12,WNW,10.4,,N/A,,N/A,,N/A,,N/A\n
That didn't work either. It just shifted the comma-pairings by one.
I know that washing this string through the same regex twice will do it, but that seems crude. Surely, there must be a way to get a single regex substitution to do the job. Any suggestions?
The final string should look like this:
2008-02-06,8:00 AM,14.0,6.0,59,1027,-9999.0,West,6.9,-,N/A,N/A,N/A,Clear\n 2008-02-06,9:00 AM,16,6,40,1028,12,WNW,10.4,,N/A,,N/A,N/A,N/A,N/A,N/A\n
EDIT: Note that you could open a filehandle to the data string and let readline deal with line endings:
#!/usr/bin/perl
use strict; use warnings; use autodie;
my $str = <<EO_DATA;
2008-02-06,8:00 AM,14.0,6.0,59,1027,-9999.0,West,6.9,-,N/A,,Clear
2008-02-06,9:00 AM,16,6,40,1028,12,WNW,10.4,,,,
EO_DATA
open my $str_h, '<', \$str;
while(my $row = <$str_h>) {
    chomp $row;
    print join(',', map { length $_ ? $_ : 'N/A'} split /,/, $row, -1 ), "\n";
}
Output:
E:\Home> t.pl
2008-02-06,8:00 AM,14.0,6.0,59,1027,-9999.0,West,6.9,-,N/A,N/A,Clear
2008-02-06,9:00 AM,16,6,40,1028,12,WNW,10.4,N/A,N/A,N/A,N/A
You can also use:
pos $str -= 1 while $str =~ s{,(,|\n)}{,N/A$1}g;
Explanation: When s/// finds a ,, and replaces it with ,N/A, it has already moved to the character after the last comma. So, it will miss some consecutive commas if you only use
$str =~ s{,(,|\n)}{,N/A$1}g;
Therefore, I used a loop to move pos $str back by a character after each successful substitution.
Now, as @ysth shows:
$str =~ s!,(?=[,\n])!,N/A!g;
would make the while unnecessary.
I couldn't quite make out what you were trying to do in your lookbehind example, but I suspect you are suffering from a precedence error there, and that everything after the lookbehind should be enclosed in a (?: ... ) so the | doesn't avoid doing the lookbehind.
Starting from scratch, what you are trying to do sounds pretty simple: place N/A after a comma if it is followed by another comma or a newline:
s!,(?=[,\n])!,N/A!g;
Example:
my $rawData = "2008-02-06,8:00 AM,14.0,6.0,59,1027,-9999.0,West,6.9,-,N/A,,Clear\n2008-02-06,9:00 AM,16,6,40,1028,12,WNW,10.4,,,,\n";
use Data::Dumper;
$Data::Dumper::Useqq = $Data::Dumper::Terse = 1;
print Dumper($rawData);
$rawData =~ s!,(?=[,\n])!,N/A!g;
print Dumper($rawData);
Output:
"2008-02-06,8:00 AM,14.0,6.0,59,1027,-9999.0,West,6.9,-,N/A,,Clear\n2008-02-06,9:00 AM,16,6,40,1028,12,WNW,10.4,,,,\n"
"2008-02-06,8:00 AM,14.0,6.0,59,1027,-9999.0,West,6.9,-,N/A,N/A,Clear\n2008-02-06,9:00 AM,16,6,40,1028,12,WNW,10.4,N/A,N/A,N/A,N/A\n"
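You could search for (?<=,)(?=,|$) and replace that with N/A. This regex matches the (empty) space between two commas or between a comma and the end of a line. If the string contains several lines, add the /m modifier so that $ also matches before each embedded newline.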
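The quick and dirty hack version:
my $rawData = "2008-02-06,8:00 AM,14.0,6.0,59,1027,-9999.0,West,6.9,-,N/A,,Clear 2008-02-06,9:00 AM,16,6,40,1028,12,WNW,10.4,,,,\n";
while ($rawData =~ s/,,/,N\/A,/g) {};
print $rawData;
Not the fastest code, but the shortest. The substitution should succeed at most twice before the loop ends. Note that it only fills fields that sit between two commas, so an empty field at the very end of a line (a comma followed directly by the newline) is left as is.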
Not a regex, but not too complicated either: $string = join ",", map{$_ eq "" ? "N/A" : $_} split (/,/, $string,-1); The ,-1 is needed at the end to force split to include any empty fields at the end of the string.
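The two main strategies from this thread, the lookahead substitution and the split/join approach, can be compared side by side in a short Python sketch. This is an analogue of the Perl answers, not a port of any one of them; the names raw, by_lookahead, fill_row and by_split are made up for illustration, and the data is the sample from the question without the stray space after the first newline.

```python
import re

raw = (
    "2008-02-06,8:00 AM,14.0,6.0,59,1027,-9999.0,West,6.9,-,N/A,,Clear\n"
    "2008-02-06,9:00 AM,16,6,40,1028,12,WNW,10.4,,,,\n"
)

# Lookahead: the following comma or newline is inspected but not consumed,
# so runs of consecutive commas are all handled in a single pass.
by_lookahead = re.sub(r",(?=[,\n])", ",N/A", raw)

def fill_row(row: str) -> str:
    # str.split keeps trailing empty fields, so no extra limit argument is needed.
    return ",".join(field or "N/A" for field in row.split(","))

# Split/join: process each line separately and restore the newline afterwards.
by_split = "".join(fill_row(row) + "\n" for row in raw.splitlines())

print(by_lookahead, end="")
```

Both variants give the same result on this sample; the split version is arguably easier to extend if fields later need per-column handling.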