Powershell - Replacing a string with a variable ending with a dollar sign - regex

I'm a bit lost with this one. For whatever reason the replace function in powershell doesn't play well with variables ending with a $ sign.
Command:
$var='A#$A#$'
$line=('$var='+"'"+"'")
$line -replace '^.+$',('$line='+"'"+$var+"'")
Expected output:
$line='A#$A#$'
Actual output:
$line='A#$A#

It looks like you're getting hit with a regex substitution that you don't want. The regex special variable $' represents everything after your match. Since your regex matches the entire string, $' is effectively empty. During the replace operation, the .Net regex engine sees $' in your expected output and substitutes in that empty string.
One way to avoid this is to replace all instances of $ in your $var string with $$:
$line -replace '^.+$',('$line='+"'"+($var.Replace('$','$$'))+"'")
You can see more information about regex substitution in .Net here:
Substitutions

I was able to find a band-aid of sorts by replacing $ with a special character and then reverting it back after the change. Preferably you would choose a character that doesn't have a key on your keyboard. For me I chose "¤".
$var='A#$A#$'
$var=$var -replace '\$','¤'
$line=("`$var=''")
$line -replace '^.+$',("`$line='$var'") -replace '¤','$'

I don't really understand the purpose of your posted lines, it seems to me that it would just make more sense to do $line='$line='''+$var+"'", BUT if you insist on your way, just do two replace calls, like this:
$line -replace '^.+$',('$line=''LOL''') -replace 'LOL',$var

Related

Remove certain characters from a regex group

I have a string that looks like this (key":["value","value","value"])
"emailDomains":["google.co.uk","google.com","google.com","google.com","google.co.uk"]
and I use the following regex to select from the string. (the regex is setup in a way where it wont select a string that looks like this "key":[{"key":"value","key":"value"}] )
(?<=:\[").*?(?="])
Resulting Selection:
google.co.uk","google.com","google.com","google.com","google.co.uk
I want to remove the " in that select string, and i was wondering if there was an easy way to do this using the replace command. Desired result...
"emailDomains":["google.co.uk, google.com, google.com, google.com, google.co.uk"]
How do I solve this problem?
If your string indeed has the form "key":["v1", "v2", ... "vN"], you can split off the part that needs to be changed, replace "," by a space in it, and re-assemble:
my #parts = split / (\["\s* | \s*\"]) /x, $string; #"
$parts[2] =~ s/",\s*"/ /g;
my $processed = join '', #parts;
The regex pattern for the separator in split is captured since in that case the separators are also in the returned list, what is helpful here for putting the string back together. Then, we need to change the third element of the array.
In this approach, we have to change a specific element in the array so if your format varies, even a little, this may not (or still may) be suitable.
This should of course be processed as JSON, using a module. If the format isn't sure, as indicated in a comment, it would be best to try to ensure that you have JSON. Picking bits and pieces like above (or below) is a road to madness once requirements slowly start evolving.
The same approach can be used in a regex, and this may in fact have an advantage to be able to scoop up and ignore everything preceding the : (with split that part may end up with multiple elements if the format isn't exactly as shown, what then affects everything)
$string =~ s{ :\["\s*\K (.*?) ( "\] ) }{
my $e = $2;
my $n = $1 =~ s/",\s*"/ /gr;
$n.$e
}ex;
Here /e modifier makes it so that the replacement side is evaluated as code, where we do the same as with the split above. Notes on regex
Have to save away $2 first, since it gets reset in the next regex
The /r modifier†, which doesn't change its target but rather returns the changed string, is what allows us to use substitution operator on the read-only $1
If nothing gets captured for $2, and perhaps for $1, that means that there was no match and the outcome is simply that $string doesn't change, quietly. So if this substitution should always work then you may want to add handling of such unexpected data
Don't need a $n above, but can return ($1 =~ s/",\s*"/ /gr) . $e
Or, using lookarounds as attempted
$string =~ s{ (?<=:\[") (.+?) (?="\]) }{ $1 =~ s/",\s*"/ /gr }egx;
what does reduce the amount of code, but may be trickier to work with later.
While this is a direct answer to the question I think it's least maintainable.
†  This useful modifier, for "non-destructive substitution," appeared in v5.14. In earlier Perl versions we would copy the string and run regex on that, with an idiom
(my $n = $1) =~ s/",\s*"/ /g;
In the lookarounds-example we then need a little more
$string =~ s{...}{ (my $n = $1) =~ s/",\s*"/ /g; $n }gr
since s/ operator returns the number of substitutions made while we need $n to be returned from that whole piece of code in {} (the replacement side), to be used as the replacement.
You can use this \G based regex to start the match with :[" and further captures the values appropriately and replaces matched text so that only comma is retained and doublequotes are removed.
(:\[")|(?!^)\G([^"]+)"(,)"
Regex Demo
Your text is almost proper JSON, so it's really easy to go the final inch and make it so, and then process that:
#!/usr/bin/perl
use warnings;
use strict;
use feature qw/say postderef/;
no warnings qw/experimental::postderef/;
use JSON::XS; # Install through your OS package manager or a CPAN client
my $str = q/"emailDomains":["google.co.uk","google.com","google.com","google.com","google.co.uk"]/;
my $json = JSON::XS->new();
my $obj = $json->decode("{$str}");
my $fixed = $json->ascii->encode({emailDomains =>
join(', ', $obj->{'emailDomains'}->#*)});
$fixed =~ s/^\{|\}$//g;
say $fixed;
Try Regex: " *, *"
Replace with: ,
Demo

How to replace consecutive and identical characters in Perl?

I have a string like
XXXXYYYYZZZYYZZZYYYY which needs to be converted to
XXXXAAAYZZZAYZZZAAAY
$s =~ s/Y{2}+/AY/g;
this has 2 problems, {2}+ will get YYYY to AYAY; and AY is not the same length as YYYY (expecting AAAY)
How to get this done in perl?
Use a "look-ahead":
$s =~ s/Y(?=Y+)/A/g;
(?=Y+) means "followed by one or more Y characters", so any Y character that is followed by another Y character will be replaced with an A.
More info from perlretut
There's always more than one way to do it. My suggestion is to grab all the Ys except the last one, and then use that to create a string of As of the same length. The e modifier tells perl to execute the code in the replacement side instead of using it directly, and the r modifier tells =~ to return the result of the substitution instead of modifying the input text directly (useful for these one-liner tests, among other places).
$ perl -E 'say shift =~ s/(Y+)(?=Y)/"A"x length$1/gre' XXXXYYYYZZZYYZZZYYYY
XXXXAAAYZZZAYZZZAAAY
$s =~ s/Y{2}+/AY/g
RHS Pattern is ambiguously obscure pattern: Y{2}+, that's very rarely used regex pattern except if {}+ very rarely is available in few advanced regex engine, including perl maybe, as a regex feature called 'atomic grouping'.
You might have meant (Y{2})+ which is (YY)+ or Y{2,} which is YY+
in perl it's no brainer simple and easy as it supports lookaround feature
perl -e '$s=XXXXYYYYZZZYYZZZYYYY ;$s =~ s/Y(?=Y)/A/g;print $s'
actually lower regex engine such sed still can do it albeit in cumbersome, uneasy way
echo XXXXYYYYZZZYYZZZYYYY |sed -E 's/YY+/&\n/g;s/Y/A/g;s/A\n/Y/g'

PowerShell regex does not match near newline

I have an exe output in form
Compression : CCITT Group 4
Width : 3180
and try to extract CCITT Group 4 to $var with PowerShell script
$var = [regex]::match($exeoutput,'Compression\s+:\s+([\w\s]+)(?=\n)').Groups[1].Value
The http://regexstorm.net/tester say, the regexp Compression\s+:\s+([\w\s]+)(?=\n) is correct but not PowerShell. PowerShell does not match. How can I write the regexp correctly?
You want to get all text from some specific pattern till the end of the line. So, you do not even need the lookahead (?=\n), just use .+, because . matches any char but a newline (LF) char:
$var = [regex]::match($exeoutput,'Compression\s+:\s+(.+)').Groups[1].Value
Or, you may use a -match operator and after the match is found access the captured value using $matches[1]:
$exeoutput -match 'Compression\s*:\s*(.+)'
$var = $matches[1]
Wiktor Stribiżew's helpful answer simplifies your regex and shows you how to use PowerShell's -match operator as an alternative.
Your follow-up comment about piping to Out-String fixing your problem implies that your problem was that $exeOutput contained an array of lines rather than a single, multiline string.
This is indeed what happens when you capture the output from a call to an external program (*.exe): PowerShell captures the stdout output lines as an array of strings (the lines without their trailing newline).
As an alternative to converting array $exeOutput to a single, multiline string with Out-String (which, incidentally, is slow[1]), you can use a switch statement to operate on the array directly:
# Stores 'CCITT Group 4' in $var
$var = switch -regex ($exeOutput) { 'Compression\s+:\s+(.+)' { $Matches[1]; break } }
Alternatively, given the specific format of the lines in $exeOutput, you could leverage the ConvertFrom-StringData cmdlet, which can perform parsing the lines into key-value pairs for you, after having replaced the : separator with =:
$var = ($exeoutput -replace ':', '=' | ConvertFrom-StringData).Compression
[1] Use of a cmdlet is generally slower than using an expression; with a string array $array as input, you can achieve what $array | Out-String does more efficiently with $array -join "`n", though note that Out-String also appends a trailing newline.

How to use $_ in content without been replaced by powershell?

I'm trying to replace a word to some php code
$filecontent = [regex]::Replace($filecontent, $myword, $phpcode)
But the $phpcode have some php code using also a Special variable $_
<?php $cur_author = (isset($_GET['author_name'])) ? get_user_by('slug', $author_name) : get_userdata(intval($author)); ?>
The problem is when the code is replace in $filecontent it replaces the $_ variable from the php code ( $_GET ) with it have on the pipeline.
This not happen with the other variables like $author_name .
How can I resolve this?
Does this work for you?
$filecontent = [regex]::Replace($filecontent, $myword, {$phpcode})
In a regex replace operation the $_ is a reserved substituion pattern that represents the entire string
http://msdn.microsoft.com/en-us/library/az24scfc.aspx
Wrapping it in braces makes it a scriptblock delegate, bypassing the normal regex pattern matching algorithms for doing the replacement.
You have two options. First use a single quoted string and PowerShell will treat that as a verbatim string (C# term) i.e. it won't try to string interpolate:
'$_ is passed through without interpretation'
The other option is to escape the $ character in a double quoted string:
"`$_ is passed through without interpretation"
When I'm messing with a regex I will default to using single quoted strings unless I have a variable that needs to be interpolated inside the string.
Another possibility is that $_ is being interpreted by regex as a substitution group in which case you need to use the substitution escape on the $ e.g. $$.
Im not sure I am following you correctly, but does this help?
$file = path to your file
$oldword = the word you want to replace
$newword = the word you want to replace it with
If the Oldword you are replacing has special charactes ( ie. \ or $ ) then you must escape them first. You can escape them by putting a backslash in front of the special character. The Newword, does not need to be escaped. A $ would become "\$".
(get-content $file) | foreach-object {$_ -replace $oldword,$NewWord} | Set-Content $file

Split string on character, except that doubled up character is an escape

Input:
$string = "this-is-just-an--example"
Output:
this
is
just
an-example
Tried various things centered around Regex.Split and "-[^-]" or "-([^])".
Example of things that did not work:
[regex]::Split( $string, "-[^-]" )
[regex]::Split( $string, "-([^-])" )
Of course I can use the String.Split and iterate, and realize that empty string means I ran into escaped character... But it's ugly code.
P.S. tried searching for duped for a few minutes, didn't find any.
Use the lookahead and lookbehind assertions, and then do a replace to eliminate the remaining doubled characters:
$string = "this-is-just-an--example"
$string -split '(?<!-)-(?!-)' -replace '--','-'
this
is
just
an-example
Replacing the escape sequence with another value first eliminates the need for a complex split condition. All that is needed is a unique replacement value that does not already appear in the string.
For example, using # as the replacement value for $string = 'this-is-just-an--example', the following line will get the desired result:
$string -replace '--','#' -split '-' -replace '#','-'
-replace '--','#' eliminates the escape sequence (giving this-is-just-an#example),
-split '-' then separates the result (giving an array containing this, is, just, and an#example),
and finally -replace '#','-' restores the escaped value (giving this, is, just, an-example).
Both -split and -replace are built-in PowerShell operators that work on strings using regular expressions (equivalent to the Regex.Split and Regex.Replace methods in .NET).