Extract dynamic string with regex (PowerShell) [closed] - regex

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 11 days ago.
I have a long output string in PowerShell that contains all sorts of special characters.
This is part of it:
{host-up|rp-web1|/images/logos/Generic_Host.gif|0|276|0 CRITICAL service-critical|rp-web1|ssl_expiration_bitwarden|26186|0|0|1|0 2023/02/06 ....
"service-critical" is a fixed string and appears several more times in the string
"rp-web1|ssl_expiration_bitwarden" - this is a dynamic string which comes right after "service-critical"
I was not able to write a regex that managed to extract all the dynamic strings in the text
Of course I tried to use 3 pipes between the dynamic string but without success
I expect to get all dynamic string after "service-critical" like:
rp-web1|ssl_expiration_bitwarden

Use the .NET [regex]::Matches() method to extract all desired matches of a given regex.
# Sample input string with 2 "service-critical" groups.
$str = @'
{host-up|...|0 CRITICAL service-critical|rp-web1|ssl_expiration_bitwarden1|26186|0|0|1|0 2023/02/06 ....
{host-up|...|0 CRITICAL service-critical|rp-web2|ssl_expiration_bitwarden2|26186|0|0|1|0 2023/02/06 ....
'@
# Define the regex.
$regex = '(?<=\bservice-critical\|)[^|]+\|[^|]+'
# Find and report all matches
# -> 'rp-web1|ssl_expiration_bitwarden1', 'rp-web2|ssl_expiration_bitwarden2'
[regex]::Matches($str, $regex).Value
For an explanation of the regex and the ability to experiment with it, see this regex101.com page.
Unfortunately, PowerShell's -match operator (with subsequent inspection of the automatic $Matches variable) is not an option in this case, because it only ever looks for a single match.
GitHub issue #7867 is a green-lit proposal to introduce a -matchall operator in order to support finding all matches.

You can try this one:
(?:service-critical\|)\K(.*?\|.*?)(?=\|)
Here is a demo
(?:service-critical\|) - looks for the service-critical| phrase, without capturing it
\K - drops the already matched service-critical| from the result, as that is the part you do not want to see
NOTE: as @mklement0 pointed out, \K does not work in the .NET regex engine (which is used by PowerShell). In that case you can omit it and accept matches that include the service-critical| phrase, or use the positive lookbehind shown in @mklement0's answer.
(.*?\|.*?) - captures any characters (possibly none) on either side of a single |, matched non-greedily
(?=\|) - asserts that the match is followed by | (this is called a positive lookahead)
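Since \K is not available in the .NET engine (nor in JavaScript), a plain capture group gives the same result without any lookaround. A minimal sketch in JavaScript, where extractAfterMarker and the sample input are made up for illustration:

```javascript
// Hypothetical helper: collect the two pipe-separated fields that follow
// every "service-critical|" marker, using a capture group instead of \K.
function extractAfterMarker(text) {
  const pattern = /service-critical\|([^|]+\|[^|]+)/g;
  // matchAll yields one match object per occurrence; index 1 is the group.
  return Array.from(text.matchAll(pattern), (m) => m[1]);
}

// Made-up sample input with two "service-critical" groups.
const sample =
  "{host-up|...|0 CRITICAL service-critical|rp-web1|ssl_expiration_bitwarden1|26186|0\n" +
  "{host-up|...|0 CRITICAL service-critical|rp-web2|ssl_expiration_bitwarden2|26186|0";

console.log(extractAfterMarker(sample));
// [ 'rp-web1|ssl_expiration_bitwarden1', 'rp-web2|ssl_expiration_bitwarden2' ]
```

Using [^|]+ rather than .*? keeps each field from running across a pipe, so no trailing lookahead is needed.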

Related

$1 through $9 only in perl, can we go further? [closed]

Closed. This question needs debugging details. It is not currently accepting answers.
Closed 8 years ago.
I have a requirement to match 12 numbers in a sequence, but I am limited to the 9th. Is there any way to go beyond 9 matches?
my string is something like
{"Column5": "Null", "Column4": "Null", "Column6": "Null", "Column1": "END", "Column3": "Null", "Column2": "Null"}
where columns are fixed but in place of Null there can be any sequence/characters.
I tried matching the columns and the strings that follow them, but I have 12 matches, whereas I am limited to $9.
Any suggestions?
You can easily put your matches into an actual array rather than relying on $1 and friends:
my @matches = $some_string =~ /(some) (regex) with (m)(a)(n)(y) (c)(a)(p)(t)(u)r(e)(s)/;
Or, as suggested in a comment, use a JSON parser if you're parsing JSON data. It will be more reliable than a quick regex-based solution.
Please use Dave Sherohman's suggestion about using a JSON parser, or at least use an actual array to store the matches.
Perl imposes no hard limit on the number of captures (or the limit is so high that no reasonable script would run into it). The code in this answer and even the script in the question show that you can refer to the text matched by capturing groups beyond 9 as usual, i.e. group 10 with $10, group 100 with $100.
(In case anyone is confused, $1, $10, ... are variables used outside the regex to refer to content of the capturing group. It's not syntax for backreference (e.g. \1, \10, ... or \g{1}, \g{10}, ...), which is used in the regex to match the same text captured by the capturing groups).
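The JSON-parser suggestion above can be sketched as follows. This is an illustration in JavaScript rather than Perl, and valuesByColumn is a hypothetical helper name; it assumes every key has the form "Column" followed by a number, as in the question's sample:

```javascript
// The sample string from the question is valid JSON, so a parser can
// read it directly; no capture groups are involved.
const input =
  '{"Column5": "Null", "Column4": "Null", "Column6": "Null", ' +
  '"Column1": "END", "Column3": "Null", "Column2": "Null"}';

const record = JSON.parse(input);

// Hypothetical helper: return the values ordered by column number.
// Assumes keys look like "Column<N>"; slice(6) strips the "Column" prefix.
function valuesByColumn(obj) {
  return Object.keys(obj)
    .sort((a, b) => Number(a.slice(6)) - Number(b.slice(6)))
    .map((key) => obj[key]);
}

console.log(record.Column1);         // END
console.log(valuesByColumn(record)); // [ 'END', 'Null', 'Null', 'Null', 'Null', 'Null' ]
```

Because the parser handles quoting and escaping, the number of columns no longer matters.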

How to get text out of a delimited string [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Closed 9 years ago.
I'm trying to extract a portion of a delimited string. The string is something like this:
272;#This is the text i want
I'd like to get everything after the "#". Anybody have any suggestions?
TL;DR
Language implementations matter. Not all languages support every regular expression operator or feature. There are some general approaches, though, such as zero-width assertions and capture groups.
Positive Lookbehind
Use a zero-width assertion to find the character preceding your string. For example, to capture just the text of interest using Ruby 2.0:
'272;#This is the text i want'.match /(?<=#).*/
pp $&
#=> "This is the text i want"
Capture Groups
Use capture and non-capture groups to match text, then extract the group you're interested in. For example, to capture your desired match in the first capturing group with Ruby 2.0:
'272;#This is the text i want'.match /(?:#)(.*)/
pp $1
#=> "This is the text i want"
You can use the regex #(.*) and extract the first capturing group - btw, what language are you trying to do this??
edit: if you can't access the capturing groups you can try lookbehind if it's supported by the engine:
(?<=#).*
Consider the following Regex...
(?<=#).*
Good Luck!
Try:
string text = "272;#This is the text i want";
int index = text.IndexOf("#");
string sub = text.Substring(index + 1);
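The same index-based idea, sketched in JavaScript with an explicit check for a missing delimiter. textAfterHash is a hypothetical name, and returning null when there is no "#" is an assumption about the desired behavior:

```javascript
// Hypothetical helper: return everything after the first "#",
// or null when the string contains no "#" at all.
function textAfterHash(s) {
  const index = s.indexOf("#");
  return index === -1 ? null : s.slice(index + 1);
}

console.log(textAfterHash("272;#This is the text i want"));
// This is the text i want
```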

Match everything in a string after the 3rd '/' [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Closed 9 years ago.
I have two different forms of a string:
https://anexample.com/things/stuff
and
https:///things/stuff
I need a regular expression that will match everything in the string after the 3rd slash, no matter the syntax. Once it hits the 3rd slash, the rest of the string is matched. I have found a bunch of examples, but I can't seem to tweak the right way to get it to work. Thanks in advance.
You can use this
^[^/]*/[^/]*/[^/]*/(.*)$
You can use this regex:
^(?:[^\/]*\/){3}(.*)$
And use matched group #1
In javascript:
var s = 'https:///things/stuff';
m = s.match(/^(?:[^\/]*\/){3}(.*)$/);
// m[1] => things/stuff
Assuming PCRE, and that you won't have newlines in your string:
If the 3 slashes can be at any position (like your first example):
^[^/]*/[^/]*/[^/]*/(.*)$
This could also be expressed as
^(?:[^/]*/){3}(.*)$
Using positive lookbehind, you could use the following, which matches only what you want instead of putting it into a capturing group. Note that this lookbehind has variable length, which PCRE rejects, so it needs an engine that allows it, such as .NET:
(?<=^(?:[^/]*/){3}).*$
Any escaping needed because of the delimiters you use is left as an exercise to the reader, of course (if you use / as a delimiter, you have to escape every / in the expression, like \/).
And there's probably a million other alternatives, depending on what exact needs you have besides the ones you mentioned.
Something like this should work, though I'm writing it without any testing. It looks for three sections of any characters, each followed by a slash, and then captures the last section, which is everything up to the end of the line. You can of course change the delimiter to whitespace or whatever.
^(?:.*?/){3}(.*)$
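If a regex is not strictly required, splitting on the slash gives the same result for both inputs from the question. A small sketch in JavaScript (afterThirdSlash is a hypothetical name):

```javascript
// Hypothetical helper: drop the first three slash-separated pieces
// and glue the remainder back together.
function afterThirdSlash(url) {
  return url.split("/").slice(3).join("/");
}

console.log(afterThirdSlash("https://anexample.com/things/stuff")); // things/stuff
console.log(afterThirdSlash("https:///things/stuff"));              // things/stuff
```

For input with fewer than three slashes this sketch returns an empty string, whereas the anchored regexes above simply do not match.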

Regex expression to replace word before search pattern [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Closed 8 years ago.
I am very confused how to replace a word before pattern ".ext".
example :
Before Replace : abcd.ext.com
After Replace : customer.ext.com
You can use something like [^.]+(?=\.ext) as the match and replace it with customer.
(?=\.ext) is a positive lookahead: it succeeds only when .ext follows the part before it, but it does not consume any characters itself. (A bare (?=\.) would also match ext, since that is followed by a dot too, and Regex.Replace replaces every match.)
E.g. in C# you can use
Regex.Replace(foo, @"[^.]+(?=\.ext)", "customer");
If you're doing this in C#, then I would recommend just doing something like this:
var newFileName = fileName.Replace(Path.GetFileName(fileName), "newFileNameValue");
If it's in VB.NET, it would look almost exactly the same:
Dim newFileName As String = fileName.Replace(Path.GetFileName(fileName), "newFileNameValue")
You can use a Regex, but it's probably a little overkill and less stable. See, when building a Regex you have to break it down to a really abstract level. You need to handle every extension that's in your domain and that list can grow pretty quickly. So then it's generally not feasible to include those extensions in the Regex itself.
To further add to the problem, a valid file name might be something like this, MyFile.v1.l1.ext1.txt. The extension of that file is .txt, but grabbing that with a Regex is tough.
On Unix you can use sed like this:
echo "$str"|sed 's/abcd\(\.ext\)/customer\1/'
i.e. look for abcd immediately followed by .ext (capture this in a group). Then replace it with customer and match group #1 (.ext)
If you're using any other platform or language, the approach should be the same.
In Perl:
$x =~ s/(.*)(\.ext\.com)/customer$2/;
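For comparison, the lookahead approach can be sketched in JavaScript. replaceBeforeExt is a hypothetical name, and the \b after ext is an added assumption so that a label such as .extra is left alone:

```javascript
// Hypothetical helper: replace the dot-free label that sits directly
// in front of ".ext". The lookahead checks for ".ext" without consuming it.
function replaceBeforeExt(host, replacement) {
  return host.replace(/[^.]+(?=\.ext\b)/, replacement);
}

console.log(replaceBeforeExt("abcd.ext.com", "customer")); // customer.ext.com
```

Without the g flag only the first occurrence is replaced, which matches the single replacement asked for in the question.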

How to write this Regular Expression for this situation? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 8 years ago.
I want to split the input String on a colon. For example, a:int. I can use [^:]* to get the a and int.
However, I don't want the String to be split by any combination which includes colon, such as A:=3:command. What I want are the A:=3 and command but not A, =3, command.
Could someone tell me how to write the regular expression?
I'm going to assume, pending an edit by the OP, that the only colons that should appear in a split are those followed by simple ASCII letters or numbers. The solution can easily be generalized.
Here is a concrete example in JavaScript:
s = "x:=3:comment"
s.split(/:(?=[\s\w])/)
The result is
['x:=3','comment']
The split function says "split on colons that are followed by spaces or word characters (ASCII letters or numbers or underscores)".
Other languages have more powerful forms of lookaround (in particular negative lookarounds), but the basic idea is to construct a regex where the split value is a colon in a particular context.
ADDENDUM
Another example:
"this:has:(some%: 7colons:$:6)".split(/:(?=[\s\w])/)
produces:
['this','has:(some%',' 7colons:$','6)']
On the face of it, you want to split on the last colon in the string, so you want the trailing material to be a string of non-colons, and the preceding material to be anything. You also didn't specify (at the time I answered the question) which sub-species of regex you want (which language you are writing in), so you get Perl for my answer.
#!/usr/bin/env perl
use strict;
use warnings;
my @array = ( "a:int", "A:=3:comment" );
foreach my $item (@array)
{
my($prefix, $suffix) = $item =~ m/^(.*):([^:]+)$/;
print "$prefix and $suffix\n";
}
The output from that script is:
a and int
A:=3 and comment
Clearly, if the rule for the split is different (it isn't simply 'the last colon'), then the pattern will have to change. But this achieves the stated requirements reasonably cleanly.
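The same "split on the last colon" rule, sketched in JavaScript with the pattern carried over from the Perl script. splitOnLastColon is a hypothetical name, and returning null for input without a usable colon is an assumption:

```javascript
// Hypothetical helper: the greedy (.*) takes everything up to the last
// colon, and ([^:]+) takes the colon-free remainder.
function splitOnLastColon(item) {
  const m = /^(.*):([^:]+)$/.exec(item);
  return m ? [m[1], m[2]] : null;
}

for (const item of ["a:int", "A:=3:comment"]) {
  const [prefix, suffix] = splitOnLastColon(item);
  console.log(`${prefix} and ${suffix}`);
}
// a and int
// A:=3 and comment
```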
In addition to Ray's answer, another option is to white-list the operators you support, for example, to support := (JavaScript example):
var s = "hello:world:=5:and:r";
var tokens = s.match(/(?:[^:]|:=)+/g);
For example, if you want the operators :=, =:, :=: and ::, you could write:
/(?::=:|:=|=:|::|[^:])+/g
(the longer operators come first because alternation tries its branches from left to right; this can be simplified, but I think this form is easier to maintain).