Matching a substring using a regular expression in PowerShell - regex

Regular expressions are really not my forte and I am trying to learn. Struggling with this one at the moment.
<fraglink id="230681395" resid="1057000484">
I have a file with loads of text in it, and every now and then bits like the above appear in it. I want to get the number in between the quotes after resid=.
Is some sort of look ahead / behind required here?

It looks like you want a regex like:
resid="([0-9]+)"
And to grab $1.

Since your content looks like XML, you should probably not use a regular expression to grab your desired value. If you share your whole file we will show you how to select the value properly using XPath for example.
However, if you want to use a regular expression for training purposes, try this:
$content = Get-Content 'your_file_path' -raw
[regex]::Match($content, '\bresid="([^"]+)').Groups[1].Value

Using a lookbehind ((?<=pattern)), your pattern could look like:
(?<=resid=")\d+
With Regex.Match():
$id = [regex]::Match($inputString,'(?<=resid=")\d+').Value
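For comparison, here is a minimal Python sketch of the two approaches above (capture group versus lookbehind), run against the sample line from the question. The variable names are just for illustration, and Python's re module stands in for the .NET regex engine that PowerShell uses.

```python
import re

# Sample line taken from the question.
line = '<fraglink id="230681395" resid="1057000484">'

# Capture-group approach: the digits end up in group 1.
m = re.search(r'resid="([0-9]+)"', line)
group_value = m.group(1) if m else None

# Lookbehind approach: the whole match is just the digits.
m2 = re.search(r'(?<=resid=")\d+', line)
lookbehind_value = m2.group(0) if m2 else None

print(group_value, lookbehind_value)
```

Both variables should hold the same digits; the only difference is whether you read a group or the whole match.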

Related

Regular expression to extract part of a file path using the logstash grok filter

I am new to regular expressions but I think people here may give me valuable inputs. I am using the logstash grok filter in which I can supply only regular expressions.
I have a string like this
/app/webpf04/sns882A/snsdomain/logs/access.log
I want to use a regular expression to get the sns882A part from the string, which is the substring after the third "/", how can I do that?
I am restricted to regex as grok only accepts regex. Is it possible to use regex for this?
Yes, you can use a regular expression to get what you want via grok:
/[^/]+/[^/]+/(?<field1>[^/]+)/
for your regex:
/\w*\/\w*\/(\w*)\/
You can also test with:
http://www.regextester.com/
Searching for "regex tester" will turn up other tools with different interfaces.
If you are indeed using Perl then you should use the File::Spec module like this
use strict;
use warnings;
use File::Spec;
my $path = '/app/webpf04/sns882A/snsdomain/logs/access.log';
my @path = File::Spec->splitdir($path);
print $path[3], "\n";
output
sns882A
This is how I would do it in Perl:
my ($name) = ($fullname =~ m{^(?:/.*?){2}/(.*?)/});
EDIT:
If your framework does not support Perl-ish non-grouping groups (?:xyz), this regex should work instead:
^/.*?/.*?/(.*?)/
If you are concerned about performance of .*?, this works as well:
^/[^/]+/[^/]+/([^/]+)/
One more note: all of the regexes above will match the string /app/webpf04/sns882A/.
But the matched string is not the same as the first capture group, which is sns882A in all three cases.
Same answer, but with a small bug fix. If you don't specify ^ at the start, the pattern can also match later in the string (try longer paths with more / in the input). To fix this, add ^ at the start, like this; ^ means the start of the input line. Group 1 is then your answer.
^/[^/]+/[^/]+/([^/]+)/
If you are dealing with URIs as well as plain paths, use the pattern below (it handles both).
^.*?/[^/]+/[^/]+/([^/]+)/
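To see why the leading ^ matters, here is a small Python sketch, not grok itself. It assumes Python's (?P<name>...) syntax in place of the (?<name>...) form shown above, and third_segment is a hypothetical helper name.

```python
import re

path = "/app/webpf04/sns882A/snsdomain/logs/access.log"

# Same pattern with and without the leading anchor.
anchored = re.compile(r"^/[^/]+/[^/]+/(?P<field1>[^/]+)/")
unanchored = re.compile(r"/[^/]+/[^/]+/(?P<field1>[^/]+)/")

def third_segment(pattern, text):
    """Return the named group from the first match, or None."""
    m = pattern.search(text)
    return m.group("field1") if m else None

print(third_segment(anchored, path))

# On a longer path the unanchored pattern can match more than once,
# while the anchored one can only match at the start.
longer = "/a/b/c/d/e/f/g/"
print(unanchored.findall(longer))
print(anchored.findall(longer))
```

With the anchor, only the segment after the second directory can ever be captured, whichever matching function is used.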

Regular expression to pick extension

What would regular expression look like for any string which ends with .txt?
Tried few myself but it doesn't look like I'm getting anywhere.
I'd like to construct a regex object to feed a function.
Something like : .*\.txt$
If you want something more precise, you should specify the language and other details...
All you have to do is match the end of the string using
/\.txt$/
Matching more than that e.g., .*\.txt$ is not necessary
Assuming Perl-style regular expressions, /[^\.]*\.txt$/ should work.
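Since the question asks for a regex object to pass to a function, here is a minimal sketch in Python (the question does not name a language, so this is an assumption). is_txt is a hypothetical helper name.

```python
import re

# Compiled regex object: a literal ".txt" at the end of the string.
ends_with_txt = re.compile(r"\.txt$")

def is_txt(name):
    """True if the name ends with .txt (search, so no leading .* is needed)."""
    return ends_with_txt.search(name) is not None

print(is_txt("notes.txt"))
print(is_txt("notes.txt.bak"))
```

Because search scans the whole string, anchoring only the end is enough, which is the point made above about .* being unnecessary.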

Transform an expression to another with regex for a replace

I need to replace a lot of strings in an application and I can do it with regex but I don't know how.
My current string is: {$str.LOREM} or {$str.LOREM_IPSUM}.
And the output desired is: <?php echo i18n::n('example.lorem');?> or <?php echo i18n::n('example.lorem_ipsum');?>.
UPDATE: Due to confusion in an answer: I want to do it with my IDE. I have about 500 different strings. NetBeans lets me use regular expressions, and I would like to find one that works with the above example. If possible, I would rather not have to write out all 500 replacements by hand. Thanks!
How I can do it?
If your strings look like that, you don't need a regex, which would be slower. You can use the faster str_replace function instead.
It's as simple as:
$content = str_replace('{$str.LOREM}', i18n::n('example.lorem'), $content);
$content = str_replace('{$str.LOREM_IPSUM}',i18n::n('example.lorem_ipsum'),$content);
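If you would rather not list all 500 keys, one alternative is a single regex pass that captures the key and lowercases it. The sketch below uses Python as a one-off conversion script, not NetBeans' find/replace; it assumes keys consist only of uppercase letters, digits and underscores and that every key maps to the 'example.' prefix. PLACEHOLDER, to_i18n_call and convert are hypothetical names.

```python
import re

# Matches {$str.KEY} and captures KEY (assumed: A-Z, 0-9, underscore).
PLACEHOLDER = re.compile(r"\{\$str\.([A-Z0-9_]+)\}")

def to_i18n_call(match):
    """Build the literal PHP snippet for one captured key."""
    key = match.group(1).lower()
    return "<?php echo i18n::n('example." + key + "');?>"

def convert(template):
    """Replace every placeholder in the given source text."""
    return PLACEHOLDER.sub(to_i18n_call, template)

print(convert("{$str.LOREM} or {$str.LOREM_IPSUM}"))
```

Using a function as the replacement keeps the lowercasing in ordinary code rather than relying on case-conversion escapes, which not every regex engine supports.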

Notepad++ replace with reg expression?

I have a big list with links and other data in it. I want to filter out all the other data and have a list with just the links.
Example of the current list:
32,2012-01-04 06:44:44,http://link.com/link
33,2012-01-04 06:44:45,http://link.com/link,{Text|textext|text},http://link.com/link|http://link.com/link|http://link.com/link
Notepad++ offers find replace functionality using RegEx. You can access this feature by using Ctrl+H.
If you're actually asking for a regular expression to do this, you can use something like this to match URLs:
\b(([\w-]+://?|www[.])[^\s()<>]+(?:\([\w\d]+\)|([^[:punct:]\s]|/)))
which I found here.
Additionally you can test out changes to your regex easily at http://gskinner.com/RegExr/
Using the input you provided, here's a pattern you can use on http://www.regexr.com/
You'll need to make sure the global (/g) flag is on
Expression:
.*?(http.*?)[,|\n]
Input:
32,2012-01-04 06:44:44,http://link.com/link1
33,2012-01-04 06:44:45,http://link.com/link2,{Text|textext|text},http://link.com/link3|http://link.com/link4|http://link.com/link5
Substitution:
$1\n
Output:
http://link.com/link1
http://link.com/link2
http://link.com/link3
http://link.com/link4
http://link.com/link5
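The same extraction can be sketched outside Notepad++ in a few lines of Python. Note that this uses a different, simpler pattern than the ones above: it assumes URLs start with http:// or https:// and never contain a comma, pipe or whitespace, which holds for the sample input but may not hold for your real data.

```python
import re

rows = [
    "32,2012-01-04 06:44:44,http://link.com/link1",
    "33,2012-01-04 06:44:45,http://link.com/link2,{Text|textext|text},"
    "http://link.com/link3|http://link.com/link4|http://link.com/link5",
]

# A URL runs until the next comma, pipe or whitespace (assumption).
LINK = re.compile(r"https?://[^\s,|]+")

# Collect every link from every row, in order of appearance.
links = [url for row in rows for url in LINK.findall(row)]

print("\n".join(links))
```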

Returning a portion of a regular expression match

This question shows my ignorance of regular expressions. I've never understood it quite enough.
If I wanted to match, for instance, just the URL portion of an <A> tag in HTML, what would I need to do?
My regular expression to get the entire tag is:
<A[^>]*?HREF\s*=\s*[""']?([^'"" >]+?)[ '""]?>
I have no idea what I would need to do to get the URL out of that and I have no clue where to look in regular expression documentation to figure this out.
If you are programming in Perl, you could use the $1 capture variable inside an if() statement. For example:
if( $HREF =~ /<A[^>]*?HREF\s*=\s*[""']?([^'"" >]+?)[ '""]?>/ ) {
print $1;
}
Exactly how you do it depends on the regex library you're using, but the way is to use a grouped expression. You actually already have one in your example, as grouped expressions are parenthesized. The href attribute value is your first group (your zeroth group is the whole match).
You can use round brackets to group parts of the regular expression match. In this case you could use a round bracket around the URL part and then later use a number to refer to that group. See here to see how exactly you can do this.
I switched things up a bit - try something like this:
<a[^>]*href="([^"]*).*>
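As a concrete illustration of group 0 versus group 1, here is a short Python sketch. The pattern is a simplified variant of the ones above, not a general HTML parser, and the example.com URL and sample markup are made up for the demonstration.

```python
import re

# Group 1 captures the href value; quotes around it are optional.
HREF = re.compile(
    r"""<a\b[^>]*?\bhref\s*=\s*["']?([^"'\s>]+)""",
    re.IGNORECASE,
)

html = '<p><A class="x" HREF="http://example.com/page">link</A></p>'

m = HREF.search(html)
whole_match = m.group(0)  # everything the pattern consumed
url = m.group(1)          # only the parenthesized part

print(whole_match)
print(url)
```

Group 0 is always the full match, and numbered groups follow the order of their opening parentheses, so the URL here is group 1.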