regex extract word in path - regex

I need a regex to get a word from a path
example:
(update)
/var/log/rsyslog/apache/test1/2014/05/file1.log
/var/log/rsyslog/apache/test2/2014/05/file2.log
/var/log/rsyslog/apache/test3/2014/05/file3.log
the output should be
test1
test2
test3
thank you for your help

I'm not sure which language you're using, but in general this regex captures the text between the first "/" and ".log":
/\/(.*?)\.log/
Regex Explanation
/(.*?)\.log
Match the character “/” literally «/»
Match the regular expression below and capture its match into backreference number 1 «(.*?)»
Match any single character that is not a line break character «.*?»
Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
Match the character “.” literally «\.»
Match the characters “log” literally «log»
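As a rough check of what the pattern above captures on the sample paths, here is a minimal Python sketch (Python is an assumption, since the question does not name a language). Because the lazy group starts at the first "/", it captures everything up to ".log", not only the directory name. The second pattern is a hypothetical alternative that is not part of the original answer; it assumes the wanted word is always the directory directly under apache/.

```python
import re

paths = [
    "/var/log/rsyslog/apache/test1/2014/05/file1.log",
    "/var/log/rsyslog/apache/test2/2014/05/file2.log",
    "/var/log/rsyslog/apache/test3/2014/05/file3.log",
]

# Pattern from the answer: lazy capture between the first "/" and ".log".
answer_re = re.compile(r"/(.*?)\.log")

# Hypothetical alternative (assumption): the word is the directory right after "apache/".
dir_re = re.compile(r"/apache/([^/]+)/")


def capture(pattern, path):
    """Return capture group 1 of the first match, or None if nothing matches."""
    m = pattern.search(path)
    return m.group(1) if m else None


for p in paths:
    print(capture(answer_re, p), "|", capture(dir_re, p))
```

If the directory layout differs from the samples, the anchor text "apache/" in the second pattern would need to change accordingly.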

Related

regex in Perl to replace content containing double equal signs

I need a regex in Perl to turn this:
(== doc_url html/arbitrary_file_name.html ==)
into this:
(/doc_assets/legacy/html/arbitrary_file_name.html)
I've tried all kinds of things. My current attempt looks like this:
$content =~ s!\=\= doc_url ([\w\W]+?)\=\=!/doc_assets/legacy/$1!gis;
(In this particular attempt, I'm just letting the enclosing parentheses remain, since that doesn't change from the input to the output.)
Anyway, nothing is working for me. I assume it's the == throwing things off. Any help will be greatly appreciated.
I guess you need something like:
s!.*?doc_url (.*?/.*?) .*!(/doc_assets/legacy/$1)!sg
i.e.:
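#!/usr/bin/perl
use strict;
use warnings;

my $subject = "(== doc_url html/arbitrary_file_name.html ==)";
# Capture the path after "doc_url " and wrap it in the new prefix.
$subject =~ s!.*?doc_url (.*?/.*?) .*!(/doc_assets/legacy/$1)!sg;
print "$subject\n";
# prints: (/doc_assets/legacy/html/arbitrary_file_name.html)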
Regex Explanation:
.*?doc_url (.*?/.*?) .*
Options: Case sensitive; Exact spacing; Dot matches line breaks; ^$ don’t match at line breaks; Numbered capture
Match any single character «.*?»
Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
Match the character string “doc_url ” literally (case sensitive) «doc_url »
Match the regex below and capture its match into backreference number 1 «(.*?/.*?)»
Match any single character «.*?»
Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
Match the character “/” literally «/»
Match any single character «.*?»
Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
Match the character “ ” literally « »
Match any single character «.*»
Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
(/doc_assets/legacy/$1)
Insert the character string “(/doc_assets/legacy/” literally «(/doc_assets/legacy/»
Insert the text that was last matched by capturing group number 1 «$1»
Insert the character “)” literally «)»
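One thing to watch with the pattern above: the leading .*? and the trailing greedy .* (with dot matching line breaks) consume everything around the first marker, so on a longer text with several markers the whole text would collapse into a single replacement. The Python sketch below is a hypothetical narrower variant, not the answer's method. It assumes every marker has the exact form "(== doc_url PATH ==)" and that PATH contains no whitespace; the names DOC_URL_RE and rewrite_doc_urls are made up for illustration.

```python
import re

# Hypothetical narrower pattern (assumption): marker is "(== doc_url PATH ==)"
# and PATH contains no whitespace.
DOC_URL_RE = re.compile(r"\(== doc_url (\S+) ==\)")


def rewrite_doc_urls(content):
    """Replace every doc_url marker with its /doc_assets/legacy/ path."""
    return DOC_URL_RE.sub(r"(/doc_assets/legacy/\1)", content)


sample = "See (== doc_url html/a.html ==) and (== doc_url html/b.html ==)."
print(rewrite_doc_urls(sample))
```

Matching only the marker itself leaves the surrounding text untouched, which is usually what a global substitution over a whole document needs.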

regular expression match _ underscore

I have a string like this :
002_part1_part2_______________by_test
and I would like to stop the match at the third underscore character, like this :
002_part1_part2_
How can I do that with a Regular expression ?
Thanks
Create a pattern that matches any character other than _ zero or more times, followed by an underscore. Put that pattern inside a capturing or non-capturing group and make it repeat exactly 3 times by adding the range quantifier {3} after the group.
^(?:[^_]*_){3}
You can use:
.*\d_
EXPLANATION:
Match any single character that is NOT a line break character (line feed) «.*»
Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
Match a single character that is a “digit” (any decimal number in any Unicode script) «\d»
Match the character “_” literally «_»
https://regex101.com/r/uX0qD5/1
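The two patterns from this thread can be compared with a small Python sketch (Python and the function names are assumptions for illustration). Both return the wanted prefix for the sample string, but .*\d_ only works because the last digit in that string happens to sit right before the underscore where the match should end; the counting pattern does not depend on digits.

```python
import re

text = "002_part1_part2_______________by_test"


def first_three_fields(s):
    """Match three runs of non-underscore characters, each followed by "_"."""
    m = re.match(r"(?:[^_]*_){3}", s)
    return m.group(0) if m else None


def up_to_last_digit_underscore(s):
    """Greedy match up to the last digit that is directly followed by "_"."""
    m = re.match(r".*\d_", s)
    return m.group(0) if m else None


print(first_three_fields(text))
print(up_to_last_digit_underscore(text))
print(first_three_fields("a_b_c_d"), up_to_last_digit_underscore("a_b_c_d"))
```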

Javascript transformation

Is there any simple way to transform:
"<A[hello|home]>"
to:
"hello|home"
Thanks!
Apart from the clever advice in the comments to simply remove certain characters, if you are unable to remove these characters because they are present elsewhere in the text and do want to match that format, here is a way to do it with regex:
Search: <\w+\[([^|]*\|[^\]]*)\]>
Replace: \1 or $1 depending on editor or regex engine.
See the Substitution pane at the bottom of the demo.
Explanation
<\w+\[([^|]*\|[^\]]*)\]>
Match the character “<” literally <
Match a single character that is a “word character” (Unicode; any letter or ideograph, digit, connector punctuation) \w+
Between one and unlimited times, as many times as possible, giving back as needed (greedy) +
Match the character “[” literally \[
Match the regex below and capture its match into backreference number 1 ([^|]*\|[^\]]*)
Match any character that is NOT a “|” [^|]*
Between zero and unlimited times, as many times as possible, giving back as needed (greedy) *
Match the character “|” literally \|
Match any character that is NOT a “]” [^\]]*
Between zero and unlimited times, as many times as possible, giving back as needed (greedy) *
Match the character “]” literally \]
Match the character “>” literally >
\1
Insert the text matched by capturing group number 1 (written \1 or $1, depending on the engine)
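The search and replace pair above can be tried with a short Python sketch. The question is about JavaScript, so Python here is only an assumption made for illustration, and the names TAG_RE and unwrap are made up. Python's re.sub accepts \1 in the replacement string.

```python
import re

# Same search pattern as above: "<", a word, "[", captured "left|right", "]", ">".
TAG_RE = re.compile(r"<\w+\[([^|]*\|[^\]]*)\]>")


def unwrap(s):
    """Replace each <X[a|b]> wrapper with its captured inner text."""
    return TAG_RE.sub(r"\1", s)


print(unwrap("<A[hello|home]>"))
```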

Regex for 2 items but with one exclusion

I am building a RegEx that needs to find lines that have either:
DateTime.Now
or
Date.Now
But cannot have the literal "SystemDateTime" on the same line.
I started with this (DateTime\.Now|Date\.Now) but now I am stuck with where to put the "SystemDateTime"
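Use this, assuming you are not using the /s modifier (DOTALL), which would let the dot (.) match newline characters:
(?!.*SystemDateTime)(DateTime\.Now|Date\.Now)
(?!.*SystemDateTime) asserts that SystemDateTime does not appear anywhere after the current position on the line. Note that it does not see a SystemDateTime that starts before the position being tested; anchoring the lookahead at the start of the line, as in ^(?!.*SystemDateTime).*(DateTime\.Now|Date\.Now), checks the whole line.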
You could use negative lookahead like this:
(?!.*SystemDateTime)\bDate(?:Time)?\.Now\b
/(?!.*SystemDateTime)Date(?:Time)?\.Now/
EXPLANATION:
Assert that it is impossible to match the regex below starting at this position (negative lookahead) «(?!.*SystemDateTime)»
Match any single character that is not a line break character «.*»
Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
Match the characters “SystemDateTime” literally «SystemDateTime»
Match the characters “Date” literally «Date»
Match the regular expression below «(?:Time)?»
Between zero and one times, as many times as possible, giving back as needed (greedy) «?»
Match the characters “Time” literally «Time»
Match the character “.” literally «\.»
Match the characters “Now” literally «Now»
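A negative lookahead is evaluated at the position where the match starts, so it cannot see a SystemDateTime that begins earlier on the line. The Python sketch below is a variant under that assumption, not one of the answers' exact patterns: it anchors the lookahead at the start of the line so the whole line is checked. The sample lines and the names LINE_RE and wanted are made up for illustration.

```python
import re

# Anchored variant (assumption): reject the line if SystemDateTime occurs anywhere,
# then require Date.Now or DateTime.Now as a whole word.
LINE_RE = re.compile(r"^(?!.*SystemDateTime).*\bDate(?:Time)?\.Now\b")

lines = [
    "var a = DateTime.Now;",
    "var b = Date.Now;",
    "var c = SystemDateTime.Now;",
    "SystemDateTime x; var d = Date.Now;",
    "var e = 42;",
]

# Each line is tested on its own, so "." never has to cross a line break.
wanted = [line for line in lines if LINE_RE.search(line)]
print(wanted)
```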

Decode the regexp string that matches the word in string

I have the following regexp
var value = "hello";
@"(?<start>.*?\W*?)(?<term>" + Regex.Escape(value) + @")(?<end>\W.*?)"
I'm trying to figure out what it means, because it doesn't work against a single word.
For example, it matches "they said hello us", but fails for just "hello".
Can you please help me decode what this regexp string means?
PS: it's .NET regexp
It's because of the \W in the last part. \W matches any character that is not a word character (letter, digit, or underscore).
In "they said hello us" there is a space after hello, but in "hello" there is nothing after it, which is why the match fails.
If you change it to (?<end>\W*.*?) it may work.
Actually, the regex itself does not make sense to me; it should rather be something like
@"\b" + Regex.Escape(value) + @"\b"
\b is word boundary
The regex may be trying to find a pattern comprising whole words, so that your hello example doesn't match, say, Othello. If so, the word boundary regex, \b, is tailor-made for the purpose:
@"\b(" + Regex.Escape(value) + @")\b"
If this is a .NET regex and the Regex.Escape() part is replaced with just 'hello', RegexBuddy says it means:
(?<start>.*?\W*?)(?<term>hello)(?<end>\W.*?)
Options: case insensitive
Match the regular expression below and capture its match into backreference with name “start” «(?<start>.*?\W*?)»
Match any single character that is not a line break character «.*?»
Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
Match a single character that is a “non-word character” «\W*?»
Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
Match the regular expression below and capture its match into backreference with name “term” «(?<term>hello)»
Match the characters “hello” literally «hello»
Match the regular expression below and capture its match into backreference with name “end” «(?<end>\W.*?)»
Match a single character that is a “non-word character” «\W»
Match any single character that is not a line break character «.*?»
Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
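The behaviour described in this thread can be reproduced with a small Python sketch. Python is used only for illustration and is an assumption, since the question is about .NET; Python writes named groups as (?P<name>...) and re.escape stands in for Regex.Escape. The original pattern needs a non-word character after the term, so it fails on the bare word, while the \b variant accepts it and still rejects Othello.

```python
import re

value = "hello"

# Python spelling of the original pattern: the "end" group requires one \W character.
original = re.compile(
    r"(?P<start>.*?\W*?)(?P<term>" + re.escape(value) + r")(?P<end>\W.*?)"
)

# Word-boundary variant suggested in the answers.
bounded = re.compile(r"\b(" + re.escape(value) + r")\b")

for sample in ["they said hello us", "hello", "Othello"]:
    print(sample, bool(original.search(sample)), bool(bounded.search(sample)))
```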