Despite trying posix regex, getting this error in monit - regex

I have this monit syntax,
check file access_log_1 with path /app/DNIF_logs/access_log_1
ignore content = ".*favicon.*"
if content = "^([:digit:]{1,3}\.){3}[:digit:]{1,3}[:space:]((((([:digit:]{1,3}\.){3}[:digit:]{1,3})|\-)[:space:]([:digit:]{6,7}|\-)[:space:][-a-zA-Z0-9#:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b([-a-zA-Z0-9()#:%_\+.~#?&=]*)[:space:])|)-[:space:]-[:space:]\[[:digit:]{2}\/([A-Z]|[a-z]){3}\/[:digit:]{4}\:[:digit:]{2}\:[:digit:]{2}\:[:digit:]{2}[:space:]\+[:digit:]{4}\][:space:]\"(.*)\"[:space:](500|502|503)([:digit:]|[:space:])*\"https?:\/\/(www\.)?[-a-zA-Z0-9#:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b([-a-zA-Z0-9()#:%_\+.~#?&\/\/=]*)\"(.*)" then alert
While running this monit I'm getting this error
/monitrc:40: syntax error '.*'
I tried removing the first .* and got
monitrc:41: syntax error '"[:space:](500|502|503)([:digit:]|[:space:])*\"'
so I figured out that the error is at the first occurrence of (.*).
As per the answer here, we have to use POSIX regex syntax. Is there a different POSIX syntax for .*, or what should I do?

In POSIX ERE,
Bare POSIX character classes are not allowed; always use them inside bracket expressions (i.e. square brackets), so [:digit:] must be written [[:digit:]]
Note [[:digit:]] is the same as [0-9]
Inside bracket expressions you can't use regex escapes: every backslash there is a literal character that matches a backslash
\b is not recognized, but you can usually use \< as a leading and \> as a trailing word boundary
To use double quotes in the regex, use single quotes to delimit the regex string.
You may fix the regex like
if content = '^([0-9]{1,3}\.){3}[0-9]{1,3}[[:space:]]((((([0-9]{1,3}\.){3}[0-9]{1,3})|-)[[:space:]]([0-9]{6,7}|-)[[:space:]][-a-zA-Z0-9#:%._+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\>([-a-zA-Z0-9()#:%_+.~#?&=]*)[[:space:]])|)-[[:space:]]-[[:space:]]\[[0-9]{2}/[[:alpha:]]{3}/[0-9]{4}:[0-9]{2}:[0-9]{2}:[0-9]{2}[[:space:]]\+[0-9]{4}][[:space:]]"(.*)"[[:space:]](500|502|503)[0-9[:space:]]*"https?://(www\.)?[-a-zA-Z0-9#:%._+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\>([-a-zA-Z0-9()#:%_+.~#?&/=]*)"(.*)' then alert
Note:
([A-Z]|[a-z])* is better written as just [[:alpha:]]* (any zero or more letters).
([0-9]|[[:space:]])* has a shorter variant, [0-9[:space:]]*.

Related

How to exclude character that has preceding character different than specified in regular expression?

With regular expression I would like to get all characters between round brackets, but \( and \) characters should be also included in the result.
Examples:
input: fo(ob)a)r
output: ob
input: foo(bar\(qwerty\))baz
output: bar\(qwerty\)
This is what I used for finding text between brackets:
(?<=\()([^\s\(\)]+)(?=\)), but I can't make exceptions for brackets preceded by \.
You could do something like this :
.*(?<!\\)\((.*?)(?<!\\)\)
Basically, it matches as many characters as possible until it sees an open parenthesis without a backslash (using a negative lookbehind), then groups the next matching characters until a closing parenthesis (still without a backslash).
Note that this regex may not work properly if you escape the backslashes.
Example : https://regex101.com/r/BqVKZp/1
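As a rough illustration of the lookbehind approach, here is a minimal Python sketch. Python's re module is an assumption here (the question does not name an engine), and the helper name inner_text is made up for the example. It applies the pattern above to both sample inputs.

```python
import re

# Same pattern as in the answer above. The greedy ".*" runs forward and
# backtracks to the last unescaped "(" that can still be closed; the lazy
# group then stops at the first ")" not preceded by a backslash.
PATTERN = re.compile(r'.*(?<!\\)\((.*?)(?<!\\)\)')

def inner_text(s):
    """Return the text captured between unescaped parentheses, or None."""
    m = PATTERN.search(s)
    return m.group(1) if m else None

print(inner_text('fo(ob)a)r'))               # ob
print(inner_text(r'foo(bar\(qwerty\))baz'))  # bar\(qwerty\)
```

Because each lookbehind only inspects one character, an escaped backslash directly before a parenthesis (as in \\( ) is still read as an escape, which is the caveat mentioned above.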
This regex works for both your examples, without any lookaheads and lookbehinds:
\((.+[^\\])\)
The U (ungreedy) flag is needed, so that .+ matches as few characters as possible.
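For engines without a U flag, the same idea can probably be expressed with an explicit lazy quantifier. The sketch below assumes Python's re module, which has no ungreedy flag, so .+ becomes .+? instead; the helper name between_parens is hypothetical.

```python
import re

# PCRE's U flag inverts greediness. Python has no such flag, so the
# quantifier is made lazy explicitly with "+?".
PATTERN = re.compile(r'\((.+?[^\\])\)')

def between_parens(s):
    """Return the first parenthesised text whose last char is not a backslash."""
    m = PATTERN.search(s)
    return m.group(1) if m else None

print(between_parens('fo(ob)a)r'))               # ob
print(between_parens(r'foo(bar\(qwerty\))baz'))  # bar\(qwerty\)
```

One consequence of writing .+ followed by [^\\] is that at least two characters are required between the parentheses, so an input such as (a) is not matched.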

vim delete regex: pattern not found

This is my first time trying to use a regex for deletion.
The regex:
/net=.+\.net/
as shown here matches a string that starts with net= some random characters and ends with .net
However, when using it in vim:
:g/net=.+\.net/d
I simply get Pattern not found: net=.+\.net
I am guessing that vim uses a slightly different format, or do I need to escape the characters =, . and + ?
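:help pattern is your friend. In your case, you need to escape + (write \+), or prefix your whole pattern with \v to make it "very magic". With the default settings the command becomes :g/net=.\+\.net/d
Do not escape = in the default mode: \= is a quantifier meaning "zero or one of the preceding atom", the same thing as {0,1} in some regexp engines. Note that under \v the roles are swapped, so a bare = becomes that quantifier and a literal = has to be written \=.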

Separate backreference followed by numeric literal in perl regex

I found this related question : In perl, backreference in replacement text followed by numerical literal
but it seems entirely different.
I have a regex like this one
s/([^0-9])([xy])/\1 1\2/g
                   ^
            whitespace here
But that whitespace comes up in the substitution.
How do I not get the whitespace in the substituted string without having perl confuse the backreference to \11?
For example,
15+x+y changes to 15+ 1x+ 1y.
I want to get 15+1x+1y.
\1 is a regex atom that matches what the first capture captured. It makes no sense to use it in a replacement expression. You want $1.
$ perl -we'$_="abc"; s/(a)/\1/'
\1 better written as $1 at -e line 1.
In a string literal (including the replacement expression of a substitution), you can delimit $var using curlies: ${var}. That means you want the following:
s/([^0-9])([xy])/${1}1$2/g
The following is more efficient (although gives a different answer for xxx):
s/[^0-9]\K(?=[xy])/1/g
Just put braces around the number:
s/([^0-9])([xy])/${1}1${2}/g
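The same ambiguity between a group reference and a following digit exists outside Perl. As a comparison only, here is a small sketch in Python (not the language of the question; the function names are made up). Python's re.sub spells the unambiguous reference as \g<1>, and since its re module has no \K, a lookbehind stands in for it in the zero-width variant.

```python
import re

def add_coefficient(expr):
    # \g<1> is the unambiguous group reference; r'\11' would be read
    # as a reference to group 11.
    return re.sub(r'([^0-9])([xy])', r'\g<1>1\2', expr)

def add_coefficient_zero_width(expr):
    # Zero-width variant: nothing is consumed, a "1" is inserted between
    # a non-digit and a following x or y.
    return re.sub(r'(?<=[^0-9])(?=[xy])', '1', expr)

print(add_coefficient('15+x+y'))             # 15+1x+1y
print(add_coefficient_zero_width('15+x+y'))  # 15+1x+1y
print(add_coefficient('xxx'))                # x1xx
print(add_coefficient_zero_width('xxx'))     # x1x1x
```

The last two lines show the difference mentioned above for xxx: the capturing version consumes two characters per match, while the zero-width version can insert at every position.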

Slashes and hashes in Perl and metacharacters

Thanks for the previous assistance everyone!. I have a query regarding RegExp in Perl
My issue is..
I know that when matching you can write m// or // or ## (you must include m or s if you use the latter). What confuses me is a book example on escaping characters. I believe most people escape lots of characters as a surefire way of making the program work without missing a metacharacter, e.g. \# when looking to match #, say in an email address.
Here's my issue and I know what this script does:
$date = "15/12/99";
$date =~ s#(\d+)/(\d+)/(\d+)#$1/$2/$3#;   # << why are no forward slashes escaped??
print($date);
Yet a later example I have shows it rewritten as follows (which I also understand, and there they're escaped):
$date =~ s/(\d+)\/(\d+)\/(\d+)/$2\/$1\/$3/;   # <<<< which is escaping the forward slashes.
I know the slashes or hashes are programmer preference and their use. What I don't understand is why the second example, escapes the slashes, yet the first doesn't - I have tried and they work both ways. No escaping slashes with hashes? What's even MORE confusing is, looking at yet another book example I also have earlier to this one, using hashes again, they too escape the # symbol.
if ($address =~ m#\##) { print("That's an email address"); } or something similar
So what do you escape from what you don't using hashes or slashes? I know you have to escape metacharacters to match them but I'm confused.
When you build a regexp, you define a character as a delimiter for your regexp i.e. doing // or ##.
If you need to use that character inside your regexp, you will need to escape it so that the regexp engine does not see it as the end of the regexp.
If you build your regexp between forward slashes /, you will need to escape the forward slashes contained in your regexp, hence the escaping in your second example.
Of course, the same rule applies to any character you use as a regexp delimiter, not just forward slashes.
The forward slashes are not meta characters in themselves - only the use of them in the second example as expression separators makes them "special".
The format of a substitute expression is:
s<expression separator char><expression to look for><expression separator char><expression to replace with><expression separator char>
In the first example, using a hash as the first character after the =~ s, makes that character the expression separator, so forward slash is not special and does not require any escaping.
In the second example, the expression separator is indeed the forward slash, so it must be escaped within the expressions themselves.
The regex match operator allows you to define a custom non-whitespace character as the separator.
In your first example the '#' is used as the separator. So in this regex you don't need to escape the '/' because it has no special meaning. In the second regex, the separator char isn't changed, so the default '/' is used. Now you have to escape every '/' in your pattern. Otherwise the parser is confused. :)
If you are not using slashes, the recommended practice is to use curly braces and the /x modifier.
$date=~ s{ (\d+) \/ (\d+) \/ (\d+) }{$1/$2/$3}x;
Escaping non-alphanumerics is also standard practice, even if they are not metacharacters. See perldoc -f quotemeta.
There is another depth to this question about escaping forward slashes with the s operator.
With my example the capturing becomes the problem.
$image_name =~ s/((http:\/\/.+\/)\/)/$2/g;
For this to work, the typo (the extra second forward slash) had to be captured.
Also, trying to match just the two slashes did not work. The first slash has to be preceded by more than one character.
Changing "http://world.com/Photos//space_shots/out_of_this_world.jpg"
To: "http://world.com/Photos/space_shots/out_of_this_world.jpg"
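For comparison, here is a sketch of the same substitution in Python, where a pattern is an ordinary string with no delimiter and so the slashes need no escaping at all. This is an illustration outside Perl, and the function name collapse_double_slash is hypothetical.

```python
import re

def collapse_double_slash(url):
    # Greedy ".+" backtracks to the last "//" after the scheme; the group
    # keeps everything up to and including the first of the two slashes.
    return re.sub(r'(http://.+/)/', r'\1', url)

fixed = collapse_double_slash(
    "http://world.com/Photos//space_shots/out_of_this_world.jpg")
print(fixed)  # http://world.com/Photos/space_shots/out_of_this_world.jpg
```

A URL without a doubled slash after the host is returned unchanged, because the pattern then finds no match.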

Using escape characters inside grep

I have the following regular expression for eliminating spaces, tabs, and new lines: [^ \n\t]
However, I want to expand this for certain additional characters, such as > and <.
I tried [^ \n\t<>], which works well for now, but I want the expression to not match if the < or > is preceded by a \.
I tried [^ \n\t[^\\]<[^\\]>], but this did not work.
Can any one of the sequences below occur in your input?
\\>
\\\>
\\\\>
\blank
\tab
\newline
...
If so, how do you propose to treat them?
If not, then zero-width look-behind assertions will do the trick, provided that your regular expression engine supports it. This will be the case in any engine that supports Perl-style regular expressions (including Perl's, PHP, etc.):
(?<!\\)[ \n\t<>]
The above will match any unescaped space, newline, tab or angle bracket. More generically (using \s to denote any space characters, including \r):
(?<!\\)\s
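To make the look-behind concrete, here is a minimal sketch using Python's re module as a stand-in for a Perl-style engine (an assumption; the question is about grep). It uses the first pattern above to split a string on unescaped delimiters; the helper name split_unescaped and the sample input are invented for the example.

```python
import re

# A space, newline, tab, "<" or ">" counts as a delimiter unless a
# backslash comes immediately before it.
UNESCAPED_DELIM = re.compile(r'(?<!\\)[ \n\t<>]')

def split_unescaped(text):
    """Split on unescaped delimiters and drop empty pieces."""
    return [piece for piece in UNESCAPED_DELIM.split(text) if piece]

print(split_unescaped(r'a\<b <c> d\ e'))  # ['a\\<b', 'c', 'd\\ e']
```

As with the sequences listed at the top of this answer, an input like \\> is still treated as escaped here, because the look-behind only checks the single preceding character.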
Alternatively, using complementary notation without the need for a zero-width look-behind assertion (but arguably less efficiently):
(?:[^ \n\t<>]|\\[<>])
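You may also use a variation of the latter to handle the \\>, \\\>, \\\\> etc. cases as well, up to some finite number of preceding backslashes. An odd run of one to nine backslashes has to be written as \\(?:\\\\){0,4}, since a quantifier cannot list individual counts such as {1,3,5,7,9}:
(?:[^ \n\t<>]|(?:^|[^<>])\\(?:\\\\){0,4}[<>])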
According to the grep man page:
A bracket expression is a list of
characters enclosed by [ and ]. It
matches any single character in that
list; if the first character of the
list is the caret ^ then it matches
any character not in the list.
This means that a bracket expression can't match a sequence of characters such as \< or \>, only single characters.
If your version of grep is built with Perl regex support (the -P option in GNU grep), you can use lookarounds, as one of the other posters mentioned. Not all versions of grep have this support, though.
Maybe you can use egrep and put your pattern string inside quotes. This should obviate the need for escaping.