Regular expression with drools - regex

I have a multiline string, as below.
rawMessage=sysUpTimeInstance-->0:0:00:05.00
snmpTrapOID.0-->linkDown.0.0
In the when part of my Drools rule I have written the condition as below.
rawMessage matches "(?i).*linkDown(.|\n|\r)*"
but it is not working. Please give me some pointers on handling multiline input.

It's not clear to me what you want to achieve. Your regex does not look wrong at first glance (I don't know the Drools flavour or what you want to match).
In general, (.|\n|\r)* can match any character, including newlines. In your example there is no newline after "linkDown", so what should it match there?
Maybe you need to double-escape (I don't know about Drools), like this: (.|\\n|\\r)*.
Another possibility is the single-line modifier s (again, I don't know whether Drools supports it). It makes . match newline characters too, so the condition could look something like this:
rawMessage matches "(?i)(?s).*linkDown.*"
or, if it should only match across lines from "linkDown" on:
rawMessage matches "(?i).*linkDown(?s).*"
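To see why the single-line modifier matters here, a minimal sketch in Python follows. It assumes that re.fullmatch is a fair stand-in for a whole-string matches check (an assumption about how the Drools operator behaves, based on Java's String.matches) and uses the sample value from the question. Python rejects global flags in the middle of a pattern on recent versions, so only the first variant is shown.

```python
import re

# Sample value from the question: two lines joined by a newline.
raw_message = "sysUpTimeInstance-->0:0:00:05.00\nsnmpTrapOID.0-->linkDown.0.0"

# The original condition: the leading .* cannot cross the newline,
# so it never reaches "linkDown" on the second line.
original = re.compile(r"(?i).*linkDown(.|\n|\r)*")

# With DOTALL switched on, . also matches newlines.
# (?is) is shorthand for (?i)(?s).
dotall = re.compile(r"(?is).*linkDown.*")

# fullmatch requires the whole string to match, like a whole-input check.
original_matches = original.fullmatch(raw_message) is not None
dotall_matches = dotall.fullmatch(raw_message) is not None

print(original_matches, dotall_matches)
```

The first pattern fails because of the part before linkDown, not the part after it.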

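Drools uses standard Java regular expressions. Your expression looks wrong as written: the leading .* does not match line terminators, so it cannot reach linkDown on the second line. The (?s) variant from the previous answer addresses that. And yes, you need to double-escape special characters as you would in Java. Check the Javadoc for the Pattern class in the Java API.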

Related

Brackets within a Regex string

I'm trying to use a regular expression to match on a string. Brackets are special characters within a regex, and I'm unsure how to include them in mine.
To provide more context, I want to find a string such as test[test]
My regex currently looks like this: ^*test[test]. My real expression is much more built out than this, but this example is enough to understand the problem.
How can I search for brackets in my string without triggering a character class? I need to use a regex, so please don't recommend switching to something else.
You can escape a character with a backslash, so \[
I can highly recommend https://regex101.com/ to test your regex without having to code it.
Try: ^.*test\[test\] - This means {start of line}, {anything}, "test[test]".
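A small sketch in Python of the difference between escaped and unescaped brackets. The helper name found and the sample strings are made up for illustration; re.escape is the standard-library way to build the escaped form from a literal string.

```python
import re

# Escaped brackets are matched literally.
escaped = re.compile(r"^.*test\[test\]")

# Unescaped, [test] is a character class: exactly one of t, e, or s.
unescaped = re.compile(r"^.*test[test]")

# re.escape produces the escaped form of a literal string.
built = re.compile("^.*" + re.escape("test[test]"))


def found(pattern, text):
    """Return True if the pattern matches somewhere in text."""
    return pattern.search(text) is not None


print(found(escaped, "my test[test]"))
print(found(unescaped, "my test[test]"))
print(found(unescaped, "my tests"))
print(found(built, "my test[test]"))
```

The unescaped version silently matches a different set of strings ("tests", "teste", "testt") instead of raising an error, which is why it is easy to miss.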

Confusion regarding regex pattern

I have tried to write a regex to catch certain words in a sentence, but it is not working. The regex below only works when I give an exact match.
[\s]*((delete)|(exec)|(drop\s*table)|(insert)|(shutdown)|(update)|(\bor\b))
Let's say I send an HTTP header: headerName = insert works,
but headerName = awesome insert number does not.
--edit--
#user1180, yes, I can use prepared statements, but we are also looking into the regex part.
#Marcel and Wiktor, yes, it works on that website. I guess my tool is not recognizing the regex. I am using Mulesoft ESB, whose matches operator, per its documentation, "Matches when the evaluated value fits a given regular expression (regex), specifically a regex "flavor" supported by Java."
It is used something like this:
matches /\+(\d+)\s\((\d+)\)\s(\d+\-\d+)/
and I do not know how to write my use case in this regex format.
My use case is to catch SQL injection patterns, checking the request header/query param for delete (exec)(drop\s*table)(insert)(shutdown)(update)or parameters.
Since your regex must match the whole input you need to wrap the pattern with .*, something similar to (?s).*(<YOUR PATTERN>).*.
Use
(?s).*\b(delete|exec|drop\s+table|insert|shutdown|update|or)\b.*
Details
(?s) - turns on DOTALL mode where . matches any char
.* - any 0+ chars, as many as possible
\b(delete|exec|drop\s+table|insert|shutdown|update|or)\b - any one of the whole words (note \b is a word boundary construct) in the group
.* - any 0+ chars, as many as possible
I also replaced drop\s*table with drop\s+table since I guess droptable is not expected.
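A rough sketch of the difference in Python, assuming re.fullmatch behaves like the whole-input matches check described in the question. The function names and sample header values are hypothetical.

```python
import re

# The pattern from the question: it only describes the keyword itself.
bare = re.compile(r"[\s]*((delete)|(exec)|(drop\s*table)|(insert)|(shutdown)|(update)|(\bor\b))")

# The wrapped pattern from the answer: any text is allowed around the keyword.
wrapped = re.compile(r"(?s).*\b(delete|exec|drop\s+table|insert|shutdown|update|or)\b.*")


def bare_hit(value):
    """Whole-input match against the original pattern."""
    return bare.fullmatch(value) is not None


def wrapped_hit(value):
    """Whole-input match against the wrapped pattern."""
    return wrapped.fullmatch(value) is not None


for value in ["insert", "awesome insert number", "inserted rows", "droptable"]:
    print(repr(value), bare_hit(value), wrapped_hit(value))
```

The wrapped pattern is case-sensitive as written, so INSERT would slip through; adding the i flag, as in (?si), would be one way to cover that.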

A regular expression that matches two long strings and ignores everything in between

I am searching through a 1.5 million line Premiere Pro project for any text that matches one of my audio filters and is set to mono.
The text that I am searching for begins with the <ChannelType> tag and ends with the <FilterMatchName> tag. So it would look like this:
<ChannelType>0</ChannelType>
<FrameRate>5292000</FrameRate>
</AudioComponent>
<FilterPreset>0</FilterPreset>
<OpaqueData Encoding="base64" Checksum="53060659">AAAAAD8L8lo+AUr+Pac1NjwTmoUAAAAAP0uQDD37nIg9ui6MPjwU5j+AAAA+C/JaAAAAAD8qqqsAAAAAP4AAAD92L8w9py8FAAAAAHNvZnQgY29tcHJlc3Npb24AIiBkZWZhdWx0PSIwIiBzdGVwPSIxIiBtaW49IjAiIG1heD0iMSIvPgoJICA8Zmw=</OpaqueData>
<FilterIndex>-1</FilterIndex>
<FilterMatchName>1094998321 Dynamics1</FilterMatchName>
If I were in a Word doc, I would just do a find as
<ChannelType>0</ChannelType>*<FilterMatchName>1094998321 Dynamics1</FilterMatchName>
I am terrible with Regex. I was hoping someone could help me out. Everything I have tried either doesn't match anything, or matches EVERYTHING in the document. I am using Notepad++.
Since you are working in Notepad++, you have access to PCRE-style regular expressions. This one will get all the text between <ChannelType> and </FilterMatchName>:
(?s)<ChannelType>.*?</FilterMatchName>
The (?s) allows the . to match newline characters.
After matching <ChannelType>, the .*? lazily matches all characters up to the closing </FilterMatchName>, which we then match.
Let me know if you have any questions. :)
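To illustrate the lazy quantifier, here is a minimal Python sketch. The project text is a shortened, made-up stand-in for the real file, and the second block with "Other Filter" is invented sample data.

```python
import re

# Hypothetical, shortened stand-in for the project file: two filter blocks.
project = (
    "<ChannelType>0</ChannelType>\n"
    "<FilterIndex>-1</FilterIndex>\n"
    "<FilterMatchName>1094998321 Dynamics1</FilterMatchName>\n"
    "<ChannelType>1</ChannelType>\n"
    "<FilterIndex>-1</FilterIndex>\n"
    "<FilterMatchName>Other Filter</FilterMatchName>\n"
)

# Lazy: each match stops at the first closing tag it reaches.
lazy = re.findall(r"(?s)<ChannelType>.*?</FilterMatchName>", project)

# Greedy: one match that runs to the last closing tag in the text.
greedy = re.findall(r"(?s)<ChannelType>.*</FilterMatchName>", project)

print(len(lazy), len(greedy))
```

The greedy form is the likely cause of a search that "matches EVERYTHING in the document".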
What type of regular expressions are you using (which language/library)?
Basically, you can use .* instead of * in regular expressions. If your text is long, though, it's better to use a reluctant quantifier [1], if your regex implementation allows it.
This is a good site with a comparison of different regex implementations, plus tutorials:
http://www.regular-expressions.info
[1] http://docs.oracle.com/javase/tutorial/essential/regex/quant.html

Regex Extract in Google Docs for capturing the end of variable strings

In Google Docs, I have a series of strings like "Something.Here.Search.Term.Chicago", where the last component after "Term." can be anything.
How do I use regex extract to capture only what comes after "Term."?
Note that the length of the string before "Term." varies, so I can't use Left or Right with a fixed position.
You can use a positive look-behind as well, to avoid having to capture with groups:
/(?<=Term\.).*/
Though depending on the engine you are using, look-behinds may not be supported (older JavaScript engines lack them, as does the RE2 engine behind Google's regex functions).
If you don't want to mess about with capturing groups and you know the component you want is the substring between the last . and the end of the string, you could use
[^.]+$
Here's what worked for me using your sample data:
=REGEXREPLACE(A1; ".*Term\.(.*)" ; "$1")
I don't know Google Docs, but normally in regular expressions, you would do
"Something\.Here\.Search\.Term\.(.*)"
The () means capture and remember the pattern within. In this case .* means everything. You can usually access the captured text as $1, etc., in JavaScript.
See Examples of Regular Expressions
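For comparison, a short Python sketch of three of the approaches above applied to the sample string. Python's re module is used only for illustration; it supports look-behind, which the spreadsheet's own engine may not, so treat the look-behind line as a demonstration of the pattern rather than something guaranteed to work in Google Docs.

```python
import re

text = "Something.Here.Search.Term.Chicago"

# Capturing group: everything after the literal "Term.".
via_group = re.search(r"Term\.(.*)", text).group(1)

# Positive look-behind: match only what follows "Term.".
via_lookbehind = re.search(r"(?<=Term\.).*", text).group(0)

# Last component: everything after the final dot.
via_last_part = re.search(r"[^.]+$", text).group(0)

print(via_group, via_lookbehind, via_last_part)
```

The first two differ from the third when the tail itself contains dots: for "A.Term.New.York" the group and look-behind give "New.York", while [^.]+$ gives only "York".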
What about using a "look-ahead" expression (?=),
then something repeated followed by a word boundary?
Something like this:
(?=Term\\.).*\W

Regex match first characters of string

I am trying to create a regex that will match the first 3 characters of a string.
If I have a string ABCFFFF, I want to verify that the first 3 characters are ABC.
It's pretty straightforward, the pattern would be ^ABC
As others may point out, using regular expressions for such a task is overkill. Any programming language with regex support can do it better with simple string manipulation.
Just simple regex will work:
/^ABC/
But I am not sure this is a good use case for regex. Consider using substring in whatever language/platform you're using.
"^ABC" should work. '^' matches the start in most regex implementations.
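Both routes, sketched in Python for illustration (the function names are made up, and the question does not say which language is in use):

```python
import re


def starts_with_abc_regex(text):
    """Regex route: ^ anchors the pattern at the start of the string."""
    return re.search(r"^ABC", text) is not None


def starts_with_abc_plain(text):
    """Plain string route, no regex needed."""
    return text.startswith("ABC")


for sample in ["ABCFFFF", "XABCFFF", "AB"]:
    print(sample, starts_with_abc_regex(sample), starts_with_abc_plain(sample))
```

Both give the same answer for these inputs; the plain string check is simply easier to read.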