Unable to format RegEx to handle Dollar Sign ($)

I am stuck trying to figure out how to get a RegEx text search to work with a dollar sign. Let's say I have two strings:
TestPerson One | 123456789 | ($100.00) | $0 | 03/27/2018 | Open
TestPerson Two | 987654321 | ($250.00) | ($25) | 03/27/2018 | Open
Using jQuery, I am creating the RegEx. If I was to search for TestPerson, the RegEx would look like this:
/^(?=.*\bTestPerson).*$/i
This would return both strings, as they both contain TestPerson. If I try to search for $, I get zero results even though both strings contain a $. I know the dollar sign is a special character in RegEx, but escaping it does not work either.
How can I format my RegEx to where searching for $ will return both results?
Thanks!

This looks like a multiline modifier problem. I guess you ran the regex with the multiline modifier turned off, which gives the unexpected output. Demo
If you turn the multiline modifier on, you get the output you want. Demo

To check whether or not a string contains a substring, you don't need a regex: JavaScript has the string method includes(). This method searches a string for a given value and returns true if it exists in the string and false otherwise.
var a = [
'TestPerson One | 123456789 | ($100.00) | $0 | 03/27/2018 | Open',
'TestPerson Two | 987654321 | ($250.00) | ($25) | 03/27/2018 | Open'
]
a.forEach(function(s) {
console.log(s.includes('TestPerson') && s.includes('$'))
})
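One more thing worth checking in the pattern from the question is the \b in front of the search term. A word boundary needs a word character on one side, and in both sample strings the $ is preceded by ( or a space, so \b\$ cannot match even when the $ is escaped correctly. The following is a minimal sketch in Python rather than the asker's JavaScript; build_pattern and search are hypothetical helper names, and the word_boundary switch is an assumption added for illustration.

```python
import re

rows = [
    "TestPerson One | 123456789 | ($100.00) | $0 | 03/27/2018 | Open",
    "TestPerson Two | 987654321 | ($250.00) | ($25) | 03/27/2018 | Open",
]

def build_pattern(term, word_boundary=True):
    # Hypothetical helper: mirrors the /^(?=.*\bTERM).*$/i shape from the question.
    # re.escape turns "$" into "\$" so it is treated as a literal character.
    prefix = r"\b" if word_boundary else ""
    return re.compile(r"^(?=.*" + prefix + re.escape(term) + r").*$", re.IGNORECASE)

def search(term, word_boundary=True):
    # Return every row that the generated pattern matches.
    pattern = build_pattern(term, word_boundary)
    return [row for row in rows if pattern.search(row)]

print(len(search("TestPerson")))              # word search with \b
print(len(search("$")))                       # escaped $, still with \b
print(len(search("$", word_boundary=False)))  # escaped $, no \b
```

If the search box has to accept symbols as well as words, dropping the \b (or adding it only when the term starts with a word character) is one possible design choice.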

Related

RegEx Substring Extraction

I am trying to write a RegEx on the following text:
CpuUtilization[GqIF:CA-TORONTO-1-AD-1 | FAULT-DOMAIN-3 | ocid1.image.oc1.ca-toronto-1.aaaaaaaaq4cxrudcxy5seck2cweks2zglo2tfieag6svtvqssa2zmjha | Default | ca-toronto-1 | oke-ccf3jglvbia-nc7pit2gv2a-sa65utwc32a-2 | ocid1.instance.oc1.ca-toronto-1.an2g6ljrwe6j4fqcgrlo7dmzkrtbcgr3jy35gie3qh3w65ctfh3hsd6da | VM.Standard.E2.2]
I need to extract oke-ccf3jglvbia-nc7pit2gv2a-sa65utwc32a-2 from the statement. The values in the text above can change, so I am looking for a generic RegEx.
I tried using: ([^\|]+)\|.+ which extracts only the first field, the one before the first |
Why use RegEx?
const s = 'CpuUtilization[GqIF:CA-TORONTO-1-AD-1 | FAULT-DOMAIN-3 | ocid1.image.oc1.ca-toronto-1.aaaaaaaaq4cxrudcxy5seck2cweks2zglo2tfieag6svtvqssa2zmjha | Default | ca-toronto-1 | oke-ccf3jglvbia-nc7pit2gv2a-sa65utwc32a-2 | ocid1.instance.oc1.ca-toronto-1.an2g6ljrwe6j4fqcgrlo7dmzkrtbcgr3jy35gie3qh3w65ctfh3hsd6da | VM.Standard.E2.2]'
// Split on the " | " separator and take the sixth field (index 5)
console.log(s.split(" | ")[5])
A regex solution can be
^(?:[^|]+ \| ){5}([^ ]+).*$
^ start of the string
(?:[^|]+ \| ){5} one or more characters other than |, followed by " | ", 5 times. (The ?: makes this a non-capturing group.)
([^ ]+) your string as the first group
.*$ any character to end of line
To get your string out of this, substitute it with $1 or \1.
Test it on regex101. There you can test different programming languages/regex processors.
Remark:
Like Kazi's answer, this works in this case but maybe not in others.
There are no more examples in your question.
This answer is functionally nearly the same.
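Both approaches above can be compared side by side. This is a minimal sketch in Python, assuming the field of interest is always the sixth one and the separator is always " | "; the variable names are illustrative only.

```python
import re

s = (
    "CpuUtilization[GqIF:CA-TORONTO-1-AD-1 | FAULT-DOMAIN-3 | "
    "ocid1.image.oc1.ca-toronto-1.aaaaaaaaq4cxrudcxy5seck2cweks2zglo2tfieag6svtvqssa2zmjha | "
    "Default | ca-toronto-1 | oke-ccf3jglvbia-nc7pit2gv2a-sa65utwc32a-2 | "
    "ocid1.instance.oc1.ca-toronto-1.an2g6ljrwe6j4fqcgrlo7dmzkrtbcgr3jy35gie3qh3w65ctfh3hsd6da | "
    "VM.Standard.E2.2]"
)

# Approach 1: plain split on the separator, then take index 5.
by_split = s.split(" | ")[5]

# Approach 2: skip five "field | " groups, then capture up to the next space.
match = re.match(r"^(?:[^|]+ \| ){5}([^ ]+).*$", s)
by_regex = match.group(1) if match else None

print(by_split)
print(by_regex)
```

Reading group(1) from the match object plays the same role as substituting with $1 or \1 in a regex tester.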

Splunk query not endswith

I am just starting to learn Splunk queries, and I'm trying to pull data from myfile.csv based on a regular expression.
In particular, I want to print only the rows where the fqdn column does not end with udc.net or htc.com.
Below is my query, which works, but I have to write the fqdn filter twice.
| inputlookup myfile.csv
| regex support_group="^mygroup-Linux$"
| regex u_sec_dom="^Normal Secure$"
| regex fqdn!=".*?udc.net$"
| regex fqdn!=".*?htc.com$"
| where match(fqdn,".")
I tried to combine them into one pattern separated by |, but that does not work:
| regex fqdn!="(.*?udc.net | ".*?htc.com)$"
You can do this with a search and where clause:
| inputlookup myfile.csv
| search support_group="mygroup-Linux" u_sec_dom="Normal Secure"
| where !match(fqdn,"udc.net$") AND !match(fqdn,"htc.com$")
Or just a single search clause:
| inputlookup myfile.csv
| search support_group="mygroup-Linux" u_sec_dom="Normal Secure" NOT (fqdn IN("*udc.net","*htc.com"))
You can also rewrite the IN() thusly:
(fqdn="*udc.net" OR fqdn="*htc.com")
The combined regex will work if you omit the spaces on either side of the |. The extra spaces become part of the regex and prevent matches.
There's no need for the final where command. Splunk by default will display all events that match ..
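The effect of the extra spaces around | can be reproduced outside Splunk. The sketch below uses Python's re module as a stand-in for Splunk's regex command, which is an assumption made for illustration; the host names are hypothetical, and the stray quote from the question's combined pattern is left out.

```python
import re

# Hypothetical sample values for the fqdn column.
hosts = ["host1.udc.net", "host2.htc.com", "host3.example.org"]

# Spaces around | become literal characters that the fqdn never contains.
spaced = re.compile(r"(.*?udc.net | .*?htc.com)$")
# Same alternation without the spaces (dots escaped to match a literal ".").
tight = re.compile(r"(.*?udc\.net|.*?htc\.com)$")

# Emulate "regex fqdn!=...": keep only rows that do NOT match.
kept_spaced = [h for h in hosts if not spaced.search(h)]
kept_tight = [h for h in hosts if not tight.search(h)]

print(kept_spaced)
print(kept_tight)
```

With the spaced pattern nothing matches, so the negated filter removes no rows; with the tight pattern only the host outside both domains is kept.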

Regex to exclude a substring

I am trying to match a param string but exclude any matches when a substring is present.
From my limited regex knowledge this should work to exclude any string containing "porcupine", but it does not. What am I doing wrong?
(\/animal\?.*(?!porcupine).*color=white)
Expected Outcome
| string | matches? |
| ----------------------------------------------- | -------- |
| /animal?nose=wrinkly&type=porcupine&color=white | false |
| /animal?nose=wrinkly&type=puppy&color=white | true |
Actual Outcome
| string | matches? |
| ----------------------------------------------- | -------- |
| /animal?nose=wrinkly&type=porcupine&color=white | true |
| /animal?nose=wrinkly&type=puppy&color=white | true |
Use a Tempered Greedy Token:
/animal\?(?:(?!porcupine).)*color=white
Demo & explanation
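The difference between the pattern in the question and the tempered greedy token can be checked directly. This is a small sketch in Python using the two test strings from the question; the variable names are illustrative only.

```python
import re

tests = [
    "/animal?nose=wrinkly&type=porcupine&color=white",
    "/animal?nose=wrinkly&type=puppy&color=white",
]

# Pattern from the question: the lookahead is checked at a single position,
# and the surrounding .* can always move that position away from "porcupine".
original = re.compile(r"/animal\?.*(?!porcupine).*color=white")

# Tempered greedy token: the lookahead is re-checked before every character.
tempered = re.compile(r"/animal\?(?:(?!porcupine).)*color=white")

original_results = [bool(original.search(t)) for t in tests]
tempered_results = [bool(tempered.search(t)) for t in tests]

print(original_results)
print(tempered_results)
```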
The .* matches any character any number of times, greedily, so the lookahead can always find a position where it succeeds. You could replace it with a literal match:
(\/animal\?nose=wrinkly\&type=(?!porcupine).*color=white)
See example here: https://regex101.com/r/HJiM2N/1
This may seem overly verbose but it is actually relatively efficient in the number of steps:
(?!\/animal.*?porcupine.*?color)\/animal\?.*color=white
See Regex Demo
If the input string consists of exactly one occurrence of what you are trying to match and nothing else, then just use the following to ensure that porcupine does not occur anywhere in the input string:
(?!.*porcupine)\/animal\?.*color=white
The code:
import re
tests = [
'/animal?nose=wrinkly&type=porcupine&color=white',
'/animal?nose=wrinkly&type=puppy&color=white'
]
rex = r'(?!\/animal.*?porcupine.*?color)\/animal\?.*color=white'
for test in tests:
    m = re.search(rex, test)
    print(test, 'True' if m else 'False')
Prints:
/animal?nose=wrinkly&type=porcupine&color=white False
/animal?nose=wrinkly&type=puppy&color=white True

How do I select a substring using a regexp in robot framework

In the Robot Framework library called String, there are several keywords that allow us to use a regexp to manipulate a string, but these manipulations don't seem to include selecting a substring from a string.
To clarify, what I intend is to have a price, i.e. € 1234,00 from which I would like to select only the 4 primary digits, meaning I am left with 1234 (which I will convert to an int for use in validation calculations). I have a regexp which will allow me to do that, which is as follows:
(\d+)[\.\,]
If I use Remove String Using Regexp with this regexp I will be left with exactly what I tried to remove. If I use Get Lines Matching Regexp, I will get the entire line rather than just the result I wanted, and if I use Get Regexp Matches I will get the right result except it will be in a list, which I will then have to manipulate again so that doesn't seem optimal.
Did I simply miss the keyword that will allow me to do this or am I forced to write my own custom keyword that will let me do this? I am slightly amazed that this functionality doesn't seem to be available, as this is the first use case I would think of when I think of using a regexp with a string...
You can use the Evaluate keyword to run some python code.
For example:
| Using 'Evaluate' to find a pattern in a string
| | ${string}= | set variable | € 1234,00
| | ${result}= | evaluate | re.search(r'\\d+', '''${string}''').group(0) | re
| | should be equal as strings | ${result} | 1234
Starting with robot framework 2.9 there is a keyword named Get regexp matches, which returns a list of all matches.
For example:
| Using 'Get regexp matches' to find a pattern in a string
| | ${string}= | set variable | € 1234,00
| | ${matches}= | get regexp matches | ${string} | \\d+
| | should be equal as strings | ${matches[0]} | 1234
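For reference, the Python that the Evaluate example relies on can be tried outside Robot Framework. This is a minimal sketch, assuming the price always has the shape shown in the question; it also shows the capture group from the question's own regexp and the list that a find-all style search returns.

```python
import re

price = "€ 1234,00"

# First run of digits, as in the Evaluate example.
first_digits = re.search(r"\d+", price).group(0)

# The question's regexp: capture the digits in front of "." or ",".
captured = re.search(r"(\d+)[.,]", price).group(1)

# All digit runs, comparable to a list of matches; index 0 is the whole part.
all_matches = re.findall(r"\d+", price)

# Convert to int for validation calculations.
amount = int(captured)

print(first_digits, captured, all_matches, amount)
```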

Regex - Contains pattern but not starting with xyz

I'm trying to match a number pattern in a text file.
The file can contain values such as
12345 567890
90123 string word word 54616
98765
The pattern should match on any line that contains a 5 digit number that does not start with 1234
I have tried using ((?!1234).*)[[:digit:]]{5} but it does not give the desired results.
Edit: The pattern can occur anywhere in the line and should still match
Any suggestions?
This regex should work for matching a line containing a number at least 5 digits long iff the line does not start with '12345':
^((?!12345).*\d{5}.*)$
Short explanation:
^((?!12345).*\d{5}.*)$
^ match the start of a line
(?!12345) look ahead and make sure the line does not begin with "12345"
.* match any character sequence
\d{5} match exactly 5 digits
.* match any character sequence
$ match the end of the line
EDIT:
It seems that the question has been edited, so this solution no longer reflects the OP's requirements. Still I'll leave it here in case someone looking for something similar lands on this page.
The following would work, using \b to match word boundaries such as start of string or space:
\b(?!12345)\d{5}.*
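The two patterns in this answer behave differently on the sample lines from the question, which a short test makes visible. This is a sketch in Python, assuming each line is tested on its own; the variable names are illustrative only.

```python
import re

lines = [
    "12345 567890",
    "90123 string word word 54616",
    "98765",
]

# First pattern: the whole line must not begin with 12345.
line_start = re.compile(r"^((?!12345).*\d{5}.*)$")
# Second pattern: a 5-digit run at any word boundary that is not 12345.
anywhere = re.compile(r"\b(?!12345)\d{5}.*")

line_start_hits = [bool(line_start.search(line)) for line in lines]
anywhere_hits = [bool(anywhere.search(line)) for line in lines]

print(line_start_hits)
print(anywhere_hits)
```

The first line is rejected by the anchored pattern because it begins with 12345, while the word-boundary pattern still accepts it through the second number on that line.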
Try this: it matches a run of at least 5 decimal digits that does not end in 12345, using a negative lookbehind:
\d{5,}(?<!12345)