How to match sub pattern in Robot Framework?

I am doing the following in Robot Framework:
STEP 1: I need to extract the "NUM_FLOWS" value from the following command output.
STEP 2: If it is zero (0), the test case should FAIL. If it is non-zero, the test case should PASS.
Sample command output:
router-7F2C13#show app stats gmail on TEST/switch1234-15E8CC
--------------------------------------------------------------------------------
APPLICATION BYTES_IN BYTES_OUT NUM_FLOWS
--------------------------------------------------------------------------------
gmail 0 0 4
--------------------------------------------------------------------------------
router-7F2C13#
How can I do this with the "Should Match Regexp" and "Should Match" keywords? How can I check only that number sub-pattern? (Example: in the above command output, NUM_FLOWS is non-zero, so the test case should PASS.)
Please help me to achieve this.
Thanks in advance.
My New robot file content:
Write show dpi app stats BitTorrent_encrypted on AVC/ap7532-15E8CC
${raw_text} Read Until Regexp .*#
${data[0].num_flows} 0
| | ${data}= | parse output | ${raw_text}
| | Should not be equal as integers | ${data[0].num_flows} | 0
| | ... | Expected num_flows to be non-zero but it was zero | values=False

There are many ways to solve this. A simple way is to use Robot's regular expression keywords to look for "gmail" at the start of a line, followed by two numbers and then the number 0 (zero) at the end of the line. If that pattern matches, NUM_FLOWS is zero and the test should fail. This assumes that a) NUM_FLOWS is always the last column, and b) there is only one line that begins with "gmail". I don't know if those are valid assumptions or not.
Because the data spans multiple lines, the pattern includes (?m) (the multiline flag) so that $ means "end of line" in addition to "end of string". The backslashes are doubled because Robot Framework treats a single backslash as its own escape character.
| | Should not match regexp | ${data} | (?m)\\s+gmail\\s+\\d+\\s+\\d+\\s+0\\s*$
| | ... | Expected non-zero value in the fourth column for gmail, but it was zero.
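To try the pattern outside of Robot, here is a minimal Python sketch of the same check. The pattern is the one from the keyword call above, written with single backslashes because this is a Python raw string. The names SAMPLE_OUTPUT, ZERO_FLOWS and has_zero_flows are made up for this illustration.

```python
import re

# Sample output copied from the question.
SAMPLE_OUTPUT = """router-7F2C13#show app stats gmail on TEST/switch1234-15E8CC
--------------------------------------------------------------------------------
APPLICATION BYTES_IN BYTES_OUT NUM_FLOWS
--------------------------------------------------------------------------------
gmail 0 0 4
--------------------------------------------------------------------------------
router-7F2C13#"""

# (?m) makes $ match at the end of every line, not only at the end of the text.
ZERO_FLOWS = re.compile(r"(?m)\s+gmail\s+\d+\s+\d+\s+0\s*$")


def has_zero_flows(output):
    """Return True if the gmail row ends with a NUM_FLOWS value of 0."""
    return ZERO_FLOWS.search(output) is not None


if __name__ == "__main__":
    print(has_zero_flows(SAMPLE_OUTPUT))
```

The test should fail exactly when has_zero_flows returns True, which is what "Should not match regexp" expresses on the Robot side.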
There are plenty of other ways to solve the problem. For example, if you need to check for other values in other columns, you might want to write a python keyword that parses the data and returns some sort of data structure.
Here's a quick example. It's not bulletproof, and makes some assumptions about the data passed in. I wouldn't use it in production, but it illustrates the technique. The keyword returns a list of items, and each item is a custom object with four attributes: name, bytes_in, bytes_out and num_flows:
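# python library
import re

def parse_output(data):
    class Data(object):
        def __init__(self, raw_text):
            # split on runs of whitespace; r'\s*' can match the empty
            # string and splits between every character on Python 3.7+
            columns = re.split(r'\s+', raw_text.strip())
            self.name = columns[0]
            self.bytes_in = int(columns[1])
            self.bytes_out = int(columns[2])
            self.num_flows = int(columns[3])

    # strip first so a trailing newline does not shift the line count
    lines = data.strip().split("\n")
    result = []
    # skip the first four lines (command, rule, header, rule)
    # and the last two (rule, prompt)
    for line in lines[4:-2]:
        result.append(Data(line))
    return result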
Using it in a test:
*** Test Cases ***
| Num flows should be non-zero
| | # <put your code here to get the data from the >
| | # <router and store it in ${raw_text} >
| | ${raw_text}= | ...
| | ${data}= | parse output | ${raw_text}
| | Should not be equal as integers | ${data[0].num_flows} | 0
| | ... | Expected num_flows to be non-zero but it was zero | values=False

Related

Print String array of a json payload in splunk

I need to print a string array along with one field in my json object.
The data:
{ "key1":"val1", "key2":"value2", "codes":["apple","mango","banana","orange"], "key3_conditional":"yes"}
My Search query:
<My search query>
| rex "\|(?<payload>[^\|]*)$"
| spath input=payload
| rex "\"codes\":\"(?<codes>[^\"]*)"
| eval is_unknown=if(isnotnull(key3_conditional), key3_conditional, "no")
| table codes, is_unknown
Desired result
codes | is_unknown
--------------------------------------------------
apple, mango, banana, orange | yes
Currently, this only displays the 1st value in codes i.e. apple and I need all values of codes as comma separated. I'm supposing there is some issue with my regex. Please suggest.

If this data is being brought in as JSON, you won't have to rex it out.
If not, though, the issue is your regular expression.
Try it out on regex101.com. You'll see you're only grabbing the first value because you're stopping at a literal ".
Try this instead:
...
| rex field=_raw "codes\":\[(?<codes>[^\]]+)"
| eval codes=split(replace(codes,"\"",""),",")
That will make codes into a multivalue field.
If you don't care about it being multivalue, you can just do:
| eval codes=replace(codes,"\"","")
to strip the quote marks.
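To see what the corrected regular expression captures, here is a rough Python illustration of the same three steps (capture everything between the brackets, remove the quote marks, split on commas). It is not SPL and only mimics the logic; the function name extract_codes is made up for this sketch, and it assumes the values themselves contain no commas or closing brackets.

```python
import re

# The event from the question.
RAW_EVENT = ('{ "key1":"val1", "key2":"value2", '
             '"codes":["apple","mango","banana","orange"], '
             '"key3_conditional":"yes"}')


def extract_codes(raw_event):
    """Capture the text between [ and ] after "codes": and split it into values."""
    match = re.search(r'codes":\[([^\]]+)', raw_event)
    if match is None:
        return []
    # Same idea as split(replace(codes, "\"", ""), ",") in the eval above.
    return match.group(1).replace('"', '').split(',')


if __name__ == "__main__":
    print(", ".join(extract_codes(RAW_EVENT)))
```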

Regex to exclude a substring

I am trying to match a param string but exclude any matches when a substring is present.
From my limited regex knowledge this should work to exclude any string containing "porcupine", but it doesn't. What am I doing wrong?
(\/animal\?.*(?!porcupine).*color=white)
Expected Outcome
| string | matches? |
| ----------------------------------------------- | -------- |
| /animal?nose=wrinkly&type=porcupine&color=white | false |
| /animal?nose=wrinkly&type=puppy&color=white | true |
Actual Outcome
| string | matches? |
| ----------------------------------------------- | -------- |
| /animal?nose=wrinkly&type=porcupine&color=white | true |
| /animal?nose=wrinkly&type=puppy&color=white | true |

Use a Tempered Greedy Token:
/animal\?(?:(?!porcupine).)*color=white
Demo & explanation
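A small Python check of the tempered greedy token against the two strings from the question. The names TEMPERED and matches are made up for this sketch. The group (?:(?!porcupine).)* consumes one character at a time, and only while "porcupine" does not start at the current position, so the match can never run across that word.

```python
import re

# Each "." is guarded by the negative lookahead before it is consumed.
TEMPERED = re.compile(r'/animal\?(?:(?!porcupine).)*color=white')


def matches(url):
    """Return True if the URL matches and "porcupine" does not precede color=white."""
    return TEMPERED.search(url) is not None


if __name__ == "__main__":
    for url in ('/animal?nose=wrinkly&type=porcupine&color=white',
                '/animal?nose=wrinkly&type=puppy&color=white'):
        print(url, matches(url))
```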

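The .* matches any character any number of times, greedily. The lookahead is therefore only tested at the position where the first .* happens to end, and the engine can always backtrack to a position where "porcupine" does not start. You could replace the first .* with literal text so that the lookahead is tested at a fixed position: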
(\/animal\?nose=wrinkly\&type=(?!porcupine).*color=white)
See example here: https://regex101.com/r/HJiM2N/1

This may seem overly verbose but it is actually relatively efficient in the number of steps:
(?!\/animal.*?porcupine.*?color)\/animal\?.*color=white
See Regex Demo
If the input string consists of exactly one occurrence of what you are trying to match and nothing else, then just use the following to ensure that porcupine does not occur anywhere in the input string:
(?!.*porcupine)\/animal\?.*color=white
The code:
import re

tests = [
    '/animal?nose=wrinkly&type=porcupine&color=white',
    '/animal?nose=wrinkly&type=puppy&color=white'
]
rex = r'(?!\/animal.*?porcupine.*?color)\/animal\?.*color=white'
for test in tests:
    m = re.search(rex, test)
    print(test, 'True' if m else 'False')
Prints:
/animal?nose=wrinkly&type=porcupine&color=white False
/animal?nose=wrinkly&type=puppy&color=white True

How do I select a substring using a regexp in robot framework

In the Robot Framework library called String, there are several keywords that allow us to use a regexp to manipulate a string, but these manipulations don't seem to include selecting a substring from a string.
To clarify, what I intend is to have a price, i.e. € 1234,00 from which I would like to select only the 4 primary digits, meaning I am left with 1234 (which I will convert to an int for use in validation calculations). I have a regexp which will allow me to do that, which is as follows:
(\d+)[\.\,]
If I use Remove String Using Regexp with this regexp I will be left with exactly what I tried to remove. If I use Get Lines Matching Regexp, I will get the entire line rather than just the result I wanted, and if I use Get Regexp Matches I will get the right result except it will be in a list, which I will then have to manipulate again so that doesn't seem optimal.
Did I simply miss the keyword that will allow me to do this or am I forced to write my own custom keyword that will let me do this? I am slightly amazed that this functionality doesn't seem to be available, as this is the first use case I would think of when I think of using a regexp with a string...

You can use the Evaluate keyword to run some python code.
For example:
| Using 'Evaluate' to find a pattern in a string
| | ${string}= | set variable | € 1234,00
| | ${result}= | evaluate | re.search(r'\\d+', '''${string}''').group(0) | re
| | should be equal as strings | ${result} | 1234
Starting with Robot Framework 2.9 there is a keyword named Get Regexp Matches in the String library, which returns a list of all matches.
For example:
| Using 'Get regexp matches' to find a pattern in a string
| | ${string}= | set variable | € 1234,00
| | ${matches}= | get regexp matches | ${string} | \\d+
| | should be equal as strings | ${matches[0]} | 1234
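If you would rather wrap this in your own keyword, here is a minimal Python sketch that uses the capture group from the question's pattern and converts the result to an int. The function name leading_amount is hypothetical, and the sketch assumes the price contains no thousands separator before the decimal comma or point.

```python
import re


def leading_amount(price_text):
    """Return the digits before the first '.' or ',' as an int."""
    match = re.search(r'(\d+)[.,]', price_text)
    if match is None:
        raise ValueError("no amount found in %r" % price_text)
    # group(1) is only the digits; group(0) would include the separator.
    return int(match.group(1))


if __name__ == "__main__":
    print(leading_amount("€ 1234,00"))
```

Saved in a library file, the function would be available in tests as the keyword "Leading Amount".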

Regex - Contains pattern but not starting with xyz

I'm trying to match a number pattern in a text file.
The file can contain values such as
12345 567890
90123 string word word 54616
98765
The pattern should match on any line that contains a 5 digit number that does not start with 1234
I have tried using ((?!1234).*)[[:digit:]]{5} but it does not give the desired results.
Edit: The pattern can occur anywhere in the line and should still match
Any suggestions?

This regex should work for matching a line containing a number at least 5 digits long iff the line does not start with '12345':
^((?!12345).*\d{5}.*)$
Short explanation:
^ : match the start of a line
(?!12345) : look ahead and make sure the line does not begin with "12345"
.* : match any character sequence
\d{5} : match exactly 5 digits
.* : match any character sequence
$ : match the end of the line
EDIT:
It seems that the question has been edited, so this solution no longer reflects the OP's requirements. Still I'll leave it here in case someone looking for something similar lands on this page.

The following would work, using \b to match word boundaries such as start of string or space:
\b(?!12345)\d{5}.*
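Here is a short Python sketch of the word-boundary idea applied to the sample lines from the question. It uses the prefix 1234 from the question rather than 12345, and adds a trailing \b on the assumption that "5 digit number" means exactly five digits. The names FIVE_DIGITS_NOT_1234 and matching_lines are made up for this illustration.

```python
import re

# \b anchors at the start of a number, (?!1234) rejects the unwanted prefix,
# and the trailing \b (an assumption) rejects numbers longer than 5 digits.
FIVE_DIGITS_NOT_1234 = re.compile(r'\b(?!1234)\d{5}\b')

SAMPLE = "12345 567890\n90123 string word word 54616\n98765"


def matching_lines(text):
    """Return the lines that contain a qualifying 5 digit number anywhere."""
    return [line for line in text.splitlines()
            if FIVE_DIGITS_NOT_1234.search(line)]


if __name__ == "__main__":
    for line in matching_lines(SAMPLE):
        print(line)
```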

Try this. It uses a negative lookbehind to match a run of at least 5 decimal digits that does not end in 12345:
\d{5,}(?<!12345)

notepad++: keep regex (multi occurence per line) and line structure, remove other characters

I have a 130k line text file with patent information and I just want to keep the dates (regex "[0-9]{4}-[0-9]{2}-[0-9]{2} ") for subsequent work in Excel. For this purpose I need to keep the line structure intact (also blank lines). My main problem is that I can't seem to find a way to identify and keep multiple occurrences of date information in the same line while deleting all other information.
Original file structure:
US20110228428A1 | US | | 7 | 2010-03-19 | SEAGATE TECHNOLOGY LLC
US20120026629A1 | US | | 7 | 2010-07-28 | TDK CORP | US20120127612A1 | US | | EXAMINER | 2010-11-24 | | US20120147501A1 | US | | 2 | 2010-12-09 | SAE MAGNETICS HK LTD,HEADWAY TECHNOLOGIES INC
Desired file structure:
2010-03-19
2010-07-28 2010-11-24 2010-12-09
Thank you for your help!

Search for
.*?(?:([0-9]{4}-[0-9]{2}-[0-9]{2})|$)
And replace with
" $1"
Don't type the quotes; they are only there to show that there is a space before the $1. This will also put a space before the first match in a row.
The .*? matches as little as possible before it finds either the date or the end of the row (the $). If a date is found, it is stored in $1 because of the parentheses around it. The replacement is then a space to separate the dates, followed by the date from $1.
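If the file can also be processed outside Notepad++, here is a Python sketch that produces the desired structure. Note that it uses a different technique from the search-and-replace above: it collects the dates on each line with findall and joins them with a space, so blank lines and lines without dates stay as empty lines. The names DATE and keep_dates are made up for this illustration.

```python
import re

# Same date pattern as in the question, without the trailing space.
DATE = re.compile(r'[0-9]{4}-[0-9]{2}-[0-9]{2}')


def keep_dates(text):
    """Reduce every line to its dates, keeping the number of lines unchanged."""
    return "\n".join(" ".join(DATE.findall(line)) for line in text.split("\n"))


if __name__ == "__main__":
    sample = (
        "US20110228428A1 | US | | 7 | 2010-03-19 | SEAGATE TECHNOLOGY LLC\n"
        "US20120026629A1 | US | | 7 | 2010-07-28 | TDK CORP | US20120127612A1 "
        "| US | | EXAMINER | 2010-11-24 | | US20120147501A1 | US | | 2 "
        "| 2010-12-09 | SAE MAGNETICS HK LTD,HEADWAY TECHNOLOGIES INC"
    )
    print(keep_dates(sample))
```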
