How do regular expressions work in selenium? - regex

I want to store part of an id, and throw out the rest. For example, I have an html element with an id of 'element-12345'. I want to throw out 'element-' and keep '12345'. How can I accomplish this?
I can capture and echo the value, like this:
| storeAttribute | //pathToMyElement#id | myId |
| echo | ${!-myId-!} | |
When I run the test, I get something like this:
| storeAttribute | //pathToMyElement#id | myId |
| echo | ${myId} | element-12345 |
I'm recording with the Selenium IDE, and copying the test over into Fitnesse, using the Selenium Bridge fixture. The problem is I'm using a clean database each time I run the test, with random ids that I need to capture and use throughout my test.

The solution is to use the JavaScript replace() function with storeEval:
| storeAttribute | //pathToMyElement#id | elementID |
| storeEval | '${elementID}'.replace("element-", "") | myID |
Now if I echo myID I get just the ID:
| echo | ${myID} | 12345 |

/element-(\d+)/i
That's a regular expression that would capture the numbers after the dash.
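For comparison outside Selenium, here is a minimal Python sketch of the same capture. The helper name `numeric_part` is hypothetical; the pattern is the one quoted above, and the `i` flag maps to `re.IGNORECASE`.

```python
import re

# Same pattern as /element-(\d+)/i; group 1 holds the digits after the dash.
ID_PATTERN = re.compile(r"element-(\d+)", re.IGNORECASE)


def numeric_part(element_id):
    """Return the digits after 'element-', or None if the id does not match."""
    match = ID_PATTERN.search(element_id)
    return match.group(1) if match else None


print(numeric_part("element-12345"))
```

Returning None for a non-matching id is a design choice for this sketch; a test script might prefer to fail loudly instead.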

Something like this might work:
| storeAttribute | fn:replace(//pathToMyElement#id,"^element-","") | myId |
Regex functions such as fn:replace require XPath 2.0; I'm not sure which version Selenium implements.

Related

Matching a string where the initial part is variable and the end part is fixed

The following is the list of instance names from the output of the nova command.
nova list
+--------------------------------------+-----------------------------------------+--------+------------+-------------+------------------------------------------+
| ID | Name | Status | Task State | Power State | Networks |
+--------------------------------------+-----------------------------------------+--------+------------+-------------+------------------------------------------+
| 6cdc00a7-cfe3-4bfe-bbb1-7980ac1c04c0 | haproxy-instance-vms22updateconfar | ACTIVE | - | Running | Orch-Mgmt=10.32.1.40 |
| d0528617-39cd-4098-b34c-0977f5a18414 | gunicon-instance-vms22updateconfar | ACTIVE | - | Running | vms2.1-net=192.168.0.248 |
| e89dd43d-8021-47c6-9f55-39d8bce3c11b | nsoshim-instance-vms22updateconfar | ACTIVE | - | Running | App-Mgmt=10.20.0.126 |
| b7ea9059-834c-4196-8706-54cfaab3d177 | haproxy-instance-vms22update | ACTIVE | - | Running | App-Mgmt=10.20.0.89 |
| 2d4d22e5-b844-413f-8d36-f8b3eb3dea32 | gunicon-instance-vms22update | ACTIVE | - | Running | App-Mgmt=10.20.0.46 |
| 41c4fdc0-3058-4e39-8207-2c02a611ee22 | nsoshim-instance-vms22update | ACTIVE | - | Running | App-Mgmt=10.20.0.217 |
SUBDOMAIN=vms22update
nova list | grep "\-instance-$SUBDOMAIN"
gunicon-instance-vms22updateconfar
haproxy-instance-vms22updateconfar
nsoshim-instance-vms22updateconfar
gunicon-instance-vms22update
haproxy-instance-vms22update
nsoshim-instance-vms22update
I want to see only the instances whose names end with vms22update.
I tried nova list | grep "-instance-^$SUBDOMAIN$"
but it does not list anything.
@Chris_vr: Thanks for the hint. Posting my comment as an answer:
You could try this:
nova list | awk -F"|" '{print $3}' | sed 's/ *$//' | grep -E "vms22update\$"
Get the output of nova list
Split each line on | and print the third field (the Name column)
Strip trailing whitespace
grep for lines ending with vms22update
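The same steps can be sketched in Python, assuming the table text has already been captured as a string. The function name and the shortened sample table (with made-up short IDs) are hypothetical; only the instance names come from the question.

```python
def instance_names_ending_with(table_text, suffix):
    """Return the Name column values that end with the given suffix."""
    names = []
    for line in table_text.splitlines():
        fields = line.split("|")
        if len(fields) < 3:
            continue  # border rows ("+----+") and blank lines contain no "|"
        name = fields[2].strip()  # third field, like awk's $3
        if name.endswith(suffix):
            names.append(name)
    return names


# Shortened, hypothetical sample in the same layout as the nova output.
SAMPLE = """\
+----+------------------------------------+--------+
| ID | Name                               | Status |
+----+------------------------------------+--------+
| 1  | haproxy-instance-vms22updateconfar | ACTIVE |
| 2  | haproxy-instance-vms22update       | ACTIVE |
| 3  | gunicon-instance-vms22update       | ACTIVE |
+----+------------------------------------+--------+
"""

matches = instance_names_ending_with(SAMPLE, "vms22update")
print(matches)
```

Using a plain suffix check on the stripped field avoids the regex anchoring problem entirely, since the padding before the next | is removed first.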

Perl regex nested grouping results

I have files like this:
mu (micro) | 10^(-6) | millionth
m (milli) | 0.001 | thousandth
k (kilo) | 10^3 | thousand
M (mega) | 10^6 | million
And I would like to produce files like:
| $mu (micro)$ | $10^(-6)$ | $millionth$ |
| $m (milli)$ | $0.001$ | $thousandth$ |
| $k (kilo)$ | $10^3$ | $thousand$ |
| $M (mega)$ | $10^6$ | $million$ |
I'm trying to use a Perl regex. So far the best expression I could come up with is:
perl -lpe '(([[:alnum:][:punct:]\s]+)\s+|\|\s*([[:alnum:][:punct:]\s]+)\s*\||\s*([[:alnum:][:punct:]\s]+))'
I know it's got a few redundant \s+, but when I tried removing them the result was worse. Currently it only separates each line into two parts:
mu (micro) | 10^(-6) |
millionth
So how can I improve upon this, to get the desired result? I know I can use s/foo/bar/g to replace it but I can't get the expression to separate properly. Also how will I access the nested groups?
Perhaps there is a better way to do this, I'm open to suggestions.
perl -lpe '$_ = "| " . join(" | ", map "\$$_\$", split / \| /) . " |"'
In words: Split each line into fields (on |), wrap each field in $...$, join the fields with |, and add a | at the beginning and end.
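The split, wrap, and join idea is not specific to Perl. A rough Python equivalent, assuming fields are separated by exactly " | " as in the sample lines (the function name `wrap_fields` is hypothetical):

```python
def wrap_fields(line):
    """Split on ' | ', wrap each field in $...$, and re-join with outer pipes."""
    fields = line.split(" | ")
    wrapped = [f"${field}$" for field in fields]
    return "| " + " | ".join(wrapped) + " |"


for sample in ["mu (micro) | 10^(-6) | millionth", "k (kilo) | 10^3 | thousand"]:
    print(wrap_fields(sample))
```

Because nothing is captured with groups, the number of columns does not need to be known in advance.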
perl -pi -e 's/^(\S+ +\S+) +\| +(\S+) +\| +(\S+)$/| \$$1\$ | \$$2\$ | \$$3\$ |/g'

Regex priority of match (forward and rear looking regex)

I have a monster regex at the moment and am looking at how to make it work best.
My regex is listed below, and I am curious whether there is a way to prioritize alternatives within one regex rather than just taking a match wherever it happens to occur.
Example:
If my string has a match for ([\d]+/[\d]+) or ([\d]+ / [\d]+), it would pick that first.
If that match does not exist but ([\d]+-[\d]+) or ([\d]+ - [\d]+) does, it would pick that match.
After that, if ([\d]+) matches, it would pick that as the end marker. If none of those exist, it would move on to any of the other matches.
So my question is:
With regex, is there any way to prioritize which match to take first?
Example: some of my address strings are in the format 1 - 12 example street,
and the regex often pulls 12 example street rather than 1 - 12 example street.
Thanks!
The full regex is listed below:
New Regex("( ([\d]+) | ([\d]+-[\d]+) | ([\d]+ - [\d]+) | CAR
SMOULDERING | GAS BOTTLE EXPLOSION | INPUT | OFF | OPPOSITE | CNR |
SPARKING | INCIC1 | INCIC3 | STRUC1 | STRUC3 | G&SC1 | G&SC3 | ALARC1 |
ALARC3 | NOSTC1| NOSTC3 | RESCC1 | RESCC3 | HIARC1 | HIARC3 | CAR
ACCIDENT - POSS PERSON TRAPPED | EXPLOSIONS HEARD | WASHAWAY AS A
RESULT OF ACCIDENT | ENTRANCE | ENT |FIRE| LHS | RHS | POWER LINES
ARCING AND SPARKING | SMOKE ISSUING FROM FAN | CAR FIRE | FIRE ALARM
OPERATING | GAS LEAK | GAS PIPE | NOW OUT | ACCIDENT | SMOKING | ROOF |
GAS | REQUIRED | FIRE | LOCKED IN CAR | SMOKE RISING | SINGLE CAR
ACCIDENT | ACCIDENT | FIRE)(.*?)(?=\SVSE| M | SVC | SVSW | SVNE | SVNW
)", RegexOptions.RightToLeft)
Change the order of the first three alternatives, so the longer patterns are tried before the bare number:
(\d+-\d+) | (\d+ - \d+) | (\d+ )
instead of:
([\d]+) | ([\d]+-[\d]+) | ([\d]+ - [\d]+)
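The effect of alternative order can be seen in a small Python sketch. This is only an illustration: Python's re module scans left to right and has no counterpart to RegexOptions.RightToLeft, so it shows the ordering rule rather than the exact behavior of the .NET regex in the question. The address string is taken from the question; the variable names are hypothetical.

```python
import re

ADDRESS = "1 - 12 example street"

# Bare number listed first: it matches at the first digit and wins immediately.
short_first = re.search(r"\d+|\d+-\d+|\d+ - \d+", ADDRESS)

# Longer range patterns listed first: they get the chance to match the whole range.
long_first = re.search(r"\d+ - \d+|\d+-\d+|\d+", ADDRESS)

print(repr(short_first.group()))  # '1'
print(repr(long_first.group()))   # '1 - 12'
```

At any given starting position the engine takes the first alternative that succeeds, so more specific patterns belong before the general ones.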

How to remove words of a line upto specific character pattern...Regex

I want the words after the word "test" from each line in a file. In other words, I don't want the words that come before "test".
That's the pattern.
e.g:
Input:
***This is a*** test page.
***My*** test work of test is complete.
Output:
test page.
work of test is complete.
Using sed:
sed -n 's/^.*test/test/p' input
If you want to print non-matching lines, untouched:
sed 's/^.*test/test/' input
The one above will remove (greedily) all text until the last test on a line. If you want to delete up to the first test use potong's suggestion:
sed -n 's/test/&\n/;s/.*\n//p' input
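The difference between cutting up to the last "test" and cutting up to the first one comes down to greedy versus lazy matching. A short Python sketch, using the second sample line from the question (variable names are hypothetical):

```python
import re

LINE = "My test work of test is complete."

# Greedy .* runs as far as it can, so the match ends at the LAST "test".
from_last = re.sub(r"^.*test", "test", LINE)

# Lazy .*? stops as early as it can, so the match ends at the FIRST "test".
from_first = re.sub(r"^.*?test", "test", LINE)

print(from_last)   # test is complete.
print(from_first)  # test work of test is complete.
```

POSIX sed has no lazy quantifier, which is why the sed version above inserts a newline marker after the first match instead.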
A pure bash one-liner:
while read x; do [[ $x =~ test.* ]] && echo ${BASH_REMATCH[0]}; done <infile
Input: infile
This is a test page.
My test work of test is complete.
Output:
test page.
test work of test is complete.
It reads all lines from file infile, checks if the line contains the string test and then prints the rest of the line (including test).
The same in sed:
sed 's/.*\(test.*\)/\1/' infile (Oops! This is wrong! .* is greedy, so it cuts too much from the 2nd example line). This works well:
sed -e 's/\(test.*\)/\x03&/' -e 's/.*\x03//' infile
I did some speed testing (for the original (wrong) sed version). The result is that for small files the bash solution performs better. For larger files sed is better. I also tried this awk version, which is even better for big files:
awk 'match($0,"test.*"){print substr($0,RSTART)}' infile
Similar in perl:
perl -ne 's/(.*?)(test.*)/$2/ and print' infile
I used the two-line example input file and doubled its size for each row. Each version was run 1000 times. The result is:
Size | bash | sed | awk | perl
[B] | [sec] | [sec] | [sec] | [sec]
------------------------------------------
55 | 0.420 | 10.510 | 10.900 | 17.911
110 | 0.460 | 10.491 | 10.761 | 17.901
220 | 0.800 | 10.451 | 10.730 | 17.901
440 | 1.780 | 10.511 | 10.741 | 17.871
880 | 4.030 | 10.671 | 10.771 | 17.951
1760 | 8.600 | 10.901 | 10.840 | 18.011
3520 | 17.691 | 11.460 | 10.991 | 18.181
7040 | 36.042 | 12.401 | 11.300 | 18.491
14080 | 72.355 | 14.461 | 11.861 | 19.161
28160 |145.950 | 18.621 | 12.981 | 20.451
56320 | | | 15.132 | 23.022
112640 | | | 19.763 | 28.402
225280 | | | 29.113 | 39.203
450560 | | | 47.634 | 60.652
901120 | | | 85.047 |103.997

Regex named grouping

Can you have dynamic naming in regex groups? Something like
reg = re.compile(r"(?PText|Or|Something).*(?PTextIWant)")
r = reg.find("TextintermingledwithTextIWant")
r.groupdict()["Text"] == "TextIWant"
So that, depending on what the beginning was, group["Text"] == TextIWant
Updated to make the question clearer.
Some regex engines support this, some don't. This site says that Perl, Python, PCRE (and thus PHP), and .NET support it, all with slightly different syntax:
+--------+----------------------------+----------------------+------------------+
| Engine | Syntax | Backreference | Variable |
+--------+----------------------------+----------------------+------------------+
| Perl | (?<name>...), (?'name'...) | \k<name>, \k'name' | %+{name} |
| | (?P<name>...) | \g{name}, (?&name)* | |
| | | (?P>name)* | |
+--------+----------------------------+----------------------+------------------+
| Python | (?P<name>...) | (?P=name), \g<name> | m.group('name') |
+--------+----------------------------+----------------------+------------------+
| .NET | (?<name>...), (?'name'...) | \k<name>, \k'name' | m.Groups['name'] |
+--------+----------------------------+----------------------+------------------+
| PCRE | (?<name>...), (?'name'...) | \k<name>, \k'name' | Depends on host |
| | (?P<name>...) | \g{name}, \g<name>* | language. |
| | | \g'name'*, (?&name)* | |
| | | (?P>name)* | |
+--------+----------------------------+----------------------+------------------+
This is not a complete list, but it's what I could find. If you know more flavors, add them! The backreference forms with a * are those which are "recursive" as opposed to just a back-reference; I believe this means they match the pattern again, not what was matched by the pattern. Also, I arrived at this by reading the docs, but there could well be errors—this includes some languages I've never used and some features I've never used. Let me know if something's wrong.
Your question is worded kind of funny, but I think what you are looking for is a non-capturing group. Make it like this:
(?:Must_Match_This_First)What_You_Want(?:Must_Match_This_Last)
The ?: is what designates that a group matches but does not capture.
You could first build the string in a dynamic way and then pass it to the Regex engine.
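A minimal Python sketch of that idea, assuming the group name is known before the pattern is compiled. The function `build_pattern` and its parameters are hypothetical; the sample strings come from the question.

```python
import re


def build_pattern(group_name, prefixes, wanted):
    """Build a regex whose single named group is chosen at runtime."""
    # Escape the literal pieces so they cannot inject regex syntax.
    alternatives = "|".join(re.escape(prefix) for prefix in prefixes)
    # The prefix is matched with a non-capturing group; only `wanted` is captured.
    # group_name must be a valid Python identifier to be usable as a group name.
    return re.compile(rf"(?:{alternatives}).*(?P<{group_name}>{re.escape(wanted)})")


pattern = build_pattern("Text", ["Text", "Or", "Something"], "TextIWant")
match = pattern.search("TextintermingledwithTextIWant")
result = match.groupdict()
print(result)
```

If the group name has to depend on which prefix actually matched, one option is a first pass that captures the prefix, followed by a second pattern built from it.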