Selenium IDE: Verifying a dynamic pattern in a website using RegExp - regex

I am trying to verify the presence of the dynamic string "6:20 AM – 6:46 AM" on this website in Selenium IDE using regular expressions, but it doesn't work. I can't use XPath because the numbers keep changing and I am only looking for certain numbers; XPath would match the string no matter what the numbers are. What is wrong with the following?
Command: verifyTextPresent
Target: regexp:[6]\:[0-9]{2} [AP]M \– [6]\:[0-9]{2} [AP]M
This question may seem simple, but in practice it is not. Please check that your solution really works on the aforementioned website. Note that my question is not only about regular expressions: I'm asking about using them in Selenium IDE.

Does your string have anything before it? Perhaps matching the leading whitespace with
\s
might be advisable?
Sunrise Today: 6:19 AM
Sunset Today: 8:38 PM
Notice how there is a gap before the 6 begins.
So a potential pattern here would be
\s[0-9]+:[0-9]+\ [AP]M
The fellow above restricts the digit count to 1 or 2 using {1,2}, which is technically more accurate, but mine is shorter, and on a timing website where you will only ever get 1 or 2 digits, my method would work just as well (the regex '+' means 1 or more instances).
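To illustrate the difference between a loose time pattern and one that pins the hour, here is a minimal Python sketch. It is not Selenium IDE itself, whose regexp: patterns are evaluated by the browser's JavaScript engine and may behave differently. The sample text is assembled from the lines quoted above, and the names page_text, loose and strict are only for illustration.

```python
import re

# Sample text built from the lines quoted in this thread (not the live page).
page_text = "Sunrise Today: 6:19 AM\nSunset Today: 8:38 PM\n6:20 AM – 6:46 AM"

# Loose pattern from the answer: any time preceded by whitespace.
loose = re.compile(r"\s[0-9]+:[0-9]+ [AP]M")

# Stricter pattern: both times must start with the hour 6.
# The separator is an en dash, copied literally from the page text.
strict = re.compile(r"6:[0-9]{2} [AP]M – 6:[0-9]{2} [AP]M")

print(loose.findall(page_text))            # every time-like substring
print(strict.search(page_text).group(0))   # only the 6:xx – 6:xx range
print(strict.search("7:20 AM – 7:46 AM"))  # None: the hour is not 6
```

If the strict pattern matches here but not in Selenium IDE, the dash character on the page (hyphen vs. en dash) and the whitespace around it are worth checking first.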

Related

How to output potentially multi-line environment variables from the output of env system command? [duplicate]

I have this gigantic ugly string:
J0000000: Transaction A0001401 started on 8/22/2008 9:49:29 AM
J0000010: Project name: E:\foo.pf
J0000011: Job name: MBiek Direct Mail Test
J0000020: Document 1 - Completed successfully
I'm trying to extract pieces from it using regex. In this case, I want to grab everything after Project Name up to the part where it says J0000011: (the 11 is going to be a different number every time).
Here's the regex I've been playing with:
Project name:\s+(.*)\s+J[0-9]{7}:
The problem is that it doesn't stop until it hits the J0000020: at the end.
How do I make the regex stop at the first occurrence of J[0-9]{7}?
Make .* non-greedy by adding '?' after it:
Project name:\s+(.*?)\s+J[0-9]{7}:
Using non-greedy quantifiers here is probably the best solution, also because it is more efficient than the greedy alternative: Greedy matches generally go as far as they can (here, until the end of the text!) and then trace back character after character to try and match the part coming afterwards.
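The difference can be seen in a short Python sketch. It assumes the log text is a single line with the records separated by spaces (the question shows them on separate lines); the variable names are illustrative only.

```python
import re

# The sample log from the question, joined into one line (an assumption).
s = (r"J0000000: Transaction A0001401 started on 8/22/2008 9:49:29 AM "
     r"J0000010: Project name: E:\foo.pf "
     r"J0000011: Job name: MBiek Direct Mail Test "
     r"J0000020: Document 1 - Completed successfully")

# Greedy: .* runs to the end of the text, then backtracks to the LAST J-code.
greedy = re.search(r"Project name:\s+(.*)\s+J[0-9]{7}:", s).group(1)

# Lazy: .*? expands one character at a time and stops at the FIRST J-code.
lazy = re.search(r"Project name:\s+(.*?)\s+J[0-9]{7}:", s).group(1)

print(greedy)
print(lazy)
```

The greedy capture includes the whole "Job name" record, while the lazy one contains only the path.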
However, consider using a negated character class instead:
Project name:\s+(\S*)\s+J[0-9]{7}:
\S means “everything except whitespace”, and this is exactly what you want.
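A small Python sketch of the \S variant follows, with one caveat worth knowing: it only works while the project path contains no spaces. The second input, with a space in the path, is a made-up example and not from the question.

```python
import re

pattern = re.compile(r"Project name:\s+(\S*)\s+J[0-9]{7}:")

# Excerpt of the question's log, joined into one line (an assumption).
s = r"J0000010: Project name: E:\foo.pf J0000011: Job name: MBiek Direct Mail Test"
print(pattern.search(s).group(1))

# Hypothetical path containing a space: \S* cannot cross it, so nothing matches.
with_space = r"J0000010: Project name: E:\my docs\foo.pf J0000011: Job name: X"
print(pattern.search(with_space))
```

If paths with spaces are possible, the lazy .*? form is the safer choice.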
Well, ".*" is a greedy selector. You make it non-greedy by using ".*?". With the latter construct, each time the regex engine is about to match another character with the ".", it first attempts to match whatever may come after the ".*?". This means that if, for instance, nothing comes after the ".*?", then it matches nothing.
Here's what I used. s contains your original string. This code is .NET specific, but most flavors of regex will have something similar.
string m = Regex.Match(s, @"Project name: (?<name>.*?) J\d+").Groups["name"].Value;
I would also recommend you experiment with regular expressions using "Expresso", a great (and free) utility for regex editing and testing.
One of its upsides is that its UI exposes a lot of regex functionality that people inexperienced with regex might not be familiar with, in a way that makes these new concepts easy to learn.
For example, when building your regex using the UI, and choosing "*", you have the ability to check the checkbox "As few as possible" and see the resulting regex, as well as test its behavior, even if you were unfamiliar with non-greedy expressions before.
Available for download at their site:
http://www.ultrapico.com/Expresso.htm
Express download:
http://www.ultrapico.com/ExpressoDownload.htm
(Project name:\s+[A-Z]:(?:\\\w+)+\.[a-zA-Z]+\s+J[0-9]{7})(?=:)
This will work for you.
Using (?:\\\w+)+\.[a-zA-Z]+ is more restrictive than .*

How to use regex for URL-targeting

As a disclaimer, I must say that my experience with regular expressions is very limited. I am using Optimizely for A/B testing and have run into a problem. I only want my experiment to run on one page, however, this page's URL-structure is somewhat complicated. The URL-structure of the page where I want to run my experiment looks like this:
https://mywebsite.co/term/public_id/edit/pricing
The problem is the public_id that changes dynamically, whenever a new user goes through the signup flow. How can I use regex to target this page exclusively? I have been trying to figure it out these past days but without any luck. Optimizely regex docs can be found here. I can't just use a simple match because /term/ appears in the URL of several pages on my site.
You could use this regular expression:
mywebsite\.co/term/.*?/edit/pricing
The .* part means any character can occur here any number of times. The additional ? makes it lazy, meaning the rest of the regular expression will kick in as soon as possible.
Note that a literal . needs to be escaped with a backslash, like \.
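As a rough check of how such a pattern behaves, here is a Python sketch using the /term/ path from the question. The public IDs and the extra URLs are invented for illustration, and Optimizely's regex flavor may differ from Python's. The second pattern, which uses [^/]+ and anchors, is not from the answer; it is a stricter alternative shown for comparison.

```python
import re

# Lazy pattern in the spirit of the answer.
lazy = re.compile(r"mywebsite\.co/term/.*?/edit/pricing")

# Hypothetical stricter variant: exactly one path segment for the public_id.
strict = re.compile(r"^https://mywebsite\.co/term/[^/]+/edit/pricing$")

target = "https://mywebsite.co/term/abc123/edit/pricing"        # made-up ID
other_page = "https://mywebsite.co/term/abc123/edit/profile"    # made-up page
nested = "https://mywebsite.co/term/abc123/extra/edit/pricing"  # made-up page

for url in (target, other_page, nested):
    print(url, bool(lazy.search(url)), bool(strict.search(url)))
```

The lazy pattern also accepts the nested URL, because .*? is allowed to cross slashes; the [^/]+ variant does not.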

Regex for Google goal tracking

I am trying to create the regex to track goal conversions on my site (with a dynamic url).
URL: sitename.com/username/year (or all)
So this would be /johnsmith/2014 or /johnsmith/all
As the usernames and years can vary, I put the regex as /[A-Za-z0-9-_.]{2,16}/all|[0-9]{4}/
This isn't working at all. Could someone please help me?
How about this regex:
/([\w.]+)\/((?:\d+|all))/
Test string
sitename.com/johe8.nsmith/2014
sitename.com/johnsmith/all
Result
MATCH 1
[13-25] johe8.nsmith
[26-30] 2014
MATCH 2
[44-53] johnsmith
[54-57] all
demo here
If you have the restrictions mentioned in the question, you could use the regex below:
/([\w.]{2,16})\/((?:\d{4}|all))/
demo here
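One likely reason the original attempt fails is that an ungrouped | splits the whole pattern in two. The Python sketch below illustrates this, assuming the slashes are literal path separators and substituting [\w.] for the asker's character class; Google Analytics' regex flavor may differ in details.

```python
import re

# Ungrouped alternation: this reads as "/<name>/all" OR "<four digits>/",
# so a path ending in a year without a trailing slash does not match.
ungrouped = re.compile(r"/[\w.]{2,16}/all|[0-9]{4}/")

# Grouped alternation: the | applies only to the last path segment.
grouped = re.compile(r"/([\w.]{2,16})/(\d{4}|all)")

for path in ("sitename.com/johnsmith/2014", "sitename.com/johe8.nsmith/all"):
    print(path, bool(ungrouped.search(path)), grouped.search(path).groups())
```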

Oracle Regex Patterns

I have 3 regex patterns from Oracle 10/11 that I need to port to Oracle 9. However, I am getting different results when I try to do the matching. Is there a way to fix this? Is it even possible to port these patterns and get the same results when matching?
Here are the patterns:
'^[[:alnum:]\_]$' which I would think would map to '[A-Za-z0-9_]*$'
'^[[:alpha:]]$' which I would think would map to '^[A-Za-z]*$'
'^(\"|\'')' which I would think needs to map over into 2 patterns '^["]*$' or '^['']*$'
owa pattern docs
oracle 10g regex docs
EDIT:
So my question was originally part of a lexer that was giving me trouble in a parser. It turns out my problem was related to a type (and not the regex). My regex patterns match closely enough (and I actually changed the :alnum: pattern to use "\w" instead).
There's one thing I still don't understand: what do the double enclosing brackets [[ mean in the example regex?

Why is this line of regex capturing white spaces?

I'm using the following line of regex which I found from this SO answer:
(?:[\w[a-z]-]+:(?:/{1,3}|[a-z0-9%])|www\d{0,3}[.]|[a-z0-9.-]+[.??][a-z]{2,4}/)(?:[^\s()<>]+|(([^\s()<>]+|(([^\s()<>]+)))))+(?:(([^\s()<>]+|(([^\s()<>]+))))|[^\s`!()[]{};:'".,<>?«»“”‘’])
I am testing it on the following string:
"Quattro Amici in Concert Mar. 3, 2014. Long-time collaborators Lun Jiang, violin; Roberta Zalkind, viola; Pegsoon Whang, cello; and Karlyn Bond, piano, will perform works by Franz Joseph Haydn, Wolfgang Amadeus Mozart, Ludwig van Beethoven and Gabriel Faure. To purchase tickets visit westminstercollege.edu/culturalevents or call 801-832-2457. - See more at: http://entertainment.sltrib.com/events/view/quattro_amici_in_concert#sthash.QRsLXXiA.dpuf"
I'm simply attempting to extract URLs from strings, and based on a bunch of SO answers, I've found that regex is the recommended tool for that job. I'm not a regex expert (or even intermediate in my understanding), so I'm baffled by the empty strings my re.findall() keeps returning. I've stepped through the regex using RegexBuddy and still no luck. Any help would be hugely appreciated.
I'm not sure that a big regex like that is entirely necessary - if you're just looking to get links, you could use a much simpler regex, like this:
/(https?:\/\/[\w\d\$-_\.\+!\*'\(\),\/#]+)/ig
According to RFC 1738, urls are only allowed to use the characters specified in the class above, so it should cover any valid url, without such a gigantic mess of a regex.
You can also use a tool like regexpal.com to validate regexes, which helps find issues. That said, I pasted your regex in there and it crashed Chrome, so it may not be a great help for a beast like that :)
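One likely source of the empty strings is how re.findall() treats capturing groups: when the pattern contains groups, it returns the group contents rather than the whole match, and a group that did not participate comes back as ''. The Python sketch below shows this on an excerpt of the question's text. The URL pattern in it is a deliberately simplified, hypothetical one (scheme followed by non-space characters), not the regex from either post.

```python
import re

# Excerpt of the test string from the question.
text = ("To purchase tickets visit westminstercollege.edu/culturalevents "
        "or call 801-832-2457. - See more at: "
        "http://entertainment.sltrib.com/events/view/"
        "quattro_amici_in_concert#sthash.QRsLXXiA.dpuf")

# With a capturing group, findall returns the group, not the full match.
# The optional group does not participate here, so the result is [''].
with_group = re.findall(r"(www\.)?sltrib\.com", text)

# A non-capturing group (?:...) makes findall return the whole match.
without_group = re.findall(r"(?:www\.)?sltrib\.com", text)

# Simplified URL pattern with no capturing groups at all.
urls = re.findall(r"https?://[^\s<>\"]+", text)

print(with_group)
print(without_group)
print(urls)
```

Note that the simplified pattern only finds URLs with an explicit http:// or https:// scheme, so the bare westminstercollege.edu/culturalevents is not picked up.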