i have a (i guess) simple question.
I am running Selenium test cases (HTML, selense) after a successful build in Hudson. The testcases pass when i run them in the IDE after the build but not on the server. The cases that fails holds a regexp expression like so:
regexp:The profile details have been updated, it can take up to [0-9]*\s\w*\(.\) until the changes are fully visible.
the goal is to match times like 1 hour(s), 30 second(s) 4 minutes(s)
Have someone encountered a problem like this and how did you solve it?
I would use a slightly simpler expression, assuming you do not need to handle things like 1 second, 2 seconds, etc, but only need to handle things like 1 second(s) or 2 second(s), you could use this expression:
try replacing the regex stuff with
[0-9]+ [a-z]+\(s\)
Some older regex flavors (POSIX ERE) don't allow the shorthand character classes, so if those rules are somehow being invoked, then it would fail on \s and \w (and maybe even the .). The above expression should work almost anywhere; though it is not as flexible, it should be flexible enough for your situation.
Some flavors that are even older (POSIX BRE) don't support the shorthand repetition and actually interpret curly braces and parentheses differently as well. If that is the flavor being expected, it might just fail on any normal expression, and you'd need to usde something like:
[0-9]\{1,\} [a-z]\{1,\}(s)
If neither of these work, then there is a significant difference in the engines implementing your regex OR there is some difference in how the page is rendering and being parsed/searched/analyzed by the Selenium Server.
If that is the case, there could be <br> tags or formatting tags (like <i> or <span>) that are being parsed as text rather than being extracted from the string. Then, something like changes are fully visible might be parsed as changes are <em>fully</em> visible and thus failing the test
When you test it locally versus hudson run, are there differences in OS platform?
I know that I had some problems with windows xp versus unix (where hudson was deployed). Although I don't think that is the case here, just a thought though.
Related
According to the received wisdom MS Word (more or less) supports find/replace with use of regular expressions. I have a simple regular expression:
^(C[[:alpha:]]*)(\d*)(.*)$
That I'm running on the data:
indSIMDdecile
CSdeccrim12006
CSdeccrim12006
CSdeccrim12009
CSdeccrim12009
CSdeccrim12012
CSdeccrim12012
CSdeceduc12004
CSdeceduc12004
CSdeceduc12006
CSdeceduc12006
CSdeceduc12009
CSdeceduc12009
CSdeceduc12012
CSdeceduc12012
CSdecemp12004.x
I'm interested in returning the first word prior to the digit 1, which works as demonstrated on regex101 here.
Problem
I would like to the same but in MS Word (v. 15.18 on Mac). After getting error messages of trying to supply unsuitable syntax I learned that MS Word does not support to the full regex syntax. I simplified my expression to something on the lines:
but the search does not find any strings and nothing gets replaced. Hence my questions, is it possible to use MS Word on Mac with regex?
The linked help website hints that something like that should be possible, but so far now luck.
The simple answer is "no", if you mean "Does Mac Word have a UI feature that lets you use one of the modern dialects of regex?" Word's Find/Replace only supports its own Regular Expression syntax.
In this case, I think the following will give you what you need:
Find with wildcards:
(C)([!1]#)(1)
and a replace by
\1
(If you also had to find "C1", then that doesn't work, and unfortunately nor does
(C)([!1]{0,})(1)
because Word does not allow 0 in the {,} pattern)
But there is a problem with "#". If the text the "#" is looking for is long, the find/replace may fail. There is supposed to be a 255 limit, but it seems rather more arbitrary than that. (I have long suspected a buffer overrun type error in the Word code, but perhaps there is a simpler explanation).
If you mean, "is there any way to use modern regex with Word?", then the answer is "Yes, but you only get to operate on a copy of the text in the document. You will need to create your own code to do the 'replace' part of the find replace, and that means that you would have to deal with any of the issues such as preserving formatting that Word's built-in find/replace might get right for you.
On the Windows side, people who want a better regex than Word's often use VBScript's regexp object because it is easily used from VBA. VBA itself only really has the "like" operator, which also only has fairly crude pattern matching abilities. I think there are examples of VBScript rexexp use on StackOverflow. On the Mac side, you would either have to use VBA and "shell out" to one of the built-in Mac/Unix utilities to do your finding (and perhaps replacing), or perhaps use Applescript or Javascript application scripting to do it. As far as I can remember Applescript does not have a 'modern' regex built-in either.
[As a bit of history, Word's "regular expressions" were I think introduced in Word 6, around 1993, at a time when most dialects of regex were much more crude than they are today. I don't think Word's version has moved along much at all - it probably added some Unicode support at some point, but that's probably about it. I assume that people using modern regex don't regard it as regex at all, and I personally prefer not to call Word's Regular Expressions 'regex' precisely for that reason.]
I am working on a filter based upon browser version and am having a little trouble. It has to be in RegEx which loves to encompass everything possible.
I want to select:
12.0
8.0
18.0.1025.168
The problem I am having (looking at the 12.0 specifically however it is a problem for all 3) is that it is selecting things other than 12.0 as well. I have been trying to use negated sets and non-capturing groups however it just isn't quite working.
Currently I have: ((?:18.0.1025.168[^.]|(12.0)[^.]|(?:8.0)[^.]))
I have used \d in the negated sets however it seems as though I have to choose \d or . because it does not allow for special characters within the set.
Things that I need to make sure are not selected include any variation of the following, (the 9's could be any number)
9.12.0
912.0
92.09
12.0.9
Any input of what I should look into or another symbol I could use would be greatly appreciated. Also, if needed I can break this into 3 different formulas that will all fire however would like to avoid that is possible
What about (\A|^)(12.0|8.0|18.0.1025.168)($|\z)
We have a large solution with many projects in it, and throughout the project in forms, messages, etc we have a reference to a company name. For years this company name has been the same, so it wasn't planned for it to change, but now it has.
The application is specific to one state in the US, so localizations/string resource files were never considered or used.
A quick Find All instances of the word pulled up 1309 lines, but we only need to change lines that actually end up being displayed to the user (button text, message text, etc).
Code can be refactored later to make it more readable when we have time to ensure nothing breaks, but for time being we're attempting to find all visible instances and replace them.
Is there any way to easily find these "instances"? Perhaps a type of Regex that can be used in the Find All functionality in Visual Studio to only pull out the word when it's wrapped inside quotes?
Before I go down the rabbit hole of trying to make my job easier and spending far more time than it would have taken to just go line by line, figured I would see if anyone has done something like this before and has a solution.
You can give this a try. (I hope your code is under source control!)
Foobar{[^"]*"([^"]*"[^"]*")*[^"]*}$
And replace with
NewFoobar\1
Explanation
Foobar the name you are searching for
[^"]*" a workaround for the missing non greedy modifier. [^"] means match anything but " that means this matches anything till the first ".
([^"]*"[^"]*")* To ensure that you are matching only inside quotes. This ensures that there are only complete sets of quotes following.
[^"]* ensures that there is no quote anymore till the end of the line $
{} the curly braces buts all this stuff following your companies name into a capturing group, you can refer to it using \1
The VS regex capability is quite stripped down. It perhaps represents 20% of what can be done with full-powered regular expressions. It won't be sufficient for your needs. For example, one way to solve this quote-delimited problem is to use non-greedy matching, which VS regex does not support.
If I were in your shoes, I would write a perl script or a C# assembly that runs outside of Visual Studio, and simply races through all files (having a particular file extension) and fixes everything. Then reload into Visual Studio, and you are done. Well, if all went well with the regex anway.
Ultimately what you really must watch out for is code like this:
Log.WriteLine("Hello " + m_CompanyName + " There");
In this case, regex will think that "m_CompanyName" appears between two quotes - but it is not what you meant. In this case you need even more sophistication, and I think you'll find the answer with a special .net regular expression extension.
I hope this title makes sense - I need case-insensitive regex matching on BlackBerry 5.
I have a regular expression defined as:
public static final String SMS_REG_EXP = "(?i)[(htp:/w\\.)]*cobiinteractive\\.com/[\\w|\\%]+";
It is intended to match "cobiinteractive.com/" followed by some text. The preceding (htp:w.) is just there because on my device I needed to override the internal link-recognition that the phone applies (shameless hack).
The app loads at start-up. The idea is that I want to pick up links to my site from sms & email, and process them with my app.
I add it to the PatternRepository using:
PatternRepository.addPattern(
ApplicationDescriptor.currentApplicationDescriptor(),
GlobalConstants.SMS_REG_EXP,
PatternRepository.PATTERN_TYPE_REGULAR_EXPRESSION,
applicationMenu);
On the os 4.5 / 4.7 simulators and on
a Curve 8900 device (running 4.5),
this works.
On the os 5 simulators and the Bold
9700 I tested, app fails to compile
the pattern with an
IllegalArgumentException("unrecognized
character after (?").
I have also tried (naively) to set the pattern to "/rockstar/i" but that only matches the exact string - this is possibly the correct direction to take, but if so, I don't know how to implement it on the BB.
How would I modify my regex in order to pick up case insensitive patterns using the PatternRepository as above?
PS: would the "correct" way be to use the [Cc][Oo][Bb][Ii]2... etc pattern? This is ok for a short string, but I am hoping for a more general solution if possible?
Well not a real solution for the general problem but this workaround is easy, safe and performant:
As your dealing here with URLs and they are not case-sensitive...
(it doesn't matter if we write google.com or GooGLE.COm or whatever)
The most simple solution (we all love KISS_principle) is to do first a lowercase (or uppercase if you like) on the input and than do a regex match where it doesn't matter whether it's case-sensitive or not because we know for sure what we are dealing with.
Since nobody else has answered this question relating to the PatternRepository class, I will self-answer so I can close it.
One way to do this would be to use a pattern like: [Cc][Oo][Bb][Ii]2[Nn][Tt][Ee][Rr][Aa][Cc][Tt][Ii][Vv][Ee]... etc where for each letter in the string, you put 2 options. Fortunately my string is short.
This is not an elegant solution, but it works. Unfortunately I don't know of a way to modify the string passed to PatternRepository and I think the crash when using the (?i) modifier is a bug in BB.
Use the port of the jakarta regex library:
https://code.google.com/p/regexp-me/
If you use unicode support, it's going to eat memory,
but if you just want case insensitive matching,
you simply need to pass the RE.MATCH_CASEINDEPENDENT flag when you compile your regex.
new RE("yourCaseInsensitivePattern", RE.MATCH_CASEINDEPENDENT | OTHER_FLAGS)
This is the line mono on linux locks up (i am using 2.6.4 VM distro on the official site)
var match = Regex.Match(sz, linkPattern);
The string is this which gets the link and the title.
var linkPattern = #"<\ba\b[^\>]*\bhref\b*=\b*""([^""\>]*)""[^\>]*\btitle\b*=\b*""([^""\>]*) by [^""\>]*""";
When mono hits that line it doesnt crash, throw an exception or anything. Using tops i see mono using 96% of the CPU. I dont know how long the string is. I suspect its <8kb (i tested a different url) and it has been a few minutes since i ran the code so something must be broken.
"Too many \b's" was my first reaction. But really:
\b means word boundary. In my opinion, <\ba and <a should be identical. Also, \b* therefore would mean "optional repetition of word boundaries", which sounds rather confusing.
I guess I've never used \b at all, and used \s? or \s* instead.
Did you try a different regex engine (Perl, PHP) to determine whether the lockup is due to Mono?
There are some bugs in Mono's regex implementation that can cause it to recurse infinitely. Probably the only fix is to rewrite your pattern to be a simpler regular expression, or not use regular expressions for this task.
You may also want to file a bug. I think there is a Google Summer of Code student currently working on Mono's regular expression engine.