What regular expression engine does Xcode's Find Navigator use?

Neither Apple's documentation, Google searches, nor Stack Overflow say which regex engine Xcode's Find Navigator uses, which makes it very difficult to write expressions. For example, I tried a negative lookbehind and it doesn't work. Does anyone know where the hidden documentation lies?

Quick Answer: The ICU Engine. (This is the same engine that is used for NSRegularExpression as well).
Longer version:
The main difference between PCRE and ICU, the two leading regex engines, is PCRE's support for recursion within its expressions. A simple example of this would be:
\w{3}\d{3}(?R)?
which matches aaa111bbb222 fully under PCRE. However, entering this into an Xcode (7.1.1) regex find field yields an error alert.
This leads us to believe that Xcode is using the ICU engine to power its search and replace features. This makes sense, since Xcode is likely built either on top of NSRegularExpression or the underlying libicucore.
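The same probing trick works with any engine you can script. As a rough sketch, Python's built-in re module (which, like ICU, has no recursion support) rejects the pattern at compile time. The helper name engine_accepts is made up for this illustration, and none of this says anything about how Xcode is implemented internally.

```python
import re

def engine_accepts(pattern: str) -> bool:
    """Return True if Python's re module can compile the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

# (?R) is PCRE-style recursion; Python's re does not know this construct.
print(engine_accepts(r"\w{3}\d{3}(?R)?"))

# A fixed-width negative lookbehind is accepted.
print(engine_accepts(r"(?<!foo)bar"))
```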

Related

What flavor of Regex does Visual Studio Code use?

Trying to search-replace in Visual Studio Code, I find that its Regex flavor is different from full Visual Studio. Specifically, I try to declare a named group with string (?<p>[\w]+) which works in Visual Studio but not in Visual Studio Code. It'll complain with the error Invalid group.
Apart from solving this specific issue, I'm looking for information about the flavor of Regexes in Visual Studio Code and where to find documentation about it, so I can help myself with any other questions I might stumble upon.
Full Visual Studio uses .NET Regular Expressions as documented here. This link is mentioned as the documentation for VS Code elsewhere on Stack Overflow, but it isn't.
Rust Regex in the Find/Replace in Files Sidebar
Rob Lourens of MSFT wrote that the file search uses Rust regex. The Rust language documentation describes the syntax.
JavaScript Regex in the Find/Replace in File Widget
Alexandru Dima of MSFT wrote that the find widget uses JavaScript regex. As Wicktor commented, ECMAScript 5's documentation describes the syntax. So does the MDN JavaScript Regular Expression Guide.
Test the Difference
The find in files sidebar does not support (?=foobar) whereas the find in file widget does support that lookahead syntax.
Regarding Find/Replace with Groups
To find/replace with groups, use parentheses () to group and $1, $2, $3, $n to replace.
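For comparison, here is a small sketch of the same group-based replacement in Python. The sample text and pattern are invented for illustration; the only real difference is that Python's re.sub spells the references \g<1>, \g<2> instead of $1, $2.

```python
import re

# Made-up sample input for the illustration.
text = "width=100 height=200"

# VS Code equivalent:  Find: (\w+)=(\d+)   Replace: $2 $1
# Python writes the group references as \g<2> and \g<1>.
swapped = re.sub(r"(\w+)=(\d+)", r"\g<2> \g<1>", text)

print(swapped)  # 100 width 200 height
```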
Shaun's answer is still correct, but as an update: VS Code recently added the option to opt into the Perl-based PCRE2 engine. You can enable it in your settings.
This allows you to perform more advanced regex operations such as lookaheads and backreferences. But as noted below, the regex still has to be a valid JavaScript regex.
VS Code does support regular expression searches, however,
backreferences and lookaround aren't supported by default. But you can
enable these with the setting search.usePCRE2. This configures ripgrep
to use the PCRE2 regex engine. While PCRE2 supports many other
features, we only support regex expressions that are still valid in
JavaScript, because open editors are still searched using the editor's
JavaScript-based search.
And as a bonus, if you ended up here trying to do multi-line searches, VS Code recently added that feature as well!
I've found newer information (July 22, 2020) about it.
IllusionMH left the following comment on GitHub:
ripgrep (compatible with PCRE2) is already used for Find in files
functionality (for not open editors) and JS engine used only for open
editors.
Which regex engine does VS Code use? It is now a little more nuanced than before. The best source is the vscode wiki on GitHub, "Notes on Regular Expression Support":
[At top of document]
This document applies to search (CMD+SHIFT+F/CTRL+SHIFT+F) and
quickopen (CMD+P/CTRL+P). By default, VS Code uses the ripgrep tool to
drive search.
...
[At end of document]
Text search uses two different sets of regular expression engines. The
workspace is searched using ripgrep, which will use the Rust regex
engine, and will fallback to PCRE2 if the regex fails to parse in the
Rust regex engine. The Rust regex engine doesn't support some features
like backreferences and look-around, so if you use those features,
PCRE2 will be used. Open files are searched using a JS regex in the
editor itself. Most of the time, you don't need to worry about this,
but you may see an inconsistency in how some complex regexes are
interpreted, and this can be an explanation. Especially when you see a
regex interpreted one way when a file is open, and another way when it
is not. During a Replace operation, each file will be opened in turn,
and the search query will be run as a JS regex.
Another potential issue is how newlines are handled between ripgrep
and the editor. The editor normalizes newlines, so that you can match
both CRLF and LF line endings just with \n. It's actually not possible
to match \r explicitly in the editor because it is normalized away.
When searching in the workspace, VS Code tries to rewrite a regex so
that \n will match CRLF. But \r\n or \s\n will also match CRLF in
closed files, but not in open files.
Two key points: (1) newlines are handled specially and (2) backreferences and look-arounds are supported despite the use of the Rust regex engine: if your regex contains a look-around or backreference, PCRE2 is used instead of the Rust engine.
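To make the newline point concrete, here is a small Python sketch (not VS Code's actual code). It assumes that "normalizing" simply means turning CRLF into LF, and shows why a plain \n pattern behaves differently on raw and normalized text, while \r?\n works on both.

```python
import re

raw = "foo\r\nbar"                      # CRLF line ending, as stored on disk
normalized = raw.replace("\r\n", "\n")  # assumption: roughly what the editor's normalization does

lf_only = re.compile(r"foo\nbar")       # expects a bare LF
crlf_or_lf = re.compile(r"foo\r?\nbar") # tolerates an optional CR before the LF

lf_on_raw = lf_only.search(raw) is not None
lf_on_normalized = lf_only.search(normalized) is not None
any_on_raw = crlf_or_lf.search(raw) is not None
any_on_normalized = crlf_or_lf.search(normalized) is not None

print(lf_on_raw, lf_on_normalized, any_on_raw, any_on_normalized)
```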
More on lookarounds
The Find Widget (Ctrl+F), used for finding within the active editor only, supports all lookarounds (lookahead and lookbehind), and those lookarounds can be non-fixed-length. So this will work in the Find Widget: (?<!blah.*).
In a search across files (Ctrl+Shift+F), non-fixed-length lookbehinds DO NOT WORK. Lookaheads can be fixed or non-fixed-length, but non-fixed-length positive or negative lookbehinds produce an error message below the search input box: Regex parse error: lookbehind assertion is not fixed length. The message may not appear until you actually try to run the search.
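Python's built-in re module happens to have the same fixed-length restriction on lookbehinds, so it makes a convenient place to see the difference. This is only an analogy to the across-files search, not the engine VS Code uses, and the helper name lookbehind_error is invented for the sketch.

```python
import re
from typing import Optional

def lookbehind_error(pattern: str) -> Optional[str]:
    """Return the compile error message, or None if the pattern compiles."""
    try:
        re.compile(pattern)
        return None
    except re.error as exc:
        return str(exc)

# Fixed-length lookbehind: accepted.
print(lookbehind_error(r"(?<!blah)foo"))

# Non-fixed-length lookbehind: rejected by Python's re at compile time.
print(lookbehind_error(r"(?<!blah.*)foo"))
```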

Futile attempt to run regular expression find/replace in MS Word using groups on Mac

According to the received wisdom MS Word (more or less) supports find/replace with use of regular expressions. I have a simple regular expression:
^(C[[:alpha:]]*)(\d*)(.*)$
That I'm running on the data:
indSIMDdecile
CSdeccrim12006
CSdeccrim12006
CSdeccrim12009
CSdeccrim12009
CSdeccrim12012
CSdeccrim12012
CSdeceduc12004
CSdeceduc12004
CSdeceduc12006
CSdeceduc12006
CSdeceduc12009
CSdeceduc12009
CSdeceduc12012
CSdeceduc12012
CSdecemp12004.x
I'm interested in returning the first word prior to the digit 1, which works as demonstrated on regex101 here.
Problem
I would like to do the same in MS Word (v. 15.18 on Mac). After getting error messages about unsuitable syntax, I learned that MS Word does not support the full regex syntax. I simplified my expression to something along the lines of:
but the search does not find any strings and nothing gets replaced. Hence my question: is it possible to use regex in MS Word on Mac?
The linked help website hints that something like that should be possible, but so far no luck.
The simple answer is "no", if you mean "Does Mac Word have a UI feature that lets you use one of the modern dialects of regex?" Word's Find/Replace only supports its own Regular Expression syntax.
In this case, I think the following will give you what you need:
Find with wildcards:
(C)([!1]@)(1)
and a replace by
\1
(If you also had to find "C1", then that doesn't work, and unfortunately nor does
(C)([!1]{0,})(1)
because Word does not allow 0 in the {,} pattern.)
But there is a problem with "@". If the text the "@" is looking for is long, the find/replace may fail. There is supposed to be a limit of 255, but it seems rather more arbitrary than that. (I have long suspected a buffer overrun type error in the Word code, but perhaps there is a simpler explanation.)
If you mean, "is there any way to use modern regex with Word?", then the answer is "Yes, but you only get to operate on a copy of the text in the document." You will need to create your own code to do the 'replace' part of the find/replace, and that means you would have to deal with issues such as preserving formatting that Word's built-in find/replace might get right for you.
On the Windows side, people who want a better regex than Word's often use VBScript's regexp object because it is easily used from VBA. VBA itself only really has the "Like" operator, which also has only fairly crude pattern matching abilities. I think there are examples of VBScript regexp use on Stack Overflow. On the Mac side, you would either have to use VBA and "shell out" to one of the built-in Mac/Unix utilities to do your finding (and perhaps replacing), or perhaps use AppleScript or JavaScript application scripting to do it. As far as I can remember, AppleScript does not have a 'modern' regex built in either.
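As an illustration of the "operate on a copy of the text" route, here is a minimal Python sketch that applies the asker's expression to lines copied out of the document. It does not touch Word at all. Python's re has no POSIX classes, so [A-Za-z] stands in for [[:alpha:]] (an assumption), and the function name leading_word is made up.

```python
import re

# The asker's pattern, with [A-Za-z] substituted for [[:alpha:]]
# because Python's re module does not support POSIX character classes.
PATTERN = re.compile(r"^(C[A-Za-z]*)(\d*)(.*)$")

def leading_word(line: str):
    """Return the letters before the first digit, or None if the line does not match."""
    m = PATTERN.match(line)
    return m.group(1) if m else None

# A few lines from the sample data in the question.
lines = ["indSIMDdecile", "CSdeccrim12006", "CSdecemp12004.x"]
results = [leading_word(line) for line in lines]

print(results)  # [None, 'CSdeccrim', 'CSdecemp']
```

Writing the results back into the document, with formatting intact, is the part you would still have to build yourself.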
[As a bit of history, Word's "regular expressions" were I think introduced in Word 6, around 1993, at a time when most dialects of regex were much more crude than they are today. I don't think Word's version has moved along much at all - it probably added some Unicode support at some point, but that's probably about it. I assume that people using modern regex don't regard it as regex at all, and I personally prefer not to call Word's Regular Expressions 'regex' precisely for that reason.]

RegEx with Excel VBA on Mac

I need to use regEx with Excel VBA. I'm using Mac OS 10.10 and Office 2011. So there is no DLL file I can use.
What is there to do here?
I read that I have to bind an AppleScript. How is this done, and what does this script need to contain?
You can use VBA's Like operator. It's only a very limited pattern tester.
Microsoft Word has its standard wildcards, plus, if you tick Use Wildcards, it is a regex engine (it can also find words that sound the same and words with the same root). So use Word rather than VBScript's RegEx.
Just record a Find and Replace in Word and you'll get most of the program written for you that you'll just need to adapt.
Natively, you can't really - AppleScript isn't actually that good for this kind of thing (where VBA is concerned)
There are other libraries that you can install and use to allow support for things like regular expressions on Mac OS - the one I've seen used the most is Satimage although I've not personally had to use it (yet) so can't vouch for it myself:
http://www.satimage.fr/software/en/downloads/downloads_companion_osaxen.html
I'm working on this problem too and I think Advanced Filters may be your answer if you want to do it in Excel without adding an external library. You can access it through VBA and set up a hidden sheet somewhere to stash your filters.
https://searchengineland.com/advanced-filters-excels-amazing-alternative-to-regex-143680
And you can see what it looks like in VBA here:
https://www.contextures.com/exceladvancedfiltervba.html
However, Advanced Filters does have some notable shortcomings, like the inability to distinguish a digit from a letter. The Like operator mentioned earlier DOES have this ability, however, so you could combine them to overcome that limitation.
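To show what Like can and cannot express, here is a rough Python sketch that translates a Like-style pattern into a regular expression. It covers only ?, *, #, [list] and [!list], assumes case-sensitive comparison, and is not a full reimplementation of the VBA operator; the function names are invented.

```python
import re

def like_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a simplified Like pattern into a compiled Python regex."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "?":            # any single character
            out.append(".")
        elif ch == "*":          # zero or more characters
            out.append(".*")
        elif ch == "#":          # any single digit
            out.append("[0-9]")
        elif ch == "[":          # character list; raises ValueError if unclosed
            end = pattern.index("]", i + 1)
            body = pattern[i + 1:end]
            if body.startswith("!"):
                body = "^" + body[1:]   # [!list] negates the list
            out.append("[" + body + "]")
            i = end
        else:                    # everything else is a literal
            out.append(re.escape(ch))
        i += 1
    return re.compile("".join(out), re.DOTALL)

def like(text: str, pattern: str) -> bool:
    """Return True if the whole text matches the Like-style pattern."""
    return like_to_regex(pattern).fullmatch(text) is not None

print(like("CSdeccrim12006", "CS*#"))  # starts with CS, ends in a digit
print(like("A1", "[!0-9]#"))           # a non-digit followed by a digit
```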
Hopefully you and I can both solve this problem using these tools...!

How to implement Regex

I'm working on a database server software product (see my profile) and we see the need to implement free-text searching in our software. The query language standard we are using only supports free-text search using a BT-type regex. The only way we can use our free-text database indexes together with regex seems to be to implement our own. My questions to SO are:
Where can I find papers/examples/patterns on how to implement a BT style Regex?
Is it worth looking into taking one of the open source C/C++ Regex libraries and altering the code to fit our needs?
If I'm not wrong, SPARQL uses the XPath/XQuery regular expression syntax, which is based on Perl regular expressions (at least that is what the W3C docs say).
If this is indeed the case then you can use PCRE from http://www.pcre.org/
It is BSD-licensed, so you will be able to use it in a commercial product.
If your syntax is slightly modified, you can probably write a small routine to normalize it to the Perl syntax used by PCRE.
There are two papers I have found online on the subject of regex indexing: one from Bell Labs and one from UCLA/IBM. I'm still not sure whether to use an existing regex library and modify it or write one from scratch.
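To get a feel for what a backtracking (BT) matcher looks like before deciding, here is a minimal sketch in Python, in the style of the small matcher Rob Pike wrote for "The Practice of Programming". It supports only literals, ".", "*", "^" and "$", which is far from the XPath/XQuery syntax, and it ignores indexing entirely.

```python
def match(regexp: str, text: str) -> bool:
    """Search for regexp anywhere in text."""
    if regexp.startswith("^"):
        return match_here(regexp[1:], text)
    # Try every starting position, including the empty tail.
    for i in range(len(text) + 1):
        if match_here(regexp, text[i:]):
            return True
    return False

def match_here(regexp: str, text: str) -> bool:
    """Match regexp at the beginning of text."""
    if not regexp:
        return True
    if len(regexp) >= 2 and regexp[1] == "*":
        return match_star(regexp[0], regexp[2:], text)
    if regexp == "$":
        return text == ""
    if text and regexp[0] in (".", text[0]):
        return match_here(regexp[1:], text[1:])
    return False

def match_star(c: str, regexp: str, text: str) -> bool:
    """Match c* followed by regexp: try zero repetitions first, then one more each time."""
    i = 0
    while True:
        if match_here(regexp, text[i:]):
            return True
        if i < len(text) and c in (".", text[i]):
            i += 1          # consume one more c and retry the rest
        else:
            return False

print(match("^ab*c$", "abbbc"))  # True
print(match("b.d", "abcde"))     # True
print(match("^b", "abc"))        # False
```

The retry loop in match_star is the backtracking: when the rest of the pattern fails, the matcher gives the star one more character and tries again.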

How can I test regular expressions using multiple RE engines? [closed]

How can I test the same regex against different regular expression engines?
The most powerful free online regexp testing tool is by far http://regex101.com/ - it lets you select the RE engine (PCRE, JavaScript, Python), has a debugger, colorizes the matches, explains the regexp on the fly, and can create permalinks to the regex playground.
Other online tools:
http://www.rexv.org/ - supports PHP and Perl PCRE, Posix, Python, JavaScript, and Node.js
http://refiddle.com/ - Inspired by jsfiddle, but for regular expressions. Supports JavaScript, Ruby and .NET expressions.
http://regexpal.com/ - powered by the XRegExp JavaScript library
http://www.rubular.com/ - Ruby-based
Perl Regex Tutor - uses PCRE
Windows desktop tools:
The Regex Coach - free Windows application
RegexBuddy recommended by most, costs US$ 39.95
Jeff Atwood wrote about regular expressions.
Other tools recommended by SO users include:
http://www.txt2re.com/ - free online tool to generate regular expressions for multiple languages (@palmsey, another thread)
The Added Bytes Regular Expressions Cheat Sheet (@GateKiller, another thread)
http://regexhero.net/ - The Online .NET Regular Expression Tester. Not free.
RegexBuddy
I use Expresso (www.ultrapico.com). It has a lot of nice features for the developer. The Regulator used to be my favorite, but it hasn't been updated in so long and I constantly ran into crashes with complicated RegExs.
Here are some for the Mac: (Note: don't judge the tools by their websites)
RegExhibit - My Favorite, powerful and easy
Reggy - Simple and Clean
RegexWidget - A Dashboard Widget for quick testing
If you are an Emacs user, the command re-builder lets you type an Emacs regex and shows on the fly the matching strings in the current buffer, with colors to mark groups. It's free as Emacs.
Rubular is free, easy to use and looks nice.
RegexBuddy is a weapon of choice
I use the excellent and free Rad Software Regular Expression Designer.
If you just want to write a regular expression, have a little help with the syntax and test the RE's matching and replacing then this fairly light-footprint tool is ideal.
A couple of Eclipse plugins for those using Eclipse:
http://www.brosinski.com/regex/
http://www.bastian-bergerhoff.com/eclipse/features/web/QuickREx/toc.html
Kodos of course. Cause it's Pythonic. ;)
RegexBuddy is great!!!
I agree on RegExBuddy, but if you want something free, or for when I'm working somewhere other than my own system, RegExr is a great online (Flash) tool that has lots of pre-built regex segments to work with and does real-time pattern matching for your testing.
In the standard Python installation there is a "Tools/scripts" directory containing redemo.py.
This creates an interactive Tkinter window in which you can experiment with regexes.
In the past I preferred The Regex Coach for its simplistic layout, instantaneous highlighting and its price (free).
Every once in a while, though, I run into an issue with it when trying to test .NET regular expressions. For that, it turns out, it's better to use a tool that actually uses the .NET regular expression engine. That was my whole reason to build Regex Hero last year. It runs in Silverlight and, as such, runs off the .NET Regex class library directly.
Regexbuddy does all this. http://www.regexbuddy.com/
see the accepted answer to this question: Learning Regular Expressions
I'll add to the vote of Reggy for the Mac, gonna try out some of the other ones that Joseph suggested and upvote that post tomorrow when my limit gets reset.
for online: http://regexpal.com/
for desktop: The Regex Coach
+1 For Regex Coach here. Free and does the job really well.
http://www.weitz.de/regex-coach/
I am still a big The Regulator fan.
There are some stability problems, but these can be fixed by disabling the IntelliSense. It gets mad with some expressions and with typos made while building an expression.
Would love it if Roy Osherove updated, but looks like he is busy with other things.
I like to use this online one:
http://www.cuneytyilmaz.com/prog/jrx/
Of course, it'll be JavaScript regexp, but I've never yet done anything clever enough to notice the difference.
How much is your time worth? Pay the $40 and get RegexBuddy. I did, and I even upgraded from 2.x version to 3.x. It has paid for itself many times over.
I personally like the Regular Expression Tester.
It's a free Firefox plugin, so always on!
Also this regex plugin can be useful for eclipse and idea users.
I like http://regexhero.net/tester/ a lot
Check out Regex Master, which is a free and open source regular expression tester.
This regex tester is able to test JavaScript, PHP and Python:
http://www.piliapp.com/regex-tester/
RegExBuddy so far I concur with and endorse.
RegExr for testing with ActionScript 3 (whichever standard that may be)
http://rgx-extract-replace.appspot.com
can list the captured regex groups in columns and can optionally replace the matched patterns in the input text.