Nim `Warning: re is deprecated`, what to use instead? - regex

I'm writing a Nim program that uses regexes. It works fine, except that when I compile, I get this warning:
Warning: re is deprecated [Deprecated]
I've looked in the documentation for the re module, but there's no mention of a new way to create regexes.
My question is, if the re"regex" constructor is deprecated, what should I use?

From the docs:
Consider using the nre or pegs modules instead.
pegs is supposed to be more powerful than regular expressions and as such uses a different syntax from most regular expression engines; by contrast, nre is just a better wrapper around the PCRE library than re.

Related

How I can use c++11's regex for matching some generic bytes?

I need to do this:
const regex setData("^(setDataArray:)[\\x00-\\xFF]{8,8}$");
In other words, I need to match a fixed string followed by some arbitrary bytes (it is for a network protocol), but it gives me an error at runtime, while the regex object is being constructed.
I think the reason is that I can't just use an 8-bit char. Is there a way to fix it?
EDIT: As suggested, I'm attaching a simple program that demonstrates the problem:
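#include <cstdlib>  // EXIT_SUCCESS
#include <regex>
using namespace std;
// Constructed during static initialization, before main() runs.
const regex setData("^(setDataArray:)[\\x00-\\xFF]{8,8}$");
int main()
{
    return EXIT_SUCCESS;
}
This program crashes while constructing setData when compiled with Visual Studio 2013 on Windows 8.1.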
I believe this will work:
const regex setData(R"(^(setDataArray:)[\x00-\xFF]{8,8}$)",
std::regex_constants::basic);
I changed the syntax to use a Raw string constant, (the R"( ... )" syntax) to avoid having to escape slash characters and making it easier to read, but that's just to make it pretty.
The consequential change is the addition of std::regex_constants::basic, which causes the regex to use the basic POSIX grammar instead of the default ECMAScript grammar. In this case there should not be a problem using the ECMAScript version, but I suspect there may be a problem in Microsoft's implementation. Be aware that the basic grammar treats unescaped ( ) and { } as literal characters, so the pattern may need adjusting to keep the same meaning.
There is a subtle difference between the standard ECMAScript grammar and the slightly modified version used in C++11. In particular, POSIX-style bracket forms such as [[:alpha:]] are not part of the original ECMAScript grammar, but are specified as part of the ECMAScript grammar as used in C++11.
You can read more about the various grammars and what they provide at cppreference.com.

Vim syntax highlighting for C++11 that does not mess up other highlighting, e.g. class/namespace scoping

I am aware of this script: http://www.vim.org/scripts/script.php?script_id=3797. It has been suggested a few times, and other questions regarding C++11 syntax for Vim have been shut down due to duplicating this question: Is there a C++11 syntax file for vim?.
Unfortunately, the suggested script results in scoping constructs (e.g. "namespace::member()") not being highlighted anymore, and functions and class names are no longer highlighted.
Does anyone have a better C++11 plugin for Vim available now? Ideally, all the features of the regular C++ plugin would be retained, new keywords/reserved words (e.g. nullptr) would be marked, and lambda expressions and uniform initialization syntax would not be flagged as errors.
Have you tried the following? I use this one and like it
http://www.vim.org/scripts/script.php?script_id=4617

Is it possible to create an invalid regular expression in ActionScript/Flex that will cause a runtime error?

Is it possible to create an invalid regular expression in ActionScript/Flex that will cause a runtime error? I've tried so many weird regexes in Flex and it never complained! How do I know if my regexp is valid?
In theory, according to the ActionScript 3.0 SyntaxError documentation, when a regular expression cannot be parsed a SyntaxError is generated at runtime that you can detect in a try/catch block.
In practice, I've never actually seen the RegExp class exhibit this behavior.
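As a rough illustration of the try/catch approach described above, here is a minimal sketch in Python rather than ActionScript (an assumption made purely for illustration; the helper name is_valid_regex is hypothetical). Python's re module raises re.error when a pattern cannot be parsed, which plays the role the documentation assigns to SyntaxError. It says nothing about how the Flex RegExp class actually behaves.

```python
import re

def is_valid_regex(pattern: str) -> bool:
    """Return True if Python's re module can parse the pattern."""
    try:
        re.compile(pattern)
    except re.error:
        # Raised by re.compile for patterns it cannot parse.
        return False
    return True

print(is_valid_regex("foo(bar)"))  # balanced group
print(is_valid_regex("foo(["))     # unterminated group and character set
print(is_valid_regex("foo)"))      # unbalanced closing parenthesis
```

Each regex engine accepts a different dialect, so a pattern that passes this check in one engine may still be rejected, or silently reinterpreted, in another.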
I don't have ActionScript/Flex, so I can't test this. Since you haven't given any examples, I don't know what you think is a "weird" regex. What happens if you try one of these:
/(?<=x*)foo/
(ECMAScript regexes don't support lookbehind)
/foo([/
(missing closing parentheses/brackets)
/foo)]/
(missing opening parentheses/brackets)
/foo(?)/
(Syntax error)
/foo\1/
(invalid backreference)
If your end goal is to determine whether a particular regular expression is valid or not then I'm not sure trying to intentionally generate runtime errors is the best way to accomplish that.
Instead, I would recommend testing your patterns against known inputs to make sure they behave as intended. You can use a tool like RegExr to test.

Tool for finding C-style Casts

Does anyone know of a tool that I can use to find explicit C-style casts in code? I am refactoring some C++ code and want to replace C-style casts wherever possible.
An example C-style cast would be:
Foo foo = (Foo) bar;
In contrast, examples of C++-style casts would be:
Foo foo = static_cast<Foo>(bar);
Foo foo = reinterpret_cast<Foo>(bar);
Foo foo = const_cast<Foo>(bar);
If you're using gcc/g++, just enable a warning for C-style casts:
g++ -Wold-style-cast ...
Searching for the regular expression \)\w gives surprisingly good results.
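A minimal sketch of that search in Python, assuming the source is available as a string (the function name find_cast_hints and the sample snippet are made up for illustration):

```python
import re

# Heuristic from the answer above: a closing parenthesis immediately
# followed by a word character, as in "(Foo)bar".
CAST_HINT = re.compile(r"\)\w")

def find_cast_hints(source: str):
    """Return (line_number, line) pairs for lines matching the heuristic."""
    return [
        (number, line)
        for number, line in enumerate(source.splitlines(), start=1)
        if CAST_HINT.search(line)
    ]

sample = """\
Foo foo = (Foo)bar;
Foo spaced = (Foo) bar;
int n = static_cast<int>(x);
call(a)(b);
"""

for number, line in find_cast_hints(sample):
    print(number, line)
```

Only the first sample line is reported. The cast written with a space after the parenthesis is missed, so the pattern is best treated as a quick first pass.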
The fact that such casts are so hard to search for is one of the reasons new-style casts were introduced in the first place. And if your code is working, this seems like a rather pointless bit of refactoring - I'd simply change them to new-style casts whenever I modified the surrounding code.
Having said that, the fact that you have C-style casts at all in C++ code would indicate problems with the code which should be fixed - I wouldn't just do a global substitution, even if that were possible.
The Offload C++ compiler supports options to report all such casts as compile-time errors, and to restrict the semantics of such casts to a safer equivalence with static_cast.
The relevant options are:
-cp_nocstylecasts
The compiler will issue an error on all C-style casts. C-style casts in C++ code can potentially be unsafe and lead to undesired or undefined behaviour (for example casting pointers to unrelated struct/class types). This option is useful for refactoring to find all those casts and replace them with safer C++ casts such as static_cast.
-cp_c2staticcasts
The compiler applies the more restricted semantics of C++ static_cast to C-style casts. Compiling code with this option switched on ensures that C-style casts are at least as safe as C++ static_casts.
This option is useful if existing code has a large number of C-style casts and refactoring each cast into C++ casts would be too much effort.
A tool that can analyze C++ source code accurately and carry out automated custom changes (e.g., your cast replacement) is the DMS Software Reengineering Toolkit.
DMS has a full C++ parser, builds ASTs and symbol tables, and can thus navigate your code to reliably find C style casts. By using pattern-directed matches and rewrites, you can provide a set of rules that would convert all such C-style casts into your desired C++ equivalents.
DMS has been used to carry out massive automated C++ reengineering tasks for Boeing and General Dynamics, each involving thousands of files.
One issue with C-style casts is that, since they rely on parentheses which are way overloaded, they're not trivial to spot. Still, a regex such as (e.g. in Python syntax):
r'\(\s*\w+\s*\)'
is a start -- it matches a single identifier in parentheses with optional whitespace inside the parentheses. But of course that won't catch, e.g., (void*) casts -- to get trailing asterisks as well, use:
r'\(\s*\w+[\s*]*\)'
You could also start with an optional const to broaden the net still further, etc, etc.
Once you have a good RE, many tools (from grep to vim, from awk to sed, plus perl, python, ruby, etc.) let you apply it to identify all of its matches in your source.
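As one possible way to apply the second pattern, here is a short Python sketch that reports each candidate with its line number (the function name cast_candidates and the sample input are assumptions for illustration, not part of any tool mentioned here):

```python
import re

# Second pattern from the answer above: one identifier in parentheses,
# optionally followed by asterisks, with whitespace allowed inside.
CAST_CANDIDATE = re.compile(r"\(\s*\w+[\s*]*\)")

def cast_candidates(source: str):
    """Yield (line_number, matched_text) for every candidate in source."""
    for number, line in enumerate(source.splitlines(), start=1):
        for match in CAST_CANDIDATE.finditer(line):
            yield number, match.group(0)

sample = """\
Foo foo = (Foo) bar;
void *p = (void *) q;
int n = compute(value);
"""

hits = list(cast_candidates(sample))
for number, text in hits:
    print(number, text)
```

The third hit, (value), is a function-call argument rather than a cast, so the results still need a manual review.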
If you use some kind of Hungarian-style notation (e.g. iInteger, pPointer etc.) then you can search for e.g. )p and ) p and so on.
It should be possible to find all those places in reasonable time even for a large code base.
I already answered once with a description of a tool that will find and change all the casts if you want it to.
If all you want to do is find such casts, there's another tool that will do this easily, and in fact is the extreme generalization of all the "regular expression" suggestions made here. That is the SD Source Code Search Engine. This tool enables one to search large code bases in terms of the language elements that make up each language. It provides a GUI allowing you to enter queries, see individual hits, and show the file text at the hit point with one mouse click. One more click and you can be in your editor [for many editors] on a file. The tool will also record a list of hits in context so you can revisit them later.
In your case, the following search engine query is likely to get most of the casts:
'(' I ')' | '(' I ... '*' ')'
which means, find a sequence of tokens, first being (, second being any identifier, third being ')', or a similar sequence involving something that ends in '*'.
You don't specify any whitespace management, as the tool understands the language whitespace rules; it will even ignore a comment in the middle of a cast and still match the above.
[I'm the CTO at the company that supplies this.]
I used this regular expression in Visual Studio (2010) Find in files search box: :i\):i
Thanks to sth for the inspiration (his post)

Regular expression to match (C) function calls

Does anyone have a regular expression for matching function calls in C programs?
Since C isn't a regular language and C function calls can contain arbitrary argument expressions, I fear the answer to your question is “no.”
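To make the limitation concrete, here is a sketch of a naive heuristic in Python: an identifier followed by an opening parenthesis, minus a few keywords. The pattern, the keyword list, and the name call_like_names are all assumptions chosen for illustration, not a recommended solution.

```python
import re

# Naive heuristic: an identifier directly followed by "(".
CALL_LIKE = re.compile(r"\b([A-Za-z_]\w*)\s*\(")

# Keywords that look like calls but are not (deliberately incomplete).
C_KEYWORDS = {"if", "while", "for", "switch", "return", "sizeof"}

def call_like_names(source: str):
    """Return identifiers that are followed by '(' and are not keywords."""
    return [
        match.group(1)
        for match in CALL_LIKE.finditer(source)
        if match.group(1) not in C_KEYWORDS
    ]

sample = 'int main(void) { if (x) printf("%d", f(g(1))); return 0; }'
print(call_like_names(sample))
```

The result includes main, which is a definition rather than a call, and the same pattern would also fire on declarations, macros, and text inside comments or string literals. It cannot tell where a call's argument list ends either, which is why a parser-based approach is more reliable.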
After a bit more searching I decided to let the compiler do the hard work.
Get the compiler to produce a Register Transfer Language (RTL) file using the -dr option of gcc.
The resulting RTL file has the suffix .rtl or .expand.
This file is far easier to parse, as the function calls are already identified.
I doubt you can find a regex that matches all (and only) the function calls in some source code. But maybe you could use a tool like Understand, or your IDE, to browse your code.