What are good Perl Pattern-matching / Regex Modules? - regex

I've been having a need to do a lot of regex / pattern-matching stuff lately and, in looking at different examples / forum posts from my web searches it seems people sometimes mention that perl has good modules to help in simplifying pattern matching / regex tasks, however they neglect to mention which ones are the best for this.. I have looked at CPAN for this but their site isn't very easy to navigate as I can't seem to search effectively by category.. any advice is appreciated.

Take a look at Regexp::Common
Also, look at YAPE::Regex::Explain and the web front end to it. Invaluable.

Might I suggest the perl6-esque Regexp::Grammars if you're doing anything really complex and need to write a grammar -- it is really awesome. I just used it to parse a few SQL commands for my perl postgresql shell: pgperlshell

A lot of the power of regular expressions is available natively in perl. Probably the best way for you to simplify your understanding of perl regular expressions is to read the excellent perl regex tutorial at http://perldoc.perl.org/perlretut.html

Related

Building a Powershell Regex

Link to Answer from another question
In the link above there is a regex construct I am trying to figure out what the -f in it is doing. Appreciate your help.
Also, where can I find other regex parameters and other tutorial about regex. Any links please.
Thanks for reading and responding.
If you are just interested in learning Regex in Powershell, lookup Mastering Powershell, and navigate to Chapter 14, Text and Regular Expressions. It's sorta bare bones, but it's a start.
there are 3 Regex Languages (An more I am not thinking of), and they are: PCRE/Javascript/Python. They regexs vary depending on which "Flavor" you use.
I suggest starting out by learning PCRE, because it is a very strong Flavor of Regex.
Here are some nice tutorials:
http://regex.learncodethehardway.org/book/
http://regexcrossword.com/

Is there a function to create a regex pattern from a string input?

I'm lousy at regular expressions but occasionally they're the only thing that's the right solution for a problem.
Is there something in the .NET framework that allows you to input an unencoded string and get a pattern from it? Which you could then modify as required?
e.g. I want to remove a CDATA section that contains a file from some XML but I can't work out what the right pattern is for <![CDATA[hugepileofrandombinarydataherethatalsoneedstogo]]> and I don't want to ask for help each time I'm stuck on a regex pattern.
Such tools exist, google by "regex generator".
But, as suggested in comments, better learn regex. Simple patterns are easy. Something like <!\[.*?]]>
in your case.
There are Regex Design tools like expresso...
http://www.ultrapico.com/expresso.htm
It's not perfect but as there is no suitable .Net component the text to regex page at txt2re.com is the best I've seen for those people who occasionally need to build a regex to match a string but don't have the time to relearn regex each time they want to use one.

Regex - match a string not contain a 'semi-word'

I tried to make regex syntax for that but I failed.
I have 2 variables
PlayerInfo[playerid][pLevel]
and
Character[playerid]
and I want to catch only the second variable,I mean only the world what don't contain PlayerInfo, but cointains [playerid]
"(\S+)\[playerid\]" cath both words and (\S+[^PlayerInfo])\[playerid\] jump on some variables- they contais p,l,a,y ...
I need to replace in notepad++,all variables like Text[playerid] to ExClass [playerid][Text]
Couple Pluasible solutions.
List item
Notepad has a plugin called python script. Running regex from there
gives full regex functionality, the python version anyway, and a lot
of powerful potential beyond that. And I use the online python regex tester to help out.
RegRexReplace plugin helps create regex plugins in Notepad++, so when you do hit a limitation, you find out a lot quicker.
Or of course default to your alternate editor (I'm assuming you have
one?) or this online regex tool is absolutely amazing. You
can perform the action on the text online as well.
(I'd try to build a regex for you, but I'm a bit lost as to what you're looking for. Unless the Ivo Abeloos got it. If you're still coming up short, maybe a code example along with values displayed?)
Good luck!
It seems that Notepad++ support negative lookbehind since v6.
In notepad++ you could try to replace (.+)\[(.+)\] with ExClass\[\2\]\[\1\]
Try to use negative lookbehind.
(?<!PlayerInfo)\[playerid\]
EDIT: unfortunately notepad++ does not support negative lookbehind.
I tried to make a workaround based on the following naive idea:
(.[^o]|[^f]o)[playerid]
But this expression does not work either. Notepad++ seems to fail in alternative operator. Thus the answer is: it is impossible to do exactly what you want. Try to solve the problem in other way or use alternative tool.

How to verify regexp patterns?

What are the common ways to verify the given regex pattern works well of the given scenario and check the results ?
I would like to know in general , not in the particular programming language and what is the best way to learn about writing regular expression ?
Books: Mastering Regular Expressions is the definitive guide to regular expressions. The Regular Expressions Cookbook is said to be lighter and more easily applicable.
Sites: Friedel's companion site is a good start. Regexlib is a source of idioms and patterns.
Software: RegexBuddy is a good, per pay, regex verifier.
I've used this resource when learning: http://www.regular-expressions.info/ and found myself going back there whenever there was something I needed to remember. It's very useful for learning and covers the basics very well. They also have various links to programs which can be used to verify regular expressions.
This is not a "real" verification, but RegexBuddy allows you to verify that your regex does what you expect it to do on any sample data you provide. It also translates the regex into an English description that can help to figure out mistakes. Plus, it knows all major regex flavors and can translate regexes between them.
For testing regular expression you can use RegEx Test tools like one below :
http://www.regextester.com/
To know more about how to learn regular expressions please check following SO threads :
Learning Regular Expressions
How to master Regular Expressions?
https://stackoverflow.com/questions/465119/how-do-i-learn-regular-expressions-closed
RAD Rexexp designer is a great tool
Set up an automated test using your tools of choice (because regex implementations vary from language to language and library to library) which applies the regex to a variety of both matching and non-matching inputs to verify that you get the correct results.
While RegexBuddy and the like may be helpful for initially creating the regex (or may not; I've never used them), you will still need to maintain it, just like any other code. When that time comes, it's vastly preferable to have a test script that will run through all your old test inputs (plus the new ones which created the need for the change) in a matter of seconds rather than having to sit on a website for tens of minutes, if not hours, trying to remember all your test inputs and manually re-run them to make sure you didn't break anything.

Regular expression extraction in text editors

I'm kind of new to programming, so forgive me if this is terribly obvious (which would be welcome news).
I do a fair amount of PHP development in my free time using pregmatch and writing most of my expressions using the free (open source?) Regex Tester.
However frequently I find myself wanting to simply quickly extract something and the only way I know to do it is to write my expression and then script it, which is probably laughable, but welcome to my reality. :-)
What I'd like is something like a simple text editor that I can feed my expression to (given a file or a buffer full of pasted text) and have it parse the expression and return a document with only the results.
What I find is usually regex search/replace functions, as in Notepad++ I can easily find (and replace) all instances using an expression, but I simply don't know how to only extract it...
And it's probably terribly obvious, can expression match only the inverse? Then I could use something like (just the expression I'm currently working on):
([^<]*)
And replace everything that doesn't match with nothing. But I'm sure this is something common and simple, I'd really appreciate any poniters.
FWIW I know grep and I could do it using that, but I'm hoping their are better gui'ified solution I'm simply ignorant of.
Thanks.
Zach
What I was hoping for would be something that worked in a more standard set of gui tools (ie, the tools I might already be using). I appreciate all the responses, but using perl or vi or grep is what I was hoping to avoid, otherwise I would have just scripted it myself (of course I did) since their all relatively powerful, low-level tools.
Maybe I wasn't clear enough. As a senior systems administrator the cli tools are familiar to me, I'm quite fond of them. Working at home however I find most of my time is spent in a gui, like Netbeans or Notepad++. I just figure there would be a simple way to achieve the regex based data extraction using those tools (since in these cases I'd already be using them).
Something vaguely like what I was referring to would be this which will take aa expression on the first line and a url on the second line and then extract (return) the data.
It's ugly (I'll take it down after tonight since it's probably riddled with problems).
Anyway, thanks for your responses. I appreciate it.
If you want a text editor with good regex support, I highly recommend Vim. Vim's regex engine is quite powerful and is well-integrated into the editor. e.g.
:g!/regex/d
This says to delete every line in your buffer which doesn't match pattern regex.
:g/regex/s/another_regex/replacement/g
This says on every line that matches regex, do another search/replace to replace text matching another_regex with replacement.
If you want to use commandline grep or a Perl/Ruby/Python/PHP one-liner any other tool, you can filter the current buffer's text through that tool and update the buffer to reflect the results:
:%!grep regex
:%!perl -nle 'print if /regex/'
Have you tried nregex.com ?
http://www.nregex.com/nregex/default.aspx
There's a plugin for Netbeans here, but development looks stalled:
http://wiki.netbeans.org/Regex
http://wiki.netbeans.org/RegularExpressionsModuleProposal
You might also try The Regulator:
http://sourceforge.net/projects/regulator/
Most regex engines will allow you to match the opposite of the regex.
Usually with the ! operator.
I know grep has been mentioned, and you don't want a cli tool, but I think ack deserves to be mentioned.
ack is a tool like grep, aimed at
programmers with large trees of
heterogeneous source code.
ack is written purely in Perl, and
takes advantage of the power of Perl's
regular expressions.
A good text editor can be used to perform the actions you are describing. I use EditPadPro for search and replace functionality and it has some other nice feaures including code coloring for most major formats. The search panel functionality includes a regular expression mode that allows you to input a regex then search for the first instance which identifies if your expression matches the appropriate information then gives you the option to replace either iteratively or all instances.
http://www.editpadpro.com
My suggestion is grep, and cygwin if you're stuck on a Windows box.
echo "text" | grep ([^<]*)
OR
cat filename | grep ([^<]*)
What I'd like is something like a
simple text editor that I can feed my
expression to (given a file or a
buffer full of pasted text) and have
it parse the expression and return a
document with only the results.
You have just described grep. This is exactly what grep does. What's wrong with it?