Need simple regex for LaTeX - regex

In my LaTeX files, I have literally thousands of occurrences of the following construct:
$\displaystyle{...math goes here...}$
I'd like to replace these with
\mymath{...math goes here...}
Note that the $'s disappear, but the curly braces remain---if not for the trailing $, this would be a basic find-and-replace. If only I knew any regex, I'm sure it would handle this with no problem. What's the regex I need to make this happen?
Many thanks in advance.
Edit: Some issues and questions have arisen, so let me clarify:
Yes, $\displaystyle{ ... }$ can occur multiple times on the same line.
No, nested }$'s (such as $\displaystyle{...{more math}$...}$) cannot occur. I mean, I suppose it could if you put it in an \mbox or something, but I can't imagine why anyone would ever do that inside a $\displaystyle{}$ construct, the purpose of which is to display math inline with text. At any rate, it's not something I've ever done or am likely to do.
I tried using the perl suggestion, but while the shell raised no objections, the files remained unaffected.
I tried using the sed suggestion, but the shell objected to an "unexpected token near `('". I've never used sed before (and "man sed" was obtuse), but here's what I did: navigated to a directory containing .tex files and typed "sed s/\$\\displaystyle({[^}]+})\$/\\mymath\1/g *.tex". No luck. How do I use sed to do what I want?
Again, many many thanks for all offered help.

Be very careful when using REGEX to do this type of substitution
because the theoretical answer is that
REGEX is incapable of matching this type of pattern.
REGEX is a finite state machine; it does not incorporate a pushdown stack so
it cannot work with nested structures such as "{...math goes here...}" if
there is any possibility of nesting such that something like "{more math}$"
can appear as part of a "math goes here" string. You need at a minimum a
context free grammar to describe this type of construct - a state machine
just doesn't cut it!
Now having said that, you may still be able to pull this off using REGEX
provided none of your "math goes here" strings are more complex than
what a state machine can handle.
Give it a shot.... but beware of the results!
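If nested braces such as \frac{a}{b} do occur inside the math, one way around the limitation described above is to count braces by hand instead of relying on a regex. The following is only a minimal sketch in Python under that assumption; the function name convert_display_math is made up for illustration, and escaped braces (\{ and \}) are not treated specially.

```python
def convert_display_math(text: str) -> str:
    """Rewrite $\\displaystyle{...}$ as \\mymath{...}, counting braces so nested groups survive."""
    opener = "$\\displaystyle{"
    out = []
    i = 0
    while True:
        start = text.find(opener, i)
        if start == -1:
            out.append(text[i:])
            break
        # Walk forward from the opening brace, tracking nesting depth.
        # Assumption: escaped braces (\{ and \}) do not occur in the math.
        depth = 1
        j = start + len(opener)
        while j < len(text) and depth > 0:
            if text[j] == "{":
                depth += 1
            elif text[j] == "}":
                depth -= 1
            j += 1
        # j now points just past the matching closing brace, if one was found.
        if depth == 0 and text[j:j + 1] == "$":
            body = text[start + len(opener):j - 1]
            out.append(text[i:start])
            out.append("\\mymath{" + body + "}")
            i = j + 1
        else:
            # Unbalanced, or not followed by '$': leave this occurrence untouched.
            out.append(text[i:start + len(opener)])
            i = start + len(opener)
    return "".join(out)
```

Called on a string containing $\displaystyle{\frac{a}{b}}$, this keeps the inner braces intact, which a pattern like {[^}]+} cannot do.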

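sed (quote the expression so the shell does not interpret it, and use -E so the parentheses form a capture group; file.tex is a placeholder and the result goes to standard output):
sed -E 's/\$\\displaystyle(\{[^}]+\})\$/\\mymath\1/g' file.tex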

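The leading $ has to be escaped, otherwise Perl reads $\ as a variable and the pattern never matches:
perl -pi -e 's/\$\\displaystyle(\{.*)\}\$/\\mymath$1}/g' *.tex
If several }$ occur on the same line, you need the non-greedy version:
perl -pi -e 's/\$\\displaystyle(\{.*?)\}\$/\\mymath$1}/g' *.tex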
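To see why the greedy and non-greedy versions differ, here is a small Python sketch of the same substitution applied to a line with two occurrences. The sample line is invented for illustration; the patterns mirror the idea of the Perl one-liners rather than reproduce them exactly.

```python
import re

# Hypothetical sample line with two constructs on it.
line = r"Let $\displaystyle{x+1}$ and $\displaystyle{y^2}$ be given."

# Greedy: .* runs to the LAST }$ on the line, swallowing the text in between.
greedy = re.sub(r"\$\\displaystyle\{(.*)\}\$", r"\\mymath{\1}", line)

# Non-greedy: .*? stops at the FIRST }$ after each opener.
lazy = re.sub(r"\$\\displaystyle\{(.*?)\}\$", r"\\mymath{\1}", line)

print(greedy)
print(lazy)
```

The greedy result has a single \mymath{ wrapped around everything between the first opener and the last }$, while the non-greedy result converts each occurrence separately.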

Related

UNIX: How would I grep in a script using a variable as a search parameter for a file?

Before I start: this isn't exactly what it seems, and I did search the web for a while before coming here. Basically I have a script where the user passes in a string, which is stored in a variable. I then have to take that word and search a dictionary file for all the subwords that could be made from it. The problem I am having is that I need to make sure the words are at least 4 characters long. I do not have the best grasp of regular expressions. I'm aware of the techniques you can use; I just can't always piece them together logically. I will show you the line of code and explain why I think it should be this way. Then, could someone correct my logic? I am not looking for someone to send me the working line of code, but rather to correct my logic so I can understand better and derive the answer on my own.
words=$(grep -iE '(["$text"]{4,})' /usr/dict/words)
echo "$words"
For example if I pass in string college I should get output like
cell
cello
clee
cleg
etc.....
I am storing the command in another variable to echo. I am not sure why exactly, It just seems from what I saw online most people were rather fond of this. Using grep with -i for ignore case and -E for regular expression or (egrep) I believe the expression needs to be enclosed in single quote parenthesis for expressions. $text is the variable I stored the users input in. I know $ usually signifies the ending in and [] is a range and "" makes it read the variable rather than print what is there. Then {4,} meaning four or more characters. then the last part is the path to the file. Any input would be appreciated and again, I do not like being spoon fed answers it's an easy way to learn nothing. I would just like corrections on my logic if all possible. Thanks everyone!!
If by "subwords" you mean permutations of its letters, then your command is fine except for the quotes. Unfortunately you have to do it like this:
words=$(grep -iE '(['"$text"']{4,})' /usr/dict/words)
This way you pass to grep the single quoted string so that the shell doesn't interpret its special symbols. But at the same time you have to expand your $text var, thus you have to make a gap inside your single-quoted string, and in that gap place your variable in double quotes.
Hope I didn't spoil it for you.
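For comparison, the same idea (build a character class from the user's input and require at least four characters) can be sketched in Python. The function name candidate_words and the word list are hypothetical. Unlike the unanchored grep, this version matches the whole word, which is an assumption about what "subwords" should mean. It also does not check how many times each letter is used, and it expects a non-empty input string.

```python
import re

def candidate_words(text, words, min_len=4):
    # re.escape guards against input such as "]" or "^" that is special inside [...].
    char_class = "[" + re.escape(text) + "]"
    # fullmatch ties the pattern to the entire word, unlike the unanchored grep.
    pattern = re.compile(char_class + "{" + str(min_len) + ",}", re.IGNORECASE)
    return [w for w in words if pattern.fullmatch(w)]

# Hypothetical stand-in for the dictionary file.
sample = ["cell", "cello", "clee", "cleg", "cat", "col", "colleges"]
print(candidate_words("college", sample))
```

Here "cat" and "colleges" are rejected because they contain letters outside the input, and "col" is rejected for being shorter than four characters.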

underscore to camelCase RegEx

Our standards have changed and I want to do a 'find and replace' in, say, Dreamweaver (it allows for RegEx) or Visual Studio 2010, which we just got (if it allows searching by RegEx), to find all the underscores and camelCase the names.
What would be the RegEx to do that?
RegEx baffles me. I definitely need to do some studying.
Thanks in advance!
Update: A little more info - I'm searching within my html,aspx,cfm or css documents for any string that contains an underscore and replacing it with the following letter capitalized.
I had this problem, but I also needed to handle converting fields like gap_in_cover_period_d_last_5_yr into gapInCoverPeriodDLast, and found that 9 out of 10 other sed expressions don't like one-letter words or numbers.
So to answer the question, use sed. The s command in sed is the equivalent of the :s command in vim, so the expression below is a substitute command (i.e. s/.../g; the trailing c flag is vim's confirm option and is not accepted by sed).
This seemed to work, although I did have to run it twice (word_a_word becomes wordA_word on the first pass and wordAWord on the second; sed backward commands are just too magical for my muggle blood):
s/\([A-Za-z0-9]\+\)_\([0-9a-z]\)/\1\U\2/gc
I recently had to approach a similar situation you asked about. Here is a regex I've been using in VIM which does the job for me:
%s/_\([a-zA-Z]\)/\u\1/g
As an example:
this_is_a_test becomes thisIsATest
I don't think there is a good way to do this purely with regex. Searching for _ characters is easy, something like ._. should work to find an _ with something on either side, but you need a more powerful scripting system to change the case of the character following the _. I suggest perl :)
I have a solution in PHP
preg_replace_callback('/_(.)/', function ($m) { return strtoupper($m[1]); }, $str); // replaces the /e modifier, which was removed in PHP 7
There may be a more elegant selector criteria but I wanted to keep it simple.
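The answers above all rely on the same trick: match the underscore together with the character after it, then emit that character uppercased. As a rough sketch outside any particular editor, the same thing in Python (the function name to_camel_case is chosen for illustration) handles one-letter words and digits in a single pass:

```python
import re

def to_camel_case(name: str) -> str:
    # Replace each underscore plus the following letter or digit
    # with that character uppercased (digits are unchanged by upper()).
    return re.sub(r"_([a-zA-Z0-9])", lambda m: m.group(1).upper(), name)

print(to_camel_case("this_is_a_test"))  # thisIsATest
print(to_camel_case("word_a_word"))     # wordAWord
```

Because the pattern only consumes the underscore and one following character, adjacent segments such as _a_ do not interfere with each other, so no second pass is needed.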

perl regex problem -- &amp; in yahoo finance page

I found an old perl hack on the O'Reilly site http://oreilly.com/pub/h/1041 and decided to check it out. After a little fiddling around it started to run but the regex are out of date.
Here is the question: with this
/<a href="\/q\/op\?s=(.*?)\&m=(.*?)">/
as the first line of regex, what needs to be modified to make the regex function again? The following are snippets from
http://finance.yahoo.com/q/op?s=FISV
<a href="/q/op?s=FISV&amp;k=55.000000">
and
<a href="/q/os?s=FISV&amp;m=2011-04-15">
.
The original hack is dated 2004 and option symbols looked like this (FQVAH or FQVFF) back then instead of fisv110416c00060000 for a call option and fisv110416p00090000 for a put option. First thing I did to get it going was to modify all instances of $url to $curl because until the name was changed the symbol was not being passed to yahoo for lookup. The &amp is giving me the most trouble. If this is found to run without modification I would be very surprised and would very much like to know what system and perl -V is installed. SLES 10 and perl 5.8.0 is what I am currently using.
Any suggestions would be helpful. It could be a useful script to anyone who is serious about protecting themselves from a falling equity market.
Thanks,
robm
I'm not 100% sure what you're asking, but if I'm understanding, you want a regex that will capture "fisv110416c00060000" and tell you the first few letters, whether it's a call or a put, and the amount?
If so, you're looking for something like:
/([a-z]+)(\d+)([cp])(\d+)/
That should capture the following for the first example
$1 = "fisv"
$2 = 110416
$3 = c
$4 = 00060000
The original regex was very specific to that html string. You can include the beginning bits of it if you need to use it to check that the entire string is there as well. Of course, make your regex as tight as possible to avoid over-matches and wasted time pattern matching. I'm just not sure the exact pattern you're trying to match (ie: is it always "fisv"?).
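As a quick check of the capture groups listed above, here is the same pattern in Python. The function name parse_option_symbol and the field names are made up for illustration, and reading the second group as an expiry date and the last as a strike is an assumption based on the question.

```python
import re

# Same four groups as the Perl pattern: letters, digits, c or p, digits.
OPTION_SYMBOL = re.compile(r"([a-z]+)(\d+)([cp])(\d+)")

def parse_option_symbol(symbol: str):
    # fullmatch requires the whole symbol to fit the pattern, which is
    # tighter than an unanchored search.
    m = OPTION_SYMBOL.fullmatch(symbol)
    if m is None:
        return None
    root, expiry, kind, strike = m.groups()
    return {"root": root, "expiry": expiry, "kind": kind, "strike": strike}

print(parse_option_symbol("fisv110416c00060000"))
print(parse_option_symbol("fisv110416p00090000"))
```

All groups come back as strings, so the leading zeros in the last group are preserved.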
You should either unescape the HTML first, which would turn the &amp; into a &, or just change the regex, like this:
/<a href="\/q\/os\?s=(.*?)\&(?:amp;)?m=(.*?)">/
To match both types of urls:
/<a href="\/q\/o[ps]\?s=(.*?)\&(?:amp;)?[mk]=(.*?)">/
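Both options from this answer can be sketched in Python: either unescape the HTML before matching, or make the amp; part optional in the pattern. The sample links are the two snippets from the question; the name LINK is just illustrative.

```python
import html
import re

# Optional "amp;" after the ampersand, and either op/os and m/k, as in the answer.
LINK = re.compile(r'<a href="/q/o[ps]\?s=(.*?)&(?:amp;)?[mk]=(.*?)">')

escaped = '<a href="/q/op?s=FISV&amp;k=55.000000">'
plain = '<a href="/q/os?s=FISV&m=2011-04-15">'

# Option 1: the tolerant pattern matches the link with or without the entity.
print(LINK.search(escaped).groups())
print(LINK.search(plain).groups())

# Option 2: unescape first, after which the entity is gone.
print(html.unescape(escaped))
```

Either approach yields the symbol in the first group and the strike or date in the second.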

Regular expression extraction in text editors

I'm kind of new to programming, so forgive me if this is terribly obvious (which would be welcome news).
I do a fair amount of PHP development in my free time using preg_match and writing most of my expressions using the free (open source?) Regex Tester.
However frequently I find myself wanting to simply quickly extract something and the only way I know to do it is to write my expression and then script it, which is probably laughable, but welcome to my reality. :-)
What I'd like is something like a simple text editor that I can feed my expression to (given a file or a buffer full of pasted text) and have it parse the expression and return a document with only the results.
What I find is usually regex search/replace functions, as in Notepad++ I can easily find (and replace) all instances using an expression, but I simply don't know how to only extract it...
And this is probably terribly obvious, but can an expression match only the inverse? Then I could use something like (just the expression I'm currently working on):
([^<]*)
And replace everything that doesn't match with nothing. But I'm sure this is something common and simple, I'd really appreciate any pointers.
FWIW I know grep and I could do it using that, but I'm hoping there is a better GUI-ified solution I'm simply ignorant of.
Thanks.
Zach
What I was hoping for would be something that worked in a more standard set of GUI tools (i.e., the tools I might already be using). I appreciate all the responses, but using perl or vi or grep is what I was hoping to avoid, otherwise I would have just scripted it myself (of course I did), since they're all relatively powerful, low-level tools.
Maybe I wasn't clear enough. As a senior systems administrator the cli tools are familiar to me, I'm quite fond of them. Working at home however I find most of my time is spent in a gui, like Netbeans or Notepad++. I just figure there would be a simple way to achieve the regex based data extraction using those tools (since in these cases I'd already be using them).
Something vaguely like what I was referring to would be this, which takes an expression on the first line and a URL on the second line and then extracts (returns) the data.
It's ugly (I'll take it down after tonight since it's probably riddled with problems).
Anyway, thanks for your responses. I appreciate it.
If you want a text editor with good regex support, I highly recommend Vim. Vim's regex engine is quite powerful and is well-integrated into the editor. e.g.
:g!/regex/d
This says to delete every line in your buffer which doesn't match pattern regex.
:g/regex/s/another_regex/replacement/g
This says on every line that matches regex, do another search/replace to replace text matching another_regex with replacement.
If you want to use command-line grep, a Perl/Ruby/Python/PHP one-liner, or any other tool, you can filter the current buffer's text through that tool and update the buffer to reflect the results:
:%!grep regex
:%!perl -nle 'print if /regex/'
Have you tried nregex.com ?
http://www.nregex.com/nregex/default.aspx
There's a plugin for Netbeans here, but development looks stalled:
http://wiki.netbeans.org/Regex
http://wiki.netbeans.org/RegularExpressionsModuleProposal
You might also try The Regulator:
http://sourceforge.net/projects/regulator/
Most regex tools will allow you to match the opposite of the regex,
usually with a negation switch or operator (for example grep -v, Vim's :g!, or a negative lookahead such as (?!...)).
I know grep has been mentioned, and you don't want a cli tool, but I think ack deserves to be mentioned.
ack is a tool like grep, aimed at
programmers with large trees of
heterogeneous source code.
ack is written purely in Perl, and
takes advantage of the power of Perl's
regular expressions.
A good text editor can be used to perform the actions you are describing. I use EditPadPro for search and replace, and it has some other nice features, including code coloring for most major formats. The search panel includes a regular expression mode that lets you input a regex and search for the first instance, which shows whether your expression matches the appropriate information, and then gives you the option to replace either iteratively or all instances at once.
http://www.editpadpro.com
My suggestion is grep, and cygwin if you're stuck on a Windows box.
echo "text" | grep -E '([^<]*)'
OR
cat filename | grep -E '([^<]*)'
What I'd like is something like a
simple text editor that I can feed my
expression to (given a file or a
buffer full of pasted text) and have
it parse the expression and return a
document with only the results.
You have just described grep. This is exactly what grep does. What's wrong with it?
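For the "return a document with only the results" part of the question, the behaviour being asked for is roughly what grep -o does: print each match rather than each matching line. A minimal sketch of that in Python, with a made-up function name extract_matches, could look like this:

```python
import re

def extract_matches(pattern: str, text: str) -> str:
    # finditer yields every non-overlapping match; group(0) is the whole matched text.
    matches = (m.group(0) for m in re.finditer(pattern, text))
    # Skip empty matches, which patterns such as [^<]* produce between hits.
    return "\n".join(s for s in matches if s)

# Hypothetical pasted buffer.
print(extract_matches(r"\d+", "a1 b22\nc333"))
```

Pasting text into a string and calling this with the expression under test gives one match per line, with everything that did not match dropped.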

Regex Search and Replace Program

Is there a simple and lightweight program to search over a text file and replace a string with regex?
For searching: grep - simple and fast. Included with Linux, here's a Windows version, not sure about Mac.
For replacing: sed. Here's a Windows version, not sure about Mac.
Of course, if you want to actually open up a file and see its contents while you search and replace, you can use emacs for that. Or ConTEXT. Or vim. Or what have you. ;)
See also this question.
Perl excels at this, with its -i, -n, -p and -e switches. See the slides from my talk Field Guide To The Perl Command Line Switches for examples.
Others have mentioned sed and awk, and it's no surprise that Perl was inspired by them. However, Perl may well be easier to get and install for you and/or your users.
There's also sed, which is a useful tool to learn the basics of - great for doing quick regex based substitutions.
Quick example, to change "foo" to "bar" in input.txt ...
sed -e 's/foo/bar/g' input.txt > output.txt
Many decent text editors have the option as well, vim, emacs, EditPlus and so on.
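If neither sed nor a suitable editor is at hand, the same foo-to-bar substitution can be sketched in a few lines of Python. The function name replace_in_file is hypothetical, and the file names are the ones from the sed example.

```python
import re
from pathlib import Path

def replace_in_file(src, dst, pattern, replacement):
    # Read the whole input, substitute every match, write the result elsewhere.
    text = Path(src).read_text()
    new_text, count = re.subn(pattern, replacement, text)
    Path(dst).write_text(new_text)
    # The number of substitutions made is returned for a quick sanity check.
    return count

# Usage mirroring the sed example above:
# replace_in_file("input.txt", "output.txt", r"foo", "bar")
```

Writing to a separate output file, as the sed example does, leaves the original untouched until the result has been checked.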
sed or awk. I recommend the book sed&awk to master the subject or the booklet sed&awk pocket reference for a quick reference. Of course mastering regular expressions is a must...
You didn't mention what platform you're using... If you are interested in a relatively simple GUI tool, there's regexxer. Otherwise, the commandline tools such as sed that were mentioned earlier can be very useful.
It depends if you're dealing with one or many files. At the risk of being pilloried, I'm assuming you're using Windows because you didn't specify a platform.
For one file at a time, Notepad2 does the trick and is extremely fast, lightweight and portable.
For search/replace over multiple files at once, try Agent Ransack.
Try WildGem: http://www.skytopia.com/software/wildgem
I'm the creator. Small, super-fast, portable and self-contained. You can use Regex, but it also has its own simple language syntax to make queries much easier in theory.
I quote:
Unlike similar programs, WildGem is fast with a dual split display, and updates or highlights matches as you type in realtime. A unique colour coded syntax allows you to easily find/replace text without worrying about having to escape special symbols.
Not knowing the platform, I'd say the ad that popped up on this page might be appropriate: PowerGREP. Don't know anything about it, but it sounds similar to what you're looking for.
Use emacs or xemacs. It has a perfect regexp replacement function. You can even use constructions like \1 (or \2 or \3) in your replacement to get back a matched expression that was wrapped in ( ). To prevent a vi-emacs clash: vi has similar constructions. I'm not sure of any modern editors that support this functionality.
Tip: Try out a simple replacement first; it can be a bit unclear, as you might end up having to add '\' to escape the special RegExp constructions...