Regex to express an operation in latex - regex

Hello, i want to formulate a regex expression outputted by matlab instruction solve to express an arithmetic operation in latex symboles as the example which follows:
(a+b^(c-d))/b -> \frac{(a+b^{(c-d)})}{b}
allowed input patterns:
/+-*^\w\s()
allowed output patterns:
+-*^\w\s(){}
about division, This is what i have tried so far
The caught expressions are stored in variables{division,numerator,denominator}
about exponentiation, I v tried This
Unfortunately,I found my self faced to couple of problems, one of them, is that my matlab version doesnt accept this kind of recursive regex. but I could implement it as iterative function:
a='^(dfdf ^(sdf) )';b=' ';while(~strcmp(a,b))b=a;a=regexprep(a, '\^\((?<betweenbrackets>.*)\)', '\^{$<betweenbrackets>}');end
Could you advice me anyway else to do it for both exponentiation and division ?

If you have the Symbolic Math Toolbox, you can just say
latex(sym('(a+b^(c-d))/b'))
ans =
\frac{a + b^{c - d}}{b}

Related

Regex tool to capture numbers and calculate sum?

I'm looking for a tool (online or for OSx) that can capture numbers in a string using regex, and then sum up those numbers and display the sum (and ideally perform some other mathematical operations as well).
I know that for example RegExr can list all matches, but it can't really perform any operations on them.
Not sure if this question qualifies on StackOverflow, but I'm sure many developers out there have had the need to quickly and easily sum up some values in debug outputs (which is exactly what I need to do), so I'm throwing it out there anyway.
I could easily create my own tailor made tool to do it, but I'm sure it would benefit all of us (me and those who end up in this thread looking for the same tool) to reuse something that already exists instead of each of us inventing our own wheel.
This can be done using a short piece of Ruby code.
source = "zaw34r5vgfyt78uibv879gu"
regex_numbers = /[0-9]+/
matches = source.scan(regex_numbers).collect { |str| str.to_f }
result = matches.inject { |sum, x| sum + x }
puts result
You can run it using an online interpreter, such as REPL.IT, or locally using a Ruby interpreter.
The operation performed each time can be changed by modifying { |sum, x| sum + x }.
{ |product, x| product * x }

How to rewrite `sin(x)^2` to cos(2*x) form in Sympy

It is easy to obtain such rewrite in other CAS like Mathematica.
TrigReduce[Sin[x]^2]
(*1/2 (1 - Cos[2 x])*)
However, in Sympy, trigsimp with all methods tested returns sin(x)**2
trigsimp(sin(x)*sin(x),method='fu')
While dealing with a similar issue, reducing the order of sin(x)**6, I notice that sympy can reduce the order of sin(x)**n with n=2,3,4,5,... by using, rewrite, expand, and then rewrite, followed by simplify, as shown here:
expr = sin(x)**6
expr.rewrite(sin, exp).expand().rewrite(exp, sin).simplify()
this returns:
-15*cos(2*x)/32 + 3*cos(4*x)/16 - cos(6*x)/32 + 5/16
That works for every power similarly to what Mathematica will do.
On the other hand if you want to reduce sin(x)**2*cos(x) a similar strategy works. In that case you have to rewrite the cos and sin to exp and as before expand rewrite and simplify again as:
(sin(x)**2*cos(x)).rewrite(sin, exp).rewrite(cos, exp).expand().rewrite(exp, sin).simplify()
that returns:
cos(x)/4 - cos(3*x)/4
The full "fu" method tries many different combinations of transformations to find "the best" result.
The individual transforms used in the Fu-routines can be used to do targeted transformations. You will have to read the documentation to learn what the different functions do, but just running through the functions of the FU dictionary identifies TR8 as your workhorse here:
>>> for f in FU.keys():
... print("{}: {}".format(f, FU[f](sin(var('x'))**2)))
...
8<---
TR8 -cos(2*x)/2 + 1/2
TR1 sin(x)**2
8<---
Here is a silly way to get this job done.
trigsimp((sin(x)**2).rewrite(tan))
returns:
-cos(2*x)/2 + 1/2
also works for
trigsimp((sin(x)**3).rewrite(tan))
returns
3*sin(x)/4 - sin(3*x)/4
but not works for
trigsimp((sin(x)**2*cos(x)).rewrite(tan))
retruns
4*(-tan(x/2)**2 + 1)*cos(x/2)**6*tan(x/2)**2

Spot vars and functions with regex in mathematical formulas

I am trying to get an array of functions, and an array of variables in any mathematical formula in vanilla JS :
A sample :
1+3*9/cos(4+2*x/6+pol)+
BDLIRE(longueur2)+2*8+sin(2)
+g()
For getting functions I use : /([a-zA-Z]+)(?=[(])/gm
https://regex101.com/r/fT3iM7/1
For getting vars I tried : /([a-zA-Z]+[a-zA-Z0-9]?)(?!\()/gm
https://regex101.com/r/mG2fQ2/1
But as you can see it ignores last char of a match followed by a ( char
So i'm a bit stuck with my regex to match variables.
Thanks for any help :)
You cannot use RegEx to parse arithmetic expressions, maybe you can, but you will have a flaky implementation and writing it WILL make you go bonkers.
You need to look into parsers. Check out PEG.js, it is an easy to use framework for writing parsers that compiles into JS, so it IS "vanilla" (Whatever you mean by that...):
http://pegjs.org/

Notepad++ Regular Expression add up numbers

How do I sum or add a certain value to all those numbers? For example my goal is to increase all those numbers inside the "" with 100 but achieving that has been problematic. Basically just somehow sum the current number with +100.
I have the following lines
<devio1="875" devio2="7779" devio3="5635" devio4="154"/>
<devio1="765" devio2="74779" devio3="31535" devio4="544"/>
<devio1="4335" devio2="13" devio3="55635" devio4="1565"/>
By using this regular expression with Notepad++
<devio1="([0-9]+)" devio2="([0-9]+)" devio3="([0-9]+)" devio4="([0-9]+)"/>
I can find all the numbers inside the "" but I cannot find a way to add +100 to all of them. Can this task be achieved with Notepad++ using Regular Expressions?
That's not possible with the sole use of regular expressions in Notepad++. Unfortunately there's no way to perform calculations in the replacement pattern.
So the only way of accomplishing your task in Notepad++ is with the use of the Python Script plugin.
Install Python Script plugin from the Plugin Manager or from the official website.
Then go to Plugins > Python Script > New Script. Choose a filename for your new file (eg add_numbers.py) and copy the code that follows:
def calculate(match):
return 'devio%s="%s"' % (match.group(1), str(int(match.group(2))+100))
editor.rereplace('devio([0-9])="([0-9]+)"', calculate)
Run Plugins > Python Script > Scripts > add_numbers.py and your text will be transformed to:
<devio1="975" devio2="7879" devio3="5735" devio4="254"/>
<devio1="865" devio2="74879" devio3="31635" devio4="644"/>
<devio1="4435" devio2="113" devio3="55735" devio4="1665"/>
I'm not really familiar with notepad++ but for an algorithm, supposing you have a number abcd = a*1000 +b*100 + c*10 + d, then so long as b is in [0,8] you can just replace b by b+1. As for when b = 9 then you need to replace b with 0 and replace a with a+1 (and if a = 9 then you'd replace a by 10).
Noting this, you could then, for three and four digit numbers, say, apply the following regexes:
\([1-9]+\)0\([0-9]{2}\) -> \1 1\2,
\([1-9]+\)1\([0,9]{2}\) -> \1 2\2,
... -> ,
\([1-9]+\)8\([0-9]{2}\) -> \1 9\2,
and so on ... Noting that you also have to consider any a=9, b=9 integers, and larger integers; this suggests some sort of iteration with if statements covering the cases where the coefficients of 10^x (x>=2) are equal to 9. When you start actually coding this (or doing it by hand) you will begin to realize that doing this with a pure regex approach is going to be painful.
Regex doesn't support arithmentic and Notepad++ doesn't support any computation beyond regex, so you're stuck if you're limiting yourself to that tool. There are, of course, many other non-Notepad++ solutions, some of which are discussed in Math operations in regex.

Regular expression to search for Gadaffi [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 2 years ago.
Improve this question
I'm trying to search for the word Gadaffi, which can be spelled in many different ways. What's the best regular expression to search for this?
This is a list of 30 variants:
Gadaffi
Gadafi
Gadafy
Gaddafi
Gaddafy
Gaddhafi
Gadhafi
Gathafi
Ghadaffi
Ghadafi
Ghaddafi
Ghaddafy
Gheddafi
Kadaffi
Kadafi
Kaddafi
Kadhafi
Kazzafi
Khadaffy
Khadafy
Khaddafi
Qadafi
Qaddafi
Qadhafi
Qadhdhafi
Qadthafi
Qathafi
Quathafi
Qudhafi
Kad'afi
My best attempt so far is:
\b[KG]h?add?af?fi$\b
But I still seem to be missing some variants. Any suggestions?
Easy... (Qadaffi|Khadafy|Qadafi|...)... it's self-documented, maintainable, and assuming your regexp engine actually compiles regular expressions (rather than interpreting them), it will compile to the same DFA that a more obfuscated solution would.
Writing compact regular expressions is like using short variable names to speed up a program. It only helps if your compiler is brain-dead.
\b[KGQ]h?add?h?af?fi\b
Arabic transcription is (Wiki says) "Qaḏḏāfī", so maybe adding a Q. And one H ("Gadhafi", as the article (see below) mentions).
Btw, why is there a $ at the end of the regex?
Btw, nice article on the topic:
Gaddafi, Kadafi, or Qaddafi? Why is the Libyan leader’s name spelled so many different ways?.
EDIT
To match all the names in the article you've mentioned later, this should match them all. Let's just hope it won't match a lot of other stuff :D
\b(Kh?|Gh?|Qu?)[aeu](d['dt]?|t|zz|dhd)h?aff?[iy]\b
One interesting thing to note from your list of potential spellings is that there's only 3 Soundex values for the contained list (if you ignore the outlier 'Kazzafi')
G310, K310, Q310
Now, there are false positives in there ('Godby' also is G310), but by combining the limited metaphone hits as well, you can eliminate them.
<?
$soundexMatch = array('G310','K310','Q310');
$metaphoneMatch = array('KTF','KTHF','FTF','KHTF','K0F');
$text = "This is a big glob of text about Mr. Gaddafi. Even using compound-Khadafy terms in here, then we might find Mr Qudhafi to be matched fairly well. For example even with apostrophes sprinkled randomly like in Kad'afi, you won't find false positives matched like godfrey, or godby, or even kabbadi";
$wordArray = preg_split('/[\s,.;-]+/',$text);
foreach ($wordArray as $item){
$rate = in_array(soundex($item),$soundexMatch) + in_array(metaphone($item),$metaphoneMatch);
if ($rate > 1){
$matches[] = $item;
}
}
$pattern = implode("|",$matches);
$text = preg_replace("/($pattern)/","<b>$1</b>",$text);
echo $text;
?>
A few tweaks, and lets say some cyrillic transliteration, and you'll have a fairly robust solution.
Using CPAN module Regexp::Assemble:
#!/usr/bin/env perl
use Regexp::Assemble;
my $ra = Regexp::Assemble->new;
$ra->add($_) for qw(Gadaffi Gadafi Gadafy Gaddafi Gaddafy
Gaddhafi Gadhafi Gathafi Ghadaffi Ghadafi
Ghaddafi Ghaddafy Gheddafi Kadaffi Kadafi
Kaddafi Kadhafi Kazzafi Khadaffy Khadafy
Khaddafi Qadafi Qaddafi Qadhafi Qadhdhafi
Qadthafi Qathafi Quathafi Qudhafi Kad'afi);
say $ra->re;
This produces the following regular expression:
(?-xism:(?:G(?:a(?:d(?:d(?:af[iy]|hafi)|af(?:f?i|y)|hafi)|thafi)|h(?:ad(?:daf[iy]|af?fi)|eddafi))|K(?:a(?:d(?:['dh]a|af?)|zza)fi|had(?:af?fy|dafi))|Q(?:a(?:d(?:(?:(?:hd)?|t)h|d)?|th)|u(?:at|d)h)afi))
I think you're over complicating things here. The correct regex is as simple as:
\u0627\u0644\u0642\u0630\u0627\u0641\u064a
It matches the concatenation of the seven Arabic Unicode code points that forms the word القذافي (i.e. Gadaffi).
If you want to avoid matching things that no-one has used (ie avoid tending towards ".+") your best approach would be to create a regular expression that's just all the alternatives (eg. (Qadafi|Kadafi|...)) then compile that to a DFA, and then convert the DFA back into a regular expression. Assuming a moderately sensible implementation that would give you a "compressed" regular expression that's guaranteed not to contain unexpected variants.
If you've got a concrete listing of all 30 possibilities, just concatenate them all together with a bunch of "ors". Then you can be sure that it only matches the exact things you've listed, and no more. Your RE engine will probably be able to optimize in further, and, well, with 30 choices even if it doesn't it's still not a big deal. Trying to fiddle around with manually turning it into a "clever" RE can't possibly turn out better and may turn out worse.
(G|Gh|K|Kh|Q|Qh|Q|Qu)(a|au|e|u)(dh|zz|th|d|dd)(dh|th|a|ha|)(\x27|)(a|)(ff|f)(i|y)
Certainly not the most optimized version, split on syllables to maximize matches while trying to make sure we don't get false positives.
Well since you are matching small words why don't you try a similarity search engine with the Levenshtein distance? You can allow at most k insertions or deletions. This way you can change the distance function to other things that work better for your specific problem. There are many functions available in the simMetrics library.
A possible alternative is the online tool for generate regular expressions from examples http://regex.inginf.units.it.
Give it a chance!
Why not do a mixed approach? Something between a list of all possibilities and a complicated Regex that matches far too much.
Regex is about pattern matching and I can't see a pattern for all variants in the list. Trying to do so, will also find things like "Gazzafy" or "Quud'haffi" which are most probably not a used variant and definitly not on the list.
But I can see patterns for some of the variants, and so I ended up with this:
\b(?:Gheddafi|Gathafi|Kazzafi|Kad'afi|Qadhdhafi|Qadthafi|Qudhafi|Qu?athafi|[KG]h?add?h?aff?[iy]|Qad[dh]?afi)\b
At the beginning I list the ones where I can't see a pattern, then followed by some variants where there are patterns.
See it here on www.rubular.com
I know this is an old question, but...
Neither of these two regexes is the prettiest, but they are optimized and both match ALL the variations in the original post.
"Little Beauty" #1
(?:G(?:a(?:d(?:d(?:af[iy]|hafi)|af(?:f?i|y)|hafi)|thafi)|h(?:ad(?:daf[iy]|af?fi)|eddafi))|K(?:a(?:d(?:['dh]a|af?)|zza)fi|had(?:af?fy|dafi))|Q(?:a(?:d(?:(?:(?:hd)?|t)h|d)?|th)|u(?:at|d)h)afi)
"Little Beauty" #2
(?:(?:Gh|[GK])adaff|(?:(?:Gh|[GKQ])ad|(?:Ghe|(?:[GK]h|[GKQ])a)dd|(?:Gadd|(?:[GKQ]a|Q(?:adh|u))d|(?:Qad|(?:Qu|[GQ])a)t)h|Ka(?:zz|d'))af)i|(?:Khadaff|(?:(?:Kh|G)ad|Gh?add)af)y
Rest in Peace, Muammar.
Just an addendum: you should add "Gheddafi" as alternate spelling. So the RE should be
\b[KG]h?[ae]dd?af?fi$\b
[GQK][ahu]+[dtez]+\'?[adhz]+f{1,2}(i|y)
In parts:
[GQK]
[ahu]+
[dtez]+
\'?
[adhz]+
f{1,2}(i|y)
Note: Just wanted to give a shot at this.
What else starts with Q, G, or K, has a d, z or t in the middle, and ends in "fi" the people actually search for?
/\b[GQK].+[dzt].+fi\b/i
Done.
>>> print re.search(a, "Gadasadasfiasdas") != None
False
>>> print re.search(a, "Gadasadasfi") != None
True
>>> print re.search(a, "Qa'dafi") != None
True
Interesting that I'm getting downvoted. Can someone leave some false positives in the comments?