Simplify terms within an expression - sympy

Sympy can simplify this:
In [26]: (asinh(sinh(x))).simplify()
Out[26]: x
but doesn't simplify that:
In [28]: (asinh(sinh(x))+1).simplify()
Out[28]: asinh(sinh(x)) + 1
How can I ask for subparts of an expression to be simplified? If possible, I'd like to avoid expression-wide simplification, e.g. finding a common denominator and factoring it out across all terms.

Arguably, this simplification should not happen at all because it's not always true: for example, asinh(sinh(2*I)) is not 2*I. The current implementation of simplify has a clause for "canceling" a function and its inverse, which only applies if the entire expression is that function, and does not pay attention to things like asin(sin(pi)) being 0 rather than pi. Inverses are tricky.
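The two counterexamples above can be checked numerically. This is only a small sketch using Python's standard math and cmath modules rather than SymPy, so the values are floating-point approximations of the principal branches:

```python
import cmath
import math

# asin(sin(pi)): sin(pi) is numerically ~0, so asin gives ~0 rather than pi.
real_case = math.asin(math.sin(math.pi))

# asinh(sinh(2i)): sinh(2i) equals i*sin(2), and the principal value of
# asinh at that point is i*(pi - 2), not 2i.
complex_case = cmath.asinh(cmath.sinh(2j))

print(real_case)
print(complex_case)
```

Both results differ from the original argument, which is why cancelling a function against its inverse is only valid on a restricted domain.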
But the following approach, based on replace, will replace all known "function-inverse" pairs:
from sympy import Function, Symbol, asin, asinh, exp, sin, sinh, sqrt
x = Symbol('x')
expr = sqrt(asinh(sinh(x))) + sin(asin(exp(2*x+1)))
expr = expr.replace(lambda f: isinstance(f, Function) and isinstance(f.args[0], f.inverse(argindex=1)), lambda f: f.args[0].args[0])
# expr is now sqrt(x) + exp(2*x + 1)
The first argument of replace is a filter function,
lambda f: isinstance(f, Function) and isinstance(f.args[0], f.inverse(argindex=1))
which checks that f is a Function whose first argument is an instance of the inverse of f.
The second argument of replace is the action to perform on each matching subexpression;
lambda f: f.args[0].args[0]
means: replace it by the argument of the argument, i.e., asinh(sinh(x)) -> x.
As noted above, it's not guaranteed that the result of such "simplification" is mathematically equivalent to the original expression.
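A slightly more defensive variant of the same idea is sketched below. It assumes that not every SymPy function defines an inverse method (Abs is used here as an example of one that presumably does not), so the filter checks for the attribute first. The helper name cancel_inverse_pairs is made up for this illustration, and the same caveat about mathematical equivalence applies.

```python
from sympy import Abs, Function, Symbol, asinh, sinh, sqrt


def cancel_inverse_pairs(expr):
    """Rewrite f(g(x)) as x wherever g is the declared inverse of f."""
    def is_inverse_pair(f):
        return (
            isinstance(f, Function)
            and hasattr(f, "inverse")   # assumption: some functions lack inverse()
            and len(f.args) == 1        # only single-argument functions
            and isinstance(f.args[0], f.inverse(argindex=1))
        )
    # Replace each matching node by the argument of its argument.
    return expr.replace(is_inverse_pair, lambda f: f.args[0].args[0])


x = Symbol("x")
result = cancel_inverse_pairs(sqrt(asinh(sinh(x))) + Abs(asinh(sinh(x))) + 1)
print(result)
```

Because replace visits every subexpression, the pair is cancelled inside sqrt and Abs as well as at the top level, without any expression-wide simplification.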

How to replace the match with a function in a regex expression in Python

I'm trying to get a better understanding of how lambda functions and regex matches work in Python. For this purpose I'm replacing a lambda with a named function.
Even though I've found a way to make it work, I'm not able to understand why it works.
The lambda/regex I'm working on are the one mentioned in the following posts:
How to replace multiple substrings of a string?
Python - Replace regular expression match with a matching pair value
This is the main piece of code:
import re
# define desired replacements here
rep = {"condition1": "", "condition2": "text"}
text = "(condition1) and --condition2--"
# use these three lines to do the replacement
rep = dict((re.escape(k), v) for k, v in rep.items())
pattern = re.compile("|".join(rep.keys()))
output = pattern.sub(lambda m: rep[re.escape(m.group(0))], text)
print(output)
>>> '() and --text--'
If I replace the lambda function with:
def replace_conditions(match, rep):
    return rep[re.escape(match.group(0))]
output = pattern.sub(replace_conditions(m, rep), text)
I get the following exception:
NameError: name 'm' is not defined
And I'm able to make it work only using this syntax:
def replace_conditions(match, rep=rep):
    return rep[re.escape(match.group(0))]
output = pattern.sub(replace_conditions, line)
NOTE: I had to pre-assign a value to the second argument "rep" and use the function's name without actually calling it.
I can't understand why the match returned by the regex expression is properly passed on to the function if called with no arguments, while it's not passed to its first argument when called with the usual syntax.
Regarding "I can't understand why the match returned by the regex expression is properly passed on to the function if called with no arguments":
That's not what's happening. pattern.sub(replace_conditions, line) doesn't call the replace_conditions function; it passes the function object itself to sub, which then calls it once per match with the match object as its only argument.
From the docs for:
re.sub(pattern, repl, string, count=0, flags=0)
which is the same as:
pattern.sub(repl, string)
If repl is a function, it is called for every non-overlapping occurrence of pattern. The function takes a single match object argument, and returns the replacement string.
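If the replacement function needs extra arguments such as rep, one option is to bind them in advance so that sub still receives a one-argument callable. The sketch below uses functools.partial for that. It also keeps the dictionary keys unescaped and escapes them only when building the pattern, which is a simplification of the code in the question rather than a requirement.

```python
import re
from functools import partial


def replace_conditions(match, rep):
    # match.group(0) is the literal text that matched, i.e. one of the keys.
    return rep[match.group(0)]


rep = {"condition1": "", "condition2": "text"}
text = "(condition1) and --condition2--"

# Escape the keys only for the pattern; the lookup uses the raw keys.
pattern = re.compile("|".join(re.escape(k) for k in rep))

# partial(...) returns a callable that takes just the match object.
output = pattern.sub(partial(replace_conditions, rep=rep), text)
print(output)
```

A lambda such as lambda m: replace_conditions(m, rep) would work equally well. In both cases the function is passed to sub, not called at the call site.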

Regular Expression Simplification Issue

I'm trying to understand the equivalence between regular expressions α and β defined below, but I'm losing my mind over conflicting information.
a+b: a or b
ab: concatenation of a and b
$: empty string
α = (1*+0)+(1*+0)(0+1)*($+0+1)
β = (1*+0)(0+1)*($+0+1)
https://ivanzuzak.info/noam/webapps/regex_simplifier/ says that α is equivalent to β.
My school however teaches that concatenation has stronger binding than union, meaning that:
11*+0 =/= 1(1*+0)
Which would mean that my α looks like this with parentheses:
α = (1*+0) + ( (1*+0)(0+1)*($+0+1) )
and that
α =/= ( (1*+0) + (1*+0) ) (0+1)*($+0+1)
I hope it's clear what my problem is, I'd appreciate any kind of help. Thanks.
Usually, two regular expressions are considered equivalent when they match the same set of words.
How they match it is not relevant. Therefore it doesn't matter which of the operators has greater precedence.
Note the subtle difference between being equal (in written form) and being equivalent (having the same effect).
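Alright, it turns out that I had misunderstood why b+b <=> b.
It's because L1 ∪ L2 = L2 whenever L1 is a subset of L2. Here L1 = (1*+0) is a subset of L2 = (1*+0)(0+1)*($+0+1), because (0+1)*($+0+1) contains the empty string.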
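The equivalence can also be sanity-checked by brute force. The sketch below translates α and β into Python's re syntax (union "+" becomes "|", the empty string "$" becomes an empty alternative) and compares the sets of binary words they match up to a fixed length. This is a bounded check under that translation, not a proof.

```python
import re
from itertools import product

# In Python's re, "|" binds more weakly than concatenation, which matches
# the precedence convention described in the question.
ALPHA = r"(1*|0)|(1*|0)(0|1)*(|0|1)"
BETA = r"(1*|0)(0|1)*(|0|1)"


def language(pattern, max_len):
    """Set of binary words of length <= max_len that the pattern fully matches."""
    compiled = re.compile(pattern)
    words = (
        "".join(bits)
        for n in range(max_len + 1)
        for bits in product("01", repeat=n)
    )
    return {w for w in words if compiled.fullmatch(w)}


same_up_to_8 = language(ALPHA, 8) == language(BETA, 8)

# Precedence does change what a single expression denotes:
precedence_matters = language(r"11*|0", 4) != language(r"1(1*|0)", 4)

print(same_up_to_8, precedence_matters)
```

So the two readings of 11*+0 really are different languages, while α and β agree on every word tested because the first operand of the union is already contained in the second.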

Incrementing a number in a string using sub

There's a string with a (single) number somewhere in it. I want to increment the number by one. Simple, right? I wrote the following without giving it a second thought:
sub("([[:digit:]]+)", as.character(as.numeric("\\1")+1), string)
... and got an NA.
> sub("([[:digit:]]+)", as.character(as.numeric("\\1")+1), "x is 5")
[1] NA
Warning message:
In sub("([[:digit:]]+)", as.character(as.numeric("\\1") + 1), "x is 5") :
NAs introduced by coercion
Why doesn't it work? I know other ways of doing this, so I don't need a "solution". I want to understand why this method fails.
The point is that the backreference is only evaluated during a match operation, and you cannot pass it to any function before that.
When you write as.numeric("\\1"), the as.numeric function receives the literal string \1 (a backslash followed by the character 1), which is not a number, so NA is the expected result.
This happens because there is no built-in backreference interpolation in R.
You may use a gsubfn package:
> library(gsubfn)
> s <- "x is 5"
> gsubfn("\\d+", function(x) as.numeric(x) + 1, s)
[1] "x is 6"
It does not work because the arguments of sub are evaluated before they are passed to the regex engine (which gets called by .Internal).
In particular, as.numeric("\\1") evaluates to NA ... after that you're doomed.
It might be easier to think of it differently. You are getting the same error that you would get if you used:
print(as.numeric("\\1")+1)
Remember, the strings are passed to the function, where they are interpreted by the regex engine. The string \\1 is never transformed to be 5, since this calculation is done within the function.
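The same evaluation-order issue exists in other languages. As an analogy only (this is Python, not R), the sketch below shows that converting the backreference string eagerly fails, while a replacement function that receives the match at match time works. The helper name increment_first_number is made up for the example.

```python
import re

# Eager evaluation: the argument is just the two characters "\" and "1",
# so converting it to a number fails before any matching happens.
try:
    int("\\1")
    eager_conversion_failed = False
except ValueError:
    eager_conversion_failed = True


def increment_first_number(text):
    """Add one to the first run of digits found in text."""
    # The callable is invoked by the regex engine with the match object,
    # so the captured digits are available at that point.
    return re.sub(r"\d+", lambda m: str(int(m.group(0)) + 1), text, count=1)


print(eager_conversion_failed)
print(increment_first_number("x is 5"))
```

This mirrors what gsubfn does in R: the computation is deferred into a function that runs once per match.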
Note that "\\1" is not something that can be read as a number. NA seems to be similar to null in other languages; to quote mpiktas' answer:
"NA ... is a product of operation when you try to access something that is not there."

VBScript checks all OR conditions?

Perhaps I'm missing something, but it annoys me that VBScript seems to evaluate all Or conditions. For example, I'd like to do something like this:
If (oFS.FileExists(sFileLoc) = False) Or (sNewText <> oFS.OpenTextFile(sFileLoc).ReadAll) Then
Now I get an error that the file doesn't exist because of the second condition. I was hoping that if the file doesn't exist VBScript would skip immediately to the result, and if it does, it checks the second condition.
Am I right and is this normal behavior?
As M. Harris already said in 2003, and as the docs for the logical operators (e.g. Or) state explicitly, VBScript does not short-circuit the evaluation of conditionals. You must use nested Ifs or a slightly fancy Select Case.
You can use inline nested Ifs to achieve short-circuiting in VBScript. For example, you could rewrite your statement like this:
If oFS.FileExists(sFileLoc) Then If sNewText = oFS.OpenTextFile(sFileLoc).ReadAll Then
But the statement that follows Then must be on the same line. So if you need to perform multiple operations when the condition holds, separate the statements with a colon (:), which is the single-line statement separator in VBScript.
If oFS.FileExists(sFileLoc) Then If sNewText = oFS.OpenTextFile(sFileLoc).ReadAll Then x = 1 : y = 2
You could also just move your logic into a Sub or Function and make a call instead:
If oFS.FileExists(sFileLoc) Then If sNewText = oFS.OpenTextFile(sFileLoc).ReadAll Then DoStuff
Note, too, that if you need to specify an Else clause, it must be specified on this line as well.
If oFS.FileExists(sFileLoc) Then If sNewText = oFS.OpenTextFile(sFileLoc).ReadAll Then x = 1 Else x = 2
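The nested-check pattern can be sketched in Python for illustration (Python's own or does short-circuit, so the nesting here only mirrors what VBScript needs). The sketch keeps the question's original condition, file missing or text differs, by starting from a default and letting the nested checks overturn it. The function name needs_update and the file name are hypothetical.

```python
import os
import tempfile


def needs_update(path, new_text):
    """True when the file is missing or its content differs from new_text."""
    update = True                        # covers the "file does not exist" case
    if os.path.exists(path):             # outer check guards the read below
        with open(path, encoding="utf-8") as handle:
            if handle.read() == new_text:    # only reached when the file exists
                update = False
    return update


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "sample.txt")   # hypothetical file name
    missing = needs_update(target, "hello")
    with open(target, "w", encoding="utf-8") as handle:
        handle.write("hello")
    unchanged = needs_update(target, "hello")
    changed = needs_update(target, "goodbye")

print(missing, unchanged, changed)
```

The file is never opened unless the existence check has already succeeded, which is the behaviour the question expected from Or.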

Simplest way to find out if at least one cell in a cell array matches a regular expression

I need to search a cell array and return a single boolean value indicating whether any cell matches a regular expression.
For example, suppose I want to find out if the cell array strs contains foo or -foo (case-insensitive). The regular expression I need to pass to regexpi is ^-?foo$.
Sample inputs:
strs={'a','b'} % result is 0
strs={'a','foo'} % result is 1
strs={'a','-FOO'} % result is 1
strs={'a','food'} % result is 0
I came up with the following solution based on "How can I implement wildcard at ismember function of matlab?" and "Searching cell array with regex", but it seems like I should be able to simplify it:
~isempty(find(~cellfun('isempty', regexpi(strs, '^-?foo$'))))
The problem I have is that it looks rather cryptic for such a simple operation. Is there a simpler, more human-readable expression I can use to achieve the same result?
NOTE: The answer refers to the original regexp in the question: '-?foo'
You can avoid the find:
any(~cellfun('isempty', regexpi(strs, '-?foo')))
Another possibility: concatenate first all cells into a single string:
~isempty(regexpi([strs{:}], '-?foo'))
Note that you can remove the "-?" in any of the above, because the pattern is unanchored and "-?" also matches nothing:
any(~cellfun('isempty', regexpi(strs, 'foo')))
~isempty(regexpi([strs{:}], 'foo'))
And that allows using strfind (with lower) instead of regexpi:
~isempty(strfind(lower([strs{:}]),'foo'))
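For comparison, the same "does any element match" test is sketched below in Python rather than MATLAB, using the anchored pattern from the question (fullmatch plays the role of ^ and $). The function name any_match is made up for the example. The last lines illustrate a caveat of the concatenation approach: joining the elements can produce a match that spans two of them.

```python
import re

PATTERN = re.compile(r"-?foo", re.IGNORECASE)


def any_match(strs):
    """True if at least one string is exactly 'foo' or '-foo', ignoring case."""
    return any(PATTERN.fullmatch(s) for s in strs)


print(any_match(["a", "b"]))
print(any_match(["a", "foo"]))
print(any_match(["a", "-FOO"]))
print(any_match(["a", "food"]))

# Caveat of concatenating first: "f" + "oo" forms "foo" across elements.
split_input = ["f", "oo"]
per_element = any_match(split_input)
concatenated = re.search("foo", "".join(split_input), re.IGNORECASE) is not None
print(per_element, concatenated)
```

The per-element test and the concatenated test disagree on the last input, so the concatenation shortcut is only safe when such cross-element matches cannot occur or do not matter.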