Pass variable to the command with regular expression in Tcl - regex

I am trying to pass the variable to my regular expression to be used while looping through a list of strings. For example, I have a string which is:
top/inst/name[i], where i can take different values of integers.
for {set i 0} {$i < $rows} {set i [expr {$i + 1}]} {
my_command { top/inst/name[$i] top_o/inst_o/name[$i] }
}
How do I tell the regular expression parser to treat $i as a number? It complains that $i is a command.

The issue is that […] is serving two different purposes here, one in base Tcl as command substitution syntax, and one for regular expressions as character set syntax. I'm not sure that you want either of them at this point, given that the brackets appear to be part of the actual name of something. So you need to be careful.
To avoid the command substitution, you can either insert \ characters before the [ and the ], or you can use the extended capabilities of subst:
my_command [subst -nocommands { top/inst/name[$i] top_o/inst_o/name[$i] }]
To avoid the other problem, you can either insert more backslashes (note that this can make things ugly after a while) or if you are really using regular expressions to just match a literal (sub)string, you can prefix the regular expression with ***=.
It is idiomatic to use incr i instead of set i [expr {$i + 1}] in for loop iteration clauses. It does the same thing, but is shorter and clearer for (human) readers. It's just like using ++i instead of i = i + 1 in C or C++ (or many other languages).
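The two layers described above can be tried out in a small sketch. Here my_command is a hypothetical stand-in that only records what it is given (the real command is not shown in the question), and rows is set to an arbitrary value:

```tcl
# Hypothetical stand-in for my_command: it only records the names it receives.
set seen {}
proc my_command {names} {
    lappend ::seen [string trim $names]
}

set rows 3
for {set i 0} {$i < $rows} {incr i} {
    # -nocommands leaves the brackets alone but still substitutes $i.
    my_command [subst -nocommands { top/inst/name[$i] top_o/inst_o/name[$i] }]
}

# Backslash form of the same idea, for a single name.
set i 1
set name "top/inst/name\[$i\]"

# As a plain RE, [1] is a character set, so the bracketed text does not match itself.
set asRegex [regexp -- $name $name]
# With the ***= prefix the rest of the pattern is taken as a literal string.
set asLiteral [regexp -- "***=$name" $name]

puts $seen
puts "regex: $asRegex, literal: $asLiteral"
```

Whether the ***= prefix is the right choice depends on how my_command interprets its argument, which the question does not say.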

Related

Converting string into mathematical function in Tcl

Let's say I have a string, maybe "Enable && Signal" for simplicity's sake.
I'd like to convert this string to standard && operations in tcl, such that Enable && Signal would return 0 if any of the value is false and 1 only when both are true.
Is there an easy way to do this? In my case I would need a generic method that accepts any number of arguments and performs logical/relational operations such as &&, ||, ==, <=, >, != etc.
Any help and insights would be very much appreciated.
Thanks
I initially tried to split the arguments into a conditions list and a data list, but could not handle the precedence of operations. For example, == needs to be evaluated first and the && operations later, for n^n combinations.
I'm assuming that in your example, Enable and Signal are Tcl variables. So, all that would be needed to be able to pass the string to expr is to prepend a '$' to all identifiers. That can be done with regsub as follows:
set str "Enable && Signal"
regsub -all {\m[A-Za-z]\w*\M} $str {$&} expr
set result [expr $expr]
Due to the \m and \M constraints, this will properly leave numbers like 1e3 alone. But this method falls short if you also want to be able to use functions, like sin(x). If that is also a requirement, a negative lookahead may help:
set str "sin(x) * cos(y)"
regsub -all {\m[A-Za-z]\w*\M(?!\()} $str {$&} expr
puts $expr
This produces: sin($x) * cos($y)
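As a rough end-to-end check, the lookahead version can be applied to a mixed logical/relational string. The variable Count and the values below are made up for illustration. Because the rewritten string is passed to expr without braces, this is only reasonable when the input string is trusted:

```tcl
# Illustrative values; Count is a hypothetical extra variable.
set Enable 1
set Signal 0
set Count 3

set str "Enable && Count == 3 || Signal"

# Prefix every identifier that is not followed by "(" with a dollar sign.
regsub -all {\m[A-Za-z]\w*\M(?!\()} $str {$&} exprStr

# expr applies its own precedence rules: == binds tighter than &&, which binds tighter than ||.
set result [expr $exprStr]

puts "$exprStr -> $result"
```

Operator precedence is handled by expr itself, so no manual splitting into condition and data lists is needed.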

Tcl: Regsub does not substitute a string while parsing HTML snippet

I'm trying to find a specific string within an array element. Since the array element is a string which can contain multiple occurrences of the string, I perform repeated substitution of the result. The algorithm works on a simple example, but when I use it with HTML (which is the purpose of the program) it gets stuck in an infinite while loop.
Here is an (ugly) expression that I'm using:
set expression {\<div\sclass\=\"fileText\"\sid\=\"[^\"]+\"\>File\:\s\<a\s(title\=\"[^\"]+\"\s)?href\=\"([^\"]+)\"\starget\=\"\_blank\"\>([^\<]+)\<\/a\>[^\<]+\<\/div\>};
Here is an element of the array from which I want to extract strings (it contains 2 occurrences of the given expression):
set htmlForParse(0) {file" id="f51456520"><div class="fileText" id="fT51456520">File: 48912-arduinouno_r3_front.jpg (1022 KB, 1800x1244)</div><a class="fileThumb" href="//example.com" target="_blank"><img " title="Reply to this post">YesNo?</a></span></div><div class="file" id="f51456769"><div class="fileText" id="fT51456769">File: 892991578.jpg (32 KB, 400x422)</div><a class="fileThumb" href="//example.com" target="_blank"><img src};
And here are the loops that I'm using to achieve this:
for {set k 0} {$k < [array size htmlForParse]} {incr k} {
while {[regexp $expression $htmlForParse($k) exString]} {
regsub -- $exString $htmlForParse($k) {} htmlForParse($k);
puts $htmlForParse($k);
} }
The purpose of the regsub is to substitute one hit from regexp at a time, until no hits are left and regexp returns 0. At that moment, the while loop is finished, and the next element of the array can be examined. But that doesn't happen: it continues to loop forever, and it seems that regsub does not substitute the found string with an empty string (nor will it substitute it with anything else). Why?
The problem is that the string you are matching contains unquoted RE metacharacters. The ones I notice are parentheses (around the sizes):
% regexp $expression $htmlForParse($k) exString
1
% puts $exString
<div class="fileText" id="fT51456520">File: 48912-arduinouno_r3_front.jpg (1022 KB, 1800x1244)</div>
This means that the substring you extract doesn't actually match as a regular expression in the regsub, and no change is made. Next time round the loop, you get to match everything exactly as it was once again. Not what you want!
The easiest fix is to tell the regsub that the string it is using as a pattern is a literal string. This is done by preceding the RE with ***=, like this:
while {[regexp $expression $htmlForParse($k) exString]} {
regsub -- ***=$exString $htmlForParse($k) {} htmlForParse($k)
puts $htmlForParse($k)
}
With your sample text, this will perform two replacements. I hope that's what you want.
Also, your initial RE has far too many backslashes in it. None of /, < and > are RE metacharacters. It's not harmful to quote them, but I hope you are generating that RE from something, not writing it by hand!
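The effect can be reproduced on a much smaller string. The text and pattern below are invented for illustration and are not the asker's data. The sketch first shows that the extracted hit fails to match when reused as an RE, then removes the hits one at a time using the ***= prefix:

```tcl
# Invented sample text with parentheses, which are RE metacharacters.
set text "a.jpg size (1022 KB); b.jpg size (32 KB)"
set pattern {size \(\d+ KB\)}

# The first hit is the literal substring "size (1022 KB)".
regexp $pattern $text hit

# Reused as an RE, the parentheses become a group, so nothing is replaced.
set replacedAsRegex [regsub -- $hit $text {} ignored]

# With ***= the hit is treated as a literal string, so each pass removes one hit.
set count 0
while {[regexp $pattern $text hit]} {
    regsub -- "***=$hit" $text {} text
    incr count
}

puts "replaced as RE: $replacedAsRegex, literal passes: $count, left: '$text'"
```

If the hits themselves are not needed, a single regsub -all with the original pattern would remove them all in one call.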

Use format or regexp to convert -3.014-5 to scientific format -3.014e-5

I'm fetching node coordinates from a file. Unfortunately for small numbers the following format is used:
-3.014-5
without an "e" --> -3.014e-5
I can't use format because all the functions I found require a floating point number, which the above is not...
So I wanted to use regular expressions to find the "-5" part and replace it by "e-5".
([+-]?[0-9]+)?$ would do that, but how can I use that expression in TCL?
set num -3.014-5
set Enum [ regexp -all { ([+-]?[0-9]+)?$ } $num ]
I get "invalid command name "+-"", so I replaced the square brackets with ", but then I get 1 as an answer. What am I doing wrong?
I don't understand why you get the error message "invalid command name "+-"". As long as your regular expression is inside curly braces {}, the interpreter should not perform command substitution on it.
For me this worked to achieve the desired result:
set Enum [regsub {^([+-]?[.0-9]+)([+-]?[0-9]+)?$} $num {\1e\2}]
Edit:
If you want "normal" numbers (those without an exponent) to remain unchanged, you could make the exponent part mandatory by removing the ? quantifiers from the tail part of the regular expression. For such numbers the expression will then not match, and the number remains unchanged:
set Enum [regsub {^([+-]?[.0-9]+)([+-][0-9]+)$} $num {\1e\2}]
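For context, regexp only reports whether (or, with -all, how many times) the pattern matched, which is why the question's attempt returned 1, while regsub without a variable name returns the rewritten string. A small sketch wrapping the second expression in a helper, where fixExponent is a hypothetical name:

```tcl
# Hypothetical helper: insert the missing "e" before a trailing signed exponent.
proc fixExponent {num} {
    # Without a variable name, regsub returns the resulting string.
    regsub {^([+-]?[.0-9]+)([+-][0-9]+)$} $num {\1e\2}
}

set small [fixExponent -3.014-5]
set large [fixExponent 1.25+3]
set plain [fixExponent 42.0]

puts "$small $large $plain"
# The repaired value is now acceptable as a floating point number.
puts [string is double -strict $small]
```

The result can then be passed to format or expr like any other number.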
I don't know Tcl, but I would guess you need to escape the + and probably the - too.
Try this: set Enum [ regexp -all ([\+-]?[0-9]+)?$ $num ]
or this: set Enum [ regexp -all ([\+\-]?[0-9]+)?$ $num ]
You might need to use \\ instead of \ (I don't know Tcl, sorry)

Tcl switch statement and -regexp quoting?

There must be an extra level of evaluation when using the Tcl switch command. Here's an example session at the repl to show what I mean:
$ tclsh
% regexp {^\s*foo} " foo"
1
% regexp {^\\s*foo} " foo"
0
% switch -regexp " foo" {^\\s*foo {puts "match"}}
match
% switch -regexp " foo" {{^\s*foo} {puts "match"}}
match
...There needed to be an extra backslash added inside the first "switch" version. This is consistent between 8.5.0 and 8.6.0 on Windows. Can someone point me to the section of the manual where this and similar instances of extra levels of unquoting are described? My first thought was that the curly braces in the first "switch" version would have protected the backslash, but "switch" itself must be applying an extra level of backslash substitution to the patterns. Unless I'm misunderstanding the nuances of something else.
Edit:
...hmmm... Like Johannes Kuhn says below, backslash substitution apparently depends on the dynamic context of use, and not the lexical context of creation...
% set t {\s*foo}
\s*foo
% proc test_s x {set x}
% test_s $t
\s*foo
% proc test_l x {lindex $x 0}
% test_l $t
s*foo
% puts $t
\s*foo
...that seems to be quite the interesting design choice.
The problem you describe here is simple to solve:
The difference between switch and regexp is that switch actually takes a list.
So if we print the first element of the list {^\s*foo {puts "match"}} with
% puts [lindex {^\s*foo {puts "match"}} 0]
^s*foo
it results in something that we don't want.
List construction is a little bit complex; if you are not sure, use an interactive Tcl shell and let the list command construct one for you.
Edit: Indeed, it is an interesting design choice, but this applies to everything in Tcl. For example, expr uses a minilanguage designed for arithmetic expressions. It is up to each command what it does with its arguments. Even language constructs like if, for, and while are just commands that treat one of their arguments as an expression and the other arguments as scripts. This design makes it possible to create new control structures, like sqlite's eval, which takes an SQL statement and a script that it evaluates for each result.
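A short sketch of the list behaviour described above. It compares an unbraced and a braced pattern element and then lets the list command do the quoting. The outcome variable and the default branch are additions for illustration:

```tcl
# An unbraced list element undergoes backslash substitution when it is extracted...
set unbraced [lindex {^\s*foo {puts "match"}} 0]
# ...while a braced element is returned verbatim.
set braced [lindex {{^\s*foo} {puts "match"}} 0]

# Let the list command do the quoting instead of writing the list by hand.
set patterns [list {^\s*foo} {set outcome match} default {set outcome nomatch}]
switch -regexp -- " foo" $patterns

puts "$unbraced | $braced | $outcome"
```

Building the pattern/body list with list avoids having to reason about which elements need extra braces or backslashes.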

splitting a formula and again regenerating and re-evaluating formula

I'm splitting a formula string with "*/+-()" as my pattern (e.g. the string a*b+c) and I'm getting a list as output (a b c), where a, b, c are variables that contain values like 5, 10, 15.
What I need is: I should be able to directly substitute values for the variables and evaluate the expression.
The formula is taken from the user and changes from time to time, so if the user enters (a/b), something should automatically replace it with real values (5/10) and then return the result 0.5.
The formula is formed from a limited number of variables (e.g. a, b, c) and it can use +, -, *, /, (, ) as operators.
The problem is that after splitting out the variables, I'm not able to replace them with their values or evaluate the equation. Please help me to do this task in as short an expression as possible. Thanks in advance.
It is not at all that complicated:
First, replace every variable name with a Tcl variable reference (prepend a $).
You have to be careful not to replace sin(a) with $sin($a) or similar.
regsub -all {[a-z]+(?![a-z\(])} $input {$&}
Example:
set input {a*b+c+sin(d)}
regsub -all {[a-z]+(?![a-z\(])} $input {$&}
would yield $a*$b+$c+sin($d), which can be passed to expr.
If you need the variable names, just use regexp -all -inline with this expression.
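Put together, the approach might look like the following sketch. The values assigned to a, b, c and d are arbitrary, and the rewritten string is passed to expr unbraced, so the input should be trusted:

```tcl
set input {a*b+c+sin(d)}
# Arbitrary sample values for the variables used in the formula.
set a 5; set b 10; set c 15; set d 0

# Collect the variable names: identifiers not followed by a letter or "(".
set names [regexp -all -inline {[a-z]+(?![a-z\(])} $input]

# Prefix each of those identifiers with a dollar sign.
set tclExpr [regsub -all {[a-z]+(?![a-z\(])} $input {$&}]

# Evaluate the rewritten formula.
set value [expr $tclExpr]

puts "$names | $tclExpr | $value"
```

The names list can be used to check that every variable is defined before the formula is evaluated.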
If you know the names of the variables and none of them are prefixes of anything else you use, you can easily transform the expression like this:
set a 1; set b 2; set c 3
set e "a*b+c"
set value [expr [string map {a $a b $b c $c} $e]]
puts "$e = $value"
Note: no braces around the expression on the third line. Bracing is normally the safe choice, but this is a case where you want to omit it, because you are generating the expression at runtime.
That mapping can be generated automatically:
set a 1; set b 2; set c 3
set e "a*b+c"
set vars {a b c}
set value [expr [string map [regsub -all {\w+} $vars {& $&}] $e]]
puts "$e = $value"
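To see what the generated mapping looks like, and why the prefix caveat matters, here is a small sketch. The second input, max(a,b), is an invented example of a formula in which a variable name also occurs inside another word:

```tcl
set a 1; set b 2; set c 3
set vars {a b c}

# Turn the list of names into a string map dictionary: a $a b $b c $c
set mapping [regsub -all {\w+} $vars {& $&}]

# Works when no variable name occurs inside another word.
set value [expr [string map $mapping "a*b+c"]]

# string map replaces every occurrence, including the "a" inside "max".
set broken [string map $mapping "max(a,b)"]

puts "$mapping | $value | $broken"
```

The mangled function name in the second result is the reason a word-aware transform is needed for less restricted formulas.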
However, if you've got prefixes and other things like that, you need a more complex transform:
# Omitting the variable setup and print at the end...
proc replIfRight {vars word} {
if {$word in $vars} {return \$$word} else {return $word}
}
set value [expr [subst [regsub -all {\w+} [string map {[ \\[ $ \\$ \\ \\\\} $e] {[replIfRight $vars &]}]]]
You're absolutely right to not expect to come up with such a horrible thing yourself!
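For completeness, a runnable sketch of the word-aware transform with the setup filled in. The variables a and ab and the formula are invented to exercise the prefix case. The string map here uses doubled backslashes so that brackets, dollar signs and backslashes in the input are escaped before subst runs:

```tcl
# Return "$word" for known variable names and the word itself otherwise.
proc replIfRight {vars word} {
    if {$word in $vars} {return \$$word} else {return $word}
}

# Invented setup: "a" is a prefix of "ab" and also appears inside "max".
set a 2; set ab 3
set vars {a ab}
set e "ab*a+max(a,10)"

# Escape characters that subst would otherwise act on.
set escaped [string map {[ \\[ $ \\$ \\ \\\\} $e]

# Wrap every word in a call to replIfRight, then let subst run those calls.
set rewritten [subst [regsub -all {\w+} $escaped {[replIfRight $vars &]}]]

set value [expr $rewritten]
puts "$rewritten = $value"
```

Splitting the one-liner into named steps makes it easier to inspect the intermediate strings when a formula does not evaluate as expected.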