I have many Tcl scripts, and they all contain the same long list of regexp entries.
One regexp example:
if {[regexp -nocase {outl} $cat]} { set cat "outlook" }
How can I put all my regexps in a file and load that file in a proc?
Example:
proc pub:mapping {nick host handle channel text} {
set cat [lindex [split $text] 1];
# here I want to load the file with the regexps
if {[regexp -nocase {outl} $cat]} { set cat "outlook" }
putnow "PRIVMSG $channel :new $cat"
}
Regards
If I understand you correctly, you now have a bunch of Tcl scripts with large portions of code being repeated among them (in your case, various regex comparisons). In that case, it makes a lot of sense to extract that code into a separate unit.
This could, as you suggest, become a sort of text file where you list regular expressions and their results in some format and then load them when needed in your Tcl scripts. But I feel this would be too complicated and ungainly.
Might I suggest you simply create a regex checking proc and save that into a .tcl file. If you need regex checking in any of your other scripts, you can simply source that file and have the proc available.
From your question I'm not quite sure how you plan on using those regex comparisons, but maybe this example can be of some help:
# This is in regexfilter.tcl
proc regexfilter {text} {
if {[regexp -nocase {outl} $text]} { return "Outlook" }
if {[regexp -nocase {exce} $text]} { return "Excel" }
if {[regexp -nocase {foo} $text]} { return "Bar" }
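# You can have as many options here as you like.
# In fact, you should consider changing all this into a switch.
# The main thing is to have all your filters in one place and
# no code duplication.
# No filter matched: return the input unchanged, as the original
# snippet leaves $cat alone in that case.
return $text
}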
#
# This can then be in other Tcl scripts
#
source /path_to_filter_script/regexfilter.tcl
proc pub:mapping {nick host handle channel text} {
set cat [lindex [split $text] 1]
set cat [regexfilter $cat]
putnow "PRIVMSG $channel :new $cat"
}
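For completeness, the text-file idea from the question can be sketched as below. This is only a minimal sketch under assumptions: the proc names load_mappings and apply_mappings are hypothetical, and the file format (one "pattern replacement" pair per line, with # starting a comment line) is invented for illustration.

```tcl
# Hypothetical helper: read "pattern replacement" pairs from a text file.
# Assumed format: one pair per line; blank lines and lines starting with # are skipped.
proc load_mappings {path} {
    set f [open $path]
    set data [read $f]
    close $f
    set mappings {}
    foreach line [split $data "\n"] {
        set line [string trim $line]
        if {$line eq "" || [string index $line 0] eq "#"} continue
        # The first word is the pattern, the rest of the line is the replacement
        if {[regexp {^(\S+)\s+(.*)$} $line -> pattern replacement]} {
            lappend mappings $pattern $replacement
        }
    }
    return $mappings
}

# Hypothetical helper: return the replacement of the first matching pattern,
# or the text unchanged when nothing matches.
proc apply_mappings {mappings text} {
    foreach {pattern replacement} $mappings {
        if {[regexp -nocase -- $pattern $text]} {
            return $replacement
        }
    }
    return $text
}
```

A script would then call load_mappings once at startup and apply_mappings inside pub:mapping, so the patterns live in one data file instead of in every script.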
If you just want to expand abbreviations, then you can use string map:
proc expand_abbreviations {string} {
# this is an even-length list mapping each abbreviation to its expansion
set abbreviations {
outl outlook
foo foobar
ms Microsoft
}
return [string map $abbreviations $string]
}
This approach will be quite fast. However, string map replaces substrings anywhere in the input, so if the string already contains "outlook", it will be turned into "outlookook".
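One way around the substring problem, when each input is a single word, is an exact lookup instead of a substring replacement. The following is a minimal sketch, assuming whole-word abbreviations; the proc name expand_word is hypothetical, and dict needs Tcl 8.5 or later.

```tcl
# Hypothetical helper: expand a word only when the whole word is a known abbreviation.
proc expand_word {word} {
    # Same even-length list as above, used here as a dict
    set abbreviations {
        outl outlook
        foo foobar
        ms Microsoft
    }
    set key [string tolower $word]
    if {[dict exists $abbreviations $key]} {
        return [dict get $abbreviations $key]
    }
    # Not an abbreviation: leave the word alone
    return $word
}
```

With this lookup, "outl" expands to "outlook" while an input that is already "outlook" is returned unchanged.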
I have a huge file. I need to find the line containing the pattern abc_x and replace its value, 0.34, with a value increased by 10%. Then I need to copy the entire file (with the replaced values) into a new file. I am new to Tcl, please let me know how to do this.
Thanks in advance.
There are three key stages to this:
Reading the old file in.
Making the update to the data in memory.
Writing the new file out.
The first and third stages are pretty standard:
set f [open input.txt]
set data [read $f]
close $f
set f [open output.txt "w"]
puts -nonewline $f $data
close $f
So now it's just about doing the transformation in memory. The best answer to this depends on the version of Tcl you're using. In all current production versions, it's probably best to split the data into lines and iterate over them, checking whether a line matches and, if it does, performing the transformation.
set lines {}
foreach line [split $data "\n"] {
if {[matchesPattern $line]} {
set line [updateLine $line]
}
lappend lines $line
}
set data [join $lines "\n"]
OK, in that code above, matchesPattern and updateLine are stand-ins for the real code, which might look like this:
if {[regexp {^(\s*abc_x\s+)([\d.]+)(.*)$} $line -> prefix value suffix]} {
# Since we matched the prefix and suffix, we must put them back
set line $prefix[expr {$value * 1.1}]$suffix
}
Composing all those pieces together gets this:
set f [open input.txt]
set data [read $f]
close $f
set lines {}
foreach line [split $data "\n"] {
if {[regexp {^(\s*abc_x\s+)([\d.]+)(.*)$} $line -> prefix value suffix]} {
set line $prefix[expr {$value * 1.1}]$suffix
}
lappend lines $line
}
set data [join $lines "\n"]
set f [open output.txt "w"]
puts -nonewline $f $data
close $f
In 8.7 you'll be able to write the update more succinctly:
set data [regsub -all -line -command {^(\s*abc_x\s+)([\d.]+)} $data {apply {{-> prefix value} {
# Since we matched the prefix, we must put it back
string cat $prefix [expr {$value * 1.1}]
}}}]
(Getting shorter than this would really require a RE engine that supports lookbehinds; Tcl's standard one does not.)
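One detail the snippets above leave open is how the multiplied value is printed: expr produces a floating-point number, and its default string form may carry more digits than the original file had. A minimal sketch of the per-line update with explicit formatting follows; the proc name bump_abc_x and the %.4f precision are assumptions, not part of the original answer.

```tcl
# Hypothetical helper: scale the value on an abc_x line and format it explicitly.
proc bump_abc_x {line {factor 1.1}} {
    if {[regexp {^(\s*abc_x\s+)([\d.]+)(.*)$} $line -> prefix value suffix]} {
        # %.4f is an assumed precision; pick whatever the file format needs
        set line $prefix[format %.4f [expr {$value * $factor}]]$suffix
    }
    return $line
}
```

Lines that do not match the pattern are returned untouched, so the proc can replace the body of the foreach loop directly.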
I have a string in tcl say:
set name "a_b_c_d"
and I want to get 4 variables out of it, so that $a would have the value a, $b the value b, etc.
Thanks a lot !
This is exactly what the split command is for. You just need to provide the optional argument that says what character to use to split the string into a list of its fields.
set fields [split $name "_"]
Note that if you have two of the split character in a row, you get an empty list element in the result.
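A short sketch of that empty-element behaviour, using a made-up input string with a doubled underscore, and one way to drop the empty fields if they are not wanted:

```tcl
# Illustrative input only: note the two underscores in a row
set fields [split "a__c_d" "_"]
# fields now has four elements: a, an empty string, c and d
puts [llength $fields]

# Keep only the non-empty fields
set nonEmpty {}
foreach field $fields {
    if {$field ne ""} {
        lappend nonEmpty $field
    }
}
puts $nonEmpty
```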
Your requirement is a bit strange in my opinion, but that's how I would do it:
set name a_b_c_d
foreach item [split $name "_"] {
set $item $item
}
You didn't ask for the following, but I believe it might be better if you use an array, so you know exactly where your variables are, instead of just being 'there' in the open:
set name a_b_c_d
foreach item [split $name "_"] {
set items($item) $item
}
parray items
# items(a) = a
# items(b) = b
# items(c) = c
# items(d) = d
EDIT: Since you mentioned it in a comment, I'll just put it here: if the situation is as you mentioned, I'd probably go like this:
lassign [split $name "_"] varName folderName dirName
That should still work most of the time. Dynamic variable names are not recommended and can be avoided about 90% of the time, which gives safer, more readable and more maintainable code. Sure, they work for things that you need once in a blue moon, but you need to know what you are doing.
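When the number of fields does not match the number of variables, lassign still behaves predictably: it returns the values that were left over, and variables without a value are set to the empty string. A small sketch with illustrative variable names (lassign needs Tcl 8.5 or later):

```tcl
set name a_b_c_d
# More fields than variables: the unassigned fields are returned
set rest [lassign [split $name "_"] first second]
# first is a, second is b, rest is the list {c d}

# Fewer fields than variables: the extra variable becomes an empty string
lassign [split "x_y" "_"] p q r
```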
I am trying to find a matching pattern using the regexp command in an if statement. I am still a newbie in Tcl. The code is shown below:
set A 0;
set B 2;
set address {my_street[0]_block[2]_road};
if {[regexp {street\[$A\].*block\[$B\]} $address]} {
puts "the location is found"
}
I expect the code to print "the location is found", as $address contains the matching A and B values. I am hoping to be able to change the A and B numbers for a list of $address values, but I cannot get it to print "the location is found".
Thank you.
Tcl's regular expression engine doesn't do variable interpolation. (Should it? Perhaps. It doesn't though.) Because the pattern is in braces, Tcl itself does not substitute $A and $B either. That means you need to build the pattern with Tcl's ordinary substitution before passing it to regexp, which is in general quite annoying but OK here, as the variables only contain numbers, which are never RE metacharacters by themselves.
Basic version (with SO. MANY. BACKSLASHES.):
if {[regexp "street\\\[$A\\\].*block\\\[$B\\\]" $address]} {
Nicer version with format:
if {[regexp [format {street\[%d\].*block\[%d\]} $A $B] $address]} {
You could also use subst -nocommands -nobackslashes but that's getting less than elegant.
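For comparison, a sketch of what the subst variant could look like, using the values from the question. The variable name re is only for illustration.

```tcl
set A 0
set B 2
set address {my_street[0]_block[2]_road}
# -nocommands keeps [ and ] from being treated as command substitution,
# -nobackslashes leaves \[ and \] intact for the RE engine;
# only $A and $B are substituted.
set re [subst -nocommands -nobackslashes {street\[$A\].*block\[$B\]}]
if {[regexp -- $re $address]} {
    puts "the location is found"
}
```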
If you need to support general substitutions, it's sufficient to use regsub to do the protection.
proc protect {string} {
regsub -all {\W} $string {\\&}
}
# ...
if {[regexp [format {street\[%s\].*block\[%s\]} [protect $A] [protect $B]] $address]} {
It's overkill when you know you're working with alphanumeric substitutions into the RE.
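Put together as a complete snippet, the protected version might look like the sketch below. The value a.b is a made-up example of a substitution that contains an RE metacharacter.

```tcl
# Escape every non-word character so it is matched literally
proc protect {string} {
    regsub -all {\W} $string {\\&}
}

set address {my_street[0]_block[2]_road}
set A 0
set B 2
set re [format {street\[%s\].*block\[%s\]} [protect $A] [protect $B]]
if {[regexp -- $re $address]} {
    puts "the location is found"
}

# Made-up value with a metacharacter: the dot gets a backslash in front
puts [protect a.b]
```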
I have a list of strings structured like:
C:/Users/scott-filter1.pgm C:/Users/scott-filter2.pgm C:/Users/scott-filter3.pgm
Essentially, what I want to do is remove C:/Users/scott- and .pgm leaving me with just filter1 for example.
So, this is my regular expression:
regsub -nocase {.pgm} [regsub -nocase {C:/Users/scott-} $list ""] ""
Which works fine, albeit a little clunky. Now, when I replace the inner regular expression with a regular expression that contains a variable, such as:
set myname scott
{C:/Users/$myname-}
It no longer works. Any ideas on how to achieve what I want to achieve?
Thanks!
You will need to remove the braces, as they prevent substitution: with braces the variable is not replaced by its value, and the regex contains the literal string $myname instead. It is also worth noting that $ in a regex matches at the end of the string. With double quotes instead:
regsub "C:/Users/$myname-" $in "" out
Or you can do it with a single regsub:
set list "C:/Users/scott-filter1.pgm"
set myname "scott"
regsub -nocase -- "C:/Users/$myname-(.*)\\.pgm" $list {\1} out
puts $out
# => filter1
Notes:
If you remove the braces and use quotes, you need to double escape things you would otherwise escape once.
I'm using a capture group when I use parens and .* matches any character(s). The captured part is then put back using \1 in the replacement part, into the variable called out.
Strictly speaking, you need to escape . because this is a wildcard in regex and matches any 1 character. Because I'm using quotes, I need to double escape it with two backslashes.
Matching might be easier and more straightforward than substitution:
regexp -nocase -- "C:/Users/$myname-(.*)\\.pgm" $list - out
puts $out
# => filter1
If the 'name' can be anything, then you can use a more generic regex to avoid having to place the name in the regex... For instance, if $myname can never have a dash, you can use the negated class [^-] which matches anything except dash and you won't have to worry about double escapes:
regexp -nocase -- {C:/Users/[^-]+-(.*)\.pgm} $list - out
puts $out
# => filter1
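Since the question has a whole list of paths, the same generic regex can be applied to each element in a loop. A minimal sketch, where the variable names paths and names are illustrative:

```tcl
set paths {C:/Users/scott-filter1.pgm C:/Users/scott-filter2.pgm C:/Users/scott-filter3.pgm}
set names {}
foreach path $paths {
    # Capture whatever sits between the first dash after C:/Users/ and .pgm
    if {[regexp -nocase -- {C:/Users/[^-]+-(.*)\.pgm} $path - out]} {
        lappend names $out
    }
}
puts $names
```

Paths that do not match the pattern are simply skipped, which may or may not be what is wanted.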
There is another way to do this, assuming the part you want is always in a file name between a dash and the last dot before the extension.
set foo C:/Users/scott-filter1.pgm
# => C:/Users/scott-filter1.pgm
set bar [file rootname [file tail $foo]]
# => scott-filter1
set baz [split $bar -]
# => scott filter1
set qux [lindex $baz end]
# => filter1
or
lindex [split [file rootname [file tail $foo]] -] end
# => filter1
The file commands work on any string that is recognizable as a file path. file tail yields the file path minus the part with the directories, i.e. only the actual file name. file rootname yields the file name minus the extension. split converts the string into a list, splitting it at every dash. lindex gets one item from the list, in this case the last item.
An even more ad-hoc-ish (but actually quite generic) solution:
lindex [split [lindex [split $foo -] end] .] 0
# => filter1
This invocation splits the file path at every dash and selects the last item. This item is again split at every dot, and the first item of the resulting list is selected.
Documentation: file, lindex, set, split
Since this is a list of filenames, we can use lmap (to apply an operation to each of the elements of a list, requires 8.6) and file (specifically file tail and file rootname) to do most of the work. A simple string map will finish it off, though a regsub could also have been used.
set filenames {C:/Users/scott-filter1.pgm C:/Users/scott-filter2.pgm C:/Users/scott-filter3.pgm}
set filtered [lmap name $filenames {
string map {"scott-" ""} [file rootname [file tail $name]]
# Regsub version:
#regsub {^scott-} [file rootname [file tail $name]] ""
}]
Older versions of Tcl will need to use foreach:
set filtered {}
foreach name $filenames {
lappend filtered [string map {"scott-" ""} [file rootname [file tail $name]]]
# Regsub version:
#lappend filtered [regsub {^scott-} [file rootname [file tail $name]] ""]
}
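The original question had the user name in a variable. With string map that is straightforward, because the mapping is an ordinary list and no regex escaping is involved. A sketch, assuming the variable myname from the question:

```tcl
set filenames {C:/Users/scott-filter1.pgm C:/Users/scott-filter2.pgm C:/Users/scott-filter3.pgm}
set myname scott
set filtered {}
foreach name $filenames {
    # Build the mapping with list so that $myname is substituted
    lappend filtered [string map [list "$myname-" ""] [file rootname [file tail $name]]]
}
puts $filtered
```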
I want to be able to print 10 lines before and 10 lines after each line that matches a pattern in a file. I'm matching the pattern via regex. I need a Tcl-specific solution, basically the equivalent of grep -B 10 -A 10.
Thanks in advance!
If the data is “relatively small” (which can actually be 100MB or more on modern computers) then you can load it all into Tcl and process it there.
# Read in the data
set f [open "datafile.txt"]
set lines [split [read $f] "\n"]
close $f
# Find which lines match; adjust to taste
set matchLineNumbers [lsearch -all -regexp $lines $YourPatternHere]
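# Note that the matches are already in order
# Handle overlapping ranges!
# Start from a clean state so that a file without matches does not fail below
set ranges {}
unset -nocomplain prev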
foreach n $matchLineNumbers {
set from [expr {max(0, $n - 10)}]
set to [expr {min($n + 10, [llength $lines] - 1)}]
if {[info exists prev] && $from <= $prev} {
lset ranges end $to
} else {
lappend ranges $from $to
}
set prev $to
}
# Print out the ranges
foreach {from to} $ranges {
puts "=== $from - $to ==="
puts [join [lrange $lines $from $to] "\n"]
}
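The range computation above can also be packaged as a proc with the context sizes as parameters, mirroring grep's -B and -A options. This is a sketch only; the proc name context_ranges and its argument names are hypothetical.

```tcl
# Hypothetical helper: return a flat list of from/to index pairs, with
# overlapping context ranges merged into one.
proc context_ranges {lines pattern {before 10} {after 10}} {
    set ranges {}
    set last [expr {[llength $lines] - 1}]
    foreach n [lsearch -all -regexp $lines $pattern] {
        set from [expr {max(0, $n - $before)}]
        set to [expr {min($n + $after, $last)}]
        if {[llength $ranges] && $from <= [lindex $ranges end]} {
            # Overlaps the previous range: extend it
            lset ranges end $to
        } else {
            lappend ranges $from $to
        }
    }
    return $ranges
}
```

The result can be fed to the same foreach {from to} printing loop as above.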
The only mechanism that springs to mind is for you to split the input data into a list of lines. You'd then need to sweep through the list and whenever you found a match output a suitable collection of entries from the list.
To the best of my knowledge there's no built-in, easy way of doing this.
There might be something useful in tcllib.
I'd use grep myself.