Most efficient way to search a Tcl list

I have a Tcl list as below.
set mylist [list a b c d e]; # could be more
Now I am doing some processing if the list contains the items "c", "d", "e". But I need to skip that processing if and only if the list has one of the values below:
set mylist [list a];
OR
set mylist [list b];
OR
set mylist [list a b];
So if mylist is any of the above three, I skip the processing. But if the list has any values other than the above three combinations, I do the processing.
What is the most efficient way of checking whether the list is one of the three combinations?
I have basic code that fulfils my requirement, but I was looking for a more efficient way, as I am not very familiar with Tcl containers.
set mylist [list a];
if {[llength $mylist] == 2 && ([lindex $mylist 0] eq "a" || [lindex $mylist 0] eq "b") && ([lindex $mylist 1] eq "a" || [lindex $mylist 1] eq "b")} {
puts "1. skip the processing"
} elseif {[llength $mylist] == 1 && ([lindex $mylist 0] eq "a" || [lindex $mylist 0] eq "b")} {
puts "2. skip the processing"
} else {
puts "Do the processing"
}
I was wondering if there is any other efficient way to perform the same.

if {$mylist in {a b {a b}}} {
puts "skip the processing"
}
A list isn't a string, but we can usually compare lists to strings for equality and order. A list with a single element "a" is comparable to the string "a". If you want to know if a given string is equal to any of the lists in the question, the easiest way is to check if the value of the list is a member of the list {a b {a b}}.
Note: This particular solution does not solve all aspects of list equality in general. It works in those cases where it works.
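One case where the plain string comparison falls short is a list whose string form is not canonical, for example one written with extra whitespace. Here is a minimal sketch, assuming Tcl 8.5 or later; the helper name isSkipList is made up for illustration and is not part of the answer above. It rebuilds the list before the membership test:

```tcl
# isSkipList is a hypothetical helper name.
proc isSkipList {lst} {
    # [list {*}$lst] rebuilds the list, which gives it a canonical string form.
    set canonical [list {*}$lst]
    expr {$canonical in {a b {a b}}}
}

puts [isSkipList "a   b"]   ;# extra spaces, but still the two-element list a b
puts [isSkipList {a b c}]
```

Order still matters with this approach: the list {b a} is not treated as a match.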
Efficiency
Is it really efficient to compare a list to a string when this will cause automatic, repeated reconstruction of the internal representation of the data ("shimmering")? Actually, it is. If one compares the procedures
proc foo1 mylist {
set a 0
if {$mylist in {a b {a b}}} {set a 92}
return $a
}
proc foo2 mylist {
set a 0
if {$mylist in [list [list a] [list b] [list a b]]} {set a 92}
return $a
}
then foo1 seems to be faster than foo2 (different machines may produce different results).
Constructing a list inside the condition evaluation code does not seem to add very much time. This procedure
proc foo3 mylist {
set a 0
set x [list [list a] [list b] [list a b]]
if {$mylist in $x} {set a 92}
return $a
}
is somewhere in between foo1 and foo2 in speed, but not significantly faster than foo2.
One can also do this by invoking lsearch:
proc foo4 mylist {
set a 0
set x [list [list a] [list b] [list a b]]
if {[lsearch $x $mylist] >= 0} {set a 92}
return $a
}
proc foo5 mylist {
set a 0
set x [list [list a] [list b] [list a b]]
set i [lsearch $x $mylist]
if {$i >= 0} {set a 92}
return $a
}
which is comparable to foo2 and foo3.
(In case it needs to be said, lsearch is more versatile than the in operator, offering e.g. case insensitive lookup, regex lookup, etc. If you need such things, lsearch is the best option.)
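To illustrate the extra options, here is a small sketch using the -nocase and -regexp switches of lsearch (both available in Tcl 8.5 and later). The variable names are only for illustration:

```tcl
set whitelist [list [list a] [list b] [list a b]]

# -exact compares whole elements as strings; -nocase ignores letter case.
set idx [lsearch -exact -nocase $whitelist "A B"]
puts "case-insensitive match at index $idx"

# -regexp treats the pattern as a regular expression; -all returns every match.
set hits [lsearch -all -regexp $whitelist {^a}]
puts "indices of elements starting with a: $hits"
```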
I've deleted most of my observations and theories on speed after timing the procedures on another machine, which showed quite different results. foo1 was consistently faster on both machines, though. Since that code is simpler than the other alternatives, I would say this is the way to do it. But to be sure, one needs to time the procedure with one's own machine, whitelist, and code to be performed.
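For anyone who wants to repeat the measurement, a rough sketch with the built-in time command could look like this. The iteration count of 100000 is an arbitrary choice, and the numbers will differ between machines and Tcl versions:

```tcl
proc foo1 mylist {
    set a 0
    if {$mylist in {a b {a b}}} {set a 92}
    return $a
}
proc foo2 mylist {
    set a 0
    if {$mylist in [list [list a] [list b] [list a b]]} {set a 92}
    return $a
}

# Run each call many times; [time] reports the average cost per iteration.
foreach cmd {foo1 foo2} {
    puts "$cmd: [time {$cmd {a b}} 100000]"
}
```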
Finally, none of this really matters if I/O occurs inside the procedures, since the I/O will be so much slower than anything else.
Documentation: if, list, lsearch, proc, puts, return, set

Related

Split a list of numbers into smaller list based on a range in TCL

I have a sorted list of numbers and I am trying to split the list into smaller lists based on range of 50 and find the average in TCL.
For example: set xlist {1 2 3 4 5 ...50 51 52 ... 100 ... 101 102}
split lists: {1 ... 50} { 51 .. 100} {101 102}
result: sum(1:50)/50; sum(51:100)/50; sum(101:102)/2
The lrange command is the core of what you need here. Combined with a for loop, that'll give you the splitting that you're after.
proc splitByCount {list count} {
set result {}
for {set i 0} {$i < [llength $list]} {incr i $count} {
lappend result [lrange $list $i [expr {$i + $count - 1}]]
}
return $result
}
Testing that interactively (with a smaller input dataset) looks good to me:
% splitByCount {a b c d e f g h i j k l} 5
{a b c d e} {f g h i j} {k l}
The rest of what you want is a trivial application of lmap and tcl::mathop::+ (the command form of the + expression operator).
set sums [lmap sublist [splitByCount $inputList 50] {
expr {[tcl::mathop::+ {*}$sublist] / double([llength $sublist])}
}]
We can make that slightly neater by defining a custom function:
proc tcl::mathfunc::average {list} {expr {
[tcl::mathop::+ 0.0 {*}$list] / [llength $list]
}}
set sums [lmap sublist [splitByCount $inputList 50] {expr {
average($sublist)
}}]
(I've moved the expr command to the previous line in the two cases so that I can pretend that the body of the procedure/lmap is an expression instead of a script.)
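Putting the pieces together on a small stand-in dataset (seven numbers in chunks of three, chosen only to keep the output short; requires Tcl 8.6 for lmap):

```tcl
proc splitByCount {list count} {
    set result {}
    for {set i 0} {$i < [llength $list]} {incr i $count} {
        lappend result [lrange $list $i [expr {$i + $count - 1}]]
    }
    return $result
}

set inputList {1 2 3 4 5 6 7}   ;# stand-in data, not from the question
set averages [lmap sublist [splitByCount $inputList 3] {
    # double() forces floating-point division for each chunk.
    expr {[tcl::mathop::+ {*}$sublist] / double([llength $sublist])}
}]
puts $averages
```

The last chunk holds a single element, so its average is just that element.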

How to "zip" lists in tcl

I have three lists :
set l1 {1 2 3}
set l2 {'one' 'two' 'three'}
set l3 {'uno' 'dos' 'tres'}
and I would like to build this list :
{{1 'one' 'uno'} {2 'two' 'dos'} {3 'three' 'tres'}}
In Python, I would use something like the built-in function zip. What should I do in Tcl? I have looked in the documentation of 'concat', but haven't found any obviously relevant commands.
If you're not yet on Tcl 8.6 (where you can use lmap) you need this:
set zipped {}
foreach a $l1 b $l2 c $l3 {
lappend zipped [list $a $b $c]
}
That's effectively what lmap does for you, but it was a new feature in 8.6.
lmap a $l1 b $l2 c $l3 {list $a $b $c}
List map, lmap, is a mapping command that takes elements from one or more lists and executes a script. It creates a new list where each element is the result of one execution of the script.
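One detail worth knowing, shown here as a small sketch with made-up data: when the lists have different lengths, lmap (like foreach) keeps going until the longest list is used up and substitutes empty strings for the missing elements.

```tcl
set numbers {1 2 3}
set words {one two}   ;# deliberately one element short

# The third iteration sees an empty string for w.
set zipped [lmap n $numbers w $words {list $n $w}]
puts $zipped
```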
Documentation: list, lmap
This command was added in Tcl 8.6, but can easily be added to earlier versions.
Getting lmap for Tcl 8.5 and earlier
Here's a version that takes an arbitrary number of list names:
set l1 {a b c}
set l2 {d e f}
set l3 {g h i j}
proc zip args {
foreach l $args {
upvar 1 $l $l
lappend vars [incr n]
lappend foreach_args $n [set $l]
}
foreach {*}$foreach_args {
set elem [list]
foreach v $vars {
lappend elem [set $v]
}
lappend result $elem
}
return $result
}
zip l1 l2 l3
{a d g} {b e h} {c f i} {{} {} j}
Requires Tcl 8.5 for the {*} argument expansion.
An 8.6 version
proc zip args {
foreach l $args {
upvar 1 $l $l
lappend vars [incr n]
lappend lmap_args $n [set $l]
}
lmap {*}$lmap_args {lmap v $vars {set $v}}
}
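If you would rather pass the lists themselves instead of variable names, a variant without upvar might look like the sketch below. The name zipValues and the generated loop variable names v1, v2, ... are my own choices for illustration, and it assumes Tcl 8.6 and at least one argument:

```tcl
proc zipValues {args} {
    set n 0
    set vars {}
    set lmapArgs {}
    foreach l $args {
        # Pair each input list with a generated loop variable: v1, v2, ...
        lappend vars v[incr n]
        lappend lmapArgs v$n $l
    }
    # Outer lmap walks all lists in parallel; inner lmap collects one tuple.
    lmap {*}$lmapArgs {lmap v $vars {set $v}}
}

puts [zipValues {a b c} {d e f} {g h i j}]
```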

Get items in specific index list of lists

I have a key-value list such as:
set x {{a 1} {b 2} {c 3}}
I need to extract all the items in index=1 in all sub-lists to get:
{1 2 3}
You can use this:
$ set y {}
$ foreach sublist $x { lappend y [lindex $sublist 1]}
$ puts $y
1 2 3
A solution for Tcl 8.6 or newer:
Use lmap to iterate through x in one line, without storing intermediate values anywhere:
$ lmap sublist $x {lindex $sublist 1}
References:
lmap, tcl.tk
I've used the following function:
proc MapList {Var List Script} {
if {![llength $List]} {return $List}
upvar 1 $Var Item
foreach Item $List {lappend Res [uplevel 1 $Script]}
return $Res
}
And used it like this:
MapList Arg $x {lindex $Arg 1}
One solution is to conscript the dict values command:
dict values [concat {*}{{a 1} {b 2} {c 3}}]
How this works: the dict values collects a list consisting of every other item (starting from the second) in another list. This is intended to be used on dictionaries, but since dictionaries are basically just even-sized lists, it works on any even-sized list, with one caveat: if any key appears more than once, the result of dict values will only contain the last value associated with that key.
A list consisting of two-item sublists can easily be transformed into an even-sized list by passing the sublists individually as arguments to concat.
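The duplicate-key caveat is easy to demonstrate. In this sketch (the data is made up, and lmap needs Tcl 8.6) the key a appears twice, so the dict values route returns fewer items than a plain traversal:

```tcl
set pairs {{a 1} {b 2} {a 3}}   ;# key a appears twice

# Treating the flattened list as a dict keeps only one value per key.
set viaDict [dict values [concat {*}$pairs]]

# A plain traversal keeps every second item.
set viaLmap [lmap pair $pairs {lindex $pair 1}]

puts "dict values: [llength $viaDict] items"
puts "lmap:        [llength $viaLmap] items"
```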
Another way is to traverse the list using one of the methods mentioned in the other answers, or maybe like this:
set res {}
for {set i 0} {$i < [llength $x]} {incr i} {
lappend res [lindex $x $i 1]
}
set res
This is similar to
set res {}
foreach item $x {
lappend res [lindex $item 1]
}
set res
(or the corresponding lmap item $x {lindex $item 1})
but does provide the option to 1) start at an index ≠ 0, 2) end before the end of the list, and 3) traverse the list by two (or more) item steps.
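A sketch of those three options combined, on made-up data: start at index 1, stop before the last sublist, and step by two.

```tcl
set x {{a 1} {b 2} {c 3} {d 4} {e 5}}
set res {}
# Start at index 1, stop before the final sublist, advance two at a time.
for {set i 1} {$i < [llength $x] - 1} {incr i 2} {
    lappend res [lindex $x $i 1]
}
puts $res
```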
Documentation: concat, dict, for, foreach, incr, lappend, lindex, llength, lmap, set

TCL remove an element from a list

How to remove an element from a Tcl list, say:
which has index = 4
which has value = "aa"
I have Googled and have not found any built-in function yet.
set mylist {a b c}
puts $mylist
a b c
Remove by index
set mylist [lreplace $mylist 2 2]
puts $mylist
a b
Remove by value
set idx [lsearch $mylist "b"]
set mylist [lreplace $mylist $idx $idx]
puts $mylist
a
The other way to remove an element is to filter it out. This Tcl 8.5 technique differs from the lsearch&lreplace method mentioned elsewhere in that it removes all of a given element from the list.
set stripped [lsearch -inline -all -not -exact $inputList $elemToRemove]
What it doesn't do is search through nested lists. That's a consequence of Tcl not putting effort into understanding your data structures too deeply. (You can tell it to search by comparing specific elements of the sublists though, via the -index option.)
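As a sketch of the -index variant (the data here is made up), this drops every sublist whose first element is b and leaves the rest untouched:

```tcl
set rows {{a 1} {b 2} {c 3} {b 4}}

# -index 0 makes lsearch compare only the first element of each sublist.
set kept [lsearch -all -inline -not -exact -index 0 $rows b]
puts $kept
```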
Let's say you want to remove element "b":
% set L {a b c d}
a b c d
You replace the range from first index 1 to last index 1 with nothing:
% lreplace $L 1 1
a c d
regsub may also be suitable to remove a value from a list.
set mylist {a b c}
puts $mylist
a b c
regsub b $mylist "" mylist
puts $mylist
a c
llength $mylist
2
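One thing to watch with regsub is that it works on the string form of the list, not on its elements, so the pattern can also match inside another element. A small sketch with made-up data, comparing it with the lsearch filter:

```tcl
set mylist {abc b c}

# regsub removes the first "b" it finds, which is inside "abc".
regsub b $mylist "" viaRegsub
puts $viaRegsub

# lsearch compares whole elements, so only the element "b" is dropped.
set viaLsearch [lsearch -all -inline -not -exact $mylist b]
puts $viaLsearch
```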
Just wrapped up what others have done
proc _lremove {listName val {byval false}} {
upvar $listName list
if {$byval} {
set list [lsearch -all -inline -not $list $val]
} else {
set list [lreplace $list $val $val]
}
return $list
}
Then call with
Inline edit, like lappend
set output [list 1 2 3 20]
_lremove output 0
puts $output
>> 2 3 20
Set output like lreplace/lsearch
set output [list 1 2 3 20]
puts [_lremove output 0]
>> 2 3 20
Remove by value
set output [list 1 2 3 20]
puts [_lremove output 3 true]
>> 1 2 20
Remove by value with wildcard
set output [list 1 2 3 20]
puts [_lremove output "2*" true]
>> 1 3
You can also try like this :
set i 0
set myl [list a b c d e f]
foreach el $myl {
if {$el in {a b e f}} {
set myl [lreplace $myl $i $i]
} else {
incr i
}
}
set myl
There are 2 easy ways.
# index
set mylist "a b c"
set mylist [lreplace $mylist 2 2]
puts $mylist
a b
# value
set idx [lsearch $mylist "b"]
set mylist [lreplace $mylist $idx $idx]
puts $mylist
a

difference between tcl list of length one and a scalar?

I have a C function (dbread) that reads 'fields' from a 'database'. Most of those fields are single-valued, but sometimes they are multi-valued. So I had C code that said
if valcount == 1
return string
else
make list
foreach item in vals
append to list
return list
because I thought most of the time people want a scalar.
However doing this leads to some odd parsing errors. Specifically if I want to add a value
set l [dbread x] ;# get current c value
lappend l "extra value" ;# add a value
dbwrite x {*}$l ;# set it back to db
If x has single value and that value contains spaces the lappend parses wrong. I get a list with 3 items not 2. I see that this is because it is passed something that is not a list and it parses it to a list and sees 2 items.
set l "foo bar"
lappend l "next val" ;# string l is parsed into list -> [list foo bar]
so I end up with [list foo bar {next val}]
Anyway, the solution is to make dbread always return a list, even if there is only one item. My question is: is there any downside to this? Are there surprises lurking for the 90% case where people would expect a scalar?
The alternative would be to write my own lappend that checks for llength == 1 and special-cases it.
I think it's cleaner to have an API which always returns a list of results, be it one result or many. Then there's no special casing needed.
No downside, only upside.
Think about it, what if you move away from returning a single scalar and have a case in the future where you're returning a single value that happens to be a string with a space in it. If you didn't construct a list of that single value, you'd treat it as two values (because Tcl would shimmer the string into a list of two things). By always constructing a list of return values, all the code using your API will handle this correctly.
Just because Tcl doesn't have strict typing doesn't mean it's good style to return different types at different times.
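The difference is easy to see with a short sketch. The variable names are made up, and the strings stand in for whatever dbread would return:

```tcl
# A bare string with a space is re-parsed as two list elements.
set scalar "foo bar"
lappend scalar "next val"
puts "scalar return:  [llength $scalar] elements"

# Wrapping the single value in a list keeps it as one element.
set wrapped [list "foo bar"]
lappend wrapped "next val"
puts "wrapped return: [llength $wrapped] elements"
```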
One of the approaches I have taken in the past (when the data for each row could contain nulls or empty strings) was to use a list of lists of lists:
{{a b} {c d}} ;# two rows, each with two elements
{{{} b} {c d}} ;# two rows, first element of first row is null
;# llength [lindex [lindex {{{} b} {c d}} 0] 0] -> 0
{ { {{}} b } { c d } }
;# two rows, first element of first row is the empty string
;# llength [lindex [lindex {{{{}} b} {c d}} 0] 0] -> 1
It looks complicated, but it's really not if you treat the actual data items as an opaque data structure and add accessors to use it:
foreach row $db_result {
foreach element $row {
if {[db_isnull $element]} {
puts "null"
} elseif {![string length [db_value $element]]} {
puts "empty string"
} else {
puts [db_value $element]
}
}
}
Admittedly, far more complicated than you're looking for, but I thought it worth mentioning.
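The accessors are not shown above, so here is one possible sketch that follows the llength convention from the comments: a zero-length element is null, and otherwise the value is the first item of the element. These definitions of db_isnull and db_value are assumptions for illustration, not the original implementation.

```tcl
# Hypothetical accessors: an empty element means null,
# otherwise the element is a one-item list wrapping the value.
proc db_isnull {element} {expr {[llength $element] == 0}}
proc db_value {element} {lindex $element 0}

# Row 1: a null and "b". Row 2: an empty string and "d".
set db_result {{{} b} {{{}} d}}

set seen {}
foreach row $db_result {
    foreach element $row {
        if {[db_isnull $element]} {
            lappend seen null
        } elseif {![string length [db_value $element]]} {
            lappend seen "empty string"
        } else {
            lappend seen [db_value $element]
        }
    }
}
puts $seen
```

With this convention, a value that contains spaces has to be stored as a one-element list too, otherwise db_value would return only its first word.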