tcl split list by n character

I have a list like below.
{2 1 0 2 2 0 2 3 0 2 4 0}
I would like to insert a comma after every 3 elements using Tcl:
{2 1 0,2 2 0,2 3 0,2 4 0}
I am looking for your help.
Regards

If the groups are always exactly three elements long, it is easy to use lmap and join:
set theList {2 1 0 2 2 0 2 3 0 2 4 0}
set joined [join [lmap {a b c} $theList {list $a $b $c}] ","]
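As a small sketch of where the "always three elements" assumption matters: the second input below is made up for illustration, and its length is not a multiple of three. lmap fills the missing loop variables with empty strings, so the last group comes out padded.

```tcl
# Requires Tcl 8.6 or later for lmap.
set theList {2 1 0 2 2 0 2 3 0 2 4 0}
set joined [join [lmap {a b c} $theList {list $a $b $c}] ","]
puts $joined         ;# 2 1 0,2 2 0,2 3 0,2 4 0

# Hypothetical input with one leftover element: b and c are empty
# strings for the last group, so they show up as {} in the output.
set ragged {1 2 3 4}
set raggedJoined [join [lmap {a b c} $ragged {list $a $b $c}] ","]
puts $raggedJoined   ;# 1 2 3,4 {} {}
```

If the input can have leftovers, a loop based on lrange is probably the safer choice.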

One way:
Append n elements at a time from your list to a string using a loop, but first append a comma if it's not the first time through.
#!/usr/bin/env tclsh
proc insert_commas {n lst} {
    set res ""
    set len [llength $lst]
    for {set i 0} {$i < $len} {incr i $n} {
        if {$i > 0} {
            append res ,
        }
        append res [lrange $lst $i [expr {$i + $n - 1}]]
    }
    return $res
}
set lst {2 1 0 2 2 0 2 3 0 2 4 0}
puts [insert_commas 3 $lst] ;# 2 1 0,2 2 0,2 3 0,2 4 0
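A variation on the same loop, in case the groups are needed as a list before joining. This is only a sketch: chunk is a hypothetical helper name, and it assumes n is a positive integer.

```tcl
# chunk is a hypothetical helper, not a built-in command.
# It assumes n is a positive integer (n <= 0 would loop forever).
proc chunk {n lst} {
    set groups {}
    set len [llength $lst]
    for {set i 0} {$i < $len} {incr i $n} {
        # lrange clips at the end of the list, so a short final group is fine
        lappend groups [lrange $lst $i [expr {$i + $n - 1}]]
    }
    return $groups
}

set lst {2 1 0 2 2 0 2 3 0 2 4 0}
set groups [chunk 3 $lst]
puts [llength $groups]     ;# 4
puts [join $groups ","]    ;# 2 1 0,2 2 0,2 3 0,2 4 0
```

Keeping the grouping separate from the joining means the separator can be changed without touching the loop.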

Related

How can I split a variable by line in TCL?

I have a variable named "results" with this value:
{0 0 0 0 0 0 0 0 0 0 0 3054 11013}
{0 0 0 0 0 0 0 0 0 0 0 5 13 15}
{0.000 3272.744 12702.352 30868.696}
I'd like to store each line (values between the '{}') in a separate variable and then, compare each of the elements of each line with a threshold (this threshold will be different for each line, that's why I need to split them).
I've tried
set result [split $results \n]
But it doesn't really give me a neat list of elements. Is there any way to get 3 lists from the variable "results"?
If I understand correctly, and the representation of your example data is accurate, then you do not have to process the data held by results with [split]; you can leave that to Tcl's list parser. In other words, the input is already a valid string representation of a Tcl list, eligible for further processing. Watch:
set results {
    {0 0 0 0 1}
    {2 2 3 3 3}
    {1 1 2 3 4}
}
set thresholds {
    3
    2
    1
}
lmap values $results threshold $thresholds {
    lmap v $values {expr {$v >= $threshold}}
}
This will produce:
{0 0 0 0 0} {1 1 1 1 1} {1 1 1 1 1}
Background: when $results is worked on by [lmap], it will be turned into a list automatically.
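To store each line in its own variable, as the question asks, the same list parsing can be combined with lassign. A minimal sketch using the data from the question; the variable names counts1, counts2 and times are made up for illustration.

```tcl
# The value from the question; newlines are just whitespace to the list parser.
set results {
    {0 0 0 0 0 0 0 0 0 0 0 3054 11013}
    {0 0 0 0 0 0 0 0 0 0 0 5 13 15}
    {0.000 3272.744 12702.352 30868.696}
}
puts [llength $results]       ;# 3

# lassign (Tcl 8.5+) puts one sublist into each variable.
# counts1, counts2 and times are illustrative names only.
lassign $results counts1 counts2 times
puts [lindex $counts1 end]    ;# 11013
puts [llength $counts2]       ;# 14
puts [llength $times]         ;# 4
```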
I think it's better to split on the newline character and then apply a regexp to fetch the data. Here is some sample code:
set results "{0 0 0 0 1}
{2 2 3 3 3}
{1 1 2 3 4}"
set result [split $results \n]
foreach line $result {
    if {[regexp {^\s*\{(.+)\}\s*} $line Complete_Match Content]} {
        puts "$Content\n"
    }
}

Element-wise multiplication of two lists in Tcl

I have two lists of the same length and I want to multiply them element-wise (like Cartesian product in sets). How do I do it? For example, if I write
set a {1 2 3 4 5}
set b {1 2 3 4 5}
then the desired output is:
{1 4 9 16 25}
A two-list lmap is perfect for this:
set a {1 2 3 4 5}
set b {1 2 3 4 5}
set result [lmap x $a y $b {expr {$x * $y}}]
If you're on Tcl 8.5 (or older) use this instead:
set a {1 2 3 4 5}
set b {1 2 3 4 5}
set result {}
foreach x $a y $b {
    lappend result [expr {$x * $y}]
}
The multi-list form of foreach has been supported for a very long time indeed.
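One thing to watch with the multi-list form: if the lists differ in length, the shorter one is padded with empty strings, and expr then fails on the empty operand. A sketch of a guarded wrapper follows; mul_lists is a hypothetical name, not a standard command.

```tcl
# mul_lists is a hypothetical helper, not a built-in command.
proc mul_lists {a b} {
    # Fail early with a clear message instead of an expr error mid-loop
    if {[llength $a] != [llength $b]} {
        error "lists differ in length: [llength $a] vs [llength $b]"
    }
    set result {}
    foreach x $a y $b {
        lappend result [expr {$x * $y}]
    }
    return $result
}

puts [mul_lists {1 2 3 4 5} {1 2 3 4 5}]     ;# 1 4 9 16 25
puts [catch {mul_lists {1 2 3} {1 2}} msg]   ;# 1
puts $msg                                    ;# lists differ in length: 3 vs 2
```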

tcl add counter to identical list values

I have a list with values:
set unnumbered [list 101 101 101 102 102 103 104 105 105 105 106]
I want to add a counter to subsequent identical values like this:
numbered [ 101.1 101.2 101.3 102.1 102.2 103 104 105.1 105.2 105.3 106]
So far I have tried the following:
set unnumbered [list 101 101 101 102 102 103 104 105 105 105 106]
set numbered [list]
set previous [lindex $unnumbered 0]
set subcounter 1
foreach current $unnumbered {
    if { $current eq $previous } {
        lappend numbered $current.$subcounter
        set previous $current
        incr subcounter
    } else {
        lappend numbered $current
        set previous $current
        set subcounter 1
    }
}
The result is almost what I need.
101.1 101.2 101.3 102 102.1 103 104 105 105.1 105.2 106
For all but the first value, the counter starts counting too late. The first 102 lacks the ".1".
How can I fix this?
The problem is that your code doesn't have enough information at the point where a number is added to numbered. Get the information first, and then apply it.
First, create a list where each item is a list consisting of one of the unique numbers in $unnumbered and the indices in $unnumbered where that number occurs:
lmap n [lsort -unique $unnumbered] {
    list $n [lsearch -all $unnumbered $n]
}
# => {101 {0 1 2}} {102 {3 4}} {103 5} {104 6} {105 {7 8 9}} {106 10}
For each of those items, split up the item into n = the number and indices = the indices. Check how many indices you have. For more than one index, add enumerated numbers like this:
set i 0
foreach index $indices {
    lappend numbered $n.[incr i]
}
For single indices, just add the number:
lappend numbered $n
The whole program looks like this:
set unnumbered [list 101 101 101 102 102 103 104 105 105 105 106]
set numbered [list]
foreach item [lmap n [lsort -unique $unnumbered] {
    list $n [lsearch -all $unnumbered $n]
}] {
    lassign $item n indices
    if {[llength $indices] > 1} {
        set i 0
        foreach index $indices {
            lappend numbered $n.[incr i]
        }
    } else {
        lappend numbered $n
    }
}
Documentation: > (operator), foreach, if, incr, lappend, lassign, list, llength, lmap (for Tcl 8.5), lmap, lsearch, lsort, set
If you don’t have lmap, see the link above. If you don’t have lassign, use
foreach {n indices} $item break
instead.
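A quick sketch showing that the two forms assign the same values, using the first item from the list built above. The names n2 and indices2 exist only to keep the two results apart.

```tcl
set item {101 {0 1 2}}

# Tcl 8.5 and later
lassign $item n indices

# Older Tcl: foreach with a multi-variable list, stopped after one pass
foreach {n2 indices2} $item break

puts "$n / $indices"      ;# 101 / 0 1 2
puts "$n2 / $indices2"    ;# 101 / 0 1 2
```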
ETA If the "no index on singleton numbers" requirement can be relaxed, one could do it this way:
set previous {}
lmap num $unnumbered {
    if {$num ne $previous} {
        set i 0
    }
    set previous $num
    format %d.%d $num [incr i]
}
Another variant. It’s very similar to Jerry’s second suggestion, but I didn’t see that one until I was going to submit this, honest. This one assumes that no element in $unnumbered is the empty string.
set numbered [list]
set rest [lassign $unnumbered current next]
set i 0
while 1 {
    if {$current eq $next} {
        lappend numbered $current.[incr i]
    } else {
        if {$i > 0} {
            lappend numbered $current.[incr i]
            set i 0
        } else {
            lappend numbered $current
        }
        set current $next
    }
    if {$next eq {}} break
    set rest [lassign $rest next]
}
Another approach: maintain a dict to keep the count of what you've seen so far
set unnumbered [list 101 101 101 102 102 103 104 105 105 105 106]
set count [dict create]
set numbered {}
foreach num $unnumbered {
    dict incr count $num
    lappend numbered "$num.[dict get $count $num]"
}
puts $numbered
101.1 101.2 101.3 102.1 102.2 103.1 104.1 105.1 105.2 105.3 106.1
Using an array is a little simpler: taking advantage of the fact that incr returns the new count
set numbered {}
array set count {}
foreach num $unnumbered {lappend numbered "$num.[incr count($num)]"}
OK, I missed the requirement that singleton entries should not have a suffix. There's this, but it may re-order the initial list:
set count [dict create]
foreach num $unnumbered {dict incr count $num}
set numbered {}
foreach num [dict keys $count] {
    set c [dict get $count $num]
    if {$c == 1} {
        lappend numbered $num
    } else {
        for {set i 1} {$i <= $c} {incr i} {
            lappend numbered "$num.$i"
        }
    }
}
puts $numbered
101.1 101.2 101.3 102.1 102.2 103 104 105.1 105.2 105.3 106
Or, this maintains the original order
set count [dict create]
foreach num $unnumbered {dict incr count $num}
foreach key [dict keys $count] {
    if {[dict get $count $key] == 1} {
        set count [dict remove $count $key]
    }
}
set numbered {}
foreach num [lreverse $unnumbered] {
    if {![dict exists $count $num]} {
        lappend numbered $num
    } else {
        lappend numbered "$num.[dict get $count $num]"
        dict incr count $num -1
    }
}
set numbered [lreverse $numbered]
puts $numbered
101.1 101.2 101.3 102.1 102.2 103 104 105.1 105.2 105.3 106
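The same counting idea can also be sketched as two forward passes, which avoids the double lreverse. This is an illustrative variant rather than a tested drop-in: number_duplicates is a hypothetical proc name, and note that it counts every occurrence of a value, adjacent or not.

```tcl
# number_duplicates is a hypothetical helper, not a built-in command.
proc number_duplicates {values} {
    # Pass 1: total number of occurrences of each value
    set total [dict create]
    foreach v $values {
        dict incr total $v
    }
    # Pass 2: running index per value; singletons get no suffix
    set seen [dict create]
    set out {}
    foreach v $values {
        if {[dict get $total $v] == 1} {
            lappend out $v
        } else {
            dict incr seen $v
            lappend out "$v.[dict get $seen $v]"
        }
    }
    return $out
}

set unnumbered [list 101 101 101 102 102 103 104 105 105 105 106]
puts [number_duplicates $unnumbered]
# 101.1 101.2 101.3 102.1 102.2 103 104 105.1 105.2 105.3 106
```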
An O(n) solution (single loop), which I believe looks a bit more like how you initially wanted to implement it:
set unnumbered [list 101 101 101 102 102 103 104 105 105 105 106]
set numbered [list]
set previous ""
set subcounter 1
foreach current $unnumbered {
    if {$previous == ""} {
        # First element: nothing to emit yet; $previous is set to $current below
    } elseif {$previous == $current} {
        lappend numbered $previous.$subcounter
        incr subcounter
    } else {
        if {$subcounter > 1} {
            lappend numbered $previous.$subcounter
        } else {
            lappend numbered $previous
        }
        set subcounter 1
    }
    set previous $current
}
if {$subcounter > 1} {
    lappend numbered $current.$subcounter
} else {
    lappend numbered $current
}
The loop basically adds one number late to the numbered list, so that the last if is required for the last number. Of course, this only works if you know that unnumbered is sorted.
EDIT: Actually, this is even closer! Since you can already get $previous, you can loop starting from the next element of the list, and once more after the last element (note that lindex returns an empty string when given an out-of-range index, which makes things easier here).
set unnumbered [list 101 101 101 102 102 103 104 105 105 105 106]
set numbered [list]
set previous [lindex $unnumbered 0]
set subcounter 1
for {set i 1} {$i <= [llength $unnumbered]} {incr i} {
    if {$previous == [lindex $unnumbered $i]} {
        lappend numbered $previous.$subcounter
        incr subcounter
    } else {
        if {$subcounter > 1} {
            lappend numbered $previous.$subcounter
        } else {
            lappend numbered $previous
        }
        set subcounter 1
    }
    set previous [lindex $unnumbered $i]
}
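The extra loop iteration relies on the out-of-range behaviour of lindex mentioned in the edit note. A tiny sketch of that behaviour on a made-up three-element list; past is an illustrative variable name.

```tcl
set sample [list 101 102 103]

# Index 3 is one past the last valid index (0..2); lindex does not
# raise an error here, it just returns the empty string.
set past [lindex $sample [llength $sample]]
puts [string length $past]    ;# 0
puts [expr {$past eq ""}]     ;# 1
```

So on the final pass the comparison with $previous is false for any non-empty value, and the last number is flushed by the else branch.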

tcl: sorting list depending on array

If List1 is:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
And array1 is:
{3 4 5} {12 13} {20 21}
How do I transform list1 according to array1 by replacing each sub-sequence listed in array1 with its reverse, i.e., producing this output:
1 2 5 4 3 6 7 8 9 10 11 13 12 14 15 16 17 18 19 21 20 22 23 24 25
    ^^^^^               ^^^^^                   ^^^^^
This is not a sorting task, this is a searching task.
If you assume that the ranges to reverse do not overlap, but are not necessarily present either (i.e., not using the fact that they are contiguous numbers), you get something like this:
# Iterate over each of the replacement patterns
foreach range $array1 {
    # Iterate over each of the locations where the first element of the current
    # replacement pattern is found
    foreach pos [lsearch -all -exact $list1 [lindex $range 0]] {
        # This will be the index of the *last* element in each subrange
        set pos2 [expr {$pos + [llength $range] - 1}]
        # Do the reversed replacement if the ranges match
        if {[lrange $list1 $pos $pos2] eq $range} {
            set list1 [lreplace $list1 $pos $pos2 {*}[lreverse $range]]
        }
    }
}
The result after this will be in the updated list1 variable. Wrapping into a procedure is left as an exercise.
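For reference, one possible way to do that wrapping. This is a sketch only: reverse_ranges is a hypothetical proc name, the loop body is the code above with the list passed as a parameter, and the same no-overlap assumption applies.

```tcl
# reverse_ranges is a hypothetical helper, not a built-in command.
# It returns a new list and leaves the caller's variable untouched.
proc reverse_ranges {lst ranges} {
    foreach range $ranges {
        foreach pos [lsearch -all -exact $lst [lindex $range 0]] {
            # Index of the last element of the candidate subrange
            set pos2 [expr {$pos + [llength $range] - 1}]
            # Replace only when the whole subrange matches
            if {[lrange $lst $pos $pos2] eq $range} {
                set lst [lreplace $lst $pos $pos2 {*}[lreverse $range]]
            }
        }
    }
    return $lst
}

set list1 {1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25}
set array1 {{3 4 5} {12 13} {20 21}}
puts [reverse_ranges $list1 $array1]
# 1 2 5 4 3 6 7 8 9 10 11 13 12 14 15 16 17 18 19 21 20 22 23 24 25
```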

the order within group apply function

Given the code below, I am trying to figure out whether the row order within each group will always remain the same as in the original dataframe.
It looks like the order within the group is preserved in my little example, but what if I have a dataframe with ~1 million records? Does pandas provide such a guarantee, or should I take care of that myself?
Code:
import numpy as np
import pandas as pd
N = 10
df = pd.DataFrame(index=range(N))
# Integer (floor) division keeps columns A and B integer-valued
df['A'] = [int(x) // 5 for x in np.random.randn(N) * 10.0]
df['B'] = [int(x) // 5 for x in np.random.randn(N) * 10.0]
df['v'] = np.random.randn(N)
def show_x(x):
    print(x)
    print("----------------")
df.groupby('A').apply(show_x)
print("===============")
print(df)
Output:
A B v
6 -4 -1 -2.047354
[1 rows x 3 columns]
----------------
A B v
6 -4 -1 -2.047354
[1 rows x 3 columns]
----------------
A B v
8 -3 0 -1.190831
[1 rows x 3 columns]
----------------
A B v
0 -1 -1 0.456397
9 -1 -2 -1.329169
[2 rows x 3 columns]
----------------
A B v
1 0 0 0.663928
2 0 2 0.626204
7 0 -3 -0.539166
[3 rows x 3 columns]
----------------
A B v
4 2 2 -1.115721
5 2 1 -1.905266
[2 rows x 3 columns]
----------------
A B v
3 4 -1 0.751016
[1 rows x 3 columns]
----------------
===============
A B v
0 -1 -1 0.456397
1 0 0 0.663928
2 0 2 0.626204
3 4 -1 0.751016
4 2 2 -1.115721
5 2 1 -1.905266
6 -4 -1 -2.047354
7 0 -3 -0.539166
8 -3 0 -1.190831
9 -1 -2 -1.329169
[10 rows x 3 columns]
If you are using apply, not only is the order not guaranteed, but, as you've found, it can call the function for the same group a couple of times (to decide which "path" to take / what type of result to return). So if your function has side effects, don't do this!
I recommend simply iterating through the groupby object!
In [11]: df = pd.DataFrame([[1, 2], [1, 4], [5, 6]], columns=['A', 'B'])
In [12]: df
Out[12]:
A B
0 1 2
1 1 4
2 5 6
In [13]: g = df.groupby('A')
In [14]: for key, sub_df in g:
   ....:     print("key =", key)
   ....:     print(sub_df)
   ....:     print('')  # apply whatever function you want
key = 1
A B
0 1 2
1 1 4
key = 5
A B
2 5 6
Note that this is ordered (the same as the levels); see g.grouper._get_group_keys():
In [21]: g.grouper.levels
Out[21]: [Int64Index([1, 5], dtype='int64')]
It's sorted by default (there's a sort kwarg when doing the groupby), though it's not clear what this actually means if it's not a numeric dtype.