Element-wise multiplication of two lists in Tcl

I have two lists of the same length and I want to multiply them element-wise (that is, pair the elements by position and multiply each pair; not a Cartesian product). How do I do it? For example, if I write
set a {1 2 3 4 5}
set b {1 2 3 4 5}
then the desired output is:
{1 4 9 16 25}

A two-list lmap (available since Tcl 8.6) is perfect for this:
set a {1 2 3 4 5}
set b {1 2 3 4 5}
set result [lmap x $a y $b {expr {$x * $y}}]
If you're on Tcl 8.5 (or older) use this instead:
set a {1 2 3 4 5}
set b {1 2 3 4 5}
set result {}
foreach x $a y $b {
    lappend result [expr {$x * $y}]
}
The multi-list form of foreach has been supported for a very long time indeed.

Related

tcl split list by n character

I have a list like below.
{2 1 0 2 2 0 2 3 0 2 4 0}
I would like to insert a comma after every 3 elements using Tcl.
{2 1 0,2 2 0,2 3 0,2 4 0}
I am looking for your help.
Regards
If the elements always come in groups of exactly three, it is easy to use lmap and join:
set theList {2 1 0 2 2 0 2 3 0 2 4 0}
set joined [join [lmap {a b c} $theList {list $a $b $c}] ","]
One way:
Append n elements at a time from your list to a string using a loop, but first append a comma if it's not the first time through.
#!/usr/bin/env tclsh
proc insert_commas {n lst} {
    set res ""
    set len [llength $lst]
    for {set i 0} {$i < $len} {incr i $n} {
        if {$i > 0} {
            append res ,
        }
        append res [lrange $lst $i [expr {$i + $n - 1}]]
    }
    return $res
}
set lst {2 1 0 2 2 0 2 3 0 2 4 0}
puts [insert_commas 3 $lst] ;# 2 1 0,2 2 0,2 3 0,2 4 0

How can I split a variable by line in TCL?

I have a variable named "results" with this value:
{0 0 0 0 0 0 0 0 0 0 0 3054 11013}
{0 0 0 0 0 0 0 0 0 0 0 5 13 15}
{0.000 3272.744 12702.352 30868.696}
I'd like to store each line (values between the '{}') in a separate variable and then, compare each of the elements of each line with a threshold (this threshold will be different for each line, that's why I need to split them).
I've tried
set result [split $results \n]
But it doesn't really give me a neat list of elements. Is there any way to get 3 lists from the variable "results"?
If I understand correctly, and your example data is represented accurately, then you do not need to [split] the data held by results at all; you can leave that to Tcl's list parser. In other words, the input is already a valid string representation of a Tcl list (of three sublists), ready for further processing. Watch:
set results {
{0 0 0 0 1}
{2 2 3 3 3}
{1 1 2 3 4}
};
set thresholds {
3
2
1
}
lmap values $results threshold $thresholds {
    lmap v $values {expr {$v >= $threshold}}
}
This will produce:
{0 0 0 0 0} {1 1 1 1 1} {1 1 1 1 1}
Background: when $results is worked on by [lmap], it will be turned into a list automatically.
I think it's better to split on the newline character and then apply a regexp to fetch the data. Here is some sample code I tried:
set results "{0 0 0 0 1}
{2 2 3 3 3}
{1 1 2 3 4}";
set result [split $results \n];
foreach line $result {
    if {[regexp {^\s*\{(.+)\}\s*} $line Complete_Match Content]} {
        puts "$Content\n";
    }
}

Extract specific rows from SAS dataset based on a particular cell value of a variable

I want to extract specific set of rows from a large SAS dataset based on a particular cell value of a variable into a new dataset. In this dataset, I have 6 variables. Following is an example of this dataset:
Variable names:  Var1    Var2  Var3  Var4  Var5  Var6
Row 1            A       1     2     3     4     5
Row 2            B       1     2     3     4     5
Row 3            A       1     2     3     4     5
Row 4            B       1     2     3     4     5
Row 5            Sample  1     2     3     4     5
Row 6            A       1     2     3     4     5
Row 7            B       1     2     3     4     5
Row 8            A       1     2     3     4     5
Row 9            B       1     2     3     4     5
Row 10           A       1     2     3     4     5
Row 11           B       1     2     3     4     5
Row 12           A       1     2     3     4     5
Row 13           B       1     2     3     4     5
From this dataset, I want to select the 8 rows that follow each row in which Var1 has the value "Sample". I want to extract multiple such sets of 8 rows from this dataset into a new dataset. Can someone please guide me on how I can accomplish this in SAS?
Thank you
Would the output statement work for you?
data have;
    infile datalines dsd dlm=",";
    input Variable_names : $char10.
          Var1 : $char10.
          Var2 : 8.
          Var3 : 8.
          Var4 : 8.
          Var5 : 8.
          Var6 : 8.;
    datalines;
Row 1 , A , 1, 2, 3, 4, 5
Row 2 , B , 1, 2, 3, 4, 5
Row 3 , A , 1, 2, 3, 4, 5
Row 4 , B , 1, 2, 3, 4, 5
Row 5 , Sample, 1, 2, 3, 4, 5
Row 6 , A , 1, 2, 3, 4, 5
Row 7 , B , 1, 2, 3, 4, 5
Row 8 , A , 1, 2, 3, 4, 5
Row 9 , B , 1, 2, 3, 4, 5
Row 10, A , 1, 2, 3, 4, 5
Row 11, B , 1, 2, 3, 4, 5
Row 12, A , 1, 2, 3, 4, 5
Row 13, B , 1, 2, 3, 4, 5
;
run;
data want_without
     want_with;
    set have;
    if strip(Var1) = "Sample" then output want_with;
    else output want_without;
run;
One way to do this is to set a counter to 8 whenever the previous record has var1="Sample", decrement the counter on every other record, and output only the records where the counter is >= 1.
data want ;
    set have ;
    if lag(var1) = "Sample" then counter = 8 ;
    else counter+(-1) ; *counter is implicitly retained ;
    if counter>=1 then output ;
    * drop counter ;
run ;
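For readers who want to trace the counter logic outside SAS, here is a minimal Python sketch of the same idea. The function name rows_after_marker, its parameters, and the sample rows are hypothetical; it assumes each row is a tuple whose first field plays the role of Var1.

```python
def rows_after_marker(rows, marker="Sample", n=8):
    """Return the n rows that follow each row whose first field equals marker."""
    selected = []
    counter = 0  # number of rows still to be emitted
    for row in rows:
        if counter > 0:
            selected.append(row)
            counter -= 1
        # A marker row (re)starts the window for the rows that come after it.
        if row[0] == marker:
            counter = n
    return selected


# Hypothetical data mirroring the 13-row example: row 5 is the marker row.
labels = ["A", "B", "A", "B", "Sample", "A", "B", "A", "B", "A", "B", "A", "B"]
rows = [(label, row_number) for row_number, label in enumerate(labels, start=1)]

selected = rows_after_marker(rows)
print([row_number for _, row_number in selected])  # [6, 7, 8, 9, 10, 11, 12, 13]
```

As in the SAS step, a marker row that falls inside an open window is itself emitted and then restarts the count for the rows after it.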
You can set a counter and output as desired: use a RETAIN statement coupled with an IF (and OUTPUT) statement. You may need to tweak the IF condition, but I think you get the idea here.
data want;
    set have;
    retain counter 10;
    if strip(Var1) = "Sample" then counter=1;
    else counter+1;
    if 2<=counter<=9 then OUTPUT;
    *if 2<=counter<=9; *this is the same as above, but less code;
run;

the order within group apply function

Given the code below, I am trying to figure out whether the row order within each group will always remain the same as in the original dataframe.
It looks like the order within each group is preserved for my little example, but what if I have a dataframe with ~1 million records? Does pandas provide such a guarantee, or should I take care of that myself?
Code:
import numpy as np
import pandas as pd

N = 10
df = pd.DataFrame(index=range(N))
# int(x) // 5 keeps the integer (floor) division of the original Python 2 code
df['A'] = [int(x) // 5 for x in np.random.randn(N) * 10.0]
df['B'] = [int(x) // 5 for x in np.random.randn(N) * 10.0]
df['v'] = np.random.randn(N)

def show_x(x):
    print(x)
    print("----------------")

df.groupby('A').apply(show_x)
print("===============")
print(df)
Output:
A B v
6 -4 -1 -2.047354
[1 rows x 3 columns]
----------------
A B v
6 -4 -1 -2.047354
[1 rows x 3 columns]
----------------
A B v
8 -3 0 -1.190831
[1 rows x 3 columns]
----------------
A B v
0 -1 -1 0.456397
9 -1 -2 -1.329169
[2 rows x 3 columns]
----------------
A B v
1 0 0 0.663928
2 0 2 0.626204
7 0 -3 -0.539166
[3 rows x 3 columns]
----------------
A B v
4 2 2 -1.115721
5 2 1 -1.905266
[2 rows x 3 columns]
----------------
A B v
3 4 -1 0.751016
[1 rows x 3 columns]
----------------
===============
A B v
0 -1 -1 0.456397
1 0 0 0.663928
2 0 2 0.626204
3 4 -1 0.751016
4 2 2 -1.115721
5 2 1 -1.905266
6 -4 -1 -2.047354
7 0 -3 -0.539166
8 -3 0 -1.190831
9 -1 -2 -1.329169
[10 rows x 3 columns]
If you are using apply, not only is the order not guaranteed but, as you've found, it can call the function on the same group more than once (to decide which "path" to take / what type of result to return). So if your function has side effects, don't do this!
I recommend simply iterating through the groupby object!
In [11]: df = pd.DataFrame([[1, 2], [1, 4], [5, 6]], columns=['A', 'B'])
In [12]: df
Out[12]:
A B
0 1 2
1 1 4
2 5 6
In [13]: g = df.groupby('A')
In [14]: for key, sub_df in g:
             print("key =", key)
             print(sub_df)
             print('')  # apply whatever function you want
key = 1
A B
0 1 2
1 1 4
key = 5
A B
2 5 6
Note that this is ordered (the same as the levels) see g.grouper._get_group_keys():
In [21]: g.grouper.levels
Out[21]: [Int64Index([1, 5], dtype='int64')]
It's sorted by default (there's a sort kwarg when doing the groupby), though it's not clear what this actually means if it's not a numeric dtype.
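As a small check of the ordering discussed above, the following sketch iterates over a groupby object and records the order in which keys and row labels arrive. The frame and the names keys, rows_by_key and unsorted_keys are made up for illustration. The pandas documentation for the sort parameter says that it only affects the group keys and that groupby preserves the order of rows within each group, so the row labels inside each group should stay in their original relative order.

```python
import pandas as pd

# Hypothetical frame: the keys in column A are deliberately not in sorted order.
df = pd.DataFrame({"A": [5, 1, 5, 1, 3], "B": [10, 20, 30, 40, 50]})

keys = []          # group keys, in the order the iteration yields them
rows_by_key = {}   # original row labels seen inside each group
for key, sub_df in df.groupby("A"):
    keys.append(int(key))
    rows_by_key[int(key)] = sub_df.index.tolist()

print(keys)         # [1, 3, 5]: keys come out sorted (sort=True is the default)
print(rows_by_key)  # {1: [1, 3], 3: [4], 5: [0, 2]}: rows keep their original order

# With sort=False the groups are yielded in order of first appearance instead.
unsorted_keys = [int(key) for key, _ in df.groupby("A", sort=False)]
print(unsorted_keys)  # [5, 1, 3]
```

Iterating like this calls your code exactly once per group, which is why it is the safer choice when the per-group function has side effects.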

Applying predicates on a list in R

Given a list of values in R, what is a nice way to filter values in a list by a given predicate function?
It's not entirely clear whether you have a proper list object in R, or another type of object such as a data.frame or vector. Assuming you have a true list object, we can combine lapply and subset to do what you want. If you don't have a list, then there's no need for lapply.
set.seed(1)
#Fake data
dat <- list(a = data.frame(x = sample(1:10, 20, TRUE))
, b = data.frame(x = sample(1:10, 20, TRUE)))
#Apply the subset function over the list
lapply(dat, subset, x < 3)
$a
x
10 1
12 2
$b
x
4 2
7 1
14 2
18 2
#Example two
lapply(dat, subset, x %in% c(1,7,9))
$a
x
6 9
8 7
9 7
10 1
13 7
$b
x
3 7
7 1
9 9
15 9
16 7