Crystal static variables

Does Crystal have static variables, or must I use global variables with file/global scope?
def test(value)
  static var = 1
  var += value
  return var
end
pp test 0 #=> 1
pp test 1 #=> 2
pp test 1 #=> 3
pp test 0 #=> 3

Crystal has no static variables scoped to methods. You'll need to use class variables for this:
class Test
  @@var = 1
  def self.test(value)
    @@var += value
    return @@var
  end
end
pp Test.test 0 #=> 1
pp Test.test 1 #=> 2
pp Test.test 1 #=> 3
pp Test.test 0 #=> 3
You can also use the class_property, class_setter or class_getter macros:
class Test
  class_property var = 1
end
Test.var += 0
pp Test.var #=> 1
Test.var += 1
pp Test.var #=> 2
Test.var += 1
pp Test.var #=> 3
Test.var += 0
pp Test.var #=> 3


Cumsum entire table and reset at zero

I have following data frame.
d = pd.DataFrame({'one' : [0,1,1,1,0,1],'two' : [0,0,1,0,1,1]})
one two
0 0 0
1 1 0
2 1 1
3 1 0
4 0 1
5 1 1
I want a cumulative sum that resets at zero.
The desired output is:
pd.DataFrame({'one' : [0,1,2,3,0,1],'two' : [0,0,1,0,1,2]})
one two
0 0 0
1 1 0
2 2 1
3 3 0
4 0 1
5 1 2
I have tried using groupby, but it does not work for the entire table.
df2 = d.apply(lambda x: x.groupby((~x.astype(bool)).cumsum()).cumsum())
one two
0 0 0
1 1 0
2 2 1
3 3 0
4 0 1
5 1 2
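To see why the groupby one-liner works, it can help to break it into steps for a single column. The sketch below is only an illustration; the intermediate names `is_zero`, `run_id` and `reset_sum` are made up here and do not appear in the answer.

```python
import pandas as pd

df = pd.DataFrame({'one': [0, 1, 1, 1, 0, 1], 'two': [0, 0, 1, 0, 1, 1]})

col = df['one']
# True wherever the value is zero; each zero starts a new run.
is_zero = ~col.astype(bool)
# Running count of zeros seen so far: every row of a run shares one label.
run_id = is_zero.cumsum()
# Cumulative sum computed separately inside each run, so it restarts at zero.
reset_sum = col.groupby(run_id).cumsum()

# Applying the same steps to every column reproduces the one-liner.
whole = df.apply(lambda x: x.groupby((~x.astype(bool)).cumsum()).cumsum())

print(run_id.tolist())
print(reset_sum.tolist())
```

Each zero bumps the run label, so rows between two zeros end up in the same group and the cumulative sum never crosses a zero.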
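def cum_reset_pd(df):
    csum = df.cumsum()
    # subtract the running total as it stood at the most recent zero;
    # fillna(0) covers columns that do not start with a zero
    return (csum - csum.where(df == 0).ffill().fillna(0)).astype(df.dtypes)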
one two
0 0 0
1 1 0
2 2 1
3 3 0
4 0 1
5 1 2
def cum_reset_np(df):
    v = df.values
    z = np.zeros_like(v)
    # column index j and row index i of every non-zero cell, column by column
    j, i = np.where(v.T)
    r = np.arange(1, i.size + 1)
    # positions where a new run starts (row gap or column change)
    p = np.where(
        np.append(False, (np.diff(i) != 1) | (np.diff(j) != 0))
    )[0]
    b = np.append(0, np.append(p, r.size))
    z[i, j] = r - b[:-1].repeat(np.diff(b))
    return pd.DataFrame(z, df.index, df.columns)
one two
0 0 0
1 1 0
2 2 1
3 3 0
4 0 1
5 1 2
Why go through this trouble?
Because it's quicker!
This one is without using Pandas, but using NumPy and list comprehensions:
import numpy as np
d = {'one': [0,1,1,1,0,1], 'two': [0,0,1,0,1,1]}
out = {}
for key in d.keys():
    l = d[key]
    # positions of the zeros, i.e. where each run starts
    indices = np.argwhere(np.array(l) == 0).flatten()
    indices = np.append(indices, len(l))
    out[key] = np.concatenate([np.cumsum(l[indices[n-1]:indices[n]])
                               for n in range(1, indices.shape[0])]).ravel()
First, I find all occurrences of 0 (the positions at which to split the lists), then I calculate the cumsum of the resulting sublists and insert them into a new dict. Note that this assumes each list starts with 0; any values before the first 0 would be dropped.
This should do it:
import pandas as pd
d = {'one' : [0,1,1,1,0,1],'two' : [0,0,1,0,1,1]}
one = d['one']
two = d['two']
i = 0
new_one = []
for item in one:
    if item == 0:
        i = 0
    i += item
    new_one.append(i)  # record the running total for this row
j = 0
new_two = []
for item in two:
    if item == 0:
        j = 0
    j += item
    new_two.append(j)
d['one'], d['two'] = new_one, new_two
df = pd.DataFrame(d)
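The two loops above do the same thing for each column, so they could be folded into one helper. This is only a sketch of that idea; the function name `reset_cumsum` is hypothetical and not part of the answer.

```python
def reset_cumsum(values):
    """Running total of `values` that restarts whenever a zero is seen."""
    total = 0
    out = []
    for v in values:
        # A zero resets the total; anything else is added to it.
        total = 0 if v == 0 else total + v
        out.append(total)
    return out


d = {'one': [0, 1, 1, 1, 0, 1], 'two': [0, 0, 1, 0, 1, 1]}
# Apply the helper to every column, whatever the columns are called.
result = {key: reset_cumsum(col) for key, col in d.items()}
print(result)
```

The resulting dict can be passed to pd.DataFrame exactly as in the answer, and the helper does not need the column names to be hard-coded.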

Find position of first non-zero decimal

Suppose I have the following local macro:
loc a = 12.000923
I would like to get the decimal position of the first non-zero decimal (4 in this example).
There are many ways to achieve this. One is to treat a as a string and to find the position of .:
loc a = 12.000923
loc b = strpos(string(`a'), ".")
di "`b'"
From here one could further loop through the decimals and count until reaching the first non-zero digit. Of course, this doesn't seem to be a very elegant approach.
Can you suggest a better way to deal with this? Regular expressions perhaps?
Well, I don't know Stata, but according to the documentation, \.(0+)? is supported, and it shouldn't be hard to convert this two-line JavaScript function to Stata.
It returns the position of the first nonzero decimal or -1 if there is no decimal.
function getNonZeroDecimalPosition(v) {
  var v2 = v.replace(/\.(0+)?/, "")
  return v2.length !== v.length ? v.length - v2.length : -1
}
We remove from the input string a dot followed by optional consecutive zeros.
The difference between the lengths of the original input string and this new string gives the position of the first nonzero decimal.
Sample Snippet
function getNonZeroDecimalPosition(v) {
  var v2 = v.replace(/\.(0+)?/, "")
  return v2.length !== v.length ? v.length - v2.length : -1
}
var samples = [
  "loc a = 12.00012",
  "loc b = 12",
  "loc c = 12.012",
  "loc d = 1.000012",
  "loc e = -10.00012",
  "loc f = -10.05012",
  "loc g = 0.0012"
]
samples.forEach(function(sample) {
  // print each sample next to the position found for it
  console.log(sample + " => " + getNonZeroDecimalPosition(sample))
})
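For readers who want to try the length-difference idea outside a browser, here is a rough Python port of the same logic. It is a sketch, not the answer's code: the name `first_nonzero_decimal_position` is made up, and like the JavaScript version it expects the number as a string.

```python
import re


def first_nonzero_decimal_position(text):
    """Position of the first non-zero decimal in `text`, or -1 if there is no dot."""
    # Drop the first dot together with any zeros that directly follow it.
    stripped = re.sub(r"\.0*", "", text, count=1)
    if stripped == text:
        return -1  # nothing was removed, so there was no decimal point
    # The dot plus the removed zeros count up to the first non-zero digit.
    return len(text) - len(stripped)


for sample in ["12.00012", "12", "12.012", "-10.05012", "0.0012"]:
    print(sample, first_nonzero_decimal_position(sample))
```

Passing count=1 mirrors the non-global replace in the JavaScript function, so only the first dot is considered.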
You can do this in mata in one line and without using regular expressions:
foreach x in 124.000923 65.020923 1.000022030 0.0090843 .00000425 {
    mata: selectindex(tokens(tokens(st_local("x"), ".")[selectindex(tokens(st_local("x"), ".") :== ".") + 1], "0") :!= "0")[1]
}
Below, you can see the steps in detail:
. local x = 124.000823
. mata:
: /* Step 1: break Stata's local macro x in tokens using . as a parsing char */
: a = tokens(st_local("x"), ".")
: a
1 2 3
1 | 124 . 000823 |
: /* Step 2: tokenize the string in a[1,3] using 0 as a parsing char */
: b = tokens(a[3], "0")
: b
1 2 3 4
1 | 0 0 0 823 |
: /* Step 3: find which values are different from zero */
: c = b :!= "0"
: c
1 2 3 4
1 | 0 0 0 1 |
: /* Step 4: find the first index position where this is true */
: d = selectindex(c :!= 0)[1]
: d
  4
: end
You can also find the position of the string of interest in Step 2 using the
same logic.
This is the index value after the one for .:
. mata:
: k = selectindex(a :== ".") + 1
: k
  3
: end
In which case, Step 2 becomes:
. mata:
: b = tokens(a[k], "0")
: b
1 2 3 4
1 | 0 0 0 823 |
: end
For unexpected cases without decimal:
foreach x in 124.000923 65.020923 1.000022030 12 0.0090843 .00000425 {
    if strmatch("`x'", "*.*") mata: selectindex(tokens(tokens(st_local("x"), ".")[selectindex(tokens(st_local("x"), ".") :== ".") + 1], "0") :!= "0")[1]
    else display " 0"
}
A straightforward answer uses regular expressions and string functions.
One can select all the decimals, find the first non-zero decimal, and finally find its position:
loc v = "123.000923"
loc v2 = regexr("`v'", "^[0-9]*[/.]", "") // 000923
loc v3 = regexr("`v'", "^[0-9]*[/.][0]*", "") // 923
loc first = substr("`v3'", 1, 1) // 9
loc first_pos = strpos("`v2'", "`first'") // 4: position of 9 in 000923
di "`v2'"
di "`v3'"
di "`first'"
di "`first_pos'"
Which in one step is equivalent to:
loc first_pos2 = strpos(regexr("`v'", "^[0-9]*[/.]", ""), substr(regexr("`v'", "^[0-9]*[/.][0]*", ""), 1, 1))
di "`first_pos2'"
An alternative suggested in another answer is to compare the length of the decimals block with its leading 0s stripped to the length of the full block.
In one step this is:
loc first_pos3 = strlen(regexr("`v'", "^[0-9]*[/.]", "")) - strlen(regexr("`v'", "^[0-9]*[/.][0]*", "")) + 1
di "`first_pos3'"
Not using regex but log10 instead (which treats a number like a number), this function will:
For numbers >= 1 or numbers <= -1, return as a positive number the number of digits to the left of the decimal point.
Or (and more specifically to what you were asking), for numbers between -1 and 1, return as a negative number the number of zeros between the decimal point and the first non-zero digit.
const digitsFromDecimal = (n) => {
  let dFD = Math.log10(Math.abs(n)) | 0;
  if (n >= 1 || n <= -1) { dFD++; }
  return dFD;
};
var x = [118.8161330, 11.10501660, 9.254180571, -1.245501523, 1, 0, 0.864931613, 0.097007836, -0.010880074, 0.009066729];
x.forEach(element => {
  console.log(`${element}, Digits from Decimal: ${digitsFromDecimal(element)}`);
});
// Output
// 118.816133, Digits from Decimal: 3
// 11.1050166, Digits from Decimal: 2
// 9.254180571, Digits from Decimal: 1
// -1.245501523, Digits from Decimal: 1
// 1, Digits from Decimal: 1
// 0, Digits from Decimal: 0
// 0.864931613, Digits from Decimal: 0
// 0.097007836, Digits from Decimal: -1
// -0.010880074, Digits from Decimal: -1
// 0.009066729, Digits from Decimal: -2
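The same log10 idea can be sketched in Python to check the numbers above. This is an illustrative port, not the answer's code: `digits_from_decimal` is a made-up name, and zero is handled explicitly here because math.log10(0) raises an error in Python, whereas the JavaScript `| 0` turns -Infinity into 0.

```python
import math


def digits_from_decimal(n):
    """Digits left of the decimal point (positive) or zeros right of it (negative)."""
    if n == 0:
        return 0  # assumption: mirror the JavaScript result for 0
    # int() truncates toward zero, like `| 0` in the JavaScript version.
    d = int(math.log10(abs(n)))
    if abs(n) >= 1:
        d += 1
    return d


for value in [118.816133, 9.254180571, -1.245501523, 1, 0,
              0.864931613, 0.097007836, -0.010880074, 0.009066729]:
    print(value, digits_from_decimal(value))
```

Under this convention the position of the first non-zero decimal of a number between -1 and 1 is one more than the absolute value of the result, so 0.009066729 gives -2 and its first non-zero decimal is in position 3.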
Pearly's Mata solution is very appealing, but attention should be paid to the "unexpected" case of no decimal at all.
Besides, a regular expression is not a bad choice when it fits in one memorable line.
loc v = "123.000923"
capture local x = regexm("`v'","(\.0*)")*length(regexs(0))
The code below tests more values of v.
foreach v in 124.000923 605.20923 1.10022030 0.0090843 .00000425 12 .000125 {
    capture local x = regexm("`v'","(\.0*)")*length(regexs(0))
    di "`v': The wanted number = `x'"
}

Removing some particular rows in pandas

I want to delete some rows in pandas dataframe.
ID Value
2012XY000 1
2012XY001 1
2015AB000 4
2015PQ001 5
2016DF00G 2
I want to delete rows whose ID does not start with 2015.
How should I do that?
Use startswith with boolean indexing:
print (df.ID.str.startswith('2015'))
0 False
1 False
2 True
3 True
4 False
Name: ID, dtype: bool
print (df[df.ID.str.startswith('2015')])
ID Value
2 2015AB000 4
3 2015PQ001 5
EDIT by comment:
print (df)
ID Value
0 2012XY000 1
1 2012XY001 1
2 2015AB000 4
3 2015PQ001 5
4 2015XQ001 5
5 2016DF00G 2
print ((df.ID.str.startswith('2015')) & (df.ID.str[4] != 'X'))
0 False
1 False
2 True
3 True
4 False
5 False
Name: ID, dtype: bool
print (df[(df.ID.str.startswith('2015')) & (df.ID.str[4] != 'X')])
ID Value
2 2015AB000 4
3 2015PQ001 5
Use str.match with the regex string r'^2015'.
To also exclude the IDs that have an X right afterwards, use r'^2015[^X]'.
The regex r'^2015[^X]' translates into
^2015 - must start with 2015
[^X] - character after 2015 must not be X
consider the df
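A minimal sketch of the str.match filter, assuming a frame shaped like the one printed in the earlier answer (the variable names `starts_2015` and `no_x` are made up for illustration):

```python
import pandas as pd

df = pd.DataFrame({
    'ID': ['2012XY000', '2012XY001', '2015AB000',
           '2015PQ001', '2015XQ001', '2016DF00G'],
    'Value': [1, 1, 4, 5, 5, 2],
})

# Keep rows whose ID starts with 2015.
starts_2015 = df[df.ID.str.match(r'^2015')]

# Keep rows whose ID starts with 2015 and whose next character is not X.
no_x = df[df.ID.str.match(r'^2015[^X]')]

print(starts_2015)
print(no_x)
```

Because str.match returns a boolean Series, it can be used for boolean indexing in the same way as the startswith mask above.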

Regex Negations in Vim

How do I convert var x+=1+2+3+(5+6+7) to var x += 1 + 2 + 3 + ( 5 + 6 + 7 )
Using regular expressions, something like :%s/+/\ x\ /g won't work because it will convert += to + = (amongst other problems). So instead one would use negations (negatives, nots, whatever they're called) like so :%s/\s\@!+/\ +/g, which is about as complicated a way as one can say "plus sign without an empty space before it". But now this converts something like x++ into x + +. What I need is something more complex. I need more than one constraint in the negation, and an additional constraint afterwards. Something like so, but this doesn't work :%s/[\s+]\@!+\x\@!/\ +/g
Could someone please provide the one, or possibly two regex statements which will pad out an example operator, such that I can model the rest of my rules on it/them.
I find beautifiers for languages like javascript or PHP don't give me full control (see here). Therefore, I am attempting to use regex to carry out the following conversions:
foo(1,2,3,4) → foo( 1, 2, 3, 4 )
var x=1*2*3 → var x = 1 * 2 * 3
var x=1%2%3 → var x = 1 % 2 % 3
var x=a&&b&&c → var x = a && b && c
var x=a&b&c → var x = a & b & c
Any feedback would also be appreciated
Thanks to the great feedback, I now have a regular expression like so to work from. I am running these two regular expressions:
:%s/\(\w\)\([+\-*\/%|&~)=]\)/\1\ \2/g
:%s/\([+\-*\/%|&~,(=]\)\(\w\)/\1\ \2/g
And it is working fairly well. Here are some results.
(1+2+3+4,1+2+3+4,1+2+3+4) --> ( 1 + 2 + 3 + 4, 1 + 2 + 3 + 4, 1 + 2 + 3 + 4 )
(1-2-3-4,1-2-3-4,1-2-3-4) --> ( 1 - 2 - 3 - 4, 1 - 2 - 3 - 4, 1 - 2 - 3 - 4 )
(1*2*3*4,1*2*3*4,1*2*3*4) --> ( 1 * 2 * 3 * 4, 1 * 2 * 3 * 4, 1 * 2 * 3 * 4 )
(1/2/3/4,1/2/3/4,1/2/3/4) --> ( 1 / 2 / 3 / 4, 1 / 2 / 3 / 4, 1 / 2 / 3 / 4 )
(1%2%3%4,1%2%3%4,1%2%3%4) --> ( 1 % 2 % 3 % 4, 1 % 2 % 3 % 4, 1 % 2 % 3 % 4 )
(1|2|3|4,1|2|3|4,1|2|3|4) --> ( 1 | 2 | 3 | 4, 1 | 2 | 3 | 4, 1 | 2 | 3 | 4 )
(1&2&3&4,1&2&3&4,1&2&3&4) --> ( 1 & 2 & 3 & 4, 1 & 2 & 3 & 4, 1 & 2 & 3 & 4 )
(1~2~3~4,1~2~3~4,1~2~3~4) --> ( 1 ~ 2 ~ 3 ~ 4, 1 ~ 2 ~ 3 ~ 4, 1 ~ 2 ~ 3 ~ 4 )
(1&&2&&3&&4,1&&2&&3&&4,1&&2&&3&&4) --> ( 1 && 2 && 3 && 4, 1 && 2 && 3 && 4, 1 && 2 && 3 && 4 )
(1||2||3||4,1||2||3||4,1||2||3||4) --> ( 1 || 2 || 3 || 4, 1 || 2 || 3 || 4, 1 || 2 || 3 || 4 )
var x=1+(2+(3+4*(965%(123/(456-789))))); --> var x = 1 +( 2 +( 3 + 4 *( 965 %( 123 /( 456 - 789 )))));
It seems to work fine for everything except nested brackets. If I fix the nested brackets problem, I will update it here.
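To experiment with the two substitutions outside Vim, they can be mirrored with Python's re module. This is only a sketch under the assumption that Python's \w is close enough to Vim's for these inputs; the names `BEFORE`, `AFTER` and `pad_operators` are invented for the example.

```python
import re

# Pass 1: a word character directly followed by an operator or ")".
BEFORE = re.compile(r"(\w)([+\-*/%|&~)=])")
# Pass 2: an operator, "," or "(" directly followed by a word character.
AFTER = re.compile(r"([+\-*/%|&~,(=])(\w)")


def pad_operators(line):
    """Insert a space between operands and operators, in two passes."""
    line = BEFORE.sub(r"\1 \2", line)
    return AFTER.sub(r"\1 \2", line)


print(pad_operators("(1+2+3+4,1+2+3+4)"))
print(pad_operators("var x=a&&b&&c"))
print(pad_operators("1+(2+3)"))
```

The last call reproduces the nested-bracket gap: an operator followed by "(" matches neither pass, because both passes require a word character on one side.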

which list element is being processed when using snowfall::sfLapply?

Assume we have a list (mylist) that is used as the input object for an lapply function. Is there a way to know which element in mylist is being evaluated? The method should work with lapply and snowfall::sfLapply (and possibly other members of the apply family) as well.
On chat, Gavin Simpson suggested the following method. This works great for lapply but not so much for sfLapply. I would like to avoid extra packages or fiddling with the list. Any suggestions?
mylist <- list(a = 1:10, b = 1:10)
foo <- function(x) {
    deparse(substitute(x))
}
bar <- lapply(mylist, FUN = foo)
> bar
[1] "X[[1L]]"
[1] "X[[2L]]"
This is the parallel version that isn't cutting it.
library(snowfall)
sfInit(parallel = TRUE, cpus = 2, type = "SOCK") # I use 2 cores
sfExport("foo", "mylist")
bar.para <- sfLapply(x = mylist, fun = foo)
> bar.para
[1] "X[[1L]]"
[1] "X[[1L]]"
I think you are going to have to use Shane's solution/suggestion in that chat session. Store your objects in a list such that each component of the top list contains a component with the name or ID or experiment contained in that list component, plus a component containing the object you want to process:
obj <- list(list(ID = 1, obj = 1:10), list(ID = 2, obj = 1:10),
list(ID = 3, obj = 1:10), list(ID = 4, obj = 1:10),
list(ID = 5, obj = 1:10))
So we have the following structure:
> str(obj)
List of 5
$ :List of 2
..$ ID : num 1
..$ obj: int [1:10] 1 2 3 4 5 6 7 8 9 10
$ :List of 2
..$ ID : num 2
..$ obj: int [1:10] 1 2 3 4 5 6 7 8 9 10
$ :List of 2
..$ ID : num 3
..$ obj: int [1:10] 1 2 3 4 5 6 7 8 9 10
$ :List of 2
..$ ID : num 4
..$ obj: int [1:10] 1 2 3 4 5 6 7 8 9 10
$ :List of 2
..$ ID : num 5
..$ obj: int [1:10] 1 2 3 4 5 6 7 8 9 10
Then have something like the first line in the following function, followed by your own processing code:
foo <- function(x) {
    writeLines(paste("Processing Component:", x$ID))
    ## ... process x$obj here ...
}
Which will do this:
> res <- lapply(obj, foo)
Processing Component: 1
Processing Component: 2
Processing Component: 3
Processing Component: 4
Processing Component: 5
Which might work on snowfall.
I could also alter the attributes like so.
mylist <- list(a = 1:10, b = 1:10)
attr(mylist[[1]], "seq") <- 1
attr(mylist[[2]], "seq") <- 2
foo <- function(x) {
    writeLines(paste("Processing Component:", attributes(x)))
}
bar <- lapply(mylist, FUN = foo)
(and the parallel version)
mylist <- list(a = 1:10, b = 1:10)
attr(mylist[[1]], "seq") <- 1
attr(mylist[[2]], "seq") <- 2
foo <- function(x) {
    x <- paste("Processing Component:", attributes(x))
}
sfExport("mylist", "foo")
bar <- sfLapply(mylist, fun = foo)