Incrementing os.clock() in Lua - unit-testing

Summary: Can I cause time to (appear to) pass in my script without busywaiting? Using os.execute('sleep 1') doesn't cut it (and also wastes time).
Details
I have a library that does work after specific intervals. I currently schedule this by injecting a table into a time-sorted list. Roughly:
local delayedWork = {}
function handleLater( data, delaySeconds )
  local delayedItem = { data=data, execTime=os.clock() + delaySeconds }
  local i=1
  for _,existing in ipairs(delayedWork) do
    if existing.execTime > delayedItem.execTime then break else i=i+1 end
  end
  table.insert(delayedWork,i,delayedItem)
end
Later on I periodically check for work to do. Roughly:
function processDelayedWork()
  local i,last = 1,#delayedWork
  while i<=last do
    local delayedItem = delayedWork[i]
    if delayedItem.execTime <= os.clock() then
      table.remove(delayedWork,i)
      -- use delayedItem.data
      last = last-1
    else
      i=i+1
    end
  end
end
(The fact that this uses CPU time instead of wall time is an issue for a different question. The fact that I repeatedly shift the array during processing instead of one compacting pass is an optimization not relevant here.)
When I am running unit tests of the system I need to cause time to pass, preferably faster than normal. However, calling os.execute('sleep 1') consumes one wall clock second, but does not cause os.clock() to increment.
$ lua -e "require 'os' print(os.clock()) os.execute('sleep 1') print(os.clock())"
0.002493
0.002799
I can't use os.time() because it only has integer-second resolution (and less than that on systems compiled to use float instead of double for numbers):
$ lua -e "require 'os' print(os.time()) os.execute('sleep 0.4') print(os.time())"
1397704209
1397704209
Can I force time to pass in my script without just busywaiting?

Paul answered this already: make os.clock return whatever you need.
I'm adding an implementation that lets you control how fast time passes according to os.clock.
do
  local realclock = os.clock
  local lasttime = realclock()
  local faketime = lasttime
  local clockMultiplier = 1
  function setClockMultiplier(multiplier)
    clockMultiplier = multiplier
  end
  function adjustTime(offsetAmount)
    faketime = faketime + offsetAmount
  end
  function os.clock()
    local now = realclock()
    adjustTime((now - lasttime) * clockMultiplier)
    lasttime = now
    return faketime
  end
end
You can now call setClockMultiplier to freely slow down or speed up the passage of time as reported by os.clock. You can call adjustTime to advance or retard the clock by arbitrary amounts.

Why not write your own clock function that does what you want?
do
  local clock = os.clock
  local increment = 0
  os.clock = function(inc)
    increment = increment + (inc or 0)
    return clock()+increment
  end
end
print(os.clock())
os.clock(1)
print(os.clock())
This will print 0.001 1.001 or something similar.
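Both answers replace the clock the scheduler reads, so a test can move time forward instantly. As a rough illustration of the same idea outside Lua, here is a minimal Python sketch. FakeClock and DelayedWork are hypothetical names made up for this example, and the queue uses a heap rather than the sorted list from the question.

```python
import heapq
import itertools


class FakeClock:
    """Hypothetical test clock: time moves only when advance() is called."""

    def __init__(self, start=0.0):
        self.now = start

    def __call__(self):
        return self.now

    def advance(self, seconds):
        self.now += seconds


class DelayedWork:
    """Time-ordered work queue that reads time from an injected clock."""

    def __init__(self, clock):
        self._clock = clock
        self._heap = []
        self._order = itertools.count()  # tie-breaker for equal due times

    def handle_later(self, data, delay_seconds):
        due = self._clock() + delay_seconds
        heapq.heappush(self._heap, (due, next(self._order), data))

    def process(self):
        """Return the data of every item whose due time has been reached."""
        ready = []
        while self._heap and self._heap[0][0] <= self._clock():
            ready.append(heapq.heappop(self._heap)[2])
        return ready


clock = FakeClock()
work = DelayedWork(clock)
work.handle_later("b", 2.0)
work.handle_later("a", 0.5)

early = work.process()             # nothing is due yet
clock.advance(1.0)
after_one_second = work.process()  # "a" is due
clock.advance(1.0)
after_two_seconds = work.process() # "b" is due
print(early, after_one_second, after_two_seconds)
```

Passing the clock in as a parameter avoids patching a global function, at the cost of changing the library's interface. Whether that trade is worth it depends on how the library is used.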

Related

Equation in if branch is not executed

I have a question that has confused me for a long time. As I understand it, when an if condition is used in Modelica, the equation in the branch whose condition is true is the one that applies.
But when I test the following code, I am confused:
model Model134
  Real a(start = 0);
equation
  if not sample(0, 2) then
    a = 1;
  else
    a = 3;
  end if;
end Model134;
I expected a to change every 2 s (start time = 0), but when I simulate this model, it does not change and a is equal to 1 all the time.
Does anybody know the root cause?
a does change its value, but depending on your simulation tool you might not see it in the plot.
sample(0, 2) creates a time event every 2 seconds. The return value of sample() is only true during the event. So the value of a changes, but after the event it immediately changes back.
In this answer to a similar question, it is mentioned that Dymola stores the value before and after the event in the result file. Intermediate values are skipped for efficiency reasons (there can be many for every event, which would bloat your result file). Hence you cannot plot this change in Dymola. For OpenModelica see the answer by Akhil Nandan.
To prove that a really does change its value, you can use this code, for example:
model Model134
  import Modelica.Utilities.Streams.print;
  Real a;
equation
  if sample(0, 2) then
    a = 1;
  else
    a = 0;
  end if;
  when a > 0.5 then
    print("a is " + String(a) + " at t=" + String(time) + "s");
  end when;
  annotation (experiment(StopTime=10));
end Model134;
You should see something like this in the simulation log:
a is 1 at t=2s
a is 1 at t=4s
a is 1 at t=6s
a is 1 at t=8s
a is 1 at t=10s
This is the plot simulated when trying your above code in OpenModelica with settings shown in the second figure.
A time event is triggered whenever sample(startTime, interval) evaluates to true, which here is at every multiple of 2 seconds. Based on your code's logic, this activates the else block and assigns the value 3 to the variable a.

thinkscript if statement failure

The thinkscript if statement fails to branch as expected in some cases. The following test case can be used to reproduce this bug / defect.
It is shared via Grid containing chart and script
To cut a long story short, a possible workaround in some cases is to use the if-expression, which is a function and may be slower, potentially leading to script execution timeouts in scans.
This fairly nasty bug in thinkscript prevents me from writing some scans and studies the way I need to.
Following is some sample code that shows the problem on a chart.
input price = close;
input smoothPeriods = 20;
def output = Average(price, smoothPeriods);
# Get the current offset from the right edge from BarNumber()
# BarNumber(): The current bar number. On a chart, we can see that the number increases
# from left 1 to number of bars e.g. 140 at the right edge.
def barNumber = BarNumber();
def barCount = HighestAll(barNumber);
# rightOffset: 0 at the right edge, i.e. at the rightmost bar,
# increasing from right to left.
def rightOffset = barCount - barNumber;
# Prepare a lookup table:
def lookup;
if (barNumber == 1) {
    lookup = -1;
} else {
    lookup = 53;
}
# This script gets the minimum value from data in the offset range between startIndex
# and endIndex. It serves as a functional but not direct replacement for the
# GetMinValueOffset function where a dynamic range is required. Expect it to be slow.
script getMinValueBetween {
    input data = low;
    input startIndex = 0;
    input endIndex = 0;
    plot minValue = fold index = startIndex to endIndex with minRunning = Double.POSITIVE_INFINITY do Min(GetValue(data, index), minRunning);
}
# Call this only once at the last bar.
script buildValue {
    input lookup = close;
    input offsetLast = 0;
    # Do an indirect lookup
    def lookupPosn = 23;
    def indirectLookupPosn = GetValue(lookup, lookupPosn);
    # lowAtIndirectLookupPosn is assigned incorrectly. The if statement APPEARS to be executed
    # as if indirectLookupPosn was 0 but indirectLookupPosn is NOT 0 so the condition
    # for the first branch should be met!
    def lowAtIndirectLookupPosn;
    if (indirectLookupPosn > offsetLast) {
        lowAtIndirectLookupPosn = getMinValueBetween(low, offsetLast, indirectLookupPosn);
    } else {
        lowAtIndirectLookupPosn = close[offsetLast];
    }
    plot testResult = lowAtIndirectLookupPosn;
}
plot debugLower;
if (rightOffset == 0) {
    debugLower = buildValue(lookup);
} else {
    debugLower = 0;
}
declare lower;
To prepare the chart for the stock ADT, please set custom time frame:
10/09/18 to 10/09/19, aggregation period 1 day.
The aim of the script is to find the low value of 4.25 on 08/14/2019.
I DO know that there are various methods to do this in thinkscript such as GetMinValueOffset().
Let us please not discuss alternative methods of achieving the objective to find the low, alternatives for the attached script.
Because I am not asking for help achieving the objective. I am reporting a bug, and I want to know what goes wrong and perhaps how to fix it. In other words, finding the low here is just an example to make the script easier to follow. It could be anything else that one wants a script to compute.
Please let me describe the script.
First it does some smoothing with a moving average. The result is:
def output;
Then the script defines the distance from the right edge so we can work with offsets:
def rightOffset;
Then the script builds a lookup table:
def lookup;
script getMinValueBetween {} is a little function that finds the low between two offset positions, in a dynamic way. It is needed because GetMinValueOffset() does not accept dynamic parameters.
Then we have script buildValue {}
This is where the error occurs. This script is executed at the right edge.
buildValue {} does an indirect lookup as follows:
First it goes into lookup where it finds the value 53 at lookupPosn = 23.
With 53, it finds the low between offset 53 and 0, by calling the script function getMinValueBetween().
It stores the value in def lowAtIndirectLookupPosn;
As you can see, this is very simple indeed - only 38 lines of code!
The problem is that lowAtIndirectLookupPosn contains the wrong value, as if the wrong branch of the if statement was executed.
plot testResult should put out the low 4.25. Instead it puts out close[offsetLast] which is 6.26.
Quite honestly, this is a disaster because it is impossible to predict which if statements in your program will fail.
In a limited number of cases, the if-expression can be used instead of the if statement. However, the if-expression covers only a subset of use cases and it may execute with lower performance in scans. More importantly, it defeats the purpose of the if statement in an important case because it supports conditional assignment but not conditional execution. In other words, it executes both branches before assigning one of two values.
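The difference between conditional assignment and conditional execution can be illustrated with a small Python analogy. This is not thinkscript and says nothing about how thinkscript evaluates scripts internally. The names expensive and if_expression are hypothetical, and the sketch only shows what it means for a function-style if to evaluate both branches.

```python
calls = []


def expensive(label, value):
    """Stand-in for a costly computation; records that it ran."""
    calls.append(label)
    return value


def if_expression(condition, when_true, when_false):
    # Both arguments were already evaluated by the caller before this runs.
    return when_true if condition else when_false


condition = True

# Function-style if: both branches are computed, one result is kept.
eager_result = if_expression(condition, expensive("first", 1), expensive("second", 2))
eager_calls = list(calls)

# Statement-style if: only the selected branch is computed.
calls.clear()
if condition:
    lazy_result = expensive("first", 1)
else:
    lazy_result = expensive("second", 2)
lazy_calls = list(calls)

print(eager_calls, lazy_calls)
```

Both forms produce the same value here, but the function-style version pays for the branch it discards, which is the performance concern described above.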

how can i improve bulk calculation from file data

I have a file of binary values. The section I am looking at is 4 byte int with the values in the pattern of MW1, MVAR1, MW2, MVAR2,...
I read the values in with
temp = array.array("f")
temp.fromfile(file, length *2)
mw_mvar = temp.tolist()
I then calculate the magnitude like this.
from math import sqrt

mag = [0] * length
for x in range(0, length * 2, 2):
    a = mw_mvar[x]
    b = mw_mvar[x + 1]
    mag[x // 2] = sqrt(a*a + b*b)
The calculations (not the read) are doubling the total run time of my script. I know there is (theoretically) a way to do this faster because I am mimicking a script that ultimately calls Fortran (a pyd that calls functions in Fortran DLLs, I think), which is able to do this calculation with negligible effect on run time.
This is the best I can come up with. Any suggestions for improvements?
I have also tried math.pow(), **.5, **2 with no differences.
With no luck improving the calculations, I went around the problem. I realised that I only needed 1% of those calculated values, so I created a class to calculate them on demand. It was important (to me) that the resulting code behave as if it were a list of calculated values. A lot of the remainder of the process uses the values, and different versions of the data are pre-calculated. The class means I don't need a set of procedures for each version of the data.
class mag:
    def __init__(self, mw_mvar):
        self._mw_mvar = mw_mvar
        #_sgn = sgn
    def __len__(self):
        # one magnitude per (MW, MVAR) pair
        return len(self._mw_mvar) // 2
    def __getitem__(self, item):
        return sqrt(self._mw_mvar[2*item] ** 2 + self._mw_mvar[2*item+1] ** 2)
PS: this could also be done in a function that takes both versions, but I would have had to make more changes to the overall script.
def magnitude(a, b, x):
    if b[x] == 0:
        return a[x]
    else:
        return sqrt(a[x]**2 + b[x]**2)
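If all magnitudes are needed after all, one option that stays within the standard library is to skip tolist() and the index arithmetic and let map() walk the two interleaved columns. The following is a sketch written for Python 3, assuming the data really is 4-byte floats (as the "f" typecode implies). read_magnitudes is a hypothetical helper name, and whether this is faster on a given machine needs to be measured.

```python
import array
import io
import math


def read_magnitudes(stream, length):
    """Read `length` (MW, MVAR) pairs of 4-byte floats and return magnitudes."""
    values = array.array("f")
    values.frombytes(stream.read(length * 2 * values.itemsize))
    # values[0::2] holds the MW entries, values[1::2] the MVAR entries.
    return list(map(math.hypot, values[0::2], values[1::2]))


# Small in-memory stand-in for the binary file from the question.
sample = array.array("f", [3.0, 4.0, 6.0, 8.0, 0.0, 2.5])
magnitudes = read_magnitudes(io.BytesIO(sample.tobytes()), 3)
print(magnitudes)
```

The per-element work is done by math.hypot and the slicing, both implemented in C, so the Python-level loop disappears. A numeric array library would likely go further, but that is outside what the question uses.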

Share a big dictionary in Python multiprocess in Windows

I am writing a feature-collection program that extracts information from over 20000 files with Python 2.7. I store some pre-computed information in a dictionary. In order to make my program faster, I used multiprocessing, and the big dictionary must be used in these processes. (Actually, I just want each process to handle a part of the files; every process runs the same function.) This dictionary will not be changed in these processes; it is just a parameter for the function.
I found that each process creates its own address space, each with a copy of this big dictionary. My computer does not have enough memory to store many copies of this dictionary. I wonder if there is a way to create a static dict object that can be used by every process? Below is the multiprocessing part of my code. pmi_dic is the big dictionary (maybe several GB); it will not be changed in the function get_features.
processNum = 2
pool = mp.Pool(processes = processNum)
fileNum = len(filelist)
offset = fileNum / processNum
for i in range(processNum):
    if (i == processNum - 1):
        start = i * offset
        end = fileNum
    else:
        start = i * offset
        end = start + offset
    print str(start) + ' ' + str(end)
    pool.apply_async(get_features, args = (df_dic, pmi_dic, start, end, filelist, wordNum))
pool.close()
pool.join()
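One way to avoid holding a copy of the dictionary in every process is to keep it on disk and let each worker open it read-only. The sketch below is written for Python 3 and uses SQLite from the standard library. It assumes, hypothetically, that pmi_dic maps strings to floats; build_store, init_worker, lookup and run_pool are names invented for this example. Each lookup becomes a disk-backed query, so it trades speed for memory.

```python
import multiprocessing as mp
import os
import sqlite3
import tempfile

_conn = None  # one connection per worker process, set by init_worker


def build_store(path, mapping):
    """Write the dictionary to disk once, in the parent process."""
    conn = sqlite3.connect(path)
    with conn:  # commits on success
        conn.execute(
            "CREATE TABLE IF NOT EXISTS pmi (word TEXT PRIMARY KEY, score REAL)"
        )
        conn.executemany("INSERT OR REPLACE INTO pmi VALUES (?, ?)", mapping.items())
    conn.close()


def init_worker(path):
    """Pool initializer: each worker opens its own connection to the file."""
    global _conn
    _conn = sqlite3.connect(path)


def lookup(word, default=0.0):
    """Fetch one value without loading the whole table into memory."""
    row = _conn.execute("SELECT score FROM pmi WHERE word = ?", (word,)).fetchone()
    return row[0] if row else default


def run_pool(path, words, processes=2):
    # On Windows, call this from under `if __name__ == "__main__":`.
    with mp.Pool(processes, initializer=init_worker, initargs=(path,)) as pool:
        return pool.map(lookup, words)


# Single-process demonstration of the store itself.
store_path = os.path.join(tempfile.mkdtemp(), "pmi.sqlite")
build_store(store_path, {"alpha": 1.5, "beta": -0.25})
init_worker(store_path)
found = lookup("alpha")
missing = lookup("gamma")
print(found, missing)
```

Only the file path is passed to the workers, so nothing large is pickled per task. If the values are more complex than a single number, the table layout would need to change accordingly.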

Theano scan function

Example taken from: http://deeplearning.net/software/theano/library/scan.html
k = T.iscalar("k")
A = T.vector("A")
# Symbolic description of the result
result, updates = theano.scan(fn=lambda prior_result, A: prior_result * A,
                              outputs_info=T.ones_like(A),
                              non_sequences=A,
                              n_steps=k)
# We only care about A**k, but scan has provided us with A**1 through A**k.
# Discard the values that we don't care about. Scan is smart enough to
# notice this and not waste memory saving them.
final_result = result[-1]
# compiled function that returns A**k
power = theano.function(inputs=[A,k], outputs=final_result, updates=updates)
print power(range(10),2)
print power(range(10),4)
What is prior_result? More accurately, where is prior_result defined?
I have this same question for a lot of the examples given on: http://deeplearning.net/software/theano/library/scan.html
For example,
components, updates = theano.scan(fn=lambda coefficient, power, free_variable: coefficient * (free_variable ** power),
                                  outputs_info=None,
                                  sequences=[coefficients, theano.tensor.arange(max_coefficients_supported)],
                                  non_sequences=x)
Where are power and free_variable defined?
This uses a Python feature called "lambda". Lambdas are unnamed Python functions consisting of a single expression. They have this form:
lambda [param...]: code
In your example it is:
lambda prior_result, A: prior_result * A
This is a function that takes prior_result and A as input. This function is passed to the scan() function as the fn parameter. scan() will call it with 2 variables. The first one corresponds to what was provided in the outputs_info parameter. The other is what was provided in the non_sequences parameter.
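To make the calling convention concrete, here is a plain-Python emulation (written for Python 3) of how the lambda's parameters get their values. scan_like is a hypothetical stand-in written for this explanation, not Theano's implementation. It works on lists instead of symbolic tensors and follows the argument order used by scan: the current element of each sequence first, then the previous output, then the non-sequences.

```python
def scan_like(fn, sequences=(), outputs_info=None, non_sequences=(), n_steps=None):
    """Call fn once per step and collect the results, loosely mimicking scan."""
    if sequences:
        steps = min(len(seq) for seq in sequences)
        if n_steps is not None:
            steps = min(steps, n_steps)
    else:
        steps = n_steps

    prior = outputs_info  # initial value of the "previous output"
    results = []
    for i in range(steps):
        args = [seq[i] for seq in sequences]  # one element from each sequence
        if outputs_info is not None:
            args.append(prior)                # becomes prior_result in fn
        args.extend(non_sequences)            # passed unchanged every step
        prior = fn(*args)
        results.append(prior)
    return results


# First example: prior_result comes from outputs_info, A from non_sequences.
A = [0, 1, 2]
powers = scan_like(
    fn=lambda prior_result, A: [p * a for p, a in zip(prior_result, A)],
    outputs_info=[1, 1, 1],
    non_sequences=(A,),
    n_steps=2,
)
final_result = powers[-1]  # elementwise A**2

# Second example: coefficient and power come from the two sequences,
# free_variable from non_sequences.
components = scan_like(
    fn=lambda coefficient, power, free_variable: coefficient * free_variable ** power,
    sequences=([1, 0, 2], [0, 1, 2]),
    non_sequences=(3,),
)
print(final_result, sum(components))
```

So none of the lambda's parameter names are defined anywhere else: they are ordinary function parameters, and scan decides what to pass for each position on every step.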