thinkscript if statement failure - thinkscript

The thinkscript if statement fails to branch as expected in some cases. The following test case reproduces the defect.
It is shared via a Grid containing the chart and the script.
In short, a possible workaround in some cases is to use the if-expression instead. It is a function, so it may be slower and can lead to a script execution timeout in scans.
This fairly nasty bug in thinkscript prevents me from writing some scans and studies the way I need to.
The following sample code shows the problem on a chart.
input price = close;
input smoothPeriods = 20;
def output = Average(price, smoothPeriods);
# Get the current offset from the right edge from BarNumber()
# BarNumber(): The current bar number. On a chart, we can see that the number increases
# from left 1 to number of bars e.g. 140 at the right edge.
def barNumber = BarNumber();
def barCount = HighestAll(barNumber);
# rightOffset: 0 at the right edge, i.e. at the rightmost bar,
# increasing from right to left.
def rightOffset = barCount - barNumber;
# Prepare a lookup table:
def lookup;
if (barNumber == 1) {
    lookup = -1;
} else {
    lookup = 53;
}
# This script gets the minimum value from data in the offset range between startIndex
# and endIndex. It serves as a functional but not direct replacement for the
# GetMinValueOffset function where a dynamic range is required. Expect it to be slow.
script getMinValueBetween {
    input data = low;
    input startIndex = 0;
    input endIndex = 0;
    plot minValue = fold index = startIndex to endIndex with minRunning = Double.POSITIVE_INFINITY do Min(GetValue(data, index), minRunning);
}
# Call this only once at the last bar.
script buildValue {
    input lookup = close;
    input offsetLast = 0;
    # Do an indirect lookup
    def lookupPosn = 23;
    def indirectLookupPosn = GetValue(lookup, lookupPosn);
    # lowAtIndirectLookupPosn is assigned incorrectly. The if statement APPEARS to be executed
    # as if indirectLookupPosn was 0 but indirectLookupPosn is NOT 0 so the condition
    # for the first branch should be met!
    def lowAtIndirectLookupPosn;
    if (indirectLookupPosn > offsetLast) {
        lowAtIndirectLookupPosn = getMinValueBetween(low, offsetLast, indirectLookupPosn);
    } else {
        lowAtIndirectLookupPosn = close[offsetLast];
    }
    plot testResult = lowAtIndirectLookupPosn;
}
plot debugLower;
if (rightOffset == 0) {
    debugLower = buildValue(lookup);
} else {
    debugLower = 0;
}
declare lower;
To prepare the chart for the stock ADT, please set custom time frame:
10/09/18 to 10/09/19, aggregation period 1 day.
The aim of the script is to find the low value of 4.25 on 08/14/2019.
I DO know that there are various methods to do this in thinkscript, such as GetMinValueOffset().
Please, let us not discuss alternative ways of finding the low or alternatives to the attached script, because I am not asking for help achieving that objective. I am reporting a bug, and I want to know what goes wrong and perhaps how to fix it. In other words, finding the low here is just an example to make the script easier to follow. It could be anything else that one wants a script to compute.
Please let me describe the script.
First it does some smoothing with a moving average. The result is:
def output;
Then the script defines the distance from the right edge so we can work with offsets:
def rightOffset;
Then the script builds a lookup table:
def lookup;
script getMinValueBetween {} is a little function that finds the low between two offset positions, in a dynamic way. It is needed because GetMinValueOffset() does not accept dynamic parameters.
Then we have script buildValue {}
This is where the error occurs. This script is executed at the right edge.
buildValue {} does an indirect lookup as follows:
First it goes into lookup where it finds the value 53 at lookupPosn = 23.
With 53, it finds the low between offset 53 and 0 by calling the script function getMinValueBetween().
It stores the value in def lowAtIndirectLookupPosn;
As you can see, this is very simple indeed - only 38 lines of code!
The problem is that lowAtIndirectLookupPosn contains the wrong value, as if the wrong branch of the if statement had been executed.
plot testResult should output the low, 4.25. Instead it outputs close[offsetLast], which is 6.26.
Quite honestly, this is a disaster, because it is impossible to predict which if statements in a program will fail.
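For readers who do not use thinkscript, the intended logic can be modeled in Python. This is only a sketch under stated assumptions: the lists are made-up toy data indexed by offset from the right edge (only the values 4.25 and 6.26 come from the chart described above), the function names are Python renderings of the script names, and the fold is assumed to exclude its upper bound. It shows which value each branch is expected to yield, not why thinkscript picks the wrong one.

```python
from functools import reduce


def get_min_value_between(data_by_offset, start_index, end_index):
    # Running minimum over offsets start_index .. end_index - 1
    # (exclusive upper bound is an assumption about fold).
    return reduce(
        lambda running, index: min(data_by_offset[index], running),
        range(start_index, end_index),
        float("inf"),
    )


def build_value(lows, closes, indirect_lookup_posn, offset_last=0):
    # Mirrors the if statement in buildValue.
    if indirect_lookup_posn > offset_last:
        return get_min_value_between(lows, offset_last, indirect_lookup_posn)
    return closes[offset_last]


# Toy data, offset 0 = rightmost bar. Hypothetical except for 4.25 and 6.26.
lows = [6.10, 5.00, 4.25, 5.50]
closes = [6.26, 5.40, 4.60, 5.90]

expected = build_value(lows, closes, indirect_lookup_posn=3)  # first branch
observed = build_value(lows, closes, indirect_lookup_posn=0)  # else branch
```

With a non-zero lookup position the first branch should return the low; the value reported in the post matches what the else branch returns.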

In a limited number of cases, the if-expression can be used instead of the if statement. However, the if-expression covers only a subset of use cases, and it may execute with lower performance in scans. More importantly, it defeats the purpose of the if statement in an important case, because it supports conditional assignment but not conditional execution. In other words, it executes both branches before assigning one of the two values.
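The difference between conditional assignment and conditional execution can be sketched in Python. This is an analogy, not thinkscript: if_expression and expensive are hypothetical helpers, and the only point is that a function receives both branch values already computed, whereas an if statement runs a single branch.

```python
def run_both_styles(condition):
    calls = []

    def expensive(label, value):
        # Record that this branch was actually evaluated.
        calls.append(label)
        return value

    def if_expression(cond, then_value, else_value):
        # Both arguments were evaluated before this function was entered.
        return then_value if cond else else_value

    result_expr = if_expression(condition, expensive("then", 1), expensive("else", 2))
    calls_expr = list(calls)

    calls.clear()
    if condition:
        result_stmt = expensive("then", 1)
    else:
        result_stmt = expensive("else", 2)
    calls_stmt = list(calls)

    return result_expr, calls_expr, result_stmt, calls_stmt


result_expr, calls_expr, result_stmt, calls_stmt = run_both_styles(True)
```

Both styles produce the same value, but the function style pays for both branches, which is what makes it costly when one branch is an expensive fold.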

Related

Equation in if branch is not executed

I have a question that has confused me for a long time. As you know, when we use an if condition in Modelica, it means that if the expression is true, Modelica applies the corresponding equation.
But when I test the following code, I am confused:
model Model134
  Real a(start = 0);
equation
  if not sample(0, 2) then
    a = 1;
  else
    a = 3;
  end if;
end Model134;
I think a should change every 2 s (start time = 0), but when I simulate this model, it does not change, and a is equal to 1 all the time.
Does anybody know the root cause?
a does change its value, but depending on your simulation tool you might not see it in the plot.
sample(0, 2) creates a time event every 2 seconds. The return value of sample() is only true during the event. So the value of a changes, but after the event it immediately changes back.
In this answer to a similar question, it is mentioned that Dymola stores the values before and after the event in the result file. Intermediate values are skipped for efficiency reasons (there can be many for every event, which would bloat your result file). Hence you cannot plot this change in Dymola. For OpenModelica, see the answer by Akhil Nandan.
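The effect can be imitated in plain Python. This is a crude approximation, not how a Modelica tool evaluates events: the sample function below is a hypothetical stand-in that is true only at the exact instants 0, 2, 4, ..., and output_grid is an assumed set of plot points that never coincides with an event.

```python
def sample(t, start, interval, eps=1e-9):
    # True only at the event instants start, start + interval, ...
    if t < start - eps:
        return False
    k = round((t - start) / interval)
    return abs(t - (start + k * interval)) < eps


def a_of(t):
    # Same branch logic as the model in the question.
    return 1 if not sample(t, 0.0, 2.0) else 3


# Plot points between events: 0.5, 1.5, ..., 9.5
output_grid = [0.5 + i for i in range(10)]
plotted = [a_of(t) for t in output_grid]

# Values exactly at the event instants
event_times = [0.0, 2.0, 4.0, 6.0, 8.0, 10.0]
at_events = [a_of(t) for t in event_times]
```

Every plotted value is 1, while the value at each event instant is 3, which is why the change is easy to miss in a plot.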
To prove that a really does change its value, you can use this code, for example:
model Model134
  import Modelica.Utilities.Streams.print;
  Real a;
equation
  if sample(0, 2) then
    a = 1;
  else
    a = 0;
  end if;
  when a > 0.5 then
    print("a is " + String(a) + " at t=" + String(time) + "s");
  end when;
  annotation (experiment(StopTime=10));
end Model134;
You should see something like this in the simulation log:
a is 1 at t=2s
a is 1 at t=4s
a is 1 at t=6s
a is 1 at t=8s
a is 1 at t=10s
This is the plot produced by simulating your code above in OpenModelica, with the settings shown in the second figure.
A time event is triggered whenever sample(startTime, interval) evaluates to true, here at every multiple of 2 seconds. With your code's logic, that activates the else block and assigns the value 3 to the variable a.

What's slowing down this piece of python code?

I have been trying to implement the Stupid Backoff language model (the description is available here, though I believe the details are not relevant to the question).
The thing is, the code's working and producing the result that is expected, but works slower than I expected. I figured out the part that was slowing down everything is here (and NOT in the training part):
def compute_score(self, sentence):
    length = len(sentence)
    assert length <= self.n
    if length == 1:
        word = tuple(sentence)
        return float(self.ngrams[length][word]) / self.total_words
    else:
        words = tuple(sentence[::-1])
        count = self.ngrams[length][words]
        if count == 0:
            return self.alpha * self.compute_score(sentence[1:])
        else:
            return float(count) / self.ngrams[length - 1][words[:-1]]

def score(self, sentence):
    """ Takes a list of strings as argument and returns the log-probability of the
    sentence using your language model. Use whatever data you computed in train() here.
    """
    output = 0.0
    length = len(sentence)
    for idx in range(length):
        if idx < self.n - 1:
            current_score = self.compute_score(sentence[:idx+1])
        else:
            current_score = self.compute_score(sentence[idx-self.n+1:idx+1])
        output += math.log(current_score)
    return output
self.ngrams is a nested dictionary that has n entries. Each of these entries is a dictionary of form (word_i, word_i-1, word_i-2.... word_i-n) : the count of this combination.
self.alpha is a constant that defines the penalty for going n-1.
self.n is the maximum length of the tuple that the program looks for in the dictionary self.ngrams. It is set to 3 (though setting it to 2 or even 1 doesn't change anything). It's weird, because the Unigram and Bigram models work just fine in fractions of a second.
The answer that I am looking for is not a refactored version of my own code, but rather a tip which part of it is the most computationally expensive (so that I could figure out myself how to rewrite it and get the most educational profit from solving this problem).
Please, be patient, I am but a beginner (two months into the world of programming). Thanks.
UPD:
I timed the running time with the same data using time.time():
Unigram = 1.9
Bigram = 3.2
Stupid Backoff (n=2) = 15.3
Stupid Backoff (n=3) = 21.6
(It's on some bigger data than originally because of time.time's bad precision.)
If the sentence is very long, most of the code that's actually running is here:
def score(self, sentence):
    for idx in range(len(sentence)): # should use xrange in Python 2!
        self.compute_score(sentence[idx-self.n+1:idx+1])

def compute_score(self, sentence):
    words = tuple(sentence[::-1])
    count = self.ngrams[len(sentence)][words]
    if count == 0:
        self.compute_score(sentence[1:])
    else:
        self.ngrams[len(sentence) - 1][words[:-1]]
That's not meant to be working code--it just removes the unimportant parts.
The flow in the critical path is therefore:
For each word in the sentence:
1. Call compute_score() on that word plus the preceding 2. This creates a new list of length 3. You could avoid that with itertools.islice().
2. Construct a 3-tuple with the words reversed. This creates a new tuple. You could avoid that by passing the -1 step argument when making the slice outside this function.
3. Look up in self.ngrams, a nested dict, with the first key being a number (this might be faster if this level were a list; there are only three keys anyway?), and the second being the tuple just created.
4. Either recurse with the first word removed, i.e. make a new tuple (sentence[2], sentence[1]), or do another lookup in self.ngrams, implicitly creating another new tuple (words[:-1]).
In summary, I think the biggest problem you have is the repeated and nested creation and destruction of lists and tuples.
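One way to cut down on the temporary objects, as a hedged sketch rather than a drop-in fix: reverse the sentence into a tuple once, then take each lookup key as a single slice of that tuple. The function names below are hypothetical, and the first function only restates the slicing pattern from the question for comparison. Slicing a tuple still allocates one new tuple per key, so this reduces the allocations per word rather than removing them.

```python
def ngram_keys_naive(sentence, n):
    # Per word: slice a new list, reverse it into another list, build a tuple.
    keys = []
    for idx in range(len(sentence)):
        start = max(0, idx - n + 1)
        window = sentence[start:idx + 1]
        keys.append(tuple(window[::-1]))
    return keys


def ngram_keys_presliced(sentence, n):
    # Reverse once; each key is then a single slice of the reversed tuple.
    reversed_words = tuple(reversed(sentence))
    length = len(reversed_words)
    keys = []
    for idx in range(length):
        low = length - 1 - idx          # where word idx sits in reversed_words
        high = min(length, low + n)     # up to n words, newest first
        keys.append(reversed_words[low:high])
    return keys


words = ["the", "cat", "sat", "down"]
naive_keys = ngram_keys_naive(words, 3)
presliced_keys = ngram_keys_presliced(words, 3)
```

Both functions produce the same keys, so the dictionary lookups are unchanged. Whether the difference is measurable on your data is something to confirm with a profiler.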

how can i improve bulk calculation from file data

I have a file of binary values. The section I am looking at consists of 4-byte values (read below as floats) in the pattern MW1, MVAR1, MW2, MVAR2, ...
I read the values in with
import array
temp = array.array("f")
temp.fromfile(file, length * 2)
mw_mvar = temp.tolist()
I then calculate the magnitude like this.
from math import sqrt
mag = [0] * length
for x in range(0, length * 2, 2):
    a = mw_mvar[x]
    b = mw_mvar[x + 1]
    mag[x // 2] = sqrt(a*a + b*b)  # integer division keeps the index an int on Python 3
The calculations (not the read) are doubling the total run time of my script. I know there is (theoretically) a faster way to do this, because I am mimicking a script that ultimately calls Fortran (a pyd that calls functions in Fortran DLLs, I think), which does this calculation with a negligible effect on run time.
This is the best I can come up with. Any suggestions for improvements?
I have also tried math.pow(), **.5, and **2, with no difference.
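One variation that stays within the standard library is to pair the values with strided slices and let math.hypot compute each magnitude. This is a sketch, not a measured improvement: the function name magnitudes and the sample values are made up for illustration, and whether it is faster than the index loop depends on the data size and Python version.

```python
import array
from math import hypot


def magnitudes(mw_mvar):
    # Even positions hold MW, odd positions hold MVAR.
    return [hypot(mw, mvar) for mw, mvar in zip(mw_mvar[0::2], mw_mvar[1::2])]


# Made-up sample in the MW1, MVAR1, MW2, MVAR2, ... layout
values = array.array("f", [3.0, 4.0, 5.0, 12.0, 0.0, 0.0])
mags = magnitudes(values)
```

The slices work directly on the array.array, so the tolist() conversion is not needed for this step.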
With no luck improving the calculations, I went around the problem. I realised that I only needed 1% of the calculated values, so I created a class to calculate them on demand. It was important (to me) that the resulting code behave as if it were a list of calculated values. Much of the rest of the process uses the values, and different versions of the data are pre-calculated. The class means I don't need a set of procedures for each version of the data.
class mag:
    def __init__(self, mw_mvar):
        self._mw_mvar = mw_mvar
        #_sgn = sgn
    def __len__(self):
        return len(self._mw_mvar) // 2  # one magnitude per (MW, MVAR) pair
    def __getitem__(self, item):
        return sqrt(self._mw_mvar[2*item] ** 2 + self._mw_mvar[2*item+1] ** 2)
PS: this could also be done in a function that takes both versions, but I would have had to make more changes to the overall script.
# the function name is a placeholder; the original left it unnamed
def magnitude(a, b, x):
    if b[x] == 0:
        return a[x]
    else:
        return sqrt(a[x]**2 + b[x]**2)

Python: How to create a function that uses its own output and uses an array of random generated numbers

Disclaimer: I am quite new to Python and programming as a whole.
I have been trying to create a function to generate random stock prices using the following:
New stock price = previous price + (previous price*(return + (volatility * random number)))
The return and volatility numbers are fixed. I have also generated N random numbers.
The problem is how to create a function whose output is fed back into it as the previous price.
Basically, I want an array of NEW stock prices generated from this formula, where the previous price variable is the function's own previous output.
I have been trying to do this for a couple of days, and I am sure I am not fully equipped to do it (given that I am a newbie), but ANY HELP would be more than appreciated!
Please, any help would be useful.
import math
import random
initial_price = 10
return_daily = 0.12 / 252
vol_daily = 0.30 / (math.sqrt(252))
random_numbers = []
for i in range(5):
    random_numbers.append(random.gauss(0, 1))
def stock_prices(random_numbers):
    prices = []
    for i in range(0, len(random_numbers)):
        calc = initial_price + (initial_price * (return_daily + (vol_daily * random_numbers[i])))
        prices.append(calc)
    return prices
You can't really use recursion here, because you don't have a break condition that ends the recursion. You could construct one by passing an additional counter parameter that specifies how many more levels to recurse, but that would not be optimal in my opinion.
Instead, I recommend using a for loop that repeats a fixed number of times that you specify. This way you can add one new price to a list per iteration and access the previous one to calculate it:
first_price = 100
list_length = 20
def price_formula(previous_price):
    return previous_price * 1.2 # you would replace this with your actual calculation
prices = [first_price] # create list with initial item
for i in range(list_length): # repeats exactly 'list_length' times, turn number is 'i'
    prices.append(price_formula(prices[-1])) # append new price to list
    # prices[-1] always returns the last element of the list, i.e. the previously added one.
print("\n".join(map(str, prices)))
My optimization of your code snippet:
import math
import random
initial_price = 10
return_daily = 0.12 / 252
vol_daily = 0.30 / (math.sqrt(252))
def stock_prices(number_of_prices):
    prices = [initial_price]
    for i in range(0, number_of_prices):
        prices.append(prices[-1] + (prices[-1] * (return_daily + (vol_daily * random.gauss(0, 1)))))
    return prices
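If the random numbers are generated up front, as in the question, the same recurrence can also be written with itertools.accumulate, which feeds each result back in as the next "previous price". This is a sketch assuming Python 3.8 or later (for the initial argument); the parameter list and the helper next_price are choices made here for illustration, and the zero shocks are made-up inputs that keep the example deterministic.

```python
import math
from itertools import accumulate


def stock_prices(initial_price, random_numbers, return_daily, vol_daily):
    def next_price(previous_price, shock):
        # Same formula as in the question, applied to the previous result.
        return previous_price + previous_price * (return_daily + vol_daily * shock)

    # accumulate() yields initial_price first, then one new price per shock.
    return list(accumulate(random_numbers, next_price, initial=initial_price))


return_daily = 0.12 / 252
vol_daily = 0.30 / math.sqrt(252)

# Zero shocks leave only the fixed daily return in play.
prices = stock_prices(10, [0.0, 0.0, 0.0], return_daily, vol_daily)
```

In real use you would pass a list such as [random.gauss(0, 1) for _ in range(n)] instead of the zeros.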
This is a classic Markov process: the present value depends on its previous value, and only on its previous value. A good tool for this case is an iterator. You can write an iterator that models the Markov process.
You can learn how iterators are built here: http://anandology.com/python-practice-book/iterators.html
Now that you have some understanding of how iterators work, you can create your own iterator for your problem. You need a class that implements the __iter__() method and the __next__() method (named next() in Python 2).
Something like this:
import random
from math import sqrt
class Abc:
    def __init__(self, initPrice):
        self.v = initPrice # This is the initial price
        self.dailyRet = 0.12/252
        self.dailyVol = 0.3/sqrt(252)
    def __iter__(self):
        return self
    def __next__(self):
        self.v += self.v * (self.dailyRet + self.dailyVol*random.gauss(0,1))
        return self.v
if __name__ == '__main__':
    initPrice = 10
    temp = Abc(initPrice)
    for i in range(10):
        print(next(temp))
This gives output like the following (the values are random, so yours will differ):
> python test.py
10.3035353791
10.3321905359
10.3963790497
10.5354048937
10.6345509793
10.2598381299
10.3336476153
10.6495914319
10.7915999185
10.6669136891
Note that this does not have the stop iteration command, so if you use this incorrectly, you may get into trouble. However, that is not difficult to implement and I hope you try to implement it ...
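One possible way to add the stopping condition, sketched with a generator function instead of a class: the generator ends after a fixed number of steps, and Python raises StopIteration for you. The name price_path and its parameters are hypothetical, and the volatility is set to 0.0 in the usage line only to make the result reproducible.

```python
import random


def price_path(initial_price, steps, daily_ret, daily_vol, rng=random):
    # Yields exactly `steps` prices, then stops on its own.
    price = initial_price
    for _ in range(steps):
        price += price * (daily_ret + daily_vol * rng.gauss(0, 1))
        yield price


# With zero volatility the path is deterministic.
path = list(price_path(10, 5, 0.12 / 252, 0.0))
```

Because the generator is finite, it is safe to pass to list() or to use in a plain for loop.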

Finding length of list without using the 'len' function in python

In my high school assignment, part of the task is to write a function that finds the average of a list of floating-point numbers. We can't use len and the like, so something like sum(numList)/float(len(numList)) isn't an option for me. I've spent an hour researching and racking my brain for a way to find the list length without using the len function, and I've got nothing, so I was hoping either to be shown how to do it or to be pointed in the right direction. Help me, Stack Overflow, you're my only hope. :)
Use a loop to add up the values from the list, and count them at the same time:
def average(numList):
    total = 0
    count = 0
    for num in numList:
        total += num
        count += 1
    return total / count
If you might be passed an empty list, you might want to check for that first and either return a predetermined value (e.g. 0), or raise a more helpful exception than the ZeroDivisionError you'll get if you don't do any checking.
If you're using Python 2 and the list might be all integers, you should either put from __future__ import division at the top of the file, or convert one of total or count to a float before doing the division (initializing one of them to 0.0 would also work).
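A sketch of the variant that guards against an empty list and avoids integer division. Raising ValueError is one choice among those mentioned above, and the message text is made up; starting total at 0.0 keeps the division a float division on Python 2 as well.

```python
def average(num_list):
    total = 0.0  # float start value, so the division is never integer division
    count = 0
    for num in num_list:
        total += num
        count += 1
    if count == 0:
        # More helpful than the ZeroDivisionError the plain version would raise.
        raise ValueError("average() requires at least one number")
    return total / count


mean = average([1, 2, 3, 4])
```

Counting inside the loop also means the function works for any iterable, not only for lists.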
I might as well show how to do it with a while loop, since it's another opportunity to learn.
Normally, you won't need counter variable(s) inside a for loop. However, there are certain cases where it's helpful to keep a count as well as retrieve the item from the list, and this is where enumerate() comes in handy.
Basically, the solution below is what @Blckknght's solution is doing internally.
def average(items):
    """
    Takes in a list of numbers and finds the average.
    """
    if not items:
        return 0
    # iter() creates an iterator.
    # an iterator supports next(),
    # which returns the next item
    # in the sequence of items.
    it = iter(items)
    count = 0
    total = 0
    while True:
        try:
            # when there are no more
            # items in the list
            total += next(it)
            # a stop iteration is raised
        except StopIteration:
            # this gives us an opportunity
            # to break out of the infinite loop
            break
        # since the StopIteration will be raised
        # before a value is returned, we don't want
        # to increment the counter until after
        # a valid value is retrieved
        count += 1
    # perform the normal average calculation
    return total / float(count)
def length_of_list(my_list):
    if not my_list:
        return 0
    return 1 + length_of_list(my_list[1:])