Output issue calculating moving average - list

I wrote a function that calculates the moving average of a stock given a list of dates and prices, but the output is incorrect. I just need a second set of eyes on the code. Here is my code:
def calculate(self, stock_date_price_list, min_days=2):
    '''Calculates the moving average and generates a signal strategy for buy or sell
    strategy given a list of stock date and price. '''
    stock_averages = []
    stock_signals = []
    price_list = [float(n) for n in stock_date_price_list[1::2]]
    days_window = collections.deque(maxlen=min_days)
    rounding_point = 0.01
    for price in price_list:
        days_window.append(price)
        stock_averages.append(0)
        stock_signals.append("")
        if len(days_window) == min_days:
            moving_avg = sum(days_window) / min_days
            stock_averages[-1] = moving_avg
            if price < moving_avg:
                stock_signals[-1] = "SELL"
            elif price > moving_avg:
                if price_list[-2] < stock_averages[-2]:
                    stock_signals[-1] = "BUY"
    stock_averages[:] = ("%.2f" % avg if abs(avg)>=rounding_point else ' ' for avg in stock_averages)
    return stock_averages, stock_signals
The input is a list of stock price and dates in the following format:
[2012-10-10,52.30,2012-10-09,51.60]
The output I get is:
2012-10-01 659.39
2012-10-02 661.31
2012-10-03 671.45
2012-10-04 666.80
2012-10-05 652.59
2012-10-08 638.17
2012-10-09 635.85
2012-10-10 640.91
2012-10-11 628.10
2012-10-12 629.71 648.43 SELL
2012-10-15 634.76 645.97 SELL
2012-10-16 649.79 644.81 BUY
2012-10-17 644.61 642.13 BUY
2012-10-18 632.64 638.71 SELL
2012-10-19 609.84 634.44 SELL
2012-10-22 634.03 634.02 BUY
2012-10-23 613.36 631.77 SELL
2012-10-24 616.83 629.37 SELL
Whereas it should be:
2012-10-01 659.39
2012-10-02 661.31
2012-10-03 671.45
2012-10-04 666.80
2012-10-05 652.59
2012-10-08 638.17
2012-10-09 635.85
2012-10-10 640.91
2012-10-11 628.10
2012-10-12 629.71 648.43
2012-10-15 634.76 645.97
2012-10-16 649.79 644.81 BUY
2012-10-17 644.61 642.13
2012-10-18 632.64 638.71 SELL
2012-10-19 609.84 634.44
2012-10-22 634.03 634.02 BUY
2012-10-23 613.36 631.77 SELL
2012-10-24 616.83 629.37
Parameters for buying/selling:
If the closing price on a particular day has crossed above the simple moving average (i.e., the closing price on that day is above that day's simple moving average, while the previous closing price is not above the previous simple moving average), generate a buy signal.
If the closing price on a particular day has crossed below the simple moving average, generate a sell signal.
Otherwise, generate no signal.

As you state yourself, the condition for buying is not just price > moving_avg but also previous_price < previous_moving_avg.
You do address this with
price_list[-2] < stock_averages[-2]
except that price_list is the entire list of prices, so price_list[-2] is always the second-to-last item of the whole list, not the price that precedes the current one in the loop. (stock_averages[-2], by contrast, does follow the loop, because that list grows by one item per iteration.)
Similarly, the signal to sell requires not only price < moving_avg but also previous_price > previous_moving_avg.
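To make the indexing problem concrete, here is a minimal sketch (Python 3) of the deque-based loop from the question with the previous price and previous average carried along explicitly. The names crossing_signals, prev_price and prev_avg are my own illustration, not part of the question's code, and the sketch uses the strict < and > comparisons described above.

```python
import collections


def crossing_signals(prices, window):
    """Return one signal ('BUY', 'SELL' or '') per price."""
    days = collections.deque(maxlen=window)
    prev_price = prev_avg = None  # values from the previous loop iteration
    signals = []
    for price in prices:
        days.append(price)
        # The average only exists once the window is full.
        avg = sum(days) / window if len(days) == window else None
        signal = ""
        if avg is not None and prev_avg is not None:
            if price > avg and prev_price < prev_avg:
                signal = "BUY"   # crossed above the moving average
            elif price < avg and prev_price > prev_avg:
                signal = "SELL"  # crossed below the moving average
        signals.append(signal)
        # Remember this iteration's values for the next one.
        prev_price, prev_avg = price, avg
    return signals
```

Called with the prices from the question and window=10, this should give BUY on 2012-10-16 and 2012-10-22 and SELL on 2012-10-18 and 2012-10-23, with no signal on the other days.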
There are other (mainly stylistic) problems with calculate.
stock_date_price_list is a required input, but you only use the slice stock_date_price_list[1::2]. If that's the case, you should require the slice as the input, not stock_date_price_list.
price_list is essentially this slice, except that you call float on each item. That implies the data has not been parsed properly. Don't make calculate both a data parser and a data analyzer. It's much better to write simple functions which each accomplish one and only one task.
Similarly, calculate should not be in the business of formatting the result:
stock_averages[:] = ("%.2f" % avg if abs(avg)>=rounding_point else ' ' for avg in stock_averages)
Here is how you could fix the code using pandas:
import pandas as pd
data = [('2012-10-01', 659.38999999999999),
('2012-10-02', 661.30999999999995),
('2012-10-03', 671.45000000000005),
('2012-10-04', 666.79999999999995),
('2012-10-05', 652.59000000000003),
('2012-10-08', 638.16999999999996),
('2012-10-09', 635.85000000000002),
('2012-10-10', 640.90999999999997),
('2012-10-11', 628.10000000000002),
('2012-10-12', 629.71000000000004),
('2012-10-15', 634.75999999999999),
('2012-10-16', 649.78999999999996),
('2012-10-17', 644.61000000000001),
('2012-10-18', 632.63999999999999),
('2012-10-19', 609.84000000000003),
('2012-10-22', 634.02999999999997),
('2012-10-23', 613.36000000000001),
('2012-10-24', 616.83000000000004)]
df = pd.DataFrame(data, columns=['date','price'])
df['average'] = df['price'].rolling(10).mean()  # was pd.rolling_mean(df['price'], 10) in old pandas versions
df['prev_price'] = df['price'].shift(1)
df['prev_average'] = df['average'].shift(1)
df['signal'] = ''
buys = (df['price']>df['average']) & (df['prev_price']<df['prev_average'])
sells = (df['price']<df['average']) & (df['prev_price']>df['prev_average'])
df.loc[buys, 'signal'] = 'BUY'
df.loc[sells, 'signal'] = 'SELL'
print(df)
yields
date price average prev_price prev_average signal
0 2012-10-01 659.39 NaN NaN NaN
1 2012-10-02 661.31 NaN 659.39 NaN
2 2012-10-03 671.45 NaN 661.31 NaN
3 2012-10-04 666.80 NaN 671.45 NaN
4 2012-10-05 652.59 NaN 666.80 NaN
5 2012-10-08 638.17 NaN 652.59 NaN
6 2012-10-09 635.85 NaN 638.17 NaN
7 2012-10-10 640.91 NaN 635.85 NaN
8 2012-10-11 628.10 NaN 640.91 NaN
9 2012-10-12 629.71 648.428 628.10 NaN
10 2012-10-15 634.76 645.965 629.71 648.428
11 2012-10-16 649.79 644.813 634.76 645.965 BUY
12 2012-10-17 644.61 642.129 649.79 644.813
13 2012-10-18 632.64 638.713 644.61 642.129 SELL
14 2012-10-19 609.84 634.438 632.64 638.713
15 2012-10-22 634.03 634.024 609.84 634.438 BUY
16 2012-10-23 613.36 631.775 634.03 634.024 SELL
17 2012-10-24 616.83 629.367 613.36 631.775
[18 rows x 6 columns]
Without pandas, you could do something like this:
nan = float('nan')

def calculate(prices, size=2):
    '''Calculates the moving average and generates a signal strategy for buy or sell
    strategy given a list of prices. '''
    # Pad with nan so averages lines up with prices
    averages = [nan]*(size-1) + moving_average(prices, size)
    # Shift both lists by one position to get the previous day's values
    previous_prices = ([nan] + prices)[:-1]
    previous_averages = ([nan] + averages)[:-1]
    signal = []
    for price, ave, prev_price, prev_ave in zip(
            prices, averages, previous_prices, previous_averages):
        # Comparisons with nan are always False, so no signal is emitted
        # until both the current and the previous average exist
        if price > ave and prev_price < prev_ave:
            signal.append('BUY')
        elif price < ave and prev_price > prev_ave:
            signal.append('SELL')
        else:
            signal.append('')
    return averages, signal

def window(seq, n=2):
    """
    Returns a sliding window (of width n) over data from the sequence
    s -> (s0,s1,...s[n-1]), (s1,s2,...,sn), ...
    """
    for i in range(len(seq) - n + 1):
        yield tuple(seq[i:i + n])

def moving_average(data, size):
    return [(sum(grp)/len(grp)) for grp in window(data, n=size)]

def report(*args):
    for row in zip(*args):
        print(''.join(map('{:>10}'.format, row)))

dates = ['2012-10-01',
'2012-10-02',
'2012-10-03',
'2012-10-04',
'2012-10-05',
'2012-10-08',
'2012-10-09',
'2012-10-10',
'2012-10-11',
'2012-10-12',
'2012-10-15',
'2012-10-16',
'2012-10-17',
'2012-10-18',
'2012-10-19',
'2012-10-22',
'2012-10-23',
'2012-10-24']
prices = [659.38999999999999,
661.30999999999995,
671.45000000000005,
666.79999999999995,
652.59000000000003,
638.16999999999996,
635.85000000000002,
640.90999999999997,
628.10000000000002,
629.71000000000004,
634.75999999999999,
649.78999999999996,
644.61000000000001,
632.63999999999999,
609.84000000000003,
634.02999999999997,
613.36000000000001,
616.83000000000004]
averages, signals = calculate(prices, size=10)
report(dates, prices, averages, signals)
which yields
2012-10-01 659.39 nan
2012-10-02 661.31 nan
2012-10-03 671.45 nan
2012-10-04 666.8 nan
2012-10-05 652.59 nan
2012-10-08 638.17 nan
2012-10-09 635.85 nan
2012-10-10 640.91 nan
2012-10-11 628.1 nan
2012-10-12 629.71 648.428
2012-10-15 634.76 645.965
2012-10-16 649.79 644.813 BUY
2012-10-17 644.61 642.129
2012-10-18 632.64 638.713 SELL
2012-10-19 609.84 634.438
2012-10-22 634.03 634.024 BUY
2012-10-23 613.36 631.775 SELL
2012-10-24 616.83 629.367

How to know which quarter the current month belongs to (in Python)?

I want to know which quarter (Q1, Q2, Q3, Q4) the current month belongs to in Python. I'm fetching the current date with the time module as follows:
import time
print "Current date " + time.strftime("%x")
Any idea how to do it?
Modifying your code, I get this:
import time
month = int(time.strftime("%m")) - 1  # minus one, so month starts at 0 (0 to 11)
quarter = month // 3 + 1  # integer division, then add one, so quarter starts at 1 (1 to 4)
quarter_str = "Q" + str(quarter)  # convert to the "Qx" format string
print(quarter_str)
Or you could use the bisect module:
import time
import bisect
quarters = range(1, 12, 3)  # first month of each quarter: 1, 4, 7, 10
month = int(time.strftime("%m"))
quarter = bisect.bisect(quarters, month)
quarter_str = "Q" + str(quarter)
print(quarter_str)
strftime does not know about quarters, but you can calculate them from the month:
Use time.localtime to retrieve the current time in the current timezone. This function returns a named tuple (struct_time) with year, month, day of month, hour, minute, second, weekday, day of year, and a daylight saving time flag. You will only need the month (tm_mon).
Use the month to calculate the quarter. If the first quarter starts with January and ends with March, the second quarter starts with April and ends with June, etc., then this is as easy as subtracting 1, dividing by 3 without remainder, and adding 1 (for months 1..3, (month - 1) // 3 == 0 and 0 + 1 == 1; for months 4..6, (month - 1) // 3 == 1 and 1 + 1 == 2; etc.). If your definition of what a quarter is differs (e.g. companies may choose different start dates for their financial quarters), you have to adjust the calculation accordingly.
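The two steps above can be sketched as follows (Python 3). The helper name quarter_of is hypothetical, and the sketch assumes calendar quarters that start in January.

```python
import time


def quarter_of(month):
    """Return the calendar quarter (1 to 4) for a month number (1 to 12)."""
    return (month - 1) // 3 + 1


# tm_mon is the month field of the struct_time returned by time.localtime()
current_month = time.localtime().tm_mon
print("Q%d" % quarter_of(current_month))
```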

Pandas groupby mean absolute deviation

I have a pandas dataframe like this:
Product Group  Product ID  Units Sold  Revenue  Rev/Unit
A              451         8           $16      $2
A              987         15          $40      $2.67
A              311         2           $5       $2.50
B              642         6           $18      $3.00
B              251         4           $28      $7.00
I want to transform it to look like this:
Product Group  Units Sold  Revenue  Rev/Unit  Mean Abs Deviation
A              25          $61      $2.44     $0.24
B              10          $46      $4.60     $2.00
The Mean Abs Deviation column is to be computed on the Rev/Unit column in the first table. The tricky thing is taking into account the respective weights behind the Rev/Unit calculation.
For example, taking a straight MAD of Product Group A's Rev/Unit would yield $0.26. However, after taking weight into consideration, the MAD would be $0.24.
I know how to use groupby to get the simple summation for units sold and revenue, but I'm a bit lost on how to do the more complicated calculations of the next 2 columns.
Also, while we're giving advice/help: is there any easier way to create/paste tables into SO posts?
UPDATE:
Would a solution like this work? I know it will for the summation fields, but not sure how to implement for the latter 2 fields.
grouped_df=df.groupby("Product Group")
grouped_df.agg({
    'Units Sold':'sum',
    'Revenue':'sum',
    'Rev/Unit':'Revenue'/'Units Sold',
    'MAD':some_function})
You need to clarify what the "weights" are. I assumed the weights are the number of units sold, but that gives a different result from yours:
pv = df.pivot_table(index='Product Group',  # this keyword was rows= before pandas 0.14
                    values=['Units Sold', 'Revenue'],
                    aggfunc='sum')
pv['Rev/Unit'] = pv.Revenue / pv['Units Sold']
this gives:
Revenue Units Sold Rev/Unit
Product Group
A 61 25 2.44
B 46 10 4.60
As for WMAD:
import numpy as np

def wmad(prod):
    idx = df['Product Group'] == prod   # rows belonging to this product group
    w = df['Units Sold'][idx]           # weights: units sold per product
    abs_dev = np.abs(df['Rev/Unit'][idx] - pv['Rev/Unit'][prod])
    return sum(abs_dev * w) / sum(w)

pv['Mean Abs Deviation'] = [wmad(idx) for idx in pv.index]
which, as I mentioned, gives a different result:
Revenue Units Sold Rev/Unit Mean Abs Deviation
Product Group
A 61 25 2.44 0.2836
B 46 10 4.60 1.9200
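The figures above can be checked by hand without pandas. This is only a sketch of the arithmetic, assuming (as the answer does) that the weights are the units sold and that Rev/Unit is taken as printed in the question's table (2.67 rather than 40/15). The names rows and weighted_mad are mine.

```python
# (product group, units sold, revenue, rev/unit as shown in the question)
rows = [
    ("A", 8, 16.0, 2.00),
    ("A", 15, 40.0, 2.67),
    ("A", 2, 5.0, 2.50),
    ("B", 6, 18.0, 3.00),
    ("B", 4, 28.0, 7.00),
]


def weighted_mad(group):
    """Units-weighted mean absolute deviation of Rev/Unit for one group."""
    members = [r for r in rows if r[0] == group]
    units = sum(u for _, u, _, _ in members)
    revenue = sum(rev for _, _, rev, _ in members)
    group_rpu = revenue / units  # group-level Rev/Unit, e.g. 61 / 25 = 2.44
    weighted_dev = sum(abs(rpu - group_rpu) * u for _, u, _, rpu in members)
    return weighted_dev / units


for group in ("A", "B"):
    print(group, round(weighted_mad(group), 4))
```

With the unrounded 40/15 for product 987, group A comes out at about 0.2816 instead of 0.2836, so neither variant reproduces the $0.24 in the question.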
From your suggested solution, you can pass a function (for example a lambda) for a column in the agg dictionary, e.g.:
'Rev/Unit': lambda x: calculate_revenue_per_unit(x)
Bear in mind that x here is the Series of that one column's values for each group, not a whole row, so calculate_revenue_per_unit cannot see the Revenue or Units Sold columns from inside agg.

if statement on discount

# the quantity amount
A_quantity = 10
B_quantity = 20
N = float (input('please enter the quantity of package: '))
# the fee from input
X_total = float (N*99.00)
# discounts from input total
Q_discount = (0.2*X_total)
W_discount = (X_total*0.3)
# the fee with the discount
Y_total = (X_total-Q_discount)
M_total = (X_total-W_discount)
def main ():
    if N >= A_quantity:
        print ('the total cost is $', \
               format (Y_total, ',.2f'))
    else:
        if N >= B_quantity:
            print ('the total cost is $', \
                   format (M_total, ',.2f'))
main ()
The results should be 10 packages for $792.00
and 20 packages for $1,386.00 (20 x $99.00 less 30%),
yet the second case also gets the 20% discount, which totals $1549.00, when it should get only the 30% discount.
I don't know which language this is, but it's an algorithm problem: you should test for the highest threshold first. The way it is written now, if N = 30 you will always enter the "if" and never the "else"; and if N = 5 you will enter the "else" but not the "if" inside it, so the 30% branch can never run.
Let me try, although I don't know the language:
def main ():
    if N >= B_quantity:
        print ('the total cost is $', \
               format (M_total, ',.2f'))
    else:
        if N >= A_quantity:
            print ('the total cost is $', \
                   format (Y_total, ',.2f'))
main ()
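The same idea, testing the larger threshold first, can also be written with if/elif inside a small function. This is only a sketch: the name total_cost is mine, and it assumes quantities below 10 get no discount, which the question does not state.

```python
PRICE_PER_PACKAGE = 99.00  # price used in the question


def total_cost(quantity):
    """Return the discounted total; the highest threshold is tested first."""
    subtotal = quantity * PRICE_PER_PACKAGE
    if quantity >= 20:
        discount = 0.30
    elif quantity >= 10:
        discount = 0.20
    else:
        discount = 0.0  # assumption: no discount below 10 packages
    return subtotal * (1 - discount)


for n in (5, 10, 20):
    print('the total cost is $' + format(total_cost(n), ',.2f'))
```

With these numbers, 10 packages come to 792.00 and 20 packages come to 1,386.00.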
Divide the price of the product by 100 and multiply by the discount percentage,
then subtract that result from the price of the product:
var SomaPercent = ValorUnit/100 * descont;
var result_fim = ValorUnit-SomaPercent;
You can change the if conditions to
if N >= A_quantity and N < B_quantity ...
if N >= B_quantity
..

decision tree Python with exceptions 2.7

I'm writing a script which calculates the date of Easter for years 1900 - 2099.
The thing is that for 4 certain years (1954, 1981, 2049, and 2076) the formula differs a little bit (namely, the date is off by 7 days).
def main():
    print "Computes the date of Easter for years 1900-2099.\n"
    year = input("The year: ")
    if year >= 1900 and year <= 2099:
        if year != 2049 != 2076 !=1981 != 1954:
            a = year%19
            b = year%4
            c = year%7
            d = (19*a+24)%30
            e = (2*b+4*c+6*d+5)%7
            date = 22 + d + e # March 22 is the starting date
            if date <= 31:
                print "The date of Easter is March", date
            else:
                print "The date of Easter is April", date - 31
        else:
            if date <= 31:
                print "The date of Easter is March", date - 7
            else:
                print "The date of Easter is April", date - 31 - 7
    else:
        print "The year is out of range."
main()
Everything is working well except the computation for the 4 years.
I'm getting:
if date <= 31:
UnboundLocalError: local variable 'date' referenced before assignment
whenever I enter any of the 4 years as input.
Chaining the comparison like that does not test what you want; chain the tests using and operators or use a not in expression instead:
# and operators
if year != 2049 and year != 2076 and year != 1981 and year != 1954:
# not in expression
if year not in (2049, 2076, 1981, 1954):
The expression year != 2049 != 2076 != 1981 != 1954 means something different: Python treats it as a chained comparison, year != 2049 and 2049 != 2076 and 2076 != 1981 and 1981 != 1954. Only the first test involves year; the remaining tests compare constants that are never equal and are therefore always True, so the whole expression is False only for 2049, and the other three years are not excluded at all.
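A quick way to see the difference is to evaluate both forms side by side. This is a small sketch in Python 3 syntax; the function names chained and not_special are made up for the illustration.

```python
def chained(year):
    # Same as: year != 2049 and 2049 != 2076 and 2076 != 1981 and 1981 != 1954
    return year != 2049 != 2076 != 1981 != 1954


def not_special(year):
    # The test the question actually wants
    return year not in (2049, 2076, 1981, 1954)


for year in (2000, 2049, 2076):
    print(year, chained(year), not_special(year))
# prints:
# 2000 True True
# 2049 False False
# 2076 True False
```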
You will still get the UnboundLocalError for date though, since your else branch refers to date but it is never set in that branch. When the else branch executes, all Python sees is:
def main():
    print "Computes the date of Easter for years 1900-2099.\n"
    year = input("The year: ")
    if year >= 1900 and year <= 2099:
        if False:
            # skipped
        else:
            if date <= 31:
                print "The date of Easter is March", date - 7
            else:
                print "The date of Easter is April", date - 31 - 7
and date is never assigned a value in that case. You need to calculate date separately in that branch still, or move the calculation of the date value out of the if statement altogether; I am not familiar with the calculation of Easter so I don't know what you need to do in this case.
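One way to move the calculation out of the if statement is sketched below (Python 3). It assumes, as the question states, that the result for the four exceptional years is simply 7 days earlier than the formula gives; the function name easter and the constant SPECIAL_YEARS are mine.

```python
SPECIAL_YEARS = (1954, 1981, 2049, 2076)


def easter(year):
    """Return (month name, day) of Easter for 1900 to 2099, using the question's formula."""
    if not 1900 <= year <= 2099:
        raise ValueError("The year is out of range.")
    a = year % 19
    b = year % 4
    c = year % 7
    d = (19 * a + 24) % 30
    e = (2 * b + 4 * c + 6 * d + 5) % 7
    day = 22 + d + e  # days counted from March; March 22 is the starting date
    if year in SPECIAL_YEARS:
        day -= 7  # assumption from the question: these years are 7 days earlier
    if day <= 31:
        return "March", day
    return "April", day - 31


print("The date of Easter is %s %d" % easter(2049))
```

Because date is always computed before either branch uses it, the UnboundLocalError cannot occur.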

How do I take already calculated totals that are in a loop and add them together?

I created this program in Python 2.7.3
I did this in my Computer Science class. It was assigned in two parts. For the first part we had to create a program to calculate a monthly cell phone bill for five customers. The user inputs the number of texts, minutes, and data used. Additionally, there are overage fees: $10 for every GB of data over the limit, $0.40 per minute over the limit, and $0.20 per text sent over the limit. The plan's limits are 500 text messages, 750 minutes, and 2 GB of data.
For part 2 of the assignment, I have to calculate the total tax collected, total charges (each customer bill added together), total government fees collected, total customers who had overages, etc.
Right now all I want help with is adding the customer bills together. As I said earlier, when you run the program it prints the total bill for each of the 5 customers. I don't know how to assign those separate totals to a variable, add them together, and then eventually print them as one big total.
TotalBill = 0
monthly_charge = 69.99
data_plan = 30
minute = 0
tax = 1.08
govfees = 9.12
Finaltext = 0
Finalminute = 0
Finaldata = 0
Finaltax = 0
TotalCust_ovrtext = 0
TotalCust_ovrminute = 0
TotalCust_ovrdata = 0
TotalCharges = 0
for i in range (1,6):
    print "Calculate your cell phone bill for this month"
    text = input ("Enter texts sent/received ")
    minute = input ("Enter minute's used ")
    data = input ("Enter Data used ")
    if data > 2:
        data = (data-2)*10
        TotalCust_ovrdata = TotalCust_ovrdata + 1
    elif data <=2:
        data = 0
    if minute > 750:
        minute = (minute-750)*.4
        TotalCust_ovrminute = TotalCust_ovrminute + 1
    elif minute <=750:
        minute = 0
    if text > 500:
        text = (text-500)*.2
        TotalCust_ovrtext = TotalCust_ovrtext + 1
    elif text <=500:
        text = 0
    TotalBill = ((monthly_charge + data_plan + text + minute + data) * (tax)) + govfees
    print ("Your Total Bill is... " + str(round(TotalBill,2)))
print "The total number of Customer's who went over their minute's usage limit is... " ,TotalCust_ovrminute
print "The total number of Customer's who went over their texting limit is... " ,TotalCust_ovrtext
print "The total number of Customer's who went over their data limit is... " ,TotalCust_ovrdata
Some of the variables created are not used in the program. Please overlook them.
As Preet suggested, create another variable like TotalBill, i.e.
AccumulatedBill = 0
Then, at the end of the loop body, put
AccumulatedBill += TotalBill
This will add each TotalBill to AccumulatedBill. Then simply print out the result after the loop.
print "Total for all customers is: %s" %(AccumulatedBill)
Note: variable names don't normally start with an uppercase letter. PEP 8 recommends lowercase underscore_separated names (for example accumulated_bill); camelCase is also used in some codebases.
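Put together, the accumulation pattern looks roughly like this. It is a sketch in Python 3 that uses a hard-coded usage list instead of input() so it can be run as is; the function customer_bill and the sample usage figures are hypothetical, while the rates and limits are the ones from the question.

```python
MONTHLY_CHARGE = 69.99
DATA_PLAN = 30
TAX = 1.08
GOV_FEES = 9.12


def customer_bill(texts, minutes, data_gb):
    """Return one customer's bill, including overage fees, tax and fees."""
    text_fee = max(texts - 500, 0) * 0.2      # $0.20 per text over 500
    minute_fee = max(minutes - 750, 0) * 0.4  # $0.40 per minute over 750
    data_fee = max(data_gb - 2, 0) * 10       # $10 per GB over 2 GB
    subtotal = MONTHLY_CHARGE + DATA_PLAN + text_fee + minute_fee + data_fee
    return subtotal * TAX + GOV_FEES


# Hypothetical usage for two customers: (texts, minutes, data in GB)
usage = [(500, 750, 2), (600, 800, 3)]

accumulated_bill = 0
for texts, minutes, data_gb in usage:
    total_bill = customer_bill(texts, minutes, data_gb)
    print("Your Total Bill is... " + str(round(total_bill, 2)))
    accumulated_bill += total_bill  # add this customer's bill to the running total

print("Total for all customers is: %s" % round(accumulated_bill, 2))
```

The running total has to be created before the loop and updated inside it; printing it after the loop gives the sum of all the individual bills.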