I need to create a Python 2 algorithm to increment and decrement shop prices in the most exact and economical way for the customer, accounting for prices at various quantities.
For example (replaced item names):
75 "stickers" cost $4
200 "stickers" cost $9
675 "stickers" cost $27
The user should be able to increment to the next cheapest combination that provides a greater quantity of items, skipping objectively bad combinations. So in this example the increments would go: 75, 150, 200, 225, 275, 300, 350, 400...
Here the quantity 375 (5 * $4 = $20) is skipped. 400 (2 * $9 = $18) is cheaper and provides more items.
Thank you to anyone that can provide some insights! I am using Python 2.7 and must rely on the Python standard library.
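A minimal sketch of one possible approach, assuming that "objectively bad" means some other combination gives more items for the same or less money, and that prices are positive integers. The name price_ladder and the limit parameter are made up for illustration. It computes the cheapest cost for every exact quantity, then keeps only quantities that are strictly cheaper than every larger quantity. It sticks to the standard library and should run on Python 2.7 as well as Python 3.

```python
def price_ladder(packs, limit):
    """List (quantity, cost) pairs up to `limit` that no other combination beats.

    `packs` is a list of (quantity, price) tuples with positive integer prices.
    """
    biggest = max(qty for qty, _ in packs)
    # Look one pack size past the limit so a quantity near the limit can still
    # be ruled out by a slightly larger, cheaper combination.
    horizon = limit + biggest
    inf = float("inf")
    # best[q] = lowest cost of buying exactly q items (inf if impossible).
    best = [0] + [inf] * horizon
    for total in range(1, horizon + 1):
        for qty, price in packs:
            if qty <= total and best[total - qty] + price < best[total]:
                best[total] = best[total - qty] + price
    # Walk from large to small, keeping a quantity only if it is strictly
    # cheaper than every larger quantity.
    ladder = []
    cheapest_above = inf
    for total in range(horizon, 0, -1):
        if best[total] < cheapest_above:
            cheapest_above = best[total]
            if total <= limit:
                ladder.append((total, best[total]))
    ladder.reverse()
    return ladder


steps = price_ladder([(75, 4), (200, 9), (675, 27)], 400)
print([qty for qty, _ in steps])
# [75, 150, 200, 225, 275, 300, 350, 400]
```

Incrementing or decrementing would then just mean moving to the next or previous entry in the returned list.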
I have a Django model stored in a Postgres DB comprised of values of counts at irregular intervals:
WidgetCount
- Time
- Count
I'm trying to use a window function with Lag to give me a previous row's values as an annotation. My problem is when I try to combine it with some distinct date truncation the window function uses the source rows rather than the distinctly grouped ones.
For example if I have the following rows:
time count
2020-01-20 05:00 15
2020-01-20 06:00 20
2020-01-20 09:00 30
2020-01-21 06:00 35
2020-01-21 07:00 40
2020-01-22 04:00 50
2020-01-22 06:00 54
2020-01-22 09:00 58
And I want to return a queryset showing the first reading per day, I can use:
from django.db.models.functions import Trunc
WidgetCount.objects.distinct("date").annotate(date=Trunc("time", "day"))
Which gives me:
date count
2020-01-20 15
2020-01-21 35
2020-01-22 50
I would like to add an annotation which gives me yesterday's value (so I can show the change per day).
date count yesterday_count
2020-01-20 15
2020-01-21 35 15
2020-01-22 50 35
If I do:
from django.db.models.functions import Trunc, Lag
from django.db.models import Window
WidgetCount.objects.distinct("date").annotate(date=Trunc("time", "day"), yesterday_count=Window(expression=Lag("count")))
The second row returned gives me 30 for yesterday_count, i.e. it's showing me the previous source row, before the distinct clause is applied.
If I add a partition clause like this:
WidgetCount.objects.distinct("date").annotate(date=Trunc("time", "day"), yesterday_count=Window(expression=Lag("count"), partition_by=F("date")))
Then yesterday_count is None for all rows.
I can do this calculation in Python if I need to but it's driving me a bit mad and I'd like to find out if what I'm trying to do is possible.
Thanks!
I think the main problem is that you're mixing operations that, when used in an annotation, produce a grouped queryset (such as Sum) with an operation that simply creates a new field for each record in the given queryset, such as yesterday_count=Window(expression=Lag("count")).
So ordering really matters here. When you try:
WidgetCount.objects.distinct("date").annotate(date=Trunc("time", "day"), yesterday_count=Window(expression=Lag("count")))
The resulting queryset is simply WidgetCount.objects.distinct("date") with annotations added; no grouping is performed.
I would suggest decoupling your operations so it becomes easier to understand what is happening. Also notice that you're iterating over the Python objects, so you don't need to make any new queries!
Note that I'm using the Sum operation as an example because I am getting an unexpected error with the FirstValue operator. The idea should be the same for the first value, just by changing acc_count=Sum("count") to first_count=FirstValue("count").
from django.db.models import Sum, Window
from django.db.models.functions import Lag, Trunc

for truncDate_groups in Row.objects.annotate(trunc_date=Trunc('time','day')).values("trunc_date")\
        .annotate(acc_count=Sum("count")).values("acc_count","trunc_date")\
        .order_by('trunc_date')\
        .annotate(y_count=Window(Lag("acc_count")))\
        .values("trunc_date","acc_count","y_count"):
    print(truncDate_groups)
OUTPUT:
{'trunc_date': datetime.datetime(2020, 1, 20, 0, 0, tzinfo=<UTC>), 'acc_count': 65, 'y_count': None}
{'trunc_date': datetime.datetime(2020, 1, 21, 0, 0, tzinfo=<UTC>), 'acc_count': 75, 'y_count': 162}
{'trunc_date': datetime.datetime(2020, 1, 22, 0, 0, tzinfo=<UTC>), 'acc_count': 162, 'y_count': 65}
It turns out the FirstValue operator has to be used inside a Window function, so you can't nest FirstValue and then calculate Lag on top of it. In this scenario I'm not exactly sure whether you can do it. The question becomes how to access the FirstValue column without nesting windows.
I haven't tested it out locally but I think you want to GROUP BY instead of using DISTINCT here.
WidgetCount.objects.values(
    date=Trunc('time', 'day'),
).order_by('date').annotate(
    date_count=Sum('count'),  # Will trigger a GROUP BY date
).annotate(
    yesterday_count=Window(Lag('date_count')),
)
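If the ORM route doesn't work out, the Python fallback mentioned in the question can be fairly short. This is only a sketch: first_reading_per_day is a made-up helper, the rows are assumed to arrive as (time, count) tuples, and "yesterday" here means the previous day that has any reading (like Lag), not strictly the previous calendar day.

```python
from datetime import datetime


def first_reading_per_day(rows):
    """Return [(date, first_count, previous_day_first_count), ...] sorted by date."""
    first = {}
    # Sorting by time means the first count seen for each date is the earliest one.
    for time, count in sorted(rows):
        first.setdefault(time.date(), count)
    result = []
    previous = None
    for day in sorted(first):
        result.append((day, first[day], previous))
        previous = first[day]
    return result


# The sample rows from the question.
rows = [
    (datetime(2020, 1, 20, 5), 15), (datetime(2020, 1, 20, 6), 20),
    (datetime(2020, 1, 20, 9), 30), (datetime(2020, 1, 21, 6), 35),
    (datetime(2020, 1, 21, 7), 40), (datetime(2020, 1, 22, 4), 50),
    (datetime(2020, 1, 22, 6), 54), (datetime(2020, 1, 22, 9), 58),
]
for day, count, yesterday_count in first_reading_per_day(rows):
    print(day, count, yesterday_count)
```

With the real model, the rows could presumably come from WidgetCount.objects.values_list("time", "count").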
I am creating RESTful endpoints to support a frontend payload.
My payload is an order made up of build-your-own dishes and ready-made single dishes.
Problem:
The frontend developer wants to send everything in a single POST. That means the submitted list will contain two types of dictionary,
one for build-your-own and one for a ready-made single dish.
IMO:
He can POST twice, once for each type of payload. That way each endpoint does one thing, and I prefer that.
His only reason is wanting to POST everything to a single endpoint.
Question:
What is the best practice for this sort of problem?
Build Your Own Payload:
I call it BYO for short.
1. base_bowl dictates the size and price of the item.
2. base_bowl also determines the number of fishes, toppings, and sauces,
because each base_bowl size (S, M, or L) has a different quota.
For example:
Size S can have 1 size-S scoop of fish and 2 size-S scoops of toppings.
Size M can have 2 size-M scoops of fish and 3 size-M scoops of toppings. If the customer wants more than the quota, the extras must go in extra_fishes and extra_toppings.
The payload is based on Price ids, since the quantity is determined by the number of members in each list.
{
    "base_bowl": salad.id,  # require=True, Price id
    "fishes": [salmon.id, tuna.id],
    "extra_fishes": [tofu.id],
    "toppings": [tamago.id, mango.id],
    "extra_toppings": [rambutan.id],
    "premium_toppings": [ikura.id],
    "sauces": [shoyu.id, spicy_kimchi.id],
    "extra_sauces": [],
    "sprinkles": [sesame.id, fried_shalots.id],
    "dish_order": 1,  # require=True
    "note": {
        'msg': 'eat here',
    },
}
The backend will validate the input and INSERT it into Order and OrderItem.
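To make the quota rule concrete, here is a rough sketch of the check in plain Python. QUOTA and over_quota are made-up names, the size is passed in directly instead of being looked up from the base_bowl Price id, and only the S and M numbers given above are filled in.

```python
# Quotas taken from the S and M examples above; other sizes and fields are
# deliberately left out because their numbers are not stated.
QUOTA = {
    "S": {"fishes": 1, "toppings": 2},
    "M": {"fishes": 2, "toppings": 3},
}


def over_quota(size, payload):
    """Return {field: excess} for every list that exceeds the quota for `size`."""
    errors = {}
    for field, allowed in QUOTA[size].items():
        chosen = len(payload.get(field, []))
        if chosen > allowed:
            errors[field] = chosen - allowed
    return errors


# Placeholder ids standing in for salmon.id, tuna.id and so on.
payload = {"fishes": ["salmon", "tuna"], "toppings": ["tamago", "mango"]}
print(over_quota("M", payload))  # {}
print(over_quota("S", payload))  # {'fishes': 1}
```

Anything reported here would have to move into extra_fishes or extra_toppings before the order is accepted.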
Ready-Made Dish:
This is very straightforward because it has no implicit logic like BYO. It just adds an OrderItem to the Order.
Menu id, size, and qty are used to determine the price, because the customer is free to choose them.
{
    'order_items': [
        {
            'menu_id': has_poink_menu.id,
            'size': Price.MenuSize.XL,  # 27, 37, 47, 52
            'qty': 2,  # amount = 52 * 2
        },
        {
            'menu_id': no_poink_menu.id,
            'size': Price.MenuSize.L,  # 20, 30, 40, 45
            'qty': 1  # amount = 40 * 1
        }
    ]
}
My answer is opinionated, but to me a RESTful design is kept much clearer by keeping endpoints specific and well defined. So in your case there may be a BYODishViewSet and ReadyMadeDishViewSet mapped to /api/byodish/ and /api/readymadedish/.
However, if this is part of a larger single model, say an Order model, then you may want to consider using a nested (writable) serializer to wrap up an Order as a single API request-response.
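If everything does end up arriving in one request, the two shapes can still be told apart and handed to separate, single-purpose handlers. A rough sketch in plain Python, not DRF code: split_order_items is a made-up helper, and it assumes a BYO item always carries base_bowl while a ready-made item always carries menu_id.

```python
def split_order_items(items):
    """Separate a mixed list of order dicts into (byo_items, ready_made_items)."""
    byo, ready_made = [], []
    for item in items:
        if "base_bowl" in item:
            byo.append(item)
        elif "menu_id" in item:
            ready_made.append(item)
        else:
            # Reject anything that matches neither shape instead of guessing.
            raise ValueError("unrecognised order item: %r" % (item,))
    return byo, ready_made


# Placeholder values; real payloads would carry ids.
mixed = [
    {"base_bowl": 1, "fishes": [10, 11], "dish_order": 1},
    {"menu_id": 7, "size": "XL", "qty": 2},
]
byo, ready_made = split_order_items(mixed)
print(len(byo), len(ready_made))
```

Each half can then go through its own validation, which keeps the one-thing-per-handler property even behind a single URL.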
I am using the optimizer in PyAlgoTrade to run my strategy and find the best parameters. The message I get is this:
2015-04-09 19:33:35,545 broker.backtesting [DEBUG] Not enough cash to fill 600800 order [1681] for 888 share/s
2015-04-09 19:33:35,546 broker.backtesting [DEBUG] Not enough cash to fill 600800 order [1684] for 998 share/s
2015-04-09 19:33:35,547 server [INFO] Partial result 7160083.45 with parameters: ('600800', 4, 19) from worker-16216
2015-04-09 19:33:36,049 server [INFO] Best final result 7160083.45 with parameters: ('600800', 4, 19) from client worker-16216
This is just part of the output. As you can see, I only get a result for the parameters ('600800', 4, 19); for the other combinations of parameters I get the message: 546 broker.backtesting [DEBUG] Not enough cash to fill 600800 order [1684] for 998 share/s.
I think this message means that I have created a buy order but do not have enough cash to buy it. However, from my script below:
shares = self.getBroker().getShares(self.__instrument)
if bars[self.__instrument].getPrice() > up and shares == 0:
    sharesToBuy = int(self.getBroker().getCash() / bars[self.__instrument].getPrice())
    self.marketOrder(self.__instrument, sharesToBuy)
if shares != 0 and bars[self.__instrument].getPrice() > up_stop:
    self.marketOrder(self.__instrument, -1 * shares)
if shares != 0 and bars[self.__instrument].getPrice() < up:
    self.marketOrder(self.__instrument, -1 * shares)
The logic of my strategy is that if the current price is larger than up, we buy, and if the current price is larger than up_stop or smaller than up after we buy, we sell. So from the code, there should be no way to generate an order that I do not have enough cash to pay for, because the order size is calculated from my current cash.
So where did I go wrong?
You calculate the order size based on the current price, but the price for the next bar may have gone up. The order is not filled in the current bar, but starting from the next bar.
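To illustrate with made-up numbers (the cash and prices below are hypothetical and not taken from your backtest), sizing the order with the current bar's price and filling it at a slightly higher price in the next bar is enough to exceed the available cash:

```python
def order_cost(cash, signal_price, fill_price):
    """Size an order with all cash at signal_price, then price it at fill_price."""
    shares = int(cash / signal_price)
    return shares, shares * fill_price


cash = 10000.0
# Signal seen at 10.00, order filled on the next bar at 10.05 (hypothetical).
shares, cost = order_cost(cash, 10.00, 10.05)
print(shares, cost > cash)
```

Here 1000 shares are ordered, but at the higher fill price they cost more than the 10000 in cash, which is the kind of situation that "Not enough cash to fill" message reports.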
With respect to the 'Partial result' and 'Best final result' messages, how many combinations of parameters are you trying? Note that if you are using 10 different combinations, you won't get 10 different 'Partial result' messages, because combinations are evaluated in batches of 200 and only the best partial result for each batch gets printed.
How to measure the similarity between three vectors?
Suppose I have three students and their subjects marks.
Student 1 (12,23,43,35,21)
Student 2 (23, 34, 45, 25, 17) and
Student 3 (34, 43, 22, 11, 39)
Now I want to measure the similarity between these three students. Can anyone help me with this? Thanks in advance.
You want similarity, not dissimilarity. The latter is available in numerous functions, some noted in the comments. The most commonly used metric for dissimilarity is Euclidean distance.
To measure similarity, you could use the simil(...) function in the proxy package in R, as shown below. Assuming that the scores are in the same order for each student, you would combine the scores into a matrix row-wise, then:
Student.1 <- c(12, 23, 43, 35, 21)
Student.2 <- c(23, 34, 45, 25, 17)
Student.3 <- c(34, 43, 22, 11, 39)
students <- rbind(Student.1,Student.2,Student.3)
library(proxy)
simil(students,method="Euclidean")
# Student.1 Student.2
# Student.2 0.04993434
# Student.3 0.02075985 0.02593140
This calculates the Euclidean distance for every student vs. every other student, and converts that to a similarity score using
sim = 1 / (1+dist)
So if the scores for two students are identical, their similarity will be 1.
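For anyone who wants to check the arithmetic outside R, the same conversion is easy to write by hand. This is a small Python sketch, not the proxy implementation, and euclidean_similarity is just an illustrative name:

```python
import math


def euclidean_similarity(a, b):
    """Similarity in (0, 1]: 1 / (1 + Euclidean distance between a and b)."""
    dist = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return 1.0 / (1.0 + dist)


student_1 = [12, 23, 43, 35, 21]
student_2 = [23, 34, 45, 25, 17]
student_3 = [34, 43, 22, 11, 39]

print(euclidean_similarity(student_1, student_2))
print(euclidean_similarity(student_1, student_3))
print(euclidean_similarity(student_2, student_3))
```

The three values should agree with the simil output above to the digits shown, and identical score vectors give exactly 1.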
But this is only one way to do it. There are 48 similarity/distance metrics coded in the proxy package, which can be listed using:
pr_DB$get_entries()
You can even code your own metric, using, e.g.,
simil(students,FUN=f)
where f(x,y) is a function that takes two vectors as arguments and returns a similarity score defined as you like. This might be relevant if, for example, some courses were "more important" in the sense that you wanted to weight differences wrt those courses more highly than the others.
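As an illustration of what such a weighted metric might compute, here is a Python sketch of the idea (the R version would be a function passed as FUN). The weights are invented for the example, and weighted_similarity is a hypothetical name:

```python
import math


def weighted_similarity(a, b, weights):
    """1 / (1 + weighted Euclidean distance); a larger weight makes a course count more."""
    dist = math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))
    return 1.0 / (1.0 + dist)


student_1 = [12, 23, 43, 35, 21]
student_2 = [23, 34, 45, 25, 17]

equal = weighted_similarity(student_1, student_2, [1, 1, 1, 1, 1])
# Hypothetical weighting: the first course counts four times as much.
first_heavy = weighted_similarity(student_1, student_2, [4, 1, 1, 1, 1])
print(equal, first_heavy)
```

Because these two students differ on the first course, weighting it more heavily lowers their similarity compared with equal weights.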
My question is linked with this one:
Roulette-wheel selection in Genetic algorithm. Population needs to be sorted first?
If we don't sort the population, how should roulette wheel selection be organized for it?
Surely we have to search in a linear way now. Have you got any code snippets in C++ or Java for this case?
The population does not need to be sorted at all - the key to roulette selection is that the probability of a given individual being selected for reproduction is proportional to its fitness.
Say you have an unsorted population, with fitnesses as follows:
[12, 45, 76, 32, 54, 21]
To perform roulette selection, you need only pick a random number in the range 0 to 240 (the sum of the population's fitness). Then, starting at the first element in the list, subtract each individual's fitness until the random number is less than or equal to zero. So, in the above case, if we randomly pick 112, we do the following:
Step 1: 112 - 12 = 100. This is > 0, so continue.
Step 2: 100 - 45 = 55. This is > 0, so continue.
Step 3: 55 - 76 = -21. This is <= 0, so stop.
Therefore, we select individual #3 for reproduction. Note how this doesn't require the population to be sorted at all.
So, in pseudocode, it boils down to:
let s = sum of population fitness
let r = random number in range (0, s].
let i = 0.
while r > 0 do:
    r = r - fitness of individual #i
    increment i
select individual #i - 1 for reproduction.
Note that the - 1 in the final line is to counteract the increment i that's done within the last iteration of the loop (because even though we've found the individual we want, it increments regardless).
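Here is the same idea as a short runnable sketch. It is in Python instead of C++ or Java, but it should port almost line for line. The function names are mine, and the pick is split out from the random draw so the worked example above can be replayed with r = 112.

```python
import random


def roulette_pick(fitnesses, r):
    """Return the zero-based index selected for a draw r in [0, sum(fitnesses)]."""
    for index, fitness in enumerate(fitnesses):
        r -= fitness
        if r <= 0:
            return index
    # Only reached through floating-point round-off; fall back to the last one.
    return len(fitnesses) - 1


def roulette_select(fitnesses, rng=random):
    """Draw r uniformly and pick an index with probability proportional to fitness."""
    return roulette_pick(fitnesses, rng.uniform(0, sum(fitnesses)))


population = [12, 45, 76, 32, 54, 21]
print(roulette_pick(population, 112))  # 2, i.e. individual #3 counting from 1
print(roulette_select(population))
```

Returning from inside the loop avoids the "- 1" adjustment, and the search is linear in the population size with no sorting needed.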