Embed product-variance logic into Django models


I wonder how I would model my Products model to auto-create variants of a Product based on its variant parts, in a way that the admin app would also understand.
My Products have:
Colors
Sizes
and can probably get more features in the future.
How would I model my Product class to generate all variants of the Product?
Say I would create a new Product in Colors Red Blue Green and in Sizes XS S M L XL.
class Product(models.Model):
    name = models.CharField(max_length=200)

class Color(models.Model):
    product = models.ForeignKey(Product)
    name = models.CharField(max_length=200)

class Size(models.Model):
    product = models.ForeignKey(Product)
    name = models.CharField(max_length=200)

class FutureVariant(models.Model):
    product = models.ForeignKey(Product)
    name = models.CharField(max_length=200)

# etc.
Now I need a smart method that auto-creates every color-size-[FUTURE VARIANT] combination for that product.
So I would tell Django:
Create new Product
In the colors Red Blue Green
In the sizes XS S M L XL
And the Product class would go and produce Products with all possible combinations in the products_product table.
I'm almost sure that this has design flaws. But I'm just curious how to put this logic in the ORM rather than write weird procedural code, which would probably go against the DRY principle.
In Database logic I would think of something like this;
PRODUCTS
- id
- name
PRODUCTS_VARIANTS_COLORS
- id
- name
- html_code
PRODUCTS_VARIANTS_SIZES
- id
- name
PRODUCTS_VARIANTS_TABLES
- table_name
- table_id
PRODUCTS_VARIANTS
- product_id
- variant_table
- variant_id
This way I could make endless variant tables, as long as I register them in my PRODUCTS_VARIANTS_TABLES and store their names there. PRODUCTS_VARIANTS would hold all the variants of the product, including every combination of them. I am also aiming to have a selection phase where the user can choose (in an HTML checkbox list) which variants they do and don't want.
The problem (I think) is that this would not really comply with a logic in the ORM.

I don't know if you are asking about alternatives or just looking to make your way work, but what about splitting a product from its attributes?
So instead of having separate models for attributes, you just have an Attribute model. This way you are future-proofing your database so you can easily add more attributes (like if you have products with a height and width instead of just color or size).
class AttributeBase(models.Model):
    label = models.CharField(max_length=255)  # e.g. color, size, shape, etc.
    ...

class Attribute(models.Model):
    base = models.ForeignKey('AttributeBase', related_name='attributes')
    value = models.CharField(max_length=255)  # e.g. red, L, round, etc.
    internal_value = models.CharField(max_length=255, null=True, blank=True)  # other values you may need e.g. #ff0000, etc.
    ...

class ProductAttribute(Attribute):
    product = models.ForeignKey('Product', related_name='attributes')
It now becomes very easy to create all attributes for a product...
class Product(models.Model):
    ...
    def add_all_attributes(self):
        for attribute in Attribute.objects.all():
            self.attributes.add(attribute)

Now when you call product.add_all_attributes(), that product will contain every attribute. You can even make it add only the attributes of a certain AttributeBase:

    def add_all_attributes_for_base(self, label):
        base = AttributeBase.objects.get(label=label)
        for attribute in base.attributes.all():
            self.attributes.add(attribute)

You could write something as:
class Product(models.Model):
    @classmethod
    def create_variants(cls):
        # compute all possible combinations
        combinations = ...
        for combination in combinations:
            Product.objects.create(**combination)
Creating all the combinations would indeed happen through registering the possible variants and their possible values.
Note that the ORM is there to help you map Django objects to database records; it doesn't help you produce the database records (read: Django model instances) that you wish to save.
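The "compute all possible combinations" step left open above can be sketched in plain Python with itertools.product. This is only an illustration under assumptions: the helper name variant_combinations and the "color"/"size" keys are hypothetical, and turning each resulting dict into a saved Product is left out.

```python
from itertools import product


def variant_combinations(options):
    """Return one dict per combination of the given variant values.

    `options` maps a variant name (e.g. "color") to its possible values.
    """
    names = list(options)
    value_lists = (options[name] for name in names)
    # product() yields the Cartesian product of all value lists.
    return [dict(zip(names, values)) for values in product(*value_lists)]


combos = variant_combinations({
    "color": ["Red", "Blue", "Green"],
    "size": ["XS", "S", "M", "L", "XL"],
})
print(len(combos))  # 15 (3 colors x 5 sizes)
print(combos[0])    # {'color': 'Red', 'size': 'XS'}
```

Adding a future variant type then only means adding another key to the dict; the combination logic itself does not change.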

Related

Bin a queryset using Django?

Let's say we have the following simplistic models:
class Category(models.Model):
    name = models.CharField(max_length=264)

    def __str__(self):
        return self.name

    class Meta:
        verbose_name_plural = "categories"

class Status(models.Model):
    name = models.CharField(max_length=264)

    def __str__(self):
        return self.name

    class Meta:
        verbose_name_plural = "status"

class Product(models.Model):
    title = models.CharField(max_length=264)
    description = models.CharField(max_length=264)
    category = models.ForeignKey(Category, on_delete=models.CASCADE)
    # DecimalField requires decimal_places as well; 2 is assumed here.
    price = models.DecimalField(max_digits=10, decimal_places=2)
    status = models.ForeignKey(Status, on_delete=models.CASCADE)
My aim is to get some statistics, like total products, total sales, average sales etc, based on which price bin each product belongs to.
So, the price bins could be something like 0-100, 100-500, 500-1000, etc.
I know how to use pandas to do something like that:
Binning column with python pandas
I am searching for a way to do this with the Django ORM.
One of my thoughts is to convert the queryset into a list, apply a function to get the appropriate price bin, and then compute the statistics.
Another thought, which I am not sure how to implement, is the same as the one above but applying the bin function only to the queryset field I am interested in.

There are three pathways I can see.
First is composing the SQL you want directly and sending it to your database through your model's manager: .objects.raw("[sql goes here]"). This answer shows how to define a group with a simple function on the content; something like that could work?
SELECT FLOOR(grade/5.00)*5 As Grade,
COUNT(*) AS [Grade Count]
FROM TableName
GROUP BY FLOOR(Grade/5.00)*5
ORDER BY 1
Second is that there is no reason you can't move the queryset (with .values() or .values_list()) into a pandas dataframe or similar and then bin it, as you mentioned. There is probably some efficiency loss in getting the queryset into a dataframe and then processing it, but I am not sure that it would always be bad. If it's easier to compose and maintain, that might be fine.
The third way I would try (which I think is what you really want) is chaining .annotate() to label points with the bin they belong in, and the aggregate count function to count how many are in each bin. This is more advanced ORM work than I've done, but I think you'd start looking at something like the docs section on conditional aggregation. I've adapted this slightly to create the 'price_class' column first, with annotate.
from django.db.models import Count, F, Q
from django.db.models.functions import Floor  # available in Django 2.2+
Product.objects.annotate(price_class=Floor(F('price') / 100)).aggregate(
    class_zero=Count('pk', filter=Q(price_class=0)),
    class_one=Count('pk', filter=Q(price_class=1)),
    class_two=Count('pk', filter=Q(price_class=2)),  # etc etc
)
I'm not sure whether that Floor call will work as written, and you may need ExpressionWrapper to push price_class into the right type of output_field. All the best.
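Note that a single floor division only gives equal-width bins, while the question's bins (0-100, 100-500, 500-1000) are uneven. For the list-based approach the question mentions, a minimal plain-Python sketch could look like the following. The names BIN_EDGES, price_bin and bin_stats are hypothetical, and the prices would in practice come from something like a values_list('price', flat=True) call on the queryset.

```python
from bisect import bisect_right
from collections import defaultdict
from decimal import Decimal

# Assumed bin edges, taken from the example bins in the question.
BIN_EDGES = [Decimal(0), Decimal(100), Decimal(500), Decimal(1000)]


def price_bin(price):
    """Return a label such as '100-500' for the bin containing `price`.

    Lower edges are inclusive, upper edges exclusive.
    """
    i = bisect_right(BIN_EDGES, price)
    if i == 0:
        return f"<{BIN_EDGES[0]}"
    if i == len(BIN_EDGES):
        return f"{BIN_EDGES[-1]}+"
    return f"{BIN_EDGES[i - 1]}-{BIN_EDGES[i]}"


def bin_stats(prices):
    """Group prices by bin and compute count, total and average per bin."""
    grouped = defaultdict(list)
    for price in prices:
        grouped[price_bin(price)].append(price)
    return {
        label: {
            "count": len(values),
            "total": sum(values),
            "average": sum(values) / len(values),
        }
        for label, values in grouped.items()
    }


stats = bin_stats([Decimal("20"), Decimal("80"), Decimal("250"), Decimal("1200")])
print(stats["0-100"]["count"])  # 2
```

This pulls every price into memory, so it trades database-side aggregation for simplicity.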

Sorting by distance with a related ManyToMany field

I have these two models.
class Store(models.Model):
    coords = models.PointField(null=True, blank=True)
    objects = models.GeoManager()

class Product(models.Model):
    stores = models.ManyToManyField(Store, null=True, blank=True)
    objects = models.GeoManager()
I want to get the products sorted by the distance to a point. If the stores field in Product was a Foreign Key I would do this and it works.
pnt = GEOSGeometry('POINT(5 23)')
Product.objects.distance(pnt, field_name='stores__coords').order_by('distance')
But since the field is a ManyToMany field it breaks with
ValueError: <django.contrib.gis.db.models.fields.PointField: coords> is not in list
I kind of expected this because it's not clear which of the stores it should use to calculate the distance, but is there any way to do this?
I need the list of products ordered by distance to a specific point.

Just an idea, maybe this would work for you, this should take only two database queries (due to how prefetch works). Don't judge harshly if it doesn't work, I haven't tried it:
class Store(models.Model):
    coords = models.PointField(null=True, blank=True)
    objects = models.GeoManager()

class Product(models.Model):
    stores = models.ManyToManyField(Store, null=True, blank=True, through='ProductStore')
    objects = models.GeoManager()

class ProductStore(models.Model):
    product = models.ForeignKey(Product)
    store = models.ForeignKey(Store)
    objects = models.GeoManager()

then:

pnt = GEOSGeometry('POINT(5 23)')
ps = ProductStore.objects.distance(pnt, field_name='store__coords').order_by('distance').prefetch_related('product')
for p in ps:
    p.product ...  # do whatever you need with it

This is how I solved it, but I don't really like this solution. I think it is very inefficient. There should be a better way with GeoDjango, so until I find a better solution I probably won't be using this. Here's what I did.
I added a new method to the product model
class Product(models.Model):
    stores = models.ManyToManyField(Store, null=True, blank=True)
    objects = models.GeoManager()

    def get_closest_store_distance(self, point):
        sorted_stores = self.stores.distance(point).order_by('distance')
        if sorted_stores.count() > 0:
            store = sorted_stores[0]
            return store.distance.m
        return 99999999  # If no store, return very high distance

Then I can sort this way

def sort_products(self, obj_list, lat, lng):
    pt = 'POINT(%s %s)' % (lng, lat)
    srtd = sorted(obj_list, key=lambda obj: obj.get_closest_store_distance(pt))
    return srtd
Any better solutions or ways to improve this one are very welcome.

I will take "distance from a product to a point" to be the minimum distance from the point to a store with that product. I will take the output to be a list of (product, distance) for all products sorted by distance ascending. (A comment by someone who placed a bounty indicated they sometimes also want (product,distance,store) sorted by distance then store within product.)
Every model has a corresponding table. The fields of the model are the columns of the table. Every model/table should have a fill-in-the-(named-)blanks statement where its records/rows are the ones that make a true statement.
Store(coords,...) // store [store] is at [coords] and ...
Product(product,store,...) // product [product] is stocked by store [store] and ...
Since Product has store(s) as manyToManyField it already is a "ProductStore" table of products and stocking stores and Store already is a "StoreCoord" table of stores and their coordinates.
You can mention any object's fields in a query filter() for a model with a manyToManyField.
The SQL for this is simple:
select p.product,distance(s.coord,[pnt]) as distance
from Store s join Product p
on s.store=p.store
group by product
having distance=min(distance)
order by distance
It should be straightforward to map this to a query. However, I am not familiar enough with Django to give you exact code now.
from django.db.models import F
q = Product.objects.all()
    .filter(store__product=F('product'))
    ...
    .annotate(distance=Min('coord.distance([pnt])'))
    ...
    .order_by('distance')
The Min() is an example of aggregation.
You may also be helped by explicitly making a subquery.
It is also possible to query this through the raw interface. However, the names above are not right for a Django raw query. E.g., the table names will by default be APPL_store and APPL_product, where APPL is your application name. Also, distance is not your PointField operator; you must supply the right distance function. But you should not need to query at the raw level.
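The definition used in this answer (a product's distance is the minimum distance over the stores that stock it, and products are sorted by that value) can be illustrated without GeoDjango. The sketch below is plain Python over hypothetical in-memory data and uses flat Euclidean distance via math.hypot, which is an assumption made for illustration and not a geodesic distance.

```python
from math import hypot

# Hypothetical data: store name -> (x, y) coordinates,
# and product name -> names of the stores that stock it.
stores = {"A": (0.0, 0.0), "B": (3.0, 4.0), "C": (10.0, 0.0)}
stocked_by = {"lamp": ["B", "C"], "desk": ["A"], "sofa": []}


def products_by_distance(point, stores, stocked_by):
    """Return (product, distance) pairs sorted by distance ascending.

    A product's distance is the minimum distance from `point` to any
    store stocking it; products without stores are skipped.
    """
    px, py = point
    result = []
    for product, store_names in stocked_by.items():
        if not store_names:
            continue  # no store, so no defined distance
        nearest = min(
            hypot(stores[name][0] - px, stores[name][1] - py)
            for name in store_names
        )
        result.append((product, nearest))
    return sorted(result, key=lambda pair: pair[1])


ranked = products_by_distance((0.0, 0.0), stores, stocked_by)
print(ranked)  # [('desk', 0.0), ('lamp', 5.0)]
```

The group-by-product, take-the-minimum, then sort shape is the same one the SQL above expresses.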

Django Query Set in Deep with exclude

I have three classes.
A Product has many descriptions and each description has many stores.
What I want to do:
select all products where the store.qty value is > 0.
I've tried
pr = Product.objects.all().exclude(Product__Product_description__qty > 0)
How can I do that?
class Product(models.Model):
    id = models.AutoField(primary_key=True)
    name = models.CharField(max_length=255)

class Product_description(models.Model):
    id = models.AutoField(primary_key=True)
    name = models.CharField(max_length=255)
    product = models.ForeignKey(Product)

class Store(models.Model):
    id = models.AutoField(primary_key=True)
    name = models.CharField(max_length=255)
    desc = models.ForeignKey(Product_description)
    qty = models.IntegerField()

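pr = Product.objects.filter(product_description__store__qty__lte=0)
Or if you really must use exclude:
pr = Product.objects.exclude(product_description__store__qty__gt=0)
(Reverse lookups use the lower-cased model name by default, and qty lives on Store, so the path has to go through store.)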
all() is not necessary in either case; you just end up building an untriggered proxy that goes into building the filter/exclude queryset afterward. It wastes memory and CPU, but otherwise does nothing. Only the .delete() operator requires a working all() queryset, but it's a special case designed explicitly to avoid the accidental destruction of datasets.
The Django Queryset API documentation is very readable.
Django convention is to name your class ProductDescription.
This seems like a backward hierarchy. Why would stores have "product descriptions"? Isn't that metadata on the product itself, and what you care about is that the stores have a certain quantity of product? Or are these product variants, i.e., you want to find all the products for which stores have at least one green or blue or orange one? Something tells me that your project needs a careful re-think.

Django ORM: count a subset of related items

I am looking to find a way to annotate a queryset with the counts of a subset of related items. Below is a subset of my models:
class Person(models.Model):
    Name = models.CharField(max_length = 255)
    PracticeAttended = models.ManyToManyField('Practice',
                                              through = 'PracticeRecord')

class Club(models.Model):
    Name = models.CharField(max_length = 255)
    Slug = models.SlugField()
    Members = models.ManyToManyField('Person')

class PracticeRecord(PersonRecord):
    Person = models.ForeignKey(Person)
    Practice = models.ForeignKey(Practice)

class Practice(models.Model):
    Club = models.ForeignKey(Club, default = None, null = True)
    Date = models.DateField()
I'm looking to make a queryset which annotates the number of club specific practices attended by a person. I can already find the total number of practices by that person with a query of Person.objects.all().annotate(Count('PracticeRecord'))
However, I would like some way to annotate the number of practices that a person attends for a specific club.
I would prefer something using the django ORM without having to resort to writing raw SQL.
Thanks.

However, I would like some way to annotate the number of practices that a person attends for a specific club.
Let us see.
First, find the specific club.
club = Club.objects.get(**conditions)
Next, filter all Persons who have practiced at this club.
persons = Person.objects.filter(practicerecord__Practice__Club = club)
Now, annotate with the count.
q = persons.annotate(count = Count('practicerecord'))
Edit
I was able to successfully make this work in my test setup: Django 1.2.3, Python 2.6.4, Postgresql 8.4, Ubuntu Karmic.
PS: It is a Good Idea™ to use lower case names for the fields. This makes it far easier to use the double underscore (__) syntax to chain fields. For example, in your case Django automatically creates practicerecord for each Person. When you try to access other fields of PracticeRecord through this field, you have to remember to use title case.
If you had used lower case names, you could have written:
persons = Person.objects.filter(practicerecord__practice__club = club)
# note the lower-case "practice" and "club"
which looks far more uniform.
PPS: It is Count('practicerecord') (note the lower case).

I'm afraid that raw SQL is the only option here. Anyway, it's not that scary or hard to manage if you put it in a model manager.

Is a many-to-many relationship with extra fields the right tool for my job?

Previously had a go at asking a more specific version of this question, but had trouble articulating what my question was. On reflection that made me doubt if my chosen solution was correct for the problem, so this time I will explain the problem and ask if a) I am on the right track and b) if there is a way around my current brick wall.
I am currently building a web interface to enable an existing database to be interrogated by (a small number of) users. Sticking with the analogy from the docs, I have models that look something like this:
class Musician(models.Model):
    first_name = models.CharField(max_length=50)
    last_name = models.CharField(max_length=50)
    dob = models.DateField()

class Album(models.Model):
    artist = models.ForeignKey(Musician)
    name = models.CharField(max_length=100)

class Instrument(models.Model):
    artist = models.ForeignKey(Musician)
    name = models.CharField(max_length=100)
Here I have one central table (Musician) and several tables of associated data that are related by either ForeignKey or OneToOneFields. Users interact with the database by creating filtering criteria to select a subset of Musicians based on the data in the main or related tables. Likewise, the users can then select which piece of data is used to rank the results presented to them. The results are then viewed initially as a two-dimensional table with a single row per Musician and the selected data fields (or aggregates) in each column.
To give you some idea of scale, the database has ~5,000 Musicians with around 20 fields of related data.
Up to here is fine and I have a working implementation. However, it is important that a given user can upload their own annotation data sets (more than one) and then filter and order on these in the same way they can with the existing data.
The way I had tried to do this was to add the models:
class UserDataSets(models.Model):
    user = models.ForeignKey(User)
    name = models.CharField(max_length=100)
    description = models.CharField(max_length=64)
    results = models.ManyToManyField(Musician, through='UserData')

class UserData(models.Model):
    artist = models.ForeignKey(Musician)
    dataset = models.ForeignKey(UserDataSets)
    score = models.IntegerField()

    class Meta:
        unique_together = (("artist", "dataset"),)
I have a simple upload mechanism enabling users to upload a data set file that consists of 1 to 1 relationship between a Musician and their "score". Within a given user dataset each artist will be unique, but different datasets are independent from each other and will often contain entries for the same musician.
This worked fine for displaying the data, starting from a given artist I can do something like this:
artist = Musician.objects.get(pk=1)
dataset = UserDataSets.objects.get(pk=5)
print artist.userdata_set.get(dataset=dataset.pk)
However, this approach fell over when I came to implement the filtering and ordering of query set of musicians based on the data contained in a single user data set. For example, I could easily order the query set based on all of the data in the UserData table like this:
artists = Musician.objects.all().order_by('userdata__score')
But that does not help me order by the results of a given single user dataset. Likewise I need to be able to filter the query set based on the "scores" from different user data sets (eg find all musicians with a score > 5 in dataset1 and < 2 in dataset2).
Is there a way of doing this, or am I going about the whole thing wrong?

edit: nevermind, it's wrong. I'll keep it so you can read, but then I'll delete afterward.
Hi,
If I understand correctly, you can try something like this:
artists = Musician.objects.select_related('UserDataSets').filter(Q(userdata__score__gt=5, userdata__id=1) | Q(userdata__score__lt=2, userdata__id=2))
For more info on how to use Q, check this: Complex lookups with Q objects.
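To make the requirement from the question concrete (score > 5 in dataset1 AND < 2 in dataset2, then ordering by one chosen dataset), here is a plain-Python sketch over hypothetical in-memory data. The scores dict, the musician names and both helper functions are assumptions for illustration only; they mirror UserData rows but are not ORM code.

```python
# Hypothetical scores: dataset id -> {musician: score}, mirroring UserData rows.
scores = {
    1: {"miles": 9, "ella": 6, "chet": 3},
    2: {"miles": 1, "ella": 4, "chet": 0},
}


def filter_by_datasets(scores, conditions):
    """Return musicians satisfying every (dataset_id, predicate) pair.

    Each predicate is checked against the musician's score in its own
    dataset, so different conditions may refer to different datasets.
    """
    candidates = set.intersection(*(set(scores[ds]) for ds, _ in conditions))
    return {
        musician
        for musician in candidates
        if all(pred(scores[ds][musician]) for ds, pred in conditions)
    }


def order_by_dataset(musicians, scores, dataset_id):
    """Sort musicians by their score within a single dataset."""
    return sorted(musicians, key=lambda m: scores[dataset_id][m])


matches = filter_by_datasets(scores, [(1, lambda s: s > 5), (2, lambda s: s < 2)])
print(matches)                                   # {'miles'}
print(order_by_dataset(scores[1], scores, 1))    # ['chet', 'ella', 'miles']
```

Note that this combines the conditions with AND, whereas the Q(...) | Q(...) expression above is an OR. In the ORM, the documented behaviour for multi-valued relationships is that conditions inside one filter() call apply to the same related row, while chained filter() calls may match different rows, so something like Musician.objects.filter(userdata__dataset=d1, userdata__score__gt=5).filter(userdata__dataset=d2, userdata__score__lt=2) may be worth trying; that line is untested against this schema.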
