Actually, this question has puzzled me for a long time.
Say, I have two models, Course and CourseDate, as follows:
    class Course(models.Model):
        name = models.CharField()

    class CourseDate(models.Model):
        course = models.ForeignKey(Course)
        date = models.DateField()
where CourseDate is the dates that a certain course will take place.
I could also define Course as follows and discard CourseDate:
    class Course(models.Model):
        name = models.CharField()
        dates = models.CharField()
where the dates field contains dates represented by strings. For example:
dates = '2016-10-1,2016-10-12,2016-10-30'
I don't know if the second solution is kind of "cheating". So, which one is better?
I don't know about cheating, but it certainly goes against good database design. More to the point, it prevents you from doing almost all kinds of useful queries on that field. What if you wanted to know all courses that had dates within two days of a specific date? Almost impossible to do that with solution 2, but simple with solution 1.
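To make the "within two days of a specific date" query concrete, here is a minimal sketch using the standard-library sqlite3 module instead of the Django ORM. The table and column names (course, course_date) and the helper courses_near are assumptions chosen for illustration, not what Django would generate. With the models from solution 1, the ORM version would presumably be something like Course.objects.filter(coursedate__date__range=(start, end)).distinct().

```python
import sqlite3
from datetime import date, timedelta

# In-memory database with one row per course date (solution 1).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE course (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE course_date (
        id INTEGER PRIMARY KEY,
        course_id INTEGER REFERENCES course(id),
        date TEXT  -- zero-padded ISO 8601 strings sort chronologically
    );
""")
conn.executemany("INSERT INTO course VALUES (?, ?)",
                 [(1, "Algebra"), (2, "Biology")])
conn.executemany(
    "INSERT INTO course_date (course_id, date) VALUES (?, ?)",
    [(1, "2016-10-01"), (1, "2016-10-12"), (2, "2016-10-30")],
)


def courses_near(conn, target, days=2):
    """Return names of courses with a date within `days` of `target`."""
    start = (target - timedelta(days=days)).isoformat()
    end = (target + timedelta(days=days)).isoformat()
    rows = conn.execute(
        """
        SELECT DISTINCT course.name
        FROM course
        JOIN course_date ON course_date.course_id = course.id
        WHERE course_date.date BETWEEN ? AND ?
        ORDER BY course.name
        """,
        (start, end),
    ).fetchall()
    return [name for (name,) in rows]


print(courses_near(conn, date(2016, 10, 11)))
```

With the comma-separated string from solution 2, the same question would require loading every course and parsing its dates field in application code.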
Is there a by-the-book way of allowing a user to add columns to a site's database table? For example, if the site were about animals, one user might want to keep stats like 'walks per week' and 'type of food' for their breed of dog, while another user might want to keep track of how much milk their goat is producing.
So if I have an 'Animal' class with some basic info like 'breed', 'animal name', 'DOB', and 'DOD', I would then like a form in the front end that allows users to add all the other columns they want.
Is this possible? I hope I've explained it well enough.
@WillemVanOnsem already mentioned some good options in the comments. I'm going to chime in to say that modifying your schema's structure based on user input is an extremely bad idea and opens another avenue for abuse. For Django in particular, it means you either can't use the ORM's migration facilities for some of your models, or you probably have to do some really awful automation.
If your animal types are well-defined and consistent, you can consider (carefully) making them subclasses of the Animal model. Otherwise, this would be the simplest way to handle it (note that the following isn't valid code, it needs required arguments for the field types):
    class AnimalAttribute(models.Model):
        animal = models.ForeignKey(Animal)
        name = models.CharField()
        value = models.CharField()
This works best if attributes aren't shared, e.g. users are directly inputting their animals' names and attributes, not picking from an existing list.
If you need to provide a normalized list of attributes users can pick from (actual EAV, which is something you should avoid if possible, since it moves some of your data structure from code into the data persistence layer), doing that in your models is a little more complex. For example:
    class Species(models.Model):
        name = models.CharField()

    class SpeciesAttribute(models.Model):
        species = models.ForeignKey(Species)
        name = models.CharField()

    class Animal(models.Model):
        name = models.CharField()
        species = models.ForeignKey(Species)

    class AnimalAttributeValue(models.Model):
        animal = models.ForeignKey(Animal)
        attribute = models.ForeignKey(SpeciesAttribute)
        value = models.CharField()
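One consequence of this layout is that consistency checks which a fixed column would give you for free have to be written by hand. The following is a rough sketch in plain Python (no Django) that mirrors the SpeciesAttribute and AnimalAttributeValue models with dataclasses; the helper attributes_for and the sample data are hypothetical. It collects one animal's values into a dict and rejects an attribute that was defined for a different species.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpeciesAttribute:
    species: str
    name: str


@dataclass(frozen=True)
class AnimalAttributeValue:
    animal: str
    attribute: SpeciesAttribute
    value: str  # every value is stored as text, as in the CharField above


def attributes_for(animal, species, values):
    """Collect one animal's values into a dict, rejecting attributes
    that were defined for a different species."""
    result = {}
    for row in values:
        if row.animal != animal:
            continue
        if row.attribute.species != species:
            raise ValueError(
                f"{row.attribute.name!r} is not a {species} attribute"
            )
        result[row.attribute.name] = row.value
    return result


walks = SpeciesAttribute("dog", "walks per week")
milk = SpeciesAttribute("goat", "litres of milk per day")
values = [
    AnimalAttributeValue("Rex", walks, "7"),
    AnimalAttributeValue("Nanny", milk, "3"),
]
print(attributes_for("Rex", "dog", values))
```

Note that the values come back as strings, so any numeric comparison also needs a conversion step in application code.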
Let's say I have two models:
    class Testmodel1(models.Model):
        amount = models.IntegerField(null=True)
        contact = models.CharField()
        entry_time = models.DateTimeField()

    class Testmodel2(models.Model):
        name = models.CharField()
        mobile_no = models.ForeignKey(Testmodel1)
I am creating objects of the second model (Testmodel2). Now I want to find the number of Testmodel2 objects created in the last 24 hours, by the mobile_no field.
What would be the best way to write this query?
Any help would be appreciated.
It'd be better if you made the contact field a models.DateTimeField rather than a models.CharField. If it were a DateTimeField, you could easily use lte, gte, and other lookups to compare it to other datetimes.
For example, if Testmodel1.contact were a DateTimeField, the answer to your question would be (with past set to the datetime 24 hours ago):
    Testmodel1.objects.filter(contact__gte=past).count()
If the contact field contains a string representing a datetime, I'd recommend switching it over, since there's really no reason to store it as a string.
If you're unable to change these fields, unfortunately I don't think there's a way to do this on the database level. You'll have to filter them individually on the Python side:
    import arrow
    from dateutil.parser import parse

    results = []
    past = arrow.utcnow().shift(hours=-24)
    model_query = Testmodel1.objects.all()
    for obj in model_query.iterator():
        # Parse string into datetime; this assumes the stored strings carry
        # a timezone offset so they can be compared with the UTC cutoff
        contact_date = parse(obj.contact)
        if contact_date > past:
            results.append(obj)
    print(len(results))
This will give you a list (note: NOT a queryset) containing all matching model instances. It'll be a lot slower than the other option would be, you can't edit the results afterwards with something like results.filter(amount__gte=1).count(), and it's not quite as clean.
That said, it'll get the job done.
EDIT
It occurs to me that this might be able to be done with annotation, but I'm not sure how that would be accomplished, or if it would even work. I defer to other answers if they can think of a way to use annotation to accomplish this in a better way, but stick to my original assessment that this should probably be a DateTime field.
EDIT 2
With a DateTime field now added on the other model, you can look it up across models like so:
past = arrow.utcnow().shift(hours=-24)
Testmodel2.objects.filter(mobile_no__entry_time__gte=past)
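The cutoff itself does not need arrow; the standard library can compute it. Below is a small sketch in plain Python that computes the 24-hour cutoff with datetime and timedelta and counts recent entries per contact. The function count_recent_by_contact and the (contact, entry_time) tuples are made up for illustration and stand in for rows fetched from the database. On the ORM side, appending .count() to the filter above would give the overall total.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone


def count_recent_by_contact(entries, now=None, hours=24):
    """Count entries per contact whose entry_time falls in the last `hours`.

    `entries` is an iterable of (contact, entry_time) pairs with
    timezone-aware datetimes.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(hours=hours)
    return Counter(
        contact for contact, entry_time in entries if entry_time >= cutoff
    )


# A fixed "now" keeps the example deterministic.
now = datetime(2020, 1, 2, 12, 0, tzinfo=timezone.utc)
entries = [
    ("555-0100", now - timedelta(hours=1)),
    ("555-0100", now - timedelta(hours=23)),
    ("555-0199", now - timedelta(hours=30)),  # too old, not counted
]
print(count_recent_by_contact(entries, now=now))
```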
I have the following models:
    class Deal(models.Model):
        date = models.DateTimeField(auto_now_add=True)
        retailer = models.ForeignKey(Retailer, related_name='deals')
        description = models.CharField(max_length=255)
        ...etc

    class CustomerProfile(models.Model):
        saved_deals = models.ManyToManyField(Deal, related_name='saved_by_customers', null=True, blank=True)
        dismissed_deals = models.ManyToManyField(Deal, related_name='dismissed_by_customers', null=True, blank=True)
What I want to do is retrieve deals for a customer, but I don't want to include deals that they have dismissed.
I'm having trouble wrapping my head around the many-to-many relationship and am having no luck figuring out how to do this query. I'm assuming I should use an exclude on Deal.objects() but all the examples I see for exclude are excluding one item, not what amounts to multiple items.
When I naively tried just:
deals = Deal.objects.exclude(customer.saved_deals).all()
I get the error: "'ManyRelatedManager' object is not iterable"
If I say:
deals = Deal.objects.exclude(customer.saved_deals.all()).all()
I get "Too many values to unpack" (though I feel I should note there are only 5 deals and 2 customers in the database right now)
Our client expects to have thousands of customers and tens of thousands of deals in the future, so I'd like to stay as performance oriented as I can. If this setup is incorrect, I'd love to know a better way.
Also, I am running Django 1.5, as this is deployed on App Engine (using CloudSQL).
Where am I going wrong?
I suggest using customer.saved_deals to get the list of deal ids to exclude (use values_list to quickly convert it to a flat list).
This should save you from excluding by a field in a joined table.
    deals = Deal.objects.exclude(id__in=customer.saved_deals.values_list('id', flat=True))
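An id__in exclusion of this kind corresponds roughly to a NOT IN subquery. As a sketch of the idea only, here it is with the standard-library sqlite3 module, applied to dismissed deals as the question asks. The table names (deal, dismissed_deals), the sample rows, and the helper deals_for are assumptions for illustration and are not the SQL Django actually emits.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE deal (id INTEGER PRIMARY KEY, description TEXT);
    -- join table of the many-to-many relation: one row per dismissal
    CREATE TABLE dismissed_deals (customer_id INTEGER, deal_id INTEGER);
""")
conn.executemany(
    "INSERT INTO deal VALUES (?, ?)",
    [(1, "10% off"), (2, "2 for 1"), (3, "Free shipping")],
)
conn.executemany("INSERT INTO dismissed_deals VALUES (?, ?)", [(7, 2), (8, 3)])


def deals_for(conn, customer_id):
    """Return all deals except those dismissed by the given customer."""
    return conn.execute(
        """
        SELECT id, description FROM deal
        WHERE id NOT IN (
            SELECT deal_id FROM dismissed_deals WHERE customer_id = ?
        )
        ORDER BY id
        """,
        (customer_id,),
    ).fetchall()


print(deals_for(conn, 7))
```

The database does the filtering in one statement, so the cost does not depend on pulling the excluded ids into Python first.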
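You'd want to change this:
    deals = Deal.objects.exclude(customer.saved_deals).all()
To something like this:
    deals = Deal.objects.exclude(saved_by_customers__id__in=[1,2,etc..]).all()
Basically, customer.saved_deals is a many-to-many related manager rather than a lookup condition, so you can't pass it directly to exclude(). Filter on the reverse relation from Deal (saved_by_customers, per the related_name in the question) instead.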
Deals saved and deals dismissed are two fields describing almost the same thing. There is also a risk that too many columns end up in the database if both fields are allowed to store null values. It's worth considering removing dismissed_deals altogether and keeping only a saved flag that is either True or False.
Another thing to think about is moving saved_deals out of the CustomerProfile class and into the Deal class. Being saved is a property of a deal, so it arguably belongs on Deal.
    class Deal(models.Model):
        saved = models.BooleanField()
        ...
A real deal would be made by one customer or buyer rather than several, while a real customer can have millions of deals, so giving each deal a foreign key to its customer would be a good way to model this.
    class Deal(models.Model):
        saved = models.BooleanField()
        customer = models.ForeignKey(CustomerProfile)
        ....
What I want to do is retrieve deals for a customer, but I don't want to include deals that they have dismissed.
    deals_for_customer = Deal.objects.filter(customer__name="John")
The double underscore between customer and name (customer__name) lets you filter across the relation: customer is the foreign key to the CustomerProfile model, and name is a field on that model (assuming the CustomerProfile class has a name attribute).
    deals_saved = deals_for_customer.filter(saved=True)
That's it. I hope I could help. Let me know if not.
I am trying to figure out how to solve this problem, without any luck. An Author has many books divided by genre, and when I query an author I would like to get back the author together with book objects grouped by genre.
The Author object would have these properties:
name
fantasy - would have one book based on a given date
crime - would have one book based on a given date
romance - would have one book based on a given date
Is there a sane way to achieve this without adding thousands of foreign keys (if I had that many genres) to the Author model?
    class Author(models.Model):
        name = models.CharField(u'Name', max_length=100)

    GENRE = (
        (0, u'Fantasy'),
        (1, u'Crime'),
        (2, u'Romance')
    )

    class Book(models.Model):
        author = models.ForeignKey(Author)
        name = models.CharField(u'Name', max_length=100)
        genre = models.SmallIntegerField(u'Genre', choices=GENRE)
        date = models.DateField(u'Publish date')
EDIT:
After closer inspection, sgarza62's example seems to perform badly with a large amount of data.
So I tried the new Django 1.7 feature, Prefetch:
    authors = Author.objects.all().prefetch_related(
        Prefetch("book", queryset=Book.objects.filter(genre=0,date_from__lte=datetime.datetime.now()), to_attr='x_fantasy'),
        Prefetch("book", queryset=Book.objects.filter(genre=1,date_from__lte=datetime.datetime.now()), to_attr='x_crime'),
        Prefetch("book", queryset=Book.objects.filter(genre=2,date_from__lte=datetime.datetime.now()), to_attr='x_romance')
    )
But I have two issues with this: how to prefetch only one object (the latest book in this example), and how to apply ordering based on the prefetched values.
If you're querying all or several authors, I recommend prefetching related fields. This fetches all related objects in one additional query and stores them in the Queryset.
    authors = Author.objects.all().prefetch_related('book_set')
    for author in authors:
        # accessing the related field will not cause a hit to the db,
        # because the values are cached in the Queryset
        for book in author.book_set.all():
            print(book.genre)
If you're only querying one author, then it's not such a big deal.
    author = Author.objects.get(pk=1)
    her_books = author.book_set.all()
    for book in her_books:
        print(book.genre)
Edit
I'm having a bit of trouble understanding exactly what you're going to do. But, if you're looking for the latest book of each genre, for a given author:
    author = Author.objects.get(pk=1)
    author_books = author.book_set.order_by('-date')  # most recent first
    author_genres = set(b.genre for b in author_books)
    for g in author_genres:
        # first match in the date-ordered list is the latest book of genre g
        print(next((b for b in author_books if b.genre == g), None))
Keep in mind that these operations are all on the Queryset, and are not hitting the database each time. This is good, because querying the database is an expensive operation, and most authors have a relatively small list of works, so the Querysets will generally be small.
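The same "latest book per genre" grouping can also be done in a single pass over the books, without needing them sorted first. This is a plain-Python sketch with hypothetical (name, genre, publish_date) tuples standing in for Book instances; the helper latest_per_genre is not part of Django.

```python
from datetime import date


def latest_per_genre(books):
    """Map each genre to the name of its most recently published book.

    `books` is any iterable of (name, genre, publish_date) tuples.
    """
    latest = {}
    for name, genre, published in books:
        current = latest.get(genre)
        # Keep the book if the genre is new or this one is more recent.
        if current is None or published > current[1]:
            latest[genre] = (name, published)
    return {genre: name for genre, (name, _) in latest.items()}


books = [
    ("Old Dragons", 0, date(2010, 5, 1)),
    ("New Dragons", 0, date(2014, 3, 9)),
    ("Cold Case", 1, date(2012, 8, 20)),
]
print(latest_per_genre(books))
```

Each book is visited once, whereas the loop above rescans the list for every genre; for a single author with few books the difference is unlikely to matter.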
Previously had a go at asking a more specific version of this question, but had trouble articulating what my question was. On reflection that made me doubt if my chosen solution was correct for the problem, so this time I will explain the problem and ask if a) I am on the right track and b) if there is a way around my current brick wall.
I am currently building a web interface to enable an existing database to be interrogated by (a small number of) users. Sticking with the analogy from the docs, I have models that look something like this:
    class Musician(models.Model):
        first_name = models.CharField(max_length=50)
        last_name = models.CharField(max_length=50)
        dob = models.DateField()

    class Album(models.Model):
        artist = models.ForeignKey(Musician)
        name = models.CharField(max_length=100)

    class Instrument(models.Model):
        artist = models.ForeignKey(Musician)
        name = models.CharField(max_length=100)
Here I have one central table (Musician) and several tables of associated data that are related by either ForeignKey or OneToOneField. Users interact with the database by creating filtering criteria to select a subset of Musicians based on the data in the main or related tables. Likewise, the users can then select which piece of data is used to rank the results that are presented to them. The results are then viewed initially as a two-dimensional table with a single row per Musician and selected data fields (or aggregates) in each column.
To give you some idea of scale, the database has ~5,000 Musicians with around 20 fields of related data.
Up to here is fine and I have a working implementation. However, it is important that a given user can upload their own annotation data sets (more than one) and then filter and order on these in the same way they can with the existing data.
The way I had tried to do this was to add the models:
    class UserDataSets(models.Model):
        user = models.ForeignKey(User)
        name = models.CharField(max_length=100)
        description = models.CharField(max_length=64)
        results = models.ManyToManyField(Musician, through='UserData')

    class UserData(models.Model):
        artist = models.ForeignKey(Musician)
        dataset = models.ForeignKey(UserDataSets)
        score = models.IntegerField()

        class Meta:
            unique_together = (("artist", "dataset"),)
I have a simple upload mechanism enabling users to upload a data set file that consists of 1 to 1 relationship between a Musician and their "score". Within a given user dataset each artist will be unique, but different datasets are independent from each other and will often contain entries for the same musician.
This worked fine for displaying the data, starting from a given artist I can do something like this:
    artist = Musician.objects.get(pk=1)
    dataset = UserDataSets.objects.get(pk=5)
    print(artist.userdata_set.get(dataset=dataset.pk))
However, this approach fell over when I came to implement the filtering and ordering of query set of musicians based on the data contained in a single user data set. For example, I could easily order the query set based on all of the data in the UserData table like this:
    artists = Musician.objects.all().order_by('userdata__score')
But that does not help me order by the results of a single given user dataset. Likewise, I need to be able to filter the query set based on the "scores" from different user data sets (e.g. find all musicians with a score > 5 in dataset1 and < 2 in dataset2).
Is there a way of doing this, or am I going about the whole thing wrong?
edit: nevermind, it's wrong. I'll keep it so you can read, but then I'll delete afterward.
Hi,
If I understand correctly, you can try something like this:
    artists = Musician.objects.select_related('UserDataSets').filter(Q(userdata__score__gt=5, userdata__id=1) | Q(userdata__score__lt=2, userdata__id=2))
For more info on how to use Q, check this: Complex lookups with Q objects.
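For comparison, the requirement "score > 5 in dataset 1 and < 2 in dataset 2" can be expressed in plain SQL by joining the score table once per dataset, and ordering by a single dataset can be done with a LEFT JOIN restricted to that dataset. The sketch below uses the standard-library sqlite3 module; the table names, sample rows, and helper functions are assumptions for illustration. In the ORM, the documented way to let each condition match a different related row is to chain one filter() call per dataset, which is not exercised here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE musician (id INTEGER PRIMARY KEY, last_name TEXT);
    CREATE TABLE userdata (
        artist_id INTEGER, dataset_id INTEGER, score INTEGER,
        UNIQUE (artist_id, dataset_id)
    );
""")
conn.executemany("INSERT INTO musician VALUES (?, ?)",
                 [(1, "Davis"), (2, "Monk"), (3, "Evans")])
conn.executemany("INSERT INTO userdata VALUES (?, ?, ?)", [
    (1, 1, 9), (1, 2, 1),   # high in dataset 1, low in dataset 2
    (2, 1, 8), (2, 2, 7),   # high in both datasets
    (3, 1, 6),              # no entry in dataset 2
])


def filter_by_two_datasets(conn):
    """Musicians with score > 5 in dataset 1 and score < 2 in dataset 2."""
    return [name for (name,) in conn.execute("""
        SELECT m.last_name
        FROM musician AS m
        JOIN userdata AS d1 ON d1.artist_id = m.id AND d1.dataset_id = 1
        JOIN userdata AS d2 ON d2.artist_id = m.id AND d2.dataset_id = 2
        WHERE d1.score > 5 AND d2.score < 2
        ORDER BY m.last_name
    """)]


def order_by_dataset(conn, dataset_id):
    """All musicians, best score in one dataset first, unscored last."""
    return [name for (name,) in conn.execute("""
        SELECT m.last_name
        FROM musician AS m
        LEFT JOIN userdata AS d
            ON d.artist_id = m.id AND d.dataset_id = ?
        ORDER BY d.score IS NULL, d.score DESC
    """, (dataset_id,))]


print(filter_by_two_datasets(conn))
print(order_by_dataset(conn, 2))
```

Each alias (d1, d2) is pinned to one dataset, which is what keeps a score from one dataset from satisfying a condition meant for the other.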