Model-based caching instead of view-based caching in Django

I am working on a Django application. Its main task is to provide a suggestion like "Should I go outside today?". There is only a single endpoint to get the suggestion, such as example.com/.
The main logic for providing the suggestion is:
Does the user have any pending tasks today? (queried from UserTaskModel)
Is today's weather comfortable? (calculated from the weather forecast)
If two users fetch the suggestion on the same date, their UserTask queries will differ, but the weather forecast query will be the same. With view-based Django caching, the weather forecast query would be executed for each user, whereas I want to cache the weather data once per date for all users. This could be done by creating a separate view for the weather, but I don't want to expose another endpoint just for the weather.
Django's cache set/get methods could be used for this, but is that the best way to handle this type of task? In my example the weather calculation is a simple query that depends only on the date; is this technique also good for more complex queries?

As you said, cache set/get is your solution, but note these two things:
Assuming you want to cache the weather for each day, set the expiry time to at least 24 hours so the cache doesn't expire too soon.
Your cache key should encode the date, something like weather_2019_09_22.
I think creating a utility class works well, something like this (calculate_forecast stands in for your own forecast calculation):
from datetime import date
from django.core.cache import cache

class WeatherCache:
    def get(self):
        key = f"weather_{date.today():%Y_%m_%d}"  # one cache entry per calendar day
        forecast = cache.get(key)
        if forecast is None:
            forecast = calculate_forecast(date.today())  # your forecast calculation
            cache.set(key, forecast, timeout=60 * 60 * 24)  # keep for 24 hours
        return forecast
Another idea is simply to create a model and store the forecasts there. The advantage is that you keep a history of forecasts, which may be useful for later queries (and the table won't grow very large, so you don't need to worry about its size).
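If you go that route, a minimal sketch could look like this (DailyForecast and calculate_forecast are assumed names; the forecast is stored as JSON):
from datetime import date
from django.db import models

class DailyForecast(models.Model):
    date = models.DateField(unique=True)  # one row per calendar day
    data = models.JSONField()             # whatever your forecast calculation returns

def forecast_for_today():
    today = date.today()
    row = DailyForecast.objects.filter(date=today).first()
    if row is None:
        # calculate only when today's forecast isn't stored yet
        row = DailyForecast.objects.create(date=today, data=calculate_forecast(today))
    return row.data
This way the forecast is calculated at most once per day, and you keep the history you mentioned as a side effect.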

Related

DynamoDB - UUID and avoiding a full table scan

This is my use case:
I have a JSON API with 200k objects. The dataset looks a little something like this: date, bike model, production time in minutes. I use Lambda to read from the JSON API and write to DynamoDB via HTTP requests. The Lambda function runs every day and updates DynamoDB with the most recent data.
I then retrieve the data by date since I want to calculate the average production time for each day and put it in a second table. An Alexa skill is connected to the second table and reads out the average value for each day.
First question: Since the same bike model is produced multiple times per day, using a composite primary key with date and bike model won't give me a unique key. Shall I create a UUID for the entries instead? Or is there a better solution?
Second question: For the calculation I would need to do a full table scan each time, which is very costly and advised against by many. How can I solve this problem without doing a full table scan?
Third question: Is it better to avoid DynamoDB altogether for my use case? Which AWS database is more suitable for my use case then?
Yes, a UUID or any other unique identifier (e.g. date + bike model + creation time) as the primary key is fine.
It seems your daily average-value job is a data analytics job rather than a transactional one. I would suggest going with a service that supports data analytics, such as Amazon Redshift. You should be able to feed data into such a service using DynamoDB Streams. Alternatively, you can stream the data into S3 and use a service like Athena to get the daily average.
There is a simple database model that you could use for this task:
PartitionKey: a UUID, or any combination of fields that provides uniqueness.
SortKey: the production date, as a string, e.g. 2020-07-28
If you then create a secondary index that uses the production date as its partition key and projects the production time, you can query (not scan) that index for a specific date and perform whatever calculations you need on the production times, as sketched below. You can also provision read/write capacity for the secondary index and the base table independently.
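A rough sketch of that query with boto3 (the table name bike_production, the index name by_date, and the attribute names are assumptions, not something from your setup):
import boto3
from boto3.dynamodb.conditions import Key

table = boto3.resource("dynamodb").Table("bike_production")  # assumed table name

def average_production_time(day: str) -> float:
    # Query the date-keyed secondary index instead of scanning the whole table.
    resp = table.query(
        IndexName="by_date",  # assumed GSI: partition key = production date
        KeyConditionExpression=Key("production_date").eq(day),
        ProjectionExpression="production_minutes",
    )
    items = resp["Items"]
    if not items:
        return 0.0
    # DynamoDB returns numbers as Decimal, so convert before averaging
    return sum(float(i["production_minutes"]) for i in items) / len(items)
For a day with more than 1 MB of matching items you would also follow LastEvaluatedKey to page through the results.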
Regarding your third question, I don't see any real benefit to using DynamoDB for this task. Any RDS engine (e.g. MySQL), Redshift, or even S3 + Athena can easily handle such a use case. If you require real-time analytics, you could even consider Amazon Kinesis.

Conditional Relationships between Two Tables in Django?

The following image shows a rough draft of my proposed database structure that I will develop for Django. Briefly, I have a list of ocean Buoys, which have child tables of their forecast conditions and observed conditions. I'd like Users to be able to make a log of their surf sessions (surfLogs table) in which they input their location, the time of the surf session, and their own rating.
I'd like the program to then look in the buoysConditions table for the buoy nearest the user's logged location and time and attach the relevant buoyConditions to the surfLog entry. This will allow the user to keep track of which conditions work best for them (and also eventually lets the program create notifications for the user automatically).
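For concreteness, this is roughly the structure I have in mind (only a sketch; the field names are a draft):
from django.conf import settings
from django.db import models

class Buoy(models.Model):
    name = models.CharField(max_length=100)
    latitude = models.FloatField()
    longitude = models.FloatField()

class BuoyCondition(models.Model):
    buoy = models.ForeignKey(Buoy, on_delete=models.CASCADE, related_name="conditions")
    observed_at = models.DateTimeField()
    wave_height = models.FloatField()

class SurfLog(models.Model):
    user = models.ForeignKey(settings.AUTH_USER_MODEL, on_delete=models.CASCADE)
    latitude = models.FloatField()
    longitude = models.FloatField()
    session_time = models.DateTimeField()
    rating = models.IntegerField()
    # rather than copying buoy data into surfLogs, link to the matched conditions row
    conditions = models.ForeignKey(
        BuoyCondition, null=True, blank=True, on_delete=models.SET_NULL
    )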
I don't know what the name for this process of joining the tables is, so I'm having some trouble finding documentation on it. I think in SQL it's termed a join or update. How is this accomplished with Django?
Thanks!

Django, each user having their own table of a model

A little background: I've been developing the core code of an application in Python, and now I want to implement it as a website, so I've been learning Django. I've come across a problem and am not sure where to go with it. I also have little experience with databases.
Each user would be able to populate their own list, each with the same attributes. The obvious solution is to create a single model defining the attributes, have users save records to it, and use filters on the user ID to narrow results down to each user. The values of those records would also change very frequently (maybe every 5-10 seconds or so). Each user would add on average 4,000 records to this model, so for just 1,000 users the table would have 4 million rows, and for 10,000 users 40 million rows. To me this seems like it would hurt the speed of content delivery a lot.
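Concretely, the single shared model would be something like this (Record and its attributes are placeholder names):
from django.conf import settings
from django.db import models

class Record(models.Model):
    # every user's rows live in the same table; the foreign key is indexed by default
    user = models.ForeignKey(settings.AUTH_USER_MODEL, on_delete=models.CASCADE)
    value = models.FloatField()               # placeholder for the shared attributes
    updated = models.DateTimeField(auto_now=True)

def records_for(user):
    # per-user access always filters on the indexed user column
    return Record.objects.filter(user=user)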
To me a faster solution would be to define the model and then give each user their own instance of this table of roughly 4,000 records. From what I'm learning this would use more memory and disk space, but a faster user experience is my primary goal.
Is this just my thinking because I don't have experience with databases? Or are my concerns warranted, and should I find a way to do the latter?
This post asked the same question, I believe, but offered no solution on how to achieve it: How to create one Model (table) for each user on django?

How to know when the database has changed

I have a project that is essentially a simple shopping site selling different kinds of products. For example, I have these models: Brand, Product, Consignment. Consignment is linked to Product, and Product is linked to Brand. To reduce the number of database queries, I want to save the current state of these models (or at least some of them). I want this because I show a sidebar with brands and products, so every time a user opens a page, a query is executed against the database to get those brands and products.
But when an admin adds a new product or brand, I want to detect that the database changed and refresh the saved data. How can I implement this?
The answer is to use a cache. A cache stores your objects temporarily in memory or in another service such as Redis, so that you don't need to send queries to the database. You can read the full description here.
Or, you can use this third-party library, which helps you cache Django ORM models. Here is an example:
Brand.objects.filter(name='stackoverflow').cache()
After updating the model, you need to clear or invalidate the cache:
from cacheops import invalidate_model  # assuming the library above is django-cacheops

invalidate_model(Brand)
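If you would rather stay with Django's built-in cache framework, another option is a signal that drops the cached sidebar data whenever a Brand changes (a sketch; myapp and the sidebar_brands key are example names):
from django.core.cache import cache
from django.db.models.signals import post_delete, post_save
from django.dispatch import receiver

from myapp.models import Brand  # your app's model

@receiver([post_save, post_delete], sender=Brand)
def invalidate_sidebar(sender, **kwargs):
    # any create, update, or delete on Brand discards the cached sidebar data,
    # so the next page view rebuilds it with a fresh query
    cache.delete("sidebar_brands")  # example cache key used by the sidebar view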

Doctrine2 and database-side triggers for denormalized fields

Let's say I have two tables: Category and Product, and Product links to Category with a foreign key Product.categoryId == Category.id. I would like my database server to take care of counting the number of products within a category using a denormalized field Category.productCount; triggers will update this count on any update/delete/insert, so I don't have to worry about it. Is there a way to synchronize database-side triggers with Doctrine2 entities somehow? I really don't want to recalculate those counters on the PHP side, as we are going to run this on multiple servers.
If I understand the question, you want to be able to add a new product to a category, persist it, and then have Category.productCount update itself from the database? You can use
$entityManager->refresh($category);
To reload an entity from the database. I have not done it myself, but I would expect that you could use the lifecycle-event functionality to automate this.
But I do kind of wonder if it might not be better to just increment the counter locally without persisting it to the database. Let your trigger do the database operation but, within the request, update the count locally.