Django Cache + Django Database request - django

I'm building a Django web application that lets users select a photo from their computer and add it to their timeline. The timeline shows 10 photos initially, and a pull-to-refresh fetches the next 10.
My first question: I can upload images, which are stored on the file system, but how do I show only the first 10 and then fetch the next 10 on each pull-to-refresh, and so on?
Next, I want the app to feel fast, so I'm considering caching, but I'm not sure what to cache. Django offers several cache backends, including database caching, Memcached, and file system caching.
My second question: should I cache the first 10 photos of each user, or something else?
Kindly answer with your suggestions.

So my first question is: I'm able to upload images, which get stored on the file system, but how do I show only the first 10 and then pull to refresh to fetch the next 10, and so on?
Fetch the first 10 with your initial query, then fetch subsequent photos in chronological order. You should have a timestamp for each photo upload; order and fetch the images by it. Django's Paginator can handle this.
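The timestamp-based fetching described above can be sketched in plain Python. This is a minimal illustration, not the asker's code: the `photos` list, the `uploaded_at` field, and `fetch_page` are hypothetical names, and in a Django app the same idea would normally be an ordered queryset sliced to 10 rows or wrapped in a Paginator.

```python
from datetime import datetime, timedelta

PAGE_SIZE = 10  # the timeline shows 10 photos per fetch


def fetch_page(photos, before=None, limit=PAGE_SIZE):
    """Return up to `limit` photos, newest first, strictly older than `before`."""
    newest_first = sorted(photos, key=lambda p: p["uploaded_at"], reverse=True)
    if before is not None:
        newest_first = [p for p in newest_first if p["uploaded_at"] < before]
    return newest_first[:limit]


# Hypothetical data: 25 photos uploaded one minute apart.
start = datetime(2024, 1, 1, 12, 0)
photos = [{"id": i, "uploaded_at": start + timedelta(minutes=i)} for i in range(25)]

# Initial load, then one pull-to-refresh using the oldest timestamp already shown.
first = fetch_page(photos)
second = fetch_page(photos, before=first[-1]["uploaded_at"])
```

The client only needs to remember the timestamp of the last photo it displayed and send it with the next request.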
what do I cache
Whatever data you show to users frequently and that won't change right away. You can cache per user or for all users; choose what to cache accordingly.
should I cache the first 10 photos of each user or something else
That depends on you. Are those first pictures common to all users? Then caching them makes sense. If the pictures are user-dependent, there is little point in caching them: each user has to fetch the first images anyway, and I highly doubt a user will keep requesting the same first 10 photos frequently. Again, it's your logic. If you think caching will help, go ahead and cache.

The DiskCache project was first created for a similar problem (caching images). It includes a couple of features that will help you to cache and serve images efficiently. DiskCache is an Apache2 licensed disk and file backed cache library, written in pure-Python, and compatible with Django.
diskcache.DjangoCache provides a Django-compatible cache interface with a few extra features. In particular, the get and set methods permit reading and writing files. An example:
from django.core.cache import cache
with open('filename.jpg', 'rb') as reader:
    cache.set('filename.jpg', reader, read=True)
Later you can get a reference to the file:
reader = cache.get('filename.jpg', read=True)
If you simply wanted the name of the file on disk (in the cache):
try:
    with cache.get('filename.jpg', read=True) as reader:
        filename = reader.name
except (AttributeError, TypeError):
    filename = None
The code above requests a file from the cache. If there is no such value, cache.get returns None. None is not a context manager, so the with statement raises an exception (AttributeError on older Python versions, TypeError since Python 3.11). The exception is caught and filename is set to None.
With the filename, you can use something like X-Accel-Redirect to tell Nginx to serve the file directly from disk.

Related

what is the best method to initialize or store a lookup dictionary that will be used in django views

I'm reviving an old Django 1.2 app. Most of the steps have been taken.
I have views in my Django app that reference a simple dictionary of only about 1,300 key-value pairs.
Basically, a view will query the dictionary a few hundred to a few thousand times for user-supplied values. The dictionary data may change twice a year or so.
FWIW: Django served by gunicorn, db = Postgres, Apache as proxy, no Redis available yet on the server.
I thought of a few options here:
- A table in the database that will be queried, letting caching do its job (at the expense of a few hundred SQL queries).
- Simply define the dictionary in the settings file (ugly, and how many times is it read? Every time you do a 'from django.conf import settings'?). This is how it was coded in the Django 1.2 predecessor of this app many years ago.
- Read a tab-delimited file using pandas in the Django settings and make it available. The advantage is that I can do some pandas magic in the view. (How efficient is this? Will the file be read many times for different users, or just once during server startup?)
- Prepopulate a Redis cache from a file as part of the startup process (this complicates things on the server side, and we want it to be simple, but it's fast).
- List the items in a tab-delimited file and read it in the view (my least favorite option, since it seems rather slow).
What are your thoughts on this? Any other options?
Let me give a few, from simple to more involved:
- Hold it in memory
- Basic flat file
- SQLite file
- Redis
- DB
I wouldn't bring Redis in for 1,300 key-value pairs that don't even get mutated all that much.
I would put a file alongside the code that gets slurped into memory at startup, or do a single SQL query at startup to grab the entire thing, and keep it in memory for use throughout the application.
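A minimal sketch of the "slurp it into memory once" suggestion, using only the standard library. The sample data, `parse_lookup`, and `get_lookup` are hypothetical names for illustration; a real app would open the tab-delimited file that ships next to the code instead of the in-memory string used here.

```python
import csv
import io
from functools import lru_cache

# Hypothetical stand-in for a tab-delimited file shipped next to the code.
SAMPLE_TSV = "A1\talpha\nB2\tbeta\nC3\tgamma\n"


def parse_lookup(stream):
    """Build a dict from a stream of tab-delimited key/value lines."""
    reader = csv.reader(stream, delimiter="\t")
    return {row[0]: row[1] for row in reader if len(row) >= 2}


@lru_cache(maxsize=1)
def get_lookup():
    """Parse the data on first use, then reuse the same dict in this process."""
    # Assumption: a real app would use open(path, newline="") here.
    return parse_lookup(io.StringIO(SAMPLE_TSV))


# A view can call get_lookup() thousands of times; parsing happens only once.
value = get_lookup().get("B2")
```

Each gunicorn worker process would hold its own copy of the dictionary, which is negligible at this size. After the twice-yearly data change, restarting the workers (or calling `get_lookup.cache_clear()`) picks up the new file.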

Django redis caching per url

I am trying to implement, and moreover learn, how to cache Django views per URL. I am able to do so, and here is what happens:
I visit a URL for the first time and Django sets the cache.
On the second visit, if the browser is the same, I get my result from the cache instead of from a database query.
Now my doubt: if I change browsers between the first and second visits, for example making the first visit from Chrome (which sets the cache) and the second from Mozilla, the cache is set again. I was expecting the result to come from the cache.
During my research on StackOverflow, and from checking what gets stored in the cache, I found there are two important parts: a header entry and the content. I think that every time the browser changes the headers are new, so Django sets the cache instead of returning the cached result. Do let me know if I am wrong.
I have a public URL, and I would like to serve data from the cache for any subsequent request, irrespective of browser or device (mobile/laptop/desktop), based only on the URL. Is that possible?
(I was thinking that if someone in the north of the country visits a URL, a subsequent visit to the same URL from the south of the country should get data from the cache, subject to my cache expiry time.)
Also if my understanding is wrong please correct me.
I am learning to cache using Redis on Django.
So I manually set a key for some of my public URLs (views), adjust the cache on create and delete, and in get-list I check for the key in the cache: I return the cached result if it is present, and if the entry has timed out or is unavailable I get the result from the database. Somehow the response time for this is a little slower than with cache_page(), Django's built-in decorator, and I don't know why. Is there an explanation, or is my approach correct?

Django multiple admin modifying the same databases

I'm a total noob in Django and was just wondering whether it's possible for admins to do the same thing at the same time. The only thing I got from the Django documentation is that it is possible to have two admins, but is it possible for the admins to work on the same database at the same time?
Thanks for any help.
You didn't make clear what you actually want, but:
If by admin you mean a superuser, then yes, you can have as many admins as you want.
Admins can change anything in the database at the same time, but if you mean changing a specific row of a specific table at the same time, that's not possible, for these reasons:
It's practically impossible to save something at exactly the same time. When both admins try to save, the last request wins (the first one is saved too, but it is then overwritten by the last).
If the database holds important data, you should block any other access to that row until the first user has finished and saved the changes. (Imagine a ticket reservation website, which has to block other users from ordering the same ticket number until the first user finishes or cancels the order.)
Also, if you mean two different Django projects using a single database, then that's another yes. They behave like two different admins, and all the conditions above apply to them too.
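The ticket reservation example above, where a second user is blocked until the first has finished, can be illustrated with a small single-process sketch. The names `reserve` and `reservations` are hypothetical, and a `threading.Lock` stands in for the database: with a real database the blocking is usually done with row locks inside a transaction (in Django, for example, `select_for_update()`).

```python
import threading

_lock = threading.Lock()
reservations = {}  # ticket number -> user who holds it


def reserve(ticket_number, user):
    """Reserve a ticket; only one caller at a time may check and write."""
    with _lock:
        if ticket_number in reservations:
            return False  # someone else already holds this ticket
        reservations[ticket_number] = user
        return True


# Two admins try to reserve the same ticket at roughly the same time.
results = []
threads = [
    threading.Thread(target=lambda u=u: results.append(reserve(42, u)))
    for u in ("admin_a", "admin_b")
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Exactly one of the two calls succeeds; without the lock, both could pass the check and the later write would silently overwrite the earlier one.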

Access to pandas dataframe object between requests via session key

I have a pandas dataframe with a loose wrapper class around it that provides metadata for my Django/DRF application. The application is basically a user-friendly (non-programmer) way to do some data analysis and validation. Between requests I want to be able to save the state of the dataframe so I can have a series of interactions with the data, but it does not need to be saved in a database (it only needs to survive as long as the browser session). From this it was logical to check out Django's session framework, but from what I've heard session data should be lightweight, and the dataframe object does not JSON-serialize.
Because I don't have a ton of users, and I want the app to feel like a desktop application, I was thinking of using the Django cache as a way to keep the dataframe object in memory. Putting the data in the cache would go something like this:
>>> from django.core.cache import caches
>>> cache1 = caches['default']
>>> cache1.set(request.session.session_key, dataframe_object)
and then the same, except using get, in the following requests to access it.
Is this a good way to handle this workflow, or is there another system I should use to keep rather large data (5 MB to 100 MB) in memory?
If you are running your application on a modern server, then 100 MB is not a huge amount of memory. However, if you have more than a couple dozen simultaneous users, each requiring 100 MB of cache, this could add up to more memory than your server can handle. Your cache and server should be configured appropriately, and you may want to limit the total number of cached dataframes in your Python code.
Since it does appear that Django needs to serialize session data, your choice is either to use sessions with PickleSerializer or to use the cache. According to the documentation, PickleSerializer is not recommended for security reasons, so your choice to use the cache is a good one.
The default cache backend in Django (local memory) does not share entries across processes, so you would get better memory and time efficiency by installing memcached and enabling a Memcached backend (memcached.MemcachedCache at the time of writing; newer Django releases provide PyMemcacheCache and PyLibMCCache instead). Note that memcached limits items to 1 MB by default, so its item size limit would need to be raised for dataframes of this size.
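One way to limit the total number of cached dataframes, as suggested above, is a small least-recently-used store. The sketch below is a hypothetical, single-process illustration using the standard library; `BoundedStore` and the string values are placeholders, and in the real app the values would be the dataframe wrapper objects keyed by session key.

```python
from collections import OrderedDict


class BoundedStore:
    """Keep at most `max_items` objects, evicting the least recently used."""

    def __init__(self, max_items):
        self.max_items = max_items
        self._items = OrderedDict()

    def set(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)  # mark as most recently used
        while len(self._items) > self.max_items:
            self._items.popitem(last=False)  # drop the oldest entry

    def get(self, key, default=None):
        if key not in self._items:
            return default
        self._items.move_to_end(key)  # reading also counts as recent use
        return self._items[key]


store = BoundedStore(max_items=2)
store.set("session-a", "dataframe A")
store.set("session-b", "dataframe B")
store.get("session-a")  # touch A so B becomes the oldest
store.set("session-c", "dataframe C")  # evicts session-b
```

A user whose dataframe was evicted would have to reload it, so `max_items` should be chosen with the expected number of simultaneous users and the available memory in mind.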

Storing large data on the server

I am using Django to write a website that conducts a user study. For each user, I need to load a large amount of data in RAM, and let that data be accessible throughout this particular user's time on the website. When the user leaves the website, this data can be discarded. When the next user visits the website, a new set of data will be loaded into RAM. The data is the same size, but of different value, for each user. A maximum of four users will be visiting the website at any one time. The data can be up to 100MB in size.
What is the best way to implement this? The only solution I can think of is to store the data as a session variable, but I'm wondering whether this involves any memory copying, which might be slow given that the data is large?
You shouldn't allocate RAM via Django. If you have heavy processes to run, run them asynchronously; you probably need Celery:
https://pypi.python.org/pypi/django-celery
http://www.celeryproject.org/
First do your "machine learning calculations based on the user's input" in a Django command. Then you can check with Celery when to run it...
The workflow would be:
- user enters some data in a form
- user submits it: that saves a record in the database
- the command is automatically run afterwards using that record