Using sorl-thumbnail with MongoDB storage - django

I've extended sorl-thumbnail's KVStoreBase class, and made a key-value backend that uses a single MongoDB collection.
This was done in order to avoid installing a discrete key-value store (e.g. Redis).
Should I clear the collection every once in a while?
What are the downsides?

Only clear the collection if low disk usage matters more to you than fast access times.
The downside is that, right after a clear, all of your users hit uncached thumbnails at the same time (and simultaneously trigger their recomputation).
Alternatively, just run python manage.py thumbnail cleanup
This cleans stale cache entries out of the key-value store. It removes references to images that no longer exist, together with the thumbnail references and thumbnail files that belonged to those images. It also removes thumbnails for unknown images.

Related

insert base64 strings in Dexiejs

I am building an Ionic 3 app and I want to set up an upload based on the ImagePicker Cordova plugin.
I use Dexie to persist some data, and I wonder whether persisting whole base64 strings would be alright, or whether that is too heavy.
I want to persist the images chosen with the image picker, so that when an upload is suspended or stopped I can restart the upload for those images.
Anybody using any other type of persistence of Base64 images?
Thank you
It depends on the size of the images. Unless the images are larger than 10 megabytes, I think you are safe. IndexedDB has no direct limit on document size; the only limit is the quota you are given for the whole database instance, which varies per platform and can be extended on modern platforms using navigator.storage.persist(). Do not index the property that contains the large string, though, since that would hurt performance badly and could eventually trigger hard-to-diagnose bugs.
If you target modern platforms (Chromium, Firefox and Safari 10.1), you don't need to convert the images to base64. Instead, you can store the binary data directly in a property of type Uint8Array.
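To put a rough number on "too heavy": base64 writes every group of 3 raw bytes as 4 characters, so the text form is about a third larger than the binary it encodes. The sketch below only illustrates that arithmetic in Python. The 3,000,000-byte buffer is a made-up stand-in for a photo, and base64_length is a helper name introduced here, not part of Dexie or IndexedDB.

```python
import base64


def base64_length(raw_size: int) -> int:
    """Length of padded base64 text for raw_size bytes (4 chars per 3-byte group)."""
    return 4 * ((raw_size + 2) // 3)


# Hypothetical stand-in for a picked photo: 3,000,000 zero bytes.
raw = bytes(3_000_000)
encoded = base64.b64encode(raw)

print("raw bytes:    ", len(raw))
print("base64 chars: ", len(encoded))
print("overhead:     ", len(encoded) / len(raw))
```

Storing the bytes directly, as suggested above, avoids this extra third as well as the encode and decode steps.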

what's the efficiency of qsqlite data base in Qt?

I am very new to databases and I am trying to implement an offline map viewer. How efficient is QSqlDatabase?
To take an extreme example: is it possible to download the satellite images of the US at every detail level from Google's map server, store them in a local SQLite database, and still run real-time queries based on my current GPS location?
The Qt database driver for SQLite uses SQLite internally (surprise!). So the question is really: is SQLite the right database to use? My answer: I would not use it to store geographical data; consider looking for a database that is optimized for this task.
If that is not an option, SQLite is really efficient. First check that your data is within its limits. Do not forget to create indexes and analyze the database. Then it should be able to handle your task. Here I assume you just want to get an image by its geographical position (other solutions can be a lot faster because your data is sortable and, if I remember correctly, SQLite is not optimized for that).
As you will store large blobs, you may want to have a look at the Internal Versus External BLOBs in SQLite document. Maybe it already gives you the answer.
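A minimal sketch of the "image by geographical position" lookup, written in Python with the standard sqlite3 module rather than Qt. Everything named here is an assumption made for illustration: the tiles table, its columns, and the OpenStreetMap-style Web Mercator tile numbering used to turn a GPS position into tile indices. The composite primary key gives SQLite the index the lookup needs.

```python
import math
import sqlite3
from typing import Optional, Tuple


def tile_for_position(lat_deg: float, lon_deg: float, zoom: int) -> Tuple[int, int]:
    """Map a GPS position to (x, y) tile indices (Web Mercator, OSM-style numbering)."""
    n = 2 ** zoom
    x = int((lon_deg + 180.0) / 360.0 * n)
    lat = math.radians(lat_deg)
    y = int((1.0 - math.asinh(math.tan(lat)) / math.pi) / 2.0 * n)
    # Clamp so positions on the map edge still land on a valid tile.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def open_tile_db(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    # The composite primary key doubles as the lookup index.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tiles ("
        " zoom INTEGER NOT NULL, x INTEGER NOT NULL, y INTEGER NOT NULL,"
        " image BLOB NOT NULL, PRIMARY KEY (zoom, x, y))"
    )
    return conn


def store_tile(conn: sqlite3.Connection, zoom: int, x: int, y: int, image: bytes) -> None:
    conn.execute("INSERT OR REPLACE INTO tiles VALUES (?, ?, ?, ?)", (zoom, x, y, image))
    conn.commit()


def load_tile_at(conn: sqlite3.Connection, lat_deg: float, lon_deg: float,
                 zoom: int) -> Optional[bytes]:
    """Return the stored image covering the position, or None if it is missing."""
    x, y = tile_for_position(lat_deg, lon_deg, zoom)
    row = conn.execute(
        "SELECT image FROM tiles WHERE zoom = ? AND x = ? AND y = ?", (zoom, x, y)
    ).fetchone()
    return row[0] if row else None


conn = open_tile_db()
store_tile(conn, 1, 1, 1, b"fake-image-bytes")
conn.execute("ANALYZE")  # refresh the query planner statistics after loading data
print(load_tile_at(conn, 0.0, 0.0, 1))
```

Each lookup is a single indexed read, so query time depends mostly on the depth of the index rather than on how many tiles are stored; whether the full data set fits within SQLite's limits is a separate question.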

Ways to handle thumbnails with Django?

I'm going to have different ways to present an object's image in my templates: full size, medium size and smaller thumbnails in lists. It's a dynamic structure and needs to be fast for sorting, searching and filtering.
As a beginner I'm thinking of three ways to handle this:
Simply use the image from the image field and change the size in templates with css.
Save different image versions (size) in different fields in the model and in media files.
Create the thumbnails dynamically in the templates with sorl-thumbnail or easy-thumbnails
The thumbnail apps are a bit complicated and would need some extra requirements like PIL, and I need to make some choices about caching. I'm not sure whether I gain much performance by going down this path, or whether there are other, smarter ways. Is it better to plan ahead for scaling and performance?
How are you handling thumbnails? And are you using redis or memcached?
First, neither Redis nor Memcached handles caching of images. Memcached is a simple key-value store. Redis essentially works as a key-value store too, but it also supports other types, such as lists. When it comes to caching images, you would use something like nginx.
Secondly, the first option is suboptimal if you want your page to load as quickly as possible, because the browser has to load a bigger file than necessary. The second and third options are essentially the same. easy-thumbnails, for example, doesn't create thumbnails on the fly in the template. It generates them as needed, and you can then access those thumbnails from your static_folder.
If you want to manipulate images, you will need PIL, or Pillow if you're using Python 3.
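The "generates them as needed" idea can be sketched as follows. This is not how easy-thumbnails or sorl-thumbnail is actually implemented; it is a minimal illustration under stated assumptions. thumbnail_path, get_thumbnail and the render callable are hypothetical names, and render is where a real Pillow resize would go (the demo uses a fake that just writes a marker file).

```python
import tempfile
from pathlib import Path
from typing import Callable, List, Tuple

Size = Tuple[int, int]
Renderer = Callable[[Path, Path, Size], None]


def thumbnail_path(source: Path, size: Size, cache_dir: Path) -> Path:
    """Derive a stable file name from the source name and the requested size."""
    width, height = size
    return cache_dir / f"{source.stem}_{width}x{height}{source.suffix}"


def get_thumbnail(source: Path, size: Size, cache_dir: Path, render: Renderer) -> Path:
    """Return the thumbnail path, rendering the file only if it does not exist yet."""
    target = thumbnail_path(source, size, cache_dir)
    if not target.exists():
        cache_dir.mkdir(parents=True, exist_ok=True)
        render(source, target, size)
    return target


# Demo with a fake renderer that records calls instead of resizing an image.
render_calls: List[Path] = []


def fake_render(source: Path, target: Path, size: Size) -> None:
    render_calls.append(target)
    target.write_bytes(b"thumbnail placeholder")


with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp) / "thumbs"
    first = get_thumbnail(Path("media/cat.jpg"), (120, 80), cache, fake_render)
    second = get_thumbnail(Path("media/cat.jpg"), (120, 80), cache, fake_render)
    same_file = first == second

print(len(render_calls), same_file)
```

The second request finds the file already on disk, so the resize cost is paid once per image and size; after that, serving the thumbnail is a plain file read that a web server such as nginx can handle.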

What are the best practices for user uploads with S3?

I was wondering what you recommend for running a user upload system with S3. I plan on using MongoDB for storing metadata such as the uploader, size, etc. How should I go about storing the actual file in S3?
Here are some of my ideas, what do you think is the best? All of these examples would involve saving the metadata to MongoDB.
1. Should I just store all the files in a bucket?
2. Maybe organize them into dates (e.g. 6/8/2014/mypicture.png)?
3. Should I save them all in one bucket, but with an added string (such as d1JdaZ9-mypicture.png) to avoid duplicates?
4. Or should I generate a long string for a folder, and store the file in that folder (to retain the original file name), e.g. sh8sb36zkj391k4dhqk4n5e4ndsqule6/mypicture.png?
This depends primarily on how you intend to use the pictures and which objects/classes/modules/etc. in your code will actually deal with retrieving them.
If you find yourself wanting to do things like "all user uploads on a particular day", a simple naming convention with folders for the year, month and day, along with a top-level folder for the user's unique ID, will solve the problem.
If you want to ensure uniqueness and avoid collisions in your bucket, you could generate a unique string too.
However, since you've got MongoDB, which (I'm assuming) will actually handle these queries for user uploads by date, etc., the choice of bucket layout is more aesthetic than functional.
If all you're storing in MongoDB is the key/URL, it doesn't really matter what the actual structure of your bucket is. Nevertheless, it still makes sense to split things up in some coherent way: maybe group all of a user's uploads together and give each one a unique name (either generate a unique name or add a unique prefix to the file name).
That being said, do you think there might be a point when you might look at changing how your images are stored? You might move to a CDN. A third party might come up with an even cheaper/better product which you might want to try. In a case like that, simply storing the keys/URLs in your MongoDB is not a good idea since you'll have to update every entry.
To make this relatively future-proof, I suggest you give your uploads a definite structure. I usually opt for:
bucket_name/user_id/yyyy/mm/dd/unique_name.jpg
Your database then only needs to store the file name and the upload time stamp.
You can introduce a middle layer in your logic (a new class perhaps or just a helper function/method) which then generates the URL for a file based on this info. That way, if you change your storage method later, you only need to make a small change in this middle layer (after migrating your files of course) and not worry about MongoDB.
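A minimal sketch of such a middle layer, in Python. The function names, the STORAGE_BASE_URL value and the example user ID are all hypothetical; the only part taken from the answer is the user_id/yyyy/mm/dd/unique_name layout. Note that besides the file name and timestamp, the helper also needs the user ID to rebuild the key.

```python
from datetime import datetime

# Hypothetical base URL; this is the one value to change when storage moves.
STORAGE_BASE_URL = "https://bucket-name.example.com"


def object_key(user_id: str, uploaded_at: datetime, unique_name: str) -> str:
    """Build the storage key following user_id/yyyy/mm/dd/unique_name."""
    return f"{user_id}/{uploaded_at:%Y/%m/%d}/{unique_name}"


def url_for(user_id: str, uploaded_at: datetime, unique_name: str,
            base_url: str = STORAGE_BASE_URL) -> str:
    """Turn the fields stored in the database into a full URL."""
    return f"{base_url}/{object_key(user_id, uploaded_at, unique_name)}"


uploaded = datetime(2014, 6, 8, 14, 30)
print(url_for("user42", uploaded, "d1JdaZ9-mypicture.png"))
```

Because every URL is derived through url_for, moving to a CDN or another provider means changing the base URL (and migrating the files), while the stored documents stay untouched.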

When to cache the results from a web service?

In one section of my web application I get information from http://www.geonames.org/ (a web service method) and http://data.un.org/ (XML files stored in our application).
I'm new at this and my questions are:
When should I cache the information from geonames?
What method should I use for the cache?
Is it OK to cache the XML files, or is the performance the same?
I use ASP.NET MVC 2 with C#.
Caching is a way to improve performance. Consider it only if the current performance is not acceptable; otherwise there is no need to worry.
One way you could cache your data is to set up a database table with a CLOB field, a date-time of when the object was stored, and of course fields that identify the object (such as the web service parameters used to obtain it).
You have to decide on a policy for expiring old objects. For instance, you could set up a query that runs daily and deletes all objects older than a week. This is only an example: I can't tell you how long to cache for, since it depends on how much data you can keep and on how often it gets updated.
To answer your questions in more detail:
1. When to cache the information from geonames?
I'm not sure I understand the question correctly, but normally you look the value up in the cache first. If it is found, you return it from the cache; if it is not found, you make the service call and store the result in the cache.
2. What method to use for the cache?
I've described an approach using SQL tables. You could also use files, but that is more complicated.
3. Is it OK to cache the XML files, or is the performance the same?
Whether you cache processed or unprocessed (XML) information won't change much from a performance point of view, since the biggest delay is fetching the information over the network, not processing it.
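The table-based approach and the look-up flow described above can be sketched together. The question uses ASP.NET, but the idea is language-neutral, so this illustration is in Python with the standard sqlite3 module. The ws_cache table, its column names and the helper functions are assumptions made for the example, with a TEXT column standing in for the CLOB field.

```python
import sqlite3
import time
from typing import Callable, Optional


def open_cache(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    # request_key identifies the call (e.g. the web service parameters).
    conn.execute(
        "CREATE TABLE IF NOT EXISTS ws_cache ("
        " request_key TEXT PRIMARY KEY,"
        " payload TEXT NOT NULL,"
        " stored_at REAL NOT NULL)"
    )
    return conn


def cached_call(conn: sqlite3.Connection, request_key: str,
                fetch: Callable[[str], str], now: Optional[float] = None) -> str:
    """Return the cached payload, or call the service and store its result."""
    row = conn.execute(
        "SELECT payload FROM ws_cache WHERE request_key = ?", (request_key,)
    ).fetchone()
    if row is not None:
        return row[0]
    payload = fetch(request_key)
    stored_at = time.time() if now is None else now
    conn.execute(
        "INSERT OR REPLACE INTO ws_cache (request_key, payload, stored_at)"
        " VALUES (?, ?, ?)",
        (request_key, payload, stored_at),
    )
    conn.commit()
    return payload


def purge_older_than(conn: sqlite3.Connection, max_age_seconds: float,
                     now: Optional[float] = None) -> int:
    """Expiry policy: delete entries older than max_age_seconds; return the count."""
    cutoff = (time.time() if now is None else now) - max_age_seconds
    cursor = conn.execute("DELETE FROM ws_cache WHERE stored_at < ?", (cutoff,))
    conn.commit()
    return cursor.rowcount
```

Running purge_older_than once a day with max_age_seconds set to one week (7 * 24 * 3600) corresponds to the example policy above; the right age depends on how often the source data changes.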