sorl-thumbnail won't delete thumbnails - django

I'm having trouble getting sorl-thumbnail to delete thumbnail files, or to refresh thumbnails when the source file is overwritten. Each entry has one file whose name never changes but whose contents can be overwritten. I need the thumbnail to be regenerated when a new file is uploaded over the old one.
This is at the model + form level so I'm using the low level API to generate thumbs.
Have tried using:
from sorl.thumbnail import delete
delete(filename)
But with no success, the thumbnail is never deleted or overwritten.
I have even tried:
from sorl.thumbnail.images import ImageFile
from sorl.thumbnail import default
image_file = ImageFile(filename)
default.kvstore.delete_thumbnails(image_file)
Again with no success.
Please help!
Update:
I found a workaround by creating an alternate ThumbnailBackend with a new _get_thumbnail_filename method. The new method uses the file's SHA-1 hash so that the thumbnail is always specific to the current file contents.
Here's the backend for anyone else that might encounter a similar scenario.
from sorl.thumbnail.base import ThumbnailBackend
from sorl.thumbnail.conf import settings
from sorl.thumbnail.helpers import serialize
class HashThumbnailBackend(ThumbnailBackend):
    def _get_thumbnail_filename(self, source, geometry_string, options):
        """
        Computes the destination filename.
        """
        import hashlib
        # hash object
        hash = hashlib.sha1()
        # open file and read it in as chunks to save memory
        f = source.storage.open(u'%s' % source, 'rb')
        while True:
            chunk = f.read(128)
            if not chunk:
                break
            hash.update(hashlib.sha1(chunk).hexdigest())
        # close file
        f.close()
        hash.update(geometry_string)
        hash.update(serialize(options))
        key = hash.hexdigest()
        # make some subdirs
        path = '%s/%s/%s' % (key[:2], key[2:4], key)
        return '%s%s.%s' % (settings.THUMBNAIL_PREFIX, path,
                            self.extensions[options['format']])
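For comparison, here is a minimal sketch of the hashing step on its own, feeding the raw bytes into a single SHA-1 object with a larger chunk size. The helper name file_sha1 and the 64 KiB chunk size are assumptions for illustration and are not part of sorl-thumbnail.

```python
import hashlib
import io


def file_sha1(fileobj, chunk_size=64 * 1024):
    """Return the SHA-1 hex digest of a binary file-like object.

    Hypothetical helper: reads fixed-size chunks so a large file
    never has to fit in memory at once.
    """
    digest = hashlib.sha1()
    # iter() with a sentinel stops as soon as read() returns b"".
    for chunk in iter(lambda: fileobj.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


# An in-memory file stands in for source.storage.open(...).
sample = io.BytesIO(b"fake image bytes" * 1000)
print(file_sha1(sample))
```

The digest still has to be recomputed from the full file whenever a thumbnail filename is requested, so this only reduces the per-chunk overhead; it does not remove the cost of reading the file.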

It's a little hard to explain, so I made this awesome table. The commands in the first column are listed below the table; an X in the other columns marks what each command deletes. Original is the original file, Thumbnails are the thumbnails for the original, and KV means the key-value store reference.
| Command | Original | Thumbnails | KV Original | KV Thumbnails |
| #1 | X | X | X | X |
| #2 | | X | | X |
| #3 | | X | X | X |
#1: sorl.thumbnail.delete(filename)
#2: sorl.thumbnail.default.kvstore.delete_thumbnails(image_file)
#3: sorl.thumbnail.delete(filename, delete_file=False)
As I understand it, you really want #3. As for your problem, my guess is that filename is not a path relative to MEDIA_ROOT (if you are using another storage backend, the situation would be similar). I would need to know what else you are doing to get a better picture. Note that ImageFields and FileFields do not overwrite existing files, and that Django changed its file deletion behaviour in 1.2.5; see the release notes.
Update:
Anyone reading this should note that the above way of generating thumbnail filenames is extremely inefficient; please do not use it if you care at all about performance.

I'm not completely sure whether this answers your question, but I was having the same problem and this was my solution.
I have a model with a FileField on it, like so:
material = models.FileField(upload_to='materials')
When handling an uploaded file, I use get_thumbnail() to generate the thumbnail, passing in the FileField itself rather than the Python-level file behind it, i.e.:
thumb = get_thumbnail(modelinstance.material, '%dx%d' % (thumb_width, thumb_height))
As with your issue, I also found that when a file had the same name, sorl would just grab the thumbnail from the cache instead of generating a new one. Aggravating!
What worked was using sorl's delete method and passing the FileField. I first tried passing in the python file behind the FileField object, which is possibly what you were trying? Going from this:
sorl.thumbnail.delete(modelinstance.material.file)
To this:
sorl.thumbnail.delete(modelinstance.material)
Seemed to line up with sorl-thumbnail's KV Store, and would properly get the cached thumbnail out of the way so the new one could be created from the new file. Yay!
This was helpful for me: http://sorl-thumbnail.readthedocs.org/en/latest/operation.html
Also, even after running ./manage.py thumbnail cleanup and ./manage.py thumbnail clear, I couldn't get Django to stop looking for the old thumbnails in the same place. I had to manually clear the Django cache (I'm using memcached). Here's how you can do that:
import os
# Set the DJANGO_SETTINGS_MODULE environment variable.
os.environ['DJANGO_SETTINGS_MODULE'] = "yourproject.settings"
from django.core.cache import cache
# Flush!
cache._cache.flush_all()
This is my first SO answer. Hope it helps someone :)

The thing is, you cannot use the delete(file) shortcut with a file class different from the one you used to generate that thumbnail through get_thumbnail() or the {% thumbnail ... %} template tag.
The reason is that ImageFile instances constructed from different file objects get different keys (ImageFile.key), so delete() can never find the right thumbnails to remove because the keys don't match.
I'm not sure whether it fails if you use, for instance, a Python file object and then a Django File object. But in Django, if you generate the thumbnail with a FileField object and try to delete it (and its thumbnails) with a File instance, it definitely will not work.
So, in your templates, don't do:
{% load thumbnail %}
{% thumbnail image "100" as im %}
<img src="{{ im.url }}" width="{{ im.width }}" height="{{ im.height }}">
{% endthumbnail %}
Where image is a models.ImageField instance, but use its file attribute:
{% load thumbnail %}
{% thumbnail image.file "100" as im %}
And to delete it in your Python code (the following is an example of Storage to overwrite the existing file if the name is the same):
import os
from django.core.files.storage import FileSystemStorage
from django.core.files import File
from sorl.thumbnail import delete
class OverwriteStorage(FileSystemStorage):
    def _save(self, name, content):
        if self.exists(name):
            img = File(open(os.path.join(self.location, name), "w"))
            delete(img)
        return super(OverwriteStorage, self)._save(name, content)
I'm not sure whether it's a bug in sorl or whether there is a good reason to generate different keys.

I saw this problem. It was happening because sorl was being used inconsistently.
All the thumbnails were fetched in the following style:
sorl.thumbnail.get_thumbnail(self.picture.url, geometry_string, **options)
# picture being a FieldFile
And when deleting the thumbnails (removing them from the cache), it was being done like this:
sorl.thumbnail.delete(self.picture.name, delete_file=False)
In short, we were using the image's URL to generate and fetch the thumbnails, but the image's name to delete them. Although sorl didn't complain about it, the KV store and the file system were never cleaned up.
The fix was simply to change the first argument of get_thumbnail to self.picture.name.
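The mismatch can be illustrated with a toy model. This is not sorl's actual code: toy_key is a hypothetical stand-in that assumes, as the answers above suggest, that the cache key is derived from the string identifying the source file (plus its storage).

```python
import hashlib


def toy_key(source_name, storage_name="default"):
    """Hypothetical stand-in for a cache key built from the source's
    name and storage. Not sorl-thumbnail's real implementation."""
    raw = "%s:%s" % (storage_name, source_name)
    return hashlib.md5(raw.encode("utf-8")).hexdigest()


kv_store = {}

# Thumbnail registered using the URL as the source identifier.
kv_store[toy_key("/media/pictures/cat.jpg")] = "cache/ab/cd/thumb.jpg"

# Deleting by name looks up a different key, so nothing is removed.
removed = kv_store.pop(toy_key("pictures/cat.jpg"), None)
print(removed)        # None
print(len(kv_store))  # 1
```

Using the same identifier (for example the field's name) for both creation and deletion makes the two keys agree.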

Related

How to reference a randomly generated filename with flask

I am creating a program that generates a chart, and then displays the chart on the webpage. I want each chart generated to have a unique filename. However, once that unique filename is generated, I don't know how to refer to it within the html file.
I use this to create a random filename starting with "chart" in the "images" folder. This part works fine.
basename = "images/chart"
suffix = str(uuid.uuid4())
filename = "_".join([basename, suffix])
plt.savefig(filename)
I then have this in the html file, but don't know how to modify to add the random suffix part of the name that was just generated.
<img src="{{ url_for('static', filename = '/images/chart.png') }}">
It is hard to say without knowing more about the structure of your application, but one approach might be to change the template to something like
<img src="{{url}}">
and then calling it with
render_template('template.html', url=url_for('static', filename=filename))
I'm no flask expert, but my instinct would be to avoid putting too much Python code — things like the url_for call — in the templates, because I do not want to mix program logic with presentation.
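A minimal sketch of the naming step, using only the standard library, might look like the following. The helper name unique_chart_name, the "static" directory, and the default arguments are assumptions for illustration; the point is to keep the generated name in a variable so the same value can be used both for saving and for building the URL.

```python
import os
import uuid


def unique_chart_name(folder="images", prefix="chart", ext=".png"):
    """Return a path relative to the static folder, such as
    'images/chart_<uuid>.png'. Defaults are illustrative only."""
    return "%s/%s_%s%s" % (folder, prefix, uuid.uuid4().hex, ext)


relative_name = unique_chart_name()
# Where the figure would be written on disk (assumes a 'static' directory).
save_path = os.path.join("static", relative_name)
print(relative_name)
print(save_path)

# In the view (not run here):
#   plt.savefig(save_path)
#   return render_template('template.html',
#                          url=url_for('static', filename=relative_name))
```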

Show Image stored in s3 on Web page using flask

I'm trying to get images from an S3 bucket and show them on a web page using Flask (and boto3 to access the bucket).
I currently have a list of all the pictures in the bucket, but I can't get the HTML to show them (it gives me a 404 error).
How do I do this without downloading the files?
This is what I have so far:
def list_files(bucket):
    contents = []
    for image in bucket.objects.all():
        contents.append(image.key)
    return contents
def files():
    list_of_files = list_files(bucket)
    return render_template('index.html', my_bucket=bucket, list_of_files=list_of_files)
and this is the html snippet:
<table class="table table-striped">
<br>
<br>
<tr>
<th>My Photos</th>
{% for f in list_of_files %}
<td> <img src="{{ f }}"></td>
{% endfor %}
Thanks a lot!
Loading an image into an HTML page requires a real image file that exists in a directory. Images from AWS S3 can be loaded onto an HTML page if you first download them to that directory and then use the file's URL as the source in an HTML <img> tag.
I found a solution to this, but you will need to adapt it to your needs.
Define a function that loads the image from S3 as follows:
import matplotlib.image as mpimg
import numpy as np
import boto3
import tempfile
s3 = boto3.resource('s3', region_name='us-east-2')
bucket = s3.Bucket('bucketName')
object = bucket.Object('dir/subdir/2015/12/7/img01.jpg')
tmp = tempfile.NamedTemporaryFile()
def imageSource(bucket, object, tmp):
    with open(tmp.name, 'wb') as f:
        object.download_fileobj(f)
    src = tmp.name  # dir/subdir/2015/12/7/img01.jpg
    return src
Just ran into this problem as well, seems like this hasn't been updated for a while so will try to add it.
Your current approach below is right. The only issue is that in order to render an image that is not going to be downloaded to your server, you have to have a direct url to your S3 file. Currently, you only have the image name, not the full url.
def list_files(bucket):
    contents = []
    for image in bucket.objects.all():
        contents.append(image.key)
    return contents
def files():
    list_of_files = list_files(bucket)
    return render_template('index.html', my_bucket=bucket, list_of_files=list_of_files)
Currently, your items in the list of files will look like this:
['file_name1', 'file_name2', 'file_name3']
In order for them to render in your browser directly you need them to look like this:
['file_url1', 'file_url2', 'file_url3']
s3 file urls look something like this: https://S3BUCKETNAME.s3.amazonaws.com/file_name1.jpg
Therefore, instead of the line below
contents.append(image.key)
you need to replace the image.key with something that makes the URL
contents.append(f'https://{S3BUCKETNAME}.s3.amazonaws.com/{image.key}')
That should do it; the HTML you have should work correctly as is. The only other big risk is that the files you uploaded are not public. For that, you'll need to look at the settings of your bucket on AWS.
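As a small sketch of the URL-building step, the helper below (a hypothetical name, s3_object_url) also percent-encodes the key so that names containing spaces still produce valid URLs. It assumes the objects are publicly readable and that the bucket is reachable at the https://BUCKET.s3.amazonaws.com/ form shown above; the bucket name is a placeholder.

```python
from urllib.parse import quote


def s3_object_url(bucket_name, key):
    """Build a URL for a publicly readable object.

    Percent-encodes the key but keeps '/' so key prefixes stay intact.
    """
    return "https://%s.s3.amazonaws.com/%s" % (bucket_name, quote(key, safe="/"))


keys = ["cat.jpg", "holiday 2020/beach photo.png"]
urls = [s3_object_url("my-example-bucket", k) for k in keys]
print(urls)
```

In list_files, the same helper could be called with image.key in place of the plain append.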
Additional Resources and Sources:
Adding a public policy to your AWS S3 Bucket: https://docs.aws.amazon.com/AmazonS3/latest/userguide/example-bucket-policies.html
Uploading and downloading files with Flask & S3: https://stackabuse.com/file-management-with-aws-s3-python-and-flask/

Print Barcode in PDF with Django

I am using render_to_string in Django to render an HTML template and export it to PDF.
html = render_to_string("etiquetaTNT.html", {
    'context': context,
    'barcode': b,
    'barcodeimg': barcodeimg,
})
font_config = FontConfiguration()
HTML(string=html).write_pdf(response, font_config=font_config)
return response
I am trying to insert a barcode into the PDF. I generate this barcode as a PNG.
br = barcode.get('code128', b, writer=ImageWriter())
filename = br.save(b)
barcodeimg = filename
But the PDF does not show the image from the template.
<img class="logo" src="{{barcodeimg}}" alt="Barcode" />
I do not know how to save the file under the name I want, or how to make it show in the PDF, because no image is shown at all. For example, the logo is shown in the HTML template but not in the PDF.
<img class="logo" src="{{logo}}" alt="TNT Logo" />
The libraries that I am using:
import barcode
from barcode.writer import ImageWriter
from django.http import HttpResponse
from django.template.loader import render_to_string
from weasyprint import HTML
from weasyprint.fonts import FontConfiguration
I do not want to use Reportlab, because I need to render a HTML, not a Canvas.
Understanding the problem:
Think about what happens when you load a webpage. There is the initial request where the document is loaded, and then subsequent requests are made to fetch the images / other assets.
When you want to print some HTML to PDF using weasyprint, weasyprint has to fetch all of the images itself. According to the python-barcode docs, br.save(b) returns just the filename (the file is saved in your current working directory). So your HTML will look something like this:
<img class="logo" src="some_filename.png" alt="Barcode" />
Quite how it fetches this will depend on how you have weasyprint set up. You can check out django-weasyprint which has a custom URL fetcher. But as things stand, weasyprint can't fetch this file.
A solution
There are a few ways you can fix this, but it depends a lot on how you are deploying. For example, Heroku (as I understand it) doesn't have a persistent local file system you can write to, so you would need to write the file to an external service like S3 and then insert the URL for that into your template, which weasyprint will then be able to fetch. However, I think there is probably a simpler solution we can use in this case.
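One more option that avoids a separate fetch, sketched here on the assumption that the renderer accepts data: URIs in img tags (WeasyPrint's documentation lists them among the supported URL types), is to embed the PNG bytes directly in the HTML. The helper name png_data_uri is hypothetical, and the bytes below are a stand-in for the real barcode image.

```python
import base64


def png_data_uri(png_bytes):
    """Encode raw PNG bytes as a data: URI usable in <img src="...">."""
    encoded = base64.b64encode(png_bytes).decode("ascii")
    return "data:image/png;base64," + encoded


# Stand-in bytes; in the view these would be the barcode PNG's contents.
fake_png = b"\x89PNG\r\n\x1a\n" + b"\x00" * 16
barcodeimg = png_data_uri(fake_png)
print(barcodeimg[:40])
```

The resulting string could be passed to the template as barcodeimg and used unchanged in the existing img tag.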
A better (maybe) Solution
Taking a look at the python-barcode docs it looks like you can write using SVG.
This is good because we can insert SVG straight into our HTML template (and avoid having to fetch any other assets). I would suggest something like the following
from io import BytesIO
from barcode.writer import SVGWriter
# Write the barcode to a binary stream
rv = BytesIO()
code = barcode.get('code128', b, writer=SVGWriter())
code.write(rv)
rv.seek(0)
# get rid of the first bit of boilerplate
rv.readline()
rv.readline()
rv.readline()
rv.readline()
# read the svg tag into a string
svg = rv.read()
Now you'll just need to insert that string into your template. Just add it to your context, and render it as follows:
{{svg}}
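Skipping a fixed number of lines relies on the exact layout of the writer's output. A slightly more defensive sketch, assuming the output contains a single svg root element, is to slice from the first "<svg" tag. The function name and the sample bytes below are made up for illustration.

```python
def extract_svg_element(svg_document):
    """Return the document from its <svg ...> tag onward as text,
    dropping any XML declaration or DOCTYPE that precedes it."""
    text = svg_document.decode("utf-8")
    start = text.find("<svg")
    if start == -1:
        raise ValueError("no <svg> element found")
    return text[start:].strip()


sample = (
    b'<?xml version="1.0" encoding="UTF-8"?>\n'
    b'<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" '
    b'"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">\n'
    b'<svg height="10mm" width="20mm"><rect/></svg>\n'
)
print(extract_svg_element(sample))
```

With the barcode code above, rv.getvalue() would be passed in place of sample.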
Enhancing the solution provided by @tim-mccurrach, I have created a template tag for it.
/app/templatetags/barcode_tags.py
from django import template
from io import BytesIO
import barcode
register = template.Library()
@register.simple_tag
def barcode_generate(uid):
    rv = BytesIO()
    # code = barcode.get('code128', b, writer=SVGWriter())
    code = barcode.get('code128', uid,
                       writer=barcode.writer.SVGWriter())
    code.write(rv)
    rv.seek(0)
    # get rid of the first bit of boilerplate
    rv.readline()
    rv.readline()
    rv.readline()
    rv.readline()
    # read the svg tag into a string
    svg = rv.read()
    return svg.decode("utf-8")
And then in template.html:
{% load barcode_tags %}
{% barcode_generate object.uid as barcode_svg %}
{{barcode_svg | safe}}

Wagtail mock images show as broken in admin but visible on template

I have a demo Wagtail website. A site is generated via a Cookiecutter. To populate the CMS with initial content I have added a load_initial_data command that can be run once Wagtail is installed. This populates text content from a fixtures JSON file, and moves images from the fixtures folder to the Wagtail site's media_root folder. The code looks like:
# load_initial_data.py
import os, shutil
from django.conf import settings
from django.core.management.base import BaseCommand
from django.core.management import call_command
class Command(BaseCommand):
    def handle(self, **options):
        fixtures_dir = os.path.join(settings.PROJECT_DIR, 'fixtures')
        fixture_file = os.path.join(fixtures_dir, 'db.json')
        image_src_dir = os.path.join(fixtures_dir, 'images')
        image_dest_dir = os.path.join(settings.MEDIA_ROOT, 'original_images')
        call_command('loaddata', fixture_file, verbosity=0)
        if not os.path.isdir(image_dest_dir):
            os.makedirs(image_dest_dir)
        for filename in os.listdir(image_src_dir):
            shutil.copy(os.path.join(image_src_dir, filename), image_dest_dir)
This works to the extent that the images are copied to the correct directory, and on the templates the images appear as expected when requested. The problem is within /admin/images/ where the requested version of the image is unavailable, and so the browser shows a broken image icon.
The admin page is looking for a specific size of the image ({your-image-name}.max-165x165.{.jpg|.png|.gif}).
Watching how images move from original_images to images makes it appear that they are only processed after the template they are on is first requested. One idea then might be to create a template listing all the images (with the correct styling) to process them after the data has been loaded in. However doing something like
{% image page.image max-165x165 as test_photo %}
<img src="{{ test_photo.url }}" width="{{ test_photo.width }}" height="{{ test_photo.height }}" alt="{{ test_photo.alt }}" />
Still returns a broken image, and doesn't process the image from the original_images folder to the images folder as I'd have expected. I tried this after the initial data load, and am presuming it's because the image size needs to have a reference within both the database and template?
Is there a way to programmatically force Wagtail to reprocess all the images to generate the size and file name that the image admin page is looking for?
(To mention quickly, if it's relevant, that the images currently sit within the project repo, but will ultimately be a zip file stored on a cloud store and only be imported to the project once requested. Currently, regardless of whether the user wants them or not, the images are included with the Cookiecutter)
Whenever a template (either front-end or within the admin) requires an image at a particular size, it will look in the wagtailimages.Rendition model (or the project-specific Rendition model, if custom image models are in use) to see if one has previously been generated. If so, it will use the existing file; if not, it will generate a new one and add a Rendition record.
If you're getting a broken image, it most likely means that a Rendition record exists (because it's been included in your initial data fixture) but the corresponding image file isn't present in MEDIA_ROOT/images. The proper fix would be to remove the rendition records from your fixture. To fix this after the fact and force all image renditions to be recreated, you can simply delete the contents of the wagtailimages_rendition table.

django 1.6 with django_tables2 cannot change DATETIME_FORMAT

python 2.7.5+. django_tables2 0.14.0
In settings.py:
USE_L10N = True
DATETIME_FORMAT = 'M j, Y H:i:s.u'
Didn't work. I tried {% load L10N %} at the top of my template. Didn't work.
I created a formats directory at the top of my project containing an en directory containing formats.py and added the following to my settings.py (yes, there are __init__.py files in the formats directory and in the en directory):
FORMAT_MODULE_PATH = '<project>.formats'
Didn't work.
I tried putting in {{created_at|date: 'M j, Y'}}{{created_at|time: 'H:i:s.u'}} before the {% render_table table %} tag from django_tables2. Got an error.
Tried the recommended django_tables2 method of, in my tables.py definition, adding to the Meta class:
localize('created_at')
Didn't work.
Tried putting 'localize=True' in the column definition for created_at in tables.py.
Didn't work.
The only thing that worked was modifying python2.7/site-packages/django/conf/locale/en/formats.py (in my virtual environment) but I won't be allowed to do that on my project because it's not considered portable.
I see stackoverflow entries that say this works but they are many years old and many django versions back. I see others here that say this doesn't work and they haven't/aren't going to fix it.
This functionality logs requests; during testing the other devs could really use seeing seconds and probably microseconds to help them debug but I can't seem to give it to them.
Any suggestions that I haven't already tried?
Thanks.
OK, blush - I got it. I made the following change in the model for the DateTimeField created_at:
@property
def nice_date(self):
    # mm/dd/yyyy HH:MM:SS.uuuuuu
    t = self.created_at.strftime("%m/%d/%Y %H:%M:%S.%f")
    return t
Then, I displayed nice_date instead of created_at.
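As a quick check of what that format string produces, here is a self-contained sketch. RequestLog is a plain stand-in class made up for illustration, not the actual Django model.

```python
from datetime import datetime


class RequestLog:
    """Plain stand-in for the Django model; only created_at matters here."""

    def __init__(self, created_at):
        self.created_at = created_at

    @property
    def nice_date(self):
        # mm/dd/yyyy HH:MM:SS.microseconds
        return self.created_at.strftime("%m/%d/%Y %H:%M:%S.%f")


entry = RequestLog(datetime(2014, 1, 2, 3, 4, 5, 678901))
print(entry.nice_date)  # 01/02/2014 03:04:05.678901
```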
I hope this helps someone else.