collectstatic incorrectly creates multiple CSS files in S3 - django

Uploading files to S3 works fine in my Wagtail/Django application (both static files and uploads). Now I'm trying to use ManifestStaticFilesStorage to enable cache busting. The URLs are generated correctly by the application and the files are copied to S3 with hashes.
But each time I run collectstatic, some files get copied to S3 twice, each with a different hash. So far the issue is occurring for all CSS files.
file.a.css is loaded by the application and is the file referenced in staticfiles.json - however it is a 20.0B file in S3 (should be 6.3KB).
file.b.css has the correct contents in S3 - however it does NOT appear in the output generated by collectstatic.
# custom_storages.py
from django.conf import settings
from django.contrib.staticfiles.storage import ManifestFilesMixin
from storages.backends.s3boto import S3BotoStorage

class CachedS3Storage(ManifestFilesMixin, S3BotoStorage):
    pass

class StaticStorage(CachedS3Storage):
    location = settings.STATICFILES_LOCATION

class MediaStorage(S3BotoStorage):
    location = settings.MEDIAFILES_LOCATION
    file_overwrite = False
Deps:
"boto==2.47.0",
"boto3==1.4.4",
"django-storages==1.5.2",
"Django==2.0.8"
Any pointers on where to look to track down this issue would be appreciated! :)
Edit:
Looking more carefully at all the files copied to S3 the issue is ONLY occurring for CSS files.
Disabling pushing assets to S3 and writing them to the local filesystem works as expected.
Edit 2:
Updated all the deps to the latest version - same behavior as above.

I eventually stumbled across this issue in the django-storages issue tracker, which then led me to a very similar question on SO.
Between these two pages I managed to resolve the issue. I did the following to get django-storages + ManifestStaticFilesStorage + S3 to work together:
# custom_storages.py
from django.conf import settings
from django.contrib.staticfiles.storage import ManifestFilesMixin
from storages.backends.s3boto3 import S3Boto3Storage  # note boto3!!

class PatchedS3StaticStorage(S3Boto3Storage):
    def _save(self, name, content):
        # Rewind the file before uploading so the full content is sent
        if hasattr(content, 'seek') and hasattr(content, 'seekable') and content.seekable():
            content.seek(0)
        return super()._save(name, content)

class CachedS3Storage(ManifestFilesMixin, PatchedS3StaticStorage):
    pass

class StaticStorage(CachedS3Storage):
    location = settings.STATICFILES_LOCATION

class MediaStorage(S3Boto3Storage):
    location = settings.MEDIAFILES_LOCATION
    file_overwrite = False
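As a rough illustration of why the seek(0) in _save above matters: if a file object has already been read to the end, whatever consumes it next sees no data. The sketch below uses only the standard library and assumes, purely for illustration, that the upload step gzip-compresses what it reads (the question does not say whether compression was enabled). upload_body is a hypothetical stand-in, not a django-storages function. An empty gzip stream is only about 20 bytes, which would be consistent with the tiny file.a.css described in the question.

```python
import gzip
import io


def upload_body(content, rewind):
    """Hypothetical stand-in for a storage backend's save step.

    Returns the gzip-compressed bytes that would be sent to the bucket.
    """
    if rewind and content.seekable():
        content.seek(0)  # same guard as PatchedS3StaticStorage._save above
    return gzip.compress(content.read())


css = b"body { margin: 0; }\n" * 300

# Simulate an earlier step (for example hashing) that read the file to the end.
already_read = io.BytesIO(css)
already_read.read()
truncated = upload_body(already_read, rewind=False)

# Same situation, but this time the save step rewinds first.
already_read_again = io.BytesIO(css)
already_read_again.read()
complete = upload_body(already_read_again, rewind=True)

print(len(truncated), len(complete))
```

Without the rewind the compressed body decompresses to nothing; with it, the full stylesheet comes back.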
Note that I had to use boto3 to get this to work; django-storages must be >= 1.5 to use boto3. I removed boto as a dependency. My final deps were:
"boto3==1.4.4",
"django-storages==1.7.1",
"Django==2.0.8"

Related

In django how to delete the images which are not in databases?

I have created a blog website in Django and posted multiple articles on it. When I delete an article, the article itself is removed but its media files are not. I want to delete all media files that are no longer referenced by any article.
I know I can create a Django post-delete signal and delete the media files from there, but that only applies to future deletions. I want to delete the existing media files that are no longer in my database.
First, install django-cleanup:
pip install django-cleanup
Then add this to INSTALLED_APPS in your settings file:
'django_cleanup.apps.CleanupConfig'
From then on, django-cleanup deletes a file automatically when its model instance is deleted or its file field is changed.
Just use this program to delete your unused media files.
This command deletes all media files from the MEDIA_ROOT directory which are no longer referenced by any of the models from installed_apps.
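import os

from django.apps import apps
from django.conf import settings
from django.core.management.base import BaseCommand
from django.db.models import FileField


class Command(BaseCommand):
    help = "Delete files under MEDIA_ROOT that no model references."

    def handle(self, *args, **options):
        physical_files = set()
        db_files = set()

        # NOTE: the snippet as posted never filled db_files, so every file
        # under MEDIA_ROOT would be deleted. The loop below is an assumed
        # reconstruction: collect the stored name of every FileField value.
        for model in apps.get_models():
            for field in model._meta.fields:
                if isinstance(field, FileField):
                    names = model._default_manager.values_list(field.name, flat=True)
                    db_files.update(os.path.normpath(n) for n in names if n)

        # Collect every file under MEDIA_ROOT, relative to MEDIA_ROOT.
        media_root = getattr(settings, 'MEDIA_ROOT', None)
        if media_root:
            for root, dirs, files in os.walk(media_root):
                for file_ in files:
                    full_path = os.path.join(root, file_)
                    physical_files.add(os.path.relpath(full_path, media_root))

        # Delete files that exist on disk but are not referenced in the database.
        for file_ in physical_files - db_files:
            os.remove(os.path.join(media_root, file_))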
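Since a command like this deletes files irreversibly, it can help to list the candidates first. The sketch below is a standalone dry run using only the standard library; find_orphans and the file names are hypothetical, and the set of referenced names is passed in by hand rather than read from a database.

```python
import os
import tempfile


def find_orphans(media_root, referenced):
    """Return relative paths under media_root that are not in `referenced`."""
    referenced = {os.path.normpath(name) for name in referenced}
    on_disk = set()
    for root, _dirs, files in os.walk(media_root):
        for file_name in files:
            full_path = os.path.join(root, file_name)
            on_disk.add(os.path.relpath(full_path, media_root))
    return on_disk - referenced


# Usage with a throwaway directory standing in for MEDIA_ROOT.
with tempfile.TemporaryDirectory() as media_root:
    os.makedirs(os.path.join(media_root, "articles"))
    for name in ("articles/kept.jpg", "articles/orphan.jpg", "top.png"):
        with open(os.path.join(media_root, name), "wb") as handle:
            handle.write(b"x")
    orphans = find_orphans(media_root, {"articles/kept.jpg"})

print(sorted(orphans))
```

Only the files that are on disk but not in the referenced set are reported; nothing is removed.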

Django: sorl-thumbnail cache file is not deleted in production environment

I use django-cleanup and sorl-thumbnail in my Django project.
I have a model like this:
from sorl.thumbnail import ImageField

class Variation(BaseModel):
    image = ImageField(upload_to=draft_img_upload_location)

And I use a signal for sorl-thumbnail like this (recommended by https://github.com/un1t/django-cleanup):
from django_cleanup.signals import cleanup_pre_delete

def sorl_delete(**kwargs):
    from sorl.thumbnail import delete
    delete(kwargs['file'])

cleanup_pre_delete.connect(sorl_delete)

In my local environment, the following works:
1. When I delete a Variation instance in the admin page, it deletes BOTH the image file and the image cache (created by sorl-thumbnail).
2. When I replace the image file with another image in the admin page, it deletes BOTH the prior image file and the prior image cache file (created by sorl-thumbnail).
In production, I use AWS S3 and CloudFront to manage static and media files, so all sorl-thumbnail cache images are stored in S3. But whenever I replace an image file with another one in the admin page, the prior image cache file (created by sorl-thumbnail) remains.
Let's say the sorl-thumbnail image URL is https://example.cloudfront.net/cache/da/75/abcdefg_i_am_cached_image.jpg (as shown in the browser's developer tools).
In this case, two image files exist in S3: abcdefg.jpg and /da/75/abcdefg_i_am_cached_image.jpg
I replaced abcdefg.jpg with another image. This completely deleted abcdefg.jpg from S3 storage.
Then I opened https://example.cloudfront.net/cache/da/75/abcdefg_i_am_cached_image.jpg in my browser, and it still showed the sorl-thumbnail cached image!
Something strange happened in S3 storage. When I checked whether abcdefg_i_am_cached_image.jpg exists under /da/75, there was no directory named 75 directly below the da folder!
In short, abcdefg_i_am_cached_image.jpg still remained in my S3 storage!
I don't know why this happens only in the production environment.
This is part of settings only for production mode.
settings.py
from .partials import *
DEBUG = False
ALLOWED_HOSTS = ["*", ]
STATICFILES_STORAGE = 'spacegraphy.storage.S3PipelineCachedStorage'
DEFAULT_FILE_STORAGE = 'storages.backends.s3boto.S3BotoStorage'
AWS_ACCESS_KEY_ID = os.environ.get("AWS_ACCESS_KEY_ID")
AWS_SECRET_ACCESS_KEY = os.environ.get("AWS_SECRET_ACCESS_KEY")
AWS_STORAGE_BUCKET_NAME = os.environ.get("AWS_STORAGE_BUCKET_NAME")
AWS_S3_CUSTOM_DOMAIN = os.environ.get("AWS_S3_CUSTOM_DOMAIN")
AWS_S3_URL_PROTOCOL = 'https'
AWS_S3_HOST = 's3-ap-northeast-1.amazonaws.com'
STATIC_URL = "https://this-is-my-example.cloudfront.net/"
INSTALLED_APPS += [
    'storages',
    'collectfast'
]
AWS_PRELOAD_METADATA = True
storage.py
from django.contrib.staticfiles.storage import CachedFilesMixin, ManifestFilesMixin
from pipeline.storage import PipelineMixin
from storages.backends.s3boto import S3BotoStorage

class S3PipelineManifestStorage(PipelineMixin, ManifestFilesMixin, S3BotoStorage):
    pass

class S3PipelineCachedStorage(PipelineMixin, CachedFilesMixin, S3BotoStorage):
    pass
After spending a couple of hours debugging, I found out that the sorl_delete signal handler is never called, but only in the production environment!
I have no idea why this happens. I think this is the main problem.
And sorry for my bad English (I'm not a native speaker). I really need your help. Thanks

Read static file in view

To integrate Django and Ember, I have decided to serve my Ember SPA in a Django view (avoids CORS issues, only one server for frontend and API, etc). I do it like this:
# urls.py
urlpatterns = [
    url(r'^admin/', include(admin.site.urls)),
    url(r'^api/', include(api_urls, namespace='api')),
    ...
    url(r'^$', views.emberapp, name='emberapp'),
    ...
]
# views.py
from django.http import HttpResponse

def emberapp(request):
    # The Ember frontend SPA index file
    # This only works in development, and is anyway hacky
    EMBER_FE_INDEX_HTML = '/absolute/path/to/my/frontend/static/fe-dist/index.html'
    # Use one name for the file handle and close it automatically
    with open(EMBER_FE_INDEX_HTML) as index_file:
        html_content = index_file.read()
    return HttpResponse(html_content)
The index.html is part of the static assets. In development this is very easy:
- The index.html is directly accessible to the Django application in the file system
- I know the absolute path to the index file
But in production things are more complex, because the static assets are not local to the Django application, but accessible on Amazon S3. I use django-storages for that.
How can I read the contents of a static file from a view, in a generic way, no matter what backend is used to store/serve the static files?
First, I don't think the way you do it is a good idea.
But, to answer your question: In your settings.py, you likely have defined the directory where Django will collect all static files.
BASE_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
STATIC_ROOT = os.path.join(BASE_DIR, 'static')
So in your view, you just need to fetch the file os.path.join(settings.STATIC_ROOT, 'index.html')
That said, you should serve index.html via the webserver, the same as your static/ files, robots.txt, favicon.ico, etc., not through Django. The webserver is much faster, uses proper caching, and it's just one line in your Nginx or Apache settings instead of an entire view function in Django.
This is my current solution. Works in development, not sure about production yet (it is a pain that you need to commit untested code to verify production-related code in Heroku)
from django.conf import settings
from django.http import HttpResponse
from django.core.files.storage import get_storage_class

FE_INDEX_HTML = 'fe/index.html'  # relative to the collectstatic directory

def emberapp(request):
    # The Ember frontend SPA index file
    # By getting the storage_class like this, we guarantee that this will work
    # no matter what backend is used for serving static files
    # Which means, this will work both in development and production
    # Make sure to run collectstatic (even in development)
    # TODO: how to use this in development without being forced to run collectstatic?
    storage_class = get_storage_class(settings.STATICFILES_STORAGE)
    # TODO: reading from a storage backend can be slow if assets are in a third-party server (like Amazon S3)
    # Maybe streaming the static file from the server would be faster?
    # No redirect to the Amazon S3 asset, please, since the Ember App needs to
    # run from the same URL as the API, otherwise you get CORS issues
    with storage_class().open(FE_INDEX_HTML) as index_file:
        html_content = index_file.read()
    return HttpResponse(html_content)
Or, to reply with a StreamingHttpResponse, which does not force Django to read the whole file into memory (and wait for it to be read):
from django.conf import settings
from django.http import StreamingHttpResponse
from django.core.files.storage import get_storage_class

def emberapp(request):
    # The Ember frontend SPA index file
    # By getting the storage_class like this, we guarantee that this will work
    # no matter what backend is used for serving static files
    # Which means, this will work both in development and production
    # Make sure to run collectstatic (even in development)
    # TODO: how to use this in development without being forced to run collectstatic?
    storage_class = get_storage_class(settings.STATICFILES_STORAGE)
    index_file = storage_class().open(FE_INDEX_HTML)
    return StreamingHttpResponse(index_file)
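As far as I understand, a file object handed to StreamingHttpResponse is consumed through plain iteration, which for file objects means line by line. If fixed-size chunks are preferable, one option is to wrap the file in a generator. The sketch below is independent of Django; iter_chunks is a hypothetical helper and the 8192-byte chunk size is an arbitrary assumption.

```python
import io


def iter_chunks(file_obj, chunk_size=8192):
    """Yield successive blocks of at most chunk_size bytes, then close the file."""
    try:
        while True:
            chunk = file_obj.read(chunk_size)
            if not chunk:
                break
            yield chunk
    finally:
        file_obj.close()


# Stand-in for the index file returned by the storage backend.
html = b"<html>" + b"x" * 20000 + b"</html>"
chunks = list(iter_chunks(io.BytesIO(html)))
print(len(chunks), sum(len(c) for c in chunks))
```

In the view this would be used as StreamingHttpResponse(iter_chunks(index_file)). Django also ships django.http.FileResponse, which is meant for streaming files.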

How to use django-cumulus for serving Static files?

I'm trying to use django-cumulus for serving files off Rackspace CloudFiles. I'm currently only trying it on my local dev server, using Django 1.4.2.
I can use cumulus's syncstatic management command to upload all my static assets successfully, but I can't seem to display them on my site with the same settings.
If my relevant settings are:
STATIC_URL = '/static/'
CUMULUS = {
'USERNAME': 'myusername',
'API_KEY': 'myapikey',
'CONTAINER': 'mycontainername',
'STATIC_CONTAINER': 'mycontainername',
}
DEFAULT_FILE_STORAGE = 'cumulus.storage.CloudFilesStorage'
STATICFILES_STORAGE = 'cumulus.storage.CloudFilesStaticStorage'
then when I run syncstatic all my apps' static files are uploaded into /mycontainername/static/, as I'd expect. But when I load a page in admin it ignores STATIC_URL and tries to serve assets from URLs like http://uniquekey....r82.cf2.rackcdn.com/path/to/file.css rather than http://uniquekey....r82.cf2.rackcdn.com/static/path/to/file.css.
Also, I can't see how to have my public (non-admin) pages use the static files on CloudFiles, rather than serving them from a local /static/ directory.
Have I missed some crucial setting, or am I doing something else wrong?
I had the same problem. What I did was:
git clone https://github.com/richleland/django-cumulus.git
edit context_processors.py
from django.conf import settings
from cumulus.storage import CloudFilesStorage

def cdn_url(request):
    """
    A context processor to expose the full cdn url in templates.
    """
    cloudfiles_storage = CloudFilesStorage()
    static_url = '/'
    container_url = cloudfiles_storage._get_container_url()
    cdn_url = container_url + static_url
    return {'CDN_URL': cdn_url}

Once you are done, install it with sudo python setup.py install
Do note that context_processors.py from django-cumulus is actually quite slow.

Django get the static files URL in view

I'm using reportlab pdfgen to create a PDF. In the PDF there is an image created by drawImage. For this I either need the URL to an image or the path to an image in the view. I managed to build the URL but how would I get the local path to the image?
How I get the URL:
prefix = 'https://' if request.is_secure() else 'http://'
image_url = prefix + request.get_host() + STATIC_URL + "images/logo_80.png"
# Older Django <3.0 (deprecated since 2.1):
from django.contrib.staticfiles.templatetags.staticfiles import static
# Django 3.0+
from django.templatetags.static import static
url = static('x.jpg')
url now contains '/static/x.jpg', assuming a static path of '/static/'
EDIT: If you're on Django >=3.0, refer to Django get the static files URL in view instead. This was answered with Django 1.X version.
dyve's answer is a good one; however, if you're using "cached storage" in your Django project and the final URL paths of the static files get hashed (such as style.aaddd9d8d8d7.css from style.css), then you can't get the precise URL with django.templatetags.static.static(). Instead, you must use the template tag from django.contrib.staticfiles to get the hashed URL.
Additionally, when using the development server, this template tag returns the non-hashed URL, so you can use this code regardless of whether the host is development or production! :)
from django.contrib.staticfiles.templatetags.staticfiles import static
# 'css/style.css' file should exist in static path. otherwise, error will occur
url = static('css/style.css')
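For context, the hashed names mentioned above are derived from the file's content. As far as I know, Django's hashed static storages take the first 12 hex characters of the MD5 digest of the content and insert them before the extension. The sketch below mimics that naming with the standard library only; hashed_name is a hypothetical helper, not Django's API, and it skips the rewriting of url() references inside CSS files.

```python
import hashlib
import posixpath


def hashed_name(name, content):
    """Return name with a 12-character content hash before the extension."""
    digest = hashlib.md5(content).hexdigest()[:12]
    root, ext = posixpath.splitext(name)
    return f"{root}.{digest}{ext}"


original = hashed_name("css/style.css", b"body { color: red; }")
modified = hashed_name("css/style.css", b"body { color: blue; }")
print(original)
print(modified)
```

Any change to the content yields a different file name, which is what makes the URL safe to cache for a long time.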
From Django 3.0 you should use from django.templatetags.static import static:
from django.templatetags.static import static
...
img_url = static('images/logo_80.png')
Here's another way! (tested on Django 1.6)
from django.contrib.staticfiles.storage import staticfiles_storage
staticfiles_storage.url(path)
Use the default static tag:
from django.templatetags.static import static
static('favicon.ico')
There is another tag in django.contrib.staticfiles.templatetags.staticfiles (as in the accepted answer), but it has been deprecated since Django 2.1 and was removed in Django 3.0.
@dyve's answer didn't work for me in the development server. Instead I solved it with find, which returns the file's path on disk. Here is the function:
from django.conf import settings
from django.contrib.staticfiles.finders import find
from django.templatetags.static import static

def get_static(path):
    if settings.DEBUG:
        return find(path)
    else:
        return static(path)
If you want to get the absolute URL (including protocol, host and port), you can use the request.build_absolute_uri function as shown below:
from django.contrib.staticfiles.storage import staticfiles_storage
self.request.build_absolute_uri(staticfiles_storage.url('my-static-image.png'))
# 'http://localhost:8000/static/my-static-image.png'
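To see roughly what build_absolute_uri does with a static URL, the standard library's urljoin gives a close approximation: a path starting with / replaces the path of the current request URL, while scheme, host and port are kept. The request URL and the CDN host below are made-up examples.

```python
from urllib.parse import urljoin

current_request_url = "http://localhost:8000/reports/42/pdf/"  # hypothetical
static_path = "/static/my-static-image.png"

# A root-relative static path is resolved against the request's scheme and host.
absolute_url = urljoin(current_request_url, static_path)
print(absolute_url)

# If the static URL is already absolute (for example a CDN), it is left as-is.
cdn_url = urljoin(
    current_request_url,
    "https://cdn.example.com/static/my-static-image.png",
)
print(cdn_url)
```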
In short, you need to get
- STATIC_URL
- STATIC_ROOT
- urlpatterns
- staticfiles
- templatetags
- url parameters
all in the right place to get this working. In addition, real deployments vary, so it is quite possible that the settings you spent 3 hours working out work on your local machine but not on the server.
So I adopted the traditional way!!
app
├── static
│ └── json
│ └── data.json
└── views.py
views.py
import os

with open(os.path.abspath(os.getcwd()) + '/app/static/json/data.json', 'r') as f:
    pass
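One caveat with os.getcwd() is that it depends on the directory the process was started from. A variant that resolves the path relative to the module's own file avoids that. The sketch below is one way to do it with the standard library; load_static_json is a hypothetical helper, and the temporary directory only recreates the app/ layout shown above for demonstration.

```python
import json
import tempfile
from pathlib import Path


def load_static_json(module_file, relative_path="static/json/data.json"):
    """Load a JSON file located relative to the directory of module_file."""
    base_dir = Path(module_file).resolve().parent
    with open(base_dir / relative_path, encoding="utf-8") as handle:
        return json.load(handle)


# Usage: build the app/ layout in a temporary directory.
with tempfile.TemporaryDirectory() as project_dir:
    app_dir = Path(project_dir) / "app"
    (app_dir / "static" / "json").mkdir(parents=True)
    (app_dir / "static" / "json" / "data.json").write_text(
        '{"ok": true}', encoding="utf-8"
    )
    views_file = app_dir / "views.py"
    views_file.write_text("", encoding="utf-8")
    data = load_static_json(views_file)

print(data)
```

Inside views.py itself the call would be load_static_json(__file__).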