File uploads in Heroku deployment with Django - django

I was finally able to set up the local and production environments for a test project I'm working on.
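# wsgi.py
from django.core.wsgi import get_wsgi_application
from dj_static import Cling, MediaCling
from whitenoise.django import DjangoWhiteNoise

# Cling serves static files, MediaCling serves uploaded media
application = Cling(MediaCling(get_wsgi_application()))
application = DjangoWhiteNoise(application)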
I set up static files using whitenoise (without any problems), media (file uploads) using dj_static, and Postgres for both local and production. Everything works fine at first: static files, file uploads.
But after the Heroku dynos restart, I lose all the file uploads. My question is: since I'm serving the media files from the Django app instead of something like S3, does the dyno restart wipe all that out too?
PS: I'm aware I can do this with AWS, etc., but I just want to know if that's the reason I'm losing all the uploads.

Since I'm serving the media files from the Django app instead of something like S3, does the dyno restart wipe all that out too?
Yes, that's right. According to the Heroku docs:
Each dyno gets its own ephemeral filesystem, with a fresh copy of the most recently deployed code.
See also this answer and this answer.
Conclusion: For media files (the uploaded ones), you must use some external service (like S3 or something). whitenoise is just for static files. See here why whitenoise is not suitable for serving user-uploaded (media) files.
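A minimal sketch of pointing media uploads at S3, assuming the django-storages package with its boto3 backend is installed. That package choice and the environment variable names are assumptions for illustration; the question itself does not use them. On Django 4.2 and later, the STORAGES setting replaces DEFAULT_FILE_STORAGE.

```python
# settings.py (fragment) -- assumes django-storages and boto3 are installed
import os

# Send user uploads (FileField / ImageField) to S3 instead of the dyno's disk.
DEFAULT_FILE_STORAGE = "storages.backends.s3boto3.S3Boto3Storage"

# Bucket name and credentials are read from config vars (names are assumptions).
AWS_STORAGE_BUCKET_NAME = os.environ.get("AWS_STORAGE_BUCKET_NAME", "")
AWS_ACCESS_KEY_ID = os.environ.get("AWS_ACCESS_KEY_ID", "")
AWS_SECRET_ACCESS_KEY = os.environ.get("AWS_SECRET_ACCESS_KEY", "")
```

With a setup along these lines, uploads no longer live on the dyno, so a restart has nothing to wipe.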

Related

Django Static Files on Heroku Dyno

I am running a Django application on Heroku and currently use AWS S3 to serve my static files. We store our static files both in per-app static folders and in a static/ folder at the root of the project. The static/ folder at the root is about 40 MB.
Whenever we deploy our app to Heroku, the static files are included in the Heroku slug so that heroku run python manage.py collectstatic --no-input can be run from the dyno itself, which then copies any changed or new static files to our S3 bucket so that they can be served.
The issue is that after we go through this process, we have a static/ folder on the dyno that takes up about 40 MB of space and is seemingly useless, since our files are being served from our S3 bucket!
Is there a better way to go about deploying our application, and collecting our static files to our S3 bucket but not copying the static files to Heroku?
One way I was thinking was to add all static files to Heroku's .slugignore file, and then configure a way to upload static files to our S3 bucket without using Heroku at all. I'm not sure if this is the correct way to go about it, however, and would appreciate advice on this.
The reason we have been looking into this is that our Heroku slug size is starting to grow far too large (~450 MB), and we need to start reducing it.
After some more digging, I found examples of people doing exactly what I was describing above, which is uploading static files directly to S3 without using any intermediary storage. This article shows how to configure Django and S3 so that running python manage.py collectstatic on your local machine will copy the static files directly to S3.
This configuration, in combination with disabling collectstatic on Heroku (https://devcenter.heroku.com/articles/django-assets#disabling-collectstatic) and adding our static files to .slugignore, would be exactly what I was looking for, which was to upload static files directly to S3 without uploading them first to Heroku.
More reading is available in Django's docs.

Django ManifestStaticFilesStorage not loading the correct static files

I am using a combination of django-storages and ManifestStaticFilesStorage to serve static files and media from S3.
class StaticStorage(ManifestFilesMixin, S3BotoStorage):
    location = settings.STATICFILES_LOCATION
When I run collectstatic I can see the newest version of my JS file on S3 with the correct timestamp.
I can also see that file being referenced in the staticfiles.json manifest.
However looking at the site in the browser I am still seeing the old JS being pulled down, not the one in the manifest
What could be going wrong?
The staticfiles.json seems to be loaded once when the server starts up (from the S3 instance). If you run collectstatic while the server is running it has no way of knowing that there were changes made to S3. You need to restart the server after running collectstatic if changes have been made.
You can read this post for more information. In short:
By default staticfiles.json will reside in STATIC_ROOT which is the
directory where all static files are collected in. We host all our
static assets on an S3 bucket which means staticfiles.json by default
would end up being synced to S3.
So if staticfiles.json is being cached, the old static files will be served.
There are two ways to fix this:
Version staticfiles.json, as you already do with your static files
Keep staticfiles.json locally instead of on S3
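To illustrate why a restart is needed, here is a toy model of a storage that reads its manifest once at startup. It is only a sketch: ManifestLookup, write_manifest and the hashed file names are hypothetical and are not Django's actual implementation.

```python
import json
import os
import tempfile


class ManifestLookup:
    """Toy stand-in for a storage that reads staticfiles.json once at startup."""

    def __init__(self, manifest_path):
        self.manifest_path = manifest_path
        self.paths = self._read()  # loaded once, then kept in memory

    def _read(self):
        with open(self.manifest_path) as fh:
            return json.load(fh)["paths"]

    def url(self, name):
        # Fall back to the unhashed name if it is missing from the manifest.
        return self.paths.get(name, name)

    def reload(self):
        # Stands in for restarting the server process.
        self.paths = self._read()


def write_manifest(path, mapping):
    """Write a manifest file in roughly the shape collectstatic produces."""
    with open(path, "w") as fh:
        json.dump({"paths": mapping, "version": "1.0"}, fh)


manifest = os.path.join(tempfile.mkdtemp(), "staticfiles.json")
write_manifest(manifest, {"js/app.js": "js/app.aaa111.js"})
lookup = ManifestLookup(manifest)

# A later collectstatic run publishes a new hashed file name...
write_manifest(manifest, {"js/app.js": "js/app.bbb222.js"})
stale = lookup.url("js/app.js")  # the running process still holds the old mapping
lookup.reload()
fresh = lookup.url("js/app.js")  # the new mapping is visible after the "restart"
```

Until reload() is called, the running process keeps handing out the old hashed name, which matches the symptom described in the question.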

Faster alternative to manage.py collectstatic (w/ s3boto storage backend) to sync static files to s3?

I have been using s3boto's S3BotoStorage as my static files backend and syncing files to my AWS S3 buckets (staging and production) using ./manage.py collectstatic. It works fine. However, it is painfully slow. In addition to my own static files (just a few) and Django admin, I have a few third-party packages with a great many static files (grappelli, django-redactor), and collectstatic can take upwards of 15 minutes each time I run it, depending on my internet connection. When I'm syncing with my staging bucket and things aren't quite right, so that I have to tweak something and re-sync, it's a big time killer. Are there any good, fast, scriptable alternatives for syncing static files to S3?
I wrote a pluggable Django app, based on a djangosnippet, that caches the ETag of the remote file and compares the cached checksum instead of performing a lookup every time. It took me from about 1m30s to around 10s per call to manage.py collectstatic for a few hundred static files. Check it out here: https://github.com/antonagestam/collectfast
Set AWS_PRELOAD_METADATA to True in your settings so it pre-loads all files on s3 before syncing and only syncs the ones that are not already there (or have changed).
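The ETag comparison idea can be sketched in a few lines. This is not collectfast's actual code; the function names and the shape of the remote cache are assumptions. It relies on the S3 ETag being the MD5 hex digest of the content, which holds for simple (non-multipart) uploads.

```python
import hashlib


def local_etag(content: bytes) -> str:
    """MD5 hex digest, which matches the S3 ETag for simple (non-multipart) uploads."""
    return hashlib.md5(content).hexdigest()


def files_to_upload(local_files, remote_etags):
    """Return the names whose content differs from what the cache says is on S3.

    local_files:  dict of name -> bytes
    remote_etags: dict of name -> ETag string (hypothetical cache of remote metadata)
    """
    changed = []
    for name, content in sorted(local_files.items()):
        cached = remote_etags.get(name, "").strip('"')  # S3 wraps ETags in quotes
        if cached != local_etag(content):
            changed.append(name)
    return changed


local = {"css/site.css": b"body{margin:0}", "js/app.js": b"console.log(1)"}
remote = {"css/site.css": '"%s"' % local_etag(b"body{margin:0}")}
pending = files_to_upload(local, remote)  # only js/app.js is missing remotely
```

Only the names returned by files_to_upload would need a network transfer, which is where the time saving comes from.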

django-compressor not setting absolute CSS image paths on Heroku

I'm using django-compressor to concatenate and compress my CSS and JS files on this site. I'm serving static files from an S3 bucket.
On my local copy of the site, using a different S3 bucket, this all works perfectly. But on the live site, hosted on Heroku, it all works except the relative URLs for images in the CSS files do not get re-written.
eg, this line in the CSS file:
background-image: url("../img/glyphicons-halflings-grey.png");
gets rewritten to:
background-image:url('https://my-dev-bucket-name.s3.amazonaws.com/static/img/glyphicons-halflings-grey.png')
on my development site, but isn't touched on the live site. So the live site ends up looking in pepysdiary.s3.amazonaws.com/static/CACHE/img/ for the images (as it's relative to the new, compressed CSS file).
For now, I've put a directory at that location containing the images, but I can't work out why there's this difference. Both sites have this in their settings:
COMPRESS_CSS_FILTERS = [
    # Creates absolute urls from relative ones.
    'compressor.filters.css_default.CssAbsoluteFilter',
    # CSS minimizer.
    'compressor.filters.cssmin.CSSMinFilter'
]
And the CSS files are being minimised just fine... but it's like the other filter isn't being applied on the live site.
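For reference, the rewriting that the absolute filter is expected to perform can be approximated as below. This is a simplified sketch, not django-compressor's implementation; absolutize_css_urls is a hypothetical helper, and the location of the source CSS file (static/css/bootstrap.css) is an assumption.

```python
import re
from urllib.parse import urljoin

# Matches url(...) with optional single or double quotes around the target.
URL_PATTERN = re.compile(r"""url\(\s*(['"]?)(.*?)\1\s*\)""")


def absolutize_css_urls(css_text, css_file_url):
    """Rewrite relative url(...) references against the CSS file's own URL."""

    def replace(match):
        target = match.group(2)
        if target.startswith(("data:", "#")):
            return match.group(0)  # leave inline data and fragments untouched
        return "url('%s')" % urljoin(css_file_url, target)

    return URL_PATTERN.sub(replace, css_text)


css = 'background-image: url("../img/glyphicons-halflings-grey.png");'
base = "https://my-dev-bucket-name.s3.amazonaws.com/static/css/bootstrap.css"
rewritten = absolutize_css_urls(css, base)
```

Because the reference is resolved against the original file's URL rather than the compressed file under CACHE/, the image path ends up under static/img/ as on the development site.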
I recently ran into this issue on heroku, and running the latest version of django-compressor (1.3) does not solve the problem. I will provide the solution that I am using, as well as an explanation of the problems I ran into along the way.
The solution
I created my own 'CssAbsoluteFilter' that removes the settings.DEBUG check from the 'find' method like this:
# compress_filters.py
from compressor.filters.css_default import CssAbsoluteFilter
from compressor.utils import staticfiles

class CustomCssAbsoluteFilter(CssAbsoluteFilter):
    def find(self, basename):
        # The line below is the original line. I removed settings.DEBUG.
        # if settings.DEBUG and basename and staticfiles.finders:
        if basename and staticfiles.finders:
            return staticfiles.finders.find(basename)

# settings.py
COMPRESS_CSS_FILTERS = [
    # 'compressor.filters.css_default.CssAbsoluteFilter',
    'app.compress_filters.CustomCssAbsoluteFilter',
    'compressor.filters.cssmin.CSSMinFilter',
]
The absolute urls now always work for me whether DEBUG = True or False.
The Problem
The issue is connected to 'compressor.filters.css_default.CssAbsoluteFilter', your DEBUG setting, and the fact that Heroku's file system is ephemeral and your app files are replaced every time you deploy.
The reason compress works correctly on your development server is because CssAbsoluteFilter will always find your static files when DEBUG = True, even if you never run 'collectstatic'. It looks for them in STATICFILES_DIRS.
When DEBUG = False on your production server, CssAbsoluteFilter assumes that static files have already been collected into your COMPRESS_ROOT and won't apply the absolute filter if it can't find the files.
Jezdez, django-compressor's author, explains it like this:
the CssAbsoluteFilter works with DEBUG = False if you've successfully provided the files to work with. During development compressors uses the staticfiles finder as a convenience so you don't have to run collectstatic all the time.
Now for Heroku. Even though you are storing your static files on S3, you need to also store them on Heroku (using CachedS3BotoStorage). Since files written to a running dyno do not persist, the only way to do this is to let Heroku collect your static files automatically during deployment (see https://devcenter.heroku.com/articles/django-assets).
In my experience, running 'heroku run python manage.py collectstatic --noinput' manually or even in your Procfile will upload the files to S3, but it will NOT save the files to your STATIC_ROOT directory (which compressor uses by default as the COMPRESS_ROOT). You can confirm that your static files have been collected on heroku using 'heroku run ls path/to/collected'.
If your files have been collected on heroku successfully, you should be able to compress your files successfully as well, without the solution I provided above.
However, it seems heroku will only collect static files if you have made changes to your static files since your last deploy. If no changes have been made to your static files, you will see something like "0 of 250 static files copied". This is a problem because heroku completely replaces your app contents when you deploy, so you lose whatever static files were previously collected in COMPRESS_ROOT/STATIC_ROOT. If you try to compress your files after the collected files no longer exist on heroku and DEBUG = False, the CssAbsoluteFilter will not replace the relative urls with absolute urls.
My solution above avoids the heroku problem altogether, and replaces relative css urls with absolute urls even when DEBUG = False.
Hopefully other people will find this information helpful as well.
I've had this exact same problem for a month now, but this is fixed in the 1.3 release (3/18/13, so you were probably on 1.2), so just upgrade:
pip install -U django-compressor
The exact problem I gave up on working out, but it's related to Heroku and CssAbsoluteFilter being called but failing at _converter method. Looking at the 1.3 changelog, the only related commit is this: https://github.com/jezdez/django_compressor/commit/8254f8d707f517ab154ad0d6d77dfc1ac292bf41
I gave up there, life's too short.
Meanwhile this has been fixed in django-compressor 1.6. From the changelog:
Apply CssAbsoluteFilter to precompiled css even when compression is disabled
i.e. the absolute filter is run even with DEBUG = True.

Using collectstatic with multiple environments

I have a Django app on Heroku, with staging and production environments. Static files are hosted on S3. I'm streamlining my deployment process and plan to set up fabfiles once I have things working manually.
How can I configure collectstatic to push to multiple places? If I run it locally, it uses my dev settings (with a local STATIC_ROOT). If I run it on one of my Heroku apps (heroku run ./manage.py collectstatic), then it can't grab the files (since .slugignore ensures they're never pushed to Heroku). The same applies if I include collectstatic in my Procfile.
I'm also using django-pipeline, though it's not yet doing much since I'm stuck on the collectstatic bit.
UPDATE
In response to Marat's question, I tried passing a settings file as an option to collectstatic: ./manage.py collectstatic --settings=project.settings.prod, but got an error: Unknown command: 'collectstatic'. I checked on the server, though: INSTALLED_APPS does include django.contrib.staticfiles, and I can also run collectstatic remotely, so I'm not sure what would cause that.
You can set the environment variable DJANGO_SETTINGS_MODULE so you don't need to specify --settings everywhere:
heroku config:set DJANGO_SETTINGS_MODULE=project.settings.prod
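A small sketch of why this works: a typical manage.py or wsgi.py only fills in a default with os.environ.setdefault, so a config var that is already present wins. The project.settings.dev default below is a hypothetical name.

```python
import os

# Simulate the Heroku config var being present in the dyno's environment.
os.environ["DJANGO_SETTINGS_MODULE"] = "project.settings.prod"

# What a typical manage.py / wsgi.py does: it only supplies a fallback value.
os.environ.setdefault("DJANGO_SETTINGS_MODULE", "project.settings.dev")

# The value set through the environment is the one Django would load.
active = os.environ["DJANGO_SETTINGS_MODULE"]
```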
First, if you are going to serve static via CloudFront, you can use custom origin and always use local STATIC_ROOT. Actually it has some advantages over S3 source, eg gzip support.
Another good thing you can do is keep environment-dependent settings in a separate file and import it in settings.py, e.g.:
local_settings.py (not in the project repository, though you can commit a local_settings.py.example):
# environment-dependent settings
DATABASES = { .. }
CACHES = { .. }
STATIC_ROOT = 'your_path/static'
settings.py:
# pull the environment-dependent names into the settings namespace
from local_settings import *
I've just replied to a similar question, Upload Media from Heroku to Amazon S3. If you customise your settings to take environment variables into account, you can use filesystem storage backends locally and S3 storage backends when pushing to Heroku. This will collect and upload your static files when your slug is compiled.
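One way such environment-aware settings could look is sketched below. USE_S3 is a hypothetical flag (it could be set with heroku config:set USE_S3=1), storage_settings is an illustrative helper, and the S3 backend path assumes django-storages with boto3.

```python
def storage_settings(environ):
    """Pick storage backends based on environment variables.

    USE_S3 is a hypothetical flag; anything other than "1" means local files.
    """
    if environ.get("USE_S3") == "1":
        return {
            "DEFAULT_FILE_STORAGE": "storages.backends.s3boto3.S3Boto3Storage",
            "STATICFILES_STORAGE": "storages.backends.s3boto3.S3Boto3Storage",
            "AWS_STORAGE_BUCKET_NAME": environ.get("AWS_STORAGE_BUCKET_NAME", ""),
        }
    return {
        "DEFAULT_FILE_STORAGE": "django.core.files.storage.FileSystemStorage",
        "STATICFILES_STORAGE": "django.contrib.staticfiles.storage.StaticFilesStorage",
    }


# Local development: no flag set, so the filesystem backends are chosen.
local = storage_settings({})

# Heroku: the flag and bucket name come from config vars.
prod = storage_settings({"USE_S3": "1", "AWS_STORAGE_BUCKET_NAME": "my-bucket"})
```

In settings.py the returned values would be assigned to the module-level names, for example by passing os.environ and unpacking the result with globals().update(...).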