Editing files on Amazon S3 with collectstatic - django

I'm using Amazon S3 to serve my static files. Everything has been set up and when I initially created my CSS files and ran
python manage.py collectstatic
it informed me that everything went fine and my CSS file was copied. Sure enough when viewing the bucket in the browser, it was there. When I edit the file locally and re-run collect static it tells me no static files were copied but 73 were modified. When I check in the browser the changes aren't present in the CSS file; it just looks like the initial version I created.
I figured it'd be a permission error and when I checked I noticed everyone didn't have edit permissions (I know I shouldn't let everyone edit it, but I just want to get it working for the moment). I changed it so everyone could edit, view and download and tried to recollect static but to no avail. The file hadn't been edited.
Am I missing something?

The reason for this was pretty bizarre. Effectively the problem was timezones. It thought the files on S3 were younger than the files locally due to time difference discrepancies. I fixed this by editing the TIME_ZONE variable in settings.py with the following:
TIME_ZONE = None

It's really weird but changing the TIME_ZONE to None worked for me as well.

I'm using Django 1.10.6 and I used this tutorial to get static files working on S3.
For me TIME_ZONE = None didn't work. But this works for me:
TIME_ZONE = 'UTC'
So I made a settings/collectstatic.py file to run local and sync s3 files to production.

Related

Django collectstatic keeps waiting when run through Github Action

We are facing a very weird issue. We ship a django application in a docker container through Github Actions on each push. Everything is working fine except collectstatic.
We have the following lines at the end of our CD github action:
docker exec container_name python manage.py migrate --noinput
docker exec container_name python manage.py collectstatic --noinput
migrate works perfectly fine, but collectstatic just keeps on waiting if ran through the github action. If I run the command directly on the server then it works just fine and completes with in few minutes.
Can someone please help me figuring out what could be the issue?
Thanks in advance.
Now I am far from the most experienced but I did this recently and I have some suggestions of where to look. I'm definitely not the greatest authority though.
I wasn't using docker so I can't say anything about that. From the issues, I had here are some suggestions I can recommend to try.
Take note that all of this was for a self-hosted runner. Things would be very different otherwise.
Check to make sure STATIC_ROOT and MEDIA_ROOT variables are set correctly in the settings file.
If the STATIC and MEDIA root variables are environment variables make sure you are serving the correct environment variables file like a .env file which I used.
I used django-environ to serve my environment variables. From the docs, it says to have the .env file in the same directory as the settings file. Well if you are putting the project on a production server with github actions, you won't be able to put the .env file anywhere in the project because it will get overwritten every time new code is pushed.
So to fix that you need to specify the correct .env file from somewhere else on the server. Do that by specifying ENV_PATH.
https://django-environ.readthedocs.io/en/latest/
Under the section Multiple env files
Another resource that was helpful:
https://github.com/joke2k/django-environ/issues/143
I set up my settings file like how they did there.
I put my .env file in a proj directory I made in the virtualenvironment folder for the project.
I don't know if it's a good place to put it but that's how I did it. I didn't find much great info online for this stuff. Had to figure out a lot on my own.
Make sure the user which is running the github action has permissions to read the .env file.
Also like .env file, if you have the static files being collected into the base directory of your project you might have an issue with github actions overwriting those files every time new code is pushed. If you have a media directory where the user uploads files to then that will really be an issue because those files won't get overwritten. They'll just disappear.
Now if this was an issue it shouldn't cause github actions to just get stuck on the collect static command. It would just cause files to get overwritten every time the workflow runs and the media files will disappear.
If you do change the directory of where the static and media files are located as stated before, make sure all the variables for the paths are correct in the settings file and the .env file.
You will also need to update the nginx config file for the static and media root directories if you used nginx. Not sure about how apache does this.
You can do that with this command:
sudo nano /etc/nginx/sites-available/myproject
Don't forget to restart the nginx server after doing that.
If you are writing static and media files at a different location from the base project directory on the server, also check permissions on those directories. Make sure the user running the github action has permissions to write to those directories. I suspect that might cause it to hang but it very well might just cause an error.
Check all the syntax in the github actions yml file. Make sure everything is correct and it's not hanging cause it had an incomplete command or something like that.
But yeah, that's some things I had to take a look at. Honestly, none of this might be relevant for you. All of these issues should cause an error somewhere for the most part.
I couldn't really offer many external resources for you to look deeper into this because I'm just speaking from personal experience.
Hope I could help.
Heres my github repo for the project I did: https://github.com/pkudlanov/personal-portfolio-django
I hosted it on digitalocean on a linux server using nginx and gunicorn.

ManifestStaticFilesStorage - still fetching old static file even when staticfiles.json shows the right mapping

I am using using ManifestStaticFilesStorage. After performing collectstatic and moving over the files to prod, I still see old css file (with old MD5 hash string) being fetched.
settings.py:
STATICFILES_STORAGE = 'django.contrib.staticfiles.storage.ManifestStaticFilesStorage'
On prod, I can see staticfiles.json showing the correct (new) css file but still when I do 'view-source' from the web page, I can see old css being fetched.
What could I have missed?
Restart Django on prod after moving the files over.
(I don't know why that's required but thats the only thing that works)
Update (Aug 2022): I ran into the same problem after 5 years and bumped into my own post. This time restarting Django also did not help. But restarting Nginx helped (got the hint from #patrickm96's answer) - in case it helps
I had the same problem and restarting gunicorn helped me

Make 'collectstatic' find updated files

Is there a way to make python manage.py collectstatic find updated static files? Currently, it is properly searching STATICFILES_DIRS and finding where I have my static files, but it only uploads new ones. If I modify a static file, it does not detect this. Does Django do this so we have to delete each file first, or is there an easy solution?
UPDATE:
Disclaimer - This issue has to do with external hosting on Amazon's S3 Storage
I had simply forgot to include AWS_PRELOAD_METADATA = True in my settings.py file.
Adding this setting in fixed the issue of collectstatic only finding new files. Also, I saw a major speed increase in syncing between the server and Amazon's S3.
If you are using S3 for storage, I found this answer to be quite helpful.

django-compressor not setting absolute CSS image paths on Heroku

I'm using django-compressor to concatenate and compress my CSS and JS files on this site. I'm serving static files from an S3 bucket.
On my local copy of the site, using a different S3 bucket, this all works perfectly. But on the live site, hosted on Heroku, it all works except the relative URLs for images in the CSS files do not get re-written.
eg, this line in the CSS file:
background-image: url("../img/glyphicons-halflings-grey.png");
gets rewritten to:
background-image:url('https://my-dev-bucket-name.s3.amazonaws.com/static/img/glyphicons-halflings-grey.png')
on my development site, but isn't touched on the live site. So the live site ends up looking in pepysdiary.s3.amazonaws.com/static/CACHE/img/ for the images (as it's relative to the new, compressed CSS file).
For now, I've put a directory at that location containing the images, but I can't work out why there's this difference. Both sites have this in their settings:
COMPRESS_CSS_FILTERS = [
# Creates absolute urls from relative ones.
'compressor.filters.css_default.CssAbsoluteFilter',
# CSS minimizer.
'compressor.filters.cssmin.CSSMinFilter'
]
And the CSS files are being minimised just fine... but it's like the other filter isn't being applied on the live site.
I recently ran into this issue on heroku, and running the latest version of django-compressor (1.3) does not solve the problem. I will provide the solution that I am using, as well as an explanation of the problems I ran into along the way.
The solution
I created my own 'CssAbsoluteFilter' that removes the settings.DEBUG check from the 'find' method like this:
# compress_filters.py
from compressor.filters.css_default import CssAbsoluteFilter
from compressor.utils import staticfiles
class CustomCssAbsoluteFilter(CssAbsoluteFilter):
def find(self, basename):
# The line below is the original line. I removed settings.DEBUG.
# if settings.DEBUG and basename and staticfiles.finders:
if basename and staticfiles.finders:
return staticfiles.finders.find(basename)
# settings.py
COMPRESS_CSS_FILTERS = [
# 'compressor.filters.css_default.CssAbsoluteFilter',
'app.compress_filters.CustomCssAbsoluteFilter',
'compressor.filters.cssmin.CSSMinFilter',
]
The absolute urls now always work for me whether DEBUG = True or False.
The Problem
The issue is connected to 'compressor.filters.css_default.CssAbsoluteFilter', your DEBUG setting, and the fact that heroku has a read-only file system and overwrites your app files every time you deploy.
The reason compress works correctly on your development server is because CssAbsoluteFilter will always find your static files when DEBUG = True, even if you never run 'collectstatic'. It looks for them in STATICFILES_DIRS.
When DEBUG = False on your production server, CssAbsoluteFilter assumes that static files have already been collected into your COMPRESS_ROOT and won't apply the absolute filter if it can't find the files.
Jerdez, django-compressor's author, explains it like this:
the CssAbsoluteFilter works with DEBUG = False if you've successfully provided the files to work with. During development compressors uses the staticfiles finder as a convenience so you don't have to run collectstatic all the time.
Now for heroku. Even though you are storing your static files on S3, you need to also store them on heroku (using CachedS3BotoStorage). Since heroku is a read-only file system, the only way to do this is to let heroku collect your static files automatically during deployment (see https://devcenter.heroku.com/articles/django-assets).
In my experience, running 'heroku run python manage.py collectstatic --noinput' manually or even in your Procfile will upload the files to S3, but it will NOT save the files to your STATIC_ROOT directory (which compressor uses by default as the COMPRESS_ROOT). You can confirm that your static files have been collected on heroku using 'heroku run ls path/to/collected'.
If your files have been collected on heroku successfully, you should be able to compress your files successfully as well, without the solution I provided above.
However, it seems heroku will only collect static files if you have made changes to your static files since your last deploy. If no changes have been made to your static files, you will see something like "0 of 250 static files copied". This is a problem because heroku completely replaces your app contents when you deploy, so you lose whatever static files were previously collected in COMPRESS_ROOT/STATIC_ROOT. If you try to compress your files after the collected files no longer exist on heroku and DEBUG = False, the CssAbsoluteFilter will not replace the relative urls with absolute urls.
My solution above avoids the heroku problem altogether, and replaces relative css urls with absolute urls even when DEBUG = False.
Hopefully other people will find this information helpful as well.
I've had this exact same problem for a month now, but this is fixed in the 1.3 release (3/18/13, so you were probably on 1.2), so just upgrade:
pip install -U django-compressor
The exact problem I gave up on working out, but it's related to Heroku and CssAbsoluteFilter being called but failing at _converter method. Looking at the 1.3 changelog, the only related commit is this: https://github.com/jezdez/django_compressor/commit/8254f8d707f517ab154ad0d6d77dfc1ac292bf41
I gave up there, life's too short.
Meanwhile this has been fixed in django-compressor 1.6. From the changelog:
Apply CssAbsoluteFilter to precompiled css even when compression is disabled
i.e. the absolute filter is run even with DEBUG = True.

Django collectstatic from Heroku pushes to S3 everytime

I'm using django-storages for static files with S3 (and S3BotoStorage). When I do collectstatic from my local machine, the behaviour is as expected, where only modified files are pushed to S3. This process needs python-dateutils 1.5 to check for modified time.
However, doing the same on Heroku results in every file being pushed regardless, although the setup is the same. I then looked into the modified time of the files on Heroku itself, and it seems like, the os.stat(static_filename).st_mtime is the same as the time of the last push.
Is this expected behaviour? Does heroku copy around files even when there is no change from git?
Try setting DISABLE_COLLECTSTATIC=1 as an environment setting for your app - that should disable it from running on every push.
See this article for details - https://devcenter.heroku.com/articles/django-assets :
> Sometimes, you may not want Heroku to run collectstatic on your behalf.
> You can disable collectstatic by enabling user-env-compile as well:
$ heroku labs:enable user-env-compile
$ heroku config:set DISABLE_COLLECTSTATIC=1
I've found that simply setting the config will do - no need to also enable user-env-compile - it may be that that this has passed from labs into production?
NB the deployment is managed by the Heroku python buildpack, which you can see here - https://github.com/heroku/heroku-buildpack-python/
EDIT 1
I've just done a bunch of tests on this, and can confirm that DISABLE_COLLECTSTATIC does indeed disable collectstatic, irrespective of the user-env-compile setting - I think that's now in the main trunk (but that's speculation). Doesn't seem to care what the setting is - if DISABLE_COLLECTSTATIC exists as a config var it is used.
I strongly recommend using the collectfast package for any django static deployment to s3, whether local or from your heroku server. It ignores modified dates and utilizes md5 hashes, which the s3 api will provides very quickly, and (optional) caching to make your static deployments zoom. It took my static deployments from ~10-15 minutes to < 2 minutes and only deploys the files that have actually changed.
I've just had that exact same issue and contacted Heroku's support to find out what is going on. My question to them was
I've run into a funky issue doing some deployments. It appears that on each push the date modified on all files is updated to the time a new deploy/git push happens. Is this intended behaviour?
When considering that Django's collectstatic command only checks the modified date on files when evaluating if the file should be copied across to the final storage backend for static assets, it means that on each new push, all files are first removed from the remote storage (in this case S3) and then re-uploaded. This is both a very slow and wasteful process in terms of bandwidth consumed and requests made.
The answer I received today from "Caio", one of Heroku's support staff, was
Hi, that's how it currently works, yes. I'm routing your feedback to our runtime team to see if we can package files with their original dates.
As confirmed by Alen, Heroku changes the modified date of the files when it deploys. However, Amazon S3 also has an attribute called etag that is an md5 hash of the file content. It's possible to use this to check if the files have changed instead of the modified date, as implemented in this Django snippet.
I took that code, packaged it and fixed some errors I found and put it on Github as django-s3-collectstatic. It includes a new management command fasts3collectstatic that only uploads new files. Check the Github page for installation instructions.
Why not run collectstatic from local machine?
python manage.py collectstatic --noinput --settings=settings.[prod]
I agree this is annoying- there's a couple things you can do. I override the collectstatic command and wire it up in my production settings. Below is the command I use:
```
from django.core.management.base import BaseCommand
class Command(BaseCommand):
args = '< none >'
help = "disables collectstatic cmd in contrib"
def handle(self, *args, **kwargs):
print 'collectstatic disabled'
```
I keep this in mysite/disablecollectstatic/management/commands
Then in production settings:
INSTALLED_APPS += ('mysite.disablecollectstatic',)
Alternatively you could use the fact that Heroku does a dry run first before actually invoking the command. If it fails, it won't run it, which means you could contrive an error (by maybe deleting the static root in your settings, for example) but this approach makes me nervous:
https://devcenter.heroku.com/articles/django-assets#detection