Django collectstatic keeps waiting when run through Github Action - django

We are facing a very weird issue. We ship a django application in a docker container through Github Actions on each push. Everything is working fine except collectstatic.
We have the following lines at the end of our CD github action:
docker exec container_name python manage.py migrate --noinput
docker exec container_name python manage.py collectstatic --noinput
migrate works perfectly fine, but collectstatic just keeps waiting when it is run through the GitHub Action. If I run the command directly on the server, it works just fine and completes within a few minutes.
Can someone please help me figure out what the issue could be?
Thanks in advance.

Now I am far from the most experienced but I did this recently and I have some suggestions of where to look. I'm definitely not the greatest authority though.
I wasn't using Docker, so I can't say anything about that. Based on the issues I had, here are some things I'd recommend trying.
Take note that all of this was for a self-hosted runner. Things would be very different otherwise.
Check to make sure STATIC_ROOT and MEDIA_ROOT variables are set correctly in the settings file.
If the STATIC and MEDIA root variables are environment variables make sure you are serving the correct environment variables file like a .env file which I used.
I used django-environ to serve my environment variables. From the docs, it says to have the .env file in the same directory as the settings file. Well if you are putting the project on a production server with github actions, you won't be able to put the .env file anywhere in the project because it will get overwritten every time new code is pushed.
So to fix that you need to specify the correct .env file from somewhere else on the server. Do that by specifying ENV_PATH.
https://django-environ.readthedocs.io/en/latest/
Under the section Multiple env files
Another resource that was helpful:
https://github.com/joke2k/django-environ/issues/143
I set up my settings file like how they did there.
I put my .env file in a proj directory I made in the virtualenvironment folder for the project.
I don't know if it's a good place to put it but that's how I did it. I didn't find much great info online for this stuff. Had to figure out a lot on my own.
Make sure the user which is running the github action has permissions to read the .env file.
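The .env points above can be sketched roughly as follows. This is a stdlib-only illustration, not the django-environ API: it reads the file location from an ENV_PATH environment variable and fails loudly if the file is missing or unreadable by the user running the process. The function name `resolve_env_file` and the fallback value are assumptions made for the example.

```python
import os
from pathlib import Path


def resolve_env_file(default: str = ".env") -> Path:
    """Return the .env path from ENV_PATH (or a default), checking it is readable."""
    # ENV_PATH is the variable named in the answer; the default is an assumption.
    env_file = Path(os.environ.get("ENV_PATH", default)).expanduser()
    if not env_file.is_file():
        raise FileNotFoundError(f"env file not found: {env_file}")
    # os.access checks permissions for the user running this process,
    # e.g. the account the GitHub Actions runner uses.
    if not os.access(env_file, os.R_OK):
        raise PermissionError(f"env file not readable: {env_file}")
    return env_file
```

With django-environ installed, the returned path could then be handed to `environ.Env.read_env()` in the settings file.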
Also like .env file, if you have the static files being collected into the base directory of your project you might have an issue with github actions overwriting those files every time new code is pushed. If you have a media directory where the user uploads files to then that will really be an issue because those files won't get overwritten. They'll just disappear.
Now if this was an issue it shouldn't cause github actions to just get stuck on the collect static command. It would just cause files to get overwritten every time the workflow runs and the media files will disappear.
If you do change the directory of where the static and media files are located as stated before, make sure all the variables for the paths are correct in the settings file and the .env file.
You will also need to update the nginx config file for the static and media root directories if you used nginx. Not sure about how apache does this.
You can do that with this command:
sudo nano /etc/nginx/sites-available/myproject
Don't forget to restart the nginx server after doing that.
If you are writing static and media files at a different location from the base project directory on the server, also check permissions on those directories. Make sure the user running the github action has permissions to write to those directories. I suspect that might cause it to hang but it very well might just cause an error.
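A quick way to test the permission theory is to run a small check as the same user the GitHub Action uses. The sketch below is only an illustration; `unwritable_dirs` is a made-up helper name, and the directories you pass in would be your own STATIC_ROOT and MEDIA_ROOT values.

```python
import os
from pathlib import Path


def unwritable_dirs(*dirs: str) -> list:
    """Return the given directories that are missing or not writable by this user."""
    problems = []
    for d in dirs:
        path = Path(d)
        # Creating files inside a directory needs both write and execute (search) permission.
        if not path.is_dir() or not os.access(path, os.W_OK | os.X_OK):
            problems.append(str(path))
    return problems
```

An empty list means the current user can write to every directory given; anything returned is worth a look with `ls -ld`.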
Check all the syntax in the GitHub Actions yml file. Make sure everything is correct and that it's not hanging because of an incomplete command or something like that.
But yeah, that's some things I had to take a look at. Honestly, none of this might be relevant for you. All of these issues should cause an error somewhere for the most part.
I couldn't really offer many external resources for you to look deeper into this because I'm just speaking from personal experience.
Hope I could help.
Here's my GitHub repo for the project I did: https://github.com/pkudlanov/personal-portfolio-django
I hosted it on digitalocean on a linux server using nginx and gunicorn.

Related

Django Lightsail - attempt to write a readonly database

I am trying to deploy my Django app on AWS Lightsail.
When I try to login/register, I am getting this error:
Attempt to write a readonly database
I have been googling solutions for quite some time and have tried setting different permissions, even giving away all permissions which might be huge security risk, however it still doesn't work.
Could anyone help me.
Check that your file is owned by bitnami:bitnami. (I've been having the exact same issue as you and I haven't been able to fix it either.)
So, in case someone else has this problem, what should work and what worked for me:
I moved db.sqlite3 file to one folder outside of main project dir.
Then I changed address to this file in settings.py to os.path.join(BASE_DIR, '..', 'db.sqlite3')
Though I feel like it's really a problem of user permissions, but that's above my current skill level.
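For reference, the settings change described in this answer might look like the sketch below. It is wrapped in a function (a hypothetical name, `sqlite_database_settings`) only so that it runs on its own; in a real settings.py the same `os.path.join` expression would go straight into DATABASES.

```python
import os


def sqlite_database_settings(base_dir: str) -> dict:
    """Build a DATABASES dict whose SQLite file sits one level above base_dir."""
    db_path = os.path.normpath(os.path.join(base_dir, "..", "db.sqlite3"))
    return {
        "default": {
            "ENGINE": "django.db.backends.sqlite3",
            "NAME": db_path,
        }
    }
```

In settings.py this would be used as `DATABASES = sqlite_database_settings(BASE_DIR)`.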
I found another solution that worked for me.
See: https://github.com/mchesler613/django_adventures/blob/main/deploy_django_aws_bitnami.md
The author goes into detail about the cause of the error.
See "Error: Attempt to write a readonly database"
Change group ownership of the project root directory and the database file to daemon. For example:
$ sudo chgrp daemon project_directory project_directory/database_file
Make the project root directory and the database file writable by the daemon group. For example:
$ sudo chmod g+w project_directory project_directory/database_file
To see if the database error goes away, try reloading the Django app on the browser.
I had the same problem. Changing the mode of db.sqlite3 to writable didn't work for me. I could use the Django shell to add data to db.sqlite3, but it didn't work from apache2.
Finally, I changed the owner of the directory where db.sqlite3 is located to www-data:www-data, and it worked.
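A likely reason the directory matters: SQLite creates its journal files next to the database file, so the web server user needs write access to the directory as well as to the file. The sketch below is one way to check both from Python, run as the user the web server runs as (for example www-data or daemon); the function name is made up for illustration.

```python
import os


def sqlite_write_problems(db_path: str) -> list:
    """List reasons the current user could not write to the SQLite database."""
    problems = []
    db_dir = os.path.dirname(os.path.abspath(db_path))
    if not os.access(db_path, os.W_OK):
        problems.append(f"database file not writable: {db_path}")
    # SQLite creates its journal files beside the database file.
    if not os.access(db_dir, os.W_OK | os.X_OK):
        problems.append(f"database directory not writable: {db_dir}")
    return problems
```

An empty list means both the file and its directory are writable for the current user.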

running nginx/uwsgi/mysql/django in docker container

I have a docker image for running a django app. If I mount the dir containing the django app when I create the container it works fine. But I want to make the image self-contained and not dependent on the local file system. So I changed the Dockerfile to copy the dir containing the django app from the host machine into the image. But then, when I create the container (without mounting the dir) I get permission denied on all accesses to that dir (e.g. the socket, the static files, ...). Everything is world readable and executable. Anyone have any clues as to what could be causing this?
I ended up fixing it. Turned out one of the dirs in the path was not readable. That is, the django app was in /foo/bar/baz and although /foo and /foo/bar/baz were readable, /foo/bar was not. Once I chmod-ed that all was well.
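To find this kind of problem without checking each directory by hand, something like the following sketch could be run inside the container. It walks up from the target path and reports every ancestor directory that lacks the world read or execute bits; the helper name `blocked_ancestors` is an assumption for the example, and it only looks at the "other" permission bits, matching the "world readable and executable" expectation in the question.

```python
import stat
from pathlib import Path


def blocked_ancestors(target: str) -> list:
    """Return ancestor directories of target that lack world read or execute bits."""
    blocked = []
    for parent in Path(target).resolve().parents:
        mode = parent.stat().st_mode
        # A process needs the execute (search) bit on every directory in the path
        # to reach files below it; the read bit is needed to list its contents.
        if not (mode & stat.S_IXOTH and mode & stat.S_IROTH):
            blocked.append(str(parent))
    return blocked
```

For the layout in the answer, `blocked_ancestors("/foo/bar/baz/manage.py")` would be expected to include `/foo/bar`.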

heroku django tutorial not working?

I'm trying to work through the "Getting Started with Django" heroku tutorial. Things are working when I run the framework locally with foreman, but when I try to run it on Heroku, it is failing when trying to find the settings module.
In a nutshell, I've done...
chris@xi:~: mkdir hero
chris@xi:~: cd hero
chris@xi:~/hero: django-startproject ku .
and then I create and edit the files per the instructions. Perhaps I have gotten something wrong?
in ~/hero, I created my Procfile and requirements.txt, as well as the ku directory that was created by django-admin
in ~/hero/ku I have settings.py and wsgi.py (created by django-admin and edited by me)
any idea what I'm not doing correctly?
Found it after much trial and error. It turns out that I had an old .gitignore file in my home directory that ignored settings.py. Once I added settings.py to the project and re-pushed, everything started working.
I don't like putting settings.py into git because it contains machine-specific information, as well as security information. But I guess I will leave it for now, and when I have to do it for real, I can use the heroku config:set command to set up things like database users and passwords, or paths to DJANGO_SETTINGS_FILE, etc.
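Values set with `heroku config:set` are exposed to the app as environment variables, so settings.py can read them instead of hard-coding secrets. A minimal sketch, assuming a PostgreSQL database and the made-up variable names DB_NAME, DB_USER, DB_PASSWORD and DB_HOST:

```python
import os


def database_settings(environ=os.environ) -> dict:
    """Read database credentials from environment variables instead of settings.py."""
    # DB_NAME, DB_USER, DB_PASSWORD and DB_HOST are hypothetical variable names.
    return {
        "default": {
            "ENGINE": "django.db.backends.postgresql",
            "NAME": environ["DB_NAME"],
            "USER": environ["DB_USER"],
            "PASSWORD": environ["DB_PASSWORD"],
            "HOST": environ.get("DB_HOST", "localhost"),
        }
    }
```

In settings.py this would be `DATABASES = database_settings()`; a missing required variable raises a KeyError at startup rather than failing later.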

Django collectstatic from Heroku pushes to S3 everytime

I'm using django-storages for static files with S3 (and S3BotoStorage). When I run collectstatic from my local machine, the behaviour is as expected: only modified files are pushed to S3. This process needs python-dateutil 1.5 to check the modified time.
However, doing the same on Heroku results in every file being pushed regardless, although the setup is the same. I then looked into the modified time of the files on Heroku itself, and it seems like, the os.stat(static_filename).st_mtime is the same as the time of the last push.
Is this expected behaviour? Does heroku copy around files even when there is no change from git?
Try setting DISABLE_COLLECTSTATIC=1 as an environment setting for your app - that should disable it from running on every push.
See this article for details - https://devcenter.heroku.com/articles/django-assets :
> Sometimes, you may not want Heroku to run collectstatic on your behalf.
> You can disable collectstatic by enabling user-env-compile as well:
$ heroku labs:enable user-env-compile
$ heroku config:set DISABLE_COLLECTSTATIC=1
I've found that simply setting the config will do, with no need to also enable user-env-compile. It may be that this has passed from labs into production?
NB the deployment is managed by the Heroku python buildpack, which you can see here - https://github.com/heroku/heroku-buildpack-python/
EDIT 1
I've just done a bunch of tests on this, and can confirm that DISABLE_COLLECTSTATIC does indeed disable collectstatic, irrespective of the user-env-compile setting - I think that's now in the main trunk (but that's speculation). Doesn't seem to care what the setting is - if DISABLE_COLLECTSTATIC exists as a config var it is used.
I strongly recommend using the collectfast package for any Django static deployment to S3, whether local or from your Heroku server. It ignores modified dates and uses MD5 hashes, which the S3 API provides very quickly, plus (optional) caching to make your static deployments zoom. It took my static deployments from ~10-15 minutes to < 2 minutes, and it only deploys the files that have actually changed.
I've just had that exact same issue and contacted Heroku's support to find out what is going on. My question to them was:
> I've run into a funky issue doing some deployments. It appears that on each push the date modified on all files is updated to the time a new deploy/git push happens. Is this intended behaviour?
> When considering that Django's collectstatic command only checks the modified date on files when evaluating if the file should be copied across to the final storage backend for static assets, it means that on each new push, all files are first removed from the remote storage (in this case S3) and then re-uploaded. This is both a very slow and wasteful process in terms of bandwidth consumed and requests made.
The answer I received today from "Caio", one of Heroku's support staff, was:
> Hi, that's how it currently works, yes. I'm routing your feedback to our runtime team to see if we can package files with their original dates.
As confirmed by Alen, Heroku changes the modified date of the files when it deploys. However, Amazon S3 also has an attribute called etag that is an md5 hash of the file content. It's possible to use this to check if the files have changed instead of the modified date, as implemented in this Django snippet.
I took that code, packaged it and fixed some errors I found and put it on Github as django-s3-collectstatic. It includes a new management command fasts3collectstatic that only uploads new files. Check the Github page for installation instructions.
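The ETag comparison can be sketched in a few lines. This is not the code from django-s3-collectstatic, just an illustration of the idea with made-up function names. It assumes the object was uploaded in a single PUT: for multipart uploads, and for some encrypted objects, the ETag is not a plain MD5 of the content, so the check would always report a difference.

```python
import hashlib


def file_md5(path: str, chunk_size: int = 65536) -> str:
    """Return the hex MD5 digest of a local file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_upload(path: str, remote_etag) -> bool:
    """True if the local file differs from the object described by remote_etag."""
    if remote_etag is None:  # object does not exist remotely yet
        return True
    # S3 returns the ETag wrapped in double quotes.
    return file_md5(path) != remote_etag.strip('"')
```

Because the comparison uses content rather than timestamps, it is unaffected by Heroku resetting the modified time on every deploy.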
Why not run collectstatic from local machine?
python manage.py collectstatic --noinput --settings=settings.[prod]
I agree this is annoying. There are a couple of things you can do. I override the collectstatic command and wire it up in my production settings. Below is the command I use:
```
from django.core.management.base import BaseCommand


class Command(BaseCommand):
    help = "disables collectstatic cmd in contrib"

    def handle(self, *args, **kwargs):
        # No-op stand-in for the built-in collectstatic command.
        self.stdout.write('collectstatic disabled')
```
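I keep this in mysite/disablecollectstatic/management/commands (the file has to be named collectstatic.py, since Django takes the command name from the file name).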
Then in production settings:
INSTALLED_APPS += ('mysite.disablecollectstatic',)
Alternatively you could use the fact that Heroku does a dry run first before actually invoking the command. If it fails, it won't run it, which means you could contrive an error (by maybe deleting the static root in your settings, for example) but this approach makes me nervous:
https://devcenter.heroku.com/articles/django-assets#detection

Google App Engine Development and Production Environment Setup

Here is my current setup:
GitHub repository, a branch for dev.
myappdev.appspot.com (not real url)
myapp.appspot.com (not real url)
App written on GAE Python 2.7, using django-nonrel
Development is performed on a local dev server. When I'm ready to release to dev, I increment the version, commit, and run "manage.py upload" to the myappdev.appspot.com
Once testing is satisfactory, I merge the changes from dev to main repo. I then run "manage.py upload" to upload the main repo code to the myapp.appspot.com domain.
Is this setup good? Here are a few issues I've run into.
1) I'm new to git, so sometimes I forget to add files, and the commit doesn't notify me. So I deploy code to dev that works, but does not match what is in the dev branch. (This is bad practice).
2) The datastore file in the git repo causes issues. Merging binary files? Is it ok to migrate this file between local machines, or will it get messed up?
3) Should I be using "manage.py upload" for each release to the dev or prod environment, or is there a better way to do this? Heroku looks like it can pull right from GitHub. The way I'm doing it now seems like there is too much room for human error.
Any overall suggestions on how to improve my setup?
Thanks!
I'm on a pretty similar setup, though I'm still running on py2.5, django-nonrel.
1) I usually use 'git status' or 'git gui' to see if I forgot to check in files.
2) I personally don't check in my datastore. Are you familiar with .gitignore? It's a text file in which you list files for git to ignore when you run 'git status' and other functions. I put in .gaedata as well as .pyc and backup files.
To manage the database I use "python manage.py dumpdata > file" which dumps the database to a json encoded file. Then I can reload it using "python manage.py loaddata".
3) I don't know of any deploy from git. You can probably write a little python script to check whether git is up to date before you deploy. Personally though, I deploy stuff to test to make sure it's working, before I check it in.
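The "little python script" from point 3 could look something like this sketch. It relies on `git status --porcelain`, which prints one line per modified or untracked file (two status characters, a space, then the path) and nothing for a clean tree. The function names are made up, and it only checks the working tree, not whether the branch has been pushed.

```python
import subprocess


def uncommitted_changes(porcelain_output: str) -> list:
    """Parse `git status --porcelain` output into a list of changed/untracked paths."""
    return [line[3:] for line in porcelain_output.splitlines() if line.strip()]


def check_clean_before_deploy(repo_dir: str = ".") -> None:
    """Abort if the working tree has modified or untracked files."""
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    dirty = uncommitted_changes(result.stdout)
    if dirty:
        raise SystemExit("Refusing to deploy, uncommitted files:\n" + "\n".join(dirty))
```

Calling `check_clean_before_deploy()` right before "manage.py upload" would catch the forgotten-file case from issue 1 in the question.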