Azure web app intermittently copying code from tmp to home - django

I am deploying a Django application to an Azure Web App via GitHub Actions. The code is consistently deployed to the tmp folder but is not always copied to wwwroot, even though the Oryx deployment logs state that the contents are copied. What causes this intermittent behaviour?
Detecting platforms...
Detected following platforms:
python: 3.8.12
Using intermediate directory '/tmp/8da59cae1f18fde'.
Copying files to the intermediate directory...
Done in 0 sec(s).
Source directory : /tmp/8da59cae1f18fde
Destination directory: /home/site/wwwroot
Python Version: /opt/python/3.8.12/bin/python3.8
Creating directory for command manifest file if it doesnot exist
Removing existing manifest file
Python Virtual Environment: antenv
Creating virtual environment...
Activating virtual environment...
Running pip install...
Content in source directory is a Django app
Running collectstatic...
Done in 18 sec(s).
Not a vso image, so not writing build commands
Preparing output...
Copying files to destination directory '/tmp/_preCompressedDestinationDir'...
Done in 31 sec(s).
Compressing content of directory '/tmp/_preCompressedDestinationDir'...
Copied the compressed output to '/home/site/wwwroot'
Removing existing manifest file
Creating a manifest file...
Manifest file created.
Done in 244 sec(s).

For others that may be struggling with this same issue, specifically with Django applications, I have landed on the following workaround:
After learning more about how the Oryx build actually works, I now understand that the code IS copied to wwwroot - but as a tarball. The code is then extracted to tmp and executed from that folder. This causes a challenge for running Django migrations, as the migrations folder is not persistent at the location the code runs from: you cannot run migrations automatically from the tmp folder (assuming you are not putting your migrations in source control, there is no migration history there), and you cannot run migrations from wwwroot (because the new code only exists in the tarball). So, in order to persist migration history you will need to either:
Extract output.tar.gz in its entirety to wwwroot and overwrite existing files.
OR
Do 1. ONCE in order to get your project structure ready to perform migrations in wwwroot, and after that copy only the files needed to detect database changes (in my case settings.py, forms.py, models.py and admin.py) from tmp to wwwroot. This can be done automatically by editing the startup command; a sketch follows below.
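As a rough illustration only (the file names and paths below are hypothetical and will differ per project), a small helper like the following could be invoked from the startup command to copy the change-detection files into wwwroot and then run migrations from there:
# sync_and_migrate.py - illustrative helper run from the App Service startup command.
# Assumes the current working directory is the extracted copy of the code (the tmp
# location Oryx runs from) and that wwwroot already has the full project structure
# from doing option 1 once. All paths are hypothetical examples.
import shutil
import subprocess

WWWROOT = "/home/site/wwwroot"

# Only the files Django needs in order to detect database changes.
FILES_TO_SYNC = [
    "myproject/settings.py",
    "myapp/models.py",
    "myapp/forms.py",
    "myapp/admin.py",
]

for rel_path in FILES_TO_SYNC:
    shutil.copy2(rel_path, f"{WWWROOT}/{rel_path}")

# Run migrations from wwwroot, where the migration history persists between deployments.
subprocess.run(["python", "manage.py", "makemigrations"], cwd=WWWROOT, check=True)
subprocess.run(["python", "manage.py", "migrate"], cwd=WWWROOT, check=True)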

Related

Django collectstatic keeps waiting when run through Github Action

We are facing a very weird issue. We ship a Django application in a Docker container through GitHub Actions on each push. Everything is working fine except collectstatic.
We have the following lines at the end of our CD github action:
docker exec container_name python manage.py migrate --noinput
docker exec container_name python manage.py collectstatic --noinput
migrate works perfectly fine, but collectstatic just keeps on waiting if run through the GitHub action. If I run the command directly on the server then it works just fine and completes within a few minutes.
Can someone please help me figure out what could be the issue?
Thanks in advance.
Now I am far from the most experienced, but I did this recently and have some suggestions of where to look. I'm definitely not the greatest authority, though.
I wasn't using Docker, so I can't say anything about that. From the issues I had, here are some suggestions I can recommend trying.
Take note that all of this was for a self-hosted runner. Things would be very different otherwise.
Check to make sure STATIC_ROOT and MEDIA_ROOT variables are set correctly in the settings file.
If STATIC_ROOT and MEDIA_ROOT come from environment variables, make sure you are loading the correct environment variables file, such as a .env file, which is what I used.
I used django-environ to serve my environment variables. The docs say to put the .env file in the same directory as the settings file. However, if you are putting the project on a production server with GitHub Actions, you won't be able to put the .env file anywhere in the project, because it will get overwritten every time new code is pushed.
So to fix that you need to specify the correct .env file from somewhere else on the server. Do that by specifying ENV_PATH (a settings sketch follows a few lines below).
https://django-environ.readthedocs.io/en/latest/
Under the section Multiple env files
Another resource that was helpful:
https://github.com/joke2k/django-environ/issues/143
I set up my settings file like how they did there.
I put my .env file in a proj directory I made in the virtual environment folder for the project.
I don't know if it's a good place to put it, but that's how I did it. I didn't find much good info online for this stuff and had to figure out a lot on my own.
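For reference, here is a minimal settings sketch of that ENV_PATH idea with django-environ; the fallback path and variable names are only examples, not anything prescribed by the library:
# settings.py - minimal sketch, assuming django-environ and a .env file kept outside the repo.
import os
import environ

env = environ.Env()
# ENV_PATH lets the server point at a .env file outside the project tree,
# so a fresh deploy never overwrites it (the fallback path here is made up).
env_file = os.environ.get("ENV_PATH", "/home/deploy/envs/myproject/.env")
environ.Env.read_env(env_file)

SECRET_KEY = env("SECRET_KEY")
DEBUG = env.bool("DEBUG", default=False)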
Make sure the user which is running the github action has permissions to read the .env file.
As with the .env file, if your static files are collected into the base directory of your project, you might have an issue with GitHub Actions overwriting those files every time new code is pushed. If you have a media directory that users upload files to, that will really be an issue, because those files won't get overwritten; they'll just disappear.
Now if this were the issue, it shouldn't cause GitHub Actions to just get stuck on the collectstatic command. It would only cause files to get overwritten every time the workflow runs, and the media files would disappear.
If you do change the directory of where the static and media files are located as stated before, make sure all the variables for the paths are correct in the settings file and the .env file.
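As a sketch, assuming the values come in through the environment, the relevant settings might look something like this (the fallback paths are only examples):
# settings.py - static/media roots read from the environment so they can live
# outside the project directory and survive each deploy (example values only).
import os

STATIC_URL = "/static/"
MEDIA_URL = "/media/"

STATIC_ROOT = os.environ.get("STATIC_ROOT", "/var/www/myproject/static")
MEDIA_ROOT = os.environ.get("MEDIA_ROOT", "/var/www/myproject/media")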
You will also need to update the Nginx config file for the static and media root directories if you use Nginx. I'm not sure how Apache handles this.
You can do that with this command:
sudo nano /etc/nginx/sites-available/myproject
Don't forget to restart the nginx server after doing that.
If you are writing static and media files at a different location from the base project directory on the server, also check permissions on those directories. Make sure the user running the github action has permissions to write to those directories. I suspect that might cause it to hang but it very well might just cause an error.
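If you want to rule that out quickly, a tiny check like the following, run as the workflow user, will tell you whether it can write to those locations (the directory paths are placeholders):
# check_write_access.py - quick sanity check of write permissions for the workflow user.
import os

for path in ("/var/www/myproject/static", "/var/www/myproject/media"):
    print(f"{path}: exists={os.path.isdir(path)}, writable={os.access(path, os.W_OK)}")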
Check all the syntax in the GitHub Actions yml file. Make sure everything is correct and that it's not hanging because of an incomplete command or something like that.
But yeah, those are some things I had to take a look at. Honestly, none of this might be relevant for you; for the most part, all of these issues should cause an error somewhere.
I couldn't really offer many external resources for you to look deeper into this because I'm just speaking from personal experience.
Hope I could help.
Here's my GitHub repo for the project I did: https://github.com/pkudlanov/personal-portfolio-django
I hosted it on DigitalOcean on a Linux server using Nginx and Gunicorn.

How can I change the name(s) of folders in my Django project without destroying its ability to find its own parts?

I'm reading through Two Scoops of Django and the authors note that best practices involve having a config folder and an apps folder. I've been building a Django project for the last few months here & there and want to get in the habit of complying with these practices, but when I change the name of my <project_name> folder (with the wsgi, settings, etc.), Django can't seem to find the contents of the folder anymore.
How do I change this folder name and put the project apps into an app folder without breaking the connections they have?
Restoring the connections can be a painful process, and even if you restore them there is no guarantee everything will keep working (e.g. some third-party app may fail because of a dependency reference you forgot to change).
I also like to keep my apps and the project folder visibly separate. For this, I create a parent folder that holds the entire Django installation, and then inside that folder I create the project while telling django-admin that this is the directory I'd like to use. Let's say I want to create a blog project:
$ mkdir blog
$ cd blog
$ django-admin startproject blog_project .
This will give you a blog folder, and inside it you will get the blog_project folder and the beloved manage.py.

How to keep changes made on sqlite database

I have a Django app with a SQLite database. This app is deployed on Heroku.
When someone uses the app and adds data to the database, the sqlite database is modified. But when I make a code change on GitHub and then deploy this new version of my app, I also deploy the sqlite database (which doesn't have the new changes made by the users), and so I wipe out those changes.
What is the best process to prevent this?
The solution is simple: you must not use sqlite on Heroku. As you have discovered, the file system on Heroku is ephemeral, and changes aren't persisted between deploys. Plus, if you scaled to more than one dyno, they wouldn't share the same database.
Use the Postgres add-on instead.
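For example, once the Heroku Postgres add-on is attached, one common way to wire it up is the dj-database-url package, which reads the DATABASE_URL config var Heroku sets (a sketch, not the only approach):
# settings.py - sketch of reading Heroku's DATABASE_URL with dj-database-url.
import dj_database_url

DATABASES = {
    # Falls back to local sqlite for development when DATABASE_URL isn't set.
    "default": dj_database_url.config(default="sqlite:///db.sqlite3", conn_max_age=600),
}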
I don't know much about Heroku, but if there's a separate copy of your database files on Heroku then you should add your sqlite files to the .gitignore file. This way your sqlite files won't be sent to Heroku along with the rest of the code. You'll also need to delete the sqlite files that have already been added to your repo. Search more about gitignore. A general .gitignore for Django projects contains the following:
*.pyc
__pycache__
myvenv
db.sqlite3
.DS_Store
settings.py
Just copy this into your existing .gitignore, or create a new one if it isn't already present. A .gitignore file usually lives in the root directory of the project.

Django 1.7 + Django CMS - drop migration files from my repo or include virtualenv in repo?

I'm using git to version control a Django 1.7 + Django CMS 3.0.6 project.
In the course of building various apps etc I'm ending up with a lot of migration files. The migration files are currently included in my git repo.
Thus far I have been trying to avoid including the virtual env files in my repo directly as it seems rather messy and redundant. Instead I have thus far been including a pip requirements file in the repo and using that to recreate the virtual env when needed.
However, I have recently discovered that choosing to include the migration files in the repo seems to require including all of the virtual env files in the repo as well. I say this because upon deploying my project to a production server and trying to run any of the db commands (syncdb, makemigrations or migrate) via python manage.py I get the error:
KeyError: u"Migration image_gallery.0001_initial dependencies reference nonexistent parent node (u'cms', u'0004_auto_20141108_1256')"
whereas such error does not occur on my local machine, even after deleting the database.
I tracked the source of this error down to the fact that the virtual env on my local machine has a reference to '0004_auto_20141108_1256' (inside the django-cms package - it appears some cms migration info is recorded directly inside the virtual env directory itself) while that of the production environment does not, as the production venv is created through a pip requirements file. Therefore, the two virtual envs do not exactly match, even though all third-party libs are the same. Currently I am not including the venv in my git repo.
So as I see it I have two options:
1. include the virtual env in my git repo
2. drop the migration files from git
Which option is better and why - or is there a third even better way?
The downside to #1 is unnecessary bloat. The downside to option #2 is one loses the migration history, something one might potentially want to keep.
You should never commit the virtual env; it defeats the purpose and just adds unnecessary content to git.
Instead, freeze the requirements and commit the file:
pip freeze > requirements.txt
Install the packages on the server:
pip install -r requirements.txt
The problem was in my Django settings.py file:
MIGRATION_MODULES = {
    'cms': 'cms.migrations_django',
    'menus': 'menus.migrations_django',
    'djangocms_file': 'djangocms_file.migrations_django',
    ...
}
I had to introduce the above to get django-cms 3.0.6 to work with Django 1.7, a consequence of the fact that migrations in Django 1.7 are no longer done with South: Django 1.7 now has its own migration system, while cms 3.0.6 still expects migrations to be managed by South by default.
However, the effect of the above config is to store migrations under the paths described, which in my case pointed straight into the virtual env. Thus migration info was getting stored within the virtual env directory, leading to problems when deploying to production.
To fix this I modified my project directory structure to include a folder called "migrations":
myproject/manage.py
myproject/migrations/
myproject/myproject/
...
And modified the config to be:
MIGRATION_MODULES = {
    'cms': 'migrations.cms.migrations_django',
    'menus': 'migrations.menus.migrations_django',
    'djangocms_file': 'migrations.djangocms_file.migrations_django',
    ...
}
This has the effect of now storing all migration files in the django project itself (and by extension the git repo). As migration info is no longer in the virtual env directory, there is no longer any reason to consider the rather unattractive possibility of including the virtual env in the repo.
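One detail worth spelling out: the values in MIGRATION_MODULES are dotted Python paths, so each of those directories has to be an importable package. A small sketch for creating that structure (the app names simply mirror the example above; adjust for your own project):
# make_migration_packages.py - create the override directories as importable packages.
from pathlib import Path

for app in ("cms", "menus", "djangocms_file"):
    pkg = Path("migrations") / app / "migrations_django"
    pkg.mkdir(parents=True, exist_ok=True)
    # Every level needs an __init__.py so 'migrations.<app>.migrations_django' can be imported.
    for level in (Path("migrations"), pkg.parent, pkg):
        (level / "__init__.py").touch()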

Django and live server, on rewrite, the old source files are still used

I'm using Django 1.3 on an Apache server with mod_wsgi (daemon mode), and Nginx for serving static files. The database is on a separate server. The wsgi daemon runs with 2 threads and a maximum of 100 requests.
I get in trouble when I overwrite old .py files (not .pyc files). I'm also overwriting the .wsgi config file (http://code.google.com/p/modwsgi/wiki/ReloadingSourceCode). Sometimes, some server requests use the old code, and therefore an error is generated (HTTP ERROR 500). Is there a server-side cache that needs to be emptied?
Can this be caused by the .pyc files? Do I need to restart the Apache server or the wsgi daemon?
If you remove the .pyc files and touch your wsgi files it should reload the wsgi daemon when it gets a chance, and you should be good.
On occasion I have had to restart Apache in order to get my changes to take effect.
Set up ownership/permissions so that the user the code runs as under Apache cannot change code files or create .pyc files. The user the application runs as should only have the ability to write to the data or upload directories it really needs; that is safer anyway.
The most reliable deployment method would be to install the new version into a completely new directory hierarchy, with the WSGI script file outside that tree. Then replace the WSGI script file with a new one referring to the new directory. In doing this, the WSGI script file should not be edited in place; instead, a new file should be moved into place so the filesystem does an atomic replace of the whole file and there is no risk of an in-flight edit being picked up.
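To illustrate that last point, a small sketch of an atomic swap (all paths are hypothetical): write the new WSGI script next to the target, then rename it into place, since a rename on the same filesystem is atomic.
# deploy_wsgi.py - sketch of atomically replacing the WSGI script with the new release's copy.
import os
import shutil

new_script = "/srv/releases/20150101/app.wsgi"   # script from the freshly installed tree
target = "/srv/app/app.wsgi"                     # path Apache/mod_wsgi is configured to load

shutil.copy2(new_script, target + ".new")  # stage the file on the same filesystem as the target
os.rename(target + ".new", target)         # atomic on POSIX; mod_wsgi never sees a half-written file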