Deploy Django Application without Service Interruption / no Downtime - django

We have no continuous integration setup(, yet). But want to deploy very frequently. Once a day or so.
We have a pretty standard Django application with a separate Postgres server. We use normal rented VMs (NO Amazon or Rackspace).
How can we minimize the downtime of our application? Best would be to zero downtime. We thought about a setup with two equal application and two database servers and deploy one app/db server pair after another.
The problem is keeping the data consistant. While one app/db server pair is updating the server pair with the old code can serve users. But if the users write to the db we would lose the data when switching to the updated pair. Especially when we push schema migrations.
How can we handle this? This must be a very common problem but I can't find good answers. How do you handle this problem?

In the case that you have no schema migrations, I'll give you a practical scenario:
Keep two versions of django processes ( A and B ), which you control with, let's say, supervisor. Keep an nginx process in front of your django processes, which forwards all requests to A. So, you upload version B to the server, start the django process B with supervisor, then change your nginx's conf file to point to B, then reload your nginx process..
In the case that you have schema migrations, things get complicated. Your options include:
You could consider using a NoSQL solution, like mongoDB (in this case you can keep a single DB instance).
Figure out how to manually record all write requests while uploading, as to push them later to your new database.

Related

Why we need to setup AWS and POSTgres db when we deploy our app using Heroku?

I'm building a web api by watching the youtube video below and until the AWS S3 bucket setup I understand everything fine. But he first deploy everything locally then after making sure everything works he is transferring all static files to AWS and for DB he switches from SQLdb3 to POSgres.
django portfolio
I still don't understand this part why we need to put our static files to AWS and create POSTgresql database even there is an SQLdb3 default database from django. I'm thinking that if I'm the only admin and just connecting my GitHub from Heroku should be enough and anytime I change something in the api just need to push those changes to github master and that should be it.
Why we need to use AWS to setup static file location and setup a rds (relational data base) and do the things from the beginning. Still not getting it!
Can anybody help to explain this ?
Thanks
Databases
There are several reasons a video guide would encourage you to switch from SQLite to a database server such as MySQL or PostgreSQL:
SQLite is great but doesn't scale well if you're expecting a lot of traffic
SQLite doesn't work if you want to distribute your app accross multiple servers. Going back to Heroky, if you serve your app with multiple Dynos, you'll have a problem because each Dyno will use a distinct SQLite database. If you edit something through the admin, it will happen on one of this databases, at random, leading to inconsistencies
Some Django features aren't available on SQLite
SQLite is the default database in Django because it works out of the box, and is extremely fast and easy to use in local/development environments for prototyping.
However, it is usually not suited for production websites. Additionally, while it can be tempting to store your sqlite.db file along with your code, for instance in a git repository, it is considered a bad practice because your database can contain sensitive data (such as passwords, usernames, emails, etc.). Hence, a strict separation between your code and data is a good practice.
Another way to put it is that your code and your data have different lifecycles. You want to be able to edit data in your database without redeploying your code, and update your code without touching your database.
Even if you can remove public access to some files through GitHub, this is not a good practice because when you work in a team with multiple developpers, developpers may have access to the code but not the production data, because it's usually sensitive. If you work with 5 people and each one of them has a copy of your database, it means the risk to lose it or have it stolen is 5x higher ;)
Static files
When you work locally, Django's built-in runserver command handles the serving of static assets such as CSS, Javascript and images for you.
However, this server is not designed for production use either. It works great in development, but will start to fail very fast on a production website, that should handle way more requests than your local version.
Because of that, you need to host these static files somewhere else, and AWS is one place where you can do that. AWS will serve those files for you, in a very efficient way. There are other options available, for instance configuring a reverse proxy with Nginx to serve the files for you, if you're using a dedicated server.
As far as I can tell, the progression you describe from the video is bringing you from a local, development enviromnent to a more efficient and scalable production setup. That is to be expected, because it's less daunting to start with something really simple (SQLite, Django's built-in runserver), and move on to more complex and abstract topics and tools later on.

Migrate existing live Django site from one server to another along with user records intact

I have deployed a Django site on GoDaddy Django "droplet" for a while and users have been using the site to keep their records.
Now that GoDaddy is discontinuing the service, I would like to migrate the entire site with all records intact to DigitalOcean.
How does one go about doing this?
Here's what I'd do:
dump my data into a file see dumpdata
stop the server
remove all .pyc files
copy paste the whole folder of the website to the destination server
restore data using loaddata
run the server
I did this 4 times without any problems (dev to test environment, test to pre-prod and pre-prod to prod).
The reply by Oliver is good for small sites, but big companies (or any) won't be happy if you stop the server, also some records/sales might be done between the time you dump and the time you stop the server
I think this would be better but I'm unsure:
make a new database and make it sync with the old one, whenever old db changes changes should be synced to the new db (I know for example postgres has a feature that triggers some kind of alerts on creations/updates, this is indeed the hardest step and needs quite a lot of research to pull it off, both databases must be on sync)
upload the new(copy) page next to the new database connected to it
modify the DNS records to point to the new server, traffic will organically move to the new server, at some point you might have people making database updates on both servers so it is important that the sync goes both ways
take the first web-page's server down, cut the database syncing and remove old
page and old database
remove the pipe/method that was letting you sync the
new db with the old one

Update SQLite database on disk

My Django application (a PoC, not a final product) with a backend library uses a SQLite database - read only. The SQLite database is part of the repo and deployed to Heroku. This is working fine.
I have the requirement to allow updates to this database via the Django admin interface. This is not a Django managed database, so from Django's point of view just a binary file.
I could allow for a FileField to handle this, overwriting the database; I guess this would work in a self-managed server, but I am on Heroku and have the constraints imposed by Disk Backed Storage. My SQLite is not my webapp database, but limitations apply the same: I can not write to the webapp's filesystem and get any guarantee the new data will be visible by the running webapp.
I can think of alternatives, all with drawbacks:
Put the SQLite database in another server (a "media" server), and access it remotely: this will severely impact performance. Besides, accessing SQLite databases over the network does not seem easy.
Create a deploy script for the customer to upload the database via the usual deploy mechanisms. Since the customer is not technically fit, and I can not provide direct support, this is unfeasible.
Move out of Heroku to a self-managed server, so I can implement this quick-and-dirty upload without complications.
Do you have another suggestion?
PythonAnywhere.com
deploy your app and you can easily access all of your files and update them and your Sqlite3 database is going to be updated in real time without losing data.
herokuapp.com erase your Sqlite3 database every 24 hours that's why it's not preferred for Sqlite3 having web apps

Good way to deploy a django app with an asynchronous script running outside of the app

I am building a small financial web app with django. The app requires that the database has a complete history of prices, regardless of whether someone is currently using the app. These prices are freely available online.
The way I am currently handling this is by running simultaneously a separate python script (outside of django) which downloads the price data and records it in the django database using the sqlite3 module.
My plan for deployment is to run the app on an AWS EC2 instance, change the permissions of the folder where the db file resides, and separately run the download script.
Is this a good way to deploy this sort of app? What are the downsides?
Is there a better way to handle the asynchronous downloads and the deployment? (PythonAnywhere?)
You can write the daemon code and follow this approach to push data to DB as soon as you get it from Internet. Since your daemon would be running independently from the Django, you'd need to take care of data synchronisation related issues as well. One possible solution could be to use DateTimeField in your Django model with auto_now_add = True, which will give you idea of time when data was entered in DB. Hope this helps you or someone else looking for similar answer.

How to handle DB migration using AWS deployment tools

Amazon Web Services offer a number of continuous deployment and management tools such as Elastic Beanstalk, OpsWorks, Cloud Formation and Code Deploy depending on your needs. The basic idea being to facilitate code deployment and upgrade with zero downtime. They also help manage best architectural practice using AWS resources.
For simplicity lets assuming a basic architecture where you have a 2 tear structure; a collection of application servers behind a load balancer and then a persistence layer using a multi-zone RDS DB.
The actual code upgrade across a fleet of instances (app servers) is easy to understand. For a very simplistic overview the AWS service upgrades each node in turn handing connections off so the instance in question is not being used.
However, I can't understand how DB upgrades are managed. Assume that we are going from version 1.0.0 to 2.0.0 of an application and that there is a requirement to change the DB structure. Normally you would use a script or a library like Flyway to perform the upgrade. However, if there is a fleet of servers to upgrade there is a point where both 1.0.0 and 2.0.0 applications exist across the fleet each requiring a different DB structure.
I need to understand how this is actually achieved (high level) to know what the best way/time of performing the DB migration is. I guess there are a couple of ways they could be achieving this but I am struggling to see how they can do it and allow both 1.0.0 and 2.0.0 to persist data without loss.
If they migrate the DB structure with the first app node upgrade and at the same time create a cached version of the 1.0.0. Users connected to the 1.0.0 app persist using the cached version of the DB and users connected to the 2.0.0 app persist to the new migrated DB. Once all the app nodes are migrated, the cached data is merged into the DB.
It seems unlikely they can do this as the merge would be pretty complex but I can't see another way. Any pointers/help would be appreciated.
This is a common problem to encounter once your application infrastructure gets into multiple application nodes. In the olden days, you could take your application offline for "maintenance windows" during which you could:
Replace application with a "System Maintenance, back soon" page.
Perform database migrations (schema and/or data)
Deploy new application code
Put application back online
In 2015, and really for several years this approach is not acceptable. Your users expect 24/7 operation, so there must be a better way. Of course there is, the answer is a series of patterns for Database Refactorings.
The basic concept to always keep in mind is to assume you have to maintain two concurrent versions of your application, and there can be no breaking changes between these two versions. This means that you have a current application (v1.0.0) currently in production and (v2.0.0) that is scheduled to be deployed. Both these versions must work on the same schema. Once v2.0.0 is fully deployed across all application servers, you can then develop v3.0.0 that allows you to complete any final database changes.