Options for maintaining MySQL databases for a django development team - django

What are some options to avoid the latency of pointing local django development servers to a remote MySQL database?
If developers use local MySQL databases to avoid the latency, what are some useful tools to sync schema updates of the remote db with the local db and avoid manually creating, downloading, and loading dumps?
Thanks!

One possibility is to configure the remote MySQL database to replicate to the developers local machine - assuming you have control of the remote database's configuration.
See the MySQL docs for replication notes. Using MySQL replication the remote node would be the Master and the developer machines would be Slaves. The main advantage of this approach is your developer machines would always remain synchronized to the Master database. One possible disadvantage (depending on the number of developer machines you are slaving) is a degradation in the remote database's performance due to extra load introduced by replication.

It sounds like you want to do schema migrations. Basically it's a way to log schema changes so that you can update and even roll back along with your source changes (if you change a model you also check in a new migration that has up and down commands). While this will likely become an official feature at some point, there are several third-party solutions to choose from. It's really a personal preference, here are some popular ones:
South
Django Evolution
dmigrations

I use a combination of South for schema migrations, and storing JSON fixtures (or SQL dumps) of useful test data in the VCS repo for the project. Works pretty seamlessly.

Related

Why we need to setup AWS and POSTgres db when we deploy our app using Heroku?

I'm building a web api by watching the youtube video below and until the AWS S3 bucket setup I understand everything fine. But he first deploy everything locally then after making sure everything works he is transferring all static files to AWS and for DB he switches from SQLdb3 to POSgres.
django portfolio
I still don't understand this part why we need to put our static files to AWS and create POSTgresql database even there is an SQLdb3 default database from django. I'm thinking that if I'm the only admin and just connecting my GitHub from Heroku should be enough and anytime I change something in the api just need to push those changes to github master and that should be it.
Why we need to use AWS to setup static file location and setup a rds (relational data base) and do the things from the beginning. Still not getting it!
Can anybody help to explain this ?
Thanks
Databases
There are several reasons a video guide would encourage you to switch from SQLite to a database server such as MySQL or PostgreSQL:
SQLite is great but doesn't scale well if you're expecting a lot of traffic
SQLite doesn't work if you want to distribute your app accross multiple servers. Going back to Heroky, if you serve your app with multiple Dynos, you'll have a problem because each Dyno will use a distinct SQLite database. If you edit something through the admin, it will happen on one of this databases, at random, leading to inconsistencies
Some Django features aren't available on SQLite
SQLite is the default database in Django because it works out of the box, and is extremely fast and easy to use in local/development environments for prototyping.
However, it is usually not suited for production websites. Additionally, while it can be tempting to store your sqlite.db file along with your code, for instance in a git repository, it is considered a bad practice because your database can contain sensitive data (such as passwords, usernames, emails, etc.). Hence, a strict separation between your code and data is a good practice.
Another way to put it is that your code and your data have different lifecycles. You want to be able to edit data in your database without redeploying your code, and update your code without touching your database.
Even if you can remove public access to some files through GitHub, this is not a good practice because when you work in a team with multiple developpers, developpers may have access to the code but not the production data, because it's usually sensitive. If you work with 5 people and each one of them has a copy of your database, it means the risk to lose it or have it stolen is 5x higher ;)
Static files
When you work locally, Django's built-in runserver command handles the serving of static assets such as CSS, Javascript and images for you.
However, this server is not designed for production use either. It works great in development, but will start to fail very fast on a production website, that should handle way more requests than your local version.
Because of that, you need to host these static files somewhere else, and AWS is one place where you can do that. AWS will serve those files for you, in a very efficient way. There are other options available, for instance configuring a reverse proxy with Nginx to serve the files for you, if you're using a dedicated server.
As far as I can tell, the progression you describe from the video is bringing you from a local, development enviromnent to a more efficient and scalable production setup. That is to be expected, because it's less daunting to start with something really simple (SQLite, Django's built-in runserver), and move on to more complex and abstract topics and tools later on.

Manage sqlite database with git

I have this small project that specifies sqlite as the database choice.
For this particular project, the framework is Django, and the server is hosted by Heroku. In order for the database to work, it must be set up with migration commands and credentials whenever the project is deployed to continuous integration tools or development site.
The question is, that many of these environments do not actually use the my_project.sqlite3 file that comes with the source repository, which we version control with git. How do I incorporate changes to the deployed database? Is a script that set up the database suitable for this scenario? Meanwhile, it is worth notice that there are security credentials that should not appear in a script in unencrypted ways, which makes the situation tricky.
that many of these environments do not actually use the my_project.sqlite3 file that comes with the source repository
If your deployment platform does not support your chosen database, then your development environment should probably be moved to using one of the databases they do support. It is possible to run different databases in development and production, but just seems like the source of headaches.
I have found a number of articles that state that Heroku just doesn't support SQLite in production and instead recommends Postgres.
How do I incorporate changes to the deployed database? Is a script that set up the database suitable for this scenario?
I assume that you are just extracting data from one database to give to another, so yes,as long as that script is a one time batch operation each time the code is updated, then it should be fine. You will want something else if you are adding/manipulating data in production and then exporting it to your git.
Meanwhile, it is worth notice that there are security credentials that should not appear in a script in unencrypted ways
An environment variable should solve that. You set your host machine to have environment variables with your credentials and then just retrieve them within the script. You are looking to have something like this:
# Set environment vars
os.environ['USER'] = 'username'
os.environ['PASSWORD'] = 'password'
# Get environment vars
USER = os.getenv('USER')
PASSWORD = os.environ.get('PASSWORD')

Django sqlite development to production

I am having trouble understanding how to synchronise my development and production environments.
I have a production and development branch in git, with the production branch being of course what the server's copy is.
My sqlite database is currently under version control (which I now gather it shouldn't be, however I am not sure how I would sync my copies of the project if it wasn't?)
When I want to make a change I commit and push the server's copy to production and then I pull that down to my local machine. I then make a change (which can include database changes), but then in terms of getting those changes back into production, I am not sure how to get the changes back onto my server without potentially overwriting changes that have occurred on the server since I started the change?
How can I handle local changes to the database when changes may also have occurred on the server at the same time? I have been searching for a while and thought that maybe South was for that kind of problem but I gather that it is an old solution.
Thanks for your help
Well, it's definitively a wrong way. You should never share a database between environments. However, it is a good approach to use the same database engine on the production and dev environment but it doesn't mean that you need to share a DB, in the case of sqlite3.
Many developers use sqlite3 on dev and other DB engines on the production. This is acceptable but it is not recommended, because of differences between database engines.

Update SQLite database on disk

My Django application (a PoC, not a final product) with a backend library uses a SQLite database - read only. The SQLite database is part of the repo and deployed to Heroku. This is working fine.
I have the requirement to allow updates to this database via the Django admin interface. This is not a Django managed database, so from Django's point of view just a binary file.
I could allow for a FileField to handle this, overwriting the database; I guess this would work in a self-managed server, but I am on Heroku and have the constraints imposed by Disk Backed Storage. My SQLite is not my webapp database, but limitations apply the same: I can not write to the webapp's filesystem and get any guarantee the new data will be visible by the running webapp.
I can think of alternatives, all with drawbacks:
Put the SQLite database in another server (a "media" server), and access it remotely: this will severely impact performance. Besides, accessing SQLite databases over the network does not seem easy.
Create a deploy script for the customer to upload the database via the usual deploy mechanisms. Since the customer is not technically fit, and I can not provide direct support, this is unfeasible.
Move out of Heroku to a self-managed server, so I can implement this quick-and-dirty upload without complications.
Do you have another suggestion?
PythonAnywhere.com
deploy your app and you can easily access all of your files and update them and your Sqlite3 database is going to be updated in real time without losing data.
herokuapp.com erase your Sqlite3 database every 24 hours that's why it's not preferred for Sqlite3 having web apps

How to handle DB migration using AWS deployment tools

Amazon Web Services offer a number of continuous deployment and management tools such as Elastic Beanstalk, OpsWorks, Cloud Formation and Code Deploy depending on your needs. The basic idea being to facilitate code deployment and upgrade with zero downtime. They also help manage best architectural practice using AWS resources.
For simplicity lets assuming a basic architecture where you have a 2 tear structure; a collection of application servers behind a load balancer and then a persistence layer using a multi-zone RDS DB.
The actual code upgrade across a fleet of instances (app servers) is easy to understand. For a very simplistic overview the AWS service upgrades each node in turn handing connections off so the instance in question is not being used.
However, I can't understand how DB upgrades are managed. Assume that we are going from version 1.0.0 to 2.0.0 of an application and that there is a requirement to change the DB structure. Normally you would use a script or a library like Flyway to perform the upgrade. However, if there is a fleet of servers to upgrade there is a point where both 1.0.0 and 2.0.0 applications exist across the fleet each requiring a different DB structure.
I need to understand how this is actually achieved (high level) to know what the best way/time of performing the DB migration is. I guess there are a couple of ways they could be achieving this but I am struggling to see how they can do it and allow both 1.0.0 and 2.0.0 to persist data without loss.
If they migrate the DB structure with the first app node upgrade and at the same time create a cached version of the 1.0.0. Users connected to the 1.0.0 app persist using the cached version of the DB and users connected to the 2.0.0 app persist to the new migrated DB. Once all the app nodes are migrated, the cached data is merged into the DB.
It seems unlikely they can do this as the merge would be pretty complex but I can't see another way. Any pointers/help would be appreciated.
This is a common problem to encounter once your application infrastructure gets into multiple application nodes. In the olden days, you could take your application offline for "maintenance windows" during which you could:
Replace application with a "System Maintenance, back soon" page.
Perform database migrations (schema and/or data)
Deploy new application code
Put application back online
In 2015, and really for several years this approach is not acceptable. Your users expect 24/7 operation, so there must be a better way. Of course there is, the answer is a series of patterns for Database Refactorings.
The basic concept to always keep in mind is to assume you have to maintain two concurrent versions of your application, and there can be no breaking changes between these two versions. This means that you have a current application (v1.0.0) currently in production and (v2.0.0) that is scheduled to be deployed. Both these versions must work on the same schema. Once v2.0.0 is fully deployed across all application servers, you can then develop v3.0.0 that allows you to complete any final database changes.