Managing multiple PostgreSQL instances on Heroku (Django project)

I’m taking John Elder’s Udemy course titled: “How To Push Django Python Apps To Heroku for Web Hosting”. I've pushed to Heroku successfully already. Now I am experimenting with other aspects of deploying my Django project, specifically managing PostgreSQL.
When I initially set everything up 2 weeks ago, it appears I created three PostgreSQL database instances. See here for a screenshot on imgur.
My question: In that screenshot of my Heroku dashboard, what are the second and third configuration entries for?
I've already got some non-critical data in production on my live demo site. I'd like to delete the two of the three duplicate PostgreSQL instances that I am not using, but first I'd like to establish which one is active and which two are not.
What is a potential use case for even having multiple duplicate databases? Rolling back a database to a given state (for example with PostgreSQL's Continuous Archiving and Point-in-Time Recovery (PITR) feature) and importing/exporting instances are ways of backing up and restoring data that has been corrupted. But how are those features different from, or similar to, configuring more than one database URL in the Heroku dashboard, as depicted in my imgur screenshot above?
I've Googled around using search terms such as 'config vars heroku django postgresql colors' (and variations), which loop me back to Heroku's Configuration and Config Vars doc without addressing multiple PostgreSQL instances like mine. Heroku's Databases & Data Management doc explains how to handle PostgreSQL but doesn't mention configuration variables.
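For context, a minimal sketch of how a Django-on-Heroku settings module typically chooses which database to use (this assumes the dj-database-url package, which many Heroku tutorials use; the SQLite fallback is just a placeholder). Whichever config var this call reads is the one the running app actually connects to.

```python
# settings.py (sketch) — assumes the dj-database-url package; the SQLite
# fallback URL is only a placeholder for local development.
import dj_database_url

DATABASES = {
    # dj_database_url.config() reads the DATABASE_URL config var by default,
    # so that entry is the database the deployed app actually uses.
    "default": dj_database_url.config(default="sqlite:///db.sqlite3")
}
```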

Related

Why do we need to set up AWS and a Postgres DB when we deploy our app using Heroku?

I'm building a web API by following the YouTube video below, and up until the AWS S3 bucket setup I understand everything fine. But he first deploys everything locally, then after making sure everything works he transfers all the static files to AWS, and for the DB he switches from the default SQLite3 to Postgres.
django portfolio
I still don't understand this part: why do we need to put our static files on AWS and create a PostgreSQL database when there is a default SQLite3 database from Django? I'm thinking that if I'm the only admin, just connecting my GitHub repo to Heroku should be enough, and any time I change something in the API I just need to push those changes to the GitHub master branch and that should be it.
Why do we need to use AWS to set up the static file location, set up an RDS (relational database), and do these things from the beginning? Still not getting it!
Can anybody help explain this?
Thanks
Databases
There are several reasons a video guide would encourage you to switch from SQLite to a database server such as MySQL or PostgreSQL:
SQLite is great but doesn't scale well if you're expecting a lot of traffic
SQLite doesn't work if you want to distribute your app across multiple servers. Going back to Heroku: if you serve your app with multiple dynos, you'll have a problem because each dyno will use a distinct SQLite database. If you edit something through the admin, it will happen on one of these databases at random, leading to inconsistencies
Some Django features aren't available on SQLite
SQLite is the default database in Django because it works out of the box, and is extremely fast and easy to use in local/development environments for prototyping.
However, it is usually not suited for production websites. Additionally, while it can be tempting to store your sqlite.db file along with your code, for instance in a git repository, it is considered a bad practice because your database can contain sensitive data (such as passwords, usernames, emails, etc.). Hence, a strict separation between your code and data is a good practice.
Another way to put it is that your code and your data have different lifecycles. You want to be able to edit data in your database without redeploying your code, and update your code without touching your database.
Even if you can remove public access to some files through GitHub, this is not a good practice, because when you work in a team with multiple developers, developers may have access to the code but not the production data, which is usually sensitive. If you work with 5 people and each one of them has a copy of your database, it means the risk of losing it or having it stolen is 5x higher ;)
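As an illustration, here is a rough sketch of how a settings module can keep SQLite for local development and switch to PostgreSQL in production via an environment variable. The DATABASE_URL name and the dj-database-url dependency are assumptions, not necessarily what the video uses.

```python
# settings.py — a sketch, not the exact setup from the video.
import os
from pathlib import Path

BASE_DIR = Path(__file__).resolve().parent.parent

if os.environ.get("DATABASE_URL"):
    # Production: a PostgreSQL server, configured from the environment so no
    # credentials ever live in the repository.
    import dj_database_url
    DATABASES = {"default": dj_database_url.config()}
else:
    # Local development: Django's default SQLite file, kept out of git.
    DATABASES = {
        "default": {
            "ENGINE": "django.db.backends.sqlite3",
            "NAME": BASE_DIR / "db.sqlite3",
        }
    }
```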
Static files
When you work locally, Django's built-in runserver command handles the serving of static assets such as CSS, JavaScript and images for you.
However, this server is not designed for production use either. It works great in development, but it will start to fail very fast on a production website, which has to handle far more requests than your local version.
Because of that, you need to host these static files somewhere else, and AWS is one place where you can do that. AWS will serve those files for you in a very efficient way. There are other options available, for instance configuring a reverse proxy with Nginx to serve the files for you if you're using a dedicated server.
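As a sketch, serving static files from S3 with the django-storages and boto3 packages usually boils down to a handful of settings like these (the bucket name and region are placeholders, and the credentials come from the environment):

```python
# settings.py — a rough sketch assuming django-storages + boto3; names and
# region are placeholders, not values from the video.
import os

INSTALLED_APPS = [
    # ... the rest of your apps ...
    "storages",
]

AWS_STORAGE_BUCKET_NAME = "my-portfolio-static"   # placeholder bucket name
AWS_S3_REGION_NAME = "us-east-1"                  # placeholder region
AWS_ACCESS_KEY_ID = os.environ.get("AWS_ACCESS_KEY_ID")
AWS_SECRET_ACCESS_KEY = os.environ.get("AWS_SECRET_ACCESS_KEY")

# Upload collected static files to S3 instead of the local filesystem.
STATICFILES_STORAGE = "storages.backends.s3boto3.S3Boto3Storage"
STATIC_URL = "https://%s.s3.amazonaws.com/" % AWS_STORAGE_BUCKET_NAME
```

After that, running collectstatic during deployment pushes the files to the bucket, and templates using the static tag pick up the S3 URLs.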
As far as I can tell, the progression you describe from the video is taking you from a local development environment to a more efficient and scalable production setup. That is to be expected, because it's less daunting to start with something really simple (SQLite, Django's built-in runserver) and move on to more complex and abstract topics and tools later on.

My new migration will break my database on Heroku (Postgres)

I am facing a challenge here. I inherited the models from previous developers and the tables were not properly built. I added some constraints and new tables in order to normalize those tables. Before pushing the application to Heroku I tested it on my local machine, and it actually broke my database.
Now the Heroku website is already in production, so there is user information in it. How should I approach this? Do I need to destroy the existing database, create a new one and run the migrations?
Be very, very careful. Applying migrations on production servers can cause irreversible damage if you are not careful, and so you should be prepared for every possible situation.
My best recommendation would be to create an entire duplicate copy of your live DB (with Heroku this is as simple as a PG dump/backup). You can then create a new staging site using the same code, upload the backup into a new database instance, and test against that. Live environments are not always the same as local ones. You can then run your migrations on the staging site and see if there are any unexpected effects (the best way to do this would be by utilizing Django test cases). If there are any issues, be sure to understand how the rollback process works with Django migrations.
A good tutorial that is fairly recent can be found here: https://realpython.com/django-migrations-a-primer/
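For reference, the apply-then-roll-back mechanics boil down to something like this sketch, which you would only ever run against the staging copy; the app label and migration number are placeholders.

```python
# rehearse_migrations.py — a sketch for the staging database, never production.
# Run it with: python manage.py shell < rehearse_migrations.py
# (or just use the equivalent `manage.py migrate` commands directly).
from django.core.management import call_command

# Apply all pending migrations for the app you changed.
call_command("migrate", "myapp")

# ... exercise the staging site or run your test suite here ...

# If the data looks wrong, roll back to the last migration that was live in
# production (placeholder number).
call_command("migrate", "myapp", "0007")
```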

Wagtail CMS content deploy to production

I am studying the popular Django CMS framework Wagtail and have come to a question: how do you deploy your developed content, like pages/documents/images, to production environments?
I am puzzled because this content (like pages) is saved in the database; essentially it is just rows in database tables, not a resource in the git repo. So if I develop a simple web site in my dev environment, when I come to deploy to prod it's not as simple as a git push. What is the best practice for this?
I read some code from Torchbox; there are some database-dump and record-pulling tasks using Fabric. I'm not sure if that's the preferred way, and I can't fully understand them either.
Or, if it's a production site, is it assumed that everyone adds content there and prod is the source of truth, so there is no need for "content deployment" at all, only schema changes via South migrations and other static resources?
If anyone has experience with this, please provide guidance.
Thanks
On our (Torchbox) sites, all content entry usually happens on the production site, so we don't need to push any database content as part of our regular deployments. Many of our sites have tens or even hundreds of editors, so it would be almost impossible to synchronise the content across multiple installations of the site.
Whenever we need to transfer content from one installation to another (for example, deploying the production site for the first time, or pulling a snapshot of the live site to help with development), we use the PostgreSQL pg_dump command to make a SQL dump of the complete database, then restore it at the destination using the psql command. Tools like Fabric can be used to automate this, but that isn't essential.
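As a rough illustration, the transfer is essentially the two commands below (hostnames, database names and credentials are placeholders, and a Fabric task or plain shell script works just as well):

```python
# sync_db.py — a sketch of the dump-and-restore step described above.
import subprocess

# Take a complete SQL dump of the source (e.g. production) database.
subprocess.run(
    ["pg_dump", "--no-owner",
     "--dbname=postgresql://user:password@source-host:5432/mysite",
     "-f", "mysite.sql"],
    check=True,
)

# Restore it into the (empty) destination database.
subprocess.run(
    ["psql",
     "--dbname=postgresql://user:password@dest-host:5432/mysite",
     "-f", "mysite.sql"],
    check=True,
)
```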

Good way to deploy a Django app with an asynchronous script running outside of the app

I am building a small financial web app with django. The app requires that the database has a complete history of prices, regardless of whether someone is currently using the app. These prices are freely available online.
The way I am currently handling this is by simultaneously running a separate Python script (outside of Django) which downloads the price data and records it in the Django database using the sqlite3 module.
My plan for deployment is to run the app on an AWS EC2 instance, change the permissions of the folder where the db file resides, and separately run the download script.
Is this a good way to deploy this sort of app? What are the downsides?
Is there a better way to handle the asynchronous downloads and the deployment? (PythonAnywhere?)
You can write the daemon code and follow this approach to push data to the DB as soon as you get it from the Internet. Since your daemon would be running independently of Django, you'd need to take care of data-synchronisation issues as well. One possible solution could be to use a DateTimeField in your Django model with auto_now_add=True, which will record when each row was entered in the DB. Hope this helps you or someone else looking for a similar answer.
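To make that concrete, here is a hedged sketch of the pattern: a standalone script that calls django.setup() and writes through the Django ORM, so the daemon and the web app share one schema, with auto_now_add recording when each row was inserted. The project, app and model names are placeholders, and fetch_latest_price() stands in for the real download code.

```python
# prices/models.py (placeholder app) — the timestamp is filled in automatically:
#
# from django.db import models
#
# class Price(models.Model):
#     value = models.DecimalField(max_digits=12, decimal_places=4)
#     recorded_at = models.DateTimeField(auto_now_add=True)

# fetch_prices.py — a standalone daemon, run separately from the web process.
import os
import time

import django

os.environ.setdefault("DJANGO_SETTINGS_MODULE", "mysite.settings")  # placeholder
django.setup()

from prices.models import Price  # imported after django.setup()


def fetch_latest_price():
    # Placeholder: replace with the real download (an HTTP request to the
    # free price feed plus whatever parsing it needs).
    return 0.0


if __name__ == "__main__":
    while True:
        Price.objects.create(value=fetch_latest_price())
        time.sleep(60)  # poll once a minute; adjust to the feed's update rate
```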

Options for maintaining MySQL databases for a Django development team

What are some options to avoid the latency of pointing local django development servers to a remote MySQL database?
If developers use local MySQL databases to avoid the latency, what are some useful tools to sync schema updates of the remote db with the local db and avoid manually creating, downloading, and loading dumps?
Thanks!
One possibility is to configure the remote MySQL database to replicate to the developers local machine - assuming you have control of the remote database's configuration.
See the MySQL docs for replication notes. Using MySQL replication the remote node would be the Master and the developer machines would be Slaves. The main advantage of this approach is your developer machines would always remain synchronized to the Master database. One possible disadvantage (depending on the number of developer machines you are slaving) is a degradation in the remote database's performance due to extra load introduced by replication.
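If you go that route, each developer's local settings would simply point Django at the locally replicated copy, along these lines (names and credentials are placeholders; keep in mind that a replication slave is normally read-only, so schema changes still have to be applied on the master):

```python
# local_settings.py — a hypothetical per-developer override.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.mysql",
        "NAME": "myproject",        # the database replicated from the master
        "USER": "dev",
        "PASSWORD": "dev-password",
        "HOST": "127.0.0.1",        # the local slave, kept in sync by replication
        "PORT": "3306",
    }
}
```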
It sounds like you want to do schema migrations. Basically, it's a way to log schema changes so that you can update and even roll back along with your source changes (if you change a model, you also check in a new migration that has up and down commands; a sketch of what such a migration looks like follows the list below). While this will likely become an official feature at some point, there are several third-party solutions to choose from. It's really a matter of personal preference; here are some popular ones:
South
Django Evolution
dmigrations
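As a rough sketch, a South-generated migration pairs a forwards() step with a matching backwards() step so the change can be rolled back; the table and field names here are placeholders.

```python
# 0002_add_subtitle.py — a South-style migration sketch (placeholder names).
from south.db import db
from south.v2 import SchemaMigration


class Migration(SchemaMigration):

    def forwards(self, orm):
        # Add the new column when migrating forward.
        db.add_column(
            "blog_entry",
            "subtitle",
            self.gf("django.db.models.fields.CharField")(default="", max_length=200),
            keep_default=False,
        )

    def backwards(self, orm):
        # Drop it again when rolling back.
        db.delete_column("blog_entry", "subtitle")
```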
I use a combination of South for schema migrations, and storing JSON fixtures (or SQL dumps) of useful test data in the VCS repo for the project. Works pretty seamlessly.
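For completeness, a sketch of how such a fixture can be refreshed and reloaded from Python (the app label and path are placeholders; the equivalent manage.py dumpdata/loaddata commands work just as well):

```python
# refresh_fixture.py — regenerate the checked-in test-data fixture.
# Run inside the project, e.g. with: python manage.py shell < refresh_fixture.py
from django.core.management import call_command

with open("fixtures/sample_data.json", "w") as fh:
    call_command("dumpdata", "myapp", indent=2, stdout=fh)

# Any developer can then load it into a fresh local database with:
# call_command("loaddata", "fixtures/sample_data.json")
```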