Deploying Wagtail CMS content to production (Django)

I am studying Wagtail, the popular Django CMS framework, and have come to a question: how do you deploy the content you have developed, such as pages, documents and images, to production environments?
I am puzzled because this content (pages, for example) is saved in the database. Essentially it is just rows in database tables, not a resource in the git repo, so if I develop a simple website in dev, deploying it to prod is not as simple as a git push. What is the best practice here?
I read some code from Torchbox; it has some database dump and record-pulling tasks using Fabric. I am not sure whether that is the preferred way, and I cannot fully understand them either.
Or, for a production site, is it assumed that everyone adds content there and prod is the source of truth, so there is no need for "content deployment" at all, only for schema changes via South migrations and other static resources?
Please help if anyone has experience with this and can provide guidance.
Thanks

On our (Torchbox) sites, all content entry usually happens on the production site, so we don't need to push any database content as part of our regular deployments. Many of our sites have tens or even hundreds of editors, so it would be almost impossible to synchronise the content across multiple installations of the site.
Whenever we need to transfer content from one installation to another (for example, deploying the production site for the first time, or pulling a snapshot of the live site to help with development), we use the PostgreSQL pg_dump command to make a SQL dump of the complete database, then restore it at the destination using the psql command. Tools like Fabric can be used to automate this, but this isn't essential.
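As a rough illustration of the dump-and-restore step, here is a minimal Python sketch that only builds the two command lines. The connection URLs and the dump file name are hypothetical placeholders, and the choice of the --no-owner flag is an assumption about what is convenient when role names differ between servers, not something taken from the Torchbox setup.

```python
import shlex


def dump_command(source_url: str, dump_path: str) -> list[str]:
    """Build a pg_dump call that writes a plain-SQL dump of the whole database."""
    # --no-owner omits ownership statements, which can help when the
    # role names on the source and destination servers differ.
    return ["pg_dump", "--no-owner", f"--file={dump_path}", f"--dbname={source_url}"]


def restore_command(target_url: str, dump_path: str) -> list[str]:
    """Build a psql call that replays the SQL dump into the destination database."""
    return ["psql", f"--file={dump_path}", f"--dbname={target_url}"]


if __name__ == "__main__":
    # Hypothetical connection strings; substitute your own.
    live = "postgres://user@live.example.com/mysite"
    local = "postgres://user@localhost/mysite_dev"
    for cmd in (dump_command(live, "snapshot.sql"), restore_command(local, "snapshot.sql")):
        # Print the commands instead of running them, so nothing is touched by accident.
        print(shlex.join(cmd))
```

To actually run a command, each list could be passed to subprocess.run(cmd, check=True) on a machine that has the PostgreSQL client tools installed.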

Related

Why do we need to set up AWS and a Postgres DB when we deploy our app using Heroku?

I'm building a web API by following the YouTube video below, and up until the AWS S3 bucket setup I understood everything fine. But he first deploys everything locally; then, after making sure everything works, he transfers all the static files to AWS and switches the DB from SQLite3 to Postgres.
django portfolio
I still don't understand why we need to put our static files on AWS and create a PostgreSQL database when Django already comes with a default SQLite3 database. I was thinking that, since I'm the only admin, connecting my GitHub repo to Heroku should be enough, and any time I change something in the API I would just push those changes to GitHub master and that would be it.
Why do we need to use AWS to set up a static file location, set up an RDS (relational database) and do things from the beginning? I'm still not getting it!
Can anybody help explain this?
Thanks
Databases
There are several reasons a video guide would encourage you to switch from SQLite to a database server such as MySQL or PostgreSQL:
SQLite is great but doesn't scale well if you're expecting a lot of traffic
SQLite doesn't work if you want to distribute your app across multiple servers. Going back to Heroku, if you serve your app with multiple Dynos, you'll have a problem because each Dyno will use a distinct SQLite database. If you edit something through the admin, the change will land in one of these databases, at random, leading to inconsistencies
Some Django features aren't available on SQLite
SQLite is the default database in Django because it works out of the box, and is extremely fast and easy to use in local/development environments for prototyping.
However, it is usually not suited for production websites. Additionally, while it can be tempting to store your sqlite.db file along with your code, for instance in a git repository, it is considered a bad practice because your database can contain sensitive data (such as passwords, usernames, emails, etc.). Hence, a strict separation between your code and data is a good practice.
Another way to put it is that your code and your data have different lifecycles. You want to be able to edit data in your database without redeploying your code, and update your code without touching your database.
Even if you can remove public access to some files through GitHub, this is not a good practice, because when you work in a team with multiple developers, they may have access to the code but not to the production data, which is usually sensitive. If you work with 5 people and each one of them has a copy of your database, the risk of losing it or having it stolen is 5x higher ;)
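One common way to keep the database out of the code is to read its location from an environment variable at startup. The following is a minimal sketch, assuming the host exposes a postgres:// URL in a variable called DATABASE_URL (the variable name, the fallback URL and the helper name database_settings are illustrative assumptions):

```python
import os
from urllib.parse import unquote, urlparse


def database_settings(url: str) -> dict:
    """Turn a postgres:// URL into the dict Django expects under DATABASES['default']."""
    parts = urlparse(url)
    return {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": parts.path.lstrip("/"),
        # Credentials may be percent-encoded inside a URL, so decode them.
        "USER": unquote(parts.username or ""),
        "PASSWORD": unquote(parts.password or ""),
        "HOST": parts.hostname or "",
        "PORT": str(parts.port) if parts.port else "",
    }


if __name__ == "__main__":
    # Hypothetical fallback URL, used only when DATABASE_URL is not set.
    url = os.environ.get("DATABASE_URL", "postgres://user:secret@localhost:5432/mysite")
    DATABASES = {"default": database_settings(url)}
    print(DATABASES["default"]["HOST"], DATABASES["default"]["NAME"])
```

With this in settings.py, the same code can point at a local database in development and at the hosted one in production, without any credentials being committed to the repository.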
Static files
When you work locally, Django's built-in runserver command handles the serving of static assets such as CSS, Javascript and images for you.
However, this server is not designed for production use either. It works great in development, but it will start to fail very quickly on a production website, which has to handle far more requests than your local version.
Because of that, you need to host these static files somewhere else, and AWS is one place where you can do that. AWS will serve those files for you, in a very efficient way. There are other options available, for instance configuring a reverse proxy with Nginx to serve the files for you, if you're using a dedicated server.
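For the AWS option, a minimal sketch of the relevant Django settings might look like the following. It assumes the third-party django-storages package with its S3 backend and Django 4.2 or later (which introduced the STORAGES setting); the bucket name, region and the helper name s3_static_settings are hypothetical.

```python
def s3_static_settings(bucket: str, region: str) -> dict:
    """Return Django settings that send collectstatic output to an S3 bucket."""
    return {
        # Settings read by django-storages (assumed to be installed).
        "AWS_STORAGE_BUCKET_NAME": bucket,
        "AWS_S3_REGION_NAME": region,
        "AWS_LOCATION": "static",  # key prefix inside the bucket
        "STORAGES": {
            # Uploaded media stays on local disk in this sketch.
            "default": {"BACKEND": "django.core.files.storage.FileSystemStorage"},
            # Static files are written to, and served from, S3.
            "staticfiles": {"BACKEND": "storages.backends.s3boto3.S3Boto3Storage"},
        },
        "STATIC_URL": f"https://{bucket}.s3.{region}.amazonaws.com/static/",
    }


if __name__ == "__main__":
    # Hypothetical bucket and region; in settings.py these keys would be
    # plain module-level names rather than entries in a dict.
    for name, value in s3_static_settings("my-bucket", "eu-west-1").items():
        print(f"{name} = {value!r}")
```

After that, running manage.py collectstatic would upload the files to the bucket instead of a local folder. AWS credentials are left out here on purpose; they would normally come from environment variables rather than from the settings file.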
As far as I can tell, the progression you describe from the video is bringing you from a local, development environment to a more efficient and scalable production setup. That is to be expected, because it's less daunting to start with something really simple (SQLite, Django's built-in runserver), and move on to more complex and abstract topics and tools later on.

Managing multiple PostgreSQL instances on Heroku (Django project)

I’m taking John Elder’s Udemy course titled: “How To Push Django Python Apps To Heroku for Web Hosting”. I've pushed to Heroku successfully already. Now I am experimenting with other aspects of deploying my Django project, specifically managing PostgreSQL.
When I initially set everything up 2 weeks ago, it appears I created three PostgreSQL database instances. See here for a screenshot on imgur.
My question: In that screenshot of my Heroku dashboard, what are the second and third configuration entries for?
I’ve already got some non-critical data in production on my live demo site. I’d like to delete the two of the three duplicate PostgreSQL instances that I am not using, but first I’d like to establish which one is the active one and which two are not.
What is a potential use case for even having multiple duplicate databases? Rolling back a database to a given state (for example, by using PostgreSQL’s Continuous Archiving and Point-in-Time Recovery (PITR) feature) and importing/exporting instances are ways of backing up and restoring data that has been corrupted. But how are those features different from, or similar to, configuring more than one DB URL in the Heroku dashboard, as depicted in my imgur screenshot above?
I’ve Googled around using search terms such as ‘config vars heroku django postgresql colors’ (and variations), which loop me around to Heroku’s Configuration and Config Vars doc without addressing multiple PostgreSQL instances, like in my situation. Heroku’s Databases & Data Management doc explains how to handle PostgreSQL but doesn’t refer to configuration variables.

Good way to deploy a django app with an asynchronous script running outside of the app

I am building a small financial web app with django. The app requires that the database has a complete history of prices, regardless of whether someone is currently using the app. These prices are freely available online.
The way I am currently handling this is by simultaneously running a separate Python script (outside of Django) which downloads the price data and records it in the Django database using the sqlite3 module.
My plan for deployment is to run the app on an AWS EC2 instance, change the permissions of the folder where the db file resides, and separately run the download script.
Is this a good way to deploy this sort of app? What are the downsides?
Is there a better way to handle the asynchronous downloads and the deployment? (PythonAnywhere?)
You can write the daemon code and follow this approach to push data to the DB as soon as you get it from the Internet. Since your daemon would run independently of Django, you would also need to take care of data synchronisation issues. One possible solution is to use a DateTimeField with auto_now_add=True in your Django model, which tells you when each row was entered into the DB. Hope this helps you or someone else looking for a similar answer.
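One caveat with the timestamp idea: auto_now_add is filled in by Django's ORM when a model instance is saved, not by the database itself, so a script that writes through the sqlite3 module has to supply the timestamp on its own. A minimal sketch of the writer side follows; the table and column names (prices_price, symbol, price, created) are hypothetical stand-ins for whatever the Django model actually generates.

```python
import sqlite3
from datetime import datetime, timezone


def record_price(conn: sqlite3.Connection, symbol: str, price: float) -> None:
    """Insert one price row, stamped with the current UTC time."""
    # The ORM is bypassed here, so the script sets the timestamp itself.
    created = datetime.now(timezone.utc).isoformat()
    with conn:  # commits on success, rolls back if the insert raises
        conn.execute(
            "INSERT INTO prices_price (symbol, price, created) VALUES (?, ?, ?)",
            (symbol, price, created),
        )


if __name__ == "__main__":
    # In-memory database for the demo; the real script would open the Django
    # db file. The timeout makes the writer wait while another process holds the lock.
    conn = sqlite3.connect(":memory:", timeout=30)
    conn.execute(
        "CREATE TABLE prices_price ("
        "id INTEGER PRIMARY KEY, symbol TEXT NOT NULL, "
        "price REAL NOT NULL, created TEXT NOT NULL)"
    )
    record_price(conn, "ACME", 12.5)
    print(conn.execute("SELECT symbol, price, created FROM prices_price").fetchall())
```

Keeping each insert in its own short transaction limits how long the file is locked, which matters because the web app and the download script share the same SQLite file.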

Should I have my Postgres directory right next to my project folder? If so, how?

I'm trying to develop a Django website with Heroku. Having no previous experience with databases (except the sqlite3 one from the tutorial), it seems to me a good idea to have the following file structure:
Projects
'-MySite
|-MySite
'-MyDB
I'm finding it hard to figure out how to do it, with psql commands preferring to put the databases in some obscure directory instead. Perhaps it's not such a good idea?
Eventually I want to be able to test and develop my site (it'll be just a blog for a while, I'm still learning) locally (ie. add a post, play with the CSS) and sync with the Heroku repository, but I also want to be able to add posts via the website itself occasionally.
The underlying data files (MyDB) have nothing to do with your project files and should not be under your project.
EDIT
Added two ways to sync your local database with the database on the Heroku server:
1) export-import
This is the simplest way. Do the following steps every now and then:
make an export on the Heroku server by using the pg_dump utility
download the dump file
import the dump into your local database by using the psql utility
2) replication
A more sophisticated way of keeping your local DB in sync all the time is replication. It is used in professional environments and is probably overkill for you at the moment. You can read more about it here: http://www.postgresql.org/docs/9.1/static/high-availability.html

Where is the heroku database?

I am trying to host my PHP application on Heroku's cloud services. This is my first try with any Git client; following the procedure described in the Heroku documentation, I have finished pushing my files to the repo.
But now the one place where I am totally lost is: where is the Heroku database, and how can I configure it?
I went to myapp > Resources, where it says that 5 MB of database can be used for free. The only clickable link there is the 5 MB label, but even that does not take me anywhere.
Where is the control panel for that database, where I can edit it, run SQL to configure it, and find its name, username, etc. (maybe an interface like phpMyAdmin)?
Kindly guide me through this.
Thank you.
There is no "control panel" for the Heroku database. As for "where is it", there is a SHARED_DATABASE_URL environment variable of the form:
$ heroku config | grep DATABASE
SHARED_DATABASE_URL => postgres://username:password@host:port/database_name
In your PHP code, you can get this like so:
$database_url = getenv('SHARED_DATABASE_URL');
You may need to do some parsing of that URL to get it into a format that your PHP database API needs (it's been a while since I wrote any PHP).
As for "how do I configure my database", either from the command line, e.g.
$ heroku run php
or, assuming your code has some ORM-y features, invoking that to set up the database schema, or using heroku's db:push command, e.g.:
$ heroku db:push [URL_TO_MY_LOCAL_SOURCE_DATABASE]
I was looking for something like phpMyAdmin for Heroku databases, and I found the Adminium add-on, which works in a similar way.
Much easier than the console.
Heroku will automatically set up your access to the database.
You may use taps to push and pull data between your development machine and Heroku. See http://devcenter.heroku.com/articles/taps
Alternatively, you may use pgbackups - http://devcenter.heroku.com/articles/pgbackups
Heroku recommends pgbackups as the most complete way of handling your database data (as described on the taps page).
Usually, what you push to Heroku is the production side of the application, so it has a separate database that uses the same schema you designed, once it has been migrated over.
So all your data will need to be re-entered through the Heroku app, which can be found at:
'app name'.herokuapp.com