git multiple remotes: what to do with ignored/private files - django

My problem/issue
We are working on an open-source project which we have hosted on GitHub. The project is written in Django, so our settings live in a settings.py file. We have it open source not because we are too cheap for a subscription, but because we intentionally want it to be open source.
Anyway, we also want to run this code ourselves, so we have a site_settings.py which contains all the sensitive data such as the database password. Since we don't want this on GitHub for everyone to see, it is listed in the .gitignore file.
Normally we clone the repo and add this file by hand, both locally and on the server; this way we all have our personal settings and the sensitive data is never stored in git.
The problem is that we now want to run our code on Heroku, which doesn't support creating files on the server, since Heroku doesn't work like a normal server (it works with a distributed file system).
Of course I have thought about it myself, but I am not sure if this is the right approach. Please keep in mind that I am new to both Heroku and Git. The following is my current solution:
My current idea on how to fix this
There are two remotes:
origin == GitHub
heroku == Heroku
Leaving out all the development and feature branches, we only have the two master branches (on Heroku and GitHub) to worry about.
My idea was to have the branches set up like this locally:
master -> remotes/origin/master
heroku -> remotes/heroku/master
On the master branch we keep site_settings.py in the .gitignore so git will ignore it when committing to GitHub; the same goes for all the dev branches, etc. We collect all our changes in the master branch until we have a new release.
On the heroku branch we edit the .gitignore file to stop ignoring site_settings.py, so it will be committed to Heroku once we push the heroku branch.
Once we have a new release we switch to heroku branch and merge master into heroku like so:
git checkout heroku
git pull
git merge master
Once we have sorted out any merge conflicts, we push to Heroku and have a new release up and running. =D
My question
Is my idea an acceptable way to solve this (will it even work with the .gitignore set up like this?), and if not, how do we solve this in the cleanest/best way possible? If you need more info, don't hesitate to ask :)

For starters, storing your config in files is the wrong way to go about this, so don't do it.
From the 12factor app:
An app’s config is everything that is likely to vary between deploys (staging, production, developer environments, etc). This includes:
Resource handles to the database, Memcached, and other backing services
Credentials to external services such as Amazon S3 or Twitter
Per-deploy values such as the canonical hostname for the deploy
Apps sometimes store config as constants in the code. This is a violation of twelve-factor, which requires strict separation of config from code. Config varies substantially across deploys, code does not.
A litmus test for whether an app has all config correctly factored out
of the code is whether the codebase could be made open source at any
moment, without compromising any credentials.
Note that this definition of “config” does not include internal
application config, such as config/routes.rb in Rails, or how code
modules are connected in Spring. This type of config does not vary
between deploys, and so is best done in the code.
Another approach to config is the use of config files which are not
checked into revision control, such as config/database.yml in Rails.
This is a huge improvement over using constants which are checked into
the code repo, but still has weaknesses: it’s easy to mistakenly check
in a config file to the repo; there is a tendency for config files to
be scattered about in different places and different formats, making
it hard to see and manage all the config in one place. Further, these
formats tend to be language- or framework-specific.
The twelve-factor app stores config in environment variables (often
shortened to env vars or env). Env vars are easy to change between
deploys without changing any code; unlike config files, there is
little chance of them being checked into the code repo accidentally;
and unlike custom config files, or other config mechanisms such as
Java System Properties, they are a language- and OS-agnostic standard.
Another aspect of config management is grouping. Sometimes apps batch
config into named groups (often called “environments”) named after
specific deploys, such as the development, test, and production
environments in Rails. This method does not scale cleanly: as more
deploys of the app are created, new environment names are necessary,
such as staging or qa. As the project grows further, developers may
add their own special environments like joes-staging, resulting in a
combinatorial explosion of config which makes managing deploys of the
app very brittle.
In a twelve-factor app, env vars are granular controls, each fully
orthogonal to other env vars. They are never grouped together as
“environments,” but instead are independently managed for each deploy.
This is a model that scales up smoothly as the app naturally expands
into more deploys over its lifetime.
For Python, your config can be found in os.environ. For specific config, particularly the database, you want to be using something like:
import os
import sys
import urlparse

# Register database schemes in URLs.
urlparse.uses_netloc.append('postgres')
urlparse.uses_netloc.append('mysql')

try:
    # Check to make sure DATABASES is set in settings.py file.
    # If not default to {}
    if 'DATABASES' not in locals():
        DATABASES = {}

    if 'DATABASE_URL' in os.environ:
        url = urlparse.urlparse(os.environ['DATABASE_URL'])

        # Ensure default database exists.
        DATABASES['default'] = DATABASES.get('default', {})

        # Update with environment configuration.
        DATABASES['default'].update({
            'NAME': url.path[1:],
            'USER': url.username,
            'PASSWORD': url.password,
            'HOST': url.hostname,
            'PORT': url.port,
        })

        if url.scheme == 'postgres':
            DATABASES['default']['ENGINE'] = 'django.db.backends.postgresql_psycopg2'

        if url.scheme == 'mysql':
            DATABASES['default']['ENGINE'] = 'django.db.backends.mysql'
except Exception:
    print 'Unexpected error:', sys.exc_info()
For more info on config vars, see here: https://devcenter.heroku.com/articles/config-vars

Rather than developing a complex git-based deployment of settings files and variables on Heroku, I'd use environment variables for all of your sensitive stuff. That way you don't have to keep it in code. See the database docs and environment variables docs for information - put all this stuff in there, and then access it via os.getenv() (python docs).
To see the environment variables for your app, use the Heroku CLI tool:
heroku run env
This will print out all the variables. To add new variables (e.g. S3 keys, etc.) use:
heroku config:add VAR_NAME=value
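As a minimal sketch of how the two sides fit together (the variable names below are just examples, not something your app requires), a value you set with heroku config:add shows up in os.environ for your settings.py to read:

import os

# Set beforehand with, for example:
#   heroku config:add AWS_ACCESS_KEY_ID=... AWS_SECRET_ACCESS_KEY=...
# (illustrative names; use whatever config vars your app actually needs)
AWS_ACCESS_KEY_ID = os.getenv('AWS_ACCESS_KEY_ID')
AWS_SECRET_ACCESS_KEY = os.getenv('AWS_SECRET_ACCESS_KEY')

# Fail loudly if a required secret is missing rather than limping along.
if not AWS_ACCESS_KEY_ID:
    raise RuntimeError('AWS_ACCESS_KEY_ID is not set in the environment')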

Related

What is the recommended method for deploying Django settings to a production environment?

I have looked everywhere and all I could find was either outdated or bits and pieces from several different sources, but none of them gave complete steps for deploying Django to a production environment.
What I would like to see is the recommended way to deploy Django using git, with local and production settings.py files and a .gitignore written specifically for Django.
How would I implement the settings to accommodate both environments and what should be added to the .gitignore file so that only necessary files are sent to the git repository?
Please note that I want to know best practices and up to date methods of using local and production settings.
Such questions arise when deploying Django apps.
Where should the settings be stored?
How can I properly implement the settings to accommodate both environments?
What sensitive information should be separate from the main settings?
Update 25/09/2016:
Thanks andreas for clarifying this; I now completely agree that there is no single standard way of deploying Django. I did a lot of research and found the most commonly used methods for handling Django settings. I will share my findings with everyone who has the same doubt.
Answering
Where should the settings be stored?
The settings are better accommodated within a settings module rather than a single settings.py file.
How can I properly implement the settings to accommodate both environments?
This is where the settings module comes in very handy, because you can keep independent settings for each environment, and they remain maintainable with git or any other version control system.
What sensitive information should be separate from the main settings?
Sensitive information such as the database credentials, email credentials, the secret key and anything authentication-related. As for how you serve this private information secretly, I recommend following andreas' suggestion:
You could either create a separate file at /etc/app_settings and make it very restricted, or simply use environment variables from within the system.
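If you go the restricted-file route, a minimal sketch of reading it from your settings could look like this (the path /etc/app_settings/secrets.json and the key names are assumptions, not a standard):

import json
import os

# Hypothetical file readable only by the app user (e.g. mode 0600):
# /etc/app_settings/secrets.json -> {"SECRET_KEY": "...", "DB_PASSWORD": "..."}
SECRETS_FILE = '/etc/app_settings/secrets.json'

if os.path.exists(SECRETS_FILE):
    with open(SECRETS_FILE) as f:
        _secrets = json.load(f)
else:
    # Fall back to plain environment variables if the file is absent.
    _secrets = os.environ

SECRET_KEY = _secrets.get('SECRET_KEY', '')
DATABASE_PASSWORD = _secrets.get('DB_PASSWORD', '')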
.gitignore part
For the .gitignore part I found an awesome repo with all sorts of examples at https://github.com/github/gitignore.
I ended up using:
# OSX Finder turds
.DS_Store
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class
# Distribution / packaging
.Python
env/
# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec
# Installer logs
pip-log.txt
pip-delete-this-directory.txt
# Django stuff:
*.log
my_app/settings/dev.py
bower_components
Because the answer ended up being so long I've created an article explaining every step of the way at Django Settings Deployment.
There is not one recommended way of deploying a Django app. The official documentation provides a guide for deploying with Apache and mod_wsgi. Other options are a PaaS like Heroku/Dokku, or deployment with Docker.
It's common to divide your settings into different files. You could, for example, split the settings into four files:
base file (base.py) - common settings for all environments (every file below imports from this with from .base import *)
development file (development.py) - settings specific to development (DEBUG = True, etc.)
production file (production.py) - settings specific to the production environment (DEBUG = False, whitenoise configuration, etc.)
testing file (testing.py)
You can control which settings file is loaded with the environment variable DJANGO_SETTINGS_MODULE.
It is also commonly recommended practice to store secrets like the SECRET_KEY in environment variables: SECRET_KEY = os.environ.get('SECRET_KEY', None). Or check out the django-environ package.
Check out this repo for an example setup.
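As a rough illustration of that layout (the project name myproject and the values shown are placeholders, not taken from any particular repo):

# myproject/settings/ is a package:
#   __init__.py
#   base.py         - common settings (INSTALLED_APPS, TEMPLATES, ...)
#   development.py  - DEBUG = True, local database, etc.
#   production.py   - shown below
#   testing.py

# myproject/settings/production.py
import os

from .base import *  # noqa - pull in everything shared

DEBUG = False
ALLOWED_HOSTS = ['example.com']            # placeholder domain
SECRET_KEY = os.environ.get('SECRET_KEY')  # never hard-code this

# Select this file at deploy time with, for example:
#   DJANGO_SETTINGS_MODULE=myproject.settings.production gunicorn myproject.wsgi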

Update sensitive code on GitHub and server?

I am new to web development and git. I have created a project which I have hosted on pythonanywhere.com. I pushed my code to GitHub and then cloned it to PythonAnywhere. I have some information in my settings.py file which I want to hide on GitHub. So how can I make changes to the project on my local machine and update them on GitHub, and from there to PythonAnywhere, without revealing the hidden info?
As I said, I am new to using git, so I am unaware of many of the tools that come with it. What is the proper way to do this?
A simple solution is:
create a settings_local.py next to your settings.py
move all sensitive stuff into settings_local.py (an example of such a file follows these steps)
add the following code to settings.py to import the sensitive settings:
try:
    from .settings_local import *  # noqa
except ImportError:
    pass
add settings_local.py to .gitignore so git will exclude it from commits
remove the sensitive data already pushed to GitHub by following this guide
create settings_local.py on PythonAnywhere and on your local machine, manually or with some script
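For illustration, settings_local.py might then contain something like this (every value below is a placeholder, not a real credential):

# settings_local.py - kept out of version control via .gitignore
SECRET_KEY = 'replace-me-with-a-long-random-string'

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql_psycopg2',
        'NAME': 'mydb',
        'USER': 'myuser',
        'PASSWORD': 'replace-me',
        'HOST': 'localhost',
        'PORT': '5432',
    }
}

DEBUG = True  # typically True locally, False on PythonAnywhere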

How to maintain different Django settings both on the server and on GitHub [duplicate]

This question already has answers here: Using Git to keep different versions of a file locally vs. in the master repository (2 answers). Closed 9 years ago.
On the production server I certainly need to have different settings, ones that differ from my local settings.
Our open-source project is hosted on GitHub, so the master branch is not production code (at least as far as the settings config goes).
Now we have to host the project (Django). For that, the simplest solution I found is to create a new branch locally, set up the production settings, add the server's git repository as a remote, and push that branch to it.
That way we can easily merge the latest stable release into the production branch whenever we want.
But there is an issue in managing files on GitHub.
Say the Django settings are provided in "proj/settings.py". As local settings differ from system to system, we created "proj/local_settings.py" to override system-specific settings (say, the staticfiles location). This file is ignored by Git via .gitignore.
Now, if we want to use this file in the production branch to configure production settings, we can't, because it is currently ignored by Git.
To push local_settings.py to the production server from the local production branch, we need to remove that entry from .gitignore in that particular branch.
All fine, accepted up to here.
Here comes the actual issue.
When we want to push new changes to the production server, we first have to get them into the local production branch and then push to the production server... but now:
the .gitignore file reverts to what is present in the release (i.e., local_settings.py is ignored again).
Because of this I have to manually fix the .gitignore of the local production branch (un-ignoring local_settings.py) every time I merge something into it...
Certainly all of the above is a big mess. How do I handle it cleanly?
What I'd suggest is to remove from your settings.py any settings that depend on where it is deployed. Put those settings into environment variables. For example:
STATIC_ROOT = os.environ.get('STATIC_ROOT')
then you need to set those environment variables before running the webserver.
On local:
STATIC_ROOT=`pwd`/static python manage.py runserver
On production it depends on how you deploy; consult the documentation.
This principle of keeping config in environment variables is described in the Twelve-Factor App: http://www.12factor.net/config
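The same idea extends to any per-deploy setting; a small sketch (the variable names and defaults are just examples) that works unchanged on local and production machines:

import os

# Environment-driven settings with safe defaults for local development.
DEBUG = os.environ.get('DJANGO_DEBUG', 'true').lower() == 'true'
STATIC_ROOT = os.environ.get('STATIC_ROOT', os.path.join(os.path.dirname(__file__), 'static'))
ALLOWED_HOSTS = os.environ.get('ALLOWED_HOSTS', 'localhost').split(',')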

Multiple Django sites with shared codebase and DB

I have created a Django project with 20 sites (a different domain per site) for 20 different countries. The sites share everything: codebase, database, urls, templates, etc.
The only things they don't share are small customizations (logo, background color of the CSS theme, language code, etc.) that I set in each site's settings file (each site has its own settings file, and all of these files import a global settings file with the common stuff). Right now, in order to run the sites in development mode, I do:
django-admin.py runserver 8000 --settings=config.site_settings.site1
django-admin.py runserver 8001 --settings=config.site_settings.site2
...
django-admin.py runserver 8020 --settings=config.site_settings.site20
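For context, one of those per-site settings files might look roughly like this (the module path and the setting values are illustrative, not taken from the project above):

# config/site_settings/site1.py
from .global_settings import *  # noqa - everything the 20 sites share

SITE_ID = 1
LANGUAGE_CODE = 'de'

# Small per-site customizations consumed by the shared templates/CSS.
SITE_LOGO = 'img/logos/site1.png'
THEME_BACKGROUND_COLOR = '#003366'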
I have a couple of questions:
I've read that it is possible to create a virtual host for each site (domain) and pass it the site's settings.py file. However, I am afraid this would create one Django instance per site. Am I right?
Is there a more efficient way of doing the deployment? I've read about django-dynamicsites but I am not sure if it is the right way to go.
If I decide to deploy using Heroku, it seems that Heroku expects only one settings file per app, so I would need to have 20 apps. Is there a solution for that?
Thanks!
So, I recently did something similar, and found that the strategy below is the best option. I'm going to assume that you are familiar with git branching at this point, as well as Heroku remotes. If you aren't, you should read this first: https://devcenter.heroku.com/articles/git#multiple-remotes-and-environments
The main strategy I'm taking is to have a single codebase (a single Git repo) with:
A master branch that contains all your shared code: templates, views, URLs.
Many site branches, based on master, which contain all site-specific customizations: css, images, settings files (if they are vastly different).
The way this works is like so:
First, make sure you're on the master branch.
Second, create a new git branch for one of your domains, eg: git checkout -b somedomain.com.
Third, customize your somedomain.com branch so that it looks the way you want.
Next, deploy somedomain.com live to Heroku, by running heroku create somedomain.com --remote somedomain.com.
Now, push your somedomain.com branch code to your new Heroku application: git push somedomain.com somedomain.com:master. This will deploy your code on Heroku.
Now that you've got your somedomain.com branch deployed with its own Heroku application, you can do all normal Heroku stuff by adding --remote somedomain.com to your normal Heroku commands, eg:
heroku pg:info --remote somedomain.com
heroku addons:add memcache:5mb --remote somedomain.com
etc.
So, now you've basically got two branches: a master branch, and a somedomain.com branch.
Go back to your master branch, and make another new branch for your next domain: git checkout master; git checkout -b anotherdomain.com. Then customize it to your liking (css, site-specific stuff), and deploy the same way we did above.
I'm sure you can see where this is going by now. We've got one git branch for each of our custom domains, and each domain has its own Heroku app. The benefit (obviously) is that each of these per-domain customizations is based off the master branch, which means you can easily make updates to all sites at once.
Let's say you update one of your views in your master branch--how can you deploy it to all your custom sites at once? Easily!
Just run:
git checkout somedomain.com
git merge master
git push somedomain.com somedomain.com:master # deploy the changes
And repeat for each of your domains. In my environment, I wrote a script that does this, but it's easy enough to do manually if you'd like.
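For reference, a rough sketch of such a script in Python (the domain list and the assumption that each branch shares its name with its Heroku remote are mine, not the answer's; it simply automates the three commands above for every domain):

import subprocess

# One git branch and one Heroku remote per domain (illustrative list).
DOMAINS = ['somedomain.com', 'anotherdomain.com']

def run(*cmd):
    # Abort the whole deploy if any git command fails.
    subprocess.check_call(list(cmd))

for domain in DOMAINS:
    run('git', 'checkout', domain)
    run('git', 'merge', 'master')
    # Push the domain branch to its Heroku remote's master branch.
    run('git', 'push', domain, '%s:master' % domain)

run('git', 'checkout', 'master')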
Anyhow, hopefully that helps.

Google App Engine Development and Production Environment Setup

Here is my current setup:
GitHub repository, a branch for dev.
myappdev.appspot.com (not real url)
myapp.appspot.com (not real url)
App written on GAE Python 2.7, using django-nonrel
Development is performed on a local dev server. When I'm ready to release to dev, I increment the version, commit, and run "manage.py upload" to deploy to myappdev.appspot.com.
Once testing is satisfactory, I merge the changes from dev into the main branch. I then run "manage.py upload" to upload the main branch's code to the myapp.appspot.com domain.
Is this setup good? Here are a few issues I've run into.
1) I'm new to git, so sometimes I forget to add files, and the commit doesn't notify me. So I deploy code to dev that works, but does not match what is in the dev branch. (This is bad practice).
2) The datastore file in the git repo causes issues. Merging binary files? Is it ok to migrate this file between local machines, or will it get messed up?
3) Should I be using "manage.py upload" for each release to the dev or prod environment, or is there a better way to do this? Heroku looks like it can pull right from GitHub. The way I'm doing it now seems like there is too much room for human error.
Any overall suggestions on how to improve my setup?
Thanks!
I'm on a pretty similar setup, though I'm still running on py2.5, django-nonrel.
1) I usually use 'git status' or 'git gui' to see if I forgot to check in files.
2) I personally don't check in my datastore. Are you familiar with .gitignore? It's a text file in which you list files for git to ignore when you run 'git status' and other functions. I put in .gaedata as well as .pyc and backup files.
To manage the database I use "python manage.py dumpdata > file", which dumps the database to a JSON-encoded file. Then I can reload it using "python manage.py loaddata file".
3) I don't know of any way to deploy from git. You can probably write a little Python script to check whether git is up to date before you deploy. Personally, though, I deploy stuff to test to make sure it's working before I check it in.
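A little pre-deploy check along those lines could be as simple as this sketch (the final deploy command just mirrors the "manage.py upload" step from the question; substitute whatever you actually use):

import subprocess
import sys

# Refuse to deploy if there are uncommitted or untracked changes.
status = subprocess.check_output(['git', 'status', '--porcelain'])
if status.strip():
    sys.exit('Working tree is not clean - commit or stash before deploying.')

# The deploy step itself; mirrors the question's "manage.py upload".
subprocess.check_call(['python', 'manage.py', 'upload'])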