Mercurial: keep 2 branches in sync but with certain persistent differences? - django

I'm a web developer working on my own using django, and I'm trying to get my head round how best to deploy sites using mercurial. What I'd like to have is to be able to keep one repository that I can use for both production and development work. There will always be some differences between production/development (e.g. they might use different databases, development will always have debug turned on) but by and large they will be in sync. I'd also like to be able to make changes directly on the production server (tidying up html or css, simple bugfixes etc.).
The workflow that I intend to use for doing this is as follows:
Create 2 branches, prod and dev (all settings initially set to production settings)
Change settings.py and a few other things in the dev branch. So now I've got 2 heads, and from now on the repository will always have 2 heads.
(On dev machine) Make changes to dev, then use 'hg transplant' to copy relevant changesets to production.
push to master repository
(On production server) Pull from master repo, update to prod head
Note: you can also make changes straight to prod so long as you transplant the changes into dev.
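In commands, one round of this workflow would look roughly like the following (a sketch only, assuming prod and dev are named branches and that the transplant extension is enabled in your hgrc):

# on the dev machine
hg update dev
# ... edit files, then:
hg commit -m "tidy up css"
hg update prod
hg transplant tip            # copy the changeset just committed on dev onto prod
hg push                      # push both heads to the master repository

# on the production server
hg pull
hg update prod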
This workflow has the drawback that whenever you make a change, not only do you have to commit it to whichever branch you made the change on, but you also have to transplant it to the other branch. Is there a more sensible way of doing what I want here, perhaps using patches? Or failing that, is there a way of automating the commit process so that it automatically transplants the changeset to the other branch, and would that be a good idea?

I'd probably use Mercurial Queues for something like this. Keep the main repository as the development version, and have a for-production patch that makes any necessary changes for production.
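A rough sketch of what that could look like with MQ (the patch name here is invented; qnew/qrefresh/qpop/qpush are the relevant commands):

# enable the extension once in your hgrc:  [extensions] mq =
hg qnew production-settings      # start the for-production patch
# edit settings.py: DEBUG = False, production database, etc.
hg qrefresh                      # record those edits in the patch
hg qpop                          # pop it off for day-to-day development work
# ... develop and commit as usual ...
hg qpush                         # re-apply the production patch when deploying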

Here are two possible solutions, one using Mercurial and one not:
Use the hostname to switch between prod and devel. We have a single check at the top of our settings file that looks at the SERVER_NAME environment variable. If it's www.production.com it's the prod DB and otherwise it picks a specified or default dev/test/stage DB.
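As a sketch, that check might look something like this at the top of settings.py (the hostname and database names here are illustrative, not the poster's actual code):

import os

if os.environ.get('SERVER_NAME') == 'www.production.com':
    DEBUG = False
    DATABASE_NAME = 'prod_db'        # production database
else:
    DEBUG = True
    DATABASE_NAME = 'dev_db'         # dev/test/stage database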
Using Mercurial, just have a clone that's dev and a clone that's prod, make all changes in dev, and at deploy time pull from dev to prod. After pulling you'll have 2 heads in prod diverging from a single common ancestor (the last deploy). One head will have a single changeset containing only the differences between dev and prod deployments, and the other will have all the new work. Merge them in the prod clone, selecting the prod changes on conflict of course, and you've got a deployable setup, and are ready to do more work on 'dev'. No need to branch, transplant, or use queues. So long as you never pull that changeset with the prod settings into 'dev' it will always need a merge after pulling from dev, and if it's just a few lines there's not much to do.
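In commands, that cycle might look roughly like this (paths and commit messages are made up):

# one-time setup
hg clone dev prod
cd prod
# edit settings.py for production and commit; never push this changeset back to dev
hg commit -m "production-only settings"

# each deploy, in the prod clone
hg pull ../dev                   # the new work arrives as a second head
hg merge                         # keep the prod settings wherever they conflict
hg commit -m "merge dev for deploy"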

I've solved this with local settings.
Append to settings.py:
try:
    from local_settings import *
except ImportError:
    pass
touch local_settings.py
Add ^local_settings.py$ to your .hgignore
Each deploy I do has its own local settings (typically different DB stuff and different origin email addresses).
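For illustration, a local_settings.py on one deploy might contain something like this (all values invented):

# local_settings.py -- untracked, one per deploy
DEBUG = False
DEFAULT_FROM_EMAIL = 'noreply@example.com'
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql_psycopg2',
        'NAME': 'mysite_prod',
        'USER': 'mysite',
        'PASSWORD': 'not-really-this',
    }
}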
PS: I only read the "minified versions of javascript" portion later. For that, I would suggest a post-update hook and a config setting (like JS_EXTENSION).
Example (off the top of my head! not tested, adapt as necessary):
Put JS_EXTENSION = '.raw.js' in your settings.py file;
Put JS_EXTENSION = '.mini.js' in your local_settings.py file on the production server;
Change JS inclusion from:
<script type="text/javascript" src="blabla.js"></script>
To:
<script type="text/javascript" src="blabla{{JS_EXTENSION}}"></script>
Make a post-update hook that looks for *.raw.js and generates .mini.js (minified versions of raw);
Add .mini.js$ to your .hgignore
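One step the list above leaves implicit: settings are not automatically available in templates, so {{JS_EXTENSION}} has to be injected, for example with a small context processor (a sketch; module and default names are assumptions):

# myproject/context_processors.py
from django.conf import settings

def js_extension(request):
    # makes {{ JS_EXTENSION }} available in every template rendered with a RequestContext
    return {'JS_EXTENSION': getattr(settings, 'JS_EXTENSION', '.raw.js')}

Then add 'myproject.context_processors.js_extension' to your template context processors setting.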

Perhaps try something like this (I was just thinking about this issue; in my case it's a sqlite database):
Add settings.py to .hgignore, to keep it out of the repository.
Take your settings.py files from the two separate branches and move them into two separate files, settings-prod.py and settings-dev.py
Create a deploy script which copies the appropriate settings-X file to settings.py, so you can deploy either way.
If you have a couple of additional files, do the same thing for them. If you have a lot of files but they're all in the same directory by themselves, you could just create a pair of directories: production and development, and then either copy or symlink the appropriate one into a deploy directory.
If you did something like this, you could dispense with the need for branching your repository.
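The deploy script from the steps above could be as small as this (a sketch; the prod/dev argument is an assumption):

# deploy.py -- copy the right settings file into place
import shutil
import sys

target = sys.argv[1] if len(sys.argv) > 1 else 'dev'      # 'dev' or 'prod'
shutil.copyfile('settings-%s.py' % target, 'settings.py')
print('Installed settings-%s.py as settings.py' % target)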

I actually do this using named branches and straight merging instead of transplanting (which is more reliable, IMO). This usually works, although sometimes (when you've edited the files that differ on the other branch) you'll need to take care not to remove the differences again when you're merging.
So it works great if you're not changing the differing files much.

Related

Working on the same git with two different pc's. Two different postgresql settings in the settings.py file

I'm very new to databases and I'm trying to find out the best practice for what I'm trying to achieve.
I have one repository, which is a Django backend with a postgresql database attached. I'm working with this on my main pc, but recently I've had to work on my laptop. My laptop has another postgresql database running on 5432, so I've had to change some of that info to use port 54324. I don't want these changes pushed to the repository, but I would still like to track the settings.py file in the repository. So far I've just created a branch for each pc to maintain the separate settings, but I'm sure this is not a great way to do it. I've heard about setting up environment files, but I'm unsure if this is the 'right way' to do it either.
I'm a little confused about the best way to do this; hopefully I'm making sense. Any help would be appreciated greatly.
Thanks,
Darren
This is normally solved with a properties file that is ignored. What you keep is a sample file (with a different name) that you do track in git and update accordingly. Your python scripts read the properties file and everybody should be happy.
Besides eftshift0's answer, consider having a committed config.defaults.py file that sets default configuration values which may be overridden by a per-site config.local.py file. If the default configuration works for you, you don't need to create the per-site config. If not, create the per-site config. Never commit (and do .gitignore) the per-site config.
The configuration files might be named differently or located outside the repository proper, but the overall idea still applies. The distributed (and committed) configuration file is a sample and/or default, and the actual site settings are kept in some other file that is never committed.
If you already have a single config.py or settings.py, you can establish this configuration pattern by adding site.py (use whatever name you want for this per-site setting file) as an ignored file. Read the new file, if it exists, such that the site settings override the default settings from the existing tracked file, and you're good to go.
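Concretely, for the database-port difference in the question, the tracked settings file could end with something like this (the per-site file name is arbitrary and must be in .gitignore):

# settings.py (tracked) -- defaults that suit the main pc
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql',
        'NAME': 'myproject',
        'PORT': '5432',
    }
}

# keep this at the very bottom so per-site values win
try:
    from site_settings import *    # on the laptop this file redefines DATABASES with PORT '54324'
except ImportError:
    pass                           # no overrides on this machine; the defaults apply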

Sitecore Config Files + Project Setup

We are updating our sitecore to 8.2 and in the process I am trying to refine our source control and development workflow.
Goals
1. Have a single source of truth for support dlls, configs, lic, etc.
2. Have everything in source control that is needed to recreate the entire site from dev to prod (excluding packages).
In order to have all of the different configs needed for the various machines I have created gulp tasks that transform the configs on build (dev, staging, prod). Those transformed configs are placed in a folder in the project that is then used to replace the originals on the target machines. This folder publishes all of its contents and seems to be working well so far.
What I don't know is how to deal with all of the config files that do not change.
Is it best to include all of those .config files in the project so that they publish? If not, then the target machine folders will have to be either managed manually (which seems like a bad idea) or kept up to date by a script (more customization, which by default is not a great idea).
The only downside (that I see) to including all of the configs in the project is the weight that it would add to file searches (and that doesn't seem like a very strong argument).
Am I not seeing something?
How are you other Sitecore humans handling this?
Gregory
As a general rule of thumb, do not check any default files into Source Control.
The main reasons are bloat, which makes syncing/downloading from your source control take much longer, and upgrades, the latter being the more important reason.
If/when you upgrade in the future, if you do not have any Sitecore files checked into source control then you can simply deploy a new/clean instance of Sitecore, fix any conflicts in your own code and then deploy on top. You don't have to try and figure out what has changed in the default install files between releases.
Any changes you need to make to Sitecore configs or settings should be made using patch files and only those custom files added to your solution.
How to handle this for deployments?
There are a few options. You could go down the scripted route, which would take a clean Sitecore install, unzip it and make whatever modifications you need, then install/unzip the modules that you use in your solution one by one.
Another option may be to create a default install with all the modules and then zip this up; an install would then be a similar process to the above, but simpler, since it is just a case of unzipping a single file. You could use Sitecore SIM to install the instance and modules and then take a backup, or do this manually.
Yet another alternative may be to check everything into Source Control, either in a separate repository or a different project, to ensure that all default files and configs are kept separate. If you need to upgrade in the future, simply delete the repo/project and add them back in again.
I would also do the same (a separate project) to keep all Support patches/dlls separate, again to help easily identify what fixes have been applied and to easily remove them if a future version resolves the issue.
These may add an additional step to your deploy, but keeping this separation will make your life much much easier when it comes to upgrade time.

Django Compressor on a multi-server deployment

I've been fortunate enough to discover django_compressor and implemented it within our stack, which deploys to many servers (currently 6, but growing as we deploy smaller virtual machines).
Now this is all fine and dandy if you're using django_compressor at its finest: compressing raw CSS/JS code.
However, say now I want to introduce some type of pre-compiler. Let's say for this example it is LESS (css). The thought process for this is fairly simple:
Install node, npm, and the less package onto the server.
Add less to your precompilers!
COMPRESS_PRECOMPILERS = ( ('text/less', 'lessc {infile} {outfile}'), )
Now you deploy, and your server compiles the less file. Everything is fantastic!
Now let's add 8 more servers to that: do you have to install node, npm, and less on each one?
This is where something doesn't seem right, and I feel like I'm missing something. I believe the Django community has run into this problem before.
My thoughts thus far have been:
Use a post-commit hook to compile the CSS on the developer's machine. This means that via django_compressor, we link to the compiled static file in the HTML, and our repository contains both the compiled and non-compiled versions. The only downside I see is that it gives up half the benefits of django_compressor and may be tedious for developers.
Suck it up and make node, npm, and less part of the server stack.
Update
I did some additional looking around and it seems that using the COMPRESS_OFFLINE flag (or just --force) with the management command will produce an offline manifest file that does what I need (only tested locally). So setting this up with a pre-deploy hook looks to be the answer.
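For reference, the offline setup described above boils down to settings like these plus a build step (a sketch; paths depend on your project):

# settings.py
COMPRESS_ENABLED = True
COMPRESS_OFFLINE = True      # serve from the pre-built manifest instead of compressing per request

# in the pre-deploy hook:
#   python manage.py compress --force
# then ship the generated CACHE/ directory and manifest along with the release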
Of course, still open to other ideas :-)
Coupled with the tips in the comments about COMPRESS_OFFLINE, you could look at django-staticfiles' storage stuff. You can host the static files on amazon s3, for instance, so hosting it all on one static-hosting server and using that from all your servers could also be a nice solution. You wouldn't need to do anything with the static (and compressed) files on the individual servers.
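A rough sketch of that setup, assuming the django-storages S3 backend (the module path and bucket name are assumptions; check the versions you use):

# settings.py
STATICFILES_STORAGE = 'storages.backends.s3boto.S3BotoStorage'
AWS_STORAGE_BUCKET_NAME = 'my-static-bucket'
COMPRESS_STORAGE = STATICFILES_STORAGE        # django_compressor writes its output there too
COMPRESS_URL = 'https://my-static-bucket.s3.amazonaws.com/'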
Alternative solution regarding the multiple servers: I've made a custom fabric (docs.fabfile.org) script that installs/configures stuff on our servers. I've only recently started using coffeescript and less, but those two are definitively ending up in my fabfile. That solves the installation problem for me.
(Alternatives to a fabfile are things like a custom debian package with standard dependencies. Or chef or puppet or something similar.)
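A minimal sketch of such a fabfile, using Fabric's classic API (host names and package names are made up):

# fabfile.py
from fabric.api import env, sudo

env.hosts = ['web1.example.com', 'web2.example.com']   # all of the web servers

def install_less():
    """Install node/npm and the lessc compiler on every host."""
    sudo('apt-get install -y nodejs npm')
    sudo('npm install -g less')

# run with:  fab install_less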
You can use Puppet for the task.

Django: pre-deployment

Question 1:
I am about to deploy my first Django website and I was wondering what tools are recommended for gathering all your Django files.
For example, I don't need my sass and coffeescript files; I just want the compiled css and js files. I also want to use the correct production settings file.
Question 2:
Do I put these files ready for deployment into their own version control repository? I guess the advantage is that you can easily roll back changes?
Question 3:
Do I run my tests before gathering the files or before deploying?
Shell scripts could be a solution but maybe there is a better way? I looked at jenkins/hudson but that seems more like a tool that sits on top of the tools that I am looking for.
For questions one and two, I'd recommend using a version control system for this. I'm sure you're already using some sort of version control, so you can just say which branch of your repository you would like to deploy. And yes, this makes rollbacks incredibly easy. Probably the most popular method for Django deployment is to package your files using git, and then deploy these files and run any deployment scripts using fabric.
Using git, packaging the files from your local repository would look something like:
git archive --format=tar HEAD | gzip > my_repo.tar.gz
Alternately, you can first push your changes to a github repository, and then in your deployment script just clone your repository from your production server.
For your third question, if you use this version control method for packaging your files, then just make sure when you are testing you have the deployment branch checked out.
I'll typically use Fabric for deploying most Django projects:
http://docs.fabfile.org/en/1.0.0/?redir
It has a decent api for communicating with remote servers and it's all in Python – bonus!
You don't need to store your concatenated media files in a separate repo. They're only needed for production. In that case I've found libraries like django-mediasync and django-compress to be useful. They both provide template tags/settings that can concatenate and cache your static files for you depending on the DEBUG setting/environments (production vs development).
You can run your tests whenever. Some people will run them as a version control hook to prevent broken code from being checked in or during deployment, stopping the deployment in case of test failure.

same project, multiple customers git workflow

After my first question, I'd like to confirm the best git workflow for my case.
I have a single django project, hosted at github, and different clones, each with its own branch: customerA, customerB, demo... (think websites)
Branches share the same core but have different data and settings (these are in .gitignore)
When I work on the customerA branch, how should I replicate bug fixes to the other deployments?
When I create a new general feature, I create a special branch, then merge it into my master. Then, to deploy on the 'clients', I merge the master branch into the customer branch. Is this the right way? Or should I rebase?
# from customerA branch
git fetch origin
git merge origin/master
Also, I have created a remote branch for each customer so I can back up the customer branches to github.
It looks like a very classic problem, but I guess I don't use git the right way.
Thanks.
Ju.
I would have a single project repo at a well-known place containing a master branch with the common code, and branches for specific deployments (e.g. customer/A, customer/B, demo).
Then I would have checkouts from each of these branches for each customer, for the demo server, and so on. You can let these pull automatically from their respective branch with a commit hook on the single project repo.
Every developer would have their local copy of the project repo, do local work, and then push stuff back to the single project repo.
The challenge will be maintaining the branches that diverge from master and doing the regular merges so the divergence does not grow over time.
I have seen this solution described in much more detail somewhere on the web, but I could not find it again quickly. Some blog post on using git for a staging and production web server, IIRC.
If the three sites share some 'core' code (such as a Django app) you should factor that core out into its own repo and use git submodules to include it in the other projects, rather than duplicating it.
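A sketch of that (the repository URL and path are invented):

git submodule add git@github.com:you/project-core.git core
git commit -m "add shared core as a submodule"
# in each fresh clone afterwards:
git submodule update --init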
I would have a repo called project-master or something like that and a repo for each client. Then, when you have code that needs to be available to those client repos, you pull from project-master into that repo.
Don't separate the projects in branches, separate them into different repositories.
Make the "common" code generic enough so that costumerA's copy of the common code is exactly the same as costumerB's copy of the common code.
Then, you don't have to pull or merge anything. When you update the common code, both costumerA and costumerB will get the update automagically (because they use the same common code).
By "common" code: I'm referring to the package/series-of-apps that power the websites you're developing.
I'm assuming the customerA and customerB repositories would only include things like site-specific settings and templates.
The key here is making the "common" code generic: don't let customerA use a "slightly modified version" of the "common" code.
Also, I'd suggest using a deployment mechanism that doesn't rely on git. git is a great source code management tool, but it's not designed (AFAIK) to be a deployment tool.