I'm working on a pretty complex Django project (50+ models) with some complicated logic (lots of different workflows, views, signals, APIs, background tasks, etc.). Let's call this project-base. It currently uses Django 1.6 + South migrations and quite a few other 3rd-party apps.
Now, one of the requirements is to create a fork of this project that will add some fields/models here and there and some extra logic on top of that. Let's call this project-fork. Most of the extra work will be on top of the existing models, but there will also be a few new ones.
As project-base continues to be developed, we want these features to also get into project-fork (much like a rebase/merge in git-land). The extra changes in project-fork will not be merged back into project-base.
What could be the best possible way to accomplish this? Here are some of my ideas:
Use South merges in project-fork to keep it up-to-date with the latest changes from project-base, as explained here. Use signals and any other means necessary to keep the new logic in project-fork as loosely coupled as possible and avoid potential conflicts.
Do not modify ANY of the original project-base models and instead create new models in different apps that reference the old models (e.g. using a OneToOneField; see the sketch after this list). Extra logic could end up in the old and/or new apps.
your idea here please :)
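To illustrate option 2, a fork-side app could attach its extra fields to a base model without touching it. A minimal sketch with made-up names (Customer is assumed to live in a project-base app; the rest is hypothetical):

from django.db import models
from customers.models import Customer  # existing project-base model (hypothetical app name)

class CustomerExtension(models.Model):
    # one extension row per base Customer; the base table and its migrations stay untouched
    customer = models.OneToOneField(Customer, related_name='extension')
    loyalty_points = models.IntegerField(default=0)  # example fork-only field

Fork code then reads customer.extension.loyalty_points, and the fork's migrations live entirely in its own app.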
I would go with option 1 as it seems less complicated as a whole, though it might carry a greater risk. Here's how I would see it happening:
Migrations on project-base:
0001_project_base_one
0002_project_base_two
0003_project_base_three
Migrations on project-fork:
0001_project_base_one
0002_project_fork_one
After merge, the migrations would look like this:
0001_project_base_one
0002_project_base_two
0002_project_fork_one
0003_project_base_three
0004_project_fork_merge_noop (added to merge in changes from both projects)
Are there any pitfalls using this approach? Is there a better way?
Thank you for your time.
Official South workflow:
The official South recommendation is to try the --merge flag: http://south.readthedocs.org/en/latest/tutorial/part5.html#team-workflow
This obviously won't work in all cases, though in my experience it works in most.
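In practice this means letting South apply the out-of-order migrations and record the merged history, roughly like this (app name made up):

./manage.py migrate myapp --merge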
Pitfalls:
Multiple changes to the same model can still break
Duplicate changes can break things
The "better" way is usually to avoid simultaneous changes to the same models, the easiest way to do this is by reducing the error-window as much as possible.
My personal workflows in these cases:
With small forks where the model changes are obvious from the beginning:
Discuss which model changes will need to be done for the fork
Apply those changes to both/all branches as fast as possible to avoid conflicts
Work on the fork...
Merge the fork, which produces no new migrations
With large forks where the changes are not always obvious and/or will change again:
Do the normal fork and development stuff while trying to stay up to date with the latest master/develop branch as much as possible
Before merging back, throw away all schemamigrations in the fork
Merge all changes from the master/develop
Recreate all needed schema changes
Merge to develop/master
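As a rough sketch in commands (the app name and migration filenames are made up, and this assumes the fork's own schemamigrations are the ones numbered 0002 and up):

# on the fork branch
rm myapp/migrations/0002_fork_*.py        # throw away the fork's schemamigrations
git merge develop                         # pull in the latest migrations from develop
./manage.py schemamigration myapp --auto  # recreate the needed schema changes
./manage.py migrate myapp                 # apply and test locally before merging back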
Related
I see some similar questions, but they don't appear to be the same or have answers.
I am practicing with Django and trying to make a simple dutch auction project. Initially I thought that the idea would be to create two distinct apps, a buyer app and a seller app, and just have them share databases (or three apps: a commonApp, a buyerApp, and a sellerApp). However, the more I dig into this, the more complicated it seems - I feel like Django isn't really meant to have different apps that are designed around sharing all of their data from one set of tables (maybe I'm wrong?), loosely based on what I've found about having to modify the way migrations work to accommodate this.
So idea #2, just make one app that separates out the functionality by carefully managing the views, but keeping just one set of models since pretty much all of the data I can think of (the users, the products, etc.) are shared anyway. This seems like it has the advantage of letting Django do all of the data management without my having to sweat about the database design. However, I worry that maybe managing the views will get to be overly complicated.
Maybe there is an idea #3 that makes sense for this sort of project, one that I haven't considered because I am a newb, maybe one that tells me that Django isn't even the right tool for this job...
I tried programming idea #1 and it quickly became spaghetti and only worked when things were very small. I am currently working on idea #2 and so far I think it's going OK, but I'm having trouble conceptualizing how to separate stuff in views, but this could very well just be my lack of experience.
So my question is: is there an obvious resource for this sort of information that I'm missing? If so, could you please point me that way?
Inside your Django project:
python manage.py startapp sellers
python manage.py startapp buyers
python manage.py startapp common
Add these three apps to settings.py. Depending on your Django version, the entries can be just 'sellers', 'buyers', 'common', or 'sellers.apps.SellersConfig' and so on.
Write your models in common/models.py, and any other logic related to both apps.
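For example, common/models.py could hold something like this (the models are made up for a dutch auction):

from django.contrib.auth.models import User
from django.db import models

class Product(models.Model):
    seller = models.ForeignKey(User, on_delete=models.CASCADE, related_name='products')
    name = models.CharField(max_length=100)
    start_price = models.DecimalField(max_digits=10, decimal_places=2)

class Bid(models.Model):
    buyer = models.ForeignKey(User, on_delete=models.CASCADE, related_name='bids')
    product = models.ForeignKey(Product, on_delete=models.CASCADE, related_name='bids')
    amount = models.DecimalField(max_digits=10, decimal_places=2)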
Then, in your sellers or buyers views:
from common.models import Product, Bid  # the particular models you need, or: from common.models import *
Hope this helps.
I was wondering about best practices in Django for validating table content.
I am creating Sales Orders, and my SO should check the availability of the items I have in stock; if they are not in stock, it should trigger manufacturing orders and purchase orders.
I don't want to make a very complex view, so I am looking for a way to decouple the logic from it; I also predict performance issues.
What are the best practices or ready-made solutions I can use in the Django framework to address view complexity?
I see different possibilities, but I am wondering what would be the best fit in my case:
managers
celery - but that is just for running jobs occasionally; I want the app to be real time, so I don't like this option
signals (pre_save/post_save)
model validation
creating an extra layer like a services.py file
Since I am new to Django, I am a bit puzzled about which route to take.
Not sure if this is the answer you are looking for.
Signals are for doing things automatically when events happen. They are most commonly used to do things before and after model operations. So if you need to do something every time you save, create, or delete a record, that is where you use signals.
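For instance, a post_save receiver could react whenever an order line is created (SalesOrderLine and its fields are hypothetical):

from django.db.models.signals import post_save
from django.dispatch import receiver

from .models import SalesOrderLine  # hypothetical model

@receiver(post_save, sender=SalesOrderLine)
def check_stock(sender, instance, created, **kwargs):
    # runs automatically after every save of a SalesOrderLine
    if created and instance.product.stock < instance.quantity:
        pass  # e.g. trigger a manufacturing or purchase order here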
Managers are used to manage record retrieval and manipulation. If you want some clever way of retrieving data, you can define a custom manager and add custom methods to it. If you want to override some default behaviors of querysets, you would also do that with a custom manager.
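A small sketch of a custom manager (all names are illustrative):

from django.db import models

class SalesOrderManager(models.Manager):
    def pending(self):
        # clever retrieval logic lives on the manager, not in views
        return self.filter(status='pending')

class SalesOrder(models.Model):
    status = models.CharField(max_length=20, default='pending')
    objects = SalesOrderManager()

# usage: SalesOrder.objects.pending()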
Celery is for running things asynchronously. If you are worried that some processing you are doing might take a long time, that is where you might consider offloading things to celery. A friendly warning though: doing things asynchronously raises the complexity of your code quite a bit, since you need to add some mechanism to pass the data back from celery tasks into your Django app and to your users.
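A minimal sketch of such a task, assuming celery is already wired into the project (the task name and its job are made up):

from celery import shared_task

@shared_task
def reorder_missing_stock(order_id):
    # heavy work runs outside the request/response cycle; results must be
    # written back (e.g. to the database) for the app and users to see them
    pass

# in a view or signal handler:
# reorder_missing_stock.delay(order.id)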
The services.py link that you posted seems to do what you want; it just provides a place where you can put logic that is not specific to a particular view.
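As a sketch of that layer for the question's use case (all model names and fields are hypothetical):

# services.py - plain functions, nothing view- or request-specific
from .models import ManufacturingOrder, PurchaseOrder  # hypothetical models

def ensure_availability(product, quantity):
    """Check stock and trigger follow-up orders for any shortfall."""
    shortfall = quantity - product.stock
    if shortfall <= 0:
        return
    if product.manufactured_in_house:  # hypothetical flag
        ManufacturingOrder.objects.create(product=product, quantity=shortfall)
    else:
        PurchaseOrder.objects.create(product=product, quantity=shortfall)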
Here on Stack Overflow, I got advice from some experienced developers that premature optimization is the root of all evil.
What I suggest is: keep it simple. Making the view a little more complex is actually better than effectively adding one more layer of complexity. I would suggest that you try to put most of your logic in models, and whatever remains after that in views.
Also, unnecessarily using multiple packages will not solve much of your problem, so use them only when necessary. Otherwise, try to write the minimal logic yourself so that you do not have to depend on many apps.
Signals and the other options, as everybody says, are not as great as they may seem, however promising. Just try to keep things simple.
One more point from my side: as you are just starting out, go through class-based views and try to use them once you are familiar with them. That will simplify your views the most. Plus, if you are new to Django, read a little code; https://github.com/vitorfs/bootcamp might help you get started.
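To give an idea of how much boilerplate class-based views remove, a list page reduces to a few lines (the model name is illustrative):

from django.views.generic import ListView

from .models import SalesOrder  # hypothetical model

class SalesOrderList(ListView):
    # Django handles the query, template context, and pagination
    model = SalesOrder
    paginate_by = 25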
I have a data-centered application with SQL Server. The environments in which it'll be deployed are not under our control and there's no DBA there (they are all small businesses), so we need the process of distributing each application/database update to be as automatic as possible.
Besides the normal changes between versions of an application (sometimes unpredictable), we already know that we'll need to distribute some new seed data with each version. Sometimes this seed data will be related to other data in our system. For instance: maybe we'll need to insert 2 new rows of some master data during the v2-v3 update process, and another 5 rows during the v5-v6 update process.
EF
We have checked Entity Framework DB Migrations (available for existing databases without Code-First since the 4.3.1 release), which represent the traditional sequential scripts in a more automatic and controlled way (like Fluent Migrations).
SSDT
On the other hand, with a different philosophy, we have checked SSDT and its dacpacs, snapshots and pre- and post-deployment scripts.
The questions are:
Which of these technologies / philosophies is more appropriate for the case described?
Any other technology / philosophy that could be used?
Any other advice?
Thanks in advance.
That's an interesting question. Here at Red Gate we're hoping to tackle this issue later this year, as we have many customers asking about how we might provide a simple deployment package. We do have SQL Packager, which essentially wraps a SQL script into an exe.
I would say that dacpacs are designed to cover the use case you describe. However, as far as I understand, they work by generating a deployment script dynamically when applied to the target. The drawback is that you won't have the warm fuzzy feeling that you might get when deploying a pre-tested SQL script.
I've not tried updating data with dacpacs before, so I'd be interested to know how well this works. As far as I recall, it truncates the target tables and repopulates them.
I have no experience with EF migrations so I'd be curious to read any answers on this topic.
We'll probably adopt a hybrid solution. We'd rather not give up the idea of deployment packages, but on the other hand, due to our applications' nature (small businesses as end users, no DBA, no obligation to upgrade, so multiple "live" database versions coexisting), we can't give up full control of the migration process either, including schema and data. In our case, pre- and post-deployment scripts may not be enough (or at least not comfortable enough) for a full migration the way EF Migrations are. Changes like adding/removing seed data, changing a "one to many" to a "many to many" relationship, or even radical database schema changes (and, consequently, data migrations to the new schema from any previously released schema) may be part of our daily work once our first version is released.
So we'll probably use EF Migrations, with their "Up" and "Down" system for each version release. In principle, each "Up" will invoke a dacpac with the latest database snapshot (and each "Down", the previous one), each with its own deployment parameters for that specific migration. EF Migrations will handle the versioning line, and maybe also some complex parts of the data migration.
We feel more secure with this hybrid approach. We missed automation and schema-change detection in Entity Framework Migrations as much as we missed versioning control in the dacpac approach.
I'm a newcomer to Django.
However, I'm a little confused about Django's directory structure. For instance, I have to write so many models.py files in different places, which, presumably, will make the project difficult to maintain in the future. I want to make it more like a real MVC structure, with all model files in a models directory.
Would it be possible to do that while using South, which seems to only look at models.py, or should I consider different migration tools?
Django organizes code into 'apps', which is why you have separate models.py files, and I don't think there's a way to put them all in one directory instead, since each app gets its own Python package.
However, the way I normally structure my code is to have one app (or a few, if it's a larger project) for all my code, since you can have as many models in a single models.py file as you want.
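That is, a single app's models.py can carry the whole domain; for example (illustrative models):

# myapp/models.py - one file, several models
from django.db import models

class Author(models.Model):
    name = models.CharField(max_length=100)

class Book(models.Model):
    author = models.ForeignKey(Author)  # pre-Django-1.7 style, as the South context implies
    title = models.CharField(max_length=200)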
I don't think South will help you with that, but it will make it a lot easier to manage your migrations, so I would highly recommend it.
I don't think there is a provision in Django to put all models in one place, and it is also a bad idea: each app has its own DB schema, and keeping apps independent is necessary for reusability. It's better to keep the models isolated from each other, attached to their app, as this helps reusability.
South does not address this; it just keeps track of your DB migrations and fixtures.
At one point or another, South comes into the picture, no matter how perfectly the DB schema is designed.
I've been using South but I really hate having to manually migrate data all over again even if I make one small itty-bitty update to a class. If I'm not using Django I can easily just alter the table schema, make an adjustment in a class, and I'm good.
I know that most people would probably tell me to properly think out the schema way in advance, but realistically speaking there are times when you need to make changes immediately, and I don't think using South is ideal for this.
Is there some sort of advanced method people use, perhaps even modifying the core of Django itself? Or is there something about South that I'm just not grokking?
I really hate having to manually migrate data all over again even if I make one small itty-bitty update to a class.
Can you specify what kind of updates? If you mean adding new fields or editing existing ones, then obviously yes. If you mean modifying methods that operate on fields, then there is no need to migrate.
I know that most people would probably tell me to properly think out the schema way in advance
It would certainly help to think it over a couple of times. Experience helps too. But obviously you cannot foresee everything.
but realistically speaking there are times when you need to make changes immediately, and I don't think using South is ideal for this.
Honestly, I am not convinced by this argument. If changes can be deployed "immediately" using SQL, then I'd argue that they can be deployed using South as well, especially if you have automated your deployment using Fabric or the like.
Also I find it hard to believe that the time taken to execute a migration using a generated script can be significantly greater than the time taken to first write the appropriate SQL and then execute it. At least this has not been the case in my experience.
The one exception could be a situation where the ORM doesn't readily have an equivalent for the SQL. In that case you can still execute the raw SQL through your (South) migration script.
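For example, a South schemamigration can carry hand-written SQL in its forwards/backwards methods; a minimal sketch (the table and index names are made up):

from south.db import db
from south.v2 import SchemaMigration

class Migration(SchemaMigration):

    def forwards(self, orm):
        # raw SQL for anything the ORM/South API can't express
        db.execute("CREATE INDEX myapp_order_lower_ref ON myapp_order (lower(reference))")

    def backwards(self, orm):
        db.execute("DROP INDEX myapp_order_lower_ref")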
Or is there something about South that I'm just not grokking?
I suspect that you are not grokking the idea of having orderly, version-controlled, reversible migrations. SQL-only migrations are not always designed to be reversible (I know there are exceptions). And they are not orderly unless the developers take particular care to keep them so. I've even seen people fire potentially troublesome updates at production without pausing to start a transaction first, and then discard the SQL without making a record of it.
I'm not questioning your skills or attention to detail here; I'm just pointing out what I think is your disconnect with South.
Hope this helps.