Database change history - django

I'm searching for a Django package that would let me save the history of my models, but only on updates. I found django-reversion, but after a quick look it doesn't seem to support this. Database triggers are also not an option, since my database structure is rather complicated and writing them would be hard.

I had the same problem at some point.
My solution at that time was to define two models:
content, which is an abstraction of a piece of content on the website. This model holds the current content.
action, which is an abstraction of a change made to that content.
content has two pointers (OneToOneField) to actions: first edit, the action that created that particular content, and last edit, the action that made the most recent modification to it.
action has a pointer (OneToOneField) to an actor, the user that made the action, and a pointer (ForeignKey) to another action, the previous action on that particular content. It also has a TextField that stores the change itself; this can be, for instance, a diff-like text or simply a pickled dictionary.
In some sense this is equivalent to version control like git, where an "action" is a commit on a specific piece of content (rather than on the whole project) and content is the source. Commits are linked via "previous commit" and record the change and the user.
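A minimal sketch of what those two models could look like (model and field names are my own; note that I use a ForeignKey for the actor, since one user will typically perform many actions):

    from django.conf import settings
    from django.db import models


    class Action(models.Model):
        """A single change to a piece of content, analogous to a commit."""
        actor = models.ForeignKey(settings.AUTH_USER_MODEL, on_delete=models.PROTECT)
        previous = models.ForeignKey(       # the previous action on the same content
            "self", null=True, blank=True, on_delete=models.PROTECT,
            related_name="next_actions",
        )
        change = models.TextField()         # diff-like text or a serialized dict
        created_at = models.DateTimeField(auto_now_add=True)


    class Content(models.Model):
        """The current state of a piece of content on the site."""
        body = models.TextField()
        first_edit = models.OneToOneField(
            Action, on_delete=models.PROTECT, related_name="created_content",
        )
        last_edit = models.OneToOneField(
            Action, on_delete=models.PROTECT, related_name="latest_content",
        )

Walking the history is then just a matter of following previous from last_edit back to first_edit.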

You can use South. It detects changes to your models and creates migration files that let you sync your database forward and backward.
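For reference, the usual South workflow looks something like this (assuming an app named myapp):

    python manage.py schemamigration myapp --initial   # first migration for the app
    python manage.py schemamigration myapp --auto      # record subsequent model changes
    python manage.py migrate myapp                     # apply forward
    python manage.py migrate myapp 0001                # roll back to an earlier migration

Note that South versions your schema, not your row data, so it complements rather than replaces a change-history design like the one above.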

Related

Saving a new row in DynamoDB and then listing all those items with eventual read consistency?

So imagine we have a web frontend & API Gateway/Lambda/DynamoDB backend.
The user navigates to the "Add Project" page, where they type the name of the new Project and click Save; this then navigates to a list of Projects (which should include the one they just added).
Because reads in DynamoDB are eventually consistent by default, it is possible that the user would click Save and then not see their new Project listed on the next page. This could cause confusion, and if they entered a lot of information, a bit of panic.
Is it a good pattern to have the backend accept an additional param to say "strongly consistent read" on the "getProjects" call? Or is there another way to deal with this?
It's actually a good pattern to do a consistent read after you insert or update an item. An example of this can be seen in the DynamoDB docs where CRUD operations are described.
A common pattern I've used in the past when working with web applications is to end any POST/PUT request with a redirect to a GET on which I enable strong consistency. That gives the list strong consistency immediately after inserting. In most cases users will just do nothing after inserting, navigate to a different part of the application, or click to see the details of the item.
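In code this is just a flag passed through to the read. A boto3 sketch (the table and function names are hypothetical):

    import boto3

    projects = boto3.resource("dynamodb").Table("Projects")  # hypothetical table

    def get_projects(consistent=False):
        # Pass consistent=True on the GET that follows the POST/PUT redirect.
        response = projects.scan(ConsistentRead=consistent)
        return response["Items"]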
Let's suppose the user clicks on the item in the list to see the details. Theoretically it might not be propagated yet (although chances are it will be, because DynamoDB replication tends to be very fast). Another pattern I have used in the past for detail pages is to issue an eventually consistent read first; if I get no results, instead of returning "not found" directly to the end user, I retry the read once with strong consistency. If that still returns no results, then I return "not found", but if it was just a propagation issue, you are good to go.
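A sketch of that retry-once pattern, against the same hypothetical table:

    import boto3

    projects = boto3.resource("dynamodb").Table("Projects")  # hypothetical table

    def get_project(project_id):
        # Cheap eventually consistent read first.
        response = projects.get_item(Key={"id": project_id})
        if "Item" not in response:
            # Possibly not propagated yet: retry once with strong consistency.
            response = projects.get_item(Key={"id": project_id}, ConsistentRead=True)
        return response.get("Item")  # None here means genuinely not found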

Beginner Questions about django-viewflow

I am working on my first django-viewflow project, and I have some very basic questions. I have looked at the docs and the cookbook examples.
My question is: which fields go into the "normal" Django models (models.Model) and which go into the Process models? For example, I am building a publishing workflow: a document that is uploaded starts in a private state, goes into a pending state after some processing, and then an editor can update the document's state to published, making the document available through the front-facing web site. I would assume the state field (private, pending, published) is part of a Process model, but what about the other fields related to the document (author, date, source, topic, etc.)? Do they go into the Process model or the models.Model model? Does it matter? What are the considerations in building the models and flows for separating data between the two types of models?
Another example: why, in the Hello World example, is the text field in the Process model and not in a models.Model model? This field does not seem to have anything to do with the process, but I am probably not understanding how viewflow works.
Thanks!
Mark
That's your choice. Viewflow is a library and places no restrictions on how you lay out your data. The only thing that has to exist is the link between process_pk and the process data. HelloWorld is the minimal working sample that demonstrates a workflow.
You can put everything in a separate model and provide an FK to it in the Process model.
But the state field itself is an antipattern, since eventually you may have several tasks executing in parallel, and even a sequential workflow can keep changing as new tasks are added or deleted. You can keep just a published Boolean or DateTime field on the post model and filter on that on the front end.
The general rule could be: keep all human workflow decisions in the Process model, build the data models declaratively, and keep the workflow separate from the actual data.
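A sketch of that separation, using the fields from the question (the class names are mine, and the import path assumes django-viewflow 1.x):

    from django.db import models
    from viewflow.models import Process


    class Document(models.Model):
        # Pure data model: everything that describes the document itself.
        author = models.CharField(max_length=100)
        source = models.CharField(max_length=200)
        topic = models.CharField(max_length=100)
        created = models.DateTimeField(auto_now_add=True)
        # No state field; just a timestamp the front end can filter on.
        published = models.DateTimeField(null=True, blank=True)


    class PublicationProcess(Process):
        # Process model: only the human workflow decisions live here.
        document = models.ForeignKey(Document, on_delete=models.CASCADE)
        approved = models.BooleanField(default=False)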

What strategies can I use in Sitecore to archive items and then restore later via code?

We are building a Sitecore site that will pull in some product data from an external database. On a nightly basis we will query the external database and either add, update, or archive/delete/remove product content items in Sitecore as needed. Our data template has some fields that will be populated directly from the external database (and will be read-only for content authors) and other fields that they will populate themselves. Included in our custom fields will be the SKU of the item from the external database.

It is possible that over time a product could disappear from the external database. In this case we would want Sitecore to somehow remove this item from our list of products, but not completely delete it. The reason for this is that the products that have been removed could reappear in the future and we would not want to lose all of the data that had been added to other custom fields on the item. I can think of a number of different approaches for this:
1. Use the Archiving/Recycling features of Sitecore. When we find a product item in Sitecore that no longer appears in the external database, we could archive it. That works well. However, I can't seem to figure out a way to restore that item later if it reappears in the external database. From what I have read online, I don't have access to any custom fields once an item is archived, so when I come across a SKU in the external database that is not in Sitecore, I have no way of finding out whether there is an archived item with that SKU.
2. Use a custom status field on each product content item. I could set each product content item to "active" or "inactive". This would make it easy to reactivate items that reappear in the external database. However, I worry about things like search and publishing. It seems messy to have some inactive content items in the folder of all products in the master database. It could be confusing to content authors, and I worry that they will find their way into the web database, etc. It seems like I would have to do a lot of custom coding to make sure that those products do not show up on any pages.
3. When a product disappears from the external database, I could move those content items to a different location in Sitecore, then move them back when they reappear. This also feels messy.
I just wonder if there is some better solution that I am missing. Thanks in advance for any help.
I would go with option 2, setting a status field on each product to "Active" or "Inactive", as it's clearer and keeps the data in one place.
An additional thing to do (as suggested by Vasiliy) is to set the "Publishable" checkbox on the product to "False"; this way the product will disappear from the web database, so you need no extra filter in your search methods.
You can implement a custom content editor warning to inform content editors that the current product is "inactive":
Creating Custom Content Editor warnings
Hope this helps
Just a thought: what if you simply unpublished the items that were removed from the external database and set the ones in the authoring database to unpublishable until they reappear? With this scenario, you could also have a task that archives items that have been unpublished and not republished for a given period of time.
The best solution really depends on the number and frequency of items appearing/disappearing, and on the cost/benefit of keeping those items in the authoring database versus deleting them.

sitecore: Serialization and package designer

I have a large amount of data (content created by users, not developers) in Sitecore.
I know that in order to transfer a large amount of data from one environment to another, I need to serialize all the content first.
My question is: after I serialize the content, do I need to create a package in the Package Designer that contains the data I want to move, or do I just use the serialized files?
Serialization is an option, but you could also create a package through the Package designer, download it and install it on the other environment.
If you are installing big packages, it is good practice to set the value of Indexing.UpdateInterval in the web.config to 00:00:00 to prevent the Lucene indexer from starting during the package install, which would result in much longer install times.
You don't need to create a package; use the serialized files and update via the UI as described below.
To update an item from the text file:
In the Content Editor, select the item that you want to update.
On the Developer tab, in the Serialize group, click Update Item.
To update an item with all its subitems from the file system:
In the Content Editor, select the parent item that you want to update with all its subitems.
On the Developer tab, in the Serialize group, click Update Tree.
To update the whole database:
In the Content Editor, select any item.
On the Developer tab, in the Serialize group, click Update Database.
You can also use the "Transfer Item to Another Database" feature.
Go to the Control Panel, then Database, then Transfer Item to Another Database.
This will open a wizard. There you can select the source items (the items you want to transfer to another database), then select the target database and choose where you want the items to be placed in the tree (i.e. under Home or some other node).
For some more information you can go to this blogpost by Sam J. Griffin, which explains it step by step.
One very important side note though: don't copy /sitecore/templates/System if you want to transfer all templates. This will result in circular reference issues. If it's just content that you're copying, it should be fine.
If you have a spare $149 then you should also take a look at the new Sitecore synchronization tool from Hedgehog:
http://www.hhogdev.com/Products/Razl.aspx

DropLink datasource item reference with custom dataprovider in Sitecore

How do I bind a DropLink using a custom dataprovider?
More info:
I am trying to build a product catalogue site using Sitecore. Each product in the Sitecore content tree can have a star rating and a short text review attached to it (which will be linked to a user extended with a profile provider, but that is another question).
I am planning to store the review information in an external database and reference it using a custom dataprovider. I have downloaded the NorthwindDataProvider from Shared Source (here) and have altered it to use a table which contains the rating, the text, and a uniqueidentifier field that stores the ID of the Sitecore product item the review is attached to.
The template field is a droplink and the datasource is set to the products in the catalogue.
When I edit a review from the custom dataprovider in the Sitecore content editor, the droplink states 'Value not in selection list', even if I select one of the populated products and save in Sitecore.
It is saving the ID in the database, but if I look at the raw value, it displays the ID without the curly brackets. The raw values of working droplink fields appear to contain the brackets.
To create a review, I am using a jquery post to a webservice which writes to the database using an external datacontext. Should I be using some Sitecore API to use the custom dataprovider instead?
Any information on using custom dataproviders would be helpful. The documentation I've been able to find states what can be done, but I'm struggling to find actual implementations.
So the first thing is that you have a template field and you're using droplink, which is going to store the GUID of the selected item. I'm not quite clear on whether you're pointing the datasource to a Sitecore item, but that's essential if you're using droplink. Here's what I would suggest instead as the most straightforward way to do this:
Create a template with fields that handle the logic dealing with your catalog items. How you do that is your choice, and Sitecore doesn't care, since it is only going to deal with the item; all it cares about is finding an item. You write the business logic that manipulates the external data.
Once you have a folder that stores your catalog items, you could easily write a script, triggered by the Rules engine or by a Sitecore task that runs regularly, to fetch your catalog items and add, update, or remove the corresponding Sitecore items.
Another option, which is more complex to implement but is a valid approach if you have multiple data sources on your site, is to use an object framework (like the Entity Framework) as a data object layer that allows you to create and populate common objects from any data source.
Hope this is helpful!