I am writing a website using Django, and I need to push it out as soon as possible. I don't need a lot of amazing features right now.
I am concerned about future development, though.
If I enable registration, that means more of the content becomes writable by users. If I don't, then only the admins can publish content. The website isn't exactly a CMS.
This is a big problem, as I will continue to add new features and rewrite code (either by adapting third-party apps, or by rewriting the app itself). So how would either path affect my database contents?
So the bottom line is: how do I keep my data safe as development continues?
I hope someone can offer a little insight on this matter.
Thank you very much. It's hard to describe my concern, really.
Whatever functionality you add later (new fields, etc.), you can still migrate your data to the "new" database.
It becomes more complicated with relationships, because you might have integrity problems. Say you have a Comment model, and say you don't enable registration, so any visitor can comment on certain posts. If you later decide to enable registration, and decide that ALL comments have to be associated with a user, then you will have problems migrating your data, because you'll have lots of comments for which you'll have to make up a user, or that you'll just have to drop. Of course, there are work-arounds in that case, but it illustrates some of the problems you might encounter later.
Personally, I try to have a good data model with only the minimum necessary fields (more fields will come later, with new functionality). I especially try to avoid having to add new foreign keys to already-existing models. For example, it is fine to add a new model later with a foreign key to an existing model, but the opposite is more complicated.
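One cheap way to hedge against exactly the Comment/user problem described above is to make such a foreign key nullable from the start. A sketch (the model and field names here are invented):

    # models.py -- a minimal sketch; Post and the field names are made up.
    from django.conf import settings
    from django.db import models

    class Comment(models.Model):
        post = models.ForeignKey('Post', on_delete=models.CASCADE)
        body = models.TextField()
        # Nullable from day one: existing anonymous comments stay valid
        # if registration is enabled later and new comments get a user.
        user = models.ForeignKey(settings.AUTH_USER_MODEL, null=True,
                                 blank=True, on_delete=models.SET_NULL)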
Finally, I am not sure why you hesitate to enable registration. It is actually very simple to do: you can, for example, use django-registration, and you would just have to write a urlconf and some templates, and that's all.
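For illustration, the urlconf part might be as short as this (the exact include path varies between django-registration versions, so treat it as a sketch and check the docs for your installed version):

    # urls.py -- illustrative only; the include path depends on your
    # django-registration version.
    from django.conf.urls import include, url

    urlpatterns = [
        url(r'^accounts/', include('registration.backends.default.urls')),
    ]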
Hope this helps!
If you are afraid of data migration, just use South...
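For reference, the usual South workflow went roughly like this ("myapp" is a placeholder; note that South predates the built-in migrations that replaced it in Django 1.7):

    ./manage.py convert_to_south myapp        # one-time setup for an existing app
    ./manage.py schemamigration myapp --auto  # generate a migration after model changes
    ./manage.py migrate myapp                 # apply it to the database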
I am trying to write a web application that displays the content of a database as a website, e.g. in the form of a table, and lets the user update the table entries, which should automatically be reflected in the database, so that on page reload, the table looks exactly as the user left it.
Since my web development skills are fairly outdated, I wanted to take this as an excuse to try some new stuff. I know my way around SQL and Python quite well, so I thought Django would be a good choice. I don't have a lot of experience in Javascript however. I already worked through the tutorial, which covers classic HTML forms where you enter a bunch of data and then hit "submit" to push to the database.
What I would really prefer, though, is to have the whole table freely editable and to immediately save any change to the database (e.g. whenever I click a checkbox or focus out of a text box). As a second option, I thought about having a single "save" button for the whole page (which may easily be several screens in size).
Now, for the first option, I assume I will likely have to use Javascript and Ajax techniques, which I am not comfortable with yet, so writing larger pieces of Javascript code is not something I am keen on at the moment.
For the second option, I would probably have my whole table be a huge, single form with a single submit button. I am a bit wary about this as it does not seem very robust to me.
So what my question boils down to is: are there ways to accomplish what I want in a robust and easy way, without reinventing the wheel? From my understanding, Django does not cover the final rendering into HTML by itself; it only provides the data, so I would assume I need some third-party technology to handle that part?
Yes: for your second idea, submitting the whole table at once, Django has a thing called a ModelFormSet, where you define a form that is repeated for each row in the table (or for whatever set of records you select). There are a fair number of basic things you'll need to understand to do it, e.g. how to create a Django view, how to set up a URL, how to write templates. But you say you want to learn Django, so it's a good exercise. The Django documentation has a good tutorial that leads you through developing a basic working app, and from there it's not much further to what you're seeking.
Here's the part of the Django documentation that discusses ModelFormSets:
https://docs.djangoproject.com/en/3.0/topics/forms/modelforms/#model-formsets
BTW, Django detects which rows have changed, so it won't write every row every time, even though you've submitted them all.
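A minimal sketch of such a view (the Entry model, its fields, and the template name are all invented):

    # views.py -- sketch; "Entry", its fields, and the template are made up.
    from django.forms import modelformset_factory
    from django.shortcuts import redirect, render

    from .models import Entry

    def edit_table(request):
        # One form per Entry row; only the listed fields are editable.
        EntryFormSet = modelformset_factory(Entry, fields=('name', 'done'))
        if request.method == 'POST':
            formset = EntryFormSet(request.POST)
            if formset.is_valid():
                formset.save()  # Django skips rows whose data did not change
                return redirect('edit_table')  # assumes a URL named 'edit_table'
        else:
            formset = EntryFormSet()
        return render(request, 'edit_table.html', {'formset': formset})

The template then loops over the formset and renders one submit button for the whole thing.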
I was wondering about best practices in Django for validating table contents.
I am creating a Sales Order app; a SO should check the availability of the items I have in stock, and if they are not in stock it should trigger manufacturing orders and purchase orders.
I don't want to make a very complex view, so I am looking for a way to decouple the logic from it; I also anticipate performance issues.
What are the best practices, or ready-made solutions in the Django framework, for addressing view complexity?
I see different possibilities, but I am wondering which would be the best fit in my case:
- managers
- celery (but it just runs jobs occasionally; I want the app to be real time, so I don't like this option)
- using signals (pre_save/post_save)
- model validation
- creating an extra layer, like a services.py file
Since I am new to Django, I am a bit puzzled about which route to take.
Not sure if this is the answer you are looking for.
Signals are for doing things automatically when events happen. Most commonly used to do things before and after model operations. So if you need to do something every time you save a record or every time you create a new record or delete that is where you use signals.
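For example (SalesOrder here is an invented stand-in for your own model, and the helper is assumed to live in a services module like the one discussed below):

    # signals.py -- sketch only; the model and helper names are invented.
    from django.db.models.signals import post_save
    from django.dispatch import receiver

    from .models import SalesOrder            # invented model
    from .services import check_availability  # hypothetical helper

    @receiver(post_save, sender=SalesOrder)
    def on_sales_order_saved(sender, instance, created, **kwargs):
        # Runs synchronously after every SalesOrder save.
        if created:
            check_availability(instance)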
Managers are used to manage record retrieval and manipulations. If you want to do some clever way of retrieving data you can define a custom manager and add some custom methods to it. If you want to override some default behaviors of querysets you would also do it with a custom manager.
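A sketch of that (names invented):

    # models.py -- sketch; the status field and values are made up.
    from django.db import models

    class SalesOrderManager(models.Manager):
        def awaiting_stock(self):
            # Query logic lives on the manager instead of in views.
            return self.filter(status='awaiting_stock')

    class SalesOrder(models.Model):
        status = models.CharField(max_length=30, default='new')
        objects = SalesOrderManager()

    # Usage: SalesOrder.objects.awaiting_stock()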
Celery is for running things asynchronously. If you are worried that some processing might take a long time, that is where you might consider offloading it to celery. A friendly warning though: doing things asynchronously raises the complexity of your code quite a bit, since you need some mechanism to pass data back from celery tasks into your Django app and to your users.
The services.py link that you posted seems to do what you want; it just provides a place to put logic that is not specific to a particular view.
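Continuing the invented example from above, such a module might look like this (every model and field name is made up):

    # services.py -- sketch of the extra layer; names are invented.
    from .models import ManufacturingOrder, PurchaseOrder

    def check_availability(order):
        """Raise follow-up orders for any lines short on stock."""
        for line in order.lines.all():
            shortfall = line.quantity - line.item.stock_level
            if shortfall <= 0:
                continue
            if line.item.is_manufactured:
                ManufacturingOrder.objects.create(item=line.item,
                                                  quantity=shortfall)
            else:
                PurchaseOrder.objects.create(item=line.item,
                                             quantity=shortfall)

The view (or signal handler) then just calls check_availability(order) and stays thin.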
Here on Stack Overflow, I got advice from some experienced developers that premature optimization is the root of all evil.
What I suggest is: keep it simple. Making the view a little more complex is actually better than adding one more layer of complexity. I would suggest you try to put most of your logic in models, and whatever remains after that in views.
Also, unnecessarily pulling in multiple packages will not solve much of your problem, so use them only when necessary. Otherwise, try to write the minimal logic yourself so that you do not have to depend on many apps.
Signals and the other things, whatever everybody says, are not as great as they may seem. Just try to make things simpler.
One more point from my side: as you are just starting out, go through class-based views and try to use them once you are familiar with them. That will simplify your views the most. Plus, if you are new to Django, read a little code; https://github.com/vitorfs/bootcamp might help you get started.
From the documentation: "Django’s comment framework has been deprecated and is no longer supported. Most users will be better served with a custom solution, or a hosted product like Disqus. The code formerly known as django.contrib.comments is still available in an external repository."
Is the move to django-contrib-comments only a fallback for existing projects that use django.contrib.comments? Should I use django-contrib-comments in new projects and why (not)?
I have been developing comments for our site using django.contrib.comments and found it to be quite a simple module, nothing more. If you are building "just" a commenting app to engage people, Disqus might be a nice option. If, on the other hand, you are building something like what Stack Overflow does, you need to do it yourself.
For that, you can pretty well use django.contrib.comments and build the rest of your code on top of it. I have been doing this, and the following are the points I would note:
- There is a very good chance you will end up rewriting all the views, for Ajax support or other custom behaviour.
- The app does not authenticate users, so you might need to tweak that too.
- You will add some special fields to comments, and remove some of the provided ones.
- You might want to let users delete comments; the built-in delete just sets a flag marking the comment "deleted", it does not actually delete it.
- Regarding administration of comments, there is a lot you will probably want to improve.
It goes on: when you start doing it, you keep tweaking almost everything to make it fit your site. If your tweaks start to look too big, I guess it is better to start from scratch, or take only the parts of django.contrib.comments you need.
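On the third point (adding and removing fields), both the old framework and its external successor let you swap in your own model and form via the COMMENTS_APP setting. Roughly, as a sketch with invented app and field names (the exact base-class import path depends on which package and version you use):

    # settings.py
    COMMENTS_APP = 'mycomments'   # your custom app's name (invented here)

    # mycomments/models.py -- with the external package the base class is
    # django_comments.models.Comment; on old Django it was
    # django.contrib.comments.models.Comment.
    from django.db import models
    from django_comments.models import Comment

    class CommentWithTitle(Comment):
        title = models.CharField(max_length=300)

    # mycomments/__init__.py -- the framework looks this function up.
    def get_model():
        from mycomments.models import CommentWithTitle
        return CommentWithTitle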
The django-developers Google Group has the proposal:
"... if you don't really care much about how comments work but just want something easy, then Disqus (and its competitors) are easier to use and have much better features (spam prevents, moderation, etc.). If you want something complex and specific, on the other hand, you're better off writing something from scratch."
And django-contrib-comments (the new home) is intended as a boneyard.
I am currently faced with the task of importing around 200K items from a custom CMS implementation into Sitecore. I have created a simple import page which connects to an external SQL database using Entity Framework and I have created all the required data templates.
During a test import of about 5K items, I realized that I needed to make the import run a lot faster, so I set about finding information on optimizing Sitecore for this purpose. I have concluded that there is not much specific information out there, so I'd like to share what I've found and open the floor for others to contribute further optimizations. My aim is to create a kind of maintenance mode for Sitecore that can be used when importing large volumes of data.
The most useful information I found was in Mark Cassidy's blog post http://intothecore.cassidy.dk/2009/04/migrating-data-into-sitecore.html. At the bottom of that post he provides a few tips for when you are running an import:
- If migrating large quantities of data, try to disable as many Sitecore event handlers, and whatever else, as you can get away with.
- Use BulkUpdateContext()
- Don't forget your target language
- If you can, make the fields shared and unversioned. This should help migration execution speed.
The first thing I noticed on this list was the BulkUpdateContext class, as I had never heard of it. I quickly understood why: a search on the SDN forum and in the PDF documentation returned no hits. So imagine my surprise when I actually tested it out and found that it improves item creation/deletion speed at least tenfold!
The next thing I looked at was the first point, where he basically suggests creating a version of web.config that has only the bare essentials needed to perform the import. So far I have removed all events related to creating, saving and deleting items and versions. I have also removed the history engine and system index declarations from the master database element in web.config, as well as any custom events, schedules and search configurations. I expect there are plenty of other things I could remove or disable to increase performance. Pipelines? Schedules?
What optimization tips do you have?
Incidentally, BulkUpdateContext() is a very misleading name, as it really improves item creation speed, not item update speed. But as you also point out, it improves your import speed massively :-)
Since I wrote that post, I've added a few new things to my normal routines when doing imports.
- Regularly shrink your databases. They tend to grow large and bulky. To do this, first go to Sitecore Control Panel -> Database and select "Clean Up Database"; after this, do a regular ShrinkDB on your SQL server.
- Disable indexes, especially if importing into the "master" database. For reference, see http://intothecore.cassidy.dk/2010/09/disabling-lucene-indexes.html
- Try not to import into "master", however; you will usually find that imports into "web" are a lot faster, mostly because that database isn't (by default) connected to the HistoryManager or other gadgets.
And if you're really adventurous, there's one thing you could try that I'd been considering trying out myself but never got around to. It might work, but I can't guarantee that it will :-)
- Try removing all your field types from App_Config/FieldTypes.config. The theory here is that this should essentially disable all of Sitecore's special handling of the content of these fields (like updating the LinkDatabase and so on). You would need to manually trigger a rebuild of the LinkDatabase when done with the import, but that's a relatively small price to pay.
Hope this helps a bit :-)
I'm guessing you've already hit this, but putting the code inside a SecurityDisabler() block may speed things up also.
I'd be a lot more worried about how Sitecore performs with this much data... assuming you only do the import once, who cares how long that process takes. Is this going to be a regular occurrence?
I work for a university, and in the past year we finally broke away from our static HTML site of several thousand pages and moved to a Drupal site. This obviously entails massive amounts of data entry.
What if you're already using a CMS and are switching to another one that better suits your needs? How do you minimize the mountain of data entry during such a huge change? Are there tools built for this, or some best practices one should follow?
The Migrate module for Drupal would provide a big help. The Economist.com data migration to Drupal will give you an overview of the process.
The video from the Migration: not just for the birds presentation at Drupalcon DC 2009 is probably somewhat out-of-date, but also gives a good introduction.
Expect to have to both pre-process and post-process your data manually, whatever happens. Accept early on that your data is likely to be in a worse state than you think it is: fields will be misused; record-to-record references (foreign keys) might not be implemented properly, or at all; content is likely to need weeding and occasionally to be just bad or incorrect.
Check your database encoding. Older databases won't be in Unicode encodings, and get grumpy if you have to export data dumps and import them elsewhere. Even then, assume that there'll be some wacky nonprintable characters in your data: programs like Word seem to somehow inject them everywhere, and I've seen... codepoints... you people wouldn't believe. Consider sweeping your data before you even start (or even sweeping a database dump) for these characters. Decide whether or not to junk them or try to convert them in the case of e.g. Word "smart" punctuation characters.
It's very difficult to create explicit data structures from implied ones. If your incoming data has a separate date field, you can map that to a date field; if it has a date as part of a big lump of HTML, even if that date is in a tag with an id attribute, simple scripting won't work. You could use offline scripting with BeautifulSoup or (if your HTML's a bit nicer) the faster lxml to pre-process your data set, extract those implicit fields, and save them into an explicit format. Consider creating an intermediate database where these revisions are going to go.
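A pre-processing pass along those lines might look something like this (the tag id is invented; real markup is rarely this tidy):

    # Sketch of extracting an implicit field from stored HTML.
    from bs4 import BeautifulSoup

    def extract_date(html):
        soup = BeautifulSoup(html, 'html.parser')
        tag = soup.find(id='pubdate')   # invented id attribute
        return tag.get_text(strip=True) if tag else None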
The Migrate module is excellent, but to get really good data fidelity and play more clever tricks you might need to learn about its hook system (Drupal's terminology for functions following a particular naming scheme) and the basics of writing a module to put these hooks in (a module is broadly just a PHP file where all the functions begin with the same text, the name of the module file.)
All imported content should be flagged for at least a cursory check. You can do this by importing it with status=0 i.e. unpublished, and then create a view with the Views module to go through the content and open it in other tabs for checking. Views Bulk Operations lets you have a set of checkboxes alongside your view items, so you could approve many nodes at once.
Expect to run and re-run and re-run the import, fixing new things every time. Check ten, or twenty items, as early as possible. If there are any problems, check ten or twenty more. Fix and repeat the import.
Gauge how long a single import run is likely to take. Be pessimistic: we had an import we expected to take ten hours encounter exponential slowdown when we introduced the full data set; until we finally fixed some slow queries, it was projected to take two weeks.
If in doubt, or if you think the technical aspects of the above are just going to take more time than the work itself, then just hire temps to do the data. But you still need decent quality controls, as early as possible during their work. Drupal developers are also for hire: try your country's relevant IRC channel, or post a note in a relevant groups.drupal.org group. They're more expensive than temps but they usually write better PHP...! Consider hiring an agency too: that's a shameless plug, as I work for one, but sometimes it's best to get experts in for these specific jobs.
Really good imports are always hard, harder than you expect. Don't let it get you down!
Migrate + Table Wizard (and Schema + Views) is the way to go. With Table Wizard you can expose any table to Drupal and map its fields accordingly using Migrate.
Look here for a detailed walkthrough:
http://www.lullabot.com/articles/drupal-data-imports-migrate-and-table-wizard
You'll want access to the existing data from Django. This helps me a lot with migrating: http://docs.djangoproject.com/en/1.2/howto/legacy-databases/ . With correct model definitions you'll have the full power of Django, including the admin. In fact, I'm using Django purely as an admin backend for several legacy PHP projects; Django's admin can easily outdo a lot of custom hand-written admin scripts.
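The gist of that howto: point a DATABASES entry at the Drupal database, run manage.py inspectdb to generate model stubs, and clean them up. The output looks roughly like this (trimmed; the fields shown follow Drupal's node table, but treat the details as illustrative):

    # Roughly what "manage.py inspectdb" emits for an existing table.
    from django.db import models

    class Node(models.Model):
        nid = models.IntegerField(primary_key=True)
        title = models.CharField(max_length=255)
        status = models.IntegerField()

        class Meta:
            managed = False      # Django will not create or alter this table
            db_table = 'node'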
Authentication should remain the same: users should be able to log in with their existing credentials. But it is hard to write a migration script for auth data, because the password-hashing schemes may differ and there is no way to convert between them without knowing the plain-text passwords. Django provides a way to support different sources of auth, though, so you can write a Drupal auth backend: http://docs.djangoproject.com/en/1.2/topics/auth/#writing-an-authentication-backend
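A skeleton of such a backend (the hash check is the hard part; check_drupal_password below is a hypothetical stand-in for whatever verifies Drupal's scheme, md5-based in Drupal 6, phpass-style in Drupal 7):

    # auth_backends.py -- sketch; check_drupal_password is hypothetical
    # and must implement your Drupal version's hash scheme. The
    # authenticate() signature matches the Django 1.x era linked above.
    from django.contrib.auth.models import User

    class DrupalBackend(object):
        def authenticate(self, username=None, password=None):
            if not check_drupal_password(username, password):  # hypothetical
                return None
            user, _ = User.objects.get_or_create(username=username)
            return user

        def get_user(self, user_id):
            try:
                return User.objects.get(pk=user_id)
            except User.DoesNotExist:
                return None

You would then list it in the AUTHENTICATION_BACKENDS setting so Django tries it when users log in.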
There is no need to do a full rewrite. If some parts are working fine, they can still be powered by Drupal. New code can be written using Django with the same UI. Routing between the old and new parts can be handled by web-server URL rewriting, and both the Django and Drupal parts can be powered by the same DB.