Search engine solution for Django that actually works? - django

The story so far:
Decided to go with Xapian as search backend because it has all search-engine features I was looking for, knows about Unicode, stemming, has few dependencies and requires no bloated app-server installation on top of it.
Tried Django and Haystack (plus xapian-haystack, the backend glue code to tie Haystack to Xapian) because it was advertised on quite a few blogs as "working". Did not work. Neither the django-haystack nor the xapian-haystack project provides a version combination that actually works together. MASTER from both projects yields an error from Xapian, so it's not stable at all. Haystack 1.0.1 and xapian-haystack 1.0.x/1.1.0 are not API-compatible. Plus, in a minimally working installation of Haystack 1.0.1 and xapian-haystack MASTER, any complex query yields zero results due to errors in either django-haystack or xapian-haystack (I double-verified this), maybe because the unit tests only cover very simple cases and no edge cases at all.
Tried Djapian. The source code is riddled with spelling errors (mind you, in variable names, not comments), and the documentation is riddled with ambiguities and outdated information that will never lead to a working installation. Not surprisingly, users rarely ask for features, but rather how to get it working in the first place.
Next on the plate: exploring Solr (installing a Java environment plus Tomcat gives me headaches, the machine is RAM- and CPU-constrained), or Lucene (slightly fewer headaches, but still).
Before I proceed spending more time with a solution that might or might not work as advertised, I'd like to know: Did anyone ever get an actual, real-world search solution working in Django? I'm serious. I find it really frustrating reading about "large problems mostly solved", and then realizing that you will never get a working installation from the source-code because, actually, all bloggers dealing with those "mostly solved problems" never went past basic installation and copy-pasting the official tutorials.
So here are the requirements:
must be able to search for 10-100 terms in one query
must handle + (term must be present) and - (term must not be present), AND/OR
must handle arbitrary grouping (i.e. parentheses around AND/OR)
must allow for Django-ORM filtering before or after fulltext-search (i.e. pre-/post-processing of results with the full set of filters that Django knows about)
alternatively, there must be a facility to bulk-fetch the result set and transform it into a QuerySet
should be light on the machine, so preferably no humongous JVM and Java-based app-server installation
Is there anything out there that does this? I'm not interested in anecdotal evidence, or references to some blog posts that claim it should be working. I'd like to hear from someone who actually has a fully-functional setup working in the real world, under real conditions, with real queries.
EDIT:
Let me repeat again that I'm not so much interested in anecdotal evidence that someone, somewhere has a somewhat running installation working with unspecified properties. I already went there, I read all the blog posts, mailing lists, I contacted the authors, but when it came to actual implementation of real-world scenarios, nothing ever worked as advertised.
Also, and a user below brought that point up as well, considering the TCO of any project, I'm definitely not interested in hearing that someone, somewhere was able to pull it off once a vendor parachuted in an unknown number of specialists to monkey-patch the whole installation with specific domain-knowledge that's documented nowhere.
So, please, if you claim you have a working installation that actually satisfies minimum requirements for a full-fledged search (see requirements above), please provide the following so that we can all benefit from a search solution for Django that actually solves the problem:
exact Linux distribution, release version,
exact release version of Haystack (or equivalent) and release version of search backend,
exact release version of the search engine
publicly (!) available documentation how to set up all components exactly in the way that your installation was set up such that the minimal requirements above are met.
Thank you.

I have developed some Django applications with xapian support too. The biggest of them has a xapian database with an index of 8G storing 2.4M documents (including forum posts, wiki entries, planet entries and blog entries) - still growing.
Overall I am quite happy with xapian. It performs extremely well and is easy to use. The only thing I don't like is that xapian won't work with mod_wsgi (except in global mode) because of a deadlock. So you are forced to use fastcgi (or connect to xapian-tcpsrv or write your own service).
I recommend using the xapian-bindings directly. Xapian nowadays offers quite a lot of useful helpers (TermGenerator, QueryParser, etc.), which make both indexing and querying simple. In fact, I can't imagine anything that would justify an additional library. In my opinion they are all more complicated and don't allow you to index efficiently.
The only thing you need is some understanding of how xapian works. (What are terms? What are values? What is stemming and where should I use it? And so on.) You can find all those topics on the xapian website, and as soon as you understand those concepts, dealing with xapian becomes easy.
Also, the xapian API is extremely stable. I started using it long before the 1.0 release and never had any problems with API changes or version conflicts. The only thing that has changed is that all the helpers (query parser, tokenizer, etc.) I once wrote for my Django project are now useless, because similar classes have made their way into the xapian core.
So, to summarize, just give the direct usage of xapian-bindings a try.
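On the requirement to turn search hits back into ORM objects: if the bindings are used directly and each indexed document stores its primary key (an assumption about how you index), a search returns a ranked list of keys, and the only glue needed is to put the fetched rows back into that order. A minimal sketch; `order_by_rank` is a hypothetical helper, and `rows_by_pk` stands in for what a bulk fetch such as `Model.objects.filter(...).in_bulk(ranked_pks)` would return in Django, so the snippet runs without Django or Xapian installed:

```python
def order_by_rank(ranked_pks, rows_by_pk):
    """Return rows in search-rank order, skipping hits the ORM filter removed."""
    return [rows_by_pk[pk] for pk in ranked_pks if pk in rows_by_pk]


# Hypothetical data: primary keys as returned by the search engine, best hit first.
ranked_pks = [42, 7, 19]
# Stand-in for an ORM bulk fetch after filtering; pk 7 was filtered out.
rows_by_pk = {19: "post 19", 42: "post 42"}

results = order_by_rank(ranked_pks, rows_by_pk)
print(results)  # ['post 42', 'post 19']
```

Filtering in the ORM and ordering in Python keeps the full set of Django filters available, at the cost of one extra query per result page.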

I can vouch for Django-Haystack with the Xapian backend (In the interest of full disclosure, I am the author of the xapian-haystack backend) in a real life, production environment. We currently use Haystack/Xapian on several sites, the largest of which has more than 20,000 registered users and a Xapian database with 20,000+ documents containing more than 143,000 unique terms for a total size of ~141mb.
As for not being able to get any combination of Haystack and the Xapian backend running, I'll admit that I was not as diligent as I should have been with my tagging and so there is some confusion with the versions. You should, however, be able to use the current master of both codebases without any issue. If this is not the case, I'd be more than happy to assist with problems. You'll need to be a little bit more specific about the issue though. Simply saying "it did not work" is not enough information.
Daniel and I both do our best to respond to any issues opened on Github in a timely manner. Also, we're both usually available on the #haystack IRC channel during the day and on the django-haystack Google Group.
Versions used:
Haystack 1.0BETA with Xapian-Haystack 1.1.0BETA
Haystack 1.0.1FINAL with Xapian-Haystack 1.1.3BETA
Most of the sites we've deployed with Haystack have been running Ubuntu 8.04 LTS with Xapian 1.0.5

Short answer: No.
We bailed and went with a Google Custom Search. Although the site has over 10,000 possible page views, we keep the sitemap feed down to the main 4,000 pages or so and it costs $250/year, which is about 2 hours of my time. The customer is happy and he feels comfortable with the results.
I'd love to see someone come up with a good FOSS solution, but in a commercial situation the TCO has got to make economic sense.

The details you requested.
exact Linux distribution, release version - Ubuntu 9.04 & 9.10
exact release version of Haystack (or equivalent) - Haystack 1.0 as well as master
release version of search backend - The Solr & Whoosh backends included with Haystack
exact release version of the search engine - Solr 1.3, Solr 1.4 & Whoosh 0.3.15
publicly (!) available documentation how to set up all components exactly in the way that your installation was set up such that the minimal requirements above are met.
http://docs.haystacksearch.org/dev/installing_search_engines.html#solr (or #whoosh)
Beyond this, it's the standard configuration bits from the tutorial, plus any additional overrides from (which I can't link to, thanks Stack Overflow) as needed.
As the maintainer of Haystack, I'm actively running all of the above setups. The smallest Haystack installation (Haystack 1.0 + Whoosh) is ~600 documents. A slightly larger one (Haystack master + Solr 1.4) is ~4000 documents. The largest deployment I'm aware of (Haystack master + Solr 1.4) is ~3 million documents.
I generally try to avoid Stack Overflow, so don't be surprised if you see nothing further from me. The mailing list is the best place for support, but given your responses thus far, I'm sure you'd rather just trash me here.

I (and my colleagues) have successfully used Haystack to achieve a fairly good search functionality.
It is easy to start with Haystack and the Whoosh backend, and to change to the Apache Solr backend when Whoosh's performance is no longer acceptable.
We really ought to get around to writing a detailed post about it, with links to the projects where it works.
For now, I suggest having a look at this search: http://www.webdevjobshq.com/search/?q=rails implemented using Haystack with the Apache Solr backend. Or this: http://www.govbuddy.com/search/?q=Roy

Have you considered Sphinx? What are you using as your data store? It has a MySQL engine that works terrifically. I think it meets most of your requirements, except I'm not exactly certain how nicely it can be tied into the Django ORM.
I'm heavily considering using Sphinx in one of my own Django Apps to improve performance on an auto-suggest field that does a prefix and infix search on a corpus of 3.5 million records. But I haven't got around to implementing it yet, so I can't speak to Django+Sphinx integration. My only Sphinx experience is with the MySQL Engine and directly querying MySQL.

I use Djapian. It was quite simple to install and works great. There is an actual tutorial that covers basic use cases and shows the entire integration process.
Yes, it has some ambiguities, but the issue tracker is open and the authors rapidly fix bugs and add features.

Related

Is there any way to bookmark a Django "current" documentation page, without version numbers?

The point of this is to keep notes/urls pointing to particular parts of the documentation that people want to refer to in the future. For example, when something is a complex feature that requires a little bit of review most of the times you work with it.
Let's take an example. I search for Django STATICFILES_DIRS:
https://www.google.com/search?q=django+STATICFILES_DIRS
Pretty quickly I get exactly what I want:
https://docs.djangoproject.com/en/4.1/ref/contrib/staticfiles/
which has a STATICFILES_DIRS configuration entry.
But, notice from the url, this is for Django 4.1. And it says so on the page too.
But maybe there is a current version? Let's look.
There isn't.
Contrast with Python, whose URL points to a generic version 3, not to Python 3.10 or 3.11.
https://docs.python.org/3/tutorial/datastructures.html#dictionaries
Or postgres (looking for create table):
https://www.postgresql.org/docs/14/sql-createtable.html
OK, yes, I have a version 14, but...
I can click on that current and that will NOT pin me to a particular version.
https://www.postgresql.org/docs/current/sql-createtable.html
In the case of Django, Python and Postgresql, I am pretty confident a generic, version-less documentation page will serve my purposes just fine 90% of the time - those are pretty stable APIs by now.
Often searching gets you to ancient postgresql versions like 9.2, but you can always find a current link.
Am I looking in the wrong places for a permanent link for Django docs?
Yes, there is a dev link in the Django version list, but that's living a bit dangerously: I assume people may be actively updating the docs at that URL. Or should I use that after all?
Going to use the dev tag for now. Django is stable enough that I expect minimal problems there and I can always go back to my version of interest from there. But at least I don't have bookmarks pointing to a bunch of different historical versions.
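For existing bookmarks that already point at a specific version, the version segment can be swapped for the `dev` alias mechanically. A rough sketch, assuming the URL layout shown above (`/<language>/<major.minor>/...`); `versionless_docs_url` is a hypothetical helper name, and the `alias` parameter is there in case you prefer a different alias that the site offers:

```python
import re

# Matches the version segment in URLs such as
# https://docs.djangoproject.com/en/4.1/ref/contrib/staticfiles/
_VERSIONED = re.compile(
    r"^(https://docs\.djangoproject\.com/[a-z-]+/)\d+\.\d+(/.*)?$"
)


def versionless_docs_url(url, alias="dev"):
    """Replace the numeric version in a Django docs URL with `alias`.

    URLs that do not match the assumed layout are returned unchanged.
    """
    match = _VERSIONED.match(url)
    if match is None:
        return url
    return f"{match.group(1)}{alias}{match.group(2) or ''}"


print(versionless_docs_url(
    "https://docs.djangoproject.com/en/4.1/ref/contrib/staticfiles/"
))
# https://docs.djangoproject.com/en/dev/ref/contrib/staticfiles/
```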

What could break when migrating from Adobe ColdFusion to an alternative CFML engine?

We are currently using Adobe ColdFusion 9 for a rather large application. We are thinking about moving to Railo or Blue Dragon.
What problems will we run into?
Will it require a large amount of refactoring or will most CFML code just work on the new system?
Do alternative engines provide support for most all official tags, or are they more limited?
In short, how divergent are these alternatives from the official language?
Is there anything we can do to make this process less painful (like upgrading to CF11 first or removing/avoiding certain features)?
My question is similar to What Notable Differences are there between Railo, Open Bluedragon, and Adobe Coldfusion?, but while that is concerned with practical differences I'm asking more specifically about practicality of transition/implementation.
It all depends on your code and the specific Adobe ColdFusion functionality that you are using. For the most part each CFML iteration supports the same tags/functionality. Where they deviate from the Adobe product is usually documented and explained. You need to dive into your code base and look specifically at the features you are using and compare those to the CFML engine of your choosing. Or you can just download and spin-up the alternate CFML engine, drop your code base in it and see what breaks.
As an example from Railo - CFML Compatibility
Railo tries to adhere the CFML standard as good as possible, Still there are some differences like missing tags and functions or a slightly different behavior. This page and the ones below should describe the incompatibilities.
And I have to question what you are basing this comment on: "and especially it's very uncertain future with them". You are running ColdFusion 9. Adobe has shipped two major releases since then (10 and 11) and is currently working on the next one.
There are two main areas that can prove problematic when migrating from Adobe ColdFusion to Railo:
Use of feature areas that are not supported by Railo
Sloppy CFML code
The former includes integration with Microsoft technologies, such as Exchange and Sharepoint, as well as Office document manipulation; PDF forms and some of the more sophisticated document manipulations; UI "widget" integration. There are third party extensions for some of the Microsoft integrations, e.g., cfSpreadsheet, but for PDF-related stuff you'll need to roll your own using Java libraries (PDF forms and high quality HTML to PDF conversion are Adobe specialties so be prepared to do quite a bit of work in your migration if you rely on these). As for the UI "widgets", you're better off doing that the "right way" so if you rely on those, you should read ColdFusion UI The Right Way.
The latter is a harder issue to nail down. The differences are not well documented - except in experience posts to mailing lists and blogs by people who've made the transition to Railo - but they include things like:
Using scope names as variables (Railo treats scopes as reserved names for performance reasons)
Embedding comments inside tags, e.g., <cfif x gt y <!--- check boundary --->> (I've seen things like this in older CFML code and was surprised it worked).
Reliance on automatic creation of nested struct elements, e.g., a.b.c = 0 when a has not been declared.
Reliance on long-deprecated features, e.g., parameterExists().
There are many other small differences: Railo is generally stricter about syntax and semantics than Adobe ColdFusion, and often those decisions are driven by performance concerns in that compatibility with Adobe ColdFusion would make Railo slower.
Full disclosure here: I have used Railo pretty much exclusively for five years and I used to run the US arm of Railo's consulting business. That said, you need to consider that Railo is a small company (despite the backing of five fairly large former Adobe partners) with just a handful of people working on the engine, and very little awareness of the product outside the more leading edge portion of the CFML community. By comparison, Adobe have a large team and a marketing budget. Your concerns about the difficulty of finding developers will not be addressed by switching to Railo - to gain access to a larger developer pool, you'd really need to switch to a more popular language, not just a different engine.
Finally, a word about Blue Dragon's engine, specifically Open BlueDragon: the maintainers of that project have stated publicly several times that compatibility with the other engines (Adobe, Railo) is not a primary concern for them, and indeed there are a lot of modern language features that they still don't support or at least don't support in a compatible manner. Last I checked, full-script components were on that list despite having been supported in Adobe ColdFusion and Railo for many years (by which I mean using component { ... } rather than the <cfcomponent><cfscript> .. </cfscript></cfcomponent> form). The BlueDragon dialect of CFML has been steadily diverging over the years so unless you have very old school CFML, that would still run on CFMX7 / ACF8, you probably won't have much success trying to migrate to Open BlueDragon.
There are a couple good answers here and I appreciate the advice given in them. When I asked this question I was looking for something a little more specific, so now that I've had the chance to really play around with migrating our app to Railo I thought I should come back and list out the issues we've run into and, just as importantly, the severity and workarounds. Hopefully this will help others considering making the jump:
cfMessageBox:
cfMessageBox is not a supported tag in Railo. The best solution we've come up with is to create a new custom tag called MessageBox.cfm, then drop it into “{railo-install}/lib/railo-server/context/library/tag/”. This will allow it to be recognized as a core tag and referenced via “”, which saves us from updating hundreds of templates that call it. This, of course, requires us to create a message box custom tag from the ground up.
cfDiv:
cfDiv seems to be throwing a JS error when used to bind to a JS function. I'm going to guess that this is because JS binding is not officially supported (given that I can't find any reference in the official docs), and while ACF allows it as delayed execution, Railo simply doesn’t accept it. We could just create a custom tag that generates a JS setTimeout as described in (1) above, which solved our problem, but applications that actually use this tag for its intended purpose may have a more difficult road ahead.
cfWindow:
There appears to be limited support for cfWindow in Railo. Specifically, new windows need to be shown manually, and the destroy methods do not exist. Various other bugs appeared as well. We decided that it made more sense to just move to JQuery based modals.
cfLayout:
cfLayout support is questionable. It is based on JQuery and not Ext-JS like ACF’s version. This causes a problem because we run JQuery 1.10 right now and the built-in tag doesn’t appear to work beyond JQuery 1.8. In fact, I could not find any JQuery version within which the tag worked perfectly. We decided that it may be best to, again, just write our own custom tag based on JQuery.
cfDocument:
cfDocument works differently in Railo and seems to require more strict HTML. I found a lot of helpful information here, though as of yet I haven't actually gotten any of my cfDocument calls to work as expected.
Relative cfLocations:
cfLocations that began with a “../” and backtracked beyond the webroot would throw a weird Java error. This ended up being a bug in Tomcat, and was patched by the Railo team in version 4.3.1.003. If you download an older Railo version you may run into this issue and need to update all of your cfLocation calls.
Oracle Thin Client:
Our database guy reported to me that he setup the Oracle Thin Client, because the OCI client is not natively supported in Railo. I found this, which might be relevant, but I don't have the expertise to say for sure.
Documentation:
ACF Livedocs are sometimes aggravating as they don't touch on the more important intricacies of how some tags are implemented, but Railo's version is the definition of minimalist. I think it's fair to say that Railo has no docs specifying each tag and function and that they leave you to rely on Adobe for that, which causes a serious issue when you need to know how the two implementations differ.
In the end it seems like, as predicted by previous answers, the UI tags were the bulk of our issues. Based on previous comments I was hoping for better implementations of them that may just require a tweak here and there, but (at least for our needs) the Railo versions seem borderline non-functional and it looks like we would need to replace them completely. For us, this may not be realistic, though we are still tossing the idea around.
To be fair, here are some of the good points from our research and testing:
Performance:
Although compatibility problems have prevented me from doing much performance testing, initial spot checks show approximately a 50% decrease in execution time for most pages.
Debugging:
The debugging options in Railo are quite amazing. There are far more options for formatting, including specifying different formats for different developers (IP addresses). One incredible feature is the inclusion of a comma delimited list of query fields that were actually used in the page: this could allow you to effectively develop based on a "select *" query and simply copy and paste the fieldlist into the query at the end of development, which would save a lot of time with views as large as the ones we're using.
Cost:
This is one of the larger reasons we decided to look into alternatives. Switching just a few Enterprise licensed ACF servers over to Railo would save $20k+ over upgrading to the newest version of ACF. Further, with the performance increases you could see an even greater savings in hardware requirements. A side effect of this point is that one can keep far more up to date without the constant cost/benefit analysis of licensing costs holding up upgrades.
Support:
Without a support contract, it doesn't seem like Adobe responds to user concerns. I've had a production impacting bug reported since ACF 9 which still hasn't been fixed. Yet the Railo community is one of the most helpful and responsive I've ever seen, and developers have even responded directly to concerns and bug reports I've raised.
Longevity:
This is a highly opinionated point, of course, but while Adobe seems to be relegating ACF to the shadows more and more with each new version, Railo appears to be dedicated to growing the community. Combined with its open source nature I think this makes it a safer bet for future support in the long term, even if that support is just us taking development into our own hands when needed.
For a number of reasons, including divergent CFML compatibility, we did not even get to the testing stage with Blue Dragon.

Django A/B Split Testing Packages (None I've found are well-documented and up-to-date.)

There are two main schools of thought for doing A/B (Split) Testing:
Javascript-based solutions such as Optimizely, Google Analytics Content Experiments.
Server-side solutions such as Django-AB, Splango, and django-lean. (Also, writing your own.)
My understanding is that Javascript-based solutions are spectacular for "which color button converts better," but not so great for switching out entire page layouts, and completely unworkable for trying out large functional changes such as the sequence of pages in a funnel.
That leads me towards a server-side solution. I'm not crazy about coding my own, and will do so only if there is no other option. I'm trying to add value by improving the core functionality of my site, not by creating a better split-testing framework.
The Django apps I've found for split testing are various mixtures of unmaintained, undocumented, documented incorrectly, and incompatible with Django 1.5. This surprises me, because the Django and Python communities seem to have a strong focus on good documentation. I'm also very surprised that none of the testing frameworks I've tried has been compatible with Django 1.5 -- is testing not as core a part of the philosophy in the Django/Python world as it is in Rails?
Here's what I've found:
Splango https://github.com/shimon/Splango -- Not compatible with Django 1.5 (although most compatibility bugs I found were trivial to fix). Largely untouched since October 2010, except for a fix in August 2012 which claims to make sure templates get included in the install. Since templates don't get included in the install when Splango is installed via PyPI, either the fix didn't work or didn't get submitted to PyPI. Documentation is largely accurate, but doesn't completely cover how to set up tests and get reports. It tells you how to configure the template to gather the data, but there appear to be additional steps required in the admin interface which are completely undocumented, and I'm not sure I've done them properly.
Django-lean. Original at https://bitbucket.org/akoha/django-lean has not been updated since July 2010. There is an apparently "blessed" fork at https://github.com/anandhenry2002/django-lean which has not been changed since May 2012, when it was copied over from the original. The original's documentation is incorrect in ways that make following the examples impossible. (Though you can probably muddle your way through, as I did.) The new version's documentation has formatting problems that make it difficult to read on github. (This appears to be because it's the unchanged documentation from the old project, and BitBucket syntax doesn't work on Github.) The django-lean Google Group has not had a message since July 2012.
django-mini-lean https://github.com/DanAncona/django-mini-lean -- Updated as recently as February 2013, but undocumented.
Leaner - https://bitbucket.org/brianjinwright/leaner -- Last updated July 2012, and no docs.
Django-AB -- Last updated May 2009. Is not a package, and can't be installed via PIP or PyPI. After placing the checkout in my django app folder (and renaming the folder to ab) and following the installation instructions, I get an error loading the template loader that I have not tracked down further.
So far Splango appears to be the winner, as I've actually been able to get it more-or-less working (by manually installing the templates, and then editing them to fix Django 1.5 incompatibilities).
Can anyone point me to anything I've missed?
You have missed this app : https://github.com/mixcloud/django-experiments + https://github.com/disqus/gargoyle/
And then there's waffle: http://waffle.readthedocs.org/
It's simple, updated and maintained, but not very feature-rich: it doesn't have any analytics/reporting integrated. But then again, Google Analytics or a Mixpanel-type service is better for this.
I first looked at Django-AB and that is almost what I wanted, but I couldn't get it to work either. After looking at django-experiments and deciding I didn't want to mess around with redis yet, I decided to roll my own. I've tried to package it up nicely and make it easy to use for the beginner. It's super basic.
https://github.com/crobertsbmw/RobertsAB
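For anyone weighing the roll-your-own route: the core of a server-side split test is assigning each visitor to a variant in a way that stays stable across requests. One common approach (an assumption here, not a description of how RobertsAB, Splango, or django-lean work) is to hash the experiment name together with a user or session identifier. A minimal sketch; `assign_variant` and the experiment and variant names are hypothetical:

```python
import hashlib


def assign_variant(experiment, user_id, variants=("control", "variant")):
    """Deterministically map (experiment, user_id) to one of `variants`.

    The same inputs always give the same variant, so nothing needs to be
    stored per user just to keep the assignment stable.
    """
    key = f"{experiment}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Use the first 8 bytes of the hash as an integer bucket number.
    bucket = int.from_bytes(digest[:8], "big") % len(variants)
    return variants[bucket]


first = assign_variant("checkout-funnel", 1234)
second = assign_variant("checkout-funnel", 1234)
print(first == second)  # True
```

In a Django view you would branch on the returned name to pick a template or a page sequence; recording conversions and reporting on them is the part this sketch leaves out.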
You can swap out entirely different page layouts with Google Analytics Experiments (their default experiment setup will redirect users to a different URL for each variation you have), although in general it's much easier to interpret why something is more successful if you test smaller things against each other.
You are right that testing different funnels and user flows against each other using Google Analytics would require a lot of manual setup; although theoretically you could do it by swapping out different links and tracking your users with UTM campaigns.
For smaller A/B tests within the same page, I ended up using Google Analytics Experiments and writing a custom Django CMS plugin for adding a few variant options to a template, which queries the Google Analytics API and displays the correct variant using Javascript.

What are the gotchas with ColdFusion?

Background:
I have a new site in the design phase and am considering using ColdFusion. The Server is currently set-up with ColdFusion and Python (done for me).
It is my choice on what to use and ColdFusion seems intriguing with the tag concept. Having developed sites in PHP and Python the idea of using a new tool seems fun but I want to make sure it is as easy to use as my other two choices with things like URL beautification and scalability.
Are there any common problems with using ColdFusion in regards to scalability and speed of development?
My other choice is to use Python with WebPy or Django.
ColdFusion 9 with a good framework like Sean Corfield's FW/1 has plenty of performance and all the functionality of any modern web server development language. It has some great integration features like Exchange server support and Excel / PDF support out of the box.
Like all tools it may or may not be the right one for you but the gotchas in terms of scalability will usually be with your code, rarely the platform.
Liberally use memcached or the built-in Ehcache in CF9, be smart about your data access strategy, intelligently chunk returned data and you will be fine performance-wise.
My approach with CF lately involves using jQuery extensively for client side logic and using CF for the initial page setup and ajax calls to fill tables. That dramatically cuts down on CF specific code and forces nice logic separation. Plus it cuts the dependency on any one platform (aside from the excellent jQuery library).
To specifically answer your question: if you read the [coldfusion] tags here, you will see that questions are rarely about speed or scalability; it scales fine. A lot of the questions seem to be about places where CF is a fairly thin layer on top of another tool, like Apache Axis (web services) and ExtJs (cfajax), neither of which you need to use. You will probably need mod_rewrite or IIS rewrite to hide .cfm.
Since you have both ColdFusion and Python available to you already, I would carefully consider exactly what it is you're trying to accomplish.
Do you need a gradual learning curve, newbie-friendly language (easy for someone who knows HTML to learn), great documentation, and lots of features that make normally difficult tasks easy? That sounds like a job for ColdFusion.
That said, once you get the basics of ColdFusion down, it's easy to transition into an Object Oriented approach (as others have noted, there are a plethora of MVC frameworks available: FW/1, ColdBox, Fusebox, Model-Glue, Mach-ii, Lightfront, and the list goes on...), and there are also dependency management (DI/IoC) frameworks (my favorite of which is ColdSpring, modeled after Java's Spring framework), and the ability to do Aspect-Oriented Programming, as well. Lastly, there are also several ORM frameworks (Transfer, Reactor, and DataFaucet, if you're using CF8 or earlier, or add Hibernate to the list in CF9+).
ColdFusion also plays nicely with just about everything else out there. It can load and use .Net assemblies, provides native access to Java classes, and makes creating and/or consuming web services (particularly SOAP, but REST is possible) a piece of cake. (I think it even does com/corba, if you feel like using tech from 1991...)
Unfortunately, I've got no experience with Python, so I can't speak to its strengths. Perhaps a Python developer can shed some light there.
As for URL rewriting (again, as others have noted), that's not really done in the language (though you can fudge it); to get a really nice-looking URL you need either mod_rewrite (which can be done without .htaccess by putting the rules in your Apache VHosts config file) or one of the IIS URL rewriting products.
The "fudging" I alluded to would be a url like: http://example.com/index.cfm/section/action/?search=foo -- the ".cfm" is in the URL so that the request gets handed from the web server (Apache/IIS) to the Application Server (ColdFusion). To get rid of the ".cfm" in the URL, you really do have to use a URL rewriting tool; there's no way around it.
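To make the structure of such a "fudged" URL concrete, here is a minimal sketch in Python (not CFML) that splits it into the script the web server hands off to, the extra path segments, and the query string. The function name `split_fudged_url` is hypothetical and only for illustration; it is not part of ColdFusion or any framework.

```python
from urllib.parse import urlsplit, parse_qs


def split_fudged_url(url, marker=".cfm"):
    """Split a URL into (script, path segments, query dict).

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(url)
    # Everything up to and including ".cfm" names the script that the
    # web server passes to the application server.
    head, sep, tail = parts.path.partition(marker)
    if not sep:
        raise ValueError("no %s script in URL path" % marker)
    script = head + sep
    # Whatever follows the script is extra path info the application
    # can interpret as section/action.
    segments = [s for s in tail.split("/") if s]
    query = parse_qs(parts.query)
    return script, segments, query


script, segments, query = split_fudged_url(
    "http://example.com/index.cfm/section/action/?search=foo"
)
print(script, segments, query)
```

The script name stays visible in the path, which is exactly the part that only a rewriting tool in front of the application server can hide.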
From two years working with CF, for me the biggest gotchas are:
If you're mainly coding using tags (rather than CFScript) and formatting for readability, be prepared for your output to be filled with whitespace. Unlike in other scripting languages, the whitespace between statements is actually sent to the client, so if you're looping over something 100 times and outputting the result, all the linebreaks and tabs in the loop source code will appear 100 times. There are ways around this but it's been a while - I'm sure someone on SO has asked the question before, so a quick search will give you your solution.
Related to the whitespace problem, if you're writing a script to be used with AJAX or Flash and you're trying to send XML, even a single space before the DTD can break some of the more fussy parsing engines (jQuery used to fall over like this - I don't know if it still does - and Flash was a nightmare). When I first did this I spent hours trying to figure out why what looked like well-formed XML was causing my script to die.
The later versions aren't so bad, but I was also working on legacy systems where even quite basic functionality was lacking. Quite often you'll find you need to go hunting for a COM or Java library to do the job for you. Again, though, this is in the earlier versions.
CFAJAX was a heavy, cumbersome beast last time I checked - so don't bother, roll your own.
Other than that, I found CF to be a fun language to work with - it has its idiosyncrasies like everything else, but by and large it was headache free and fast to work with.
Hope this helps :)
Cheers
Iain
EDIT: Oh, and for reasons best known to Adobe, if you're running the trial version you'll get a lovely fat HTML comment before all of your output - regardless of whether or not you're actually outputting HTML. And yes, because the comment appears before your DTD, be prepared for some browsers (not looking at any one in particular!) to render it like crap. Again - perhaps they've rethought this in the new version...
EDIT#2: You also mentioned URL Rewriting - where I used to work we did this all the time - no problems. If you're running on Apache, use mod_rewrite; if you're running on IIS, buy ISAPI Rewrite 3.
Do yourself a favor and check out the CFWheels project. It has the URL rewriting support and routes that you're looking for. Also, as a full-stack MVC framework, it comes with its own ORM.
It's been a few years, so my information may be a little out of date, but in my experience:
Pros:
ColdFusion is easy to learn, and quick to get something up and running end-to-end.
Cons:
As with many server-side scripting languages, there is no real separation between persistence logic, business logic, and presentation. All of these are typically interwoven throughout a ColdFusion source file. This can mean a lot more work if you want to make changes to the database schema of a mature application, for example.
There are some disciplines that can be followed to make things a little more maintainable; "Fusebox" was one. There may be others.

Which web framework incurs the least overhead?

I'm playing around with Django on my website hosting service.
I found out that a simple Django page, which has only some static text, and is rendered from a very simple template I created takes a significant time to render. When compared to a static HTML page, I am getting ~2 seconds difference in the load times. Keep in mind this is a simple test of mine with nothing complicated. Also note that my web hosting is on a shared server (not dedicated), so I might be hitting some CPU limitations.
Seems to me that either:
1. I have some basic CGI/Apache/Django configuration wrong
2. Django takes significant overhead, at least in this specific scenario.
I find #1 not probable since I followed my web hosting service wiki on how to set up Django. So we are left with the overhead problem.
My question is which web framework do you find the best to use in scenarios where the website is hosted on a shared server, and CPU/memory overhead must be kept to minimum?
Edit: seems that my configuration is something I might want to look at, and perhaps later on I'll be opening a question on how to best configure Django.
For now, I would appreciate answers focusing on your experience, in general, with web frameworks, and which of those you found to be the best in terms of performance in the aforementioned scenario.
"I have some basic CGI/Apache/Django configuration wrong"
Correct.
First. The very first time Django returns a page, it takes forever. A lot of initialization happens for the first request.
Second. What specific configuration are you using? We just switched from mod_python to mod_wsgi in daemon mode and are very happy with the performance changes.
Third. What database are you using?
Fourth. What test configuration are you using?
Fifth. What caching parameters and reverse proxy are you using?
Odds are good that you have a lot of degrees of freedom in your configuration.
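One way to check the first point before blaming the framework is to time the first request separately from the ones that follow. Below is a minimal sketch; `time_requests` and `fake_fetch` are hypothetical names, and `fake_fetch` is only a stand-in that simulates a one-time startup cost rather than a real Django page.

```python
import statistics
import time


def time_requests(fetch, n=5):
    """Call fetch() n times; return (first call seconds, median of the rest)."""
    if n < 2:
        raise ValueError("need at least 2 calls")
    timings = []
    for _ in range(n):
        start = time.perf_counter()
        fetch()
        timings.append(time.perf_counter() - start)
    return timings[0], statistics.median(timings[1:])


# Stand-in for a real page fetch: only the first call pays a setup cost.
_state = {"ready": False}


def fake_fetch():
    if not _state["ready"]:
        time.sleep(0.05)  # simulated one-time initialization
        _state["ready"] = True
    return "page"


first, warm = time_requests(fake_fetch)
print("first: %.3fs, warm median: %.6fs" % (first, warm))
```

To measure a real site, the same helper could be given something like `lambda: urllib.request.urlopen(url).read()`, where `url` is your own page. If only the first number is large, the cost is initialization, not per-request overhead.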
Edit
The question "which of those you found to be the best in terms of performance" is largely impossible to answer.
See http://wiki.python.org/moin/WebFrameworks
There are dozens of frameworks. Few people can examine more than a few to do head-to-head comparison.
The best possible performance is achieved through static content. A Python app that makes static pages (for instance a collection of Jinja templates) is fastest.
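As a rough sketch of that idea, the script below renders pages once and writes them out as plain .html files that the web server can serve directly. It uses the standard library's `string.Template` in place of Jinja so that it runs without extra packages; the page names, template, and `build_site` function are made up for illustration.

```python
import tempfile
from pathlib import Path
from string import Template

# Stand-in for a Jinja template; note that string.Template does no HTML escaping.
PAGE = Template(
    "<html><head><title>$title</title></head>"
    "<body><p>$body</p></body></html>"
)


def build_site(pages, out_dir):
    """Render each page once and write it as a static .html file."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for name, context in pages.items():
        target = out / (name + ".html")
        target.write_text(PAGE.substitute(context), encoding="utf-8")
        written.append(target)
    return written


# Hypothetical content, for illustration only.
pages = {
    "index": {"title": "Home", "body": "Hello"},
    "about": {"title": "About", "body": "Static text"},
}
out_dir = tempfile.mkdtemp()
files = build_site(pages, out_dir)
print([f.name for f in files])
```

All the Python work happens at build time, so no Python code runs when a visitor requests a page.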
After that, it's largely impossible to say. Even http://werkzeug.pocoo.org/ involves some processing overheads that may be unacceptable in the above scenario. Python can be slow.
Django, with a modicum of effort, is often fast enough. Serving static content separately from dynamic content, for example, can be a huge speedup.
Since Django does so much automatically, there's a huge victory in not having to write every little administrative page.
I'd say there has to be something funky with your setup there to get such a large performance difference. Try mod_wsgi (if you're not already) and follow the excellent suggestions by the posters above. If Django genuinely was this slow in all cases, there's just no way companies would be able to use it for production applications. More than likely, it is not Django that is holding the request up. Once you have the .pyc files all sorted (automatically generated bytecode), then the execution should be fairly zippy.
However, if you don't actually need all Django has to offer, then why use it? I'm using it in quite a large production application, and we're not using all of its features… if you're doing something fairly simple, you may want to consider using something like web.py or Werkzeug (or something non-Python-based if you'd rather).
Frameworks like Django or Ruby on Rails grew out of real-world needs. The frameworks turned out as different as those needs were.
Here is my experience:
As a former PHP programmer, I preferred CakePHP for simple stuff and Symfony for more advanced applications. I had a look into Ruby, but the documentation sucked back then. Now I'm using Django. Django works very well for me. In contrast to Symfony, I feel like Django brings less flexibility out of the box, but it's easier to extend.
Another approach would be to use 'no framework' CherryPy
I think the host may be an issue. I do Django development on my localhost (Mac) and it's way better. I like WebFaction for cheap hosting and Amazon ec2 for premium hosting.
The framework is strong and it can handle heavy sites - don't obsess about that stuff. The important thing is to create a clean product, Django can handle it. There are about a thousand steps to take when you see how the application handles in the wild, but for now, just trust us that you don't need to worry about the inherent speed of the framework before exhausting a whole slew of parameters including a dedicated VPS/instance when you need it.
Also, following on your edit - I personally don't think performance is a major issue in programming. Here are the issues, in order of concern:
UI/UX efficiency
UI/UX speed (application caching)
Well designed models/views
Optimization of the system (n-tier architecture, etc...)
Optimization of the process (good QA to reduce failures/bottlenecks from deployment)
Optimization of the subsystems (database, etc...)
Hardware
Framework internal optimization
Don't waste time with comparing framework speeds. Their advantage is in extensible code, smart architectures, etc...
On a side note, DO NOT NOT USE A FRAMEWORK FOR A NEW WEB APPLICATION. I'm sorry I can't say it loud enough, but it's an absolute requirement nowadays. It's not even a debate about not using one, just which one to use.
I personally chose Django, which is great. But I can't definitively knock the others out there.
It's possibly both. Django does have stuff for caching built-in, which would be worth trying. Regardless, any non-cached page will nearly always take longer than a static file. A file has to be read in both cases, and in the case of a dynamic page, it also must be executed. And then, in both cases, sent over to the client.
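The effect of page caching can be sketched with the standard library alone. This is not Django's cache API, just an illustration of the idea under the assumption of a simple time-to-live; `PageCache` and `render_home` are hypothetical names.

```python
import time


class PageCache:
    """Tiny in-memory cache with a per-entry time-to-live (illustrative only)."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store = {}  # key -> (expiry time, rendered body)

    def get_or_render(self, key, render):
        now = self.clock()
        hit = self._store.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]  # still fresh: skip rendering entirely
        body = render()
        self._store[key] = (now + self.ttl, body)
        return body


calls = []


def render_home():
    calls.append(1)  # count how often the page is actually rendered
    return "<p>static text</p>"


cache = PageCache(ttl_seconds=60)
first = cache.get_or_render("/", render_home)
second = cache.get_or_render("/", render_home)
print(len(calls))
```

The second request is answered from memory without running the render function, which is the saving a cache buys for a page whose content rarely changes.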
Shared hosting is definitely not the best choice for running heavy frameworks such as Django or CakePHP. If you can afford it, buy a VPS.
As for performance, your host probably runs Python with mod_python, which is no longer recommended. WSGI is the preferred standard for Python-powered web apps.
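For reference, a WSGI application is just a callable with a fixed signature, so a bare one is a useful baseline for measuring how much overhead the hosting setup adds before any framework is involved. The sketch below uses only the standard library; `call_app` is a hypothetical helper that invokes the app in-process instead of through a web server.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """Smallest useful WSGI app: fixed plain-text response."""
    body = b"Hello from a bare WSGI app\n"
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def call_app(app, path="/"):
    """Invoke a WSGI app in-process and return (status, body bytes)."""
    environ = {}
    setup_testing_defaults(environ)  # fills in a minimal valid WSGI environ
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    chunks = app(environ, start_response)
    return captured["status"], b"".join(chunks)


status, body = call_app(application)
print(status, body)
```

mod_wsgi looks for a callable named `application` in the script file by default, and for a quick local check the same callable can be served with `wsgiref.simple_server.make_server`. If this page is also slow on the shared host, the delay is in the hosting configuration rather than in Django.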