I am looking for a guide to migrating a Django project to Google App Engine and using Google's datastore. Most of the guides I found were linked to Django-Appengine using Django-nonrel (but I want to use GAE's native support).
Going through GAE getting started guide, it says:
Google App Engine supports any framework written in pure Python that speaks CGI (and any WSGI-compliant framework using a CGI adaptor), including Django, CherryPy, Pylons, web.py, and web2py. You can bundle a framework of your choosing with your application code by copying its code into your application directory.
I understand that I won't be able to use some features of Django in that case (mainly the admin) and would also need to restructure the models.
From other reading, I also found that the latest GAE SDK now includes Django 1.3 on Python 2.5.
I tried to put all files from my Django application to a GAE project, but couldn't get it all to work together.
Please point me to a basic guide I can follow to migrate my Django project to Google App Engine.
Thanks.
For an existing Django app, using django-nonrel is the simplest approach; it is very popular so you should be able to find help with specific errors you get quickly.
Another approach is written up in this article: http://code.google.com/appengine/articles/pure_django.html -- it goes the other way, taking an App Engine app that uses Django for dispatch, templates, and forms, but not for models, and describes how to make it run in a native Django environment. Maybe you can glean some useful hints for your situation from it.
I've used django-nonrel, which behaves pretty much like django, except that operations with JOINs will return errors. I've basically worked around this by avoiding ManyToMany fields, and essentially building that functionality manually with an intermediate table.
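The manual many-to-many workaround described above can be sketched in plain Python. This is only an illustration under stated assumptions: the record types (Author, Book, Authorship), the in-memory data, and the helper names are all hypothetical stand-ins for datastore entities, not django-nonrel or App Engine API. The point is the shape of the lookup: one filter on the intermediate records, then fetches by key, with no JOIN.

```python
from collections import namedtuple

# Hypothetical record types standing in for datastore entities.
Author = namedtuple("Author", ["id", "name"])
Book = namedtuple("Book", ["id", "title"])
# The intermediate "table": one record per (author, book) pair.
Authorship = namedtuple("Authorship", ["author_id", "book_id"])

AUTHORS = {1: Author(1, "Ann"), 2: Author(2, "Bob")}
BOOKS = {10: Book(10, "Datastore Basics"), 11: Book(11, "Django Forms")}
AUTHORSHIPS = [Authorship(1, 10), Authorship(1, 11), Authorship(2, 11)]


def books_for_author(author_id):
    """Emulate a many-to-many lookup with two simple steps and no JOIN."""
    # Step 1: filter the intermediate records on a single field.
    book_ids = [link.book_id for link in AUTHORSHIPS if link.author_id == author_id]
    # Step 2: fetch the related records by key.
    return [BOOKS[book_id] for book_id in book_ids]


def authors_for_book(book_id):
    """The reverse direction uses the same intermediate records."""
    author_ids = [link.author_id for link in AUTHORSHIPS if link.book_id == book_id]
    return [AUTHORS[author_id] for author_id in author_ids]
```

Each direction costs an extra round of lookups compared with a SQL JOIN, which is the trade-off to keep in mind when modelling relations this way.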
So far I've run into two problems with Django-nonrel:
1. No access to ancestor queries, which can be run in a transaction. There's a pending pull request for this feature though.
2. You can't specify fields that are not indexed. This could significantly increase your write costs. I have an idea to fix this, but I haven't done so yet.
(Edit: You CAN specify fields that are not indexed, and I've verified this works well).
2 (new). Google is pushing a new database backend called ndb that does automatic caching and batching, which will not be available with django-nonrel.
If you decide not to use django-nonrel, the main difference is that Django models do not run under App Engine. You'll have to rewrite your models to inherit from App Engine's db.Model. Forms that use Django's ModelForm will need to inherit from the ModelForm in google.appengine.ext.db.djangoforms instead. Once you're on App Engine, you'd have to port back to Django if you ever take your app somewhere else.
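To get a feel for the model rewrite, here is a rough planning helper. It is a sketch, not an official mapping: the table pairs common Django field class names with the google.appengine.ext.db property classes that usually replace them, and the function name plan_port is made up for this example. Fields mapped to None (such as ManyToManyField) have no direct counterpart and need to be redesigned by hand.

```python
# Django field class name -> google.appengine.ext.db property class name.
# None marks a field with no direct counterpart that needs a manual redesign.
FIELD_TO_PROPERTY = {
    "CharField": "StringProperty",
    "TextField": "TextProperty",
    "IntegerField": "IntegerProperty",
    "FloatField": "FloatProperty",
    "BooleanField": "BooleanProperty",
    "DateField": "DateProperty",
    "DateTimeField": "DateTimeProperty",
    "EmailField": "EmailProperty",
    "ForeignKey": "ReferenceProperty",
    "ManyToManyField": None,
}


def plan_port(django_fields):
    """Split a {field_name: django_field_class} dict into mapped and manual parts."""
    mapped, manual = {}, []
    for name, field_class in sorted(django_fields.items()):
        target = FIELD_TO_PROPERTY.get(field_class)
        if target is None:
            # Unknown or unsupported field types are flagged for manual work.
            manual.append(name)
        else:
            mapped[name] = target
    return mapped, manual
```

For example, a model with a CharField title and a ManyToManyField tags would come back with title mapped to StringProperty and tags listed under manual work.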
If you already have a Django application you might want to check this out. You won't work with App Engine's datastore but Google Cloud SQL might fit your needs.
I understand that full django can be used out of the box with CloudSQL. But I'm interested in using HRD. I'd like to learn more about what percentage of django can be used with nonrel. Does middleware work? How about other features of the framework like i18n, forms, etc. Also does nonrel work with NDB?
The background here is that I've been using webapp2, and before that webapp, and found them great until the project gets bigger. So for this project I'd like to reevaluate other options.
The big limitation is that the datastore doesn't do JOINs, so anything that uses JOINs, like many-to-many relations, won't work.
Any packages/middleware that uses many-to-many won't work, but others will.
For example, the sessions/auth middleware will work. But if you use permissions with auth, it won't. If you use the admin pages for auth, they use permissions, so you'll have some trouble with those too.
i18n works.
forms work.
nonrel does not work with ndb.
I don't know what you mean by "until your project gets bigger". django-nonrel won't help with the size of your app.
In my opinion there are two big reasons to use nonrel:
You're non-committal about App Engine. Nonrel potentially allows you to move to MongoDB as a backend.
You want to use django packages for "free". For example, I used tastypie for a REST API, and django-social-auth to get OAuth for FB/Twitter logins with very little effort. (On the flip side, with 1.7.0, they've addressed the REST API with endpoints)
According to this question:
Django on Google App Engine
The easiest way to get started with GAE/Django is with the Django non-rel bundle. However now that the latest Python/GAE SDK includes a build of Django, do we still need this?
What's the best practice for getting started with Django on GAE right now?
Thanks
Update: It seems that webapp2 is the easiest choice for new projects.
This guest article suggests that
"App Engine does come with some Django support, but this is mainly
only the templating and views."
non-rel still seems to be your best bet, although I'd caution you that, according to their blog, further development and/or maintenance may not happen.
Django's standard models don't have a backend that supports GAE's datastore. Hence you can't use Django models, and hence you can't use Django's model forms either. What you'd have to do is use models derived from GAE's Python db.Model. Instead of using Django's ModelForm class for forms, you would use google.appengine.ext.db.djangoforms. Note that this applies specifically to ModelForms; other forms work fine since they're not tied to the database.
I can think of two good reasons to use Django-nonrel:
1a) You have an existing project on Django. Using Django-nonrel would be the laziest way to go. Rewriting models to GAE's models isn't too hard, but it could be a small pain, especially if
1b) you use a lot of existing Django components, and you'd have to go through all of them to update the models and forms.
2) You want to hedge your bets against GAE. Using Django-nonrel will allow you to switch over to MongoDB with very little effort, since Django-nonrel has a functioning MongoDB backend. The current Django-nonrel maintainers seem to be more interested in MongoDB.
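One way to estimate how much of an existing project falls under points 1a and 1b is to count where models and model forms are used. The script below is a rough sketch under stated assumptions: the function names and the list of names it searches for are choices made for this example, and a plain text search will miss aliased imports and may count mentions in comments.

```python
import os
import re

# Names whose use usually means manual porting work on the native datastore API.
PATTERN = re.compile(r"\b(ManyToManyField|ModelForm|models\.Model)\b")


def scan_source(text):
    """Return a {name: count} dict of porting-relevant names found in `text`."""
    counts = {}
    for match in PATTERN.finditer(text):
        counts[match.group(1)] = counts.get(match.group(1), 0) + 1
    return counts


def scan_project(root):
    """Walk `root` and report counts for each Python file that has a match."""
    report = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if filename.endswith(".py"):
                path = os.path.join(dirpath, filename)
                with open(path) as handle:
                    counts = scan_source(handle.read())
                if counts:
                    report[path] = counts
    return report
```

Running scan_project on the project root gives a per-file picture of which components would need their models and forms rewritten.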
Having worked with Django-nonrel, I've so far run into some reasons why it may be a bad choice:
1) No support for ancestor queries. There's an outstanding pull request for this though. It won't be compatible with any other DB backend though.
2) ndb is coming out, and seems like it'll have a few more benefits, that likely won't see support on Django-nonrel.
If you do use GAE's native db API, the main benefit from Django would be the form validation. Otherwise, webapp2+jinja2+gae db.Models() would provide similar functionality to Django.
I have an application built with Django. Part of it relies on data that I aggregate from other websites. Wondering how I should approach building the scraper/aggregator.
The advantages I see of building it as a Django app are
the ability to use Django's models & database API
the ability to use Django's other methods
On the other hand I think the disadvantage would be scalability in the long run.
Should I build the scraper/aggregator as an app in my Django project or as a separate script that runs on its own?
Would love to hear your thoughts.
Neither of your points requires it to run within Django. And since it will not depend on the web/HTTP interface, having it be a separate module is the only option that makes sense.
I have just published a Django app, django-dynamic-scraper, on GitHub. It is built on top of the scraping framework Scrapy, and it lets you build Scrapy scrapers in the Django admin and use Django model classes to store your scraped data. Maybe this is of some use for people with similar problems.
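A minimal sketch of what such a separate module might look like, assuming Python 3 and only the standard library. The class and function names (LinkCollector, extract_links, run) and the record layout are hypothetical; fetching the pages and the storage callback are deliberately left to the caller, so the same module can be driven from a cron job or a management command.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (href, text) pairs from anchor tags."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside an anchor with an href.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html):
    """Return the list of (href, text) pairs found in an HTML string."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links


def run(pages, save):
    """Parse each page and hand every record to `save`; return the record count."""
    count = 0
    for html in pages:
        for href, text in extract_links(html):
            save({"url": href, "title": text})
            count += 1
    return count
```

The save callback is the only place that touches storage, so it could wrap a Django model's create call without the scraper itself depending on the web layer.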
If it's a django app, it will only run when someone loads the page. That could slow the loading.
Making another script could be a nicer idea but could produce inaccurate data.
I think it actually depends on the context.
I wanted to check the current status of running Django on Google App Engine, and what the benefits of running Django on GAE are over simply using webapp.
Django's main killer feature, IMHO, is the reusable apps and middleware. Unfortunately, most current Django apps use models or model forms (django-tags, django-reviews, django-profiles, Pinax apps).
So what are the remaining features or benefits that django has that can still run in Google App Engine (other than what's disabled: the popular django apps, session and authentication middleware, users and admin, models, etc).
Also, is there a list of the Django apps that work in App Engine as well?
app-engine-patch currently has most of Django working, including sessions, contrib.auth, sites, and some other standard Django apps. However, its main drawback (in my opinion) is that it relies on a zip file of a modified version of Django to achieve this, and the maintainers don't seem to have kept pace with current Django releases. The consensus of the past and present maintainers seems to be that this approach is too cumbersome to maintain, and therefore no one is currently maintaining it.
google-app-engine-django uses a monkey-patch approach on the latest Django version included in the production GAE runtime, so as long as Google continues to track Django releases you'll be kept up to date regarding Django. However, it has not yet fully ported contrib.auth, so you can only authenticate with Google accounts, which can be a big drawback depending on whether you want contrib.auth User models to work as you know them on SQL backends. There is also no Django admin support in the helper as there is in app-engine-patch. A fork of google-app-engine-django exists which adds in some of the contrib apps, such as flatpages, sites, and sitemaps. Also note that it only works on Django versions up to 1.1, until issue #3230 Django 1.2 is added to use_library, unless you upload Django as a zip file.
On the horizon, the original developer of app-engine-patch has been working on the django-nonrel branch, but this may be pretty far away from being included in a django release. This django developers thread has a lot of information about these efforts.
Separately, there is a google summer of code project working on integrating some aspects of nonrel db's.
app-engine-patch gets most of those things working inside AppEngine - so you can (mostly) use straight Modelforms, use the Django users and admin, etc.
I've only used it for fairly simple projects (being quite new to django), but they claim that most Django apps will work with (at most) minor modifications on appengine. For instance, app-engine-patch uses the AppEngine Model classes rather than the Django classes; and there are some of the basic views that are too inefficient to run on Appengine.
added: google-app-engine-django is similar, but provides a BaseModel that appears identical to Django's BaseModel. My understanding is that google-app-engine-django was released by Google, then forked to create app-engine-patch. The maintainers of app-engine-patch seem to have some different goals from the creators of google-app-engine-django, so you may find that one of the two suits your needs better than the other.
Google have provided some articles on running Django apps on appengine; the most recent is actually a guest post from the authors of app-engine-patch.
I've had the best success by simply picking and choosing the Django features that I need and patching them into webapp myself. In my latest project I actually just cut out the webapp stuff entirely. I still import and call several webapp utility functions, but it is mostly a hand rolled application built from the good parts of GAE and Django.
You might be interested to check out web2py, another Python framework that supposedly has less friction between GAE and a "normal" web server.
It is now quite easy to use full Django on GAE:
https://developers.google.com/appengine/articles/django-nonrel#ps
The Django version provided with App Engine has been updated to 1.2.5 with the latest SDK release (1.4.2, changelog). This version is available through the use_library() declaration, so you no longer need to mess around with monkey patching to the same extent.
The GoogleAppEngine (GAE) Python 2.7 runtime provides several third-party libraries that your application can use, in addition to the Python standard library, GAE tools, and GAE Python runtime environment. One of them is Django. The below is copied from the GAE docs page on third-party libraries:
To use Django in Python 2.7, specify the WSGI application and Django library in app.yaml:
...
handlers:
- url: /.*
  script: main.app  # a WSGI application in the main module's global scope
libraries:
- name: django
  version: "1.2"
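To make the script: main.app line concrete, here is a minimal sketch of a main.py that exposes a WSGI callable named app at module scope. The handler body is a placeholder written for this example, not Django code. In a Django project, app would instead be Django's own WSGI handler (for example django.core.handlers.wsgi.WSGIHandler(), with DJANGO_SETTINGS_MODULE set beforehand), assuming the project's settings module is importable.

```python
def app(environ, start_response):
    """A minimal WSGI application exposed as `app` in the main module."""
    # PATH_INFO holds the request path that the /.* handler pattern matched.
    body = ("Hello from " + environ.get("PATH_INFO", "/")).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

The runtime imports the main module and calls app for every request matched by the handler pattern, so anything that follows the WSGI calling convention can be plugged in here.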
How much of a pain is it to run a Django App on App Engine? Also, does the Datastore work as-is with Django?
I spent some time trying to answer the same question... it seems to me that the most difficult thing to transfer to GAE are django's models... in the sense that they require various modifications and rethinking, mainly because GAE's backend is not a standard relational DB, but google's BigTable. I found a nice intro to this here:
http://www.youtube.com/watch?v=rRCx9e38yr8
Anyways, it's worth downloading one of those 'patches' and have a go with it!
For me the best solution is the 'app-engine-patch'. I downloaded the sample project and it worked straightaway! (Mind that you need to have GAE's SDK installed separately) A killer-feature for me is the fact that the django-admin and many other classic django functionalities have been ported too!!!
http://code.google.com/p/app-engine-patch/
The documentation is still quite minimal in my opinion, but it's good enough to get you going. It'll help you to skim through the official GAE docs though!
Just yesterday (depending on your time zone), Google released a new SDK for Python on App Engine that supports Django 1.0 out of the box.
You need to use django-nonrel (source).
You will still find loads of issues:
Many2Many relations not supported
Fake joins increase number of queries
App Engine doesn't allow any python lib with socket or C dependencies (sentry, lxml...)
You can try to get early access to CloudSQL.
Otherwise, you are not constrained to use App Engine; you can think about using:
Heroku
Gondor
Both are cheaper, give you more control, and support requirements files for pip.
31.01.2012, Google released App Engine 1.6.2 that supports Django out-of-the box.
App Engine includes version 0.96 of Django out-of-the-box, but it is quite crippled.
App Engine Helper and app-engine-patch supposedly fix this problem to some degree, but I haven't tried either myself.
http://code.google.com/appengine/articles/appengine_helper_for_django.html
http://code.google.com/appengine/articles/app-engine-patch.html
The amount of pain depends on how much existing code you want to reuse. Unfortunately, because the Datastore does not support SQL, you often cannot just take any Django pluggable app and use it in your GAE project.
App-engine-patch http://code.google.com/p/app-engine-patch/ looks to be ahead of the other Django helpers in bringing the standard applications (Sites, ContentTypes, Flatpages) over to GAE. I have used app-engine-patch on several GAE projects, and once you understand how to port a Django SQL model to a Django GAE model and how to convert SQL to datastore queries, things can be done very quickly - but there is always a learning curve.
appengine-helper tries to bridge the Datastore gap by providing a model so you don't have to change your model superclasses, but I've found that you end up having to change ManyToMany relationships and any sql anyway, so the advantage ends up being minimal. ae-patch has a roadmap to try to provide an ae-datastore backend, but it probably won't happen for a while.
Google has now launched their Cloud SQL storage. That is actually MySQL 5.5 in the cloud. IMO that's a very nice way to migrate your Django app into the cloud. They have a free trial up to June 1, 2013.
If you need some tips on how to set up your Django project for App Engine and Cloud SQL, I've written a tutorial for that.