Django -- I have a small app ready, Should I go on private VPS or Google App Engine? - django

I have my first app, not that big, but it is the first step. (next big one on the way)
Now if I want to put it on my own Linode VPS, I have to configure mod_python or mod_wsgi, as well as memcache, Ngix, mySQL or Postgresql, etc. to make it work. If I put it GAE, All I have to do is convert the models to use GAE's API.
What I like about GAE is scaling. (if they can really do it)
Then I'd only worry about developing my apps and doing SEO work on them instead of worrying about load share/balance, cache, db / IO redundancy, etc.
I don't want to do any porting later on. (I have to decide now and stick with it)
So, if you have any experience on this, what do you recommend:
1- Use VPS(s) for everthing
2- Use VPS(s) plus Amazon S3
3- Use VPS(s) plus Amazon S3 & SimpleDB
4- Use GAE
Also: Would I be able to get away with not having JOIN rights when using the BigTable?
Note: I don't have any spatial need now, but for a location table I might need that later on.
I'd like to know what do you think!

There's business risk and technical risk.
Business risk is that you might have to move hosts later for some external reason. VPS's, EC2, etc require more upfront investment, but keep you independent. Tools like Chef can help with the configuration effort.
Technical risk is that your application may not be easily implemented on the platform. Since most VPS options allow you to install arbitrary software, they minimize this, again at the cost of more configuration effort on your part. AFAIK, the largest constraint GAE enforces on you is it's difficult to do long running background tasks. (Working without JOINs and other aspects of de-normalized data requires a different way of thinking, but this approach is fairly common in web applications no matter where they run once the SQL database is larger than a single host can support.)
If you can live with both these risks, GAE would appear to save you a substantial amount of effort. If you cannot live with these risks, you should tailor your own environment.
As an aside, I find S3 to be worth it no matter your environment. It's far simpler than ensuring your local server static file storage is reliably backed up, and you never have to worry about capacity. It's best if you use it for data that is uploaded but rarely overwritten or deleted (think facebook photo albums).

I don't want to do any porting later on. (I have to decide now and stick with it)
If that's the case, wouldn't you prefer to control deployment from the outset? It could be a great pain to port back from GAE later down the line if you hit its limits (whether they be technological limits or simply business decisions by Google that run counter to your plans for the future of your app).
Also configuring mod_wsgi, installing postgres etc. isn't particularly difficult, and you don't have to worry about things like load balancing and db redundancy for a while yet.
If it were me, I'd prefer the long-term certainty of a traditional server over the quick win of GAE. It all depends on your vision for the app, however.

I may be biased, but if you can live with GAE's limitations it really saves you a lot of work and worry about system administration issues (and to some extent scaling) -- plus, it's free as long as your resource consumption is low (basically meaning your traffic is low).
Can you do without joins? I don't know, as I don't know your app -- I'm a SQL fanatic, myself, yet for simple enough needs I haven't found it too hard to adapt. As I see it, the main limitation of non-relational DBs is that they're nowhere as nice as relational ones for "ad hoc" queries... you typically have to write a lot of procedural code instead of a nice SELECT or two:-(. But, that's more of a "data mining later" issue than one connected with serving your web app -- probably best solved by regularly bulk-downloading data from the web app's online storage to a "data warehouse" kind of setup, anyway, even if such storage was relational in the first place;-).

Before deciding, it might be worth a quick prototype adaptation of your app to GAE. You might run into stoppers that force the decision. Possible stopper issues include
Your schema doesn't make the transition to BigTable
You're depending on some C-based library that GAE doesn't support
You have a few long-running requests that exceed the thresholds that GAE imposes

The answer depends on the complexity and nature of your model layer, really. If it's complex or tightly bound to the rest of your code, porting is likely to be a significant effort. If it's fairly straightforward, or easy to tear out and replace, I would say go for it.
These days, I mostly write new code for GAE, but the fact that I can simply deploy with a single command has really lowered the barrier I feel towards writing cool new apps. Not having to worry about deployment and hosting is quite liberating.

All I have to do is convert the models to use GAE's API.
I am sorry, you are totally mistaken.
You also need to rewrite all the views code that uses the ORM. There are no joins. So you have to deal with and write a lot of procedural code instead of the nifty SQL that provides U whatever you want.
Querying is slow. You need to override save method of each model to store additional information of that model which may take a lot of time to compute when need. You also need to work on memcache to make the queries fast enough.
And then, Guido has said Django 1.1 is going to be included in a future version of Appengine. I am hoping they will have an out of the box generic ORM to BigTable mapper.
That said, if your app is simple without many joins needed, you could use the appengine patch project to use the current version of django on Appengine. Here is how.

Related

Django project-apps: What's your approach about implementing a real database scheme?

I've read articles and posts about what a project and an app is for Django, and basically end up using the typical example of Pool and Users, however a real program generally use a complex relational database, therefore its design gravitates around this RDB; and the eternal conflict raises once again about: which ones to consider an application and which one to consider components of that application?
Let's take as an example this RDB (courtesy of Visual Paradigm):
I could consider the whole set as an application or to consider every entity as an application, the outlook looks gray. The only thing I'm sure is about this:
$ django-admin startproject movie_rental
So I wish to learn from the expertise of all of you: What approach (not necessarily those mentioned before) would you use to create applications based on this RDB for a Django project?
Thanks in advance.
PS1: MORE DETAILS RELATED ABOUT MY REQUEST
When programming something I follow this steps:
Understand the context what you are going to program about,
Identify the main actors and objects in this context,
If needed, make an UML diagram,
Design a solid-relational-database diagram, (solid=constraints, triggers, procedures, etc.)
Create the relational database,
Start coding... suffer and enjoy
When I learn something new I hope they follow these same steps to understand where they want to go with their actions.
When reading articles and posts (and viewing videos), almost all of them omit the steps 1 to 5 (because they choose simple demo apps), and when programming they take the easy route, and don't show other situations or the many supposed features that Django offers (reusability, pluggability, etc).
When doing this request, I wish to know what criteria is used for experienced programmers in Django to determine what applications to create based on this sample RDB diagram.
With the (2) answers obtained so far, "application" for...
brandonris1 is about features/services
Jeff Hui is about implementing entities of a DB
James Bennett is about every action on a object, he likes doing a lot of apps
Conclusion so far: Django application is a personal creed.
My initial request was about creating applications, but as models are mentioned, I have this another question: is with a legacy relational database (as showed in the picture) possible to create a Django project with multiple apps? this is because in every Django demo project showed, every app created has a model with their own tables, giving the impression that tables do not interact with those of other applications.
I hope my request is more clear. Thanks again for your help.
It seems you are trying to decide between building a single monolithic application vs microservices. Both approaches have their pros and cons.
For example, a single monolithic application is a good solution if you have a small amount of support resources and do not need to be able to develop new features in fast sprints across the different areas of the application (i.e. Film Management Features vs Staff Management Features)
One major downside to large monolithic applications is that eventually their feature sets grow too large and with each new feature, you have a significant amount of regression testing which will need to be done to ensure there aren't any negative repercussions in other areas of the application.
Your other option is to go with a microservice strategy. In this case, you would divide these entities amongst a series of smaller services and provide them each methods to integrate/communicate with each other (APIs).
Example:
- Film Service
- Customer Service
- Staff Service
The benefits of this approach is it allows you to separate capabilities and features by specific service areas thus reducing risk and regression testing across the application when new features are deployed or there is a catastrophic issue (i.e. DB goes down).
The downside to this approach is that under true microservice architecture, all resources are separated therefore you need to have unique resources (ie Databases, servers) for each service thus increasing your operating cost.
Either of these options is a good option but is totally dependent on your support model and expected volumes. Hope this helps.
ADDITIONAL DETAIL:
After reading through your additional details, since this DB already exists and my assumption is that you cannot migrate it, you still have the same choice as to whether or not you follow a monolithic application or a microservices architecture.
For both approaches, you would need to connect your django webapp the the specific DB you are already using. I can't speak for every connector out there but I know that the MySQL connector allows django to read from the pre-existing db to systematically generate the models.py file for the application. As a part of that connector, there is a model variable which allows you to define whether or not Django is responsible for actually managing the DB tables themselves.
The only thing this changes from an architecture perspective is how many times do you want to code this connection?
If you only want to do it once and completely comply with the DRY method, you can build a monolithic application knowing that as new features become required, application wide regression testing will be an absolute requirement.
If you want ultimate flexibility for future changes with this collection of features and don't mind recoding the migration across multiple apps while reducing the need for application wide regression testing as new features become required, a microservice architecture strategy is more appropriate.

SOA performance in a webapp

I'm struggling with the decision between a traditional backend (let's say a Django instance managing everything) and a service oriented architecture for a web app resembling LinkedIn. What I mean with SOA is having a completely independent data access interface - let's say Ruby + Sinatra - that queries the database, an independent chat application - Twisted - which is used via its API, a Django web server that uses those APIs for serving the content, etc.
I can see the advantages of having everything in the project modularized and accesed only via APIs: scalability, testing, etc. However, wouldn't this undermine the site performance? I imagine all modules would communicate via HTTP requests so wouldn't this arquitecture add a lot of latency to basically everything in the site? Is there a better alternative than HTTP?
Secondly, regarding development ease, would this really add much complexity to our developers? Specially during the first phase until we get an MVP.
Edit: We're a small team of two devs and a designer but we have no deadlines so we can handle a bit of extra work if it brings more technical value
Short answer, yes, SOA definitely trades encapsulation and reusability for latency. Long answer, it depends (as it always does) on how you do it.
How much latency affects your application is directly proportional to how fine-grained your services are. If you make very fine-grained services, you will have to make hundreds of sequential calls to assemble a user experience. If you make extremely coarse-grained services, you will not get any reusability out of your services; as they will be too tightly coupled to your application.
There are alternatives to HTTP, but if you are going to use something customized, you need to ask yourself, why are you using services at all? Why don't you just use libraries, and avoid the network layer completely?
You are definitely adding costs and complexities to your project by starting with an API. This has to balanced by the flexibility an API gives you. It might be a situation where you would benefit from internally structuring APIs to your code-base, but just invoking them as modules. Or building libraries instead of stand-alone APIs.
A lot of this depends on how big your project is. Are you a team of 1-3 devs cranking to get out your MVP? Or are you an enterprise with 20-100 devs that all need to figure out a way to divide up a project without stepping on each other?

using SQLite in Django in production?

Sorry for this question, I dont know if i've understood the concept, but SQLite is Serverless, this means the database in in a local machine, and it's stored in one file, this file is only accessible on one mode: if one client reads it, it's made only for reading mode for other clients, and if a client writes, then all clients have the write mode, so only in one mode at once!
so imagine that i've made a django application, a blog for example; then how is this made using sqlite? since if a client enters to the blog he gots the reading mode to see the page and the blog entries, and if a registred client tries to add a comment then the file will be made as write mode, so how can sqlite handle this?
so, does SQLite is here just like the BaseHTTPServer (the server shipped with django), for testing and learning purpose?
Different databases manage concurrency in different ways, but in sqlite, the method used is a global database-level lock. Only one thread or process can make changes to a sqlite database at a time; all other, concurrent processes will be forced to wait until the currently running process has finished.
As your number of users grows; sqlite's simple locking strategy will lead to increasingly great lock contention, and you will need to migrate your data to another database, such as MySQL (Which can do row level locking, at least with InnoDB engine) or PostgreSQL (Which uses Multiversion Concurrency Control). If you anticipate that you will get a substantial number of users (on the level of say, more than 1 request per second for a good part of the day), you should migrate off of sqlite; and the sooner you do so, the easier it will be.
SQLite is not like BaseHTTPServer or anything basic like that. It's a fully featured embedded database. Quite fast too. Its SQL language might not have the most bells and whistles, but it's flexible enough. I haven't run into cases where I needed something it cannot do for the projects I was involved in (which aren't your typical web apps, truth be told).
Anyone that claims SQLite is good or bad for production without discussing the actual design is not telling you much. SQLite is pretty fast. In some cases, literally orders of magnitude faster than, say, Postgres, which comes up as a go-to alternative among Djangonauts. As someone pointed out, it also supports lots of concurrency. It's a matter of whether your app falls under the 'some cases' or not.
Now, there is one significant factor that has to be taken into account. SQLite is an in-process database. This is really important. If you are using something like gevent, you may run into edge cases where your app breaks. E.g., trying to do a transaction where you have a context switch in middle of it can possibly break the transaction in horrible ways. In other words, 'concurrency' really depends on your app, because SQLite is part of your app.
What you can't do with SQLite, though, in terms of scaling, is you can't make clusters of SQLite servers like you can with some of the other database engines, because it's in-process. Your app may or may not need to go to such lengths in terms of scaling, but my guess is that vast majority of apps out there don't anyway (wild guess).
On the other hand, being in-process means adding custom functions and aggregates to it is pretty trivial. I'm not sure if Django's ORM makes that any more difficult than it has to be, but you can come up with pretty good designs taking advantage of those features.
This issue in database theory is called concurrency and SQLite does support it in Windows versions > Win98 and elsewhere according to the FAQ:
http://www.sqlite.org/faq.html#q5
We are aware of no other embedded SQL database engine that supports as
much concurrency as SQLite. SQLite allows multiple processes to have
the database file open at once, and for multiple processes to read the
database at once. When any process wants to write, it must lock the
entire database file for the duration of its update. But that normally
only takes a few milliseconds. Other processes just wait on the writer
to finish then continue about their business. Other embedded SQL
database engines typically only allow a single process to connect to
the database at once.
Basically, do not worry about concurrency, any database worth its salt takes care of just fine. More information on as how SQLite3 manages this can be found here. You, as a developer, not a database designer, needn't care about it unless you are interested in the inner-workings.
SQLite will only work effectively in production from some specific situations. It's quite easy to get MySQL or PostgreSQL up and running, even on Windows, and have a database that works in most situations.
The real problem is that SQLite3 isn't threaded in Django so only one PAGE view can happen at a time on your server, see this bug https://code.djangoproject.com/ticket/12118 Fixed
I don't use SQLite3 even in development.
EDIT: I keep getting downvoted here but the Django documentation itself recommended not using SQLite3 in Production at the time I wrote this answer. The documentation still contains the following caveat:
SQLite provides an excellent development alternative for applications that are predominantly read-only or require a smaller installation footprint.
If you do not have a small foot print/read-only Django instance, do NOT use SQLite3. Feel free to continue to downvote this answer.
It is not impossible to use Django with Sqlite as database in production, primarily depending on your website/webapp traffic and how hard you hit your db (alongside what kind of operations you perform on it i.e. reads/writes/etc). In fact, approaching end of 2019, I have used it in several low volume applications with less than 5k daily interactions (these are more common than you might think).
Simply put for the current state of tech , at the moment Sqlite-3 supports unlimited concurrent reads (or as far as your machine / workers can handle), BUT only a single process can write to it at any point in time. Bear in mind, a well designed query/ops to the db will last only miliseconds!
Coming from experience in using sqlite as the only db for simple non-routine (by non-routine i mean that a typical user would not be using this app on a daily basis year-round) production web app for overseas job matching that deal with ~5000 registered students (stats show consistently less than 2k requests per day that involves hitting the database during peak season - 40% write 60% read), I've had no problems whatsoever with timeouts/performance issues.
It really boils down to being pragmatic about the development and the URS (client spec). If it becomes the next unicorn , one can always migrate the SQLITE to another RDBMS. For instance, see David d C e Freitas's take on migration in Quick easy way to migrate SQLite3 to MySQL?
Additionally the SQLITE website uses sqlite db at its backend .. see below...
The SQLite website (https://www.sqlite.org/) uses SQLite itself, of course, and as of this writing (2015) it handles about 400K to 500K HTTP requests per day, about 15-20% of which are dynamic pages touching the database. Dynamic content uses about 200 SQL statements per webpage. This setup runs on a single VM that shares a physical server with 23 others and yet still keeps the load average below 0.1 most of the time.
Bear in mind that the above quote is of course mainly referring to read operations, so the values may not be a applicable for write-heavy sites.
The example I gave above on the job matching application I built using sqlite as db is quite write heavy if you've noticed the numbers ... on average, 40% are short lived write operations (i.e. form submissions, etc etc) but bear in mind my volume hitting the db is only 2k per day during peak season.
Then again, if you realize that your sqlite.db is causing alot of timeout and bad user experience (408 !!! on form submission...), especially with Django throwing the OperationalError: database is locked error. (and then they have to key in the whole thing again)...You can always increase the timeout in your settings.py as per django docs as a temporary solution while you prepare for migrating the db.
'OPTIONS': {
# ...
'timeout': 20,
# ...
}
Again, it all boils down to pragmatic development and facing reality that the site may not attract as much activity as hoped , and is prone to over-engineering from the get-go.
There are many times that going for a simple solution enables faster time to market , essentially, to quickly test waters , and of course, be prepared If the piranhas do come in swarms and then its time to upgrade to another RDBMS.
With Django's ORM, for most cases you dont need to touch your models.py during migration to other supported sql db. Be VERY mindfull though that Sqlite does not support some more advanced functions or even fields that its bigger cousins MYSQL and POSTGRES do.
Late to the party, but the question is still relavant as of mid 2018.
"Client" of a blog site is a different term that a "database client". SQLite documentation refers to a client as a process opening a database file. Such process, say a django app, may handle many web app clients ("users") simultaneously and it still is going to be just one client from the standpoint of SQLiite.
The important consideration for choosing SQLite over proper RDBMS is whether your architecture is comprised of more than one software component connecting to a database. In such case, using SQLite may be a major performance bottleneck due to the fact that each app needs to access the same DB file, possibly over a network.
If multiple apps(database clients) is not the case, SQLite is a great production choice in 99% of cases. The remaining 1% is apps using specific DB features, apps under enormous load, etc.
Know your architecture.
The anwer to this question depends on the application that you want to deploy in production:
According to the how to use from the SQLite website, SQLite works great in production as the database engine for most website having low to medium traffic (which is to say, most websites).
They argue that the amount of web traffic that SQLite can handle depends on how heavily you use the database of your website. It is known that any site that gets fewer than 100K hits/day should work fine with SQLite. However, this 100K hits/day figure is a conservative estimate, not a hard upper bound.
In summary, SQLite might be a great choice for applications with fewer users and databases uses. Thus, use SQLite for website with fewer or medium interactions with the database and MySQL or PostgreSQL for website with higher interactions with the database.
Reference: sqlite.org

Django hosting on ep.io

is there someone who has expirience in hosting django applications on ep.io?
Waht are the pros/cons on it?
I'm currently using ep.io, I'm still in development with my app but I have an app deployed and running.
When you use a service like this you go into it knowing that it isn't going to be the perfect solution for every case. Knowing the pros and cons before hand will help set your expectations so that you aren't disappointed later on.
ep.io is still very young and I believe still in beta, and isn't available to the general public. To be totally fair to them, it is still a work in progress and some of these pros and cons may change as they roll out new features. I will try and come back and update this post as the new versions become available, and my experience with the service continues.
So far I am really pleased with what they have, they took the most annoying part of developing an application and made it better. If you have a simple blog app, it should be a breeze to deploy it, and probably not cost that much to host.
Pros:
Server Management: You don't have to worry about your server setup at all, it handles everything for you. With a VPS, you would need to worry about making sure the server is up to date with security patches, and all that fun stuff, with this, you don't worry about anything, they take care of all that for you.
deployment: It makes deploying an app and having it up and running really quickly. deploying a new version of an app is a piece of cake, I just need to run one maybe two commands, and it handles everything for me.
Pricing: you are only charged for what you use, so if you have a very low traffic website, it might not cost you anything at all.
Scaling: They handle scaling and load balancing for you out of the box, no need for you to worry about that. You still need to write your application so that it can scale efficiently, but if you do, they will handle the rest.
Background tasks: They have support for cronjobs as well as background workers using celery.
Customer support: I had a few questions, sent them an email, and had an answer really fast, they have been great, so much better then I would have expected. If you run your own VPS, you really don't have anyone to talk to, so this is a major plus.
Cons:
DB access: You don't have direct access to the database, you can get to the psql shell, but you can't connect an external client gui. This makes doing somethings a little more difficult or slow. But you can still use the django admin or fixtures to do a lot of things.
Limited services available: It currently only supports Postgresql and redis, so if you want to use MySQL, memcached, mongodb,etc you are out of luck.
low level c libs: You can't install any dependencies that you want, similar to google app engine, they have some of the common c libs installed already, and if you want something different that isn't already installed you will need to contact them to get it added. http://www.ep.io/docs/runtime/#python-libraries
email: You can't send or recieve email, which means you will need to depend on a 3rd party for that, which is probably good practice anyway, but it just means more money.
file system: You have a more limited file system available to you, and because of the distributed nature of the system you will need to be very careful when working from files. You can't (unless i missed it) connect to your account via (s)ftp to upload files, you will need to connect via the ep.io command line tool and either do an rsync or a push of a repo to get files up there.
Update: for more info see my blog post on my experiences with ep.io : http://kencochrane.net/blog/2011/04/my-experiences-with-epio/
Update: Epio closed down on May 31st 2012

Struts2 or Django for GAE and future scalability

I am developing location based service. FYI, the database will expand vastly as time and location are the variables. I am considering GAE for initial deployment. I am open for any of python or java based development. While calculating the scalability, I am getting confused. I never thought of scalability before as I haven't worked on big projects. Also I am considering the fact that may be I will have to change hosting in near future for more flexibility.
Considering this situation, what should I start with? Struts2? or Django? Will there be a big difference in terms of development time?
Do you know already know Java or Python? If you are proficient in one and not the other, you might want to use what you know. If you are unfamiliar with both, and particularly programming in general, I think Python would be much easier to learn. But this is very subjective.
GAE is a good platform for some applications. If you are, for example, frequently reporting a location from a mobile device (like a phone), I think GAE would be a good fit. But I would not use django to handle such requests; Instead use the 'lightest' possible framework to record the data (probably webapp (Python) or the low-level datastore API (java)).
Keep in mind the limitations on queries in GAE. There are no JOINS, you'll need to denormalize. You can use inequality filters on one property at a time, so for proximity queries you'll need a technique like GeoBoxes. If you can work around those limitations, App Engine has a lot to offer.