Django development environment with SimpleDB - django

If I want to write a django app with Amazon SimpleDB, should I install a local SimpleDB server in my development environment? If so, is there a good one around? simpledb-dev seems to be no longer maintained. Or, should I access the DB on the cloud directly?

I would access SimpleDB directly; just point to a different account or a different set of domains.
By the way, I don't think there are any "local SimpleDB" servers. You would have to write your own test implementation which sounds like a nightmare, but then again maybe you have a lot of free time.
Also, you are probably going to want to actually get a feel for SimpleDB which will be easier if you are using the real thing.
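As a rough sketch of the "different set of domains" idea, one option is to derive the domain name from the current environment. The `APP_ENV` variable, the `domain_name` helper and the prefix scheme are all assumptions made for illustration; they are not part of SimpleDB or Django.

```python
import os

# Hypothetical setting: which environment this process runs in.
ENV = os.environ.get("APP_ENV", "dev")


def domain_name(base, env=ENV):
    """Return the SimpleDB domain to use for a logical table name."""
    # Production keeps the bare name; every other environment gets a prefix,
    # so development data never lands in the production domains.
    return base if env == "prod" else f"{env}_{base}"
```

With this, `domain_name("users", "dev")` gives `dev_users`, while production code keeps using `users`.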

Related

Django website accessible to others just for testing

Right now the website is running locally and I'm still working on it.
While doing this I also have to make it visible to a specific group of users as I need their feedback in order to add/change features, etc.
I've tried to find a free web host without any luck (see dependencies).
I was thinking of creating a VPN, but then I would have to use my PC as a host for a virtual machine, which is far from what I'm looking for.
Therefore, my questions are:
1. What is the fastest and easiest way to achieve this (website visibility for TESTING)?
2. If a dedicated web host is the best solution, please point me to an easy-to-use and cheap one. What I've tried so far: elastichosts, alwaysdata, stackable, 1FreeHosting and probably others I don't remember right now. For one reason or another I couldn't use any of the above.
Another aspect to be considered: I want this only for simple testing and I don't need a lot of server resources. Also the traffic will be very low as there are only 5 testers. That's why I wouldn't pay too much for it. I will probably need this temporary web hosting for 2-3 months.
Dependencies:
- as the website uses mezzanine, for the moment I only need mezzanine's dependencies.
Thanks in advance!
You can always just set up port forwarding on your router. This would allow your testers direct access to your app, though it might give your PC more exposure than you want.
Heroku has a free tier.
Among the non-free options, an instance at Linode costs $20/month but requires some setup. Rackspace has similar options in their cloud servers line. Both are no-contract servers.
My blogpost covers gracefully deploying a Mezzanine site. The monthly hosting cost is nothing compared to the cost of a slow, painful deployment process.
An EC2 micro-instance right now costs as little as ~US$3.50/month. I create and destroy staging servers on EC2 for testing and sharing with others.

Best database solution for Django on AWS

Good morning,
I am currently looking at deploying a Django app on an EC2 instance, but everything is getting too confusing for me! I understand that Django has built-in support for MySQL, PostgreSQL, and SQLite. Now, Amazon has RDS (MySQL), SimpleDB and DynamoDB. Do you guys have any recommendation on what should be used? I want something that is scalable for the future and bulletproof. AWS provides a Python API for SimpleDB and DynamoDB. Will that work nicely with Django?
Thanks a lot!
EDIT: I would rather focus on an overall solution that is bulletproof, efficient and fast, and not too complicated. As I plan for more people to work on the system, I don't really want something that is complicated and hard to maintain. I would rather spend more time implementing and installing things if, in the end, the solution is faster and easy to understand and work with (i.e., querying the DB is straightforward, with no hacks).
SimpleDB and DynamoDB are NoSQL, so you'll need django-nonrel to work with them, with no guarantee that everything will work. But if you do need NoSQL, there are some third-party modules for Django.
RDS is MySQL, so you can use Django's default MySQL driver, the ORM, the admin, and so on. It seems a good solution, but you can't tweak or update these MySQL instances.
If your DB is not big and heavy yet, you can set up a local MySQL instance on your EC2 server and move it to RDS when you need to grow.
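A minimal sketch of what the MySQL route might look like in `settings.py`, assuming connection details come from environment variables (the variable names and defaults here are made up for illustration). Moving from a local MySQL instance to RDS would then mostly be a matter of changing `HOST` to the RDS endpoint.

```python
import os

# Django database settings using the built-in MySQL backend.
# The DB_* environment variable names are assumptions, not a Django convention.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.mysql",
        "NAME": os.environ.get("DB_NAME", "myapp"),
        "USER": os.environ.get("DB_USER", "myapp"),
        "PASSWORD": os.environ.get("DB_PASSWORD", ""),
        # "localhost" for MySQL on the EC2 instance itself;
        # set DB_HOST to the RDS endpoint hostname after migrating.
        "HOST": os.environ.get("DB_HOST", "localhost"),
        "PORT": os.environ.get("DB_PORT", "3306"),
    }
}
```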

Django redundancy and replication over two VPS accounts

I'm slowly getting into the position where one of my Django sites needs some robustness behind it. I'm currently running on a single VPS on a SQLite database with memcached. It's about as un-scaled as things can get.
If I bought another VPS account, what would I want to do?
Move to MySQL/PostgreSQL with replication? What's easiest? Does replication protect me from one server exploding? Are there concurrency downsides?
How do I load-balance between the two servers?
I'd put memcached on the new server too. If I put both IPs into the configuration, would that keep a copy of data on both servers? (I'm thinking of what happens to session data - currently stored in memcached)
I'm currently using Cherokee as the httpd - I'm sure this has its own set of issues. If you've any tips, let me know.
Am I going at this the wrong way? Is there an easier way to have faster, more robust django sites?
First step: switch from SQLite to a real production database (I like Postgres). This should happen long before you even think about a second VPS. SQLite handles concurrent writes very poorly, since it allows only one writer at a time. Personally, I wouldn't even consider deploying a live site on SQLite in the first place.
If your site is running on SQLite and is functioning, my guess is you are still quite a long way from actually outgrowing your single VPS (unless it's already heavily loaded otherwise).
If/when you do need to add a second server, how you configure things depends on where you're actually seeing a bottleneck. Chances are it'll be the database, in which case a good step might be simply moving the database onto its own server (presuming you can guarantee low latency between the two VPSes) and loading the database server with as much RAM as you can afford. In general disk performance suffers most in a VPS, so another step to consider might be putting the DB onto raw metal.
I'd probably look at those steps before I'd think about DB replication or multiple web-tier servers, but it really depends on profiling your actual case (and how you value performance vs reliability).
Watching the Django Deployment Workshop by Jacob Kaplan-Moss should give you a good overview.
MySQL supports master-slave and master-master setups; I don't use PostgreSQL.
You can use nginx as your load balancer; HAProxy is an option, too (Stack Overflow uses it).
Memcached distributes the objects over the servers rather than copying them to each one. If one server crashes, the data it held is lost.
I don't know Cherokee, but nginx is great.
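To illustrate the memcached point above: clients typically pick one server per key by hashing the key, so listing two servers spreads the data instead of duplicating it. The sketch below uses simple modulo hashing and made-up addresses; real client libraries may use a different scheme (for example consistent hashing), so treat it as an illustration only.

```python
import zlib

# Hypothetical server addresses, for illustration only.
SERVERS = ["10.0.0.1:11211", "10.0.0.2:11211"]


def server_for(key, servers=SERVERS):
    """Pick exactly one server for a key by hashing the key."""
    index = zlib.crc32(key.encode("utf-8")) % len(servers)
    return servers[index]
```

Each session key maps to a single server, which is why sessions stored only in memcached disappear when that server goes down.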

Should I use CouchDB or SimpleDB?

I'm creating an application that will be hosted on amazon EC2 and a lot of the data that'll be saved is more document oriented (as well as saving tweets and such related to those documents).
Right now I'm at a crossroads... should I use SimpleDB or CouchDB? What are the pros/cons of using either? Should I just try both for a month and decide then?
You may find the article Amazon SimpleDB and CouchDB Compared to be useful.
I've also found that MongoDB gives excellent performance.
Keep in mind that if your code lives in EC2, SimpleDB will be presumably hosted in the same data center that your code is, which would give SimpleDB a lower latency than CouchDB for requests from an EC2 server. Also, Amazon doesn't charge you bandwidth costs between EC2 and SimpleDB.
I would expect SimpleDB to be both faster and cheaper for code running in EC2, for those reasons.
SimpleDB is hosted and maintained by Amazon for you, CouchDB is all up to you. That's the big difference.
I would absolutely benchmark the two solutions with your own use case, if that's possible, i.e. if you can build a reasonable subset of your application to run on either database (they have quite different APIs, so this might not be easy).
If you develop in a .NET environment, there's an excellent library for SimpleDB called Simple Savant which really eases the integration.
I've built some live solutions using SimpleDB and it works very well, especially with a caching layer in front of it (cf memcached et al). However I've recently started scoping out a new project and have decided to move to CouchDB for the primary reason of having control over the data.
As your commitment to SimpleDB grows, it gets harder and harder to migrate away to anything else (ah the joys of vendor lock in) and, frankly, that just isn't great for our business.
I remain a strong evangelist of cloud tech, and Amazon in particular, but I feel a lot better running couchdb on EC2 than I do with SimpleDB.
Roger
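The "caching layer in front of it" mentioned above is usually a cache-aside read. A minimal sketch, assuming a plain dict standing in for memcached and a placeholder `fetch_from_simpledb` function (both hypothetical, not real SimpleDB or memcached APIs):

```python
# Stand-in for memcached; a real setup would use a memcached client instead.
cache = {}


def fetch_from_simpledb(item_name):
    # Hypothetical placeholder for a real (slow, billed) SimpleDB read.
    return {"name": item_name}


def get_item(item_name, loader=fetch_from_simpledb):
    """Cache-aside read: try the cache first, fall back to the data store."""
    if item_name in cache:
        return cache[item_name]
    value = loader(item_name)
    cache[item_name] = value  # populate the cache for later reads
    return value
```

Only the first read of an item reaches the data store; repeated reads are served from the cache until the entry is evicted or invalidated.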

Django -- I have a small app ready, Should I go on private VPS or Google App Engine?

I have my first app, not that big, but it is the first step. (next big one on the way)
Now if I want to put it on my own Linode VPS, I have to configure mod_python or mod_wsgi, as well as memcached, Nginx, MySQL or PostgreSQL, etc. to make it work. If I put it on GAE, all I have to do is convert the models to use GAE's API.
What I like about GAE is scaling. (if they can really do it)
Then I'd only worry about developing my apps and doing SEO work on them instead of worrying about load share/balance, cache, db / IO redundancy, etc.
I don't want to do any porting later on. (I have to decide now and stick with it)
So, if you have any experience on this, what do you recommend:
1- Use VPS(s) for everything
2- Use VPS(s) plus Amazon S3
3- Use VPS(s) plus Amazon S3 & SimpleDB
4- Use GAE
Also: Would I be able to get away with not having JOIN rights when using the BigTable?
Note: I don't have any spatial need now, but for a location table I might need that later on.
I'd like to know what you think!
There's business risk and technical risk.
Business risk is that you might have to move hosts later for some external reason. VPS's, EC2, etc require more upfront investment, but keep you independent. Tools like Chef can help with the configuration effort.
Technical risk is that your application may not be easily implemented on the platform. Since most VPS options allow you to install arbitrary software, they minimize this, again at the cost of more configuration effort on your part. AFAIK, the largest constraint GAE enforces on you is it's difficult to do long running background tasks. (Working without JOINs and other aspects of de-normalized data requires a different way of thinking, but this approach is fairly common in web applications no matter where they run once the SQL database is larger than a single host can support.)
If you can live with both these risks, GAE would appear to save you a substantial amount of effort. If you cannot live with these risks, you should tailor your own environment.
As an aside, I find S3 to be worth it no matter your environment. It's far simpler than ensuring your local server static file storage is reliably backed up, and you never have to worry about capacity. It's best if you use it for data that is uploaded but rarely overwritten or deleted (think facebook photo albums).
"I don't want to do any porting later on. (I have to decide now and stick with it)"
If that's the case, wouldn't you prefer to control deployment from the outset? It could be a great pain to port back from GAE later down the line if you hit its limits (whether they be technological limits or simply business decisions by Google that run counter to your plans for the future of your app).
Also configuring mod_wsgi, installing postgres etc. isn't particularly difficult, and you don't have to worry about things like load balancing and db redundancy for a while yet.
If it were me, I'd prefer the long-term certainty of a traditional server over the quick win of GAE. It all depends on your vision for the app, however.
I may be biased, but if you can live with GAE's limitations it really saves you a lot of work and worry about system administration issues (and to some extent scaling) -- plus, it's free as long as your resource consumption is low (basically meaning your traffic is low).
Can you do without joins? I don't know, as I don't know your app -- I'm a SQL fanatic, myself, yet for simple enough needs I haven't found it too hard to adapt. As I see it, the main limitation of non-relational DBs is that they're nowhere near as nice as relational ones for "ad hoc" queries... you typically have to write a lot of procedural code instead of a nice SELECT or two:-(. But, that's more of a "data mining later" issue than one connected with serving your web app -- probably best solved by regularly bulk-downloading data from the web app's online storage to a "data warehouse" kind of setup, anyway, even if such storage was relational in the first place;-).
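To make "procedural code instead of a nice SELECT" concrete, here is a small sketch of a join done by hand in application code. The `authors` and `posts` records are invented sample data, and plain dicts stand in for datastore entities.

```python
# Invented sample data standing in for two datastore "tables".
authors = [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bob"}]
posts = [
    {"author_id": 1, "title": "Hello"},
    {"author_id": 2, "title": "Hi"},
    {"author_id": 1, "title": "Again"},
]


def posts_with_author(posts, authors):
    """Hand-written equivalent of:
    SELECT posts.title, authors.name
    FROM posts JOIN authors ON posts.author_id = authors.id
    """
    by_id = {a["id"]: a for a in authors}  # index one side by key
    return [
        (p["title"], by_id[p["author_id"]]["name"])
        for p in posts
        if p["author_id"] in by_id  # inner join: skip unmatched posts
    ]
```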
Before deciding, it might be worth a quick prototype adaptation of your app to GAE. You might run into stoppers that force the decision. Possible stopper issues include:
- Your schema doesn't make the transition to BigTable
- You're depending on some C-based library that GAE doesn't support
- You have a few long-running requests that exceed the thresholds that GAE imposes
The answer depends on the complexity and nature of your model layer, really. If it's complex or tightly bound to the rest of your code, porting is likely to be a significant effort. If it's fairly straightforward, or easy to tear out and replace, I would say go for it.
These days, I mostly write new code for GAE, but the fact that I can simply deploy with a single command has really lowered the barrier I feel towards writing cool new apps. Not having to worry about deployment and hosting is quite liberating.
"All I have to do is convert the models to use GAE's API."
I am sorry, you are totally mistaken.
You also need to rewrite all the view code that uses the ORM. There are no joins, so you have to write a lot of procedural code instead of the nifty SQL that gives you whatever you want.
Querying is slow. You need to override the save method of each model to store additional information about that model which would otherwise take a long time to compute when needed. You also need to work with memcache to make the queries fast enough.
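The "override save to store additional information" idea is a form of denormalization: compute derived values at write time so reads need no extra query. A minimal sketch, using a plain Python class as a stand-in (it is not a real GAE or Django model, and the field names are invented):

```python
class Article:
    """Plain-Python stand-in for a datastore model."""

    def __init__(self, title, comments=None):
        self.title = title
        self.comments = list(comments or [])
        self.comment_count = 0  # denormalized field, filled in on save

    def save(self):
        # Recompute derived data at write time so that listing pages can
        # show the count without querying the comments at all.
        self.comment_count = len(self.comments)
        # A real model would call the parent class's save/put here.
        return self
```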
And then, Guido has said Django 1.1 is going to be included in a future version of Appengine. I am hoping they will have an out of the box generic ORM to BigTable mapper.
That said, if your app is simple without many joins needed, you could use the appengine patch project to use the current version of django on Appengine. Here is how.