Good morning,
I am currently looking at deploying a Django app on a EC2 instance, but everything is getting too confusing for me! I understand that Django has built-in implementation for MySQL, PSQL, and SQLite. Now, Amazon has RDS (MySQL), SimpleDB and DynamoDB. Do you guys have any recommendation on what should be used? I want something that is scalable for the future and bullet-proof. AWS provides a python API for its SimpleDB and DynamoDB. Will that work nicely with Django?!
Thanks a lot!
EDIT: I would rather be focusing on an overall solution that will be bulletproof, efficient and fast, and not too complicated. As I plan for more people to work on the system, I don't really want something that is complicated and hard to maintan. I would rather spend more time implementing and installing things, but at the end, the solution will be faster and easy to understand and work with. (IE.: Querying the DB will be straight-forward and no hacks around).
SimpleDB and DynamoDB are NoSQL so you'll need django-nonrel to deal with it and have no guarantees if everything will work fine. But if you need to use NoSQL - there is some 3rd-party modules for Django.
RDS is MySQL so you can use Django's default MySQL driver, and ORM, and admin and so on. It seems a good solution but you can't tweak or update these MySQL instances.
If your DB is not big and heavy yet, you can set up a local mysql instance on your EC2 and move it to RDS if you will need to grow.
Related
I'm creating a web app in react with a nodeJS backend. I'm hosting all this on the Google Cloud Platform. I'm using a postgresql database and a redis database, and because my knowledge of these databases is very little, I'm using the managed options (cloud SQL and cloud memorystore).
These are not the cheapest solutions, but for now, they'll do what I want them to do.
My question now is: I'm using the managed options. Imagine my web app has success and grows bigger, I'll probably want my own managed solution (like a redis cluster on compute engine or a postgresql cluster on compute engine). Will I be able to migrate my managed databases to the compute engine solution without downtime/loss of data?
If things are getting bigger, I'll probably hire someone with more knowledge about postgresql/redis, that's not the problem, the only thing I want to know: is it possible to upgrade from a GCP managed solution to an unmanaged solution on compute engine without loss of data and downtime? I'm do not want loss of data at all, a little downtime should not be the problem.
Using the managed solution is, in fact, a better approach for handling databases. GCP takes over updates, management and maintenance of the database and provides reliable tools for backup and scaling.
But to answer your question, yes it is possible to migrate with a minimum downtime. You would need to configure main/worker or master/slave (deprecated terminology) with synchronous replication. After that you can switch your database to worker (which is in Compute Engine) and make it your primary database. This would give basically minimal possible downtime.
I am trying to use DynamoDB for the backend DB of my application, but am having a hard time finding useful information associated to it.
What is the best source of examples and tutorial information for syntax structures etc?
AWS docs are really confusing. Or am I the only person sitting with these problems?
Oh and is the newly launched AWS DocumentDB (Basically MongaDB) going to make DynamoDB pointless to learn, or is there still merit in learning DynamoDB?
The pricing model between DocumentDB and DynamoDB are completely different - there is definitely a place for both - imo, dynamodb is not going away any time soon.
As far as tutorials - there are tons of AWS reinvent videos on youtube, and this site allows you to search/find them easily: https://reinventvideos.com/. Good place to start.
We need a reasonable insert and query speed over huge tables so I considered using some noSQL adapter with Django. Unfortunately:
Django does not provide official support for noSQL databases.
In our original schema some Big Data are relational to other Big Data making the data duplication unacceptable.
Project deadlines are enemies of hot stuff like this.
So, as far I can see, PostgreSQL should be the way to go for this scenario, right?!
Please let me know any other detail that may be relevant to this question!
Bonus to anyone that can point out some useful database techniques like database sharding...
Well, there is a fork of django project that uses MongoDb as the backend.You can read about it here . The Code on GitHub is here.You give some heads up, MongoDB is a NOSQL db that does support sharding and replication. So i think this might something that you are looking for.
If I want to write a django app with Amazon SimpleDB, should I install a local SimpleDB server in my development environment? If so, is there a good one around? simpledb-dev seems to be no longer maintained. Or, should I access the DB on the cloud directly?
I would access simpledb directly, just point to a different account or a different set of domains.
By the way, I don't think there are any "local SimpleDB" servers. You would have to write your own test implementation which sounds like a nightmare, but then again maybe you have a lot of free time.
Also, you are probably going to want to actually get a feel for SimpleDB which will be easier if you are using the real thing.
I'm creating an application that will be hosted on amazon EC2 and a lot of the data that'll be saved is more document oriented (as well as saving tweets and such related to those documents).
Right now I'm at a crossroads... should I use simpleDB or couchDB? Whats the pros/cons of using either? Should I just try both for a month and decide then?
You may find the the article Amazon SimpleDB and CouchDB Compared to be useful.
I've also found that MongoDB gives excellent performance.
Keep in mind that if your code lives in EC2, SimpleDB will be presumably hosted in the same data center that your code is, which would give SimpleDB a lower latency than CouchDB for requests from an EC2 server. Also, Amazon doesn't charge you bandwidth costs between EC2 and SimpleDB.
I would expect SimpleDB to be both faster and cheaper for code running in EC2, for those reasons.
SimpleDB is hosted and maintained by Amazon for you, CouchDB is all up to you. That's the big difference.
I would absolutely do some benchmark of the two solutions with your own use-case, if that's possible, i.e. if you can build a reasonable subset of your application to run on either databases (they have quite different APIs so this might not be easy).
If you develop in .Net environment there's an excellent lib for SimpleDB called Simple Savant which really eases the integration..
I've built some live solutions using SimpleDB and it works very well, especially with a caching layer in front of it (cf memcached et al). However I've recently started scoping out a new project and have decided to move to CouchDB for the primary reason of having control over the data.
As your commitment to SimpleDB grows, it gets harder and harder to migrate away to anything else (ah the joys of vendor lock in) and, frankly, that just isn't great for our business.
I remain a strong evangelist of cloud tech, and Amazon in particular, but I feel a lot better running couchdb on EC2 than I do with SimpleDB.
Roger