Struts2 or Django for GAE and future scalability - django

I am developing a location-based service. FYI, the database will grow rapidly, since time and location are the variables. I am considering GAE for the initial deployment, and I am open to either Python- or Java-based development. When I try to estimate scalability, I get confused; I have never had to think about it before because I haven't worked on big projects. I am also taking into account that I may have to change hosting in the near future for more flexibility.
Considering this situation, what should I start with: Struts2 or Django? Will there be a big difference in development time?

Do you already know Java or Python? If you are proficient in one and not the other, you might want to use what you know. If you are unfamiliar with both, and particularly if you are new to programming in general, I think Python would be much easier to learn. But this is very subjective.
GAE is a good platform for some applications. If you are, for example, frequently reporting a location from a mobile device (like a phone), I think GAE would be a good fit. But I would not use Django to handle such requests; instead, use the 'lightest' possible framework to record the data (probably webapp (Python) or the low-level datastore API (Java)).
Keep in mind the limitations on queries in GAE. There are no JOINs, so you'll need to denormalize. You can use inequality filters on only one property at a time, so for proximity queries you'll need a technique like GeoBoxes. If you can work around those limitations, App Engine has a lot to offer.
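A minimal sketch of the geobox idea, in plain Python with no App Engine imports. The function names, the cell size, and the ID format are assumptions for illustration, not the API of any particular GeoBox library. Each point is stored with the ID of the grid cell that contains it, so a proximity lookup becomes a handful of equality filters on one property instead of inequality filters on both latitude and longitude.

```python
import math

def geobox(lat, lon, resolution=0.01):
    """Return an ID for the grid cell containing (lat, lon).

    `resolution` is the cell size in degrees (a hypothetical default).
    """
    row = math.floor(lat / resolution)
    col = math.floor(lon / resolution)
    return f"{row}:{col}"

def neighbor_boxes(lat, lon, resolution=0.01):
    """IDs of the cell containing the point plus its 8 surrounding cells."""
    row = math.floor(lat / resolution)
    col = math.floor(lon / resolution)
    return [f"{row + dr}:{col + dc}" for dr in (-1, 0, 1) for dc in (-1, 0, 1)]
```

The geobox would be computed and stored when a location is written. A "what is near me" query would then ask for entities whose stored box is one of `neighbor_boxes(lat, lon)` and filter by exact distance in application code. This sketch ignores wraparound at the antimeridian and the poles.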

Related

Transferring from a monolithic application to a micro service one - approach

We currently have a monolithic web application with a Scala backend (Scalatra for the REST APIs) and an AngularJS front end. The application is deployed on AWS. We are going to build a new component, and we would like to build it as an independent microservice. This component will have its own data repository, which may not be the same type of DB. It will also be built with Scala, but with Akka for the REST APIs. The current application consists of a DB module, a domain module, a web service API module, and a front end/client module.
What is a good approach for a smooth transition? We may need to set up a microservice architecture first, such as an API gateway service along with others.
Too many ways, too many approaches, too many best practices. It really all depends on the analysis of your application, trying to figure out where the natural breaks are.
One place I start is looking at the data model. Lots of people advocate each microservice having its own database. Well, that's fine and dandy, but that can really be difficult to achieve without breaking things all over the place. But if you get lucky and there's a place where the data segregates nicely, then see what services would go with it and try breaking it out.
If you do not adhere to the separate-database mentality, then I start with the low-hanging fruit, oftentimes nothing more than simple CRUD operations with a little business logic mixed in, which provide basic support for the larger-grained services to come. Of course, this becomes more iterative, and I'm not sure your organization will like that.
Which brings me to methodology. Organizations who've created monolithic applications often have methodologies that support them, whereas microservices require a much different approach to application development. Is your organization ready for that?
Needless to say, there's no right answer. I've gone to many conferences where these concepts are high on the interest list, and the fact is there's no silver bullet: everyone has different ideas of what is right, and there are exceptions galore. You're just going to have to bite the bullet and cross your fingers, unfortunately.

SOA performance in a webapp

I'm struggling with the decision between a traditional backend (let's say a Django instance managing everything) and a service oriented architecture for a web app resembling LinkedIn. What I mean by SOA is having a completely independent data access interface - let's say Ruby + Sinatra - that queries the database, an independent chat application - Twisted - which is used via its API, a Django web server that uses those APIs for serving the content, etc.
I can see the advantages of having everything in the project modularized and accessed only via APIs: scalability, testing, etc. However, wouldn't this undermine the site's performance? I imagine all modules would communicate via HTTP requests, so wouldn't this architecture add a lot of latency to basically everything on the site? Is there a better alternative than HTTP?
Secondly, regarding ease of development, would this really add much complexity for our developers? Especially during the first phase, until we get an MVP.
Edit: We're a small team of two devs and a designer, but we have no deadlines, so we can handle a bit of extra work if it brings more technical value.
Short answer: yes, SOA definitely buys encapsulation and reusability at the cost of latency. Long answer: it depends (as it always does) on how you do it.
How much latency affects your application depends directly on how fine-grained your services are. If you make very fine-grained services, you will have to make hundreds of sequential calls to assemble a user experience. If you make extremely coarse-grained services, you will not get any reusability out of them, as they will be too tightly coupled to your application.
There are alternatives to HTTP, but if you are going to use something customized, you need to ask yourself, why are you using services at all? Why don't you just use libraries, and avoid the network layer completely?
You are definitely adding cost and complexity to your project by starting with an API. This has to be balanced against the flexibility an API gives you. You might benefit from structuring your code base around internal APIs but invoking them as modules, or from building libraries instead of stand-alone APIs.
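One way to picture "internal APIs invoked as modules" is the sketch below. All class and function names are hypothetical, and the dict stands in for a real database. The caller depends only on an interface, and the implementation is an ordinary in-process call with no network hop.

```python
from abc import ABC, abstractmethod

class ProfileService(ABC):
    """The internal API: callers depend on this, not on how it is served."""

    @abstractmethod
    def get_profile(self, user_id: int) -> dict:
        ...

class InProcessProfileService(ProfileService):
    """Library-style implementation: a plain function call, no HTTP."""

    def __init__(self, store: dict):
        self._store = store  # stand-in for a real database

    def get_profile(self, user_id: int) -> dict:
        return self._store[user_id]

def render_profile_page(service: ProfileService, user_id: int) -> str:
    """View code that only knows about the ProfileService interface."""
    profile = service.get_profile(user_id)
    return f"{profile['name']} ({profile['headline']})"
```

If a piece later has to run as a stand-alone service, an HTTP-backed class could implement the same `get_profile` method, and callers such as `render_profile_page` would not need to change.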
A lot of this depends on how big your project is. Are you a team of 1-3 devs cranking to get out your MVP? Or are you an enterprise with 20-100 devs that all need to figure out a way to divide up a project without stepping on each other?

AppEngine + Django: is it reliable to rely on both?

I need to create an In-App Purchase backend for an iPhone app, and I am thinking of building it on GAE.
However, after my experience in a recent gig at one of the largest GAE customers, and after reading things like this http://www.agmweb.ca/blog/andy/2286/, I wonder whether it is a good idea right now (i.e., reliable) to host a Django + GAE project like this. I expect low traffic in the first months. It is mainly an API-based website with some web front end.
Or are there any hints for getting reliable operation with Django + GAE? I'm using App Engine Helper, but could switch to another implementation if it is more rock solid.
From my experience it seems that Django needs a bit of effort to get working correctly, and using it on AppEngine is a bit different to how you would use it otherwise. I suggest considering the possibility of using a different framework.
Personally, I suggest Tipfy as it was built specifically for App Engine, but there are quite a few frameworks I haven't even tried but have heard great things about.
IIRC the problem with Django poisoning instances due to exceeding the deadline has been solved.

Groovy or Django

I have never created a high-traffic site, so I have no idea what the best long-term plan is. There is no room for a dedicated server in the budget. I'm currently using VPS hosting for my current site. I was going to stick with the VPS and migrate to Grails. I also looked at Django and Python hosting plans (which look cheaper than VPS plans), from fatcow.com for example. Which is the better investment: Grails on VPS hosting or Django on a standard Python hosting plan? Which would perform better in the short and long term?
The front end of the application is JavaFX, and the backend will be a REST interface.
I went through the same process as you before deciding to use Django. I am a Java programmer during the day and wanted a pet project to work on in my spare time. So I got myself a VPS with the cheapest plan available. I installed a Java web server and deployed a Grails app, but it turned out to need more memory. That is when I realized that a Java web app needs a lot of memory just to get running. So I went looking for a non-Java framework. I didn't have many criteria at the time other than that it run smoothly on my current VPS plan.
I took a look at Django and I was amazed that:
- It is so simple and easy to get started. It only creates a small number of files (compared to Grails).
- It has many built-in features that Grails doesn't have:
  - RSS feed framework
  - Commenting system
  - The admin system (you're gonna love it; it's like scaffolding, only better)
  - And many other webby features that take time to create
- It needs less memory to get started, but it can also scale really well.
Other than that, you're just comparing Groovy and Python. If you're a Java programmer, you're going to love Groovy's syntax, as it is really close to Java. But Python is a good language too (even though many people don't like its syntax).
If you want to use JavaFX as the front end, then you can use Django just to return JSON or XML data, and you can do this easily because it has a built-in serializer for that.
So all in all, it boils down to what you need and what you already know.
I would stick with Django. Django and Grails are quite similar, but I prefer Python over Groovy. Python's development cycle is just less tedious than Groovy's. For example, the Python console starts immediately, while the Groovy console can take over a second to load. That's a small issue, but waiting a second many times over gets frustrating in the end.
There is a Grails App Engine plugin that does not use hibernate.
http://www.grails.org/plugin/app-engine
Personally, I think the choice comes down to which language you like the most. If you are a Java/JSP developer, you'll probably like Grails better. However, if you are already quite proficient in Python then that is the better choice.
Here are some resources that might help you evaluate Grails.
http://grails.org/Success+Stories
http://www.pubbs.net/grails/200908/12877
Python is already well established and mature. There are plenty of resources and it is certainly a good choice, if you are a Python fan.
Have you looked at Google AppEngine? You can run Django there, and it's a good cheap way to start.
I haven't seen any performance comparisons between CPython and Jython, but I do know that Django runs on the latest version of Jython now. This also allows you the flexibility of being able to rewrite parts of your app later (remember, no premature optimization) in Java or, say, Scala, if you need the speed.
You may want to consider the memory footprint of the app server in the VPS environment. If your VPS is really small (256 MB), then you might run out of memory if you are running both the app server and the db server.
Groovy's future is debatable. Its creator, James Strachan, has said:
I can honestly say if someone had shown me the Programming in Scala book by Martin Odersky, Lex Spoon & Bill Venners back in 2003 I'd probably have never created Groovy.
-- http://macstrac.blogspot.com/2009/04/scala-as-long-term-replacement-for.html
My 2 cents: go with Python & Django. Skip Scala. Seriously consider Lisp.

Django -- I have a small app ready, Should I go on private VPS or Google App Engine?

I have my first app, not that big, but it is the first step. (next big one on the way)
Now if I want to put it on my own Linode VPS, I have to configure mod_python or mod_wsgi, as well as memcache, Nginx, MySQL or PostgreSQL, etc. to make it work. If I put it on GAE, all I have to do is convert the models to use GAE's API.
What I like about GAE is scaling. (if they can really do it)
Then I'd only worry about developing my apps and doing SEO work on them instead of worrying about load share/balance, cache, db / IO redundancy, etc.
I don't want to do any porting later on. (I have to decide now and stick with it)
So, if you have any experience on this, what do you recommend:
1- Use VPS(s) for everything
2- Use VPS(s) plus Amazon S3
3- Use VPS(s) plus Amazon S3 & SimpleDB
4- Use GAE
Also: would I be able to get away without JOINs when using BigTable?
Note: I don't have any spatial need now, but for a location table I might need that later on.
I'd like to know what you think!
There's business risk and technical risk.
Business risk is that you might have to move hosts later for some external reason. VPSs, EC2, etc. require more upfront investment, but keep you independent. Tools like Chef can help with the configuration effort.
Technical risk is that your application may not be easily implemented on the platform. Since most VPS options allow you to install arbitrary software, they minimize this, again at the cost of more configuration effort on your part. AFAIK, the largest constraint GAE imposes is that it's difficult to run long-running background tasks. (Working without JOINs and other aspects of de-normalized data requires a different way of thinking, but this approach is fairly common in web applications no matter where they run once the SQL database is larger than a single host can support.)
If you can live with both these risks, GAE would appear to save you a substantial amount of effort. If you cannot live with these risks, you should tailor your own environment.
As an aside, I find S3 to be worth it no matter your environment. It's far simpler than ensuring your local server static file storage is reliably backed up, and you never have to worry about capacity. It's best if you use it for data that is uploaded but rarely overwritten or deleted (think facebook photo albums).
I don't want to do any porting later on. (I have to decide now and stick with it)
If that's the case, wouldn't you prefer to control deployment from the outset? It could be a great pain to port back from GAE later down the line if you hit its limits (whether they be technological limits or simply business decisions by Google that run counter to your plans for the future of your app).
Also, configuring mod_wsgi, installing PostgreSQL, etc. isn't particularly difficult, and you don't have to worry about things like load balancing and db redundancy for a while yet.
If it were me, I'd prefer the long-term certainty of a traditional server over the quick win of GAE. It all depends on your vision for the app, however.
I may be biased, but if you can live with GAE's limitations it really saves you a lot of work and worry about system administration issues (and to some extent scaling) -- plus, it's free as long as your resource consumption is low (basically meaning your traffic is low).
Can you do without joins? I don't know, as I don't know your app -- I'm a SQL fanatic, myself, yet for simple enough needs I haven't found it too hard to adapt. As I see it, the main limitation of non-relational DBs is that they're nowhere near as nice as relational ones for "ad hoc" queries... you typically have to write a lot of procedural code instead of a nice SELECT or two:-(. But, that's more of a "data mining later" issue than one connected with serving your web app -- probably best solved by regularly bulk-downloading data from the web app's online storage to a "data warehouse" kind of setup, anyway, even if such storage was relational in the first place;-).
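As a rough illustration of the "procedural code instead of a SELECT" point, here is a sketch in plain Python. The data, the field names, and the helper are hypothetical, and the dicts stand in for datastore entities. In SQL this would be one join with a GROUP BY; without joins, the application fetches both kinds of record and combines them itself.

```python
from collections import defaultdict

users = [
    {"id": 1, "name": "alice"},
    {"id": 2, "name": "bob"},
]
checkins = [
    {"user_id": 1, "place": "cafe"},
    {"user_id": 1, "place": "park"},
    {"user_id": 2, "place": "cafe"},
]

def checkin_counts(users, checkins):
    """Hand-written equivalent of:

    SELECT u.name, COUNT(*) FROM users u
    JOIN checkins c ON c.user_id = u.id GROUP BY u.name
    """
    # Build the lookup that the database's join would otherwise provide.
    names = {u["id"]: u["name"] for u in users}
    counts = defaultdict(int)
    for c in checkins:
        counts[names[c["user_id"]]] += 1
    return dict(counts)
```

Denormalizing (for example, storing the user's name on each check-in) removes the lookup step at read time, at the cost of keeping the copies in sync on writes.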
Before deciding, it might be worth doing a quick prototype adaptation of your app to GAE. You might run into stoppers that force the decision. Possible stopper issues include:
- Your schema doesn't make the transition to BigTable
- You're depending on some C-based library that GAE doesn't support
- You have a few long-running requests that exceed the thresholds that GAE imposes
The answer depends on the complexity and nature of your model layer, really. If it's complex or tightly bound to the rest of your code, porting is likely to be a significant effort. If it's fairly straightforward, or easy to tear out and replace, I would say go for it.
These days, I mostly write new code for GAE, but the fact that I can simply deploy with a single command has really lowered the barrier I feel towards writing cool new apps. Not having to worry about deployment and hosting is quite liberating.
All I have to do is convert the models to use GAE's API.
I am sorry, you are totally mistaken.
You also need to rewrite all the view code that uses the ORM. There are no joins, so you have to write a lot of procedural code instead of the nifty SQL that gives you whatever you want.
Querying is slow. You need to override the save method of each model to store additional information about that model which would take a long time to compute when needed. You also need to work with memcache to make the queries fast enough.
And then, Guido has said Django 1.1 is going to be included in a future version of App Engine. I am hoping they will have an out-of-the-box generic ORM-to-BigTable mapper.
That said, if your app is simple and doesn't need many joins, you could use the app-engine-patch project to run the current version of Django on App Engine. Here is how.