Search Engine Necessary? [closed] - django

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 8 years ago.
In my application, I have a bunch of service providers in my database offering various services. I need a user to be able to search through these service providers by either name, location, or both. I also need a user to be able to filter the providers by different criteria, based on multiple attributes.
I am trying to decide if I could implement this simply with database queries or if a more robust solution (i.e. a search engine) would better suit my needs.
Please let me know the pros and cons of either solution, and which you think would be best to go with.
I am writing my application in Django 1.7, using a PostGIS database, and would use django-haystack with elasticsearch if a search engine is the way to go here.

It seems that you are working on a search-intensive application. My opinion is as follows:
1) If you run search-intensive queries directly against the database, the overhead will be high, because every search fires a separate criteria-based query at the database engine from Django, and each query has to be built with its own parameters. As a consequence, you become highly dependent on the availability of the database server. Things get worse if the database server is in a remote location, since network latency adds to the overhead.
2) Try implementing a server-side caching system such as Redis, an in-memory NoSQL database (sometimes also called a data structure server), which addresses the problems described in the previous point.
3) To power up your search, read about Apache Solr, a search server built on the Lucene library.
4) Last but not least, look at case studies of large sites such as Facebook and Twitter to see how they manage their infrastructure. That will give you an even better idea.
If you have any doubts or suggestions, please comment. Cheers :-)
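To make the caching suggestion in point 2 concrete, here is a minimal cache-aside sketch. It is not taken from the answer: a plain dict stands in for Redis, and `SearchCache`, `query_providers` and the sample `PROVIDERS` data are hypothetical names invented for illustration. The idea is only that identical search criteria map to the same cache key, so the database is queried once per distinct search.

```python
import json


class SearchCache:
    """Cache-aside wrapper: check the cache first, fall back to the database."""

    def __init__(self, run_query, store=None):
        self.run_query = run_query  # callable that actually hits the database
        self.store = {} if store is None else store  # dict stand-in for Redis
        self.misses = 0

    def search(self, **criteria):
        # Sort keys so the same criteria always produce the same cache key.
        key = "search:" + json.dumps(criteria, sort_keys=True)
        if key not in self.store:
            self.misses += 1
            self.store[key] = self.run_query(**criteria)
        return self.store[key]


# Hypothetical sample data standing in for the service-provider table.
PROVIDERS = [
    {"name": "Acme Plumbing", "location": "Austin"},
    {"name": "Bolt Electric", "location": "Austin"},
    {"name": "Acme Plumbing", "location": "Denver"},
]


def query_providers(name=None, location=None):
    # Hypothetical stand-in for a database query filtering by name and/or location.
    return [
        p for p in PROVIDERS
        if (name is None or name.lower() in p["name"].lower())
        and (location is None or p["location"] == location)
    ]


cache = SearchCache(query_providers)
first = cache.search(location="Austin")   # miss: runs the query
second = cache.search(location="Austin")  # hit: served from the cache
both = cache.search(name="acme", location="Denver")
```

A real deployment would also need an expiry or invalidation policy so that cached results do not go stale when providers change; that part is left out of the sketch.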

Related

What is the production-readiness of sails.js and meteor.js, and how do they compare to Django? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 7 years ago.
I am thinking of using one of them to build our startup, which is like a job portal with validation and verification, and includes special features for freelancing.
Is Meteor or Sails a good choice for a backend, or should we go with a more robust backend like Django? Will using JavaScript on our backend let us scale in the future more so than Django/Python?
I would really like some opinions in this matter to get to a decision.
sails.js and meteor are both great options for production.
Both frameworks have good real-time (socket.io) support, large and active communities, and support for a stateless backend design, which makes horizontal scalability possible. Both are great for getting a web application spun up quickly.
sails.js - http://sailsjs.org
broad database support through the Waterline ORM (there are over a dozen supported databases)
concepts should be more familiar to most node.js developers (it's built on express)
modeled after rails, grails, and django, so the paradigm is more familiar to developers with experience in those tools
extensible through npm package manager via express middleware and custom modules
meteor - https://www.meteor.com
better integration between the backend and frontend
project is VC-backed with a firmer corporate backing
extensible using custom package manager and extension system
built-in deployment system and hosting on meteor.com
Will using javascript on our backend provide the ability to scale in the future moreso than Django/python?
Probably. As with anything, you just have to do it right.
My overall opinion is that meteor is sort of cult-ish and monolithic, and that once you've chosen it, you're locked in. sails.js is built on express, so it's easy to split out functionality and integrate with other tools.
My disclaimer is that I work for Balderdash (the company that invented sails.js); but on that note, I can also tell you that millions of users are served by sails.js applications. We find that it's quite good, and our business is thriving because of the power of sails.js. I know folks who have used meteor with success as well.
I think this is a primarily opinion-based question, so you're going to receive answers from the very same "type".
But I can tell you one thing: Meteor is robust enough for production use, especially now that they have hit the 1.1 release (https://www.meteor.com/blog/2015/03/31/meteor-11-microsoft-windows-mongodb-30).
Meteor is perfectly suited for startups, since it brings everything you need (and more) into a single "pack".
Check this: http://meteorpedia.com/read/Why_Meteor
So yah, that's my answer going for the Meteor side.. (not very technical, I know).

Which one is better to use between Parse, Firebase and AWS Cognito? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 8 years ago.
I want to use a synchronisation service for my application, and I want to choose the best one. I would like to know which of these is better. My application will run on Android, iOS, Windows and the web.
I am leaning towards Firebase because I have tested it. It gives me fast results and also lets me work offline. Is it the better choice, or should I go with Parse or AWS Cognito?
I also have the option of Google Cloud. Does Google Cloud provide a service like Firebase? And are real-time updates possible with Parse, as they are with Firebase?
Codeek has a good point that this question is opinion based, so take my answer with a grain of salt.
I have experience with both Parse and Firebase, but not with Cognito.
In my experience, Parse is better when working with large relationship-based databases (i.e. databases where multiple classes of objects point to each other and interact). In this system, it is easy to store a lot of data very succinctly, but working with this data is done via snapshots. This means that you can take a snapshot of the data, edit it, and then refresh the server with the updated snapshot. This is perfect for things like my delivery application, where only one user is updating the orders on our server at any one time.
Firebase implements a model-observer scheme, and so it is much better for applications that are highly interactive. For instance, I have used Firebase for creating a real-time game of hot potato. The advantage here is that changes to the data on the server are automatically pushed out to all devices that have registered as listeners (functionality not available on Parse, in my experience). This keeps all users on the same page all the time. The downside is that the database is structured in a hierarchical manner and doesn't have defined "objects". Rather, it is structured via key/value pairs where parent keys cannot have an associated value. To illustrate this, a sample structure for storing a game on my database went something like this:
-Games
--1
---Users
----1 = "example@gmail.com"
----2 = "example2@gmail.com"
---PotatoHolder = 1
---TimeRemaining = 30
---Loser = -1
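The listener behaviour described above can be sketched in a few lines of Python. This is not Firebase's API: `ObservedTree`, its `listen` and `set` methods, and the slash-separated paths are hypothetical names chosen for illustration. The sketch only shows the general idea that values live at leaf paths, and that a write notifies every listener registered on that path or on one of its parents.

```python
class ObservedTree:
    """Tiny in-memory tree of key/value pairs with change listeners."""

    def __init__(self):
        self.values = {}     # leaf path -> value
        self.listeners = {}  # path -> list of callbacks

    def listen(self, path, callback):
        # Register a callback for changes at this path or anywhere below it.
        self.listeners.setdefault(path, []).append(callback)

    def set(self, path, value):
        self.values[path] = value
        # Notify listeners on every prefix of the path, from root to leaf.
        parts = path.split("/")
        for i in range(1, len(parts) + 1):
            prefix = "/".join(parts[:i])
            for callback in self.listeners.get(prefix, []):
                callback(path, value)


tree = ObservedTree()
seen_by_alice, seen_by_bob = [], []

# Alice watches the whole game; Bob only watches who holds the potato.
tree.listen("Games/1", lambda p, v: seen_by_alice.append((p, v)))
tree.listen("Games/1/PotatoHolder", lambda p, v: seen_by_bob.append((p, v)))

tree.set("Games/1/TimeRemaining", 30)
tree.set("Games/1/PotatoHolder", 2)
```

After the two writes, Alice's list holds both changes while Bob's holds only the `PotatoHolder` change, which mirrors how each device hears only about the part of the tree it registered for.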
Cognito I am not familiar with, so I'll allow someone else to explain how that database system is designed.
In summary, codeek is correct that this is an opinion-based question, but for two of your options a good rule of thumb from my experience is that Parse is fantastic for large relationship databases in conjunction with single-user applications (i.e. single-player or turn-based games). Firebase is more suited to hierarchical data systems in conjunction with real-time multiplayer applications.
I hope this helps! If you could post a little more about what kind of application you are trying to build then perhaps I, or someone else, could provide a little more guidance.
Expanded Answer: Although this question has been marked as off topic, to answer Nidhi's follow-up question about whether there is a way to use Parse as a model-observer scheme: not easily. Using a timer is the simplest option. The other option is to use push notifications. This would require getting permission from your user. You can set the Cloud Code on Parse to automatically send push notifications to all relevant users and then intercept them within your client so that they are "silent". In other words, when they arrive, you can have your client respond by updating your game without showing a ribbon or notification like normal push notifications. I have not done this myself, as I prefer using Firebase for that kind of application, but I believe that it is possible.
Source: PFQueryTableView Auto Refresh When New Data Updated or Refresh Every Minute Using Parse
Keith's answer is similar to Nidhi's reference to refreshing PFObjects via a timer; Handsomeguy's comment refers to the possibility of "silent" push notifications.

What does it take to scale Django? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 8 years ago.
Problem
So I've been Django-ing for a number of months*. I find myself in a position where, I'm able to code up a Django web app for whatever, but am terrified by my inability** to come up with solutions as to how to go about building a Django web app for a large (LARGE) audience. Good to know that Django scales, at least.
How I'm thinking about it
It seems like there would need to be a relatively large leap of knowledge to understand how to (let alone actually execute) scale a Django web app. I say this because my research has given me the impression that scaling (or, enabling scalability) is a process of fitting aftermarket solutions to the different components of your web app to enhance the performance of each of these components.
There'sjustsomanythings~~
So there's a ton of solutions, and a bunch of components. For instance, there's Elastic Beanstalk for hosting, Django's cache framework, Memcached and Varnish for caching, Cassandra, Redis and PostgreSQL for databases, and uWSGI, Nginx and Apache for deployment. If what I think is right, anyway. I'm still not sure.
What I Need
I crave that amazing response that becomes the canonical answer to the question, but would also appreciate leads on where to begin, or suggestions of an approach to take to solve the problem, or your approach to scale Django. Thank you in advance for your been-there-done-that words of wisdom. << Edit: SO disapproves :(
BRAND NEW & EXCLUSIVE: A QUESTION FOR STACKOVERFLOW!
What I need
What are the 3 most important/effective things I should do/implement to improve the preparedness for scaling of the Django web apps that I'm building? Listing the approaches and explaining how they help would be nice.
*I've been cheating. I deploy on Pythonanywhere and have only used Sqlite3 up till now. I have also managed to keep my hands clean of WSGI/Apache deployment stuff to date.
**With Django is when I first managed to create something of value through programming. Before, I had only used Pascal to cheat at Runescape and Java to make some shitty Android apps. Which could perhaps explain why I feel this is that large of a leap.
I really wouldn't worry too much about it initially. That said, here are some ideas for how you might want to think about scaling your Django apps.
Caching
Depending on what your application is, caching can be very useful indeed. Certainly for any application that has a high proportion of reads to writes, such as a blog or content management system, then implementing caching is a no-brainer. For other types of sites, you may have to be a bit more careful, however the Django caching framework makes it straightforward to customise how caching works for your application.
Memcached is easy to set up with the Django caching, and it's solid and reliable. It should probably be your default choice as the caching backend.
Celery
If your web app does any appreciable amount of work in the background that need not be done during the same HTTP request, then you should consider using Celery to carry it out as a separate task.
Case in point: on a Django app I built, there was the option to send an email to a client with a PDF copy of a report attached. Because the email need not be sent within the same HTTP request, I handed that task off to Celery. Now, when the app receives the HTTP request, it just pushes the request to send that email onto the message queue. The Celery process picks up this task and handles it separately.
In theory that task could be handled on an entirely separate machine when your web app gets big enough.
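The hand-off described above can be illustrated without Celery itself. The sketch below uses only the standard library (`queue` and `threading`) as a stand-in for a message broker and a Celery worker; `send_report_email`, `handle_request` and the return string are hypothetical names, not part of Celery or Django. It shows the shape of the idea: the request handler only enqueues the job and returns, and a separate worker does the slow part.

```python
import queue
import threading

task_queue = queue.Queue()  # stand-in for the message broker
sent = []                   # records what the worker has processed


def send_report_email(client, report_id):
    # Hypothetical slow job; a real one would render the PDF and send the mail.
    sent.append((client, report_id))


def worker():
    # Stand-in for the Celery worker process: pull jobs until told to stop.
    while True:
        job = task_queue.get()
        if job is None:
            task_queue.task_done()
            break
        func, args = job
        func(*args)
        task_queue.task_done()


def handle_request(client, report_id):
    # The request handler only enqueues the job and returns immediately.
    task_queue.put((send_report_email, (client, report_id)))
    return "202 Accepted"


thread = threading.Thread(target=worker, daemon=True)
thread.start()

status = handle_request("client@example.com", 42)
task_queue.join()     # wait for the worker; only needed for this demonstration
task_queue.put(None)  # tell the worker to stop
thread.join()
```

With Celery the queue lives in an external broker rather than in process memory, which is what makes it possible to move the worker to another machine later.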
Web server
It seems to be generally accepted that serving both static and dynamic content from Django itself is a bad idea. The solution I use seems to be fairly typical and employs two web servers:
Nginx runs on port 80. It serves all the static files and reverse proxies everything else to another port
Gunicorn runs on that other port and it serves the dynamic content, and Supervisor is used to run the Gunicorn process
There are variants of this general idea, but this kind of two server approach seems to be common. You could also consider using something like Amazon's S3 to host static files.
It's also well worth your while taking the time to minify your static files to improve their performance. Using a tool like Grunt it's quite easy to concatenate and minify your JavaScript and CSS files so that only one of each need be downloaded, rather than including many files that need to be downloaded individually.
Database
Either MySQL or PostgreSQL will be fine. Both are solid databases that are used in production on many websites.
As I said higher up, scaling your app shouldn't really be too much of a concern early on. However, it helps to be familiar with the kind of strategies you'll need to use.

Private Microblogging/Twitter-like Service [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 4 years ago.
Are there any cloud based private Twitter-like services out there?
I am working for a client who needs a service like this implemented, but we don't have the time or budget to create one from scratch.
I am looking for something with a REST api where I can create an account on it from the master server, set an account to follow another account, post updates for accounts, and then get a feed of posts (sorted by date) from accounts that another account is following (like a facebook wall, or twitter feed). It would be great if it could automatically scale out to hundreds of thousands of users, with perhaps 50 000 daily posts being made.
I had thought about implementing this myself, but it seems like there are some tricky areas when it comes to having an account following a few thousand other accounts, or being followed by 10s of thousands of accounts, and generating the feed in somewhat realtime as posts come in.
I have found some services such as http://www.ning.com/ and http://www.socialengine.com/ but I'm not sure if they can do what I need, and they seem to be very focussed on having a website. This is for a mobile app so that is not required.
There are a few open source projects out there, but they would all require setting up/maintaining hosting (not a huge problem) and I'm not certain how scalable they are (the client requires it scale up to at least 100k users).
I'm sorry for the late reply. I hope it will be useful to others looking at this.
I had pretty much the exact same need as you, and ended up creating a full-featured solution after finding no other resources. The service is called Collabinate (http://www.collabinate.com). It provides a RESTful API that focuses on simplicity and ease of use, and currently leaves the UI completely up to you. It uses a graph database and algorithms in the backend, and scales quite well for your situation.
Maybe a private team inbox could fit your needs too:
https://www.flowdock.com/
It has no following feature, but if this is for internal company use, you can create chat rooms for departments and for general discussion. The chat rooms could perhaps stand in for the following feature.
Looks like there isn't a good solution here.
I have found jaiku which looks incredibly complex and doesn't seem to run on the latest app engine sdk.
There is also diaspora which could be modified and run on your own server to do what is needed.
In the end, I have decided to just implement this myself on Google App Engine. It seems the best way to do what is needed. Using the fan-out pattern seems to be the best way. The Fantasm library seems to provide an easy to use way to do this, so I am going to try that.
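For readers unfamiliar with the fan-out pattern mentioned above, here is a minimal in-memory sketch of fan-out on write. It is not the Fantasm library or App Engine code; `FeedService` and its methods are hypothetical names, and integer timestamps are assumed for simplicity. Each post is copied into every follower's feed at write time, so reading a feed becomes a cheap lookup rather than a query across all followed accounts.

```python
from collections import defaultdict


class FeedService:
    """In-memory illustration of fan-out on write for a follow-based feed."""

    def __init__(self):
        self.followers = defaultdict(set)  # account -> accounts following it
        self.feeds = defaultdict(list)     # account -> (timestamp, author, text)

    def follow(self, follower, followee):
        self.followers[followee].add(follower)

    def post(self, author, text, timestamp):
        # Fan-out on write: copy the post into every follower's feed now,
        # so that reading a feed later needs no joins across accounts.
        entry = (timestamp, author, text)
        for follower in self.followers[author]:
            self.feeds[follower].append(entry)

    def feed(self, account, limit=20):
        # Newest first; tuples sort by timestamp because it is the first field.
        return sorted(self.feeds[account], reverse=True)[:limit]


svc = FeedService()
svc.follow("carol", "alice")
svc.follow("carol", "bob")
svc.post("alice", "hello", 1)
svc.post("bob", "hi", 2)
svc.post("carol", "my own post", 3)  # carol has no followers in this example

carol_feed = svc.feed("carol")
alice_feed = svc.feed("alice")
```

The trade-off is that an account with tens of thousands of followers makes each post expensive to write, which is why the copying is usually done in background tasks. The sketch also does not backfill older posts when a new follow is created.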

Where to go to learn about web architecture? Youtube example? [closed]

Closed. This question is off-topic. It is not currently accepting answers.
Want to improve this question? Update the question so it's on-topic for Stack Overflow.
Closed 13 years ago.
I'm trying to build a web application that is similar to Youtube (it's not a knock off), but I guess I don't know how video is served on the internet very well.
I know how to build regular database driven web applications, but nothing like the scalability of Youtube. All of the applications I have built before have all been run on one server with the files stored on the same box as the web server.
How does one decouple the application server from the file storage from the media server?
I would more or less want 4 machines (clusters of machines)
1.) Application servers
-- Present the web page, handle user uploads, link the user's flash player to the correct media server etc.
2.) Database shards
-- Store user information, check favorites, etc.
3.) File storage
-- Store the media files
4.) Media servers
-- Serve the media files
How do I hook all of this together? Which technologies should I leverage? Where do I go to learn more about architecting this?
How does Youtube's embeddable flash stuff work? I want to embed my flash player on other websites and have it tie into my architecture.
Note I have looked into: http://highscalability.com/youtube-architecture
But I still don't get the overall picture of how this stuff ties together.
Can someone explain in high-level terms how all of this works?
Are there dedicated client servers running internally to shuffle all of this between the application servers, file storage, etc.? Is it all done via HTTP using JSON? What is going on here?
Thanks
Two books I'd recommend are:
Scalable Internet Architectures
Building Scalable Web Sites
The latter is by the director of engineering at flickr. Not youtube, but I think you'll find it enlightening.
Beyond that, the High Scalability blog is a good source of case studies and collected wisdom, all of which provide a good starting point for further exploration.
Start by hiring the right people; if you hire smart people, they'll be able to come up with answers to these questions, and more which will crop up.
Also, start at the scale that you plan to initially operate at. Don't plan for scalability you don't need. You aren't going to be making another Youtube - even if you're very successful within your field.
Scalability is expensive - very expensive - to develop and maintain. If you don't need it, it will drain your resources and restrict your developers needlessly. Just building a credible test environment for high performance systems tends to be a big job, and such a system would require several such environments.