I have a java web server and am currently using the Guava library to handle my in-memory caching, which I use heavily. I now need to expand to multiple servers (2+) for failover and load balancing. In the process, I switched from a in-process cache to Memcache (external service) instead. However, I'm not terribly impressed with the results, as now for nearly every call, I have to make an external call to another server, which is significantly slower than the in-memory cache.
I'm thinking instead of getting the data from Memcache, I could keep using a local cache on each server, and use RabbitMQ to notify the other servers when their caches need to be updated. So if one server makes a change to the underlying data, it would also broadcast a message to all other servers telling them their cache is now invalid. Every server is both broadcasting and listening for cache invalidation messages.
Does anyone know any potential pitfalls of this approach? I'm a little nervous because I can't find anyone else that is doing this in production. The only problems I see would be that each server needs more memory (in-memory cache), and it might take a little longer for any given server to get the updated data. Anything else?
I am a little bit confused about your problem here, so I am going to restate in a way that makes sense to me, then answer my version of your question. Please feel free to comment if I am not in line with what you are thinking.
You have a web application that uses a process-local memory cache for data. You want to expand to multiple nodes and keep this same structure for your program, rather than rely upon a 3rd party tool (memcached, Couchbase, Redis) with built-in cache replication. So, you are thinking about rolling your own using RabbitMQ to publish the changes out to the various nodes so they can update the local cache accordingly.
My initial reaction is that what you want to do is best done by rolling over to one of the above-mentioned tools. In addition to the obvious development and rigorous testing involved, Couchbase, Memcached, and Redis were all designed to solve the problem that you have.
Also, in theory you would run out of available memory in your application nodes as you scale horizontally, and then you will really have a mess. Once you get to the point when this limitation makes your app infeasible, you will end up using one of the tools anyway at which point all your hard work to design a custom solution will be for naught.
The only exceptions to this I can think of are if your app is heavily compute-intensive and does not use much memory. In this case, I think a RabbitMQ-based solution is easy, but you would need to have some sort of procedure in place to synchronize the cache between the servers on occasion, should messages be missed in RMQ. You would also need a way to handle node startup and shutdown.
Edit
In consideration of your statement in the comments that you are seeing access times in the hundreds of milliseconds, I'm going to advise that you first examine your setup. Typical read times for a single item in the cache from a Memcached (or Couchbase, or Redis, etc.) instance are sub-millisecond (somewhere around .1 milliseconds if I remember correctly), so your "problem child" of a cache server is several orders of magnitude from where it should be in terms of performance. Start there, then see if you still have the same problem.
We're using something similar for data which is read-only and doesn't require updated every time. I'm in doubt, that this is good plan for you. Just imagine you should have one more additional service on each instance, which will monitor queue, and process change to in-memory storage. This is very hard to test.
Are you sure that most of the time is spent on communication between your servers? Maybe you run multiple calls?
Related
I am using Django for my project and I ll be hosting it on Linode or any other hosting service. Plus if I want to use memcache will I require a new Linode for it? Means just one server will be ok or I ll have to host my site on 2 servers, one for memcache and one for django? And is it the same for Redis? Also will I require a separate server for Mysql?
I don't think you understand that nobody is a fortune telling wizard. Nobody knows how many requests you will receive per second, nor how cpu/memory intensive each request will be. Nobody knows how optimized your code is. Nobody knows if your application is read heavy or write heavy. Your use case is your own, and your probably the only one who estimate it.
My only actual advice to you is to try to estimate your server data and sever load and benchmark your setup on one machine. If you are unsatisfied with the performance then scale up. You can either scale up vertically, by increasing the size of your linode, or scale horizontally by adding more linode instances. In the latter case, you will most likely put your DB on a machine of it's own and have multiple django instances fed by a load balancer. These Django instances could each share the same memcache on a machine, or they can each have their own memcaches on their own machine. Which one is better? I can't tell you. It again depends on your use case.
If I were you, I would set it all up on one linode instance. I would create test data that I assume would be close to real world. Then I would try to test my response times with an estimated number of requests per second. I would measure response times, cache hits, and memory usage. I would then decide based on that if my use case is satisfied with this level of performance or not because I'm really the only one who would know what is satisfactory performance. Additionally, adding more linode resources is not necessarily where I would first try and improve performance.
Some great tips on optimizing and benchmarking can be found here:
https://docs.djangoproject.com/en/1.8/topics/performance/
http://blog.disqus.com/post/62187806135/scaling-django-to-8-billion-page-views
http://scottbarnham.com/blog/2008/04/28/django-performance-testing-a-real-world-example/
Late night reading about scaling up Django can be found in many books, I like this one:
https://highperformancedjango.com/
Sorry if I sound a bit blunt, I just want you to understand that nobody can walk in here and give you an answer with a large degree of confidence. This question doesn't have a straight-forward answer.
TL;DR Start with one instance and scale up only if you've convinced yourself you need to.
You say Memcached or Redis, so I assume Redis would be deployed without persistence, with a purely in-memory configuration.
In such case both Memcached and Redis are unlikely to get saturated even if you run them in one server, since the limiting factor is more likely to be a single Django instance if your requests/second go high.
However you should make sure to have enough memory and to configure an appropriate max memory usage for Memcached / Redis (different ways to accomplish this in the two different services). Note that under memory pressure, the Linux OOM killer may kill your cache otherwise, so if you go for a single instance, which seems to me a sensible first step, make sure your Django memory usage plus the memory you allocate for caching, are not enough to go near the limits of the instance free memory.
CPU is hardly going to be an issue as I said since Memcached / Redis are pretty good at using little CPU, so I can't foresee a setup where Django is ok serving pages but the instance is in trouble since the CPU is burned by the cache.
I have a webmachine REST API server running on one machine. in anticipation of having more traffic that this machine cannot handle, i would need to expand to other nodes on other cpu`s. is there a way of configuring this?
if not what is the right way of distribution here, would i need to do it manually through OTP, concurrent workers and supervisors? where a worker is spawned and ships the request to the neighboring machine.
It kind of depends on your use case. Best way would be observing where you experience problems, and react accordingly.
You could look at your application as three separate parts. First one would be REST interface, second could be businesses logic (little more later), and third would be the data itself (resources, lets call it data store, but it could be even just another service).
Data
This one is simplest. I assume you are using separate service for this (like Riak cluster), where you could do your scaling separately. One thing you could look into is just making sure connection between Webmachine and your data store can scale enough for your needs.
Interface
If your server just can not handle enough requests, just put another one next to it. You can router to dispatch request to both of therm, ans since they will use same data store, they will stay in "sync".
REST being based on http assumes stateless communication. Meaning, any two requests (form same user or two different ones) don't share any resources and can be handled by different applications (you also don't have to share anything between your Webmachine instances).
Domain logic
In theory you should not have any of this in your REST API server, but still lets discuss this a little bit.
Some of your requests might require little more work than just serving content. You might be doing some computation (like serving statistics that need to be generated). You might be updating some resource, that need to change more that one data in one place (one could think of it as of transaction). It could call for more computational power, or state synchronization, which would make scaling harder.
Way around this would be separating REST from such logic. Especially introducing micro-services, which you could scale up or down independently from Webmachine itself.
In Erlang you could actually introduce separate applications inside Erlang VM. Those again could be scaled up with use of Distributed Erlang (and little more in this topic and pull of workers (like poolboy. I would recommend this approach for start, since it is easiest to implement, and due to async nature of Erlang it could always easily be ported to external micro-service.
PS. System resources
You also should check if your box can handle such traffic. One of most common mistakes is not increasing maximal number of file descriptors in your production. But again, first you should observe such problems, and then react to them. Premature optimization in most cases doesn't pay off.
PS2 What and when
You can monitor our applications and system resources with tools like Exometer or more out-of-the-box WombatOAM.
And you can (should) stress test your application with tools like tsung or basho-bench
How do you tune Django for better performance? Is there some guide? I have the following questions:
Is mod_wsgi the best solution?
Is there some opcode cache like in PHP?
How should I tune Apache?
How can I set up my models, so I have fewer/faster queries?
Can I use Memcache?
Comments on a few of your questions:
Is mod_wsgi the best solution?
Apache/mod_wsgi is adequate for most people because they will never have enough traffic to cause problems even if Apache hasn't been set up properly. The web server is generally never the bottleneck.
Is there some opcode cache like in PHP?
Python caches compiled code in memory and the processes persist across requests. You thus don't need a separate opcode caching product like PHP as that is what Python does by default. You just need to ensure you aren't using a hosting mechanism or configuration that would cause the processes to be thrown away on every request or too often. Don't use CGI for example.
How should I tune Apache?
Without knowing anything about your application or the system you are hosting it on one can't give easy guidance as how you need to set up Apache. This is because throughput, duration of requests, amount of memory used, amount of memory available, number of processors and much much more come into play. If you haven't even written your application yet then you are simply jumping the gun here because until you know more about your application and production load you can't optimally tune Apache.
A few simple suggestions though.
Don't host PHP in same Apache.
Use Apache worker MPM.
Use mod_wsgi daemon mode and NOT embedded mode.
This alone will save you from causing too much grief for yourself to begin with.
If you are genuinely needing to better tune your complete stack, ie., application and web server, and not just prematurely optimising because you think you are going to have the next FaceBook even though you haven't really written any code yet, then you need to start looking at performance monitoring tools to work out what your application is doing. Your application and database are going to be the real source of your problems and not the web server.
The sort of performance monitoring tool I am talking about is something like New Relic. Even then though, if you are very early days and haven't got anything deployed even, then that itself would be a waste of time. In other words, just get your code working first before worrying about how to run it optimally.
I am writing a web application with django on the server side. It takes ~4 seconds for server to generate a response to the user. It makes use of a weather api. My application has to make ~50 query to that api for each user request.
Server side uses urllib of python for using the weather api. I used pythons threading to speed up the process because urllib is synchronous. I am using wsgi with apache. The problem is wsgi stack is fully synchronous and when many users use my application, they have to wait for one anothers request to finish. Since each request takes ~4 seconds, this is unacceptable.
I am kind of stuck, what can I do?
Thanks
If you are using mod_wsgi in a multithreaded configuration, or even a multi process configuration, one request should not block another from being able to do something. They should be able to run concurrently. If using a multithreaded configuration, are you sure that you aren't using some locking mechanism on some resource within your own application which precludes requests running through the same section of code? Another possibility is that you have configured Apache MPM and/or mod_wsgi daemon mode poorly so as to preclude concurrent requests.
Anyway, as mentioned in another answer, you are much better off looking at caching strategies to avoid the weather lookups in the first place, or offloading to client.
50 queries to an outside resource per request is probably a bad place to be, and probably not neccesary at all.
The weather doesn't change all that quickly, and so you can probably benefit enormously by just caching results for a while. Then it doesn't matter how many requests you're getting, you don't need to do more than a few queries per day
If that's not your situation, you might be able to get the client to do the work for you. Refactor the code so that the weather api aggregation happens on the client in javascript, rather than funneling it all through the server.
Edit: based on comments you've posted, what you are asking for probably cannot be optimized within the constraints of the API you are using. The problem is that the service is doing a good job of abstracting away the differences in the many sources of weather information they aggregate into a nearest location query. after all, weather stations provide only point data.
If you talk directly to the technical support people that provide the API, you might find that they are willing to support more complex queries (bounding box), for which they will give you instructions. More likely, though, they abstract that away because they don't want to actually reveal the resolution that their API actually provides, or because there is some technical reason in the way that they model their data or perform their calculations that would make such queries too difficult to support.
Without that or caching, you are just out of luck.
I’m thinking about building an offline-enabled web application.
The architecture I’m considering is as follows:
Web server (remote) <--> Web server/cache (local) <--> Browser/Prism
The advantages I envision for this model are:
Deployment is web-based, with all the advantages of this approach
Offline-enabled
UI (html/js) synchronization is a non-issue
Data synchronization can be mostly automated
as long as I stay within a RESTful paradigm
I can break this as required but manual synchronization would largely remain surgical
The local web server is started as a service; I can run arbitrary code, including behind-the-scene data synchronization
I have complete control of the data (location, no size limit, no possibility of user deleting unknowingly)
Prism with an extension could allow to keep the javascript closed source
Any thoughts on this architecture? Why should I / shouldn’t I use it? I'm particularly looking for success/horror stories.
The long version
Notes:
Users are not very computer-literate.
For instance, even superficially
explaining how Gears works is totally
out of the question.
I WILL be held liable if data is loss, even if it’s really the users fault (short of him deleting random directories on his machine)
I can require users to install something on their machine. It doesn’t have to be 100% web-based and/or run in a sandbox
The common solutions to this problem don’t feel adequate somehow. Here is a short analysis of each.
Gears/HTML5:
no control over data, can be deleted
by users without any warning
no
control over location of data (not
uniform across browsers and
platforms)
users need to open application in browser for synchronization to happen; no automatic, behind-the-scene synchronization
different browsers are treated differently, no uniform view of data on a single machine
limited disk space available
synchronization is completely manual, sql-based storage makes this a pain (would be less complicated if sql tables were completely replicated but it’s not so in my case). This is a very complex problem.
my code would be almost completely open sourced (html/js)
Adobe AIR:
some of the above
no server-side includes (!)
can run in the background, but not windowless
manual synchronization
web caching seems complicated
feels like a kludge somehow, I’ve had trouble installing on some machines
My requirements are:
Web-based (must). For a number of
reasons, sharing data between users
for instance.
Offline (must). The application must be fully usable offline (w/ some rare exceptions).
Quick development (must). I’m a single developer going against players with far more business resources.
Closed source (nice to have). Yes, I understand the open source model. However, at this point I don’t want competitors to copy me too easily. Again, they have more resources so they could take my hard work and make it better in less time than I could myself. Obviously, they can still copy me developing their own code -- that is fine.
Horror stories from a CRM product:
If your application is heavily used, storing a complete copy of its data on a user's machine is unfeasible.
If your application features data that can be updated by many users, replication is not simple. If three users with local changes synch, who wins?
In reality, this isn't really what users want. They want real-time access to the most current data from anywhere. We had better luck offering a mobile interface to a single source of truth.
The part about running the local Web server as a service appears unwise. Besides the fact that you are tied to certain operating environments that are available in the client, you are also imposing an additional burden of managing the server, on the end user. Additionally, the local Web server itself cannot be deployed in a Web-based model.
All in all, I am not too thrilled by the prospect of a real "local Web server". There is a certain bias to it, no doubt since I have proposed embedded Web servers that run inside a Web browser as part of my proposal for seamless off-line Web storage. See BITSY 0.5.0 (http://www.oracle.com/technology/tech/feeds/spec/bitsy.html)
I wonder how essential your requirement to prevent data loss at any cost is. What happens when you are offline and the disk crashes? Or there is a loss of device? In general, you want the local cache to be the least farther ahead of the server, but be prepared to tolerate loss of data to the extent that the server is behind the client. This may involve some amount of contractual negotiation or training. In practice this may not be a deal-breaker.
The only way to do this reliably is to offer some sort of "check out and lock" at the record level. When a user is going remote they must check out the records they want to work with. This check out copied the data to a local DB and prevents the record in the central DB from being modified while the record is checked out.
When the roaming user reconnects and check their locked records back in the data is updated on the central DB and unlocked.