Coordinating master and worker machines - google-cloud-platform

If this question seems basic to more IT-oriented folks, then I apologize in advance. I'm not sure it falls under the ServerFault domain, but correct me if I'm wrong...
This question concerns some backend operations of a web application, hosted in a cloud environment (Google). I'm trying to assess options for coordinating our various virtual machines. I'll describe what we currently have, and those "in the know" can maybe suggest a better way (I hope!).
In our application there are a number of different analyses that can be run, each of which has different hardware requirements. They are typically very large, and we do NOT want these to be run on the application server (referred to as app_server below).
To that end, when we start one of these analyses, app_server will start a new VM (call this VM1). For some of these analyses we only need VM1; it performs the analysis and sends an HTTP POST request back to app_server to let it know the work is complete.
For other analyses, VM1 will in turn launch a number of worker machines (worker-1,...,worker-N), which run very similar tasks in parallel. Once the task on a single worker (e.g. worker-K) is complete, it should communicate back to VM1: "hey, this is worker-K and I am done!". Once all the workers (worker-1,...,worker-N) are complete, VM1 does some merging operations and finally communicates back to app_server.
My question is:
Aside from starting a web server on VM1 which listens for POST requests from the workers (worker-1,..), what are the potential mechanisms for having those workers communicate back to VM1? Are there non-webserver ways to listen for HTTP POST requests and do something with the request?
I should note that all of my VMs are operating within the same region/zone on GCE, so they are able to communicate via internal IPs without any special firewall rules, etc. (e.g. running $ ping <other VM's IP addr> works). I obviously do not want any of these VMs (VM1, worker-1, ..., worker-N) to be exposed to the internet.
Thanks!

Sounds like the right use-case for Cloud Pub/Sub. https://cloud.google.com/pubsub
In your case, the workers would publish events to a Pub/Sub topic and VM1 would subscribe to them.
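A rough sketch of what that could look like with the Python client library (google-cloud-pubsub); the project, topic, and subscription names below are placeholders, and you would create the topic/subscription ahead of time via gcloud or the console:

```python
# Minimal Cloud Pub/Sub sketch (pip install google-cloud-pubsub).
# Project, topic, and subscription names are placeholders; error handling omitted.
import time

from google.cloud import pubsub_v1

PROJECT = "my-project"            # hypothetical project ID
TOPIC = "worker-done"             # hypothetical topic name
SUBSCRIPTION = "vm1-worker-done"  # hypothetical subscription name


def report_done(worker_id: str) -> None:
    """Run on a worker: publish a 'done' event when its task finishes."""
    publisher = pubsub_v1.PublisherClient()
    topic_path = publisher.topic_path(PROJECT, TOPIC)
    # Attributes must be strings; the payload itself can stay trivial.
    publisher.publish(topic_path, b"done", worker=worker_id).result()


def wait_for_workers(expected: int) -> None:
    """Run on VM1: block until all N workers have reported in."""
    subscriber = pubsub_v1.SubscriberClient()
    sub_path = subscriber.subscription_path(PROJECT, SUBSCRIPTION)
    finished = set()

    def callback(message: pubsub_v1.subscriber.message.Message) -> None:
        finished.add(message.attributes["worker"])
        message.ack()

    streaming_pull = subscriber.subscribe(sub_path, callback=callback)
    try:
        while len(finished) < expected:
            time.sleep(1)
    finally:
        streaming_pull.cancel()
        # VM1 can now run its merge step and notify app_server.
```

This keeps all traffic on Google's internal infrastructure, so none of the VMs need to be exposed to the internet or run a web server just to receive "done" notifications.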
Hard to tell from your high-level overview whether it is a match, but take a look at Cloud Composer too: https://cloud.google.com/composer/

Related

How to design an event listener app to only process events once and also be HA and scalable?

I have a situation where I have a Node.js app that runs as an event listener. This app listens for events external to my application over a websocket.
I need each incoming event to be processed only once by my Node.js app.
However, it's also crucial that this app can auto-scale up/down when needed and is highly available, so that it doesn't become a bottleneck.
Usually, when it comes to scaling and HA, the first thing that comes to mind is to run a few instances of it behind a load balancer, or to run multiple containers on something like ECS. Doing so introduces multiple instances of the Node.js app, which also means each event from the websocket would be processed more than once - once by every instance/container that receives it.
What would be a good solution and design to tackle such a problem?
I'm not sure I fully understand the situation here, but I think what you are saying is that you have a socket server that emits to other services, and that a single instance, even with dedicated resources, is subject to bottlenecks.
Assuming that is in line with the question, what you probably want to look at (not sure if you're using socket.io or not) is the Socket.IO Redis adapter. It essentially uses Redis to share the socket state, so you can cluster your socket server without it sending duplicates or missing users.
On your question about scale, you would for sure want to use containers for this. We actually use DigitalOcean Apps as an easy way to deploy our containers without having to manage Kubernetes and Docker images; the only downside right now is no auto-scaling, but scaling out is just a click of a button, and with alerts set up we know when to scale up or down.
With this setup, we have our socket servers running against a managed Redis instance; when we need more socket servers we just turn the count up and get more throughput.
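For the "process each event exactly once" requirement specifically, a common complementary approach (not necessarily what the answer above had in mind) is to have every instance attempt an atomic claim in a shared Redis and only process the event if the claim succeeds. A rough Python sketch, assuming each event carries a unique ID and the Redis address is a placeholder:

```python
# Sketch: deduplicate events across scaled-out instances with an atomic
# Redis claim. Assumes every event has a unique "id" field and that a
# shared Redis instance is reachable at REDIS_URL (placeholder).
import redis

REDIS_URL = "redis://redis.internal:6379/0"  # placeholder address
r = redis.Redis.from_url(REDIS_URL)

CLAIM_TTL_SECONDS = 24 * 3600  # keep claims long enough to cover retries


def handle_event(event: dict) -> None:
    claim_key = f"event-claim:{event['id']}"
    # SET ... NX succeeds for exactly one instance; the others skip the event.
    if r.set(claim_key, "claimed", nx=True, ex=CLAIM_TTL_SECONDS):
        process(event)  # only the claiming instance does the work
    # else: another instance already claimed this event


def process(event: dict) -> None:
    print("processing", event["id"])
```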

Django and Celery Confusion

After reading a lot of blog posts, I decided to switch from crontab to Celery for my mid-scale Django project. There are a few things I didn't understand:
1- I'm planning to start a micro EC2 instance dedicated to RabbitMQ. Would this be sufficient for small-to-medium task loads (such as dispatching periodic e-mails via Amazon SES)?
2- Where does the computation of tasks occur: on the Django server or on the RabbitMQ server (assuming RabbitMQ is on a separate server)?
3- When I need to grow my system and have 2 or more application servers behind a load balancer, do these Celery machines need to connect to the same RabbitMQ vhost? Assume the application servers are carbon copies, the tasks are the same, and everything is in sync at the database level.
1- I don't know the answer to that, but you can definitely configure it to be suitable (e.g. use -c1 for a single-process worker to avoid using much memory, or the eventlet/gevent pools); see also the --autoscale option. The choice of broker transport also matters here: the ones that are not polling (RabbitMQ/Redis/Beanstalk) are more CPU-efficient.
2- Computation happens on the workers; the broker is only responsible for accepting, routing and delivering messages (and persisting messages to disk when necessary).
3- Additional workers should indeed connect to the same virtual host. You would only use separate virtual hosts if you wanted applications to have separate message buses.
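To make that last point concrete, here is a minimal sketch of the Celery app both application servers would share; the broker hostname, credentials and vhost are placeholders:

```python
# celery_app.py -- deployed identically on every application server.
# Both servers point at the same broker URL (host, credentials and vhost
# below are placeholders), so their workers consume from the same queues.
from celery import Celery

app = Celery(
    "myproject",
    broker="amqp://user:password@rabbitmq.internal:5672/myapp_vhost",
)


@app.task
def send_newsletter(user_id: int) -> None:
    # Runs on whichever worker picks the message up -- never on the
    # RabbitMQ machine itself, which only routes the message.
    ...
```

Running `celery -A celery_app worker` on each machine then gives you one shared message bus with as many workers as you need.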

Redundancy without central control point?

Is it possible to provide a service to multiple clients whereby, if the server providing this service goes down, another one takes its place - without some sort of centralised "control" which detects whether the main server has gone down and redirects the clients to the new server?
Is it possible to do without having a centralised interface/gateway?
In other words, it's a bit like asking: can you design a load balancer without having a centralised control point to direct clients?
Well, you are not giving much information about the "service" you are asking about, so I'll answer in a generic way.
For the first part of my answer, I'll assume you are talking about a "centralized interface/gateway" involving ip addresses. For this, there's CARP (Common Address Redundancy Protocol), quoted from the wiki:
The Common Address Redundancy Protocol or CARP is a protocol which allows multiple hosts on the same local network to share a set of IP addresses. Its primary purpose is to provide failover redundancy, especially when used with firewalls and routers. In some configurations CARP can also provide load balancing functionality. It is a free, non patent-encumbered alternative to Cisco's HSRP. CARP is mostly implemented in BSD operating systems.
Quoting NetBSD's "Introduction to CARP":
CARP works by allowing a group of hosts on the same network segment to share an IP address. This group of hosts is referred to as a "redundancy group". The redundancy group is assigned an IP address that is shared amongst the group members. Within the group, one host is designated the "master" and the rest as "backups". The master host is the one that currently "holds" the shared IP; it responds to any traffic or ARP requests directed towards it. Each host may belong to more than one redundancy group at a time.
This might solve your question at the network level, by having the backup hosts take over the IP address in turn, without a single point of failure.
Now, for the second part of the answer (the application level): with distributed Erlang, you can have several nodes (a cluster) that give you fault tolerance and redundancy (so you would not use IP addresses here, but "distributed Erlang" - a cluster of Erlang nodes - instead).
You would have a number of nodes running with your distributed application started, and your application resource file would contain an ordered list of the nodes where the application can run.
Distributed Erlang controls which of the nodes is "the master" and will automagically start and stop your application on the different nodes as they go up and down.
Quoting (as little as possible) from http://www.erlang.org/doc/design_principles/distributed_applications.html:
In a distributed system with several Erlang nodes, there may be a need to control applications in a distributed manner. If the node, where a certain application is running, goes down, the application should be restarted at another node.
The application will be started at the first node, specified by the distributed configuration parameter, which is up and running. The application is started as usual.
For distribution of application control to work properly, the nodes where a distributed application may run must contact each other and negotiate where to start the application.
When started, the node will wait for all nodes specified by sync_nodes_mandatory and sync_nodes_optional to come up. When all nodes have come up, or when all mandatory nodes have come up and the time specified by sync_nodes_timeout has elapsed, all applications will be started. If not all mandatory nodes have come up, the node will terminate.
If the node where the application is running goes down, the application is restarted (after the specified timeout) at the first node, specified by the distributed configuration parameter, which is up and running. This is called a failover.
distributed = [{Application, [Timeout,] NodeDesc}]
If a node is started, which has higher priority according to distributed, than the node where a distributed application is currently running, the application will be restarted at the new node and stopped at the old node. This is called a takeover.
Ok, that was meant as a general overview, since it can be a long topic :)
For the specific details, it is highly recommended to read the Distributed OTP Applications chapter of Learn You Some Erlang (and of course the previous link: http://www.erlang.org/doc/design_principles/distributed_applications.html).
Also, your "service" might depend on other external systems like databases, so you should consider fault tolerance and redundancy there, too. The whole architecture needs to be fault tolerant and distributed for "the service" to work in this way.
Hope it helps!
This answer is a general overview of high availability for networked applications, not specific to Erlang. I don't know much about what is available in the OTP framework yet because I am new to the language.
There are a few different problems here:
Client connection must be moved to the backup machine
The session may contain state data
How to detect a crash
Problem 1 - Moving client connection
This may be solved in many different ways and at different layers of the network architecture. The easiest thing is to code it right into the client, so that when a connection is lost it reconnects to another machine.
If you need network transparency, you may use some technology to sync TCP state between different machines and then reroute all traffic to the new machine, which may be entirely invisible to the client. This is much harder to do than the first suggestion.
I'm sure there are lots of things to do in-between these two.
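To illustrate the first (client-side) option, here is a rough Python sketch of a client that simply walks a list of known servers and reconnects to the next one when its connection drops; the hostnames and port are placeholders, and real code would add backoff and jitter:

```python
# Sketch: client-side failover by reconnecting to the next known server.
import socket
import time

SERVERS = [("primary.internal", 9000), ("backup.internal", 9000)]  # placeholders


def run_client() -> None:
    while True:
        for host, port in SERVERS:
            try:
                with socket.create_connection((host, port), timeout=5) as conn:
                    talk(conn)  # returns only when the connection drops
            except OSError:
                pass            # this server is unreachable; try the next one
        time.sleep(1)           # brief pause before another full pass


def talk(conn: socket.socket) -> None:
    while True:
        data = conn.recv(4096)
        if not data:
            return              # server closed the connection
        # ... handle data ...
```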
Problem 2 - State data
You obviously need to transfer the session state from the crashed machine onto the backup machine. This is really hard to do reliably, and you may lose the last few transactions because the crashed machine may not be able to send the last state before the crash. You can use a synchronized call like this to be really sure about not losing state:
1. A transaction/message comes from the client into the main machine.
2. The main machine updates some state.
3. The new state is sent to the backup machine.
4. The backup machine confirms arrival of the new state.
5. The main machine confirms success to the client.
This may potentially be expensive (or at least not responsive enough) in some scenarios, since you depend on the backup machine and the connection to it, including its latency, before confirming anything to the client. To make it perform better, you can let the client check with the backup machine upon connection which transactions it received, and then resend the lost ones, making it the client's responsibility to queue the work.
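A compressed Python sketch of that handshake; replicate_to_backup() and reply() stand in for real network calls and are purely illustrative:

```python
# Sketch of "confirm to the client only after the backup has the state".
state: dict = {}


def handle_transaction(client, txn: dict) -> None:
    new_state = {**state, **txn}            # 2. main machine updates its state
    if not replicate_to_backup(new_state):  # 3./4. backup must confirm first
        reply(client, "retry later")        #      otherwise never ack the client
        return
    state.update(new_state)
    reply(client, "ok")                     # 5. confirm success to the client


def replicate_to_backup(new_state: dict) -> bool:
    # Placeholder for a synchronous RPC/HTTP call; return True only if the
    # backup machine acknowledged that it persisted new_state.
    return True


def reply(client, message: str) -> None:
    print(f"-> {client}: {message}")        # placeholder for the real reply
```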
Problem 3 - Detecting a crash
This is an interesting problem because a crash is not always well-defined. Did something really crash? Consider a network problem that closes the connection between the client and server while both are still up and connected to the network. Or worse, one that makes the client disconnect from the server without the server noticing. Here are some questions to think about:
Should the client connect to the backup machine?
What if the main server updates some state and sends it to the backup machine while the backup has the real client connected - will there be a data race?
Can both the main and backup machine be up at the same time, or do you need to shut down work on one of them and move all sessions?
Do you need some sort of authority on this matter, some protocol to decide which one is master and which one is slave? Who is that authority? How do you decentralise it?
What if your nodes lose their connection to each other but both continue to work as expected (this is called a network partition)?
See Google's paper "Chubby lock server" (PDF) and "Paxos made live" (PDF) to get an idea.
Briefly, this solution involves using a consensus protocol to elect a master among a group of servers, and that master handles all the requests. If the master fails, the protocol is used again to elect the next master.
Also, see gen_leader for an example in leader election which works with detecting failures and transferring service ownership.
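This is not a substitute for Paxos or gen_leader, but to get a feel for the "one master, others stand by" idea, here is a toy lease-based sketch in Python on top of a shared Redis instance; the key name and TTL are arbitrary, and the renewal is deliberately kept simple (it is not a real consensus protocol):

```python
# Toy lease-based leader election on Redis; NOT a consensus protocol,
# just an illustration of one master holding a lease while others wait.
import time
import uuid

import redis

r = redis.Redis()                 # assumes a shared Redis instance
NODE_ID = str(uuid.uuid4())
LEASE_KEY = "service:leader"      # arbitrary key name
LEASE_TTL = 10                    # seconds; the leader must keep renewing


def try_to_lead() -> bool:
    # Only one node can set the key; it becomes leader until the lease expires.
    return bool(r.set(LEASE_KEY, NODE_ID, nx=True, ex=LEASE_TTL))


def renew_lease() -> bool:
    # Renew only if we still own the lease (check-and-renew kept simple here).
    if r.get(LEASE_KEY) == NODE_ID.encode():
        r.expire(LEASE_KEY, LEASE_TTL)
        return True
    return False


while True:
    if try_to_lead() or renew_lease():
        pass  # we are the master: serve requests / run the service here
    else:
        pass  # we are a backup: stay idle, ready to take over
    time.sleep(LEASE_TTL / 3)
```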

Move to 2 Django physical servers (front and backend) from a single production server?

I currently have a growing Django production server that runs all of the front-end and backend services. I could keep growing that server larger and larger, but instead I want to leave that main server as my backend server and create multiple front-end servers that would run apache/nginx and remotely connect to the main production backend server.
I'm using slicehost now, so I don't think I can benefit from having the multiple servers run on an intranet. How do I do this?
The first step in scaling your server is usually to separate out the database server. I'm assuming this is all you meant by "backend services", since you haven't given any more details.
All this needs is a change to your settings file: change DATABASE_HOST from localhost to the IP of your new database server.
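(DATABASE_HOST is the setting from older Django versions; on current Django the same change lives in the DATABASES dict. The address and credentials below are placeholders.)

```python
# settings.py on the web server(s): point Django at the separate DB machine.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": "myapp",
        "USER": "myapp",
        "PASSWORD": "change-me",
        "HOST": "10.0.0.5",   # internal IP of the new database server
        "PORT": "5432",
    }
}
```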
If your site is heavy on static content, creating a separate media server could help. You may even look into a CDN.
The first step usually is to separate the server running the actual Python code from the database server. Any background jobs that do processing would probably run on the database server. I assume that when you say front-end server, you actually mean a server running Python code.
Now, as every request will have to do a number of database queries, latency between the webserver and the database server is very important. I don't know if Slicehost has some feature that allows you to create two virtual machines that are "close" in terms of network latency (a quick Google search did not find anything). They seem like nice guys, so maybe you could ask them if they have such a service or could make an exception.
Anyway, once you do have two machines on Slicehost, you can check the latency between them by simply pinging one from the other. Once you have the result you will probably know whether this is feasible at all.
Further steps depend on your application. If it is media-heavy, then maybe using a separate media server would make sense. Otherwise the normal step is to add more web servers.
--
As a side note, I personally think it makes more sense to invest in real dedicated servers with dedicated network equipment for this kind of setup. This of course depends on your budget.
I would also suggest looking into Amazon EC2, where you can provision servers that are magically close to each other.

High number of persistent connections

I'm setting up a project and one of the main questions is how to implement a simple message queueing system (something along the lines of a messenger chat system). I would like to avoid polling, but there will most likely be a lot of concurrent connections (tens of thousands). These will be HTTP+SSL connections, started from an application, not a browser.
One solution I found would be DNS load balancing: distributing these persistent connections across a bunch of nginx webservers.
What do you think? Any other possible solutions?
For load balancing, keeping the application server stateless will open up the field significantly. Once you've got that, you're free to use almost any generic load balancer, from protocol-specific HTTP load balancers to generic TCP-level load balancers.
Keep it stateless, the rest will be trivial in comparison.
If you are planning on using web services (XML message passing), you can use gSOAP, which includes a sample web server application that uses thread pools. I've run a server using this and MySQL (for persistent state). I agree with Ryan on reducing/eliminating the statefulness of the application.
DNS load balancing will allow you to distribute queries between multiple IP addresses, which could be multiple servers. Keep in mind that your clients could get different servers from one request to another, so your application can't rely on local state management. Your application will have to store its state in a centralized location such as a database.
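To make "store its state in a centralized location" concrete, here is a rough Python sketch where each request carries a session ID and the server keeps the associated state in a shared Redis instance instead of in process memory; the Redis address and key naming are placeholders, and a database table would work the same way:

```python
# Sketch: externalize per-session state so any server behind the balancer
# can handle the next request.
import json

import redis

r = redis.Redis.from_url("redis://redis.internal:6379/0")  # placeholder address


def load_session(session_id: str) -> dict:
    raw = r.get(f"session:{session_id}")
    return json.loads(raw) if raw else {}


def save_session(session_id: str, session: dict) -> None:
    r.set(f"session:{session_id}", json.dumps(session), ex=3600)  # 1h TTL


def handle_request(session_id: str, message: dict) -> dict:
    session = load_session(session_id)   # works the same on any server
    session["last_message"] = message    # ... application logic ...
    save_session(session_id, session)
    return {"ok": True}
```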
Have you considered peer-to-peer? The state of the art in punching through firewalls is actually very effective, especially since you're running your own client software in each instance and you have servers to start the connection.
More work, but significantly less server resources.
Also, write your own server software - make sure it can handle a lot of connections and is extraordinarily lightweight, and you should be able to handle thousands of connections per server before you need load balancing.
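In that spirit, here is a bare-bones sketch of a single-process server built on Python's asyncio, which can comfortably hold many thousands of mostly-idle persistent connections on one box; the port and the line-based framing are arbitrary, and TLS/HTTP handling is left out:

```python
# Bare-bones asyncio server: one process, one event loop, many idle connections.
import asyncio


async def handle_client(reader: asyncio.StreamReader,
                        writer: asyncio.StreamWriter) -> None:
    try:
        while True:
            line = await reader.readline()  # wait without burning a thread
            if not line:
                break                       # client disconnected
            writer.write(b"ack: " + line)   # echo back a trivial response
            await writer.drain()
    finally:
        writer.close()
        await writer.wait_closed()


async def main() -> None:
    server = await asyncio.start_server(handle_client, host="0.0.0.0", port=8443)
    async with server:
        await server.serve_forever()


if __name__ == "__main__":
    asyncio.run(main())
```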
-Adam