Game engine design: Multiplayer and listen servers - C++

My game engine right now consists of a working singleplayer part. I'm now starting to think about how to do the multiplayer part.
I have found that many games don't actually have a separate singleplayer mode: when you play alone, you are really hosting a local server as well, and almost everything runs as if you were in multiplayer (except that the data packets can be passed over an alternate local route for better performance).
My engine would need major refactoring to adapt to this model. There would be three possible modes: dedicated client, dedicated server, and client-server (listen mode).
How often is the listen-server model used in the gaming industry?
What are the (dis)advantages of it?
What other options do I have?

I'll answer this as best I can:
How often is the listen-server model used in the gaming industry?
When it comes to most online games, you'll find that a large majority use a client-server architecture, though not always in the way you'd think. Take any Source game, for instance: most use a standard client-server architecture with a master server (to list available games), where one person hosts a dedicated server and anyone with a client can join it.
However, some games and services, for instance Left 4 Dead, League of Legends, and some Xbox Live games, take a slightly different approach. These use a client-server architecture with a controlling server. The main idea here is that operators create dedicated servers that aren't "running" any game. The controlling server creates a "lobby" of sorts; when the game is started, it adds that lobby to a queue, and when the lobby's turn comes, it selects a matching dedicated server (based on location, speed, availability, and numerous other factors) and assigns the players to that server. Only then does the server actually "run" the game. It's the same idea, but a little simplified from the client's point of view: the client doesn't need to "pick" a server, only join a game and let the controlling server do the work.
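To make that flow concrete, here is a minimal sketch of such a controlling server's matchmaking loop. Everything in it (the Lobby and GameServer types, the region-matching heuristic, assignPlayers) is invented for illustration; real matchmakers also weigh load, ping, and many more factors:

```cpp
// Hypothetical sketch of the controlling server's matchmaking loop described
// above. Lobby, GameServer, and the selection heuristic are invented for
// illustration only.
#include <deque>
#include <optional>
#include <string>
#include <vector>

struct Lobby      { std::vector<int> playerIds; std::string region; };
struct GameServer { std::string address; std::string region; bool busy = false; };

// Naive heuristic: pick the first idle server in the lobby's region.
std::optional<GameServer*> pickServer(std::vector<GameServer>& servers,
                                      const Lobby& lobby) {
    for (auto& s : servers)
        if (!s.busy && s.region == lobby.region)
            return &s;
    return std::nullopt;
}

// Run once per scheduling tick: assign queued lobbies to free servers, in order.
void matchmakingTick(std::deque<Lobby>& queue, std::vector<GameServer>& servers) {
    while (!queue.empty()) {
        auto server = pickServer(servers, queue.front());
        if (!server) break;              // no free server yet; lobby stays queued
        (*server)->busy = true;          // only now does this server "run" a game
        // assignPlayers(**server, queue.front());  // hypothetical: tell clients where to connect
        queue.pop_front();
    }
}
```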
Of course, the biggest client-server model is the MMO model, where one or many servers run a persistent world that handles almost all data and logic. Some of the more famous games using this model are World of Warcraft, EverQuest, and the like.
So where does a listen server fit in here? To be honest, not that well; however, you will still find many games using it. For instance, most Source games allow listen servers to be created, and many Xbox Live games do (it's been a while, but I believe Counter-Strike did, as well as Quake 4, and many others). In general, though, they seem to be frowned upon because of the advantages of the client-server model, which brings us to our next point.
What are the (dis)advantages of it?
First and foremost: performance. In a client-server model, the client handles local changes (such as input, graphics, sounds, etc.) on each cycle of the game. At the end of the cycle, it packages up the relevant data (did the player move? If so, where to? Where are they looking now? At what velocity? Did they shoot? If so, the information on the bullet. Etc.) and sends that to the server for processing. The server takes this data and determines whether everything is valid: is the user moving in a way that indicates hacking (more on that later)? Is the move valid (is anything in the way)? Did the bullet from player 1 hit player 2? And more. Then the server packages this up and sends it to the clients, which update whatever is necessary, such as adjusting health if a player was shot, or kicking a player who is determined to be hacking.
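As a concrete illustration of that per-cycle packaging, here is a rough sketch of what a client-to-server update might look like. The struct fields and wire format are assumptions, not any particular engine's real protocol; production games quantize and delta-compress this data aggressively:

```cpp
// Rough sketch of the per-cycle client update described above. All field
// names are illustrative assumptions.
#include <cstdint>
#include <cstring>

struct ClientUpdate {
    uint32_t tick;                 // client's simulation tick number
    float    posX, posY, posZ;     // did the player move? where to?
    float    yaw, pitch;           // where are they looking now?
    float    velX, velY, velZ;     // velocity
    uint8_t  firedWeapon;          // 1 if they shot this tick (bullet info would follow)
};

// At the end of each cycle, serialize the update and hand it to the socket layer.
void sendUpdate(const ClientUpdate& u /*, Socket& serverSocket */) {
    unsigned char buf[sizeof(ClientUpdate)];
    std::memcpy(buf, &u, sizeof u);   // naive copy; real code fixes byte order explicitly
    // serverSocket.send(buf, sizeof buf);  // hypothetical send call
}
```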
A listen server, however, must deal with all of this at the same time. Since I assume you are familiar with programming, you probably realize how much power a game can rob from a computer, especially a poorly designed one. Add the network processing, security processing, and everything else on top of the client's own game, and you can see where performance would take a serious hit, at least as far as standard processing goes. Furthermore, most dedicated servers run on fast networks, on hardware designed to withstand network traffic. If a listen server's network is slow, the entire game will suffer.
Second: security. As stated earlier, one of the main things a server does is determine whether a player is exploiting the game. You may have seen these systems as PunkBuster, VAC, etc. A very complicated set of rules drives these programs, for instance to tell the difference between a hacker and a merely very good player. It would be very bad for your game if you couldn't catch hackers, but even worse if you took action against a falsely accused one.
A listen server generally cannot handle the client's game, the server processing, and hack detection all at once. In most cases, detectors like PunkBuster are very hard, if not impossible, to run on a listen server: it's hard for them to function correctly without the necessary processing power, since game logic is generally prioritized over security, and if the detector isn't allowed to process for even one frame it may lose the data it needed to convict someone.
Lastly, gameplay. The biggest thing about dedicated servers is that they are persistent, meaning that even if everyone leaves, the server continues to run. This is useful if you have a popular server that sees little activity at night; people can still join whenever they are ready to play and don't have to wait for it to be brought back online.
The main disadvantage of a listen server is that as soon as the hosting client leaves, the game must either be transferred to another player (creating a lull in the game that can last minutes in some cases) or end completely. This is not desirable on a big server, as the host must either stay online (wasting a slot in the server and their computer's power, which could also slow the game) or end the game for everyone.
However, despite these problems, listen servers do have a few advantages.
Easy to set up: Most listen servers are nothing more than hitting "New game" and letting people join. This is easy for people who just want to play with their friends and don't want to hunt for an empty dedicated server or play with strangers.
Good for testing: If you own a dedicated server and wish to change its configuration, it is generally a better idea to test the configuration first. You would either have to back up the dedicated server and go into the changes blindly, with rolling back as the only option if something goes wrong, create a new dedicated server to test them, or just create a simple listen server to test them. As in point 1, these are generally easier to start up and configure. This is especially true because most dedicated servers are not within the administrator's immediate reach (most are rented at a remote location); pushing configuration changes, restart commands, and so on to a remote machine takes much longer than on the machine the administrator is currently using.
Fewer resources: With most dedicated servers, a second connection from the same IP cannot join the server (meaning the client must either host the server or play; they cannot do both). A client who wishes to play on their own server usually needs a second machine to host it, or has to buy or rent a dedicated server to be able to play on it. A listen server requires only one machine, which may be the only one the client has.
In either case, both have advantages and disadvantages, and you need to weigh them against what you're willing to design and implement. From my experience, I believe that if you were to implement a listen server it would get used, if for nothing else than by a few users wanting to play around with friends or test settings.
Lastly:
What other options do I have?
This is an industrial-sized can of worms. In reality, any type of network architecture can be applied to video games. However, from what I've seen, like most internet communication, most boil down to some form of the client-server model.
Please let me know if I didn't answer your question, or if you need something expanded, and I'll see what I can do.

Related

Scalable login/lobby servers for a multiplayer game

I am developing a multiplayer game (client-server model) and I am stuck when it comes to scaling its servers.
I understand that most games never even reach 10,000+ players, and I don't think mine will either.
But if I were lucky enough to get there, I want to design the servers so they cannot become a huge obstacle later.
I have searched a lot for a solution to my problem on the internet, checking GDC talks about it and checking other posts on this website, but none of them seems to solve my specific problem.
My current setup is below and all servers are written in C++ using ENet as my network library.
Game server
This server handles the actual gameplay of the game and requires quite a lot of CPU, with many packets sent between the server and its connected clients.
But this dedicated server is hosted by the players themselves, so I don't have to think about scaling it at all.
Lobby server
This server handles the server list, containing all servers currently up.
All game servers send a UDP packet to this server every 5 seconds to say they are still alive.
This is so the lobby server can keep an updated list of all servers currently online.
All clients send a UDP packet to this server when they want to fetch all servers (which happens only on the server list screen), and the lobby server sends back a list of all servers.
This does not happen very often, and the lobby server is limited to sending 4 servers per second to a client (rather than one huge packet containing all servers).
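For illustration, the 5-second keepalive could look roughly like the following plain-UDP sketch. The address, port, and payload are placeholders, and the asker's real code would go through ENet rather than raw POSIX sockets:

```cpp
// Plain-UDP sketch of the 5-second keepalive a game server might send to the
// lobby server. Address, port, and payload are placeholders.
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <chrono>
#include <thread>

int main() {
    int sock = socket(AF_INET, SOCK_DGRAM, 0);

    sockaddr_in lobby{};
    lobby.sin_family = AF_INET;
    lobby.sin_port   = htons(40000);                      // hypothetical lobby port
    inet_pton(AF_INET, "203.0.113.10", &lobby.sin_addr);  // placeholder address

    const char heartbeat[] = "ALIVE game-server-1";       // placeholder payload
    for (;;) {                                            // runs for the server's lifetime
        sendto(sock, heartbeat, sizeof heartbeat, 0,
               reinterpret_cast<const sockaddr*>(&lobby), sizeof lobby);
        std::this_thread::sleep_for(std::chrono::seconds(5));  // the 5 s interval
    }
}
```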
Login server
This server handles creating accounts, lost passwords, logins, friends and their current game status, private messages to other logged-in players, and player profiles that specify which in-game items they have.
All clients send a UDP packet to this server every 5 seconds to say they are still alive, along with which game they are currently in. The server then sends back the online/offline/in-game statuses on their friend lists.
This is so their friends can keep an updated list of who is online/offline/in-game.
Otherwise it sends messages only on player actions, like creating an account, logging in, changing/resetting a password, adding/removing/ignoring a friend, private messages to friends, etc.
My questions
What I am worried about is that my lobby and login server might not be scalable and that they would have too much traffic on them.
1. Could they in theory be hosted on just a single computer? Or would the traffic be too much for 10,000+ players?
2. If they can be hosted on a single computer, will the servers still cause issues for people who live far away? Would it be better to have lobby and login servers per region of the world in that case?
The downside is that players would not be able to see US servers if they live in Europe, and their accounts and items would not exist on the other servers.
3. This might be far-fetched, but if I rewrote both servers to sit behind a website with a database, and had the client/game server make web requests instead (such as HTTPS, or calling a PHP script with specific headers), would that somehow help solve my problems?
All your problems and questions are solved by a serverless, cloud-based solution such as AWS Lambda or similar. In that case scalability is not your problem; just develop the logic. This will save you a lot of time.
If you would rather build the servers as a single app hosted on your own server, consider using something like Go instead of C++. It was designed exactly for these purposes, i.e. heavily loaded web/network services.
Well, this is C++ and I code in Java, but maybe the logic is useful to you anyway, so I'll tell you how I ended up implementing something similar, but for a casino.
In my case I have two different sockets in the same server program. One socket is TCP and handles all logins, registrations, and payments, while the second socket is UDP and handles the actual game multiple players are playing. You can then group those UDP connections internally (probably in arrays of sockets) to form the lobbies. Doing that, your whole server is just one class that can run on a single PC using two ports (one per socket). However, this does not solve the ping problem for people who live far away.
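Here is a compact sketch of that one-process, two-socket layout, multiplexing a TCP login socket and a UDP game socket with select(); the ports are arbitrary and error handling is omitted:

```cpp
// Sketch of the "one server, two sockets" layout described above.
#include <netinet/in.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <algorithm>

int main() {
    int tcpSock = socket(AF_INET, SOCK_STREAM, 0);   // logins, registration, payments
    int udpSock = socket(AF_INET, SOCK_DGRAM, 0);    // actual gameplay traffic

    sockaddr_in tcpAddr{}, udpAddr{};
    tcpAddr.sin_family = AF_INET; tcpAddr.sin_port = htons(5000);  // arbitrary port
    udpAddr.sin_family = AF_INET; udpAddr.sin_port = htons(5001);  // arbitrary port
    tcpAddr.sin_addr.s_addr = udpAddr.sin_addr.s_addr = htonl(INADDR_ANY);

    bind(tcpSock, reinterpret_cast<sockaddr*>(&tcpAddr), sizeof tcpAddr);
    bind(udpSock, reinterpret_cast<sockaddr*>(&udpAddr), sizeof udpAddr);
    listen(tcpSock, 16);

    for (;;) {
        fd_set fds;
        FD_ZERO(&fds);
        FD_SET(tcpSock, &fds);
        FD_SET(udpSock, &fds);
        select(std::max(tcpSock, udpSock) + 1, &fds, nullptr, nullptr, nullptr);

        if (FD_ISSET(tcpSock, &fds)) { /* accept() a login/payment connection */ }
        if (FD_ISSET(udpSock, &fds)) { /* recvfrom() and route to the right lobby */ }
    }
}
```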
If ping is a problem (it isn't in my casino case), you could host your servers per region, but strip the login, registration, and payments out of them and replace that with a connection to a central server (this central server should be TCP, and you could also implement an HTTPS socket so your webpage can connect to it to create accounts or take payments directly from the browser).
Sorry to mess up your life even more, but I hope it helps.

Is REST a good solution if you have lots of requests?

I want people to control my Arduino robot via the internet. It's important that the control reacts very quickly, and a user may send many requests per second.
Let me explain my architecture:
The user connects to a web frontend, where they can use a virtual joystick and buttons. The frontend then sends orders (like "motor1:255", "motor2:0", ...) to an application server (Wildfly).
When a frontend session starts, Wildfly establishes a connection to my computer or smartphone using a socket. The orders are passed to the Arduino over Bluetooth. When a frontend session is no longer active, the socket is closed.
One Wildfly instance should be able to control up to 10 robots, and one robot is controlled by exactly one user. Some developers use a MySQL table and add a row for each incoming order; I don't think that would work in my case.
Is it okay to use REST to send the orders from the frontend to the application server? Is there any other fast and secure way to transport the user input from the frontend to the business logic?
REST, when properly understood and applied, is a solution for problems of long-term evolution and maintenance of your application. It doesn't seem to be your case.
What you probably mean is whether an HTTP API is a good solution when you have lots of requests, and the answer is: it depends. I'd probably look into something like ZeroMQ for doing what you want.
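For a sense of what that might look like, here is a hedged sketch using the cppzmq bindings: the frontend PUSHes small motor commands and this control process PULLs them as they arrive, with no per-request HTTP overhead. The endpoint and message format are made up for illustration:

```cpp
// Hedged sketch of the ZeroMQ suggestion above, using cppzmq's PUSH/PULL
// pattern. Endpoint and payload format are illustrative assumptions.
#include <zmq.hpp>
#include <iostream>
#include <string>

int main() {
    zmq::context_t ctx(1);
    zmq::socket_t sink(ctx, zmq::socket_type::pull);
    sink.bind("tcp://*:5556");                        // hypothetical endpoint

    for (;;) {
        zmq::message_t cmd;
        if (sink.recv(cmd, zmq::recv_flags::none)) {  // blocks until a command arrives
            std::string order(static_cast<char*>(cmd.data()), cmd.size());
            std::cout << "forwarding to robot: " << order << "\n";
            // e.g. parse "motor1:255" and write it to the Bluetooth serial port
        }
    }
}
```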

Debugging network applications and testing for synchronicity?

If I have a server running on my machine, and several clients running on other networks, what are some concepts of testing for synchronicity between them? How would I know when a client goes out-of-sync?
I'm particularly interested in how network programmers in the field of game design do this (or just any continuous network exchange application), where realtime synchronicity would be a commonly vital aspect of success.
I can see how this may be easily achieved on a LAN via side-by-side comparisons on separate machines... but once you branch the scenario out to include clients on foreign networks, I'm just not sure how it can be done without clogging up your messaging system with debug information, thereby effectively changing how the synchronization behaves compared to running without that debug info on the wire.
So what are some ways that people get around this issue?
For example, do they simply induce/simulate latency on the local network before launching to foreign networks, and then hope for the best? I'm hoping there are some more concrete solutions, but this is what I'm doing in the meantime...
When you say synchronized, I believe you are talking about network latency. Meaning, that a client on a local network may get its gaming information sooner than a client on the other side of the country. Correct?
If so, then I'm sure you can look for books or papers that cover this kind of topic, but I can give you at least one way to detect this latency and provide a way to manage it.
To detect latency, your server can use a traceroute-style probe to determine how long data takes to reach each client. A common Linux example can be found here: http://linux.about.com/library/cmd/blcmdl8_traceroute.htm. While the server is handling client data, it can also continuously collect latency statistics and provide them to the clients. For example, the server can tell each client its own network latency and the longest latency among the group of clients playing each other in a game.
The clients can then use the latency differences to determine when they should process the data they receive from the server. For example, a client is told by the server that its network latency is 50 milliseconds and the maximum latency for its group is 300 milliseconds. The client then knows to wait 250 milliseconds before processing game data from the server. That way, each client processes game data from the server at approximately the same time.
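A small sketch of that delay-buffering idea: hold each server message until (max group latency - own latency) has elapsed, so every client applies it at roughly the same wall-clock moment. The types and numbers are illustrative only:

```cpp
// Delay buffer matching the 50 ms / 300 ms example above; all types are
// illustrative assumptions.
#include <chrono>
#include <queue>

using Clock = std::chrono::steady_clock;

struct GameMessage { /* payload */ };
struct Delayed     { Clock::time_point applyAt; GameMessage msg; };

std::queue<Delayed> buffer;
auto myLatency  = std::chrono::milliseconds(50);    // reported to us by the server
auto maxLatency = std::chrono::milliseconds(300);   // slowest client in the group

void onReceive(const GameMessage& m) {
    buffer.push({Clock::now() + (maxLatency - myLatency), m});  // i.e. wait 250 ms
}

void pump() {  // called every frame
    while (!buffer.empty() && buffer.front().applyAt <= Clock::now()) {
        // applyToSimulation(buffer.front().msg);  // hypothetical game-side hook
        buffer.pop();
    }
}
```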
There are many other (and probably better) ways to handle this situation, but that should get you started in the right direction.

Redundancy without central control point?

Is it possible to provide a service to multiple clients whereby, if the server providing this service goes down, another takes its place, without some sort of centralised "control" that detects whether the main server has gone down and redirects the clients to the new server?
Is it possible to do this without a centralised interface/gateway?
In other words, it's a bit like asking: can you design a load balancer without a centralised control point directing clients?
Well, you are not giving much information about the "service" you are asking about, so I'll answer in a generic way.
For the first part of my answer, I'll assume you are talking about a "centralized interface/gateway" involving ip addresses. For this, there's CARP (Common Address Redundancy Protocol), quoted from the wiki:
The Common Address Redundancy Protocol or CARP is a protocol which allows multiple hosts on the same local network to share a set of IP addresses. Its primary purpose is to provide failover redundancy, especially when used with firewalls and routers. In some configurations CARP can also provide load balancing functionality. It is a free, non patent-encumbered alternative to Cisco's HSRP. CARP is mostly implemented in BSD operating systems.
Quoting NetBSD's "Introduction to CARP":
CARP works by allowing a group of hosts on the same network segment to share an IP address. This group of hosts is referred to as a "redundancy group". The redundancy group is assigned an IP address that is shared amongst the group members. Within the group, one host is designated the "master" and the rest as "backups". The master host is the one that currently "holds" the shared IP; it responds to any traffic or ARP requests directed towards it. Each host may belong to more than one redundancy group at a time.
This might solve your question at the network level, by having the slaves take over the IP address in order, without a single point of failure.
Now, for the second part of the answer (the application level): with distributed Erlang, you can have several nodes (a cluster) that give you fault tolerance and redundancy (so you would not use IP addresses here, but a cluster of Erlang nodes instead).
You would have lots of nodes lying around with your distributed application started, and your application resource file would contain an ordered list of nodes where the application can run.
Distributed Erlang controls which node is "the master" and will automagically start and stop your application on the different nodes as they go up and down.
Quoting (as little as possible) from http://www.erlang.org/doc/design_principles/distributed_applications.html:
In a distributed system with several Erlang nodes, there may be a need to control applications in a distributed manner. If the node, where a certain application is running, goes down, the application should be restarted at another node.
The application will be started at the first node, specified by the distributed configuration parameter, which is up and running. The application is started as usual.
For distribution of application control to work properly, the nodes where a distributed application may run must contact each other and negotiate where to start the application.
When started, the node will wait for all nodes specified by sync_nodes_mandatory and sync_nodes_optional to come up. When all nodes have come up, or when all mandatory nodes have come up and the time specified by sync_nodes_timeout has elapsed, all applications will be started. If not all mandatory nodes have come up, the node will terminate.
If the node where the application is running goes down, the application is restarted (after the specified timeout) at the first node, specified by the distributed configuration parameter, which is up and running. This is called a failover.
distributed = [{Application, [Timeout,] NodeDesc}]
If a node is started, which has higher priority according to distributed, than the node where a distributed application is currently running, the application will be restarted at the new node and stopped at the old node. This is called a takeover.
Ok, that was meant as a general overview, since it can be a long topic :)
For the specific details, I highly recommend reading the Distributed OTP Applications chapter of Learn You Some Erlang (and of course the previous link: http://www.erlang.org/doc/design_principles/distributed_applications.html).
Also, your "service" might depend on other external systems like databases, so you should consider fault tolerance and redundancy there, too. The whole architecture needs to be fault tolerance and distributed for "the service" to work in this way.
Hope it helps!
This answer is a general overview to high availability for networked applications, not specific to Erlang. I don't know too much about what is available in the OTP framework yet because I am new to the language.
There are a few different problems here:
Client connection must be moved to the backup machine
The session may contain state data
How to detect a crash
Problem 1 - Moving client connection
This may be solved in many different ways and on different layers of the network architecture. The easiest thing is to code it right into the client, so that when a connection is lost it reconnects to another machine.
If you need network transparency, you may use some technology to sync TCP state between different machines and then reroute all traffic to the new machine, which can be entirely invisible to the client. This is much harder to do than the first suggestion.
I'm sure there are lots of things to do in-between these two.
Problem 2 - State data
You obviously need to transfer the session state from the crashed machine onto the backup machine. This is really hard to do reliably, and you may lose the last few transactions because the crashed machine may not manage to send the last state before the crash. To be really sure about not losing state, you can use a synchronized call like this:
Transaction/message comes from the client into the main machine.
Main machine updates some state.
New state is sent to backup machine.
Backup machine confirms arrival of the new state.
Main machine confirms success to the client.
This may potentially be expensive (or at least not responsive enough) in some scenarios since you depend on the backup machine and the connection to it, including latency, before even confirming anything to the client. To make it perform better you can let the client check with the backup machine upon connection what transactions it received and then resend the lost ones, making it the client's responsibility to queue the work.
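Here is a sketch of that five-step synchronous handoff. The networking calls are trivial stubs; the point is the ordering, where the client is acknowledged only after the backup has confirmed it holds the new state:

```cpp
// Sketch of the five-step synchronous replication flow described above.
// sendToBackup/ackClient/nackClient are invented stubs for illustration.
#include <string>

struct State { /* session data */ };

bool sendToBackup(const State&)     { return true; }  // stub: replicate + await backup's ack
void ackClient(const std::string&)  {}                // stub: confirm success to the client
void nackClient(const std::string&) {}                // stub: report failure so the client retries

void handleTransaction(const std::string& clientId, State& session) {
    // Steps 1-2: the message arrives and the main machine updates its state.
    // mutate(session);  // hypothetical state change
    // Steps 3-4: ship the new state and block for the backup's confirmation.
    if (sendToBackup(session)) {
        ackClient(clientId);   // step 5: only now is the client told it succeeded
    } else {
        nackClient(clientId);  // backup unreachable: do not claim durability
    }
}
```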
Problem 3 - Detecting a crash
This is an interesting problem because a crash is not always well-defined. Did something really crash? Consider a network program that closes the connection between the client and server, but both are still up and connected to the network. Or worse, makes the client disconnect from the server without the server noticing. Here are some questions to think about:
Should the client connect to the backup machine?
What if the main server updates some state and sends it to the backup machine while the backup has the real client connected - will there be a data race?
Can both the main and backup machine be up at the same time or do you need to shut down work on one of them and move all sessions?
Do you need some sort of authority on this matter, some protocol to decide which one is master and which one is slave? Who is that authority? How do you decentralise it?
What if your nodes lose the connection between them but both continue to work as expected (a situation called a network partition)?
See Google's papers "The Chubby lock service" (PDF) and "Paxos Made Live" (PDF) to get an idea.
Briefly, this solution involves using a consensus protocol to elect a master among a group of servers; the master handles all the requests. If the master fails, the protocol is used again to elect the next master.
Also, see gen_leader for an example of leader election, which handles detecting failures and transferring service ownership.

How are Massively Multiplayer Online RPGs built?

How are Massively Multiplayer Online RPG games built?
What server infrastructure are they built on, especially with so many clients connected and communicating in real time?
Do they manage with scripts that execute on page requests, or with installed services that run in the background and manage communication with connected clients?
Do they use other protocols? HTTP does not allow servers to push data to clients.
How do the "engines" work, to centrally process hundreds of conflicting gameplay events?
Thanks for your time.
Many roads lead to Rome, and many architectures lead to MMORPGs.
Here are some general thoughts to your bullet points:
The server infrastructure needs to support the ability to scale out... add additional servers as load increases. This is well-suited to Cloud Computing by the way. I'm currently running a large financial services app that needs to scale up and down depending on time of day and time of year. We use Amazon AWS to almost instantly add and remove virtual servers.
The MMORPGs I'm familiar with probably don't use web services for communication (since those are stateless) but rather a custom server-side program (e.g. a service that listens for TCP and/or UDP messages).
They probably use a custom TCP- and/or UDP-based protocol (look into socket communication).
Most games are segmented into "worlds", limiting the number of players in the same virtual universe to the number of game events that one server (probably with lots of CPUs and lots of memory) can reasonably process. The exact event-processing mechanism depends on the requirements of the game designer, but generally I expect that incoming events go into a priority queue (prioritized by time received and/or time sent, and probably other criteria along the lines of "how bad is it if we ignore this event?").
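A minimal version of such an event queue might look like this; the Event fields and the ordering criteria are assumptions for illustration:

```cpp
// Incoming gameplay events ordered by timestamp, with severity ("how bad is
// it if we ignore this event?") as a tiebreaker. All fields are illustrative.
#include <chrono>
#include <cstdint>
#include <queue>
#include <vector>

struct Event {
    uint64_t timestampMs;  // when the event was sent/received
    int      severity;     // importance if we have to drop or defer it
};

struct EventOrder {
    bool operator()(const Event& a, const Event& b) const {
        if (a.timestampMs != b.timestampMs)
            return a.timestampMs > b.timestampMs;  // earliest first
        return a.severity < b.severity;            // then most severe first
    }
};

std::priority_queue<Event, std::vector<Event>, EventOrder> pending;

// Drain as many events as the frame budget allows; the rest wait their turn.
void processEvents(std::chrono::milliseconds budget) {
    auto deadline = std::chrono::steady_clock::now() + budget;
    while (!pending.empty() && std::chrono::steady_clock::now() < deadline) {
        Event e = pending.top();
        pending.pop();
        // applyGameRules(e);  // hypothetical: validate and simulate the event
        (void)e;
    }
}
```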
This is a very large subject overall. I would suggest you check over on Amazon.com for books covering this topic.
What server infrastructure are they built on? especially with so many clients connected and communicating in real time.
I'd guess the servers will be running on Linux, BSD or Solaris almost 99% of the time.
Do they manage with scripts that execute on page requests? or installed services that run in the background and manage communication with connected clients?
The server your client talks to will be a machine running a daemon or service that sits idle listening for connections. For instances (dungeons), a new process is usually launched for each group, which means there is a dispatcher service somewhere managing this (analogous to a thread pool).
Do they use other protocols? because HTTP does not allow servers to push data to clients.
UDP is the protocol used. It's fast because it makes no guarantee that a packet will be received; you don't care if a bit of latency causes the client to briefly lose its world position.
How do the "engines" work, to centrally process hundreds of conflicting gameplay events?
Most MMOs have zones, which limit this to a certain number of people. For those that do have hundreds of people in one area, there is usually high latency; the server has to deal with hundreds of spells being sent its way and must calculate damage amounts for each one. For the big five MMOs I imagine there are teams of 10-20 very intelligent, mathematically gifted developers working on this daily, and there isn't an MMO out there that has got it right yet; most break down past 100 players.
--
Have a look for Wowemu (there's no official site and I don't want to link to a dodgy one). It is based on ApireCore, which is an MMO simulator, or basically a reverse engineering of the WoW protocol. This is what the private WoW servers run on. From what I recall, Wowemu is built on:
MySQL
Python
However ApireCore is C++.
The backend for Wowemu is amazingly simple (I last tried it in 2005, however) and probably a complete oversimplification of the database schema, but it does give you a good idea of what's involved.
Because MMOs by and large require the resources of a business to develop and deploy, at which point they are valuable company IP, there isn't a ton of publicly available information about implementations.
One thing that is fairly certain is that since MMOs by and large use a custom client and 3D renderer, they don't use HTTP, because they aren't web browsers. Online games have their own protocols built on top of TCP/IP or UDP.
The game simulations themselves will be built using the same techniques as any networked 3D game, so you can look towards resources for that problem domain to learn more.
For the big daddy, World of Warcraft, we can guess that their database is Oracle because Blizzard's job listings frequently cite Oracle experience as a requirement/plus. They use Lua for user interface scripting. C++ and OpenGL (for Mac) and Direct3D (for PC) can be assumed as the implementation languages for the game clients because that's what games are made with.
One company that is open about discussing their implementation is CCP, creators of EVE Online. They have published a number of presentations and articles about EVE's infrastructure, and it is a particularly interesting case because they use Stackless Python for a lot of EVE's implementation.
http://www.disinterest.org/resource/PyCon2006-StacklessInEve.wmv
http://us.pycon.org/2009/conference/schedule/event/91/
There was also a recent Game Developer Magazine article on Eve's architecture:
https://store.cmpgame.com/product/3359/Game-Developer-June%7B47%7DJuly-2009-Issue---Digital-Edition
The Software Engineering radio podcast had an episode with Jim Purbrick about Second Life which discusses servers, worlds, scaling and other MMORPG internals.
Traditionally MMOs have been based on C++ server applications running on Linux communicating with a database for back end storage and fat client applications using OpenGL or DirectX.
In many cases the client and server embed a scripting engine which allows behaviours to be defined in a higher level language. EVE is notable in that it is mostly implemented in Python and runs on top of Stackless rather than being mostly C++ with some high level scripts.
Generally the server sits in a loop, reading requests from connected clients, processing them to enforce the game mechanics, and then sending updates out to the clients. UDP can be used to minimize latency and avoid retransmission of stale data, but as RPGs generally don't employ twitch gameplay, TCP/IP is normally the better choice. Comet or BOSH can be used to allow bi-directional communication over HTTP for web-based MMOs, and WebSockets will soon be a good option there.
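That read-process-broadcast loop, reduced to a sketch with invented helper names; a real server budgets each phase against a fixed tick rate:

```cpp
// Minimal shape of the server loop described above. The three helpers are
// stubs invented for illustration.
#include <chrono>
#include <thread>
#include <vector>

struct Request { int clientId; /* payload */ };

std::vector<Request> readPendingRequests() { return {}; }  // stub: drain client sockets
void applyGameMechanics(const Request&)    {}              // stub: validate + simulate
void broadcastWorldUpdates()               {}              // stub: send state deltas out

int main() {
    using namespace std::chrono;
    const auto tick = milliseconds(100);  // e.g. a 10 Hz simulation rate
    for (;;) {
        auto start = steady_clock::now();
        for (const Request& r : readPendingRequests())
            applyGameMechanics(r);        // enforce the game mechanics
        broadcastWorldUpdates();          // push updates to connected clients
        std::this_thread::sleep_until(start + tick);
    }
}
```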
If I were building a new MMO today I'd probably use XMPP, BOSH and build the client in JavaScript as that would allow it to work without a fat client download and interoperate with XMPP based IM and voice systems (like gchat). Once WebGL is widely supported this would even allow browser based 3D virtual worlds.
Because the environments are too large to simulate in a single process, they are normally split up geographically between processes, each of which simulates a small area of the world. Often there is an optimal population for a world, so multiple copies (shards) are run, each used by a different set of people.
There's a good presentation about the Second Life architecture by Ian Wilkes who was the Director of Operations here: http://www.infoq.com/presentations/Second-Life-Ian-Wilkes
Most of my talks on Second Life technology are linked to from my blog at: http://jimpurbrick.com
Take a look at Erlang. It's a concurrent programming language and runtime system, and was designed to support distributed, fault-tolerant, soft-real-time, non-stop applications.