Appropriate architecture for event logging in a game - C++

I'm trying to modify a game engine so it records events (like key presses) and stores them in a MySQL database on a remote server. The game engine is written in C++, and I currently have a straightforward architecture that uses mysql++ to INSERT records directly into the appropriate databases.
Unfortunately, there's a very large overhead when connecting to the MySQL server, and the game stops for a significant amount of time. Pushing a batch of X seconds' worth of events to the server causes a significant delay in gameplay (60 seconds' worth of events can take 12 seconds to synchronise). There are also apparently security concerns with leaving the MySQL port publicly accessible.
I was considering an alternative: instead, send commands to a server process, which can interact with the database in its own time.
Here the game would only send the necessary information (e.g. the table to update and the data to insert). I'm not sure whether the speed increase would be sufficient, or what system would be appropriate for managing the commands sent from the game.
Someone else suggested Log4j, but obviously I need a C++ solution. Is there an appropriate existing framework for accomplishing what I want?

Most applications gathering user-interface interaction data (in your case keystrokes) put it into a local file of some sort.
Then at an appropriate time (for example at the end of the game, or the beginning of another game), they POST that file, often in compressed form, to a publicly accessible web server. The software on the web server decompresses the data and loads it into the analytics system (the MySQL server in your case) for processing.
So, I suggest the following:
Stop making your MySQL server's port available to people you don't know and trust.
Get your game to gather keystrokes locally somehow.
Have it upload that data in large batches when your game is not in real-time mode.
Write a web service to receive and interpret these files.
That way you'll build a more secure analytics system and a more responsive game.
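To make that concrete, here is a minimal sketch of the buffer-locally, upload-later pattern in C++, assuming libcurl for the HTTP POST (call curl_global_init once at startup). The EventLogger class, the tab-separated event format, and the endpoint URL are illustrative assumptions, not part of the original setup:

```cpp
#include <chrono>
#include <sstream>
#include <string>
#include <vector>
#include <curl/curl.h>

class EventLogger {
public:
    // Called from the game loop: appends to memory, no network I/O.
    void record(const std::string& event) {
        using namespace std::chrono;
        auto ms = duration_cast<milliseconds>(
            steady_clock::now().time_since_epoch()).count();
        std::ostringstream line;
        line << ms << '\t' << event << '\n';
        buffer_.push_back(line.str());
    }

    // Called when the game is not in real-time mode (menu, level end):
    // POSTs all buffered events to the web service in a single request.
    bool flush(const std::string& url) {
        std::string payload;
        for (const auto& line : buffer_) payload += line;

        CURL* curl = curl_easy_init();
        if (!curl) return false;
        curl_easy_setopt(curl, CURLOPT_URL, url.c_str());
        curl_easy_setopt(curl, CURLOPT_POSTFIELDS, payload.c_str());
        curl_easy_setopt(curl, CURLOPT_POSTFIELDSIZE,
                         static_cast<long>(payload.size()));
        CURLcode res = curl_easy_perform(curl);
        curl_easy_cleanup(curl);

        if (res == CURLE_OK) buffer_.clear();  // keep events on failure
        return res == CURLE_OK;
    }

private:
    std::vector<std::string> buffer_;
};
```

record() stays cheap enough to call every frame; flush() is the only call that touches the network. Compressing the payload before the POST, as mentioned above, is a straightforward addition.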

Related

Django receiving data from UDP and storing it in a DB, only if a condition coming from websockets is set to true

I'm trying to implement a rather complex architecture for a desktop application (not to be distributed, so it's OK to use technology usually adopted for servers - please don't tell me to use Electron or .NET).
It basically must store data coming from a UDP stream (with new data arriving at ~90 Hz). The application should also open a websocket server and accept new clients, specifically from a tablet. The tablet user should be able to set a flag, enabling or disabling data storage.
This is a very simple block diagram of the system.
I already used Django before, but for more standard usage (CMS, REST APIs, etc). After some research, I found some tools I could use to build the system:
1 - Celery, which to my understanding enables running asynchronous tasks (I guess I could use it to store the data coming from the UDP stream, maybe after accumulating a hundred values or so)
2 - Django Channels, which should help me with the websocket communication
3 - Twisted, to receive the UDP messages.
What confuses me is how to integrate these components and exchange data between them. Twisted looks like a completely separate server, so how can I run a Celery task that takes the data as input and writes it to a Django model?
How should I implement the flag coming from websockets? A global variable?
Any help appreciated!

What is the modern programming standard for synchronizing data between a web service and a client?

The question is a little general, so to help narrow the focus, I'll share my current setup that is motivating this question. I have a LAMP web service running a RESTful API. We have two client implementations: one browser-based JavaScript client (local storage store) and one iOS-based client (Core Data store). Obviously these two clients store data very differently, but the data itself needs to be kept in two-way sync with the remote server as often as possible.
Currently, our "sync" process is a little dumb (as in, non-smart). Conceptually, it looks like:
Client periodically asks the server for ALL of the most-recent data.
Server sends down the remote data, which overwrites the current set of local data in the client's store.
Any local creates/updates/deletes after this point are treated as gold, and immediately sent to the server.
The data itself is stored relationally, and updated occasionally by client users. The clients in my specific case don't care too much about the relationships themselves (which is why we can get away with local storage in the browser client for now).
Obviously this isn't true synchronization. I want to move to a system where, conceptually, a "diff" of the most recent changes are sent to the server periodically, and the server sends back a "diff" of the most recent changes it knows about. It seems very difficult to get to this point, but maybe I just don't understand the problem very well.
REST feels like a good start, but REST only describes the way two data stores talk to each other, not how the data itself is synchronized between them. (That sync process is left up to the implementer of each store.) What is the best way to implement this process? Are there modern programming design patterns that can inform a specific solution to this problem? I'm mostly interested in a general (technology-agnostic) approach if possible... but specific frameworks would be useful to look at too, if they exist.
Multi-master replication is always (and will always be) difficult and bespoke, because how conflicts are handled will be specific to your application.
IMO a more robust approach is to use master-slave replication, with your web service as the master and the clients as slaves. To keep the clients in sync, use an archived Atom feed of the changes (see event sourcing) as per RFC 5005. This is the closest you'll get to a modern standard for this type of replication, and it's RESTful.
When the clients are online, they do not update their replica directly; instead they send commands to the server and have their replica updated via the Atom feed.
When the clients are offline, things get difficult. Your client will need to have a model of how your web service behaves. It will need an offline copy of the replica, which should be copied on write from the online replica (the online replica is the one that is updated by the Atom feed). When the client executes commands that modify the data, it should store the command (for later replay against the web service), store the expected result (for verification during replay), and update the offline replica.
When the client goes back online, it should replay the commands, compare the result with the expected result and notify the client of any variances. How these variances are handled will vary based on your application. The offline replica can then be discarded.
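Here is a minimal sketch of that offline command log in C++; the Command fields, the templated Server with an execute() method, and the variance handling are all hypothetical, since conflict handling is application-specific:

```cpp
#include <deque>
#include <iostream>
#include <string>

// One queued mutation: the command itself plus the result the local
// model predicted, kept for verification during replay.
struct Command {
    std::string name;            // e.g. "update_record"
    std::string payload;         // serialized arguments
    std::string expectedResult;  // predicted by the client-side model
};

class OfflineQueue {
public:
    // While offline: update the copy-on-write offline replica (omitted
    // here) and remember the command for later replay.
    void execute(const Command& cmd) {
        pending_.push_back(cmd);
    }

    // Back online: replay the commands in order, compare each result
    // with the expected one, and surface any variances to the user.
    template <typename Server>
    void replay(Server& server) {
        while (!pending_.empty()) {
            const Command& cmd = pending_.front();
            std::string actual = server.execute(cmd.name, cmd.payload);
            if (actual != cmd.expectedResult) {
                std::cerr << "variance on " << cmd.name
                          << ": expected " << cmd.expectedResult
                          << ", got " << actual << '\n';
            }
            pending_.pop_front();
        }
        // The offline replica can now be discarded; the archived Atom
        // feed brings the online replica up to date.
    }

private:
    std::deque<Command> pending_;
};
```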
CouchDB replication works over HTTP and does what you are looking to do. Once databases are synced on either end it will send diffs for adds/updates/deletes.
Couch can do this with other Couch machines or with a mobile framework like TouchDB.
https://github.com/couchbaselabs/TouchDB-iOS
I've done a fair amount of this; you can always set up CouchDB on one machine, set up TouchDB on a mobile device, and then watch the HTTP traffic go back and forth to get an idea of how they do it.
Or read this: http://guide.couchdb.org/draft/replication.html
Maybe something from the link above will help you get an idea of how to do your own diffs for your REST service. (Since they both work over HTTP, I thought it could be useful.)
You may want to look into the Dropbox Datastore API:
https://www.dropbox.com/developers/datastore
It sounds like it might be a very good fit for your purposes. They have iOS and javascript clients.
Lately, I've been interested in Meteor.
The platform sets up Mongo on the server and minimongo in the browser. The client subscribes to some data and when that data changes, the platform automatically sends down the new data to the client.
It's a clever solution to the syncing problem, and it solves several other problems as well. It will be interesting to see if more platforms do this in the future.

Finding an optimal way of distributing dictionary changes over the network

The problem: You have a big dictionary on the server and you are distributing it to lots of clients.
The dictionary is updated only on the server side, but you want to allow the clients to update their copies while minimizing the data being transferred.
You can also assume that you have a huge number of clients requesting updates, probably daily or so.
If a key is removed on the server, you expect it to be removed from the client on sync.
How would you solve this problem?
Additional requirement: the solution should be easy to implement on different platforms, including desktop (Windows, Linux, OS X) and mobile (iOS, Android, ...). If this requires the use of a third-party library, its license has to be very liberal, like BSD.
If this is at a file level, you use rsync (or the awesome bsdiff or xdelta or such).
If this is at an application level, then one approach is to journal updates to the dictionary (key-value store) on the server - you write a log of all adds and removes in the order they occur. Your clients then periodically hit the server, reporting the position in the log they last received, and the server sends them all log items newer than that. The server may also skip journal items that have been superseded (e.g. an add that was later removed). If the server keeps track of the clients, it can track the minimum client journal position and discard journal items it no longer needs.
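As a rough illustration, here is a minimal C++ sketch of such a journal, with a monotonically increasing sequence number standing in for the log position; all names are hypothetical, and compaction of superseded entries is omitted:

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

enum class Op { Add, Remove };

// One journal entry: a sequence number plus the mutation it records.
struct JournalEntry {
    std::uint64_t seq;
    Op op;
    std::string key;
    std::string value;  // empty for removes
};

class JournaledDict {
public:
    void put(const std::string& key, const std::string& value) {
        dict_[key] = value;
        journal_.push_back({++seq_, Op::Add, key, value});
    }

    void remove(const std::string& key) {
        dict_.erase(key);
        journal_.push_back({++seq_, Op::Remove, key, ""});
    }

    // Server side of a sync: return every entry newer than the
    // position the client says it last received.
    std::vector<JournalEntry> changesSince(std::uint64_t clientSeq) const {
        std::vector<JournalEntry> out;
        for (const auto& e : journal_)
            if (e.seq > clientSeq) out.push_back(e);
        return out;
    }

private:
    std::uint64_t seq_ = 0;
    std::map<std::string, std::string> dict_;
    std::vector<JournalEntry> journal_;
};

// Client side of a sync: apply the received entries in order, so
// keys removed on the server disappear locally exactly as required.
void applyChanges(std::map<std::string, std::string>& clientDict,
                  const std::vector<JournalEntry>& changes) {
    for (const auto& e : changes) {
        if (e.op == Op::Add) clientDict[e.key] = e.value;
        else clientDict.erase(e.key);
    }
}
```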
If the dictionary is large and yet requests are low, the clients can just hit the server for each lookup and always get the newest key. This often scales better than you imagine.
Ideally you can find a solution that supports your requirements rather than build your own.
I suggest that you take a look at CouchDB. It has the following features that make it relevant to your problem, IMO:
It's a key-value store = dictionary, so should easily fit your data model.
Supports replication from machine to machine (or multiple machines) in an occasionally connected environment. That should fit your use case of clients connecting to a server once in a while to pull all updates.
Works well in a distributed environment, so you should be able to handle the huge number of clients, e.g. by maintaining several servers.
Good scaling - works on servers and any kind of client (including mobile). Also, runs on multiple OSs.
It has a rather efficient data protocol for the replication process.
It's free.

Creating a C++ client app for an abstract Windows server - how to manage the TCP connection speed to the server?

So we have a server with some address, port, and IP. We are developing that server, so we can implement whatever we need on it. What are the standard/best practices for managing data transfer speed between a C++ Windows client app and the (C++) server?
My main question is how to find out how much data can be uploaded/downloaded to/from the client over his low-speed network to my relatively super-fast server. (I need it to set the bit rate of his live audio/video stream.)
My attempt at explaining point 3:
We do not care how fast our server is. It is always faster than needed. We care about the client trying to stream his media out to our server. He streams encoded (via ffmpeg) live video data to our server. But he has, say, ADSL with 500 kb/s of outgoing bandwidth. He also uses ICQ or whatever, so he has less than 500 kb/s available. And he wants to stream live video! So we need to set up our ffmpeg to encode video with respect to the bit rate the user can actually provide. We develop both the server side and the client side. We need a way of finding out how much the user can upload per second at any given moment (the value can change dynamically over time).
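One passive approach, sketched below in C++ under stated assumptions: record how many bytes each send actually pushed through per unit of time, smooth that with a moving average, and derive a suggested encoder bitrate. The class name, the smoothing factor, and the 25% headroom are all illustrative choices, not from any particular library:

```cpp
#include <chrono>
#include <cstddef>

class UploadRateEstimator {
    using Clock = std::chrono::steady_clock;
public:
    // Call after every successful send() with the byte count it accepted.
    void onBytesSent(std::size_t bytes) {
        const auto now = Clock::now();
        const double dt = std::chrono::duration<double>(now - last_).count();
        last_ = now;
        if (dt <= 0.0) return;
        const double instant = bytes / dt;  // bytes per second right now
        // Exponential moving average smooths out short bursts.
        estimate_ = alpha_ * instant + (1.0 - alpha_) * estimate_;
    }

    // Suggested video bitrate in bits/s, leaving headroom for audio,
    // retransmissions, and the user's other traffic on the same link.
    double suggestedVideoBitrate() const {
        return estimate_ * 8.0 * 0.75;
    }

private:
    Clock::time_point last_ = Clock::now();
    double estimate_ = 0.0;                 // smoothed bytes per second
    static constexpr double alpha_ = 0.1;   // smoothing factor
};
```

Note this only reflects the real link rate once the socket's send buffer fills up, so an active probe (like the thrulay tool mentioned below) gives better numbers; the moving average at least keeps the encoder from chasing every burst.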
Check this CodeProject article.
It's .NET, but you can try to figure out the technique from there.
I found what I wanted: "thrulay, network capacity tester", a C++ code library for tracking available bandwidth in real time on clients. There is also "Spruce", which is open source as well. It is built on some Linux code, but I use the Boost library, so it will be easy to rewrite.
Off-topic: I want to report that there is a group of people on SO downvoting all questions on this topic. I do not know why they are so angry, but they definitely exist.

Ideal way/architecture to deliver large data over Web Services

We are trying to design 6 web services, which will serve another client component. The client component requires data from the web service we are implementing.
Now, the problem is that we are not implementing just one web service. There is one web service that the client component hits; this initiates a series of five more web services, which gather data from their respective data stores and finally return it to the original web service, which then delivers the data back to the client component.
So if the requested data becomes huge, this will be a serious problem for our internal communication channel.
So, what do you guys suggest? What can be done to avoid overloading the communication channel between the internal web services while still delivering the data to the client component?
Update 1
Using 5 web services, where one web service does not know about the others except the next one, is a business requirement. Actually, five companies' "small services" are being integrated.
We use Java and Axis2.
We've had a similar problem. Apart from trying to avoid it (e.g. for internal communication, go directly to the DB instead of through a web service), you can mitigate it by at least not performing the five or so tasks in series. Spawn threads to collect them all in parallel and process them at the end to reduce latency (except where they might contend for the same resource and bottleneck).
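The asker is on Java/Axis2, but the idea is language-agnostic; here is a minimal C++ sketch (matching the rest of this page) using std::async, where fetch_from_service is a hypothetical stand-in for the actual downstream call:

```cpp
#include <future>
#include <string>
#include <vector>

// Hypothetical stand-in for the real client stub that calls one
// downstream service and returns its payload.
std::string fetch_from_service(const std::string& endpoint) {
    return "data from " + endpoint;  // placeholder
}

std::vector<std::string> gather_all(const std::vector<std::string>& endpoints) {
    // Launch every downstream request on its own thread.
    std::vector<std::future<std::string>> futures;
    futures.reserve(endpoints.size());
    for (const auto& ep : endpoints)
        futures.push_back(std::async(std::launch::async, fetch_from_service, ep));

    // Total latency is now roughly the slowest single call,
    // not the sum of all five.
    std::vector<std::string> results;
    results.reserve(futures.size());
    for (auto& f : futures)
        results.push_back(f.get());  // get() propagates any exception
    return results;
}
```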
But before doing anything, load test it to see whether it is even an issue, and get some baseline stats so you can see what improvement each change makes. Also, sometimes you might be better off tweaking network settings or the actual network rather than trying to optimise the code - but again, test and see.
Put all the data in a temporary compressed file and give back the FTP URL of the file.
The client fetches the big data chunk, uncompresses it, and reads it. (Maybe add some authentication mechanism for the FTP server.)
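A minimal sketch of the compression half in C++, assuming zlib; the path and the URL scheme handed back to the client are placeholders:

```cpp
#include <string>
#include <zlib.h>

// Writes `data` gzip-compressed to `path`; returns true on success.
bool write_compressed(const std::string& path, const std::string& data) {
    gzFile out = gzopen(path.c_str(), "wb");
    if (!out) return false;
    const int written =
        gzwrite(out, data.data(), static_cast<unsigned>(data.size()));
    gzclose(out);
    return written == static_cast<int>(data.size());
}

// The service would then respond with something like
// "ftp://files.example.com/tmp/result-12345.gz" (placeholder), and the
// client downloads and decompresses it at its own pace.
```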