Can libraries replace local socketing in C/C++?

I'm trying to develop a specific DB server in C++, and I have two questions:
Is it possible to have a dynamic library handle the communication between client programs and the server instead of using sockets? That way serialization is avoided and all querying can be done through native C/C++ library calls, with the server listening to the library for incoming requests.
Does any known database work like that, and if so, what are the pros and cons of such an approach?
As far as I can see, native calls to the DB server through the library would remove the overhead of serialization and socket system calls (even though they add calls into a dynamic library). Also, I'm not sure how memory can be shared with libraries, but if it can, it could be very beneficial for a client to "almost" share memory with the server.

(I am focusing on Linux and POSIX, but the principles would be the same on other OSes like Windows, Android, MacOSX)
The communication between a database client and the database server is very likely to happen over socket(7)s or some similar byte stream, like pipe(7)s or fifo(7)s. Using shared memory (shm_overview(7)...) for that communication is unusual, and you would still need some synchronization mechanism (e.g. semaphores, sem_overview(7)...).
There are some libraries (built on top of sockets) that facilitate such communications, e.g. 0mq.
Some database libraries exist that work without communicating with a database server, in particular sqlite, which manages the database storage directly (in your client process). You might have issues if several processes access the same database concurrently (so ACID properties might not be guaranteed, at least if sqlite is used without care).
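For the second question, here is a minimal sketch of that in-process model with sqlite (the file name and schema are just examples):

```cpp
#include <sqlite3.h>
#include <cstdio>

int main() {
    sqlite3 *db = nullptr;
    // Open (or create) the database file inside the client process; no server, no socket.
    if (sqlite3_open("example.db", &db) != SQLITE_OK) {
        std::fprintf(stderr, "open failed: %s\n", sqlite3_errmsg(db));
        return 1;
    }
    // The whole "query" is an ordinary library call.
    char *err = nullptr;
    sqlite3_exec(db,
                 "CREATE TABLE IF NOT EXISTS kv(k TEXT PRIMARY KEY, v TEXT);",
                 nullptr, nullptr, &err);
    if (err) { std::fprintf(stderr, "%s\n", err); sqlite3_free(err); }
    sqlite3_close(db);
    return 0;
}
```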
Notice that local inter-process communication is quite efficient on Linux. It is not unusual to get a bandwidth of several hundred megabytes per second on a local pipe (use rather large buffers, e.g. 64 Kbytes or a megabyte, for read(2) & write(2)...).
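To illustrate the large-buffer point, here is a rough sketch that pushes data through a local pipe(7) in 64 Kbyte chunks (error handling trimmed, and the 64 MB total is arbitrary):

```cpp
#include <unistd.h>
#include <cstring>

int main() {
    int fds[2];
    if (pipe(fds) == -1) return 1;

    static char buf[64 * 1024];            // one syscall moves 64 KB at a time
    std::memset(buf, 'x', sizeof buf);

    if (fork() == 0) {                     // child: producer
        close(fds[0]);
        for (int i = 0; i < 1024; ++i)     // push 64 MB through the pipe
            write(fds[1], buf, sizeof buf);
        close(fds[1]);
        _exit(0);
    }

    close(fds[1]);                         // parent: consumer
    ssize_t n, total = 0;
    while ((n = read(fds[0], buf, sizeof buf)) > 0)
        total += n;
    close(fds[0]);
    return total == 64LL * 1024 * 1024 ? 0 : 1;
}
```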
In practice, in a database, indexing and disk access are more likely to be the bottleneck than client <-> server communication, at least on the same local host. If the server is on a remote host, network communication is probably the bottleneck (at least on common gigabit/sec Ethernet).
Read also this, in particular the table in the Answers section.
Perhaps gdbm, redis, mongodb, postgresql might be relevant for your issues.

Yes, if your DB clients are on the same machine as your DB server, they could communicate directly using techniques like shared-memory IPC. However, this is typically not useful, because:
A database with all its clients on a single machine is rare.
A database with even one client (other than an administrative interface) on the same machine is not typical.
Systems like Linux already have optimizations built in for localhost socket communication, so it doesn't go via the network at all--only through the kernel.
A database whose performance is limited by the syscall overhead of socket IPC could overcome this with a third-party kernel-bypass networking solution, which requires no special code at all: many existing databases can simply be run on top of a kernel-bypass TCP stack.

Related

Unix socket vs shared memory message: which is faster?

I am looking at a Linux server program which, for each client, creates some shared memory and uses message queues (a C++ class called from the code) in that shared memory to send messages back and forth. On the face of it this sounds like the same usage pattern as domain sockets, i.e. a server program that sends and recvs payloads from its clients.
My question is - what extra work do unix domain sockets do? What could conceivably cause shared memory with a message queue to be faster than a socket and vice versa?
My guess is there is some overhead to calling send and recv, but I'm not exactly sure what. I might try and benchmark this, just looking for some insight before I do this.
Here is one discussion:
UNIX Domain sockets vs Shared Memory (Mapped File)
I can add that sockets are very primitive: for stream sockets, just a stream of bytes. This may actually be an advantage - it tends to keep messages between different subsystems small and simple, promoting lean interfaces and loose coupling. But sometimes shared memory is really useful. I used shared memory in a C++ Linux back-end to a data-intensive Google Maps application - the database was over a gigabyte of PNG rasters held in shared memory.
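For the shared-memory side, here is a minimal POSIX sketch of publishing a large blob along those lines (the name "/map_tiles" and the 1 MB size are placeholders for the real raster data):

```cpp
#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <cstring>

int main() {
    const size_t size = 1 << 20;                       // 1 MB for the sketch
    int fd = shm_open("/map_tiles", O_CREAT | O_RDWR, 0600);
    if (fd == -1) return 1;
    if (ftruncate(fd, size) == -1) return 1;           // size the region

    void *p = mmap(nullptr, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) return 1;
    std::memcpy(p, "raster bytes would go here", 27);  // publish the data

    // Readers shm_open("/map_tiles", O_RDONLY, 0) and mmap with PROT_READ.
    munmap(p, size);
    close(fd);
    // shm_unlink("/map_tiles");  // remove the name when it is no longer needed
    return 0;
}
```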

Strategy for simple yet effective UDP server for gaming (and other tasks)

I'm trying to implement my idea of a simple yet pretty effective multithreaded server working over UDP. The main goal is gaming(-like) applications, but it would be good if it could be used for other purposes too.
I want to use these APIs/technologies:
std::thread for multithreading; since it is part of the C++ standard it should be future-proof, and as far as I've seen it is both simple and works well with C++.
BSD sockets (Linux) & WinSock2 (Windows). I would create one abstract class called Socket and, for each platform (Linux - BSD, Windows - WinSock), a derived class implementing the native API. Then I would use the API provided by the base class Socket, not the native/platform API. That would allow me to use one code base for the whole server/client module, and if I want to change platform I just switch the socket's class type and that's it (a rough sketch of what I mean follows).
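The rough shape I have in mind for the abstraction (the method names are only illustrative, not a finished design):

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Platform-neutral interface; the rest of the server/client code only sees this.
class Socket {
public:
    virtual ~Socket() {}
    virtual bool open(uint16_t port) = 0;                        // bind a UDP port
    virtual int  sendTo(const std::string &host, uint16_t port,
                        const void *data, std::size_t len) = 0;  // bytes sent, or -1
    virtual int  recvFrom(void *buf, std::size_t len) = 0;       // bytes received, or -1
    virtual void close() = 0;
};

// One derived class per platform; only these include the native headers.
class BsdSocket : public Socket { /* BSD sockets (Linux) implementation */ };
class WinSocket : public Socket { /* WinSock2 (Windows) implementation  */ };
```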
As for the strategy of server-client communication, I thought of something like this:
Each program has two sockets: one that listens on a specified port and one that is used to send data to the server/other clients. Both sockets run on different threads so that I can both read and send data at the same time (sort of); that way waiting for data won't ruin my performance. There will be one main server, and the other clients will connect directly to that server. Clients will send only their own data and receive data directly from the server.
Now I have some questions:
Is it wise to use std::thread? I heard it's good on Linux, but not that good on Windows. Would pthreads be much better?
Any other interesting ideas for keeping one code base across many platforms (mainly Linux & Windows)? Or is mine good enough?
Any other ideas or tips about the strategy for how the server/client should work? I've written some simple network apps, but they didn't need a well-thought-out strategy, so I'm not sure whether mine is the best of the simple ideas.
How often should I send data from client to server (and from server to client)? I don't want to flood the network or push the server load to 100%.
Also: it should work well with 2-4 players at the same time; I don't plan to use it with more at the moment.
Intuitively, for multithreading purposes, Linux + pthreads would be a nice combination; a vast number of mission-critical systems run on it. However, when it comes to std::thread, having the platform-dependent details hidden behind a standard interface is a nice feature to have. Certainly, if there are bad smells in the Windows dialect, MS will likely correct them in the future. But if I were you, I would certainly select the Linux + std::thread combination. Choosing Linux over Windows is a different topic and there's no need to comment on it here (from a server-development perspective). std::thread provides a nice set of features, yet retains the power of pthreads.
Regarding UDP, there are both pros and cons. I'd say that if you are going to open your server to the public, you have to think about network firewalls as well. If you can address the inherent transport-layer issues of UDP (packet reordering, lost-packet recovery), a UDP server is lightweight in most cases.
How often you need to send messages depends on your game, so I can't comment on that.
Moreover, pay serious attention to securing your data communication. Sooner or later your server will be hacked; it is just a matter of time.

boost::asio multi-threading problem

I've got a server that uses boost::asio which I wish to make multithreaded.
The server can be broken down into several "areas", with sockets starting in a connect area, then, once connected to a client, being moved to an authentication area (i.e. login or register) before moving between various other parts of the server depending on what the client is doing.
I don't particularly want to just use a thread pool on a single io_service for all the sockets, since a large number of locks would be required, especially in areas with a lot of interaction with common resources. Instead, I want to give each server component (say, authentication) its own thread.
However, I'm not sure how to do this. I considered giving each component its own io_service, so it could use whatever threads it wanted, but sockets are tied to an io_service, and I'm not sure how to move a client's socket from one component to another.
You can solve this with asio::io_service::strand. Create a thread pool for the io_service as usual. Once you've established a connection with the client, from there on wrap all async calls for that client in an io_service::strand. One strand per client. This essentially guarantees that, from the client's point of view, it is single-threaded.
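A rough sketch of that pattern with the older io_service API (Session, handle_read, and the pool size are illustrative choices, not a fixed recipe):

```cpp
#include <boost/asio.hpp>
#include <boost/bind.hpp>
#include <memory>
#include <thread>
#include <vector>

class Session : public std::enable_shared_from_this<Session> {
public:
    explicit Session(boost::asio::io_service &io) : socket_(io), strand_(io) {}

    boost::asio::ip::tcp::socket &socket() { return socket_; }

    void start() {
        // Every async handler for this client goes through the same strand,
        // so its handlers never run concurrently, even with a thread pool.
        socket_.async_read_some(
            boost::asio::buffer(buf_),
            strand_.wrap(boost::bind(&Session::handle_read, shared_from_this(),
                                     boost::asio::placeholders::error,
                                     boost::asio::placeholders::bytes_transferred)));
    }

private:
    void handle_read(const boost::system::error_code &ec, std::size_t /*n*/) {
        if (!ec) start();                               // keep reading
    }

    boost::asio::ip::tcp::socket socket_;
    boost::asio::io_service::strand strand_;
    char buf_[4096];
};

int main() {
    boost::asio::io_service io;
    auto work = std::make_shared<boost::asio::io_service::work>(io);

    // The usual thread pool, all calling run() on the single io_service.
    std::vector<std::thread> pool;
    for (int i = 0; i < 4; ++i)
        pool.emplace_back([&io] { io.run(); });

    // ... accept connections, create shared_ptr<Session>s, call start() on each ...

    work.reset();
    for (auto &t : pool) t.join();
}
```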
First, I'd advocate considering the multi-process approach instead; it is a very straightforward architecture that is easy to reason about, easy to debug, and easy to scale.
A server design that scales horizontally - several instances of the server, with no state that needs to be shared between them (shared state can live in a common store such as an SQL database, Voldemort (persistent), Redis (sets and lists - very cool; I'm really excited about a persistent version), or memcached (unreliable)) - is more easily scalable.
You could, for example, have a single listener thread that balances between several server processes, using UNIX sendmsg() to transfer the descriptor. This architecture would be straightforward to migrate to multiple machines with hardware load balancers later.
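Passing the descriptor is the only slightly fiddly part; here is a hedged sketch of the sending side using sendmsg() with SCM_RIGHTS over an AF_UNIX socket (the receiver does the mirror image with recvmsg() and a matching control buffer):

```cpp
#include <sys/socket.h>
#include <sys/uio.h>
#include <cstring>

// Send the file descriptor 'fd' over the connected UNIX-domain socket 'sock'.
int send_fd(int sock, int fd) {
    char dummy = 'x';                       // at least one byte of real data is required
    struct iovec iov = { &dummy, 1 };

    char ctrl[CMSG_SPACE(sizeof(int))];     // room for one descriptor
    std::memset(ctrl, 0, sizeof ctrl);

    struct msghdr msg = {};
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl;
    msg.msg_controllen = sizeof ctrl;

    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;           // "this ancillary data carries descriptors"
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    std::memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}
```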
The "area" idea in the question is intriguing. It could be that, rather than locking, you could do it all with message queues. Keep in mind that disk I/O - even with SSDs and such - and the network are the real bottlenecks, so it is not necessary to be as careful with CPU; the latency of messages passing between threads is not such a big deal, and depending on your operating system the threads (or processes) can be scheduled onto different cores in an SMP setup.
But ultimately, once you reach saturation, scaling up the area idea requires faster cores, not more of them. Here's an interesting monologue from one of our hosts about that.

Fast Cross Platform Inter Process Communication in C++

I'm looking for a way to get two programs to efficiently transmit a large amount of data to each other, which needs to work on Linux and Windows, in C++. The context here is a P2P network program that acts as a node on the network and runs continuously; other applications (which could be games, hence the need for a fast solution) will use this to communicate with other nodes in the network. If there's a better solution for this I would be interested.
boost::asio is a cross-platform library handling asynchronous I/O over sockets. You can combine this with, for instance, Google Protocol Buffers for your actual messages.
Boost also provides boost::interprocess for interprocess communication on the same machine, but asio lets you do your communication asynchronously, and you can easily have the same handlers for both local and remote connections.
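As a minimal illustration of the boost::interprocess piece (the region name "demo_region" and the 4 KB size are arbitrary):

```cpp
#include <boost/interprocess/shared_memory_object.hpp>
#include <boost/interprocess/mapped_region.hpp>
#include <cstring>

int main() {
    using namespace boost::interprocess;

    // Create (or open) a named shared memory object and size it.
    shared_memory_object shm(open_or_create, "demo_region", read_write);
    shm.truncate(4096);

    // Map the whole region into this process's address space and write to it.
    mapped_region region(shm, read_write);
    std::memcpy(region.get_address(), "hello", 6);

    // Another process can open the same name and read the bytes back.
    // shared_memory_object::remove("demo_region");  // cleanup when done
    return 0;
}
```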
I have been using ICE by ZeroC (www.zeroc.com), and it has been fantastic. Super easy to use, and it's not only cross platform, but has support for many languages as well (python, java, etc) and even an embedded version of the library.
Well, if we can assume the two processes are running on the same machine, then the fastest way for them to transfer large quantities of data back and forth is by keeping the data inside a shared memory region; with that setup, the data is never copied at all, since both processes can access it directly. (If you wanted to go even further, you could combine the two programs into one program, with each former 'process' now running as a thread inside the same process space instead. In that case they would be automatically sharing 100% of their memory with each other)
Of course, just having a shared memory area isn't sufficient in most cases: you would also need some sort of synchronization mechanism so that the processes can read and update the shared data safely, without tripping over each other.
The way I would do that would be to create two double-ended queues in the shared memory region (one for each process to send with). Either use a lockless FIFO-queue class, or give each double-ended queue a semaphore/mutex that you can use to serialize pushing data items into the queue and popping data items out of the queue. (Note that the data items you'd be putting into the queues would only be pointers to the actual data buffers, not the data itself... otherwise you'd be back to copying large amounts of data around, which you want to avoid. It's a good idea to use shared_ptrs instead of plain C pointers, so that "old" data will be automatically freed when the receiving process is done using it.)
Once you have that, the only other thing you'd need is a way for process A to notify process B when it has just put an item into the queue for B to receive (and vice versa)... I typically do that by writing a byte into a pipe that the other process is select()-ing on, to cause the other process to wake up and check its queue, but there are other ways to do it as well.
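Here is a small sketch of that wake-up mechanism, assuming the two processes already share a pipe (for example one created before a fork(), or a named FIFO):

```cpp
#include <unistd.h>
#include <sys/select.h>

// Producer side: tell the peer "check your shared-memory queue".
void notify(int pipe_write_fd) {
    char b = 1;
    (void)write(pipe_write_fd, &b, 1);
}

// Consumer side: block until a notification byte arrives, then drain it.
bool wait_for_notification(int pipe_read_fd) {
    fd_set fds;
    FD_ZERO(&fds);
    FD_SET(pipe_read_fd, &fds);
    if (select(pipe_read_fd + 1, &fds, nullptr, nullptr, nullptr) <= 0)
        return false;
    char b;
    return read(pipe_read_fd, &b, 1) == 1;  // now pop items from the shared queue
}
```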
This is a hard problem.
The bottleneck is the internet, and the fact that your clients might be behind NAT.
If you are not talking about the internet, or if you explicitly don't have clients behind carrier-grade evil NATs, you need to say so.
Because it boils down to: use TCP. Suck it up.
I would strongly suggest Protocol Buffers on top of TCP or UDP sockets.
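As a hedged sketch of what that looks like on the wire, here is length-prefixed framing over a TCP socket; GameMessage is a hypothetical type generated from a .proto you would define:

```cpp
#include <arpa/inet.h>
#include <sys/socket.h>
#include <string>
#include <cstdint>
#include "game_message.pb.h"   // hypothetical generated header

// Write one message as [4-byte big-endian length][serialized payload].
bool send_message(int fd, const GameMessage &msg) {
    std::string body;
    if (!msg.SerializeToString(&body)) return false;

    uint32_t len = htonl(static_cast<uint32_t>(body.size()));
    std::string frame(reinterpret_cast<const char *>(&len), sizeof len);
    frame += body;

    return send(fd, frame.data(), frame.size(), 0) ==
           static_cast<ssize_t>(frame.size());
}
// The receiver reads 4 bytes, ntohl()s the length, reads that many bytes,
// then calls GameMessage::ParseFromArray() on the payload.
```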
Beyond the message format and the socket libraries the other answers cover, though, they're not telling you about the NAT issue. Rather than have your users tinker with their routers, it's better to use some techniques that should get you through a vaguely sane router with no extra configuration. You need to use all of these to get the best compatibility.
First, ICE library here is a NAT traversal technique that works with STUN and/or TURN servers out in the network. You may have to provide some infrastructure for this to work, although there are some public STUN servers.
Second, use both UPnP and NAT-PMP. One library here, for example.
Third, use IPv6. Teredo, which is one way of running IPv6 over IPv4, often works when none of the above do, and who knows, your users may have working IPv6 by some other means. Very little code is needed to implement this, and it's increasingly important. I find about half of BitTorrent data arrives over IPv6, for example.

Remote proxy with shared memory in C++

Suppose I have a daemon that shares its internal state with various applications via shared memory. Processes can send IPC messages to the daemon on a named pipe to perform various operations. In this scenario, I would like to create a C++ wrapper class for clients that acts as a kind of "Remote Proxy" to hide some of the gory details (synchronization, message passing, etc.) from clients and make it easier to isolate code for unit tests.
I have three questions:
Generally, is this a good idea/approach?
Do you have any tips or gotchas for synchronization in this setup, or is it enough to use a standard reader-writer mutex setup?
Are there any frameworks that I should consider?
The target in question is an embedded Linux system with a 2.18 kernel, so there are limitations on memory and compiler features.
Herb Sutter had an article, Sharing Is the Root of All Contention, that I broadly agree with; if you are using a shared-memory architecture, you are exposing yourself to quite a few potential threading problems.
A client/server model can make things drastically simpler: clients write to the named server pipe, and the server writes back on a unique per-client pipe (or use sockets). It would also make unit testing simpler (since you don't have to worry about testing the shared memory), could avoid mutexing, etc.
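A minimal sketch of the server side of that named-pipe request path (the FIFO path here is just an example):

```cpp
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <cstdio>

int main() {
    const char *path = "/tmp/daemon_requests";
    mkfifo(path, 0666);                   // ignore the error if it already exists

    int fd = open(path, O_RDONLY);        // blocks until a client opens it for writing
    if (fd == -1) return 1;

    char buf[512];
    ssize_t n;
    while ((n = read(fd, buf, sizeof buf)) > 0) {
        // Parse the request here, then reply on that client's own FIFO.
        std::printf("got %zd bytes\n", n);
    }
    close(fd);
    return 0;
}
```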
There's the Boost.Interprocess library, though I can't comment on its suitability for embedded systems.