What's a pattern for getting two "deep" parts of a multi-threaded program talking to each other?

What's a pattern for getting two "deep" parts of a multi-threaded program talking to each other? - c++

I have this general problem in design, refactoring or "triage":
I have an existing multi-threaded C++ application which searches for data using a number of plugin libraries. With the current search interface, a given plugin receives a search string and a pointer to a QList object. Running on a different thread, the plugin goes out and searches various data sources (locally and on the web) and adds the objects of interest to the list. When the plugin returns, the main program, still on the separate thread, adds this data to the local data store (with further processing), guarding this insertion point using a mutex. Thus each plugin can return data asynchronously.
The QT-base plugin library is based on message passing. There are a fair number of plugins which are already written and tested for the application and they work fairly well.
I would like to write some more plugins and leverage the existing application.
The problem is that the new plugins will need more information from the application. They will to need intermittent access to the local data store itself as they search. So to get this, they would need direct or indirect access both the hash array storing the data and the mutex which guards multiple access to the store. I assume the access would be encapsulated by adding an extra method in a "catalog" object.
I can see three ways to write these new plugins.
When loading a plugin, pass them
a pointer to my "catalog" at the
start. This becomes an extra,
"invisible" interface for the new
plugins. This seems quick, easy,
completely wrong according to OO but
I can't see what the future problems would be.
Add a method/message to the
existing interface so I have a
second function which could be
called for the new plugin libraries,
the message would pass a pointer to
the catalog to the plugins. This
would be easy for the plugins but it
would complicate my main code and
seems generally bad.
Redesign the plugin interface.
This seems "best" according to OO,
could have other added benefits but
would require all sorts of
rewriting.
So, my questions are
A. Can anyone tell me the concrete dangers of option 1?
B. Is there a known pattern that fits this kind of problem?
Edit1:
A typical function for calling the plugin routines looks like:
elsewhere(spec){
QList<CatItem> results;
plugins->getResult(spec, &results);
use_list(results);
}
...
void PluginHandler::getResults(QString* spec, QList<CatItem>* results)
{
if (id->count() == 0) return;
foreach(PluginInfo info, plugins) {
if (info.loaded)
info.obj->msg(MSG_GET_RESULTS, (void*) spec, (void*) results);
}
}
It's a repeated through-out the code. I'd rather extend it than break it.

Why is it "completely wrong according to OO"? If your plugin needs access to that object, and it doesn't violate any abstraction you want to preserve, it is the correct solution.
To me it seems like you blew your abstractions the moment you decided that your plugin needs access to the list itself. You just blew up your entire application's architecture. Are you sure you need access to the actual list itself? Why? What do you need from it? Can that information be provided in a more sensible way? One which doesn't 1) increase contention over a shared resource (and increase the risk of subtle multithreading bugs like race conditions and deadlocks), and 2) doesn't undermine the architecture of the rest of the app (which specifically preserves a separation between the list and its clients, to allow asynchronicity)
If you think it's bad OO, then it is because of what you're fundamentally trying to do (violate the basic architecture of your application), not how you're doing it.

Well, option 1 is option 3, in the end. You are redesigning your plugin API to receive extra data from the main app.
It's a simple redesign that, as long as the 'catalog' is well implemented and hide every implementation detail of your hash and mutex backing store, is not bad, and can serve the purpose well enough IMO.
Now if the catalog leaks implementation details then you would better use messages to query the store, receiving responses with the needed data.

Sorry, I just re-read your question 3 times and I think my answer may have been too simple.
Is your "Catalog" an independent object? If not, you could wrap it as it's own object. The Catalog should be completely safe (including threadsafe)--or better yet immutable.
With this done, it would be perfectly valid OO to pass your catalog to the new plugins. If you are worried about passing them through many layers, you can create a factory for the catalog.
Sorry if I'm still misunderstanding something, but I don't see anything wrong with this approach. If your catalog is an object outside your control, however, such as a database object or collection then you really HAVE to encapsulate it in something you can control with a nice, clean interface.
If your Catalog is used by many pieces across your program, you might look at a factory (which, at it's simplest degrades to a Singleton). Using a factory you should be able to summon your Catalog with a Catalog.getType("Clothes"); or whatever. That way you are giving out the same object to everyone who wants one without passing it around.
(this is very similar to a singleton, by the way, but coding it as a factory reminds you that there will almost certainly be more than one--also remember to allow a Catalog.setType("Clothes", ...); for testing.

Related

C++ Singleton Design pattern alternatives

I hate to beat a dead horse, that said, I've gone over so many conflicting articles over the past few days in regards to the use of the singleton pattern.
This question isn't be about which is the better choice in general, rather what makes sense for my use case.
The pet project I'm working on is a game. Some of the code that I'm currently working on, I'm leaning towards using a singleton pattern.
The use cases are as follows:
a globally accessible logger.
an OpenGL rendering manager.
file system access.
network access.
etc.
Now for clarification, more than a couple of the above require shared state between accesses. For instance, the logger is wrapping a logging library and requires a pointer to the output log, the network requires an established open connection, etc.
Now from what I can tell it's more suggested that singletons be avoided, so lets look at how we may do that. A lot of the articles simply say to create the instance at the top and pass it down as a parameter to anywhere that is needed. While I agree that this is technically doable, my question then becomes, how does one manage the potentially massive number of parameters? Well what comes to mind is wrapping the different instances in a sort of "context" object and passing that, then doing something like context->log("Hello World"). Now sure that isn't to bad, but what if you have a sort of framework like so:
game_loop(ctx)
->update_entities(ctx)
->on_preupdate(ctx)
->run_something(ctx)
->only use ctx->log() in some freak edge case in this function.
->on_update(ctx)
->whatever(ctx)
->ctx->networksend(stuff)
->update_physics(ctx)
->ctx->networksend(stuff)
//maybe ctx never uses log here.
You get the point... in some areas, some aspects of the "ctx" aren't ever used but you're still stuck passing it literally everywhere in case you may want to debug something down the line using logger, or maybe later in development, you actually want networking or whatever in that section of code.
I feel like the above example would much rather be suited to a globally accessible singleton, but I must admit, I'm coming from a C#/Java/JS background which may color my view. I want to adopt the mindset/best practices of a C++ programmer, yet like I said, I can't seem to find a straight answer. I also noticed that the articles that suggest just passing the "singleton" as a parameter only give very simplistic use cases that anyone would agree a parameter would be the better way to go.
In this game example, you probably wan't to access logging everywhere even if you don't plan on using it immediately. File system stuff may be all over but until you build out the project, it's really hard to say when/where it will be most useful.
So do I:
Stick with using singletons for these use cases regardless of how "evil/bad" people say it is.
Wrap everything in a context object, and pass it literally everywhere. (seems kinda gross IMO, but if that's the "more accepted/better" way of doing it, so be it.)
Something completely else. (Really lost as to what that might be.)
If option 1, from a performance standpoint, should I switch to using namespace functions, and hiding the "private" variables / functions in anonymous namespaces like most people do in C? (I'm guessing there will be a small boost in performance, but then I'll be stuck having to call an "init" and "destroy" method on a few of these rather than being able to just allow the constructor/destructor to do that for me, still might be worth while?)
Now I realize this may be a bit opinion based, but I'm hoping I can still get a relatively good answer when a more complicated/nested code base is in question.
Edit:
After much more deliberation I've decided to use the "Service Locator" pattern instead. To prevent a global/singleton of the Service Locator I'm making anything that may use the services inherit from a abstract base class that requires the Service Locator be passed when constructed.
I haven't implemented everything yet so I'm still unsure if I'll run into any problems with this approach, and would still love feedback on if this is a reasonable alternative to the singleton / global scope dilemma.
I had read that Service Locator is also somewhat of an anti-pattern, that said, many of the example I found implemented it with statics and/or as a singleton, perhaps using it as I've described removes the aspects that cause it to be an anti-pattern?

Whenever you think you want to use a Singleton, ask yourself the following question: Why is it that it must be ensured at all cost that there never exists more than one instance of this class at any point in time? Because the whole point of the Singleton pattern is to make sure that there can never be more than one instance of the Singleton. That's what the term "singleton" is all about: there only being one. That's why it's called the Singleton pattern. That's why the pattern calls for the constructor to be private. The point of the Singleton pattern is not and never was to give you a globally-accessible instance of something. The fact that there is a global access point to the sole instance is just a consequence of the Singleton pattern. It is not the objective the Singleton pattern is meant to achieve. If all you want is a globally accessible instance of something, then use a global variable. That's exactly what global variables are for…
The Singleton pattern is probably the one design pattern that's singularly more often misunderstood than not. Is it an intrinsic aspect of the very concept of a network connection that there can only ever be one network connection at a time, and the world would come to an end if that constraint was ever to be violated? If the answer is no, then there is no justification for a network connection to ever be modeled as a Singleton. But don't take my word for it, convince yourself by checking out page 127 of Design Patterns: Elements of Reusable Object-Oriented Software where the Singleton pattern was originally described…😉
Concerning your example: If you're ending up having to pass a massive number of parameters into some place then that first and foremost tells you one thing: there are too many responsibilities in that place. This fact is not changed by the use of Singletons. The use of Singletons simply obfuscates this fact because you're not forced to pass all stuff in through one door in the form of parameters but rather just access whatever you want directly all over the place. But you're still accessing these things. So the dependencies of your piece of code are the same. These dependencies are just not expressed explicitly anymore at some interface level but creep around in the mists. And you never know upfront what stuff a certain piece of code depends on until the moment your build breaks after trying to take away one thing that something else happened to depend upon. Note that this issue is not specific to the Singleton pattern. This is a concern with any kind of global entity in general…
So rather than ask the question of how to best pass a massive number of parameters, you should ask the question of why the hell does this one piece of code need access to that many things? For example, do you really need to explicitly pass the network connection to the game loop? Should the game loop not maybe just know the physics world object and that physics world object is at the moment of creation given some object that handles the network communication. And that object in turn is upon initialization told the network connection it is supposed to use? The log could just be a global variable (or is there really anything about the very idea of a log itself that prohibits there ever being more than one log?). Or maybe it would actually make sense for each thread to have its own log (could be a thread-local variable) so that you get a log from each thread in the order of the control flow that thread happened to take rather than some (at best) interleaved mess that would be the output from multiple threads for which you'd probably want to write some tool so that you'd at least have some hope of making sense of it at all…
Concerning performance, consider that, in a game, you'll typically have some parent objects that each manage collections of small child objects. Performance-critical stuff would generally be happening in places where something has to be done to all child objects in such a collection. The relative overhead of first getting to the parent object itself should generally be negligible…
PS: You might wanna have a look at the Entity Component System pattern…

Adding unlimited number of classes as a member og GameObect C++

Thanks for reading!
Problem Itself
So the question is Is there any simple specific way provided by C++ to add unlimited number of unspecified classes as a members
or have I play with vectors ;)
They must be:
Able to add when the program is running
Able to add many different classes that even doesn't exist now, so I can't
use std::optional because I will die with every code change.
Able to remove also during program work.
I found something that can be similar C++ Components based class
but this can't help me because I neeed flexibile code without prdefining what classes can be memeber of Game Object
This is my first question on stack overflow so pleas be patient ;) I'll be very glad for any kind of help.
Thanks for all answers!

I'm not sure if you're heading the right way. You see, games don't usually have a lot of dynamic stuff. Usually, you might want to load a level in memory where it stays until it's completed by player. When it does, you may clean all the game objects and load a next level. Open world games are more complicated and you might have to load and unload adjacent areas on fly when developing one. Also, games require lots of CPU speed, so when you create lots of dynamic stuff it leads to fragmented spaghetti structure of your memory and lots of cache misses. You don't want that. Dynamic classes will require trampoline code to connect your functions together and it'll lead to another performance loss. Solve your problems one by one. You don't have that much classes to have a need for another CORBA or DCOM yet. I guess, you'll never have. Linux core is monolithic, for instance.
IMHO C++ isn't any good for making binary interfaces.
If you want to make a plugin API, you might list some predefined directory for *.dll files, load each DLL file and call some function from it, something like "LoadGameEnginePlugin". Valid plugin will return a structure describing its functionality and pointers to functions implementing one. After that, you call these functions via pointers.
If you want to extend your GameObject's functionality indefinitely, you may look to Visitor or MultiMethod design patterns.

Avoiding singletons in plugins to be shared among all the instances

I'm building a plugin which will be used in a host. This plugin is using a singleton for services I would like to easily access anywhere. The problem comes when I instance several times the same plugin, the same (static) singleton, being specific to the runnable, will be shard among all the instanced plugins. Is there, generally speaking, a way to reduce the scope of the singleton (c++) ?
As each plugin is an instance in itself, I could obviously pass the root class of the plugin to all of it's subclasses but I would like to keep the same global singleton design as possible.

Is there a reason for having a singleton? The rationale is when you need to enforce that there is only one, and need to provide a single point of access. If these aren't really requirements, then just create one and pass it around where needed.
I would gradually get rid of the singleton.
Does the singleton do a lot, or not much?
You might need to divide it up into parts.
If it doesn't do much, just pass it where is is needed, and get rid of its singleton-ness.
If it provides lots of services, create interfaces for each service and pass those around where they are needed. Your design will improve and become more testable and easier to comprehend.
At first, the implementations of the interfaces could delegate to the original singleton, but you want to make them self contained eventually.

A singleton do internally make use of a static variable.
The scope of this static variable is specified by the source file where it is defined and partitioned by its current runnable. For those reason, while running under the same host (and then the same runnable) both plugins (which are the same code) do share the same static variable (and by extension the same singleton).
As we assume in this question the code to be the same for each plugin, the only way to split those singletons would then be to run a new executable. This could be done using the fork unix command for example where both process will then hold their own memory range.
Obviously (as most of you commented) it is a much better approach to avoid using singletons in this case as forking a process is just adding useless complexity.

Efficient ways to save and load data from C++ simulation

I would like to know which are the best way to save and load C++ data.
I am mostly interested in saving classes and matrices (not sparse) I use in my simulations.
Now I just save them as txt files, but if I add a member to a class I then have to modify the function that loads the data (it has to parse and check for the value in the txt file),
that I think is not ideal.
What would you recommend in general? (p.s. as I'd like to release my code I'd really like to use only standard c++ or libraries that can be redistributed).

In this case, there is no "best." What is best for you is highly dependent upon your situation. But, lets have an example to get you thinking about your details and how deep this rabbit hole can go.
If you absolutely positively must have the fastest save possible without question (and you're willing to pay the price), you can define your own memory management to put all objects into a contiguous array of a common type (such as integers). This allows you to write that array to disk as binary data very rapidly. You might need this in a simulation that uses threads efficiently to load every core/processor to run at real time.
Why is a rather horrible solution? Because it takes a LOT of work and runs many risks for problems in the name of "optimization."
It requires you to build your own memory management (operator new() and operator delete()) which may need to be thread safe.
If you try to load from this array, you will have to placement new all objects with a unique non-modifying constructor in order to ensure all virtual pointers are set properly. Oh, and you have to track the type of each address to now how to do this.
For portability with other systems and between versions of the binary, you will need to have utilities to convert from the binary format to something generic enough to be cross platform (including repopulating pointers to other objects).
I have done this. It was highly unpleasant. I have no doubt there are still problems with it and I have only listed a few here. But, it was very, very fast and very, very, very problematic.
You must design to your needs. Generally, the first need is "Make it work." Don't care about efficiency, just about something that accurately persists and that you have the information known and accessible at some point to do it. Also, you should encapsulate the process of saving and loading. Then, if the need "Make it better" steps in, you should be able to change that one bit of code and the rest should work. You might even make the saving format selectable on user needs instead of your needs which you must assume for all users.
Given all the assumptions, pros and cons listed, you should be able to elaborate your particular needs for this question.

Given that performance is not your concern -- which is a critical part of the answer -- the Boost Serialization library is a great answer.
The link in the comment leads to the documentation. Read the tutorial (which is overkill for what you are initially wanting, but well worth it).
Finally, since you have mostly array matrices, try to encapsulate the entire process of save and load so that should you need to change it later, you are writing a new implementatio and choosing between the exisiting. I expend the eddedmtime for the smarts of Boost Serialization would not be great; however, you might find a future requirement moves you to something else or multiple something elses.

The C++ Middleware Writer automates the creation of marshalling functions. When you add a member to a class, it updates the marshalling functions for you.

How can I decrease complexity in library without increasing complexity elsewhere?

I am tasked to maintain and update a library which allows a computer to send commands at a hardware device and then receive its response. Currently the code is setup in such a way that every single possible command the device can receive is sent via its own function. Code repetition is everywhere; a DRY advocate's worst nightmare.
Obviously there is much opportunity for improvement. The problem is each command has a different payload. Currently the data that is to be the payload is passed to each command function in the form of arguments. It's difficult to consolidate functionality without pushing the complexity to a level that calls the library.
When a response is received from the device its data is put into an object of a class solely responsible for holding this data, they do nothing else. There are hundreds of classes which do this. These objects are then used to access the returned data by the app layer.
My objectives:
Throughly reduce code repetition
Maintain similiar level of complexity at application layer
Make it easier to add new commands
My idea:
Have one function to send a command and one to receive (the receiving function is automatically called when a response from the device is detected). Have a struct holding all command/response data which will be passed to sending function and returned by receiving function. Since each command has a corresponding enum value, have a switch statement which sets up any command specific data for sending.
Is my idea the best way to do it? Is there a design pattern I could use here? I've looked and looked but nothing seems to fit my needs.
Thanks in advance! (Please let me know if clarification is necessary)

This reminds me of the REST vs. SOA debate, albeit on a smaller physical scale.
If I understand you correctly, right now you have calls like
device->DoThing();
device->DoOtherThing();
and then sometimes I get a callback like
callback->DoneThing(ThingResult&);
callback->DoneOtherTHing(OtherThingResult&)
I suggest that the user is the key component here. Do the current library users like the interface at the level it is designed? Is the interface consistent, even if it is large?
You seem to want to propose
device->Do(ThingAndOtherThingParameters&)
callback->Done(ThingAndOtherThingResult&)
so to have a single entry point with more complex data.
The downside from a library user perspective may that now I have to use a manual switch() or other type statement to tell what really happened. While the dispatching to the appropriate result callback used to be done for me, now you have made it a burden upon the library user.
Unless this bought me as a user some level of flexibility, that I as as user wanted I would consider this a step backwards.
For your part as an implementor, one suggestion would be to go to the generic form internally, and then offer both interfaces externally. Perhaps the old specific interface could even be auto-generated somehow.
Good Luck.

Well, your question implies that there is a balance between the library's complexity and the client's. When those are the only two choices, one almost always goes with making the client's life easier. However, those are rarely really the only two choices.
Now in the text you talk about a command processing architecture where each command has a different set of data associated with it. In the olden days, this would typically be implemented with a big honking case statement in a loop, where each case called a different routine with different parameters and perhaps some setup code. Grisly. McCabe complexity analysers hate this.
These days what you can do with an OO language is use dynamic dispatch. Create a base abstract "command" class with a standard "handle()" method, and have each different command inherit from it to add their own members (to represent the different "arguments" to the different commands). Then you create a big honking array of these at startup, usually indexed by the command ID. For languages like C++ or Ada it has to be an array of pointers to "command" objects, for the dynamic dispatch to work. Then you can just call the appropriate command object for the command ID you read from the client. The big honking case statement is now handled implicitly by the dynamic dispatch.
Where you can get the big savings in this scenario is in subclassing. Do you have several commands that use the exact same parameters? Make a subclass for them, and then derive all of those commands from that subclass. Do you have several commands that have to perform the same operation on one of the parameters? Make a subclass for them with that one method implemented for that operation, and then derive all those commands from that subclass.

Your first objective should be to produce a library that decouples higher software layers from the hardware. Users of your library shouldn't care that you have a hardware device that can execute a number of functions with a different payload. They should only care what the device does in a higher level. In this sense, it is in my opinion a good thing that every command is mapped to each one function.
My plan will be:
Identify the objects the higher data layers need to get the job done. Model the objects in C++ classes from their perspective, not from the perspective of the hardware
Define the interface of the library using the above objects
Start the implementation of the library. Perhaps an intermediate layer that maps software objects to hardware objects is necessary
There are many things you can do to reduce code repetition. You can use polymorphism. Define a class with the base functionality and extend it. You can also use utility classes, that implement functions needed for many commands.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js