C++ modules and circular class reference

C++ modules and circular class reference - c++

To learn more about C++20 modules, I'm in the process of migrating a graphics application from header files to modules. At the moment I have a problem with a circular dependency between two classes. The two classes describe nodes and edges of a graph. The edge class has pointers to two nodes and the node class has a vector of pointers to adjacent edges. I know, there are other ways to describe a graph, but this archtitecture seems very natural to me, I have very fast access to neighboring elements and it works seamlessly in the old world of header files and #include. The key are forward references.
But in the new world of C++20 modules, forward references no longer work.
The topic of circular references has been discussed in many places, but I haven't yet found a solution that really convinces me.
A common statement is that circular references are an architectural problem and should be avoided. If necessary, the two classes should be packed into one module. That would clearly be a step backwards. I try to make modules small and elementary.
I could replace the pointers to nodes or edges with pointers to a common base class NetworkObject that actually already exists. But that would destroy valuable information and force me to use static_cast to artificially add the type information back.
My question is: Am I missing anything? Is there an easier way?

There are a few misconceptions I can see here. Not entirely false, but not entirely true either.
But in the new world of C++20 modules, forward references no longer work.
This is not completely true. You cannot use forward reference that would declare something as part of a different module, but you can certainly do that within the same module.
For example:
export module M;
export namespace n {
struct B;
struct A {
B* b;
};
struct B {
A* a;
};
}
Then you can split it up in multiple module partitions:
export module M:a;
namespace n {
struct B;
export struct A {
B* b;
};
};
export module M:b;
namespace n {
struct A;
export struct B {
A* b;
};
};
export module M;
export import :a;
export import :b;
The gist of it is that types that depends on each other to be defined are coupled enough that they must reside in the same module.
Also, note that modules are not necessarily supposed to be as granular as headers. Dividing your modules too much could hurt compile time performances. For example, a whole library could be just one big module. The standard library chose this approach and export everything in the std modules and turns out it's faster than dividing the standard library in many smaller modules.
Smaller modules are not as good as many may think. Related things and classes should be packed in the same module, and if further splitting is needed for code organization within that module, partitions are an option.
The amount of modules and their name is part of your API. This means that if you have too much fine grained module, simply moving your code around will result in a breaking change. Module partitions are not part of your API and can be moved around freely.
A common statement is that circular references are an architectural problem and should be avoided. If necessary, the two classes should be packed into one module. That would clearly be a step backwards. I try to make modules small and elementary.
Those modules would not be small and elementary because of the cycle between them. ie you can't just use one module without also using the other. You will need to link against that other module if the implementation reside in another static library.
The two classes describe nodes and edges of a graph
We there be a program that would work with only the nodes module or only the edges module? Hardly. They should be part of the graph module. You could have a :edge and :node partitions, but it would not make sense using only one of those in a program or part of program.
If this is for compile times, then making bigger modules has been proven today that they are faster than smaller modules with current compiler technologies
The rationale for splitting modules into smaller modules is that there would be a use case for wanting to only import certain specific things. For example, std.freestanding would only contain the freestanding part of the standard library so programmers don't accidentally use parts they are not allowed to use.
Of course, another way to do that would be to drop all the modules safeguards and use Global Module Fragments (GMF). Using that allows modules to interface with the implicit global module. And yes, using that allows the benefit and the consequences that comes with global forward declaration. You will open the way for ODR violations to become possible again, and your entities won't be part of a named module anymore. It also allows a user to use your entities without importing the specific named module the declaration reside in, bypassing the API you expose to your users via your module names.
You can open Pandora's box using the extern "C++" directive:
export module A;
export namespace n {
extern "C++" {
struct B;
struct A {
B* b;
};
}
}
export module B;
export namespace n {
extern "C++" {
struct A;
struct B {
A* a;
};
}
}
Live example

Related

Forward declarations in C++ modules (MSVC)

I have been experimenting with modules implementation as provided by the MSVC lately and I've run into an interesting scenario. I have two classes that have a mutual dependency in their interfaces, which means that I'll have to use forward declarations to get it to compile. The following code shows an example:
Module interface
export module FooBar;
export namespace FooBar {
class Bar;
class Foo {
public:
Bar createBar();
};
class Bar {
public:
Foo createFoo();
};
}
Module implementation
module FooBar;
namespace FooBar {
Bar Foo::createBar() {
return Bar();
}
Foo Bar::createFoo() {
return Foo();
}
}
Now I would like to split these two classes up into their own modules named Foo and Bar. However, each module needs to import the other as their interfaces depend on each other. And according to the modules proposal as it currently stands, circular interface imports are not allowed. This article suggests to use the proclaimed ownership declaration, but it seems that this is not yet implemented in the MSVC implementation of modules.
Therefore, am I correct in assuming that this situation currently cannot be resolved using the current implementation as provided by the MSVC? Or is there some alternative I am missing? In this scenario the situation is pretty trivial, however I encountered this problem while modularizing a library which has many classes that have such dependencies. I realize circular dependencies are often an indication of a bad design, however in certain instances they are unavoidable or difficult to refactor away.

You can create a third module which exports only forward declarations to each of your classes (could be many classes).
Then you import this module into both (or all) of your modules, where it provides the forward declarations needed to implement each module.
Unfortunately, MSVC has still (today is version 16.7) issues with modules; although this approach works, you get often completely wild error messages; for example, "cannot convert MyClass* to MyClass* - no conversion provided (this example happens when you directly add forward decalarations to the same class into several modules; the compiler considers them different animals).
Another issue is if you forget to import all the modules needed, the error message is either grossly misleading ('no such method in that class'), or the compiler aborts with an internal error.
Don't expect too much until they have completed their work.

Can I sandbox a namespace that uses static data?

Say I have (pretty large) C++ module in a namespace foo which has a lot (well, at least one) of static data, namespace-global data and Singletons and so forth, spread across a myriad of files and directories. Is there any way to "sandbox" that entire thing in order to run independent versions at the same time (in the same process, that is). How many versions are to be run will be decided at runtime.
I thought about wrapping everything in several namespaces (e.g. bar1::foo, bar2::foo, ...), but that is a) not possible, since I don't want to touch all files and b) it would not enable me to have an arbitrary number at runtime.
Update: I was thinking perhaps I could create separate thread for each version, but I'm not that well versed with threads.

Consider putting your foo code inside a shared object. During runtime you can load and unload that shared object as often as you desire.
For an initial reference on dynamic loading of shared object, take a peek at http://www.yolinux.com/TUTORIALS/LibraryArchives-StaticAndDynamic.html

Basically you have created a namespace with state, this is bad, you want to use a class for this use case you should be able to change it reasonably easily so that it a class
So where you have had
namespace foo{
int state;
int func();
}
foo::func();
you need
class foo{
int state;
int func();
};
foo foo1;
foo1.func();

Module and classes handling (dynamic linking)

Run into a bit of an issue, and I'm looking for the best solution concept/theory.
I have a system that needs to use objects. Each object that the system uses has a known interface, likely implemented as an abstract class. The interfaces are known at build time, and will not change. The exact implementation to be used will vary and I have no idea ahead of time what module will be providing it. The only guarantee is that they will provide the interface. The class name and module (DLL) come from a config file or may be changed programmatically.
Now, I have all that set up at the moment using a relatively simple system, set up something like so (rewritten pseudo-code, just to show the basics):
struct ClassID
{
Module * module;
int number;
};
class Module
{
HMODULE module;
function<void * (int)> * createfunc;
static Module * Load(String filename);
IObject * CreateClass(int number)
{
return createfunc(number);
}
};
class ModuleManager
{
bool LoadModule(String filename);
IObject * CreateClass(String classname)
{
ClassID class = AvailableClasses.find(classname);
return class.module->CreateObject(class.number);
}
vector<Module*> LoadedModules;
map<String, ClassID> AvailableClasses;
};
Modules have a few exported functions to give the number of classes they provide and the names/IDs of those, which are then stored. All classes derive from IObject, which has a virtual destructor, stores the source module and has some methods to get the class' ID, what interface it implements and such.
The only issue with this is each module has to be manually loaded somewhere (listed in the config file, at the moment). I would like to avoid doing this explicitly (outside of the ModuleManager, inside that I'm not really concerned as to how it's implemented).
I would like to have a similar system without having to handle loading the modules, just create an object and (once it's all set up) it magically appears.
I believe this is similar to what COM is intended to do, in some ways. I looked into the COM system briefly, but it appears to be overkill beyond belief. I only need the classes known within my system and don't need all the other features it handles, just implementations of interfaces coming from somewhere.
My other idea is to use the registry and keep a key with all the known/registered classes and their source modules and numbers, so I can just look them up and it will appear that Manager::CreateClass finds and makes the object magically. This seems like a viable solution, but I'm not sure if it's optimal or if I'm reinventing something.
So, after all that, my question is: How to handle this? Is there an existing technology, if not, how best to set it up myself? Are there any gotchas that I should be looking out for?

COM very likely is what you want. It is very broad but you don't need to use all the functionality. For example, you don't need to require participants to register GUIDs, you can define your own mechanism for creating instances of interfaces. There are a number of templates and other mechanisms to make it easy to create COM interfaces. What's more, since it is a standard, it is easy to document the requirements.
One very important thing to bear in mind is that importing/exporting C++ objects requires all participants to be using the same compiler. If you think that ever could be a problem to you then you should use COM. If you are happy to accept that restriction then you can carry on as you are.

I don't know if any technology exists to do this.
I do know that I worked with a system very similar to this. We used XML files to describe the various classes that different modules made available. Our equivalent of ModuleManager would parse the xml files to determine what to create for the user at run time based on the class name they provided and the configuration of the system. (Requesting an object that implemented interface 'I' could give back any of objects 'A', 'B' or 'C' depending on how the system was configured.)
The big gotcha we found was that the system was very brittle and at times hard to debug/understand. Just reading through the code, it was often near impossible to see what concrete class was being instantiated. We also found that maintaining the XML created more bugs and overhead than expected.
If I was to do this again, I would keep the design pattern of exposing classes from DLL's through interfaces, but I would not try to build a central registry of classes, nor would I derive everything from a base class such as IObject.
I would instead make each module responsible for exposing its own factory functions(s) to instantiate objects.

c++ Namespace headaches

Okay, this question has evolved a bit, and I want to try to start (over) with the basic goals I'm shooting for:
Create library code that wrappers legacy C-language entities in C++ resource acquisition is initialization and also provides basic or better exception guarantee.
Enable clients of this code to use it in a very natural C++ fashion w/o creating a lot of overhead to existing code to convert it to use the C++ wrapper objects (i.e. automatic conversion to appropriate legacy types, constructors that take legacy types, etc.)
Limit the namespace impact of the library code. Ideally, The library would have several sub-namespaces that provide related functionality that limit the volume and impact of using namespace X type declarations - much as the boost libraries do (i.e. use of details namespaces to only inject those symbols that the user would reasonably want to use, and hide those that are implementation details; also limit the extent of possible new meanings to existing symbols, to avoid surprising implicit conversions within user-code)
Require that clients explicitly ask for those parts of the library that they actually want to inject into their code base. This goes hand in hand with limiting the impact of inclusion of the the library's headers. The client code should have a reasonable level of control over which parts of the library are going to be automatically used for name-resolution when they compile their code.
My own library code should not have to be riddled with refactor-brittle code constructs. It would be ideal if the library's headers didn't have to constantly declare private typedefs in order to have access to the rest of that section of the library. Or in other words: I want my library to be able to be written as intuitively as my clients get to when making use of said library. Name resolution should include the namespace that the library is defined within in addition to any others that have been explicitly "using'd".
I come across this scenario often, and am looking for a better way...
I have a class, C in namespace N. C has a member, Free. It free's something that C manages, and allows C to manage a new thing.
There are several global Free functions. There are also a few helper functions in the same namespace N as C, one of which is a helper that free's the thing managed by C, named free.
So we have something like:
namespace N {
void free(THING * thing);
class C
{
public:
... details omitted...
free()
{
free(m_thing); // <- how best to refer to N::free(THING&)
}
}
} // namespace N
I could use N::free(m_thing). But that seems unfortunate to me. Is there no way to refer to that which is outside the class scope but without resolving absolute namespace (a relative one step out scope-wise)?
It seems to me that having to name N::free is obnoxious, since you wouldn't have to if this were a free-standing function. Nor would you need to if the class's method name happened to be different (e.g. dispose). But because I've used the same name, I cannot access it without having to specify what amounts to an absolute path - rather than a relative path - if you'll indulge me the analogy.
I hate absolute paths. They make moving things around in namespaces very brittle, so code-refactoring becomes much uglier. Plus, the rules of how to name things in function bodies becomes more complex with the current set of rules (as I understand them) - less regular - inducing a schism between what one expects and what one gets as a programmer.
Is there a better way to access free-standing functions in the same namespace as a class without having to absolutely name the free-function absolutely?
EDIT:
Perhaps I should have gone with a less abstract example:
namespace Toolbox {
namespace Windows {
// deallocates the given PIDL
void Free(ITEMIDLIST ** ppidl);
class Pidl
{
public:
// create empty
Pidl() : m_pidl(NULL) { }
// create a copy of a given PIDL
explicit Pidl(const ITEMIDLIST * pidl);
// create a PIDL from an IShellFolder
explicit Pidl(IShellFolder * folder);
...
// dispose of the underlying ITEMIDLIST* so we can be free to manage another...
void Free();
};
So ITEMIDLIST* come from a variety of places, and are destroyed with CoTaskMemFree(). I could introduce Pidl as a global name - as well as all of the helper functions in the "Windows Shell.h" header that is part of my toolbox library.
Ideally, I would segment some of the tools in my library by what they relate to - in this case the above all relates to COM programming in Windows. I have chose Toolbox as the base namespace for my libraries stuff, and was currently thinking I'd use Toolbox::Windows for very windows-y functions, classes, etc.
But the C++ namespace and name-resolution rules seem to make this very difficult (hence this question). It makes it very unnatural to create such segmentation of my code - since koenig lookup fails (since ITEMIDLIST is not in my Toolbox::Windows namespace), and I don't have the ability to move it there! Nor should I. The language should be flexible enough, IMO, to both allow for extension libraries such as my Toolbox library to extend other folks code without having to inject my extensions into their namespace (which, in the case of Win32 and the general vast majority of code that exists today, is the GLOBAL NS - which is the whole point of making namespaces in the first place: to avoid global NS crowding / pollution / ambiguity / programming surprises).
So, I come back around to, Is there a better way to do this: Extend existing libraries of code while not polluting their NS with my extensions but still allow for intuitive and useful name resolution as one would expect if my code were in their NS but explicitly introduced by the client of my code (i.e. I don't want to inject my code willy-nilly, but only upon explicit request)?
Another Thought: Perhaps what would satisfy my above criterea would be if I had the following:
using namespace X {
code here...
}
Where I could place such a construct anywhere, including in a header, and I would not have to be concerned about dragging X into my client's code, but I would have the same freedom to write my source code as I would if I were in the root namespace.

I don't see to avoid having to qualify namespace-scope free() here, but it should be noted that this is not the same as an "absolute path". For one thing, if you have nested namespaces, you only have to refer to the innermost one:
namespace N1 {
namespace N2 {
namespace N3 {
void free(THING * thing);
class C {
public:
free() {
N3::free(m_Thing); // no need to do N1::N2::
}
};
}
}
}
[EDIT] in response to edited question. Again, I do not see any way to do this in the exact scenario that you describe. However, it doesn't seem to be idiomatic C++ approach - the more common way to do the same is to provide your own wrapper class for ITEMIDLIST that manages all allocations, RAII-style, and exposes the original handle (e.g. via conversion operators a la ATL, or an explicit member function like c_str() if you want extra safety). That would remove the need for your free altogether, and for any other free function that you might want there, since you control the wrapper type and the namespace it's in, you can use ADL as usual.
[EDIT #2]. This part of the question:
allow for intuitive and useful name resolution as one would expect if my code were in their NS but explicitly introduced by the client of my code (i.e. I don't want to inject my code willy-nilly, but only upon explicit request)?
Isn't this precisely what you putting it in a namespace, and your client writing using namespace ..., will achieve?

You have two options:
Fully qualify function name.
Let Koenig lookup do the job (if THING belongs to namespace N).
I'd choose the first one with such popular function name as free.
Alternate option is to use m_thing->Release() or something like that.

Yuu could write ::free(m_thing);. This will call the global free() function. However, if you have another free() function outside the N namespace, you've got yourself into trouble. Either change names of some of your functions or use the function name with explicit namespace.

It seems to me that there is a problem of design here.
I find strange to have the same identifier used both as a free-function and a class method, it really is confusing. I try to minimize the occurence as much as possible, although I admit that it happens for swap...
I try not to qualify the names within the method, but here of course you run the risk of hiding, where MyClass::free hides the N::free method, which never participates in the lookup... tried to find about the scope resolution here but could not find the exact passage.
For swap I simply use:
class MyClass
{
void swap(MyClass& rhs)
{
using std::swap; // Because int, etc... are not in std
swap(m_foo, rhs.m_foo);
swap(m_bar, rhs.m_bar);
}
};
inline void swap(MyClass& lhs, MyClass& rhs) { lhs.swap(rhs); }
Therefore I make sure that the right method will be included in the lookup, and avoid falling back on a 'global' method if any.
I prefer this approach to explicit naming and usually bolts using and typedef declaration at the top of the method so that the actual code is not bloated.

I would use a completely different approach to what you are trying to accomplish. I think you need a bit of redesigning. Remove the globals and make a factory/abstract factory that will fabricate (create an instance of your class) and destroy it with the destructor. In fact, that would be the RAII idiom.

How to better organize the code in C++ projects

I'm currently in the process of trying to organize my code in better way.
To do that I used namespaces, grouping classes by components, each having a defined role and a few interfaces (actually Abstract classes).
I found it to be pretty good, especially when I had to rewrite an entire component and I did with almost no impact on the others. (I believe it would have been a lot more difficult with a bunch of mixed-up classes and methods)
Yet I'm not 100% happy with it. Especially I'd like to do a better separation between interfaces, the public face of the components, and their implementations in behind.
I think the 'interface' of the component itself should be clearer, I mean a new comer should understand easily what interfaces he must implement, what interfaces he can use and what's part of the implementation.
Soon I'll start a bigger project involving up to 5 devs, and I'd like to be clear in my mind on that point.
So what about you? how do you do it? how do you organize your code?

Especially I'd like to do a better
separation between interfaces, the
public face of the components, and
their implementations in behind.
I think what you're looking for is the Facade pattern:
A facade is an object that provides a simplified interface to a larger body of code, such as a class library. -- Wikipedia
You may also want to look at the Mediator and Builder patterns if you have complex interactions in your classes.
The Pimpl idiom (aka compiler firewall) is also useful for hiding implementation details and reducing build times. I prefer to use Pimpl over interface classes + factories when I don't need polymorphism. Be careful not to over-use it though. Don't use Pimpl for lightweight types that are normally allocated on the stack (like a 3D point or complex number). Use it for the bigger, longer-lived classes that have dependencies on other classes/libraries that you'd wish to hide from the user.
In large-scale projects, it's important to not use an #include directives in a header file when a simple forward declaration will do. Only put an #include directives in a header file if absolutely necessary (prefer to put #includes in the implementation files). If done right, proper #include discipline will reduce your compile times significantly. The Pimpl idiom can help to move #includes from header files to their corresponding implementation files.
A coherent collection of classes / functions can be grouped together in it's own namespace and put in a subdirectory of your source tree (the subdirectory should have the same name as the library namespace). You can then create a static library subproject/makefile for that package and link it with your main application. This is what I'd consider a "package" in UML jargon. In an ideal package, classes are closely related to each other, but loosely related with classes outside the package. It is helpful to draw dependency diagrams of your packages to make sure there are no cyclical dependencies.

There are two common approaches:
If, apart from the public interface, you only need some helper functions, just put them in an unnamed namespace in the implementation file:
// header:
class MyClass {
// interface etc.
};
// source file:
namespace {
void someHelper() {}
}
// ... MyClass implementation
If you find yourself wanting to hide member functions, consider using the PIMPL idiom:
class MyClassImpl; // forward declaration
class MyClass {
public:
// public interface
private:
MyClassImpl* pimpl;
};
MyClassImpl implements the functionality, while MyClass forwards calls to the public interface to the private implementation.

You might find some of the suggestions in Large Scale C++ Software Design useful. It's a bit dated (published in 1996) but still valuable, with pointers on structuring code to minimize the "recompiling the world when a single header file changes" problem.

Herb Sutter's article on "What's In a Class? - The Interface Principle" presents some ideas that many don't seem to think of when designing interfaces. It's a bit dated (1998) but there's some useful stuff in there, nonetheless.

First declare you variables you may use them in one string declaration also like so.
char Name[100],Name2[100],Name3[100];
using namespace std;
int main(){
}
and if you have a long peice of code that could be used out of the program make it a new function.
likeso.
void Return(char Word[100]){
cout<<Word;
cin.ignore();
}
int main(){
Return("Hello");
}
And I suggest any outside functions and declarations you put into a header file and link it
likeso
Dev-C++ #include "Resource.h"

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

C++ modules and circular class reference - c++

Related

Forward declarations in C++ modules (MSVC)

Can I sandbox a namespace that uses static data?

Module and classes handling (dynamic linking)

c++ Namespace headaches

How to better organize the code in C++ projects

Categories

Resources