I have been experimenting with the modules implementation provided by MSVC lately and have run into an interesting scenario. I have two classes with a mutual dependency in their interfaces, which means that I have to use forward declarations to get the code to compile. The following code shows an example:
Module interface
export module FooBar;
export namespace FooBar {
class Bar;
class Foo {
public:
Bar createBar();
};
class Bar {
public:
Foo createFoo();
};
}
Module implementation
module FooBar;
namespace FooBar {
Bar Foo::createBar() {
return Bar();
}
Foo Bar::createFoo() {
return Foo();
}
}
Now I would like to split these two classes up into their own modules named Foo and Bar. However, each module needs to import the other, as their interfaces depend on each other. And according to the modules proposal as it currently stands, circular interface imports are not allowed. This article suggests using a proclaimed ownership declaration, but it seems that this is not yet implemented in MSVC's implementation of modules.
Therefore, am I correct in assuming that this situation cannot be resolved with the current MSVC implementation? Or is there some alternative I am missing? In this scenario the situation is pretty trivial; however, I encountered this problem while modularizing a library that has many classes with such dependencies. I realize circular dependencies are often an indication of bad design, but in certain instances they are unavoidable or difficult to refactor away.
You can create a third module which exports only forward declarations to each of your classes (could be many classes).
Then you import this module into both (or all) of your modules, where it provides the forward declarations needed to implement each module.
Unfortunately, MSVC still has issues with modules (as of version 16.7). Although this approach works, you often get completely wild error messages, for example "cannot convert MyClass* to MyClass* - no conversion provided". (This happens when you add forward declarations of the same class directly to several modules; the compiler considers them different animals.)
Another issue is that if you forget to import all the modules needed, the error message is either grossly misleading ('no such method in that class') or the compiler aborts with an internal error.
Don't expect too much until they have completed their work.
To learn more about C++20 modules, I'm in the process of migrating a graphics application from header files to modules. At the moment I have a problem with a circular dependency between two classes. The two classes describe nodes and edges of a graph. The edge class has pointers to two nodes and the node class has a vector of pointers to adjacent edges. I know there are other ways to describe a graph, but this architecture seems very natural to me, I have very fast access to neighboring elements, and it works seamlessly in the old world of header files and #include. The key is forward references.
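For illustration, the header-era layout described above might look roughly like the following. This is only a sketch: Node, Edge, attach and opposite are assumed names, not the application's real ones, and ownership of the objects is left out.

```cpp
#include <cstddef>
#include <vector>

// Forward declaration: enough to declare pointers to Edge before its definition.
class Edge;

class Node {
public:
    // Record an edge that touches this node (non-owning pointer).
    void attach(Edge* e) { edges_.push_back(e); }
    std::size_t degree() const { return edges_.size(); }
    const std::vector<Edge*>& edges() const { return edges_; }
private:
    std::vector<Edge*> edges_;
};

class Edge {
public:
    // Connects two nodes and registers itself with both of them.
    Edge(Node& from, Node& to) : from_(&from), to_(&to) {
        from.attach(this);
        to.attach(this);
    }
    Node* from() const { return from_; }
    Node* to() const { return to_; }
    // Returns the node at the other end, or nullptr if n is not an endpoint.
    Node* opposite(const Node* n) const {
        if (n == from_) return to_;
        if (n == to_) return from_;
        return nullptr;
    }
private:
    Node* from_;
    Node* to_;
};
```

With headers, Node.hpp only needs the one-line forward declaration of Edge, which is exactly the part that does not carry over when Node and Edge live in two different named modules.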
But in the new world of C++20 modules, forward references no longer work.
The topic of circular references has been discussed in many places, but I haven't yet found a solution that really convinces me.
A common statement is that circular references are an architectural problem and should be avoided. If necessary, the two classes should be packed into one module. That would clearly be a step backwards. I try to make modules small and elementary.
I could replace the pointers to nodes or edges with pointers to a common base class NetworkObject that actually already exists. But that would destroy valuable information and force me to use static_cast to artificially add the type information back.
My question is: Am I missing anything? Is there an easier way?
There are a few misconceptions I can see here. Not entirely false, but not entirely true either.
But in the new world of C++20 modules, forward references no longer work.
This is not completely true. You cannot use a forward declaration that would declare something as part of a different module, but you can certainly do that within the same module.
For example:
export module M;
export namespace n {
struct B;
struct A {
B* b;
};
struct B {
A* a;
};
}
Then you can split it up in multiple module partitions:
export module M:a;
namespace n {
struct B;
export struct A {
B* b;
};
}
export module M:b;
namespace n {
struct A;
export struct B {
A* a;
};
}
export module M;
export import :a;
export import :b;
The gist of it is that types that depend on each other in order to be defined are coupled enough that they must reside in the same module.
Also, note that modules are not necessarily supposed to be as granular as headers. Dividing your modules too finely could hurt compile-time performance. For example, a whole library could be just one big module. The standard library chose this approach and exports everything in the std modules, and it turns out to be faster than dividing the standard library into many smaller modules.
Smaller modules are not as good as many may think. Related things and classes should be packed in the same module, and if further splitting is needed for code organization within that module, partitions are an option.
The number of modules and their names are part of your API. This means that if your modules are too fine-grained, simply moving your code around will result in a breaking change. Module partitions are not part of your API and can be moved around freely.
A common statement is that circular references are an architectural problem and should be avoided. If necessary, the two classes should be packed into one module. That would clearly be a step backwards. I try to make modules small and elementary.
Those modules would not be small and elementary, because of the cycle between them, i.e., you can't use one module without also using the other. You would also need to link against the other module if its implementation resides in another static library.
The two classes describe nodes and edges of a graph
Would there be a program that would work with only the nodes module or only the edges module? Hardly. They should be part of the graph module. You could have :edge and :node partitions, but it would not make sense to use only one of those in a program or part of a program.
If this is about compile times, bigger modules have so far proven faster than smaller modules with current compiler technology.
The rationale for splitting modules into smaller modules is that there would be a use case for wanting to only import certain specific things. For example, std.freestanding would only contain the freestanding part of the standard library so programmers don't accidentally use parts they are not allowed to use.
Of course, another way to do that would be to drop all the module safeguards and use Global Module Fragments (GMF). Doing so allows modules to interface with the implicit global module, with both the benefits and the consequences that come with global forward declarations. You open the way for ODR violations to become possible again, and your entities won't be part of a named module anymore. It also allows a user to use your entities without importing the specific named module the declaration resides in, bypassing the API you expose to your users via your module names.
You can open Pandora's box using the extern "C++" directive:
export module A;
export namespace n {
extern "C++" {
struct B;
struct A {
B* b;
};
}
}
export module B;
export namespace n {
extern "C++" {
struct A;
struct B {
A* a;
};
}
}
I've gotten into a bit of a design block in a C++ program of mine as two different header files are required to reference each other. Typically a forward declaration would be used here, but since both classes use template functions/constructors a forward declaration cannot be used as methods/variables from both classes need to be used.
For example consider the following scenario (this is pseudo code as an example, it may/may not compile. The objects are representative of my actual application so if a redesign is necessary then I'd love to understand the design philosophies of what I did wrong)
// Application.hpp
#include <array>
#include <cstddef>
#include <Assets.hpp>
#include <Logger.hpp>
class Application {
public:
// Some brilliant code here ...
Logger myLogger;
// The size parameter of std::array is a std::size_t; an int parameter cannot be deduced from it
template <std::size_t someArrayLen> Application(std::array<int, someArrayLen> myArr, SomeOtherTypes someOtherStuff) : myLogger(stuffHere) {
mainAssets = new Assets(myArr);
}
~Application(); // Assume this is implemented in Application.cpp and has delete mainAssets;
};
extern Application* mainApp; // Assume Application* mainApp = nullptr; in Application.cpp
// Assets.hpp
#include <array>
#include <cstddef>
// #include <Application.hpp> ???? The issue lies here
class Assets {
private:
// Random data structures/stuff for holding shaders/textures/etc
protected:
template <std::size_t someArrayLen> Assets(std::array<int, someArrayLen> myArr) {
if (!shadersSupported()) {
// Main app is an unknown symbol
mainApp->myLogger.error("Your GPU is too old/whatever!"); // myLogger is an object, not a pointer
}
// Random code for loading assets based on my template stuff
}
friend class Application;
public:
// Get assets/whatever here
};
extern Assets* mainAssets; // Assume Assets* mainAssets = nullptr; in Assets.cpp
How can I fix the compile error regarding mainApp being an unknown symbol? Any feedback/help is appreciated, thanks!
I've already looked through all the following questions but none address this unique scenario:
two classes referencing each other
This question had no use of templates so forward declarations could be used as the method bodies weren't defined in the headers
Two classes referencing each other with hash template specialization
The solution from this question cannot be used as here the compiler was unable to figure out how much memory to allocate, whereas in my question the issue isn't regarding the compiler being confused with how much to allocate but rather what to reference
Two template classes being composed of a member of each other
This question addressed a design flaw of circular dependencies which my application does not have, both classes are stored globally, they are just instantiated in separate constructors which reference each other.
Two classes that refer to each other
This question provides forward declarations as a solution which cannot be used here due to the requirement for using the class methods/constructors in template function definitions.
I've also already considered the following:
Trying to change from std::array to pointers, this wouldn't work as my Assets constructor does rely on the lengths of the array.
Trying to change from std::array to std::vector, I want to stick to aggregate initialization so it can be done at compile time, I believe vectors/lists would be too heavy for this.
Forward declarations will indeed work for your problem. The key is that function templates can be defined out of line (i.e., not in your class ... { }; declaration) legally. The same can be achieved for arbitrary functions using the inline keyword.
To solve your specific problem, just split Application.hpp into Application_fwd.hpp and Application.hpp, similar to iosfwd. Application_fwd.hpp contains almost all the code, and Application.hpp includes Application_fwd.hpp and Assets.hpp before defining the Application::Application function template (just like you would define a function in a *.cpp file).
In Assets.hpp, you can simply use Application_fwd.hpp as long as you do not use the constructor. If you also use the Application constructor in Assets.hpp, things become a bit more complicated in that you need to very carefully consider all possible inclusion scenarios (i.e., what happens exactly every time one of your headers is included by themselves or a user) to make sure that it resolves in the order that you need it to without the guards causing trouble.
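As a rough single-file sketch of that split (the comments mark which header each part would live in): the Logger stand-in, assetCount and the simplified constructor signatures are assumptions made for illustration, not the original application's code.

```cpp
#include <array>
#include <cstddef>
#include <string>
#include <vector>

// --- would live in Application_fwd.hpp: declarations only ---
class Assets;  // a forward declaration is enough at this point

struct Logger {  // hypothetical stand-in for Logger.hpp
    std::vector<std::string> errors;
    void error(const std::string& msg) { errors.push_back(msg); }
};

class Application {
public:
    Logger myLogger;
    template <std::size_t N>
    explicit Application(const std::array<int, N>& arr);  // declared, not defined
    ~Application();
    Application(const Application&) = delete;
    Application& operator=(const Application&) = delete;
    std::size_t assetCount() const;
private:
    Assets* assets_ = nullptr;
};

extern Application* mainApp;

// --- would live in Assets.hpp (includes Application_fwd.hpp) ---
class Assets {
public:
    std::size_t count = 0;
private:
    template <std::size_t N>
    explicit Assets(const std::array<int, N>&) : count(N) {
        // Application is a complete type here, so its members can be used.
        if (N == 0 && mainApp) mainApp->myLogger.error("no assets given");
    }
    friend class Application;
};

// --- would live in Application.hpp (includes both headers above) ---
template <std::size_t N>
Application::Application(const std::array<int, N>& arr) {
    mainApp = this;
    assets_ = new Assets(arr);  // Assets is a complete type here
}

// Non-template functions defined in a header need `inline`.
inline Application::~Application() {
    delete assets_;
    if (mainApp == this) mainApp = nullptr;
}

inline std::size_t Application::assetCount() const { return assets_->count; }

// --- would live in Application.cpp ---
Application* mainApp = nullptr;
```

The point of the ordering is that every function body appears only after both class definitions are complete, so neither header needs the other's full definition at the point where its class is declared.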
In a cross-platform C++ library, we use shared headers whose implementation is compiled from a different module for each OS (a.k.a. a link seam).
//Example:
//CommonHeader.h
#ifndef COMMON_HEADER_H
#define COMMON_HEADER_H
class MyClass {
public:
void foo();
};
#endif
//WindowsModule.cpp
#include <cstdio>
#include "CommonHeader.h"
void MyClass::foo() {
//DO SOME WINDOWS API STUFF
std::printf("This implements the Windows version\n");
}
//LinuxModule.cpp
#include <cstdio>
#include "CommonHeader.h"
void MyClass::foo() {
//DO SOME LINUX SPECIFIC STUFF HERE
std::printf("This implements the Linux version\n");
}
Of course, in each build you select only one module, according to the environment you are targeting.
This is meant to avoid indirect calls to the functions.
My question is: how should this relationship be noted in UML?
"Inheritance"? "Dependency"? "Note"?
class MyClass {
public:
void foo();
};
This is nothing more than a class contract, so basically an interface which you realize in different modules. To visualize that, you can use interface realization notation (like a generalization, but with dashed lines).
The reality, I think, is that you've got only one class in a UML class diagram, namely MyClass, with a public operation foo(); it's just that you have two code implementations of this class, one for Linux and one for Windows. UML class models are not really set up to answer the question of how you implement this, in your case using C++ with header files. Imagine that you instead inlined your functions and ended up writing two inline implementations of MyClass, one for Linux and one for Windows. Logically you would still have one class in your class diagram (but with two implementations), and the header file (or lack thereof) wouldn't come into it. What you need, therefore, is a way to show how the C++ code is structured, not a way to show the logical class constructs. I'm not aware of a specific way in UML to represent code structure; however, you could perhaps build a code structure model using artifacts, which could denote the C++ file structure.
It could be seen as some kind of "inheritance". However, there is no relation between classes, as there is just one class. The construct of using platform-dependent implementations does, however, imitate an "is a" relation.
I have a class Foo which I do not implement directly; instead it wraps external libraries (e.g. FooXternal1 or FooXternal2).
One way that I have seen to do this, is using preprocessor directives as
#include "config.h"
#include "foo.h"
#ifdef _FOOXTERNAL1_WRAPPER_
//implementation of class Foo using FooXternal1
#endif
#ifdef _FOOXTERNAL2_WRAPPER_
//implementation of class Foo using FooXternal2
#endif
and a config.h is used to define these preprocessor flags (_FOOXTERNAL1_WRAPPER_ and _FOOXTERNAL2_WRAPPER_).
I have the impression this is frowned upon by the C++ programmer community because it uses preprocessor directives, is hard to debug, etc. Further, it does not allow for the parallel existence of both implementations.
I thought about making Foo a base class and inheriting from it to allow for both implementations to exist in parallel with each other. But I ran into two problems:
Pure virtual functions: I cannot instantiate an object of type 'Foo', which I need during use.
Virtual functions run the risk of running an object with no (proper) implementation.
Am I missing something? Is there a cleaner way to do this?
EDIT: To summarize, there are 3(.5?!) ways of doing the wrapping. 2(.5) are given by icepack, and the last by Sergey:
1- Use factory methods
2- Use preprocessor directives
2.5- Use makefile or IDE to effectively do the work of the preprocessor directives
3.5- Use templates, as suggested by Sergey
I am working on an embedded system where resources are limited, so I decided to use template<enum = default_library> with template specialization. It is easy to understand for later users; at least that's what I think.
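For readers wondering what that could look like, here is a minimal sketch under my own assumptions: the Library enum, the default_library constant, the backend() member and the two stand-in structs are hypothetical names, not code from the actual project.

```cpp
#include <string>

// Hypothetical stand-ins for the two external libraries.
struct FooXternal1 { std::string name() const { return "FooXternal1"; } };
struct FooXternal2 { std::string name() const { return "FooXternal2"; } };

enum class Library { External1, External2 };

// Hypothetical build-time default; config.h could define this instead.
constexpr Library default_library = Library::External1;

// The primary template is declared but never defined, so asking for an
// unsupported enumerator fails at compile time instead of at run time.
template <Library L = default_library>
class Foo;

// Wrapper around the first external library.
template <>
class Foo<Library::External1> {
public:
    std::string backend() const { return impl_.name(); }
private:
    FooXternal1 impl_;
};

// Wrapper around the second external library.
template <>
class Foo<Library::External2> {
public:
    std::string backend() const { return impl_.name(); }
private:
    FooXternal2 impl_;
};
```

Code that writes Foo<> picks up whatever default_library names, while Foo<Library::External2> can still be requested explicitly, so both wrappers may coexist in one binary without virtual functions or heap allocation.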
If all method names of external implementations are similar, you can use templates. Let external implementations look like:
#include <iostream>
class FooX1
{
public:
void Met1()
{
std::cout << "X1\n";
}
};
class FooX2
{
public:
void Met1()
{
std::cout << "X2\n";
}
};
Then you can use several variants.
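Variant 1. You can declare a member of a template type and wrap all calls to the external implementation, even with some preparation before the call. Don't forget to delete impl in the ~FooVariant1 destructor.
template<typename FooX>
class FooVariant1
{
public:
FooVariant1()
{
impl=new FooX();
}
~FooVariant1()
{
delete impl; // release the wrapped implementation
}
// copying would delete impl twice, so forbid it
FooVariant1(const FooVariant1&) = delete;
FooVariant1& operator=(const FooVariant1&) = delete;
void Met1Wrapper()
{
impl->Met1();
}
private:
FooX *impl;
};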
Usage:
FooVariant1<FooX1> bar;
bar.Met1Wrapper();
Variant 2. You can inherit from a template parameter. In this case you don't declare any members, but just call implementation's methods by their names.
template<typename FooX>
class FooVariant2 : public FooX
{
};
Usage:
FooVariant2<FooX1> bar;
bar.Met1();
A disadvantage of using templates is that there is no easy way to change implementations at runtime. But in return you get more efficient code, because the types are generated at compile time and there is no table of virtual functions, whose indirection can make a program slower.
If you want the 2 implementations to coexist at runtime, an interface is the way to go (for example, you can use a factory method design pattern to instantiate the concrete object, like @n.m. has suggested).
If you can decide at compilation time what is the implementation that you need, you have several options:
Still use interface. This will allow an easy transition if in the future you'll need both implementations at runtime.
Use preprocessor directives. There is nothing wrong here as far as C++ is concerned. It's a pure design issue.
Put the implementations in different files and configure your compiler to compile either one of them according to settings - this is actually similar to using preprocessor directives but it's cleaner and doesn't add garbage to your code (since the flags are in the solution/makefile/whatever your compiler uses).
The only thing I'd frown upon is including both implementations in the same source file. That might get confusing. Otherwise, this is one of the things preprocessor flags are good at, especially if you're not linking both libraries at the same time. It's just like supporting multiple operating systems. Provide a consistent interface in all cases and hide the implementation details somewhere else.
Does type Foo need to hold any information specific to each library? If not, you might be able to get away with this:
#include "Foo.h"
#if defined _FOOXTERNAL1_WRAPPER_
#include "Foo_impl1.cpp"
#elif defined _FOOXTERNAL2_WRAPPER_
#include "Foo_impl2.cpp"
#else
#error "Warn about a missing define here"
#endif
This way you don't have to bother with virtual functions or inheritance and you still prevent any member functions from going unimplemented.
Keep Foo abstract. Provide a factory method
Foo* MakeFoo();
that allocates a new object of either type FooImpl1 or FooImpl2, and returns its address.
Wikipedia on Factory Method pattern.
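A minimal sketch of such a factory follows. Note two liberties I have taken: it returns std::unique_ptr<Foo> rather than the raw Foo* shown above, and it takes a hypothetical Backend parameter to pick the implementation; the name() member is likewise only there for illustration.

```cpp
#include <memory>
#include <string>

// Abstract interface: clients only ever see this type.
class Foo {
public:
    virtual ~Foo() = default;
    virtual std::string name() const = 0;
};

// Concrete implementations, normally hidden in a source file.
class FooImpl1 : public Foo {
public:
    std::string name() const override { return "FooImpl1"; }
};

class FooImpl2 : public Foo {
public:
    std::string name() const override { return "FooImpl2"; }
};

// Hypothetical selector; a build flag or config value could play this role.
enum class Backend { Impl1, Impl2 };

// Factory method: the only place that names the concrete types.
std::unique_ptr<Foo> MakeFoo(Backend backend) {
    if (backend == Backend::Impl1) {
        return std::make_unique<FooImpl1>();
    }
    return std::make_unique<FooImpl2>();
}
```

Because Foo stays abstract, callers can never end up with an object that lacks an implementation; they hold a Foo through the returned pointer and the concrete type is decided inside MakeFoo.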
Suppose I'm writing a static library that has a class Foo:
// mylib.h
#include <dependency_header_from_other_static_library.h>
class Foo {
// ...
private:
type_from_dependent_library x;
};
As you can see, this library (let's call it mylib) depends on another library. It compiles fine. But when a user compiles their own code (which uses Foo and includes mylib.h) and links against my library, the compilation fails, because the user also needs the dependency_header_from_other_static_library.h header file to compile their code.
I want to hide this dependency from the user. How can this be done? The one thing that comes to mind is the PIMPL idiom. Like:
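// mylib.h
#include <boost/shared_ptr.hpp>
class Foo {
// ...
private:
class FooImpl;
boost::shared_ptr<FooImpl> impl_;
};
// mylib_priv.h
#include "mylib.h"
// the dependency is now included only here, out of the user's sight
#include <dependency_header_from_other_static_library.h>
class Foo::FooImpl {
// ...
private:
type_from_dependent_library x;
};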
But it requires me to duplicate the interface of the class Foo in FooImpl. And, is it an overkill to use PIMPL in my case?
Thanks.
When decoupling a header from other headers, there are a few approaches you might be able to use:
If the used library makes a promise about how it declares its types, you may be able to forward declare the needed types in your header. Of course, this still means you can only refer to these types as pointers or in function signatures in the header but this may be good enough. For example, if the used library promises to have a class LibraryType that you need to use, you can do something like this:
// Foo.h
class LibraryType;
class Foo {
// ...
LibraryType* data;
};
This may cut you the necessary slack to use the type without including its header and without jumping through a PImpl approach.
If the library doesn't make a promise about how it declares its types, you may use void* to refer to the corresponding types. Of course, this means that whenever you access the data in your implementation, you'll need to cast the void* to the appropriate type. Since the type is statically known, using static_cast<LibraryType*> is perfectly fine, i.e., there isn't any overhead due to the cast, but it is still relatively painful to do.
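A small sketch of the void* variant, shown in one file for brevity with comments marking the header and source parts. LibraryType here is a stand-in struct I made up so the example compiles; value(), data_ and as_lib are likewise assumed names.

```cpp
// --- Foo.h: no library header and no forward declaration required ---
class Foo {
public:
    explicit Foo(int value);
    ~Foo();
    Foo(const Foo&) = delete;             // the handle is owned exclusively
    Foo& operator=(const Foo&) = delete;
    int value() const;
private:
    void* data_;  // really a LibraryType*, known only to the implementation
};

// --- Foo.cpp: the only place that sees the library type ---
// Hypothetical stand-in for the type the library header would provide.
struct LibraryType {
    int payload;
};

namespace {
// Central place for the cast, so the rest of the file stays readable.
LibraryType* as_lib(void* p) { return static_cast<LibraryType*>(p); }
}  // namespace

Foo::Foo(int value) : data_(new LibraryType{value}) {}

Foo::~Foo() { delete as_lib(data_); }  // cast back before deleting

int Foo::value() const { return as_lib(data_)->payload; }
```

Funnelling every access through one helper keeps the cast in a single spot, which takes away some of the pain mentioned above.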
The other alternative is, of course, to use the PImpl idiom. If your type provides any reasonable service, its interface will probably differ quite a bit from that of the private type, so there shouldn't be much replication of the interface between the class itself and the privately declared type. Also, note that the private type is just a data container, i.e., it is reasonable to just make it a struct and not protect access to its members. The only real issue is that you need to make sure that the type's definition is visible at the point where the destructor is called. Using std::shared_ptr<T>(new T(/*...*/)) arranges for this.
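Sketched in a single file (comments mark the header and source parts), the PImpl variant could look like this. The type_from_dependent_library struct is a made-up stand-in, and increment() and value() are assumed members for illustration only.

```cpp
#include <memory>

// --- mylib.h: everything the client needs to see ---
class Foo {
public:
    explicit Foo(int start);
    void increment();
    int value() const;
private:
    struct Impl;                  // defined only in the implementation file
    std::shared_ptr<Impl> impl_;  // the deleter is captured when impl_ is created
};

// --- mylib.cpp: the dependency is visible only here ---
// Hypothetical stand-in for the type from the other static library.
struct type_from_dependent_library {
    int counter;
};

// Plain data container, as suggested above: a struct with no access control.
struct Foo::Impl {
    type_from_dependent_library x;
};

Foo::Foo(int start) : impl_(new Impl{{start}}) {}

void Foo::increment() { ++impl_->x.counter; }

int Foo::value() const { return impl_->x.counter; }
```

Foo exposes operations while Impl only holds data, so nothing is duplicated between the two. One thing to be aware of with this sketch: because the handle is a shared_ptr, copies of a Foo share the same Impl.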
Effectively, all three approaches do the same thing although with slightly different techniques: they provide you an opaque handle to be used in the header file whose definition is only known to the implementation. This way, the client of the library doesn't need to include the corresponding header files. However, unless the symbols are resolved when building the library, it would still be necessary for the client to have access to the used library.