Can I sandbox a namespace that uses static data? - c++

Say I have a (pretty large) C++ module in a namespace foo which has a lot (well, at least one) of static data, namespace-global data, Singletons and so forth, spread across a myriad of files and directories. Is there any way to "sandbox" that entire thing in order to run independent versions at the same time (in the same process, that is)? How many versions are to be run will be decided at runtime.
I thought about wrapping everything in several namespaces (e.g. bar1::foo, bar2::foo, ...), but that is a) not possible, since I don't want to touch all files and b) it would not enable me to have an arbitrary number at runtime.
Update: I was thinking perhaps I could create a separate thread for each version, but I'm not that well versed with threads.

Consider putting your foo code inside a shared object. During runtime you can load and unload that shared object as often as you desire.
For an initial reference on dynamic loading of shared object, take a peek at http://www.yolinux.com/TUTORIALS/LibraryArchives-StaticAndDynamic.html

Basically you have created a namespace with state, which is bad. You want a class for this use case, and you should be able to change the namespace into a class reasonably easily.
So where you have had
namespace foo{
int state;
int func();
}
foo::func();
you need
class foo{
public: // func() must be public to be callable from outside the class
int func();
private:
int state;
};
foo foo1;
foo1.func();
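Once the state lives in a class, the number of independent copies can be picked at runtime. The following is only a minimal sketch: the counter behaviour of `func`, the helper `make_sandboxes` and the function `usage_example` are made up for illustration and are not part of the original code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Former namespace-level state, now owned by each instance.
class foo {
public:
    // Hypothetical behaviour: bump this instance's state and return it.
    int func() { return ++state; }

private:
    int state = 0;
};

// Hypothetical helper: create as many independent "sandboxes" as requested.
std::vector<foo> make_sandboxes(std::size_t count) {
    return std::vector<foo>(count);
}

// Each sandbox keeps its own state; touching one does not affect the others.
void usage_example() {
    std::vector<foo> boxes = make_sandboxes(3);
    assert(boxes[0].func() == 1);
    assert(boxes[0].func() == 2);
    assert(boxes[1].func() == 1);
}
```

This only works if every piece of former static data ends up as a member, so it does require touching the files that declare that data.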

Related

C++ modules and circular class reference

To learn more about C++20 modules, I'm in the process of migrating a graphics application from header files to modules. At the moment I have a problem with a circular dependency between two classes. The two classes describe nodes and edges of a graph. The edge class has pointers to two nodes and the node class has a vector of pointers to adjacent edges. I know there are other ways to describe a graph, but this architecture seems very natural to me, I have very fast access to neighboring elements and it works seamlessly in the old world of header files and #include. The key is forward references.
But in the new world of C++20 modules, forward references no longer work.
The topic of circular references has been discussed in many places, but I haven't yet found a solution that really convinces me.
A common statement is that circular references are an architectural problem and should be avoided. If necessary, the two classes should be packed into one module. That would clearly be a step backwards. I try to make modules small and elementary.
I could replace the pointers to nodes or edges with pointers to a common base class NetworkObject that actually already exists. But that would destroy valuable information and force me to use static_cast to artificially add the type information back.
My question is: Am I missing anything? Is there an easier way?
There are a few misconceptions I can see here. Not entirely false, but not entirely true either.
But in the new world of C++20 modules, forward references no longer work.
This is not completely true. You cannot use a forward declaration to declare something that belongs to a different module, but you can certainly do that within the same module.
For example:
export module M;
export namespace n {
struct B;
struct A {
B* b;
};
struct B {
A* a;
};
}
Then you can split it up in multiple module partitions:
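// Partition interface unit :a (its own file)
export module M:a;
namespace n {
struct B;
export struct A {
B* b;
};
}
// Partition interface unit :b (its own file)
export module M:b;
namespace n {
struct A;
export struct B {
A* a;
};
}
// Primary module interface unit (its own file)
export module M;
export import :a;
export import :b;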
The gist of it is that types that depend on each other to be defined are coupled enough that they must reside in the same module.
Also, note that modules are not necessarily supposed to be as granular as headers. Dividing your modules too much could hurt compile-time performance. For example, a whole library could be just one big module. The standard library chose this approach and exports everything in the std modules, and it turns out to be faster than dividing the standard library into many smaller modules.
Smaller modules are not as good as many may think. Related things and classes should be packed in the same module, and if further splitting is needed for code organization within that module, partitions are an option.
The number of modules and their names are part of your API. This means that if your modules are too fine-grained, simply moving your code around will result in a breaking change. Module partitions are not part of your API and can be moved around freely.
A common statement is that circular references are an architectural problem and should be avoided. If necessary, the two classes should be packed into one module. That would clearly be a step backwards. I try to make modules small and elementary.
Those modules would not be small and elementary because of the cycle between them, i.e. you can't just use one module without also using the other. You will need to link against that other module if its implementation resides in another static library.
The two classes describe nodes and edges of a graph
Would there be a program that works with only the nodes module or only the edges module? Hardly. They should be part of the graph module. You could have :edge and :node partitions, but it would not make sense to use only one of those in a program or part of a program.
If this is for compile times, then with current compiler technology bigger modules have so far proven faster than many smaller modules.
The rationale for splitting modules into smaller modules is that there would be a use case for wanting to only import certain specific things. For example, std.freestanding would only contain the freestanding part of the standard library so programmers don't accidentally use parts they are not allowed to use.
Of course, another way to do that would be to drop all the module safeguards and use Global Module Fragments (GMF). That allows modules to interface with the implicit global module, with both the benefits and the consequences that come with global forward declarations. You will open the way for ODR violations to become possible again, and your entities won't be part of a named module anymore. It also allows a user to use your entities without importing the specific named module the declaration resides in, bypassing the API you expose to your users via your module names.
You can open Pandora's box using the extern "C++" directive:
export module A;
export namespace n {
extern "C++" {
struct B;
struct A {
B* b;
};
}
}
export module B;
export namespace n {
extern "C++" {
struct A;
struct B {
A* a;
};
}
}

Should pointers be used to reduce header dependencies?

When creating a class that is composed of other classes, is it worthwhile reducing dependencies (and hence compile times) by using pointers rather than values?
For example, the below uses values.
// ThingId.hpp
class ThingId
{
// ...
};
// Thing.hpp
#include "ThingId.hpp"
class Thing
{
public:
Thing(const ThingId& thingId);
private:
ThingId thingId_;
};
// Thing.cpp
#include "Thing.hpp"
Thing::Thing(const ThingId& thingId) :
thingId_(thingId) {}
However, the modified version below uses pointers.
// ThingId.hpp
class ThingId
{
// ...
};
// Thing.hpp
class ThingId;
class Thing
{
public:
Thing(const ThingId& thingId);
private:
ThingId* thingId_;
};
// Thing.cpp
#include "ThingId.hpp"
#include "Thing.hpp"
Thing::Thing(const ThingId& thingId) :
thingId_(new ThingId(thingId)) {}
I've read a post that recommends such an approach, but if you have a large number of pointers, there'll be a large number of new calls, which I imagine would be slow.
This is what most people call the Pimpl idiom (http://c2.com/cgi/wiki?PimplIdiom).
Simple Answer
I highly suspect that you do not have a good use case for this and should avoid it at all cost.
My Experience
The main way that Pimpl has ever been useful for me is to make an implementation detail private. It achieves this because you do not need to include the headers of your dependencies, but can simply forward declare their types.
Example
If you want to provide an SDK to someone which uses some boost library code under the hood, but you want the option of later swapping that out for some other library without causing any problem for the consumer of your SDK, then Pimpl can make a lot of sense.
It also helps create a facade over an implementation so that you have control over the entire exposed public interface, rather than exposing the library you implicitly depend on, and consequently its entire interface that you don't have control over and may change, may expose too much, may be hard to use, etc.
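As a rough illustration of that SDK scenario, here is a single-file sketch of Pimpl using std::unique_ptr (C++14 for std::make_unique). The class name `Sdk`, its `greet` method, the header/source split suggested in the comments and `usage_example` are all hypothetical; in a real project the two halves would live in separate files.

```cpp
#include <cassert>
#include <memory>
#include <string>

// --- What would live in the public header (hypothetical Sdk.hpp) ---
class Sdk {
public:
    Sdk();
    ~Sdk();                       // defined where Impl is a complete type
    Sdk(const Sdk&) = delete;
    Sdk& operator=(const Sdk&) = delete;

    std::string greet(const std::string& name) const;

private:
    struct Impl;                  // forward declaration only
    std::unique_ptr<Impl> impl_;  // the one dynamic allocation Pimpl costs
};

// --- What would live in the source file (hypothetical Sdk.cpp) ---
// Third-party headers would be included here, invisible to SDK users.
struct Sdk::Impl {
    std::string prefix = "Hello, ";
};

Sdk::Sdk() : impl_(std::make_unique<Impl>()) {}
Sdk::~Sdk() = default;

std::string Sdk::greet(const std::string& name) const {
    return impl_->prefix + name;
}

// Callers only ever see the public interface.
void usage_example() {
    Sdk sdk;
    assert(sdk.greet("world") == "Hello, world");
}
```

The destructor is declared in the class and defined after `Impl`, because std::unique_ptr needs the complete type at the point where it deletes the object.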
If your program doesn't warrant dynamic allocation, don't introduce it just for the sake of project organisation. That would definitely be a false economy.
What you really want to do is attempt to reduce the number of inter-class dependencies entirely.
However, as long as your coupling is sensible and tree-like, don't worry too much about it. If you're using precompiled headers (and you are, right?) then none of this really matters for compilation times.

Best practice for handling const class data

Say you have a certain class in which each instance of it needs to access the exact same set of data. It is more efficient to declare the data outside the class rather than have each instance make its own one, but doesn't this breach the 'no globals' rule?
Example code:
Foo.h
class Foo{
Foo();
void someFunction();
};
Foo.cpp
#include "Foo.h"
//surely ok since it's only global to the class's .cpp?
const int g_data[length] = {
//contains some data
};
void Foo::someFunction(){
//do something involving g_data ..
}
..rather than making 'g_data' a member variable. Or is there some other way which avoids creating a global?
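Use the static modifier, which makes a class member shared among all instances of the class. An example:
class A {
public:
static const int length = 10; // array bounds must be compile-time constants
static int g_data[length]; // declared inside the class
};
// A static data member also needs one definition in a .cpp file
int A::g_data[A::length] = { /* some data */ };
And then you can access it like this:
#include <iostream>
int main(void) {
std::cout << "For example: " << A::g_data[2] << std::endl;
}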
That's what a static member is for.
Thus you will have in your declaration :
class Foo{
Foo();
void someFunction();
static int const sharedData[length];
};
And somewhere in your cpp file :
int const Foo::sharedData[length] = { /* Your data */ };
Summarily, yes - your "global" is probably ok (though it'd be better in an anonymous namespace, and there are a few considerations).
Details: lots of answers recommending a static class member, and that is often a good way to go, but do keep in mind that:
a static class member must be listed in the class definition, so in your case it'll be in Foo.h, and any client code that includes that header is likely to want to recompile if you edit the static member in any way even if it's private and of no possible direct relevance to them (this is most important for classes in low-level libraries with diverse client code - enterprise libraries and those shared across the internet and used by large client apps)
with a static class member, code in the header has the option of using it (in which case a recompile will be necessary and appropriate if the static member changes)
if you put the data in the .cpp file, it's best in an anonymous namespace so it doesn't have external linkage (no other objects see it or can link to it), though you have no way to encapsulate it to protect it from access by other functions later in the .cpp's translation unit (whereas a non-public static class member has such protection)
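The last option can be sketched as follows. This is a single-file illustration under assumptions: the header/source split in the comments, the sample values, the summing body of `someFunction` and `usage_example` are invented for the example.

```cpp
#include <cassert>
#include <cstddef>

// --- Hypothetical Foo.h: no mention of the shared data ---
class Foo {
public:
    int someFunction() const;
};

// --- Hypothetical Foo.cpp ---
namespace {
// Internal linkage: other translation units cannot see or link to this.
const int g_data[] = {2, 3, 5, 7};
const std::size_t g_length = sizeof(g_data) / sizeof(g_data[0]);
} // unnamed namespace

// Sums the shared table; every Foo instance reads the same array.
int Foo::someFunction() const {
    int sum = 0;
    for (std::size_t i = 0; i < g_length; ++i) {
        sum += g_data[i];
    }
    return sum;
}

// Two instances, one copy of the data.
void usage_example() {
    Foo a;
    Foo b;
    assert(a.someFunction() == b.someFunction());
}
```

Editing `g_data` here only recompiles the one source file, since nothing about it appears in the header.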
What you really want is data belonging to the type, not to an instance of that type. Fortunately, in C++ there is an instrument doing exactly that — static class members.
If you want to avoid the global and have a more object-oriented solution, take a look at the Flyweight pattern. The Boost Flyweight library has some helper infrastructure and provides a decent explanation of the concept.
Since you are talking about efficiency, it may or may not be a good idea to use such an approach, depending on your actual goal. Flyweights are more about saving memory, and they introduce an additional layer of indirection which may impact runtime performance. The externally stored state may hinder compiler optimizations, especially inlining, and reduce data locality, which prevents efficient caching. On the other hand, some operations like assignment or copying can be considerably faster because only the pointer to the shared state needs to be copied (plus the non-shared state, but this has to be copied anyway). As always when it comes to efficiency / performance, make sure you measure your code to compare what suits your requirements.
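To make the trade-off concrete, here is a hand-rolled sketch of the idea. It deliberately does not use the Boost Flyweight API; `SharedData`, `Item` and `usage_example` are hypothetical names, and std::shared_ptr stands in for whatever sharing mechanism a real implementation would choose.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Shared state: stored once, referenced by many objects.
struct SharedData {
    std::vector<int> table;
};

// Each Item holds a pointer to the shared state plus its own small state.
class Item {
public:
    Item(std::shared_ptr<const SharedData> shared, std::size_t index)
        : shared_(std::move(shared)), index_(index) {}

    // Every access goes through the pointer: the extra indirection.
    int value() const { return shared_->table[index_]; }

    bool sharesDataWith(const Item& other) const {
        return shared_ == other.shared_;
    }

private:
    std::shared_ptr<const SharedData> shared_;
    std::size_t index_;
};

// Copying an Item copies the pointer and the index, never the table.
void usage_example() {
    auto data = std::make_shared<const SharedData>(SharedData{{10, 20, 30}});
    Item first(data, 0);
    Item copy = first;
    assert(copy.value() == 10);
    assert(copy.sharesDataWith(first));
}
```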

What is the best strategy for sharing variables between source files in c/c++?

I frequently have to write c/c++ programs with 10+ source files where a handful of variables need to be shared between functions in all the files. I have read before that it is generally good practice to avoid using global variables with extern. However, if it is completely necessary to use global variables, this link provides a good strategy. Lately, I have been toying with the strategy of wrapping up all my variables in a struct or a class and passing this struct around to different functions. I was wondering which way people consider to be cleaner and if there are any better alternatives.
EDIT: I realize strategies may be different in the two languages. I am interested in strategies that apply to only one language or both.
Pass around a class/struct of "context" data instead of global variables. You will be surprised how often a global variable becomes no longer global, with different modules wanting to use different values for it at the same time.
The better alternative to globals is to not use globals.
Don't try to sweep them under the rug using a struct or a namespace or a singleton or some other silly thing whose only purpose is to hide the fact that you're using globals.
Just don't ever create one. Ever.
It will force you to think of ownership and lifetime and dependencies and responsibility. You know, grown-up things.
And then, when you're comfortable writing global-free code, you can start violating all those rules.
Because that's what rules are for: to be followed, and to be broken.
Does it REALLY need to be global?
This is the first question you should always ask: is this variable used GLOBALLY, i.e. in all contexts? The answer is almost certainly... no, it's not.
Consider Context
Is the variable global state, or is it context? Global state is usually rare; context, on the other hand, is quite common. If it's global state, consider wrapping it in a singleton so you can manage how code interacts with your globals. Using std::atomic<> is probably not a bad idea; you should at least consider synchronization.
If it is context, then it should be passed explicitly in a structure or class, as the data is relevant to that context and no other. Passing context explicitly may seem like a burden, but it makes it very clear where the context is coming from rather than just referencing random variables out of the ether.
What is the Scope?
It may seem odd to say that globals are scoped, but any global declared in a single file may be declared static and thus unlinkable from any other file. This means you can restrict who has access to the global state in a given scope. This allows you to prevent people from randomly tweaking variables.
My take in C++
I found it a good practice in C++ anyway to limit the scope of my global variables with a namespace. That way you can eliminate any ambiguity between your 10+ source files.
For example:
namespace ObjectGlobalVars {
//Put all of your global variables here
int myvariable = 0;
}
//And then later on you can reference them like
ObjectGlobalVars::myvariable++;
In c++
Having global variables lying around here and there, is an example of bad code.
If you want to share things on a global scale, then group them up and follow the singleton pattern.
Example:
class Singleton
{
private:
int mData;
public:
static Singleton& getInstance()
{
static Singleton instance;
return instance;
}
int GetData()
{
return mData;
}
private:
Singleton() : mData(0) {} // private: only getInstance() can create the instance
Singleton(Singleton const&);
void operator=(Singleton const&);
};
Advantages:
Only 1 global variable. The instance of our singleton.
You can include mutex / semaphore mechanisms inside the singleton, for thread-safe access to its members.
Restricts access to its members, helping you avoid logical and synchronization flaws.
Disadvantages:
Harder to implement, if it's your first time.
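The point about putting a mutex inside the singleton might look like the sketch below. It assumes C++11 or later; the class name `Counter`, its members and `usage_example` are hypothetical and only illustrate the locking pattern.

```cpp
#include <cassert>
#include <mutex>

class Counter {
public:
    static Counter& getInstance() {
        static Counter instance;  // initialized once, thread-safe since C++11
        return instance;
    }

    // Every access to value_ takes the lock first.
    int increment() {
        std::lock_guard<std::mutex> lock(mutex_);
        return ++value_;
    }

    int get() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return value_;
    }

    Counter(const Counter&) = delete;
    Counter& operator=(const Counter&) = delete;

private:
    Counter() = default;

    mutable std::mutex mutex_;  // mutable so const readers can lock it
    int value_ = 0;
};

// All callers share the same instance and the same lock.
void usage_example() {
    Counter& counter = Counter::getInstance();
    const int before = counter.get();
    counter.increment();
    assert(counter.get() == before + 1);
}
```

The lock protects individual calls only; a sequence such as get() followed by increment() is still not atomic as a whole.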
In c
You should avoid declaring global variables, pass them in structs instead.
For instance:
struct MyData
{
int a;
int b;
};
void bar(struct MyData* data)
{
data->b = 2;
}
void foo()
{
struct MyData mdata;
mdata.a = 1;
bar( &mdata );
}
To sum things up
Having global variables lying around should be avoided as much as possible, in both languages.

Namespaces vs. Static Classes

For a project I'm working on, I have a bunch of "library classes". These are essentially collections of related functions of values. Some of these libraries need to be "initialized" at run-time. So far, I've been utilizing the design below as a solution:
// Filename: Foo.h
namespace my_project
{
namespace library
{
class Foo
{
public:
static int some_value; // members used externally and internally
Foo()
{
// Lots of stuff goes on in here
// Therefore it's not a simple member initialization
// But for this example, this should suffice
some_value = 10;
Foo::bar();
}
static void bar() { ++some_value; } // some library function
// no destructor needed because we didn't allocate anything
private:
// restrict copy/assignment
Foo(const Foo&);
void operator=(const Foo&);
};
int Foo::some_value = 0; // since some_value is static, we need this
} // library namespace
static library::Foo Foo;
} // my_project namespace
Using Foo would be similar to this, as an example:
#include "Foo.h"
using namespace my_project;
int main()
{
int i = Foo.some_value;
Foo.bar();
int j = Foo.some_value;
return 0;
}
Of course, this example is very simplified, but it gets the point across. This method has four advantages to me:
User of the library doesn't need to worry about initialization. They wouldn't need to call something like Foo::init(); inside their main(), because library::Foo was initialized when my_project::Foo was constructed. This is the main design constraint here. User should not be responsible for initializing the library.
I can create various private functions inside the library to control its use.
The user can create other instances of this library if they choose, for whatever reason. But no copying would be allowed. One instance would be provided for them by default. This is a requirement.
I can use the . syntax instead of ::. But that's a personal style thing.
Now, the question is, are there any disadvantages to this solution? I feel like I'm doing something that C++ wasn't meant to do because Visual Studio's IntelliSense keeps freaking out on me and thinks my_project::Foo isn't declared. Could it be because both the object and the class are called Foo even though they're in different namespaces?
The solution compiles fine. I'm just worried that once my project grows larger in scale, I might start having name ambiguities. Furthermore, am I wasting extra memory by creating an object of this library?
Should I simply stick to the singleton design pattern as an alternative solution? Are there any alternative solutions?
UPDATE:
After reviewing the solutions provided, and jumping around google for various solutions, I stumbled upon extern. I have to say I'm a bit fuzzy on what this keyword really does; I've been fuzzy about it ever since I learned C++. But after tweaking my code, I changed it to this:
// Foo.h
namespace my_project
{
namespace library
{
class Foo_lib
{
public:
int some_value;
Foo_lib() { /* initialize library */ }
void bar() { /* do stuff */ }
private:
// restrict copy/assignment
Foo_lib(const Foo_lib&);
void operator=(const Foo_lib&);
};
} // library namespace
extern library::Foo_lib Foo;
} // my_project namespace
// Foo.cpp
#include "Foo.h"
namespace my_project
{
namespace library
{
// Foo_lib definitions
} // library namespace
library::Foo_lib Foo;
} // my_project namespace
// main.cpp
#include "Foo.h"
using namespace my_project;
int main()
{
int i = Foo.some_value;
Foo.bar();
int j = Foo.some_value;
return 0;
}
This seems to have the exact same effect as before. But as I said, since I'm still fuzzy on extern usage, would this also have the exact same bad side-effects?
This line is particularly bad:
static library::Foo Foo;
It emits a separate static copy of Foo in every translation unit that includes Foo.h. Don't use it :) The Foo constructor would run once for each of those translation units, resetting Foo::some_value every time, and it's not thread safe (which will frustrate your users).
This line will result in multiple definitions when linking:
int Foo::some_value = 0;
Singletons are also bad. Searching here #SO will produce a lot of reasons to avoid them.
Just create normal objects, and document to your users why they should share objects when using your library, and in which scenarios.
User of the library doesn't need to worry about initialization. They wouldn't need to call something like Foo::init(); inside their main(), because library::Foo was initialized when my_project::Foo was constructed. This is the main design constraint here. User should not be responsible for initializing the library.
Objects should be able to construct themselves as needed without introducing unstrippable binary baggage.
I can create various private functions inside the library to control its use.
That's not unique to your approach.
The user can create other instances of this library if they choose, for whatever reason. But no copying would be allowed. One instance would be provided for them by default. This is a requirement.
Then you can force your users to pass Foo as a necessary argument to create the types they depend upon (where Foo is needed).
I can use the . syntax instead of ::. But that's a personal style thing.
Not good. Not threadsafe, and the user can then seriously mess up your library's state. Private data is best.
There are two things going on here:
What if the user would dearly like to parallelize her code ?
What if the user would like to start using your library during the static initialization phase ?
So, one at a time.
1. What if the user would dearly like to parallelize her code ?
In the age of multi-core processors libraries should strive for re-entrancy. Global State is bad, and unsynchronized Global State is even worse.
I would simply recommend for you to make Foo contain regular attributes instead of static ones, it is then up to the user to decide how many instances in parallel should be used, and perhaps settle on one.
If passing a Foo to all your methods would prove awkward, have a look at the Facade pattern. The idea here would be to create a Facade class that is initialized with a Foo and provides entry points to your library.
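A minimal sketch of that combination, with per-instance state in Foo and a facade bound to one Foo. The `Library` class, its `bump` method and `usage_example` are hypothetical names; only `some_value` and `bar` come from the question.

```cpp
#include <cassert>

// Library state as ordinary per-instance members (no statics).
class Foo {
public:
    Foo() : some_value_(10) {}
    void bar() { ++some_value_; }
    int some_value() const { return some_value_; }

private:
    int some_value_;
};

// Hypothetical facade: bound to one Foo, so callers need not pass it around.
// The Foo must outlive the facade that refers to it.
class Library {
public:
    explicit Library(Foo& foo) : foo_(foo) {}

    // Example entry point: runs bar() and reports the new value.
    int bump() {
        foo_.bar();
        return foo_.some_value();
    }

private:
    Foo& foo_;
};

// Two Foo instances stay independent, e.g. one per thread.
void usage_example() {
    Foo first;
    Foo second;
    Library a(first);
    Library b(second);
    assert(a.bump() == 11);
    assert(a.bump() == 12);
    assert(b.bump() == 11);
}
```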
2. What if the user would like to start using your library during the static initialization phase ?
The static initialization order fiasco is just horrid, and the static destruction order fiasco (its sibling) is no better, and even harder to track down (because the memory is not 0-initialized there, so it's hard to see what's going on).
Since once again it's hard (impossible ?) for you to predict the usage of your library and since any attempt to use it during static initialization or destruction is nigh impossible with a singleton that you would create, the simpler thing to do is to delegate at least initialization to the user.
If the user is unlikely to be willing to use this library at start-up and shut-down, then you may provide a simple safeguard to automatically initialize the library on first use if she didn't already.
This can be accomplished easily, and in a thread-safe manner (*), using local static variables:
class Foo {
public:
static Foo& Init() { static Foo foo; return foo; }
static int GetValue() { return Init()._value; }
private:
Foo(): _value(1) {}
Foo(Foo const&) = delete;
Foo& operator=(Foo const&) = delete;
int _value;
}; // class Foo
Note that all this glue is completely useless if you simply decide not to use a Singleton and go for the first solution: a regular object, with per-instance state only.
(*) Thread safety is guaranteed in C++11. In C++03 (the version used primarily in the industry) the best compilers guarantee it as well, check the documentation if required.
Now, the question is, are there any disadvantages to this solution?
Yes. See for instance, this entry in the c++ faq on the static initialization order fiasco. http://www.parashift.com/c++-faq-lite/ctors.html#faq-10.14 tldr? Essentially, you have no control over what order static objects (such as Foo above) get initialized in, any assumptions about the order (eg. initializing one static object with values from another) will result in Undefined Behaviour.
Consider this code in my app.
#include "my_project/library/Foo.h"
static int whoKnowsWhatValueThisWillHave = my_project::library::Foo::some_value;
int main()
{
return whoKnowsWhatValueThisWillHave;
}
There are no guarantees on what I am returning from main() here.
The user can create other instances of this library if they choose, for whatever reason. But no copying would be allowed. One instance would be provided for them by default. This is a requirement.
Not really, no... Since all of your data is static, any new instances are essentially empty shells pointing to the same data. Basically, you have a copy.
I feel like I'm doing something that C++ wasn't meant to do because Visual Studio's IntelliSense keeps freaking out on me and thinks my_project::Foo isn't declared. Could it be because both the object and the class are called Foo even though they're in different namespaces?
You are! Suppose I add this line to my code:
using namespace ::my_project::library;
what does 'Foo' resolve to now? Maybe this is defined in the standard, but at the very least, it is confusing.
I can use the . syntax instead of ::. But that's a personal style thing.
Don't fight the language. If you want to code in Python or Java syntax, use Python or Java (or Ruby or whatever)...
Should I simply stick to the singleton design pattern as an alternative solution? Are there any alternative solutions?
Yes, the Singleton is a good one, but you should also consider whether you actually need a singleton here. Since your example is only syntactic, it is hard to say, but maybe it would be better to use dependency injection or something similar to minimize/eliminate tight couplings between classes.
Hopefully I haven't hurt your feelings :) It's good to ask questions, but obviously you already know that!