I'm considering a certain solution where I would like to initialize a cell of an array that is defined in another module (there will be many modules initializing one table). The array won't be read before main runs (so there is no problem with static initialization order).
My approach:
/* secondary module */
extern int i[10]; // the array
const struct Initialize {
Initialize() { i[0] = 12345; }
} init;
/* main module */
#include <stdio.h>
int i[10];
int main()
{
printf("%d\n", i[0]); // check if the value is initialized
}
The compiler won't strip out the init constant because its constructor has side effects. Am I right? Is the mechanism OK? On GCC (-O3) everything is fine.
//EDIT
In the real world there will be many modules. I want to avoid an extra module, a central place that would gather all the minor initialization routines (for better scalability). So it is important that each module triggers its own initialization.
This works with MSVC compilers but not with GNU C++ (at least for me). The GNU linker will strip all symbols not used outside your compilation unit. I know only one way to guarantee such initialization: the "init once" idiom. For example:
init_once.h:
template <typename T>
class InitOnce
{
T *instance;
static unsigned refs;
public:
InitOnce() {
if (!refs++) {
instance = new T();
}
}
~InitOnce() {
if (!--refs) {
delete instance;
}
}
};
template <typename T> unsigned InitOnce<T>::refs(0);
unit.h:
#include "init_once.h"
class Init : public InitOnce<Init>
{
public:
Init();
~Init();
};
static Init module_init_;
secondary.cpp:
#include "unit.h"
extern int i[10]; // the array
Init::Init()
{
i[0] = 12345;
}
...
I don't think you want the extern int i[10]; in your main module, though, adf88.
EDIT
/*secondary module (secondary.cpp) */
int i[10];
void func()
{
i[0]=1;
}
.
/*main module (main.cpp)*/
#include<iostream>
extern int i[];
void func();
int main()
{
func();
std::cout<<i[0]; //prints 1
}
Compile, link and create an executable using g++ secondary.cpp main.cpp -o myfile
In general, constructors are used (and should be used) only for initializing members of a class.
This might work, but it's dangerous. The construction order of globals/statics across translation units is unspecified, and so is module loading order (unless you're managing it explicitly). For example, you assume that when the Initialize() ctor in secondary.c runs, i is already present. You'd have to be very careful not to have two modules initialize the same common data, or have two modules carry out initializations with overlapping side effects.
I think a cleaner design to tackle such a need is to have the owner of the common data (your main module) expose it as a global singleton, with an interface to carry out whichever data initializations needed. You'd have a central place to control init-order, and maybe even control concurrent access (using critical sections or other concurrency primitives). Along the lines of your simplified example, that might be -
/* main module (main.c) */
#include <stdio.h>
class CommonDat
{
int i;
public:
const int GetI() { return i;}
void SetI(int newI) { i = newI; }
void incI()
{
AcquireSomeLock();
i++;
ReleaseTheLock();
}
};
CommonDat g_CommonDat;
CommonDat* getCommonDat() { return &g_CommonDat; }
int main(void)
{
printf("%d",getCommonDat()->GetI());
}
It's also preferable to have the secondary modules call these interfaces at controlled times in runtime (and not during the global c'tors pass).
(NOTE: you named the files as C files, but tagged the question as c++. The suggested code is c++, of course).
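As a rough sketch of the "owner exposes an interface, modules are initialized at a controlled time" idea above: the names SharedTable, shared_table(), init_secondary_a(), init_secondary_b() and init_all_modules() are made up for illustration and are not from the question. It assumes you accept one central function that lists the per-module hooks, and it leaves out locking.

```cpp
#include <cassert>
#include <cstddef>

// Owner of the shared table: nothing else touches the storage directly.
class SharedTable {
public:
    static const std::size_t kSize = 10;

    SharedTable() : cells_() {}  // value-initializes every cell to 0

    void set(std::size_t index, int value) {
        assert(index < kSize);   // catches out-of-range writes in debug builds
        cells_[index] = value;
    }

    int get(std::size_t index) const {
        assert(index < kSize);
        return cells_[index];
    }

private:
    int cells_[kSize];
};

// Function-local static: constructed on first use, whoever calls first.
SharedTable& shared_table() {
    static SharedTable table;
    return table;
}

// Hypothetical per-module hooks; each would live in its own .cpp file.
void init_secondary_a() { shared_table().set(0, 12345); }
void init_secondary_b() { shared_table().set(1, 678); }

// Called once at the top of main(), so the order is explicit and visible.
void init_all_modules() {
    init_secondary_a();
    init_secondary_b();
}
```

main() would call init_all_modules() first and then read values through shared_table().get(...). The price is exactly the central list the question wanted to avoid, so treat this as one side of the trade-off rather than the answer.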
May I ask why you use an array (running the risk of getting out of bounds) when you could use a std::vector ?
std::vector<int>& globalArray()
{
static std::vector<int> V;
return V;
}
bool const push_back(std::vector<int>& vec, int v)
{
vec.push_back(v);
return true; // dummy return for static init
}
This array is lazily initialized on the first call to the function.
You can use it like such:
// module1.cpp
static bool const dummy = push_back(globalArray(), 1);
// module2.cpp
static bool const dummy = push_back(globalArray(), 2);
It seems much easier and less error-prone. It's not multithread compliant until C++0x though.
Related
My program looks something like this:
map<string, function<void(const MyType&)>> callables;
int main(int argc, char *argv[]) {
string name = GetFromSomewhere();
auto iter = callables.find(name);
if (iter != callables.end()) {
MyType my_thing = GetSomeValue();
iter->second(my_thing);
}
}
In other words, I have a table of functions, and main is going to do something that produces a lookup key into that table, do the lookup, and if successful, call the function.
Now I could initialise the table in the translation unit where I define the map, but that means each new function that wants to be in that map has to modify the map's TU. That gets cumbersome.
Better to have a registration function:
void RegisterCallable(const string&, function<void(MyType)>);
and then any developer who wants to put something in the table just calls RegisterCallable():
// In foo.cc:
void NiftyCallable(const MyType& thing) { ... }
RegisterCallable("nifty", NiftyCallable);
Past experience with string vs char[] warns me that I'm asking for pain, but I've not been able to (re)find the specific C++ rule that tells me when those RegisterCallable() calls that we'll scatter about the code base will be called (in particular, if they're guaranteed to be called before main executes or if maybe the TU can load on demand later -- and so whether my memory of pain is correct or not for C++14).
Am I misremembering that this will cause pain?
Or is there a better way to do this other than asking for some TU to know about (currently 100 or so) functions that need registering?
Don't put the table in global scope; put it in function scope (it still has to be static to make sure that it lives for the length of the application). That way you can force the initialization order, which solves the problem of initialization order across compilation units.
static std::map<std::string, std::function<void(const MyType&)>>& getCallables() {
static std::map<std::string, std::function<void(const MyType&)>> callables;
// ^^^^^^ Static storage duration object.
// lives as long as the application.
return callables;
}
int main(int argc, char *argv[]) {
std::string name = GetFromSomewhere();
auto iter = getCallables().find(name);
if (iter != getCallables().end()) {
MyType my_thing = GetSomeValue();
iter->second(my_thing);
}
}
When calling from any scope to register a new function it calls getCallables() which forces initialization. So you avoid the initialization order issue.
void RegisterCallable(const std::string& name, std::function<void(MyType)> f)
{
getCallables()[name] = f;
}
Unfortunately, you can not have freestanding function calls directly in a compilation unit in C++ (unlike a lot of interpreted languages).
// So this will not work
RegisterCallable("nifty", NiftyCallable);
So the way to do this is to declare objects at global scope whose constructor registers the object.
struct DoRegisterCallable {
DoRegisterCallable(std::string const& name, std::function<void(MyType)> f) {
RegisterCallable(name, f);
}
};
Now in your compilation unit the person adding the function will do:
// In foo.cc:
void NiftyCallable(const MyType& thing) { ... }
DoRegisterCallable niftyCallableRegister("nifty", NiftyCallable);
In the comments above, IgorTandetnik suggests that niftyCallableRegister may not be included in the executable, as the compiler may optimize the variable out. This statement is not entirely true, but it has merit and is worth thinking about.
If the file foo.cc is compiled into a static library and that static library is then linked against the executable, there is a potential that it may not be included. But in normal situations most builds are done with dynamic libraries, not static libraries (as static libraries have so many other issues that people have mostly stopped using them), so this is a minor concern in normal operations (but is something to think about).
Additionally, it is implementation-defined whether file-scope, static storage duration variables are initialized before main or deferred. This is easily testable via some unit tests, as it is a property of the compiler and not undefined behavior (if your compiler is doing this, then you need to check the documentation to see if the behavior can be changed).
My speculation on this language in the standard is to allow delayed loading of shared libraries till after the application starts, but still guarantee that their behavior conforms to the standard. The way this is written, allows an application to dynamically load a shared library and initialize it (make sure file scope static storage duration objects are initialized) after the application main() has started. A corner case and easily tested via unit test.
I may decide that it is nice to wrap this in a class for easy usage:
#include <string>
#include <functional>
#include <map>
#include <iostream>
class MyType
{
};
using Callable = std::function<void(MyType)>;
using CallableMap = std::map<std::string, Callable>;
class Callables
{
static CallableMap& getCallables()
{
static CallableMap callables;
return callables;
}
public:
static void registerFunc(std::string name, std::function<void(MyType)>&& f)
{
getCallables()[std::move(name)] = std::move(f);
}
static void call(std::string const& name, std::function<MyType()>&& getter)
{
auto find = getCallables().find(name);
if (find != getCallables().end()) {
find->second(getter());
}
}
static void call(std::string const& name, MyType const& value)
{
call(name, [&value](){return value;});
}
};
struct RegisterCallables
{
RegisterCallables(std::string value, Callable&& f)
{
Callables::registerFunc(std::move(value), std::move(f));
}
};
void echo(MyType v)
{
std::cout << "Echo\n";
}
RegisterCallables echoRegister("echo", echo);
int main()
{
MyType d;
Callables::call("echo", d);
}
Is there any technique or compiler extension keyword to declare class member variables inside class member functions? Something like
struct test_t{
void operator ()(){
instance_local int i = 0;
}
};
The best that came in my mind was using thread_local and then executing the member function inside another thread, but this would be too ugly to be useful.
EDIT: example
Well, I'm really sorry for the following, probably confusing, example (it is related to my question from yesterday, Is there any problem in jumping into if(false) block?). I really tried to come up with a less confusing one...
#include <iostream>
#define instance_local thread_local
struct A{
A(int i) :
i(i)
{
}
void dosomethinguseful(){
std::cout << i << std::endl;
}
int i;
};
struct task1{
int part;
task1() : part(0){}
void operator ()(){
int result_of_calculation;
switch (part) {
case 0:{
//DO SOME CALCULATION
result_of_calculation = 5;
instance_local A a(result_of_calculation);
if(false)
case 1:{ a.dosomethinguseful();}
part++;
}
default:
break;
}
}
};
int main(){
task1 t;
t();
t();
return 0;
}
instance_local A a(result_of_calculation); is what I would get from such a keyword, instead of making a smart pointer for a.
You're describing a coroutine. Here is a rough draft of what it could look like (I'm not an expert in coroutines):
auto task1() -> some_awaitable_type {
result_of_calculation = 5;
A a(result_of_calculation);
co_yield;
a.dosomethinguseful();
}
This could be called like this:
some_awaitable_type maybe_do_something = task1();
// calculation done here
// dosomethinguseful called here
co_await maybe_do_something();
There is not. The compiler needs to know the structure of the class without compiling all the method implementations. If you could slip instance_local int foo into a method body, that would make the size of the data structure 4 bytes larger.
On a more principled level, it's not good to hide data. The equivalent feature for global variables that you might be thinking of, static local variables, is a carryover from C that is widely considered to be an anti-pattern:
Why are static variables considered evil?
Not directly, no.
You could define a:
static std::map<test_t*, int> is;
…where the first part of each element is a this pointer.
But, why?
Make a member variable.
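For what it's worth, here is a minimal sketch of the "make a member variable" route applied to the two-step task from the question. It assumes C++17 for std::optional, which lets the member stay empty until the first step has computed its constructor argument, so no smart pointer is needed. The names Task and value() and the -1 "nothing to report" return value are invented for the example.

```cpp
#include <cassert>
#include <optional>

struct A {
    explicit A(int i) : i(i) {}
    int value() const { return i; }
    int i;
};

struct Task {
    int part = 0;
    std::optional<A> a;  // per-instance storage, empty until step 0 fills it

    // Returns -1 while there is nothing to report, otherwise the stored value.
    int operator()() {
        switch (part++) {
        case 0:
            a.emplace(5);  // construct A in place once the input is known
            return -1;
        case 1:
            assert(a.has_value());
            return a->value();
        default:
            return -1;
        }
    }
};
```

Calling a Task three times yields -1, 5, -1, and every Task instance carries its own A, which is what the hypothetical instance_local keyword was after.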
Context :
In my application, I have some functions using global variables. Because the initialization order of global variables across translation units is unspecified, I want to forbid calls to these functions before the main function is running. For the moment, I only document it with an \attention in Doxygen, but I would like to add an assertion.
My question:
Is there an elegant way to know that the main function is not running yet?
Example (uniqid.cpp):
#include <boost/thread.hpp>
#include <cassert>
unsigned long int uid = 0;
boost::mutex uniqid_mutex;
unsigned long int uniquid()
{
assert(main_is_running() && "Forbidden call before main is running");
boost::mutex::scoped_lock lock(uniqid_mutex);
return ++uid;
}
My first (ugly) idea :
My first idea to do that is by checking another global variable with a specific value. Then the probability to have this value in the variable before initialisation is very small :
// File main_is_running.h
void launch_main();
bool main_is_running();
// File main_is_running.cpp
unsigned long int main_is_running_check_value = 0;
void launch_main()
{
main_is_running_check_value = 135798642;
}
bool main_is_running()
{
return (main_is_running_check_value == 135798642);
}
// File main.cpp
#include "main_is_running.h"
int main()
{
launch_main();
// ...
return 0;
}
Is there a better way to do that ?
Note that I can't use C++11 because I have to be compatible with gcc 4.1.2.
If static std::atomic<bool> s; is defined, along with a little toggling struct:
struct toggle
{
toggle(std::atomic<bool>& b) : m_b(b)
{
m_b = true;
}
~toggle()
{
m_b = false;
}
std::atomic<bool>& m_b;
};
Then, in main, write toggle t(s); as the first statement. This is one of those instances where having a reference as a member variable is a good idea.
s can then tell you if you're in main or not. Using std::atomic is probably overkill given that the behaviour of main calling itself is undefined in C++. If you don't have C++11, then volatile bool is adequate: effectively you're not in main until that extra statement has completed.
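Since the question rules out C++11, here is a hedged C++03-style sketch of the same toggle idea with a plain bool. MainGuard, g_main_running and next_id() are names invented for this example, and the mutex from the question is left out to keep it short.

```cpp
#include <cassert>

namespace {
// A bool with a constant initializer is set up before any constructor runs,
// so reading it during static initialization is safe.
bool g_main_running = false;
}

bool main_is_running() { return g_main_running; }

// RAII guard: construct it as the first statement of main().
class MainGuard {
public:
    MainGuard() {
        assert(!g_main_running);  // only one guard should ever be alive
        g_main_running = true;
    }
    ~MainGuard() { g_main_running = false; }

private:
    MainGuard(const MainGuard&);             // non-copyable, C++03 style
    MainGuard& operator=(const MainGuard&);
};

unsigned long next_id() {
    assert(main_is_running() && "Forbidden call before main is running");
    static unsigned long uid = 0;
    return ++uid;
}
```

Because the guard's destructor clears the flag, the same assertion also catches calls made from destructors of globals after main has returned.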
I know that in C++ I can declare variables outside any function, but I can't run any code/statement there, except for initializing global/static variables.
IDEA
Is it a good idea to use below tricky code in order to (for example) do some std::map manipulation ?
Here I use void *fakeVar and initialize it through Fake::initializer() and do whatever I want in it !
std::map<std::string, int> myMap;
class Fake
{
public:
static void* initializer()
{
myMap["test"]=222;
// Do whatever with your global Variables
return NULL;
}
};
// myMap["Error"] = 111; => Error
// Fake::initializer(); => Error
void *fakeVar = Fake::initializer(); //=> OK
int main()
{
std::cout<<"Map size: " << myMap.size() << std::endl; // Show myMap has initialized correctly :)
}
One way of solving it is to have a class with a constructor that does things, then declare a dummy variable of that class. Like
struct Initializer
{
Initializer()
{
// Do pre-main initialization here
}
};
Initializer initializer;
You can of course have multiple such classes doing miscellaneous initialization. The order in each translation unit is specified to be top-down, but the order between translation units is not specified.
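To illustrate the top-down rule within one translation unit, here is a small sketch. init_log() and Step are names made up for the example; the log lives in a function-local static so that it is guaranteed to exist before the first Step constructor uses it.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Function-local static: constructed on first call, so it is ready
// no matter which global constructor reaches it first.
std::vector<std::string>& init_log() {
    static std::vector<std::string> log;
    return log;
}

struct Step {
    explicit Step(const char* name) {
        assert(name != nullptr);
        init_log().push_back(name);
    }
};

// Same translation unit: constructed in order of definition.
Step first("first");
Step second("second");
```

If first and second were defined in two different .cpp files, the relative order of the two log entries would no longer be something you could rely on.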
You don't need a fake class... you can initialize using a lambda
auto myMap = []{
std::map<std::string, int> m;
m["test"] = 222;
return m;
}();
Or, if it's just plain data, initialize the map:
std::map<std::string, int> myMap { { "test", 222 } };
Is it a good idea to use below tricky code in order to (for example)
do some std::map manipulation ?
No.
Any solution entailing mutable non-local variables is a terrible idea.
Is it a good idea...?
Not really. What if someone decides that in their "tricky initialisation" they want to use your map, but on some system or other, or for no obvious reason after a particular relink, your map ends up being initialised after their attempted use? If you instead have them call a static function that returns a reference to the map, then it can initialise it on first call. Make the map a static local variable inside that function and you stop any accidental use without this protection.
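A minimal sketch of that suggestion, assuming C++11 (for the lambda and for thread-safe initialization of function-local statics). The names my_map(), lookup() and early_lookup are invented for the example.

```cpp
#include <cassert>
#include <map>
#include <string>

// The map is a function-local static, so it is built on first call,
// even if that call happens during another global's initialization.
std::map<std::string, int>& my_map() {
    static std::map<std::string, int> instance = [] {
        std::map<std::string, int> m;
        m["test"] = 222;
        return m;
    }();
    return instance;
}

int lookup(const std::string& key) {
    std::map<std::string, int>::const_iterator it = my_map().find(key);
    assert(it != my_map().end() && "unknown key");
    return it->second;
}

// Hypothetical "someone else's" global that needs the map before main():
// safe, because the accessor forces the map into existence first.
const int early_lookup = lookup("test");
```

There is no global variable left for anyone to touch before it is ready; the only way in is through my_map().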
§ 8.5.2 states
Except for objects declared with the constexpr specifier, for which
see 7.1.5, an initializer in the definition of a variable can consist
of arbitrary expressions involving literals and previously declared
variables and functions, regardless of the variable’s storage duration
therefore what you're doing is perfectly allowed by the C++ standard. That said, if you need to perform "initialization operations" it might be better to just use a class constructor (e.g. a wrapper).
What you've done is perfectly legal C++. So, if it works for you and is maintainable and understandable by anybody else who works with the code, it's fine. Joachim Pileborg's sample is clearer to me though.
One problem with initializing global variables like this can occur if they use each other during initialization. In that case it can be tricky to ensure that variables are initialized in the correct order. For that reason, I prefer to create InitializeX, InitializeY, etc. functions, and explicitly call them in the correct order from the main function.
Wrong ordering can also cause problems during program exit, where globals still try to use each other when some of them may have been destroyed. Again, some explicit destruction calls in the correct order before main returns can make it clearer.
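Here is roughly what I mean by explicit initialization and destruction calls, as a sketch only. The two "subsystems" (a settings map and a cache size derived from it) and all the function names are invented for illustration.

```cpp
#include <cassert>
#include <map>
#include <string>

// "X": plain pointer, so nothing is constructed before main().
std::map<std::string, int>* g_settings = 0;
// "Y": depends on X having been initialized.
int g_cache_size = 0;

void InitializeSettings() {
    assert(g_settings == 0);
    g_settings = new std::map<std::string, int>();
    (*g_settings)["cache_size"] = 64;
}

void InitializeCache() {
    assert(g_settings != 0 && "InitializeSettings() must run first");
    g_cache_size = (*g_settings)["cache_size"];
}

void ShutdownCache() { g_cache_size = 0; }

void ShutdownSettings() {
    delete g_settings;
    g_settings = 0;
}

// Called at the top of main(): the dependency order is written down once.
void InitializeAll() {
    InitializeSettings();
    InitializeCache();
}

// Called before main() returns: tear down in the reverse order.
void ShutdownAll() {
    ShutdownCache();
    ShutdownSettings();
}
```

The assertions make a wrong call order fail loudly in debug builds instead of silently reading a half-built global.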
So, go for it if it works for you, but be aware of the pitfalls. The same advice applies to pretty much every feature in C++!
You said in your question that you yourself think the code is 'tricky'. There is no need to overcomplicate things for the sake of it. So, if you have an alternative that appears less 'tricky' to you... that might be better.
When I hear "tricky code", I immediately think of code smells and maintenance nightmares. To answer your question, no, it isn't a good idea. While it is valid C++ code, it is bad practice. There are other, much more explicit and meaningful alternatives to this problem. To elaborate, the fact that your initializer() method returns void* NULL is meaningless as far as the intention of your program goes (i.e. each line of your code should have meaningful purpose), and you now have yet another unnecessary global variable fakeVar, which needlessly points to NULL.
Let's consider some less "tricky" alternatives:
If it's extremely important that you only ever have one global instance of myMap, perhaps using the Singleton Pattern would be more fitting, and you would be able to lazily initialize the contents of myMap when they are needed. Keep in mind that the Singleton Pattern has issues of its own.
Have a static method create and return the map or use a global namespace. For example, something along the lines of this:
// global.h
namespace Global
{
extern std::map<std::string, int> myMap;
};
// global.cpp
namespace Global
{
std::map<std::string, int> initMap()
{
std::map<std::string, int> map;
map["test"] = 222;
return map;
}
std::map<std::string, int> myMap = initMap();
};
// main.cpp
#include <iostream>
#include "global.h"
int main()
{
std::cout << Global::myMap.size() << std::endl;
return 0;
}
If this is a map with specialized functionality, create your own class (best option)! While this isn't a complete example, you get the idea:
class MyMap
{
private:
std::map<std::string, int> map;
public:
MyMap()
{
map["test"] = 222;
}
void put(std::string key, int value)
{
map[key] = value;
}
unsigned int size() const
{
return map.size();
}
// Overload operator[] and create any other methods you need
// ...
};
MyMap myMap;
int main()
{
std::cout << myMap.size() << std::endl;
return 0;
}
In C++, you cannot have statements outside any function. However, you can declare global objects, and the constructor (initializer) calls for these global objects happen automatically before main starts. In your example, fakeVar is a global pointer that gets initialized through a static member function of a class; this is absolutely fine.
Even a global object would do, provided that the global object's constructor does the desired initialization.
For example,
class Fake
{
public:
Fake() {
myMap["test"]=222;
// Do whatever with your global Variables
}
};
Fake fake;
This is a case where unity builds (single translation unit builds) can be very powerful. The __COUNTER__ macro is a de facto standard among C and C++ compilers, and with it you can write arbitrary imperative code at global scope:
// At the beginning of the file...
template <uint64_t N> void global_function() { global_function<N - 1>(); } // This default-case skips "gaps" in the specializations, in case __COUNTER__ is used for some other purpose.
template <> void global_function<__COUNTER__>() {} // This is the base case.
void run_global_functions();
#define global_n(N, ...) \
template <> void global_function<N>() { \
global_function<N - 1>(); /* Recurse and call the previous specialization */ \
__VA_ARGS__; /* Run the user code. */ \
}
#define global(...) global_n(__COUNTER__, __VA_ARGS__)
// ...
std::map<std::string, int> myMap;
global({
myMap["test"]=222;
// Do whatever with your global variables
})
global(myMap["Error"] = 111);
int main() {
run_global_functions();
std::cout << "Map size: " << myMap.size() << std::endl; // Show myMap has initialized correctly :)
}
global(std::cout << "This will be the last global code run before main!");
// ...At the end of the file
void run_global_functions() {
global_function<__COUNTER__ - 1>();
}
This is especially powerful once you realize that you can use it to initialize static variables without a dependency on the C runtime. This means you can generate very small executables without having to eschew non-zero global variables:
// At the beginning of the file...
extern bool has_static_init;
#define default_construct(x) x{}; global(if (!has_static_init) new (&x) decltype(x){})
// Or if you don't want placement new:
// #define default_construct(x) x{}; global(if (!has_static_init) x = decltype(x){})
class Complicated {
public:
int x = 42;
Complicated() { std::cout << "Constructor!"; }
};
Complicated default_construct(my_complicated_instance); // Will be zero-initialized if the CRT is not linked into the program.
int main() {
run_global_functions();
}
// ...At the end of the file
static bool get_static_init() {
volatile bool result = true; // This function can't be inlined, so the CRT *must* run it.
return result;
}
bool has_static_init = get_static_init(); // Will stay zero without CRT
This answer is similar to Some programmer dude's answer, but may be considered a bit cleaner. As of C++17 (that's when std::invoke() was added), you could do something like this:
#include <functional>
auto initializer = std::invoke([]() {
// Do initialization here...
// The following return statement is arbitrary. Without something like it,
// the auto will resolve to void, which will not compile:
return true;
});
Could someone please tell me if this is possible in C or C++?
void fun_a();
//int fun_b();
...
main(){
...
fun_a();
...
int fun_b(){
...
}
...
}
or something similar, as e.g. a class inside a function?
thanks for your replies,
Wow, I'm surprised nobody has said yes! Free functions cannot be nested, but functors and classes in general can.
void fun_a();
//int fun_b();
...
main(){
...
fun_a();
...
struct { int operator()() {
...
} } fun_b;
int q = fun_b();
...
}
You can give the functor a constructor and pass references to local variables to connect it to the local scope. Otherwise, it can access other local types and static variables. Local classes can't be arguments to templates before C++11, though.
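A small sketch of the "constructor plus references to local variables" approach. The function sum_to() and the local struct Adder are invented for the example.

```cpp
#include <cassert>

int sum_to(int n) {
    assert(n >= 0);
    int total = 0;

    // Local functor: the constructor ties it to the enclosing scope
    // by storing a reference to the local variable it needs.
    struct Adder {
        int& total;
        explicit Adder(int& t) : total(t) {}
        void operator()(int x) const { total += x; }
    } add(total);

    for (int i = 1; i <= n; ++i) {
        add(i);  // reads like a call to a nested function
    }
    return total;
}
```

The functor must not outlive the function, since the reference it holds points at a local variable.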
C++ does not support nested functions, however you can use something like boost::lambda.
C — Yes for gcc as an extension.
C++ — No.
you can't create a function inside another function in C++.
You can however create a local class functor:
int foo()
{
class bar
{
public:
int operator()()
{
return 42;
}
};
bar b;
return b();
}
in C++0x you can create a lambda expression:
int foo()
{
auto bar = []()->int{return 42;};
return bar();
}
No, but in C++0x you can use lambda functions (http://en.wikipedia.org/wiki/C%2B%2B0x#Lambda_functions_and_expressions), which may take another few years to be fully supported. The standard is not complete at the time of this writing.
-edit-
Yes
If you can use MSVC 2010. I ran the code below with success
void test()
{
[]() { cout << "Hello function\n"; }();
auto fn = [](int x) -> int { cout << "Hello function (" << x << " :))\n"; return x+1; };
auto v = fn(2);
fn(v);
}
output
Hello function
Hello function (2 :))
Hello function (3 :))
(I wrote >> c:\dev\loc\uniqueName.txt in the project working arguments section and copy pasted this result)
The term you're looking for is nested function. Neither standard C nor C++ allow nested functions, but GNU C allows it as an extension. Here is a good wikipedia article on the subject.
Clang/Apple are working on 'blocks', anonymous functions in C! :-D
^ ( void ) { printf("hello world\n"); }
info here and spec here, and ars technica has a bit on it
No, and there's at least one reason why it would complicate matters to allow it. Nested functions are typically expected to have access to the enclosing scope. This makes it so the "stack" can no longer be represented with a stack data structure. Instead a full tree is needed.
Consider the following code that does actually compile in gcc as KennyTM suggests.
#include <stdio.h>
typedef double (*retdouble)();
retdouble wrapper(double a) {
double square() { return a * a; }
return square;
}
int use_stack_frame(double b) {
return (int)b;
}
int main(int argc, char** argv) {
retdouble square = wrapper(3);
printf("expect 9 actual %f\n", square());
printf("expect 3 actual %d\n", use_stack_frame(3));
printf("expect 16 actual %f\n", wrapper(4)());
printf("expect 9 actual %f\n", square());
return 0;
}
I've placed what most people would expect to be printed, but in fact, this gets printed:
expect 9 actual 9.000000
expect 3 actual 3
expect 16 actual 16.000000
expect 9 actual 16.000000
Notice that the last line calls the "square" function, but the "a" value it accesses was modified during the wrapper(4) call. This is because a separate "stack" frame is not created for every invocation of "wrapper".
Note that these kinds of nested functions are actually quite common in other languages that support them like lisp and python (and even recent versions of Matlab). They lead to some very powerful functional programming capabilities, but they preclude the use of a stack for holding local scope frames.
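For comparison, here is a sketch of how the same experiment looks in C++11, where a lambda that captures by value keeps its own copy of a inside the closure object instead of pointing into a stack frame that is already gone. make_square() and demo() are names made up for the example.

```cpp
#include <cassert>
#include <functional>

// Returns a closure that owns a copy of `a`.
std::function<double()> make_square(double a) {
    return [a] { return a * a; };
}

// Mirrors the call sequence from the GNU C example above.
void demo() {
    std::function<double()> square = make_square(3);
    assert(make_square(4)() == 16.0);
    assert(square() == 9.0);  // unaffected by the later make_square(4) call
}
```

Capturing by reference ([&a]) would bring the dangling-frame problem straight back, so the capture mode is the part that matters here.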
void foo()
{
class local_to_foo
{
public: static void another_foo()
{ printf("whatevs"); }
};
local_to_foo::another_foo();
}
Or lambdas in C++0x.
You can nest a local class within a function, in which case the class will only be accessible to that function. You could then write your nested function as a member of the local class:
#include <iostream>
int f()
{
class G
{
public:
int operator()()
{
return 1;
}
} g;
return g();
}
int main()
{
std::cout << f() << std::endl;
}
Keep in mind, though, that in C++03 you can't pass a function object of a local class type to an STL algorithm such as sort() (C++11 lifted this restriction).
int f()
{
class G
{
public:
bool operator()(int i, int j)
{
return false;
}
} g;
std::vector<int> v;
std::sort(v.begin(), v.end(), g); // Fails to compile
}
The error that you would get from gcc is "test.cpp:18: error: no matching function for call to `sort(__gnu_cxx::__normal_iterator<int*, std::vector<int, std::allocator<int> > >, __gnu_cxx::__normal_iterator<int*, std::vector<int, std::allocator<int> > >, f()::G&)'"
It is not possible to define a function within a function. You may, however, define a function within a namespace or within a class in C++.
Not in standard C, but gcc and clang support them as an extension. See the gcc online manual.
Though C and C++ both prohibit nested functions, a few compilers support them anyway (e.g., if memory serves, gcc can, at least with the right flags). A nested functor is a lot more portable though.
No nested functions in C/C++, unfortunately.
As other answers have mentioned, standard C and C++ do not permit you to define nested functions. (Some compilers might allow it as an extension, but I can't say I've seen it used).
You can declare another function inside a function so that it can be called, but the definition of that function must exist outside the current function:
#include <stdlib.h>
#include <stdio.h>
int main( int argc, char* argv[])
{
int foo(int x);
/*
int bar(int x) { // this can't be done
return x;
}
*/
int a = 3;
printf( "%d\n", foo(a));
return 0;
}
int foo( int x)
{
return x+1;
}
A function declaration without an explicit storage-class specifier has external linkage. So while the declaration of the name foo in function main() is scoped to main(), it will link to the foo() function that is defined later in the file (or in another file if that's where foo() is defined).