Alternative to thread_local for calling legacy C code on OSX - c++

I have a mesh processing algorithm that calls vtkPointsProjectedHull multithreaded via high-level map-reduce (Qt's version).
If you look at the source code of vtkPointsProjectedHull you can see that it calls a free standing C-function and for that it uses a static global variable at line 27:
static double firstPt[3];
(You might be able to imagine how long it took to find this bug after I made the code multi-threaded...).
The layout of the class and free standing C-functions make it hard to move the static definition into a class variable. (I'm sure it is doable, but not straightforward).
The solution in VisualC++ is quite easy, I made a vtkPointsProjectedHullFixed.cxx with the single change being that the static variable is thread_local:
__declspec(thread) static double firstPt[3];
Now I am porting this code to OSX Clang. And thread local storage is explicitly disabled there.
Do I have to rewrite the whole vtkPointsProjectedHullFixed class to use a class variable? Or do you know a better way?

A guess as I cannot confirm at present, but you might find:
_Thread_local static double firstPt[3];
will work. Apple Clang does support C11 thread local, and double is a C type; what it doesn't support is C++ thread local which supports C++ types with constructors/destructors.
HTH
Edit: Confirmed _Thread_local works as expected with C code. Should work with C++ code, but only if the variable type is a C one (as double is).

Related

Is there a portable way to tell if main() has started yet?

In C++ it's possible to call arbitrary functions at init time before main() has been entered, including calls to libraries that are not fully initialized yet, which can cause confusing errors. If I'm writing a library, is it possible in standard modern C++ (C++20 or so) to tell whether main() has started yet, so I can prevent the user of the library from using it before it's safe?
Considered solutions:
Having a library::init() function called at the beginning of main(). The library already works fine without an init(), so it seems silly to add one just to improve error reporting. If nothing else works, obviously this is the best solution.
Using a static initializer to determine when it's safe to use the library. It cannot be predicted what order static initializers run in, so this is not reliable.
Using a function-level static variable to initialize the library (essentially lazy initialization) so that initialization order doesn't matter. I already do this for some things, but this can't extend the protection to other libraries or system resources that are not usable before main().
Walking up the stack manually to find a frame for main(). I think not. : )
Answering my own question: from what I gather based on people's comments and looking at
Is there any way in C/C++ to detect if code is running during static initialization?
How can I call a function or statically initialize an object immediately before main?
it appears that there is no way to get this information in standard C++, and the standard solution is just to provide a library::init() function for the user to call in main().

How to dynamically register class in a factory class at runtime period with c++

Now, I implemented a factory class to dynamically create class with a idenification string, please see the following code:
void IOFactory::registerIO()
{
Register("NDAM9020", []() -> IOBase * {
return new NDAM9020();
});
Register("BK5120", []() -> IOBase * {
return new BK5120();
});
}
std::unique_ptr<IOBase> IOFactory::createIO(std::string ioDeviceName)
{
std::unique_ptr<IOBase> io = createObject(ioDeviceName);
return io;
}
So we can create the IO class with the registered name:
IOFactory ioFactory;
auto io = ioFactory.createIO("BK5120");
The problem with this method is if we add another IO component, we should add another register code in registerIO function and compile the whole project again. So I was wondering if I could dynamically register class from a configure file(see below) at runtime.
io_factory.conf
------------------
NDAM9020:NDAM9020
BK5120:BK5120
------------------
The first is identification name and the second is class name.
I have tried with Macros, but the parameter in Macros cann't be string. So I was wondering if there is some other ways. Thanks for advance.
Update:
I didn't expect so many comments and answers, Thank you all and sorry for replying late.
Our current OS is Ubuntu16.04 and we use the builtin compiler that is gcc/g++5.4.0, and we use CMake to manage the build.
And I should mention that it is not a must that I should register class at runtime period, it is also OK if there is a way to do this in compile period. What I want is just avoiding the recompiling when I want to register another class.
So I was wondering if I could dynamically register class from a configure file(see below) at runtime.
No. As of C++20, C++ has no reflection features allowing it. But you could do it at compile time by generating a simple C++ implementation file from your configuration file.
How to dynamically register class in a factory class at runtime period with c++
Read much more about C++, at least a good C++ programming book and see a good C++ reference website, and later n3337, the C++11 standard. Read also the documentation of your C++ compiler (perhaps GCC or Clang), and, if you have one, of your operating system. If plugins are possible in your OS, you can register a factory function at runtime (by referring to to that function after a plugin providing it has been loaded). For examples, the Mozilla firefox browser or recent GCC compilers (e.g. GCC 10 with plugins enabled), or the fish shell, are doing this.
So I was wondering if I could dynamically register class from a configure file(see below) at runtime.
Most C++ programs are running under an operating system, such as Linux. Some operating systems provide a plugin mechanism. For Linux, see dlopen(3), dlsym(3), dlclose(3), dladdr(3) and the C++ dlopen mini-howto. For Windows, dive into its documentation.
So, with a recent C++ implementation and some recent operating systems, y ou can register at runtime a factory class (using plugins), and you could find libraries (e.g. Qt or POCO) to help you.
However, in pure standard C++, the set of translation units is statically known and plugins do not exist. So the set of functions, lambda-expressions, or classes in a given program is finite and does not change with time.
In pure C++, the set of valid function pointers, or the set of valid possible values for a given std::function variable, is finite. Anything else is undefined behavior. In practice, many real-life C++ programs accept plugins thru their operating systems or JIT-compiling libraries.
You could of course consider using JIT-compiling libraries such as asmjit or libgccjit or LLVM. They are implementation specific, so your code won't be portable.
On Linux, a lot of Qt or GTKmm applications (e.g. KDE, and most web browsers, e.g. Konqueror, Chrome, or Firefox) are coded in C++ and do load plugins with factory functions. Check with strace(1) and ltrace(1).
The Trident web browser of MicroSoft is rumored to be coded in C++ and probably accepts plugins.
I have tried with Macros, but the parameter in Macros can't be string.
A macro parameter can be stringized. And you could play x-macros tricks.
What I want is just avoiding the recompiling when I want to register another class.
On Ubuntu, I recommend accepting plugins in your program or library
Use dlopen(3) with an absolute file path; the plugin would typically be passed as a program option (like RefPerSys does, or like GCC does) and dlopen-ed at program or library initialization time. Practically speaking, you can have lots of plugins (dozen of thousands, see manydl.c and check with pmap(1) or proc(5)). The dlsym(3)-ed C++ functions in your plugins should be declared extern "C" to disable name mangling.
A single C++ file plugin (in yourplugin.cc) can be compiled with g++ -Wall -O -g -fPIC -shared yourplugin.cc -o yourplugin.so and later you would dlopen "./yourplugin.so" or an absolute path (or configure suitably your $LD_LIBRARY_PATH -see ld.so(8)- and pass "yourplugin.so" to dlopen). Be also aware of Rpath.
Consider also (after upgrading your GCC to GCC 9 at least, perhaps by compiling it from its source code) using libgccjit (it is faster than generating temporary C++ code in some file and compiling that file into a temporary plugin).
For ease of debugging your loaded plugins, you might be interested by Ian Taylor's libbacktrace.
Notice that your program's global symbols (declared as extern "C") can be accessed by name by passing a nullptr file path to dlopen(3), then using dlsym(3) on the obtained handle. You want to pass -rdynamic -ldl when linking your program (or your shared library).
What I want is just avoiding the recompiling when I want to register another class.
You might registering classes in a different translation unit (a short one, presumably). You could take inspiration from RefPerSys multiple #include-s of its generated/rps-name.hh file. Then you would simply recompile a single *.cc file and relink your entire program or library. Notice that Qt plays similar tricks in its moc, and I recommend taking inspiration from it.
Read also J.Pitrat's book on Artificial Beings: the Conscience of a Conscious Machine ISBN which explains why a metaprogramming approach is useful. Study the source code of GCC (or of RefPerSys), use or take inspiration from SWIG, ANTLR, GNU bison (they all generate C++ code) when relevant
You seem to have asked for more dynamism than you actually need. You want to avoid the factory itself having to be aware of all of the classes registered in it.
Well, that's doable without going all the way runtime code generation!
There are several implementations of such a factory; but I am obviously biased in favor of my own: einpoklum's Factory class (gist.github.com)
simple example of use:
#include "Factory.h"
// we now have:
//
// template<typename Key, typename BaseClass, typename... ConstructionArgs>
// class Factory;
//
#include <string>
struct Foo { Foo(int x) { }; }
struct Bar : Foo { Bar(int x) : Foo(x) { }; }
int main()
{
util::Factory<std::string, Foo, int> factory;
factory.registerClass<Bar>("key_for_bar");
auto* my_bar_ptr factory.produce("key_for_bar");
}
Notes:
The std::string is used as a key; you could have a factory with numeric values as keys instead, if you like.
All registered classes must be subclasses of the BaseClass value chosen for the factory. I believe you can change the factory to avoid that, but then you'll always be getting void *s from it.
You can wrap this in a singleton template to get a single, global, static-initialization-safe factory you can use from anywhere.
Now, if you load some plugin dynamically (see #BasileStarynkevitch's answer), you just need that plugin to expose an initialization function which makes registerClass() class calls on the factory; and call this initialization function right after loading the plugin. Or if you have a static-initialization safe singleton factory, you can make the registration calls in a static-block in your plugin shared library - but be careful with that, I'm not an expert on shared library loading.
Definetly YES!
Theres an old antique post from 2006 that solved my life for many years. The implementation runs arround having a centralized registry with a decentralized registration method that is expanded using a REGISTER_X macro, check it out:
https://web.archive.org/web/20100618122920/http://meat.net/2006/03/cpp-runtime-class-registration/
Have to admit that #einpoklum factory looks awesome also. I created a headeronly sample gist containing the code and a sample:
https://gist.github.com/h3r/5aa48ba37c374f03af25b9e5e0346a86

Is it possible to create a user-defined datatype in a language like C/C++(or maybe any) from a string as user input or from file

Well this might be a very weird question but my curiosity has striken pretty hard on this. So here it goes...
NOTE: Lets take the language C into consideration here.
As programmers we usually define a user-defined datatype(say struct) in the source code with the appropriate name.
Suppose I have a program in which I have a structure defined as:
struct Animal {
char *name;
int lifeSpan;
};
And also I have started the execution of this program.
Now, my question here is;
What if I want to define a new structure called "Plant" just like "Animal" mentioned above in my program, without writing its definition in the source code itself(which is obviously impossible currently) but rather from a user input string(or a file input) during runtime.
Lets say my program takes input string from a text file named file1.txt whose content is:
struct Plant {
char *name;
int lifeSpan;
};
What I want now is to have a new structure named "Plant" in my program which is already in execution. The program should read the file content and create a structure as written in the file and attach it to itself on-the-go.
I have checked out a solution for C++ in the discussion Declaring a data type dynamically in C++ but it doesnt seem to have a very convincing solution.
The solution I am looking for is at the compiler-linker-loader level rather than from the language itself.I would be very pleased and thankful if anyone is looking forward to sharing their ideas on this.
What you're asking about is basically "can we implement C as a scripting language?", since this is the only way code can be executed after compilation.
I'm aware that people have been writing (mostly in the comments) that it's possible in other languages but isn't possible in C, since C is a compiled language (hence data types should be defined during compile time).
However, to the best of my knowledge it's actually possible (and might not be as hard as one would imagine).
There are many possible approaches (machine code emulation (VM), JIT compilation, etc').
One approach will use a C compiler to compile the C script as an external dynamic library (.dll on windows, .so on linux, etc') and than "load" the compiled library and execute the code (this is pretty much the JIT compilation approach, for lazy people).
EDIT:
As mentioned in the comments, by using this approach, the new type is loaded as part of an external library.
The original code won't know about this new type, only the new code (or library) will be "aware" of this new type and able to properly use it.
On the other hand, I'm not sure why you're insisting on the need to use static types and a compiler-linker-loader level solution.
The language itself (the C language) can manage this task dynamically (during execution time).
Consider Ruby MRI, for example. The Ruby language supports dynamic types that can be defined during runtime...
...However, this is implemented in C and it's possible to use the code from within C to define new modules and classes. These aren't static types that can be tested during compilation (type creation and identification is performed during runtime).
This is a perfect example showing that C (as a language) can dynamically define "types".
However, this is also a poor example because Ruby's approach is slow. A custom approved can be far faster since it would avoid the huge overhead related to functionality you might not need (such as inheritance).

dll entry point question/advice needed

I have a dynamic library of c++ code that is cross platform and mostly just native c++. I then use this dynamic library from my main exe. Up until now all has been good on OSX using gcc. Now I'm on windows I am confused as to what method I should use to enter the dll. I don't have a DllMain function at present as this wasn't required in gcc (to my knowledge). My initial tests worked but on inspection revealed that strangely one of my class constructors was being called on dll load, so I figured I needed to do something more on windows. So do I:
add a DllMain function?
am I safe to just use the noentry compiler option?
When I do either of the above I start getting compiler complaints in the vein of ".CRT section exists there may be unhandled static initializers or terminators"
I have read up on this using this article, but any advice and clarity on the best way forward would be greatly appreciated. Its all a bit blurry in my head as to what I need to do.
Based on the .CRT error, you definitely need a DllMain function. For most Windows compilers, a DllMain will be provided for you automatically, so that you don't need to write one yourself. Based on other parts of your question it seems most likely that you are using Visual C++, whose CRT does provide a DllMain for you. So while you do need a DllMain, you don't need to write the code for it.
The default VC CRT DllMain is used to initialize the CRT for the DLL in question, and to initialise all the static/global variables that the dll provides.
The model for DLLs on Unix and Windows is significantly different, and you should think of each DLL as having a more 'private' set of state. Although, if all Dlls opt into the same version of the CRT dll, some of that state will then be shared.
Because the CRT is providing a DllMain for you, you should not throw /noentry on the linker.
The .CRT section exists error (which you must have seen by throwing /noentry) is telling you that you need a DllMain because you've got one or more objects in your DLL that require static initialisation.
Martyn
If it is just a library, then NOENTRY should suffice. DllMain is there to control events that happen with the DLL (i.e. attach, detach etc).
You can change the code (slightly) to avoid all entry points but main. Essentially, if you have any variable defined outside of the functions (globally but not statically linked), wrap them in a function call. Use the often-forgotten static function variables.
Ie, change global declaration
SomeType var_name;
to this:
SomeType & var_name(){static SomeType var; return var;}
Similarly, you can change static class instance variables by changing this:
struct Container{
Container();
static Container instance;
};
Container Container::instance;
to this:
struct Container{
Container();
static Container & instance();
};
Container & Container::instance(){
static Container var;
return var;
}
This is essentially a singleton, but there might be some concurrency issues if the 1st time you access the instance will be from a multi-threaded environment. In fact, the thing to keep in mind is that unlike globally-defined variables, static locally-defined variables will be initialized the first time the function is called.

Is it safe to use getenv() in static initializers, that is, before main()?

I looked in Stevens, and in the Posix Programmer's Guide, and the best I can find is
An array of strings called the enviroment is made available when the process begins.
This array is pointed to by the external variable environ, which is defined as:
extern char **environ;
It's that environ variable that has me hesitating. I want to say
-The calling process/shell has already allocated the block of null terminated strings
-the 'external' variable environ is used as the entry point by getenv().
-ipso facto feel free to call getenv() within a static initializer.
But I can't find any guarantee that the 'static initialization' of environ precedes all the other static initialization code. Am I overthinking this?
Update
On my platform (AMD Opteron, Redhat 4, GCC 3.2.3), setting LD_DEBUG shows that environ gets set before my static initializers are called. This is a nice thing to know; thanks, #codelogic. But it is not necessarily the result I'd get on all platforms.
Also, while I agree intuitively with #ChrisW on the behavior of the C/C++ runtime library, this is just my intuition based on experience. So anyone who can pipe up with a quote from someplace authoritative guaranteeing that environ is there before static initializers are called, bonus points!
I think you can run your program with LD_DEBUG set to see the exact order:
LD_DEBUG=all <myprogram>
EDIT:
If you look at the source code of the runtime linker (glibc 2.7), specifically in files:
sysdeps/unix/sysv/linux/init-first.c
sysdeps/i386/init-first.c
csu/libc-start.c
sysdeps/i386/elf/start.S
you will see that argc, argv and environ (alias for __environ) are set before any global constructors are called (the init functions). You can follow the execution starting right from _start, the actual entry point (start.S). As you've quoted Stevens "An array of strings called the enviroment is made available when the process begins", suggesting that environment assignment happens at the very beginning of the process initialization. This backed by the linker code, which does the same, should give you sufficient peace of mind :-)
EDIT 2: Also worth mentioning is that environ is set early enough that even the runtime linker can query it to determine whether or not to output verbosely (LD_DEBUG).
Given that both the environment setup and the invoking of the static initializers are functions that the language run-time has to perform before main() is invoked, I am not sure you will find a guarantee here. That is, I am not aware of a specific requirement here that this has to work and that the order is guaranteed prior to main() in, say, the ANSI language and library specifications or anything....but I didn't check to make sure either.
At the same time, I am not aware of a specific requirement that restricts which run-time library functions can be called from a static initializer. And, more to the point, it would feel like a run-time bug (to me) if you couldn't access the environment from one.
On this basis, I vote that I would expect this to work, is a safe assumption, and current data points seem to support this line of reasoning.
In my experience, the C run time library is initialized before the run-time invokes the initializers of your static variables (and so your initializers may call C run time library functions).