C migrating to C++ (embedded) - c++

I have a project in pure C - st usb library and I need to migrate it to c++ and change same structures into classes. I removed all c++ "protections" like:
#ifdef __cplusplus
extern "C" {
#endif
#ifdef __cplusplus
}
#endif
I changed all files extensions from .c to .cpp (except HAL library).
I realized that c++ .hex is 7kB smaller then c .hex. When I looked into .map file I saw that many functions are missing. I thought that staticfunctions caused that, but removing static key word didn't help. Does anyone have idea what could cause that some functions weren't compiled. When extensions are .c everything is fine.

I can think of two main reasons:
Inlining. The compiler can decide there is no need to emit the function as a standalone function if all usages can be inlined.
Unused code. The compiler can see that a function isn't used anywhere in your code and decide to eliminate it from the final result.
If the result is to be used as sort of library, that your environment calls specific functions without having an explicit call in your own code, I think the best method is to compile and link your code as a library (probably dynamic lib) and exporting those functions as library interface (visibility in gcc, dllexport in MSVC) will force the compiler/linker to include them in the produced binary even if they don't see why they are needed right now.
(Of course, this is a wild guess about your target environment.)
Another option is to turn off the specific compiler/linker optimizations for removing dead code and forcing emitting standalone function instance for inlined functions, but I think it's very indirect approach, have a wider effect and can complicate maintenance later.

C++ functions are given different signatures than C functions. Since you lost functionality and the code is much smaller, it's likely that a function that requires C linkage is being compiled as C++ and the signature mismatch is preventing proper linking.
A likely place for this to happen is in the interrupt vector table. If the handler function is compiled with C++ linkage, the handler address won't make it into a table that's compiled with C.
Double check the interrupt vectors and make sure that they reference the correct functions. If they are correct, check any other code compiled with C that might reference an external symbol compiled with C++.

Related

Is it safe to use #ifdef guards on C++ class member functions?

Suppose you have the following definition of a C++ class:
class A {
// Methods
#ifdef X
// Hidden methods in some translation units
#endif
};
Is this a violation of One Definition Rule for the class? What are the associated hazards?
I suspect if member function pointers or virtual functions are used this will most likely break. Otherwise is it safe to use?
I am considering it in the context of Objective C++. The header file is included in both pure C++ and Objective C++ translation units. My idea is to guard methods with Objective-C types with OBJC macro. Otherwise, I have to use void pointer for all Objective-C types in the header, but this way I am losing strong typing and also ugly static casts must be added all over the code.
Yes, it MAY allow a hazard of ODR breach if separate compilation units are allowed to have different state of macro definition X. X should be defined (or not defined) globally across program (and shared objects) before every inclusion of that class definition for program to meet requirement to be compliant. As far as C++ compiler (not preprocessor) concerned, those are two different, incompatible, unrelated class types.
Imagine situation where in compilation unit A.cpp X was defined before class A and in unit B.cpp X was not defined. You would not get any compiler errors if nothing within B.cpp uses those members which were "removed". Both units may be considered well-formed on their own. Now if B.cpp would contain a new expression, it would create an object of incompatible type, smaller than one defined in A.cpp. But any method from class A , including constructor, may cause an UB by accessing memory outside of object's storage when called with object created in B.cpp, because they use the larger definition.
There is a variation of this folly, an inclusion of header file's copy into two or more different folders of build tree with same name of file and POD struct type, one of those folders accessible by #include <filename>. Units with #include "filename" designed to use alternatives. But they wouldn't. Because order of header file lookup in this case is platform-defined, programmer isn't entirely in control of which header would be included in which unit on every platform with #include "filename". As soon as one definition would be changed, even just by re-ordering members, ODR was breached.
To be particularly safe, such things should be done only in compiler domain by using templates, PIMPL, etc. For inter-language communication some middle ground should be arranged, using wrappers or adapters, C++ and ObjectiveC++ may have incompatible memory layout of non-POD objects.
This blows up horribly. Do not do this. Example with gcc:
Header file:
// a.h
class Foo
{
public:
Foo() { ; }
#ifdef A
virtual void IsCalled();
#endif
virtual void NotCalled();
};
First C++ File:
// a1.cpp
#include <iostream>
#include "a.h"
void Foo::NotCalled()
{
std::cout << "This function is never called" << std::endl;
}
extern Foo* getFoo();
extern void IsCalled(Foo *f);
int main()
{
Foo* f = getFoo();
IsCalled(f);
}
Second C++ file:
// a2.cpp
#define A
#include "a.h"
#include <iostream>
void Foo::IsCalled(void)
{
std::cout << "We call this function, but ...?!" << std::endl;
}
void IsCalled(Foo *f)
{
f->IsCalled();
}
Foo* getFoo()
{
return new Foo();
}
Result:
This function is never called
Oops! The code called virtual function IsCalled and we dispatched to NotCalled because the two translation units disagreed on which entry was where in the class virtual function table.
What went wrong here? We violated ODR. So now two translations units disagree on what is supposed to be where in the virtual function table. So if we create a class in one translation unit and call a virtual function in it from another translation unit, we may call the wrong virtual function. Oopsie whoopsie!
Please do not deliberately do thigs that the relevant standards say are not allowed and will not work. You will never be able to think of every possible way it can go wrong. This kind of reasoning has caused many disasters over my decades of programming and I really wish people would stop deliberately and intentionally creating potential disasters.
Is it safe to use #ifdef guards on C++ class member functions?
In practice (look at the generated assembler code using GCC as g++ -O2 -fverbose-asm -S) what you propose to do is safe. In theory it should not be.
However, there is another practical approach (used in Qt and FLTK). Use some naming conventions in your "hidden" methods (e.g. document that all of them should have dontuse in their name like int dontuseme(void)), and write your GCC plugin to warn against them at compile time. Or just use some clever grep(1) in your build process (e.g. in your Makefile)
Alternatively, your GCC plugin may implement new #pragma-s or function attributes, and could warn against misuse of such functions.
Of course, you can also use (cleverly) private: and most importantly, generate C++ code (with a generator like SWIG) in your build procedure.
So practically speaking, your #ifdef guards may be useless. And I am not sure they make the C++ code more readable.
If performance matters (with GCC), use the -flto -O2 flags at both compile and link time.
See also GNU autoconf -which uses similar preprocessor based approaches.
Or use some other preprocessor or C++ code generator (GNU m4, GPP, your own one made with ANTLR or GNU bison) to generate some C++ code. Like Qt does with its moc.
So my opinion is that what you want to do is useless. Your unstated goals can be achieved in many other ways. For example, generating "random" looking C++ identifiers (or C identifiers, or ObjectiveC++ names, etc....) like _5yQcFbU0s (this is done in RefPerSys) - the accidental collision of names is then very improbable.
In a comment you state:
Otherwise, I have to use void* for all Objective-C types in the header, but this way I am losing strong typing
No, you can generate some inline C++ functions (that would use reinterpret_cast) to gain again that strong typing. Qt does so! FLTK or FOX or GTKmm also generate C++ code (since GUI code is easy to generate).
My idea was to guard methods with Objective-C types with OBJC macro
This make perfect sense if you generate some C++ or C or Objective C code with these macros.
I suspect if member function pointers or virtual functions are used this will most likely break.
In practice, it won't break if you generate random looking C++ identifiers. Or just if you document naming conventions (like GNU bison or ANTLR does) in generated C++ code (or in generated Objective C++, or in generated C, ... code)
Please notice that compilers like GCC use today (in 2021, internally) several C++ code generators. So generating C++ code is a common practice. In practice, the risks of name collisions are small if you take care of generating "random" identifiers (you could store them in some sqlite database at build time).
also ugly static casts must be added all over the code
These casts don't matter if the ugly code is generated.
As examples, RPCGEN and SWIG -or Bisoncpp- generate ugly C and C++ code which works very well (and perhaps also some proprietary ASN.1 or JSON or HTTP or SMTP or XML related in-house code generators).
The header file is included in both pure C++ and Objective C++ translation units.
An alternative approach is to generate two different header files...
one for C++, and another for Objective C++. The SWIG tool could be inspirational. Of course your (C or C++ or Objective C) code generators would emit random looking identifiers.... Like I do in both Bismon (generating random looking C names like moduleinit_9oXtCgAbkqv_4y1xhhF5Nhz_BM) and RefPerSys (generating random looking C++ names like rpsapply_61pgHb5KRq600RLnKD ...); in both systems accidental name collision is very improbable.
Of course, in principle, using #ifdef guards is not safe, as explained in this answer.
PS. A few years ago I did work on GCC MELT which generated millions of lines of C++ code for some old versions of the GCC compiler. Today -in 2021- you practically could use asmjit or libgccjit to generate machine code more directly. Partial evaluation is then a good conceptual framework.

How do I (programmatically) ensure that a dll contains definitions for all exported functions?

Let's say I have this lib
//// testlib.h
#pragma once
#include <iostream>
void __declspec(dllexport) test();
int __declspec(dllexport) a();
If I omit the definition for a() and test() from my testlib.cpp, the library still compiles, because the interface [1] is still valid. (If I use them from a client app then they won't link, obviously)
Is there a way I can ensure that when the obj is created (which I gather is the compiler's job) it actually looks for the definitions of the functions that I explicitly exported, and fails if doesn't ?
This is not related to any real world issue. Just curious.
[1] MSVC docs
No, it's not possible.
Partially because a dllexport declaration might legally not even be implemented in the same DLL, let alone library, but be a mere forward declaration for something provided by yet another DLL.
Especially it's impossible to decide on the object level. It's just another forward declaration like any other.
You can dump the exported symbols once the DLL has been linked, but there is no common tool for checking completeness.
Ultimately, you can't do without a client test application which attempts to load all exported interfaces. You can't check that on compile time yet. Even just successfully linking the test application isn't enough, you have to actually run it.
It gets even worse if there are delay-loaded DLLs (and yes, there usually are), because now you can't even check for completeness unless you actually call at least one symbol from each involved DLL.
Tools like Dependency Walker etc. exist for this very reason.
You asked
"Is there a way I can ensure that when the obj is created (which I gather is the compiler's job) it actually looks for the definitions of the functions that I explicitly exported, and fails if doesn't ?"
When you load a DLL then the actual function binding is happenning at runtime(late binding of functions) so it is not possible for the compiler to know if the function definition are available in the DLL or not. Hope this answer to your query.
How do I (programmatically) ensure that a dll contains definitions for all exported functions?
In general, in standard C++11 (read the document n3337 and see this C++ reference), you cannot
Because C++11 does not know about DLL or dynamic linking. It might sometimes make sense to dynamically load code which does not define a function which you pragmatically know will never be called (e.g. an incomplete DLL for drawing shapes, but you happen to know that circles would never be drawn in your particular usage, so the loaded DLL might not define any class Circle related code. In standard C++11, every called function should be defined somewhere (in some other translation unit).
Look also into Qt vision of plugins.
Read Levine's Linkers and loaders book. Notice that on Linux, plugins loaded with dlopen(3) have a different semantics than Windows DLLs. The evil is in the details
In practice, you might consider using some recent variant of the GCC compiler and develop your GCC plugin to check that. This could require several weeks of work.
Alternatively, adapt the Clang static analyzer for your needs. Again, budget several weeks of work.
See also this draft report and think about C++ cross compilers (e.g. compiling on Windows a plugin for a RaspBerry Pi)
Consider also runtime code generation frameworks like asmjit or libgccjit. You might think of generating at runtime the missing stubs or functions (and fill appropriately function pointers with them, or even vtables). C++ exceptions could also be an additional issue.
If your DLL contains calls to the functions, the linker will fail if a definition isn't provided for those functions. It doesn't matter if the calls are never executed, only that they exist.
//// testlib.h
#pragma once
#include <iostream>
#ifndef DLLEXPORT
#define DLLEXPORT(TYPE) TYPE __declspec(dllexport)
#endif
DLLEXPORT(void) test();
DLLEXPORT(int) a();
//// testlib_verify.c
#define DLLEXPORT(TYPE)
void DummyFunc()
{
#include testlib.h
}
This macro-based solution only works for functions with no parameters, but it should be easy to extend.

How to link old C code with reserved keywords in it with C++?

I have a 10+ years old C library which -- I believe -- used to work just fine in the good old days, but when I tried to use it with a C++ source (containing the main function) the other day I ran into some difficulties.
Edit: to clarify, the C library compiles just fine with gcc, and it generates an object file old_c_library.o. This library was supposed to be used in a way so that the C header file old_c_library.h is #included in your main.c C source file. Then your main C source file should be compiled and linked together with old_c_library.o via gcc. Here I want to use a C++ source file main.cpp instead, and compile/link it with g++.
The following three problems occurred, during the compilation of the C++ source file:
one of the header files of the C library contains the C++ reserved word new (it is the name of an integer), which resulted in fatal error; and
one of the header files of the C library contains a calloc call (an explicit typecast is missing), which resulted in fatal error; and
various files of the C library contain code where comparison of signed and unsigned integers happen, which resulted in warnings.
Edit: I tried to use the #extern "C" { #include "obsolete_c_library.h" } "trick", as suggested in the comments, but that did not solve any of my problems.
I can sort out problem 1 by renaming all instances of the reserved words and replacing them by -- basically -- anything else. I can sort out problem 2 by typecasting the calloc call. I might try to sort out the warnings by ideas suggested here: How to disable GCC warnings for a few lines of code.
But I still wonder, is there a way to overcome these difficulties in an elegant, high-level way, without actually touching the original library?
Relevant:
Where is C not a subset of C++? and Do I cast the result of malloc? and How do I use extern to share variables between source files?.
Generally speaking, it is not safe to #include C header files into C++ sources if those headers were not built in anticipation of such usage. Under some circumstances it can be made to work, but you need to be prepared to either modify the headers or write your own declarations for the functions and global variables you want to access.
At minimum, if the C headers declare any functions and you are not recompiling those functions in C++ then you must ensure that the declarations are assigned C linkage in your C++ code. It's not uncommon for C headers to account for that automatically via conditional compilation directives, but if they do not then you can do it on the other side by wrapping the inclusion(s) in a C linkage block:
extern "C" {
#include "myclib.h"
}
If the C headers declare globals whose names conflict with C++ keywords, and which you do not need to reference, then you may be able to use the preprocessor to redefine them:
#define new extern_new
#include "myclib.h"
#undef new
That's not guaranteed to work, but it's worth a try. Do not forget to #undef such macros after including the C headers, as shown.
There may be other fun tricks you can play with macros to adapt specific headers to C++, but at some point it makes more sense to just copy / rewrite the needed declarations (and only those), either in your main C++ source or in your own C++ header. Note that doing so does not remove the need to declare C linkage -- that requirement comes from the library having been compiled by a C compiler rather than a C++ compiler.
You can write a duplicate of the c header with the only difference being lack of declaration of extern int new. Then use the newly created c++ friendly header instead of the old, incompatible one.
If you need to refer that variable from c++, then you need to do something more complex. You will need to write a new wrapper library in c, that exposes read and write functions a pointer to c++ that can be used to access the unfortunately named variable.
If some inline functions refer to the variable, then you can leave those out from the duplicate header and if you need to call them from c++, re-implement them in a c++ friendly manner. Just don't give the re-implementation same name within extern "C", because that would give you a conflict with the original inline function possibly used by some existing c code.
Do the same header duplication as described in 1. leaving out the problematic inline function and re-implementing if needed.
Warnings can be ignored or disabled as you already know. No need to modify the original headers. You could rewrite any {const,type}-unsafe inline functions with versions that don't generate warnings, but you have to consider whether your time is worth it.
The disadvantage of this approach is that you now have two vesions of some/all of the headers. New ones used by c++, and the old ones that may be used by some old code.
If you can give up the requirement of not touching the original library and don't need to work with an existing compiled binary, then an elegant solution would be to simply rename the problematic variables (1.). Make non-c++-compatible inline functions (2.) non-inline and move them into c source files. Fix unsafe code (3.) that generates warnings or if the warning is c++ specific, simply make the function non-inline.

Difference Between C and C++ Executables?

I know there are differences in the source code between C and C++ programs - this is not what I'm asking about.
I also know this will vary from CPU to CPU and OS to OS, depending on compiler.
I'm teaching myself C++ and I've seen numerous references to libraries that can be used by both languages. This has started me thinking - are there significant differences between the binary executables of the two languages?
For libraries to be easily used by both, I would think they'd have to be similar on an executable level.
Are there many situations where a person could examine a executable file and tell whether it was created by C or C++ source code? Or would the binaries be pretty similar?
In most cases, yes, it's pretty easy. Here are just a few clues that I've seen often enough to remember them easily:
C++ program will typically end up with at least a few visible symbols that have been mangled.
C++ program will typically have at least a few calls to virtual functions, which are typically quite distinctive from code you'll typically see in C.
Many C++ compilers implement a calling convention for C++ that gives special consideration to passing the this pointer into C++ member functions. Again, since the this pointer simply doesn't exist in C, you'll rarely see a direct analog (though in some cases, they will use the same convention to pass some other pointer, so you need to be careful about this one).
A executable is a executable is a executable, no matter what language it's written in. If it's built for the target architecture, it'll run on the architecture.
The (arguably) most important difference between C and C++-compiled code, and the one relevant to libraries that can be linked both against C and C++ executables, is that of name mangling. Basically: when a library is compiled, it exports a set of symbols (function names, exported variables, etc.) that executables linked against the library can use. How these symbols are named is a fairly compiler/linker-specific, and if the subsequent executable is linked using a linker using an incompatible convention, then symbols won't resolve correctly. In addition, C and C++ have slightly different conventions. The Wikipedia article linked above has more of the details; suffice to say, when declaring exported symbols in a header file, you'll usually see a construction like:
#ifdef __cplusplus
extern "C" {
#endif
/* exported declarations here */
#ifdef __cplusplus
}
#endif
__cplusplus is a preprocessor macro only defined when compiling C++ code. The idea here is that, when using the header in C++, the compiler is instructed to use the C way of naming exported symbols (inside the "extern "C" { /* foo */ }" block, so the library can be linked both in C and C++ correctly.
I think I could tell if something is C++ or C from reading the disassembled binary code [for processor architectures that I'm familiar with, x86, x86_64 and ARM]. But in reality, there isn't much difference, you'd have to look pretty hard to know for sure.
Signs to look for are "indirect calls" (function pointer calls via a table) and this-pointers. Although C can have pointer to struct arguments and will often use function pointers, it's not usually set up in the way that C++ does it. Also, you'll notice, sometimes, that the compiler takes a pointer to a struct and adds a small offset - that's removing the outer layer of an inherited class. This CAN happen in C as well, but it won't be as common/distinctive.
Looking just at the binary [unless you can "do disassembly in your head" would be a lot harder - especially if it's been stripped of symbols - that's like the guy who could tell you what classical music something was on an old Vinyl record from looking at the tracks [with the label hidden] - not something most people can do, even if they are "good".
In practice, a C program (or a C++ program) is rarely only pure standard C (or C++) (for instance the C99 standard has no mean to scan a directory). So programs use additional libraries.
On Linux, most binaries are dynamically linked. Use the ldd command to find out.
If the binary is linked to the stdc++ library, the source code is likely C++.
If only the libc.so library is linked, the source code is probably only C (but you could link statically the libstdc++.a library).
You can also use tools working on binary files (e.g. objdump, readelf, strings, nm on Linux ....) to find more about them.
The code generated by C and C++ compilers is generally the same code. There are two important differences:
Name mangling: Each function and global variable becomes a symbol at compile time. In C these symbol's names are the same as their names in your source code. In C++ they are being mangled a bit to allow for polymorphic code
Calling conventions: If you call a method in C++ the this-pointer is passed as a hidden first parameter. Other conventions might also be different such as call by reference which does not exist in C
You can use an block such as this to let the C++-compiler generate code compatible to C:
extern "C" {
/* code */
}

Header-only linking

Many C++ projects (e.g. many Boost libraries) are "header-only linked".
Is this possible also in plain C? How to put the source code into headers? Are there any sites about it?
Executive summary: You can, but you shouldn't.
C and C++ code is preprocessed before it's compiled: all headers are "pasted" into the source files that include them, recursively. If you define a function in a header and it is included by two C files, you will end up with two copies in each object file (One Definition Rule violation).
You can create "header-only" C libraries if all your functions are marked as static, that is, not visible outside the translation unit. But it also means you will get a copy of all the static functions in each translation unit that includes the header file.
It is a bit different in C++: inline functions are not static, symbols emitted by the compiler are still visible by the linker, but the linker can discard duplicates, rather than giving up ("weak" symbols).
It's not idiomatic to write C code in the headers, unless it's based on macros (e.g. queue(3)). In C++, the main reason to keep code in the headers are templates, which may result in code instantiation for different template parameters, which is not applicable to C.
You do not link headers.
In C++ it's slightly easier to write code that's already better-off in headers than in separately-compiled modules because templates require it. 1
But you can also use the inline keyword for functions, which exists in C as well as C++. 2
// Won't cause redefinition link errors, because of 6.7.4/5
inline void foo(void) {
// ...
}
[c99: 6.7.4/5:] A function declared with an inline function
specifier is an inline function. The function specifier may appear
more than once; the behavior is the same as if it appeared only once.
Making a function an inline function suggests that calls to the
function be as fast as possible. The extent to which such
suggestions are effective is implementation-defined.
You're a bit stuck when it comes to data objects, though.
1 - Sort of.
2 - C99 for sure. C89/C90 I'd have to check.
Boost makes heavy use templates and template meta-programming which you cannot emulate (all that easily) in C.
But you can of course cheat by having declarations and code in C headers which you #include but that is not the same thing. I'd say "When in Rome..." and program C as per C conventions with libraries.
Yes, it is quite possible. Declare all functions in headers and either all as static or just use a single compilation unit (i.e. only a single c file) in your projects.
As a personal anecdote, I know quite a number of physicists who insist that this technique is the only true way to program C. It is beneficial because it's the poor man's version of -fwhole-program, i.e. makes optimizations based on the knowledge of function behaviour possible. It is practical because you don't need to learn about using the linker flags. It is a bad idea because your whole program must be compiled as a whole and recompiled with each minor change.
Personally, I'd recommend to let it be or at least go with static for only a few functions.