Speed : typedef vs #define in c++ - c++

Recently I came across typedefs and #defines. Even though they have similar usage, one of them is a compiler token and the other is a preprocessor token.
This made me wonder about their speeds of operation, as one wants to be as fast as possible in competitive programming.
So, which one is relatively faster? An explanation accompanied with the answer would be great. Would the compiler used make any difference like g++ vs MSVC compiler vs clang compiler?
Use case examples:
typedef long long int; and #define ll long long int.

There is no difference in performance, but preprocessor macros are not recommended because they pollute the global scope since unlike typedef they can't be placed in a namespace.
But arguably, ll isn't very expressive; it may make the code less readable. Consider using int64_t from <cstdint>. It's nice because it's more expressive (_t clearly indicates it's a type and its size is exactly 64 bits, and so is future-proof, even when long long is 128 bits), and is relatively concise so no need to typedef anything.

They both take the same amount of time. You won't notice any difference at all.
Also note that if used properly they are identical at runtime. Only in compile times they might differ slightly but that's barely anything.

If you care about using c++ features, the better option would be to use the using keyword which is available since c++11 and specifically designed for this purpose. It is also compatible with templates.
Please note that, both using and typedef are semantically same.
https://en.cppreference.com/w/cpp/language/type_alias

No difference in execution time, so you should be fine using either for competitive programming.
I personally avoid #define and use typedef for types as it is more cleaner and readable.

Related

Any reason to define types in dll? Like #define MODULE_NAME_INT8 __int8

So I am going through some code at work and while I am not an expert in c++ yet I think it is safe to say the code is a bit of mess. Still I would like confirmation on something. In the code every type is defined like so:
#define MODULE_NAME_INT8 __int8
The same for unsigned, 16, 32 and 64 bit as well as bool, null, true and false. Is there possibly some reason for this? It is in a dll, and other parts of the application is written in C# so maybe there is something with this I have not thought of?
It really seems like a complete waste and a misuse of define to me. I mean the type name is in the define, it is basically just renaming the type, why clutter the code like this? Am I right or am I missing something?
I think those defines make sense. If this code is a bit old, MSVC 2008 and before do not have stdint.h nor cstdint. So the standard type int8_t cannot be used and the programmer decided to hide the Windows specific __int8 behind a define to only have to change the define if the code has to be ported to another platform (or simply a different compiler)
Double underscore before type is compiler-specific. Typically you can see defines like this one if the code is supposed to be built using different compiler or platform, so that you can build from the same source code, and types remain the same.
Although this clutters the code, using for example just "int" may be dangerous in case you want to build for multiple platforms, because in C++ int has no standard-defined size (int may be 1,2,4,8 bytes).
C++11 introduced fixed types for this purpose: int8_t, int16_t, and so on, but since your code was most probably created long before this, it does not take advantage of this, and goes the "old-fashioned" way.
Having a header file with all data types used by the application is very common practice in commercial software.
It's more of a thinking ahead kind of measure than a maintain compatibility. The purpose is to make the code easily portable and easy to maintain.
Since there is no guarantee that the code will not have to run a different architecture which uses different sizes for the common data types, the practice is a very good one.

Enum class performance when array indexing

Can I expect to see a performance hit for the casting in this . .
enum class myEnum {A,B,C};
myArray[(int)myEnum::A] = 123;
Compared to this?
enum myEnum {A,B,C};
myArray[A] = 123;
I'm leaning towards the new style enum classes for the type safety, but don't want to do it at the expense of performance.
It depends whether the enum value used as an index is known at compile time or passed in a variable.
That is myArray[(int)myEnum::A] shall not incur any penalty but myArray[(int)e] might, depending on the physical representation of e (ie, it might be necessary to "extend" it).
On the other hand, a simple extension is a trivial operation that is unlikely to ever show up as a performance issues: things like branch prediction (in conditionals) and caching are much more important in most applications (for low-level), and at a higher level algorithms matter.
Note: to avoid the extension issue in the runtime scenario, you could define the base type of myEnum to be the natural type that is expected for compiler arithmetic, I believe a ptrdiff_t would be most appropriate here. It is a big integer though.
Of course it's ultimately up to the compiler, but it's hard to see why any reasonable compiler would generate different code in these two cases.
I've tested this with g++ 4.7.2 on Intel, and they compile to identical assembly code.
No, it's not likely that the casting should have any impact on performance. This is all resolved at compile time.
I would expect NOT, but it would depend on the compiler implementation. Why don't you try both approaches, and time it? Try a few different compiler optimisation settings too, so you know that it won't make it different when it comes to production code compiles.
I doubt there would be any assembly instructions at all for the cast, since the entire expression (int)myEnum::A can be evaluated at compile time.
But if you really want to know, make a pair of sample programs, and analyze both by dumping disassembly, and/or measuring performance.

What are the major advantages of const versus #define for global constants?

In embedded programming, for example, #define GLOBAL_CONSTANT 42 is preferred to const int GLOBAL_CONSTANT = 42; for the following reasons:
it does not need place in RAM (which is usually very limited in microcontrollers, and µC applications usually need a large number of global constants)
const needs not only a storage place in the flash, but the compiler generates extra code at the start of the program to copy it.
Against all these advantages of using #define, what are the major advantages of using const?
In a non-µC environment memory is usually not such a big issue, and const is useful because it can be used locally, but what about global constants? Or is the answer just "we should never ever ever use global constants"?
Edit:
The examples might have caused some misunderstanding, so I have to state that they are in C. If the C compiler generated the exact same code for the two, I think that would be an error, not an optimization.
I just extended the question to C++ without thinking much about it, in the hopes of getting new insights, but it was clear to me, that in an object-oriented environment there is very little space for global constants, regardless whether they are macros or consts.
Are you sure your compiler is too dumb to optimize your constant by inserting its value where it is needed instead of putting it into memory? Compilers usually are good in optimizations.
And the main advantage of constants versus macros is that constants have scope. Macros are substituted everywhere with no respect for scope or context. And it leads to really hard to understand compiler error messages.
Also debuggers are not aware of macros.
More can be found here
The answer to your question varies for C and C++.
In C, const int GLOBAL_CONSTANT is not a constant in C, So the primary way to define a true constant in C is by using #define.
In C++, One of the major advantage of using const over #define is that #defines don't respect scopes so there is no way to create a class scoped namespace. While const variables can be scoped in classes.
Apart from that there are other subtle advantages like:
Avoiding Weird magical numbers during compilation errors:
If you are using #define those are replaced by the pre-processor at time of precompilation So if you receive an error during compilation, it will be confusing because the error message wont refer the macro name but the value and it will appear a sudden value, and one would waste lot of time tracking it down in code.
Ease of Debugging:
Also for same reasons mentioned in #2, while debugging #define would provide no help really.
Another reason that hasn't been mentioned yet is that const variables allow the compiler to perform explicit type-checking, but macros do not. Using const can help prevent subtle data-dependent errors that are often difficult to debug.
I think the main advantage is that you can change the constant without having to recompile everything that uses it.
Since a macro change will effectively modify the contents of the file that use the macro, recompilation is necessary.
In C the const qualifier does not define a constant but instead a read-only object:
#define A 42 // A is a constant
const int a = 42; // a is not constant
A const object cannot be used where a real constant is required, for example:
static int bla1 = A; // OK, A is a constant
static int bla2 = a; // compile error, a is not a constant
Note that this is different in C++ where the const really qualifies an object as a constant.
The only problems you list with const sum up as "I've got the most incompetent compiler I can possibly imagine". The problems with #define, however, are universal- for example, no scoping.
There's no reason to use #define instead of a const int in C++. Any decent C++ compiler will substitute the constant value from a const int in the same way it does for a #define where it is possible to do so. Both take approximately the same amount of flash when used the same way.
Using a const does allow you to take the address of the value (where a macro does not). At that point, the behavior obviously diverges from the behavior of a Macro. The const now needs a space in the program in both flash and in RAM to live so that it can have an address. But this is really what you want.
The overhead here is typically going to be an extra 8 bytes, which is tiny compared to the size of most programs. Before you get to this level of optimization, make sure you have exhausted all other options like compiler flags. Using the compiler to carefully optimize for size and not using things like templates in C++ will save you a lot more than 8 bytes.

Best Practices: Should I create a typedef for byte in C or C++?

Do you prefer to see something like t_byte* (with typedef unsigned char t_byte) or unsigned char* in code?
I'm leaning towards t_byte in my own libraries, but have never worked on a large project where this approach was taken, and am wondering about pitfalls.
If you're using C99 or newer, you should use stdint.h for this. uint8_t, in this case.
C++ didn't get this header until C++11, calling it cstdint. Old versions of Visual C++ didn't let you use C99's stdint.h in C++ code, but pretty much every other C++98 compiler did, so you may have that option even when using old compilers.
As with so many other things, Boost papers over this difference in boost/integer.hpp, providing things like uint8_t if your compiler's standard C++ library doesn't.
I suggest that if your compiler supports it use the C99 <stdint.h> header types such as uint8_t and int8_t.
If your compiler does not support it, create one. Here's an example for VC++, older versions of which do not have stdint.h. GCC does support stdint.h, and indeed most of C99
One problem with your suggestion is that the sign of char is implementation defined, so if you do create a type alias. you should at least be explicit about the sign. There is some merit in the idea since in C# for example a char is 16bit. But it has a byte type as well.
Additional note...
There was no problem with your suggestion, you did in fact specify unsigned.
I would also suggest that plain char is used if the data is in fact character data, i.e. is a representation of plain text such as you might display on a console. This will present fewer type agreement problems when using standard and third-party libraries. If on the other hand the data represents a non-character entity such as a bitmap, or if it is numeric 'small integer' data upon which you might perform arithmetic manipulation, or data that you will perform logical operations on, then one of the stdint.h types (or even a type defined from one of them) should be used.
I recently got caught out on a TI C54xx compiler where char is in fact 16bit, so that is why using stdint.h where possible, even if you use it to then define a byte type is preferable to assuming that unsigned char is a suitable alias.
I prefer for types to convey the meaning of the values stored in it. If I need a type describing a byte as it is on my machine, I very much prefer byte_t over unsigned char, which could mean just about anything. (I have been working in a code base that used either signed char or unsigned char to store UTF-8 strings.) The same goes for uint8_t. It could just be used as that: an 8bit unsigned integer.
With byte_t (as with any other aptly named type), there rarely ever is a need to look up what it is defined to (and if so, a good editor will take 3secs to look it up for you; maybe 10secs, if the code base is huge), and just by looking at it it's clear what's stored in objects of that type.
Personally I prefer boost::int8_t and boost::uint8_t.
If you don't want to use boost you could borrow boost\cstdint.hpp.
Another option is to use portable version of stdint.h (link from this answer).
Besides your awkward naming convention, I think that might be okay. Keep in mind boost does this for you, to help with cross-platform-ability:
#include <boost/integer.hpp>
typedef boost::uint8_t byte_t;
Note that usually type's are suffixed with _t, as in byte_t.
I prefer to use standard types, unsigned char, uint8_t, etc., so any programmer looking at the source does not have to refer back to headers to grok the code. The more typedefs you use the more time it takes for others to get to know your typing conventions. For structures, absolutely use typedefs, but for primitives use them sparingly.

How to limit the impact of implementation-dependent language features in C++?

The following is an excerpt from Bjarne Stroustrup's book, The C++ Programming Language:
Section 4.6:
Some of the aspects of C++’s fundamental types, such as the size of an int, are implementation- defined (§C.2). I point out these dependencies and often recommend avoiding them or taking steps to minimize their impact. Why should you bother? People who program on a variety of systems or use a variety of compilers care a lot because if they don’t, they are forced to waste time finding and fixing obscure bugs. People who claim they don’t care about portability usually do so because they use only a single system and feel they can afford the attitude that ‘‘the language is what my compiler implements.’’ This is a narrow and shortsighted view. If your program is a success, it is likely to be ported, so someone will have to find and fix problems related to implementation-dependent features. In addition, programs often need to be compiled with other compilers for the same system, and even a future release of your favorite compiler may do some things differently from the current one. It is far easier to know and limit the impact of implementation dependencies when a program is written than to try to untangle the mess afterwards.
It is relatively easy to limit the impact of implementation-dependent language features.
My question is: How to limit the impact of implementation-dependent language features? Please mention implementation-dependent language features then show how to limit their impact.
Few ideas:
Unfortunately you will have to use macros to avoid some platform specific or compiler specific issues. You can look at the headers of Boost libraries to see that it can quite easily get cumbersome, for example look at the files:
boost/config/compiler/gcc.hpp
boost/config/compiler/intel.hpp
boost/config/platform/linux.hpp
and so on
The integer types tend to be messy among different platforms, you will have to define your own typedefs or use something like Boost cstdint.hpp
If you decide to use any library, then do a check that the library is supported on the given platform
Use the libraries with good support and clearly documented platform support (for example Boost)
You can abstract yourself from some C++ implementation specific issues by relying heavily on libraries like Qt, which provide an "alternative" in sense of types and algorithms. They also attempt to make the coding in C++ more portable. Does it work? I'm not sure.
Not everything can be done with macros. Your build system will have to be able to detect the platform and the presence of certain libraries. Many would suggest autotools for project configuration, I on the other hand recommend CMake (rather nice language, no more M4)
endianness and alignment might be an issue if you do some low level meddling (i.e. reinterpret_cast and friends things alike (friends was a bad word in C++ context)).
throw in a lot of warning flags for the compiler, for gcc I would recommend at least -Wall -Wextra. But there is much more, see the documentation of the compiler or this question.
you have to watch out for everything that is implementation-defined and implementation-dependend. If you want the truth, only the truth, nothing but the truth, then go to ISO standard.
Well, the variable sizes one mentioned is a fairly well known issue, with the common workaround of providing typedeffed versions of the basic types that have well defined sizes (normally advertised in the typedef name). This is done use preprocessor macros to give different code-visibility on different platforms. E.g.:
#ifdef __WIN32__
typedef int int32;
typedef char char8;
//etc
#endif
#ifdef __MACOSX__
//different typedefs to produce same results
#endif
Other issues are normally solved in the same way too (i.e. using preprocessor tokens to perform conditional compilation)
The most obvious implementation dependency is size of integer types. There are many ways to handle this. The most obvious way is to use typedefs to create ints of the various sizes:
typedef signed short int16_t;
typedef unsigned short uint16_t;
The trick here is to pick a convention and stick to it. Which convention is the hard part: INT16, int16, int16_t, t_int16, Int16, etc. C99 has the stdint.h file which uses the int16_t style. If your compiler has this file, use it.
Similarly, you should be pedantic about using other standard defines such as size_t, time_t, etc.
The other trick is knowing when not to use these typedef. A loop control variable used to index an array, should just take raw int types so the compile will generate the best code for your processor. for (int32_t i = 0; i < x; ++i) could generate a lot of needless code on a 64-bite processor, just like using int16_t's would on a 32-bit processor.
A good solution is to use common headings that define typedeff'ed types as neccessary.
For example, including sys/types.h is an excellent way to deal with this, as is using portable libraries.
There are two approaches to this:
define your own types with a known size and use them instead of built-in types (like typedef int int32 #if-ed for various platforms)
use techniques which are not dependent on the type size
The first is very popular, however the second, when possible, usually results in a cleaner code. This includes:
do not assume pointer can be cast to int
do not assume you know the byte size of individual types, always use sizeof to check it
when saving data to files or transferring them across network, use techniques which are portable across changing data sizes (like saving/loading text files)
One recent example of this is writing code which can be compiled for both x86 and x64 platforms. The dangerous part here is pointer and size_t size - be prepared it can be 4 or 8 depending on platform, when casting or differencing pointer, cast never to int, use intptr_t and similar typedef-ed types instead.
One of the key ways of avoiding dependancy on particular data sizes is to read & write persistent data as text, not binary. If binary data must be used then all read/write operations must be centralised in a few methods and approaches like the typedefs already described here used.
A second rhing you can do is to enable all your your compilers warnings. for example, using the -pedantic flag with g++ will warn you of lots of potential portability problems.
If you're concerned about portability, things like the size of an int can be determined and dealt with without much difficulty. A lot of C++ compilers also support C99 features like the int types: int8_t, uint8_t, int16_t, uint32_t, etc. If yours doesn't support them natively, you can always include <cstdint> or <sys/types.h>, which, more often than not, has those typedefed. <limits.h> has these definitions for all the basic types.
The standard only guarantees the minimum size of a type, which you can always rely on: sizeof(char) < sizeof(short) <= sizeof(int) <= sizeof(long). char must be at least 8 bits. short and int must be at least 16 bits. long must be at least 32 bits.
Other things that might be implementation-defined include the ABI and name-mangling schemes (the behavior of export "C++" specifically), but unless you're working with more than one compiler, that's usually a non-issue.
The following is also an excerpt from Bjarne Stroustrup's book, The C++ Programming Language:
Section 10.4.9:
No implementation-independent guarantees are made about the order of construction of nonlocal objects in different compilation units. For example:
// file1.c:
Table tbl1;
// file2.c:
Table tbl2;
Whether tbl1 is constructed before tbl2 or vice versa is implementation-dependent. The order isn’t even guaranteed to be fixed in every particular implementation. Dynamic linking, or even a small change in the compilation process, can alter the sequence. The order of destruction is similarly implementation-dependent.
A programmer may ensure proper initialization by implementing the strategy that the implementations usually employ for local static objects: a first-time switch. For example:
class Zlib {
static bool initialized;
static void initialize() { /* initialize */ initialized = true; }
public:
// no constructor
void f()
{
if (initialized == false) initialize();
// ...
}
// ...
};
If there are many functions that need to test the first-time switch, this can be tedious, but it is often manageable. This technique relies on the fact that statically allocated objects without constructors are initialized to 0. The really difficult case is the one in which the first operation may be time-critical so that the overhead of testing and possible initialization can be serious. In that case, further trickery is required (§21.5.2).
An alternative approach for a simple object is to present it as a function (§9.4.1):
int& obj() { static int x = 0; return x; } // initialized upon first use
First-time switches do not handle every conceivable situation. For example, it is possible to create objects that refer to each other during construction. Such examples are best avoided. If such objects are necessary, they must be constructed carefully in stages.