C++ preprocessor ensure globally unique string - c++

I have a Macro function which takes a name and is intended to be called from various namespaces. I want to ensure that this name be unique globally. The define looks something like this:
#define DECLARE_NEW_MYVAR( Name ) static MyVar Name( #Name )
I want this static variable to be namespaced (which is why I expect it to be called from various namespaces) but I also need to ensure that the string being passed to the constructor of MyVar is globally unique. This is because I am going to serialize this value and I need to be able to associate it back correctly.
A few things that I tried without success:
To Force it to be Unique: Based on some information append something to the name to force it to be unique
Use the __COUNTER__ macro: One of the compilers I am targeting does not have this Macro. Also, it seems dangerous to assume that the order in which the Macro is called will be the same
Use the __FILE__ macro: This has the entire filepath which is good to make sure that it is unique, but if compiled from a different place or a different machine, the deserialization would no longer work.
To Check if it is Unique: Leave it up to the caller for it to be globally unique and have the compiler complain if it is not
I was looking for a way to declare something in the global namespace from within a namespace so that I could at least cause a multiply defined symbols if they don't make it unique. I couldn't figure out a way to do this.
Basically I need to come up with a globally unique string to pass to MyVar that I can trust will not change between different compilations and preferably not change between code changes (as long as the specific call hasn't been changed).
Does anyone know how to do this?

You could remove the prefix to the project directory from __FILE__ like this:
#include <cstring>
#include <iostream>
#define COMMON_PATH_PREFIX "/home/user/path/to/project/"
#define UNIQUE_IDENTIFIER() (__FILE__ + std::strlen(COMMON_PATH_PREFIX))
void someFunction(const char *identifier)
{
std::cout << identifier << std::endl;
}
int main()
{
someFunction(UNIQUE_IDENTIFIER());
}
The COMMON_PATH_PREFIX could be #define'd by your build system. If you use CMake, for example, you could simply use CMAKE_SOURCE_DIR.
The call to std::strlen() should be optimized out by your compiler, since the string is constant and known at compile time.
Of course this only works if you only want to declare one variable per file. You could also add __LINE__ to the identifier, but then it's much more likely that the identifier changes if you change your code.
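Adding a fixed offset to __FILE__ silently assumes that every path really starts with COMMON_PATH_PREFIX. A slightly more defensive sketch checks the prefix first, and also shows how __LINE__ could be appended with the usual two-level stringification trick. The names strip_prefix, STRINGIFY and FILE_LINE_ID are made up for this illustration and are not part of the answer above:

```cpp
#include <cstring>

// Hypothetical helper: returns `path` without `prefix` if it starts with it,
// otherwise returns `path` unchanged instead of pointing past the string.
inline const char *strip_prefix(const char *path, const char *prefix)
{
    const std::size_t n = std::strlen(prefix);
    return std::strncmp(path, prefix, n) == 0 ? path + n : path;
}

// Two-level expansion so that __LINE__ is expanded before being stringified.
#define STRINGIFY_IMPL(x) #x
#define STRINGIFY(x) STRINGIFY_IMPL(x)

// Adjacent string literals are concatenated into "<file>:<line>".
#define FILE_LINE_ID (__FILE__ ":" STRINGIFY(__LINE__))
```

With this, strip_prefix(__FILE__, COMMON_PATH_PREFIX) degrades to the full path when the build happens outside the expected directory, rather than reading from an arbitrary offset.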

Related

Can you use scanf() to define a variable set with #define?

If I initialize a variable using #define, can I then set its value using scanf()? i.e. does this work:
#define miscellaneous
printf("What value would you like to use for this example: ");
scanf("%g",&miscellaneous);
If I can't do it this way, is it even possible to set the value of a variable defined this way?
#define miscellaneous means that every time you write miscellaneous, you would like the compiler to replace it with nothing. #define is automated copy-paste and in this case it pastes nothing.
So when you write scanf("%g", &miscellaneous); the macro (the #define) causes it to be changed to scanf("%g", &); which is not valid at all. This is not a variable.
If I can't do it this way, is it even possible to set the value of a variable defined this way?
miscellaneous is not a variable at all. What you have defined is a macro. And no, you cannot set the value of a macro at runtime. Macro processing happens before compilation which happens before the program is run.
P.S. Avoid unnecessary use of macros.
You cannot assign a value to a macro. A #define directive is only used by the compiler at compile time; at runtime, nothing defined with #define exists.
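To make the distinction concrete: a macro can supply an initial value, but the thing scanf writes into has to be a real variable. A minimal sketch, where DEFAULT_VALUE and read_value are names invented for this example, and std::sscanf stands in for scanf so that the input can be passed as a string:

```cpp
#include <cstdio>

#define DEFAULT_VALUE 1.5f  // replaced by the text 1.5f before compilation

// Parses a float from `input`; falls back to DEFAULT_VALUE if parsing fails.
inline float read_value(const char *input)
{
    float value = DEFAULT_VALUE;  // a real variable, so it has an address
    if (std::sscanf(input, "%g", &value) != 1) {
        value = DEFAULT_VALUE;
    }
    return value;
}
```

Here &value is valid because value is an object in memory; &DEFAULT_VALUE would expand to &1.5f, which does not compile.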

What is the difference between #define and creating a normal type?

In C/C++, what is the difference between using #define [and #ifndef #endif] to create values, when you can easily do it with an int or std::string [C++] too?
#ifndef MYVAL
#define MYVAL 500
#endif
//C++
cout << MYVAL << endl;
//C
printf("%d\n", MYVAL);
//C++
int MYVAL = 500;
cout << MYVAL << endl;
//C
int MYVAL = 500;
printf("%d\n", MYVAL);
Your assumptions are wrong. #define doesn't create "values", it creates replacement text in your source code. It has basically nothing to do with C or C++ at all.
Before I jump into history, here's a brief understanding of the difference between the two.
Variables are, well, variables. They take up space in the compiled program, and unless you mark them with const (which is a much later development than macros), they're mutable.
Macros, on the other hand, are preprocessed. The compiler never sees the macro. Instead, the macros are handled before compiling. The preprocessor goes through the code, finds every macro, and replaces it verbatim with the macro text. This can be very powerful, somewhat useful, and fairly dangerous (since it's modifying code and never does any checking when doing so).
Also, macros can be set on the command line. You can define as many things as you want when you are compiling, and if your code checks for that macro, it can behave differently.
Macros existed long before C++. They have been useful for many things:
You can use them very easily to represent constant expressions. They can save space, because they don't require any variables (though the constant expression still needs to be compiled in somewhere), and they existed before the const specifier, so they were an easy way to maintain constant "variables" - the preprocessor would replace all instances of MYVAL with 500.
You can do all sorts of functions with them. I actually never made any myself, because the benefits never seemed to outweigh the risks. Macro functions that aren't carefully constructed can easily break your compile. But I have used some predefined macro functions.
#define macros are still used for many things:
include guards (header files usually have a macro defined at the top, and check if it's defined to make sure they don't add it again),
TRUE and FALSE in C,
setting DEBUG mode so that code can behave differently for debugging and release. As one simple example, assert is a macro that behaves differently depending on whether the NDEBUG macro is defined. (If NDEBUG is defined, assert expands to no code at all.)
In the limited case where you're simply using a macro to represent a constant expression, you're right - they're no longer needed for that.
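A small sketch of the include-guard and conditional-compilation uses listed above. MYAPP_DEBUG, MYAPP_BUILD_MODE_H and build_mode are hypothetical names; the macro could be set on the command line (for example with -DMYAPP_DEBUG on GCC or Clang) without touching the source:

```cpp
#ifndef MYAPP_BUILD_MODE_H   // include guard: the body is skipped on a second inclusion
#define MYAPP_BUILD_MODE_H

// The preprocessor keeps exactly one of these two definitions;
// the compiler never sees the other one.
#ifdef MYAPP_DEBUG
inline const char *build_mode() { return "debug"; }
#else
inline const char *build_mode() { return "release"; }
#endif

#endif // MYAPP_BUILD_MODE_H
```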
The difference is that with the macros (#) the preprocessor does a search and replace on that symbol. There is no type checking on the replace.
When you create a variable, it is typed and the compiler will do type checking where you use it.
C/C++ compilers are often thought of as 2-pass compilers. The first pass is the preprocessor which does search and replace on macros. The second pass is the actual compilation where the declared variables are created.
Macros are often used to create more complex expressions so the code doesn't have to be repeated more than once and so the syntax is more compact. They are useful, but also more dangerous due to their 'blind' search and replace nature. In addition, you can't step into a macro with a debugger so they can be harder to troubleshoot.
Also, macros do not obey any scoping rules. #define MYVAL 500 will replace MYVAL with 500 even if it occurs in functions, global scope, class declarations, etc. so you have to be more careful in that way.
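The "blind search and replace" hazard can be seen in a classic example. SQUARE_BAD, square, macro_result and function_result are illustrative names, not taken from the question:

```cpp
// Naive macro: the argument text is pasted in without parentheses.
#define SQUARE_BAD(x) x * x

// Function alternative: the argument is evaluated once, with type checking.
inline int square(int x) { return x * x; }

// Expands to 1 + 2 * 1 + 2, which is 5 because of operator precedence.
inline int macro_result() { return SQUARE_BAD(1 + 2); }

// The function receives the already evaluated value 3 and returns 9.
inline int function_result() { return square(1 + 2); }
```

Wrapping the macro body and its parameter in parentheses fixes this particular case, but arguments with side effects (such as i++) would still be evaluated twice.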
When you #define something, it will be blindly replaced whenever it's found in your code:
#define the_answer 42
/// ...
int the_answer = /* oops! */
There are a few important reasons why you shouldn't use #defines. For your question in particular, I would say that #defines are plain text replacements and you can't limit the scope of a macro, i.e. you can't specify an access specifier or bind it to a namespace, so once you define the macros you can use them anywhere in the files where the define is included.
With 'const' variables you can have them bound to a scope.
These could help : http://www.parashift.com/c++-faq/const-vs-define.html
http://www.parashift.com/c++-faq/preprocessor-is-evil.html
There is a huge difference:
a) #define MYVAL 500
This will create a macro. Each of its occurrences in the source code will be replaced by its raw value by the preprocessor. It completely ignores scope, and you cannot change its value.
b) int MYVAL = 500;
This is a regular variable that obeys scope rules, i. e. when declared inside a function, it cannot be seen outside it, it can be shadowed within another function, etc...
On the other hand, variables cannot be used in preprocessor conditions (#if ... #endif blocks).
One last example:
#define MYVAL 500
int main() {
int MYVAL = 10; // illegal, gets preprocessed as int 500 = 10;
}
Same with variable:
int MYVAL = 500;
int main() {
int MYVAL = 10; // legal, MYVAL now references local variable, ::MYVAL is the global variable
}

Limit scope of #define labels

What is the correct strategy to limit the scope of #define labels and avoid unwarranted token collision?
In the following configuration:
Main.c
# include "Utility_1.h"
# include "Utility_2.h"
# include "Utility_3.h"
VOID Main() { ... }
Utility_1.h
# define ZERO "Zero"
# define ONE "One"
BOOL Utility_1(); // Uses- ZERO:"Zero" & ONE:"One"
Utility_2.h
# define ZERO '0'
# define ONE '1'
BOOL Utility_2(); // Uses- ZERO:'0' & ONE:'1'
Utility_3.h
const UINT ZERO = 0;
const UINT ONE = 1;
BOOL Utility_3(); // Uses- ZERO:0 & ONE:1
Note: Utility_1, Utility_2 and Utility_3 have been written independently.
Error: Macro Redefinition and Token Collision
Also, most worrying: the compiler does not indicate what replaced what in case of token replacement.
{Edit} Note: This is meant to be a generic question, so please do not propose enum or const,
i.e. what to do when I MUST USE #define. Please comment on my proposed solution below.
The correct strategy would be to not use
#define ZERO '0'
#define ONE '1'
at all. If you need constant values, use, in this case, a const char instead, wrapped in a namespace.
There are two types of #define macros:
Those that are needed only in a single file. Let's call them private #defines,
e.g. PI 3.14. In this case:
As per standard practice, place the #define labels only in the implementation (.c) files and not in the header (.h) file.
Those that are needed by multiple files. Let's call these shared #defines,
e.g. EXIT_CODE 0x0BAD. In this case:
Place only such common #define labels in the header (.h) file.
Additionally, try to name labels uniquely with false namespaces or similar conventions, such as prefixing the label with MACRO_, e.g. #define MACRO_PI 3.14, so that the probability of collision is reduced.
#defines don't have scope that corresponds to C++ code; you cannot limit it. They are naive textual replacement macros. Imagine asking "how do I limit the scope when I replace text with grep?"
You should avoid them whenever you possibly can, and favor instead using real C++ typing.
Proper use of macros will relieve this problem almost by itself via naming convention. If the macro is named like an object, it should be an object (and not a macro). Problem solved. If the macro is named like a function (for example a verb), it should be a function.
That applies to literal values, variables, expressions, statements... these should all not be macros. And these are the places that can bite you.
In other cases, when you're using a macro as some kind of syntax helper, your macro name will almost certainly not fit the naming convention of anything else. So the problem is almost gone. But most importantly, macros that NEED to be macros are going to cause compile errors when the naming clashes.
Some options:
Use different capitalization conventions for macros vs. ordinary identifiers.
const UINT Zero = 0;
Fake a namespace by prepending a module name to the macros:
#define UTIL_ZERO '0'
#define UTIL_ONE '1'
Where available (C++), ditch macros altogether and use a real namespace:
namespace util {
const char ZERO = '0';
const char ONE = '1';
}
What is the correct strategy to limit the scope of #define and avoid unwarranted token collisions?
Avoid macros unless they are truly necessary. In C++, constant variables and inline functions can usually be used instead. They have the advantage that they are typed, and can be scoped within a namespace, class, or code block. In C, macros are needed more often, but think hard about alternatives before introducing one.
Use a naming convention that makes it clear which symbols are macros, and which are language-level identifiers. It's common to reserve ALL_CAPITALS names for the exclusive use of macros; if you do that, then macros can only collide with other macros. This also draws the eye towards the parts of the code that are more likely to harbour bugs.
Include a "pseudo-namespace" prefix on each macro name, so that macros from different libraries/modules/whatever, and macros with different purposes, are less likely to collide. So, if you're designing a dodgy library that wants to define a character constant for the digit zero, call it something like DODGY_DIGIT_ZERO. Just ZERO could mean many things, and might well clash with a zero-valued constant defined by a different dodgy library.
What is the correct strategy to limit the scope of #define and avoid unwarranted token collisions?
Some simple rules:
Keep use of preprocessor tokens down to a minimum.
Some organizations go so far down this road as to limit preprocessor symbols to #include guards only. I don't go this far, but it is a good idea to keep preprocessor symbols down to a minimum.
Use enums rather than named integer constants.
Use const static variables rather than named floating point constants.
Use inline functions rather than macro functions.
Use typedefs rather than #defined type names.
Adopt a naming convention that precludes collisions.
For example,
The names of preprocessor symbols must consist of capital letters and underscores only.
No other kinds of symbols can have a name that consists of capital letters and underscores only.
const UINT ZERO = 0; // Programmer not aware of what's inside Utility.h
First off, if the programmer isn't aware of what's inside Utility.h, why did the programmer use that #include statement? Obviously that UINT came from somewhere ...
Secondly, the programmer is asking for trouble by naming a variable ZERO. Leave those all cap names for preprocessor symbols. If you follow the rules, you don't have to know what's inside Utility.h. Simply assume that Utility.h follows the rules. Make that variable's name zero.
I think you really just have to know what it is you're including. That's like trying to include windows.h and then declare a variable named WM_KEYDOWN. If you have collisions, you should either rename your variable, or (somewhat of a hack), #undef it.
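The #undef route mentioned above can be sketched in a single file. The first #define stands in for one that would normally arrive through a header you do not control, and is_zero_digit is a hypothetical helper:

```cpp
// Pretend this part came from a header that defines ZERO as a macro.
#define ZERO '0'

// Code that needs the macro uses it while it is still defined.
inline bool is_zero_digit(char c) { return c == ZERO; }

// After #undef the preprocessor no longer touches the token ZERO...
#undef ZERO

// ...so an ordinary identifier with the same spelling compiles again.
const unsigned int ZERO = 0;
```

This effectively ends the macro's "scope" at the #undef line, at the cost of making any later code that expected the macro fail to compile or silently pick up the variable instead.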
C is a structured programming language, and it has its limitations. That is the very reason why object-oriented systems came about in the first place. In C there seems to be no other way than to agree that the names defined in your header files start with an underscore (_VARIABLE notation), so that there is less chance of them getting overwritten.
in header file
_ZERO 0
in regular file
ZERO 0
I think the correct strategy would be to place #define labels only in the implementation files, i.e. the .c files.
Further all #define could be put separately in yet another file- say: Utility_2_Def.h
(Quite like Microsoft's WinError.h:Error code definitions for the Win32 api functions)
Overheads:
an extra file
an extra #include statement
Gains:
Abstraction: ZERO is 0, '0' or "Zero" depending on where you use it
One standard place to change all static parameters of the whole module
Utility_2.h
BOOL Utility_2();
Utility_2_Def.h
# define ZERO '0'
# define ONE '1'
Utility_2.c
# include "Utility_2.h"
# include "Utility_2_Def.h"
BOOL Utility_2()
{
...
}

Passing the caller __FILE__ __LINE__ to a function without using macro

I'm used to this:
class Db {
void _Commit(const char *file, int line) {
Log("Commit called from %s:%d", file, line);
}
};
#define Commit() _Commit(__FILE__, __LINE__)
but the big problem is that I redefine the word Commit globally, and in a 400k-line application framework that's a problem. And I don't want to use a specific word like DbCommit: I dislike redundancies like db->DbCommit(), and passing the values manually everywhere, as in db->Commit(__FILE__, __LINE__), is even worse.
So, any advice?
So, you're looking to do logging (or something) with file & line info, and you would rather not use macros, right?
At the end of the day, before C++20 it simply can't be done in standard C++. No matter what mechanism you choose -- be that inline functions, templates, default parameters, or something else -- if you don't use a macro, you'll simply end up with the filename & line number of the logging function, rather than the call point.
Use macros. This is one place where they are really not replaceable.
EDIT:
Even the C++ FAQ says that macros are sometimes the lesser of two evils.
EDIT2:
As Nathon says in the comments below, in cases where you do use macros, it's best to be explicit about it. Give your macros macro-y names, like COMMIT() rather than Commit(). This will make it clear to maintainers & debuggers that there's a macro call going on, and it should help in most cases to avoid collisions. Both good things.
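A minimal sketch of that advice. DB_COMMIT and CommitAt are hypothetical names, and the function returns the message as a string instead of calling Log so that the example is self-contained:

```cpp
#include <string>

class Db {
public:
    // Plain member function; file and line are passed explicitly.
    std::string CommitAt(const char *file, int line) {
        return std::string("Commit called from ") + file + ":" + std::to_string(line);
    }
};

// Upper-case, prefixed macro: clearly a macro, and unlikely to collide
// with a member function named Commit elsewhere in the code base.
// __FILE__ and __LINE__ expand at the place where DB_COMMIT is written.
#define DB_COMMIT(db) ((db).CommitAt(__FILE__, __LINE__))
```

With a pointer, the call would be written DB_COMMIT(*db). The word Commit itself is left alone, so other classes can still use it as an ordinary identifier.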
Wait until C++20; then you can use std::source_location:
https://en.cppreference.com/w/cpp/utility/source_location
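A sketch of how std::source_location removes the need for the macro: a default argument is evaluated at the call site, so std::source_location::current() reports the caller's position. The fallback branch uses __builtin_FILE() and __builtin_LINE(), which are compiler extensions (GCC and Clang) rather than standard C++, and is only there so that the sketch also builds without C++20. DB_HAVE_SOURCE_LOCATION and Describe are names invented for this example, and the message is returned rather than logged:

```cpp
#include <string>

#if __cplusplus >= 202002L && defined(__has_include)
#  if __has_include(<source_location>)
#    include <source_location>
#    define DB_HAVE_SOURCE_LOCATION 1
#  endif
#endif

class Db {
public:
#ifdef DB_HAVE_SOURCE_LOCATION
    // C++20: the default argument is evaluated where Commit() is called.
    std::string Commit(std::source_location loc = std::source_location::current()) {
        return Describe(loc.file_name(), static_cast<int>(loc.line()));
    }
#else
    // Assumption: compiler builtins with the same call-site behaviour.
    std::string Commit(const char *file = __builtin_FILE(), int line = __builtin_LINE()) {
        return Describe(file, line);
    }
#endif

private:
    static std::string Describe(const char *file, int line) {
        return std::string("Commit called from ") + file + ":" + std::to_string(line);
    }
};
```

Callers simply write db.Commit() and no identifier is redefined globally.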
You can use a combination of a default parameter and a preprocessor trick to pass the caller's file to a function. It goes as follows:
Function declaration:
static const char *db_caller_file = CALLER_FILE;
class Db {
void _Commit(const char *file = db_caller_file) {
Log("Commit called from %s", file);
}
};
Declare db_caller_file variable in the class header file.
Each translation unit will have its own const char *db_caller_file. It is static, so the copies will not interfere with each other across translation units (no multiple definitions).
Now the CALLER_FILE thing, it is a macro and will be generated from gcc's command line parameters. Actually if using automated Make system, where there is generic rule for source files, it is a lot easier: You can add a rule to define macro with the file's name as a value. For example:
CFLAGS= -MMD -MF $(DEPS_DIR)/$<.d -Wall -D'CALLER_FILE="$<"'
-D defines a macro, before compiling this file.
$< is Make's substitution for the name of the first prerequisite of the rule, which in this case is the name of the source file. So, each translation unit will have its own db_caller_file variable whose value is a string containing the file's name.
The same idea cannot be applied for the caller line, because each call in the same translation unit should have different line numbers.

Removing Unused (Unreferenced) Static Global Variable Constants in C++

Update,
I tried removing the static modifier, and I tried putting them in a namespace (as well as both of those), and none worked.
Hi,
I have a header file with common constants like names and stuff that are automatically included in each project (an example follows). The thing is that they are included in the compiled binary (EXE) whether they are used (referenced) or not. If I use DEFINEs instead, then naturally they are not included if they are not used, but of course consts are better than defines so… I tried Googling it, but the closest thing I could find was a question right here on SO that did not quite help. Matters of i18n aside, how can I keep the ones that are not used out of the binary, while still keeping it easy to use like this?
Thanks.
//COMMON.H:
static const CString s_Company = _T("Acme inc.");
//etc. others
static const CString s_Digits = _T("0123456789");
//TEST.CPP:
#include "common.h"
int main() {
AfxMessageBox(s_Company);
}
//s_Company should be in the final EXE, but s_Digits should not be, but is
The reason they're not stripped from the binary is because they are used: CString is not a POD type, so when you create instances of them at global scope, the compiler has to generate code to call their constructors and destructors.
If you want unused symbols to be stripped, just replace the CStrings with a POD type such as const TCHAR*:
static const TCHAR *s_Company = _T("Acme inc.");
static const TCHAR *s_Digits = _T("0123456789");
Then, the unused constants will be stripped from your binary automatically by the compiler. However, one important thing to keep in mind is that if your strings are used in multiple files, then your binary will have multiple copies of those strings in it, one copy for each translation unit that uses the string. Not even gcc's -fmerge-constants option seems to fix this. If you don't want this to happen, you'll need to use extern declarations in your header files and then put the string constants' values in source files (usually in one file, but that's not required). This also allows you to change the constants without needing to recompile every file that uses them.
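The extern arrangement described above, sketched in one file for brevity. In a real project the first part would go in the header and the second in exactly one source file. Plain const char * is used instead of TCHAR so that the sketch compiles without the Windows headers, and kCompany and kDigits are hypothetical names:

```cpp
// --- would live in common.h: declarations only, no storage ---
extern const char *const kCompany;
extern const char *const kDigits;

// --- would live in exactly one source file, e.g. common.cpp ---
// One definition for the whole program, so only one copy of each string.
const char *const kCompany = "Acme inc.";
const char *const kDigits  = "0123456789";
```

Because the header only declares the names, changing a value means recompiling the one source file that defines it and relinking.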
If you don't need the values as literals, you can declare them in the header file as:
extern const CString s_Company;
Put each one of them in its own source file that defines it as:
const CString s_Company = _T("Acme inc.");
Then you can only link in the constants that you need; the linker will tell you in its error messages if you're missing any! (There are also ways to tell the compiler to not make those symbols public in the built library – assuming you're building a library at all – but they're not standard or portable.)
If there are clear rules defined that make it clear when you want to use what, then you can use preprocessor defines like this:
#if defined CONFIG_COMPANY
static const CString s_Company = _T("Acme inc.");
#elif defined CONFIG_DIGITS
static const CString s_Digits = _T("0123456789");
#endif
You can then define or not define either of these values in a separate configuration header.
By the way, if this file is meant to be included in many source files, then you should probably refrain from declaring these constants as static. They have internal linkage and you'll get a separate copy of them in all of your translation units.
On Windows, especially using MFC, the answer is to put the strings into resources, and load the string from there using LoadResource / LoadString (see http://msdn.microsoft.com/en-us/library/a44fb3wy(VS.80).aspx)
In your case, you have two constant CStrings, so the resource compiler will generate something like
#define IDS_COMPANY 1
#define IDS_DIGITS 2
So in this case you would have a .dll with two strings. Your app would load the .dll (AfxLoadLibrary). Then, when you need a string, your app loads it from the resource .dll.
This is what keeps Windows apps from bloating with global static strings, bitmaps, etc.