How to get GCC/Clang to error on reserved identifiers

In large codebases, it might be impossible for the maintainer/project-owner to review and audit every line of code. In C++, some identifiers are reserved (according to the standard) and prohibiting the definition of reserved identifiers seems reasonable.
According to Identifiers - cppreference.com:
the identifiers with a double underscore anywhere are reserved;
the identifiers that begin with an underscore followed by an uppercase letter are reserved;
the identifiers that begin with an underscore are reserved in the global namespace.
Is there tooling to enforce this guideline (prohibit the use of reserved identifiers)?
For instance, is it possible (through GCC, Clang, clang-tidy, or some other compiler or linter) to get code like the following snippet to error:
auto __destination = cv::Mat3f(1, 1);
cv::cvtColor(source, __destination, cv::ColorConversionCodes::COLOR_HSV2RGB);
auto result = __destination.at<cv::Vec3f>(0, 0);
I know I can either hack around this by testing some regular expressions against the codebase or even solve it decently by parsing the C++ code into a parse tree and then validating all identifiers, but I am interested in knowing whether or not there is already something that does this.
I would be interested in forbidding even the use of already defined variables starting with a double underscore as it could be a hack exploiting a GCC leaky abstraction, for instance.
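As a rough illustration of the three rules quoted above, a name-level check might look like the sketch below. The function name is_reserved_identifier and its at_global_scope parameter are made up for this example; a real tool would still need a parser to find the identifiers and their scopes.

```cpp
#include <cctype>
#include <string_view>

// Hypothetical helper: returns true if `name` matches one of the three
// reserved-identifier rules quoted from cppreference.
// `at_global_scope` tells whether the name is declared in the global namespace.
bool is_reserved_identifier(std::string_view name, bool at_global_scope) {
    // Rule 1: a double underscore anywhere in the name.
    if (name.find("__") != std::string_view::npos) {
        return true;
    }
    if (!name.empty() && name[0] == '_') {
        // Rule 2: a leading underscore followed by an uppercase letter.
        if (name.size() > 1 &&
            std::isupper(static_cast<unsigned char>(name[1]))) {
            return true;
        }
        // Rule 3: any leading underscore in the global namespace.
        if (at_global_scope) {
            return true;
        }
    }
    return false;
}
```

With this sketch, __destination from the snippet above is reported as reserved regardless of scope, while a name like _tmp is only reported when it is declared in the global namespace.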

Clang has had this feature since version 13, via the -Wreserved-identifier flag.
GCC does not implement such a feature yet (as of March 2022).
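A minimal sketch of how the flag could be tried out, assuming Clang 13 or newer. The file name demo.cpp, the macro SHOW_RESERVED_NAMES and all identifiers below are invented for this illustration; -Werror=reserved-identifier is the usual way of promoting a Clang warning to an error.

```cpp
// demo.cpp (hypothetical file name)
//
// With Clang 13 or newer, the declarations in the guarded block should be
// diagnosed when compiling with:
//   clang++ -std=c++17 -DSHOW_RESERVED_NAMES -Werror=reserved-identifier -c demo.cpp
// Without -DSHOW_RESERVED_NAMES only the conforming code below is compiled.

#ifdef SHOW_RESERVED_NAMES
int _Scale = 2;   // underscore followed by an uppercase letter
int _offset = 1;  // leading underscore in the global namespace

int scaled__sum(int n) {    // double underscore inside the name
    int __total = _offset;  // double underscore at the start
    for (int i = 1; i <= n; ++i) {
        __total += i * _Scale;
    }
    return __total;
}
#endif

// Conforming version: none of the names fall under the reserved rules.
int sum_to(int n) {
    int total = 0;
    for (int i = 1; i <= n; ++i) {
        total += i;
    }
    return total;
}
```

Keeping the reserved names behind a macro is only a convenience for the demonstration, so that the default build stays free of reserved identifiers.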

Related

Is there any advantages of defining variable names as __00000001 in C / C++

Are there any advantages to defining variable names as __00000001, __00000002, etc.?
Example:
int __00000001, __00000002 = 0;
for (__00000001 = 0; __00000001 < 10; __00000001++) {
    __00000002 = __00000002 + __00000001;
}
...
Update: this was mentioned in one of my programming classes a few years ago, and I remember the professor saying there are some advantages to using it. However, I cannot recall any more information. Maybe I am wrong.

Those particular variable names are not available for user programs:
All identifiers that begin with an underscore and either an uppercase letter or another underscore are always reserved for any use. (C11, section 7.1.3, paragraph 1)
So that's a big disadvantage.

Is obfuscating the crap out of your code worthwhile? No, not unless your goal is literally to do just that: to make your code as hard to read as possible. Trouble is, you've got to read it too.
Sometimes you'll run into code like this when somebody's "decompiled" a program — variable names do not survive the compilation process so this is sort of the best a decompiler can do when reconstructing a C++ program. Of course it cannot really reconstruct a C++ program; it can only re-spell the flattened logic in C++ syntax. Oh well.
Addressing your example specifically, it's worth noting that all identifiers beginning with two underscores are reserved to the implementation (your compiler and standard library), so your program has undefined behaviour.
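For comparison, here is a sketch of the same loop with ordinary names. The names sum_below, index and sum are arbitrary choices made for this example, not anything from the question.

```cpp
// Same loop as in the question, wrapped in a function and using ordinary
// names instead of the reserved __00000001 / __00000002.
int sum_below(int limit) {
    int sum = 0;
    for (int index = 0; index < limit; index++) {
        sum = sum + index;
    }
    return sum;
}
```

Calling sum_below(10) adds up 0 through 9, which is what the original loop computes, without touching the implementation's reserved names.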

Is it safe to use "yes","no","i","out" as name for variables/enum?

I have read the documentation on the C++ naming rules, and these seem to be usable names.
However, in practice, when I tried to create a variable/enum with a name like iter, yes, no, out, i, Error, etc., Visual Studio strangely uses an italic font for them.
I can only guess that they are reserved for something special, and that the IDE (e.g. the refactoring/rename process) might act strangely if I use such names.
Is it safe to use those names in practice? Am I just panicking too much?
Sorry if this is too much of a newbie question or an inappropriate one.
I have wondered about this for a few weeks but was too afraid to ask.

These names are valid and will not cause any "harm"; the standard only says:
Each name that contains a double underscore (_ _) or begins with an underscore followed by an uppercase letter (2.11) is reserved to the implementation for any use.
Each name that begins with an underscore is reserved to the implementation for use as a name in the global namespace.
Which means that all your names are fine to use in user-code. Visual Studio might just have a thing for these names as i and iter are usually used in looping.
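A small sketch that uses the names from the question as ordinary identifiers. The enum Answer and the functions join and is_empty are invented for this example.

```cpp
#include <string>
#include <vector>

// yes and no are plain enumerator names here.
enum class Answer { yes, no };

// Joins the items with commas; i, iter and out are ordinary local names.
std::string join(const std::vector<std::string>& items) {
    std::string out;
    int i = 0;
    for (auto iter = items.begin(); iter != items.end(); ++iter, ++i) {
        if (i > 0) {
            out += ",";
        }
        out += *iter;
    }
    return out;
}

// Reports whether the text is empty, using the enumerators above.
Answer is_empty(const std::string& text) {
    return text.empty() ? Answer::yes : Answer::no;
}
```

None of these names contains a double underscore or starts with an underscore, so none of the quoted rules applies to them.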

These names are not reserved in standard C++, as explained by Rick Astley. An implementation may choose to accept additional reserved words to provide language extensions, such as ref class in C++/CLI. In some cases, such as with ref class, where ref is a contextual keyword, these extensions only make otherwise ill-formed programs well-formed in the scope of the extended language. In other cases, an otherwise well-formed program may change its meaning or become ill-formed. In the former case, the implementation is still conforming to the C++ standard, as long as it issues all mandatory diagnostics; in the latter case, it is certainly not conforming.
It is considered good practice to make the latter kind of extensions optional e.g. using a command line option, so that the implementation still has a mode in which it is fully standards compliant. My immediate guess is that VC++ in fact does allow you to write well-formed programs containing yes, no, i, iter which will behave as required by the standard (implementation bugs notwithstanding).
The IDE is a different beast, though. It is considered to be outside of the scope of the C++ standard, and might discourage or even stop you from writing perfectly well-formed code. That would still be a quality of implementation issue, or an issue of customer satisfaction, if you will.

(v) is actually (*&v) since when?

Could C++ standards gurus please enlighten me:
Since which C++ standard version has this statement failed because (v) seems to be equivalent to (*&v)?
I.e. for example the code:
#define DEC(V) ( ((V)>0)? ((V)-=1) : 0 )
...{...
register int v=1;
int r = DEC(v) ;
...}...
This now produces warnings under -std=c++17 like:
cannot take address of register variable
left hand side of operand must be lvalue
Many C macros enclose ALL macro parameters in parentheses, of which the above is meant only to be a representative example.
The actual macros that produce warnings are for instance
the RTA_* macros in /usr/include/linux/rtnetlink.h.
Short of not using/redefining these macros in C++, is there any workaround?

If you look at the revision summary of the latest C++1z draft, you'd see this in [diff.cpp14.dcl.dcl]
[dcl.stc]
Change: Removal of register storage-class-specifier.
Rationale: Enable repurposing of deprecated keyword in future
revisions of this International Standard.
Effect on original feature: A valid C++ 2014 declaration utilizing the register
storage-class-specifier is ill-formed in this International Standard.
The specifier can simply be removed to retain the original meaning.
The warning may be due to that.
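Following the quoted advice that the specifier can simply be removed, a sketch of the question's example without register might look like this. The function name decrement_once is made up for the illustration; the macro is the one from the question.

```cpp
// The macro from the question, unchanged apart from spacing.
#define DEC(V) ( ((V) > 0) ? ((V) -= 1) : 0 )

// Applies DEC once to a local copy and returns the macro's result.
// `v` is a plain int here instead of `register int v`.
int decrement_once(int v) {
    int r = DEC(v);
    return r;
}
```

The parenthesized macro parameters are not a problem in themselves; in this sketch the only change from the question is dropping the register keyword.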

register is no longer a storage class specifier, so you should remove it. Compilers may not be issuing the right errors or warnings, but your code should not use register to begin with.
The following quote from the standard tells people what to do about register in their code. You probably have an old version of that file.
C.1.6 Clause 10: declarations [diff.dcl]
Change: In C++, register is not a storage class specifier.
Rationale: The storage class specifier had no effect in C++.
Effect on original feature: Deletion of semantically well-defined feature.
Difficulty of converting: Syntactic transformation.
How widely used: Common.

Your worry is unwarranted since the file in question does not actually contain the register keyword:
grep "register" /usr/include/linux/rtnetlink.h
outputs nothing. Either way, you shouldn't be receiving the warning since:
System headers don't emit warnings by default, at least in GCC.
It isn't wise to try to compile a file that belongs to a systems project like the Linux kernel in C++ mode, as there may be subtle and nasty breaking changes.
Just include the file normally or link the C code to your C++ binary. If you really are getting a warning that should normally be suppressed, report a bug to your compiler vendor.

Is it possible to disable GCC warning about missing underscore in user defined literal?

#include <cstddef>
#include <iostream>
void operator"" test( const char* str, std::size_t sz )
{
    std::cout<<str<<" world";
}
int main()
{
    "hello"test;
    return 0;
}
In GCC 4.7, this generates "warning: literal operator suffixes not preceded by '_' are reserved for future standardization [enabled by default]"
I understand why this warning is generated, but GCC says "enabled by default".
Is it possible to disable this warning without just disabling all warnings via the -w flag?

After reading several comments to this question, I reviewed the C++ 11 Standard (non-final draft N3337).
When I said "I understand why this warning is generated" I was mistaken.
I assumed that an underscore was not technically required by the standard, but just a recommendation (hence the warning rather than an error).
But as Nicol Bolas has brought up, the standard uses the following language when speaking about user defined literals:
"Literal suffix identifiers that do not start with an underscore are reserved for future standardization." usrlit.suffix
"Some literal suffix identifiers are reserved for future standardization; see [usrlit.suffix]. A declaration whose literal-operator-id uses such a literal suffix identifier is ill-formed, no diagnostic required." over.literal
This is similar to the language used for reserved identifiers and the "alternative representations" such as "and", "or", "not". I think this makes it pretty clear that this shouldn't actually be a warning in the first place, but an error.
This may not be the direct answer to the question of "is it possible to disable", but it is answer enough for me.

For what it is worth, -Wno-literal-suffix silences this warning since GCC 7 (see it live on godbolt), i.e. this option also turns off warnings for user-defined literal operators without a leading underscore:
-Wliteral-suffix (C++ and Objective-C++ only)
...
Additionally, warn when a user-defined literal operator is declared with a literal suffix identifier that doesn’t
begin with an underscore. Literal suffix identifiers that don’t begin
with an underscore are reserved for future standardization.
However, one should stick to the advice in @cmeub's answer and avoid literal suffix identifiers without an underscore, as they make the program ill-formed.
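A minimal sketch of the conforming alternative: the same kind of literal operator with a suffix that starts with an underscore. The suffix name _test and the returned string are choices made for this example only.

```cpp
#include <cstddef>
#include <string>

// The suffix starts with an underscore, so it is not reserved for future
// standardization. It is written without a space after "" on purpose.
std::string operator""_test(const char* str, std::size_t length) {
    return std::string(str, length) + " world";
}
```

With this declaration, an expression such as "hello"_test yields a std::string, and the warning discussed above should not be needed in the first place.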

strcmpi renamed to _strcmpi?

In MSVC++, there's a function strcmpi for case-insensitive C-string comparisons.
When you try and use it, it goes,
This POSIX function is deprecated beginning in Visual C++ 2005. Use the ISO C++ conformant _stricmp instead.
What I don't see is why ISO does not want MSVC++ to use strcmpi, why _stricmp is the preferred way, why they would bother to rename the function, and how a function beginning with an underscore is ISO conformant. I know there must be a reason for all this, and I suspect it's because strcmpi is non-standard, and perhaps ISO wants non-standard extensions to begin with an underscore?

ISO C reserves certain identifiers for future expansion, including function names that begin with "str" followed by a lowercase letter.

IMNSHO, this is Microsoft's way of saying "Do not put Unix software on Windows machines". There are several frustrating aspects to the problem:
strcmpi() is not a POSIX function - the relevant functions are defined in <strings.h> and are called strcasecmp() etc.
Even if you explicitly request support for POSIX functions, Microsoft thinks that you may not use the POSIX names but must prefix them with the wretched underscore.
AFAIK, there isn't a way of overriding the MSVC compiler's view on the issue.
That said, the GCC tool chain gets a bit stroppy about some functions - mktemp() et al. However, it does compile and link successfully, despite the warnings (which are justified).
I note that MSVC also has a bee in its bonnet about snprintf() et al. If their function conformed to the C99 standard (along with the rest of the compiler), then there would never be any risk of an overflow - the standard requires null termination, contrary to the claims of Microsoft.
I haven't got a really good solution to this problem, and I'm not sure there is one. One possibility is to create a header (or set of headers) to map all the actual POSIX names to Microsoft's misinterpretation of them. Another is to create a library of trivial functions with the correct POSIX names that each call down onto the Microsoft version of the name, giving you a massive collection of four-line functions: the declarator line, an open brace, a return statement that invokes the Microsoft variant of the POSIX function name, and a close brace.
It's funny how the Microsoft API calls, which also pollute the user's name space, are not deprecated or renamed.
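The wrapper idea described above could be sketched as follows for the case-insensitive comparison. The function name compare_ignore_case is hypothetical, and the sketch assumes that _MSC_VER identifies the Microsoft compiler and that every other target provides the POSIX strcasecmp in <strings.h>.

```cpp
#if defined(_MSC_VER)
#include <string.h>   // _stricmp (Microsoft CRT)
#else
#include <strings.h>  // strcasecmp (POSIX)
#endif

// Hypothetical portable wrapper: returns a negative value, zero or a
// positive value like strcmp, but ignores letter case.
int compare_ignore_case(const char* lhs, const char* rhs) {
#if defined(_MSC_VER)
    return _stricmp(lhs, rhs);
#else
    return strcasecmp(lhs, rhs);
#endif
}
```

Choosing a neutral name for the wrapper, rather than defining strcasecmp or strcmpi yourself, avoids colliding with whatever the platform headers already declare.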

Names beginning with an underscore and a lowercase letter are reserved by the C++ Standard for the C++ implementation if they are declared in the global namespace. This stops them from clashing with similar names in your own code, which must not use this naming convention.

strcmpi goes away altogether in Visual C++ 2008, so you should definitely heed the deprecation if you ever intend to upgrade.
The _ doesn't make the function ISO standard; it's just that functions beginning with _ are safer to add as the language evolves, because that's one of the parts of the namespace reserved for the implementation to use.
According to Microsoft's documentation for _stricmp, it sounds like strcmpi has some practices that result in some unintuitive orderings (including normalizing to lower case instead of simply treating case as irrelevant). Sounds like _stricmp takes more pains to do what one would naturally expect.
