Undefined behaviour and C++ language stability

I was told that #defining language keywords is undefined behaviour. How does this play with the following facts:
1. Users can #define names that are not keywords in their code.
2. The language can get new keywords that used not to be keywords over time.
3. Users should be able to compile old code with new compilers and get a compile-time diagnostic rather than undefined behaviour if any construct they used has ceased to be supported.
#3 is obviously my assumption, but I consider it essential: newer compilers tend to be better, and "knowing the whole extent of the current law" is a theoretical legal assumption that, I hope, does not apply to software developers (if the compiler assumed otherwise, it could replace any compile-time error in the code with whatever undefined behaviour it pleases).

#3 is not the case and has never been guaranteed by anyone. While the C++ committee does not like to create backwards incompatibilities that don't noisily break, they still sometimes do it.
You shouldn't expect a thing that you were not told to expect.
And yes, it's possible that adding a new keyword breaks silently for those who #defined that keyword. This is one of the many reasons why users are told to use ALL_CAPS for their #define names, since keywords will almost certainly not be in ALL_CAPS.
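A minimal sketch of that naming advice, assuming a pre-C++11 code base that wanted a null-pointer macro. The names MYLIB_NULL and first_or_default are hypothetical and exist only for illustration.

```cpp
// Risky (left commented out): before C++11, `nullptr` was an ordinary name,
// so old code could define it as a macro. Since C++11 it is a keyword, and
// defining a macro with that name is no longer permitted.
// #define nullptr 0

// Safer: an ALL_CAPS, project-prefixed name cannot clash with a lower-case
// keyword added by a later standard. MYLIB_NULL is a hypothetical name.
#define MYLIB_NULL 0

// Returns the pointed-to value, or -1 when the pointer is null.
int first_or_default(const int* p) {
    return (p != MYLIB_NULL) ? *p : -1;
}
```

The commented-out line is the kind of definition that could start misbehaving silently once the name becomes a keyword; the ALL_CAPS variant sidesteps the collision rather than detecting it.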

C++ standards are not fully backwards-compatible. That doesn't hold you back from using a modern compiler.
If you want to compile old code with a new compiler, you set the C++ version explicitly with a flag. With GCC, for example, the default C++ version depends on the compiler version. To set it explicitly, use the -std option, e.g. -std=c++11 or -std=c++0x.
Then any keywords introduced after this version will not be in effect so you will not run into undefined behavior. If you would like to use the newer language features on the other hand, you need to go through the documented newly introduced keywords and some subtleties that changed and review your code accordingly.
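One way to check which standard a translation unit is actually being compiled as is the predefined __cplusplus macro, whose value follows the -std setting. The sketch below assumes GCC or Clang (MSVC reports 199711L unless /Zc:__cplusplus is passed); the function name language_standard_year is made up for illustration.

```cpp
// Maps the predefined __cplusplus macro to the year of the newest standard
// level (up to C++20) selected for this translation unit.
// For example, g++ -std=c++11 sets __cplusplus to 201103L.
int language_standard_year() {
#if __cplusplus >= 202002L
    return 2020;
#elif __cplusplus >= 201703L
    return 2017;
#elif __cplusplus >= 201402L
    return 2014;
#elif __cplusplus >= 201103L
    return 2011;
#else
    return 1998;  // C++98 and C++03 both report 199711L
#endif
}
```

The same #if tests can guard code that only makes sense under one standard, which is one way to keep a file compiling under several -std settings.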

#3 Users should be able to compile old code with new compilers and get a compile-time diagnostic rather than undefined behaviour if any construct they used has ceased to be supported.
This is a good thing to strive for, but there is technically no such guarantee.
#2 The language can get new keywords that used not to be keywords over time.
Yes. Because of the desire to keep existing programs working, the standards committee is usually against introducing new keywords, but it does happen regardless.
For those cases where a new keyword is introduced, there is a trick to avoid name collisions (with high probability) with your macros: use upper case. None of the C++ keywords use upper case (the closest exception is the _Pragma operator), and this is likely to stay true in future. For identifiers other than macros, using a keyword makes the program ill-formed, so you are guaranteed a diagnostic.

Related

Are compilers allowed to support a feature that is removed in the standard?

In the first paragraph cppreference.com clearly states that throw(T1, ..., Tn) is removed in C++17.
It confuses me that some compilers support throw(T1, ..., Tn) in C++17 mode (see demo).
MSVC supports it by default, but you can turn on a warning for it, see C5040. It can be turned into an error with /we5040.
Clang reports it as an error by default, but the error can be turned off with -Wno-dynamic-exception-spec.
GCC leaves you with no choice: it's an error.
Are compilers allowed to support a feature that is removed in the standard? For what purpose?
Or is this just a compiler extension, like variable-length arrays in GCC, e.g. void foo(int size) { char a[size]; } (see demo)?
Are compilers allowed to support a feature that is removed in the standard?
The standard doesn't allow that. AFAIK in general it doesn't give any special treatment to features that used to be in the language (doesn't separate them from non-existent features).
If a compiler doesn't diagnose this error (i.e. doesn't give an error or a warning) with a specific configuration (i.e. specific flags), then it doesn't conform to the standard in that configuration.
For what purpose?
Backward compatibility (what else could it be). More specifically, it lets you use both old and new features in the same translation unit.
This can be useful if you're using a library that uses a removed feature in its headers, but want to use the new language features in your own code.
Or if you want to use the removed feature in your own code along with the new features, for some reason.
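Where a compiler does not offer that leniency, a header can sometimes be kept usable across standards by switching on __cplusplus. This is only a sketch: LEGACY_THROWS and parse_digit are hypothetical names, and it assumes the compiler reports __cplusplus accurately.

```cpp
#include <stdexcept>

// LEGACY_THROWS is a hypothetical compatibility macro, not a standard facility.
#if __cplusplus >= 201703L
// C++17 and later: throw(T) no longer exists; noexcept(false) is the closest
// replacement ("this function may throw").
#define LEGACY_THROWS(type) noexcept(false)
#else
// Earlier standards: keep the original dynamic exception specification.
#define LEGACY_THROWS(type) throw(type)
#endif

// Converts a decimal digit character to its value; throws on anything else.
int parse_digit(char c) LEGACY_THROWS(std::invalid_argument) {
    if (c < '0' || c > '9') {
        throw std::invalid_argument("not a digit");
    }
    return c - '0';
}
```

The two branches are not exactly equivalent: throw(T) also restricted which exception types could escape, which noexcept(false) does not.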
Note that absolute conformance to the standard is not practically possible to achieve.
Some compiler vendors care about conformance more than the others. Microsoft tends to care less about it (or at least used to, they've been working on that).
There is no single answer for this.
Some things outside the Standard can be treated as pure enhancements. Some of these enhancements are suggested by the Standard ("It is implementation-defined whether X"), some are not even mentioned at all (#include <windows.h>).
For other things, the Standard does require that a compiler flags the violation of the Standard. But the Standard doesn't talk about errors or warnings. Instead it says "Diagnostic Required", which is understood to mean either an error or a warning. And in other cases it even says "No Diagnostic Required" (NDR), which means the compiler is not obliged to flag non-standard code.
So depending on the removed feature, it may or may not require a diagnostic. And if it does require a diagnostic, you can often tell the compiler that you're not interested in that particular diagnostic anyway.
Are compilers allowed to support a feature that is removed in the standard? For what purpose?
Compilers can do whatever they want. The C++ standard dictates rules for the C++ language, and while they do consult compiler vendors to ensure its rules are implementable, the vendors themselves will do what they feel is best for them and their users. Compilers have many non-standard features that developers use all the time. And I don't think any of these compilers fully adhere to the standard by default.
That said, I wouldn't call a compiler+settings "C++17 compliant" if they allowed non-C++17 code or rejected valid C++17 code (as dictated by the standard). Most compilers have settings that can be set if full compliance is desired.
If you want to be pedantic, though, MSVC isn't even C++11 compliant because of its deficient preprocessor. The standard isn't everything.

Attributes in C++ and their use

The most recent C++ feature (and a fairly modern one) that I found out about is attributes. They seem quite useful for signalling the compiler, but apart from that, what other more specific uses can attributes have? How can custom attributes be created and used, and what is the main idea behind C++ attributes? If the topic is too broad, I'm particularly interested in attributes on functions.
By attributes I mean these: https://en.cppreference.com/w/cpp/language/attributes
Attributes (that is, attributes as a feature of the C++ language, not compiler-specific __declspec or __attribute__ attributes) are kind of a C++ hack. They're a solution to meta-problems with the evolution of the C++ language.
For example, keywords. There's a lot of C++ code out there, and any new version of the language that adds new keywords needs to avoid breaking any code that might use that keyword for an identifier. So any language feature that might want to spell something out explicitly has a really high bar for passing standardization. That is, it had better be worth it.
But attributes are cheap; they don't conflict with any existing code. Consider wanting to declare that a function does not return normally (i.e. always throws an exception or calls std::terminate or whatever). That feature doesn't really govern the behavior of the program; it's more an indicator to the compiler/user about how that function is going to behave. So it's not a feature worth breaking the code of someone who just so happened to name a variable noreturn.
But you can have the [[noreturn]] attribute, since that won't break anybody's code.
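A small sketch of how that looks in practice. The function names fail and checked_div are invented for this example; the variable named noreturn is there only to show that the identifier stays usable.

```cpp
#include <stdexcept>

// Tells the compiler (and the reader) that control never returns to the caller.
[[noreturn]] void fail(const char* message) {
    throw std::runtime_error(message);
}

// Because noreturn is an attribute and not a keyword, existing code that
// uses it as an ordinary identifier keeps compiling.
int noreturn = 0;

// Integer division that reports division by zero through fail().
int checked_div(int a, int b) {
    if (b == 0) {
        fail("division by zero");
    }
    return a / b;
}
```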
Another meta-problem attributes solve is "fixing" things that maybe weren't good ideas in the first place. For example, case labels in switch statements automatically fall-through to the next one if you don't explicitly break. While fall-through behavior is useful, it's the wrong behavior to have as a default, since 90+% of the time, you intend to break.
But you can't go and change how case labels work and introduce a fallthrough keyword. That would break everybody's code that already uses the implicit fall-through behavior.
But you can add a [[fallthrough]] attribute. It doesn't mean anything within the language, but if a compiler sees it, then it can know that you meant to fall-through to the next label. Furthermore, you can now turn on compiler warnings about fall-through behavior, so that any fall-through that happens without the [[fallthrough]] attribute will give you a warning. And you can even choose to make that warning an error, thus effectively "fixing" the language. For you.
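As an illustration, here is a sketch that assumes C++17 or later; privilege_count and its access levels are hypothetical. With GCC or Clang, -Wimplicit-fallthrough enables the warning described above and -Werror=implicit-fallthrough turns it into an error.

```cpp
// Counts how many privileges a (hypothetical) access level grants.
// Level 2 deliberately includes everything level 1 has.
int privilege_count(int level) {
    int count = 0;
    switch (level) {
    case 2:
        ++count;          // level-2-only privilege
        [[fallthrough]];  // intentional: level 2 also gets level 1's privilege
    case 1:
        ++count;
        break;
    default:
        break;
    }
    return count;
}
```

Deleting the [[fallthrough]]; line does not change what the function computes; it only makes the fall-through look accidental to a compiler that has the warning enabled.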
Most C++ attributes are like that: tags for code that are too trivial/non-functional to bother burning a keyword for, or indicating something useful about the code to the compiler that isn't really part of the language.

C++ code compiles differently on different OS

I was wondering why C++ code compiles differently on different versions of an OS. For example, the same code compiles on one OS with no warnings at all, but on a different OS it produces warnings or errors.
So why does this happen? Is it the difference between GCC versions, or what actually makes C++ code behave differently when it is compiled on two different OS versions such as Ubuntu 14 and Ubuntu 16? I am just trying to understand how the compilation of C++ code depends on the OS.
C++ as a language is defined by its standard. The standard is an enormous, lawyer-lingo document that defines the language's syntax, rules, standard library, and some guidelines for how compilers should correctly process source code. Compilers, the bridge between the abstract language and real, executable programs, are implemented by different vendors or organizations, and should adhere to that standard as closely as possible. In practice, their correctness varies[1].
Many compiler errors are part of the standard (diagnostics in standardese), and so should in principle be essentially the same across compilers[2]. Compiler warnings are generally less formal, and are often ways that compiler vendors try to help you catch common programming errors that aren't technically ill-formed programs. A program is ill-formed according to the standard when it violates the language's syntax rules or its diagnosable semantic rules, meaning that it is not a valid C++ program. Compilers are required by the standard to issue a diagnostic for an ill-formed program, except where a rule is explicitly marked "no diagnostic required".
There are, however, more subtle ways that programs can be incorrect or non-portable, for example by relying on what the standard refers to as undefined behavior (UB) and implementation-defined behavior. These are situations where the standard doesn't specify how a compiler should translate source code into a program: for implementation-defined behavior each vendor chooses and documents a behavior, while for UB compiler vendors are allowed to proceed however they please. While many compilers are likely to produce code that does approximately what you expect it to, invoking undefined behavior in a program is generally a very bad idea because there's no guarantee of any kind how your program will behave. Code with UB that compiles quietly and passes tests on one compiler may fail tests or fail to compile altogether, or encounter a bug at the worst possible time, on a different compiler. The situation gets hairy too if you're using compiler-specific language extensions.
When faced with potential UB, some compilers may offer very helpful advice and others may be misleadingly silent. The best practice would be to be familiar with causes of UB by learning C++ from a good source and reading documentation carefully, both C++ language documentation and that of any libraries you may be using.
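Signed integer overflow is a common example of UB that often compiles without any warning. The sketch below shows one way to avoid it by checking before adding; the helper names add_would_overflow and saturating_add are hypothetical.

```cpp
#include <climits>

// Returns true if a + b would overflow int. Evaluating a + b itself in that
// case would be undefined behavior, so the check never performs the sum.
bool add_would_overflow(int a, int b) {
    if (b > 0) return a > INT_MAX - b;
    if (b < 0) return a < INT_MIN - b;
    return false;
}

// Adds two ints, saturating at INT_MAX / INT_MIN instead of overflowing.
int saturating_add(int a, int b) {
    if (add_would_overflow(a, b)) return b > 0 ? INT_MAX : INT_MIN;
    return a + b;
}
```

By contrast, the value of INT_MAX itself is implementation-defined: it can differ between platforms, but each implementation documents it and the program's behavior stays predictable.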
[1] Take a look at the 'Standard conformance' columns of the list of C++ compilers at https://en.wikipedia.org/wiki/List_of_compilers#C++_compilers
[2] A comparison of error messages and warnings from three very popular compilers: https://easyaspi314.github.io/gcc-vs-clang.html

In standard Fortran grammar, can we mark a variable as intentionally unused?

To find all the (possible) problems in a program, it is best to turn on all of the compiler's diagnostic options. The compiler will then always tell us something like "remark #7712: This variable has not been used.".
In many cases, in order to conform to a required interface, I have to keep some input and output arguments without using them. At the same time, I want to keep the diagnostics turned on.
Can we do something by standard grammar to tell the compiler we really mean to do it and do not report any warning about it?
The Fortran standard sets out the rules for correct programs and requires that compilers identify any breach of those rules. Such breaches, which cause compilation to fail, are generally known as errors.
However, programmers make many mistakes which are not errors and which a (Fortran) compiler is not required to spot. Some compilers provide additional diagnostic capabilities, such as identifying unused variables, which go beyond what the standard requires. The compilers raise what are generally known as warnings in these cases. This type of mistake does not cause compilation to fail. Compilers also generally provide some means to determine which warnings are raised during compilation, so that you can switch off and on this diagnostic capability. For details of these capabilities refer to your compiler's documentation.
The standard is entirely silent on this type of mistake so, if I understand the question correctly, there is nothing "by standard grammar to tell the compiler we really mean to do it and do not report any warning about it".
The simplest thing (besides, of course, not declaring things you don't use) may be to simply use the variables:
real x
x=huge(x) !reminder x is declared but not used.
This at least makes gfortran happy that you have "used" the variable.

Deprecation of the static keyword... no more?

In C++ it is possible to use the static keyword within a translation unit to affect the visibility of a symbol (either variable or function declaration).
In n3092, this was deprecated:
Annex D.2 [depr.static]
The use of the static keyword is deprecated when declaring objects in namespace scope (see 3.3.6).
In n3225, this has been removed.
The only article I could find is somewhat informal.
It does underline though, that for compatibility with C (and the ability to compile C-programs as C++) the deprecation is annoying. However, compiling a C program directly as C++ can be a frustrating experience already, so I am unsure if it warrants consideration.
Does anyone know why it was changed?
In C++ Standard Core Language Defect Reports and Accepted Issues, Revision 94 under 1012. Undeprecating static they note:
Although 7.3.1.1 [namespace.unnamed] states that the use of the static keyword for declaring variables in namespace scope is deprecated because the unnamed namespace provides a superior alternative, it is unlikely that the feature will be removed at any point in the foreseeable future.
Basically this is saying that the deprecation of static doesn't really make sense. It won't ever be removed from C++. It's still useful because you don't need the boilerplate code you would need with unnamed namespaces if you just want to declare a function or object with internal linkage.
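For comparison, here is a sketch of the two spellings side by side. The names hits, misses, record_hit and record_miss are invented for the example.

```cpp
// Both counters below have internal linkage: they are invisible to other
// translation units, so another file may define its own `hits` or `misses`
// without clashing.

static int hits = 0;                         // internal linkage via static
static int record_hit() { return ++hits; }

namespace {                                  // internal linkage via unnamed namespace
int misses = 0;
int record_miss() { return ++misses; }
}  // namespace
```

For a single declaration the static form is one keyword, while the unnamed namespace needs an opening and a closing line; the namespace form pays off mainly when many declarations, or types, need to be kept local.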
I will try to answer your question, although it is an old question, and it does not look very important (it really is not very important in itself), and it has received quite good answers already. The reason I want to answer it is that it relates to fundamental issues of standard evolution and language design when the language is based on an existing language: when should language features be deprecated, removed, or changed in incompatible ways?
In C++ it is possible to use the static keyword within a translation unit to affect the visibility of a symbol (either variable or function declaration).
The linkage actually.
In n3092, this was deprecated:
Deprecation indicates:
The intent to remove some feature in the future; this does not mean that deprecated features will be removed in the next standard revision, or that they must be removed "soon", or at all. And non-deprecated features may be removed in the next standard revision.
A formal attempt to discourage its use.
The latter point is important. Although there is never a formal promise that your program won't be broken, sometimes silently, by the next standard, the committee should try to avoid breaking "reasonable" code. Deprecation should tell programmers that it is unreasonable to depend on some feature.
It does underline though, that for compatibility with C (and the ability to compile C-programs as C++) the deprecation is annoying. However, compiling a C program directly as C++ can be a frustrating experience already, so I am unsure if it warrants consideration.
It is very important to preserve a C/C++ common subset, especially for header files. Of course, static global declarations are declarations of symbols with internal linkage, and this is not very useful in a header file.
But the issue is never just compatibility with C, it's compatibility with existing C++: there are tons of existing valid C++ programs that use static global declarations. This code is not just formally legal, it is sound, as it uses a well-defined language feature the way it is intended to be used.
Just because there is now a "better way" (according to some) to do something does not make the programs written the old way "bad" or "unreasonable". The ability to use the static keyword on declarations of objects and functions at global scope is well understood in both the C and C++ communities, and is most often used correctly.
In a similar vein, I am not going to change C-style casts to double to static_cast<double> just because "C-style casts are bad", as static_cast<double> adds zero information and zero safety.
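To make that comparison concrete, a small sketch with hypothetical function names; for an int-to-double conversion both spellings do the same thing.

```cpp
// Both functions perform the same int-to-double conversion before dividing,
// so the division is done in floating point rather than truncating.
double mean_c_style(int sum, int count) {
    return (double)sum / count;
}

double mean_static_cast(int sum, int count) {
    return static_cast<double>(sum) / count;
}
```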
The idea that whenever a new way to do something is invented, all programmers would rush to rewrite their existing well-defined working code is just crazy. If you want to remove all the inherited C ugliness and problems, you don't change C++, you invent a new programming language. Half-removing one use of static hardly makes C++ less C-ugly.
Code changes need a justification, and "old is bad" is never a justification for code changes.
Breaking language changes need a very strong justification. Making the language very slightly simpler is never a justification for a breaking change.
The reasons given why static is bad are just remarkably weak, and it isn't even clear why object and function declarations were not deprecated together - giving them different treatment hardly makes C++ simpler or more orthogonal.
So, really, it is a sad story. Not because of the practical consequences it had: it had exactly zero practical consequences. But because it shows a clear lack of common sense from the ISO committee.
Deprecated or not, removing this language feature would break existing code and annoy people.
The whole static deprecation thing was just wishful thinking along the lines of "anonymous namespaces are better than static" and "references are better than pointers". Lol.