Is relying on GCC's/LLVM's `-fexceptions` technically undefined behavior? - c++

As far as I can tell, compiler extensions may be considered undefined rather than implementation-defined. I am guessing (but do not know for sure) that this applies to the C++ standard as well as the C standard.
Both GCC and LLVM offer an -fexceptions feature that appears to ensure that throwing an exception from C++ code through C code and then catching it in C++ code will behave as expected, i.e., unwinding the stack frames in both C and C++ and invoking the destructors for the C++ locals. (Note: I understand that resources allocated in the C stack frames being unwound will not be freed. That is not part of my question.) Here is the relevant text from the GCC documentation:
If you do not specify this option, GCC enables it by default for languages like C++ that normally require exception handling, and disables it for languages like C that do not normally require it. However, you may need to enable this option when compiling C code that needs to interoperate properly with exception handlers written in C++.
However, I cannot find anything in the C or C++ standards indicating how stack-unwinding should be expected to interact with a stack containing frames compiled from different source languages. The C++ standard appears to only mention unwinding in 15.2, except.ctor, which simply explains the rules regarding destroying local objects when an exception is thrown.
Therefore, is passing an exception through C code undefined behavior, even using a language extension designed to make it work in a well-defined way? Is using such an implementation-provided extension "wrong"?
For context, this question is inspired by two fairly lengthy discussions in the Rust community about stack-unwinding through C code:
Rust internals thread
GitHub issue

Relying on Implementation Documentation
The essential question here is whether we can rely on specifications provided by a C or C++ implementation. (Since we are dealing with a situation with mixed C and C++ code, I will refer to this combined implementation as a single implementation.)
In fact, we must rely on implementation documentation. The C and C++ standards do not apply unless and until an implementation asserts that it conforms (at least in part) to the standards. The standards have no power of law; they do not apply to any person or undertaking until somebody decides to adopt them. (The C 2018 Foreword refers to an ISO statement explaining the standards are voluntary.)
If an implementation tells you it conforms to the C and C++ standards, and it also tells you it supports throwing C++ exceptions through C code, there is no reason to believe one and not the other. If you accept the implementation’s documentation, then it both conforms to the language standard and supports throwing exceptions through C code. If you do not accept the implementation’s documentation, then there is no reason to expect conformance to the language standards. (This is a general view, neglecting instances where apparent bugs give us reason to doubt specific behaviors, for example.)
If you ask whether passing an exception through C code is “undefined” in the sense used in the C or C++ standards, the answer is yes. But those standards are only discussing what they define. Their use of “undefined” does not prohibit anybody else from defining behavior. In fact, if you are using an implementation’s documentation, you have a definition for the behavior. The C and C++ standards do not undo, negate, or nullify definitions made by other documents:
Where the C or C++ standard says any behavior is undefined that only means the behavior is undefined within the context of the C or C++ standard.
Any other specification a programmer chooses to use may define additional behavior that is not defined by the C or C++ standard. The C and C++ standards do not prohibit this.
Example
As an example, some of the documents one might rely on to specify the behavior of a commercial software product include:
The C standard.
The C++ standard.
The assembler manual.
The compiler documentation.
Apple’s Developer Tools documentation, including behaviors of Xcode, the linker, and other tools used during a software build.
Processor manuals.
Instruction set architecture specifications.
IEEE-754 Standard for Floating Point Arithmetic.
Unix documentation for command-line tools.
Unix documentation for system interfaces.
For much software, it would be impossible to produce the software if the overall behavior were not defined by all these specifications combined. The notion that the C or C++ standard overrides or trumps other documentation is ludicrous.
Writing Portable Code
Any software project, or any engineering project, works from premises: It takes as given various tool specifications, material properties, device properties, and so on, and it derives desired products from those premises. Rarely does any complete end-user commercial product rely solely on the C or C++ standard. When you buy an iPhone, it obeys the laws of physics, and you are entitled to rely on it to conform to safety specifications for electrical devices and to radio frequency behaviors regulated by governmental agencies. It conforms to many specifications, and the notion that the C standard should be regarded as trumping those other specifications is absurd. If your device bursts into flame because of a programming error that the C standard says has undefined behavior, that is not acceptable; the fact that the C standard says it is not defined does not trump the safety specification.
Even in purely software projects, very few strictly conform to the C or C++ standards. Largely, only software that does some pure computations and limited input/output can be written in strictly conforming C or C++. That can include very useful libraries that are included in other software, but it includes very few complete commercial end-user programs—such as a few things used by mathematicians and scientists to answer questions about logic, math, and modeling, for example. Most software in this world interacts with devices and operating systems in ways not defined by the C or C++ standards. Most software uses extensions not defined by the standards—extensions that manipulate files and memory in ways not defined by the standards, that interact with devices and users in ways not defined by the standards. They display GUI windows and accept mouse and keyboard input from the user. They transmit and receive data over a network. They send radio waves to other devices.
These things are impossible without using behaviors not defined by the language standards. And, if the language standards trumped the definitions of these behaviors, writing such software would be impossible. If you wanted to send a Wi-Fi radio signal, and you had adopted the C standard, and the C standard trumped other definitions, that would mean it would be impossible for you to write software that reliably sends a radio signal. Obviously, that is not true. The C standard does not trump other specifications.
Writing “portable code” is not a feasible requirement for most software projects. It is, of course, desirable to contain non-portable code to clear interfaces. It is desirable to write what code one can using portable code so that it can be reused. But this is only part of most projects. For most projects, the project as a whole must use behaviors defined by documents other than the language standards.

In the sense that C does not define what happens when you call a function written in a language other than C, much less what happens if that function fails to return but instead ends its lifetime and the lifetime of the C caller in some other way, yes, it is undefined behavior. It is not "implementation-defined behavior", because the defining characteristic of implementation-defined behavior is that the language standard imposes a requirement on implementations that they document a particular behavior, and that is not the case here; the topic in question is completely outside the scope of the relevant standard.
From the standpoint of reasonable and portable C programming, you should not use or depend on -fexceptions. C++ code that is intended to be called from C should catch all exceptions in the outermost extern "C" function (or in any function exposed to C callers via a function pointer) and translate them into error codes or some other mechanism compatible with C (e.g. a longjmp, but only if it is documented that the C caller has to be prepared for the callee to do so).
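As a minimal sketch of that boundary pattern: the names widget_compute, widget_status, and parse_positive are hypothetical and only serve the illustration. The idea is that the extern "C" entry point wraps the C++ work in a try block and maps every exception to an error code, so nothing ever unwinds into the C caller.

```cpp
#include <new>
#include <stdexcept>

// Hypothetical error codes shared with the C caller.
enum widget_status { WIDGET_OK = 0, WIDGET_NO_MEMORY = 1, WIDGET_FAILED = 2 };

// Ordinary C++ code that may throw (hypothetical helper).
static int parse_positive(int value) {
    if (value <= 0) {
        throw std::invalid_argument("value must be positive");
    }
    return value * 2;
}

// Outermost function exposed to C. Every exception is caught here and
// translated into a status code, so no unwinding crosses into C frames.
extern "C" int widget_compute(int value, int* out) noexcept {
    if (out == nullptr) {
        return WIDGET_FAILED;
    }
    try {
        *out = parse_positive(value);
        return WIDGET_OK;
    } catch (const std::bad_alloc&) {
        return WIDGET_NO_MEMORY;
    } catch (...) {
        return WIDGET_FAILED;
    }
}
```

A C caller would then check the return value, e.g. treat anything other than WIDGET_OK as a failure, instead of relying on unwinding through its own frames.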

The code is not UB because it is not written in standard C++; it is written in the dialect "C++ with GCC/Clang extensions". In that dialect the behavior is documented and well defined. In standard C++ the same code would be UB.
So if you take the same code and compile it in pure standard C++ then that code would exhibit UB. But if you compile it in C++ with gcc/clang extensions then the code is well defined.

Related

c++ proposal for new mathematical special functions [duplicate]

When the C++ committee publishes a new feature that will be part of the standard library in the next standard of the language, do they also release some source code or some kind of guidance on how to implement that feature?
Let's take unique_ptr as an example. Does the language committee just define an interface for that class template and let the compiler vendors implement it as they want? How exactly does this process of implementing standard library features occur?
Can anyone implement parts of the standard library for a platform that doesn't have support for them yet? Let's say I would like to implement some cool features of the C++ standard library to use them in a microcontroller environment. How could I do that? Where should I look for information? If I decide to open source my project, can I do that? Will I need to follow exactly what the standard says, or can I write a non-compliant version?
Usually,
every new library feature goes through a proposal.
If the proposal makes it to the C++ committee's Library Evolution Working Group, it goes through a series of iterations (a "tough ground" as I am aware).
It undergoes a series of refinements, as described here
Should it require a Technical Specification (TS), a mechanism in use since C++11, it goes there to be baked. For example, <filesystem> was in the Filesystem TS prior to C++17.
One thing I believe the committee likes is implementation experience.
More information can be found on the ISOCpp site
Well, as to the implementation:
There are quite a number of "library features" that cannot be implemented purely as a library; they require compiler support. In these cases, compilers provide "intrinsics" that you can hook into. For example, Clang provides intrinsics for certain type_traits.
Most library features have some implementation experience, mostly from the Boost libraries.
You could actually look into the source code for the default standard library that ships with your compiler:
libc++ for Clang
libstdc++ for GCC
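Sadly, most of the implementations use a whole bunch of underscores, mostly because names that contain a double underscore, or that begin with an underscore followed by an uppercase letter, are reserved for the implementation (which includes the "Standard Library").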
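To illustrate the compiler intrinsics mentioned above, here is a small sketch of a trait built on one. The trait name my_is_enum is hypothetical; __is_enum is a non-standard builtin that, to my knowledge, GCC, Clang, and MSVC all accept, but treat that as an assumption and check your compiler's documentation.

```cpp
#include <type_traits>

// Hypothetical trait that forwards to a compiler builtin.
// __is_enum is NOT part of the C++ standard; it is a vendor intrinsic,
// which is why a pure-library implementation of this trait is not possible.
template <typename T>
struct my_is_enum : std::integral_constant<bool, __is_enum(T)> {};

enum class color { red, green };
struct point { int x; int y; };

// The hand-rolled trait should agree with the standard one.
static_assert(my_is_enum<color>::value, "color is an enumeration");
static_assert(!my_is_enum<point>::value, "point is a class type");
static_assert(my_is_enum<color>::value == std::is_enum<color>::value,
              "must match std::is_enum");
```

Standard library implementations typically wrap such builtins in the same way, just with reserved (underscore-heavy) names.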
Can anyone implement parts of the standard library for a platform that doesn't have support for it yet?
Yes, you can, as long as your compiler supports that platform and the platform or operating system provides a usable API. For example, std::cout, parts of std::ifstream, and much more require platform-specific support.
Let's say I would like to implement some cool features of the C++ standard library to use them in a microcontroller environment. How could I do that?
You can look into the code of others and start from there. We learn from giants. Some Open Source Examples:
ETL
StandardCPlusPlus
uClib++
How could I do that? Where should I look for information?
You could check the paper that introduced the feature into the C++ library. For example, std::optional has a stand-alone implementation here which was used as a reference implementation during the proposal stages.
You could check the standard library, and do a laborious study. :-)
Search the internet. :-)
Or write it from scratch as specified by the standard
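As a rough idea of what "from scratch" can look like for a small feature, here is a deliberately incomplete sketch of a unique_ptr-like class. The name owned_ptr is hypothetical, and the class omits custom deleters, the array form, and comparison operators that the real std::unique_ptr specifies, so it is not a conforming implementation.

```cpp
#include <utility>

// Hypothetical, simplified stand-in for std::unique_ptr<T>.
// Single-object form only; always deletes with plain `delete`.
template <typename T>
class owned_ptr {
public:
    owned_ptr() noexcept = default;
    explicit owned_ptr(T* p) noexcept : ptr_(p) {}
    ~owned_ptr() { delete ptr_; }

    // Ownership is exclusive, so copying is disabled.
    owned_ptr(const owned_ptr&) = delete;
    owned_ptr& operator=(const owned_ptr&) = delete;

    // Moving transfers ownership and leaves the source empty.
    owned_ptr(owned_ptr&& other) noexcept : ptr_(other.release()) {}
    owned_ptr& operator=(owned_ptr&& other) noexcept {
        if (this != &other) {
            reset(other.release());
        }
        return *this;
    }

    T* get() const noexcept { return ptr_; }
    T& operator*() const { return *ptr_; }
    T* operator->() const noexcept { return ptr_; }
    explicit operator bool() const noexcept { return ptr_ != nullptr; }

    // Give up ownership without deleting the object.
    T* release() noexcept {
        T* p = ptr_;
        ptr_ = nullptr;
        return p;
    }

    // Replace the owned object, deleting the previous one.
    void reset(T* p = nullptr) noexcept {
        T* old = ptr_;
        ptr_ = p;
        delete old;
    }

private:
    T* ptr_ = nullptr;
};
```

Usage mirrors the standard type: construct with owned_ptr<int> p(new int(7)) and transfer with std::move(p).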
Will I need to follow exactly what the standard says, or can I write a non-compliant version?
There is no compulsion to follow what the C++ standard library specifies. That would be your "own" library.
Formally, no. As with all standards out there, the C++ Standard sets the rules and does not give an implementation. However, from a practical standpoint, it is nearly impossible to introduce a new feature into the Standard Library without a proposed implementation, so you can often find those attached to proposals.
As for your question on "can you write a non-compliant version", you can do whatever you want. Adoption might depend on your compliance, or might not: the very widely adopted MSVC is known to deviate from the C++ standard in places.
Typically, a new feature is not standardized, unless the committee has some solid evidence that it can be implemented, and will be useful. This very often consists of a prototype implementation in boost, a GNU library, or one of the commercial compiler vendors.
The standard itself does not contain any implementation guidance - it is purely a specification. The compiler vendors (or their subcontractors) choose how to implement that specification.
In the specific case of unique_ptr, Boost ships a counterpart (boost::movelib::unique_ptr, part of Boost.Move) that you can use instead. If you have a compiler that will compile for your microcontroller, it is almost certain that it will be able to build enough of Boost to make that work.
There is nothing stopping you from writing a non-conforming implementation (apart from the trivial point that if you sold it as being standards-conforming, and it wasn't, you might get your local equivalent of Trading Standards come knocking.)
The committee does not release any reference implementations. In the early days, things got standardized and then the tool developers went away and implemented the standard. This has changed, and now the committee looks for features that have been implemented and tested before standardization.
Also major developments usually don't go directly into the standard. First they become experimental features called a Technical Specification or TS. These TS may then be incorporated into the main standard at a later date.
You are free to write your own implementation of the C++ standard library. Plum Hall has a test suite (commercial, I have no connection, but Plum Hall are very involved with C++ standardization).
I don't see any issue with not being conformant. Almost all implementations have some extensions. Just don't make any false claims, especially if you want to sell your product.
If you're interested in getting involved, this can be done via your 'National Body' (ANSI for the USA, BSI for the UK etc.). The isocpp web site has a section on standardization which would be a good starting place.

What kind of software is part of an "Implementation" exactly when stating "Implementation-defined"? What is an "Implementation" exactly?

I often see the term "implementation-defined" in the C Standard documents, and I very often get it as an answer as well.
I have then searched in the C99 Standard for it, and:
ISO/IEC 9899:1999 (C99) states under §3.12:
3.12
Implementation
particular set of software, running in a particular translation environment under particular control options, that performs translation of programs for, and supports execution of functions in, a particular execution environment
As well under §5:
Environment
An implementation translates C source files and executes C programs in two data-processing-system environments, which will be called the translation environment and the execution environment in this International Standard. Their characteristics define and constrain the results of executing conforming C programs constructed according to the syntactic and semantic rules for conforming implementations.
But which software exactly does it refer to?
Which set of software in particular?
It is described as providing a translation AND an execution environment, so it couldn't be the compiler alone, or am I wrong about this assumption?
Which parts of my system can I think of as part of "the implementation"?
Is it the combination of the compiler and the C standard it relies on, the operating system, the C standard itself, or a mix of all of these?
Does it, despite the previous statement, also include hardware (the processor, mainboard, etc.)?
I do not quite understand what an implementation exactly is.
I feel like I have to be a cyborg with 100 years of experience to know what it all includes entirely and exactly.
Generally speaking, an "implementation" refers to a given compiler and the machine it runs on. The latter is important due to things such as endianness, which dictates the byte ordering of integer and floating point types, among other considerations.
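To make the endianness point concrete, here is a small sketch that inspects how a known 32-bit value is laid out in memory. The function names are hypothetical; the result depends on the implementation you compile and run it on, which is exactly the point.

```cpp
#include <cstdint>
#include <cstring>

// Copies the object representation of a known value into a byte array
// and returns the byte stored at the lowest address.
inline unsigned char first_stored_byte() {
    const std::uint32_t probe = 0x01020304u;
    unsigned char bytes[sizeof probe];
    std::memcpy(bytes, &probe, sizeof probe);
    return bytes[0];
}

// Little-endian: least significant byte (0x04) comes first in memory.
inline bool is_little_endian() { return first_stored_byte() == 0x04; }

// Big-endian: most significant byte (0x01) comes first in memory.
inline bool is_big_endian() { return first_stored_byte() == 0x01; }
```

The same source gives different answers on, say, x86-64 and a big-endian target, without either result violating the C or C++ standard.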
An implementation is required to document its implementation defined behavior. For example you can find GCC's implementation defined behavior here.
Compilers often support multiple versions of the C standard, so each operating mode can also be considered an implementation. For example, you can pass the -std option to GCC to specify c89, c99, or c11 modes.
I think you have a good formal sense of what it is, and are focusing your question on specifics of real-world implementations, so that's what I'll address. "The implementation" actually tends to encompass a number of components which act and depend upon one another via a number of interface contracts, all of which need to be honored in order to have any hope of the implementation as a whole being conforming.
These include possibly:
the compiler
possibly an assembler, if the compiler produces asm as an intermediate form
the linker
library code that's part of the standard library (which is part of the language, as the language is specified, not a separate component, but only for hosted implementations not freestanding ones)
library code that's "compiler glue" for implementing language constructs for which the compiler doesn't directly emit code (on GCC, this is libgcc), often used for floating point on machines that lack hardware fpu, division on machines that lack hardware divider, etc.
the dynamic linker, if the implementation uses dynamic-linked programs
the operating system kernel, if the implementation's library functions don't directly drive the hardware, but depend on syscalls or "software interrupts" or similar defined by the operating system having their specified behavior in order to implement part of the standard library or other (e.g. startup or glue) library code
etc.
Arguably you could also say the hardware itself is part of the implementation.
The C99 Standard defines many things, but some are just not that relevant so they did not care to define them in the Standard in detail. Instead, they write "implementation defined" which means that whoever actually programs a compiler according to their standard can choose how exactly they do that.
For example, gcc is an implementation of that standard (Actually, gcc implements various different Standards, as pmg points out in his comment. But that's not too important right now). If you were to write your own compiler, you can only call it a "C99 Compiler" if it adheres to the standard. But where the standard states that something is implementation dependent, you are free to choose what your compiler should do.

Why do embedded C++ compilers not support exceptions?

I was writing a library for embedded systems and found that no STL / standard library is readily available.
But the worst news is that the compiler has no exception support.
The Atmel reference manual shows this.
Why are exceptions not supported in embedded environments?
It simply makes it impossible to use a lot of libraries written in C++.
C++ is closely linked with exceptions, for example through the new operator!
Obviously, nobody but the people that produce that compiler can really answer that question.
My best guess is that exceptions cost both space and time in the compiled code (time is only a real problem when the code throws, but space is always needed for the unwind tables, which can easily be about the same size as the overall code). Coupled with being "rather difficult to implement", they are perhaps not the highest item on the compiler developers' list, and thus not yet implemented. Whether they WILL be at some point in the future is obviously up to Atmel or whoever they subcontract to make their compiler.
I'm not an expert on Atmel C++ implementation, but I would be VERY surprised if the compiler supports throw, but not try/catch - since that would be about as useful as a screwdriver made from chocolate, when trying to fix the heater in a sauna that is stuck on full heat.
If you use -fno-exceptions, the compiler should report an error if you have any throw in the code. And the STL can be compiled with -fno-exceptions, as that's how I compile my compiler code.
The 8-bit Atmel AVR, with its limited ROM and RAM resources, is not suited to the use of the STL, which imposes a considerable load on both. Moreover, C++ exceptions are intrinsically non-deterministic and therefore unsuited to many real-time applications.
A compiler may implement the EC++ (Embedded C++) "standard". The aim of EC++ was twofold:
to support the use of C++ in embedded systems at a time before ISO standardisation of the language, by restricting the language to a common subset available on all compilers.
to avoid non-deterministic (in both memory and timing) language/library features unsuited to hard-real-time applications.
Things missing from EC++ include:
namespace
templates
RTTI
exceptions
The omission of namespaces is entirely down to the first aim of EC++ and is now essentially obsolete. The omission of the others is justified by both aims, and precludes the use of the STL and std::string libraries.
EC++ itself is now largely obsolete, but the subset it defines is nonetheless applicable to severely resource constrained targets, and many compilers support whole or partial EC++ enforcement.
Note that perhaps not all compilers for AVR impose these restrictions, but if you attempt to use these features extensively you will probably soon discover why they are ill advised in most cases on targets with very limited CPU and memory resources.
With respect to the use of the new operator, default dynamic memory allocation (as opposed to placement new or an override), is intrinsically non-deterministic and often best avoided in real-time embedded systems, and especially so where minimal heap is available. To use new without the assumption of exception handling, use new (std::nothrow) (#include <new>) which will not throw an exception but return a null pointer like malloc(). In EC++ new and new (std::nothrow) are essentially the same thing, but the latter is portable and explicit.
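A minimal sketch of the new (std::nothrow) form described above; the helper names make_buffer and free_buffer are hypothetical. The caller checks for a null pointer instead of catching std::bad_alloc.

```cpp
#include <cstddef>
#include <new>

// Allocates `size` bytes without throwing; returns nullptr on failure,
// much like malloc() would.
inline unsigned char* make_buffer(std::size_t size) {
    return new (std::nothrow) unsigned char[size];
}

// Releases a buffer obtained from make_buffer(); deleting nullptr is a no-op.
inline void free_buffer(unsigned char* buffer) {
    delete[] buffer;
}
```

Typical use is unsigned char* p = make_buffer(64); followed by an explicit if (p == nullptr) branch for the out-of-memory case, then free_buffer(p) when done.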
Lack of support for C++ in Atmel's maintained open-source library for GCC-AVR does not mean that there is no support for embedded systems in general, or indeed for AVR. IAR's AVR compiler supports what it refers to as Extended Embedded C++ (EEC++) as well as EC++. EEC++ does support a C++ library including STL with some modifications - no RTTI, no exceptions, and not in std:: namespace (although namespaces are supported). Support for ISO C++ as opposed to EC++ on most 32 bit targets is generally more comprehensive.

C++ ABI issues list

I've seen a lot of discussion about how C++ doesn't have a Standard ABI quite in the same way that C does. I'm curious as to what, exactly, the issues are. So far, I've come up with
Name mangling
Exception handling
RTTI
Are there any other ABI issues pertaining to C++?
Off the top of my head:
C++ Specific:
  Where the 'this' parameter can be found.
  How virtual functions are called
    ie does it use a vtable or other
    What is the layout of the structures used for implementing this.
  How are multiple definitions handled
    Multiple template instantiations
    Inline functions that were not inlined.
  Static Storage Duration Objects
    How to handle creation (in the global scope)
    How to handle creation of function local (how do you add it to the destructor list)
    How to handle destruction (destroy in reverse order of creation)
  You mention exceptions. But also how exceptions are handled outside main()
    ie before or after main()
Generic.
  Parameter passing locations
  Return value location
  Member alignment
  Padding
  Register usage (which registers are preserved which are scratch)
  size of primitive types (such as int)
  format of primitive types (Floating point format)
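Several of the generic items in the list above (member alignment, padding, size of primitive types) can be observed directly. The following sketch uses hypothetical names (sample, layout_report, describe_layout); the static_asserts only check facts the language guarantees, while the actual numbers are decided by the platform ABI.

```cpp
#include <cstddef>

// A struct whose layout typically contains padding after `tag`
// so that `value` is suitably aligned. The exact amount is ABI-specific.
struct sample {
    char tag;
    int value;
    char flag;
};

// Guaranteed by the language regardless of ABI.
static_assert(offsetof(sample, tag) == 0, "first member sits at offset 0");
static_assert(offsetof(sample, value) % alignof(int) == 0, "value is aligned");
static_assert(sizeof(sample) >= 2 * sizeof(char) + sizeof(int),
              "size covers all members");
static_assert(sizeof(sample) % alignof(sample) == 0,
              "size is a multiple of the alignment");

// ABI-dependent numbers collected for inspection.
struct layout_report {
    std::size_t int_size;
    std::size_t pointer_size;
    std::size_t sample_size;
    std::size_t value_offset;
};

inline layout_report describe_layout() {
    layout_report report;
    report.int_size = sizeof(int);
    report.pointer_size = sizeof(void*);
    report.sample_size = sizeof(sample);
    report.value_offset = offsetof(sample, value);
    return report;
}
```

Two object files only interoperate if the compilers that produced them agree on every one of these numbers, which is what an ABI specification pins down.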
The big problem, in my experience, is the C++ standard library. Even if you had an ABI that dictates how a class should be laid out, different compilers provide different implementations of standard objects like std::string and std::vector.
I'm not saying that it would not be possible to standardize the internal layout of C++ library objects, only that it has not been done before.
The closest thing we have to a standard C++ ABI is the Itanium C++ ABI:
"this document is written as a generic specification, to be usable by C++ implementations on a variety of architectures. However, it does contain processor-specific material for the Itanium 64-bit ABI, identified as such."
The GCC doc explains support of this ABI for C++:
Starting with GCC 3.2, GCC binary conventions for C++ are based
on a written, vendor-neutral C++ ABI that was designed to be specific
to 64-bit Itanium but also includes generic specifications that apply
to any platform. This C++ ABI is also implemented by other compiler
vendors on some platforms, notably GNU/Linux and BSD systems
As was pointed out by @Lindydancer, you need to use the same C++ standard library/runtime as well.
An ABI standard for any language really needs to come from a given platform that wants to support such a thing. Language standards especially C/C++ really can not do this for many reasons but mostly because such a thing would make the language less flexible and less portable and therefore less used. C really doesn't have a defined ABI but many platforms define (directly or indirectly) one. The reason this isn't happening with C++ is because the language is much bigger and changes are made more often. However, Herb Sutter has a very interesting proposal about how to get more platforms to create standard ABIs and how developers can write code that uses the ABI in a standard way:
https://isocpp.org/blog/2014/05/n4028
He points out how C++ has a standard way to link into a platform C ABI but not a C++ ABI via extern "C". I think this proposal could go a long way to allowing interfaces to be defined in terms of C++ instead of C.
I've seen a lot of discussion about how C++ doesn't have a Standard ABI quite in the same way that C does.
What standard C ABI? Annex J in the C99 standard is 27 pages long. In addition to undefined behavior (and some implementations give some UB a well-defined behavior), it covers unspecified behavior, implementation-defined behavior, locale-specific behavior, and common extensions.

C++ language extensions

I already read the FAQ, and I think maybe this is a subjective question, but I need to ask.
Does anyone know what exactly (I mean formally) a C++ language extension is?
I have already seen examples, like the NVIDIA CUDA C extensions and the Avalon transaction-based C++ extensions.
So the point is to find something like a formal definition.
Thanks anyway.
A language extension is simply anything that goes beyond what the language specification calls for. Your compiler might add new features, like special "min" and "max" operators. Your compiler might define the behavior of division by zero, which is otherwise undefined, according to the standard. It might provide additional parameters for your main function. It might be the incorporation of another language's features, such as allowing C-style variable-sized arrays in C++. It might be a facility for specifying a function's calling convention.
Using a language extension usually makes your code non-portable because when you take your code to another OS, compiler, or even compiler version, the extension may not be available anymore, or its behavior may be different from what you had originally used.
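One common way to contain that non-portability is to hide the extension behind a small function with a portable fallback. The sketch below assumes GCC or Clang for the fast path (__builtin_popcount is a builtin documented by GCC and also accepted by Clang); the function name count_set_bits is hypothetical.

```cpp
// Counts the bits set in `value`.
// Uses a compiler extension when available, standard C++ otherwise.
inline int count_set_bits(unsigned int value) {
#if defined(__GNUC__)
    // Extension path: GCC and Clang both define __GNUC__.
    return __builtin_popcount(value);
#else
    // Portable path: clear the lowest set bit until none remain.
    int count = 0;
    while (value != 0) {
        value &= value - 1;
        ++count;
    }
    return count;
#endif
}
```

Callers see the same behavior on every compiler; only this one function needs revisiting when the code moves to a toolchain without the extension.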
Please see Extensible programming:
Extensible programming is a term used
in computer science to describe a
style of computer programming that
focuses on mechanisms to extend the
programming language, compiler and
runtime environment.
and more to the point, the Extensible syntax section:
This simply means that the source
language(s) to be compiled must not be
closed, fixed, or static. It must be
possible to add new keywords,
concepts, and structures to the source
language(s).