In a 2013 presentation about the future of Standard ML, Bob Harper says, on slide 9, that "structure sharing is broken". Can someone give more detail on that? I don't have enough experience with sharing to understand what he meant.
It is broken because as specified, it cannot be applied to structures that have transparent type components. For example:
signature S = sig type t; type u = int end
signature T =
sig
  structure A : S
  structure B : S
  sharing A = B
end
would already be illegal, although you would naturally expect this to be fine.
The history here is that structure sharing was introduced in SML'90, where no transparent type components existed. With SML'97, those were added. At that point, the whole business with sharing constraints became somewhat obsolete because they were (to some extent) superseded by "where type" constraints. So, the semantics of sharing was vastly simplified, and structure sharing degraded from a primitive to syntactic sugar. But this sugar was defined such that it only worked with SML'90 programs -- which makes sense if you only view it as a backwards compatibility hack, but not if you consider structure sharing a central feature of SML'97.
People in the SML community disagree about the relevance of sharing constraints. Some consider them obsolete, some still important. Unfortunately, SML'97 failed to add a "where structure" constraint, which could have properly replaced structure sharing.
Andreas Rossberg's answer has already clarified matters, but I've written to professor Harper before Andreas' answer. I'm posting his reply here for the curious:
It's a purely technical problem with the definition. In SML 90 there was a notion of structure sharing above and beyond the constituent type sharing. In SML 97 structure sharing was redefined to mean sharing of the constituent types, but the formulation is incorrect (there are several candidate replacements, so compilers differ in their behavior). It's a dark corner anyway, so in the scheme of things it is very minor, but the mistake and the associated incompatibilities render it useless any more.
Background
Discussions of the mostly undefined or implementation-defined nature of type-punning via a union typically quote the following bits, here via #ecatmur ( https://stackoverflow.com/a/31557852/2757035 ), on an exemption for standard-layout structs having a "common initial sequence" of member types:
C11 (6.5.2.3 Structure and union members; Semantics):
[...] if a union contains several structures that share a common initial sequence (see below), and if the union object currently contains one of these structures, it is permitted to inspect the common initial part of any of them anywhere that a declaration of the completed type of the union is visible. Two structures share a common initial sequence if corresponding members have compatible types (and, for bit-fields, the same widths) for a sequence of one or more initial members.
C++03 ([class.mem]/16):
If a POD-union contains two or more POD-structs that share a common initial sequence, and if the POD-union object currently contains one of these POD-structs, it is permitted to inspect the common initial part of any of them. Two POD-structs share a common initial sequence if corresponding members have layout-compatible types (and, for bit-fields, the same widths) for a sequence of one or more initial members.
Other versions of the two standards have similar language; since C++11 the terminology used is standard-layout rather than POD.
Since no reinterpretation is required, this isn't really type-punning, just name substitution applied to union member accesses. A proposal for C++17 (the infamous P0137R1) makes this explicit using language like 'the access is as if the other struct member was nominated'.
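To make the quoted allowance concrete, here's a minimal sketch of the kind of access it describes (the struct and union names are invented purely for illustration):

#include <iostream>

// Two standard-layout structs whose first member forms a common initial sequence.
struct A { int id; double weight; };
struct B { int id; char tag; };

union U {
    A a;
    B b;
};

int main() {
    U u = { {42, 3.14} };          // initialise the first member, so A is active
    // Inspecting the common initial part through the *inactive* member B:
    // this is exactly the access the quoted wording allows.
    std::cout << u.b.id << '\n';   // prints 42
}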
But please note the key phrase - "anywhere that a declaration of the completed type of the union is visible" - a clause that exists in C11 but nowhere in the C++ drafts for 2003, 2011, or 2014 (all nearly identical, except that later versions replace "POD" with the new term standard-layout). In any case, the 'visible declaration of union type' bit is totally absent from the corresponding section of any C++ standard.
#loop and #Mints97, here - https://stackoverflow.com/a/28528989/2757035 - show that this line was also absent in C89, first appearing in C99 and remaining in C since then (though, again, never filtering through to C++).
Standards discussions around this
[snipped - see my answer]
Questions
From this, then, my questions were:
What does this mean? What is classed as a 'visible declaration'? Was this clause intended to narrow - or expand - the range of contexts in which such 'punning' has defined behaviour?
Are we to assume that this omission in C++ is very deliberate?
What is the reason for C++ differing from C? Did C++ just 'inherit' this from C89 and then either decide - or worse, forget - to update alongside C99?
If the difference is intentional, then what benefits or drawbacks are there to the two different treatments in C vs C++?
What, if any, interesting ramifications does it have at compile- or runtime? For example, #ecatmur, in a comment replying to my pointing this out on his original answer (link as above), speculated as follows.
I'd imagine it permits more aggressive optimization; C can assume that function arguments S* s and T* t do not alias even if they share a common initial sequence as long as no union { S; T; } is in view, while C++ can make that assumption only at link time. Might be worth asking a separate question about that difference.
Well, here I am, asking! I'm very interested in any thoughts about this, especially: other relevant parts of the (either) Standard, quotes from committee members or other esteemed commentators, insights from developers who might have noticed a practical difference due to this - assuming any compiler even bothers to enforce C's added clause - and so on. The aim is to generate a useful catalogue of relevant facts about this C clause and its (intentional or not) omission from C++. So, let's go!
I've found my way through the labyrinth to some great sources on this, and I think I've got a pretty comprehensive summary of it. I'm posting this as an answer because it seems to explain both the (IMO very misguided) intention of the C clause and the fact that C++ does not inherit it. This will evolve over time if I discover further supporting material or the situation changes.
This is my first time trying to sum up a very complex situation, which seems ill-defined even to many language architects, so I'll welcome clarifications/suggestions on how to improve this answer - or simply a better answer if anyone has one.
Finally, some concrete commentary
Through vaguely related threads, I found the following answer by #tab - and much appreciated the contained links to (illuminating, if not conclusive) GCC and Working Group defect reports: answer by tab on StackOverflow
The GCC link contains some interesting discussion and reveals a sizeable amount of confusion and conflicting interpretations on the part of the Committee and compiler vendors surrounding the subject of union member structs, punning, and aliasing in both C and C++.
At the end of that, we're linked to the main event - another BugZilla thread, Bug 65892, containing an extremely useful discussion. In particular, we find our way to the first of two pivotal documents:
Origin of the added line in C99
C proposal N685 is the origin of the added clause regarding visibility of a union type declaration. Through what some claim (see GCC thread #2) is a total misinterpretation of the "common initial sequence" allowance, N685 was indeed intended to allow relaxation of aliasing rules for "common initial sequence" structs within a TU aware of some union containing instances of said struct types, as we can see from this quote:
The proposed solution is to require that a union declaration be visible if aliases through a common initial sequence (like the above) are possible. Therefore the following TU provides this kind of aliasing if desired:

union utag {
    struct tag1 { int m1; double d2; } st1;
    struct tag2 { int m1; char c2; } st2;
};

int similar_func(struct tag1 *pst2, struct tag2 *pst3) {
    pst2->m1 = 2;
    pst3->m1 = 0;   /* might be an alias for pst2->m1 */
    return pst2->m1;
}
Judging by the GCC discussion and comments below such as #ecatmur's, this proposal - which would mandate speculatively allowing aliasing for any struct type that has some instance within some union visible to this TU - seems to have been met with great derision and to have rarely been implemented.
It's obvious how difficult it would be to satisfy this interpretation of the added clause without totally crippling many optimisations - for little benefit, as few coders would want this guarantee, and those who do can just turn on -fno-strict-aliasing (which IMO indicates larger problems). If implemented, this allowance is more likely to catch people out and spuriously interact with other declarations of unions than to be useful.
Omission of the line from C++
Following on from this and a comment I made elsewhere, #Potatoswatter in this answer here on SO states that:
The visibility part was purposely omitted from C++ because it's widely considered to be ludicrous and unimplementable.
In other words, it looks like C++ deliberately avoided adopting this added clause, likely due to its widely perceived absurdity. On asking for an "on the record" citation of this, Potatoswatter provided the following key info about the thread's participants:
The folks in that discussion are essentially "on the record" there. Andrew Pinski is a hardcore GCC backend guy. Martin Sebor is an active C committee member. Jonathan Wakely is an active C++ committee member and language/library implementer. That page is more authoritative, clear, and complete than anything I could write.
Potatoswatter, in the same SO thread linked above, concludes that C++ deliberately excluded this line, leaving no special treatment (or, at best, implementation-defined treatment) for pointers into the common initial sequence. Whether their treatment will, in future, be defined any differently from that of other pointers remains to be seen; compare my final section below about C. At present, though, it is not (and again, IMO, this is good).
What does this mean for C++ and practical C implementations?
So, with the nefarious line from N685... 'cast aside'... we're back to assuming pointers into the common initial sequence are not special in terms of aliasing. Still, it's worth confirming what this paragraph in C++ means without it. Well, the 2nd GCC thread above links to another gem:
C++ defect 1719. This proposal has reached DRWP status: "A DR issue whose resolution is reflected in the current Working Paper. The Working Paper is a draft for a future version of the Standard" - cite. This is either post C++14 or at least after the final draft I have here (N3797) - and puts forward a significant, and in my opinion illuminating, rewrite of this paragraph's wording, as follows. I'm bolding what I consider to be the important changes, and {these comments} are mine:
In a standard-layout union with an active member {"active" indicates a union instance, not just type} (9.5 [class.union]) of struct type T1, it is permitted to read {formerly "inspect"} a non-static data member m of another union member of struct type T2 provided m is part of the common initial sequence of T1 and T2. [Note: Reading a volatile object through a non-volatile glvalue has undefined behavior (7.1.6.1 [dcl.type.cv]). —end note]
This seems to clarify the meaning of the old wording: to me, it says that any specifically allowed 'punning' among union member structs with common initial sequences must be done via an instance of the parent union - rather than being based on the type of the structs (e.g. pointers to them passed to some function). This wording seems to rule out any other interpretation, a la N685. C would do well to adopt this, I'd say. Hey, speaking of which, see below!
The upshot is that - as nicely demonstrated by #ecatmur and in the GCC tickets - this leaves such union member structs, by definition in C++ and practically in C, subject to the same strict aliasing rules as any other two officially unrelated pointers. The explicit guarantee of being able to read the common initial sequence of inactive union member structs is now more clearly defined, without the vague and unimaginably tedious-to-enforce "visibility" clause attempted by N685 for C. By this definition, the main compilers have been behaving as intended for C++. As for C?
Possible reversal of this line in C / clarification in C++
It's also very worth noting that C committee member Martin Sebor is looking to get this fixed in that fine language, too:
Martin Sebor 2015-04-27 14:57:16 UTC If one of you can explain the problem with it I'm willing to write up a paper and submit it to WG14 and request to have the standard changed.
Martin Sebor 2015-05-13 16:02:41 UTC I had a chance to discuss this issue with Clark Nelson last week. Clark has worked on improving the aliasing parts of the C specification in the past, for example in N1520 (http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1520.htm). He agreed that like the issues pointed out in N1520, this is also an outstanding problem that would be worth for WG14 to revisit and fix.
Potatoswatter inspiringly concludes:
The C and C++ committees (via Martin and Clark) will try to find a consensus and hammer out wording so the standard can finally say what it means.
We can only hope!
Again, all further thoughts are welcome.
I suspect it means that the access to these common parts is permitted not only through the union type, but outside of the union. That is, suppose we have this:
union u {
    struct s1 m1;
    struct s2 m2;
};
Now suppose that in some function we have a struct s1 *p1 pointer which we know was lifted from the m1 member of such a union. We can cast this to a struct s2 * pointer and still access the members which are in common with struct s1. But somewhere in the scope, a declaration of union u has to be visible. And it has to be the complete declaration, which informs the compiler that the members are struct s1 and struct s2.
The likely intent is that if there is such a type in scope, then the compiler has knowledge that struct s1 and struct s2 are aliased, and so an access through a struct s1 * pointer is suspected of really accessing a struct s2 or vice versa.
In the absence of any visible union type which joins those types this way, there is no such knowledge; strict aliasing can be applied.
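For illustration, here is a sketch of that reading in code (the member lists of struct s1 and struct s2 are invented, just to give them a common initial member):

struct s1 { int m; double x; };
struct s2 { int m; char c; };

/* Because this completed union declaration is visible in this translation
   unit, a compiler following this reading of the C11 wording must assume
   that an s1* and an s2* may alias through their common initial part. */
union u {
    struct s1 m1;
    struct s2 m2;
};

int f(struct s1 *p1, struct s2 *p2) {
    p1->m = 1;
    p2->m = 0;      /* might alias p1->m if both point into the same union u */
    return p1->m;   /* so, under this reading, the compiler cannot assume 1 here */
}

In a translation unit where no such completed union declaration is visible, strict aliasing would let the compiler simply return the constant 1.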
Since the wording is absent from C++, then to take advantage of the "common initial members relaxation" rule in that language, you have to route the accesses through the union type, as is commonly done anyway:
union u *ptr_any;
// ...
ptr_any->m1.common_initial_member = 42;
fun(ptr_any->m2.common_initial_member); // pass 42 to fun
Per [meta.rel] (20.10.6 in C++14), for a class type T, std::is_base_of<T,T> is true, but for a built-in type T, std::is_base_of<T,T> is false. Colloquially speaking, class types are bases of themselves, but built-in types are not. What is the motivation/utility for this difference in treatment?
The rationale goes back quite a ways, and is not well documented.
is_base_of was originally called is_base_and_derived, and was introduced in TR1. Dave Abrahams introduced an issue against this class in N1541, number 3.13:
Currently, is_base_and_derived<X,Y> returns false when X and Y are the same. This is technically correct (X isn’t its own base class), but it isn’t useful. The definition should be loosened to return true when X and Y are the same, even when the type isn’t actually a class.
Unfortunately the issue does not expound on why that definition is not useful. However, the viewpoint was not unique at the time (2003). Andrei Alexandrescu's Modern C++ Design, published two years earlier, had much the same trait, and much the same comment in section 2.7 about its macro SUPERSUBCLASS, though the book also added a workaround macro if you really didn't want a class to be considered its own base.
Modern C++ Design goes on to use SUPERSUBCLASS in section 3.12 to order a Typelist in order of inheritance. Deep in the details of this exercise, the fact that SUPERSUBCLASS(T, T) is true is taken advantage of (for convenience in the implementation).
By 2004, the TR1 report, N1647, had adopted the notion that std::is_base_of<T,T>::value == true was useful when T is a non-union class type.
N2255 further clarifies how is_base_of is supposed to work for non-class types, and this change resulted in the wording that you see today. However there are extensive editorial differences between what this accepted paper proposes, and what was produced in the following draft (N2284). My opinion is that the editor greatly improved the wording.
The undocumented rationale for why N2255 created a division between non-union class types and everything else is that is_base_of historically answered questions about the inheritance hierarchy of types, with only a convenient "trick" that a class can be considered its own base via the "is-a" analysis. However types that can't possibly participate in an inheritance relationship should never qualify as base classes, according to the multiple authors of this trait.
Whether or not that was the best design is up for debate. However there are sufficient traits (e.g. is_class and is_same) to build whatever you need from these fundamental traits.
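For instance, if you want a relation that is reflexive for all types, not just non-union class types, you can compose one yourself; the trait name below is purely illustrative:

#include <type_traits>

template <class Base, class Derived>
struct is_base_of_or_same
    : std::integral_constant<bool,
          std::is_base_of<Base, Derived>::value ||
          std::is_same<Base, Derived>::value> {};

struct S {};

static_assert( std::is_base_of<S, S>::value,        "class types are bases of themselves");
static_assert(!std::is_base_of<int, int>::value,    "built-in types are not");
static_assert( is_base_of_or_same<int, int>::value, "the composed trait is reflexive for both");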
This is more of a history than a straightforward "why". However, the point of this answer is to call out that is_base_of evolved over a great deal of time, through many iterations, and each iteration was felt at the time to be an evolutionary improvement upon what was there before.
And it all boils down to: This is the spec that the committee thought would be most useful. But as the spec & design evolved over many years, and through several authors, there exists no good overall design document or rationale.
Built-in types are not classes, and things that aren't classes cannot be a base of anything.
If anything I'm somewhat surprised that std::is_base_of<T,T> is true even for class types, but that's probably a "can be treated as a" relation for classes.
Meh, I dunno.
I find this atrocious:
std::numeric_limits<int>::max()
And really wish I could just write this:
int::max
Yes, there is INT_MAX and friends. But sometimes you are dealing with something like streamsize, which is a synonym for an unspecified built-in, so you don't know whether you should use INT_MAX or LONG_MAX or whatever. Is there a technical limitation that prevents something like int::max from being put into the language? Or is it just that nobody but me is interested in it?
Primitive types are not class types, so they don't have static members, that's it.
If you make them class types, you are changing the foundations of the language (although, thinking about it, it wouldn't be such a problem for compatibility reasons - more likely just some headaches for the standards people to figure out exactly what members to add to them).
But more importantly, I think that nobody but you is interested in it :) ; personally I don't find numeric_limits so atrocious (actually, it's quite C++-ish - although many can argue that often what is C++-ish looks atrocious :P ).
All in all, I'd say that this is the usual "every feature starts with minus 100 points" situation; the article talks about C#, but it's even more relevant for C++, which already has tons of language features and subtleties, a complex standard, and many compiler vendors who can put in their vetoes:
One way to do that is through the concept of “minus 100 points”. Every feature starts out in the hole by 100 points, which means that it has to have a significant net positive effect on the overall package for it to make it into the language. Some features are okay features for a language to have, they just aren't quite good enough to make it into the language.
Even if the proposal were carefully prepared by someone else, it would still take time for the standard committee to examine and discuss it, and it would probably be rejected because it would be a duplication of stuff that is already possible without problems.
There are actually multiple issues:
built-in types aren't classes in C++
classes can't be extended with new members in C++
assuming the implementation were required to supply certain "members": which ones? There are lots of other attributes you might want to find out about a type, and using traits allows for them to be added.
That said, if you feel you want shorter notation for this, just create it:
#include <limits>

namespace traits {
    // A shorter spelling for std::numeric_limits<T>::max()
    template <typename T>
    constexpr T max() {
        return std::numeric_limits<T>::max();
    }
}

int m = traits::max<int>();

using namespace traits;
int n = max<int>();
Why don't you use std::numeric_limits<streamsize>::max()? As for why it's a function (max()) instead of a constant (max), I don't know. In my own app I made my own num_traits type that provides the maximum value as a static constant instead of a function, (and provides significantly more information than numeric_limits).
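Roughly something like this - a sketch only, with illustrative member names rather than the actual code:

#include <limits>

template <typename T>
struct num_traits {
    static constexpr T    max_value = std::numeric_limits<T>::max();
    static constexpr T    min_value = std::numeric_limits<T>::lowest();
    static constexpr bool is_signed = std::numeric_limits<T>::is_signed;
};

// A constant rather than a function call:
static_assert(num_traits<int>::max_value == std::numeric_limits<int>::max(), "");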
It would be nice if they had defined some constants and functions on "int" itself, the way C# has int.MaxValue, int.MinValue and int.Parse(string), but that's just not what the C++ committee decided.
For my projects, I usually define a lot of aliases for types like unsigned int, char and double as well as std::string and others.
I also aliased and to &&, or to ||, not to !, etc.
Is this considered bad practice or okay to do?
Defining types to add context within your code is acceptable, and even encouraged. Screwing around with operators will only encourage the next person that has to maintain your code to bring a gun to your house.
Well, consider newcomers to your project who are accustomed to standard C++. They will have difficulty maintaining it.
Mind you, there are many cases where aliasing is more justified. A good example is complicated nested STL containers.
Example:
typedef int ClientID;
typedef double Price;
typedef map<ClientID, map<Date, Price> > ClientPurchases;
Now, instead of
map<int, map<Date, double> >::iterator it = clientPurchases.find(clientId);
you can write
ClientPurchases::iterator it = clientPurchases.find(clientId);
which seems to be more clear and readable.
If you're only using it for pointlessly renaming language features (as opposed to the example #Vlad gives), then it's the Wrong Thing.
It definitely makes the code less readable - anyone proficient in C++ will see (x ? y : z) and know that it's a ternary conditional operator. Although ORLY x YARLY y NOWAI z KTHX could be the same thing, it will confuse the viewer: "is this YARLY NOWAI the exact same thing as ? :, renamed for the author's convenience, or does it have subtle differences?" If these "aliases" are the same thing as the standard language elements, they will only slow down the next person to maintain your code.
TLDR: Reading code, any code, is hard enough without having to look up your private alternate syntax all the time.
That’s horrible. Don’t do it, please. Write idiomatic C++, not some macro-riddled monstrosity. In general, it’s extremely bad practice to define such macros, except in very specific cases (such as the BOOST_FOREACH macro).
That said, and, or and not are actually already valid aliases for &&, || and ! in C++!
It’s just that Visual Studio only knows them if you first include the standard header <ciso646>. Other compilers don’t need this.
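For example, a minimal snippet using the alternative tokens (the function name is invented for illustration):

#include <ciso646>   // a no-op on conforming compilers; older Visual Studio needs it for these tokens

bool in_range(int x, int lo, int hi) {
    // `and`, `or` and `not` are standard alternative tokens for &&, || and !
    return x >= lo and x <= hi and not (hi < lo);
}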
Types are something else. Using typedef to create type aliases depending on context makes sense if it augments the expressiveness of the code. Then it’s encouraged. However, even better would be to create own types instead of aliases. For example, I can’t imagine it ever being beneficial to create an alias for std::string – why not just use std::string directly? (An exception are of course generic data structures and algorithms.)
"and", "or", "not" are OK because they're part of the language, but it's probably better to write C++ in a style that other C++ programmers use, and very few people bother using them. Don't alias them yourself: they're reserved names and it's not valid in general to use reserved names even in the preprocessor. If your compiler doesn't provide them in its default mode (i.e. it's not standard-compliant), you could fake them up with #define, but you may be setting yourself up for trouble in future if you change compiler, or change compiler options.
typedefs for builtin types might make sense in certain circumstances. For example in C99 (but not in C++03), there are extended integer types such as int32_t, which specifies a 32-bit integer, and on a particular system that might be a typedef for int. They come from stdint.h (<cstdint> in C++0x), and if your C++ compiler doesn't provide that as an extension, you can generally hunt down or write a version of it that will work for your system.
If you have some purpose in mind for which you might in future want to use a different integer type (on a different system perhaps), then by all means hide the "real" type behind a name that describes the important properties that are the reason you chose that type for the purpose. If you just think "int" is unnecessarily brief, and it should be "integer", then you're not really helping anyone, even yourself, by trying to tweak the language in such a superficial way. It's an extra indirection for no gain, and in the long run you're better off learning C++ than changing C++ to "make more sense" to you.
I can't think of a good reason to use any other name for string, except in a case similar to the extended integer types, where your name will perhaps be a typedef for string on some builds, and wstring on others.
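For example, something along these lines (the macro name is invented for illustration):

#include <string>

#ifdef USE_WIDE_STRINGS
typedef std::wstring app_string;   // wide strings on some builds...
#else
typedef std::string  app_string;   // ...narrow strings on others
#endif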
If you're not a native English-speaker, and are trying to "translate" C++ to another language, then I sort of see your point but I don't think it's a good idea. Even other speakers of your language will know C++ better than they know the translations you happen to have picked. But I am a native English speaker, so I don't really know how much difference it makes. Given how many C++ programmers there are in the world who don't translate languages, I suspect it's not a huge deal compared with the fact that all the documentation is in English...
If every C++ developer were familiar with your aliases, then why not; but with these aliases you are essentially introducing a new language to whoever needs to maintain your code.
Why add an extra mental step that for the most part does not add any clarity? (What && and || do is pretty obvious to any C/C++ programmer, and anyway in C++ you can already use the and and or keywords.)
I'm reading a very silly paper and it keeps on talking about how Giotto defines a "formal semantics".
Giotto has a formal semantics that specifies the meaning of mode switches, of intertask communication, and of communication with the program environment.
I'm on the edge of understanding it, but just cannot quite grasp what it means by "formal semantics."
To expand on Michael Madsen's answer a little, an example might be the behaviour of the ++ operator. Informally, we describe the operator using plain English. For instance:
If x is a variable of type int, ++x causes x to be incremented by one.
(I'm assuming no integer overflows, and that ++x doesn't return anything)
In a formal semantics (and I'm going to use operational semantics), we'd have a bit of work to do. Firstly, we need to define a notion of types. In this case, I'm going to assume that all variables are of type int. In this simple language, the current state of the program can be described by a store, which is a mapping from variables to values. For instance, at some point in the program, x might be equal to 42, while y is equal to -5351. The store can be used as a function -- so, for instance, if the store s has the variable x with the value 42, then s(x) = 42.
Also included in the current state of the program is the remaining statements of the program we have to execute. We can bundle this up as <C, s>, where C is the remaining program, and s is the store.
So, if we have the state <++x, {x -> 42, y -> -5351}>, this is informally a state where the only remaining command to execute is ++x, the variable x has value 42, and the variable y has value -5351.
We can then define transitions from one state of the program to another -- we describe what happens when we take the next step in the program. So, for ++, we could define the following semantics:
<++x, s> --> <skip, s{x -> (s(x) + 1)}>
Somewhat informally, by executing ++x, the next command is skip, which has no effect, and the variables in the store are unchanged, except for x, which now has the value that it originally had plus one. There's still some work to be done, such as defining the notation I used for updating the store (which I've not done to stop this answer getting even longer!). So, a specific instance of the general rule might be:
<++x, {x -> 42, y -> -5351}> --> <skip, {x -> 43, y -> -5351}>
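If it helps to see that rule operationally, here's a tiny sketch in C++ (ordinary code, not a formal metalanguage; the names are purely illustrative) of a store and the transition for ++x:

#include <iostream>
#include <map>
#include <string>

// The store: a mapping from variable names to values, as above.
using Store = std::map<std::string, int>;

// One transition step for the command "++x": the command rewrites to "skip"
// and the store is updated at x, mirroring <++x, s> --> <skip, s{x -> (s(x) + 1)}>.
Store step_increment(Store s, const std::string& x) {
    s[x] = s[x] + 1;
    return s;
}

int main() {
    Store s{{"x", 42}, {"y", -5351}};
    Store after = step_increment(s, "x");
    std::cout << after["x"] << '\n';   // 43, as in the worked instance above
}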
Hopefully that gives you the idea. Note that this is just one example of formal semantics -- along with operational semantics, there's axiomatic semantics (which often uses Hoare logic) and denotational semantics, and probably plenty more that I'm not familiar with.
As I mentioned in a comment to another answer, an advantage of formal semantics is that you can use them to prove certain properties of your program, for instance that it terminates. As well as showing your program doesn't exhibit bad behaviour (such as non-termination), you can also prove that your program behaves as required by proving your program matches a given specification. Having said that, I've never found the idea of specifying and verifying a program all that convincing, since I've found the specification usually just being the program rewritten in logic, and so the specification is just as likely to be buggy.
Formal semantics describe semantics in - well, a formal way - using notation which expresses the meaning of things in an unambiguous way.
It is the opposite of informal semantics, which is essentially just describing everything in plain English. This may be easier to read and understand, but it creates the potential for misinterpretation, which could lead to bugs because someone didn't read a paragraph the way you intended them to read it.
A programming language can have both formal and informal semantics - the informal semantics would then serve as a "plain-text" explanation of the formal semantics, and the formal semantics would be the place to look if you're not sure what the informal explanation really means.
Just like the syntax of a language can be described by a formal grammar (e.g. BNF), it's possible to use different kinds of formalisms to map that syntax to mathematical objects (i.e. the meaning of the syntax).
This page from A Practical Introduction to Denotational Semantics is a nice introduction to how [denotational] semantics relates to syntax. The beginning of the book also gives a brief history of other, non-denotational approaches to formal semantics (although the wikipedia link Michael gave goes into even more detail, and is probably more up-to-date).
From the author's site:
Models for semantics have not caught on to the same extent that BNF and its descendants have in syntax. This may be because semantics does seem to be just plain harder than syntax. The most successful system is denotational semantics which describes all the features found in imperative programming languages and has a sound mathematical basis. (There is still active research in type systems and parallel programming.) Many denotational definitions can be executed as interpreters or translated into "compilers" but this has not yet led to generators of efficient compilers which may be another reason that denotational semantics is less popular than BNF.
What is meant in the context of a programming language like Giotto is that a language with formal semantics has a mathematically rigorous interpretation of its individual language constructs.
Most programming languages today are not rigorously defined. They may adhere to standard documents that are fairly detailed, but it's ultimately the compiler's responsibility to emit code that somehow adheres to those standard documents.
A formally specified language on the other hand is normally used when it is necessary to do reasoning about program code using, e.g., model checking or theorem proving. Languages that lend themselves to these techniques tend to be functional ones, such as ML or Haskell, since these are defined using mathematical functions and transformations between them; that is, the foundations of mathematics.
C or C++, on the other hand, are informally defined by technical descriptions, although there exist academic papers which formalise aspects of these languages (e.g., Michael Norrish: A formal semantics for C++, https://publications.csiro.au/rpr/pub?pid=nicta:1203); these formalisations often do not find their way into the official standards (possibly due to a lack of practicality, especially the difficulty of maintaining them).