Regarding struct variable declaration in C vrs C++ - c++

in C
struct node {
int x;
};
node *y; // incorrect
struct node *y; // correct
in C++
struct node{
int x;
};
node *y; // correct
y = new node; // correct
Question: Why is the tag name acceptable to create a pointer in C++ not in C?

Because the C++ standard says so, and the C standard doesn't.
I would say that C++ made the change to make classes easier to use (remember, classes and structs are fundamentally the same thing in C++; they only differ in default visibility), and C didn't follow suit because that would break tons of existing C code.

The reason is not just because the standard says so, C++ actually has namespaces while the struct keyword allowed for a primitive form of namespace in C which allowed you to have a struct and a non-struct identifier with the same name. We can see this primitive form of a namespace from the C99 draft standard section 6.2.3 Name spaces of identifiers which says:
If more than one declaration of a particular identifier is visible at any point in a
translation unit, the syntactic context disambiguates uses that refer to different entities.
Thus, there are separate name spaces for various categories of identifiers, as follows:
and has the following bullets:
— label names (disambiguated by the syntax of the label declaration and use);
— the tags of structures, unions, and enumerations (disambiguated by following any24)
of the keywords struct, union, or enum);
— the members of structures or unions; each structure or union has a separate name
space for its members (disambiguated by the type of the expression used to access the
member via the . or -> operator);
— all other identifiers, called ordinary identifiers (declared in ordinary declarators or as enumeration constants).
I could make up an example but POSIX gives us a great example with stat which is both a function and a struct:
int stat(const char *restrict path, struct stat *restrict buf);
^^^^ ^^^^^^^^^^^

If you want to write the way node* y in C, just
typedef struct node {
int x;
} node;

Related

structure tag in C vs C++

I wrote following simple program & compiled it on gcc compiler
#include <stdio.h>
typedef int i;
void foo()
{
struct i {i i;} i;
i.i = 3;
printf("%i\n", i.i);
}
int main() { foo(); }
It compiles & runs fine in C.(See live demo here) But it fails in compilation in C++. C++ compiler gives following error messages.
prog.cc: In function 'void foo()':
prog.cc:5:17: error: field 'i' has incomplete type 'foo()::i'
struct i {i i;} i;
^
prog.cc:5:12: note: definition of 'struct foo()::i' is not complete until the closing brace
struct i {i i;} i;
See live demo here
I couldn't find rules regarding this in C & C++ standards. Why it compiles fine in C but not in C++ ? What does the standard says about this ? I very well know that C & C++ are different languages having different rules but I am curious to know about exact rules.
The difference between C and C++ is the following. In C the data member i is considered as having type int because if you wanted that it had type struct i then you have to write struct i i specifying the keyword struct before i.
Structure tags are in their own namespace compared with the namespace of other variables.
According to the C Standard (6.2.3 Name spaces of identifiers)
1 If more than one declaration of a particular identifier is visible
at any point in a translation unit, the syntactic context
disambiguates uses that refer to different entities. Thus, there are
separate name spaces for various categories of identifiers, as
follows:
— label names (disambiguated by the syntax of the label declaration
and use);
— the tags of structures, unions, and enumerations (disambiguated by
following any32) of the keywords struct, union, or enum);
— the members of structures or unions; each structure or union has a
separate name space for its members (disambiguated by the type of the
expression used to access the member via the . or -> operator);
— all other identifiers, called ordinary identifiers (declared in
ordinary declarators or as enumeration constants).
As for C++ then inside the structure definition the name of the structure hides the name of the typedef and the compiler issues the error. In C++ there is separate class scope.
For example in C++ (3.4 Name lookup) there is written
3 The injected-class-name of a class (Clause 9) is also considered
to be a member of that class for the purposes of name hiding and
lookup.
and (3.4.1 Unqualified name lookup)
7 A name used in the definition of a class X outside of a member
function body or nested class definition29 shall be declared in one of
the following ways: — before its use in class X or be a member of
a base class of X (10.2), or ...
Thus the injected name of the class hides the typedef name within the class definition.
Take into account that outside the class definition the name of the class can be hidden by the same name of an object. Thus if you want to declare an object of the class in that scope you have to use its elaborated name like
int i;
struct i {};
//...
struct i obj;

Why can a typedef-name for a struct not be used interchangeably with the struct name?

The following code (live example) does not compile:
struct S {};
typedef struct S T;
S s = T(); // OK
struct T * p; // error: elaborated type refers to a typedef
T::T(){} // error: C++ requires a type specifier for all declarations
Why is the language designed to not permit the last two lines?
Relevant Standard quote (N4140 §7.1.3/8):
[ Note: A typedef-name that names a class type, or a cv-qualified version thereof, is also a class-name (9.1).
If a typedef-name is used to identify the subject of an elaborated-type-specifier (7.1.6.3), a class definition (Clause 9), a constructor declaration (12.1), or a destructor declaration (12.4), the program is ill-formed.
—end note ]
So there are three unrelated issues. The first one you have in the quote you provide:
struct T * p;
That is illegal as T is a typedef.
T{};
That is illegal at namespace level, but would be legal in other concepts, for example as part of the initialization of a global, or inside a function:
T t = T{};
void f() { T{}; }
It really means to create a value-initialized temporary object of type T.
T::T(){}
That would be a valid definition for a default constructor, except that you did not declare one. If you modify the S to have a user declared default constructor that would work:
struct S { S(); };
Why is the language designed to not permit the last two lines?
Those two lines, in the updated question are:
struct T* p;
T::T() {}
The second one is legal, but you are trying to define a function that has not been declared as a member, so this is also unrelated to the original text. Which leaves us with one: struct T* p.
The motive comes from C. The identifiers for user defined types and other names appear to live in different scopes, when lookup is trying to resolve a name not qualified with struct or enum, it will ignore struct and enums, when trying to resolve a struct or enum it ignores everything else. The following is valid C (and C++):
struct T {}; // 1
typedef struct S {} T; // 2
struct T t;
In C++ the rules for lookup changed a bit and you can use the type specifiers without explicitly qualifying it but that is a different thing. Additionally, typedef-ed names can be used in other contexts that were not possible in C.
An special case is lookup for an elaborated type specifier, should the typedef-ed name be usable in an elaborated type specifier? If it was, the semantics of the program above would change and where in C t is of type T (defined in 1), in C++ it would become S (defined in 2).
Note that this is to some extent a wild guess, I did not make the rules and I don't know what went into consideration there. Note that C and C++ were never really compatible in this respect, a similar example changes semantics in C and C++:
int T;
void f() {
struct T { int data[10]; };
printf("%d\n", sizeof(T));
}
That program will print a number 10x larger in C++ than in C. But the ability to use a type without having to qualify it with class or struct was probably more important than breaking compatibility in a few cases...

Are "anonymous structs" standard? And, really, what *are* they?

MSDN reckons that anonymous structs are non-standard in C++:
A Microsoft C extension allows you to declare a structure variable
within another structure without giving it a name. These nested
structures are called anonymous structures. C++ does not allow
anonymous structures.
You can access the members of an anonymous structure as if they were
members in the containing structure.
#K-ballo agrees.
I'm told that this feature isn't necessarily the same as just creating an unnamed struct but I can't see a distinction in terms of standard wording.
C++11 says:
[C++11: 9/1]: [..] A class-specifier whose class-head omits the class-head-name defines an unnamed class.
and provides an entire grammatical construction for a type definition missing a name.
C++03 lacks this explicit wording, but similarly indicates that the identifier in a type definition is optional, and makes reference to "unnamed classes" in 9.4.2/5 and 3.5/4.
So is MSDN wrong, and these things are all completely standard?
Or is there some subtlety I'm missing between "unnamed structs/classes" and the same when used as members that prevents them from being covered by this C++03/C++11 functionality?
Am I missing some fundamental difference between "unnamed struct" and "anonymous struct"? They look like synonyms to me.
All the standard text refers to creating an "unnamed struct":
struct {
int hi;
int bye;
};
Just a nice friendly type, with no accessible name.
In a standard way, it could be instantiated as a member like this:
struct Foo {
struct {
int hi;
int bye;
} bar;
};
int main()
{
Foo f;
f.bar.hi = 3;
}
But an "anonymous struct" is subtly different — it's the combination of an "unnamed struct" and the fact that you magically get members out of it in the parent object:
struct Foo {
struct {
int hi;
int bye;
}; // <--- no member name!
};
int main()
{
Foo f;
f.hi = 3;
}
Converse to intuition†, this does not merely create an unnamed struct that's nested witin Foo, but also automatically gives you an "anonymous member" of sorts which makes the members accessible within the parent object.
It is this functionality that is non-standard. GCC does support it, and so does Visual C++. Windows API headers make use of this feature by default, but you can specify that you don't want it by adding #define NONAMELESSUNION before including the Windows header files.
Compare with the standard functionality of "anonymous unions" which do a similar thing:
struct Foo {
union {
int hi;
int bye;
}; // <--- no member name!
};
int main()
{
Foo f;
f.hi = 3;
}
† It appears that, though the term "unnamed" refers to the type (i.e. "the class" or "the struct") itself, the term "anonymous" refers instead to the actual instantiated member (using an older meaning of "the struct" that's closer to "an object of some structy type"). This was likely the root of your initial confusion.
The things that Microsoft calls anonymous structs are not standard. An unnamed struct is just an ordinary struct that doesn't have a name. There's not much you can do with one, unless you also define an object of that type:
struct {
int i;
double d;
} my_object;
my_object.d = 2.3;
Anonymous unions are part of the standard, and they have the behavior you'd expect from reading Microsoft's description of their anonymous structs:
union {
int i;
double d;
};
d = 2.3;
The standard talks about anonymous unions: [9.5]/5
A union of the form
union { member-specification } ;
is called an anonymous union; it defines an unnamed object of unnamed type. The member-specification of an anonymous union shall only define non-static data members. [ Note: Nested types and functions cannot be declared within an anonymous union. —end note ] The names of the members of an anonymous union shall be distinct from the names of any other entity in the scope in which the anonymous union is declared. For the purpose of name lookup, after the anonymous union definition, the members of the anonymous union are considered to have been defined in the scope in which the anonymous union is declared. [ Example:
void f() {
union { int a; const char* p; };
a = 1;
p = "Jennifer";
}
Here a and p are used like ordinary (nonmember) variables, but since they are union members they have the same address. —end example ]
The anonymous structs that Microsoft talks about is this feature for unions but applied to structs. Is not just an unnamed definition, its important to note that mebers of the anonymous union/struct are considered to have been defined in the scope in which the anonymous union/struct is declared.
As far as I know, there is no such behavior for unnamed structs in the Standard. Note how in the cited example you can achieve things that wouldn't be otherwise possible, like sharing storage for variables in the stack, while anonymous structs bring nothing new to the table.

Name lookup Clarification

$10.2/4- "[ Note: Looking up a name in
an elaborated-type-specifier (3.4.4)
or base-specifier (Clause 10), for
instance, ignores all nontype
declarations, while looking up a name
in a nested-name-specifier (3.4.3)
ignores function, variable, and
enumerator declarations."
I have found this statement to be very confusing in this section while describing about name lookup.
void S(){}
struct S{
S(){cout << 1;}
void f(){}
static const int x = 0;
};
int main(){
struct S *p = new struct ::S; // here ::S refers to type
p->::S::f();
S::x; // base specifier, ignores the function declaration 'S'
::S(); // nested name specifier, ignores the struct declaration 'S'.
delete p;
}
My questions:
Is my understanding of the rules correct?
Why ::S on the line doing new treated automatically to mean struct S, whereas in the last line ::S means the functions S in the global namespace.
Does this point to an ambiguity in the documentation, or is it yet another day for me to stay away from C++ Standard document?
Q1: I think so.
Q2: Compatibility with C. When you declare a struct in C, the tag name is just that, a tag name. To be able to use it in a standalone way, you need a typedef. In C++ you don't need the typedef, that makes live easier. But C++ rules have been complicated by the need to be able to import already existing C headers which "overloaded" the tag name with a function name. The canonical example of that is the Unix stat() function which uses a struct stat* as argument.
Q3: Standard reading is usually quite difficult... you need to already know that there is no place elsewhere modifying what you are reading. It isn't strange that people knowing how to do that are language lawyer...
You are mistaken about the second comment. In S::x, the S is a name in a nested name specifier. What the Standard refers to with "base-specifier" is the following
namespace B { struct X { }; void X() }
struct A : B::X { }; // B::X is a base-specifier
You are also not correct about this:
::S(); // nested name specifier, ignores the struct declaration 'S'.`
That code calls the function not because ::S would be a nested-name-specifier (it isn't a nested-name-specifier!), but because function names hide class or enumeration names if both the function and the class/enumeration are declared in the same scope.
FWIW, the following code would be equally valid for line 2 of your main
p->S::f();
What's important is that S preceedes a ::, which makes lookup ignore the function. That you put :: before S has no effect in your case.

Difference between 'struct' and 'typedef struct' in C++?

In C++, is there any difference between:
struct Foo { ... };
and:
typedef struct { ... } Foo;
In C++, there is only a subtle difference. It's a holdover from C, in which it makes a difference.
The C language standard (C89 §3.1.2.3, C99 §6.2.3, and C11 §6.2.3) mandates separate namespaces for different categories of identifiers, including tag identifiers (for struct/union/enum) and ordinary identifiers (for typedef and other identifiers).
If you just said:
struct Foo { ... };
Foo x;
you would get a compiler error, because Foo is only defined in the tag namespace.
You'd have to declare it as:
struct Foo x;
Any time you want to refer to a Foo, you'd always have to call it a struct Foo. This gets annoying fast, so you can add a typedef:
struct Foo { ... };
typedef struct Foo Foo;
Now struct Foo (in the tag namespace) and just plain Foo (in the ordinary identifier namespace) both refer to the same thing, and you can freely declare objects of type Foo without the struct keyword.
The construct:
typedef struct Foo { ... } Foo;
is just an abbreviation for the declaration and typedef.
Finally,
typedef struct { ... } Foo;
declares an anonymous structure and creates a typedef for it. Thus, with this construct, it doesn't have a name in the tag namespace, only a name in the typedef namespace. This means it also cannot be forward-declared. If you want to make a forward declaration, you have to give it a name in the tag namespace.
In C++, all struct/union/enum/class declarations act like they are implicitly typedef'ed, as long as the name is not hidden by another declaration with the same name. See Michael Burr's answer for the full details.
In this DDJ article, Dan Saks explains one small area where bugs can creep through if you do not typedef your structs (and classes!):
If you want, you can imagine that C++
generates a typedef for every tag
name, such as
typedef class string string;
Unfortunately, this is not entirely
accurate. I wish it were that simple,
but it's not. C++ can't generate such
typedefs for structs, unions, or enums
without introducing incompatibilities
with C.
For example, suppose a C program
declares both a function and a struct
named status:
int status(); struct status;
Again, this may be bad practice, but
it is C. In this program, status (by
itself) refers to the function; struct
status refers to the type.
If C++ did automatically generate
typedefs for tags, then when you
compiled this program as C++, the
compiler would generate:
typedef struct status status;
Unfortunately, this type name would
conflict with the function name, and
the program would not compile. That's
why C++ can't simply generate a
typedef for each tag.
In C++, tags act just like typedef
names, except that a program can
declare an object, function, or
enumerator with the same name and the
same scope as a tag. In that case, the
object, function, or enumerator name
hides the tag name. The program can
refer to the tag name only by using
the keyword class, struct, union, or
enum (as appropriate) in front of the
tag name. A type name consisting of
one of these keywords followed by a
tag is an elaborated-type-specifier.
For instance, struct status and enum
month are elaborated-type-specifiers.
Thus, a C program that contains both:
int status(); struct status;
behaves the same when compiled as C++.
The name status alone refers to the
function. The program can refer to the
type only by using the
elaborated-type-specifier struct
status.
So how does this allow bugs to creep
into programs? Consider the program in
Listing 1. This program defines a
class foo with a default constructor,
and a conversion operator that
converts a foo object to char const *.
The expression
p = foo();
in main should construct a foo object
and apply the conversion operator. The
subsequent output statement
cout << p << '\n';
should display class foo, but it
doesn't. It displays function foo.
This surprising result occurs because
the program includes header lib.h
shown in Listing 2. This header
defines a function also named foo. The
function name foo hides the class name
foo, so the reference to foo in main
refers to the function, not the class.
main can refer to the class only by
using an elaborated-type-specifier, as
in
p = class foo();
The way to avoid such confusion
throughout the program is to add the
following typedef for the class name
foo:
typedef class foo foo;
immediately before or after the class
definition. This typedef causes a
conflict between the type name foo and
the function name foo (from the
library) that will trigger a
compile-time error.
I know of no one who actually writes
these typedefs as a matter of course.
It requires a lot of discipline. Since
the incidence of errors such as the
one in Listing 1 is probably pretty
small, you many never run afoul of
this problem. But if an error in your
software might cause bodily injury,
then you should write the typedefs no
matter how unlikely the error.
I can't imagine why anyone would ever
want to hide a class name with a
function or object name in the same
scope as the class. The hiding rules
in C were a mistake, and they should
not have been extended to classes in
C++. Indeed, you can correct the
mistake, but it requires extra
programming discipline and effort that
should not be necessary.
One more important difference: typedefs cannot be forward declared. So for the typedef option you must #include the file containing the typedef, meaning everything that #includes your .h also includes that file whether it directly needs it or not, and so on. It can definitely impact your build times on larger projects.
Without the typedef, in some cases you can just add a forward declaration of struct Foo; at the top of your .h file, and only #include the struct definition in your .cpp file.
There is a difference, but subtle. Look at it this way: struct Foo introduces a new type. The second one creates an alias called Foo (and not a new type) for an unnamed struct type.
7.1.3 The typedef specifier
1 [...]
A name declared with the typedef specifier becomes a typedef-name. Within the scope of its declaration, a
typedef-name is syntactically equivalent to a keyword and names the type associated with the identifier in
the way described in Clause 8. A typedef-name is thus a synonym for another type. A typedef-name does not introduce a new type the way a class declaration (9.1) or enum declaration does.
8 If the typedef declaration defines an unnamed class (or enum), the first typedef-name declared by the declaration
to be that class type (or enum type) is used to denote the class type (or enum type) for linkage
purposes only (3.5). [ Example:
typedef struct { } *ps, S; // S is the class name for linkage purposes
So, a typedef always is used as an placeholder/synonym for another type.
You can't use forward declaration with the typedef struct.
The struct itself is an anonymous type, so you don't have an actual name to forward declare.
typedef struct{
int one;
int two;
}myStruct;
A forward declaration like this wont work:
struct myStruct; //forward declaration fails
void blah(myStruct* pStruct);
//error C2371: 'myStruct' : redefinition; different basic types
An important difference between a 'typedef struct' and a 'struct' in C++ is that inline member initialisation in 'typedef structs' will not work.
// the 'x' in this struct will NOT be initialised to zero
typedef struct { int x = 0; } Foo;
// the 'x' in this struct WILL be initialised to zero
struct Foo { int x = 0; };
There is no difference in C++, but I believe in C it would allow you to declare instances of the struct Foo without explicitly doing:
struct Foo bar;
Struct is to create a data type.
The typedef is to set a nickname for a data type.