What are canonical types in Clang?


I have a simple header parser based on clang and I get the typedefs from some source.
struct _poire {
int g;
tomate rouge;
};
typedef struct _poire kudamono;
After parsing this I have a clang::TypedefDecl, and I get the clang::QualType of the typedef with clang::TypedefDecl::getUnderlyingType().
Calling getAsString() on that QualType gives me the std::string "struct _poire". So far, so good.
The problem is that when I check whether this type is canonical with QualType::isCanonical(), it returns false.
So I get the canonical type with QualType::getCanonicalType().getAsString(), and it returns the same string, "struct _poire".
According to the Clang reference on types, http://clang.llvm.org/docs/InternalsManual.html#canonical-types , I thought that isCanonical() should return true when no typedef is involved.
So what exactly is a canonical type?

After further investigation and a question on the clang mailing list, I think I have figured out what a canonical type is.
First, to understand canonical types it is important not to focus on the QualType alone. Look at this (code/pseudocode):
source file :
typedef struct _poire kudamono;
clang code :
QualType t = clang::TypedefDecl::getUnderlyingType()
t.getAsString() // "struct _poire"
t.isCanonical() // false
t.getTypePtr()->getTypeClassName() // ElaboratedType
c = t.getCanonicalType()
c.getAsString() // "struct _poire"
c.isCanonical() // true
c.getTypePtr()->getTypeClassName() // RecordType
c and t are not the same QualType even though they have the same string representation.
A QualType associates qualifiers ("const", "volatile"...) with a Clang type. There are many Clang Type classes because Clang needs to keep track of the types as the user wrote them, for diagnostics ( http://clang.llvm.org/docs/InternalsManual.html#the-type-class-and-its-subclasses and http://clang.llvm.org/doxygen/classclang_1_1Type.html ).
Which Clang type class is used depends heavily on the syntactic sugar or modifiers attached to the C/C++ type in the source file.
In the example above, the QualType t is associated with an ElaboratedType, which keeps track of the type name as written in the source code. The canonical QualType, however, is associated with a RecordType.
Another example:
source file:
typedef struct _poire kudamono;
typedef kudamono tabemono;
clang code :
QualType t = clang::TypedefDecl::getUnderlyingType()
t.getAsString() // "kudamono"
t.isCanonical() // false
t.getTypePtr()->getTypeClassName() // TypedefType
c = t.getCanonicalType()
c.getAsString() // "struct _poire"
c.isCanonical() // true
c.getTypePtr()->getTypeClassName() // RecordType
Here we can see that the underlying type of the typedef is recorded as "kudamono" a TypedefType and not "struct _poire" an ElaboratedType.
The canonical type for the TypedefType "kudamono" is a RecordType "struct _poire".
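The relationship in the two examples above can be pictured with a small toy model. This is only a sketch and not Clang's implementation: ToyType, its fields, and toyModelHolds() are hypothetical names invented for the illustration. The assumption is simply that every type node points at a canonical node, and that a canonical node points at itself.

```cpp
#include <cassert>
#include <string>

// Toy model only: these are NOT Clang's classes. Each node records how the
// type was written and points at the node of its canonical form.
struct ToyType {
    std::string className;     // e.g. "ElaboratedType", "TypedefType", "RecordType"
    std::string spelling;      // what a getAsString()-like call would print
    const ToyType* canonical;  // a canonical node points at itself

    bool isCanonical() const { return canonical == this; }
};

// Rebuilds the nodes from the two examples and checks how they relate.
inline bool toyModelHolds() {
    ToyType record{"RecordType", "struct _poire", nullptr};
    record.canonical = &record;  // nothing left to strip: canonical

    const ToyType elaborated{"ElaboratedType", "struct _poire", &record};
    const ToyType typedefed{"TypedefType", "kudamono", &record};

    assert(record.isCanonical());
    assert(!elaborated.isCanonical());
    assert(!typedefed.isCanonical());
    // Same printed string, yet a different node from the canonical one.
    assert(elaborated.spelling == record.spelling);
    // Both sugared nodes resolve to the one canonical node.
    assert(elaborated.canonical == typedefed.canonical);
    return elaborated.canonical->className == "RecordType";
}
```

Calling toyModelHolds() from a test or from main() runs the checks. The point of the model is that "canonical" is a property of the node, not of the printed string.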
Here are other examples that I got from the clang mailing list ( http://article.gmane.org/gmane.comp.compilers.clang.devel/38371/match=canonical+type ):
Consider:
int (x);
The type of x is not a BuiltinType; it's a ParenType whose canonical type is a BuiltinType. And given
struct X { int n; };
struct X x;
the type of x will probably be represented as an ElaboratedType whose canonical type is a RecordType.
So the canonical types in Clang are the type classes that carry no syntactic sugar, modifiers, or typedefs (like BuiltinType or RecordType). The other type classes (like ParenType, TypedefType or ElaboratedType) are used to keep track of the type as the user wrote it, for diagnostics (error messages ...).
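At the language level, all of these spellings name one and the same type, which is the reason they can share a single canonical type. The following sketch checks that with std::is_same. The namespace demo and the function readThroughAlias() are hypothetical and exist only for the illustration.

```cpp
#include <cassert>
#include <type_traits>

namespace demo {  // keeps the underscore-prefixed name out of the global scope
struct _poire { int g; };
typedef struct _poire kudamono;
typedef kudamono tabemono;

int (x) = 0;  // parenthesised declarator: still declares a plain int
}

// However the type is spelled in the source, it is one type to the language.
static_assert(std::is_same<demo::kudamono, demo::_poire>::value,
              "a typedef is only another name");
static_assert(std::is_same<demo::tabemono, demo::_poire>::value,
              "a typedef of a typedef collapses to the same type");
static_assert(std::is_same<decltype(demo::x), int>::value,
              "parentheses around the declarator do not change the type");

inline int readThroughAlias() {
    demo::tabemono t{42};
    demo::_poire& same = t;  // binds directly because the types are identical
    assert(same.g == 42);
    return same.g;
}
```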

It seems that you have raised an interesting point. I have figured out something, but since I can't actually test my intuition right now, I can't be 100% sure. Anyway here is what I would do :
If I parse your code (with a little extension to declare a kudamono variable), here is what I can say from this:
struct _poire {
int g;
char rouge; // tomate is probably one of your classes so I just changed the type of the field.
};
typedef struct _poire kudamono;
int maFonction(){
kudamono une_poire;
return 0;
}
When the typedef is parsed, here is what is yielded :
-TypedefDecl 0x23b4620 <line:5:1, col:23> kudamono 'struct _poire':'struct _poire'
When I declare a variable of type kudamono, here is its AST dump:
-VarDecl 0x2048040 <col:2, col:11> une_poire 'kudamono':'struct _poire'
NB : You can get the AST Dump of your code with this command line, it can be really handy to understand how your code will be parsed :
clang -Xclang -ast-dump -std=c++11 -fsyntax-only test.cpp (just remove -std=c++11 if you want to compile a file_name.c file)
Now, from what I understand, here is a comparison between the VarDecl and the TypedefDecl:
1°) This VarDecl is named une_poire and has the type kudamono, which is a typedef of the type struct _poire.
2°) This TypedefDecl is named kudamono and has the type struct _poire, which is shown as a typedef of the type struct _poire.
So the weird part is right here: struct _poire is considered a typedef of struct _poire.
You'll note that I tried to make a typedef with a usual type :
typedef int numbers;
And this time, AST-dump yields :
TypedefDecl 0x25d9680 <line:7:1, col:13> numbers 'int'
So I guess the parser may have some trouble with handmade types (typically structs).
I can see one dirty way to know whether your type is canonical or not (without getting false positives or false negatives):
Check whether the QualType and its canonical QualType are the same.
I don't know whether a simple '==' between QualTypes will give false positives or false negatives (as I can't test), but you can still compare the names of the types with strcmp.
So, to sum up:
Your understanding of a canonical type is fine.
Clang seems to have some trouble with handmade types, but it should be fine with typedefs of usual types (such as typedef int int32_t).
When you want to know whether a type is canonical, you can compare the name of the type with the name of the canonical type, but it's quite dirty. On usual types, isCanonical() works well.

Related

Convert complex struct / opaquepointer / function from C++ header to Delphi

I'm converting from C/C++ header to Delphi.
I've carefully read the great Rudy's Delphi Corner article about this kind of conversion. Still, I'm facing something I find hard to understand.
There's an opaque pointer, then a function prototype that takes that pointer as a parameter, followed by the declaration of the struct that holds the function type.
Maybe the code will make things clearer.
source .h code:
struct my_ManagedPtr_t_;
typedef struct my_ManagedPtr_t_ my_ManagedPtr_t;
typedef int (*my_ManagedPtr_ManagerFunction_t)(
my_ManagedPtr_t *managedPtr,
const my_ManagedPtr_t *srcPtr,
int operation);
typedef union {
int intValue;
void *ptr;
} my_ManagedPtr_t_data_;
struct my_ManagedPtr_t_ {
void *pointer;
my_ManagedPtr_t_data_ userData[4];
my_ManagedPtr_ManagerFunction_t manager;
};
typedef struct my_CorrelationId_t_ {
unsigned int size:8; // fill in the size of this struct
unsigned int valueType:4; // type of value held by this correlation id
unsigned int classId:16; // user defined classification id
unsigned int reserved:4; // for internal use must be 0
union {
my_UInt64_t intValue;
my_ManagedPtr_t ptrValue;
} value;
} my_CorrelationId_t;
... I'm lost. :-( I can't figure out where to start.
The structure? The function?
Thank you.

As you clarified in the comments, the immediate area of confusion for you is the circular reference. The function pointer parameters refer to the struct, but the struct contains the function pointer. In the C code this is dealt with by the opaque struct type declaration which is simply a forward declaration. A forward declaration simply promises that the type will be fully declared at some later point.
In Delphi you can deal with this in a directly analogous manner. You need to use a forward type declaration. I don't want to translate all the types in your question because that would require dealing with unions and bitfields which I deem to be separate topics. Instead I will present a simple Delphi example that shows how to deal with such circular type declarations. You can take the concept and apply it to your specific types.
type
PMyRecord = ^TMyRecord; // forward declaration
TMyFunc = function(rec: PMyRecord): Integer; cdecl;
TMyRecord = record
Func: TMyFunc;
end;
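For comparison, here is a sketch of the same circular pattern on the C/C++ side, reduced to the minimum. The names ManagedPtr, ManagerFunction, copyManager and copyThroughManager are hypothetical simplifications of the types in the question, not part of the original header.

```cpp
#include <cassert>

struct ManagedPtr;  // forward declaration: the type exists, its layout comes later

// The function-pointer type may mention ManagedPtr* before the struct is complete.
typedef int (*ManagerFunction)(ManagedPtr* dst, const ManagedPtr* src, int operation);

struct ManagedPtr {
    void* pointer;
    ManagerFunction manager;  // the struct in turn stores the function pointer
};

// Hypothetical manager used only for this illustration: copies the raw pointer.
inline int copyManager(ManagedPtr* dst, const ManagedPtr* src, int /*operation*/) {
    dst->pointer = src->pointer;
    return 0;
}

inline int copyThroughManager() {
    int payload = 7;
    ManagedPtr src{&payload, &copyManager};
    ManagedPtr dst{nullptr, &copyManager};
    const int rc = dst.manager(&dst, &src, 0);
    assert(dst.pointer == &payload);
    return rc;
}
```

The forward declaration plays the same role as the PMyRecord = ^TMyRecord line in the Delphi version: it lets the function type be declared before the record it refers to.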

It is a little hard to find out where to start, but @DavidHeffernan's explanation of forward declaring a pointer type should give you a start.
I would translate this to following (untested) code:
type
my_ManagedPtr_p = ^my_ManagedPtr_t; // forward declaration of the pointer type
my_ManagedPtr_ManagerFunction_t = function(
managedPtr: my_ManagedPtr_p;
srcPtr: my_ManagedPtr_p;
operation: Integer): Integer; cdecl;
my_ManagedPtr_t_data = record
case Boolean of
False: (intValue: Integer);
True: (ptr: Pointer);
end;
my_ManagedPtr_t = record
ptr: Pointer;
userData: array[0..3] of my_ManagedPtr_t_data;
manager: my_ManagedPtr_ManagerFunction_t;
end;
my_CorrelationId_t = record
typeData: UInt32; // size, valueType, classId and reserved combined in one integer.
case Byte of
0: (intValue: my_UInt64_t);
1: (ptrValue: my_ManagedPtr_t);
end;
I am not going to do the bitfields, but please read the Bitfields section of my article Pitfalls of converting again (I see you mentioned it already) to find a few solutions. If you want to make it really nice, use the methods and indexed access, otherwise just use shifts and masks to access the bitfields contained in the member I called typeData. How this can be done is explained in the article and is far too much to repeat here.
If you have problems with them anyway, ask a new question.
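The shifts-and-masks approach mentioned above can be sketched as follows, written in C++ so the arithmetic is easy to check. It assumes the four fields are packed starting at the least significant bit in declaration order (size:8, valueType:4, classId:16, reserved:4). That layout is an assumption, since bitfield layout is implementation-defined, and should be verified against the compiler used for the C library. CorrelationBits and packExample() are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// ASSUMPTION: bit 0 is the first declared field; check your compiler's layout.
struct CorrelationBits {
    std::uint32_t typeData = 0;

    std::uint32_t size() const      { return get(0, 8); }
    std::uint32_t valueType() const { return get(8, 4); }
    std::uint32_t classId() const   { return get(12, 16); }
    std::uint32_t reserved() const  { return get(28, 4); }

    void setSize(std::uint32_t v)      { set(0, 8, v); }
    void setValueType(std::uint32_t v) { set(8, 4, v); }
    void setClassId(std::uint32_t v)   { set(12, 16, v); }

private:
    static std::uint32_t mask(unsigned bits) {
        return (std::uint32_t{1} << bits) - 1u;
    }
    std::uint32_t get(unsigned shift, unsigned bits) const {
        return (typeData >> shift) & mask(bits);
    }
    void set(unsigned shift, unsigned bits, std::uint32_t v) {
        const std::uint32_t m = mask(bits) << shift;
        typeData = (typeData & ~m) | ((v << shift) & m);  // clear field, then store
    }
};

inline std::uint32_t packExample() {
    CorrelationBits c;
    c.setSize(24);
    c.setValueType(2);
    c.setClassId(0xBEEF);
    assert(c.size() == 24 && c.valueType() == 2 && c.classId() == 0xBEEF);
    assert(c.reserved() == 0);  // never written, so it stays 0 as the header requires
    return c.typeData;
}
```

The same get/set pair translates directly to Delphi property getters and setters using shr, shl and and.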

Is the declaration "const typedef enum" valid in C++?

I thought enums were static, what's the point of a const enum?
For example:
const typedef enum
{
NORMAL_FUN = 1,
GREAT_FUN = 2,
TERRIBLE_FUN = 3,
} Annoying;
I have had an old program dropped on my head that I am being forced to work with (from an equipment manufacturer), and I keep coming across enums being defined with const typedef enum.
Now, I am used to C#, so I don't fully understand all the C++ trickery that goes on, but this case appears to be straightforward.
From the coding of the program it would appear that variables that are of type Annoying are meant to be changed, everywhere, all the time.
They aren't meant to be constant. Long story short, the compiler doesn't like it.
This sample was written back sometime prior to 2010, so this could be some kind of version difference, but what did/does const typedef enum even mean?

That makes the type alias Annoying constant, so all variables declared with that type alias are constant:
Annoying a = NORMAL_FUN;
a = GREAT_FUN; // Failure, trying to change a constant variable

const typedef Type def; and typedef const Type def; mean the same thing, and have for many years. There's nothing special about the case where Type is an enum definition, and you can see it too in:
const typedef int const_int;
const_int i = 3;
i = 4; // error

Writing
typedef enum
{
NORMAL_FUN = 1,
GREAT_FUN = 2,
TERRIBLE_FUN = 3,
} Annoying;
has the advantage that the enum also works nicely in C, where the typedef makes Annoying usable as a type name without the enum keyword. So the provider of the enum declaration could also be targeting C.
Using the const qualifier means that you cannot write code like
Annoying foo = NORMAL_FUN;
foo = GREAT_FUN; // this will fail as `foo` is a `const` type.
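A minimal sketch of what the answers above describe: the const belongs to the alias itself, and removing it yields a type whose variables can be reassigned. MutableAnnoying and reassignExample() are hypothetical names added for the illustration. In real code, simply dropping const from the typedef is the more direct fix.

```cpp
#include <cassert>
#include <type_traits>

const typedef enum
{
    NORMAL_FUN = 1,
    GREAT_FUN = 2,
    TERRIBLE_FUN = 3,
} Annoying;

// The qualifier is part of the alias, so every Annoying variable is const.
static_assert(std::is_const<Annoying>::value, "Annoying is a const-qualified type");

// Stripping the qualifier recovers the underlying, assignable enum type.
typedef std::remove_const<Annoying>::type MutableAnnoying;
static_assert(!std::is_const<MutableAnnoying>::value, "qualifier removed");

inline int reassignExample() {
    MutableAnnoying a = NORMAL_FUN;
    a = GREAT_FUN;  // fine: a is not const
    assert(a == GREAT_FUN);
    return a;
}
```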

Is it possible to expand typedef with forced compiler error?

I have been using the method shown below to force the compiler to tell me a variable's type:
template <class T>
struct show_type;
Using it with the variable of interest makes the compiler report an error about an incomplete struct type:
typedef int32_t s32;
s32 a;
show_type<decltype(a)>();
GCC 5.3.0 produces the error:
invalid use of incomplete type 'struct show_type<int>'
And MSVC 2015:
'show_type<s32>': no appropriate default constructor available
Now I wonder if there is a way to force an error to show full hierarchy of typedefs (that is, s32 -> int32_t -> int), or at least newest typedef and first original type? I don't mind dirty or evil tricks.

Now I wonder if there is a way to force an error to show full hierarchy of typedefs (that is, s32 -> int32_t -> int), or at least newest typedef and first original type?
There is no such hierarchy. s32 is int32_t is int. There is no way to differentiate those three types, since they aren't actually three different types. Two of those are just aliases.
What you're really looking for is static reflection, or P0194. That would allow you to do something like:
using meta_s32 = reflexpr(s32);
using meta_int32 = meta::get_aliased_t<meta_s32>;
using meta_int = meta::get_aliased_t<meta_int32>;
std::cout << meta::get_name_v<meta_s32> << ", "
<< meta::get_name_v<meta_int32> << ", "
<< meta::get_name_v<meta_int> << '\n';
You could produce the reflection hierarchy by repeatedly going up with get_aliased_t and stopping when is_alias_v yields false_type.
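The claim that the aliases are not distinct types can be checked directly in today's C++. In this sketch, IsS32 and int32MatchesS32Specialisation() are hypothetical helpers: a template specialised for s32 is also selected for int32_t, because both names denote the same type.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

typedef std::int32_t s32;

static_assert(std::is_same<s32, std::int32_t>::value,
              "an alias names the very same type");

// A specialisation written for s32 ...
template <class T> struct IsS32 : std::false_type {};
template <> struct IsS32<s32> : std::true_type {};

// ... is also the one chosen for std::int32_t: templates cannot tell them apart.
inline bool int32MatchesS32Specialisation() {
    assert(IsS32<std::int32_t>::value);
    return IsS32<std::int32_t>::value;
}
```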

typedef of nested structs

I'm trying to typedef a group of nested structs using this:
struct _A
{
struct _Sim
{
struct _In
{
STDSTRING UserName;
VARIANT Expression;
int Period;
bool AutoRun;
//bool bAutoSave;
} In;
struct _Out
{
int Return;
} Out;
} Sim;
} A;
typedef _A._Sim._In SIM_IN;
The thing is, the editor in VS2010 likes it. It recognizes the elements in the typedef, and I can use it as a parameter type in functions, but when I build I first get warning C4091 (ignored on left when no variable is declared), which then leads to error C2143, "missing ';' before '.'".
The idea of the typedef is to make managing type definitions (in pointers, prototypes, etc.) for _A._Sim._In easy with one name... a seemingly perfect use for typedef, if the compiler allowed it.
How can I refer to the nested structure with one name, to make pointer management and type specification easier than using the entire nested name (_A._Sim._In)?

The dot operator is a postfix operator applied to an object (in terms of C). I.e., you can not apply it to a type.
To get what you want you can use a function or a macro that works on an object, e.g.:
#define SIM_IN(x) ((x).Sim.In) /* the members are named Sim and In; _Sim and _In are the type names */

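It might not be preferable to do so but, if it cannot be achieved using a typedef, I guess you could always do
#define SIM_IN _A::_Sim::_In
(the macro name has to come first and be a single identifier, and in C++ a nested type is named with ::, not with a dot).
But as I said you might not prefer that for various reasons. :)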
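In C++ a nested type is named with the scope-resolution operator rather than the dot, so a plain typedef is enough. The sketch below uses simplified, hypothetical names (A, Sim, In without the leading underscores, and std::string in place of STDSTRING; the VARIANT member is left out) to keep it self-contained.

```cpp
#include <cassert>
#include <string>

struct A {
    struct Sim {
        struct In {
            std::string UserName;
            int Period;
            bool AutoRun;
        } in;
        struct Out {
            int Return;
        } out;
    } sim;
};

// '::' walks through the nested type names; '.' is only for members of objects.
typedef A::Sim::In SIM_IN;

inline int periodOf(const SIM_IN& in) { return in.Period; }

inline int nestedTypedefExample() {
    A a;
    a.sim.in.Period = 5;
    SIM_IN* p = &a.sim.in;  // one short name for pointers and prototypes
    assert(periodOf(*p) == 5);
    return p->Period;
}
```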

Use Enum or #define?

I'm building a toy interpreter and I have implemented a token class which holds the token type and value.
The token type is usually an integer, but how should I abstract the int's?
What would be the better idea:
// #defines
#define T_NEWLINE 1
#define T_STRING 2
#define T_BLAH 3
/**
* Or...
*/
// enum
enum TokenTypes
{
t_newline = 1,
t_string = 2,
t_blah = 3
};

Enums can be cast to ints; furthermore, they're the preferred way of enumerating lists of predefined values in C++. Unlike #defines, they can be put in namespaces, classes, etc.
Additionally, if you need the first index to start with 1, you can use:
enum TokenTypes
{
t_newline = 1,
t_string,
t_blah
};
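A short sketch confirming the numbering and showing the enum in use. The function tokenName() is a hypothetical helper added for the illustration.

```cpp
#include <cassert>
#include <string>

enum TokenTypes
{
    t_newline = 1,
    t_string,
    t_blah
};

// Enumerators without an explicit value continue from the previous one.
static_assert(t_string == 2 && t_blah == 3, "values continue counting from 1");

inline std::string tokenName(TokenTypes t) {
    switch (t) {
        case t_newline: return "newline";
        case t_string:  return "string";
        case t_blah:    return "blah";
    }
    assert(false && "unexpected token type");
    return "unknown";
}
```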

Enums work in debuggers (e.g. saying "print x" will print the "English" value). #defines don't (i.e. you're left with the numeric and have to refer to the source to do the mapping yourself).
Therefore, use enums.

There are various solutions here.
The first, using #define refers to the old days of C. It's usually considered bad practice in C++ because symbols defined this way don't obey scope rules and are replaced by the preprocessor which does not perform any kind of syntax check... leading to hard to understand errors.
The other solutions are about creating global constants. The net benefit is that instead of being interpreted by the preprocessor they will be interpreted by the compiler, and thus obey syntax checks and scope rules.
There are many ways to create global constants:
// ints
const int T_NEWLINE = 1;
struct Tokens { static const int T_FOO = 2; };
// enums
enum { T_BAR = 3 }; // anonymous enum
enum Token { T_BLAH = 4 }; // named enum
// Strong Typing
BOOST_STRONG_TYPEDEF(int, Token);
const Token NewLine = 1;
const Token Foo = 2;
// Other Strong Typing
class Token
{
public:
static const Token NewLine; // defined to Token("NewLine")
static const Token Foo; // defined to Token("Foo")
bool operator<(Token rhs) const { return mValue < rhs.mValue; }
bool operator==(Token rhs) const { return mValue == rhs.mValue; }
bool operator!=(Token rhs) const { return mValue != rhs.mValue; }
friend std::string toString(Token t) { return t.mValue; } // for printing
private:
explicit Token(const char* value);
const char* mValue;
};
All have their strengths and weaknesses.
int lacks type safety: you can easily use one category of constants where another is expected.
enum supports auto-incrementing, but you don't get pretty printing and it's still not that type safe (even though it's a bit better).
StrongTypedef I prefer to enum. You can get back to int.
Creating your own class is the best option: here you get pretty printing for your messages, for example, but it's also a bit more work (not much, but still).
Also, the int and enum approaches are likely to generate code as efficient as the #define approach: compilers substitute the const values for their actual values whenever possible.
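The class-based option above only declares its constants. One way to complete it so that it compiles is sketched here. The out-of-class definitions, the constructor body and the describe() helper are assumptions added for the illustration, and operator< is left out for brevity.

```cpp
#include <cassert>
#include <string>

class Token
{
public:
    static const Token NewLine;
    static const Token Foo;

    // Each constant owns a distinct string literal, so comparing the stored
    // pointers compares token identity.
    bool operator==(Token rhs) const { return mValue == rhs.mValue; }
    bool operator!=(Token rhs) const { return mValue != rhs.mValue; }

    friend std::string toString(Token t) { return t.mValue; }  // for printing

private:
    explicit Token(const char* value) : mValue(value) {}
    const char* mValue;
};

// Definitions of the static members; they may use the private constructor.
const Token Token::NewLine("NewLine");
const Token Token::Foo("Foo");

inline std::string describe(Token t) {
    assert(t == Token::NewLine || t == Token::Foo);
    return "token:" + toString(t);
}
```

Because the constructor is private, no token value can be created outside the class, which is where the extra type safety over a plain int or enum comes from.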

In the cases like the one you've described I prefer using enum, since they are much easier to maintain. Especially, if the numerical representation doesn't have any specific meaning.

Enum is type safe, easier to read, easier to debug and well supported by IntelliSense. I would say use enum whenever possible, and resort to #define when you have to.
See this related discussion on const versus define in C/C++; my answer to that post also lists when you have to use the #define preprocessor directive.
Shall I prefer constants over defines?

I vote for enum
#defines aren't type safe and can be redefined if you aren't careful.

Another reason for enums: They are scoped, so if the label t_blah is present in another namespace (e.g. another class), it doesn't interfere with t_blah in your current namespace (or class), even if they have different int representations.

enum provides type safety, readability and debugger support. These are very important, as already mentioned.
Another thing that enum provides is a collection of possibilities. E.g.
enum color
{
red,
green,
blue,
unknown
};
I think this is not possible with #define (or const's for that matter)

Ok, many many answers have been posted already so I'll come up with something a little bit different: C++0x strongly typed enumerators :)
enum class Color /* Note the "class" */
{
Red,
Blue,
Yellow
};
Characteristics, advantages and differences from the old enums
Type-safe: int color = Color::Red; will be a compile-time error. You would have to use Color color or cast Red to int.
Change the underlying type: You can change its underlying type (many compilers offer extensions to do this in C++98 too): enum class Color : unsigned short. unsigned short will be the type.
Explicit scoping (my favorite): in the example above Red will be undefined; you must use Color::Red. Imagine the new enums as being sort of namespaces too, so they don't pollute your current namespace with what is probably going to be a common name ("red", "valid", "invalid", etc.).
Forward declaration: enum class Color; tells the compiler that Color is an enum and you can start using it (but not values, of course); sort of like class Test; and then use Test *.
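The characteristics listed above can be checked in one small sketch. The helper functions toInt() and blueAsInt() are hypothetical. Note that the forward declaration has to state the same underlying type as the later definition.

```cpp
#include <cassert>
#include <type_traits>

enum class Color : unsigned short;  // forward declaration: the type is usable now

inline int toInt(Color c);          // declared before any enumerator is known

enum class Color : unsigned short
{
    Red,
    Blue,
    Yellow
};

static_assert(std::is_same<std::underlying_type<Color>::type, unsigned short>::value,
              "the underlying type is the one requested");
static_assert(!std::is_convertible<Color, int>::value,
              "no implicit conversion to int");

inline int toInt(Color c) { return static_cast<int>(c); }  // the cast must be explicit

inline int blueAsInt() {
    Color c = Color::Blue;  // the enumerator needs the Color:: qualifier
    assert(c != Color::Red);
    return toInt(c);
}
```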
