LLVM malloc an array of pointers - llvm

I am writing a compiler of my own language to LLVM-IR. I have defined some structure type representing an array:
{ i32, [ 0 x i32] }
Now I need to allocate memory for an array of ten pointers to these structs, i.e.
[10 x { i32, [0 x i32] }*]
But to tell malloc how much to allocate, I need the size of a pointer. How can I find that out?
P.S. I see that 8 bytes per pointer should be fine, since no mainstream architecture has larger pointers, but I am looking for a more general solution.

The DataLayout of an LLVM Module specifies the size of pointers. The DataLayout is tied to the architecture and comes with the target triple, which every LLVM Module should have.
The DataLayout for the x86_64 architecture looks like this:
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"
Edit: to set your pointer size explicitly you can add p[n]:&lt;size&gt;:&lt;abi&gt;:&lt;pref&gt;, where p denotes pointers in address space [n] (this is optional and defaults to 0). The second parameter is the size of the pointer in bits (e.g., 64). For specifications missing from the DataLayout above, the replacement rules are used (quoted from here):
When LLVM is determining the alignment for a given type, it uses the following rules:
If the type sought is an exact match for one of the specifications, that specification is used. If no match is found, and the type sought is an integer type, then the smallest integer type that is larger than the bitwidth of the sought type is used.
If none of the specifications are larger than the bitwidth then the largest integer type is used. For example, given the default specifications above, the i7 type will use the alignment of i8 (next largest) while both i65 and i256 will use the alignment of i64 (largest specified).
If no match is found, and the type sought is a vector type, then the largest vector type that is smaller than the sought vector type will be used as a fall back. This happens because <128 x double> can be implemented in terms of 64 <2 x double>, for example.
Backends will recognize the pointer size and accept it (if valid).
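For illustration, a DataLayout string that pins pointers explicitly might look like this (a sketch: the p0:64:64:64 field is added to the x86_64 layout quoted above; check the LangRef for the exact field ordering your LLVM version expects):

```
target datalayout = "e-m:e-p0:64:64:64-i64:64-f80:128-n8:16:32:64-S128"
```

Here p0:64:64:64 says: pointers in address space 0 are 64 bits wide, with a 64-bit ABI alignment and a 64-bit preferred alignment.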
With this said, you can do the following to get the allocation size of any Type, using the Module's DataLayout, with the LLVM C++ API:
Module* M = /* your current module */;
Type* myType = /* some type */;
unsigned size = M->getDataLayout()->getTypeAllocSize(myType);
// size is 8 with the DataLayout defined above
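If you'd rather compute the size inside the IR itself, the usual trick is a getelementptr one element past a null pointer followed by ptrtoint; a sketch with illustrative names (typed-pointer syntax):

```llvm
%T = type { i32, [0 x i32] }

declare i8* @malloc(i64)

define i8* @alloc_ptr_array() {
  ; size of one %T* pointer: address of element 1, counting from null
  %gep = getelementptr %T*, %T** null, i64 1
  %psz = ptrtoint %T** %gep to i64
  ; total bytes for an array of 10 such pointers
  %tot = mul i64 %psz, 10
  %mem = call i8* @malloc(i64 %tot)
  ret i8* %mem
}
```

Constant folding reduces the gep/ptrtoint pair to a target-specific constant, so no runtime cost is incurred.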

Related

How compilers identify the length of byte shift operators

Consider the following line:
int mask = 1 << shift_amount;
we know that mask is 4 bytes because it was explicitly declared int, but the 1 being shifted has no explicit type. If the compiler chose char it would be 8 bits, or unsigned short would be 16 bits, so the result of the shift really depends on how the compiler decides to treat that 1. How does the compiler decide here? And is it safe to leave the code this way, or should it instead be:
int flag = 1;
int mask = flag << shift_amount;
1 is an int (typically 4 bytes). If you wanted it to be a type other than int you'd use a suffix, like 1L for long. For more details see https://en.cppreference.com/w/cpp/language/integer_literal.
You can also use a cast like (long)1 or if you want a known fixed length, (int32_t)1.
As Eric Postpischil points out in a comment, values smaller than int like (short)1 are not useful because the left-hand argument to << is promoted to int anyway.
The 2018 C standard says in 6.4.4 3:
Each constant has a type, determined by its form and value, as detailed later.
This means we can always tell what the type of a constant is just from the text of the constant itself, without regard to the expression it appears in. (Here, “constant” actually means a literal: A thing whose value is given by its text. For example 34 and 'A' literally represent the number 34 and the character A, in contrast to an identifier foo that refers to some object.)
(This answer addresses C specifically. The rules described below are different in C++.)
The subclauses of 6.4.4 detail the various kinds of constants (integers, floating-point, enumerations, and characters). An integer constant without a suffix that can be represented in an int is an int, so 1 is an int.
If an integer constant has a suffix or does not fit in an int, then its type is affected by its suffix, its value, and whether it is decimal, octal, or hexadecimal, according to a table in 6.4.4.1 5.
Floating-point constants are double if they have no suffix, float with f or F, and long double with l or L.
Enumeration constants (declared with enum) have type int. (And these are not directly literals as I describe above, because they are names for values, but the name does indicate the value by way of the enum declaration.)
Character constants without a prefix have type int. Constants with prefixes L, u, or U have type wchar_t, char16_t, or char32_t, respectively.

ARM Neon: How to convert from uint8x16_t to uint8x8x2_t?

I recently discovered the vreinterpret{q}_dsttype_srctype casting operator. However, this doesn't seem to support conversion to the array-of-vector types described at this link (bottom of the page):
Some intrinsics use an array of vector types of the form:
<type><size>x<number of lanes>x<length of array>_t
These types are treated as ordinary C structures containing a single
element named val.
An example structure definition is:
struct int16x4x2_t
{
    int16x4_t val[2];
};
Do you know how to convert from uint8x16_t to uint8x8x2_t?
Note that the problem cannot reliably be addressed using a union (reading from an inactive member leads to undefined behaviour. Edit: that's only the case for C++; it turns out that C allows type punning), nor by casting pointers (that breaks the strict aliasing rule).
It's completely legal in C++ to type pun via pointer casting, as long as you're only doing it to char*. This, not coincidentally, is what memcpy is defined as working on (technically unsigned char* which is good enough).
Kindly observe the following passage:
For any object (other than a base-class subobject) of trivially
copyable type T, whether or not the object holds a valid value of type
T, the underlying bytes (1.7) making up the object can be copied into
an array of char or unsigned char.
If the content of the array of char or unsigned char is copied back
into the object, the object shall subsequently hold its original
value. [Example:
#define N sizeof(T)
char buf[N];
T obj;
// obj initialized to its original value
std::memcpy(buf, &obj, N);
// between these two calls to std::memcpy,
// obj might be modified
std::memcpy(&obj, buf, N);
// at this point, each subobject of obj of scalar type
// holds its original value
— end example ]
Put simply, copying like this is the intended function of std::memcpy. As long as the types you're dealing with meet the necessary triviality requirements, it's totally legit.
Strict aliasing does not include char* or unsigned char*: you are free to alias any type with these.
Note that for unsigned ints specifically, you have some very explicit leeway here. The C++ Standard requires that they meet the requirements of the C Standard, and the C Standard mandates the format. The only way trap representations or anything like that can be involved is if your implementation has padding bits, but ARM does not have any: 8-bit bytes, 8-bit and 16-bit integers with no padding. So for unsigned integers on implementations with zero padding bits, any byte pattern is a valid unsigned integer.
For unsigned integer types other than unsigned char, the bits of the object representation shall be divided into two groups: value bits and padding bits (there need not be any of the latter). If there are N value bits, each bit shall represent a different power of 2 between 1 and 2^(N−1), so that objects of that type shall be capable of representing values from 0 to 2^N − 1 using a pure binary representation; this shall be known as the value representation. The values of any padding bits are unspecified.
Based on your comments, it seems you want to perform a bona fide conversion -- that is, to produce a distinct, new, separate value of a different type. This is a very different thing than a reinterpretation, such as the lead-in to your question suggests you wanted. In particular, you posit variables declared like this:
uint8x16_t a;
uint8x8x2_t b;
// code to set the value of a ...
and you want to know how to set the value of b so that it is in some sense equivalent to the value of a.
Speaking to the C language:
The strict aliasing rule (C2011 6.5/7) says,
An object shall have its stored value accessed only by an lvalue
expression that has one of the following types:
a type compatible with the effective type of the object, [...]
an aggregate or union type that includes one of the aforementioned types among its members [...], or
a character type.
(Emphasis added. Other enumerated options involve differently-qualified and differently-signed versions of the effective type of the object or compatible types; these are not relevant here.)
Note that these provisions never interfere with accessing a's value, including the member value, via variable a, and similarly for b. But don't overlook the usage of the term "effective type" -- this is where things can get bollixed up under slightly different circumstances. More on that later.
Using a union
C certainly permits you to perform a conversion via an intermediate union, or you could rely on b being a union member in the first place so as to remove the "intermediate" part:
union {
    uint8x16_t  x1;
    uint8x8x2_t x2;
} temp;
temp.x1 = a;
b = temp.x2;
Using a typecast pointer (to produce UB)
However, although it's not so uncommon to see it, C does not permit you to type-pun via a pointer:
// UNDEFINED BEHAVIOR - strict-aliasing violation
b = *(uint8x8x2_t *)&a;
// DON'T DO THAT
There, you are accessing the value of a, whose effective type is uint8x16_t, via an lvalue of type uint8x8x2_t. Note that it is not the cast that is forbidden, nor even, I'd argue, the dereferencing -- it is reading the dereferenced value so as to apply the side effect of the = operator.
Using memcpy()
Now, what about memcpy()? This is where it gets interesting. C permits the stored values of a and b to be accessed via lvalues of character type, and although its arguments are declared to have type void *, this is the only plausible interpretation of how memcpy() works. Certainly its description characterizes it as copying characters. There is therefore nothing wrong with performing a
memcpy(&b, &a, sizeof a);
Having done so, you may freely access the value of b via variable b, as already mentioned. There are aspects of doing so that could be problematic in a more general context, but there's no UB here.
However, contrast this with the superficially similar situation in which you want to put the converted value into dynamically-allocated space:
uint8x8x2_t *c = malloc(sizeof(*c));
memcpy(c, &a, sizeof a);
What could be wrong with that? Nothing, as far as it goes, but here you have UB if you afterward try to access the value of *c. Why? Because the memory to which c points does not have a declared type, so its effective type is the effective type of whatever was last stored in it (if that has an effective type), including when that value was copied into it via memcpy() (C2011 6.5/6). As a result, the object to which c points has effective type uint8x16_t after the copy, whereas the expression *c has type uint8x8x2_t; the strict aliasing rule says that accessing that object via that lvalue produces UB.
So there are a bunch of gotchas here. This reflects C++.
First you can convert trivially copyable data to char* or unsigned char* or c++17 std::byte*, then copy it from one location to another. The result is defined behavior. The values of the bytes are unspecified.
If you do this from a value of one type to another via something like memcpy, it can result in undefined behaviour upon access through the target type, unless the target type has valid values for all byte representations, or the layout of the two types is specified by your compiler.
There is the possibility of "trap representations" in the target type -- byte combinations that result in machine exceptions or something similar if interpreted as a value of that type. Imagine a system that doesn't use IEEE floats and where doing math on NaN or INF or the like causes a segfault.
There are also alignment concerns.
In C, I believe that type punning via unions is legal, with similar qualifications.
Finally, note that under a strict reading of the C++ standard, foo* pf = (foo*)malloc(sizeof(foo)); does not yield a pointer to a foo even if foo is plain old data. You must create an object before interacting with it, and the only way to create an object outside of automatic storage is via new or placement new. This means you must have an object of the target type before you memcpy into it.
Do you know how to convert from uint8x16_t to uint8x8x2_t?
uint8x16_t input = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 };
uint8x8x2_t output = { vget_low_u8(input), vget_high_u8(input) };
One must understand that with neon intrinsics, uint8x16_t represents a 16-byte register; while uint8x8x2_t represents two adjacent 8-byte registers. For ARMv7 these may be the same thing (q0 == {d0, d1}) but for ARMv8 the register layout is different. It's necessary to get (extract) the low 8 bytes and the high 8 bytes of the single 16-byte register using two functions. The clang compiler will determine which instruction(s) are necessary based on the context.

Why is the C++ memory defined using a maximal sequence of bit-fields?

The smallest unit of storage is a byte, see quotes from standard here:
The fundamental storage unit in the C++ memory model is the byte.
But then a memory location is defined to possibly be adjacent bit-fields:
A memory location is either an object of scalar type or a maximal sequence of adjacent bit-fields all having nonzero width.
I would like to understand this definition:
If the smallest storage unit is a byte, why don't we define a memory location as a sequence of bytes?
How do C-style bitfields fit with the first sentence at all?
What is the point of maximal sequence; what is maximal here?
If we have bitfields in the definition of memory, why do we need anything else? E.g. a float or int are both made up of bits, so the 'either an object of scalar type' part seems redundant.
Let's analyze the terms:
For reference:
http://en.cppreference.com/w/cpp/language/bit_field
http://en.cppreference.com/w/cpp/language/memory_model
Byte
As you said, smallest unit of (usually) 8 bits in memory, explicitly addressable using a memory address.
Bit-Field
A sequence of BITS given with explicit bit count!
Memory location
Every single address of a byte-multiple type OR (!!!) the beginning of a contiguous sequence of bit-fields of non-zero size.
Your questions
Let's take the cppreference example with some more comments and answer your questions one by one:
struct S {
    char a;      // memory location #1: 8-bit character, scalar type, no bit-field sequence (no :#)
    int b : 5;   // memory location #2: new sequence, new location, integer type of 5 bits length
    int c : 11,  // memory location #2 (continued): integer type of 11 bits length
          : 0,   // (continued but ending!) IMPORTANT: zero-size bit-field, sequence ends here!!!
        d : 8;   // memory location #3: integer type, 8 bits, starts a new bit-field sequence, thus a new memory location
    struct {
        int ee : 8; // memory location #4
    } e;
} obj; // The object 'obj' consists of 4 separate memory locations
If the smallest storage unit is a byte, why don't we define a memory location as a sequence of bytes?
Maybe we want fine-grained, bit-level control of the memory consumption of given types, e.g. a 7-bit integer or a 4-bit char.
A byte as the holy grail of units would deny us that freedom.
How do C-style bitfields fit with the first sentence at all?
Actually, since the bit-field feature originates in C...
The important thing here is: even if you define a struct with bitfields consuming, for example, only 11 bits, the first bit will be byte-aligned in memory, i.e. it will have a location aligned to 8-bit steps, and the data type will finally consume at least (!) 16 bits to hold the bitfield.
The exact way the data is stored is, at least in C, not standardized, afaik.
What is the point of maximal sequence; what is maximal here?
The point of the maximal sequence is to allow efficient memory alignment of individual fields, compiler optimization, etc. Maximal in this case means all bitfields declared in a sequence of size >= 1, i.e. with no interleaved scalar types and no zero-width bitfield (':0').
If we have bitfields in the definition of memory, why do we need anything else? E.g. a float or int are both made up of bits, so the 'either an object of scalar type' part seems redundant.
Nope, both are made up of bits, BUT: not specifying the bit-size of the type will make the compiler assume the default size, e.g. int: 32 bits. If you don't need the full resolution of the integer value, but for example only 24 bits, you write unsigned int v : 24, ...
Of course, the non-bitfield way to write stuff can be expressed with bitfields, e.g.:
int a;
int b : 32 // should behave like a
BUT: if the system-defined default width of type T is n bits and you write something like
T value : m // m > n
the languages differ: in C this is a constraint violation (the width of a bit-field shall not exceed the width of the declared type, C11 6.7.2.1), while C++ allows it and treats the excess bits as padding.
You can infer some of the reasons by looking at the statement that follows: " Two or more threads of execution can access separate memory locations without interfering with each other."
I.e. two threads cannot independently update separate bytes within a single object, and two threads accessing adjacent bitfields in the same sequence may also interfere with each other.
The maximal sequence is there because the standard does not specify exactly how a sequence of bitfields is mapped to bytes, and which of those bytes can then be accessed independently; implementations may vary in this respect. However, the maximal sequence of bitfields is the longest sequence that any implementation may allocate as a whole. In particular, a zero-width bitfield terminates a maximal sequence, and the next bitfield starts a new sequence.
And while integers and floats are made up of bits, "bitfield" in C and C++ refers specifically to 'object members of integral type, whose width in bits is explicitly specified.' Not everything made of bits is a bitfield.

Is the map from pointers to their integer representation homomorphic?

Let f: Pointers -> Integer_Representation be a map provided by the implementation (I hope that map doesn't depend on the way we cast a pointer to an integral type). Let p be a pointer to T and i be a variable of integral type.
Does the standard explicitly define that the map is homomorphic, i.e. f(p+i) = f(p) + i*sizeof(T)? In general I would like to understand how addition between pointers and integers is constrained.
It isn't. The specification does not require anything for it. It is implementation-defined and some implementations may be weird.
In similar cases it always helps to remember the memory models of the 16-bit 8086. There, pointers are 32 bits, segment+offset, but segment and offset overlap to form only a 20-bit address. In huge mode, pointers are normalized to the smallest offset.
So say p = 0123:0004 (which converts to f(p) = 0x01230004), i = 42 and sizeof(T) = 2. Then p + i = 0128:0008, which converts to f(p+i) = 0x01280008, but f(p) + i*sizeof(T) = 0x01230058, a different representation, though of the same address.
On the other hand in large model, the pointers are not normalized. So you can have both 0128:0008 and 0123:0058 and they are different pointers, but point to the same address.
Both follow the letter of the standard, because pointer arithmetic is only required to work on pointers into the same array or allocated block, and the conversion to integer is completely implementation-defined.

Member offset macro - need details

Please take a look at this macro. It is used in the Symbian OS SDK, whose compiler is based on GCC (versions < 4).
#ifndef _FOFF
#if __GNUC__ < 4
#define _FOFF(c,f) (((TInt)&(((c *)0x1000)->f))-0x1000)
#else
#define _FOFF(c,f) __builtin_offsetof(c,f)
#endif
#endif
I understand that it is calculating offset to specific class/struct member. But I cannot understand how that weird statement works - what is the constant 0x1000 and why is it there? Could somebody please explain this to me?
Imo 0x1000 is just an arbitrarily chosen number. It is not a valid pointer, and you could probably use zero instead of it.
How it works:
Casts 0x1000 into a class pointer (a pointer of type c): (c*)0x1000
Takes the address of the f member of class c: &(((c *)0x1000)->f)
Casts it into TInt: ((TInt)&(((c *)0x1000)->f))
Subtracts the integer value of the base pointer (0x1000 in this case) from the integer value of the pointer to c's member: (((TInt)&(((c *)0x1000)->f))-0x1000)
Because f isn't actually read or written, there is no access violation/segfault in practice.
You could probably use zero instead of 0x1000 and discard the subtraction (i.e. just use "((TInt)&(((c *)0x0000)->f))"), but maybe the author thought that subtracting the base pointer from the pointer to the member is a more "proper" way than directly casting a pointer into an integer. Or maybe the compiler provides "hidden" class members that can have negative offsets (which is possible in some compilers; for example, the Delphi compiler (I know it isn't C++) provided multiple hidden "fields" located before the "self" (analogue of "this") pointer), in which case using 0x1000 instead of 0 makes sense.
"If there was a member of struct c starting exactly at the (perfectly-aligned;-) address 0x1000, then at what address would the struct's member f be?" -- answer: the offset you're looking for, minus of course the hypothetical starting address 0x1000 for the struct... with the difference, AKA distance or offset, computed as integers, otherwise the automatic scaling in address arithmetic throws you off (whence the cast).
What parts of the expression, specifically, are giving you problems?
The inner part &(((c *)0x1000)->f) is "the address of member f of a hypothetical struct c located at 0x1000". Right in front of it is the cast (I assume TInt is some kind of integer type, of course), then the - 0x1000 to get the offset (AKA the distance or difference between the address of the specific member of interest and the start of the whole structure).
It is working out the address of f as a member of a class/struct located at address 0x1000, and then subtracting 0x1000 so that only the difference between the class/struct address and the member's address is returned. I imagine a non-zero value (i.e., 0x1000) is used to avoid null-pointer detection.