Confusion between GetElementPtr and C++ API

Looking at the documentation of GetElementPtr:
http://llvm.org/docs/GetElementPtr.html
The examples rely on multiple indices: the first for the struct member and the second for the element in the array. This supposedly returns an offset from the base pointer.
I'm trying to figure out what's the correct way to create a given GetElementPtr instruction with the C++ API. Unfortunately, there are several varieties of the CreateXXXGEP instruction, with a parameter "val" that I presume is one of the indices. No version of it seems to use two indices as in the documentation: http://llvm.org/docs/doxygen/html/classllvm_1_1IRBuilder.html
Even the CreateStructGEP uses a single idx parameter!
I want to do a very simple thing; I want to take a char buffer:
Value* vB = llvm::ConstantDataArray::getString(...)
and use the pointer to the array to pass it to another function that expects i8*.

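You're probably looking for the variants taking an array of Value *. Construct ConstantInts, put them in a std::vector and pass them in. For a pointer to the first character of an array, both indices are zero: the first steps through the pointer itself, and the second selects element 0 of the array.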

Related

What is the best way to pass uint8_t* buffer and size_t bufferlen to API function in C from a C++

A C function API accepts uint8_t* and size_t as parameters:
bool foo(uint8_t* buff, size_t bufflen)
What is the best way to manage the buffer in the C++ layer that invokes this API? Is string, vector or list a better option?

Just make sure that when you call this API from C++ you always pass a pointer to uint8_t. A plain array uint8_t arr[x] (x being any positive number) also works. Make sure the address you pass refers to uint8_t data and that the size matches the buffer.
e.g. uint8_t arr[6]; for this the call will be foo(arr, 6);

You probably want std::vector<uint8_t>, passing data() and size().

You can't pass a container to the C function. You can still use one in your C++ code, but you'll need to pass a pointer to the data, in accordance with the C function's parameters. Use a vector. This is equivalent to an array in C, in that the data is stored contiguously in memory.
std::vector<uint8_t> myData;
// ... fill myData
// for c++11 and later,
foo(myData.data(), myData.size());
// pre-c++11
foo(&(myData[0]), myData.size());
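To make the vector approach concrete, here is a minimal sketch. The body of foo below is a hypothetical stand-in (the real one lives in the C library), and call_foo and example are names invented for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the C API from the question. The real foo() is
// provided by the C library; this one only reports whether it got any bytes.
extern "C" bool foo(std::uint8_t* buff, std::size_t bufflen) {
    return buff != nullptr && bufflen > 0;
}

// Thin wrapper (assumed name): the vector owns the bytes, the C function
// only borrows them for the duration of the call.
inline bool call_foo(std::vector<std::uint8_t>& data) {
    return foo(data.data(), data.size());
}

// Usage sketch: call this from main() to exercise the wrapper.
inline void example() {
    std::vector<std::uint8_t> bytes{0x01, 0x02, 0x03};
    assert(call_foo(bytes));
}
```

Keeping the call behind one wrapper means the pointer and the length always come from the same container, so they cannot drift apart.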

Is string, vector or list a better option?
Well, list is a non-starter, because it will not store the sequence in contiguous memory. In other words, the C code would not be able to take a pointer to the first element and increment it to get to the second element.
As for the other two, that depends on the rest of your C++ code. I would lean towards vector rather than string, but you haven't really provided enough context for that to be any more than a general feeling.

Usually I would go with a helper class that has a method that either takes a vector, or a custom structure that acts like a span - i.e. a pair<void*,int>, or perhaps even a span (but I'm not allowed the C++14 crayons).
If the data really is character-based then std::string and string spans can work well, but if it is really binary data, vector and vector spans are the better encapsulation, IMBO.
I still don't want to call that directly from application code if what is actually in there is structured data. You can easily write a method that takes an expected structure type and generates the pointer and sizeof(instance).
You can write a generic template that would accept any structure and convert it to a void*/char* and length, but that tends to open your code up to more accidents.
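A method that takes one expected structure type could look roughly like the sketch below. Header, send_header, and the body of foo are all hypothetical; only foo's signature comes from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical stand-in for the C API; the real foo() comes from the C library.
extern "C" bool foo(std::uint8_t* buff, std::size_t bufflen) {
    return buff != nullptr && bufflen > 0;
}

// Hypothetical structured payload that the application wants to hand over.
struct Header {
    std::uint32_t id;
    std::uint16_t flags;
    std::uint16_t length;
};

// Accepts exactly one structure type and derives pointer and size itself,
// so application code never spells out the cast or the sizeof.
inline bool send_header(Header& header) {
    static_assert(std::is_trivially_copyable<Header>::value,
                  "only plain data may be handed to the C API as raw bytes");
    return foo(reinterpret_cast<std::uint8_t*>(&header), sizeof(Header));
}

// Usage sketch.
inline void example() {
    Header h{42u, 0u, 8u};
    assert(send_header(h));
}
```

One overload per expected structure is more typing than a generic template, but it rejects unexpected types at compile time, which is the trade-off described above.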

How to uniquely identify an instruction in LLVM Pass?

So I am trying to keep a count of how many times certain call instructions are executed, and I am struggling to identify the instructions uniquely. I couldn't find anything like an instruction ID in the documentation. I want to get the ID and pass it on to an external function that knows how to do the job.
So the question is how can I get a unique ID for those instructions (preferably as an integer)?

I take it you perform the counting at run time, and in the pass you are just inserting code that performs that counting near the call instructions you are interested in. In this case the Instruction pointer should work just fine. The pointer does not change if you move an Instruction around; it only becomes invalid if you delete the Instruction.
To convert a pointer into an integer use reinterpret_cast<uintptr_t>(i); a static_cast from a pointer to an integer type does not compile.
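As a rough illustration of the pointer-as-ID idea, the sketch below uses an empty struct named Instruction as a stand-in so that it compiles without LLVM. The helper name instruction_id is an assumption. In C++ the pointer-to-integer conversion is spelled reinterpret_cast.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for llvm::Instruction so the sketch builds without LLVM headers.
struct Instruction {};

// Derives an integer ID from the object's address (hypothetical helper).
// The ID is stable only while the instruction stays alive.
inline std::uintptr_t instruction_id(const Instruction* inst) {
    assert(inst != nullptr);
    return reinterpret_cast<std::uintptr_t>(inst);
}
```

Two live instructions never share an address, so their IDs differ. An ID obtained this way is only meaningful within a single run of the pass.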

If you know the kinds of call instructions that are possible, then you can declare an enum covering all of them and pass the matching enum value to the counting function whenever you come across that kind of call instruction.
If you don't know all the possible call instructions, then you can pass the name of the function being called to the counting function. In this case you would have to implement the counting function so that it maintains a map from function names to the count for each function.
Since each call instruction is its own Value, every Instruction* you get is unique to one call site. So the pointer value won't serve as an ID if you want to count per called function rather than per call site.
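The name-based counting function could be sketched as follows. count_call and calls_to are hypothetical names; the pass would have to insert a call to count_call, with the callee's name as a constant string, before each call instruction of interest.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

// Table of callee name -> number of executed calls, created on first use.
std::unordered_map<std::string, std::uint64_t>& call_counts() {
    static std::unordered_map<std::string, std::uint64_t> counts;
    return counts;
}

// Runtime hook (assumed name) that the instrumented program calls.
extern "C" void count_call(const char* callee_name) {
    assert(callee_name != nullptr);
    ++call_counts()[callee_name];
}

// Read-back helper: how many calls to `name` have been recorded so far.
std::uint64_t calls_to(const std::string& name) {
    const auto it = call_counts().find(name);
    return it == call_counts().end() ? 0 : it->second;
}
```

The extern "C" linkage keeps the symbol name unmangled, which makes it easier to declare from the pass.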

Why does C++ have an array type?

I am learning C++. I found that a pointer has the same function as an array: in a[4], a can be either a pointer or an array. But as C++ defines it, arrays of different lengths are different types. In fact, when we pass an array to a function, it is converted into a pointer automatically, which I think is further proof that arrays can be replaced by pointers. So, my question is:
Why doesn't C++ replace all arrays with pointers?

In early C it was decided to represent the size of an array as part of its type, available via the sizeof operator. C++ has to be backward compatible with that. There's much wrong with C++ arrays, but having size as part of the type is not one of the wrong things.
Regarding
"pointer has the same function with array, like a[4], a can be both pointer and array"
no, this is just an implicit conversion, from array expression to pointer to first item of that array.
As weird as it sounds, C++ does not provide indexing of built-in arrays. There's indexing for pointers, and p[i] just means *(p+i) by definition, so you can also write that as *(i+p) and hence as i[p]. And thus also i[a], because it's really the pointer that's indexed. Weird indeed.
The implicit conversion, called a “decay”, loses information, and is one of the things that are wrong about C++ arrays.
The indexing of pointers is a second thing that's wrong (even if it makes a lot of sense at the assembly language and machine code level).
But it needs to continue to be that way for backward compatibility.
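The equivalence of the index spellings, and the information lost by decay, can be checked with a short sketch. The function names are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>

// All four spellings name the same element, because p[i] is defined as *(p + i).
inline bool index_forms_agree(const int* p, std::size_t i) {
    return p[i] == *(p + i) && *(p + i) == *(i + p) && *(i + p) == i[p];
}

// After decay the callee only sees a pointer, so sizeof reports a pointer's size.
inline std::size_t size_seen_by_callee(const int* p) {
    return sizeof(p);
}

// Usage sketch.
inline void example() {
    int a[4] = {10, 20, 30, 40};
    assert(index_forms_agree(a, 2));                 // a decays to &a[0] at the call
    assert(2[a] == 30);                              // same element as a[2]
    assert(sizeof(a) == 4 * sizeof(int));            // the array type knows its length
    assert(size_seen_by_callee(a) == sizeof(int*));  // the decayed pointer does not
}
```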
Why array decay is Bad™: this causes an array of T to often be represented by simply a pointer to T.
You can't see from such a pointer (e.g. as a formal argument) whether it points to a single T object or to the first item of an array of T.
But much worse, if T has a derived class TD, where sizeof(TD) > sizeof(T), and you form an array of TD, then you can pass that array to a formal argument that's pointer to T – because that array of TD decays to pointer to TD which converts implicitly to pointer to T. Now using that pointer to T as an array yields incorrect address computations, due to incorrect size assumption for the array items. And bang crash (if you're lucky), or perhaps just incorrect results (if you're not so lucky).

In C and C++, everything of a single type has the same size. An int[4] array is twice as big as an int[2] array, so they can't be of the same type.
But then you might ask, "Why should type imply size?" Well:
A local variable needs to take up a certain amount of memory. When you declare an array, it takes up memory that scales up with its length. When you declare a pointer, it is always the size of pointers on your machine.
Pointer arithmetic is determined by the size of the type it's pointing to: the distance between the address pointed to by p and that pointed to by p+1 is exactly the size of its type. If types didn't have fixed sizes, then p would need to carry around extra information, or C would have to give up arithmetic.
A function needs to know how big its arguments are, because functions are compiled to expect their variables to be in particular places, and having a parameter with an unknown size screws that up.
And you say, "Well, if I pass an array to a function, it just turns into a pointer anyway." True, but you can make new types that have arrays as members, and then you can pass THOSE types around. And in C++, you can in fact pass an array as an array.
int sum10(int (&arr)[10]){ //only takes int arrays of size 10
    int result = 0;
    for(int i=0; i<10; i++)
        result += arr[i];
    return result;
}
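As a variation on the same idea (not part of the original answer), the length can be made a template parameter so that one function accepts built-in arrays of any size while still seeing them as arrays. The name sum is arbitrary.

```cpp
#include <cassert>
#include <cstddef>

// N is deduced from the argument's array type, so no separate length is passed.
template <std::size_t N>
int sum(const int (&arr)[N]) {
    int result = 0;
    for (std::size_t i = 0; i < N; ++i)
        result += arr[i];
    return result;
}

// Usage sketch.
inline void example() {
    int three[3] = {1, 2, 3};
    int five[5] = {1, 1, 1, 1, 1};
    assert(sum(three) == 6);
    assert(sum(five) == 5);
}
```

Passing a plain int* to sum does not compile, because a pointer carries no length from which N could be deduced.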

You can't use pointers in place of array declarations without having to use malloc/free or new/delete to create and destroy memory on the heap. You can declare an array as a variable and it gets created on the stack, so you do not have to worry about its destruction.

Well, an array is an easier way of dealing with data and manipulating it. However, in order to use pointers you need to have a valid memory address to point to. Also, the two concepts do not differ when it comes to passing them to a function: an array argument decays to a pointer, so in both cases the function works on the caller's data. Hope that helps.

I'm not sure if I get your question, but assuming you're new to coding:
When you declare an array int a[4], you let the compiler know you need memory for 4 ints, and the compiler associates a with the start of that block. When you later use a[x], it computes the address x*sizeof(int) bytes past the start (in C++ terms, a + x) and dereferences it to get the int.
In other words, what gets passed around is always a pointer rather than an 'array'; the array is just an abstraction that makes it easier for you to code.

Using void pointers in calculations

This is quite a long introduction to a simple question, but otherwise there will be questions of the type "Why do you want to handle void pointers in C++? You horrible person!". Which I'd rather (a)void. :)
I'm using a C library from which I initially retrieve a list of polygons which it will operate on. The function I use gives me an array of pointers (PolygonType**), from which I create a std::vector<MyPolyType> of my own polygon class MyPolyType. This is in turn used to create a boost::graph with node identifiers given by the index in the vector.
At a later time in execution, I want to calculate a path between two polygons. These polygons are given to me in form of two PolygonType*, but I want to find the corresponding nodes in my graph. I can find these if I know the index they had in the previous vector form.
Now, the question: The PolygonType struct has a void* to an "internal identifier" which it seems I cannot know the type of. I do know, however, that the pointer increases with a fixed step (120 bytes). And I know the value of the pointer, which would be the offset of the first object. Can I use this to calculate my index with (p-p0)/120, where p is the address of the Polygon I want to find, and p0 is the offset of the first polygon? I'd have to cast the addresses to ints; is this portable? (The application can be used on Windows and Linux.)

You cannot subtract two void pointers. The compiler will complain that it doesn't know the size of the pointed-to type. You must first cast them to char pointers (char*), then subtract them, and then divide the result by 120. If you are dead sure that your object's size is actually 120, then it is safe (though ugly), provided that p and p0 point to objects within the same array.
I still don't understand why p0 is the offset? I'd say p is the address of your Polygon, and p0 is the address of the first polygon... Am I misunderstanding something?
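A minimal sketch of the char* subtraction follows. The 120-byte stride is the asker's observation rather than something the library guarantees, and kStride and index_of are names chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Assumption taken from the question: consecutive records lie 120 bytes apart.
constexpr std::ptrdiff_t kStride = 120;

// Computes the index of the record at p relative to the first record at p0.
// Both pointers must point into the same underlying array of bytes.
inline std::ptrdiff_t index_of(const void* p, const void* p0) {
    const char* cur = static_cast<const char*>(p);
    const char* first = static_cast<const char*>(p0);
    const std::ptrdiff_t bytes = cur - first;  // byte distance, valid for char*
    assert(bytes >= 0 && bytes % kStride == 0);
    return bytes / kStride;
}
```

No cast to int is needed: subtracting two char* values already yields a std::ptrdiff_t, which is wide enough on both 32-bit and 64-bit targets.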

Given that the pointer is pointing to an "internal identifier" I don't think you can make any assumptions about the actual values stored in it. If the pointer can point into the heap you may be just seeing one possible set of values and it will subtly (or obviously) break in the future.
Why not just create a one-time reverse mapping of PolygonType* -> index and use that?
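The reverse mapping could be built once, right where the vector is created. In this sketch PolygonType is a stand-in struct, and PolygonIndex, build_index and index_of are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>

// Stand-in for the library's polygon type; only its address matters here.
struct PolygonType {
    void* internal_id;
};

using PolygonIndex = std::unordered_map<const PolygonType*, std::size_t>;

// Records the position of every pointer in the list returned by the library.
inline PolygonIndex build_index(PolygonType* const* polygons, std::size_t count) {
    PolygonIndex index;
    index.reserve(count);
    for (std::size_t i = 0; i < count; ++i)
        index.emplace(polygons[i], i);
    return index;
}

// Looks up the graph node index for a polygon handed back by the library.
inline std::size_t index_of(const PolygonIndex& index, const PolygonType* polygon) {
    const auto it = index.find(polygon);
    assert(it != index.end() && "polygon was not in the original list");
    return it->second;
}
```

This relies only on the library handing back the same PolygonType* values it returned in the original list, not on the layout of the internal identifier.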

Why is it not allowed to pass arrays by value to a function in C and C++?

C and C++ allow passing structures and objects by value to a function, but prevent passing arrays by value.
Why?

In C/C++, internally, an array is passed as a pointer to some location, and basically, it is passed by value. The thing is, that copied value represents a memory address to the same location.
In C++, a vector<T> is copied and passed to another function, by the way.

You can pass an array by value, but you have to first wrap it in a struct or class. Or simply use a type like std::vector.
I think the decision was for the sake of efficiency. One wouldn't want to do this most of the time. It's the same reasoning as why there are no unsigned doubles. There is no associated CPU instruction, so you have to make what's not efficient very hard to do in a language like C++.
As #litb mentioned: "C++1x and boost both have wrapped native arrays into structs providing std::array and boost::array which i would always prefer because it allows passing and returning of arrays within structs"
An array object holds the elements themselves, and its type records how many there are. Note that this is not the same as a pointer to the first element of the array.
Most people think that you have to pass an array as a pointer and specify the size as a separate parameter, but this is not needed. You can pass a reference to the actual array itself while maintaining its sizeof() status.
#include <cassert>
//Here you need the size because you have reduced
// your array to an int* pointing to the first element.
void test1(int *x, int size)
{
    assert(sizeof(x) == sizeof(int*));
}
//This function can take in an array of size 10
void test2(int (&x)[10])
{
    assert(sizeof(x) == 10 * sizeof(int));
}
//Same as test2 but by pointer
void test3(int (*x)[10])
{
    assert(sizeof(*x) == 10 * sizeof(int));
    //Note to access elements you need to do: (*x)[i]
}
Some people may say that the size of an array is not known. This is not true.
int x[10];
assert(sizeof(x) == 10 * sizeof(int));
But what about allocations on the heap? Allocations on the heap do not return an array. They return a pointer to the first element of an array. So new is not type safe. If you do indeed have an array variable, then you will know the size of what it holds.
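The wrapping trick mentioned at the start of this answer can be sketched as follows. Wrapped and zero_first_and_sum are made-up names. The callee gets its own copy in both overloads, so the caller's data stays untouched.

```cpp
#include <array>
#include <cassert>

// A built-in array wrapped in a struct is copied when the struct is passed by value.
struct Wrapped {
    int values[4];
};

// Modifies its private copy, then sums it.
inline int zero_first_and_sum(Wrapped w) {
    w.values[0] = 0;
    int total = 0;
    for (int v : w.values) total += v;
    return total;
}

// std::array is the standard library's version of the same wrapper.
inline int zero_first_and_sum(std::array<int, 4> a) {
    a[0] = 0;
    int total = 0;
    for (int v : a) total += v;
    return total;
}

// Usage sketch.
inline void example() {
    Wrapped w{{1, 2, 3, 4}};
    assert(zero_first_and_sum(w) == 9);
    assert(w.values[0] == 1);  // the caller's array is unchanged
}
```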

EDIT: I've left the original answer below, but I believe most of the value is now in the comments. I've made it community wiki, so if anyone involved in the subsequent conversation wants to edit the answer to reflect that information, feel free.
Original answer
For one thing, how would it know how much stack to allocate? That's fixed for structures and objects (I believe) but with an array it would depend on how big the array is, which isn't known until execution time. (Even if each caller knew at compile-time, there could be different callers with different array sizes.) You could force a particular array size in the parameter declaration, but that seems a bit strange.
Beyond that, as Brian says there's the matter of efficiency.
What would you want to achieve through all of this? Is it a matter of wanting to make sure that the contents of the original array aren't changed?

I think that there are 3 main reasons why arrays are passed as pointers in C instead of by value. The first 2 are mentioned in other answers:
efficiency
because there's no size information for arrays in general (if you include dynamically allocated arrays)
However, I think a third reason is due to:
the evolution of C from earlier languages like B and BCPL, where arrays were actually implemented as a pointer to the array data
Dennis Ritchie talks about the early evolution of C from languages like BCPL and B and in particular how arrays are implemented and how they were influenced by BCPL and B arrays and how and why they are different (while remaining very similar in expressions because array names decay into pointers in expressions).
http://plan9.bell-labs.com/cm/cs/who/dmr/chist.html

I'm not actually aware of any languages that support passing naked arrays by value. To do so would not be particularly useful and would quickly chomp up the call stack.
Edit: To downvoters - if you know better, please let us all know.

This is one of those "just because" answers. C++ inherited it from C, and had to follow it to keep compatibility. It was done that way in C for efficiency. You would rarely want to make a copy of a large array (remember, think PDP-11 here) on the stack to pass it to a function.

from C How To Program-Deitel p262
..
"C automatically passes arrays to functions by reference. The array’s name evaluates to the address of the array’s first element. Because the starting address of the array is passed, the called function knows precisely where the array is stored. Therefore, when the called function modifies array elements in its function body, it’s modifying the actual elements of the array in their original memory locations. "
this helped me, hope it helps you too
