Confused about LLVM Arrays - llvm

I'm confused about the steps needed to create, store to, and load values from LLVM arrays. So far I'm creating one with:
auto type = llvm::ArrayType::get(<TYPE>, <SOME_UINT>);
auto array = builder.CreateAlloca(type);
With that, I tried to get an in-bounds GEP to each element and store llvm::Values to them, but that didn't work...
Is there some guide for doing this?

Well, today I discovered the llc -march=cpp tool, and it actually helped me a lot, so I will answer here with what I did in the end:
Create the array (for that we need the type first):
auto arrayType = llvm::ArrayType::get(llvm::IntegerType::get(context, 32), size);
auto arrayPtr = new llvm::AllocaInst(arrayType, "", block);
Now we will store some values to the array. For that we need two integers (read the GetElementPtr manual to learn what each of them does); the second integer gives the index (as in array[index]):
auto zero = llvm::ConstantInt::get(context, llvm::APInt(64, 0, true));
auto index = llvm::ConstantInt::get(context, llvm::APInt(32, INDEX, true));
Now we can point to the element in the array and store something to it (we will store the index itself, as in array[index] = index):
auto ptr = llvm::GetElementPtrInst::Create(arrayPtr, { zero, index }, "", block);
auto store = new llvm::StoreInst(index, ptr, false, block);
Do that for each element of the array.
Now to load, assuming you have an llvm::Value as the index (which was my case, and ExtractElementInst didn't handle that, as far as I know at least):
First get a pointer to the element (just like before):
ptr = llvm::GetElementPtrInst::Create(arrayPtr, { zero, index }, "", block);
And load the value to some variable:
auto value = builder.CreateLoad(ptr);
One thing I learned: you cannot easily create variable-length arrays. You will have to use some stack tricks that I don't actually know how to use yet, but if you want to know, these are the intrinsics involved: http://llvm.org/docs/LangRef.html#llvm-stacksave-intrinsic
Try compiling this simple code:
int n = 5;
int array[n];
with clang -S -emit-llvm your-file.c
And you will see those instructions.

Following @Shello's code, I've also been using it for GlobalVariable arrays:
Value *index_pointer(GlobalVariable *array, Value *index) {
return Builder.CreateGEP(
array, {ConstantInt::get(Context, APInt(64, 0, true)), index}, "tmp");
}

Here's what worked for me, inspired by @Shelo's answer:
// The values we want to create an array of
int integers[] = { 1, 2, 3, 4, 5 };
// Create llvm constants
std::vector<llvm::Value *> values;
for (int i : integers) {
Value * value = ConstantInt::get(context, llvm::APInt(32, i, true));
values.push_back(value);
}
// Get type of element and create array-type
Type * type = values[0]->getType();
auto arrayType = llvm::ArrayType::get(type, values.size());
// Create the alloca and zero index
AllocaInst *alloca = Builder.CreateAlloca(arrayType);
auto zero = ConstantInt::get(context, llvm::APInt(32, 0, true));
// Foreach value, create the corresponding index as a Value, create a GEP and store.
for (size_t i = 0; i < values.size(); i++) {
auto index = ConstantInt::get(context, llvm::APInt(32, i, true));
auto ptr = Builder.CreateGEP(alloca, { zero, index });
Builder.CreateStore(values[i], ptr);
}
// Do something with alloca

Fastest way to determine if a uint64 has been "seen" already

I've been interested in optimizing "renumbering" algorithms that can relabel an arbitrary array of integers with duplicates into labels starting from 1. Sets and maps are too slow for what I've been trying to do, as are sorts. Is there a data structure that only remembers if a number has been seen or not reliably? I was considering experimenting with a bloom filter, but I have >12M integers and the target performance is faster than a good hashmap. Is this possible?
Here's a simple example pseudo-c++ algorithm that would be slow:
// note: all elements guaranteed > 0
std::vector<uint64_t> arr = { 21942198, 91292, 21942198, ... millions more };
std::unordered_map<uint64_t, uint64_t> renumber;
renumber.reserve(arr.size());
uint64_t next_label = 1;
for (uint64_t i = 0; i < arr.size(); i++) {
uint64_t elem = arr[i];
if (renumber[elem]) {
arr[i] = renumber[elem];
}
else {
renumber[elem] = next_label;
arr[i] = next_label;
++next_label;
}
}
Example input/output:
{ 12, 38, 1201, 120, 12, 39, 320, 1, 1 }
->
{ 1, 2, 3, 4, 1, 5, 6, 7, 7 }
Your algorithm is not bad, but the appropriate data structure to use for the map is a hash table with open addressing.
As explained in this answer, std::unordered_map can't be implemented that way: https://stackoverflow.com/a/31113618/5483526
So if the STL container is really too slow for you, then you can do better by making your own.
Note, however, that:
90% of the time, when someone complains about STL containers being too slow, they are running a debug build with optimizations turned off. Make sure you are running a release build compiled with optimizations on. Running your code on 12M integers should take a few milliseconds at most.
You are accessing the map multiple times when only once is required, like this:
uint64_t next_label = 1;
for (size_t i = 0; i < arr.size(); i++) {
uint64_t elem = arr[i];
uint64_t &label = renumber[elem];
if (!label) {
label = next_label++;
}
arr[i] = label;
}
Note that the unordered_map operator [] returns a reference to the associated value (creating it if it doesn't exist), so you can test and modify the value without having to search the map again.
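The open-addressing suggestion above could be sketched roughly as follows. This is only an illustration under stated assumptions, not a tuned implementation: the function name renumber_open_addressing, the multiplicative hash constant, linear probing, and the load factor of at most 0.5 are all choices made for this example, and it relies on the question's guarantee that every element is greater than 0 so that 0 can mark an empty slot.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Relabels arr in place with labels 1, 2, 3, ... in order of first appearance.
// Assumes every element is > 0, so 0 can mark an empty slot (as in the question).
// Returns the number of distinct values.
uint64_t renumber_open_addressing(std::vector<uint64_t>& arr) {
    // Power-of-two capacity of at least 2 * size keeps the load factor <= 0.5.
    std::size_t capacity = 16;
    while (capacity < 2 * arr.size()) capacity *= 2;
    const std::size_t mask = capacity - 1;

    std::vector<uint64_t> keys(capacity, 0);    // 0 == empty slot
    std::vector<uint64_t> labels(capacity, 0);  // label stored next to each key
    uint64_t next_label = 1;

    for (uint64_t& elem : arr) {
        assert(elem != 0 && "0 is reserved as the empty marker");
        // Multiplicative hash (an assumption of this sketch) to spread clustered keys.
        std::size_t slot =
            static_cast<std::size_t>((elem * 0x9E3779B97F4A7C15ULL) >> 32) & mask;
        // Linear probing: walk forward until we find the key or an empty slot.
        while (keys[slot] != 0 && keys[slot] != elem) slot = (slot + 1) & mask;
        if (keys[slot] == 0) {  // first time we see this value
            keys[slot] = elem;
            labels[slot] = next_label++;
        }
        elem = labels[slot];
    }
    return next_label - 1;
}
```

Because keys and labels live in two flat vectors, each lookup usually touches one or two cache lines instead of chasing a bucket pointer, which is where the potential speedup over std::unordered_map comes from. Whether it is actually faster for a given input should be measured in an optimized build.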
Updated with bug fix
First, anytime you experience "slowness" with a std:: collection class like vector or map, just recompile with optimizations (release build). There is usually a 10x speedup.
Now to your problem. I'll show a two-pass solution that runs in O(N) time. I'll leave it as an exercise for you to convert to a one-pass solution. But I'll assert that this should be fast enough, even for vectors with millions of items.
First, declare not one, but two unordered maps:
std::unordered_map<uint64_t, uint64_t> element_to_label;
std::unordered_map<uint64_t, std::pair<uint64_t, std::vector<uint64_t>>> label_to_elements;
The first map, element_to_label, maps an integer value found in the original array to its unique label.
The second map, label_to_elements, maps each label to both the element value and the list of indices at which that element occurs in the original array.
Now to build these maps:
element_to_label.reserve(arr.size());
label_to_elements.reserve(arr.size());
uint64_t next_label = 1;
for (size_t index = 0; index < arr.size(); index++)
{
const uint64_t elem = arr[index];
auto itor = element_to_label.find(elem);
if (itor == element_to_label.end())
{
// new element
element_to_label[elem] = next_label;
auto &p = label_to_elements[next_label];
p.first = elem;
p.second.push_back(index);
next_label++;
}
else
{
// existing element
uint64_t label = itor->second;
label_to_elements[label].second.push_back(index);
}
}
When the above code has run, it has built up a database of all values in the array, their labels, and the indices where they occur.
So now to renumber the array such that all elements are replaced with their smaller label value:
for (auto itor = label_to_elements.begin(); itor != label_to_elements.end(); itor++)
{
uint64_t label = itor->first;
auto& p = itor->second;
uint64_t elem = p.first; // technically, this isn't needed. It's just useful to know which element value we are replacing from the original array
const auto& vec = p.second;
for (size_t j = 0; j < vec.size(); j++)
{
size_t index = vec[j];
arr[index] = label;
}
}
Notice where I assign variables by reference with the & operator to avoid making an expensive copy of any value in the maps.
So if your original vector or array was this:
{ 100001, 2000002, 300003, 400004, 400004, 300003, 2000002, 100001 };
Then the application of labels would render the array as this:
{1,2,3,4,4,3,2,1}
And what's nice is that you still have a quick O(1) lookup to map any label in that set back to its original element value using label_to_elements.
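For the one-pass conversion left as an exercise above, one possible sketch is shown here. It is not the answer author's code: the function name renumber_one_pass is made up for this example, it uses try_emplace and structured bindings (so it assumes C++17), and it replaces the second map with a plain vector indexed by label, which drops the per-element index lists.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <vector>

// One pass: relabels arr in place and returns label_to_element, where
// label_to_element[label] is the original value (index 0 is unused padding).
std::vector<uint64_t> renumber_one_pass(std::vector<uint64_t>& arr) {
    std::unordered_map<uint64_t, uint64_t> element_to_label;
    element_to_label.reserve(arr.size());
    std::vector<uint64_t> label_to_element = {0};  // labels start at 1

    for (uint64_t& elem : arr) {
        // try_emplace inserts only if elem is new, so there is one lookup either way.
        auto [it, inserted] =
            element_to_label.try_emplace(elem, label_to_element.size());
        if (inserted) label_to_element.push_back(elem);  // remember the original value
        elem = it->second;                               // overwrite with its label
    }
    // Every distinct value received exactly one label.
    assert(label_to_element.size() == element_to_label.size() + 1);
    return label_to_element;
}
```

With the example array above, the call leaves {1,2,3,4,4,3,2,1} in place, and the returned vector maps label 3 back to 300003 with a single index operation.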

Creating BoolTensor Mask in torch C++

I am trying to create a mask for torch in C++ of type BoolTensor. The first n elements in dimension one need to be False and the rest need to be True.
This is my attempt but I do not know if this is correct (size is the number of elements):
src_mask = torch::BoolTensor({6, 1});
src_mask[:size,:] = 0;
src_mask[size:,:] = 1;
I'm not sure I understand your goal exactly, so here is my best attempt to convert your pseudo-code into C++.
First, with libtorch you declare the type of your tensor through the torch::TensorOptions struct (type names are prefixed with a lowercase k).
Second, Python-like slicing is possible thanks to the torch::Tensor::slice function.
Finally, that gives you something like:
// Creates a tensor of booleans, initially all ones
auto options = torch::TensorOptions().dtype(torch::kBool);
torch::Tensor bool_tensor = torch::ones({6,1}, options);
// Set the slice to 0
int size = 3;
bool_tensor.slice(/*dim=*/0, /*start=*/0, /*end=*/size) = 0;
std::cout << bool_tensor << std::endl;
Please note that this will set the first size rows to 0. I assumed that's what you meant by "first elements in dimension x".
Another way to do it:
using namespace torch::indexing; //for using Slice(...) function
at::Tensor src_mask = at::empty({ 6, 1 }, at::kBool); //empty bool tensor
src_mask.index_put_({ Slice(None, size), Slice() }, 0); //src_mask[:size,:] = 0
src_mask.index_put_({ Slice(size, None), Slice() }, 1); //src_mask[size:,:] = 1

LLVM How to get return value of an instruction

I have a program which allocates memory from stack like this:
%x = alloca i32, align 4
In my pass I want to get the actual memory pointer that points to this allocated memory at runtime. This should be %x. How do I get the pointer in my pass?
Instruction* I;
if (AllocaInst* AI = dyn_cast<AllocaInst>(I)) {
//How to get %x?
}
You can use an Instruction* as a Value* (Instruction inherits from Value); when you do, you are working with the result (return value) of that instruction. In your snippet, AI itself is therefore the Value* for %x. I have adapted some code from my LLVM pass to demonstrate allocating space with alloca and then storing into that location. Notice that the results of instructions can be passed directly to other instructions, because they are values.
// M is the module
// ci is the current instruction
LLVMContext &ctx = M.getContext();
Type* int32Ty = Type::getInt32Ty(ctx);
Type* int8Ty = Type::getInt8Ty(ctx);
Type* voidPtrTy = int8Ty->getPointerTo();
// Get an identifier for rand()
Constant* getRand = M.getOrInsertFunction("rand", FunctionType::get(int32Ty, false));
// Construct the struct and allocate space
Type* strTy[] = {int32Ty, voidPtrTy};
Type* t = StructType::create(strTy);
Instruction* nArg = new AllocaInst(t, "Wrapper Struct", ci);
// Add Store insts here
Value* gepArgs[2] = {ConstantInt::get(int32Ty, 0), ConstantInt::get(int32Ty, 0)};
Instruction* prand = GetElementPtrInst::Create(NULL, nArg, ArrayRef<Value*>(gepArgs, 2), "RandPtr", ci);
// Get a random number
Instruction* tRand = CallInst::Create(getRand, "", ci);
// Store the random number into the struct
Instruction* stPRand = new StoreInst(tRand, prand, ci);
If you want to store to or load from %x, you just use a store or load instruction.
If you want the numeric value of your pointer, use the ptrtoint instruction.

Passing an array into a function c++

I'm having an issue passing an entire array of histograms into a function in C++.
The arrays are declared like this:
TH1F *h_Energy[2];
h_Energy[0] = new TH1F("h1", "h1", 100, 0, 100);
h_Energy[1] = new TH1F("h2", "h2", 100, 0, 100);
And here is what I'm trying to do in the function:
void overlayhists(TH1 *hists, int numhists) {
int ymax = 0;
for (int i=0; i<numhists; i++) {
if (hist[i].GetMaximum() > ymax) {
ymax = (hist[i].GetMaximum())*1.05;
}
}
}
And I'm passing the function an array like this
overlayhists(*h_Energy, 2);
Where h_Energy is a 1D array with 2 elements. The code runs through the first histogram in the loop, but as soon as it starts the second iteration and tries to access hist[i].GetMaximum(), it segfaults.
What gives?
This creates an array of pointers to type TH1F
TH1F *h_Energy[2]; //edited after OP changed
If you want to use this and subsequently pass it as an argument, you must first initialize it and then write your function prototype to accommodate it:
void overlayhists(TH1F **hists, int numhists);
^^
From what you have shown above, you would call it like this: (after your initializations)
h_Energy[0] = new TH1F("h1", "h1", 100, 0, 100);
h_Energy[1] = new TH1F("h2", "h2", 100, 0, 100);
overlayhists(h_Energy, 2);
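To make the prototype change concrete, here is a minimal sketch of what the function body might look like once it receives a pointer to the array of pointers. Hist is a hypothetical stand-in for ROOT's TH1F so that the example compiles without ROOT, and overlay_ymax is a made-up name. Unlike the question's loop, it applies the 5% headroom once at the end instead of inside the comparison.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for ROOT's TH1F; only GetMaximum() matters here.
struct Hist {
    double max;
    double GetMaximum() const { return max; }
};

// hists points to the first element of an array of Hist pointers.
double overlay_ymax(Hist** hists, std::size_t numhists) {
    assert(hists != nullptr);
    double ymax = 0.0;
    for (std::size_t i = 0; i < numhists; ++i) {
        // hists[i] is a Hist*, so members are reached with -> rather than .
        if (hists[i]->GetMaximum() > ymax) ymax = hists[i]->GetMaximum();
    }
    return ymax * 1.05;  // add 5% headroom above the tallest histogram
}
```

The original call overlayhists(*h_Energy, 2) passed only the first pointer, so indexing hists[1] inside the function stepped past a single heap object instead of moving to the second array slot. Passing h_Energy itself hands over the whole array of pointers.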
1. Passing an array to a function in C++ to change its contents:
Refer to this code snippet:
//calling:
int nArr[5] = {1,2,3,4,5};
Mul(nArr, 5);
Whenever you pass an array to a function, you actually pass a pointer to the first element of the array. This conversion is implicit in both C++ and C. If you pass a normal (non-array) value, it is passed by value.
// Function Mul() declaration and definition
void Mul(int* nArr, size_t nArrSize){
size_t itr = 0;
for(;itr<nArrSize; itr++)
nArr[itr] = 5*nArr[itr];// multiply each element by 5
}
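The decay described above can be checked with a small self-contained sketch. The names mul_by_five and element_count are made up for this illustration. The first function receives only a pointer, so the size has to travel separately; the second takes a reference to the array, which keeps the element count in the type.

```cpp
#include <cassert>
#include <cstddef>

// The array argument decays to int*, so the size must be passed separately.
// Writes through the pointer are visible to the caller.
void mul_by_five(int* arr, std::size_t size) {
    assert(arr != nullptr || size == 0);
    for (std::size_t i = 0; i < size; ++i) arr[i] *= 5;
}

// A reference to an array does not decay: N is deduced from the argument.
template <std::size_t N>
std::size_t element_count(const int (&)[N]) {
    return N;
}
```

Calling mul_by_five(nArr, element_count(nArr)) on the five-element nArr from the snippet above multiplies every element in the caller's array, without a hard-coded 5 for the size.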
2. Passing a pointer to a function in C++ to change what the pointer points to:
Now let us suppose we want to copy nArr (from the code snippet above) to another array, say nArrB.
The best way for a beginner would be to use a reference to the pointer.
You can pass a reference to the pointer to your function:
//so we had
int nArr[5] = {1,2,3,4,5};
int *nArrB;
Here we don't know in advance what size nArrB will be.
To copy nArr to nArrB we have to pass nArr, a reference to the pointer nArrB (a pointer to the pointer would also work), and the size of the array.
Here is the implementation.
//Calling
CopyNArr(nArr, nArrB, 5);
//Function implementation
void CopyNArr(int* nArr, int* & nArrB, size_t nArrSize) {
// dynamically allocate the array; new int[n] already reserves room for n ints
nArrB = new int[nArrSize];
size_t itr = 0;
//Copying values
for(;itr<nArrSize; itr++)
nArrB[itr] = nArr[itr];
}
//After copy nArrB is pointing to first element of 5 element array.
I hope it helped. Write for any further clarification.
You have an array of size 2, but you've created only one element. And that one with a wrong index. Array indexing starts with 0.
The elements should be at h_histogram[0] and h_histogram[1].
I am sorry if this answer is completely irrelevant, but
I am tempted to post it. This is an experiment I
did after seeing your question.
#include<iostream>
using namespace std;
int main()
{
int e[2]={0,1};
int *p[2];
int i;
/*
Printing the array e content using one pointer
from an array of pointers. Here I am not using p[2]
at all.
*/
p[1]=e;
cout<<"Elements of e are : \n";
for(i=0;i<2;i++)
{
cout<<*(p[1]+i)<<endl;
/*
In the above line both *((*p)+i) and *(p+i)
won't serve the purpose of printing the array values.
*/
}
/*Printing the array e content using pointer to array*/
cout<<"Elements of e are : \n";
for(i=0;i<2;i++)
{
cout<<*(e+i)<<endl;
}
/*Note that pointer to array is legal but array TO pointer
(don't confuse with array OF pointers) is not.*/
}
Hope this will refresh your understanding.

Array as out parameter in c++

I created a function that returns an error code (ErrCode enum) and passes back two output parameters. But when I print the result of the function, I don't get the correct values in the array.
// .. some codes here ..
ErrCode err;
short lstCnt;
short lstArr[] = {};
err = getTrimmedList(lstArr, &lstCnt);
// list returned array (for comparison)
for (int i=0; i<lstCnt; ++i)
printf("lstArr[%3d] = %d", i, lstArr[i]);
// .. some codes here ..
The getTrimmedList function is like this:
ErrCode getTrimmedList(short* vList, short* vCnt)
{
short cnt;
ErrCode err = foo.getListCount(FOO_TYPE_1, &cnt);
if (NoError!=err) return err;
short* list = new short [cnt];
short total = 0;
for (short i=0; i<cnt; ++i)
{
FooBar bar = foo.getEntryByIndex(FOO_TYPE_1, i);
if (bar.isDeleted) continue;
list[total] = i;
++total;
}
*vCnt = total;
//vList = (short*)realloc(index, sizeof(short)*total);
vList = (short*)malloc(sizeof(short)*total);
memcpy(vList, list, sizeof(short)*total)
// list returned array (for comparison)
for (int i=0; i<lstCnt; ++i)
printf("lstArr[%3d] = %d", i, lstArr[i]);
return NoError;
}
where:
foo is an object that holds arrays of FooBar objects
foo.getListCount() returns the number of objects with type FOO_TYPE_1
FOO_TYPE_1 is the type of object we want to take/list
foo.getEntryByIndex() returns the ith FooBar object with type FOO_TYPE_1
bar.isDeleted is a flag that tells if bar is considered as 'deleted' or not
What's my error?
Edit:
Sorry, I copied a wrong line. I commented it above and put the correct line.
Edit 2
I don't have control over the returns of foo and bar. All their function returns are ErrCode and the outputs are passed through parameter.
Couple of questions before I can answer your post...
Where is "index" defined in:
vList = (short*)realloc(index, sizeof(short)*total);
Are you leaking the memory associated with:
short* list = new short [cnt];
Is it possible you have accidentally confused your pointers in memory allocation? In any case, here is an example to go from. You have a whole host of problems, but you should be able to use this as a guide to answer this question as it was originally asked.
WORKING EXAMPLE:
#include "stdio.h"
#include "stdlib.h"
#include "string.h"
int getTrimmedList(short** vList, short* vCnt);
int main ()
{
// .. some codes here ..
int err;
short lstCnt;
short *lstArr = NULL;
err = getTrimmedList(&lstArr, &lstCnt);
// list returned array (for comparison)
for (int i=0; i<lstCnt; ++i)
printf("lstArr[%3d] = %d\n", i, lstArr[i]);
// .. some codes here ..
return 0;
}
int getTrimmedList(short** vList, short* vCnt)
{
short cnt = 5;
short* list = new short [cnt];
short* newList = NULL;
short total = 0;
list[0] = 0;
list[1] = 3;
list[2] = 4;
list[3] = 6;
total = 4;
*vCnt = total;
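newList = (short*)realloc(*vList, sizeof(short)*total);
if ( !newList ) {
// realloc failed: *vList is unchanged (and may be NULL), so do not copy into it
delete[] list;
return 1;
}
memcpy(newList, list, sizeof(short)*total);
*vList = newList;
delete[] list; // memory from new[] must be released with delete[]
return 0;
}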
You have serious problems.
For starters, your function has only one output param as you use it: vCnt.
vList you use as just a local variable.
realloc is called with some index that we know nothing about, which is unlikely to be right. Its argument must be a pointer obtained from malloc() or realloc().
The allocated memory in vList is leaked as soon as you exit getTrimmedList.
Where you call the function, you pass the local lstArr array as the first argument, but it is not used for anything. You then print the original, unchanged array up to the bound in lstCnt while it still has size 0, so the behavior is undefined.
Even if you managed to pass that array by reference, you could not reassign it to a different value; C-style arrays can't do that.
You had better use std::vector, which you can actually pass by reference and fill in the called function, eliminating the redundant size and, importantly, the mess of manual memory handling.
You should use std::vector instead of raw C-style arrays, and pass by reference using "&" instead of "*" here. Right now, you are not properly setting your out parameter. If you want to return a new array to your caller, a pointer to an array would look like "short **arr_ptr", not "short *arr_ptr". That kind of API is highly error-prone, however, as you're finding out.
Your getTrimmedList function, therefore, should have this signature:
ErrCode getTrimmedList(std::vector<short> &lst);
Now you no longer require your "count" parameters, as well -- C++'s standard containers all have ways of querying the size of their contents.
C++11 also lets you be more specific about space requirements for ints, so if you're looking for a 16-bit "short", you probably want int16_t.
ErrCode getTrimmedList(std::vector<int16_t> &lst);
It may also be reasonable to avoid requiring your caller to create the "out" array, since we're using smarter containers here:
std::vector<int16_t> getTrimmedList(); // not a reference in the return here
In this style, we would likely manage errors using exceptions rather than return-codes, however, so other things about your interface would evolve, as well, most likely.
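As a rough sketch of the reference-based signature suggested above, the filtering loop from the question could look like this. Several things here are assumptions made so that the example is self-contained: FooBar is reduced to its isDeleted flag, the entries are passed in as a std::vector instead of being read from the question's foo object, and SomeError is a placeholder enumerator, since the real ErrCode values are not shown.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

enum ErrCode { NoError, SomeError };  // SomeError is a hypothetical placeholder

// Hypothetical stand-in for one entry held by the question's foo object.
struct FooBar {
    bool isDeleted;
};

// Fills lst with the indices of the entries that are not marked deleted.
ErrCode getTrimmedList(const std::vector<FooBar>& entries, std::vector<int16_t>& lst) {
    // Every index we store has to fit in an int16_t.
    if (entries.size() > static_cast<std::size_t>(INT16_MAX)) return SomeError;
    lst.clear();
    for (std::size_t i = 0; i < entries.size(); ++i) {
        if (entries[i].isDeleted) continue;
        lst.push_back(static_cast<int16_t>(i));
    }
    assert(lst.size() <= entries.size());
    return NoError;
}
```

The caller declares a std::vector<int16_t>, passes it in, and reads lst.size() afterwards, so there is no separate count parameter and nothing to free by hand.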