Creating array of integers in LLVM - llvm

I have a vector of integer values, vector<Value*> myIntegers in my LLVM code (not necessarily constant). I want to create a Store instruction to store these integers. To create the store instruction using the format below, for the first argument I need to create a Value* pointing to these integers (create an array out of them).
new StoreInst(Value *Val, Value *Ptr, ...);
If my integers were constants I would have used:
Constant *IntArrayConstant = ConstantDataArray::get(getGlobalContext(), ArrayRef<Value*> myIntegers);
How can I create a generic array of i32 types, with a Value* pointing to it? The documentation says storing ArrayRef is not safe either.

You should probably use VectorType::get() to obtain the vector type, create an UndefValue of that type, and then populate it with N InsertElementInsts, where N is the number of elements. You can then create a StoreInst to write the resulting Value* to memory.
The result of the last InsertElementInst will thus be the Value* you are looking for (i.e. a vector containing the values). Please note that, depending on what you're trying to do, the StoreInst might actually be not needed at all.
Note that I'm assuming that all your Values have the same underlying type (i.e. getType() returns the same result for all of them).
Edit: also note that maybe, depending on what you're trying to do, it could be more appropriate to use ArrayType::get instead of VectorType::get.

Building a dataframe in C++

I am trying to build a DataFrame in C++. I'm facing some problems, such as dealing with variable data type.
I am thinking of a DataFrame inspired by the Pandas DataFrame (from Python). So my design idea is:
1. Build an object 'Series' which is a vector of a fixed data type.
2. Build an object 'DataFrame' which will store a list of Series (this list can be variable).
Item 1 is just a regular vector. So, for instance, the user would call
Series.fill({1,2,3,4}) and it would store the vector {1,2,3,4} in some attribute of Series, say Series.data.
Problem 1. How would I make a class that understands {1,2,3,4} as a vector of 4 integers? Is it possible?
The next problem is:
About 2., I can see the DataFrame as a matrix of n columns and m rows, but the columns can have different data types.
I tried to design this as a vector of n pointers, where each pointer would point to a vector of dimension m with different data types.
I tried to do something like
vector<void*> columns(10)
and fill it with something like
columns[0] = (int*) malloc(8*sizeof(int))
But this does not work, if I try to fill the vector, like
(*columns[0])[0] = 5;
I get an error
::value_type {aka void*}’ is not a pointer-to-object type
(int *) (*a[0])[0] = 5;
How can I do it properly? I still have other questions like, how would I append an undetermined number of Series into a DataFrame, but for now, just building a matrix with columns with different data types is a great start.
I know that I must keep track of the types of pointers inside my void vector but I can create a parallel list with all data types and make this an attribute of my class DataFrame.
Building a heterogeneous container (which a dataframe is supposed to be) in C++ is more complex than you think, because C++ is statically typed. It means you have to know all the types at compile time.
Your approach uses a vector of pointers (there are a few variations of this approach, which I am not going into). This approach is very inefficient, because the pointers point all over memory and thrash your cache locality. I do not recommend even attempting to implement such a dataframe because there is really no point to it.
Look at this implementation of DataFrame in C++: https://github.com/hosseinmoein/DataFrame. You might be able to just use it as is, or get insight from it on how to implement a true heterogeneous DataFrame. It uses a collection of static vectors in a hash table to implement a true heterogeneous container. It also uses contiguous memory space, so it avoids the pointer effect.
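To make the "typed vectors in a hash table" idea a bit more concrete, here is a minimal sketch of my own. It is not how the linked library is implemented: the Column alias, the Frame struct, and the three supported element types are all hypothetical, and it assumes C++17 for std::variant.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <variant>
#include <vector>

// Hypothetical: the closed set of column types this sketch supports.
using Column = std::variant<std::vector<int>,
                            std::vector<double>,
                            std::vector<std::string>>;

// Hypothetical minimal frame: column name -> typed, contiguous vector.
struct Frame {
    std::unordered_map<std::string, Column> columns;

    // Add or replace a column; T is deduced from the vector passed in.
    template <typename T>
    void set(const std::string& name, std::vector<T> values) {
        columns[name] = std::move(values);
    }

    // Typed access. Throws std::out_of_range if the column does not exist
    // and std::bad_variant_access if T is not the stored element type.
    template <typename T>
    std::vector<T>& get(const std::string& name) {
        return std::get<std::vector<T>>(columns.at(name));
    }

    // Number of rows in a column, whatever its element type.
    std::size_t rows(const std::string& name) const {
        return std::visit([](const auto& v) { return v.size(); },
                          columns.at(name));
    }
};
```

With this, df.set("id", std::vector<int>{1, 2, 3, 4}) stores a column and df.get<int>("id")[0] = 5 writes to it with full type checking. The price is that the set of element types has to be fixed at compile time.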
TL;DR Version
Discard what you are doing.
Use vector<vector<int>> columns;. When you need a column, use columns[index].data() to get a pointer to the backing array from the indexed inner vector and pass that int * to whatever required the void *. The int * will be implicitly converted.
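A minimal sketch of that suggestion, assuming all columns hold int. The function sum_ints is a hypothetical stand-in for "whatever required the void *", and make_columns is just a helper I made up for the example.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical C-style API that only accepts an untyped pointer.
// The callee has to know (here: by convention) that the buffer holds ints.
int sum_ints(const void* buffer, std::size_t count) {
    const int* values = static_cast<const int*>(buffer);  // cast back before use
    int total = 0;
    for (std::size_t i = 0; i < count; ++i) total += values[i];
    return total;
}

// Build n_cols columns of n_rows zero-initialised ints.
std::vector<std::vector<int>> make_columns(std::size_t n_cols, std::size_t n_rows) {
    return std::vector<std::vector<int>>(n_cols, std::vector<int>(n_rows, 0));
}
```

Typical use would be auto columns = make_columns(10, 8); columns[0][0] = 5; and then sum_ints(columns[0].data(), columns[0].size()). The int * returned by data() converts to const void * implicitly, so the caller never writes a cast.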
Explanation
Quoting cppreference
void - type with an empty set of values. It is an incomplete type that cannot be completed (consequently, objects of type void are disallowed). There are no arrays of void, nor references to void. However, pointers to void and functions returning type void (procedures in other languages) are permitted.
Since void is incomplete, you can't have a void. void* needs to be cast back to the actual data type, int*, before it can be used for anything other than passing the anonymously typed pointer around. All receivers of the void * have to know what it really is to do anything with it other than pass it on.
Functions that require void * parameters will take any pointer you give them without any further effort on your part, so there is almost no need to make void * variables in C++. Almost all cases where you would need a void * are filled in with polymorphism or templates. The last time I used a void * in C++ was back when I wrote C++ as C with classes bolted on.
The Error
Given
vector<void*> columns(10);
where each element will contain an array of ints, let's work through
(*columns[0])[0] = 5;
step by step to see what types we have and make sure the types at each step are consistent.
columns[0]
Gets the first element in the vector, a void*. So far so good.
*columns[0]
Dereferences the void* at columns[0]. As covered in the preamble, this cannot be done: dereferencing a void * would give you a value of type void, which is not allowed. This produces the reported "::value_type {aka void*}’ is not a pointer-to-object type" error message.
We could
*reinterpret_cast<int*>(columns[0])
to turn it into a pointer to int, something we can dereference and matches the initial type, and receive an int, specifically the first int in the array.
(*reinterpret_cast<int*>(columns[0]))[0]
will fail because you can't index an int. That would be like writing 42[0]. This means the dereference is unnecessary.
The end result needs to look like
reinterpret_cast<int*>(columns[0])[0]
But don't do this. It is unnecessary and grossly over-complicated.

F# Create an empty Array of a pre-defined size

So I'm trying to create an empty Array that is the length of a table row. I know how to get the length of a row, but I haven't got a clue how to make an array with a pre-defined length. The program I'm making is dynamic, so the length of the array will vary depending on the table I'm accessing.
Does anyone know how?
You've said you want an empty array, so take a look at Array.zeroCreate<'T>.
From the documentation:
Creates an array where the entries are initially the default value
Unchecked.defaultof<'T>.
Example:
let arrayOfTenZeroes : int array = Array.zeroCreate 10
This page has a lot of useful information on F# arrays - have a look through it, it should point you in the right direction.
As Panagiotis Kanavos has pointed out in comments, F# differs from a language like C# for array creation, so I will quote directly from the F# language reference article I've linked to above for clarity:
Several functions create arrays without requiring an existing array.
Array.empty creates a new array that does not contain any elements.
Array.create creates an array of a specified size and sets all the
elements to provided values. Array.init creates an array, given a
dimension and a function to generate the elements. Array.zeroCreate
creates an array in which all the elements are initialized to the zero
value for the array's type.

Accessing the values of map from its key using pointer to map

I want to dynamically allocate an array of pointers to an unordered_map in C++. The std::unordered map has been typedef as 'dictionary'.
dict_array= ( dictionary **) calloc(input_size, sizeof(dictionary*));
Now I want to access the individual hashmaps, and for each individual hashmap (mydict), I want to access the values using some key. like below:
for (int i=0; i< input_size; i++){
dictionary *mydict= dict_array[i];
mydict[some_key]++; /*access the value against 'some_key' and increment it*/
}
But this above line to access the value against the key generates a compilation error. What would be the correct way to access it?
In your example, you haven't actually allocated any dictionary (std::unordered_map) objects yet.
The dict_array[i] is simply a null pointer. Thus the assignment to mydict also results in a null pointer. You would need to construct a dictionary first by invoking dict_array[i] = new dictionary();.
The expression mydict[some_key]++ doesn't mean what you think it does because mydict is a dictionary * and not a dictionary. Thus you would need to actually dereference it first before having access to a valid dictionary object:
(*mydict)[some_key]++;
But again, before this would work, you need to initialize the underlying pointers.
Also, it's generally a bad idea (which often leads to undefined behavior) to mix C allocation with C++ standard objects.
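As a rough sketch of the points above, assuming dictionary is something like std::unordered_map<std::string, int> (the question does not show the typedef). I have swapped calloc and raw new for std::unique_ptr so that every pointer refers to a constructed object; the helper names are made up for the example.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <unordered_map>
#include <vector>

// Assumption: the question's typedef looks roughly like this.
using dictionary = std::unordered_map<std::string, int>;

// Allocate 'count' dictionaries. Each pointer owns a constructed map
// and releases it automatically, unlike memory obtained from calloc.
std::vector<std::unique_ptr<dictionary>> make_dict_array(std::size_t count) {
    std::vector<std::unique_ptr<dictionary>> dicts;
    dicts.reserve(count);
    for (std::size_t i = 0; i < count; ++i)
        dicts.push_back(std::make_unique<dictionary>());
    return dicts;
}

// Increment the value stored under 'key' in every dictionary.
void bump_all(std::vector<std::unique_ptr<dictionary>>& dicts,
              const std::string& key) {
    for (auto& dict_ptr : dicts) {
        dictionary* mydict = dict_ptr.get();
        (*mydict)[key]++;  // dereference first, then index the map
    }
}
```

Calling bump_all(dicts, "a") twice leaves every map with "a" mapped to 2, since operator[] value-initialises a missing int to 0 before the increment.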
Why on earth are you messing around with pointers like this?
If you really want an array of pointers, then you'll have to dereference to access the map itself:
(*mydict)[some_key]++;
assuming you've correctly set up each pointer to point to a valid dictionary.
Or use a less insane data structure
std::vector<dictionary> dict_array(input_size);
dict_array[i][some_key]++;
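Put together as a complete function, the vector version might look like the sketch below. The dictionary alias and the function name count_key are assumptions for illustration only.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Assumption: the question's typedef looks roughly like this.
using dictionary = std::unordered_map<std::string, int>;

// Create input_size dictionaries and increment some_key once in each.
std::vector<dictionary> count_key(std::size_t input_size,
                                  const std::string& some_key) {
    std::vector<dictionary> dict_array(input_size);  // input_size empty maps
    for (std::size_t i = 0; i < dict_array.size(); ++i)
        dict_array[i][some_key]++;                   // no pointers, no dereference
    return dict_array;
}
```

The vector constructs and destroys the maps itself, so there is nothing to initialize or free by hand.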
Use operator[]():
mydict->operator[](some_key)++;

Returning a vector of tuples c++

I am trying to create and return a vector of two element arrays (which I will refer to as tuples), however I am running into issues.
std::vector<int *> distr;
int tuple[2];
distr.push_back(tuple);
//modify tuple's contents
distr.push_back(tuple);
In this case distr then has two copies of the modified tuple rather than the two distinct tuples I desired.
So I figured it had to do with memory so I tried this approach instead
distr.push_back(new int [num1, num2]);
But it doesn't save the tuples correctly as trying to access their values returns weird false values.
This is clearly due to a misunderstanding of how memory is allocated. I can understand why the first example fails in that fashion but I do not understand the issue with the second example.
When you use
distr.push_back(new int [num1, num2]);
You are not creating a two-element array filled with num1 and num2. That would be done like the following:
new int[2] {num1, num2}
I would advise against using this method though. If all of your tuples will be the same size, I would make a struct to represent that data type (in the special case of two, you can even use std::pair).
Use a pair instead of a pointer:
std::vector<std::pair<int, int> > distr;
// Do some code
distr.emplace_back(num1, num2);
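Since the question is about returning the vector, here is a small sketch of a complete function. The name make_distr and the choice of pairing each value with its square are made up for the example.

```cpp
#include <utility>
#include <vector>

// Hypothetical helper: pair each value in [0, n) with its square.
std::vector<std::pair<int, int>> make_distr(int n) {
    std::vector<std::pair<int, int>> distr;
    for (int i = 0; i < n; ++i)
        distr.emplace_back(i, i * i);  // constructs a distinct pair in place
    return distr;                      // returned by value; the vector owns its elements
}
```

Each element is its own pair stored inside the vector, so later changes to local variables cannot affect entries that were already added.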
First, you should understand that "classic" C and C++ arrays are just buffers of memory. In your sample, tuple is an array of 2 integers that decays to a pointer to its first element when passed to push_back. So each push_back of tuple just adds the same pointer again. The array itself is not copied into the std::vector, so you end up with a vector containing two pointers to the SAME area of memory. To achieve the desired behavior, you can use higher-level C++ data types, such as std::tuple or std::array.
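The difference can be seen in a short sketch. The function names are hypothetical; the first function reproduces the situation from the question, the second swaps in std::array.

```cpp
#include <array>
#include <vector>

// Pushing the same raw array twice stores the same address twice.
bool raw_pointers_alias() {
    std::vector<int*> distr;
    int tuple[2] = {1, 2};
    distr.push_back(tuple);   // array decays to &tuple[0]
    tuple[0] = 10;            // modifies the one and only buffer
    distr.push_back(tuple);   // same address again
    return distr[0] == distr[1] && distr[0][0] == 10;
}

// std::array is copied by value, so each element is independent.
std::vector<std::array<int, 2>> distinct_tuples() {
    std::vector<std::array<int, 2>> distr;
    std::array<int, 2> tuple = {1, 2};
    distr.push_back(tuple);   // copies {1, 2}
    tuple[0] = 10;
    distr.push_back(tuple);   // copies {10, 2}
    return distr;
}
```

raw_pointers_alias() returns true because both entries point at the same local array. Returning that vector from a function would also leave the pointers dangling, which is one more reason to prefer the by-value version.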
As for your second code snippet, it is just a syntax misunderstanding: the expression new <type>[<count>] creates a memory buffer (similar to your tuple, but on the HEAP) holding <count> values of type <type>. So, if you want a buffer of 2 ints, you should write new int[2]. When you write an a, b expression there, it is evaluated as the comma operator, so <count> will be num2 in your sample.
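A small sketch of both halves of that explanation. The function names are made up, and std::unique_ptr<int[]> is my own addition so the buffer is released with delete[] automatically.

```cpp
#include <memory>

// The comma operator evaluates its left operand, discards the result,
// and yields the right operand.
int comma_result(int num1, int num2) {
    // The cast to void only silences "unused value" warnings.
    return (static_cast<void>(num1), num2);
}

// What was probably intended: exactly two ints, initialised to num1 and num2.
std::unique_ptr<int[]> make_tuple2(int num1, int num2) {
    return std::unique_ptr<int[]>(new int[2]{num1, num2});
}
```

So new int[num1, num2] asks for num2 uninitialized ints, which explains the strange values read back from it.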
P.S. Be aware that to work correctly with heap memory you should study C++ memory management in much more depth.

Check array element for null

I have an array called quint8 block[16] which I:
initialize to zero with block[16] = { 0 }
fill it with some data
then pass it to a method/function as an argument like this bool fill(const quint8 *data)
In this fill function I want to see if the array is filled completely or if it contains null elements that were not filled.
Is it correct to perform a check this way? if(!data[i])? I have seen similar Q&A on this forum but all of them use \0 or NULL and I've heard this is not a good style for doing so and that has confused me.
Integer types do not have a "null" concept in C++.
Two possible ideas are:
1) Use some specific integer value to mean "not set". If 0 is valid for your data, then obviously it cannot be used for this. Maybe some other value would work, such as the maximum for the type (which looks like it would be 0xff in this case). But if all possible values are valid for your data, then this idea won't work.
2) Use some other piece of data to track which values have been set. One way would be an array of bools, each corresponding to a data element. Another way would be a simple count of how many elements have been set to a value (obviously this only works if your data is filled sequentially without any gaps).
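The second idea could be sketched roughly like this. The Block class is hypothetical, and std::uint8_t stands in for quint8 (which I am assuming is Qt's 8-bit unsigned typedef) so the example has no Qt dependency.

```cpp
#include <array>
#include <bitset>
#include <cstddef>
#include <cstdint>

// Assumption: quint8 is an 8-bit unsigned integer, as in Qt.
using quint8 = std::uint8_t;

// Hypothetical wrapper: 16 data bytes plus one "was set" flag per byte.
class Block {
public:
    void set(std::size_t i, quint8 value) {
        data_.at(i) = value;   // at() throws std::out_of_range on a bad index
        filled_.set(i);
    }
    bool isSet(std::size_t i) const { return filled_.test(i); }
    bool isFull() const { return filled_.all(); }
    const quint8* data() const { return data_.data(); }

private:
    std::array<quint8, 16> data_{};  // value-initialised: all zeros
    std::bitset<16> filled_;         // all flags start cleared
};
```

Here an element that was explicitly set to 0 still counts as filled, because the flags are kept separately from the data.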
If your question is whether you can distinguish between an element of an array that has been assigned the value zero, versus an element that has not been assigned to but was initialized with the value zero, then the answer is: you cannot.
You will have to find some other way to accomplish what you are trying to accomplish. It's hard to offer specific suggestions because I can't see the broader picture of what you're trying to do.
It depends on whether type quint8 has a bool conversion operator or some other conversion operator, for example to an integer type.
If quint8 is some integral type (some typedef for an integral type) then there is no problem. This definition
quint8 block[16] = { 0 };
initializes all elements of the array to zero.
Take into account that in the general case the task can be done with the algorithms std::find, std::find_if or std::any_of, declared in the header <algorithm>.
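For instance, a sketch with std::any_of might look like this. The function name containsZero and the extra size parameter are my own, and std::uint8_t again stands in for quint8.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>

// Assumption: quint8 is an 8-bit unsigned integer, as in Qt.
using quint8 = std::uint8_t;

// True if any of the first 'size' elements compares equal to zero.
// Note: this cannot tell "never filled" from "deliberately set to 0".
bool containsZero(const quint8* data, std::size_t size) {
    return std::any_of(data, data + size,
                       [](quint8 value) { return value == 0; });
}
```

The size has to be passed in separately because a const quint8 * parameter carries no length information.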