Compare pointers by type? - c++

How could I compare two pointers to see if they are of the same type?
Say I have...
int * a;
char * b;
I want to know whether or not these two pointers differ in type.
Details:
I'm working on a lookup table (implemented as a 2D void pointer array) to store pointers to structs of various types. I want to use one insert function that compares the pointer type given to the types stored in the first row of the table. If they match I want to add the pointer to the table in that column.
Basically I want to be able to store each incoming type into its own column.
Alternative methods of accomplishing this are welcomed.

In this case, since you know the types beforehand, there is little point in checking. You can simply proceed knowing that they are different types.
However, if the types depend on compile-time properties such as template arguments, you can use std::is_same:
std::is_same<decltype(a), decltype(b)>::value
This will be true if they are the same type and false otherwise.
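As a minimal sketch of the check above: the comparison can be written directly with decltype, or wrapped in a small constexpr function. The helper name same_pointer_type is made up for illustration and is not a standard function. Both forms are resolved entirely at compile time, so they only help while the static types are still visible. Once a pointer has been stored as void*, that information is gone.

```cpp
#include <type_traits>

int*  a = nullptr;
char* b = nullptr;

// decltype yields each variable's declared type; is_same compares the two at compile time.
static_assert(!std::is_same<decltype(a), decltype(b)>::value,
              "int* and char* are different types");

// Hypothetical helper: true when both pointer arguments have the same static type.
template <typename T, typename U>
constexpr bool same_pointer_type(T*, U*) {
    return std::is_same<T, U>::value;
}
```

Calling same_pointer_type(a, b) with the variables above yields false, because T deduces to int and U to char.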

Related

Building a dataframe in C++

I am trying to build a DataFrame in C++. I'm facing some problems, such as dealing with variable data types.
I am thinking of a DataFrame inspired by the Pandas DataFrame (from Python). So my design idea is:
1. Build an object 'Series' which is a vector of a fixed data type.
2. Build an object 'DataFrame' which will store a list of Series (this list can be variable).
Item 1 is just a regular vector. So, for instance, the user would call
Series.fill({1,2,3,4}) and it would store the vector {1,2,3,4} in some attribute of Series, say Series.data.
Problem 1. How would I make a class that understands {1,2,3,4} as a vector of 4 integers? Is it possible?
The next problem is:
Regarding item 2, I can see the DataFrame as a matrix of n columns and m rows, but the columns can have different data types.
I tried to design this as a vector of n pointers, where each pointer would point to a vector of dimension m with different data types.
I tried to do something like
vector<void*> columns(10)
and fill it with something like
columns[0] = (int*) malloc(8*sizeof(int))
But this does not work. If I try to fill the vector like this:
(*columns[0])[0] = 5;
I get an error
::value_type {aka void*}’ is not a pointer-to-object type
(int *) (*a[0])[0] = 5;
How can I do it properly? I still have other questions like, how would I append an undetermined number of Series into a DataFrame, but for now, just building a matrix with columns with different data types is a great start.
I know that I must keep track of the types of pointers inside my void vector but I can create a parallel list with all data types and make this an attribute of my class DataFrame.
Building a heterogeneous container (which a dataframe is supposed to be) in C++ is more complex than you think, because C++ is statically typed. It means you have to know all the types at compile time.
Your approach uses a vector of pointers (there are a few variations of this approach, which I am not going into). This approach is very inefficient, because the pointers point all over memory and destroy cache locality. I do not recommend even attempting to implement a dataframe this way, because there is really no point to it.
Look at this implementation of DataFrame in C++: https://github.com/hosseinmoein/DataFrame. You might be able to use it as is, or draw insight from it on how to implement a true heterogeneous DataFrame. It uses a collection of static vectors in a hash table to implement a true heterogeneous container. It also uses contiguous memory, so it avoids the pointer-chasing problem.
TL;DR Version
Discard what you are doing.
Use vector<vector<int>> columns;. When you need a column, use columns[index].data() to get a pointer to the backing array from the indexed inner vector and pass that int * to whatever required the void *. The int * will be implicitly converted.
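A rough sketch of that suggestion, assuming every column holds ints. The function sum_ints is a hypothetical stand-in for "whatever required the void *" and is not part of any library; make_columns is likewise just an illustrative name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical C-style API that only accepts an untyped buffer.
// It assumes the buffer really holds `count` ints.
int sum_ints(const void* buffer, std::size_t count) {
    assert(buffer != nullptr || count == 0);
    const int* values = static_cast<const int*>(buffer);
    int total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        total += values[i];
    }
    return total;
}

// Builds 10 columns of 8 zero-initialised ints each, then writes one cell.
std::vector<std::vector<int>> make_columns() {
    std::vector<std::vector<int>> columns(10, std::vector<int>(8));
    columns[0][0] = 5;  // typed access, no cast and no malloc
    return columns;
}
```

A call such as sum_ints(columns[0].data(), columns[0].size()) compiles without a cast, because int* converts to const void* implicitly.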
Explanation
Quoting cppreference
void - type with an empty set of values. It is an incomplete type that cannot be completed (consequently, objects of type void are disallowed). There are no arrays of void, nor references to void. However, pointers to void and functions returning type void (procedures in other languages) are permitted.
Since void is incomplete, you can't have a void. void* needs to be cast back to the actual data type, int*, before it can be used for anything other than passing the anonymously typed pointer around. All receivers of the void * have to know what it really is to do anything with it other than pass it on.
Functions that require void * parameters will take any pointer you give them without any further effort on your part, so there is almost no need to make void * variables in C++. Almost all cases where you would need a void * are filled in with polymorphism or templates. The last time I used a void * in C++ was back when I wrote C++ as C with classes bolted on.
The Error
Given
vector<void*> columns(10);
where each element will contain an array of ints, let's work through
(*columns[0])[0] = 5;
step by step to see what types we have and make sure the types at each step are consistent.
columns[0]
Gets the first element in the vector, a void*. So far so good.
*columns[0]
dereferences the void* at columns[0]. As covered in the preamble, this cannot be done. You cannot dereference a void * because that would give you a value of type void. This produces the reported "::value_type {aka void*}’ is not a pointer-to-object type" error message.
We could
*reinterpret_cast<int*>(columns[0])
to turn it into a pointer to int, something we can dereference and matches the initial type, and receive an int, specifically the first int in the array.
(*reinterpret_cast<int*>(columns[0]))[0]
will fail because you can't index an int. That would be like writing 42[0]. This means the dereference is unnecessary.
The end result needs to look like
reinterpret_cast<int*>(columns[0])[0]
But don't do this. It is unnecessary and grossly over-complicated.
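For completeness, here is a small sketch of the cast-then-index form from the walkthrough. Two things are my own substitutions rather than part of the answer: static_cast is used instead of reinterpret_cast (either works for converting void* back to int*), and a std::vector<int> owns the memory instead of malloc. The function name is hypothetical.

```cpp
#include <cassert>
#include <vector>

// Stores 5 in the first int of a column held as void*, then returns that int.
int write_through_void_pointer() {
    std::vector<int> storage(8);        // owns the ints for column 0
    std::vector<void*> columns(10);     // ten null pointers to start with
    columns[0] = storage.data();        // int* converts to void* implicitly
    assert(columns[0] != nullptr);

    // Cast first, then index: [0] is applied to the int* produced by the cast.
    static_cast<int*>(columns[0])[0] = 5;
    return storage[0];
}
```

Every reader of columns[0] has to repeat that cast and know the real element type, which is the bookkeeping the typed vector avoids.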

C++ Table of Vectors of Different Types

I have a collection of vectors of different types like this:
std::vector<int> a_values;
std::vector<float> b_values;
std::vector<std::string> c_values;
Every time I get a new value for a, b and c, I want to push those to their respective vectors a_values, b_values and c_values.
I want to do this in the most generic way possible, ideally in a way that lets me iterate over the vectors. So I want a function addValue(...) which automatically calls the respective push_back() on each vector. If I add a new vector d_values, I only want to have to specify it in one place.
The first answer to this post https://softwareengineering.stackexchange.com/questions/311415/designing-an-in-memory-table-in-c seems relevant, but I want to easily get the vector out for a given name, without having to manually cast to a particular type, i.e. I want to call getValues("d") which will give me the underlying std::vector.
Does anyone have a basic example of a collection class that does this?
This idea can be achieved with the heterogeneous container std::tuple, which allows for storage of vectors containing elements of different types.
In particular, we can define a simple data structure as follows
template <typename ...Ts>
using vector_tuple = std::tuple<std::vector<Ts>...>;
In the case of the provided example, the three vectors a_values, b_values and c_values simply correspond to the type vector_tuple<int, float, std::string>. Adding another vector only requires adding another type to the collection.
Indexing into our new collection is simple too, given the collection
vector_tuple<int, float, std::string> my_vec_tup;
we have the following methods for extracting a_values, b_values and c_values
auto const &a_values = std::get<0>(my_vec_tup);
auto const &b_values = std::get<1>(my_vec_tup);
auto const &c_values = std::get<2>(my_vec_tup);
Note that the tuple container is indexed at compile-time, which is excellent if you know the intended size at compile-time, but will be unsuitable otherwise.
In the description of the problem that has been provided, the number of vectors does appear to be decided at compile-time and the naming convention appears to be arbitrary and constant (i.e., the index of the vectors won't need to be changed at runtime). Hence, a tuple of vectors seems to be a suitable solution, if you associate each name with an integer index.
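One way to "associate each name with an integer index" is an unscoped enum, sketched below. The names ColumnIndex, kA, kB, kC, Table, add_row and row_count are all invented for this example. Adding a d_values column would mean adding one type to Table and one enumerator, plus a line in add_row.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <tuple>
#include <vector>

template <typename... Ts>
using vector_tuple = std::tuple<std::vector<Ts>...>;

// Hypothetical column names; each enumerator is a compile-time index into the tuple.
enum ColumnIndex : std::size_t { kA, kB, kC };

using Table = vector_tuple<int, float, std::string>;

// Appends one value to each column, in declaration order.
void add_row(Table& table, int a, float b, const std::string& c) {
    std::get<kA>(table).push_back(a);
    std::get<kB>(table).push_back(b);
    std::get<kC>(table).push_back(c);
}

// Number of rows, checking that the columns have stayed the same length.
std::size_t row_count(const Table& table) {
    assert(std::get<kA>(table).size() == std::get<kB>(table).size());
    assert(std::get<kA>(table).size() == std::get<kC>(table).size());
    return std::get<kA>(table).size();
}
```

With this, std::get<kC>(table) returns the std::vector<std::string> directly, with no cast at the call site.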
As for the second part of the question (i.e., iterating over each vector), if you're using C++17, the great news is that you can simply use the std::apply function to do so: https://en.cppreference.com/w/cpp/utility/apply
All that is required is to pass a function which takes a vector (you may wish to define appropriate overloading to handle each container separately), and your tuple to std::apply.
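A sketch of what that might look like, assuming a C++17 compiler. The function names add_value and total_size are made up here. A generic lambda receives all the vectors as a parameter pack, and a fold expression applies the operation to each one.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <tuple>
#include <utility>
#include <vector>

template <typename... Ts>
using vector_tuple = std::tuple<std::vector<Ts>...>;

// Pushes one value onto each vector, pairing values with columns by position.
template <typename... Ts, typename... Vs>
void add_value(vector_tuple<Ts...>& table, Vs&&... values) {
    static_assert(sizeof...(Ts) == sizeof...(Vs), "pass exactly one value per column");
    std::apply(
        [&](auto&... columns) {
            // Fold over the comma operator: one push_back per column.
            (columns.push_back(std::forward<Vs>(values)), ...);
        },
        table);
}

// Total number of elements stored across all columns.
template <typename... Ts>
std::size_t total_size(const vector_tuple<Ts...>& table) {
    return std::apply(
        [](const auto&... columns) { return (std::size_t{0} + ... + columns.size()); },
        table);
}
```

Because the column list lives only in the vector_tuple type, adding a fourth column means changing that one type and passing a fourth argument to add_value.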
However, for earlier versions of C++, you'll need to implement your own for_each function. Fortunately, this problem has already been solved; see the question "How can you iterate over the elements of an std::tuple?"

2-D vector of boost::variant in C++

I'm looking to store information from a data table with multiple rows and columns. Each column houses a differing type (int, double, std::string, etc), which would only be known at runtime.
Is a 2-d vector of boost::variant the best way, or are there better storing mechanisms to accomplish this?
From your question it's not clear what you're actually looking for. The answer depends on various factors:
Assuming you have different types per column, are the types the same for all rows?
Are the types known at compile time or only at run-time?
In the simplest case of the types being known at compile time and being the same for all rows, why not simply use a custom class to represent a column or a std::tuple?
If the types differ between columns, you must use a type that can hold any value, such as boost::any.
This may also be the easiest solution should the types only be known at run-time.
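As one possible middle ground, here is a sketch for the case where each column's type is chosen at run time but is the same for every row of that column. It swaps in std::variant from C++17 rather than boost::variant or boost::any, and it stores a variant of vectors (one per column) instead of a 2-D vector of variants. The names RuntimeColumn, RuntimeTable, make_column, column_rows and the tag characters are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <variant>
#include <vector>

// One column is one vector whose element type is picked at run time.
using RuntimeColumn = std::variant<std::vector<int>,
                                   std::vector<double>,
                                   std::vector<std::string>>;
using RuntimeTable = std::vector<RuntimeColumn>;

// Hypothetical factory: chooses the column type from a run-time tag.
RuntimeColumn make_column(char tag) {
    switch (tag) {
        case 'i': return std::vector<int>{};
        case 'd': return std::vector<double>{};
        default:  return std::vector<std::string>{};
    }
}

// Number of rows in a column, whatever its element type.
std::size_t column_rows(const RuntimeColumn& column) {
    return std::visit([](const auto& values) { return values.size(); }, column);
}
```

Each column keeps its values contiguous, and std::visit dispatches once per column rather than once per cell.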

c++ last element of a structure field

I get a structure, and I don't know the size of it (every time it's different). I would like to set the last place in one of the fields of this structure to a certain value. In pseudocode, I mean something like this:
structureA.fieldB[end] = cert_value;
I'd do it in matlab however I cannot somehow find the proper syntax in c++, can you help me?
In Matlab, a structure data type holds key-value pairs where the "value" may be of different types. In C++, there are some key-value containers available (associative containers like set, map, multimap), but they usually store elements of a single type. What you need if I understood it right is something like
"one" : 1
"two" : [1,2,5]
"three" : "name"
Which means that your structure resembles a Python dictionary.
In C++, the only way I have heard of using containers with truly different types is by using boost::any, which is accepted as the answer to this question.
If you pack a container with elements of different types, you can then use the container's back() member function to get the last element (end() returns an iterator one past the last element, so it must not be dereferenced).
You need sizeof, which gives you the size of the array in bytes. Dividing that by the number of bytes in one element gives the element count, and since indices start at 0, the last element sits at count minus one. Note that this only works when fieldB is a true array, not a pointer. You end up with:
int index_end = sizeof(structureA.fieldB) / sizeof(structureA.fieldB[0]) - 1;
structureA.fieldB[index_end] = cert_value;
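A short sketch of both situations, with made-up struct and function names. For a true array the element count is known at compile time and the last valid index is count minus one. If the size really differs every time, as the question says, a std::vector field with back() is probably the closer match to Matlab's end.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct FixedStruct {
    int fieldB[4];            // size fixed at compile time
};

struct DynamicStruct {
    std::vector<int> fieldB;  // size chosen at run time
};

// Index of the last element of a true array (will not compile for a pointer).
template <typename T, std::size_t N>
constexpr std::size_t last_index(const T (&)[N]) {
    return N - 1;
}

void set_last(FixedStruct& s, int value) {
    s.fieldB[last_index(s.fieldB)] = value;
}

void set_last(DynamicStruct& s, int value) {
    assert(!s.fieldB.empty());  // back() on an empty vector is undefined
    s.fieldB.back() = value;
}
```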

How can I get two different binary trees of two different types?

For an array if I want an array of integers it's:
int anArray[];
For an array of strings it is:
string anArray[];
I have a binary search tree template that allows the type to be chosen using a typedef:
typedef desiredType TreeItemType; // desired type of tree items i.e. string, int, etc.
How can I get two different trees of two different types? Right now the only way I can see is to write all the supporting code twice, with different file names and different typedefs. There has to be a way to set the typedef desiredType in a method or something. Any ideas?
Why not turn it into a templated class, seeing as you're using C++? This allows any number of item types to coexist, and it removes any problems that might occur with a typedef'd (aliased) type.
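A minimal sketch of what that could look like. The class and its interface are invented here and are not the asker's actual tree; the point is only that TreeItemType becomes a template parameter instead of a file-wide typedef.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Minimal binary search tree; the item type is a template parameter.
template <typename TreeItemType>
class BinarySearchTree {
public:
    // Inserts item; duplicates are ignored.
    void insert(const TreeItemType& item) { insert(root_, item); }

    bool contains(const TreeItemType& item) const {
        const Node* node = root_.get();
        while (node) {
            if (item < node->item)      node = node->left.get();
            else if (node->item < item) node = node->right.get();
            else                        return true;
        }
        return false;
    }

    std::size_t size() const { return size_; }

private:
    struct Node {
        explicit Node(const TreeItemType& value) : item(value) {}
        TreeItemType item;
        std::unique_ptr<Node> left, right;
    };

    // Recursive helper: walks down to the empty slot where item belongs.
    void insert(std::unique_ptr<Node>& node, const TreeItemType& item) {
        if (!node) { node.reset(new Node(item)); ++size_; }
        else if (item < node->item) insert(node->left, item);
        else if (node->item < item) insert(node->right, item);
    }

    std::unique_ptr<Node> root_;
    std::size_t size_ = 0;
};
```

BinarySearchTree<int> and BinarySearchTree<std::string> can then live side by side in the same file, with the supporting code written once.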