How to deal with different data types when parsing a general CSV read function? (without specifying them explicitly) - c++

How to deal with different data types when reading in data from an arbitrary CSV?
That is, how to deal with the different data types without specifying them explicitly?

One option is std::any (C++17); see http://en.cppreference.com/w/cpp/utility/any.
Store the values in a std::vector<std::vector<std::any>>.
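A minimal sketch of that idea, assuming C++17. The names parse_cell and parse_csv are hypothetical, the type guessing (long, then double, then std::string) is just one possible policy, and quoted fields or trailing empty cells are not handled:

```cpp
#include <any>
#include <cassert>
#include <cstddef>
#include <exception>
#include <istream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

using Row = std::vector<std::any>;
using Table = std::vector<Row>;

// Hypothetical helper: guess the type of one cell.
// Tries long, then double, and falls back to std::string.
std::any parse_cell(const std::string& text) {
    try {
        std::size_t pos = 0;
        long as_long = std::stol(text, &pos);
        if (pos == text.size()) return as_long;  // whole cell was an integer
    } catch (const std::exception&) {}
    try {
        std::size_t pos = 0;
        double as_double = std::stod(text, &pos);
        if (pos == text.size()) return as_double;  // whole cell was a number
    } catch (const std::exception&) {}
    return text;  // anything else stays a string
}

// Hypothetical helper: split each line on commas (no quoting support).
Table parse_csv(std::istream& in) {
    Table table;
    std::string line;
    while (std::getline(in, line)) {
        Row row;
        std::istringstream cells(line);
        std::string cell;
        while (std::getline(cells, cell, ',')) {
            row.push_back(parse_cell(cell));
        }
        assert(!row.empty() || line.empty());
        table.push_back(std::move(row));
    }
    return table;
}
```

To read a value back, the caller has to ask for the right type, for example std::any_cast<long>(table[0][0]); asking for the wrong type throws std::bad_any_cast.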

Related

C++ Array of different functions

It's easy to do something like that in Python, but implementing it in C++ seems to be more challenging.
I actually have some solution to this, but I'd like to see if you can see any better solution.
Here's what I want to do.
I have a list of values of different types (string, integer, can be also instance of some class etc.). Now here's the first problem - in C++ (unlike in Python) all values in vector/array have to be of the same type.
The solution I can see is that I can use std::any like this: vector<std::any> list.
I also have an array/vector of functions (or pointers to functions) with different parameter types and returned values - one function can accept string and integer and return a char and other can accept a char and return an int. Here's another problem: in C++ you can have an array/vector of functions only if they have the same parameters and returned values (as far as I know) because in your declaration of the vector you need to define the parameter types and the returned value.
The other problem is that I need to retrieve the information about the parameters and the returned value for each function. In other words, having those functions, I need to know that this function accepts 2 strings and 1 integer and returns a char for example. In Python I can use inspect.signature function to retrieve information about type annotations of a function. In C++, I don't know if there is a way to do this.
The solution I can see here is to use std::any again (although I will use another solution, I will explain why later).
The solution I can see to this problem is that I won't retrieve that information but instead the user of the class which accepts this vector of functions will simply have to specify what are the parameter types and returned value for each function. In other words, the solution I can see is that I won't be retrieving the information about parameter types programmatically.
The other problem I have is that later I need to call one of those functions with some parameters. In Python I do this like this:
arguments = [1, 'str', some_object]  # here I prepare a list of arguments (they are of different types)
func(*arguments)
In C++ I can do unpacking as well, but not if the parameters are of different types.
The solution I can see here is as follows. The functions in the vector will all accept a single argument, vector<std::any> args, which simply contains all of the arguments. Later, when I want to call a function, I construct a vector of std::any values and pass it as the argument. This would also solve the previous problem of not being able to store a vector of functions with different parameters.
Can you see better solutions?
You might wonder what I need all of this for. I do some program synthesis stuff and I need to programmatically construct programs from existing functions. I'm writing a library and I want the user of my library to be able to specify the base functions out of which I construct programs. In order to do that, I need to know the parameter types and return type of each function, and I need to call them later.
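A rough sketch of the calling convention proposed in the question, assuming C++17. Every function takes a vector of std::any and returns a std::any, and the signature is described by hand. The names RegisteredFn, make_registry, char_at and to_code are hypothetical:

```cpp
#include <any>
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

using Args = std::vector<std::any>;
using ErasedFn = std::function<std::any(const Args&)>;

// Hypothetical record: the user states the signature by hand,
// because the sketch does not inspect it programmatically.
struct RegisteredFn {
    std::string name;
    std::vector<std::string> param_types;
    std::string return_type;
    ErasedFn call;
};

std::vector<RegisteredFn> make_registry() {
    std::vector<RegisteredFn> fns;
    // (string, int) -> char
    fns.push_back({"char_at", {"string", "int"}, "char",
        [](const Args& a) -> std::any {
            assert(a.size() == 2);
            const auto& s = std::any_cast<const std::string&>(a[0]);
            int i = std::any_cast<int>(a[1]);
            return s.at(static_cast<std::size_t>(i));
        }});
    // (char) -> int
    fns.push_back({"to_code", {"char"}, "int",
        [](const Args& a) -> std::any {
            assert(a.size() == 1);
            return static_cast<int>(std::any_cast<char>(a[0]));
        }});
    return fns;
}
```

A call then looks like fns[0].call(Args{std::string("hello"), 1}). Note that a bare string literal would be stored as const char*, not std::string, so the any_cast inside char_at would throw std::bad_any_cast.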
I believe what you are looking for is std::apply. You can use std::tuple instead of std::vector to store a list of values of different types -- as long as the types are known at compile-time. Then std::apply(f, t) in C++ is basically the same as f(*t) in Python.
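For illustration, a small sketch of std::apply (C++17) unpacking a tuple into a call. The function pick and the wrapper call_pick_with_tuple are hypothetical names made up for this example:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <tuple>

// Hypothetical function with parameters of different types.
char pick(const std::string& s, int index) {
    assert(index >= 0);
    return s.at(static_cast<std::size_t>(index));
}

// The tuple plays the role of the Python argument list;
// its element types are fixed at compile time.
char call_pick_with_tuple() {
    std::tuple<std::string, int> args{"hello", 1};
    return std::apply(pick, args);  // same as pick("hello", 1)
}
```

This only works when the argument types are known at compile time, which is the restriction mentioned above.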
I have a list of values of different types (string, integer, can be also instance of some class etc.).
A type which is a union of subtypes is called a sum type or tagged union. C++ has the template std::variant for that.
Now here's the first problem - in C++ (unlike in Python) all values in vector/array have to be of the same type.
Of course, so make clever use of C++ containers. You might want a std::map or std::vector of your particular instance of std::variant.
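As a sketch, assuming C++17 and a value type limited to int, double and std::string (the alias Value and the helpers type_name and sample_values are hypothetical):

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <variant>
#include <vector>

// One closed set of alternatives, chosen only for this example.
using Value = std::variant<int, double, std::string>;

// Report which alternative a Value currently holds.
std::string type_name(const Value& v) {
    return std::visit([](const auto& x) -> std::string {
        using T = std::decay_t<decltype(x)>;
        if constexpr (std::is_same_v<T, int>) return "int";
        else if constexpr (std::is_same_v<T, double>) return "double";
        else return "string";
    }, v);
}

// A heterogeneous list stored in an ordinary vector.
std::vector<Value> sample_values() {
    std::vector<Value> values{1, 2.5, std::string("text")};
    assert(values.size() == 3);
    return values;
}
```

Unlike std::any, the set of possible types is part of the variant's type, so std::visit can check at compile time that every alternative is handled.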
I also have an array/vector of functions
You probably want a std::vector of std::function objects, and code written with C++ lambda expressions.
You should read a good C++ programming book.
I'm writing a library and I want the user of my library to be able to specify those base functions out of which I construct programs.
You could get inspiration from SWIG and consider generating some C++ code in your library. So write (in Python or C++) your C++ metaprogram (generating some C++ code, like ANTLR does) which generates the user code, and your user would adapt his build automation tool for such a need (like users of GNU bison do).
You might also consider embedding Guile (or Lua) in your application.
PS. You might be interested in other programming languages like OCaml, Go, Scheme (with Guile, and read SICP), Common Lisp (with SBCL), or Rust.

boost serialization data type compatibility

I need to serialize/deserialize certain data structures between two identical applications, but built using different compilers.
Consider primitive datatypes. Boost documentation mentions that in order to ensure compatibility you need to use the numeric types in <boost/cstdint.hpp>. So did I understand it right that I can't simply declare int number;, but rather I should do something like int16_t number;?
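For illustration only, this is what fixed-width declarations could look like. The sketch uses the standard <cstdint> header (C++11) as an assumption; <boost/cstdint.hpp> provides the same type names in namespace boost. The struct Sample and the helper to_int16 are hypothetical. Since int is 32 bits on most current desktop compilers, std::int32_t is usually the closer replacement for int than int16_t:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical record; the field names are illustrative only.
struct Sample {
    std::int32_t count;   // exactly 32 bits wherever the type exists
    std::int16_t offset;  // exactly 16 bits
    std::uint8_t flags;   // exactly 8 bits
};

// Narrow a wider value to 16 bits; the range check is active in debug builds.
std::int16_t to_int16(long value) {
    assert(value >= INT16_MIN && value <= INT16_MAX);
    return static_cast<std::int16_t>(value);
}
```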

Creating interface for c++ dll

I have a class that is constructed with a path to a text file. It parses the text file and stores a lot of data in various vectors and maps as its members. I'd like to share the class as a dll with users of different versions of MSVS (something that's new to me).
My original implementation when it was just for me returned the STL containers directly. After reading, my understanding is that this is dangerous because different compilers or different versions of the same compiler can easily implement the containers differently. One solution I saw was to explicitly instantiate any templates you were using and export them as well. I also had strings so I'd need to instantiate and export that since a std::string is actually an alias for a more complex template. However, even if I went that route it appears there's nothing I can do about exporting maps.
What I've done now is that instead of giving the user access to the containers, I have accessor functions that take an index (or a key for the maps, or a key and index for a vector of maps I've got) and fetch the value. All my parameters and return values are primitive types, including const char* for the strings.
Am I understanding the problem correctly, and is this a reasonable approach to it? Do I need to worry about the integral primitives in C++ not being strictly defined in the standard? I suppose I could use the std-defined integral types as well. One issue is that the user won't be able to iterate over the containers or check their size. I could provide the size as a member (all the vectors are the same size), and then I guess it'd just be up to the user to provide their own vector and fill it if they want the other vector functionality.
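To make the accessor approach described above concrete, here is one possible shape for it: an opaque handle plus C-style functions that use only primitive types. Everything in it is hypothetical (the PARSER_API macro, the ParsedFile struct, all function names), and the demo constructor fills in-memory data instead of parsing a file so the sketch stays self-contained:

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical export macro; a real DLL build on MSVC would define it
// as __declspec(dllexport) or __declspec(dllimport).
#ifndef PARSER_API
#define PARSER_API
#endif

// Implementation detail; users of the DLL would only see a forward declaration.
struct ParsedFile {
    std::vector<std::string> names;
    std::map<std::string, double> settings;
};

extern "C" {

// Stand-in for the real constructor, which would parse the text file.
PARSER_API ParsedFile* parsed_file_create_demo() {
    ParsedFile* f = new ParsedFile;
    f->names.push_back("alpha");
    f->names.push_back("beta");
    f->settings["scale"] = 1.5;
    return f;
}

// Destroyed inside the DLL, so allocation and deallocation use the same runtime.
PARSER_API void parsed_file_destroy(ParsedFile* f) { delete f; }

PARSER_API std::size_t parsed_file_count(const ParsedFile* f) {
    assert(f != nullptr);
    return f->names.size();
}

// Returns a null pointer when the index is out of range.
PARSER_API const char* parsed_file_name_at(const ParsedFile* f, std::size_t i) {
    assert(f != nullptr);
    return i < f->names.size() ? f->names[i].c_str() : nullptr;
}

// Writes the value to *out and returns 1 if the key exists, otherwise returns 0.
PARSER_API int parsed_file_get_setting(const ParsedFile* f, const char* key, double* out) {
    assert(f != nullptr && key != nullptr && out != nullptr);
    std::map<std::string, double>::const_iterator it = f->settings.find(key);
    if (it == f->settings.end()) return 0;
    *out = it->second;
    return 1;
}

}  // extern "C"
```

With a count function exported, a user can loop from 0 to parsed_file_count(f) and copy the values into a container built with their own compiler.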

Conversion of multiple internal types to system level types

At my workplace, I am working on a use case where I have to convert multiple internal/product-level data types to C++-compatible data types. Earlier we used something called a switch fence, where the code would look like
switch(InternalTypeCategory)
{
    case InternalTypeA:
        convert_to_int8_t;
        break;
    case InternalTypeB:
        convert_to_int16_t;
        break;
    .....
}
But for the sake of performance and other related issues, we are going to convert this switch fence block to C++ template-based code, so that we don't have to write a switch statement every time.
What have I tried so far?
I have been playing around with boost::any, boost::variant and boost::any_cast, boost::numeric_cast but nothing concrete has come up so far. I always end up with repetition of code or use some sort of mechanism (control structures or hash table) to select the particular value to have enough information for type conversion.
The internal/product level data types are variants (in terms of size, signed/unsigned) of integer, floating point, double and character.
Kindly help. Thanks in advance.
Templates only help you if the type is defined at compile-time. Since you're using a variant, I assume that this is not possible.
However a simple approach is to use a table of conversion methods and then use the internalType as index for these methods.
Create a generic interface for all conversion functions:
typedef InternalVariant (*conversionFunc_ptr)(const InternalVariant& data);
Define an array of all available functions and assign entries as necessary:
static conversionFunc_ptr conversionFunc[MaxInternalTypeCategory];
conversionFunc[InternalTypeA] = convert_to_int8;
conversionFunc[InternalTypeB] = convert_to_int16;
[...]
And in the actual function use it like this:
return conversionFunc[InternalTypeCategory](*this);
Note: The conversion functions should be static members (or free functions), not instance members, so the value has to be passed to the function as an argument. Otherwise every value would need its own conversion array instead of sharing a single one.
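A self-contained sketch of the table approach, assuming C++17. The real InternalVariant type is not shown in the question, so the sketch substitutes a hypothetical SystemValue variant and a raw long input; only the enum names and the convert_to_int8 / convert_to_int16 names come from the text above:

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <variant>

enum InternalTypeCategory { InternalTypeA, InternalTypeB, MaxInternalTypeCategory };

// Hypothetical stand-in for the converted result.
using SystemValue = std::variant<std::int8_t, std::int16_t>;

// Common signature shared by all conversion functions.
using ConversionFunc = SystemValue (*)(long raw);

SystemValue convert_to_int8(long raw) { return static_cast<std::int8_t>(raw); }
SystemValue convert_to_int16(long raw) { return static_cast<std::int16_t>(raw); }

// One shared table; the order of entries must match the enum.
const std::array<ConversionFunc, MaxInternalTypeCategory> conversion_table = {
    convert_to_int8,   // InternalTypeA
    convert_to_int16,  // InternalTypeB
};

// The category is used as an index instead of going through a switch.
SystemValue convert(InternalTypeCategory category, long raw) {
    assert(category < MaxInternalTypeCategory);
    return conversion_table[category](raw);
}
```

The lookup is a single indexed call, and adding a category means adding one enum value and one table entry.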

unique type identifiers across different C++ programs

Is there a way to automatically (i.e. not by hand) assign unique identifiers to types in different programs that share common source code? I'd need one program to tell another "use type X" and the other would know what that "X" meant. Of course, they would (partially) share the source code, as you cannot construct types in runtime, I just want an automatic way of constructing a map from some sort of identifiers (integers or strings) to e.g. factory functions returning objects of given type.
An obvious choice I'd go for is result of name() in std::type_info, but as I understand, that is not even guaranteed to be different across types, and using address of std::type_info instances is certainly not going to work across programs.
I cannot use C++11, but can use Boost for this.
I just want an automatic way of constructing a map from some sort of identifiers (integers or strings) to e.g. factory functions returning objects of given type.
Not going to happen, not within Standard C++, anyway.
You could take a look at Boost serialisation. It automatically generates unique ids for polymorphic classes and allows the explicit registration of non-polymorphic ones.
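For comparison, a hand-written version of such a map is short, although it is not automatic: each type is registered once under a string that both programs agree on. The sketch below avoids C++11 features on purpose; the Shape hierarchy and the names make, registry, register_type, register_all and create are all hypothetical:

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical class hierarchy shared by both programs.
struct Shape {
    virtual ~Shape() {}
    virtual int sides() const = 0;
};
struct Triangle : Shape { int sides() const { return 3; } };
struct Square : Shape { int sides() const { return 4; } };

typedef Shape* (*Factory)();

// One factory function per type, generated by the template.
template <typename T>
Shape* make() { return new T(); }

std::map<std::string, Factory>& registry() {
    static std::map<std::string, Factory> instance;
    return instance;
}

// Each identifier may be registered only once.
void register_type(const std::string& id, Factory factory) {
    assert(registry().count(id) == 0);
    registry()[id] = factory;
}

// The identifiers are chosen by hand; both programs must use the same strings.
void register_all() {
    register_type("Triangle", &make<Triangle>);
    register_type("Square", &make<Square>);
}

// Returns a new object owned by the caller, or a null pointer for an unknown id.
Shape* create(const std::string& id) {
    const std::map<std::string, Factory>& r = registry();
    std::map<std::string, Factory>::const_iterator it = r.find(id);
    return it == r.end() ? 0 : it->second();
}
```

One program can then send the string "Square" and the other can call create("Square") to build the matching object.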