I would like to perform a range check for a std::array at compile time. Here is an example:
#include <iostream>
#include <array>
void rarelyUsedFunction(const std::array<double, 2>& input)
{
std::cout << input[5] << std::endl;
}
int main()
{
std::array<double, 2> testArray;
rarelyUsedFunction(testArray);
}
If I compile this with g++ there is no warning or error, despite the undefined access to an element which is not in the array. The compiled program just prints some random value.
Is there a g++ option that performs a suitable range/bounds check at compile time? I know that I can add "-D_GLIBCXX_DEBUG", but this only performs the check at runtime, so a function that is rarely called may never trigger it.
I am aware that such a range check cannot be performed in all circumstances, but in the case above the compiler should be able to spot the problem, shouldn't it?
As mentioned in the comments, std::get (the std::array overload) will do that nicely for you, since it is required to perform exactly this kind of bounds check. The documentation for its index template parameter I says:
I must be an integer value in range [0, N). This is enforced at compile time as opposed to at() or operator[].
In your example, it would look like this:
void rarelyUsedFunction(const std::array<double, 2>& input)
{
std::cout << std::get<5>(input) << std::endl; // <---- Compilation error!
}
If the index is not just a literal, you can still calculate it with "complex" code as long as you manage to stuff it in a constexpr variable:
void rarelyUsedFunction(const std::array<double, 2>& input)
{
constexpr std::size_t index = /* Whatever, as long as it compiles... */;
std::cout << std::get<index>(input) << std::endl;
}
Obviously, in either case, this involves providing the compiler with a hard guarantee that the index is known at compile time.
I am a mathematician by training and need to simulate a continuous-time Markov chain. I need to use a variant of the Gillespie algorithm which relies on fast reading from and writing to a 13-dimensional array. At the same time, I need to set the size of each dimension based on user input (each will be roughly of order 10). Once these sizes are set by the user, they will not change during the run. The only thing that changes is the data they contain. What is the most efficient way of doing this?
My first try was to use standard arrays, but their sizes must be known at compile time, which is not the case here. Is std::vector a good structure for this? If so, how should I go about initializing a creature such as:
vector<vector<vector<vector<vector<vector<vector<vector<vector<vector<vector<vector<vector<int>>>>>>>>>>>>> Array;
Will the initialization take more time than dealing with an array? Or, is there a better data container to use, please?
Thank you for any help!
I would start by using a std::unordered_map to hold key-value pairs, with each key being a std::array of 13 indices (one per dimension), and each value being an int (or whatever datatype is appropriate), like this:
#include <iostream>
#include <unordered_map>
#include <array>
typedef std::array<int, 13> MarkovAddress;
// Define a hasher that std::unordered_map can use
// to compute a hash value for a MarkovAddress.
// It is a separate functor rather than a specialization of std::hash,
// because specializing std::hash for a type that involves no
// user-defined type is not permitted.
// adapted from: https://codereview.stackexchange.com/a/172095/126857
struct MarkovAddressHash {
std::size_t operator() (const MarkovAddress& key) const {
std::hash<int> hasher;
std::size_t result = 0;
for(std::size_t i = 0; i < key.size(); ++i) {
result = result * 31 + hasher(key[i]); // simple polynomial combination of the element hashes
}
return result;
}
};
int main(int, char **)
{
std::unordered_map<MarkovAddress, int, MarkovAddressHash> map;
// Just for testing
const MarkovAddress a{{1,2,3,4,5,6,7,8,9,10,11,12,13}};
// Place a value into the map at the specified address
map[a] = 12345;
// Now let's see if the value is present in the map,
// and retrieve it if so
if (map.count(a) > 0)
{
std::cout << "Value in map is " << map[a] << std::endl;
}
else std::cout << "Value not found!?" << std::endl;
return 0;
}
That will give you fast (O(1) on average) lookup and insert, which is likely your first priority. If you later run into trouble with that (e.g. too much RAM used, or you need a well-defined iteration order), you could replace it with something more elaborate.
I've implemented a constexpr map array lookup based on this SO answer, but it leaves me wondering now what the memory overhead might be like if the map array is very large, and what other gotchas might exist with this technique, particularly if the constexpr function cannot be resolved at compile time.
Here is a contrived code example that hopefully makes my question more clear:
example.h:
#include <exception> // std::exception
enum class MyEnum
{
X0,
X1,
X2,
X3,
X4,
X5
};
struct MyStruct
{
const MyEnum type;
const char* id;
const char* name;
const int size;
};
namespace
{
constexpr MyStruct myMap[] = {
{MyEnum::X0,"X0","Test 0", 0},
{MyEnum::X1,"X1","Test 1", 1},
{MyEnum::X2,"X2","Test 2", 2},
{MyEnum::X3,"X3","Test 3", 3},
{MyEnum::X4,"X4","Test 4", 4},
{MyEnum::X5,"X5","Test 5", 5},
};
constexpr auto mapSize = sizeof myMap/sizeof myMap[0];
}
class invalid_map_exception : public std::exception {};
// Retrieves a struct based on the associated enum
inline constexpr MyStruct getStruct(MyEnum key, int range = mapSize) {
return (range == 0) ? (throw invalid_map_exception()):
(myMap[range - 1].type == key) ? myMap[range - 1]:
getStruct(key, range - 1);
};
example.cpp:
#include <iostream>
#include <vector>
#include "example.h"
int main()
{
std::vector<MyEnum> enumList = {MyEnum::X0, MyEnum::X1, MyEnum::X2, MyEnum::X3, MyEnum::X4, MyEnum::X5};
int idx;
std::cout << "Enter a number between 0 and 5:" << std::endl;
std::cin >> idx;
MyStruct test = getStruct(enumList[idx]);
std::cout << "choice name: " << test.name << std::endl;
return 0;
}
Output:
Enter a number between 0 and 5:
1
choice name: Test 1
Compiled with g++ with -std=c++14.
In the above example, although getStruct is a constexpr function, it cannot be fully resolved until runtime, since the value of idx is not known until then. Might that change the memory overhead when compiled with optimization flags, or would the full contents of myMap be included in the binary regardless? Does it depend on the compiler and optimization settings used?
Also, what if the header file is included in multiple translation units? Would myMap be duplicated in each one?
I imagine this could be important if the map array becomes enormous and/or the code is going to be used in more resource constrained environments such as embedded devices.
Are there any other potential gotchas with this approach?
If you call a constexpr function with a non-constant expression, it will call the function at run time.
If you call getStruct with a constant expression, the compiler can just call the function at compile time. Then, the getStruct function will be "unused" at runtime, and the compiler will probably optimise it out. At this point, myMap will also be unused, and be optimised out.
In terms of runtime size, it would actually probably be smaller than a std::unordered_map or std::map; it literally stores the minimum information necessary. But its lookup time would be a lot slower, as it has to compare the elements one by one in O(N) time, so it doesn't actually do what a map does (reduce lookup time).
If you want to make it more likely that it is optimised out, I would ensure that it is only used in constant-expression situations:
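// getStructImpl is the constexpr lookup function from the question
// (there called getStruct), renamed so the template can take its name.
template<MyEnum key>
struct getStruct
{
static constexpr const MyStruct value = getStructImpl(key);
};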
Here's some compiler output that shows that the map is optimised out entirely
As for including it in multiple translation units: yes, it would be duplicated in each one, because the anonymous namespace gives myMap internal linkage (a namespace-scope constexpr variable has internal linkage anyway). If every use is optimised out there is no overhead, but a copy will remain in each translation unit that performs a runtime lookup.
Is this undefined behavior?
std::array<int, 5> x = {3, 5, 1, 2, 3};
std::array<int, 3>& y = *reinterpret_cast<std::array<int, 3>*>(&x[1]);
for(int i = 0; i != 3; i++) {
std::cout << y[i] << "\n";
}
Maybe yes, but I really feel like there should be a safe way to slice std::arrays.
EDIT: Following Radek's suggestion:
template<unsigned N, unsigned start, unsigned end, typename T>
std::array<T, end - start>& array_slice(std::array<T, N>& x)
{
static_assert(start <= end, "start <= end");
static_assert(end <= N, "end <= N");
return *reinterpret_cast<std::array<T, end - start>*>(&x[start]);
}
EDIT: Ok, I decided that I'm unhappy with std::arrays and will move to something else, any ideas?
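Yes, that is undefined behavior. There is no std::array<int, 3> object at that address: you are taking a pointer to an int inside a std::array<int, 5>, reinterpret_casting it to a pointer to an unrelated type, and then accessing memory through that type. Indeed, any use of reinterpret_cast should be a big red flag for "here there be dragons!"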
As for slicing arrays, that's not going to happen. A std::array contains values; a slice of this would contain references to part of that array. And therefore, it would not be a std::array. You can copy slices of arrays, but not using std::array. You would need to use std::vector, since it allows the calling of constructors, as well as construction from a range of values. Remember: std::array is just a nicer wrapper around a C-style array.
The committee is looking into a template array_ref<T> class, which is exactly what it says: a reference to some segment of an array of type T. This could be a regular C-style array, a std::vector, a std::array, or just some memory allocated with new T[]. There are some library implementations of the class already, but nothing is standardized yet.
Following Radek's suggestion:
Hiding the undefined behavior in a function does not make it defined behavior. You can try to pretend that it isn't undefined, but it still is. The moment you use that reinterpret_cast, you willingly give up living in C++-land.
What about a placement new?
#include <algorithm> // std::copy
#include <array>
#include <iostream>
#include <iterator>
#include <new>       // placement new
template<typename T, std::size_t N>
struct array_slice : public std::array<T,N> {
~array_slice() = delete;
};
int main() {
std::array<double,4> x_mu{0.,3.14,-1.,1.};
std::cout << &x_mu << std::endl;
{
auto slicer = [] (std::array<double,4>& ref) {
array_slice<double,3>* p = new (&ref) array_slice<double,3>;
return p;
};
std::array<double,3>& x_ = *slicer(x_mu);
std::copy(x_.begin(),x_.end(),
std::ostream_iterator<double>(std::cout," "));
std::cout << std::endl;
std::cout << &x_ << std::endl;
}
std::copy(x_mu.begin(),x_mu.end(),
std::ostream_iterator<double>(std::cout," "));
}
I was wondering whether sorting an array of std::pair is faster, or an array of struct?
Here are my code segments:
Code #1: sorting std::pair array (by first element):
#include <algorithm>
pair <int,int> client[100000];
sort(client,client+100000);
Code #2: sort struct (by A):
#include <algorithm>
struct cl{
int A,B;
};
bool cmp(cl x,cl y){
return x.A < y.A;
}
cl clients[100000];
sort(clients,clients+100000,cmp);
Code #3: sort struct (by A, using an internal operator <):
#include <algorithm>
struct cl{
int A,B;
bool operator<(const cl& x) const {
return A < x.A;
}
};
cl clients[100000];
sort(clients,clients+100000);
Update: I used this code to solve a problem on an online judge. Code #1 exceeded the time limit of 2 seconds, while code #2 and #3 were accepted (they ran in 62 milliseconds). Why does code #1 take so much longer than the others? Where is the difference?
You know what std::pair is? It's a struct (or class, which is the same thing in C++ for our purposes). So if you want to know what's faster, the usual advice applies: you have to test it and find out for yourself on your platform. But the best bet is that if you implement the equivalent sorting logic to std::pair, you will have equivalent performance, because the compiler does not care whether your data type's name is std::pair or something else.
But note that the code you posted is not equivalent in functionality to the operator < provided for std::pair. Specifically, you only compare the first member, not both. Obviously this may result in some speed gain (but probably not enough to notice in any real program).
I would estimate that there isn't much difference at all between these two solutions.
But as with ALL performance-related questions, rather than rely on someone on the internet telling you they are the same, or that one is better than the other, make your own measurements. Sometimes, subtle differences in implementation will make a lot of difference to the actual results.
Having said that, std::pair is implemented as a struct (or class) with two members, first and second, so I have a hard time imagining that there is any real difference here: you are just implementing your own pair with your own compare function that does much the same thing as the existing pair does. Whether the comparison is a member function of the class or a standalone function is unlikely to make much of a difference.
Edit: I made the following "mash the code together":
#include <algorithm>
#include <iostream>
#include <iomanip>
#include <cstdlib>
#include <ctime>   // clock(), clock_t, CLOCKS_PER_SEC
#include <utility> // std::pair
using namespace std;
const int size=100000000;
pair <int,int> clients1[size];
struct cl1{
int first,second;
};
cl1 clients2[size];
struct cl2{
int first,second;
bool operator<(const cl2 x) const {
return first < x.first;
}
};
cl2 clients3[size];
template<typename T>
void fill(T& t)
{
srand(471117); // Use the same seed each time, so every array gets the same data
for(size_t i = 0; i < sizeof(t) / sizeof(t[0]); i++)
{
t[i].first = rand();
t[i].second = -t[i].first;
}
}
void func1()
{
sort(clients1,clients1+size);
}
bool cmp(cl1 x, cl1 y){
return x.first < y.first;
}
void func2()
{
sort(clients2,clients2+size,cmp);
}
void func3()
{
sort(clients3,clients3+size);
}
void benchmark(void (*f)(), const char *name)
{
cout << "running " << name << endl;
clock_t time = clock();
f();
time = clock() - time;
cout << "Time taken = " << (double)time / CLOCKS_PER_SEC << endl;
}
#define bm(x) benchmark(x, #x)
int main()
{
fill(clients1);
fill(clients2);
fill(clients3);
bm(func1);
bm(func2);
bm(func3);
}
The results are as follows:
running func1
Time taken = 10.39
running func2
Time taken = 14.09
running func3
Time taken = 10.06
I ran the benchmark three times, and the timings were all within ~0.1s of the above results.
Edit2:
And looking at the code generated, it's quite clear that the "middle" function takes quite a bit longer, since the comparison is made inline for pair and struct cl2, but can't be made inline for struct cl1 - so every compare literally makes a function call, rather than a few instructions inside the functions. This is a large overhead.