C++ Eigen initialize static matrix - c++

Is it possible to initialize a static eigen matrix4d in a header file? I want to use it as a global variable.
I'd like to do something along the lines of:
static Eigen::Matrix4d foo = Eigen::Matrix4d(1, 2 ... 16);
Or similar to vectors:
static Eigen::Matrix4d foo = {1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16};
Here is a link to the eigen matrix docs. I can't seem to find how to do this from there.

A more elegant solution might include the use of finished(). The function returns 'the built matrix once all its coefficients have been set.'
E.g:
static Eigen::Matrix4d foo = (Eigen::Matrix4d() << 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16).finished();

Along the lines of Dawid's answer (which has a small issue, see the comments), you can do:
static Eigen::Matrix4d foo = [] {
Eigen::Matrix4d tmp;
tmp << 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16;
return tmp;
}();
Return value optimization takes care of the temporary, so no worries about an extra copy.

You can use an initialization lambda like this:
static Eigen::Matrix4d foo = [] {
Eigen::Matrix4d matrix;
matrix << 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16;
return matrix;
}();
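One point the answers above do not cover is the header-file part of the question: a `static` variable defined in a header gives every translation unit its own separate copy. Since C++17 an `inline` variable can be used instead, so that all translation units share one object. The sketch below shows the same immediately invoked lambda pattern with `std::array` as a stand-in for `Eigen::Matrix4d`, so that it needs no third-party headers; the names `Matrix4` and `kFoo` are hypothetical.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-in for Eigen::Matrix4d (16 coefficients).
using Matrix4 = std::array<double, 16>;

// C++17 inline variable: one shared object for every translation unit that
// includes this header, initialized once by the immediately invoked lambda.
inline const Matrix4 kFoo = [] {
    Matrix4 tmp{};
    for (std::size_t i = 0; i < tmp.size(); ++i) {
        tmp[i] = static_cast<double>(i + 1);  // fills 1, 2, ..., 16
    }
    return tmp;
}();
```

With Eigen, the lambda body would presumably be the `tmp << 1, 2, ...;` form shown above, with only the declaration changed to `inline`.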

Related

c++ get indices of duplicating rows in 2D array

The task is as follows: find the indices of duplicate rows of a 2D array. Rows are considered duplicates if the 2nd and 4th elements of one row are equal to the 2nd and 4th elements of another row. The simplest way to do it is something like this:
std::unordered_set<int> result;
for (int i = 0; i < rows_count; ++i)
{
for (int j = i + 1; j < rows_count; ++j)
{
if (arr[i][2] == arr[j][2] && arr[i][4] == arr[j][4])
{
result.insert(j); // std::unordered_set has insert(), not push_back()
}
}
}
But if rows_count is very large, this algorithm is too slow. So my question is: is there any way to get the needed indices using some data structure (from the STL or elsewhere) with only a single loop (without a nested loop)?
You could take advantage of the properties of a std::unordered_set.
A small helper class makes things easier.
We store the 2nd and 4th values in a class and use a comparison function to detect duplicates.
Besides the element type, std::unordered_set takes 2 additional template parameters that we need here:
a functor for calculating a hash value and
a functor for equality.
So we add 2 functions to our class and make it a functor for both parameters at the same time. In the code below you will see:
std::unordered_set<Dupl, Dupl, Dupl> dupl{};
So, we use our class additionally as 2 functors.
The rest of the functionality will be done by the std::unordered_set
Please see below one of many potential solutions:
#include <vector>
#include <unordered_set>
#include <iostream>
struct Dupl {
Dupl() {}
Dupl(const size_t row, const std::vector<int>& data) : index(row), firstValue(data[2]), secondValue(data[4]){};
size_t index{};
int firstValue{};
int secondValue{};
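// Hash function: it must depend only on the fields used by the comparison below,
// otherwise equal rows could get different hashes and duplicates would be missed
std::size_t operator()(const Dupl& d) const noexcept {
return static_cast<std::size_t>(d.firstValue) + (static_cast<std::size_t>(d.secondValue) << 8);
}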
// Comparison
bool operator()(const Dupl& lhs, const Dupl& rhs) const {
return (lhs.firstValue == rhs.firstValue) and (lhs.secondValue == rhs.secondValue);
}
};
std::vector<std::vector<int>> data{
{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}, // Index 0
{2, 3, 4, 5, 6, 7, 8, 9, 10, 11}, // Index 1
{3, 4, 42, 6, 42, 8, 9, 10, 11, 12}, // Index 2 ***
{4, 5, 6, 7, 8, 9, 10, 11, 12, 13}, // Index 3
{5, 6, 42, 8, 42, 10, 11, 12, 13, 14}, // Index 4 ***
{6, 7, 8, 9, 10, 11, 12, 13, 14, 15}, // Index 5
{7, 8, 9, 10, 11, 12, 13, 14, 15, 16}, // Index 6
{8, 9, 10, 11, 12, 13, 14, 15, 16, 17}, // Index 7
{9, 10, 42, 12, 42, 14, 15, 16, 17, 18}, // Index 8 ***
{10, 11, 12, 13, 14, 15, 16, 17, 18, 19}, // Index 9
};
int main() {
std::unordered_set<Dupl, Dupl, Dupl> dupl{};
// Find the unique rows
for (size_t i{}; i < data.size(); ++i)
dupl.insert({i, data[i]});
// Show some debug output
for (const Dupl& d : dupl) {
std::cout << "\nIndex:\t " << d.index << "\t\tData: ";
for (const int i : data[d.index]) std::cout << i << ' ';
}
}
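The code above keeps one representative row per key. To get the indices of the duplicate rows themselves, as the question asks, the return value of `insert()` can be checked in the same single loop. The following is a minimal sketch under some assumptions: it swaps in `std::set` keyed on `std::pair<int, int>` so that no hash functor is needed (O(n log n) instead of the expected O(n) of a hash set), it uses indices 2 and 4 as in the code above, and the function name `findDuplicateRows` is hypothetical.

```cpp
#include <cstddef>
#include <set>
#include <utility>
#include <vector>

// Returns the indices of all rows whose (row[2], row[4]) pair already occurred
// in an earlier row. Every row is assumed to have at least 5 elements.
std::vector<std::size_t> findDuplicateRows(const std::vector<std::vector<int>>& rows) {
    std::set<std::pair<int, int>> seen;
    std::vector<std::size_t> duplicates;
    for (std::size_t i = 0; i < rows.size(); ++i) {
        // insert() reports in .second whether the key was newly added;
        // false means an earlier row had the same pair.
        if (!seen.insert({rows[i][2], rows[i][4]}).second) {
            duplicates.push_back(i);
        }
    }
    return duplicates;
}
```

The first row with a given pair is treated as the original and is not reported; only later rows with the same pair end up in the result.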

Using std::map should be deterministic or not?

I'm facing a strange behaviour using Intel C++ compiler 2019 update 5. When I fill a std::map it seems to lead to a non deterministic (?) result. The stl is from VS2019 16.1.6 in which ICC is embedded. I am on Windows 10.0.17134.286.
My code:
#include <map>
#include <vector>
#include <iostream>
std::map<int, int> AddToMapWithDependencyBetweenElementsInLoop(const std::vector<int>& values)
{
std::map<int, int> myMap;
for (int i = 0; i < values.size(); i+=3)
{
myMap.insert(std::make_pair(values[i], myMap.size()));
myMap.insert(std::make_pair(values[i + 1], myMap.size()));
myMap.insert(std::make_pair(values[i + 2], myMap.size()));
}
return myMap;
}
std::map<int, int> AddToMapOnePerLoop(const std::vector<int>& values)
{
std::map<int, int> myMap;
for (int i = 0; i < values.size(); ++i)
{
myMap.insert(std::make_pair(values[i], 0));
}
return myMap;
}
int main()
{
std::vector<int> values{ 6, 7, 15, 5, 4, 12, 13, 16, 11, 10, 9, 14, 0, 1, 2, 3, 8, 17 };
{
auto myMap = AddToMapWithDependencyBetweenElementsInLoop(values);
for (const auto& keyValuePair : myMap)
{
std::cout << keyValuePair.first << ", ";
}
std::cout << std::endl;
}
{
auto myMap = AddToMapOnePerLoop(values);
for (const auto& keyValuePair : myMap)
{
std::cout << keyValuePair.first << ", ";
}
std::cout << std::endl;
}
return 0;
}
I simply wanted to perform a test so I call directly icl from the command line:
$ icl /nologo mycode.cpp
$ mycode.exe
0, 1, 2, 3, 4, 5, 6, 7, 11, 12, 13, 14, 15, 16, 17,
0, 1, 2, 3, 4, 5, 6, 7, 12, 13, 14, 15, 16, 17
Curious. I expected to have 18 entries and I got 15 and 14 (depending on the insertion method, see the code).
$ icl /nologo /EHsc mycode.cpp
$ mycode.exe
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17,
0, 1, 2, 3, 4, 5, 6, 7, 12, 13, 14, 15, 16, 17
Still curious, now I got 17 and 14 entries rather than 18 and 18!
$ icl /nologo /Od mycode.cpp
$ mycode.exe
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17,
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17,
Now, with no optimization, I got 18/18, as expected.
My question is two-fold: 1) is it normal to get such results, and 2) if it's not (which is what I suspect), what did I do wrong? I thought a simple call to the compiler would call the std::map::insert() function correctly?
Does the problem lie in the for(){}???
Thanks for helping me understand this problem and find a solution!
I cannot reproduce this, but in either case, for peace of mind you could populate the map much more simply:
for (auto i: values) {
myMap[i] = 0;
}
There is no need to use myMap.insert(std::make_pair(key, value)) just to add an entry to the map.
Otherwise your code produces the expected output (0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17 twice, the sequence is obviously sorted because this is the ordered map) if compiled with gcc 8.4.0 under Ubuntu. I suspect this is simply a bug of that particular compiler you use. It would be beneficial to report the bug to the compiler developers so that they could fix it.
Syntactically, your code is fine. I see no possible undefined behavior here (as long as you did not apply any hidden crazy hacks like redefining size_t/map, modifying standard headers etc.).
But:
Since I experienced loop-optimizer issues with older compilers due to lines like this one
for (int i = 0; i < values.size(); ++i)
where you mix signed and unsigned integers / data type ranges, I suspect your Intel compiler might have an issue with loop unrolling here. It may also be caused by a related issue inside the loop and the use of the subscript operator there. The typical fundamental issue here is a wrong assumption about allowed register usage. Can you try your code again with strict size_t usage here?
Further idea:
Can you reproduce the issue if your 'static' pre-defined values to print are created in a very dynamic way instead of hard-code construction? That might at least exclude a lot of possible underlying reasons if you cannot.
Just guessing that there could be an optimization related to for(...; i+=3)
I see that in your use case the number of items is divisible by 3, but anyway I would fix a bug in your code for more general cases:
{
std::map<int, int> myMap;
for (int i = 0; (i + 2) < values.size(); i+=3) // ignore the possibly incomplete last triplet
I know it is not directly related to your problem, but maybe this fix triggers something in the compiler optimizer so that it generates correct code.
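The two suggestions above (a strict size_t loop index and a guard against an incomplete last triplet) can be combined as in the following sketch. It is only an illustration of those suggestions, not a confirmed fix for the compiler behaviour described in the question; the function name `addTriplets` is hypothetical.

```cpp
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

std::map<int, int> addTriplets(const std::vector<int>& values) {
    std::map<int, int> myMap;
    // The index has the same unsigned type as values.size(), so the comparison
    // mixes no signed and unsigned operands. The condition i + 2 < size()
    // skips a possibly incomplete trailing triplet.
    for (std::size_t i = 0; i + 2 < values.size(); i += 3) {
        myMap.insert(std::make_pair(values[i], static_cast<int>(myMap.size())));
        myMap.insert(std::make_pair(values[i + 1], static_cast<int>(myMap.size())));
        myMap.insert(std::make_pair(values[i + 2], static_cast<int>(myMap.size())));
    }
    return myMap;
}
```

For the 18 values from the question this should produce 18 entries; with one extra trailing value, that value is simply ignored.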

Reading integers in different endianness from binary file in C++

I'm reading an ESRI Shapefile, and to my dismay it uses big endian and little endian at different points (see, for instance, the table at page 4, plus the tables from page 5 to 8).
So I created two functions in C++, one for each endianness.
uint32_t readBig(ifstream& f) {
uint32_t num;
uint8_t buf[4];
f.read((char*)buf,4);
num = buf[3] | buf[2]<<8 | buf[1]<<16 | buf[0]<<24;
return num;
}
uint32_t readLittle(ifstream& f) {
uint32_t num;
f.read(reinterpret_cast<char *>(&num),4);
//f.read((char*)&num,4);
return num;
}
But I'm not sure this is the most efficient way to do it. Can this code be improved? Keep in mind it will run thousands, maybe millions of times for a single shapefile. So to have even one of the functions calling the other seem worse than to have two separate functions. Is there a difference in performance between using reinterpret_cast or explicit type conversion (char*)? Should I use the same in both functions?
Casting between pointer types does not affect performance -- In
this case, it's just a technicality to make the compiler happy.
If you're really making a separate call to read for every 32-bit
value, the time taken by the byte-swapping operation will likely be
in the noise. For speed, you probably should have your own
buffering layer so that your inner loop doesn't make any function
calls.
It's nice if the swap compiles down to a single opcode (like bswap), but whether or not that
is possible, or the fastest option, is processor-specific.
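As a rough sketch of the buffering idea: read a large chunk of the file with a single `read` call into a byte buffer, then decode the integers from memory with small inline helpers. The helper names `loadBig32` and `loadLittle32` are hypothetical. Both assemble the value from individual bytes, so the result does not depend on the endianness of the host, unlike reading directly into a `uint32_t`.

```cpp
#include <cstdint>

// Decode a 32-bit big-endian value from the 4 bytes starting at p.
inline std::uint32_t loadBig32(const unsigned char* p) {
    return (static_cast<std::uint32_t>(p[0]) << 24) |
           (static_cast<std::uint32_t>(p[1]) << 16) |
           (static_cast<std::uint32_t>(p[2]) << 8) |
           static_cast<std::uint32_t>(p[3]);
}

// Decode a 32-bit little-endian value from the 4 bytes starting at p.
inline std::uint32_t loadLittle32(const unsigned char* p) {
    return static_cast<std::uint32_t>(p[0]) |
           (static_cast<std::uint32_t>(p[1]) << 8) |
           (static_cast<std::uint32_t>(p[2]) << 16) |
           (static_cast<std::uint32_t>(p[3]) << 24);
}
```

Casting each byte to `std::uint32_t` before shifting keeps the shifts in unsigned arithmetic. Compilers often recognize this pattern and emit a plain load, plus a byte swap where needed, but that depends on the compiler and target.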
If you're really interested in maximizing speed, consider using SIMD intrinsics.
In most cases the compiler should generate a bswap instruction, which is probably sufficient. If however you need something faster than that, vpshufb is your friend...
#include <immintrin.h>
#include <cstdint>
// swap byte order in 16 x int16
inline void swap_16xi16(uint16_t input[16])
{
constexpr uint8_t mask_data[] = {
1, 0,
3, 2,
5, 4,
7, 6,
9, 8,
11, 10,
13, 12,
15, 14,
1, 0,
3, 2,
5, 4,
7, 6,
9, 8,
11, 10,
13, 12,
15, 14
};
const __m256i swapped = _mm256_shuffle_epi8(
_mm256_loadu_si256((const __m256i*)input),
_mm256_loadu_si256((const __m256i*)mask_data)
);
_mm256_storeu_si256((__m256i*)input, swapped);
}
// swap byte order in 8 x int32
inline void swap_8xi32(uint32_t input[8])
{
constexpr uint8_t mask_data[] = {
3, 2, 1, 0,
7, 6, 5, 4,
11, 10, 9, 8,
15, 14, 13, 12,
3, 2, 1, 0,
7, 6, 5, 4,
11, 10, 9, 8,
15, 14, 13, 12
};
const __m256i swapped = _mm256_shuffle_epi8(
_mm256_loadu_si256((const __m256i*)input),
_mm256_loadu_si256((const __m256i*)mask_data)
);
_mm256_storeu_si256((__m256i*)input, swapped);
}
// swap byte order in 4 x int64
inline void swap_4xi64(uint64_t input[4])
{
constexpr uint8_t mask_data[] = {
7, 6, 5, 4, 3, 2, 1, 0,
15, 14, 13, 12, 11, 10, 9, 8,
7, 6, 5, 4, 3, 2, 1, 0,
15, 14, 13, 12, 11, 10, 9, 8
};
const __m256i swapped = _mm256_shuffle_epi8(
_mm256_loadu_si256((const __m256i*)input),
_mm256_loadu_si256((const __m256i*)mask_data)
);
_mm256_storeu_si256((__m256i*)input, swapped);
}
inline void swap_16xi16(int16_t input[16])
{ swap_16xi16((uint16_t*)input); }
inline void swap_8xi32(int32_t input[8])
{ swap_8xi32((uint32_t*)input); }
inline void swap_4xi64(int64_t input[4])
{ swap_4xi64((uint64_t*)input); }
inline void swap_8f(float input[8])
{ swap_8xi32((uint32_t*)input); }
inline void swap_4d(double input[4])
{ swap_4xi64((uint64_t*)input); }

Efficient Eigen Matrix SubIndexing + Concatenation

I'm using Eigen for easy optimization of some of my matrix math. I'm currently trying to make the following operation more efficient:
Given Matrix A:
1, 2, 3
4, 5, 6
Matrix B:
7, 11, 13, 19, 26, 7, 11
8, 9, 15, 6, 8, 4, 1
and "index map" column vector IM:
0, 1, 3, 6
I'd like to append the columns of Matrix B mapping to the indexes in IM, to Matrix A as such:
1, 2, 3, 7, 11, 19, 11
4, 5, 6, 8, 9, 6, 1
I'm currently able to do this with a massive for loop, but this is the bottleneck in my code and I'd like to avoid this:
#pragma unroll
for (int i = 0; i < 25088; i++) {
block.noalias() += _features.col(ff[i]);
}
I've seen the discussion here and pored over the docs but can't seem to figure out the right syntax relating to Eigen matrices: http://eigen.tuxfamily.org/bz/show_bug.cgi?id=329
Any thoughts/tips would be much appreciated!

How do I find size of varying rows of a dynamically allocated array?

I have the following code:
int *exceptions[7];
int a[] = {1, 4, 11, 13};
int b[] = {5, 6, 11, 12, 14, 15};
int c[] = {2, 12, 14, 15};
int d[] = {1, 4, 7, 9, 10, 15};
int e[] = {1, 3, 4, 5, 7, 9};
int f[] = {1, 2, 3, 7, 13};
int g[] = {0, 1, 7, 12};
exceptions[0] = a;
exceptions[1] = b;
exceptions[2] = c;
exceptions[3] = d;
exceptions[4] = e;
exceptions[5] = f;
exceptions[6] = g;
The sizes of exceptions[0] and exceptions[1] should be 4 and 6 respectively.
Here's my code:
short size = sizeof(exceptions[1]) / sizeof(exceptions[1][0]);
But I'm getting 2 for every row. How can I solve this problem?
short size = sizeof(exceptions[1]) / sizeof(exceptions[1][0]);
effectively does the same as
short size = sizeof(int*) / sizeof(int);
On a typical 64-bit platform (8-byte pointers, 4-byte int), that most probably yields 2.
How can I solve this problem?
Use some c++ standard container like std::vector<std::vector<int>> instead:
std::vector<std::vector<int>> exceptions {
{1, 4, 11, 13},
{5, 6, 11, 12, 14, 15},
{2, 12, 14, 15},
{1, 4, 7, 9, 10, 15},
{1, 3, 4, 5, 7, 9},
{1, 2, 3, 7, 13},
{0, 1, 7, 12},
};
Your statement will become:
short size = exceptions[0].size();
size = exceptions[1].size();
(for whatever that's needed)
The best remedy would be to use the vector provided in the standard template library. Vectors have a size() function which you can use, and they are much more versatile than raw arrays.
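If the raw arrays have to stay, the length must be captured while the array type is still visible, because it is lost as soon as the array decays to an `int*`. A minimal sketch, with the hypothetical helper name `arrayLength` (C++17 offers `std::size` for the same purpose):

```cpp
#include <cstddef>

// Deduces the element count from a reference to an array at compile time.
// Passing a pointer does not compile, which catches the mistake early.
template <typename T, std::size_t N>
constexpr std::size_t arrayLength(const T (&)[N]) {
    return N;
}
```

The lengths obtained this way, for example `arrayLength(b)`, would then have to be stored next to the pointers, for instance in a second array parallel to `exceptions`.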