Related
I want to inline the function MyClass:at(), but performance isn't as I expect.
MyClass.cpp
#include <algorithm>
#include <chrono>
#include <iostream>
#include <vector>
#include <string>
// Making this a lot shorter than in my actual program
std::vector<std::vector<int>> arrarr =
{
{ 1, 70, 54, 71, 83, 51, 54, 69, 16, 92, 33, 48, 61, 43, 52, 1, 89, 19, 67, 48},
{24, 47, 32, 60, 99, 3, 45, 2, 44, 75, 33, 53, 78, 36, 84, 20, 35, 17, 12, 50},
{32, 98, 81, 28, 64, 23, 67, 10, 26, 38, 40, 67, 59, 54, 70, 66, 18, 38, 64, 70},
{67, 26, 20, 68, 2, 62, 12, 20, 95, 63, 94, 39, 63, 8, 40, 91, 66, 49, 94, 21},
{24, 55, 58, 5, 66, 73, 99, 26, 97, 17, 78, 78, 96, 83, 14, 88, 34, 89, 63, 72},
{21, 36, 23, 9, 75, 0, 76, 44, 20, 45, 35, 14, 0, 61, 33, 97, 34, 31, 33, 95},
{78, 17, 53, 28, 22, 75, 31, 67, 15, 94, 3, 80, 4, 62, 16, 14, 9, 53, 56, 92},
{16, 39, 5, 42, 96, 35, 31, 47, 55, 58, 88, 24, 0, 17, 54, 24, 36, 29, 85, 57},
{86, 56, 0, 48, 35, 71, 89, 7, 5, 44, 44, 37, 44, 60, 21, 58, 51, 54, 17, 58},
{19, 80, 81, 68, 5, 94, 47, 69, 28, 73, 92, 13, 86, 52, 17, 77, 4, 89, 55, 40},
{ 4, 52, 8, 83, 97, 35, 99, 16, 7, 97, 57, 32, 16, 26, 26, 79, 33, 27, 98, 66},
{88, 36, 68, 87, 57, 62, 20, 72, 3, 46, 33, 67, 46, 55, 12, 32, 63, 93, 53, 69},
{ 4, 42, 16, 73, 38, 25, 39, 11, 24, 94, 72, 18, 8, 46, 29, 32, 40, 62, 76, 36},
{20, 69, 36, 41, 72, 30, 23, 88, 34, 62, 99, 69, 82, 67, 59, 85, 74, 4, 36, 16},
{20, 73, 35, 29, 78, 31, 90, 1, 74, 31, 49, 71, 48, 86, 81, 16, 23, 57, 5, 54},
{ 1, 70, 54, 71, 83, 51, 54, 69, 16, 92, 33, 48, 61, 43, 52, 1, 89, 19, 67, 48},
};
class MyClass
{
public:
MyClass(std::vector<std::vector<int>> arr) : arr(arr)
{
rows = arr.size();
cols = arr.at(0).size();
}
inline auto at(int row, int col) const { return arr[row][col]; }
void arithmetic(int n) const;
private:
std::vector<std::vector<int>> arr;
int rows;
int cols;
};
MyClass.cpp:
void MyClass::arithmetic(int n) const
{
using std::chrono::high_resolution_clock;
using std::chrono::duration_cast;
using std::chrono::duration;
using std::chrono::milliseconds;
auto t1 = high_resolution_clock::now();
int highest_product = 0;
for (auto y = 0; y < rows; ++y)
{
for (auto x = 0; x < cols; ++x)
{
// Horizontal product
if (x + n < cols)
{
auto product = 1;
for (auto i = 0; i < n; ++i)
{
product *= at(y, x + i);
}
highest_product = std::max(highest_product, product);
}
}
}
auto t2 = high_resolution_clock::now();
duration<double, std::milli> ms_double = t2 - t1;
std::cout << ms_double.count() << "ms\n";
return highestProduct;
};
Now what I want know is why do I get better performance when I replace product *= at(y, x + i); with product *= arr[y][x+i];? When I test it with the first case, the timing on my large array takes roughly 6.7ms, and the second case takes 5.3ms. I thought when I inlined the function, it should be the same implementation as the second case.
Member function directly defined in the class definition (typically in header files) are implicitly inlined so using inline is useless in this case. inline do not guarantee the function is inlined. It is just an hint for the compiler. The keyword is also an important during the link to avoid the multiple-definition issue. Function that are not make inline can still be inlined if the compiler can see the code of the target function (ie. it is in the same translation unit or link time optimization are applied). For more information about this, please read Why are class member functions inlined?
Note that the inlining is typically performed in the optimization step of compilers (eg. -O1//O1). Thus without optimizations, most compilers will not inline the function.
Using std::vector<std::vector<int>> is not efficient since it is not a contiguous data structure and it require 2 indirection to access an item. Two sub-vectors next to each other can be stored far away in memory likely causing more cache misses (and/or thrashing due to the alignment). Please consider using one big flatten array and access items using y*cols+x where cols is the size of the sub-vectors (20 here). Alternatively a int[16][20] data type should do the job well if the size if fixed at compile-time.
MyClass(std::vector<std::vector<int>> arr) cause the input parameter to be copied (and so all the sub-vectors). Please consider using a const std::vector<std::vector<int>>& type.
While at is convenient for checking bounds at runtime, this feature can strongly decrease performance. Consider using the operator [] if you do not need that. You can use assertions combined with flatten arrays so to get a fast code in release and a safe code in debug (you can enable/disable them by defining the NDEBUG macro).
I'm analyzing an obfuscated OpenGL application. I want to generate a .obj file that describes the multi-polygon model which is displayed in the application.
So I froze the app and dig out the values set in VBO and IBO. But the values set in IBO was far more mysterious than what I've expected. The value was
0, 0, 1, 2, 3, 4, 5, 6, 7, 7, 5, 8, 3, 3, 9, 9, 10, 11, 12, 12, 10, 13, 14, 14, 10, 15, 16, 16, 17, 17, 7, 8, 8, 18, 18, 19, 20, 21, 21, 22, 22, 23, 24, 25, 25, 26, 26, 27, 28, 29, 29, 30, 30, 31, 32, 32, 33, 33, 34, 35, 36, 37, 38, 38, 36, 39, 34, 34, 40, 40, 40, 41, 42, 43, 44, 44, 45, 45, 46, 47, 48, 49, 49, 50, 50, 51, 52, 52, 53, 53, 54, 55, 55, 56, 56, 57, 58, 58, 59, 59, 60, 61, 62, 62, 63, 63, 63, 64, 65, 66, 67, 64, 68, 68, 69, 69, 70, 71, 72, 73, 74, 75, 76, 76, 77, 77, 78, 79, 80, 81, 82, 82, 80, 83, 83, 84, 84, 85, 86, 87, 88, 88, 89, 89, 90, 91, 91, 92, 92, 92, 93, 94, 95, 96, 96, 97, 97, 97, 98, 99, 100, 101, 102, 102, 100, 103, 103, 104, 104, 105, 106, 107, 107, 108, 108, 108, 109, 110, 111, 112, 112, 100, 100, 101, 113, 114, 114, ... (length=10495)
As you can see indices like 40, 63, 92 and 108 are tripled, so setting neither GL_TRIANGLES, GL_TRIANGLE_STRIP, GL_TRIANGLE_FAN, GL_QUADS, GL_QUAD_STRIP nor GL_POLYGON to glDrawElements won't work correctly.
Are there some kind of advanced techniques to use triple sequenced indices in IBO? What does it mean? For what reason is it used for?
Repeated indices like that are indicative of aggressive optimization of triangle strips. A repeated index creates degenerate triangles: triangles with zero area. Since they have no visible area, they are not rendered. They exist so that you can jump from one triangle strip to the next without having to issue another draw command.
So a double-index is often used to stitch two strips together. The two triangles it generates will not be rendered.
However, because of the way strips work with the winding order, the facing for the triangles can work out incorrectly. That is, if you stitched two strips together with a double-index, the second strip would start out with the reverse winding order than it desires.
That's where triple indices come in. The third index fixes the winding order for the triangles in the destination strip. The three extra triangles it generates will not be rendered.
The more modern way to handle multiple strips in the same draw call is to use primitive restart indices. But the index list as it currently stands is adequate for use with GL_TRIANGLE_STRIP.
You can read this strip list and process it into a series of separate triangles (as appropriate for GL_TRIANGLES) easily enough. Simply look at each sequence of 3 vertices, and output that to your triangle buffer, so long as it is not a degenerate triangle. And you'll have to reverse the order of two of the indices for every odd-numbered triangle. The code would look something like this:
const int num_faces = indices.size() - 2;
faces.reserve(num_faces);
for(auto i = 0; i < num_faces; ++i)
{
Face f(indices[i], indices[i + 1], indices[i + 2]);
//Don't add any degenerate faces.
if(!(f[0] == f[1] || f[0] == f[2] || f[1] == f[2]))
{
if(i % 2 == 1) //Every odd-numbered face.
std::swap(f[1], f[2]);
faces.push_back(f);
}
}
I am new using Eigen library and I am having problems transform/reshape a vector in a matrix.
I am trying to get an specific row of a matrix and convert it as a matrix, but each time that I do that the result is not what I am expecting.
Eigen::Matrix<double, Eigen::Dynamic, Eigen::Dynamic, Eigen::RowMajor> m(8, 9);
m << 11, 12, 13, 14, 15, 16, 17, 18, 19,
21, 22, 23, 24, 25, 26, 27, 28, 29,
31, 32, 33, 34, 35, 36, 37, 38, 39,
41, 42, 43, 44, 45, 46, 47, 48, 49,
51, 52, 53, 54, 55, 56, 57, 58, 59,
61, 62, 63, 64, 65, 66, 67, 68, 69,
71, 72, 73, 74, 75, 76, 77, 78, 79,
81, 82, 83, 84, 85, 86, 87, 88, 89;
std::cout << m << std::endl << std::endl;
Matrix<double,1,Dynamic,RowMajor> B = m.row(0);
std::cout << B << std::endl << std::endl;
Map<Matrix3d,RowMajor> A(B.data(),3,3);
std::cout << A << std::endl << std::endl;
Result
11 14 17
12 15 18
13 16 19
I want:
11 12 13
14 15 16
17 18 19
You dont need to select a row first and then map. Just map directly from m and assign the transpose of map to a matrix A as follows
Matrix3d A = Map<Matrix3d>(m.data()).transpose();
If you don't like transposing then forcing the map to use RowMajor for the destination type works too
Matrix3d A = Map<Matrix<double, 3, 3, RowMajor>>(m.data());
Although, at this small size it doesn't matter. Cheers
You need to get the transpose of the result matrix. I think eigen library is converting a vector to a matrix by picking every n'th element to form a row in a n*n sized vector.
I'm trying to implement simplified DES for learning purposes in python, but I am having trouble figuring out how to do the permutations based on a "schedule." Essentially, I have a tuple with the appropriate permutations, and I need to bit shift to the correct location.
For example, using a key:
K = 00010011 00110100 01010111 01111001 10011011 10111100 11011111 11110001
Would move the 57st bit to the first bit spot, 49th bit to the second bit spot, etc...
K+ = 1111000 0110011 0010101 0101111 0101010 1011001 1001111 0001111
Current code:
def keyGen(key):
PC1table = (57, 49, 41, 33, 25, 17, 9,
1, 58, 50, 42, 34, 26, 18,
10, 2, 59, 51, 43, 35, 27,
19, 11, 3, 60, 52, 44, 36,
63, 55, 47, 39, 31, 23, 15,
7, 62, 54, 46, 38, 30, 22,
14, 6, 61, 53, 45, 37, 29,
21, 13, 5, 28, 20, 12, 4)
keyBinary = bin(int(key, 16))[2:].zfill(64)
print keyBinary
permute(PC1table, keyBinary)
def permute(permutation, permuteInput):
elements = list(enumerate(permutation))
for bit in permuteInput:
***magic bitshifting goes here***
keyGen("133457799BBCDFF1")
The logic I thought would work was to enumerate the tuple of permutations, and for each bit of my old key, look in the enumeration to find the index corresponding the the bit, and bit shift the appropriate number of times, but I just can't figure out how to go about doing this. It may be that I am approaching the problem from the wrong angle, but any guidance would be greatly appreciated!
Ok, I ended up figuring a way to make this work, although this probably isn't the most efficient way...
prior to calling the function, turn the binary number into a list:
keyBinary = bin(int(key, 16))[2:].zfill(64)
keyBinary = [int(i) for i in keyBinary]
Kplus = permute(PC1table, keyBinary)
def permute(mapping, permuteInput):
permuteOutput = []
for i in range(len(mapping)):
permuteOutput.append(permuteInput[mapping[i % 56] - 1])
return permuteOutput
if anyone has a better way of tackling this, I'd love to see your solutions!
I have a question which can be divided into two subquestions.
I have created a table the code of which is given below.
Problem 1.
xstep = 1;
xmaximum = 6;
numberofxnodes = 6;
numberofynodes = 3;
numberofzlayers = 3;
maximumgridnodes = numberofxnodes*numberofynodes
mnodes = numberofxnodes*numberofynodes*numberofzlayers;
orginaltable =
Table[{i,
node2 = i + xstep, node3 = node2 + xmaximum,
node4 = node3 - xstep,node5 = i + maximumgridnodes,
node6 = node5 + xstep,node7 = node6 + xmaximum,
node8 = node7 - xstep},
{i, 1, mnodes}]
If I run this I will get my original table. Basically I want to remove the sixth element and multiples of the sixth element from my original table. I am able to do this by using this code below.
modifiedtable = Drop[orginaltable, {6, mnodes, 6}]
Now I get the modified table where every sixth element and multiples of sixth element of my original table is removed. This solves my Problem 1.
Now my Problem 2:
** MAJOR EDITED VERSION**:(ALL THE CODES GIVEN ABOVE IS CORRECT)
Thanks a lot for the answers, but I wanted something else and I made a mistake
while explaining it initially so I'm making another try.
Below is my modified table: I want the elements in between
"/** and **/" deleted and remaining there.
{{1, 2, 8, 7, 19, 20, 26, 25}, {2, 3, 9, 8, 20, 21, 27, 26}, {3, 4,10, 9, 21, 22, 28, 27}, {4, 5, 11, 10, 22, 23, 29, 28}, {5, 6, 12, 11, 23, 24, 30, 29}, {7, 8, 14, 13, 25, 26, 32, 31}, {8, 9, 15, 14, 26, 27, 33, 32}, {9, 10, 16, 15, 27, 28, 34, 33}, {10, 11, 17, 16, 28, 29, 35, 34}, {11, 12, 18, 17, 29, 30, 36, 35}, /**{13, 14, 20, 19, 31, 32, 38, 37}, {14, 15, 21, 20, 32, 33, 39, 38}, {15, 16, 22, 21, 33, 34, 40, 39}, {16, 17, 23, 22, 34, 35, 41, 40}, {17, 18, 24, 23, 35, 36, 42, 41},**/ {19, 20, 26, 25, 37, 38, 44, 43}, {20, 21, 27, 26, 38, 39, 45, 44}, {21, 22, 28, 27, 39, 40, 46, 45}, {22, 23, 29, 28, 40, 41, 47, 46}, {23, 24, 30, 29, 41, 42, 48, 47}, {25, 26, 32, 31,43, 44, 50, 49}, {26, 27, 33, 32, 44, 45, 51, 50}, {27, 28, 34, 33, 45, 46, 52, 51}, {28, 29, 35, 34, 46, 47, 53, 52}, {29, 30, 36, 35, 47, 48, 54, 53}, /**{31, 32, 38, 37, 49, 50, 56, 55}, {32, 33, 39, 38,50, 51, 57, 56}, {33, 34, 40, 39, 51, 52, 58, 57}, {34, 35, 41, 40, 52, 53, 59, 58}, {35, 36, 42, 41, 53, 54, 60, 59},**/ {37, 38, 44, 43,55, 56, 62, 61}, {38, 39, 45, 44, 56, 57, 63, 62}, {39, 40, 46, 45, 57, 58, 64, 63}, {40, 41, 47, 46, 58, 59, 65, 64}, {41, 42, 48, 47,59, 60, 66, 65}, {43, 44, 50, 49, 61, 62, 68, 67}, {44, 45, 51, 50, 62, 63, 69, 68}, {45, 46, 52, 51, 63, 64, 70, 69}, {46, 47, 53, 52, 64, 65, 71, 70}, {47, 48, 54, 53, 65, 66, 72, 71}, /**{49, 50, 56, 55, 67, 68, 74, 73}, {50, 51, 57, 56, 68, 69, 75, 74},{51,52, 58, 57, 69, 70, 76, 75}, {52, 53, 59, 58, 70, 71, 77, 76}, {53, 54, 60, 59, 71, 72, 78, 77}}**/
Now, if you observe, I wanted the first ten elements
(1st to 10th element of modifiedtable) to be there in my final table
( DoubleModifiedTable ). the the next five (11th to 15th elements of modifiedtable) deleted.
Then the next ten elements ( 16th to 25th elements of modifiedtable)
to be present in my final table ( DoubleModifiedTable )
then the next five deleted (26th to 30th elements of modifiedtable) and so on for the whole table.
Let say we solve this problem and we name the final table DoubleModifiedTable.
I am basically interested in getting the DoubleModifiedTable. I decided to subdivide the problem as it easy to explain.
I want this to happen automatically through the table since as this is just an example table but in reality I have huge table. If I can understand how I can solve this problem for this table, then I can solve it for my large table too.
Perhaps simpler:
DoubleModifiedTable =
Module[{copy = modifiedtable},
copy[[Flatten[# + Range[5] & /# Range[10, Length[copy], 10]]]] = Sequence[];
copy]
EDIT
Even simpler:
DoubleModifiedTable =
Delete[modifiedtable,
Transpose[{Flatten[# + Range[5] & /# Range[10, Length[modifiedtable], 10]]}]]
EDIT 2
Per OP's request: one only has to change a single number (10 to 15) in any of my solutions to get the answer to a modified problem:
DoubleModifiedTable =
Delete[modifiedtable,
Transpose[{Flatten[# + Range[5] & /# Range[10, Length[modifiedtable], 15]]}]]
Another way is to do something like
DoubleModifiedTable = With[{n = 10, m = 5},
Flatten[{{modifiedtable[[;; m]]},
Partition[modifiedtable, n - m, n, {n - m + 1, 1}, {}]}, 2]]
Edit
The edited version of Problem 2 is actually slightly simpler to solve than the original version. You could for example do something like
DoubleModifiedTable =
With[{n = 10, m = 5}, Flatten[Partition[modifiedtable, n, n + m, 1, {}], 1]]
Edit 2
What my second version does is to split the original list modifiedtable into sublists using Partition and then to flatten these sublists to form the final list. If you look at the Documentation for Partition you can see that I'm using the 6th form of Partition which means that the length of the sublists is n and the offset (the distance be is n+m. The gap between the sublists is therefore n+m-n==m.
The next argument, 1, is actually equivalent to {1,1} which tells Mathematica that the first element of modifiedtable should appear at position 1 in the first sublist and the last element of modifiedtable should appear on or after position 1 of the last sublist.
The last argument, {} is to indicate that no padding should be used for sublists with length <=n.
In summary, if you want to delete the first 10 elements and keep the next 5 you want sublists of length n=5 with gap m=10. Since you want the first sublist to start with the (m+1)-th element of modifiedtable, you could replace the fourth argument in Partition with something of the form {k,1} for some value of k but it's probably easier to just drop the first m elements of modifiedtable beforehand, i.e.
DoubleModifiedTable =
With[{n = 5, m = 10},
Flatten[Partition[Drop[modifiedtable, m], n, n + m, 1, {}], 1]]
DoubleModifiedTable=
modifiedtable[[
Complement[
Range[Length[modifiedtable]],
Flatten#Table[10 i + j, {i, Floor[Length[modifiedtable]/10]}, {j, 5}]
]
]]
or, slightly shorter
DoubleModifiedTable=
#[[
Complement[
Range[Length[#]],
Flatten#Table[10 i + j, {i, Floor[Length[#]/10]}, {j, 5}]
]
]] & # modifiedtable