Template Metaprogramming in loop? - c++

A few minutes ago I was practising a trivial algorithm problem. The code is below (the concrete logic of the algorithm is not important; all we need to know is that everything above the main function is just TMP):
#include <array>
#include <algorithm>
#include <iterator>
#include <iostream>
#include <cstdio> // scanf, printf
constexpr int digit_in_ones[10] = { 6, 2, 5, 5, 4, 5, 6, 3, 7, 6 };
constexpr int createOneD(int index);
template<int ...>
struct seq
{
};
template<int A, int ...B>
struct gens : gens<A - 1, A - 1, B...>
{
};
template<int ...S>
struct gens<0, S ...>
{
typedef seq<S...> type;
};
template<int N>
class oneDArrayMaker
{
private:
typedef typename gens<N>::type sequence;
template<int ...S>
static constexpr std::array<int, N> make(seq<S ...>)
{
return std::array<int, N>{ {createOneD(S)...}};
}
public:
static constexpr std::array<int, N> oneDArr = make(sequence());
};
template<int N>
constexpr std::array<int, N> oneDArrayMaker<N>::oneDArr;
constexpr int createOneD(int index)
{
return index < 10 ?
digit_in_ones[index] :
digit_in_ones[(index % 100) / 10] + digit_in_ones[index % 10] +
(index >= 100 ? digit_in_ones[index / 100] : 0);
}
int main()
{
int n{}, ans{};
scanf("%d", &n);
for (int i = 0; i < 800; i++)
{
for (int j = 0; j < 800; j++)
{
auto temp = oneDArrayMaker<800>::oneDArr[i] + oneDArrayMaker<800>::oneDArr[j] + (i+j < 800 ? oneDArrayMaker<800>::oneDArr[i+j] : 100) + 4;
if (temp == n)
{
ans++;
}
}
}
printf("%d", ans);
}
I know that loops and if statements (excluding constexpr functions and if constexpr) run at run time, not at compile time, so tricks like template specialization are the substitutes for if and loops. I learned a lesson about the silly use of if in template programming from this article,
Compile Time Loops with C++11 - Creating a Generalized static_for Implementation. Here is its code:
#include <iostream>
template<int index> void do_stuff()
{
std::cout << index << std::endl;
}
template<int max_index, int index = 0> void stuff_helper()
{
if (index <= max_index)
{
do_stuff<index>();
stuff_helper<max_index, index + 1>();
}
}
int main()
{
stuff_helper<100>();
return 0;
}
author's explanation:
On the surface, it could look like the if statement would be responsible for terminating the recursion, like how this would work with a "normal" run-time based recursion algorithm. But that's the problem. What works at runtime doesn't work at compile time.
This is an infinite loop, and only stops because compilers limit themselves to a certain recursion depth. In clang, I get an error fatal error: recursive template instantiation exceeded maximum depth of 256. You can expect a similar error with your compiler of choice.
Oops, I have just restated what I already know.
Finally, here is my question:
Template instantiation (specifically, two-phase lookup) happens at compile time, so all the template instantiations in the code at the top should happen at compile time:
for (int i = 0; i < 800; i++)
{
for (int j = 0; j < 800; j++)
{
auto temp = oneDArrayMaker<800>::oneDArr[i] + ... // 800 * 800 instantiations should be determined at compile time
...
}
...
}
As far as I know:
1. The two for loops here run at run time, although they are outside any template function or class definition and simply live in the main function.
2. Every auto temp = oneDArrayMaker<800>::oneDArr[i] + ... should be initialized at compile time, so 800 * 800 instantiations should be determined at compile time.
Q1: Does the run-time loop in the main function conflict with the 800 * 800 compile-time template instantiations?
My assumption: at compile time the compiler knows the trip count of the loops, so it just unrolls them and there is no loop at run time.
To check the case where the bounds of the two loops (i and j) cannot be determined at compile time, I changed the main function to:
int main()
{
int n{}, ans{}, i{}, j{};
scanf("%d", &n);
scanf("%d %d", &i, &j);
std::cout << n << " " << i << " " << j << std::endl;
for (; i < 800; i++)
{
for (; j < 800; j++)
{
auto temp = oneDArrayMaker<800>::oneDArr[i] + oneDArrayMaker<800>::oneDArr[j] + (i+j < 800 ? oneDArrayMaker<800>::oneDArr[i+j] : 100) + 4;
if (temp == n)
{
ans++;
}
}
}
printf("%d", ans);
}
Now i and j have to be determined at run time because of scanf. I just pass two extra 0s on stdin.
Here is a live example after altering the main function; the output is 12 (the right answer is 128).
It compiles successfully and no warning is generated. What confuses me is that the output differs from that of the original code (live code), whose output is 128 (equal to the right answer).
After debugging, I found the key: after the change, for (; i < 800; i++) is effectively executed only once, for i = 0, whereas it should also have executed for i = 1 to 799. That is the reason for 12 instead of 128.
Q2: If the trip count of a for loop cannot be determined at compile time and TMP code lives inside the loop, what happens?
Q3: How can the output 12 be explained?
Update:
Q3 has been resolved by @Scott Brown; I was careless.
Q1 and Q2 still confuse me.

You forgot to reset j before 'for (; j < 800; j++)'. After the first pass of the outer loop j stays at 800, so the inner loop never runs again.
int main()
{
int n{}, ans{}, i{}, j{};
scanf("%d", &n);
scanf("%d %d", &i, &j);
std::cout << n << " " << i << " " << j << std::endl;
int j_orig = j;// here
for (; i < 800; i++)
{
j = j_orig;// and here
for (; j < 800; j++)
{
auto temp = oneDArrayMaker<800>::oneDArr[i] + oneDArrayMaker<800>::oneDArr[j] + (i+j < 800 ? oneDArrayMaker<800>::oneDArr[i+j] : 100) + 4;
if (temp == n)
{
ans++;
}
}
}
printf("%d", ans);
}
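As for Q1 and Q2, one way to look at it (a sketch, not the original poster's code): the expression oneDArrayMaker<800>::oneDArr names a single class template specialization, so it is instantiated once no matter how often the loop body executes, and indexing the resulting constexpr array with a run-time i is an ordinary array read. The example below uses hypothetical names (square, make_table, table, sum_from) and std::index_sequence from C++14 in place of the hand-written seq/gens.

```cpp
#include <array>
#include <cstddef>
#include <utility>

// Hypothetical element generator standing in for createOneD.
constexpr int square(int i) { return i * i; }

// Expands the index pack once, at compile time, into an array initializer.
template <std::size_t... I>
constexpr std::array<int, sizeof...(I)> make_table(std::index_sequence<I...>) {
    return {{square(static_cast<int>(I))...}};
}

// One instantiation of make_table, evaluated by the compiler.
constexpr auto table = make_table(std::make_index_sequence<8>{});
static_assert(table[3] == 9, "the table is usable in constant expressions");

// Run-time loop with a run-time start value: each iteration is a plain
// array read and triggers no further template instantiation.
int sum_from(int start) {
    int sum = 0;
    for (int i = start; i < 8; ++i) {
        sum += table[i];
    }
    return sum;
}
```

Whether the compiler additionally unrolls the loop is an optimization decision; it does not change how many templates are instantiated.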

Related

C++ - response is int%

I am solving a 2D dynamic programming task in C++: counting the number of ways to reach the bottom-right cell of a table. My program returns %. Why?
Program:
#include <iostream>
using namespace std;
int main() {
int n, m;
cin >> n >> m;
int arr[n][m];
for (int i = 0; i < n; i++)
arr[i][0] = 1;
for (int i = 0; i < m; i++)
arr[0][i] = 1;
for (int i = 1; i < n; i++) {
for (int j = 1; j < m; j++)
arr[i][j] = arr[i-1][j] + arr[i][j-1];
}
cout << arr[n-1][m-1];
}
I would like this answer:
Request:
1 10
Response:
1
Your program has undefined behavior for any sizes other than n = 1 and m = 1 because you leave the positions of the non-standard VLA (variable length array) arr outside arr[0][0] uninitialized and later read from those positions. If you want to keep using these non-standard VLAs, you need to initialize them after constructing them. Example:
#include <cstring> // std::memset
// ...
int arr[n][m];
std::memset(arr, 0, sizeof arr); // zero out the memory
// ...
Another approach that would both make it initialized and be compliant with standard C++ would be to use std::vectors instead:
#include <vector>
// ...
std::vector<std::vector<int>> arr(n, std::vector<int>(m));
// ...
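Putting the std::vector suggestion together, a complete version of the path-counting logic might look like the sketch below. The function name count_paths and the use of unsigned long long are assumptions of this example, not part of the original code.

```cpp
#include <cstddef>
#include <vector>

// Number of monotone (right/down) paths from the top-left to the
// bottom-right cell of an n x m table.
unsigned long long count_paths(std::size_t n, std::size_t m) {
    if (n == 0 || m == 0) return 0;
    // Every cell starts at 1, which already covers the first row and column.
    std::vector<std::vector<unsigned long long>> ways(
        n, std::vector<unsigned long long>(m, 1));
    for (std::size_t i = 1; i < n; ++i) {
        for (std::size_t j = 1; j < m; ++j) {
            ways[i][j] = ways[i - 1][j] + ways[i][j - 1];
        }
    }
    return ways[n - 1][m - 1];
}
```

For the request 1 10 from the question, count_paths(1, 10) gives 1, since a single row admits only one path.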
A slightly more cumbersome approach is to store the data in a 1D vector inside a class and provide methods of accessing the data as if it was stored in a 2D matrix. A class letting you store arbitrary number of dimensions could look something like below:
#include <utility>
#include <vector>
template <class T, size_t Dim> // number of dimensions as a template parameter
class matrix {
public:
template <class... Args>
matrix(size_t s, Args&&... sizes) // sizes of all dimensions
: m_data(s * (... * sizes)), // allocate the total amount of data
m_sizes{s, static_cast<size_t>(sizes)...}, // store sizes
m_muls{static_cast<size_t>(sizes)..., 1} // and multipliers
{
static_assert(sizeof...(Args) + 1 == Dim);
for (size_t i = Dim - 1; i--;)
m_muls[i] *= m_muls[i + 1]; // calculate dimensional multipliers
}
template <size_t D> size_t size() const { return m_sizes[D]; }
size_t size(size_t D) const { return m_sizes[D]; }
// access the data using (y, x) instead of [y][x]
template <class... Args>
T& operator()(Args&&... indices) {
static_assert(sizeof...(Args) == Dim);
return op_impl(std::make_index_sequence<Dim>{}, indices...);
}
private:
template <std::size_t... I, class... Args>
T& op_impl(std::index_sequence<I...>, Args&&... indices) {
return m_data[(... + (indices * m_muls[I]))];
}
std::vector<T> m_data;
size_t m_sizes[Dim];
size_t m_muls[Dim];
};
With such a wrapper, you'd only need to change the implementation slightly:
#include <iostream>
int main() {
int n, m;
if(!(std::cin >> n >> m && n > 0 && m > 0)) return 1;
matrix<int, 2> arr(n, m);
for (int i = 0; i < arr.size<0>(); i++)
arr(i, 0) = 1;
for (int i = 0; i < arr.size<1>(); i++)
arr(0, i) = 1;
for (int i = 1; i < n; i++) {
for (int j = 1; j < m; j++)
arr(i, j) = arr(i - 1, j) + arr(i, j - 1);
}
std::cout << arr(n - 1, m - 1) << '\n';
}

Use one template for one-dimensional array and three-dimensional array

I have a function template that finds the maximum of a three-dimensional array. The crux of the problem is that the same template must also find the maximum of a one-dimensional array. The user answers a question through a char variable: if question == '1' the array is three-dimensional, if '2' it is one-dimensional.
I need to use the template with T2 for both the one-dimensional and the three-dimensional case, depending on question (a char).
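template <typename T2>
T2 maxShablon2(T2 ***arr, const int n) {
T2 max = arr[0][0][0]; // T2 rather than int, so other element types are not converted
for (int i = 0; i < n; ++i) {
for (int j = 0; j < n; ++j) {
for (int k = 0; k < n; ++k) {
if (arr[i][j][k] > max) {
max = arr[i][j][k];
}
}
}
}
cout << " Our max: " << max;
return max; // the function is declared to return T2
}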
#include <cstddef>     // std::size_t
#include <type_traits> // std::integral_constant
template<std::size_t N>
using count = std::integral_constant<std::size_t, N>;
template<class T>
constexpr T maxOver( count<0> unused, T t, int n ) {
return t;
}
template<std::size_t depth, class Ptr>
constexpr auto maxOver( count<depth>, Ptr* t, int n ) {
auto max = maxOver( count<depth-1>{}, t[0], n );
for (int i = 1; i < n; ++i) {
auto candidate = maxOver( count<depth-1>{}, t[i], n );
if (candidate > max)
max = candidate;
}
return max;
}
now,
constexpr int arr[100] = {1,2,3,0};
constexpr int arr_max = maxOver( count<1>{}, arr, 100 );
static_assert(arr_max == 3);
constexpr int arr3[5][5][5] = {{{1,-1,7},{2,3,4}},{{3},{4,5,9}}};
constexpr int arr3_max = maxOver( count<3>{}, arr3, 5 );
static_assert(arr3_max == 9);
passes (Live example).
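To pick between the one-dimensional and the three-dimensional case from the question char at run time, an ordinary branch that selects the depth tag is enough. The sketch below repeats maxOver from above so that it is self-contained, with the alias renamed to depth_tag to avoid any clash with std::count; max_by_question and its parameters are hypothetical names, and '1' meaning three-dimensional follows the question.

```cpp
#include <cstddef>
#include <type_traits>

template <std::size_t N>
using depth_tag = std::integral_constant<std::size_t, N>;

// Depth 0: a single element is its own maximum.
template <class T>
constexpr T maxOver(depth_tag<0>, T t, int) {
    return t;
}

// Depth > 0: take the maximum over the n sub-arrays one level down.
template <std::size_t depth, class Ptr>
constexpr auto maxOver(depth_tag<depth>, Ptr* t, int n) {
    auto max = maxOver(depth_tag<depth - 1>{}, t[0], n);
    for (int i = 1; i < n; ++i) {
        auto candidate = maxOver(depth_tag<depth - 1>{}, t[i], n);
        if (candidate > max) max = candidate;
    }
    return max;
}

// Hypothetical dispatcher: '1' selects the 3-D data, anything else the 1-D data.
template <class T>
T max_by_question(char question, T* flat, T*** cube, int n) {
    if (question == '1') {
        return maxOver(depth_tag<3>{}, cube, n);
    }
    return maxOver(depth_tag<1>{}, flat, n);
}
```

Both branches are compiled, so both pointers must be valid types; only the branch selected by question is executed.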

compact form of many for loop in C++

I have a piece of code as follows, and the number of for loops is determined by n which is known at compile time. Each for loop iterates over the values 0 and 1. Currently, my code looks something like this
for(int in=0;in<2;in++){
for(int in_1=0;in_1<2;in_1++){
for(int in_2=0;in_2<2;in_2++){
// ... n times
for(int i2=0;i2<2;i2++){
for(int i1=0;i1<2;i1++){
d[in][in_1][in_2]...[i2][i1] =updown(in)+updown(in_1)+...+updown(i1);
}
}
// ...
}
}
}
Now my question is whether one can write it in a more compact form.
The n bits in_k can be interpreted as the binary representation of one integer less than 2^n.
This makes it easy to work with a 1-D array (vector) d[.].
In practice, an integer j corresponds to
j = in[0] + 2*in[1] + ... + 2^(n-1)*in[n-1]
Moreover, a direct implementation is O(N log N), where N = 2^n.
A recursive solution is possible, for example using
f(val, n) = updown(val%2) + f(val/2, n-1) and f(val, 0) = 0.
This would correspond to O(N) complexity, provided that memoization is introduced, which is not implemented here.
Result:
0 : 0
1 : 1
2 : 1
3 : 2
4 : 1
5 : 2
6 : 2
7 : 3
8 : 1
9 : 2
10 : 2
11 : 3
12 : 2
13 : 3
14 : 3
15 : 4
#include <iostream>
#include <vector>
int up_down (int b) {
if (b) return 1;
return 0;
}
int f(int val, int n) {
if (n <= 0) return 0; // f(val, 0) = 0: all n bits have been consumed
return up_down (val%2) + f(val/2, n-1);
}
int main() {
const int n = 4;
int size = 1;
for (int i = 0; i < n; ++i) size *= 2;
std::vector<int> d(size, 0);
for (int i = 0; i < size; ++i) {
d[i] = f(i, n);
}
for (int i = 0; i < size; ++i) {
std::cout << i << " : " << d[i] << '\n';
}
return 0;
}
As mentioned above, the recursive approach allows O(N) complexity, provided that memoization is implemented.
Another possibility is a simple iterative approach, which also reaches this O(N) complexity
(here N represents the total number of data elements).
#include <iostream>
#include <vector>
int up_down (int b) {
if (b) return 1;
return 0;
}
int main() {
const int n = 4;
int size = 1;
for (int i = 0; i < n; ++i) size *= 2;
std::vector<int> d(size, 0);
int size_block = 1;
for (int i = 0; i < n; ++i) {
for (int j = size_block-1; j >= 0; --j) {
d[2*j+1] = d[j] + up_down(1);
d[2*j] = d[j] + up_down(0);
}
size_block *= 2;
}
for (int i = 0; i < size; ++i) {
std::cout << i << " : " << d[i] << '\n';
}
return 0;
}
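The memoized recursion mentioned above can also be written bottom-up in a single pass, because d[i / 2] has already been computed when d[i] is needed. This is only a sketch: fill_table is a hypothetical name, and it relies on up_down(0) == 0, as in the code above, so that leading zero bits contribute nothing.

```cpp
#include <cstddef>
#include <vector>

// Assumed to match the up_down used above: contribution of a single bit.
int up_down(int b) {
    return b ? 1 : 0;
}

// d[i] = up_down(lowest bit of i) + d[i / 2], starting from d[0] = 0.
// Valid only because up_down(0) == 0. Each entry is computed once from an
// earlier entry, so the whole pass is O(N) with N = 2^n.
std::vector<int> fill_table(int n) {
    const std::size_t size = std::size_t{1} << n;
    std::vector<int> d(size, 0);
    for (std::size_t i = 1; i < size; ++i) {
        d[i] = up_down(static_cast<int>(i % 2)) + d[i / 2];
    }
    return d;
}
```

For n = 4 this produces the same sixteen values as the result listing above.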
You can refactor your code slightly like this:
for(int in=0;in<2;in++) {
auto& dn = d[in];
auto updown_n = updown(in);
for(int in_1=0;in_1<2;in_1++) {
// dn_1 == d[in][in_1]
auto& dn_1 = dn[in_1];
// updown_n_1 == updown(in)+updown(in_1)
auto updown_n_1 = updown_n + updown(in_1);
for(int in_2=0;in_2<2;in_2++) {
// dn_2 == d[in][in_1][in_2]
auto& dn_2 = dn_1[in_2];
// updown_n_2 == updown(in)+updown(in_1)+updown(in_2)
auto updown_n_2 = updown_n_1 + updown(in_2);
.
.
.
for(int i2=0;i2<2;i2++) {
// d2 == d[in][in_1][in_2]...[i2]
auto& d2 = d3[i2];
// updown_2 = updown(in)+updown(in_1)+updown(in_2)+...+updown(i2)
auto updown_2 = updown_3 + updown(i2);
for(int i1=0;i1<2;i1++) {
// d1 == d[in][in_1][in_2]...[i2][i1]
auto& d1 = d2[i1];
// updown_1 = updown(in)+updown(in_1)+updown(in_2)+...+updown(i2)+updown(i1)
auto updown_1 = updown_2 + updown(i1);
// d[in][in_1][in_2]...[i2][i1] = updown(in)+updown(in_1)+...+updown(i1);
d1 = updown_1;
}
}
}
}
}
And make this into a recursive function now:
// The two-argument overloads come first: each call below only sees
// overloads that have already been declared (there is no ADL for built-in types).
template<std::size_t N, typename T, typename U>
typename std::enable_if<N == 0>::type loop(T& d, U updown_result) {
d = updown_result;
}
template<std::size_t N, typename T, typename U>
typename std::enable_if<N != 0>::type loop(T& d, U updown_result) {
for (int i = 0; i < 2; ++i) {
loop<N-1>(d[i], updown_result + updown(i));
}
}
// Entry point: N is the number of dimensions of d.
template<std::size_t N, typename T>
void loop(T& d) {
for (int i = 0; i < 2; ++i) {
loop<N-1>(d[i], updown(i));
}
}
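A usage sketch for a three-level array follows. It assumes that updown simply returns its argument (the question does not show it) and declares the overloads before they are used; fill_cube is a hypothetical helper name.

```cpp
#include <cstddef>
#include <type_traits>

// Assumption: the question does not show updown, so this stand-in returns the bit.
int updown(int b) { return b; }

// Innermost level: store the accumulated sum.
template <std::size_t N, typename T, typename U>
typename std::enable_if<N == 0>::type loop(T& d, U updown_result) {
    d = updown_result;
}

// Intermediate levels: iterate over {0, 1} and descend one dimension.
template <std::size_t N, typename T, typename U>
typename std::enable_if<N != 0>::type loop(T& d, U updown_result) {
    for (int i = 0; i < 2; ++i) {
        loop<N - 1>(d[i], updown_result + updown(i));
    }
}

// Entry point: N is the number of dimensions of d.
template <std::size_t N, typename T>
void loop(T& d) {
    for (int i = 0; i < 2; ++i) {
        loop<N - 1>(d[i], updown(i));
    }
}

// Fills d[a][b][c] with updown(a) + updown(b) + updown(c).
void fill_cube(int (&d)[2][2][2]) {
    loop<3>(d);
}
```

The number passed as N has to match the number of array dimensions; a mismatch shows up as a compile error when d[i] or the final assignment does not type-check.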
If your type is int d[2][2][2]...[2][2]; or int*****... d;, you can also stop when the type isn't an array or pointer instead of manually specifying N (or change for whatever the type of d[0][0][0]...[0][0] is)
Here's a version that does that with a recursive lambda:
auto loop = [](auto& self, auto& d, auto updown_result) -> void {
using d_t = typename std::remove_cv<typename std::remove_reference<decltype(d)>::type>::type;
if constexpr (!std::is_array<d_t>::value && !std::is_pointer<d_t>::value) {
// Last level of nesting
d = updown_result;
} else {
for (int i = 0; i < 2; ++i) {
self(self, d[i], updown_result + updown(i));
}
}
};
for (int i = 0; i < 2; ++i) {
loop(loop, d[i], updown(i));
}
I am assuming that it is a multi-dimensional matrix. You may have to solve it mathematically first and then write the respective equations in the program.

Fast generation of number combinations for variable nested for loops

I have the following code, that generates a combination of numbers/indices for variable nested for loops
#include <iostream>
#include <array>
#include <algorithm> // std::fill
template<size_t ... Rest>
inline void index_generator() {
constexpr int size = sizeof...(Rest);
std::array<int,size> maxes = {Rest...};
std::array<int,size> a;
int i,j;
std::fill(a.begin(),a.end(),0);
while(1)
{
for(i = 0; i<size; i++) {
std::cout << a[i] << " ";
}
std::cout << "\n";
for(j = size-1 ; j>=0 ; j--)
{
if(++a[j]<maxes[j])
break;
else
a[j]=0;
}
if(j<0)
break;
}
}
int main()
{
index_generator<2,3,3>();
return 0;
}
which outputs the following
0 0 0
0 0 1
0 0 2
0 1 0
0 1 1
0 1 2
0 2 0
0 2 1
0 2 2
1 0 0
1 0 1
1 0 2
1 1 0
1 1 1
1 1 2
1 2 0
1 2 1
1 2 2
this is indeed equivalent to having
for (int i=0; i<2; ++i)
for (int j=0; j<3; ++j)
for (int k=0; k<3; ++k)
I can generate the equivalent of any number of nested for loops using the above method; however, I have noticed that as the number of loops increases this code becomes slower and slower compared to its equivalent counterpart (i.e. nested for loops). I have checked with both gcc 5.3 and clang 3.8. Maybe this is due to the processor having a hard time predicting the branch in while(true), or maybe it's something else.
What I typically do in the innermost loop is access the data of two arrays and multiply them, something like c_ptr[idx] += a_ptr[idx]*b_ptr[idx]. Since the indices generated by the nested for loops and by the above technique are the same, the memory access pattern remains the same. So I am quite sure this is not a cache miss/hit problem as far as data access is concerned.
So my question is:
Is there a way to generate these combination/indices as fast as the nested for loop style code or potentially even faster?
Since we know the number of for loops to set up and the indices of the for loop are known at compile time, can better optimisation opportunities not be exploited? SIMD for instance?
You can generate it with a single loop of the multiplication of all the dimensions and use modulo for the final indices.
#include <iostream>
#include <array>
template<size_t ... Rest>
inline void index_generator( ) {
constexpr int size = sizeof...( Rest );
std::array<int, size> maxes = { Rest... };
int total = 1;
for (int i = 0; i<size; ++i) {
total *= maxes[i];
}
for (int i = 0; i < total; ++i) {
int remaining = total;
for (int n = 0; n < size; ++n) {
remaining /= maxes[n];
std::cout << ( i / remaining ) % maxes[n] << " ";
}
std::cout << std::endl;
}
}
Or just generate recursive templates to actually produce nested loops and let the compiler optimize it for you. It depends on the actual usage of the indices. Right now your function is not too useful.
EDIT:
I benchmarked the three solutions: the first is the one from the question, the second is mine without the arrays, and the third uses recursive templates. The last one has the drawback that it's a bit harder to access the actual indices for use, but not impossible. I also had to add a sum calculation so that the loops are not optimized out, and had to remove the console output to reduce its effect on the benchmark. The results are from my i7 machine in release mode (VS 2015 Community) with the setup given below. LOG and PROFILE_SCOPE are my own macros.
#include <array>
#include <algorithm> // std::fill
// Original from the question
template<size_t ... Rest>
inline void index_generator1( ) {
constexpr int size = sizeof...( Rest );
std::array<int, size> maxes = { Rest... };
std::array<int, size> a;
int i, j;
std::fill( a.begin( ), a.end( ), 0 );
int x = 0;
while (1) {
for (i = 0; i < size; i++) {
x += a[i];
}
for (j = size - 1; j >= 0; j--) {
if (++a[j] < maxes[j])
break;
else
a[j] = 0;
}
if (j < 0)
break;
}
LOG( x )
}
// Initial try
template<size_t ... Rest>
inline void index_generator2( ) {
constexpr int size = sizeof...( Rest );
int x = 0;
std::array<int, size> maxes = { Rest... };
int total = 1;
for (int i = 0; i < size; ++i) {
total *= maxes[i];
}
for (int i = 0; i < total; ++i) {
int remaining = total;
for (int n = 0; n < size; ++n) {
remaining /= maxes[n];
x += ( i / remaining ) % maxes[n];
}
}
LOG(x)
}
// Recursive templates
template <int... Args>
struct Impl;
template <int First, int... Args>
struct Impl<First, Args...>
{
static int Do( int sum )
{
int x = 0;
for (int i = 0; i < First; ++i) {
x += Impl<Args...>::Do( sum + i );
}
return x;
}
};
template <>
struct Impl<>
{
static int Do( int sum )
{
return sum;
}
};
template <int... Args>
void index_generator3( )
{
LOG( Impl<Args...>::Do( 0 ) );
}
Executed code
{
PROFILE_SCOPE( Index1 )
index_generator1<200, 3, 400, 20>( );
}
{
PROFILE_SCOPE( Index2 )
index_generator2<200, 3, 400, 20>( );
}
{
PROFILE_SCOPE( Index3 )
index_generator3<200, 3, 400, 20>( );
}
Result in console:
[19:35:50]: 1485600000
[19:35:50]: 1485600000
[19:35:50]: 1485600000
[19:35:56]: PerCall(ms)
[19:35:56]: Index1 10.4016
[19:35:56]: Index2 75.3770
[19:35:56]: Index3 4.2299
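On the point that the recursive-template version makes the indices harder to reach: one option is to carry them along as a parameter pack and hand them to a callable at the innermost level. The following is a sketch only, with hypothetical names NestedFor, run and sum_of_indices; it was not part of the benchmark above, so no timing claim is attached to it.

```cpp
#include <cstddef>

// Primary template; the specializations below peel off one extent at a time.
template <std::size_t... Extents>
struct NestedFor;

// No extents left: all indices are known, so call the body with them.
template <>
struct NestedFor<> {
    template <class F, class... Idx>
    static void run(F& f, Idx... idx) {
        f(idx...);
    }
};

// One loop per extent; the current index is appended to the pack.
template <std::size_t First, std::size_t... Rest>
struct NestedFor<First, Rest...> {
    template <class F, class... Idx>
    static void run(F& f, Idx... idx) {
        for (std::size_t i = 0; i < First; ++i) {
            NestedFor<Rest...>::run(f, idx..., i);
        }
    }
};

// Example body: sums i + j + k over the 2 x 3 x 3 index space.
std::size_t sum_of_indices() {
    std::size_t total = 0;
    auto body = [&total](std::size_t i, std::size_t j, std::size_t k) {
        total += i + j + k;
    };
    NestedFor<2, 3, 3>::run(body);
    return total;
}
```

The body receives the indices in the same order as the nested loops would declare them, outermost first.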

Avoiding code duplication when the only difference is loop control statements (with the same statements in loop bodies)?

In my solution code for Project Euler problem 11, I have the following functions. Max_consecutive_prod is a class which calculates the maximum product of consecutive input()ed numbers, generalised from problem 8. The six functions calculate the maximum product along series in different directions, starting from different edges of the grid.
The only difference between these functions is the indices in the for statements; how can I eliminate the obvious duplication? The situation here is somewhat the opposite of the typical application of the template method pattern: the operation is identical but the control framework differs. Is there another design pattern for this?
Edit: all the modifications specified in comments are to the (two) for statements, and the loop body in each function is identical to the first.
template <size_t size> unsigned process_row(const unsigned (&grid)[size][size])
{
unsigned prodMax = 0;
for (int i = 0; i < size; ++i)
{
Max_consecutive_prod mcp;
for (int j = 0; j < size; ++j)
{
mcp.input(grid[i][j]);
}
if (mcp.result() > prodMax)
{
prodMax = mcp.result();
}
}
return prodMax;
}
// exchange i, j in process_row
template <size_t size> unsigned process_col(const unsigned (&grid)[size][size])
{
// ...
}
template <size_t size> unsigned process_diag_lower(const unsigned (&grid)[size][size])
{
unsigned prodMax = 0;
for (int init = 0; init < size; ++init)
{
Max_consecutive_prod mcp;
for (int i = init, j = 0; i < size && j < size; ++i, ++j)
// ...
// ...
}
return prodMax;
}
// exchange i, j in process_diag_lower
template <size_t size> unsigned process_diag_upper(const unsigned (&grid)[size][size])
{
// ...
}
// flip j in process_diag_lower
template <size_t size> unsigned process_rev_diag_lower(const unsigned (&grid)[size][size])
{
unsigned prodMax = 0;
for (int init = 0; init < size; ++init)
{
Max_consecutive_prod mcp;
for (int i = init, j = size-1; i < size && j >= 0; ++i, --j)
// ...
// ...
}
return prodMax;
}
// change ++j in process_diag_upper to --j
template <size_t size> unsigned process_rev_diag_upper(const unsigned (&grid)[size][size])
{
unsigned prodMax = 0;
for (int init = 0; init < size; ++init)
{
Max_consecutive_prod mcp;
for (int j = init, i = 0; j >=0 && i < size; ++i, --j)
// ...
// ...
}
return prodMax;
}
Based on random-hacker's code, which shows the real commonality and variability in the control flow of the six functions, I wrote my own version and made the code more self-explanatory and idiomatic C++, using a strategy class and defining local variables to clarify the code and improve efficiency. I define a non-template version of process() to avoid binary code bloat when instantiating for different sizes (see 'Effective C++', Item 44).
If you are still confused, please read random-hacker's answer for an explanation. :)
namespace Grid_search
{
enum Step { neg = -1, nul, pos };
enum Index_t { i, j };
struct Strategy
{
Step direction[2];
Index_t varOuter;
};
const size_t typeCount = 6;
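// For rows and columns the outer loop must vary the index whose step is nul;
// otherwise only the last row or column would be scanned.
const Strategy strategy[typeCount] = { {{pos, nul}, j}, {{nul, pos}, i}, {{pos, pos}, i}, {{pos, pos}, j}, {{pos, neg}, i}, {{pos, neg}, j} };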
};
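// Declared first so that the template wrapper below can call the non-template overload.
unsigned process(const Grid_search::Strategy& strategy, const unsigned* grid, size_t size);
template <size_t size> inline unsigned process(const Grid_search::Strategy& strategy, const unsigned (&grid)[size][size])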
{
return process(strategy, reinterpret_cast<const unsigned*>(&grid), size);
}
unsigned process(const Grid_search::Strategy& strategy, const unsigned* grid, size_t size)
{
using namespace Grid_search;
const Index_t varOuter = strategy.varOuter, varInner = static_cast<Index_t>(!varOuter);
const Step di = strategy.direction[i], dj = strategy.direction[j];
const unsigned initInner = strategy.direction[varInner] == pos ? 0 : size -1;
unsigned prodMax = 0;
unsigned index[2];
unsigned &indexI = index[i], &indexJ = index[j];
for (unsigned initOuter = 0; initOuter < size; ++initOuter)
{
Max_consecutive_prod mcp;
for (index[varOuter] = initOuter, index[varInner] = initInner;
0 <= indexI && indexI < size && 0 <= indexJ && indexJ < size;
indexI += di, indexJ += dj)
{
mcp.input(grid[indexI*size + indexJ]);
if (mcp.result() > prodMax)
{
prodMax = mcp.result();
}
}
}
return prodMax;
}
int main()
{
static const size_t N = 20;
unsigned grid[N][N];
std::ifstream input("d:/pro11.txt");
for (int count = 0; input >> grid[count/N][count%N]; ++count)
{
}
unsigned prodMax = 0;
for (int i = 0; i < Grid_search::typeCount; ++i)
{
unsigned prod = process(Grid_search::strategy[i], grid);
if (prod > prodMax)
{
prodMax = prod;
}
}
}
Although I think what you already have will be fine after moving the inner loop code blocks into an ordinary function as suggested by Adam Burry and Tony D, if you want you can combine the loops, using tables to encode the possible directions to move in. The trick is to use an array p[2] instead of separate i and j, so that the question of which index is varied in the outer loop can be driven by a table. Then the only tricky thing is making sure that the other index, which is varied in the inner loop, starts at its maximum value (instead of 0) iff it is decremented at each step:
enum indices { I, J }; // Can just use 0 and 1 if you want
template <size_t size> unsigned process(const unsigned (&grid)[size][size]) {
static int d[][2] = { {1, 0}, {0, 1}, {1, 1}, {1, -1}, {1, 1}, {1, -1} };
static int w[] = { J, I, J, J, I, I };
unsigned prodMax = 0; // Note: not 1
for (int k = 0; k < sizeof d / sizeof d[0]; ++k) { // For each direction
for (int init = 0; init < size; ++init) {
Max_consecutive_prod mcp;
int p[2]; // p[I] is like i, p[J] is like j
for (p[w[k]] = init, p[!w[k]] = (d[k][!w[k]] == -1 ? size - 1 : 0);
min(p[I], p[J]) >= 0 && max(p[I], p[J]) < size;
p[I] += d[k][I], p[J] += d[k][J])
{
mcp.input(grid[p[I]][p[J]]);
prodMax = max(prodMax, mcp.result());
}
}
}
return prodMax;
}
You could create an enum for the different states and then pass it into the function. You would then create an if statement that would set the values based on the passed value.
Your process_row() has a bug: from the example in the link, zero entries are allowed in the matrix, so if a row begins with e.g.
x y z 0 ...
and any of x, xy or xyz is larger than all other 4-element products on the rest of that row and on any other row in the matrix, it will incorrectly report that this is the largest 4-element product. (I'm assuming here that Max_consecutive_prod calculates a rolling product of the last 4 elements provided with input().)
Unless your Max_consecutive_prod is unusually aware of how it is being called, you will also get erroneous results "wrapping" from the end of one row to the next, and from one process_...() call to the next.
Suppose you flattened the grid so that it was just 400 numbers in a row, reading left to right and then top to bottom. The topmost row would consist of the first 20 numbers (that is, indices 0, ..., 19); the second row of the next 20 numbers, etc. In general, row i (starting from 0) would correspond to indices i*20, i*20 + 1, i*20 + 2, ..., i*20 + 19.
Now, what about columns? The leftmost column starts at position 0, just like the topmost row. Its next element is at position 20 (the first element in the second row), and then 40, and so on. So it's not hard to see that the indices for column j are j, j + 20, j + 40, ..., j + 19*20.
Diagonals are not much different. Try it on paper (grid-ruled paper is good for this sort of thing.)
One more hint: Does it make a difference if you find the product of four elements, multiplying left-to-right, than the same four elements multiplying right-to-left?
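The hints above can be sketched in code as follows. This is a simplified illustration rather than the asker's Max_consecutive_prod approach: it multiplies each window of four directly, and the names max_product_of_four, grid and n are hypothetical. Each direction is a (row step, column step) pair, and cell (r, c) lives at index r*n + c of the flattened grid.

```cpp
#include <cstddef>
#include <vector>

// Largest product of four adjacent cells (row, column or either diagonal)
// in an n x n grid stored row by row in a flat vector.
unsigned long long max_product_of_four(const std::vector<unsigned>& grid, int n) {
    // Right, down, down-right, down-left. The opposite directions give the
    // same products, so four directions are enough.
    const int steps[4][2] = {{0, 1}, {1, 0}, {1, 1}, {1, -1}};
    unsigned long long best = 0;
    for (const auto& s : steps) {
        for (int r = 0; r < n; ++r) {
            for (int c = 0; c < n; ++c) {
                const int end_r = r + 3 * s[0];
                const int end_c = c + 3 * s[1];
                // Skip windows that would leave the grid (no wrap-around).
                if (end_r >= n || end_c < 0 || end_c >= n) continue;
                unsigned long long prod = 1;
                for (int k = 0; k < 4; ++k) {
                    const int rr = r + k * s[0];
                    const int cc = c + k * s[1];
                    prod *= grid[static_cast<std::size_t>(rr * n + cc)];
                }
                if (prod > best) best = prod;
            }
        }
    }
    return best;
}
```

Checking the window's end cell before multiplying is what prevents the "wrapping" from one row to the next that the flattened layout would otherwise allow.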
First, the Context object approach - this just packages the arguments to the support functions mentioned in my comment on your question... it's about as useful as the problem was significant ;-].
struct Context
{
unsigned& prodMax;
int i, j;
Max_consecutive_prod mcp;
Context(unsigned& prodMax) : prodMax(prodMax) { }
};
template <size_t size> unsigned process_diag_lower(const unsigned (&grid)[size][size])
{
unsigned prodMax = 0;
for (int init = 0; init < size; ++init)
{
Context context(prodMax);
for (context.i = init, context.j = 0; context.i < size && context.j < size; ++context.i, ++context.j)
loop_ij(context);
loop_outer(context);
}
return prodMax;
}
Visitor pattern. Now, I said in my comment "you don't show us enough loop bodies to see the common requirements", and haven't seen anything since, so on the basis of the one body I've seen - namely:
template <size_t size> unsigned process_row(const unsigned (&grid)[size][size])
{
unsigned prodMax = 0;
for (int i = 0; i < size; ++i)
{
Max_consecutive_prod mcp;
for (int j = 0; j < size; ++j)
{
mcp.input(grid[i][j]);
}
if (mcp.result() > prodMax)
{
prodMax = mcp.result();
}
}
return prodMax;
}
The above can be split:
template <size_t size, typename Visitor>
unsigned visit_row(const unsigned (&grid)[size][size], Visitor& visitor)
{
for (int i = 0; i < size; ++i)
{
for (int j = 0; j < size; ++j)
visitor.inner(grid[i][j]);
visitor.outer();
}
return visitor.result();
}
struct Visitor
{
unsigned prodMax;
Max_consecutive_prod mcp;
Visitor() : prodMax(0) { }
void inner(unsigned n) { mcp.input(n); }
void outer()
{
if (mcp.result() > prodMax) prodMax = mcp.result();
mcp = Max_consecutive_prod(); // reset for next time...
}
unsigned result() const { return prodMax; }
};
This way, the same Visitor class can be combined with your various grid-element iteration routines.