In a function, I have several consecutive for loops with the same code but different initial values for the control variable. The initial values are obtained from inputs to the function. I.e.,
void thisFunction( class A a){
//some other code
for (int i = a.x; i != 0; --i){
code
}
for (int i = a.y; i != 0; --i){
code
}
for (int i = a.z; i != 0; --i){
code
}
//some other code
}
Is there any way to condense all the for loops into one loop so that when I make changes to the code within the loop, I won't have to change it for all three loops? An alternative is to write anotherFunction() with the initial values as input, but I need access to local variables in thisFunction().
void anotherFunction(int in){
for (int i = in; i != 0; --i){
code
}
}
So is there another way to condense the loops?
Thank you.
Your hunch is right: you'll have to refactor your code into a separate function. Local variables of thisFunction will become arguments to anotherFunction; often these will be passed by reference.
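A minimal sketch of that refactoring, under some assumptions that are not in the question: A is taken to have int members x, y and z, the loop body is taken to need a single local (here called total), and `total += i` is only a placeholder for the real body. As in the original loops, the start values must be non-negative.

```cpp
// Hypothetical stand-in for the asker's class A.
struct A {
    int x, y, z;
};

// The shared loop lives here. The local it needs from thisFunction is
// passed by reference, so changes are visible to the caller.
void anotherFunction(int start, int& total) {
    for (int i = start; i != 0; --i) {
        total += i;  // placeholder for the real loop body
    }
}

int thisFunction(A a) {
    int total = 0;  // local variable the loop body needs
    anotherFunction(a.x, total);
    anotherFunction(a.y, total);
    anotherFunction(a.z, total);
    return total;
}
```

If the body needs many locals, the parameter list grows quickly, which is where grouping them in a struct starts to pay off.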
EDIT: In most cases you should avoid doing this, and rather follow MSalter's answer. Just because you can, doesn't mean you should.
I am not sure how good an idea this is, but without any more context, a simple solution could be:
int starts[3] = { a.x, a.y, a.z };
for ( int var = 0; var < 3; ++var ) {
for ( int i = starts[var]; i != 0; --i ) {
// code
}
}
Note that the values of the conditions are obtained once at the beginning of the function, that means that if the object a changes throughout the first loop, that change will not be visible in the later control loops. If you need that, the solution can be modified to store pointers.
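A sketch of the pointer variant, assuming A has int members x, y and z. The loop body here is invented purely to show the effect: it sets a.z to 0, so the third loop sees the updated value and does not run at all.

```cpp
// Hypothetical stand-in for the asker's class A.
struct A {
    int x, y, z;
};

int thisFunction(A a) {
    int total = 0;
    // Store addresses rather than values, so each start value is read
    // only when its own loop begins.
    const int* starts[] = { &a.x, &a.y, &a.z };
    for (const int* start : starts) {
        for (int i = *start; i != 0; --i) {
            total += i;  // placeholder for the real loop body
            a.z = 0;     // hypothetical side effect on the object
        }
    }
    return total;
}
```

With the value-copying version, the same input A{1, 2, 3} would still run the third loop three times, because a.z was copied before any loop ran.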
EDIT: There is a comment suggesting the use of the size of the array. I have left the code above unchanged, but a better way of getting the array size is:
template<typename T, unsigned int N>
inline unsigned int array_size( T (&)[N] ) {
return N;
}
//...
for ( unsigned int var = 0; var < array_size(starts); ++var ) {
Using lambda expressions, you could do this:
void thisFunction( class A a)
{
//some other code
auto code = [&] (int i)
{
// your code here
};
for (int i = a.x; i != 0; --i)
code(i);
for (int i = a.y; i != 0; --i)
code(i);
for (int i = a.z; i != 0; --i)
code(i);
}
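You can use a macro:
// Names that start with an underscore and a capital letter are reserved,
// so the macro uses plain names.
#define ITERATE(start) \
{\
for (int iter_i = (start); iter_i != 0; --iter_i){\
code \
}\
}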
Then you have access to local variables, and you only have to manage one copy of the code.
So, your code becomes
void thisFunction( class A a){
//some other code
ITERATE(a.x)
ITERATE(a.y)
ITERATE(a.z)
//some other code
}
This may become difficult to debug, so make sure the code works well before you turn it into a macro.
Not really. The loops will all be running for different numbers of iterations, surely?
The best you can do is factor out as much of the internals as possible, for instance, into a function like you suggested. Alternatively, have the loop outside the function.
To avoid passing in a load of variables independently, you could consider wrapping them in a class or struct.
Update: On second thoughts, David Rodriguez's suggestion is a pretty good solution!
I recently started learning C++ and ran into problems with this task:
I am given 4 arrays of different lengths with different values.
vector<int> A = {1,2,3,4};
vector<int> B = {1,3,44};
vector<int> C = {1,23};
vector<int> D = {0,2,5,4};
I need to implement a function that goes through all possible combinations of the elements of these vectors and checks whether there are values a from array A, b from array B, c from array C and d from array D whose sum is 0 (a+b+c+d=0).
I wrote such a program, but it outputs 1, although the desired combination does not exist.
using namespace std;
vector<int> test;
int sum (vector<int> v){
int sum_of_elements = 0;
for (int i = 0; i < v.size(); i++){
sum_of_elements += v[i];
}
return sum_of_elements;
}
bool abcd0(vector<int> A,vector<int> B,vector<int> C,vector<int> D){
for ( int ai = 0; ai <= A.size(); ai++){
test[0] = A[ai];
for ( int bi = 0; bi <= B.size(); bi++){
test[1] = B[bi];
for ( int ci = 0; ci <= C.size(); ci++){
test[2] = C[ci];
for ( int di = 0; di <= D.size(); di++){
test[3] = D[di];
if (sum (test) == 0){
return true;
}
}
}
}
}
}
I would be happy if you could explain what the problem is
Vectors don't increase their size by themselves. You either need to construct the vector with the right size, resize it, or push_back elements (you can also insert, but vectors aren't very efficient at that). In your code you never add any element to test, so accessing any element, e.g. test[0] = A[ai];, causes undefined behavior.
Further, valid indices are [0, size()) (i.e. size() is excluded; it is not a valid index). Hence your loops access the input vectors out of bounds, causing undefined behavior again. The loop conditions should be for ( int ai = 0; ai < A.size(); ai++){.
Not returning something from a non-void function is again undefined behavior. When your abcd0 does not find a combination that adds up to 0 it does not return anything.
After fixing those issues your code does produce the expected output: https://godbolt.org/z/KvW1nePMh.
However, I suggest that you:
do not use global variables. They make the code difficult to reason about. For example, we need to see all your code to know whether you actually resize test. If test were local to abcd0, we would only need to consider that function to know what happens to test.
read about Why is “using namespace std;” considered bad practice?
do not pass parameters by value when you can pass them by const reference to avoid unnecessary copies.
use range-based for loops, which help to avoid mistakes with the bounds.
Trying to change not more than necessary, your code could look like this:
#include <vector>
#include <iostream>
int sum (const std::vector<int>& v){
int sum_of_elements = 0;
for (int i = 0; i < v.size(); i++){
sum_of_elements += v[i];
}
return sum_of_elements;
}
bool abcd0(const std::vector<int>& A,
const std::vector<int>& B,
const std::vector<int>& C,
const std::vector<int>& D){
for (const auto& a : A){
for (const auto& b : B){
for (const auto& c : C){
for (const auto& d : D){
if (sum ({a,b,c,d}) == 0){
return true;
}
}
}
}
}
return false;
}
int main() {
std::vector<int> A = {1,2,3,4};
std::vector<int> B = {1,3,44};
std::vector<int> C = {1,23};
std::vector<int> D = {0,2,5,4};
std::cout << abcd0(A,B,C,D);
}
Note that I removed the vector test completely. You don't need to construct it explicitly, but you can pass a temporary to sum. sum could use std::accumulate, or you could simply add the four numbers directly in abcd0. I suppose this is for exercise, so let's leave it at that.
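For completeness, a short sketch of the two alternatives just mentioned: sum written with std::accumulate, and abcd0 adding the four numbers directly so that no temporary vector is built. The function names are kept from the code above; integer overflow is ignored here, as it is in the original.

```cpp
#include <numeric>
#include <vector>

// sum expressed with std::accumulate; the third argument is the initial value.
int sum(const std::vector<int>& v) {
    return std::accumulate(v.begin(), v.end(), 0);
}

// Same search as before, adding the four values directly.
bool abcd0(const std::vector<int>& A, const std::vector<int>& B,
           const std::vector<int>& C, const std::vector<int>& D) {
    for (int a : A)
        for (int b : B)
            for (int c : C)
                for (int d : D)
                    if (a + b + c + d == 0)
                        return true;
    return false;
}
```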
Edit : The answer written by #463035818_is_not_a_number is the answer you should refer to.
As mentioned in the comments by #Alan Birtles, there's nothing in that code that adds elements to test. Also, as mentioned in comments by #PaulMcKenzie, the loop conditions should be modified. Currently, each loop runs all the way up to the size of the vector, which is invalid (since the index runs from 0 to the size of the vector minus 1). To implement the algorithm that you have in mind (as I inferred from your code), you can declare and initialise the vector inside the 4th loop.
Here's the modified code,
int sum (vector<int> v){
int sum_of_elements = 0;
for (int i = 0; i < v.size(); i++){
sum_of_elements += v[i];
}
return sum_of_elements;
}
bool abcd0(vector<int> A,vector<int> B,vector<int> C,vector<int> D){
for ( int ai = 0; ai <A.size(); ai++){
for ( int bi = 0; bi <B.size(); bi++){
for ( int ci = 0; ci <C.size(); ci++){
for ( int di = 0; di <D.size(); di++){
vector<int> test = {A[ai], B[bi], C[ci], D[di]};
if (sum (test) == 0){
return true;
}
}
}
}
}
return false;
}
The algorithm is inefficient, though. You can try sorting the vectors first: loop through the first two of them while using the two-pointer technique to check whether the desired sum is available from the remaining two vectors.
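One possible reading of that suggestion, as a sketch rather than a definitive implementation: only C and D need to be sorted, and for every pair (a, b) two indices walk C upwards and D downwards looking for a pair that sums to -(a+b). The name abcd0Sorted is made up for this example, and long long is used for the sums only to sidestep int overflow.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

bool abcd0Sorted(const std::vector<int>& A, const std::vector<int>& B,
                 std::vector<int> C, std::vector<int> D) {
    // C and D are taken by value so the caller's vectors stay untouched.
    std::sort(C.begin(), C.end());
    std::sort(D.begin(), D.end());
    for (int a : A) {
        for (int b : B) {
            const long long target = -(static_cast<long long>(a) + b);
            std::size_t lo = 0;         // index into C, moving up
            std::size_t hi = D.size();  // D[hi - 1] is the current element, moving down
            while (lo < C.size() && hi > 0) {
                const long long s = static_cast<long long>(C[lo]) + D[hi - 1];
                if (s == target) return true;
                if (s < target) ++lo;   // need a larger sum
                else --hi;              // need a smaller sum
            }
        }
    }
    return false;
}
```

This does roughly |A|·|B|·(|C|+|D|) steps after sorting, instead of |A|·|B|·|C|·|D| for the four nested loops.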
It looks to me, like you're calling the function every time you want to check an array. Within the function you're initiating int sum_of_elements = 0;.
So at the first run, you're starting with int sum_of_elements = 0;.
It finds the element and increases sum_of_elements up to 1.
Second run you're calling the function and it initiates again with int sum_of_elements = 0;.
This is repeated every time you're checking the next array for the element.
Let me know if I understood that correctly (didn't run it, just skimmed a bit).
I implemented some algorithm where the main data structure is a tree. I use a class to represent a node and a class to represent a tree. Because the nodes get updated a lot, I call many setters and getters.
Because I have heard many times that function calls are expensive, I was thinking that maybe if I represented the nodes and the tree using structs, it would make my algorithm more efficient in practice.
Before doing so I decided to run a small experiment to see if this is actually the case.
I created a class that had one private variable, a setter and a getter. Also I created a struct that had one variable as well, without setters/getters since we can just update the variable by calling struct.varName. Here are the results:
The number of runs is just how many times we call the setter/getter. Here is the code of the experiment:
#include <iostream>
#include <fstream>
#include <ctime> // timespec, clock_gettime (POSIX)
#define BILLION 1000000000LL
using namespace std;
class foo{
private:
int a;
public:
void set(int newA){
a = newA;
}
int get(){
return a;
}
};
struct bar{
int a;
};
timespec startT, endT;
void startTimer(){
clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &startT);
}
double endTimer(){
clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &endT);
return endT.tv_sec * BILLION + endT.tv_nsec - (startT.tv_sec * BILLION + startT.tv_nsec);
}
int main() {
int runs = 10000000;
int startRun = 10000;
int step = 10000;
int iterations = 10;
int res = 0;
foo f;
ofstream fout;
fout.open("stats.txt", ios_base::out);
fout<<"alg\truns\ttime"<<endl;
cout<<"First experiment progress: "<<endl;
int cnt = 0;
for(int run = startRun; run <= runs; run += step){
double curTime = 0.0;
for(int iter = 0; iter < iterations; iter++) {
startTimer();
for (int i = 1; i <= run; i++) {
f.set(i);
res += f.get();
}
curTime += endTimer()/iterations;
cnt++;
if(cnt%10 == 0)
cout<<cnt/(((double)runs-startRun+1)/step*iterations)*100<<"%\r";
}
fout<<"class\t"<<run<<"\t"<<curTime/BILLION<<endl;
}
int res2 = 0;
bar b;
cout<<"Second experiment progress: "<<endl;
cnt = 0;
for(int run = startRun; run <= runs; run += step){
double curTime = 0.0;
for(int iter = 0; iter < iterations; iter++) {
startTimer();
for (int i = 1; i <= run; i++) {
b.a = i;
res2 += b.a;
}
curTime += endTimer()/iterations;
cnt++;
if(cnt%10 == 0)
cout<<cnt/(((double)runs-startRun+1)/step*iterations)*100<<"%\r";
}
fout<<"struct\t"<<run<<"\t"<<curTime/BILLION<<endl;
}
fout.close();
cout<<res<<endl;
cout<<res2<<endl;
return 0;
}
I don't understand why I get this behaviour. I thought that function calls were more expensive?
EDIT: I reran the same experiment without -O3
EDIT: OK this is very surprising, by declaring the class in a separate file called foo.h, implementing the getters/setters in foo.cpp and running with -O3, it seems that the class becomes even more inefficient.
I have heard many times that function calls are expensive.
Was this in 1970 by any chance?
Compilers are smart. Very smart. They produce the best program they can given your source code, and unless you're doing something very weird, these sorts of design changes are unlikely to make much (if any) performance difference.
Most notably here, a simple getter/setter can even be completely inlined in most cases (unless you're doing something weird), making your two programs effectively the same once compiled! You can see this result on your graph.
Meanwhile, the specific change of replacing class with struct has no effect on performance whatsoever - both keywords define a class.
I don't understand why I get this behaviour. I thought that function calls were more expensive?
See, this is why we don't prematurely optimise. Write clear, easy-to-read code without tricks and let your compiler take care of the rest. That's its job, and it's generally very good at it.
The answer here is almost certainly compiler optimization. First of all, defining your getters and setters in the class definition makes them inline. Even if you didn't do that, though, I'd expect any modern compiler to optimize away the function calls if they're in the same file and the compiler knows the resultant object is the whole program.
Here is a simple question I have been wondering about for a long time :
When I do a loop such as this one :
for (int i = 0; i < myVector.size() ; ++i) {
// my loop
}
As the condition i < myVector.size() is checked each time, should I store the size of the array inside a variable before the loop to prevent the call to size() each iteration ? Or is the compiler smart enough to do it itself ?
mySize = myVector.size();
for (int i = 0; i < mySize ; ++i) {
// my loop
}
And I would extend the question with a more complex condition such as i < myVector.front()/myVector.size()
Edit: I don't use myVector inside the loop; it is just here to give the ending condition. And what about the more complex condition?
The answer depends mainly on the contents of your loop–it may modify the vector during processing, thus modifying its size.
However if the vector is just scanned you can safely store its size in advance:
for (int i = 0, mySize = myVector.size(); i < mySize ; ++i) {
// my loop
}
although in most classes the functions like 'get current size' are just inline getters:
class XXX
{
public:
int size() const { return mSize; }
....
private:
int mSize;
....
};
so the compiler can easily reduce the call to just reading the int variable, consequently prefetching the length gives no gain.
If you are not changing anything in vector (adding/removing) during for-loop (which is normal case) I would use foreach loop
for (auto object : myVector)
{
//here some code
}
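or, if you cannot use C++11, I would use iterators with the type spelled out (shown here for a std::vector<int>; substitute your element type)
for (std::vector<int>::iterator it = myVector.begin(); it != myVector.end(); ++it)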
{
//here some code
}
I'd say that
for (int i = 0; i < myVector.size() ; ++i) {
// my loop
}
is a bit safer than
mySize = myVector.size();
for (int i = 0; i < mySize ; ++i) {
// my loop
}
because the value of myVector.size() may change (as a result of, e.g., push_back(value) inside the loop), so with the cached size you might miss some of the elements.
If you are 100% sure that the value of myVector.size() is not going to change, then both are the same thing.
Yet, the first one is a bit more flexible than the second (other developer may be unaware that the loop iterates over fixed size and he might change the array size). Don't worry about the compiler, he's smarter than both of us combined.
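A small sketch of the difference, using a made-up loop body that appends one element on the first iteration. Both function names are invented for the example, and std::size_t is used for the index instead of int to match the type size() returns.

```cpp
#include <cstddef>
#include <vector>

// Re-reads size() on every iteration, so the appended element is visited.
std::size_t visitsWithLiveSize(std::vector<int> v) {
    std::size_t visits = 0;
    for (std::size_t i = 0; i < v.size(); ++i) {
        if (i == 0) v.push_back(42);  // hypothetical growth inside the loop
        ++visits;
    }
    return visits;
}

// Caches size() before the loop, so the appended element is missed.
std::size_t visitsWithCachedSize(std::vector<int> v) {
    std::size_t visits = 0;
    const std::size_t n = v.size();
    for (std::size_t i = 0; i < n; ++i) {
        if (i == 0) v.push_back(42);  // same growth, not seen by the condition
        ++visits;
    }
    return visits;
}
```

Starting from three elements, the first function visits four and the second visits three.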
The overhead is very small.
vector.size() does not recalculate anything; it simply returns a stored value (or a cheap pointer difference) in constant time.
It is safer than pre-buffering the value, because the vector's internal size changes whenever an element is pushed onto or popped from the vector.
Compilers can optimize this out if, and only if, they can prove that the vector is not changed by anything while the for loop runs.
That is difficult to do if there are threads involved,
but if there isn't any threading going on, it is much easier to optimize.
Any smart compiler will probably optimize this out. However just to be sure I usually lay out my for loops like this:
for (int i = myvector.size() -1; i >= 0; --i)
{
}
A couple of things are different:
The iteration is done the other way around, although this shouldn't be a problem in most cases. If it is, I prefer David Haim's method.
--i is used rather than i--. In theory --i is faster, although on most compilers it won't make a difference.
If you don't care about the index this:
for (int i = myvector.size(); i > 0; --i)
{
}
would also be an option, although in general I don't use it because it is a bit more confusing than the first and will not gain you any performance.
For a type like std::vector or std::list, an iterator is the preferred method:
for (std::vector</*vectortype here*/>::iterator i = myVector.begin(); i != myVector.end(); ++i)
{
}
I have a code that looks something like this:
bool var = some condition...
if( var )
{
for( int i=0; i<10; ++i )
{
//execute some code ...
}
}
else
{
for( int i=9; i>=0; --i )
{
//execute some other code...
}
}
However, the code that needs to be executed inside the for loop is almost entirely identical, and so I don't want to write it twice. I know I can do something like this:
bool var = some condition...
for( int i = (var ? 0 : 9 ); (var ? i<10 : i>=0 ); (var ? ++i : --i ) )
{
//Execute my code
}
But this is a really un-elegant solution.
Is there a short, more elegant way of doing this? I checked std::iterator but I don't think it's what I need.
You're focusing on the wrong problem here. If you have a direction flag, don't get all hung up on the iteration variable being literally correct. Just interpret it as required:
for (int i = 0; i < n; ++i)
{
int j = var ? i : n - 1 - i;
// j goes from 0..n-1 or n-1..0
}
Unless you're doing billions of these calls, the overhead of the secondary variable will be insignificant.
You can just break the body of the loop out into a function/method and pass in sufficient context for the operation to occur. If the loop body uses mostly fields on this, making it a method should be fairly easy. Otherwise, you shouldn't need more parameters than the loop currently has.
If you're using C++11, you could implement this as a lambda capturing any necessary info, and call the lambda from within each of the loops (so as not to have a loose function). Using a function or method you can test independently is a good idea, though.
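A sketch of the lambda approach, assuming C++11. The function name visitOrder and the body (recording the index) are invented for illustration; the real loop body would go inside the lambda.

```cpp
#include <vector>

std::vector<int> visitOrder(bool forward) {
    std::vector<int> order;
    // The shared loop body is written once; [&] captures locals by reference.
    auto body = [&](int i) {
        order.push_back(i);  // placeholder for the real work
    };
    if (forward) {
        for (int i = 0; i < 10; ++i) body(i);
    } else {
        for (int i = 9; i >= 0; --i) body(i);
    }
    return order;
}
```

Both loops stay trivially readable, and only the lambda needs to change when the body changes.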
Does the code inside the loop depend on the value of the iterator, and if so, how? You might be able to use some basic math in a clever fashion, like transforming the start/end to always be 1..n, or using abs and negatives. This would leave you with one loop, and moving the body out into a function wouldn't be strictly necessary.
It's smart to want to minimize duplicate code, but that doesn't mean that your solution needs to fit in one line. Just write out the logic in a way that makes sense and is legible. Include comments to explain what you're doing and why.
bool var = some condition...
int start = 0;
int end = 10; // one past the last index visited
int increment = 1;
if (!var)
{
// Reverse direction
start = 9;
end = -1; // one before the last index visited
increment = -1;
}
// Loop (forwards if var; reversed if !var)
for (int i = start; i != end; i += increment)
{
}
You may use something like that.
for(int j = 0; j < 10; ++j) { // always increases
int i = var ? j : 10 - 1 - j;
//go
}
This looks suspiciously like iteration to me, so let's try to write a function that will help us out:
void incr(int& i) { ++i; }
void decr(int& i) { --i; }
template <typename Iter, typename Incr>
void do_work(Iter start, Iter finish, Incr incr)
{
for(Iter i = start; i != finish; incr(i))
{
// Do your code.
}
}
bool var = some condition...
if( var )
{
do_work(0, 10, &incr);
}
else
{
do_work(9, -1, &decr);
}
I have a vector containing large number of elements. Now I want to write a small function which counts the number of even or odd elements in the vector. Since performance is a major concern I don't want to put an if statement inside the loop. So I wrote two small functions like:
long long countOdd(const std::vector<int>& v)
{
long long count = 0;
const int size = v.size();
for(int i = 0; i < size; ++i)
{
if(v[i] & 1)
{
++count;
}
}
return count;
}
long long countEven(const std::vector<int>& v)
{
long long count = 0;
const int size = v.size();
for(int i = 0; i < size; ++i)
{
if(0 == (v[i] & 1))
{
++count;
}
}
return count;
}
My question is can I get the same result by writing a single template function like this:
template <bool countEven>
long long countTemplate(const std::vector<int>& v1)
{
long long count = 0;
const int size = v1.size();
for(int i = 0; i < size; ++i)
{
if(countEven)
{
if(v1[i] & 1)
{
++count;
}
}
else if(0 == (v1[i] & 1))
{
++count;
}
}
return count;
}
And using it like this:
int main()
{
if(somecondition)
{
countTemplate<true>(vec); //Count even
}
else
{
countTemplate<false>(vec); //Count odd
}
}
Will the code generated for the template and non-template version be the same ? or will there be some additional instructions emitted?
Note that the counting of numbers is just for illustration hence please don't suggest other methods for counting.
EDIT:
OK, I agree that it may not make much sense from a performance point of view. But at least from a maintainability point of view, I would like to have only one function to maintain instead of two.
The templated version may and, very probably, will be optimized by the compiler when it sees a certain branch in the code is never reached. The countTemplate code for instance, will have the countEven template argument set to true, so the odd branch will be cut away.
(sorry, I can't help suggesting another counting method)
In this particular case, you could use count_if on your vector:
struct odd { bool operator()( int i )const { return i&1; } };
size_t nbOdd = std::count_if( vec.begin(), vec.end(), odd() );
This can also be optimized, and writes way shorter :) The standard library developers have given possible optimization much thought, so better use it when you can, instead of writing your own counting for-loop.
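If C++11 is available, the functor can be replaced by a lambda, and a single function can cover both cases. This is only a sketch: the name countParity and the bool parameter are invented here, and the flag is a runtime value rather than a template argument.

```cpp
#include <algorithm>
#include <vector>

// Counts even elements when countEven is true, odd elements otherwise.
long long countParity(const std::vector<int>& v, bool countEven) {
    return std::count_if(v.begin(), v.end(), [countEven](int i) {
        // i % 2 == 0 is true for even values, including negative ones.
        return (i % 2 == 0) == countEven;
    });
}
```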
Your template version will generate code like this:
template <>
long long countTemplate<true>(const std::vector<int>& v1)
{
long long count = 0;
const int size = v1.size();
for(int i = 0; i < size; ++i)
{
if(true)
{
if(v1[i] & 1)
{
++count;
}
}
else if(0 == (v1[i] & 1))
{
++count;
}
}
return count;
}
template <>
long long countTemplate<false>(const std::vector<int>& v1)
{
long long count = 0;
const int size = v1.size();
for(int i = 0; i < size; ++i)
{
if(false)
{
if(v1[i] & 1)
{
++count;
}
}
else if(0 == (v1[i] & 1))
{
++count;
}
}
return count;
}
So if all optimizations are disabled, the if will in theory still be there. But even a very naive compiler will determine that you're testing a constant, and simply remove the if.
So in practice, no, there should be no difference in the generated code. So you can use the template version and don't worry about this.
I guess that good compiler will cut redundant code in your template as countEven is compile time constant and it is very simple to implement such optimization during template instantiation.
Anyway it seems pretty strange. You wrote a template but do "dynamic switching" inside.
Maybe try something like this:
struct CountEven {};
struct CountOdd {};
inline void CountNum(int num, long long & count, const CountEven &)
{
if(0 == (num & 1))
{
++count;
}
}
inline void CountNum(int num, long long & count, const CountOdd &)
{
if(num & 1)
{
++count;
}
}
template <class T>
long long countTemplate(const std::vector<int>& v1)
{
long long count = 0;
const int size = v1.size();
for(int i = 0; i < size; ++i)
{
CountNum(v1[i], count, T());
}
return count;
}
The compiler will select the matching CountNum() overload at compile time:
int main()
{
if(somecondition)
{
countTemplate<CountEven>(vec); //Count even
}
else
{
countTemplate<CountOdd>(vec); //Count odd
}
}
Code is messy, but I think you got the idea.
This will depend on how smart the compiler optimizer is. The compiler might be able to see that really the if-statement is redundant and only one branch of it is executed and optimize the whole thing.
The best way to check is to try and look at the assembly - this code will not produce too much of machine code.
The first thing that comes to my mind are the two optimization "rules":
Don't optimize prematurely.
Don't do it yet.
The point is that sometimes we bother about a performance bottleneck which will never happen in practice. There are studies that say that 20 percent of the code is responsible for 80 percent of the software execution time. Of course this doesn't mean you pessimize prematurely, but I don't think that's your case.
In general, you should do this kind of optimization only after you have actually run a profiler on your program and identified the real bottlenecks.
Regarding your function versions, as others have said, this depends on your compiler. Just remember that with the template approach you won't be able to switch calls at runtime (a template is a compile-time tool).
A final note: long long only became standard C++ with C++11; before that it was a compiler extension.
If you care about optimization issues try to make it like the following:
template <bool countEven>
long long countTemplate(const std::vector<int>& v1)
{
long long count = 0;
const int size = v1.size();
for ( int i = 0; i < size; ++i ) {
// (v1[i] & 1) == 0 is true for even values. The parentheses matter:
// == binds more tightly than &, so v1[i] & 1 == countEven would
// parse as v1[i] & (1 == countEven).
if ( ((v1[i] & 1) == 0) == countEven ) ++count;
}
return count;
}
I believe that the code above will compile to the same code as the non-template versions.
Use STL, Luke :-) It's even given as an example in the reference.
bool isOdd(int i)
{
return i%2!=0; // i%2 is -1 for negative odd values, so compare against 0
}
bool isEven(int i)
{
return i%2==0;
}
std::vector<int>::size_type count = 0;
if(somecondition)
{
count = std::count_if(vec.begin(), vec.end(), isEven);
}
else
{
count = std::count_if(vec.begin(), vec.end(), isOdd);
}
In general, the outcome will be much the same. You are describing an O(n) iteration over the linear memory of the vector.
If you had a vector of pointers, suddenly the performance would be way worse because the memory locality of reference would be lost.
However, the more general point is that even netbook CPUs can do gazillions of operations per second. Looping over your array is most unlikely to be performance-critical code.
You should write for readability, then profile your code, and consider doing more involved hand-tweaked things when the profiling highlights the root cause of any performance issue you have.
And performance gains typically come from algorithmic changes; if you kept count of the number of odds as you added and removed elements from the vector, for example, it would be O(1) to retrieve...
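A sketch of that idea, with an invented wrapper class (ParityTrackedVector is not a standard type): the odd count is updated on every insertion and removal, so reading it later costs O(1). Only push_back and pop_back are shown; any other mutating operation would have to maintain the counter in the same way.

```cpp
#include <cstddef>
#include <vector>

class ParityTrackedVector {
public:
    void push_back(int value) {
        data_.push_back(value);
        if (value % 2 != 0) ++oddCount_;
    }

    // Precondition: the container is not empty.
    void pop_back() {
        if (data_.back() % 2 != 0) --oddCount_;
        data_.pop_back();
    }

    std::size_t oddCount() const { return oddCount_; }
    std::size_t evenCount() const { return data_.size() - oddCount_; }

private:
    std::vector<int> data_;
    std::size_t oddCount_ = 0;  // kept in sync by push_back and pop_back
};
```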
I see that you're using long long for counter, and that probably means that you expect huge number of elements in vector. In that case, I would definitely go for template implementation (because of code readability) and just move that if condition outside for loop.
If we assume that compiler makes no optimization whatsoever, you would have 1 condition and possibly more than 2 billion iterations through vector. Also, since the condition would be if (true) or if (false) the branch prediction would work perfectly and execution would be less than 1 CPU instruction.
I'm pretty sure that all compilers on the market have this optimization, but I would quote my favorite when it comes to performance: "Premature optimization is the root of all evil" and "There're only 3 rules of optimization: Measure, measure and measure".
If you absolutely absurdly care about fast looking code:
(a clever compiler, or one otherwise hinted at using directives or intrinsics, could do this in parallel using SIMD; CUDA and OpenCL would of course eat this for breakfast!)
int count_odd(const int* array,size_t len) {
int count = 0;
const int* const sentinal = array+len;
while(array<sentinal)
count += (*array++ & 1);
return count;
}
int count_even(const int* array,size_t len) {
return len-count_odd(array,len);
}