template <typename T>
class Table {
public:
Table();
Table(int m, int n);
Table(int m, int n, const T& value);
Table(const Table<T>& rhs);
~Table();
Table<T>& operator=(const Table& rhs);
T& operator()(int i, int j);
int numRows()const;
int numCols()const;
void resize(int m, int n);
void resize(int m, int n, const T& value);
private:
// Make private because this method should only be used
// internally by the class.
void destroy();
private:
int mNumRows;
int mNumCols;
T** mDataMatrix;
};
template <typename T>
void Table<T>::destroy() {
// Does the matrix exist?
if (mDataMatrix) {
for (int i = 0; i < _m; ++i) {
// Does the ith row exist?
if (mDataMatrix[i]) {
// Yes, delete it.
delete[]mDataMatrix[i];
mDataMatrix[i] = 0;
}
}
// Delete the row-array.
delete[] mDataMatrix;
mDataMatrix = 0;
}
mNumRows = 0;
mNumCols = 0;
}
This is a code sample I got from a book. It demonstrates how to destroy or free a 2x2 matrix where mDataMatrix is the pointer to array of pointers.
What I don't understand is this part:
for(int i = 0; i < _m; ++i) {
// Does the ith row exist?
if (mDataMatrix[i]) {
//.….
}
}
I don't know why the book uses _m for max number of row-ptr. It wasn't even a variable define in class; the variable for max row is mNumRows. Maybe it is some compiler pre-defined variable? Another thing I am quite confuse is why is it ++i? pre-operator, why not i++? Will it make different if I change it into i++?
Another thing I am quite confuse is why is it ++i? pre-operator, why not i++? Will it make different if I change it into i++?
Because ++i is more natural and easier to understand: increment i and then yield the variable i as a result. i++ on the other hand means copy the current value of i somewhere (let's call it temp), increment i, and then yield the value temp as a result.
Also, for user-defined types, i++ is potentially slower than ++i.
Note that ++i as a loop increment does not imply the increment happens before entering the loop body or something. (This seems to be a common misconception among beginners.) If you're not using ++i or i++ as part of a larger expression, the semantics are exactly the same, because prefix and postfix increment only differ in their result (incremented variable vs. old value), not in their side effect (incrementing the variable).
Without seeing the entire class code, it is hard to tell for your first question, but if it hasn't been defined as part of the class, my guess would be that it is a typo.
as for your second question, ++i vs. i++, the prefix increment operator (++i) returns the object you are incrementing, whereas the postfix increment operator returns a copy of the object, in the objects original state. i.e.-
int i=1;
std::cout << i++ << std::endl; // output: 1
std::cout << i << std::endl // output: 2
std::cout << ++i << std::endl // output: 3
as for will the code change with the postfix- no, it works the same in loops, and makes basically no difference in loops for integer types. For user defined types, however, it may be more efficient to use the prefix increment, and is the style many c++ programmers use by default.
If the _mvariable isn't defined anywhere this is an error. From that context it looks like it should contain the number of rows that are allocated with new somewhere (probably in the constructor, or there might be methods like addRow). If that number is always mNumRows, than this would be appropriate for the loop in the destructor.
If you use ++i or i++ in that for loop doesn't make any difference. Both variants increment the integer, and the return value of the expression (that would be different) isn't used anywhere.
I can't speak to the first part of the question, but I can explain the pre- versus post- increment dilemma.
Prefix versions increment and decrement are slightly more efficient and are generally preferred. In the end, though, the extra overhead caused by using i++ over ++i is negligible unless the loop is being executed many, many times.
As others have said, the prefix operator is preferred for performance reasons when dealing with user-defined types. The reason it has no impact on the for loop is because the test involving the value of the variable (i.e. i < _m) is performed before the operation that modifies the variable is performed.
The real mess with this book is the way it illustrates a 2x2 matrix. The problem is here that for 4 elements you have 3 blocks of memory allocated, and not only does it slows down the program but it is certainly much more tricky to handle.
The usual technic is much simpler:
T* mData = new T[2*2];
And then you access it like so:
T& operator()(size_t r, size_t c) { return mData[r * mNbRows + c]; }
This is a bit more work (you have to multiply by the number of rows if you are row major), but then the destroy is incredibly easy:
template <class T>
void Table<T>::destroy()
{
delete[] mData;
mData = 0;
mNbRows = 0;
mNbColumns = 0;
}
Also note that here there is no need for a if: it's fine to call delete on a null pointer, it just doesn't do anything.
Finally, I have no idea why your book is using int for coordinates, do negative coordinates have any meaning in the context of this Table class ? If not, you're better off using an unsigned integral type (like size_t) and throwing the book away.
Related
I got this library of mathematical routines ( without documentation ) to work on some task at college. The problem I have with it is that all of its functions have void return type, although these functions call one another, or are part of another, and the results of their computations are needed.
This is a piece of ( simplified ) code extracted from the libraries. Don't bother about the mathematics in code, it is not significant. Just passing arguments and returning results is what puzzles me ( as described after code ) :
// first function
void vector_math // get the (output) vector we need
(
double inputV[3], // input vector
double outputV[3] // output vector
)
{
// some variable declarations and simple arithmetics
// .....
//
transposeM(matrix1, matrix2, 3, 3 ); // matrix2 is the result
matrixXvector( matrix2, inputV, outputV) // here you get the result, outputV
}
////////
// second function
void transposeM // transposes a matrix
(
std::vector< std::vector<double> > mat1, // input matrix
std::vector< std::vector<double> > &mat2, // transposed matrix
int mat1rows, int mat1columns
)
{
int row,col;
mat2.resize(mat1columns); // rows
for (std::vector< std::vector<double> >::iterator it=mat2.begin(); it !=mat2.end();++it)
it->resize(mat1rows);
for (row = 0; row < mat1rows; row++)
{
for (col = 0; col < mat1columns; col++)
mat2[col][row] = mat1[row][col];
}
}
////////
// third function
void matrixXvector // multiply matrix and vector
(
std::vector< std::vector<double> > inMatrix,
double inVect[3],
double outVect[3]
)
{
int row,col,ktr;
for (row = 0; row <= 2; row++)
{
outVect[row]= 0.0;
for (ktr = 0; ktr <= 2; ktr++)
outVect[row]= outVect[row] + inMatrix[row][ktr] * inVect[ktr];
}
}
So "vector_math" is being called by the main program. It takes inputV as input and the result should be outputV. However, outputV is one of the input arguments, and the function returns void. And similar process occurs later when calling "transposeM" and "matrixXvector".
Why is the output variable one of the input arguments ? How are the results being returned and used for further computation ? How this kind of passing and returning arguments works ?
Since I am a beginner and also have never seen this style of coding, I don't understand how passing parameters and especially giving output works in these functions. Therefore I don't know how to use them and what to expect of them ( what they will actually do ). So I would very much appreciate an explanation that will make these processes clear to me.
EXTRA :
Thank you all for great answers. It was first time I could barely decide which answer to accept, and even as I did it felt unfair to others. I would like to add an extra question though, if anyone is willing to answer ( as a comment is enough ). Does this "old" style of coding input/output arguments have its name or any other expression with which it is referred ?
This is an "old" (but still popular) style of returning certain or multiple values. It works like this:
void copy (const std::vector<double>& input, std::vector<double>& output) {
output = input;
}
int main () {
std::vector<double> old_vector {1,2,3,4,5}, new_vector;
copy (old_vector, new_vector); // new_vector now copy of old_vector
}
So basically you give the function one or multiple output parameter to write the result of its computation to.
If you pass input parameters (i.e. you don't intend to change them) by value or by const reference does not matter, although passing read only arguments by value might be costly performance-wise. In the first case, you copy the input object and use the copy in the function, in the latter you just let the function see the original and prevent it from being modified with the const. The const for the input parameters is optional, but leaving it out allows the function to change their values which might not be what you want, and inhibits passing temporaries as input.
The input parameter(s) have to be passed by non-const reference to allow the function to change it/them.
Another, even older and "C-isher" style is to passing output-pointer or raw-arrays, like the first of your functions does. This is potentially dangerous as the pointer might not point to a valid piece of memory, but still pretty wide spread. It works essentially just like the first example:
// Copies in to int pointed to by out
void copy (int in, int* out) {
*out = in;
}
// Copies int pointed to by in to int pointed to by out
void copy (const int* in, int* out) {
*out = *in;
}
// Copies length ints beginning from in to length ints beginning at out
void copy (const int* in, int* out, std::size_t length) {
// For loop for beginner, use std::copy IRL:
// std::copy(in, in + length, out);
for (std::size_t i = 0; i < length; ++i)
out[i] = in[i];
}
The arrays in your first example basically work like pointers.
Baum's answer is accurate, but perhaps not as detailed as a C/C++ beginner would like.
The actual argument values that go into a function are always passed by value (i.e. a bit pattern) and cannot be changed in a way that is readable by the caller. HOWEVER - and this is the key - those bits in the arguments may in fact be pointers (or references) that don't contain data directly, but rather contain a location in memory that contains the actual value.
Examples: in a function like this:
void foo(double x, double output) { output = x ^ 2; }
naming the output variable "output doesn't change anything - there is no way for the caller to get the result.
But like this:
void foo(double x, double& output) { output = x ^ 2; }
the "&" indicates that the output parameter is a reference to the memory location where the output should be stored. It's syntactic sugar in C++ that is equivalent to this 'C' code:
void foo(double x, double* pointer_to_output) { *pointer_to_output = x ^ 2; }
The pointer dereference is hidden by the reference syntax but the idea is the same.
Arrays perform a similar syntax trick, they are actually passed as pointers, so
void foo(double x[3], double output[3]) { ... }
and
void foo(double* x, double* output) { ... }
are essentially equivalent. Note that in either case there is no way to determine the size of the arrays. Therefore, it is generally considered good practice to pass pointers and lengths:
void foo(double* x, int xlen, double* output, int olen);
Output parameters like this are used in multiple cases. A common one is to return multiple values since the return type of a function can be only a single value. (While you can return an object that contains multiple members, but you can't return multiple separate values directly.)
Another reason why output parameters are used is speed. It's frequently faster to modify the output in place if the object in question is large and/or expensive to construct.
Another programming paradigm is to return a value that indicates the success/failure of the function and return calculated value(s) in output parameters. For example, much of the historic Windows API works this way.
An array is a low-level C++ construct. It is implicitly convertible to a pointer to the memory allocated for the array.
int a[] = {1, 2, 3, 4, 5};
int *p = a; // a can be converted to a pointer
assert(a[0] == *a);
assert(a[1] == *(a + 1));
assert(a[1] == p[1]);
// etc.
The confusing thing about arrays is that a function declaration void foo(int bar[]); is equivalent to void foo(int *bar);. So foo(a) doesn't copy the array a; instead, a is converted to a pointer and the pointer - not the memory - is then copied.
void foo(int bar[]) // could be rewritten as foo(int *bar)
{
bar[0] = 1; // could be rewritten as *(bar + 0) = 1;
}
int main()
{
int a[] = {0};
foo(a);
assert(a[0] == 1);
}
bar points to the same memory that a does so modifying the contents of array pointed to by bar is the same as modifying the contents of array a.
In C++ you can also pass objects by reference (Type &ref;). You can think of references as aliases for a given object. So if you write:
int a = 0;
int &b = a;
b = 1;
assert(a == 1);
b is effectively an alias for a - by modifying b you modify a and vice versa. Functions can also take arguments by reference:
void foo(int &bar)
{
bar = 1;
}
int main()
{
int a = 0;
foo(a);
assert(a == 1);
}
Again, bar is little more than an alias for a, so by modifying bar you will also modify a.
The library of mathematical routines you have is using these features to store results in an input variable. It does so to avoid copies and ease memory management. As mentioned by #Baum mit Augen, the method can also be used as a way to return multiple values.
Consider this code:
vector<int> foo(const vector<int> &bar)
{
vector<int> result;
// calculate the result
return result;
}
While returning result, foo will make a copy of the vector, and depending on number (and size) of elements stored the copy can be very expensive.
Note:
Most compilers will elide the copy in the code above using Named Return Value Optimization (NRVO). In general case, though, you have no guarantee of it happening.
Another way to avoid expensive copies is to create the result object on heap, and return a pointer to the allocated memory:
vector<int> *foo(const vector<int> &bar)
{
vector<int> *result = new vector<int>;
// calculate the result
return result;
}
The caller needs to manage the lifetime of the returned object, calling delete when it's no longer needed. Faililng to do so can result in a memory leak (the memory stays allocated, but effectively unusable, by the application).
Note:
There are various solutions to help with returning (expensive to copy) objects. C++03 has std::auto_ptr wrapper to help with lifetime management of objects created on heap. C++11 adds move semantics to the language, which allow to efficiently return objects by value instead of using pointers.
I want to add two arrays by simply writing:
int a[4] = {1,2,3,4};
int b[4] = {2,1,3,1};
int sum[4] = a + b;
I wrote this function but I got an error
int* operator+(const uint32& other) const{
uint32 sum[n];
for(int i=0; i<n; i++){
sum[i] = (*this[i]) + other[i];
}
return sum;
}
Could you help me on this? Thanks in advance.
Let's go through your code, piece by piece, and look at the problems:
int* operator+(const uint32& other) const{
You can't overload operators for built-in types, so this is doomed from the beginning
Even if you could do this (which you can't), it needs to take two parameters since it's non-member binary function.
uint32 sum[n];
You can't make variable-length arrays in C++ (assuming n isn't a compile-time constant) (note: G++ has some extensions that allow this, but it's non-standard C++)
for(int i=0; i<n; i++){
sum[i] = (*this[i]) + other[i];
There's no this pointer to begin with in this code (it's not a member function)...
const uint32& other is not an array/pointer to an array. It's a single reference to a single uint32. That means that other in this code is not an array/pointer to an array, and so you cannot do other[i] (it's like trying to do int x = 3; x[4] = 13;, which makes no sense).
}
return sum;
You're returning a pointer to a locally allocated array, which means this will result in undefined behavior, as the memory associated with sum is going to get annihilated when this function returns.
}
This is probably wrong, but it appears to work (C++11):
#include <iostream>
#include <array>
using namespace std;
template <class T>
T operator+(const T& a1, const T& a2)
{
T a;
for (typename T::size_type i = 0; i < a1.size(); i++)
a[i] = a1[i] + a2[i];
return a;
}
int main()
{
array<int,5> a1 = { 1, 2, 3, 4, 5 };
array<int,5> a2 = { 2, 3, 4, 5, 6 };
array<int,5> a3 = a1 + a2;
for (int i = 0; i < 5; i++)
cout << a1[i] << '+' << a2[i] << '=' << a3[i] << ' ';
cout << endl;
return 0;
}
Output (ideone):
1+2=3 2+3=5 3+4=7 4+5=9 5+6=11
I think the issue is that you're missing a way to pass in the length of the array. You might need to do something a bit more sophisticated. Something like:
class AddingVector : public std::vector<int>
{
public:
typedef AddingVector type;
type operator+(const AddingVector& rhs, const AddingVector& lhs)
{
/* validate that they're the same size, decide how you want to handle that*/
AddingVector retVal;
AddingVector::const_iterator rIter = rhs.begin();
AddingVector::const_iterator lIter = lhs.begin();
while (rIter != rhs.end() && lIter != lhs.end()) {
retVal.push_back(*rIter + *lIter);
++rIter;
++lIter;
}
return retVal;
}
}
You cannot do that. Non-member binary operators must take two arguments (you only provided one), so you could try this:
int* operator+(const uint32& a, const uint32& b)
But that can't possibly work either, since you want to add arrays, not single uint32 variables. So you would think that this would do it:
int* operator+(const uint32[] a, const uint32[] b)
or:
int* operator+(const uint32[4] a, const uint32[4] b)
But no go. It's illegal because you cannot have pointer types as both arguments in an operator overload. Additionally, at least one of the arguments must be a class type or an enum. So what you're trying to do is already impossible on at least two different levels.
It's impossible to do what you want. One correct way to go about it is to write your own class for an array that can be added to another one.
You cannot overload operators for types other than your own defined types. That is, if you create a class X, you can overload operators for X, but you cannot overload operators for arrays or pointers to fundamental types.
first is your code getting compiled properly, you have used 'n' directly in declaring array, is 'n' declared as constant..
And moreover you have taken a local variable in the function and returning it, well, this return a garbage form the stack, wat i can suggest is you malloc some memory and use it,, but again freeing it would be needed...
Hey, what you could do is,
Take a wrapper class "array"
class array
{
int *ipArr;
DWORD size;
};
then in constructor you can pass the size you want to have an array of
array(DWORD dwSize);
{
// then malloc memory of size dwSize;
}
Have an overloaded operator'+' for this class, that will have the above implementation of adding two int arrays,
Note here you will also need to overlaod the '=' assignment operator, so that our array class can you is directly..
now you can free the associated memory in the destructor
You have a few problems. The first is that you aren't passing in both arrays, and then you don't specify what n is, and the last is that you are trying to pass out a pointer to a local variable. It looks like you are trying to make a member operator of a class.
So basically you are trying to add the contents of an unspecified length array to an uninitialised array of the same length and return the stack memory.
So if you pass in pointers to the arrays and the length of the array and an output array then it would work, but you wouldn't have the syntax
sum = a + b;
it would be something like
addArray(&a, &b, &sum, 4);
To get the syntax you want you could make a class that wraps an array. But that is a much more complicated task.
I am trying to learn C++ function templates.I am passing an array as pointer to my function template. In that, I am trying to find the size of an array. Here is the function template that I use.
template<typename T>
T* average( T *arr)
{
T *ansPtr,ans,sum = 0.0;
size_t sz = sizeof(arr)/sizeof(arr[0]);
cout<<"\nSz is "<<sz<<endl;
for(int i = 0;i < sz; i++)
{
sum = sum + arr[i];
}
ans = (sum/sz);
ansPtr = &ans;
return ansPtr;
}
The cout statement displays the size of arr as 1 even when I am passing the pointer to an array of 5 integers. Now I know this might be a possible duplicate of questions to which I referred earlier but I need a better explanation on this.
Only thing I could come up with is that since templates are invoked at runtime,and sizeof is a compile time operator, compiler just ignores the line
int sz = sizeof(arr)/sizeof(arr[0]);
since it does not know the exact type of arr until it actually invokes the function.
Is it correct or am I missing something over here? Also is it reliable to send pointer to an array to the function templates?
T *arr
This is C++ for "arr is a pointer to T". sizeof(arr) obviously means "size of the pointer arr", not "size of the array arr", for obvious reasons. That's the crucial flaw in that plan.
To get the size of an array, the function needs to operate on arrays, obviously not on pointers. As everyone knows (right?) arrays are not pointers.
Furthermore, an average function should return an average value. But T* is a "pointer to T". An average function should not return a pointer to a value. That is not a value.
Having a pointer return type is not the last offense: returning a pointer to a local variable is the worst of all. Why would you want to steal hotel room keys?
template<typename T, std::size_t sz>
T average( T(&arr)[sz])
{
T ans,sum = 0.0;
cout<<"\nSz is "<<sz<<endl;
for(int i = 0;i < sz; i++)
{
sum = sum + arr[i];
}
ans = (sum/sz);
return ans;
}
If you want to be able to access the size of a passed parameter, you'd have to make that a template parameter, too:
template<typename T, size_t Len>
T average(const T (&arr)[Len])
{
T sum = T();
cout<<"\nSz is "<<Len<<endl;
for(int i = 0;i < Len; i++)
{
sum = sum + arr[i];
}
return (sum/Len);
}
You can then omit the sizeof, obviously. And you cannot accidentially pas a dynamically allocated array, which is a good thing. On the downside, the template will get instantiated not only once for every type, but once for every size. If you want to avoid duplicating the bulk of the code, you could use a second templated function which accepts pointer and length and returns the average. That could get called from an inline function.
template<typename T>
T average(const T* arr, size_t len)
{
T sum = T();
cout<<"\nSz is "<<len<<endl;
for(int i = 0;i < len; i++)
{
sum = sum + arr[i];
}
return (sum/len);
}
template<typename T, size_t Len>
inline T average(const T (&arr)[Len])
{
return average(arr, Len);
}
Also note that returning the address of a variable which is local to the function is a very bad idea, as it will not outlive the function. So better to return a value and let the compiler take care of optimizing away unneccessary copying.
Arrays decay to pointers when passed as a parameter, so you're effectively getting the size of the pointer. It has nothing to do with templates, it's how the language is designed.
Others have pointed out the immediate errors, but IMHO, there are two
important points that they haven't addresses. Both of which I would
consider errors if they occurred in production code:
First, why aren't you using std::vector? For historical reasons, C
style arrays are broken, and generally should be avoided. There are
exceptions, but they mostly involve static initialization of static
variables. You should never pass C style arrays as a function
argument, because they create the sort of problems you have encountered.
(It's possible to write functions which can deal with both C style
arrays and std::vector efficiently. The function should be a
function template, however, which takes two iterators of the template
type.)
The second is why aren't you using the functions in the standard
library? Your function can be written in basically one line:
template <typename ForwardIterator>
typename ForwardIterator::value_type
average( ForwardIterator begin, ForwardIterator end )
{
return std::accumulate( begin, end,
typename::ForwardIterator::value_type() )
/ std::distance( begin, end );
}
(This function, of course, isn't reliable for floating point types,
where rounding errors can make the results worthless. Floating point
raises a whole set of additional issues. And it probably isn't really
reliable for the integral types either, because of the risk of overflow.
But these are more advanced issues.)
I was implementing a small dense matrix class and instead of plan get/set operators I wanted to use operator overloading to make the API more usable and coherent.
What I want to achieve is pretty simple:
template<typename T>
class Matrix
{
public:
/* ... Leaving out CTOR and Memory Management for Simplicity */
T operator() (unsigned x, unsigned y){/* ... */ }
};
Matrix<int> m(10,10);
int value = m(5,3); // get the value at index 5,3
m(5,3) = 99; // set the value at index 5,3
While getting the value is straight forward by overloading operator(), I can't get my head around defining the setter. From what I understood the operator precedence would call operator() before the assignment, however it is not possible to overload operator() to return a correct lvalue.
What is the best approach to solve this problem?
I dispute that "it's not possible" to do the correct thing:
struct Matrix
{
int & operator()(size_t i, size_t j) { return data[i * Cols + j]; }
const int & operator()(size_t i, size_t j) const { return data[i * Cols + j]; }
/* ... */
private:
const size_t Rows, Cols;
int data[Rows * Cols]; // not real code!
};
Now you can say, m(2,3) = m(3,2) = -1; etc.
The answer to your question is what Kerrek already stated: you can provide an overload by changing the signature of the operator, which in this case can be achieved by modifying the const-ness of the function.
But I would recommend that you at least consider providing a separate setter for the values. The reason is that once you return references to your internal data structures you loose control of what external code does with your data. It might be ok in this case, but consider that if you decided to add range validation to the implementation (i.e. verify that no value in the matrix is above X or below Y), or wished to optimize some calculation on the matrix (say the sum of all of the elements in the matrix is an often checked value, and you want to optimize away the calculation by pre-caching the value and updating it on each field change) it is much easier to control with a method that receives the value to set.
We have the question is there a performance difference between i++ and ++i in C?
What's the answer for C++?
[Executive Summary: Use ++i if you don't have a specific reason to use i++.]
For C++, the answer is a bit more complicated.
If i is a simple type (not an instance of a C++ class), then the answer given for C ("No there is no performance difference") holds, since the compiler is generating the code.
However, if i is an instance of a C++ class, then i++ and ++i are making calls to one of the operator++ functions. Here's a standard pair of these functions:
Foo& Foo::operator++() // called for ++i
{
this->data += 1;
return *this;
}
Foo Foo::operator++(int ignored_dummy_value) // called for i++
{
Foo tmp(*this); // variable "tmp" cannot be optimized away by the compiler
++(*this);
return tmp;
}
Since the compiler isn't generating code, but just calling an operator++ function, there is no way to optimize away the tmp variable and its associated copy constructor. If the copy constructor is expensive, then this can have a significant performance impact.
Yes. There is.
The ++ operator may or may not be defined as a function. For primitive types (int, double, ...) the operators are built in, so the compiler will probably be able to optimize your code. But in the case of an object that defines the ++ operator things are different.
The operator++(int) function must create a copy. That is because postfix ++ is expected to return a different value than what it holds: it must hold its value in a temp variable, increment its value and return the temp. In the case of operator++(), prefix ++, there is no need to create a copy: the object can increment itself and then simply return itself.
Here is an illustration of the point:
struct C
{
C& operator++(); // prefix
C operator++(int); // postfix
private:
int i_;
};
C& C::operator++()
{
++i_;
return *this; // self, no copy created
}
C C::operator++(int ignored_dummy_value)
{
C t(*this);
++(*this);
return t; // return a copy
}
Every time you call operator++(int) you must create a copy, and the compiler can't do anything about it. When given the choice, use operator++(); this way you don't save a copy. It might be significant in the case of many increments (large loop?) and/or large objects.
Here's a benchmark for the case when increment operators are in different translation units. Compiler with g++ 4.5.
Ignore the style issues for now
// a.cc
#include <ctime>
#include <array>
class Something {
public:
Something& operator++();
Something operator++(int);
private:
std::array<int,PACKET_SIZE> data;
};
int main () {
Something s;
for (int i=0; i<1024*1024*30; ++i) ++s; // warm up
std::clock_t a = clock();
for (int i=0; i<1024*1024*30; ++i) ++s;
a = clock() - a;
for (int i=0; i<1024*1024*30; ++i) s++; // warm up
std::clock_t b = clock();
for (int i=0; i<1024*1024*30; ++i) s++;
b = clock() - b;
std::cout << "a=" << (a/double(CLOCKS_PER_SEC))
<< ", b=" << (b/double(CLOCKS_PER_SEC)) << '\n';
return 0;
}
O(n) increment
Test
// b.cc
#include <array>
class Something {
public:
Something& operator++();
Something operator++(int);
private:
std::array<int,PACKET_SIZE> data;
};
Something& Something::operator++()
{
for (auto it=data.begin(), end=data.end(); it!=end; ++it)
++*it;
return *this;
}
Something Something::operator++(int)
{
Something ret = *this;
++*this;
return ret;
}
Results
Results (timings are in seconds) with g++ 4.5 on a virtual machine:
Flags (--std=c++0x) ++i i++
-DPACKET_SIZE=50 -O1 1.70 2.39
-DPACKET_SIZE=50 -O3 0.59 1.00
-DPACKET_SIZE=500 -O1 10.51 13.28
-DPACKET_SIZE=500 -O3 4.28 6.82
O(1) increment
Test
Let us now take the following file:
// c.cc
#include <array>
class Something {
public:
Something& operator++();
Something operator++(int);
private:
std::array<int,PACKET_SIZE> data;
};
Something& Something::operator++()
{
return *this;
}
Something Something::operator++(int)
{
Something ret = *this;
++*this;
return ret;
}
It does nothing in the incrementation. This simulates the case when incrementation has constant complexity.
Results
Results now vary extremely:
Flags (--std=c++0x) ++i i++
-DPACKET_SIZE=50 -O1 0.05 0.74
-DPACKET_SIZE=50 -O3 0.08 0.97
-DPACKET_SIZE=500 -O1 0.05 2.79
-DPACKET_SIZE=500 -O3 0.08 2.18
-DPACKET_SIZE=5000 -O3 0.07 21.90
Conclusion
Performance-wise
If you do not need the previous value, make it a habit to use pre-increment. Be consistent even with builtin types, you'll get used to it and do not run risk of suffering unecessary performance loss if you ever replace a builtin type with a custom type.
Semantic-wise
i++ says increment i, I am interested in the previous value, though.
++i says increment i, I am interested in the current value or increment i, no interest in the previous value. Again, you'll get used to it, even if you are not right now.
Knuth.
Premature optimization is the root of all evil. As is premature pessimization.
It's not entirely correct to say that the compiler can't optimize away the temporary variable copy in the postfix case. A quick test with VC shows that it, at least, can do that in certain cases.
In the following example, the code generated is identical for prefix and postfix, for instance:
#include <stdio.h>
class Foo
{
public:
Foo() { myData=0; }
Foo(const Foo &rhs) { myData=rhs.myData; }
const Foo& operator++()
{
this->myData++;
return *this;
}
const Foo operator++(int)
{
Foo tmp(*this);
this->myData++;
return tmp;
}
int GetData() { return myData; }
private:
int myData;
};
int main(int argc, char* argv[])
{
Foo testFoo;
int count;
printf("Enter loop count: ");
scanf("%d", &count);
for(int i=0; i<count; i++)
{
testFoo++;
}
printf("Value: %d\n", testFoo.GetData());
}
Whether you do ++testFoo or testFoo++, you'll still get the same resulting code. In fact, without reading the count in from the user, the optimizer got the whole thing down to a constant. So this:
for(int i=0; i<10; i++)
{
testFoo++;
}
printf("Value: %d\n", testFoo.GetData());
Resulted in the following:
00401000 push 0Ah
00401002 push offset string "Value: %d\n" (402104h)
00401007 call dword ptr [__imp__printf (4020A0h)]
So while it's certainly the case that the postfix version could be slower, it may well be that the optimizer will be good enough to get rid of the temporary copy if you're not using it.
The Google C++ Style Guide says:
Preincrement and Predecrement
Use prefix form (++i) of the increment and decrement operators with
iterators and other template objects.
Definition: When a variable is incremented (++i or i++) or decremented (--i or
i--) and the value of the expression is not used, one must decide
whether to preincrement (decrement) or postincrement (decrement).
Pros: When the return value is ignored, the "pre" form (++i) is never less
efficient than the "post" form (i++), and is often more efficient.
This is because post-increment (or decrement) requires a copy of i to
be made, which is the value of the expression. If i is an iterator or
other non-scalar type, copying i could be expensive. Since the two
types of increment behave the same when the value is ignored, why not
just always pre-increment?
Cons: The tradition developed, in C, of using post-increment when the
expression value is not used, especially in for loops. Some find
post-increment easier to read, since the "subject" (i) precedes the
"verb" (++), just like in English.
Decision: For simple scalar (non-object) values there is no reason to prefer one
form and we allow either. For iterators and other template types, use
pre-increment.
++i - faster not using the return value
i++ - faster using the return value
When not using the return value the compiler is guaranteed not to use a temporary in the case of ++i. Not guaranteed to be faster, but guaranteed not to be slower.
When using the return value i++ allows the processor to push both the
increment and the left side into the pipeline since they don't depend on each other. ++i may stall the pipeline because the processor cannot start the left side until the pre-increment operation has meandered all the way through. Again, a pipeline stall is not guaranteed, since the processor may find other useful things to stick in.
I would like to point out an excellent post by Andrew Koenig on Code Talk very recently.
http://dobbscodetalk.com/index.php?option=com_myblog&show=Efficiency-versus-intent.html&Itemid=29
At our company also we use convention of ++iter for consistency and performance where applicable. But Andrew raises over-looked detail regarding intent vs performance. There are times when we want to use iter++ instead of ++iter.
So, first decide your intent and if pre or post does not matter then go with pre as it will have some performance benefit by avoiding creation of extra object and throwing it.
#Ketan
...raises over-looked detail regarding intent vs performance. There are times when we want to use iter++ instead of ++iter.
Obviously post and pre-increment have different semantics and I'm sure everyone agrees that when the result is used you should use the appropriate operator. I think the question is what should one do when the result is discarded (as in for loops). The answer to this question (IMHO) is that, since the performance considerations are negligible at best, you should do what is more natural. For myself ++i is more natural but my experience tells me that I'm in a minority and using i++ will cause less metal overhead for most people reading your code.
After all that's the reason the language is not called "++C".[*]
[*] Insert obligatory discussion about ++C being a more logical name.
Mark: Just wanted to point out that operator++'s are good candidates to be inlined, and if the compiler elects to do so, the redundant copy will be eliminated in most cases. (e.g. POD types, which iterators usually are.)
That said, it's still better style to use ++iter in most cases. :-)
The performance difference between ++i and i++ will be more apparent when you think of operators as value-returning functions and how they are implemented. To make it easier to understand what's happening, the following code examples will use int as if it were a struct.
++i increments the variable, then returns the result. This can be done in-place and with minimal CPU time, requiring only one line of code in many cases:
int& int::operator++() {
return *this += 1;
}
But the same cannot be said of i++.
Post-incrementing, i++, is often seen as returning the original value before incrementing. However, a function can only return a result when it is finished. As a result, it becomes necessary to create a copy of the variable containing the original value, increment the variable, then return the copy holding the original value:
int int::operator++(int& _Val) {
int _Original = _Val;
_Val += 1;
return _Original;
}
When there is no functional difference between pre-increment and post-increment, the compiler can perform optimization such that there is no performance difference between the two. However, if a composite data type such as a struct or class is involved, the copy constructor will be called on post-increment, and it will not be possible to perform this optimization if a deep copy is needed. As such, pre-increment generally is faster and requires less memory than post-increment.
#Mark: I deleted my previous answer because it was a bit flip, and deserved a downvote for that alone. I actually think it's a good question in the sense that it asks what's on the minds of a lot of people.
The usual answer is that ++i is faster than i++, and no doubt it is, but the bigger question is "when should you care?"
If the fraction of CPU time spent in incrementing iterators is less than 10%, then you may not care.
If the fraction of CPU time spent in incrementing iterators is greater than 10%, you can look at which statements are doing that iterating. See if you could just increment integers rather than using iterators. Chances are you could, and while it may be in some sense less desirable, chances are pretty good you will save essentially all the time spent in those iterators.
I've seen an example where the iterator-incrementing was consuming well over 90% of the time. In that case, going to integer-incrementing reduced execution time by essentially that amount. (i.e. better than 10x speedup)
#wilhelmtell
The compiler can elide the temporary. Verbatim from the other thread:
The C++ compiler is allowed to eliminate stack based temporaries even if doing so changes program behavior. MSDN link for VC 8:
http://msdn.microsoft.com/en-us/library/ms364057(VS.80).aspx
Since you asked for C++ too, here is a benchmark for java (made with jmh) :
private static final int LIMIT = 100000;
#Benchmark
public void postIncrement() {
long a = 0;
long b = 0;
for (int i = 0; i < LIMIT; i++) {
b = 3;
a += i * (b++);
}
doNothing(a, b);
}
#Benchmark
public void preIncrement() {
long a = 0;
long b = 0;
for (int i = 0; i < LIMIT; i++) {
b = 3;
a += i * (++b);
}
doNothing(a, b);
}
The result shows, even when the value of the incremented variable (b) is actually used in some computation, forcing the need to store an additional value in case of post-increment, the time per operation is exactly the same :
Benchmark Mode Cnt Score Error Units
IncrementBenchmark.postIncrement avgt 10 0,039 0,001 ms/op
IncrementBenchmark.preIncrement avgt 10 0,039 0,001 ms/op
An the reason why you ought to use ++i even on built-in types where there's no performance advantage is to create a good habit for yourself.
Both are as fast ;)
If you want it is the same calculation for the processor, it's just the order in which it is done that differ.
For example, the following code :
#include <stdio.h>
int main()
{
int a = 0;
a++;
int b = 0;
++b;
return 0;
}
Produce the following assembly :
0x0000000100000f24 <main+0>: push %rbp
0x0000000100000f25 <main+1>: mov %rsp,%rbp
0x0000000100000f28 <main+4>: movl $0x0,-0x4(%rbp)
0x0000000100000f2f <main+11>: incl -0x4(%rbp)
0x0000000100000f32 <main+14>: movl $0x0,-0x8(%rbp)
0x0000000100000f39 <main+21>: incl -0x8(%rbp)
0x0000000100000f3c <main+24>: mov $0x0,%eax
0x0000000100000f41 <main+29>: leaveq
0x0000000100000f42 <main+30>: retq
You see that for a++ and b++ it's an incl mnemonic, so it's the same operation ;)
The intended question was about when the result is unused (that's clear from the question for C). Can somebody fix this since the question is "community wiki"?
About premature optimizations, Knuth is often quoted. That's right. but Donald Knuth would never defend with that the horrible code which you can see in these days. Ever seen a = b + c among Java Integers (not int)? That amounts to 3 boxing/unboxing conversions. Avoiding stuff like that is important. And uselessly writing i++ instead of ++i is the same mistake.
EDIT: As phresnel nicely puts it in a comment, this can be summed up as "premature optimization is evil, as is premature pessimization".
Even the fact that people are more used to i++ is an unfortunate C legacy, caused by a conceptual mistake by K&R (if you follow the intent argument, that's a logical conclusion; and defending K&R because they're K&R is meaningless, they're great, but they aren't great as language designers; countless mistakes in the C design exist, ranging from gets() to strcpy(), to the strncpy() API (it should have had the strlcpy() API since day 1)).
Btw, I'm one of those not used enough to C++ to find ++i annoying to read. Still, I use that since I acknowledge that it's right.
i++ is sometimes faster than ++i!
For x86-architectures that use ILP (instruction level paralellism) i++ might in some situations outperform ++i.
Why? Because of data dependencies. Modern CPUs parallelise a lot of stuff. If the next few CPU cycles don't have any direct dependency on the incremented value of i, the CPU might omit microcode to delay the increment of i and shove it in a "free slot". This means that you essentially get a "free" increment.
I don't know how far ILE goes in this case but I suppose if the iterator becomes too complicated and does pointer dereferencing this might not work.
Here's a talk by Andrei Alexandrescu explaining the concept: https://www.youtube.com/watch?v=vrfYLlR8X8k&list=WL&index=5
++i is faster than i = i +1 because in i = i + 1 two operation are taking place, first increment and second assigning it to a variable. But in i++ only increment operation is taking place.
Time to provide folks with gems of wisdom ;) - there is simple trick to make C++ postfix increment behave pretty much the same as prefix increment (Invented this for myself, but the saw it as well in other people code, so I'm not alone).
Basically, trick is to use helper class to postpone increment after the return, and RAII comes to rescue
#include <iostream>
class Data {
private: class DataIncrementer {
private: Data& _dref;
public: DataIncrementer(Data& d) : _dref(d) {}
public: ~DataIncrementer() {
++_dref;
}
};
private: int _data;
public: Data() : _data{0} {}
public: Data(int d) : _data{d} {}
public: Data(const Data& d) : _data{ d._data } {}
public: Data& operator=(const Data& d) {
_data = d._data;
return *this;
}
public: ~Data() {}
public: Data& operator++() { // prefix
++_data;
return *this;
}
public: Data operator++(int) { // postfix
DataIncrementer t(*this);
return *this;
}
public: operator int() {
return _data;
}
};
int
main() {
Data d(1);
std::cout << d << '\n';
std::cout << ++d << '\n';
std::cout << d++ << '\n';
std::cout << d << '\n';
return 0;
}
Invented is for some heavy custom iterators code, and it cuts down run-time. Cost of prefix vs postfix is one reference now, and if this is custom operator doing heavy moving around, prefix and postfix yielded the same run-time for me.
++i is faster than i++ because it doesn't return an old copy of the value.
It's also more intuitive:
x = i++; // x contains the old value of i
y = ++i; // y contains the new value of i
This C example prints "02" instead of the "12" you might expect:
#include <stdio.h>
int main(){
int a = 0;
printf("%d", a++);
printf("%d", ++a);
return 0;
}
Same for C++:
#include <iostream>
using namespace std;
int main(){
int a = 0;
cout << a++;
cout << ++a;
return 0;
}