Auto reference for vectors? - c++

The following code will not alter the contents of i in the for loop:
class Solution {
public:
vector<vector<int>> combinationSum(vector<int>& candidates, int target) {
if (target == 0) {return vector<vector<int>>{{}};}
else if (!candidates.size() || target < 0) {return vector<vector<int>>();}
else {
vector<vector<int>> with = combinationSum(candidates, target - candidates[0]);
vector<int> new_vector(candidates.begin() + 1, candidates.end());
vector<vector<int>> without = combinationSum(new_vector, target);
for (auto i : with) {i.push_back(candidates[0]);}
with.insert(with.end(), without.begin(), without.end());
return with;
}
}
};
However, if I change it to auto& i : with ..., it works. Is there a reason why? I thought references were only relevant if you are working with pointed objects or if you don't want the variable (in this case i) to change in the local scope.

auto defaults to being neither const nor a reference. So for (auto i : with) { is storing to i by value, which invokes the copy constructor for each of the nested vectors. That copy has nothing to do with the version stored in your vector of vectors, so no changes persist when the copy goes away. Your addition of & makes it a reference that aliases the underlying inner vector so changes to the reference change the original vector.
If you don't want to change the object in question, the efficient way to do this is to use const auto& to both prevent changes and avoid copies; it's fairly rare you want to iterate by value this way unless you really want a mutable copy that won't affect the original, since the copy constructor costs can get pretty nasty.

On using for (auto i : with), you are copying the element i from the vector with and then using it.
On using for (auto & i : with), you are referencing the element i from the vector with and then using it.
Your vector would change if you were copying the addresses of its elements then dereferencing (or you were using a vector of pointers) but here you are copying the objects.

for (auto i : with) { ... is equivalent to:
for (vector<int>::iterator it = with.begin(); it != i.end(); it++)
{
vector<int> i = *it;
...
And you can then see that i makes a copy each time round the loop. Changes to this copy therefore affect only the copy, which is a temporary.
OTOH for (auto &i : with) ... is equivalent to:
for (vector<int>::iterator it = with.begin(); it != i.end(); it++)
{
vector<int>& i = *it;
...
which assigns a reference each time round the loop. Changes to this reference therefore affect the original object.

Related

c++ – Disappearing Variable

I am writing a function to take the intersection of two sorted vector<size_t>s named a and b. The function iterates through both vectors removing anything from a that is not also in b so that whatever remains in a is the intersection of the two. Code here:
void intersect(vector<size_t> &a, vector<size_t> &b) {
vector<size_t>::iterator aItr = a.begin();
vector<size_t>::iterator bItr = b.begin();
vector<size_t>::iterator aEnd = a.end();
vector<size_t>::iterator bEnd = b.end();
while(aItr != aEnd) {
while(*bItr < *aItr) {
bItr++;
if(bItr == bEnd) {
a.erase(aItr, aEnd);
return;
}
}
if (*aItr == *bItr) aItr++;
else aItr = a.erase(aItr, aItr+1);
}
}
I am getting a very bug. I am stepping the debugger and once it passes line 8 "while(*bItr < *aItr)" b seems to disappear. The debugger seems not to know that b even exists! When b comes back into existence after it goes back to the top of the loop it has now taken on the values of a!
This is the kind of behavior that I expect to see in dynamic memory error, but as you can see I am not managing any dynamic memory here. I am super confused and could really use some help.
Thanks in advance!
Well, perhaps you should first address a major issue with your code: iterator invalidation.
See: Iterator invalidation rules here on StackOverflow.
When you erase an element in a vector, iterators into that vector at the point of deletion and further on are not guaranteed to be valid. Your code, though, assumes such validity for aEnd (thanks #SidS).
I would guess either this is the reason for what you're seeing, or maybe it's your compiler optimization flags which can change the execution flow, the lifetimes of variables which are not necessary, etc.
Plus, as #KT. notes, your erases can be really expensive, making your algorithm potentially quadratic-time in the length of a.
You are making the assumption that b contains at least one element. To address that, you can add this prior to your first loop :
if (bItr == bEnd)
{
a.clear();
return;
}
Also, since you're erasing elements from a, aEnd will become invalid. Replace every use of aEnd with a.end().
std::set_intersection could do all of this for you :
void intersect(vector<size_t> &a, const vector<size_t> &b)
{
auto it = set_intersection(a.begin(), a.end(), b.begin(), b.end(), a.begin());
a.erase(it, a.end());
}

Initialising a reference to a class within a vector of pairs using a pointer to a pointer to an instance of that class

Apologies for the lengthy title, but I've noticed an odd behaviour that I can't understand regardless of the amount of testing I do. I was hoping someone could shed some light on what's happening here and whether it's due to something I'm doing wrong.
I have created a vector 'v' of a custom class 'T', which I've populated with unique objects. I want to use this vector as a sort of reference sheet.
I created a vector 'vp' of 'T' pointers, each element of which points to a specific unique element in 'v'. I then shuffled 'vp' and created an iterator 'i' for it, meaning I could iterate through 'vp' and get the unique 'T's in a random order without modifying 'v'.
I then created a vector of pairs of type with the intention of populating it with the randomised 'T's. However what I found was that each time a new element was added to this vector of pairs, the values of T& all updated to the current value of 'i'. What is the reason for this and (assuming that I must use this method) is there any way to avoid it? The simplified code I've tried to describe is below:
std::vector<T> v{ ... }; //Contains list of unique 'T's
std::vector<T*> vp{};
for (std::vector<T>::const_iterator iElement = v.begin(); iElement < v.end(); ++iElement)
vp.push_back(&(*iElement));
//Code for swapping elements of vp goes here
std::vector<std::pair<T&, int>> classVector{};
std::vector<T*>::iterator i{ vp.begin() };
for (int n = 0; n < 5; ++n)
{
classVector.push_back(std::make_pair(**i, 1));
++i;
}
Thanks for your help.
EDIT: had incorrectly used a for each loop to iterate through v, changed to iterator.
#include <vector>
#include <string>
#include <iostream>
int main() {
std::vector<std::string> v{"111", "2222", "3333"};
std::vector<const std::string*> vp;
for (const auto& element : v) {
vp.push_back(&element);
std::cout << &element <<std::endl;
}
std::vector<std::pair<const std::string&, int>> classVector{};
auto i = vp.begin();
for (int n = 0; n < v.size(); ++n) {
classVector.push_back(std::make_pair(**i, 1));
std::cout << (**i) << std::endl;
i++;
}
// prints same element
for(const auto& e : classVector) {
std::cout << (e.first) << std::endl;
}
}
This is a compiled abstract c++11 sample code of what you are trying to do and I will try to explain the problem on it. Basically you are pushing a pair to the vector to which this pair's first element is a reference to which ever string is pointed to by the iterator i (sorry for the wording complexity but what you are doing is complex and I don't get the point of it :) ). Taking the pointer is a different story, because pointers are copied (the addresses are copied not what's pointed to). so you go to the first element copy it's address, go the second copy it's address and so on. References are different!! The code above is taking the reference to any string that's pointed by the iterator so when the iterator moves on, the reference is refers to the string that i currently points to.

Optimization of a C++ code (that uses UnorderedMap and Vector)

I am trying to optimize some part of a C++ code that is taking a long time (the following part of the code takes about 19 seconds for X amount of data, and I am trying to finish the whole process in less than 5 seconds for the same amount of data - based on some benchmarks that I have). I have a function "add" that I have written and copied the code here. I will try to explain as much as possible that I think is needed to understand the code. Please let me know if I have missed something.
The following function add is called X times for X amount of data entries.
void HashTable::add(PointObject vector) // PointObject is a user-defined object
{
int combinedHash = hash(vector); // the function "hash" takes less than 1 second for X amount of data
// hashTableMap is an unordered_map<int, std::vector<PointObject>>
if (hashTableMap.count(combinedHash) == 0)
{
// if the hashmap does not contain the combinedHash key, then
// add the key and a new vector
std::vector<PointObject> pointVectorList;
pointVectorList.push_back(vector);
hashTableMap.insert(std::make_pair(combinedHash, pointVectorList));
}
else
{
// otherwise find the key and the corresponding vector of PointObjects and add the current PointObject to the existing vector
auto it = hashTableMap.find(combinedHash);
if (it != hashTableMap.end())
{
std::vector<PointObject> pointVectorList = it->second;
pointVectorList.push_back(vector);
it->second = pointVectorList;
}
}
}
You are doing a lot of useless operations... if I understand correctly, a simplified form could be simply:
void HashTable::add(const PointObject& vector) {
hashTableMap[hash(vector)].push_back(vector);
}
This works because
A map when accessed using operator[] will create a default-initialized value if it's not already present in the map
The value (an std::vector) is returned by reference so you can directly push_back the incoming point to it. This std::vector will be either a newly inserted one or a previously existing one if the key was already in the map.
Note also that, depending on the size of PointObject and other factors, it could be possibly more efficient to pass vector by value instead of by const PointObject&. This is the kind of micro optimization that however requires profiling to be performed sensibly.
Instead of calling hashTableMap.count(combinedHash) and hashTableMap.find(combinedHash), better just insert new element and check what insert() returned:
In versions (1) and (2), the function returns a pair object whose
first element is an iterator pointing either to the newly inserted
element in the container or to the element whose key is equivalent,
and a bool value indicating whether the element was successfully
inserted or not.
Moreover, do not pass objects by value, where you don't have to. Better pass it by pointer or by reference. This:
std::vector<PointObject> pointVectorList = it->second;
is inefficient since it will create an unnecessary copy of the vector.
This .count() is totally unecessary, you could simplify your function to:
void HashTable::add(PointObject vector)
{
int combinedHash = hash(vector);
auto it = hashTableMap.find(combinedHash);
if (it != hashTableMap.end())
{
std::vector<PointObject> pointVectorList = it->second;
pointVectorList.push_back(vector);
it->second = pointVectorList;
}
else
{
std::vector<PointObject> pointVectorList;
pointVectorList.push_back(vector);
hashTableMap.insert(std::make_pair(combinedHash, pointVectorList));
}
}
You are also performing copy operations everywhere. Copying an object is time consuming, avoid doing that. Also use references and pointers when possible:
void HashTable::add(PointObject& vector)
{
int combinedHash = hash(vector);
auto it = hashTableMap.find(combinedHash);
if (it != hashTableMap.end())
{
it->second.push_back(vector);
}
else
{
std::vector<PointObject> pointVectorList;
pointVectorList.push_back(vector);
hashTableMap.insert(std::make_pair(combinedHash, pointVectorList));
}
}
This code can probably be optimized further, but it would require knowing hash(), knowing the way hashTableMap works (by the way, why is it not a std::map?) and some experimentation.
If hashTableMap was a std::map<int, std::vector<pointVectorList>>, you could simplify your function to this:
void HashTable::add(PointObject& vector)
{
hashTableMap[hash(vector)].push_back(vector);
}
And if it was a std::map<int, std::vector<pointVectorList*>> (pointer) you can even avoid that last copy operation.
Without the if, try to insert an empty entry on the hash table:
auto ret = hashTableMap.insert(
std::make_pair(combinedHash, std::vector<PointObject>());
Either a new blank entry will be added, or the already present entry will be retrieved. In your case, you don't need to check which it the case, you just need to take the returned iterator and add the new element:
auto &pointVectorList = *ret.first;
pointVectorList.push_back(vector);
Assuming that PointObject is big and making copies of it is expensive, std::move is your friend here. You'll want to ensure that PointObject is move-aware (either don't define a destructor or copy operator, or provide a move-constructor and move-assignment operator yourself).
void HashTable::add(PointObject vector) // PointObject is a user-defined object
{
int combinedHash = hash(vector); // the function "hash" takes less than 1 second for X amount of data
// hashTableMap is an unordered_map<int, std::vector<PointObject>>
if (hashTableMap.count(combinedHash) == 0)
{
// if the hashmap does not contain the combinedHash key, then
// add the key and a new vector
std::vector<PointObject> pointVectorList;
pointVectorList.push_back(std::move(vector));
hashTableMap.insert(std::make_pair(combinedHash, std::move(pointVectorList)));
}
else
{
// otherwise find the key and the corresponding vector of PointObjects and add the current PointObject to the existing vector
auto it = hashTableMap.find(combinedHash);
if (it != hashTableMap.end())
{
std::vector<PointObject> pointVectorList = it->second;
pointVectorList.push_back(std::move(vector));
it->second = std::move(pointVectorList);
}
}
}
Using std::unordered_map doesn't seem appropriate here - you use the int from hash as the key (which presumably) is the hash of PointObject rather than PointObject itself. Essentially double hashing. And also if you need a PointObject in order to compute the map key then it's not really a key at all! Perhaps std::unordered_multiset would be a better choice?
First define the hash function form PointObject
namespace std
{
template<>
struct hash<PointObject> {
size_t operator()(const PointObject& p) const {
return ::hash(p);
}
};
}
Then something like
#include <unordered_set>
using HashTable = std::unordered_multiset<PointObject>;
int main()
{
HashTable table {};
PointObject a {};
table.insert(a);
table.emplace(/* whatever */);
return 0;
}
Your biggest problem is that you're copying the entire vector (and every element in that vector) twice in the else part:
std::vector<PointObject> pointVectorList = it->second; // first copy
pointVectorList.push_back(vector);
it->second = pointVectorList; // second copy
This means that every time you're adding an element to an existing vector you're copying that entire vector.
If you used a reference to that vector you'd do a lot better:
std::vector<PointObject> &pointVectorList = it->second;
pointVectorList.push_back(vector);
//it->second = pointVectorList; // don't need this anymore.
On a side note, in your unordered_map you're hashing your value to be your key.
You could use an unordered_set with your hash function instead.

trying to reflect the effect a change to the original variable (call by reference)

I have defined a vector which consist of pairs and vectors. For better readability, I do this
#include <iostream>
#include <vector>
using namespace std;
class foo {
public:
vector< pair<int, vector<int> > > v;
vector< pair<int, vector<int> > >::iterator it;
void bar()
{
for( it = v.begin(); it != v.end(); it++ ) {
if ( 10 == (*it).first ) {
vector<int> a = (*it).second; // NOTE
a[0] = a[0]*2; // NOTE
}
}
}
};
int main()
{
foo f;
f.bar();
return 0;
}
As you can see, I assign (*it).second to a variable a so that I manipulate easily. Problem is, when I change the value of a, the original vector v doesn't change. I can resolve that changing a to (*it).second every where in the loop. However this will make the code hard for reading.
Is there any way to reflect the changes in a to the original v? I have to say call by reference like this
vector<int> a = &(*it).second;
doesn't work
You need to declare a as a reference:
vector<int>& a = (*it).second;
Ampersand in std::vector<int> a = &(*it).second; is being parsed as adress-of operator and a isn't a pointer - that's why it doesn't work/compile.
You can change the value of the original vector only when you take it into a reference.
But instead in your code you are creating a copy of it and changing that copy which does not effect the original vector.
your vector<int> a should be a reference instead.
vector<int> &a...
first of all: there is no code which demonstrates how do you fill the v member and when you initialize it. just make sure that you have a valid it after all (remember that, all iterators into vector may invalidate after v modifications)
and the second: the correct line should be like this
vector<int>& a = it->second;
(you take an address instead, and your snippet even wouldn't compile)

Iterate through a C++ Vector using a 'for' loop

I am new to the C++ language. I have been starting to use vectors, and have noticed that in all of the code I see to iterate though a vector via indices, the first parameter of the for loop is always something based on the vector. In Java I might do something like this with an ArrayList:
for(int i=0; i < vector.size(); i++){
vector[i].doSomething();
}
Is there a reason I don't see this in C++? Is it bad practice?
The reason why you don't see such practice is quite subjective and cannot have a definite answer, because I have seen many of the code which uses your mentioned way rather than iterator style code.
Following can be reasons of people not considering vector.size() way of looping:
Being paranoid about calling size() every time in the loop
condition. However either it's a non-issue or it can be trivially
fixed
Preferring std::for_each() over the for loop itself
Later changing the container from std::vector to other one (e.g.
map, list) will also demand the change of the looping mechanism,
because not every container support size() style of looping
C++11 provides a good facility to move through the containers. That is called "range based for loop" (or "enhanced for loop" in Java).
With little code you can traverse through the full (mandatory!) std::vector:
vector<int> vi;
...
for(int i : vi)
cout << "i = " << i << endl;
The cleanest way of iterating through a vector is via iterators:
for (auto it = begin (vector); it != end (vector); ++it) {
it->doSomething ();
}
or (equivalent to the above)
for (auto & element : vector) {
element.doSomething ();
}
Prior to C++0x, you have to replace auto by the iterator type and use member functions instead of global functions begin and end.
This probably is what you have seen. Compared to the approach you mention, the advantage is that you do not heavily depend on the type of vector. If you change vector to a different "collection-type" class, your code will probably still work. You can, however, do something similar in Java as well. There is not much difference conceptually; C++, however, uses templates to implement this (as compared to generics in Java); hence the approach will work for all types for which begin and end functions are defined, even for non-class types such as static arrays. See here: How does the range-based for work for plain arrays?
Is there any reason I don't see this in C++? Is it bad practice?
No. It is not a bad practice, but the following approach renders your code certain flexibility.
Usually, pre-C++11 the code for iterating over container elements uses iterators, something like:
std::vector<int>::iterator it = vector.begin();
This is because it makes the code more flexible.
All standard library containers support and provide iterators. If at a later point of development you need to switch to another container, then this code does not need to be changed.
Note: Writing code which works with every possible standard library container is not as easy as it might seem to be.
The right way to do that is:
for(std::vector<T>::iterator it = v.begin(); it != v.end(); ++it) {
it->doSomething();
}
Where T is the type of the class inside the vector. For example if the class was CActivity, just write CActivity instead of T.
This type of method will work on every STL (Not only vectors, which is a bit better).
If you still want to use indexes, the way is:
for(std::vector<T>::size_type i = 0; i != v.size(); i++) {
v[i].doSomething();
}
Using the auto operator really makes it easy to use as one does not have to worry about the data type and the size of the vector or any other data structure
Iterating vector using auto and for loop
vector<int> vec = {1,2,3,4,5}
for(auto itr : vec)
cout << itr << " ";
Output:
1 2 3 4 5
You can also use this method to iterate sets and list. Using auto automatically detects the data type used in the template and lets you use it.
So, even if we had a vector of string or char the same syntax will work just fine
A correct way of iterating over the vector and printing its values is as follows:
#include<vector>
// declare the vector of type int
vector<int> v;
// insert elements in the vector
for (unsigned int i = 0; i < 5; ++i){
v.push_back(i);
}
// print those elements
for (auto it = v.begin(); it != v.end(); ++it){
std::cout << *it << std::endl;
}
But at least in the present case it is nicer to use a range-based for loop:
for (auto x: v) std::cout << x << "\n";
(You may also add & after auto to make x a reference to the elements rather than a copy of them. It is then very similar to the above iterator-based approach, but easier to read and write.)
There's a couple of strong reasons to use iterators, some of which are mentioned here:
Switching containers later doesn't invalidate your code.
i.e., if you go from a std::vector to a std::list, or std::set, you can't use numerical indices to get at your contained value. Using an iterator is still valid.
Runtime catching of invalid iteration
If you modify your container in the middle of your loop, the next time you use your iterator it will throw an invalid iterator exception.
Here is a simpler way to iterate and print values in vector.
for(int x: A) // for integer x in vector A
cout<< x <<" ";
With STL, programmers use iterators for traversing through containers, since iterator is an abstract concept, implemented in all standard containers. For example, std::list has no operator [] at all.
I was surprised nobody mentioned that iterating through an array with an integer index makes it easy for you to write faulty code by subscripting an array with the wrong index. For example, if you have nested loops using i and j as indices, you might incorrectly subscript an array with j rather than i and thus introduce a fault into the program.
In contrast, the other forms listed here, namely the range based for loop, and iterators, are a lot less error prone. The language's semantics and the compiler's type checking mechanism will prevent you from accidentally accessing an array using the wrong index.
don't forget examples with const correctness - can the loop modify the elements. Many examples here do not, and should use cont iterators. Here we assume
class T {
public:
T (double d) : _d { d } {}
void doSomething () const { cout << _d << endl; return; }
void changeSomething () { ++_d; return; }
private:
double _d;
};
vector<T> v;
// ...
for (const auto iter = v.cbegin(); iter != v.cend(); ++iter) {
iter->doSomething();
}
Note also, that with the C++11 notation, the default is to copy the element. Use a reference to avoid this, and/or to allow for original elements to be modified:
vector<T> v;
// ...
for (auto t : v) {
t.changeSomething(); // changes local t, but not element of v
t.doSomething();
}
for (auto& t : v) { // reference avoids copying element
t.changeSomething(); // changes element of v
t.doSomething();
}
for (const auto& t : v) { // reference avoids copying element
t.doSomething(); // element can not be changed
}
//different declaration type
vector<int>v;
vector<int>v2(5,30); //size is 5 and fill up with 30
vector<int>v3={10,20,30};
//From C++11 and onwards
for(auto itr:v2)
cout<<"\n"<<itr;
//(pre c++11)
for(auto itr=v3.begin(); itr !=v3.end(); itr++)
cout<<"\n"<<*itr;
int main()
{
int n;
int input;
vector<int> p1;
vector<int> ::iterator it;
cout << "Enter the number of elements you want to insert" << endl;
cin >> n;
for (int i = 0;i < n;i++)
{
cin >> input;
p1.push_back(input);
}
for(it=p1.begin();it!=p1.end();it++)
{
cout << *it << endl;
}
//Iterating in vector through iterator it
return 0;
}
conventional form of iterator
If you use
std::vector<std::reference_wrapper<std::string>> names{ };
Do not forget, when you use auto in the for loop, to use also get, like this:
for (auto element in : names)
{
element.get()//do something
}