C++: struct pointer in map structure - c++

I declared a struct like this, and the following data structures
struct piece{
int start, height, end;
piece(int a, int b, int c){
start = a; height = b; end = c;
}
};
vector<piece> piecesTog;
map <int,piece*> data;
Then, when I read the elements I do this:
while(scanf("%d %d %d", &aux1, &aux2, &aux3) != EOF){
piecesTog.push_back( piece(aux1, aux2, aux3) );
data[a] = data[c] = &piecesTog[tam];
}
Well, until now, I have had no problem.
However, later in the program I have to use the piece* part. To do so, I use an iterator like this:
for(map< int, piece* >::iterator it = data.begin(); it != data.end(); it++){
piece* aux = it->second;
...
}
I want to access the structure that it->second points to, but I tried everything and nothing worked.
I printed the memory address of it->second and of &piecesTog[tam] and they are the same, but when I read (*aux).height or it->second->height I get completely wild numbers, probably garbage.
I have no clue why that is happening.
If anyone has any idea how to fix it, I would appreciate it.

while(scanf("%d %d %d", &aux1, &aux2, &aux3) != EOF){
piecesTog.push_back( piece(aux1, aux2, aux3) );
data[a] = data[c] = &piecesTog[tam];
}
is almost certainly not following the Iterator invalidation rules.
piecesTog.push_back( piece(aux1, aux2, aux3) );
can trigger a resize which typically creates a new datastore, copies the elements from the old data store to the new one and then deletes the old datastore, leaving the pointers cached by
data[a] = data[c] = &piecesTog[tam];
dangling. When you use those pointers some time in the future, Ka-Blammo! Undefined Behaviour and an easily identified crash if you're lucky.
Insufficient information has been provided to supply a definitive solution, but here are a few general alternatives (in order of attractiveness):
If you know ahead of time the number of pieces that will go into piecesTog, you can reserve storage to eliminate the need to resize the vector.
If elements are only added to the end of the vector and no elements are ever removed, you can store the indexes of the elements rather than pointers to them. If the ordering never changes, the indexes will always refer to the correct elements no matter how many more items are added.
If it is possible to do so, rewrite the reader to load all of the pieces into piecesTog and then build the maps.
The above options all assume that piecesTog is assembled all at once and then left alone. If your insertion is more free-form, you sort the structure or you remove elements, you'll need to use a data structure with more favourable invalidation rules such as std::list.
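As a rough illustration of the index-based alternative, here is a minimal sketch that keeps indexes in the map instead of pointers. The PieceIndex wrapper and its add/at members are names invented for this example, and it assumes that the question's a and c are the start and end values that were just read.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

struct piece {
    int start, height, end;
    piece(int a, int b, int c) : start(a), height(b), end(c) {}
};

// Hypothetical wrapper: the vector owns the pieces, the map stores positions.
struct PieceIndex {
    std::vector<piece> pieces;
    std::map<int, std::size_t> byEndpoint;  // start or end value -> index into pieces

    void add(int start, int height, int end) {
        pieces.push_back(piece(start, height, end));
        const std::size_t idx = pieces.size() - 1;
        // An index stays correct even if push_back moved the buffer.
        byEndpoint[start] = idx;
        byEndpoint[end] = idx;
    }

    const piece& at(int endpoint) const {
        std::map<int, std::size_t>::const_iterator it = byEndpoint.find(endpoint);
        assert(it != byEndpoint.end());  // caller must ask for a known endpoint
        return pieces[it->second];
    }
};
```

Looking a piece up then becomes idx.at(key).height. This only holds as long as elements are appended and never removed or reordered, as noted above.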

Related

C++ Keeping track of start iterator while adding items

I am trying to do a double loop across a std::vector to explore all combinations of items in the vector. If the result is good, I add it to the vector for another pass. This is being used for an association rule problem but I made a smaller demonstration for this question. It seems as though when I push_back it will sometimes change the vector such that the original iterator no longer works. For example:
std::vector<int> nums{1,2,3,4,5};
auto nextStart = nums.begin();
while (nextStart != nums.end()){
auto currentStart = nextStart;
auto currentEnd = nums.end();
nextStart = currentEnd;
for (auto a = currentStart; a!= currentEnd-1; a++){
for (auto b = currentStart+1; b != currentEnd; b++){
auto sum = (*a) + (*b);
if (sum < 10) nums.push_back(sum);
}
}
}
On some iterations, currentStart points to a location that is outside the array and provides garbage data. What is causing this and what is the best way to avoid this situation? I know that modifying something you iterate over is an invitation for trouble...
nums.push_back(sum);
push_back invalidates all existing iterators to the vector if push_back ends up reallocating the vector.
That's just how the vector works. Initially some additional space gets reserved for the vector's growth. Vector's internal buffer that holds its contents has some extra room to spare, but when it is full the next call to push_back allocates a bigger buffer to the vector's contents, moves the contents of the existing buffer, then deletes the existing buffer.
The shown code creates and uses iterators for the vector, but any call to push_back invalidates the cached end iterator, and a call that reallocates invalidates the whole lot. The next use of an invalidated iterator results in undefined behavior.
You have two basic options:
Replace the vector with some other container that does not invalidate its existing iterators when additional values get added to the container
Reimplement the entire logic using vector indexes instead of iterators.
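A minimal sketch of the index-based option, assuming the loop structure from the question is otherwise kept as is. The function name expandSums and the limit parameter are made up for this example, and it assumes all inputs are positive, since otherwise the passes might never stop producing new sums.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Index-based version of the question's loop. Indexes stay meaningful across
// reallocation, and the end of each pass is a saved size, not a saved iterator.
std::vector<int> expandSums(std::vector<int> nums, int limit) {
    for (int x : nums) assert(x > 0);  // assumption: positive values, so passes terminate

    std::size_t nextStart = 0;
    while (nextStart != nums.size()) {
        const std::size_t currentStart = nextStart;
        const std::size_t currentEnd = nums.size();  // this pass covers [currentStart, currentEnd)
        nextStart = currentEnd;
        for (std::size_t a = currentStart; a + 1 < currentEnd; ++a) {
            for (std::size_t b = currentStart + 1; b < currentEnd; ++b) {
                const int sum = nums[a] + nums[b];      // copy before the vector may reallocate
                if (sum < limit) nums.push_back(sum);   // safe: no iterators are held
            }
        }
    }
    return nums;
}
```

Calling expandSums({1, 2, 3, 4, 5}, 10) mirrors the example from the question.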

C++: how to loop through integer elements in a vector

I would like to loop through elements of a vector in C++.
I am very new at this so I don't understand the details very well.
For example:
for (elements in vector) {
    if () {
        // check something
    } else {
        // else add another element to the vector
        vectorname.push_back(n);
    }
}
It's the for (vector elements) part that I am having trouble with.
You'd normally use what's called a range-based for loop for this:
for (auto element : your_vector)
if (condition(element))
// whatever
else
your_vector.push_back(something);
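But note: modifying a vector in the middle of iteration is generally a poor idea. In particular, a push_back inside a range-based for loop can invalidate the iterators the loop uses internally, which is undefined behavior. And if your basic notion is to add the element if it's not already present, you may want to look up std::set, std::map, std::unordered_set or std::unordered_map instead.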
In order to do this properly (and safely), you need to understand how std::vector works.
vector capacity
You may know that a vector works much like an array with "infinite" size. Meaning, it can hold as many elements as you want, as long as you have enough memory to hold them. But how does it do that?
A vector has an internal buffer (think of it like an array allocated with new) that may be the same size as the elements you're storing, but generally it's larger. It uses the extra space in the buffer to insert any new elements that you want to insert when you use push_back().
The number of elements the vector holds is known as its size, and the number of elements it can hold without reallocating is known as its capacity. You can query those via the size() and capacity() member functions.
However, this extra space must run out at some point. That's when the magic happens: when the vector notices it doesn't have enough room to hold more elements, it allocates a new buffer, larger [1] than the previous one, and copies (or moves) all elements to it. The important thing to notice here is that the new buffer will have a different address. As we continue with this explanation, keep this in mind.
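To make the address change concrete, here is a small sketch that records the buffer address before and after a push_back. The two helper names are made up for this example, and the addresses are compared as integers so that the old pointer is never used after the buffer is freed.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Fills v up to its current capacity, then pushes one more element.
// Returns true if that last push_back moved the buffer to a new address.
bool bufferMovesWhenFull(std::vector<int>& v) {
    while (v.size() < v.capacity()) v.push_back(0);
    assert(v.size() == v.capacity());  // no spare room left
    const std::uintptr_t before = reinterpret_cast<std::uintptr_t>(v.data());
    v.push_back(0);                    // size() would exceed capacity(): reallocation
    return reinterpret_cast<std::uintptr_t>(v.data()) != before;
}

// Pushes one element into spare capacity.
// Returns true if the buffer stayed at the same address.
bool bufferStaysWithSpareRoom(std::vector<int>& v) {
    v.reserve(v.size() + 1);           // guarantee room for one more element
    const std::uintptr_t before = reinterpret_cast<std::uintptr_t>(v.data());
    v.push_back(0);                    // fits in the existing buffer
    return reinterpret_cast<std::uintptr_t>(v.data()) == before;
}
```

Both helpers should return true for any vector of ints: the first because the push has to reallocate, the second because it does not.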
iterators
Now, we need to talk about iterators. I don't know how much of C++ you have studied yet, but think of an old plain array:
int my_array[5] = {1,2,3,4,5};
you can take the address of the first element by doing:
int* begin = my_array;
and you can take the address of the end of the array (more specifically, one past the last element) by doing:
int* end = begin + sizeof(my_array)/sizeof(int);
if you have these addresses, one way to iterate the array and print all elements would be:
for (int* it = begin; it < end; ++it) {
std::cout << *it;
}
An iterator works much like a pointer. If you increment it (like we do with the pointer using ++it above), it will point to the next element. If you dereference it (again, like we do with the pointer using *it above), it will return the element it is pointing to.
std::vector provides us with two member functions, begin() and end(), that return iterators analogous to our begin and end pointers above. This is what you need to keep in mind from this section: Internally, these iterators have pointers that point to the elements in the vector's internal buffer.
a simpler way to iterate
Theoretically, you can use std::vector::begin() and std::vector::end() to iterate a vector like this:
std::vector<int> v{1,2,3,4,5};
for (std::vector<int>::iterator it = v.begin(); it != v.end(); ++it) {
std::cout << *it;
}
Note that, apart from the ugly type of it, this is exactly the same as our pointer example. C++ introduced the keyword auto, that lets us get rid of these ugly types, when we don't really need to know them:
std::vector<int> v{1,2,3,4,5};
for (auto it = v.begin(); it != v.end(); ++it) {
std::cout << *it;
}
This works exactly the same (in fact, it has the exact same type), but now we don't need to type (or read) that ugliness.
But, there's an even better way. C++ has also introduced range-based for:
std::vector<int> v{1,2,3,4,5};
for (auto it : v) {
std::cout << it;
}
the range-based for construct does several things for you:
It calls v.begin() and v.end() [2] to get the lower and upper bounds of the range we're going to iterate;
Keeps an internal iterator (let's call it i), and calls ++i on every step of the loop;
Dereferences the iterator (by calling *i) and stores it in the it variable for us. This means we do not need to dereference it ourselves (note how the std::cout << it line looks different from the other examples)
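The three steps can be written out by hand. The sketch below is a simplified picture of what the compiler generates, not the exact expansion, and both function names are made up for this example.

```cpp
#include <vector>

// Hand-written equivalent of `for (int x : v) total += x;`.
int sumWithExplicitIterators(const std::vector<int>& v) {
    int total = 0;
    std::vector<int>::const_iterator i = v.begin();
    const std::vector<int>::const_iterator last = v.end();  // fetched once, before the loop
    for (; i != last; ++i) {   // ++i on every step
        int x = *i;            // dereference into the loop variable
        total += x;
    }
    return total;
}

// The same loop written with range-based for.
int sumWithRangeFor(const std::vector<int>& v) {
    int total = 0;
    for (int x : v) total += x;
    return total;
}
```

Note that the end iterator is obtained once, before the first iteration. That detail matters as soon as the loop body changes the vector.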
putting it all together
Let's do a small exercise. We're going to iterate a vector of numbers and, for each odd number n, insert a new element equal to 2*n.
This is the naive approach we would probably think of at first:
std::vector<int> v{1,2,3,4,5};
for (int i : v) {
if (i%2==1) {
v.push_back(i*2);
}
}
Of course, this is wrong! Vector v will typically start with a capacity of 5. This means that, when we call push_back for the first time, it will allocate a new buffer.
If the buffer was reallocated, its address has changed. Then, what happens to the internal iterator that the range-based for is using to walk the vector? It no longer points into the buffer!
This is what we call iterator invalidation. Look at the reference for std::vector::push_back. At the very beginning, it says:
If the new size() is greater than capacity() then all iterators and references (including the past-the-end iterator) are invalidated. Otherwise only the past-the-end iterator is invalidated.
Once the range-based for tries to increment and dereference the now invalid pointer, bad things will happen.
There are several ways to avoid this. For instance, in this particular algorithm, I know that we can never insert more than n new elements. This means that the size of the vector can never go past 2n after the loop has ended. With this knowledge in hand, I can increase the vector's capacity beforehand:
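std::vector<int> v{1,2,3,4,5};
const std::size_t n = v.size();
v.reserve(n*2); // Increases the capacity of the vector to at least n*2.
// With enough capacity, push_back no longer reallocates, so iterators to the
// existing elements stay valid. It still invalidates the past-the-end
// iterator, which a range-based for caches, so the loop compares against a
// freshly computed v.begin() + n instead.
for (auto it = v.begin(); it != v.begin() + n; ++it) {
    if (*it % 2 == 1) {
        v.push_back(*it * 2);
    }
}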
If for some reason I don't know this information for a particular algorithm, I can use a separate vector to store the new elements, and then add them to our vector at the end:
std::vector<int> v{1,2,3,4,5};
std::vector<int> doubles;
for (int i : v) {
if (i%2==1) {
doubles.push_back(i*2);
}
}
// Reserving space is not necessary because the vector will allocate
// memory if it needs to anyway, but this does make things faster
v.reserve(v.size() + doubles.size());
// There's a standard algorithm (std::copy) that, when used in conjunction with
// std::back_inserter, does this for us, but I find that the code below is more
// readable.
for (int i : doubles) {
v.push_back(i);
}
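Another way to do the final append, offered only as a sketch, is the vector's own range insert, which copies the whole batch in one call. The function name appendDoublesOfOdds is made up for this example.

```cpp
#include <vector>

// Collects the doubles of all odd values first, then appends them in one call.
void appendDoublesOfOdds(std::vector<int>& v) {
    std::vector<int> doubles;
    for (int i : v) {                 // v is not modified inside this loop
        if (i % 2 != 0) {             // != 0 also catches negative odd values
            doubles.push_back(i * 2);
        }
    }
    // Range insert at the end: a single operation, at most one reallocation.
    v.insert(v.end(), doubles.begin(), doubles.end());
}
```

Because the insert happens after the loop has finished, no iterator into v is alive when the buffer may move.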
Finally, there's the plain old for, using an index to iterate. The index cannot be invalidated because it is just a position, not a pointer into the internal buffer:
std::vector<int> v{1,2,3,4,5};
for (std::size_t i = 0; i < v.size(); ++i) {
    if (v[i]%2==1) {
        v.push_back(v[i]*2); // appended values are even, so the loop terminates
    }
}
Hopefully by now, you understand the advantages and drawbacks of each method. Happy studies!
[1] How much larger depends on the implementation. Common implementations grow the buffer by a constant factor, typically 1.5 or 2 times the current capacity.
[2] This is a small lie. The whole story is a bit more complicated: for a class type that has begin and end members, such as std::vector, the loop calls v.begin() and v.end(). Otherwise it looks up the free functions begin(v) and end(v) by argument-dependent lookup, and for plain arrays it uses the array bounds directly. All of this machinery is there to ensure that the range-based for works not only with standard containers, but also with anything that provides a proper begin and end, as well as regular plain arrays.
Here is a quick code snippet that uses iterators to iterate over the vector:
#include<iostream>
#include<iterator> // for iterators to include
#include<vector> // for vectors to include
using namespace std;
int main()
{
vector<int> ar = { 1, 2, 3, 4, 5 };
// Declaring iterator to a vector
vector<int>::iterator ptr;
// Displaying vector elements using begin() and end()
cout << "The vector elements are : ";
for (ptr = ar.begin(); ptr < ar.end(); ptr++)
cout << *ptr << " ";
return 0;
}
Article to read more: Iterate through a C++ Vector using a 'for' loop.
Hope it will help.
Try this,
#include<iostream>
#include<vector>
int main()
{
std::vector<int> vec(5);
for(int i=0;i<10;i++)
{
if(i<vec.size())
vec[i]=i;
else
vec.push_back(i);
}
for(int i=0;i<vec.size();i++)
std::cout<<vec[i];
return 0;
}
Output:
0123456789

How to "delete" a part of an array and keep the rest without running through it?

I am trying to implement an algorithm in C++.
In the pseudocode, there is this: w ←w[0..e], where w is an array of characters and e is an integer. Basically I want to keep a part of the array and discard the rest.
Just to make the program working, I have used a for loop, where I scan through the original array up to e and I copy the values in a new array.
char newArray[sizeIAlreadyKnow];
for (int i=0;i<e;i++)
newArray[i] = w[i];
I know this is not efficient; is there a way to avoid iterating through the original array?
Also I am not very familiar with vectors. Do they have a functionality for this?
Thank you in advance
You can use std::string::resize. The basic idea is to use std::string instead of raw arrays of char. Correspondingly, things also become much easier and safer by using std::vector<T> instead of raw arrays of T.
You're right, you should really use vectors!
A lot of documentation is available here, and there are also a lot of good tutorials on C++ and the std containers (ask Google for some of those).
Concerning your question, here is what vectors can do. To create a copy:
std::vector<char> myArray;
// fill your array, do whatever work you want with it
std::vector<char> newArray(myArray.begin() + start, myArray.begin() + end);
or, in your case, to resize in place:
std::vector<char> myArray;
// fill your array, do whatever work you want with it
myArray.resize(e);
Each of the vector methods listed there comes with an example. Reading those might help you a lot with the implementation of your algorithm.
If you ever need it, more can be done (like sorting) using the algorithm library on a vector (or any other std container).
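A minimal sketch of the resize approach for both std::string and std::vector<char>. The helper name keepPrefix is made up for this example, and it assumes that w[0..e] means "the first e characters", which is what the loop in the question copies.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Keeps the first e characters of w and discards the rest, in place.
void keepPrefix(std::string& w, std::size_t e) {
    assert(e <= w.size());  // this sketch only shrinks, it never grows
    w.resize(e);
}

// Same idea for a vector of characters.
void keepPrefix(std::vector<char>& w, std::size_t e) {
    assert(e <= w.size());
    w.resize(e);            // drops the tail; the kept elements are not copied
}
```

Shrinking this way does not touch the elements that are kept, so there is no loop over the first e characters.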
What you're asking is not possible with C++ builtin arrays or std::vector out of the box.
In the D programming language, it is possible. If you scroll down to the section labelled Introducing Slices in the link below, you'll find an explanation of how it's possible. In short, it can't be done without garbage collection. You can't free an array in C++ by calling delete on a pointer to the middle of it. So if you tried to slice the middle out of an array, then ditched the old pointer, you would have no way to free the memory, and your program would leak.
http://dlang.org/d-array-article.html
Now, while it's not possible using a language construct, it is possible in a number of other ways.
Of course, there is the obvious, as stated by Amxx: you can simply copy the segment of the array you want into a new array or vector. However, if you're concerned about performance, this is not the best way. The vector constructor Amxx is using will still loop over all the elements and copy them, even though you can't see it.
For a more efficient solution, there are C++ iterators. If you have a function that you want to work on a subset of an array, you can make your function accept iterators instead of an array or a vector.
For example:
int sumElements(vector<int>::iterator first, vector<int>::iterator last)
{
int sum = 0;
for( ; first != last; ++first)
sum += *first;
return sum;
}
vector<int> numbers;
for(int i = 0; i < 100; ++i)
numbers.push_back(i);
int sum = sumElements(numbers.begin() + 10, numbers.begin() + 20);
There are also things like a string_view:
http://en.cppreference.com/w/cpp/experimental/basic_string_view
string_view is a non-owning reference to a slice of a string, but instead of having to deal with a pair of iterators, you can just treat it like the object that it is a slice of. Internally, it just stores pointers to the original string. The caveat though, is that since string_view is a non-owning reference, the original string's lifetime must outlast that of any string_view pointing at it.
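The same thing could also be done with a vector. At the time this was written there was nothing in the standard library for it (and string_view itself was still experimental); std::string_view has since been standardized in C++17 and std::span in C++20.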
I suppose you could do something like this:
template<class T>
class vector_view
{
    typename vector<T>::iterator first;
    typename vector<T>::iterator last;
public:
    vector_view(typename vector<T>::iterator first, typename vector<T>::iterator last)
        : first(first), last(last) { }
    const T& operator[](size_t i) const {
        assert(i < size()); // needs <cassert>
        return *(first + i);
    }
    size_t size() const {
        return last - first;
    }
};
vector<int> numbers;
// ... init numbers with 100 numbers
vector_view<int> vv(numbers.begin() + 5, numbers.begin() + 32);
int number = vv[10];
I would probably just stick to vectors and iterators to keep things simple, but there you have it.
edit: similar ideas to the one above are discussed in this proposal for C++ ranges:
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4128.html

Access violation reading - vector of string pointers to value in vector of strings

I'm not a very experienced C++ programmer and I have a problem I can't resolve. The project I'm working on is quite big, so I can't post all of the code here; it would be too much code and too much explanation. I'm posting only the small part that causes the problem, and I hope it is enough. Sorry for the length of my question, but I want to explain all of the posted code. Maybe this part is not enough to solve the problem, but I want to try.
First I have a struct called "record":
struct record {
vector<string> dataRow;
vector<string *> keys;
vector<string *> values;
void setDataRow(vector<string> r) {
dataRow = r;
}
};
Some of the strings are marked as keys and others as values. For later processing it is better for me to have all string data in one vector, which is why I don't have two vectors of strings (vector<string> keys, vector<string> values).
Then I have this:
vector< vector<record> > resultSet;
vector<record> is like a data table: a set of lines with string data. I need a specific number of these tables, hence the vector of vectors of records. The number of tables is variable, so when I set the table count I prepare the tables with the reserve function:
resultSet.reserve(count);
for(unsigned int i = 0; i < count; i++) {
vector<record> vec;
resultSet.push_back(vec);
}
When I want to add a new record to resultSet, I know the number of the table into which I need to insert it. After resultSet[number].push_back(rec) I need to update the pointers in the vectors "keys" and "values", because push_back() creates a new copy of "rec" with the values of "dataRow" at other memory addresses, right? So I have this function, which does the push_back and updates the pointers:
void insert(int part, vector<string> & dataRow) {
record r;
r.setDataRow(dataRow);
resultSet[part].push_back(r);
int pos = resultSet.size() - 1; // position of last record
resultSet[part].at(pos).values.clear();
resultSet[part].at(pos).keys.clear();
for(unsigned int i = 0; i < dataRow.size(); i++) {
record * newRec = &resultSet[part].at(pos);
if(isValue(dataRow[i])) {
newRec->values.push_back(&(newRec->dataRow.at(i)));
// control cout...
} else {
newRec->keys.push_back(&(newRec->dataRow.at(i)));
// control cout...
}
}
}
This is working. After push_back in newRec I did control cout of inserted pointers and their referenced values, and everything was ok.
But! After some inserts I call the function processData(resultSet), which has to process all data in resultSet. Before implementing the processing of data, I just wanted to print all keys as a check that everything is all right. This code:
for(unsigned int i = 0; i < resultSet.size(); i++) {
for(unsigned int j = 0; j < resultSet[i].size(); j++) {
cout << "keys: ";
for(unsigned int k = 0; k < resultSet[i].at(j).keys.size(); k++) {
cout << *resultSet[i].at(j).keys.at(k) << ", ";
}
cout << endl;
}
}
is bad (the same problem occurs when printing the values vector of a record). It throws an "Access violation reading" exception. I know that this exception is thrown when I try to read inaccessible memory, right? Please tell me what mistake I have made in the code above, because I really don't know why it doesn't work. Before processing resultSet I do nothing with it except a number of inserts.
Thank you for reading and possible answers.
When you add an entry to a std::vector, all existing pointers to elements in that vector should be considered invalid.
Here is the code that is going wrong.
vector<string> dataRow;
vector<string *> keys;
vector<string *> values;
If keys and values point to the strings in dataRow they will become invalid when dataRow grows.
If I have understood your question correctly, the reason for all this is a fundamental misconception in the way vectors behave.
Your code stores pointers in a vector, and those pointers point to memory locations allocated by another vector. That would be fine if the vectors didn't change.
The reason for this is that a std::vector is a container that makes a guarantee - all the data it contains will be allocated in a contiguous block of memory.
Now, if you insert an element into a vector, it may move memory locations around. Hence, one of the things you should know is that iterators need to be considered invalid when a vector changes. Iterators are sort of a generalized pointer. In other words, pointers to the locations of elements inside a vector become invalid too.
Now, let's say you updated all your pointers, everywhere, when any of the vectors involved changed. You would then be fine. However, you've now got a bit of an uphill battle on your hands.
As you've said in your comments, you're using pointers because you want efficiency. Your struct is essentially a collection of three strings. Instead of using your own struct, typedef a std::tuple (you will need a C++11 compiler) of 3 std::strings.
Finally, when you need to access the data within, do so by const reference and const_iterator unless you need to modify any of it. This will ensure that
You don't have duplication of data
You're making maximum use of the STL, thereby minimizing your own code and the possible bugs
You're relying on algorithms and containers that are already really efficient
You're using the STL in a way it was meant to be used.
Hope this helps.
One possible problem could be in copies of record instances.
struct record
{
vector<string> dataRow;
vector<string *> keys;
vector<string *> values;
};
In fact, default copy constructor and copy operator= do a member-wise copy. This is OK for dataRow field (which is a vector<string>), but this is bad for keys and values fields (since these are vectors of raw pointers, their values are copied, but they point to something wrong).
I'd reconsider your design, e.g. using vector<int> instead of vector<string *> for keys and values fields. The ints stored would be indexes in the dataRow vector.
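A minimal sketch of that index-based design, assuming C++11. The isValue rule below is only a stand-in for the question's function of the same name (its real logic is not shown in the question), and the key/value accessors are names invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Stand-in for the question's isValue(); the "starts with v" rule is an assumption.
bool isValue(const std::string& s) { return !s.empty() && s[0] == 'v'; }

struct record {
    std::vector<std::string> dataRow;
    std::vector<std::size_t> keys;    // indexes into dataRow
    std::vector<std::size_t> values;  // indexes into dataRow

    explicit record(const std::vector<std::string>& row) : dataRow(row) {
        for (std::size_t i = 0; i < dataRow.size(); ++i) {
            (isValue(dataRow[i]) ? values : keys).push_back(i);
        }
    }

    // Indexes survive copies of the record, so these lookups stay valid.
    const std::string& key(std::size_t k) const {
        assert(k < keys.size());
        return dataRow[keys[k]];
    }
    const std::string& value(std::size_t k) const {
        assert(k < values.size());
        return dataRow[values[k]];
    }
};
```

With this layout a record can be copied or moved by push_back freely, and no fix-up pass over keys and values is needed after insertion.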
Another note (not directly related to your problem).
In C++11, when you want to copy something, you may want to pass by value, and move from the value:
void setDataRow(vector<string> r)
{
dataRow = std::move(r);
}
Or just use old C++98/03 style of passing by const ref:
void setDataRow(const vector<string>& r)
{
dataRow = r;
}

C++ reorder std::vector elements using std::list of pointers

I ran into this problem when I tried to write a new algorithm to reorder elements in a std::vector. The basic idea is that I have a std::list of pointers pointing into the std::vector in such a way that *list.begin() == vector[0], *(++list.begin()) == vector[1] and so on.
However, any modifications on list's element positions breaks the mapping. (Including appended pointers) When the mapping is broken the list's elements can be in random order but they point still into correct elements on vector. The task would be to reorder the elements in vector to correct the mapping.
Simplest method to do it (How I have done it now):
create new empty std::vector and resize it to equal size of the old vector.
iterate through the list and read elements from the old vector and write them into new vector. Set the pointer to point into new vector's element.
swap vectors and release the old vector.
Sadly the method is only useful when I need more capacity in the vector. It's inefficient when the current vector holding the elements has enough capacity to store all incoming elements. Appended pointers in the list will point into a different vector's storage. The simple method works for this because it only reads from the pointers.
So I would like to reorder the vector "in place" using a constant amount of memory. Any pointer that was not pointing into the current vector's storage is moved to point into the current vector's storage. Elements are simple structures. (PODs)
I'll try post an example code when I have time..
What should I do to achieve this? I have the basic idea done, but I'm not sure if it is even possible to do the reordering with constant amount of memory.
PS: I'm sorry for the (possibly) bad grammar and typos in the post. I hope it's still readable. :)
First off, why do you have a list of pointers? You might as well keep indices into the vector, which you can compute as std::distance(&v[0], *list_iter). So, let's build a vector of indices first, but you can easily adapt that to use your list directly:
std::vector<T> v; // your data
std::list<T*> perm_list; // your given permutation list
std::vector<size_t> perms;
perms.reserve(v.size());
for (std::list<T*>::const_iterator it = perm_list.begin(), end = perm_list.end(); it != end; ++it)
{
perms.push_back(std::distance(&v[0], *it));
}
(There's probably a way to use std::transform and std::bind, or lambdas, to do this in one line.)
Now to do the work. We simply use the cycle decomposition of the permutation, walking each cycle exactly once:
std::set<size_t> done;
for (size_t i = 0; i < perms.size(); ++i)
{
    if (done.count(i)) continue; // already placed as part of an earlier cycle
    T tmp = v[i];
    size_t j = i;
    // Position j must receive the element currently at perms[j].
    while (perms[j] != i)
    {
        v[j] = v[perms[j]];
        done.insert(j);
        j = perms[j];
    }
    v[j] = tmp; // close the cycle
    done.insert(j);
}
I'm using the auxiliary set done to track which indices have already been permuted. In C++11 you would add std::move everywhere to make this work efficiently with movable types.