How unique() function for array works - c++

int a[4] = {3,1,2,3};
int n = 4; // number of elements in a
sort(a,a+n);
int j = unique(a,a+n) - a; // j=3
In this code, the variable j ends up holding the number of unique elements in the array a. But I couldn't understand how this code works.
I know that in lists,
list::unique() is an inbuilt function in C++ STL which removes all duplicate consecutive elements from the list. It works only on sorted lists.

std::unique() shifts elements within the range [a+0, a+n) so that the first element of every run of equal consecutive values ends up at the front, and it returns an iterator into that range that marks the new "end" of the array, i.e. one past the last unique element.
If you then subtract from that iterator the beginning iterator, which you do with unique(a,a+n) - a;, you get the number of elements that are between the start of the array and the new "end". This is how you are able to get the count of the unique elements.
It should be noted that I say "end" here because arrays have a fixed size. You aren't actually changing the size of the array at all; you are just moving the unique elements to the front of the array.
It should also be noted that after this happens, everything at and after the iterator returned by unique() has an unspecified value. It is legal to assign new values to those elements, but you should not rely on whatever they currently hold.
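The sort-then-unique pattern described above can be wrapped up as a minimal sketch. The helper name countDistinct is made up for illustration and is not part of the standard library:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical helper: sorts the first n elements of a and returns how many
// distinct values they contain. The distinct values end up in a[0..count).
std::size_t countDistinct(int* a, std::size_t n)
{
    assert(a != nullptr);
    std::sort(a, a + n);                   // equal values become adjacent
    int* newEnd = std::unique(a, a + n);   // keeps the first of each run
    return static_cast<std::size_t>(newEnd - a);  // distance to the new "end"
}
```

The array itself still has n elements afterwards; only the first count of them are meaningful.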

Related

Issues inserting an element to an array when the index is at the end and an empty array

First time posting something here. Hope I'm doing it right.
If not let me know.
Here's the issue:
I'm trying to insert an element into a given array.
I added a for loop that checked to see if the numOfElem was equal to zero then a[0] would equal elem, but that didn't help either.
void insertAtIndex(int a[], int numOfElem, int elem, int index)
{
for (int i = numOfElem; i > index; i--)
{
a[i] = a[i-1];
}
a[index] = elem;
numOfElem++;
}
Test Cases:
1:
Initial Array: No elements in the array.
Insert 10 at index 0...
Modified array: No elements in the array.
2:
Initial Array: 1
Insert 20 at index 0...
Modified array: 20
/As you can see here it added the 20 to the correct index, but it deleted the 1 instead of shifting it to the right./
3
Initial Array: 3
Insert 30 at index 1...
Modified array: 3
/It did nothing to this test case. Whenever the element that has to be added is at the end it does not add it, it returns the initial array with no change./
In summary, whenever the index of the element that I want to insert would be at the end of the modified array, or the array is empty, it will not make any changes to the array.
Any tips/advice help. Thank you in advance.
There are several issues with the given example.
You tagged the question with C++, but it is pure C code (see also below)
You are using plain C-Style arrays
You should use C++ STL containers
You need to understand how to pass parameters to a function
You need to read about the "decay to pointer" topic
You must understand how indices in arrays are counted
The last bullet point is the reason why your program does not work. Array indices start at 0. So, if you have an array with 3 elements, the valid indices are 0, 1 and 2, and under NO circumstances 3.
So, if the array holds exactly numOfElem elements, the loop starts at i = numOfElem and a[i] = a[i-1]; writes to out-of-bounds memory (as does a[index] = elem; when index is equal to numOfElem). This is a major bug and may lead to a catastrophe.
Then, you pass the parameter numOfElem by value. This means the compiler makes a copy of the variable and uses the copy in the function. numOfElem++; works on that local copy and does not increase the caller's variable. A good compiler will warn you. This statement is a "no operation". Please always compile with ALL warnings enabled.
And I hope, but cannot see it, that you do not expect your array to grow dynamically by incrementing a variable.
Many STL containers have insert functions that will do the job.
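As a rough sketch of that suggestion, here is what the function could look like with std::vector in place of a C-style array. The signature is an assumption made for illustration; it is not the asker's original interface:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical replacement for the question's function: the vector tracks
// its own size, so no separate element count is passed or updated.
void insertAtIndex(std::vector<int>& v, int elem, std::size_t index)
{
    assert(index <= v.size());   // index == v.size() appends at the end
    // insert() shifts the later elements to the right and grows the vector
    v.insert(v.begin() + static_cast<std::ptrdiff_t>(index), elem);
}
```

Because the vector is passed by reference, the caller sees both the new element and the new size, which covers the empty-array and insert-at-end test cases.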

Inserting a sub-array into a vector

The following statement inserts part of an array into an empty vector. It then prints the last element inserted, which is 14 in this case. My question is, how is the final array element that is inserted being determined with this syntax? How is "myArray+3" returning the third element in the array to the function?
vector <int> myVector(10);
int myArray[5] = {3,9,14,19,94};
myVector.insert(myVector.begin(), myArray, myArray+3);
cout << myVector.at(2) << endl;
For starters the vector is not empty. It has 10 elements initialized by zeroes.
vector <int> myVector(10);
As for these arguments
myArray, myArray+3
they specify a half-open range in the array in the following way
[&myArray[0], &myArray[3])
That means that the elements pointed to by these pointers
&myArray[0], &myArray[1], &myArray[2]
will be included in the vector. That is, the second value of the range is exclusive: only the elements before it are inserted.
The element pointed to by the pointer &myArray[3] (that is, by the pointer myArray + 3) will not be inserted into the vector.
Compare, for example: if an array has N elements, then the range of acceptable indices for its elements is
[0, N-1]
which can also be specified as
[0, N)
Arrays in C++ are laid out in a contiguous fashion, so that the address of the array is the same as the address of the first element of the array, followed by the address of the next, and the next, etc.
Now when you do myArray + 3, this is actually saying, "Start at the first element and move three elements forward", which gives a pointer to the element at index 3.
So if you had done (myArray + 1) + 3, this would mean first moving from the first position to the second and then, using that new position as a reference point, moving three positions forward from there.
How does it know where to go? Simply by taking the size in bytes of a single element of the array and multiplying that by the distance you wanted to move forward, and then adding this value to the address of the reference position, and voila! You have gotten to the nth element of the array.
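To make the half-open range and the pointer arithmetic concrete, here is a small sketch. The helper name copyPrefix is invented for this example:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: copies the half-open range [first, first + count)
// into a new vector. The element at first + count itself is not copied.
std::vector<int> copyPrefix(const int* first, std::size_t count)
{
    assert(first != nullptr);
    // Pointer arithmetic moves in whole elements: the address advances by
    // count * sizeof(int) bytes.
    const int* last = first + count;
    return std::vector<int>(first, last);
}
```

With the array from the question, copyPrefix(myArray, 3) holds 3, 9 and 14, while *(myArray + 3) is 19, the first element left out of the range.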

erasing an element from a string vector

In my program I have created a vector of string type (vector<string> names;).
After putting some values in it, I came in the situation where I wish to erase an element from it. I know that I can do this by typing: names.erase(<pointer to the element to be erased>);
However, the only thing I know is that I wish to erase element i (i is a counter in a loop). The starting position (pointer) of the i'th element is unknown, because the vector holds strings. If it were an int vector I could do:
names.erase(names.begin()+i*sizeof(int))
Would someone please explain how I can find the position in memory of the i'th element, or generally how I can erase the i'th element without knowing its position.
It doesn't matter about the size of the elements. names.begin() + i gives you an iterator to the ith element of the vector. You don't move an iterator along in byte steps - you move it along an element at a time.
You definitely should not be doing names.begin() + i * sizeof(int) if you have a vector of ints. And even if it were the case that you had to add the size in bytes like this, the size of a std::string object is always fixed, regardless of the length of the string. That is sizeof(std::string) is a constant value. In fact, the size of any type is fixed in C++.
You definitely should use iterators to manipulate vector.
The simplest way to locate i'th element is:
std::vector<string>::iterator l_it(names.begin());
l_it += i;
Also be careful with erasing, because std::vector::erase shifts the remaining elements down (so their indexes change) and invalidates iterators at or after the erased position.
http://www.cplusplus.com/reference/vector/vector/erase/
You need to use an iterator, as below:
vector<string>::iterator lIter = lStrVec.begin();
lIter = (lIter + (i-1));
lStrVec.erase(lIter);
Note that if you need to erase the i-th element counting from 1, you move the iterator forward by i-1.
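Putting the answers together, a minimal sketch of erasing by zero-based index might look like this. The helper name eraseAt is an assumption for illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: erases the element at zero-based index i.
void eraseAt(std::vector<std::string>& names, std::size_t i)
{
    assert(i < names.size());
    // begin() + i steps over i whole elements; no sizeof arithmetic is needed
    names.erase(names.begin() + static_cast<std::ptrdiff_t>(i));
}
```

After the call, every element that followed position i has moved down by one index.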

Erasing elements in a vector

So I have a vector of unsigned ints (the vector<unsigned int> is called vector1). I have another vector of a struct I created (the vector<struct> is called vector2). vector1 holds integers that are indexes into vector2. For example, let's say that vector1 = {5, 17, 18, 19}. That means vector2.at(5) == vector2.at(vector1.at(0)).
In the struct, I have a bool variable called var. In most cases, var is false. I want to delete all of the elements in vector1 that have var = true.
What I did was:
for (unsigned int i = 0; i < vector1.size(); i++)
{
if (vector2.at(vector1.at(i)).var)
vector1.erase(vector1.begin() + i);
}
The only problem with this is that it does not delete all of the true elements. I have to run the for loop multiple times for all values to be deleted. Is this the correct behavior? If it is not, where did I go wrong?
You have to use the erase-remove idiom to delete elements from a vector.
v.erase(std::remove(v.begin(), v.end(), value), v.end());
std::remove shifts the elements to keep to the front of the vector and returns the new logical end; erase then erases everything from that point up to the old end.
You can keep a temporary vector, copy of vector1 and iterate over it in the for loop and delete from vector1.
You are erasing elements in the vector while at the same time iterating over it. So when erasing an element, you always jump over the next element, since you increase i while having just shortened the vector at i (it would be even worse had you used a proper iterator loop instead of an index loop). The best way to do this would be to separate both operations, first "marking" (or rather reordering) the elements for removal and then erasing them from the vector.
This is in practice best done using the erase-remove idiom (vector.erase(std::remove(...), vector.end())), which first uses std::remove(_if) to reorganize the data with the non-removed elements at the beginning and returns the new end of the range, which can then be used to really delete those removed elements from the range (effectively just shortening the whole vector), using std::vector::erase. Using a C++11 lambda, the removal condition can be expressed quite easily:
vector1.erase(std::remove_if( //erase range starting here
vector1.begin(), vector1.end(), //iterate over whole vector
[&vector2](unsigned int i) //call this for each element
{ return vector2.at(i).var; }), //return true to remove
vector1.end()); //erase up to old end
EDIT: And by the way, as always be sure if you really need std::vector::at instead of just [] and keep in mind the implications of both (in particular the overhead of the former and "maybe insecurity" of the latter).
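For a version that compiles on its own, here is a sketch with a stand-in struct. Item and dropFlagged are hypothetical names, and the struct is assumed to contain only the bool from the question:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for the question's struct.
struct Item {
    bool var;
};

// Removes from indices every index whose Item has var == true.
void dropFlagged(std::vector<unsigned int>& indices, const std::vector<Item>& items)
{
    auto newEnd = std::remove_if(indices.begin(), indices.end(),
        [&items](unsigned int i) {
            assert(i < items.size());   // each index must refer to a real Item
            return items[i].var;        // true means "remove this index"
        });
    indices.erase(newEnd, indices.end());   // drop the leftover tail
}
```

Two flagged indices sitting next to each other are both removed in a single pass, which is exactly the case the index loop skips.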

C++ vector - push_back

In the C++ Primer book, Chapter (3), there is the following for-loop that resets the elements in the vector to zero.
vector<int> ivec; //UPDATE: vector declaration
for (vector<int>::size_type ix = 0; ix != ivec.size(); ++ix)
ivec[ix] = 0;
Is the for-loop really assigning 0 values to the elements, or do we have to use the push_back function?
So, is the following valid?
ivec[ix] = ix;
Thanks.
Is the for-loop really assigning 0 values to the elements? Or, we have to use the push_back function?
ivec[ix] = 0 updates the value of an existing element in the vector, while the push_back function adds a new element to the vector!
So, is the following valid?
ivec[ix] = ix;
It is perfectly valid IF ix < ivec.size().
It would be even better if you use iterator, instead of index. Like this,
int ix = 0;
for(vector<int>::iterator it = ivec.begin() ; it != ivec.end(); ++it)
{
*it = ix++; //or do whatever you want to do with "it" here!
}
Use of iterator with STL is idiomatic. Prefer iterator over index!
Yes, you can use the square brackets to retrieve and overwrite existing elements of a vector. Note, however, that you cannot use the square brackets to insert a new element into a vector, and in fact indexing past the end of a vector leads to undefined behavior, often crashing the program outright.
To grow the vector, you can use the push_back, insert, resize, or assign functions.
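The difference between growing a vector and overwriting its elements can be seen in a short sketch. The function name makeIndexVector is made up for this example:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns a vector holding 0, 1, ..., n-1.
std::vector<int> makeIndexVector(std::size_t n)
{
    std::vector<int> ivec;        // empty: ivec[0] would be out of range here
    for (std::size_t k = 0; k != n; ++k)
        ivec.push_back(0);        // push_back grows the vector by one element
    assert(ivec.size() == n);

    // Now every index below ivec.size() is valid, so operator[] may overwrite.
    for (std::vector<int>::size_type ix = 0; ix != ivec.size(); ++ix)
        ivec[ix] = static_cast<int>(ix);
    return ivec;
}
```

In real code the first loop would more likely be std::vector<int> ivec(n); it is spelled out here only to show that push_back, not indexing, is what creates the elements.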
Using the array brackets the vector object acts just like any other simple array. push_back() increases its length by one element and sets the new/last one to your passed value.
The purpose of this for loop is to iterate through the elements of the vector.
Starting at element 0 (when ix is 0) up to the last element (when ix is ivec.size() - 1).
On each iteration the current element of the vector is set to 0.
This is what the statement
ivec[ix] = 0;
does. Putting
ivec[ix] = ix;
in the for loop would set all the elements of the vector to their position in the vector. i.e, the first element would have a value of zero (as vectors start indexing from 0), the second element would have a value of 1, and so on and so forth.
Yes, assuming ix is a valid index, which it most likely is. Note, though, that you have a vector of int while the index is a size_type, so the assignment converts an unsigned value to a signed one. Of course you may want to purposely store -1 sometimes to show an invalid index, so the conversion of unsigned to signed would be appropriate, but then I would suggest using a static_cast.
Doing what you are doing (setting each value in the vector to its index) is a way to create indexes of other collections. You then rearrange your vector by sorting it with a predicate based on the other collection.
Assuming that you never overflow (highly unlikely if your system is 32 bits or more) your conversion should work.