This is undefined behavior:
std::vector<int> v;
int const * a = &v[0];
My goal is to avoid the UB. vector::data() would work, but I need a solution that does not require C++11 or later.
For example, if I were to allocate some memory with vector::reserve, would it work?
v.reserve(1);
int const * a = &v[0];
Clarification:
The vector is not changed after the point I take the pointer and the vector may be empty or contain data.
Just perform the check inside a conditional operator:
int const * a = v.empty() ? NULL : &v[0];
This has the added benefit over data() that you can check from the pointer itself whether the vector was empty: if it was, a is null.
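If you need this in more than one place, the check can be wrapped in a small function. This is only a sketch: data_or_null is a made-up name, not a standard facility, and it sticks to C++03 syntax. The demo function also illustrates that reserve() alone does not make &v[0] valid, because the vector is still empty afterwards.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of the standard library): returns a pointer
// to the first element, or NULL when the vector has no elements.
template <typename T>
const T* data_or_null(const std::vector<T>& v) {
    return v.empty() ? NULL : &v[0];
}

inline void demo_data_or_null() {
    std::vector<int> v;
    assert(data_or_null(v) == NULL);   // empty: v[0] is never evaluated

    v.reserve(1);                      // changes capacity only, size stays 0
    assert(data_or_null(v) == NULL);

    v.push_back(42);                   // now there is a first element
    assert(data_or_null(v) == &v[0]);
    assert(*data_or_null(v) == 42);
}
```

As in the one-liner above, the caller can tell from the pointer itself whether the vector was empty.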
Pointers to a vector's elements stay valid only until the storage is reallocated. Relying on reserve like that is dangerous: reserve changes only the capacity, so v[0] still does not exist, and a later push_back() may reallocate and invalidate your pointer.
If you want a better story, consider that even iterators may be invalidated with push_back and erase... why would a pointer still remain valid at all?
Related
As we know, a std::vector created as std::vector<T> vect(n), or resized with empty_vect.resize(n), not only allocates the required amount of memory but also value-initializes the elements (i.e. calls the default constructor, or zeroes built-in types). This leads to unnecessary initialization, especially if I have an array of integers that I'd like to fill with specific values that cannot be provided via any vector constructor.
reserve, on the other hand, only allocates the memory, as in empty_vect.reserve(n), but the vector is still empty: size() returns 0, empty() returns true, and operator[] is out of range for every index (at() would throw).
Now, please look into the code:
{ // My scope starts here...
std::vector<int> vect;
vect.reserve(n);
int *data = vect.data();
// Here I know the size 'n' and I also have data pointer so I can use it as a regular array.
// ...
} // Here ends my scope, so vector is destroyed, memory is released.
The question is if "so I can use it as array" is a safe assumption?
The arguments for doing it this way are beside the point, I am just curious about the question above. For what they are worth:
It allocates memory and automatically frees it on any return from the function
The code does not perform unnecessary data initialization (which may affect performance in some cases)
No, you cannot use it.
The standard (current draft, equivalent wording in C++11) says in [vector.data]:
constexpr T* data() noexcept;
constexpr const T* data() const noexcept;
Returns: A pointer such that [data(), data() + size()) is a valid range.
For a non-empty vector, data() == addressof(front()).
You don't have any guarantee that you can access through the pointer beyond the vector's size. In particular, for an empty vector, the last sentence doesn't apply and so you cannot even be sure that you are getting a valid pointer to the underlying array.
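If the goal is mainly to avoid the up-front initialization, one option that stays within the rules is to reserve() and then push_back() the real values. A minimal sketch follows; make_squares and the squares it stores are just an illustration, not something from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: builds n values without value-initializing them first.
// reserve() performs one allocation; each push_back() then constructs an
// element inside that storage and grows size(), so every access is in range.
std::vector<int> make_squares(std::size_t n) {
    std::vector<int> v;
    v.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(static_cast<int>(i * i));
    }
    return v;
}

inline void demo_make_squares() {
    std::vector<int> v = make_squares(4);
    assert(v.size() == 4);
    const int* p = v.data();   // [p, p + v.size()) is a valid range
    assert(p[0] == 0 && p[3] == 9);
}
```

The pointer from data() is only used once the elements exist, which is exactly the range the quoted wording guarantees.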
There is currently no way to use std::vector with default-initialized elements.
As mentioned in the comments, you can use std::unique_ptr instead (requires #include <memory>):
auto data = std::unique_ptr<int[]>{new int[n]};
which will give you a std::unique_ptr<int[]> smart pointer to a dynamically sized array of ints, which will be destroyed automatically when the lifetime of data ends and which can transfer its ownership via move operations.
It can be dereferenced and indexed directly with the usual pointer syntax, but does not allow direct pointer arithmetic. A raw pointer can be obtained from it via data.get().
It does not offer you the std::vector interface, though. In particular it does not provide access to its allocation size and cannot be copied.
Note: I made a mistake in a previous version of this answer. I used std::make_unique<int[]> without realizing that it actually also performs value-initialization (initialize to zero for ints). C++20 provides std::make_unique_for_overwrite<int[]> (proposed under the name std::make_unique_default_init), which default-initializes (and therefore leaves ints with indeterminate values).
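To make the unique_ptr variant a bit more concrete, here is a small sketch in C++11. The function name make_filled and the fill value are placeholders of mine. The array is default-initialized by new int[n], so every element is written before it is read.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>

// Hypothetical helper: allocates n ints without zeroing them, then fills them.
std::unique_ptr<int[]> make_filled(std::size_t n, int value) {
    std::unique_ptr<int[]> data(new int[n]);   // elements are indeterminate here
    for (std::size_t i = 0; i < n; ++i) {
        data[i] = value;                       // indexing works like a raw array
    }
    return data;                               // ownership moves to the caller
}

inline void demo_make_filled() {
    std::unique_ptr<int[]> a = make_filled(3, 5);
    int* raw = a.get();                        // raw pointer, still owned by a
    assert(raw[0] == 5 && a[2] == 5);

    std::unique_ptr<int[]> b = std::move(a);   // transfer ownership
    assert(a.get() == nullptr);
    assert(b.get() == raw);
}   // b releases the array with delete[] here
```

Note that the size has to be tracked separately, since the smart pointer does not store it.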
I've been reading through the FAQ at isocpp.org and came across the caution that with an std::vector:
std::vector<int> v;
auto a = &v[0]; // Is undefined behaviour but
auto a = v.data(); // Is safe
From the actual site:
void g()
{
std::vector<Foo> v;
// ...
f(v.begin(), v.size()); // Error, not guaranteed to be the same as &v[0]
↑↑↑↑↑↑↑↑↑ // Cough, choke, gag; use v.data() instead
}
Also, using &v[0] is undefined behavior if the std::vector or
std::array is empty, while it is always safe to use the .data()
function.
I'm not sure I've understood this exactly. ::data() returns a pointer to the beginning of the array, and &[0] returns the address of the beginning. I'm not seeing the difference here, and I don't think that &[0] is dereferencing anything (i.e., is not reading the memory at element 0). On Visual Studio, accessing subscript [0] in a debug build results in a failed assertion, but in release mode it doesn't say anything. Also, the addresses in both cases are 0 for the default-constructed vector.
Also I don't understand the comment about ::begin() not guaranteed to be the same as ::operator[0]. I assumed that for a vector the raw pointer in the begin() iterator, ::data(), and &[0] were all the same value.
I'm not seeing the difference here
&v[0] is the same as &(v[0]), i.e. it takes the address of the first element of v. But when v is empty there are no elements at all, so v[0] itself is UB: it tries to return a non-existent element, and taking the address of that makes no sense.
v.data() is always safe. It will return the pointer to the underlying array directly. When v is empty the pointer is still valid (it might be null pointer or not); but note that dereferencing it (like *v.data()) leads to UB too, the same as v[0].
Also I don't understand the comment about ::begin() not guaranteed to be the same as ::operator[0]
std::vector::begin will return an iterator with type std::vector::iterator, which must satisfy the requirement of RandomAccessIterator. It might be a raw pointer, but it doesn't have to be. It's acceptable to implement it as a class.
Your example is missing one piece of information that would make it easier to follow: the declaration void f(Foo* array, unsigned numFoos);. The result of calling .begin() on your vector of Foo is not guaranteed to be a pointer, although on some implementations it behaves enough like one for the call to work.
In the empty vector case, v.data(), returns a pointer but you don't know what it points to. It could be a nullptr, but that is not guaranteed.
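A small sketch of the call the FAQ recommends may help. Here sum is a stand-in I made up for the FAQ's f(Foo* array, unsigned numFoos), using int instead of Foo so the result can be checked.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for a C-style function taking a pointer and an element count.
int sum(const int* array, std::size_t count) {
    int total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        total += array[i];
    }
    return total;
}

inline void demo_data_call() {
    std::vector<int> empty;
    // data() may or may not be null here; with count == 0 the loop never
    // runs, so the pointer is never dereferenced.
    assert(sum(empty.data(), empty.size()) == 0);

    std::vector<int> v;
    v.push_back(1);
    v.push_back(2);
    v.push_back(3);
    assert(v.data() == &v[0]);   // only meaningful because v is non-empty
    assert(sum(v.data(), v.size()) == 6);
}
```

Passing v.begin() instead would only compile where the iterator type happens to be a raw pointer.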
It all comes down to one simple thing: You can add or subtract an integral value to the pointer, but trying to dereference an invalid pointer is undefined behaviour.
Say for example,
int a[10];
int* p = a;
int* q = p + 10; // This is fine
int r = *(p + 10); // This is undefined behaviour
In your example: v[0] is the same as *(v's internal pointer+0), and this is a problem if the vector is empty.
If I have the end iterator of a container and want a raw pointer to that position, is there a way to get it?
Say I have a container: foo. I cannot for example do this: &*foo.end() because it yields the runtime error:
Vector iterator not dereferencable
I can do this but I was hoping for a cleaner way to get there: &*foo.begin() + foo.size().
EDIT:
This is not a question about how to convert an iterator to a pointer in general (obviously that's in the question), but how to specifically convert the end iterator to a pointer. The answers in the "duplicate" question actually suggest dereferencing the iterator. The end iterator cannot be dereferenced without seg-faulting.
The correct way to access the end of storage is:
v.data() + v.size()
This is because *v.begin() is invalid when v is empty.
The member function data is provided for all contiguous containers (vector, string and array).
Since C++17 you can also use the non-member functions:
data(v) + size(v)
This works on raw arrays as well.
In general? No.
And the fact that you're asking indicates that something is wrong with your overall design.
For vectors, arrays, strings? Sure… but why?
Just get a pointer to a valid element, and advance it:
std::vector<T> foo;
const T* ptr = foo.data() + foo.size();
As long as you don't dereference such a pointer (which is almost equivalent to dereferencing the iterator, as you did in your attempt) it is valid to obtain and hold such a pointer, because it points to the special one-past-the-end location.
Note that &foo[0] + foo.size() has undefined behaviour if the vector is empty, because &foo[0] is &*(foo.data() + 0) is &*foo.data(), and (just like in your attempt) *foo.data() is disallowed if there's nothing there. So we avoid all dereferencing and simply advance foo.data() itself.
Anyway, this only works for vectors¹, arrays and strings. Other containers do not guarantee (nor can they reasonably be expected to provide) storage contiguity; their end pointers could be almost anything, e.g. a "sentinel" null pointer, which is unlikely to be of any use to you.
That is why the iterator abstraction is there in the first place. Stick to it if you can, instead of delving into raw pointer usage.
1. Excepting std::vector<bool>.
At some point I create a pointer to a std::vector and then perform push_back, reserve, or resize operations on that vector. After such operations, is it safe to compare the pointer with the address of the vector to check whether the pointer still points to it, given that memory may have been reallocated?
for example
std::vector<int> vec;
std::vector<int>* pVec = &vec;
vec.reserve(10000);
assert(pVec == &vec);
vec = anotherVec;
assert(pVec == &vec);
what is more, is it safe to compare a pointer to the first value of vector?
for example:
std::vector<int> vec(1,0);
int* p = &vec[0];
// some operation here
assert(p == &vec[0]);
As I tested by myself, it seems that the first situation is safe, while the second is not, but I can't be sure.
std::vector<int> vec;
std::vector<int>* pVec = &vec;
vec.reserve(10000);
assert(pVec == &vec);
is safe.
std::vector<int> vec(1,0);
int* p = &vec[0];
// some operation here
assert(p == &vec[0]);
is not safe.
The first block is safe since the address of vec will not change even when its contents change.
The second block is not safe since the address of vec[0] may change; for example when the vec resizes itself — e.g, when you push_back elements to it.
it seems that the first situation is safe, while the second is not
That's right. In the first "situation", the vec object itself stays wherever it is in memory regardless of the reserve call, which might move the managed elements to another area of dynamic memory. It's because elements can be moved that the pointers may not compare equal in the second scenario.
The second situation is safe as long as no reallocation takes place. If you know in advance the size you will need and call reserve() before you take the pointer, it is perfectly safe and you save a little bit of performance (one less level of indirection).
However, any addition, with push_back() for example, might go beyond the allocated space and invalidate your pointer. The new storage may or may not end up at the same address, and you cannot rely on either outcome.
For that matter, instead of taking a pointer you could take an iterator: a vector iterator behaves like a pointer, typically at no performance cost and with more type safety, although it is invalidated by the same reallocations.
std::vector<int>* pVec = &vec; takes the address of the std::vector<int> object itself, which stays valid until the end of vec's scope.
vec = anotherVec; does not change the address of vec, because it only calls std::vector's operator=.
So both assert(pVec == &vec); checks pass.
In the case int* p = &vec[0]; it depends: see Iterator invalidation.
The first case is indeed safe, as there is no danger of the vector object's address changing. The second case is safe as long as no reallocation happens (reallocations can be tracked using the std::vector::capacity member function); otherwise it is either undefined or implementation-defined, depending on the version of the language. For more information consult this answer, since the same restrictions apply in that case.
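Here is a sketch of how the capacity check mentioned above can be used to tell when an element pointer has to be re-acquired. The function name and the sizes are arbitrary choices for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

inline void demo_addresses() {
    std::vector<int> vec(1, 0);
    std::vector<int>* pVec = &vec;   // address of the vector object itself

    vec.reserve(100);
    int* p = &vec[0];                // taken after reserve()
    const std::size_t cap = vec.capacity();

    // Growing up to the reserved capacity does not reallocate.
    while (vec.size() < cap) {
        vec.push_back(1);
    }
    assert(vec.capacity() == cap);
    assert(p == &vec[0]);            // element pointer is still valid
    assert(pVec == &vec);

    vec.push_back(1);                // exceeds capacity: storage is reallocated
    assert(vec.capacity() > cap);
    p = &vec[0];                     // old p is invalid, so re-acquire it
    assert(*p == 0);
    assert(pVec == &vec);            // the vector object never moved
}
```

After the final push_back the old value of p is not compared or dereferenced at all; it is simply replaced.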
std::map<std::string, std::string> myMap;
std::map<std::string, std::string>::iterator i = myMap.find(some_key_string);
if(i == myMap.end())
return NULL;
string *p = &i->first;
Is the last line valid?
I want to store this pointer p somewhere else, will it be valid for the whole program life?
But what will happen if I add some more elements to this map (with other unique keys) or remove some other keys, won’t it reallocate this string (key-value pair), so the p will become invalid?
Section 23.1.2#8 (associative container requirements):
The insert members shall not affect the validity of iterators and references to the container, and the erase members shall invalidate only iterators and references to the erased elements.
So yes, a pointer to a data member of a map element is guaranteed to remain valid unless you remove that element.
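A quick sketch of what that guarantee means in practice; the keys and values are made up for the example. Note that the key pointer has to be a pointer to const, because map keys are const.

```cpp
#include <cassert>
#include <map>
#include <string>

inline void demo_map_stability() {
    std::map<std::string, std::string> m;
    m["b"] = "two";

    std::map<std::string, std::string>::iterator it = m.find("b");
    const std::string* key = &it->first;   // keys are const std::string
    std::string* value = &it->second;

    // Insert and erase other elements; the "b" node is left untouched.
    m["a"] = "one";
    m["c"] = "three";
    m.erase("a");

    assert(key == &m.find("b")->first);
    assert(*key == "b");
    assert(value == &m.find("b")->second);

    *value = "TWO";                        // mapped values may be modified
    assert(m["b"] == "TWO");
}
```

Erasing "b" itself would invalidate both pointers.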
First, maps are guaranteed to be stable; i.e. the iterators are not invalidated by element insertion or deletion (except the element being deleted of course).
However, stability of iterator does not guarantee stability of pointers! Although it usually happens that most implementations use pointers - at least at some level - to implement iterators (which means it is quite safe to assume your solution will work), what you should really store is the iterator itself.
What you could do is create a small object like:
struct StringPtrInMap
{
typedef std::map<string,string>::iterator iterator;
StringPtrInMap(iterator i) : it(i) {}
const string& operator*() const { return it->first; }
const string* operator->() const { return &it->first; }
iterator it;
};
And then store that instead of a string pointer.
If you're not sure which operations will invalidate your iterators, you can look it up pretty easily in the reference. For instance for vector::insert it says:
This effectively increases the vector size, which causes an automatic reallocation of the allocated storage space if, and only if, the new vector size surpasses the current vector capacity. Reallocations in vector containers invalidate all previously obtained iterators, references and pointers.
map::insert on the other hand doesn't mention anything of the sort.
As Pierre said, you should store the iterator rather than the pointer, though.
Why are you wanting to do this?
You can't change the value of *p, since it's const std::string. If you did change it, then you might break the invariants of the container by changing the sort order of the elements.
Unless you have other requirements that you haven't given here, then you should just take a copy of the string.