This is a learning question for me and hopefully others as well. My problem boils down to having a pointer that points to the contents of a vector. The issue occurs when I erase the first element of the vector. I'm not quite sure what I was expecting; I somehow assumed that, when removing items, the vector would not start moving objects in memory.
The question I have is: is there a way to keep the objects in place in memory? For example, by changing the underlying container of the vector? In my particular case, I will remove the pointer access and just use an id for the object, since the class needs an ID anyway.
Here is a simplified example:
#include <iostream>
#include <vector>
class A
{
public:
A(unsigned int id) : id(id) {};
unsigned int id;
};
int main()
{
std::vector<A> aList;
aList.push_back(A(1));
aList.push_back(A(2));
A * ptr1 = &aList[0];
A * ptr2 = &aList[1];
aList.erase(aList.begin());
std::cout << "Pointer 1 points to \t" << ptr1 << " with content " << ptr1->id << std::endl;
std::cout << "Pointer 2 points to \t" << ptr2 << " with content " << ptr2->id << std::endl;
std::cout << "Element 1 is stored at \t" << &aList[0] << " with content " << aList[0].id << std::endl;
}
What I get is:
Pointer 1 points to 0xf69320 with content 2
Pointer 2 points to 0xf69324 with content 2
Element 1 is stored at 0xf69320 with content 2
While you can't achieve what you want exactly, there are two easy alternatives. The first is to use std::vector<std::unique_ptr<T>> instead of std::vector<T>. The actual instance of each object will not be moved when the vector resizes. This implies changing any use of &aList[i] to aList[i].get() and aList[i].id to aList[i]->id.
#include <iostream>
#include <memory>
#include <vector>
class A
{
public:
A(unsigned int id) : id(id) {};
unsigned int id;
};
int main()
{
std::vector<std::unique_ptr<A>> aList;
aList.push_back(std::make_unique<A>(1));
aList.push_back(std::make_unique<A>(2));
A * ptr1 = aList[0].get();
A * ptr2 = aList[1].get();
aList.erase(aList.begin());
// This output is undefined behavior, ptr1 points to a deleted object
//std::cout << "Pointer 1 points to \t" << ptr1 << " with content " << ptr1->id << std::endl;
std::cout << "Pointer 2 points to \t" << ptr2 << " with content " << ptr2->id << std::endl;
std::cout << "Element 1 is stored at \t" << aList[0].get() << " with content " << aList[0]->id << std::endl;
}
Note that ptr1 will point to a deleted object, so it is still undefined behavior to dereference it.
Another solution might be to use a different container that does not invalidate references and pointers. std::list never invalidates a node unless it's specifically erased. However, random access is not supported, so your example can't be directly modified to use std::list. You would have to iterate through the list to obtain your pointers.
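A minimal sketch of the std::list variant, assuming the same class A as in the question. The helper name id_after_erasing_front is made up for illustration. Since there is no operator[], the pointer to the second element is obtained by stepping an iterator:

```cpp
#include <cassert>
#include <iterator>
#include <list>

struct A {
    explicit A(unsigned int id) : id(id) {}
    unsigned int id;
};

// Hypothetical helper: erases the first node and returns the id read
// through a pointer that was taken to the second node beforehand.
unsigned int id_after_erasing_front() {
    std::list<A> aList;
    aList.emplace_back(1);
    aList.emplace_back(2);

    // No random access: walk to the second node with an iterator.
    A* ptr2 = &*std::next(aList.begin());

    aList.erase(aList.begin());  // only the erased node is invalidated

    // ptr2 still refers to the same live node, which is now the front.
    assert(ptr2 == &aList.front());
    return ptr2->id;
}
```

Calling id_after_erasing_front() should give 2. A pointer to the erased first node would still be dangling, exactly as with the unique_ptr version.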
Not sure if this is what you want, but how about this:
(only basic layout, you need to fill in the details, also: haven't tested the design, might have some flaws)
template <class T>
class MagicVector {
class MagicPointer {
friend class MagicVector;
private:
MagicVector* parent;
unsigned int position;
bool valid;
MagicPointer(MagicVector* par, const unsigned int pos); //yes, private!
public:
~MagicPointer();
T& content();
void handle_erase(const unsigned int erase_position);
};
friend class MagicPointer;
private:
std::vector<T> data;
std::vector<std::shared_ptr<MagicPointer> > associated_pointers;
public:
// (all the methods you need from vector)
void erase(const unsigned int position);
std::shared_ptr<MagicPointer> create_pointer(const unsigned int position);
};
template <class T>
void MagicVector<T>::erase(const unsigned int position){
// vector::erase takes an iterator, not an index
data.erase(data.begin() + position);
for(unsigned int i=0; i<associated_pointers.size(); i++){
associated_pointers[i]->handle_erase(position);
}
}
template <class T>
std::shared_ptr<typename MagicVector<T>::MagicPointer> MagicVector<T>::create_pointer(const unsigned int position){
associated_pointers.push_back(std::shared_ptr<MagicPointer>(new MagicPointer(this, position)));
return associated_pointers.back();
}
template <class T>
MagicVector<T>::MagicPointer::MagicPointer(MagicVector* par, const unsigned int pos){
parent = par;
position = pos;
if (position < parent->data.size()){
valid = true;
}else{
valid = false;
}
}
template <class T>
T& MagicVector<T>::MagicPointer::content(){
if(not valid){
// (handle this somehow)
}
return parent->data[position];
}
template <class T>
void MagicVector<T>::MagicPointer::handle_erase(const unsigned int erase_position){
if (erase_position < position){
position--;
}
else if (erase_position == position){
valid = false;
}
}
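template <class T>
MagicVector<T>::MagicPointer::~MagicPointer(){
// Caveat: associated_pointers co-owns every MagicPointer, so this destructor
// only runs once the vector's own entry is being released.
for(unsigned int i=0; i<parent->associated_pointers.size(); i++){
if(parent->associated_pointers[i].get() == this){
parent->associated_pointers.erase(parent->associated_pointers.begin() + i);
break;
}
}
}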
Basic idea: you have your own classes for vectors and pointers, with the pointer storing a position in the vector. The vector knows its pointers and updates them whenever something is erased.
I'm not completely satisfied myself; that shared_ptr over MagicPointer looks ugly, but I'm not sure how to simplify it. Maybe we need three classes: MagicVector, MagicPointerCore (which stores the parent and position), and MagicPointer : public shared_ptr<MagicPointerCore>, with MagicVector holding vector<MagicPointerCore> associated_pointers.
Note that the destructor of MagicVector has to mark all of its associated pointers as invalid, since a MagicPointer can outlive its parent.
I was expecting, I somehow assumed that, when removing items, the vector would not start moving objects in memory.
How so? What else did you expect? A std::vector guarantees that its elements are stored contiguously in memory. So if something is removed, the elements after it have to be shifted down to close the gap in that contiguous memory.
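A small sketch of what that contiguity guarantee means in practice. The helper names is_contiguous and first_slot_after_erase are made up for illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// True if element i lives exactly i slots past the start of the buffer.
bool is_contiguous(const std::vector<int>& v) {
    for (std::size_t i = 0; i < v.size(); ++i) {
        if (&v[i] != v.data() + i) return false;
    }
    return true;
}

// Erases the first element and returns what now sits in the first slot.
int first_slot_after_erase() {
    std::vector<int> v{1, 2, 3};
    v.erase(v.begin());
    // The remaining elements were shifted down to close the gap,
    // so the layout is still gap-free.
    assert(is_contiguous(v));
    return *v.data();
}
```

first_slot_after_erase() should return 2: the old second element has been moved into the first slot. That is why both pointers in the question end up seeing the value 2.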
I wanted a simple class which would encapsulate a pointer and a size, like C++20's std::span will be, I think. I am using C++ (g++ 11.2.1 to be precise)
I want it so I can have some constant, module-level data arrays without having to calculate the size of each one.
However my implementation 'works' only sometimes, dependent on the optimization flags and compiler (I tried on godbolt). Hence, I've made a mistake. How do I do this correctly?
Here is the implementation plus a test program which prints out the number of elements (which is always correct) and the elements (which is usually wrong)
#include <iostream>
#include <algorithm>
using std::cout;
using std::endl;
class CArray {
public:
template <size_t S>
constexpr CArray(const int (&e)[S]) : els(&e[0]), sz(S) {}
const int* begin() const { return els; }
const int* end() const { return els + sz; }
private:
const int* els;
size_t sz;
};
const CArray gbl_arr{{3,2,1}};
int main() {
CArray arr{{1,2,3}};
cout << "Global size: " << std::distance(gbl_arr.begin(), gbl_arr.end()) << endl;
for (auto i : gbl_arr) {
cout << i << endl;
}
cout << "Local size: " << std::distance(arr.begin(), arr.end()) << endl;
for (auto i : arr) {
cout << i << endl;
}
return 0;
}
Sample output:
Global size: 3
32765
0
0
Local size: 3
1
2
3
In this case the 'local' variable is correct, but the 'global' is not, should be 3,2,1.
I think the issue is your initialization is creating a temporary and then you're storing a pointer to that array after it has been destroyed.
const CArray gbl_arr{{3,2,1}};
When invoking the above constructor, the argument passed in is created just for the call itself, but gbl_arr refers to it after its life has ended. Change to this:
int gbl_arrdata[]{3,2,1};
const CArray gbl_arr{gbl_arrdata};
And it should work, because the array it refers to now has the same lifetime scope as the object that refers to it.
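For completeness, a sketch of the whole thing with a named array, assuming the CArray class from the question. The helpers sum and first are hypothetical and only there to read the data back. Note that the "local" case in the question has the same problem: the braced temporary array only lives until the end of the full expression that constructs arr, so it merely happened to print the right values.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>

class CArray {
public:
    template <std::size_t S>
    constexpr CArray(const int (&e)[S]) : els(e), sz(S) {}
    const int* begin() const { return els; }
    const int* end() const { return els + sz; }
private:
    const int* els;   // non-owning: the array must outlive this object
    std::size_t sz;
};

// A named array with static storage duration outlives every use of gbl_arr.
const int gbl_arrdata[]{3, 2, 1};
const CArray gbl_arr{gbl_arrdata};

// Hypothetical helpers that read the viewed elements.
int sum(const CArray& a) { return std::accumulate(a.begin(), a.end(), 0); }
int first(const CArray& a) { return *a.begin(); }
```

With this layout, sum(gbl_arr) is 6 and first(gbl_arr) is 3, independent of optimization flags, because nothing refers to a destroyed temporary any more.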
I have to implement a simple "unique_ptr" class supporting only a constructor, a destructor, ->, *, and release(). My attempt is below.
However, it feels weird to write "up.operator->()" to get the pointer p. It would be more logical to write "up->p". But how do I do that? Thanks!
#include <iostream>
#include <stdexcept>
template <class T>
class unique_ptr
{
T *p;
public:
unique_ptr(T *ptr)
: p{ptr}
{
}
~unique_ptr() { delete p; }
T *operator->() const { return p; } // returns a pointer
T operator*() const { return *p; }
T *release()
{
T *ptr = p;
p = nullptr;
return ptr;
}
};
template <class T>
void print(const unique_ptr<T> &up, const std::string &s)
{
std::cout << s << " up.operator->(): " << up.operator->() << '\n';
std::cout << s << " up.operator*(): " << up.operator*() << '\n';
}
int main()
try
{
int *ptr = new int(10);
unique_ptr<int> up(ptr);
print(up, "up: ");
}
catch (std::exception &e)
{
std::cerr << "exception: " << e.what() << '\n';
return 1;
}
catch (...)
{
std::cerr << "exception\n";
return 2;
}
However, it feels weird to write "up.operator->()" to get the pointer p.
It feels weird because the member access operator is not generally used to get a pointer to the object (although you can do it using the operator->() syntax, as you demonstrated). Member access operator is used to access members of the object. In your example, you have a unique pointer of int. int doesn't have a member, so it doesn't make sense to use the member access operator.
Here is an example of how to use it:
struct S {
int member;
};
unique_ptr<S> up(new S{10});
int value_of_member = up->member;
would be more logical to write "up->p"
That wouldn't be logical unless p is a member of the pointed object.
How to create an operator-> for a class unique_ptr
Like you did in the example. As far as I can tell, there was no problem with how you create the operator, but rather how to use it.
P.S. Your unique pointer is copyable, movable and assignable, but those operations are horribly broken leading to undefined behaviour. See rule of 5.
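To make the rule-of-5 remark concrete, here is one possible sketch (not the only way to do it) that deletes the copy operations and adds move operations. It keeps the question's class name and members. The default argument on the constructor, the S struct, and the move_then_read helper are assumptions added for illustration. operator* returns a reference here so that assignment through the pointer works.

```cpp
#include <cassert>
#include <utility>

template <class T>
class unique_ptr {
    T* p;
public:
    explicit unique_ptr(T* ptr = nullptr) : p{ptr} {}
    ~unique_ptr() { delete p; }

    // Copying would make two owners delete the same object: forbid it.
    unique_ptr(const unique_ptr&) = delete;
    unique_ptr& operator=(const unique_ptr&) = delete;

    // Moving transfers ownership and leaves the source empty.
    unique_ptr(unique_ptr&& other) noexcept : p{other.release()} {}
    unique_ptr& operator=(unique_ptr&& other) noexcept {
        if (this != &other) {
            delete p;
            p = other.release();
        }
        return *this;
    }

    T* operator->() const { return p; }
    T& operator*() const { return *p; }   // reference, so *up = x works
    T* release() { T* ptr = p; p = nullptr; return ptr; }
};

struct S { int member; };

// Hypothetical helper: moves ownership, writes through *, reads through ->.
int move_then_read() {
    unique_ptr<S> a(new S{10});
    unique_ptr<S> b(std::move(a));
    assert(a.operator->() == nullptr);  // a no longer owns anything
    (*b).member = 11;
    return b->member;
}
```

move_then_read() should return 11, and the S object is deleted exactly once, when b goes out of scope.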
As others have noted in comments this implementation of a single ownership smart pointer is incomplete and the operator*() is incorrect in that it doesn't return a reference and thus does not facilitate making assignments through the pointer.
However to answer the question,
it feels weird to write "up.operator->()" to get the pointer p. I
would be more logical to write "up->p". But how do I do that?
Well, you wouldn't want to do that, as p is part of the private implementation of your smart pointer class. It is weird to write up.operator->() because that is not how -> is typically used. It is typically used as shorthand for accessing the members of a struct or class, slightly less verbose than the * operator combined with member access via the . operator. To use your pointer in a less weird way, instantiate the template with some type that has fields, e.g.
struct foo {
int bar;
};
void print(const unique_ptr<foo>& up, const std::string& s)
{
std::cout << s << " up.operator->(): " << up->bar << '\n';
std::cout << s << " up.operator*(): " << (*up).bar << '\n';
}
int main()
{
unique_ptr<foo> up(new foo{ 42 });
print(up, "up: ");
}
I am trying to convince myself that objects in C++ have constant address during their lifetime. Here is a minimal working example:
#include <iostream>
#include <type_traits>
#include <vector>
class Class1
{
public:
Class1(unsigned int * pt);
unsigned int * val_pt;
};
Class1::Class1(unsigned int * pt)
:
val_pt(pt)
{}
class Class2
{
public:
Class2(std::vector<unsigned int> vec_);
std::vector<unsigned int> vec_of_ints;
Class1 class1_instance;
};
Class2::Class2(std::vector<unsigned int> vec_)
:
vec_of_ints(vec_),
class1_instance(Class1(&vec_of_ints[0]))
{}
int main() {
std::vector<unsigned int> vec_test(10, 2);
Class2 instance_class2(vec_test);
Class1 instance_class1 = instance_class2.class1_instance;
//both addresses are equal
std::cout<<"Address stored in instance_class1: "<<instance_class1.val_pt<<" ,address of first vec_element of instance_class2: "<<&(instance_class2.vec_of_ints)[0]<<std::endl;
instance_class2.vec_of_ints.resize(20);
//different addresses now
std::cout<<"Address stored in instance_class1: "<<instance_class1.val_pt<<" ,address of first vec_element of instance_class2: "<<&(instance_class2.vec_of_ints)[0]<<std::endl;
return 0;
}
My Class2 stores a vector of ints and an instance of Class1. Class1 stores the address of the vector of the Class2 instance.
I'd like to get the address of that vector, i.e. the address where the vector is stored on the stack. If my understanding is correct, the resize() function doesn't change that address on the stack but only the content of that address, i.e. where the vector points to in heap.
My overall goal is to show that any modifications of the vector in Class2 are "visible" in the stored pointer of Class1. So if I dereference the pointer in Class1 I will get the same integer value as when accessing the vector itself in Class2. That is because the address of member variables are constant during runtime.
But I guess something is wrong in my code, probably in the constructor where I pass 'vec[0]'. I think this is not the actual address of the vector in the stack but some address on the heap. How do I get the correct address?
Any input is appreciated!
What you're doing here is ungodly, and needs to stop. Consider this particularly terrible pattern:
class DataOwner {
public:
inline std::vector<uint32_t>& getData() { return data; }
private:
// I am safely tucked away
std::vector<uint32_t> data;
};
class I_Want_To_Work_On_Data {
public:
I_Want_To_Work_On_Data(DataOwner* owner) : owner(owner) {}
void doThing() {
auto& direct_ref_to_data = owner->getData();
for(auto& item : direct_ref_to_data) {
// This is just as fast as your direct pointer :/
}
}
private:
DataOwner* owner;
};
Returning mutable access to the data is somewhat bad, but it's far safer than the approach you are taking (in a single-threaded environment at least), and performance is no worse than what you are attempting. So what are you optimising this for exactly? How is your approach an improvement over this boring pattern?
Now you could argue that providing mutable access to the std::vector isn't wanted (i.e. don't allow any old code to resize the array), but that can easily be solved without resorting to dirty hacks.
#include <vector>
#include <cstdint>
class DataOwner {
public:
inline std::vector<uint32_t>::iterator begin()
{ return data.begin(); }
inline std::vector<uint32_t>::iterator end()
{ return data.end(); }
inline std::vector<uint32_t>::const_iterator begin() const
{ return data.begin(); }
inline std::vector<uint32_t>::const_iterator end() const
{ return data.end(); }
private:
// I am safely tucked away
std::vector<uint32_t> data;
};
class ConstAccess {
public:
ConstAccess(const DataOwner& owner) : owner(owner) {}
void doThing() {
for(const auto& item : owner) {
}
}
private:
const DataOwner& owner;
};
class MutableAccess {
public:
MutableAccess(DataOwner& owner) : owner(owner) {}
void doThing() {
for(auto& item : owner) {
}
}
private:
DataOwner& owner;
};
The performance is the same as with your approach; however, this approach has the following advantages:
It won't crash in debug builds on this line: class1_instance(Class1(&vec_of_ints[0])), when the vector is empty, and you attempt to dereference NULL to find the address.
It won't crash when you attempt to dereference unsigned int * val_pt; after you've accidentally resized the array.
It won't allow you to accidentally do: delete [] val_pt
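If the goal really is to observe the vector's current contents from another object, one option (a sketch, not something the code above does) is to store the address of the vector object itself rather than the address of its first element. VectorObserver and first_after_resize are hypothetical names:

```cpp
#include <cassert>
#include <vector>

// Hypothetical observer: keeps a pointer to the vector object, whose own
// address does not change when its heap buffer is reallocated.
struct VectorObserver {
    explicit VectorObserver(const std::vector<unsigned int>* v) : vec(v) {}
    unsigned int first() const { return vec->front(); }
    const std::vector<unsigned int>* vec;
};

unsigned int first_after_resize() {
    std::vector<unsigned int> values(10, 2);
    VectorObserver observer(&values);

    values.resize(20);   // may move the elements to a new heap buffer
    values[0] = 7;

    // The observer goes through the vector, so it sees the current buffer.
    return observer.first();
}
```

first_after_resize() should return 7. The usual caveat still applies: the observer dangles if the vector itself (or the object that contains it) is destroyed or relocated.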
I'm not sure what conclusions you extracted from the comments and responses above.
I just wanted to make sure these were not among them:
The address of a member variable is constant during runtime.
If, for example, you have a vector of Class2 instances, and you
resize that vector, the address of the vec_of_ints member variable
may change for any of those instances.
Having a Class2 instance in the stack or a pointer to a Class2 instance in the heap makes a difference.
The address of the vec_of_ints member variable shouldn't change when
you resize it, regardless of whether the Class2 instance lives on the
stack or on the heap.
The example below tests both assertions (https://godbolt.org/z/3TYrnjro8):
#include <algorithm>
#include <iomanip>
#include <iostream>
#include <memory>
#include <string>
#include <vector>
struct HoldsIntVector
{
std::vector<int> v{};
};
int main()
{
HoldsIntVector stackInstance{};
auto heapInstance{std::make_unique<HoldsIntVector>()};
stackInstance.v.push_back(5);
heapInstance->v.push_back(5);
auto printStackAndHeapInstances = [&](const auto& text){
std::cout << std::showbase << std::hex;
std::cout << &stackInstance << "\t" << &heapInstance << "\t";
std::cout << &stackInstance.v << "\t" << &heapInstance->v << "\t";
std::cout << std::setw(8) << std::setfill('0') << stackInstance.v.data() << "\t\t";
std::cout << std::setw(8) << std::setfill('0') << heapInstance->v.data() << "\t\t";
std::cout << text;
std::cout << "\n";
};
std::cout << "stackInstance\theapInstance\tstackInstance.v\theapInstance.v\t&stackInstance.v[0]\t&heapInstance.v[0]\n";
printStackAndHeapInstances("after initializing stack and heap instances");
// After resizing both vectors in stack and heap instances
//
// Address of v changes in neither the stack nor the heap instance
// Address of v[0] changes in both stack and heap instances
stackInstance.v.resize(10);
heapInstance->v.resize(10);
printStackAndHeapInstances("after resizing both v's");
std::cout << "\n";
// Now what happens if we have a vector of HoldsIntVector and we resize it
//
// Address of v changes for the first HoldsIntVector
std::vector<HoldsIntVector> hivs{10};
std::for_each(begin(hivs), end(hivs), [](auto& hiv){hiv.v.push_back(3);});
std::cout << "&hivs[0].v\n" << &hivs[0].v << "\t" << "after initializing hivs\n";
hivs.resize(20);
std::cout << &hivs[0].v << "\t" << "after resizing hivs\n";
}
I have some code that claims ownership of a sequence of raw pointers, and am wondering if there is an acceptable way to do this? What I'm looking for is a way to enforce the ownership in code to a greater degree. Mainly, I've been wondering whether or not my constructor should be taking a vector of unique pointers directly.
As a sidenote, once ownership has been claimed, the data is supposed to be immutable.
The code follows roughly the pattern of class X below.
#include <iostream>
#include <memory>
#include <vector>
using namespace std; // For readability purposes only
class X {
public:
const vector< unique_ptr<const int> > data; // In my case this is private
// Constructor: X object will take ownership of the data
// destroying it when going out of scope
X (vector<int*> rawData)
: data { make_move_iterator(rawData.begin()), make_move_iterator(rawData.end()) }
{ }
};
int main() {
// Illustrating some issues with claiming ownership of existing pointers:
vector<int*> rawData { new int(9) , new int(4) };
int* rawPointer = rawData[0];
{ // New scope
X x(rawData);
cout << *(x.data[0]) << endl; // Unique pointer points to 9
*rawPointer = 7;
cout << *(x.data[0]) << endl; // Unique pointer points to 7
}
cout << *rawPointer << endl; // The pointer has been deleted, prints garbage
return 0;
}
It is difficult to post an answer without detailed knowledge of your situation. But my recommendation is to attach your data to a unique_ptr as soon as it is known. Then you can move that unique_ptr into and out of vectors at will. For example:
#include <iostream>
#include <memory>
#include <vector>
using namespace std; // For readability purposes only
class X {
public:
const vector< unique_ptr<const int> > data; // In my case this is private
// Constructor: X object will take ownership of the data
// destroying it when going out of scope
X (vector<unique_ptr<const int>>&& v)
: data { std::move(v) }
{ }
};
vector<unique_ptr<const int>>
collectRawData()
{
auto rawData = {9, 4};
vector<unique_ptr<const int>> data;
for (auto const& x : rawData)
data.push_back(make_unique<int>(x));
return data;
}
int main() {
auto rawData = collectRawData();
{ // New scope
X x(std::move(rawData));
cout << *(x.data[0]) << endl; // Unique pointer points to 9
cout << *(x.data[1]) << endl; // Unique pointer points to 4
}
}
You made several mistakes.
With const vector< unique_ptr<const int> > data; a move iterator does not make much sense, because int* has no move constructor; moving a raw pointer just copies it.
If you call X's constructor X (vector<int*> rawData) with a vector<int*>, the copy constructor of vector<int*> is called, which is not what you want.
By the way, the reason to use move is to avoid large memory copies. Take std::vector<int*>, for instance: a move still has to copy the size member and the pointer to the memory where the int*s are stored, but not the int*s themselves. In that sense, move is there to claim ownership.
If you want shared pointers like that, use std::shared_ptr. It owns a counter that tracks how many pointers refer to the same object.
My example code:
#include <iostream>
#include <memory>
#include <vector>
using std::cout;
using std::endl;
class X
{
public:
const std::vector< std::shared_ptr< const int> > data; // In my case this is private
// Constructor: X object will take ownership of the data
// destroying it when going out of scope
X (std::vector<std::shared_ptr<int>>& rawData)
//: data(rawData)
: data(rawData.cbegin(), rawData.cend())
{ }
};
int main() {
// Illustrating some issues with claiming ownership of existing pointers:
std::vector<std::shared_ptr<int>> rawData { std::make_shared<int>(9), std::make_shared<int>(4) };
int* rawPointer = rawData[0].get();
{ // New scope
X x(rawData);
cout << *(x.data[0]) << endl; // Shared pointer points to 9
*rawPointer = 7;
cout << *(x.data[0]) << endl; // Shared pointer points to 7
}
cout << *rawPointer << endl; // rawData still owns the int, so this prints 7 instead of garbage
return 0;
}
If you don't want to use std::shared_ptr, you will need a GC.
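The counter mentioned above can be observed with use_count(). A short sketch, with owners_inside_scope as a made-up helper name, mirroring how X copies the shared pointers from rawData:

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical helper: returns the owner count while a second vector
// shares the same int, and checks that it drops back afterwards.
long owners_inside_scope() {
    std::vector<std::shared_ptr<int>> rawData{std::make_shared<int>(9)};
    long count = 0;
    {
        // Converting to shared_ptr<const int> shares the same control block.
        std::vector<std::shared_ptr<const int>> data(rawData.cbegin(), rawData.cend());
        count = rawData[0].use_count();
    }
    // data is gone, rawData is the only owner again; the int is still alive.
    assert(rawData[0].use_count() == 1);
    return count;
}
```

owners_inside_scope() should return 2: one owner in rawData and one in data. The int is only deleted when the last owner goes away.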
I'm implementing an STL set with a complex template parameter type. When inserting in to the set, I want the set to use the less-than operator I've defined for my type. I also want to minimize the quantity of object instantiations of my type. It seems I can't have both.
I've got two minimal examples below, each uses the same C++ class.
#include <iostream>
#include <set>
using namespace std;
class Foo {
public:
Foo(int z);
Foo(const Foo &z);
bool operator<(const Foo &rhs) const;
int a;
};
Foo::Foo(int z)
{
cout << "cons" << endl;
a = z;
}
Foo::Foo(const Foo &z)
{
cout << "copy cons" << endl;
a = z.a;
}
bool
Foo::operator<(const Foo &rhs) const
{
cout << "less than" << endl;
return a < rhs.a;
}
Here's my first main():
int
main(void)
{
set<Foo> s;
s.insert(*new Foo(1));
s.insert(*new Foo(2));
s.insert(*new Foo(1));
cout << "size: " << s.size() << endl;
return 0;
}
That's great because it uses the less-than I've defined for my class, and thus the size of the set is correctly two. But it's bad because every insertion in to the set requires the instantiation of two objects (constructor, copy constructor).
$ ./a.out
cons
copy cons
cons
less than
less than
less than
copy cons
cons
less than
less than
less than
size: 2
Here's my second main():
int
main(void)
{
set<Foo *> s;
s.insert(new Foo(1));
s.insert(new Foo(2));
s.insert(new Foo(1));
cout << "size: " << s.size() << endl;
return 0;
}
That's great because an insertion requires just one object instantiation. But it's bad because it's really a set of pointers, and thus the uniqueness of set members is gone as far as my type is concerned.
$ ./a.out
cons
cons
cons
size: 3
I'm hoping there's some bit of information I'm missing. Is it possible for me to have both minimal object instantiations and appropriate sorting?
You are getting a copy from this: *new Foo(1).
Create this struct:
template<typename T>
struct PtrLess
{
bool operator()(const T *a, const T *b) const
{
return *a < *b;
}
};
Declare the set as set<Foo*, PtrLess<Foo>> s; and then add Foos like s.insert(new Foo(1));
Note the *
Otherwise, when the set creates a node for the Foo item, the Foo is stored inside that node, so the set has to copy the supplied value into its internal Foo object.
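Put together, a sketch of the pointer set with this comparator might look as follows. Foo is reduced to the essentials and unique_count is a hypothetical helper. Since the set does not own what the pointers refer to, a rejected duplicate and the stored objects have to be deleted by hand:

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <set>

struct Foo {
    explicit Foo(int z) : a(z) {}
    bool operator<(const Foo& rhs) const { return a < rhs.a; }
    int a;
};

template <typename T>
struct PtrLess {
    // Compare the pointed-to objects, not the addresses.
    bool operator()(const T* a, const T* b) const { return *a < *b; }
};

// Hypothetical helper: inserts 1, 2, 1 and returns how many were kept.
std::size_t unique_count() {
    std::set<Foo*, PtrLess<Foo>> s;
    for (int v : {1, 2, 1}) {
        Foo* f = new Foo(v);
        if (!s.insert(f).second) {
            delete f;  // duplicate was rejected, so the set never stored it
        }
    }
    const std::size_t n = s.size();
    for (Foo* f : s) delete f;  // the set does not free what it points to
    return n;
}
```

unique_count() should return 2, with one Foo constructed per insertion and no copies.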
Standard containers store a copy of the items that are added. If you want your set to store objects rather than pointers, you should simply do the following; otherwise you're creating a memory leak, since the objects allocated via new are never freed via a corresponding delete.
int main()
{
set<Foo> s;
s.insert(Foo(1));
s.insert(Foo(2));
s.insert(Foo(1));
cout << "size: " << s.size() << endl;
return 0;
}
If you want to minimise the number of temporary objects instantiated, just use a single temporary:
int main()
{
set<Foo> s;
Foo temp(1);
s.insert(temp);
temp.a = 2;
s.insert(temp);
temp.a = 1;
s.insert(temp);
cout << "size: " << s.size() << endl;
return 0;
}
The output for this snippet (via ideone) is:
cons
copy cons
less than
less than
less than
copy cons
less than
less than
less than
size: 2
Generally, I would prefer to store the actual objects in a set<Foo> rather than pointers to objects in a set<Foo*>, since there can be no problems with object ownership (who/when new and delete need to be called), the total amount of memory allocated is smaller (for N items you need N*sizeof(Foo) rather than N*(sizeof(Foo) + sizeof(Foo*)) bytes) and data access could typically be expected to be faster (since there's no extra pointer indirection).
Hope this helps.
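Another option worth trying with set<Foo>, assuming C++11 or later, is emplace, which constructs the element in place from the constructor arguments instead of copying a temporary. The sketch below replaces the question's printing with counters so the effect can be checked; the counters and fill_with_emplace are additions for illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <set>

struct Foo {
    static int constructed;  // calls to Foo(int)
    static int copied;       // calls to the copy constructor
    explicit Foo(int z) : a(z) { ++constructed; }
    Foo(const Foo& z) : a(z.a) { ++copied; }
    bool operator<(const Foo& rhs) const { return a < rhs.a; }
    int a;
};
int Foo::constructed = 0;
int Foo::copied = 0;

// Hypothetical helper: inserts 1, 2, 1 via emplace and returns the set size.
std::size_t fill_with_emplace() {
    std::set<Foo> s;
    s.emplace(1);
    s.emplace(2);
    s.emplace(1);  // duplicate: constructed, compared, then discarded
    return s.size();
}
```

After fill_with_emplace() the size is 2, Foo::constructed is 3 and Foo::copied is 0. The duplicate is still constructed once, because the set needs an object to compare against.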
This is an extension to @Mranz's answer. Instead of dealing with raw pointers, put the pointers in a std::unique_ptr:
#include <iostream>
#include <memory>
#include <set>
using namespace std;
template<typename T>
struct PtrLess
{
bool operator()(const T& a, const T& b) const
{
return *a < *b;
}
};
int
main(void)
{
set<unique_ptr<Foo>, PtrLess<unique_ptr<Foo>>> s;
s.insert(unique_ptr<Foo>(new Foo(1)));
s.insert(unique_ptr<Foo>(new Foo(2)));
s.insert(unique_ptr<Foo>(new Foo(1)));
cout << "size: " << s.size() << endl;
return 0;
}