Copy hashtable to another hashtable using C++

I am starting with c++ and need to know, what should be the approach to copy one hashtable to another hashtable in C++?
We can easily do this in java using: HashMap copyOfOriginal=new HashMap(original);
But what about C++? How should I go about it?
UPDATE
Well, I am doing it at a very basic level; perhaps the Java example was the wrong one to give. This is what I am trying to implement using C++:
I have a hash array, and each element of the array is the head of a linked list, which has its individual nodes (data and next pointer).
And now, I need to copy the complete hash array and the linked list each element is pointing to.

In C++ you would use either the copy constructor or simple assignment (with values) to perform this.
For example
std::map<int, std::string> map1 = CreateTheMap();
std::map<int, std::string> map2 = map1; // copy-initialization
std::map<int, std::string> map3(map1);  // direct copy construction

Whichever hashmap you're using, I'm sure it has a copy-constructor and possibly operator=.
hashmap_type newMap = oldMap; // copies
And that's it. Note that C++ had no standard hash map until C++11 added std::unordered_map.
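With std::unordered_map (standard since C++11) the copy looks exactly the same. A minimal sketch; the helper name copy_table and the element types are chosen only for illustration:

```cpp
#include <string>
#include <unordered_map>

// Hypothetical helper: returns an independent copy of a hash map.
std::unordered_map<int, std::string>
copy_table(const std::unordered_map<int, std::string>& original)
{
    // The copy constructor duplicates every key/value pair.
    std::unordered_map<int, std::string> copy(original);
    return copy;
}
```

Changing the returned map afterwards leaves the original untouched, because the two maps share no storage.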

Well, what hash table implementation are you using? ISO C++ provided no hash table before C++11. That said, if your hash table class does not make operator= and its copy constructor private, then it would be a reasonable assumption that both will behave as expected. If not, I'd consider it a bug.
As an aside, std::unordered_map was added in C++11, but ISO C++ 1998 and ISO C++ 1998 with the 2003 amendments do not have a hash map container. Microsoft provided a non-standard "std::hash_map", which they should never have placed in the "std::" namespace. They have since moved it to "stdext::" (which is good news). Some other vendors copied Microsoft to make their compilers compatible.
If you are anxious to use a hash table implementation right away, then use boost::unordered_map from the Boost C++ Libraries. The Boost C++ Libraries are open source, very popular, and high quality.
EDIT
Based on your updated question, you will need to create your own copy constructor, a swap function, and an implementation of operator= in order to do this. Usually operator= is trivial once you have swap and a copy constructor in place. Here is a sketch of how you would do this:
template<typename T>
HashTable<T>::HashTable(const HashTable<T>& o)
{
// pseudo code:
// initialize as in HashTable<T>::HashTable()
// for each key/value pair in o:
// insert that key/value pair into this instance
//
// NOTE:
// if your hash table is sized so that the number of
// elements is a prime number, you can do better
// than the pseudo-code given above, but otherwise
// copying element by element is the way to go.
//
// BEGIN YOUR CODE
// ...
// END YOUR CODE
}
template<typename T> HashTable<T>&
HashTable<T>::swap(HashTable<T>& o)
{
// Swap data pointers
T* datatmp = _data;
_data = o._data;
o._data = datatmp;
// Swap capacity
size_t captmp = _capacity;
_capacity = o._capacity;
o._capacity = captmp;
// Swap other info
// ...
// Report self
return *this;
}
template<typename T> HashTable<T>&
HashTable<T>::operator=(const HashTable<T>& o)
{
HashTable<T> cpy(o);
return swap(cpy);
}
You will have to take the signatures from the above and add them to your class declaration. I should also point out why operator= tends to be implemented in terms of swap: it is very simple to write, a swap function makes your code very fast wherever that operation is needed, and it gives you exception safety. Your swap should pretty much never fail, but copy construction might, so if the copy construction throws an exception, you haven't thrown the object's state to hell.
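For the structure described in the update (an array of list heads), a deep copy could be sketched roughly as follows. The class name IntHashTable, the int payload, the modulo hash, and the use of std::vector for the bucket array are all assumptions made for illustration; the sketch also assumes a non-zero bucket count.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical chained hash table of ints: each bucket is the head of a
// singly linked list, as in the question.
class IntHashTable {
    struct Node {
        int data;
        Node* next;
    };
    std::vector<Node*> buckets_;

public:
    // Assumes bucket_count > 0.
    explicit IntHashTable(std::size_t bucket_count = 8)
        : buckets_(bucket_count, nullptr) {}

    // Copy constructor: walk every chain and allocate fresh nodes.
    IntHashTable(const IntHashTable& other)
        : buckets_(other.buckets_.size(), nullptr)
    {
        try {
            for (std::size_t i = 0; i < other.buckets_.size(); ++i) {
                Node** tail = &buckets_[i];
                for (const Node* n = other.buckets_[i]; n; n = n->next) {
                    *tail = new Node{n->data, nullptr};
                    tail = &(*tail)->next;
                }
            }
        } catch (...) {
            clear();   // do not leak nodes copied before the failure
            throw;
        }
    }

    ~IntHashTable() { clear(); }

    void swap(IntHashTable& other) noexcept { buckets_.swap(other.buckets_); }

    // Copy-and-swap: the by-value parameter is the copy.
    IntHashTable& operator=(IntHashTable other)
    {
        swap(other);
        return *this;
    }

    void insert(int value)
    {
        const std::size_t i = index(value);
        buckets_[i] = new Node{value, buckets_[i]};   // push at chain head
    }

    bool contains(int value) const
    {
        for (const Node* n = buckets_[index(value)]; n; n = n->next)
            if (n->data == value) return true;
        return false;
    }

private:
    std::size_t index(int value) const
    {
        return static_cast<std::size_t>(value) % buckets_.size();
    }

    void clear() noexcept
    {
        for (Node*& head : buckets_) {
            while (head) {
                Node* next = head->next;
                delete head;
                head = next;
            }
        }
    }
};
```

After `IntHashTable b(a);` the two tables own separate nodes, so inserting into one does not affect the other.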

I am afraid, somehow, that you are using a custom HashMap class, since you speak about its implementation details.
In C++, when it comes to copying a class, there is a special-purpose copy constructor, whose syntax looks like:
class Foo
{
public:
Foo(); // regular constructor
Foo(const Foo& rhs); // copy constructor
};
It can be invoked with either syntax:
Foo copy(original);
Foo copy2 = original;
Now, if your HashMap does not provide a copy constructor, my first suggestion is to switch to an existing implementation, like boost::unordered_map or, if available, std::hash_map, std::tr1::hash_map or std::tr1::unordered_map. The reason there are so many std:: possibilities is that many STL implementations featured a hash_map long before it was standardized. unordered_map is here to stay though, and the boost one too.
If you can't switch you are bound to implement the copy operation somehow.


How to convert C array to std::initializer_list?

I know a pointer to an array and its size. What container can be created from it? I tried to do this:
std::initializer_list<int> foo(arr, arr + size);
It works with MSVC, but not with gcc.
std::initializer_list is a reference type designed just for supporting list-initialization, and only has the default ctor and an implicit copy ctor. Any other ctor is an extension.
What you can do is initializing the target-container directly from an iterator-range, without involving any intermediate views.
The standard container to use unless you know better would be std::vector. Or would using a simple view like std::span be enough for you?
If you need an actual data owning container, then what you want is a std::vector. This is going to cost you a copy and an allocation. If all you need is to act like a container, then what you want is the upcoming std::span from C++20. It takes a pointer and a size and wraps it in an interface that is like an array.
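A minimal sketch of the owning option, using std::vector's iterator-range constructor; the helper name to_vector is made up for illustration:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: copies `size` ints starting at `arr` into a vector.
std::vector<int> to_vector(const int* arr, std::size_t size)
{
    // Pointers act as iterators, so the range constructor applies.
    return std::vector<int>(arr, arr + size);
}
```

The vector owns its own copy of the data, so later writes to the original array are not visible through it.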
MSVC's use of
std::initializer_list<int> foo(arr, arr + size);
is not standard. Per the standard the only constructor for std::initializer_list is
constexpr initializer_list() noexcept;
There are kind of two questions here, each with their own answer.
How to convert C array to std::initializer_list?
You can't. It doesn't really make sense to. std::initializer_list is really only used to initialize (as its name implies) objects. It is basically what is created from the {} notation like this:
myObject obj = {0,1,2,3,4};
Attempting to create an instance of std::initializer_list yourself isn't really useful in any other sense that I can think of, especially since its only public constructor is the default one, which produces an empty list.
If you have some object foo that accepts an std::initializer_list, like this:
class foo {
public:
    foo(std::initializer_list<type> list) {
        //...
    }
};
And you are wondering how to create this object without an std::initializer_list, then the answer is to simply add another constructor:
class foo {
public:
    // an actual array (the parameter decays to a pointer)
    foo(type arr[size]) {
        //...
    }
    // as a pointer
    foo(type* arr, size_t size) {
        //...
    }
};
If you are using a third party library, or some other library you don't control, that does not have another constructor, then chances are it's not intended to be used this way. In that case, I would consult your documentation or vendor.
What container can be created from it?
Any sequence container. Most of them have some sort of constructor that accepts pointers to an object (actually, it technically takes iterators, but pointers will work the same in this context) or an array. They are pretty much designed for easy conversion from C arrays. Which one you use will depend on your situation.
Also, std::span (which is not listed as a "sequence container") has been mentioned as a possible container that can get created from a C array at low cost. Although, I can't vouch for them personally, as I'm not too familiar with the upcoming standard.
Final note: If MSVC allows this, then either a) you're possibly in C++11 (though I can't confirm if this was allowed in C++11 either, just that the constructor is not constexpr in C++11) or b) it is a compiler bug in MSVC.

Sorting vector of custom type by their constant id

I need to sort a vector of custom type std::vector<Blah> v by Blah's integer id. I do this via std::sort(v.begin(), v.end()) with the operator < being overloaded within Blah as
bool operator< (const Blah& b) const { return (id < b.id); }
I noticed that Blah's private id cannot be declared as const int id, otherwise the type Blah does not meet the requirements for std::sort (I assume it conflicts with not being ValueSwappable?)
If id is not const everything is fine. However, I dislike the idea of the objects not having constant ids just for the requirement of rearranging their order within a vector.
Is there a way around or is this the way it is?
Is there a way around or is this the way it is?
I fear that this is the way it is. If you want to sort a vector, which is in principle an array, then you have to assign to elements when exchanging them.
At least that is what I thought; actually you can cheat a bit. Wrap your objects in a union:
template<typename T>
union ac {
    // actual object
    T thing;
    // assignment first destructs the object, then copy
    // constructs a new one in place.
    ac& operator=(ac<T> const& other) {
        if (this != &other) {
            thing.~T();
            new (&thing) T(other.thing);
        }
        return *this;
    }
    // need to provide constructor, destructor, etc.
    ac(T&& t) : thing(std::forward<T>(t)) {}
    ac(ac<T> const& other) : thing(other.thing) {}
    ~ac() {
        thing.~T();
    }
    // if you need them, add move assignment and constructor
};
You can then implement the (copy) assignment operator to first destruct the current object and then (copy) construct a new object from the provided one in place of the old object.
You also need to provide constructors and destructors, and of course this only works with C++11 and beyond due to limitations concerning union members in previous language standards.
This seems to work quite nicely: Live demo.
But still, I think you should first revisit some design choices, e.g. whether the constant id really needs to be part of your objects.
Is there a way around or is this the way it is?
So you want to update / swap the entire data of an object (including its identity) and to keep the identity constant; the two are in conflict, because constant means "doesn't change" (and swap means "change these instances").
You have stumbled here on the two (competing) definitions of const-ness: conceptual const-ness (what the data says/means is the same) and binary const-ness (the bytes representing the data do not change). (The first definition is what led to the introduction of mutable in the language: the ability to keep conceptual const-ness while breaking binary const-ness).
Your data here is conceptually constant (the interface to the data should be const) but not binary constant (you can swap values, so your bits may go away to another instance).
The canonical idea for this is to keep the data non-const internally, and provide only const public/protected access for client code.
You say:
However, I dislike the idea of the objects not having constant ids just for the requirement of rearranging their order within a vector.
Just because the identity is conceptually constant (exposed API is/should be constant), you have no actual hard requirement to keep the data constant (and should have no preference towards it, based on the API).
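As a rough sketch of that idea (the member name id_ and the helper sort_by_id are illustrative, not from the question): keep the id non-const inside the class and expose it only through a const accessor, so client code cannot change it while std::sort can still move elements around.

```cpp
#include <algorithm>
#include <vector>

class Blah {
public:
    explicit Blah(int id) : id_(id) {}

    int id() const { return id_; }   // read-only access for client code

    bool operator<(const Blah& other) const { return id_ < other.id_; }

private:
    int id_;   // deliberately not const, so Blah stays assignable
};

// Hypothetical helper: sorts the vector by id using operator<.
void sort_by_id(std::vector<Blah>& v)
{
    std::sort(v.begin(), v.end());
}
```

There is no setter, so the only way an id "changes" is by whole-object assignment, which is exactly what sorting needs.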

Trying to store an object in an array but then how to call that object's methods?

I'm not a very experienced C++ coder and this has me stumped. I am passing an object (created elsewhere) to a function; I want to be able to store that object in some array and then run through the array to call a function on that object. Here is some pseudo code:
void AddObject(T& object) {
object.action(); // this works
T* objectList = NULL;
// T gets allocated (not shown here) ...
T[0] = object;
T[0].action(); // this doesn't work
}
I know the object is passing correctly, because the first call to object.action() does what it should. But when I store object in the array, then try to invoke action() it causes a big crash.
Likely my problem is that I simply tinkered with the .'s and *'s until it compiled; T[0].action() compiles but crashes at runtime.
The simplest answer to your question is that you must declare your container correctly and you must define an appropriate assignment operator for your class. Working as closely as possible from your example:
typedef class MyActionableClass T;
T* getGlobalPointer();
void AddInstance(T const& objInstance)
{
T* arrayFromElsewhere = getGlobalPointer();
//ok, now at this point we have a reference to an object instance
//and a pointer which we assume is at the base of an array of T **objects**
//whose first element we don't mind losing
//**copy** the instance we've received
arrayFromElsewhere[0] = objInstance;
//now invoke the action() method on our **copy**
arrayFromElsewhere[0].action();
}
Note the signature change to const reference which emphasizes that we are going to copy the original object and not change it in any way.
Also note carefully that arrayFromElsewhere[0].action() is NOT the same as objInstance.action() because you have made a copy — action() is being invoked in a different context, no matter how similar.
While it is obvious you have condensed, the condensation makes the reason for doing this much less obvious — specifying, for instance, that you want to maintain an array of callback objects would make a better case for “needing” this capability. It is also a poor choice to use “T” like you did because this tends to imply template usage to most experienced C++ programmers.
The thing that is most likely causing your “unexplained” crash is that assignment operator; if you don't define one the compiler will automatically generate one that performs a memberwise copy (so raw pointers are copied shallowly), which is almost certainly not what you want if your class is anything other than a collection of simple data types (POD).
For this to work properly on a class of any complexity you will likely need to define a deep copy or use reference counting; in C++ it is almost always a poor choice to let the compiler create any of ctor, dtor, or assignment for you.
And, of course, it would be a good idea to use standard containers rather than the simple array mechanism you implied by your example. In that case you should probably also define a default ctor, a virtual dtor, and a copy ctor because of the assumptions made by containers and algorithms.
If, in fact, you do not want to create a copy of your object but want, instead, to invoke action() on the original object but from within an array, then you will need an array of pointers instead. Again working closely to your original example:
typedef class MyActionableClass T;
T** getGlobalPointer();
void AddInstance(T& objInstance)
{
T** arrayFromElsewhere = getGlobalPointer();
//ok, now at this point we have a reference to an object instance
//and a pointer which we assume is at the base of an array of T **pointers**
//whose first element we don't mind losing
//**reference** the instance we've received by saving its address
arrayFromElsewhere[0] = &objInstance;
//now invoke the action() method on **the original instance**
arrayFromElsewhere[0]->action();
}
Note closely that arrayFromElsewhere is now an array of pointers to objects instead of an array of actual objects.
Note that I dropped the const modifier in this case because I don’t know if action() is a const method — with a name like that I am assuming not…
Note carefully the ampersand (address-of) operator being used in the assignment.
Note also the new syntax for invoking the action() method by using the pointer-to operator.
Finally be advised that using standard containers of pointers is fraught with memory-leak peril, but typically not nearly as dangerous as using naked arrays :-/
I'm surprised it compiles. You declare an array, objectList of 8 pointers to T. Then you assign T[0] = object;. That's not what you want, what you want is one of
T objectList[8];
objectList[0] = object;
objectList[0].action();
or
T *objectList[8];
objectList[0] = &object;
objectList[0]->action();
Now I'm waiting for a C++ expert to explain why your code compiled, I'm really curious.
You can put the object either into a dynamic or a static array:
#include <vector> // dynamic
#include <array> // static
void AddObject(T const & t)
{
std::array<T, 12> arr;
std::vector<T> v;
arr[0] = t;
v.push_back(t);
arr[0].action();
v[0].action();
}
This doesn't really make a lot of sense, though; you would usually have defined your array somewhere else, outside the function.
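For instance, the container could live in a small class that outlives the call. The following sketch stores pointers so that action() runs on the original objects; Counter and Registry are hypothetical names standing in for the question's types, and the registry assumes the registered objects outlive it.

```cpp
#include <vector>

// Hypothetical stand-in for the question's class.
struct Counter {
    int calls = 0;
    void action() { ++calls; }
};

// Keeps addresses of objects and calls action() on each of them later.
class Registry {
public:
    void add(Counter& object) { objects_.push_back(&object); }

    void run_all()
    {
        for (Counter* c : objects_) c->action();
    }

private:
    std::vector<Counter*> objects_;   // non-owning pointers
};
```

Storing `std::vector<Counter>` instead would keep copies, in which case action() would run on the copies and not on the objects passed in.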

Benefits of a swap function?

Browsing through some C++ questions I have often seen comments that a STL-friendly class should implement a swap function (usually as a friend.) Can someone explain what benefits this brings, how the STL fits into this and why this function should be implemented as a friend?
For most classes, the default swap is fine; however, the default swap is not optimal in all cases. The most common example of this would be a class using the Pointer to Implementation idiom. Whereas the default swap would copy a large amount of memory, a specialized swap could speed it up significantly by only swapping the pointers.
If possible, it shouldn't be a friend of the class; however, it may need to access private data (for example, the raw pointers) which your class probably doesn't want to expose in the class API.
The standard version of std::swap() will work for most types that are assignable.
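// simplified pre-C++11 form; since C++11 std::swap moves instead of copying
template <typename T>
void std::swap(T& lhs, T& rhs)
{
    T tmp(lhs);
    lhs = rhs;
    rhs = tmp;
}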
But it is not an optimal implementation as it makes a call to the copy constructor followed by two calls to the assignment operator.
By adding your own version of std::swap() for your class you can implement an optimized version of swap().
Take std::vector, for example. The default implementation as defined above would be very expensive, as you would need to make a copy of the whole data area, potentially release old data areas or re-allocate the data area, and invoke the copy constructor for the contained type on each item copied. A specialized version has a very simple way to do std::swap():
// NOTE this is not real code.
// It is just an example to show how much more efficient swapping a vector could
// be, and how using a temporary for the vector object is not required.
std::swap(std::vector<T>& lhs,std::vector<T>& rhs)
{
std::swap(lhs.data,rhs.data); // swap a pointer to the data area
std::swap(lhs.size,rhs.size); // swap a couple of integers with size info.
std::swap(lhs.resv,rhs.resv);
}
As a result if your class can optimize the swap() operation then you should probably do so. Otherwise the default version will be used.
Personally I like to implement swap() as a non throwing member method. Then provide a specialized version of std::swap():
class X
{
public:
    // As a side note:
    // This is also useful for any non-trivial class.
    // It allows the implementation of the assignment operator
    // using the copy-and-swap idiom.
    void swap(X& rhs) noexcept; // No-throw exception guarantee
};
// Should be in the same namespace as X.
// This allows ADL to find the correct swap when used by objects/functions in
// other namespaces.
void swap(X& lhs, X& rhs) noexcept
{
    lhs.swap(rhs);
}
If you want to swap (for example) two vectors without knowing anything about their implementation, you basically have to do something like this:
typedef std::vector<int> vec;
void myswap(vec &a, vec &b) {
vec tmp = a;
a = b;
b = tmp;
}
This is not efficient if a and b contain many elements since all those elements are copied between a, b and tmp.
But if the swap function would know about and have access to the internals of the vector, there might be a more efficient implementation possible:
void std::swap(vec &a, vec &b) {
// assuming the elements of the vector are actually stored in some memory area
// pointed to by vec::data
void *tmp = a.data;
a.data = b.data;
b.data = tmp;
// ...
}
In this implementation just a few pointers need to be copied, not all the elements like in the first version. And since this implementation needs access to the internals of the vector it has to be a friend function.
I interpreted your question as basically three different (related) questions.
Why does STL need swap?
Why should a specialized swap be implemented (instead of relying on the default swap)?
Why should it be implemented as a friend?
Why does STL need swap?
The reason an STL friendly class needs swap is that swap is used as a primitive operation in many STL algorithms. (e.g. reverse, sort, partition etc. are typically implemented using swap)
Why should a specialized swap be implemented (instead of relying on the default swap)?
There are many (good) answers to this part of your question already. Basically, knowing the internals of a class frequently allows you to write a much more optimized swap function.
Why should it be implemented as a friend?
The STL algorithms will always call swap as a free function. So it needs to be available as a non member function to be useful. And, since it's only beneficial to write a customized swap when you can use knowledge of internal structures to write a much more efficient swap, this means your free function will need access to the internals of your class, hence a friend.
Basically, it doesn't have to be a friend, but if it doesn't need to be a friend, there's usually no reason to implement a custom swap either.
Note that you should make sure the free function is inside the same namespace as your class, so that the STL algorithms can find your free function via Koenig lookup (argument-dependent lookup, ADL).
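To illustrate the lookup, here is a small sketch. The namespace shapes, the class Polygon, the swap_calls counter, and the helper exchange_values are invented for this example; the counter exists only to show which overload gets picked.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

namespace shapes {

class Polygon {
public:
    static int swap_calls;   // illustration only: counts custom swaps

    explicit Polygon(std::vector<int> points) : points_(std::move(points)) {}

    std::size_t size() const { return points_.size(); }

    // Friend free function in the class's namespace, so ADL can find it.
    friend void swap(Polygon& a, Polygon& b) noexcept
    {
        a.points_.swap(b.points_);   // exchanges internal buffers only
        ++Polygon::swap_calls;
    }

private:
    std::vector<int> points_;
};

int Polygon::swap_calls = 0;

} // namespace shapes

// Generic code: std::swap is the fallback, the unqualified call lets
// argument-dependent lookup prefer a custom swap when one exists.
template <typename T>
void exchange_values(T& a, T& b)
{
    using std::swap;
    swap(a, b);
}
```

Calling exchange_values on two Polygon objects goes through shapes::swap, while calling it on two ints falls back to std::swap.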
One other use of the swap function is to aid exception-safe code: http://www.gotw.ca/gotw/059.htm
Efficiency:
If you've got a class that holds (smart) pointers to data then it's likely to be faster to swap the pointers than to swap the actual data - 3 pointer copies vs. 3 deep copies.
If you use a 'using std::swap' + an unqualified call to swap (or just a qualified call to boost::swap), then ADL will pick up the custom swap function, allowing efficient template code to be written.
Safety:
Pointer swaps (raw pointers, std::auto_ptr and std::tr1::shared_ptr) do not throw, so can be used to implement a non-throwing swap. A non-throwing swap makes it easier to write code that provides the strong exception guarantee (transactional code).
The general pattern is:
class MyClass
{
    //other members etc...
    void method()
    {
        MyClass finalState(*this); //copy the current object
        finalState.f1(); //a series of function calls that can modify the internal
        finalState.f2(); //state of finalState and/or throw.
        finalState.f3();
        //this only gets called if no exception is thrown - so either the entire function
        //completes, or no change is made to the object's state at all.
        swap(*this, finalState);
    }
};
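A concrete, hedged instance of this pattern, using a made-up Account class whose deposit_all either records every amount or leaves the object untouched:

```cpp
#include <stdexcept>
#include <vector>

// Hypothetical class illustrating copy, modify, then swap.
class Account {
public:
    const std::vector<int>& history() const { return history_; }

    // Appends all amounts, or none of them if any amount is invalid.
    void deposit_all(const std::vector<int>& amounts)
    {
        Account finalState(*this);   // work on a copy
        for (int amount : amounts) {
            if (amount <= 0)
                throw std::invalid_argument("amount must be positive");
            finalState.history_.push_back(amount);
        }
        // Reached only if nothing threw: commit with a non-throwing swap.
        history_.swap(finalState.history_);
    }

private:
    std::vector<int> history_;
};
```

If an exception escapes the loop, only the local copy has been modified, so the caller still sees the old history.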
As for whether it should be implemented as friend; swapping usually requires knowledge of implementation details. It's a matter of taste whether to use a non-friend that calls a member function or to use a friend.
Problems:
A custom swap is often faster than a single assignment - but a single assignment is always faster than the default three assignment swap. If you want to move an object, it's impossible to know in a generic way whether a swap or assignment would be best - a problem which C++0x solves with move constructors.
To implement assignment operators:
class C
{
public:
    C(C const&);
    void swap(C&) noexcept;
    C& operator=(C x) { this->swap(x); return *this; }
};
This is exception safe, the copy is done via the copy constructor when you pass by value, and the copy can be optimized out by the compiler when you pass a temporary (via copy elision).

Best way to return list of objects in C++?

It's been a while since I programmed in C++, and after coming from python, I feel soooo in a straight jacket, ok I'm not gonna rant.
I have a couple of functions that act as "pipes", accepting a list as input, returning another list as output (based on the input),
this is in concept, but in practice, I'm using std::vector to represent the list, is that acceptable?
further more, I'm not using any pointers, so I'm using std::vector<SomeType> the_list(some_size); as the variable, and returning it directly, i.e. return the_list;
P.S. So far it's all ok, the project size is small and this doesn't seem to affect performance, but I still want to get some input/advice on this, because I feel like I'm writing python in C++.
The only thing I can see is that you're forcing a copy of the list you return. It would be more efficient to do something like:
void DoSomething(const std::vector<SomeType>& in, std::vector<SomeType>& out)
{
...
// no need to return anything, just modify out
}
Because you pass in the list you want to return, you avoid the extra copy.
Edit: This is an old reply. If you can use a modern C++ compiler with move semantics, you don't need to worry about this. Of course, this answer still applies if the object you are returning DOES NOT have move semantics.
If you really need a new list, I would simply return it. Return value optimization will eliminate needless copies in most cases, and your code stays very clear.
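A minimal sketch of such a "pipe" stage that returns its result by value; the function double_evens and what it computes are invented for illustration:

```cpp
#include <vector>

// Hypothetical pipe stage: keeps the even numbers and doubles them.
std::vector<int> double_evens(const std::vector<int>& input)
{
    std::vector<int> output;
    output.reserve(input.size());
    for (int x : input) {
        if (x % 2 == 0) output.push_back(2 * x);
    }
    return output;   // elided or moved (C++11 and later), not deep-copied
}
```

The call site stays as simple as in Python: `std::vector<int> r = double_evens(values);`.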
That being said, taking lists and returning other lists is indeed python programming in C++.
A paradigm more suitable for C++ would be to create functions that take a range of iterators and alter the underlying collection.
e.g.
void DoSomething(iterator const & from, iterator const & to);
(with iterator possibly being a template, depending on your needs)
Chaining operations is then a matter of calling consecutive methods on begin(), end().
If you don't want to alter the input, you'd make a copy yourself first.
std::vector theOutput(inputVector);
This all comes from the C++ "don't pay for what you don't need" philosophy, you'd only create copies where you actually want to keep the originals.
I'd use the generic approach:
template <typename InIt, typename OutIt>
void DoMagic(InIt first, InIt last, OutIt out)
{
for(; first != last; ++first) {
if(IsCorrectIngredient(*first)) {
*out = DoMoreMagic(*first);
++out;
}
}
}
Now you can call it
std::vector<MagicIngredients> ingredients;
std::vector<MagicResults> results;
DoMagic(ingredients.begin(), ingredients.end(), std::back_inserter(results));
You can easily change the containers used without changing the algorithm; it is also efficient, since there's no overhead in returning containers.
If you want to be really hardcore, you could use boost::tuple.
tuple<int, int, double> add_multiply_divide(int a, int b) {
return make_tuple(a+b, a*b, double(a)/double(b));
}
But since it seems all your objects are of a single, non-polymorphic type, then the std::vector is all well and fine.
If your types were polymorphic (inherited classes of a base class) then you'd need a vector of pointers, and you'd need to remember to delete all the allocated objects before throwing away your vector.
Using a std::vector is the preferable way in many situations. It's guaranteed to use contiguous memory and is therefore pleasant for the L1 cache.
You should be aware of what happens when your return type is std::vector. What can happen under the hood (when the copy is not elided or replaced by a move) is that the std::vector is recursively copied, so if SomeType's copy constructor is expensive the "return statement" may be a lengthy and time-consuming operation.
If you are searching and inserting a lot in your list you could look at std::set to get logarithmic time complexity instead of linear. (std::vector's push_back is constant time until its capacity is exceeded.)
You are saying that you have many "pipe functions"... sounds like an excellent scenario for std::transform.
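As a sketch of what one such stage might look like with std::transform (the function to_labels and its int-to-string mapping are made up for illustration):

```cpp
#include <algorithm>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical pipe stage: maps each int to a label string.
std::vector<std::string> to_labels(const std::vector<int>& input)
{
    std::vector<std::string> labels;
    labels.reserve(input.size());
    // Apply the lambda to every element and append the results.
    std::transform(input.begin(), input.end(), std::back_inserter(labels),
                   [](int x) { return "item" + std::to_string(x); });
    return labels;
}
```

Stages written this way can be chained by feeding one stage's result into the next.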
Another problem with returning a list of objects (as opposed to working on one or two lists in place, as BigSandwich pointed out) is that if your objects have complex copy constructors, those will be called for each element in the container.
If you have 1000 objects each referencing a hunk of memory, and they copy that memory on Object a, b; a=b; that's 1000 memcopys for you, just for returning them contained in a container. If you still want to return a container directly, think about pointers in this case.
It works very simply.
list<int> foo(void)
{
list<int> l;
// do something
return l;
}
Now receiving data:
list<int> lst=foo();
This is fully optimal because the compiler knows how to optimize the construction of lst well, and
it would not cause copies.
Another method, more portable:
list<int> lst;
// do anything you want with list
foo().swap(lst); // call swap on the temporary: a temporary cannot bind to swap's non-const reference parameter
What happens: foo is already optimized, so there is no problem returning the value. When
you call swap, lst takes over the new value without copying it. The old value
of lst ends up in the temporary and is destroyed.
This is the efficient way to do the job.