Implementing a C++ hashtable class using template - c++

I am trying to implement a hashtable in C++ that is sort of like the Java version.
I would like it to have the form of
template <class Key, class Value>
class hashtable {
...
};
Soon enough I noticed that I need to somehow convert Key into a number, so that I can use the simple hash function
int h(int hashkey) {
return hashkey%some_prime;
}
But the headache is that the Key type is only known at run time. Is it possible to check what type Key is at run time in C++? Or do I have to write this hashtable class manually for each type? That is easier to do but ugly. Does anyone know an elegant solution?

C++ templates are usually duck typed, meaning that you can explicitly cast to an integral type in the template, and any type that implements the appropriate conversion can be used as a key. That has the disadvantage of requiring that the classes implement the conversion operator in such a fashion that the hash function will be decent, which is asking for a lot.
You could instead provide a function template
template<typename T> int hash (T t);
Along with specializations for the built-in types, and any user who wants to use a custom class as a key will just have to provide their own specialization. I think this is a decent approach.
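A minimal sketch of that approach follows. The name hash_key (chosen here to avoid confusion with std::hash), the size_t return type, the const-reference parameter, the Point type, and the bucket_for helper are all assumptions made for illustration, not part of the answer above.

```cpp
#include <cstddef>
#include <string>

// Primary template: declared but never defined, so an unsupported key type
// is rejected at link time instead of being hashed badly.
template <typename T> std::size_t hash_key(const T& t);

// Specialization for a built-in type.
template <> inline std::size_t hash_key<int>(const int& t) {
    return static_cast<std::size_t>(t);
}

// Specialization for std::string: a simple polynomial hash (illustrative only).
template <> inline std::size_t hash_key<std::string>(const std::string& s) {
    std::size_t h = 0;
    for (unsigned char c : s) h = h * 31 + c;
    return h;
}

// A user-defined key type (hypothetical) supplies its own specialization.
struct Point { int x; int y; };
template <> inline std::size_t hash_key<Point>(const Point& p) {
    return hash_key(p.x) * 31 + hash_key(p.y);
}

// The table only ever calls hash_key; the key type is fixed at compile time.
template <typename Key>
std::size_t bucket_for(const Key& key, std::size_t bucket_count) {
    return hash_key(key) % bucket_count;
}
```

With this in place, bucket_for(Point{1, 2}, 101) compiles because a Point specialization exists, while a key type without one fails to link.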

You seem to have a few misunderstandings. Key type is known at compile time - that's the whole point of using templates. Secondly, there is really no such thing as a completely generic hash function that will work on any type. You need to implement different hash functions for different types, using function overloading or template specialization. There are many common hash functions used for strings, for example.
Finally, C++11 includes a standard hash table (std::unordered_map) which you can use instead of implementing your own.
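For reference, here is a short sketch of what using the standard container looks like. The count_words function is a made-up example; only std::unordered_map and its operator[] come from the standard library.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Counts how often each word occurs. The key type (std::string) is fixed at
// compile time, and the library already provides std::hash<std::string>.
inline std::unordered_map<std::string, int>
count_words(const std::vector<std::string>& words) {
    std::unordered_map<std::string, int> counts;
    for (const std::string& w : words) {
        ++counts[w];  // operator[] value-initializes a missing entry to 0
    }
    return counts;
}
```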

If you would like to try implementing a "generic" one, perhaps you can start with a skeleton much like this:
template <class T>
struct HashEntry { // chain node; you need this to deal with collisions
T curr;
HashEntry* next;
};
template <class V, size_t n>
class HashTable {
public:
void insert(V v)
{
...
size_t idx = v->getHashCode(n);
...
}
private:
HashEntry<V> table_[n];
};
It is usually instantiated with some pointer type. To figure out which slot a pointer should go in, it requires the pointed-to type to implement a member function "getHashCode" ...

Related

Using Concepts to create static polymorphic interface

Hello Stackoverflow community,
I've been really confused on the concepts syntax and am having a hard time getting started.
I would like to create a polymorphic interface for two types of operator types: unary and binary and opted to try out the concept feature in c++20.
Not sure if it matters, but I used CRTP to make my unary functor compatible with binary functors; however, I would like to get rid of that. Here's what I have so far:
template <typename T>
concept UnaryMatrixOperatable = requires(T _op) {
_op.template operate(std::unique_ptr<Matrix::Representation>{});
{_op.template operate() } -> same_as<std::unique_ptr<Matrix::Representation>>;
};
class ReLU : public UnaryAdapter<ReLU> {
public:
std::unique_ptr<Matrix::Representation> operate(
const std::unique_ptr<Matrix::Representation>& m);
};
static_assert(UnaryMatrixOperatable<ReLU>);
However, I am getting a compilation error, presumably because I am not doing some sort of template specialization for a const matrix & type?
include/m_algorithms.h:122:13: error: static_assert failed
static_assert(UnaryMatrixOperatable<ReLU>);
^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~
include/m_algorithms.h:122:27: note: because 'Matrix::Operations::Unary::ReLU' does not satisfy 'UnaryMatrixOperatable'
static_assert(UnaryMatrixOperatable<ReLU>);
^
include/m_algorithms.h:53:26: note: because '_op.template operate(std::unique_ptr<Matrix::Representation>{})' would be invalid: 'operate' following the 'template' keyword does not refer to a template
_op.template operate(std::unique_ptr<Matrix::Representation>{});
^
Thanks for all the help in advance, this design in my code has been problematic for over a week so I'm determined to find a clean way to fix it! Thanks.
Concepts are not base classes, and you should not treat concept requirements like base class interfaces. Base classes specify exact function signatures that derived classes must implement.
Concepts specify behavior that must be provided. So you explain what that behavior is.
The behavior you seem to want is that you can pass an rvalue of a unique pointer to an operate member function. So... say that.
template <typename T>
concept UnaryMatrixOperatable = requires(T _op, std::unique_ptr<Matrix::Representation> mtx)
{
_op.operate(std::move(mtx));
};
There's no need for template here because you do not care whether operate is a template function. It's not important in the slightest to your code if any particular T happens to implement operate as a template function or not. You're going to call it this way, so the user must provide some function interface that can be called as such.
The same goes for the zero-argument version. Though your interface should probably make it much more clear that you're moving from the unique pointer in question:
template <typename T>
concept UnaryMatrixOperatable = requires(T _op, std::unique_ptr<Matrix::Representation> mtx)
{
_op.operate(std::move(mtx));
{ std::move(_op).operate() } -> std::same_as<decltype(mtx)>;
};
In any case, the other reason you'll get a compile error is that your interface requires two functions: one that gets called with an object and one that does not. Your ReLU class only provides one function that pretends to do both.

Why do we need to specify the type template parameter for each class function definition?

Since we declare the template type parameter above the class declaration, why do we have to specify it again before each function definition? I'm confused because it's even in the same file, so it seems almost unnecessary to have to specify it over every function. And because we are using the :: operator, shouldn't it go back to the class declaration and see that T is already defined?
I'm new to C++ and still need to clear up some misunderstandings.
#ifndef __Foo_H__
#define __Foo_H__
template <class T>
class FooBar{
private:
bool foo1(T);
bool foo2(T);
public:
FooBar();
};
template <class T> bool FooBar<T>::foo1(T data){
code..
}
template <class T> bool FooBar<T>::foo2(T data){
code..
}
#endif
First, you may rename the template parameter, just as you can rename the argument of a normal function:
template <class U> bool FooBar<U>::foo1(U and_here_too){/**/}
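It also makes it possible to handle (partial) specialization (the second definition below requires that the partial specialization FooBar<std::vector<T>> has itself been declared):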
template <> bool FooBar<int>::foo1(int i){/**/}
template <typename T> bool FooBar<std::vector<T>>::foo1(std::vector<T> v){/**/}
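Here is a compilable sketch of both points, with a trimmed-down FooBar whose member bodies are invented purely for illustration (the original elides them).

```cpp
#include <type_traits>

template <class T>
class FooBar {
public:
    bool foo1(T data);
};

// The `template <class U>` prefix introduces the parameter for this
// definition; its name need not match the one used in the class.
template <class U>
bool FooBar<U>::foo1(U data) {
    return data != U{};  // illustrative body: true for non-default values
}

// Explicit specialization of the member for FooBar<int> only.
template <>
inline bool FooBar<int>::foo1(int data) {
    return data > 0;     // illustrative body: true for positive values
}

// Each instantiation is a distinct class, which is why the definition must
// say which FooBar<...> it belongs to.
static_assert(!std::is_same<FooBar<int>, FooBar<double>>::value,
              "FooBar<int> and FooBar<double> are unrelated types");
```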
Templates are an example of generic programming. The idea is to reuse code and algorithms. In languages with strict type checking you run into seemingly unnecessary constraints: for instance, you may have a sorting function that does a great job in one project but is incompatible with the types used in another.
C++, C#, and Java support generic programming through templates (C++) and generics (C#, Java). With generics (let's talk about Java), the generic class is an existing entity of its own, and the type parameters serve mainly for type checking. That is their purpose in collections. When you inspect how a list works, you see that it stores Objects and casts back to the parameterized type only when an object is retrieved. When you write a generic class, you can only assume that the parameterized type is Object or a declared interface, as in the following example:
class Test<T extends Comparable> {}
Here you can use T as a Comparable. Unless you explicitly declare an interface, the parameter is treated as Object.
Now comes the difference between generics and templates. In C++ you can assume much more about the parameterized type in the implementation: you can write a sort for objects of an unknown type. In Java you have to know at least which interface the parameter type implements. As a consequence, C++ has to build a new class for each parameter (in order to check whether the code is correct). Vector<int> is a completely separate type from Vector<float>, while in Java there is only one class, Vector<? extends Comparable>.
:: is the scope operator. You can access the scope of Vector<int> because that class exists; a plain Vector, however, does not.
As a result, Java can compile generics ahead of time and C++ cannot compile a template on its own. Templates have to be available in headers to all programmers; you cannot hide them (there has been some work on compiling templates separately, but I don't know what its status is).
So with generics you can refer to the method Vector.add(), while with templates you have to specify the parameter: template<class T> Vector<T>.
PS. Since the template parameter is an integral part of the class name, you can use templates for compile-time calculations such as the Fibonacci sequence:
template<int N> struct Fibonacci {
static const int element = Fibonacci<N-1>::element + Fibonacci<N-2>::element;
};
template<> struct Fibonacci<1> {
static const int element = 1;
};
template<> struct Fibonacci<0> {
static const int element = 0;
};

How to implement a generic hash function in C++

I am trying to implement HashTable in C++ via templates.
Here is the signature:
template<class T1, class T2>
class HashTable {
public:
void add(T1 a, T2 b);
void hashFunction(T1 key, T2 value)
{
// how to implement this function using key as a generic
// we need to know the object type of key
}
};
So, I am unable to move ahead with implementation involving a generic key.
In Java, I could have easily cast the key to string and then be happy with implementing the hash for a key as string. But, in C++, what I know is that there is a concept of RTTI which can dynamically cast an object to the desired object.
How to implement that dynamic cast, if this method is correct at all?
If using template is not the correct approach to implement generics for this case, then please suggest some better approach.
You would typically use std::hash for this, and let type implementors specialize that template as required.
size_t key_hash = std::hash<T1>()(key);
There is no way you can generically implement a hash function for any random type you are given. If two objects are equal, their hash codes must be the same. You could simply run the raw memory of the objects through a hash function, but the types might implement an operator== overload that ignores some piece of object data (say, a synchronization object). In that case you could potentially (and very easily) return different hash values for equal objects.
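As a hedged illustration of that point, the sketch below specializes std::hash for a made-up Employee type whose operator== deliberately ignores one member. The type, its fields, and the hash-combining formula are all assumptions for the example; the only rule taken from the answer is that the hash must use exactly the data that equality uses.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>

// Hypothetical key type: `scratch` is bookkeeping that equality ignores.
struct Employee {
    std::string name;
    int id;
    int scratch;
};

inline bool operator==(const Employee& a, const Employee& b) {
    return a.name == b.name && a.id == b.id;  // scratch is not compared
}

namespace std {
// Specializing std::hash for a user-defined type is permitted.
template <>
struct hash<Employee> {
    size_t operator()(const Employee& e) const noexcept {
        size_t h1 = hash<string>()(e.name);
        size_t h2 = hash<int>()(e.id);
        // Combine only the fields that operator== looks at.
        return h1 ^ (h2 + 0x9e3779b9 + (h1 << 6) + (h1 >> 2));
    }
};
}  // namespace std
```

Hashing the raw bytes of Employee would include scratch (and any padding), so two equal objects could land in different buckets; the specialization above avoids that.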
It's strange that you want to hash both the key and the value. How would you then be able to look up a value given only its key?
If you are using C++11, a good idea is to use std::hash<T1>, which is provided for some types (integers, strings, pointers) and can be specialized for other classes. Besides, it's a good idea to allow changing it through a third template parameter. See how unordered_map does it:
template<typename K, typename V, typename H = std::hash<K>>
class HashTable {
//...
size_t hashFunction(const K& key) const {
size_t hash = H()(key);
// process the hash somehow; you probably need the remainder after division by the number of buckets, or something similar
return hash % size;
}
size_t size; // assumed here: the number of buckets
};
It seems impossible to write your own hasher that works well for most types, because the equality operator may be overridden in some complicated way.
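Putting the pieces together, here is one possible minimal table with a pluggable hasher. It is a sketch, not how unordered_map is actually implemented: separate chaining with std::list, the add/find names, the default of 16 buckets, and ConstantHash are all choices made for this example, and it assumes the bucket count is greater than zero.

```cpp
#include <cstddef>
#include <functional>
#include <list>
#include <string>
#include <utility>
#include <vector>

template <typename K, typename V, typename H = std::hash<K>>
class HashTable {
public:
    // Assumes bucket_count > 0.
    explicit HashTable(std::size_t bucket_count = 16) : buckets_(bucket_count) {}

    // Inserts the pair, or overwrites the value if the key is already present.
    void add(const K& key, const V& value) {
        auto& chain = buckets_[index_of(key)];
        for (auto& kv : chain) {
            if (kv.first == key) { kv.second = value; return; }
        }
        chain.emplace_back(key, value);
    }

    // Returns a pointer to the stored value, or nullptr if the key is absent.
    const V* find(const K& key) const {
        const auto& chain = buckets_[index_of(key)];
        for (const auto& kv : chain) {
            if (kv.first == key) return &kv.second;
        }
        return nullptr;
    }

private:
    // Only the key is hashed; the result is reduced to a bucket index.
    std::size_t index_of(const K& key) const {
        return H()(key) % buckets_.size();
    }

    std::vector<std::list<std::pair<K, V>>> buckets_;
};

// A deliberately bad custom hasher: every key collides, which exercises
// the chaining path while still giving correct lookups.
struct ConstantHash {
    std::size_t operator()(int) const { return 0; }
};
```

HashTable<std::string, int> uses std::hash<std::string> by default, while HashTable<int, int, ConstantHash> swaps the hasher without touching the table code.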

C++ collection class to call children functions

The project I'm working on has some pretty nasty collection classes that I feel could use a redesign. I'd really like to make a collection template class that takes model instances and provides a way to call type-specific functions of each child in the collection. For example, something like:
MyCollection<Student> BiologyStudents;
// [Fill the collection]
BiologyStudents.EnrollInClass(ClassList::Biology);
BiologyStudents.Commit();
The idea is that I could easily enroll all students in a class using my collection, then commit those changes to a database. My problem is in how to expose that EnrollInClass() function which belongs to the children Student objects? If my collection contains objects of a different type than Student, I would like those functions to be exposed from the collection. The only way I can think to do that with my semi-limited C++ knowledge would be to make a function that takes a parameter which references a function I know is in the containing child class. This wouldn't provide compilation errors if you call the wrong function or provide the wrong parameters, so I'd like a way to utilize the compiler to provide these checks.
Is this possible? If so, how? As a warning, I'm used to generic programming in Java/C#, so my impression of C++ templates might be a bit off.
One way would be to use a method pointer:
template <typename T>
struct MyCollection {
template <typename U>
void ForEach(void (T::*func)(U),U param)
{
// for each item loop goes here
(item.*func)(param);
}
};
MyCollection<Student> BiologyStudents;
// [Fill the collection]
BiologyStudents.ForEach(&Student::EnrollInClass,ClassList::Biology);
You would have to provide different versions for different numbers of parameters.
With C++11, you can do this:
template <typename T>
struct MyCollection {
void ForEach(std::function<void (T &)> func)
{
// for each item loop goes here
func(item);
}
};
MyCollection<Student> BiologyStudents;
// [Fill the collection]
BiologyStudents.ForEach([](Student &s){s.EnrollInClass(ClassList::Biology);});
Which would not require making different versions of ForEach for different numbers of parameters.
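For completeness, a filled-in sketch of the collection is shown below. The std::vector storage, the Add member, and the Student and ClassList definitions are assumptions made so the example compiles. It also includes a variadic-template overload of the method-pointer version (a C++11 feature not mentioned above), which is one way to avoid writing a separate overload per parameter count.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model types.
enum class ClassList { Biology, Chemistry };

struct Student {
    std::string name;
    std::vector<ClassList> classes;
    void EnrollInClass(ClassList c) { classes.push_back(c); }
};

template <typename T>
class MyCollection {
public:
    void Add(T item) { items_.push_back(std::move(item)); }

    // Callable version: works with lambdas that capture any arguments.
    void ForEach(const std::function<void(T&)>& func) {
        for (T& item : items_) func(item);
    }

    // Method-pointer version, generalized to any number of parameters.
    template <typename... Params, typename... Args>
    void ForEach(void (T::*func)(Params...), Args&&... args) {
        // args are passed as lvalues so they can be reused for every item.
        for (T& item : items_) (item.*func)(args...);
    }

private:
    std::vector<T> items_;
};
```

Either way the compiler does the checking: naming a member that Student does not have, or passing an argument of the wrong type, is a compile error rather than a run-time failure.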

Template parameters dilemma

I have a dilemma. Suppose I have a template class:
template <typename ValueT>
class Array
{
public:
typedef ValueT ValueType;
ValueType& GetValue()
{
...
}
};
Now I want to define a function that receives a reference to the class and calls the function GetValue(). I usually consider the following two ways:
Method 1:
template <typename ValueType>
void DoGetValue(Array<ValueType>& arr)
{
ValueType value = arr.GetValue();
...
}
Method 2:
template <typename ArrayType>
void DoGetValue(ArrayType& arr)
{
typename ArrayType::ValueType value = arr.GetValue();
...
}
There is almost no difference between the two methods. Even calling both functions will look exactly the same:
int main()
{
Array<int> arr;
DoGetValue(arr);
}
Now, which of the two is the best? I can think of some cons and pros:
Method 1 pros:
The parameter is a real class not a template, so it is easier for the user to understand the interface - it is very explicit that the parameter has to be Array. In method 2 you can guess it only from the name. We use ValueType in the function so it is more clear this way than when it is hidden inside Array and must be accessed using the scope operator.
In addition the typename keyword might be confusing for many non template savvy programmers.
Method 2 pros:
This function is more "true" to its purpose. When I think of it, I don't really need the class to be Array. What I really need is a class that has a method GetValue and a type ValueType. That's all. That is, this method is more generic.
This method is also less dependent on the changes in Array class. What if the template parameters of Array are changed? Why should it affect DoGetValue? It doesn't really care how Array is defined.
Every time I have this situation I'm not sure what to choose. What is your choice?
The second one is better. In your "pros" for the first one, you say, "it is very explicit that the parameter has to be Array". But saying that the parameter has to be an Array is an unnecessary limitation. In the second example, any class with a suitable GetValue function will do. Since it's an unnecessary limitation, it's better to remove it (second one) than to make it explicit (first one). You'll write more flexible templates, which is useful in future when you want to get a value from something that isn't an Array.
If your function is very specific to Array, and no other template will satisfy its interface requirements, use #1 as it's both shorter and more specific: the casual reader is informed that it operates on an Array.
If there's a possibility that other templates will be compatible with DoGetValue, use #2 as it's more generic.
But no use obsessing, since it's easy enough to convert between them.
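The difference is easiest to see with a second, unrelated type. In this sketch Array is reduced to a single stored value, and Counter, GetCopy1 and GetCopy2 are hypothetical names introduced only for the comparison.

```cpp
template <typename ValueT>
class Array {
public:
    typedef ValueT ValueType;
    explicit Array(ValueT v) : value_(v) {}
    ValueType& GetValue() { return value_; }
private:
    ValueT value_;
};

// An unrelated class that happens to offer the same interface.
class Counter {
public:
    typedef int ValueType;
    ValueType& GetValue() { return count_; }
private:
    int count_ = 7;
};

// Method 1: accepts only instantiations of Array.
template <typename ValueType>
ValueType GetCopy1(Array<ValueType>& arr) {
    return arr.GetValue();
}

// Method 2: accepts anything with a ValueType typedef and a GetValue member.
template <typename ArrayType>
typename ArrayType::ValueType GetCopy2(ArrayType& arr) {
    return arr.GetValue();
}
```

GetCopy2 works for both Array<int> and Counter, whereas GetCopy1(counter) does not compile because Counter is not an Array<...>.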
My friend proposed two more, somewhat more extreme, methods:
Method 3: gives you the ability to use types that don't have a ::ValueType.
template <typename ArrayType, typename ValueType = typename ArrayType::ValueType>
void DoGetValue(ArrayType& arr)
{
ValueType value = arr.GetValue();
...
}
Method 4: a cool way of forcing the array to be a class that has one template parameter.
template <template <typename> class ArrayType, typename ValueType>
void DoGetValue(ArrayType<ValueType>& arr)
{
ValueType value = arr.GetValue();
...
}