Where to implement the hash function? - c++

I am using an object as a key in an unordered_map, so I need to define a hash function. My question is: where should the hash function be implemented? Should I put it with the class implementation, or should I implement it close to where I need it?
UPDATE:
If it makes a difference, all of this is based in a framework

If you anticipate you'll need to reuse it in many unordered_maps, put it somewhere visible, like in the class.
If you just need it for a one-off unordered_map, put it close to where you use it. You can even use a lambda.
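As a minimal sketch of the one-off case, the hash can be a lambda defined right next to the map that uses it. The Point type and the countDistinct function are hypothetical names chosen for illustration, and the way the two fields are combined is only one simple possibility.

```cpp
#include <cstddef>
#include <functional>
#include <unordered_map>
#include <vector>

// Hypothetical key type, used only for illustration.
struct Point {
    int x;
    int y;
    bool operator==(const Point& other) const {
        return x == other.x && y == other.y;
    }
};

// Counts distinct points using a map whose hash lives next to its only use.
inline std::size_t countDistinct(const std::vector<Point>& pts) {
    // The lambda is the hash function; its type becomes the Hash template argument.
    auto hash = [](const Point& p) {
        return std::hash<int>{}(p.x) * 31u + std::hash<int>{}(p.y);
    };
    // Before C++20 a lambda type is not default-constructible,
    // so the hash object is passed to the constructor explicitly.
    std::unordered_map<Point, int, decltype(hash)> seen(16, hash);
    for (const Point& p : pts) {
        ++seen[p];
    }
    return seen.size();
}
```

The equality operator still lives with the type here; only the hash is local to the call site.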

I'd put it with the class definition, at least if you're using == as
the equality function in the unordered_map. The implementation of the
hash function depends on the implementation of the equality comparison,
and there is a definite advantage in keeping both together, to reduce
the probability of someone not changing the hash function if they
change ==.
If you're also defining a special equality function for the map, then
the two functions should be defined together, probably close to where
they will be used to instantiate the map.
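A sketch of the first case, where == is the equality used by the map: the hash is provided as a std::hash specialization placed directly below operator== in the same header, so the two are read and changed together. The Employee class and its members are hypothetical, and the combining formula is just one common choice.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical key class; the name and members are illustrative only.
class Employee {
public:
    Employee(std::string name, int id) : name_(std::move(name)), id_(id) {}
    const std::string& name() const { return name_; }
    int id() const { return id_; }
private:
    std::string name_;
    int id_;
};

// Equality and hashing sit side by side so that they are changed together.
inline bool operator==(const Employee& a, const Employee& b) {
    return a.id() == b.id() && a.name() == b.name();
}

namespace std {
template <>
struct hash<Employee> {
    size_t operator()(const Employee& e) const noexcept {
        // Combine exactly the fields that operator== compares.
        size_t h1 = hash<string>{}(e.name());
        size_t h2 = hash<int>{}(e.id());
        return h1 ^ (h2 + 0x9e3779b9 + (h1 << 6) + (h1 >> 2));
    }
};
}  // namespace std
```

With this in place, std::unordered_map<Employee, V> works without any extra template arguments.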

In my opinion, if the hash function is as basic as the one below, it should be a method of the class and should also be inline.
int hashFunction(long x) {
    return (int) (x % N);  // N is a class-specific constant (not shown)
}
If the hash function is a little more complex, it should still be a method of this class, because you will need an N that is specific to that class.

Related

How to achieve constexpr pseudopolymorphism?

In a ray tracing project that I'm trying to make compile-time (constexpr) for fun and challenge, I've run into a bit of an issue: I have an object (intersection) that needs to refer to one of a group of other objects (shapes).
Now, my understanding is that you cannot use polymorphism / virtual methods with constexpr because of the vtable lookups, so as far as I know, I cannot have a superclass, Shape, from which the other classes derive. Thus, I need to make Intersection a template class that holds one of its shapes.
Unfortunately, I need to store these Intersection classes in an array or some other container, and I want to be able to call a common function on them and their shape, i.e. where the pseudopolymorphism comes in.
I implemented something that solves the problem, where I take an std::array of std::variant and whenever I add to the array, if the type isn't represented by anything in the std::variant, then I expand it. I can also achieve pseudopolymorphism by using std::visit, invoking a commonly named function on each element to result in an std::array of final elements.
My implementation is here, in this gist. I thought it would be too long to post:
https://gist.github.com/sraaphorst/28998c109f94a78616e7dd488c1491d1
Now, I've been known to solve problems with much more difficulty than is necessary, so I was wondering if any of you know of a simpler way to achieve this?

Where to put comparison function for use with (e.g.) std::sort?

If I had a class Cell, for example, which I want to sort according to the following function (x and y here being int member variables with accessors):
bool sortByCoordinates(const Cell& c1, const Cell& c2) {
    return c1.getX() < c2.getX() || (c1.getX() == c2.getX() && c1.getY() < c2.getY());
}
Where exactly is the best place to put this function so that I can use it with a function such as std::sort?
In the examples they just have the method floating in the source file above where it is needed, but in practice I want to keep it associated with the Cell class. I know that I could override the operator< but there might be other sort methods by which I'd like to sort Cell, and I'm not a big fan of overriding operators for code clarity's sake anyhow.
At the moment I have it as a static method in my Cell.h file, so that I can call it when sorting like so:
std::sort(cells.begin(), cells.end(), Cell::sortByCoordinates);
Is this the best practice for multiple (or even singular) custom sort functions, and is the header file the right place for them? If not, what is?
Doing it the way you describe is reasonable. And defining the comparison function inline in the header itself is a good idea if you care about performance (rather than defining it in the .cpp file).
Personally I have a different preference than you. I would declare this reasonable default comparison function at namespace scope (i.e. right below the class), because as written it does not need privileged access to class members. And I would declare it as operator <. I don't think there is anything to be ashamed about in terms of making one function "special" when it seems to be a reasonable default ordering.
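A rough sketch of that preference, assuming a Cell class with the getX and getY accessors from the question: the default ordering is an inline operator< declared right below the class, while any other ordering stays a named function. The sortByY comparator is a hypothetical second ordering added only to show the contrast.

```cpp
#include <algorithm>
#include <tuple>
#include <vector>

class Cell {
public:
    Cell(int x, int y) : x_(x), y_(y) {}
    int getX() const { return x_; }
    int getY() const { return y_; }
private:
    int x_;
    int y_;
};

// Default ordering: by x, then by y. It needs only the public accessors,
// so it lives at namespace scope rather than inside the class.
inline bool operator<(const Cell& a, const Cell& b) {
    return std::make_tuple(a.getX(), a.getY()) <
           std::make_tuple(b.getX(), b.getY());
}

// Hypothetical alternative ordering, kept as a named function: by y, then by x.
inline bool sortByY(const Cell& a, const Cell& b) {
    return std::make_tuple(a.getY(), a.getX()) <
           std::make_tuple(b.getY(), b.getX());
}
```

std::sort(cells.begin(), cells.end()) then uses the default ordering, and std::sort(cells.begin(), cells.end(), sortByY) selects the alternative.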

Define std::hash<std::function>

I need to create a templated class that can hold pointers to elements of type T and then performs functions on them. The functions will come from different places, so I need a container to store them, so I can call them later. I decided to use an std::unordered_set, because it offers speed and restricts duplication due to it being implemented as a hash table. I have a whole class written, but it doesn't compile due to there not being a hash function defined for my std::function that takes a pointer of type T and returns void. It's easy enough to specify it with struct hash<std::function<void(MyCustomType*)>> (and overloading the () operator, too) for each type I use, but how do I actually hash the function?
Here is a watered-down excerpt from my class with the relevant members and methods:
template <typename T>
class Master {
private:
    // Does not compile as written: there is no std::hash for std::function.
    std::unordered_set<std::function<void(T*)>> functions;
protected:
    void registerFunction(std::function<void(T*)> function) {
        this->functions.insert(function);
    }
    void unregisterFunction(std::function<void(T*)> function) {
        this->functions.erase(function);
    }
};
I'm not completely bound to using an std::unordered_set, but it seems to offer everything that I'd need to get this piece (and the rest of my code) working well.
Am I thinking about this the wrong way? Is it completely impossible to hash a std::function?
A set is mostly useful when you need to check whether a given element is in it.
So I do not see the point of using one here. You'll have your functions and you'll store them in the set, and after that, what? You just iterate over them?
As for your question: an element of an unordered set needs a way to generate a hash and an operator==(). The second is not provided for std::function (it can only be compared against nullptr), so you wouldn't be able to check whether your function is really in the set.
So even if you found a way to generate a hash from the function, you would be stuck. And I do not see how to meet the hash requirement either.
Why not simply use a std::vector?
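One possible way around the missing hash and equality, sketched here as an assumption rather than as what the answer proposes: instead of a plain std::vector, key each callback by a handle that the class hands out, so that unregistering does not require comparing std::function objects. The Handle type, invokeAll, and size are hypothetical additions, and the methods are public here (the question had them protected) only so the sketch can be exercised directly.

```cpp
#include <cstddef>
#include <functional>
#include <unordered_map>
#include <utility>

template <typename T>
class Master {
public:
    using Callback = std::function<void(T*)>;
    using Handle = std::size_t;

    // Stores the callback and returns a handle that identifies it later.
    Handle registerFunction(Callback fn) {
        Handle id = nextId_++;
        functions_.emplace(id, std::move(fn));
        return id;
    }

    // Returns true if a callback with this handle existed and was removed.
    bool unregisterFunction(Handle id) {
        return functions_.erase(id) > 0;
    }

    // Calls every registered callback on the given object (order unspecified).
    void invokeAll(T* target) const {
        for (const auto& entry : functions_) {
            entry.second(target);
        }
    }

    std::size_t size() const { return functions_.size(); }

private:
    Handle nextId_ = 0;
    std::unordered_map<Handle, Callback> functions_;
};
```

The trade-off is that this no longer prevents the same function from being registered twice; the caller has to keep the handle to remove a callback.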

Is it idiomatically ok to put algorithm into class?

I have a complex algorithm. This uses many variables, calculates helper arrays at initialization and also calculates arrays along the way. Since the algorithm is complex, I break it down into several functions.
Now, I actually do not see how this might be a class from an idiomatic way; I mean, I am just used to having algorithms as functions. The usage would simply be:
Calculation calc(/* several parameters */);
calc.calculate();
// get the heterogeneous results via getters
On the other hand, putting this into a class has the following advantages:
I do not have to pass all the variables to the other functions/methods
arrays initialized at the beginning of the algorithm are accessible throughout the class in each function
my code is shorter and (imo) clearer
A hybrid way would be to put the algorithm class into a source file and access it via a function that uses it. The user of the algorithm would not see the class.
Does anyone have valuable thoughts that might help me out?
Thank you very much in advance!
I have a complex algorithm. This uses many variables, calculates helper arrays at initialization and also calculates arrays along the way.[...]
Now, I actually do not see how this might be a class from an idiomatic way
It is not, but many people do the same thing you do (so did I a few times).
Instead of creating a class for your algorithm, consider transforming your inputs and outputs into classes/structures.
That is, instead of:
Calculation calc(a, b, c, d, e, f, g);
calc.calculate();
// use getters on calc from here on
you could write:
CalcInputs inputs(a, b, c, d, e, f, g);
CalcResult output = calculate(inputs); // calculate is now a free function
// use getters on output from here on
This doesn't create any problems and performs the same (actually better) grouping of data.
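To make the shape concrete, here is a small sketch under invented assumptions: the fields of CalcInputs and CalcResult and the computation itself (a sum and a scaled mean) are placeholders, and plain public members are used instead of getters to keep it short.

```cpp
#include <numeric>
#include <vector>

// Hypothetical input bundle; the fields are placeholders.
struct CalcInputs {
    std::vector<double> samples;
    double scale;
};

// Hypothetical result bundle holding heterogeneous outputs.
struct CalcResult {
    double sum;
    double scaledMean;
};

// Free function: everything it needs comes in through CalcInputs,
// and everything it produces goes out through CalcResult.
inline CalcResult calculate(const CalcInputs& in) {
    CalcResult out{0.0, 0.0};
    out.sum = std::accumulate(in.samples.begin(), in.samples.end(), 0.0);
    if (!in.samples.empty()) {
        out.scaledMean = in.scale * out.sum / static_cast<double>(in.samples.size());
    }
    return out;
}
```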
I'd say it is very idiomatic to represent an algorithm (or perhaps better, a computation) as a class. One of the definitions of object class from OOP is "data and functions to operate on that data." A complex algorithm with its inputs, outputs and intermediary data matches this definition perfectly.
I've done this myself several times, and it simplifies (human) code flow analysis significantly, making the whole thing easier to reason about, to debug and to test.
If the abstraction for the client code is an algorithm, you
probably want to keep a pure functional interface, and not
introduce additional types there. It's quite common, on the
other hand, for such a function to be implemented in a source
file which defines a common data structure or class for its
internal use, so you might have:
double calculation( /* input parameters */ )
{
    SupportClass calc( /* input parameters */ );
    calc.part1();
    calc.part2();
    // etc...
    return calc.results();
}
Depending on how your code is organized, SupportClass will be
in an unnamed namespace in the source file (probably the most
common case), or in a "private" header, included only by the
sources involved in the algorithm.
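A filled-in sketch of the unnamed-namespace variant, with an invented computation (population standard deviation) standing in for the real algorithm; the member names and the two-step split are assumptions made for illustration.

```cpp
#include <cmath>
#include <vector>

namespace {

// Internal helper, invisible outside this source file.
class SupportClass {
public:
    explicit SupportClass(const std::vector<double>& data) : data_(data) {}

    // Step 1: compute the mean of the input.
    void part1() {
        for (double v : data_) mean_ += v;
        if (!data_.empty()) mean_ /= static_cast<double>(data_.size());
    }

    // Step 2: accumulate squared deviations from the mean.
    void part2() {
        for (double v : data_) sumSq_ += (v - mean_) * (v - mean_);
    }

    double results() const {
        if (data_.empty()) return 0.0;
        return std::sqrt(sumSq_ / static_cast<double>(data_.size()));
    }

private:
    const std::vector<double>& data_;  // must outlive this object
    double mean_ = 0.0;
    double sumSq_ = 0.0;
};

}  // namespace

// The only symbol client code sees: a plain function.
double calculation(const std::vector<double>& data) {
    SupportClass calc(data);
    calc.part1();
    calc.part2();
    return calc.results();
}
```

Client code only needs a declaration of calculation in a header; the class can change freely without touching callers.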
It really depends on what kind of algorithm you want to encapsulate. Generally I agree with John Carmack: "Sometimes, the elegant implementation is just a function. Not a method. Not a class. Not a framework. Just a function."
It really boils down to: does the algorithm need access to a private part of the class that is not supposed to be public? If the answer is yes (unless you are willing to refactor your class interface, depending on the specific case), you should go with a member function; if not, a free function is good enough.
Take for example the standard library. Most of the algorithms are provided as free functions because they only access the public interface of the class (with iterators for standard containers, for example).
Do you need to call the exact same functions in the exact same order each time? Then you shouldn't be requiring calling code to do this. Splitting your algorithm into multiple functions is fine, but I'd still have one call the next and then the next and so on, with a struct of results/parameters being passed along the way. A class doesn't feel right for a one-off invocation of some procedure.
The only way I'd do this with a class is if the class encapsulates all the input data itself, and you then call myClass.nameOfMyAlgorithm() on it, among other potential operations. Then you have data+manipulators. But just manipulators? Yeah, I'm not so sure.
In modern C++ the distinction has been eroded quite a bit. Even from the operator overloading of the pre-ANSI language, you could create a class whose instances are syntactically like functions:
struct Multiplier
{
    int factor_;
    Multiplier(int f) : factor_(f) { }
    int operator()(int v) const
    {
        return v * factor_;
    }
};
Multiplier doubler(2);
std::cout << doubler(3) << std::endl; // prints 6
Such a class/struct is called a functor, and can capture "contextual" values in its constructor. This allows you to effectively pass the parameters to a function in two stages: some in the constructor call, some later each time you call it for real. This is called partial function application.
To relate this to your example, your calculate member function could be turned into operator(), and then the Calculation instance would be a function! (or near enough.)
To unify these ideas, you can try thinking of a plain function as a functor of which there is only one instance (and hence no need for a constructor - although this is no guarantee that the function only depends on its formal parameters: it might depend on global variables...)
Rather than asking "Should I put this algorithm in a function or a class?" instead ask yourself "Would it be useful to be able to pass the parameters to this algorithm in two or more stages?" In your example, all the parameters go into the constructor, and none in the later call to calculate, so it makes little sense to ask users of your class to make two calls.
In C++11 the distinction breaks down further (and things get a lot more convenient), in recognition of the fluidity of these ideas:
auto doubler = [] (int val) { return val * 2; };
std::cout << doubler(3) << std::endl; // prints 6
Here, doubler is a lambda, which is essentially a nifty way to declare an instance of a compiler-generated class that implements the () operator.
Reproducing the original example more exactly, we would want a function-like thing called multiplier that accepts a factor, and returns another function-like thing that accepts a value v and returns v * factor.
auto multiplier = [] (int factor)
{
    return [=] (int v) { return v * factor; };
};
auto doubler = multiplier(2);
std::cout << doubler(3) << std::endl; // prints 6
Note the pattern: ultimately we're multiplying two numbers, but we specify the numbers in two steps. The functor we get back from calling multiplier acts like a "package" containing the first number.
Although lambdas are relatively new, they are likely to become a very common part of C++ style (as they have in every other language they've been added to).
But sadly at this point we've reached the "cutting edge" as the above example works in GCC but not in MSVC 12 (I haven't tried it in MSVC 13). It does pass the intellisense checking of MSVC 12 though (they use two completely different compilers)! And you can fix it by wrapping the inner lambda with std::function<int(int)>( ... ).
Even so, you can use these ideas in old-school C++ when writing functors by hand.
Looking further ahead, resumable functions may make it into some future version of the language (Microsoft is pushing hard for them as they are practically identical to async/await in C#) and that is yet another blurring of the distinction between functions and classes (a resumable function acts like a constructor for a state machine class).

boost::unordered_map key class hash and equal methods

Is it possible to avoid declaring separate objects for hash generation and key comparison, and instead create member functions in the key class for these capabilities?
As noted in the comments, you can easily define an operator== for your class. You can also write a free function hash_value that takes a parameter of your class type, and this should be used automatically.