C++ hash function for an int array

I need to specialize the hash function for unordered_map so I can use int arrays as keys. The array values are usually 0 or 1, e.g. int array[] = {0, 1, 0, 1}, but technically not bounded.
Can someone recommend a good hash function in this case? Alternatively, I can always convert the int array into a string and avoid specialization. But I am concerned about performance since I may have several million of these arrays.

C++ TR1 contains a hash template function.
If you don't have that yet, you can use Boost Hash.
Idea for a handy helper:
#include <boost/functional/hash.hpp>
template <typename T, std::size_t N>
static std::size_t hasharray(const T (&arr)[N])
{
    // hash_range folds every element of [arr, arr+N) into a single value
    return boost::hash_range(arr, arr + N);
}
This would be (roughly?) equivalent to
size_t seed = 0;
for (const T* it=arr; it!=(arr+N); ++it)
boost::hash_combine(seed, *it);
return seed;
Don't forget to implement a matching equality comparison if you're using this hash for lookup.
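To make the hash-plus-equality pairing concrete, here is a minimal sketch that avoids the Boost dependency. It assumes the keys are stored as std::vector<int> (raw arrays cannot be used directly as map keys). The names hashCombine, IntVectorHash and IntVectorMap are made up for this illustration, and the mixing step follows the formula commonly documented for boost::hash_combine rather than calling Boost itself.

```cpp
#include <cstddef>
#include <functional>
#include <unordered_map>
#include <vector>

// Hypothetical helper: folds one element hash into the running seed.
// The formula mirrors the one commonly documented for boost::hash_combine.
inline void hashCombine(std::size_t& seed, std::size_t value) {
    seed ^= value + 0x9e3779b9 + (seed << 6) + (seed >> 2);
}

// Hypothetical hasher for std::vector<int> keys.
struct IntVectorHash {
    std::size_t operator()(const std::vector<int>& key) const {
        std::size_t seed = 0;
        for (int x : key) hashCombine(seed, std::hash<int>()(x));
        return seed;
    }
};

// std::vector already provides operator==, which unordered_map uses by default,
// so only the hasher has to be supplied.
using IntVectorMap = std::unordered_map<std::vector<int>, int, IntVectorHash>;
```

Because equal vectors always produce the same seed sequence, equal keys hash equally, which is the property the container relies on.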

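Try the lookup8 hash function. It is very fast and produces well-distributed hashes.
int key[100];
int key_size = 10;
for (int i = 0; i < key_size; i++) key[i] = i; // fill key with sample data
// hash() takes the key as bytes, its length in bytes, and a seed;
// the result must not be stored in a variable that is itself named hash
ub8 h = hash((ub1*)key, sizeof(key[0]) * key_size, 0);
UPD: Or use a better function, such as t1ha.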

Related

C++ - Hash/Map a std::vector<uint64_t> in a single uint64_t

I need to map a std::vector<uint64_t> to a single uint64_t. Is it possible to do? I thought of using a hash function. Is that a solution?
For example, this vector:
std::vector<uint64_t> v {
    16377,
    2631694347470643681,
    11730294873282192384ULL // too large for a signed 64-bit literal, hence the suffix
};
should be converted into one uint64_t.
If a hash function is not a good solution (e.g. a high percentage of collisions), is there an alternative way to do this mapping?
"I need to hash a std::vector<uint64_t> to a single uint64_t. Is it possible to do?"
Yes, variable length hash functions exist, and it's possible to implement them in C++.
The C++ standard library comes with a few hash functions, but unfortunately not for vector (other than for the bool specialisation). We can reuse the hash function provided for string views, but this is a bit of a kludge:
const char* data = reinterpret_cast<const char*>(v.data());
std::size_t size = v.size() * sizeof(v[0]);
std::hash<std::string_view> hash;
std::cout << hash(std::string_view(data, size));
Note that using this is reasonable only in the case std::has_unique_object_representations_v is true of the element type of vector. I think it's reasonable to assume that to be the case for std::uint64_t.
A caveat when using standard library hash functions is that they don't have exact specification and as such you cannot rely on hashes being identical across separate systems. You should use another hash function if that is a concern.
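If hashes must be identical across systems, one option is a hash with a published definition, such as 64-bit FNV-1a. The following is only a sketch: the function name fnv1a64 is made up here, and feeding each element byte by byte in little-endian order is a choice made for this example so that the result does not depend on the host's byte order.

```cpp
#include <cstdint>
#include <vector>

// 64-bit FNV-1a over the elements of v.
// Each element is consumed as 8 bytes, least significant byte first.
inline std::uint64_t fnv1a64(const std::vector<std::uint64_t>& v) {
    std::uint64_t h = 14695981039346656037ULL;   // FNV-1a 64-bit offset basis
    for (std::uint64_t x : v) {
        for (int shift = 0; shift < 64; shift += 8) {
            h ^= (x >> shift) & 0xFFu;           // mix in one byte
            h *= 1099511628211ULL;               // FNV 64-bit prime
        }
    }
    return h;
}
```

As with any hash from an unbounded input space to 64 bits, collisions are possible; the function only guarantees that the same vector gives the same value everywhere.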
You can create a std::map<std::vector<uint64_t>, uint64_t> (std::vector already provides operator<, so the default comparison works) and keep adding your vectors to the map while incrementing a counter.
That counter will be your hash value.
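A minimal sketch of this counter idea, assuming the vectors can all be kept in memory. The class name VectorIds and its member idOf are hypothetical. Unlike a real hash, this mapping is collision-free, but the result depends on insertion order and the map has to store every vector it has seen.

```cpp
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical helper: hands out consecutive IDs, one per distinct vector.
class VectorIds {
public:
    std::uint64_t idOf(const std::vector<std::uint64_t>& v) {
        // emplace inserts only if v is not present yet;
        // either way it returns an iterator to the stored entry
        auto result = ids_.emplace(v, next_);
        if (result.second) ++next_;   // a new vector consumed the current ID
        return result.first->second;
    }

private:
    std::map<std::vector<std::uint64_t>, std::uint64_t> ids_;
    std::uint64_t next_ = 0;
};
```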
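The comment above in code:
#include <algorithm>
#include <array>
#include <cstdint>
#include <iostream>
#include <vector>
static const std::array<std::uint64_t, 5> primes = { 3, 5, 7, 11, 13 };
// Weighted sum of (at most) the first five elements; later elements are ignored.
static std::uint64_t hash(const std::vector<std::uint64_t>& v)
{
    if (v.empty()) return 0; // reading v[0] of an empty vector would be undefined behavior
    std::uint64_t hash = v[0];
    for (std::size_t n = 1; n < std::min(primes.size(), v.size()); ++n) hash += primes[n] * v[n];
    return hash;
}
int main()
{
    // the ULL suffix is needed because the last literal does not fit a signed 64-bit type
    std::vector<std::uint64_t> v{ 16377, 2631694347470643681ULL, 11730294873282192384ULL };
    std::cout << hash(v);
    return 0;
}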

The most efficient nested array container in C++ for reading and writing?

I am a mathematician by training and need to simulate a continuous-time Markov chain. I need to use a variant of the Gillespie algorithm which relies on fast reading and writing to a 13-dimensional array. At the same time, I need to set the size of each dimension based on the user's input (each will be roughly of order 10). Once these sizes are set by the user, they will not change throughout the runtime. The only thing that changes will be the data contained in them. What is the most efficient way of doing this?
My first try was to use the standard arrays but their sizes must be known at the compilation time, which is not my case. Is std::vector a good structure for this? If so, how shall I go about initializing a creature as:
vector<vector<vector<vector<vector<vector<vector<vector<vector<vector<vector<vector<vector<int>>>>>>>>>>>>> Array;
Will the initialization take more time than dealing with an array? Or, is there a better data container to use, please?
Thank you for any help!
I would start by using a std::unordered_map to hold key-value pairs, with each key being a 13-dimensional std::array, and each value being an int (or whatever datatype is appropriate), like this:
#include <iostream>
#include <unordered_map>
#include <array>
typedef std::array<int, 13> MarkovAddress;
// Define a hasher that std::unordered_map can use
// to compute a hash value for a MarkovAddress
// borrowed from: https://codereview.stackexchange.com/a/172095/126857
template<class T, size_t N>
struct std::hash<std::array<T, N>> {
size_t operator() (const std::array<T, N>& key) const {
std::hash<T> hasher;
size_t result = 0;
for(size_t i = 0; i < N; ++i) {
result = result * 31 + hasher(key[i]); // multiply the running hash by 31, then add this element's hash
}
return result;
}
};
int main(int, char **)
{
std::unordered_map<MarkovAddress, int> map;
// Just for testing
const MarkovAddress a{{1,2,3,4,5,6,7,8,9,10,11,12,13}};
// Place a value into the map at the specified address
map[a] = 12345;
// Now let's see if the value is present in the map,
// and retrieve it if so
if (map.count(a) > 0)
{
std::cout << "Value in map is " << map[a] << std::endl;
}
else std::cout << "Value not found!?" << std::endl;
return 0;
}
That will give you fast (O(1) on average) lookup and insert, which is likely your first priority. If you later run into trouble with that (e.g. too much RAM used, or you need a well-defined iteration order, etc.) you could replace it with something more elaborate.
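One caveat with the code above: the standard only allows specializations of std templates that depend on a program-defined type, so specializing std::hash for every std::array<T, N> is on shaky ground. A variant that sidesteps this is to pass a separate hasher type as the third template argument. The sketch below keeps the same multiply-by-31 scheme; the names MarkovAddressHash and MarkovMap are invented for this example.

```cpp
#include <array>
#include <cstddef>
#include <functional>
#include <unordered_map>

using MarkovAddress = std::array<int, 13>;

// Hypothetical stand-alone hasher; nothing is added to namespace std.
struct MarkovAddressHash {
    std::size_t operator()(const MarkovAddress& key) const {
        std::size_t result = 0;
        for (int coordinate : key)
            result = result * 31 + std::hash<int>()(coordinate);
        return result;
    }
};

// std::array already provides operator==, so only the hasher is supplied.
using MarkovMap = std::unordered_map<MarkovAddress, int, MarkovAddressHash>;
```

Usage is unchanged apart from the map type: declare a MarkovMap instead of std::unordered_map<MarkovAddress, int>.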

C++: Function templates

I have 2 2D arrays that represent a maze
const char maze1[10][11]
and
const char maze2[20][21]
I'm trying to create 1 function to handle both mazes like so:
void solveMaze(maze[][])
{
}
and just pass the maze like solveMaze(maze1);
How would I do this with function templates?
I recently asked this question already but explicitly asked not to use function templates because I wasn't sure on how to use it, but I would like to see how it would work. (hope this isn't "abusing" the system)
You do not need and should not use templates to solve this problem. All you are doing is solving mazes of different sizes.
Templates are for the generation of a number of classes/functions that use various types.
Instead, construct a class to store a maze. This class should store the dimensions of the maze and give access to the components of that maze.
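A rough sketch of what such a class could look like, with the dimensions chosen at run time. Everything here (the Maze class, its at() accessor, and the countCells function standing in for a real solver) is hypothetical and only meant to show the shape of the idea.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical Maze class: owns its cells and remembers its dimensions.
class Maze {
public:
    Maze(std::size_t rows, std::size_t cols, char fill)
        : rows_(rows), cols_(cols), cells_(rows * cols, fill) {}

    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }

    // Cells are stored row by row in one contiguous buffer.
    char& at(std::size_t r, std::size_t c) { return cells_[r * cols_ + c]; }
    char at(std::size_t r, std::size_t c) const { return cells_[r * cols_ + c]; }

private:
    std::size_t rows_;
    std::size_t cols_;
    std::vector<char> cells_;
};

// One ordinary (non-template) function handles mazes of every size.
// As a stand-in for a solver, it counts the cells equal to `what`.
inline std::size_t countCells(const Maze& maze, char what) {
    std::size_t count = 0;
    for (std::size_t r = 0; r < maze.rows(); ++r)
        for (std::size_t c = 0; c < maze.cols(); ++c)
            if (maze.at(r, c) == what) ++count;
    return count;
}
```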
First of all, it would be much simpler if you were using better arrays. The issue with C-arrays is that they have a tendency to decay to pointers easily, and once they do, the size is lost (and that, my dear, is pretty stupid as far as I am concerned...)
The choice then depends on whether you have fixed-size arrays or want dynamically-sized arrays:
for fixed-size: std::array (or if unavailable boost::array)
for dynamically-sized: std::vector
Since a template will make more sense in the std::array case, I'll suppose that is what you elected.
char const maze1[10][11]
is equivalent to
std::array<std::array<char, 11>, 10> const maze1
It's slightly more verbose, but std::array provides regular member functions like .size(), .begin(), .end(), etc., and it can be passed to functions easily.
Now, on to your template functions. The signature will simply be:
template <size_t M, size_t N>
void solveMaze(std::array<std::array<char, N>, M> const& maze);
However, despite your question, you most likely do not want templates here (they are of little benefit). So I would advise using vector and a regular function:
void solveMaze(std::vector< std::vector<char> > const& maze);
template<int w, int h>
void solveMaze(const char (&maze)[w][h])
{
//can use w,h now
}
There is no real support for multidimensional Arrays. You should consider using a class with proper support for the dimensions. The following does the trick
template<int N, int M>
void solveMaze(const char (&maze)[N][M]) {
size_t n = N;
size_t m = M;
int x = 0;
}
int main(int argc, char *argv[])
{
const char maze[3][2] = { { 0, 1} , {2, 3}, {4, 5} };
solveMaze(maze);
return 0;
}

Write the prototype for a function that takes an array of exactly 16 integers

One of the interview questions asked me to "write the prototype for a C function that takes an array of exactly 16 integers" and I was wondering what it could be? Maybe a function declaration like this:
void foo(int a[], int len);
Or something else?
And what about if the language was C++ instead?
In C, this requires a pointer to an array of 16 integers:
void special_case(int (*array)[16]);
It would be called with:
int array[16];
special_case(&array);
In C++, you can use a reference to an array, too, as shown in Nawaz's answer. (The question asks for C in the title, and originally only mentioned C++ in the tags.)
Any version that uses some variant of:
void alternative(int array[16]);
ends up being equivalent to:
void alternative(int *array);
which will accept any size of array, in practice.
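When compiled as C++, this adjustment can be checked at compile time. The sketch below assumes a C++11 compiler; the function body is a made-up placeholder so that the function can actually be called.

```cpp
#include <type_traits>

void alternative(int array[16]);   // the 16 is ignored by the compiler

// The parameter type is adjusted to int*, so the function type is void(int*).
static_assert(std::is_same<decltype(alternative), void(int*)>::value,
              "array parameter was adjusted to a pointer");

// Hypothetical definition; it redeclares the very same function.
void alternative(int* array) { array[0] = 42; }
```

Calling alternative with an int[3] compiles without any diagnostic, which is exactly the problem described above.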
The question is asked - does special_case() really prevent a different size of array from being passed. The answer is 'Yes'.
void special_case(int (*array)[16]);
void anon(void)
{
int array16[16];
int array18[18];
special_case(&array16);
special_case(&array18);
}
The compiler (GCC 4.5.2 on MacOS X 10.6.6, as it happens) complains (warns):
$ gcc -c xx.c
xx.c: In function ‘anon’:
xx.c:9:5: warning: passing argument 1 of ‘special_case’ from incompatible pointer type
xx.c:1:6: note: expected ‘int (*)[16]’ but argument is of type ‘int (*)[18]’
$
Change to GCC 4.2.1 - as provided by Apple - and the warning is:
$ /usr/bin/gcc -c xx.c
xx.c: In function ‘anon’:
xx.c:9: warning: passing argument 1 of ‘special_case’ from incompatible pointer type
$
The warning in 4.5.2 is better, but the substance is the same.
There are several ways to declare array-parameters of fixed size:
void foo(int values[16]);
accepts any pointer-to-int, but the array-size serves as documentation
void foo(int (*values)[16]);
accepts a pointer to an array with exactly 16 elements
void foo(int values[static 16]);
accepts a pointer to the first element of an array with at least 16 elements
struct bar { int values[16]; };
void foo(struct bar bar);
accepts a structure boxing an array with exactly 16 elements, passing them by value.
& is necessary in C++:
void foo(int (&a)[16]); // & is necessary. (in C++)
Note : & is necessary, otherwise you can pass array of any size!
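To see the size check at work without triggering a compile error, one can ask the compiler whether a call would be valid. This is only a sketch: firstOf16 and the detection trait AcceptedByFirstOf16 are hypothetical names, and the trait uses a plain C++11 SFINAE pattern.

```cpp
#include <type_traits>
#include <utility>

// Accepts only an lvalue array of exactly 16 ints.
inline int firstOf16(int (&a)[16]) { return a[0]; }

// Hypothetical detection trait: true when firstOf16(arg) would compile for Arg.
template <typename Arg, typename = int>
struct AcceptedByFirstOf16 : std::false_type {};

// Chosen only when the call expression is well-formed (its type is int).
template <typename Arg>
struct AcceptedByFirstOf16<Arg, decltype(firstOf16(std::declval<Arg>()))>
    : std::true_type {};

static_assert(AcceptedByFirstOf16<int (&)[16]>::value, "exactly 16 elements: accepted");
static_assert(!AcceptedByFirstOf16<int (&)[18]>::value, "18 elements: rejected");
static_assert(!AcceptedByFirstOf16<int*>::value, "plain pointer: rejected");
```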
For C:
void foo(int (*a)[16]) //one way
{
}
typedef int (*IntArr16)[16]; //other way
void bar(IntArr16 a)
{
}
int main(void)
{
int a[16];
foo(&a); //call like this - otherwise you'll get warning!
bar(&a); //call like this - otherwise you'll get warning!
return 0;
}
Demo : http://www.ideone.com/fWva6
I think the simplest way to be typesafe would be to declare a struct that holds the array, and pass that:
struct Array16 {
int elt[16];
};
void Foo(struct Array16* matrix);
You already got some answers for C, and an answer for C++, but there's another way to do it in C++.
As Nawaz said, to pass an array of N size, you can do this in C++:
const size_t N = 16; // For your question.
void foo(int (&arr)[N]) {
// Do something with arr.
}
However, as of C++11, you can also use the std::array container, which can be passed with more natural syntax (assuming some familiarity with template syntax).
#include <array>
const size_t N = 16;
void bar(std::array<int, N> arr) {
// Do something with arr.
}
As a container, std::array allows mostly the same functionality as a normal C-style array, while also adding additional functionality.
std::array<int, 5> arr1 = { 1, 2, 3, 4, 5 };
int arr2[5] = { 1, 2, 3, 4, 5 };
// Operator[]:
for (int i = 0; i < 5; i++) {
assert(arr1[i] == arr2[i]);
}
// Fill:
arr1.fill(0);
for (int i = 0; i < 5; i++) {
arr2[i] = 0;
}
// Check size:
size_t arr1Size = arr1.size();
size_t arr2Size = sizeof(arr2) / sizeof(arr2[0]);
// Foreach (C++11 syntax):
for (int &i : arr1) {
// Use i.
}
for (int &i : arr2) {
// Use i.
}
However, to my knowledge (which is admittedly limited at the time), pointer arithmetic isn't safe with std::array unless you use the member function data() to obtain the actual array's address first. This is both to prevent future modifications to the std::array class from breaking your code, and because some STL implementations may store additional data in addition to the actual array.
Note that this would be most useful for new code, or if you convert your pre-existing code to use std::arrays instead of C-style arrays. As std::arrays are aggregate types, they lack custom constructors, and thus you can't directly switch from C-style array to std::array (short of using a cast, but that's ugly and can potentially cause problems in the future). To convert them, you would instead need to use something like this:
#include <array>
#include <algorithm>
const size_t N = 16;
std::array<int, N> cArrayConverter(int (&arr)[N]) {
std::array<int, N> ret;
std::copy(std::begin(arr), std::end(arr), std::begin(ret));
return ret;
}
Therefore, if your code uses C-style arrays and it would be infeasible to convert it to use std::arrays instead, you would be better off sticking with C-style arrays.
(Note: I specified sizes as N so you can more easily reuse the code wherever you need it.)
Edit: There's a few things I forgot to mention:
1) The majority of the C++ standard library functions designed for operating on containers are implementation-agnostic; instead of being designed for specific containers, they operate on ranges, using iterators. (This also means that they work for std::basic_string and instantiations thereof, such as std::string.) For example, std::copy has the following prototype:
template <class InputIterator, class OutputIterator>
OutputIterator copy(InputIterator first, InputIterator last,
OutputIterator result);
// first is the beginning of the first range.
// last is the end of the first range.
// result is the beginning of the second range.
While this may look imposing, you generally don't need to specify the template parameters, and can just let the compiler handle that for you.
std::array<int, 5> arr1 = { 1, 2, 3, 4, 5 };
std::array<int, 5> arr2 = { 6, 7, 8, 9, 0 };
std::string str1 = ".dlrow ,olleH";
std::string str2 = "Overwrite me!";
std::copy(arr1.begin(), arr1.end(), arr2.begin());
// arr2 now stores { 1, 2, 3, 4, 5 }.
std::copy(str1.begin(), str1.end(), str2.begin());
// str2 now stores ".dlrow ,olleH".
// Not really necessary for full string copying, due to std::string.operator=(), but possible nonetheless.
Due to relying on iterators, these functions are also compatible with C-style arrays (as iterators are a generalisation of pointers, all pointers are by definition iterators (but not all iterators are necessarily pointers)). This can be useful when working with legacy code, as it means you have full access to the range functions in the standard library.
int arr1[5] = { 4, 3, 2, 1, 0 };
std::array<int, 5> arr2;
std::copy(std::begin(arr1), std::end(arr1), std::begin(arr2));
You may have noticed from this example and the last that std::array.begin() and std::begin() can be used interchangeably with std::array. This is because std::begin() and std::end() are implemented such that for any container, they have the same return type, and return the same value, as calling the begin() and end() member functions of an instance of that container.
// Prototype:
template <class Container>
auto begin (Container& cont) -> decltype (cont.begin());
// Examples:
std::array<int, 5> arr;
std::vector<char> vec;
std::begin(arr) == arr.begin();
std::end(arr) == arr.end();
std::begin(vec) == vec.begin();
std::end(vec) == vec.end();
// And so on...
C-style arrays have no member functions, necessitating the use of std::begin() and std::end() for them. In this case, the two functions are overloaded to provide applicable pointers, depending on the type of the array.
// Prototype:
template <class T, size_t N>
T* begin (T(&arr)[N]);
// Examples:
int arr[5];
std::begin(arr) == &arr[0];
std::end(arr) == arr + 5; // one past the last element, not &arr[4]
As a general rule of thumb, if you're unsure about whether or not any particular code segment will have to use C-style arrays, it's safer to use std::begin() and std::end().
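As a small illustration of that rule of thumb, the sketch below writes one function template that works unchanged for a C-style array, a std::array and a std::vector. The name sumOf is made up for this example.

```cpp
#include <array>
#include <iterator>
#include <numeric>
#include <vector>

// Hypothetical helper: sums any range of ints that std::begin/std::end understand.
template <typename Range>
int sumOf(const Range& range) {
    return std::accumulate(std::begin(range), std::end(range), 0);
}
```

Had the body used range.begin() and range.end() instead, the call with a C-style array would not compile.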
[Note that while I used std::copy() as an example, the use of ranges and iterators is very common in the standard library. Most, if not all, functions designed to operate on containers (or more specifically, any implementation of the Container concept, such as std::array, std::vector, and std::string) use ranges, making them compatible with any current and future containers, as well as with C-style arrays. There may be exceptions to this widespread compatibility that I'm not aware of, however.]
2) When passing a std::array by value, there can be considerable overhead, depending on the size of the array. As such, it's usually better to pass it by reference, or use iterators (like the standard library).
// Pass by reference.
const size_t N = 16;
void foo(std::array<int, N>& arr);
3) All of these examples assume that all arrays in your code will be the same size, as specified by the constant N. To make your code more implementation-independent, you can either use ranges & iterators yourself, or if you want to keep your code focused on arrays, use templated functions. [Building on this answer to another question.]
template<size_t SZ> void foo(std::array<int, SZ>& arr);
...
std::array<int, 5> arr1;
std::array<int, 10> arr2;
foo(arr1); // Calls foo<5>(arr1).
foo(arr2); // Calls foo<10>(arr2).
If doing this, you can even go so far as to template the array's member type as well, provided your code can operate on types other than int.
template<typename T, size_t SZ>
void foo(std::array<T, SZ>& arr);
...
std::array<int, 5> arr1;
std::array<float, 7> arr2;
foo(arr1); // Calls foo<int, 5>(arr1).
foo(arr2); // Calls foo<float, 7>(arr2).
For an example of this in action, see here.
If anyone sees any mistakes I may have missed, feel free to point them out for me to fix, or fix them yourself. I think I caught them all, but I'm not 100% sure.
Based on Jonathan Leffler's answer
#include<stdio.h>
void special_case(int (*array)[4]);
void anon(void){
int array4[4];
int array8[8];
special_case(&array4);
special_case(&array8);
}
int main(void){
anon();
return 0;
}
void special_case(int (*array)[4]){
printf("hello\n");
}
gcc array_fixed_int.c &&./a.out will yield warning:
array_fixed_int.c:7:18: warning: passing argument 1 of ‘special_case’ from incompatible pointer type [-Wincompatible-pointer-types]
7 | special_case(&array8);
| ^~~~~~~
| |
| int (*)[8]
array_fixed_int.c:2:25: note: expected ‘int (*)[4]’ but argument is of type ‘int (*)[8]’
2 | void special_case(int (*array)[4]);
| ~~~~~~^~~~~~~~~
Skip warning:
gcc -Wno-incompatible-pointer-types array_fixed_int.c &&./a.out

How to initialize all elements in an array to the same number in C++ [duplicate]

I'm trying to initialize an int array with everything set at -1.
I tried the following, but it doesn't work. It only sets the first value at -1.
int directory[100] = {-1};
Why doesn't it work right?
I'm surprised at all the answers suggesting vector. They aren't even the same thing!
Use std::fill, from <algorithm>:
int directory[100];
std::fill(directory, directory + 100, -1);
Not concerned with the question directly, but you might want a nice helper function when it comes to arrays:
template <typename T, size_t N>
T* end(T (&pX)[N])
{
return pX + N;
}
Giving:
int directory[100];
std::fill(directory, end(directory), -1);
So you don't need to list the size twice.
I would suggest using std::array, for three reasons:
1. array offers bounds-checked element access through at(), which throws std::out_of_range (note that operator[] itself is not checked),
2. array automatically carries its size, so the size need not be passed separately,
3. and most importantly, array provides the fill() method that this problem calls for.
#include <array>
#include <assert.h>
typedef std::array< int, 100 > DirectoryArray;
void test_fill( DirectoryArray const & x, int expected_value ) {
for( size_t i = 0; i < x.size(); ++i ) {
assert( x[ i ] == expected_value );
}
}
int main() {
DirectoryArray directory;
directory.fill( -1 );
test_fill( directory, -1 );
return 0;
}
Using array requires compiling as C++11 or later, e.g. with "-std=c++11" (or "-std=c++0x" on older compilers); this applies to the above code.
If that is not available or not an option, then other approaches such as std::fill() (as suggested by GMan) or a hand-coded fill loop can be used.
If you had a smaller number of elements you could specify them one after the other. Array initialization works by specifying each element, not by specifying a single value that applies for each element.
int x[3] = {-1, -1, -1 };
You could also use a vector and use the constructor to initialize all of the values. You can later access the raw array buffer by specifying &v.front()
std::vector<int> directory(100, -1);
There is also a C way to do it, using memset or similar functions. Note that memset sets each char (byte) of the buffer, not each int, so it works fine for 0, but for other values the result depends on the representation: -1 only works because all of its bytes are 0xFF in two's complement.
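The byte-wise behaviour is easy to observe. The sketch below assumes a two's-complement machine (which is what makes the -1 case work) and uses a made-up helper name, firstAfterMemset.

```cpp
#include <cstring>

// Sets every byte of a small int array to byteValue and returns the first int.
inline int firstAfterMemset(int byteValue) {
    int a[4];
    std::memset(a, byteValue, sizeof a);
    return a[0];
}
```

With a byte value of 0 or -1 the ints come out as 0 or -1, but a byte value of 1 does not give 1: with a 4-byte int each element becomes 0x01010101, i.e. 16843009.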
You can also use STL to initialize your array by using fill_n. For a general purpose action to each element you could use for_each.
fill_n(directory, 100, -1);
Or if you really want you can go the lame way, you can do a for loop with 100 iterations and doing directory[i] = -1;
If you really need arrays, you can use Boost's array class. Its assign member does the job:
boost::array<int,N> array; // boost arrays are of fixed size!
array.assign(-1);
It does work right. Your expectation of the initialiser is incorrect. If you really wish to take this approach, you'll need 100 comma-separated -1s in the initialiser. But then what happens when you increase the size of the array?
Use a vector of int instead of an array.
vector<int> directory(100, -1); // 100 ints with value -1
It is working right. That's how list initializers work.
I believe 6.7.8.10 of the C99 standard covers this:
If an object that has automatic storage duration is not initialized explicitly, its value is indeterminate. If an object that has static storage duration is not initialized explicitly, then:
if it has pointer type, it is initialized to a null pointer;
if it has arithmetic type, it is initialized to (positive or unsigned) zero;
if it is an aggregate, every member is initialized (recursively) according to these rules;
if it is a union, the first named member is initialized (recursively) according to these rules.
If you need to make all the elements in an array the same non-zero value, you'll have to use a loop or memset.
Also note that, unless you really know what you're doing, vectors are preferred over arrays in C++:
Here's what you need to realize about containers vs. arrays:
Container classes make programmers more productive. So if you insist on using arrays while those around are willing to use container classes, you'll probably be less productive than they are (even if you're smarter and more experienced than they are!).
Container classes let programmers write more robust code. So if you insist on using arrays while those around are willing to use container classes, your code will probably have more bugs than their code (even if you're smarter and more experienced).
And if you're so smart and so experienced that you can use arrays as fast and as safe as they can use container classes, someone else will probably end up maintaining your code and they'll probably introduce bugs. Or worse, you'll be the only one who can maintain your code so management will yank you from development and move you into a full-time maintenance role — just what you always wanted!
There's a lot more to the linked question; give it a read.
Simply use a for loop, as below:
for (int i = 0; i < 100; i++)
{
    a[i] = -1;
}
As a result you get what you want:
a[100] = {-1, -1, -1, ... (100 times)}
I had the same question and I found how to do, the documentation give the following example :
std::array<int, 3> a1{ {1, 2, 3} }; // double-braces required in C++11 (not in C++14)
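So I just tried:
std::array<int, 3> a1{ {1} }; // only the first element is given an initializer
But this does not set every element to 1. As with a raw array, only the first element gets the value 1; the remaining elements are value-initialized to 0. To set every element, use the fill() member function or std::fill.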
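A quick check of what a single initializer actually produces for std::array, next to fill() for comparison. The two helper functions are hypothetical and exist only to make the result easy to inspect.

```cpp
#include <array>

// Builds the array with a single initializer, as in the snippet above.
inline std::array<int, 3> oneInitializer() {
    std::array<int, 3> a1{ {1} };   // a1[0] is 1; a1[1] and a1[2] are zero
    return a1;
}

// Sets every element explicitly with fill().
inline std::array<int, 3> filledWith(int value) {
    std::array<int, 3> a;
    a.fill(value);
    return a;
}
```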
Can't do what you're trying to do with a raw array (unless you explicitly list out all 100 -1s in the initializer list), you can do it with a vector:
vector<int> directory(100, -1);
Additionally, you can create the array and set the values to -1 using one of the other methods mentioned.
Just use this loop.
for (int i = 0; i < 100; i++) directory[i] = -1;
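memset() will also do the job for a raw int array in C and C++, since setting every byte to 0xFF yields -1 on two's-complement machines. It is not a general way to fill arrays or standard containers with an arbitrary value, though.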
The reason that int directory[100] = {-1} doesn't do what you expect lies in how array initialization works:
All array elements that are not initialized explicitly are initialized implicitly the same way as objects that have static storage duration.
ints that are implicitly initialized this way are initialized to (positive or unsigned) zero, so only directory[0] receives -1.
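The effect of the single initializer can be verified directly. This is a small sketch with a made-up helper, countInDirectory, that counts how often a value occurs in an array declared exactly as in the question.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>

// Counts how many elements of `int directory[100] = {-1};` equal value.
inline std::ptrdiff_t countInDirectory(int value) {
    int directory[100] = {-1};   // first element -1, the other 99 are zero
    return std::count(std::begin(directory), std::end(directory), value);
}
```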
C++11 introduced begin and end, which are overloaded for arrays!
This means that given an array (not just a pointer), like your directory you can use fill as has been suggested in several answers:
fill(begin(directory), end(directory), -1);
Let's say that you write code like this, but then decide to reuse the functionality after having forgotten how you implemented it, but you decided to change the size of directory to 60. If you'd written code using begin and end then you're done.
If on the other hand you'd done this: fill(directory, directory + 100, -1) then you'd better remember to change that 100 to a 60 as well or you'll get undefined behavior.
If you are allowed to use std::array, you can do the following:
#include <iostream>
#include <algorithm>
#include <array>
#include <iterator> // ostream_iterator, cbegin, cend
using namespace std;
template <class Elem, Elem pattern, size_t S, size_t L>
struct S_internal {
template <Elem... values>
static array<Elem, S> init_array() {
return S_internal<Elem, pattern, S, L - 1>::template init_array<values..., pattern>(); // "template" is required for a dependent member template
}
};
template <class Elem, Elem pattern, size_t S>
struct S_internal<Elem, pattern, S, 0> {
template <Elem... values>
static array<Elem, S> init_array() {
static_assert(S == sizeof...(values), "");
return array<Elem, S> {{values...}};
}
};
template <class Elem, Elem pattern, size_t S>
struct init_array
{
static array<Elem, S> get() {
return S_internal<Elem, pattern, S, S>::template init_array<>();
}
};
int main()
{
array<int, 5> ss = init_array<int, 77, 5>::get();
copy(cbegin(ss), cend(ss), ostream_iterator<int>(cout, " "));
}
The output is:
77 77 77 77 77
Just use std::fill_n().
Example
int n;
cin >> n;
vector<int> arr(n); // a runtime-sized "int arr[n]" is a compiler extension, not standard C++
int value = 9;
fill_n(arr.begin(), n, value); // 9 9 9 9 9...
Or you can use std::fill().
Example
int n;
cin >> n;
vector<int> arr(n);
int value = 9;
fill(arr.begin(), arr.end(), value); // 9 9 9 9 9...
Note: Both functions are available in the algorithm library (#include <algorithm>), and std::vector needs #include <vector>. Don't forget to include them.
Starting with C++11 you could also use a range based loop:
int directory[10];
for (auto& value: directory) value = -1;