I'd like to use std::unordered_map as a software cache with a limited capacity. Namely, I set the number of buckets in the constructor (it doesn't matter that the actual number might end up larger) and insert new data (if not already there) in the following way:
If the bucket where the data belongs is not empty, I replace its node with the inserted data (via the C++17 node extraction/insertion pattern).
Otherwise, I simply insert the data.
The minimal example that simulates this approach is as follows:
#include <iostream>
#include <unordered_map>

std::unordered_map<int, int> m(2);

void insert(int a) {
    auto idx = m.bucket(a);
    if (m.bucket_size(idx) > 0) {
        const auto& key = m.begin(idx)->first;
        auto nh = m.extract(key);
        nh.key() = a;
        nh.mapped() = a;
        m.insert(std::move(nh));
    }
    else
        m.insert({a, a});
}

int main() {
    for (int i = 0; i < 1000; i++) {
        auto bc1 = m.bucket_count();
        insert(i);
        auto bc2 = m.bucket_count();
        if (bc1 != bc2) std::cerr << bc2 << std::endl;
    }
}
The problem is that with GCC 8.1 (which is what's available to me in the production environment), the bucket count is not fixed and grows instead; the output reads:
7
17
37
79
167
337
709
1493
Live demo: https://wandbox.org/permlink/c8nnEU52NsWarmuD
Updated info: the bucket count is always increased in the else branch: https://wandbox.org/permlink/p2JaHNP5008LGIpL.
However, when I use GCC 9.1 or Clang 8.0, the bucket count remains fixed (no output is printed to the error stream).
My question is whether this is a bug in the older version of libstdc++, or whether my approach is incorrect and I cannot use std::unordered_map this way.
Moreover, I found out that the problem disappears when I set the max_load_factor to some very high number, such as
m.max_load_factor(1e20f);
But I don't want to rely on such a "fragile" solution in the production code.
Unfortunately, the problem you're having appears to be a bug in older implementations of std::unordered_map. The problem disappears in g++-9, but if you're limited to g++-8, I recommend rolling your own hash-cache.
Rolling our own hash-cache
Thankfully, the type of cache you want to write is actually simpler than writing a full hash-table, mainly because it's fine if values occasionally get dropped from the table. To see how difficult it'd be, I wrote my own version.
So what's it look like?
Let's say you have an expensive function you want to cache. The Fibonacci function, when written using the recursive implementation, is notorious for taking time exponential in its input because it calls itself twice.
// Uncached version
long long fib(int n) {
    if(n <= 1)
        return n;
    else
        return fib(n - 1) + fib(n - 2);
}
Let's transform it to the cached version, using the Cache class which I'll show you in a moment. We actually only need to add one line of code to the function:
// Cached version; much faster
long long fib(int n) {
    static auto fib = Cache(::fib, 1024); // fib now refers to the cache, instead of the enclosing function
    if(n <= 1)
        return n;
    else
        return fib(n - 1) + fib(n - 2); // Invokes cache
}
The first argument is the function you want to cache (in this case, fib itself), and the second argument is the capacity. For n == 40, the uncached version takes 487,000 microseconds to run. And the cached version? Just 16 microseconds to initialize the cache, fill it, and return the value! You can see it run here. After that initial access, retrieving a stored value from the cache takes around 6 nanoseconds.
(If Compiler Explorer shows the assembly instead of the output, click on the tab next to it.)
How would we write this Cache class?
Here's a compact implementation of it. The Cache class stores the following:
An array of bools, which keeps track of which buckets have values
An array of keys
An array of values
A bitmask & hash function
A function to calculate values that aren't in the table
In order to calculate a value, we:
Check if the key is stored in the table
If the key is not in the table, calculate and store the value
Return the stored value
Here's the code:
#include <algorithm>
#include <cstddef>
#include <functional>
#include <memory>

template<class Key, class Value, class Func>
class Cache {
    // Round the requested capacity up to a power of two, minus one,
    // so hashes can be reduced with a cheap bitwise AND instead of modulo
    static size_t calc_mask(size_t min_cap) {
        size_t actual_cap = 1;
        while(actual_cap <= min_cap) {
            actual_cap *= 2;
        }
        return actual_cap - 1;
    }
    size_t mask = 0;
    std::unique_ptr<bool[]> isEmpty;
    std::unique_ptr<Key[]> keys;
    std::unique_ptr<Value[]> values;
    std::hash<Key> hash;
    Func func;
public:
    Cache(Cache const& c)
        : mask(c.mask)
        , isEmpty(new bool[mask + 1])
        , keys(new Key[mask + 1])
        , values(new Value[mask + 1])
        , hash(c.hash)
        , func(c.func)
    {
        std::copy_n(c.isEmpty.get(), capacity(), isEmpty.get());
        std::copy_n(c.keys.get(), capacity(), keys.get());
        std::copy_n(c.values.get(), capacity(), values.get());
    }
    Cache(Cache&&) = default;
    Cache(Func func, size_t cap)
        : mask(calc_mask(cap))
        , isEmpty(new bool[mask + 1])
        , keys(new Key[mask + 1])
        , values(new Value[mask + 1])
        , hash()
        , func(func) {
        std::fill_n(isEmpty.get(), capacity(), true);
    }
    Cache(Func func, size_t cap, std::hash<Key> const& hash)
        : mask(calc_mask(cap))
        , isEmpty(new bool[mask + 1])
        , keys(new Key[mask + 1])
        , values(new Value[mask + 1])
        , hash(hash)
        , func(func) {
        std::fill_n(isEmpty.get(), capacity(), true);
    }
    Value operator()(Key const& key) const {
        size_t index = hash(key) & mask;
        auto& value = values[index];
        auto& old_key = keys[index];
        // cache miss: compute the value and overwrite whatever occupied the bucket
        if(isEmpty[index] || old_key != key) {
            old_key = key;
            value = func(key);
            isEmpty[index] = false;
        }
        return value;
    }
    size_t capacity() const {
        return mask + 1;
    }
};
template<class Key, class Value>
Cache(Value(*)(Key), size_t) -> Cache<Key, Value, Value(*)(Key)>;
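One caveat: the deduction guide above only covers plain function pointers. To cache a lambda or another callable, you could spell out the template arguments yourself; for instance (an illustrative snippet, not part of the original answer):

// Hypothetical usage with a lambda: name the lambda's type explicitly
auto square = [](int x) { return (long long)x * x; };
Cache<int, long long, decltype(square)> squareCache(square, 256);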
Related
I work on GPL'ed C++ code with heavy data processing. One particular pattern we often have is to collect some number (thousands to millions) of keys or key/value pairs (usually int32..int128), insert them into a hashset/hashmap, and then use it without further modifications.
I named it an immutable hashtable, although single-assignment hashtable may be an even better name, since we don't use it prior to full construction.
Today we are using STL unordered_map/set, but we are looking for a better (especially faster) library. Can you recommend anything suitable for the situation, with a GPL-compatible license?
I think that the most efficient approach would be to radix-sort all keys by the bucket number and provide a bucket->range mapping, so we can use the following code to search for a key:
bool contains(set, key) {
    h = hash(key);
    b = h % BUCKETS;
    for (i : range(set.bucket[b], set.bucket[b+1]-1))
        if (set.keys[i] == key) return true;
    return false;
}
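To make the idea concrete, here is a minimal sketch of how such a single-assignment set could be built with a counting sort over bucket numbers (the names, the 64-bit key type, and the 16-bit bucket count are all illustrative assumptions):

#include <cstddef>
#include <cstdint>
#include <vector>

// Sketch: build the bucket->range mapping once, then never modify it.
struct ImmutableSet {
    static constexpr std::size_t BUCKETS = 1 << 16;
    std::vector<std::uint32_t> offsets; // offsets[b]..offsets[b+1] delimits bucket b
    std::vector<std::uint64_t> keys;    // keys grouped by bucket number

    explicit ImmutableSet(const std::vector<std::uint64_t>& input)
        : offsets(BUCKETS + 1, 0), keys(input.size())
    {
        for (auto k : input) ++offsets[bucket(k) + 1];   // histogram
        for (std::size_t b = 1; b <= BUCKETS; ++b)
            offsets[b] += offsets[b - 1];                // prefix sums -> start offsets
        std::vector<std::uint32_t> next(offsets.begin(), offsets.end() - 1);
        for (auto k : input) keys[next[bucket(k)]++] = k; // scatter into place
    }

    bool contains(std::uint64_t key) const {
        for (std::uint32_t i = offsets[bucket(key)]; i < offsets[bucket(key) + 1]; ++i)
            if (keys[i] == key) return true;
        return false;
    }

    static std::size_t bucket(std::uint64_t key) {
        return (key * 0x9E3779B97F4A7C15ull) >> 48; // multiplicative hash -> 16 bits
    }
};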
Your comments on this approach? Can you propose a faster way to implement immutable map/set?
I think Double Hashing or Robin Hood Hashing is more suitable for your case. Among the many possible algorithms, I prefer Double Hashing with a 2^n table size and an odd step. This algorithm is very efficient and easy to code. The following is just an example of such a container for uint32_t keys:
#include <cstdint>
#include <cstring>

class uint32_DH {
    static const int _TABSZ = 1 << 20; // 1M cells, 2^N size
public:
    uint32_DH() { memset(_data, 0, sizeof(_data)); }
    bool search(uint32_t key) { return *lookup(key) == key; }
    void insert(uint32_t key) { *lookup(key) = key; }
private:
    // 0 is the "empty cell" sentinel, so the key 0 cannot be stored
    uint32_t* lookup(uint32_t key) {
        // first hash picks the starting cell (mixed via multiplication;
        // shifting a 32-bit value by 32, as the original code did, is undefined)
        uint32_t pos = key * 7919;
        // second hash picks the step; forcing it odd makes it coprime with
        // the 2^N table size, so the probe sequence visits every cell
        uint32_t step = (key * 7717 ^ (pos >> 16)) | 1;
        uint32_t* rc;
        do {
            rc = _data + ((pos += step) & (_TABSZ - 1));
        } while(*rc != 0 && *rc != key);
        return rc;
    }
    uint32_t _data[_TABSZ];
};
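Two caveats worth noting: since 0 doubles as the empty-cell marker, the key 0 cannot be inserted, and inserting into a completely full table would loop forever, so keep the table size comfortably above the expected number of keys. Also, a single instance of this class is 4 MB, so it belongs in static storage or on the heap rather than on the stack.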
I'm new to hash maps and I have an assignment due tomorrow. I implemented everything and it all worked out fine, except for when I get a collision. I can't quite understand the idea of linear probing; I did try to implement it based on what I understood, but the program stopped working for table sizes < 157, for some reason.
void hashEntry(string key, string value, entry HashTable[], int p)
{
    key_de = key;
    val_en = value;
    for (int i = 0; i < sizeof(HashTable); i++)
    {
        HashTable[Hash(key, p) + i].key_de = value;
    }
}
I thought that by adding a number each time to the hash function, 2 buckets would never get the same Hash index. But that didn't work.
A hash table with linear probing requires you to:
Initiate a linear search starting at the hashed-to location for an empty slot in which to store your key+value.
If the slot encountered is empty, store your key+value; you're done.
Otherwise, if the keys match, replace the value; you're done.
Otherwise, move to the next slot, hunting for any empty or key-matching slot, at which point (2) or (3) transpires.
To prevent overrun, the loop doing all of this wraps modulo the table size.
If you run all the way back to the original hashed-to location and still have no empty slot or matching-key overwrite, your table is completely populated (100% load) and you cannot insert more key+value pairs.
That's it. In practice it looks something like this:
bool hashEntry(string key, string value, entry HashTable[], int p)
{
    bool inserted = false;
    int hval = Hash(key, p);
    for (int i = 0; !inserted && i < p; i++)
    {
        // empty slot: claim it by storing the key, then fall through to
        // the key-match test below, which stores the value
        if (HashTable[(hval + i) % p].key_de.empty())
        {
            HashTable[(hval + i) % p].key_de = key;
        }
        // matching key (just claimed or pre-existing): store/replace the value
        if (HashTable[(hval + i) % p].key_de == key)
        {
            HashTable[(hval + i) % p].val_en = value;
            inserted = true;
        }
    }
    return inserted;
}
Note that expanding the table in a linear-probing hash algorithm is tedious. I suspect that will be forthcoming in your studies. Eventually you need to track how many slots are taken so when the table exceeds a specified load factor (say, 80%), you expand the table, rehashing all entries on the new p size, which will change where they all end up residing.
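To give a rough idea, here is a sketch of what such an expansion could look like; rehash is a hypothetical helper, and it assumes the entry struct and Hash(key, p) function from the question, a heap-allocated table, and a newP larger than the number of occupied slots:

// Hypothetical sketch: allocate a larger table and re-insert every occupied
// slot, probing against the NEW size (entries generally land in new places).
entry* rehash(entry oldTable[], int p, int newP)
{
    entry* newTable = new entry[newP];
    for (int i = 0; i < p; i++)
    {
        if (oldTable[i].key_de.empty())
            continue;
        int hval = Hash(oldTable[i].key_de, newP);
        for (int j = 0; j < newP; j++)
        {
            entry& slot = newTable[(hval + j) % newP];
            if (slot.key_de.empty())
            {
                slot = oldTable[i]; // copies key and value
                break;
            }
        }
    }
    delete[] oldTable;
    return newTable;
}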
Anyway, hope it makes sense.
Is there an STL container whose size can be limited, where inserting elements keeps it sorted, and which can provide a raw pointer to the data in C++? Or can it be built by assembling some stuff from the STL and C++?
In fact, I'm receiving real-time data (epoch + data), and I noticed that they aren't "always" sent in increasing order of the epoch.
I only save 1024 data points to plot them with a plotting API, thus I need raw pointers to the two data arrays (x => epoch, y => data).
I wrote a class that fills two 1024-element double arrays of times and values. Once the buffer is full, it is shifted to receive the next data points.
Adding sorting to the code below might overcomplicate it, so is there a better way to code it?
struct TemporalData
{
    TemporalData(const unsigned capacity) :
        m_timestamps(new double[capacity]),
        m_bsl(new double[capacity]),
        m_capacity(capacity),
        m_size(0),
        m_lastPos(capacity - 1)
    {
    }
    TemporalData(TemporalData&& moved) :
        m_capacity(moved.m_capacity),
        m_lastPos(moved.m_lastPos)
    {
        m_size = moved.m_size;
        m_timestamps = moved.m_timestamps;
        moved.m_timestamps = nullptr;
        m_bsl = moved.m_bsl;
        moved.m_bsl = nullptr;
    }
    TemporalData(const TemporalData& copied) :
        m_capacity(copied.m_capacity),
        m_lastPos(copied.m_lastPos)
    {
        m_size = copied.m_size;
        m_timestamps = new double[m_capacity];
        m_bsl = new double[m_capacity];
        std::copy(copied.m_timestamps, copied.m_timestamps + m_size, m_timestamps);
        std::copy(copied.m_bsl, copied.m_bsl + m_size, m_bsl);
    }
    TemporalData& operator=(const TemporalData& copied) = delete;
    TemporalData& operator=(TemporalData&& moved) = delete;
    inline void add(const double timestamp, const double bsl)
    {
        if (m_size >= m_capacity)
        {
            std::move(m_timestamps + 1, m_timestamps + 1 + m_lastPos, m_timestamps);
            std::move(m_bsl + 1, m_bsl + 1 + m_lastPos, m_bsl);
            m_timestamps[m_lastPos] = timestamp;
            m_bsl[m_lastPos] = bsl;
        }
        else
        {
            m_timestamps[m_size] = timestamp;
            m_bsl[m_size] = bsl;
            ++m_size;
        }
    }
    inline void removeDataBefore(const double ptTime)
    {
        auto itRangeToEraseEnd = std::lower_bound(m_timestamps,
                                                  m_timestamps + m_size,
                                                  ptTime);
        auto timesToEraseCount = itRangeToEraseEnd - m_timestamps;
        if (timesToEraseCount > 0)
        {
            // shift
            std::move(m_timestamps + timesToEraseCount, m_timestamps + m_size, m_timestamps);
            std::move(m_bsl + timesToEraseCount, m_bsl + m_size, m_bsl);
            m_size -= timesToEraseCount;
        }
    }
    inline void clear() { m_size = 0; }
    inline double* x() const { return m_timestamps; }
    inline double* y() const { return m_bsl; }
    inline unsigned size() const { return m_size; }
    inline unsigned capacity() const { return m_capacity; }
    ~TemporalData()
    {
        delete [] m_timestamps;
        delete [] m_bsl;
    }
private:
    double* m_timestamps; // x axis
    double* m_bsl; // y axis
    const unsigned m_capacity;
    unsigned m_size;
    const unsigned m_lastPos;
};
Is there an STL container whose size can be limited, where inserting elements keeps it sorted, and which can provide a raw pointer to the data in C++, or can it be built by assembling some stuff from the STL and C++?
No, but you can keep a container sorted via e.g. std::lower_bound. If the container supports random access, finding the insertion position takes O(log N) comparisons; the insertion itself is O(N), since existing elements must be shifted.
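A minimal sketch of what that looks like with a std::vector (the function name and the drop-the-smallest eviction policy are illustrative choices):

#include <algorithm>
#include <vector>

// Insert `value` into an already-sorted vector, keeping it sorted and
// capping the size (here: dropping the smallest element when full).
void sortedInsert(std::vector<double>& v, double value, std::size_t capacity)
{
    auto pos = std::lower_bound(v.begin(), v.end(), value); // O(log N) search
    v.insert(pos, value);                                   // O(N) shift
    if (v.size() > capacity)
        v.erase(v.begin()); // enforce the size limit
}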
Once the buffer is full, it is shifted to receive the next data points.
That sounds like a circular buffer. However, if you want to keep the elements sorted, it won't be a circular buffer anymore; unless you are talking about a sorted view on top of a circular buffer.
Is there an STL container whose size can be limited, where inserting elements keeps it sorted, and which can provide a raw pointer to the data in C++
No. There is no such standard container.
or can it be built by assembling some stuff from the STL and C++ ?
Sure.
Size limitation can be implemented using an if-statement. Arrays can be iterated using a pointer, and there is a standard algorithm for sorting.
What I want is to insert the element at the right place in the fixed-size buffer (like a priority queue), starting from its end; I thought it would be faster than pushing back the element and then sorting the container.
It depends. If you insert multiple elements at a time, then sorting has better worst case asymptotic complexity.
But if you insert one at a time, and especially if the elements are inserted in "mostly sorted" order, then it may be better for average case complexity to simply search for the correct position, and insert.
The searching can be done linearly (std::find), which may be most efficient depending on how well the input is ordered, or using binary search (the std::lower_bound family of functions), which has better worst-case complexity. Yet another option is exponential search, but there is no standard implementation of that (a sketch follows below).
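For illustration, a sketch of exponential search built on top of std::lower_bound (the function name is made up; it assumes random-access iterators):

#include <algorithm>

// Sketch: double the probe distance until the value is bracketed, then
// binary-search only the bracketed range. Cheap when the insertion point
// is near the front; a mirrored version galloping from the back suits
// input arriving in roughly increasing order.
template <typename It, typename T>
It exponential_lower_bound(It first, It last, const T& value)
{
    auto n = last - first;
    decltype(n) bound = 1;
    while (bound < n && *(first + bound) < value)
        bound *= 2;
    return std::lower_bound(first + bound / 2,
                            first + std::min(bound, n), value);
}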
Moreover, as I have paired data, but in two different buffers, I can't use std::sort!
It's unclear why the former would imply the latter.
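In case it helps, here is one way to sort paired data living in two separate buffers: sort an index permutation, then apply it (a sketch; the function name is illustrative):

#include <algorithm>
#include <numeric>
#include <vector>

// Sort two parallel arrays by the first one, via an index permutation.
void sortPaired(std::vector<double>& times, std::vector<double>& values)
{
    std::vector<std::size_t> idx(times.size());
    std::iota(idx.begin(), idx.end(), 0);
    std::sort(idx.begin(), idx.end(),
              [&](std::size_t a, std::size_t b) { return times[a] < times[b]; });

    std::vector<double> t(times.size()), v(values.size());
    for (std::size_t i = 0; i < idx.size(); ++i) {
        t[i] = times[idx[i]];   // both arrays reordered by the same permutation
        v[i] = values[idx[i]];
    }
    times = std::move(t);
    values = std::move(v);
}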
Following the advice of Acorn, I wrote this (I know it's ugly, but it does what I want):
inline void add(const double timestamp, const double bsl)
{
    if (m_size >= m_capacity)
    {
        const auto insertPositionIterator = std::lower_bound(m_timestamps,
                                                             m_timestamps + m_size,
                                                             timestamp);
        if (insertPositionIterator == m_timestamps)
        {
            if (*insertPositionIterator == timestamp)
            {
                m_timestamps[0] = timestamp;
                m_bsl[0] = bsl;
            }
            // then return...
        }
        else
        {
            const auto shiftIndex = insertPositionIterator - m_timestamps; // for data
            std::move(m_timestamps + 1, insertPositionIterator, m_timestamps);
            std::move(m_bsl + 1, m_bsl + shiftIndex, m_bsl);
            *(insertPositionIterator - 1) = timestamp;
            m_bsl[shiftIndex - 1] = bsl;
        }
    }
    else
    {
        auto insertPositionIterator = std::lower_bound(m_timestamps,
                                                       m_timestamps + m_size,
                                                       timestamp);
        if (insertPositionIterator == m_timestamps + m_size)
        {
            // the new inserted element is strictly greater than the already
            // existing elements or the buffer is empty, let's push it at the back
            m_timestamps[m_size] = timestamp;
            m_bsl[m_size] = bsl;
        }
        else
        {
            // the new inserted element is equal or lesser than an already
            // existing element, let's insert it at its right place
            // to keep the time buffer sorted in ascending order
            const auto shiftIndex = insertPositionIterator - m_timestamps; // for data
            // shift
            assert(insertPositionIterator == m_timestamps + shiftIndex);
            std::move_backward(insertPositionIterator, m_timestamps + m_size, m_timestamps + m_size + 1);
            std::move_backward(m_bsl + shiftIndex, m_bsl + m_size, m_bsl + m_size + 1);
            *insertPositionIterator = timestamp; // or m_timestamps[shiftIndex] = timestamp;
            m_bsl[shiftIndex] = bsl;
        }
        ++m_size;
    }
}
I'm trying to solve an algorithm task: I need to create a MultiMap(key, (values)) using a hash table. I can't use the Set and Map libraries. I sent my code to the testing system, but I get a time-limit-exceeded error on test 20. I don't know what exactly this test contains. The code must perform the following tasks:
put x y - add the pair (x, y). If the pair exists, do nothing.
delete x y - delete the pair (x, y). If the pair doesn't exist, do nothing.
deleteall x - delete all pairs with first element x.
get x - print the number of pairs with first element x, followed by their second elements.
The number of operations <= 100000
Time limit - 2s
Example:
multimap.in:
put a a
put a b
put a c
get a
delete a b
get a
deleteall a
get a
multimap.out:
3 b c a
2 c a
0
#include <cstdlib>
#include <fstream>
#include <iostream>
#include <string>
#include <vector>

using namespace std;

inline long long h1(const string& key) {
    long long number = 0;
    const int p = 31;
    int pow = 1;
    for(auto& x : key){
        number += (x - 'a' + 1) * pow;
        pow *= p;
    }
    return abs(number) % 1000003;
}

inline void Put(vector<vector<pair<string,string>>>& Hash_table, const long long& hash, const string& key, const string& value) {
    int checker = 0;
    for(int i = 0; i < Hash_table[hash].size(); i++) {
        if(Hash_table[hash][i].first == key && Hash_table[hash][i].second == value) {
            checker = 1;
            break;
        }
    }
    if(checker == 0){
        pair<string,string> key_value = make_pair(key, value);
        Hash_table[hash].push_back(key_value);
    }
}

inline void Delete(vector<vector<pair<string,string>>>& Hash_table, const long long& hash, const string& key, const string& value) {
    for(int i = 0; i < Hash_table[hash].size(); i++) {
        if(Hash_table[hash][i].first == key && Hash_table[hash][i].second == value) {
            Hash_table[hash].erase(Hash_table[hash].begin() + i);
            break;
        }
    }
}

inline void Delete_All(vector<vector<pair<string,string>>>& Hash_table, const long long& hash, const string& key) {
    for(int i = Hash_table[hash].size() - 1; i >= 0; i--){
        if(Hash_table[hash][i].first == key){
            Hash_table[hash].erase(Hash_table[hash].begin() + i);
        }
    }
}

inline string Get(const vector<vector<pair<string,string>>>& Hash_table, const long long& hash, const string& key) {
    string result = "";
    int counter = 0;
    for(int i = 0; i < Hash_table[hash].size(); i++){
        if(Hash_table[hash][i].first == key){
            counter++;
            result += Hash_table[hash][i].second + " ";
        }
    }
    if(counter != 0)
        return to_string(counter) + " " + result + "\n";
    else
        return "0\n";
}

int main() {
    vector<vector<pair<string,string>>> Hash_table;
    Hash_table.resize(1000003);
    ifstream input("multimap.in");
    ofstream output("multimap.out");
    string command;
    string key;
    int k = 0;
    string value;
    while(true) {
        input >> command;
        if(input.eof())
            break;
        if(command == "put") {
            input >> key;
            long long hash = h1(key);
            input >> value;
            Put(Hash_table, hash, key, value);
        }
        if(command == "delete") {
            input >> key;
            input >> value;
            long long hash = h1(key);
            Delete(Hash_table, hash, key, value);
        }
        if(command == "get") {
            input >> key;
            long long hash = h1(key);
            output << Get(Hash_table, hash, key);
        }
        if(command == "deleteall"){
            input >> key;
            long long hash = h1(key);
            Delete_All(Hash_table, hash, key);
        }
    }
}
How can I make my code run faster?
First of all, a matter of design: normally, one would pass only the key to the function and calculate the hash within it. Your variant allows a user to place elements anywhere within the hash table (by passing bad hash values), so a user could easily break it.
So, e.g., put:
using HashTable = std::vector<std::vector<std::pair<std::string, std::string>>>;

void put(HashTable& table, std::string const& key, std::string const& value)
{
    auto hash = h1(key);
    // ...
}
If anything, the hash function could be parametrised, but then you'd write a separate class for it (wrapping the vector of vectors) and provide the hash function in the constructor, so that a user cannot exchange it arbitrarily (and again break the hash table). A class would come with additional benefits, most importantly better encapsulation (hiding the vector away, so that a user cannot change it via the vector's own interface):
class HashTable
{
public:
    // IF you want to provide a hash function:
    template <typename Hash>
    HashTable(Hash hash) : hash(hash) { }

    void put(std::string const& key, std::string const& value);
    void remove(std::string const& key, std::string const& value); // (delete is a keyword!)
    // ...
private:
    std::vector<std::vector<std::pair<std::string, std::string>>> data;
    // if the hash function is parametrised:
    std::function<size_t(std::string)> hash; // #include <functional> for this
};
I'm not 100% sure how efficient std::function really is, so for high-performance code, you'd preferably use your hash function h1 directly (not implementing the constructor as illustrated above).
Coming to optimisations:
For the hash key I would prefer an unsigned value: negative indices are meaningless anyway, so why allow them at all? long long (signed or unsigned) might be a bad choice if the testing system is a 32-bit system (perhaps unlikely, but still...). size_t covers both issues at once: it is unsigned, and its size is chosen appropriately for the given system (if you're interested in the details: it is actually matched to the address bus width, but on modern systems that equals the register size as well, which is what we need). Make pow the same type.
deleteAll is implemented inefficiently: with each element you erase, you move all subsequent elements one position towards the front. If you delete multiple elements, you do this repeatedly, so a single element can get moved multiple times. Better:
auto pos = vector.begin();
for(auto& pair : vector)
{
    if(pair.first != keyToDelete)
        *pos++ = std::move(pair); // move semantics: faster than copying!
}
vector.erase(pos, vector.end());
This will move each element at most once, erasing all surplus elements in one single go. Apart from the final erase (which you then have to do explicitly), this is more or less what std::remove and std::remove_if from the algorithm library do as well. Are you allowed to use them? Then your code might look like this:
auto condition = [&keyToDelete](std::pair<std::string, std::string> const& p)
                 { return p.first == keyToDelete; };
vector.erase(std::remove_if(vector.begin(), vector.end(), condition), vector.end());
and you profit from an already highly optimised algorithm.
Just a minor performance gain, but still: you can spare the variable initialisation, the assignment, and a conditional branch (the latter can be a relatively expensive operation on some systems) within put if you simply return as soon as an element is found:
//int checker = 0;
for(auto& pair : hashTable[hash]) // just a little more comfortable to write...
{
    if(pair.first == key && pair.second == value)
        return;
}
auto key_value = std::make_pair(key, value);
hashTable[hash].push_back(key_value);
Again, with the algorithm library:
auto key_value = std::make_pair(key, value);
// same condition as above!
if(std::find_if(vector.begin(), vector.end(), condition) == vector.end())
{
    vector.push_back(key_value);
}
Then: fewer than 100000 operations does not mean that each operation introduces a separate key/value pair. We might expect that keys are added, removed, re-added, ..., so you most likely don't have to cope with 100000 different values. I'd assume your map is much too large (be aware that it requires initialisation of 1000003 vectors as well). A much smaller one should probably suffice (possibly 1009 or 10007? You might have to experiment a little...).
Keeping the inner vectors sorted might give you some performance boost as well (see the sketch after this list):
put: you could use a binary search to find the two elements between which the new one is to be inserted (if one of these two is equal to the given one, no insertion, of course).
delete: use binary search to find the element to delete.
deleteAll: find the upper and lower bounds for the elements to be deleted and erase the whole range at once.
get: find the lower and upper bounds as for deleteAll; the distance between them (the number of elements) is a simple subtraction, and you could print the texts out directly (instead of first building a long string). Whether outputting directly or first creating a string really is more efficient has to be measured, though, as outputting directly involves multiple system calls, which in the end might cost the previously gained performance again...
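As a sketch of the sorted-bucket idea (assuming each inner vector is kept sorted by key; the names are illustrative):

#include <algorithm>
#include <string>
#include <utility>
#include <vector>

using Bucket = std::vector<std::pair<std::string, std::string>>;

// With a sorted bucket, one binary search yields the whole range of pairs
// whose first element equals `key`.
std::pair<Bucket::iterator, Bucket::iterator>
rangeFor(Bucket& bucket, const std::string& key)
{
    return std::equal_range(bucket.begin(), bucket.end(),
                            std::make_pair(key, std::string()),
                            [](const auto& a, const auto& b)
                            { return a.first < b.first; });
}

deleteAll is then bucket.erase(r.first, r.second), and get is std::distance(r.first, r.second) followed by printing the range.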
Considering your input loop:
Checking for eof() (only) is critical! If there is an error in the file, you'll end up in an endless loop: the fail bit gets set, operator>> won't actually read anything any more, and you won't ever reach the end of the file. This might even be the reason for your 20th test failing.
Additionally: you have line-based input (each command on a separate line), so reading a whole line at once and only parsing it afterwards will spare you some system calls. If an argument is missing, you will detect it correctly instead of (illegally) reading the next command (e.g. put) as the argument; similarly, you won't interpret a surplus argument as the next command. If a line is invalid for whatever reason (a bad number of arguments as above, or an unknown command), you can then decide individually what to do (just ignore the line, or abort processing entirely). So:
std::string line;
while(std::getline(std::cin, line))
{
    // parse the string; if the line is invalid, do appropriate error handling
    // (ignoring the line, exiting from the loop, ...)
}
if(!std::cin.eof())
{
    // some error occurred, print an error message!
}
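Parsing could then look like this (a sketch; parseLine is a made-up helper that reports whether the line was well-formed):

#include <sstream>
#include <string>

// Sketch: tokenise one command line and validate the argument count.
bool parseLine(const std::string& line)
{
    std::istringstream stream(line);
    std::string command, key, value;
    if (!(stream >> command))
        return false; // empty line
    if (command == "put" || command == "delete")
        return static_cast<bool>(stream >> key >> value);
    if (command == "get" || command == "deleteall")
        return static_cast<bool>(stream >> key);
    return false; // unknown command
}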
You are given a std::vector<T> of distinct items, which is already sorted.
Type T only supports the less-than operator < for comparisons, and it is a heavy function, so you have to use it as few times as possible.
Is there any better solution than a binary search?
If not, is there any solution better than the one below, i.e. one that uses the less-than operator fewer times?
template<typename T>
int FindKey(const std::vector<T>& list, const T& key)
{
    if( list.empty() )
        return -1;
    int left = 0;
    int right = list.size() - 1;
    int mid;
    while( left < right )
    {
        mid = (right + left) / 2;
        if( list[mid] < key )
            left = mid + 1;
        else
            right = mid;
    }
    if( !(key < list[left]) && !(list[left] < key) )
        return left;
    return -1;
}
It's not a real world situation, just a coding test.
You could trade off additional O(n) preprocessing time to get amortized O(1) query time, using a hash table (e.g. an unordered_map) to create a lookup table.
Hash tables compute hash functions of the keys and do not compare the keys themselves.
Two keys could have the same hash, resulting in a collision, which explains why it's not guaranteed that every single operation takes constant time. Amortized constant time means that if you carry out k operations taking total time t, then t/k = O(1) for sufficiently large k.
Live example:
#include <vector>
#include <unordered_map>

template<typename T>
class lookup {
    std::unordered_map<T, int> position;
public:
    lookup(const std::vector<T>& a) {
        for(int i = 0; i < a.size(); ++i) position.emplace(a[i], i);
    }
    int operator()(const T& key) const {
        auto pos = position.find(key);
        return pos == position.end() ? -1 : pos->second;
    }
};
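Usage might look like this (an illustrative snippet):

#include <iostream>
#include <vector>

int main() {
    std::vector<int> a{2, 3, 5, 7, 11};
    lookup<int> index(a);          // O(n) preprocessing
    std::cout << index(7) << '\n'; // prints 3
    std::cout << index(4) << '\n'; // prints -1 (not present)
}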
This also requires additional memory.
If the values can be mapped to integers and are within a reasonable range (i.e. max-min = O(n)), you could simply use a vector as a lookup table instead of unordered_map, with the benefit of guaranteed constant query time.
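A sketch of that variant (the class name is illustrative; it assumes every key lies in [min, max] and uses -1 as the "not present" marker):

#include <vector>

// Direct-address lookup table for integer keys in [min, max].
class direct_lookup {
    int min_;
    std::vector<int> pos_;
public:
    direct_lookup(const std::vector<int>& a, int min, int max)
        : min_(min), pos_(max - min + 1, -1)
    {
        for (int i = 0; i < (int)a.size(); ++i) pos_[a[i] - min_] = i;
    }
    int operator()(int key) const {
        if (key < min_ || key - min_ >= (int)pos_.size()) return -1;
        return pos_[key - min_]; // guaranteed O(1), no hashing
    }
};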
See also this answer to "C++ get index of element of array by value", for a more detailed discussion, including an empirical comparison of linear, binary and hash index lookup.
Update
If the interface of type T supports no operations other than bool operator<(L, R), then using the decision-tree model you can prove a lower bound of Ω(log n) for comparison-based search algorithms.
You can use std::lower_bound. It does it with log(n)+1 comparisons, which is the best possible complexity for your problem.
template<typename T>
int FindKey(const std::vector<T>& list, const T& key)
{
    if(list.empty())
        return -1;
    typename std::vector<T>::const_iterator lb =
        std::lower_bound(list.begin(), list.end(), key);
    // now lb is an iterator to the first element which is
    // greater or equal to key, or end() if no such element exists
    if(lb == list.end() || key < *lb)
        return -1;
    else
        return std::distance(list.begin(), lb);
}
With the additional check for equality, you do it with log(n)+2 comparisons.
You can use interpolation search, which runs in O(log log n) expected time if your numbers are uniformly distributed. If they follow some other distribution, you can modify it to take that distribution into account, though I don't know which distributions yield log log time.
https://en.wikipedia.org/wiki/Interpolation_search
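For illustration, a sketch of interpolation search over sorted integers (the function name is made up; it assumes the interpolation arithmetic does not overflow):

#include <vector>

// Returns the index of key, or -1. Expected O(log log n) probes on
// uniformly distributed data; degrades to O(n) on adversarial input.
int interpolationSearch(const std::vector<long long>& a, long long key)
{
    int lo = 0, hi = (int)a.size() - 1;
    while (lo <= hi && key >= a[lo] && key <= a[hi]) {
        if (a[hi] == a[lo])               // avoid division by zero
            return a[lo] == key ? lo : -1;
        // probe where the key "should" be if values are evenly spread
        int mid = lo + (int)((key - a[lo]) * (long long)(hi - lo) / (a[hi] - a[lo]));
        if (a[mid] < key)      lo = mid + 1;
        else if (key < a[mid]) hi = mid - 1;
        else                   return mid;
    }
    return -1;
}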