HashTable in C++

I need to implement a hash table in C++, and I thought of using an array.
But I don't know exactly how to create an array whose size is fixed at construction.
Let's say that my class is named HT.
In the constructor I want to specify the array size, but I don't know how.
I have the members size_type size; and string [] t; in the HT header file.
How can I specify the size of t from the constructor?
HT(size_type s):size(s) {
}
If that is not possible, what data structure should I use to implement a hash table?

In the constructor of the HT class, take a size parameter s (optionally defaulting to 0) that specifies the size of the array, then set t to a new string array of size s. For this to compile, t has to be declared as a pointer (string* t;).
So something like:
HT::HT(size_type s)
{
t = new string[s];
}

You could do as std::array does and make the size a compile-time parameter.
If not, there is really no use trying to avoid std::vector, since you'll be doing dynamic allocation no matter what.
So, while you could write
struct HT
{
HT(size_t size) : _size(size), _data(new std::string[size]) {}
private:
size_t const _size;
std::unique_ptr<std::string[]> _data;
};
It's only making your class more complex, less flexible and generally less elegant, so I'd go with vector:
#include <cstddef>
#include <memory>
#include <string>
using namespace std;
struct HT
{
HT(size_t size) : _size(size), _data(new std::string[size]) {}
private:
size_t const _size;
std::unique_ptr<std::string[]> _data;
};
#include <vector>
struct HT2
{
HT2(size_t size) : _data(size) {}
private:
std::vector<std::string> _data;
};
int main()
{
HT table1(31);
HT2 table2(31);
}
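To connect the vector-based storage back to the original goal, here is a minimal sketch of a chained hash set of strings whose bucket count is fixed in the constructor. The class name HashSetSketch and its members insert, contains and bucket_count are hypothetical, and the sketch leans on std::hash instead of a hand-written hash function.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical sketch: a chained hash set of strings with a fixed bucket count.
class HashSetSketch {
public:
    explicit HashSetSketch(std::size_t bucket_count)
        : buckets_(bucket_count) // the vector is sized here, in the constructor
    {
        assert(bucket_count > 0 && "need at least one bucket");
    }

    // Returns true if the key was inserted, false if it was already present.
    bool insert(const std::string& key) {
        std::vector<std::string>& bucket = buckets_[index_of(key)];
        for (const std::string& existing : bucket)
            if (existing == key) return false;
        bucket.push_back(key);
        return true;
    }

    bool contains(const std::string& key) const {
        const std::vector<std::string>& bucket = buckets_[index_of(key)];
        for (const std::string& existing : bucket)
            if (existing == key) return true;
        return false;
    }

    std::size_t bucket_count() const { return buckets_.size(); }

private:
    // Maps a key to its bucket by reducing the hash modulo the bucket count.
    std::size_t index_of(const std::string& key) const {
        return std::hash<std::string>{}(key) % buckets_.size();
    }

    std::vector<std::vector<std::string>> buckets_;
};
```

Constructing HashSetSketch table(31); gives 31 buckets, mirroring the HT2 table2(31); call above. Collisions simply extend the bucket's inner vector, so the sketch never needs to resize the outer one.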

Most suggestions seem to assume it's ok to implement your hash table container class in terms of the standard library. I wonder exactly what your situation is; how did it come about, this "need" to implement a primitive container class? Is it really cool if you depend on another library?
Everyone else seems to think so, though. I guess std is really a fundamental component of the C++ language, now....
Looking at other answers, I see std::vector, std::string, std::unique_ptr...
But the road doesn't end there. Not even close.
#include <unordered_map>
#include <string>
#include <iostream>
template <typename T>
class CHashTable {
typedef std::string KEYTYPE;
struct HASH_FUNCTOR {
size_t operator ()(const KEYTYPE& key) const {
return CHashTable::MyAmazingHashFunc(key);
} };
typename std::unordered_map<KEYTYPE, T, HASH_FUNCTOR> m_um;
public:
static size_t MyAmazingHashFunc(const KEYTYPE& key) {
size_t h = key.length();
for(auto c : key) {
h = h*143401 + static_cast<size_t>(c)*214517 + 13;
}
h = (~h << (sizeof(h)*4)) + (h >> (sizeof(h)*4));
return h;
}
template <typename KT>
T& operator [] (const KT& key) {
return m_um[KEYTYPE(key)];
}
template <typename KT>
const T& operator [] (const KT& key) const {
return m_um.at(KEYTYPE(key));
}
void DeleteAll() {
m_um.clear();
}
template <typename KT>
void Delete(const KT& key) {
m_um.erase(KEYTYPE(key));
}
template <typename KT>
bool Exists(const KT& key) const {
const auto fit = m_um.find(KEYTYPE(key));
return fit != m_um.end();
}
};
int main() {
CHashTable<int> ht;
// my Universal Translator, a "WIP"
ht["uno"] = 1;
ht["un"] = 1;
ht["one"] = 1;
ht["dos"] = 2;
ht["deux"] = 2;
ht["two"] = 2;
const char* key = "deux";
int value = ht[key];
std::cout << '[' << key << "] => " << value << std::endl;
key = "un";
bool exists = ht.Exists(key);
std::cout << '[' << key << "] "
<< (exists ? "exists" : "does not exist") << std::endl;
key = "trois";
exists = ht.Exists(key);
std::cout << '[' << key << "] "
<< (exists ? "exists" : "does not exist") << std::endl;
return 0;
}
main()'s output:
[deux] => 2
[un] exists
[trois] does not exist
And that's not even the end of the Hash Table std:: Highway! The end is abrupt, at a class that just publicly inherits from std::unordered_map. But I would never suggest THAT in an Answer because I don't want to come across as a sarcastic smartass.

Related

Pointer generalization of index

Referencing with indexes compared to referencing with pointers has several advantages, e.g. indexes often survive array reallocation. But pointers/references are often more convenient to use than indexes. In the C++ STL library, the iterators were designed as generalizations of the pointer and now I wonder whether we could do something similar for indexes: little classes that carry the index as local data, but behave like (convenient) pointers. Something like:
#include <iostream>
#include <vector>
#include <compare>
template<class T, class I, T** ppdata>
struct index_ptr_T
{
explicit index_ptr_T(I index = 0)
: m_index(index)
{}
index_ptr_T(T* p)
: m_index(p - *ppdata)
{}
T* operator->() const
{
return *ppdata + m_index;
}
T& operator*() const
{
return *(*ppdata + m_index);
}
friend auto operator<=>(const index_ptr_T<T, I, ppdata>&, const index_ptr_T<T, I, ppdata>&) = default;
I m_index;
};
struct s_t // a test class
{
inline static s_t* pdata{};
using index_ptr_t = index_ptr_T<s_t, short, &pdata>;
int m_i;
index_ptr_t m_next;
};
int main()
{
std::vector<s_t> data(2);
s_t::pdata = &data[0];
s_t::index_ptr_t pa = &data[0];
*pa = { 10, 0 };
s_t::index_ptr_t pb = &data[1];
pb->m_next = pa;
std::cout << pb->m_next->m_i << std::endl;
data.reserve(data.capacity() + 1);
s_t::pdata = &data[0];
std::cout << pb->m_next->m_i << std::endl;
std::cout << "sizeof(s_t::index_ptr_t) = " << sizeof(s_t::index_ptr_t) << std::endl;
}
Does something like this already exist in library form? Or has someone already published about it? I googled, but couldn’t find what I was looking for. The index_ptr_T class above is quite incomplete; many operators are missing. I would like to use a proven library.
Note 1: I've edited the example above to clarify the intended use a little more.
Note 2: s_t::pdata = &data[0] must also be done whenever the data is reallocated or moved, e.g. when the data is in a vector and elements have been added.

Your proposed design is rather limited, thanks to templating on the array object. A slightly more reasonable design would look like this:
template<typename ContainerType, typename IndexType>
struct index_ptr_T
{
index_ptr_T(ContainerType & container, IndexType index)
: m_container(container), m_index(index) {}
auto* operator->()
{
return &m_container[m_index];
}
auto & operator*()
{
return m_container[m_index];
}
ContainerType & m_container;
IndexType m_index;
};
int main()
{
std::vector<int> v{1, 2, 3};
int i = 1;
index_ptr_T p{v, i};
v.reserve(v.capacity() + 1);
std::cout << *p;
}
Or, more succinctly, with no separate utility:
int main()
{
std::vector<int> v{1, 2, 3};
int i = 1;
auto p = [&v, i]() -> auto & { return v[i]; };
v.reserve(v.capacity() + 1);
std::cout << p();
}
I'm not aware of anything like this in existing frameworks. As you can see, there's just not much in the concept. It's just a bound indexing operation. Making it into a utility would make code harder to read, without making it easier to write.
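To make the "bound indexing operation" idea concrete, here is a minimal sketch of a handle that stores a container pointer plus an index and resolves the element on every access. The names index_handle and read_after_growth are hypothetical and not taken from any existing library.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: stores a container pointer and an index, and resolves
// the element on every access instead of caching its address.
template <typename Container>
class index_handle {
public:
    index_handle(Container& container, std::size_t index)
        : container_(&container), index_(index) {}

    typename Container::reference operator*() const {
        assert(index_ < container_->size());
        return (*container_)[index_];
    }

    typename Container::pointer operator->() const { return &**this; }

    std::size_t index() const { return index_; }

private:
    Container* container_;
    std::size_t index_;
};

// Forces a reallocation and then reads the element back through the handle.
inline int read_after_growth(std::vector<int>& v, std::size_t i) {
    index_handle<std::vector<int>> h(v, i);
    v.reserve(v.capacity() + 1); // reallocates; raw pointers into v would dangle
    return *h;
}
```

The handle stays valid across the reallocation because it never stores an element address. The trade-off is that it must not outlive the container it points to.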

Why is this vector implementation more performant?

For learning purposes, I decided to implement my own vector data structure. I called it list because that seems to generally be the more proper name for it but that's unimportant.
I am halfway through implementing this class (inserting and getting are complete) and I decided to write some benchmarks, with surprising results.
My compiler is whatever Visual Studio 2019 uses. I have tried debug and release, in x64 and x86.
For some reason, my implementation is faster than vector and I cannot think of a reason why. I fear that either my implementation or testing method are flawed.
Here are my results (x64, debug):
List: 13269ms
Vector: 78515ms
Release has a much less drastic, but still apparent, difference.
List: 65ms
Vector: 247ms
Here is my code
dataset.hpp:
#ifndef DATASET_H
#define DATASET_H
#include <memory>
#include <stdexcept>
#include <algorithm>
#include <functional>
#include <chrono>
namespace Dataset {
template <class T>
class List {
public:
List();
List(unsigned int);
void push(T);
T& get(int);
void reserve(int);
void shrink();
int count();
int capacity();
~List();
private:
void checkCapacity(int);
void setCapacity(int);
char* buffer;
int mCount, mCapacity;
};
template <class T>
List<T>::List() {
mCount = 0;
mCapacity = 0;
buffer = 0;
setCapacity(64);
}
template <class T>
List<T>::List(unsigned int initcap) {
mCount = 0;
buffer = 0;
setCapacity(initcap);
}
template <class T>
void List<T>::push(T item) {
checkCapacity(1);
new(buffer + (sizeof(T) * mCount++)) T(item);
}
template <class T>
T& List<T>::get(int index) {
return *((T*)(buffer + (sizeof(T) * index)));
}
template <class T>
void List<T>::reserve(int desired) {
if (desired > mCapacity) {
setCapacity(desired);
}
}
template <class T>
void List<T>::shrink() {
if (mCapacity > mCount) {
setCapacity(mCount);
}
}
template <class T>
int List<T>::count() {
return mCount;
}
template <class T>
int List<T>::capacity() {
return mCapacity;
}
template <class T>
void List<T>::checkCapacity(int cap) {
// Can <cap> more items fit in the list? If not, expand!
if (mCount + cap > mCapacity) {
setCapacity((int)((float)mCapacity * 1.5));
}
}
template <class T>
void List<T>::setCapacity(int cap) {
mCapacity = cap;
// Does buffer exist yet?
if (!buffer) {
// Allocate a new buffer
buffer = new char[sizeof(T) * cap];
}
else {
// Reallocate the old buffer
char* newBuffer = new char[sizeof(T) * cap];
if (newBuffer) {
std::copy(buffer, buffer + (sizeof(T) * mCount), newBuffer);
delete[] buffer;
buffer = newBuffer;
}
else {
throw std::runtime_error("Allocation failed");
}
}
}
template <class T>
List<T>::~List() {
for (int i = 0; i < mCount; i++) {
get(i).~T();
}
delete[] buffer;
}
long benchmark(std::function<void()>);
long benchmark(std::function<void()>, long);
long benchmark(std::function<void()> f) {
return benchmark(f, 100000);
}
long benchmark(std::function<void()> f, long iters) {
using std::chrono::high_resolution_clock;
using std::chrono::duration_cast;
auto start = high_resolution_clock::now();
for (long i = 0; i < iters; i++) {
f();
}
auto end = high_resolution_clock::now();
auto time = duration_cast<std::chrono::milliseconds>(end - start);
return (long)time.count();
}
}
#endif
test.cpp:
#include "dataset.hpp"
#include <iostream>
#include <vector>
/*
TEST CODE
*/
class SimpleClass {
public:
SimpleClass();
SimpleClass(int);
SimpleClass(const SimpleClass&);
void sayHello();
~SimpleClass();
private:
int data;
};
SimpleClass::SimpleClass() {
//std::cout << "Constructed " << this << std::endl;
data = 0;
}
SimpleClass::SimpleClass(int data) {
//std::cout << "Constructed " << this << std::endl;
this->data = data;
}
SimpleClass::SimpleClass(const SimpleClass& other) {
//std::cout << "Copied to " << this << std::endl;
data = other.data;
}
SimpleClass::~SimpleClass() {
//std::cout << "Deconstructed " << this << std::endl;
}
void SimpleClass::sayHello() {
std::cout << "Hello! I am #" << data << std::endl;
}
int main() {
long list = Dataset::benchmark([]() {
Dataset::List<SimpleClass> list = Dataset::List<SimpleClass>(1000);
for (int i = 0; i < 1000; i++) {
list.push(SimpleClass(i));
}
});
long vec = Dataset::benchmark([]() {
std::vector<SimpleClass> list = std::vector<SimpleClass>(1000);
for (int i = 0; i < 1000; i++) {
list.emplace_back(SimpleClass(i));
}
});
std::cout << "List: " << list << "ms" << std::endl;
std::cout << "Vector: " << vec << "ms" << std::endl;
return 0;
}

The std::vector constructor that takes a single size argument creates a vector that already holds count elements:
explicit vector( size_type count, const Allocator& alloc = Allocator() );
so your vector benchmark starts with 1000 default-constructed elements and then appends 1000 more. To get something comparable to your List you have to do:
std::vector<SimpleClass> list;
list.reserve( 1000 );
Also, your "vector" copies the objects it holds by simply copying memory, which is only allowed for trivially copyable types. SimpleClass is not one of them, as it has a user-provided copy constructor and destructor.
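The difference between the count constructor and reserve can be checked directly. The following is a small sketch; Sizes, compare_sizes and WithUserCopy are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

struct Sizes {
    std::size_t with_count_ctor;
    std::size_t with_reserve;
};

// Appends n elements to a vector built with the count constructor and to one
// that only reserved capacity, then reports the resulting sizes.
inline Sizes compare_sizes(std::size_t n) {
    std::vector<int> a(n); // already holds n value-initialized elements
    std::vector<int> b;
    b.reserve(n);          // capacity for n elements, but size() is still 0
    assert(a.size() == n && b.empty());
    for (std::size_t i = 0; i < n; ++i) {
        a.push_back(static_cast<int>(i));
        b.push_back(static_cast<int>(i));
    }
    return Sizes{a.size(), b.size()};
}

// A user-provided copy constructor is enough to lose trivial copyability,
// which is why copying such objects byte by byte is not allowed.
struct WithUserCopy {
    WithUserCopy() = default;
    WithUserCopy(const WithUserCopy& other) : data(other.data) {}
    int data = 0;
};
static_assert(!std::is_trivially_copyable<WithUserCopy>::value,
              "user-provided copy constructor makes the type non-trivially copyable");
```

With n = 1000, the first vector ends up with 2000 elements and the second with 1000, so the two benchmark loops in the question were not doing the same amount of work.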

This is a really nice start! Clean and simple solution to the exercise. Sadly, your instincts are right that you weren’t testing enough cases.
One thing that jumps out at me is that you never resize your vectors, and therefore don’t measure how most STL implementations can often avoid copying when they grow in size. The benchmark also never returns any memory to the heap by shrinking a container. You also don’t say which optimization flags you compiled with (for example /O2 on MSVC). But my guess is that there’s a small amount of overhead in Microsoft’s implementation, and that it would pay off in other tests (especially an array of non-trivially-copyable data that needs to be resized, or a series of vectors that start out big but can be filtered and shrunk, or storing lots of data that can be moved instead of copied).
One bug that jumps out at me is that you call new[] to allocate a buffer of char—which is not guaranteed to meet the alignment requirements of T. On some CPUs, that can crash the program.
Another is that you use std::copy with an uninitialized area of memory as the destination in List::setCapacity. That doesn’t work except in special cases: std::copy expects a validly-initialized object that can be assigned to. For any type where assignment is a non-trivial operation, this will fail when the assignment operator tries to clean up garbage data. If that happens to work, the move will then inefficiently clone the data and destroy the original, rather than using the move constructor if one exists. The STL algorithm you really want here is std::uninitialized_move. You might also want to use calloc/realloc, which allows resizing blocks, but realloc is only safe for trivially copyable types.
Your capacity and size members should be size_t rather than int. Using int not only limits the size to less memory than most implementations can address; calculating a size greater than INT_MAX (i.e., 2 GiB or more on most implementations) also causes undefined behavior.
One thing List::push has going for it is that it uses the semantics of std::vector::emplace_back (which you realize, and use as your comparison). It could, however, be improved. You pass item in by value, rather than by const reference. This creates an unnecessary copy of the data. Fortunately, if T has a move constructor, the extra copy can be moved, and if item is an xvalue, the compiler might be able to optimize the copy away, but it would be better to have List::push(const T&) and List::push(T&&). This will let the class push an xvalue without making any copies at all.
List::get is better, and avoids making copies, but it does not have a const version, so a const List<T> cannot do anything. It also does not check bounds.
Consider putting the code to look up the position of an index within the buffer into a private inline member function, which would drastically cut down the amount of work you will need to do to fix design changes (such as the ones you will need to fix the data-alignment bug).
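Several of the points above (aligned storage, size_t members, std::uninitialized_move, both push overloads, a const get) can be combined in one small class. This is only a sketch under the assumption that C++17 is available; the name ListSketch and its growth policy are made up for illustration, and exception safety and copying are deliberately left out.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <utility>

// Hypothetical sketch (C++17): typed, correctly aligned storage with
// move-aware growth. Exception safety and copying are left out for brevity.
template <class T>
class ListSketch {
public:
    ListSketch() = default;
    ListSketch(const ListSketch&) = delete;
    ListSketch& operator=(const ListSketch&) = delete;

    ~ListSketch() {
        std::destroy_n(data_, count_);
        release(data_, capacity_);
    }

    void push(const T& item) { emplace(item); }        // copies an lvalue
    void push(T&& item) { emplace(std::move(item)); }  // moves from an rvalue

    T& get(std::size_t i) { assert(i < count_); return data_[i]; }
    const T& get(std::size_t i) const { assert(i < count_); return data_[i]; }

    std::size_t count() const { return count_; }
    std::size_t capacity() const { return capacity_; }

private:
    template <class U>
    void emplace(U&& item) {
        if (count_ == capacity_) {
            const std::size_t grown = capacity_ < 2 ? 8 : capacity_ + capacity_ / 2;
            T* fresh = std::allocator<T>().allocate(grown); // aligned for T
            // Build the new element first, in case item refers into the old buffer.
            ::new (static_cast<void*>(fresh + count_)) T(std::forward<U>(item));
            // Move-construct the old elements into raw storage, then destroy the originals.
            std::uninitialized_move(data_, data_ + count_, fresh);
            std::destroy_n(data_, count_);
            release(data_, capacity_);
            data_ = fresh;
            capacity_ = grown;
        } else {
            ::new (static_cast<void*>(data_ + count_)) T(std::forward<U>(item));
        }
        ++count_;
    }

    static void release(T* p, std::size_t n) {
        if (p) std::allocator<T>().deallocate(p, n);
    }

    T* data_ = nullptr;
    std::size_t count_ = 0;
    std::size_t capacity_ = 0;
};
```

Because the buffer is a T* obtained from std::allocator<T>, the index lookup is plain pointer arithmetic and the alignment problem disappears. The rvalue overload also lets the class hold move-only types such as std::unique_ptr<int>.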

Is it possible to perform a string to int mapping at compile time?

Is it possible to perform a unique string to int mapping at compile time?
Let's say I have a template like this for profiling:
template <int profilingID>
class Profile{
public:
Profile(){ /* start timer */ }
~Profile(){ /* stop timer */ }
};
which I place at the beginning of function calls like this:
void myFunction(){
Profile<0> profile_me;
/* some computations here */
}
Now I'm trying to do something like the following, which is not possible since string literals cannot be used as a template argument:
void myFunction(){
Profile<"myFunction"> profile_me; // or PROFILE("myFunction")
/* some computations here */
}
I could declare global variables to overcome this issue, but I think it would be more elegant to avoid previous declarations. A simple mapping of the form
”myFunction” → 0
”myFunction1” → 1
…
”myFunctionN” → N
would be sufficient. But to this point neither using constexpr, template meta-programming nor macros I could find a way to accomplish such a mapping. Any ideas?

As @harmic has already mentioned in the comments, you should probably just pass the name to the constructor. This might also help reduce code bloat because you don't generate a new type for each function.
However, I don't want to miss the opportunity to show a dirty hack that might be useful in situations where the string cannot be passed to the constructor. If your strings have a maximum length that is known at compile-time, you can encode them into integers. In the following example, I'm only using a single integer which limits the maximum string length to 8 characters on my system. Extending the approach to multiple integers (with the splitting logic conveniently hidden by a small macro) is left as an exercise to the reader.
The code makes use of the C++14 feature to use arbitrary control structures in constexpr functions. In C++11, you'd have to write wrap as a slightly less straight-forward recursive function.
#include <climits>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <type_traits>
template <typename T = std::uintmax_t>
constexpr std::enable_if_t<std::is_integral<T>::value, T>
wrap(const char *const string) noexcept
{
constexpr auto N = sizeof(T);
T n {};
std::size_t i {};
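while (i < N && string[i]) // check the bound before reading string[i]
n = (n << CHAR_BIT) | static_cast<unsigned char>(string[i++]);
return i ? (n << (N - i) * CHAR_BIT) : n; // avoid a full-width shift for ""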
}
template <typename T>
std::enable_if_t<std::is_integral<T>::value>
unwrap(const T n, char *const buffer) noexcept
{
constexpr auto N = sizeof(T);
constexpr auto lastbyte = static_cast<char>(~0);
for (std::size_t i = 0UL; i < N; ++i)
buffer[i] = ((n >> (N - i - 1) * CHAR_BIT) & lastbyte);
buffer[N] = '\0';
}
template <std::uintmax_t Id>
struct Profile
{
char name[sizeof(std::uintmax_t) + 1];
Profile()
{
unwrap(Id, name);
std::printf("%-8s %s\n", "ENTER", name);
}
~Profile()
{
std::printf("%-8s %s\n", "EXIT", name);
}
};
It can be used like this:
void
function()
{
const Profile<wrap("function")> profiler {};
}
int
main()
{
const Profile<wrap("main")> profiler {};
function();
}
Output:
ENTER main
ENTER function
EXIT function
EXIT main

In principle you can. However, I doubt any option is practical.
You can set your key type to be a constexpr value type (this excludes std::string). Initializing the value type you implement is not a problem either: just give it a constexpr constructor from an array of chars. However, you also need to implement a constexpr map or hash table, and a constexpr hashing function. Implementing a constexpr map is the hard part, but it is still doable.
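The constexpr hashing function is the easy part and already goes a long way. Below is a sketch that uses the 64-bit FNV-1a hash as a template argument (C++14 or later). ProfileSketch and PROFILE_SKETCH are hypothetical stand-ins for the question's Profile class. Note that a hash is not a guaranteed unique mapping: two different names could collide, although that is very unlikely for a handful of function names.

```cpp
#include <cstdint>

// FNV-1a over a NUL-terminated string, usable in constant expressions (C++14).
constexpr std::uint64_t fnv1a(const char* s) noexcept {
    std::uint64_t h = 0xcbf29ce484222325ull; // 64-bit FNV offset basis
    while (*s) {
        h ^= static_cast<unsigned char>(*s++);
        h *= 0x100000001b3ull;               // 64-bit FNV prime
    }
    return h;
}

// Hypothetical stand-in for the profiling class from the question.
template <std::uint64_t Id>
struct ProfileSketch {
    static constexpr std::uint64_t id = Id;
};

// Wraps the hashing so the call site reads like PROFILE("name").
#define PROFILE_SKETCH(name) ProfileSketch<fnv1a(name)>

static_assert(fnv1a("") == 0xcbf29ce484222325ull,
              "the empty string hashes to the offset basis");
static_assert(fnv1a("myFunction") != fnv1a("myFunction1"),
              "these two names get distinct ids");
```

Inside a function this would be used as PROFILE_SKETCH("myFunction") profile_me;. Since the hash cannot be turned back into the name, the name still has to be passed separately if it is needed for output.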

You could create a table:
struct Int_String_Entry
{
unsigned int id;
const char * text; // string literals are const
};
static const Int_String_Entry my_table[] =
{
{0, "My_Function"},
{1, "My_Function1"},
//...
};
const unsigned int my_table_size =
sizeof(my_table) / sizeof(my_table[0]);
Maybe what you want is a lookup table with function pointers.
typedef void (*Function_Pointer)(void);
struct Int_vs_FP_Entry
{
unsigned int func_id;
Function_Pointer p_func;
};
static const Int_vs_FP_Entry func_table[] =
{
{ 0, My_Function},
{ 1, My_Function1},
//...
};
For completeness, you can combine all three attributes into another structure and create another table.
Note: Since the tables are declared as "static const", they are assembled at compile time.

Why not just use an enum like:
enum ProfileID{myFunction = 0,myFunction1 = 1, myFunction2 = 2 };
Your strings are not used at run time, so I don't understand the reason for using strings here.

It is an interesting question.
It is possible to statically-initialize a std::map as follows:
static const std::map<int, int> my_map {{1, 2}, {3, 4}, {5, 6}};
but I get that such initialization is not what you are looking for, so I took another approach after looking at your example.
A global registry holds a mapping between function name (an std::string) and run time (an std::size_t representing the number of milliseconds).
An AutoProfiler is constructed providing the name of the function, and it will record the current time. Upon destruction (which will happen as we exit the function) it will calculate the elapsed time and record it in the global registry.
When the program ends we print the contents of the map (to do so we utilize the std::atexit function).
The code looks as follows:
#include <cstddef>
#include <cstdlib>
#include <iostream>
#include <map>
#include <string>
#include <utility>
#include <chrono>
#include <cmath>
using ProfileMapping = std::map<std::string, std::size_t>;
ProfileMapping& Map() {
static ProfileMapping map;
return map;
}
void show_profiles() {
for(const auto & pair : Map()) {
std::cout << pair.first << " : " << pair.second << std::endl;
}
}
class AutoProfiler {
public:
AutoProfiler(std::string name)
: m_name(std::move(name)),
m_beg(std::chrono::high_resolution_clock::now()) { }
~AutoProfiler() {
auto end = std::chrono::high_resolution_clock::now();
auto dur = std::chrono::duration_cast<std::chrono::milliseconds>(end - m_beg);
Map().emplace(m_name, dur.count());
}
private:
std::string m_name;
std::chrono::time_point<std::chrono::high_resolution_clock> m_beg;
};
void foo() {
AutoProfiler ap("foo");
long double x {1};
for(std::size_t k = 0; k < 1000000; ++k) {
x += std::sqrt(k);
}
}
void bar() {
AutoProfiler ap("bar");
long double x {1};
for(std::size_t k = 0; k < 10000; ++k) {
x += std::sqrt(k);
}
}
void baz() {
AutoProfiler ap("baz");
long double x {1};
for(std::size_t k = 0; k < 100000000; ++k) {
x += std::sqrt(k);
}
}
int main() {
std::atexit(show_profiles);
foo();
bar();
baz();
}
I compiled it as:
$ g++ AutoProfile.cpp -std=c++14 -Wall -Wextra
and obtained:
$ ./a.out
bar : 0
baz : 738
foo : 7
You do not need -std=c++14, but you will need at least -std=c++11.
I realize this is not what you are looking for, but I liked your question and decided to pitch in my $0.02.
And notice that if you use the following definition (std::multimap is declared in the same <map> header):
using ProfileMapping = std::multimap<std::string, std::size_t>;
you can record every access to each function (instead of ditching the new results once the first entry has been written, or overwriting the old results).
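If a running total per function is enough, another option is to accumulate into the map instead of storing every call. A minimal sketch, assuming a hypothetical helper named record and a type alias ProfileTotals that are not part of the answer above:

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

using ProfileTotals = std::map<std::string, std::size_t>;

// Adds one measurement to the running total for name. operator[] creates a
// zero entry the first time a name is seen, so no separate insert is needed.
inline void record(ProfileTotals& totals, const std::string& name,
                   std::size_t milliseconds) {
    totals[name] += milliseconds;
    assert(totals.count(name) == 1);
}
```

In the AutoProfiler destructor this would replace the emplace call, for example record(Map(), m_name, dur.count());, so that repeated calls to the same function add up.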

You could do something similar to the following. It's a bit awkward, but may do what you want a little more directly than mapping to an integer:
#include <iostream>
template <const char *name>
class Profile{
public:
Profile() {
std::cout << "start: " << name << std::endl;
}
~Profile() {
std::cout << "stop: " << name << std::endl;
}
};
constexpr const char myFunction1Name[] = "myFunction1";
void myFunction1(){
Profile<myFunction1Name> profile_me;
/* some computations here */
}
int main()
{
myFunction1();
}

c++ array[var][2] as a class member

I would like to have an array of unsigned integers within a class, and its size should be [var][2], so the user will be able to choose var in runtime.
Is there a better way than allocating a two dimensional array (an allocated array of pointers to allocated arrays)?
In the class I have:
unsigned int *(*hashFunc);
And in the initializing function:
hashFunc = new unsigned int*[var];
for(unsigned int i = 0; i<var; ++i)
hashFunc[i] = new unsigned int[2];
I want to only allocate once, and I think it should somehow be possible because I only have one unknown dimension (var is unknown but 2 I know from the beginning).
Thanks!

If the sizes are known at compilation time, you should use std::array. If one of the dimensions is not known until run time, you should use std::vector.
You can of course combine them:
std::vector<std::array<unsigned int, 2>> hashFunc;
The above declares hashFunc to be a vector of arrays, where each array has size two and element type unsigned int, just as specified in the question.
Then to add a new inner array just use push_back of the vector:
hashFunc.push_back({{ 1, 2 }});
(And yes, double braces are needed. The outer to construct the std::array object, and the inner for the actual array data.)
Or if you want to set the size of the outer vector at once (for example if you (runtime) know the size beforehand) you could do e.g.
hashFunc = std::vector<std::array<unsigned int, 2>>(var);
Where var above is the size of the "first dimension". Now you can directly access hashFunc[x][y] where x is in range of var and y is zero or one.
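Put together, the approach above might look like the following sketch. The alias Pair and the helpers make_table and set_row are hypothetical names added for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

using Pair = std::array<unsigned int, 2>;

// Builds a var x 2 table in a single allocation; every entry starts at zero
// because the vector value-initializes its elements.
inline std::vector<Pair> make_table(std::size_t var) {
    return std::vector<Pair>(var);
}

// Stores {a, b} in the given row, checking the row index in debug builds.
inline void set_row(std::vector<Pair>& table, std::size_t row,
                    unsigned int a, unsigned int b) {
    assert(row < table.size());
    table[row] = Pair{{a, b}};
}
```

The rows sit next to each other in one contiguous block, so this keeps the single-allocation property the question asks for while the vector takes care of freeing the memory.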

(To answer the direct question.) You can declare the pointer as
int (*hashFunc)[2];
and allocate it in one shot as
hashFunc = new int[var][2];
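The same one-shot allocation can be given an owner so that delete[] is not forgotten. Here is a sketch using std::unique_ptr with an array-of-arrays element type; the class name HashFuncTable and its members are hypothetical, and unsigned int is used to match the question.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical owner of a var x 2 table allocated in one shot.
class HashFuncTable {
public:
    explicit HashFuncTable(std::size_t var)
        : rows_(var), data_(new unsigned int[var][2]()) {} // () zero-initializes

    unsigned int& at(std::size_t row, std::size_t col) {
        assert(row < rows_ && col < 2);
        return data_[row][col];
    }

    std::size_t rows() const { return rows_; }

private:
    std::size_t rows_;
    std::unique_ptr<unsigned int[][2]> data_; // releases the block with delete[]
};
```

The unique_ptr makes the class movable but not copyable, which avoids accidental double deletion without writing a destructor by hand.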

There are two ways you can go about this: either have a class with a bare pointer, or encapsulate the storage with std::vector and std::array. Below is a sample of two possible implementations that do exactly the same thing.
#include <iostream>
#include <vector>
#include <array>
#include <stdexcept>
#include <cstddef>
#include <cstdlib>
class TheClass {
public:
typedef int TwoInts[2];
TheClass(const std::size_t size) : m_size(size)
{
// new[] reports failure by throwing std::bad_alloc, so no null check is needed.
m_hashFunc = new TwoInts[m_size];
}
virtual ~TheClass()
{
delete [] m_hashFunc;
}
inline std::size_t size() const { return m_size; }
inline int& operator()(const std::size_t i, const std::size_t j) { return m_hashFunc[i][j]; }
inline const int& operator()(const std::size_t i, const std::size_t j) const { return m_hashFunc[i][j]; }
private:
std::size_t m_size;
TwoInts* m_hashFunc;
};
class AnotherClass {
public:
AnotherClass(const std::size_t size) : m_hashFunc(size)
{
// Nothing to do here.
}
// No destructor required.
inline std::size_t size() const { return m_hashFunc.size(); }
inline int& operator()(const std::size_t i, const std::size_t j) { return m_hashFunc[i][j]; }
inline const int& operator()(const std::size_t i, const std::size_t j) const { return m_hashFunc[i][j]; }
private:
std::vector<std::array<int, 2>> m_hashFunc;
};
int main(int argc, char *argv[]) {
if (argc < 2) return -1;
const std::size_t runtimesize = static_cast<std::size_t>(atoll(argv[1]));
const std::size_t i1 = rand() % runtimesize;
const std::size_t i2 = rand() % runtimesize;
TheClass instance1(runtimesize);
AnotherClass instance2(runtimesize);
instance1(i1,0) = instance2(i1,0) = 4;
instance1(i2,1) = instance2(i2,1) = 2;
std::cout << instance1(i1,0) << ' ' << instance2(i1,0) << std::endl;
std::cout << instance1(i2,1) << ' ' << instance2(i2,1) << std::endl;
std::cout << instance1.size() << std::endl;
std::cout << instance2.size() << std::endl;
// ... etc
return 0;
}

Array with undefined size as Class-member

I'm searching for a way to define an array as a class-member with an undefined size (which will be defined on initialization).
class MyArrayOfInts {
private:
int[] array; // should declare the array with an (yet) undefined length
public:
MyArrayOfInts(int);
int Get(int);
void Set(int, int);
};
MyArrayOfInts::MyArrayOfInts(int length) {
this->array = int[length]; // defines the array here
}
int MyArrayOfInts::Get(int index) {
return this->array[index];
}
void MyArrayOfInts::Set(int index, int value) {
this->array[index] = value;
}
How can I achieve this behaviour ?

Why not just use std::vector<int>?
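For illustration, the class from the question could be written on top of std::vector roughly like this. It is a sketch only: the member name array_ and the extra Length accessor are assumptions, and the bounds checks are debug-only asserts.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class MyArrayOfInts {
public:
    // The vector is sized once, here; all elements start at zero.
    explicit MyArrayOfInts(std::size_t length) : array_(length) {}

    int Get(std::size_t index) const {
        assert(index < array_.size());
        return array_[index];
    }

    void Set(std::size_t index, int value) {
        assert(index < array_.size());
        array_[index] = value;
    }

    std::size_t Length() const { return array_.size(); }

private:
    std::vector<int> array_;
};
```

Copying, assignment and destruction all work correctly without any extra code, because std::vector already implements them.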

Proof Of Concept
OK, inspired by UncleBens' challenge here, I came up with a proof of concept (see below) that lets you actually do:
srand(123);
for (int i=0; i<10; i++)
{
size_t N = rand() % DEMO_MAX; // capped for demo purposes
std::auto_ptr<iarray> dyn(make_dynamic_array(N));
exercise(*dyn);
}
It revolves around a template trick in factory<>::instantiate that actually uses a compile-time meta-binary-search to match the specified (runtime) dimension to a range of explicit static_array class template instantiations.
I feel the need to repeat that this is not good design. I provide the code sample only to show the limits of what can be done, with reasonable effort, to achieve the actual goal of the question. You can see the drawbacks:
- the compiler is crippled with a boatload of useless static types and creates classes that are so big that they become a performance liability or a reliability hazard (stack allocation anyone? -> we're on 'stack overflow' already :))
- at DEMO_MAX = 256, g++ -Os will actually emit 258 instantiations of factory<>; g++ -O4 will keep 74 of those, inlining the rest [2]
- compilation doesn't scale well: at DEMO_MAX = RAND_MAX compilation takes about 2m9s to... run out of memory on a 64-bit 8GB machine; at RAND_MAX>>16 it takes over 25 minutes to possibly compile (?) while nearly running out of memory. It would really require some amount of ugly manual optimization to remove these limits. I haven't gone so insane as to actually do that work, if you'll excuse me.
- on the upside, this sample demonstrates the arguably sane range for this class (0..256) and compiles in only 4 seconds and 800 KB on my 64-bit Linux. See also a down-scaled, ANSI-proof version at codepad.org
[2] established with objdump -Ct test | grep instantiate | cut -c62- | sort -k1.10n
Show me the CODE already!
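// Note: std::auto_ptr is deprecated since C++11 and was removed in C++17,
// so build this with -std=c++14 or earlier.
#include <iostream>
#include <memory>
#include <algorithm>
#include <iterator>
#include <stdexcept>
#include <cstdlib> // srand, rand
#include <cstddef> // size_t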
struct iarray
{
typedef int value_type;
typedef value_type* iterator;
typedef value_type const* const_iterator;
typedef value_type& reference;
typedef value_type const& const_reference;
virtual size_t size() const = 0;
virtual iterator begin() = 0;
virtual const_iterator begin() const = 0;
// completely unoptimized plumbing just for demonstration purps here
inline iterator end() { return begin()+size(); }
inline const_iterator end() const { return begin()+size(); }
// boundary checking would be 'gratis' here... for compile-time constant values of 'index'
inline const_reference operator[](size_t index) const { return *(begin()+index); }
inline reference operator[](size_t index) { return *(begin()+index); }
//
virtual ~iarray() {}
};
template <size_t N> struct static_array : iarray
{
static const size_t _size = N;
value_type data[N];
virtual size_t size() const { return _size; }
virtual iterator begin() { return data; }
virtual const_iterator begin() const { return data; }
};
#define DEMO_MAX 256
template <size_t PIVOT=DEMO_MAX/2, size_t MIN=0, size_t MAX=DEMO_MAX>
struct factory
/* this does a binary search in a range of static types
*
* due to the binary search, this will require at most about log2(MAX) levels
* of recursion.
*
* If the parameter (size_t n) is a compile time constant expression,
* together with automatic inlining, the compiler will be able to optimize
* this all the way to simply returning
*
* new static_array<n>()
*
* TODO static assert MIN<=PIVOT<=MAX
*/
{
inline static iarray* instantiate(size_t n)
{
if (n>MAX || n<MIN)
throw std::range_error("unsupported size");
if (n==PIVOT)
return new static_array<PIVOT>();
if (n>PIVOT)
return factory<(PIVOT + (MAX-PIVOT+1)/2), PIVOT+1, MAX>::instantiate(n);
else
return factory<(PIVOT - (PIVOT-MIN+1)/2), MIN, PIVOT-1>::instantiate(n);
}
};
iarray* make_dynamic_array(size_t n)
{
return factory<>::instantiate(n);
}
void exercise(iarray& arr)
{
int gen = 0;
for (iarray::iterator it=arr.begin(); it!=arr.end(); ++it)
*it = (gen+=arr.size());
std::cout << "size " << arr.size() << ":\t";
std::copy(arr.begin(), arr.end(), std::ostream_iterator<int>(std::cout, ","));
std::cout << std::endl;
}
int main()
{
{ // boring, oldfashioned method
static_array<5> i5;
static_array<17> i17;
exercise(i5);
exercise(i17);
}
{ // exciting, newfangled, useless method
for (int n=0; n<=DEMO_MAX; ++n)
{
std::auto_ptr<iarray> dyn(make_dynamic_array(n));
exercise(*dyn);
}
try { make_dynamic_array(-1); } catch (std::range_error e) { std::cout << "range error OK" << std::endl; }
try { make_dynamic_array(DEMO_MAX + 1); } catch (std::range_error e) { std::cout << "range error OK" << std::endl; }
return 0;
srand(123);
for (int i=0; i<10; i++)
{
size_t N = rand() % DEMO_MAX; // capped for demo purposes
std::auto_ptr<iarray> dyn(make_dynamic_array(N));
exercise(*dyn);
}
}
return 0;
}

Declare it as:
int* array;
Then you can initialize it this way:
MyArrayOfInts::MyArrayOfInts(int length) {
this->array = new int[length];
}
Don't forget to free the memory in the destructor:
MyArrayOfInts::~MyArrayOfInts() {
delete [] this->array;
}
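One thing the raw-pointer version still needs is copy handling: with only a destructor, copying the object would make two instances delete the same array. A sketch of the usual fix (copy constructor plus copy-and-swap assignment) follows. The member names length_ and array_ are assumptions, and the bounds checks are debug-only asserts.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>

class MyArrayOfInts {
public:
    explicit MyArrayOfInts(std::size_t length)
        : length_(length), array_(new int[length]()) {} // () zero-initializes

    // Deep copy: the new object gets its own array.
    MyArrayOfInts(const MyArrayOfInts& other)
        : length_(other.length_), array_(new int[other.length_]) {
        std::copy(other.array_, other.array_ + other.length_, array_);
    }

    // Copy-and-swap: the by-value parameter is the copy, and swapping hands
    // the old buffer to it so that it is freed when other is destroyed.
    MyArrayOfInts& operator=(MyArrayOfInts other) {
        std::swap(length_, other.length_);
        std::swap(array_, other.array_);
        return *this;
    }

    ~MyArrayOfInts() { delete[] array_; }

    int Get(std::size_t index) const {
        assert(index < length_);
        return array_[index];
    }

    void Set(std::size_t index, int value) {
        assert(index < length_);
        array_[index] = value;
    }

private:
    std::size_t length_;
    int* array_;
};
```

This is the boilerplate that std::vector<int> provides for free, which is the main argument for preferring it when it is allowed.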

Is the class declaration complete? If the size of the array is known at compile time and you don't want to resize the array, then making the class a template on the size can give you the desired behaviour.
Then we don't have to pass the size of the array as an argument to the constructor.
#include <cstddef>
#include <iostream>
#include <numeric>
template<size_t size>
class MyClass
{
public:
MyClass() { std::iota(arr_m, arr_m + size, 1); }
int operator[](int index) const
{
return arr_m[index];
}
int& operator[](int index)
{
return arr_m[index];
}
void Set(size_t index, int value)
{
arr_m[index] = value;
}
private:
int arr_m[size];
};
int main()
{
{
MyClass<5> obj;
std::cout << obj[4] << std::endl;
}
{
MyClass<4> obj;
std::cout << obj[3] << std::endl;
obj.Set(3, 30);
std::cout << obj[3] << std::endl;
}
}

In response to critics in the comments
I think many people fail to notice a crucial given in the question: since the question asks specifically how to declare an int[N] array inside a struct, it follows that each N will yield a distinct static type to the compiler.
As much as my approach is being 'critiqued' for this property, I did not invent it: it is a requirement from the original question. I can join the chorus saying "just don't" or "impossible", but as a curious engineer I feel I'm often more helped by defining the boundaries of just what is in fact still possible.
I'll take a moment to come up with a sketch of an answer, mainly to UncleBens' interesting challenge. Of course I could hand-wave 'just use template metaprogramming', but it sure would be more convincing and fun to come up with a sample [1]
[1] only to follow that sample with a big warning: don't do this in actual life :)
The TR1 (or C++0x) type std::array does exactly that; you'll need to make the containing class generic to cater for the array size:
template <size_t N> struct MyArrayOfInts : MyArrayOfIntsBase /* for polymorphism */
{
std::array<int, N> _data;
explicit MyArrayOfInts(const int data[N])
{
std::copy(data, data+N, _data.begin()); // the destination must be an iterator
}
};
You can make the thing easier to work with by doing a smart template overloaded factory:
template <size_t N>
MyArrayOfInts<N> MakeMyArray(const int (&data)[N])
{ return MyArrayOfInts<N>(data); }

I'm working on a dynamic array problem too, and I found the answer above sufficient to solve it.
This is tricky because, from my reading, arrays declared inside a function do not outlive the function, and arrays have a lot of strange nuances. However, if you are trying to make a dynamic array without being allowed to use a vector, I believe this is the best approach.
Other approaches, such as calling delete more than once on the same array, can lead to a double free, which is undefined behavior.
class arrayObject
{
public:
arrayObject();
~arrayObject();
int createArray(int firstArray[]);
void getSize();
void getValue();
void deleting();
// private:
int *array;
int size;
int counter;
int number;
};
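arrayObject::arrayObject()
: array(nullptr), size(0), counter(0), number(0)
{
// size must have a value before it is used as an array length
this->array = new int[size];
}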
arrayObject::~arrayObject()
{
delete [] this->array;
}
