Why no default hash for C++ POD structs?

I want to use a POD struct as a hash key in a map, e.g.
struct A { int x; int y; };
std::unordered_map<A, int> my_map;
but I can't do this, since no hash function is auto-generatable for such structs.
Why does the C++ standard not require a default hash for a POD struct?
Why do compilers (specifically, GCC 4.x / 5.x) not offer such a hash, even if the standard doesn't mandate one?
How can I generate a hash function, using a template, in a portable way, for all of my POD structures (I'm willing to make semantic assumptions if necessary)?

According to the documentation, a possible implementation in your case would be:
#include<functional>
#include<unordered_map>
struct A { int x; int y; };
namespace std
{
template<> struct hash<A>
{
using argument_type = A;
using result_type = std::size_t;
result_type operator()(argument_type const& a) const
{
result_type const h1 ( std::hash<int>()(a.x) );
result_type const h2 ( std::hash<int>()(a.y) );
return h1 ^ (h2 << 1);
}
};
}
int main() {
std::unordered_map<A, int> my_map;
}
The compiler is not allowed to generate such a specialization, because the standard does not define anything like that (as already mentioned in the comments).
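For completeness, here is a minimal sketch of how such a specialization might be used end to end. It assumes a hypothetical key type Point and a hypothetical helper hash_combine (not part of the standard library; the mixing formula is the one popularized by Boost). Note that std::unordered_map also needs an equality comparison for the key, which the snippet above leaves out.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <unordered_map>

struct Point { int x; int y; };  // hypothetical key type

// The map needs equality on the key in addition to a hash.
inline bool operator==(const Point& a, const Point& b) {
    return a.x == b.x && a.y == b.y;
}

// Hypothetical helper (an assumption, not a standard function):
// mixes the hash of v into seed.
template <class T>
void hash_combine(std::size_t& seed, const T& v) {
    seed ^= std::hash<T>()(v) + 0x9e3779b9 + (seed << 6) + (seed >> 2);
}

namespace std {
template <> struct hash<Point> {
    std::size_t operator()(const Point& p) const {
        std::size_t seed = 0;
        ::hash_combine(seed, p.x);
        ::hash_combine(seed, p.y);
        return seed;
    }
};
}

// Stores two keys that differ only in field order and reads one back.
inline int lookup_demo() {
    std::unordered_map<Point, int> m;
    m[Point{1, 2}] = 42;
    m[Point{2, 1}] = 7;
    assert(m.size() == 2);
    return m[Point{1, 2}];
}
```

Specializing std::hash for your own type is allowed; the compiler just will not write the body for you.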

There is a way to generate a hash for a POD the good old C way, by hashing its raw bytes. It works only for a real POD with no data linked outside the struct (for example, no pointers to external data). The code does not check these requirements, so use it only when you know and can guarantee them. All fields must be initialized (for example, by a default constructor such as A(), B(), etc.).
#pragma pack(push) /* push current alignment to stack */
#pragma pack(1) /* set alignment to 1 byte boundary */
struct A { int x; int y; };
struct B { int x; char ch[8]; };
#pragma pack(pop) /* restore original alignment from stack */
struct C { int x __attribute__((packed)); };
// Generic version: hashes the raw bytes of any packed POD type T,
// so that PodHash<A> and PodHash<B> are both defined.
template<class T>
class PodHash {
public:
size_t operator()(const T &a) const
{
// it is possible to write hash func here char by char without using std::string
const std::string str =
std::string( reinterpret_cast<const std::string::value_type*>( &a ), sizeof(T) );
return std::hash<std::string>()( str );
}
}
};
std::unordered_map< A, int, PodHash<A> > m_mapMyMapA;
std::unordered_map< B, int, PodHash<B> > m_mapMyMapB;
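The comment in the code above notes that the bytes can be hashed one by one without going through std::string. A minimal sketch of that idea follows. It assumes FNV-1a as the mixing function (my choice, not something this answer prescribes), and the names BytewiseHash, BytewiseEqual and Key are hypothetical. It also supplies the equality functor that std::unordered_map needs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <unordered_map>

// Hashes the object representation of T byte by byte (FNV-1a, 64-bit).
// Only meaningful when T has no padding bytes and no pointers to outside data.
template <class T>
struct BytewiseHash {
    static_assert(std::is_trivially_copyable<T>::value,
                  "byte-wise hashing needs a trivially copyable type");
    std::size_t operator()(const T& v) const {
        unsigned char bytes[sizeof(T)];
        std::memcpy(bytes, &v, sizeof(T));
        std::uint64_t h = 14695981039346656037ull;  // FNV offset basis
        for (unsigned char b : bytes) {
            h ^= b;
            h *= 1099511628211ull;                  // FNV prime
        }
        return static_cast<std::size_t>(h);
    }
};

// Byte-wise equality, consistent with the hash above.
template <class T>
struct BytewiseEqual {
    bool operator()(const T& a, const T& b) const {
        return std::memcmp(&a, &b, sizeof(T)) == 0;
    }
};

struct Key { int x; int y; };  // hypothetical key; assumed to have no padding
static_assert(sizeof(Key) == 2 * sizeof(int),
              "Key has padding; byte-wise hashing would be unsafe");

using KeyMap = std::unordered_map<Key, int, BytewiseHash<Key>, BytewiseEqual<Key>>;
```

The static_assert on sizeof is a cheap way to notice padding at compile time instead of relying on a promise in a comment.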
UPD:
The data structure must be defined inside a packing section with a value of one byte, or with the packed attribute, to prevent padding bytes.
UPD:
A warning, though: replacing the default packing can make loading and storing some fields from/to memory slightly slower. To prevent this, arrange the structure's fields with a granularity that matches your (or the most popular) architecture.
I suggest adding extra unused fields yourself, not for use but to arrange the fields of your data structure for the best memory load/store performance. Example:
struct A
{
char x; // 1 byte
char padding1[3]; // 3 byte for the following 'int'
int y; // 4 bytes - largest structure member
short z; // 2 byte
char padding2[2]; // 2 bytes to make total size of the structure 12 bytes
};
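If you go the manual-padding route, it may be worth letting the compiler confirm that the layout is what you expect. A small sketch, assuming a 4-byte int and a 2-byte short (typical, but not guaranteed) and a hypothetical struct name Record:

```cpp
#include <cassert>
#include <cstddef>

struct Record {
    char  x;            // 1 byte
    char  padding1[3];  // explicit filler before the int
    int   y;            // assumed 4 bytes
    short z;            // assumed 2 bytes
    char  padding2[2];  // explicit filler up to 12 bytes
};

// These fail to compile if the compiler inserted hidden padding
// or if the size assumptions above do not hold on this platform.
static_assert(offsetof(Record, y) == 4, "unexpected offset of y");
static_assert(offsetof(Record, z) == 8, "unexpected offset of z");
static_assert(sizeof(Record) == 12, "unexpected hidden padding in Record");

// Value-initialization zeroes the filler bytes too, which matters
// when the raw bytes are later hashed or compared.
inline Record make_record(char x, int y, short z) {
    Record r{};
    r.x = x;
    r.y = y;
    r.z = z;
    return r;
}
```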
#pragma pack is supported by, at least:
Microsoft compiler
GNU compiler (webarchive)
clang-llvm compiler (webarchive)
Embarcadero (Borland) compiler (webarchive)
Sun WorkShop Compiler (webarchive)
Intel compiler is compatible with GCC, CLANG and Microsoft compiler

A more flexible way is to declare a hash class and use it as a template parameter of std::unordered_map.
struct A { int x; int y; };
template<class T> class MyHash;
template<>
class MyHash<A> {
public:
size_t operator()(const A &a) const
{
const size_t h1 ( std::hash<int>()(a.x) );
const size_t h2 ( std::hash<int>()(a.y) );
return h1 ^ (h2 << 1);
}
};
std::unordered_map<A, int, MyHash<A> > m_mapMyMap;
You may want another hash for the same objects. The flexibility shows in code like this:
std::unordered_map<A, int, MyAnotherHash> m_mapMyMap;
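To make the "another hash for the same objects" point concrete, here is a hedged sketch with two interchangeable hashers for the same key type. The names HashXY and HashXOnly are hypothetical, and the key type again needs operator== for the map to compile.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <unordered_map>

struct A { int x; int y; };

inline bool operator==(const A& l, const A& r) {
    return l.x == r.x && l.y == r.y;
}

// Hashes both fields.
struct HashXY {
    std::size_t operator()(const A& a) const {
        const std::size_t h1 = std::hash<int>()(a.x);
        const std::size_t h2 = std::hash<int>()(a.y);
        return h1 ^ (h2 << 1);
    }
};

// Cheaper but weaker: keys with equal x land in the same bucket.
struct HashXOnly {
    std::size_t operator()(const A& a) const {
        return std::hash<int>()(a.x);
    }
};

using MapXY    = std::unordered_map<A, int, HashXY>;
using MapXOnly = std::unordered_map<A, int, HashXOnly>;
```

Both maps give the same lookup results; only the bucket distribution (and therefore performance) differs.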

Related

Uses for cpp struct variables passed in templates

I just want to ask if there are any uses for passing variables in cpp as template arguments
template<int a> struct foo {
int x = a;
};
int main() {
foo<2> bar;
std::cout << bar.x;
}
Something like this compiles, works and cout's 2 but the same thing can be done by doing
struct foo {
int x;
foo(int a) : x(a) {}
};
int main() {
foo bar(2);
std::cout << bar.x;
}
So what is the point of using variables in template arguments? I can also see a big flaw in using the first method: the variable a uses memory and isn't destructed after x is changed, as it would be after the constructor is called in the second example. It might be helpful if you showed some reasonable uses for that.

When you pass a value through a template argument, it can be used at compile time.
For example, if you need to create a statically sized array in your class, you could use the template argument to pass the size of your array:
template <int TSize>
class Foo {
[...] // Do whatever you need to do with mData.
private:
std::array<int, TSize> mData;
};

There are many uses for constants in template parameters.
Static Sizes
This is how you would start implementing something like a std::array.
template <typename T, size_t SIZE>
struct Array {
T data[SIZE];
};
Template parameters are always usable in a constexpr context, so they can be used as sizes for statically sized arrays.
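A slightly fuller sketch of the same idea, with a size() member and element access added for illustration (these members and the helper sum_demo are my additions, not part of the snippet above):

```cpp
#include <cassert>
#include <cstddef>

template <typename T, std::size_t SIZE>
struct Array {
    T data[SIZE];  // SIZE is a compile-time constant, so this is a plain array

    constexpr std::size_t size() const { return SIZE; }
    T& operator[](std::size_t i) { return data[i]; }
    const T& operator[](std::size_t i) const { return data[i]; }
};

// The size is part of the type, so it costs no storage at runtime.
static_assert(sizeof(Array<int, 4>) == 4 * sizeof(int),
              "Array should add no overhead over a raw array");

// Sums a three-element array to show the type in use.
inline int sum_demo() {
    Array<int, 3> a{{1, 2, 3}};
    int sum = 0;
    for (std::size_t i = 0; i < a.size(); ++i) sum += a[i];
    return sum;
}
```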
Providing Compile-Time Parameters to Algorithms
Another use is parametrizing algorithms like in the following code sample.
We have a uint32_t in ARGB order, but to store it in a file we might need to reorder it to BGRA or RGBA. We know the order at compile time, so we could use an ArgbOrder template parameter.
enum class ArgbOrder { ARGB, RGBA, BGRA };
struct ChannelOffsets {
unsigned a;
unsigned r;
unsigned g;
unsigned b;
};
// and we can get a constexpr lookup table from this enum
constexpr ChannelOffsets byteShiftAmountsOf(ArgbOrder format)
{
...
}
template <ArgbOrder order>
void encodeArgb(uint32_t argb, uint8_t out[4])
{
// We can generate the shift amounts at compile time.
constexpr ChannelOffsets shifts = byteShiftAmountsOf(order);
out[0] = static_cast<uint8_t>(argb >> shifts.a);
out[1] = static_cast<uint8_t>(argb >> shifts.r);
out[2] = static_cast<uint8_t>(argb >> shifts.g);
out[3] = static_cast<uint8_t>(argb >> shifts.b);
}
void example() {
uint8_t out[4];
encodeArgb<ArgbOrder::BGRA>(12345, out);
}
In this example, we can select the appropriate lookup table at compile time and have zero runtime cost. All that needs to happen at runtime is 4 shifts.
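Since the lookup function is elided above, here is one self-contained way the whole thing might look. This is a sketch under my own assumptions: the input is packed as 0xAARRGGBB, and the hypothetical Shifts table stores one shift per output byte (a different layout from ChannelOffsets above).

```cpp
#include <cassert>
#include <cstdint>

enum class ArgbOrder { ARGB, RGBA, BGRA };

// Hypothetical table: s[i] is the right shift that yields output byte i.
struct Shifts { unsigned s[4]; };

// Assumes the input is packed as 0xAARRGGBB (A at bit 24, B at bit 0).
constexpr Shifts shiftsFor(ArgbOrder order) {
    return order == ArgbOrder::ARGB ? Shifts{{24, 16, 8, 0}}
         : order == ArgbOrder::RGBA ? Shifts{{16, 8, 0, 24}}
         :                            Shifts{{0, 8, 16, 24}};  // BGRA
}

template <ArgbOrder order>
void encodeArgb(std::uint32_t argb, std::uint8_t out[4]) {
    // Evaluated at compile time, once per instantiation.
    constexpr Shifts sh = shiftsFor(order);
    for (int i = 0; i < 4; ++i) {
        out[i] = static_cast<std::uint8_t>(argb >> sh.s[i]);
    }
}
```

For example, encodeArgb<ArgbOrder::BGRA>(0x11223344u, out) writes the bytes 0x44, 0x33, 0x22, 0x11.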
Feature Toggles
We can use bool template parameters to toggle features in our code, for example:
template <bool handleZeroSpecially>
int div(int x, int y) {
if constexpr (handleZeroSpecially) {
return y == 0 ? 0 : x / y;
}
else {
return x / y;
}
}
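The code above needs C++17 for if constexpr. If that is not available, roughly the same toggle can be sketched with overload selection on std::integral_constant. This is a different technique (tag dispatch), and the names checked_div and div_impl are hypothetical; a different name also avoids clashing with the C library's div.

```cpp
#include <cassert>
#include <type_traits>

namespace detail {
// Chosen when the toggle is on: division by zero yields 0.
inline int div_impl(int x, int y, std::true_type)  { return y == 0 ? 0 : x / y; }
// Chosen when the toggle is off: plain division, no extra branch.
inline int div_impl(int x, int y, std::false_type) { return x / y; }
}

template <bool handleZeroSpecially>
int checked_div(int x, int y) {
    // The tag type is fixed at compile time, so overload resolution
    // picks the implementation with no runtime test of the flag.
    return detail::div_impl(x, y,
                            std::integral_constant<bool, handleZeroSpecially>{});
}
```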

How do you assign array values to class member array in constructor, without using the std library?

Trying to understand the proper way to copy values into a class member array. Currently, I take each value of the array and copy them into the corresponding element of the member array:
struct IPAddress
{
IPAddress(const unsigned char values[4]) :
values{values[0], values[1], values[2], values[3]}
{
}
const unsigned char values[4];
};
int main(int argc, char** argv)
{
unsigned char values[] = {10, 0, 0, 1};
IPAddress address(values);
return 0;
}
This works, but is there a way to "automagically" copy all the values in the constructor? I mean, what would I do if the array had 100 elements instead of 4? Or 1000?
I'm aware that I should be using std::array. But since this code is built for a microcontroller, using std library is not really an option.
Any takers?

You should be using std::array. This is one part of the standard library that shouldn't be offensive to embedded programming.
If you don't have access to it, it's not hard to implement a class just like it. It's a straightforward aggregate with saner semantics than raw arrays. It's also likely to be reused, which makes it a good candidate for a utility you should implement.
Failing that, you can rely on delegating c'tors, which I only add here for the intellectual exercise:
struct IPAddress
{
IPAddress(const unsigned char values[4])
: IPAddress(values, std::make_index_sequence<4>{})
{
}
const unsigned char values[4];
private:
template<std::size_t... I>
IPAddress(const unsigned char values[4], std::index_sequence<I...>)
: values{values[I]...}
{
}
};
The key is in the pack expansion values{values[I]...}, which turns into an initializer not unlike your original one. See it live.
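Since std::make_index_sequence lives in <utility>, which may not be available on the target, here is a hedged sketch of the same trick with a hand-rolled index sequence. IndexSeq and MakeIndexSeq are hypothetical names, and taking the array by reference is my change; only <cstddef> is needed for std::size_t.

```cpp
#include <cassert>
#include <cstddef>

// Minimal stand-ins for std::index_sequence / std::make_index_sequence.
template <std::size_t... I> struct IndexSeq {};

// Recursively prepends N-1 until N reaches 0, yielding IndexSeq<0, ..., N-1>.
template <std::size_t N, std::size_t... I>
struct MakeIndexSeq : MakeIndexSeq<N - 1, N - 1, I...> {};

template <std::size_t... I>
struct MakeIndexSeq<0, I...> { using type = IndexSeq<I...>; };

struct IPAddress {
    explicit IPAddress(const unsigned char (&src)[4])
        : IPAddress(src, MakeIndexSeq<4>::type{}) {}

    const unsigned char values[4];

private:
    // The pack expansion src[I]... becomes src[0], src[1], src[2], src[3].
    template <std::size_t... I>
    IPAddress(const unsigned char (&src)[4], IndexSeq<I...>)
        : values{src[I]...} {}
};
```

Changing the 4 to 100 or 1000 needs no other edits, although deep template recursion may hit compiler limits for very large sizes.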

"I'm aware that I should be using std::array. But since this code is built for a microcontroller, using std library is not really an option."
If you don't want to include array, you can still implement your own type for solving your issue:
template<typename T, std::size_t N>
class values_t {
public:
values_t(const T *ptr) {
// copy N elements
for (std::size_t i = 0; i < N; ++i)
value[i] = ptr[i]; // copy element
}
T& operator[](int i) { return value[i]; }
const T& operator[](int i) const { return value[i]; }
private:
typename std::remove_const<T>::type value[N];
};
Then, initializing the values data member of IPAddress becomes much simpler:
struct IPAddress
{
IPAddress(const unsigned char values[4]) :
values{values} {} // <-- copy as a whole
values_t<const unsigned char, 4> values;
};

Access c++ struct attribute like in an array [duplicate]

Let us have a type T and a struct having ONLY uniform elements of type T.
struct Foo {
T one;
T two;
T three;
};
I'd like to access them in the following way:
struct Foo {
T one;
T two;
T three;
T &operator [] (int i)
{
return *(T*)((size_t)this + i * cpp_offsetof(Foo, two));
}
};
where cpp_offsetof macro (it is considered to be correct) is:
#define cpp_offsetof(s, m) (((size_t)&reinterpret_cast<const volatile char&>((((s*)(char*)8)->m))) - 8)
The C++ standard doesn't guarantee it, but can we assume that the members are spaced at a fixed offset and that the above is a correct, cross-platform solution?
A 100% compatible solution would be:
struct Foo {
T one;
T two;
T three;
T &operator [] (int i) {
const size_t offsets[] = { cpp_offsetof(Foo, one), cpp_offsetof(Foo, two), cpp_offsetof(Foo, three) };
return *(T*)((size_t)this + offsets[i]);
}
};
but it requires an extra lookup table, which I'm trying to avoid.
[edit] A standard-compliant and faster version was presented by snk_kid, using pointers to data members. [/edit]
//EDIT
And one more. I cannot use just an array and constants to index these fields, they have to be named fields of a struct (some macro requires that).
//EDIT2
Why do those have to be named fields of a struct? What is the macro? It is the settings system of a bigger project. Simplified, it's something like this:
struct Foo {
int one;
int two;
}
foo;
struct Setting { void *obj; size_t field_offset; const char *name; FieldType type; };
#define SETTING(CLASS, OBJ, FIELD, TYPE) { &OBJ, cpp_offsetof(CLASS, FIELD), #OBJ #FIELD, TYPE }
Setting settings[] = {
SETTING(Foo, foo, one, INT_FIELD),
SETTING(Foo, foo, two, INT_FIELD)
};
And once again: I'm not looking for a 100% compatible solution, but 99%. I'm asking whether we can expect some compilers to put non-uniform padding between uniform fields.

Your code doesn't work with non-POD types, such as those with virtual member functions. There is a standard-compliant (and efficient) way to achieve what you're trying to do, using pointers to data members:
template< typename T >
struct Foo {
typedef size_t size_type;
private:
typedef T Foo<T>::* const vec[3];
static const vec v;
public:
T one;
T two;
T three;
const T& operator[](size_type i) const {
return this->*v[i];
}
T& operator[](size_type i) {
return this->*v[i];
}
};
template< typename T >
const typename Foo<T>::vec Foo<T>::v = { &Foo<T>::one, &Foo<T>::two, &Foo<T>::three };
Just make sure you use const everywhere with the table of pointers to data members to get optimizations. Check here to see what I'm talking about.
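For a quick feel of how this behaves, here is a stripped-down, non-template sketch of the same technique with int members. The alias MemberPtr and the table name members are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

struct Foo {
    int one;
    int two;
    int three;

    // this->*members[i] parses as this->*(members[i]).
    int& operator[](std::size_t i) { return this->*members[i]; }
    const int& operator[](std::size_t i) const { return this->*members[i]; }

private:
    using MemberPtr = int Foo::*;
    static const MemberPtr members[3];  // const table of pointers to data members
};

const Foo::MemberPtr Foo::members[3] = { &Foo::one, &Foo::two, &Foo::three };
```

The fields stay ordinary named members (so offsetof-style macros keep working), and no assumption about padding between them is needed.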

Another way is with template specialization, if what you are trying to achieve is still a compile-time feature.
struct Foo {
T one;
T two;
T three;
};
template <int i> T & get(Foo& foo);
template <> T& get<1>(Foo& foo){ return foo.one;}
template <> T& get<2>(Foo& foo){ return foo.two;}
template <> T& get<3>(Foo& foo){ return foo.three;}
It would be nice to define get as a member function, but member function templates are awkward to specialize (explicit specializations traditionally have to be written at namespace scope, outside the class). Now, if this is only a compile-time expansion you are looking for, then this will avoid the lookup table issue of one of the previous posts. If you need runtime resolution, then you obviously need a lookup table.
--
Brad Phelan
http://xtargets.heroku.com

You might be able to achieve what you want using an array to hold the data (so you can get indexed access without using a lookup table) and having references to the various array elements (so you can have 'named' elements for use by your macros).
I'm not sure what your macros require, so I'm not 100% sure this will work, but it might. Also, I'm not sure that the slight overhead of the lookup table approach is worth jumping through too many hoops to avoid. On the other hand, I don't think the approach I suggest here is any more complex than the table-of-pointers approach, so here it is for your consideration:
#include <stdio.h>
template< typename T >
struct Foo {
private:
T data_[3];
public:
T& one;
T& two;
T& three;
const T& operator[](size_t i) const {
return data_[i];
}
T& operator[](size_t i) {
return data_[i];
}
Foo() :
one( data_[0]),
two( data_[1]),
three( data_[2])
{};
};
int main()
{
Foo<int> foo;
foo[0] = 11;
foo[1] = 22;
foo[2] = 33;
printf( "%d, %d, %d\n", foo.one, foo.two, foo.three);
Foo<int> const cfoo( foo);
printf( "%d, %d, %d\n", cfoo[0], cfoo[1], cfoo[2]);
return 0;
}

You can't, because the compiler can add dead bytes between members for padding.
There are two ways to do what you want.
The first is to use a compiler-specific keyword or pragma that forces the compiler not to add padding bytes. But that is not portable.
That said, it might be the easiest way to do it given your macro requirements, so I suggest you explore this possibility and be prepared to add more pragmas when using different compilers.
The other way is to first make sure your members are aligned, then add accessors:
struct Foo {
T members[ 3 ]; // arrays are guaranteed to be contiguous
T& one() { return members[0]; }
const T& one() const { return members[0]; }
//etc...
};
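Spelled out in full, the accessor approach might look like the sketch below. The operator[] overloads are my addition to show the indexed access the question asks for.

```cpp
#include <cassert>
#include <cstddef>

template <typename T>
struct Foo {
    T members[3];  // array elements are contiguous by definition

    // Named access through accessors instead of named fields.
    T& one()   { return members[0]; }
    T& two()   { return members[1]; }
    T& three() { return members[2]; }
    const T& one()   const { return members[0]; }
    const T& two()   const { return members[1]; }
    const T& three() const { return members[2]; }

    // Indexed access needs no offset arithmetic and no lookup table.
    T& operator[](std::size_t i) { return members[i]; }
    const T& operator[](std::size_t i) const { return members[i]; }
};
```

The catch, as noted in the question, is that macros expecting real named fields (for offsetof) will not work with accessor functions.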

If you're sure the compilers you're using are going to generate the right code for this (and I'd imagine they would, assuming T isn't a reference type anyway) the best thing to do is put in some kind of check that the struct is laid out as you think. I can't think of any particular reason to insert non-uniform padding between adjacent members of the same type, but if you check the struct layout by hand then you'll at least know if it happens.
If the struct (S) has exactly N members of type T, for example, you can check at compile time that they are tightly packed simply using sizeof:
struct S {
T a,b,c;
};
extern const char check_S_size[sizeof(S)==3*sizeof(T)?1:-1];
If this compiles, then they're tightly packed, as there's no space for anything else.
If you just happen to have N members, that you want to ensure are placed directly one after the other, you can do something similar using offsetof:
struct S {
char x;
T a,b,c;
};
extern const char check_b_offset[offsetof(S,b)==offsetof(S,a)+sizeof(T)?1:-1];
extern const char check_c_offset[offsetof(S,c)==offsetof(S,b)+sizeof(T)?1:-1];
Depending on the compiler, this might have to become a runtime check, possibly not using offsetof -- which you might want to do for non-POD types anyway, because offsetof isn't defined for them.
S tmp;
assert(&tmp.b==&tmp.a+1);
assert(&tmp.c==&tmp.b+1);
This doesn't say anything about what to do if the asserts start failing, but you should at least get some warning that the assumptions aren't true...
(By the way, insert appropriate casts to char references and so on where appropriate. I left them out for brevity.)
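With C++11 or later, the negative-array-size trick can be replaced by static_assert, which gives a readable error message. A sketch of the same checks, assuming T is int for the sake of a concrete example:

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

struct S {
    char x;
    int a, b, c;
};

// offsetof is only guaranteed to work for standard-layout types.
static_assert(std::is_standard_layout<S>::value,
              "S must be standard layout for offsetof");

// Fail the build if the compiler put padding between a, b and c.
static_assert(offsetof(S, b) == offsetof(S, a) + sizeof(int),
              "unexpected padding between a and b");
static_assert(offsetof(S, c) == offsetof(S, b) + sizeof(int),
              "unexpected padding between b and c");

// Distance in bytes between two adjacent members, for a runtime sanity check.
inline std::size_t stride_ab() { return offsetof(S, b) - offsetof(S, a); }
```

This only detects a layout problem; as said above, what to do when the check fails is a separate question.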

Is there a standard-compliant way to specify field offsets in C++?

I have a chunk of memory populated by external code which I'm trying to reverse engineer. I don't know the complete structure of this memory, but I do know a few fields (e.g. the chunk starts off with an int32 named 'foo' and there's a double at offset 0xC called 'bar'). I want to define a structure and essentially reinterpret-cast a pointer to this memory chunk to that structure, and have it line up. I'm not sure if there's a more conventional name for this technique but I'll refer to it as creating an 'overlay type'.
Here's a sketch of what I'd like to be able to do:
START_OVERLAY_TYPE(my_type, 0xFF) // struct named my_type, size 0xFF
FIELD(0x00, int32_t foo); // field int32_t foo at 0x00
FIELD(0x0C, double bar); // field double bar at 0x0C
END_OVERLAY_TYPE
Not having to use macros would be a plus, but I don't see a good way around them.
With my current implementation, I expand this to (something like):
__pragma(pack(push, 1))
template<size_t p> struct padding_t { unsigned char pad[p]; };
template<> struct padding_t<0> {};
struct my_type
{
union
{
struct : padding_t<0xFF> {}; // ensure total size is 0xFF
struct : padding_t<0x00> { int32_t foo; }; // field at 0x00
struct : padding_t<0x0C> { double bar; }; // field at 0x0C
};
};
__pragma(pack(pop))
This compiles and works great, at least in the versions I tried of clang, gcc, and VC++ (with appropriate changes to the pragma). Unfortunately, warnings abound due to the non-standard use of anonymous structs.
Is there any way to achieve the same effect while staying within the standard? The requirements are that it be reasonably simple to declare (like the current macro is), and that to the consumer, the usage is indistinguishable from
struct my_type { int32_t foo; double bar; };
at least to the casual observer.
The current code will work for my purposes, I'm just curious if there is a better approach I am overlooking.

You could try something like this with implicit type conversions and assignment operators for the internal struct containing the value. This way instead of using unnamed structs the struct bears the name, but the internals become the unnamed part through operator overloading.
I tried this out with some client code (passing to functions, getting/setting values) and everything seemed fine. It's of course possible that I missed a scenario somewhere.
__pragma(pack(push, 1))
template<size_t p, typename t>
struct padding_t
{
unsigned char pad[p];
t val;
operator t () const {return val;}
operator t& () {return val;}
padding_t<p, t>& operator= (const t& rhs) {val = rhs; return *this;}
};
template<typename t> struct padding_t<0, t>
{
t val;
operator t () const {return val;}
operator t& () {return val;}
padding_t<0, t>& operator= (const t& rhs) {val = rhs; return *this;}
};
template<size_t p>
struct sizing_t
{
unsigned char pad[p];
};
struct my_type
{
union
{
sizing_t<0xFF> size; // ensure total size is 0xFF
padding_t<0x00, int32_t> foo; // field at 0x00
padding_t<0x0C, double> bar; // field at 0x0C
};
};
__pragma(pack(pop))
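If strict standard compliance matters more than field-like syntax, a different technique avoids pragmas and unions altogether: keep the raw buffer and copy each field in and out with std::memcpy at its known offset. The sketch below is not the approach above; read_field, write_field and my_type_view are hypothetical names, and the offsets are the ones from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Copies a T out of the buffer; memcpy sidesteps alignment and aliasing issues.
template <typename T>
T read_field(const void* base, std::size_t offset) {
    static_assert(std::is_trivially_copyable<T>::value, "T must be trivially copyable");
    T value;
    std::memcpy(&value, static_cast<const unsigned char*>(base) + offset, sizeof(T));
    return value;
}

// Copies a T into the buffer at the given byte offset.
template <typename T>
void write_field(void* base, std::size_t offset, const T& value) {
    static_assert(std::is_trivially_copyable<T>::value, "T must be trivially copyable");
    std::memcpy(static_cast<unsigned char*>(base) + offset, &value, sizeof(T));
}

// Thin view over the externally populated chunk (does not own the memory).
struct my_type_view {
    unsigned char* base;

    std::int32_t foo() const { return read_field<std::int32_t>(base, 0x00); }
    double       bar() const { return read_field<double>(base, 0x0C); }
    void set_foo(std::int32_t v) { write_field(base, 0x00, v); }
    void set_bar(double v)       { write_field(base, 0x0C, v); }
};
```

The price is accessor syntax (v.foo() instead of v.foo), so it does not meet the "indistinguishable to the casual observer" requirement.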
