Using a global C++ object from C crashes the application

This is my first post; I'm new to this site, but I've been lurking around for a while now. I've got a good knowledge of C and a very limited knowledge of C++, I guess. I'm on Windows (XP x64) with VS2008.
I'm trying to wrap a C++ library, kdtree2, so that I can use it from C. The main issues relate to accessing the kdtree2 and kdtree2_result_vector classes. As the author's FTP server does not respond, I've uploaded a copy of the original distribution: kdtree2 src
Just some quick info on the kd-tree (a form of binary tree): 'the data' are coordinates in n-dimensional Cartesian space plus an index. It is used for nearest-neighbour searches, so after constructing the tree (which will not be modified), one can query it for various types of nn-searches. The results are returned in a vector of structs (C-like structs).
struct kdtree2_result {
    //
    // the search routines return a (wrapped) vector
    // of these.
    //
public:
    float dis; // its square Euclidean distance
    int   idx; // which neighbor was found
};
My imagined solution is to have an array of kdtree2 objects (one per thread). For the kdtree2_result_vector class I haven't got a solution yet, as I'm not getting past first base. It is not necessary to access the kdtree2 class directly.
I only need to fill it with data and then use it (the second function below is an example of that). For this I've defined:
kdtree2 *global_kdtree2;

extern "C" void new_kdtree2 ( float **data, const int n, const int dim, bool arrange ) {
    multi_array_ref<float,2> kdtree2_data ( ( float * ) &data [ 0 ][ 0 ], extents [ n ][ dim ], c_storage_order ( ) );
    global_kdtree2 = new kdtree2 ( kdtree2_data, arrange );
}
To then use that tree, I've defined:
extern "C" void n_nearest_around_point_kdtree2 ( int idxin, int correltime, int nn ) {
kdtree2_result_vector result;
global_kdtree2->n_nearest_around_point ( idxin, correltime, nn, result );
}
kdtree2_result_vector is derived from the std::vector class. This compiles without error, and the resulting library can be linked and its C functions called from C.
The problem is that the invocation of n_nearest_around_point_kdtree2 crashes the program. I suspect that somewhere between setting up the tree and using it in the second function call, the tree somehow gets freed or destroyed. The calling C test program is posted below:
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <stdbool.h>
#include "kdtree2.h"
#define MALLOC_2D(type,x,y) ((type**)malloc_2D_kdtree2((x),(y),sizeof(type)))
void **malloc_2D_kdtree2 ( const int x, const int y, const int type_size ) {
    const int y_type_size = y * type_size;
    void **x_idx = ( void ** ) malloc ( x * ( sizeof ( void ** ) + y_type_size ) );
    if ( x_idx == NULL )
        return NULL;
    char *y_idx = ( char * ) ( x_idx + x );
    for ( int i = 0; i < x; i++ )
        x_idx [ i ] = y_idx + i * y_type_size;
    return x_idx;
}
int main ( void ) {
    float **data = MALLOC_2D ( float, 100, 3 );
    for ( int i = 0; i < 100; i++ )
        for ( int j = 0; j < 3; j++ )
            data [ i ][ j ] = ( float ) ( 3 * i + j );
    // this works fine
    tnrp ( data, 100, 3, false );
    new_kdtree2 ( data, 100, 3, false );
    // this crashes the program
    n_nearest_around_point_kdtree2 ( 9, 3, 6 );
    delete_kdtree2 ( );
    free ( data );
    return 0;
}
As far as I can see, searching the internet, it should work, but I'm obviously missing something vital in the brave (for me) new world of C++.
EDIT:
Resolution, thanks to larsmans. I've defined the following class (derived from what larsmans posted earlier):
class kdtree {
private:
    float **data;
    multi_array_ref<float,2> data_ref;
    kdtree2 tree;
public:
    kdtree2_result_vector result;

    kdtree ( float **data, int n, int dim, bool arrange ) :
        data_ref ( ( float * ) &data [ 0 ][ 0 ], extents [ n ][ dim ], c_storage_order ( ) ),
        tree ( data_ref, arrange )
    {
    }

    void n_nearest_brute_force ( std::vector<float>& qv ) {
        tree.n_nearest_brute_force ( qv, result ); }
    void n_nearest ( std::vector<float>& qv, int nn ) {
        tree.n_nearest ( qv, nn, result ); }
    void n_nearest_around_point ( int idxin, int correltime, int nn ) {
        tree.n_nearest_around_point ( idxin, correltime, nn, result ); }
    void r_nearest ( std::vector<float>& qv, float r2 ) {
        tree.r_nearest ( qv, r2, result ); }
    void r_nearest_around_point ( int idxin, int correltime, float r2 ) {
        tree.r_nearest_around_point ( idxin, correltime, r2, result ); }
    int r_count ( std::vector<float>& qv, float r2 ) {
        return tree.r_count ( qv, r2 ); }
    int r_count_around_point ( int idxin, int correltime, float r2 ) {
        return tree.r_count_around_point ( idxin, correltime, r2 ); }
};
The code to call these functions from C:
kdtree* global_kdtree2 [ 8 ];

extern "C" void new_kdtree2 ( const int thread_id, float **data, const int n, const int dim, bool arrange ) {
    global_kdtree2 [ thread_id ] = new kdtree ( data, n, dim, arrange );
}

extern "C" void delete_kdtree2 ( const int thread_id ) {
    delete global_kdtree2 [ thread_id ];
}

extern "C" void n_nearest_around_point_kdtree2 ( const int thread_id, int idxin, int correltime, int nn, struct kdtree2_result **result ) {
    global_kdtree2 [ thread_id ]->n_nearest_around_point ( idxin, correltime, nn );
    *result = &( global_kdtree2 [ thread_id ]->result.front ( ) );
}
and finally the C program that starts using it all:
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <stdbool.h>
#include "kdtree2.h"
int main ( void ) {
    float **data = MALLOC_2D ( float, 100, 3 );
    for ( int i = 0; i < 100; i++ )
        for ( int j = 0; j < 3; j++ )
            data [ i ][ j ] = ( float ) ( 3 * i + j );
    int thread_id = 0;
    new_kdtree2 ( thread_id, data, 100, 3, false );
    struct kdtree2_result *result;
    n_nearest_around_point_kdtree2 ( thread_id, 28, 3, 9, &result );
    for ( int i = 0; i < 9; i++ )
        printf ( "result[%d]= (%d,%f)\n", i, result [ i ].idx, result [ i ].dis );
    printf ( "\n" );
    n_nearest_around_point_kdtree2 ( thread_id, 9, 3, 6, &result );
    for ( int i = 0; i < 6; i++ )
        printf ( "result[%d]= (%d,%f)\n", i, result [ i ].idx, result [ i ].dis );
    delete_kdtree2 ( thread_id );
    free ( data );
    return 0;
}

The API docs in the referenced paper are rather flaky and the author's FTP server doesn't respond, so I can't tell with certainty, but my hunch is that
multi_array_ref<float,2> kdtree2_data((float *)&data[0][0], extents[n][dim],
c_storage_order( ));
global_kdtree2 = new kdtree2(kdtree2_data, arrange);
construct the kdtree2 by storing a reference to kdtree2_data in the global_kdtree2 object, rather than making a full copy. Since kdtree2_data is a local variable, it is destroyed when new_kdtree2 returns. You'll have to keep it alive until n_nearest_around_point_kdtree2 is done.
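One minimal way to keep it alive, assuming kdtree2 really does hold only a reference to the multi_array_ref it is given, is to give the view the same lifetime as the tree. A sketch, reusing the globals from the question (the resolution in the question's EDIT folds both into one wrapper class instead, which is cleaner):
// Sketch only: keep the multi_array_ref alive alongside the tree, so the
// reference stored inside kdtree2 never dangles.
multi_array_ref<float,2> *global_data_ref = NULL;

extern "C" void new_kdtree2 ( float **data, const int n, const int dim, bool arrange ) {
    global_data_ref = new multi_array_ref<float,2> (
        ( float * ) &data [ 0 ][ 0 ], extents [ n ][ dim ], c_storage_order ( ) );
    global_kdtree2 = new kdtree2 ( *global_data_ref, arrange );
}

extern "C" void delete_kdtree2 ( ) {
    delete global_kdtree2;
    delete global_data_ref;   // release the view only after the tree is gone
}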

How do I allocate memory-aligned C++ object arrays? [duplicate]

This question already has an answer here:
Parameter "size" of member operator new[] increases if class has destructor/delete[]
(1 answer)
Closed 5 years ago.
I am seeing a problem with operator new[]:
#include <stdlib.h>
#include <stdio.h>
class V4 {
public:
    float v[ 4 ];
    V4() {}
    void *operator new( size_t sz ) { return aligned_alloc( 16, sz ); }
    void *operator new[]( size_t sz ) { printf( "sz: %zu\n", sz ); return aligned_alloc( 16, sz ); }
    void operator delete( void *p, size_t sz ) { free( p ); }
    //void operator delete[]( void *p, size_t sz ) { free( p ); }
};

class W4 {
public:
    float w[ 4 ];
    W4() {}
    void *operator new( size_t sz ) { return aligned_alloc( 16, sz ); }
    void *operator new[]( size_t sz ) { printf( "sz: %zu\n", sz ); return aligned_alloc( 16, sz ); }
    void operator delete( void *p, size_t sz ) { free( p ); }
    void operator delete[]( void *p, size_t sz ) { free( p ); }
};

int main( int argc, char **argv ) {
    printf( "sizeof( V4 ): %zu\n", sizeof( V4 ));
    V4 *p = new V4[ 1 ];
    printf( "p: %p\n", p );
    printf( "sizeof( W4 ): %zu\n", sizeof( W4 ));
    W4 *q = new W4[ 1 ];
    printf( "q: %p\n", q );
    exit(0);
}
Produces:
$ g++ -Wall main.cpp && ./a.out
sizeof( V4 ): 16
sz: 16
p: 0x55be98a10030
sizeof( W4 ): 16
sz: 24
q: 0x55be98a10058
Why does the alloc size increase to 24 when I include the operator delete[]? This is screwing up my aligned malloc.
$ g++ --version
g++ (Debian 7.2.0-18) 7.2.0
From looking at other questions, it seems as though the extra 8 bytes may be being used to store the array size. Even if this is expected behaviour, why is it triggered by operator delete[], and what is the correct procedure for allocating memory-aligned arrays?
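A stripped-down sketch that shows the same cookie without the aligned allocators (the 8-byte prefix and its placement are an implementation detail of the Itanium C++ ABI used by g++ on x86-64, not something the standard guarantees):
#include <cstdio>
#include <new>

// Replace the global operator new[] just to print the requested size.
void *operator new[]( std::size_t sz ) {
    std::printf( "requested: %zu\n", sz );
    return ::operator new( sz );
}

struct Plain    { float v[ 4 ]; };                 // trivially destructible: no cookie needed
struct WithDtor { float v[ 4 ]; ~WithDtor() {} };  // delete[] must know the element count

int main() {
    Plain    *p = new Plain[ 1 ];     // typically requests 16 bytes
    WithDtor *q = new WithDtor[ 1 ];  // typically requests 16 + 8 bytes (the cookie)
    delete[] p;
    delete[] q;
}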
EDIT Thanks, the linked questions appear to be relevant. I still think the question as asked needs an answer, however. It ought to be possible to change the example code to produce memory-aligned arrays without recourse to std::vector, in my opinion. My current thinking is that it will be necessary to allocate a yet larger, 16-byte-aligned block of bytes and return the pointer such that the initial 8 bytes bring the rest of the block onto a 16-byte boundary. The delete[] operator would then have to perform the reverse operation before calling free(). This is pretty disgusting, but I think it is required to satisfy both the calling code (the C++ runtime?), which requires its 8 bytes for size storage, and the use case, which is to get 16-byte aligned Vector4s.
EDIT The linked answer is certainly relevant, but it does not address the problem of ensuring correct memory alignment.
EDIT It looks like this code will do what I want, but I don't like the magic number 8 in delete[]:
#include <stdlib.h>
#include <stdio.h>
#include <stdint.h>
class W16 { public:
float w[ 16 ];
W16() {}
void *operator new( size_t sz ) { return aligned_alloc( 16, sz ); }
void *operator new[]( size_t sz ) {
size_t r = sz % sizeof( W16 );
size_t ofs = sizeof( W16 ) - r;
size_t _sz = sz + ofs;
void *p1 = aligned_alloc( sizeof( W16 ), _sz );
void *p2 = ((uint8_t *) p1) + ofs;
printf( "sizeof( W16 ): %zx, sz: %zx, r: %zx, ofs: %zx, _sz: %zx\np1: %p\np2: %p\n\n", sizeof( W16 ), sz, r, ofs, _sz, p1, p2 );
return p2;
}
void operator delete( void *p, size_t sz ) { free( p ); }
void operator delete[]( void *p, size_t sz ) {
void *p1 = ((int8_t*) p) + 8 - sizeof( W16 );
printf( "\np2: %p\np1: %p", p, p1 );
free( p1 );
}
};
int main( int argc, char **argv ) {
printf( "sizeof( W16 ): %zx\n", sizeof( W16 ));
W16 *q = new W16[ 16 ];
printf( "&q[0]: %p\n", &q[0] );
delete[] q;
}
Output:
$ g++ -Wall main.cpp && ./a.out
sizeof( W16 ): 40
sizeof( W16 ): 40, sz: 408, r: 8, ofs: 38, _sz: 440
p1: 0x559876c68080
p2: 0x559876c680b8
&q[0]: 0x559876c680c0
p2: 0x559876c680b8
p1: 0x559876c68080
EDIT Title changed based on feedback in the comments. I don't think this is a 'duplicate' of the linked answer anymore, though I don't know if I can get that flag removed.
It looks as though this will do for me:
#include <stdlib.h>
#include <stdio.h>
#include <stdint.h>
inline void *array_alloc( size_t sz_obj, size_t sz_req ) {
size_t r = sz_req % sz_obj;
size_t ofs = sz_obj - r;
size_t sz = sz_req + ofs;
void *p1 = aligned_alloc( sz_obj, sz );
void *p2 = (void*) (((uintptr_t ) p1) + ofs);
//printf( "sz_obj: %zx, sz_req: %zx, r: %zx, ofs: %zx, sz: %zx\np1: %p\np2: %p\n\n", sz_obj, sz_req, r, ofs, sz, p1, p2 );
return p2;
}
inline void array_free( size_t sz_obj, void *p2 ) {
void *p1 = (void*) (((uint8_t*)p2) - (((uintptr_t)p2) % sz_obj));
//printf( "\np2: %p\np1: %p", p2, p1 );
free( p1 );
}
class W16 { public:
float w[ 16 ];
W16() {}
void *operator new( size_t sz ) { return aligned_alloc( 16, sz ); }
void *operator new[]( size_t sz ) { return array_alloc( sizeof( W16 ), sz ); }
void operator delete( void *p, size_t sz ) { free( p ); }
void operator delete[]( void *p, size_t sz ) { array_free( sizeof( W16 ), p ); }
};
int main( int argc, char **argv ) {
//printf( "sizeof( W16 ): %zx\n", sizeof( W16 ));
W16 *q = new W16[ 16 ];
printf( "&q[0]: %p\n", &q[0] );
delete[] q;
}
EDIT Thanks to n.m., this code works without a magic number.

cereal + armadillo + json serialization

Does anyone have an example of cereal-based Armadillo matrix serialization to JSON? The binary serialization below seems to be working.
Inside mat_extra_meat.hpp
template<class Archive, class eT>
typename std::enable_if<cereal::traits::is_output_serializable<cereal::BinaryData<eT>, Archive>::value, void>::type
save( Archive & ar, const Mat<eT>& m ) {
    uword n_rows = m.n_rows;
    uword n_cols = m.n_cols;
    ar( n_rows );
    ar( n_cols );
    ar( cereal::binary_data(
        reinterpret_cast< void * const >( const_cast< eT* >( m.memptr() ) ),
        static_cast< std::size_t >( n_rows * n_cols * sizeof( eT ) ) ) );
}

template<class Archive, class eT>
typename std::enable_if<cereal::traits::is_input_serializable<cereal::BinaryData<eT>, Archive>::value, void>::type
load( Archive & ar, Mat<eT>& m ) {
    uword n_rows;
    uword n_cols;
    ar( n_rows );
    ar( n_cols );
    m.resize( n_rows, n_cols );
    ar( cereal::binary_data(
        reinterpret_cast< void * const >( const_cast< eT* >( m.memptr() ) ),
        static_cast< std::size_t >( n_rows * n_cols * sizeof( eT ) ) ) );
}
Test with this:
int main( int argc, char** argv ) {
    arma::mat xx1 = arma::randn( 10, 20 );
    std::ofstream ofs( "test", std::ios::binary );
    cereal::BinaryOutputArchive o( ofs );
    o( xx1 );
    ofs.close();
    // Now load it.
    arma::mat xx2;
    std::ifstream ifs( "test", std::ios::binary );
    cereal::BinaryInputArchive i( ifs );
    i( xx2 );
}
You have two options for JSON serialization: you can take a quick and dirty approach that won't really be human readable, or you can make it human readable at the cost of increased serialization size and time.
For the quick version, you can modify your existing code to use saveBinaryValue and loadBinaryValue, which exist within the text archives of cereal (JSON and XML).
e.g.:
ar.saveBinaryValue( reinterpret_cast<void * const>( const_cast< eT* >( m.memptr() ) ),
static_cast<std::size_t>( n_rows * n_cols * sizeof( eT ) ) );
and similarly for the load.
This will base64 encode your data and write it as a string. You would of course need to specialize the function to only apply to text archives (or just JSON) within cereal.
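For reference, the load side would mirror this with loadBinaryValue. A sketch, assuming the same n_rows/n_cols framing as the binary version above and restricting it to text archives the same way the readable version further down does (you would use either this pair or the readable pair, not both):
template<class Archive, class eT,
         cereal::traits::EnableIf<cereal::traits::is_text_archive<Archive>::value> = cereal::traits::sfinae>
inline void load( Archive & ar, Mat<eT>& m )
{
    uword n_rows;
    uword n_cols;
    ar( n_rows );
    ar( n_cols );
    m.resize( n_rows, n_cols );
    // loadBinaryValue base64-decodes the string that saveBinaryValue wrote
    ar.loadBinaryValue( reinterpret_cast<void *>( m.memptr() ),
                        static_cast<std::size_t>( n_rows * n_cols * sizeof( eT ) ) );
}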
The alternative is to individually serialize each element. You have two choices again here: the first is to serialize as a JSON array (e.g. myarray: [1, 2, 3, 4, 5, ...]), the second as a bunch of individual name-value pairs: "array1" : "1", "array2": "2", ...
The convention in cereal has been to use JSON arrays for dynamically re-sizable containers (e.g. vector), but because we're purely emphasizing readability with this example, I'll use arrays even though your armadillo matrix would not be something you would like users to be able to add or remove elements from using JSON:
namespace arma
{
// Wraps a particular column in a class with its own serialization function.
// This is necessary because cereal expects actual data to follow a size_tag, and can't
// serialize two size_tags back to back without creating a new node (entering a new serialization function).
//
// This wrapper serves the purpose of creating a new node in the JSON serializer and allows us to
// then serialize the size_tag, followed by the actual data
template <class T>
struct ColWrapper
{
ColWrapper(T && m, int c, int nc) : mat(std::forward<T>(m)), col(c), n_cols(nc) {}
T & mat;
int col;
int n_cols;
template <class Archive>
void save( Archive & ar ) const
{
ar( cereal::make_size_tag( mat.n_rows ) );
for( auto iter = mat.begin_col(col), end = mat.end_col(col); iter != end; ++iter )
ar( *iter );
}
template <class Archive>
void load( Archive & ar )
{
cereal::size_type n_rows;
// Test to see if we need to resize the data
ar( cereal::make_size_tag( n_rows ) );
if( mat.n_rows != n_rows )
mat.resize( n_rows, n_cols );
for( auto iter = mat.begin_col(col), end = mat.end_col(col); iter != end; ++iter )
ar( *iter );
}
};
// Convenience function to make a ColWrapper
template<class T> inline
ColWrapper<T> make_col_wrapper(T && t, int c, int nc)
{
return {std::forward<T>(t), c, nc};
}
template<class Archive, class eT, cereal::traits::EnableIf<cereal::traits::is_text_archive<Archive>::value> = cereal::traits::sfinae>
inline void save( Archive & ar, const Mat<eT>& m )
{
// armadillo stored in column major order
uword n_rows = m.n_rows;
uword n_cols = m.n_cols;
// First serialize a size_tag for the number of columns. This will make expect a dynamic
// sized container, which it will output as a JSON array. In reality our container is not dynamic,
// but we're going for readability here.
ar( cereal::make_size_tag( n_cols ) );
for( auto i = 0; i < n_cols; ++i )
// a size_tag must be followed up with actual serializations that create nodes within the JSON serializer
// so we cannot immediately make a size_tag for the number of rows. See ColWrapper for more details
ar( make_col_wrapper(m, i, n_cols) );
}
template<class Archive, class eT, cereal::traits::EnableIf<cereal::traits::is_text_archive<Archive>::value> = cereal::traits::sfinae>
inline void load( Archive & ar, Mat<eT>& m )
{
// We're doing essentially the same thing here, but loading the sizes and performing the resize for the matrix
// within ColWrapper
cereal::size_type n_rows;
cereal::size_type n_cols;
ar( cereal::make_size_tag( n_cols ) );
for( auto i = 0; i < n_cols; ++i )
ar( make_col_wrapper(m, i, n_cols) );
}
} // end namespace arma
Example program to run the above:
int main(int argc, char* argv[])
{
std::stringstream ss;
std::stringstream ss2;
{
arma::mat A = arma::randu<arma::mat>(4, 5);
cereal::JSONOutputArchive ar(ss);
ar( A );
}
std::cout << ss.str() << std::endl;
{
arma::mat A;
cereal::JSONInputArchive ar(ss);
ar( A );
{
cereal::JSONOutputArchive ar2(ss2);
ar2( A );
}
}
std::cout << ss2.str() << std::endl;
return 0;
}
and its output:
{
"value0": [
[
0.786820954867802,
0.2504803406880287,
0.7106712289786555,
0.9466678009609704
],
[
0.019271058195813773,
0.40490214481616768,
0.25131781792803756,
0.02271243862792676
],
[
0.5206431525734917,
0.34467030607918777,
0.27419560360286257,
0.561032100176393
],
[
0.14003945653337478,
0.5438560675050177,
0.5219157100717673,
0.8570772835528213
],
[
0.49977436000503835,
0.4193700240544483,
0.7442805199715539,
0.24916812957858262
]
]
}
{
"value0": [
[
0.786820954867802,
0.2504803406880287,
0.7106712289786555,
0.9466678009609704
],
[
0.019271058195813773,
0.40490214481616768,
0.25131781792803756,
0.02271243862792676
],
[
0.5206431525734917,
0.34467030607918777,
0.27419560360286257,
0.561032100176393
],
[
0.14003945653337478,
0.5438560675050177,
0.5219157100717673,
0.8570772835528213
],
[
0.49977436000503835,
0.4193700240544483,
0.7442805199715539,
0.24916812957858262
]
]
}

Loop fusion in C++ (how to help the compiler?)

I'm trying to understand under what circumstances a C++ compiler is able to perform loop fusion and when it is not.
The following code measures the performance of two different ways to calculate the squares of the doubled values (f(x) = (2*x)^2) of all elements in a vector.
#include <chrono>
#include <iostream>
#include <numeric>
#include <vector>
constexpr int square( int x )
{
return x * x;
}
constexpr int times_two( int x )
{
return 2 * x;
}
// map ((^2) . (^2)) $ [1,2,3]
int manual_fusion( const std::vector<int>& xs )
{
std::vector<int> zs;
zs.reserve( xs.size() );
for ( int x : xs )
{
zs.push_back( square( times_two( x ) ) );
}
return zs[0];
}
// map (^2) . map (^2) $ [1,2,3]
int two_loops( const std::vector<int>& xs )
{
std::vector<int> ys;
ys.reserve( xs.size() );
for ( int x : xs )
{
ys.push_back( times_two( x ) );
}
std::vector<int> zs;
zs.reserve( ys.size() );
for ( int y : ys )
{
zs.push_back( square( y ) );
}
return zs[0];
}
template <typename F>
void test( F f )
{
const std::vector<int> xs( 100000000, 42 );
const auto start_time = std::chrono::high_resolution_clock::now();
const auto result = f( xs );
const auto end_time = std::chrono::high_resolution_clock::now();
const auto elapsed = end_time - start_time;
const auto elapsed_us = std::chrono::duration_cast<std::chrono::microseconds>(elapsed).count();
std::cout << elapsed_us / 1000 << " ms - " << result << std::endl;
}
int main()
{
test( manual_fusion );
test( two_loops );
}
The version with two loops takes about twice as much time as the version with one loop, even with -O3 for GCC and Clang.
Is there a way to allow the compiler to optimize two_loops into being as fast as manual_fusion, without operating in-place in the second loop? The reason I'm asking is that I want to make chained calls to my library FunctionalPlus, like fplus::enumerate(fplus::transform(f, xs));, faster.
You can try modifying your two_loops function as follows:
int two_loops( const std::vector<int>& xs )
{
    std::vector<int> zs;
    zs.reserve( xs.size() );
    for ( int x : xs )
    {
        zs.push_back( times_two( x ) );
    }
    for ( std::size_t i = 0; i < zs.size(); i++ )
    {
        zs[i] = square( zs[i] );
    }
    return zs[0];
}
The point is to avoid allocating memory twice and doing push_back into a second vector.
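The same second pass can also be written with std::transform over the single vector, which states the in-place intent directly. Just a sketch of the same idea (the function name is mine):
#include <algorithm>  // std::transform

int two_loops_one_buffer( const std::vector<int>& xs )
{
    std::vector<int> zs;
    zs.reserve( xs.size() );
    for ( int x : xs )
    {
        zs.push_back( times_two( x ) );
    }
    // second pass rewrites zs in place instead of filling a second vector
    std::transform( zs.begin(), zs.end(), zs.begin(), square );
    return zs[0];
}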

Double probe hash table

I am trying to edit my hash table to form a double hashing class but can't seem to get it right.
I was wondering if anyone had any insight. I was told that all I needed to do was edit findPos(), which now has to generate probes using a new strategy.
I did some research, and it says that in double hashing you would use R - (x mod R), where R is a prime smaller than the table size. So do I make a new rehash function?
Here is my code:
template <typename HashedObj>
class HashTable
{
public:
explicit HashTable( int size = 101 ) : array( nextPrime( size ) )
{ makeEmpty( ); }
bool contains( const HashedObj & x ) const
{
return isActive( findPos( x ) );
}
void makeEmpty( )
{
currentSize = 0;
for( auto & entry : array )
entry.info = EMPTY;
}
bool insert( const HashedObj & x )
{
// Insert x as active
int currentPos = findPos( x );
if( isActive( currentPos ) )
return false;
if( array[ currentPos ].info != DELETED )
++currentSize;
array[ currentPos ].element = x;
array[ currentPos ].info = ACTIVE;
// Rehash;
if( currentSize > array.size( ) / 2 )
rehash( );
return true;
}
bool insert( HashedObj && x )
{
// Insert x as active
int currentPos = findPos( x );
if( isActive( currentPos ) )
return false;
if( array[ currentPos ].info != DELETED )
++currentSize;
array[ currentPos ] = std::move( x );
array[ currentPos ].info = ACTIVE;
// Rehash; see Section 5.5
if( currentSize > array.size( ) / 2 )
rehash( );
return true;
}
bool remove( const HashedObj & x )
{
int currentPos = findPos( x );
if( !isActive( currentPos ) )
return false;
array[ currentPos ].info = DELETED;
return true;
}
enum EntryType { ACTIVE, EMPTY, DELETED };
private:
struct HashEntry
{
HashedObj element;
EntryType info;
HashEntry( const HashedObj & e = HashedObj{ }, EntryType i = EMPTY )
: element{ e }, info{ i } { }
HashEntry( HashedObj && e, EntryType i = EMPTY )
: element{ std::move( e ) }, info{ i } { }
};
vector<HashEntry> array;
int currentSize;
bool isActive( int currentPos ) const
{ return array[ currentPos ].info == ACTIVE; }
int findPos( const HashedObj & x ) const
{
int offset = 1;
int currentPos = myhash( x );
while( array[ currentPos ].info != EMPTY &&
array[ currentPos ].element != x )
{
currentPos += offset; // Compute ith probe
offset += 2;
if( currentPos >= array.size( ) )
currentPos -= array.size( );
}
return currentPos;
}
void rehash( )
{
vector<HashEntry> oldArray = array;
// Create new double-sized, empty table
array.resize( nextPrime( 2 * oldArray.size( ) ) );
for( auto & entry : array )
entry.info = EMPTY;
// Copy table over
currentSize = 0;
for( auto & entry : oldArray )
if( entry.info == ACTIVE )
insert( std::move( entry.element ) );
}
size_t myhash( const HashedObj & x ) const
{
static hash<HashedObj> hf;
return hf( x ) % array.size( );
}
};
I am not sure I understand your code, but let me offer some observations; they should not be considered an answer, but they are longer than what is allowed in a comment.
If you use quadratic probing, then I think that in the method findPos() you should advance currentPos with something like currentPos*currentPos % array.size(). Currently, as I see it, you increase currentPos by one unit (offset is initially 1), then by 2, then by 4, and so on.
Probably you are trying a fast way to compute the quadratic probe. If that is the case, then offset should not be increased by two but multiplied by two; that would be something like offset *= 2. But because you should count the number of collisions, you should increase offset.
Maybe a simpler way would be:
currentPos += 2*offset++ - 1; // fast way of doing quadratic resolution
Your resizing is OK, given that it guarantees that the table will always be at least half empty, and consequently the search for available entries during insertion is guaranteed to succeed.
Good luck
It appears that you want to implement double hashing for probing. This is a technique for resolving collisions by using a second hash of the input key. In the original quadratic function, you continually add an increasing offset to the index value until you find an empty spot in the hash table. The only important difference in a double hashing function would be the value of the offset.
If I were you, I would create a new hash function which is similar to the first one, but I would replace the return statement with return R - ( hf(x) % R ); for a provided R value. Then I would change findPos to set offset equal to this second hash function (Also remember to remove the offset += 2; line because the offset is no longer increasing).
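A sketch of what that could look like (names are illustrative; R is assumed to be an int member of HashTable holding a prime smaller than the current table size, updated whenever rehash() changes the size):
size_t myhash2( const HashedObj & x ) const
{
    static hash<HashedObj> hf;
    return R - ( hf( x ) % R );   // always in [1, R], so the probe always advances
}

int findPos( const HashedObj & x ) const
{
    int offset = myhash2( x );    // step size now comes from the second hash
    int currentPos = myhash( x );
    while( array[ currentPos ].info != EMPTY &&
           array[ currentPos ].element != x )
    {
        currentPos += offset;     // i-th probe: ( h1(x) + i * h2(x) ) mod table size
        if( currentPos >= (int) array.size( ) )
            currentPos -= array.size( );
    }
    return currentPos;
}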

How to optimize out the space a class reference member takes?

template<typename T>
struct UninitializedField
{
    T& X;

    inline UninitializedField( ) : X( *( T* )&DATA )
    {
    }

protected:
    char DATA[ sizeof( T ) ];
};

int main( )
{
    UninitializedField<List<int>> LetsTest;
    printf( "%u, %u\n", sizeof( LetsTest ), sizeof( List<int> ) );
}
I am trying to program a class that wraps an object without it being automatically initialized/constructed.
But when I execute my program the output is:
8, 4
Is there a way to optimize out the dereference needed to get at the object behind X, as well as the space the reference member takes?
template<typename T>
struct UninitializedField {
    __inline UninitializedField( const T &t ) {
        *( ( T* )this ) = t;
    }

    __inline UninitializedField( bool Construct = false, bool Zero = true ) {
        if ( Zero )
            memset( this, 0, sizeof( *this ) );
        if ( Construct )
            *( ( T* )this ) = T( );
    }

    __inline T *operator->( ) {
        return ( T* )this;
    }

    __inline T &operator*( ) {
        return *( ( T* )this );
    }

protected:
    char DATA[ sizeof( T ) ];
};
This way no extra space is taken, and with compiler optimizations on there is no function call overhead.
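A usage sketch of the posted class, substituting std::vector<int> for the question's List<int> (it assumes <cstdio>, <cstring> and <vector> are included, and it leans on the same reinterpret-the-storage trick the class itself uses):
int main( )
{
    UninitializedField<std::vector<int>> v( true );  // Construct = true builds the object in place
    printf( "%zu, %zu\n", sizeof( v ), sizeof( std::vector<int> ) );  // equal sizes: no stored reference
    v->push_back( 1 );             // operator-> treats the storage as a T*
    printf( "%d\n", ( *v )[ 0 ] ); // operator* likewise
}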