SOLVED
I am writing an interface to an existing lib that handles struct bwords (see code below), and would like to offer the possibility to call some check functions on the bword itself, or on a string of bytes (a bword member):
#include <cstdio>
typedef unsigned char byte;
typedef unsigned short ushort;
typedef struct bwordSt { ushort nbLetters; byte *L; } bword;
template<typename T, size_t N>
ushort checkBwL(T (&wL)[N], ushort wSz) {
return 0;
}
ushort checkBwL(const byte* const &wL, ushort wSz) {
return 0;
}
ushort checkBw(const bword &bw) {
return checkBwL(bw.L, bw.nbLetters);
}
int main() {
ushort n;
byte fL[2] = {0, 1};
n = checkBwL(fL, 2); // calls the template function
bword bW = {2, new byte[3]};
bW.L[0] = 0; bW.L[1] = 1; bW.L[2] = 2;
n = checkBwL(bW.L, 3); // calls the non-template function
n = checkBw(bW); // calls the non-template function
return n;
}
The string of bytes can be huge, so I'd like to pass by reference. And I did it.
The only way I found to offer a uniform interface was to duplicate the code of the base check function (checkBwL) in a template (for arrays[byte]) and an overload (for byte*), which is ugly and forces me to maintain two basically identical (big) functions.
Is there any way around this?
SOLUTION
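There is no need for the template function. Just don't forget the const before the & in the parameter declaration const byte* const &wL: a reference to const can bind to the temporary pointer produced when an array decays, whereas a non-const reference cannot, so the one overload accepts both arrays and pointers.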
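A minimal sketch of that single-overload approach. The body of checkBwL here is only a placeholder (it sums the bytes) and is an assumption for illustration, not the real check:

```cpp
#include <cstddef>

typedef unsigned char byte;
typedef unsigned short ushort;
typedef struct bwordSt { ushort nbLetters; byte *L; } bword;

// Single implementation. A reference to const can bind to the temporary
// pointer produced when an array decays, so arrays and pointers both work.
// Placeholder body (assumption): returns the sum of the bytes.
ushort checkBwL(const byte* const &wL, ushort wSz) {
    ushort sum = 0;
    for (std::size_t i = 0; i < wSz; ++i)
        sum = static_cast<ushort>(sum + wL[i]);
    return sum;
}

// The bword overload simply forwards to the same implementation.
ushort checkBw(const bword &bw) {
    return checkBwL(bw.L, bw.nbLetters);
}
```

Note that only the pointer is passed in either case, never the bytes themselves, so taking a plain const byte* by value would be just as cheap.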
The key to success is delegation:
#include <cstdio>
typedef unsigned char byte;
typedef unsigned short ushort;
typedef struct bwordSt { ushort nbLetters; byte *L; } bword;
ushort check_impl(ushort length, const byte* buffer)
{
// do your actual checking here
return 0;
}
template<typename T, size_t N>
auto checkBw(T (&wL)[N], ushort wSz) -> ushort
{
return wSz == N * sizeof(T) // assuming no null terminator
? check_impl(wSz, reinterpret_cast<const byte*>(wL))
: 0; // size mismatch (the original && would also collapse the result to 0 or 1)
}
ushort checkBw(const byte* const &wL, ushort wSz) {
return check_impl(wSz, wL);
}
ushort checkBw(const bword &bw) {
return check_impl(bw.nbLetters, bw.L);
}
int main() {
ushort n;
byte fL[2] = {0, 1};
n = checkBw(fL, 2); // calls the template function
bword bW = {2, new byte[3]};
bW.L[0] = 0; bW.L[1] = 1; bW.L[2] = 2;
n = checkBw(bW.L, 3); // calls the non-template function
n = checkBw(bW); // calls the non-template function
return n;
}
Related
I am looking for a way to use unique_ptr to allocate a structure that contains a char array whose size is set dynamically, in order to support different types of messages.
Assuming:
struct MyMessage
{
uint32_t id;
uint32_t data_size;
char data[4];
};
How can I convert send_message() below to use a smart pointer?
void send_message(void* data, const size_t data_size)
{
const auto message_size = sizeof(MyMessage) - 4 + data_size;
const auto msg = reinterpret_cast<MyMessage*>(new char[message_size]);
msg->id = 3;
msg->data_size = data_size;
memcpy(msg->data, data, data_size);
// Sending the message
// ...
delete[] msg;
}
My attempt to use a smart pointer with the code below does not compile:
const auto message_size = sizeof(MyMessage) - 4 + data_size;
const auto msg = std::unique_ptr<MyMessage*>(new char[message_size]);
Below is a complete working example:
#include <cstdint>
#include <cstring>
#include <iostream>
#include <iterator>
#include <memory>
#include <stdexcept>
using namespace std;
struct MyMessage
{
uint32_t id;
uint32_t data_size;
char data[4];
};
void send_message(void* data, const size_t data_size)
{
const auto message_size = sizeof(MyMessage) - 4 + data_size;
const auto msg = reinterpret_cast<MyMessage*>(new char[message_size]);
if (msg == nullptr)
{
throw std::domain_error("Not enough memory to allocate space for the message to be sent");
}
msg->id = 3;
msg->data_size = data_size;
memcpy(msg->data, data, data_size);
// Sending the message
// ...
delete[] msg;
}
struct MyData
{
int page_id;
char point_name[8];
};
int main()
{
try
{
MyData data{};
data.page_id = 7;
strcpy_s(data.point_name, sizeof(data.point_name), "ab332");
send_message(&data, sizeof(data));
}
catch (std::exception& e)
{
std::cout << "Error: " << e.what() << std::endl;
}
}
The data type that you pass to delete[] needs to match what new[] returns. In your example, you are new[]ing a char[] array, but are then delete[]ing a MyMessage object instead. That will not work.
The simple fix would be to change this line:
delete[] msg;
To this instead:
delete[] reinterpret_cast<char*>(msg);
However, you should use a smart pointer to manage the memory deletion for you. The pointer that you give to std::unique_ptr needs to match the template parameter that you specify. In your example, you are declaring a std::unique_ptr whose template parameter is MyMessage*, so the constructor is expecting a MyMessage**, but you are passing it a char* instead.
Try this instead:
// if this struct is being sent externally, consider
// setting its alignment to 1 byte, and setting the
// size of the data[] member to 1 instead of 4...
struct MyMessage
{
uint32_t id;
uint32_t data_size;
char data[4];
};
void send_message(void* data, const size_t data_size)
{
const auto message_size = offsetof(MyMessage, data) + data_size;
std::unique_ptr<char[]> buffer = std::make_unique<char[]>(message_size);
MyMessage *msg = reinterpret_cast<MyMessage*>(buffer.get());
msg->id = 3;
msg->data_size = data_size;
std::memcpy(msg->data, data, data_size);
// Sending the message
// ...
}
Or this:
using MyMessage_ptr = std::unique_ptr<MyMessage, void(*)(MyMessage*)>;
void send_message(void* data, const size_t data_size)
{
const auto message_size = offsetof(MyMessage, data) + data_size;
MyMessage_ptr msg(
reinterpret_cast<MyMessage*>(new char[message_size]),
[](MyMessage *m){ delete[] reinterpret_cast<char*>(m); }
);
msg->id = 3;
msg->data_size = data_size;
std::memcpy(msg->data, data, data_size);
// Sending the message
// ...
}
This should work, but it is still not clear whether accessing msg->data out of bounds is legal (at least it is no worse than in your original code):
const auto message_size = sizeof(MyMessage) + ( data_size < 4 ? 0 : data_size - 4 );
auto rawmsg = std::make_unique<char[]>( message_size );
auto msg = new (rawmsg.get()) MyMessage;
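If the out-of-bounds question is a concern, one way to sidestep it is to never go through msg->data at all and instead copy the header and the payload into the char buffer separately. This is only a sketch under that assumption; build_message and its parameters are hypothetical names, not part of the original code:

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <memory>

struct MyMessage
{
    std::uint32_t id;
    std::uint32_t data_size;
    char data[4];
};

// Hypothetical helper: builds the byte image of a message in a char buffer
// owned by a unique_ptr. message_size receives the total number of bytes.
std::unique_ptr<char[]> build_message(std::uint32_t id,
                                      const void* payload,
                                      std::size_t payload_size,
                                      std::size_t& message_size)
{
    const std::size_t header_size = offsetof(MyMessage, data);
    message_size = header_size + payload_size;
    auto buffer = std::make_unique<char[]>(message_size);

    // Fill a real MyMessage object, then copy only its header part.
    MyMessage header{};
    header.id = id;
    header.data_size = static_cast<std::uint32_t>(payload_size);
    std::memcpy(buffer.get(), &header, header_size);

    // The payload goes directly behind the header in the raw buffer.
    if (payload_size != 0)
        std::memcpy(buffer.get() + header_size, payload, payload_size);
    return buffer;
}
```

The buffer is released automatically when the returned unique_ptr goes out of scope, and delete[] is applied to the same char[] type that was allocated.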
I'm writing C++98 (sorry), but working with a C library, which has many objects stored in data structures of the form:
struct c_container
{
size_t len;
int data[1];
};
struct c_container *make_container(size_t n)
{
if (n == 0)
return NULL;
struct c_container *rv = (struct c_container *)malloc(sizeof(rv->len) + n*sizeof(rv->data));
rv->len = n;
return rv;
}
I'd like to do C++-style iteration using BOOST_FOREACH, but this doesn't work. (The "old style" of manually calling the range_begin and range_end functions does work).
inline int *range_begin(c_container *c)
{
return c ? &c->data[0] : NULL;
}
inline int *range_end(c_container *c)
{
return c ? &c->data[c->len] : NULL;
}
inline const int *range_begin(const c_container *c)
{
return c ? &c->data[0] : NULL;
}
inline const int *range_end(const c_container *c)
{
return c ? &c->data[c->len] : NULL;
}
namespace boost
{
template<>
struct range_mutable_iterator<c_container *>
{
typedef int *type;
};
template<>
struct range_const_iterator<c_container *>
{
typedef const int *type;
};
}
int main()
{
c_container *coll = make_container(3);
coll->data[0] = 1;
coll->data[1] = 42;
coll->data[2] = -1;
BOOST_FOREACH(int i, coll)
{
std::cout << i << std::endl;
}
}
This is all that should be necessary, according to http://www.boost.org/doc/libs/1_65_1/doc/html/foreach/extensibility.html (and I've tested it with classes)
However, that example uses a class, whereas I'm using a pointer to a class. Based on my investigation, it appears to be using the codepath that is only intended for const char * and const wchar_t *:
In file included from boost-foreach.cpp:6:0:
/usr/include/boost/foreach.hpp: In function ‘bool boost::foreach_detail_::done(const boost::foreach_detail_::auto_any_base&, const boost::foreach_detail_::auto_any_base&, boost::foreach_detail_::type2type<T*, C>*) [with T = c_container, C = mpl_::bool_<false>, const boost::foreach_detail_::auto_any_base& = const boost::foreach_detail_::auto_any_base&]’:
boost-foreach.cpp:65:5: instantiated from here
/usr/include/boost/foreach.hpp:749:57: error: no match for ‘operator!’ in ‘!* boost::foreach_detail_::auto_any_cast [with T = c_container*, C = mpl_::bool_<false>, typename boost::mpl::if_<C, const T, T>::type = c_container*, const boost::foreach_detail_::auto_any_base& = const boost::foreach_detail_::auto_any_base&](((const boost::foreach_detail_::auto_any_base&)((const boost::foreach_detail_::auto_any_base*)cur)))’
/usr/include/boost/foreach.hpp:749:57: note: candidate is: operator!(bool) <built-in>
Is there some additional boost trait to specialize or something?
It seems to be difficult to define the range functions for pointer types. But you can define them for c_container directly. The code looks like this:
#include <cstdlib>
#include <iostream>
#include <boost/foreach.hpp>
struct c_container
{
size_t len;
int data[1];
};
struct c_container *make_container(size_t n)
{
if (n == 0)
return NULL;
struct c_container *rv = (struct c_container *)malloc(sizeof(rv->len) + n * sizeof(rv->data));
rv->len = n;
return rv;
}
inline int *range_begin(c_container &c)
{
return c.len > 0 ? &c.data[0] : NULL;
}
inline int *range_end(c_container &c)
{
return c.len > 0 ? &c.data[c.len] : NULL;
}
inline const int *range_begin(const c_container &c)
{
return c.len > 0 ? &c.data[0] : NULL;
}
inline const int *range_end(const c_container &c)
{
return c.len > 0 ? &c.data[c.len] : NULL;
}
namespace boost
{
template<>
struct range_mutable_iterator<c_container>
{
typedef int *type;
};
template<>
struct range_const_iterator<c_container>
{
typedef const int *type;
};
}
#define MY_FOREACH(x, y) BOOST_FOREACH(x, *(y))
int main()
{
c_container *coll = make_container(3);
coll->data[0] = 1;
coll->data[1] = 42;
coll->data[2] = -1;
//BOOST_FOREACH(int i, *coll)
MY_FOREACH(int i, coll)
{
std::cout << i << std::endl;
}
}
Note that the BOOST_FOREACH loop does not iterate over a pointer type. As a workaround you may define your own FOREACH that does so as shown in the code above.
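An alternative to the macro is a small non-owning wrapper around the pointer that looks like an STL-style range. The sketch below is plain C++98 and does not use Boost; c_container_range and sum are hypothetical names. Boost.Range is documented to accept types with nested iterator typedefs and begin()/end(), so a view like this may also be usable as BOOST_FOREACH(int i, c_container_range(coll)), but that is an assumption that has not been verified here:

```cpp
#include <cstddef>
#include <cstdlib>

struct c_container
{
    std::size_t len;
    int data[1];
};

// Hypothetical wrapper (not part of Boost or the C library): a non-owning
// view exposing the iterator typedefs and begin()/end() of an STL-like range.
class c_container_range
{
public:
    typedef int *iterator;
    typedef const int *const_iterator;

    explicit c_container_range(c_container *c) : c_(c) {}

    // A NULL container is treated as an empty range.
    int *begin() const { return c_ ? c_->data : NULL; }
    int *end() const { return c_ ? c_->data + c_->len : NULL; }

private:
    c_container *c_;
};

// Plain C++98 iteration over the view; returns the sum of the elements.
int sum(c_container *c)
{
    c_container_range r(c);
    int total = 0;
    for (c_container_range::iterator it = r.begin(); it != r.end(); ++it)
        total += *it;
    return total;
}
```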
I am trying to code a C++ implementation of a Bloom filter using the MurmurHash3 hash function. My implementation is based on this site: http://blog.michaelschmatz.com/2016/04/11/how-to-write-a-bloom-filter-cpp/
Somehow, in my BloomFilter header file, the hash function produces an incomplete type error. Also, when I use the hash function inside the add function, I get a "hash is ambiguous" error.
What can I do to fix this? I am somewhat new to C++ so I'm not exactly sure if I am using the interface/implementation of a structure correctly.
I am also using a main function that will include this file and run some tests to analyze the false positive rate, number of bits, filter size, etc.
#ifndef BLOOM_FILTER_H
#define BLOOM_FILTER_H
#include "MurmurHash3.h"
#include <vector>
//basic structure of a bloom filter object
struct BloomFilter {
BloomFilter(uint64_t size, uint8_t numHashes);
void add(const uint8_t *data, std::size_t len);
bool possiblyContains(const uint8_t *data, std::size_t len) const;
private:
uint8_t m_numHashes;
std::vector<bool> m_bits;
};
//Bloom filter constructor
BloomFilter::BloomFilter(uint64_t size, uint8_t numHashes)
: m_bits(size),
m_numHashes(numHashes) {}
//Hash array created using the MurmurHash3 code
std::array<uint64_t, 2> hash(const uint8_t *data, std::size_t len)
{
std::array<uint64_t, 2> hashValue;
MurmurHash3_x64_128(data, len, 0, hashValue.data());
return hashValue;
}
//Hash array created using the MurmurHash3 code
inline uint64_t nthHash(uint8_t n,
uint64_t hashA,
uint64_t hashB,
uint64_t filterSize) {
return (hashA + n * hashB) % filterSize;
}
//Adds an element to the array
void BloomFilter::add(const uint8_t *data, std::size_t len) {
auto hashValues = hash(data, len);
for (int n = 0; n < m_numHashes; n++)
{
m_bits[nthHash(n, hashValues[0], hashValues[1], m_bits.size())] = true;
}
}
//Returns true or false based on a probabilistic assessment of the array using MurmurHash3
bool BloomFilter::possiblyContains(const uint8_t *data, std::size_t len) const {
auto hashValues = hash(data, len);
for (int n = 0; n < m_numHashes; n++)
{
if (!m_bits[nthHash(n, hashValues[0], hashValues[1], m_bits.size())])
{
return false;
}
}
return true;
}
#endif
If your MurmurHash3_x64_128 returns two 64-bit numbers as a hash value, I'd treat that as 4 distinct uint32_t hashes, as long as you don't need more than 4 billion bits in your bit string. Most likely you don't need more than 2-3 hashes, but that depends on your use case. To figure out how many hashes you need, you can check "How many hash functions does my bloom filter need?".
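For a rough idea of the numbers involved, the standard Bloom filter sizing formulas can be sketched as follows: for n elements and a target false positive rate p, the bit count is m = -n ln(p) / (ln 2)^2 and the hash count is k = (m / n) ln 2. The function names optimalBits and optimalHashes are made up for this illustration:

```cpp
#include <cmath>
#include <cstddef>

// Bits needed for n elements at a target false positive rate p:
//   m = -n * ln(p) / (ln 2)^2, rounded up.
std::size_t optimalBits(std::size_t n, double p)
{
    const double ln2 = std::log(2.0);
    const double m = -static_cast<double>(n) * std::log(p) / (ln2 * ln2);
    return static_cast<std::size_t>(std::ceil(m));
}

// Number of hash functions for m bits and n elements:
//   k = (m / n) * ln 2, rounded to the nearest integer, at least 1.
unsigned optimalHashes(std::size_t m, std::size_t n)
{
    const double k = static_cast<double>(m) / static_cast<double>(n) * std::log(2.0);
    const long rounded = std::lround(k);
    return rounded < 1 ? 1u : static_cast<unsigned>(rounded);
}
```

For example, 1000 elements at a 1% false positive rate come out at 9586 bits and 7 hash functions with these formulas; using fewer hashes than that is possible but raises the false positive rate for the same number of bits.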
Using MurmurHash3_x64_128, I'd do it this way (if I were to treat it as 4 x uint32_t hashes):
void BloomFilter::add(const uint8_t *data, std::size_t len) {
auto hashValues = hash(data, len);
uint32_t* hx = reinterpret_cast<uint32_t*>(&hashValues[0]);
assert(m_numHashes <= 4);
for (int n = 0; n < m_numHashes; n++)
m_bits[hx[n] % m_bits.size()] = true;
}
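The reinterpret_cast above reads uint64_t storage through a uint32_t pointer, which falls outside the strict aliasing rules and also makes the order of the four values depend on byte order. A sketch of an alternative that extracts the same four 32-bit values with shifts (splitHash is a hypothetical helper name):

```cpp
#include <array>
#include <cstdint>

// Splits the two 64-bit halves of a 128-bit hash into four 32-bit values
// using shifts, so no pointer casts are involved and the result does not
// depend on the byte order of the machine.
std::array<std::uint32_t, 4> splitHash(const std::array<std::uint64_t, 2> &h)
{
    return {{
        static_cast<std::uint32_t>(h[0]),        // low half of h[0]
        static_cast<std::uint32_t>(h[0] >> 32),  // high half of h[0]
        static_cast<std::uint32_t>(h[1]),        // low half of h[1]
        static_cast<std::uint32_t>(h[1] >> 32)   // high half of h[1]
    }};
}
```

Inside add, the loop body would then index the returned array instead of hx.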
Your code has a few issues, which is why it didn't compile:
missing #include <array>
use size_t for sizes (it may be a 32-bit or 64-bit unsigned integer, depending on the platform)
it's better to rename your hash function to something else (e.g. myhash) and make it static; the name hash most likely clashes with std::hash if using namespace std is in effect, which would explain the "ambiguous" error.
Here's a version of your code with these corrections, which should work:
#ifndef BLOOM_FILTER_H
#define BLOOM_FILTER_H
#include "MurmurHash3.h"
#include <vector>
#include <array>
//basic structure of a bloom filter object
struct BloomFilter {
BloomFilter(size_t size, uint8_t numHashes);
void add(const uint8_t *data, std::size_t len);
bool possiblyContains(const uint8_t *data, std::size_t len) const;
private:
uint8_t m_numHashes;
std::vector<bool> m_bits;
};
//Bloom filter constructor
BloomFilter::BloomFilter(size_t size, uint8_t numHashes)
: m_numHashes(numHashes), // initializers listed in declaration order
m_bits(size) {}
//Hash array created using the MurmurHash3 code
static std::array<uint64_t, 2> myhash(const uint8_t *data, std::size_t len)
{
std::array<uint64_t, 2> hashValue;
MurmurHash3_x64_128(data, len, 0, hashValue.data());
return hashValue;
}
//Returns the n-th hash value, derived from the two base hashes (double hashing)
inline size_t nthHash(int n,
uint64_t hashA,
uint64_t hashB,
size_t filterSize) {
return (hashA + n * hashB) % filterSize; // <- not sure if that is OK, perhaps it is.
}
//Adds an element to the array
void BloomFilter::add(const uint8_t *data, std::size_t len) {
auto hashValues = myhash(data, len);
for (int n = 0; n < m_numHashes; n++)
{
m_bits[nthHash(n, hashValues[0], hashValues[1], m_bits.size())] = true;
}
}
//Returns true or false based on a probabilistic assessment of the array using MurmurHash3
bool BloomFilter::possiblyContains(const uint8_t *data, std::size_t len) const {
auto hashValues = myhash(data, len);
for (int n = 0; n < m_numHashes; n++)
{
if (!m_bits[nthHash(n, hashValues[0], hashValues[1], m_bits.size())])
{
return false;
}
}
return true;
}
#endif
If you are just starting with C++, begin with a basic example, perhaps using std::hash. Create a working implementation first, then extend it with an optional hash function parameter. If you need your BloomFilter to be fast, I'd probably stay away from vector<bool> and use an array of unsigned ints instead.
A basic implementation could look something like this, provided that you have MurmurHash3 implemented:
uint32_t MurmurHash3(const char *str, size_t len);
class BloomFilter
{
public:
BloomFilter(int count_elements = 0, double bits_per_element = 10)
{
mem = NULL;
init(count_elements, bits_per_element);
}
~BloomFilter()
{
delete[] mem;
}
void init(int count_elements, double bits_per_element)
{
assert(!mem);
sz = (uint32_t)(count_elements*bits_per_element + 0.5);
mem = new uint8_t[sz / 8 + 8](); // value-initialized so all bits start at zero
}
void add(const std::string &str)
{
add(str.data(), str.size());
}
void add(const char *str, size_t len)
{
if (len <= 0)
return;
add(MurmurHash3(str, len));
}
bool test(const std::string &str)
{
return test(str.data(), str.size());
}
bool test(const char *str, size_t len)
{
return test_hash(MurmurHash3(str, len));
}
bool test_hash(uint32_t h)
{
h %= sz;
if (0 != (mem[h / 8] & (1u << (h % 8))))
return true;
return false;
}
int mem_size() const
{
return (sz + 7) / 8;
}
private:
void add(uint32_t h)
{
h %= sz;
mem[h / 8] |= (1u << (h % 8));
}
public:
uint32_t sz;
uint8_t *mem;
};
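To illustrate the earlier suggestion of starting with std::hash and a plain array of unsigned integers, here is a minimal sketch. SimpleBloomFilter is a hypothetical class, and the way the second hash is derived from the first (a bit-mixing step) is an arbitrary choice for this example rather than a recommendation:

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

// Hypothetical minimal filter: std::hash instead of MurmurHash3, and 64-bit
// words instead of std::vector<bool>.
class SimpleBloomFilter
{
public:
    SimpleBloomFilter(std::size_t bits, unsigned hashes)
        : bits_(bits == 0 ? 1 : bits),
          hashes_(hashes),
          words_((bits_ + 63) / 64, 0) {}

    void add(const std::string &s)
    {
        for (unsigned i = 0; i < hashes_; ++i) {
            const std::size_t b = bitIndex(s, i);
            words_[b / 64] |= std::uint64_t(1) << (b % 64);
        }
    }

    bool possiblyContains(const std::string &s) const
    {
        for (unsigned i = 0; i < hashes_; ++i) {
            const std::size_t b = bitIndex(s, i);
            if ((words_[b / 64] & (std::uint64_t(1) << (b % 64))) == 0)
                return false;
        }
        return true;
    }

private:
    // Derives the i-th bit position from one std::hash value by double
    // hashing: (h1 + i * h2) % bits. h2 is a bit-mixed copy of h1.
    std::size_t bitIndex(const std::string &s, unsigned i) const
    {
        const std::uint64_t h1 = std::hash<std::string>()(s);
        std::uint64_t h2 = h1 ^ (h1 >> 33);
        h2 *= 0xff51afd7ed558ccdULL;
        h2 ^= h2 >> 33;
        return static_cast<std::size_t>((h1 + i * h2) % bits_);
    }

    std::size_t bits_;
    unsigned hashes_;
    std::vector<std::uint64_t> words_;
};
```

Once this works, the hash could be made a template or constructor parameter so that MurmurHash3 can be swapped in later.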
I am aligning several arrays in order and performing some sort of classification. I created an array to hold other arrays in order to simplify the operations that I want to perform.
Sadly, my program crashed when I ran it. After debugging, I finally realized that the sizeof operator is giving me the sizes of pointers, not arrays, within the loop. So I resorted to the cumbersome solution and my program worked.
How can I avoid this cumbersome method? I want to calculate within a loop!
#include <cstdio>
#include <cstring>
#include <iostream>
#include <string>
#define ARRSIZE(X) (sizeof(X) / sizeof(*(X)))
int classify(const char *asset, const char ***T, size_t T_size, size_t *index);
int main(void)
{
const char *names[] = { "book","resources","vehicles","buildings" };
const char *books[] = { "A","B","C","D" };
const char *resources[] = { "E","F","G" };
const char *vehicles[] = { "H","I","J","K","L","M" };
const char *buildings[] = { "N","O","P","Q","R","S","T","U","V" };
const char **T[] = { books,resources,vehicles,buildings };
size_t T_size = sizeof(T) / sizeof(*T);
size_t n, *index = new size_t[T_size];
/* This will yield the size of pointers, not arrays...
for (n = 0; n < T_size; n++) {
index[n] = ARRSIZE(T[n]);
}
*/
/* Cumbersome solution */
index[0] = ARRSIZE(books);
index[1] = ARRSIZE(resources);
index[2] = ARRSIZE(vehicles);
index[3] = ARRSIZE(buildings);
const char asset[] = "L";
int i = classify(asset, T, T_size, index);
if (i < 0) {
printf("asset is alien !!!\n");
}
else {
printf("asset ---> %s\n", names[i]);
}
delete[] index;
return 0;
}
int classify(const char *asset, const char ***T, size_t T_size, size_t *index)
{
size_t x, y;
for (x = 0; x < T_size; x++) {
for (y = 0; y < index[x]; y++) {
if (strcmp(asset, T[x][y]) == 0) {
return x;
}
}
}
return -1;
}
As you are including <string> and <iostream> I assume that the question is about C++ and not C. To avoid all this complication, simply use containers. E.g:
#include <vector>
std::vector<int> vect = std::vector<int>(3,0);
std::cout << vect.size() << std::endl; // prints 3
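Applied to the question, a container-based version might look like the sketch below. It assumes C++11 for the brace initialization; makeGroups is a hypothetical helper, and classify is rewritten here to take vectors instead of raw pointers:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Returns the index of the first group containing asset, or -1 if none does.
// Each inner vector knows its own size, so no separate index array is needed.
int classify(const std::string &asset,
             const std::vector<std::vector<std::string> > &groups)
{
    for (std::size_t x = 0; x < groups.size(); ++x)
        for (std::size_t y = 0; y < groups[x].size(); ++y)
            if (groups[x][y] == asset)
                return static_cast<int>(x);
    return -1;
}

// Builds the same four tables as in the question.
std::vector<std::vector<std::string> > makeGroups()
{
    return {
        { "A", "B", "C", "D" },
        { "E", "F", "G" },
        { "H", "I", "J", "K", "L", "M" },
        { "N", "O", "P", "Q", "R", "S", "T", "U", "V" }
    };
}
```

The returned index can be used with the names array from the question exactly as before.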
One solution, if you are coding in C, is to terminate your array with a special item, like NULL:
const char *books[] = { "A","B","C","D", NULL };
size_t size(const char *arr[])
{
const char **p = arr;
while (*p)
{
p++;
}
return p - arr;
}
You can specify the array sizes explicitly:
size_t n, index[] = {ARRSIZE(books), ARRSIZE(resources), ARRSIZE(vehicles), ARRSIZE(buildings)};
or, if you want to avoid typing everything twice, you can use X-macros to roll out everything:
#define TBL \
X(books) \
X(resources) \
X(vehicles) \
X(buildings)
const char **T[] = {
#define X(x) x,
TBL
};
#undef X
size_t n, index[] = {
#define X(x) ARRSIZE(x),
TBL
};
#undef X
which produces the same result.
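In C++ there is also a macro-free variant: a function template that takes each array by reference can deduce its length before the array decays to a pointer. The sketch below stores the pointer and the deduced count together; Group and makeGroup are hypothetical names, and classify is adapted to take the combined table:

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper type: a pointer to the first string plus the count.
struct Group
{
    const char *const *items;
    std::size_t count;
};

// The array is taken by reference, so N is deduced before any decay.
template <std::size_t N>
Group makeGroup(const char *(&arr)[N])
{
    Group g = { arr, N };
    return g;
}

// Same search as in the question, but sizes travel with the pointers.
int classify(const char *asset, const Group *groups, std::size_t groupCount)
{
    for (std::size_t x = 0; x < groupCount; ++x)
        for (std::size_t y = 0; y < groups[x].count; ++y)
            if (std::strcmp(asset, groups[x].items[y]) == 0)
                return static_cast<int>(x);
    return -1;
}
```

A table is then written once, for example Group T[] = { makeGroup(books), makeGroup(resources), makeGroup(vehicles), makeGroup(buildings) };, and the sizes can be read from T[n].count inside a loop.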
union LowLevelNumber
{
unsigned int n;
struct
{
unsigned int lowByte : 8;
unsigned int highByte : 8;
unsigned int upperLowByte : 8;
unsigned int upperHighByte : 8;
} bytes;
struct
{
unsigned int lowWord : 16;
unsigned int highWord : 16;
} words;
};
This union allows me to access the unsigned integer byte or word-wise.
However, the code looks rather ugly:
var.words.lowWord = 0x66;
Is there a way which would allow me to write code like this:
var.lowWord = 0x66;
Update:
This is really about writing short / beautiful code as in the example above. The union solution itself does work, I just don't want to write .words or .bytes every time I access lowWord or lowByte.
union LowLevelNumber {
unsigned int n;
struct {
unsigned int lowByte : 8;
unsigned int highByte : 8;
unsigned int upperLowByte : 8;
unsigned int upperHighByte : 8;
};
struct {
unsigned int lowWord : 16;
unsigned int highWord : 16;
};
};
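Note the removed bytes and words names. Anonymous structs like these are standard in C11 but only a compiler extension in C++ (GCC, Clang and MSVC accept them).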
C++
Would http://www.cplusplus.com/reference/stl/bitset/ serve your needs?
Plain C version would look something like this:
uint32_t foo; // from <stdint.h>
//...
//Set to 0x66 at the low byte
foo &= 0xffffff00;
foo |= 0x66;
This is probably going to be more maintainable down the road than writing a custom class/union, because it follows the typical C idiom.
You can make
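unsigned short& loword() { return *reinterpret_cast<unsigned short*>(&m_source); }
and use it if you don't mind the parentheses. Note that this yields the low word only on little-endian targets and sidesteps the strict aliasing rules.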
Or you can go fancy
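class lowordaccess
{
unsigned int* m_source; // points at the number being accessed
public:
lowordaccess() : m_source(0) {}
void assign(unsigned int& source) { m_source = &source; }
// writes the low 16 bits of the referenced number
lowordaccess& operator=(unsigned short value)
{
*m_source = (*m_source & 0xFFFF0000u) | value;
return *this;
}
// reads the low 16 bits of the referenced number
operator unsigned short() const { return *m_source & 0xFFFFu; }
};
and then
struct LowLevelNumber
{
// note: copying a LowLevelNumber would need a custom copy constructor
LowLevelNumber() : number(0) { loword.assign(number); }
unsigned int number;
lowordaccess loword;
};
LowLevelNumber var;
var.loword = 1;
short n = var.loword;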
The latter technique is a known way of emulating properties in C++.
You could easily wrap that in a class and use get/set accessors.
Using a union for this is bad, because it is not portable w.r.t. endianness.
Use accessor functions and implement them with bit masks and shifts.
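A minimal sketch of such accessors, assuming a 32-bit value. The class and member function names below are made up for illustration and only a few of the possible accessors are shown:

```cpp
#include <cstdint>

// Hypothetical wrapper: stores one 32-bit value. All accessors use masks and
// shifts, so the results do not depend on the byte order of the machine.
class LowLevelNumber
{
public:
    explicit LowLevelNumber(std::uint32_t value = 0) : n_(value) {}

    std::uint32_t value() const { return n_; }

    std::uint16_t lowWord() const { return static_cast<std::uint16_t>(n_ & 0xFFFFu); }
    std::uint16_t highWord() const { return static_cast<std::uint16_t>(n_ >> 16); }
    std::uint8_t lowByte() const { return static_cast<std::uint8_t>(n_ & 0xFFu); }

    // Each setter clears the target bits, then ORs in the new value.
    void setLowWord(std::uint16_t w) { n_ = (n_ & 0xFFFF0000u) | w; }
    void setHighWord(std::uint16_t w)
    {
        n_ = (n_ & 0x0000FFFFu) | (static_cast<std::uint32_t>(w) << 16);
    }
    void setLowByte(std::uint8_t b) { n_ = (n_ & 0xFFFFFF00u) | b; }

private:
    std::uint32_t n_;
};
```

The call site becomes var.setLowWord(0x66); instead of var.lowWord = 0x66;, which is slightly longer than the union syntax but behaves the same on every platform.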