int24 - 24 bit integral datatype - c++

Is there a 24Bit primitive integral datatype in C++?
If there is none, would it be possible to create a class int24 (or uint24)?
Its purpose could be:
manipulating soundfiles in 24 bit format
manipulating bitmapdata without alphachannel

Depending on the requirements I'd use a bitfield for it.
struct int24{
unsigned int data : 24;
};
Or, if a separation is easier, just use 3 bytes (chars).
Btw, both use cases you mention in the question generally use 32bit integers. In the case of audio processing you'll generally convert to 32 bit ints (or floats, preferably, to prevent overflow situations you'd get with fixed point or integer math) when loading in chunks of audio because you're not going to have the entire file in memory at once.
For image data, people tend to just use 32-bit integers and ignore the 8 alpha bits altogether, or if you're dealing with a tightly packed format you're probably better off manipulating the data through char pointers anyway, because you'll have all channels separate. It's a performance/memory trade-off either way: writing one int is generally faster than writing three chars separately, but it takes about 33% more memory (4 bytes per pixel instead of 3).
Packing structs like this is compiler specific. However, in Visual Studio you'd do the following to make the struct exactly 24 bits.
#pragma pack(push, 1)
struct int24{
unsigned int data : 24;
};
#pragma pack(pop)
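The conversion to 32-bit integers (or floats) mentioned above can be sketched as follows. This is a minimal illustration rather than code from the answer: it assumes little-endian byte order, as in WAV-style PCM data, and the names load_int24_le and int24_to_float are invented for the example.

```cpp
#include <cstdint>

// Hypothetical helper: read one signed 24-bit little-endian sample
// from three bytes and widen it to 32 bits.
inline std::int32_t load_int24_le(const unsigned char* p)
{
    std::int32_t v = static_cast<std::int32_t>(p[0])
                   | (static_cast<std::int32_t>(p[1]) << 8)
                   | (static_cast<std::int32_t>(p[2]) << 16);
    if (v & 0x800000)      // bit 23 is the sign bit of a 24-bit value
        v -= 0x1000000;    // sign-extend by subtracting 2^24
    return v;
}

// Hypothetical helper: scale a 24-bit sample to the range [-1.0, 1.0).
inline float int24_to_float(std::int32_t v)
{
    return static_cast<float>(v) / 8388608.0f;   // divide by 2^23
}
```

Subtracting 2^24 instead of OR-ing in 0xFF000000 keeps every intermediate value inside the range of a signed 32-bit integer, so the sketch does not rely on implementation-defined conversions.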

I wrote this to help me with audio manipulation. It's not the fastest but it works for me :)
const int INT24_MAX = 8388607;
class Int24
{
protected:
unsigned char m_Internal[3];
public:
Int24()
{
}
Int24( const int val )
{
*this = val;
}
Int24( const Int24& val )
{
*this = val;
}
operator int() const
{
if ( m_Internal[2] & 0x80 ) // Is this a negative? Then we need to sign-extend.
{
return (0xff << 24) | (m_Internal[2] << 16) | (m_Internal[1] << 8) | (m_Internal[0] << 0);
}
else
{
return (m_Internal[2] << 16) | (m_Internal[1] << 8) | (m_Internal[0] << 0);
}
}
operator float() const
{
return (float)this->operator int();
}
Int24& operator =( const Int24& input )
{
m_Internal[0] = input.m_Internal[0];
m_Internal[1] = input.m_Internal[1];
m_Internal[2] = input.m_Internal[2];
return *this;
}
Int24& operator =( const int input )
{
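// Copies the first three bytes in memory; these are the low-order bytes only on little-endian targets.
m_Internal[0] = ((unsigned char*)&input)[0];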
m_Internal[1] = ((unsigned char*)&input)[1];
m_Internal[2] = ((unsigned char*)&input)[2];
return *this;
}
/***********************************************/
Int24 operator +( const Int24& val ) const
{
return Int24( (int)*this + (int)val );
}
Int24 operator -( const Int24& val ) const
{
return Int24( (int)*this - (int)val );
}
Int24 operator *( const Int24& val ) const
{
return Int24( (int)*this * (int)val );
}
Int24 operator /( const Int24& val ) const
{
return Int24( (int)*this / (int)val );
}
/***********************************************/
Int24 operator +( const int val ) const
{
return Int24( (int)*this + val );
}
Int24 operator -( const int val ) const
{
return Int24( (int)*this - val );
}
Int24 operator *( const int val ) const
{
return Int24( (int)*this * val );
}
Int24 operator /( const int val ) const
{
return Int24( (int)*this / val );
}
/***********************************************/
/***********************************************/
Int24& operator +=( const Int24& val )
{
*this = *this + val;
return *this;
}
Int24& operator -=( const Int24& val )
{
*this = *this - val;
return *this;
}
Int24& operator *=( const Int24& val )
{
*this = *this * val;
return *this;
}
Int24& operator /=( const Int24& val )
{
*this = *this / val;
return *this;
}
/***********************************************/
Int24& operator +=( const int val )
{
*this = *this + val;
return *this;
}
Int24& operator -=( const int val )
{
*this = *this - val;
return *this;
}
Int24& operator *=( const int val )
{
*this = *this * val;
return *this;
}
Int24& operator /=( const int val )
{
*this = *this / val;
return *this;
}
/***********************************************/
/***********************************************/
Int24 operator >>( const int val ) const
{
return Int24( (int)*this >> val );
}
Int24 operator <<( const int val ) const
{
return Int24( (int)*this << val );
}
/***********************************************/
Int24& operator >>=( const int val )
{
*this = *this >> val;
return *this;
}
Int24& operator <<=( const int val )
{
*this = *this << val;
return *this;
}
/***********************************************/
/***********************************************/
operator bool() const
{
return (int)*this != 0;
}
bool operator !() const
{
return !((int)*this);
}
Int24 operator -()
{
return Int24( -(int)*this );
}
/***********************************************/
/***********************************************/
bool operator ==( const Int24& val ) const
{
return (int)*this == (int)val;
}
bool operator !=( const Int24& val ) const
{
return (int)*this != (int)val;
}
bool operator >=( const Int24& val ) const
{
return (int)*this >= (int)val;
}
bool operator <=( const Int24& val ) const
{
return (int)*this <= (int)val;
}
bool operator >( const Int24& val ) const
{
return (int)*this > (int)val;
}
bool operator <( const Int24& val ) const
{
return (int)*this < (int)val;
}
/***********************************************/
bool operator ==( const int val ) const
{
return (int)*this == val;
}
bool operator !=( const int val ) const
{
return (int)*this != val;
}
bool operator >=( const int val ) const
{
return (int)*this >= val;
}
bool operator <=( const int val ) const
{
return (int)*this <= val;
}
bool operator >( const int val ) const
{
return ((int)*this) > val;
}
bool operator <( const int val ) const
{
return (int)*this < val;
}
/***********************************************/
/***********************************************/
};

Working with anything smaller than an integer (32 or 64 bit depending on your architecture) is not ideal. All CPU operations on the smaller data types (short, etc.) are done using integer arithmetic. Each value has to be widened before the operation and narrowed again afterwards, slowing your application down (even if it is just a tad).
My advice: Store them as 32 (or 64 bit) integers to improve your overall speed. When it comes time to do I/O, then you'll have to do the conversion yourself.
As far as manipulating audio data, there are many libraries available that take care of the I/O for you - unless you want to start learning how PCM, etc are stored - as well as other DSP functions. I would suggest using one of the many libraries out there.
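The "convert only at I/O time" advice above could look like this when writing samples back out. It is a sketch under assumptions: the output is little-endian, store_int24_le is an invented name, and out-of-range values are clamped rather than wrapped, which is only one of several reasonable policies.

```cpp
#include <cstdint>

// Hypothetical helper: write a 32-bit working value as three
// little-endian bytes, clamping it to the signed 24-bit range first.
inline void store_int24_le(std::int32_t v, unsigned char* out)
{
    if (v >  8388607) v =  8388607;   // 2^23 - 1
    if (v < -8388608) v = -8388608;   // -2^23
    // Conversion to unsigned is modular, so negative values keep their
    // two's-complement bit pattern in the low 24 bits.
    const std::uint32_t u = static_cast<std::uint32_t>(v);
    out[0] = static_cast<unsigned char>(u & 0xFFu);
    out[1] = static_cast<unsigned char>((u >> 8) & 0xFFu);
    out[2] = static_cast<unsigned char>((u >> 16) & 0xFFu);
}
```

All arithmetic in between can then stay in plain int32_t (or float), and only the file reader and writer need to know about the 3-byte layout.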

I know I'm ten years late, but what do you think about a bitset solution ?
class i24
{
std::bitset<24> m_value;
public:
constexpr i24(int value) noexcept: m_value {static_cast<unsigned long long>(value)} {}
operator int() const
{
constexpr std::uint32_t negative_mask = (0xffu << 24); // unsigned literal avoids shifting into the sign bit of an int
return (m_value[23] ? negative_mask : 0) | m_value.to_ulong();
}
};

No - all you can really do is:
typedef int32_t int24_t;
which helps to make code/intent more readable/obvious, but doesn't impose any limits on range or storage space.

The best way is to create an Int24 class and use it like a primitive type.
It would be like:
Int24.h
#ifndef INT24_H
#define INT24_H
class Int24
{
public:
Int24();
Int24(unsigned long);
Int24 operator+ (Int24 value);
Int24 operator* (int value);
Int24 operator/ (int value);
void operator= (unsigned long value);
void operator= (Int24 value);
operator int() const;
// Declare prefix and postfix increment operators.
Int24 &operator++(); // Prefix increment operator.
Int24 operator++(int); // Postfix increment operator.
// Declare prefix and postfix decrement operators.
Int24 &operator--(); // Prefix decrement operator.
Int24 operator--(int); // Postfix decrement operator.
unsigned long value() const;
private:
unsigned char mBytes[3] ;
};
#endif // INT24_H
Int24.cpp
#include "Int24.h"
Int24::Int24()
{
mBytes[0] = 0;
mBytes[1] = 0;
mBytes[2] = 0;
}
Int24::Int24(unsigned long value)
{
mBytes[0] = ( value & 0xff);
mBytes[1] = ((value >> 8) & 0xff);
mBytes[2] = ((value >> 16 ) & 0xff);
}
Int24 Int24::operator+(Int24 value)
{
Int24 retVal;
unsigned long myValue;
unsigned long addValue;
myValue = this->mBytes[2];
myValue <<= 8;
myValue |= this->mBytes[1];
myValue <<= 8;
myValue |= this->mBytes[0];
addValue = value.mBytes[2];
addValue <<= 8;
addValue |= value.mBytes[1];
addValue <<= 8;
addValue |= value.mBytes[0];
myValue += addValue;
retVal = myValue;
return retVal;
}
Int24 Int24::operator*(int value)
{
// Return a new value instead of modifying *this.
return Int24((*this).value() * value);
}
Int24 Int24::operator/(int value)
{
// Return a new value instead of modifying *this.
return Int24((*this).value() / value);
}
void Int24::operator=(unsigned long value)
{
mBytes[0] = ( value & 0xff);
mBytes[1] = ((value >> 8) & 0xff);
mBytes[2] = ((value >> 16 ) & 0xff);
}
void Int24::operator=(Int24 value)
{
mBytes[0] = value.mBytes[0];
mBytes[1] = value.mBytes[1];
mBytes[2] = value.mBytes[2];
}
Int24 &Int24::operator++()
{
(*this) = (*this).value() + 1;
return *this;
}
Int24 Int24::operator++(int)
{
Int24 temp = (*this);
++(*this);
return temp;
}
Int24 &Int24::operator--()
{
(*this) = (*this).value() - 1;
return *this;
}
Int24 Int24::operator--(int)
{
Int24 temp = (*this);
--(*this);
return temp;
}
Int24::operator int() const
{
return value();
}
unsigned long Int24::value() const
{
unsigned long retVal;
retVal = this->mBytes[2];
retVal <<= 8;
retVal |= this->mBytes[1];
retVal <<= 8;
retVal |= this->mBytes[0];
return retVal;
}
You can download the Int24 and Int48 classes from my GitHub link.
Also there is an example which shows you how to use it.


How to define a RandomAccessIterator over a pointer to a vector of chars?

I am implementing a kind of dataframe and I want to define a RandomAccessIterator over it, in order to run the various std algorithms on it, such as sorting. The dataframe in the example contains two columns, "a" and "b":
a; b;
20; 21;
20; 19;
10; 11;
40; 41;
10; 11;
After sorting with a trivial selection sort this is the result:
a; b;
10; 11;
10; 11;
20; 19;
20; 21;
40; 41;
The problem I am facing is that std::sort does not work properly, and I don't know whether the implementation of the iterator is sound.
This is the code.
File: dataframe.hpp
#pragma once
#include <iostream>
#include <charconv>
#include <vector>
#include <memory>
#include <cstring>
#include <numeric>
#include "iterator.hpp"
namespace df
{
class Record;
class Column;
class Dataframe;
namespace types
{
enum class Base : char
{
CHAR = 'A',
UNSIGNED = 'U',
// Other types..
};
class Dtype
{
public:
Dtype(types::Base base, std::size_t size) : m_base_dtype{base}, m_size{size} {}
[[nodiscard]] auto name() const
{
return std::string{static_cast<char>(m_base_dtype)} + std::to_string(m_size);
}
[[nodiscard]] auto base() const { return m_base_dtype; }
[[nodiscard]] auto size() const { return m_size; }
[[nodiscard]] auto is_primitive() const
{
switch (base())
{
case types::Base::CHAR:
return size() == 1;
case types::Base::UNSIGNED:
return size() == 1 or size() == 2 or size() == 4 or size() == 8;
}
return false;
}
private:
types::Base m_base_dtype;
std::size_t m_size;
};
[[nodiscard]] static auto CHAR(const std::size_t size) { return Dtype(types::Base::CHAR, size); }
[[nodiscard]] static auto UNSIGNED(const std::size_t size) { return Dtype(types::Base::UNSIGNED, size); }
}
class Column
{
public:
Column(std::vector<char> &raw, const types::Dtype dtype) : m_raw{std::move(raw)}, m_dtype{dtype} {}
Column &operator=(Column &&c) = default; // Move assignment operator
[[nodiscard]] const auto &dtype() const { return m_dtype; }
[[nodiscard]] auto &raw() { return m_raw; }
[[nodiscard]] const auto &raw() const { return m_raw; }
[[nodiscard]] auto *data() { return m_raw.data(); }
[[nodiscard]] const auto *data() const { return m_raw.data(); }
private:
std::vector<char> m_raw;
types::Dtype m_dtype;
};
class Dataframe
{
public:
Dataframe(std::vector<char> &raw, std::vector<std::string> names, std::vector<types::Dtype> dtypes)
{
m_raw = std::move(raw);
m_column_dtypes = dtypes;
m_column_names = names;
m_record_size = 0;
for (const auto dt : dtypes)
{
m_column_offsets.emplace_back(m_record_size);
m_record_size += dt.size();
}
m_record_count = m_raw.size() / m_record_size;
}
Dataframe(std::vector<char> &raw, std::vector<types::Dtype> dtypes) : Dataframe(raw, {}, dtypes) {}
Dataframe &operator=(Dataframe &&c) = default; // Move assignment operator
[[nodiscard]] auto &raw() { return m_raw; }
[[nodiscard]] const auto &raw() const { return m_raw; }
[[nodiscard]] auto *data() { return m_raw.data(); }
[[nodiscard]] const auto *data() const { return m_raw.data(); }
// Iterators
[[nodiscard]] df::Iterator begin()
{
return df::Iterator{m_raw.data(), m_record_size};
}
[[nodiscard]] df::Iterator end()
{
return df::Iterator{m_raw.data() + m_raw.size(), m_record_size};
}
[[nodiscard]] auto shape() const { return std::make_pair(m_record_count, m_column_dtypes.size()); }
[[nodiscard]] auto record_count() const { return m_record_count; }
[[nodiscard]] auto record_size() const { return m_record_size; }
[[nodiscard]] const auto &names() const { return m_column_names; }
[[nodiscard]] const auto &dtypes() const { return m_column_dtypes; }
[[nodiscard]] const auto &offsets() const { return m_column_offsets; }
void print() { print(m_record_count); }
void print(const std::size_t initial_records)
{
// Print header
for (auto column_name : m_column_names)
{
std::cout << column_name << "; ";
}
std::cout << std::endl;
// Print rows
std::size_t records_to_print = std::min(initial_records, m_record_count);
for (std::size_t i = 0; i < records_to_print; i++)
{
const auto start_p = i * record_size();
auto start_field = 0;
auto end_field = 0;
for (auto field : m_column_dtypes)
{
end_field += field.size();
switch (field.base())
{
case types::Base::UNSIGNED:
{
std::uint64_t uint_value = 0;
memcpy(&uint_value, m_raw.data() + start_p + start_field, field.size());
std::cout << uint_value;
break;
}
case types::Base::CHAR:
{
std::string str_value = std::string(m_raw.data() + start_p + start_field, field.size());
std::cout << str_value;
break;
}
}
start_field = end_field;
// New column
std::cout << "; ";
}
// New row
std::cout << std::endl;
}
}
std::shared_ptr<Dataframe> copy() const
{
auto x = std::vector<char>(m_raw);
return std::make_shared<Dataframe>(x, std::vector<std::string>(m_column_names), std::vector<types::Dtype>(m_column_dtypes));
}
private:
std::vector<char> m_raw = {};
std::vector<std::string> m_column_names = {};
std::vector<types::Dtype> m_column_dtypes = {};
std::vector<std::size_t> m_column_offsets = {};
std::size_t m_record_size = {};
std::size_t m_record_count = {};
};
using namespace types;
static std::shared_ptr<Dataframe> read_from_vector(const std::vector<std::vector<std::string>> values, const std::vector<std::string> names, const std::vector<Dtype> dtypes)
{
const auto record_size = std::accumulate(dtypes.begin(), dtypes.end(), std::size_t{0},
[](std::size_t accum, const auto &m)
{ return accum + m.size(); });
const auto total_size = values.size() * record_size;
const std::size_t INCR_RECORDS = std::max(total_size / (10 * record_size), std::size_t{65536});
auto raw = std::vector<char>{};
std::size_t written_records = 0;
auto offsets = std::vector<std::size_t>{};
for (int offset = 0; const auto &kd : dtypes)
{
offsets.push_back(offset);
offset += kd.size();
}
for (auto value : values)
{
if (written_records >= raw.size() / record_size)
{
raw.resize(raw.size() + INCR_RECORDS * record_size, char{' '});
}
for (int i = 0; i < names.size(); i++)
{
const auto name = names[i];
const auto dtype = dtypes[i];
const auto offset = offsets[i];
const auto pos = written_records * record_size + offset;
switch (dtype.base())
{
case df::Base::CHAR:
{
const auto v = value[i];
const auto byte_to_copy = std::min(v.size(), dtype.size());
std::memcpy(raw.data() + pos,
v.data() + v.size() - byte_to_copy, byte_to_copy); // Take the last bytes
break;
}
case df::Base::UNSIGNED:
{
const auto v = std::stoull(value[i]);
const auto byte_to_copy = dtype.size();
std::memcpy(raw.data() + pos, &v, byte_to_copy); // Take the first byte_to_copy bytes of v (the low-order bytes on little-endian)
break;
}
default:
throw std::runtime_error("Unrecognized ColumnType");
}
}
written_records++;
}
raw.resize(written_records * record_size);
raw.shrink_to_fit();
return std::make_shared<Dataframe>(raw, names, dtypes);
}
}
File: iterator.hpp
#pragma once
#include <iostream>
#include <cstring>
namespace df
{
class Iterator
{
std::size_t size;
char *ptr;
public:
struct record_reference;
struct record_value
{
std::size_t size;
char *ptr;
record_value(const record_reference &t) : record_value(t.size, t.ptr){};
record_value(const std::size_t m_size, char *m_ptr)
{
this->size = m_size;
this->ptr = new char[this->size];
std::memcpy(ptr, m_ptr, this->size);
}
~record_value()
{
delete[] this->ptr;
}
};
struct record_reference
{
std::size_t size;
char *ptr;
record_reference(const std::size_t m_size, char *m_ptr)
{
this->size = m_size;
this->ptr = m_ptr;
}
record_reference(const record_reference &t)
{
this->size = t.size;
this->ptr = t.ptr;
}
// record_reference(const record_value &t) : record_reference(t.size, t.ptr) {};
record_reference &operator=(const record_value &t)
{
std::memcpy(ptr, t.ptr, size);
return *this;
}
record_reference &operator=(const record_reference &t)
{
std::memcpy(ptr, t.ptr, size);
return *this;
}
record_reference &operator=(char *t)
{
std::memcpy(ptr, t, size);
return *this;
}
operator char *()
{
return ptr;
}
operator const char *() const { return ptr; }
};
using iterator_category = std::random_access_iterator_tag;
using value_type = record_value;
using reference = record_reference;
using difference_type = std::ptrdiff_t;
// default constructible
Iterator() : size(0), ptr(nullptr)
{
}
// copy assignable
Iterator &operator=(const Iterator &t)
{
size = t.size;
ptr = t.ptr;
return *this;
}
Iterator(char *ptr, const std::size_t size) : size{size}, ptr(ptr)
{
}
record_reference operator*() const
{
return {size, ptr};
}
// Prefix
Iterator &operator++()
{
ptr += size;
return *this;
}
// Postfix
Iterator operator++(int)
{
auto tmp = *this;
++*this;
return tmp;
}
Iterator &operator--()
{
ptr -= size;
return *this;
}
difference_type operator-(const Iterator &it) const
{
return (this->ptr - it.ptr) / size;
}
Iterator operator+(const difference_type &offset) const
{
return Iterator(ptr + offset * size, size);
}
friend Iterator operator+(const difference_type &diff, const Iterator &it)
{
return it + diff;
}
Iterator operator-(const difference_type &diff) const
{
return Iterator(ptr - diff * size, size);
}
reference operator[](const difference_type &offset) const
{
return {size, ptr + offset * size};
}
bool operator==(const Iterator &it) const
{
return this->ptr == it.ptr;
}
bool operator!=(const Iterator &it) const
{
return !(*this == it);
}
bool operator<(const Iterator &it) const
{
return this->ptr < it.ptr;
}
bool operator>=(const Iterator &it) const
{
return this->ptr >= it.ptr;
}
bool operator>(const Iterator &it) const
{
return this->ptr > it.ptr;
}
bool operator<=(const Iterator &it) const
{
return this->ptr <= it.ptr;
}
Iterator &operator+=(const difference_type &diff)
{
ptr += diff * size;
return *this;
}
operator Iterator() const
{
return Iterator(ptr, size);
}
};
void swap(df::Iterator::record_reference a, df::Iterator::record_reference b)
{
unsigned char *p;
unsigned char *q;
unsigned char *const sentry = (unsigned char *)a.ptr + a.size;
for (p = (unsigned char *)a.ptr, q = (unsigned char *)b.ptr; p < sentry; ++p, ++q)
{
const unsigned char t = *p;
*p = *q;
*q = t;
}
}
}
File: comparator.hpp
#pragma once
#include <memory>
#include <functional>
#include "dataframe.hpp"
#include "iterator.hpp"
namespace compare
{
using comparator_fn = std::function<int(const df::Iterator::record_reference, const df::Iterator::record_reference)>;
template <typename T, std::size_t offset = 0, std::size_t size = sizeof(T)>
static inline comparator_fn make_comparator()
{
if constexpr (size == 3 or size == 5 or size == 7 or size > 8)
return [=](const df::Iterator::record_reference a, const df::Iterator::record_reference b)
{ return std::memcmp(a + offset, b + offset, size); };
return [](const df::Iterator::record_reference a, const df::Iterator::record_reference b)
{ return *(T *)(a + offset) < *(T *)(b + offset) ? -1 : *(T *)(b + offset) < *(T *)(a + offset) ? +1
: 0; };
}
template <typename T>
static inline comparator_fn make_comparator(const std::size_t offset)
{
return [=](const df::Iterator::record_reference a, const df::Iterator::record_reference b)
{ return *(T *)(a + offset) < *(T *)(b + offset) ? -1 : *(T *)(b + offset) < *(T *)(a + offset) ? +1
: 0; };
}
static inline comparator_fn make_column_comparator(const df::Dtype dtype, const std::size_t offset)
{
switch (dtype.base())
{
case df::Base::CHAR:
{
if (dtype.size() == 1)
return make_comparator<std::uint8_t>(offset);
else if (dtype.size() == 2)
return [=](const df::Iterator::record_reference a, const df::Iterator::record_reference b)
{ return std::memcmp(a + offset, b + offset, 2); }; // Is there any benefit to hard-coding the 2, or is it better to treat it as an unsigned short?
return [=](const df::Iterator::record_reference a, const df::Iterator::record_reference b)
{ return std::memcmp(a + offset, b + offset, dtype.size()); };
}
case df::Base::UNSIGNED:
{
return [=](const df::Iterator::record_reference a, const df::Iterator::record_reference b)
{
std::uint64_t uint_value_a = 0;
std::uint64_t uint_value_b = 0;
std::memcpy(&uint_value_a, a + offset, dtype.size());
std::memcpy(&uint_value_b, b + offset, dtype.size());
return (uint_value_a < uint_value_b ? -1 : uint_value_a > uint_value_b ? +1
: 0);
};
}
default:
throw std::runtime_error("Unsupported dtype");
break;
}
}
static inline comparator_fn make_composite_two_way_comparator(const std::shared_ptr<df::Dataframe> &T)
{
const auto K = T->dtypes().size();
std::vector<comparator_fn> F;
for (int i = 0; i < K; i++)
{
F.emplace_back(make_column_comparator(T->dtypes()[i], T->offsets()[i]));
}
const auto comparator = [=](const df::Iterator::record_reference a, const df::Iterator::record_reference b)
{
for (int i = 0; i < K; i++)
{
// If equal go to the next column, otherwise return the result
// The return value is true if the first argument is less than the second
// and false otherwise
if (const auto result = F[i](a, b); result != 0)
return result < 0;
}
return false;
};
return comparator;
}
}
File: main.cpp
#include <iostream>
#include <vector>
#include "dataframe.hpp"
#include "comparator.hpp"
template <typename RandomAccessIterator, typename Comparator>
static void selection_sort(RandomAccessIterator first, RandomAccessIterator last, Comparator comp)
{
for (auto i = first; i != last; ++i)
{
auto min = i;
for (auto j = i + 1; j != last; ++j)
{
if (comp(*j, *min))
min = j;
}
df::Iterator::value_type temp = *i;
*i = *min;
*min = temp;
// Alternative
// std::iter_swap(i, min);
}
}
int main(int argc, char const *argv[])
{
std::vector<std::string> values{"20", "21", "20", "19", "10", "11", "40", "41", "10", "11"};
// Create a vector that contains values grouped by 2
std::vector<std::vector<std::string>> v;
for (int i = 0; i < values.size(); i += 2)
{
std::vector<std::string> temp;
temp.push_back(values[i]);
temp.push_back(values[i + 1]);
v.push_back(temp);
}
std::vector<std::string> column_names = {"a", "b"};
df::Dtype d = df::Dtype(df::Base::UNSIGNED, 4);
std::vector dtypes = {d, d};
// Create a dataframe
std::shared_ptr<df::Dataframe> df = df::read_from_vector(v, column_names, dtypes);
std::cout << "Before sorting" << std::endl;
df->print();
// This comparator sorts the dataframe first by column a and then by column b in ascending order
auto comparator = compare::make_composite_two_way_comparator(df);
selection_sort(df->begin(), df->end(), comparator);
std::cout << "\nAfter sorting" << std::endl;
df->print();
// With the std::sort it does not work
std::sort(df->begin(), df->end(), comparator);
return 0;
}
Your type is not a C++17 RandomAccessIterator, because it isn't a C++17 ForwardIterator, because reference is an object type, not a reference type.
The type It satisfies ForwardIterator if
Let T be the value type of It. The type std::iterator_traits<It>::reference must be either
T& or T&& if It satisfies OutputIterator (It is mutable), or
const T& or const T&& otherwise (It is constant),
(Other requirements elided)
You will be able to satisfy the C++20 concept std::random_access_iterator, because that relaxes the requirement on It::reference.
In C++17, the reference type of an iterator must be precisely value_type& in order for that iterator to be random access. Only input iterators can have the reference type be something other than value_type&. So in C++17, proxy iterators are limited to input iterators. And every algorithm written against C++17 has this expectation.
The C++20 ranges library adds the ability to have random access proxy iterators. And the C++20 algorithms that use those range concepts will respect them.
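The C++17 rule described above can be checked with plain type traits. The sketch below uses std::vector<bool>::iterator as a standard-library example of a proxy iterator, since its reference is an object type just like record_reference in the question. The name has_true_reference is made up for this illustration.

```cpp
#include <iterator>
#include <type_traits>
#include <vector>

// Hypothetical check: true when iterator_traits<It>::reference is a real
// reference to value_type (T&, const T&, or the rvalue forms), which is
// the shape the C++17 forward-iterator requirement asks for.
template <typename It>
constexpr bool has_true_reference()
{
    using traits = std::iterator_traits<It>;
    using ref = typename traits::reference;
    using bare = typename std::remove_cv<
        typename std::remove_reference<ref>::type>::type;
    return std::is_reference<ref>::value &&
           std::is_same<bare, typename traits::value_type>::value;
}

// Ordinary iterators yield int& / const int&.
static_assert(has_true_reference<std::vector<int>::iterator>(),
              "vector<int>::iterator yields int&");
static_assert(has_true_reference<std::vector<int>::const_iterator>(),
              "vector<int>::const_iterator yields const int&");

// vector<bool> hands out a proxy object, so the C++17 requirement fails.
static_assert(!has_true_reference<std::vector<bool>::iterator>(),
              "vector<bool>::iterator yields a proxy object");
```

Applied to df::Iterator from the question, the same check would report false, because reference is record_reference while value_type is record_value.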

requirements for custom container type to use with views

I started playing with std::ranges and want to understand how views really work, so I tried to write my own container and iterator type and use them in a view.
Something seems to be missing, but the compiler only tells me that there is no begin() method inside the view, not why.
Example:
#include <iostream>
#include <array>
#include <ranges>
class MyFixedContainer;
class MyIterator
{
MyFixedContainer* ptr;
unsigned int offset;
public:
MyIterator( MyFixedContainer* ptr_, unsigned int offset_ ): ptr{ ptr_},offset{offset_}{}
bool operator==( MyIterator& other ) const
{
return ( ptr == other.ptr )&& ( offset == other.offset );
}
bool operator!=( MyIterator& other ) const
{
return !(*this == other);
}
MyIterator operator++()
{
offset++;
return *this;
}
MyIterator operator++(int)
{
MyIterator tmp = *this;
offset++;
return tmp;
}
int operator*() const;
};
class MyFixedContainer
{
std::array<int,4> arr={5,6,7,8};
public:
auto begin() { return MyIterator{ this, 0 }; }
auto end() { return MyIterator{ this, 4}; }
int Get( int offset ) const
{
return arr[ offset ];
}
};
int MyIterator::operator*() const
{
return ptr->Get( offset );
}
int main()
{
MyFixedContainer c;
// Container type itself works:
for ( int i: c )
{
std::cout << i << std::endl;
}
// Try to use with std::ranges
auto even = [] (int i) { return 0 == i % 2; };
auto y = std::views::filter(c, even);
auto b = y.begin(); // << error message
}
Compiling fails with:
main.cpp:90:16: error: 'struct std::ranges::views::__adaptor::_RangeAdaptorClosure<std::ranges::views::__adaptor::_RangeAdaptor<_Callable>::operator()<{MyFixedContainer&, main()::<lambda(int)>&}>::<lambda(_Range&&)> >' has no member named 'begin'
90 | auto b = y.begin();
https://godbolt.org/z/doW76j
MyIterator does not model std::input_or_output_iterator because:
It needs to be default constructible.
std::iter_difference_t<MyIterator> must be valid, and
the pre-increment operator must return a reference.
MyIterator is not a std::sentinel_for<MyIterator, MyIterator> because its operators == and != take references instead of const references.
MyIterator does not satisfy std::input_iterator, which requires std::iter_value_t to be valid.
Fixing all of the above:
#include <iostream>
#include <array>
#include <ranges>
class MyFixedContainer;
class MyIterator
{
MyFixedContainer* ptr;
unsigned int offset;
public:
using difference_type = int;
using value_type = int;
MyIterator() = default;
MyIterator( MyFixedContainer* ptr_, unsigned int offset_ ): ptr{ ptr_},offset{offset_}{}
bool operator==( MyIterator const & other ) const
{
return ( ptr == other.ptr )&& ( offset == other.offset );
}
bool operator!=( MyIterator const & other ) const
{
return !(*this == other);
}
MyIterator &operator++()
{
offset++;
return *this;
}
MyIterator operator++(int)
{
MyIterator tmp = *this;
offset++;
return tmp;
}
int operator*() const;
};
class MyFixedContainer
{
std::array<int,4> arr={5,6,7,8};
public:
auto begin() { return MyIterator{ this, 0 }; }
auto end() { return MyIterator{ this, 4}; }
int Get( int offset ) const
{
return arr[ offset ];
}
};
int MyIterator::operator*() const
{
return ptr->Get( offset );
}
int main()
{
MyFixedContainer c;
// Container type itself works:
for ( int i: c )
{
std::cout << i << std::endl;
}
// Try to use with std::ranges
auto even = [] (int i) { return 0 == i % 2; };
static_assert(std::input_or_output_iterator<MyIterator>);
static_assert(std::ranges::input_range<MyFixedContainer>);
auto y = c | std::views::filter(even);
auto b = y.begin(); // << OK
}
The error messages are much clearer if you static_assert every concept that your container/iterator has to model.
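If C++20 concepts are not available, the three input_or_output_iterator problems listed earlier in this answer can be approximated with ordinary type traits. This is only a rough sketch: BadIt and GoodIt are hypothetical stand-ins for the original and the fixed MyIterator, and has_difference_type only looks for a nested typedef, whereas std::iter_difference_t has more ways to find the type.

```cpp
#include <cstddef>
#include <type_traits>
#include <utility>

// Hypothetical iterator with the same defects as the original MyIterator:
// no default constructor, no difference_type, pre-increment by value.
struct BadIt {
    explicit BadIt(int* p) : p_(p) {}
    BadIt operator++() { ++p_; return *this; }
    int operator*() const { return *p_; }
    int* p_;
};

// Hypothetical repaired version.
struct GoodIt {
    using difference_type = std::ptrdiff_t;
    using value_type = int;
    GoodIt() = default;
    explicit GoodIt(int* p) : p_(p) {}
    GoodIt& operator++() { ++p_; return *this; }
    GoodIt operator++(int) { GoodIt t = *this; ++p_; return t; }
    int operator*() const { return *p_; }
    int* p_ = nullptr;
};

// Does ++it return It& ?
template <typename It>
constexpr bool preincrement_returns_reference()
{
    return std::is_same<decltype(++std::declval<It&>()), It&>::value;
}

// Does It declare a nested difference_type ?
template <typename It, typename = void>
struct has_difference_type : std::false_type {};
template <typename It>
struct has_difference_type<
    It, decltype(void(std::declval<typename It::difference_type>()))>
    : std::true_type {};

static_assert(!std::is_default_constructible<BadIt>::value, "no default ctor");
static_assert(!has_difference_type<BadIt>::value, "no difference_type");
static_assert(!preincrement_returns_reference<BadIt>(), "++ returns by value");

static_assert(std::is_default_constructible<GoodIt>::value, "default ctor ok");
static_assert(has_difference_type<GoodIt>::value, "difference_type ok");
static_assert(preincrement_returns_reference<GoodIt>(), "++ returns GoodIt&");
```

Each static_assert names one requirement, so a failure points directly at the member that needs fixing.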

How can I use unsigned short int and short int to subtract HugeInteger in C++?

I'm doing my best to write this in C++. When I use unsigned int and int, the subtraction works. However, when I use unsigned short int and short int, I have a problem. What do I need to change in my code?
Thanks a lot.
Test value:
21758612232416725117133766166700 1758612232416725117155428849047
To begin with, I assume that
*this > op2
always holds.
template< typename T >
HugeInteger< T > HugeInteger< T >::operator-(const HugeInteger &op2)const // subtraction operator; HugeInteger - HugeInteger
{
int size = integer.getSize();
int op2Size = op2.integer.getSize();
int differenceSize = size;
HugeInteger < T > difference(differenceSize);
difference = *this;
Vector<T>::iterator it = difference.integer.begin();
int counter = 0;
for (Vector<T>::iterator i = difference.integer.begin(), j = op2.integer.begin(); j < op2.integer.end(); i++, j++) {
if ((*i - *j - counter) < 10) {
*i -= (*j + counter);
counter = 0;
}
else {
*i += 10;
*i -= (*j + counter);
counter = 1;
}
}
while (counter == 1) {
if ((*(it + op2Size) - counter) < 10) {
*(it + op2Size) -= counter;
counter = 0;
op2Size++;
}
else {
*(it + op2Size) += 10;
*(it + op2Size) -= counter;
counter = 1;
op2Size++;
}
}
if (*this == op2) {
HugeInteger<T> zero(1);
return zero;
}
for (Vector<T>::iterator i = difference.integer.end() - 1; i > difference.integer.begin(); i--) {
if (*i == 0) {
differenceSize--;
}
else {
break;
}
}
difference.integer.resize(differenceSize);
return difference;
}
There's not enough information in the code you posted to determine what is going wrong. I suspect it has something to do with how the "short" types are being converted to a HugeInt.
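One candidate cause, offered as an assumption because Vector and HugeInteger are not shown in full: the test (*i - *j - counter) < 10 behaves differently depending on T. With unsigned int, a "negative" difference wraps around to a huge value, so the test fails and the borrow branch runs. With unsigned short, the operands are promoted to int before the subtraction, the difference really is negative, the test passes, and the following *i -= ... wraps the stored digit. A borrow step that widens first and then checks the sign does not depend on either behavior. The helper sub_digit below is hypothetical and not part of the question's class; the static_asserts assume int is wider than short, as on mainstream platforms.

```cpp
#include <type_traits>

// unsigned short operands are promoted to int, unsigned int operands are not
// (assumes int can represent every unsigned short value).
static_assert(std::is_same<decltype(static_cast<unsigned short>(1) -
                                    static_cast<unsigned short>(2)),
                           int>::value,
              "unsigned short arithmetic is done in int");
static_assert(std::is_same<decltype(1u - 2u), unsigned int>::value,
              "unsigned int arithmetic stays unsigned and wraps");

// Hypothetical helper: subtract one decimal digit (plus incoming borrow)
// from another. Works in int, so it behaves the same for every digit type T.
template <typename T>
T sub_digit(T a, T b, T& borrow)
{
    const int rhs = static_cast<int>(b) + static_cast<int>(borrow);
    int d = static_cast<int>(a) - rhs;
    if (d < 0) {
        d += 10;          // borrow ten from the next digit
        borrow = T(1);
    } else {
        borrow = T(0);
    }
    return static_cast<T>(d);
}
```

Used digit by digit from the least significant end, with borrow starting at 0, this gives the same result for short, unsigned short, int and unsigned int.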
Below, I show a class definition for something very similar to HugeInt. It keeps track of a value's sign, but unfortunately only a few members are defined, just enough to demonstrate subtraction.
This class has a nutty template conversion constructor. If none of the other constructors can handle a type, it will try to convert values with that type as if they were some kind of integer value, in a size-independent manner.
main() has two example subtraction operations, both involving a short value and a "huge" value.
#include <iostream>
#include <vector>
#include <string>
// radix-10 arbitrary size integer class
template<typename T>
class HugInt {
// integer digits are stored in a Vector container
// The example uses the class Vector, derived (statically)
// from std::vector<T> in order to mimic the class from
// code in the Question. Normally, I would just use a std::vector.
class Vector : public std::vector<T> { // virtual never needed
public:
int getSize() const {
return static_cast<int>(std::vector<T>::size());
} };
// two member variables
Vector integer;
int sign;
// For encapsulation and compactness, this class uses a lot
// of inline-friend functions. If you write your own
// inline friends, each one must be called with at least one
// argument of the class in which it is defined.
// Otherwise, the compiler cannot find it.
// compares just the magnitude (unsigned) parts of two HugInts
// returns a>b:1, a==b:0, a<b:-1
friend int cmpMagnitude(const HugInt& lh, const HugInt& rh) {
const int lh_size = lh.integer.getSize();
const int rh_size = rh.integer.getSize();
if(lh_size != rh_size) return lh_size > rh_size ? 1 : -1;
for(int i=lh_size-1; i+1; --i) {
if(lh.integer[i] != rh.integer[i]) {
return lh.integer[i] > rh.integer[i] ? 1 : -1;
} }
return 0;
}
// subtract the magnitude of op2 from the magnitude of *this
// does not take signs into account, but
// flips sign of *this if op2 has a greater magnitude than *this
void subMagnitude(const HugInt& op2) {
const int cm = cmpMagnitude(*this, op2);
if(cm == 0) {
// magnitudes are equal, so result is zero
integer.clear();
sign = 0;
return;
}
if(cm < 0) {
// If op2's magnitude is greater than this's
// subtract this's Magnitude from op2's,
// then set this to the negated result
HugInt temp{op2};
temp.subMagnitude(*this);
integer = temp.integer;
sign = -sign;
return;
}
// perform digit-wise Magnitude subtraction
// here, this's Magnitude is always greater or
// equal to op2's
T borrow = 0;
const int min_size = op2.integer.getSize();
int i;
for(i=0; i<min_size; ++i) {
const T s = op2.integer[i] + borrow;
if(borrow = (integer[i] < s)) {
integer[i] += T(10);
}
integer[i] -= s;
}
// propagate borrow to upper words (beyond op2's size)
// i is guaranteed to stay in bounds
while(borrow) {
if(borrow = (integer[i] < 1)) {
integer[i] += T(10);
}
--integer[i++];
}
// remove zeroes at end until a nonzero
// digit is encountered or the vector is empty
while(!integer.empty() && !integer.back()) {
integer.pop_back();
}
// if the vector is empty after removing zeroes,
// the object's value is 0, fixup the sign
if(integer.empty()) sign = 0;
}
void addMagnitude(const HugInt& op2) {
std::cout << "addMagnitude called but not implemented\n";
}
// get_sign generically gets a value's sign as an int
template <typename D>
static int get_sign(const D& x) { return int(x > 0) - (x < 0); }
public:
HugInt() : sign(0) {} // Default ctor
// Conversion ctor for template type
// Handles converting from any type not handled elsewhere
// Assumes D is some kind of integer type
// To be more correct, narrow the set of types this overload will handle,
// either by replacing with overloads for specific types,
// or use <type_traits> to restrict it to integer types.
template <typename D>
HugInt(D src) : sign(get_sign(src)) {
// if src is negative, make absolute value to store magnitude in Vector
if(sign < 0) src = D(0)-src; // subtract from zero prevents warning with unsigned D
// loop gets digits from least- to most-significant
// Repeat until src is zero: get the low digit, with src (mod 10)
// then divide src by 10 to shift all digits down one place.
while(src >= 1) {
integer.push_back(T(src % D(10)));
src /= D(10);
} }
// converting floating-point values will cause an error if used with the integer modulo
// operator (%). New overloads could use fmod. But for a shorter example, do something cheesy:
HugInt(double src) : HugInt(static_cast<long long>(src)) {}
HugInt(float src) : HugInt(static_cast<long long>(src)) {}
// conversion ctor from std::string
HugInt(const std::string& str) : sign(1) {
for(auto c : str) {
switch(c) {
// for simple, short parsing logic, a '-'
// found anywhere in the parsed value will
// negate the sign. If the '-' count is even,
// the result will be positive.
case '-': sign = -sign; break;
case '+': case ' ': case '\t': break; // '+', space and tab ignored
case '0': if(integer.empty()) break; // ignore leading zeroes or fallthrough
case '1': case '2': case '3': case '4': case '5':
case '6': case '7': case '8': case '9':
integer.insert(integer.begin(), T(c - '0'));
break;
} }
if(integer.empty()) sign = 0; // fix up zero value if no digits between '1' and '9' found
}
// conversion ctor from C-String (dispatches to std::string ctor)
HugInt(const char* str) : HugInt(std::string(str)) {}
// The default behavior, using value semantics to copy the
// two data members, is correct for our copy/move ctors/assigns
HugInt(const HugInt& src) = default;
HugInt& operator = (const HugInt& src) = default;
// some "C++11" compilers refuse to default the moves
// HugInt(HugInt&& src) = default;
// HugInt& operator = (HugInt&& src) = default;
// cmp(a, b) returns 1 if a>b, 0 if a==b or -1 if a<b
friend int cmp(const HugInt& lh, const HugInt& rh) {
if(lh.sign != rh.sign) return lh.sign > rh.sign ? 1 : -1;
const int cmpmag = cmpMagnitude(lh, rh);
return lh.sign < 0 ? -cmpmag : cmpmag;
}
friend bool operator == (const HugInt& lh, const HugInt& rh) {
return cmp(lh, rh) == 0;
}
friend bool operator != (const HugInt& lh, const HugInt& rh) {
return cmp(lh, rh) != 0;
}
friend bool operator > (const HugInt& lh, const HugInt& rh) {
return cmp(lh, rh) == 1;
}
friend bool operator < (const HugInt& lh, const HugInt& rh) {
return cmp(lh, rh) == -1;
}
friend bool operator >= (const HugInt& lh, const HugInt& rh) {
return cmp(lh, rh) != -1;
}
friend bool operator <= (const HugInt& lh, const HugInt& rh) {
return cmp(lh, rh) != 1;
}
// negate operator
HugInt operator - () const {
HugInt neg;
neg.integer = integer;
neg.sign = -sign;
return neg;
}
// subtract-assign operator
HugInt& operator -= (const HugInt &op2) {
if(op2.sign == 0) { // op2 is zero, do nothing
return *this;
}
if(sign == 0) { // *this is zero, set *this to negative op2
integer = op2.integer;
sign = -op2.sign;
return *this;
}
if(sign == op2.sign) { // same signs: use magnitude subtraction
subMagnitude(op2);
return *this;
}
// opposite signs here: needs magnitude addition (but not implemented)
addMagnitude(op2);
return *this;
}
friend HugInt operator - (const HugInt& lh, const HugInt& rh) {
// a - b uses the -= operator to avoid duplicate code
HugInt result{lh};
return result -= rh;
}
// overload stream output operator for HugInt values
friend std::ostream& operator << (std::ostream& os, const HugInt& hi) {
// assumes decimal output and class radix is 10
if(hi.integer.getSize() == 0) return os << '0';
if(hi.sign < 0) os << '-';
for(int i=hi.integer.getSize()-1; i+1; --i) {
os << char('0' + int(hi.integer[i]));
}
return os;
}
};
int main() {
using HugeInt = HugInt<char>;
{
HugeInt a = "21758612232416725117133766166700 1758612232416725117155428849047";
unsigned short b = 55427;
HugeInt c = a - b;
std::cout << a << " - " << b << " = \n" << c << '\n';
}
std::cout << '\n';
{
short a = 6521;
HugeInt b = 1234567;
HugeInt c = a - b;
std::cout << a << " - " << b << " = \n" << c << '\n';
}
}
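The borrow handling in subMagnitude can be looked at in isolation. The following is a minimal sketch, not part of the class above: subtractDigits and demoSubtract are hypothetical names, digits are plain int values stored least-significant first, and the caller is assumed to guarantee that a's magnitude is at least b's.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Radix-10 magnitude subtraction, digits stored least-significant first.
// Assumption (not checked): the value in a is >= the value in b.
std::vector<int> subtractDigits(std::vector<int> a, const std::vector<int>& b) {
    int borrow = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        // amount to take from this digit: b's digit (if any) plus the borrow
        const int s = borrow + (i < b.size() ? b[i] : 0);
        if (a[i] < s) {
            a[i] += 10;   // borrow ten from the next higher digit
            borrow = 1;
        } else {
            borrow = 0;
        }
        a[i] -= s;
    }
    // drop leading zeroes (they sit at the back of the vector)
    while (!a.empty() && a.back() == 0) {
        a.pop_back();
    }
    return a;             // an empty vector represents zero
}

// Usage: 1000 - 1 = 999, and 25 - 25 = 0
void demoSubtract() {
    assert((subtractDigits({0, 0, 0, 1}, {1}) == std::vector<int>{9, 9, 9}));
    assert(subtractDigits({5, 2}, {5, 2}).empty());
}
```

Unlike subMagnitude, this version walks every digit of a in one loop instead of using a separate borrow-propagation loop, which keeps the sketch short at the cost of a few extra iterations.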

Copy constructor used in a "for" loop, but where?

I'm writing a UTF-8 string class and its two iterator classes, const and non-const. I'm encountering a const problem. Here are the classes:
class Utf8String
{
public:
class ConstIter;
class Iter
{
friend class ConstIter;
private:
Iter();
private:
Utf8String * m_pStr;
utf8::iterator< char * > m_oIter;
public:
Iter( const Iter & );
inline explicit Iter( Utf8String * pStr )
: m_pStr( pStr )
, m_oIter( m_pStr->m_sBuf, m_pStr->m_sBuf, m_pStr->m_sBuf + m_pStr->m_nSize )
{ }
inline Iter & operator = ( const Iter & oIter )
{
m_pStr = oIter.m_pStr;
m_oIter = utf8::iterator< char * >(
m_pStr->m_sBuf,
m_pStr->m_sBuf,
m_pStr->m_sBuf + m_pStr->m_nSize );
return *this;
}
inline operator const char * () const
{
return m_oIter.base();
}
inline uchar32_t operator * () const
{
return *m_oIter;
}
inline Iter & operator ++ ()
{
++m_oIter;
return *this;
}
inline Iter & operator -- ()
{
--m_oIter;
return *this;
}
inline bool operator == ( const Iter & oIter )
{
return m_oIter == oIter.m_oIter;
}
inline bool operator != ( const Iter & oIter )
{
return m_oIter != oIter.m_oIter;
}
};
class ConstIter
{
private:
ConstIter();
private:
const Utf8String * m_pStr;
utf8::iterator< const char * > m_oIter;
public:
ConstIter( const ConstIter & );
inline ConstIter( const Iter & oIter )
: m_pStr( oIter.m_pStr )
, m_oIter( m_pStr->m_sBuf, m_pStr->m_sBuf, m_pStr->m_sBuf + m_pStr->m_nSize )
{ }
inline ConstIter( const Utf8String * pStr )
: m_pStr( pStr )
, m_oIter( m_pStr->m_sBuf, m_pStr->m_sBuf, m_pStr->m_sBuf + m_pStr->m_nSize )
{ }
inline operator const char * () const
{
return m_oIter.base();
}
inline ConstIter & operator = ( const ConstIter & oIter )
{
m_pStr = oIter.m_pStr;
m_oIter = utf8::iterator< const char * >(
oIter.m_pStr->m_sBuf,
oIter.m_pStr->m_sBuf,
oIter.m_pStr->m_sBuf + oIter.m_pStr->m_nSize );
return *this;
}
inline ConstIter & operator = ( const Iter & oIter )
{
m_pStr = oIter.m_pStr;
m_oIter = utf8::iterator< const char * >(
m_pStr->m_sBuf,
m_pStr->m_sBuf,
m_pStr->m_sBuf + m_pStr->m_nSize );
return *this;
}
inline uchar32_t operator * () const
{
return *m_oIter;
}
inline ConstIter & operator ++ ()
{
++m_oIter;
return *this;
}
inline ConstIter & operator -- ()
{
--m_oIter;
return *this;
}
inline bool operator == ( const ConstIter & oIter )
{
return m_oIter == oIter.m_oIter;
}
inline bool operator != ( const ConstIter & oIter )
{
return m_oIter != oIter.m_oIter;
}
};
// More stuff
};
Which I'm using as follows:
// 1) Iterating over a non-const UTF-8 string :
Utf8String sStr = "not const";
for( Utf8String::Iter i = sStr.Begin(); i != sStr.End(); ++i )
{
}
// 2) Iterating over a const UTF-8 string :
const Utf8String sConstStr = "const";
for( Utf8String::ConstIter i = sConstStr.Begin(); i != sConstStr.End(); ++i )
{
}
// 3) Const iterators can also iterate over a non-const string :
for( Utf8String::ConstIter i = sStr.Begin(); i != sStr.End(); ++i )
{
}
The problem is that if the copy constructors of the iterator classes are not declared public, I get the following error, even though the copy constructor is never used explicitly:
Error 1 error C2248: 'core::Utf8String::Iter::Iter' : cannot access private member declared in class 'core::Utf8String::Iter' c:\xxx\main.cpp 20
Declaring these copy constructors public solves the problem.
What is happening? Is the compiler rewriting Utf8String::ConstIter i = sStr.Begin() into Utf8String::ConstIter i( sStr.Begin() ), or doing some other implicit optimization?
Thanks for your help. :)
EDIT: Using VS2005 and no C++11.
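Utf8String::ConstIter i = sStr.Begin(); is a declaration together with an initialization (copy-initialization). It is not an assignment, so operator= is never called. The initialization is done using the copy constructor: the compiler is allowed to optimize the copy away, but before C++17 the copy constructor must still be accessible, which is why a private one triggers C2248.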
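To see the difference between initialization and assignment, here is a small sketch with a hypothetical Tracker type (not part of the question's code) that counts which special member function runs. It copies from a named object, so the copy cannot be optimized away.

```cpp
#include <cassert>

// Hypothetical helper type: counts copy constructions and assignments.
struct Tracker {
    static int copies;
    static int assigns;
    Tracker() {}
    Tracker(const Tracker&) { ++copies; }
    Tracker& operator=(const Tracker&) { ++assigns; return *this; }
};
int Tracker::copies = 0;
int Tracker::assigns = 0;

void demoInitVsAssign() {
    Tracker a;

    Tracker b = a;   // declaration with '=': calls the copy constructor
    assert(Tracker::copies == 1 && Tracker::assigns == 0);

    Tracker c;
    c = a;           // existing object: calls operator=
    assert(Tracker::copies == 1 && Tracker::assigns == 1);

    (void)b;         // silence unused-variable warnings
}
```

If Tracker's copy constructor were private, the line Tracker b = a; would fail to compile in the same way as the iterator declarations in the question.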

Can you set a maximum limit to an integer (C++)?

If I never want an integer to go over 100, is there any simple way to make sure that the integer never exceeds 100, regardless of how much the user adds to it?
For example,
50 + 40 = 90
50 + 50 = 100
50 + 60 = 100
50 + 90 = 100
Try this:
std::min(50 + 40, 100);
std::min(50 + 50, 100);
std::min(50 + 60, 100);
std::min(50 + 90, 100);
http://www.cplusplus.com/reference/algorithm/min/
Another option would be to use this after each operation:
if (answer > 100) answer = 100;
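Both suggestions can be wrapped in a helper so that the cap is applied in one place. This is a sketch only: addCapped is a hypothetical name, and it assumes value is already at most cap and delta is non-negative. Comparing against the remaining headroom instead of computing value + delta first also avoids signed overflow for very large additions.

```cpp
#include <cassert>
#include <climits>

// Hypothetical helper: returns value + delta, but never more than cap.
// Assumes value <= cap and delta >= 0, so cap - value cannot overflow.
int addCapped(int value, int delta, int cap) {
    const int headroom = cap - value;          // how much can still be added
    return (delta > headroom) ? cap : value + delta;
}

// Usage, mirroring the examples from the question
void demoAddCapped() {
    assert(addCapped(50, 40, 100) == 90);
    assert(addCapped(50, 50, 100) == 100);
    assert(addCapped(50, 60, 100) == 100);
    assert(addCapped(50, INT_MAX, 100) == 100); // no overflow
}
```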
Here is a fairly simple and fairly complete example of a simple ADT for a generic BoundedInt.
It uses boost/operators to avoid writing tedious (const, non-assigning) overloads.
The implicit conversions make it interoperable.
I shunned the smart optimizations (the code therefore stayed easier to adapt to e.g. a modulo version, or a version that has a lower bound as well).
I also shunned the direct templated overloads to convert/operate on mixed instantiations (e.g. comparing two BoundedInt instantiations with different template arguments) for the same reason: you can probably rely on the compiler optimizing it to the same effect anyway.
Notes:
you need c++0x support to allow the default value for Max to take effect (constexpr support); Not needed as long as you specify Max manually
A very simple demonstration follows.
#include <algorithm>
#include <limits>
#include <iostream>
#include <boost/operators.hpp>
template <
typename Int=unsigned int,
Int Max=std::numeric_limits<Int>::max()>
struct BoundedInt : boost::operators<BoundedInt<Int, Max> >
{
BoundedInt(const Int& value) : _value(value) {}
Int get() const { return std::min(Max, _value); }
operator Int() const { return get(); }
friend std::ostream& operator<<(std::ostream& os, const BoundedInt& bi)
{ return os << bi.get() << " [hidden: " << bi._value << "]"; }
bool operator<(const BoundedInt& x) const { return get()<x.get(); }
bool operator==(const BoundedInt& x) const { return get()==x.get(); }
BoundedInt& operator+=(const BoundedInt& x) { _value = get() + x.get(); return *this; }
BoundedInt& operator-=(const BoundedInt& x) { _value = get() - x.get(); return *this; }
BoundedInt& operator*=(const BoundedInt& x) { _value = get() * x.get(); return *this; }
BoundedInt& operator/=(const BoundedInt& x) { _value = get() / x.get(); return *this; }
BoundedInt& operator%=(const BoundedInt& x) { _value = get() % x.get(); return *this; }
BoundedInt& operator|=(const BoundedInt& x) { _value = get() | x.get(); return *this; }
BoundedInt& operator&=(const BoundedInt& x) { _value = get() & x.get(); return *this; }
BoundedInt& operator^=(const BoundedInt& x) { _value = get() ^ x.get(); return *this; }
BoundedInt& operator++() { _value = get()+1; return *this; }
BoundedInt& operator--() { _value = get()-1; return *this; }
private:
Int _value;
};
Sample usage:
typedef BoundedInt<unsigned int, 100> max100;
int main()
{
max100 i = 1;
std::cout << (i *= 10) << std::endl;
std::cout << (i *= 6 ) << std::endl;
std::cout << (i *= 2 ) << std::endl;
std::cout << (i -= 40) << std::endl;
std::cout << (i += 1 ) << std::endl;
}
Demo output:
10 [hidden: 10]
60 [hidden: 60]
100 [hidden: 120]
60 [hidden: 60]
61 [hidden: 61]
Bonus material:
With a fully C++11 compliant compiler, you could even define a user-defined literal conversion:
typedef BoundedInt<unsigned int, 100> max100;
// integer literal operators must take unsigned long long
static max100 operator ""_b(unsigned long long i)
{
return max100(static_cast<unsigned int>(i));
}
So that you could write
max100 x = 123_b; // 100
int y = 2_b*60 - 30; // 70
Yes.
As a bare minimum, you could start with this:
#include <algorithm> // std::min
template <int T>
class BoundedInt
{
public:
// clamp on construction so the stored value never exceeds T
explicit BoundedInt(int startValue = 0) : m_value(std::min(startValue, T)) {}
operator int() const { return m_value; }
BoundedInt operator+(int rhs) const
{ return BoundedInt(m_value + rhs); }
private:
int m_value;
};
You can write your own class IntegerRange that includes overloaded operator+, operator-, etc. For an example of operator overloading, see the complex class here.
The simplest way would be to make a class that holds the value, rather than using an integer variable.
class LimitedInt
{
int value;
public:
LimitedInt() : value(0) {}
LimitedInt(int i) : value(i) { if (value > 100) value = 100; }
operator int() const { return value; }
LimitedInt & operator=(int i) { value = i; if (value > 100) value = 100; return *this; }
};
You might get into trouble with results not matching expectations. What should the result of this be, 70 or 90?
LimitedInt q = 2*60 - 30;
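The two candidate answers correspond to two different places where the limit is applied. A small sketch (kCap, clampAtEnd and clampEachStep are hypothetical names used only for this illustration):

```cpp
#include <algorithm>
#include <cassert>

const int kCap = 100;  // the limit from the question

// The whole expression is evaluated as plain int, then clamped once.
int clampAtEnd() {
    return std::min(2 * 60 - 30, kCap);   // min(90, 100) = 90
}

// The limit is applied after every intermediate step.
int clampEachStep() {
    const int product = std::min(2 * 60, kCap);  // 120 is cut down to 100
    return std::min(product - 30, kCap);         // 100 - 30 = 70
}

void demoClampOrder() {
    assert(clampAtEnd() == 90);
    assert(clampEachStep() == 70);
}
```

With the LimitedInt class as written, the right-hand side 2*60 - 30 is ordinary int arithmetic, so the constructor only sees 90 and q ends up as 90. Getting 70 would require every operand to be a limited type with overloaded arithmetic operators.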
C++ does not allow overriding (re-defining) operators on primitive types.
If making a custom integer class (as suggested above) does not fall under your definition of a "simple way", then the answer to your question is no, there is no simple way to do what you want.
I know it's an old post but I think it could still be useful for some people. I use this to set an upper and a lower bound:
bounded_value = max(min(your_value,upper_bound),lower_bound);
You could use it in a function like this:
float bounded_value(float x, float upper, float lower){
return std::max(std::min(x,upper),lower); // needs #include <algorithm>
}
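The same idea works for any comparable type if it is written as a template. This is a sketch under the assumption that lower <= upper; clampTo is a hypothetical name. From C++17 on, std::clamp in <algorithm> provides the same behavior.

```cpp
#include <cassert>

// Hypothetical generic helper: restricts value to the range [lower, upper].
// Assumes lower <= upper; only operator< is required of T.
template <typename T>
T clampTo(T value, T lower, T upper) {
    if (value < lower) return lower;
    if (upper < value) return upper;
    return value;
}

void demoClampTo() {
    assert(clampTo(150, 0, 100) == 100);        // above the range
    assert(clampTo(-5, 0, 100) == 0);           // below the range
    assert(clampTo(42, 0, 100) == 42);          // already inside
    assert(clampTo(1.5f, 0.0f, 1.0f) == 1.0f);  // works for float too
}
```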