I have a family of classes that contain only variables of the following types: std::string, int, and double. I need to be able to serialize/deserialize objects of these classes to/from a C string (null-terminated). I don't want to use a third-party serializer, and I don't want to write a full-featured serializer myself; I will serialize/deserialize only once in my code.
So, how do I write a very tiny yet elegant serializer, and do it fairly fast?
UPDATE
I've written and tested my own. Maybe it will be useful for someone. If you notice any bugs or have suggestions on how to make it better, let me know. Here it is:
#include <sstream>
#include <string>

typedef std::ostringstream ostr;
typedef std::istringstream istr;

const char delimiter = '\n';
const int doublePrecision = 15;

void Save(ostr& os, int x) { os << x << delimiter; }

void Save(ostr& os, double x)
{
    os.precision(doublePrecision); // print doubles with high precision
    os << x << delimiter;
}

// Note: assumes the string itself never contains the delimiter character.
void Save(ostr& os, const std::string& x) { os << x << delimiter; }

void Load(istr& is, int& x)
{
    is >> x;
    is.rdbuf()->sbumpc(); // consume the delimiter
}

void Load(istr& is, double& x)
{
    is >> x;
    is.rdbuf()->sbumpc(); // consume the delimiter
}

void Load(istr& is, std::string& x) { std::getline(is, x, delimiter); }
Test:
std::string a = "Test string 1 2 and 2.33";
std::string b = "45";
double c = 45.7;
int d = 58;
double e = 1.0/2048;
std::ostringstream os;
Save(os, a);
Save(os, b);
Save(os, c);
Save(os, d);
Save(os, e);
std::string serialized = os.str();
std::string aa;
std::string bb;
double cc = 0.0;
int dd = 0;
double ee = 0.0;
std::istringstream is(serialized);
Load(is, aa);
Load(is, bb);
Load(is, cc);
Load(is, dd);
Load(is, ee);
ASSERT(a == aa);
ASSERT(b == bb);
ASSERT(c == cc);
ASSERT(d == dd);
ASSERT(e == ee);
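For one of the classes in the family, wiring the helpers up is then just a member-by-member sequence. A minimal sketch (the Person class and its fields are invented for illustration):

struct Person
{
    std::string name;
    int age;
    double height;

    void Save(ostr& os) const
    {
        ::Save(os, name); // the free functions above
        ::Save(os, age);
        ::Save(os, height);
    }

    void Load(istr& is)
    {
        ::Load(is, name);
        ::Load(is, age);
        ::Load(is, height);
    }
};

Calling os.str().c_str() on the resulting stream then yields the null-terminated C string the question asks for.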
There are two ways of serializing string data to a stream: either do it C-style and use a null byte to terminate it, or (more portably and easier to read back) first output a byte that says how long the string is, then write the string itself.
Now, if you want to differentiate a string from a non-string (a number, in this case), you can prepend every "packet" (item) with a byte code, say 0x00 for int, 0x01 for double, 0x02 for string, and branch a switch on that code when reading. This way you can even write the int/double as raw bytes, so you won't lose precision, and you'll also end up with a smaller file.
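A minimal sketch of that tagging scheme (the tag values are the ones suggested above; the helper name is invented, and the raw-byte writes assume reader and writer share the same endianness):

#include <cstdint>
#include <sstream>
#include <string>

// Tag bytes as suggested: 0x00 = int, 0x01 = double, 0x02 = string.
void writeTagged(std::ostringstream& os, int x)
{
    os.put('\x00');
    os.write(reinterpret_cast<const char*>(&x), sizeof x); // raw binary int
}

void writeTagged(std::ostringstream& os, double x)
{
    os.put('\x01');
    os.write(reinterpret_cast<const char*>(&x), sizeof x); // raw binary double
}

void writeTagged(std::ostringstream& os, const std::string& x)
{
    os.put('\x02');
    // single length byte as described above; this limits strings to 255 chars
    std::uint8_t len = static_cast<std::uint8_t>(x.size() > 255 ? 255 : x.size());
    os.put(static_cast<char>(len));
    os.write(x.data(), len);
}

The reader would then switch on the first byte of each packet to decide how many bytes to pull next and how to interpret them.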
void save(std::ostringstream& out, const std::string& x)
{
out << x;
}
void read(std::istringstream& in, std::string& x)
{
in.str(x);
}
from here.
I know you don't want to use a third-party serializer, but if you would reconsider: use Boost.Serialization.
(Even if this is not the answer for you, it might be for someone else stumbling on this question.)
A very simple example:
class some_data
{
public:
    template<class Archive>
    void serialize(Archive & ar, const unsigned int version)
    {
        ar & my_string;
        ar & my_double;
    }

private:
    std::string my_string;
    double my_double;
};
and then to save:
some_data dataObject;
std::ofstream ofs("filename");
boost::archive::text_oarchive oa(ofs);
oa << dataObject;
or to load:
some_data dataObject;
std::ifstream ifs("filename");
boost::archive::text_iarchive ia(ifs);
ia >> dataObject;
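For this to compile you'll also need the corresponding headers, and to link against the boost_serialization library:

#include <fstream>
#include <boost/archive/text_oarchive.hpp>
#include <boost/archive/text_iarchive.hpp>
#include <boost/serialization/string.hpp> // required for std::string members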
See Sun's XDR format. It's binary and efficient.
I have a tiny custom class to do this, as an example:
Marshall& Marshall::enc(const string& str) {
    size_t size = str.size();
    size_t pad = (4 - (size % 4)) % 4; // XDR mandates padding to 4 bytes
    size_t size_on_buff = size + pad;

    space_for(sizeof(uint32_t) + size + pad);
    check_size_t_overflow(size);
    enc(static_cast<uint32_t>(size));

    memcpy(&(*buff)[pos], str.data(), size);
    memset(&(*buff)[pos + size], 0, pad);
    pos += size_on_buff;
    return *this;
}

Marshall& Marshall::dec(string& str) {
    str.clear();

    size_t size;
    dec(size);
    size_t pad = (4 - (size % 4)) % 4;
    size_t size_on_buff = size + pad;

    ck_space_avl(size + pad);
    str.assign((char*)&(*buff)[pos], size);
    pos += size_on_buff;
    return *this;
}
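For reference, the wire layout this implements for a string is a 4-byte length followed by the bytes, zero-padded to a multiple of 4 (XDR also specifies big-endian integers). A standalone sketch of just that layout, independent of the Marshall class above (the helper name is invented):

#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

std::vector<unsigned char> xdr_encode_string(const std::string& s)
{
    std::uint32_t size = static_cast<std::uint32_t>(s.size());
    std::size_t pad = (4 - size % 4) % 4;

    std::vector<unsigned char> out;
    out.reserve(4 + size + pad);
    // 4-byte big-endian length prefix
    for (int shift = 24; shift >= 0; shift -= 8)
        out.push_back(static_cast<unsigned char>((size >> shift) & 0xff));
    out.insert(out.end(), s.begin(), s.end());
    out.insert(out.end(), pad, static_cast<unsigned char>(0)); // zero padding
    return out;
}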
I'm unable to figure out why the output I receive isn't just "00110" but has other gibberish characters in it. I'm not sure what's wrong with my vector push_back; it makes sense to me. If I change it to a std::string implementation, it gives the correct output, but in this case I need to use a vector for proper encapsulation of the object's state. I've been debugging for a few hours now and still can't find the cause. Hope someone is able to help! Thanks! Note: main() can't be modified.
#include <climits>
#include <iostream>
#include <stdexcept>
#include <string>
#include <vector>

template<size_t NumBits>
class bitsetts
{
private:
    static const unsigned int NO_OF_BITS = CHAR_BIT * sizeof(int); // 32 bits
    static const unsigned NumBytes = (NumBits + 7) / 8;
    unsigned char array[NumBytes];

public:
    bitsetts() { }

    void set(size_t bit, bool val = true) {
        if (val == true)
        {
            array[bit] |= (val << bit);
        }
        else
        {
            array[bit] &= (val << bit);
        }
    }

    bool test(size_t bit) const {
        return array[bit] & (1U << bit);
    }

    const std::string to_string()
    {
        std::vector<char> str;
        for (unsigned int i = NumBits; i-- > 0;)
            str.push_back('0' + test(i));
        return str.data();
    }

    friend std::ostream& operator<<(std::ostream& os, const bitsetts& ob)
    {
        for (unsigned i = NumBits; i-- > 0;)
            os << ob.test(i);
        return os << '\n';
    }
};
int main()
{
    try
    {
        bitsetts<5> bitsetts;
        bitsetts.set(1);
        bitsetts.set(2);
        const std::string st = bitsetts.to_string();
        if (st != "00110")
        {
            std::cout << st << std::endl;
            throw std::runtime_error{ "-" };
        }
    }
    catch (const std::exception& exception)
    {
        std::cout << "Conversion failed\n";
    }
}
You are filling the std::vector with char values and then constructing a std::string from the raw char data using the std::string constructor that takes a single const char* parameter. That constructor expects the char data to be null-terminated, but you are not pushing a null terminator into your vector, which is why you get extra garbage on the end of your std::string.
So, either push a null terminator into the vector, eg:
const std::string to_string()
{
    std::vector<char> str;
    for (unsigned int i = NumBits; i-- > 0;)
        str.push_back('0' + test(i));
    str.push_back('\0'); // <-- add this!
    return str.data();
}
Or, use a different std::string constructor that can take the vector's size() as a parameter, eg:
const std::string to_string()
{
    std::vector<char> str;
    for (unsigned int i = NumBits; i-- > 0;)
        str.push_back('0' + test(i));
    return std::string(str.data(), str.size()); // <-- add size()!
}
On a side note: your to_string() method should be marked as const, eg:
const std::string to_string() const
Which would then allow you to use to_string() inside of your operator<<, eg:
friend std::ostream& operator<<(std::ostream& os, const bitsetts& b)
{
    return os << b.to_string() << '\n';
}
I'm writing a ProtectedPtr class that protects objects in memory using the Windows Crypto API, and I've run into a problem creating a generic constant-time compare function. My current code:
template <class T>
bool operator==(volatile const ProtectedPtr& other)
{
    std::size_t thisDataSize = sizeof(*protectedData) / sizeof(T);
    std::size_t otherDataSize = sizeof(*other) / sizeof(T);

    volatile auto thisData = (byte*)getEncyptedData();
    volatile auto otherData = (byte*)other.getEncyptedData();

    if (thisDataSize != otherDataSize)
        return false;

    volatile int result = 0;
    for (int i = 0; i < thisDataSize; i++)
        result |= thisData[i] ^ otherData[i];

    return result == 0;
}
The getEncyptedData function:
std::unique_ptr<T> protectedData;

const T& getEncyptedData() const
{
    ProtectMemory(true);
    return *protectedData;
}
The problem is the cast to byte*. When using this class with strings, my compiler complains that strings can't be cast to byte pointers. I was thinking of basing my function on Go's ConstantTimeByteEq function, but that still brings me back to my original problem of converting a template type to an int, or to something else I can perform binary manipulation on.
Go's ConstantTimeByteEq function:
func ConstantTimeByteEq(x, y uint8) int {
    z := ^(x ^ y)
    z &= z >> 4
    z &= z >> 2
    z &= z >> 1
    return int(z)
}
How can I easily convert a template type into something that can have binary manipulation easily performed on it?
UPDATE Working generic constant-time compare function, based on suggestions from lockcmpxchg8b:
// only works on primitive types, and types that don't have
// internal pointers pointing to dynamically allocated data
byte* serialize()
{
    const size_t size = sizeof(*protectedData);
    byte* out = new byte[size];

    ProtectMemory(false);
    memcpy(out, &(*protectedData), size);
    ProtectMemory(true);

    return out;
}

bool operator==(ProtectedPtr& other)
{
    const size_t size = sizeof(*protectedData);
    if (size != sizeof(*other))
        return false;

    volatile byte* thisData = serialize();
    volatile byte* otherData = other.serialize();

    volatile int result = 0;
    for (size_t i = 0; i < size; i++)
        result |= thisData[i] ^ otherData[i];

    // wipe the unencrypted copies of the data
    // (wipe the full buffers; sizeof on the pointers would only cover the
    // pointers themselves)
    SecureZeroMemory(const_cast<byte*>(thisData), size);
    SecureZeroMemory(const_cast<byte*>(otherData), size);
    delete[] const_cast<byte*>(thisData);
    delete[] const_cast<byte*>(otherData);

    return result == 0;
}
Generally, what you're trying to accomplish in your current code is called Format Preserving Encryption. I.e., to encrypt a std::string such that the resulting ciphertext is also a valid std::string. This is much harder than letting the encryption process convert from the original type to a flat array of bytes.
To do the conversion to a flat array, declare a second template argument for a "Serializer" object that knows how to serialize objects of type T into an array of unsigned char. You could default it to a generic sizeof/memcpy serializer that would work for all primitive types.
Here's an example for std::string.
#include <cstring>
#include <stdexcept>
#include <string>

template <class T>
class Serializer
{
public:
    virtual size_t serializedSize(const T& obj) const = 0;
    virtual size_t serialize(const T& obj, unsigned char *out, size_t max) const = 0;
    virtual void deserialize(const unsigned char *in, size_t len, T& out) const = 0;
};

class StringSerializer : public Serializer<std::string>
{
public:
    size_t serializedSize(const std::string& obj) const {
        return obj.length();
    }

    size_t serialize(const std::string& obj, unsigned char *out, size_t max) const {
        if (max >= obj.length()) {
            memcpy(out, obj.c_str(), obj.length());
            return obj.length();
        }
        throw std::runtime_error("overflow");
    }

    void deserialize(const unsigned char *in, size_t len, std::string& out) const {
        out = std::string((const char *)in, (const char *)(in + len));
    }
};
Once you've reduced the objects down to a flat array of unsigned chars, then your given constant-time compare algorithm will work just fine.
Here's a really dumbed-down version of your example code using the serializer above.
#include <iomanip>
#include <iostream>
#include <memory>

template <class T, class S>
class Test
{
    std::unique_ptr<unsigned char[]> protectedData;
    size_t serSize;

public:
    Test(const T& obj) : protectedData() {
        S serializer;
        size_t size = serializer.serializedSize(obj);
        protectedData.reset(new unsigned char[size]);
        serSize = serializer.serialize(obj, protectedData.get(), size);

        // "Encrypt"
        for (size_t i = 0; i < size; i++)
            protectedData.get()[i] ^= 0xa5;
    }

    size_t getEncryptedLen() const {
        return serSize;
    }

    const unsigned char *getEncryptedData() const {
        return protectedData.get();
    }

    const T getPlaintextData() const {
        S serializer;
        T target;

        // "Decrypt"
        for (size_t i = 0; i < serSize; i++)
            protectedData.get()[i] ^= 0xa5;
        serializer.deserialize(protectedData.get(), serSize, target);
        // re-"encrypt" so the buffer stays protected for later calls
        for (size_t i = 0; i < serSize; i++)
            protectedData.get()[i] ^= 0xa5;

        return target;
    }
};

int main(int argc, char *argv[])
{
    std::string data = "test";
    Test<std::string, StringSerializer> tester(data);

    const unsigned char *ptr = tester.getEncryptedData();
    std::cout << "\"Encrypted\" bytes: ";
    for (size_t i = 0; i < tester.getEncryptedLen(); i++)
        std::cout << std::setw(2) << std::hex << std::setfill('0') << (unsigned int)ptr[i] << " ";
    std::cout << std::endl;

    std::string recov = tester.getPlaintextData();
    std::cout << "Recovered: " << recov << std::endl;
}
Output:
$ ./a.out
"Encrypted" bytes: d1 c0 d6 d1
Recovered: test
Edit: answering a request for a generic serializer for primitive/flat types. Consider this as pseudocode, because I'm typing it into a browser without testing. I'm not sure if that's the right template syntax.
template<class T>
class PrimitiveSerializer : public Serializer<T>
{
public:
    size_t serializedSize(const T& obj) const {
        return sizeof obj;
    }

    size_t serialize(const T& obj, unsigned char *out, size_t max) const {
        if (max >= sizeof obj) {
            memcpy(out, &obj, sizeof obj);
            return sizeof obj;
        }
        throw std::runtime_error("overflow");
    }

    void deserialize(const unsigned char *in, size_t len, T& out) const {
        if (len < sizeof out) {
            throw std::runtime_error("underflow");
        }
        memcpy(&out, in, sizeof out);
    }
};
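Usage would then mirror the StringSerializer example above, e.g.:

Test<int, PrimitiveSerializer<int>> intTester(42);
int recovered = intTester.getPlaintextData(); // 42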
I'm curious about what error the compiler gives you.
That said, try casting to a const char* or const void*.
Another issue could be the cast from a 64-bit pointer to an 8-bit byte. Try casting to an int, long, or long long.
Edit: Based upon your feedback, another minor change:
volatile auto thisData = (byte*)&getEncyptedData();
volatile auto otherData = (byte*)&other.getEncyptedData();
(Note the ampersands.) That will allow the previous casts to work, since getEncyptedData() returns a reference rather than a pointer.
I would like to overload operator<< like this:
ostringstream oss;
MyDate a(2000, 1, 2);
oss << dateFormat("%Y/%m/%d") << a;
assert(oss.str() == "2000/01/02");
so that the date in a will be formatted according to the specified format. How do I achieve this?
In order to store custom state in a stream, you need to use the xalloc static function to get a unique index, and then either pword to get a pointer stored at that index, or iword to get an integer stored at that index (each is allocated separately for every stream it is used on). In your case, you will probably want pword. You can use the pointer returned by pword to point to a dynamically allocated object which stores the formatting information.
struct DateFormatter
{
    // The implementation of this class (e.g. parsing the format string)
    // is a separate issue. If you need help with it, you can ask another
    // question
    static int xalloc_index;
};

int DateFormatter::xalloc_index = std::ios_base::xalloc();

void destroy_date_formatter(std::ios_base::event evt, std::ios_base& io, int idx)
{
    if (evt == std::ios_base::erase_event) {
        void*& vp = io.pword(DateFormatter::xalloc_index);
        delete (DateFormatter*)(vp);
    }
}

DateFormatter& get_date_formatter(std::ios_base& io) {
    void*& vp = io.pword(DateFormatter::xalloc_index);
    if (!vp) {
        vp = new DateFormatter;
        io.register_callback(destroy_date_formatter, 0);
    }
    return *static_cast<DateFormatter*>(vp);
}

std::ostream& operator<<(std::ostream& os, const DateFormatter& df) {
    get_date_formatter(os) = df;
    return os;
}

std::ostream& operator<<(std::ostream& os, const MyDate& date)
{
    DateFormatter& df = get_date_formatter(os);
    // format output according to df
    return os;
}

int main() {
    MyDate a(2000, 1, 2);
    std::cout << DateFormatter("%Y/%m/%d") << a;
}
This is the standard method. It is terrible, in my opinion. I much prefer an alternative approach, which is to pass the date object together with the formatting as a single object. For example:
class DateFormatter
{
    const MyDate* date;
    std::string format_string;

public:
    DateFormatter(const MyDate& _date, std::string _format_string)
        : date(&_date)
        , format_string(_format_string)
    {}

    friend std::ostream& operator<<(std::ostream& os, const DateFormatter& df) {
        // handle formatting details here
        return os;
    }
};

int main() {
    MyDate a(2000, 1, 2);
    std::cout << DateFormatter(a, "%Y/%m/%d");
}
Or you can do something like this (using a static variable):
#include <iostream>
#include <string>

struct MyDate
{
    MyDate(int y, int m, int d): year{y}, month{m}, day{d} {}

    int year{};
    int month{};
    int day{};
};

class DateFormatter
{
public:
    DateFormatter(const std::string & format)
    {
        format_ = format;
    }

    static const std::string & format()
    {
        return format_;
    }

private:
    static std::string format_;
};

std::string DateFormatter::format_ = {"Default Format"};

std::ostream & operator<< (std::ostream & stream, const DateFormatter &)
{
    return stream;
}

std::ostream & operator<< (std::ostream & stream, const MyDate & date)
{
    auto currentFormat = DateFormatter::format();
    // some code using the current format ...
    return stream << currentFormat << " - " << date.year << "/" << date.month << "/" << date.day;
}

int main(void)
{
    MyDate date{2016, 4, 18};

    std::cout << date << std::endl;
    std::cout << DateFormatter("New format") << date << std::endl;

    return 0;
}
Possible Duplicate:
converting string to int in C++
I have tried to include stdlib.h, but it doesn't work.
In C++98 you can do:
std::string str("1234");
int i;
std::stringstream ss(str);
ss >> i;
In C++11 you should do:
std::string str("1234");
int i=std::stoi(str);
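Note that std::stoi throws std::invalid_argument if the string isn't a number, and std::out_of_range if the value doesn't fit in an int, so you may want to wrap it. A small sketch (the wrapper name and fallback behavior are just one choice):

#include <stdexcept>
#include <string>

int toIntOr(const std::string& str, int fallback)
{
    try {
        return std::stoi(str);
    } catch (const std::invalid_argument&) {
        return fallback; // not a number at all
    } catch (const std::out_of_range&) {
        return fallback; // number too large for an int
    }
}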
A more C++-way of doing this:
#include <sstream>
#include <string>
// Converts a string to anything.
template<typename T>
T to(const std::string& s)
{
    std::stringstream ss(s);
    T ret;
    ss >> ret;
    return ret;
}

// And with a default value for non-convertible strings:
template<typename T>
T to(const std::string& s, T default_)
{
    std::stringstream ss(s);
    T ret;
    if (ss >> ret)
        return ret;
    return default_; // extraction failed (since C++11 it would have zeroed ret)
}
Use it as follows:
int i = to<int>("123");
assert(i == 123);
int j = to<int>("Not an integer", 123);
assert(j == 123);
And extend it to support your own arbitrary types:
struct Vec3 { float x, y, z; };

template<class T>
T& operator>>(T& f, Vec3& v) {
    f >> v.x >> v.y >> v.z;
    return f;
}

// Somewhere else:
Vec3 v = to<Vec3>("1.0 2.0 3.0");
The idea is to overload operator* so it can multiply two strings representing the decimal value of a number. The operator is part of a bigger class, but that is not important. The algorithm is the same as in elementary school :)
Here's my code:
Bignumber operator* (Bignumber x, Bignumber y) {
    int i, j, transfer = 0, tmp, s1, s2, k;
    char add[1];
    string sol;
    string a, b;
    Bignumber v1, v2;

    a = x.GetValue();
    b = y.GetValue();
    a.insert(0, "0");
    b.insert(0, "0");

    for (i = a.length() - 1; i >= 0; i--) {
        s1 = (int) a[i] - 48;
        for (k = a.length() - i - 1; k > 0; k--) {
            sol += "0";
        }
        for (j = b.length() - 1; j >= 0; j--) {
            s2 = (int) b[j] - 48;
            tmp = s1 * s2 + transfer;
            if (tmp >= 10) {
                transfer = tmp / 10;
                tmp = tmp - (10 * transfer);
            }
            itoa(tmp, add, 10);
            sol.insert(0, add);
        }
        v1 = sol;
        v2 = v1 + v2;
        sol.erase(0);
        transfer = 0;
    }
    return v2;
}
This works fine most of the time, but for some values it doesn't work properly. For example, 128*28 returns 4854 instead of 3584.
Any idea what might be the problem?
operators + and = are already overloaded for the class Bignumber and they work fine.
While my first answer solves your issue (by my testing, anyway), here's an alternative implementation; I don't have your Bignumber class so I wrote a small fake one to test with:
#include <string>
#include <ios>
#include <iostream>
#include <ostream>
#include <sstream>

class Bignumber
{
    static inline unsigned long long strtoull(std::string const& str)
    {
        unsigned long long val;
        return std::istringstream(str) >> val ? val : 0uLL;
    }

    unsigned long long val_;

public:
    Bignumber() : val_() { }
    explicit Bignumber(unsigned long long const val) : val_(val) { }
    explicit Bignumber(std::string const& str) : val_(strtoull(str)) { }

    Bignumber& operator +=(Bignumber const rhs)
    {
        val_ += rhs.val_;
        return *this;
    }

    std::string GetValue() const
    {
        std::ostringstream oss;
        oss << val_;
        return oss.str();
    }
};

Bignumber operator *(Bignumber const x, Bignumber const y)
{
    typedef std::string::const_reverse_iterator cr_iter_t;

    std::string const& a = '0' + x.GetValue();
    std::string const& b = '0' + y.GetValue();

    Bignumber ret;
    for (cr_iter_t a_iter = a.rbegin(), a_iter_end = a.rend(); a_iter != a_iter_end; ++a_iter)
    {
        unsigned transfer = 0u;
        std::string sol(a.end() - a_iter.base(), '0');
        for (cr_iter_t b_iter = b.rbegin(), b_iter_end = b.rend(); b_iter != b_iter_end; ++b_iter)
        {
            unsigned tmp = static_cast<unsigned>(*a_iter - '0') * static_cast<unsigned>(*b_iter - '0') + transfer;
            if (tmp >= 10u)
            {
                transfer = tmp / 10u;
                tmp -= transfer * 10u;
            }
            sol.insert(sol.begin(), static_cast<char>(tmp + '0'));
        }
        ret += Bignumber(sol);
    }
    return ret;
}

int main()
{
    Bignumber const z = Bignumber(123456789uLL) * Bignumber(987654321uLL);
    std::cout << std::boolalpha << (z.GetValue() == "121932631112635269") << std::endl;
}
itoa null-terminates the string it writes, so add is too small for the data being written to it, resulting in memory corruption. Change the definition of add to char add[2]; and it should work.
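With that fix, the declaration becomes:

char add[2]; // one digit plus the null terminator itoa writes

Alternatively, since tmp is always a single digit 0-9 at that point, you can avoid the non-standard itoa altogether and insert the character directly:

sol.insert(0, 1, static_cast<char>('0' + tmp));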