How to use multiple cache layers in C++ (RAM, HDD, cold)

I have a simple POD data class like
struct hash{
char buffer[16];
};
I need a vector of a great many instances of it; it will surely not fit into RAM (20 PB). The data is conceptually grouped into a vector (tree). I want a pointer-like thing that hides RAM, filesystem, and cold storage behind a simple array/pointer-like interface (making filesystem operations invisible after initialisation, yet letting me give it multiple places to put data: RAM, fast SSD, SSD, HDD, tape, cloud drive locations).
How to do such thing in C++?

There is no support for this at the language level.
One solution would be to use a memory-mapped file; for example, see:
Creating a File Mapping Using Large Pages
If you need a more platform-independent solution, Boost supports memory-mapped files as well, in Boost.Interprocess (file_mapping, mapped_region) and Boost.Iostreams (mapped_file).
Besides that, you can always write a pointer-like facade object to manage the underlying logic (à la smart pointers).
template<class T>
struct MyMappedPointerType {
T& operator*(); // dereference - may throw
//implement rest of semantics
};

I think the usual would be to use some handle. Then when you want to access the object, you would pass the handle to a function which will load the memory and give you the address, and then you would close the handle. In C++ you would use RAII.
#include <string>
#include <cstdio>
#include <cstdint> // int64_t
template <class T>
class Access
{
private:
FILE* f= nullptr;
public:
Access(const std::string& filename)
{
f= fopen(filename.c_str(), "r+b"); // "rw" is not a valid mode; "r+b" opens an existing file for binary read/write
}
~Access()
{
if (f) fclose(f); // fopen may have failed
}
class WriteAccess
{
T buffer{};
bool dirty= false;
FILE* f;
int64_t elementNumber;
public:
WriteAccess(FILE* f, int64_t elementNumber)
: f(f)
, elementNumber(elementNumber)
{
if (f) {
fseek(f, elementNumber*sizeof(buffer), SEEK_SET);
fread(&buffer, sizeof(buffer), 1, f);
}
}
T& get() { dirty= true; return buffer; }
const T& get() const { return buffer; }
~WriteAccess()
{
if (dirty && f) {
fseek(f, elementNumber*sizeof(buffer), SEEK_SET);
fwrite(&buffer, sizeof(buffer), 1, f);
}
}
};
WriteAccess operator[] (int64_t elementNumber)
{
return WriteAccess(f, elementNumber);
}
};
struct SomeData
{
int a= 0;
int b= 0;
int c= 0;
};
int main()
{
Access<SomeData> myfile("thedata.bin");
myfile[0].get().a= 1;
auto pos1= myfile[1];
pos1.get().a= 10;
pos1.get().b= 10;
}
Of course, you would provide both read access and write access, probably use C++ file streams rather than fopen, and check for errors. You could also get rid of the get() function by adding a conversion operator to T.
Note that you could use some reference counting; in my simple example the Access object must outlive every WriteAccess object.
Also, you should lock if this is going to be used by more than one thread, and I assumed that you would not access the same element twice at the same time.
Or you could use memory-mapped file access, as already suggested.
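To make the layering idea concrete, here is a minimal sketch of a two-tier store: a small RAM cache in front of a seekable stream that stands in for the slower tier. TieredStore, HashRecord (same layout as the question's hash), slowReads and the naive eviction policy are made up for illustration and do not come from any library; error handling and the colder tiers (tape, cloud) are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iostream>
#include <sstream>
#include <unordered_map>

struct HashRecord { char buffer[16]; };   // same layout as the question's hash

// Hypothetical two-tier store: RAM cache in front of a seekable stream.
class TieredStore
{
    std::iostream& backing_;                               // slow tier (a file stream in practice)
    std::unordered_map<std::uint64_t, HashRecord> cache_;  // fast tier (RAM)
    std::size_t capacity_;                                 // max number of cached records

    void remember(std::uint64_t index, const HashRecord& h)
    {
        if (cache_.find(index) == cache_.end() && !cache_.empty() && cache_.size() >= capacity_)
            cache_.erase(cache_.begin());                  // naive eviction: drop an arbitrary entry
        cache_[index] = h;
    }

public:
    std::size_t slowReads = 0;                             // number of reads that missed the cache

    TieredStore(std::iostream& backing, std::size_t capacity)
        : backing_(backing), capacity_(capacity) {}

    HashRecord read(std::uint64_t index)
    {
        auto it = cache_.find(index);
        if (it != cache_.end()) return it->second;         // served from RAM
        HashRecord h{};
        backing_.clear();
        backing_.seekg(static_cast<std::streamoff>(index * sizeof(HashRecord)), std::ios::beg);
        backing_.read(h.buffer, sizeof(h.buffer));         // served from the slow tier
        ++slowReads;
        remember(index, h);
        return h;
    }

    void write(std::uint64_t index, const HashRecord& h)
    {
        // Write-through: the slow tier is always up to date, so eviction never loses data.
        backing_.clear();
        backing_.seekp(static_cast<std::streamoff>(index * sizeof(HashRecord)), std::ios::beg);
        backing_.write(h.buffer, sizeof(h.buffer));
        remember(index, h);
    }
};
```

The same shape extends to more tiers by letting the backing store itself be another TieredStore-like object; a real implementation would cache whole pages rather than single 16-byte records.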

Related

Save reference to void pointer in a vector during loop iteration

Guys I have a function like this (this is given and should not be modified).
void readData(int &ID, void*&data, bool &mybool) {
if(mybool)
{
std::string a = "bla";
std::string* ptrToString = &a;
data = ptrToString;
}
else
{
int b = 9;
int* ptrToint = &b;
data = ptrToint;
}
}
So I want to use this function in a loop and save the returned function parameters in a vector (for each iteration).
To do so, I wrote the following struct:
template<typename T>
struct dataStruct {
int id;
T** data; //I first has void** data, but would not be better to
// have the type? instead of converting myData back
// to void* ?
bool mybool;
};
My main.cpp then looks like this:
int main()
{
void* myData = nullptr;
std::vector<dataStruct> vec; // this line also doesn't compile. it need the typename
bool bb = false;
for(int id = 1 ; id < 5; id++) {
if (id%2) { bb = true; }
readData(id, myData, bb); //after this line myData point to a string
vec.push_back(id, &myData<?>); //how can I set the template param to be the type myData point to?
}
}
Or is there a better way to do that without template? I used c++11 (I can't use c++14)
The function that you say cannot be modified, i.e. readData(), is the one that should alert you!
It leads to undefined behavior: data is set to the address of a local variable, so it is a dangling pointer as soon as the function returns, and dereferencing it afterwards is undefined.
Let us leave aside the shenanigans of the readData function for now under the assumption that it was just for the sake of the example (and does not produce UB in your real use case).
You cannot directly store values with different (static) types in a std::vector. Notably, dataStruct<int> and dataStruct<std::string> are completely unrelated types, you cannot store them in the same vector as-is.
Your problem boils down to "I have data that is given to me in a type-unsafe manner and want to eventually get type-safe access to it". The solution to this is to create a data structure that your type-unsafe data is parsed into. For example, it seems that you intended your example data to have structure in the sense that there are pairs of int and std::string (note that your id%2 is not doing that because the else is missing and the bool is never set to false again, but I guess you wanted it to alternate).
So let's turn that bunch of void* into structured data:
std::pair<int, std::string> readPair(int pairIndex)
{
void* ptr;
std::pair<int, std::string> ret;
// Copying data here.
readData(2 * pairIndex + 1, ptr, false);
ret.first = *static_cast<int*>(ptr);
readData(2 * pairIndex + 2, ptr, true);
ret.second = *static_cast<std::string*>(ptr);
return ret;
}
int main()
{
std::vector<std::pair<int, std::string>> parsedData;
parsedData.push_back(readPair(0));
parsedData.push_back(readPair(1));
}
(I removed the references from the readData() signature for brevity - you get the same effect by storing the temporary expressions in variables.)
Generally speaking: Whatever relation between id and the expected data type is should just be turned into the data structure - otherwise you can only reason about the type of your data entries when you know both the current ID and this relation, which is exactly something you should encapsulate in a data structure.
Your readData isn't a useful function. Any attempt at using what it produces gives undefined behavior.
Yes, it's possible to do roughly what you're asking for without a template. To do it meaningfully, you have a couple of choices. The "old school" way would be to store the data in a tagged union:
struct tagged_data {
enum { T_INT, T_STR } tag;
union {
int x;
char *y;
} data;
};
This lets you store either a string or an int, and you set the tag to tell you which one a particular tagged_data item contains. Then (crucially) when you store a string into it, you dynamically allocate the data it points at, so it will remain valid until you explicitly free the data.
Unfortunately, C++03 doesn't allow non-POD types in a union. C++11 lifted that restriction (unrestricted unions), but you then have to construct and destroy the std::string member by hand, so if you went this route it is simpler to use a char * as above, not an actual std::string.
One way to remove (most of) those limitations is to use an inheritance-based model:
class Data {
public:
virtual ~Data() { }
};
class StringData : public Data {
std::string content;
public:
StringData(std::string const &init) : content(init) {}
};
class IntData : public Data {
int content;
public:
IntData(int init) : content(init) {}
};
This is somewhat incomplete, but I think probably enough to give the general idea--you'd have an array (or vector) of pointers to the base class. To insert data, you'd create a StringData or IntData object (allocating it dynamically) and then store its address into the collection of Data *. When you need to get one back, you use dynamic_cast (among other things) to figure out which one it started as, and get back to that type safely. All somewhat ugly, but it does work.
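As a rough illustration of how the inheritance-based approach might be finished, the sketch below stores the objects through std::unique_ptr (so nothing has to be freed by hand) and uses dynamic_cast to get back to the concrete type. The get() accessors, Summary and summarize are names invented for this example; it sticks to C++11, so there is no std::make_unique.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

class Data {
public:
    virtual ~Data() {}
};

class StringData : public Data {
    std::string content;
public:
    explicit StringData(std::string const& init) : content(init) {}
    std::string const& get() const { return content; }
};

class IntData : public Data {
    int content;
public:
    explicit IntData(int init) : content(init) {}
    int get() const { return content; }
};

// Hypothetical consumer: adds up the ints and concatenates the strings.
struct Summary { int intTotal; std::string text; };

Summary summarize(std::vector<std::unique_ptr<Data>> const& items)
{
    Summary s{0, ""};
    for (auto const& item : items) {
        // dynamic_cast returns nullptr when the object is not of the requested type.
        if (auto const* i = dynamic_cast<IntData const*>(item.get()))
            s.intTotal += i->get();
        else if (auto const* str = dynamic_cast<StringData const*>(item.get()))
            s.text += str->get();
    }
    return s;
}
```

Elements are added with, for example, items.push_back(std::unique_ptr<Data>(new IntData(9))).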
Even with C++11, you can use a template-based solution. For example, Boost.Variant can do this job quite nicely. It provides an overloaded constructor and value semantics, so you could do something like:
boost::variant<int, std::string> some_object("input string");
In other words, it's pretty much what you'd get if you spent the time and effort necessary to finish the inheritance-based code outlined above, except that it's dramatically cleaner, since it gets rid of the requirement to store a pointer to the base class, use dynamic_cast to retrieve an object of the correct type, and so on. In short, it's the right solution to the problem (until/unless you can upgrade to a newer compiler, and use std::variant instead).
Leaving aside the problems in the given code that are described in the comments and other answers, I will try to answer your question:
vec.push_back(id, &myData<?>); //how can I set the template param to be the type myData point to?
Before that, you need to change the definition of vec as follows:
vector<dataStruct<void>> vec;
Now you can simply push an element into the vector:
vec.push_back({id, &mydata, bb});
I have tried to modify your code so that it works:
#include<iostream>
#include<string>
#include<vector>
using namespace std;
template<typename T>
struct dataStruct
{
int id;
T** data;
bool mybool;
};
void readData(int &ID, void*& data, bool& mybool)
{
if (mybool)
{
data = new string("bla");
}
else
{
data = new int(0); // allocate on the heap so the pointer stays valid after the call
}
}
int main ()
{
void* mydata = nullptr;
vector<dataStruct<void>> vec;
bool bb = false;
for (int id = 0; id < 5; id++)
{
if (id%2) bb = true;
readData(id, mydata, bb);
vec.push_back({id, &mydata, bb});
}
}

What's the most simple way to read and write data from a struct to and from a file in c++ without serialization library?

I am writing a program that regularly stores and reads structs in the form below.
struct Node {
int leftChild = 0;
int rightChild = 0;
std::string value;
int count = 1;
int balanceFactor = 0;
};
How would I read and write nodes to a file? I would like to use the fstream class with seekg and seekp to do the serialization manually but I'm not sure how it works based off of the documentation and am struggling with finding decent examples.
[edit] specified that i do not want to use a serialization library.
This problem is known as serialization. Use a serialization library such as Google's Protocol Buffers or FlatBuffers.
To serialize objects, you will need to stick to the concept that the object is writing its members to the stream and reading members from the stream. Also, member objects should write themselves to the stream (as well as read).
I implemented a scheme using three member functions, and a buffer:
void load_from_buffer(uint8_t * & buffer_pointer);
void store_to_buffer(uint8_t * & buffer_pointer) const;
unsigned int size_on_stream() const;
The size_on_stream would be called first in order to determine the buffer size for the object (or how much space it occupied in the buffer).
The load_from_buffer function loads the object's members from a buffer using the given pointer. The function also increments the pointer appropriately.
The store_to_buffer function stores the objects's members to a buffer using the given pointer. The function also increments the pointer appropriately.
This can be applied to POD types by using templates and template specializations.
These functions also allow you to pack the output into the buffer, and load from a packed format.
The reason for I/O to the buffer is so you can use the more efficient block stream methods, such as write and read.
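A minimal sketch of the three-function scheme, applied to a made-up Record type (the type and its fields are assumptions for illustration, not the author's code). Each function copies the members field by field, so no padding ends up in the buffer, and each one advances the pointer past what it consumed.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

struct Record {                        // hypothetical example type
    std::int32_t id = 0;
    std::string  name;

    // Bytes this object occupies in the buffer: id + length prefix + characters.
    unsigned int size_on_stream() const
    {
        return static_cast<unsigned int>(sizeof(id) + sizeof(std::uint32_t) + name.size());
    }

    // Copies the members into the buffer and advances the pointer past them.
    void store_to_buffer(std::uint8_t*& p) const
    {
        std::memcpy(p, &id, sizeof(id));
        p += sizeof(id);
        std::uint32_t len = static_cast<std::uint32_t>(name.size());
        std::memcpy(p, &len, sizeof(len));
        p += sizeof(len);
        std::memcpy(p, name.data(), len);
        p += len;
    }

    // Reads the members back in the same order and advances the pointer.
    void load_from_buffer(std::uint8_t*& p)
    {
        std::memcpy(&id, p, sizeof(id));
        p += sizeof(id);
        std::uint32_t len = 0;
        std::memcpy(&len, p, sizeof(len));
        p += sizeof(len);
        name.assign(reinterpret_cast<char const*>(p), len);
        p += len;
    }
};
```

A caller would size a std::vector<std::uint8_t> with size_on_stream(), fill it with store_to_buffer, and hand the whole buffer to a single ostream::write call. The byte order is whatever the host uses, so the sketch assumes the file is read on a machine with the same endianness.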
Edit 1: Writing a node to a stream
The problem with writing or serializing a node (such as a linked list or tree node) is that pointers don't translate to a file. There is no guarantee that the OS will place your program in the same memory location or give you the same area of memory each time.
You have two options: 1) Only store the data. 2) Convert the pointers to file offsets. Option 2) is very complicated as it may require repositioning the file pointer because file offsets may not be known ahead of time.
Also, be aware of variable length records like strings. You can't directly write a string object to a file. Unless you use a fixed string width, the string size will change. You will either need to prefix the string with the string length (preferred) or use some kind of terminating character, such as '\0'. The string length first is preferred because you don't have to search for the end of the string; you can use a block read to read in the text.
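The length-prefix approach could look roughly like this. writeString and readString are names chosen for the example, and the 64-bit prefix is an arbitrary choice. Note that a string containing '\0' survives the round trip, which a terminator-based format could not offer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Writes the length first, then the raw characters (no terminator).
void writeString(std::ostream& out, std::string const& s)
{
    std::uint64_t len = s.size();
    out.write(reinterpret_cast<char const*>(&len), sizeof(len));
    out.write(s.data(), static_cast<std::streamsize>(len));
}

// Reads the length, then the characters with one block read.
// Returns false if the stream ends early.
bool readString(std::istream& in, std::string& s)
{
    std::uint64_t len = 0;
    if (!in.read(reinterpret_cast<char*>(&len), sizeof(len))) return false;
    s.resize(static_cast<std::size_t>(len));
    if (len > 0 && !in.read(&s[0], static_cast<std::streamsize>(len))) return false;
    return true;
}
```

A production version would sanity-check len before resizing, since a corrupted file could otherwise request an enormous allocation.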
If you replace the std::string by a char buffer, you can use fwrite and fread to write/read your structure to and from disk as a fixed size block of information. Within a single program that should work ok.
The big bug-a-boo is the fact that compilers will insert padding between fields in order to keep the data aligned. That makes the code less portable as if a module is compiled with different alignment requirements the structure literally can be a different size, throwing your fixed size assumption out the door.
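A sketch of the fixed-size-record idea, using streams with seekp/seekg instead of fwrite/fread. DiskNode, the 32-byte value width and the helper names are assumptions made for this example. The static_assert lines turn the padding concern into a compile-time check: if a compiler lays the struct out differently, the build fails instead of silently changing the file format.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <iostream>
#include <sstream>
#include <string>
#include <type_traits>

struct DiskNode {                      // hypothetical fixed-size variant of Node
    std::int32_t leftChild;
    std::int32_t rightChild;
    std::int32_t count;
    std::int32_t balanceFactor;
    char         value[32];            // fixed width, zero padded; longer strings are truncated
};

static_assert(std::is_trivially_copyable<DiskNode>::value, "must be safe to copy as raw bytes");
static_assert(sizeof(DiskNode) == 48, "unexpected padding would change the on-disk layout");

// Record i lives at byte offset i * sizeof(DiskNode).
void writeNode(std::ostream& out, std::uint64_t index, DiskNode const& n)
{
    out.seekp(static_cast<std::streamoff>(index * sizeof(DiskNode)), std::ios::beg);
    out.write(reinterpret_cast<char const*>(&n), sizeof(n));
}

bool readNode(std::istream& in, std::uint64_t index, DiskNode& n)
{
    in.seekg(static_cast<std::streamoff>(index * sizeof(DiskNode)), std::ios::beg);
    return static_cast<bool>(in.read(reinterpret_cast<char*>(&n), sizeof(n)));
}

DiskNode makeNode(std::string const& value, int left, int right)
{
    DiskNode n{};                      // zero-initialize, so value is always terminated
    n.leftChild = left;
    n.rightChild = right;
    n.count = 1;
    std::strncpy(n.value, value.c_str(), sizeof(n.value) - 1);
    return n;
}
```

With a real file you would open a std::fstream with std::ios::in | std::ios::out | std::ios::binary; leftChild and rightChild can then simply be record indices.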
I would lean toward a well worn in serialization library of some sort.
Another approach would be to overload the operator<< and operator>> for the structure so that it knows how to save/load itself. That would reduce the problem to knowing where to read/write the node. In theory, your left and right child fields could be seek addresses to where the nodes actually reside, while a new field could hold the seek location of the current node.
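For the operator<< / operator>> route, a text-based sketch might look like the following. The on-disk format (four integers, the string length, then the raw characters) is an assumption picked for the example; writing the length lets value contain spaces.

```cpp
#include <cassert>
#include <cstddef>
#include <iostream>
#include <sstream>
#include <string>

struct Node {
    int leftChild = 0;
    int rightChild = 0;
    std::string value;
    int count = 1;
    int balanceFactor = 0;
};

// Format: "<left> <right> <count> <balance> <length> <characters>\n"
std::ostream& operator<<(std::ostream& out, Node const& n)
{
    out << n.leftChild << ' ' << n.rightChild << ' ' << n.count << ' '
        << n.balanceFactor << ' ' << n.value.size() << ' ';
    out.write(n.value.data(), static_cast<std::streamsize>(n.value.size()));
    return out << '\n';
}

std::istream& operator>>(std::istream& in, Node& n)
{
    std::size_t len = 0;
    in >> n.leftChild >> n.rightChild >> n.count >> n.balanceFactor >> len;
    in.get();                                   // consume the single space after the length
    std::string value(len, '\0');
    if (len > 0) in.read(&value[0], static_cast<std::streamsize>(len));
    if (in) n.value = value;                    // only commit the string if everything was read
    return in;
}
```

Because records written this way vary in length, seeking to the n-th node needs a separate index of offsets (or the fixed-width layout discussed earlier).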
When implementing your own serialization method, the first decision you'll have to make is whether you want the data on disk to be in binary format or textual format.
I find it easier to implement the ability to save to a binary format. The number of functions needed to implement that is small. You need to implement functions that can write the fundamental types, arrays of known size at compile time, dynamic arrays and strings. Everything else can be built on top of those.
Here's something very close to what I recently put into production code.
#include <cstring>
#include <fstream>
#include <cstddef>
#include <cstdint> // uint64_t
#include <stdexcept>
#include <string>
// Class to write to a stream
struct Writer
{
std::ostream& out_;
Writer(std::ostream& out) : out_(out) {}
// Write the fundamental types
template <typename T>
void write(T number)
{
out_.write(reinterpret_cast<char const*>(&number), sizeof(number));
if (!out_ )
{
throw std::runtime_error("Unable to write a number");
}
}
// Write arrays whose size is known at compile time
template <typename T, uint64_t N>
void write(T (&array)[N])
{
for(uint64_t i = 0; i < N; ++i )
{
write(array[i]);
}
}
// Write dynamic arrays
template <typename T>
void write(T array[], uint64_t size)
{
write(size);
for(uint64_t i = 0; i < size; ++i )
{
write(array[i]);
}
}
// Write strings
void write(std::string const& str)
{
write(str.c_str(), str.size());
}
void write(char const* str)
{
write(str, std::strlen(str));
}
};
// Class to read from a stream
struct Reader
{
std::ifstream& in_;
Reader(std::ifstream& in) : in_(in) {}
template <typename T>
void read(T& number)
{
in_.read(reinterpret_cast<char*>(&number), sizeof(number));
if (!in_ )
{
throw std::runtime_error("Unable to read a number.");
}
}
template <typename T, uint64_t N>
void read(T (&array)[N])
{
for(uint64_t i = 0; i < N; ++i )
{
read(array[i]);
}
}
template <typename T>
void read(T*& array)
{
uint64_t size;
read(size);
array = new T[size];
for(uint64_t i = 0; i < size; ++i )
{
read(array[i]);
}
}
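void read(std::string& str)
{
// The stored characters have no terminating '\0', so read the length
// prefix and then the characters directly into the string.
uint64_t size;
read(size);
str.resize(size);
if (size > 0 && !in_.read(&str[0], size))
{
throw std::runtime_error("Unable to read a string.");
}
}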
};
// Test the code.
#include <iostream>
void writeData(std::string const& file)
{
std::ofstream out(file, std::ios::binary); // binary mode: no newline translation
Writer w(out);
w.write(10);
w.write(20.f);
w.write(200.456);
w.write("Test String");
}
void readData(std::string const& file)
{
std::ifstream in(file, std::ios::binary); // must match the mode used for writing
Reader r(in);
int i;
r.read(i);
std::cout << "i: " << i << std::endl;
float f;
r.read(f);
std::cout << "f: " << f << std::endl;
double d;
r.read(d);
std::cout << "d: " << d << std::endl;
std::string s;
r.read(s);
std::cout << "s: " << s << std::endl;
}
void testWriteAndRead(std::string const& file)
{
writeData(file);
readData(file);
}
int main()
{
testWriteAndRead("test.bin");
return 0;
}
Output:
i: 10
f: 20
d: 200.456
s: Test String
The ability to write and read a Node is very easily implemented.
void write(Writer& w, Node const& n)
{
w.write(n.leftChild);
w.write(n.rightChild);
w.write(n.value);
w.write(n.count);
w.write(n.balanceFactor);
}
void read(Reader& r, Node& n)
{
r.read(n.leftChild);
r.read(n.rightChild);
r.read(n.value);
r.read(n.count);
r.read(n.balanceFactor);
}
The process you are referring to is known as serialization. I'd recommend Cereal: http://uscilab.github.io/cereal/
It supports JSON, XML, and binary serialization and is very easy to use (with good examples).
(Unfortunately it does not support my favourite format, YAML.)

Why can't I wrap a T* in an std::vector<T>?

I have a T* addressing a buffer with len elements of type T. I need this data in the form of an std::vector<T>, for certain reasons. As far as I can tell, I cannot construct a vector which uses my buffer as its internal storage. Why is that?
Notes:
Please don't suggest I use iterators - I know that's usually the way around such issues.
I don't mind the vector having to copy data around if it's resized later.
This question especially baffles me now that C++ has move semantics. If we can pull an object's storage from under its feet, why not be able to shove in our own?
You can.
You write about std::vector<T>, but std::vector takes two template arguments, not just one. The second template argument specifies the allocator type to use, and vector's constructors have overloads that allow passing in a custom instance of that allocator type.
So all you need to do is write an allocator that uses your own internal buffer where possible, and falls back to asking the default allocator when your own internal buffer is full.
The default allocator cannot possibly hope to handle it, since it would have no clue on which bits of memory can be freed and which cannot.
A sample stateful allocator with an internal buffer containing already-constructed elements that should not be overwritten by the vector, including a demonstration of a big gotcha:
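#include <cstddef>
#include <iostream>
#include <memory>
#include <typeinfo>
#include <utility>
#include <vector>
struct my_allocator_state {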
void *buf;
std::size_t len;
bool bufused;
const std::type_info *type;
};
template <typename T>
struct my_allocator {
typedef T value_type;
my_allocator(T *buf, std::size_t len)
: def(), state(std::make_shared<my_allocator_state, my_allocator_state>({ buf, len, false, &typeid(T) })) { }
template <std::size_t N>
my_allocator(T(&buf)[N])
: def(), state(std::make_shared<my_allocator_state, my_allocator_state>({ buf, N, false, &typeid(T) })) { }
template <typename U>
friend struct my_allocator;
template <typename U>
my_allocator(my_allocator<U> other)
: def(), state(other.state) { }
T *allocate(std::size_t n)
{
if (!state->bufused && n == state->len && typeid(T) == *state->type)
{
state->bufused = true;
return static_cast<T *>(state->buf);
}
else
return def.allocate(n);
}
void deallocate(T *p, std::size_t n)
{
if (p == state->buf)
state->bufused = false;
else
def.deallocate(p, n);
}
template <typename...Args>
void construct(T *c, Args&&... args) // forwarding references, so std::forward below is meaningful
{
if (!in_buffer(c))
def.construct(c, std::forward<Args>(args)...);
}
void destroy(T *c)
{
if (!in_buffer(c))
def.destroy(c);
}
friend bool operator==(const my_allocator &a, const my_allocator &b) {
return a.state == b.state;
}
friend bool operator!=(const my_allocator &a, const my_allocator &b) {
return a.state != b.state;
}
private:
std::allocator<T> def;
std::shared_ptr<my_allocator_state> state;
bool in_buffer(T *p) {
return *state->type == typeid(T)
&& points_into_buffer(p, static_cast<T *>(state->buf), state->len);
}
};
int main()
{
int buf [] = { 1, 2, 3, 4 };
std::vector<int, my_allocator<int>> v(sizeof buf / sizeof *buf, {}, buf);
v.resize(3);
v.push_back(5);
v.push_back(6);
for (auto &i : v) std::cout << i << std::endl;
}
Output:
1
2
3
4
6
The push_back of 5 fits into the old buffer, so construction is bypassed. When 6 is added, new memory is allocated, and everything starts acting as normal. You could avoid that problem by adding a method to your allocator to indicate that from that point onward, construction should not be bypassed any longer.
points_into_buffer turned out to be the hardest part to write, and I've omitted that from my answer. The intended semantics should be obvious from how I'm using it. Please see my question here for a portable implementation in my answer there, or if your implementation allows it, use one of the simpler versions in that other question.
By the way, I'm not really happy with how some implementations use rebind in such ways that there is no avoiding storing run-time type info along with the state, but if your implementation doesn't need that, you could make it a bit simpler by making the state a template class (or a nested class) too.
The short answer is that a vector can't use your buffer because it wasn't designed that way.
It makes sense, too. If a vector doesn't allocate its own memory, how does it resize the buffer when more items are added? It allocates a new buffer, but what does it do with the old one? Same applies to moving - if the vector doesn't control its own buffer, how can it give control of this buffer to another instance?
These days - you no longer need to wrap a T* in an std::vector, you can wrap it with an std::span (in C++20; before that - use gsl::span). A span offers you all the convenience of a standard library container - in fact, basically all relevant features of std::vector excluding changes to the size - with a very thin wrapper class. That's what you want to use, really.
For more on spans, read: What is a "span" and when should I use one?
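If neither std::span nor gsl::span is available, the idea is small enough to sketch by hand. buffer_view below is a made-up, heavily reduced stand-in (no bounds checks, no subviews) that only shows the principle: the wrapper stores a pointer and a length, never owns or copies the buffer, and still offers a container-like interface.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>

// Hypothetical minimal stand-in for std::span / gsl::span.
template <typename T>
class buffer_view
{
    T*          data_;
    std::size_t size_;
public:
    buffer_view(T* data, std::size_t size) : data_(data), size_(size) {}

    T*          begin() const { return data_; }
    T*          end()   const { return data_ + size_; }
    std::size_t size()  const { return size_; }
    bool        empty() const { return size_ == 0; }
    T&          operator[](std::size_t i) const { return data_[i]; }
    T&          front() const { return data_[0]; }
    T&          back()  const { return data_[size_ - 1]; }
};

// Works with standard algorithms because begin()/end() are plain pointers.
int sum(buffer_view<const int> values)
{
    return std::accumulate(values.begin(), values.end(), 0);
}
```

Writes through the view, such as v[0] = 10, land in the original buffer, which is exactly what a std::vector<T> cannot offer for memory it did not allocate.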

Copying column info in C++

I have to copy column info from a database to a struct, the problem is that it takes over 5000 iterations and is very slow. Is there any better way?
The code used is in the .h file:
struct sFieldDef
{
CString m_strQualifier;
CString m_strOwner;
CString m_strTableName;
CString m_strColumnName;
int m_nDataType;
CString m_strTypeName;
long m_lPrecision;
long m_lLength;
int m_nScale;
int m_nRadix;
int m_nNullable;
};
The code used in the .cpp file:
sFieldDef sTempField;
CColumns rsColumns(m_pDatabase);
rsColumns.Open(CRecordset::snapshot);
while( !rsColumns.IsEOF() )
{
sTempField.m_strQualifier=rsColumns.m_strQualifier;
sTempField.m_strOwner=rsColumns.m_strOwner;
sTempField.m_strTableName=rsColumns.m_strTableName;
sTempField.m_strColumnName=rsColumns.m_strColumnName;
sTempField.m_nDataType=rsColumns.m_nDataType;
sTempField.m_strTypeName=rsColumns.m_strTypeName;
sTempField.m_lPrecision=rsColumns.m_lPrecision;
sTempField.m_lLength=rsColumns.m_lLength;
sTempField.m_nScale=rsColumns.m_nScale;
sTempField.m_nRadix=rsColumns.m_nRadix;
sTempField.m_nNullable=rsColumns.m_nNullable;
pArrFiels->Add(sTempField);
rsColumns.MoveNext();
}
You seem to be copying and storing everything in an array of structs, where each struct has identical members with the corresponding record. Usually we use arrays through iterators. So why not provide an iterator to your record-set and avoid copying altogether? You could roughly start like this:
template <typename RS>
class rs_iterator
{
RS& rs;
public:
rs_iterator(RS& rs) : rs{rs} { }
const RS& operator*() { return rs; }
rs_iterator& operator++() { return rs.MoveNext(), *this; }
// ...
};
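So you not only provide a convenient and standard interface to an array-like data source such as a record set, but you can also use it directly in STL-like algorithms. As sketched, it only moves forward, which is enough for algorithms that need input iterators; adding operator-- on top of MovePrev would make it bidirectional.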
If your CRecordset supports random access, then so would your iterator, easily. Otherwise, random access provision by itself is a good reason to copy (e.g. to sort columns).
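Since MFC is not available everywhere, the sketch below fills in the iterator against a made-up FakeColumns class that mimics only the IsEOF/MoveNext part of a record set. FakeColumns, rs_range and totalNameLength are illustration names, and the end-iterator convention (a null pointer) is one possible design, not the only one.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a CRecordset-derived class.
class FakeColumns
{
    std::vector<std::string> names_;
    std::size_t pos_ = 0;
    void load() { if (!IsEOF()) m_strColumnName = names_[pos_]; }
public:
    std::string m_strColumnName;       // field of the current record
    explicit FakeColumns(std::vector<std::string> names) : names_(std::move(names)) { load(); }
    bool IsEOF() const { return pos_ >= names_.size(); }
    void MoveNext() { ++pos_; load(); }
};

template <typename RS>
class rs_iterator
{
    RS* rs_;                           // nullptr marks the end iterator
public:
    explicit rs_iterator(RS* rs = nullptr) : rs_(rs && !rs->IsEOF() ? rs : nullptr) {}
    const RS& operator*() const { return *rs_; }
    rs_iterator& operator++()
    {
        rs_->MoveNext();
        if (rs_->IsEOF()) rs_ = nullptr;   // turn into the end iterator
        return *this;
    }
    bool operator!=(rs_iterator const& other) const { return rs_ != other.rs_; }
};

// Small adapter so the record set can be used in a range-based for loop.
template <typename RS>
struct rs_range
{
    RS& rs;
    rs_iterator<RS> begin() const { return rs_iterator<RS>(&rs); }
    rs_iterator<RS> end()   const { return rs_iterator<RS>(); }
};

// Reads each record in place; nothing is copied into an intermediate array.
std::size_t totalNameLength(FakeColumns& columns)
{
    std::size_t total = 0;
    for (FakeColumns const& row : rs_range<FakeColumns>{columns})
        total += row.m_strColumnName.size();
    return total;
}
```

This is a single-pass iterator: all copies share the one cursor inside the record set, so it cannot be used to revisit earlier records.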

How can I take ownership of a C++ std::string char data without copying and keeping std::string object?

How can I take ownership of std::string char data without copying and without keeping the source std::string object? (I want to use move semantics, but between different types.)
I use the C++11 Clang compiler and Boost.
Basically I want to do something equivalent to this:
{
std::string s("Possibly very long user string");
const char* mine = s.c_str();
// 'mine' will be passed along,
pass(mine);
//Made-up call
s.release_data();
// 's' should not release data, but it should properly destroy itself otherwise.
}
To clarify, I do need to get rid of std::string further down the road. The code deals with both string and binary data and should handle it in the same format. And I do want the data from std::string, because that comes from another code layer that works with std::string.
To give more perspective on where I run into wanting to do so: for example, I have an asynchronous socket wrapper that should be able to take both std::string and binary data from the user for writing. Both "API" write versions (taking std::string or raw binary data) internally resolve to the same (binary) write. I need to avoid any copying as the string may be long.
WriteId write( std::unique_ptr< std::string > strToWrite )
{
// Convert std::string data to contiguous byte storage
// that will be further passed along to other
// functions (also with the moving semantics).
// strToWrite.c_str() would be a solution to my problem
// if I could tell strToWrite to simply give up its
// ownership. Is there a way?
unique_ptr<std::vector<char> > dataToWrite= ??
//
scheduleWrite( dataToWrite );
}
void scheduledWrite( std::unique_ptr< std::vector<char> > data)
{
…
}
std::unique_ptr is used in this example to illustrate ownership transfer: any other approach with the same semantics is fine with me.
I am wondering about solutions to this specific case (with the std::string char buffer) and about this sort of problem in general: tips on moving buffers around between strings, streams, standard containers, and buffer types.
I would also appreciate tips and links on C++ design approaches and specific techniques for passing buffer data between different APIs/types without copying. I mention streams but do not use them because I'm shaky on that subject.
How can I take ownership of std::string char data without copying and without keeping the source std::string object? (I want to use move semantics, but between different types.)
You cannot do this safely.
For a specific implementation, and in some circumstances, you could do something awful like use aliasing to modify private member variables inside the string to trick the string into thinking it no longer owns a buffer. But even if you're willing to try this it won't always work. E.g. consider the small string optimization where a string does not have a pointer to some external buffer holding the data, the data is inside the string object itself.
If you want to avoid copying you could consider changing the interface to scheduledWrite. One possibility is something like:
template<typename Container>
void scheduleWrite(Container data)
{
// requires data[i], data.size(), and &data[n] == &data[0] + n for n in [0, size)
…
}
// move resources from object owned by a unique_ptr
WriteId write( std::unique_ptr< std::vector<char> > vecToWrite)
{
scheduleWrite(std::move(*vecToWrite));
}
WriteId write( std::unique_ptr< std::string > strToWrite)
{
scheduleWrite(std::move(*strToWrite));
}
// move resources from object passed by value (callers also have to take care to avoid copies)
WriteId write(std::string strToWrite)
{
scheduleWrite(std::move(strToWrite));
}
// assume ownership of raw pointer
// requires data to have been allocated with new char[]
WriteId write(char const *data,size_t size) // you could also accept an allocator or deallocation function and make ptr_adapter deal with it
{
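struct ptr_adapter {
std::unique_ptr<char const []> ptr;
size_t m_size;
char const &operator[] (size_t i) const { return ptr[i]; }
size_t size() const { return m_size; }
};
// unique_ptr's pointer constructor is explicit, so wrap the raw pointer explicitly
scheduleWrite(ptr_adapter{std::unique_ptr<char const []>(data), size});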
}
This class takes ownership of a string using move semantics and shared_ptr:
struct charbuffer
{
charbuffer()
{}
charbuffer(size_t n, char c)
: _data(std::make_shared<std::string>(n, c))
{}
explicit charbuffer(std::string&& str)
: _data(std::make_shared<std::string>(std::move(str))) // str is a named variable, so std::move is needed to avoid a copy
{}
charbuffer(const charbuffer& other)
: _data(other._data)
{}
charbuffer(charbuffer&& other)
{
swap(other);
}
charbuffer& operator=(charbuffer other)
{
swap(other);
return *this;
}
void swap(charbuffer& other)
{
using std::swap;
swap(_data, other._data);
}
char& operator[](int i)
{
return (*_data)[i];
}
char operator[](int i) const
{
return (*_data)[i];
}
size_t size() const
{
return _data->size();
}
bool valid() const
{
return static_cast<bool>(_data); // shared_ptr converts to bool only explicitly
}
private:
std::shared_ptr<std::string> _data;
};
Example usage:
std::string s("possibly very long user string");
charbuffer cb(std::move(s)); // s is now in a valid but unspecified state (typically empty)
// use charbuffer...
You could use polymorphism to resolve this. The base type is the interface to your unified data buffer implementation. Then you would have two derived classes. One for std::string as the source, and the other uses your own data representation.
struct MyData {
virtual void * data () = 0;
virtual const void * data () const = 0;
virtual unsigned len () const = 0;
virtual ~MyData () {}
};
struct MyStringData : public MyData {
std::string data_src_;
//...
};
struct MyBufferData : public MyData {
MyBuffer data_src_;
//...
};
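Filled in, the polymorphic approach might look like the sketch below. It assumes MyBuffer is something like std::vector<char>, trims the interface to the const accessors, and uses a made-up scheduleWrite that appends to a wire vector in place of a real socket write. The payload is moved into the wrapper, so the character data itself is not copied until the final write.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct MyData {
    virtual const void* data() const = 0;
    virtual std::size_t len() const = 0;
    virtual ~MyData() {}
};

struct MyStringData : public MyData {
    std::string data_src_;
    explicit MyStringData(std::string s) : data_src_(std::move(s)) {}
    const void* data() const override { return data_src_.data(); }
    std::size_t len() const override { return data_src_.size(); }
};

struct MyBufferData : public MyData {
    std::vector<char> data_src_;       // assumption: MyBuffer behaves like std::vector<char>
    explicit MyBufferData(std::vector<char> b) : data_src_(std::move(b)) {}
    const void* data() const override { return data_src_.data(); }
    std::size_t len() const override { return data_src_.size(); }
};

// Hypothetical sink standing in for the real socket write.
// Takes ownership of the payload and returns the number of bytes written.
std::size_t scheduleWrite(std::unique_ptr<MyData> payload, std::vector<char>& wire)
{
    const char* p = static_cast<const char*>(payload->data());
    wire.insert(wire.end(), p, p + payload->len());
    return payload->len();
}
```

The caller hands over the string with std::unique_ptr<MyData>(new MyStringData(std::move(s))); from then on the write path only sees data() and len() and does not care where the bytes came from.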