construct a vector in range without copying - c++

I have a class that wraps a big array of bytes holding network packets. The class implements a queue and provides (among others) a front() function that returns a const vector of the bytes that make up the oldest packet in the queue.
class Buffer{
unsigned char data[65536];
unsigned int offset;
unsigned int length;
[...]//other fields for maintaining write ptr etc.
public:
const std::vector<unsigned char> front(){
return std::vector<unsigned char>(data + offset, data + offset + length);
}
//other methods for accessing the queue like
//pop(), push(), clean() and so forth...
[...]
};
The above implementation of front() performs poorly because it needlessly copies the bytes of the current packet. Since the vector is const, there is no need to copy the data. What I want is to create a vector over the data already stored in the buffer. Of course, the vector's destructor should not deallocate that memory.

You have some options available to you:
Rather than returning a vector, just return a pointer to the first byte of the packet (the caller then needs the length as well):
const unsigned char* front() const {
return data + offset;
}
Consider using a standard container, such as a std::string, as the data member of your Buffer. This will allow you to do:
const std::string& front() const {
return data;
}
The best option, though, if you have C++17 or access to experimental::string_view, is to return a view of the current packet:
std::string_view front() const {
return std::string_view(reinterpret_cast<const char*>(data + offset), length);
}
Just a comment on convention: callers will expect front to behave as it does in other standard containers, where it:
Returns a reference to the first element in the container.
Calling front on an empty container is undefined.
[source]
Making front applicable to bare fixed-size arrays was also discussed by the C++ standards committee: front and back Proposal for iterators Library
As it stands, this method more closely resembles data, which:
Returns a pointer to the block of memory containing the elements of the container.
[source]

If you're looking to avoid unnecessary copying then you'll need to return a view into the data. You can either provide a front_begin() and front_end() set of functions:
const unsigned char *front_begin() const
{
return data + offset;
}
const unsigned char *front_end() const
{
return data + offset + length;
}
Or write a wrapper class:
class Data
{
private:
const unsigned char *m_Begin;
const unsigned char *m_End;
public:
Data(const unsigned char *begin, const unsigned char *end) : m_Begin(begin), m_End(end)
{
}
const unsigned char *begin() const
{
return m_Begin;
}
const unsigned char *end() const
{
return m_End;
}
};
And have your front() method return one of these:
Data front() const
{
return Data(data + offset, data + offset + length);
}
If you're using C++11 then you can use a Data instance in a ranged based for loop:
Data data = buffer.front();
for(char c : data)
{
// Do something with the data
}
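Putting the pieces together, a minimal self-contained sketch of the view approach might look like the following. The names ByteView, push() and sum_bytes() are made up for illustration, and the Buffer here is a reduced stand-in that holds a single packet rather than the question's full queue.

```cpp
#include <cstddef>
#include <cstring>

// Non-owning view over a contiguous run of bytes (hypothetical name).
class ByteView {
public:
    ByteView(const unsigned char* begin, const unsigned char* end)
        : begin_(begin), end_(end) {}
    const unsigned char* begin() const { return begin_; }
    const unsigned char* end() const { return end_; }
    std::size_t size() const { return static_cast<std::size_t>(end_ - begin_); }
private:
    const unsigned char* begin_;
    const unsigned char* end_;
};

// Reduced stand-in for the question's Buffer: one packet, no queue logic.
class Buffer {
public:
    // Copies a single packet into the internal array (illustration only).
    bool push(const unsigned char* bytes, std::size_t n) {
        if (n > sizeof(data_)) return false;
        std::memcpy(data_, bytes, n);
        offset_ = 0;
        length_ = n;
        return true;
    }
    // Returns a view: no bytes are copied, and destroying the view frees nothing.
    ByteView front() const {
        return ByteView(data_ + offset_, data_ + offset_ + length_);
    }
private:
    unsigned char data_[65536] = {};
    std::size_t offset_ = 0;
    std::size_t length_ = 0;
};

// Sums the bytes of a view using a range-based for loop.
unsigned sum_bytes(const ByteView& view) {
    unsigned total = 0;
    for (unsigned char b : view) total += b;
    return total;
}
```

The view is only valid while the Buffer is alive and the packet has not been popped or overwritten, which is the price of not copying.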

Related

C++, array of objects, customize where they are stored in memory

Currently I am working on an existing project (a DLL) which I have to extend.
For transport through the DLL I have a struct, for example 'ExternEntry',
and a struct which passes an array of it.
struct ExternEntry
{
unsigned int MyInt;
const wchar_t* Text;
};
struct ExternEntries
{
const ExternEntry* Data;
const unsigned int Length;
ExternEntries(const ExternEntry* ptr, const unsigned int size)
: Data(ptr)
, Length(size)
{
}
};
In the existing project architecture, this will be the first time that an array is passed to the DLL callers. So far the architecture doesn't allow arrays, and if a struct is passed to a caller, there is normally a wrapper struct for it (because of the string pointers).
Inside the DLL I need to wrap the ExternEntry so that it has a valid Text pointer.
struct InternEntry
{
ExternEntry Data;
std::wstring Text;
inline const ExternEntry* operator&() const { return &Data; }
void UpdateText() { Data.Text = Text.c_str(); }
};
struct InternEntries
{
std::vector<InternEntry> Data;
operator ExternEntries() const
{
return ExternEntries(Data.data()->operator&(), Data.size());
}
};
So the problem is what happens when the caller receives the ExternEntries and creates a vector again:
auto container = DllFuncReturnInternEntries(); // returns ExternEntries
std::vector<ExternEntry> v(container.Data, container.Data + container.Length);
The first element is valid. All other elements point to the wrong memory, because in memory each InternEntry's wstring Text sits between its Data and the Data of the next InternEntry.
Maybe I'm wrong about the reason why this can't work.
[Data][std::wstring][Data][std::wstring][Data][std::wstring]
Caller knows just about the size of the [Data]
So the vector is doing the following:
[Data][std::wstring][Data][std::wstring][Data][std::wstring] 
  |       |       |
 Get     Get     Get
instead of
[Data][std::wstring][Data][std::wstring][Data][std::wstring]
  |                   |                   |
 Get                 Get                 Get
Is there any way to customize how the vector stores InternEntry objects in memory?
For example Data,Data,Data in one place and wstring,wstring,wstring somewhere else.
I hope I have explained my problem well
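One possible way to get the layout asked about above is to keep the ExternEntry records in their own contiguous vector and the owning strings in a separate container. The sketch below is an assumption, not the project's actual design: the add(), data() and size() members are hypothetical, and std::deque is chosen for the strings because push_back on a deque does not invalidate references to existing elements, so the stored Text pointers stay valid.

```cpp
#include <cstddef>
#include <deque>
#include <string>
#include <utility>
#include <vector>

struct ExternEntry {
    unsigned int MyInt;
    const wchar_t* Text;
};

// Keeps the ExternEntry records contiguous and the owning strings elsewhere.
class InternEntries {
public:
    InternEntries() = default;
    // Copying would leave Text pointing into the other object's strings.
    InternEntries(const InternEntries&) = delete;
    InternEntries& operator=(const InternEntries&) = delete;

    void add(unsigned int value, std::wstring text) {
        texts_.push_back(std::move(text));
        // The string now lives in the deque and will not move again.
        entries_.push_back(ExternEntry{value, texts_.back().c_str()});
    }
    // Contiguous [Data][Data][Data] block that can be handed to callers.
    const ExternEntry* data() const { return entries_.data(); }
    std::size_t size() const { return entries_.size(); }
private:
    std::deque<std::wstring> texts_;    // owners of the characters
    std::vector<ExternEntry> entries_;  // plain records, tightly packed
};
```

With this layout the caller's std::vector<ExternEntry>(ptr, ptr + length) strides over records of exactly sizeof(ExternEntry), at the cost of keeping the InternEntries object alive for as long as the caller uses the Text pointers.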

Define custom comparator for use with priority_queue

I want to implement a minHeap in c++ for char[] buffers and am facing some problems with the implementation. My declaration of the priority queue is as follows (I am not sure if this will give me a maxHeap or a minHeap):
priority_queue<char[], vector<char[]>, comparePacketContents> receiveBuffer;
where comparePacketContents is:
struct comparePacketContents {
bool operator()(char lhs[], char rhs[]) const {
return atoi(TcpPacket::getBytes(lhs, 0, SEQUENCE_SIZE)) < atoi(TcpPacket::getBytes(rhs, 0, SEQUENCE_SIZE));
}
};
and TcpPacket::getBytes is:
char* TcpPacket::getBytes(char* buf, int start, int size) {
char* ans = (char *) malloc(sizeof(char)*size);
for (int i = 0; i < size; i++) {
*(ans + i) = *(buf + start + i);
}
return ans;
}
Basically I intend to get the first SEQUENCE_SIZE characters of the received packet and then create a heap ordered upon the value of the sequence number.
However, when I try to push a packet into this heap using:
receiveBuffer.push(buf);
It gives me the following error:
no instance of overloaded function "std::priority_queue<_Ty, _Container, _Pr>::push [with _Ty=char [], _Container=std::vector<char [], std::allocator<char []>>, _Pr=comparePacketContents]" matches the argument list
argument types are: (char [2048])
object type is: std::priority_queue<char [], std::vector<char [], std::allocator<char []>>, comparePacketContents>
What should I do to resolve this error?
The compilation error comes from the element type: char[] is an array of unknown bound, which is not a pointer and cannot be stored in a vector. If you declare the queue to hold pointers instead (priority_queue<char*, vector<char*>, comparePacketContents>), then push(buf) compiles, because the array decays to a pointer to its first element.
But probably that's not sufficient to fix all the problems you have, because it seems you are storing raw pointers to C-style strings without managing those allocations correctly. Instead, consider writing a class to hold your packets:
class Packet {
public:
Packet(char* data); // takes ownership of data, which must have been allocated with malloc()
uint32_t seqnum() const; // similar to existing implementation
// ...
private:
std::shared_ptr<char> m_data;
};
Packet::Packet(char* data) : m_data(data, free) {
}
bool operator<(const Packet& lhs, const Packet& rhs) {
return lhs.seqnum() < rhs.seqnum();
}
priority_queue<Packet> receiveBuffer;
In my example I assume you release packet buffers using the C free() function, but you can use any "deleter" in the C++ shared_ptr constructor, including one you write yourself.
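On the min-heap versus max-heap question: std::priority_queue puts the element that the comparator ranks highest on top, so a "less than" comparison gives a max-heap and a "greater than" comparison gives a min-heap. A minimal sketch, assuming a simplified Packet that stores its sequence number already parsed (the LaterSequence, ReceiveBuffer and drain names are made up here):

```cpp
#include <cstdint>
#include <queue>
#include <string>
#include <vector>

// Hypothetical packet type: the sequence number is stored already parsed.
struct Packet {
    std::uint32_t seqnum;
    std::string payload;
};

// Returns true when lhs should come out AFTER rhs, so the packet with the
// smallest sequence number ends up on top (min-heap behaviour).
struct LaterSequence {
    bool operator()(const Packet& lhs, const Packet& rhs) const {
        return lhs.seqnum > rhs.seqnum;
    }
};

using ReceiveBuffer =
    std::priority_queue<Packet, std::vector<Packet>, LaterSequence>;

// Pops every packet and returns the sequence numbers in pop order.
std::vector<std::uint32_t> drain(ReceiveBuffer& q) {
    std::vector<std::uint32_t> order;
    while (!q.empty()) {
        order.push_back(q.top().seqnum);
        q.pop();
    }
    return order;
}
```

Pushing packets with sequence numbers 7, 2 and 5 and then draining the queue yields them in ascending order.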

std::vector with const pointer to const object does not compile

I get a whole lot of errors from gcc when trying to compile this method. zones_ is a
std::map<int,std::vector<Zone const * const>>
That is a private member of MyClass.
//get unique zones
std::vector<Zone const* const> MyClass::getZones() const {
std::vector<Zone const * const> zones; //why can I not do this???
std::map<Zone const * const,int> zone_set;
for(auto & pair : zones_) {
for(Zone const * const z : pair.second) {
if(zone_set.count(z) == 0) {
zone_set[z] = 1;
zones.push_back(z); //cannot do this
}
}
}
return zones;
}
Can I have a vector of const pointers to const objects?
No, the element type cannot be const (it can be a pointer to a const type, though).
Also, there is no need for this: when you return a const reference to your vector, access to the elements goes through const_reference: http://www.cplusplus.com/reference/vector/vector/operator[]
No. In general, elements of vectors can't be const since they have to be moved when the vector's array needs reallocating. You'll have to store Zone const *; or perhaps use a different container type (list, deque or possibly set), if you really need constant elements.
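As a sketch of the suggested fix, the function from the question can be written with Zone const * elements, which are assignable and therefore valid vector elements. The Zone struct, the ZoneList alias and the free function uniqueZones are placeholders for illustration; a std::set is also swapped in for the question's map-as-set.

```cpp
#include <map>
#include <set>
#include <vector>

struct Zone { int id; };  // placeholder for the question's Zone type

// Pointer to const Zone: the pointee is read-only, the pointer is assignable.
using ZoneList = std::vector<Zone const*>;

// Collects each distinct Zone pointer once, in first-seen order.
ZoneList uniqueZones(const std::map<int, ZoneList>& zones_by_key) {
    ZoneList zones;
    std::set<Zone const*> seen;
    for (const auto& pair : zones_by_key) {
        for (Zone const* z : pair.second) {
            if (seen.insert(z).second) {  // true only on first insertion
                zones.push_back(z);
            }
        }
    }
    return zones;
}
```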

Is std::string an object?

I'm just looking into optimizing some std::map code. The map contains objects, accessed via a string identifier.
Example:
std::map<std::string, CVeryImportantObject> theMap;
...
theMap["second"] = new CVeryImportantObject();
Now, when using the find-function as theMap->find("second"), the String is converted into std::string("second"), which causes new string allocations (over all when using IDL=2 with Visual Studio).
1. Is there a possibility to use a string-only class to avoid such allocations?
Intentionally I've tried to use another String-Class as well:
std::map<CString, CVeryImportantObject> theMap;
This code works also. But CString indeed is an object.
And: If you remove an object from the map, I'll need to release both the related object and the key, do I?
Any suggestions?
Now, when using the find-function as theMap->find("second"), the
String is converted into std::string("second"), which causes new
string allocations (over all when using IDL=2 with Visual Studio).
This is a Standard issue, which is fixed in C++14 for ordered containers. The newest version of VS, VS 14 CTP (which is a pre-release) contains a fix for this issue, as will new versions of other implementations.
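The C++14 change referred to here is heterogeneous lookup: when the map's comparator is transparent, such as std::less<>, find() accepts any type that can be compared with the key. A minimal sketch, with CVeryImportantObject reduced to a placeholder and ObjectMap and lookup() being made-up names:

```cpp
#include <functional>
#include <map>
#include <string>

struct CVeryImportantObject { int value = 0; };  // placeholder type

// std::less<> is a transparent comparator (C++14); it enables the find()
// overload that takes any type comparable with the key.
using ObjectMap = std::map<std::string, CVeryImportantObject, std::less<>>;

// Looks up a C string key. The pointer is compared against the stored
// std::string keys directly, so no temporary std::string is constructed.
const CVeryImportantObject* lookup(const ObjectMap& m, const char* key) {
    auto it = m.find(key);
    return it == m.end() ? nullptr : &it->second;
}
```

The keys are still owning std::string objects, so none of the lifetime concerns of raw-pointer keys apply.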
If you need to avoid allocations, you can try a class like llvm::StringRef which can refer to std::string or string literals interchangably, but then you will be left trying to handle the ownership externally.
You can try something like unique_ptr<char[], maybe_delete> that sometimes deletes the contents. This is a bit of a mess to interface with though.
And: If you remove an object from the map, I'll need to release both
the related object and the key, do I?
The map will automatically destruct the key and value for you. For a class which frees its own resources like std::string, which is the only sane way to write C++, you can erase without worrying about resource cleanup.
If you always use string constants as keys, you can use const char * as key type in map when you use proper comparator:
struct PCharCompare {
bool operator()( const char *s1, const char *s2 ) const { return strcmp( s1, s2 ) < 0; }
};
std::map< const char *, CVeryImportantObject, PCharCompare> theMap;
Note: you have to be careful and need to understand how it works, as it can easily lead to UB:
void foo() {
char buffer[256];
snprintf( buffer, sizeof( buffer ), "blah" );
theMap.insert( std::make_pair( buffer, Object ) );
} // oops: the map now holds a dangling pointer to buffer
As for optimization, it is very unlikely that std::string creation is the culprit. You may try to use std::unordered_map or something similar for optimization.
Now, when using the find-function as theMap->find("second"), the
String is converted into std::string("second"), which causes new
string allocations
Not necessarily. VC uses Small-String Optimisation (SSO). This means that for a string as short as "second", no allocation on the heap should take place at all; the characters will instead be stored directly in the temporarily created std::string object.
This is still not free (because the std::string has to be created, albeit without any dynamic allocation happening inside), but should be good enough. Is it really a concern for you? Chances are very high that it does not cause any measurable performance decrease.
Is there a possibility to use a string-only class to avoid such allocations?
Not really, except for the C++14 fix mentioned in other answers. Using char const * as the key type is very dangerous, because std::map will only store the actual addresses, not copies of the keys.
If I were you and if I really experienced performance problems, I'd just not use std::map directly but create my own container class to wrap a std::map<char const *, T, CustomComparison> and do the hard pointer work inside.
template <class ValueType>
class FastStringMap
{
private:
struct Comparison
{
bool operator()(char const *lhs, char const *rhs) const
{
return strcmp(lhs, rhs) > 0;
}
};
typedef std::map<char const *, ValueType, Comparison> WrappedMap;
WrappedMap m_map;
public:
typedef typename WrappedMap::iterator iterator;
typedef typename WrappedMap::const_iterator const_iterator;
bool insert(char const *key, ValueType const &value)
{
if (m_map.find(key) != m_map.end())
{
return false;
}
else
{
char *copy = new char[strlen(key) + 1];
strcpy(copy, key);
try
{
return m_map.insert(std::make_pair(copy, value)).second;
}
catch (...)
{
delete[] copy; // allocated with new[], so it must be released with delete[]
throw;
}
}
}
~FastStringMap()
{
for (iterator iter = m_map.begin(); iter != m_map.end(); ++iter)
{
delete[] iter->first;
}
}
iterator find(char const *key)
{
return m_map.find(key);
}
const_iterator find(char const *key) const
{
return m_map.find(key);
}
// further operations
};
To be used like this:
FastStringMap<int> m;
m.insert("AAA", 1);
m.insert("BBB", 2);
m.insert("CCC", 3);
std::cout << m.find("AAA")->second;
Note that you can possibly make this more sophisticated by templatising also on the character type (for std::wstring support) or by providing "real" iterator classes (using Boost Iterator Facade).
And: If you remove an object from the map, I'll need to release both
the related object and the key, do I?
If you use std::string, no. If you use char const * and if the pointers point to memory allocated dynamically (as in my example), then yes.

How can I take ownership of a C++ std::string char data without copying and keeping std::string object?

How can I take ownership of std::string char data without copying and without keeping the source std::string object? (I want to use move semantics, but between different types.)
I use the C++11 Clang compiler and Boost.
Basically I want to do something equivalent to this:
{
std::string s("Possibly very long user string");
const char* mine = s.c_str();
// 'mine' will be passed along,
pass(mine);
//Made-up call
s.release_data();
// 's' should not release data, but it should properly destroy itself otherwise.
}
To clarify, I do need to get rid of the std::string further down the road. The code deals with both string and binary data and should handle them in the same format. And I do want the data from the std::string, because it comes from another code layer that works with std::string.
To give more perspective on where I run into wanting to do so: for example, I have an asynchronous socket wrapper that should be able to take both std::string and binary data from the user for writing. Both "API" write versions (taking std::string or raw binary data) internally resolve to the same (binary) write. I need to avoid any copying as the string may be long.
WriteId write( std::unique_ptr< std::string > strToWrite )
{
// Convert std::string data to contiguous byte storage
// that will be further passed along to other
// functions (also with the moving semantics).
// strToWrite.c_str() would be a solution to my problem
// if I could tell strToWrite to simply give up its
// ownership. Is there a way?
unique_ptr<std::vector<char> > dataToWrite= ??
//
scheduleWrite( dataToWrite );
}
void scheduleWrite( std::unique_ptr< std::vector<char> > data)
{
…
}
std::unique_ptr is used in this example to illustrate ownership transfer: any other approach with the same semantics is fine with me.
I am wondering about solutions to this specific case (with a std::string char buffer) and to this sort of problem with strings, streams and similar types in general: tips on how to approach moving buffers around between string, stream, std containers and buffer types.
I would also appreciate tips and links on C++ design approaches and specific techniques for passing buffer data around between different APIs/types without copying. I mention streams but do not use them, because I'm shaky on that subject.
How can I take ownership of std::string char data without copying and without keeping the source std::string object? (I want to use move semantics, but between different types.)
You cannot do this safely.
For a specific implementation, and in some circumstances, you could do something awful like use aliasing to modify private member variables inside the string to trick the string into thinking it no longer owns a buffer. But even if you're willing to try this it won't always work. E.g. consider the small string optimization where a string does not have a pointer to some external buffer holding the data, the data is inside the string object itself.
If you want to avoid copying you could consider changing the interface of scheduleWrite. One possibility is something like:
template<typename Container>
void scheduleWrite(Container data)
{
// requires data[i], data.size(), and &data[n] == &data[0] + n for n in [0, size)
…
}
// move resources from object owned by a unique_ptr
WriteId write( std::unique_ptr< std::vector<char> > vecToWrite)
{
scheduleWrite(std::move(*vecToWrite));
}
WriteId write( std::unique_ptr< std::string > strToWrite)
{
scheduleWrite(std::move(*strToWrite));
}
// move resources from object passed by value (callers also have to take care to avoid copies)
WriteId write(std::string strToWrite)
{
scheduleWrite(std::move(strToWrite));
}
// assume ownership of raw pointer
// requires data to have been allocated with new char[]
WriteId write(char const *data, size_t size) // you could also accept an allocator or deallocation function and make ptr_adapter deal with it
{
struct ptr_adapter {
std::unique_ptr<char const []> ptr;
size_t m_size;
char const &operator[] (size_t i) const { return ptr[i]; }
size_t size() const { return m_size; }
};
// unique_ptr's pointer constructor is explicit, so construct it by name
scheduleWrite(ptr_adapter{std::unique_ptr<char const []>(data), size});
}
This class takes ownership of a string using move semantics and shared_ptr:
struct charbuffer
{
charbuffer()
{}
charbuffer(size_t n, char c)
: _data(std::make_shared<std::string>(n, c))
{}
explicit charbuffer(std::string&& str)
: _data(std::make_shared<std::string>(std::move(str))) // str is an lvalue here; without std::move it would be copied
{}
charbuffer(const charbuffer& other)
: _data(other._data)
{}
charbuffer(charbuffer&& other)
{
swap(other);
}
charbuffer& operator=(charbuffer other)
{
swap(other);
return *this;
}
void swap(charbuffer& other)
{
using std::swap;
swap(_data, other._data);
}
char& operator[](int i)
{
return (*_data)[i];
}
char operator[](int i) const
{
return (*_data)[i];
}
size_t size() const
{
return _data->size();
}
bool valid() const
{
return _data != nullptr; // shared_ptr's conversion to bool is explicit
}
private:
std::shared_ptr<std::string> _data;
};
Example usage:
std::string s("possibly very long user string");
charbuffer cb(std::move(s)); // s is now in a valid but unspecified state (typically empty)
// use charbuffer...
You could use polymorphism to resolve this. The base type is the interface to your unified data buffer implementation. Then you would have two derived classes. One for std::string as the source, and the other uses your own data representation.
struct MyData {
virtual void * data () = 0;
virtual const void * data () const = 0;
virtual unsigned len () const = 0;
virtual ~MyData () {}
};
struct MyStringData : public MyData {
std::string data_src_;
//...
};
struct MyBufferData : public MyData {
MyBuffer data_src_;
//...
};
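To make the elided parts concrete, here is one way the derived classes might be filled in. This is only a sketch under assumptions: the interface is trimmed to the const accessors, std::size_t replaces unsigned for the length, std::vector<char> stands in for the unspecified MyBuffer type, and bytesQueued() is a made-up consumer that shows ownership being handed over through std::unique_ptr.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Common read-only interface over "some contiguous bytes".
struct MyData {
    virtual const void* data() const = 0;
    virtual std::size_t len() const = 0;
    virtual ~MyData() {}
};

// Wraps a std::string; the string is moved in, so its buffer is not copied.
struct MyStringData : MyData {
    explicit MyStringData(std::string s) : data_src_(std::move(s)) {}
    const void* data() const override { return data_src_.data(); }
    std::size_t len() const override { return data_src_.size(); }
    std::string data_src_;
};

// Wraps raw binary data; std::vector<char> stands in for MyBuffer.
struct MyVectorData : MyData {
    explicit MyVectorData(std::vector<char> v) : data_src_(std::move(v)) {}
    const void* data() const override { return data_src_.data(); }
    std::size_t len() const override { return data_src_.size(); }
    std::vector<char> data_src_;
};

// Hypothetical consumer: takes ownership of the buffer and reports how many
// bytes it would write. A real implementation would queue the buffer instead.
std::size_t bytesQueued(std::unique_ptr<MyData> buffer) {
    return buffer->len();
}
```

The std::string is never converted; it simply lives inside the wrapper until the write completes, which avoids the need to extract its buffer at all.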