In my attempt to create a custom string class without dynamic memory, I'm using the template array-length trick. Because the size is passed as a template parameter, it is known at compile time, so char buffer[n] is not a variable-length array. Is this a correct line of thinking? Here's the code:
#include <cstddef> // size_t
#include <cstring> // strcpy

template<typename T, size_t n>
class string_test
{
    using Type = T;
    // Don't count NUL byte
    size_t _size = n - 1;
    char buffer[n]; // <--- variable length array?
public:
    string_test(const char* str)
    {
        strcpy(buffer, str);
    }
    size_t size() const
    {
        return _size;
    }
    const char* c_str() const
    {
        return buffer;
    }
};
template<typename T, size_t n>
string_test<T, n> make_string(const T (&str)[n])
{
return string_test<T, n>(str);
}
Hopefully by this method all memory is on stack and I don't run into any issues with new, delete and so on.
Yes, your thinking is correct: buffer is not a VLA.
Hopefully by this method all memory is on stack and I don't run into any issues with new, delete and so on.
This is also correct in the sense that you don't need to manage any memory by hand.
One (potentially significant) wrinkle is that string_test<T, m> and string_test<T, n> are different types when m != n.
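For illustration, here is a hypothetical snippet built on the make_string helper from the question:

// Each capacity is a distinct type, so a and b below have unrelated types.
auto a = make_string("hi");    // string_test<char, 3>
auto b = make_string("hello"); // string_test<char, 6>
// a = b; // error: string_test<char, 6> does not convert to string_test<char, 3>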
In general it would seem more appropriate to simply use std::vector<T>. This will lead to straightforward yet correct code with little scope for memory errors.
The code and the thinking are not wrong, and it will work as is.
I would question the motive and the means to achieve it. The class is not safe in terms of bounds checking. For instance, if the ctor is given a string exceeding the capacity, BANG.
Heap memory management is as efficient as it can be, and I do not foresee performance problems for 99.99% of applications. If you REALLY are trying to squeeze out the last drops of CPU performance, perhaps the STL is not the right choice; you should limit yourself to classic C-style programming with hand-crafted, optimized algorithms.
Last, I second PaulMcKenzie above: if you want to customize memory management for any of the containers (string included), use the std class with a custom allocator. You can construct an allocator with an automatic buffer on the stack and make it use that buffer.
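A rough sketch of that last suggestion, assuming C++17's polymorphic allocators are available (the 256-byte buffer size is arbitrary):

#include <array>
#include <memory_resource>
#include <string>

int main()
{
    std::array<char, 256> buf;                        // automatic storage, no heap
    std::pmr::monotonic_buffer_resource pool(
        buf.data(), buf.size(),
        std::pmr::null_memory_resource());            // no fallback to the heap
    std::pmr::string s("hello, stack buffer", &pool); // characters live in buf
}

If the string outgrows the buffer, the null upstream resource makes the allocation fail loudly instead of silently hitting the heap.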
Related
I am writing an implementation for a neural network, and I am passing in the number of nodes in each layer into the constructor. Here is my constructor:
class Network {
public:
template<size_t n>
Network(int inputNodes, int (&hiddenNodes)[n], int outputNodes);
};
I am wondering if it is bad practice to use templates to specify array size. Should I be doing something like this instead?
class Network {
public:
Network(int inputNodes, int numHiddenLayers, int* hiddenNodes, int outputNodes);
};
Templates are necessary when you want to write something that works with variable types. You don't need them when you just want to pass a value of a given type. So one argument against using a template for this is to keep things simple.
Another problem with the template approach is that you can only pass in a constant value for the size. You can't write:
size_t n;
std::cin >> n;
Network<n> network(...); // compile error
A third issue with the template approach is that the compiler has to instantiate a specialization of the function for every size you use. For small values of n, that might give some benefit, because the compiler can optimize each specialization better when it knows the exact value (for example, by unrolling loops), but for large values it will probably not be able to optimize any better than if it didn't know the size. Having multiple specializations also means your CPU's instruction cache is thrashed more easily, and your program's binary is larger and thus uses more disk space and memory.
So it likely is much better to pass the size as a variable, or instead of using a size and a pointer to an array, use a (reference to an) STL container, or if you can use C++20, consider using std::span.
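To make the std::span suggestion concrete, here is a hedged sketch (C++20; the layer sizes are made up):

#include <span>
#include <vector>

// std::span carries the pointer and the length together and converts
// implicitly from built-in arrays, std::array and std::vector,
// so Network itself needs no template parameter.
class Network {
public:
    Network(int inputNodes, std::span<const int> hiddenNodes, int outputNodes)
    {
        // hiddenNodes.size() hidden layers, hiddenNodes[i] nodes in layer i
    }
};

int main()
{
    int hidden[] = {4, 8, 4};
    std::vector<int> alsoHidden{16, 16};
    Network a(2, hidden, 1);     // from a built-in array
    Network b(2, alsoHidden, 1); // from a std::vector
}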
Use std::span<int> or write your own.
#include <array>
#include <cstddef>
#include <vector>

struct int_span {
    int* b = 0;
    int* e = 0;
    // iteration:
    int* begin() const { return b; }
    int* end() const { return e; }
    // container-like access:
    int& operator[](std::size_t i) const { return begin()[i]; }
    std::size_t size() const { return end()-begin(); }
    int* data() const { return begin(); }
    // implicit constructors from various contiguous buffers:
    template<std::size_t N>
    int_span( int(&arr)[N] ):int_span( arr, N ) {}
    template<std::size_t N>
    int_span( std::array<int, N>& arr ):int_span( arr.data(), N ) {}
    template<class A>
    int_span( std::vector<int, A>& v ):int_span(v.data(), v.size()) {}
    // from a pair of pointers, or pointer+length:
    int_span( int* s, int* f ):b(s),e(f) {}
    int_span( int* s, std::size_t len ):int_span(s, s+len) {}
    // special member functions. Copy is enough:
    int_span() = default;
    // this is a view type; assignment and copy copy the selection,
    // not the contents:
    int_span(int_span const&) = default;
    int_span& operator=(int_span const&) = default;
};
There we go: an int_span which represents a view into a contiguous buffer of ints of some size.
class Network {
public:
Network(int inputNodes, int_span hiddenNodes, int outputNodes);
};
From the way you write the second function argument
int (&hiddenNodes)[n]
I guess you're not an experienced C/C++ programmer. The point is that n will be ignored by the compiler, and you'll lose any possibility to verify that the size of the C-style array you pass here and the n given as the template parameter are equal, or at least coherent with each other.
So, forget about templates. Go std::vector<int>.
The only advantage of using a template (or std::array) here is that the compiler might optimize your code better than with std::vector. The chances that you'll be able to exploit this are, however, very small, and even if you succeed, the speedup will most likely be hardly measurable.
The advantage of std::vector is that it is practically as fast and easy to use as std::array, yet far more flexible (its size is adjustable at runtime). If you go with std::array or templates and your program is going to use hidden layers of different sizes, soon you'll have to turn other parts of your program into templates, and it's likely that rather than implementing your neural network, you'll find yourself fighting with templates. It's not worth it.
However, once you have a working implementation of your NN based on std::vector, you can THEN consider optimizing it, which may include std::array or templates. But I'm 99.999% sure you'll stay with std::vector.
I've never implemented a neural network, but I have done a lot of time-consuming simulations. The first choice is always std::vector, and only if one has special, well-defined requirements for the data container does one use other containers.
Finally, keep in mind that std::array stores its elements inline (on the stack when it's a local variable), whereas std::vector allocates its elements on the heap. The heap is much larger, and in some scenarios this is a crucial factor to consider.
EDIT
In short:
If an array size may vary freely, never pass its value as a template parameter. Use std::vector.
If it can take on 2, 3, 4, perhaps 5 sizes from a fixed set, you CAN consider std::array, but std::vector will most likely be as efficient and the code will be simpler.
If the array will always be of the same size, known at compile time, and the limited size of the function stack is not an issue, use std::array.
I'm implementing a super simple container for long term memory management, and the container will have inside an array.
I was wondering, what are the actual implications of those two approaches below?
template<class T, size_t C>
class Container
{
public:
T objects[C];
};
And:
template<class T>
class Container
{
public:
Container(size_t cap)
{
this->objects = new T[cap];
}
~Container()
{
delete[] this->objects;
}
T* objects;
};
Keep in mind that those are minimal examples and I'm not taking into account things like storing the capacity, the virtual size, etc.
If the size of the container is known at compile time, as in the first example, you are better off using std::array. For instance:
template<class T, size_t C>
class Container
{
public:
std::array<T, C> objects;
};
This has important advantages:
You can access its elements via std::get, which automatically checks that the access is within bounds, at compile time.
You have iterators for Container::objects, so you can use all the routines of the algorithm library.
The second example has some important drawbacks:
You cannot enforce bounds-check when accessing the elements: this can potentially lead to bugs.
What happens if new in the constructor throws? You have to manage this case properly.
You need a suitable copy constructor and assignment operators.
You need a virtual destructor unless you are sure that nobody derives from the class (see here).
You can avoid all these problems by using a std::vector.
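For comparison, a minimal sketch of the std::vector-based equivalent (assuming value-initialized elements are acceptable):

#include <cstddef>
#include <vector>

template<class T>
class Container
{
public:
    explicit Container(std::size_t cap) : objects(cap) {}
    // copy, move and destruction are generated correctly; no delete[] needed
    std::vector<T> objects;
};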
In addition to #francesco's answer:
First example
In your first example, your Container holds a C-style array. If an instance of the Container is created on the stack, the array will be on the stack as well. You might want to read heap vs stack (or similar). So, allocating on the stack can have advantages, but you have to be careful with the size you give to the array (size_t C) in order to avoid a stack overflow.
You should consider using std::array<T,C>.
Second example
Here you hold a pointer of type T* which points to a C-style array allocated on the heap (it doesn't matter whether you allocate an instance of Container on the stack or on the heap). In this case, you don't need to know the size at compile time, which has obvious advantages in many situations. Also, you can use much greater sizes than a stack-allocated array would allow.
You should consider using std::vector<T>.
Further research
For further research, read on stack vs heap allocation/performance, std::vector and std::array.
Ideally, an immutable string class would only need one memory allocation for each string. Even the reference count could be stored in the same chunk of memory that holds the string itself.
A trivial implementation of string and shared_ptr would allocate three distinct pieces of memory for shared_ptr<string const>:
Memory for the string buffer
Memory for the string object
Memory for the reference count
Now, I know that when using std::make_shared(), it is possible for a smart implementation to combine the last two into a single allocation. But that would still leave two allocations.
When you know that the string is immutable, the string buffer won't be reallocated, hence it should be possible to integrate it with the string object, leaving only one allocation.
I know that some string implementations already use such optimizations for short strings, but I'm after an implementation that does this regardless of string length.
My questions are: Is my reasoning sound? Is an implementation actually permitted and able to do this? Can I reasonably expect from a good quality standard library to implement this optimization? Do you know of contemporary library implementations which do this?
Or is this something I would have to implement myself?
I believe that the only way to do this is a make_shared which accepts arrays of run-time variable size. The standard one does not, even as of C++17 (which adds support for shared_ptr to arrays).
Boost, on the other hand, has boost::make_shared, which can take an array size parameter as well. Once you have that, you're golden; you get a shared_ptr<char[]> which does pretty much what you want (besides actually being a std::string).
If you don't want to use boost you could just roll your own. It probably wouldn't be that hard.
Something else to consider is that if you will only ever create O(1) strings, it will be much faster to just never delete them, and pass around raw pointers (or std::string_views) instead. This avoids any copying, or fiddling with reference counts. (The reference counts are actually quite slow, since they use atomic operations.)
You could also use an interning mechanism like std::unordered_set<std::string>.
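A minimal sketch of such an interning mechanism (the intern helper is hypothetical, not a library facility):

#include <string>
#include <string_view>
#include <unordered_set>

std::string_view intern(std::string s)
{
    static std::unordered_set<std::string> pool; // strings live here for the program's lifetime
    // references to set elements stay valid across rehashing, so the view remains usable
    return *pool.insert(std::move(s)).first;
}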
You'd probably need to use a custom allocator for all of the allocation.
class ImmutableStringAllocator;

template<typename CharT>
using immutable_string = std::basic_string<CharT, std::char_traits<CharT>, ImmutableStringAllocator>;

template<size_t N>
std::shared_ptr<immutable_string<char>> make_immutable_string(char (&data)[N])
{
    ImmutableStringAllocator alloc(N);
    // going for basic_string::basic_string(charT *, size_t, Allocator)
    return std::allocate_shared<immutable_string<char>>(alloc, data, N, alloc);
}

class ImmutableStringAllocator {
    size_t len;
    size_t offset;
    char * buf;
    std::reference_wrapper<char *> ref;
public:
    // Normal Allocator stuff here
    ImmutableStringAllocator(size_t N) : len(N), offset(0), buf(nullptr), ref(buf) {}
    ImmutableStringAllocator(const ImmutableStringAllocator & other) : len(other.len), offset(other.offset), buf(nullptr), ref(other.buf) {}
    ImmutableStringAllocator & operator=(const ImmutableStringAllocator & other)
    {
        assert(buf == nullptr);
        ImmutableStringAllocator temp(other);
        swap(*this, temp);
        return *this;
    }
    pointer allocate(size_type n, const_void_pointer hint)
    {
        // the first call allocates room for the string object plus its characters;
        // later calls hand out space inside that same block
        if (!ref.get()) { buf = static_cast<char *>(::operator new(n + len)); offset = n; return buf; }
        return ref.get() + offset;
    }
};
I want to implement std::end for a unique pointer.
The problem is that I have to get N (the count of elements in the array).
Approach 1: deduce N from the template parameter.
template <typename T, size_t N>
T* end(const unique_ptr<T[N]> &arr)
{
return arr.get() + N;
}
But I got error error: C2893: Failed to specialize function template 'T *test::end(const std::unique_ptr> &)' with [ _Ty=T [N] ] With the following template arguments: 'T=int' 'N=0x00'
It looks like it is not possible to deduce N.
Approach 2: get N from the allocator.
The allocator has to know N to correctly execute delete[].
You could read about this in this article. There are two approaches:
Over-allocate the array and put n just to the left.
Use an associative array with p as the key and n as the value.
The problem is how to get this size in a portable way, across platforms and compilers.
Maybe someone knows a better approach, or knows how to make this work?
If you have a runtime-sized array and you need to know its size without doing the bookkeeping manually, then you should use a std::vector. It will manage the memory and the size for you.
std::unique_ptr<T[]> is just a wrapper for a raw pointer. You cannot get the size of the block the pointer points to from just the pointer. The reason you use a std::unique_ptr<T[]> over T* foo = new T[size] is that the unique_ptr makes sure delete[] is called when the pointer goes out of scope.
Something like this?
template<class X>
struct sized_unique_buffer;

template<class T, std::size_t N>
struct sized_unique_buffer<T[N]>:
    std::unique_ptr<T[]>
{
    using std::unique_ptr<T[]>::unique_ptr;
    T* begin() const { return this->get(); }
    T* end() const { return *this ? begin() + N : nullptr; }
    bool empty() const { return N == 0 || !*this; }
};
where we have a compile-time unenforced promise of a fixed compile-time length.
A similar design could work for a dynamic runtime length.
In some compilers, when T can be trivially destroyed, the number of T is not stored when you call new T[N]. The system is free to over-allocate and give you a larger buffer (i.e., round a large allocation up to a page boundary, or implicitly store the size of the buffer via the location from which it is allocated to reduce overhead), so the allocation size need not exactly match the number of elements.
For non-trivially destroyed T it is true that the compiler must know how many to destroy from just the pointer. This information is not exposed to C++.
You can allocate the buffer and the count manually and pass them on to a unique_ptr with a custom deleter, even a stateless one. This would permit a type
unique_buffer<T[]> ptr;
where you can get the number of elements out at only a modest runtime cost.
If you instead store the length in the deleter, you can get a bit more locality on the loop limits (saving a cache miss) at the cost of a larger unique_buffer<T[]>.
Doing this with an unadulterated unique_ptr<T[]> is not possible in a portable way.
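For illustration, a hedged sketch of the store-the-length-in-the-deleter idea mentioned above (unique_buffer and make_unique_buffer are made-up names):

#include <cstddef>
#include <memory>

// The deleter carries the element count, so begin()/end() can be recovered
// from the smart pointer itself.
template<class T>
struct sized_delete {
    std::size_t n;
    void operator()(T* p) const { delete[] p; }
};

template<class T>
using unique_buffer = std::unique_ptr<T[], sized_delete<T>>;

template<class T>
unique_buffer<T> make_unique_buffer(std::size_t n)
{
    return unique_buffer<T>(new T[n](), sized_delete<T>{n});
}

template<class T> T* begin(const unique_buffer<T>& b) { return b.get(); }
template<class T> T* end(const unique_buffer<T>& b)   { return b.get() + b.get_deleter().n; }

// usage:
// auto buf = make_unique_buffer<int>(16);
// for (int* p = begin(buf); p != end(buf); ++p) { /* ... */ }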
I'm working on a program that stores a vital data structure as an unstructured string with program-defined delimiters (so we need to walk the string and extract the information we need as we go) and we'd like to convert it to a more structured data type.
In essence, this will require a struct with a field describing what kind of data the struct contains and another field that's a string with the data itself. The length of the string will always be known at allocation time. We've determined through testing that doubling the number of allocations required for each of these data types is an unacceptable cost. Is there any way to allocate the memory for the struct and the std::string contained in the struct in a single allocation? If we were using cstrings I'd just have a char * in the struct and point it to the end of the struct after allocating a block big enough for the struct and string, but we'd prefer std::string if possible.
Most of my experience is with C, so please forgive any C++ ignorance displayed here.
If you have such rigorous memory needs, then you're going to have to abandon std::string.
The best alternative is to find or write an implementation of basic_string_ref (a proposal for the next C++ standard library), which is really just a char* coupled with a size. But it has all of the (non-mutating) functions of std::basic_string. Then you use a factory function to allocate the memory you need (your struct size + string data), and then use placement new to initialize the basic_string_ref.
Of course, you'll also need a custom deletion function, since you can't just pass the pointer to "delete".
Given the previously linked to implementation of basic_string_ref (and its associated typedefs, string_ref), here's a factory constructor/destructor, for some type T that needs to have a string on it:
template<typename T> T *Create(..., const char *theString, size_t lenstr)
{
    char *memory = new char[sizeof(T) + lenstr + 1];
    memcpy(memory + sizeof(T), theString, lenstr);
    memory[sizeof(T) + lenstr] = '\0'; // terminate the copy
    try
    {
        // the string_ref refers to the copy stored right after the object
        return new(memory) T(..., string_ref(memory + sizeof(T), lenstr));
    }
    catch(...)
    {
        delete[] memory;
        throw;
    }
}
template<typename T> T *Create(..., const std::string & theString)
{
return Create(..., theString.c_str(), theString.length());
}
template<typename T> T *Create(..., const string_ref &theString)
{
return Create(..., theString.data(), theString.length());
}
template<typename T> void Destroy(T *pValue)
{
pValue->~T();
char *memory = reinterpret_cast<char*>(pValue);
delete[] memory;
}
Obviously, you'll need to fill in the other constructor parameters yourself. And your type's constructor will need to take a string_ref that refers to the string.
If you are using std::string, you can't really do one allocation for both structure and string, and you also can't make the allocation of both to be one large block. If you are using old C-style strings it's possible though.
If I understand you correctly, you are saying that through profiling you have determined that the fact that you have to allocate a string and another data member in your data structure imposes an unacceptable cost on your application.
If that's indeed the case I can think of a couple solutions.
You could pre-allocate all of these structures up front, before your program starts. Keep them in some kind of fixed collection so they aren't copy-constructed, and reserve enough buffer in your strings to hold your data.
Controversial as it may seem, you could use old C-style char arrays. It seems like you are forgoing much of the reason to use strings in the first place, which is the memory management. However, in your case, since you know the needed buffer sizes at start up, you could handle this yourself. If you like the other facilities that string provides, bear in mind that much of that is still available through <algorithm>.
Take a look at Variable Sized Struct C++ - the short answer is that there's no way to do it in vanilla C++.
Do you really need to allocate the container structs on the heap? It might be more efficient to have those on the stack, so they don't need to be allocated at all.
Indeed two allocations can seem too high. There are two ways to cut them down though:
Do a single allocation
Do a single dynamic allocation
It might not seem so different, so let me explain.
1. You can use the struct hack in C++
Yes this is not typical C++
Yes this requires special care
Technically it requires:
disabling the copy constructor and assignment operator
making the constructor and destructor private and providing factory methods for allocating and deallocating the object
Honestly, this is the hard-way.
2. You can avoid allocating the outer struct dynamically
Simple enough:
struct M {
Kind _kind;
std::string _data;
};
and then pass instances of M on the stack. Move operations should guarantee that the std::string is not copied (you can always disable copy to make sure of it).
This solution is much simpler. The only (slight) drawback is in memory locality... but on the other hand the top of the stack is already in the CPU cache anyway.
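For example (a hedged sketch assuming the M and Kind types above):

// M is built and returned by value; the std::string's buffer is moved
// along with it rather than copied.
M make_message(Kind kind, std::string text)
{
    return M{kind, std::move(text)};
}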
C-style strings can always be converted to std::string as needed. In fact, there's a good chance that your observations from profiling are due to fragmentation of your data rather than simply the number of allocations, and creating an std::string on demand will be efficient. Of course, not knowing your actual application this is just a guess, and really one can't know this until it's tested anyways. I imagine a class
class my_class {
public:
    std::string data() const { return _data; }
    const char* data_as_c_str() const // In case you really need it!
    { return _data; }
private:
    int _type;
    char _data[1];
};
Note I used a standard clever C trick for data layout: _data is as long as you want it to be, so long as your factory function allocates the extra space for it. IIRC, C99 even gave a special syntax for it:
struct my_struct {
int type;
char data[];
};
which has good odds of working with your C++ compiler. (Is this in the C++11 standard?)
Of course, if you do do this, you really need to make all of the constructors private and friend your factory function, to ensure that the factory function is the only way to actually instantiate my_class -- it would be broken without the extra memory for the array. You'll definitely need to make operator= private too, or otherwise implement it carefully.
Rethinking your data types is probably a good idea.
For example, one thing you can do is, rather than trying to put your char arrays into a structured data type, use a smart reference instead. A class that looks like
class structured_data_reference {
public:
structured_data_reference(const char *data):_data(data) {}
std::string get_first_field() const {
// Do something interesting with _data to get the first field
}
private:
const char *_data;
};
You'll want to do the right thing with the other constructors and assignment operator too (probably disable assignment, and implement something reasonable for move and copy). And you may want reference counted pointers (e.g. std::shared_ptr) throughout your code rather than bare pointers.
Another hack that's possible is to just use std::string, but store the type information in the first entry (or first several). This requires accounting for that whenever you access the data, of course.
I'm not sure if this exactly addresses your problem. One way you can optimize memory allocation in C++ is by using a pre-allocated buffer and then using the 'placement new' operator.
I tried to solve your problem as I understood it.
#include <new>
#include <string>
using namespace std;

unsigned char *myPool = new unsigned char[10000];
size_t poolUsed = 0;

// hand out the next unused chunk of the pool
void* poolAlloc(size_t bytes)
{
    void* p = myPool + poolUsed;
    poolUsed += bytes;
    return p;
}

struct myStruct
{
    myStruct(const char* aSource1, const char* aSource2)
    {
        original = new (poolAlloc(sizeof(string))) string(aSource1); // placement new
        data = new (poolAlloc(sizeof(string))) string(aSource2);     // placement new
    }
    ~myStruct()
    {
        original->~string(); // placement-new'd objects need explicit destruction,
        data->~string();     // but no deallocation
    }
    string* original;
    string* data;
};

int main()
{
    myStruct* aStruct = new (poolAlloc(sizeof(myStruct))) myStruct("h1", "h2");
    // Use the struct
    aStruct->~myStruct(); // destroy explicitly; no per-object deallocation needed
    delete [] myPool;
    return 0;
}
[Edit] After the comment from NicolBolas, the problem is a bit clearer. I decided to write one more answer, even though in reality it is not much more advantageous than using a raw character array. But I still believe that this is well within the stated constraints.
The idea would be to provide a custom allocator for the string class, as specified in this SO question.
In the implementation of the allocate method, use placement new as follows:
pointer allocate(size_type n, void * = 0)
{
// fail if we try to allocate too much
if((n * sizeof(T))> max_size()) { throw std::bad_alloc(); }
//T* t = static_cast<T *>(::operator new(n * sizeof(T)));
T* t = new (/* provide the address of the original character buffer*/) T[n];
return t;
}
The constraint is that, for the placement new to work, the original string's address must be known to the allocator at run time. This can be achieved by setting it externally and explicitly before the new string member is created. However, this is not so elegant.
In essence, this will require a struct with a field describing what kind of data the struct contains and another field that's a string with the data itself.
I have a feeling that may you are not exploiting C++'s type-system to its maximum potential here. It looks and feels very C-ish (that is not a proper word, I know). I don't have concrete examples to post here since I don't have any idea about the problem you are trying to solve.
Is there any way to allocate the memory for the struct and the std::string contained in the struct in a single allocation?
I believe that you are worrying about the structure allocation followed by a copy of the string into the structure member? This ideally shouldn't happen (but of course, it depends on how and when you are initializing the members). C++11 supports move construction, which should take care of any extra string copies that you are worried about.
You should really, really post some code to make this discussion worthwhile :)
a vital data structure as an unstructured string with program-defined delimiters
One question: Is this string mutable? If not, you can use a slightly different data-structure. Don't store copies of parts of this vital data structure but rather indices/iterators to this string which point to the delimiters.
// assume that !, [, ], $, % etc. are your program defined delims
const std::string vital = "!id[thisisdata]$[moredata]%[controlblock]%";
// define a special struct
enum Type { ... };
struct Info {
size_t start, end;
Type type;
// define appropriate ctors
};
// parse the string and return Info objects
std::vector<Info> parse(const std::string& str) {
std::vector<Info> v;
// loop through the string looking for delims
for (size_t b = 0, e = str.size(); b < e; ++b) {
// on hitting one such delim create an Info
switch( str[ b ] ) {
case '%':
...
case '$':
// initializing the start and then move until
// you get the appropriate end delim
}
// use push_back/emplace_back to insert this newly
// created Info object back in the vector
v.push_back( Info( start, end, kind ) );
}
return v;
}