Allocate constant strings in a container contiguously - C++

Let's say I have a std::vector of const std::string objects.
std::vector<const std::string> strs;
Now the default behavior here is that the actual character buffers can be allocated anywhere on the heap, which pretty much defeats any prefetching of data when iterating over the contained strings.
strs.push_back("Foo"); // allocates char block on heap
strs.push_back("Boo"); // allocates char block on heap
However, since the strings are "const" I would like the char blocks to be allocated contiguously or close to each other (when possible) in order to have the most efficient cache behavior when iterating over the strings.
Is there any way to achieve this behavior?

You need a custom allocator, known as a memory region (or arena) allocator. You can look on Wikipedia or Google for more information, but the basic idea is something akin to the hardware stack: allocate one large chunk and then simply increment a pointer to mark memory as used. It can serve many contiguous requests very quickly but can't handle individual frees; all freeing is done at once.

If it really is that simple - pushing strings that will never change - it is easy to write your own allocator. Allocate a large block of memory and set a pointer, free, to offset 0 in the block. When you need storage for a new string, copy it to free and advance free by the string's length plus the terminating null. Keep track of the end of the memory block and allocate another block when needed.
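A minimal sketch of that idea (StringArena is a made-up name, and overflow handling is reduced to returning a null pointer):

#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical bump-style arena: one big block, a running offset, no
// individual frees - everything is released when the arena is destroyed.
class StringArena
{
public:
    explicit StringArena(std::size_t capacity) : block_(capacity), used_(0) {}

    // Copies s (including its terminating null) into the arena and returns a
    // pointer to the stored copy, or nullptr if the arena is full.
    const char* store(const char* s)
    {
        const std::size_t len = std::strlen(s) + 1;
        if (used_ + len > block_.size())
            return nullptr; // a real version would chain in a fresh block here
        char* dest = block_.data() + used_;
        std::memcpy(dest, s, len);
        used_ += len;
        return dest;
    }

private:
    std::vector<char> block_; // one large contiguous chunk
    std::size_t used_;        // the "bump pointer", kept as an offset
};

The pointers returned by store can then be kept in a std::vector<const char*>; iterating over them walks one contiguous block of characters, which is the cache behavior the question is after.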

Not really.
std::string isn't a POD; it doesn't keep its contents "inside the object". What's more, it isn't even required to keep its contents in a single memory block (that contiguity guarantee was only added in C++11).
Also, a std::vector (like all arrays) needs its contents to be of one type (= of equal size), so you can't make a literal "array" of strings of different lengths.
Your best shot is to assume a length and use std::vector<std::array<char, N> >
If you need really different lengths, an alternative is just a std::vector<char> for the data plus a std::vector<unsigned> for the indices where consecutive strings start.
Rolling your own allocator for the string is a tempting idea; you could base it on a std::vector<char>, build your own std::basic_string on top of it, and then make a collection of those.
Note that you would also be depending heavily on a specific std::string implementation. Some have an internal buffer of N chars and only allocate memory externally if the string is longer than the buffer (the small-string optimization). If that's the case in your implementation, you still wouldn't get contiguous memory for the whole set of strings.
On those grounds, I conclude that with std::string you generally won't be able to accomplish what you want (unless you rely on a specific STL implementation), and you need to provide another string implementation to suit your needs.
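A quick sketch of the fixed-width std::array idea mentioned above (the length limit N and the example strings are arbitrary choices for illustration):

#include <array>
#include <cstddef>
#include <cstring>
#include <iostream>
#include <vector>

int main()
{
    // All character data lives inside the vector's single allocation, at the
    // cost of a fixed upper bound per string (here 16 bytes, including '\0').
    constexpr std::size_t N = 16;
    std::vector<std::array<char, N>> strs;

    for (const char* s : {"Foo", "Boo"})
    {
        std::array<char, N> a{};            // zero-filled, so always null-terminated
        std::strncpy(a.data(), s, N - 1);
        strs.push_back(a);
    }

    for (const auto& a : strs)
        std::cout << a.data() << '\n';
}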

A custom allocator is great, but why not store all the strings in a single std::vector<char> or std::string, and access the original strings by offset?
Simple and effective.
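A minimal sketch of that offset scheme (the exact container choices here are just one possibility):

#include <cstddef>
#include <iostream>
#include <string>
#include <vector>

int main()
{
    std::string data;                 // all characters, back to back
    std::vector<std::size_t> offsets; // where each original string starts

    for (const char* s : {"Foo", "Boo", "Hello"})
    {
        offsets.push_back(data.size());
        data += s;
        data += '\0';                 // keep each entry null-terminated
    }

    // Iteration touches a single contiguous buffer.
    for (std::size_t off : offsets)
        std::cout << &data[off] << '\n';
}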

You can always write a private allocator (the second template parameter of std::vector) that will allocate all the strings from a contiguous pool. You can also use std::basic_string instead of std::string (which is just a specialization of std::basic_string), which similarly allows specifying your own allocator. Generally I would say this is a case of "premature optimization", but I trust you've measured and seen a performance hit here... I believe the price to pay would be some wasted memory, though.
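To make that concrete, here is a rough sketch of a std::basic_string with a custom allocator drawing from a fixed pool. PoolAllocator is a made-up name, the pool is deliberately simplistic (bump-only, deallocate is a no-op), and only strings long enough to defeat any small-string optimization actually land in the pool:

#include <cstddef>
#include <iostream>
#include <new>
#include <string>

// Hypothetical C++11-style minimal allocator handing out memory from one
// caller-provided buffer. Nothing is reclaimed until the pool goes away.
template <class T>
struct PoolAllocator
{
    using value_type = T;

    char*        pool;     // start of the pool
    std::size_t* offset;   // shared bump offset
    std::size_t  capacity;

    PoolAllocator(char* p, std::size_t* off, std::size_t cap)
        : pool(p), offset(off), capacity(cap) {}

    template <class U>
    PoolAllocator(const PoolAllocator<U>& other)
        : pool(other.pool), offset(other.offset), capacity(other.capacity) {}

    T* allocate(std::size_t n)
    {
        const std::size_t bytes = n * sizeof(T);
        if (*offset + bytes > capacity)
            throw std::bad_alloc();
        T* p = reinterpret_cast<T*>(pool + *offset);
        *offset += bytes;
        return p;
    }

    void deallocate(T*, std::size_t) {} // freed all at once with the pool
};

template <class T, class U>
bool operator==(const PoolAllocator<T>& a, const PoolAllocator<U>& b) { return a.pool == b.pool; }
template <class T, class U>
bool operator!=(const PoolAllocator<T>& a, const PoolAllocator<U>& b) { return !(a == b); }

using PoolString = std::basic_string<char, std::char_traits<char>, PoolAllocator<char>>;

int main()
{
    static char buffer[4096];
    std::size_t used = 0;
    PoolAllocator<char> alloc(buffer, &used, sizeof buffer);

    // Long enough to bypass the small-string optimization; their character
    // data lands back to back inside `buffer`.
    PoolString a("this string is long enough to bypass the small-string optimization", alloc);
    PoolString b("and so is this one, placed right behind it in the pool", alloc);
    std::cout << a << '\n' << b << '\n';
}

A container of these strings can still use the default allocator for the string objects themselves; only their character data is routed into the pool.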

A vector is guaranteed to be contiguous memory and is
interoperable with an array. It is not a singly linked list.
"Contiguity is in fact part of the vector abstraction. It’s so important, in fact, the C++03 standard was amended to explicitly add the guarantee."
Source : http://herbsutter.com/2008/04/07/cringe-not-vectors-are-guaranteed-to-be-contiguous/
Use reserve() so it does not have to reallocate as elements are added.
#include <iostream>
#include <vector>
#include <string>
#include <algorithm>
#include <iterator>
using namespace std;

int main()
{
    // create empty vector for strings
    // (the element type must not be const for the standard allocator)
    vector<string> sentence;

    // reserve memory for five elements to avoid reallocation
    sentence.reserve(5);

    // append some elements
    sentence.push_back("Hello,");
    sentence.push_back("how");
    sentence.push_back("are");
    sentence.push_back("you");
    sentence.push_back("?");

    // print elements separated with spaces
    copy(sentence.begin(), sentence.end(),
         ostream_iterator<string>(cout, " "));
    cout << endl;

    return 0;
}

Related

Are two heap allocations more expensive than a call to std::string fill ctor?

I want to have a string with a capacity of 131 chars (or bytes). I know two simple ways of achieving that. So which of these two code blocks is faster and more efficient?
std::string tempMsg( 131, '\0' ); // constructs the string with a 131 byte buffer from the start
tempMsg.clear( ); // clears those '\0' chars to free space for the actual data
tempMsg += "/* some string literals that are appended */";
or this one:
std::string tempMsg; // default constructs the string with a 16 byte buffer
tempMsg.reserve( 131 ); // reallocates the string to increase the buffer size to 131 bytes??
tempMsg += "/* some string literals that are appended */";
I guess the first approach only uses 1 allocation and then sets all those 131 bytes to 0 ('\0') and then clears the string (std::string::clear is generally constant according to: https://www.cplusplus.com/reference/string/string/clear/).
The second approach uses 2 allocations but on the other hand, it doesn't have to set anything to '\0'. But I've also heard about compilers allocating 16 bytes on the stack for a string object for optimization purposes. So the 2nd method might use only 1 heap allocation as well.
So is the first method faster than the other one? Or are there any other better methods?
The most accurate answer is that it depends. The most probable answer is the second being faster or as fast. Calling the fill ctor requires not only a heap allocation but a fill (typically translates to a memset in my experience).
clear usually won't do anything with a POD char besides setting a first pointer or size integer to zero because char is a trivially-destructible type. There's no loop involved with clear usually unless you create std::basic_string with a non-trivial UDT. It's constant-time otherwise and dirt-cheap in practically every standard library implementation.
Edit: An Important Note:
I have never encountered a standard library implementation that does this, or it has slipped my memory (very possible, as I think I'm turning senile), but there is something very important that Viktor Sehl pointed out in the comments that I was ignorant about:
Please note that std::string::clear() on some implementations frees the allocated memory (if there is any), unlike a std::vector.
That would actually make your first version involve two heap allocations. But the second should still only be one (opposite of what you thought).
Resumed:
But I've also heard about compilers allocating 16 bytes on the stack for a string object for optimization purposes. So the 2nd method might use only 1 heap allocation as well.
Small Buffer Optimizations
The first allocation is a small-buffer stack optimization for implementations that use it (technically not always stack, but it'll avoid additional heap allocations). It's not separately heap-allocated and you can't avoid it with a fill ctor (the fill ctor will still allocate the small buffer). What you can avoid is filling the entire array with '\0' before you fill it with what you actually want, and that's why the second version is likely faster (marginally or not depending on how many times you invoke it from a loop). That's needless overhead unless the optimizer eliminates it for you, and it's unlikely in my experience that optimizers will do that in loopy cases that can't be optimized with something like SSA.
I just pitched in here because your second version is also clearer in intent than filling a string with something as an attempted optimization (in this case a very possibly misguided one if you ask me) only to throw it out and replace it with what you actually want. The second is at least clearer in intent and almost certainly as fast or faster in most implementations.
On Profiling
I would always suggest measuring if in doubt, though, and especially before you start attempting funny things like in your first example. I can't recommend the profiler enough if you're working in performance-critical fields. The profiler will not only answer this question for you but it'll also teach you to refrain from writing counter-intuitive code like in the first example except in places where it makes a real positive difference (in this case I think the difference is actually negative or neutral). From my perspective, the use of both profilers and debuggers should ideally be taught in CS 101. The profiler helps mitigate the dangerous tendency for people to optimize the wrong things very counter-productively. They tend to be very easy to use; you just run them, make your code perform the expensive operation you want to optimize, and you get back a nice breakdown of where the time actually goes.
If the small buffer optimization confuses you a bit, a simple illustration is like this:
struct SomeString
{
    // Pre-allocates (always) some memory in advance to avoid additional
    // heap allocs.
    char small_buffer[some_small_fixed_size] = {};

    // Will point to the small buffer until the string gets large.
    char* ptr = small_buffer;
};
The allocation of the small buffer is unavoidable, but it doesn't require separate calls to malloc/new/new[]. And it's not allocated separately on the heap from the string object itself (if it is allocated on heap). So both of the examples that you showed involve, at most, a single heap allocation (unless your standard library implementation is FUBAR -- edit: or one that Viktor is using). What the first example has conceptually on top of that is a fill/loop (could be implemented as a very efficient intrinsic in assembly but loopy/linear time stuff nevertheless) unless the optimizer eliminates it.
String Optimization
So is the first method faster than the other one? Or are there any other better methods?
You can write your own string type which uses an SBO with, say, 256 bytes for the small buffer which is typically going to be much larger than any std::string optimization. Then you can avoid heap allocations entirely for your 131-length case.
template <class Char, size_t SboSize = 256>
class TempString
{
private:
    // Stores the small buffer.
    Char sbo[SboSize] = {};

    // Points to the small buffer until num > SboSize.
    Char* ptr = sbo;

    // Stores the length of the string.
    size_t num = 0;

    // Stores the capacity of the string.
    size_t cap = SboSize;

public:
    // Destroys the string.
    ~TempString()
    {
        if (ptr != sbo)
            delete[] ptr;
    }

    // Remaining implementation left to reader. Note that implementing
    // swap requires swapping the contents of the SBO if the strings
    // point to them rather than swapping pointers (swapping is a
    // little bit tricky with SBOs involved, so be wary of that).
};
That would be ill-suited for persistent storage though, because it would blow up memory use (ex: requiring 256+ bytes just to store a string with one character in it) if you stored a bunch of strings persistently in a container. It is well-suited, however, for temporary strings that you transfer into and out of function calls. I'm primarily a gamedev, so rolling our own alternatives to the standard C++ library is quite normal here given our requirements for real-time feedback with high graphical fidelity. I wouldn't recommend it for the faint-hearted though, and definitely not without a profiler. This is a very practical and viable option in my field although it might be ridiculous in yours. The standard lib is excellent but it's tailored for the needs of the entire world. You can usually beat it if you can tailor your code very specifically to your needs and produce more narrowly-applicable code.
Actually, even std::string with an SBO is rather ill-suited for persistent storage anyway, and not just TempString above. If you store, say, a std::unordered_map<std::string, T> and std::string uses a 16-byte SBO that inflates sizeof(std::string) to 32 bytes or more, then your keys will require 32 bytes even if they store just one character, fitting only two strings or fewer in a single cache line on traversal of the hash table. That's a downside to using SBOs. They can blow up your memory use for persistent storage that's part of your application state. But they're excellent for temporaries whose memory is just pushed and popped to/from the stack in a LIFO alloc/dealloc pattern, which only requires incrementing and decrementing a stack pointer.
If you want to optimize the storage of many strings though from a memory standpoint, then it depends a lot on your access patterns and needs. However, a fairly simple solution is like so if you want to just build a dictionary and don't need to erase specific strings dynamically:
// Just using a struct for simplicity of illustration:
struct MyStrings
{
    // Stores all the characters for all the null-terminated strings.
    std::vector<char> buffer;

    // Stores the starting index into the buffer for the nth string.
    std::vector<std::size_t> string_start;

    // Inserts a null-terminated string to the buffer.
    void insert(const std::string_view str)
    {
        string_start.push_back(buffer.size());
        buffer.insert(buffer.end(), str.begin(), str.end());
        buffer.push_back('\0');
    }

    // Returns the nth null-terminated string.
    std::string_view operator[](int32_t n) const
    {
        return {buffer.data() + string_start[n]};
    }
};
Another common solution that can be very useful if you store a lot of duplicate strings in an associative container, or need fast searches for strings that can be looked up in advance, is string interning. The solution above can be combined with it to store all the interned strings efficiently. Then you can store lightweight indices or pointers to your interned strings and compare them immediately for equality, e.g. without involving any loops, and store many duplicate references to strings that each only cost the size of an integer or pointer.
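A minimal interner sketch along those lines (StringInterner is a made-up name; for simplicity it lets the map own the characters rather than combining it with the pool above):

#include <cstdint>
#include <string>
#include <string_view>
#include <unordered_map>
#include <vector>

// Each distinct string is stored once and identified by a small integer id,
// so equality tests become integer comparisons.
class StringInterner
{
public:
    std::uint32_t intern(std::string_view s)
    {
        // try_emplace only inserts if the string hasn't been seen before.
        auto [it, inserted] = ids_.try_emplace(std::string(s), 0);
        if (inserted)
        {
            it->second = static_cast<std::uint32_t>(strings_.size());
            strings_.push_back(&it->first); // node-based map: key addresses are stable
        }
        return it->second;
    }

    std::string_view view(std::uint32_t id) const { return *strings_[id]; }

private:
    std::unordered_map<std::string, std::uint32_t> ids_; // owns the characters
    std::vector<const std::string*> strings_;            // id -> interned string
};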

How does memory allocation work for nested containers?

For example, I have std::vector<std::string>; how do the allocators for the vector and the string work together?
Say the allocator for the vector allocates a chunk of memory, ChunkVec. Does the allocator for the string allocate memory inside ChunkVec, so that the memory allocated for each string sums to ChunkVec? Or does the allocator for the string allocate memory outside ChunkVec?
Is the answer the same for other nested containers?
And is there a difference between C++ and C++11?
I have std::vector<std::string>
On my Ubuntu 15.04, 64 bit, a std::string is 8 bytes, regardless of contents.
(using std::string s1; I am comparing sizeof(std::string) versus s1.size(). Then append to the string and then print them both again.)
I have not noticed or found a way to specify what allocator to use when the string allocates its data from the heap; therefore, I believe it must use some standard allocator, probably new under the hood, but I have never looked into the std::string code. And that standard allocator would know nothing about your vector.
does the allocator for string allocate memory inside ChunkVec so that
the memory allocated for each string sums to ChunkVec?
I believe the part of the string in a vector element is only the 8 byte pointer to where the string 'proper' resides in the heap. So no.
Or the allocator for string allocates memory outside ChunkVec?
Yes, I believe so.
You can confirm this by printing the addresses of the vector elements i and i+1, and the address of some of the chars of element i.
By the way, on my implementation (g++ 4.9.2), sizeof(std::vector) is 24 bytes, regardless of the number of data elements (vec.size()) and regardless of element size. Note also that I have read about some implementations where some of a small vector's data might actually reside in those 24 bytes. Implementation details can be tedious, but helpful. Still, some might be interested in why you want to know this.
Be aware we are talking about implementation details (I think) ... so your exploration might vary from mine.
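A quick way to do that experiment (the long literals are just there to defeat any small-string optimization; the addresses printed will of course differ per run and per implementation):

#include <iostream>
#include <string>
#include <vector>

int main()
{
    std::vector<std::string> v{
        "a string long enough to defeat the small-string optimization",
        "another sufficiently long string, for comparison"};

    // Addresses of the string objects themselves: contiguous inside the
    // vector's buffer, exactly sizeof(std::string) apart.
    std::cout << static_cast<const void*>(&v[0]) << '\n';
    std::cout << static_cast<const void*>(&v[1]) << '\n';

    // Addresses of the character data: separate heap blocks, outside the
    // vector's allocation.
    std::cout << static_cast<const void*>(v[0].data()) << '\n';
    std::cout << static_cast<const void*>(v[1].data()) << '\n';
}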
Is the answer the same for other nested containers?
I have not explored every container (but I have used many "std::vector< std::string >").
Generally, and without much thought, I would guess not.
And is there a difference between C++ and C++11?
Implementation details change for various reasons, including language feature changes. What have you tried?
ChunkVec stores only the std::string objects, and each of those stores a pointer to the character data it allocated. It's a totally separate allocation. A good way to understand this is to look at a tree structure:
struct node
{
    int data;
    struct node* left;
    struct node* right;
};
left and right point to memory allocations separate from the node itself. You can free them without freeing this very node.
std::string has two things to store: the size of the string and the content. If I allocate one on the stack, the size will be on the stack as well. For short strings, the character data itself will also be on the stack. These two items make up the "control structure". std::string only uses its allocator for long strings that don't fit in its fixed-size control structure.
std::vector allocates memory to store the control structure of the std::string. Any allocation required by std::string to store long strings could be in a completely different area of memory than the vector. Short strings will be entirely managed by the allocator of std::vector.

`std::string` allocations are my current bottleneck - how can I optimize with a custom allocator?

I'm writing a C++14 JSON library as an exercise and to use it in my personal projects.
By using callgrind I've discovered that the current bottleneck during a continuous value creation from string stress test is an std::string dynamic memory allocation. Precisely, the bottleneck is the call to malloc(...) made from std::string::reserve.
I've read that many existing JSON libraries such as rapidjson use custom allocators to avoid malloc(...) calls during string memory allocations.
I tried to analyze rapidjson's source code but the large amount of additional code and comments, plus the fact that I'm not really sure what I'm looking for, didn't help me much.
How do custom allocators help in this situation?
Is a memory buffer preallocated somewhere (where? statically?) and std::strings take available memory from it?
Are strings using custom allocators "compatible" with normal strings?
They have different types. Do they have to be "converted"? (And does that result in a performance hit?)
Code notes:
Str is an alias for std::string.
By default, std::string allocates memory as needed from the same heap as anything that you allocate with malloc or new. To get a performance gain from providing your own custom allocator, you will need to be managing your own "chunk" of memory in such a way that your allocator can deal out the amounts of memory that your strings ask for faster than malloc does. Your memory manager will make relatively few calls to malloc, (or new, depending on your approach) under the hood, requesting "large" amounts of memory at once, then deal out sections of this (these) memory block(s) through the custom allocator. To actually achieve better performance than malloc, your memory manager will usually have to be tuned based on known allocation patterns of your use cases.
This kind of thing often comes down to the age-old trade-off of memory use versus execution speed. For example: if you have a known upper bound on your string sizes in practice, you can pull tricks with over-allocating to always accommodate the largest case. While this is wasteful of your memory resources, it can alleviate the performance overhead that more generalized allocation runs into with memory fragmentation, and it makes any calls to realloc essentially constant time for your purposes.
@sehe is exactly right. There are many ways.
EDIT:
To finally address your second question, strings using different allocators can play nicely together, and usage should be transparent.
For example:
#include <iostream>
#include <string>

class myalloc : public std::allocator<char> {};
myalloc customAllocator;

int main(void)
{
    std::string mystring(customAllocator);
    std::string regularString = "test string";

    mystring = regularString;
    std::cout << mystring;
    return 0;
}
This is a fairly silly example and, of course, uses the same workhorse code under the hood. However, it shows assignment between strings using allocator classes of "different types". Implementing a useful allocator that supplies the full interface required by the STL without just disguising the default std::allocator is not as trivial. This seems to be a decent write up covering the concepts involved. The key to why this works, in the context of your question at least, is that using different allocators doesn't cause the strings to be of different type. Notice that the custom allocator is given as an argument to the constructor not a template parameter. The STL still does fun things with templates (such as rebind and Traits) to homogenize allocator interfaces and tracking.
What often helps is the creation of a GlobalStringTable.
See if you can find portions of the old NiMain library from the now defunct NetImmerse software stack. It contains an example implementation.
Lifetime
What is important to note is that this string table needs to be accessible between different DLL spaces, and that it is not a static object. R. Martinho Fernandes already warned that the object needs to be created when the application or DLL thread is created / attached, and disposed of when the thread is destroyed or the DLL is detached, preferably before any string object is actually used. This sounds easier than it actually is.
Memory allocation
Once you have a single point of access that exports correctly, you can have it allocate a memory buffer up-front. If the memory is not enough, you have to resize it and move the existing strings over. Strings essentially become handles to regions of memory in this buffer.
Placement new
Something that often works well is the placement new() operator, where you can actually specify where in memory your new string object needs to be constructed. Instead of allocating, the operator simply takes the memory location that is passed in as an argument, constructs the object there, and returns it. You can also keep track of the allocation, the actual size of the string, etc. in the GlobalStringTable object.
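For illustration, the bare mechanics of placement new look roughly like this (here the buffer is just a stack array standing in for a slot handed out by the string table):

#include <new>      // placement new
#include <string>

int main()
{
    // Raw storage that, in the scheme described above, would come out of the
    // GlobalStringTable's pre-allocated block.
    alignas(std::string) unsigned char slot[sizeof(std::string)];

    // Construct the string object at that exact location instead of letting
    // operator new pick one.
    using Str = std::string;
    Str* s = new (slot) Str("constructed in-place");

    // Placement new has no matching delete: destroy the object by hand and
    // give the raw memory back to whoever owns the buffer.
    s->~Str();
}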
SOA
Handling the actual memory scheduling is something that is up to you, but there are many possible ways to approach this. Often, the allocated space is partitioned in several regions so that you have several blocks per possible string size. A block for strings <= 4 bytes, one for <= 8 bytes, and so on. This is called a Small Object Allocator, and can be implemented for any type and buffer.
If you expect many string operations where small strings grow repeatedly, you may change your strategy and allocate larger buffers from the start, so that the number of memmove operations is reduced. Or you can opt for a different approach and use string streams for those.
String operations
It is not a bad idea to derive from std::basic_string, so that most of the operations still work but the internal storage is actually in the GlobalStringTable; that way you can keep using the same STL conventions. This also makes sure that all the allocations happen within a single DLL, so that there can be no heap corruption from passing different kinds of strings between different libraries, since all the allocation operations are essentially in your DLL (and are rerouted to the GlobalStringTable object).
Custom allocators can help because most malloc()/new implementations are designed for maximum flexibility, thread safety and bullet-proof behavior. For instance, they must gracefully handle the case where one thread keeps allocating memory and sending the pointers to another thread that deallocates them. Things like these are difficult to handle in a performant way and drive up the cost of malloc() calls.
However, if you know that some things cannot happen in your application (like one thread deallocating stuff another thread allocated, etc.), you can optimize your allocator further than the standard implementation. This can yield significant results, especially when you don't need thread safety.
Also, the standard implementation is not necessarily well optimized: Implementing void* operator new(size_t size) and void operator delete(void* pointer) by simply calling through to malloc() and free() gives an average performance gain of 100 CPU cycles on my machine, which proves that the default implementation is suboptimal.
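As a point of reference for that last claim, a pass-through replacement of the global operators looks roughly like this (only the basic overloads are shown; a complete replacement would also provide the nothrow, sized, and aligned variants):

#include <cstdlib>
#include <new>

// Global replacements that simply forward to malloc/free.
void* operator new(std::size_t size)
{
    if (void* p = std::malloc(size))
        return p;
    throw std::bad_alloc(); // a fully conforming version would also consult the new_handler
}

void operator delete(void* pointer) noexcept
{
    std::free(pointer);
}

int main()
{
    int* p = new int(42); // routed through the replacement above
    delete p;
}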
I think you'd be best served by reading up on the EASTL
It has a section on allocators and you might find fixed_string useful.
The best way to avoid a memory allocation is don't do it!
BUT if I remember JSON correctly, all the readStr values get used either as keys or as identifiers, so you will have to allocate them eventually; std::string's move semantics should ensure that the allocated array is not copied around but reused until its final use. The default NRVO/RVO/move behavior should reduce any copying of the data, if not of the string header itself.
Method 1:
Pass result as a reference from the caller, which has reserved SomeReasonableLargeValue chars, then clear it at the start of readStr. This is only usable if the caller actually can reuse the string.
Method 2:
Use the stack.
// Reserve memory for the string (BOTTLENECK)
if (end - idx < SomeReasonableValue) { // 32?
    // Feel free to use std::array if you want bounds checking,
    // but the preceding "if" should ensure it's not a problem.
    char result[SomeReasonableValue] = {0};
    int ridx = 0;

    for (; idx < end; ++idx) {
        // Not an escape sequence
        if (!isC('\\')) { result[ridx++] = getC(); continue; }

        // Escape sequence: skip '\'
        ++idx;

        // Convert escape sequence
        result[ridx++] = getEscapeSequence(getC());
    }

    // Skip closing '"'
    ++idx;

    result[ridx] = 0; // 0-terminated.

    // Optional assert here to ensure nothing went wrong.
    return result; // the bottleneck might now move here, as the data is copied into the receiving string.
}

// Fallback code only if the string is long.
// Your original code here.
Method 3:
If your string implementation by default allocates enough to fill a 32/64 byte boundary, you might want to exploit that and construct result like this instead, in case the constructor can optimize it.
Str result(end - idx, 0);
Method 4:
Most systems already has some optimized allocator that like specific block sizes, 16,32,64 etc.
std::size_t siz = ((end - idx) & ~0xf) + 16; // if the allocator already works in 16-byte chunks
Str result(siz, 0);
Method 5:
Use either the allocator made by Google (tcmalloc) or Facebook (jemalloc) as the global new/delete replacement.
To understand how a custom allocator can help you, you need to understand what malloc and the heap do and why they are quite slow in comparison to the stack.
The Stack
The stack is a large block of memory allocated for your current scope. You can think of it as this
([] means a byte of memory)
[P][][][][][][][][][][][][][][][]
(P is a pointer that points to a specific byte of memory, in this case its pointing at the first byte)
So the stack is a block with only 1 pointer. When you allocate memory, it just performs pointer arithmetic on P, which takes constant time.
So declaring int i = 0; would mean this,
P + sizeof(int).
[i][i][i][i][P][][][][][][][][][][][],
(i in [] is a block of memory occupied by an integer)
This is blazing fast and as soon as you go out of scope, the entire chunk of memory is emptied simply by moving P back to the first position.
The Heap
The heap allocates memory from a pool of bytes reserved by the C++ runtime. When you call malloc, the heap finds a stretch of contiguous memory that fits your request, marks it as used so nothing else can use it, and returns it to you as a void*.
So, a theoretical heap with little optimization calling new(sizeof(int)), would do this.
Heap chunk
At first : [][][][][][][][][][][][][][][][][][][][][][][][][]
Allocate 4 bytes (sizeof(int)):
A pointer goes through every byte of memory, finds a stretch of the correct length, and returns a pointer to it.
After : [i][i][i][i][][][][][][][][][][][][][][][][][][][][][]
This is not an accurate representation of the heap, but from this you can already see numerous reasons for being slow relative to the stack.
The heap is required to keep track of all already allocated memory and their respective lengths. In our test case above, the heap was already empty and did not require much, but in worst case scenarios, the heap will be populated with multiple objects with gaps in between (heap fragmentation), and this will be much slower.
The heap is required to cycle through all the bytes to find a stretch that fits your length.
The heap can suffer from fragmentation, since it never cleans itself up completely unless you make it do so. So if you allocated an int, a char, and another int, your heap would look like this:
[i][i][i][i][c][i2][i2][i2][i2]
(i stands for bytes occupied by an int and c stands for bytes occupied by a char.) When you de-allocate the char, it will look like this:
[i][i][i][i][empty][i2][i2][i2][i2]
So when you want to allocate another object into the heap,
[i][i][i][i][empty][i2][i2][i2][i2][i3][i3][i3][i3]
unless the new object is the size of 1 char, that one-byte gap cannot be reused, so the usable heap is effectively reduced by 1 byte. In more complex programs with millions of allocations and deallocations, the fragmentation issue becomes severe and the program's behavior can degrade badly.
The heap also has to worry about cases like thread safety (someone else said this already).
Custom Heap/Allocator
So, a custom allocator usually needs to address these problems while providing the benefits of the heap, such as personalized memory management and object permanence.
These are usually accomplished with specialized allocators. If you know you don't need to worry about thread safety, or you know exactly how long your string will be, or you have a predictable usage pattern, you can make your allocator faster than malloc and new by quite a lot.
For example, if your program requires a lot of allocations as fast as possible without lots of deallocations, you could implement a stack allocator, in which you allocate a huge chunk of memory with malloc at startup,
e.g
typedef char* buffer;

// Super simple example; error handling, alignment and copy semantics are omitted.
struct StackAllocator {
    buffer stack;
    char* pointer;

    StackAllocator(int expectedSize) { stack = new char[expectedSize]; pointer = stack; }
    ~StackAllocator() { delete[] stack; }

    char* allocate(int size) { char* returnedPointer = pointer; pointer += size; return returnedPointer; }
    void empty() { pointer = stack; }
};
Get expected size, get a chunk of memory from the heap.
Assign a pointer to the beginning.
[P][][][][][][][][][] ..... [].
then have one pointer that moves for each allocation. When you no longer need the memory, you simply move the pointer back to the beginning of your buffer. This gives you the advantage of O(1) allocations and deallocations as well as object permanence, at the cost of no flexible deallocation and a large initial memory requirement.
For strings, you could try a chunk allocator. For every allocation, the allocator gives a set chunk of memory.
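A sketch of what such a chunk allocator might look like (ChunkAllocator and the 64-byte chunk size are arbitrary choices for illustration):

#include <cstddef>
#include <vector>

// Every request gets one fixed-size chunk; freed chunks go onto a free list
// so they can be reused immediately.
class ChunkAllocator
{
    static constexpr std::size_t kChunk = 64;

    std::vector<char>  pool_;
    std::vector<char*> freeList_;
    std::size_t        used_ = 0;

public:
    explicit ChunkAllocator(std::size_t chunkCount) : pool_(chunkCount * kChunk) {}

    char* allocate()
    {
        if (!freeList_.empty())
        {
            char* p = freeList_.back();
            freeList_.pop_back();
            return p;
        }
        if (used_ + kChunk > pool_.size())
            return nullptr; // pool exhausted; a real allocator would grow or fall back to malloc
        char* p = pool_.data() + used_;
        used_ += kChunk;
        return p;
    }

    void deallocate(char* p) { freeList_.push_back(p); }
};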
Compatibility
Compatibility with other strings is almost guaranteed. As long as you are allocating a contiguous chunk of memory and preventing anything else from using that block of memory, it will work.

Best practice for storing an array of struct of undefined length

The question
Let's suppose that we have a struct such as:
struct MyStruct
{
    enum Type { NONE, TYPE1, TYPE2 };
    Type type;
    int value;
};
Now, the application needs to store an undefined number of these structs in an array or similar container.
The question is: Which will be the best way to do this in terms of memory usage, speed, elegance, etc?
Some considerations
To have a fixed-length array with a length that you know is not going to be exceeded:
MyStruct myStructArray[200];
I suppose this will lead to more memory usage, as it reserves space up front for struct instances that may never arrive.
To have some autoresizable array mechanism like a vector<MyStruct> managing memory by itself.
To store pointers to each struct in some array or vector.
std::vector<MyStruct> is the better option. There is also another option, which is quite close to vector in some ways, called std::deque. Have a look at it; maybe it will help you, or at least increase your awareness of the standard containers. The online documentation says:
As opposed to std::vector, the elements of a deque are not stored contiguously: typical implementations use a sequence of individually allocated fixed-size arrays.
The storage of a deque is automatically expanded and contracted as needed. Expansion of a deque is cheaper than the expansion of a std::vector because it does not involve copying of the existing elements to a new memory location.
Although std::deque doesn't store elements in contiguous memory, it supports random-access iterators, pretty much like std::vector.
Use std::vector<> if the size can vary during runtime, and std::array<> if it is fixed. Although the effort for adding and removing elements (at the front, in particular) in a std::deque<> is lower than for a std::vector<>, vector<> keeps its data in contiguous memory, which, especially for linear traversal, is much more cache friendly. This will improve performance compared to containers that rely on B-trees or similar structures that can be scattered in memory, leading to cache misses during traversal.
You forgot one possibility: dynamically allocated arrays. You can also do this:
long myStructCount = /*whatever*/;
MyStruct* myStructArray = new MyStruct[myStructCount];
This is worse than std::vector<MyStruct> from a usage perspective, but definitely preferable to that evil MyStruct myStructArray[200] approach.
The fixed-size approach is evil because there are precious few cases in which you can prove that your limit won't be exceeded, so in most cases it's nothing more or less than a bug waiting to strike.
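For completeness, a small sketch contrasting the two dynamic approaches (the values and the count are arbitrary; the struct is the one from the question):

#include <memory>
#include <vector>

struct MyStruct
{
    enum Type { NONE, TYPE1, TYPE2 };
    Type type;
    int value;
};

int main()
{
    // Growable, self-managing storage: the usual default choice.
    std::vector<MyStruct> v;
    v.push_back({MyStruct::TYPE1, 42});
    v.push_back({MyStruct::TYPE2, 7});

    // Dynamically allocated array: the count is chosen at runtime, and the
    // unique_ptr makes sure the delete[] cannot be forgotten.
    long myStructCount = 100;
    std::unique_ptr<MyStruct[]> myStructArray(new MyStruct[myStructCount]);
    myStructArray[0] = {MyStruct::NONE, 0};
}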

Can C++ automatic variables vary in size?

In the following C++ program:
#include <string>
using namespace std;

int main()
{
    string s = "small";
    s = "bigger";
}
is it more correct to say that the variable s has a fixed size or that the variable s varies in size?
It depends on what you mean by "size".
The static size of s (as returned by sizeof(s)) will be the same.
However, the size occupied on the heap will vary between the two cases.
What do you want to do with the information?
I'll say yes and no.
s will be the same string instance, but its internal buffer (which is preallocated, depending on your STL implementation) will contain a copy of the constant string you wanted to assign to it.
Should the constant string (or any other char* or string) be bigger than the internal preallocated buffer of s, the buffer of s will be reallocated, according to the string buffer reallocation algorithm implemented in your STL implementation.
This is going to lead to a dangerous discussion because the concept of "size" is not well defined in your question.
The size of the class of s is known at compile time; it's simply the sum of the sizes of its members plus whatever extra information needs to be kept for classes (I'll admit I don't know all the details). The important thing to take from this, however, is that sizeof(s) will NOT change between assignments.
HOWEVER, the memory footprint of s can change during runtime through the use of heap allocations. So as you assign the bigger string to s, its memory footprint will increase because it will probably need more space allocated on the heap. You should probably try to specify what you want.
The std::string variable never changes its size. It just refers to a different piece of memory with a different size and different data.
Neither, exactly. The variable s is referring to a string object.
#include <string>
using namespace std;

int main()
{
    string s = "small";  // s is initialized with a new string object containing "small"
    s = "bigger";        // s is modified using an overloaded operator=
}
Edit, corrected some details and clarified point
See: http://www.cplusplus.com/reference/string/string/ and in particular http://www.cplusplus.com/reference/string/string/operator=/
The assignment results in the original content being dropped and the content of the right side of the operation being copied into the object. It is similar to doing s.assign("bigger"), but assign has a broader range of acceptable parameters.
To get to your original question, the contents of the object s can have variable size. See http://www.cplusplus.com/reference/string/string/resize/ for more details on this.
A variable is an object we refer to by a name. The "physical" size of an object -- sizeof(s) in this case -- doesn't change, ever. The type is still std::string, and the size of a std::string is always constant. However, things like strings and vectors (and other containers, for that matter) have a "logical size" that tells us how many elements of some type they store. A string "logically" stores characters. I say "logically" because a string object doesn't really contain the characters directly. Usually it has only a couple of pointers as "physical members". Since the string object manages a dynamically allocated array of characters and provides proper copy semantics and convenient access to the characters, we can think of those characters as members ("logical members"). Since growing a string is a matter of reallocating memory and updating pointers, we don't even need sizeof(s) to change.
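A tiny demonstration of that distinction (the exact numbers printed are implementation-specific):

#include <iostream>
#include <string>

int main()
{
    std::string s = "small";
    std::cout << sizeof(s) << ' ' << s.size() << ' ' << s.capacity() << '\n';

    s = std::string(1000, 'x');  // force a heap allocation
    std::cout << sizeof(s) << ' ' << s.size() << ' ' << s.capacity() << '\n';

    // sizeof(s) is identical on both lines; only size() and capacity() change.
}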
I would say this is a string object, and it has the capability to grow dynamically and shrink again.