How to create string with predefined value and allocated space? - c++

In C, there is a nice construct to create a c-string with more allocated space:
char str[6] = "Hi"; // [H,i,0,0,0,0]
I thought I could do the same using version (4) of the std::string constructor, but the reference says
The behavior is undefined if s does not point at an array of at least count elements of CharT.
So it is not safe to use
std::string("Hi", 6);
Is there any way to create such std::string without extra copies and reallocations?

Theory:
Legacy c-strings
Consider the following snippet:
int x[10];
void method() {
int y[10];
}
The first declaration, int x[10], uses static storage duration, defined by cppreference as: "The storage for the object is allocated when the program begins and deallocated when the program ends. Only one instance of the object exists. All objects declared at namespace scope (including global namespace) have this storage duration, plus those declared with static or extern."
In this case, the storage is allocated when the program begins and freed when it ends. Where exactly it lives is an implementation detail: zero-initialized globals such as x typically end up in the .bss section of the executable and explicitly initialized ones in .data, while string literals, which never change, are usually placed in a read-only segment and are only referenced at run time.
The second one, int y[10], uses automatic storage duration, defined by cppreference as: "The object is allocated at the beginning of the enclosing code block and deallocated at the end. All local objects have this storage duration, except those declared static, extern or thread_local."
In this case, the allocation is very simple: in most cases it is just a matter of moving the stack pointer.
std::string
A std::string on the other hand is a run-time creature, and it has to allocate some run-time memory:
For smaller strings, std::string has an inner buffer with a constant size and is capable of storing small strings (think of it as a char buffer[N] member)
For larger strings, it performs dynamic allocations.
Practice
You could use reserve(). This method makes sure that the underlying buffer can hold at least N charT's.
Option 1: First reserve, then append
std::string str;
str.reserve(6);
str.append("Hi");
Option 2: First construct, then reserve
std::string str("Hi");
str.reserve(6);
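As a rough sketch of Option 1, the helper below reserves the space first and then appends, so the characters are written into a buffer that is already large enough. The name make_reserved is hypothetical, not a standard function. The standard only guarantees that capacity() is at least the requested amount; the exact value is implementation-defined.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: build a string holding `text` with room for at least
// `capacity` characters, reserving before appending.
std::string make_reserved(const char* text, std::size_t capacity) {
    std::string str;
    str.reserve(capacity);   // guarantees str.capacity() >= capacity
    str.append(text);        // no reallocation if text fits in the reserved space
    assert(str.capacity() >= capacity);
    return str;
}
```

Calling make_reserved("Hi", 6) yields a string of size 2 that can grow to 6 characters without another allocation.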

To ensure at most one runtime allocation, you could write:
std::string str("Hi\0\0\0", 6);
str.resize(2);
However, in practice many string implementations use the Small String Optimization, which makes no allocations if the string is "short" (up to size 16 is suggested on that thread). So actually you would not suffer a reallocation by starting the string off at size 2 and later increasing to 6.

Related

Dynamically created string allocated on heap or stack - C

Context
I was experimenting with getting C strings in C++ without allocating memory on the heap and came across this in testing:
#include <stddef.h>
#include <stdlib.h>
char* get_empty_c_string(size_t length) {
char buffer[length];
char *string = buffer;
for (size_t i = 0; i ^ length; i++) *(string + i) = '\0';
return string;
}
int main(void) {
char *string = get_empty_c_string(20u); // Allocated on heap?
// or stack?
return 0;
}
Question
Is the C string returned allocated on heap or stack?
As far as I know:
Heap allocation occurs with the calloc, malloc & realloc C standard functions or new & new[] C++ keywords.
Stack allocation in most other cases.
The array buffer is a variable length array (VLA), meaning its size is determined at runtime. As a variable local to a function, it resides on the stack. The pointer string then points to that array, and that pointer is returned. Because the returned pointer points to a local stack variable that has gone out of scope, attempting to use that pointer invokes undefined behavior.
Also, note that VLAs are a C only feature.
There is no way in standard C++ to obtain runtime-sized memory of automatic storage duration (which usually maps to stack memory).
Therefore a proper string of any length cannot be obtained on the stack. You can only allocate a buffer with a maximal size and use strings up to that length in the program. (Something similar is usually done by std::string as so-called short string optimization.)
Furthermore, you cannot return pointers or references to variables with automatic storage duration from a function. When the function returns the variables are destroyed and the pointer/reference becomes invalid. You can only ever use the stack-allocation until the function returns. You can however return the variable by-value.
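A minimal sketch of the "fixed maximal size, returned by value" approach described above, assuming a maximum of 20 characters is acceptable. The constant kMaxLength is an assumption made for this example, and std::array stands in for the raw char array because built-in arrays cannot be returned by value directly.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Fixed maximum size chosen for this sketch (an assumption, not from the question).
constexpr std::size_t kMaxLength = 20;

// Returns a zero-filled buffer by value; the caller gets its own object,
// so nothing dangles after the function returns.
std::array<char, kMaxLength> get_empty_c_string() {
    std::array<char, kMaxLength> buffer{};  // value-initialized: every element is '\0'
    return buffer;
}
```

The caller can pass buffer.data() to functions that expect a char*, as long as the std::array object itself is still alive.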
As @PaulMcKenzie points out, your implementation of get_empty_c_string() would fail to compile as standard C++ (some compilers accept it as an extension): in essence, arrays declared as local variables of a function need a size that is fixed at compile time. This is because that amount of memory is reserved on the stack at function invocation.
I can see that you're trying to have dynamic memory allocation as part of the function itself, which is why you need a heap allocator such as malloc or new.

Difference between array declarations

I've always declared my arrays using this method:
bool array[256];
However, I've recently been told to declare my arrays using:
bool* array = new bool[256];
What is the difference and which is better? Honestly, I don't fully understand the second way, so an explanation on that would be helpful too.
bool array[256];
This allocates a bool array with automatic storage duration.
It will be automatically cleaned up when it goes out of scope.
In most implementations this would be allocated on the stack if it's not declared static or global.
Allocations/deallocations on the stack are computationally really cheap compared to the alternative. It also might have some advantages for data-locality but that's not something you usually have to worry about. But you might need to be careful of allocating many large arrays to avoid a stack overflow.
bool* array = new bool[256];
This allocates an array with dynamic storage duration.
You need to clean it up yourself with a call to delete[] later on. If you do not then you will leak memory.
Alternatively (as mentioned by @Fibbles) you can use smart pointers to express the desired ownership/lifetime requirements. This leaves the responsibility of cleaning up to the smart-pointer class, which helps a lot with guaranteeing deletion, even when exceptions are thrown.
It has the advantage of being able to pass it to outer scopes and other objects without copying (RVO will avoid copying for the first case too in certain cases, but storing it as a data-member and other uses can't be optimized in the first case).
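The smart-pointer variant mentioned above could look roughly like this. The helper name make_flags is hypothetical; std::make_unique for arrays requires C++14 or later.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical helper: allocate `count` bools on the free store, all false.
// The unique_ptr calls delete[] automatically when it goes out of scope.
std::unique_ptr<bool[]> make_flags(std::size_t count) {
    return std::make_unique<bool[]>(count);  // value-initializes the elements
}
```

The returned pointer can be moved to an outer scope or stored as a data member, and no explicit delete[] is needed anywhere.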
The first is allocation of memory on stack:
// inside main (or function, or non-static member of class) -> stack
int main() {
bool array[256];
}
or maybe as a static memory:
// outside main (and any function, or static member of class) -> static
bool array[256];
int main() {
}
The last is allocation of dynamic memory (in heap):
int main() {
bool* array = new bool[256];
delete[] array; // you should not forget to release memory allocated in heap
}
The advantage of dynamic memory is that it can be created with variable number of elements (not 256, but from some user input for example). But you should release it each time by yourself.
More about stack, static and heap memory and when you should use each is here: Stack, Static, and Heap in C++
The difference is static vs dynamic allocation, as previous answers have indicated. There are reasons for using one over the other. This video by Herb Sutter explains when you should use what. https://www.youtube.com/watch?v=JfmTagWcqoE It is just over 1 1/2 hours.
My preference is to use
bool array[256];
unless there's a reason to do otherwise.
Mike

C++ std::string* s; Memory reclaimed?

Given a function foo with a statement in it:
void foo() {
std::string * s;
}
Is memory reclaimed after this function returns?
I am assuming yes because this pointer isn't pointing to anything but some people are saying no - that it is a dangling pointer.
std::string* s is just an uninitialized pointer to a string. The pointer will be destroyed when function foo returns (because the pointer itself is a local variable allocated on the stack). No std::string was ever created, hence you won't have any memory leak.
If you say
void foo() {
std::string * s = new std::string;
}
Then you will have a memory leak, because the std::string created with new is never deleted.
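For comparison, here is a small sketch of the two usual ways to avoid leaking the string: hold it by value, or pair the new with a delete. The function names foo_value and foo_pointer are made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Preferred: a plain std::string value; its destructor releases any memory.
std::size_t foo_value() {
    std::string s = "hello";
    return s.size();
}   // s is destroyed here, nothing leaks

// If a pointer is really needed, every new must be matched by a delete.
std::size_t foo_pointer() {
    std::string* s = new std::string("hello");
    std::size_t n = s->size();
    delete s;   // without this line the string object would leak
    return n;
}
```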
This code is typical when people learn about strings a-la C, and then start using C++ through C idioms.
C++ classes (in particular standard library classes) treat objects as values and manage the memory they need themselves.
std::string, in this sense, is no different from an int. If you need a "text container", just declare a std::string (not a std::string*) and initialize it accordingly (a default-constructed std::string is empty by definition), then use it to form expressions using methods, operators and related functions, as you would with other simple types.
std::string* itself is a symptom of a badly designed environment.
Explicit dynamic memory in C++ is typically used in two situations:
You don't know at compile time the size of an object (typical with unknown size arrays, like C strings are)
You don't know at compile time the runtime-type of an object (since its class will be decided on execution, based on some other input)
Now, std::string manages the first point itself and does not support the second (it has no virtual methods), so allocating it dynamically adds no value: it just adds the complication of managing, by yourself, the memory that holds the string object, which is itself a manager of other memory holding its actual text.
This code only declares a pointer; it does not allocate a new string. The pointer is uninitialized, so it does not point at any valid string object.
Only the memory for the pointer value itself is allocated, and after the function returns it is no longer valid.

Heap or Stack? When a constant string is referred in function call in C++

Consider the function:
char *func()
{
return "Some thing";
}
Is the constant string (char array) "Some thing" stored in the stack as local to the function call or as global in the heap?
I'm guessing it's in the heap.
If the function is called multiple times, how many copies of "Some thing" are in the memory? (And is it the heap or stack?)
The string literal "Some thing" is of type const char[11] in C++ (it decays to const char* when returned). So it is neither on the heap nor on the stack, but in a read-only location that is an implementation detail.
From Wikipedia
Data
The data area contains global and static variables used by the program
that are initialized. This segment can be further classified into
initialized read-only area and initialized read-write area. For
instance the string defined by char s[] = "hello world" in C and a C
statement like int debug=1 outside the "main" would be stored in
initialized read-write area. And a C statement like const char* string
= "hello world" makes the string literal "hello world" to be stored in
initialized read-only area and the character pointer variable string
in initialized read-write area. Ex: static int i = 10 will be stored
in data segment and global int i = 10 will be stored in data segment
Constant strings are usually placed with the program code, which is neither heap nor stack (this is an implementation detail). The literal has static storage duration, so in practice only one copy exists and each call returns the same pointer value. Since the string is part of the program image, it may never be paged into memory if it is not used, and if you run two copies of the program they will share the same copy in RAM (this only works for read-only data, which includes string constants in C).
Neither, it's in the static section of the program. Similar to having the string as a global variable. There is only ever one copy of the string within the translation unit.
Neither on the heap, nor on stack, it is part of the so-called init section in the executable image (COFF). This is loaded into memory and contains stuff like strings.
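To illustrate the answers above, here is a small sketch showing that a pointer to a string literal stays valid after the function returns, because the literal has static storage duration. The return type is changed to const char* since modern C++ does not allow binding a literal to a plain char*. The helper func_length is a made-up name for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Returning a pointer to a string literal is safe: the literal is neither on
// the stack nor on the heap, and it outlives the call.
const char* func() {
    return "Some thing";
}

// Hypothetical helper: reads the literal through the returned pointer.
std::size_t func_length() {
    return std::strlen(func());
}
```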

What is the difference between Static and Dynamic arrays in C++?

I have to do an assignment for my class and it says not to use Static arrays, only Dynamic arrays. I've looked in the book and online, but I don't seem to understand.
I thought Static was created at compile time and Dynamic at runtime, but I might be mistaking this with memory allocation.
Can you explain the difference between static array and dynamic array in C++?
Static arrays are created on the stack, and have automatic storage duration: you don't need to manually manage memory, but they get destroyed when the function they're in ends. They necessarily have a fixed size at compile time:
int foo[10];
Arrays created with operator new[] have dynamic storage duration and are stored on the heap (technically the "free store"). They can have any size during runtime, but you need to allocate and free them yourself since they're not part of the stack frame:
int* foo = new int[10];
delete[] foo;
static is a keyword in C and C++, so rather than a general descriptive term, static has very specific meaning when applied to a variable or array. To compound the confusion, it has three distinct meanings within separate contexts. Because of this, a static array may be either fixed or dynamic.
Let me explain:
The first is C++ specific:
A static class member is a value that is not instantiated with the constructor or deleted with the destructor. This means the member has to be initialized and maintained some other way. A static member may be a pointer that is initialized to null and then allocated the first time a constructor is called. (Yes, that would be static and dynamic.)
Two are inherited from C:
Within a function, a static variable is one whose memory location is preserved between function calls. It is static in that it is initialized only once and retains its value between function calls (use of statics makes a function non-reentrant, i.e. not threadsafe).
Static variables declared outside of functions are global variables that can only be accessed from within the same module (the source file together with everything it #includes).
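The function-local meaning can be sketched with a call counter. The name count_calls is made up for this example.

```cpp
#include <cassert>

// `calls` is initialized only once and keeps its value between invocations,
// unlike an ordinary local variable, which would start at 0 on every call.
int count_calls() {
    static int calls = 0;
    return ++calls;
}
```

Each call returns a value one higher than the previous call, which also shows why such a function is not safe to call from several threads without synchronization.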
The question (I think) you meant to ask is what the difference is between dynamic arrays and fixed or compile-time arrays. That is an easier question: the size of a compile-time array is determined in advance (when the program is compiled), and the array is either part of a function's stack frame or, for globals and statics, allocated before the main function runs. Dynamic arrays are allocated at runtime with the new keyword (or the malloc family from C), and their size does not have to be known in advance. Dynamic allocations are not cleaned up automatically; unless you free them yourself, they stay allocated until the program stops running.
It's important to have clear definitions of what terms mean. Unfortunately there appear to be multiple definitions of what static and dynamic arrays mean.
Static variables are variables defined using static memory allocation. This is a general concept independent of C/C++. In C/C++ we can create static variables with global, file, or local scope like this:
int x[10]; // static array with global scope
static int y[10]; // static array with file scope
void foo() {
static int z[10]; // static array with local scope
}
Automatic variables are usually implemented using stack-based memory allocation. An automatic array can be created in C/C++ like this:
void foo() {
int w[10]; // automatic array
}
What these arrays (x, y, z, and w) have in common is that the size of each of them is fixed and is defined at compile time.
One of the reasons that it's important to understand the distinction between an automatic array and a static array is that static storage is usually implemented in the data section (or BSS section) of an object file and the compiler can use absolute addresses to access the arrays which is impossible with stack-based storage.
What's usually meant by a dynamic array is not one that is resizeable but one implemented using dynamic memory allocation with a fixed size determined at run-time. In C++ this is done using the new operator.
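void foo(std::size_t n) {
int *d = new int[n]; // dynamically allocated array with size n
delete[] d; // must be released manually
}
But it's possible to create an automatic array with a fixed size defined at runtime using alloca (a non-standard function available on many platforms):
void foo(std::size_t n) {
int *s = (int*)alloca(n*sizeof(int)); // released automatically when foo returns
}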
For a true dynamic array one should use something like std::vector in C++ (or a variable length array in C).
What was meant for the assignment in the OP's question? I think it's clear that what was wanted was not a static or automatic array but one that either used dynamic memory allocation using the new operator or a non-fixed sized array using e.g. std::vector.
I think the semantics being used in your class are confusing. What's probably meant by 'static' is simply "constant size", and what's probably meant by "dynamic" is "variable size". In that case then, a constant size array might look like this:
int x[10];
and a "dynamic" one would just be any kind of structure that allows for the underlying storage to be increased or decreased at runtime. Most of the time, the std::vector class from the C++ standard library will suffice. Use it like this:
std::vector<int> x(10); // this starts with 10 elements, but the vector can be resized.
std::vector has operator[] defined, so you can use it with the same semantics as an array.
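A brief sketch of that usage, assuming the array only needs to grow. The helper name make_resized is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: start with 10 zero-initialized ints, then resize.
std::vector<int> make_resized(std::size_t new_size) {
    std::vector<int> x(10);   // 10 elements, all 0
    x[3] = 42;                // operator[] works like a built-in array
    x.resize(new_size);       // elements below new_size are kept; new ones are 0
    return x;
}
```

The vector frees its storage in its destructor, so no delete[] is required.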
Static arrays are allocated memory at compile time and the memory is allocated on the stack. Whereas, the dynamic arrays are allocated memory at the runtime and the memory is allocated from heap.
int arr[] = { 1, 3, 4 }; // static integer array.
int* arr = new int[3]; // dynamic integer array.
I think in this context it means it is static in the sense that the size is fixed.
Use std::vector. It has a resize() function.
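You could have a pseudo dynamic array where the size is set by the user at runtime, but is then fixed after that (note that this is a variable length array, which is a C feature and only a compiler extension in C++).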
int size;
cin >> size;
int dynamicArray[size];
Static Array :
Static arrays are allocated memory at compile time.
Size is fixed.
Located in stack memory space.
Eg. : int array[10]; //array of size 10
Dynamic Array :
Memory is allocated at run time.
Size is not fixed.
Located in Heap memory space.
Eg. : int* array = new int[10];
Yes, that's right: a static array is created at compile time, whereas a dynamic array is created at run time. As far as memory location is concerned, static arrays are located on the stack and dynamic arrays are created on the heap. Everything located on the heap needs manual memory management unless a garbage collector is present (as in the .NET Framework); otherwise there is a risk of a memory leak.
Static array: Efficiency. No dynamic allocation or deallocation is required.
Arrays declared inside a function in C or C++ with the static modifier are static.
Example: static int foo[5];
static array:
The memory allocation is done at compile time and the memory is allocated in stack memory.
The size of the array is fixed.
dynamic array:
The memory allocation is done at the runtime and the memory is allocated in the heap memory
The size of the array is not fixed.
Let's state this issue with a function
If we have a static array, a function that walks over it visits a fixed number of elements, all allocated up front; nothing can be appended.
A dynamic array, on the other hand, can grow its memory when we append to it.
A static array is declared with the number of elements given inside the brackets.
A dynamic array is declared without a fixed number of elements; its size is supplied at run time.
Example:
char a[10]; // static array
char* b = new char[n]; // dynamic array, n chosen at run time (char a[]; on its own does not compile)