char* buffer = new vs char buffer[] in C++

1. char* buffer = new char[size]
2. char buffer[size]
I'm new to C++ and I see most places creating buffers using the first method. I know that with the first method, the allocated memory persists (and can be passed around) until it is manually released with delete[], while with the second method the buffer's lifetime depends on its scope. If I only plan on the buffer lasting through a particular function and I don't plan on passing it to anything else, does it matter which method I use?

char* buffer = new char[size]
This is portable, but should be avoided. Until you really know what you're doing, using new directly is almost always a mistake (and when you do know what you're doing, it's still a mistake, but you'll know that without being told).
char buffer[size]
This depends on how you've defined size. If it's a constant (and fairly small), then this is all right. If it's not a constant, then any properly functioning compiler is required to reject it (but some common ones accept it anyway).
If it's constant, but "large", the compiler will accept the code, but it's likely to fail when you try to execute it. In this case, anything over about a megabyte is normally too large, and anything more than a few hundred kilobytes or so becomes suspect.
There is one exception to that though: if this is defined outside any function (i.e., as a global variable), then it can safely be much larger than a local variable can be. At the same time, I feel obliged to point out that I consider global variables something that should normally be avoided as a rule (and I'm far from being alone in holding that opinion).
Also note that these two are (more or less) mutually exclusive: if size is a constant, you generally want to avoid dynamic allocation, but it has to be a constant to just define an array (again, with a properly functioning compiler).
Unless size is a fairly small constant, most of the time you should avoid both of these. What you most likely want is either:
std::string buffer;
or:
std::vector<char> buffer(size);
or possibly:
std::array<char, size> buffer;
The first two of these can allocate space for the buffer dynamically, but they generally keep the allocation "hidden", so you don't normally need to deal with it directly. The std::array is pretty much like the char buffer[size] (e.g., it has a fixed size and is really only suitable for fairly small sizes), but it enforces that the size is a compile-time constant and gives you roughly the same interface as vector (minus anything that would change the number of elements, since that's fixed for std::array).
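To make the recommendation concrete, here is a minimal side-by-side sketch (the buffer contents and the fill step are placeholders); the vector version releases its memory automatically, even if an exception is thrown partway through:

#include <cstddef>
#include <vector>

// What the question shows: manual new[]/delete[].
void use_raw_new(std::size_t size)
{
    char* buffer = new char[size];
    // ... fill and use buffer ...
    delete[] buffer;              // easy to forget, and skipped if an exception is thrown above
}

// What this answer recommends for a runtime size: std::vector.
void use_vector(std::size_t size)
{
    std::vector<char> buffer(size);   // zero-initialized, heap-allocated
    // ... fill and use buffer.data() / buffer[i] ...
}                                     // memory released automatically here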

The main difference is that the first variant is dynamic allocation and the second one is not. You need dynamic allocation when you do not know at compile time how much memory you will need, i.e. when size is not a constant but is calculated at runtime from external input.
It is good practice¹ to use containers that handle dynamic memory internally and thus ensure that you do not have to delete manually, which is often a source of bugs and memory leaks.
A common, dynamic container for all kinds of data is
std::vector<char> (don't forget to #include <vector> )
However, if you handle text, use the class std::string, which also manages its memory internally. Raw char* arrays are a holdover from C.
¹ The main exception to that good practice is when you store not primitive data types but your own classes holding large amounts of data. The reason is that std::vector<> copies its elements when it is resized (and those copies are more expensive the larger the data).
However once you have come that far in your C++ projects, you should know about "smart pointers" by then which are the safe solution for those special cases.
By the way, with &the_vector[0] (the address of the first element in the vector) you get a pointer that behaves pretty much like a char array and can therefore be passed to older functions that do not accept vectors directly.
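A small sketch of that last point (legacy_print is an invented stand-in for an old C-style API):

#include <cstdio>
#include <vector>

// Hypothetical older function that only accepts a raw, null-terminated char array.
void legacy_print(const char* s) { std::puts(s); }

int main()
{
    std::vector<char> buffer{'h', 'i', '\0'};
    legacy_print(&buffer[0]);        // address of the first element, as described above
    legacy_print(buffer.data());     // equivalent, and a bit clearer since C++11
}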

Related

Are two heap allocations more expensive than a call to std::string fill ctor?

I want to have a string with a capacity of 131 chars (or bytes). I know two simple ways of achieving that. So which of these two code blocks is faster and more efficient?
std::string tempMsg( 131, '\0' ); // constructs the string with a 131 byte buffer from the start
tempMsg.clear( ); // clears those '\0' chars to free space for the actual data
tempMsg += "/* some string literals that are appended */";
or this one:
std::string tempMsg; // default constructs the string with a 16 byte buffer
tempMsg.reserve( 131 ); // reallocates the string to increase the buffer size to 131 bytes??
tempMsg += "/* some string literals that are appended */";
I guess the first approach only uses one allocation, sets all those 131 bytes to 0 ('\0'), and then clears the string (std::string::clear is generally constant-time according to: https://www.cplusplus.com/reference/string/string/clear/).
The second approach uses 2 allocations but on the other hand, it doesn't have to set anything to '\0'. But I've also heard about compilers allocating 16 bytes on the stack for a string object for optimization purposes. So the 2nd method might use only 1 heap allocation as well.
So is the first method faster than the other one? Or are there any other better methods?
The most accurate answer is that it depends. The most probable answer is that the second is as fast or faster. Calling the fill ctor requires not only a heap allocation but also a fill (which typically translates to a memset in my experience).
clear usually won't do anything for a POD char besides setting a pointer or size member to zero, because char is a trivially destructible type. There's usually no loop involved in clear unless you instantiate std::basic_string with a non-trivial UDT. Otherwise it's constant-time and dirt-cheap in practically every standard library implementation.
Edit: An Important Note:
I have never encountered a standard library implementation that does this, or it has slipped my memory (very possible, as I think I'm turning senile), but Viktor Sehl pointed out something very important in the comments that I was ignorant about:
Please note that std::string::clear() on some implementations frees the allocated memory (if there is any), unlike a std::vector.
That would actually make your first version involve two heap allocations. But the second should still only be one (opposite of what you thought).
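If you want to check what your own standard library does, a quick diagnostic like this (just a sketch) prints the capacity before and after clear(); if the second number is smaller, clear() released the buffer:

#include <iostream>
#include <string>

int main()
{
    std::string s(131, '\0');
    std::cout << "capacity after fill ctor: " << s.capacity() << '\n';
    s.clear();
    std::cout << "capacity after clear():   " << s.capacity() << '\n';
    // On most implementations the two numbers match (clear() keeps the buffer);
    // an implementation like the one Viktor describes would show a smaller value.
}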
Resuming:
But I've also heard about compilers allocating 16 bytes on the stack for a string object for optimization purposes. So the 2nd method might use only 1 heap allocation as well.
Small Buffer Optimizations
The first allocation is a small-buffer optimization for implementations that use it (technically not always on the stack, but it avoids additional heap allocations). It's not heap-allocated separately, and you can't avoid it with the fill ctor (the fill ctor will still use the small buffer). What you can avoid is filling the entire array with '\0' before you fill it with what you actually want, and that's why the second version is likely faster (marginally or not, depending on how many times you invoke it in a loop). That fill is needless overhead unless the optimizer eliminates it for you, and in my experience it's unlikely that optimizers will do that in loopy cases that can't be optimized away with something like SSA.
I just pitched in here because your second version is also clearer in intent: the first fills the string with something as an attempted optimization (a very possibly misguided one, if you ask me) only to throw it out and replace it with what you actually want. The second is at least clearer in intent and almost certainly as fast or faster on most implementations.
On Profiling
I would always suggest measuring if in doubt, and especially before you start attempting funny things like your first example. I can't recommend the profiler enough if you're working in performance-critical fields. The profiler will not only answer this question for you, it'll also teach you to refrain from writing such counter-intuitive code as in the first example except in places where it makes a real, positive difference (in this case I think the difference is actually negative or neutral). From my perspective, the use of both a profiler and a debugger should ideally be taught in CS 101. The profiler helps mitigate the dangerous tendency to optimize the wrong things, very counter-productively. Profilers tend to be very easy to use: you just run them, make your code perform the expensive operation you want to optimize, and you get back a nice breakdown of where the time actually goes.
If the small buffer optimization confuses you a bit, a simple illustration is like this:
struct SomeString
{
    // Pre-allocates (always) some memory in advance to avoid additional
    // heap allocs. The size here is just an illustrative placeholder.
    static constexpr int some_small_fixed_size = 16;
    char small_buffer[some_small_fixed_size] = {};

    // Will point to small_buffer until the string gets large.
    char* ptr = small_buffer;
};
The allocation of the small buffer is unavoidable, but it doesn't require separate calls to malloc/new/new[]. And it's not allocated separately on the heap from the string object itself (if it is allocated on heap). So both of the examples that you showed involve, at most, a single heap allocation (unless your standard library implementation is FUBAR -- edit: or one that Viktor is using). What the first example has conceptually on top of that is a fill/loop (could be implemented as a very efficient intrinsic in assembly but loopy/linear time stuff nevertheless) unless the optimizer eliminates it.
String Optimization
So is the first method faster than the other one? Or are there any other better methods?
You can write your own string type which uses an SBO with, say, 256 bytes for the small buffer which is typically going to be much larger than any std::string optimization. Then you can avoid heap allocations entirely for your 131-length case.
#include <cstddef>

template <class Char, std::size_t SboSize = 256>
class TempString
{
private:
    // Stores the small buffer.
    Char sbo[SboSize] = {};

    // Points to the small buffer until num > SboSize.
    Char* ptr = sbo;

    // Stores the length of the string.
    std::size_t num = 0;

    // Stores the capacity of the string.
    std::size_t cap = SboSize;

public:
    // Destroys the string.
    ~TempString()
    {
        if (ptr != sbo)
            delete[] ptr;
    }

    // Remaining implementation left to the reader. Note that implementing
    // swap requires swapping the contents of the SBO if the strings
    // point to them, rather than swapping pointers (swapping is a
    // little bit tricky with SBOs involved, so be wary of that).
};
That would be ill-suited for persistent storage though, because it would blow up memory use (e.g., requiring 256+ bytes just to store a string with one character in it) if you stored a bunch of strings persistently in a container. It's well-suited for temporary strings, though, that you transfer into and out of function calls. I'm primarily a gamedev, so rolling our own alternatives to the standard C++ library is quite normal here given our requirements for real-time feedback with high graphical fidelity. I wouldn't recommend it for the faint-hearted though, and definitely not without a profiler. This is a very practical and viable option in my field, although it might be ridiculous in yours. The standard lib is excellent, but it's tailored for the needs of the entire world. You can usually beat it if you tailor your code very specifically to your needs and produce more narrowly-applicable code.
Actually, even std::string with SBOs is rather ill-suited for persistent storage anyway and not just TempString above because if you store like std::unordered_map<std::string, T> and std::string uses a 16-byte SBO inflating sizeof(std::string) to 32 bytes or more, then your keys will require 32 bytes even if they just store one character fitting only two strings or less in a single cache line on traversal of the hash table. That's a downside to using SBOs. They can blow up your memory use for persistent storage that's part of your application state. But they're excellent for temporaries whose memory is just pushed and popped to/from stack in a LIFO alloc/dealloc pattern which only requires incrementing and decrementing a stack pointer.
If you want to optimize the storage of many strings though from a memory standpoint, then it depends a lot on your access patterns and needs. However, a fairly simple solution is like so if you want to just build a dictionary and don't need to erase specific strings dynamically:
#include <cstddef>
#include <cstdint>
#include <string_view>
#include <vector>

// Just using a struct for simplicity of illustration:
struct MyStrings
{
    // Stores all the characters for all the null-terminated strings.
    std::vector<char> buffer;

    // Stores the starting index into the buffer for the nth string.
    std::vector<std::size_t> string_start;

    // Inserts a null-terminated string to the buffer.
    void insert(const std::string_view str)
    {
        string_start.push_back(buffer.size());
        buffer.insert(buffer.end(), str.begin(), str.end());
        buffer.push_back('\0');
    }

    // Returns the nth null-terminated string.
    std::string_view operator[](std::int32_t n) const
    {
        return {buffer.data() + string_start[n]};
    }
};
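A quick usage sketch of the struct above (the strings are just placeholders):

#include <iostream>

int main()
{
    MyStrings dict;
    dict.insert("alpha");
    dict.insert("beta");
    std::cout << dict[1] << '\n';   // prints "beta"
}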
Another common solution that can be very useful if you store a lot of duplicate strings in an associative container or need fast searches for strings that can be looked up in advance is to use string interning. The above solution can also be combined to implement an efficient way to store all the interned strings. Then you can store lightweight indices or pointers to your interned strings and compare them immediately for equality, e.g., without involving any loops, and store many duplicate references to strings that only cost the size of an integer or pointer.
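A minimal interning sketch under those assumptions (StringInterner, intern, and text are invented names for illustration); each distinct string is stored once, and comparing two interned strings reduces to comparing two integers:

#include <cstddef>
#include <string>
#include <string_view>
#include <unordered_map>
#include <vector>

class StringInterner
{
public:
    // Returns a stable integer id; the same text always yields the same id.
    std::size_t intern(std::string_view s)
    {
        auto it = ids_.find(std::string(s));
        if (it != ids_.end())
            return it->second;
        const std::size_t id = strings_.size();
        strings_.emplace_back(s);
        ids_.emplace(strings_.back(), id);
        return id;
    }

    // Looks up the text for a previously interned id.
    std::string_view text(std::size_t id) const { return strings_[id]; }

private:
    std::vector<std::string> strings_;                    // one copy of each distinct string
    std::unordered_map<std::string, std::size_t> ids_;    // text -> id
};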

Allocating contiguous array of non-default constructed templated structs

I'm writing a C++ library in a multi-threaded application where performance is important, and am not sure of the best way to achieve my goal of allocating a fixed-size array of templated structs. I'd like a single contiguous block of memory for cache locality, and pad the cache lines to avoid false sharing.
Here's the struct definition:
template<size_t buf_bytes, size_t cache_pad_bytes>
struct PaddedBuffer {
    char buf[buf_bytes];
    char cache_line_padding[cache_pad_bytes];
};
Ideally the contiguous block of memory would look something like this, and be accessible as an array:
[ {buf, pad}, {buf, pad}, ..., {buf, pad} ]
where [ ] indicates the array, and { } indicates a struct. The fundamental challenge is that the size of buf is not known until runtime.
This is what I really want to be able to do, but given that buf_bytes is not a constant expression, the C++ standard doesn't allow this with a templated function (of course).
PaddedBuffer<buf_bytes, cache_pad_bytes> buffers[num_buffered_entries];
I can think of a couple options, but don't really like any of them. Is there a better way?
Option 1: Dynamically allocate everything
Change the PaddedBuffer definition to make the buffer a pointer (char* buf). Then dynamically allocate an array of PaddedBuffer*, and then dynamically allocate num_buffered_entries of them. This isn't ideal because the buffers won't necessarily be adjacent.
Option 2: Dynamically allocate one large buffer
I can essentially allocate a big chunk of memory of size num_buffered_entries * (buf_bytes + cache_pad_bytes), then define my own operator[] function, and just use it pretty much as normal after that. I think this works, but it feels a bit hacky. Also, I lose the named struct accessor ability, so I can't add a field foo to the struct and just write object.foo; I'd have to manage that myself. In this case I only need the buffer (not the cache line padding), so it doesn't really matter, but it does feel a bit like I'm reinventing the wheel.
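For what it's worth, a rough sketch of what Option 2 might look like (all names here are invented for illustration, and aligning the chunk itself to a cache-line boundary is left out of this sketch):

#include <cstddef>
#include <memory>

// One contiguous chunk; element i starts at i * (buf_bytes + pad_bytes).
class PaddedBufferArray
{
public:
    PaddedBufferArray(std::size_t count, std::size_t buf_bytes, std::size_t pad_bytes)
        : stride_(buf_bytes + pad_bytes),
          storage_(new char[count * stride_]())   // value-initialized (zeroed)
    {
    }

    // Returns a pointer to the i-th buffer; the padding simply follows it.
    char* operator[](std::size_t i) { return storage_.get() + i * stride_; }

private:
    std::size_t stride_;
    std::unique_ptr<char[]> storage_;
};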
(Note: I realize I can use a std::vector and initialize it with a copy constructor, but I don't want a std::vector for a variety of reasons.)
A bit more background on the problem I'm solving: this data structure is essentially used for a lock-free single-producer producer-consumer queue, where the buffers are reused thousands or millions of times throughout the course of the application. I'd like to allocate the memory once at the beginning of the program and then access it efficiently (and without false sharing). I'm assuming I can achieve good higher-level cache locality by allocating this as a contiguous chunk, and that I can avoid false sharing with the cache line padding. (I did investigate related libraries (TBB, Boost) and they were close, but didn't quite map to my problem.) This is in the inner loop of a large library, and I therefore want it to be reasonably fast. I can just go with one of these options (Option 1 would be the most straightforward) if there isn't a clean solution, but if there is one, I'd like to use it. Note that the queue is intended to be of a fixed size, and is therefore implemented as a circular buffer with appropriate bounds checking.
Is there a cleaner and recommended approach?

Is accessing the elements of a char* or std::string faster?

I have seen char* vs std::string in c++, but am still wondering if accessing the elements of a char* is faster than std::string.
If you need to know, the char*/std::string will contain less than 80 characters, but I would like to know a cutoff if there is one.
I would also like to know the answer to this question for different compilers and different Operating Systems, if there is a difference.
Thanks in advance!
Edit: I would be accessing the elements using array[n], and would set the values once.
(Note: If this doesn't meet the help center, please let me know how I can reword it before down-voting)
They should be equivalent in general, though std::string might be a tiny bit slower. Why? Because of short-string optimization.
Short-string optimization is a trick some implementations use to store short strings in std::string without allocating any memory. Usually this is done by doing something like this (though different variations exist):
union {
    char* data_ptr;
    char short_string[sizeof(char*)];
};
Then std::string can use the short_string array to store the data, but only if the size of the string is short enough to fit in there. If not, then it will need to allocate memory and use data_ptr to store that pointer.
Depending on how short-string optimization is implemented, whenever you access data in a std::string, it needs to check its length and determine if it's using the short_string or the data_ptr. This check is not totally free: it takes at least a couple instructions and might cause some branch misprediction or inhibit prefetching in the CPU.
libc++ uses a short-string optimization somewhat like this, which requires checking whether the string is short or long on every access.
libstdc++ uses short-string optimization, but they implement it slightly differently and actually avoid any extra access costs. Their union is between a short_string array and an allocated_capacity integer, which means their data_ptr can always point to the real data (whether it's in short_string or in an allocated buffer), so there aren't any extra steps needed when accessing it.
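Very roughly, and much simplified compared to the real library code, the difference between the two layouts looks like this (these structs are illustrative sketches, not the actual implementations):

#include <cstddef>

// libc++-style idea: decide on every element access whether the data lives
// in the small buffer or on the heap.
struct CheckOnAccessString
{
    union {
        char* heap_ptr;
        char  small[16];
    };
    std::size_t size;
    bool        is_small;

    char& operator[](std::size_t i)
    {
        return is_small ? small[i] : heap_ptr[i];   // branch on each access
    }
};

// libstdc++-style idea: keep a data pointer that always points at the live
// characters, whether they sit in the small buffer or on the heap.
struct AlwaysPointerString
{
    char*       data;   // points to small[] or to a heap buffer
    std::size_t size;
    union {
        char        small[16];
        std::size_t allocated_capacity;
    };

    char& operator[](std::size_t i) { return data[i]; }   // no branch
};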
If std::string doesn't use short-string optimization (or if it's implemented like in libstdc++), then it should be the same as using a char*. I disagree with black's statement that there is an extra level of indirection in this situation. The compiler should be able to inline operator[] and it should be the same as directly accessing the internal data pointer in the std::string.
Since you don't have direct access to the underlying CharT sequence, accessing it will require an extra layer through the public interface. So it could be slower, probably requiring 20-30 cycles more. Even then, you might only see a difference in a tight loop.
However, it's extremely easy to optimize this out considering the large range of techniques a compiler can employ (caching, inlining, non-standard function calls and so on) if you instruct it to.

CString or char array which one is better in terms of memory

I read somewhere that usage of CString is costly. Can you clarify that with an example? Also, between CString and a char array, which is better in terms of memory?
A CString, in addition to the array of chars (or wide chars), contains the string size, the allocated buffer size, and a reference counter (serving additionally as a lock flag). The buffer containing the array of chars may be significantly larger than the string it contains -- this reduces the number of time-costly allocation calls. In addition, even when the CString is set to be zero-sized, it still contains two wchar_t characters.
Naturally, when you compare the size of a CString with the size of the corresponding C-style array, the array will be smaller. However, if you want to manipulate your string as extensively as CString allows, you will eventually define your own variables for string size, buffer size, and sometimes a refcounter and/or guard flags. Indeed, you need to store your string size to avoid calling strlen each time you need it. You need to store your buffer size separately if you allow your buffer to be larger than the string length, to avoid calling realloc each time you add to or subtract from the string. And so on -- you trade a small size increase for significant increases in speed, safety, and functionality.
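In other words, by the time you have done that bookkeeping by hand, you have effectively written something like the following skeleton (purely illustrative):

#include <cstddef>

// What a hand-rolled C-string wrapper tends to grow into:
struct ManualString
{
    char*       data;      // the character buffer
    std::size_t length;    // cached so strlen isn't called repeatedly
    std::size_t capacity;  // buffer may be larger than length to avoid frequent reallocs
    int         refcount;  // optional, for CString-style sharing
};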
So, the answer depends on what you are going to do with the string. Suppose you want a string to store the name of your class for logging -- there, a C-style string (const and static) will do fine. If you need a string to manipulate and use extensively with MFC- or ATL-related classes, use the CString family of types. If you need to manipulate strings in the "engine" parts of your application that are isolated from its interface and may be ported to other platforms, use std::string, or write your own string type to suit your particular needs (the latter can be really useful when you write the "glue" code that sits between the interface and the engine; otherwise std::string is preferable).
CString is from the MFC framework, which is specific to Windows. std::string is from the C++ standard library. Both are library classes for managing strings in memory. std::string gives you code portability across platforms.
Using a raw array is always good for memory, but you have to perform operations on strings, and that becomes difficult with a raw array: bounds checking, getting the string length, copying the array, resizing it because the string may grow, deleting it, and so on. For all these problems, a string utility class is a good wrapper. The string class keeps the actual string on the heap, and you have the overhead of the string object itself; however, in return it provides the string-memory management that you would otherwise have to write by hand.
Prefer std::string if you can, if not, use CString.
In almost all cases I encourage novice programmers to use std::string or CString(*). First, they will make significantly fewer errors. I have seen many buffer overruns, memory invalidations, and memory leaks caused by erroneous use of C arrays.
So which is more efficient memory-wise, CString / std::string or raw character arrays? Generally speaking, all that CString and std::string carry extra is an integer for the size. The question is: does it matter?
Which is more efficient in terms of performance? Well, it depends on what you are doing with it and how you are using your C-arrays. But passing CString or std::string around can be computationally more efficient than C-arrays. The problem with C-arrays is that you can't be sure who owns the memory and what kind it is (heap/stack/literal). Defensive programming results in more copies of arrays, you know, just to be sure that the memory you hold will be valid for as long as it is needed.
Why is std::string or CString more efficient than C-arrays if they are passed around by value? This is a bit more complicated, and for totally different reasons. For CString it is simple: it is implemented as a COW (copy-on-write) object, so when you have 5 objects that originate from one CString, they will not use more memory than one until you start to change one of them. std::string has stricter requirements and thus is not allowed to share memory with other std::string objects. But if you have a newer compiler, std::string implements move semantics, so returning a string from a function results only in a copy of the pointer, not a reallocation.
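A trivial sketch of that last point (make_label is an invented name); with any reasonably recent compiler the character buffer is not copied on return:

#include <string>

// Returning by value does not reallocate: the buffer is moved out
// (or the copy is elided entirely via NRVO).
std::string make_label(int id)
{
    std::string s = "item-";
    s += std::to_string(id);
    return s;                  // move or copy elision, no deep copy of the characters
}

int main()
{
    std::string label = make_label(42);   // no extra allocation beyond the one inside make_label
}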
There are very few cases where raw C arrays are good and practical idea.
*) If you are already programming against MFC, why not just use CString.

Why does std::fstream use char*?

I'm writing a small program that reads the bytes from a binary file in groups of 16 bytes (please don't ask why), modifies them, and then writes them to another file.
The fstream::read function reads into a char * buffer, which I was initially passing to a function that looks like this:
char* modify (char block[16], std::string key)
The modification was done on block which was then returned. On roaming the posts of SO, I realized that it might be a better idea to use std::vector<char>. My immediate next worry was how to convert a char * to a std::vector<char>. Once again, SO gave me an answer.
But now what I'm wondering is: if it's such a good idea to use std::vector<char> instead of char*, why do the fstream functions use char* at all?
Also, is it a good idea to convert the char* from fstream to std::vector<char> in the first place?
EDIT: I now realize that since fstream::read is used to write data into objects directly, char * is necessary. I must now modify my question. Firstly, why are there no overloaded functions for fstream::read? And secondly, in the program that I've written about, which is a better option?
To use it with a vector, do not pass a pointer to the vector. Instead, pass a pointer to the vector content:
vector<char> v(size);
stream.read(&v[0], size);
The fstream functions let you use char*s so you can point them at arbitrary pre-allocated buffers. A std::vector<char> can be sized to provide an appropriate buffer, but it will be on the heap and there are allocation costs involved with that. Sometimes, too, you may want to read or write data at a specific location in memory - even in shared memory - rather than accepting whatever heap memory the vector happens to have allocated. Further, you may want to use fstream without having included the vector header... it's nice to be able to avoid unnecessary includes, as it reduces compilation time.
As your buffers are always 16 bytes in size, it's probably best to allocate them as char [16] data members in an appropriate owning object (if any exists), or on the stack (i.e. some function's local variable).
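For instance, something along these lines (the file names and the modification step are placeholders):

#include <fstream>

int main()
{
    std::ifstream in("input.bin", std::ios::binary);
    std::ofstream out("output.bin", std::ios::binary);

    char block[16];                          // fixed-size block on the stack
    while (in.read(block, sizeof block))     // full 16-byte groups
    {
        // ... modify the 16 bytes in block here ...
        out.write(block, sizeof block);
    }
    if (in.gcount() > 0)                     // possible partial final group
    {
        // ... modify the first in.gcount() bytes in block here ...
        out.write(block, in.gcount());
    }
}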
vector<> is more useful when the alternative is heap allocation - whether because the size is unknown at compile time, or is particularly large, or because you want more flexible control of the memory's lifetime. It's also useful when you specifically want some of the other vector functionality, such as the ability to change the number of elements afterwards, to sort the bytes, etc. It seems very unlikely you'll want to do any of that, so a vector raises questions in the mind of the person reading your code about what you intend to do with it, for no good purpose. Still, the choice of char[16] vs. vector appears (based on your stated requirements) to be more a matter of taste than objective benefit.