What are some good scenarios to use a cstring over a string? - c++

I've made quite a few projects (small) now with C++ and was wondering, what would be some good scenarios in which it is better to use a cstring instead of a string?
Just to clarify, I'm not calling cstrings bad. I'm just genuinely interested if they are as important as regular strings in C++

std::string allocates dynamic memory at runtime for its character data (unless the std::string implementation employs "Short String Optimization" and the data is short enough to fit in the SSO buffer).
So, one scenario you may want to use C-style strings for is when you want to pass around string literals without allocating memory for them. Assigning a string literal to a std::string will allocate dynamic memory (if SSO is not used).
There can also be scenarios where you want to process character data without allocating new memory for extracted substrings. C-style strings can be good for that, too (though std::string_view in C++17 and later would generally be better for that).
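As a rough illustration of the substring point, here is a minimal sketch that assumes a C++17 compiler. The helper name first_word is made up for this example; it returns a view into the caller's characters instead of copying them into a new string.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper: returns the first space-delimited word of `text`
// as a view into the original characters (no copy, no allocation).
std::string_view first_word(std::string_view text) {
    const std::size_t end = text.find(' ');
    // If no space is found, `end` is npos and substr returns the whole view.
    return text.substr(0, end);
}
```

Calling first_word("hello world") on a string literal never touches the heap. The returned view is only valid while the underlying characters are alive, so it should not outlive the buffer it points into.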

cstrings are just character arrays with a null character to terminate them, while std::string objects are dynamically allocated. With cstrings you are able to have more control over memory; basically, memory management is the main advantage.

Related

how to take input a string of length 10^6 in c++?

We have to take a string S as input such that:
1 <= |S| <= 10^6 (length of string)
and then perform some operation on it. Let's leave the operation aside; I only want to know how to take input of a string as long as 10^6 characters.
Can we take like this char S[1000001];
or which would be other better way?
Kindly, help
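Forget about using a c-style string and use a std::string instead. A std::string can hold up to std::string::max_size() characters. The exact value is implementation-defined: on typical 32-bit implementations it is around 4294967294 characters, and on 64-bit implementations it is far larger.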
The benefit of using a std::string is that it automatically grows to accommodate the size you need. Since the memory is dynamically allocated, you don't have to worry about running out of stack space either. You could still run out of memory on the heap, though, as you would need about 4 GB of RAM to hold a string of that maximum size.
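A minimal sketch of reading such a string, assuming the input is a single whitespace-delimited token. The helper name read_token is hypothetical; in a real program you would pass std::cin.

```cpp
#include <istream>
#include <sstream>  // std::istringstream, handy for trying this without a terminal
#include <string>

// Hypothetical helper: reads one whitespace-delimited token from `in`.
// The string grows on the heap as needed, so no size has to be fixed up front.
std::string read_token(std::istream& in) {
    std::string s;
    in >> s;
    return s;
}
```

If the input may contain spaces, std::getline(in, s) would be the usual alternative.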
Can we take like this char S[1000001];
You can... but if S is an automatic variable, you'll spend most of your stack-space on that one array and you'll probably run out (depending on available stack space and how much the rest of your program needs).
Large arrays like these should be allocated dynamically. Unfortunately, it's not easy to say how big an array or object has to be before it should be allocated dynamically. It depends on a few things such as:
Amount of total stack space which depends on the platform and may be configurable at run- or linktime.
Amount of stack space needed by the rest of your program. This depends on how deeply nested your function calls are and how much memory your functions need.
I use a few kilobytes as a rule of thumb to decide if I'll need dynamic memory. Or a few dozen bytes inside a recursive function that is expected to go deep.
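One way to keep a buffer of this size off the stack, sketched under the assumption that a std::vector<char> is acceptable in place of a raw array. The function name make_text_buffer is invented for this example.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: a zero-filled buffer for a string of up to
// `max_len` characters plus the terminating '\0'. The storage lives on
// the heap and is released automatically when the vector is destroyed.
std::vector<char> make_text_buffer(std::size_t max_len) {
    return std::vector<char>(max_len + 1, '\0');
}
```

The result of make_text_buffer(1000000) can be handed to C-style functions through its data() member, while only the small vector object itself sits on the stack.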
or which would be other better way?
std::string allocates its buffer dynamically, provides ways to manipulate the string, makes sure that you don't forget the zero terminator and takes care of memory management (which would be an issue if you did dynamic allocation manually). I highly recommend you use it.
it depends if the string characters are unicode or not
you could use:
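char *txt = new char[1000001];  // 10^6 characters plus the terminating '\0'; release with delete[] txt
wchar_t *txt = new wchar_t[1000001];  // wide-character alternative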
also you can try using std::string or std::wstring like NathanOliver said
Use std::string.
Alternatively, if your data isn't being resized, you could opt to use a simple char array. Due to the size of the char array, I would suggest you allocate it with new[] and free it with delete[]. Otherwise, you may exhaust your stack size and cause a stack overflow.

CString or char array which one is better in terms of memory

I read somewhere that usage of CString is costly. Can you clarify this with an example? Also, between CString and a char array, which is better in terms of memory?
CString, in addition to the array of chars (or wide chars), contains the string size, the allocated buffer size, and a reference counter (serving additionally as a lock flag). The buffer containing the array of chars may be significantly larger than the string it contains, which reduces the number of time-costly allocation calls. In addition, when the CString is set to be zero-sized, it still contains two wchar characters.
Naturally, when you compare the size of CString with the size of corresponding C-style array, the array will be smaller. However, if you want to manipulate your string as extensively as CString allows, you will eventually define your own variables for string size, buffer size and sometimes refcounter and/or guard flags. Indeed, you need to store your string size to avoid calling strlen each time you need it. You need to store separately your buffer size if you allow your buffer to be larger than the string length, and avoid calling reallocs each time you add to or subtract from the string. And so on -- you trade some small size increase for significant increases in speed, safety and functionality.
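To make that bookkeeping concrete, here is a rough sketch of what a hand-rolled growable buffer tends to look like. The CharBuffer type and its members are hypothetical and exist only for illustration; this is not how CString or std::string is actually implemented.

```cpp
#include <cstddef>
#include <cstring>
#include <memory>
#include <utility>

// Hypothetical growable character buffer, for illustration only.
struct CharBuffer {
    std::unique_ptr<char[]> data;  // owned storage, '\0'-terminated once used
    std::size_t length = 0;        // characters stored, excluding the '\0'
    std::size_t capacity = 0;      // characters that fit without reallocating

    // Appends a C string. `text` must not point into this buffer.
    void append(const char* text) {
        const std::size_t extra = std::strlen(text);
        if (!data || length + extra > capacity) {
            // Grow geometrically so repeated appends rarely reallocate.
            std::size_t new_cap = capacity ? capacity * 2 : 16;
            while (new_cap < length + extra) new_cap *= 2;
            std::unique_ptr<char[]> bigger(new char[new_cap + 1]);
            if (data) std::memcpy(bigger.get(), data.get(), length);
            data = std::move(bigger);
            capacity = new_cap;
        }
        std::memcpy(data.get() + length, text, extra + 1);  // copies the '\0' too
        length += extra;
    }
};
```

Even this small sketch already carries a stored length and a stored capacity, which is the same per-string overhead the string classes pay.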
So, the answer depends on what you are going to do with the string. Suppose you want a string to store the name of your class for logging -- there a C-style string (const and static) will do fine. If you need a string to manipulate and use it extensively with MFC or ATL-related classes, use CString family types. If you need to manipulate string in the "engine" parts of your application that are isolated from its interface, and may be converted to other platforms, use std::string or write your own string type to suit your particular needs (this can be really useful when you write the "glue" code to place between the interface and the engine, otherwise std::string is preferable).
CString is from MFC framework specific to windows. std::string is from c++ standard. They are library classes for managing strings in memory. std::string will provide you code portability across platforms.
Using a raw array is always good for memory; however, one has to do operations on strings, and that becomes difficult with a raw array: consider out-of-bounds checks, getting the string length, copying the array, changing the size because the string may grow, deleting the array, etc. For all these problems, string utility classes are good wrappers. The string class will keep the actual string on the heap, and you have the overhead of the string class itself. However, it provides the functionality to manage the string memory, which you would otherwise have to write by hand.
Prefer std::string if you can, if not, use CString.
In almost all cases I encourage novice programmers to use std::string or CString(*). First, they will make significantly fewer errors. I have seen many buffer overruns, memory invalidations and memory leaks because of erroneous use of C arrays.
So which is more efficient, CString / std::string or raw character arrays? Memory-wise, generally speaking, all that CString and std::string have extra is one integer for the size. The question is, does it matter?
So which is more efficient in terms of performance? Well, it depends on what you are doing with it and how you are using your C-arrays. But passing CString or std::string around can be computationally more efficient than C-arrays. The problem with C-arrays is that you can't be sure of who owns the memory and what type (heap/stack/literal) it is. Defensive programming results in more copies of arrays, you know, just to be sure that the memory you hold will be valid for the entire duration of when it is needed.
Why can std::string or CString be more efficient than C-arrays if they are passed around by value? This is a bit more complicated, and the reasons differ for the two classes. For CString it is simple: it is implemented as a COW (copy-on-write) object. So when you have 5 objects that originate from one CString, they will not use more memory than one until you start to make a change to one of them. std::string has stricter requirements (since C++11) and thus is not allowed to share memory with other std::string objects. But if you have a newer compiler, std::string should implement move semantics, and thus returning a string from a function will only result in a copy of the pointer, not a reallocation.
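The move-semantics claim can be observed with a small experiment. This is a sketch under an assumption: the mainstream standard libraries hand over the heap buffer when a long string is moved, but the standard does not spell out the pointer identity. The function name move_reuses_buffer is made up for this example.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical check: does moving a string hand over its heap buffer?
// `length` should be large enough that the string is not stored in a
// small-string buffer inside the object itself.
bool move_reuses_buffer(std::size_t length) {
    std::string source(length, 'x');
    const char* before = source.data();
    std::string target = std::move(source);  // move constructor, no character copy expected
    return target.data() == before;
}
```

With a short string the characters may live inside the object itself, so the pointers would differ even though the move is still cheap.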
There are very few cases where raw C arrays are good and practical idea.
*) If you are already programming against MFC, why not just use CString.

Why should one use std::string over c-style strings in C++?

"One should always use std::string over c-style strings(char *)" is advice that comes up for almost every source code posted here. While the advice is no doubt good, the actual questions being addressed do not permit to elaborate on the why? aspect of the advice in detail. This question is to serve as a placeholder for the same.
A good answer should cover the following aspects(in detail):
1. Why should one use std::string over c-style strings in C++?
2. What are the disadvantages (if any) of the practice mentioned in #1?
3. What are the scenarios where the opposite of the advice mentioned in #1 is a good practice?
1. std::string manages its own memory, so you can copy, create, destroy them easily.
2. You can't use your own buffer as a std::string.
3. You need to pass a c string / buffer to something that expects to take ownership of the buffer - such as a 3rd party C library.
Well, if you just need an array of chars, std::string provides little advantage. But face it, how often is that the case? By wrapping a char array with additional functionality like std::string does, you gain both power and efficiency for some operations.
For example, determining the length of an array of characters requires "counting" the characters in the array. In contrast, an std::string provides an efficient operation for this particular task. (see https://stackoverflow.com/a/1467497/129622)
1. For power, efficiency and sanity
2. Larger memory footprint than "just" a char array
3. When you just need an array of chars
3) The advice always use string of course must be taken with a pinch of common sense. String literals are const char[], and if you pass a literal to a function that takes a const char* (for example std::ifstream::open()) there's absolutely no point wrapping it in std::string.
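A small sketch of that situation: a function with a const char* parameter accepts a literal directly, and an existing std::string can still be passed through c_str(). The function count_spaces is hypothetical and stands in for any API that takes a C string.

```cpp
#include <cstddef>
#include <string>

// Hypothetical function with a C-style parameter.
std::size_t count_spaces(const char* text) {
    std::size_t n = 0;
    for (; *text != '\0'; ++text) {
        if (*text == ' ') ++n;
    }
    return n;
}
```

Both count_spaces("a b c") and count_spaces(s.c_str()) for some std::string s work without constructing a temporary std::string.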
A char* is basically a pointer to a character. What C does is frequently makes this pointer point to the first character in an array.
An std::string is a class that is much like a vector. Internally, it handles the storage of an array of characters, and gives the user several member functions to manipulate said stored array as well as several overloaded operators.
Reasons to use a char* over an std::string:
C backwards-compatibility.
Performance (potentially).
char*s have lower-level access.
Reasons to use an std::string over a char*:
Much more intuitive to use.
Better searching, replacement, and manipulation functions.
Reduced risk of segmentation faults.
Example :
char* must be used in conjunction with either a char array, or with a dynamically allocated char array. After all, a pointer is worthless unless it actually points to something. This is mainly used in C programs:
char somebuffer[100] = "a string";
char* ptr = somebuffer; // ptr now points to somebuffer
cout << ptr; // prints "a string"
somebuffer[0] = 'b'; // change somebuffer
cout << ptr; // prints "b string"
notice that when you change 'somebuffer', 'ptr' also changes. This is because somebuffer is the actual string in this case. ptr just points/refers to it.
With std::string it's less weird:
std::string a = "a string";
std::string b = a;
cout << b; // prints "a string"
a[0] = 'b'; // change 'a'
cout << b; // prints "a string" (not "b string")
Here, you can see that changing 'a' does not affect 'b', because 'b' is the actual string.
But really, the major difference is that with char arrays, you are responsible for managing the memory, whereas std::string does it for you. In C++, there are very few reasons to use char arrays over strings. 99 times out of 100 you're better off with a string.
Until you fully understand memory management and pointers, just save yourself some headaches and use std::string.
Why should one use std::string over c-style strings in C++?
The main reason is it frees you from managing the lifetime of the string data. You can just treat strings as values and let the compiler/library worry about managing the memory.
Manually managing memory allocations and lifetimes is tedious and error prone.
What are the disadvantages (if any) of the practice mentioned in #1?
You give up fine-grained control over memory allocation and copying. That means you end up with a memory management strategy chosen by your toolchain vendor rather than chosen to match the needs of your program.
If you aren't careful you can end up with a lot of unneeded data copying (in a non-refcounted implementation) or reference count manipulation (in a refcounted implementation)
In a mixed-language project any function whose arguments use std::string or any data structure that contains std::string will not be able to be used directly from other languages.
What are the scenarios where the opposite of the advice mentioned in #1 is a good practice?
Different people will have different opinions on this but IMO
For function arguments passing strings in "const char *" is a good choice since it avoids unnecessary copying/refcounting and gives the caller flexibility about what they pass in.
For things used in interoperation with other languages you have little choice but to use c-style strings.
When you have a known length limit it may be faster to use fixed-size arrays.
When dealing with very long strings it may be better to use a construction that will definitely be refcounted rather than copied (such as a character array wrapped in a shared_ptr) or indeed to use a different type of data structure altogether.
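A sketch of the refcounted idea from the last point. It is a variation on what the answer describes: an immutable std::string behind a shared_ptr rather than a raw character array. The alias SharedText and the function make_shared_text are invented for this example.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical alias: immutable text shared by reference count.
using SharedText = std::shared_ptr<const std::string>;

// Moves `text` into shared storage; later copies of the returned
// pointer share the same characters instead of duplicating them.
SharedText make_shared_text(std::string text) {
    return std::make_shared<std::string>(std::move(text));
}
```

Copying a SharedText only bumps the reference count, so the cost of passing it around does not depend on the length of the text.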
In general you should always use std::string, since it is less bug-prone. Be aware that the memory overhead of std::string is significant. I recently performed some experiments on std::string overhead; in my measurements it was about 48 bytes! The article is here: http://jovislab.com/blog/?p=76.

Strings and character array

What is the difference between the string and character array?
How can each element of the string be accessed in C++?
string manages its own memory; this is not so with an array of char except as a local variable.
In both cases you can access individual elements using [] (but in the case of string this is actually operator[]).
string has a lot of built-in functions that you don't easily get in a C++-friendly way with C-Strings.
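For the element-access question, a minimal sketch showing the same loop over both kinds of string. The overloaded function count_char is made up for this example.

```cpp
#include <cstddef>
#include <string>

// Counts occurrences of `c` in a std::string, indexing with operator[].
std::size_t count_char(const std::string& s, char c) {
    std::size_t n = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        if (s[i] == c) ++n;
    }
    return n;
}

// The same loop over a C string, using the built-in [] on a pointer and
// stopping at the terminating '\0' because no length is stored.
std::size_t count_char(const char* s, char c) {
    std::size_t n = 0;
    for (std::size_t i = 0; s[i] != '\0'; ++i) {
        if (s[i] == c) ++n;
    }
    return n;
}
```

The only real difference in the loops is the stopping condition: the std::string version asks for size(), the C version has to look for the terminator.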
In C, they are the same, a string is a char array and you have a lot of standard methods to handle them like sprintf, strcat, strcpy, strdup, strchr, strstr...
In C++, you can also use the STL string class, which provides an object-oriented string that you can manipulate more easily. The advantage is that the code is easier to read and you don't need to allocate/deallocate memory for the strings by yourself.

Why do you prefer char* instead of string, in C++?

I'm a C programmer trying to write c++ code. I heard string in C++ was better than char* in terms of security, performance, etc, however sometimes it seems that char* is a better choice. Someone suggested that programmers should not use char* in C++ because we could do all things that char* could do with string, and it's more secure and faster.
Have you ever used char* in C++? What were the specific conditions?
It's safer to use std::string because you don't need to worry about allocating / deallocating memory for the string. The C++ std::string class is likely to use a char* array internally. However, the class will manage the allocation, reallocation, and deallocation of the internal array for you. This removes all the usual risks that come with using raw pointers, such as memory leaks, buffer overflows, etc.
Additionally, it's also incredibly convenient. You can copy strings, append to a string, etc., without having to manually provide buffer space or use functions like strcpy/strcat. With std::string it's as simple as using the = or + operators.
Basically, it's:
std::string s1 = "Hello ";
std::string s2 = s1 + "World";
versus...
const char* s1 = "Hello ";
char s2[1024]; // How much should I really even allocate here?
strcpy(s2, s1);
strcat(s2, "World");
Edit:
In response to your edit regarding the use of char* in C++: Many C++ programmers will claim you should never use char* unless you're working with some API/legacy function that requires it, in which case you can use the std::string::c_str() function to convert an std::string to const char*.
However, I would say there are some legitimate uses of C-arrays in C++. For example, if performance is absolutely critical, a small C-array on the stack may be a better solution than std::string. You may also be writing a program where you need absolute control over memory allocation/deallocation, in which case you would use char*. Also, as was pointed out in the comments section, std::string isn't guaranteed to provide you with a contiguous, writable buffer *, so you can't directly write from a file into an std::string if you need your program to be completely portable. However, in the event you need to do this, std::vector would still probably be preferable to using a raw C-array.
* Although in C++11 this has changed so that std::string does provide you with a contiguous buffer
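Assuming C++11 or later, where the string's storage is contiguous, reading straight into a std::string might look like the following sketch. The helper name read_block is hypothetical.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>  // std::istringstream, handy for trying this without a file
#include <string>

// Hypothetical helper: reads up to `max_bytes` raw bytes from `in`
// directly into the string's own storage (contiguous since C++11).
std::string read_block(std::istream& in, std::size_t max_bytes) {
    std::string buffer(max_bytes, '\0');
    if (max_bytes == 0) return buffer;
    in.read(&buffer[0], static_cast<std::streamsize>(max_bytes));
    // Keep only the bytes that were actually read.
    buffer.resize(static_cast<std::size_t>(in.gcount()));
    return buffer;
}
```

The same function works with a std::ifstream opened in binary mode, since it only depends on std::istream.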
Ok, the question changed a lot since I first answered.
Native char arrays are a nightmare of memory management and buffer overruns compared to std::string. I always prefer to use std::string.
That said, char array may be a better choice in some circumstances due to performance constraints (although std::string may actually be faster in some cases -- measure first!) or prohibition of dynamic memory usage in an embedded environment, etc.
In general, std::string is a cleaner, safer way to go because it removes the burden of memory management from the programmer. The main reason it can be faster than char *'s is that std::string stores the length of the string. So, you don't have to do the work of iterating through the entire character array looking for the terminating null character each time you want to do a copy, append, etc.
That being said, you will still find a lot of c++ programs that use a mix of std::string and char *, or have even written their own string classes from scratch. In older compilers, std::string was a memory hog and not necessarily as fast as it could be. This has gotten better over time, but some high-performance applications (e.g., games and servers) can still benefit from hand-tuned string manipulations and memory-management.
I would recommend starting out with std::string, or possibly creating a wrapper for it with more utility functions (e.g., starts_with(), split(), format(), etc.). If you find when benchmarking your code that string manipulation is a bottleneck, or uses too much memory, you can then decide if you want to accept the extra risks and testing that a custom string library demands.
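As a sketch of what such utility functions could look like when written as free functions on top of std::string: the names starts_with and split follow the ones mentioned above, but these particular signatures are only an assumption.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: true if `text` begins with `prefix`.
bool starts_with(const std::string& text, const std::string& prefix) {
    return text.compare(0, prefix.size(), prefix) == 0;
}

// Hypothetical helper: splits `text` at every `delim`, keeping empty fields.
std::vector<std::string> split(const std::string& text, char delim) {
    std::vector<std::string> parts;
    std::size_t start = 0;
    while (true) {
        const std::size_t pos = text.find(delim, start);
        if (pos == std::string::npos) {
            parts.push_back(text.substr(start));  // last field
            return parts;
        }
        parts.push_back(text.substr(start, pos - start));
        start = pos + 1;
    }
}
```

Keeping these as free functions rather than a derived class avoids inheriting from std::string, which was not designed as a base class.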
TIP: One way of getting around the memory issues and still use std::string is to use an embedded database such as SQLite. This is particularly useful when generating and manipulating extremely large lists of strings, and performance is better than what you might expect.
C char * strings cannot contain '\0' characters. C++ string can handle null characters without a problem. If users enter strings containing \0 and you use C strings, your code may fail. There are also security issues associated with this.
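The difference is easy to reproduce. In this sketch both function names are made up for illustration: a std::string keeps all eleven characters, while a C function reading the same data stops at the embedded null.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Builds an 11-character string with an embedded '\0' in the middle.
// The explicit length is needed; std::string("hello\0world") would stop at the '\0'.
std::string make_with_embedded_null() {
    return std::string("hello\0world", 11);
}

// Length as a C function sees it: everything after the first '\0' is invisible.
std::size_t c_length(const std::string& s) {
    return std::strlen(s.c_str());
}
```

For the string built above, size() reports 11 while c_length reports 5, which is how data can silently get truncated when it crosses into C-string code.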
Implementations of std::string hide the memory usage from you. If you're writing performance-critical code, or you actually have to worry about memory fragmentation, then using char* can save you a lot of headaches.
For anything else though, the fact that std::string hides all of this from you makes it so much more usable.
String may actually be better in terms of performance. And innumerable other reasons - security, memory management, convenient string functions, make std::string an infinitely better choice.
Edit: To see why string might be more efficient, read Herb Sutter's books - he discusses a way to internally implement string to use Lazy Initialization combined with Referencing.
Use std::string for its incredible convenience - automatic memory handling and methods / operators. With some string manipulations, most implementations will have optimizations in place (such as delayed evaluation of several subsequent manipulations - saves memory copying).
If you need to rely on the specific char layout in memory for other optimizations, try std::vector<char> instead. If you have a non-empty vector vec, you can get a char* pointer using &vec[0] (the vector has to be nonempty).
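A sketch of the std::vector<char> approach with a C-style API. It uses std::snprintf as the function that wants a writable char*; the helper name format_pair is hypothetical.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper: formats "<name>=<value>" by letting snprintf
// write into a vector<char> that serves as the raw buffer.
std::string format_pair(const char* name, int value) {
    std::vector<char> buf(64);  // non-empty, so &buf[0] is valid
    int written = std::snprintf(&buf[0], buf.size(), "%s=%d", name, value);
    if (written < 0) return std::string();  // encoding error
    if (static_cast<std::size_t>(written) >= buf.size()) {
        // Output was truncated: grow to the reported size and format again.
        buf.resize(static_cast<std::size_t>(written) + 1);
        written = std::snprintf(&buf[0], buf.size(), "%s=%d", name, value);
        if (written < 0) return std::string();
    }
    return std::string(&buf[0], static_cast<std::size_t>(written));
}
```

The vector owns the memory, so nothing has to be freed by hand even on the early-return paths.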
Short answer, I don't. The exception is when I'm using third party libraries that require them. In those cases I try to stick to std::string::c_str().
In all my professional career I've had an opportunity to use std::string at only two projects. All others had their own string classes :)
Having said that, for new code I generally use std::string when I can, except for module boundaries (functions exported by dlls/shared libraries) where I tend to expose C interface and stay away from C++ types and issues with binary incompatibilities between compilers and std library implementations.
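A minimal sketch of that kind of boundary, assuming a caller-supplied output buffer. The function make_greeting and its return convention are invented for this example; the point is only that std::string stays inside the implementation while the signature uses plain C types.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical exported function: C linkage and only C types in the signature.
// Returns the number of characters written (excluding '\0'), or -1 on error.
extern "C" int make_greeting(const char* name, char* out, std::size_t out_size) {
    if (name == nullptr || out == nullptr || out_size == 0) return -1;
    // std::string is used freely on the inside of the boundary.
    const std::string greeting = std::string("Hello, ") + name + "!";
    if (greeting.size() + 1 > out_size) return -1;  // does not fit
    std::memcpy(out, greeting.c_str(), greeting.size() + 1);
    return static_cast<int>(greeting.size());
}
```

Because the caller owns the buffer, no memory allocated by one runtime has to be freed by another, which sidesteps the binary-compatibility issues mentioned above.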
Compare and contrast the following C and C++ examples:
strlen(infinitelengthstring)
versus
string.length()
std::string is almost always preferred. Even for speed: many implementations keep short strings in a small buffer inside the string object itself (the small string optimization) and only allocate dynamically for larger strings.
However, char* pointers are still needed in many situations for writing strings/data into a raw buffer (e.g. network I/O), which can't be done with std::string.
The only time I've recently used a C-style char string in a C++ program was on a project that needed to make use of two C libraries that (of course) used C strings exclusively. Converting back and forth between the two string types made the code really convoluted.
I also had to do some manipulation on the strings that's actually kind of awkward to do with std::string, but I wouldn't have considered that a good reason to use C strings in the absence of the above constraint.