If I use
const char * str = "Hello";
there is no memory allocation/deallocation needed at run time.
If I use
const std::string str = "Hello";
will there be an allocation via new/malloc inside the string class or not? I could look for it in the assembly, but I am not good at reading it.
If the answer is "yes, there will be a malloc/new", why? Why can't std::string just pass through to an inner const char pointer and do the actual memory allocation only if I need to edit the string?
will there be an allocation via new/malloc inside the string class or not?
It depends. The string object will have to provide some memory to store the data, since that's its job. Some implementations use a "small string optimisation", where the object contains a small buffer, and only allocates from the heap if the string is too large for that.
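If you would rather not read assembly, one rough way to see which case you are in is to check whether the character data lives inside the string object itself. This is only a sketch: stored_inline and check_large_string_is_external are made-up helper names, and what you get for short strings depends entirely on your standard library.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Returns true if the character data of `s` lives inside the std::string
// object itself (small-buffer storage) rather than in separate storage.
bool stored_inline(const std::string& s) {
    const char* object_begin = reinterpret_cast<const char*>(&s);
    const char* object_end = object_begin + sizeof(std::string);
    // std::less gives a well-defined total order even for unrelated pointers.
    std::less<const char*> before;
    return !before(s.data(), object_begin) && before(s.data(), object_end);
}

// Holds on any implementation: a string several times larger than the
// object itself cannot possibly be stored inside the object.
void check_large_string_is_external() {
    const std::string big(4 * sizeof(std::string), 'x');
    assert(!stored_inline(big));
}
```

Calling stored_inline on a const std::string initialized with "Hello" will likely return true on libraries that implement the small string optimisation, but nothing in the standard guarantees it.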
Why can't std::string just pass through to an inner const char pointer and do the actual memory allocation only if I need to edit the string?
What you describe isn't necessarily an optimisation (since it needs an extra runtime check whenever you modify the string), and in any case isn't allowed by the iterator invalidation rules.
There was a proposal for a string_view, which lets you access an existing character sequence through an interface like that of const string, without any memory management. It has since been standardized as std::string_view in C++17; it doesn't allow you to modify the string.
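With a C++17 compiler, the idea could look roughly like this. The names greeting and count_char are invented for the illustration; the view only stores a pointer and a length, so nothing is copied or allocated.

```cpp
#include <cstddef>
#include <string_view>

// A view over the literal: no copy, no allocation, usable at compile time.
constexpr std::string_view greeting = "Hello";

// Counts occurrences of `c` in `text` through the read-only view interface.
std::size_t count_char(std::string_view text, char c) {
    std::size_t count = 0;
    for (char ch : text) {
        if (ch == c) {
            ++count;
        }
    }
    return count;
}
```

The same function accepts a string literal, a const char*, or a std::string without forcing a conversion to std::string first.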
A naive implementation of std::string requires a heap allocation. However, compilers are allowed to optimize statically initialized std::string objects by replacing them with objects of alternative implementations, provided the initialized strings are not modified at run time.
You may use const std::string when you instantiate immutable strings to ensure better optimization.
The C++ standard doesn't actually say you can't just store a pointer to an external string (and a length). However, that means EVERY operation that may modify the string (e.g. char& std::string::operator[](size_t index)) would have to check that the string is actually writeable. That check is wasted effort in most code, since a large share of string usage does not merely store a constant string but does indeed modify the string [or uses a string that isn't a constant input anyway].
So, some problems are:
std::string s = "Hello";
char &c = s[1];
c = 'a'; // Should change the string to "Hallo".
what if:
char buffer[1000];
std::cin.getline(buffer, sizeof buffer); // Reads "Hello"
std::string s = buffer;
std::cin.getline(buffer, sizeof buffer); // Reads "World"
What is the value in s now?
There are so many such cases where, if you were to just point at the original string instead of copying it, it would cause more problems, with little or no benefit.
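The buffer case can be tried without any input by standing in for the two reads with two copies. A small sketch, where copy_then_overwrite is a hypothetical helper name:

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Mimics the two reads above: fill the buffer, build a string from it,
// then overwrite the buffer and return the string.
std::string copy_then_overwrite(const char* first, const char* second) {
    char buffer[1000];
    assert(std::strlen(first) < sizeof buffer);
    assert(std::strlen(second) < sizeof buffer);
    std::strcpy(buffer, first);
    std::string s = buffer;       // s copies the characters out of buffer
    std::strcpy(buffer, second);  // overwriting buffer does not touch s
    return s;
}
```

Because std::string copies, the result still holds the first text. A string that merely pointed at buffer would now see the second text, and would dangle once buffer went out of scope.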
In the class below is there any benefit or reason why speak() returns a const char* instead of a std::string?
class Animal
{
protected:
std::string m_name;
Animal(std::string name)
: m_name(name)
{
}
public:
std::string getName() { return m_name; }
const char* speak() { return "???"; }
};
std::string comes with a lot of features that may be unwanted or unused. If you do not need all those features, you should think about the cost of using it. Passing around a std::string needs at minimum one copy from the literal into the string's internal storage. You may get additional copies if you pass the string to other functions and back. If small string optimization is in use, moving the string is not a simple pointer swap, because the characters stored inside the object must be copied, so the cost is higher in that case.
If you do not want the cost but still want the features for a constant string, take a look at std::string_view. The implementation typically contains only a pointer to the underlying data and a value for the size. So it comes at less cost and is very feature rich.
And indeed there is nothing wrong with passing const char* if it fits your needs.
Always using std::string for constant strings is a very common pattern, but not a good one.
What a string literal such as "???" does is tell the compiler to include a piece of global memory containing that particular sequence of characters (plus the character '\0' at the end). The string literal expression has type const char[4] (an array, counting the terminator), which converts implicitly to a const char* pointing to that piece of global memory.
What that means is that in this very particular circumstance, it's safe to pass around the pointer. Note that the following function
std::string speak() { return "???"; }
has the same function body but a rather different code path. The string literal expression (the global memory, the pointer) is still there, but it is used as an argument to the constructor for std::string; it is implicitly converted to std::string.
That constructor for std::string dynamically allocates some memory (well not in this case with the short string optimization, probably) and copies from the pointer you give it.
So in this very particular case, where you have a string literal and nothing else, you can return a const char*. It will probably get implicitly converted to std::string when passed to another function which expects it, so const char* isn't all that much of an optimization.
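To make that implicit conversion visible, here is a small sketch. The free function speak mirrors the member above; length_of and spoken_length are invented names for the example.

```cpp
#include <cstddef>
#include <string>

// Returns a pointer to the literal's static storage; nothing is allocated.
const char* speak() { return "???"; }

// A typical consumer that expects a std::string.
std::size_t length_of(const std::string& text) { return text.size(); }

// The const char* is converted to a temporary std::string right here,
// so the copy still happens, just at the call site instead of in speak().
std::size_t spoken_length() { return length_of(speak()); }
```

Whether the temporary actually touches the heap depends on the library; with a three-character string and the short string optimization it probably does not.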
In addition to the other answers, returning a const char* may be needed if you need to call your functions from a different library that is compiled with a different C++ compiler. For example, your Animal class may be compiled with Visual Studio 2013, but the library that uses the Animal class may be compiled with Visual Studio 2015.
For more info:
How do I safely pass objects, especially STL objects, to and from a DLL?
Considering we have a struct with char * member, if we want to request the content of this member, we normally do
char const * get_member() { return object.member; }
By this, we only return a pointer, without any allocation.
If now we want to return a string; is it possible to let the string just use that pointer, instead of copying the content and construct a new string object?
string const & get_member() { return object.member; }
Will the code above do a memory allocation? What kind of extra work will this method do compared to the char const * one?
No, it is not possible. std::string always manages its own storage, copying the characters into it, and cannot take ownership of a pre-existing buffer.
You can either return a copy of the pointer, or you can use a std::string member in the first place and return a reference to it. Alternatively, return a std::string_view, which can be used with either a char* or a std::string member. std::string_view is only available since C++17, but some standard libraries offer it as an extension for earlier compilers, and non-standard implementations exist as well.
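A sketch of the string_view option, assuming C++17. Here c_object is a made-up stand-in for the C library's struct, and the null check is my own assumption about what the library might hand back.

```cpp
#include <string_view>

// Hypothetical stand-in for the struct that comes from the C library.
struct c_object {
    char* member;
};

// Returns a non-owning view of the C string; no allocation, no copy.
// Building the view does scan for the terminating '\0' to find the length.
std::string_view get_member(const c_object& object) {
    // Constructing a string_view from a null pointer is undefined,
    // so map null to an empty view.
    return object.member ? std::string_view(object.member)
                         : std::string_view();
}
```

The view is only valid while the C object's buffer is alive and unchanged, which is the same lifetime rule the char const * version already has.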
The struct is from some C code based library; I just want to wrap it with C++ and, at the same time, not kill any performance.
Then it seems that returning a std::string would not be an appropriate design.
Say I have a variable
std::string str; // initialized with some value
And a struct defined as:
struct test
{
public:
const char* name;
};
I know this can be done :
test t1;
t1.name = str.c_str();
But this will store the address of the variable str in t1.name
Instead I want the values of str to put in a char array member of the structure which should be of exact same size as variable str.
Is there a way that can be achieved or a better design
Thanks in advance!
But this will store the address of the variable str in t1.name
Not exactly. str.c_str() does not return the address of variable str. It returns the address of the character array owned by str.
Instead I want the values of str to put in a char array member of the structure
To do that, the structure must have a char array member. Your structure does not; it has a pointer member.
char array member of the structure which should be of exact same size as variable str.
This is not possible. The size of the string is dynamic i.e. it may change at run time. The size of a member array must be known at compile time.
You can instead allocate an array dynamically. As the name implies, the size of dynamic allocation may be determined at run time. However, dynamic allocations must be manually deallocated, or else your program will leak memory.
or a better design
A popular design pattern for dynamic allocation is RAII. The standard library already has a RAII container for character strings: std::string. So, to copy a string into a member of a struct, a good design is to have a string as the member:
struct test {
std::string name;
};
test t1;
t1.name = str;
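A quick way to convince yourself that this is a real copy and not a stored address. The names test_entry and make_entry are hypothetical, used here in place of the struct above:

```cpp
#include <string>

// Same shape as the struct above, renamed for this sketch.
struct test_entry {
    std::string name;
};

// Copies the characters of `str` into the member; the member sizes itself.
test_entry make_entry(const std::string& str) {
    test_entry t1;
    t1.name = str;
    return t1;
}
```

Changing or destroying the original string afterwards leaves the member untouched, and the member is released automatically when the struct goes away.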
There is no reason to use const char *: since it's more error-prone and harder to implement, you should use std::string instead.
std::string also allows you to get const char * to string using c_str() method.
But if you have to implement name as C-style string, here is what you have to do:
Allocate enough space on heap using new.
Cast to non-const
Copy strings using strcpy
Free memory in destructor
Constructor:
test(const std::string& str) : name(new char[str.length() + 1])
{
    // name is declared const char*, so cast away const to fill the new buffer
    strcpy((char*)name, str.c_str());
}
Also, as @Pixelchemist correctly noted, the rules of zero/three/five are important here. If your class contains resources that are not copied or destroyed correctly by themselves, as raw pointers are not (smart pointers would be a different story), you have to implement these as well:
copy constructor
copy assignment operator
destructor
move constructor
move assignment operator
For more information, read this excellent answer about the rule of three.
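For illustration, the five members for a class that owns a C-style string might be sketched like this. The class name owned_name and the choice to store a non-const char* (which avoids the cast) are mine, not part of the question's struct.

```cpp
#include <cstring>
#include <string>
#include <utility>

class owned_name {
public:
    explicit owned_name(const std::string& str)
        : name_(new char[str.length() + 1]) {
        std::strcpy(name_, str.c_str());
    }

    // Destructor: release the buffer (delete[] on nullptr is a no-op).
    ~owned_name() { delete[] name_; }

    // Copy constructor: deep copy, so the two objects never share a buffer.
    owned_name(const owned_name& other)
        : name_(new char[std::strlen(other.c_str()) + 1]) {
        std::strcpy(name_, other.c_str());
    }

    // Copy assignment: copy first, then swap, so a failed allocation
    // leaves *this unchanged.
    owned_name& operator=(const owned_name& other) {
        if (this != &other) {
            owned_name tmp(other);
            std::swap(name_, tmp.name_);
        }
        return *this;
    }

    // Move constructor: steal the buffer and leave the source empty.
    owned_name(owned_name&& other) noexcept : name_(other.name_) {
        other.name_ = nullptr;
    }

    // Move assignment: free our buffer, then steal the source's.
    owned_name& operator=(owned_name&& other) noexcept {
        if (this != &other) {
            delete[] name_;
            name_ = other.name_;
            other.name_ = nullptr;
        }
        return *this;
    }

    // A moved-from object reads as an empty string.
    const char* c_str() const { return name_ ? name_ : ""; }

private:
    char* name_;
};
```

All of this is what a std::string member gives you for free, which is the rule of zero.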
I am creating a struct called student. In order to store the name, is there anything wrong with just declaring a char pointer in the struct instead of a char array with a predefined size? I can then assign a string literal to the char pointer in the main code.
struct student
{
int ID;
char* name;
};
It really depends on your use case. As suggested above you should use std::string in C++. But if you are using C-style strings, then it depends on your usage.
Using char[] of defined size you can avoid errors due to null pointers and other pointer related errors like memory leaks, dangling pointers etc., but you might not be making an optimal use of memory. You may for example define
#define MAX_SIZE 100
struct student
{
int ID;
char name[MAX_SIZE];
};
And then
#define STUDENT_COUNT 50
struct student many_students[STUDENT_COUNT];
But the lengths of student names will differ, and in many cases will be much less than MAX_SIZE. As such, much memory will be wasted here.
Or in some cases a name might not fit in MAX_SIZE characters (including the terminating '\0'). You may have to truncate the names here to avoid memory corruption.
In the other case, where we use char*, memory is not wasted as we allocate only the required amount, but we must take care of memory allocation and freeing.
struct student
{
int ID;
char *name;
};
Then while storing name we need to do something like this:
struct student many_student[STUDENT_COUNT];
int i;
for( i=0; i<STUDENT_COUNT; i++) {
// some code to get student name
many_student[i].name = (char*)malloc((name_length + 1) * sizeof(char));
// Now we can store name
}
// Later when name is no longer required free it
free(many_student[some_valid_index_to_free].name);
// also set it to NULL, to avoid dangling pointers
many_student[some_valid_index_to_free].name = NULL;
Also, if you allocate memory for name again, you should free the previously allocated memory first to avoid memory leaks. Another thing to consider is NULL checks for pointers before use, i.e., you should always check:
if(many_student[valid_index].name!=NULL) {
// do stuff
}
You can create macros to do this, but these are the basic overheads that come with pointers.
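Instead of macros, the bookkeeping can also be wrapped in two small helpers. This is a sketch only: set_name and clear_name are made-up names, it is written as C++ in C style, and it assumes name is either NULL or a pointer obtained from malloc.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

struct student {
    int ID;
    char* name;  // owning pointer: NULL or memory obtained from malloc
};

// Replaces the stored name with a copy of new_name.
// Returns false (leaving the old name in place) if allocation fails.
bool set_name(student& s, const char* new_name) {
    assert(new_name != nullptr);
    char* copy = static_cast<char*>(std::malloc(std::strlen(new_name) + 1));
    if (copy == nullptr) {
        return false;
    }
    std::strcpy(copy, new_name);
    std::free(s.name);  // frees the previous name; free(NULL) is a no-op
    s.name = copy;
    return true;
}

// Releases the name and resets the pointer so it cannot dangle.
void clear_name(student& s) {
    std::free(s.name);
    s.name = nullptr;
}
```

With these, re-assigning a name cannot leak the old one, and clearing twice is harmless.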
Another advantage of using pointers is that if there are many similar names then you can point multiple pointers to same name and save memory, but in array you will be having separate memory for all, e.g,
// IF we have a predefined global name array
char *GLOBAL_NAMES[] = {"NAME_1", "NAME_2", "NAME_3", "NAME_4", ... , "NAME_N"};
// using pointers, just need to assign name to correct pointer in array
many_student[valid_index_1].name = GLOBAL_NAMES[INDEX_NAME_1];
many_student[valid_index_2].name = GLOBAL_NAMES[INDEX_NAME_1];
// In case of array we would have had to copy.
This might not be your case, but the point is that pointers may help avoid extra memory usage.
Hope it will help you :)
Don't use either, use std::string. I (and many others) guarantee that compared to either char* or char[]:
it will be easier to use and
it will be less prone to bugs.
The difference is the same as the difference between static and dynamic memory allocation. With the former (static) you have to specify a size large enough to store the name, whereas with the latter you have to remember to free it when it is no longer needed.
Still, it's always better to use std::string.
Unless there is a strong reason to not do so, I'd suggest you to use a convenient string class like std::string, instead of a raw char* pointer.
Using std::string will simplify your code a lot, e.g. the structure will be automatically copyable, the strings will be automatically released, etc.
A reason why you could not use std::string is because you are designing an interface boundary, think of e.g. Win32 APIs which are mainly C-interface-based (implementation can be in C++), so you can't use C++ at the boundary and instead must use pure C.
But if that's not the case, do yourself a favor and use std::string.
Note also that in case you must use a raw char* pointer, you have several design questions to clarify, e.g.:
Is this an owning pointer, or an observing pointer?
If it's an owning pointer, in what way is it allocated, and in what way is it released? (e.g. malloc()/free(), new[]/delete[], some other allocator like COM CoTaskMemAlloc(), SysAllocString(), etc.)
If it's an observing pointer, you must make sure that the observed string's lifetime exceeds that of the observing pointer, to avoid dangling references.
All these questions are just non-existent if you use a convenient string class (like e.g. std::string).
Note also that, as some Win32 data structures do, you can have a maximum-sized string buffer inside your structure, e.g.
struct Student
{
int ID;
char Name[60];
};
In this case you could use C functions like strcpy(), or safer variants, to deep-copy a source string into the Name buffer. In that case you have good locality since the name string is inside the structure, and a simplified memory management with respect to the raw char* pointer case, but at the cost of having a pre-allocated memory buffer.
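As one possible "safer variant", here is a bounded copy that truncates instead of overflowing. copy_name is a hypothetical helper; it is not a standard or Win32 function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

struct Student {
    int ID;
    char Name[60];
};

// Deep-copies source into the fixed buffer, truncating if necessary.
// Returns true if the whole name fit, false if it had to be cut.
bool copy_name(Student& s, const char* source) {
    assert(source != nullptr);
    const std::size_t capacity = sizeof s.Name;  // includes room for '\0'
    const std::size_t length = std::strlen(source);
    const std::size_t count = length < capacity ? length : capacity - 1;
    std::memcpy(s.Name, source, count);
    s.Name[count] = '\0';  // always leave the buffer terminated
    return count == length;
}
```

The return value lets the caller decide whether a truncated name is acceptable or an error.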
This may or may not be a better option for you, depending on your particular programming context. Anyway, keep in mind that this is a more C-like approach; a better C++ approach would be to just use a string class like std::string.
TL;DR - use std::string, as we're talking in c++.
EDIT: Previously, as per the C tag (currently removed)
As per your requirement, assigning a string literal needs a pointer, you cannot do that with an array, anyway.#
If you're using that pointer to store the base address of a string literal, then it is ok. Otherwise, you need to
allocate memory before using that pointer
deallocate memory once you're done with it.
#) The base address of an array allocated at compile time cannot be changed, thus assignment won't work.
Use the std::string library. It is easier to work with, and it has far more functionality than the built-in counterparts.
Is there any way to know if an object is a const object or regular object, for instance consider the following class
class String
{
String(const char* str);
};
If the user creates a const object from String, then there is no reason to copy the passed native string, because he will not manipulate it; the only things he will do are get the string size, search the string, and call other functions that will not change the string.
There is a very good reason for copying - you can't know that the lifetime of the const char * is the same as that of the String object. And no, there is no way of knowing that you are constructing a const object.
Unfortunately, C++ does not provide a way to do what you are attempting. Simply passing a const char * does not guarantee the lifetime of the memory being pointed to. Consider:
char * a = new char[10];
char const *b = a;
String c (b);
delete[] a;
// c is now broken
There is no way for you to know. You could write a class that tightly interacts with String and that creates a constant string pointing to an external buffer (by making the corresponding constructor private and making the interacting class a nested class or a friend of String).
If all you worry about is doing dynamic memory management on a potentially small constant string, you can implement the Small String Optimization (also Small Object/Buffer Optimization). It works by having an embedded buffer in your string class, and copying each string up to some predefined size into that buffer, and each string that's larger to a dynamically allocated storage (the same technique is used by boost::function for storing small sized function objects).
class String {
union {
char *dynamicptr;
char buffer[16];
};
bool isDynamic;
};
There are clever techniques for storing even the length of the embedded string into the buffer itself (storing its length as buffer[15] and similar trickeries).
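Filling in the skeleton above with just enough to run might look like the following. It is a sketch under simplifying assumptions: the class is renamed SmallString, copying is disabled to keep it short, usesHeap is an invented accessor, and the length-in-buffer trick is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

class SmallString {
public:
    explicit SmallString(const char* str) {
        assert(str != nullptr);
        const std::size_t length = std::strlen(str);
        // Go to the heap only if the text plus '\0' does not fit inline.
        isDynamic = length >= sizeof buffer;
        if (isDynamic) {
            dynamicptr = new char[length + 1];
            std::memcpy(dynamicptr, str, length + 1);
        } else {
            for (std::size_t i = 0; i <= length; ++i) {
                buffer[i] = str[i];  // copies the terminator too
            }
        }
    }

    ~SmallString() {
        if (isDynamic) {
            delete[] dynamicptr;
        }
    }

    // Copying is left out of this sketch.
    SmallString(const SmallString&) = delete;
    SmallString& operator=(const SmallString&) = delete;

    const char* c_str() const { return isDynamic ? dynamicptr : buffer; }
    bool usesHeap() const { return isDynamic; }

private:
    union {
        char* dynamicptr;
        char buffer[16];
    };
    bool isDynamic;
};
```

With a 16-byte buffer, strings of up to 15 characters stay inside the object; anything longer is allocated dynamically.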
You could use const_string to do what you're looking for. However, even with const string you have to "tell" it that the string doesn't need to be copied.
const char* foo = "c-string";
boost::const_string bar(foo); // will copy foo
boost::const_string baz(boost::ref(foo)); // assumes foo will always be a valid pointer.
If the user creates a const object from String, then there is no reason to copy the passed native string, because he will not manipulate it; the only things he will do are get the string size, search the string, and call other functions that will not change the string.
Oh yes there is. Just because it is passed as const doesn't mean that it actually is const outside of the constructor call, and it especially doesn't mean it won't be destroyed while the string object still exists. The keyword const on a function argument only means that the function won't modify it through that pointer (trying to implement a function that modifies a const argument will result in a compiler error), but there's no way for the function to know what happens outside.
What you're looking for is basically a COW (copy on write) string. Such things are entirely possible, but getting them to work well is somewhat non-trivial. In a multithreaded environment, getting good performance can go beyond non-trivial into the decidedly difficult range.
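To give a rough idea of the shape, here is a deliberately simple, single-threaded sketch built on std::shared_ptr. The class CowString and its members are invented for the example, and it sidesteps the hard parts (handing out references or iterators, and thread safety).

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

class CowString {
public:
    explicit CowString(std::string text)
        : data_(std::make_shared<std::string>(std::move(text))) {}

    // Reading never copies; copies of a CowString share one buffer.
    const std::string& read() const { return *data_; }

    // Writing detaches first if the buffer is shared with another object.
    // Not thread-safe: the use_count() check assumes a single thread.
    void set_char(std::size_t index, char c) {
        if (data_.use_count() > 1) {
            data_ = std::make_shared<std::string>(*data_);
        }
        data_->at(index) = c;  // at() throws std::out_of_range on a bad index
    }

    // True while both objects still refer to the same buffer.
    bool shares_with(const CowString& other) const {
        return data_ == other.data_;
    }

private:
    std::shared_ptr<std::string> data_;
};
```

Every mutating operation pays for the sharing check, which is the runtime cost of deferring the copy.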