In the class below is there any benefit or reason why speak() returns a const char* instead of a std::string?
class Animal
{
protected:
std::string m_name;
Animal(std::string name)
: m_name(name)
{
}
public:
std::string getName() { return m_name; }
const char* speak() { return "???"; }
};
std::string comes with a lot of features that you may neither want nor use. If you do not need all those features, you should think about what they cost. Passing around a std::string requires at least one initial copy from the literal into the string's internal storage. You may get additional copies if you pass the string to other functions and back. If the small string optimization applies, a move cannot simply hand over a pointer and has to copy the characters out of the internal buffer, so moving is not free in that case either.
If you do not want that cost but still want the full feature set for a constant string, take a look at std::string_view. An implementation typically holds only a pointer to the underlying data and a size, so it costs little and is still feature-rich.
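As a minimal sketch of that suggestion (assuming a C++17 compiler; the derived class Cat is made up for illustration and is not part of the question), speak() can return a std::string_view over the literal without copying anything:

```cpp
#include <string>
#include <string_view>
#include <utility>

class Animal
{
protected:
    std::string m_name;

    explicit Animal(std::string name)
        : m_name(std::move(name))
    {
    }

public:
    const std::string& getName() const { return m_name; }

    // The view only stores a pointer and a length; the literal itself
    // has static storage duration, so the view never dangles.
    std::string_view speak() const { return "???"; }
};

// Hypothetical derived class, only here so that an Animal can be created.
class Cat : public Animal
{
public:
    explicit Cat(std::string name)
        : Animal(std::move(name))
    {
    }

    std::string_view speak() const { return "Meow"; }
};
```

A view is only valid as long as the characters it refers to are alive. That holds for string literals, but not for a view into a temporary std::string.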
And indeed there is nothing wrong with passing a const char* if it fits your needs.
Always using std::string for constant strings is a very common pattern, but not a good one.
What a string literal such as "???" does is tell the compiler to include a piece of global memory containing that particular sequence of characters (plus the character '\0' at the end). The string literal expression has type const char[4] and converts implicitly to a const char* pointing to that piece of global memory.
What that means is that in this very particular circumstance, it's safe to pass around the pointer. Note that the following function
std::string speak() { return "???"; }
has the same function body but a rather different code path. The string literal expression (the global memory, the pointer) is still there, but it is used as an argument to the constructor for std::string; it is implicitly converted to std::string.
That constructor for std::string dynamically allocates some memory (well not in this case with the short string optimization, probably) and copies from the pointer you give it.
So in this very particular case, where you have a string literal and nothing else, you can return a const char*. It will probably get implicitly converted to std::string when passed to another function which expects it, so const char* isn't all that much of an optimization.
In addition to the other answers, returning a const char* may be needed if your functions are called from a different library that was compiled with a different C++ compiler. For example, your Animal class may be compiled with Visual Studio 2013, while the library that uses the Animal class may be compiled with Visual Studio 2015.
For more info:
How do I safely pass objects, especially STL objects, to and from a DLL?
Considering we have a struct with char * member, if we want to request the content of this member, we normally do
char const * get_member() { return object.member; }
By this, we only return a pointer, without any allocation.
If now we want to return a string; is it possible to let the string just use that pointer, instead of copying the content and construct a new string object?
string const & get_member() { return object.member; }
Will the code above do a memory allocation? What kind of extra work does this method do compared to the char const * one?
No, it is not possible. std::string always manages its own storage, copying the characters into it, and cannot take ownership of a pre-existing buffer.
You can either return a copy of the pointer, or use a std::string member in the first place and return a reference to it. Alternatively, return a std::string_view, which can be used with either a char* or a std::string member. std::string_view is only available since C++17, but it also exists in some standard library extensions for earlier compilers, and there are non-standard implementations as well.
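A minimal sketch of the std::string_view option, assuming C++17. The names c_object and Wrapper are made up for illustration; c_object stands in for the struct from the C library:

```cpp
#include <string_view>

// Hypothetical stand-in for the C struct with a char* member.
struct c_object
{
    const char* member;
};

class Wrapper
{
public:
    explicit Wrapper(const char* text) : object{text} {}

    // Returns the raw pointer: no allocation, no copy.
    const char* get_member() const { return object.member; }

    // Returns a view of the same characters: still no allocation, but
    // constructing it from a char* measures the length (like strlen).
    std::string_view get_member_view() const
    {
        return object.member ? std::string_view(object.member)
                             : std::string_view();
    }

private:
    c_object object;
};
```

The view points at the very same buffer as the raw pointer, so it must not outlive the C object that owns the characters.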
The struct is from a C library; I just want to wrap it with C++ and, at the same time, not kill any performance.
Then it seems that returning a std::string would not be an appropriate design.
Say I have a variable
std::string str; // initialized with some value
And a struct defined as:
struct test
{
public:
const char* name;
};
I know this can be done :
test t1;
t1.name = str.c_str();
But this will store the address of the variable str in t1.name
Instead I want the values of str to put in a char array member of the structure which should be of exact same size as variable str.
Is there a way that can be achieved, or a better design?
Thanks in advance!
But this will store the address of the variable str in t1.name
Not exactly. str.c_str() does not return the address of variable str. It returns the address of the character array owned by str.
Instead I want the values of str to put in a char array member of the structure
To do that, the structure must have a char array member. Your structure does not; it has a pointer member.
char array member of the structure which should be of exact same size as variable str.
This is not possible. The size of the string is dynamic i.e. it may change at run time. The size of a member array must be known at compile time.
You can instead allocate an array dynamically. As the name implies, the size of dynamic allocation may be determined at run time. However, dynamic allocations must be manually deallocated, or else your program will leak memory.
or a better design
A popular design pattern for dynamic allocation is RAII. The standard library already has a RAII container for character strings: std::string. So, to copy a string into a member of a struct, a good design is to have a string as the member:
struct test {
std::string name;
};
test t1;
t1.name = str;
There is no reason to use const char* here: it is more error-prone and harder to implement, so you should use std::string instead.
std::string also lets you get a const char* to the string using the c_str() method.
But if you have to implement name as C-style string, here is what you have to do:
Allocate enough space on heap using new.
Cast to non-const
Copy strings using strcpy
Free memory in destructor
Constructor:
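struct test {
    const char* name;
    test(const std::string& str) : name(new char[str.length() + 1])
    {
        std::strcpy(const_cast<char*>(name), str.c_str()); // needs <cstring>
    }
    ~test() { delete[] name; } // free the memory in the destructor
};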
Also, as @Pixelchemist correctly noted, the rules of zero/three/five matter here. If your class holds resources that do not copy and destroy themselves correctly, as raw pointers do not (with smart pointers it would be a different story), you have to implement these as well:
copy constructor
copy assignment operator
destructor
move constructor
move assignment operator
For more information, read this excellent answer about the rule of three.
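The five special members listed above could look roughly like this. It is only a sketch: the class name name_holder and the helper duplicate are invented for illustration, and assignment uses the copy-and-swap idiom so that one operator covers both copy and move assignment:

```cpp
#include <cstring>
#include <string>
#include <utility>

class name_holder
{
public:
    explicit name_holder(const std::string& str) : name_(duplicate(str.c_str())) {}

    // Copy constructor: allocate a separate buffer and copy the characters.
    name_holder(const name_holder& other) : name_(duplicate(other.name_)) {}

    // Move constructor: steal the buffer and leave the source empty.
    name_holder(name_holder&& other) noexcept
        : name_(std::exchange(other.name_, nullptr)) {}

    // Copy and move assignment in one: the parameter is built by copy or
    // move, then swapped in; the old buffer dies with the parameter.
    name_holder& operator=(name_holder other) noexcept
    {
        std::swap(name_, other.name_);
        return *this;
    }

    // Destructor: delete[] on a null pointer is a no-op.
    ~name_holder() { delete[] name_; }

    const char* name() const { return name_; }

private:
    static char* duplicate(const char* source)
    {
        if (!source) return nullptr;
        char* copy = new char[std::strlen(source) + 1];
        std::strcpy(copy, source);
        return copy;
    }

    char* name_;
};
```

With a std::string member none of this is needed, which is the point of the rule of zero.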
If I use
const char * str = "Hello";
there is no memory allocation/deallocation needed at runtime.
If I use
const std::string str = "Hello";
will there be an allocation via new/malloc inside the string class or not? I could look for it in the assembly, but I am not good at reading it.
If the answer is "yes, there will be malloc/new", why? Why can't std::string just pass through to an inner const char pointer and do the actual memory allocation only if I need to edit the string?
Will there be an allocation via new/malloc inside the string class or not?
It depends. The string object will have to provide some memory to store the data, since that's its job. Some implementations use a "small string optimisation", where the object contains a small buffer, and only allocates from the heap if the string is too large for that.
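If reading the assembly is not an option, one way to check is to count allocations directly. The sketch below does so with a counting allocator plugged into std::basic_string; CountingAllocator, CountedString and allocations_for are names made up for this example, and the numbers you see depend on your standard library:

```cpp
#include <cstddef>
#include <memory>
#include <string>

// Number of calls to CountingAllocator::allocate so far.
static std::size_t g_allocations = 0;

// Minimal allocator that forwards to std::allocator and counts requests.
template <typename T>
struct CountingAllocator
{
    using value_type = T;

    CountingAllocator() = default;
    template <typename U>
    CountingAllocator(const CountingAllocator<U>&) noexcept {}

    T* allocate(std::size_t n)
    {
        ++g_allocations;
        return std::allocator<T>{}.allocate(n);
    }
    void deallocate(T* p, std::size_t n) noexcept
    {
        std::allocator<T>{}.deallocate(p, n);
    }
};

template <typename T, typename U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&) noexcept { return true; }
template <typename T, typename U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&) noexcept { return false; }

using CountedString =
    std::basic_string<char, std::char_traits<char>, CountingAllocator<char>>;

// Returns how many allocations constructing a string from text needed.
std::size_t allocations_for(const char* text)
{
    const std::size_t before = g_allocations;
    const CountedString s(text);
    return g_allocations - before;
}
```

On implementations with a small string optimisation, allocations_for("Hello") is typically zero, while a text of a few hundred characters needs at least one allocation.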
Why can't std::string just pass through to an inner const char pointer and do the actual memory allocation only if I need to edit the string?
What you describe isn't necessarily an optimisation (since it needs an extra runtime check whenever you modify the string), and in any case isn't allowed by the iterator invalidation rules.
There is also string_view, which lets you access an existing character sequence with an interface like that of a const string, without any memory management. It was only a proposal when this answer was written and has been standard since C++17 as std::string_view. It doesn't allow you to modify the string.
A naive implementation of std::string requires a heap allocation. However, compilers are allowed to optimize statically initialized std::string objects by replacing them with objects of alternative implementations, provided the strings are not modified at runtime.
Declaring immutable strings as const std::string may help the compiler apply such optimizations.
The C++ standard doesn't actually say you can't just store a pointer to an external string (and a length). However, that means that EVERY operation that may modify the string (e.g. char& std::string::operator[](size_t index)) would have to ensure that the string is actually writeable. A large share of string usage does NOT use a string only to store a constant, but does indeed modify the string [or uses a string that isn't a constant input anyway], so that check would be paid for by code that gains nothing from it.
So, some problems are:
std::string s = "Hello";
char &c = s[1];
c = 'a'; // Should change the string to "Hallo".
What if:
char buffer[1000];
cin.getline(buffer, sizeof buffer); // Reads "Hello"
std::string s = buffer;
cin.getline(buffer, sizeof buffer); // Reads "World"
What is the value in s now?
There are so many such cases where, if you were to just point at the original string instead of copying it, it would cause more problems with little or no benefit.
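The buffer question can be tried out directly. In this sketch (assuming C++17 for std::string_view; Observed and overwrite_buffer are names invented here, and std::strcpy stands in for the second getline), the std::string keeps its own copy, whereas a non-owning view sees whatever the buffer holds later:

```cpp
#include <cstring>
#include <string>
#include <string_view>

struct Observed
{
    std::string owned;     // what the std::string holds afterwards
    std::string via_view;  // what the view shows afterwards
};

Observed overwrite_buffer()
{
    char buffer[1000] = "Hello";
    std::string owned = buffer;      // copies the five characters
    std::string_view view = buffer;  // only points into buffer, length 5

    std::strcpy(buffer, "World");    // reuse the buffer, as getline would

    // owned is unaffected; view now reads the overwritten characters.
    return Observed{owned, std::string(view)};
}
```

Here owned still holds "Hello" while via_view holds "World", which is the surprise a pointer-sharing std::string would spring on its users.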
I'm working on a program that stores a vital data structure as an unstructured string with program-defined delimiters (so we need to walk the string and extract the information we need as we go) and we'd like to convert it to a more structured data type.
In essence, this will require a struct with a field describing what kind of data the struct contains and another field that's a string with the data itself. The length of the string will always be known at allocation time. We've determined through testing that doubling the number of allocations required for each of these data types is an unacceptable cost. Is there any way to allocate the memory for the struct and the std::string contained in the struct in a single allocation? If we were using cstrings I'd just have a char * in the struct and point it to the end of the struct after allocating a block big enough for the struct and string, but we'd prefer std::string if possible.
Most of my experience is with C, so please forgive any C++ ignorance displayed here.
If you have such rigorous memory needs, then you're going to have to abandon std::string.
The best alternative is to find or write an implementation of basic_string_ref (a proposal for the next C++ standard library, since standardized in C++17 as std::basic_string_view), which is really just a char* coupled with a size. But it has all of the (non-mutating) functions of std::basic_string. Then you use a factory function to allocate the memory you need (your struct size + string data), and then use placement new to initialize the basic_string_ref.
Of course, you'll also need a custom deletion function, since you can't just pass the pointer to "delete".
Given the previously linked to implementation of basic_string_ref (and its associated typedefs, string_ref), here's a factory constructor/destructor, for some type T that needs to have a string on it:
template<typename T> T *Create(..., const char *theString, size_t lenstr)
{
    // One block: the T object first, then the characters and a terminating '\0'.
    char *memory = new char[sizeof(T) + lenstr + 1];
    char *chars = memory + sizeof(T);
    memcpy(chars, theString, lenstr);
    chars[lenstr] = '\0';
    try
    {
        // The string_ref must refer to the copy inside the block, not to theString.
        return new(memory) T(..., string_ref(chars, lenstr));
    }
    catch(...)
    {
        delete[] memory;
        throw;
    }
}
template<typename T> T *Create(..., const std::string & theString)
{
    // T cannot be deduced from the arguments, so it is passed explicitly.
    return Create<T>(..., theString.c_str(), theString.length());
}
template<typename T> T *Create(..., const string_ref &theString)
{
    return Create<T>(..., theString.data(), theString.length());
}
template<typename T> void Destroy(T *pValue)
{
    pValue->~T();
    char *memory = reinterpret_cast<char*>(pValue);
    delete[] memory;
}
Obviously, you'll need to fill in the other constructor parameters yourself. And your type's constructor will need to take a string_ref that refers to the string.
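For a concrete picture, here is a sketch of the same factory for one specific type, using C++17's std::string_view in place of string_ref. Record, CreateRecord and DestroyRecord are names invented for this example, and the constructor parameters left open above are reduced to a single int:

```cpp
#include <cstddef>
#include <cstring>
#include <new>
#include <string_view>

// Hypothetical record type: a kind tag plus a view of its own characters.
struct Record
{
    int kind;
    std::string_view data;

    Record(int k, std::string_view d) noexcept : kind(k), data(d) {}
};

// Allocates the Record and its characters in a single block.
Record* CreateRecord(int kind, std::string_view text)
{
    char* memory = new char[sizeof(Record) + text.size() + 1];
    char* chars = memory + sizeof(Record);
    if (!text.empty())
        std::memcpy(chars, text.data(), text.size());
    chars[text.size()] = '\0';

    // The view refers to the copy stored right behind the Record.
    // Record's constructor is noexcept, so no try/catch is needed here.
    return new (memory) Record(kind, std::string_view(chars, text.size()));
}

// Destroys the Record and releases the block it was created in.
void DestroyRecord(Record* record)
{
    if (!record) return;
    record->~Record();
    delete[] reinterpret_cast<char*>(record);
}
```

A Record created this way must never be copied to another location by value, because the copy's view would still point into the original block.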
If you are using std::string, you can't really do one allocation for both structure and string, and you also can't make the allocation of both to be one large block. If you are using old C-style strings it's possible though.
If I understand you correctly, you are saying that through profiling you have determined that having to allocate a string and another data member in your data structure imposes an unacceptable cost on your application.
If that's indeed the case I can think of a couple solutions.
You could pre-allocate all of these structures up front, before your program starts. Keep them in some kind of fixed collection so they aren't copy-constructed, and reserve enough buffer in your strings to hold your data.
Controversial as it may seem, you could use old C-style char arrays. It seems like you are forgoing much of the reason to use strings in the first place, which is the memory management. However, in your case, since you know the needed buffer sizes at start-up, you could handle this yourself. If you like the other facilities that string provides, bear in mind that much of that is still available in <algorithm>.
Take a look at Variable Sized Struct C++ - the short answer is that there's no way to do it in vanilla C++.
Do you really need to allocate the container structs on the heap? It might be more efficient to have those on the stack, so they don't need to be allocated at all.
Indeed two allocations can seem too high. There are two ways to cut them down though:
Do a single allocation
Do a single dynamic allocation
It might not seem so different, so let me explain.
1. You can use the struct hack in C++
Yes this is not typical C++
Yes this requires special care
Technically it requires:
disabling the copy constructor and assignment operator
making the constructor and destructor private and provide factory methods for allocating and deallocating the object
Honestly, this is the hard way.
2. You can avoid allocating the outer struct dynamically
Simple enough:
struct M {
Kind _kind;
std::string _data;
};
and then pass instances of M on the stack. Move operations should guarantee that the std::string is not copied (you can always disable copy to make sure of it).
This solution is much simpler. The only (slight) drawback is in memory locality... but on the other hand the top of the stack is already in the CPU cache anyway.
C-style strings can always be converted to std::string as needed. In fact, there's a good chance that your observations from profiling are due to fragmentation of your data rather than simply the number of allocations, and creating an std::string on demand will be efficient. Of course, not knowing your actual application this is just a guess, and really one can't know this until it's tested anyways. I imagine a class
class my_class {
public:
    std::string data() const { return _data; }
    const char* data_as_c_str() const // In case you really need it!
    { return _data; }
private:
    int _type;
    char _data[1]; // really as long as the factory function allocated
};
Note I used a standard clever C trick for data layout: _data is as long as you want it to be, so long as your factory function allocates the extra space for it. IIRC, C99 even gave a special syntax for it:
struct my_struct {
int type;
char data[];
};
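which has good odds of working with your C++ compiler. (Is this in the C++11 standard? No: flexible array members are not part of standard C++, although the major compilers accept them as an extension.)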
Of course, if you do do this, you really need to make all of the constructors private and friend your factory function, to ensure that the factory function is the only way to actually instantiate my_class -- it would be broken without the extra memory for the array. You'll definitely need to make operator= private too, or otherwise implement it carefully.
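A sketch of such a factory is below. To stay clear of both the one-element array and the flexible array member, this version keeps no array member at all and addresses the characters that sit directly behind the object; that is a common idiom rather than something the standard spells out, so treat it as an assumption. The members create, destroy and _size are additions made for this example:

```cpp
#include <cstddef>
#include <cstring>
#include <new>
#include <string>

class my_class
{
public:
    // The only way to obtain an instance: one allocation for object and text.
    static my_class* create(int type, const std::string& text)
    {
        void* memory = ::operator new(sizeof(my_class) + text.size() + 1);
        my_class* object = new (memory) my_class(type, text.size());
        std::memcpy(object->chars(), text.c_str(), text.size() + 1);
        return object;
    }

    static void destroy(my_class* object)
    {
        if (!object) return;
        object->~my_class();
        ::operator delete(object);
    }

    my_class(const my_class&) = delete;
    my_class& operator=(const my_class&) = delete;

    int type() const { return _type; }
    std::string data() const { return std::string(data_as_c_str(), _size); }
    const char* data_as_c_str() const
    {
        return reinterpret_cast<const char*>(this + 1);
    }

private:
    my_class(int type, std::size_t size) : _type(type), _size(size) {}
    ~my_class() = default;

    char* chars() { return reinterpret_cast<char*>(this + 1); }

    int _type;
    std::size_t _size;
};
```

Since the constructor and destructor are private, the object cannot be put on the stack or deleted with plain delete by mistake.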
Rethinking your data types is probably a good idea.
For example, one thing you can do is, rather than trying to put your char arrays into a structured data type, use a smart reference instead. A class that looks like
class structured_data_reference {
public:
structured_data_reference(const char *data):_data(data) {}
std::string get_first_field() const {
// Do something interesting with _data to get the first field
}
private:
const char *_data;
};
You'll want to do the right thing with the other constructors and assignment operator too (probably disable assignment, and implement something reasonable for move and copy). And you may want reference counted pointers (e.g. std::shared_ptr) throughout your code rather than bare pointers.
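To make the placeholder body concrete, here is one possible version. It assumes, purely for illustration, that fields are separated by a '|' character; the real program-defined delimiters are not known here:

```cpp
#include <cstring>
#include <string>

class structured_data_reference
{
public:
    // data must point to a null-terminated string that outlives this object.
    explicit structured_data_reference(const char* data, char delimiter = '|')
        : _data(data), _delimiter(delimiter) {}

    // Copies out everything up to the first delimiter, or the whole
    // string if no delimiter is present.
    std::string get_first_field() const
    {
        const char* end = std::strchr(_data, _delimiter);
        return end ? std::string(_data, end) : std::string(_data);
    }

private:
    const char* _data;
    char _delimiter;
};
```

The reference itself never allocates; a std::string is only created at the moment a field is actually requested.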
Another hack that's possible is to just use std::string, but store the type information in the first entry (or first several). This requires accounting for that whenever you access the data, of course.
I'm not sure whether this exactly addresses your problem. One way to optimize memory allocation in C++ is to use a pre-allocated buffer together with the 'placement new' operator.
I tried to solve your problem as I understood it.
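#include <cstddef>
#include <new>
#include <string>
using std::string;

unsigned char *myPool = new unsigned char[10000];
std::size_t myPoolUsed = 0; // bytes of the pool handed out so far

// Hand out the next chunk of the pool (sketch: no bounds check).
void *poolAlloc(std::size_t size)
{
    const std::size_t align = alignof(std::max_align_t);
    myPoolUsed = (myPoolUsed + align - 1) / align * align;
    void *p = myPool + myPoolUsed;
    myPoolUsed += size;
    return p;
}
// Only the string objects live in the pool; std::string still allocates
// long character data on the heap by itself.
struct myStruct
{
    myStruct(const char* aSource1, const char* aSource2)
    {
        original = new (poolAlloc(sizeof(string))) string(aSource1); //placement new
        data = new (poolAlloc(sizeof(string))) string(aSource2); //placement new
    }
    ~myStruct()
    {
        original->~string(); //no deallocation needed, but the destructor must run
        data->~string();
    }
    string* original;
    string* data;
};
int main()
{
    myStruct* aStruct = new (poolAlloc(sizeof(myStruct))) myStruct("h1", "h2");
    // Use the struct
    aStruct->~myStruct(); // No need to deallocate, only to destroy
    delete [] myPool;
    return 0;
}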
[Edit] After the comment from NicolBolas, the problem is a bit clearer. I decided to write one more answer, even though in reality it is not much more advantageous than using a raw character array. Still, I believe it is well within the stated constraints.
The idea is to provide a custom allocator for the string class, as specified in this SO question.
In the implementation of the allocate method, use the placement new as
pointer allocate(size_type n, void * = 0)
{
// fail if we try to allocate too much
if((n * sizeof(T))> max_size()) { throw std::bad_alloc(); }
//T* t = static_cast<T *>(::operator new(n * sizeof(T)));
T* t = new (/* provide the address of the original character buffer*/) T[n];
return t;
}
The constraint is that, for the placement new to work, the address of the original string must be known to the allocator at run time. This can be achieved by setting it explicitly from outside before the new string member is created. However, this is not so elegant.
In essence, this will require a struct with a field describing what kind of data the struct contains and another field that's a string with the data itself.
I have a feeling that maybe you are not exploiting C++'s type system to its full potential here. It looks and feels very C-ish (that is not a proper word, I know). I don't have concrete examples to post here since I don't have any idea about the problem you are trying to solve.
Is there any way to allocate the memory for the struct and the std::string contained in the struct in a single allocation?
I believe you are worried about the structure allocation being followed by a copy of the string into the structure member? Ideally this shouldn't happen (but of course, this depends on how and when you are initializing the members). C++11 supports move construction, which should take care of any extra string copies that you are worried about.
You should really, really post some code to make this discussion worthwhile :)
a vital data structure as an unstructured string with program-defined delimiters
One question: Is this string mutable? If not, you can use a slightly different data-structure. Don't store copies of parts of this vital data structure but rather indices/iterators to this string which point to the delimiters.
// assume that !, [, ], $, % etc. are your program defined delims
const std::string vital = "!id[thisisdata]$[moredata]%[controlblock]%";
// define a special struct
enum Type { ... };
struct Info {
size_t start, end;
Type type;
// define appropriate ctors
};
// parse the string and return Info objects
std::vector<Info> parse(const std::string& str) {
std::vector<Info> v;
// loop through the string looking for delims
for (size_t b = 0, e = str.size(); b < e; ++b) {
// on hitting one such delim create an Info
switch( str[ b ] ) {
case '%':
...
case '$':
// initializing the start and then move until
// you get the appropriate end delim
}
// use push_back/emplace_back to insert this newly
// created Info object back in the vector
v.push_back( Info( start, end, kind ) );
}
return v;
}
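Here is a runnable sketch of the same idea, assuming C++17. The meaning of the delimiters is not specified above, so this version invents a rule for illustration: every [ ... ] pair is one block, and the character right after the closing bracket decides its type ('$' for data, '%' for control). The enumerators and the helper text_of are likewise made up:

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

enum class Type { Data, Control, Unknown };  // hypothetical block types

struct Info
{
    std::size_t start, end;  // half-open range [start, end) into the source
    Type type;
};

// Records where each bracketed block sits instead of copying its text.
std::vector<Info> parse(const std::string& str)
{
    std::vector<Info> v;
    std::size_t pos = 0;
    while ((pos = str.find('[', pos)) != std::string::npos) {
        const std::size_t close = str.find(']', pos + 1);
        if (close == std::string::npos) break;  // unterminated block

        Type type = Type::Unknown;
        if (close + 1 < str.size()) {
            if (str[close + 1] == '$') type = Type::Data;
            else if (str[close + 1] == '%') type = Type::Control;
        }
        v.push_back(Info{pos + 1, close, type});
        pos = close + 1;
    }
    return v;
}

// Returns the block's text as a view into source; nothing is copied.
std::string_view text_of(const std::string& source, const Info& info)
{
    return std::string_view(source).substr(info.start, info.end - info.start);
}
```

The Info objects are only meaningful while the source string is alive and unchanged, which is why the approach depends on the string being immutable.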