C++ remembering pointers to allocated strings


I have a problem in C++: I can't figure out how to keep track of pointers to newly allocated strings.
In the function getName() I create a copy of the requested name so that the user can't get a pointer to the real allocated name. But I can't find a way to free these allocated copies!
Is there any way other than a list or an array?
Thank you.
This is the definition of the member function getName():
char * Course::getName() const
{
char* CourseNameCopy= new char*(strlen(CourseName)+1);
return CourseNameCopy;
}

char * Course::getName() const
{
char* CourseNameCopy= new char[strlen(CourseName)+1];
strcpy(CourseNameCopy, CourseName);
return CourseNameCopy;
}
I've made a couple of corrections to the original code so that it does what it claims to do.
If there's a requirement to return a pointer to a modifiable character array containing a copy of the course name, then this is the way to go. But that's a very unusual requirement; usually it's sufficient to return a pointer to a non-modifiable character array, and for that, the internal array is all that's needed:
const char * Course::getName() const
{
return CourseName;
}
With that, users can look at the name of the course but not change it. If for some reason someone needs to fiddle with the returned text they can make their own copy and change that.

Use std::string, unless you have a very specific reason not to. Your code looks like a prime example for a string.
#include <string>
std::string Course::getName() const
{
return CourseName; // This will return a copy
}
Of course you have to change your member variable to be also a std::string CourseName;.
This will make your code much safer and much easier to read. It's the preferred way of doing it in C++, unless you're not a beginner anymore and have a very specific reason not to.
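A minimal sketch of what the class could look like with a std::string member. Only the names Course, CourseName and getName() come from the question; the constructor is an assumption made so the example is complete.

```cpp
#include <string>
#include <utility>

// Hypothetical minimal Course class; the constructor is assumed for illustration.
class Course {
public:
    explicit Course(std::string name) : CourseName(std::move(name)) {}

    // Returns a copy, so the caller can modify it without touching the member.
    std::string getName() const { return CourseName; }

private:
    std::string CourseName;
};
```

The caller owns the returned std::string and nothing has to be remembered or freed by hand; the copy is destroyed when it goes out of scope.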

... so that the user cant get pointer to the real allocated name ...
You have a slight misconception here. The client (user) doesn't need access to the original name; receiving the pointer value (address) of the copy is enough to delete it later. Thus clients just have to do the following:
Course c("XYZ");
char* n = c.getName();
// deallocate after use
delete[] n;
Also note that you forgot to actually copy the contents (and that the allocation needs new char[...], not new char*(...)):
char * Course::getName() const {
char* CourseNameCopy= new char[strlen(CourseName)+1];
strcpy(CourseNameCopy,CourseName); // <<<<<<<<<
return CourseNameCopy;
}
I have to mention that this is not a good solution, because it makes your code's clients responsible for memory management.
Better to use a std::string member variable for CourseName; it was designed for this and takes care of all the memory management under the hood.
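If a modifiable raw character array really is required, one possible middle ground (assuming C++11 or later) is to hand the copy back in a std::unique_ptr<char[]>, so the caller cannot forget the delete[]. The helper name copyName is hypothetical.

```cpp
#include <cstring>
#include <memory>

// Hypothetical helper: duplicates a C string into an owning smart pointer.
// The array is released with delete[] automatically when the pointer dies.
std::unique_ptr<char[]> copyName(const char* name) {
    std::unique_ptr<char[]> copy(new char[std::strlen(name) + 1]);
    std::strcpy(copy.get(), name);
    return copy;
}
```

A getName() written this way would simply return copyName(CourseName); the class itself still has nothing to remember.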

Related

returning a "variable string literal" from a function

I have some function that needs to return a const char* (so that a whole host of other functions can end up using it).
I know that if I had something defined as follows:
const char* Foo(int n)
{
// Some code
.
.
.
return "string literal, say";
}
then there is no problem. However am I correct in saying that if Foo has to return some string that can only be determined at runtime (depending on the parameter n, (where each n taking any value in [0, 2^31-1] uniquely determines a return string)) then I have to use the heap (or return objects like std::string which use the heap internally)?
std::string seems too heavyweight for what I want to accomplish (at least two functions will have to pass the parcel), and allocating memory inside Foo to be freed by the caller doesn't strike me as a safe way of going forward. I cannot (easily) pass in references to the objects that need this function, and not that I believe it is possible anyway but macro trickery is out of the question.
Is there something simple that I have not yet considered?
EDIT
Thanks to all for the answers, I'll go for std::string (I suppose in a roundabout fashion I was asking for confirmation that there is no way of hinting to the compiler that it should store the contents of some char[] in the same place that it stores string literals). As for "heavyweight" (and I'm pleasantly surprised that copying them isn't as wasteful as I thought) that wasn't the best way of putting it, perhaps "different" would have been closer to my initial apprehension.

If you mean that your function chooses between one of n known-at-compile-time strings, then you can just return a const char * to any one of them. A string literal has static storage duration in C and C++, meaning that they exist for the lifetime of the program. Therefore it is safe to return a pointer to one.
const char* choose_string(int n)
{
switch(n % 4)
{
case 0: return "zero";
case 1: return "one";
case 2: return "two";
case 3: return "three";
default: return "unknown"; // reached only for negative n, where n % 4 is negative
}
}
If your function dynamically generates a string at runtime, then you have to either pass in a (char *buf, int buf_length) and write the result into it, or return a std::string.

In C++, returning a std::string is probably the right answer (as several others have already said).
If you don't want to use std::string for some reason (say, if you were programming in C, but then you would have tagged the question that way), there are several options for "returning" a string from a function. None of them are pretty.
If you return a string literal, what you're really returning is a pointer to the first character of the array object associated with that string literal. That object has static storage duration (i.e., it exists for the entire execution of your program), so returning a pointer to it is perfectly safe. This is obviously inflexible.
You can allocate an array on the heap and return a pointer to it. That lets the called function determine how long it needs to be, but it places the burden on the caller to deallocate the memory when it's no longer needed.
You can return a pointer to (the first element of) a static array defined inside the function. This is inflexible in that the maximum length has to be determined at compile time. It also means that successive calls to the function will clobber the result. The asctime() function, defined in <time.h> (<ctime> in C++), does this. (I once wrote a function that cycled through the elements of a static array of arrays, so that 6 successive calls would not clobber previous results, but the 7th would. That was probably overkill.)
You can require the caller to pass in a pointer to (the first element of) an array that the caller itself must allocate, probably along with a separate argument that specifies the length of the caller's array. This requires the caller to know how long the string might be, and probably to be able to handle the error of not reserving enough space.
And now you know why C++ provides library features like std::string that take care of all this stuff for you.
Incidentally, the phrase "variable string literal" doesn't make a lot of sense. If something is a literal, it's not variable. Probably "variable string" is what you meant.

The easiest solution might be to return a std::string.
If you want to avoid std::string, one alternative is to have the caller pass a char[] buffer to the function. You might also want to provide a function that can tell the caller how big of a buffer will be needed, unless an upper bound is known statically.

Use std::string, but if you really want... A common pattern used in C programming is to return the size of the final result, allocate a buffer, and call the function twice. (I apologize for the C style, you want a C solution I give a C solution :P )
size_t Foo(int n, char* buff, size_t buffSize)
{
if (buff)
{
// check if buffSize is large enough if so fill
}
// calculate final string size and return
return stringSize;
}
size_t size = Foo(x, NULL, 0); // find the size of the result
char* string = malloc(size); // allocate
Foo(x,string, size); // fill the buffer

(Donning asbestos suit)
Consider just leaking the memory.
const char* Foo(int n)
{
static std::unordered_map<int, const char*> cache;
if (!cache[n])
{
// Generate cache[n]
}
return cache[n];
}
Yup, this will leak memory. Up to 2^32 strings' worth. But if you had the actual string literals, you would always have all 2^32 strings in memory (and would clearly require a 64-bit build: the \0 terminators alone take 4GB!)

Returning a constant char pointer yields an error

I am new to C++ and haven't quite grasped all the concepts yet, so I am perplexed as to why this function does not work. I am currently not at home, so I cannot post the compiler error just yet; I will do so as soon as I get home.
Here is the function.
const char * ConvertToChar(std::string input1, std::string input2) {
// Create a string that you want converted
std::stringstream ss;
// Streams the two strings together
ss << input1 << input2;
// outputs it into a string
std::string msg = ss.str();
//Creating the character the string will go in; be sure it is large enough so you don't overflow the array
const char * cstr[80];
//Copies the string into the char array. Thus allowing it to be used elsewhere.
strcpy(cstr, msg.c_str());
return * cstr;
}
It is meant to concatenate two strings and return the result as a const char *, because the function I want to use it with requires a const char pointer to be passed in.

The code returns a pointer to a local (stack) variable. By the time the caller gets this pointer, that local variable no longer exists. This is often called a dangling pointer.
If you want to convert std::string to a c-style string use std::string::c_str().
So, to concatenate two strings and get a c-style string do:
std::string input1 = ...;
std::string input2 = ...;
// concatenate
std::string s = input1 + input2;
// get a c-style string
char const* cstr = s.c_str();
// cstr becomes invalid when s is changed or destroyed

Without knowing what the error is, it's hard to say, but this line:
const char* cstr[80];
seems wrong: it creates an array of 80 pointers. When it implicitly converts to a pointer, the type will be char const**, which should give an error when it is passed as an argument to strcpy. The dereference in the return statement is the same as if you wrote cstr[0], and returns the first pointer in the array. Since the contents of the array have never been initialized, this is undefined behavior.
Before you go any further, you have to define what the function should return: not only its type, but where the pointed-to memory will reside. There are three possible solutions to this:
Use a local static for the buffer:
This solution was frequently used in early C, and is still present in a number of functions in the C library. It has two major defects: 1) successive calls will overwrite the results, so the client code must make its own copy before calling the function again, and 2) it isn't thread safe. (The second issue can be avoided by using thread local storage.) In cases like yours, it also has the problem that the buffer must be big enough for the data, which probably requires dynamic allocation, which adds to the complexity.
Return a pointer to dynamically allocated memory:
This works well in theory, but requires the client code to free the memory. This must be rigorously documented, and is extremely error prone.
Require the client code to provide the buffer:
This is probably the best solution in modern code, but it does mean that you need extra parameters for the address and the length of the buffer.
In addition to this: there's no need to use std::ostringstream if all you're doing is concatenating; just add the two strings. Whatever solution you use, verify that the results will fit.
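A sketch of the third option (client-provided buffer) applied to the concatenation in the question. The name concat_into and the bool return convention are assumptions; the important part is the size check before copying.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: copies input1 + input2 into the caller's buffer.
// Returns false, leaving dest untouched, if the result would not fit.
bool concat_into(char* dest, std::size_t destSize,
                 const std::string& input1, const std::string& input2) {
    std::string msg = input1 + input2;
    if (msg.size() + 1 > destSize)       // +1 for the terminating '\0'
        return false;
    std::memcpy(dest, msg.c_str(), msg.size() + 1);
    return true;
}
```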

implement substring c++

How do you implement substring in C++ so that it returns a pointer to char (and takes a pointer to the first letter)?
Say, something like char* substr(char* c, int pos, int length)

I use methods of the std::string class.
Edit:
say something like char* substr(char* c, int pos, int length)
This isn't a function that can be implemented well. To create a substring, you need memory:
Same memory as original string: did you mean to overwrite the original string? Is the original string writeable?
New memory: is the caller going to invoke the delete[] operator on the pointer which it receives from your substr function (if it doesn't, there would be a memory leak)?
Global memory: you could reserve a fixed-length buffer in global memory, but that's all sorts of problems (may not be long enough, will be overwritten when someone else calls your substr).
The right way to do it is to return a std::string instance, which is what the string::substr method does, as shown in Graphics Noob's answer, because the std::string instance (unlike a char* instance) will manage the lifetime of its own memory; and if you then want a char* instance from the std::string instance, you can get that using the c_str() method.
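As a rough illustration, a substring helper that takes a C string but returns a std::string could look like this. The name substr_copy and the choice to return an empty string for an out-of-range position are assumptions.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns up to 'length' characters of c starting at pos.
// The returned std::string owns its memory, so nobody has to call delete[].
std::string substr_copy(const char* c, std::size_t pos, std::size_t length) {
    std::string s(c);
    if (pos > s.size())
        return std::string();            // out of range: empty result
    return s.substr(pos, length);        // substr clamps length to the end
}
```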

Is this the functionality you are trying to replicate?
http://www.cplusplus.com/reference/clibrary/cstring/strstr/

str.substr(int,int).c_str()
Will return the substring as a const char* but be careful using it, the memory containing the string will be invalid at the end of the statement, so you may need to surround it with a strcpy() if you want to store the result. Or better yet just break it up into two lines:
std::string str2 = str.substr(int,int);
const char* cstr = str2.c_str();
You'll need to do a strcpy() to safely get rid of the const.
The ints refer to the parameters for the substring you're trying to get.

Avoiding memory leaks while mutating c-strings

For educational purposes, I am using cstrings in some test programs. I would like to shorten strings with a placeholder such as "...".
That is, "Quite a long string" will become "Quite a lo..." if my maximum length is set to 13. Further, I do not want to destroy the original string - the shortened string therefore has to be a copy.
The (static) method below is what I come up with. My question is: Should the class allocating memory for my shortened string also be responsible for freeing it?
What I do now is to store the returned string in a separate "user class" and defer freeing the memory to that user class.
const char* TextHelper::shortenWithPlaceholder(const char* text, size_t newSize) {
char* shortened = new char[newSize+1];
if (newSize <= 3) {
strncpy_s(shortened, newSize+1, ".", newSize);
}
else {
strncpy_s(shortened, newSize+1, text, newSize-3);
strncat_s(shortened, newSize+1, "...", 3);
}
return shortened;
}

The standard approach of functions like this is to have the user pass in a char[] buffer. You see this in functions like sprintf(), for example, which take a destination buffer as a parameter. This allows the caller to be responsible for both allocating and freeing the memory, keeping the whole memory management issue in a single place.

In order to avoid buffer overflows and memory leaks, you should always use C++ classes such as std::string in this case.
Only the very last instance should convert the class into something low level such as char*. This will make your code simple and safe. Just change your code to:
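std::string TextHelper::shortenWithPlaceholder(const std::string& text,
size_t newSize) {
if (text.size() <= newSize) return text; // already short enough
if (newSize <= 3) return std::string(newSize, '.'); // no room for any text
return text.substr(0, newSize-3) + "...";
}
When using that function in a C context, you simply use the c_str() method:
some_c_function(TextHelper::shortenWithPlaceholder("abcde", 4).c_str());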
That's all!
In general, you should not program in C++ the same way you program in C. It's more appropriate to treat C++ as a really different language.

I've never been happy returning pointers to locally allocated memory. I like to keep a healthy mistrust of anyone calling my function in regard to clean up.
Instead, have you considered accepting a buffer into which you'd copy the shortened string?
eg.
const char* TextHelper::shortenWithPlaceholder(const char* text,
size_t textSize,
char* short_text,
size_t shortSize)
where short_text is the buffer into which the shortened string is copied, and shortSize is the size of that buffer. You could also continue to return a const char* pointing to short_text as a convenience to the caller (return NULL if shortSize isn't large enough).
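One way the body of such a function might look, written here as a free function so the sketch is self-contained. Treating shortSize - 1 as the maximum visible length, and returning a null pointer when there is no room for "..." plus the terminator, are assumptions.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical implementation of the buffer-accepting signature.
const char* shortenWithPlaceholder(const char* text, std::size_t textSize,
                                   char* short_text, std::size_t shortSize) {
    if (shortSize < 4)
        return nullptr;                  // no room for "..." plus '\0'
    std::size_t maxLen = shortSize - 1;  // visible characters that fit
    if (textSize <= maxLen) {
        std::memcpy(short_text, text, textSize);
        short_text[textSize] = '\0';     // text fits unchanged
    } else {
        std::memcpy(short_text, text, maxLen - 3);
        std::memcpy(short_text + maxLen - 3, "...", 4);  // includes '\0'
    }
    return short_text;
}
```

With a 14-byte buffer, "Quite a long string" becomes "Quite a lo...", matching the example in the question.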

Really you should just use std::string, but if you must, look to the existing library for usage guidance.
In the C standard library, the function that is closest to what you are doing is
char * strncpy ( char * destination, const char * source, size_t num );
So I'd go with this:
const char* TextHelper::shortenWithPlaceholder(
char * destination,
const char * source,
size_t newSize);
The caller is responsible for memory management - this allows the caller to use the stack, or a heap, or a memory mapped file, or whatever source to hold that data. You don't need to document that you used new[] to allocate the memory, and the caller doesn't need to know to use delete[] as opposed to free or delete, or even a lower-level operating system call. Leaving the memory management to the caller is just more flexible, and less error prone.
Returning a pointer to the destination is just a nicety to allow you to do things like this:
char buffer[13];
printf("%s", TextHelper::shortenWithPlaceholder(buffer, source, 12));

The most flexible approach is to return a helper object that wraps the allocated memory, so that the caller doesn't have to worry about it. The class stores a pointer to the memory, and has a copy constructor, an assignment operator and a destructor.
class string_wrapper
{
char *p;
public:
string_wrapper(char *_p) : p(_p) { }
~string_wrapper() { delete[] p; }
const char *c_str() { return p; }
// also copy ctor, assignment
};
// function declaration
string_wrapper TextHelper::shortenWithPlaceholder(const char* text, size_t newSize)
{
// allocate string buffer 'p' somehow...
return string_wrapper(p);
}
// caller
string_wrapper shortened = TextHelper::shortenWithPlaceholder("Something too long", 5);
std::cout << shortened.c_str();
Most real programs use std::string for this purpose.
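For completeness, here is one way the "copy ctor, assignment" part might be filled in. Unlike the wrapper above, this sketch copies the text it is given instead of adopting an existing new[] pointer; that is a simplification chosen for the example, not the original design.

```cpp
#include <cstring>
#include <utility>

// Sketch of a wrapper with copy constructor, assignment and destructor.
class string_wrapper {
    char* p;

    static char* clone(const char* s) {
        char* c = new char[std::strlen(s) + 1];
        std::strcpy(c, s);
        return c;
    }

public:
    explicit string_wrapper(const char* s) : p(clone(s)) {}
    string_wrapper(const string_wrapper& other) : p(clone(other.p)) {}

    // Copy-and-swap: 'other' is already a copy; it frees our old buffer.
    string_wrapper& operator=(string_wrapper other) {
        std::swap(p, other.p);
        return *this;
    }

    ~string_wrapper() { delete[] p; }

    const char* c_str() const { return p; }
};
```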

In your example the caller has no choice but to be responsible for freeing the allocated memory.
This, however, is an error prone idiom to use and I don't recommend using it.
One alternative that allows you to use pretty much the same code is to change shortened to a reference-counted pointer and have the method return that reference-counted pointer instead of a bare pointer.
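With C++11, a reference-counted pointer to a character array could be sketched as follows. The helper name make_shared_cstring is hypothetical; the explicit std::default_delete<char[]> matters because the memory comes from new[] and must be released with delete[].

```cpp
#include <cstddef>
#include <cstring>
#include <memory>

// Hypothetical helper: copies text into a reference-counted char array.
// The array is freed when the last shared_ptr copy goes away.
std::shared_ptr<char> make_shared_cstring(const char* text) {
    std::size_t n = std::strlen(text) + 1;
    std::shared_ptr<char> p(new char[n], std::default_delete<char[]>());
    std::memcpy(p.get(), text, n);
    return p;
}
```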

There are two basic ways that I consider equally common:
a) TextHelper returns the c string and forgets about it. The user has to delete the memory.
b) TextHelper maintains a list of allocated strings and deallocates them when it is destroyed.
Now it depends on your usage pattern. b) seems risky to me: If TextHelper has to deallocate the strings, it should not do so before the user is done working with the shortened string. You probably won't know when this point comes, so you keep the TextHelper alive until the program terminates. This results in a memory usage pattern equal to a memory leak. I'd recommend b) only if the strings belong semantically to the class that provides them, similar to the std::string::c_str(). Your TextHelper looks more like a toolbox that should not be associated with the processed strings, so if I had to choose between the two, I'd go for a). Your user class is probably the best solution, given a fixed TextHelper interface.

Edit: No, I'm wrong. I misunderstood what you were trying to do. The caller must delete the memory in your instance.
The C++ standard states that deleting 0/NULL does nothing (in other words, this is safe to do), so you can delete it regardless of whether you ever called the function at all. Edit: I don't know how this got left out...your other alternative is placement delete. In that case, even if it is bad form, you should also use placement new to keep the allocation/deallocation in the same place (otherwise the inconsistency would make debugging ridiculous).
That said, how are you using the code? I don't see when you would ever call it more than once, but if you do, there are potential memory leaks (I think) if you don't remember each different block of memory.
I would just use a smart pointer such as boost::shared_array, which deletes the array on the way out. (std::auto_ptr and a plain boost::shared_ptr release with delete rather than delete[], so they are not a good fit for memory from new char[].)
Another thing you can do, depending on how TextHelper is allocated, is to let it own the copy. Here is a theoretical ctor and dtor:
TextHelper(const char* input) : input_(input), copy(0) { copy = new char[strlen(input) + 1]; /* mess with later */ }
~TextHelper() { delete[] copy; }

Caching a const char * as a return type

Was reading up a bit on my C++, and found this article about RTTI (Runtime Type Identification):
http://msdn.microsoft.com/en-us/library/70ky2y6k(VS.80).aspx . Well, that's another subject :) - However, I stumbled upon a weird saying in the type_info-class, namely about the ::name-method. It says: "The type_info::name member function returns a const char* to a null-terminated string representing the human-readable name of the type. The memory pointed to is cached and should never be directly deallocated."
How can you implement something like this yourself!? I've been struggling quite a bit with this exact problem often before, as I don't want to make a new char-array for the caller to delete, so I've stuck to std::string thus far.
So, for the sake of simplicity, let's say I want to make a method that returns "Hello World!", let's call it
const char *getHelloString() const;
Personally, I would make it somehow like this (Pseudo):
const char *getHelloString() const
{
char *returnVal = new char[13];
strcpy(returnVal, "HelloWorld!");
return returnVal;
}
.. But this would mean that the caller should do a delete[] on my return pointer :(
Thx in advance

How about this:
const char *getHelloString() const
{
return "HelloWorld!";
}
Returning a literal directly means the space for the string is allocated in static storage by the compiler and will be available throughout the duration of the program.

I like all the answers about how the string could be statically allocated, but that's not necessarily true for all implementations, particularly the one whose documentation the original poster linked to. In this case, it appears that the decorated type name is stored statically in order to save space, and the undecorated type name is computed on demand and cached in a linked list.
If you're curious about how the Visual C++ type_info::name() implementation allocates and caches its memory, it's not hard to find out. First, create a tiny test program:
#include <cstdio>
#include <typeinfo>
#include <vector>
int main(int argc, char* argv[]) {
std::vector<int> v;
const type_info& ti = typeid(v);
const char* n = ti.name();
printf("%s\n", n);
return 0;
}
Build it and run it under a debugger (I used WinDbg) and look at the pointer returned by type_info::name(). Does it point to a global structure? If so, WinDbg's ln command will tell the name of the closest symbol:
0:000> ?? n
char * 0x00000000`00857290
"class std::vector<int,class std::allocator<int> >"
0:000> ln 0x00000000`00857290
0:000>
ln didn't print anything, which indicates that the string wasn't in the range of addresses owned by any specific module. It would be in that range if it was in the data or read-only data segment. Let's see if it was allocated on the heap, by searching all heaps for the address returned by type_info::name():
0:000> !heap -x 0x00000000`00857290
Entry User Heap Segment Size PrevSize Unused Flags
-------------------------------------------------------------------------------------------------------------
0000000000857280 0000000000857290 0000000000850000 0000000000850000 70 40 3e busy extra fill
Yes, it was allocated on the heap. Putting a breakpoint at the start of malloc() and restarting the program confirms it.
Looking at the declaration in <typeinfo> gives a clue about where the heap pointers are getting cached:
struct __type_info_node {
void *memPtr;
__type_info_node* next;
};
extern __type_info_node __type_info_root_node;
...
_CRTIMP_PURE const char* __CLR_OR_THIS_CALL name(__type_info_node* __ptype_info_node = &__type_info_root_node) const;
If you find the address of __type_info_root_node and walk down the list in the debugger, you quickly find a node containing the same address that was returned by type_info::name(). The list seems to be related to the caching scheme.
The MSDN page linked in the original question seems to fill in the blanks: the name is stored in its decorated form to save space, and this form is accessible via type_info::raw_name(). When you call type_info::name() for the first time on a given type, it undecorates the name, stores it in a heap-allocated buffer, caches the buffer pointer, and returns it.
The linked list may also be used to deallocate the cached strings during program exit (however, I didn't verify whether that is the case). This would ensure that they don't show up as memory leaks when you run a memory debugging tool.

Well, if we are talking about a function that should always return the same value, it's quite simple.
const char * foo()
{
static char return_val[] = "HelloWorld!";
return return_val;
}
The tricky bit is when you start caching the result: then you have to consider threading, when your cache gets invalidated, and possibly storing things in thread-local storage. But if it's just a one-off output that is immediately copied, this should do the trick.
Alternatively, if you don't have a fixed size, you have to either use a static buffer of arbitrary size, which might eventually turn out to be too small for the result, or turn to a managed class, say std::string.
const char * foo()
{
static std::string output;
DoCalculation(output);
return output.c_str();
}
also the function signature
const char *getHelloString() const;
is only applicable for member functions.
At which point you don't need to deal with static function local variables and could just use a member variable.

I think that since they know that there are a finite number of these, they just keep them around forever. It might be appropriate for you to do that in some instances, but as a general rule, std::string is going to be better.
They can also look up new calls to see if they made that string already and return the same pointer. Again, depending on what you are doing, this may be useful for you too.

Be careful when implementing a function that allocates a chunk of memory and then expects the caller to deallocate it, as you do in the OP:
const char *getHelloString() const
{
char *returnVal = new char[13];
strcpy(returnVal, "HelloWorld!");
return returnVal;
}
By doing this you are transferring ownership of the memory to the caller. If you call this code from some other function:
int main()
{
const char * str = getHelloString();
delete[] str; // must be delete[], matching the new[] inside getHelloString()
return 0;
}
...the semantics of transferring ownership of the memory is not clear, creating a situation where bugs and memory leaks are more likely.
Also, at least under Windows, if the two functions are in 2 different modules you could potentially corrupt the heap. In particular, if main() is in hello.exe, compiled in VC9, and getHelloString() is in utility.dll, compiled in VC6, you'll corrupt the heap when you delete the memory. This is because VC6 and VC9 both use their own heap, and they aren't the same heap, so you are allocating from one heap and deallocating from another.

Why does the return type need to be const? Don't think of the method as a get method, think of it as a create method. I've seen plenty of API that requires you to delete something a creation operator/method returns. Just make sure you note that in the documentation.
/* create a hello string
* must be deleted after use
*/
char *createHelloString() const
{
char *returnVal = new char[13];
strcpy(returnVal, "HelloWorld!");
return returnVal;
}

What I've often done when I need this sort of functionality is to have a char * pointer in the class - initialized to null - and allocate when required.
viz:
class CacheNameString
{
private:
char *name;
public:
CacheNameString():name(NULL) { }
const char *make_name(const char *v)
{
if (name != NULL)
free(name);
name = strdup(v);
return name;
}
};

Something like this would do:
const char *myfunction() {
static char *str = NULL; /* this only happens once */
delete [] str; /* delete previous cached version */
str = new char[strlen("whatever") + 1]; /* allocate space for the string and its NUL terminator */
strcpy(str, "whatever");
return str;
}
EDIT: Something that occurred to me is that a good replacement for this could be returning a boost::shared_ptr instead. That way the caller can hold onto it as long as they want and they don't have to worry about explicitly deleting it. A fair compromise IMO.

The advice that warns about the lifetime of the returned string is sound. You should always be careful about recognising your responsibilities when it comes to managing the lifetime of returned pointers. The practice is quite safe, however, provided the variable pointed to will outlast the call to the function that returned it. Consider, for instance, the pointer to const char returned by c_str() as a method of class std::string. This returns a pointer to the memory managed by the string object, which is guaranteed to be valid as long as the string object is not destroyed or made to reallocate its internal memory.
In the case of the std::type_info class, it is a part of the C++ standard, as its namespace implies. The pointer returned from name() actually points to static memory created by the compiler and linker when the class was compiled, and is a part of the run-time type identification (RTTI) system. Because it refers to a symbol in code space, you should not attempt to delete it.

I think something like this can only be implemented "cleanly" using objects and the RAII idiom.
When the object's destructor is called (obj goes out of scope), we can safely assume that the const char* pointers aren't being used anymore.
example code:
class ICanReturnConstChars
{
std::stack<char*> cached_strings;
public:
const char* yeahGiveItToMe(){
char* newmem = new char[something];
//write something to newmem
cached_strings.push(newmem); // std::stack has push/top/pop, not push_back/back/pop_back
return newmem;
}
~ICanReturnConstChars(){
while(!cached_strings.empty()){
delete [] cached_strings.top();
cached_strings.pop();
}
}
};
The only other possibility I know of is to pass a smart pointer.

It's probably done using a static buffer:
const char* GetHelloString()
{
static char buffer[256] = { 0 };
strcpy( buffer, "Hello World!" );
return buffer;
}
This buffer is like a global variable that is accessible only from this function.

You can't rely on GC; this is C++. That means you must keep the memory available until the program terminates. You simply don't know when it becomes safe to delete[] it. So, if you want to construct and return a const char*, simply new[] it and return it. Accept the unavoidable leak.
