I'm a relative novice when it comes to C++, as I was weaned on Java for much of my undergraduate curriculum ('tis a shame). Memory management has been a hassle, but I've purchased a number of books on ANSI C and C++. I've poked around the related questions, but couldn't find one that matched this particular case. Maybe it's so obvious nobody mentions it?
This question has been bugging me, but I feel as if there's a conceptual point I'm not utilizing.
Suppose:
char original[56];
cstr[0] = 'a';
cstr[1] = 'b';
cstr[2] = 'c';
cstr[3] = 'd';
cstr[4] = 'e';
cstr[5] = '\0';
char *shaved = shavecstr(cstr);
// various operations, calls //
delete[] shaved;
Where,
char* shavecstr(char* cstr)
{
    size_t len = strlen(cstr);
    char* ncstr = new char[len];
    strcpy(ncstr,cstr);
    return ncstr;
}
In that the whole point is to have 'original' be a buffer that fills with characters and routinely has its copy shaved and used elsewhere.
To clarify, original is filled via std::gets(char* buff), std::getline(char* buff, buff_sz), std::read(char* buff, buff_sz), or any in-place filling input reader. To 'shave' a string, it's basically truncated down eliminating the unused array space.
The error is a heap allocation error, and segs on the delete[].
To prevent leaks, I want to free the memory held by 'shaved' so it can be used again after it has been passed through some function calls. There is probably a good reason why this is restricted, but there should be some way to free the memory, since with this setup nothing else holds a pointer to the data.
I assume you would replace original by cstr, otherwise the code won't compile as cstr is not declared.
The error here is that the size of the allocated array is too small. You want char* ncstr = new char[len+1]; to account for the terminating \0.
Also, if you delete shaved right after the function returns, there is no point in calling the function...
To go a bit deeper, the memory used for cstr will be released when the containing function returns. Usually such static strings are placed in constants that live for the entire duration of the application. For example, you could have const char* cstr="abcde"; outside all your functions. Then you can pass this string around without having to dynamically allocate it.
Assuming you meant to use cstr instead of cstrn...
You should not be deleting cstr. You should be deleting shaved.
You only delete the memory that was allocated with new. And delete[] memory that was allocated with new[].
shaved is simply a variable that holds a memory address. You pass that memory address to delete[] to get rid of the memory. shaved holds the memory address of the memory that was allocated with new[].
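Putting the points from these answers together, a corrected version might look like the sketch below. It assumes the buffer being shaved is the one passed in, and it takes a const char* parameter, which is a small change from the question's signature.

```cpp
#include <cstddef>
#include <cstring>

// Copies the used part of a larger buffer into a right-sized heap array.
// The caller owns the result and must release it with delete[].
char* shavecstr(const char* cstr)
{
    std::size_t len = std::strlen(cstr);
    char* ncstr = new char[len + 1];   // +1 for the terminating '\0'
    std::strcpy(ncstr, cstr);
    return ncstr;
}
```

The caller keeps the returned pointer for as long as it needs the copy and then passes that same pointer to delete[].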
Will re-newing a TCHAR* array have a negative or undefined effect? Or is it perhaps just not recommended? The code below has been working fine so far. I need input. Thanks!
//e.g.
TCHAR *tc1 = new TCHAR[1];
// later:
//resize TCHARs
tc1 = new TCHAR[size1];
tc1[size1] = { L'\0' };
This is a memory leak. You need to delete anything created by a new call. If you do that, everything is fine:
//e.g.
TCHAR *tc1 = new TCHAR[1];
// later:
//resize TCHARs
delete [] tc1;
tc1 = new TCHAR[size1];
tc1[size1] = { L'\0' };
On an unrelated note, your last line writes past the end of the array you allocated. That's not fine. But it has nothing to do with your allocation of memory; it's a mistake on its own.
A lot of this can be avoided if you use a string class: either std::string or, if you are using MFC, CString.
The negative effect of the "re-newing" is that you lose the pointer to the free-store memory originally allocated. It will remain occupied throughout the rest of your program, without any chance to reclaim it.
Of course, you may have some other pointer pointing to the memory, but that would be a very strange and unnecessarily complex piece of code.
Avoid all those problems by using std::vector instead of new[].
tc1 = new TCHAR[size1];
tc1[size1] = { L'\0' };
In addition to the memory leak, this is undefined behaviour because size1 is one past the last valid index.
Here's a std::vector example:
std::vector<TCHAR> tc1(1);
// later:
//resize TCHARs
tc1.resize(size1);
tc1[size1 - 1] = L'\0';
Perhaps even std::string or std::wstring is sufficient for your needs.
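As a rough illustration of the std::wstring route, here is a minimal sketch. It assumes a wide-character build (so TCHAR would be wchar_t), and make_buffer is a hypothetical helper name used only for this example.

```cpp
#include <cstddef>
#include <string>

// Starts with a one-character buffer and later grows it to size1 characters.
// The string releases the old storage itself, so nothing leaks on resize.
std::wstring make_buffer(std::size_t size1)
{
    std::wstring tc1(1, L'\0');   // plays the role of new TCHAR[1]
    tc1.resize(size1, L'\0');     // plays the role of the "re-new"
    return tc1;
}
```

The valid indices afterwards are 0 to size1 - 1, the same range as for the vector version.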
My question arises from one of my C++ exercises (from Programming Abstractions in C++, 2012 version, Exercise 12.2). Here it is:
void strcpy(char *dst, char *src) {
while (*dst++ = *src++);
}
The definition of strcpy is dangerous. The danger stems from the fact
that strcpy fails to check that there is sufficient space in the
character array that receives the copy, thereby increasing the chance
of a buffer-overflow error. It is possible, however, to eliminate much
of the danger by using dynamic allocation to create memory space for
the copied string. Write a function
char *copyCString(char *str);
that allocates enough memory for the C-style string str and then
copies the characters—along with the terminating null character—into
the newly allocated memory.
Here's my question:
Is this new method really safe? Why is it safe?
I mean, to be a little bit radical, what if there isn't enough space on the heap?
Is the new operator able to check for space availability and fail gracefully if there isn't enough space?
Will that cause some other kind of "something-overflow"?
If new fails to allocate the requested memory, it's supposed to throw a std::bad_alloc exception (but see below for more). After that, the stack will be unwound to the matching exception handler, and it'll be up to your code to figure out what to do from there.
If you really want/need to assure against an exception being thrown, there is a nothrow version of new you can use that will return a null pointer to signal failure--but this is included almost exclusively for C compatibility, and not frequently used (or useful).
For the type of situation cited in the question, you normally want to use std::string instead of messing with allocating space yourself at all.
Also note that on many modern systems, the notion of new either throwing or returning a null pointer in case of failure is really fairly foreign. In reality, Windows will normally attempt to expand the paging file to meet your request. Linux has an "OOM killer" that will attempt to find "bad" processes and kill them to free up memory if you run out.
As such, even though the C++ standard (and the C standard) prescribe what should happen if allocation fails, that's rarely what happens in real life.
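For reference, the two failure modes described in this answer look roughly like this in code. This is only a sketch; allocate_or_null and allocate_nothrow are hypothetical helper names, and as noted above the failure path may rarely be reached on real systems.

```cpp
#include <cstddef>
#include <new>

// Plain new[]: failure is reported by throwing std::bad_alloc.
// Here the exception is caught and turned into a null pointer.
char* allocate_or_null(std::size_t n)
{
    try {
        return new char[n];
    } catch (const std::bad_alloc&) {
        return nullptr;
    }
}

// nothrow new[]: failure is reported by returning a null pointer directly.
char* allocate_nothrow(std::size_t n)
{
    return new (std::nothrow) char[n];
}
```

Either way, memory obtained successfully is still released with delete[].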
The new operator will throw a bad_alloc exception if it cannot allocate memory, unless nothrow is specified. If you pass the nothrow constant, you will get a NULL pointer back if it cannot allocate memory.
The code for strcpy is unsafe because it will try copying outside of the allocated memory for the dst pointer. Example:
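#include <cstring>
#include <iostream>
using namespace std;

int main()
{
    const char* s1 = "hello"; // the literal occupies 6 characters, including '\0'
    char* s2 = new char[ 2 ]; // allocated space for 2 characters.
    strcpy( s2, s1 ); // deliberately overflows s2 to show the problem
    cout << s2 << endl;
    char c; cin >> c;
    return 0;
}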
This may well print "hello", but remember that s2 was allocated with space for only 2 characters. The remaining characters were written past the end of the allocation, which is undefined behaviour: we could be overwriting other data or accessing invalid memory.
Consider this solution:
char* e4_strdup( const char* c ) // taken by value so literals and arrays can be passed
{
    // holds the number of characters required for the c-string
    unsigned int sz{ 0 };
    // since c-style strings are terminated by the '\0' character,
    // increase the required space until we've found a '\0' character.
    for ( const char* p_to_c = c; *p_to_c != '\0'; ++p_to_c )
        ++sz;
    // allocate correct amount of space for copy.
    // we do ++sz during allocation because we must provide enough space for the '\0' character.
    char* c_copy{ new char[ ++sz ] }; // extra space for '\0' character.
    for ( unsigned int i{ 0 }; i < sz; ++i )
        c_copy[ i ] = c[ i ]; // copy every character, including '\0', into the allocated memory
    return c_copy;
}
The new operator will still throw a std::bad_alloc exception if you run out of memory.
I'm kind of new when it comes to memory management in C++. I read that if you create an object of a class with the new keyword, you must delete the object to free its memory. I also read that primitive types, such as int, char and bool, are created on the stack, which means they get deleted when they go out of scope.
But what about primitive types created with the new keyword? Do I need to explicitly call delete? Are these created on the heap like classes? Or since they are primitive do they still get created on the stack?
I am asking because I am allocating a LPTSTR using the new keyword, but I am worried that if I do not call delete that the memory will never be freed. Here is my code with the bare question in the comments:
#include <Windows.h>
#include <tchar.h>
#include <string>
#ifdef _UNICODE
typedef std::wstring Str;
#else // ANSI
typedef std::string Str;
#endif
Str GetWndStr(HWND hwnd) {
    const int length = GetWindowTextLength(hwnd);
    if (length != 0) {
        LPTSTR buffer = new TCHAR[length + 1]; // Allocation of string
        GetWindowText(hwnd, buffer, length + 1);
        Str text(buffer);
        delete buffer; // <--- Is this line necessary?
        return text;
    } else {
        return _T("");
    }
}
Do I need to call delete? A while back, I tried using GlobalAlloc() and GlobalFree(), but at runtime I got an error saying something about illegally modifying the stack. I do not have the exact error message, as this was a while ago. Also, in addition to your answer, if you would like to point me to resources you found helpful for learning more about C++ memory management, that would be nice.
For every new there must be a delete and for every new[] there must be a delete[]. Notice that the memory allocated with new[] must be deleted with delete[], which is not the case in the posted code.
A smart pointer can be used, boost::scoped_array for example, which will perform the delete[] in its destructor (or reset() function). This is particularly useful if exceptions can be thrown after the call to new[]:
boost::scoped_array<TCHAR> buffer(new TCHAR[length + 1]);
GetWindowText(hwnd, buffer.get(), length + 1);
Str text(buffer.get()); // buffer will be deleted at end of scope.
Your array is allocated with new[] and therefore must be deleted with delete[] (not delete).
Your explicit dynamic allocation is also unnecessary:
Str text(length+1, 0);
GetWindowText(hwnd, &text[0], length + 1);
text.resize(length); // remove NUL terminator
return text;
In C++03 there's some justification needed, whether string and wstring actually allocate contiguous memory, suitable for passing as a buffer. It's not guaranteed by the C++03 standard, but it is in fact true in MSVC++. If you don't want to rely on that fact, then it is guaranteed for vector, so you can use that for the buffer:
std::vector<TCHAR> buffer(length+1);
GetWindowText(hwnd, &buffer[0], length + 1);
return Str(buffer.begin(), buffer.end() - 1);
It is pretty rare to directly use new[] in C++. In both cases, my vector or string buffer is an automatic variable, so I don't have to do anything special to make sure that it is destroyed correctly when its scope ends.
YES (unless you use smart pointers or similar to delete it for you)
Yes, the rule is very simple. Everything you allocate with new needs to be deallocated with delete; and everything allocated with new[] needs to be deallocated with delete[].
To reduce the chances of error, it's best to use containers, smart pointers or other RAII-style objects to manage dynamic resources, rather than remembering to use delete yourself.
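As one possible illustration of the RAII advice, the sketch below uses std::unique_ptr<char[]> (C++11), which calls delete[] automatically. fill_buffer is a hypothetical stand-in for an API such as GetWindowText that writes into a caller-supplied buffer, so the example does not depend on Windows headers.

```cpp
#include <cstddef>
#include <cstring>
#include <memory>
#include <string>

// Hypothetical stand-in for an API that fills a caller-supplied buffer.
void fill_buffer(char* out, std::size_t capacity)
{
    std::strncpy(out, "window title", capacity - 1);
    out[capacity - 1] = '\0';
}

std::string read_text(std::size_t length)
{
    // The array form of unique_ptr releases the memory with delete[],
    // even if an exception is thrown before the function returns.
    std::unique_ptr<char[]> buffer(new char[length + 1]);
    fill_buffer(buffer.get(), length + 1);
    return std::string(buffer.get());
}
```

No explicit delete[] appears anywhere, which removes the delete versus delete[] mistake from the picture.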
Yes, no matter what type was allocated. It still occupies memory.
I'm new to pointers in C++. I'm not sure why I need pointers like char * something[20] as opposed to just char something[20][100]. I realize that the second method would mean that a 100-byte block of memory will be allocated for each element in the array, but wouldn't the first method introduce memory leak issues?
If someone could explain to me how char * something[20] allocates memory, that would be great.
Edit:
My C++ Primer Plus book is doing:
const char * cities[5] = {
    "City 1",
    "City 2",
    "City 3",
    "City 4",
    "City 5"
};
Isn't this the opposite of what people just said?
You allocate 20 pointers in the memory, then you will need to go through each and every one of them to allocate memory dynamically:
something[0] = new char[100];
something[1] = new char[20]; // they can differ in size
And delete them all separately:
delete [] something[0];
delete [] something[1];
EDIT:
const char* text[] = {"These", "are", "string", "literals"};
Strings specified directly in the source code ("string literals", which have type const char[N] and decay to const char *) are quite different from dynamically allocated char * buffers, mainly because you don't have to worry about allocating or deallocating them. They are also generally handled very differently in memory, but this depends on your compiler's implementation.
You're right.
You'd need to go through each element of that array and allocate a character buffer for each one.
Then, later, you'd need to go through each element of that array and free the memory again.
Why you would want to faff about with this in C++ is anyone's guess.
What's wrong with std::vector<std::string> myStrings(20)?
It will allocate space for twenty char-pointers.
They will not be initialized, so typical usage looks like
char * something[20];
for (int i=0; i<20; i++)
something[i] = strdup("something of a content");
and later
for (int i=0; i<20; i++)
if (something[i])
free(something[i]);
You're right - the first method may introduce memory leak issues and the overhead of doing dynamic allocations, plus more reads. I think the second method is usually preferable, unless it wastes too much RAM or you may need the strings to grow longer than 99 chars.
How the first method works:
char* something[20]; // Stores 20 pointers.
something[0] = static_cast<char*>(malloc(100)); // Make something[0] point to a new buffer of 100 bytes (the cast is required in C++).
sprintf(something[0], "hai"); // Make the new buffer contain "hai", going through the pointer in something[0]
free(something[0]); // Release the buffer.
char* smth[20] does not allocate any memory on the heap. It allocates just enough space on the stack to store 20 pointers. The values of those pointers are indeterminate, so before using them you have to initialize them, like this:
char* smth[20];
smth[0] = new char[100]; // allocate memory for 100 chars, store the address of the first one in smth[0]
//..some code..
delete[] smth[0];
First of all, this is almost inapplicable in C++. The normal equivalent in C++ would be something like: std::vector<std::string> something;
In C, the primary difference is that you can allocate each string separately from the others. With char something[M][N], you always allocate exactly the same number of strings, and the same space for each string. This will frequently waste space (when the strings are shorter than you've made space for), and won't allow you to deal with more strings or longer strings than you've made space for initially.
char *something[20] lets you deal with longer/shorter strings more efficiently, but still only makes space for 20 strings.
The next step (if you're feeling adventurous) is to use something like:
char **something;
and allocate the strings individually, and allocate space for the pointers dynamically as well, so if you get more than 20 strings you can deal with that as well.
I'll repeat, however, that for most practical purposes, this is restricted to C. In C++, the standard library already has data structures for situations like these.
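A minimal sketch of that standard-library equivalent is shown below. make_strings is a hypothetical helper used only to show that neither the number of strings nor their lengths has to be fixed in advance.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Builds as many strings as requested; each one manages its own storage,
// and the vector grows as needed, so there is nothing to free by hand.
std::vector<std::string> make_strings(std::size_t count)
{
    std::vector<std::string> something;
    for (std::size_t i = 0; i < count; ++i)
        something.push_back("item " + std::to_string(i));
    return something;
}
```

Asking for 25 strings works just as well as asking for 20, which is the case the fixed-size arrays cannot handle.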
C++ has pointers because C has pointers.
Why do we use pointers?
To track dynamically-allocated memory. The memory allocation functions in C (malloc, calloc, realloc) and the new operator in C++ all return pointer values.
To mimic pass-by-reference semantics (C only). In C, all function arguments are passed by value; the formal parameter and the actual parameter are distinct objects, and modifying a formal parameter doesn't affect the actual parameter. We get around this by passing pointers to the function. C++ introduced reference types, which serve the same purpose, but are a bit cleaner and safer than using pointers.
To build dynamic, self-referential data structures. A struct cannot contain an instance of itself, but it can contain a pointer to an instance. For example, the following code
struct node
{
    data_t data;
    struct node *next;
};
creates a data type for a simple linked-list node; the next member explicitly points to the next element in the list. Note that in C++, the STL containers for stacks and queues and vectors all use pointers under the hood, isolating you from the bookkeeping.
There are literally dozens of other places where pointers come up, but those are the main reasons you use them.
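To make the self-referential structure above a little more concrete, here is a small sketch of how such nodes might be linked and released. It assumes data_t is int, and push_front, length and destroy are hypothetical helper names.

```cpp
#include <cstddef>

struct node
{
    int data;      // data_t is assumed to be int for this sketch
    node* next;    // points to the next element, or is null at the end
};

// Allocates a new node and makes it the new head of the list.
node* push_front(node* head, int value)
{
    node* n = new node;
    n->data = value;
    n->next = head;
    return n;
}

// Counts the nodes by following the next pointers.
std::size_t length(const node* head)
{
    std::size_t count = 0;
    for (; head != nullptr; head = head->next)
        ++count;
    return count;
}

// Every node came from new, so every node is released with delete.
void destroy(node* head)
{
    while (head != nullptr) {
        node* next = head->next;
        delete head;
        head = next;
    }
}
```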
Your array of pointers could be used to store strings of varying length by allocating just enough memory for each, rather than relying on some maximum size (which will eventually be exceeded, leading to a buffer overflow error, and in any case will lead to internal memory fragmentation). Naturally, in C++ you'd use the string data type (which hides all the pointer and memory management behind the class API) instead of pointers to char, but someone has decided to confuse you by starting with low-level details instead of the big picture.
I'm not sure why I need pointers like char * something[20] as oppose to just char something[20][100]. I realize that the second method would mean that 100 block of memory will be allocated for each element in the array, but wouldn't the first method introduce memory leak issues.
The second method will suffice if you're only referencing your buffer(s) locally.
The problem comes when you pass the array name to another function. When you pass char something[10] to another function, you're actually passing char* something because the array length doesn't go along for the ride.
For multidimensional arrays, you can declare a function that takes an array of determinate length in all but the first dimension, e.g. foo(char something[][100]).
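A short sketch of passing a two-dimensional array this way follows. total_length is a hypothetical function; the row count has to be passed separately because only the inner dimension is part of the parameter type.

```cpp
#include <cstddef>
#include <cstring>

// rows is really a pointer to arrays of 100 chars; the number of rows
// is not part of the type, so the caller supplies it as count.
std::size_t total_length(char rows[][100], std::size_t count)
{
    std::size_t total = 0;
    for (std::size_t i = 0; i < count; ++i)
        total += std::strlen(rows[i]);
    return total;
}
```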
So why use the first form rather than the second? I can think of a few reasons:
You don't want to have the restriction that the entire buffer must reside in contiguous memory.
You don't know at compile-time that you'll need each buffer, or that the length of each buffer will need to be the same size, and you want the flexibility to determine that at run-time.
This is an array declaration.
char * something[20]
Assuming this is 32Bit, this allocates 80 bytes of data on the stack.
4 Bytes for each pointer address, 20 pointers total = 4 x 20 = 80 bytes.
The pointers are all uninitialized, so you need to write additional code to allocate/free
the buffers for doing this.
It roughly looks like:
[0] [4 Bytes of Uninitialized data to hold a pointer/memory address...]
[1] [4 Bytes of ... ]
...
[19]
char something[20][100]
Allocates 2000 bytes on the stack.
100 Bytes for each something, 20 somethings total = 100 x 20 = 2000 bytes.
[0] [100 bytes to hold characters]
[1] [100 bytes to hold characters]
...
[19]
The char * approach has a smaller memory overhead, but you have to manage the memory.
The char[][] approach has a bigger memory overhead, but you don't have additional memory management.
With either approach, you have to be careful when writing to the buffer allocated not to exceed/overwrite the memory alloc'd for it.
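The size arithmetic above can be checked with sizeof. The sketch below uses two hypothetical helper functions; the pointer size is 4 bytes on the 32-bit system assumed in this answer and typically 8 on a 64-bit one.

```cpp
#include <cstddef>

// Storage taken by an array of 20 pointers: 20 * sizeof(char*).
std::size_t pointer_array_bytes()
{
    return sizeof(char*[20]);
}

// Storage taken by 20 rows of 100 chars each: always 2000 bytes.
std::size_t block_array_bytes()
{
    return sizeof(char[20][100]);
}
```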
For educational purposes, I am using cstrings in some test programs. I would like to shorten strings with a placeholder such as "...".
That is, "Quite a long string" will become "Quite a lo..." if my maximum length is set to 13. Further, I do not want to destroy the original string - the shortened string therefore has to be a copy.
The (static) method below is what I came up with. My question is: Should the class allocating memory for my shortened string also be responsible for freeing it?
What I do now is to store the returned string in a separate "user class" and defer freeing the memory to that user class.
const char* TextHelper::shortenWithPlaceholder(const char* text, size_t newSize) {
    char* shortened = new char[newSize+1];
    if (newSize <= 3) {
        strncpy_s(shortened, newSize+1, ".", newSize);
    }
    else {
        strncpy_s(shortened, newSize+1, text, newSize-3);
        strncat_s(shortened, newSize+1, "...", 3);
    }
    return shortened;
}
The standard approach of functions like this is to have the user pass in a char[] buffer. You see this in functions like sprintf(), for example, which take a destination buffer as a parameter. This allows the caller to be responsible for both allocating and freeing the memory, keeping the whole memory management issue in a single place.
In order to avoid buffer overflows and memory leaks, you should always use C++ classes such as std::string in this case.
Only the very last instance should convert the class into something low level such as char*. This will make your code simple and safe. Just change your code to:
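std::string TextHelper::shortenWithPlaceholder(const std::string& text,
                                               size_t newSize) {
    if (text.size() <= newSize) return text;            // already short enough
    if (newSize <= 3) return std::string(newSize, '.'); // avoids newSize-3 wrapping around
    return text.substr(0, newSize-3) + "...";
}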
When using that function in a C context, you simply use the c_str() method:
some_c_function(shortenWithPlaceholder("abcde", 4).c_str());
That's all!
In general, you should not program in C++ the same way you program in C. It's more appropriate to treat C++ as a really different language.
I've never been happy returning pointers to locally allocated memory. I like to keep a healthy mistrust of anyone calling my function in regard to clean up.
Instead, have you considered accepting a buffer into which you'd copy the shortened string?
eg.
const char* TextHelper::shortenWithPlaceholder(const char* text,
size_t textSize,
char* short_text,
size_t shortSize)
where short_text = buffer to copy the shortened string into, and shortSize = size of the buffer supplied. You could also continue to return a const char* pointing to short_text as a convenience to the caller (return NULL if shortSize isn't large enough).
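One way the body of such a function might look is sketched below, written as a free function for brevity. It assumes textSize is the length of text without its terminator and shortSize is the full buffer size including room for the terminator; those conventions are assumptions of this sketch rather than something stated in the answer.

```cpp
#include <cstddef>
#include <cstring>

const char* shortenWithPlaceholder(const char* text, std::size_t textSize,
                                   char* short_text, std::size_t shortSize)
{
    if (shortSize < 4)
        return 0;                               // no room for "..." plus '\0'
    const std::size_t maxChars = shortSize - 1; // visible characters that fit
    if (textSize <= maxChars) {
        std::memcpy(short_text, text, textSize); // fits as-is, copy everything
        short_text[textSize] = '\0';
    } else {
        const std::size_t keep = maxChars - 3;   // leave room for the placeholder
        std::memcpy(short_text, text, keep);
        std::memcpy(short_text + keep, "...", 4); // copies the '\0' as well
    }
    return short_text;
}
```

The caller decides where the buffer lives, for example a local char array, so there is nothing for either side to delete.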
Really you should just use std::string, but if you must, look to the existing library for usage guidance.
In the C standard library, the function that is closest to what you are doing is
char * strncpy ( char * destination, const char * source, size_t num );
So I'd go with this:
const char* TextHelper::shortenWithPlaceholder(
char * destination,
const char * source,
size_t newSize);
The caller is responsible for memory management - this allows the caller to use the stack, or a heap, or a memory mapped file, or whatever source to hold that data. You don't need to document that you used new[] to allocate the memory, and the caller doesn't need to know to use delete[] as opposed to free or delete, or even a lower-level operating system call. Leaving the memory management to the caller is just more flexible, and less error prone.
Returning a pointer to the destination is just a nicety to allow you to do things like this:
char buffer[13];
printf("%s", TextHelper::shortenWithPlaceholder(buffer, source, 12));
The most flexible approach is to return a helper object that wraps the allocated memory, so that the caller doesn't have to worry about it. The class stores a pointer to the memory, and has a copy constructor, an assignment operator and a destructor.
class string_wrapper
{
    char *p;
public:
    string_wrapper(char *_p) : p(_p) { }
    ~string_wrapper() { delete[] p; }
    const char *c_str() { return p; }
    // also copy ctor, assignment
};
// function declaration
string_wrapper TextHelper::shortenWithPlaceholder(const char* text, size_t newSize)
{
// allocate string buffer 'p' somehow...
return string_wrapper(p);
}
// caller
string_wrapper shortened = TextHelper::shortenWithPlaceholder("Something too long", 5);
std::cout << shortened.c_str();
Most real programs use std::string for this purpose.
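The copy constructor and assignment operator that the wrapper above leaves out could be filled in roughly as follows. This is only a sketch that assumes deep-copy semantics; the clone helper is hypothetical, and other ownership policies (such as reference counting) would work too.

```cpp
#include <cstddef>
#include <cstring>

class string_wrapper
{
    char* p;

    // Hypothetical helper: makes an independent heap copy of a C string.
    static char* clone(const char* s)
    {
        if (s == nullptr)
            return nullptr;
        char* c = new char[std::strlen(s) + 1];
        std::strcpy(c, s);
        return c;
    }

public:
    // Takes ownership of a buffer that was allocated with new[].
    explicit string_wrapper(char* owned) : p(owned) { }

    string_wrapper(const string_wrapper& other) : p(clone(other.p)) { }

    string_wrapper& operator=(const string_wrapper& other)
    {
        if (this != &other) {
            char* c = clone(other.p); // copy first, so a failed allocation leaves *this intact
            delete[] p;
            p = c;
        }
        return *this;
    }

    ~string_wrapper() { delete[] p; }

    const char* c_str() const { return p; }
};
```

Without these two members, copying the wrapper would leave two objects deleting the same buffer.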
In your example the caller has no choice but to be responsible for freeing the allocated memory.
This, however, is an error prone idiom to use and I don't recommend using it.
One alternative that allows you to use pretty much the same code is to change shortened to a reference-counted pointer and have the method return that reference-counted pointer instead of a bare pointer.
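With the standard library, a reference-counted pointer to a character array could be set up as in the sketch below. It assumes C++11's std::shared_ptr with an array deleter; copy_shared is a hypothetical helper that just copies its input, standing in for the shortening logic.

```cpp
#include <cstddef>
#include <cstring>
#include <memory>

// Returns a shared, reference-counted copy of text. The array deleter
// makes sure delete[] is used when the last owner goes away.
std::shared_ptr<char> copy_shared(const char* text)
{
    const std::size_t n = std::strlen(text) + 1;   // include the '\0'
    std::shared_ptr<char> p(new char[n], std::default_delete<char[]>());
    std::memcpy(p.get(), text, n);
    return p;
}
```

The caller can copy the returned pointer freely, and the memory is released once the last copy is destroyed.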
There are two basic ways that I consider equally common:
a) TextHelper returns the c string and forgets about it. The user has to delete the memory.
b) TextHelper maintains a list of allocated strings and deallocates them when it is destroyed.
Now it depends on your usage pattern. b) seems risky to me: If TextHelper has to deallocate the strings, it should not do so before the user is done working with the shortened string. You probably won't know when this point comes, so you keep the TextHelper alive until the program terminates. This results in a memory usage pattern equal to a memory leak. I'd recommend b) only if the strings belong semantically to the class that provides them, similar to the std::string::c_str(). Your TextHelper looks more like a toolbox that should not be associated with the processed strings, so if I had to choose between the two, I'd go for a). Your user class is probably the best solution, given a fixed TextHelper interface.
Edit: No, I'm wrong. I misunderstood what you were trying to do. The caller must delete the memory in your instance.
The C++ standard states that deleting 0/NULL does nothing (in other words, this is safe to do), so you can delete it regardless of whether you ever called the function at all. Edit: I don't know how this got left out...your other alternative is placement delete. In that case, even if it is bad form, you should also use placement new to keep the allocation/deallocation in the same place (otherwise the inconsistency would make debugging ridiculous).
That said, how are you using the code? I don't see when you would ever call it more than once, but if you do, there are potential memory leaks (I think) if you don't remember each different block of memory.
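I would just use a smart pointer that understands arrays, such as boost::shared_array (or std::unique_ptr<char[]> in C++11 and later). It deletes the memory on the way out and can be used with a char* allocated by new[]. Note that std::auto_ptr and a plain boost::shared_ptr call delete rather than delete[], so they are not suitable for memory allocated with new[].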
Another thing you can do, considering on how TextHelper is allocated. Here is a theoretical ctor:
TextHelper(const char* input) : input_(input), copy(0) {
    copy = new char[strlen(input) + 1]; // room for the text and its '\0'; mess with later
}
~TextHelper() { delete[] copy; } // memory from new[] must be released with delete[]