I have an issue where I need a vector of const char* but for some reason whenever I try adding something nothing happens. Here is the code sample in question.
std::vector<const char*> getArgsFromFile(char* arg) {
std::ifstream argsFile(arg);
std::vector<const char*> args;
while (!argsFile.eof()) {
std::string temp;
argsFile >> temp;
args.push_back(temp.c_str());
}
args.pop_back();
return args;
}
The strange part is if I make this change
std::vector<const char*> getArgsFromFile(char* arg) {
std::ifstream argsFile(arg);
std::vector<const char*> args;
while (!argsFile.eof()) {
std::string temp;
argsFile >> temp;
const char* x = "x";
args.push_back(x);
}
args.pop_back();
return args;
}
It will add 'x' to the vector but I can't get the value of temp into the vector. Any thoughts? Help would be greatly appreciated. Thanks!
A const char* is not a string, but merely a pointer to some memory, usually holding some characters. Now std::string under the hood either holds a small region of memory (like char buff[32]) or, for larger strings, keeps a pointer to memory allocated on the heap. In either case, a pointer to the actual memory holding the data can be obtained via string::c_str(). But when the string goes out of scope that pointer no longer points to secured data and becomes dangling.
This is the reason why C++ introduced methods to avoid direct exposure and usage of raw pointers. Good C++ code avoids raw pointers like the plague. Your homework asks for poor C++ code (hopefully only so that you learn the problems that come with such raw pointers).
So, in order for the pointers held in your vector to persistently point to some characters (and not become dangling), they must point to persistent memory. The only guaranteed way to achieve that is to dynamically allocate the memory:
while (!argsFile.eof()) {
std::string temp;
argsFile >> temp;
char* buff = new char[temp.size()+1]; // allocate memory
std::strncpy(buff,temp.c_str(),temp.size()+1); // copy data to memory
args.push_back(buff); // store pointer in vector
}
but then the memory allocated in this way will be leaked, unless you de-allocate it as in
while(!args.empty()) {
delete[] args.back();
args.pop_back();
}
Note that this is extremely bad C++ code and not exception safe (if an exception occurs between allocation and de-allocation, the allocated memory is leaked). In C++ one would instead use std::vector<std::string> or perhaps std::vector<std::unique_ptr<const char[]>> (if you cannot use std::string), both being exception safe.
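If some C API still insists on const char*, one option is to keep the characters in a std::vector<std::string> and build the pointer vector from it only when needed. A minimal sketch, assuming whitespace-separated tokens; readArgs and asCStrings are names made up for this illustration:

```cpp
#include <istream>
#include <sstream>   // only needed for the std::istringstream usage shown below
#include <string>
#include <vector>

// Reads whitespace-separated tokens; the returned vector owns the characters.
std::vector<std::string> readArgs(std::istream& in) {
    std::vector<std::string> args;
    std::string token;
    while (in >> token) {          // stops cleanly at end of input
        args.push_back(token);
    }
    return args;
}

// Builds non-owning views for C APIs. The pointers stay valid only while
// 'owned' is alive and its strings are not modified.
std::vector<const char*> asCStrings(const std::vector<std::string>& owned) {
    std::vector<const char*> views;
    views.reserve(owned.size());
    for (const std::string& s : owned) {
        views.push_back(s.c_str());
    }
    return views;
}
```

To try it without a file, pass a std::istringstream to readArgs. The important part is that the std::vector<std::string> must outlive the pointer vector.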
Use a standard-library-based implementation
Guideline SL.1 of the C++ coding guidelines says: "Use the standard library whenever possible" (and relevant). Why work so hard? People have already done most of the work for you...
So, using your function's declaration, you could just have:
std::vector<std::string> getArgsFromFile(char* arg) {
using namespace std;
ifstream argsFile(arg);
vector<string> args;
copy(istream_iterator<string>(argsFile),
istream_iterator<string>(),
back_inserter(args));
return args;
}
and Bob's your uncle.
Still, @Walter's answer is very useful to read, so that you realize what's wrong with your use of char * for strings.
Related
Suppose I do the following:
char *get_data(...) {
char *c_style = (char *) malloc(length * sizeof(char));
load_c_string_with_my_c_function(c_style, length, input);
return c_style;
}
int main() {
std::string data(get_data(...));
// free(data.c_str()); ?? -- are the malloc'd bytes now managed?
return 0;
}
Is there any way to release the memory that get_data() allocated? Would the commented free(data.c_str()); work?
Once you do
std::string data(get_data(...));
there is no way to get the pointer back that get_data() returned so you will leak that memory. To fix this just have get_data() return a std::string in the first place so you don't have to worry about memory management at all. That would give you
std::string get_data(...) {
std::string data(length, '\0');
load_c_string_with_my_c_function(data.data(), length, input); // requires C++17
// load_c_string_with_my_c_function(&data[0], length, input); // use this for pre-C++17 compilers
return data;
}
and now no memory leak. If you can't do this then you need to capture the pointer, use it to initialize the string, and then free the pointer like
char* ret = get_data(...);
std::string data(ret);
free(ret);
Is there any way to release the memory that get_data() allocated?
No, you lost your pointer to it.
Would the commented free(data.c_str()); work?
No. The std::string copied your data into a new string that it owns and manages. You cannot legally de-allocate the std::string's own allocated memory, and it would not solve the problem of having to de-allocate your original allocated memory.
Either use std::string throughout (preferred!) or capture the pointer inside main first:
int main()
{
auto cstr = get_data(...);
std::string data(cstr);
free(cstr);
}
Trouble is, this isn't exception-safe, which is why we have nice things like std::string in the first place. It can be solved with some liberal try and catch but it'll be ugly.
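One way to get exception safety without try and catch is to hand the malloc'd pointer to a std::unique_ptr whose deleter calls free. This is only a sketch: get_data_stub is a hypothetical stand-in for get_data(...), and FreeDeleter, MallocString and load are names invented here.

```cpp
#include <cstdlib>
#include <cstring>
#include <memory>
#include <string>

// Deleter that releases malloc'd memory with free().
struct FreeDeleter {
    void operator()(char* p) const noexcept { std::free(p); }
};
using MallocString = std::unique_ptr<char, FreeDeleter>;

// Hypothetical stand-in for get_data(...): returns a malloc'd,
// NUL-terminated buffer that the caller must free.
char* get_data_stub() {
    const char text[] = "payload";
    char* p = static_cast<char*>(std::malloc(sizeof text));
    if (p) std::memcpy(p, text, sizeof text);
    return p;
}

std::string load() {
    MallocString raw(get_data_stub());   // owned from here on
    if (!raw) return std::string();      // allocation failed
    return std::string(raw.get());       // may throw; raw is freed either way
}
```

If the std::string constructor throws, the unique_ptr destructor still runs during stack unwinding, so the buffer is not leaked.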
Also, since you already have get_data, presumably for a reason, you might consider a string_view over the actual memory that you already allocated, unless you really need data to be an owning copy.
int main()
{
auto cstr = get_data(...);
std::string_view data(cstr); // no copy! just features
// ... use data here; it dangles once cstr is freed ...
free(cstr);
}
(Comments elsewhere indicate that this may be what you really wanted.)
Now experiment with having get_data return something with clear ownership & lifetime semantics (std::unique_ptr? std::vector? std::string? .. lol) and you're golden.
You can't call free(data.c_str()); or anything like that. std::string manages its own memory and even the pointer that you get from c_str() is invalidated automatically as soon as the std::string goes out of scope.
You do, however, need to free the c_style returned from the function call. The std::string may handle its own memory, but it is only a copy of the malloc'd memory that isn't managed.
Do this:
int main() {
char *result = get_data(...);
std::string data(result);
// free(data.c_str()); ?? -- are the malloc'd bytes now managed?
free(result);
return 0;
}
You can't free the memory because you aren't keeping the pointer to it. You need to do:
int main() {
char* cstr = get_data(...);
std::string data(cstr);
free(cstr);
return 0;
}
Or much better write your data into the string directly:
std::string get_data(...) {
std::string data(length, '\0');
load_c_string_with_my_c_function(&data[0], data.size(), input);
return data;
}
You may need to add 1 to the length of the std::string and/or pass data.size()-1 to load_c_string_with_my_c_function depending on the specification of the length parameter of load_c_string_with_my_c_function.
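As an illustration of the "add 1" case, here is a sketch that assumes the C function takes the total buffer capacity and always writes a terminating NUL. fill_c_string and get_data_sketch are hypothetical names; the real load_c_string_with_my_c_function may follow a different contract.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical C-style loader: writes at most cap-1 characters plus a NUL.
void fill_c_string(char* out, std::size_t cap) {
    const char src[] = "hello";
    std::strncpy(out, src, cap - 1);
    out[cap - 1] = '\0';
}

std::string get_data_sketch(std::size_t max_len) {
    std::string data(max_len + 1, '\0');        // one extra slot for the loader's NUL
    fill_c_string(&data[0], data.size());
    data.resize(std::strlen(data.c_str()));     // drop the NUL and any unused tail
    return data;
}
```

The final resize matters: without it the std::string would report the full buffer size and carry embedded NUL characters.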
How do I do this thing.
char* ToString(int num) {
char* str = new char[len(num)];
//conversion
return str;
}
And by calling this.
string someStr = ToString(someInt);
Should I free someStr here?
I know I always need to delete whenever I use new.
And what if I call this function multiple times: do I allocate memory each time and just leave it behind unused?
You should avoid this practice altogether. Either return a std::unique_ptr, or deal with std::string directly. It is not clear from your code what exactly you are trying to do, so I can't offer specific solutions.
Note that this initialization:
string someStr = ToString(someInt);
will only work properly if you return a null-terminated string, but it leaks resources regardless.
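For the std::unique_ptr route, something along these lines would work. It is a sketch that assumes a plain decimal conversion is all ToString needs to do; the helper name example is made up.

```cpp
#include <cstddef>
#include <cstdio>
#include <memory>
#include <string>

// Returns an owning, NUL-terminated buffer holding the decimal text of num.
std::unique_ptr<char[]> ToString(int num) {
    // First call only measures: number of characters needed, excluding the NUL.
    const int len = std::snprintf(nullptr, 0, "%d", num);
    std::unique_ptr<char[]> buf(new char[static_cast<std::size_t>(len) + 1]);
    std::snprintf(buf.get(), static_cast<std::size_t>(len) + 1, "%d", num);
    return buf;
}

std::string example(int n) {
    std::unique_ptr<char[]> owned = ToString(n);
    return std::string(owned.get());   // copies; the buffer is released when owned dies
}
```

The caller never writes delete[], and calling ToString many times leaves nothing behind because each buffer is released when its unique_ptr goes out of scope.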
You need to call delete once for every call to ToString. You also can't initialise a std::string with an allocated char array in the way your question hints at - that'd leak the returned memory, with your someStr variable having copied it.
The easiest/neatest thing to do would be to change ToString to return std::string instead. In this case, memory used by the string will be automatically deleted when the caller's variable goes out of scope.
I ran your code under valgrind with --leak-check=full, and it reports a memory leak the size of the buffer allocated for num.
Calling new/delete and new[]/delete[] in matching pairs is the only way to get the memory back.
I am not sure what you are trying to do. If you want to convert integer types to strings, C++ has a few options:
// std::to_string(C++11) e.g:
{
std::string str = std::to_string(num);
}
// std::stringstream e.g:
{
std::string str;
std::stringstream ss;
ss << num;
ss >> str;
}
// boost::lexical_cast e.g:
{
std::string str = boost::lexical_cast<std::string>(num);
}
// itoa (C function, non-standard, not available on every toolchain) e.g:
{
char buf[MAX_INT_DIGITS]; // MAX_INT_DIGITS == 12 ("-2147483648\0")
itoa(num, buf, 10);
std::string str(buf);
}
Your ToString function should return a std::string if you then just assign the value to a std::string. There is no reason to deal with dynamically allocated memory here.
someStr is a copy, so you already have a leak. You need to temporarily save the returned pointer and delete it after constructing the string. This is normally the job of smart pointers.
EDIT:
No,
char* temp = str; delete [] str; return temp;
results in undefined behavior, because temp still points at the memory that was just deleted. But:
char* temp = ToString(someInt); string someStr(temp); delete [] temp;
will work. But this is only to help you understand the idea. It can be done for you if you return a unique_ptr. I'm assuming this is a general question about returning memory that has to be freed afterwards, in which case unique_ptr and shared_ptr are a solution. In this particular case you can simply create a string, modify it and return it; all the memory management will be done for you by the string class. If you really only need to allocate "space" in the string, you can do:
std::string str; str.reserve(len(num));
I have the situation shown below: I need to pass a C-style string into a function and store it in a container for later use. The container stores char*. I can't figure out an efficient way to allocate the memory and store it in the vector. In overloadedfunctionA(int), I have to allocate new memory and copy into a buffer, then pass it to overloadedfunctionA(char*), which again allocates new memory and copies into another buffer. Imagine I have a lot of items of type int and other types, and I am doing twice the work every time. One way to solve it is to copy the logic from overloadedfunctionA(char*) into overloadedfunctionA(int), but that would result in a lot of repetitive code. Any ideas on a more efficient way to do this?
Thanks.
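#include <cstdio>
#include <cstring>
#include <vector>
std::vector<char*> v1;
void overloadedfunctionA(const char* string1);   // const so that "abc" can be passed
void overloadedfunctionA(int intA){
    char* buffer = new char[12];                  // assumes a 32-bit int: 11 chars plus NUL
    std::snprintf(buffer, 12, "%d", intA);        // convert int to char in buffer
    overloadedfunctionA(buffer);                  // the text gets copied a second time here
    delete[] buffer;
}
//act as base function that has a lot of logic need to be performed
void overloadedfunctionA(const char* string1){
    char* buffer = new char[std::strlen(string1) + 1];
    std::strcpy(buffer, string1);                 // copy string to buffer
    v1.push_back(buffer);                         // insert string into vector
}
int main(){
    overloadedfunctionA(5);
    overloadedfunctionA("abc");
}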
For all that's holy, just use std::string internally. You can assign to a std::string from a const char*, and you can access the std::string's underlying C string through the c_str() method.
Otherwise, you'd need to write a wrapper class that handles memory allocation for you, and store instances of that class in your vector; but then, std::string already IS such a wrapper class.
Hopefully making the earlier solutions easier for you to follow...
#include <vector>
#include <string>
#include <sstream>
std::vector<std::string> the_vector;
// template here... use an overload if you prefer...
template <typename T>
void fn(const T& t)
{
std::ostringstream oss;
oss << t;
the_vector.push_back(oss.str());
}
int main()
{
fn(3);
fn("whatever");
}
One optimization you can do is to maintain a static hashmap of strings, so that you can use a single instance for duplicate values.
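A rough sketch of that idea, assuming the pooled strings may live until program exit and that only one thread calls it; intern is a name chosen for this example:

```cpp
#include <string>
#include <unordered_set>

// Returns a pointer to the single shared copy of s.
// Elements of std::unordered_set never move, so the pointer stays valid
// for as long as the pool exists (here: until the program ends).
const char* intern(const std::string& s) {
    static std::unordered_set<std::string> pool;
    return pool.insert(s).first->c_str();
}
```

Two calls with equal text return the same pointer, so a vector of const char* can store duplicates without allocating again. The trade-off is that nothing is ever released from the pool.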
Another solution is to use std::string and redeclare overloadedfunctionA(char*) as overloadedfunctionA(const std::string&).
Yet another option is to let a Garbage Collector do the memory management.
If your strings are long such that copying is likely to be a performance problem, you could consider boost::shared_array as an alternative to std::string. Provided the contents are invariant, this offers a solution where only one copy of the string will need to be held in memory. Note however that you would be trading reference counting overhead for string copying overhead here.
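Some older standard library implementations optimized std::string this way anyway (copy-on-write), so that a single copy is kept and ref-counted until the contents are changed, at which point a new array is allocated for the modified contents with its refcount decoupled from the original. The requirements introduced in C++11 effectively rule this scheme out, so don't count on it with current compilers.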
For educational purposes, I am using cstrings in some test programs. I would like to shorten strings with a placeholder such as "...".
That is, "Quite a long string" will become "Quite a lo..." if my maximum length is set to 13. Further, I do not want to destroy the original string - the shortened string therefore has to be a copy.
The (static) method below is what I came up with. My question is: Should the class allocating memory for my shortened string also be responsible for freeing it?
What I do now is to store the returned string in a separate "user class" and defer freeing the memory to that user class.
const char* TextHelper::shortenWithPlaceholder(const char* text, size_t newSize) {
char* shortened = new char[newSize+1];
if (newSize <= 3) {
strncpy_s(shortened, newSize+1, ".", newSize);
}
else {
strncpy_s(shortened, newSize+1, text, newSize-3);
strncat_s(shortened, newSize+1, "...", 3);
}
return shortened;
}
The standard approach of functions like this is to have the user pass in a char[] buffer. You see this in functions like sprintf(), for example, which take a destination buffer as a parameter. This allows the caller to be responsible for both allocating and freeing the memory, keeping the whole memory management issue in a single place.
In order to avoid buffer overflows and memory leaks, you should always use C++ classes such as std::string in this case.
Only the very last instance should convert the class into something low level such as char*. This will make your code simple and safe. Just change your code to:
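std::string TextHelper::shortenWithPlaceholder(const std::string& text,
                                               size_t newSize) {
    if (text.size() <= newSize) return text;            // already short enough
    if (newSize <= 3) return std::string(newSize, '.'); // no room for any text
    return text.substr(0, newSize - 3) + "...";
}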
When using that function in a C context, you simply use the c_str() method:
some_c_function(TextHelper::shortenWithPlaceholder("abcde", 4).c_str());
That's all!
In general, you should not program in C++ the same way you program in C. It's more appropriate to treat C++ as a really different language.
I've never been happy returning pointers to locally allocated memory. I like to keep a healthy mistrust of anyone calling my function in regard to clean up.
Instead, have you considered accepting a buffer into which you'd copy the shortened string?
eg.
const char* TextHelper::shortenWithPlaceholder(const char* text,
size_t textSize,
char* short_text,
size_t shortSize)
where short_text = buffer to copy shortened string, and shortSize = size of the buffer supplied. You could also continue to return a const char* pointing to short_text as a convenience to the caller (return NULL if shortSize isn't large enough).
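A possible body for that signature is sketched below. It is written as a free function for brevity, and it assumes textSize is the length of text without the NUL while shortSize is the full capacity of short_text including the NUL.

```cpp
#include <cstddef>
#include <cstring>

// Writes a shortened copy of text into short_text and returns short_text,
// or nullptr if the buffer cannot hold even the placeholder.
const char* shortenWithPlaceholder(const char* text, std::size_t textSize,
                                   char* short_text, std::size_t shortSize) {
    if (short_text == nullptr || shortSize < 4) return nullptr;   // need "..." plus NUL
    if (textSize < shortSize) {                   // whole text fits as-is
        std::memcpy(short_text, text, textSize);
        short_text[textSize] = '\0';
        return short_text;
    }
    const std::size_t keep = shortSize - 4;       // room left after "..." and NUL
    std::memcpy(short_text, text, keep);
    std::memcpy(short_text + keep, "...", 4);     // the 4th byte is the NUL
    return short_text;
}
```

With a 14-byte buffer, "Quite a long string" comes out as "Quite a lo...", and the caller decides whether that buffer lives on the stack or the heap.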
Really you should just use std::string, but if you must, look to the existing library for usage guidance.
In the C standard library, the function that is closest to what you are doing is
char * strncpy ( char * destination, const char * source, size_t num );
So I'd go with this:
const char* TextHelper::shortenWithPlaceholder(
char * destination,
const char * source,
size_t newSize);
The caller is responsible for memory management - this allows the caller to use the stack, or a heap, or a memory mapped file, or whatever source to hold that data. You don't need to document that you used new[] to allocate the memory, and the caller doesn't need to know to use delete[] as opposed to free or delete, or even a lower-level operating system call. Leaving the memory management to the caller is just more flexible, and less error prone.
Returning a pointer to the destination is just a nicety to allow you to do things like this:
char buffer[13];
printf("%s", TextHelper::shortenWithPlaceholder(buffer, source, 12));
The most flexible approach is to return a helper object that wraps the allocated memory, so that the caller doesn't have to worry about it. The class stores a pointer to the memory, and has a copy constructor, an assignment operator and a destructor.
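#include <cstring>
#include <utility>
class string_wrapper
{
    char *p;
    static char *clone(const char *s) {          // deep copy, so each wrapper owns its buffer
        if (!s) return nullptr;
        char *c = new char[std::strlen(s) + 1];
        std::strcpy(c, s);
        return c;
    }
public:
    string_wrapper(char *_p) : p(_p) { }
    string_wrapper(const string_wrapper &o) : p(clone(o.p)) { }                      // copy ctor
    string_wrapper &operator=(string_wrapper o) { std::swap(p, o.p); return *this; } // assignment
    ~string_wrapper() { delete[] p; }
    const char *c_str() const { return p; }
};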
// function declaration
string_wrapper TextHelper::shortenWithPlaceholder(const char* text, size_t newSize)
{
// allocate string buffer 'p' somehow...
return string_wrapper(p);
}
// caller
string_wrapper shortened = TextHelper::shortenWithPlaceholder("Something too long", 5);
std::cout << shortened.c_str();
Most real programs use std::string for this purpose.
In your example the caller has no choice but to be responsible for freeing the allocated memory.
This, however, is an error prone idiom to use and I don't recommend using it.
One alternative that allows you to use pretty much the same code is to change shortened to a reference-counted pointer and have the method return that reference-counted pointer instead of a bare pointer.
There are two basic ways that I consider equally common:
a) TextHelper returns the c string and forgets about it. The user has to delete the memory.
b) TextHelper maintains a list of allocated strings and deallocates them when it is destroyed.
Now it depends on your usage pattern. b) seems risky to me: If TextHelper has to deallocate the strings, it should not do so before the user is done working with the shortened string. You probably won't know when this point comes, so you keep the TextHelper alive until the program terminates. This results in a memory usage pattern equal to a memory leak. I'd recommend b) only if the strings belong semantically to the class that provides them, similar to the std::string::c_str(). Your TextHelper looks more like a toolbox that should not be associated with the processed strings, so if I had to choose between the two, I'd go for a). Your user class is probably the best solution, given a fixed TextHelper interface.
Edit: No, I'm wrong. I misunderstood what you were trying to do. The caller must delete the memory in your instance.
The C++ standard states that deleting 0/NULL does nothing (in other words, this is safe to do), so you can delete it regardless of whether you ever called the function at all. Edit: I don't know how this got left out...your other alternative is placement delete. In that case, even if it is bad form, you should also use placement new to keep the allocation/deallocation in the same place (otherwise the inconsistency would make debugging ridiculous).
That said, how are you using the code? I don't see when you would ever call it more than once, but if you do, there are potential memory leaks (I think) if you don't remember each different block of memory.
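I would just use a smart pointer such as std::unique_ptr<char[]> or boost::shared_array<char>. It deletes the buffer on the way out and works with char arrays. (std::auto_ptr and a plain boost::shared_ptr<char> call delete rather than delete[], so they do not fit memory allocated with new[].)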
Another thing you can do, depending on how TextHelper is allocated, is to let it own the copy. Here is a theoretical ctor and dtor:
TextHelper(const char* input) : input_(input), copy(0) {
    copy = new char[std::strlen(input) + 1]; // needs <cstring>; mess with later
}
~TextHelper() { delete[] copy; }
Was reading up a bit on my C++, and found this article about RTTI (Runtime Type Identification):
http://msdn.microsoft.com/en-us/library/70ky2y6k(VS.80).aspx . Well, that's another subject :) - However, I stumbled upon a weird saying in the type_info-class, namely about the ::name-method. It says: "The type_info::name member function returns a const char* to a null-terminated string representing the human-readable name of the type. The memory pointed to is cached and should never be directly deallocated."
How can you implement something like this yourself!? I've been struggling quite a bit with this exact problem often before, as I don't want to make a new char-array for the caller to delete, so I've stuck to std::string thus far.
So, for the sake of simplicity, let's say I want to make a method that returns "Hello World!", let's call it
const char *getHelloString() const;
Personally, I would make it somehow like this (Pseudo):
const char *getHelloString() const
{
char *returnVal = new char[13];
strcpy(returnVal, "HelloWorld!");
return returnVal;
}
.. But this would mean that the caller should do a delete[] on my return pointer :(
Thx in advance
How about this:
const char *getHelloString() const
{
return "HelloWorld!";
}
Returning a literal directly means the space for the string is allocated in static storage by the compiler and will be available throughout the duration of the program.
I like all the answers about how the string could be statically allocated, but that's not necessarily true for all implementations, particularly the one whose documentation the original poster linked to. In this case, it appears that the decorated type name is stored statically in order to save space, and the undecorated type name is computed on demand and cached in a linked list.
If you're curious about how the Visual C++ type_info::name() implementation allocates and caches its memory, it's not hard to find out. First, create a tiny test program:
#include <cstdio>
#include <typeinfo>
#include <vector>
int main(int argc, char* argv[]) {
std::vector<int> v;
const std::type_info& ti = typeid(v);
const char* n = ti.name();
printf("%s\n", n);
return 0;
}
Build it and run it under a debugger (I used WinDbg) and look at the pointer returned by type_info::name(). Does it point to a global structure? If so, WinDbg's ln command will tell the name of the closest symbol:
0:000> ?? n
char * 0x00000000`00857290
"class std::vector<int,class std::allocator<int> >"
0:000> ln 0x00000000`00857290
0:000>
ln didn't print anything, which indicates that the string wasn't in the range of addresses owned by any specific module. It would be in that range if it was in the data or read-only data segment. Let's see if it was allocated on the heap, by searching all heaps for the address returned by type_info::name():
0:000> !heap -x 0x00000000`00857290
Entry User Heap Segment Size PrevSize Unused Flags
-------------------------------------------------------------------------------------------------------------
0000000000857280 0000000000857290 0000000000850000 0000000000850000 70 40 3e busy extra fill
Yes, it was allocated on the heap. Putting a breakpoint at the start of malloc() and restarting the program confirms it.
Looking at the declaration in <typeinfo> gives a clue about where the heap pointers are getting cached:
struct __type_info_node {
void *memPtr;
__type_info_node* next;
};
extern __type_info_node __type_info_root_node;
...
_CRTIMP_PURE const char* __CLR_OR_THIS_CALL name(__type_info_node* __ptype_info_node = &__type_info_root_node) const;
If you find the address of __type_info_root_node and walk down the list in the debugger, you quickly find a node containing the same address that was returned by type_info::name(). The list seems to be related to the caching scheme.
The MSDN page linked in the original question seems to fill in the blanks: the name is stored in its decorated form to save space, and this form is accessible via type_info::raw_name(). When you call type_info::name() for the first time on a given type, it undecorates the name, stores it in a heap-allocated buffer, caches the buffer pointer, and returns it.
The linked list may also be used to deallocate the cached strings during program exit (however, I didn't verify whether that is the case). This would ensure that they don't show up as memory leaks when you run a memory debugging tool.
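To answer the "how can you implement something like this yourself" part, a compute-once cache can be sketched as below. This is not the Visual C++ implementation, just the same idea with a std::map in place of the linked list; expensive_format is a hypothetical stand-in for undecorating a name, and the code assumes single-threaded use.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for the costly step (e.g. undecorating a raw name).
std::string expensive_format(const std::string& raw) {
    return "class " + raw;
}

// Returns a cached, NUL-terminated string that the caller must not free.
const char* cached_name(const std::string& raw) {
    static std::map<std::string, std::string> cache;   // lives until program exit
    std::map<std::string, std::string>::iterator it = cache.find(raw);
    if (it == cache.end()) {
        it = cache.emplace(raw, expensive_format(raw)).first;   // first call: compute and store
    }
    return it->second.c_str();   // map nodes never move, so this pointer stays valid
}
```

The cache owns every string and is destroyed at program exit, which is why the returned pointer "should never be directly deallocated" by the caller.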
Well gee, if we are talking about just a function that always returns the same value, it's quite simple.
const char * foo()
{
static char return_val[] = "HelloWorld!";
return return_val;
}
The tricky bit is when you start caching the result: then you have to consider threading, what happens when your cache gets invalidated, and whether to store things in thread-local storage. But if it's just a one-off output that is immediately copied, this should do the trick.
Alternatively, if you don't have a fixed size, you have to either use a static buffer of arbitrary size (which something may eventually be too large for) or turn to a managed class, say std::string.
const char * foo()
{
static std::string output;
DoCalculation(output);
return output.c_str();
}
Also, the function signature
const char *getHelloString() const;
is only applicable to member functions.
At that point you don't need to deal with function-local static variables and could just use a member variable.
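A minimal sketch of the member-variable version, with a hypothetical class name Greeter:

```cpp
#include <string>

class Greeter {
    std::string hello_;   // owns the characters
public:
    Greeter() : hello_("Hello World!") {}

    // The pointer is valid while this Greeter is alive and hello_ is unchanged.
    const char* getHelloString() const { return hello_.c_str(); }
};
```

The caller never deletes anything; the lifetime rule is the same one std::string::c_str() itself has.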
I think that since they know that there are a finite number of these, they just keep them around forever. It might be appropriate for you to do that in some instances, but as a general rule, std::string is going to be better.
They can also look up new calls to see if they made that string already and return the same pointer. Again, depending on what you are doing, this may be useful for you too.
Be careful when implementing a function that allocates a chunk of memory and then expects the caller to deallocate it, as you do in the OP:
const char *getHelloString() const
{
char *returnVal = new char[13];
strcpy(returnVal, "HelloWorld!");
return returnVal;
}
By doing this you are transferring ownership of the memory to the caller. If you call this code from some other function:
int main()
{
const char * str = getHelloString();
delete[] str;
return 0;
}
...the semantics of transferring ownership of the memory are not clear, creating a situation where bugs and memory leaks are more likely.
Also, at least under Windows, if the two functions are in 2 different modules you could potentially corrupt the heap. In particular, if main() is in hello.exe, compiled in VC9, and getHelloString() is in utility.dll, compiled in VC6, you'll corrupt the heap when you delete the memory. This is because VC6 and VC9 both use their own heap, and they aren't the same heap, so you are allocating from one heap and deallocating from another.
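A common way around the cross-module problem is for the module that allocates to also export the function that releases. The sketch below shows the shape of such a pair; createHelloString and destroyHelloString are illustrative names, and any DLL export decoration is left out.

```cpp
#include <cstring>

// Both functions would live in the same module (e.g. utility.dll), so the
// allocation and the deallocation always go through the same heap.
const char* createHelloString() {
    char* s = new char[13];            // 12 characters plus the NUL
    std::strcpy(s, "Hello World!");
    return s;
}

void destroyHelloString(const char* s) {
    delete[] s;                        // matches the new[] above
}
```

The caller in hello.exe then calls destroyHelloString(str) instead of delete[] str, and never needs to know how the memory was obtained.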
Why does the return type need to be const? Don't think of the method as a get method, think of it as a create method. I've seen plenty of APIs that require you to delete something a creation operator/method returns. Just make sure you note that in the documentation.
/* create a hello string
* must be deleted after use
*/
char *createHelloString() const
{
char *returnVal = new char[13];
strcpy(returnVal, "HelloWorld!");
return returnVal;
}
What I've often done when I need this sort of functionality is to have a char * pointer in the class - initialized to null - and allocate when required.
viz:
class CacheNameString
{
private:
char *name;
public:
CacheNameString():name(NULL) { }
~CacheNameString() { free(name); } // release the last cached copy
const char *make_name(const char *v)
{
if (name != NULL)
free(name);
name = strdup(v);
return name;
}
};
Something like this would do:
const char *myfunction() {
static char *str = NULL; /* this only happens once */
delete [] str; /* delete previous cached version */
str = new char[strlen("whatever") + 1]; /* allocate space for the string and its NUL terminator */
strcpy(str, "whatever");
return str;
}
EDIT: Something that occurred to me is that a good replacement for this could be returning a boost::shared_ptr instead. That way the caller can hold onto it as long as they want and they don't have to worry about explicitly deleting it. A fair compromise IMO.
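Roughly like this, with std::shared_ptr standing in for the Boost one (an assumption made so the sketch only needs the standard library); makeHelloString is a made-up name:

```cpp
#include <cstring>
#include <memory>

// The array deleter makes sure delete[] (not delete) runs when the last
// owner lets go of the buffer.
std::shared_ptr<const char> makeHelloString() {
    char* raw = new char[13];          // 12 characters plus the NUL
    std::strcpy(raw, "Hello World!");
    return std::shared_ptr<const char>(raw, std::default_delete<char[]>());
}
```

Callers use get() wherever a const char* is expected and can copy the shared_ptr freely; the buffer goes away with the last copy.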
The advice that warns about the lifetime of the returned string is sound advice. You should always be careful to recognise your responsibilities when it comes to managing the lifetime of returned pointers. The practice is quite safe, however, provided the variable pointed to will outlast the call to the function that returned it. Consider, for instance, the pointer to const char returned by the c_str() method of class std::string. It points to memory managed by the string object, which is guaranteed to be valid as long as the string object is not destroyed or made to reallocate its internal memory.
The std::type_info class is part of the C++ standard, as its namespace implies. The pointer returned from name() actually points to static memory created by the compiler and linker when the class was compiled, and is part of the run-time type identification (RTTI) system. Because it refers to a symbol in code space, you should not attempt to delete it.
I think something like this can only be implemented "cleanly" using objects and the RAII idiom.
When the object's destructor is called (obj goes out of scope), we can safely assume that the const char* pointers aren't being used anymore.
example code:
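#include <cstring>
#include <vector>
class ICanReturnConstChars
{
    std::vector<char*> cached_strings;   // std::stack has no push_back/back/pop_back
public:
    const char* yeahGiveItToMe(){
        const char text[] = "something";          // placeholder content
        char* newmem = new char[sizeof text];
        std::memcpy(newmem, text, sizeof text);   //write something to newmem
        cached_strings.push_back(newmem);
        return newmem;
    }
    ~ICanReturnConstChars(){
        while(!cached_strings.empty()){
            delete [] cached_strings.back();
            cached_strings.pop_back();
        }
    }
};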
The only other possibility I know of is to pass a smart pointer.
It's probably done using a static buffer:
const char* GetHelloString()
{
static char buffer[256] = { 0 };
strcpy( buffer, "Hello World!" );
return buffer;
}
This buffer is like a global variable that is accessible only from this function.
You can't rely on GC; this is C++. That means you must keep the memory available until the program terminates. You simply don't know when it becomes safe to delete[] it. So, if you want to construct and return a const char*, simply new[] it and return it. Accept the unavoidable leak.