Passing C style string around function - c++

I have a situation, shown below, in which I need to pass a C-style string into a function and store it in a container to be used later on. The container stores char*. I couldn't figure out an efficient way to allocate the memory and store it in the vector. In overloadedfunctionA(int), I need to allocate new memory and copy into a buffer, then pass it to overloadedfunctionA(char*), which again allocates new memory and copies into a buffer. Imagine I have a lot of items of type int and other types, and I am doing twice the work every time. One way to solve it is to copy the logic from overloadedfunctionA(char*) into overloadedfunctionA(int), but that would result in a lot of repetitive code. Any ideas on a more efficient way to do this?
Thanks.
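#include <cstdio>
#include <cstring>
#include <vector>
std::vector<char*> v1;
// Acts as the base function that has a lot of logic to perform.
void overloadedfunctionA(const char* string1) {
    char* buffer = new char[std::strlen(string1) + 1];
    std::strcpy(buffer, string1);  // second allocation and copy
    v1.push_back(buffer);          // the vector keeps the pointer for later use
}
void overloadedfunctionA(int intA) {
    char* buffer = new char[32];
    std::snprintf(buffer, 32, "%d", intA);  // first allocation: convert int to text
    overloadedfunctionA(buffer);
    delete[] buffer;
}
int main() {
    overloadedfunctionA(5);
    overloadedfunctionA("abc");
}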

For all that's holy, just use std::string internally. You can assign to a std::string from a const char*, and you can access the std::string's underlying C string through the c_str() method.
Otherwise, you'd need to write a wrapper class that handles memory allocation for you, and store instances of that class in your vector; but then, std::string already IS such a wrapper class.

Hopefully making the earlier solutions easier for you to follow...
#include <vector>
#include <string>
#include <sstream>
std::vector<std::string> the_vector;
// template here... use an overload if you prefer...
template <typename T>
void fn(const T& t)
{
std::ostringstream oss;
oss << t;
the_vector.push_back(oss.str());
}
int main()
{
fn(3);
fn("whatever");
}

One optimization you can do is to maintain a static hashmap of strings, so that you can use a single instance for duplicate values.
Another solution is to use std::string and redeclare overloadedfunctionA(char*) as overloadedfunctionA(const std::string&).
Yet another option is to let a Garbage Collector do the memory management.
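The "single instance for duplicate values" idea can be sketched roughly as follows. This is only an illustration: intern() is a hypothetical helper name, and the sketch assumes the pool may live for the whole program and is used from one thread.

```cpp
#include <string>
#include <unordered_set>

// Hypothetical helper: returns a pointer to the pool's single copy of `text`.
// The pool owns the storage, so callers never delete the returned pointer.
const char* intern(const char* text) {
    static std::unordered_set<std::string> pool;
    // insert() leaves the set unchanged when an equal string is already stored,
    // and returns an iterator to the stored element either way.
    return pool.insert(std::string(text)).first->c_str();
}
```

Elements of an unordered_set stay at the same address when the set grows, so the returned pointers remain usable for as long as the pool exists. A std::vector<const char*> could then store these pointers without owning them.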

If your strings are long such that copying is likely to be a performance problem, you could consider boost::shared_array as an alternative to std::string. Provided the contents are invariant, this offers a solution where only one copy of the string will need to be held in memory. Note however that you would be trading reference counting overhead for string copying overhead here.
Some older standard library implementations optimized std::string this way anyway (copy-on-write): a single copy is kept and ref-counted until the contents are changed, at which point a new array is allocated for the modified contents with its refcount decoupled from the original contents. C++11 and later effectively rule this out, so don't rely on it.

Related

Can't add item to const char* vector? C++

I have an issue where I need a vector of const char* but for some reason whenever I try adding something nothing happens. Here is the code sample in question.
std::vector<const char*> getArgsFromFile(char* arg) {
std::ifstream argsFile(arg);
std::vector<const char*> args;
while (!argsFile.eof()) {
std::string temp;
argsFile >> temp;
args.push_back(temp.c_str());
}
args.pop_back();
return args;
}
The strange part is if I make this change
std::vector<const char*> getArgsFromFile(char* arg) {
std::ifstream argsFile(arg);
std::vector<const char*> args;
while (!argsFile.eof()) {
std::string temp;
argsFile >> temp;
const char* x = "x";
args.push_back(x);
}
args.pop_back();
return args;
}
It will add 'x' to the vector but I can't get the value of temp into the vector. Any thoughts? Help would be greatly appreciated. Thanks!
A const char* is not a string, but merely a pointer to some memory, usually holding some characters. Now std::string under the hood either holds a small region of memory (like char buff[32]) or, for larger strings, keeps a pointer to memory allocated on the heap. In either case, a pointer to the actual memory holding the data can be obtained via string::c_str(). But when the string goes out of scope, that pointer no longer points to valid data and becomes dangling.
This is why C++ introduced facilities that avoid direct exposure and use of raw pointers. Good C++ code avoids raw pointers like the plague. Your homework calls for poor C++ code (hopefully only so that you learn the problems that come with such raw pointers).
So, in order for the pointers held in your vector to persistently point to some characters (and not become dangling), they must point to persistent memory. The only guaranteed way to achieve that is to dynamically allocate the memory:
while (!argsFile.eof()) {
std::string temp;
argsFile >> temp;
char* buff = new char[temp.size()+1]; // allocate memory
std::strncpy(buff,temp.c_str(),temp.size()+1); // copy data to memory
args.push_back(buff); // store pointer in vector
}
but then the memory allocated in this way will be leaked, unless you de-allocate it as in
while(!args.empty()) {
delete[] args.back();
args.pop_back();
}
Note that this is extremely bad C++ code and not exception safe (if an exception occurs between allocation and de-allocation, the allocated memory is leaked). In C++ one would instead use std::vector<std::string> or perhaps std::vector<std::unique_ptr<const char[]>> (if you cannot use std::string), both being exception safe.
Use a standard-library-based implementation
Guideline SL.1 of the C++ coding guidelines says: "Use the standard library whenever possible" (and relevant). Why work so hard? People have already done most of the work for you...
So, using your function's declaration, you could just have:
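#include <algorithm>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>
std::vector<std::string> getArgsFromFile(char* arg) {
    using namespace std;
    ifstream argsFile(arg);
    vector<string> args;
    // Read whitespace-separated tokens until the stream is exhausted.
    copy(istream_iterator<string>(argsFile),
         istream_iterator<string>(),
         back_inserter(args));
    return args;
}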
and Bob's your uncle.
Still, @Walter's answer is very useful to read, so that you realize what's wrong with your use of char * for strings.

C++ remembering pointers to allocated strings

I have a problem in C++: I can't figure out how to keep track of pointers to newly allocated strings.
In the function getName() I create a copy of the requested name so that the user can't get a pointer to the real allocated name. But I can't find a way to free these allocated copies!
Is there any other way than lists or arrays?
Thank you.
This is the definition of the member function getName():
char * Course::getName() const
{
char* CourseNameCopy= new char*(strlen(CourseName)+1);
return CourseNameCopy;
}
char * Course::getName() const
{
char* CourseNameCopy= new char[strlen(CourseName)+1];
strcpy(CourseNameCopy, CourseName);
return CourseNameCopy;
}
I've made a couple of corrections to the original code so that it does what it claims to do.
If there's a requirement to return a pointer to a modifiable character array containing a copy of the course name, then this is the way to go. But that's a very unusual requirement; usually it's sufficient to return a pointer to a non-modifiable character array, and for that, the internal array is all that's needed:
const char * Course::getName() const
{
return CourseName;
}
With that, users can look at the name of the course but not change it. If for some reason someone needs to fiddle with the returned text they can make their own copy and change that.
Use std::string, unless you have a very specific reason not to. Your code looks like a prime example for a string.
#include <string>
std::string Course::getName() const
{
return CourseName; // This will return a copy
}
Of course you have to change your member variable to be also a std::string CourseName;.
This will make your code much safer and much easier to read. It's the preferred way of doing it in C++, unless you're not a beginner anymore and have a very specific reason not to.
... so that the user cant get pointer to the real allocated name ...
You have a slight misconception here. The client (user) doesn't need access to the real name; receiving the pointer value (address) of the copy is enough to delete it later. Thus clients just have to do the following:
Course c("XYZ");
char* n = c.getName();
// deallocate after use
delete[] n;
Also note that you forgot to actually copy the contents:
char * Course::getName() const {
    char* CourseNameCopy = new char[strlen(CourseName)+1];
    strcpy(CourseNameCopy, CourseName); // <<<<<<<<<
    return CourseNameCopy;
}
I have to mention that this is not a good solution, because it makes your code's clients responsible for memory management.
Better to use a std::string member variable for CourseName, which was designed for this and takes care of all the memory management under the hood.
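A minimal sketch of what the std::string version of the class might look like. The question only shows getName(), so the constructor and the member name name_ are assumptions made for illustration.

```cpp
#include <string>
#include <utility>

class Course {
public:
    explicit Course(std::string name) : name_(std::move(name)) {}

    // Read-only access to the stored name; a caller who wants a modifiable
    // version copies it into their own std::string.
    const std::string& getName() const { return name_; }

private:
    std::string name_;  // allocates and frees its own storage
};
```

With this shape there are no copies to remember and free: a caller who writes std::string n = c.getName(); gets an independent copy that cleans up after itself.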

Why do I get "double free or corruption"?

I am trying to serialize a struct, but the program crashed with:
*** glibc detected *** ./unserialization: double free or corruption (fasttop): 0x0000000000cf8010 ***
#include <iostream>
#include <cstdlib>
#include <cstring>
struct Dummy
{
std::string name;
double height;
};
template<typename T>
class Serialization
{
public:
static unsigned char* toArray (T & t)
{
unsigned char *buffer = new unsigned char [ sizeof (T) ];
memcpy ( buffer , &t , sizeof (T) );
return buffer;
};
static T fromArray ( unsigned char *buffer )
{
T t;
memcpy ( &t , buffer , sizeof (T) );
return t;
};
};
int main ( int argc , char **argv )
{
Dummy human;
human.name = "Someone";
human.height = 11.333;
unsigned char *buffer = Serialization<Dummy>::toArray (human);
Dummy dummy = Serialization<Dummy>::fromArray (buffer);
std::cout << "Name:" << dummy.name << "\n" << "Height:" << dummy.height << std::endl;
delete buffer;
return 0;
}
I see two problems with this code:
You are invoking undefined behavior by memcpying a struct containing a std::string into another location. If you memcpy a class that isn't just a pure struct (for example, a std::string), it can cause all sorts of problems. In this particular case, I think that part of the problem might be that std::string sometimes stores an internal pointer to a buffer of characters containing the actual contents of the string. If you memcpy the std::string, you bypass the string's normal copy constructor that would duplicate the string. Instead, you now have two different instances of std::string sharing a pointer, so when they are destroyed they will both try to delete the character buffer, causing the bug you're seeing. There is no easy fix for this other than to not do what you're doing. It's just fundamentally unsafe.
You are allocating memory with new[], but deleting it with delete. You should use the array deleting operator delete[] to delete this memory, since using regular delete on it will result in undefined behavior, potentially causing this crash.
Hope this helps!
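One way to catch the first problem at compile time, assuming C++11 or later, is to let the compiler reject memcpy on types that are not trivially copyable. PlainDummy, StringDummy and copyBytes are hypothetical names for this sketch; StringDummy mirrors the struct from the question.

```cpp
#include <cstring>
#include <string>
#include <type_traits>

struct PlainDummy {   // fixed-size fields only, safe to copy byte by byte
    char name[100];
    double height;
};

struct StringDummy {  // same shape as Dummy in the question
    std::string name;
    double height;
};

// Copies the bytes of t into a caller-supplied buffer of at least sizeof(T) bytes.
// Refuses to compile for types such as StringDummy.
template <typename T>
void copyBytes(unsigned char* buffer, const T& t) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "memcpy is only valid for trivially copyable types");
    std::memcpy(buffer, &t, sizeof(T));
}
```

Calling copyBytes with a StringDummy fails to compile with the message above, whereas the original toArray compiled silently and failed at run time.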
It's not valid to use memcpy() with a data element of type std::string (or really, any non-POD data type). The std::string class stores the actual string data in a dynamically-allocated buffer. When you memcpy() the contents of a std::string around, you are obliterating the pointers allocated internally and end up accessing memory that has been already freed.
You could make your code work by changing the declaration to:
struct Dummy
{
char name[100];
double height;
};
However, that has the disadvantages of a fixed size name buffer. If you want to maintain a dynamically sized name, then you will need to have a more sophisticated toArray and fromArray implementation that doesn't do straight memory copies.
You're copying the string's internal buffer in the toArray call. When deserializing with fromArray you "create" a second string in dummy, which thinks it owns the same buffer as human.
std::string probably contains a pointer to a buffer that contains the string data. When you call toArray (human), you're memcpy()'ing the Dummy class's string, including the pointer to the string's data. Then when you create a new Dummy object by memcpy()'ing directly into it, you've created a new string object with the same pointer to string data as the first object. Next thing you know, dummy gets destructed and the copy of the pointer gets destroyed, then human gets destructed and BAM, you got a double free.
Generally, copying objects using memcpy like this will lead to all sorts of problems, like the one you've seen. It's probably just going to be the tip of the iceberg. Instead, you might consider explicitly implementing some sort of marshalling function for each class you want to serialize.
Alternatively, you might look into JSON libraries for C++, which can serialize things into a convenient text-based format. JSON is commonly used in custom network protocols where you want to serialize objects to send over a socket.
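An explicit marshalling function for the Dummy struct could look roughly like the sketch below. The byte layout (a 4-byte length, then the name bytes, then the height) and the names toBytes and fromBytes are assumptions for illustration; the sketch uses native byte order and does no bounds checking, so it only suits a round trip within the same program.

```cpp
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

struct Dummy {
    std::string name;
    double height;
};

// Writes the name length, the name characters, and the height, field by field.
std::vector<unsigned char> toBytes(const Dummy& d) {
    const std::uint32_t len = static_cast<std::uint32_t>(d.name.size());
    std::vector<unsigned char> out(sizeof len + len + sizeof d.height);
    unsigned char* p = out.data();
    std::memcpy(p, &len, sizeof len);
    p += sizeof len;
    std::memcpy(p, d.name.data(), len);
    p += len;
    std::memcpy(p, &d.height, sizeof d.height);
    return out;
}

// Rebuilds a Dummy; the new string allocates its own buffer via assign().
Dummy fromBytes(const std::vector<unsigned char>& in) {
    Dummy d;
    std::uint32_t len = 0;
    const unsigned char* p = in.data();
    std::memcpy(&len, p, sizeof len);
    p += sizeof len;
    d.name.assign(reinterpret_cast<const char*>(p), len);
    p += len;
    std::memcpy(&d.height, p, sizeof d.height);
    return d;
}
```

Because the string is rebuilt through its normal interface, the two Dummy objects never share an internal pointer, and the std::vector frees the byte buffer on its own.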

How do I pass an std::string to a function that expects char*? [duplicate]

Do I need to convert it first? I saw in another post that .c_str() can be used if the function expected const char*. What about for just char*?
std::vector<char> buffer(s.begin(), s.end());
foo(&buffer[0], buffer.size());
s.assign(buffer.begin(), buffer.end());
There is no way to get a char* from a string that is guaranteed to work on all platforms, for the simple fact that string is not required to use contiguous storage.
Your safest, most portable course of action is to copy the string somewhere that does use contiguous storage (a vector perhaps), and use that instead.
vector<char> chars(my_string.begin(), my_string.end());
char* ptr = &chars[0];
If you want to be hacky and non-portable and decidedly unsafe, you can confirm that your string implementation does in fact use contiguous storage, and then maybe use this:
&my_str[0]
But I would punch any developer that worked for me that did this.
EDIT:
I've been made aware that there are currently no known STL implementations that do not store the string data in a contiguous array, which would make &my_str[0] safe. It is also true (and I was asked to state this) that in the upcoming C++0x standard, it will be required for the storage to be contiguous.
It's been suggested that because of these facts my post is factually incorrect.
Decide for yourself, but I say no. This is not in the current C++ standard, and so it is not required. I will still in practice do things the way I have suggested, and in any code review I will flag any code that assumes the underlying storage is contiguous.
Consider this. Suppose there were a question about vtable pointers. Someone wants to examine a class and get the pointer to a virtual function by looking at the vtable. I would immediately tell them not to do this because there is no mention of how virtual methods are implemented in C++. Every implementation I know uses vtables, and I can't think of a better way to do it. It is likely that polymorphism will forever be implemented using vtables. Does that make it ok to examine the vtable directly?
IMO no, because this depends on undocumented implementation details. You have no control over this, and it could change at any time. Even if you expect it will never change, it is still bad engineering to rely on these implementation details.
Decide for yourself.
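For readers compiling as C++11 or later, where the contiguity requirement mentioned in the edit is part of the standard, passing the string's own storage might look like the sketch below. upperCaseInPlace is a hypothetical stand-in for a function taking char*, and the sketch assumes the callee never writes beyond the existing characters.

```cpp
#include <cctype>
#include <string>

// Hypothetical C-style API that modifies its argument in place.
void upperCaseInPlace(char* text) {
    for (; *text != '\0'; ++text) {
        *text = static_cast<char>(std::toupper(static_cast<unsigned char>(*text)));
    }
}

// Hands the string's own buffer to the C-style function.
// Relies on std::string storage being contiguous and null-terminated (C++11 and later).
void callUpperCase(std::string& s) {
    if (s.empty()) {
        return;  // nothing to modify
    }
    upperCaseInPlace(&s[0]);  // the callee must stay within s.size() characters
}
```

Under an older standard, or when the callee may change the length of the text, the copy into a std::vector<char> shown above remains the more conservative choice.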
There are three scenarios:
If the function is outside of your control, and it either modifies the string, or you don't and can't know if it modifies the string:
Then, copy the string into a temporary buffer, and pass that to the function, like so:
void callFoo(std::string& str)
{
    char* tmp = new char[str.length() + 1];
    strncpy(tmp, str.c_str(), str.length() + 1); // +1 copies the terminating '\0'
    foo(tmp);
    // Include the following line if you want the modified value:
    str = tmp;
    delete [] tmp;
}
If the function is outside of your control, but you are certain it does not modify the string, and that not taking the argument as const is simply a mistake on the API's part.
Then, you can cast the const away and pass that to the function
void callFoo(const std::string& str)
{
    foo(const_cast<char*>(str.c_str()));
}
You are in control of the function (and it would not be overly disruptive to change the signature).
In that case, change the function to accept a string& if it modifies the input buffer, or a const char* or const string& if it does not.
When a parameter is declared as char*, it is implicitly assumed that the function will modify the pointed-to string as a side effect. Given this, and the fact that c_str() does not allow modifications to the enclosed string, you cannot directly pass a std::string to such a function.
Something like this can be achieved with the following approach:
#include <cstring>
#include <string>
#include <iostream>
void modify_string(char* pz)
{
pz[0] = 'm';
}
class string_wrapper
{
std::string& _s;
char* _psz;
string_wrapper(const string_wrapper&);
string_wrapper& operator=(const string_wrapper&);
public:
string_wrapper(std::string& s) : _s(s), _psz(0) {}
virtual ~string_wrapper()
{
if(0 != _psz)
{
_s = _psz;
delete[] _psz;
}
}
operator char*()
{
_psz = new char[_s.length()+1];
strcpy(_psz,_s.c_str());
return _psz;
}
};
int main(int argc, char** argv)
{
using namespace std;
std::string s("This is a test");
cout << s << endl;
modify_string(string_wrapper(s));
cout << s << endl;
return 0;
}
If you are certain that the char* will not be modified, you can use const_cast to remove the const.
It's a dirty solution but I guess it works
std::string foo("example");
char* cpy = (char*)malloc(foo.size()+1);
memcpy(cpy, foo.c_str(), foo.size()+1);

Avoiding memory leaks while mutating c-strings

For educational purposes, I am using cstrings in some test programs. I would like to shorten strings with a placeholder such as "...".
That is, "Quite a long string" will become "Quite a lo..." if my maximum length is set to 13. Further, I do not want to destroy the original string - the shortened string therefore has to be a copy.
The (static) method below is what I came up with. My question is: Should the class allocating memory for my shortened string also be responsible for freeing it?
What I do now is to store the returned string in a separate "user class" and defer freeing the memory to that user class.
const char* TextHelper::shortenWithPlaceholder(const char* text, size_t newSize) {
char* shortened = new char[newSize+1];
if (newSize <= 3) {
strncpy_s(shortened, newSize+1, ".", newSize);
}
else {
strncpy_s(shortened, newSize+1, text, newSize-3);
strncat_s(shortened, newSize+1, "...", 3);
}
return shortened;
}
The standard approach of functions like this is to have the user pass in a char[] buffer. You see this in functions like sprintf(), for example, which take a destination buffer as a parameter. This allows the caller to be responsible for both allocating and freeing the memory, keeping the whole memory management issue in a single place.
In order to avoid buffer overflows and memory leaks, you should always use C++ classes such as std::string in this case.
Only the very last instance should convert the class into something low level such as char*. This will make your code simple and safe. Just change your code to:
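std::string TextHelper::shortenWithPlaceholder(const std::string& text,
                                               size_t newSize) {
    // Too short for "...": mirror the original and avoid unsigned underflow in newSize-3.
    if (newSize <= 3) return std::string(".").substr(0, newSize);
    return text.substr(0, newSize-3) + "...";
}
When using that function in a C context, you simply use the c_str() method:
some_c_function(TextHelper::shortenWithPlaceholder("abcde", 4).c_str());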
That's all!
In general, you should not program in C++ the same way you program in C. It's more appropriate to treat C++ as a really different language.
I've never been happy returning pointers to locally allocated memory. I like to keep a healthy mistrust of anyone calling my function in regard to clean up.
Instead, have you considered accepting a buffer into which you'd copy the shortened string?
eg.
const char* TextHelper::shortenWithPlaceholder(const char* text,
size_t textSize,
char* short_text,
size_t shortSize)
where short_text is the buffer into which the shortened string is copied, and shortSize is the size of that buffer. You could also continue to return a const char* pointing to short_text as a convenience to the caller (return NULL if shortSize isn't large enough).
Really you should just use std::string, but if you must, look to the existing library for usage guidance.
In the C standard library, the function that is closest to what you are doing is
char * strncpy ( char * destination, const char * source, size_t num );
So I'd go with this:
const char* TextHelper::shortenWithPlaceholder(
char * destination,
const char * source,
size_t newSize);
The caller is responsible for memory management - this allows the caller to use the stack, or a heap, or a memory mapped file, or whatever source to hold that data. You don't need to document that you used new[] to allocate the memory, and the caller doesn't need to know to use delete[] as opposed to free or delete, or even a lower-level operating system call. Leaving the memory management to the caller is just more flexible, and less error prone.
Returning a pointer to the destination is just a nicety to allow you to do things like this:
char buffer[13];
printf("%s", TextHelper::shortenWithPlaceholder(buffer, source, 12));
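A possible body for the caller-supplied-buffer signature is sketched below, written as a free function for brevity. It assumes destination has room for newSize + 1 characters, copies strings that already fit unchanged, and writes only dots when newSize is 3 or less; those choices are assumptions of this sketch, not something the question specifies.

```cpp
#include <cstddef>
#include <cstring>

// Writes at most newSize characters plus a terminator into destination.
// The caller owns destination and must provide at least newSize + 1 bytes.
const char* shortenWithPlaceholder(char* destination, const char* source,
                                   std::size_t newSize) {
    const std::size_t length = std::strlen(source);
    if (length <= newSize) {
        std::memcpy(destination, source, length + 1);      // fits as is
    } else if (newSize <= 3) {
        std::memset(destination, '.', newSize);            // no room for text
        destination[newSize] = '\0';
    } else {
        std::memcpy(destination, source, newSize - 3);     // leading part
        std::memcpy(destination + newSize - 3, "...", 4);  // dots and terminator
    }
    return destination;
}
```

Since the function never allocates, there is nothing for either side to free; a stack array, as in the printf example, is enough.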
The most flexible approach is to return a helper object that wraps the allocated memory, so that the caller doesn't have to worry about it. The class stores a pointer to the memory, and has a copy constructor, an assignment operator and a destructor.
class string_wrapper
{
char *p;
public:
string_wrapper(char *_p) : p(_p) { }
~string_wrapper() { delete[] p; }
const char *c_str() { return p; }
// also copy ctor, assignment
};
// function declaration
string_wrapper TextHelper::shortenWithPlaceholder(const char* text, size_t newSize)
{
// allocate string buffer 'p' somehow...
return string_wrapper(p);
}
// caller
string_wrapper shortened = TextHelper::shortenWithPlaceholder("Something too long", 5);
std::cout << shortened.c_str();
Most real programs use std::string for this purpose.
In your example the caller has no choice but to be responsible for freeing the allocated memory.
This, however, is an error prone idiom to use and I don't recommend using it.
One alternative that allows you to use pretty much the same code is to change shortened to a reference-counted pointer and have the method return that reference-counted pointer instead of a bare pointer.
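As a rough illustration of the reference-counted variant, the sketch below uses std::shared_ptr from C++11 with an array deleter. shortenShared is a hypothetical free-function name, and the handling of short inputs (copy unchanged if the text fits, dots only when newSize is 3 or less) is an assumption of the sketch.

```cpp
#include <cstddef>
#include <cstring>
#include <memory>

// Returns a reference-counted buffer; delete[] runs when the last copy of the
// shared_ptr goes away, so callers never free anything themselves.
std::shared_ptr<char> shortenShared(const char* text, std::size_t newSize) {
    std::shared_ptr<char> shortened(new char[newSize + 1],
                                    std::default_delete<char[]>());
    char* out = shortened.get();
    const std::size_t length = std::strlen(text);
    if (length <= newSize) {
        std::memcpy(out, text, length + 1);        // fits as is
    } else if (newSize <= 3) {
        std::memset(out, '.', newSize);            // no room for text
        out[newSize] = '\0';
    } else {
        std::memcpy(out, text, newSize - 3);       // leading part
        std::memcpy(out + newSize - 3, "...", 4);  // dots and terminator
    }
    return shortened;
}
```

The explicit std::default_delete<char[]> matters here: without it, a shared_ptr<char> would release the array with plain delete.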
There are two basic ways that I consider equally common:
a) TextHelper returns the c string and forgets about it. The user has to delete the memory.
b) TextHelper maintains a list of allocated strings and deallocates them when it is destroyed.
Now it depends on your usage pattern. b) seems risky to me: If TextHelper has to deallocate the strings, it should not do so before the user is done working with the shortened string. You probably won't know when this point comes, so you keep the TextHelper alive until the program terminates. This results in a memory usage pattern equal to a memory leak. I'd recommend b) only if the strings belong semantically to the class that provides them, similar to the std::string::c_str(). Your TextHelper looks more like a toolbox that should not be associated with the processed strings, so if I had to choose between the two, I'd go for a). Your user class is probably the best solution, given a fixed TextHelper interface.
Edit: No, I'm wrong. I misunderstood what you were trying to do. The caller must delete the memory in your instance.
The C++ standard states that deleting 0/NULL does nothing (in other words, this is safe to do), so you can delete it regardless of whether you ever called the function at all. Edit: I don't know how this got left out...your other alternative is placement delete. In that case, even if it is bad form, you should also use placement new to keep the allocation/deallocation in the same place (otherwise the inconsistency would make debugging ridiculous).
That said, how are you using the code? I don't see when you would ever call it more than once, but if you do, there are potential memory leaks (I think) if you don't remember each different block of memory.
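I would just use a smart pointer such as std::auto_ptr or boost::shared_ptr. It deletes itself on the way out. Note, however, that both release memory with delete rather than delete[], so for a buffer allocated with new char[] an array-aware type such as boost::shared_array (or std::unique_ptr<char[]> in C++11 and later) is the safer fit.
Another thing you can do depends on how TextHelper is allocated. Here is a theoretical ctor:
TextHelper(const char* input) : input_(input), copy(0) {
    copy = new char[strlen(input) + 1]; // room for the text and its terminator; mess with later
}
~TextHelper() { delete[] copy; }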