Stringstream to Vector<char> throws std::bad_alloc - c++

I was tinkering with Boost Property Tree a bit, and I came across this example. I needed to convert the final result to vector, so I just added one more line after
write_json(ss, root);
like this:
std::vector<char> testVec(std::begin(ss.str()), std::end(ss.str()));
I have also tried this instead:
std::string someString = ss.str()
char* someArray = (char*)someString.c_str();
std::vector<char> someVec(someArray, someArray+sizeof(someArray));
In both the cases, I am getting this error:
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc
Any hints what I am doing wrong? It is not a problem of property tree, I gather. It is a problem of string to vector conversion. I guess this makes sense, because vectors are only supposed to be one dimensional, but I have no clue how to convert that string to vector and get it back again.

This fails:
std::vector<char> testVec(std::begin(ss.str()), std::end(ss.str()));
because the std::begin() and std::end() are not associated with same std::string instance. The ss.str() returns a new std::string instance each time, it does not return a std::string& to some internal member.
This is incorrect:
std::vector<char> someVec(someArray, someArray+sizeof(someArray));
because sizeof(someArray) is sizeof(char*). If there is less than sizeof(char*) elements in someArray an attempt will be made to read memory that should not.
A possible solution:
std::string someString = ss.str();
std::vector<char> someVec(someString.begin(), someString.end());

sizeof() isn't giving you the length of the array, because c strings don't know their length (null-terminated). It's just providing the size of the memory location for the pointer at the beginning of the array, which is why you can reference past the end of the string using operator[] if you're not careful.
Another solution is to overload sizeof to iterate through the string until it hits the null terminator, but that seems like something you could do with built in things without going through the extra work.

Related

storing a c++ string into a char* vector

I'm trying to store this string into a vector of pointers. This is just the starting point of the code. Eventually the vector will store words the user types in and either add it to the vector if the vector doesn't have it or show the word is already in the vector.
I tried to use strcpy() but then it told me to use strcpy_s(). So I did now and now it just crashes every time with no error. Can someone give me some insight into what is going on here.
vector<char*> wordBank;
string str;
cin >> str;
wordBank[0] = new char[str.length() + 1];
strcpy_s(wordBank[0], str.length() + 1 , str.c_str() );
cout << wordBank[0];
delete[] wordBank[0];
I would not consider vector<char*> wordBank; this c++ code, but rather C code that happens to use some C++ features.
The standard library in C++ can make your life easier. You should use std::vector<std::string> instead.
vector<string> wordBank;
wordBank.push_back(str);
It avoid all the pointer stuff, so you need not to do memory management (which is a good reason get rid of the pointers).
For strcpy_s, it's a safer version of strcpy, because you have to explicitly specify the size of the target buffer, which can avoid buffer overflows during copies.
However, strcpy_s is non-standard and MS specific, NEVER use strcpy_s unless your just want your code compiled on MSVS. Use std::copy instead.
The default size of a vector is 0
Hence this line
vector<char*> wordBank;
just defines a 0 sized vector of character pointers
As people have mentioned in comments you can go with either of these 2 options:-
vector<char*> wordBank(1);
OR
wordBank.push_back(...);
You don't need char* elements in the vector. You can use string instead, and append your strings to the vector with push_back(), which will allocate the needed space:
vector<string> wordBank;
string str;
cin >> str;
wordBank.push_back(str);
cout << wordBank[0];
This will free you from the burden of having to use delete every time you want to remove a string from the vector. Basically, you should never have to use delete on anything, and to accomplish that, you should avoid allocating memory with new. In this case, that means avoiding the use of new char[/*...*/], which in turn means you should use string to store strings.
vector<char*> wordBank; constructs an empty vector. As such, wordBank[0], which uses operator[], has undefined behavior since you're accessing out of bounds.
Returns a reference to the element at specified location pos. No bounds checking is performed.
You can use push_back to add a new element, as such:
wordBank.push_back(new char[str.length + 1]);
Of course the most sensible thing is to just use use a vector of strings
vector<string> wordBank;
wordBank.push_back(str);
You're trying to manually manage memory for your strings, while the std::string class was designed to do that for you.
Also, from what you are describing your use case to be, you may wish to check out std::map and/or std::set. Here's a tutorial.

std::string without null-character

Is a std::string without a null-character in the end valid and can it be acquired like this?:
std::string str = "Hello World";
str.resize(str.size() - 1);
For those who are curious:
I have a 3rd party function taking a string and iterating over the chars (using iterators). Unfortunately the function is buggy (as its a dev-version) and cannot deal with null-characters. I dont have another signature to chose from, I cant modify the function (as I said, 3rd party and we dont want to fork) and at the same time I dont want to reinvent the wheel. As far as I can tell, the function should work as desired without the null-character so I want atleast to give it a try.
The iteration takes place like this:
bool nextChar(CharIntType& c)
{
if (_it == _end) return false;
c = *_it;
++_it;
return true;
}
where _it is initialized to std::string::begin() and _end to std::string::end()
Until C++11, std::string was not required to include a trailing nul until you called c_str().
http://en.cppreference.com/w/cpp/string/basic_string/data
std::string::data()
Returns pointer to the underlying array serving as character storage. The pointer is such that the range [data(); data() + size()) is valid and the values in it correspond to the values stored in the string.
The returned array is not required to be null-terminated.
If empty() returns true, the pointer is a non-null pointer that should not be dereferenced. (until c++11)
The returned array is null-terminated, that is, data() and c_str() perform the same function.
If empty() returns true, the pointer points to a single null character. (since c++11)
From this we can confirm that std::string::size does not include any nul terminator, and that std::string::begin() and std::string::end() describe the ranges you are actually looking for.
We can also determine this by the simple fact that std::string::back() doesn't return a nul character.
#include <iostream>
#include <string>
int main() {
std::string s("hello, world");
std::cout << "s.front = " << s.front() << " s.back = " << s.back() << '\n';
return 0;
}
http://ideone.com/nUX0AB
While it is possible to have non null terminated strings I would not recommend it, strings are null terminated for a good reason, i would actually recommend in this instance that you either go ahead and write the function properly or get in touch with the third party and have them fix it.
To answer your questions yes a std::string is valid if it is not null terminated, to achieve this you can use the overload of string copy with a maximum length loaded, once again i do not recommend this.
See this page for more information:
http://c2.com/cgi/wiki?NonNullTerminatedString
This is a very late answer but I just post it so that anyone who comes later can use it for their reference. If you write a null terminated string into the string.data() array, it will terminate the string and would not let you to continue concatenate the string if you need to. The way to solve it is already answer in the question.
str.resize(str.size() - 1);
This would solve the problem, I have tested out in my code.

Splitting a std::string into two const char*s resulting in the second const char* overwriting the first

I am taking a line of input which is separated by a space and trying to read the data into two integer variables.
for instance: "0 1" should give child1 == 0, child2 == 1.
The code I'm using is as follows:
int separator = input.find(' ');
const char* child1_str = input.substr(0, separator).c_str(); // Everything is as expected here.
const char* child2_str = input.substr(
separator+1, //Start with the next char after the separator
input.length()-(separator+1) // And work to the end of the input string.
).c_str(); // But now child1_str is showing the same location in memory as child2_str!
int child1 = atoi(child1_str);
int child2 = atoi(child2_str); // and thus are both of these getting assigned the integer '1'.
// do work
What's happening is perplexing me to no end. I'm monitoring the sequence with the Eclipse debugger (gdb). When the function starts, child1_str and child2_str are shown to have different memory locations (as they should). After splitting the string at separator and getting the first value, child1_str holds '0' as expected.
However, the next line, which assigns a value to child2_str not only assigns the correct value to child2_str, but also overwrites child1_str. I don't even mean the character value is overwritten, I mean that the debugger shows child1_str and child2_str to share the same location in memory.
What the what?
1) Yes, I'll be happy to listen to other suggestions to convert a string to an int -- this was how I learned to do it a long time ago, and I've never had a problem with it, so never needed to change, however:
2) Even if there's a better way to perform the conversion, I would still like to know what's going on here! This is my ultimate question. So even if you come up with a better algorithm, the selected answer will be the one that helps me understand why my algorithm fails.
3) Yes, I know that std::string is C++ and const char* is standard C. atoi requires a c string. I'm tagging this as C++ because the input will absolutely be coming as a std::string from the framework I am using.
First, the superior solutions.
In C++11 you can use the newfangled std::stoi function:
int child1 = std::stoi(input.substr(0, separator));
Failing that, you can use boost::lexical_cast:
int child1 = boost::lexical_cast<int>(input.substr(0, separator));
Now, an explanation.
input.substr(0, separator) creates a temporary std::string object that dies at the semicolon. Calling c_str() on that temporary object gives you a pointer that is only valid as long as the temporary lives. This means that, on the next line, the pointer is already invalid. Dereferencing that pointer has undefined behaviour. Then weird things happens, as is often the case with undefined behaviour.
The value returned by c_str() is invalid after the string is destructed. So when you run this line:
const char* child1_str = input.substr(0, separator).c_str();
The substr function returns a temporary string. After the line is run, this temporary string is destructed and the child1_str pointer becomes invalid. Accessing that pointer results in undefined behavior.
What you should do is assign the result of substr to a local std::string variable. Then you can call c_str() on that variable, and the result will be valid until the variable is destructed (at the end of the block).
Others have already pointed out the problem with your current code. Here's how I'd do the conversion:
std::istringstream buffer(input);
buffer >> child1 >> child2;
Much simpler and more straightforward, not to mention considerably more flexible (e.g., it'll continue to work even if the input has a tab or two spaces between the numbers).
input.substr returns a temporary std::string. Since you are not saving it anywhere, it gets destroyed. Anything that happens afterwards depends solely on your luck.
I recommend using an istringstream.

Vector values assign

I have this piece of code:
while(fileStream>>help)
{
MyVector.push_back(help);
}
...and lets say that in file "fileStream" is 1 sentence: Today is a sunny day. Now when i do this:
for(int i=0;i<MyVector.size();i++)
{
printf("%s\n", MyVector[i]);
}
, the resoult is "day day day day day". And, of course, it shouldnt be like that. Variables are declared like this:
char *help;
vector<char*> MyVector;
EDIT:i understand the answers, thank you...but is there any way to store words from file in vector<char*> MyVector(like i wanted in the first place). it would be great for the rest of my program.
When you do fileStream>>help, that's not changing the value of the pointer, it is overwriting the contents of the string that the pointer is pointing at. So you are pushing the same pointer into the vector over and over again. So all of the pointers in the vector point to the same string. And since the last thing written into that string was the word "day", that is what you get when you print out each element of your vector.
Use a vector of std::string instead:
string help;
vector<string> MyVector;
If, as you say, you must stick with vector<char*>, then you will have to dynamically allocate a new string for each element. You cannot safely use a char* with operator>>, because there is no way to tell it how much space you actually have in your string, so I will use std::string for at least the input.
std::string help;
vector<char*> MyVector;
while (fileStream>>help) {
char * p = new char[help.size() + 1];
strcpy(p, help.c_str());
MyVector.push_back(p);
}
Of course, when you are done with the vector, you can't just let it go out of scope. You need to delete every element manually in a loop. However, this is still not completely safe, because your memory allocation could throw an exception, causing whatever strings you have already allocated and put in the vector to leak. So you should really wrap this all up in a try block and be ready to catch std::bad_alloc.
This is all a lot of hassle. If you could explain why you think you need to use vector<char*>, I bet someone could show you why you don't.
Thats because your char * are all pointing at the same object. The help pointer and the pointers stored in the vector all point to the same object. Everytime you read from the stream it modifies what all the pointers are pointing at. Change char* to an std::string.

Character Pointers (allotted by new)

I wrote the following code:
char *pch=new char[12];
char *f=new char[42];
char *lab=new char[20];
char *mne=new char[10];
char *add=new char[10];
If initially I want these arrays to be null, can't I do this:
*lab="\0";
*mne="\0";
and so on.....
And after that if I want to add some cstring to an empty array can't I check:
if(strcmp(lab,"\0")==0)
//then add cstring by *lab="cstring";
And if I can't do any of these things, please tell me the right way to do it...
In C++11, an easy way to initialize arrays is by using brace-initializers:
char * p = new char[100] { 0 };
The reasoning here is that all the missing array elements will be zero-initialized. You can also use explicit value-initialization (I think that's even allowed in C++98/03), which is zero-initalization for the primitive types:
char * q = new char[110]();
First of all, as DeadMG says, the correct way of doing this is using std:string:
std::string lab; // empty initially, no further initialization needed
if (lab.size() == 0) // string empty, note, very fast, no character comparison
lab += "cstring"; // or even lab = "cstring", as lab is empty
Also, in your code, if you insist in using C strings, after the initialization, the correct checking for the empty string would be
if (*lab == '\0')
First of all, I agree with everybody else to use a std::string instead of character arrays the vast majority of the time. Link for help is here: C++ Strings Library
Now to directly answer your question as well:
*lab="\0";
*mne="\0";
and so on.....
This is wrong. Assuming your compiler doesn't give you an error, you're not assigning the "null terminator" to those arrays, you're trying to assign the pointer value of where the "\0" string is to the first few memory locations where the char* is pointing to! Remember, your variables are pointers, not strings. If you're trying to just put a null-character at the beginning, so that strlen or other C-string functions see an "empty" string, do this: *lab='\0'; The difference is that with single-ticks, it denotes the character \0 whereas with double, it's a string literal, which returns a pointer to the first element. I hope that made sense.
Now for your second, again, you can't just "assign" like that to C-style strings. You need to put each character into the array and terminate it correctly. Usually the easiest way is with sprintf:
sprintf(lab, "%s", "mystring");
This may not make much sense, especially as I'm not dereferencing the pointer, but I'll walk you through it. The first argument says to sprintf "output your characters to where this pointer is pointing." So it needs the raw pointer. The second is a format string, like printf uses. So I'm telling it to use the first argument as a string. And the 3rd is what I want in there, a pointer to another string. This example would also work with sprintf(lab, "mystring") as well.
If you want to get into C-style string processing, you need to read some examples. I'm afraid I don't even know where to look on the 'net for good examples of that, but I wish you good luck. I'd highly recommend that you check out the C++ strings library though, and the basic_string<> type there. That's typedef'd to just std::string, which is what you should use.