Converting std::vector<char> to char* causes defective characters - c++

I have a function in my code called buildPacket that takes some parameters, and converts them into a char* and adds them together using a std::vector<char> and at the end returns the result as a char*. The problem is that after I convert the vector to a char* all characters become a weird character.
I tried using other ways of converting the vector to a char*, like with using reinterpret_cast<char*>. When I print the contents of the vector from inside the function, I get the expected result so the problem is with the conversion.
The function's code:
char* buildPacket (int code, std::string data)
{
char* codeBytes = CAST_TO_BYTES(code);
std::vector<char> packetBytes(codeBytes, codeBytes + sizeof(char));
size_t dataLength = data.size() + 1;
char* dataLengthBytes = CAST_TO_BYTES(dataLength);
packetBytes.insert(packetBytes.end(), dataLengthBytes, dataLengthBytes + sizeof(int));
const char* dataBytes = data.c_str();
packetBytes.insert(packetBytes.end(), dataBytes, dataBytes + dataLength);
return &packetBytes[0];
}
The CAST_TO_BYTES macro:
#define CAST_TO_BYTES(OBJ) static_cast<char*>(static_cast<void*>(&OBJ));
The intent of the function is to take the input and build a packet out of it to send through a socket later on, the packet's format consists of a 1-byte long code, 4-byte long data length and data with variable length.
The input I gave it is code = 101 and data = "{\"password\":\"123456\",\"username\":\"test\"}"
This is the result I am getting when printing the characters: ▌▌▌▌▌▌▌▌▌▌▌▌▌▌▌▌▌▌▌▌▌▌▌▌▌▌▌▌▌▌▌▌▌▌▌▌▌▌▌▌▌▌▌▌▌▌▌▌
EDIT: Thanks for all the help. I now return a vector<char> at the end as suggested, and took a different approach to converting the values to a char*.

You're returning a pointer into a local variable. packetBytes is destroyed when buildPacket returns, so the returned char* is left dangling. You should change your code to keep your vector<char> alive outside of your buildPacket function (such as by returning it instead of the char*).
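A minimal sketch of that suggestion, assuming the packet layout described in the question (1-byte code, 4-byte length, then the data including its terminating null). The most-significant-byte-first length encoding is an assumption made for illustration; the question does not say which byte order the receiver expects.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Builds the packet and returns the vector by value, so the bytes stay
// alive in the caller. Layout (assumed from the question): 1-byte code,
// 4-byte length, then the data including its terminating '\0'.
std::vector<char> buildPacket(int code, const std::string& data)
{
    assert(code >= 0 && code <= 255); // the code must fit in one byte

    const std::uint32_t dataLength =
        static_cast<std::uint32_t>(data.size() + 1);

    std::vector<char> packet;
    packet.reserve(1 + 4 + dataLength);
    packet.push_back(static_cast<char>(code));

    // Assumption: the length is sent most significant byte first.
    for (int shift = 24; shift >= 0; shift -= 8)
        packet.push_back(static_cast<char>((dataLength >> shift) & 0xFF));

    // c_str() is null-terminated, so copying size() + 1 bytes includes '\0'.
    packet.insert(packet.end(), data.c_str(), data.c_str() + dataLength);
    return packet;
}
```

The caller keeps the returned vector and passes packet.data() and packet.size() to the socket call while the vector is still in scope.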

You might try this solution. I think using the STL makes it clearer what you are trying to achieve. Your code also returned a pointer into a destroyed local vector, which is undefined behavior and can lead to unpredictable crashes.
#include <algorithm>
#include <iostream>
#include <iterator>
#include <string>
#include <vector>
// Better return std::vector<char>
char* buildPacket(int code, const std::string& data)
{
auto result = data;
result.append(1, static_cast<char>(code));
char* ret = new char[data.size() + 2];
ret[data.size() + 1] = '\0';
std::copy(result.begin(), result.end(), ret);
return ret;
}
std::vector<char> buildPacketStl(int code, const std::string& data)
{
std::vector<char> ret;
std::copy(data.begin(), data.end(), std::back_inserter(ret));
ret.push_back(static_cast<char>(code));
return ret;
}
int main() {
std::cout << buildPacket(65, "test") << std::endl; // 65 -> A (note: the new[] buffer is never freed here)
auto stl= buildPacketStl(65, "test"); // 65 -> A
std::copy(stl.begin(), stl.end(), std::ostream_iterator<char>(std::cout, ""));
std::cout << std::endl;
}

Related

Getting char * or const char * data from a string breaks for 16 character strings or longer

I have a function string_to_char() which attempts to give me a form of a string which I can pass into a library I am using, which wants char * (but I think works with const char *, so I've been trying both).
The code I wrote to test my implementation of string_to_char() goes as such:
#include <iostream>
const std::string endl = "\n";
char * string_to_char(std::string str)
{
return (char*) str.c_str();
}
int main()
{
std::string test1 = "Some test strin";
std::string test2 = "Some test string";
char * result1 = string_to_char(test1);
char * result2 = string_to_char(test2);
std::cout << "part1" << endl;
std::cout << result1 << endl;
std::cout << string_to_char(test1) << endl;
std::cout << "part2" << endl;
std::cout << result2 << endl;
std::cout << string_to_char(test2) << endl;
std::cout << "done" << endl;
return 0;
}
This is the output I get:
part1
Some test strin
Some test strin
part2
Some test string
done
So for some reason, string_to_char() only properly works with strings with 15 characters or shorter, and outputs from the function straight to std::cout, but can't seem to store it to a variable for 16 characters or longer.
I am relatively new to C++ so some of the code below may seem a bit strange to more experienced programmers, but here is the code that I have tried in place of return (char*) str.c_str();
#include <vector>
#include <string.h>
char * string_to_char(std::string str)
{
return (char*) str.c_str();
return const_cast<char*>(str.c_str());
std::vector<char> vec(str.begin(), str.end());
char * chr;
vec.push_back('\0');
chr = (char*) &vec[0];
//chr = & (*vec.begin());
return chr; //all outputs from both are empty with this both versions of chr
return &str[0]; //this makes the output from the 15 character string also be empty when put in a
//variable, but the function going directly to std::cout is fine
return strcpy((char *) malloc(str.length() + 1), str.c_str()); //this one works with everything, but
//it looks like it leaks memory without further changes
std::vector<char> copied(str.c_str(), str.c_str() + str.size() + 1);
return copied.data(); //returns "random" characters/undefined behaviour for both outputs in test1 and is empty for both
//outputs in test2
}
Using const instead, and changing char * result1 = string_to_char(test1); to const char * result1 = string_to_char(test1); (as with result2), to see if that works with these other solutions:
#include <vector>
#include <string.h>
const char * string_to_char(std::string str)
{
return (char*) str.c_str();
return str.c_str();
return (const char*) str.c_str();
return str.data();
return const_cast<char*>(str.c_str());
std::vector<char> vec(str.begin(), str.end());
char * chr;
vec.push_back('\0');
chr = (char*) &vec[0];
//chr = & (*vec.begin());
return chr; //completely breaks both
return &str[0]; //both appear empty when given to a variable, but works fine when taken straight to std::cout
return strcpy((char *) malloc(str.length() + 1), str.c_str()); //memory leak, as when not using const
std::vector<char> copied(str.c_str(), str.c_str() + str.size() + 1);
return copied.data(); //same as when not using const
}
I got a lot of the given methods from:
std::string to char*
string.c_str() is const?
How to convert a std::string to const char* or char*?
Converting from std::string to char * in C++
With a bit of reading around the topic for strings and vectors at https://www.cplusplus.com/reference/ and https://en.cppreference.com/w/
The pointer returned from c_str() is only valid as long as the string is alive. You get expected output when you pass a reference:
auto string_to_char(std::string& str)
{
return str.c_str();
}
Because now the pointer returned is into the buffer of the caller's string. In your code the caller gets a pointer to the function's local string (because you pass a copy).
Though, instead of calling the function you can directly call c_str(). That also mitigates, to some extent, the problem of holding on to the pointer after the string is gone.
You've overthought this. There is no need to write this function yourself. std::string::data already exists and returns a pointer to the string's null-terminated internal buffer. Assuming you're using C++17 or later, this pointer will be const char* if the std::string object is const-qualified (i.e. read-only), and otherwise will be a modifiable char*.
std::string test1 = "string";
const std::string test2 = "const string";
char* result1 = test1.data();
const char* result2 = test2.data();
This pointer is valid for as long as the std::string object that it came from is alive and is not modified (except for modifying individual elements).
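A small sketch of that validity rule. It uses &s[0] rather than the non-const data() so that it also compiles before C++17; upperFirst and demo are made-up names standing in for a legacy C-style API and its caller.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical stand-in for a legacy API that takes a writable char*.
void upperFirst(char* text)
{
    if (text[0] >= 'a' && text[0] <= 'z')
        text[0] = static_cast<char>(text[0] - 'a' + 'A');
}

// The string outlives the call, so the pointer handed to upperFirst()
// stays valid for as long as it is used.
std::string demo()
{
    std::string s = "string";
    char* p = &s[0];   // same address as s.data() in C++17
    upperFirst(p);     // modifying individual elements is allowed
    // p still refers to the string's own buffer.
    assert(std::strcmp(p, s.c_str()) == 0);
    // Note: growing s (for example with s += "...") may reallocate the
    // buffer, after which p must not be used any more.
    return s;
}
```

Calling demo() returns the modified string; the pointer itself is never allowed to escape the lifetime of s.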
Also note that casting pointers and casting away const-ness is a very easy way to cause Undefined Behaviour without knowing it. You should avoid C-style casts in general (e.g. (char*)str.c_str()) because they're very unsafe. See this Q/A on the proper use of C++ casts for more information.
string_to_char() is taking its str parameter by value, so a copy of the caller's input string is made. When the function exits, that copied std::string will be destroyed. Thus, the returned char* pointer will be left dangling, pointing to freed memory, and any use of that pointer to access the data will be undefined behavior.
Pass in the str parameter by reference instead:
char* string_to_char(std::string &str)
{
return const_cast<char*>(str.c_str());
}
Or, in C++17 and later, you can use this instead:
char* string_to_char(std::string &str)
{
return str.data();
}
Which then raises the question of why you need string_to_char() at all and don't just use data() directly, unless you are not using a modern version of C++.

Initialize const char * with out any memory leaks

Below is my sample code. It's just a sample, similar to the code I'm using in my application.
#define STR_SIZE 32
void someThirdPartyFunc(const char* someStr);
void getString(int Num, const char* myStr)
{
char tempStr[] = "MyTempString=";
int size = strlen(tempStr) + 2;
snprintf((char*)myStr, size, "%s%d", tempStr, Num);
}
int main()
{
const char * myStr = new char(STR_SIZE);
getString(1, myStr); // get the formated string by sending the number
someThirdPartyFunc(myStr); // send the string to the thirdpartyFunction
delete myStr;
return 0;
}
I get an exception when I run this code. I think the problem is with deleting myStr, but the delete is really necessary.
Is there another way to format the string in getString and pass it to someThirdPartyFunc?
Thanks in advance.
you are allocating not an array of chars but one char with this line:
const char * myStr = new char(STR_SIZE);
and that one allocated char is initialized with the value of STR_SIZE (32). snprintf then writes well past that single byte, which is a buffer overflow and undefined behavior.
if you want an array of size STR_SIZE:
const char * myStr = new char[STR_SIZE];
(note the square brackets [ ]). memory allocated this way has to be released with the delete[] operator, not plain delete.
personal note: the code you have written above (manually allocated strings etc) is good for educational purposes; you will make a lot of such mistakes and thus learn about the inner workings of C / C++. for production code you do not want that; there you want std::string or other string containers to avoid repeating string-related mistakes. in general you are not the one who will successfully reinvent how string libraries work. the same is true for other container types like dynamically growable arrays (std::vector), dictionary types and so on. but for educational fiddling around, your code above serves a good purpose.
there are other problems in your code snippet (handing a const char* to a function and then modifying the memory behind it, not calculating the size parameter correctly when calling snprintf etc), but these are not related to your segfault problem.
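A sketch of the original approach with those two remaining problems addressed, assuming the caller owns a writable buffer and passes its real size. The three-parameter signature is changed from the question's for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

const std::size_t STR_SIZE = 32;

// Formats "MyTempString=<num>" into a caller-owned, writable buffer.
// Passing the real buffer size lets snprintf truncate instead of
// writing past the end. Returns false if the output did not fit.
bool getString(int num, char* out, std::size_t outSize)
{
    assert(out != NULL && outSize > 0);
    const int written = std::snprintf(out, outSize, "MyTempString=%d", num);
    return written >= 0 && static_cast<std::size_t>(written) < outSize;
}
```

With a stack array such as char myStr[STR_SIZE]; the call becomes getString(1, myStr, sizeof myStr); followed by someThirdPartyFunc(myStr);, and there is nothing left to delete.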
Re the technical, instead of
const char * myStr = new char(STR_SIZE);
do
char const myStr[STR_SIZE] = "";
Note that both have the problem that the string can’t be modified.
But you only asked about the allocation/deallocation problem.
But then, there's so much wrong at levels above the language-technical.
Here's the original code, complete:
void someThirdPartyFunc(const char* someStr);
void getString(int Num, const char* myStr)
{
char tempStr[] = "MyTempString=";
int size = strlen(tempStr) + 2;
snprintf((char*)myStr, size, "%s%d", tempStr, Num);
}
int main()
{
const char * myStr = new char(STR_SIZE);
getString(1, myStr); // get the formated string by sending the number
someThirdPartyFunc(myStr); // send the string to the thirdpartyFunction
delete myStr;
return 0;
}
Here's how to do that at the C++ level:
#include <string> // std::string
#include <sstream> // std::ostringstream
using namespace std;
void someThirdPartyFunc( char const* ) {}
string getString( int const num )
{
ostringstream stream;
stream << "MyTempString=" << num;
return stream.str();
}
int main()
{
someThirdPartyFunc( getString( 1 ).c_str() );
}
The #define disappeared out of the more natural code, but note that it can very easily lead to undesired text substitutions, even with all uppercase macro names. And shouting all uppercase is an eyesore anyway (which is why it's the macro name convention, as opposed to some other convention). In C++ simply use const instead.

Malformed output when converting string to char* in C++

I've got a function that splits up a string into various sections and then parses them, but when converting a string to char* I get a malformed output.
int parseJob(char * buffer)
{ // Parse raw data, should return individual jobs
const char* p;
int rows = 0;
for (p = strtok( buffer, "~" ); p; p = strtok( NULL, "~" )) {
string jobR(p);
char* job = &jobR[0];
parseJobParameters(job); // At this point, the data is still in good condition
}
return (1);
}
int parseJobParameters(char * buffer)
{ // Parse raw data, should return individual job parameters
const char* p;
int rows = 0;
for (p = strtok( buffer, "|" ); p; p = strtok( NULL, "|" )) { cout<<p; } // At this point, the data is malformed.
return (1);
}
I don't know what happens between the first function calling the second one, but it malforms the data.
As you can see from the code example given, the same method to convert string to char* is used and it works fine.
I'm using Visual Studio 2012/C++, any guidance and code examples will be greatly appreciated.
The "physical" reason your code does not work has nothing to do with std::string or C++. It wouldn't work in pure C as well. strtok is a function that stores its intermediate parsing state in some global variable. This immediately means that you cannot use strtok to parse more than one string at a time. Starting the second parse session before finishing the first would override the internal data stored by the first parse session, thus ruining it beyond repair. In other words, strtok parse sessions must not overlap. In your code they do overlap.
Also, in C++03 the idea of using std::string with strtok directly is doomed from the start. The internal sequence stored in std::string is not guaranteed to be null-terminated. This means that generally &jobR[0] is not a C-string. It can't be used with strtok. To convert a std::string to a C-string you have to use c_str(). But C-string returned by c_str() is non-modifiable.
In C++11 and later the situation is better: std::string is required to store its characters contiguously, and c_str() and data() return a pointer to that same null-terminated storage, so &jobR[0] does point to a valid C-string. The pointer returned by c_str() is still const (a non-const data() was only added in C++17), so &jobR[0] is the way to get a writable pointer for strtok.
You cannot use strtok() to parse multiple strings at the same time, like you are doing. The first call to parseJobParameters() in the first loop iteration of parseJob() will alter the internal buffer that strtok() points to, thus the second loop iteration of parseJob() will not be processing the original data anymore. You need to rewrite your code to not use nested calls to strtok() anymore, eg:
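#include <cstring>
#include <iostream>
#include <string>
#include <vector>
// s is taken by value so strtok() modifies a private copy, not the caller's data
void split(std::string s, const char *delims, std::vector<std::string> &vec)
{
if (s.empty()) return;
// alternatively, use s.find_first_of() and s.substr() instead...
// &s[0] gives the writable char* that strtok() needs; c_str() is const
for (const char* p = strtok(&s[0], delims); p != NULL; p = strtok(NULL, delims))
{
vec.push_back(p);
}
}
// defined before parseJob() so that it is declared at the point of the call
int parseJobParameters(const char * buffer)
{
std::vector<std::string> params;
split(buffer, "|", params);
for (std::vector<std::string>::iterator i = params.begin(); i != params.end(); ++i)
{
std::cout << *i;
}
return (1);
}
int parseJob(const char * buffer)
{
std::vector<std::string> jobs;
split(buffer, "~", jobs); // the whole "~" pass finishes before any "|" pass starts
for (std::vector<std::string>::iterator i = jobs.begin(); i != jobs.end(); ++i)
{
parseJobParameters(i->c_str());
}
return (1);
}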
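The find_first_of()/substr() alternative mentioned in the comment could look roughly like this. It is a sketch, not a drop-in replacement: this split returns the tokens instead of filling an output parameter, and it skips empty tokens to mimic strtok().

```cpp
#include <cassert>
#include <string>
#include <vector>

// Splits s on any character in delims, skipping empty tokens the way
// strtok() does. It keeps no shared state, so calls can be nested freely.
std::vector<std::string> split(const std::string& s, const std::string& delims)
{
    assert(!delims.empty());
    std::vector<std::string> tokens;
    std::string::size_type start = s.find_first_not_of(delims);
    while (start != std::string::npos)
    {
        // end is npos for the last token; substr() then takes the rest.
        const std::string::size_type end = s.find_first_of(delims, start);
        tokens.push_back(s.substr(start, end - start));
        start = s.find_first_not_of(delims, end);
    }
    return tokens;
}
```

Because nothing is stored between calls, splitting each job on "|" inside a loop over the "~" tokens is safe with this version.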
Whilst char* job = &jobR[0]; will give you the address of the first character in the string, before C++11 it is not guaranteed to give you a valid C-style string. For read-only access you should use const char* job = jobR.c_str();
I'm fairly sure that will solve your problem, but there could of course be something wrong with the way you read the buffer that is passed to parseJob in as well.
Edit: of course, you are also calling strtok from a function that uses strtok. Inside strtok looks a bit like this:
char *strtok(char *str, char *separators)
{
static char *last;
char *found = NULL;
if (!str) str = last;
... do searching for needle, set found to beginning of non-separators ...
if (found)
{
*str = 0; // mark end of string.
}
last = str;
return found;
}
Since "last" gets overwritten when you call parseJobParameters, you can't use strtok(NULL, ... ) when you get back to parseJob.

What's the difference between char * and const_cast<char*>(string.c_str())

I use an external library to deal with udp (OSC) communication between 2 apps.
To format the messages that will be sent, the library is expecting a char* but I get a string from the UI that I have to convert.
While I was dealing with other parts of my code the udp part was hard coded like that :
char* endofMess = "from setEndMess";
and was working fine. I thought it would be easy to get it working with my strings and wrote :
std::string s = "from setEndMess";
char* endofMess = const_cast<char*>(s.c_str());
but unlike for the first example where I was receiving the message correctly formatted, I now receive only gibberish characters. Does somebody know where it can come from?
Thanks!
Matthieu
EDIT : the code I use :
The method to send the message each time OSCVal will change :
void osc500::testOSC(int identifier, float OSCval)
{
UdpTransmitSocket transmitSocket( IpEndpointName( destIP, port ) );
char buffer[1024];
osc::OutboundPacketStream p( buffer, 1024 );
p << osc::BeginBundleImmediate
<< osc::BeginMessage( endofMess )
<< OSCval << osc::EndMessage
<< osc::EndBundle;
transmitSocket.Send( p.Data(), p.Size() );
}
And if I have to change the OSC pattern I call this one :
void osc500::setEndMess(String endpattern){
// endofMess = "from setEndMess"; //OK works fine each time it's called
//1st try :
//std::string s = "from setEndMess";
//endofMess = const_cast<char*>(s.c_str()); //gibberish
//2nd try :
//std::string s = "from setEndMess";
//endofMess = &s[0]; //gibberish
//3rd & 4th tries :
//char s[4] = {'t','e','s','t'};
//char s[5] = {'t','e','s','t','\0'};
//endofMess = s; //gibberish
}
c_str() is for read-only access of std::string.
If you want to write to a string through pointers, then use either...
(1) an array (or vector) of char instead of std::string
(2) char* buf = &str[0]; - point directly to the first character of a string
Trick (2) is guaranteed to work under C++11; in practice it works in C++03 too, but that relies on the compiler's implementation of std::string having contiguous storage.
(And whatever you do, keep an eye on the buffer length.)
I suspect the char* is not written to, it is only non const because it is a legacy API. If so, your problem is probably that the std::string has fallen out of scope or been modified between the point where you call c_str and where it is used in the guts of the API.
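If that is the cause, one way out is to let the object own the text. The sketch below assumes the library only reads through the pointer; PatternHolder and endMess() are hypothetical names, not part of the asker's class or of the OSC library.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the asker's class: it owns the pattern as a
// std::string member, so the pointer it hands out stays valid until the
// next call to setEndMess() or until the object is destroyed.
class PatternHolder
{
public:
    void setEndMess(const std::string& endPattern)
    {
        assert(!endPattern.empty());
        endOfMess_ = endPattern; // copy the text, not a pointer to a local
    }

    // Read-only view for the sending code; do not cache it across
    // calls to setEndMess().
    const char* endMess() const { return endOfMess_.c_str(); }

private:
    std::string endOfMess_;
};
```

The sending method would then fetch endMess() each time it builds a message instead of keeping a raw char* member around.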
If you want to modify the std::string content, I think you should use &s[0] (making sure that the string is big enough).
std::string s = "abcdef...";
char* ptr = &s[0];
e.g. (tested with MSVC10):
#include <iostream>
#include <ostream>
#include <string>
using namespace std;
void f(char* ptr, size_t count)
{
for (size_t i = 0; i < count; i++)
ptr[i] = 'x';
}
int main()
{
string s = "abcdef";
cout << s << endl;
f(&s[0], 3);
cout << s << endl;
}
Output:
abcdef
xxxdef

Dereferencing an unsigned char pointer and storing its values into a string

So I am working on a tool that dereferences the values of some addresses, it is in both C and C++, and although I am not familiar with C++ I figured out I can maybe take advantage of the string type offered by C++.
What I have is this:
unsigned char contents_address = 0;
unsigned char * address = (unsigned char *) add.addr;
int i;
for(i = 0; i < bytesize; i++){ //bytesize can be anything from 1 to whatever
if(add.num == 3){
contents_address = *(address + i);
//printf("%02x ", contents_address);
}
}
As you can see what I am trying to do is dereference the unsigned char pointer. What I want to do is have a string variable and concatenate all of the dereferenced values into it and by the end instead of having to go through a for case for getting each one of the elements (by having an array of characters or by just going through the pointers) to have a string variable with everything inside.
NOTE: I need to do this because the string variable is going to a MySQL database and it would be a pain to insert an array into a table...
Try this that I borrowed from this link:
http://www.corsix.org/content/algorithmic-stdstring-creation
#include <sstream>
#include <iomanip>
std::string hexifyChar(int c)
{
std::stringstream ss;
ss << std::hex << std::setw(2) << std::setfill('0') << c;
return ss.str();
}
std::string hexify(const char* base, size_t len)
{
std::stringstream ss;
for(size_t i = 0; i < len; ++i)
ss << hexifyChar(static_cast<unsigned char>(base[i])); // cast avoids sign extension for bytes >= 0x80
return ss.str();
}
I didn't quite understand what you want to do here (why do you assign a dereferenced value to a variable called ..._address?).
But maybe what you're looking for is a stringstream.
Here's a relatively efficient version that performs only one allocation and no additional function calls:
#include <string>
std::string hexify(const unsigned char* buf, unsigned int len)
{
std::string result;
result.reserve(2 * len); // two hex digits per input byte
static char const alphabet[] = "0123456789ABCDEF";
for (unsigned int i = 0; i != len; ++i)
{
result.push_back(alphabet[buf[i] / 16]); // high nibble
result.push_back(alphabet[buf[i] % 16]); // low nibble
}
return result;
}
This should be rather more efficient than using iostreams. You can also modify this trivially to write into a given output buffer, if you prefer a C version which leaves allocation to the consumer.
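A sketch of that output-buffer variant, assuming the caller supplies a buffer of at least 2 * len + 1 bytes. The name hexifyInto and the bool return value are choices made here for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Writes 2 * len hex digits plus a terminating '\0' into out.
// Returns false (and writes nothing) if out is too small.
bool hexifyInto(const unsigned char* buf, std::size_t len,
                char* out, std::size_t outSize)
{
    static const char alphabet[] = "0123456789ABCDEF";
    assert(buf != NULL && out != NULL);
    if (outSize < 2 * len + 1)
        return false;
    for (std::size_t i = 0; i != len; ++i)
    {
        out[2 * i]     = alphabet[buf[i] / 16]; // high nibble
        out[2 * i + 1] = alphabet[buf[i] % 16]; // low nibble
    }
    out[2 * len] = '\0';
    return true;
}
```

Since it performs no allocation and uses no C++ library types, the same function body also works in a plain C translation unit once std::size_t is spelled size_t.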