Undefined behavior when converting char to std::string in C++

My program takes in a vector std::vector<std::string> vector and a character char separator and returns a string with all of the strings added together between the separator character. The concept is: vector[0] + separator + vector[1] + separator
The Code
std::string VectorToString(std::vector<std::string> vector, char separator)
{
std::string output;
for(std::string segment : vector)
{
std::string separator_string(&separator);
output += segment + separator_string;
}
return output;
}
int main()
{
std::vector<std::string> vector = {"Hello", "my", "beautiful", "people"};
std::cout << VectorToString(vector, ' ');
}
My expected output is Hello my beautiful people
However the output is:
Hello �����my �����beautiful �����people �����
What I have found is that something is wrong with the character, specifically its pointer: std::cout << &separator; -> �ƚ��. However if I do like this: std::cout << (void*) &separator; -> 0x7ffee16d35f7. Though I don't really know what (void*) does.
Question:
1. What is happening?
2. Why is it happening?
3. How do I fix it?
4. How do I prevent it from happening in future projects?

This line
std::string separator_string(&separator);
tries to construct a string from a 0-terminated C string.
But &separator does not point to a 0-terminated string: the constructor keeps reading the bytes of memory that happen to follow separator until it finds a 0 byte, and nothing guarantees there is one nearby. So you're getting undefined behavior.
What you can do is to use other constructor:
std::string separator_string(1, separator);
This one creates a string by repeating separator character 1 time.
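Since the separator is already a single char, it can also be appended directly, without building a temporary string at all. Below is a minimal sketch of the function from the question rewritten that way. The names JoinWithTrailingSeparator and CheckJoin are made up for this illustration, and like the original it still leaves a separator after the last element.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical rewrite of VectorToString from the question.
// Takes the vector by const reference to avoid copying it.
std::string JoinWithTrailingSeparator(const std::vector<std::string>& parts,
                                      char separator)
{
    std::string output;
    for (const std::string& segment : parts)
    {
        output += segment;
        output += separator;  // operator+=(char) appends exactly one character
    }
    return output;
}

// Small self-check; call it from main() to try the sketch.
void CheckJoin()
{
    const std::vector<std::string> words = {"Hello", "my", "beautiful", "people"};
    assert(JoinWithTrailingSeparator(words, ' ') == "Hello my beautiful people ");
}
```

No pointer to the char is ever taken here, so there is no C string for the constructor to misread.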

By the standard, following:
basic_string( const CharT* s,
const Allocator& alloc = Allocator() );
means this:
Constructs the string with the contents initialized with a copy of the null-terminated character string pointed to by s. The length of the string is determined by the first null character. The behavior is undefined if [s, s + Traits::length(s)) is not a valid range.
Therefore, std::string separator_string(&separator); causes undefined behavior, because &separator points to a single char that is not followed by a null terminator.
To prevent this, you might want to use the following overload:
basic_string( const CharT* s,
size_type count,
const Allocator& alloc = Allocator() );
like std::string separator_string(&separator, 1); or, even simpler (as the other answer pointed out), std::string separator_string(1, separator);.
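As a quick sanity check, the well-defined ways of turning one char into a std::string should all produce the same one-character string. The sketch below compares them; the function name CheckSingleCharConstructors is only for illustration.

```cpp
#include <cassert>
#include <string>

void CheckSingleCharConstructors()
{
    char separator = ' ';

    std::string by_count(1, separator);     // (count, char): repeat the char once
    std::string by_pointer(&separator, 1);  // (pointer, count): read exactly one char
    std::string by_append;
    by_append.push_back(separator);         // start empty, append one char

    assert(by_count.size() == 1);
    assert(by_count == " ");
    assert(by_pointer == by_count);
    assert(by_append == by_count);
}
```

Each variant states explicitly how many characters to read, so none of them depends on what follows separator in memory.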

Related

why string prints junk value,if we give it size?

#include <iostream>
using namespace std;
int main()
{
string a("Hello World",20);
cout<<a<<endl;
return 0;
}
I get output as "Hello WorldP". Why?
Usually we initialise a string with data only. But here I gave a size as well, and the string picks up junk characters.
So should I prefer not giving a size?
Generally this is called garbage in, garbage out.
From cppreference:
Constructs the string with the first count characters of character string pointed to by s. s can contain null characters. The length of the string is count. The behavior is undefined if [s, s + count) is not a valid range.
The behavior of your program is undefined because "Hello World" is a const char[12] and trying to access characters up to index 20 via the const char* (resulting from the array decaying to pointer to its first element) is out of bounds.
The actual use case for that constructor is to create a std::string from a substring of some C-string, for example:
std::string s("Hello World",5); // s == "Hello"
Or to create a std::string from a C-string that contains \0 in the middle, for example:
std::string s("\0 Hello",5); // s.size() == 5 (not 0)
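The valid uses of the counted constructor can be checked with a few assertions. This is only a sketch; the array names and the function CheckCountedConstructor are invented for the example. The key point is that count must never exceed the number of bytes actually available behind the pointer.

```cpp
#include <cassert>
#include <string>

void CheckCountedConstructor()
{
    const char text[] = "Hello World";  // 11 characters + '\0' = 12 bytes

    std::string prefix(text, 5);        // substring of a C string
    assert(prefix == "Hello");

    // sizeof(text) - 1 is the largest count that still excludes the terminator.
    std::string whole(text, sizeof(text) - 1);
    assert(whole.size() == 11);

    const char with_nul[] = "\0 Hello";  // 7 characters + '\0' = 8 bytes
    std::string embedded(with_nul, 5);   // the embedded '\0' is copied like any char
    assert(embedded.size() == 5);
    assert(embedded[0] == '\0');

    // Without a count, the constructor stops at the first '\0'.
    assert(std::string(with_nul).empty());
}
```

Deriving the count from sizeof (or from a length you already know) keeps it tied to the real buffer instead of a guessed number such as 20.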

Error on string '\0' null while concatenating

So I am trying to concatenate simple strings, and make a final sentence.
int main()
{
string I ("I");
string Love ("Love");
string STL ("STL,");
string Str ("String.");
string fullSentence = '\0';
// Concatenate
fullSentence = I + " " + Love + " " + STL + " " + Str;
cout << fullSentence;
return 0;
}
Here, I didn't want to have "fullSentence" with nothing, so I assigned null and it gives me an error. There is no certain error message, except the following which I do not understand at all... :
Exception thrown at 0x51C3F6E0 (ucrtbased.dll) in exercise_4.exe: 0xC0000005: Access violation reading location 0x00000000. occurred
As soon as I remove '\0', it works just fine. Why is that?
It appears to be an MSVC compiler bug to me.
The statement:
string fullSentence = '\0';
is not supposed to compile.
Indeed, there is no valid (implicit) constructor from char (i.e. '\0') to std::string. Reference Here.
Note that gcc and clang do not accept this code as valid.
MSVC does.
Why is that?
Looking at the assembly code, MSVC compiles that statement with the following constructor:
std::string::string(char const * const);
MSVC treats '\0' (a constant with value zero) as a null pointer constant, so the argument actually passed to this constructor is a null pointer.
So:
Constructs the string with the contents initialized with a copy of the null-terminated character string pointed to by s. The length of the string is determined by the first null character. The behavior is undefined if [s, s + Traits::length(s)) is not a valid range (for example, if s is a null pointer).
So your code is undefined behavior.
Put "\0" instead of '\0'. In C++ '' is for char and "" is for strings.
It's a conversion from char to non-scalar type std::string
You could use a debugger to see a call stack of what happened.
Looking into string class here
Below constructor was called in your case:
basic_string( const CharT* s, const Allocator& alloc = Allocator() );
As per the description of the constructor: Constructs the string with the contents initialized with a copy of the null-terminated character string pointed to by s. The length of the string is determined by the first null character. The behavior is undefined if [s, s + Traits::length(s)) is not a valid range.
In your case s is a null pointer, so the range is not valid.
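For completeness: a default-constructed std::string is already empty, so there is nothing to assign before concatenating. A minimal sketch of the question's code without the '\0' initializer (wrapped in a function named BuildSentence purely for illustration):

```cpp
#include <cassert>
#include <string>

std::string BuildSentence()
{
    const std::string I("I");
    const std::string Love("Love");
    const std::string STL("STL,");
    const std::string Str("String.");

    std::string fullSentence;      // default constructor: a valid, empty string
    assert(fullSentence.empty());

    // Concatenate
    fullSentence = I + " " + Love + " " + STL + " " + Str;
    return fullSentence;
}
```

An explicit std::string fullSentence = ""; would behave the same way; what has to be avoided is any initializer that ends up as a null pointer.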

std::string stops at \0

I am having problems with std::string..
Problem is that '\0' is being recognized as end of the string as in C-like strings.
For example following code:
#include <iostream>
#include <string>
int main ()
{
std::string s ("String!\0 This is a string too!");
std::cout << s.length(); // same result as with s.size()
std::cout << std::endl << s;
return 0;
}
outputs this:
7
String!
What is the problem here? Shouldn't std::string treat '\0' just as any other character?
Think about it: if you are given a const char*, how will you determine which 0 is the true terminator and which is an embedded one?
You need to either explicitly pass the size of the string, or construct the string from two iterators (pointers):
#include <string>
#include <iostream>
int main()
{
auto& str = "String!\0 This is a string too!";
std::string s(std::begin(str), std::end(str));
std::cout << s.size() << '\n' << s << '\n';
}
Example: http://coliru.stacked-crooked.com/a/d42211b7199d458d
Edit: @Rakete1111 reminded me about string literals:
using namespace std::literals::string_literals;
auto str = "String!\0 This is a string too!"s;
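The other option, passing the size explicitly, could look roughly like the sketch below. It assumes the literal is first stored in an array so that sizeof can supply the length; the names text and CheckExplicitSize are made up for the example.

```cpp
#include <cassert>
#include <iterator>
#include <string>

void CheckExplicitSize()
{
    const char text[] = "String!\0 This is a string too!";

    // sizeof(text) counts every byte of the array, including the final '\0',
    // so subtract 1 to keep only the 30 visible-plus-embedded characters.
    std::string counted(text, sizeof(text) - 1);
    assert(counted.size() == 30);
    assert(counted[7] == '\0');

    // The iterator pair from std::begin/std::end covers the whole array,
    // so that string also contains the trailing terminator.
    std::string from_iterators(std::begin(text), std::end(text));
    assert(from_iterators.size() == 31);

    // The plain const char* constructor still stops at the first '\0'.
    std::string truncated(text);
    assert(truncated == "String!");
}
```

Whether the trailing '\0' belongs in the result is a design choice; the sizeof(text) - 1 form drops it, the iterator form keeps it.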
Your std::string really has only 7 characters and a terminating '\0', because that's how you construct it. Look at the list of std::basic_string constructors: There is no array version which would be able to remember the size of the string literal. The one at work here is this one:
basic_string( const CharT* s,
const Allocator& alloc = Allocator() );
The "String!\0 This is a string too!" char const[] array is converted to a pointer to the first char element. That pointer is passed to the constructor and is all information it has. In order to determine the size of the string, the constructor has to increment the pointer until it finds the first '\0'. And that happens to be one inside of the array.
If you happen to work with a lot of zero bytes in your strings, then chances are that std::vector<char> or even std::vector<unsigned char> would be a more natural solution to your problem.
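As a rough illustration of that suggestion, a std::vector<unsigned char> stores zero bytes like any other value, because its size is tracked separately from its contents. The byte values and the names MakePacket and CheckPacket below are invented for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical binary payload with several zero bytes in it.
std::vector<unsigned char> MakePacket()
{
    const unsigned char raw[] = {0x01, 0x00, 0x00, 0x7F, 0x00};
    return std::vector<unsigned char>(raw, raw + sizeof(raw));
}

void CheckPacket()
{
    const std::vector<unsigned char> packet = MakePacket();
    assert(packet.size() == 5);  // zeros do not end the data
    assert(std::count(packet.begin(), packet.end(), 0) == 3);
}
```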
You are constructing your std::string from a string literal. String literals are automatically terminated with a '\0'. A string literal "f\0o" is thus encoded as the following array of characters:
{'f', '\0', 'o', '\0'}
The string constructor taking a char const* will be called, and will be implemented something like this:
string(char const* s) {
auto e = s;
while (*e != '\0') ++e;
m_length = e - s;
m_data = new char[m_length + 1];
memcpy(m_data, s, m_length + 1);
}
Obviously this isn't a technically correct implementation, but you get the idea. The '\0' you manually inserted will be interpreted as the end of the string.
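The array layout and the effect of the pointer-only constructor described above can be verified with a few assertions. This is just a sketch; the function name CheckLiteralLayout is made up.

```cpp
#include <cassert>
#include <cstring>
#include <string>

void CheckLiteralLayout()
{
    const char literal[] = "f\0o";  // {'f', '\0', 'o', '\0'}

    assert(sizeof(literal) == 4);       // the array keeps all four bytes
    assert(std::strlen(literal) == 1);  // scanning stops at the first '\0'

    assert(std::string(literal).size() == 1);     // pointer only: same scan
    assert(std::string(literal, 3).size() == 3);  // explicit count: no scan
}
```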
If you want to ignore the extra '\0', you can use a std::string literal:
#include <iostream>
#include <string>
int main ()
{
using namespace std::string_literals;
std::string s("String!\0 This is a string too!"s);
std::cout << s.length(); // same result as with s.size()
std::cout << std::endl << s;
return 0;
}
Output:
30
String! This is a string too!
\0 is treated as the terminating character, so if you want the text \0 to appear literally you need to escape the backslash.
Take this as an example.
Whenever you want a backslash sequence to appear literally, write two backslashes: "\\0"
And '\\0' is a two-character literal
std::string test = "Test\\0 Test"
Results :
Test\0 Test
Many beginners make the same mistake with file paths:
std::ifstream some_file("\new_dir\test.txt"); //Wrong
//You should be using it like this :
std::ifstream some_file("\\new_dir\\test.txt"); //Correct
In very few words, you're constructing your C++ string from a standard C string.
And standard C strings are zero-terminated, so your C string parameter ends at the first \0 character found. That character is the one you explicitly provided in your string "String!\0 This is a string too!",
not the second one that the compiler implicitly and automatically appends to the end of the string literal.
Escape your \0
std::string s ("String!\\0 This is a string too!");
and you will get what you need:
31
String!\0 This is a string too!
That's not a problem, that's the intended behavior.
Maybe you could elaborate why you have a \0 in your string.
Using a std::vector would allow you to use \0 in your string.

string.c_str() is const? [duplicate]

I have a function in a library that takes in a char* and modifies the data.
I tried to give it the c_str() but c++ docs say it returns a const char*.
What can I do other than newing a char array and copying it into that?
You can use &str[0] or &*str.begin() as long as:
you preallocate explicitly all the space needed for the function with resize();
the function does not try to exceed the preallocated buffer size (you should pass str.size() as the argument for the buffer size);
when the function returns, you explicitly trim the string at the first \0 character you find, otherwise str.size() will return the "preallocated size" instead of the "logical" string size.
Notice: this is guaranteed to work in C++11 (where strings are guaranteed to be contiguous), but not in previous revisions of the standard; still, no implementation of the standard library that I know of ever did implement std::basic_string with noncontiguous storage.
Still, if you want to go safe, use std::vector<char> (guaranteed to be contiguous since C++03); initialize with whatever you want (you can copy its data from a string using the constructor that takes two iterators, adding a null character in the end), resize it as you would do with std::string and copy it back to a string stopping at the first \0 character.
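The three conditions listed in this answer (preallocate, pass the size, trim afterwards) can be sketched as follows. LegacyFill is a hypothetical stand-in for the library function from the question, and CallLegacy is likewise an invented name; the sketch assumes C++11 or later, where std::string storage is contiguous.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for the C-style library function: writes a
// null-terminated string of at most buffer_size - 1 characters.
void LegacyFill(char* buffer, std::size_t buffer_size)
{
    if (buffer_size == 0) return;
    std::strncpy(buffer, "written by C code", buffer_size - 1);
    buffer[buffer_size - 1] = '\0';
}

std::string CallLegacy(std::size_t capacity)
{
    std::string str;
    str.resize(capacity);                    // 1. preallocate all the space
    if (!str.empty())
        LegacyFill(&str[0], str.size());     // 2. pass the real buffer size
    str.resize(std::strlen(str.c_str()));    // 3. trim at the first '\0'
    return str;
}
```

Step 3 relies on the fact that c_str() is always null-terminated, so strlen cannot run past the end of the string even if the callee wrote no terminator.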
Nothing.
Because std::string manages its own contents, c_str() only gives you read access to the underlying data. Writing through that pointer is undefined behavior.
However, creating and copying a char array is not hard:
std::string original("text");
std::vector<char> char_array(original.begin(), original.end());
char_array.push_back(0);
some_function(&char_array[0]);
If you know that the function will not modify anything beyond str.size(), you can obtain a pointer in one of several ways:
void f( char* p, size_t s ); // update s characters in p
int main() {
std::string s=...;
f( &s[0], s.size() );
f( &s.front(), s.size() );
}
Note that this is guaranteed in C++11, but not in previous versions of the standard, which allowed rope implementations (i.e. non-contiguous memory).
If your implementation will not try to increase the length of the string then:
C++11:
std::string data = "This is my string.";
func(&*data.begin());
C++03:
std::string data = "This is my string.";
std::vector<char> arr(data.begin(), data.end());
func(&arr[0]);
Here's a class that will generate a temporary buffer and automatically copy it to the string when it's destroyed.
class StringBuffer
{
public:
StringBuffer(std::string & str) : m_str(str)
{
m_buffer.push_back(0);
}
~StringBuffer()
{
m_str = &m_buffer[0];
}
char * Size(int maxlength)
{
m_buffer.resize(maxlength + 1, 0);
return &m_buffer[0];
}
private:
std::string & m_str;
std::vector<char> m_buffer;
};
And here's how you would use it:
// this is from a crusty old API that can't be changed
void GetString(char * str, int maxlength);
std::string mystring;
GetString(StringBuffer(mystring).Size(MAXLEN), MAXLEN);
If you think you've seen this code before, it's because I copied it from a question I wrote: Guaranteed lifetime of temporary in C++?

strncpy equivalent for std::string?

Is there an exact equivalent to strncpy in the C++ Standard Library? I mean a function that copies a string from one buffer to another until it hits the terminating 0? For instance, when I have to parse strings from an unsafe source, such as TCP packets, I want to be able to perform length checks while copying the data.
I have already searched a lot on this topic and found some interesting threads, but all of those people were happy with std::string::assign, which can also take the number of characters to copy as a parameter. My problem with this function is that it doesn't check whether a terminating null has already been hit: it takes the given size at face value and copies the data into the string's buffer just like memcpy would. This way much more memory is allocated and copied than necessary, compared to performing such a check while copying.
That's the way I'm working around this problem currently, but there is some overhead I'd wish to avoid:
// Get RVA of export name
const ExportDirectory_t *pED = (const ExportDirectory_t*)rva2ptr(exportRVA);
sSRA nameSra = rva2sra(pED->Name);
// Copy it into my buffer
char *szExportName = new char[nameSra.numBytesToSectionsEnd];
strncpy(szExportName,
nameSra.pSection->pRawData->constPtr<char>(nameSra.offset),
nameSra.numBytesToSectionsEnd);
szExportName[nameSra.numBytesToSectionsEnd - 1] = 0;
m_exportName = szExportName;
delete [] szExportName;
This piece of code is part of my parser for PE binaries (of the routine parsing the export table, to be exact). rva2sra converts a relative virtual address into a PE-section relative address. The ExportDirectory_t structure contains the RVA to the export name of the binary, which should be a zero-terminated string. But that doesn't always have to be the case: someone could deliberately omit the terminating zero, which would make my program run into memory that doesn't belong to the section, where it would finally crash (in the best case...).
It wouldn't be a big problem to implement such a function by myself, but I'd prefer it if there were a solution for this implemented in the C++ Standard Library.
If you know that the buffer you want to make a string out of has at least one NUL in it then you can just pass it to the constructor:
const char buffer[] = "hello\0there";
std::string s(buffer);
// s contains "hello"
If you're not sure, then you just have to search the string for the first null, and tell the constructor of string to make a copy of that much data:
int len_of_buffer = something;
const char* buffer = somethingelse;
const char* copyupto = std::find(buffer, buffer + len_of_buffer, 0); // find the first NUL
std::string s(buffer, copyupto);
// s now contains all the characters up to the first NUL from buffer, or if there
// was no NUL, it contains the entire contents of buffer
You can wrap the second version (which always works, even if there isn't a NUL in the buffer) up into a tidy little function:
std::string string_ncopy(const char* buffer, std::size_t buffer_size) {
const char* copyupto = std::find(buffer, buffer + buffer_size, 0);
return std::string(buffer, copyupto);
}
But one thing to note: if you hand the single-argument constructor a const char* by itself, it will go until it finds a NUL. It is important that you know there is at least one NUL in the buffer if you use the single-argument constructor of std::string.
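To illustrate the difference, here is a short sketch that exercises string_ncopy on a buffer with no NUL at all, which is exactly the case where the single-argument constructor would be unsafe. The function is repeated so the example is self-contained, and the test buffers and CheckStringNcopy are invented for the illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

std::string string_ncopy(const char* buffer, std::size_t buffer_size)
{
    // Search only inside [buffer, buffer + buffer_size) for the first NUL.
    const char* copyupto = std::find(buffer, buffer + buffer_size, '\0');
    return std::string(buffer, copyupto);
}

void CheckStringNcopy()
{
    // No terminator anywhere: the whole buffer is copied, nothing beyond it.
    const char unterminated[4] = {'a', 'b', 'c', 'd'};
    assert(string_ncopy(unterminated, sizeof(unterminated)) == "abcd");

    // Terminator inside the buffer: copying stops there.
    const char terminated[8] = "ab";  // remaining bytes are zero-initialized
    assert(string_ncopy(terminated, sizeof(terminated)) == "ab");

    // A zero-sized buffer yields an empty string.
    assert(string_ncopy(unterminated, 0).empty());
}
```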
Unfortunately (or fortunately), there is no built in perfect equivalent of strncpy for std::string.
The std::string class in STL can contain null characters within the string ("xxx\0yyy" is a perfectly valid string of length 7). This means that it doesn't know anything about null termination (well almost, there are conversions from/to C strings). In other words, there's no alternative in the STL for strncpy.
There are a few ways to still accomplish your goal with a shorter code:
const char *ptr = nameSra.pSection->pRawData->constPtr<char>(nameSra.offset);
m_exportName.assign(ptr, strnlen(ptr, nameSra.numBytesToSectionsEnd));
or
const char *ptr = nameSra.pSection->pRawData->constPtr<char>(nameSra.offset);
m_exportName.reserve(nameSra.numBytesToSectionsEnd);
for (int i = 0; i < nameSra.numBytesToSectionsEnd && ptr[i]; i++)
m_exportName += ptr[i];
Is there an exact equivalent to strncpy in the C++ Standard Library?
I certainly hope not!
I mean a function, that copies a string from one buffer to another until it hits the terminating 0?
Ah, but that's not what strncpy() does -- or at least it's not all it does.
strncpy() lets you specify the size, n, of the destination buffer, and copies at most n characters. That's fine as far as it goes. If the length of the source string ("length" defined as the number of characters preceding the terminating '\0') is less than n, the destination buffer is padded with additional \0's, something that's rarely useful. And if the length of the source string is n or more, then the terminating '\0' is not copied.
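Both behaviors of strncpy() are easy to observe by looking at every byte of the destination afterwards. The helper BytesAfterStrncpy below is a made-up name for this sketch; it pre-fills the destination with 'x' so that any byte strncpy leaves untouched would be visible, and it assumes n is greater than zero.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Returns all n destination bytes after strncpy, including any '\0' padding.
// Assumes n > 0.
std::string BytesAfterStrncpy(const char* source, std::size_t n)
{
    std::vector<char> dest(n, 'x');
    std::strncpy(dest.data(), source, n);
    return std::string(dest.data(), n);  // counted constructor keeps every byte
}

void CheckStrncpy()
{
    // Source shorter than n: the rest of the buffer is padded with '\0'.
    assert(BytesAfterStrncpy("abc", 6) == std::string("abc\0\0\0", 6));

    // Source of length n or more: n characters copied, no terminator written.
    const std::string truncated = BytesAfterStrncpy("abcdef", 3);
    assert(truncated == "abc");
    assert(truncated.find('\0') == std::string::npos);
}
```

Some compilers warn about the second call precisely because the result is not null-terminated.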
The strncpy() function was designed for the way early Unix systems stored file names in directory entries: as a 14-byte fixed-size buffer that can hold up to a 14-character name. (EDIT: I'm not 100% sure that was the actual motivation for its design.) It's arguably not a string function, and it's not just a "safer" variant of strcpy().
You can achieve the equivalent of what one might assume strncpy() does (given the name) using strncat():
char dest[SOME_SIZE];
dest[0] = '\0';
strncat(dest, source_string, SOME_SIZE - 1); // leave room for the '\0' that strncat always appends
This will always '\0'-terminate the destination buffer, and it won't needlessly pad it with extra '\0' bytes.
Are you really looking for a std::string equivalent of that?
EDIT : After I wrote the above, I posted this rant on my blog.
There is no built-in equivalent. You have to roll your own strncpy.
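#include <algorithm>
#include <cstddef>
#include <string>
// Copies at most n characters from str, stopping early at the first '\0'.
std::string strncpy(const char* str, const std::size_t n)
{
if (str == NULL || n == 0)
{
return std::string();
}
// Bounded search: unlike std::strlen, this never reads past str + n.
return std::string(str, std::find(str, str + n, '\0'));
}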
The string's substring constructor can do what you want, although it's not an exact equivalent of strncpy (see my notes at the end):
std::string( const std::string& other,
size_type pos,
size_type count = std::string::npos,
const Allocator& alloc = Allocator() );
Constructs the string with a substring [pos, pos+count) of other. If count == npos or if the requested substring lasts past the end of the string, the resulting substring is [pos, size()).
Source: http://www.cplusplus.com/reference/string/string/string/
Example:
#include <iostream>
#include <string>
#include <cstring>
int main ()
{
std::string s0 ("Initial string");
std::string s1 (s0, 0, 40); // count is bigger than s0's length
std::string s2 (40, 'a'); // the 'a' characters will be overwritten
strncpy(&s2[0], s0.c_str(), s2.size());
std::cout << "s1: '" << s1 << "' (size=" << s1.size() << ")" << std::endl;
std::cout << "s2: '" << s2 << "' (size=" << s2.size() << ")" << std::endl;
return 0;
}
Output:
s1: 'Initial string' (size=14)
s2: 'Initial string' (size=40)
Differences with strncpy:
the string constructor always appends a null-terminating character to the result, strncpy does not;
the string constructor does not pad the result with 0s if a null-terminating character is reached before the requested count, strncpy does.
Use the class' constructor:
std::string str1("Hello world!");
std::string str2(str1);
This will yield an exact copy, as per this documentation: http://www.cplusplus.com/reference/string/string/string/
std::string has a constructor with the following signature that can be used:
string ( const char * s, size_t n );
with the following description:
Content is initialized to a copy of the string formed by the first n characters in the array of characters pointed by s.