I am having problems with std::string.
The problem is that '\0' is recognized as the end of the string, as in C-style strings.
For example, the following code:
#include <iostream>
#include <string>
int main ()
{
std::string s ("String!\0 This is a string too!");
std::cout << s.length(); // same result as with s.size()
std::cout << std::endl << s;
return 0;
}
outputs this:
7
String!
What is the problem here? Shouldn't std::string treat '\0' just as any other character?
Think about it: if you are given a const char*, how will you determine which 0 is the true terminator and which one is embedded?
You need to either explicitly pass the size of the string, or construct the string from two iterators (pointers work as iterators here):
#include <string>
#include <iostream>
int main()
{
auto& str = "String!\0 This is a string too!";
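// Note: std::end(str) points past the literal's own trailing '\0', so that
// terminator is copied as well; use std::end(str) - 1 to leave it out.
std::string s(std::begin(str), std::end(str));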
std::cout << s.size() << '\n' << s << '\n';
}
Example: http://coliru.stacked-crooked.com/a/d42211b7199d458d
Edit: @Rakete1111 reminded me about std::string literals:
using namespace std::literals::string_literals;
auto str = "String!\0 This is a string too!"s;
Your std::string really has only 7 characters and a terminating '\0', because that's how you construct it. Look at the list of std::basic_string constructors: There is no array version which would be able to remember the size of the string literal. The one at work here is this one:
basic_string( const CharT* s,
const Allocator& alloc = Allocator() );
The "String!\0 This is a string too!" char const[] array is converted to a pointer to the first char element. That pointer is passed to the constructor and is all information it has. In order to determine the size of the string, the constructor has to increment the pointer until it finds the first '\0'. And that happens to be one inside of the array.
If you happen to work with a lot of zero bytes in your strings, then chances are that std::vector<char> or even std::vector<unsigned char> would be a more natural solution to your problem.
You are constructing your std::string from a string literal. String literals are automatically terminated with a '\0'. A string literal "f\0o" is thus encoded as the following array of characters:
{'f', '\0', 'o', '\0'}
The string constructor taking a char const* will be called, and will be implemented something like this:
string(char const* s) {
auto e = s;
while (*e != '\0') ++e;
m_length = e - s;
m_data = new char[m_length + 1];
memcpy(m_data, s, m_length + 1);
}
Obviously this isn't a technically correct implementation, but you get the idea. The '\0' you manually inserted will be interpreted as the end of the string literal.
If you want the embedded '\0' to be kept as an ordinary character, you can use a std::string literal (C++14):
#include <iostream>
#include <string>
int main ()
{
using namespace std::string_literals;
std::string s("String!\0 This is a string too!"s);
std::cout << s.length(); // same result as with s.size()
std::cout << std::endl << s;
return 0;
}
Output:
30
String! This is a string too!
\0 is the terminating character, so you'll need to escape it somehow.
Take this as an example.
Whenever you want a special character to be taken literally, use two backslashes: "\\0"
And '\\0' is a two-character literal (a backslash followed by the digit 0), not a null character.
std::string test = "Test\\0 Test";
Results :
Test\0 Test
Many beginners make the same mistake when opening files, for example:
std::ifstream some_file("\new_dir\test.txt"); //Wrong
//You should be using it like this :
std::ifstream some_file("\\new_dir\\test.txt"); //Correct
In very few words, you're constructing your C++ string from a standard C string.
And standard C strings are zero-terminated. So your C string parameter ends at the first \0 character found, and that is the one you explicitly wrote in your string "String!\0 This is a string too!",
not the second one that the compiler implicitly and automatically appends at the end of your C string.
Escape your \0
std::string s ("String!\\0 This is a string too!");
and you will get what you need:
31
String!\0 This is a string too!
That's not a problem, that's the intended behavior.
Maybe you could elaborate why you have a \0 in your string.
Using a std::vector would allow you to use \0 in your string.
Related
#include <iostream>
using namespace std;
int main()
{
string a("Hello World",20);
cout<<a<<endl;
return 0;
}
I get output as "Hello WorldP". Why?
Usually we initialise a string with only the data, but here I also gave a size, and it picks up junk characters.
So should I prefer not to give a size?
Generally this is called garbage in, garbage out.
From cppreference:
Constructs the string with the first count characters of character string pointed to by s. s can contain null characters. The length of the string is count. The behavior is undefined if [s, s + count) is not a valid range.
The behavior of your program is undefined because "Hello World" is a const char[12] and trying to access characters up to index 20 via the const char* (resulting from the array decaying to pointer to its first element) is out of bounds.
The actual use case for that constructor is to create a std::string from a substring of some C-string, for example:
std::string s("Hello World",5); // s == "Hello"
Or to create a std::string from a C-string that contains \0 in the middle, for example:
std::string s("\0 Hello",5); // s.size() == 5 (not 0)
I am facing an issue with string and null characters in C++.
When I write '\0' in the middle of a string literal and print the string, I get only the part before the '\0'. On the other hand, when I read the string as input and then set one of its characters to '\0', it prints differently. Why is that, and why is sizeof(string) 32 in both cases?
The code is below for reference. Please help.
First code:
#include<iostream>
using namespace std;
int main(){
string s = "he\0llo";
cout<<s.length()<<"\n";
cout<<s<<endl;
cout<<sizeof(s)<<"\n";
}
output of First code:
2\n
he\n
32\n
Second code
#include<iostream>
using namespace std;
int main(){
string s;
cin>>s;
s[1] = '\0';
cout<<s<<"\n";
cout<<s.length()<<"\n";
cout<<sizeof(s)<<"\n";
return 0;
}
output of second code:
hllo\n
5\n
32\n
std::string's implicit const CharT* constructor and assignment operator don't know the exact length of the string argument. They only know that the argument is a const char* and are forced to assume that the string is null-terminated, so they compute the length using std::char_traits::length(...) (effectively std::strlen).
As a result, constructing a std::string object with an expression like:
std::string s = "he\0llo";
will compute 2 as the length of the string, since it assumes the first \0 character is the null terminator for the string, whereas your second example of:
s[1] = '\0';
is simply adding a null character into an already constructed string -- which does not change the size of the string.
If you want to construct a string with a null character in the middle, you can't let it compute the length of the string for you. Instead, you will have to construct the std::string and give it the length in some other way. This could either be done with the string(const char*, size_t) constructor, or with an iterator pair if this is an array:
// Specify the length manually
std::string s{"he\0llo",6};
// Using iterators to a different container (such as an array)
const char c_str[] = "he\0llo";
std::string s{std::begin(c_str), std::end(c_str)};
Note: sizeof(s) is telling you the size of std::string class itself in bytes for your implementation of the standard library. This does not tell you the length of the contained string -- which can be determined from either s.length() or s.size().
As of C++14, you can write a quoted string as a std::string by using the s suffix from std::literals. This avoids the conversion from an array of chars to a string, which stops at the first null character.
#include<iostream>
#include <string>
using namespace std;
using namespace std::literals;
int main() {
string s = "he\0llo"s; // This initializes s to the full 6 char sequence.
cout << s.length() << "\n";
cout << s << endl;
cout << sizeof(s) << "\n"; // prints size of the s object, not the size of its contents
}
Results:
6
hello
28
My program takes in a vector std::vector<std::string> vector and a character char separator and returns a string with all of the strings added together between the separator character. The concept is: vector[0] + separator + vector[1] + separator
The Code
std::string VectorToString(std::vector<std::string> vector, char separator)
{
std::string output;
for(std::string segment : vector)
{
std::string separator_string(&separator);
output += segment + separator_string;
}
return output;
}
int main()
{
std::vector<std::string> vector = {"Hello", "my", "beautiful", "people"};
std::cout << VectorToString(vector, ' ');
}
My expected output is Hello my beautiful people
However the output is:
Hello �����my �����beautiful �����people �����
What I have found is that something is wrong with the character, specifically its pointer: std::cout << &separator; -> �ƚ��. However if I do like this: std::cout << (void*) &separator; -> 0x7ffee16d35f7. Though I don't really know what (void*) does.
Question:
1. What is happening?
2. Why is it happening?
3. How do I fix it?
4. How do I prevent it from happening in future projects?
This line
std::string separator_string(&separator);
tries to construct a string from a 0-terminated C string.
But &separator does not point to a 0-terminated string: whether a 0 byte follows separator depends on whatever happens to be in the adjacent memory (probably not right away). Reading past separator like this is undefined behavior.
What you can do is use another constructor:
std::string separator_string(1, separator);
This one creates a string by repeating separator character 1 time.
By the standard, the following constructor:
basic_string( const CharT* s,
const Allocator& alloc = Allocator() );
means this:
Constructs the string with the contents initialized with a copy of the null-terminated character string pointed to by s. The length of the string is determined by the first null character. The behavior is undefined if [s, s + Traits::length(s)) is not a valid range.
Therefore, std::string separator_string(&separator); causes undefined behavior, because &separator does not point to a null-terminated string.
To prevent this, you might want to use the following overload:
basic_string( const CharT* s,
size_type count,
const Allocator& alloc = Allocator() );
like std::string separator_string(&separator, 1); or, even simpler (as the other answer pointed out), std::string separator_string(1, separator);.
I'm trying to remove the last character of a std::string_view, but no matter what I do it remains there. I think it's because I'm accidentally removing the "\0" instead of the desired "]".
Here is my code:
#include <iostream>
#include <tstr/tstring.h>
#include <cstring>
template<typename Class>
constexpr const char* to_string() {
std::string_view str = __PRETTY_FUNCTION__;
auto first = str.find("= ");
auto last = str.find("]");
auto str2 = str.substr(first + 2, last - first + 1);
return str2.data();
}
class Foo {};
int main()
{
std::cout << to_string<Foo>() << std::endl;
return 0;
}
This outputs Foo]. How can I remove the trailing ] ?
Thanks.
If you insert a pointer to char into a character stream, the pointed string is printed until the null terminator is reached. If there is no null terminator, then the behaviour of the program is undefined.
std::string_view is not guaranteed to be null terminated. Therefore it is dangerous to insert std::string_view::data into a character stream. In this particular case, the string view points to a non-null-terminated substring within a null terminated string, so the behaviour is well defined, but not what you intended because the output will proceed to the outside of the substring.
How can I remove the trailing ] ?
Return the string view to the substring rather than a pointer:
constexpr std::string_view to_string() {
...
return str2;
}
#include<iostream>
using namespace std;
int main()
{
char s1[80]={"This is a developed country."};
char *s2[8];
s2[0]=&s1[10];
cout<<*s2; //Predicted OUTPUT: developed
// Actual OUTPUT: developed country.
return 0;
}
I want cout<<*s2; to print only the word "developed", so I declared *s2[8] with a length of 8 characters. What can I do so that cout<<*s2 prints only up to 8 characters? I'm using the dmc, lcc and OpenWatcom compilers. This is only a small part of a bigger program where I'm using the string data type, so what can I do now? Thanks a lot for answering my question :)
s2 is a length 8 array of pointers to char. You are making its first element point to s1 starting at position 10. That is all. You are not using the remaining elements of that array. Therefore the length of s2 is irrelevant.
You could have done this instead:
char* s2 = &s1[10];
If you want to create a string out of part of s1, you can use std::string:
std::string s3(s1+10, s1+19);
std::cout << s3 << std::endl;
Note that this allocates its own memory buffer and holds a copy of the original character sequence. If you only want a view of part of another string, you can easily implement a class holding a begin and a one-past-the-end pointer to the original. Here's a rough sketch:
struct string_view
{
typedef const char* const_iterator;
template <typename Iter>
string_view(Iter begin, Iter end) : begin(begin), end(end) {}
const_iterator begin;
const_iterator end;
};
std::ostream& operator<<(std::ostream& o, const string_view& s)
{
for (string_view::const_iterator i = s.begin; i != s.end; ++i)
o << *i;
return o;
}
then
int main()
{
char s1[] = "This is a developed country.";
string_view s2(s1+10, s1+19);
std::cout << s2 << std::endl;
}
s2 is an array of char* (pointers to char). You are only ever using the zeroth element in this array.
&s1[10] points to the 11th character in the string s1. That address is assigned to the zeroth element of s2.
In the cout statement, *s2 is equivalent to s2[0];. So cout << *s2; outputs the zeroth element of s2, which has been assigned to the 11th character of s1. cout will trundle along the memory until the null-terminator of your string is reached.
C strings must be null-terminated (with '\0'). The start of s2 is fine, but cout will continue reading until the end of s1. You have to actually copy the data rather than simply setting the pointer if you want to output it this way.
Your mistake is thinking that
char *s2[8];
declares a pointer to an array of 8 characters (or, not equivalently, a pointer to a string-with-exactly-8-characters). It doesn't do either of those. Instead of declaring a pointer-to-an-array, it declares an array-of-pointers.
If you want s2 to be a pointer to an array-of-8-characters, you need:
char (*s2)[8];
But, that's still messed up. You ask:
What can I do so that the variable *s2 will store only up to its length?
Do you think its length is 8? Before trying to answer that, return to your definition of s1:
char s1[80]={"This is a developed country."};
Is the length 80, or 28? The answer is either, depending on how you define 'length' - the length of the array or the length up to the null terminator?
All of these misconceptions about size are unhelpful. As @n.m. has pointed out in a comment, the solution to all pointer problems in C++ is to stop using pointers. (Apologies if I've mis-paraphrased n.m.!)
#include<iostream>
#include <string>
using namespace std;
int main()
{
string s1="This is a developed country.";
string s2;
s2 = s1.substr(10, 9);
cout << s2;
return 0;
}
If you want to do it ghetto style and skip the std::string for some reason you can always use strncpy, memcpy or strstr etc.
#include <cstring>
#include <iostream>
int main()
{
char s1[80]="This is a developed country.";
char s2[10];
strncpy(s2,s1+10,9);
s2[9] = '\0';
std::cout << s2 << std::endl;
std::cin.get();
return 0;
}
s2 is an array whose elements are of type char*, so an element cannot store a string by itself; it can only point to one. If you want to get "developed" out of the text, you can write code like this:
#include<iostream>
using namespace std;
int main()
{
const char *s1[]={"This", "is", "a", "developed", "country."}; // string literals are const
const char *s2[8];
s2[0]= s1[3]; // point at the fourth word, which has its own terminator
cout<<s2[0]; // OUTPUT: developed
return 0;
}