string.clear() does not remove element properly [duplicate] - c++

This question already has answers here:
Undefined, unspecified and implementation-defined behavior
(9 answers)
Closed 23 days ago.
It seems that the member function clear() of string does remove its content, but the removed content can still be accessed through operator[]. Here's the example that confuses me.
#include <iostream>
#include <string>
using namespace std;
int main()
{
string input = "Weird";
cout << "Your Input: " << input << "\n";
input.clear();
cout << "Your Input: " << input << "\n";
cout << "Your Input: " << input[0] << input[1] << input[2] << input[3] << input[4] << '\n';
return 0;
}
The results are:
Your Input: Weird
Your Input:
Your Input: eird
Why is this happening? If the example above is normal, what should I do to completely remove its content? (accessing input[1] should give '\000')

Accessing elements of a string beyond its size after calling the method clear invokes undefined behavior. Since C++11 only input[0], that is input[input.size()], is guaranteed to be readable, and it yields '\0'; input[1] through input[4] are out of bounds.
It seems that in your case the class std::string uses an internal buffer defined within the class itself for short strings.
After the call to clear the class just sets the first character of the buffer to the terminating zero character '\0'.
To check that the string was cleared, just output its length, for example
std::cout << input.length() << '\n';

Related

Why is stringstream mixing up its content in combination with char array and cout?

In this minimal example there is a weird mix-up between the input to a stringstream and the content of a previously used cout:
online gdb:
https://onlinegdb.com/itO69QGAE
code:
#include <string>
#include <iostream>
#include <sstream>
using namespace std;
const char sepa[] = {':', ' '};
const char crlf[] = {'\r', '\n'};
int main()
{
cout<<"Hello World" << endl;
stringstream s;
string test1 = "test_01";
string test2 = "test_02";
s << test1;
cout << s.str() << endl;
// works as expected
// expecting: "test_01"
// output: "test_01"
s << sepa;
cout << s.str() << endl;
// messing up with previous cout output
// expecting: "test_01: "
// output: "test_01: \nHello World"
s << test2;
cout << s.str() << endl;
// s seems to be polluted
// expecting: "test_01: test_02"
// output: "test_01: \nHello Worldtest_02"
s << crlf;
cout << s.str() << endl;
// once again messing up with the cout content
// expecting: "test_01: test_02\r\n"
// output: "test_01: Hello Worldtest_02\r\nHello World"
return 0;
}
So I am wondering why this is happening.
As it only happens when a char array is pushed into the stringstream, it's likely about this... but according to the reference the stringstream's "<<" operator can/should handle char* (which is what the name of this array actually stands for).
Besides that, there seems to be a (hidden, or at least not obvious?) relation between stringstream and cout. So why does the cout content leak into the stringstream?
Is there any wrong/foolish usage in this example or where is the dog buried (-> german idiom :P )?
Best regards and thanks
Damian
P.S. My question is not about "fixing" this issue, e.g. by using a string instead of the char array (this will work)... it's about comprehending the internal mechanics and why this is actually happening, because for me this is just unexpected behaviour.
The std::stringstream::str() function returns a string containing all characters previously written into the stream, in all previous calls to operator<< (or other output functions). However it seems that you expect that only the last output operation will be returned - this is not the case.
This is analogous to how e.g. std::cout works: each invocation of std::cout << appends the string to standard output; it does not clear the console's screen.
To achieve what you want, you either need to use a separate std::stringstream instance every time:
std::stringstream s1;
s1 << test1;
std::cout << s1.str() << std::endl;
std::stringstream s2;
s2 << sepa;
std::cout << s2.str() << std::endl;
Or better, clear the contents of the std::stringstream using the single argument overload of the str() function:
std::stringstream s;
s << test1;
std::cout << s.str() << std::endl;
// reset the contents of s to an empty string
s.str("");
s << sepa;
std::cout << s.str() << std::endl;
The s.str("") call effectively discards all characters previously written into the stream.
Note that even though std::stringstream has a clear() function that would seem a better candidate, it only resets the stream's error state flags; it is not analogous to e.g. std::string::clear() or std::vector::clear() and won't yield the effect desired in your case.
Here I am again,
Thanks to "Some programmer dude"'s comment I think I figured it out:
As neither char array has a (null-)terminator, it seems that the stringstream's << operator keeps inserting characters until it stumbles over a null terminator '\0'.
Either extending both arrays with a \0 symbol (e.g. const char sepa[] = {':', ' ', '\0'}) or limiting the length with e.g. s << string(sepa, 2) will produce the expected output.
In the specific case above the data seems to lie adjacent in memory, so that the next null terminator is found inside the string literal of the cout << "Hello World" statement. As this layout is not guaranteed, reading past the end of the array is undefined behaviour when the terminator is missing.
So two additional "terminating" arrays, e.g. const char sepa[] = {':', ' '}; const char end_of_sepa[] = {'\0'};, declared right after the mentioned arrays may also produce the expected output even when the rest is left unchanged... but this is not guaranteed and depends on the layout in memory.
P.S. As previously written this issue is not about fixing but comprehension. So please feel free to confirm or correct my assumption.
EDIT: Corrected the bold code section.

Difference between string.empty and string[0] == '\0'

Suppose we have a string
std::string str; // some value is assigned
What is the difference between str.empty() and str[0] == '\0'?
C++11 and beyond
string_variable[0] is required to return the null character if the string is empty. That way there is no undefined behavior and the comparison still works if the string is truly empty. However you could have a string that starts with a null character ("\0Hi there") which returns true even though it is not empty. If you really want to know if it's empty, use empty().
Pre-C++11
The difference is that if the string is empty then string_variable[0] has undefined behavior: there is no index 0, unless the string is const-qualified, in which case it returns a null character.
string_variable.empty() on the other hand returns true if the string is empty, and false if it is not; the behavior won't be undefined.
Summary
empty() is meant to check whether the string/container is empty or not. It works on all containers that provide it and using empty clearly states your intent - which means a lot to people reading your code (including you).
Since C++11 it is guaranteed that str[str.size()] == '\0'. This means that if a string is empty, then str[0] == '\0'. But a C++ string has an explicit length field, meaning it can contain embedded null characters.
E.g. for std::string str("\0ab", 3), str[0] == '\0' but str.empty() is false.
Besides, str.empty() is more readable than str[0] == '\0'.
Other answers here are 100% correct. I just want to add three more notes:
empty is generic (every STL container implements this function), while operator[] with size_t only works with string objects and array-like containers. When dealing with generic STL code, empty is preferred.
Also, empty is pretty much self-explanatory, while == '\0' is not.
When it's 2 AM and you are debugging your code, would you prefer to see if(str.empty()) or if(str[0] == '\0')?
If only functionality mattered, we would all write in vanilla assembly.
There may also be a performance penalty involved. empty is usually implemented by comparing the size member of the string to zero, which is very cheap, easy to inline, etc. Comparing against the first character might be heavier. First of all, since most implementations use the short string optimization, the program first has to ask whether the string is in "short mode" or "long mode". Branching can mean worse performance. If the string is long, dereferencing it may be costly if the string was not touched for some time, because the dereference itself may cause a cache miss.
empty() is not implemented as looking for the existence of a null character at position 0; it's simply
bool empty() const
{
return size() == 0 ;
}
which can give a different answer than checking str[0].
Also, beware of which functions you use if you use C++11 or a later version:
#include <iostream>
#include <cstring>
#include <string>
int main() {
std::string str("\0ab", 3);
std::cout << "The size of str is " << str.size() << " bytes.\n";
std::cout << "The size of str is " << str.length() << " long.\n";
std::cout << "The size of str is " << std::strlen(str.c_str()) << " long.\n";
return 0;
}
will print
The size of str is 3 bytes.
The size of str is 3 long.
The size of str is 0 long.
You want to know the difference between str.empty() and str[0] == '\0'. Let's follow an example:
#include<iostream>
#include<string>
using namespace std;
int main(){
string str, str2; // both strings are empty
str2 = "values"; // assign a value to str2
str2[0] = '\0'; // overwrite index 0 of str2 with '\0'
if(str.empty()) cout << "str is empty" << endl;
else cout << "str contains: " << str << endl;
if(str2.empty()) cout << "str2 is empty" << endl;
else cout << "str2 contains: " << str2 << endl;
return 0;
}
Output:
str is empty
str2 contains: alues
str.empty() tells you whether the string is empty, while str[0] == '\0' tells you whether index 0 of the string holds '\0'. Index 0 holding '\0' does not mean that the string is empty. The two checks only agree when the string's length is 0: in that case str[0] refers to the terminating '\0' (guaranteed since C++11) and the string really is empty.
A C++ string has the concept of being empty or not. Before C++11, str[0] on an empty non-const string was undefined; since C++11, str[str.size()] is defined and refers to '\0', which for an empty string is str[0]. Any index greater than size() is undefined.
str[i] == '\0' is a concept of the C-string style. In the implementation of C-string, the last character of the string is '\0' to mark the end of a C-string.
For C-string you usually have to 'remember' the length of your string with a separate variable. In C++ String you can assign any position with '\0'.
Just a code segment to play with:
#include <cstring>
#include <iostream>
#include <string>
using namespace std;
int main(int argc, char* argv[]) {
char str[5] = "abc";
cout << str << " length: " << strlen(str) << endl;
cout << "char at 4th position: " << str[3] << "|" << endl;
cout << "char at 5th position: " << str[4] << "|" << endl;
str[4]='X'; // this is OK, since Cstring is just an array of char!
cout << "char at 5th position after assignment: " << str[4] << "|" << endl;
string cppstr("abc");
cppstr.resize(3);
cout << "cppstr: " << cppstr << " length: " << cppstr.length() << endl;
cout << "char at 4th position:" << cppstr[3] << endl;
cout << "char at 401th positon:" << cppstr[400] << endl;
// cppstr[3] is index size(), which reads as '\0' since C++11.
// cppstr[400] is out of bounds: undefined behavior, which may or may not crash.
cppstr[0] = '\0';
str[0] = '\0';
cout << "After zero the first char. Cstring: " << str << " length: " << strlen(str) << " | C++String: " << cppstr << " length: " << cppstr.length() << endl;
return 0;
}
On my machine the output:
abc length: 3
char at 4th position: |
char at 5th position: |
char at 5th position after assignment: X|
cppstr: abc length: 3
char at 4th position:
char at 401th positon:?
After zero the first char. Cstring: length: 0 | C++String: bc length: 3

length() of string is returning 0 for a C++ program

In the following code the length of the reversed string is being returned as zero which should not be the case:
int main() {
string for_reversal;
string reversed;
int i,j,length, r_length;
cout << "Enter the string : \n";
cin >> for_reversal;
cout << "Entered string is : " << for_reversal <<"\n";
cout << "String length is : " << for_reversal.length() << "\n";
length = for_reversal.length();
for (i=0; i<=length; i++)
{
reversed[i] = for_reversal[length - i-1];
cout << for_reversal[length-i] << "\t";
}
reversed[length+1]='\0';
cout << "\n";
r_length = reversed.length();
cout << "Reversed String length is : " << r_length << "\n";
cout << "Reversed String is : " << reversed;
return 0;
}
Not sure what's going wrong here.
A string of length length has valid characters at indices 0 through length - 1. Your loop runs with i <= length, so on its last iteration it reads for_reversal[length - i - 1] with i == length, i.e. index -1, which is out of bounds for the string and invokes undefined behavior.
Additionally, you cannot assign values to positions in a string that are outside its current size, yet you assign values to the elements of reversed without resizing the string appropriately first. This is a second source of undefined behavior.
With the two issues mentioned above, the behavior of your program is really not defined. However, the output does make sense if we ignore that: writing through operator[] never changes the size of a string, so the length of reversed is still 0 after the loop.
Change
reversed[i] = for_reversal[length - i-1];
to
reversed+=for_reversal[length - i-1];
The way you're doing it accesses the string out of its bounds, paving the way for all kinds of hell to break loose. The way proposed above appends to the string, allocating space as needed.
You could reserve space in "reversed" (http://www.cplusplus.com/reference/string/string/reserve/) so it can hold the original string without reallocating (reversed.reserve(for_reversal.size())). Note that reserve only changes the capacity, not the size, so it does not make reversed[i] valid; for index assignment you would need reversed.resize(for_reversal.size()).
In any case, please note that you do not need to make room for the \0 character or append it yourself; std::string manages its own terminator.
The problem is here:
for (i = 0; i <= length; i++) {
reversed[i] = for_reversal[length - i-1];
// ...
}
With the statement:
reversed[i]
You're actually trying to access the i-th element of the string, with i=0, 1, ..., length.
But at that moment, the object string reversed is an empty string. Indeed you've created it with:
string reversed;
That is the default constructor, which initializes an empty string.
In other words, reversed[i] accesses an undefined memory location, since reversed is empty and there is no i-th position in the string.
Solution
In order to reverse a string you may find this code useful:
std::string reversed;
for(auto rit = for_reversal.rbegin(); rit != for_reversal.rend(); ++rit) {
reversed.push_back(*rit);
}
This code should be safer than yours, as it uses iterators (in this case reverse iterators).
Note
Moreover there is no need to append the '\0' char at the end, because the class string already handles that.
The program has undefined behavior because the string reversed is empty
string reversed;
and you may not use the subscript operator
reversed[i] = for_reversal[length - i-1];
^^^^^^^^^^^
to assign values to the string.
Also there is no need to append the zero character to objects of type std::string.
Take into account that in the loop
for (i=0; i<=length; i++)
{
reversed[i] = for_reversal[length - i-1];
cout << for_reversal[length-i] << "\t";
}
when i is equal to length you will have
reversed[length] = for_reversal[-1];
^^^^^
The simplest way to create a reversed string is the following
#include <iostream>
#include <string>
int main()
{
std::string for_reversal;
std::cout << "Enter the string: ";
std::cin >> for_reversal;
std::string reversed( for_reversal.rbegin(), for_reversal.rend() );
std::cout << "\nReversed String length is : " << reversed.length() << "\n";
std::cout << "Reversed String is : " << reversed << std::endl;
return 0;
}
The program output is
Enter the string: Hello
Reversed String length is : 5
Reversed String is : olleH
If you want to write the loop yourself, then you should use the member function push_back or operator += instead of the subscript operator if the string is initially empty; reserving enough memory in the reversed string beforehand avoids reallocations. The program can look, for example, as follows
#include <iostream>
#include <string>
int main()
{
std::string for_reversal;
std::string reversed;
std::cout << "Enter the string: ";
std::cin >> for_reversal;
reversed.reserve( for_reversal.length() );
for ( std::string::size_type i = for_reversal.length(); i != 0; i-- )
{
reversed += for_reversal[i-1];
}
std::cout << "\nReversed String length is : " << reversed.length() << "\n";
std::cout << "Reversed String is : " << reversed << std::endl;
return 0;
}
The program output will be the same as shown above
Enter the string: Hello
Reversed String length is : 5
Reversed String is : olleH
First initialize the string reversed with an empty string, i.e. string reversed = "";
After this, change the for loop:
for(int i=0;i<length;i++)
{
reversed+=for_reversal[length-1-i];
cout << for_reversal[length-1-i] << endl;
}
int r_length=reversed.length();
cout << "reversed string length : " << r_length << endl;
cout << "Reversed string is : " << reversed << endl;
Hope you understand.

Char* pointers and char[] [duplicate]

This question already has answers here:
What is array to pointer decay?
(11 answers)
Closed 7 years ago.
I'm working on learning pointers in C++, and was doing fine with int*, double*, etc. but then I tried something that I can't understand. I've attached my code segment, and the comments are the terminal output.
char word[] = "Hello there";
cout << word << endl; //Hello there
cout << *word << endl; //H
cout << &word << endl; //Address
char *wordpt = word;
cout << wordpt << endl; //Hello there
cout << *wordpt << endl; //H
cout << &wordpt << endl; //Address
wordpt = &word[0];
cout << wordpt << endl; //Hello there
cout << *wordpt << endl; //H
cout << &wordpt << endl; //Address
What's going on in these? I don't even understand how the contents of word (*word) can be a single index. How is word stored in memory? And why would wordpt allow me to give it a value of word, which isn't an address?
The pointer wordpt in the first case points to the first element of the array word, the second assignment is exactly the same thing but explicitly.
Arrays are automatically converted to pointers to their first elements (array-to-pointer conversion), a rule that C++ inherited from C.
What is confusing you is that cout prints the characters a char * points to rather than the address stored in the pointer. For this to work, the pointed-to data must be a C string, i.e. a sequence of non-nul bytes followed by a nul byte, which is what is stored in word.
So cout is accessing the data pointed to by the wordpt pointer.

One-Second Timer [duplicate]

This question already has answers here:
Delay execution 1 second
(4 answers)
Closed 8 years ago.
I need a timer that does X every second.
I made this; however, it doesn't print anything until the program terminates, which I find weird.
It prints everything after three seconds if you set the counter to three, and after 100 seconds if you choose that.
How do I make it print every second and not all at once at termination?
int main()
{
using namespace std;
//Number to count down from
int counter = 10;
//When changed, a second has passed
int second = (unsigned)time(NULL);
//If not equal to each other, counter is printed
int second_timer = second;
while (counter > 0) {
second = (unsigned)time(NULL);
while (second != second_timer) {
//Do something
cout << counter-- << ", ";
//New value is assigned to match the current second
second_timer = second;
}
}
cout << "0" << endl;
return 0;
}
Add << flush where you want to flush. I.e. change your printout to:
cout << counter-- << ", " << flush;
endl causes the buffer to 'flush' and be written out to stdout. You can add << endl; to your cout << counter--, manually flush the cout stream using cout.flush();, or append << flush; to the end of your cout expression (thanks #Rob!)
For more info, the answer to this question seems to go into more detail.