Modifying the length and contents of the string? - c++

To change the contents of a string in a function such that it reflects in the main function we need to accept the string as reference as indicated below.
Changing contents of a std::string with a function
But in the above code we are changing the size of string also(i.e, more than what it can hold), so why is the program not crashing ?
Program to convert decimal to binary, mind it, the code is not complete and I am just testing the 1st part of the code.
void dectobin(string & bin, int n)
{
int i=0;
while(n!=0)
{
bin[i++]= (n % 2) + '0';
n = n / 2;
}
cout << i << endl;
cout << bin.size() << endl;
cout << bin << endl;
}
int main()
{
string s = "1";
dectobin(s,55);
cout << s << endl;
return 0;
}
O/p: 6 1 1 and the program crashes in codeblocks. While the above code in the link works perfectly fine.
It only outputs the correct result, when i initialize the string in main with 6 characters(i.e, length of the number after it converts from decimal to binary).
http://www.cplusplus.com/reference/string/string/capacity/
Notice that this capacity does not suppose a limit on the length of the string. When this capacity is exhausted and more is needed, it is automatically expanded by the object (reallocating it storage space). The theoretical limit on the length of a string is given by member max_size
If the string resizes itself automatically then why do we need the resize function and then why is my decimal to binary code not working?

Your premise is wrong. You are thinking 1) if I access a string out of bound then my program will crash, 2) my program doesn't crash therefore I can't be accessing a string out of bounds, 3) therefore my apparently out of bounds string accesses must actually resize the string.
1) is incorrect. Accessing a string out of bounds results in undefined behaviour. This is means exactly what it says. Your program might crash but it might not, it's behaviour is undefined.
And it's a fact that accessing a string never changes it's size, that's why we have the resize function (and push_back etc.).
We must get questions like yours several times a week. Undefined behaviour is clearly a concept that newbies find surprising.

Check this link about std::string:
char& operator[] (size_t pos);
const char& operator[] (size_t pos) const;
If pos is not greater than the string length, the function never
throws exceptions (no-throw guarantee). Otherwise, it causes
undefined behavior.
In your while loop you are accessing the bin string with index that is greater than bin.size()

You aren't changing the size of the string anywhere. If the string you pass into the function is of length one and you access it at indices larger than 0, i.e., at bin[1], bin[2], you are not modifying the string but some other memory locations after the string - there might be something else stored there. Corrupting memory in this way does not necessarily directly lead to a crash or an exception. It will once you access those memory locations later on in your program.

Accepting a reference to a string makes it possible to change instances of strings from the calling code inside the called code:
void access(std::string & str) {
// str is the same instance as the function
// is called with.
// without the reference, a copy would be made,
// then there would be two distinct instances
}
// ...
std::string input = "test";
access(input);
// ...
So any function or operator that is called on a reference is effectively called on the referenced instance.
When, similar to your linked question, the code
str = " new contents";
is inside of the body of the access function, then operator= of the input instance is called.
This (copy assignment) operator is discarding the previous contents of the string, and then copying the characters of its argument into newly allocated storage, whose needed length is determined before.
On the other hand, when you have code like
str[1] = 'a';
inside the access function, then this calls operator[] on the input instance. This operator is only providing access to the underlying storage of the string, and not doing any resizing.
So your issues aren't related to the reference, but to misusing the index operator[]:
Calling that operator with an argument that's not less than the strings size/length leads to undefined behaviour.
To fix that, you could resize the string manually before using the index operator.
As a side note: IMO you should try to write your function in a more functional way:
std::string toOct(std::string const &);
That is, instead of modifying the oases string, create a new one.

The bounds of the string are limited by its current content. That is why when you initialise the string with 6 characters you will stay inside bounds for conversion of 55 to binary and program runs without error.
The automatic expansion feature of strings can be utilised using
std::string::operator+=
to append characters at the end of current string. Changed code snippet will look like this:
void dectobin(string & bin, int n){
//...
bin += (n % 2) + '0';
//...
}
Plus you don't need to initialise the original string in main() and your program should now run for arbitrary decimals as well.
int main(){
//...
string s;
dectobin(s,55);
//...
}

Related

Why is my string variable not printing a full string?

Program:
#include <iostream>
using namespace std;
int main() {
string str;
str[0] = 'a';
str[1] = 'b';
str[2] = 'c';
cout << str;
return 0;
}
Output:
No output.
If I replace cout << str; with cout << str[1], I get a proper output.
Output:
b
And if I change the data type of the variable to a character array, I get the desired output. (Replacing string str; with char str[5];
Output:
abc
Why is my program behaving like this? How do I alter my code to get the desired output without changing the data type?
Your program has undefined behavior.
string str;
creates an empty string. It has length 0.
You are trying to write to the first three elements of this string with
str[0] = 'a';
str[1] = 'b';
str[2] = 'c';
These do not exist. Indexing a std::string out-of-bounds causes undefined behavior.
You can add characters to a string with any of the following methods:
str += 'a';
str += "a";
str.push_back('a');
str.append("a");
or you can resize the string first to the intended length before you index into any of its elements:
str.resize(3);
As pointed out by #Ayxan in a comment under this answer, you are also missing #include<string>. Without it it is unspecified whether your program will compile since it uses std::string which is defined in <string>. It is unspecified whether including one standard library header will include another one if there isn't a specific exception. You should not rely on unspecified behavior that may break at any point.
In addition to walnut's answer, here's what's going on under the hood:
An std::string contains at least two members, a data pointer (type char*) and a size. The [] operator does not check the size, it only indexes into the memory behind the data pointer. Also, the [] operator does not modify the size member of the string.
The << operator that streams the string to cout, however, does interpret the size member. It does so to be able to output strings containing null characters. Since the size member is still zero, nothing is printed.
You may wonder why the memory access within str[0] even succeeds, after all, the string never had any reason to allocate any memory for its data yet. This is due to the fact that virtually all std::string implementations use the small-string-optimization: The std::string object itself is a bit larger than it needs to be, and the space at its end is used instead of an allocation on the heap unless the string becomes longer than that space. As such, the default constructor will just point the data pointer to that internal storage to have it initialized, and your memory accesses are directed to existing memory. This is why you don't get a SEGFAULT unless you access the string's data way out of bounds. Doesn't change the fact that already your expression str[0] is undefined behavior, though. The symptoms may appear benign, but the disease is fatal.

How do I declare a new string the same length of a known const string?

I've been using:
string letters = THESAMELENGTH; // Assign for allocation purposes.
Reason being, if I:
string letters[THESAMELENGTH.length()];
I get a non constant expression complaint.
But if I:
string letters[12];
I'm at risk of needing to change every instance if the guide const string changes size.
But it seems foolish to assign a string when I won't use those entries, I only want my newly assigned string to be the same length as the previously assigned const string, then fill with different values.
How do you recommend I do this gracefully and safely?
You can
string letters(THESAMELENGTH.length(), ' '); // constructs the string with THESAMELENGTH.length() copies of character ' '
BTW: string letters[12]; doesn't mean the same as you expected. It declares a raw array of string containing 12 elements.
I only want my newly assigned string to be the same length as the previously assigned const string, then fill with different values.
Part of the reason the string class/type exists is so you don't have to worry about trying to manage its length. (The problem with arrays of char.)
If you have a const std::string tmp then you can't just assign anything to it after it has already been initialized. E.g.:
const std::string tmp = "A value"; // initialization
tmp = "Another value"; // compile error
How do you recommend I do this gracefully and safely?
If you really want to keep strings to a specific size, regardless of their contents, you could always resize your string variables. For example:
// in some constants.h file
const int MAX_STRING_LENGTH = 16;
// in other files
#include "constants.h"
// ...
std::string word = ... // some unknown string
word.resize(MAX_STRING_LENGTH);
Now your word string will have a length/size of MAX_STRING_LENGTH and anything beyond the end gets truncated.
This example is from C++ Reference
// resizing string
#include <iostream>
#include <string>
int main ()
{
std::string str ("I like to code in C");
std::cout << str << '\n';
unsigned sz = str.size();
str.resize (sz+2,'+');
std::cout << str << '\n';
str.resize (14);
std::cout << str << '\n';
return 0;
}
// program output
I like to code in C
I like to code in C++
I like to code
You can't just ask a string variable for its length at compile-time. By definition, it's impossible to know the value of a variable, or the state of any given program for that matter, while it's not running. This question only makes sense at run-time.
Others have mentioned this, but there seems to be an issue with your understanding of string letters[12];. That gives you an array of string types, i.e. you get space for 12 full strings (e.g. words/sentences/etc), not just letters.
In other words, you could do:
for(size_t i = 0; i < letters.size(); ++i)
letters[i] = "Hello, world!";
So your letters variable should be renamed to something more accurate (e.g. words).
If you really want letters (e.g. the full alphabet on a single string), you could do something like this:
// constants.h
const std::string ALPHABET_LC = "abc...z";
const std::string ALPHABET_UC = "ABC...Z";
const int LETTER_A = 0;
const int LETTER_B = 1;
// ...
// main.cpp, etc.
char a = ALPHABET_LC[LETTER_A];
char B = ALPHABET_UC[LETTER_B];
// ...
It all depends on what you need to do, but this might be a good alternative.
Disclaimer: Note that it's not really my recommendation that you do this. You should let strings manage their own length. For example, if the string value is actually shorter than your limit, you're causing your variable to use more space/memory than needed, and if it's longer, you're still truncating it. Neither side-effect is good, IMHO.
The first thing you need to do is understand the difference between a string length and an array dimension.
std::string letters = "Hello";
creates a single string that contains the characters from "Hello", and has length 5.
In comparison
std::string letters[5];
creates an array of five distinct default-constructed objects of type std::string. It doesn't create a single string of 5 characters. The reason for the non-constant complaint when doing
std::string letters[THESAMELENGTH.length()];
is that construction of arrays in standard C++ is required to use a length known to the compiler, whereas the length of a std::string is determined at run time.
If you have a string, and what to create another string of the same length, you can do something like
std::string another_string(letters.length(), 'A');
which will create a single string containing the required number of letters 'A'.
It is largely pointless to do what you are seeking as a std::string can dynamically change its length anyway, as needed. There is also nothing stopping a std::string from allocating more than it needs (e.g. to make provision for multiple increases in its length).

String.length woes

Edit: Solutions must compile against Microsoft Visual Studio 2012.
I want to use a known string length to declare another string of the same length.
The reasoning is the second string will act as a container for operation done to the first string which must be non volatile with regards to it.
e.g.
const string messy "a bunch of letters";
string dostuff(string sentence) {
string organised NNN????? // Idk, just needs the same size.
for ( x = 0; x < NNN?; x++) {
organised[x] = sentence[x]++; // Doesn't matter what this does.
}
}
In both cases above, the declaration and the exit condition, the NNN? stands for the length of 'messy'.
How do I discover the length at compile time?
std::string has two constructors which could fit your purposes.
The first, a copy constructor:
string organised(sentence);
The second, a constructor which takes a character and a count. You could initialize a string with a temporary character.
string organised(sentence.length(), '_');
Alternatively, you can:
Use an empty string and append (+=) text to it as you go along, or
Use a std::stringstream for the same purpose.
the stringstream will likely be more efficient.
Overall, I would prefer the copy constructor if the length is known.
std::string isn't a compile time type (it can't be a constexpr), so you can't use it directly to determine the length at compile time.
You could initialize a constexpr char[] and then use sizeof on that:
constexpr char messychar[] = "a bunch of letters";
// - 1 to avoid including NUL terminator which std::string doesn't care about
constexpr size_t messylen = sizeof(messychar) / sizeof(messychar[0]) - 1;
const string messy(messychar);
and use that, but frankly, that's pretty ugly; the length would be compile time, but organized would need to use the count and char constructor that would still be performed on each call, allocating and initializing only to have the contents replaced in the loop.
While it's not compile time, you'd avoid that initialization cost by just using reserve and += to build the new string, which with the #define could be done in an ugly but likely efficient way as:
constexpr char messychar[] = "a bunch of letters";
constexpr size_t messylen = sizeof(messychar) / sizeof(messychar[0]) - 1;
// messy itself may not be needed, but if it is, it's initialized optimally
// by using the compile time calculated length, so there is no need to scan for
// NUL terminators, and it can reserve the necessary space in the initial alloc
const string messy(messychar, messylen);
string dostuff(string sentence) {
string organised;
organized.reserve(messylen);
for (size_t x = 0; x < messylen; x++) {
organised += sentence[x]++; // Doesn't matter what this does.
}
}
This avoids setting organised's values more than once, allocating more than once (well, possibly twice if initial construction performs it) per call, and only performs a single read/write pass of sentence, no full read followed by read/write or the like. It also makes the loop constraint a compile time value, so the compiler has the opportunity to unroll the loop (though there is no guarantee of this, and even if it happens, it may not be helpful).
Also note: In your example, you mutate sentence, but it's accepted by value, so you're mutating the local copy, not the caller copy. If mutation of the caller value is required, accept it by reference, and if mutation is not required, accept by const reference to avoid a copy on every call (I understand the example code was filler, just mentioning this).

String Reversal Memory Consumption Differences

Suppose I implement the following two string reversal algorithms:
void reverse(string &s) {
if(s.size() == 1) return;
string restOfS = s.substr(1);
reverse(restOfS);
s = restOfS + s.at(0);
}
string reverseString(string s) {
if(s.size() == 1) return s;
return reverseString(s.substr(1)) + s.at(0);
}
int main() {
string name = "Dominic Farolino";
reverse(name);
cout << name << endl;
name = reverseString(name);
cout << name << endl;
return 0;
}
One of these obviously modifies the string given to it, and one of returns a new string. Since the first one modifies the given string and uses a reference parameter as its mode of communication to the next recursive stack frame, I at first assumed this would be more efficient since using a reference parameter may help us not duplicate things in memory down the line, however I don't believe that's the case. Obviously we have to use a reference parameter with this void function, but it seems that we are undoing any memory efficiency using a reference parameter may give us since we since we are just declaring a new variable on the stack every time.
In short, it seems that the first one is making a copy of the reference every call, and the second one is making a copy of the value each call and just returning its result, making them of equal memory consumption.
To make the first one more memory efficient I feel like you'd have to do something like this:
void reverse(string &s) {
if(s.size() == 1) return;
reverse(s.substr(1));
s = s.substr(1) + s.at(0);
}
however the compiler won't let me:
error: invalid initialization of non-const reference of type 'std::string& {aka std::basic_string<char>&}' from an rvalue of type 'std::basic_string<char>'
6:6: note: in passing argument 1 of 'void reverse(std::string&)'
Is this analysis correct?
substr() returns a new string every time, complete with all the memory use that goes with that. So if you're going to do N-1 calls to substr(), that's O(N^2) extra memory you're using for no reason.
With std::string though, you can modify it in place, just by iterating over it with a simple for loop. Or just using std::reverse:
void reverseString(string &s) {
std::reverse(s.begin(), s.end());
}
Either way (for loop or algorithm) takes O(1) extra memory instead - it effectively is just a series of swaps, so you just need one extra char as the temporary. Much better.

Able to Access Elements with Index Greater than Array Length

The following code seems to be running when it shouldn't. In this example:
#include <iostream>
using namespace std;
int main()
{
char data[1];
cout<<"Enter data: ";
cin>>data;
cout<<data[2]<<endl;
}
Entering a string with a length greater than 1 (e.g., "Hello"), will produce output as if the array were large enough to hold it (e.g., "l"). Should this not be throwing an error when it tried to store a value that was longer than the array or when it tried to retrieve a value with an index greater than the array length?
The following code seems to be running when it shouldn't.
It is not about "should" or "shouldn't". It is about "may" or "may not".
That is, your program may run, or it may not.
It is because your program invokes undefined behavior. Accessing an array element beyond the array-length invokes undefined behavior which means anything could happen.
The proper way to write your code is to use std::string as:
#include <iostream>
#include <string>
//using namespace std; DONT WRITE THIS HERE
int main()
{
std::string data;
std::cout<<"Enter data: ";
std::cin>>data; //read the entire input string, no matter how long it is!
std::cout<<data<<std::endl; //print the entire string
if ( data.size() > 2 ) //check if data has atleast 3 characters
{
std::cout << data[2] << std::endl; //print 3rd character
}
}
It can crash under different parameters in compilation or compiled on other machine, because running of that code giving undefined result according to documentaton.
It is not safe to be doing this. What it is doing is writing over the memory that happens to lie after the buffer. Afterwards, it is then reading it back out to you.
This is only working because your cin and cout operations don't say: This is a pointer to one char, I will only write one char. Instead it says: enough space is allocated for me to write to. The cin and cout operations keep reading data until they hit the null terminator \0.
To fix this, you can replace this with:
std::string data;
C++ will let you make big memory mistakes.
Some 'rules' that will save you most of the time:
1:Don't use char[]. Instead use string.
2:Don't use pointers to pass or return argument. Pass by reference, return by value.
3:Don't use arrays (e.g. int[]). Use vectors. You still have to check your own bounds.
With just those three you'll be writing some-what "safe" code and non-C-like code.