Simple String Manipulation - c++

Suppose we have char *a ="Mission Impossible";
If we give cout<<*(a+1), then the output is i.
Is there any way to change this value, or this is not possible?

Yes, there are several ways to do this, but you have to make a copy of the string first because if you didn't, you'd be modifying memory you're not allowed to (where string literals are stored).
const char* a = "Mission Impossible"; // const char*, not char*, because we can't
// modify its contents
char buf[80] = {}; // create an array of chars 80 large, all initialised to 0
strncpy(buf, a, 79); // copy up to 79 characters from a to buf
cout << *(buf + 1); // prints i
buf[1] = 'b';
cout << *(buf + 1); // prints b
*(buf + 1) = 't';
cout << buf[1]; // prints t
That said, if this exercise is not for learning purposes, it is highly recommended that you learn and use std::string rather than C-style strings. They are superior in almost every way and will result in far less frustration and errors in your code.

char a[] = "Mission Impossible";
a[1] = 'x';
String literals cannot be modified. Typically they are placed a section of the binary that will be mapped read-only, therefore writing to them generates a fault. (This is implementation-defined behavior, but this happens to be the most common implementation these days.)
By declaring the string as a character array it is writable. The other alternative would be to duplicate the string literal into heap memory, either through malloc, new, or std::string.

No, the char* a is actually read-only and if you try to modify the content you will get undefined behavior. You should ideally declare a as const char*.

The simplest way to change that is doing *(a+1)='value_you_want';
This will change the content of a pointer (your case pointer is a+1) to the value you set.

Related

Is it possible for separately initialized string variables to overlap?

If I initialize several string(character array) variables in the following ways:
const char* myString1 = "string content 1";
const char* myString2 = "string content 2";
Since const char* is simply a pointer a specific char object, it does not contain any size or range information of the character array it is pointing to.
So, is it possible for two string literals to overlap each other? (The newly allocated overlap the old one)
By overlap, I mean the following behaviour;
// Continue from the code block above
std::cout << myString1 << std::endl;
std::cout << myString2 << std::endl;
It outputs
string costring content 2
string content 2
So the start of myString2 is somewhere in the middle of myString1. Because const char* does not "protect"("possess") a range of memory locations but only that one it points to, I do not see how C++ can prevent other string literals from "landing" on the memory locations of the older ones.
How does C++/compiler avoid such problem?
If I change const char* to const char[], is it still the same?
Yes, string literals are allowed to overlap in general. From lex.string#9
... Whether all string-literals are distinct (that is, are stored in nonoverlapping objects) and whether successive evaluations of a string-literal yield the same or a different object is unspecified.
So it's up to the compiler to make a decision as to whether any string literals overlap in memory. You can write a program to check whether the string literals overlap, but since it's unspecified whether this happens, you may get different results every time you run the program.
A string is required to end with a null character having a value of 0, and can't have such a character in the middle. So the only case where this is even possible is when two strings are equal from the start of one to the end of both. That is not the case in the example you gave, so those two particular strings would never overlap.
Edit: sorry, I didn't mean to mislead anybody. It's actually easy to put a null character in the middle of a string with \0. But most string handling functions, particularly those in the standard library, will treat that as the end of a string - so your strings will get truncated. Not very practical. Because of that the compiler won't try to construct such a string unless you explicitly ask it to.
The compiler knows the size of each string, because it can "see" it in your code.
Additionally, they are not allocated the same way, that you would allocate them at run-time. Instead, if the strings are constant and defined globally, they are most likely located in the .text section of the object file, not on the heap.
And since the compiler knows the size of a constant string at compile-time, it can simply put its value in the free space of the .text section. The specifics depend on the compiler you use, but be assured the people who wrote are smart enough to avoid this issue.
If you define these strings inside some function instead, the compiler can choose between the first option and allocating space on the stack.
As for the const char[], most compilers will treat it the same way as const char*.
Two string literals will not likely overlap unless they are the same. In that case though the pointers will be pointing to the same thing. (This isn't guaranteed by the standard though, but I believe any modern compiler should make this happen.)
const char *a = "Hello there."
const char *b = "Hello there."
cout << (a == b);
// prints "1" which means they point to the same thing
The const char * can share a string though.
const char *a = "Hello there.";
const char *b = a + 6;
cout << a;
// prints "Hello there."
cout << b;
// prints "there."
I think to answer your second question an explanation of c-style strings is useful.
A const char * is just a pointer to a string of characters. The const means that the characters themselves are immutable. (They are stored as part of the executable itself and you wouldn't want your program to change itself like this. You can use the strings command on unix to see all the strings in an executable easily i.e. strings a.out. You will see many more strings than what you coded as many exist as part of the standard library other required things for an executable.)
So how does it know to just print the string and then stop at the end? Well a c-style string is required to end with a null byte (\0). The complier implicitly puts it there when you declare a string. So "string content 1" is actually "string content 1\0".
const char *a = "Hello\0 there.";
cout << a;
// prints "Hello"
For the most part const char *a and const char a[] are the same.
// These are valid and equivalent
const char *a = "Hello";
const char b[] = "there."
// This is valid
const char *c = b + 3; // *c = "re."
// This, however, is not valid
const char d[] = b + 3;

Initialization of pointers in c++

I need to clarify my concepts regarding the basics of pointer initialization in C++. As per my understanding, a pointer must be assigned an address before putting some value using the pointer.
int *p;
*p=10; //inappropriate
cout << *p <<"\n";
This would probably show the correct output (10) but this may cause issue in larger programs since p initially had garbage address which can be anything & may later be used somewhere else in the program as well.So , I believe this is incorrrect, the correct way is:
int *p;
int x=10;
p=&x; //appropriate
cout << *p <<"\n";
My question is, if the above understanding is correct, then does the same apply on char* as well?:
const char *str="hello"; // inappropriate
cout << str << "\n";
//OR
const string str1= "hello";
const char str2[6] ="world";
const char *str=str1; //appropriate
const char *st=str2; //appropriate
cout << str << st << "\n";
Please advice
Your understanding of strings is incorrect.
Lets take for example the very first line:
const char *str="hello";
This is actually correct. A string literal like "hello" is turned into a constant array by the compiler, and like all arrays it can decay to a pointer to its first element. So what you are doing is making str point to the first character of the array.
Then lets continue with
const string str1= "hello";
const char *str=str1;
This is actually wrong. A std::string object have no casting operator defined to cast to a const char *. The compiler will give you an error for this. You need to use the c_str function go get a pointer to the contained string.
Lastly:
const char str2[6] ="world";
const char *st=str2; //appropriate
This is really no different than the first line when you declare and initialize str. This is, as you say, "appropriate".
About that first example with the "inappropriate" pointer:
int *p;
*p=10; //inappropriate
cout << *p <<"\n";
This is not only "inappropriate", this leads to undefined behavior and may actually crash your program. Also, the correct term is that the value of p is indeterminate.
When I declare a pointer
int *p;
I get an object p whose values are addresses. No ints are created anywhere. The thing you need to do is think of p as being an address rather than being an int.
At this point, this isn't particularly useful since you have no addresses you could assign to it other than nullptr. Well, technically that's not true: p itself has an address which you can get with &p and store it in an int**, or even do something horrible like p = reinterpret_cast<int*>(&p);, but let's ignore that.
To do something with ints, you need to create one. e.g. if you go on to declare
int x;
you now have an int object whose values are integers, and we could then assign its address to p with p = &x;, and then recover the object from p via *p.
Now, C style strings have weird semantics — the weirdest aspect being that C doesn't actually have strings at all: it's always working with arrays of char.
String literals, like "Hello!", are guaranteed to (act1 like they) exist as an array of const char located at some address, and by C's odd conversion rules, this array automatically converts to a pointer to its first element. Thus,
const char *str = "hello";
stores the address of the h character in that character array. The declaration
const char str2[6] ="world";
works differently; this (acts1 like it) creates a brand new array, and copies the contents of the string literal "world" into the new array.
As an aside, there is an obsolete and deprecated feature here for compatibility with legacy programs, but for some misguided reason people still use it in new programs these days so you should be aware of it and that it's 'wrong': you're allowed to break the type system and actually write
char *str = "hello";
This shouldn't work because "hello" is an array of const char, but the standard permits this specific usage. You're still not actually allowed to modify the contents of the array, however.
1: By the "as if" rule, the program only has to behave as if things happen as I describe, but if you peeked at the assembly code, the actual way things happen can be very different.

Character pointer access

I wanted to access character pointer ith element. Below is the sample code
string a_value = "abcd";
char *char_p=const_cast<char *>(a_value.c_str());
if(char_p[2] == 'b') //Is this safe to use across all platform?
{
//do soemthing
}
Thanks in advance
Array accessors [] are allowed for pointer types, and result in defined and predictable behaviors if the offset inside [] refers to valid memory.
const char* ptr = str.c_str();
if (ptr[2] == '2') {
...
}
Is correct on all platforms if the length of str is 3 characters or more.
In general, if you are not mutating the char* you are looking at, it best to avoid a const_cast and work with a const char*. Also note that std::string provides operator[] which means that you do not need to call .c_str() on str to be able to index into it and look at a char. This will similarly be correct on all platforms if the length of str is 3 characters or more. If you do not know the length of the string in advance, use std::string::at(size_t pos), which performs bound checking and throws an out_of_range exception if the check fails.
You can access the ith element in a std::string using its operator[]() like this:
std::string a_value = "abcd";
if (a_value[2] == 'b')
{
// do stuff
}
If you use a C++11 conformant std::string implementation you can also use:
std::string a_value = "abcd";
char const * p = &a_value[0];
// or char const * p = a_value.data();
// or char const * p = a_value.c_str();
// or char * p = &a_value[0];
21.4.1/5
The char-like objects in a basic_string object shall be stored contiguously.
21.4.7.1/1: c_str() / data()
Returns: A pointer p such that p + i == &operator[](i) for each i in [0,size()].
The question is essentially about querying characters in a string safely.
const char* a = a_value.c_str();
is safe unless some other operation modifies the string after it. If you can guarantee that no other code performs a modification prior to using a, then you have safely retrieved a pointer to a null-terminated string of characters.
char* a = const_cast<char *>(a_value.c_str());
is never safe. You have yielded a pointer to memory that is writeable. However, that memory was never designed to be written to. There is no guarantee that writing to that memory will actually modify the string (and actually no guarantee that it won't cause a core dump). It's undefined behaviour - absolutely unsafe.
reference here: http://en.cppreference.com/w/cpp/string/basic_string/c_str
addressing a[2] is safe provided you can prove that all possible code paths ensure that a represents a pointer to memory longer than 2 chars.
If you want safety, use either:
auto ch = a_string.at(2); // will throw an exception if a_string is too short.
or
if (a_string.length() > 2) {
auto ch = a_string[2];
}
else {
// do something else
}
Everyone explained very well for most how it's safe, but i'd like to extend a bit if that's ok.
Since you're in C++, and you're using a string, you can simply do the following to access a caracter (and you won't have any trouble, and you still won't have to deal with cstrings in cpp :
std::string a_value = "abcd";
std::cout << a_value.at(2);
Which is in my opinion a better option rather than going out of the way.
string::at will return a char & or a const char& depending on your string object. (In this case, a const char &)
In this case you can treat char* as an array of chars (C-string). Parenthesis is allowed.

Are strings in C++ copied when modified?

Using the std::string class in C++, it is possible to modify a character using the array notation, like:
std::string s = "Hello";
s[0] = 'X';
cout << s << '\n';
I have checked that this code compiles, and prints "Xello" as expected. However, I was wondering what the cost of this operation is: is it constant time, or is it O(n) because the string is copied?
The string isn't copied. The internal data is directly modified.
It basically gets the internal data pointer of the actual string memory, and modifies it. Imagine doing this:
char *data = &str[0];
for(size_t i = 0; i < str.size(); ++i)
{
data[i] = '!';
}
The code sets every character of the string to an exclamation mark.
But if the string was copied, then after the first write, the data pointer would become invalid.
Or to use another example:
std::cout << str[5] << std::endl;
That prints the 6th character of the string. Why would that copy the string?
C++ can't tell the difference between char c = str[5] and str[5] = c (except as far as const vs non-const function calls go).
Also, str[n] is guaranteed to never throw exceptions, as long as n < str.size(). It can't make that guarantee if it had to allocate memory internally for a copy - because the allocation could fail and throw.
(As #juanchopanza mentioned, older C++ standards permitted CoW strings, but the latest C++ standard forbids this)
You can modify stl string like in you example, no copy will be done. Standard library string class does not manage string pool like other languages do for strings (like Java). This operation is constant in complexity.
You do only modify the first element s[0], therefore it can't be O(n). You don't copy the string.

C++ Swap string

I am trying to create a non-recursive method to swap a c-style string. It throws an exception in the Swap method. Could not figure out the problem.
void Swap(char *a, char* b)
{
char temp;
temp = *a;
*a = *b;
*b = temp;
}
void Reverse_String(char * str, int length)
{
for(int i=0 ; i <= length/2; i++) //do till the middle
{
Swap(str+i, str+length - i);
}
}
EDIT: I know there are fancier ways to do this. But since I'm learning, would like to know the problem with the code.
It throws an exception in the Swap method. Could not figure out the problem.
No it doesn't. Creating a temporary character and assigning characters can not possibly throw an exception. You might have an access violation, though, if your pointers don't point to blocks of memory you own.
The Reverse_String() function looks OK, assuming str points to at least length bytes of writable memory. There's not enough context in your question to extrapolate past that. I suspect you are passing invalid parameters. You'll need to show how you call Reverse_String() for us to determine if the call is valid or not.
If you are writing something like this:
char * str = "Foo";
Reverse_String(str, 3);
printf("Reversed: '%s'.\n", str);
Then you will definitely get an access violation, because str points to read-only memory. Try the following syntax instead:
char str[] = "Foo";
Reverse_String(str, 3);
printf("Reversed: '%s'.\n", str);
This will actually make a copy of the "Foo" string into a local buffer you can overwrite.
This answer refers to the comment by #user963018 made under #André Caron's answer (it's too long to be a comment).
char *str = "Foo";
The above declares a pointer to the first element of an array of char. The array is 4 characters long, 3 for F, o & o and 1 for a terminating NULL character. The array itself is stored in memory marked as read-only; which is why you were getting the access violation. In fact, in C++, your declaration is deprecated (it is allowed for backward compatibility to C) and your compiler should be warning you as such. If it isn't, try turning up the warning level. You should be using the following declaration:
const char *str = "Foo";
Now, the declaration indicates that str should not be used to modify whatever it is pointing to, and the compiler will complain if you attempt to do so.
char str[] = "Foo";
This declaration states that str is a array of 4 characters (including the NULL character). The difference here is that str is of type char[N] (where N == 4), not char *. However, str can decay to a pointer type if the context demands it, so you can pass it to the Swap function which expects a char *. Also, the memory containing Foo is no longer marked read-only, so you can modify it.
std::string str( "Foo" );
This declares an object of type std::string that contains the string "Foo". The memory that contains the string is dynamically allocated by the string object as required (some implementations may contain a small private buffer for small string optimization, but forget that for now). If you have string whose size may vary, or whose size you do not know at compile time, it is best to use std::string.