Using remove_if with C null-terminated string - c++

I have a situation where I want to efficiently remove a character from a NULL-terminated char *. I can assume the incoming string is large (i.e. it wouldn't be efficient to copy); but I can also assume that I don't need to de-allocate the unused memory.
I thought I could use std::remove_if for this task (replacing the character at the returned iterator with a NULL-terminator), and set up the following test program to make sure I got the syntax correct:
#include <algorithm>
#include <iostream>
bool is_bad (const char &c) {
return c == 'a';
}
int main (int argc, char *argv[]) {
char * test1 = "123a45";
int len = 6;
std::cout << test1 << std::endl;
char * new_end = std::remove_if(&test1[0], &test1[len], is_bad);
*new_end = '\0';
std::cout << test1 << std::endl;
return 0;
}
This program compiles, however, I'm getting a Segmentation Fault somewhere in remove_if - here's the output from gdb:
Program received signal SIGSEGV, Segmentation fault.
0x0000000000400914 in std::remove_copy_if<char*, char*, bool (*)(char const&)> (__first=0x400c2c "45", __last=0x400c2e "", __result=0x400c2b "a45",
__pred=0x4007d8 <is_bad(char const&)>) at /usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../include/c++/4.1.2/bits/stl_algo.h:1218
1218 *__result = *__first;
This is with gcc 4.1.2 on RedHat 4.1.2-52.
My understanding was that raw pointers can be used as ForwardIterators, but perhaps not? Any suggestions?

The program has undefined behaviour as it is attempting to modify a string literal:
char * test1 = "123a45";
Change to:
char test1[] = "123a45"; // 'test1' is a copy of the string literal.
char * new_end = std::remove_if(test1, test1 + sizeof(test1), is_bad);
See http://ideone.com/yzeo4k.

Your program has undefined behavior, since you are trying to modify an array of const characters (string literals are arrays of const characters). Per paragraph 7.1.6.1/4 of the C++11 Standard:
Except that any class member declared mutable (7.1.1) can be modified, any attempt to modify a const
object during its lifetime (3.8) results in undefined behavior.
Note that since C++11 the conversion from a string literal to a char* is ill-formed, and in C++03 it is deprecated (GCC 4.7.2 gives me a warning for that).
To fix your program with a minimal change, declare test1 as an array of characters and initialize it from the string literal:
char test1[] = "123a45";

Related

Getting Extra characters at the end when Creating std::string from char*

I have just started learning C++ and am now learning about arrays, so I am trying out different examples. One such example is given below:
int main()
{
const char *ptr1 = "Anya";
char arr[] = {'A','n','y','a'};
std::string name1(ptr1); //this works
std::cout << name1 << std::endl;
std::string name2(arr);
std::cout << name2 << std::endl; //this prints extra characters at the end?
return 0;
}
In the above example, the last cout statement prints some extra characters at the end. What is wrong with the code, and how can I prevent this so that I don't make the same mistake in the future?
char arr[] = {'A','n','y','a'}; is not null terminated so you will read it out of bounds when creating the string which in turn makes your program have undefined behavior (and could therefore do anything).
Either make it null terminated:
char arr[] = {'A','n','y','a','\0'};
Or, create the string from iterators:
#include <iostream>
#include <iterator>
#include <string>
int main() {
char arr[] = {'A', 'n', 'y', 'a'};
std::string name2(std::begin(arr), std::end(arr));
std::cout << name2 << '\n'; // now prints "Anya"
}
Or create it with the constructor taking the length as an argument:
std::string name2(arr, sizeof arr); // `sizeof arr` is here 4
The problem is that you're constructing a std::string using a non null terminated array as explained below.
When you wrote:
char arr[] = {'A','n','y','a'}; //not null terminated
The above statement creates an array that is not null terminated.
Next when you wrote:
std::string name2(arr); //undefined behavior
There are 2 important things to note about the above statement:
arr decays to a char* due to type decay.
This char* is passed as an argument to a std::string constructor that has a parameter of type const char*. Essentially the above statement creates a std::string object from a non null terminated array.
But note that whenever we create a std::string using a const char*, the array to which the pointer points must be null terminated. Otherwise the result is undefined behavior.
Undefined behavior means anything [1] can happen, including but not limited to the program giving your expected output. But never rely on (or draw conclusions from) the output of a program that has undefined behavior.
For example, here the program gives the expected output but here it doesn't. So, as I said, don't rely on the output of a program that has UB.
Solution
You can solve this by making your array null terminated as shown below.
char arr[] = {'A','n','y','a','\0'}; //arr is null terminated
// char arr[] = "Anya"; //this is also null terminated
[1] For a more technically accurate definition of undefined behavior, see this, where it is mentioned that there are no restrictions on the behavior of the program.

c++ change array by index inside function

void changeArray(char* str1) {
str1[0] = 'f';
}
int main() {
char* msg1 = "andrew";
changeArray(msg1);
cout << msg1 << endl;
return 0;
}
Hi guys, I don't understand why I'm getting a segmentation fault. Can pointers not be accessed by index inside functions? (C++)
You're trying to modify a string literal, which leads to undefined behavior.
Attempting to modify a string literal results in undefined behavior: they may be stored in read-only storage (such as .rodata) or combined with other string literals:
const char* pc = "Hello";
char* p = const_cast<char*>(pc);
p[0] = 'M'; // undefined behavior
Also, char* msg1 = "andrew"; is not allowed since C++11:
In C, string literals are of type char[], and can be assigned directly to a (non-const) char*. C++03 allowed it as well (but deprecated it, as literals are const in C++). C++11 no longer allows such assignments without a cast.
You can construct and pass a char array instead.
String literals can be used to initialize character arrays. If an array is initialized like char str[] = "foo";, str will contain a copy of the string "foo".
E.g.
int main() {
char msg1[] = "andrew";
changeArray(msg1);
cout << msg1 << endl;
return 0;
}
In int main() you declared msg1 as a pointer to a char, not as an array of chars. Do this: char msg1[] = "andrew";.

char pointer parameter different behaviour

I have the following code:
void uppercase(char *sir)
{
for(int i=0;i<strlen(sir);i++)
{
sir[i]=(char)toupper(sir[i]);
}
}
int _tmain(int argc, _TCHAR* argv[])
{
//char lower[]="u forgot the funny"; this works
//char *lower="u forgot the funny"; this gives me a runtime error
uppercase(lower);
cout<<lower<<"\n\n";
system("PAUSE");
return 0;
}
I have noted that if I run with the char vector it works.
When I try to run with the second method it generates a runtime error.
I would like to know the reason for this behaviour please.
You cannot modify string literals; doing so (as in your second case) is undefined behaviour.
char x[] = "foo";
creates a character array containing the characters f,o,o,\0. It's basically a mutable copy of the string.
char *x = "foo";
creates a pointer to the "foo" string literal. The literal may live in read-only memory, in program memory, or in a constant pool. Writing to it is undefined behaviour. Also, note that in C++ the type of a string literal is const char[N], so assigning it to a char * violates const-correctness.
The former creates a character array which can be mutated; the latter is a pointer to fixed memory (which must not be modified).

Why is main() argument argv of type char*[] rather than const char*[]?

When I wrote the following code and executed it, the compiler said
deprecated conversion from string constant to char*
int main()
{
char *p;
p=new char[5];
p="how are you";
cout<< p;
return 0;
}
It means that I should have written const char *.
But when we pass arguments into main using char* argv[] we don't write const char* argv[].
Why?
Because ... argv[] isn't const. And it certainly isn't a (static) string literal since it's being created at runtime.
You're declaring a char * pointer and then assigning a string literal to it, which is by definition constant; the actual data is typically in read-only memory.
int main(int argc, char **argv) {
// Yes, I know I'm not checking anything - just a demo
argv[1][0] = 'f';
std::cout << argv[1] << std::endl;
}
Input:
g++ -o test test.cc
./test hoo
Output:
foo
This is not a comment on why you'd want to change argv, but it certainly is possible.
Historical reasons. Changing the signature of main() would break too much existing code. And it is possible that some implementations allow you to change the parameters to main from your code. However code like this:
char * p = "helllo";
* p = 'x';
is never valid, because you are not allowed to modify string literals like that (doing so is undefined behaviour), so the pointer should be to a const char.
why is it required for char* to be constant while assigning it to a string
Because such string literals (like "hi", "hello what's going on", etc.) are typically stored in a read-only segment of your executable. As such, the pointers that point to them need to point to constant characters (i.e., you can't change them through the pointer).
You are assigning a string constant (const char*) to a pointer to a non-constant string (char *p). This would allow you to modify the string constant, e.g. by doing p[0] = 'n'.
Anyway, why don't you use std::string instead? (You seem to be using C++.)
If you look at execution functions like execve, you will see that they actually don't accept const char* as parameters, but do indeed require char*, therefore you can't use a string constant to invoke main.

C++ deprecated conversion from string constant to 'char*'

I have a class with a private char str[256];
and for it I have an explicit constructor:
explicit myClass(char *func)
{
strcpy(str,func);
}
I call it as:
myClass obj("example");
When I compile this I get the following warning:
deprecated conversion from string constant to 'char*'
Why is this happening?
This is an error message you see whenever you have a situation like the following:
char* pointer_to_nonconst = "string literal";
Why? Well, C and C++ differ in the type of the string literal. In C the type is array of char and in C++ it is constant array of char. In any case, you are not allowed to change the characters of the string literal, so the const in C++ is not really a restriction but more of a type safety thing. A conversion from const char* to char* is generally not possible without an explicit cast for safety reasons. But for backwards compatibility with C the language C++ still allows assigning a string literal to a char* and gives you a warning about this conversion being deprecated.
So, somewhere you are missing one or more consts in your program for const correctness. But the code you showed to us is not the problem as it does not do this kind of deprecated conversion. The warning must have come from some other place.
The warning:
deprecated conversion from string constant to 'char*'
is given because you are doing somewhere (not in the code you posted) something like:
void foo(char* str);
foo("hello");
The problem is that you are trying to convert a string literal (with type const char[]) to char*.
You can convert a const char[] to const char* because the array decays to a pointer, but what you are doing is converting a constant into a mutable.
This conversion is probably allowed for C compatibility and just gives you the warning mentioned.
As answer no. 2 by fnieto - Fernando Nieto clearly and correctly describes that this warning is given because somewhere in your code you are doing (not in the code you posted) something like:
void foo(char* str);
foo("hello");
However, if you want to keep your code warning-free as well then just make respective change in your code:
void foo(char* str);
foo((char *)"hello");
That is, simply cast the string constant to (char *).
There are 3 solutions:
Solution 1:
const char *x = "foo bar";
Solution 2:
char *x = (char *)"foo bar";
Solution 3:
char* x = (char*) malloc(strlen("foo bar")+1); // +1 for the terminator
strcpy(x,"foo bar");
Arrays can also be used instead of pointers: char x[] = "foo bar"; creates a modifiable array initialized with a copy of the literal.
Update: See the comments for security concerns regarding solution 3.
A reason for this problem (which is even harder to detect than the issue with char* str = "some string" - which others have explained) is when you are using constexpr.
constexpr char* str = "some string";
It seems that it would behave similar to const char* str, and so would not cause a warning, as it occurs before char*, but it instead behaves as char* const str.
Details
Constant pointer, and pointer to a constant. The difference between const char* str, and char* const str can be explained as follows.
const char* str : Declare str to be a pointer to a const char. This means that the data this pointer points to is constant. The pointer can be modified, but any attempt to modify the data causes a compilation error.
str++ ; : VALID. We are modifying the pointer, and not the data being pointed to.
*str = 'a'; : INVALID. We are trying to modify the data being pointed to.
char* const str : Declare str to be a const pointer to char. This means that the pointer is now constant, but the data being pointed to is not. The pointer cannot be modified, but we can modify the data using the pointer.
str++ ; : INVALID. We are trying to modify the pointer variable, which is a constant.
*str = 'a'; : VALID. We are trying to modify the data being pointed to. In our case this will not cause a compilation error, but it will likely cause a runtime error, as the string will most probably go into a read-only section of the compiled binary. This statement would make sense if we had dynamically allocated memory, e.g. char* const str = new char[5];.
const char* const str : Declare str to be a const pointer to a const char. In this case we can neither modify the pointer, nor the data being pointed to.
str++ ; : INVALID. We are trying to modify the pointer variable, which is a constant.
*str = 'a'; : INVALID. We are trying to modify the data pointed by this pointer, which is also constant.
In my case the issue was that I was expecting constexpr char* str to behave as const char* str, and not char* const str, since visually it seems closer to the former.
Also, the warning generated for constexpr char* str = "some string" is slightly different from char* str = "some string".
Compiler warning for constexpr char* str = "some string": ISO C++11 does not allow conversion from string literal to 'char *const'
Compiler warning for char* str = "some string": ISO C++11 does not allow conversion from string literal to 'char *'.
Tip
You can use C gibberish ↔ English converter to convert C declarations to easily understandable English statements, and vice versa. This is a C-only tool, and thus won't support things (like constexpr) which are exclusive to C++.
In fact, a string literal is neither a const char * nor a char * but an array: char[N] in C and const char[N] in C++. It's quite strange, but it is written down in the C++ specification. If you modify it the behavior is undefined, because the compiler may store it in read-only memory such as the code segment.
Maybe you can try this:
void foo(const char* str)
{
// Do something
}
foo("Hello");
It works for me.
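I solve this problem by adding this macro at the beginning of the code, somewhere. Or add it in <iostream>, hehe. Note that the pointer it yields is only valid until the end of the full expression that uses the macro, because the temporary std::string is destroyed there.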
#define C_TEXT( text ) ((char*)std::string( text ).c_str())
I also had the same problem. What I did was simply use const char* instead of char*, and the problem was solved. As others have mentioned above, it is a compatibility issue: C treats string literals as char arrays while C++ treats them as const char arrays.
For what it's worth, I find this simple wrapper class to be helpful for converting C++ strings to char *:
#include <string>
#include <vector>
class StringWrapper {
    std::vector<char> vec;
public:
    StringWrapper(const std::string &str) : vec(str.begin(), str.end()) {
        vec.push_back('\0'); // terminate, so getChars() yields a valid C string
    }
    char *getChars() {
        return &vec[0];
    }
};
The following illustrates the solution: assign your string literal to a non-const pointer to const char (the pointer itself can be reassigned, but the characters it points to cannot be modified):
#include <iostream>
void Swap(const char * & left, const char * & right) {
const char *const temp = left;
left = right;
right = temp;
}
int main() {
const char * x = "Hello"; // These work because you are making a variable
const char * y = "World"; // pointer to a constant string
std::cout << "x = " << x << ", y = " << y << '\n';
Swap(x, y);
std::cout << "x = " << x << ", y = " << y << '\n';
}