I found this (simplified) piece of code in our code base and it makes me uneasy. It either works, doesn't work, or is never called anyway. I would expect a buffer overflow, but when I try it in an online compiler it certainly doesn't work, yet it doesn't visibly overflow either. Looking at the definition of strcat, it writes the source to the destination starting at the destination's null terminator, but I assume that in this scenario the destination buffer (which was created by a std::string) is too small.
#include <iostream>
#include "string.h"
using namespace std;
void addtostring(char* str){
char str2[12] = "goodbye";
strcat(str, str2);
}
int main()
{
std::string my_string = "hello";
addtostring((char*)my_string.c_str());
cout << my_string << endl;
return 0;
}
What would be the actual behaviour of this operation?
The behavior is undefined. First, writing to any character through c_str is undefined behavior. Secondly, had you used data instead to get a char*, overwriting the null terminator is also undefined behavior. Lastly, both c_str and data only give you a pointer (p) that has a valid range of elements from [p, p + size()]. Writing to any element outside that range is also undefined behavior.
If you want to modify the string you need to use the string's member/free functions to do so. Your function could be rewritten to
void addtostring(std::string& str){
str += "goodbye";
}
and that will have well defined behavior.
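If a char*-style write into the string is genuinely needed (say, for a legacy C interface), one possible sketch is to grow the string first, so that every written character lies inside the range the string owns. The helper name append_via_buffer is made up for illustration; it relies only on std::string storing its characters contiguously (guaranteed since C++11).

```cpp
#include <cstring>
#include <string>

// Hypothetical helper: appends a C string by growing the std::string first,
// then copying into characters that the string actually owns.
void append_via_buffer(std::string& str, const char* suffix) {
    const std::size_t old_size = str.size();
    const std::size_t extra = std::strlen(suffix);
    str.resize(old_size + extra);                // elements [0, size()) are now writable
    std::memcpy(&str[old_size], suffix, extra);  // the terminator is maintained by std::string
}
```

Calling append_via_buffer(my_string, "goodbye") on a string holding "hello" leaves it holding "hellogoodbye". In ordinary code, str += "goodbye" remains the simpler choice.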
I have this piece of code:
int i = 0;
char s[12];
strcpy(s,"abracadabra");
cout << strlen(s);
while(i < strlen(s))
{
if (s[i]=='a') strcpy(s+i, s+i+1);
else i++;
}
cout << " " << s;
But I can't understand why the output is brcdbr.
I thought that s+i means s[n+i] or something like that?
Can someone explain to me how this works?
In your terminal type man strcpy
char *strcpy(char *dest, const char *src);
The strcpy() function copies the string pointed to by src,
including the terminating null byte ('\0'), to the buffer pointed
to by dest. The strings may not overlap, and the destination
string dest must be large enough to receive the copy. Beware of
buffer overruns! (See BUGS.)
So you are copying all bytes from your string,
from src: index i + 1 (the next letter after 'a') to '\0'
to dest: index i (letter 'a') to '\0'
NB: As stated in the comments, this is a very inefficient way to get rid of the 'a', and it is undefined behavior as well. But I guess you came here for the explanation of s+i: it means &s[i] (or even &i[s]).
Read more about "Pointer arithmetic"
If you want an efficient way take a look at this post
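A small sketch of the pointer arithmetic described above. The function names are hypothetical and exist only to make the equivalences checkable.

```cpp
#include <cstddef>
#include <cstring>

// Returns true when the three spellings name the same element address.
bool same_element(const char* s, std::size_t i) {
    return (s + i == &s[i]) && (&s[i] == &i[s]);
}

// Returns a pointer to the string that starts i characters into s (no copy is made).
const char* tail_from(const char* s, std::size_t i) {
    return s + i;
}

// Length of the string that starts at s + i.
std::size_t tail_length(const char* s, std::size_t i) {
    return std::strlen(s + i);
}
```

For "abracadabra" and i = 4, tail_from points at the 'c', so the string seen through that pointer is "cadabra".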
Can someone explain to me how this works?
It doesn't work: the behavior of the program is undefined. Nothing meaningful can be said about the outcome (even if it looks correct in some specific case).
For example, here on godbolt the output is not brcdbr but brcdrr.
That's because it's illegal to invoke strcpy with overlapping source and destination pointers:
C11 §7.24.2.3 The strcpy function
The strcpy function copies the string pointed to by s2 (including the
terminating null character) into the array pointed to by s1. If
copying takes place between objects that overlap, the behavior is
undefined.
(note: C++ inherits C rules for C standard library functions)
Anyway, if you just wanted to know the intention behind strcpy(s+i, s+i+1): in most C++ expressions an array automatically decays to a pointer to its first element, so s (declared as char s[12]) is treated as a char*. The expression s+i is then pointer arithmetic: it yields the address of the element at index i, exactly like &s[i]. The same applies to s+i+1, which evaluates to a pointer to the element at index i+1. The intention of the strcpy call was to copy the remainder of the string after an a to the memory area starting at that a, i.e. to shift the remaining characters one position toward the front, thus overwriting the a.
A better way in C++ would be to use std::string and the erase-remove idiom to remove the characters a from the string:
#include <iostream>
#include <algorithm>
#include <string>
int main() {
std::string s = "abracadabra";
s.erase(std::remove(s.begin(), s.end(), 'a'), s.end());
std::cout << s << std::endl;
}
Prints:
brcdbr
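If the data has to stay in a char array, one option is std::memmove, which (unlike strcpy) is specified to work when source and destination overlap. The following is only a sketch; remove_char_in_place is a hypothetical helper name, not something from the question.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: removes every occurrence of ch from the C string s, in place.
void remove_char_in_place(char* s, char ch) {
    std::size_t i = 0;
    while (s[i] != '\0') {
        if (s[i] == ch) {
            // Shift the tail (including its '\0') left by one; memmove permits overlap.
            std::memmove(s + i, s + i + 1, std::strlen(s + i + 1) + 1);
        } else {
            ++i;
        }
    }
}
```

This keeps the structure of the original loop but is still quadratic in the worst case; the erase-remove version does the same job in a single pass.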
Why does the following code give an error?
// This a CPP Program
#include <bits/stdc++.h>
using namespace std;
// Driver code
main()
{
string s=NULL;
s.length();
}
I know that a runtime error will occur because I am trying to get the length of the null string but I want to know why it is happening?
You invoke the following overload of the std::string constructor (overload 5):
basic_string( const CharT* s, const Allocator& alloc = Allocator());
And this is the explanation belonging to the constructor (emphasis mine):
Constructs the string with the contents initialized with a copy of the null-terminated character string pointed to by s. The length of the string is determined by the first null character. The behavior is undefined if [s, s + Traits::length(s)) is not a valid range (for example, if s is a null pointer).
Thus, you have undefined behavior at work. Referring back to your question, that rules out any further reasoning about "why it is happening", because UB can result in anything. You could wonder why it's specified as UB in the first place: this is because std::string is designed to work with C-style, zero-terminated strings (const char*), and NULL is not one. The empty, zero-terminated C-style string is "", for example.
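When a pointer that may be null has to be turned into a std::string, the check has to happen before the constructor is called. A minimal sketch, assuming that a null pointer should be treated as empty text (to_string_or_empty is a hypothetical name):

```cpp
#include <string>

// Hypothetical helper: treats a null pointer as "no text" instead of
// passing it to the std::string constructor (which would be undefined behavior).
std::string to_string_or_empty(const char* s) {
    return s ? std::string(s) : std::string();
}
```

With this, to_string_or_empty(nullptr).length() is 0 rather than undefined.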
Why does the following code give an error?
main must be declared to return int.
Also, to declare an empty string, make it string s; or string s="";
This would compile:
#include <iostream>
#include <string>
int main()
{
std::string s;
std::cout << s.length() << '\n'; // prints 0
}
On a sidenote: Please read Why should I not #include <bits/stdc++.h>?
There is no such thing as the null string unless by "null" you mean empty, which you don't.
I am learning pointers and I tried the following program:
#include <iostream>
#include <cstdlib>
#include <cstdio>
using namespace std;
char* getword()
{
char*temp=(char*)malloc(sizeof(char)*10);
cin>>temp;
return temp;
}
int main()
{
char *a;
a=getword();
cout<<a;
return 0;
}
As I understand it, a is a pointer to a character, and in the function getword() I returned temp, which I think is the base address, &temp[0]. I thought that the output would be the first character of the string I enter, but I got the entire string on stdout. How does this work?
In the tradition of C, a char* represents a string. Indeed, any string literal in your program (e.g. "hello") has an array type, const char[N], which decays to const char* in most expressions.
Thus, operator<<(std::ostream&, const char*) is implemented as string output. It outputs characters beginning at the address it is given until it encounters the string terminator (also known as the null terminator, '\0').
If you want to output a single character, you need to dereference the pointer into a char type. You can choose one of the following syntaxes:
cout << *a; // Dereference the pointer
cout << a[0]; // Use array index of zero to return the value at that address
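The difference between the two overloads can be made visible by streaming into a std::ostringstream instead of cout. This is only an illustrative sketch; both function names are made up.

```cpp
#include <sstream>
#include <string>

// Streams the pointer itself: the const char* overload prints up to '\0'.
std::string stream_pointer(const char* p) {
    std::ostringstream out;
    out << p;
    return out.str();
}

// Streams the dereferenced pointer: the char overload prints one character.
std::string stream_first_char(const char* p) {
    std::ostringstream out;
    out << *p;
    return out.str();
}
```

For the input "hello", the first function yields "hello" and the second yields "h".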
It should be noted that the code you provided isn't very C++ish. For starters, we generally don't use malloc in C++. You then leak the memory by not calling free later. The memory is uninitialised and relies on cin succeeding (which might not be the case). Also, you can only handle input strings of up to 9 characters before you will get undefined behaviour.
Perhaps you should learn about the <string> library and start using it.
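As a rough idea of what that could look like, here is a sketch of getword built on std::string. Taking the stream as a parameter is an assumption made here so the function is not tied to cin; first_word_of is a hypothetical helper for trying it out.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Reads one whitespace-delimited word; no fixed-size buffer, nothing to free.
std::string getword(std::istream& in) {
    std::string word;
    in >> word;  // word stays empty if nothing could be read
    return word;
}

// Hypothetical helper: runs getword on an in-memory string instead of cin.
std::string first_word_of(const std::string& text) {
    std::istringstream in(text);
    return getword(in);
}
```

In the original program the call would be getword(std::cin), and words longer than 9 characters are no longer a problem.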
It's true that char* "points to a character". But, by convention, and because with pointers there is no other way to do so, we also use it to "point to more than one character".
Since use of char* almost always means you're using a pointer to a C-style string, the C++ streams library makes this assumption for you, printing the char that your pointer points to … and the next … and the next … and the next, until a null terminator ('\0') is found. That's just the way it's been designed to work.
You can print just that character if you like by dereferencing the pointer to obtain an actual char.
std::cout is an ostream object whose operator<< is overloaded; when it receives a char * as an operand, it treats it as a pointer to a C-style string and prints the entire string.
If you want to print the first character of the string then use
cout << *a;
or
cout << a[0];
In your code, std::cout is an ostream and providing a char* variable as input to operator<< invokes a particular operator function overload to write characters to the ostream.
std::ostream also has an operator<< overload for writing a single character to itself.
I'm assuming you now know how to dereference a char* variable, but you should be using std::string instead of an unsafe char* type.
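Here is corrected code that prints a single character
#include <stdio.h>
#include <stdlib.h>
char* getword()
{
char*temp=(char*)malloc(sizeof(char)*10);
if (temp == NULL) return NULL; // allocation failed
if (scanf("%9s",temp) != 1) temp[0] = '\0'; // read at most 9 characters plus '\0'
return temp;
}
int main()
{
char *a;
a = getword();
if (a == NULL) return 1;
int currChar = 0; // index of the character to print; 0 is the first one
printf("%c",*(a + currChar)); //increment currChar to get next character
free(a); // release the buffer allocated in getword()
return 0;
}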
I am confused by const pointers in C++ and wrote a small application to see what the output would be. I am attempting (I believe) to add a pointer to a string, which should not work correctly, but when I run the program I correctly get "hello world". Can anyone help me figure out how this line (s += s2) is working?
My code:
#include <iostream>
#include <stdio.h>
#include <string>
using namespace std;
const char* append(const char* s1, const char* s2){
std::string s(s1); //this will copy the characters in s1
s += s2; //add s and s2, store the result in s (shouldn't work?)
return s.c_str(); //return result to be printed
}
int main() {
const char* total = append("hello", "world");
printf("%s", total);
return 0;
}
The variable s is local inside the append function. Once the append function returns that variable is destructed, leaving you with a pointer to a string that no longer exists. Using this pointer leads to undefined behavior.
My tip to you on how to solve this: Use std::string all the way!
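A minimal sketch of what that could look like for the append function from the question, returning the result by value so that nothing dangles:

```cpp
#include <string>

// Returns the concatenation by value; the caller owns the result.
std::string append(const std::string& s1, const std::string& s2) {
    std::string s(s1);  // copy the characters of s1
    s += s2;            // append s2
    return s;
}
```

If a const char* is needed at the call site (for printf, say), keep the returned std::string in a variable and call c_str() on that variable, so the pointer stays valid for as long as the variable lives.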
You're adding a const char* to a std::string, and that is possible (see this reference). It wouldn't be possible to apply += to two plain char* values (C-style strings).
However, you're returning a pointer into a local variable, so once append returns and its stack frame is popped, the string that the returned pointer refers to no longer exists. This leads to undefined behavior.
Class std::string has an overloaded operator += for an operand of type const char *
basic_string& operator+=(const charT* s);
In fact it simply appends the string pointed to by this pointer to the contents of the std::string object, allocating additional memory if required. For example, internally the overloaded operator could use the standard C function strcat.
Conceptually it is similar to the following code snippet.
char s[12] = "Hello ";
const char *s2 = "World";
std::strcat( s, s2 );
Take into account that your program has undefined behaviour because total becomes invalid once the local object s is destroyed on exit from the function append. So the next statement in main
printf("%s", total);
can result in undefined behaviour.
Let's say that a function which returns a fixed ‘random text’ string is written like
char *Function1()
{
return "Some text";
}
then the program could crash if it accidentally tried to alter the value doing
Function1()[1]='a';
What are the square brackets after the function call attempting to do that would make the program crash? If you're familiar with this, any explanation would be greatly appreciated!
The string you're returning in the function is usually stored in a read-only part of your process. Attempting to modify it will cause an access violation. (EDIT: Strictly speaking, it is undefined behavior, and in some systems it will cause an access violation. Thanks, John).
This is usually the case because the string itself is hardcoded along with the code of your application. When the program is loaded, pointers are established to point to those read-only sections of your process that hold the string literals. In fact, a string literal in C++ has type const char[N] and decays to const char* (a pointer to const memory); in C the array is not const-qualified, but writing to it is still undefined.
The signature of that function should really be const char* Function1();.
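A short sketch of how the const-correct signature helps: the accidental write stops compiling, and a deliberate modification has to go through a copy. The function modified_copy is a hypothetical addition for illustration.

```cpp
#include <string>

// Const-correct version of the function from the question.
const char* Function1() {
    return "Some text";
}

// Hypothetical helper: modifies a copy, leaving the literal untouched.
std::string modified_copy() {
    std::string copy = Function1();
    copy[1] = 'a';
    // Function1()[1] = 'a';  // would not compile: assignment to a const char
    return copy;
}
```

Here modified_copy() yields "Same text", while Function1() still returns "Some text".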
You are trying to modify a string literal. According to the Standard, this invokes undefined behavior. Another (related) thing to keep in mind is that string literals always have type const char[N], which decays to const char*. C++03 had a special, deprecated dispensation to convert a string literal to char*, dropping the const qualifier (C++11 removed it), but the underlying characters are still const. So by doing what you are doing, you are trying to modify a const object. This also invokes undefined behavior, and is akin to trying to do this:
const char* val = "hello";
char* modifyable_val = const_cast<char*>(val);
modifyable_val[1] = 'n'; // this evokes UB
Instead of returning a char* from your function, return a std::string by value. This will construct a new string based on the string literal, and the calling code can do whatever it wants:
#include <string>
std::string Function1()
{
return "Some text";
}
...later:
std::string s = Function1();
s[1] = 'a';
Now, if you are trying to change the value that Function1() returns, then you'll have to do something else. I'd use a class:
#include <string>
class MyGizmo
{
public:
std::string str_;
MyGizmo() : str_("Some text") {};
};
int main()
{
MyGizmo gizmo;
gizmo.str_[1] = 'n';
}
You can return a static char string, but you must never write to it. When it crashes, it typically shows up as an access violation error. The behavior is not defined by the C++ Standard.
It's not the brackets, but the assignment. Your function effectively returns not a plain char * but a const char * (I may be wrong here, but the memory is read-only), so you are trying to change unchangeable memory. The brackets just give you access to an element of the array.
Note also that you can avoid the crash by placing the text in a regular array:
char Function1Str[] = "Some text";
char *Function1()
{
return Function1Str;
}
The question shows that you do not understand string literals.
Imagine this code:
const char* pch = "Here is some text";
const char* pch2 = "some text";
const char* pch3 = "Here is";
Now, how the compiler allocates memory for the strings is entirely a matter for the compiler. The memory might be organised like this:
Here is<NULL>Here is some text<NULL>
with pch2 pointing to a memory location inside the pch string.
The key here is understanding the memory. Using the Standard Template Library (STL) would be good practice, but it may be quite a steep learning curve for you.