Is nullptr used to terminate C-style strings? - c++

I am confused by the use of nullptr in an example from A Tour of C++:  
int count_x(char* p, char x)
// count the number of occurrences of x in p[]
// p is assumed to point to a zero-terminated array of char (or to nothing)
{
if (p==nullptr) return 0;
int count = 0;
for (; p!=nullptr; ++p)
if (*p==x) ++count;
return count;
}
// The definition of count_x() assumes that the char* is a C-style string,
// that is, that the pointer points to a zero-terminated array of char.
I understand that count_x should terminate if p is unassigned, and the for loop should terminate when it reaches the end of the C-style string referenced by p.
However, when I build a main function to use count_x(), it never seems to terminate correctly:
int main () {
char teststring[] = {'b', 'l', 'a', 'h', '\0'};
cout << "teststring is: " << teststring << endl;
cout << "Number of b's is: " << count_x(teststring, 'b') << endl;
return 0;
}
Executing this prints a lot of garbage, and then exits with a segmentation fault. If I replace the for (; p!=nullptr; ++p) in count_x with for (; *p!='\0'; ++p), it executes properly. I guess this means that the string is not terminated correctly. If so, how do I terminate a C-style string so that nullptr can be used here?
Edit: there's been a discussion in the comments that's clarified this situation. I'm using the first print of the book from September 2013, where the above is printed in error. The third print of the book from January 2015 (linked in the comments) has the corrected example which uses for (; *p!=0; ++p) instead of for (; p!=nullptr; ++p). This correction is also documented in the errata for the book. Thanks!
Edit2: Sorry guys, this was apparently already asked on SO earlier here: Buggy code in "A Tour of C++" or non-compliant compiler?

No, a NULL pointer is not used to terminate strings. The NUL character is used. They are different things, but if you assign either of them to an integer in C, the result is always the same: zero (0).
A NUL character is represented in C as '\0' and takes up one char of storage. A NULL pointer is represented in C as 0, and takes up the same storage as void *. For example, on a 64-bit machine, a void * is 8 bytes while '\0' is one byte.
In other words, nullptr is not the same thing as '\0'. The terminator is the null character, called NUL; it should not be called a NULL byte or a NULL character.
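To make the distinction concrete, here is a minimal sketch of a corrected count_x. It uses the character test from the book's later printing (as quoted in the question's edit); taking a const char* instead of char* is my own assumption, made so that string literals can be passed too. The null pointer check guards the pointer itself, and the character test finds the end of the string.

```cpp
// Counts occurrences of x in the zero-terminated string p.
// Returns 0 when p is a null pointer.
int count_x(const char* p, char x)
{
    if (p == nullptr) return 0;      // the pointer itself: "points to nothing"
    int count = 0;
    for (; *p != '\0'; ++p)          // the pointed-to char: end of the string
        if (*p == x) ++count;
    return count;
}

// The two "nulls" have different types and sizes.
static_assert(sizeof('\0') == 1, "a char always occupies one byte");
static_assert(sizeof(nullptr) == sizeof(void*), "nullptr is as large as void*");
```

With the array from the question, count_x(teststring, 'b') returns 1, and count_x(nullptr, 'b') returns 0 without dereferencing anything.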

Related

Why do we need a pointer here

I'm new to C++, and while going through pointers I don't quite understand why I need to use *p in the while loop condition check below.
It is a very simple function that counts occurrences of the character x in the array, with an example invocation as in the below main() function. We assume here that p will point to an array of chars. Purely for demonstration.
int count(char* p, char x) {
int count = 0;
while (*p != NULL) { // why is *p required here if p is already a pointer?
if (x == *p) count++;
p++;
}
return count;
}
int main(){
char a[5] = {'a','a','c','a',NULL};
char* p = a;
std::cout << count(p, 'a') << std::endl;
}
Why do I need
while (*p != NULL)
Since p is already a pointer, I thought
while (p != NULL)
should be enough, but the program crashes.
Incrementing the pointer will make it point to the next element in the character array. Incrementing the pointer will never make it equal to a null pointer (nullptr or NULL).
c-strings are nul-terminated. The end of the string is marked with an element with value '\0'. In main this is the last element of the array and the loop in the function will stop when it reaches that last element.
p is the pointer to the element. *p is the element.
Using NULL for that condition is misleading. NULL should not be used in C++ anymore. A null pointer is nullptr and the terminator in strings is '\0'. The code works nevertheless because NULL just happens to compare equal to 0 and '\0'. It was meant to be used for pointers, though, not for char.
The code can be written like this:
int count(char* p, char x) {
int count = 0;
while (*p != '\0') { // test the pointed-to char, not the pointer
if (x == *p) count++;
p++;
}
return count;
}
int main(){
char a[5] = {'a','a','c','a','\0'};
std::cout << count(a, 'a') << std::endl;
}
Or better, use std::string and std::count:
#include <algorithm> // std::count
#include <iostream>
#include <string>
int main() {
std::string s{"aaca"};
std::cout << std::count(s.begin(),s.end(),'a');
}
Note that string literals automatically include the terminator. So "aaca" is a const char[5], an array of 5 characters, and the last one is '\0'. With std::string the details are a little hairy, but s[4] is also '\0'. Note that this is in contrast to other containers, where container[container.size()] is out-of-bounds and wrong.
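The claim about the terminator can be checked directly. This is a small sketch; the helper name char_at_size is made up for the illustration.

```cpp
#include <string>

// A string literal's array type includes the terminating '\0'.
static_assert(sizeof("aaca") == 5, "four letters plus the terminator");

// Returns the character stored at index size(). For std::string this
// access is well defined and yields '\0'.
char char_at_size(const std::string& s)
{
    return s[s.size()];
}
```

For std::string{"aaca"}, size() is 4 and char_at_size returns '\0'. The same index on a std::vector would be out of bounds.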
p is a pointer to char. So if you check the value of p it will be an address to that char (the first char in a string or array). So that address will be non-zero whether you are pointing to the first character or the last character.
In C or C++ strings are traditionally null terminated, meaning that the end of a string is marked by the null-terminator which is a single char with the value 0. To check the value of the char that the pointer p is pointing to, you need to de-reference it. De-referencing is done by prepending a * to the expression. In this case we extract the value that p is pointing to and not the address that p points to.
You are basically having an array of char, and as an example it might look like this in memory:
Address   ASCII value   Value
1000      97 (0x61)     'a'
1001      97 (0x61)     'a'
1002      99 (0x63)     'c'
1003      97 (0x61)     'a'
1004      0 (0x00)      '\0' (NUL)
To begin with, p will point to the first char, that is, address 1000, so the value of p is 1000 and the value of *p is 97 or 'a'. As you increment p it will change to 1001, 1002, etc. until it gets to 1004, where the value of p is 1004 and the value of *p is 0.
Had you written while (p != NULL) instead of *p you would essentially have checked whether 1004 != 0 which would be true, and you would continue past the end of the string.
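The walk through the addresses can be sketched in code. The function name and the bool out-parameter are made up for the illustration, and the real addresses will of course differ from 1000 to 1004.

```cpp
#include <cstddef>

// Walks p forward until the pointed-to char is the terminator and
// reports how many steps that took. At that point p holds the address
// of the terminator, so p itself still compares unequal to nullptr.
std::size_t steps_to_terminator(const char* p, bool& pointer_still_non_null)
{
    const char* start = p;
    while (*p != '\0') ++p;                  // tests the value, as in the answer
    pointer_still_non_null = (p != nullptr); // the address is still non-zero
    return static_cast<std::size_t>(p - start);
}
```

For the array {'a','a','c','a','\0'} this returns 4 and sets the flag to true, which is exactly why a loop on p != NULL never stops at the end of the string.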
I know a lot of (older) tutorials start with (naked) pointers and "C" style arrays but they are really not the first things you should use.
If possible, in C++ try to write solutions that do not depend on pointers.
For holding text, use std::string.
#include <string> // stop using char* for text
#include <algorithm> // has the count_if method
#include <iostream>
int count_matching_characters(const std::string& string, char character_to_match)
{
int count{ 0 };
// range-based for loop, looping over all characters in the string
for (const char c : string)
{
if (c == character_to_match) count++;
}
return count;
// or using a lambda function and algorithm
/*
return std::count_if(string.begin(), string.end(), [&](const char c)
{
return (character_to_match == c);
});
**/
}
int main()
{
int count = count_matching_characters("hello world", 'l');
std::cout << count;
return 0;
}

Aren't pointers supposed to point to nullptr when an array ends?

Why does this code not work? I used the function from "A Tour of C++", and as I understood its brief explanation, pointers point to nullptr when an array ends. I tried to implement it and it doesn't show anything. Thanks in advance.
#include <iostream>
int count_x(char* p, char x)
{
int count = 0;
while (p)
{
if (*p == x){++count;}
p++;
}
return count;
}
int main()
{
char my_string[] {"hello world"};
char* my_string_ptr {my_string};
std::cout << "There are " << count_x(my_string_ptr,'a') << " a in the string\n";
return 0;
}
No, a pointer at the end of an array is not null. You probably want:
while (*p)
which is the same as
while (*p != '\0')
and
while (*p != 0)
which are testing for the null character.
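As a quick check that the three spellings behave the same, here is a sketch with one counting function per condition. The function names are made up for the comparison.

```cpp
// Three spellings of the same loop condition; each stops at the terminator.
int count_implicit(const char* p, char x)
{
    int n = 0;
    for (; *p; ++p)            // a char converts to false only when it is 0
        if (*p == x) ++n;
    return n;
}

int count_char_literal(const char* p, char x)
{
    int n = 0;
    for (; *p != '\0'; ++p)    // compare with the null character
        if (*p == x) ++n;
    return n;
}

int count_zero(const char* p, char x)
{
    int n = 0;
    for (; *p != 0; ++p)       // compare with the integer 0
        if (*p == x) ++n;
    return n;
}
```

All three return 3 for ("hello world", 'l') and 0 for ("hello world", 'a').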
p stores an address in memory, which is dereferenced with an asterisk (*), as in *p.
In your code,
p++;
increments the address that p currently holds. When you reach the end of the string (the terminating null character), p holds the address of the null character, not the null character itself, so
while(p)
is still true, and the following p++ moves p to the next memory location (which does not belong to the string). p keeps holding a non-zero address, so the while loop keeps running and reading memory it does not own, which typically ends in a segmentation fault.
As Jeffrey mentioned in his answer, test *p in the while condition, so that when p reaches the null character, dereferencing it yields 0 and the loop ends.
Your original code crashed (or caused segmentation fault). The reason is that in the while loop, you did not correctly specify the condition to exit this while loop. Therefore, the while loop will continue to go past the end of the string, which causes segmentation fault (i.e, your app will crash).
One correct way to exit the while loop is to compare each character of the string to see if that character equals the end of string character that is '\0'.
When you encounter this end of string character that is '\0', you should exit the while loop.
I have fixed one line of your code, and your function is now working fine as shown below:
int count_x(char* p, char x)
{
int count = 0;
while (*p != '\0') // I fixed your code here
{
if (*p == x){++count;}
p++;
}
return count;
}
Again, please note that the code above runs well, and produces the correct result as I have verified it. Please let me know if you run into any issue. Cheers.

Pointer syntax and incrementation

#include <iostream>
#include <cstdlib>
using namespace std;
void reverse(char* str){
char *end = str;
char tmp;
if(str){
cout << "hello" << endl;
while(*end){
cout << end << endl;
++end;
}
--end;
while (str < end){
tmp = *str;
*str++ = *end;
*end-- = tmp;
}
}
}
int main(){
char str[] = "helloyouarefunny";
string input = str;
reverse(str);
for(int i = 0; i < input.length(); i++) {
cout << str[i];
}
}
Is if(str){} equivalent to if(str == NULL){}?
What does while(*end){} mean and what is it exactly doing? I think I have a general understanding that the while loop will continue to be executed as long as it does not "see" a '\0'. But I am not sure what is exactly going on with this line of code.
Given that if(str){} is an equivalent statement to if(str == NULL){}, what would you pass into a function to make str = NULL? For example, in my main(){}, I tried to do char str[] = NULL, thereby, attempting to pass a NULL so that it wouldn't go inside the code if(str == NULL){}. But I get an error saying I cannot make this declaration char str[] = NULL. So my question is why am I getting this error and what can I pass through the reverse() function in order to make the code inside of if(str){} not execute? I hope this question made sense.
And the code ++end is doing pointer arithmetic correct? So every time it is incremented, the address is moving to the address right next to it?
I'm a little confused while(str < end){}. What is the difference between just str and *str? I understand that cout << str << endl; has to do with overloading the operator << and therefore prints the entire string that is passed through the argument. But why, when I cout << *end << endl;, it only prints the character at that memory address? So my question is, what's the difference between the two? Is it just dereferencing when i do *str? I might actually be asking more than that question in this question. I hope I don't confuse you guys >_<.
Is if(str){} equivalent to if(str == NULL){}?
No, if(str){} is equivalent to if(str != NULL){}
What does while(*end){} mean and what is it exactly doing?
Since the type of end is char*, while(*end){} is equivalent to while (*end != '\0'). The loop is executed for all the characters of the input string. When the end of the string is reached, the loop stops.
Given that if(str){} is an equivalent statement to if(str == NULL){}
That is not correct. I did not read rest of the paragraph since you start out with an incorrect statement.
And the code ++end is doing pointer arithmetic correct? So every time it is incremented, the address is moving to the address right next to it?
Sort of. The value of end is incremented. It then points to the object that follows the one it pointed to before the operation.
I'm a little confused while(str < end){}
In the previous while loop, end was incremented starting from str until it reached the end of the string. In this loop, str is incremented and end is decremented on every pass, so the two move toward each other. When they meet or cross, the condition of the while statement evaluates to false and the loop ends.
Update
Regarding
what would you pass into a function to make str = NULL?
You could simply call
reverse(NULL);
I tried to do char str[] = NULL;
str is an array of characters. It can be initialized in a couple of ways:
// This is what you have done.
char str[] = "helloyouarefunny";
// Another, more tedious way:
char str[] = {'h','e','l','l','o','y','o','u','a','r','e','f','u','n','n','y', '\0'};
Notice the presence of an explicitly specified null character in the second method.
You cannot initialize a variable whose type is an array of chars to NULL. The language does not allow that. You can initialize a pointer to NULL but not an array.
char* s1 = NULL; // OK
reverse(s1); // Call the function
s1 = static_cast<char*>(malloc(10)); // Allocate memory for the pointer (C++ needs the cast).
strcpy(s1, "test"); // Put some content in the allocated memory
reverse(s1); // Call the function, this time with some content.
free(s1); // Release the memory when done.
These are pretty standard C programming idioms.
No, in fact if (str) ... is equivalent to if (str != NULL) ...
C character strings are null terminated, meaning that "Hello" is represented in memory as the character array {'H', 'e', 'l', 'l', 'o', '\0'}. As with pointers, the 0 or NULL value is considered false in a logical expression. Thus while (*end) ... will execute the body of the while loop so long as end has not reached the null character.
N/A
Correct - this advances to the next character in the string, or to the null terminator.
This is the reverse algorithm. After the first loop, end points at the null terminator; the --end that follows moves it back to the last character, while str still points to the beginning. Now we work these two pointers toward each other, swapping characters.
1/2) In C and C++, whatever is in the if or while is evaluated as a boolean. 0 is evaluated to false while any other value is evaluated to true. Given that NULL is equivalent to 0, if(str) and if(str != NULL) do the same things.
Likewise, while(*end) will only loop so long as the value end is pointing to does not evaluate to 0.
3) If you pass a char pointer to this function, it could be the null pointer (char *str = 0), so you're checking to make sure str is not null.
4) Yes, the pointer is then pointing to the next location in memory until eventually you find the null at the end of the string.
5) Perhaps your confusion comes from the code having no parentheses. They are optional here, because postfix ++ and -- bind more tightly than the unary *, but written out in full the loop looks like:
while (str < end){
tmp = *str;
*(str++) = *end;
*(end--) = tmp;
}
The two pointers keep making their way toward each other until they meet or cross paths (at which point str is no longer less than end).
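Putting the answers together, here is a sketch of the function without the debug output. The early return for an empty string is my addition and not part of the original code: without it, --end would step to a position before the start of the array.

```cpp
// In-place reversal of a zero-terminated string. Does nothing for a
// null pointer or an empty string.
void reverse(char* str)
{
    if (str == nullptr || *str == '\0') return;
    char* end = str;
    while (*end) ++end;   // end now points at the terminator
    --end;                // step back to the last real character
    while (str < end) {
        char tmp = *str;
        *str++ = *end;    // same as *(str++) = *end
        *end-- = tmp;     // same as *(end--) = tmp
    }
}
```

Calling it on a char array holding "abc" leaves "cba" in the array, and reverse(nullptr) simply returns.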

string to character conversion error?

After trying for about 1 hour, my code didn't work because of this:
void s_s(string const& s, char data[10])
{
for (int i = 0; i < 10; i++)
data[i] = s[i];
}
int main()
{
string ss = "1234567890";
char data[10];
s_s("1234567890", data);
cout << data << endl;//why junk
}
I simply don't understand why the cout displays junk after the char array. Can someone please explain why and how to solve it?
You need to null terminate your char array.
The operator<< overload that std::cout uses for a char* relies on '\0' to know where to stop.
Your char[] decays to char* by the way.
Look here.
As already mentioned you want to NUL terminate your array, but here's something else to consider:
If s is your source string, then you want to loop to s.size(), so that you don't loop past the size of your source string.
void s_s(std::string const& s, char data[20])
{
for (unsigned int i = 0; i < s.size(); i++)
data[i] = s[i];
data[s.size()] = '\0';
}
Alternatively, you can try this:
std::copy(ss.begin(), ss.begin()+ss.size(),
data);
data[ss.size()] = '\0';
std::cout << data << std::endl;
You have ONLY allocated 10 bytes for data
The string is actually 11 bytes since there is an implied '\0' at the end
At a minimum you should increase the size of data to 11, and change your loop to copy the '\0' as well
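One way to keep the size bookkeeping in one place is to pass the buffer capacity along. This is only a sketch; copy_terminated and its parameters are names I made up, not anything from the question.

```cpp
#include <cstddef>
#include <string>

// Copies s into data (capacity bytes in total), truncating if needed,
// and always writes a terminating '\0' when capacity is non-zero.
// Returns the number of characters copied, not counting the terminator.
std::size_t copy_terminated(const std::string& s, char* data, std::size_t capacity)
{
    if (capacity == 0) return 0;   // no room even for the terminator
    std::size_t n = s.size() < capacity - 1 ? s.size() : capacity - 1;
    for (std::size_t i = 0; i < n; ++i)
        data[i] = s[i];
    data[n] = '\0';
    return n;
}
```

With char data[11], copy_terminated("1234567890", data, sizeof data) copies all ten digits and puts '\0' in data[10], so printing data stops where it should.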
The operator<< overload that you are using in the last line of main will take your char array as a pointer and will print every char until the null sentinel character is found (the character is \0).
This sentinel character is generally generated for you in statements where a C-string literal is defined:
char s[] = "123";
In the above example sizeof(s) is 4 because the actual characters stored are:
'1', '2', '3', '\0'
The last character is fundamental in tasks that require to loop on every char of a const char* string, because the condition for the loop to terminate, is that the \0 must be read.
In your example, the "junk" that you see is the bytes that follow the last element of the char array in memory (interpreted as char). Reading them is undefined behavior and can potentially crash the program.
One solution is to obviously add the \0 char at the end of the char array (of course fixing the size).
The best solution, though, is to never use const char* for strings at all. You are correctly using std::string in your example, which will prevent this kind of problems and many others.
If you ever need a const char* (for C APIs for example) you can always use std::string::c_str and retrieve the C string version of the std::string.
Your example could be rewritten to:
int main(int, char*[]) {
std::string ss = "1234567890";
const char* data = ss.c_str();
std::cout << data << std::endl;
}
(in this particular instance, a version of std::ostream::operator<< that takes a std::string is already defined, so you don't even need data at all)

Counting Bytes Of Char S

I have a homework assignment:
Write the code of the function below. The function should count the number of bytes in s until it reaches '\0'.
The function:
unsigned len(const char* s);
I really do not know what this homework means. Can anyone write this homework's code, please?
Furthermore, can anyone please explain what "const char* s" means? If you can explain with some examples, it would be perfect.
Here is a code which I'm trying to do:
unsigned len(const char* s)
{
int count=0;; int i=0;
while (*(s+i)!=0)
{
count++;
i++;
}
return count;
}
But in the main function I do not know what should I write, BTW I have written this:
const char k='m';
const char* s=&k;
cout << len(s) << endl;
The result is always 4! I really do not know what I should do for this question. If I can store only one character in a const char, the result should always be the same. What exactly is this question looking for?
The homework means you should write a function that behaves like this:
int main() {
char s[] = {'a','b','c','\0'};
unsigned s_length = len(s);
// s_length will be equal to 3 ('a','b','c', not counting '\0')
}
I think it's unlikely that anyone will do your homework for you here.
Presumably your class has covered function parameters, pointers, and arrays if you're being asked to do this. So I guess you're asking about const. const char* s means that s points to a const char, which means you're not allowed to modify the char. That is, the following is illegal:
unsigned len(const char *s) {
*s = 'a'; // error, modifying a const char.
}
Here are the basic things you need to know about pointers to write the function. First, in this case the pointer is pointing at an element in an array. That is:
char A[] = {'a','b','c','\0'};
char const *s = &A[0]; // s = the address of A[0];
The pointer points to, or references, a char. To get that char you dereference the pointer:
char c = *s;
// c is now equal to A[0]
Because s points at an element of an array, you can add to and subtract from the pointer to access other elements of the array:
const char *t = s+1; // t points to the element after the one s points to.
char d = *t; // d equals A[1] (because s points to A[0])
You can also use the array index operator:
char c = s[0]; // c == A[0]
c = s[1]; // c == A[1]
c = s[2]; // c == A[2]
What would you use to look at each element of the array sequentially, with an increasing index?
Your proposed solution looks like it should work correctly. The reason you're getting a result of 4 is just coincidence. You could be getting any results at all. The problem with the way you're calling the function:
const char k='m';
const char* s=&k;
cout << len(s) << endl;
is that there's no '\0' guaranteed to be at the end. You need to make an array where one of the elements is 0:
const char k[] = { 1,2,3,0};
const char* s = &k[0];
cout << len(s) << '\n'; // prints 3
char m[] = { 'a', 'b', 'c', 'd', '\0', 'e', 'f'};
cout << len(m) << '\n'; // prints 4
char const *j = "Hello"; // automatically inserts a '\0' at the end
cout << len(j) << '\n'; // prints 5
In C (and by extension C++), strings can be represented as a sequence of characters terminated by a null character. So, the string "abc" would be represented as
'a', 'b', 'c', '\0'
This means, you can get the length of a C string by counting each character until you encounter a null. So if you have a null terminated const char* string, you can find out the length of that string by looping over the string and incrementing a counter variable until you find the '\0' character.
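Since the asker has already posted a working attempt, here is a compact sketch of the counting loop described above, using the signature from the assignment.

```cpp
// Counts the chars before the first '\0'. s must point into a
// zero-terminated array, otherwise the loop reads out of bounds.
unsigned len(const char* s)
{
    unsigned count = 0;
    while (s[count] != '\0')   // s[count] is the same as *(s + count)
        ++count;
    return count;
}
```

len("Hello") returns 5, and for an array such as {'a','b','c','d','\0','e','f'} it returns 4, because counting stops at the first terminator.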
It means you have a string like hello world. Every string terminates with a \0. That means it looks like this: hello world\0
Now step over the char array (char* s) until you find \0.
Update:
\0 is in fact only one single character, with the value 0x00. The backslash is used to show that this value is meant, rather than the digit 0, in a string.
Example:
0abc\0 -> a string that starts with the digit 0 and is terminated with 0x00.
EDIT
char * indicates the type of the variable s. It is a pointer to a char, here the first element of a character array. const means that the characters are read-only through s and can't be changed.
Do you actually mean "count the characters till you find a '\0'"?
If so, you could implement it like this:
for each character
if it is not 0
increment x (where x is variable holding number of characters found)
otherwise
stop looking
return x
I am not going to write your homework either :P, but let me give you a hint: it's called "pointer arithmetic". A pointer is exactly what its name says: a pointer to a memory "cell". As you know, all variables in memory are stored in "cells" that you can refer to by an address. A C string is stored in contiguous cells in memory, so for example "abc" would look something like this (the '\0' is added by the compiler when you define a string literal with quotes):
+----+----+----+----+
|'a' |'b' |'c' |'\0'|
+----+----+----+----+
^
s
and you also get the address of the first char. Now, to get the address of 'b', you can simply add one to s like this: (s + 1). To get what is actually in the cell that s points to, you should use the * operator:
*s == 'a' and *(s + 1) == 'b'. This is called pointer arithmetic.
Note: in this case adding one to the pointer moves it to the next cell, because char is one byte long. If you define a pointer to a bigger type (a long int of 4 bytes, for example), adding one moves it to the position in memory where the next object of that type would begin (for that 4-byte long int, +4 bytes).
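The step size can be observed directly. This is a sketch with a made-up helper name; note that the size of long depends on the platform, so the code compares against sizeof rather than a fixed 4.

```cpp
#include <cstddef>

// Byte distance between p and p + 1 for a pointer to T.
template <typename T>
std::size_t stride_in_bytes(const T* p)
{
    // Viewing both addresses as char pointers makes the difference
    // come out in bytes instead of in elements.
    const char* here = reinterpret_cast<const char*>(p);
    const char* next = reinterpret_cast<const char*>(p + 1);
    return static_cast<std::size_t>(next - here);
}
```

For a char array the result is 1, and for a long array it is sizeof(long), whatever that happens to be on the machine at hand.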
Now that should be enough help to finish your hw.
OK, I have found my answer. Please check whether I am right:
#include <iostream>
using namespace std;
unsigned len(const char*);
int main()
{
const char* s = "Hello";
cout << len(s) << endl;
return 0;
}
unsigned len(const char* s)
{
int count=0;; int i=0;
while (*(s+i)!=0)
{
count++;
i++;
}
return count;
}
So this shows that I have put "Hello" into const char* s. So for const char* variables I should use strings like "Hello", written with double quotes ("). Is that true?