Aren't pointers supposed to point to nullptr when an array ends?

Why does this code not work? I took the function from "A Tour of C++", which briefly explains that pointers point to nullptr when an array ends. I tried to implement it and it doesn't show anything. Thanks in advance.
#include <iostream>
int count_x(char* p, char x)
{
int count = 0;
while (p)
{
if (*p == x){++count;}
p++;
}
return count;
}
int main()
{
char my_string[] {"hello world"};
char* my_string_ptr {my_string};
std::cout << "There are " << count_x(my_string_ptr,'a') << " a in the string\n";
return 0;
}

No, a pointer at the end of an array is not null. You probably want:
while (*p)
which is the same as
while (*p != '\0')
and
while (*p != 0)
which are testing for the null character.

p stores an address in memory, which is dereferenced with an asterisk (*), as in *p.
In your code,
p++;
increments the address that p currently holds. When you reach the end of the string (the terminating null character), p holds the address of the null character, not the null character itself, so
while(p)
is still true, and the following p++ moves p to the next memory location (which does not belong to the string). p is still a non-zero address, so the while loop keeps running past the end of the array. That is undefined behavior, and it typically ends in a segmentation fault.
As Jeffrey has mentioned in his answer too, use *p in the while condition, so that the loop tests the character p points to and stops at the null character.

Your original code crashed (or caused a segmentation fault). The reason is that the while loop has no correct exit condition. The loop therefore continues past the end of the string, which causes a segmentation fault (i.e., your app crashes).
One correct way to exit the while loop is to compare each character of the string to see if that character equals the end of string character that is '\0'.
When you encounter this end of string character that is '\0', you should exit the while loop.
I have fixed one line of your code, and your function is now working fine as shown below:
int count_x(char* p, char x)
{
int count = 0;
while (*p != '\0') // I fixed your code here
{
if (*p == x){++count;}
p++;
}
return count;
}
Again, please note that the code above runs well, and produces the correct result as I have verified it. Please let me know if you run into any issue. Cheers.

Related

Why do we need a pointer here

I'm new to C++, and while going through pointers I don't quite understand why I need to use *p in the while loop condition check below.
It is a very simple function that counts occurrences of the character x in the array, with an example invocation as in the below main() function. We assume here that p will point to an array of chars. Purely for demonstration.
int count(char* p, char x) {
int count = 0;
while (*p != NULL) { // why *p required here if p is already a pointer?
if (x == *p) count++;
p++;
}
return count;
}
int main(){
char a[5] = {'a','a','c','a',NULL};
char* p = a;
std::cout << count(p, 'a') << std::endl;
}
Why do I need
while (*p != NULL)
Since p is already a pointer, I thought
while (p != NULL)
should be enough, but the program crashes.

Incrementing the pointer will make it point to the next element in the character array. Incrementing the pointer will never make it equal to a null pointer or NULL.
C strings are nul-terminated. The end of the string is marked with an element with value '\0'. In main this is the last element of the array, and the loop in the function will stop when it reaches that last element.
p is the pointer to the element. *p is the element.
Using NULL for that condition is misleading. NULL should not be used in C++ anymore. A null pointer is nullptr and the terminator in strings is '\0'. The code works nevertheless because NULL is commonly defined as 0, which compares equal to '\0'. Though, it was meant to be used for pointers, not for char.
The code can be written like this:
int count(char* p, char x) {
int count = 0;
while (*p != '\0') { // *p is the element; p is only its address
if (x == *p) count++;
p++;
}
return count;
}
int main(){
char a[5] = {'a','a','c','a','\0'};
std::cout << count(a, 'a') << std::endl;
}
Or better, use std::string and std::count:
#include <algorithm> // std::count
#include <iostream>
#include <string>
int main() {
std::string s{"aaca"};
std::cout << std::count(s.begin(),s.end(),'a');
}
Note that string literals automatically include the terminator. So "aaca" is a const char[5], an array of 5 characters, and the last one is '\0'. With std::string the details are a little hairy, but s[4] is also '\0'. Note that this is in contrast to other containers, where container[container.size()] is out-of-bounds and wrong.

p is a pointer to char. So if you check the value of p it will be an address to that char (the first char in a string or array). So that address will be non-zero whether you are pointing to the first character or the last character.
In C or C++ strings are traditionally null terminated, meaning that the end of a string is marked by the null-terminator which is a single char with the value 0. To check the value of the char that the pointer p is pointing to, you need to de-reference it. De-referencing is done by prepending a * to the expression. In this case we extract the value that p is pointing to and not the address that p points to.
You basically have an array of char, and as an example it might look like this in memory:
Address | ASCII value | Value
1000 | 97 (0x61) | a
1001 | 97 (0x61) | a
1002 | 99 (0x63) | c
1003 | 97 (0x61) | a
1004 | 0 (0x00) | NUL ('\0')
To begin with, p will point to the first char, that is address 1000, so the value of p is 1000, and the value of *p is 97 or 'a'. As you increment p it will change to 1001, 1002, etc. until it gets to 1004, where the value of p is 1004 and the value of *p will be 0.
Had you written while (p != NULL) instead of *p you would essentially have checked whether 1004 != 0 which would be true, and you would continue past the end of the string.

I know a lot of (older) tutorials start with (naked) pointers and "C" style arrays but they are really not the first things you should use.
If possible in C++, try to write solutions that do not depend on pointers.
For holding text, use std::string.
#include <string> // stop using char* for text
#include <algorithm> // has the count_if method
#include <iostream>
int count_matching_characters(const std::string& string, char character_to_match)
{
int count{ 0 };
// range based for loop, looping over all characters in the string
for (const char c : string)
{
if (c == character_to_match) count++;
}
return count;
// or using a lambda function and algorithm
/*
return std::count_if(string.begin(), string.end(), [&](const char c)
{
return (character_to_match == c);
});
**/
}
int main()
{
int count = count_matching_characters("hello world", 'l');
std::cout << count;
return 0;
}

What does while(p) mean? (p is a pointer to the first element of a C style string)

#include <iostream>
int main(void) {
const char* c = "Hello World!"; // string literals are const in C++
const char* p = c;
while (p && *p) {
std::cout << *p << std::endl;
++p;
}
return 0;
}
Look at the above.
It is a short code example in an exercise of C++.
I can understand while(*p) which means loop until the last character('\0') is reached.
But I can't understand while(p).
What does while(p) mean? (p is a pointer to the first element of a C style string)

It checks that the pointer itself is not null, which may be of use if the loop is in a function and p is an argument.
In this example it is useless, since p can never be null.

The while (p) part makes sure that the value of p is nonzero, and the while (*p) checks to make sure that the dereferenced value of p is nonzero.
Therefore, if the string itself (the pointer to it) p is nonzero, and its current character *p is not '\0', the while loop block will execute.

Is nullptr used to terminate C-style strings?

I am confused by the use of nullptr in an example from A Tour of C++:  
int count_x(char* p, char x)
// count the number of occurrences of x in p[]
// p is assumed to point to a zero-terminated array of char (or to nothing)
{
if (p==nullptr) return 0;
int count = 0;
for (; p!=nullptr; ++p)
if (*p==x) ++count;
return count;
}
// The definition of count_x() assumes that the char* is a C-style string,
// that is, that the pointer points to a zero-terminated array of char.
I understand that count_x should terminate if p is unassigned, and the for loop should terminate when it reaches the end of the C-style string referenced by p.
However, when I build a main function to use count_x(), it never seems to terminate correctly:
int main () {
char teststring[] = {'b', 'l', 'a', 'h', '\0'};
cout << "teststring is: " << teststring << endl;
cout << "Number of b's is: " << count_x(teststring, 'b') << endl;
return 0;
}
Executing this prints a lot of garbage, and then exits with a segmentation fault. If I replace the for (; p!=nullptr; ++p) in count_x with for (; *p!='\0'; ++p), it executes properly. I guess this means that the string is not terminated correctly. If so, how do I terminate a C-style string so that nullptr can be used here?
Edit: there's been a discussion in the comments that's clarified this situation. I'm using the first print of the book from September 2013, where the above is printed in error. The third print of the book from January 2015 (linked in the comments) has the corrected example which uses for (; *p!=0; ++p) instead of for (; p!=nullptr; ++p). This correction is also documented in the errata for the book. Thanks!
Edit2: Sorry guys, this was apparently already asked on SO earlier here: Buggy code in "A Tour of C++" or non-compliant compiler?

No, a NULL pointer is not used to terminate strings. The NUL character is used. They are different things, but if you assign either of them to an integer in C, the result is always the same: zero (0).
A NUL character is represented in C as '\0' and takes up one char of storage. A NULL pointer is represented in C as 0, and takes up the same storage as void *. For example, on a 64-bit machine, a void * is 8 bytes while '\0' is one byte.
I.e., nullptr is not the same thing as '\0'. And the character is the null character, called NUL, but it is not supposed to be called a NULL byte or a NULL character.

Pointer syntax and incrementation

#include <iostream>
#include <cstdlib>
using namespace std;
void reverse(char* str){
char *end = str;
char tmp;
if(str){
cout << "hello" << endl;
while(*end){
cout << end << endl;
++end;
}
--end;
while (str < end){
tmp = *str;
*str++ = *end;
*end-- = tmp;
}
}
}
int main(){
char str[] = "helloyouarefunny";
string input = str;
reverse(str);
for(int i = 0; i < input.length(); i++) {
cout << str[i];
}
}
Is if(str){} equivalent to if(str == NULL){}?
What does while(*end){} mean and what is it exactly doing? I think I have a general understanding that the while loop will continue to be executed as long as it does not "see" a '\0'. But I am not sure what is exactly going on with this line of code.
Given that if(str){} is an equivalent statement to if(str == NULL){}, what would you pass into a function to make str = NULL? For example, in my main(){}, I tried to do char str[] = NULL, thereby, attempting to pass a NULL so that it wouldn't go inside the code if(str == NULL){}. But I get an error saying I cannot make this declaration char str[] = NULL. So my question is why am I getting this error and what can I pass through the reverse() function in order to make the code inside of if(str){} not execute? I hope this question made sense.
And the code ++end is doing pointer arithmetic correct? So every time it is incremented, the address is moving to the address right next to it?
I'm a little confused while(str < end){}. What is the difference between just str and *str? I understand that cout << str << endl; has to do with overloading the operator << and therefore prints the entire string that is passed through the argument. But why, when I cout << *end << endl;, it only prints the character at that memory address? So my question is, what's the difference between the two? Is it just dereferencing when i do *str? I might actually be asking more than that question in this question. I hope I don't confuse you guys >_<.

Is if(str){} equivalent to if(str == NULL){}?
No, if(str){} is equivalent to if(str != NULL){}
What does while(*end){} mean and what is it exactly doing?
Since the type of end is char*, while(*end){} is equivalent to while (*end != '\0'). The loop is executed for all the characters of the input string. When the end of the string is reached, the loop stops.
Given that if(str){} is an equivalent statement to if(str == NULL){}
That is not correct. I did not read rest of the paragraph since you start out with an incorrect statement.
And the code ++end is doing pointer arithmetic correct? So every time it is incremented, the address is moving to the address right next to it?
Sort of. The value of end is incremented. It now points to the object right after the one it pointed to before the operation.
I'm a little confused while(str < end){}
In the previous while loop, end was incremented starting from str until it reached the end of the string. In this loop, str is incremented and end is decremented, so the two pointers move toward each other. When they meet or cross, the conditional of the while statement evaluates to false and the loop ends.
Update
Regarding
what would you pass into a function to make str = NULL?
You could simply call
reverse(NULL);
I tried to do char str[] = NULL;
str is an array of characters. It can be initialized in a couple of ways:
// This is what you have done.
char str[] = "helloyouarefunny";
// Another, more tedious way:
char str[] = {'h','e','l','l','o','y','o','u','a','r','e','f','u','n','n','y', '\0'};
Notice the presence of an explicitly specified null character in the second method.
You cannot initialize a variable that is of type array of chars to NULL. The language does not allow that. You can initialize a pointer to NULL but not an array.
char* s1 = NULL; // OK
reverse(s1); // Call the function
s1 = static_cast<char*>(malloc(10)); // Allocate memory for the pointer (C++ needs the cast).
strcpy(s1, "test"); // Put some content in the allocated memory
reverse(s1); // Call the function, this time with some content.
free(s1); // Release the memory when done.

These are pretty standard C programming idioms.
No, in fact if (str) ... is equivalent to if (str != NULL) ...
C character strings are null terminated, meaning that "Hello" is represented in memory as the character array {'H', 'e', 'l', 'l', 'o', '\0'}. As with pointers, the 0 or NULL value is considered false in a logical expression. Thus while (*end) ... will execute the body of the while loop so long as end has not reached the null character.
N/A
Correct - this advances to the next character in the string, or to the null terminator.
This is the reverse algorithm. After the first loop, end points at the null terminator; the --end then moves it back to the last character, while str points to the beginning. Now we work these two pointers toward each other, swapping characters.

1/2) In C and C++, whatever is in the if or while is evaluated as a boolean. 0 is evaluated to false while any other value is evaluated to true. Given that NULL is equivalent to 0, if(str) and if(str != NULL) do the same things.
Likewise, while(*end) will only loop so long as the value end is pointing to does not evaluate to 0.
3) If you pass a char pointer to this function, it could be the null pointer (char *str = 0), so you're checking to make sure str is not null.
4) Yes, the pointer is then pointing to the next location in memory until eventually you find the null at the end of the string.
5) Perhaps your confusion comes from the fact that the code omits parentheses. They are optional, because postfix ++ and -- bind more tightly than *, but with them the loop reads:
while (str < end){
tmp = *str;
*(str++) = *end;
*(end--) = tmp;
}
So the two pointers will continue to make their way towards each other until they meet or cross (at which point str will no longer be less than end).

Access Violation With Pointers? - C++

I've written a simple string tokenizing program using pointers for a recent school project. However, I'm having trouble with my StringTokenizer::Next() method, which, when called, is supposed to return a pointer to the first letter of the next word in the char array. I get no compile-time errors, but I get a runtime error which states:
Unhandled exception at 0x012c240f in Project 5.exe: 0xC0000005: Access violation reading location 0x002b0000.
The program currently tokenizes the char array, but then stops and this error pops up. I have a feeling it has to do with the NULL checking I'm doing in my Next() method.
So how can I fix this?
Also, if you notice anything I could do more efficiently or with better practice, please let me know.
Thanks!!
StringTokenizer.h:
#pragma once
class StringTokenizer
{
public:
StringTokenizer(void);
StringTokenizer(char* const, char);
char* Next(void);
~StringTokenizer(void);
private:
char* pStart;
char* pNextWord;
char delim;
};
StringTokenizer.cpp:
#include "stringtokenizer.h"
#include <iostream>
using namespace std;
StringTokenizer::StringTokenizer(void)
{
pStart = NULL;
pNextWord = NULL;
delim = 'n';
}
StringTokenizer::StringTokenizer(char* const pArray, char d)
{
pStart = pArray;
delim = d;
}
char* StringTokenizer::Next(void)
{
pNextWord = pStart;
if (pStart == NULL) { return NULL; }
while (*pStart != delim) // access violation error here
{
pStart++;
}
if (pStart == NULL) { return NULL; }
*pStart = '\0'; // sometimes the access violation error occurs here
pStart++;
return pNextWord;
}
StringTokenizer::~StringTokenizer(void)
{
delete pStart;
delete pNextWord;
}
Main.cpp:
// The PrintHeader function prints out my
// student info in header form
// Parameters - none
// Pre-conditions - none
// Post-conditions - none
// Returns - void
void PrintHeader();
int main ( )
{
const int CHAR_ARRAY_CAPACITY = 128;
const int CHAR_ARRAY_CAPCITY_MINUS_ONE = 127;
// create a place to hold the user's input
// and a char pointer to use with the next( ) function
char words[CHAR_ARRAY_CAPACITY];
char* nextWord;
PrintHeader();
cout << "\nString Tokenizer Project";
cout << "\nyour name\n\n";
cout << "Enter in a short string of words:";
cin.getline ( words, CHAR_ARRAY_CAPCITY_MINUS_ONE );
// create a tokenizer object, pass in the char array
// and a space character for the delimiter
StringTokenizer tk( words, ' ' );
// this loop will display the tokens
while ( ( nextWord = tk.Next ( ) ) != NULL )
{
cout << nextWord << endl;
}
system("PAUSE");
return 0;
}
EDIT:
Okay, I've got the program working fine now, as long as the delimiter is a space. But if I pass it a '/' as a delim, it comes up with the access violation error again. Any ideas?
Function that works with spaces:
char* StringTokenizer::Next(void)
{
pNextWord = pStart;
if (*pStart == '\0') { return NULL; }
while (*pStart != delim)
{
pStart++;
}
if (*pStart = '\0') { return NULL; }
*pStart = '\0';
pStart++;
return pNextWord;
}

An access violation (or "segmentation fault" on some OSes) means you've attempted to read or write to a position in memory that you never allocated.
Consider the while loop in Next():
while (*pStart != delim) // access violation error here
{
pStart++;
}
Let's say the string is "blah\0". Note that I've included the terminating null. Now, ask yourself: how does that loop know to stop when it reaches the end of the string?
More importantly: what happens with *pStart if the loop fails to stop at the end of the string?

This answer is provided based on the edited question and various comments/observations in other answers...
First, what are the possible states for pStart when Next() is called?
1. pStart is NULL (default constructor or otherwise set to NULL)
2. *pStart is '\0' (empty string at end of string)
3. *pStart is delim (empty string at an adjacent delimiter)
4. *pStart is anything else (non-empty-string token)
At this point we only need to worry about the first option. Therefore, I would use the original "if" check here:
if (pStart == NULL) { return NULL; }
Why don't we need to worry about cases 2 or 3 yet? You probably want to treat adjacent delimiters as having an empty-string token between them, including at the start and end of the string. (If not, adjust to taste.) The while loop will handle that for us, provided you also add the '\0' check (needed regardless):
while (*pStart != delim && *pStart != '\0')
After the while loop is where you need to be careful. What are the possible states now?
1. *pStart is '\0' (token ends at end of string)
2. *pStart is delim (token ends at next delimiter)
Note that pStart itself cannot be NULL here.
You need to return pNextWord (current token) for both of these conditions so you don't drop the last token (i.e., when *pStart is '\0'). The code handles case 2 correctly but not case 1 (original code dangerously incremented pStart past '\0', the new code returned NULL). In addition, it is important to reset pStart for case 1 correctly, such that the next call to Next() returns NULL. I'll leave the exact code as an exercise to reader, since it is homework after all ;)
It's a good exercise to outline the possible states of data throughout a function in order to determine the correct action for each state, similar to formally defining base cases vs. recursive cases for recursive functions.
Finally, I noticed you have delete calls on both pStart and pNextWord in your destructor. First, to delete arrays, you need to use delete [] ptr; (i.e., array delete). Second, you wouldn't delete both pStart and pNextWord because pNextWord points into the pStart array. Third, by the end, pStart no longer points to the start of the memory, so you would need a separate member to store the original start for the delete [] call. Lastly, these arrays are allocated on the stack and not the heap (i.e., using char var[], not char* var = new char[]), and therefore they shouldn't be deleted. Therefore, you should simply use an empty destructor.
Another useful tip is to count the number of new and delete calls; there should be the same number of each. In this case, you have zero new calls, and two delete calls, indicating a serious issue. If it was the opposite, it would indicate a memory leak.

Inside ::Next you need to check for the delim character, but you also need to check for the end of the buffer, (which I'm guessing is indicated by a \0).
while (*pStart != '\0' && *pStart != delim) // access violation error here
{
pStart++;
}
And I think that these tests in ::Next
if (pStart == NULL) { return NULL; }
Should be this instead.
if (*pStart == '\0') { return NULL; }
That is, you should be checking for a NUL character, not a null pointer. It's not clear whether you intend for these tests to detect an uninitialized pStart pointer, or the end of the buffer.

An access violation usually means a bad pointer.
In this case, the most likely cause is running out of string before you find your delimiter.
