C++: C-strings, pointers, and a very interesting while loop - c++

I saw that a potential job interview for a C++ programmer position could ask you this question:
Explain what the following C++ code segment does.
char *aryA = "Data Structures";
char *aryB, *aryC;
aryB = new char[20];
aryC = aryB;
while (*aryB++ = *aryA++);
cout << aryC << endl;
I've been looking at it for a while, but I don't think I am understanding the while loop. So to me it would seem that the while loop is saying to cout aryC so long as the two pointers are equal. But, both pointers are being incremented by one, which I take to mean which char value in the array is being looked at. But if they are the same and both are being increased by one, wouldn't they always be equal? And there's another thing. The values for the array of chars aryB is not defined; we only know there are 20 values in the array. So how can you compare aryA and aryC in the first place?
If anyone can take the time to explain this code segment to me, I would really appreciate it. I am having issues running visual studio, so I can't just run it myself, but even if I could I think I would still benefit from someone teaching me.

It's quite easy, *aryB++ = *aryA++ can be seen as
*aryB = *aryA;
aryB++;
aryA++;
which just assigns the character pointed to by aryA to the location pointed to by aryB and then increments both pointers (to move on to the next character). The while loop runs until the terminating NUL character has been copied, which is detected because the = operator (which is not ==) yields the assigned value.
Saving aryB to aryC before the while is just a way to keep the pointer to the beginning of the copied string, since you lose it by then incrementing aryB.

aryB = new char[20]; sets aryB to a new character array.
aryC = aryB; makes aryC point to the same array as aryB.
while (*aryB++ = *aryA++); This one is more complicated. It copies the character at aryA to the location aryB points to, then moves both pointers forward by one. The loop continues while the copied character is not zero (remember that every C string ends with \0, which evaluates to false), so the terminator itself is copied and then ends the loop. This changes the data that aryC points to, but not aryC itself. In the end, the string at aryA has been copied into the buffer that aryC points to.

aryB is a pointer that points to an address in memory. By placing * in front of pointer (*aryB), you access the actual data at that memory address. ++ increments the pointer by 1, which makes it to point to the next memory address. Inside while() you do not compare (the operator is not ==), the assignment operator is used (=). This means that you will be copying data from memory aryA to aryB. Also aryC = aryB means that aryC points to the same memory address as aryB (points to the first element of the array). In other words by modifying data at aryB, you also modify it for aryC.

char *aryA = "Data Structures";
char *aryB, *aryC;
aryB = new char[20];
aryC = aryB;
We can visualise the memory use and pointer contents like this:
[(char*)aryA]--------------------------v
                                     ["Data Structures\0"]
[(char*)aryB]-------------v
                         [ 20 uninitialised chars ]
                          ^
                          |
[(char*)aryC]-------------/
Then:
while (*aryB++ = *aryA++);
Is processed like this:
*aryB = *aryA  - assigns *aryB (the first uninitialised char),
                 with *aryA (the 'D' in Data)
aryB++, aryA++ - post-increments add one to each pointer
                 (i.e. move to the 'a' in Data, and the 2nd uninitialised value)
while (...)    - evaluates the assignment, exiting if false
                 the assignment evaluates to the copied character
                 only character code 0 / '\0' / NUL converts to false; others to true
We now have:
[(char*)aryA]---------------------------v
                                     ["Data Structures\0"]
[(char*)aryB]--------------v
                         [D                       ]
                          ^
                          |
[(char*)aryC]-------------/
Repeat. This keeps copying character by character until the data from *aryA is the NUL character, which gets copied but then the assignment evaluates to false and the loop terminates.
While that was all happening, aryC stayed pointing at the start of the new-ed buffer, so it can now be used to stream out the content:
cout << aryC << endl;

Related

how does this while loop sort out pointers?

I am a complete novice to pointers and discovered them about a week ago. The book I am reading gave me a problem to solve, and after a while without coming up with the correct solution, I cheated: I looked it up and came up with this:
#include <iostream>
void mystrcpy(char* str1, char* str2) {
while (*str1++ = *str2++);
}
int main() {
char str1[15] = "C++";
char str2[15] = "Pointers";
std::cout << str1 << std::endl << str2 << std::endl;
mystrcpy(str1, str2);
std::cout << str1 << std::endl << str2 << std::endl;
}
The exercise asked me to fill in the blanks in the void function. Before the fix, main() printed this:
C++
Pointers
C++
Pointers
and it was supposed to look like this:
C++
Pointers
Pointers
Pointers
So, basically, I do not understand how while (*str1++ = *str2++); solves this. I've been looking at it for a while and am still clueless about how it sorts the strings out to produce that output.
Pointers are like variables that point to a specific location in memory. You can dereference them to get the value that they point to with *str2, and increment them to make them point to the next value in memory (based on their datatype): str2++.
The precedence can be confusing. *str2++ works in this order:
get the value pointed to by str2, and then
increment str2 to point to the next character
while (*str1++ = *str2++);
This statement copies each char from str2 into str1 until it reaches the null terminator in str2 (a null character, not a null pointer). It works by:
*str2: get the character pointed to by str2
*str1 = *str2: write the character pointed to by str2 to the destination pointed to by str1
str1++, str2++: increment str1 and str2 for the next loop iteration. Now each pointer points to the next character in each string.
When str2 points to a null terminator, *str2 returns 0 (or '\0'), and *str1 = *str2 will write it to str1, and also return the 0 which will cause the while loop to complete.
str1 starts as C++, and str2 starts as Pointers. Then the mystrcpy function is called, which copies the contents of str2 into str1.
The copying is done by the line in question:
while (*str1++ = *str2++);
I agree that this line is relatively opaque, but here are some basics:
The * on each side "de-references" each pointer. That means it is looking at the value stored at the given address, not the address itself.
The ++ is used to shift the pointer address from its current location to the next location in the array. It is placed after the pointer so that the pointer is incremented after we get/set its value.
The while loop here has only a condition, but the condition itself performs an action. The action is executed and its result is checked. When the end of the string is reached, the \0 char is copied, the result is zero, and the loop terminates.
So, in summary, the line copies a single char value from str2 into the value of str1, increments both pointers to point to the next char of each, then repeats until failure.
It is worth noting that, although the pointers are being changed within the function to point at new addresses, not just the start of each array, these pointer variables are not passed into the function by reference, and so the pointer variables outside the function remain unchanged, still containing the address of the first char. However, the data stored at those addresses has been modified.
On each iteration, the loop assigns the current character of the second array to the current position in the first array and advances both pointers. It then checks whether the value just assigned is zero; if it is not, the loop continues.

Why does my pointer to a char only get the first character in it?

I initialized a char and a pointer, and then point it to the char. But I found that only the first character in the char was passed into the pointer. How could this be? Appreciate any advice!
char p[]="This is a long char.";
std::cout<<sizeof(p)<<"\n";
char *ptr=p;
std::cout<<*ptr<<"\n";
Got:
21
T
Program ended with exit code: 0
When checking the value of variables I noticed that:
ptr=(char *) "This is a long char."
*ptr=(char) 'T'
Why is the ptr holding the whole char but the content in it is only the first character? Isn't this the opposite of what we call a pointer? Got really confused...
When you said:
std::cout<<*ptr...
you dereferenced ptr, which gives you the (first) character that ptr is pointing to.
flyingCode's answer shows you how to print the entire string.
In C++, an array is just like multiple variables grouped together, so they each have a memory address, but they are one after another.
So, the pointer is pointing at the address of the first element, and the other elements are in the addresses right after that, in order.
When you use the * operator on a pointer, it dereferences the pointer, giving you the contents of the memory address it points to.
If you wanted to access another element of the array using this, you would use *(ptr + 5) to offset the memory address.
When you pass the pointer itself (ptr rather than *ptr) to std::cout, the char* overload of operator<< prints characters starting at that address and keeps going until it reaches a null character (\0). The pointer doesn't know the length of the array, but the end of the text can be found because C-strings, including arrays initialised from string literals, end with a null character.
ran your code in repl and changing it to the code below worked :D
char p[]="This is a long char.";
std::cout<<sizeof(p)<<"\n";
char *ptr=p;
std::cout<<ptr<<"\n"; //notice no *
I recommend you use std::string rather than c-strings unless you have a specific need for them :)

c++ pointers and string of characters

I'm a bit confused about something regarding strings of characters and pointers in C++.
Well, I came to know that the name of an array of characters is actually the address of its first element, and I also know that the cout object assumes that the address of a char is the address of a string, so it prints the character at that address and then continues printing characters until it runs into the null character (\0).
but here is the interesting part i tried this code in codeblocks
char arr[10]="bear";
cout <<&arr<<endl; // trying to find the address of the whole string
cout<<(int *)arr<<endl; // another way of finding the address of the string
cout<<(int *)arr[0]<<endl; // trying to find the address of the first element
what came on the screen was as follows
0x22fef6,
0x22fef6,
(0x62) <<<< My question is, what the heck is that? If the array name holds the address of the first element, shouldn't the first element's address be "0x22fef6"?
The [] operator does not return an address but dereferences the pointer at the given offset.
You could write an equivalent to the [] operator as follows:
char arr[10] = "bear";
char c = *(arr+0); // == arr[0] == 'b'
That is, you take the pointer arr, increase it by 0 chars, and then dereference it to get its value.
char arr[10]="bear";
cout <<&arr<<endl;
cout<<(int *)arr<<endl;
cout<<(int *)(arr+0)<<endl; // increases the address by 0
cout<<(int *)(&arr[0])<<endl; // the address of the value at index 0
This would do what you have expected it to do.
arr[0] equals *(arr + 0); you dereference the pointer and obtain the value it holds. To get what you want you need to reference the element, like &arr[0].

C++ On Demand Creation

I'm reading a book where the following code appears.
TTextInBuffer::TTextInBuffer(const char *pInputFileName, TAbortCode ac)
: pFileName(new char[strlen(pInputFileName) + 1])
pFileName is declared as a const char, so I'm assuming that the second line creates a new char in pFileName. I would just like to know the specifics of what is happening. Thanks.
When this constructor is called, the initializer list here is executed:
: pFileName(new char[strlen(pInputFileName) + 1])
The strlen() call finds the length of the pInputFileName string based on its contents. It basically walks it as a char array until it finds a null character ('\0') and then returns the number of characters before it. This is being done in order to compute the space needed for the new string within pFileName.
The + 1 is there to make sure there's room for the extra null terminator at the end.
Finally, whatever number pops out of that expression is fed into a memory allocation call using the keyword new. This gets memory dynamically allocated on the heap where the string data will end up. The new call returns the address of where that memory has been allocated, and that is passed to the pFileName pointer variable so that it will point to it.
So, to summarize:
The length of pInputFileName is computed
The computed length is increased by 1 to cater for the null terminator in the copy
new is called to request space for the copy
The address returned by new is assigned to pFileName
The one thing that's missing from your code is the actual copy over of the contents of the input string to the destination, but perhaps that happens within the constructor body (between the { and } characters).
The second line allocates a memory area (an array of chars) by calling operator new[].
The argument of new is the size of the array to allocate. So, in this snippet, the size is set to the length of the string pInputFileName + 1. This + 1 makes room for the null character that is used in C and C++ to determine where a string ends.

Using NULL as a terminator in arrays?

I like "reinventing the wheel" for learning purposes, so I'm working on a container class for strings. Will using the NULL character as an array terminator (i.e., the last value in the array will be NULL) cause interference with the null-terminated strings?
I think it would only be an issue if an empty string is added, but I might be missing something.
EDIT: This is in C++.
"" is the empty string in C and C++, not NULL. Note that "" has exactly one element (instead of zero), meaning it is equivalent to {'\0'} as an array of char.
char const *notastring = NULL;
char const *emptystring = "";
emptystring[0] == '\0'; // true
notastring[0] == '\0'; // undefined behaviour: dereferences a null pointer (typically crashes)
No, it won't, because you won't be storing in an array of char, you'll be storing in an array of char*.
char const* strings[] = {
"WTF"
, "Am"
, "I"
, "Using"
, "Char"
, "Arrays?!"
, 0
};
It depends on what kind of string you're storing.
If you're storing C-style strings, which are basically just pointers to character arrays (char*), there's a difference between a NULL pointer value, and an empty string. The former means the pointer is ‘empty’, the latter means the pointer points to an array that contains a single item with character value 0 ('\0'). So the pointer still has a value, and testing it (if (foo[3])) will work as expected.
If what you're storing are C++ standard library strings of type string, then there is no NULL value. That's because there is no pointer, and the string type is treated as a single value. (Whereas a pointer is technically not, but can be seen as a reference.)
I think you are confused. While C-strings are "null terminated", there is no "NULL" character. NULL is a name for a null pointer. The terminator for a C-string is a null character, i.e. a byte with a value of zero. In ASCII, this byte is (somewhat confusingly) named NUL.
Suppose your class contains an array of char that is used to store the string data. You do not need to "mark the end of the array"; the array has a specific size that is set at compile-time. You do need to know how much of that space is actually being used; the null-terminator on the string data accomplishes that for you - but you can get better performance by actually remembering the length. Also, a "string" class with a statically-sized char buffer is not very useful at all, because that buffer size is an upper limit on the length of strings you can have.
So a better string class would contain a pointer of type char*, which points to a dynamically allocated (via new[]) array of chars. Again, it makes no sense to "mark the end of the array", but you will want to remember both the length of the string (i.e. the amount of space being used) and the size of the allocation (i.e. the amount of space that may be used before you have to re-allocate).
When you are copying from std::string, use the iterators begin() and end() and you don't have to worry about the null terminator. Before C++11 the terminator was only guaranteed to be present when you called c_str() (the block of memory it returns ends with a null character); since C++11, data() returns a null-terminated buffer as well. If you want to memcpy, use the data() method.
Why don't you follow the pattern used by vector - store the number of elements within your container class, then you know always how many values there are in it:
vector<string> myVector;
size_t elements(myVector.size());
Instantiating a string with x where const char* x = 0; can be problematic. See this code in Visual C++ STL that gets called when you do this:
_Myt& assign(const _Elem *_Ptr)
{ // assign [_Ptr, <null>)
_DEBUG_POINTER(_Ptr);
return (assign(_Ptr, _Traits::length(_Ptr)));
}
static size_t __CLRCALL_OR_CDECL length(const _Elem *_First)
{ // find length of null-terminated string
return (_CSTD strlen(_First));
}
#include "Maxmp_crafts_fine_wheels.h"
MaxpmContainer maxpm;
maxpm.add("Hello");
maxpm.add(""); // uh oh, adding an empty string; should I worry?
maxpm.add(0);
At this point, as a user of MaxpmContainer who had not read your documentation, I would expect the following:
strcmp(maxpm[0],"Hello") == 0;
*maxpm[1] == 0;
maxpm[2] == 0;
Interference between the zero terminator at position two and the empty string at position one is avoided by means of the "interpret this as a memory address" operator *. Position one will not be zero; it will be a non-zero address which, if you look at the memory it refers to, will turn out to hold zero. Position two will be zero, which, if you interpret it as a memory address, will turn out to be an abrupt disorderly exit from your program.