What does `(c = *str) != 0` mean? - c++

int equiv (char, char);
int nmatches(char *str, char comp) {
char c;
int n=0;
while ((c = *str) != 0) {
if (equiv(c,comp) != 0) n++;
str++;
}
return (n);
}
What does "(c = *str) != 0" actually mean?
Can someone please explain it to me or help give me the correct terms to search for an explanation myself?

This expression has two parts:
c = *str - this is a simple assignment of c from dereferencing a pointer,
val != 0 - this is a comparison to zero.
This works, because assignment is an expression, i.e. it has a value. The value of the assignment is the same as the value being assigned, in this case, the char pointed to by the pointer. So basically, you have a loop that traces a null-terminated string to the end, assigning each individual char to c as it goes.
Note that the != 0 part is redundant in C, because the control expression of a while loop is implicitly compared to zero:
while ((c = *str)) {
...
}
The second pair of parentheses is optional from the syntax perspective, but it's kept in assignments like that in order to indicate that the assignment is intentional. In other words, it tells the readers of your code that you really meant to write an assignment c = *str, and not a comparison c == *str, which is a lot more common inside loop control blocks. The second pair of parentheses also suppresses the compiler warning.
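To see the idiom run end to end, here is a minimal sketch of the question's function. The question never shows equiv(), so the version below is a hypothetical stand-in that simply tests for equality.

```cpp
// Hypothetical stand-in for the question's equiv(): nonzero when the
// two characters are identical. The real function is not shown.
int equiv(char a, char b) { return a == b; }

int nmatches(const char* str, char comp) {
    char c;
    int n = 0;
    // (c = *str) stores the current character in c and yields that same
    // value, which is then compared against 0 (the string terminator).
    while ((c = *str) != 0) {
        if (equiv(c, comp) != 0) n++;
        str++;
    }
    return n;
}
```

With this stand-in, nmatches("banana", 'a') counts the three occurrences of 'a', and an empty string never enters the loop body.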

Confusingly,
while ((c = *str) != 0) {
is equivalent to the arguably easier to read
while (c = *str) {
This also has the effect of assigning the character at *str to c, and the loop will terminate once *str is \0; i.e. when the end of the string has been reached.
Assignments within conditionals such as the above can be confusing at first glance (cf. the very different behaviour of c == *str), but they are such a useful part of C and C++ that you need to get used to them.

(c = *str) is an expression and that has a value in itself. It is an assignment, the value of an assignment is the assigned value. So the value of (c = *str) is the value of *str.
The code basically checks whether the value of *str, which has just been assigned to c, is not 0. If it isn't, it calls the function equiv with that value.
Once the 0 is assigned, the end of the string has been reached, and the function stops reading from memory.

It's looping over every character in the string str, assigning them to c and then seeing if c is equal to 0 which would indicate the end of the string.
Although really the code should use '\0' as that is more obviously a NUL character.

We walk through str in the while loop, reading each char until we reach one equal to zero, which marks the end of a C string.
Here is a 'for' loop that visits the same characters (it prints them instead of counting matches, and needs <cstring> and <iostream>):
for (size_t i = 0; i < strlen(str); ++i )
std::cout << str[i];
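An index-based loop can also do the counting itself. The sketch below is one possible version, not the original author's code: equiv() is again a hypothetical stand-in, the name nmatches_indexed is made up for illustration, and the length is computed once so that strlen is not called on every iteration.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for the question's equiv().
int equiv(char a, char b) { return a == b; }

int nmatches_indexed(const char* str, char comp) {
    int n = 0;
    const std::size_t len = std::strlen(str);  // computed once, not per iteration
    for (std::size_t i = 0; i < len; ++i) {
        if (equiv(str[i], comp) != 0) n++;
    }
    return n;
}
```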

It is just sloppily written code. The intention is to copy a character from the string str into c and then check if it was the null terminator.
The idiomatic way to check for the null terminator in C is an explicit check against '\0':
if(c != '\0')
This is so-called self-documenting code, since the de facto standard way to write the null terminator in C is by using the octal escape sequence \0.
Another mistake is to use assignment inside conditions. This has been regarded as bad practice since at least the 1980s, and most modern compilers can warn about such code ("possibly incorrect assignment" or similar). This is bad practice because assignment includes a side effect, and expressions with side effects should be kept as simple as possible. But it is also bad practice because it is easy to mix up = and ==.
The code could easily be rewritten as something more readable and safe:
c = *str;
while (c != '\0')
{
if(equiv(c, comp) != 0)
{
n++;
}
str++;
c = *str;
}

You don't need char c since you already have the pointer char *str, also you can replace != 0 with != '\0' for better readability (if not compatibility)
while (*str != '\0')
{
if (equiv(*str, comp) != 0)
{ n++; }
str++;
}
To understand what the code does, you can read it like this
while ( <str> pointed-to value is-not <end_of_string> )
{
if (function <equiv> with parameters( <str> pointed-to value, <comp> )
returned non-zero integer value)
then { increment <n> by 1 }
increment pointer <str> by 1 x sizeof(char) so it points to next adjacent char
}

Related

How to check an empty array returned from a function in C++

I was asked to write a function that accepts a character array, a zero-based start position, and a length.
It should return a character array containing length characters (len) starting with the start character of the input array
#include<iostream>
#include<vector>
#include<iterator>
using namespace std;
char* lengthChar(char c[],int array_len, int start, int len)
{
char* v = new char[len];
if(start < 0 || len > array_len || (start + len - 1) >= array_len){
return NULL;
}
if((start + len) == start)
{
return v;
}
copy(&c[start], &c[len+start], &v[0]);
return v;
}
My question is when I call my function as
char* r = lengthChar(t,3, 1, 0);
Normally based on my implementation, it should return a pointer pointing to an empty array. However, I can't seem to verify this. When I do if(!r[0]), it doesn't detect it. I even did
char s[] = {};
char* tt = &s[0];
if(r[0] == *tt)
Still nothing. The strange thing is that when I cout the value of r[0], nothing is printed. So I don't know what is actually returned. How do I verify that it is empty?
Don't use if(!r[0]) to check if NULL was returned. You want to compare directly to NULL using if(!r) or if(r == NULL). This tells you whether the function returned NULL, not whether the array is empty. Doing if(!r[0]) when you return NULL is undefined behavior, so you definitely want to make sure the address is valid before you try to access what it points to.
Another thing to note is that in the case where you return NULL, your function has a memory leak. You need to move char* v = new char[len]; to after you decide whether you are going to return NULL. You could call delete [] v; in the if statement, but that makes the code more brittle.
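A minimal sketch of that reordering follows. The name sliceChars, the stricter range check, and the extra terminator byte are assumptions made for this example rather than part of the original function; the terminator is there so that reading element 0 is well defined even when len is 0.

```cpp
#include <algorithm>

// Hypothetical rewrite of lengthChar; the name and the extra '\0'
// byte are assumptions made for this sketch.
char* sliceChars(const char c[], int array_len, int start, int len) {
    // Validate first, so nothing is allocated on the failure path.
    if (start < 0 || len < 0 || start > array_len || len > array_len - start) {
        return nullptr;
    }
    char* v = new char[len + 1];               // one extra byte for '\0'
    std::copy(c + start, c + start + len, v);  // copies nothing when len == 0
    v[len] = '\0';                             // v[0] is readable even for len == 0
    return v;
}
```

A caller would test the pointer first (r == nullptr) and only then look at r[0], releasing the buffer with delete[] when done.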
There are a few things going on here. Firstly, I would replace that
if((start+len) == start)
with just
if(len == 0) // if(!len) works too
And also note that you don't need to take the address of an index, so
&c[start] is the same thing as c + start
I would read http://www.cplusplus.com/reference/algorithm/copy/ to make sure you understand that the value being passed is an iterator.
But secondly, your char* v = new char[len] statement does not initialize anything, and reading from the result is undefined behavior. When you call
new char[len];
you're merely asking for storage for a new character array; the elements are left uninitialized, and when len is 0 there is no element r[0] to read at all. Remember that std::cout's operator<< treats a char* as a C string. This means that the char array needs to be null terminated. If it's not, you are truly invoking undefined behavior, because you're reading memory that has been allocated but not initialized, and possibly memory beyond the allocation.
When you call
if(!r[0])
This doesn't really mean anything at all. r itself is a valid pointer, so it is not a nullptr, but r[0] was never given a value (and with len == 0 it does not even exist), so reading it is undefined behavior and the test can come out either way.
Now, if you want to make this more concrete, I would allocate one extra char and fill the array with zeros (memset needs <cstring>)
char* v = new char[len + 1];
memset(v, 0, len + 1);
Now your char array is truly "empty": v[0] is '\0' even when len is 0.
I think it's just a misunderstanding of what an "empty" array actually means.
Finally, don't listen to the guys who say just use std::vector. They're absolutely right, it's better practice, but it's better to understand how those classes work before you pull out the real power of the standard library. Just saying.

C++ - how to check if char* points to something valid?

I want to check if a char* points to a valid string...Can I check this variable...
char* c;
What I tried:
if(c == NULL) //c is not null
if(*c == '\0') //false
if(strlen(c) == 0) //exception
I think it is not possible to check a char* when it has not been allocated and doesn't point to a valid string...
Truth is, you can't ever be sure a pointer is valid. Testing for NULL may give you certainty that it is invalid, but it doesn't guarantee any validity. That's one reason not to use this kind of thing in C++. A std::string is always in a valid state.
If the pointer is not NULL, there is no way of saying if the value of the pointer is valid or not.
c can point anywhere. This might be a null-terminated string, a string without a terminating null byte, any accessible binary data, or any address that is not accessible. What exactly do you want to check for, and why? If you want to differentiate between a valid string and an uninitialized one, you would normally use NULL for the uninitialized case and check c == NULL. Accessing *c (strlen does this, too) is not OK if c is not a valid pointer. So a typical use case would look like this:
// initializing to NULL
char *c = NULL;
// maybe setting value
if(condition)
c = strdup("mein string");
// cleanup
if(c != NULL) {
free(c);
c = NULL;
}
If you have an upper bound on the size of the string you expect:
char * c = new char[size];
then perhaps you can check if it terminates within the bound:
bool is_valid(const char *c, size_t size) {
while (size--) if (*c++ == '\0') return true; // found a terminator within the bound
return false;
}
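The same bounded search can be written with the standard std::memchr, which scans at most size bytes for a given value. The helper name terminates_within is hypothetical, and the sketch assumes all size bytes are readable.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: true if a '\0' occurs within the first size bytes.
// Assumes c points to at least size readable bytes.
bool terminates_within(const char* c, std::size_t size) {
    return std::memchr(c, '\0', size) != nullptr;
}
```

This only shows that a terminator exists inside the bound; it says nothing about whether the pointer itself was valid to begin with.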
Another way is to encapsulate the char * within a class and set it up in the constructor, or to have a valid flag in such a class.
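It depends on the O/S. Under Windows you have the IsBadXxxPtr() family of functions, which test whether a pointer refers to accessible memory, although they won't test whether it points to a valid instance of a particular type. Note that Microsoft's documentation now marks these functions as obsolete and advises against using them.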
http://msdn.microsoft.com/en-us/library/windows/desktop/aa366713(v=vs.85).aspx
etc.

How to check if char* p reached end of a C string?

template<class IntType>
IntType atoi_unsafe(const char* source)
{
IntType result = IntType();
while (source)
{
auto t = *source;
result *= 10;
result += (*source - 48);
++source;
}
return result;
}
and in main() I have:
char* number = "14256";
atoi_unsafe<unsigned>(number);
but the condition while (source) does not seem to recognize that source has iterated over the entire C string. How should it correctly check for the end of the string?
while(source) is true until the pointer wraps around to 0, but will probably crash well before that in modern systems. You need to dereference the pointer to find a null byte, while(*source).
The pointer does not go to zero at the end of the string; the end of the string is found when the value pointed at becomes zero. Hence:
while (*source != '\0')
You might, more compactly, write the whole function as:
template<class IntType>
IntType atoi_unsafe(const char* source)
{
IntType result = IntType();
char c;
while ((c = *source++) != '\0')
result = result * 10 + (c - '0');
return result;
}
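Granted, it does not use the auto keyword. Also note carefully the difference between '\0' and '0'. The parentheses around c - '0' in the loop body are not strictly necessary; the ones around c = *source++ in the loop condition are required, because != binds more tightly than =.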
Your code only handles strings without a sign - and should arguably validate that the characters are actually digits too (maybe raising an exception if the input is invalid). The 'unsafe' appellation certainly applies. Note, too, that if you instantiate the template for a signed integer type and the value overflows, you invoke undefined behaviour. At least with unsigned types, the arithmetic is defined, even if probably not what is expected.
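As a rough illustration of the digit validation suggested here, the sketch below reports failure through a bool instead of raising an exception. The name atoi_checked and the out-parameter interface are assumptions for this example, and overflow is still not detected.

```cpp
#include <cctype>

// Hypothetical variant: returns false on empty or non-digit input and
// leaves out untouched in that case. Overflow is NOT checked.
template <class IntType>
bool atoi_checked(const char* source, IntType& out) {
    if (source == nullptr || *source == '\0') return false;
    IntType result = IntType();
    for (; *source != '\0'; ++source) {
        // The cast avoids undefined behavior in isdigit for negative chars.
        if (!std::isdigit(static_cast<unsigned char>(*source))) return false;
        result = result * 10 + (*source - '0');
    }
    out = result;
    return true;
}
```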
You need to look for the null-terminator at the end of the string. Waiting for the pointer to wrap around to 0 is probably never going to happen. Use while (*source) for your loop.

Comparing character arrays and string literals in C++

I have a character array and I'm trying to figure out if it matches a string literal, for example:
char value[] = "yes";
if(value == "yes") {
// code block
} else {
// code block
}
This resulted in the following error: comparison with string literal results in unspecified behavior. I also tried something like:
char value[] = "yes";
if(strcmp(value, "yes")) {
// code block
} else {
// code block
}
This didn't yield any compiler errors but it is not behaving as expected.
Check the documentation for strcmp. Hint: it doesn't return a boolean value.
ETA: == doesn't work in general because cstr1 == cstr2 compares pointers, so that comparison will only be true if cstr1 and cstr2 point to the same memory location, even if they happen to both refer to strings that are lexicographically equal. What you tried (comparing a cstring to a literal, e.g. cstr == "yes") especially won't work, because the standard doesn't require it to. In a reasonable implementation I doubt it would explode, but cstr == "yes" is unlikely to ever succeed, because cstr is unlikely to refer to the address that the string constant "yes" lives in.
std::strcmp returns 0 if strings are equal.
strcmp returns a tri-state value to indicate what the relative order of the two strings are. When making a call like strcmp(a, b), the function returns
a value < 0 when a < b
0 when a == b
a value > 0 when a > b
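Applied to the question, the equality test therefore compares the result against 0. A minimal sketch, with is_yes as a hypothetical helper name:

```cpp
#include <cstring>

// Hypothetical helper wrapping the comparison from the question.
// value must be a null-terminated string.
bool is_yes(const char* value) {
    return std::strcmp(value, "yes") == 0;  // 0 means the strings are equal
}
```

Writing if (strcmp(value, "yes")) without the == 0 takes the branch precisely when the strings differ, which is the opposite of what the question intended.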
As the question is tagged with c++, in addition to David Seiler's excellent explanation of why strcmp() did not work in your case, I want to point out that strcmp() does not work on character arrays in general, only on null-terminated character arrays.
In your case, you are initializing a character array from a string literal, which results in a null-terminated character array automatically, so there is no problem here. But if you slice your character array out of, e.g., a buffer, it may not be null-terminated. In such cases it is dangerous to use strcmp(), as it will traverse memory until it finds a null byte ('\0').
Another solution to your problem would be (using C++ std::string):
char value[] = "yes";
if (std::string{value} == "yes") {
// code block
} else {
// code block
}
This will also only work for null-terminated character arrays. If your character array is not null-terminated, tell the std::string constructor how long your character array is:
char value[3] = {'y', 'e', 's'}; // no room for a terminating '\0'
if (std::string{value, 3} == "yes") {
// code block
} else {
// code block
}

Copying C-Style String to Free Store Using Only Dereference

As said in the title, the goal is to copy a C-style string into memory without using any standard library functions or subscripting.
Here is what I have so far [SOLVED]
#include "std_lib_facilities.h"
char* strdup(const char* p)
{
int count = 0;
while (*(p + count)) ++count; // count characters up to, but not including, '\0'
char* q = new char[count + 1];
for (int i = 0; i < count + 1; ++i) *(q + i) = *(p + i); // copies the '\0' too
return q;
}
int main()
{
char word[] = "Happy";
char* heap_str = strdup(word);
}
Obviously the problem is that allocating just *p (which is equivalent to p[0]) only allocates the letter "H" to memory. I'm not sure how to go about allocating the C-style string without subscripting or STL functions.
C-style string ends with '\0'. You need to traverse the string inside the function character by character until you encounter '\0' to know how long it is. (This is effectively what you would do by calling strlen() to work it out.) Once you know how long the string is, you can allocate the right amount of memory, which is the length+1 (because of the '\0').
To access the i'th element of an array p, one uses a subscript: p[i].
A subscript of the form p[i] is formally defined to be *((p)+(i)) by both the C standard (6.5.2.1 of C99) and the C++ standard (5.2.1 of C++03). Here, one of p or i is of type pointer to T, and the other is of integral type (or enumeration in C++). Because an array name is converted automatically (in most uses, anyway) to a pointer to the first element of the array, p[i] is thus the i'th element of array p.
And just like basic arithmetic, ((p)+(i)) is equivalent to ((i)+(p)) in pointer arithmetic. This means *((p)+(i)) is equivalent to *((i)+(p)), which also means p[i] is equivalent to i[p].
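These equivalences are easy to check directly. The helper below is a hypothetical illustration that compares the addresses produced by each spelling; i is assumed to be a valid index into the array p points to.

```cpp
// Hypothetical helper: checks that every spelling names the same element.
// i must be a valid index into the array that p points to.
bool subscript_forms_agree(const char* p, int i) {
    return &p[i] == p + i
        && &*(i + p) == p + i
        && &i[p] == p + i;
}
```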
Well, since this a self-teaching exercise, here's an alternative look at a solution that can be compared/contrasted with KTC's nice explanation of the equivalence between subscripting and pointer arithmetic.
The problem appears to be, "implement a strdup() function without using standard library facilities or subscripting".
I'm going to make an exception for malloc(), as there's no reasonable way to do the above without it, and I think that using it isn't detrimental to what's being taught.
First, let's do a basic implementation of strdup(), calling functions that are similar to the ones we might use from the library:
size_t myStrlen( char* s);
void myStrcpy( char* dst, char* src);
char* strdup( char* p)
{
size_t len = myStrlen( p);
char* dup = (char*) malloc( len + 1); /* include space for the termination character */
if (dup) {
myStrcpy( dup, p);
}
return dup;
}
Now let's implement the worker functions without subscripting:
size_t myStrlen( char* s)
{
size_t len = 0;
while (*s != '\0') { /* when s points to a '\0' character, we're at the end of the string */
len += 1;
s += 1; /* move the pointer to the next character */
}
return len;
}
void myStrcpy( char* dst, char* src)
{
while (*src != '\0') { /* when src points to a '\0' character, we're at the end of the string */
*dst = *src;
++dst; /* move both pointers to next character location */
++src;
}
*dst = '\0'; /* make sure the destination string is properly terminated */
}
And there you have it. I think this satisfies the condition of the assignment and shows how pointers can be manipulated to move through an array of data items instead of using subscripting. Of course, the logic for the myStrlen() and myStrcpy() routines can be moved inline if desired, and more idiomatic expressions where the pointer increment can happen in the expression that copies the data can be used (but I think that's more confusing for beginners).
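For comparison, the more compact form mentioned above, where the increment happens inside the copying expression, might look like the following sketch. It uses new[] rather than malloc(), and the name dupString is hypothetical, chosen to avoid clashing with the library's strdup.

```cpp
#include <cstddef>

// Hypothetical compact variant; the caller releases the result with delete[].
char* dupString(const char* p) {
    std::size_t len = 0;
    for (const char* s = p; *s != '\0'; ++s) ++len;  // measure without subscripting
    char* dup = new char[len + 1];                   // +1 for the terminator
    char* dst = dup;
    // Copy one char, advance both pointers, and stop once the copied
    // char was the terminator (so '\0' itself gets copied too).
    while ((*dst++ = *p++) != '\0') { }
    return dup;
}
```

The loop condition is the same assign-then-compare idiom as (c = *str) != 0, with the assignment target being the destination buffer instead of a temporary.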