Confused by a part of a code - c++

There is a part of a piece of code for "making the first letter of each word capitalized" that I don't understand.
http://www.cplusplus.com/forum/beginner/117463/
std::string str = x;
str [0] = toupper (str[0]);
std::for_each(str.begin()+1, str.end(), printChars);
std::cout << str;
return 0;
}
Void printChars(char& c)
{
if( (*(&c - sizeof(char))) == " ")
c = toupper(c);
}
I understand it sets the first letter to capital always, and checks for each one in the string after.
But why does he use if((*(&c - sizeof(char))) == " ") and how does the * , & and setting it to blank work in this case?

how does ... work in this case?
It does not work. The program that you show is ill-formed and is not likely to compile.
Void printChars(char& c)
There is no type Void in C++. I suspect that you intended to write void instead.
(some_char_value) == " " // expression simplified by me
You may not compare a character to a string literal.
But why does he use if((*(&c - sizeof(char))) == " ")
He doesn't. He uses if( (*(&c - sizeof(char))) == ' ').
how does the & work in this case?
It is the address-of operator. It is used here to get a temporary pointer to the memory address of c.
how does the * work in this case?
It is the indirection operator. It is used here to get the character at the memory location &c - 1, which is the character in str right before the one referred to by c.
and setting it to blank work in this case?
He doesn't set anything in the quoted expression. == is the equality comparison operator. He compares the character at &c - 1 with the character literal ' '.
In English: he tests whether the character before c is a space. In other words, he tests whether c is the first character of a word.
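As a rough sketch of the same idea without the pointer trick, the "character before c" test can be written with an index. The helper name capitalize_words is made up for illustration and is not from the linked forum post; it assumes words are separated by single spaces only.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper (not from the original post): returns a copy of `text`
// with the first character of each space-separated word upper-cased.
std::string capitalize_words(std::string text)
{
    for (std::size_t i = 0; i < text.size(); ++i) {
        // A word starts at index 0 or directly after a space. This is the
        // same test as *(&c - 1) == ' ', but it never reads before the
        // start of the string.
        const bool word_start = (i == 0) || (text[i - 1] == ' ');
        if (word_start) {
            // Cast to unsigned char first: toupper is undefined for
            // negative char values.
            text[i] = static_cast<char>(
                std::toupper(static_cast<unsigned char>(text[i])));
        }
    }
    return text;
}
```

Calling capitalize_words("hello big world") should give "Hello Big World". The index form also makes it obvious why the original code starts for_each at begin()+1: the very first character has no predecessor to look at.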

This code performs simple pointer arithmetic. It uses the address-of operator & to get the address of the variable c, then subtracts sizeof(char), which is always 1, to step back one character. The * dereferences that address so the character stored there can be compared with ' ' using ==. If the character before c is a space, toupper() is called on c. So for example,
if the address of c is 100, then &c - sizeof(char) is address 99, and *(&c - sizeof(char)) is the char stored at address 99.

Related

Why do I get an error for the first statement, but not the second one? (C++ substring)

The goal is to get the first, middle, and last name initials each followed by a period.
string first;
string middle;
string last;
string result;
The expression is typed in the string result.
I typed:
result = first[0] + "." + middle[0] + "." + last[0] + ".";
And received the following error:
invalid operands of types ‘const char*’ and ‘const char [2]’ to binary ‘operator+’
But when I remove the "." for the above statement, it compiles without any error.
The solution ended up being:
result = first.substr(0,1) + "." + middle.substr(0,1) + "." + last.substr(0,1) + ".";
So my question is: why do I get a compile error when I add "." to string[0], but not when I add "." to string.substr(0,1)?
first[0] + "."
In this expression:
first is a std::string.
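Therefore, first[0] is a char (some immaterial technical details omitted).
"." is a const char [2] (an array of two char values: the period and the terminating null).
In C++, + does not concatenate a char and a character array. In first[0] + ".", the array decays to a const char* and the char is promoted to an integer, so the expression is pointer arithmetic and its result is another const char*. Adding middle[0] to that is pointer arithmetic again. The next + "." then tries to add a const char* to a const char [2], and there is no + for two pointers. There are various rules about what you can do to what else in C++. You can add numerical values to other numerical values. You can add an integer value to a pointer. Finally, a class can define its own + operator. That's pretty much it. Adding two pointers is not among them, hence the compilation error.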
Now, if you go back and reread the actual error message that your compiler showed you, it should make perfect sense.
This does not do what you think (you should have tested it):
first[0] + middle[0] + last[0]
Each of those is a single character, and in C++, single characters are essentially one-byte integers. So adding them just adds their ASCII values (see https://www.asciitable.com/ for those numbers).
first.substr(0,1) gives you a string, and strings can be concatenated with +.
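A small sketch can make the difference concrete. The helper names sum_of_first_chars and initials_with_substr are invented for illustration, and both assume the three strings are non-empty.

```cpp
#include <string>
#include <type_traits>

// char + char is integer addition: both operands are promoted to int.
static_assert(std::is_same<decltype('a' + 'b'), int>::value,
              "adding two chars yields an int, not a string");

// Hypothetical helper: what first[0] + middle[0] + last[0] really computes,
// namely the sum of three character codes.
int sum_of_first_chars(const std::string& first,
                       const std::string& middle,
                       const std::string& last)
{
    return first[0] + middle[0] + last[0];
}

// Hypothetical helper: substr returns a std::string, and std::string
// defines operator+ for concatenation with string literals.
std::string initials_with_substr(const std::string& first,
                                 const std::string& middle,
                                 const std::string& last)
{
    return first.substr(0, 1) + "." + middle.substr(0, 1) + "." +
           last.substr(0, 1) + ".";
}
```

With "John", "Quincy" and "Adams", the first function returns a number and the second returns "J.Q.A.".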
Another way to do what you want is:
result += first[0];
result += '.';
result += middle[0];
result += '.';
result += last[0];
result += '.';
That just appends one character at a time; std::string has += and push_back for a single char, but no append overload that takes just one char. You can also pass strings, like result.append("...") or result += "...", if you want. The ' single quote is what makes a literal char instead of a literal string (array of chars).
While your question was answered very well, here are other options that I would prefer over substr.
#include <sstream>
...
std::ostringstream compose;
compose << first[0] << "." << middle[0] << "." << last[0] << ".";
std::string result = compose.str();
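or also (string(1, c) builds a one-character string; there is no constructor that takes a single char)
string result = string(1, first[0]) + "." + string(1, middle[0]) + "." + string(1, last[0]) + ".";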

In what case can std::basic_string::find with a count argument greater than the string length be useful?

One of the signatures of std::basic_string::find method is:
size_type find( const CharT* s, size_type pos, size_type count ) const;
The parameters are the following:
pos    - position at which to start the search
count - length of substring to search for
s         - pointer to a character string to search for
The description of the behavior of the method for this overload is:
Finds the first substring equal to the range [s, s+count). This range may contain null characters.
I would like to know in what case it can be useful to have a range that contains null characters. For instance:
s.find("A", 0, 2);
Here, s corresponds to a string with a length of 1. Because count is 2, the range [s, s+count) contains a null character. What is the point?
There is a false premise that you didn't spell out, but combining the title and the question it is:
The null character indicates the end of a std::string.
This is wrong. std::strings can contain null characters at any position. One has to be cautious with functions that expect a null-terminated c-string, but find is so nice that it explicitly reminds you that it also works in the general case.
C-Strings are null terminated, hence this:
std::string x("My\0str0ing\0with\0null\0characters");
std::cout << x.size() << '\n';
Prints: 2, i.e. only the characters before the first \0 are used to construct the std::string.
However, this
std::string s("Hello world");
s[5] = '\0';
std::cout << s << '\n';
Prints Helloworld (because \0 is not printable). char arrays can also contain \0 at any position. Usually this is interpreted as the terminating character of the string. However, as std::strings can contain null characters at any position, it is only consistent to also provide an overload that takes a pointer to a character array that can contain null characters in the middle. An example of using that overload (s is the string from above):
std::string f;
f.push_back('\0');
f.push_back('w');
std::cout << s.find(f.c_str()) << '\n';
std::cout << s.find("") << '\n';
std::cout << s.find(f.c_str(),0,2) << '\n';
Output:
0
0
5
The overload without the count parameter assumes a null-terminated C string, hence s.find(f.c_str()) is the same as s.find(""). Only with the overload that has the count parameter is the substring \0w found at index 5.
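The same comparison can be packaged as a self-contained sketch. The helper names below are invented for illustration; the haystack is built with the (pointer, length) constructor so the embedded null survives, instead of assigning s[5] afterwards.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: builds the 11-character string "Hello\0world".
// The (pointer, length) constructor keeps the embedded '\0'.
std::string make_haystack()
{
    return std::string("Hello\0world", 11);
}

// Counted overload: searches for exactly two characters, '\0' then 'w'.
std::size_t find_null_w(const std::string& haystack)
{
    const char needle[] = {'\0', 'w'};
    return haystack.find(needle, 0, 2);
}

// Uncounted overload: the needle is read as a C string, which ends at the
// first '\0', so this searches for the empty string.
std::size_t find_as_c_string(const std::string& haystack)
{
    const char needle[] = {'\0', 'w', '\0'};
    return haystack.find(needle);
}
```

On this haystack the counted search reports index 5, while the uncounted one reports 0 because an empty needle matches at the very start.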

Array conversion guidance

I'm stuck on an assignment which converts contents of an array (input from the user) to a pre-declared shorthand.
I want it to be as simple as strcpy(" and ", "+");
to change the word 'and' within a string, to a '+' sign.
Unfortunately, no matter how I structure the function; I get a deprecated conversion warning (variant loops, and direct applications, attempted).
Side note; this is assignment based, so my string shortcuts are severely limited, and no pointers (I've seen several versions of clearing the fault using them).
I'm not looking for someone to do my homework; just guidance on how strcpy can be applied without creating the dep. warning. Perhaps I shouldn't be using strcpy at all?
strcpy copies the contents of the second string into the memory of the first string. Since you're copying a string literal into a string literal it can't do it (you can't write to a string literal) and so it complains.
Instead you need to build your own search and replace system. You can use strstr() to search for a substring within a string, and it returns the pointer in memory to the start of that found string (if it's found).
Let's take the sample string Jack and Jill went up the hill.
char *andloc = strstr(buffer, " and ");
That would return the address of the start of the string (say 0x100) plus the offset of the word " and " (including spaces) within it (0x100 + 4) which would be 0x104.
Then, if found, you can replace it with the & symbol. However you can't use strcpy for that as it'll terminate the string. Instead you can set the bytes manually, or use memcpy:
if (andloc != NULL) { // it's been found
andloc[1] = '&';
andloc[2] = ' ';
}
or:
if (andloc != NULL) { // it's been found
memcpy(andloc, " & ", 3);
}
That would result in Jack & d Jill went up the hill. We're not quite there yet. Next you have to shuffle everything down to cover the "d " from the old " and ". For that you'd think you could now use strcpy or memcpy, however that's not possible - the strings (source and destination) overlap, and the manual pages for both specifically state that the strings must not overlap and to use memmove instead.
So you can move the contents of the string after the "d " to after the "& " instead:
memmove(andloc + 3, andloc + 5, strlen(andloc + 5) + 1);
Adding a number to a string like that adds to the address of the pointer. So we're looking at copying the data from 5 characters further on in the string than the old "and" location into a space starting at 3 characters on from the start of the old "and" location. The amount to copy is the length of the string from 5 characters on from the start of the "and" location, plus one so that it also copies the null character at the end of the string.
Another manual way of doing it would be to iterate through each character until you find the end of the string:
char *to = andloc + 3;
char *from = andloc + 5;
while (*from) { // Until the end of the string
*to = *from; // Copy one character
to++; // Move to the ...
from++; // ... next character pair
}
*to = 0; // Add the end of string marker.
So now either way the string memory contains:
Jack & Jill went up the hill\0l\0
The \0 is the end of string marker, so the actual string "content" is only up as far as the first \0 and the l\0 is now ignored.
Note that this only works if you are replacing a part with something that is smaller. If you are replacing it with something bigger, so the string grows in size, you have to use memmove (which behaves as if it first copied the content to a scratchpad, so overlapping regions are safe) and you have to ensure that your buffer has enough room to store the finished string (this kind of thing is often a big source of "buffer overruns", which are a security headache and one of the biggest causes of systems being hacked). You also have to do the whole thing backwards: move the latter part of the string first to make room, then fill in the gap between the two halves.
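Putting the steps together, a minimal sketch that covers both the shrinking and the growing case might look like this. The function name replace_first and its capacity parameter are assumptions made for illustration; the assignment's "no pointers" restriction is ignored here, and `to` is assumed not to point into `buffer`.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: replaces the first occurrence of `from` in `buffer`
// with `to`. `capacity` is the total size of `buffer` in bytes, including
// room for the terminating '\0'. Returns false if nothing was replaced.
bool replace_first(char* buffer, std::size_t capacity,
                   const char* from, const char* to)
{
    const std::size_t from_len = std::strlen(from);
    if (from_len == 0) {
        return false; // nothing sensible to search for
    }
    char* hit = std::strstr(buffer, from);
    if (hit == nullptr) {
        return false; // substring not present
    }
    const std::size_t to_len = std::strlen(to);
    const std::size_t new_len = std::strlen(buffer) - from_len + to_len;
    if (new_len + 1 > capacity) {
        return false; // the result would overrun the buffer
    }
    // Move the tail (including its '\0') first. Source and destination
    // overlap, so memmove is required rather than memcpy or strcpy.
    char* tail = hit + from_len;
    std::memmove(hit + to_len, tail, std::strlen(tail) + 1);
    // Then fill the gap with the replacement text, without a terminator.
    std::memcpy(hit, to, to_len);
    return true;
}
```

For example, with char buffer[64] = "Jack and Jill went up the hill", calling replace_first(buffer, sizeof buffer, " and ", " & ") should leave "Jack & Jill went up the hill" in the buffer. Checking the capacity before touching anything is what keeps the growing case from becoming a buffer overrun.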

c++ dynamic allocation initial values

I'm trying to concatenate two strings into a new one (finalString) like this:
finalString = string1 + '&' + string2
Firstly, I allocate the memory for finalString, then I use strcat().
finalString = new char[strlen(string1 ) + strlen(string2) + 2];
cout << finalString << endl;
finalString = strcat(finalString , string1 );
finalString = strcat(finalString , "&");
finalString = strcat(finalString , string2);
cout << finalString << endl;
I'll suppose that string1 is "Mixt" and string2 is "Supermarket".
The output looks like this:
═════════════════řřřř //(which has 21 characters)
═════════════════řřřřMixt&Supermarket
I know that if I use round brackets in "new char" the string will be initialized to 0 and I'll get the desired result, but my question is: why does the first output have 21 characters, given that I allocated only 17? And even so, why does the final string length exceed the initial allocation size (21 > 17)?
Thanks in advance!
Two words for you: "buffer overrun".
The reason you have 21 characters initially is that the first '\0' (also called the null character) happens to sit 21 bytes past the memory address that finalString points to. This may or may not be consistent, depending on what is in your memory.
As for why the final string is longer than what you wanted: again, you wrote outside the initial buffer into random memory. You did not crash because you did not write over something important.
strcat takes the memory address given, looks for the first '\0' from there, and from that place on copies the data from the second pointer you provide, up to and including the first '\0' it finds there.
What you are doing is VERY DANGEROUS: if you do not hit a '\0' before you hit something vital, you will cause a crash or at minimum bad behavior.
Understand that in C/C++ a char* is just a pointer to the memory location of the first element. THERE ARE NO SAFEGUARDS! You alone must be careful with that.
If you set finalString[0] = 0 before the first strcat, the logic will work.
As a different answer, why not use std::string:
std::string a, b, c;
a = "part1";
b = "part2";
c = a + " & " + b;
std::cout << c << '\n';
part1 & part2
Live example: http://ideone.com/pjqz9T
It will make your life easier! You should always look to use standard library types in C++.
If you really do need a char * then at the end you can do c.c_str().
Your string is not initialized, which leads to undefined behavior. strcat appends the new string at the first null character it finds in the destination.
So, as others already mentioned, either you can do
finalString[0] = 0;
or in place of your first strcat use strcpy. This will copy the first string and put a null character at the end.
why 21 characters?
This is due to undefined behavior. It will keep printing until it finds a null character, or crash as soon as it tries to access illegal memory.
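For completeness, here is a minimal sketch of the strcpy-first variant. The function name join_with_ampersand is made up for illustration; the sizes follow the question (both lengths plus one byte for '&' and one for the terminator).

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: returns a new[]-allocated "left&right".
// The caller owns the result and must release it with delete[].
char* join_with_ampersand(const char* left, const char* right)
{
    // +1 for the '&' separator, +1 for the terminating '\0'.
    const std::size_t size = std::strlen(left) + std::strlen(right) + 2;
    char* result = new char[size];

    // strcpy writes `left` followed by '\0', so the uninitialized
    // contents of the fresh buffer are never read.
    std::strcpy(result, left);
    std::strcat(result, "&");
    std::strcat(result, right);
    return result;
}
```

With "Mixt" and "Supermarket" this should produce "Mixt&Supermarket", which is 16 characters and fits the 17-byte allocation exactly.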

Pointer + Number

I have this code. How can I debug it with printf?
char *string = "something";
short i;
i = strlen(string);
while (i-- && (*(string+i)==' ' || *(string+i)=='\0'));
*(string+i+1) = '\0' ;
What does this do?
*(string+i)
According to the C standard, the expression e1[e2] is by definition equivalent to *(e1+e2). So when you write
*(string + i)
it is equivalent to
string[i]
This definition has an interesting side effect that it is actually correct to write 3[s] instead of s[3] as operator [] is commutative.
*(string+i) is string[i]. So *(string+i) == '\0' is the same as string[i] == 0.
This is because pointer + number advances the address by number * sizeof(type) bytes (in your case string + i is i * sizeof(char) bytes past string, and sizeof(char) is 1). When you index into an array, arr[i] is the element located i * sizeof(type) bytes after the start of arr.
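The scaling can be checked with a short sketch. Both helper names are invented for illustration; byte_distance measures how far base + n is from base in bytes, and same_element confirms that the three spellings name one object.

```cpp
#include <cstddef>

// Hypothetical helper: number of bytes between `base` and `base + n`.
// Pointer arithmetic on T* moves in steps of sizeof(T).
template <typename T>
std::ptrdiff_t byte_distance(const T* base, std::size_t n)
{
    const char* first = reinterpret_cast<const char*>(base);
    const char* last = reinterpret_cast<const char*>(base + n);
    return last - first;
}

// Hypothetical helper: *(a + i), a[i] and i[a] all refer to the same element,
// so their addresses compare equal.
inline bool same_element(const int* a, std::size_t i)
{
    return &*(a + i) == &a[i] && &a[i] == &i[a];
}
```

For an int array, byte_distance(arr, 3) comes out as 3 * sizeof(int), while for a char array the same call with 2 gives 2, since sizeof(char) is 1.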
To debug with printf you simply insert printf statements and poke around the content of variables. For example:
char *string = "something";
short i;
i = strlen(string);
printf("Debug i = %d\n", i);
The postfix operator -- means that i will get evaluated and then decremented before the next sequence point.
The && operator has a sequence point between the left and right operand, so i-- occurs before the right operand is evaluated.
Therefore string+i refers to the value of the original i - 1.
*(string+i) is guaranteed to be completely equivalent with string[i]. The former is just a less readable form.
The code does a peculiar check. If a character is the null terminator, it adds another null terminator behind the first one. That doesn't make any sense. Only the check for space makes some sense.
I also doubt the true intention is to add a null after the space? Wouldn't you rather want to remove the space and end the string there? Seems like a mistake.
Also the code is inefficient, because if you count from the beginning to the end and encounter the first space there, you can stop iterating.
In other words, this code is just awful. You should replace it with something like this:
char* ptr = strchr(string, ' ');
if(ptr != NULL)
{
*ptr = '\0';
}