Displaying char array in gcc does not work - c++

I wrote a piece of code and tested it with the gcc compiler:
#include <iostream>
int main()
{
char arr[ 1000 ];
for( int index( 0 ); index < 1000; ++index )
{
std::cout << arr[ index ] << std::endl;
}
return 0;
}
I was expecting it to print garbage values, but to my surprise it did not print anything. When I simply changed the datatype of arr from char to int, it displayed garbage values as expected. Could somebody please explain this to me?

The overloads for << for character types do not treat them as
integral types, but as characters. If the garbage value
corresponds to a printable character (e.g. 97, which corresponds
to 'a'), you will see it. If it doesn't (e.g. 0), you won't.
And if the garbage values correspond to some escape sequence
which causes your terminal to use a black foreground on a black background, you won't see anything else, period.
If you want to see the actual numerical values of a char (or
any character type), just convert the variable to int before
outputting it:
std::cout << static_cast<int>( arr[index] ) << std::endl;
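A minimal sketch of that difference, assuming an ASCII-compatible character set. The helper names as_character and as_number are made up for illustration, and they write to a std::ostringstream so the result can be inspected as a string:

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: what operator<< produces for a char
// (the character itself, printable or not).
std::string as_character(char c)
{
    std::ostringstream out;
    out << c;
    return out.str();
}

// Hypothetical helper: what operator<< produces once the char
// has been converted to int (its numeric value).
std::string as_number(char c)
{
    std::ostringstream out;
    out << static_cast<int>(c);
    return out.str();
}
```

With ASCII, as_character('a') gives "a" while as_number('a') gives "97". as_character('\0') gives a one-character string holding a NUL byte, which a terminal typically shows as nothing at all.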

What you're trying to do has undefined behavior. Some compilers will clear out the memory for you; others will leave it as it was before your buffer was created.
Overall, this is a useless test.

Some platforms may choose, for example for security purposes, to fill the uninitialized char array with zeroes, even though it's not static and wasn't explicitly initialized.
That is why no garbage is showing up: your char array was just automatically initialized.

On your platform garbage characters don't print. On another platform it might be different.
As an experiment try this
std::cout << '|' << arr[ index ] << '|' << std::endl;
See if anything appears between the || characters.
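A variation on the same experiment, sketched under the assumption of the default "C" locale: instead of relying on the terminal, a hypothetical helper named show turns every non-printable character into a visible escape.

```cpp
#include <cctype>
#include <cstdio>
#include <string>

// Hypothetical helper: returns printable characters unchanged and
// everything else as a \xNN escape, so nothing is silently invisible.
std::string show(char c)
{
    // Convert to unsigned char first; std::isprint requires that range.
    const unsigned char uc = static_cast<unsigned char>(c);
    if (std::isprint(uc))
        return std::string(1, c);
    char buf[8];
    std::snprintf(buf, sizeof buf, "\\x%02X", static_cast<unsigned>(uc));
    return buf;
}
```

Printing show(arr[index]) for an array that has been given known values makes it obvious which elements are control characters; show('\0') yields the four characters \x00.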

You're getting undefined behaviour because you're attempting to use values from an uninitialised array. You can't expect anything in particular to happen. Maybe every character happens to be a non-printing character. Maybe it just decided that it didn't want to print anything because it doesn't like your little games. Anything goes.

Related

Sign & Unsigned Char is not working in C++

In C++ Primer 5th Edition I saw this
when I tried to use it---
At first it didn't seem to work: the program's output did give a weird symbol for the unsigned one, but the signed one was totally blank. The compiler also gave some warnings when I compiled it. But C++ Primer and so many websites say it should work, so I don't think they give wrong information. Did I do something wrong?
I am newbie btw :)
But C++ primer ... said it should work
No it doesn't. The quote from C++ primer doesn't use std::cout at all. The output that you see doesn't contradict with what the book says.
So I don't think they give the wrong information
No.[1]
did I do something wrong?
It seems that you've possibly misunderstood what the value of a character means, or possibly misunderstood how character streams work.
Character types are integer types (but not all integer types are character types). The values of unsigned char are 0..255 (on systems where a byte is 8 bits wide). Each[2] of those values represents some textual symbol. The mapping from a set of values to a set of symbols is called a "character set" or "character encoding".
std::cout is a character stream, and << is the stream insertion operator. When you insert a character into a stream, the behaviour is not to show the numerical value. Instead, the behaviour is to show the symbol that the value is mapped to[3] in the character set that your system uses. In this case, it appears that the value 255 is mapped to whatever strange symbol you saw on the screen.
If you wish to print the numerical value of a character, what you can do is convert it to a non-character integer type and insert that into the character stream:
int i = c;
std::cout << i;
[1] At least, there's no wrong information regarding your confusion. The quote is a bit inaccurate and outdated in the case of c2. Before C++20, the value was "implementation-defined" rather than "undefined". Since C++20, the value is actually defined, and the value is 0, which is the null terminator character that signifies the end of a string. If you try to print this character, you'll see no output.
[2] This was a bit of a lie for simplicity's sake. Some characters are not visible symbols. For example, there is the null terminator character as well as other control characters. The situation becomes even more complex in the case of variable-width encodings such as the ubiquitous UTF-8 encoding of Unicode, where symbols may consist of a sequence of several char. In such an encoding, an individual char cannot necessarily be interpreted correctly without the other char that are part of the sequence.
[3] And this behaviour should feel natural once you grok the purpose of character types. Consider the following program:
unsigned char c = 'a';
std::cout << c;
It would be highly confusing if the output would be a number that is the value of the character (such as 97 which may be the value of the symbol 'a' on the system) rather than the symbol 'a'.
For extra meditation, think about what this program might print (and feel free to try it out):
char c = 57;
std::cout << c << '\n';
int i = c;
std::cout << i << '\n';
c = '9';
std::cout << c << '\n';
i = c;
std::cout << i << '\n';
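For readers who want to check their answer, here is a sketch that runs the same four insertions against a string stream instead of std::cout. The function name run_exercise is made up, and the result stated below assumes an ASCII-compatible system, where the character '9' has the value 57.

```cpp
#include <sstream>
#include <string>

// Runs the four insertions from the exercise and returns
// everything that was written to the stream.
std::string run_exercise()
{
    std::ostringstream out;
    char c = 57;
    out << c << '\n';   // inserts the character whose value is 57
    int i = c;
    out << i << '\n';   // inserts the number 57 as decimal text
    c = '9';
    out << c << '\n';   // inserts the character '9'
    i = c;
    out << i << '\n';   // inserts the numeric value of '9'
    return out.str();
}
```

Under the ASCII assumption the four lines come out as 9, 57, 9 and 57: the same value is shown as a symbol when its type is char and as a number when its type is int.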
This is due to the behavior of the << operator on the char type and the character stream cout. Note that << performs what is known as formatted output, which means it does some implicit formatting based on the type of its operand.
We can say that the value of a variable is not the same as its representation in certain contexts. For example:
int main() {
bool t = true;
std::cout << t << std::endl; // Prints 1, not "true"
}
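The representation can also be changed without changing the value. As a small sketch (the helper name format_bool is hypothetical), the standard std::boolalpha manipulator switches a stream from printing 1 and 0 to printing true and false:

```cpp
#include <ios>
#include <sstream>
#include <string>

// Hypothetical helper: formats a bool with or without std::boolalpha.
std::string format_bool(bool value, bool as_text)
{
    std::ostringstream out;
    if (as_text)
        out << std::boolalpha;   // switch the stream to textual bools
    out << value;
    return out.str();
}
```

format_bool(true, false) gives "1" and format_bool(true, true) gives "true"; the variable holds the same value in both cases, only the formatting differs.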
Think of it this way: why would we need char if it still behaved like a number when printed? Why not just use int or unsigned? In essence, we have different types in order to get different behaviors, which can be deduced from those types.
So the underlying numeric value of a char is probably not what we are looking for when we print one.
Check this for example:
int main() {
unsigned char c = -1;
int i = c;
std::cout << i << std::endl; // Prints 255
}
If I recall correctly, at this point in the Primer you're close to the topic of conversions between built-in types; things will become clearer once you know those rules better. Anyway, I'm sure you will benefit greatly from looking into this article, especially the "Printing chars as integers via type casting" part.
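As a rough sketch of the same idea for both signed and unsigned char, the hypothetical template below uses unary plus, which promotes its operand to int so that the numeric overload of << is selected:

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: returns the numeric value of any character type
// as text. Unary plus promotes the operand to int before insertion.
template <typename CharT>
std::string numeric_value(CharT c)
{
    std::ostringstream out;
    out << +c;
    return out.str();
}
```

Passing static_cast<unsigned char>(-1) yields "255", while passing static_cast<signed char>(-1) yields "-1", assuming the usual 8-bit char.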

Why is strlen(s) different from the size of s, and why does cout char display a character not a number?

I wrote a piece of code to count how many 'e' characters are in a bunch of words.
For example, if I type "I read the news", the counter for how many e's are present should be 3.
#include <iostream>
#include <cstring>
using namespace std;
int main()
{
char s[255],n,i,nr=0;
cin.getline(s,255);
for(i=1; i<=strlen(s); i++)
{
if(s[i-1]=='e') nr++;
}
cout<<nr;
return 0;
}
I have 2 unclear things about characters in C++:
In the code above, if I replace strlen(s) with 255, my code just doesn't work. I can only type a word and the program stops. I have been taught at school that strlen(s) is the length for the string s, which in this case, as I declared it, is 255. So, why can't I just type 255, instead of strlen(s)?
If I run the program above normally, it doesn't show me a number, like it is supposed to do. It shows me a character (I believe it is from the ASCII table, but I'm not sure), like a heart or a diamond. It is supposed to print the number of e's from the words.
Can anybody please explain these to me?
strlen(s) gives you the length of the string held in the s variable, up to (but not including) the first null character. So if you input "hello", the length will be 5, even though s has a capacity of 255.
nr is displayed as a character because it's declared as a char. Either declare it as int, for example, or cast it to int when cout'ing, and you'll see a number.
strlen() counts the actual length of strings - the number of real characters up to the first \0 character (marking end of string).
So, if you input "Hello":
sizeof(s) == 255
strlen(s) == 5
For the second question: you declared nr as type char. std::cout treats a char as a single character and tries to print it as such. Declare your variable as int, or cast it before printing, to avoid this.
int nr = 42;
std::cout << nr;
//or
char charNr = 42;
std::cout << static_cast<int>(charNr);
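The sizeof/strlen distinction can be checked directly. This is only a sketch; the struct Sizes and the function measure_hello are names invented for the illustration.

```cpp
#include <cstddef>
#include <cstring>

// Capacity of the array versus length of the string stored in it.
struct Sizes
{
    std::size_t capacity;  // sizeof: fixed at compile time
    std::size_t length;    // strlen: characters before the first '\0'
};

Sizes measure_hello()
{
    char s[255] = "Hello";   // the remaining elements are zero-initialized
    return Sizes{ sizeof(s), std::strlen(s) };
}
```

measure_hello() reports a capacity of 255 and a length of 5, matching the two values listed above.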
Additional mistakes not mentioned by others, and notes:
You should always check whether the stream operation was successful before trying to use the result.
i is declared as char and cannot hold values greater than 127 on common platforms. In general, the maximum value for char can be obtained as either CHAR_MAX or std::numeric_limits<char>::max(). So, on common platforms, i <= 255 will always be true because 255 is greater than CHAR_MAX. Incrementing i once it has reached CHAR_MAX does not help either: the result no longer fits in a char (converting it back is implementation-defined before C++20 and wraps around to a negative value in practice), so the loop would go on to index outside the array, which is undefined behavior. I recommend declaring i at least as int (which is guaranteed to have sufficient range for this particular use case). If you want to be on the safe side, use something like std::ptrdiff_t (add #include <cstddef> at the start of your program), which is guaranteed to be large enough to hold any valid array size.
n is declared but never used. This by itself is harmless but may indicate a design issue. It can also lead to mistakes such as trying to use n instead of nr.
You probably want to output a newline ('\n') at the end, as your program's output may look odd otherwise.
Also note that calling a potentially expensive function such as strlen repeatedly (as in the loop condition) can have negative performance implications (strlen is typically an intrinsic function, though, and the compiler may be able to optimize most calls away).
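You do not need strlen anyway, and can use cin.gcount() instead. (gcount() also counts the newline that getline consumed, so it can be one larger than strlen(s); the extra element examined is the terminating '\0', which is harmless here.)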
Nothing wrong with return 0; except that it is redundant – this is a special case that only applies to the main function.
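To illustrate the point about the range of char, here is a small sketch (the function name condition_always_true is made up) that asks the implementation for its limit instead of assuming 127:

```cpp
#include <climits>
#include <limits>

// True when every value a char can hold satisfies "i <= 255",
// i.e. when the loop condition with 255 can never become false.
constexpr bool condition_always_true()
{
    return std::numeric_limits<char>::max() <= 255;
}

// Both spellings mentioned above report the same limit.
static_assert(CHAR_MAX == std::numeric_limits<char>::max(),
              "CHAR_MAX and numeric_limits<char>::max() agree");
```

On platforms with 8-bit char this returns true whether char is signed (maximum 127) or unsigned (maximum 255), which is why a char counter cannot serve as the loop variable here.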
Here's an improved version of your program, without trying to change your code style overly much:
#include <iostream>
#include <cstring>
#include <cstddef>
using namespace std;
int main()
{
char s[255];
int nr=0;
if ( cin.getline(s,255) )
{ // only if reading was successful
for(int i=0; i<cin.gcount(); i++)
{
if(s[i]=='e') nr++;
}
cout<<nr<<'\n';
}
return 0;
}
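One detail of gcount() is worth seeing in isolation: after getline it also counts the delimiter that was consumed. The sketch below reads from a std::istringstream instead of cin; the names LineCounts and read_line are invented for the illustration.

```cpp
#include <cstring>
#include <ios>
#include <sstream>

// Both counts for a single line read with istream::getline.
struct LineCounts
{
    std::streamsize extracted;   // gcount(): includes the consumed '\n'
    std::size_t     length;      // strlen(): characters stored before '\0'
};

LineCounts read_line(const char* text)
{
    std::istringstream in(text);
    char s[255];
    in.getline(s, 255);          // always stores a terminating '\0'
    return LineCounts{ in.gcount(), std::strlen(s) };
}
```

For the input "read\n" this gives 5 extracted characters but a string length of 4. In the loop above the one extra element is the terminating '\0', so the count of e's is unaffected.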
For exposition, the following is a more concise and expressive version using std::string (for arbitrary length input), and a standard algorithm. (As an interviewer, I would set this, modulo minor stylistic differences, as the canonical answer i.e. worth full credit.)
#include <algorithm>
#include <iostream>
#include <string>
using namespace std;
int main()
{
string s;
if ( getline(cin, s) )
{
cout << std::count(begin(s), end(s), 'e') << '\n';
}
}
I have 2 unclear things about characters in C++: 1) In the code above,
if I replace the "strlen(s)" with 255, my code just doesn't work, I
can only type a word and the program stops, and I have been taught at
school that "strlen(s)" is the length for the string s, which in this
case, as I declared it, is 255. So, why can't I just type 255, instead
of strlen(s);
That's right, but a string only goes up to its null terminator, even if more space is allocated. Consider this, for example:
char buf[32];
strcpy(buf, "Hello World!");
There's 32 chars worth of space, but my string is only 12 characters long. That's why strlen returns 12 in this example. It's because it doesn't know how long the buffer is, it only knows the address of the string and parses it until it finds the null terminator.
So if you enter 255, you're going past what was set by cin and you'll read the rest of the buffer. Which, in this case, is uninitialized. That's undefined behavior - in this case it will most likely read some rubbish values, and those might coincidentally have the 'e' value and thus give you a wrong result.
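The effect can be reproduced without actually reading uninitialized memory. In this sketch the leftover bytes are simulated by filling the buffer with 'e' before the string is copied in; the function names are hypothetical.

```cpp
#include <cstddef>
#include <cstring>

// Counts 'e' in the first n characters of buf.
std::size_t count_e(const char* buf, std::size_t n)
{
    std::size_t count = 0;
    for (std::size_t i = 0; i < n; ++i)
        if (buf[i] == 'e') ++count;
    return count;
}

// Stand-in for an uninitialized buffer: reading truly uninitialized
// memory is undefined, so the leftover bytes are simulated with 'e'.
std::size_t count_in_simulated_buffer(bool whole_buffer)
{
    char buf[32];
    std::memset(buf, 'e', sizeof buf);   // pretend this is leftover garbage
    std::strcpy(buf, "read");            // writes 'r','e','a','d','\0'
    return count_e(buf, whole_buffer ? sizeof buf : std::strlen(buf));
}
```

Counting up to strlen finds the single 'e' in "read". Counting over the whole 32-byte buffer reports 28, because the 27 bytes after the terminator are included as well.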
2) If you run the program above normally, it doesn't show you a number,
like it's supposed to do, it shows me a character(I believe it's from
the ASCII table but I'm not sure), like a heart or a diamond, but it
is supposed to print the number of e's from the words. So can anybody
please explain these to me?
You declared nr as char. While that can indeed hold an integer value, if you print it like this, it will be printed as a character. Declare it as int instead or cast it when you print it.

C++ strlen() initialized char array

Quick question.
I couldn't find why an initialized char array returns this value. I understand that the strlen() function will only return the amount of characters inside of an array, and not the size, but why will it return 61 if there are no characters in it?
#include <iostream>
#include <cstring>
using namespace std;
int main()
{
const int MAX = 50;
char test[MAX];
int length = strlen(test);
cout << "The current \'character\' length of the test array is: " << length << endl;
// returns "61"
// why?
cin >> test; //input == 'nice'
length = strlen(test);
cout << "The new \'character\' length of the test array is: " << length << endl;
// returns 4 when 'nice' is entered.
// this I understand.
return 0;
}
This was driving me nuts during a project because I would be trying to use a loop to feed information into a character array but strlen() would always return an outrageous value until I initialized the array as:
char testArray[50] = "";
instead of
char testArray[50];
I got these results using Visual Studio 2015
Thanks!
I think the basic misunderstanding is that, unlike in other languages, in C and C++ locally defined variables are not initialised with any value (neither with empty strings, nor with 0, nor with some special "undefined" marker) unless you explicitly initialise them.
Note that reading uninitialised variables is actually "undefined behaviour"; it may lead to "funny" and non-deterministic results, may crash, or may even appear to work.
A very common behaviour of such programs (though clearly not guaranteed!) is that if you write
char test[50];
int length = strlen(test);
then test will point to some memory, which is reserved in the size of 50 bytes yet filled with arbitrary characters, not necessarily \0-characters at all. Hence, test will probably not be "empty" in the sense that the first character is a \0 as it would be with a really empty string "". If you now access test by calling strlen(test) (which is actually UB, as said), then strlen may just go through this arbitrarily filled memory, and it might detect a \0 within the first 50 characters, or it might detect the first \0 much after having exceeded the 50 bytes.
It's good that you have found your answer, but I think you should also understand how this works.
char test[MAX];
In this line of code you have just declared an array of MAX chars. The array holds indeterminate values until you initialize it. The strlen function just walks through memory until it finds a 0 value. So, since the values in your array are arbitrary, the result of this function is arbitrary too. Moreover, it can easily walk past the end of your array, which is undefined behavior.
char test[MAX] = "";
This code initializes the first element of the test array to 0 (the remaining elements are zero-initialized as well), so strlen finds the terminator immediately.
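A short sketch of two initializers that give strlen a terminator to find. The function names are made up for the illustration; both forms zero the entire array.

```cpp
#include <cstddef>
#include <cstring>

// Initializing from an empty string literal: test[0] is '\0'
// and the remaining elements are zeroed as well.
std::size_t length_after_string_init()
{
    char test[50] = "";
    return std::strlen(test);
}

// Empty braces value-initialize every element to zero.
std::size_t length_after_brace_init()
{
    char test[50] = {};
    return std::strlen(test);
}
```

Both functions return 0, in contrast to the arbitrary result obtained from an array that was only declared.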

C++ toupper Syntax

I've just been introduced to toupper, and I'm a little confused by the syntax; it seems like it's repeating itself. What I've been using it for is for every character of a string, it converts the character into an uppercase character if possible.
for (int i = 0; i < string.length(); i++)
{
if (isalpha(string[i]))
{
if (islower(string[i]))
{
string[i] = toupper(string[i]);
}
}
}
Why do you have to list string[i] twice? Shouldn't this work?
toupper(string[i]); (I tried it, so I know it doesn't.)
toupper is a function that takes its argument by value. It could have been defined to take a reference to character and modify it in-place, but that would have made it more awkward to write code that just examines the upper-case variant of a character, as in this example:
// compare chars case-insensitively without modifying anything
if (std::toupper(*s1++) == std::toupper(*s2++))
...
In other words, toupper(c) doesn't change c for the same reasons that sin(x) doesn't change x.
To avoid repeating expressions like string[i] on the left and right side of the assignment, take a reference to a character and use it to read and write to the string:
for (size_t i = 0; i < string.length(); i++) {
char& c = string[i]; // reference to character inside string
c = std::toupper(c);
}
Using range-based for, the above can be written more briefly (and executed more efficiently) as:
for (auto& c: string)
c = std::toupper(c);
According to the documentation, the character is passed by value.
Because of that, the answer is no, it shouldn't.
The prototype of toupper is:
int toupper( int ch );
As you can see, the character is passed by value, transformed and returned by value.
If you don't assign the returned value to a variable, it is simply lost.
That's why your example assigns it back, so that it replaces the original character.
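The two cases can be put side by side in a small sketch. The function names are hypothetical, and the casts to unsigned char are an extra precaution not present in the question's code.

```cpp
#include <cctype>

// Calls toupper and throws the result away: the argument is unchanged.
char discard_result(char c)
{
    static_cast<void>(std::toupper(static_cast<unsigned char>(c)));
    return c;
}

// Stores the result back, which is what the loop in the question does.
char assign_result(char c)
{
    c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    return c;
}
```

discard_result('a') still returns 'a', while assign_result('a') returns 'A'.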
As many of the other answers already say, the argument to std::toupper is passed and the result returned by-value which makes sense because otherwise, you wouldn't be able to call, say std::toupper('a'). You cannot modify the literal 'a' in-place. It is also likely that you have your input in a read-only buffer and want to store the uppercase-output in another buffer. So the by-value approach is much more flexible.
What is redundant, on the other hand, is your checking for isalpha and islower. If the character is not a lower-case alphabetic character, toupper will leave it alone anyway so the logic reduces to this.
#include <cctype>
#include <iostream>
int
main()
{
char text[] = "Please send me 400 $ worth of dark chocolate by Wednesday!";
for (auto s = text; *s != '\0'; ++s)
*s = std::toupper(*s);
std::cout << text << '\n';
}
You could further eliminate the raw loop by using an algorithm, if you find this prettier.
#include <algorithm>
#include <cctype>
#include <iostream>
#include <iterator>
int
main()
{
char text[] = "Please send me 400 $ worth of dark chocolate by Wednesday!";
std::transform(std::cbegin(text), std::cend(text), std::begin(text),
[](auto c){ return std::toupper(c); });
std::cout << text << '\n';
}
toupper takes an int by value and returns, as an int, the uppercase version of that character. Whenever a function does not take a pointer or reference parameter, the argument is passed by value: the parameter is a copy of the variable passed in, so changes made to it cannot be seen from outside the function. The way to get the result is to save what the function returns, in this case the upper-cased character.
Note that there is a nasty gotcha in isalpha(), which is the following: the function only works correctly for inputs in the range 0-255 + EOF.
So what, you think.
Well, if your char type happens to be signed, and you pass a value greater than 127, this is considered a negative value, and thus the int passed to isalpha will also be negative (and thus outside the range of 0-255 + EOF).
In Visual Studio, this will crash your application. I have complained about this to Microsoft, on the grounds that a character classification function that is not safe for all inputs is basically pointless, but received an answer stating that this was entirely standards conforming and I should just write better code. Ok, fair enough, but nowhere else in the standard does anyone care about whether char is signed or unsigned. Only in the isxxx functions does it serve as a landmine that could easily make it through testing without anyone noticing.
The following code crashes Visual Studio 2015 (and, as far as I know, all earlier versions):
int x = toupper ('é');
So not only is the isalpha() in your code redundant, it is in fact actively harmful, as it will cause any strings that contain characters with values greater than 127 to crash your application.
See http://en.cppreference.com/w/cpp/string/byte/isalpha: "The behavior is undefined if the value of ch is not representable as unsigned char and is not equal to EOF."
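A common way around this landmine is to convert to unsigned char before calling the classification or conversion function. The wrapper below is a sketch with a made-up name, to_upper_safe, and assumes the default "C" locale:

```cpp
#include <cctype>
#include <string>

// Hypothetical wrapper: converts to unsigned char first so that the value
// handed to std::toupper is always in the range the function accepts.
char to_upper_safe(char c)
{
    return static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
}

// Applies the wrapper to every character of a copy of the string.
std::string to_upper_safe(std::string text)
{
    for (char& c : text)
        c = to_upper_safe(c);
    return text;
}
```

With this, a char holding a value above 127 is passed as a number in 0-255 instead of a negative int, so the call stays within the documented domain. In the "C" locale such characters are simply returned unchanged.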

Does sending a character pointer - initialized to '\0' - to the standard output fault it? (C++)

This is trivial, probably silly, but I need to understand what state cout is left in after you try to print the contents of a character pointer initialized to '\0' (or 0). Take a look at the following snippet:
const char* str;
str = 0; // or str = '\0';
cout << str << endl;
cout << "Welcome" << endl;
In the code snippet above, line 4 won't print "Welcome" to the console after the attempt to print str on line 3. Is there some behavior I should be aware of? If I substitute lines 1-3 with cout << '\0' << endl; the message "Welcome" on the following line will be successfully printed to the console.
NOTE: Line 4 just silently fails to print. No warning or error message or anything (at least not using MinGW(g++) compiler). It spewed an exception when I compiled the same code using MS cl compiler.
EDIT: To dispel the notion that the code fails only when you assign str to '\0', I modified the code to assign to 0 - which was previously commented
If you insert a const char* value to a standard stream (basic_ostream<>), it is required that it not be null. Since str is null you violate this requirement and the behavior is undefined.
The relevant paragraph in the standard is at §27.7.3.6.4/3.
The reason it works with '\0' directly is because '\0' is a char, so no requirements are broken. However, Potatoswatter has convinced me that printing this character out is effectively implementation-defined, so what you see might not quite be what you want (that is, perform your own checks!).
Don't use '\0' when the value in question isn't a "character"
(terminator for a null terminated string or other). That is, I think,
the source of your confusion. Something like:
char const* str = "\0";
std::cout << str << std::endl;
is fine, where str points to a string which contains a '\0' (in this
case, two '\0'). Something like:
char const* str = NULL;
std::cout << str << std::endl;
is undefined behavior; anything can happen.
For historical reasons (dating back to C), '\0' and 0 will convert
implicitly to any pointer type, resulting in a null pointer.
A char* that points to a null character is simply a zero-length string. No harm in printing that.
But a char* whose value is null is a different story. Trying to print that would mean dereferencing a null pointer, which is undefined behavior. A crash is likely.
Assigning '\0' to a pointer isn't really correct, by the way, even if it happens to work: you're assigning a character value to a pointer variable. Use 0 or NULL, or nullptr in C++11, when assigning to a pointer.
Just regarding the cout << '\0' part…
"Terminating the string" of a file or stream in text mode has an undefined effect on its contents. The C++ standard defers to the C standard on matters of text semantics (C++11 27.9.1.1/2), and C is pretty draconian (C99 §7.19.2/2):
Data read in from a text stream will necessarily compare equal to the data that were earlier written out to that stream only if: the data consist only of printing characters and the control characters horizontal tab and new-line; no new-line character is immediately preceded by space characters; and the last character is a new-line character.
Since '\0' is a control character and cout is a text stream, the resulting output may not read as you wrote it.
Take a look at this example:
http://ideone.com/8MHGH
The main problem is that str is a pointer to char, not a char, so you should assign a string to it: str = "\0";
When you assign a char to it, it remains 0 (a null pointer), and cout is then put into an error state (on this implementation), so you can no longer print to it. Here is another example where this is fixed:
http://ideone.com/c4LPh
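Some standard library implementations respond to a null const char* by putting the stream into an error state, which would explain why later output silently disappears; that is an observation about particular implementations, not a guarantee. The sketch below sets the error flag by hand (the function name write_after_error is made up) to show how a stream behaves in that state and how clear() restores it:

```cpp
#include <ios>
#include <sstream>
#include <string>

// Simulates a stream that has entered an error state. The flag is set
// by hand, since actually inserting a null pointer is undefined behavior.
std::string write_after_error(bool clear_first)
{
    std::ostringstream out;
    out.setstate(std::ios_base::badbit);   // stand-in for the failed insertion
    if (clear_first)
        out.clear();                       // reset the error flags
    out << "Welcome";                      // ignored while an error flag is set
    return out.str();
}
```

Without the call to clear() nothing is written; with it, "Welcome" appears. The same check and reset (cout.good(), cout.clear()) apply to std::cout, although avoiding the null pointer in the first place is the real fix.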