isdigit raises a debug assertion when entering £ and ¬ - c++

The below code works for every character I type in except for £ or ¬.
Why do I get a "debug assertion fail"?
#include <iostream>
#include <string>
#include <cctype>
using namespace std;
int main() {
    string input;
    while (1) {
        cout << "Input number: ";
        getline(cin, input);
        if (!isdigit(input[0]))
            cout << "not a digit\n";
    }
}

The Microsoft docs say:
The C++ compiler treats variables of type char, signed char, and unsigned char as having different types. Variables of type char are promoted to int as if they are type signed char by default, unless the /J compilation option is used. In this case they are treated as type unsigned char and are promoted to int without sign extension.
And they also say:
The behavior of isdigit and _isdigit_l is undefined if c is not EOF or in the range 0 through 0xFF, inclusive. When a debug CRT library is used and c is not one of these values, the functions raise an assertion.
So char is signed by default. Because those two characters are not ASCII, they have negative values in your ANSI code page, which puts them outside the permitted range, and thus you get the assertion.

The Microsoft docs say:
The behavior of isdigit and _isdigit_l is undefined if c is not EOF or in the range 0 through 0xFF, inclusive. When a debug CRT library is used and c is not one of these values, the functions raise an assertion.
(I'm guessing Microsoft because of the comment about an "error window", but docs for other implementations place the same limit on argument values.)
EDIT: as Deduplicator observed, the error probably arises from default char being signed on this platform, so that you are passing negative values (different from EOF). std::string uses char, not wide characters, so my original conclusion cannot be correct.

Related

C++ Atoi can't handle special characters

I'm using this atoi to remove all letters from the string. But my string contains special characters, as seen below, and because of this my atoi exits with an error. What should I do to solve this?
#include <iostream>
#include <string>
using namespace std;
int main() {
    std::string playerPickS = "Klöver 12"; // string with special characters
    size_t i = 0;
    for (; i < playerPickS.length(); i++) {
        if (isdigit(playerPickS[i])) break;
    }
    playerPickS = playerPickS.substr(i, playerPickS.length() - i);
    // convert the remaining text to an integer
    cout << atoi(playerPickS.c_str());
}
This is what I believe is the error. I only get it when using those special characters, which is why I think they are the problem.
char can be signed or unsigned, but the isdigit overload without a locale parameter expects a non-negative value (or EOF, typically -1). In your encoding 'ö' has a negative value. You can cast it to unsigned char first: isdigit(static_cast<unsigned char>(playerPickS[i])), or use the locale-aware variant.
atoi stops scanning when it finds something that's not a digit (roughly speaking). So, to get it to do what you want, you have to feed it something that at least starts with the string you want to convert.
From the documentation:
[atoi] Discards any whitespace characters until the first non-whitespace character is found, then takes as many characters as possible to form a valid integer number representation and converts them to an integer value. The valid integer value consists of the following parts:
(optional) plus or minus sign
numeric digits
So, now you know how atoi works, you can pre-process your string appropriately before passing it in. Good luck!
Edit: If your call to isdigit is failing to yield the desired result, the clue lies here:
The behavior is undefined if the value of ch is not representable as unsigned char and is not equal to EOF.
So you need to check for that yourself before you call it. Casting playerPickS[i] to unsigned char will probably work.

Sign & Unsigned Char is not working in C++

In C++ Primer 5th Edition I saw this
when I tried to use it---
It didn't work: for the unsigned char the program printed a weird symbol, and for the signed char the output was completely blank. The compiler also gave some warnings when I tried to compile it. But C++ Primer and so many websites say it should work, so I don't think they give wrong information. Did I do something wrong?
I am a newbie, btw :)
But C++ primer ... said it should work
No it doesn't. The quote from C++ primer doesn't use std::cout at all. The output that you see doesn't contradict with what the book says.
So I don't think they give the wrong information
No. [1]
did I do something wrong?
It seems that you've possibly misunderstood what the value of a character means, or possibly misunderstood how character streams work.
Character types are integer types (but not all integer types are character types). The values of unsigned char are 0..255 (on systems where a byte is 8 bits). Each [2] of those values represents some textual symbol. The mapping from a set of values to a set of symbols is called a "character set" or "character encoding".
std::cout is a character stream, and << is the stream insertion operator. When you insert a character into a stream, the behaviour is not to show the numerical value. Instead, the behaviour is to show the symbol that the value is mapped to [3] in the character set that your system uses. In this case, it appears that the value 255 is mapped to whatever strange symbol you saw on the screen.
If you wish to print the numerical value of a character, what you can do is convert to a non-character integer type and insert that to the character stream:
int i = c;
std::cout << i;
[1] At least, there's no wrong information regarding your confusion. The quote is a bit inaccurate and outdated in the case of c2. Before C++20, the value was "implementation defined" rather than "undefined". Since C++20, the value is actually defined, and it is 0, the null terminator character that signifies the end of a string. If you try to print this character, you'll see no output.
[2] This was a bit of a lie for simplicity's sake. Some characters are not visible symbols. For example, there is the null terminator character as well as other control characters. The situation becomes even more complex in the case of variable-width encodings of the ubiquitous Unicode (such as UTF-8), where symbols may consist of a sequence of several char. In such an encoding, an individual char cannot necessarily be interpreted correctly without the other char that are part of the sequence.
[3] And this behaviour should feel natural once you grok the purpose of character types. Consider the following program:
unsigned char c = 'a';
std::cout << c;
It would be highly confusing if the output would be a number that is the value of the character (such as 97 which may be the value of the symbol 'a' on the system) rather than the symbol 'a'.
For extra meditation, think about what this program might print (and feel free to try it out):
char c = 57;
std::cout << c << '\n';
int i = c;
std::cout << i << '\n';
c = '9';
std::cout << c << '\n';
i = c;
std::cout << i << '\n';
This is due to the behavior of the << operator on the char type and the character stream cout. Note that << performs formatted output, which means it does some implicit formatting.
We can say that the value of a variable is not the same as its representation in certain contexts. For example:
#include <iostream>
int main() {
    bool t = true;
    std::cout << t << std::endl; // Prints 1, not "true"
}
Think of it this way: why would we need char if it still behaved like a number when printed? Why not just use int or unsigned? In essence, we have different types in order to have different behaviors, which can be deduced from those types.
So the underlying numeric value of a char is probably not what we are looking for when we print one.
Check this for example:
#include <iostream>
int main() {
    unsigned char c = -1;
    int i = c;
    std::cout << i << std::endl; // Prints 255
}
If I recall correctly, you're somewhat close in the Primer to the topic of built-in type conversions; things will become clearer once you know these rules better. Anyway, I'm sure you will benefit greatly from looking into this article, especially the "Printing chars as integers via type casting" part.

Why do character arrays accept non ASCII characters in C++?

So, I want to be able to use Chinese characters in my C++ program, and I need to use some type, to hold such characters beyond the ASCII range.
However, I tried to run the following code, and it worked.
#include <iostream>
int main() {
    char snet[4];
    snet[0] = '你';
    snet[1] = '爱';
    snet[2] = '我';
    std::cout << snet << std::endl;
    int conv = static_cast<int>(snet[0]);
    std::cout << conv << std::endl; // -96
}
This doesn't make sense, since sizeof(char) in C++ evaluates to 1 with the g++ compiler, yet Chinese characters cannot be expressed in a single byte.
Why are the Chinese characters here being allowed to be housed in a char type?
What type should be used to house Chinese characters or non-ASCII characters in C++?
When you compile the code using the -Wall flag, you will see warnings like:
warning: overflow in implicit constant conversion [-Woverflow]
snet[2] = '我';
warning: multi-character character constant [-Wmultichar]
snet[1] = '爱';
Visual C++ in Debug mode gives the following warning:
c:\users\you\temp.cpp(9): warning C4566: character represented by universal-character-name '\u4F60' cannot be represented in the current code page (1252)
What is happening under the hood is that your multi-byte Chinese characters are implicitly converted to a char. That conversion overflows, and therefore you see a negative value or something weird when you print it in the console.
Why are the Chinese characters here being allowed to be housed in a char type?
You can, but you shouldn't, in the same way that you can write char c = 1000000;
What type should be used to house Chinese characters or non-ASCII characters in C++?
If you want to store Chinese characters and you can use C++11, go for UTF-8 encoding with std::string. (Since C++20, u8 literals have type char8_t, so the line below compiles only up to C++17.)
std::string msg = u8"你爱我";

Using isalnum with signed character inputs - Visual C++

I have a very simple program where I am using the isalnum function to check if a string contains alpha-numeric characters. The code is:
#include "stdafx.h"
#include <iostream>
#include <string>
#include <locale>
using namespace std;
int _tmain(int argc, _TCHAR* argv[]) {
    string test = "(…….";
    for ( unsigned int i = 0; i < test.length(); i++) {
        if (isalnum(test[i])) {
            cout << "True: " << test[i] << " " << (int)test[i] << endl;
        }
        else {
            cout << "False: " << isalnum(test[i]) << test[i] << " " << (int)test[i] << endl;
        }
    }
    return 0;
}
I am using Visual Studio Desktop Edition 2013 for this snippet.
The issue(s):
1. When this program is run in Debug mode, the program fails with a debug assertion that says: "Expression c >= -1 && c <= 255"
Printing the character at the ith position results in a negative integer (-123). Converting all calls to isalnum to accept unsigned char as input causes the above error to disappear.
I checked the documentation for isalnum and it accepts arguments of type char. Then why does this code snippet fail? I am sure I am missing something trivial here but any help is welcome.
The isalnum function is declared in <cctype> (the C++ version of <ctype.h>) -- which means you really should have #include <cctype> at the top of your source file. You're getting away with calling it without the #include directive because either "stdafx.h" or one of the standard headers (likely <locale>) includes it -- but it's a bad idea to depend on that.
isalnum and friends come from C. The isalnum function takes an argument of type int, which must be either within the range of unsigned char or equal to EOF (which is typically -1). If the argument has any other value, the behavior is undefined.
Annoyingly, this means that if plain char happens to be signed, passing a char value to isalnum causes undefined behavior if the value happens to be negative and not equal to EOF. The signedness of plain char is implementation-defined; it seems to be signed on most modern systems.
C++ adds a template function isalnum that takes an argument of any character type and a second argument of type std::locale. Its declaration is:
template <class charT> bool isalnum (charT c, const locale& loc);
I'm fairly sure that this version of isalnum doesn't suffer from the same problem as the one in <cctype>. You can pass it a char value and it will handle it correctly. You can also pass it an argument of some wide character type like wchar_t. But it requires two arguments. Since you're only passing one argument to isalnum(), you're not using this version; you're using the isalnum declared in <cctype>.
If you want to use this version, you can pass the default locale as the second argument:
std::isalnum(test[i], std::locale())
Or, if you're sure you're only working with narrow characters (type char), you can cast the argument to unsigned char:
std::isalnum(static_cast<unsigned char>(test[i]))
The problem is that char is signed by default in Visual C++, so anything over 0x7f is treated as a negative number when passed to isalnum. Make this simple change:
if (isalnum((unsigned char)test[i])) {
Microsoft's documentation clearly states that the parameter is int, not char. I believe you're getting confused with a different version of isalnum that comes from the locale header. I don't know why the function doesn't accept sign-extended negative numbers, but suspect that it's based on wording in the standard.

Converting Const char * to Unsigned long int - strtoul

I am using the following code to convert a const char * to an unsigned long int, but the output is always 0. Where am I doing wrong? Please let me know.
Here is my code:
#include <iostream>
#include <vector>
#include <stdlib.h>
using namespace std;
int main()
{
    vector<string> tok;
    tok.push_back("2");
    const char *n = tok[0].c_str();
    unsigned long int nc;
    char *pEnd;
    nc=strtoul(n,&pEnd,1);
    //cout<<n<<endl;
    cout<<nc<<endl; // it must output 2 !?
    return 0;
}
Use base-10:
nc=strtoul(n,&pEnd,10);
or allow the base to be auto-detected:
nc=strtoul(n,&pEnd,0);
The third argument to strtoul is the base to be used and you had it as base-1.
You need to use:
nc=strtoul(n,&pEnd,10);
You used base 1, which is not a valid base: strtoul accepts only 0 or a value from 2 through 36.
If you need info about integer bases you can read this
The C standard library function strtoul takes as its third argument the base/radix of the number system to be used in interpreting the char array pointed to by the first argument.
Where am I doing wrong?
nc=strtoul(n,&pEnd,1);
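You're passing the base as 1, which is not a valid base: strtoul accepts only 0 or a value from 2 through 36. With an unsupported base no conversion is performed, and implementations typically return 0 (on POSIX systems with errno set to EINVAL). Hence you'd get only that as the output. If you need decimal interpretation, pass 10 instead of 1.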
Alternatively, passing 0 lets the function auto-detect the system based on the prefix: if it starts with 0 then it is interpreted as octal, if it is 0x or 0X it is taken as hexadecimal, if it has other numerals it is assumed as decimal.
Aside:
If you don't need to know the character up to which the conversion was performed, you don't need a dummy second argument; you can pass NULL instead.
When you're using a C standard library function in a C++ program, it's recommended that you include the C++ version of the header: with the prefix c and without the suffix .h, e.g. in your case #include <cstdlib>.
using namespace std; is considered bad practice.