Disambiguating std::isalpha() in C++

Disambiguating std::isalpha() in C++ - c++

So I am currently writing a part of a program that takes user text input. I want to ignore all input characters that are not alphabetic, and so I figured std::isalpha() would be a good way to do this. Unfortunately, as far as I know there are two std::isalpha() functions, and the general one needs to be disambiguated from the locale-specific one thusly:
(int(*)(int))std::isalpha()
If I don't disambiguate, std::isalpha seems to return true when reading uppercase but false when reading lowercase letters (if I directly print the returned value, though, it returns 0 for non-alpha chars, 1 for uppercase chars, and 2 for lowercase chars). So I need to do this.
I've done so in another program before, but for some reason, in this project, I sometimes get "ISO C++ forbids" errors. Note, only sometimes. Here is the problematic area of code (this appears together without anything in between):
std::cout << "Is alpha? " << (int(*)(int))std::isalpha((char)Event.text.unicode) << "\n";
if ( (int(*)(int))std::isalpha((char)Event.text.unicode) == true)
{
std::cout << "Is alpha!\n";
//...snip...
}
The first instance, where I send the returned value to std::cout, works fine - I get no errors for this, I get the expected values (0 for non-alpha, 1 for alpha), and if that's the only place I try to disambiguate, the program compiles and runs fine.
The second instance, however, throws up this:
error: ISO C++ forbids comparison between pointer and integer
and only compiles if I remove the (int(*)(int)) snippet, at which point bad behavior ensues. Could someone enlighten me here?

You are casting the return value of the std::alpha() call to int(*)(int), and then compare that pointer to true. Comparing pointers to boolean values doesn't make much sense and you get an error.
Now, without the cast, you compare the int returned by std::alpha() to true. bool is an integer type, and to compare the two different integer types the values are first converted to the same type. In this case they are both converted to int. true becomes 1, and if std::isalpha() returned 2 the comparison ends up with 2 != 1.
If you want to compare the result of std::alpha() against a bool, you should cast that returned in to bool, or simply leave out the comparison and use something like if (std::isalpha(c)) {...}

There is no need to disambiguate, because the there is no ambiguity in a normal call.
Also, there is no need to use the std:: prefix when you get the function declaration from <ctype.h>, which after C++11 is the header you should preferably use (i.e., not <cctype>) – and for that matter also before C++11, but C++11 clinched it.
Third, you should not compare the result to true.
However, you need to cast a char argument to unsigned char, lest you get Undefined Behavior for anything but 7-bit ASCII.
E.g. do like this:
bool isAlpha( char const c )
{
typedef unsigned char UChar;
return !!isalpha( UChar( c ) );
}

Related

c++ how to convert char to int while using `tolower`

I'm try to compile a simple expression:
char_to_int(tolower(row[y]))
However I'm getting the following errors when trying to compile it:
error: implicit conversion loses integer precision: 'int' to 'char' [-Werror,-Wimplicit-int-conversion]
if (char_to_int(tolower(row[y])) > n
The signature of char_to_int is:
unsigned long char_to_int(char c)
and the type of row[y] is char.
Why am I getting this error and how can I fix it?

From your error information I assume you are using std::tolower from <cctype> (or equivalently, ::tolower from <ctype.h>), not std::tolower from <locale>.
Why you are getting the error is straightforward from your error information: your char_to_int expects a char, but tolower returns an int. This will cause loss of information.
Why does tolower return an int, not just a char? Because it can accept and return EOF, which may fall out of range of any char.
The fix can be straightforward: change your char_to_int to accept int, or do an intermediate step to discard the possible EOF.

std::tolower doesn't actually operate on chars: it operates on ints! Moreover, there is risk of undefined behaviour: if on your machine char is a signed type, then the "negative" characters will correspond to negative integers, which std::tolower is not equipped to deal with.
A way to fix this for your use is to manually cast the types before use:
char_to_int(static_cast<char>(
std::tolower(static_cast<unsigned char>(row[y]))));
... which unfortunately is a bit of a mess, but that's what you have to do.
Alternatively, you may use the locale version of std::tolower, which is templated and will correctly handle char types. You may use it like so:
// std::locale{} is an object representing the default locale
// you may specify a locale precisely if needed; see the above links
char_to_int(std::tolower(row[y], std::locale{}));

tolower returns an int. std::tolower is however a template, and will work correctly for char. In general, if there is a std:: version of any func you are calling, use it! :)

String comparison of char* to uint8_t [duplicate]

This question already has an answer here:
Comparing uint8_t data with string
(1 answer)
Closed 4 years ago.
I'm new to C and C++, and can't seem to work out how I need to compare these values:
Variable I'm being passed:
typedef struct {
uint8_t ssid[33];
String I want to match. I've tried both of these:
uint8_t AP_Match = "MatchString";
unsigned char* AP_Match = "MatchString";
How I've attempted to match:
if (strncmp(list[i].ssid, "MatchString")) {
if (list[i].ssid == AP_Match) {
if (list[i].ssid == "MatchString") {
// This one fails because String is undeclared, despite having
// an include line for string.h
if (String(reinterpret_cast<const char*>(conf.sta.ssid)) == 'MatchString') {
I've noodled around with this a few different ways, and done some searching. I know one or both of these may be the wrong type, but I'm not sure to get from where I am to working.

There is no such type as "String" defined by any C standard. A string is just an array of characters that are stored as unsigned values based on the chosen encoding. 'string.h' provides various functions for comparison, concatenation, etc. but it can only work if the values you are passing to it are coherent.
The operator "==" is also undefined for string comparisons, because it would require comparing each character at each index, for two arrays that may not be the same size and ultimately may use different encodings, despite the same underlying unsigned integer representation (raising the prospect of false positive comparisons). You can possibly define your own function to do it (note C doesn't allow overloading operators), but otherwise you're stuck with what the standard libraries provide.
Note that strncmp() takes a size parameter for the number of characters to compare (your code is missing this). https://www.tutorialspoint.com/c_standard_library/c_function_strncmp.htm
Otherwise you would be looking at the function strcmp(), which requires the input strings to be null-terminated (last character equal to '\0'). Ultimately it's up to you to consider what the possible combinations of inputs could be and how they are stored and to use a comparison function that is robust to all possibilities.
As a final side note
if (list[i].ssid == "MatchString") {
Since ssid is an array, you should know that when you do this comparison, you are not actually accessing the contents of ssid, but rather the address of the first element of ssid. When you pass list[i].ssid into strcmp (or strncmp), you are passing a pointer to the first element of the array in memory. The function then iterates over the entire array until it reaches the null character (in the case of strcmp) or until it has compared the specified number of elements (in the case of strncmp).

To match two strings use strcmp:
if (0==strcmp(str1, str2))
str1 and str2 are addresses to memory holding a null terminated string. Return value zero means the strings are equal.
In your case one of:
if (0==strcmp(list[i].ssid, AP_Match))
if (0==strcmp(list[i].ssid, "MatchString"))

C++ toupper Syntax

I've just been introduced to toupper, and I'm a little confused by the syntax; it seems like it's repeating itself. What I've been using it for is for every character of a string, it converts the character into an uppercase character if possible.
for (int i = 0; i < string.length(); i++)
{
if (isalpha(string[i]))
{
if (islower(string[i]))
{
string[i] = toupper(string[i]);
}
}
}
Why do you have to list string[i] twice? Shouldn't this work?
toupper(string[i]); (I tried it, so I know it doesn't.)

toupper is a function that takes its argument by value. It could have been defined to take a reference to character and modify it in-place, but that would have made it more awkward to write code that just examines the upper-case variant of a character, as in this example:
// compare chars case-insensitively without modifying anything
if (std::toupper(*s1++) == std::toupper(*s2++))
...
In other words, toupper(c) doesn't change c for the same reasons that sin(x) doesn't change x.
To avoid repeating expressions like string[i] on the left and right side of the assignment, take a reference to a character and use it to read and write to the string:
for (size_t i = 0; i < string.length(); i++) {
char& c = string[i]; // reference to character inside string
c = std::toupper(c);
}
Using range-based for, the above can be written more briefly (and executed more efficiently) as:
for (auto& c: string)
c = std::toupper(c);

As from the documentation, the character is passed by value.
Because of that, the answer is no, it shouldn't.
The prototype of toupper is:
int toupper( int ch );
As you can see, the character is passed by value, transformed and returned by value.
If you don't assign the returned value to a variable, it will be definitely lost.
That's why in your example it is reassigned so that to replace the original one.

As many of the other answers already say, the argument to std::toupper is passed and the result returned by-value which makes sense because otherwise, you wouldn't be able to call, say std::toupper('a'). You cannot modify the literal 'a' in-place. It is also likely that you have your input in a read-only buffer and want to store the uppercase-output in another buffer. So the by-value approach is much more flexible.
What is redundant, on the other hand, is your checking for isalpha and islower. If the character is not a lower-case alphabetic character, toupper will leave it alone anyway so the logic reduces to this.
#include <cctype>
#include <iostream>
int
main()
{
char text[] = "Please send me 400 $ worth of dark chocolate by Wednesday!";
for (auto s = text; *s != '\0'; ++s)
*s = std::toupper(*s);
std::cout << text << '\n';
}
You could further eliminate the raw loop by using an algorithm, if you find this prettier.
#include <algorithm>
#include <cctype>
#include <iostream>
#include <utility>
int
main()
{
char text[] = "Please send me 400 $ worth of dark chocolate by Wednesday!";
std::transform(std::cbegin(text), std::cend(text), std::begin(text),
[](auto c){ return std::toupper(c); });
std::cout << text << '\n';
}

toupper takes an int by value and returns the int value of the char of that uppercase character. Every time a function doesn't take a pointer or reference as a parameter the parameter will be passed by value which means that there is no possible way to see the changes from outside the function because the parameter will actually be a copy of the variable passed to the function, the way you catch the changes is by saving what the function returns. In this case, the character upper-cased.

Note that there is a nasty gotcha in isalpha(), which is the following: the function only works correctly for inputs in the range 0-255 + EOF.
So what, you think.
Well, if your char type happens to be signed, and you pass a value greater than 127, this is considered a negative value, and thus the int passed to isalpha will also be negative (and thus outside the range of 0-255 + EOF).
In Visual Studio, this will crash your application. I have complained about this to Microsoft, on the grounds that a character classification function that is not safe for all inputs is basically pointless, but received an answer stating that this was entirely standards conforming and I should just write better code. Ok, fair enough, but nowhere else in the standard does anyone care about whether char is signed or unsigned. Only in the isxxx functions does it serve as a landmine that could easily make it through testing without anyone noticing.
The following code crashes Visual Studio 2015 (and, as far as I know, all earlier versions):
int x = toupper ('é');
So not only is the isalpha() in your code redundant, it is in fact actively harmful, as it will cause any strings that contain characters with values greater than 127 to crash your application.
See http://en.cppreference.com/w/cpp/string/byte/isalpha: "The behavior is undefined if the value of ch is not representable as unsigned char and is not equal to EOF."

Cannot resolve type for template function

I'm trying to code up something very simple in D, but I'm having a few problems with one of the standard library template functions (specifically, nextPermutation from std.algorithm).
The crux of what I'm trying to do is to create all permutations of pandigital numbers (that is, numbers including all the values 1 to 9 exactly once).
To do this, I've done the following:
import std.algorithm;
import std.conv;
int[] pandigitals()
{
char[] initial = "123456789".dup;
auto pan = [to!int(initial)];
while(nextPermutation!(initial)) {
pan ~= to!int(initial);
}
return pan;
}
This gives me the error:
Error: cannot resolve type for nextPermutation!(initial)
I've also tried to explicitly set the types:
while(nextPermutation!("a<b", char[])(initial))
However, this gives an error saying it cannot match the template:
Error: template instance std.algorithm.nextPermutation!("a < b", char[]) does not match template declaration nextPermutation(alias less = "a < b", BidirectionalRange)(ref BidirectionalRange range) if (isBidirectionalRange!BidirectionalRange && hasSwappableElements!BidirectionalRange)
What is the correct form of the call meant to be?

Well, your first problem is that you're passing initial as a template argument instead of a function argument. The !() is for template arguments. so, instead of
while(nextPermutation!(initial))
you need to do
while(nextPermutation(initial)) {
Now, that will still give you an error.
q.d(10): Error: template std.algorithm.nextPermutation cannot deduce function from argument types !()(char[]), candidates are:
/usr/include/D/phobos/std/algorithm.d(12351): std.algorithm.nextPermutation(alias less = "a<b", BidirectionalRange)(ref BidirectionalRange range) if (isBidirectionalRange!BidirectionalRange && hasSwappableElements!BidirectionalRange)
And that's because hasSwappableElements!(char[]) is false, and per nextPermutations' template constraint it needs to be true for a type to work with nextPermutations.
It's false because all strings are treated as ranges of dchar rather than their actual element type. This is because in UTF-8 (char) and UTF-16 (wchar), there are multiple code units per code point, so operating on individual code units could break up a code point, whereas in UTF-32 (dchar), there's always one code unit per code point. Essentially, if arrays of char or wchar were treated as ranges of char or wchar, you'd run a high risk of breaking up characters so that you'd end up with pieces of characters rather than whole characters. So, in general in D, if you want to operate on an individual character, you should use dchar, not char or wchar. If you're not very familiar with Unicode, I'd suggest reading this article by Joel Spoelsky on the subject.
However, regardless of why hasSwappableElements!(char[]) is false, it is false, so you're going to need to use a different type. The simplest thing would probably be to just swap your algorithm over to using dchar[] instead.
int[] pandigitals()
{
dchar[] initial = "123456789"d.dup;
auto pan = [to!int(initial)];
while(nextPermutation(initial)) {
pan ~= to!int(initial);
}
return pan;
}

Scanf function with strings

I need to read input from user. The input value may be string type or int type.
If the value is int then the program insert the value into my object.
Else if the value is string then it should check the value of that string, if it's "end" then the program ends.
Halda h; //my object
string t;
int tint;
bool end=false;
while(end!=true)
{
if(scanf("%d",&tint)==1)
{
h.insert(tint);
}
else if(scanf("%s",t)==1)
{
if(t=="end")
end=true;
else if(t=="next")
if(h.empty()==false)
printf("%d\n",h.pop());
else
printf("-1\n");
}
}
The problem is that scanning string doesn't seem to work properly.
I've tried to change it to: if(cin>>t) and it worked well.
I need to get it work with scanf.

The specifier %s in the scanf() format expects a char*, not a std::string.
From C11 Standard (C++ Standard refers to it about the C standard library):
Except in the case of a % specifier, the input item (or, in the case of a %n directive, the
count of input characters) is converted to a type appropriate to the conversion specifier. If
the input item is not a matching sequence, the execution of the directive fails: this
condition is a matching failure. Unless assignment suppression was indicated by a *, the
result of the conversion is placed in the object pointed to by the first argument following
the format argument that has not already received a conversion result. If this object
does not have an appropriate type, or if the result of the conversion cannot be represented
in the object, the behavior is undefined.
Anyway, here there's is no real reason to prefer the C way, use C++ facilities. And when you use the C library, use safe functions that only reads characters up to a given limit (just like fgets, or scanf with a width specifier), otherwise you could have overflow, that leads again to undefined behavior, and some errors if you're luck.

That's a really bad way to check for end-of-input. Either use an integer or use a string.
If you choose string, make provisions to convert from string to int.
My logic would be to first check if it can be converted to integer. if it can be, then continue with the logic. If it can't be(such as if it's a float or double or some other string) then ignore and move on. If it can be, then insert it into Halda's object.
Sidenote: Do not use scanf() and printf() when you're working with C++.

Assuming string refers to std::sring this program doesn't have defined behavior. You can't really use std::string with sscanf() You could set up a buffer inside the std::string and read into that but the string wouldn't change its size. You are probably better off using streams with std::string (well, in my opinion you are always better off using streams).

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

Disambiguating std::isalpha() in C++ - c++

Related

c++ how to convert char to int while using `tolower`

String comparison of char* to uint8_t [duplicate]

C++ toupper Syntax

Cannot resolve type for template function

Scanf function with strings

Categories

Resources