How can I create an "isalnum" equivalent? - c++

I was wondering if I am writing correct "isalnum" logic. This program is checking if the string is a palindrome or not and when I input the string "race a car", it keeps saying it is true, i.e. it's a palindrome.
bool isPalindrome(string s) {
for (int i = 0, j = s.size() - 1; i < j; i++, j--) {
while ((s[i] < 'a' || s[i] > 'z') && (i < j) ||
(s[i] < 'A' || s[i] > 'Z') && (i < j) ||
(s[i] < '0' || s[i] > '9') && (i < j))
i++;
while ((s[j] < 'a' || s[j] > 'z') && (i < j) ||
(s[j] < 'A' || s[j] > 'Z') && (i < j) ||
(s[j] < '0' || s[j] > '9') && (i < j))
j--;
if (toupper(s[i]) != toupper(s[j])) return false;
}
return true;
}

No, your logic is not correct, well, at least not for all the character sets that C++ can use. In ASCII, the letters of the alphabet are in contiguous blocks, so something like
(s[i]<'A'||s[i]>'Z')
works just fine. The issue with that is that ASCII isn't the only character set C++ supports. The most common counterexample to ASCII is EBCDIC, which has non-letter characters such as } and \ in between A and Z.
One thing that is guaranteed, though, is that 0 through 9 are contiguous in all character sets that C++ supports, so it's always legal to test whether a character is a digit using
if (char_var >= '0' && char_var <= '9')
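If the goal is an "isalnum" equivalent that does not depend on the letters being contiguous, one option is to lean on the standard library instead of hand-written ranges. The sketch below is only an illustration: the helper names isAlnumChar and isDigitChar are made up for this example, and the behavior of std::isalnum assumes the default "C" locale.

```cpp
#include <cctype>

// Hypothetical helper (not from the question): wraps std::isalnum.
// The cast to unsigned char avoids undefined behavior when char is
// signed and holds a negative value.
bool isAlnumChar(char c) {
    return std::isalnum(static_cast<unsigned char>(c)) != 0;
}

// Hypothetical helper: digits '0'..'9' are guaranteed to be contiguous,
// so a plain range test is portable here.
bool isDigitChar(char c) {
    return c >= '0' && c <= '9';
}
```

With these, isAlnumChar('a') and isAlnumChar('7') are true, while isAlnumChar(' ') is false.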

Assuming contiguous alphabets (which isn't going to hold in reality), your logic is broken by the fact that you're doing multi-range checking, where disqualification in one range is still qualified in another.
Specifically, this:
while((s[i]<'a'||s[i]>'z')&&(i<j)||(s[i]<'A'||s[i]>'Z')&&(i<j)||(s[i]<'0'||s[i]>'9')&&(i<j))i++;
Now, consider this: suppose s[i] is in 'a'..'z', so the first range check will be false. But if that's the case, then it is NOT in 'A'..'Z' and is certainly not in '0'..'9'. Since both of those tests result in true, the loop advances i and continues on. As your loop is written, so long as the character is outside at least one of those ranges, the loop continues. Since the ranges are mutually exclusive, there will ALWAYS be at least one range the current character is not within. That OR separating the conditions is wrong. The test shouldn't be not-in one of those ranges; it should be not-in ALL of those ranges. Thus, AND is appropriate.
Short work with a debugger will tell you the very first pass of your outer-for-loop is advancing i all the way to j. The second loop is skipped because j and i are already equal, and since s[i] == s[j] is definitely true when i == j, the result is true.
Short version: your loop conditions are broken, even on contiguous character sequence platforms.
The loop you're more inclined to succeed with would be something like:
while( (i < j) && (s[i]<'a'|| s[i]>'z') && (s[i]<'A'||s[i]>'Z') && (s[i]<'0'||s[i]>'9') )
i++;
I leave the other loop as an exercise for you, along with the question of whether to do any of this at all, given that it will not work on encodings where the letters are not contiguous.
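For reference, here is one way the whole function might look once both skip loops require the character to be outside ALL ranges. This is a sketch rather than the original poster's final code: it swaps the hand-written ranges for std::isalnum (so it does not rely on contiguous letters), and it uses unsigned indices with an explicit guard where the two indices meet.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Sketch of a palindrome check that ignores non-alphanumeric characters
// and letter case. Uses std::isalnum instead of manual range tests.
bool isPalindrome(const std::string& s) {
    if (s.empty()) return true;
    std::size_t i = 0;
    std::size_t j = s.size() - 1;
    while (i < j) {
        // Skip characters that are neither letters nor digits.
        while (i < j && !std::isalnum(static_cast<unsigned char>(s[i]))) ++i;
        while (i < j && !std::isalnum(static_cast<unsigned char>(s[j]))) --j;
        if (i >= j) break;  // the indices met: nothing left to compare
        if (std::toupper(static_cast<unsigned char>(s[i])) !=
            std::toupper(static_cast<unsigned char>(s[j]))) {
            return false;
        }
        ++i;
        --j;  // safe: j > i >= 0 held before the decrement
    }
    return true;
}
```

With this version, "race a car" is rejected because the comparison eventually reaches 'e' against 'a'.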

Related

strcmpi integer without a cast error

I'm trying to create a program that removes vowels from a string, add them into a vector and then give the user the possibility of having the original code again.
In my code i have this:
char s[20];
And soon after i have this comparison:
for(j=0;j<tam;j++)
{
if(strcmpi(s[j],"a")==1 ||
(strcmpi(s[j],"e")==1 ||
(strcmpi(s[j],"i") ==1||
(strcmpi(s[j],"o") ==1||
(strcmpi(s[j],"u")==1))
{
So if the array is char and the vowels are char(""), why does the compiler give me this error?
[Warning] passing arg 1 of `strcmpi' makes pointer from integer without a cast
EDIT
As someone said, the correct comparison is s[j] == 'a', but that gives the wrong result. If I enter car, the result is still car. I don't know why.
if(s[j] == 'a' ||
s[j] == 'A' ||
s[j] == 'e' ||
s[j] == 'E' ||
s[j] == 'i' ||
s[j] == 'I' ||
s[j] == 'o' ||
s[j] == 'O' ||
s[j] == 'u' ||
s[j] == 'U')
{
s[j] = s[j++]; }
strcmpi is for comparing strings. You are passing a char as the first argument, but strcmpi expects a char*.
In order to compare single chars, s[j]=='e' is enough, or tolower(s[j])=='e' if you need it to be case insensitive. You'll need to include ctype.h for tolower.
The arguments to strcmpi must be strings, but s[j] is just a single character, not a string. You can use == to compare characters directly. To get case-insensitive comparisons, get the lowercase version of the character first and compare it.
for (j = 0; j < tam; j++) {
char lower = tolower(s[j]);
if (lower == 'a' || lower == 'e' || lower == 'i' || lower == 'o' || lower == 'u') {
...
}
}
You don't want to use strcmp or any of its variants.
Because you want to know whether the string contains vowels or not, you may want to use a substring search using strstr.
You are using the function strcmpi incorrectly. Its first parameter has type const char *, while you pass an argument of type char. That is, the function expects a string while you pass only one character.
Moreover this function is not a standard C/C++ function. So it should not be used.
You could achieve the same result using the following approach
char vowels[] = "aeiou";
//...
if ( strchr( vowels, tolower( s[j] ) ) )
{
std::cout << s[j] << " is a vowel" << std::endl;
}
You have already been told that strcmpi is not the right way to check single characters. This is an answer to the edit to your question, where you ask about actually stripping the vowels.
If you want to retain the original string, you need extra memory for the string without vowels. You also need two indices, because once you have skipped a vowel in the original string, the indices are out of sync. Here's an example implementation:
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
int main()
{
char orig[] = "Jackdaws love my big sphinx of quartz.";
char cons[sizeof(orig)]; // Make new char buffer as big
// as the original
int i, j;
j = 0;
for (i = 0; orig[i]; i++) {
if (strchr("AEIOUaeiou", orig[i]) == NULL) {
cons[j++] = orig[i];
}
}
cons[j] = '\0'; // Terminate consonant string
printf("was: '%s'\n", orig);
printf("is: '%s'\n", cons);
return 0;
}
The function strchr checks whether a character is in a string. You can use it as a shortcut to spelling out all vowels in explicit comparisons.
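Since the question is also tagged as C++-flavored elsewhere in this thread, the same idea can be sketched with std::string and the erase-remove idiom. This is an alternative technique, not the answer's method; the helper name stripVowels is hypothetical. Taking the argument by value leaves the caller's original string untouched, which covers the "give the original back" requirement.

```cpp
#include <algorithm>
#include <cstring>
#include <string>

// Hypothetical helper: returns a copy of `text` with all vowels removed.
// The parameter is taken by value, so the caller's string is not modified.
std::string stripVowels(std::string text) {
    auto isVowel = [](char c) {
        // strchr also matches the terminating '\0', so exclude it explicitly.
        return c != '\0' && std::strchr("AEIOUaeiou", c) != nullptr;
    };
    // remove_if shifts the kept characters forward; erase trims the tail.
    text.erase(std::remove_if(text.begin(), text.end(), isVowel), text.end());
    return text;
}
```

For the input from the edit, stripVowels("car") yields "cr".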

Are semicolon-terminated if-statements good coding practice?

I am looping i, j, k and n. I want a statement to execute except when k == j and n == i.
If I use the code like this:
if (k != j && n != i)
//statement
The statement will not execute in two cases:
When k != j and n == i.
When k == j and n != i. (which is not what I need)
So I used a code like this:
if (k == j && n == i)
;else
//statement
By this code the statement will successfully execute except when k == j && n == i.
Are semicolon-terminated if-statements a good way of coding in C++?
No it's not a good way of coding. The usual way is to use ! to invert the logic:
if(!(k==j && n==i))
//statement
Or you could use De Morgan's Law to invert the logic:
if(k!=j || n!=i)
//statement
Your problem is that you're negating the condition incorrectly. You should just do:
if (!(k==j && n==i))
// statement
Or, by De Morgan's laws:
if (k != j || n != i)
// statement
... is a good way of coding?
No.
You should write
if (!(k == j && n == i))
{
//statement
}
Putting a semicolon after an if, for, or while is almost always wrong. It is highly unexpected and makes reading your code very difficult.
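To see that the negated form and the De Morgan form from the answers above really are interchangeable, a brute-force comparison over a small range of values is enough. The function names here are invented for the illustration.

```cpp
// Form 1 from the answers: negate the whole "skip" condition.
bool shouldRunNegated(int i, int j, int k, int n) {
    return !(k == j && n == i);
}

// Form 2 from the answers: the same condition after applying De Morgan's law.
bool shouldRunDeMorgan(int i, int j, int k, int n) {
    return k != j || n != i;
}

// Hypothetical checker: compares both forms for every combination of
// i, j, k, n in [0, limit). Returns true if they always agree.
bool formsAgree(int limit) {
    for (int i = 0; i < limit; ++i)
        for (int j = 0; j < limit; ++j)
            for (int k = 0; k < limit; ++k)
                for (int n = 0; n < limit; ++n)
                    if (shouldRunNegated(i, j, k, n) !=
                        shouldRunDeMorgan(i, j, k, n))
                        return false;
    return true;
}
```

Both forms skip the statement only when k == j and n == i hold together, which is the behavior the question asks for.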

C++ toUpper Implementation

I made an implementation of toUpper(). It doesn't work 100%.
Code :
char* toUpper(char* string)
{
char* sv = string;
while(*sv++ != '\0')
{
if( int(*sv) >= 97 || int(*sv) <= 122) //Only if it's a lower letter
*sv = char( *sv - 32);
}
return string;
}
I know that the lower letters have the numbers from 97 to 122 (in ASCII) and the upper letters have the numbers from 65 to 90. There are exactly 32 numbers between the lower to the upper letter. So I just subtracted 32 from the lower character.
Code where I call this function :
char h[] = "Whats up?";
cout << toUpper(h) << endl;
I expected the program to output "WHATS UP?" but instead I got "WHATS". What did I do wrong?
if( int(*sv) >= 97 || int(*sv) <= 122)
should be
if( int(*sv) >= 97 && int(*sv) <= 122)
or, preferably
if( *sv >= 'a' && *sv <= 'z')
*sv = *sv - ('a' - 'A');
You also need to move the point at which you increment sv. The current code skips checking the first character in string.
while(*sv != '\0')
{
if( *sv >= 'a' && *sv <= 'z')
*sv = *sv - ('a' - 'A');
sv++;
}
Lastly, I'm sure you're aware of it but just in case... if this isn't a homework assignment or other learning exercise, the standard C toupper function will do exactly the same job for you
*sv = (char)toupper((unsigned char)*sv);
Having ++ in the while makes you miss important cases. The int() things are unnecessary noise. You need && in the check condition. The action can be written with -=.
Here's a rewrite that uses a for loop and fixes your conditional as well as off-by-one increment:
char* toUpper(char* string)
{
for(char* p=string; *p != '\0'; p++)
{
if(*p >= 'a' && *p <= 'z') //Only if it's a lower letter
*p -= 32;
}
return string;
}

Arithmetic on C++ strings

This code really confuses me, it is using some Stanford libraries for the Vector (array) class. Can anyone tell me what is the purpose of int index = line [j] - 'a'; why - 'a'?
void countLetters(string filename)
{
Vector<int> result;
ifstream in;
in.open(filename.c_str());
if (in.fail()) Error("Couldn't read '" + filename + "'");
for (int i = 0; i < ALPHABETH_SIZE; i++)
{
result.add(0); // Must initialize contents of array
}
string line;
while (true)
{
getLine(in, line);
// Check that we got a line
if (in.fail()) break;
line = ConvertToLowerCase(line);
for (int j = 0; j < line.length(); j++)
{
int index = line [j] - 'a';
if (index >= 0 && index < ALPHABETH_SIZE)
{
int prevTotal = result[index];
result[index] = prevTotal +1;
}
}
}
}
The purpose of the code:
Takes a filename and prints the number of times each letter of the alphabet appears in that file. Because there are 26 numbers to be printed, CountLetters needs to create a Vector. For example, if the file is:
Characters in a string are encoded using a character set... typically ASCII on hardware common in English language systems. You can see the ASCII table at http://en.wikipedia.org/wiki/ASCII
In ASCII (and most other character sets), the numbers representing letters are contiguous. So, this is the natural way to test whether the character at index j in character-array line is a letter:
line[j] >= 'a' && line[j] <= 'z'
Your program is equivalent to that: in an algebraic sense, it subtracts 'a' from both sides of each comparison (knowing that 'a' is the first lowercase letter):
line[j] - 'a' >= 'a' - 'a' && line[j] - 'a' <= 'z' - 'a'
line[j] - 'a' >= 0 && line[j] - 'a' <= 'z' - 'a'
Replacing "<= 'z' - 'a'" with an equivalent:
line[j] - 'a' >= 0 && line[j] - 'a' < ALPHABET_SIZE
where ALPHABET_SIZE is 26. This trades a dependency on knowing z is the last letter for knowing how many letters are in the alphabet - both are a little fragile, but fine if you know you're dealing with a well-known, stable character set encoding.
A better way to check for a letter is to use the isalpha() predicate: http://www.cplusplus.com/reference/clibrary/cctype/isalpha/
'a' is the first lowercase letter in ASCII.
int index = line [j] - 'a';
if (index >= 0 && index < ALPHABETH_SIZE)
These two lines of code just check whether line[j] is a letter.
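The indexing trick is easier to see in isolation, without the file handling or the Stanford Vector class. The sketch below counts letters in a std::string using a std::array; the names kAlphabetSize and countLetters are chosen for this example, and it assumes an encoding such as ASCII where 'a'..'z' are contiguous.

```cpp
#include <array>
#include <cctype>
#include <string>

// Assumption: the 26-letter English alphabet, stored contiguously
// in the character set (true for ASCII).
constexpr int kAlphabetSize = 26;

// Sketch: counts[0] holds the number of 'a'/'A', counts[1] the number
// of 'b'/'B', and so on up to counts[25] for 'z'/'Z'.
std::array<int, kAlphabetSize> countLetters(const std::string& text) {
    std::array<int, kAlphabetSize> counts{};  // value-initialized to zeros
    for (char c : text) {
        // Subtracting 'a' maps 'a' to 0, 'b' to 1, ..., 'z' to 25.
        int index = std::tolower(static_cast<unsigned char>(c)) - 'a';
        if (index >= 0 && index < kAlphabetSize) {
            ++counts[index];
        }
    }
    return counts;
}
```

Anything that is not a letter produces an index outside 0..25 and is simply skipped by the range check.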

Can't seem to break a while loop on collision, hashing

I don't know why this code is not breaking out of the while loop:
int table_size = 953;
store hash_table[953];
for(int i = 0; i < table_size; i++)
hash_table[i].count = 0;
//bunch of stuff to get hash value here
while(hash_table[hashNum].data != pString || hash_table[hashNum].count != 0){
hashNum++;
if(hashNum > table_size)
hashNum = 0;
cout << hash_table[hashNum].count;
// to check the value of the count in the array, it IS 0, thus should have broken the loop
}
You probably mean:
while(hash_table[hashNum].data != pString && hash_table[hashNum].count != 0)
In your code the loop will continue if either case is true, hash_table[hashNum].count == 0 is NOT sufficient to make the clause false.
hash_table[hashNum].count being equal to zero is not sufficient to terminate the loop since you are using || ("or") between the two conditions in the termination test. If hash_table[hashNum].data is not equal to pString then the loop will continue regardless of what hash_table[hashNum].count is.
I think your loop condition should be on hashNum != 0 instead of hash_table[hashNum].count != 0.
Secondly, there should be && instead of || in your while condition.
These are wild guesses since a lot of information is missing in this question.
You should have a look at Boolean logic, especially De Morgan's theorem:
!(a && b) is equivalent to (!a) || (!b)
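Putting the && suggestion into a complete probing loop might look like the sketch below. Several things here are assumptions rather than facts from the question: Slot is a stand-in for the question's store type, count == 0 is taken to mean "empty slot", and findSlot is a hypothetical name. The wraparound uses modulo so the index never reaches the table size (the question's hashNum > table_size check lets hashNum equal table_size, which is one past the last valid index).

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the question's `store` type.
struct Slot {
    std::string data;
    int count = 0;  // assumption: 0 marks an empty slot
};

// Sketch of linear probing: keep going while the slot is occupied AND
// holds a different key. Returns the index of the matching or first
// empty slot, or -1 if every slot was probed without success.
int findSlot(const std::vector<Slot>& table, const std::string& key,
             std::size_t start) {
    const std::size_t size = table.size();
    if (size == 0) return -1;
    std::size_t index = start % size;
    for (std::size_t probes = 0; probes < size; ++probes) {
        const Slot& slot = table[index];
        // Stop condition is the De Morgan negation of
        // (slot.data != key && slot.count != 0).
        if (slot.count == 0 || slot.data == key) {
            return static_cast<int>(index);
        }
        index = (index + 1) % size;  // wrap around at the end of the table
    }
    return -1;
}
```

Bounding the loop by the number of probes also avoids spinning forever when the table is full and the key is absent.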