I came across a programming question of which I knew only a part of the answer.
int f( char *p )
{
int n = 0 ;
while ( *p != 0 )
n = 10*n + *p++ - '0' ;
return n ;
}
This is what I think the program is doing.
p is a pointer, and the while loop dereferences it until the value it points to equals 0. However, I don't understand the line that assigns to n: what is '0' doing? I am assuming the value of p is initially negative, since that is the only way it would reach 0 after the increment.
You are confusing the number zero (none, nothing) with the character 0 (a circle, possibly with a slash through it). Notice that zero is in tick marks, so it's the character "0", not the number zero.
'0' - '0' = 0
'1' - '0' = 1
'2' - '0' = 2
...
So by subtracting the character zero from a digit, you get the number that corresponds to that digit.
So, say you have this sequence of digits: '4', '2', '1'. How do you get the number four hundred and twenty-one from that? You turn the '4' into four. Then you multiply by ten. Now you have forty. Convert the '2' into two and add it. Now you have forty-two. Multiply by ten. Convert the '1' into one and add it. Now you have four hundred and twenty-one.
That's how you convert a sequence of digits into a number.
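The multiply-and-add steps described above can be sketched in C++ roughly like this. The name digits_to_number is made up for illustration, and the sketch assumes the input holds only the characters '0' through '9' and that the result fits in an int:

```cpp
#include <string>

// Hypothetical helper: folds a run of digit characters into an int.
// Assumes every character is in '0'..'9' and the value fits in an int.
int digits_to_number(const std::string& digits)
{
    int n = 0;
    for (char c : digits)
    {
        n = n * 10;        // shift what we have one decimal place to the left
        n = n + (c - '0'); // convert the character to its value and append it
    }
    return n;
}
```

For "421" the running value of n goes 4, then 42, then 421.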
The n local variable accumulates the value of the decimal number that is passed to this function in the string. This is an implementation of atoi, without the validity checks.
Here is how the loop body works:
n = 10*n + *p++ - '0';
Assign to n the result of multiplying the prior value of n by ten plus the current character code at the pointer p less the code of zero; increment p after dereferencing.
Since digit characters are encoded sequentially, the *p-'0' expression represents a decimal value of a digit.
Let's say that you are parsing the string "987". As you go through the loop, n starts at zero; then it gets assigned the following values:
n = 10*0 + 9; // That's 9
n = 10*9 + 8; // That's 98
n = 10*98 + 7; // That's 987
It's poorly written, to say the least.
0) Use formatting!:
int f(char* p)
{
int n = 0;
while (*p != 0)
n = 10*n + *p++ - ‘0?;
return n;
}
1) ? there is syntactically invalid. It should probably be a ' as noted by chris (and your existing ‘ is wrong too, but that's probably because you copied it from a website and not a source file), giving:
int f(char* p)
{
int n = 0;
while (*p != 0)
n = 10 * n + *p++ - '0';
return n;
}
2) The parameter type isn't as constrained as it should be. Because *p is never modified (per our goals), we should enforce that to make sure we don't make any mistakes:
int f(const char* p)
{
int n = 0;
while (*p != 0)
n = 10 * n + *p++ - '0';
return n;
}
3) The original programmer was obviously allergic to readable code. Let's split up our operations:
int f(const char* p)
{
int n = 0;
for (; *p != 0; ++p)
{
const int digit = *p - '0';
n = 10 * n + digit;
}
return n;
}
4) Now that the operations are a bit more visible, we can see some independent functionality embedded in this function; this should be factored out (this is called refactoring) into a separate function.
Namely, we see the operation of converting a character to a digit:
int todigit(const char c)
{
// this works because the literals '0', '1', '2', etc. are
// all guaranteed to be in order. Ergo '0' - '0' will be 0,
// '1' - '0' will be 1, '2' - '0' will be 2, and so on.
return c - '0';
}
int f(const char* p)
{
int n = 0;
for (; *p != 0; ++p)
n = 10 * n + todigit(*p);
return n;
}
5) So now it's clear the function reads a string character by character and generates a number digit by digit. This functionality already exists under the name atoi, and this function is an unsafe implementation:
int todigit(const char c)
{
// this works because the literals '0', '1', '2', etc. are
// all guaranteed to be in order. Ergo '0' - '0' will be 0,
// '1' - '0' will be 1, '2' - '0' will be 2, and so on.
return c - '0';
}
int atoi_unsafe(const char* p)
{
int n = 0;
for (; *p != 0; ++p)
n = 10 * n + todigit(*p);
return n;
}
It's left as an exercise to the reader to check for overflow, invalid characters (those that aren't digits), and so on. But this should make it much clearer what's going on, and is how such a function should have been written in the first place.
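One possible way to add the checks mentioned above is sketched below. The name parse_unsigned_decimal and the bool-plus-output-parameter shape are my own assumptions, not part of the original code, and the sketch deliberately handles neither signs nor whitespace:

```cpp
#include <climits>

// Hypothetical checked variant of atoi_unsafe: returns false on a null or
// empty string, a non-digit character, or a value too large for an int.
// out is only written when the whole string was parsed successfully.
bool parse_unsigned_decimal(const char* p, int& out)
{
    if (p == nullptr || *p == '\0')
        return false;

    int n = 0;
    for (; *p != '\0'; ++p)
    {
        if (*p < '0' || *p > '9')
            return false; // invalid character

        const int digit = *p - '0';
        if (n > (INT_MAX - digit) / 10)
            return false; // 10 * n + digit would overflow

        n = 10 * n + digit;
    }
    out = n;
    return true;
}
```

The overflow test is done before the multiplication, because signed overflow in C++ is undefined behaviour and cannot be detected reliably after the fact.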
This is a string to number conversion function. Similar to atoi.
A string is a sequence of characters. So "123" in memory would be:
'1','2','3','\0'
p points to it.
Now, according to ASCII, digits are encoded from '0' to '9', with '0' being assigned the value 48 and '9' being assigned the value 57. As such, '1','2','3','\0' in memory is actually: 49, 50, 51, 0
If you wanted to convert from the character '0' to the integer 0, you would have to subtract 48 from the value in memory. Do you see where this is going?
Now, instead of subtracting the number 48, you subtract '0', which makes the code easier to read.
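To make the comparison concrete, here is a small sketch of both spellings. The function names are hypothetical; the second one bakes in the ASCII code and is shown only for contrast:

```cpp
// Hypothetical helper: numeric value of a digit character.
// Relies only on the digits being consecutive, not on '0' being 48.
int digit_value(char c)
{
    return c - '0';
}

// Same idea with the ASCII code spelled out; only correct when the
// execution character set is ASCII-compatible.
int digit_value_ascii_only(char c)
{
    return c - 48;
}
```

On an ASCII system both return 3 for '3'; the first form simply says what it means.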
I'm practicing a coding problem on "Check if the frequency of all the digits in a number is same"
#include<bits/stdc++.h>
using namespace std;
// returns true if the number
// passed as the argument
// is a balanced number.
bool isNumBalanced(int N)
{
string st = to_string(N);
bool isBalanced = true;
// frequency array to store
// the frequencies of all
// the digits of the number
int freq[10] = {0};
int i = 0;
int n = st.size();
for (i = 0; i < n; i++)
// store the frequency of
// the current digit
freq[st[i] - '0']++;
for (i = 0; i < 9; i++)
{
// if freq[i] is not
// equal to freq[i + 1] at
// any index 'i' then set
// isBalanced to false
if (freq[i] != freq[i + 1])
isBalanced = false;
}
// return true if
// the string is balanced
if (isBalanced)
return true;
else
return false;
}
// Driver code
int main()
{
int N = 1234567890;
bool flag = isNumBalanced(N);
if (flag)
cout << "YES";
else
cout << "NO";
}
but I can't understand this code:
// store the frequency of
// the current digit
freq[st[i] - '0']++;
How does this part actually work and store the frequency?
And what else could I write instead of this line?
st is a string and thus, a sequence of chars. st[i] is the ith char in this string.
A char is a small integer type (typically 8 bits, so it holds 256 distinct values), which means you can use chars in arithmetic such as subtraction. On most systems these integers are assigned to characters according to the ASCII table. For example, the char '0' is assigned 48 and the char '7' is assigned 55 (note: in the following, I write 'x' to denote the character x).
Their order makes arithmetic on them meaningful: the char '7' and the char '0' are exactly 7 apart, so '0' + 7 = 48 + 7 = 55 = '7'. So: '7' - '0' = 7.
So you get the position in the freq array that corresponds to the digit, i.e., position 0 for '0' or position 7 for '7'. The ++ operator then increments that value in place.
This line is several things condensed into one expression
freq[st[i] - '0']++;
The individual parts are rather simple, and taken together it isn't too difficult either:
st[i] - '0' - character digits do not map 1 to 1 to integers. There is an offset. The integer value of '1' is 1 + '0', '2' is 2 + '0'. Hence to get the integer from the digit you need to subtract '0'.
freq[ ... ] - accesses the element of the array. Element at index i stores frequency of digit i.
()++ - increments that frequency by one.
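As for what else you could write: the three parts listed above can be spelled out one statement at a time. This is only a sketch; the function name count_digits and the extra digit check are my additions and are not in the original code:

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: counts how often each decimal digit occurs in st.
// freq must point to 10 ints that are already zero. Characters that are
// not digits are skipped (the original one-liner does not check this).
void count_digits(const std::string& st, int freq[10])
{
    for (std::size_t i = 0; i < st.size(); ++i)
    {
        const char ch = st[i];         // st[i]: the current character
        if (ch < '0' || ch > '9')
            continue;                  // not a digit, nothing to count
        const int index = ch - '0';    // st[i] - '0': digit value, 0..9
        freq[index] = freq[index] + 1; // freq[...]++: bump that counter
    }
}
```

For the string "1221" this leaves freq[1] == 2 and freq[2] == 2, with every other entry still 0.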
Subtracting the '0' character from a single character of the string gives the actual number you're looking for, which is the digit whose frequency you are tracking in your code. This works because of the way characters are stored as ASCII values. Say that the integer value N that is passed in is 1221. The first character observed in this example is '1', which corresponds to an ASCII value of 49. The ASCII value of '0' is 48. Subtracting the two: 49 - 48 = 1. This lets you use each digit of the string (which was produced by converting the int value) as an index into the array.
The code of
for (i = 0; i < n; i++)
// store the frequency of
// the current digit
freq[st[i] - '0']++;
traverses the string and for each item, it subtracts '0', which has a value of 48, because character code 48 represents 0, character code 49 represents 1 and so on.
This code, however, does unnecessary work: it wastes memory storing a string and wastes time converting the number to a string. This is better:
bool isNumBalanced(int N)
{
//We create an array of 10 for each digit
int digits[10];
//Initialize the digits
for (int i = 0; i < 10; i++) digits[i] = 0;
//If the input is 0, then we have a trivial case
if (N == 0) return true;
//We loop the digits
do {
//N % 10 is the last digit
//We increment the frequency of that digit
digits[N % 10]++;
} while ((N /= 10) != 0); //We don't stop until we reach the trivial case, see above
//Using the transitivity of equality, we compare all values to the first
//We return false upon the first difference
for (int j = 1; j < 10; j++)
if (digits[0] != digits[j]) return false;
//Otherwise we return true
return true;
}
For those who don't understand it:
int arr[5]={0}; // it stores 0 in all places
for(int i=0;i<5;i++){
arr[i]++;
} // Now the array is [1 1 1 1 1]
What happens here: first i=0, so arr[0]++ runs. The value of arr[0] was 0, and ++ increments it from 0 to 1,
so arr[0] is now 1. The same happens for every other index.
Now let
st="1221";
for (i = 0; i < 4; i++) {
freq[st[i] - '0']++;
}
for i=0, the freq location is : freq[49-48]++ = freq[1]++ means value of freq[1] is 1
for i=1, the freq location is : freq[50-48]++ = freq[2]++ means value of freq[2] is 1
for i=2, the freq location is : freq[50-48]++ = freq[2]++ means value of freq[2] is 2
for i=3, the freq location is : freq[49-48]++ = freq[1]++ means value of freq[1] is 2
ASCII value of '0' is 48
ASCII value of '1' is 49
ASCII value of '2' is 50
I am confused with this line:
sum += a[s[i] - '0'];
To give some context, this is the rest of the code:
#include <iostream>
using namespace std;
int main() {
int a[5];
for (int i = 1; i <= 4; i++)
cin >> a[i];
string s;
cin >> s;
int sum = 0;
for (int i = 0; i < s.size(); i++)
sum += a[s[i] - '0'];
cout << sum << endl;
return 0;
}
- '0' (or the less portable - 48, which works for ASCII only) is used to manually convert digit characters to integers through their character codes. C++ (and C) guarantee that the digits '0' to '9' have consecutive codes in every encoding.
In EBCDIC, for example, the codes range from 240 for '0' to 249 for '9'; this works fine with - '0' but fails with - 48. For this reason alone it's best to always use the - '0' notation, as you do.
For an ASCII example, '1' has code 49 and '0' has code 48, so 49 - 48 = 1, or in the recommended form '1' - '0' = 1.
So, as you have probably figured out by now, you can convert all ten digits from characters with this simple arithmetic by subtracting '0', and in the other direction you can convert any digit to its character encoding by adding '0'.
Beyond that there are some other issues in the code:
The array is populated starting at index 1, not index 0, so if your string input is, for instance, "10", the sum will be a[1] + a[0]. But a[0] has no assigned value, so the behaviour is undefined; you need to watch out for these cases.
for (int i = 0; i < 5; i ++)
cin >> a[i];
would be more appropriate: indexes from 0 to 4, since the array has 5 elements. If you want the input digits to run from 1 to 5, you can subtract 1 from the index later on.
As pointed out in the comment section, bad input, with alphabetic characters for instance, will also invoke undefined behaviour.
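A defensive version of the summing loop could look roughly like the sketch below. The function name sum_by_digits and the choice to report failure through a bool are assumptions of mine; the point is only that every character is checked before it is used as an index:

```cpp
#include <array>
#include <cstddef>
#include <string>

// Hypothetical helper: sums a[d] for every digit character d in s.
// Returns false (leaving sum untouched) if s contains a non-digit or a
// digit that is not a valid index into a.
bool sum_by_digits(const std::array<int, 5>& a, const std::string& s, int& sum)
{
    int total = 0;
    for (char c : s)
    {
        if (c < '0' || c > '9')
            return false; // e.g. a letter: would index far outside the array

        const std::size_t index = static_cast<std::size_t>(c - '0');
        if (index >= a.size())
            return false; // digits 5..9 have no slot in a 5-element array

        total += a[index];
    }
    sum = total;
    return true;
}
```

With a filled as {10, 20, 30, 40, 50}, the input "10" gives a[1] + a[0] = 30, while "1x" and "7" are rejected instead of reading memory outside the array.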
From the C++ Standard (2.3 Character sets)
... In both the source and execution basic character sets, the value of
each character after 0 in the above list of decimal digits shall be
one greater than the value of the previous.
So if you have a character for example '4' then to get value 4 you can write '4' - '0'.
If you write, for example,
sum += a[s[i]];
where s[i] is the character '0', then in fact you will have either
sum += a[48];
if the ASCII coding is used or
sum += a[240];
if the EBCDIC coding is used.
The reverse operation, getting a character from a digit, can be written for example as
int digit = 4;
char c = digit + '0';
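Applied digit by digit, the same addition turns a whole number back into text. The function below is only an illustrative sketch with a made-up name (to_decimal); in real code std::to_string already does this:

```cpp
#include <string>

// Hypothetical helper: builds the decimal text of value by repeatedly
// taking the last digit and adding '0' to turn it into a character.
std::string to_decimal(unsigned int value)
{
    std::string out;
    do
    {
        const char c = static_cast<char>('0' + value % 10);
        out.insert(out.begin(), c); // digits come out last-first, so prepend
        value /= 10;
    } while (value != 0);
    return out;
}
```

The do-while form makes sure that 0 produces "0" rather than an empty string.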
Note that array indices in C++ start from 0.
Thus this loop
for (int i = 1; i <= 4; i ++)
cin >> a[i];
should be written like
for (int i = 0; i < 5; i ++)
cin >> a[i];
Also to avoid such an error you could use the range based for loop like
for ( auto &item : a ) std::cin >> item;
The code should count each character. If a character is a digit, the previous character should be counted that many times.
So if the input is 'a', it should count 'a' once and store that in acounter, which is then equal to 1.
But if 'a' is followed by 3, it means 'aaa', and it should count 'a' three times, so acounter is then equal to 3.
Note: the program is meant to handle every letter of the alphabet, but if this one isn't solved then what's the point of writing the rest?
I've tried adding another loop exclusively for digits, but it didn't work.
char secret_message[1000];
int counter,number_counter;
int acounter=0;
gets(secret_message);
for (counter = 0 ; secret_message[counter] != NULL ; counter++)
{
if (secret_message[counter]=='a')
acounter++;
if (secret_message[counter] >= '0' && secret_message[counter] <= '9')
{
for(number_counter=1;number_counter<=secret_message[counter];number_counter++)
{
if (secret_message[counter-1]=='a')
acounter++;
}
}
}
cout<<endl<<"acounter is:"<<acounter;
If the input is a3 the output should be 3, but it's 52!
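You'll want to convert the digit from text to a number, then use addition. Your loop bound compares against the character code of '3' (51 in ASCII), which is why you get 52. Since the 'a' itself has already been counted once, add one less than the digit's value:
if (counter > 0 && secret_message[counter - 1] == 'a' && isdigit(static_cast<unsigned char>(secret_message[counter])))
{
// Convert the digit character to the number it represents.
const int value = secret_message[counter] - '0';
// The preceding 'a' was already counted once, so add only the remaining repeats.
acounter += value - 1;
}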
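Put together as a self-contained function, the counting could be sketched like this. The name count_a is hypothetical, and I'm assuming a single digit after the letter gives the total number of repeats (so "a3" means "aaa"); multi-digit counts are not handled:

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: counts 'a' in message, where a digit directly
// after an 'a' gives the total number of repeats ("a3" means "aaa").
int count_a(const std::string& message)
{
    int count = 0;
    for (std::size_t i = 0; i < message.size(); ++i)
    {
        if (message[i] != 'a')
            continue;

        int repeats = 1; // a bare 'a' counts once
        if (i + 1 < message.size() &&
            std::isdigit(static_cast<unsigned char>(message[i + 1])))
        {
            repeats = message[i + 1] - '0'; // digit character -> number
        }
        count += repeats;
    }
    return count;
}
```

With this, "a3" yields 3 and a plain "a" yields 1. Using std::string instead of a fixed char buffer also avoids gets, which cannot limit how much input it writes.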
The following code basically takes a string and converts it into an integer value. What I don't understand is why the int digit = s[i] - '0'; is necessary. Why can't it work by using int digit = s[i] instead? What purpose does the -'0' have?
int main(){
string s = "123";
int answer = 0;
bool is_negative = false;
if (s[0] == '-')
{
is_negative = true;
}
int result = 0;
for (int i = s[0] == '-' ? 1 : 0; i < s.size(); ++i)
{
int digit = s[i] - '0';
result = result * 10 + digit;
}
answer = is_negative ? -result : result;
cout << answer << endl;
system("pause");
}
Firstly, in your question title
the use of '0'
should be written as
the use of s[i] - '0'
That said, by subtracting the ASCII value of char 0 (represented as '0'), we get the int value of the digit represented in char format (ASCII value, mostly.)
It's because the value inside s[i] is a char type.
To convert the char '1' into an integer, you do '1' - '0' to get 1. This is determined by position in the ASCII table, char '0' is 48, while char '1' is 49.
The reason for the subtraction is that s[i] is, for example, '6'. The value '6' evaluates to 54 (the ASCII code). The ASCII code of '0' is 48. So '6' - '0' = 6, which is the char value expressed as an int.
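To see what goes wrong without the subtraction, here is a sketch that runs the same loop both ways. The two function names are made up for illustration, and the second result quoted below assumes an ASCII system:

```cpp
#include <string>

// Hypothetical helper: the loop from the question, with the "- '0'".
int parse_with_offset(const std::string& s)
{
    int result = 0;
    for (char c : s)
        result = result * 10 + (c - '0'); // adds the digit's value
    return result;
}

// Hypothetical helper: the same loop with the subtraction left out.
int parse_without_offset(const std::string& s)
{
    int result = 0;
    for (char c : s)
        result = result * 10 + c; // adds the character code instead
    return result;
}
```

For "123" the first gives 123. The second adds 49, 50 and 51 instead of 1, 2 and 3, so on ASCII it gives (49 * 10 + 50) * 10 + 51 = 5451.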
I have a string that consists of exactly one number between 1 and 30 followed by one of the chars 'R', 'T' or 'M'. Let me illustrate with some examples.
string a="15T","1R","12M","24T","24M" ... // they are all valid for my string
Now I need a hash function that gives me a unique hash value for every input string. Since my inputs form a finite set, I think it is possible.
Can anyone tell me what kind of hash function I could define?
By the way, I'll create my hash table using a vector, so I guess size is not an important issue, but I'll define 10000 as an upper bound. I mean, I assume I cannot have more than 10000 such strings.
Thanks in advance.
Just have a large enough integer type and put the (maximal) three characters into the integer:
std::size_t hash(const char* s) {
std::size_t result = 0;
while(*s) {
result <<= 8;
result |= *s++;
}
return result;
}
You could define an algebraic function:
result = string[0] * 0x010000
+ string[1] * 0x000100
+ string[2];
Basically, each character fits into a uint8_t, which has 256 possible values. So each column is a power of 256.
Yes, there are big gaps, but this ensures a unique hash.
You could compress the gaps by using a different "power" for each character column.
Given "15T":
result = ((string[0] - '0') * 10 // 10 == number of choices in the 2nd column (digits 0-9)
+ (string[1] - '0')) * 3; // 3 == number of choices in the 3rd column ('T', 'M', 'R')
switch (string[2])
{
case 'T' : result += 0; break;
case 'M' : result += 1; break;
case 'R' : result += 2; break;
}
It's a number / counting system where each column has a different number of digits.
Something along the line of:
unsigned myhash(const char * str)
{
int n = 0;
// Parse the number part
for ( ; *str >= '0' && *str <= '9'; ++str)
n = n * 10 + (*str - '0');
int c = *str == 'R' ? 0 :
*str == 'T' ? 1 :
*str == 'M' ? 2 :
3;
// Check for invalid strings
if ( c == 3 || n <= 0 || n > 30 || *(++str) != 0 )
{
// Some error or anything
// (Or replace the if condition with an assert)
throw std::runtime_error("Invalid string");
}
// Since 0 <= c < 3 and 0 <= (n-1) < 30
// There are only 90 possible values
return c * 30 + (n-1);
}
In my experience whenever you have to deal with something like this it is often better to do the opposite, that is work with integers and have a function to perform the opposite conversion if necessary.
You can rebuild the original string with:
int n = hash % 30 + 1;
int c = hash / 30; // 0 is 'R', 1 is 'T', 2 is 'M'
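For completeness, here is a sketch of both directions as a matched pair, using the same numbering as above (0 is 'R', 1 is 'T', 2 is 'M'). The names encode and decode are my own, and the error handling is just one possible choice:

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: maps "1R".."30M" to a unique value in 0..89.
unsigned encode(const std::string& s)
{
    std::size_t i = 0;
    int n = 0;
    // Parse the number part; stop early once n can no longer be valid.
    while (i < s.size() && s[i] >= '0' && s[i] <= '9' && n <= 30)
    {
        n = n * 10 + (s[i] - '0');
        ++i;
    }
    // Exactly one character (the letter) must remain after the number.
    if (n < 1 || n > 30 || i + 1 != s.size())
        throw std::invalid_argument("Invalid string");

    const std::size_t c = std::string("RTM").find(s[i]);
    if (c == std::string::npos)
        throw std::invalid_argument("Invalid string");

    return static_cast<unsigned>(c * 30 + (n - 1));
}

// Hypothetical helper: rebuilds the string from a value in 0..89.
std::string decode(unsigned hash)
{
    if (hash >= 90)
        throw std::out_of_range("Invalid hash");
    const unsigned n = hash % 30 + 1;
    const unsigned c = hash / 30; // 0 is 'R', 1 is 'T', 2 is 'M'
    return std::to_string(n) + "RTM"[c];
}
```

For example, encode("15T") is 1 * 30 + 14 = 44 and decode(44) gives back "15T". Because the 90 values are dense, a plain vector of size 90 works as the table.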