What does that mean? C program for RLE - C++

I am new to C so I do not understand what is happening in this line:
out[counter++] = recurring_count + '0';
What does +'0' mean?
Additionally, can you please help me by writing comments for most of the code? I don't understand it well, so I hope you can help me. Thank you.
#include "stdafx.h"
#include "stdafx.h"
#include<iostream>
void encode(char mass[], char* out, int size)
{
int counter = 0;
int recurring_count = 0;
for (int i = 0; i < size - 1; i++)
{
if (mass[i] != mass[i + 1])
{
recurring_count++;
out[counter++] = mass[i];
out[counter++] = recurring_count + '0';
recurring_count = 0;
}
else
{
recurring_count++;
}
}
}
int main()
{
char data[] = "yyyyyyttttt";
int size = sizeof(data) / sizeof(data[0]);
char * out = new char[size + 1]();
encode(data, out, size);
std::cout << out;
delete[] out;
std::cin.get();
return 0;
}

It adds the character encoding value of '0' to the value in recurring_count. If we assume ASCII-encoded characters, that means adding 48.
This is common practice for making a "readable" digit from an integer value in the range 0..9 - in other words, converting a single-digit number into its character form. And since the digits '0' to '9' are guaranteed to be "in sequence" in every encoding, it works for any encoding, not just ASCII - a computer using EBCDIC encoding would get the same effect.

recurring_count + '0' is a simple way of converting the int recurring_count value into an ASCII character.
As you can see in an ASCII table (for example on Wikipedia), the ASCII character code of '0' is 48. Adding the value to that takes you to the corresponding character code for that digit.

You see, computers don't really know about letters, digits, or symbols, like the letter a, the digit 1, or the symbol ?. All they know is zeroes and ones. True or not. To exist or not.
Here's one bit: 1
Here's another one: 0
These two are the only things a bit can be: existence or absence.
Yet computers can know about, say, 5. How? Well, 5 is 5 only in base 10; in base 4 it would be 11, and in base 2 it would be 101. You don't have to know about base 4, but let's examine the base 2 case, to make sure you understand it:
How would you represent 0 if you had only 0s and 1s? 0, right? You probably would also represent the 1 as 1. Then for 2? Well, you'd write 2 if you could, but you can't... So you write 10 instead.
This is exactly analogous to what you do while advancing from 9 to 10 in base 10. You cannot write 10 inside a single digit, so you rather reset the last digit to zero, and increase the next digit by one. Same thing while advancing from 19 to 20, you attempt to increase 9 by one, but you can't, because there is no single digit representation of 10 in base 10, so you rather reset that digit, and increase the next digit.
This is how you represent numbers with just 0s and 1s.
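For instance, here is a small sketch (not from the original post) that prints the base-2 digits of a number, least significant digit first, just to make the idea concrete:
#include <iostream>

int main() {
    int n = 5;
    // Repeatedly take the remainder mod 2 (the current binary digit)
    // and divide by 2 to move on to the next digit.
    while (n > 0) {
        std::cout << n % 2;   // for n = 5 this prints 1, 0, 1 (the bits, least significant first)
        n /= 2;
    }
    std::cout << std::endl;
    return 0;
}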
Now that you have numbers, how would you represent letters and symbols and character-digits, like the 4 and 3 inside the silly string L4M3 for example? You could map them; map them so, for example, that the number 1 would from then on represent the character A, and then 2 would represent B.
Of course, it would be a little problematic, because then the number 1 would represent both the number 1 and the character A. This is exactly the reason why, if you write...
printf( "%d %c", 65, 65 );
You will have the output "65 A", provided that the environment you're on is using ASCII encoding, because in ASCII 65 has been mapped to represent A when interpreted as a character. A full list can be found in any ASCII table.
In short
'A' with single quotes around it delivers the message: "Hey, this A over here is to receive whatever the representative integer value of A is", and in most environments that will just be 65. Same for '0', which evaluates to 48 with ASCII encoding.
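As a quick illustration of the points above (a minimal sketch, assuming an ASCII environment):
#include <iostream>

int main() {
    int recurring_count = 6;
    char c = recurring_count + '0';   // 6 + 48 = 54, the character code of '6' in ASCII
    std::cout << c << std::endl;                            // prints 6
    std::cout << (int)'0' << " " << (int)'A' << std::endl;  // prints 48 65 on an ASCII system
    return 0;
}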

Slicing string character correctly in C++

I'd like to count the number of 1s in my input. For example, 111 (1+1+1) must return 3, and 101 must return 2 (1+1).
To achieve this, I developed sample code as follows.
#include <iostream>
#include <string>
using namespace std;

int main(){
    string S;
    cout<<"input number";
    cin>>S;
    cout<<"S[0]:"<<S[0]<<endl;
    cout<<"S[1]:"<<S[1]<<endl;
    cout<<"S[2]:"<<S[2]<<endl;
    int T = (int) (S[0]+S[1]+S[2]);
    cout<<"T:"<<T<<endl;
    return 0;
}
But when I execute this code and input 111, for example, my expected return is 3, but it returned 147.
[ec2-user@ip-10-0-1-187 atcoder]$ ./a.out
input number
111
S[0]:1
S[1]:1
S[2]:1
T:147
What is wrong here? I am a total novice, so if anyone has an opinion, please let me know. Thanks
It's because S[0] is a char. You are adding the character values of these digits rather than their numerical values. In ASCII, the numerical digits start at value 48. In other words, each of your 3 values is exactly 48 too big.
So instead of doing 1+1+1, you're doing 49+49+49.
The simplest way to convert from a character value to a digit is to subtract 48, which is the value of '0',
e.g., S[0] - '0'.
Since your goal is to count the occurrences of a character, it makes no sense to sum the characters together. I recommend this:
std::cout << std::ranges::count(S, '1');
To explain the output that you get, characters are integers whose values represent various symbols (and non-printable control characters). The value that represents the symbol '1' is not 1. '1'+'1'+'1' is not '3'.
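For reference, a minimal corrected version of the program (a sketch that, like the original, assumes the input is exactly three digit characters):
#include <iostream>
#include <string>

int main() {
    std::string S;
    std::cout << "input number";
    std::cin >> S;
    // Subtract '0' to turn each digit character into its numeric value before summing.
    int T = (S[0] - '0') + (S[1] - '0') + (S[2] - '0');
    std::cout << "T:" << T << std::endl;   // prints T:3 for the input 111
    return 0;
}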

What do 48 and 55 mean in hex? I found code that converts from decimal to hex, which I posted below

    // check if temp < 10
    if (temp < 10) {
        hexaDeciNum[i] = temp + 48;
        i++;
    }
    else {
        hexaDeciNum[i] = temp + 55;
        i++;
    }
    n = n / 16;
}
I found this code to convert from decimal to hex, but as you can see it uses + 48 and + 55. Does anyone know why these numbers are used? By the way, temp stores the remainder... thanks!
What the code is doing, badly, is converting a value in the range of 0 to 15 into a corresponding character for the hexadecimal representation of that value. The right way to do that is with a lookup table:
const char hex[] = "0123456789ABCDEF";
hexaDeciNum[i] = hex[temp];
One problem with the code as written is that it assumes, without saying so, that you want the output characters encoded in ASCII. That's almost always the case, but there is no need to make that assumption. The compiler knows what encoding the system that it is targeting uses, so the values in the array hex in my code will be correct for the target system, even if it doesn't use ASCII.
Another problem with the code as written is the magic numbers. They don't tell you what their purpose is. To get rid of the magic numbers, replace 48 with '0' and replace 55 with 'A' - 10. But see the previous paragraph.
In C and C++ you can convert a base-10 digit to its corresponding character by adding it to '0', so
hexaDeciNum[i] = digit + '0';
will work correctly. There is no such requirement for any other values, so the conversion to a letter is not guaranteed to work, even if you use 'A' - 10 instead of that hardcoded 55.
And don't get me started on pointless comments:
// check if temp < 10
if (temp < 10)
If you look at an ASCII table you will see that the characters for the digits 0..9 are shifted by 48. So, if you take a number, e.g. 0, and add 48 to it, you will get the character for that number, '0'.
The same goes for letters: if you take the number 10 and add 55 to it, you will get the character 'A' from the ASCII table.
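To make that concrete, here is a minimal sketch of the whole conversion using the lookup-table approach from the answer above (the variable names mirror the question; n is assumed to be a non-negative int):
#include <iostream>

int main() {
    const char hex[] = "0123456789ABCDEF";  // maps a value 0..15 to its hex digit
    int n = 2026;                            // example input
    char hexaDeciNum[16];
    int i = 0;
    // Peel off one hex digit (n % 16) per iteration, least significant digit first.
    do {
        hexaDeciNum[i++] = hex[n % 16];
        n = n / 16;
    } while (n > 0);
    // The digits were produced in reverse order, so print them back to front.
    while (i > 0)
        std::cout << hexaDeciNum[--i];
    std::cout << std::endl;                  // prints 7EA for 2026
    return 0;
}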

Why does char occupy 7 bits when its length is 1 byte, i.e. 8 bits?

I've seen that the program below takes only 7 bits of memory to store the character, but everywhere I've studied it says that a char occupies 1 byte of memory, i.e. 8 bits.
Does a single character require 8 bits or 7 bits?
If it requires 8 bits, what will be stored in the other bit?
#include <iostream>
using namespace std;

int main()
{
    char ch = 'a';
    int val = ch;
    while (val > 0)
    {
        (val % 2) ? cout << 1 << " " : cout << 0 << " ";
        val /= 2;
    }
    return 0;
}
Output:
1 0 0 0 0 1 1
The code below shows the memory gap between the characters, i.e. 7 bits:
9e9 <-> 9f0 <->......<-> a13
#include <iostream>
using namespace std;

int main()
{
    char arr[] = {'k','r','i','s','h','n','a'};
    for (int i = 0; i < 7; i++)
        cout << &arr + i << endl;
    return 0;
}
Output:
0x7fff999019e9
0x7fff999019f0
0x7fff999019f7
0x7fff999019fe
0x7fff99901a05
0x7fff99901a0c
0x7fff99901a13
Your first code sample doesn't print leading zero bits; since ASCII characters all have the upper bit set to zero, you'll only get at most seven bits printed when using ASCII characters. Extended ASCII characters or UTF-8 use the upper bit for characters outside the basic ASCII character set.
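A quick way to see all 8 bits, including the leading zero (a small sketch, not part of the original question):
#include <bitset>
#include <iostream>

int main() {
    char ch = 'a';
    // std::bitset prints a fixed number of bits, so the leading zero is shown too.
    std::cout << std::bitset<8>(ch) << std::endl;   // 01100001 for 'a' on an ASCII system
    return 0;
}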
Your second example actually prints addresses that are seven bytes apart, as if each character were seven bytes long, which is obviously incorrect. If you change the array so that it isn't seven characters long, you'll see different results.
&arr + i is equivalent to (&arr) + i: &arr is a pointer to char[7], which has a size of 7, so the + i adds 7 * i bytes to the pointer. (&arr) + 1 points just past the end of the array, so if you try printing the values these pointers point to you'll get junk or a crash: **(&arr + i).
Your code should be static_cast<void*>(&arr[i]); you'll then see the pointer going up by one for each iteration. The cast to void* is necessary to stop the standard library from trying to print the pointer as a null-terminated string.
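A corrected version of the second program along those lines (a sketch):
#include <iostream>
using namespace std;

int main()
{
    char arr[] = {'k','r','i','s','h','n','a'};
    for (int i = 0; i < 7; i++)
        // Cast to void* so operator<< prints the address rather than
        // treating the char* as a null-terminated string.
        cout << static_cast<void*>(&arr[i]) << endl;
    return 0;
}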
It has nothing to do with the space assigned to a char. You are simply converting the ASCII representation of the char into binary.
ASCII is a 7-bit character set, in C normally represented by an 8-bit char. If the highest bit in an 8-bit byte is set, it is not an ASCII character. The eighth bit was historically used as a parity bit when transmitting information between computers.
ASCII stands for American Standard Code for Information Interchange, with the emphasis on American. The character set could not represent things like accented Latin letters (characters with umlauts, for example) or Arabic.
Various "extended ASCII" sets used the extra 128 values that became available by using all 8 bits, which caused its own problems, because different systems disagreed on what those values meant. Eventually Unicode came along, which can represent every character, but 8 bits remained the standard size for char.

How can I make a number like 123 into a string with ASCII in C++?

I want to make a number like 77 into a string, but I can't just use the ASCII trick because it only covers 0-9. Is there some way to turn a number into a string? This is the result I want at the end: you input a number and it outputs the number, but as a string. Example: input: 123; output: "123".
Each digit can use ASCII. The number 123 uses a 1, a 2, and a 3, and the string that represents that value uses the characters '1', '2', and '3'.
The way to do the conversion yourself is to get each digit by itself and add the digit to '0'. Like this:
int value = 123;
std::string result;
while (value != 0) {
    int digit = value % 10;
    char digit_as_character = digit + '0';
    result.insert(0, 1, digit_as_character);
    value = value / 10;
}
This is pretty much what you'd do if you were doing it by hand:
- start with the value
- get the last digit of the value by dividing by 10 and looking at the remainder
- write down a digit for the remainder
- divide the value by 10 to remove the last digit, since you don't need it any more
Who said anything about ASCII? The C++ standard doesn't.
Use the platform-independent std::to_string(77) instead.
Reference: http://en.cppreference.com/w/cpp/string/basic_string/to_string
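For example (a minimal sketch):
#include <iostream>
#include <string>

int main() {
    int value = 123;
    std::string result = std::to_string(value);  // result is "123"
    std::cout << result << std::endl;
    return 0;
}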

Difference between converting int to char by (char) and by ASCII

I have an example:
int var = 5;
char ch = (char)var;
char ch2 = var+48;
cout << ch << endl;
cout << ch2 << endl;
I had some other code where (char) returned the wrong answer, but +48 didn't. When I changed ONLY (char) to +48, my code worked correctly.
What is the difference between converting int to char by using (char) and +48 (ASCII) in C++?
char ch = (char)var; has the same effect as char ch = var; and assigns the numeric value 5 to ch. You're using ASCII (supported by all modern systems), and ASCII character code 5 represents Enquiry (ENQ), an old terminal control code. Perhaps some old-timer has a clue what it did!
char ch2 = var + 48; assigns the numeric value 53 to ch2, which happens to represent the ASCII character for the digit '5'. ASCII 48 is zero ('0') and the digits all appear in order after that in the ASCII table, so 48 + 5 lands on 53 (which represents the character '5').
In C++ char is an integer type. The value is commonly interpreted as representing an ASCII character, but it should be thought of as holding a number.
Its numeric range is either [-128,127] or [0,255]. That's because C++ requires sizeof(char)==1 and all modern platforms have 8 bit bytes.
NB: C++ doesn't actually mandate ASCII, but again that will be the case on all modern platforms.
PS: I think it's an unfortunate artifact of C (inherited by C++) that sizeof(char)==1 and there isn't a separate fundamental type called byte.
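To see these properties on a particular platform (a small sketch; the printed range depends on whether char is signed there):
#include <iostream>
#include <limits>

int main() {
    std::cout << sizeof(char) << std::endl;            // always 1
    std::cout << (int)std::numeric_limits<char>::min() << " "
              << (int)std::numeric_limits<char>::max() << std::endl;  // -128 127 or 0 255
    return 0;
}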
A char is simply the smallest integral type in C++. Output facilities like cout and printf map char values to the corresponding entry in the character mapping; on Windows computers this is typically ASCII.
Note that value 5 in ASCII maps to the Enquiry control character, which has no printable form, while value 53 maps to the printable character '5'.
A generally accepted hack to store a number 0-9 in a char is to do const char ch = var + '0';. It's important to note the caveats here:
1. Even on a non-ASCII character mapping this works for digits, because the C and C++ standards require the characters '0' through '9' to be laid out contiguously and in ascending order (the same guarantee does not exist for letters).
2. If var is outside the 0-9 range, var + '0' will map to something other than a numeric character.
A way to get the most significant digit of a number that also sidesteps point 2 is to use:
const auto ch = std::to_string(var).front();
Generally char represents a number, just as int does. Casting an int value to char doesn't give you its ASCII representation.
The ASCII codes, as numbers, for the digits range from 48 (== '0') to 57 (== '9'). So to get the printable digit you have to add '0' (or 48).
The difference is that casting to char with (char) explicitly converts the value to a char, and adding 48 does not.
It's important to note that an int is typically 32 bits and a char is typically 8 bits. This means the numbers you can store in a char range from -128 to +127 (or 0 to 255, i.e. 2^8 - 1, for unsigned char), while an int ranges from -2,147,483,648 (-2^31) to 2,147,483,647 (2^31 - 1) (or 0 to 2^32 - 1 for unsigned).
Adding 48 to a value does not change its type to char.
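Putting the two conversions side by side (a minimal sketch, assuming an ASCII environment):
#include <iostream>

int main() {
    int var = 5;
    char ch  = (char)var;   // holds the value 5  -> the unprintable ENQ control character
    char ch2 = var + '0';   // holds the value 53 -> the printable digit '5'
    std::cout << (int)ch << " " << (int)ch2 << std::endl;  // prints 5 53
    std::cout << ch2 << std::endl;                         // prints 5
    return 0;
}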