Joining two characters in c++ - c++

My requirement is to join two characters. For example
int main()
{
char c1 ='0';
char c2 ='4';
char c3 = c1+c2;
cout<< c3;
}
The value I am expecting is 04, but what I am getting is d.
I know that a char is a single byte. Is it possible to merge/join/concatenate c1 and c2 within the single byte of c3 and store the value as 04?

A char is not a string. You can first convert the first char to a string and then append the next ones, like this:
#include <iostream>
#include <string>
int main()
{
char c1 ='0';
char c2 ='4';
auto c3 = std::string(1,c1)+c2; // c3 is a std::string holding "04"
std::cout<< c3;
}
What the "magic" std::string(1,c1) does:
It uses the std::string constructor of the form std::string::string(size_t n, char c);. It fills the string with n copies of c, here a single copy of c1, which is '0'.
If you add chars, you get the sum of their numeric values:
#include <iostream>
int main() {
char c1 = '0';
char c2 = '4';
std::cout << (int)c1 << std::endl; // 48 in ASCII
std::cout << (int)c2 << std::endl; // 52 in ASCII
std::cout << (int)c1+c2 << std::endl; // 100
std::cout << char(c1+c2) << std::endl; // d
}
The numeric value of '0' is 48 and of '4' is 52. Adding both gives 100, and 100 is 'd' in ASCII.

What you want is called a string of characters. There are many ways to create that in C++. One way you can do it is by using the std::string class from the Standard Library:
char c1 = '0';
char c2 = '4';
std::string s; // an empty string
s += c1; // append the first character
s += c2; // append the second character
cout << s; // print all the characters out

A char is no more than an integral type that's used by your C++ runtime to display things that humans can read.
So c1 + c2 is an instruction to add two numbers, and the result has type int due to the rules of type conversions. If that's too big to fit into a char, then the assignment to c3 would have implementation-defined results.
If you want concatenation, then
std::cout << ""s + c1 + c2;
is becoming, since C++14 introduced the std::string literal suffix s, the idiomatic way of doing this. Note the suffixed s; it requires a using namespace std::string_literals; (or using namespace std::literals;) directive.

Each char in C (and C++) has the length of one byte. What you are doing is adding the actual byte values:
'0' = 0x30
'4' = 0x34
-> '0' + '4' = 0x30 + 0x34 = 0x64 = 'd'
If you want to concatenate those two you will need an array:
int main()
{
char c1 ='0';
char c2 ='4';
char c3[3] = {c1,c2,0}; // 0 at the end to terminate the string
cout<< c3;
return 0;
}
Note that doing those things with chars is C but not C++. In C++ you would use a string, just as Klaus did in his answer.

Basics first
In C and C++, a C-style string ends with '\0', which is called the null character.
The presence of this null character at the end is what distinguishes a C-style string from a plain char array. Otherwise, both are contiguous memory locations.
string.length() does not count the null character at the end.
Joining/Concatenating two characters
If you join two strings using the + operator, the null character at the end is taken care of for you.
But if you add two chars with the + operator and expect a string, it will not happen: both chars are promoted to int and added as numbers.
See the example below:
char c1 = 'a';
char c2 = 'b';
// string str1 = c1 + c2; // error: c1 + c2 is an int, and an int cannot initialize a string
string str2; // default-constructed, empty string
str2 = str2 + c1 + c2;
cout << str2; // prints ab
string s1 = "Hello";
string s2 = "World";
string str3 = s1 + s2;
cout << str3; // prints HelloWorld (no space is inserted)
string str4 = s1 + c1;
cout << str4; // prints Helloa

What you named "joining" is actually arithmetics on char types, which implicitly promotes them to int. The equivalent integer value for a character is defined by the ASCII table ('0' is 48, '4' is 52, hence 48 + 52 = 100, which finally is 'd'). You want to use std::string when concatenating textual variables via the + operator.

You can do something like this:
#include <vector>
#include <iostream>
using namespace std;
int main()
{
char a = '0';
char b = '4';
vector<char> c;
c.push_back(a);
c.push_back(b);
c.push_back('\0'); // terminate, so that data() can be printed as a C string
cout << c.data() << endl;
return 0;
}

Simple addition is not suitable to "combine" two characters, you won't ever be able to determine from 22 if it has been composed of 10 and 12, 11 and 11, 7 and 15 or any other combination; additionally, if order is relevant, you won't ever know if it has been 10 and 12 or 12 and 10...
Now you can combine your values in an array or std::string (if representing arbitrary 8-bit-data, you might even prefer std::array or std::vector). Another variant is doing some nice math:
combined = x * SomeFactorLargerThanAnyPossibleChar + y;
Now you can get your two values back via division and modulo calculation.
Assuming you want to encode ASCII only, 128 is sufficient; considering maximum values, you get: 127*128+127 == 127*129 == 2^14 - 1. You might see the problem yourself, though: results might need up to 14 bits, so they won't fit into a simple char any more (at least on typical modern hardware, where char normally has only eight bits; there are systems around, though, with char having 16 bits...).
So you need a bigger data type for:
uint16_t combined = x * 128U + y;
//^^
Side note: if you use 256 instead, x and y will each be placed into a separate byte of the two bytes of the uint16_t. On the other hand, if you use uint64_t and 128 as factor, you can combine up to 9 ASCII characters (9 * 7 = 63 bits) in one single value. In any case, you should always use powers of two so your multiplications and divisions can be implemented as bit shifts and the modulo calculations as simple AND operations with an appropriate mask.
Care has to be taken if you use non-ASCII characters as well: char might be signed, so to prevent unwanted effects from sign extension when the values are promoted to int, you additionally need to apply a cast. Since such values can need all eight bits, the factor must then be 256:
uint16_t combined = static_cast<uint8_t>(x) * 256U + static_cast<uint8_t>(y);

Since char has no operator+ that concatenates it with another char,
what you need is a string, for example std::string as suggested by others.
Or, in C++17, you can use std::string_view:
char array[2] = {'0', '4'};
std::string_view array_v(array, std::size(array));
std::cout << array_v; // prints 04

https://github.com/bite-rrjo/STG-char-v2/blob/main/stg.h
without external libraries it would be like this
#include "iostream"
#include "lib-cc/stg.h"
using namespace std;
int main(){
// sum two char
char* c1 = "0";
char* c2 = "4";
stg _stg;
_stg = c1;
_stg += c2;
cout << _stg() << endl;
// print '04'
}

Related

why are we adding and subtracting '0' here? [duplicate]

While solving a LeetCode challenge, I stumbled upon various solutions that add and subtract '0' for a successful submission, and I have no idea why. Any help? I mostly do linked lists and arrays, so this problem is quite new to me.
class Solution {
public:
string addBinary(string a, string b) {
string result;
int i=a.size()-1;
int j=b.size()-1;
int carry = 0;
while(i>=0 || j>=0|| carry)
{
if(i>=0)
{ carry += a[i] - '0';//like here
i--;}
if(j>=0)
{ carry += a[j] - '0';//here
j--;
}
result += (carry%2 + '0'); //and here as well
carry = carry/2;
}
reverse(result.begin(),result.end());
return result;
}
};
In C++, '0' is a character literal.
The fundamental reason why/how a[i] - '0' works is through promotion. This is because:
arithmetic operators do not accept types smaller than int as arguments, and integral promotions are automatically applied after lvalue-to-rvalue conversion, if applicable
In particular when you wrote:
a[i]-'0'
This means
both a[i] and '0' will be promoted to an int. And so the result will be an int.
Let's look at an example for more clarification:
char c1 = 'E';
char c2 = 'F';
int result = c1 + c2;
Here both c1 and c2 will be promoted to int, and the result will be an int. In particular, in ASCII c1 will become (be promoted to) 69 and c2 will become (be promoted to) 70, so result will be 69 + 70, which is the integer value 139.
This also makes the program more portable than using magic numbers.
Also note that the C++ Standard (2.3 Character sets) guarantees that:
...In both the source and execution basic character sets, the value of
each character after 0 in the above list of decimal digits shall be
one greater than the value of the previous.
For example, say we have:
std::string str = "123";
int a = str[0] - '0'; //the result of str[0] - '0' is guaranteed to be the integer 1
int b = str[1] - '0'; //the result of str[1] - '0' is guaranteed to be the integer 2
int c = str[2] - '0'; //the result of str[2] - '0' is guaranteed to be the integer 3
Let's consider the statement int a = str[0] - '0';:
Here both str[0] and '0' will be promoted to int. The final result that is used to initialize variable a on the left-hand side will be the result of subtracting those two promoted int values on the right-hand side. Moreover, the result is guaranteed to be the integer 1.
Similarly, for the statement int b = str[1] - '0';:
Here both str[1] and '0' will be promoted to int. The final result that is used to initialize variable b on the left-hand side will be the result of subtracting those two promoted int values on the right-hand side. Moreover, the result is guaranteed to be the integer 2.
And similarly for variable c.
It is converting the string's characters to their numeric values. For instance, if you define "char c = '0';", you have stored the ASCII code (or the equivalent in another encoding) of the character '0' in the variable c, not the value zero itself. So to get the actual numeric value out of the ASCII code, you have to subtract the ASCII code of the character '0'.
Remember that strings are sequences of numeric values, i.e., integers representing characters, using the ASCII table or an equivalent, depending on the encoding being used (Unicode, etc.).
Please note that the characters' actual numeric values depend on the encoding being used. The best-known and most basic encoding is probably ASCII, which does not support languages other than English. For international purposes, the most common nowadays is Unicode in its UTF-8 form, which uses the same codes as ASCII for plain English characters; other, more complicated encodings exist.
And, as correctly noted by Anoop Rana in the other answer to this question, the C++ standard guarantees that the numeric values of the digits are consecutive in any encoding. So, by using this technique, you avoid magic numbers that would make your code work only with specific encodings.
Here is some didactic code that may help:
#include <iostream>
#include <string>
using namespace std;
int main()
{
const string str = "5";
cout << "char: " << str << endl;
cout << "ASCII char's value: " << (int)str[0] << endl;
cout << "ASCII char's value of the character '0' is: " << ((int)'0') << endl;
cout << "Value represented by the ASCII code " << str[0] <<
" is the binary value " << (int)str[0] << "-" << ((int)'0') <<
"=" << (str[0] - '0') << endl;
return 0;
}
The output is:
char: 5
ASCII char's value: 53
ASCII char's value of the character '0' is: 48
Value represented by the ASCII code 5 is the binary value 53-48=5
The ASCII table code can be consulted here:
https://www.asciitable.com/
By the way, I think there may be a bug in your code, since parameter b is not being used. Maybe the second 'if' should be taking the value of b instead of a?

Looping through string of integers gives me completely different numbers?

I'm a beginner to C++ so forgive me if I'm making a stupid mistake here.
I want to loop through a string of integers in the following code:
#include <iostream>
#include <string>
using namespace std;
int main() {
string str = "12345";
for (int i : str) {
cout << i << endl;
}
return 0;
}
But I receive the output:
49
50
51
52
53
I know that I get normal output if I use char instead of int, but why do I receive an output of integers 48 more than they should be?
When you loop through a string you get elements of type char. If you convert a char to an int you get the ASCII value of the char, which is what happens when you do:
string str = "12345";
for (int i : str) { // each char is implicitly converted to int
cout << i << endl; // prints the ascii value
}
The ASCII value of '0' is 48, and '1' is 49, etc, which explains the output you get.
Just as @cigien said, you only need to change it from int to char, i.e.
string str = "12345";
for (char i : str) {
cout << i << endl;
}
Or, as a one-size-fits-all solution, use the auto keyword:
string str = "12345";
for (auto i : str) {
cout << i << endl;
}
The first thing you need to know is that a string is an array/sequence of chars.
You can think of a char as a single character.
But the way it is encoded is as a number.
For example, the char 'a' is encoded (in ASCII) as the number 97.
Now your for loop says int i: str.
You're telling it to look for integers in the string.
But a string is an array/sequence of chars, not of integers.
So the loop takes each char,
and instead of looking at what the character itself is,
it gives you the integer encoding value of the char,
the ASCII value.
Now the numbers are encoded with the char '0' having the lowest encoding value,
'1' having the next value,
'2', having the next,
and so on through digit '9'.
I can never remember what the actual ASCII value for '0' is . . . .
But because the digit chars are encoded consecutively in this way,
you can convert any digit char to its int value by subtracting the underlying integer encoding value of '0'.
#include <iostream>
#include <string>
using namespace std;
int main() {
string str = "12345";
for (char c: str) {
cout << (c - '0') << endl; // gives you the actual int value, but only works if the char is actually a digit
}
return 0;
}
for (int i : str) { is in fact syntactic sugar for
for (auto iterator = str.begin(); iterator != str.end(); iterator++) {
int i = (int) *iterator;
But the * operator of string::iterator is in fact an overload that returns the current char, which is then converted to an int. What you see is that number: the integer value of the byte. It is not necessarily ASCII; it could be another encoding, such as an ANSI code page.

Character Array, \0

I know that the \0 at the end of a character array is a must if you use the array with functions that expect a \0, like cout; otherwise unexpected random characters appear.
My question is: if I use the character array only in my own functions, reading it char by char, do I need to store the \0 at the end?
Also, is it a good idea to fill in only some characters and leave holes in the array?
Consider the following:
char chars[5];
chars[1] = 15;
chars[2] = 17;
chars[3] = 'c';
//code using the chars[1] and chars[3], but never using the chars
int y = chars[1]+chars[3];
cout << chars[3] << " is " << y;
Does the code above risk unexpected errors?
EDIT: edited the example.
The convention of storing a trailing char(0) at the end of an array of chars has a name, it's called a 'C string'. It has nothing to do, specifically, with char - if you are using wide character, a wide C string would be terminated with a wchar_t(0).
So it's absolutely fine to use char arrays without trailing zeroes if what you are using is just an array of chars and not a C string.
char dirs[4] = { 'n', 's', 'e', 'w' };
for (size_t i = 0; i < 4; ++i) {
fprintf(stderr, "dir %zu = %c\n", i, dirs[i]); // %zu is the format for size_t
std::cout << "dir " << i << " = " << dirs[i] << '\n';
}
Note that '\0' is char(0), that is it has a numeric, integer value of 0.
char x[] = { 'a', 'b', 'c', '\0' };
produces the same array as
char x[] = { 'a', 'b', 'c', 0 };
Your second question is unclear, though
//code using the chars[1] and chars[3], but never using the chars
int y = chars[1]+chars[3];
cout << chars[3] << " is " << y;
Leaving gaps is fine, as long as you're sure your code is aware that they are uninitialized. If it is not, then consider the following:
char chars[4]; // I don't initialize this.
chars[1] = '1';
chars[3] = '5';
int y = chars[1] + chars[3];
std::cout << "y = " << y << '\n';
// prints y = 102, because y is an int and '1' is 49 and '5' is 53
// later
for (size_t i = 0; i < sizeof(chars); ++i) {
std::cout << "chars[" << i << "] = " << chars[i] << '\n';
}
Remember:
char one = 1;
char asciiCharOne = '1';
are not the same. one has an integer value of 1, while asciiCharOne has an integer value of 49.
Lastly: if you are really looking to store integer numeric values rather than their character representations, you may want to look at the C++11 fixed-width integer types in <cstdint>: uint8_t for an 8-bit unsigned value, int8_t for an 8-bit signed value.
Running off the end of a character array because it has no terminating \0 means accessing memory that does not belong to the array. That produces undefined behavior. Often that looks like random characters, but that's a rather benign symptom; some are worse.
As for not including it because you don't need it, sure. There's nothing magic that says that an array of char has to have a terminating \0.
To me it looks like you use the array not for strings, but as an array of numbers, so yes it is ok not to use '\0' in the array.
Since you are using it to store numbers, consider using the uint8_t or int8_t types from stdint.h, which are typically typedefs for unsigned char and signed char; this makes it clearer that the array is used as an array of numbers and not as a string.
cout << chars[3] << " is " << y; is not undefined behaviour because you access the element at position 3 from the array, that element is inside the array and is a char, so everything is fine.
EDIT:
Also, I know it is not in your question, but since we are here: using char instead of int for numbers can be deceiving. On most architectures it does not increase performance and may actually slow things down. This is mainly because of the way memory is addressed and because the processor works with 4-byte / 8-byte operands anyway. The only gain is storage size, so use it for storing data on disk; unless you are working with really huge arrays or with limited RAM, use int in RAM as well.

Take two hex characters from file and store as a char with associated hex value

I'd like to take the next two hex characters from a stream and store them as the associated hex->decimal numeric value in a char.
So if an input file contains 2a3123, I'd like to grab 2a, and store the numeric value (decimal 42) in a char.
I've tried
char c;
instream >> std::setw(2) >> std::hex >> c;
but this gives me garbage (if I replace c with an int, I get the maximum value for signed int).
Any help would be greatly appreciated! Thanks!
edit: I should note that the characters are guaranteed to be within the proper range for chars and that the file is valid hexadecimal.
OK, I think dealing with ASCII decoding by hand is a bad idea and does not really answer the question.
I think your code does not work because setw() or istream::width() only has an effect when you read into a std::string or char*. That is my guess, at least.
However, you can use the goodness of the standard C++ iostream converters. I came up with an idea that uses the stringstream class and a string as a buffer. The idea is to read n chars into the buffer and then use the stringstream as a conversion facility.
I am not sure if this is the optimal version. Probably not.
Code:
#include <iostream>
#include <sstream>
#include <string>
int main(void){
int c;
std::string buff;
std::stringstream ss_buff;
std::cin.width(2);
std::cin >> buff;
ss_buff << buff;
ss_buff >> std::hex >> c;
std::cout << "read val: " << c << '\n';
}
Result:
luk32@genaker:~/projects/tmp$ ./a.out
0a10
read val: 10
luk32@genaker:~/projects/tmp$ ./a.out
10a2
read val: 16
luk32@genaker:~/projects/tmp$ ./a.out
bv00
read val: 11
luk32@genaker:~/projects/tmp$ ./a.out
bc01
read val: 188
luk32@genaker:~/projects/tmp$ ./a.out
01bc
read val: 1
And as you can see, it is not very error resistant. Nonetheless, it works for the given conditions, can be expanded into a loop and, most importantly, uses the iostream conversion facilities, so no ASCII magic is needed on your side. C/ASCII would probably be way faster though.
PS. Improved version. It uses a simple char buffer and unformatted reads/writes to move data through the buffer (get/write as opposed to operator<</operator>>). The rationale is pretty simple: we do not need anything fancy to move 2 bytes of data. We do, however, use the formatted extractor to make the conversion. I made it a loop version for convenience. It was not super simple though; it took me a good 40 minutes of fooling around to figure out the very important lines. Without them, the extraction works only for the first 2 characters.
#include <iostream>
#include <sstream>
int main(void){
int c;
char* buff = new char[3];
std::stringstream ss_buff;
std::cout << "read vals: ";
std::string tmp;
while( std::cin.get(buff, 3).gcount() == 2 ){
std::cout << '(' << buff << ") ";
ss_buff.seekp(0); //VERY important lines
ss_buff.seekg(0); //VERY important lines
ss_buff.write(buff, 2);
if( ss_buff.fail() ){ std::cout << "error\n"; break;}
std::cout << ss_buff.str() << ' ';
ss_buff >> std::hex >> c;
std::cout << c << '\n';
}
std::cout << '\n';
delete [] buff;
}
Sample output:
luk32@genaker:~/projects/tmp$ ./a.out
read vals: 0aabffc
(0a) 0a 10
(ab) ab 171
(ff) ff 255
Please note, the c was not read as intended.
I found everything needed here http://www.cplusplus.com/reference/iostream/
You can cast a char to an int, and the int will hold the ASCII value of the char. For example, '0' will be 48 and '5' will be 53. The letters occur higher up, so 'a' will be cast to 97, 'b' to 98, etc. Knowing this, you can take the int value and subtract 48; if the result is greater than 9, subtract another 39. Then char 0 will have been turned into int 0, char 1 into int 1, all the way up to char a becoming int 10, char b becoming int 11, etc.
Next you need to multiply the value of the first digit by 16 (a shift by four bits) and add it to the second. Using your example of 2a:
char 2 casts to int 50. Subtract 48 and get 2. Multiply by 16 and get 32.
char a casts to int 97. Subtract 48 and get 49, this is higher than 9 so subtract another 39 and get 10. Add this to the end result of the last one (32) and you get 42.
Here is the code:
int Convert(int in); // forward declaration, defined below
int HexToInt(char hi, char low)
{
int retVal = 0;
int hiBits = (int)hi;
int loBits = (int)low;
retVal = Convert(hiBits) * 16 + Convert(loBits);
return retVal;
}
int Convert(int in)
{
int retVal = in - 48; // '0' is 48 in ASCII
// if it was not a decimal digit, skip the gap between '9' and 'A'
if(retVal > 9)
retVal = retVal - 7;
// if it was not an upper case hex digit, skip the gap between 'A' and 'a'
if(retVal > 15)
retVal = retVal - 32;
return retVal;
}
The first function can actually be written as one line thus:
int HexToInt(char hi, char low)
{
return Convert((int)hi) * 16 + Convert((int)low);
}
NOTE: The walkthrough above covers lower case letters only; the Convert function handles upper case letters as well. Both only work on systems that use ASCII, i.e. not on IBM EBCDIC-based systems.

C++ char assignment

The task is to assign a numerical value to every character in the English alphabet and then, when a word is entered into the program, calculate the word's value. Does anybody know how this can be done?
If you don't care about the specific mapping from characters to integers, you can simply assign to an int:
char c = 'A';
int i = c;
On many architectures, this will map A to 65, B to 66 and so on.
To map an entire word to an integer, simply loop over the entire word and add the integers up. Your course should already have covered how to write a loop that inspects each character of a string.
So here is just some pseudo-code to give you the general idea of what I'm talking about:
int sum = 0
for each c in word
sum += c
You do know that you can add ints to C++ chars, don't you?
char a = 'A';
char b = a + 1;
int b_int = b;
cout << b << " " << b_int;
// prints
// B 66
// on an ASCII system
Chars in C are just (byte-sized) integers, under the hood