What is the purpose of Signed Char - c++

What is the purpose of signed char if both char and signed char range from -127 to 127?
In what situations would we use signed char instead of just char?

unsigned char is unsigned.
signed char is signed.
char may be unsigned or signed depending on your platform.
Use signed char when you definitely want signedness.
Possibly related: What does it mean for a char to be signed?

It is implementation defined whether plain char uses the same
representation as signed char or unsigned char. signed char was
introduced because plain char was underspecified. There's also the
message you send to your readers:
plain char: character data
signed char: small integers
unsigned char: raw memory
(unsigned char may also be used if you're doing a lot of bitwise
operations. In practice, that tends to overlap with the raw memory
use.)
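For illustration, a tiny sketch of the three roles (the variable names are just made up for this example):

#include <cstdio>

int main() {
    char text[] = "abc";                       // plain char: character data
    signed char delta = -12;                   // signed char: a small integer
    unsigned char raw[] = {0xDE, 0xAD, 0xBE};  // unsigned char: raw memory/bytes

    std::printf("%s, delta=%d, first byte=0x%02X\n",
                text, delta, static_cast<unsigned>(raw[0]));
}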

See, lamia,
First I want to give some background for your question.
The char data type comes in two signed variants (both are integral data types):
signed char;
unsigned char;
As explained in most books:
char           1 byte   -128 to 127 (plain char is signed by default on most platforms)
signed char    1 byte   -128 to 127
unsigned char  1 byte   0 to 255
One more thing: 1 byte = 8 bits (bit 0 to bit 7).
In a signed type, the most significant (7th) bit represents the sign: 1 means negative, 0 means non-negative. For example, in two's complement:
-37 is represented as 1101 1011 (the most significant bit is 1),
+37 is represented as 0010 0101 (the most significant bit is 0).
Similarly, for a signed char the top bit is treated as the sign bit.
Why does this rarely matter for characters?
Because char typically holds ASCII codes of particular characters (e.g. 'A' = 65), and ASCII needs only 7 bits, so the sign bit is never set for plain character data.
If you want the full 8 bits available for the magnitude, use unsigned char (or, for int, unsigned int).
Thanks for the question.
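If you want to see the sign bit in action, here's a small sketch using std::bitset to dump the 8-bit patterns (assuming the usual two's complement representation):

#include <bitset>
#include <iostream>

int main() {
    signed char pos = 37;
    signed char neg = -37;
    // The cast to unsigned char exposes the raw 8-bit pattern.
    std::cout << std::bitset<8>(static_cast<unsigned char>(pos)) << '\n';  // 00100101
    std::cout << std::bitset<8>(static_cast<unsigned char>(neg)) << '\n';  // 11011011
}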

Note that on many systems, plain char has the same representation as signed char.
As for your question: you would use signed char when you need a small signed number and want the signedness guaranteed.

Related

If char can store a number in C++, why do we need int?

The char data type can store numbers, characters, and symbols, so what is the need for the int data type?
char c = '2';
I know how int is used, but I want to understand the concept behind it fundamentally.
Usually, int can hold larger numbers than char. On current, widely available architectures, int is 32-bit, while char is 8-bit. Furthermore, it is implementation defined whether a char is signed or unsigned.
On these architectures int can hold numbers between -2147483648 and 2147483647, while a (signed) char can hold numbers between -128 and 127.
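You can check the ranges on your own platform with <limits>; a minimal sketch (the unary + promotes char to int so it prints as a number rather than a character):

#include <iostream>
#include <limits>

int main() {
    std::cout << "char: " << +std::numeric_limits<char>::min()
              << " to " << +std::numeric_limits<char>::max() << '\n';
    std::cout << "int:  " << std::numeric_limits<int>::min()
              << " to " << std::numeric_limits<int>::max() << '\n';
}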

Unsigned Char Values in String

char takes values in the range of -128 to 127. By simply putting unsigned before char the range changes to 0-255.
How to achieve the same effect in a string? So that all chars in that string take values from 0-255?
char takes values in the range of -128 to 127.
No.
char is implementation-defined; it could behave as either signed char or unsigned char depending on what your compiler chose. And char doesn't necessarily mean an 8-bit byte, BTW (there are platforms where a char is 16 bits, for example).
If you want to ensure that a char is indeed an unsigned char then just cast it: static_cast<unsigned char>(some_char_value)
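For example, here's a sketch of reading every char of a std::string as a 0-255 value (the byte value 233 is just an arbitrary example):

#include <iostream>
#include <string>

int main() {
    std::string s = "abc";
    s.push_back(static_cast<char>(233));  // a byte above 127

    for (char ch : s) {
        // Without the inner cast, 233 may print as -23 where char is signed;
        // casting to unsigned char forces the 0-255 view.
        std::cout << static_cast<int>(static_cast<unsigned char>(ch)) << ' ';
    }
    std::cout << '\n';  // 97 98 99 233
}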

C++ char definition from binary string and overflow

I have a datatype that's more or less a character array. Each space in the array holds a char, which, as per my understanding, is a single byte (8 bits) of information. I need to be able to specify the char value through a binary string... for instance
char someChar = char(0b00110011);
What I don't understand is why the max value I can specify is 0b0XXXXXXX, where I have to leave that MSB set to zero. If I try setting the char like so
char someChar = char(0b11111111);
I get a decimal value: -2147483648, which looks very much like overflow. So I don't really get what's going on here. If I call the sizeof() operator on char, I get an answer of 1 (one byte). Doesn't that mean that I either get 0-255 if the char is unsigned, or -128-127 if the char is signed? Any advice/input would be appreciated.
In response to most of the comments -- I converted it to an int before printing it out:
std::cerr << int(someChar)
Thanks to all for the thorough explanations :)
char is signed in this case, so setting the top bit will give a negative value. Use unsigned char if you don't want to worry about positive/negative values.
As for the negative integer value - please show how you're converting/displaying the char.
NB. You can use signed char or unsigned char to tell the compiler explicitly what you want.
-2147483648 is INT_MIN; in binary it is 1000 0000 followed by three all-zero bytes, so it cannot come out of a plain conversion of char(0b11111111).
When you declare your char in binary, the compiler interprets it as a signed char, which is the case for most compilers, and the leftmost bit is the sign bit: 0b11111111 is the two's complement pattern for -1.
Upon conversion to int the value is preserved by sign extension: the sign bit is copied into all the upper bits of the 32-bit block, so int(someChar) prints -1.
You have two main problems here:
First, it seems that you expect someChar to be unsigned. If that's the case, you should tell your compiler: unsigned char someChar = 0b11111111;
Second, the way you print it to the console (which you haven't shown us) apparently involves a conversion to int. If that's not needed, there is likely a way to print someChar for what it really is, i.e. a signed char.
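To make the difference concrete, a sketch of what the conversion to int does on a typical platform where char is signed (sign extension, not truncation):

#include <iostream>

int main() {
    char sc = static_cast<char>(0b11111111);  // -1 where char is signed
    unsigned char uc = 0b11111111;            // always 255

    std::cout << static_cast<int>(sc) << '\n';  // -1 (sign bit copied into the upper bits)
    std::cout << static_cast<int>(uc) << '\n';  // 255 (zero-extended)
}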

Conversion from unsigned to signed type safety?

Is it safe to convert, say, from an unsigned char * to a signed char * (or just a char *)?
The access is well-defined, you are allowed to access an object through a pointer to signed or unsigned type corresponding to the dynamic type of the object (3.10/15).
Additionally, signed char is guaranteed not to have any trap values and as such you can safely read through the signed char pointer no matter what the value of the original unsigned char object was.
You can, of course, expect that the values you read through one pointer will be different from the values you read through the other one.
Edit: regarding sellibitze's comment, this is what 3.9.1/1 says.
A char, a signed char, and an unsigned char occupy the same amount of storage and have the same alignment requirements (3.9); that is, they have the same object representation. For character types, all bits of the object representation participate in the value representation. For unsigned character types, all possible bit patterns of the value representation represent numbers.
So indeed it seems that signed char may have trap values. Nice catch!
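A minimal sketch of such a read-through (the printed value assumes two's complement, which is what you'll see in practice):

#include <iostream>

int main() {
    unsigned char u = 0xC0;  // 192
    // Accessing the object through a pointer to the corresponding
    // signed character type is permitted by the aliasing rules.
    signed char* p = reinterpret_cast<signed char*>(&u);
    std::cout << static_cast<int>(*p) << '\n';  // typically -64
}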
The conversion should be safe, as all you're doing is converting from one type of character to another, which should have the same size. Just be aware of what sort of data your code is expecting when you dereference the pointer, as the numeric ranges of the two data types are different. (i.e. if your number pointed by the pointer was originally positive as unsigned, it might become a negative number once the pointer is converted to a signed char* and you dereference it.)
Casting changes the type, but does not affect the bit representation. Casting from unsigned char to signed char does not change the value at all, but it affects the meaning of the value.
Here is an example:
#include <stdio.h>

int main(void) {
    /* example 1: 192 does not fit in signed char */
    unsigned char a_unsigned_char = 192;
    signed char a_signed_char = (signed char)a_unsigned_char;
    printf("%d, %d\n", a_unsigned_char, a_signed_char); /* 192, -64 */

    /* example 2: 32 fits in both types */
    unsigned char b_unsigned_char = 32;
    signed char b_signed_char = (signed char)b_unsigned_char;
    printf("%d, %d\n", b_unsigned_char, b_signed_char); /* 32, 32 */

    return 0;
}
In the first example, you have an unsigned char with value 192, or 1100 0000 in binary. After the cast to signed char, the bit pattern is still 1100 0000, but that happens to be the two's complement representation of -64. Signed values are stored in two's complement representation on virtually all modern machines.
In the second example, our initial unsigned value (32) is less than 128, so it is unaffected by the cast. The binary representation is 0010 0000, which is still 32 in two's complement.
To "safely" cast from unsigned char to signed char, ensure the value is less than 128.
It depends on how you are going to use the pointer. You are just converting the pointer type.
You can safely convert an unsigned char* to a char * as the function you are calling will be expecting the behavior from a char pointer, but, if your char value goes over 127 then you will get a result that will not be what you expected, so just make certain that what you have in your unsigned array is valid for a signed array.
I've seen it go wrong in a few ways, converting to a signed char from an unsigned char.
One, if you're using it as an index to an array, that index could go negative.
Second, if it's fed to a switch statement, it may produce a negative value that the switch isn't expecting.
Third, it has different behavior on an arithmetic right shift
int x = 1;
signed char c = -128;    /* bit pattern 1000 0000 */
unsigned char u = 128;   /* same bit pattern */
c >> x;
has a different result than
u >> x;
because the former is sign-extended (typically giving -64; strictly, right-shifting a negative value is implementation-defined) and the latter is a plain logical shift (giving 64).
Fourth, a signed char wraps around (overflows) at a different point than an unsigned char.
So a common overflow check,
(c + x > c)
could return a different result than
(u + x > u)
Safe if you are dealing only with ASCII data, since the values 0-127 are represented identically in both types.
I'm astonished it hasn't been mentioned yet: Boost numeric_cast should do the trick - but only for the data, of course.
Pointers are always pointers. By casting them to a different type, you only change the way the compiler interprets the data pointed to.
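In the spirit of numeric_cast, a hand-rolled checked conversion for the data might look like this (a sketch, not Boost's actual implementation; to_signed_checked is a made-up name):

#include <iostream>
#include <limits>
#include <stdexcept>

// Convert unsigned char to signed char, throwing instead of
// silently reinterpreting values above 127.
signed char to_signed_checked(unsigned char u) {
    if (u > std::numeric_limits<signed char>::max())
        throw std::range_error("value does not fit in signed char");
    return static_cast<signed char>(u);
}

int main() {
    std::cout << static_cast<int>(to_signed_checked(100)) << '\n';  // 100
    try {
        to_signed_checked(200);  // throws: 200 > 127
    } catch (const std::range_error& e) {
        std::cout << e.what() << '\n';
    }
}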

What is an unsigned char?

In C/C++, what is an unsigned char used for? How is it different from a regular char?
In C++, there are three distinct character types:
char
signed char
unsigned char
If you are using character types for text, use the unqualified char:
it is the type of character literals like 'a' or '0' (in C++ only, in C their type is int)
it is the type that makes up C strings like "abcde"
It also works out as a number value, but it is unspecified whether that value is treated as signed or unsigned. Beware character comparisons through inequalities - although if you limit yourself to ASCII (0-127) you're just about safe.
If you are using character types as numbers, use:
signed char, which gives you at least the -127 to 127 range. (-128 to 127 is common)
unsigned char, which gives you at least the 0 to 255 range.
"At least", because the C++ standard only gives the minimum range of values that each numeric type is required to cover. sizeof (char) is required to be 1 (i.e. one byte), but a byte could in theory be for example 32 bits. sizeof would still be report its size as 1 - meaning that you could have sizeof (char) == sizeof (long) == 1.
This is implementation dependent, as the C standard does NOT define the signed-ness of char. Depending on the platform, char may be signed or unsigned, so you need to explicitly ask for signed char or unsigned char if your implementation depends on it. Just use char if you intend to represent characters from strings, as this will match what your platform puts in the string.
The difference between signed char and unsigned char is as you'd expect. On most platforms, signed char will be an 8-bit two's complement number ranging from -128 to 127, and unsigned char will be an 8-bit unsigned integer (0 to 255). Note the standard does NOT require that char types have 8 bits, only that sizeof(char) return 1. You can get at the number of bits in a char with CHAR_BIT in limits.h. There are few if any platforms today where this will be something other than 8, though.
As others have mentioned since I posted this, you're better off using int8_t and uint8_t if you really want to represent small integers.
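A sketch combining CHAR_BIT with the fixed-width types (note that int8_t/uint8_t are optional and may be absent on exotic platforms):

#include <climits>
#include <cstdint>
#include <iostream>

int main() {
    std::cout << "bits per char: " << CHAR_BIT << '\n';  // 8 on mainstream platforms

    std::int8_t  i = -5;   // exactly 8 bits, signed
    std::uint8_t u = 200;  // exactly 8 bits, unsigned
    std::cout << +i << ' ' << +u << '\n';  // -5 200 (unary + forces numeric printing)
}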
Because I feel it's really called for, I just want to state some rules of C and C++ (they are the same in this regard). First, all bits of an unsigned char participate in determining the value of any unsigned char object. Second, unsigned char is explicitly specified to be unsigned.
Now, I had a discussion with someone about what happens when you convert the value -1 of type int to unsigned char. He refused the idea that the resulting unsigned char has all its bits set to 1, because he was worried about sign representation. But he didn't have to be. It's immediately following out of this rule that the conversion does what is intended:
If the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type. (6.3.1.3p2 in a C99 draft)
That's a mathematical description. C++ describes it in terms of modulo arithmetic, which yields the same rule. Anyway, what is not guaranteed is that all bits in the integer -1 are one before the conversion. So, what do we have that lets us claim the resulting unsigned char has all its CHAR_BIT bits turned to 1?
All bits participate in determining its value - that is, no padding bits occur in the object.
Adding UCHAR_MAX+1 to -1 just once yields a value in range, namely UCHAR_MAX.
That's enough, actually! So whenever you want to have an unsigned char having all its bits one, you do
unsigned char c = (unsigned char)-1;
It also follows that a conversion is not just truncating higher order bits. The fortunate event for two's complement is that it is just a truncation there, but the same isn't necessarily true for other sign representations.
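A quick sketch to convince yourself (the assert holds on any conforming implementation):

#include <cassert>
#include <climits>

int main() {
    unsigned char c = (unsigned char)-1;
    // Modular conversion: -1 + (UCHAR_MAX + 1) == UCHAR_MAX,
    // regardless of how the machine represents negative int.
    assert(c == UCHAR_MAX);
}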
As for example usages of unsigned char:
unsigned char is often used in computer graphics, which very often (though not always) assigns a single byte to each colour component. It is common to see an RGB (or RGBA) colour represented as 24 (or 32) bits, each an unsigned char. Since unsigned char values fall in the range [0,255], the values are typically interpreted as:
0 meaning a total lack of a given colour component.
255 meaning 100% of a given colour pigment.
So you would end up with RGB red as (255,0,0) -> (100% red, 0% green, 0% blue).
Why not use a signed char? Arithmetic and bit shifting become problematic. A very simple and naive (mostly unused) method for converting RGB to grayscale is to average all three colour components: with unsigned char arithmetic, red (255, 0, 0) correctly averages to the mid-gray (85, 85, 85). But if the same three bytes are reinterpreted as signed chars, they read (-1, 0, 0), because values above 127 wrap around to negative; averaging those gives 0, i.e. nearly black instead of the correct mid-gray.
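For comparison, the naive average works fine with unsigned char components; a sketch:

#include <iostream>

int main() {
    unsigned char r = 255, g = 0, b = 0;  // pure red
    // The operands promote to int, so the sum cannot wrap,
    // and the result 85 fits back into an unsigned char.
    unsigned char gray = static_cast<unsigned char>((r + g + b) / 3);
    std::cout << static_cast<int>(gray) << '\n';  // 85
}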
signed char has range -128 to 127; unsigned char has range 0 to 255.
char will be equivalent to either signed char or unsigned char, depending on the compiler, but is a distinct type.
If you're using C-style strings, just use char. If you need to use chars for arithmetic (pretty rare), specify signed or unsigned explicitly for portability.
char and unsigned char aren't guaranteed to be 8-bit types on all platforms—they are guaranteed to be 8-bit or larger. Some platforms have 9-bit, 32-bit, or 64-bit bytes. However, the most common platforms today (Windows, Mac, Linux x86, etc.) have 8-bit bytes.
An unsigned char is an unsigned byte value (0 to 255). You may be thinking of char as a "character", but it is really a numerical value. On platforms where plain char is signed, only the 128 non-negative values map to characters using the ASCII encoding. In either case, what you are storing in memory is a byte value.
In terms of direct values a regular char is used when the values are known to be between CHAR_MIN and CHAR_MAX while an unsigned char provides double the range on the positive end. For example, if CHAR_BIT is 8, the range of regular char is only guaranteed to be [0, 127] (because it can be signed or unsigned) while unsigned char will be [0, 255] and signed char will be [-127, 127].
In terms of what it's used for: the standards allow the bytes of a POD (plain old data) object to be examined as an array of unsigned char, which lets you inspect the representation and bit patterns of the object, as in the sketch below. The same guarantee doesn't exist for signed char (in C++, plain char does get the same aliasing guarantee).
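A sketch of inspecting an object's bytes this way (Pixel is a made-up POD type for the example):

#include <cstdio>
#include <cstring>

struct Pixel { unsigned char r, g, b; };

int main() {
    Pixel p{255, 0, 127};
    unsigned char bytes[sizeof p];
    std::memcpy(bytes, &p, sizeof p);  // copy out the object representation
    for (unsigned char byte : bytes)
        std::printf("%02X ", static_cast<unsigned>(byte));
    std::printf("\n");  // FF 00 7F
}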
unsigned char is the heart of all bit trickery. In almost all compilers for all platforms an unsigned char is simply a byte and an unsigned integer of (usually) 8 bits that can be treated as a small integer or a pack of bits.
In addition, as someone else has said, the standard doesn't define the sign of a char. So you have 3 distinct char types: char, signed char, unsigned char.
If you like using types of specific length and signedness, you're probably better off with uint8_t, int8_t, uint16_t, etc., simply because they do exactly what they say.
An unsigned char is basically a single byte. So, you would use this if you need one byte of data (for example, maybe you want to use it to set flags on and off to be passed to a function, as is often done in the Windows API).
An unsigned char uses the bit that a signed char reserves for the sign as part of the value instead. This changes the range to [0, 255] as opposed to [-128, 127].
Generally unsigned chars are used when you don't want a sign. This will make a difference when doing things like shifting bits (shift extends the sign) and other things when dealing with a char as a byte rather than using it as a number.
unsigned char takes only positive values: 0 to 255 while
signed char takes positive and negative values: -128 to +127.
Quoted from "The C Programming Language":
The qualifier signed or unsigned may be applied to char or any integer. unsigned numbers
are always positive or zero, and obey the laws of arithmetic modulo 2^n, where n is the number
of bits in the type. So, for instance, if chars are 8 bits, unsigned char variables have values
between 0 and 255, while signed chars have values between -128 and 127 (on a two's
complement machine). Whether plain chars are signed or unsigned is machine-dependent,
but printable characters are always positive.
signed char and unsigned char both occupy one byte, but they have different ranges.
Type | range
-------------------------------
signed char | -128 to +127
unsigned char | 0 to 255
With signed char, consider char letter = 'A': 'A' is stored as the binary value of 65, its ASCII/Unicode code. If 65 can be stored, -65 can be stored too, but there are no negative code points in ASCII/Unicode, so for character data there is no need to worry about negative values.
Example
#include <stdio.h>

int main(void)
{
    signed char char1 = 255;    /* out of range: wraps to -1 (implementation-defined) */
    signed char char2 = -128;
    unsigned char char3 = 255;
    unsigned char char4 = -128; /* modular conversion: becomes 128 */

    printf("Signed char(255) : %d\n", char1);
    printf("Unsigned char(255) : %d\n", char3);
    printf("\nSigned char(-128) : %d\n", char2);
    printf("Unsigned char(-128) : %d\n", char4);
    return 0;
}
Output:
Signed char(255) : -1
Unsigned char(255) : 255
Signed char(-128) : -128
Unsigned char(-128) : 128