I have code like this, and it works:
I understand that the code prints 'm' because of name mangling (https://gcc.gnu.org/onlinedocs/libstdc++/manual/ext_demangling.html).
But why does the compiler print 'm' for size_t?
What is the logic of the mapping? ('i' --> int is clear, but why 'm' --> size_t?)
#include <cstddef>
#include <iostream>
#include <typeinfo>
using namespace std;

int main() {
    size_t i = 5;
    cout << "Type: " << typeid(i).name() << '\n'; // Type: m
}
If you check the Itanium ABI you'll see that all the unsigned types use the next letter in the alphabet after their signed equivalent (except char): int is i and unsigned int is j; long is l and unsigned long is m. As size_t isn't a distinct Itanium type, it's represented here by unsigned long and therefore mangled as m.
The assigned letters are essentially arbitrary, so although there is some logic to their assignment, exactly what they are isn't really important. They're an implementation detail and platform specific; if you need to know what they mean, use a demangler like c++filt, http://demangler.com/, or abi::__cxa_demangle.
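For reference, a minimal sketch of demangling the name at runtime with abi::__cxa_demangle (a GCC/Clang extension from <cxxabi.h>, not part of the standard); on a platform where size_t is unsigned long it prints "unsigned long":

#include <cstddef>
#include <cstdlib>
#include <cxxabi.h>
#include <iostream>
#include <typeinfo>

int main() {
    std::size_t i = 5;
    int status = 0;
    // __cxa_demangle returns a malloc'd string that the caller must free.
    char* readable = abi::__cxa_demangle(typeid(i).name(), nullptr, nullptr, &status);
    if (status == 0) {
        std::cout << "Type: " << readable << '\n'; // Type: unsigned long
        std::free(readable);
    }
}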
But why does the compiler print 'm' for size_t?
Because size_t is an unsigned long on this platform, and the letter m represents unsigned long for this compiler on this platform.
What is the logic of the mapping? ('i' --> int is clear, but why 'm' --> size_t?)
There is no "logic"; there are just rules laying out which letter corresponds to which type. See https://itanium-cxx-abi.github.io/cxx-abi/abi.html#mangling-builtin for the full table.
Related
I want to use the following code in my program, but gcc won't allow me to left-shift my 1 beyond 31.
sizeof(long int) displays 8, so doesn't that mean I can left-shift up to 63?
#include <iostream>
using namespace std;

int main(){
    long int x;
    x = (~0 & ~(1<<63));
    cout << x << endl;
    return 0;
}
Compiling it produces the following warning:

warning: left shift count >= width of type [enabled by default]
 x=(~0 & ~(1<<63));
            ^
and the output is -1. Had I left-shifted by 31 bits, I would get 2147483647, as expected for an int.
I am expecting all bits except the MSB to be turned on, thus displaying the maximum value the datatype can hold.
Although your x is of type long int, the 1 is not. 1 is an int, so 1<<63 is indeed undefined.
Try (static_cast<long int>(1) << 63), or 1L << 63 as suggested by Wojtek.
You can't use 1 (an int by default) to shift beyond the int boundaries.
There's an easier way to get "all bits except the MSB turned on" for a specific datatype:
#include <iostream>
#include <limits>
using namespace std;

int main(){
    unsigned long int max = std::numeric_limits<unsigned long int>::max();
    unsigned long int max_without_MSB = max >> 1;
    cout << max_without_MSB << endl;
    return 0;
}
Note the unsigned type. Without numeric_limits:
#include <iostream>
using namespace std;

int main() {
    long int max = -1;
    // converting -1 to unsigned yields all bits set; then drop the MSB
    unsigned long int max_without_MSB = ((unsigned long int)max) >> 1;
    cout << max_without_MSB << endl;
    return 0;
}
Your title is misleading; a long can be shifted beyond 31 bits if a long is indeed that big. However, your code shifts 1, which is an int.
In C++, the type of an expression is determined by the expression itself. An expression XXXXX has the same type regardless of context; if you later write double foo = XXXXX; it doesn't mean XXXXX is a double - it means a conversion happens from whatever XXXXX was to double.
If you want to left-shift a long, then do that explicitly, e.g. 1L << 32, or ((long)1) << 32. Note that the size of long varies between platforms, so if you don't want your code to break when run on a different system then you'll have to take further measures, such as using fixed-width types, or shifting by CHAR_BIT * sizeof(long) - 1.
There is another issue with your intended code: 1L << 63 causes undefined behaviour if long is 64-bit or less. This is because of signed integer overflow; left-shift is defined the same as repeated multiplication of two, so attempting to "shift into the sign bit" causes an overflow.
To fix this, use unsigned types where it is fine to shift into the MSB, e.g. 1ul << 63.
Technically there is another issue in that ~0 doesn't do what you want if you are not on a 2's complement system, but these days it's pretty safe to ignore that case.
Looking at your overall intention with long x = ~0 & ~(1 << 63), a shorter way to write this is:
long x = LONG_MAX;
which is defined in <climits>. If you want 64 bits on all platforms, then
int64_t x = INT64_MAX;
NB. If you do not intend to work with negative values then use unsigned long x and uint64_t respectively.
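Putting those pieces together, a minimal sketch (the printed values assume a 64-bit long):

#include <climits>
#include <cstdint>
#include <iostream>

int main() {
    long x = LONG_MAX;         // all value bits set, sign bit clear
    int64_t y = INT64_MAX;     // the same, but 64-bit on every platform
    uint64_t msb = 1ull << 63; // shifting into bit 63 is fine for an unsigned type
    std::cout << x << '\n' << y << '\n' << msb << '\n';
}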
First, let me state the key thing about the shift, which is the source of your problem:
There is no guarantee that long int is actually 64 bits wide.
The most generic way I can think of is using std::numeric_limits:
static_cast<long int>(1) << (std::numeric_limits<long int>::digits - 1);
Now you can even make that a constexpr templated function:
#include <limits>

template <typename Integer>
constexpr Integer foo()
{
    return static_cast<Integer>(1) << (std::numeric_limits<Integer>::digits - 1);
}
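For illustration, it can then be evaluated at compile time; this usage sketch repeats the foo template from above to stay self-contained, and the printed value assumes a 64-bit unsigned long:

#include <iostream>
#include <limits>

template <typename Integer>
constexpr Integer foo()
{
    return static_cast<Integer>(1) << (std::numeric_limits<Integer>::digits - 1);
}

int main() {
    constexpr unsigned long msb = foo<unsigned long>(); // computed at compile time
    std::cout << msb << '\n'; // 9223372036854775808, i.e. only bit 63 set
}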
So replacing the shift with static_cast<long int>(1) << (std::numeric_limits<long int>::digits - 1) will fix your issue. However, there is a far better way:
std::numeric_limits includes a bunch of useful stuff, including:
std::numeric_limits<T>::max();     // the maximum value T can hold
std::numeric_limits<T>::min();     // the minimum value T can hold
std::numeric_limits<T>::digits;    // the number of binary digits, excluding the sign bit
std::numeric_limits<T>::is_signed; // well, do I have to explain? ;-)
See cppreference.com for a complete list. You should prefer the facilities provided by the standard library: they most likely have fewer mistakes, and other developers will recognize them immediately.
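A quick sketch exercising these members (the printed values assume a 64-bit long):

#include <iostream>
#include <limits>

int main() {
    std::cout << std::numeric_limits<long>::max() << '\n'  // 9223372036854775807
              << std::numeric_limits<long>::min() << '\n'  // -9223372036854775808
              << std::numeric_limits<long>::digits << '\n' // 63 non-sign bits
              << std::boolalpha
              << std::numeric_limits<long>::is_signed << '\n'; // true
}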
The default type for an integer literal in C is int unless explicitly indicated otherwise.
Here you have to cast (or suffix, e.g. 1L) the 1 to long int; it would otherwise be an int.
Except for bool and the extended character types, the integral types may be signed or unsigned. (C++ Primer, 5th ed., p. 34)
"May be" confuses me. However, please don't answer with the difference between, say, int and unsigned int when they are written out explicitly in a declaration; I would like to know, for the types char, short, int, long, and long long, under what condition each is signed or unsigned.
I've written a simple test program on my Mac and compiled it with the GNU compiler; it tells me that char is signed:
#include <iostream>
#include <limits>
using namespace std;

int main( int argc, char * argv[] )
{
    int minChar = numeric_limits<char>::min();
    int maxChar = numeric_limits<char>::max();
    cout << minChar << endl; // prints -128
    cout << maxChar << endl; // prints 127
    return 0;
}
The same mechanism was applied to all of the integral types that can be signed, and the results are shown below.
minOfChar: -128
maxOfChar: 127
minOfShort: -32768
maxOfShort: 32767
minOfInt: -2147483648
maxOfInt: 2147483647
minOfLong: 0 // This is interesting, 0
maxOfLong: -1 // and -1 :p
minOfLongLong: 0 // shouldn't use int to hold the max/min of long/long long, as Bathsheba answered below
maxOfLongLong: -1 // I'll leave this error unfixed; a stupid pitfall for newbies like me, and good for learning :)
The results tell me that char, short, int, long, and long long, compiled with g++ on a Mac, are signed integers by default.
So the question is, as the title says:
What decides whether an integral type is signed or unsigned?
Aside from char, the signedness of the integral types is specified in the C and C++ standards, either explicitly, or by a simple corollary of the ranges that the types are required to implement.
The signedness of char is determined by the particular implementation of C and C++; i.e. it's typically up to the compiler. And the choice will be made to best suit the hardware.
Note that char, signed char, and unsigned char are all distinct types, much in the same way that int and long are distinct types even if they have the same size and complementing scheme.
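A small sketch of that distinctness, using C++11 <type_traits>; these assertions hold on every conforming compiler, whatever char's signedness:

#include <type_traits>

// char is always a third type, distinct from both signed char and unsigned
// char, regardless of which one it behaves like on a given platform.
static_assert(!std::is_same<char, signed char>::value, "char != signed char");
static_assert(!std::is_same<char, unsigned char>::value, "char != unsigned char");

int main() {}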
It's also not a particularly good idea to assign, for example,
numeric_limits<long>::min();
to an int: the result of converting an out-of-range value to a signed type is implementation-defined. Why not use
auto foo = numeric_limits<whatever>::min();
instead?
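A sketch of that, which also answers the original question at compile time; the printed results describe whatever platform you run it on:

#include <iostream>
#include <limits>

int main() {
    auto lo = std::numeric_limits<long>::min(); // deduced as long: no lossy conversion
    std::cout << lo << '\n'
              << std::boolalpha
              << std::numeric_limits<char>::is_signed << '\n'; // true with g++ on a Mac
}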
I would like to find a maximally efficient way to compute a char that contains the least significant bits of an int in C++11. The solution must work with any possible standards-compliant compiler. (I'm using the N3290 C++ draft spec, which is essentially C++11.)
The reason for this is that I'm writing something like a fuzz tester, and want to check libraries that require a std::string as input. So I need to generate random characters for the strings. The pseudo-random generator I'm using provides ints whose low bits are pretty uniformly random, but I'm not sure of the exact range. (Basically the exact range depends on a "size of test case" runtime parameter.)
If I didn't care about working on any compiler, this would be as simple as:
inline char int2char(int i) { return i; }
Before you dismiss this as a trivial question, consider that:
You don't know whether char is a signed or unsigned type.
If char is signed, then a conversion from an unrepresentable int to a char is "implementation-defined" (§4.7/3). This is far better than undefined, but for this solution I'd need to see some evidence that the standard prohibits things like converting all ints not between CHAR_MIN and CHAR_MAX to '\0'.
reinterpret_cast is not permitted between a signed and unsigned char (§5.2.10). static_cast performs the same conversion as in the previous point.
char c = i & 0xff (though it silences some compiler warnings) is almost certainly not correct for all implementation-defined conversions. In particular, i & 0xff is always a positive number, so in the case that c is signed it could quite plausibly fail to convert negative values of i to negative values of c.
Here are some solutions that do work, but in most of these cases I'm worried they won't be as efficient as a simple conversion. These also seem too complicated for something so simple:
Using reinterpret_cast on a pointer or reference, since you can convert from unsigned char * or unsigned char & to char * or char & (but at the possible cost of runtime overhead); see the sketch after this list.
Using a union of char and unsigned char, where you first assign the int to the unsigned char, then extract the char (which again could be slower).
Shifting left and right to sign-extend the int. E.g., if i is the int, running c = (i << 8 * (sizeof(i) - sizeof(c))) >> 8 * (sizeof(i) - sizeof(c)); (but that's inelegant, and if the compiler doesn't optimize away the shifts, quite slow).
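For illustration, the first option might look like the sketch below. It relies on the int-to-unsigned-char conversion (well-defined: value modulo 2^CHAR_BIT) followed by reading the byte through a char reference, which the aliasing rules permit; the name int2char_bits is just for this example.

inline char int2char_bits(int i) {
    // int -> unsigned char is well-defined: the value is taken modulo 2^CHAR_BIT.
    unsigned char u = static_cast<unsigned char>(i);
    // Reading an unsigned char object through a char reference is allowed by
    // the aliasing rules; the bit pattern is preserved unchanged.
    return reinterpret_cast<char&>(u);
}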
Here's a minimal working example. The goal is to argue that the assertions can never fail on any compiler, or to define an alternate int2char in which the assertions can never fail.
#include <algorithm>
#include <cassert>
#include <cstdio>
#include <cstdlib>
using namespace std;

constexpr char int2char(int i) { return i; }

int
main(int argc, char **argv)
{
    for (int n = 1; n < min(argc, 127); n++) {
        char c = -n;
        int i = (atoi(argv[n]) << 8) ^ -n;
        assert(c == int2char(i));
    }
    return 0;
}
I've phrased this question in terms of C++ because the standards are easier to find on the web, but I am equally interested in a solution in C. Here's the MWE in C:
#include <assert.h>
#include <stdlib.h>

static char int2char(int i) { return i; }

int
main(int argc, char **argv)
{
    for (int n = 1; n < argc && n < 127; n++) {
        char c = -n;
        int i = (atoi(argv[n]) << 8) ^ -n;
        assert(c == int2char(i));
    }
    return 0;
}
A far better way is to have an array of chars and generate a random number to pick a char from that array. This way you get 'well behaved' characters, or at least characters with well-defined badness. If you really want all 256 chars (note the 8-bit assumption), then create an array with 256 entries in it ('a', 'b', ..., '\t', '\n', ...).
This will be portable too.
Given that you appear to be interested in bit value (rather than numeric value), and have also asked for C solutions, I'm going to post what I believe to be something that's compliant and optimal:
#include <string.h>

inline char int2char(int i) {
    char ret;
    memcpy(&ret, (char *)&i + OFFSET, 1);
    return ret;
}
where OFFSET is a macro that expands to either 0 or sizeof(int)-1, based on an endianness check.
AFAICS, this works regardless of whether char is signed or unsigned, of what representation is used for negative values, or of the width of char or int. It doesn't rely on any weird type-punning tricks, and has no branching or complex operations (such as divide).
I say "optimal" because I'm assuming that any sane compiler treats memcpy as an intrinsic, and thus will do something smart here.
In C++, is it okay to compare an int to a char because of implicit type conversion? Or am I misunderstanding the concept?
For example, can I do
int x = 68;
char y;
std::cin >> y;
// Assuming that the user inputs 'Z';
if (x < y)
{
    std::cout << "Your input is larger than x";
}
Or do we need to convert it to an int first, like so?
if (x < static_cast<int>(y))
{
    std::cout << "Your input is larger than x";
}
The problem with both versions is that you cannot be sure what value results from negative/large characters (the values that are negative if char happens to be signed). This is implementation-defined, because the implementation defines whether char means signed char or unsigned char.
The only way to fix this problem is to cast to the appropriate signed/unsigned char type first:
if(x < (signed char)y)
or
if(x < (unsigned char)y)
Omitting this cast will result in implementation-defined behavior.
Personally, I generally prefer use of uint8_t and int8_t when using chars as numbers, precisely because of this issue.
This still assumes that the value of the (un)signed char is within the range of possible int values on your platform. This may not be the case if sizeof(char) == sizeof(int) == 1 (possible only if a char is at least 16 bits wide!), and you are comparing signed and unsigned values.
To avoid this problem, ensure that you use either
signed x = ...;
if(x < (signed char)y)
or
unsigned x = ...;
if(x < (unsigned char)y)
Your compiler will hopefully complain with a warning about mixed signed comparison if you fail to do so.
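To see why the casts matter, here is a small sketch (output assumes an 8-bit char and two's complement, where the signed view of the bit pattern 0xC8 is -56):

#include <iostream>

int main() {
    char y = '\xC8'; // bit pattern 0xC8: 200 viewed unsigned, -56 viewed signed
    int x = 150;
    std::cout << (x < (unsigned char)y) << '\n'; // 1: compares 150 < 200
    std::cout << (x < (signed char)y) << '\n';   // 0: compares 150 < -56
}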
Your code will compile and work, for some definition of "work".
Still, you might get unexpected results, because y is a char, whose signedness is implementation-defined. That, combined with the unknown size of int, will lead to much joy.
Also, please write the char literals you mean ('D' rather than 68); don't look things up in the ASCII table yourself. Any reader (you in 5 minutes) will be thankful.
Last point: avoid gratuitous casts; they don't make anything better and may hide problems your compiler would normally warn about.
Yes, you can compare an int to some char, like you can compare an int to some short, but it might be considered bad style. I would code
if (x < (int)y)
or like you did
if (x < static_cast<int>(y))
which I find a bit too verbose for this case.
BTW, if you intend to use bytes as numbers rather than as characters, consider also the int8_t type (etc.) from <cstdint>.
Don't forget that on some systems char is signed by default, while on others it is unsigned (and you can be explicit with unsigned char vs signed char).
The code you suggest will compile, but I strongly recommend the static_cast version. Using static_cast, you help the reader understand what you are comparing to an integer.
I came across code like the following:
#define SOME_VALUE 0xFEDCBA9876543210ULL
This SOME_VALUE is assigned to some unsigned long long later.
Questions:
Is there a need for a suffix like ULL in this case?
In what situations do we need to specify the type of an integer literal?
Do C and C++ behave differently in this case ?
In C, a hexadecimal literal gets the first type of int, unsigned int, long, unsigned long, long long or unsigned long long that can represent its value if it has no suffix. I wouldn't be surprised if C++ has the same rules.
You would need a suffix if you want to give a literal a larger type than it would have by default or if you want to force its signedness, consider for example
1 << 43;
Without suffix, that is (almost certainly) undefined behaviour, but 1LL << 43; for example would be fine.
I think not, but maybe it was required for that particular compiler.
You need one, for example, with printf("%ld", SOME_VALUE); if SOME_VALUE's integer type is not pinned down by a suffix, this might end up producing the wrong output.
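For instance (a sketch; %llu is the matching conversion specifier for unsigned long long):

#include <cstdio>

#define SOME_VALUE 0xFEDCBA9876543210ULL

int main() {
    // std::printf("%ld\n", SOME_VALUE); // mismatched format: undefined behaviour
    std::printf("%llu\n", SOME_VALUE);   // suffix and conversion specifier agree
}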
A good example for the use of specifying a suffix in C++ is overloaded functions. Take the following for example:
#include <iostream>

void consumeInt(unsigned int x)
{
    std::cout << "UINT" << std::endl;
}

void consumeInt(int x)
{
    std::cout << "INT" << std::endl;
}

void consumeInt(unsigned long long x)
{
    std::cout << "ULL" << std::endl;
}

int main(int argc, const char * argv[])
{
    consumeInt(5);
    consumeInt(5U);
    consumeInt(5ULL);
    return 0;
}
Results in:
INT
UINT
ULL
You do not need suffixes if your only intent is to get the right value of the number; C automatically chooses a type in which the value fits.
The suffixes are important if you want to force the type of the expression, e.g. for purposes of how it interacts in larger expressions. Making it long, or long long, may be needed when you're going to perform an arithmetic operation that would overflow a smaller type (for example, 1ULL<<n or x*10LL), and making it unsigned is useful when you want the expression as a whole to have unsigned semantics (for example, c-'0'<10U, or n%2U).
When you don't write any suffix, the compiler deduces the type of an integral literal, trying int first (and moving to larger types when the value doesn't fit). Since a literal like this would overflow int, you add a suffix to tell the compiler to treat it as something other than int. That is what you do when you write 0xFEDCBA9876543210ULL.
You can also use a suffix when you write a floating-point number: 1.2 is a double, while 1.2f is a float.
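A sketch tying this back to the first question's mangled names; with GCC/Clang's Itanium mangling this prints d then f:

#include <iostream>
#include <typeinfo>

int main() {
    std::cout << typeid(1.2).name() << '\n';  // d: the literal is a double
    std::cout << typeid(1.2f).name() << '\n'; // f: the literal is a float
}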