`int` assumed to always be 32 bit in OpenCV? - c++

It appears that in OpenCV, the int datatype is always assumed to be 32 bits. This is reflected in the documentation (for example, in the introduction), and also in the source code (for example, in the comments of modules/core/include/opencv2/core/cvdef.h, and the fact that it defines uint to be a 32-bit unsigned integer, but doesn't define a corresponding signed type).
How does this not break OpenCV on systems in which int isn't 32 bits? After all, int is only guaranteed to be at least 16 bits by the standard.
I would have expected OpenCV to define datatypes for all sizes that it uses (just like it does for int64), or use uint8_t and friends.

How does this not break OpenCV on systems in which int isn't 32 bits?
It probably does break. You should try building on such a system to be sure. Then again, good luck finding such a system that still has enough memory and CPU power to do meaningful computer vision; 16-bit int is typically found only on very small embedded systems these days.
The clean way to get a fast type that is at least 32 bits wide is to use the int_fast32_t type from <stdint.h>, but this requires C99 support, and Microsoft's C compiler went a long time without supporting that standard.

Related

In new code, why would you use `int` instead of `int_fast16_t` or `int_fast32_t` for a counting variable?

If you need a counting variable, surely there must be an upper and a lower limit that your integer must support. So why wouldn't you specify those limits by choosing an appropriate (u)int_fastxx_t data type?
The simplest reason is that people are more used to int than the additional types introduced in C++11, and that it's the language's "default" integral type (so much as C++ has one); the standard specifies, in [basic.fundamental/2] that:
"Plain ints have the natural size suggested by the architecture of the execution environment [46]; the other signed integer types are provided to meet special needs."
[46] That is, large enough to contain any value in the range of INT_MIN and INT_MAX, as defined in the header <climits>.
Thus, whenever a generic integer is needed, which isn't required to have a specific range or size, programmers tend to just use int. While using other types can communicate intent more clearly (for example, using int8_t indicates that the value should never exceed 127), using int also communicates that these details aren't crucial to the task at hand, while simultaneously providing a little leeway to catch values that exceed your required range (if a system handles signed overflow with modulo arithmetic, for example, an int8_t would treat 313 as 57, making the invalid value harder to troubleshoot); typically, in modern programming, it either indicates that the value can be represented within the system's word size (which int is supposed to represent), or that the value can be represented within 32 bits (which is nearly always the size of int on x86 and x64 platforms).
Sized types also have the issue that the (theoretically) most well-known ones, the intX_t line, are only defined on platforms which support sizes of exactly X bits. While the int_leastX_t types are guaranteed to be defined on all platforms, and guaranteed to be at least X bits, a lot of people wouldn't want to type that much if they don't have to, since it adds up when you need to specify types often. [You can't use auto either, because it deduces ordinary integer literals as int. This can be mitigated by making user-defined literal operators, but that still takes more time to type.] Thus, they'll typically use int if it's safe to do so.
Or in short, int is intended to be the go-to type for normal operation, with the other types intended to be used in extranormal circumstances. Many programmers stick to this mindset out of habit, and only use sized types when they explicitly require specific ranges and/or sizes. This also communicates intent relatively well; int means "number", and intX_t means "number that always fits in X bits".
It doesn't help that int has evolved to unofficially mean "32-bit integer", due to both 32- and 64-bit platforms usually using 32-bit ints. It's very likely that many programmers expect int to always be at least 32 bits in the modern age, to the point where it can very easily bite them in the rear if they have to program for platforms that don't support 32-bit ints.
Conversely, the sized types are typically used when a specific range or size is explicitly required, such as when defining a struct that needs to have the same layout on systems with different data models. They can also prove useful when working with limited memory, using the smallest type that can fully contain the required range.
A struct intended to have the same layout on 16- and 32-bit systems, for example, would use either int16_t or int32_t instead of int, because int is 16 bits in most 16-bit data models and the LP32 32-bit data model (used by the Win16 API and Apple Macintoshes), but 32 bits in the ILP32 32-bit data model (used by the Win32 API and *nix systems, effectively making it the de facto "standard" 32-bit model).
Similarly, a struct intended to have the same layout on 32- and 64-bit systems would use int/int32_t or long long/int64_t over long, due to long having different sizes in different models (64 bits in LP64 (used by 64-bit *nix), 32 bits in LLP64 (used by Win64 API) and the 32-bit models).
Note that there is also a third 64-bit model, ILP64, where int is 64 bits; this model is very rarely used (to my knowledge, it was only used on early 64-bit Unix systems), but would mandate the use of a sized type over int if layout compatibility with ILP64 platforms is required.
There are several reasons. One, these long names make the code less readable. Two, you might introduce really hard to find bugs. Say you used int_fast16_t but you really need to count up to 40,000. The implementation might use 32 bits and the code work just fine. Then you try to run the code on an implementation that uses 16 bits and you get hard-to-find bugs.
A note: In C / C++ you have types char, short, int, long and long long which must cover 8 to 64 bits, so int cannot be 64 bits (because char and short cannot cover 8, 16 and 32 bits), even if 64 bits is the natural word size. In Swift, for example, Int is the natural integer size, either 32 or 64 bits, and you have Int8, Int16, Int32 and Int64 for explicit sizes. Int is the best type unless you absolutely need 64 bits, in which case you use Int64, or if you need to save space.

int limit vs long limit

Everyone knows that an int is smaller than a long.
Behind this MSDN link, I'm reading the following:
INT_MIN (Minimum value for a variable of type int.) -2147483648
INT_MAX (Maximum value for a variable of type int.) 2147483647
LONG_MIN (Minimum value for a variable of type long.) -2147483648
LONG_MAX (Maximum value for a variable of type long.) 2147483647
The same information can be found here.
Have I been told a lie my whole life? What is the difference between int and long, if not the values they can hold? How come?
You've mentioned both C++ and ASP.NET. The two are very different.
As far as the C and C++ specifications are concerned, the only thing you know about a primitive data type is the minimum range of values it must be able to store. Prepare for your first surprise: int corresponds to a range of [-32767; 32767]. Most people today think that int is a 32-bit number, but it's really only guaranteed to be able to store the equivalent of a 16-bit number, almost. Also note that the range isn't the more typical [-32768; 32767], because C was designed as a common abstract machine for a wide range of platforms, including platforms that didn't use 2's complement for their negative numbers.
It shouldn't therefore be surprising that long is actually a "sort-of-32-bit" data type. This doesn't mean that C++ implementations on Linux (which commonly use a 64-bit number for long) are wrong, but it does mean that C++ applications written for Linux that assume that long is 64-bit are wrong. This is a lot of fun when porting C++ applications to Windows, of course.
The standard 64-bittish integer type to use is long long, and that is the standard way of declaring a 64-bittish integer on Windows.
However, .NET cares about no such things, because it is built from the ground up on its own specification, in part exactly because of how history-laden C and C++ are. In .NET, int is a 32-bit integer, and long is a 64-bit integer, and long is always bigger than int. In C, if you used long (32-bittish) and stored a value like ten trillion in there, there was a chance it would work, since it's possible that your long was actually a 64-bit number, and C didn't care about the distinction; that's exactly what happens on most Linux C and C++ compilers. Since the types are defined like this for performance reasons, it's perfectly legal for the compiler to use a 32-bit data type to store an 8-bit value (keep that in mind when you're "optimizing for performance"; the compiler is doing optimizations of its own). .NET can still run on platforms that don't have e.g. 32-bit 2's complement integers, but the runtime must ensure that the type can hold as much as a 32-bit 2's complement integer, even if that means taking the next bigger type ("wasting" twice as much memory, usually).
In C and C++ the requirements are that int can hold at least 16 bits, long can hold at least 32 bits, and int can not be larger than long. There is no requirement that int be smaller than long, although compilers often implement them that way. You haven't been told a lie, but you've been told an oversimplification.
This is C++
On many (but not all) C and C++ implementations, a long is larger than
an int. Today's most popular desktop platforms, such as Windows and
Linux, run primarily on 32 bit processors and most compilers for these
platforms use a 32 bit int which has the same size and representation
as a long.
See the ref http://tsemba.org/c/inttypes.html
No! Well, it's like how we have been told since childhood that the sun rises in the east and sets in the west. (The Sun doesn't move, after all!)
In earlier processing environments, where we had 16-bit operating systems, an integer was considered to be 16 bits (2 bytes), and a 'long' 4 bytes (32 bits).
But with the advent of 32-bit and 64-bit operating systems, an integer is said to consist of 32 bits (4 bytes) and a long to be 'at least as big as an integer', hence 32 bits again. That explains why 'int' and 'long' can have the same minimum and maximum values.
Hence, this depends entirely on the architecture of your system.

Why are there different names for same type of data unit?

As far as I know, in C++ on a 32-bit compiler, int = __int32 = long = DWORD. But why have so many? Why not just one?
If I were to pick a name, int32 seems most appropriate, since there is no confusion there as to what it could be.
int is a pre-C99 type which is guaranteed to be at least 16 bits, but is 32 bits on most modern architectures. (It was originally intended to be the "native" word size, but even on 64-bit architectures it is usually still 32 bits, largely for backwards compatibility reasons.)
long is a pre-C99 type which is guaranteed to be at least 32 bits, but is allowed to be wider. (It is 64 bits with 64-bit Linux compilers, but stays 32 bits with 64-bit Windows compilers, largely for backwards compatibility reasons.)
__int32/__int32_t is a nonstandard typedef which was implemented by many C compilers and runtime libraries, to guarantee a fixed width pre-C99.
int32_t is a C99 type which is guaranteed to be exactly 32 bits.
DWORD is a typedef from the original Windows API which is guaranteed to be exactly 32 bits, from the days when there was no language-defined type of exactly 32 bits.
So basically, the large number of ways to say "32-bit integer" come from how C dragged its feet on standardizing fixed-width types, and from the long tenure of 32-bit processors dominating the field, causing everyone to standardize on 32 bits as the "normal" integer size.
Because of legacy applications. An int doesn't describe how big it is at all. It's an integer. Big deal.
In the 16-bit era, an int was not a long. DWORD being a double-word was precise. A word is known as 2 bytes, and therefore a DWORD must be two of them.
__intXX are Microsoft specific.
So, there are lots of different reasons why different projects (e.g. Microsoft Windows) use different types.
While compilers TODAY are typically 32-bit, this has not always been the case. And there are compilers that are 64-bit.
The term DWORD originates from way back when Windows was a 16-bit segmented mode application (many members here have probably never worked on a 16-bit segmented mode environment). It is "two 16-bit words", treated, at least these days, as an unsigned 32-bit value.
The type int32_t is defined by the C standard document (and through inheritance, also in C++). It is GUARANTEED to exist only if it is actually exactly 32 bits. On a machine with 36-bit words, there is no int32_t (there is an int_least32_t, which should exist on all systems that support AT LEAST 32 bits).
long is 32 bits in a Windows 32- or 64-bit compiler, but 64-bits in a Linux 64-bit compiler, and 32-bits in a Linux 32-bit compiler. So it's definitely "variable size".
It is also often a good idea to pick your OWN name for types. That is assuming you do care at all - it's also fine to use int, long, etc, as long as you are not RELYING on them being some size - for(i = 0; i < 10; i++) x += i; will work with i and x being any integer type - the sum is even below 128, so char would work. Using int here will be fine, since it's likely to be a "fast" type. In some architectures, using long may make the code slower - especially in 16-bit architectures where long takes up two 16-bit words and needs to be dealt with using (typically) two or more operations for addition and subtraction for example. This can really slow code down in sensitive places.
It is because they represent different types which can be translated to different sizes.
int is a default 'integer' and its size is not specified.
'int32' says it is 32 bits (a four-byte integer).
long is a 'longer version of integer' which can occupy a larger number of bytes. On your 32-bit compiler it is still a 4-byte integer. The 'long long' type, which on Windows, as I remember, was called __int64, is 64 bits.
DWORD is a type introduced by Microsoft. It is a 'double word', where a word, at that time, meant 'two bytes'.
Your choice of int32 is good when you know that you need a 32-bit integer.

Long Vs. Int C/C++ - What's The Point?

As I've learned recently, a long in C/C++ is the same length as an int. To put it simply, why? It seems almost pointless to even include the datatype in the language. Does it have any uses specific to it that an int doesn't have? I know we can declare a 64-bit int like so:
long long x = 0;
But why does the language choose to do it this way, rather than just making a long well...longer than an int? Other languages such as C# do this, so why not C/C++?
When writing in C or C++, the size of every datatype is architecture and compiler specific. On one system int is 32 bits, but you can find ones where it is 16 or 64; the exact size is not defined by the language, so it's up to the compiler.
As for long and int, the distinction comes from the time when the standard integer was 16 bits and long was a 32-bit integer, so it was indeed longer than int.
The specific guarantees are as follows:
char is at least 8 bits (1 byte by definition, however many bits it is)
short is at least 16 bits
int is at least 16 bits
long is at least 32 bits
long long (in versions of the language that support it) is at least 64 bits
Each type in the above list is at least as wide as the previous type (but may well be the same).
Thus it makes sense to use long if you need a type that's at least 32 bits, int if you need a type that's reasonably fast and at least 16 bits.
Actually, at least in C, these lower bounds are expressed in terms of ranges, not sizes. For example, the language requires that INT_MIN <= -32767, and INT_MAX >= +32767. The 16-bit requirement follows from this and from the requirement that integers are represented in binary.
C99 adds <stdint.h> and <inttypes.h>, which define types such as uint32_t, int_least32_t, and int_fast16_t; these are typedefs, usually defined as aliases for the predefined types.
(There isn't necessarily a direct relationship between size and range. An implementation could make int 32 bits, but with a range of only, say, -2^23 .. +2^23-1, with the other 8 bits (called padding bits) not contributing to the value. It's theoretically possible (but practically highly unlikely) that int could be larger than long, as long as long has at least as wide a range as int. In practice, few modern systems use padding bits, or even representations other than 2's-complement, but the standard still permits such oddities. You're more likely to encounter exotic features in embedded systems.)
long is not the same length as an int. According to the specification, long is at least as large as int. For example, on Linux x86_64 with GCC, sizeof(long) = 8, and sizeof(int) = 4.
long is not the same size as int, it is at least the same size as int. To quote the C++03 standard (3.9.1-2):
There are four signed integer types: “signed char”, “short int”,
“int”, and “long int.” In this list, each type provides at least as
much storage as those preceding it in the list. Plain ints have the
natural size suggested by the architecture of the execution
environment; the other signed integer types are provided to meet special needs.
My interpretation of this is "just use int, but if for some reason that doesn't fit your needs and you are lucky to find another integral type that's better suited, be our guest and use that one instead". One way that long might be better is if you're on an architecture where it is... longer.
looking for something completely unrelated and stumbled across this and needed to answer. Yeah, this is old, so for people who surf on in later...
Frankly, I think all the answers on here are incomplete.
The size of a long is the size of the number of bits your processor can operate on at one time. It's also called a "word". A "half-word" is a short. A "doubleword" is a long long and is twice as large as a long (and originally was only implemented by vendors and not standard), and even bigger than a long long is a "quadword" which is twice the size of a long long but it had no formal name (and not really standard).
Now, where does the int come in? In part registers on your processor, and in part your OS. Your registers define the native sizes the CPU handles which in turn define the size of things like the short and long. Processors are also designed with a data size that is the most efficient size for it to operate on. That should be an int.
On today's 64-bit machines you'd assume, since a long is a word and a word on a 64-bit machine is 64 bits, that a long would be 64 bits and an int whatever the processor is designed to handle, but it might not be. Why? Your OS has chosen a data model and defined these data sizes for you (pretty much by how it's built). Ultimately, if you're on Windows (and using Win64) it's 32 bits for both a long and an int. Solaris and Linux use different definitions (the long is 64 bits). These definitions are called things like ILP64, LP64, and LLP64. Windows uses LLP64 and Solaris and Linux use LP64:
Model      ILP64  LP64  LLP64
int        64     32    32
long       64     64    32
pointer    64     64    64
long long  64     64    64
Where, e.g., ILP means int-long-pointer, and LLP means long-long-pointer
To get around this most compilers seem to support setting the size of an integer directly with types like int32 or int64.

Do I need to have 64 bit Processor to use 64 bit data type

I have a few questions:
Do I need a 64-bit processor to use a 64-bit data type (__int64 or int64_t)?
What does the "t" in int64_t mean?
Starting from which versions do GCC and VCC support this data type?
Does a 64-bit data type just double the data length, or are there other things going on under the hood too?
You don't need 64 bit processor to use 64 bit data type. It all depends on the compiler and only on the compiler. The compiler can provide you with 128-bit, 237-bit or 803-bit data types, if it so desires.
However, keep in mind that normally 32-bit CPUs cannot handle 64-bit values directly, which means that the burden of supporting all necessary language operations for 64-bit type lies on the compiler and the library. The compiler will have to generate a more-or-less complex sequence of 32-bit CPU instructions in order to perform additions, shifts, multiplications etc. on 64-bit values. This means that in code generated for 32-bit CPUs basic language operations on 64-bit data types will not be as efficient as they would be in code generated for 64-bit CPUs (since in the latter most language operations would be carried out by a single CPU instruction).
The "t" in int64_t stands for either "type" or "typedef name". That's an old accepted naming convention for standard library typedefs.
As for compiler versions, the question is actually ambiguous. The typedef name int64_t is part of the standard library of the C language (and, before C++11, not of the C++ language), while support for 64-bit integer types (under any name) is part of the compiler. So which one are you asking about? For example, the MSVC compiler has supported 64-bit data types for a long time, but under different names: the 64-bit signed integer is called __int64 in MSVC. As for the int64_t typedef, AFAIK it was missing from MSVC's standard library for a long time. In fact, int64_t became part of the C language with the C99 version of its specification, and C++ only adopted it in C++11 via <cstdint>. So, generally, you should not expect int64_t to be available in pre-C++11 code regardless of the compiler version.
As for data length... Well, yeah, it is just doubling the number of bits. The rest follows.
No, you can process such data on a 32 bit machine. So long as your compiler supports those data types you are fine.
int64_t is just its name, as defined in the standard.
I think all versions of GCC and MSVC this century support 64 bit integers on 32 bit architecture.
A 64 bit integer is just twice the size of a 32 bit integer.
If you look at /usr/include/stdint.h, you'll find that int64_t is defined as
typedef long long int int64_t;
So, as David said, it's compiler and not architecture dependent.
No, compilers on 32bit architectures emulate 64bit arithmetic. It's not terribly fast, but it's not that bad.
The t refers to type. This is legacy from C where structs would have to be referred to differently.
64bit integral types may have increased alignment, but that's about it.
I've no idea for point 3.