Why do C programmers use typedefs to rename basic types? - c++

So I'm far from an expert on C, but something's been bugging me about code I've been reading for a long time: can someone explain to me why C(++) programmers use typedefs to rename simple types? I understand why you would use them for structs, but what exactly is the reason for declarations I see like
typedef unsigned char uch;
typedef uch UBYTE;
typedef unsigned long ulg;
typedef unsigned int u32;
typedef signed short s16;
Is there some advantage to this that isn't clear to me (a programmer whose experience begins with Java and hasn't ventured far outside of strictly type-safe languages)? Because I can't think of any reason for it--it looks like it would just make the code less readable for people unfamiliar with the project.
Feel free to treat me like a C newbie, I honestly know very little about it and it's likely there are things I've misunderstood from the outset. ;)

Renaming types without changing their exposed semantics/characteristics doesn't make much sense. In your example
typedef unsigned char uch;
typedef unsigned long ulg;
belong to that category. I don't see the point, aside from making a shorter name.
But these ones
typedef uch UBYTE;
typedef unsigned int u32;
typedef signed short s16;
are a completely different story. For example, s16 stands for "signed 16 bit type". This type is not necessarily signed short. Which specific type will hide behind s16 is platform-dependent. Programmers introduce this extra level of naming indirection to simplify the support for multiple platforms. If on some other platform signed 16 bit type happens to be signed int, the programmer will only have to change one typedef definition. UBYTE apparently stands for an unsigned machine byte type, which is not necessarily unsigned char.
It's worth noting that the C99 specification already provides a standard nomenclature for integral types of specific width, like int16_t, uint32_t and so on. It probably makes more sense to stick with this standard naming convention even on platforms that don't support C99 (you can supply the typedefs yourself there).
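For example, a project might keep all of these width-specific names in one header and switch them per platform, falling back to hand-written typedefs only where stdint.h is missing. A minimal sketch; the feature macro and the 16-bit fallback choices are hypothetical:
/* project_types.h - hypothetical central header */
#if defined(HAVE_C99_STDINT)          /* hypothetical feature-test macro */
#include <stdint.h>
typedef uint8_t  UBYTE;
typedef int16_t  s16;
typedef uint32_t u32;
#else                                 /* e.g. an old 16-bit compiler without <stdint.h> */
typedef unsigned char  UBYTE;
typedef signed short   s16;
typedef unsigned long  u32;           /* long is the 32-bit type on such a compiler */
#endif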

This allows for portability. For example you need an unsigned 32-bit integer type. Which standard type is that? You don't know - it's implementation defined. That's why you typedef a separate type to be 32-bit unsigned integer and use the new type in your code. When you need to compile on another C implementation you just change the typedefs.

Sometimes it is used to reduce an unwieldy thing like volatile unsigned long to something a little more compact such as vuint32_t.
Other times it is to help with portability, since types like int are not necessarily the same size on every platform. By using a typedef you can set the storage size you are interested in to the platform's closest match without changing all the source code.
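For instance (a minimal sketch; the register address is made up, and it assumes unsigned long is 32 bits on the target):
typedef volatile unsigned long vuint32_t;                /* 'volatile unsigned long', compacted */

vuint32_t *const status_reg = (vuint32_t *)0x40001000;   /* hypothetical memory-mapped register */
/* every read of *status_reg goes to the hardware, because the typedef carries 'volatile' with it */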

There are many reasons for it. The ones I can think of:
The type name becomes shorter, so the code is smaller and more readable.
It provides an alias for long structure names.
It follows a convention used in a particular team/company/style.
Porting - you keep the same name across all OSes and machines, even though the native data type underneath might be slightly different.

Following is a quote from The C Programming Language (K&R):
Besides purely aesthetic issues, there are two main reasons for using typedefs.
First - to parameterize a program
The first is to parameterize a program against portability problems. If typedefs are used for data types that may be machine-dependent, only the typedefs need change when the program is moved.
One common situation is to use typedef names for various integer quantities, then make an appropriate set of choices of short, int, and long for each host machine. Types like size_t and ptrdiff_t from the standard library are examples.
The quoted portions tell us that programmers typedef basic types for portability. If I want to make sure my program works on different platforms, using different compilers, I will try to ensure its portability in every possible way, and typedef is one of them.
When I started programming with the Turbo C compiler on Windows, it gave me sizeof(int) == 2. When I moved to Linux and the GCC compiler, the size I got was 4. If I had developed a program using Turbo C that relied on the assertion that sizeof(int) is always two, it would not have ported properly to my new platform.
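To make that concrete, here is a minimal program (a sketch; the commented values assume a 16-bit DOS compiler versus 32-bit GCC on Linux):
#include <stdio.h>

int main(void)
{
    /* prints 2 with an old 16-bit compiler such as Turbo C, 4 with GCC on Linux */
    printf("sizeof(int)  = %u\n", (unsigned)sizeof(int));
    /* long is at least 32 bits everywhere, so code needing 32 bits could typedef it on the 16-bit target */
    printf("sizeof(long) = %u\n", (unsigned)sizeof(long));
    return 0;
}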
Hope it helps.
The following quote from K&R is not directly related to your query, but I have posted it too for the sake of completeness.
Second - to provide better documentation
The second purpose of typedefs is to provide better documentation for a program - a type called Treeptr may be easier to understand than one declared only as a pointer to a complicated structure.

Most of these patterns are bad practices that come from reading and copying existing bad code. Often they reflect misunderstandings about what C does or does not require.
1. typedef unsigned char uch; is akin to #define BEGIN { except it saves some typing instead of making for more.
2. typedef uch UBYTE; is akin to #define FALSE 0. If your idea of "byte" is the smallest addressable unit, char is a byte by definition. If your idea of "byte" is an octet, then either char is the octet type, or your machine has no octet type.
3. typedef unsigned long ulg; is really ugly shorthand for people who can't touch type...
4. typedef unsigned int u32; is a mistake. It should be typedef uint32_t u32; or better yet, uint32_t should just be used directly.
5. typedef signed short s16; is the same as 4. Replace uint32_t with int16_t.
Please put a "considered harmful" stamp on them all. typedef should be used when you really need to create a new type whose definition could change over the life cycle of your code or when the code is ported to different hardware, not because you think C would be "prettier" with different type names.

We use it to make things project/platform specific; everything follows a common naming convention:
pname_int32, pname_uint32, pname_uint8 -- pname is project/platform/module name
And some #defines
pname_malloc, pname_strlen
It is easier to read, it turns long data types like unsigned char into pname_uint8, and it makes the convention uniform across all modules.
When porting you only need to modify a single file, which makes porting easy.
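A sketch of what such a central header might look like (the underlying type choices shown are placeholders for one particular platform):
/* pname_types.h - hypothetical central header; the only file touched when porting */
#include <stdlib.h>
#include <string.h>

typedef unsigned char  pname_uint8;
typedef signed short   pname_int16;
typedef unsigned int   pname_uint32;     /* would become 'unsigned long' on a 16-bit target */

#define pname_malloc(n)  malloc(n)       /* could later be redirected to a custom allocator */
#define pname_strlen(s)  strlen(s)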

To cut a long story short,
you might want to do that to make your code portable (with less effort/editing).
This way you don't depend on int; instead you use, say, INTEGER, which can be whatever you want it to be.

All intN_t and uintN_t types, where N = 8, 16, 32, 64 and so forth, are defined per architecture in this exact manner. This is a direct consequence of the fact that the standard does not mandate that char, int, float, etc. have exactly N bits - that would be insane. Instead, the standard defines minimum and maximum values of each type as guarantees to the programmer, and on various architectures types may well exceed those boundaries. It is not an uncommon sight.
The typedefs in your post are used to define types of a certain length on a specific architecture. It's probably not the best choice of naming; u32 and s16 are a bit too short, in my opinion. Also, it's a bit risky to expose names like ulg and uch; one could prefix them with an application-specific string so that such generic names aren't exposed.
Hope this helps.

Related

Purpose of using UINT64_C?

I found this line in boost source:
const boost::uint64_t m = UINT64_C(0xc6a4a7935bd1e995);
I wonder what is the purpose of using a MACRO here?
All this one does is add ULL to the constant provided.
I assume it may be there to make it harder for people to make the mistake of typing UL instead of ULL, but I wonder if there is any other reason to use it.
If you look at boost/cstdint.hpp, you can see that the definition of the UINT64_C macro is different on different platforms and compilers.
On some platforms it's defined as value##uL, on others it's value##uLL, and on yet others it's value##ui64. It all depends on the size of unsigned long and unsigned long long on that platform or the presence of compiler-specific extensions.
I don't think using UINT64_C is actually necessary in that context, since the literal 0xc6a4a7935bd1e995 would already be interpreted as a 64-bit unsigned integer. It is necessary in some other contexts, though. For example, a literal like 0x00000000ffffffff would be interpreted as a 32-bit unsigned integer if it weren't explicitly marked as a 64-bit unsigned integer by using UINT64_C (though I think it would be promoted to uint64_t for the bitwise AND operation anyway).
In any case, explicitly declaring the size of literals where it matters serves a valuable role in code-clarity. Sometimes, even if an operation is perfectly well-defined by the language, it can be difficult for a human programmer to tell what types are involved. Saying it explicitly can make code easier to reason about, even if it doesn't directly alter the behavior of the program.
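Here is one context where the macro really does matter (a minimal sketch; the shift amount is arbitrary):
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint64_t bad  = 1 << 40;            /* 1 is a plain int: undefined behaviour when int is 32 bits */
    uint64_t good = UINT64_C(1) << 40;  /* the literal is 64 bits wide before the shift */
    printf("%llu %llu\n", (unsigned long long)bad, (unsigned long long)good);
    return 0;
}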

Why are the standard datatypes not used in Win32 API? [duplicate]

I have been learning Visual C++ Win32 programming for some time now.
Why are there the datatypes like DWORD, WCHAR, UINT etc. used instead of, say, unsigned long, char, unsigned int and so on?
I have to remember when to use WCHAR instead of const char *, and it is really annoying me.
Why aren't the standard datatypes used in the first place? Will it help if I memorize Win32 equivalents and use these for my own variables as well?
Yes, you should use the correct data types for the arguments of these functions, or you are likely to find yourself in trouble.
And the reason that these types are defined the way they are, rather than using int, char and so on is that it removes the "whatever the compiler thinks an int should be sized as" from the interface of the OS. Which is a very good thing, because if you use compiler A, or compiler B, or compiler C, they will all use the same types - only the library interface header file needs to do the right thing defining the types.
By defining types that are not standard types, it's easy to change int from 16 to 32 bit, for example. The first C/C++ compilers for Windows were using 16-bit integers. It was only in the mid to late 1990's that Windows got a 32-bit API, and up until that point, you were using int that was 16-bit. Imagine that you have a well-working program that uses several hundred int variables, and all of a sudden, you have to change ALL of those variables to something else... Wouldn't be very nice, right - especially as SOME of those variables DON'T need changing, because moving to a 32-bit int for some of your code won't make any difference, so no point in changing those bits.
It should be noted that WCHAR is NOT the same as const char - WCHAR is a "wide char" so wchar_t is the comparable type.
So, basically, the "define our own types" approach is a way to guarantee that it's possible to change the underlying compiler architecture without having to change (much of) the source code. All larger projects that do machine-dependent coding do this sort of thing.
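A sketch of the idea (these are NOT the real <windows.h> definitions; the names here are invented to illustrate the point):
/* hypothetical OS header: callers only ever see OS_DWORD, never the raw compiler type */
typedef unsigned long OS_DWORD;        /* 32 bits even with the 16-bit-int compilers of the day */
/* typedef unsigned int OS_DWORD; */   /* an equally valid 32-bit choice for a modern compiler */

OS_DWORD os_get_tick_count(void);      /* hypothetical API; its callers never change when the typedef does */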
The sizes and other characteristics of the built-in types such as int and long can vary from one compiler to another, usually depending on the underlying architecture of the system on which the code is running.
For example, on the 16-bit systems on which Windows was originally implemented, int was just 16 bits. On more modern systems, int is 32 bits.
Microsoft gets to define types like DWORD so that their sizes remain the same across different versions of their compiler, or of other compilers used to compile Windows code.
And the names are intended to reflect concepts on the underlying system, as defined by Microsoft. A DWORD is a "double word" (which, if I recall correctly, is 32 bits on Windows, even though a machine "word" is probably 32 or even 64 bits on modern systems).
It might have been better to use the fixed-width types defined in <stdint.h>, such as uint16_t and uint32_t -- but those were only introduced to the C language by the 1999 ISO C standard (which Microsoft's compiler doesn't fully support even today).
If you're writing code that interacts with the Win32 API, you should definitely use the types defined by that API. For code that doesn't interact with Win32, use whatever types you like, or whatever types are suggested by the interface you're using.
I think that it is a historical accident.
My theory is that the original Windows developers knew that the standard C type sizes depend on the compiler; that is, one compiler may have a 16-bit integer and another a 32-bit integer. So they decided to make the Windows API portable between different compilers using a series of typedefs: DWORD is a 32-bit unsigned integer, no matter what compiler/architecture you are using. Naturally, nowadays you would use uint32_t from <stdint.h>, but this wasn't available at that time.
Then, with the UNICODE thing, they got the TCHAR vs. CHAR vs. WCHAR issue, but that's another story.
And then it grew out of control, and you get such nice things as typedef void VOID, *PVOID; which are utter nonsense.

u_int32_t vs bpf_u_int32

I've been busy doing some network programming over the past couple of days and I can't seem to figure out the difference between the data types u_int32_t and bpf_u_int32.
u_int32_t means 32 unsigned bits. Doesn't bpf_u_int32 mean the same?
Because some functions read the IP address in one form or the other.
Some functions in the pcap library like pcap_lookupnet require the net address to be of the form bpf_u_int32.
I am curious to know the difference
Programmers add layers of indirection for a living. They're almost certainly the same type; you can check that in C++ with #include <typeinfo> followed by typeid(u_int32_t) == typeid(bpf_u_int32).
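Here is that check as a complete program (a sketch that assumes libpcap's development headers are installed; on most Unix-likes u_int32_t comes from <sys/types.h>):
#include <typeinfo>
#include <iostream>
#include <sys/types.h>   // u_int32_t on most Unix-like systems
#include <pcap/pcap.h>   // pulls in bpf_u_int32

int main()
{
    std::cout << std::boolalpha
              << (typeid(u_int32_t) == typeid(bpf_u_int32)) << '\n';   // almost certainly prints "true"
}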
On some implementations there's at least the possibility that one is unsigned int and the other is unsigned long.
What's happened is that two different people have independently chosen a name for a 32 bit unsigned type (or maybe the same person for two slightly different purposes). One of them has used a "bpf" prefix, which in this context stands for Berkeley Packet Filter since that's relevant to packet capture. The other one hasn't. One has used the _t suffix that indicates a type name, the other hasn't. Aside from that, they picked similar names.
C99 and C++11 both introduce a standard name for a 32 bit unsigned type: uint32_t. That won't stop people creating their own aliases for it, though.
Both types are most likely typedefs to a 32-bit unsigned type. As such, they can be considered equivalent and there is no useful difference between them.
Always check the type in the bpf.h file you are actually using. This is one bpf.h:
#ifdef MSDOS /* must be 32-bit */
typedef long bpf_int32;
typedef unsigned long bpf_u_int32;
#else
typedef int bpf_int32;
typedef u_int bpf_u_int32;
#endif

Forcing types to a specific size

I've been learning C++ and one thing that I'm not really comfortable with is the fact that datatype sizes are not consistent. Depending on what system something is deployed on an int could be 16 bits or 32 bits, etc.
So I was thinking it might be a good idea to make my own header file with data types like byte, word, etc. that are defined to be a specific size and will maintain that size on any platform.
Two questions. First is this a good idea? Or is it going to create other problems I'm not aware of? Second, how do you define a type as being, say, 8 bits? I can't just say #define BYTE char, cause char would vary across platforms.
Fortunately, other people have noticed this same problem. In C99 and C++11 (so set your compiler to compatibility with one of those two modes, there should be a switch in your compiler settings), they added the header stdint.h (for C) and cstdint (for C++). If you #include <cstdint>, you get the types int8_t, int16_t, int32_t, int64_t, and the same prefixed with a u for unsigned versions. If your platform supports those types, they will be defined in the header, along with several others.
If your compiler does not yet support that standard (or you are forced by reasons out of your control to remain on C++03), then there is also Boost.
However, you should only use this if you care exactly about the size of the type. int and unsigned are fine for throw-away variables in most cases. size_t should be used for indexing std::vector, etc.
First you need to figure out if you really care what sizes things are. If you are using an int to count the number of lines in a file, do you really care if it's 32-bit or 64? You need BYTE, WORD, etc if you are working with packed binary data, but generally not for any other reason. So you may be worrying over something that doesn't really matter.
Better yet, use the types already defined in stdint.h.
Example:
int32_t is always 32 bits.
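A minimal usage sketch (assumes a C++11 toolchain providing <cstdint>):
#include <cstdint>
#include <cstdio>

int main()
{
    std::uint8_t  flags   = 0x0F;                          // exactly 8 bits, where the platform provides such a type
    std::int32_t  counter = -42;                           // exactly 32 bits on every platform that defines it
    std::uint64_t mask    = UINT64_C(0xFFFF0000FFFF0000);  // exactly 64 bits

    static_assert(sizeof(std::int32_t) == 4, "int32_t is 32 bits by definition");
    std::printf("%u %d %llu\n", (unsigned)flags, (int)counter, (unsigned long long)mask);
    return 0;
}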
Many libraries have their own header with a lot of typedefs to provide fixed-size types. This is useful when writing portable code, and avoids relying on the headers of the platform you are currently working with.
If you only want to make sure the built-in data types have a minimum size, you can use std::numeric_limits from the <limits> header to check.
std::numeric_limits<int>::digits
will give you, for example, the number of bits of an int without the sign bit. And
std::numeric_limits<int>::max()
will give you the max value.
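For example (a minimal sketch):
#include <iostream>
#include <limits>

int main()
{
    std::cout << "value bits in int (sign bit excluded): "
              << std::numeric_limits<int>::digits << '\n'   // 31 for a 32-bit int
              << "largest int: "
              << std::numeric_limits<int>::max() << '\n';   // 2147483647 for a 32-bit int
    return 0;
}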

When does it make sense to typedef basic data types?

A company's internal C++ coding standards document states that even for basic data types like int, char, etc. one should define one's own typedefs, like "typedef int Int". This is justified by the advantage of portability of the code.
However, is there general advice about when (that is, for which kinds of projects) this really makes sense?
Thanks in advance..
Typedefing int to Int offers almost no advantage at all (it provides no semantic benefit, and leads to absurdities like typedef long Int on other platforms to remain compatible).
However, typedefing int to e.g. int32_t (along with long to int64_t, etc.) does offer an advantage, because you are now free to choose the data-type with the relevant width in a self-documenting way, and it will be portable (just switch the typedefs on a different platform).
In fact, most compilers offer a stdint.h which contains all of these definitions already.
That depends. The example you cite:
typedef int Int;
is just plain dumb. It's a bit like defining a constant:
const int five = 5;
Just as there is zero chance of the variable five ever becoming a different number, the typedef Int can only possibly refer to the primitive type int.
OTOH, a typedef like this:
typedef unsigned char byte;
makes life easier on the fingers (though it has no portability benefits), and one like this:
typedef unsigned long long uint64;
Is both easier to type and more portable, since, on Windows, you would write this instead (I think):
typedef unsigned __int64 uint64;
Rubbish.
"Portability" is nonsense, because int is always an int. If they think they want something like an integer type that's 32 bits wide, then the typedef should be typedef int int32_t;, because then you are naming a real invariant, and can actually ensure that this invariant holds, via the preprocessor etc.
But this is, of course, a waste of time, because you can use <cstdint>, either in C++0x, or by extensions, or use Boost's implementation of it anyway.
Typedefs can help describe the semantics of the data type. For instance, if you typedef float distance_t;, you're letting the developer in on how the values of distance_t will be interpreted. You might be saying that the values may never be negative - what is -1.23 kilometers? In this scenario, negative distances might simply not make sense.
Of course, a typedef does not in any way constrain the domain of the values. It is just a way to make code more readable (or at least it should be), and to convey extra information.
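For example (a sketch; the Graph type and shortest_path function are hypothetical, the typedef is the point):
struct Graph;                             // hypothetical graph type, forward-declared for the sketch
typedef float distance_t;                 // says "this float is a distance", not just "some float"

distance_t shortest_path(const Graph&);   // hypothetical API: reads better than returning a bare float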
The portability issue your workplace seems to be referring to arises when you want to ensure that a particular data type is always the same size, no matter which compiler is used. For instance:
#ifdef TURBO_C_COMPILER
typedef long int32;
#elif defined(MSVC_32_BIT_COMPILER)
typedef int int32;
#elif ...
...
#endif
typedef int Int is a dreadful idea... people will wonder if they're looking at C++, it's hard to type, visually distracting, and the only vaguely imaginable rationalisation for it is flawed, but let's put it out there explicitly so we can knock it down:
if one day say a 32-bit app is being ported to 64-bit, and there's lots of stupid code that only works for 32-bit ints, then at least the typedef can be changed to keep Int at 32 bits.
Critique: if the system is littered with code that's so badly written (i.e. not using an explicitly 32-bit type from cstdint), it's overwhelmingly likely to have other parts of the code where it will now need to be using 64-bit ints that will get stuck at 32 bits via the typedef. Code that interacts with library/system APIs using ints is likely to be given Ints, resulting in truncated handles that work until they happen to be outside the 32-bit range, etc. The code will need a complete re-examination before being trustworthy anyway. Having this justification floating around in people's minds can only discourage them from using explicitly-sized types where they are actually useful ("what are you doing that for?" "portability?" "but Int's for portability, just use that").
That said, the coding rules might be meant to encourage typedefs for things that are logically distinct types, such as temperatures, prices, speeds, distances etc.. In that case, typedefs can be vaguely useful in that they allow an easy way to recompile the program to say upgrade from float precision to double, downgrade from a real type to an integral one, or substitute a user-defined type with some special behaviours. It's quite handy for containers too, so that there's less work and less client impact if the container is changed, although such changes are usually a little painful anyway: the container APIs are designed to be a bit incompatible so that the important parts must be reexamined rather than compiling but not working or silently performing dramatically worse than before.
It's essential to remember though that a typedef is only an "alias" to the actual underlying type, and doesn't actually create a new distinct type, so people can pass any value of that same type without getting any kind of compiler warning about type mismatches. This can be worked around with a template such as:
template <typename T, int N>
struct Distinct
{
Distinct(const T& t) : t_(t) { }
operator T&() { return t_; }
operator const T&() const { return t_; }
T t_;
};
typedef Distinct<float, 42> Speed;
But, it's a pain to make the values of N unique... you can perhaps have a central enum listing the distinct values, or use __LINE__ if you're dealing with one translation unit and no multiple typedefs on a line, or take a const char* from __FILE__ as well, but there's no particularly elegant solution I'm aware of.
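To illustrate, a usage sketch of the template above (the second tag value and the function are invented):
typedef Distinct<float, 43> Distance;  // a second alias, distinct from Speed only by its tag

void set_cruise_speed(Speed) { /* hypothetical: would adjust an autopilot */ }

void demo()
{
    Distance d(120.0f);
    float raw = d;                     // converting back to the raw type is still implicit
    set_cruise_speed(Speed(raw));      // OK: explicit construction from float
    set_cruise_speed(35.0f);           // OK: a single implicit user-defined conversion (float -> Speed)
    // set_cruise_speed(d);            // error if uncommented: Distance -> float -> Speed needs two user-defined conversions
}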
(One classic article from 10 or 15 years ago demonstrated how you could create templates for types that knew of several orthogonal units, keeping counters of the current "power" in each, and adjusting the type as multiplications, divisions etc were performed. For example, you could declare something like Meters m; Time t; Acceleration a = m / t / t; and have it check all the units were sensible at compile time.)
Is this a good idea anyway? Most people clearly consider it overkill, as almost nobody ever does it. Still, it can be useful and I have used it on several occasions where it was easy and/or particularly dangerous if values were accidentally misassigned.
I suppose the main reason is portability of your code. For example, once you assume a 32-bit integer type in your program, you need to be sure that the other platform's int is also 32 bits long. A typedef in a header helps you localize the changes to your code in one place.
I would like to point out that it could also be used by people who speak a different language. For instance, if you speak Spanish and your code is all in Spanish, wouldn't you want the type names in Spanish too? Just something to consider.