In all of the C++ style guides I have read, I never have seen any information about numerical literal suffixes (i.e. 3.14f, 0L, etc.).
Questions
Is there any style guide out there that talks about there usage, or is there a general convention?
I occasionally encounter the f suffix in graphics programming. Is there any trend on there usage in the type of programming domain?
The only established convention (somewhat established, anyway) of which I'm aware is to always use L rather than l, to avoid its being mistaken for a 1. Beyond that, it's pretty much a matter of using what you need when you need it.
Also note that C++ 11 allows user-defined literals with user-defined suffixes.
There is no general style guide that I've found. I use capital letters and I'm picky about using F for float literals and L for long double. I also use the appropriate suffixes for integral literals.
I assume you know what these suffixes mean: 3.14F is a float literal, 12.345 is a double literal, 6.6666L is a long double literal.
For integers: U is unsigned, L is long, LL is long long. Order between U and the Ls doesn't matter but I always put UL because I declare such variables unsigned long for example.
If you assign a variable of one type a literal of another type, or supply a numeric literal of one type for function argument of another type a cast must happen. Using the proper suffix avoids this and is useful along the same lines as static_cast is useful for calling out casts. Consistent usage of numeric literal suffixes is good style and avoids numeric surprises.
People differ on whether lower or upper case is best. Pick a style that looks good to you and be consistent.
The CERT C Coding Standard
recommends to use uppercase letters:
DCL16-C. Use "L," not "l," to indicate a long value
Lowercase letter l (ell) can easily be confused with the digit 1 (one). This can be particularly confusing when indicating that an integer literal constant is a long value. This recommendation is similar to DCL02-C. Use visually distinct identifiers.
Likewise, you should use uppercase LL rather than lowercase ll when indicating that an integer literal constant is a long long value.
MISRA C++ 2008 for the C++03 language states in rule M2-13-3 (at least, as cited by this Autosar document) that
A “U” suffix shall
be applied to all octal or hexadecimal
integer literals of unsigned type.
The linked document also compares to JSF-AV 2005 and HIC++v4.0, all these four standards require the suffixes to be uppercase.
Nevertheless I can't find a rule (but I don't have a hardcopy of MISRA C++ at hand) that states that the suffixes shall be used whenever needed. However, IIRC there is one in MISRA C++ (or maybe was it just my former company coding guidelines…)
Web search for "c++ numeric suffixes" returns:
http://cpp.comsci.us/etymology/literals.html
http://www.cplusplus.com/forum/general/27226/
http://bytes.com/topic/c/answers/758563-numeric-constants
Are these what you're looking for?
Related
I didn't know that C and C++ allow multicharacter literal: not 'c' (of type int in C and char in C++), but 'tralivali' (of type int!)
enum
{
ActionLeft = 'left',
ActionRight = 'right',
ActionForward = 'forward',
ActionBackward = 'backward'
};
Standard says:
C99 6.4.4.4p10: "The value of an
integer character constant containing
more than one character (e.g., 'ab'),
or containing a character or escape
sequence that does not map to a
single-byte execution character, is
implementation-defined."
I found they are widely used in C4 engine. But I suppose they are not safe when we are talking about platform-independend serialization. Thay can be confusing also because look like strings. So what is multicharacter literal's scope of usage, are they useful for something? Are they in C++ just for compatibility with C code? Are they considered to be a bad feature as goto operator or not?
It makes it easier to pick out values in a memory dump.
Example:
enum state { waiting, running, stopped };
vs.
enum state { waiting = 'wait', running = 'run.', stopped = 'stop' };
a memory dump after the following statement:
s = stopped;
might look like:
00 00 00 02 . . . .
in the first case, vs:
73 74 6F 70 s t o p
using multicharacter literals. (of course whether it says 'stop' or 'pots' depends on byte ordering)
I don't know how extensively this is used, but "implementation-defined" is a big red-flag to me. As far as I know, this could mean that the implementation could choose to ignore your character designations and just assign normal incrementing values if it wanted. It may do something "nicer", but you can't rely on that behavior across compilers (or even compiler versions). At least "goto" has predictable (if undesirable) behavior...
That's my 2c, anyway.
Edit: on "implementation-defined":
From Bjarne Stroustrup's C++ Glossary:
implementation defined - an aspect of C++'s semantics that is defined for each implementation rather than specified in the standard for every implementation. An example is the size of an int (which must be at least 16 bits but can be longer). Avoid implementation defined behavior whenever possible. See also: undefined. TC++PL C.2.
also...
undefined - an aspect of C++'s semantics for which no reasonable behavior is required. An example is dereferencing a pointer with the value zero. Avoid undefined behavior. See also: implementation defined. TC++PL C.2.
I believe this means the comment is correct: it should at least compile, although anything beyond that is not specified. Note the advice in the definition, also.
Four character literals, I've seen and used. They map to 4 bytes = one 32 bit word. It's very useful for debugging purposes as said above. They can be used in a switch/case statement with ints, which is nice.
This (4 Chars) is pretty standard (ie supported by GCC and VC++ at least), although results (actual values compiled) may vary from one implementation to another.
But over 4 chars? I wouldn't use.
UPDATE: From the C4 page: "For our simple actions, we'll just provide an enumeration of some values, which is done in C4 by specifying four-character constants". So they are using 4 chars literals, as was my case.
Multicharacter literals allow one to specify int values via the equivalent representation in characters. Useful for enums, FourCC codes and tags, and non-type template parameters. With a multicharacter literal, a FourCC code can be typed directly into the source, which is handy.
The implementation in gcc is described at https://gcc.gnu.org/onlinedocs/cpp/Implementation-defined-behavior.html . Note that the value is truncated to the size of the type int, so 'efgh' == 'abcdefgh' if your ints are 4 chars wide, although gcc will issue a warning on the literal that overflows.
Unfortunately, gcc will issue a warning on all multi-character literals if -pedantic is passed, as their behavior is implementation-defined. As you can see above, it is perhaps possible for equality of two multi-character literals to change if you switch implementations.
In C++14 specification draft N4527 section 2.13.3, entry 2:
... An ordinary character literal that contains more than one c-char is a multicharacter literal. A multicharacter literal, or an ordinary character literal containing a single c-char not representable in the execution character set, is conditionally-supported, has type int, and has an implementation-defined value.
Previous answers to your question pertained mostly on real machines that did support multicharacter literals. Specifically, on platforms where int is 4 bytes, four-byte multicharacter is fine and can be used for convenience, as per Ferrucio's mem dump example. But, as there is no guarantee that this will ever work or work the same way on other platforms, use of multicharacter literals should be deprecated for portable programs.
unbelievable, every compiler I know places the first character of a UINT defined as 4-character constant in the low significant byte (little indian) --- but Visual C does it in opposite direction 🙄
// file signature
#define SFKFILE_SIGNATURE 'SFPK' (S=53)
// check header
if (out_FileHdr->Signature != SFKFILE_SIGNATURE)
fails on VC:
Borland: 4B504653 4B504653
Watcom: 4B504653 4B504653
VisualC: 4B504653 5346504B
I am currently reading about constants on the c++ tutorial from TutorialsPoint and, where it says:
Constants refer to fixed values that the program may not alter and they are called literals.
(Source)
I do not really get this. If constants are called literals and literals are data represented directly in the code, how can constants be considered as literals? I mean variables preceded with the const keyword are constants, but they are not literals, so how can you say that constants are literals?
Here:
const int MEANING = 42;
the value MEANING is a constant, 42 is a literal. There is no real relationship between the two terms, as can be seen here:
int n = 42;
where n is not a constant, but 42 is still a literal.
The major difference is that a constant may have an address in memory (if you write some code that needs such an address), whereas a literal never has an address.
I disagree with the claim "...There wasn't a thing called const in C originally so this was fine." const is actually one of the 32 C keywords. Google to see.
With that rested, I think the man missed something at TP. To be fair to them at Tutorials Point, they had an article that explained the difference thus (full quote, verbatim):
https://www.tutorialspoint.com/questions/category/Cplusplus
A literal is a value that is expressed as itself. For example, the number 25 or the string "Hello World" are both literals.
A constant is a data type that substitutes a literal. Constants are used when a specific, unchanging value is used various times during the program. For example, if you have a constant named PI that you'll be using at various places in your program to find the area, circumference, etc of a circle, this is a constant as you'll be reusing its value. But when you'll be declaring it as:
const float PI = 3.141;
The 3.141 is a literal that you're using. It doesn't have any memory address of its own and just sits in the source code.
Pls don't disparage those fellows doing what you call "random tutorials". Kids from poorer homes and less developed world can't afford your " good C++ textbooks " e.g. Scott Myers Effective C++ It is these online free tutorials they can have, and most of these tutorials do better explaining than the "good books".
By any means read them guys. Get confused some then come over here to StackOveflow or Quora to have your confusion cleared. Happy coding guys.
The author of the article is confused, and spreading that confusion to others (including you).
In C, literals are "constants". There wasn't a thing called const in C originally so this was fine.
C++ is a different language. In C++, literals are called "literals", and "constant" has a few meanings but generally is a const thing. The two concepts are different (although both kinds of things cannot be mutated after initial creation). We also have compile-time constants via constexpr which is yet another thing.
In general, read a good book rather than random tutorials written by randomers on the internet!
While the first part of the statement makes sense
Constants refer to fixed values that the program may not alter
the continuation
and they are called literals
is not really true.
Neil has already explained the semantical difference between the literal and the constant in his answer. But I would also like to add that the values of constant variables in C++ are not necessarily known at compile time.
// x might be obtained at runtime
// for instance, from the user input
void print_square(int x)
{
const int square = x*x;
std::cout << square << '\n';
}
Literals are values that are known at compile-time, which allows the compiler to put them to a separate read-only address space in the resulting binaries.
You can also enforce your variables to be known at compile-time by applying constexpr keyword (C++11).
constexpr int meaning = 42;
P.S. And I also do agree with a comment suggesting to use a good book instead of tutorialspoint.
If constants are called literals and literals are data represented directly in the code, how can constants be considered as literals?
The article from which you drew the quote is defining the word "constant" to be a synonym of "literal". The latter is the C++ standard's term for what it is describing. The former is what the C standard uses for the same concept.
I mean variables preceded with the const keyword are constants, but they are not literals, so how can you say that constants are literals?
And there you are providing an alternative definition for the term "constant", which, you are right, is inconsistent with the other. That's all. TP is using a different definition of the term than the one you are used to.
In truth, although the noun usage of "constant" appears in a couple of places in the C++ standard outside the defined term "null pointer constant", apparently with the meaning you propose here, I do not find an actual definition of that term, and especially not one matching yours. In truth, your definition is less plausible than TutorialPoint's, because an expression having const-qualified type can nevertheless designate an object that is modifiable (via a different expression).
Constant is simply a variable declared constant by keyword 'const' whose value after being declared shouldn't be altered during the course of the program (and if tried to alter it will result in an error).
On the other hand, literal is simply what is used and represented as it is typed in. For example, 25 when used in an expression (x+4*y+25) will be termed as literal.
Whenever we use String values or directly supply it in double quotes ("hello"), then that value in double quotes is called literal.
For example, printf("This is literal");
And if you are assigning a string value to a variable then thereafter you will refer to the variable (which could be declared constant if desired) and not exclusively to the value you have stored in it, i.e., only till the point you are supplying a value (string type of any other type) to the variable, the value is referred to as literal value, after that the variable is talked about whenever referring that value.
Once again, the value(25) in expression : x+4*y+25 is literal.
The value(4) in the term 4*y is also a literal (since it is exactly as we see it and is known to compiler beforehand).
--> The value(4) in the term 4*y is called numerical coefficient in algebraic terms and y is called literal coefficient in algebraic terms.
Hence,
All the above explanation I have given is in computer terms only. The meaning of literals and constants in Algebra are somewhat different than used in computer terms.
"Constants refer to fixed values that the program may not alter and they are called literals. (Source)"
The sentence construction is weird which is leading to the confusion.
Here, the the "they" that are referring to are the the fixed values and not constants. I would phrase it as "Constants refer to fixed values, that the program may not alter, called literals." which is less confusing I hope.
Constants are variables that can't vary, whereas Literals are literally numbers/letters that indicate the value of a variable or constant.
I can explain it this way.
Basically, constants are variables whose value cannot change.
Literals are notations that represent fixed values. These values can be Strings numbers etc
Literals can be assigned to variables
Code :
var a = 10;
var name = "Simba";
const pi = 3.14;
Here a and name are variables. pi is a constant. ( Constants are those variables whose value doesn't change. )
Here 10, "Simba" and 3.14 are literals.
I am in a C++ class right now so this question will concern itself primarily with that language, though I haven't been able to find any information for any other language either and I have a feeling whatever the answer is it's probably largely cross language.
In C++ unmarked numbers are assumed to be of integral type ('4', for example, is an integer)
various bounding marks allow for the number to be interpreted differently (''4'', for example, is a character, '"4"' a string).
As far as I know, there is only one kind of unary mark: the decimal point. ('4.' is a double).
I would like to create a new unary mark that designates a constant number in the code to be interpreted as a member of a created datatype. More fundamentally, I would like to know what the '.' and ',' and '"', and ''' are (they aren't operators, keywords, or statements, so what are they?) and how the compiler deals with/interprets them.
More information, if you feel it is necessary:
I am trying to make a complex number header that I can include in any project to do complex math. I am aware of the library but it is, IMHO, ugly and if used extensively slows down coding time. Also I'm mostly trying to code this to improve my programming skills. My goal is to be able to declare a complex variable by doing something of the form cmplx num1= 3 + 4i; where '3' and '4' are arbitrary and 'i' is a mark similar to the decimal point which indicates '4' as imaginary.
I would like to create a new unary mark that designates a constant number in the code to be interpreted as a member of a created datatype.
You can use user defined literals that were introduced in C++11. As an example, assuming you have a class type Type and you want to use the num_y syntax, where num is a NumericType, you can do:
Type operator"" _y(NumericType i) {
return Type(i);
}
Live demo
Things like 4, "4" and 4. are all single tokens,
indivisible. There's no way you can add new tokens to the
language. In C++11, it is possible to define user defined
literals, but they still consist of several tokens; for complex,
a much more natural solution would be to support a constant i,
to allow writing things like 4 + 3*i. (But you'd still need
the C++11 support for constexpr for it to be a compile time
constant.)
Is the definition of enums in C++ and setting values in random order a valid? Is it used in any well-known code?
e.g.:
enum ESampleType{ STPositiveControl='P', STNegativeControl='N', STSample='S' };
VS2008 compiles without warnings. I wonder whether gcc would.
In my opinion, it a least harms "least surprise rule" as iterating over all values using
for(int i=STPositiveControl; i<=STSample; ++i)
would fail.
P.S: The rationale for this approach:
In my DB application I'm defining wrapper methods. Some columns contain "constants" encoded as
single char. I'm trying to make it a little more type safe.
It's a standard and widely used feature. Most of the time (but
not always) it will be used to create bitmaps, but it can also
be used as you show, to give "printable" values to the enum
contants.
It shouldn't surprise anyone who knows C++, as it is widely
used.
Yes it is valid.
There is an example use of it in the book The C++ Programming Language (3rd Edition) by Bjarne Stroustrup, section "6.1 A Desk Calculator [expr.calculator]" (and more precisely "6.1.1 The Parser [expr.parser]"), for the parser code of a simple arithmetic calculator. Here's an excerpt:
The parser uses a function get_token() to get input. The value of the most recent call of get_token() can be found in the global variable curr_tok. The type of curr_tok is the enumeration Token_value:
enum Token_value {
NAME, NUMBER, END,
PLUS='+', MINUS='-', MUL='*', DIV='/',
PRINT=';', ASSIGN='=', LP='(', RP=')'
};
Token_value curr_tok = PRINT;
Representing each token by the integer value of its character is convenient and efficient and can be a help to people using debuggers. This works as long as no character used as input has a value used as an enumerator – and no character set I know of has a printing character with a single-digit integer value. (...)
(That last sentence is specific to this example, because the enumeration mixes "default-value" and "explicit-value" enumerators and wants each one to be unique.)
However it's only an educative example (and notably it uses a global variable, and CAPS names for enumerators when you should reserve them for macros (but Stroustrup doesn't like macros :p)).
Now, indeed you can't iterate over it (at least with a plain for loop; but see this question). (And as James Kanze pointed, an enum's values are not always ordered, contiguous, and unique.)
Visual Studio 2008
Project compiled as multibyte character set
LPWSTR lpName[1] = {(WCHAR*)_T("Setup")};
After this conversion, lpName[0] contains garbage (at least when previewed in VS)
LPWSTR is typedef'd as follows:
typedef __nullterminated WCHAR *NWPSTR, *LPWSTR, *PWSTR;
It's an expanded version of my comment above.
The code shown casts a pointer of type A to a pointer of type B. This is a low-vevel, machine-dependent operation. It almost never works as a conversion of an object of type A to an object of type B, especially if one type is a regular character type and the other is wide characters.
Imagine that you take a French book, and read it aloud as if it was written in English.
FRENCH* book;
readaloud ((ENGLISH*) book);
You will mostly hear gibberish. The letters used in the two languages are the same (or similar, at any rate), but the rules of the two languages are are totally different. The representation is the same for both languages, but the meaning is not.
This is very similar to what we have here. Whatever type you have, bits and bytes are the same, but the rules are totally different. You take bits laid out according to regular character rules, and try to interpret them according to wide character rules. It doesn't work. The representation is the same in both cases, but the meaning is not.
To convert one character flavor to another, you in general need a lookup table or some other means to convert each character from one type to the other — change representation, but keep the meaning. Likewise, to convert a French book into an English book, you need to use a big lookup table a.k.a. dictionary... well, the analogy breaks here, because there's no formal set of conversion rules, you need to be creative! But you get the idea.
The rules of C++ actually prohibit such casts. You can only cast an object type poiner to void*, and only use the result to cast it back to the original object type. Everything else is a no-no (unless you are willing to venture in the realm of undefined behavior).
So what should you do?
Pick one character variant and stick to it.
If you must convert between flavors, do so with a library function.
Try to avoid pointer casts, they almost always signal trouble.
I think what you're looking for is
LPTSTR lpName[1] = {_T("Setup")};
The various typedefs with a T in them (e.g. TSTR, LPTSTR) are dependant on whether you use unicode or multi-byte or whatever else. By using these, you should be able to write code that work in whatever encoding you are using (i.e., tomorrow you could switch to ascii, and a large portion of your code should still work).
Edit
If you are in situation where you really must convert between encodings, then there are various conversion functions available, such as wcstombs (or microsoft's documentation) and mbstowcs. These are defined in <cstdlib>