Is there a more human-readable way to represent big numbers in the source code of an application written in C++ or C?
Let's take, for example, the number 2,345,879,444,641. In C or C++, if we wanted a program to return this number, we would write return 2345879444641.
But this is not really readable.
In PAWN (a scripting language), for example, I can write return 2_345_879_444_641 or even return 2_34_58_79_44_46_41, and both would return the number 2,345,879,444,641.
This is much more readable to the human eye.
Is there a C or C++ equivalent for this?
With a current compiler (C++14 or newer), you can use apostrophes, like:
auto a = 1'234'567;
If you're still stuck with C++11, you could use a user-defined literal to support something like: int i = "1_000_000"_i. The code would look something like this:
#include <iostream>
#include <string>
#include <cstdlib>
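int operator""_i(char const *in, std::size_t len) {
    std::string input(in, len);
    std::string::size_type pos; // size_type, so the comparison with npos is exact
    // Erase every underscore or comma before converting.
    while (std::string::npos != (pos = input.find_first_of("_,")))
        input.erase(pos, 1);
    return static_cast<int>(std::strtol(input.c_str(), NULL, 10));
}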
int main() {
std::cout << "1_000_000_000"_i;
}
As I've written it, this supports underscores or commas interchangeably, so you could use one or the other, or both. For example, "1,000_000" would turn out as 1000000.
Of course, Europeans would probably prefer "." instead of "," -- if so, feel free to modify as you see fit.
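If you want the value at compile time, the same idea can be sketched as a C++11 constexpr literal operator. This is only an illustration under my own assumptions: the names parse_digits and _n are hypothetical, and the helper simply skips every non-digit character, so "_", "," and "." all act as separators (it does no error checking at all).

```cpp
#include <cstddef>

// Hypothetical helper: folds the digits of s[0..n) into acc and skips
// every non-digit character, so '_', ',' and '.' all act as separators.
constexpr long long parse_digits(const char* s, std::size_t n, long long acc = 0) {
    return n == 0 ? acc
         : (*s >= '0' && *s <= '9')
               ? parse_digits(s + 1, n - 1, acc * 10 + (*s - '0'))
               : parse_digits(s + 1, n - 1, acc);
}

// Hypothetical suffix _n: a string literal operator evaluated at compile time.
constexpr long long operator""_n(const char* s, std::size_t n) {
    return parse_digits(s, n);
}

static_assert("1_000_000"_n == 1000000, "underscores are skipped");
static_assert("2.345.879.444.641"_n == 2345879444641LL, "dots are skipped too");
```

Returning long long rather than int lets the literal hold the 13-digit number from the question.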
With Boost.PP:
#define NUM(...) \
NUM_SEQ(BOOST_PP_VARIADIC_TO_SEQ(__VA_ARGS__))
#define NUM_SEQ(seq) \
BOOST_PP_SEQ_FOLD_LEFT(NUM_FOLD, BOOST_PP_SEQ_HEAD(seq), BOOST_PP_SEQ_TAIL(seq))
#define NUM_FOLD(_, acc, x) \
BOOST_PP_CAT(acc, x)
Usage:
NUM(123, 456, 789) // Expands to 123456789
Demo.
Another way is to write a UDL. That is left as an exercise (also because it requires more code).
Here's a macro that would do it, tested on both MSVC and GCC. No reliance on Boost...
#define NUM(...) NUM_(__VA_ARGS__, , , , , , , , , , )
#define NUM_(...) NUM_MSVCHACK((__VA_ARGS__))
#define NUM_MSVCHACK(numlist_) NUM__ numlist_
#define NUM__(a1_, a2_, a3_, a4_, a5_, a6_, a7_, a8_, ...) a1_##a2_##a3_##a4_##a5_##a6_##a7_##a8_
Use it like:
int y = NUM(1,2,3,4,5,6,7,8);
int x = NUM(100,460,694);
Produces:
int y = 12345678;
int x = 100460694;
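Two pitfalls of this token-pasting approach are worth checking before relying on it. The sketch below repeats the macro from above and verifies the behaviour with static_assert; the specific test values are my own. A group list that starts with 0 is pasted into an octal literal, and any group after the eighth is silently dropped.

```cpp
#define NUM(...) NUM_(__VA_ARGS__, , , , , , , , , , )
#define NUM_(...) NUM_MSVCHACK((__VA_ARGS__))
#define NUM_MSVCHACK(numlist_) NUM__ numlist_
#define NUM__(a1_, a2_, a3_, a4_, a5_, a6_, a7_, a8_, ...) a1_##a2_##a3_##a4_##a5_##a6_##a7_##a8_

// The groups are pasted into a single integer literal.
static_assert(NUM(100,460,694) == 100460694, "groups are concatenated");

// Pitfall 1: a leading 0 turns the result into an octal literal (017 == 15).
static_assert(NUM(0,17) == 15, "leading zero means octal");

// Pitfall 2: only the first eight groups are used; the ninth is ignored.
static_assert(NUM(1,2,3,4,5,6,7,8,9) == 12345678, "ninth group is dropped");
```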
In C++14 (formerly C++1y) you can use a single quote (') as a digit separator, based on N3781: Single-Quotation-Mark as a Digit Separator, which has been accepted. Both gcc and clang support this feature as part of their C++1y implementation.
So the following program (see it live for clang):
#include <iostream>
int main(){
std::cout << 2'345'879'444'641 << std::endl ;
}
will output:
2345879444641
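The separator is not limited to groups of three or to decimal literals. A small sketch, with variable names of my own choosing, assuming a C++14 compiler:

```cpp
#include <cstdint>

// The compiler ignores the separators, so any grouping gives the same value.
constexpr std::int64_t big       = 2'345'879'444'641;
constexpr std::int64_t pawnStyle = 2'34'58'79'44'46'41; // grouping from the question
static_assert(big == pawnStyle, "grouping does not change the value");

// Separators also work in hexadecimal, binary and floating-point literals.
constexpr std::uint32_t mask    = 0xFF'FF'00'00;
constexpr std::uint8_t  flags   = 0b1010'0101; // binary literals are C++14 as well
constexpr double        million = 1'000'000.0;
```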
You could use a preprocessor macro
#define BILLION (1000LL*1000*1000) /* long long, so 4*BILLION does not overflow a 32-bit int */
then code e.g. (4*BILLION); if you care about large powers of two, just use 1<<30
PS Notice that 1e6 is a double literal (same as 1.0e6)
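The types involved matter here, so a quick compile-time check may help. This is a sketch with a hypothetical constant named billion; it assumes a platform where long long is at least 64 bits (which the standard guarantees).

```cpp
#include <type_traits>

// Hypothetical constant: the LL suffix makes the whole product long long.
constexpr long long billion = 1000LL * 1000 * 1000;

static_assert(4 * billion == 4000000000LL, "does not fit in a 32-bit int, fits in long long");
static_assert((1LL << 30) == 1073741824LL, "2 to the power 30");

// Without a suffix the product is computed in int, and 1e6 is a double.
static_assert(std::is_same<decltype(1000 * 1000 * 1000), int>::value, "int arithmetic");
static_assert(std::is_same<decltype(1e6), double>::value, "1e6 is a double literal");
```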
And you could also:
patch the GCC lexer to accept 1_234_567 notation for number literals and publish that patch for conformance with GPLv3 and free software spirit.
(probably in file libcpp/lex.c and/or gcc/c-family/c-lex.c and/or gcc/cpp/lex.c of the future GCC 4.8, i.e. current trunk)
lobby the C & C++ standardization groups to get that accepted in future C or C++ standards.
Hi,
Can someone help me understand why the value of SQUARE(x) is 49?
I am using Visual C++ 6.0.
#include <stdio.h>
#define SQUARE(X) X * X
int main(int argc, char* argv[])
{
int y = 5;
printf("%d\n",SQUARE(++y));
return 0;
}
Neil Butterworth, Mark and Pavel are right.
SQUARE(++y) expands to ++y * ++y, which increments y twice.
Another problem you could encounter: SQUARE(a + b) expands to a + b * a + b which is not (a+b)*(a+b) but a + (b * a) + b. You should take care of adding parentheses around elements when needed while defining macros: #define SQUARE(X) ((X) * (X)) is a bit less risky. (Ian Kemp wrote it first in his comment)
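The precedence problem can be checked at compile time. In this sketch the macro names SQUARE_UNSAFE and SQUARE_SAFE are mine, used only to put both definitions side by side:

```cpp
// Unparenthesized version from the question.
#define SQUARE_UNSAFE(X) X * X
// Parenthesized version suggested above.
#define SQUARE_SAFE(X) ((X) * (X))

// 1 + 2 * 1 + 2 is evaluated as 1 + (2 * 1) + 2.
static_assert(SQUARE_UNSAFE(1 + 2) == 5, "textual substitution ignores grouping");
static_assert(SQUARE_SAFE(1 + 2) == 9, "parentheses restore the intended grouping");

// The outer parentheses matter too: 100 / 5 * 5 would be 100.
static_assert(100 / SQUARE_SAFE(5) == 4, "outer parentheses protect the whole product");
```

Note that the parentheses only fix precedence. SQUARE_SAFE(++y) still expands to two increments of y.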
You could instead use an inline template function (no less efficient at runtime) like this one:
template <class T>
inline T square(T value)
{
return value*value;
}
You can check it works:
int i = 2;
std::cout << square(++i) << " should be 9" << std::endl;
std::cout << square(++i) << " should be 16" << std::endl;
(there is no need to write
square<int>(++i)
because the template argument int is deduced from the type of i)
Because the macro expands to:
++y * ++y
which gives undefined behaviour in C++ - the result could be anything. This very well known problem should be covered in any decent textbook that covers the use of macros. Which one are you using?
Because macros do textual substitution, the code you wrote gets expanded to
printf("%d\n",++y * ++y );
and modifying y twice in one expression without a sequence point in between is undefined behaviour. Here the compiler apparently performs both increments first and then the multiplication, which gives 7 * 7 = 49.
So be careful with macros. It is better to use functions: since the compiler can expand them inline, they will not take any longer to run.
Secondly, don't assume what will happen if you increment and use a variable in the same expression.
Macros are not functions: they just alter the text of the program. This operation is called preprocessing and it's automatically executed before your code gets compiled. People write macros to save their time and introduce some variability to their source code.
When you write SQUARE(x), no actual function call happens; only the text is modified. The operation is quite dumb, so you have to take additional precautions in cases like yours. Refer to the other answers for an explanation of your case.
Let's say that I need to create a LUT containing precomputed bit count values (count of 1 bits in a number) for 0...255 values:
int CB_LUT[256] = {0, 1, 1, 2, ... 7, 8};
If I don't want to use hard-coded values, I can use the nice template solution from "How to count the number of set bits in a 32-bit integer?":
template <int BITS>
int CountBits(int val)
{
return (val & 0x1) + CountBits<BITS-1>(val >> 1);
}
template<>
int CountBits<1>(int val)
{
return val & 0x1;
}
int CB_LUT[256] = {CountBits<8>(0), CountBits<8>(1) ... CountBits<8>(255)};
This array is computed completely at compile time. Is there any way to avoid a long list, and generate such array using some kind of templates or even macros (sorry!), something like:
Generate(CB_LUT, 0, 255); // array declaration
...
cout << CB_LUT[255]; // should print 8
Notes. This question is not about counting 1 bits in a number; that is used just as an example. I want to generate such an array completely in the code, without using external code generators. The array must be generated at compile time.
Edit.
To overcome compiler limits, I found the following solution, based on Bartek Banachewicz's code:
#define MACRO(z,n,text) CountBits<8>(n)
int CB_LUT[] = {
BOOST_PP_ENUM(128, MACRO, _)
};
#undef MACRO
#define MACRO(z,n,text) CountBits<8>(n+128)
int CB_LUT2[] = {
BOOST_PP_ENUM(128, MACRO, _)
};
#undef MACRO
for(int i = 0; i < 256; ++i) // use only CB_LUT
{
cout << CB_LUT[i] << endl;
}
I know that this is possibly UB...
It would be fairly easy with macros using (recently re-discovered by me for my code) Boost.Preprocessor - I am not sure if it falls under "without using external code generators".
PP_ENUM version
Thanks to @TemplateRex for BOOST_PP_ENUM. As I said, I am not very experienced with PP yet :)
#include <boost/preprocessor/repetition/enum.hpp>
// with ENUM we don't need a comma at the end
#define MACRO(z,n,text) CountBits<8>(n)
int CB_LUT[256] = {
BOOST_PP_ENUM(256, MACRO, _)
};
#undef MACRO
The main difference with PP_ENUM is that it automatically adds the comma after each element and strips the last one.
PP_REPEAT version
#include <boost/preprocessor/repetition/repeat.hpp>
#define MACRO(z,n,data) CountBits<8>(n),
int CB_LUT[256] = {
BOOST_PP_REPEAT(256, MACRO, _)
};
#undef MACRO
Remarks
It's actually very straightforward and easy to use, though it's up to you to decide whether you will accept macros. I personally struggled a lot with Boost.MPL and template techniques, only to find PP solutions easy to read, short and powerful, especially for enumerations like these. An additional important advantage of PP over TMP is compilation time.
As for the comma at the end, all reasonable compilers should support it, but in case yours doesn't, simply change the number of repetitions to 255 and add the last case by hand.
You might also want to rename MACRO to something meaningful to avoid possible redefinitions.
I like to do it like this:
#define MYLIB_PP_COUNT_BITS(z, i, data) \
CountBits< 8 >(i)
int CB_LUT[] = {
BOOST_PP_ENUM(256, MYLIB_PP_COUNT_BITS, ~)
};
#undef MYLIB_PP_COUNT_BITS
The difference with BOOST_PP_REPEAT is that BOOST_PP_ENUM generates a comma-separated sequence of values, so there is no need to worry about commas and last-case behavior.
Furthermore, it is recommended to make your macros really loud and obnoxious by using a NAMESPACE_PP_FUNCTION naming scheme.
A small configuration detail is to omit the [256] in favor of [] in the array size, so that you can more easily modify it later.
Finally, I would recommend making your CountBits function template constexpr so that you can also initialize const arrays.
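For comparison, here is a sketch of what the constexpr route can look like without Boost.PP, assuming a C++14 compiler. It is a different technique from the macros above (a pack expansion over std::index_sequence), and the names count_bits and make_lut are hypothetical; only CB_LUT comes from the question.

```cpp
#include <array>
#include <cstddef>
#include <utility>

// Hypothetical constexpr bit counter: one recursion step per bit until v is 0.
constexpr int count_bits(unsigned v) {
    return v == 0 ? 0 : static_cast<int>(v & 1u) + count_bits(v >> 1);
}

// Hypothetical generator: expands the indices 0..N-1 into the initializer list.
template <std::size_t... I>
constexpr std::array<int, sizeof...(I)> make_lut(std::index_sequence<I...>) {
    return {{ count_bits(static_cast<unsigned>(I))... }};
}

// The whole table is a compile-time constant.
constexpr std::array<int, 256> CB_LUT = make_lut(std::make_index_sequence<256>{});

static_assert(CB_LUT[0] == 0, "no bits set in 0");
static_assert(CB_LUT[255] == 8, "all eight bits set in 255");
```

Changing the table size then only requires changing the two occurrences of 256.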
Now that we soon have user defined literals (UDL), in GCC 4.7 for example, I'm eagerly waiting for (physical) unit libraries (such as Boost.Units) using them to ease expression of literals such as 1+3i, 3m, 3meter or 13_meter. Has anybody written an extension to Boost.Units using UDL supporting this behaviour?
No one has come out with such an extension. Only gcc (and possibly IBM?) has UDL so it might be a while. I'm hoping some kind of units makes it into tr2 which is starting about now. If that happens I'm sure UDL for units will come up.
This works:
// ./bin/bin/g++ -std=c++0x -o units4 units4.cpp
#include <iostream>
#include <boost/units/io.hpp> // operator<< for quantities
#include <boost/units/unit.hpp>
#include <boost/units/quantity.hpp>
#include <boost/units/systems/si.hpp>
using namespace boost::units;
using namespace boost::units::si;
quantity<length, long double>
operator"" _m(long double x)
{ return x * meters; }
quantity<si::time, long double>
operator"" _s(long double x)
{ return x * seconds; }
int
main()
{
auto l = 66.6_m;
auto v = 2.5_m / 6.6_s;
std::cout << "l = " << l << std::endl;
std::cout << "v = " << v << std::endl;
}
I think it wouldn't be too hard to go through your favorite units and do this.
Concerning putting these in a library:
The literal operators are namespace scope functions. The competition for suffixes is going to get ugly. I would (if I were boost) have
namespace literals
{
...
}
Then Boost users can do
using namespace boost::units::literals;
along with all the other using decls you typically use. Then you won't get clobbered by std::tr2::units for example. Similarly if you roll your own.
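To make the namespace idea concrete without depending on Boost, here is a toy sketch. Everything in it (mylib, Meters, the demo namespace) is hypothetical and only stands in for a real units library:

```cpp
namespace mylib {

// Toy quantity type standing in for a real units library.
struct Meters {
    long double value;
};

// The literal operators live in their own nested namespace, so users opt in.
namespace literals {
constexpr Meters operator""_m(long double v) { return Meters{v}; }
// A separate overload is needed for integer literals such as 3_m.
constexpr Meters operator""_m(unsigned long long v) {
    return Meters{static_cast<long double>(v)};
}
} // namespace literals

} // namespace mylib

namespace demo {
using namespace mylib::literals; // brings in only the suffixes
constexpr mylib::Meters length = 66.6_m;
constexpr mylib::Meters whole  = 3_m;
} // namespace demo
```

Code that does not write the using-directive never sees _m, so two libraries can offer the same suffix without clashing.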
In my opinion there is not much gain in using literals for Boost.Units, because a more powerful syntax can still be achieved with existing capabilities.
In simple cases, literals look like the way to go, but you soon see that they are not very powerful.
For example, you still have to define literals for combined units: how do you express 1 m/s (one meter per second)?
Currently:
auto v = 1*si::meter/si::second; // yes, it is long
but with literals?
// fake code
using namespace boost::units::literals;
auto v = 1._m_over_s; // or 1._m/si::second; or 1._m/1_s // even worse
A better solution can be achieved with existing features. And this is what I do:
namespace boost{namespace units{namespace si{ //excuse me!
namespace abbreviations{
static const length m = si::meter;
static const time s = si::second;
// ...
}
}}} // thank you!
using namespace si::abbreviations;
auto v = 1.*m/s;
In the same way you can do: auto a = 1.*m/pow<2>(s); or extend the abbreviations more if you want (e.g. static const area m2 = pow<2>(si::meter);)
What else beyond this do you want?
Maybe a combined solution could be the way
auto v = 1._m/s; // _m is literal, /s is from si::abbreviation combined with operator/
but there will be so much redundant code and the gain is minimal (replace * by _ after the number.)
One drawback of my solution is that it pollutes the namespace with common one-letter names. I don't see a way around that except to add an underscore (at the beginning or the end of the abbreviation), as in 1.*m_/s_, but at least I can build real unit expressions.
Is there some way to do something like this in C++? It seems sizeof can't be used there for some reason.
#if sizeof(wchar_t) != 2
#error "wchar_t is expected to be a 16 bit type."
#endif
No, this can't be done because all macro expansion (#... things) is done in the pre-processor step which does not know anything about the types of the C++ code and even does not need to know anything about the language!
It just expands/checks the #... things and nothing else!
There are some other common errors, for example:
enum XY
{
MY_CONST = 7,
};
#if MY_CONST == 7
// This code will NEVER be compiled because the pre-processor does not know anything about your enum!
#endif //
You can only access and use things in #if that are defined via command line options to the compiler or via #define.
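What actually happens in the enum case is that the preprocessor replaces the unknown identifier MY_CONST with 0, so the condition becomes 0 == 7. A small sketch that checks this at compile time (the flag name seen_by_preprocessor is mine):

```cpp
enum XY
{
    MY_CONST = 7,
};

// In #if, an identifier that is not a macro is replaced by 0.
#if MY_CONST == 7
constexpr bool seen_by_preprocessor = true;
#else
constexpr bool seen_by_preprocessor = false;
#endif

static_assert(!seen_by_preprocessor, "the preprocessor saw 0 == 7");
static_assert(MY_CONST == 7, "the compiler, however, knows the enum value");
```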
The preprocessor works without knowing anything about types, even the built-in ones.
BTW, you can still do the check using a static_assert like feature (boost has one for instance, C++0X will have one).
Edit: C99 and C++0X have also WCHAR_MIN and WCHAR_MAX macros in <stdint.h>
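Both routes can be sketched side by side. The flag names below are hypothetical. The example only records whether wchar_t is 16 bits wide and cross-checks the two answers, so it compiles on any platform; the strict check from the question is shown as a comment.

```cpp
#include <climits> // CHAR_BIT
#include <cwchar>  // WCHAR_MAX

// Preprocessor route: WCHAR_MAX is a macro, so it can be tested in #if.
#if WCHAR_MAX == 0xFFFF || WCHAR_MAX == 0x7FFF
constexpr bool wchar_16bit_per_preprocessor = true;
#else
constexpr bool wchar_16bit_per_preprocessor = false;
#endif

// Compiler route: sizeof is available once real compilation starts.
constexpr bool wchar_16bit_per_sizeof = (sizeof(wchar_t) * CHAR_BIT == 16);

static_assert(wchar_16bit_per_preprocessor == wchar_16bit_per_sizeof,
              "both checks should agree");

// To reject other platforms outright, as the question wants:
// static_assert(sizeof(wchar_t) * CHAR_BIT == 16,
//               "wchar_t is expected to be a 16 bit type.");
```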
I think things like BOOST_STATIC_ASSERT could help.
Wouldn't you get basically what you want (compile error w/o the fancy message) by using a C_ASSERT?
#define C_ASSERT(e) typedef char __C_ASSERT__[(e)?1:-1]
sizeof is a compile-time operator, not a runtime function. You cannot use it in a preprocessor directive, and I don't think you can check the size of wchar_t during preprocessing. (see Edit 2)
Edit: As pointed out in comments, sizeof is evaluated at compile time in almost all cases. In C99, it is evaluated at run time when applied to a variable-length array.
Edit 2: You can do asserts at build time using the techniques described in this thread.
char _assert_wchar_t_is_16bit[ sizeof(wchar_t) == 2 ? 1 : -1];
I've developed some macros that will effectively allow you to use sizeof within a macro condition. They're in a header file that I've uploaded here (MIT license).
It will permit for code like this:
#include <iostream>
#include "SIZEOF_definitions.h"
//You can also use SIZEOF_UINT in place of SIZEOF(unsigned, int)
// and UINT_BIT in place of SIZEOF_BIT(unsigned, int)
#if SIZEOF(unsigned, int) == 4
int func() { return SIZEOF_BIT(unsigned, int); }
#elif SIZEOF(unsigned, int) == 8
int func() { return 2 * SIZEOF_BIT(unsigned, int); }
#endif
int main(int argc, char** argv) {
std::cout << SIZEOF(unsigned, long, int) << " chars, #bits = " << SIZEOF_BIT(unsigned, long, int) << '\n'
<< SIZEOF(unsigned, int) << " chars, #bits = " << SIZEOF_BIT(unsigned, int) << '\n'
<< SIZEOF(int) << " chars, #bits = " << SIZEOF_BIT(int) << '\n';
std::cout << func() << std::endl;
return 0;
}
Note the commas within SIZEOF(unsigned, long, int).