Preprocessor: convert macro parameter to multicharacter literal - c++

In a nutshell:
It's possible to convert a macro parameter to a character string literal (one that contains the spelling of the preprocessing token sequence for the corresponding
argument, per 16.3.2 of n3797) with the # operator. Is there a way, perhaps some trick, to convert a macro parameter to a multibyte character?
So, for example, given some macro TO_CHAR(WHATEVER), TO_CHAR(CONSTANT) would give me the 'CONSTANT' multibyte character?
I need this to work with boost::mpl::string. More precisely:
I have a lot of constants and I work with them through boost::mpl::vector (boost::mpl::vector_c). Now I need to print out the name of a constant after doing some work with it. Pseudo-pseudo code:
struct Temp
{
template<typename Value>
void operator()(Value) const
{
// ...
std::cout << "The value of " << #Value::value Name << " is " << Value::value; // pseudocode: "#Value::value Name" stands for the constant's name
}
};
typedef boost::mpl::vector_c<int, MY_CONST1, MY_CONST2> Constants;
boost::mpl::for_each<Constants>(Temp());
The output must be something like this:
The value of MY_CONST1 is 1
The value of MY_CONST2 is 2
I don't want to make some runtime array with a lot of typing:
typedef boost::mpl::vector_c<int, MY_CONST1, MY_CONST2> Constants;
const char* constant_names[] = { "MY_CONST1", "MY_CONST2" };
// Work with this stuff
What I want to have is something like this:
#define MAP_CONST(CONST) boost::mpl::pair<boost::mpl::int_<CONST>, (boost::mpl::string maybe here) #CONST>
typedef boost::mpl::vector<MAP_CONST(MY_CONST1), MAP_CONST(MY_CONST2)> Constants;
// Work with #Constants
Btw, I can't use constexpr or the new standard in general - I have a C++03 compiler.
Thanks
UPD: Sorry for the mix-up - I don't mean a multibyte character, but a "multicharacter literal" (see 2.14.3 in n3797). Something like this

Related

Creating binary (custom length) string in C++ [duplicate]

If I want to construct a std::string with a line like:
std::string my_string("a\0b");
Where I want to have three characters in the resulting string (a, null, b), I only get one. What is the proper syntax?
Since C++14
we have been able to create literal std::string
#include <iostream>
#include <string>
int main()
{
using namespace std::string_literals;
std::string s = "pl-\0-op"s; // <- Notice the "s" at the end
// This is a std::string literal not
// a C-String literal.
std::cout << s << "\n";
}
Before C++14
The problem is the std::string constructor that takes a const char* assumes the input is a C-string. C-strings are \0 terminated and thus parsing stops when it reaches the \0 character.
To compensate for this, you need to use the constructor that builds the string from a char array (not a C-String). This takes two parameters - a pointer to the array and a length:
std::string x("pq\0rs"); // Two characters because the input is assumed to be a C-string
std::string y("pq\0rs", 5); // Five characters because the input is now a char array of length 5
Note: a C++ std::string is NOT \0-delimited (contrary to what other posts suggest): its length is stored separately, so it can hold embedded \0 characters. However, you can extract a pointer to an internal buffer that contains a C-string with the method c_str().
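To make the difference between the two constructors concrete, here is a minimal sketch. The function names from_c_string and from_char_array are made up for illustration; only the two std::string constructors come from the answer above.

```cpp
#include <string>

// const char* constructor: treats the literal as a C-string and stops at '\0'.
inline std::string from_c_string() {
    return std::string("pq\0rs");
}

// (pointer, length) constructor: copies exactly 5 chars, embedded '\0' included.
inline std::string from_char_array() {
    return std::string("pq\0rs", 5);
}
```

Calling size() on the two results should give 2 and 5 respectively, and the second string holds '\0' at index 2.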
Also check out Doug T's answer below about using a vector<char>.
Also check out RiaD for a C++14 solution.
If you are doing manipulation like you would with a c-style string (array of chars) consider using
std::vector<char>
You have more freedom to treat it like an array in the same manner you would treat a c-string. You can use copy() to copy into a string:
std::vector<char> vec(100);
strncpy(&vec[0], "blah blah blah", 100);
std::string vecAsStr(vec.begin(), vec.end());
and you can use it in many of the same places you can use C-strings:
printf("%s", &vec[0]);
vec[10] = '\0';
vec[11] = 'b';
Naturally, however, you suffer from the same problems as with C-strings. You may forget your null terminator or write past the allocated space.
I have no idea why you'd want to do such a thing, but try this:
std::string my_string("a\0b", 3);
What new capabilities do user-defined literals add to C++? presents an elegant answer: Define
std::string operator "" _s(const char* str, size_t n)
{
return std::string(str, n);
}
then you can create your string this way:
std::string my_string("a\0b"_s);
or even so:
auto my_string = "a\0b"_s;
There's an "old style" way:
#define S(s) s, sizeof s - 1 // trailing NUL does not belong to the string
then you can define
std::string my_string(S("a\0b"));
The following will work...
std::string s;
s.push_back('a');
s.push_back('\0');
s.push_back('b');
You'll have to be careful with this. If you replace 'b' with any numeric character, you will silently create the wrong string using most methods. See: Rules for C++ string literals escape character.
For example, I dropped this innocent looking snippet in the middle of a program
// Create '\0' followed by '0' 40 times ;)
std::string str("\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00", 80);
std::cerr << "Entering loop.\n";
for (char & c : str) {
std::cerr << c;
// 'Q' is way cooler than '\0' or '0'
c = 'Q';
}
std::cerr << "\n";
for (char & c : str) {
std::cerr << c;
}
std::cerr << "\n";
Here is what this program output for me:
Entering loop.
Entering loop.
vector::_M_emplace_ba
QQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQ
That was my first print statement twice, several non-printing characters, followed by a newline, followed by something in internal memory, which I just overwrote (and then printed, showing that it has been overwritten). Worst of all, even compiling this with thorough and verbose gcc warnings gave me no indication that anything was wrong, and running the program through valgrind didn't report any improper memory access. In other words, neither of the tools I tried detected it.
You can get this same problem with the much simpler std::string("0", 100);, but the example above is a little trickier, and thus harder to see what's wrong.
Fortunately, C++11 gives us a good solution to the problem using initializer list syntax. This saves you from having to specify the number of characters (which, as I showed above, you can do incorrectly), and avoids combining escaped numbers. std::string str({'a', '\0', 'b'}) is safe for any string content, unlike versions that take an array of char and a size.
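The digit pitfall and the initializer-list fix can be checked at a small scale. This is only a sketch (C++11 or later); nul_then_digit is a hypothetical helper name, and the two static_asserts just record how the compiler reads each literal.

```cpp
#include <string>

// "a\01" holds 'a', the single character '\01', and the terminating NUL:
// the octal escape swallowed the digit.
static_assert(sizeof("a\01") == 3, "octal escape consumed the digit");

// Splitting the literal ends the escape, so this holds 'a', '\0', '1', NUL.
static_assert(sizeof("a\0" "1") == 4, "adjacent literals keep the digit");

// The initializer list carries its own length, so there is no size
// argument to get wrong and no escape sequence to misread.
inline std::string nul_then_digit() {
    return std::string({'a', '\0', '1'});
}
```

The returned string has size 3 and ends in the character '1', which is what the (pointer, size) versions silently get wrong when the escape runs into the digit.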
In C++14 you now may use literals
using namespace std::literals::string_literals;
std::string s = "a\0b"s;
std::cout << s.size(); // 3
Better to use std::vector<char> if this question isn't just for educational purposes.
anonym's answer is excellent, but there's a non-macro solution in C++98 as well:
template <size_t N>
std::string RawString(const char (&ch)[N])
{
return std::string(ch, N-1); // Again, exclude trailing `null`
}
With this function, RawString(/* literal */) will produce the same string as S(/* literal */):
std::string my_string_t(RawString("a\0b"));
std::string my_string_m(S("a\0b"));
std::cout << "Using template: " << my_string_t << std::endl;
std::cout << "Using macro: " << my_string_m << std::endl;
Additionally, there's an issue with the macro: the expression is not actually a std::string as written, and therefore can't be used e.g. for simple assignment-initialization:
std::string s = S("a\0b"); // ERROR!
...so it might be preferable to use:
#define S(s) std::string(s, sizeof s - 1)
Obviously you should only use one or the other solution in your project and call it whatever you think is appropriate.
I know this question was asked a long time ago, but anyone who has a similar problem might be interested in the following code:
CComBSTR(20,"mystring1\0mystring2\0")
Almost all implementations of std::strings are null-terminated, so you probably shouldn't do this. Note that "a\0b" is actually four characters long because of the automatic null terminator (a, null, b, null). If you really want to do this and break std::string's contract, you can do:
std::string s("aab");
s.at(1) = '\0';
but if you do, all your friends will laugh at you, you will never find true happiness.

How to cleanly use: const char* and std::string?

tl;dr
How can I concatenate const char* with std::string neatly and
elegantly, without multiple function calls? Ideally in one function
call, with the output being a const char*. If this is impossible, what
is an optimal solution?
Initial Problem
The biggest barrier I have experienced with C++ so far is how it handles strings. In my opinion, of all the widely used languages, it handles strings the most poorly. I've seen other questions similar to this that either have an answer saying "use std::string" or simply point out that one of the options is going to be best for your situation.
However, this is useless advice when trying to use strings dynamically, the way they are used in other languages. I cannot guarantee that I will always be able to use std::string, and when I have to use const char* I hit the obvious wall of "it's constant, you can't concatenate it".
Every solution to any string manipulation problem I've seen in C++ requires repetitive multiple lines of code that only work well for that format of string.
I want to be able to concatenate any set of characters with the + symbol or make use of a simple format() function just how I can in C# or Python. Why is there no easy option?
Current Situation
Standard Output
I'm writing a DLL and so far I've been outputting text to cout via the << operator. Everything has been going fine so far using simple char arrays in the form:
cout << "Hello world!";
Runtime Strings
Now I've come to the point where I want to construct a string at runtime and store it in a class. This class will hold a string that reports on some errors, so that they can be picked up by other classes and maybe sent to cout later. The string will be set by the function SetReport(const char* report). I really don't want to use more than one line for this, so I go ahead and write something like:
SetReport("Failure in " + __FUNCTION__ + ": foobar was " + foobar + "\n"); // __FUNCTION__ gets the name of the current function, foobar is some variable
Immediately of course I get:
expression must have integral or unscoped enum type and...
'+': cannot add two pointers
Ugly Strings
Right. So I'm trying to add two or more const char*s together and this just isn't an option. So I find that the main suggestion here is to use std::string, sort of weird that typing "Hello world!" doesn't just give you one of those in the first place but let's give it a go:
SetReport(std::string("Failure in ") + std::string(__FUNCTION__) + std::string(": foobar was ") + std::to_string(foobar) + std::string("\n"));
Brilliant! It works! But look how ugly that is!! That's some of the ugliest code I've ever seen. We can simplify to this:
SetReport(std::string("Failure in ") + __FUNCTION__ + ": foobar was " + std::to_string(foobar) + "\n");
Still possibly the worst way I've ever encountered of getting to a simple one-line string concatenation, but everything should be fine now, right?
Convert Back To Constant
Well no, if you're working on a DLL, something that I tend to do a lot because I like to unit test so I need my C++ code to be imported by the unit test library, you will find that when you try to set that report string to a member variable of a class as a std::string the compiler throws a warning saying:
warning C4251: class 'std::basic_string<_Elem,_Traits,_Alloc>' needs to have dll-interface to be used by clients of class'
The only real solution to this problem that I've found other than "ignore the warning"(bad practice!) is to use const char* for the member variable rather than std::string but this is not really a solution, because now you have to convert your ugly concatenated (but dynamic) string back to the const char array you need. But you can't just tag .c_str() on the end (even though why would you want to because this concatenation is becoming more ridiculous by the second?) you have to make sure that std::string doesn't clean up your newly constructed string and leave you with garbage. So you have to do this inside the function that receives the string:
const std::string constString = (input);
m_constChar = constString.c_str();
Which is insane. Because now I've traipsed across several different types of string, made my code ugly, and added more lines than should be needed, all just to stick some characters together. Why is this so hard?
Solution?
So what's the solution? I feel that I should be able to make a function that concatenates const char*s together but also handles other object types such as std::string, int or double. I feel strongly that this should be possible in one line, and yet I'm unable to find any examples of it being achieved. Should I be working with char* rather than the constant variant? I've read that you should never change the value of a char*, so how would this help?
Are there any experienced C++ programmers who have resolved this issue and are now comfortable with C++ strings, what is your solution? Is there no solution? Is it impossible?
The standard way to build a string, formatting non-string types as strings, is a string stream:
#include <sstream>
std::ostringstream ss;
ss << "Failure in " << __FUNCTION__ << ": foobar was " << foobar << "\n";
SetReport(ss.str());
If you do this often, you could write a variadic template to do that:
template <typename... Ts> std::string str(Ts&&...);
SetReport(str("Failure in ", __FUNCTION__, ": foobar was ", foobar, '\n'));
The implementation is left as an exercise for the reader.
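For readers who would rather not do the exercise, one possible implementation is sketched below. It assumes C++11 and that every argument can be streamed with <<; the helper append_all is a name invented here, not part of the answer above.

```cpp
#include <sstream>
#include <string>
#include <utility>

// Base case: nothing left to append.
inline void append_all(std::ostringstream&) {}

// Recursive case: stream the first argument, then handle the rest.
template <typename T, typename... Ts>
void append_all(std::ostringstream& out, T&& first, Ts&&... rest) {
    out << std::forward<T>(first);
    append_all(out, std::forward<Ts>(rest)...);
}

// Formats every argument with operator<< and returns the combined string.
template <typename... Ts>
std::string str(Ts&&... parts) {
    std::ostringstream out;
    append_all(out, std::forward<Ts>(parts)...);
    return out.str();
}
```

With this in place, str("Failure in ", "f", ": foobar was ", 42, '\n') should yield "Failure in f: foobar was 42\n". Recursion is used instead of a fold expression so that the sketch does not need C++17.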
In this particular case, string literals can be concatenated by simply writing one after the other (this covers __FUNCTION__ only on compilers that expand it to a string literal, such as Visual C++); and, assuming foobar is a std::string, that can be concatenated with string literals using +:
SetReport("Failure in " __FUNCTION__ ": foobar was " + foobar + "\n");
If foobar is a numeric type, you could use std::to_string(foobar) to convert it.
Plain string literals (e.g. "abc" and __FUNCTION__) and char const* do not support concatenation. These are just plain C-style char const[] and char const*.
Solutions are to use some string formatting facilities or libraries, such as:
std::string and concatenation using +. May involve too many unnecessary allocations, unless operator+ employs expression templates.
std::snprintf. This one does not allocate buffers for you and is not type safe, so people end up creating wrappers for it.
std::stringstream. Ubiquitous and standard but its syntax is at best awkward.
boost::format. Type safe but reportedly slow.
cppformat. Reportedly modern and fast.
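As an illustration of the kind of wrapper people write around std::snprintf, here is a rough sketch. The function name format_report and its fixed pattern are assumptions made for this example; it relies on the C++11 behaviour that std::snprintf with a null buffer and size 0 returns the length it would have written.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical wrapper: measure first, then allocate and format.
inline std::string format_report(const char* function_name, int foobar) {
    const char* pattern = "Failure in %s: foobar was %d\n";

    // First pass: ask how many characters are needed (excluding the NUL).
    int needed = std::snprintf(nullptr, 0, pattern, function_name, foobar);
    if (needed < 0) {
        return std::string();  // encoding error reported by snprintf
    }

    // Second pass: format into a buffer that has room for the NUL as well.
    std::vector<char> buffer(static_cast<std::size_t>(needed) + 1);
    std::snprintf(buffer.data(), buffer.size(), pattern, function_name, foobar);
    return std::string(buffer.data(), static_cast<std::size_t>(needed));
}
```

The wrapper takes care of the buffer, but the type-safety problem stays: nothing checks that %s and %d match the arguments, which is the gap the stream-based and format-library options close.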
One of the simplest solutions is to use an empty C++ string. Here I declare an empty string variable named _ and use it at the front of the string concatenation. Make sure you always put it at the front.
#include <cstdio>
#include <string>
using namespace std;
string _ = "";
int main() {
char s[] = "chararray";
string result =
_ + "function name = [" + __FUNCTION__ + "] "
"and s is [" + s + "]\n";
printf( "%s", result.c_str() );
return 0;
}
Output:
function name = [main] and s is [chararray]
Regarding __FUNCTION__, I found that in Visual C++ it is a macro while in GCC it is a variable, so SetReport("Failure in " __FUNCTION__ "; foobar was " + foobar + "\n"); will only work on Visual C++. See: https://msdn.microsoft.com/en-us/library/b0084kay.aspx and https://gcc.gnu.org/onlinedocs/gcc/Function-Names.html
The solution using empty string variable above should work on both Visual C++ and GCC.
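The same effect can be had without a global variable by starting the chain with an explicit std::string. The sketch below uses __func__, the standard C++11 spelling, instead of __FUNCTION__; failure_prefix is a hypothetical function name, and the exact text __func__ yields is up to the implementation.

```cpp
#include <string>

inline std::string failure_prefix() {
    // The leading std::string turns every later + into a std::string
    // concatenation. __func__ behaves like a char array, so it cannot be
    // pasted next to a literal, but it can be appended like any const char*.
    return std::string("Failure in ") + __func__ + ": ";
}
```

Because no adjacent-literal pasting is involved, this form should compile on both Visual C++ and GCC.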
My Solution
I've continued to experiment with different things and I've got a solution that combines two parts: tivn's answer, which uses an empty string to help concatenate long std::string and character arrays together, and a function of my own, which allows single-line copying of that std::string to a const char* that is safe to use when the string object leaves scope.
I would have used Mike Seymour's variadic templates but they don't seem to be supported by the Visual Studio 2012 I'm running and I need this solution to be very general so I can't rely on them.
Here is my solution:
Strings.h
#ifndef STRINGS_H
#define STRINGS_H
#include <string>
// tivn's empty string in the header file
extern const std::string _;
// My own version of .c_str() which produces a copy of the contents of the string input
const char* ToCString(std::string input);
#endif
Strings.cpp
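#include "Strings.h"
#include <cstring> // strcpy_s (available on Visual C++)
// Definition of the empty string declared as "_" in Strings.h
const std::string _ = "";
// The caller owns the returned buffer and must release it with delete[]
const char* ToCString(std::string input)
{
char* result = new char[input.length()+1];
strcpy_s(result, input.length()+1, input.c_str());
return result;
}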
Usage
m_someMemberConstChar = ToCString(_ + "Hello, world! " + someDynamicValue);
I think this is pretty neat and works in most cases. Thank you everyone for helping me with this.
As of C++20, the core of fmtlib has made its way into the ISO standard as std::format but, even with older standards, you can still download and use the library.
It gives capabilities similar to Python's str.format() (a), and your "ugly strings" example then becomes a relatively simple:
#include <fmt/format.h>
// Later on, where code is allowed (inside a function for example) ...
SetReport(fmt::format("Failure in {}: foobar was {}\n", __FUNCTION__, foobar));
It's much like the printf() family but with extensibility and type safety built in.
(a) But, unfortunately, not its string interpolation feature (use of f-strings), which has the added advantage of putting the expressions in the string at the place where they're output, something like:
set_report(f"Failure in {__FUNCTION__}: foobar was {foobar}\n");
If fmtlib ever got that capability, I'd probably wet my pants in excitement :-)

Macro string concatenation

I use macros to concatenate strings, such as:
#define STR1 "first"
#define STR2 "second"
#define STRCAT(A, B) A B
so that STRCAT(STR1, STR2) produces "firstsecond".
Somewhere else I have strings associated to enums in this way:
enum class MyEnum
{
Value1,
Value2
};
const char* MyEnumString[] =
{
"Value1String",
"Value2String"
};
Now the following does not work:
STRCAT(STR1, MyEnumString[(int)MyEnum::Value1])
I was just wondering whether it is possible to build a macro that concatenates a #defined string literal with a const char*. Otherwise, I guess I'll do without a macro, e.g. in this way (but maybe you have a better way):
std::string s = std::string(STR1) + MyEnumString[(int)MyEnum::Value1];
The macro works only on string literals, i.e. sequences of characters enclosed in double quotes. The reason the macro works is that the C++ standard treats adjacent string literals like a single string literal. In other words, it makes no difference to the compiler whether you write
"Quick" "Brown" "Fox"
or
"QuickBrownFox"
The concatenation is performed at compile time, before your program starts running.
Concatenation of const char* variables needs to happen at runtime, because the values of character pointers (or of any other pointers, for that matter) are not known until runtime. That is why you cannot do it with your STRCAT macro. You can use std::string for concatenation, though - it is one of the easiest solutions to this problem.
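The two cases can be put side by side in a short sketch (C++11 for static_assert). The macros are the ones from the question; with_prefix is a helper name made up for this example.

```cpp
#include <string>

#define STR1 "first"
#define STR2 "second"

// Adjacent literals are merged by the compiler: "firstsecond" plus the NUL.
static_assert(sizeof(STR1 STR2) == 12, "merged at compile time");

// A const char* is not a literal token, so it is appended at run time.
inline std::string with_prefix(const char* suffix) {
    return std::string(STR1) + suffix;
}
```

Here with_prefix(MyEnumString[(int)MyEnum::Value1]) would produce "firstValue1String", which is the std::string approach the question already proposes.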
This kind of concatenation only works for string literals:
"A" "B"
It will not work for a pointer expression like the one in your sample, which expands to a statement like
"first" MyEnumString[(int)MyEnum::Value1];
As for your edit:
Yes I would definitely go for your proposal
std::string s = std::string(STR1) + MyEnumString[(int)MyEnum::Value1];
Your macro is pretty unnecessary, as it can only work with string literals of the same type. Functionally it does nothing at all.
std::string s = STRCAT("a", "b");
Is exactly the same as:
std::string s = "a" "b";
So I feel that it's best to just not use the macro at all. If you want a runtime string concatenating function, a more C++-canonical version is:
inline std::string string_concat(const std::string& a, const std::string& b)
{
return a + b;
}
But again, it seems almost pointless to have this function when you can just do:
std::string a = "a string";
std::string ab = a + "b string";
I can see limited use for a function like string_concat. Maybe you want to work on arbitrary string types or automatic conversion between UTF-8 and UTF-16...

Reverse preprocessor stringizing operator

There are a lot of numeric constants defined as wide strings in an include file of an SDK, which I cannot modify but which gets updated and changed often. So I cannot declare numeric defines with the numbers myself, because they are completely different every few days and I don't want to (and am not allowed to) apply any scripting for updating.
If it were the other way round and the constants were defined as numbers, I could simply make the strings with the # preprocessor operator.
I don't want to use atoi and I don't want to create any variables; I just need the constants in numeric form, ideally via the preprocessor.
I know that there is no reverse stringizing operator, but isn't there some way to convert a string to a token (number) with the preprocessor?
There is no way to "unstringify" a string in the preprocessor. However, you can get, at least, constant expressions out of the string literals using user-defined literals. Below is an example initializing an enum value with the value taken from a string literal to demonstrate that the decoding happens at compile time, although not during preprocessing:
#include <iostream>
constexpr int make_value(int base, wchar_t const* val, std::size_t n)
{
return n? make_value(base * 10 + val[0] - L'0', val + 1, n -1): base;
}
constexpr int operator"" _decode(wchar_t const* val, std::size_t n)
{
return make_value(0, val, n);
}
#define VALUE L"123"
#define CONCAT(v,s) v ## s
#define DECODE(d) CONCAT(d,_decode)
int main()
{
enum { value = DECODE(VALUE) };
std::cout << "value=" << value << "\n";
}

Combining string literals and integer constants

Given a compile-time constant integer (an object, not a macro), can I combine it with a string literal at compile time, possibly with the preprocessor?
For example, I can concatenate string literals just by placing them adjacent to each other:
bool do_stuff(std::string s);
//...
do_stuff("This error code is ridiculously long so I am going to split it onto "
"two lines!");
Great! But what if I add integer constants in the mix:
const unsigned int BAD_EOF = 1;
const unsigned int BAD_FORMAT = 2;
const unsigned int FILE_END = 3;
Is it possible to use the preprocessor to somehow concatenate this with the string literals?
do_stuff("My error code is #" BAD_EOF "! I encountered an unexpected EOF!\n"
"This error code is ridiculously long so I am going to split it onto "
"three lines!");
If that isn't possible, could I mix constant strings with string literals? I.e. if my error codes were strings, instead of unsigneds?
And if neither is possible, what is the shortest, cleanest way to patch together this mix of string literals and numeric error codes?
If BAD_EOF were a macro, you could stringize it:
#define STRINGIZE_DETAIL_(v) #v
#define STRINGIZE(v) STRINGIZE_DETAIL_(v)
"My error code is #" STRINGIZE(BAD_EOF) "!"
But it's not (and that's just about always a good thing), so you need to format the string:
stringf("My error code is #%u!", BAD_EOF) // stringf is not standard; assume a printf-style helper that returns a std::string
std::stringstream ss; ss << "My error code is #" << BAD_EOF << "!";
ss.str()
If this was a huge concern for you (it shouldn't be, definitely not at first), use a separate, specialized string for each constant:
unsigned const BAD_EOF = 1;
#define BAD_EOF_STR "1"
This has all the drawbacks of a macro, plus more to maintain, for a tiny bit of performance that likely won't matter for most apps. However, if you decide on this trade-off, it has to be a macro because the preprocessor can't access values, even if they're const.
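A small sketch may help show why STRINGIZE needs two levels, and how the macro route compares with formatting a const object. BAD_EOF_CODE is a hypothetical macro form of the error code, and the three function names are invented for illustration.

```cpp
#include <sstream>
#include <string>

#define STRINGIZE_DETAIL_(v) #v
#define STRINGIZE(v) STRINGIZE_DETAIL_(v)

#define BAD_EOF_CODE 1              // hypothetical macro form of the code
const unsigned int BAD_FORMAT = 2;  // object form, as in the question

// Two-level macro: the argument is expanded to 1 before # is applied.
inline std::string macro_message() {
    return "My error code is #" STRINGIZE(BAD_EOF_CODE) "!";
}

// One level only: # sees the unexpanded name, not its value.
inline std::string unexpanded_name() {
    return STRINGIZE_DETAIL_(BAD_EOF_CODE);
}

// A const object is invisible to the preprocessor, so format it at run time.
inline std::string object_message() {
    std::ostringstream ss;
    ss << "My error code is #" << BAD_FORMAT << "!";
    return ss.str();
}
```

The first function gives "My error code is #1!", the second gives "BAD_EOF_CODE", and the third gives "My error code is #2!".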
What's wrong with:
do_stuff(my_int_1,
my_int_2,
"My error code is #1 ! I encountered an unexpected EOF!\n"
"This error code is ridiculously long so I am going to split it onto "
"three lines!");
If you want to abstract the error codes, then you can do this:
#define BAD_EOF "1"
Then you can use BAD_EOF as if it were a string literal.