Context:
In my company we generate a lot of types based on IDL files. Some of the types require special logic, so they are hand-coded but follow the same pattern as the generated ones. Every type must implement a constexpr name function that returns the type name as a char* string.
Problem:
The problem concerns collections, which can contain other collections nested arbitrarily deep. I am therefore trying to concatenate two or more char* strings at compile time.
Pseudocode of what I want to achieve:
template <typename T>
constexpr char* name()
{
constexpr char* collectionName = "std::vector";
constexpr char* containedTypeName = name<T>();
return concat(collectionName, "<", containedTypeName, ">");
}
Note:
There are examples out there that do something like this, but they use char[] or static variables.
The question:
How can I write a constexpr function that returns a char* consisting of two or more char* strings concatenated at compile time? I am bound to C++17.
A constexpr function cannot return a char* to a buffer that it constructs locally. It has to return an object whose contents, including its size, are known at compile time.
A possible solution could be something like:
#include <cstring>
// Buffer to hold the result
struct NameBuffer
{
// Hardcoded 128 bytes!!!!! Carefully choose the size!
char data[128];
};
// Copy src to dest, and return the number of copied characters
// You have to implement it since std::strcpy is not constexpr, no big deal.
constexpr int constexpr_strcpy(char* dest, const char* src);
//note: in c++20 make it consteval not constexpr
template <typename T>
constexpr NameBuffer name()
{
// We will return this
NameBuffer buf{};
constexpr const char* collectionName = "std::vector";
constexpr const char* containedTypeName = "dummy";
// Copy them one by one after each other
int n = constexpr_strcpy(buf.data, collectionName);
n += constexpr_strcpy(buf.data + n, "<");
n += constexpr_strcpy(buf.data + n, containedTypeName);
n += constexpr_strcpy(buf.data + n, ">");
// Null terminate the buffer, or you can store the size there or whatever you want
buf.data[n] = '\0';
return buf;
}
Demo
And since the returned char* depends only on the template parameter in your case, you can create variable templates and point a char* at them, and it then acts like any other char*.
EDIT:
I have just realized that your pseudo code will never work!! Inside name<T>() you are trying to call name<T>().
You must redesign this!!! But! With some hack you can determine the size at compile time somehow for example like this:
#include <cstring>
#include <iostream>
template<std::size_t S>
struct NameBuffer
{
char data[S];
};
// Copy src to dest, and return the number of copied characters
constexpr int constexpr_strcpy(char* dest, const char* src)
{
int n = 0;
while((*(dest++) = *(src++))){ n++; }
return n;
}
// Returns the len of str without the null term
constexpr int constexpr_strlen(const char* str)
{
int n = 0;
while(*str) { str++; n++; }
return n;
}
// This template parameter does nothing now...
// I left it there so you can see how to create the template variable stuff...
//note: in c++20 make it consteval not constexpr
template <typename T>
constexpr auto createName()
{
constexpr const char* collectionName = "std::vector";
constexpr const char* containedTypeName = "dummy";
constexpr std::size_t buff_size = constexpr_strlen(collectionName) +
constexpr_strlen(containedTypeName) +
2; // +1 for <, +1 for >
/// +1 for the nullterm
NameBuffer<buff_size + 1> buf{};
/// I'm lazy to rewrite, but now we already calculated the lengths...
int n = constexpr_strcpy(buf.data, collectionName);
n += constexpr_strcpy(buf.data + n, "<");
n += constexpr_strcpy(buf.data + n, containedTypeName);
n += constexpr_strcpy(buf.data + n, ">");
buf.data[n] = '\0';
return buf;
}
// Create the buffer for T
template<typename T>
static constexpr auto name_buff_ = createName<T>();
// point to the buffer of type T. It can be a function too as you wish
template<typename T>
static constexpr const char* name = name_buff_<T>.data;
int main()
{
// int is redundant now, but this is how you could use this
std::cout << name<int> << '\n';
return 0;
}
Demo
In my compile-time function I'd like to work with strings, both ANSI and wide ones, so I added a quick template to handle both. That part is easy, but I also have a special function that calculates a security checksum on strings. It works on a byte array, and rewriting it to handle a variable element size would take considerable effort, so I thought I would simply narrow the wchar_t data down to char and let my function work on that. This does not work the way I expected.
Sample code to reproduce my problem:
https://godbolt.org/z/ya2zq7
#include <iostream>
constexpr void hack(const char* const from, const size_t fromLen, char* const to)
{
for (size_t i = 0; i < fromLen; i++)
{
to[i] = from[i] + 1;
}
}
template <typename U, std::size_t LENGTH>
class EncryptedStorage
{
U m_data[LENGTH]{};
public:
constexpr EncryptedStorage(const U* input)
{
hack(static_cast<const char* const>(input), LENGTH * sizeof(U), static_cast<char* const>(m_data));
}
};
int main()
{
// Test with CHAR
constexpr char test[] = "Hello World";
constexpr size_t size = sizeof(test) / sizeof(test[0]);
constexpr auto encrypted = EncryptedStorage<char, size>(test);
// test with WCHAR
constexpr wchar_t wtest[] = L"Hello World";
constexpr size_t wsize = sizeof(wtest) / sizeof(wtest[0]);
constexpr auto wencrypted = EncryptedStorage<wchar_t, wsize>(wtest);
}
If you comment out the wide strings it compiles perfectly. Is it possible to do what I want, or do I really need to rework my whole algorithm to work on a variable size?
The basic problem in your code is that you can't use static_cast to convert between pointers to unrelated types, as char and wchar_t are; for this, you need a reinterpret_cast or a C-style cast:
constexpr EncryptedStorage(const U* input)
{
hack(reinterpret_cast<const char* const>(input), LENGTH * sizeof(U), reinterpret_cast<char* const>(m_data));
}
However, once you have such a cast, your EncryptedStorage constructor can no longer be evaluated at compile time, so two of the constexpr declarations in your main will fail, and you will have to just use const instead:
const auto encrypted = EncryptedStorage<char, size>(test);
const auto wencrypted = EncryptedStorage<wchar_t, wsize>(wtest); // Can't use constexpr
EDIT:
Another way (perhaps nicer) is to use function-style casts:
using pcchar = const char* const;
using pchar = char* const;
constexpr EncryptedStorage(const U* input)
{
hack(pcchar(input), LENGTH * sizeof(U), pchar(m_data));
}
With this, you can use constexpr for encrypted but not for wencrypted!
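If the wide case has to stay constexpr as well, one way around the cast is to avoid pointer reinterpretation entirely and extract the bytes arithmetically. This is a rough sketch, assuming C++14 or later; it stores the result as unsigned char, uses a "+ 1" per byte as a stand-in for the real checksum logic, and picks little-endian byte order explicitly, which may differ from the in-memory layout on some targets. The byte() accessor is a hypothetical addition for inspection.

```cpp
#include <cstddef>

template <typename U, std::size_t LENGTH>
class EncryptedStorage {
    // One slot per byte of the input, regardless of the character type.
    unsigned char m_data[LENGTH * sizeof(U)]{};

public:
    constexpr EncryptedStorage(const U (&input)[LENGTH]) {
        for (std::size_t i = 0; i < LENGTH; ++i) {
            // Widen first so the shifts below are well defined.
            const auto v = static_cast<unsigned long long>(input[i]);
            for (std::size_t b = 0; b < sizeof(U); ++b) {
                // Take byte b (least significant first), then apply the
                // placeholder transformation from the question.
                const auto byte = (v >> (8 * b)) & 0xFFu;
                m_data[i * sizeof(U) + b] = static_cast<unsigned char>(byte + 1);
            }
        }
    }

    // Hypothetical accessor so the result can be inspected.
    constexpr unsigned char byte(std::size_t i) const { return m_data[i]; }
};

// Both character types are now usable in constant expressions.
constexpr EncryptedStorage<char, 3> narrow_demo("Hi");
constexpr EncryptedStorage<wchar_t, 3> wide_demo(L"Hi");
static_assert(narrow_demo.byte(1) == 'i' + 1, "second narrow character");
static_assert(wide_demo.byte(0) == 'H' + 1, "low byte of first wide character");
```

No reinterpret_cast appears anywhere, so the compiler can evaluate the constructor at compile time for both char and wchar_t.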
So I have the following available:
struct data_t {
char field1[10];
char field2[20];
char field3[30];
};
const char *getData(const char *key);
const char *field_keys[] = { "key1", "key2", "key3" };
This code is given to me and I cannot modify it in any way. It comes from some old C project.
I need to fill in the struct using the getData function with the different keys, something like the following:
struct data_t my_data;
strncpy(my_data.field1, getData(field_keys[0]), sizeof(my_data.field1));
strncpy(my_data.field2, getData(field_keys[1]), sizeof(my_data.field2));
strncpy(my_data.field3, getData(field_keys[2]), sizeof(my_data.field3));
Of course, this is a simplification, and more things are going on in each assignment. The point is that I would like to represent the mapping between keys and struct members in a constant structure, and use that to turn the code above into a loop. I am looking for something like the following:
struct data_t {
char field1[10];
char field2[20];
char field3[30];
};
typedef char *(data_t:: *my_struct_member);
const std::vector<std::pair<const char *, my_struct_member>> mapping = {
{ "FIRST_KEY" , &my_struct_t::field1},
{ "SECOND_KEY", &my_struct_t::field2},
{ "THIRD_KEY", &my_struct_t::field3},
};
int main()
{
data_t data;
for (auto const& it : mapping) {
strcpy(data.*(it.second), getData(it.first));
// Ideally, I would like to do
// strlcpy(data.*(it.second), getData(it.first), <the right sizeof here>);
}
}
This, however, has two problems:
It does not compile :) But I believe that should be easy to solve.
I am not sure how to get the sizeof() argument needed to use strncpy/strlcpy instead of strcpy. I am using char * as the type of the members, so I am losing the type information about how long each array is. On the other hand, I am not sure how to use the specific char[N] types of each member, because if each struct member pointer has a different type I don't think I will be able to keep them in a std::vector<T>.
As explained in my comment, if you can store enough information to process a field in a mapping, then you can write a function that does the same.
Therefore, write a function to do so, using array references to ensure what you do is safe, e.g.:
template <std::size_t N>
void process_field(char (&dest)[N], const char * src)
{
strlcpy(dest, getData(src), N);
// more work with the field...
}
And then simply, instead of your for loop:
process_field(data.field1, "foo");
process_field(data.field2, "bar");
// ...
Note that the number of lines is the same as with a mapping (one per field), so this is not worse than a mapping solution in terms of repetition.
Now, the advantages:
Easier to understand.
Faster: no memory needed to keep the mapping, more easily optimizable, etc.
Allows you to write different functions for different fields, easily, if needed.
Further, if both of your strings are known at compile-time, you can even do:
template <std::size_t N, std::size_t M>
void process_field(char (&dest)[N], const char (&src)[M])
{
static_assert(N >= M);
std::memcpy(dest, src, M);
// more work with the field...
}
Which will be always safe, e.g.:
process_field(data.field1, "123456789"); // just fits!
process_field(data.field1, "1234567890"); // error
Which has even more pros:
Way faster than any strcpy variant (if the call is done in run-time).
Guaranteed to be safe at compile-time instead of run-time.
A variadic templates based solution:
struct my_struct_t {
char one_field[30];
char another_field[40];
};
template<typename T1, typename T2>
void do_mapping(T1& a, T2& b) {
std::cout << sizeof(b) << std::endl;
strncpy(b, a, sizeof(b));
}
template<typename T1, typename T2, typename... Args>
void do_mapping(T1& a, T2& b, Args&... args) {
do_mapping(a, b);
do_mapping(args...);
}
int main()
{
my_struct_t ms;
do_mapping(
"FIRST_MAPPING", ms.one_field,
"SECOND_MAPPING", ms.another_field
);
return 0;
}
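In the same variadic spirit, the question's worry that member pointers of different array types cannot share one container can be sidestepped with a std::tuple, which keeps each element's type. The sketch below assumes C++17 (std::apply and fold expressions); getData is replaced by a hypothetical stand-in that just echoes the key, and copy_field and fill are illustrative names.

```cpp
#include <cstddef>
#include <cstring>
#include <tuple>
#include <utility>

struct data_t {
    char field1[10];
    char field2[20];
    char field3[30];
};

// Hypothetical stand-in for the legacy lookup: returns the key itself.
inline const char* getData(const char* key) { return key; }

// The array reference carries the destination size, so no sizeof is needed.
template <std::size_t N>
void copy_field(char (&dest)[N], const char* src) {
    std::strncpy(dest, src, N - 1);
    dest[N - 1] = '\0';  // strncpy does not terminate on truncation
}

// Each pair keeps its own pointer-to-member type, e.g. char (data_t::*)[10].
constexpr auto mapping = std::make_tuple(
    std::make_pair("key1", &data_t::field1),
    std::make_pair("key2", &data_t::field2),
    std::make_pair("key3", &data_t::field3));

// Visit every (key, member) entry in order.
inline void fill(data_t& d) {
    std::apply(
        [&d](const auto&... entry) {
            (copy_field(d.*(entry.second), getData(entry.first)), ...);
        },
        mapping);
}
```

The mapping stays a single constant object, and the size of each destination array is deduced from the member pointer's type rather than stored separately.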
Since data_t is a POD structure, you can use offsetof() for this.
const std::vector<std::pair<const char *, std::size_t>> mapping = {
{ "FIRST_FIELD" , offsetof(data_t, field1)},
{ "SECOND_FIELD", offsetof(data_t, field2)}
};
Then the loop would be:
for (auto const& it : mapping) {
strcpy(reinterpret_cast<char*>(&data) + it.second, getData(it.first));
}
I don't think there's any way to get the size of the member similarly. You can subtract the offset of the current member from the next member, but this will include padding bytes. You'd also have to special-case the last member, subtracting the offset from the size of the structure itself, since there's no next member.
The mapping can be a function to write the data into the appropriate member
struct mapping_t
{
const char * name;
std::function<void(data_t &, const char *)> write;
};
const std::vector<mapping_t> mapping = {
{ "FIRST_KEY", [](data_t & data, const char * str) { strlcpy(data.field1, str, sizeof(data.field1)); } },
{ "SECOND_KEY", [](data_t & data, const char * str) { strlcpy(data.field2, str, sizeof(data.field2)); } },
{ "THIRD_KEY", [](data_t & data, const char * str) { strlcpy(data.field3, str, sizeof(data.field3)); } },
};
int main()
{
data_t data;
for (auto const& it : mapping) {
it.write(data, getData(it.name));
}
}
To iterate over struct members you need, for each member:
offset / pointer to the beginning of that member
size of that member
struct Map {
const char *key;
std::size_t offset;
std::size_t size;
};
std::vector<Map> map = {
{ field_keys[0], offsetof(data_t, field1), sizeof(data_t::field1), },
{ field_keys[1], offsetof(data_t, field2), sizeof(data_t::field2), },
{ field_keys[2], offsetof(data_t, field3), sizeof(data_t::field3), },
};
Once we have that, we need strlcpy:
std::size_t mystrlcpy(char *to, const char *from, std::size_t max)
{
char * const to0 = to;
if (max == 0)
return 0;
while (--max != 0 && *from) {
*to++ = *from++;
}
*to = '\0';
return to - to0; // number of characters copied, not counting the terminator
}
After having that, we can just:
data_t data;
for (auto const& it : map) {
mystrlcpy(reinterpret_cast<char*>(&data) + it.offset, getData(it.key), it.size);
}
That reinterpret_cast looks a bit ugly, but it just shifts the &data pointer to the needed field.
We can also create a smarter container that takes a pointer to the variable on construction and is thus bound to an existing variable. It needs a little more writing:
struct Map2 {
static constexpr std::size_t max = sizeof(field_keys)/sizeof(*field_keys);
Map2(data_t* pnt) : mpnt(pnt) {}
char* getDest(std::size_t num) {
std::array<char*, max> arr = {
mpnt->field1,
mpnt->field2,
mpnt->field3,
};
return arr[num];
}
const char* getKey(std::size_t num) {
return field_keys[num];
}
std::size_t getSize(std::size_t num) {
std::array<std::size_t, max> arr = {
sizeof(mpnt->field1),
sizeof(mpnt->field2),
sizeof(mpnt->field3),
};
return arr[num];
}
private:
data_t* mpnt;
};
But it probably makes the iteration more readable:
Map2 m(&data);
for (std::size_t i = 0; i < m.max; ++i) {
mystrlcpy(m.getDest(i), getData(m.getKey(i)), m.getSize(i));
}
Live code available at onlinegdb.
The Ghostscript interpreter API has a function
GSDLLEXPORT int GSDLLAPI gsapi_init_with_args(void *instance, int argc, char **argv)
The final argument argv is a pointer to an array of C strings, which are interpreted as command-line arguments. I obviously cannot change the signature of the function gsapi_init_with_args to take a const char ** argument instead.
If I were willing to ignore (or silence) the deprecated conversion from string constant to 'char*' warning, then I would write simply
char *gs_argv[] = {"", "-dNOPAUSE", "-dBATCH", ...};
and pass gs_argv as the final argument. But I would prefer to fix my code so that I am not relying on an external function to behave in the way I expect it to (and effectively treat gs_argv as const char**).
Is there any simple way to declare gs_argv as an array of pointers to (non-const) C strings, and initialize its elements with string literals? (That is, using a similar approach to how I can initialize a single C string: using char c_str[] = "abc".) The best I can think of is to use
const char *gs_argv0[] = {"", "-dNOPAUSE", "-dBATCH", ...};
and then copy the contents, element by element, into gs_argv.
Please note that I understand why the compiler gives this warning (and have read the answers to, among others, this question). I am asking for a solution, rather than an explanation.
You can use:
char arg1[] = "";
char arg2[] = "-dNOPAUSE";
char arg3[] = "-dBATCH";
char* gs_argv0[] = {arg1, arg2, arg3, NULL};
int argc = sizeof(gs_argv0)/sizeof(gs_argv0[0]) - 1;
gsapi_init_with_args(instance, argc, gs_argv0);
Create copies of the string literals using strdup. This is more verbose, but fixes the warning.
char* gs_argv0[NARGS];
gs_argv0[0] = strdup("");
gs_argv0[1] = strdup("-dNOPAUSE");
// ...
Note that you will also need to free the memory allocated by strdup if you want to prevent leaks.
You might also want to add a comment to your code saying why you are doing this, to make it clear for future readers.
If you can guarantee that the function will not modify the non-const parameter, then it is acceptable to use const_cast in this situation.
A C++14 solution.
#define W(x) \
(([](auto& s)->char* \
{ \
static char r[sizeof(s)]; \
strcpy (r, s); \
return r; \
})(x))
char* argv[] =
{ W("--foo=bar"),
W("baz"),
nullptr
};
Since this code requires C++11, there's a lower cost C++11 solution in another answer below. I'm leaving this one for posterity.
There are pretty much two choices: ignore it and const_cast, or do the right thing. Since this is modern C++, you're supposed to have nice, RAII classes. Thus, the simplest, safest thing to do is to safely wrap such an array.
// https://github.com/KubaO/stackoverflown/tree/master/questions/args-cstrings-32484688
#include <initializer_list>
#include <type_traits>
#include <cstdlib>
#include <cstring> // strdup (POSIX), strcmp
#include <cassert>
#include <new> // std::bad_alloc
#include <vector>
class Args {
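struct str_vector : std::vector<char*> {
str_vector() = default;
// The user-declared destructor suppresses the implicit move constructor;
// without this line, moving an Args would copy the pointers and free them twice.
str_vector(str_vector &&) = default;
~str_vector() { for (auto str : *this) free(str); }
} m_data;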
void append_copy(const char * s) {
assert(s);
auto copy = strdup(s);
if (copy) m_data.push_back(copy); else throw std::bad_alloc();
}
public:
Args(std::initializer_list<const char*> l) {
for (auto str : l) append_copy(str);
m_data.push_back(nullptr);
}
template <std::size_t N>
Args(const char * const (&l)[N]) {
for (auto str : l) append_copy(str);
m_data.push_back(nullptr);
}
/// Initializes the arguments with a null-terminated array of strings.
template<class C, typename = typename std::enable_if<std::is_same<C, char const**>::value>::type>
Args(C l) {
while (*l) append_copy(*l++);
m_data.push_back(nullptr);
}
/// Initializes the arguments with an array of strings with given number of elements.
Args(const char ** l, size_t count) {
while (count--) append_copy(*l++);
m_data.push_back(nullptr);
}
Args(Args && o) = default;
Args(const Args &) = delete;
size_t size() const { return m_data.size() - 1; }
char ** data() { return m_data.data(); }
bool operator==(const Args & o) const {
if (size() != o.size()) return false;
for (size_t i = 0; i < size(); ++i)
if (strcmp(m_data[i], o.m_data[i]) != 0) return false;
return true;
}
};
Let's see how it works:
#include <iostream>
extern "C" int gsapi_init_with_args(void*, int argc, char** argv) {
for (int i = 0; i < argc; ++i)
std::cout << "arg " << i << "=" << argv[i] << std::endl;
return 0;
}
int main()
{
Args args1 { "foo", "bar", "baz" };
const char * args2i[] { "foo", "bar", "baz", nullptr };
Args args2 { (const char **)args2i };
const char * args3i[] { "foo", "bar", "baz" };
Args args3 { args3i };
const char * const args4i[] { "foo", "bar", "baz" };
Args args4 { args4i };
const char * args5i[] { "foo", "bar", "baz" };
Args args5 { args5i, sizeof(args5i)/sizeof(args5i[0]) };
assert(args1 == args2);
assert(args2 == args3);
assert(args3 == args4);
assert(args4 == args5);
gsapi_init_with_args(nullptr, args1.size(), args1.data());
}
Output:
arg 0=foo
arg 1=bar
arg 2=baz
Try to const_cast it:
gsapi_init_with_args(instance, argc, const_cast<char**>(argv));
This may be enough to get rid of the warning.
Inspired by n.m.'s C++14 version, here's a C++11 version. The trick is to use an evaluated empty lambda expression to generate a fresh type, so that each instantiation of W__ is unique.
template <typename T, int N> static char * W__(const char (&src)[N], T) {
static char storage[N];
strcpy(storage, src);
return storage;
}
#define W(x) W__(x, []{})
char * argv[] = {
W("foo"),
W("bar")
};
The static in front of W__'s return type means that W__ has internal linkage and won't bloat the object file with extra symbols. It has nothing to do with the static in front of storage, as the latter indicates the static storage duration for the local variable. The code below would be perfectly valid, but of course doing the wrong thing and having undefined behavior:
template <typename T, int N> static char * BAD(const char (&src)[N], T) {
char storage[N];
strcpy(storage, src);
return storage;
}
Since a lambda has to be evaluated, you can't simply make its type a template argument:
template<typename> void G();
G<decltype([]{})>(); // doesn't work
I have written some code to cast const char* to int by using constexpr and thus I can use a const char* as a template argument. Here is the code:
#include <iostream>
class conststr
{
public:
template<std::size_t N>
constexpr conststr(const char(&STR)[N])
:string(STR), size(N-1)
{}
constexpr conststr(const char* STR, std::size_t N)
:string(STR), size(N)
{}
constexpr char operator[](std::size_t n)
{
return n < size ? string[n] : 0;
}
constexpr std::size_t get_size()
{
return size;
}
constexpr const char* get_string()
{
return string;
}
//This method is related with Fowler–Noll–Vo hash function
constexpr unsigned hash(int n=0, unsigned h=2166136261)
{
return n == size ? h : hash(n+1,(h * 16777619) ^ (string[n]));
}
private:
const char* string;
std::size_t size;
};
// output function that requires a compile-time constant, for testing
template<int N> struct OUT
{
OUT() { std::cout << N << '\n'; }
};
int constexpr operator "" _const(const char* str, size_t sz)
{
return conststr(str,sz).hash();
}
int main()
{
OUT<"A dummy string"_const> out;
OUT<"A very long template parameter as a const char*"_const> out2;
}
In this example code, the type of out is OUT<1494474505> and the type of out2 is OUT<106227495>. The magic behind this code is conststr::hash(), a constexpr recursion that uses the FNV hash function. It thus creates an integral hash for a const char*, which is hopefully unique.
I have some questions about this method:
Is this a safe approach to use? Or can this approach be an evil in a specific use?
Can you write a better hash function that creates different integer for each string without being limited to a number of chars? (in my method, the length is long enough)
Can you write code that implicitly casts const char* to a constexpr int via conststr, so that we do not need the aesthetically ugly (and time-consuming) _const user-defined string literal? For example, OUT<"String"> would be legal (and cast "String" to an integer).
Any help will be appreciated, thanks a lot.
Although your method is very interesting, it is not really a way to pass a string literal as a template argument. In fact, it is a generator of template argument based on string literal, which is not the same: you cannot retrieve string from hashed_string... It kinda defeats the whole interest of string literals in templates.
EDIT : the following was right when the hash used was the weighted sum of the letters, which is not the case after the edit of the OP.
You can also have problems with your hash function, as stated by mitchnull's answer. This may be another big problem with your method, the collisions. For example:
// Both outputs 3721
OUT<"0 silent"_const> out;
OUT<"7 listen"_const> out2;
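For comparison, the hash can also be written as a plain loop once relaxed constexpr is available. The sketch below assumes C++14 and uses the FNV-1a variant (xor first, then multiply), which is not the exact order used in the question; fnv1a_32, the _fnv suffix and Tag are hypothetical names. Like any 32-bit hash it can still collide, so distinct strings are not guaranteed to yield distinct template arguments.

```cpp
#include <cstddef>
#include <cstdint>

// 32-bit FNV-1a over `len` characters, usable in constant expressions.
constexpr std::uint32_t fnv1a_32(const char* str, std::size_t len) {
    std::uint32_t h = 2166136261u;                // FNV offset basis
    for (std::size_t i = 0; i < len; ++i) {
        h ^= static_cast<unsigned char>(str[i]);  // mix in one byte
        h *= 16777619u;                           // FNV prime, wraps mod 2^32
    }
    return h;
}

// User-defined literal so the hash can be used as a template argument.
constexpr std::uint32_t operator""_fnv(const char* str, std::size_t len) {
    return fnv1a_32(str, len);
}

// An unsigned parameter avoids converting the hash to a signed int.
template <std::uint32_t N>
struct Tag {
    static constexpr std::uint32_t value = N;
};

static_assert(""_fnv == 0x811c9dc5u, "empty input yields the offset basis");
static_assert("a"_fnv == 0xe40c292cu, "single-character check");
static_assert(Tag<"0 silent"_fnv>::value != Tag<"7 listen"_fnv>::value,
              "the anagram pair from above no longer collides");
```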
As far as I know, you cannot pass a string literal in a template argument straightforwardly in the current standard. However, you can "fake" it. Here's what I use in general:
struct string_holder //
{ // All of this can be heavily optimized by
static const char* asString() // the compiler. It is also easy to generate
{ // with a macro.
return "Hello world!"; //
} //
}; //
Then, I pass the "fake string literal" via a type argument:
template<typename str>
struct out
{
out()
{
std::cout << str::asString() << "\n";
}
};
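The comment above mentions that such holder types are easy to generate with a macro. One possible shape is sketched here; MAKE_STRING_HOLDER, hello_holder, bye_holder and length_of are hypothetical names chosen for illustration.

```cpp
#include <cstddef>
#include <cstring>

// Generates a type whose static member function returns the given literal.
#define MAKE_STRING_HOLDER(NAME, TEXT)                \
    struct NAME {                                     \
        static const char* asString() { return TEXT; } \
    }

MAKE_STRING_HOLDER(hello_holder, "Hello world!");
MAKE_STRING_HOLDER(bye_holder, "Bye");

// A template that consumes the "fake string literal" through its type.
template <typename Str>
struct length_of {
    static std::size_t get() { return std::strlen(Str::asString()); }
};
```

Unlike the hash approach, the original text remains retrievable inside the template, and two holders with different names are always different types.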
EDIT2: you said in the comments you used this to distinguish between several specializations of a class template. The method you showed is valid for that, but you can also use tags:
// tags
struct myTag {};
struct Long {};
struct Float {};
// class template
template<typename tag>
struct Integer
{
// ...
};
template<> struct Integer<Long> { /* ... */ };
// use
Integer<Long> ...; // those are 2
Integer<Float> ...; // different types
Here is the pattern that I am using for template const string parameters.
class F {
static constexpr const char conststr[]= "some const string";
TemplateObject<conststr> instance;
};
see :
https://stackoverflow.com/a/18031951/782168