Is std::string now usable as a compile-time constant? [duplicate]

Many developers and library authors have been struggling with compile-time strings for quite a few years now - as the standard (library) string, std::string, requires dynamic memory allocation, and isn't constexpr.
So we have a bunch of questions and blog posts about how to get compile-time strings right:
Conveniently Declaring Compile-Time Strings in C++
Concatenate compile-time strings in a template at compile time?
C++ Compile-Time string manipulation
(off-site) Compile-time strings with constexpr
We've now learned that not only is new available in constexpr code, allowing for dynamic allocation at compile-time, but, in fact, std::string will become constexpr in C++20 (C++ standard working group meeting report by Herb Sutter).
Does that mean that for C++20-and-up code we should chuck all of those nifty compile-time string implementations and just always go with std::string?
If not - when would we do so, and when would we stick to what's possible today (other than backwards-compatible code of course)?
Note: I'm not talking about strings whose contents are part of their type, i.e. not talking about the equivalent of std::integral_constant; that's definitely not going to be std::string.

It depends on what you mean by "constexpr string".
What C++20 allows you to do is to use std::string within a function marked constexpr (or consteval). Such a function can create a string, manipulate it, and so forth just like any literal type. However, that string cannot leak out into non-constexpr code; that would be a non-transient allocation and is forbidden.
The thing is, all of the examples you give are attempts to use strings as template parameters. That's a similar-yet-different thing. You're not just talking about building a string at compile-time; you now want to use it to instantiate a template.
C++20 solves this problem by allowing user-defined types to be template parameters. But the requirements on such types are much stricter than merely being literal types. The type must have no non-public data members, and all of its members must themselves be of types that follow those restrictions. Basically, the compiler needs to know that a member-by-member comparison of its data tells it whether two values are equivalent. And even a constexpr-capable std::string doesn't work that way.
But std::array<char, N> can do that. And if you are in constexpr code and call a constexpr function which returns a std::string, then calling string::size() on the result is a constant expression, as long as the string itself does not outlive that expression. So you can use that to fill in the N for your array.
Copying the characters into a constexpr array (since it's a constexpr value, it's immutable) is a bit more involved, but it's doable.
So C++20 solves those problems, just not (directly) with std::string.

Related

Why is there no overload for printing `std::byte`?

The following code does not compile in C++20
#include <iostream>
#include <cstddef>
int main(){
std::byte b {65};
std::cout<<"byte: "<<b<<'\n';// Missing overload
}
When std::byte was added in C++17, why was there no corresponding operator<< overload for printing it? I can maybe understand the choice of not printing containers, but why not std::byte? It tries to act as a primitive type, and we even have overloads for std::string, the recent std::string_view, and perhaps the most closely related std::complex, and std::bitset itself can be printed.
There are also std::hex and similar modifiers, so printing 0-255 by default should not be an issue.
Was this just an oversight? What about operator>>? std::bitset has it, and there it is not trivial at all.
EDIT: Found out even std::bitset can be printed.
From the paper on std::byte (P0298R3): (emphasis mine)
Design Decisions
std::byte is not an integer and not a character
The key motivation here is to make byte a distinct type – to improve program safety by leveraging the type system. This leads to the design that std::byte is not an integer type, nor a character type. It is a distinct type for accessing the bits that ultimately make up object storage.
As such, it is not required to be implicitly convertible/interpreted to be either a char or any integral type whatsoever and hence cannot be printed using std::cout unless explicitly cast to the required type.
Furthermore, this question might help.
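For completeness, the explicit conversion the paper has in mind could look something like this. The helper name to_hex is made up here; std::to_integer is the standard function for getting an integer out of a std::byte.

```cpp
#include <cstddef>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: formats one byte as two lowercase hex digits.
// The conversion to int is explicit, which is exactly what std::byte forces.
std::string to_hex(std::byte b) {
    std::ostringstream out;
    out << std::hex << std::setw(2) << std::setfill('0')
        << std::to_integer<int>(b);
    return out.str();
}
```

With this, std::cout << to_hex(b) compiles where std::cout << b does not, and the call site says how the raw byte should be interpreted.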
std::byte is intended for accessing raw data. It lets me replace that damn uint8_t sprinkled all over the codebase with something that actually says "this is raw and unparsed", instead of something that could be misunderstood as a C string.
To underline: std::byte doesn't "try to be a primitive", it represents something even less - raw data.
That it's implemented like this is mostly a quirk of C++ and compiler implementations (layout rules for "primitive" types are much simpler than for a struct or a class).
This kind of thing is mostly found in low-level code where, honestly, printing shouldn't be used. Sometimes it isn't even possible.
My use case, for example, is receiving raw bytes over I2C (or RS485) and parsing them into a frame which is then put into a struct. Why would I want to serialize raw bytes over actual data? Data I will have access to almost immediately?
To sum up this somewhat ranty answer, providing operator overloads for std::byte to work with iostream goes against the intent of this type.
And expressing intent in code as much as possible is one of the important principles of modern programming.

Avoiding the func(char *) api on embedded

Note:
I heavily changed my question to be more specific, but I will keep the old question at end of the post, in case it is useful to anyone.
New Question
I am developing an embedded application which uses the following types to represent strings:
string literals (null terminated by default)
std::array<char,size> (not null terminated)
std::string_view
I would like to have a function that accepts all of them in a uniform way. The only problem is that for a string literal I have to compute the length with strlen, which doesn't work in the other two cases, while size() works for those two but not for case 1.
Should I use a variant like so: std::variant<const char *,std::span<char>>? Would that be heavy, since it forces me to use std::visit? Would it even match all the different string representations correctly?
Old Question
Disclaimer: when I refer to "string" in the following, I don't mean std::string but just, abstractly, a series of alphanumeric characters.
In most cases when I have to deal with strings in C++ I use something like void func(const std::string &); or, in some cases, a version without the const and the reference. Now, on an embedded app, I don't have access to std::string, so I tried to use std::string_view. The problem is that a std::string_view constructed from a non-literal is sometimes not null terminated.
Edit: I changed the question a bit, as the comments contained some very helpful hints.
So even though y has a size in the example below:
std::array<char,5> x{"aa"} ;
std::string_view y(x.data());
I can't use y with a C API like printf("%s", y.data()) that relies on null termination
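#include <array>
#include <cstdio>
#include <string_view>
int main(){
std::array<char,5> x{"aaa"}; // remaining elements are zero, so x happens to be null terminated
std::string_view y(x.data()); // length is found by scanning for '\0'
std::printf("%s",y.data()); // only safe because x contains a terminator
}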
To summarize:
What can I do to implement a stack-allocated string that implicitly gets a static size in its constructors (from null-terminated strings, string literals, string_views and std::arrays) and is movable (or cheaply copyable)?
What would be the underlying type of my class? What would be the speed costs in comparison with the underlying type?
I think that you are looking at two largely and three subtly different semantics of char*.
Yes, all of them point at char but the type-specific info on how to determine the length is not carried by that. Even in the ancient ancestor of C++ (not saying C...) a pointer to char was not always the same. Already there pointers to terminated and non-terminated sequences of characters could not be mixed.
In C++ the tool of overloading a function exists and it seems to be the obvious solution for your problem. You can still implement that efficiently with only one (helper) function doing the actual work, based on an explicit size information in a second parameter.
Overload the function which is "visible" on the API, with three versions for the three types. Have it determine the length in the appropriate way, then call the single helper function, providing that length.
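A minimal sketch of that overload set could look like the following. The names func and process are placeholders, and counting 'a' characters just stands in for whatever the real function does.

```cpp
#include <array>
#include <cstddef>
#include <cstring>
#include <string_view>

// Hypothetical helper doing the actual work on an explicit (pointer, length)
// pair. Counting 'a' characters is only a stand-in for real processing.
std::size_t process(const char* data, std::size_t length) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < length; ++i) {
        if (data[i] == 'a') ++count;
    }
    return count;
}

// Null-terminated C string or string literal: the length comes from strlen.
std::size_t func(const char* str) {
    return process(str, std::strlen(str));
}

// std::array<char, N>: not necessarily null terminated, the length is N.
template <std::size_t N>
std::size_t func(const std::array<char, N>& arr) {
    return process(arr.data(), N);
}

// std::string_view: carries its own length.
std::size_t func(std::string_view sv) {
    return process(sv.data(), sv.size());
}
```

A string literal picks the const char* overload, because the array-to-pointer conversion ranks better than the user-defined conversion to std::string_view, so the three calls never become ambiguous. No variant or std::visit is needed, and nothing is allocated.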

Why is the `std::sto`... series not a template?

I wonder if there is a reason why the std::sto series (e.g. std::stoi, std::stol) is not a function template, like this:
#include <cstddef>
#include <string>
template<typename T>
T sto(std::string const & str, std::size_t *pos = nullptr, int base = 10);
and then:
template<>
int sto<int>(std::string const & str, std::size_t *pos, int base)
{
return std::stoi(str, pos, base); // e.g. forward to the existing function
}
template<>
long sto<long>(std::string const & str, std::size_t *pos, int base)
{
return std::stol(str, pos, base); // e.g. forward to the existing function
}
/* etc. */
In my opinion, that would be a better design, because at the moment, when I have to convert a string to whatever numerical value a user wants, I have to handle each case manually.
Is there a reason not to have such a function template? Is this a deliberate choice, or is it just the way it happened to be done?
Looking at the description of these functions at cppreference, I note the following:
... Interprets a signed integer value in the string str.
1) calls std::strtol(str.c_str(), &ptr, base)...
and strtol is a C standard function that's also available in C++.
Reading further, we see (for the C++ sto* functions):
Return value
The string converted to the specified signed integer type.
Exceptions
std::invalid_argument if no conversion could be performed
std::out_of_range if the converted value would fall out of the range of the result type or if the underlying function (std::strtol or std::strtoll) sets errno to ERANGE.
So while I have no original source for this, and indeed have never worked with these functions, I would guess that:
TL;DR: These functions are C++-ish wrappers around already existing C/C++ functions (strtol and friends), so they resemble those functions as closely as possible.
I have to manage manually each case. Is there a reason to not have such a template function?
In case of such questions, Eric Lippert (C#) usually says something along the lines:
"If a feature is missing, then it's missing because no one implemented it yet. And that's because either no one wanted it earlier, or because it was considered not worth the effort, or because it couldn't have been finished before publishing the current release."
Here, I guess it's the "not worth the effort" part, but I have neither asked the committee about it, nor managed to find any answer in old questions and FAQs. I didn't spend much time searching though.
I say this because I suppose that most (if not all) of these functions' functionality is already contained in the stream classes, like istringstream. Just like cin etc., this one also has a do-everything operator>>, overloaded for all the basic numeric types (and more).
Furthermore, stream manipulators like std::hex (std::setbase) already solve the problem of passing various type-dependent configuration parameters to the actual conversion functions. No problems with mixed function signatures (like those mentioned by DavidHaim in his answer). There's just a single operator>>.
So, since we already have it in streams, and can already read numbers etc. from strings with a simple foo >> bar >> setbase(42) >> baz >> ..., I think it was not considered worth the effort to add more complicated layers on top of the old C runtime functions.
No proof for that though. Just a hunch.
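To illustrate the stream route, here is a small sketch. The function name parse is invented for this example. Note that std::setbase only honours 8, 10 and 16, so the setbase(42) above is best read as a joke.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: reads one value of any stream-extractable type from
// the text, interpreting integers in the given base (8, 10 or 16).
// Returns false if the extraction failed.
template <typename T>
bool parse(const std::string& text, T& value, int base = 10) {
    std::istringstream in(text);
    in >> std::setbase(base) >> value;
    return !in.fail();
}
```

The same template body works for int, long, double and so on, because operator>> is already overloaded for all of them.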
The problem with template specialization is that the specialization requires you to match the original template function signature, so each specialization must implement the interface of (string,pos,base).
If you would like to have some other type which does not follow this interface, you are in trouble.
Suppose that, in the future, we would like to have sto<std::pair<int,int>>. We will want to have pos and base for both the first and the second stringified integer, so we would like the signature to be of the form string,pos1,base1,pos2,base2. Since the sto signature is already set, we cannot do it.
You can always wrap std::sto* in your implementation of sto for integral types, but you cannot do that the other way around.
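Such a wrapper might look like this in C++17. It is only a sketch: the name sto is taken from the question, the list of supported types is my choice, and if constexpr is used instead of explicit specializations.

```cpp
#include <cstddef>
#include <string>
#include <type_traits>

// Hypothetical generic front end: forwards to the matching std::sto*
// function for a handful of integral types and rejects everything else
// at compile time.
template <typename T>
T sto(const std::string& str, std::size_t* pos = nullptr, int base = 10) {
    if constexpr (std::is_same_v<T, int>) {
        return std::stoi(str, pos, base);
    } else if constexpr (std::is_same_v<T, long>) {
        return std::stol(str, pos, base);
    } else if constexpr (std::is_same_v<T, long long>) {
        return std::stoll(str, pos, base);
    } else if constexpr (std::is_same_v<T, unsigned long>) {
        return std::stoul(str, pos, base);
    } else {
        static_assert(std::is_same_v<T, unsigned long long>,
                      "sto<T>: unsupported integral type");
        return std::stoull(str, pos, base);
    }
}
```

Usage is then sto<long>("ff", nullptr, 16). Exceptions (std::invalid_argument, std::out_of_range) propagate unchanged from the wrapped functions.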
The purpose of these functions is to provide simple conversions for common cases. They are not intended as a general-purpose conversion suite. std::istringstream is much better for that kind of thing.
In my opinion, that would be a better design, because at the moment, when I have to convert a string to whatever numerical value a user wants, I have to handle each case manually.
No, it would not. The goal of templates (deliberately setting TMP aside) is not to replace overloading; you should always prefer overloading to templates. Actually, it's something the language already does for you! Between a candidate function and a possible template instantiation, the former will be preferred. Using language features for the sake of it is bad.
I don't see how templates could help either. Whatever type the user decides to input, it won't be known till runtime, and template types are deduced at compile time. C++ is a statically typed language. In this case, templates will just add an unneeded layer of complexity over normal function overloading.

C++ - Significance of local names for types

I recently read up on how classes are allowed to define their own local names for types. One of the famous examples is size_type, provided by almost all STL containers. It was also mentioned that doing so helps hide implementation details from the user of the class. I am not quite sure how this is the case.
What are some examples where defining local names for types might be useful and how doing so hides implementation details?
Please provide some examples where defining local names for types might be useful and how it hides implementation details.
It's more useful when you use templated algorithms or containers, which might assume that your type has such a type alias. So even if you modify the type behind size_type, e.g. change it for some reason from size_t to int, your type will still work with those algorithms and containers.
Beyond that, the presence of size_type is required by the standard when, for example, you implement your own allocator.
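As a small illustration (small_buffer and sum are invented for this example), generic code can name the index and element types through the class, so it keeps working whatever the class author picks:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical container that chooses a narrow size type internally.
class small_buffer {
public:
    using size_type = unsigned short;  // could change without touching callers
    using value_type = int;

    void push_back(value_type v) { items_.push_back(v); }
    size_type size() const { return static_cast<size_type>(items_.size()); }
    value_type operator[](size_type i) const { return items_[i]; }

private:
    std::vector<value_type> items_;  // implementation detail, not exposed
};

// Generic code refers to the types through the container, never directly.
template <typename Container>
typename Container::value_type sum(const Container& c) {
    typename Container::value_type total{};
    for (typename Container::size_type i = 0; i < c.size(); ++i) {
        total += c[i];
    }
    return total;
}
```

sum works unchanged with small_buffer and with std::vector<int>, although their size_type differs. That is the sense in which the alias hides the implementation choice.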
Suppose you have a program where you define several variables of type size_type, and that size_type is defined somewhere as an int.
Then, upon analysis and reflection, you realize that the variables never assume values bigger than 10 thousand. Therefore, the 32 bits used to store each of these variables are somewhat overkill. In this case, you can redefine size_type as short instead of int, and you will end up saving some memory.
Regarding the examples, you can check clock_t, char16_t, char32_t, wchar_t, true_type and false_type.