Remove last character of std::string_view - c++

I'm trying to remove the last character of a std::string_view, but no matter what I do it remains there. I think it's because I'm accidentally removing the "\0" instead of the desired "]".
Here is my code:
#include <iostream>
#include <tstr/tstring.h>
#include <cstring>
template<typename Class>
constexpr const char* to_string() {
std::string_view str = __PRETTY_FUNCTION__;
auto first = str.find("= ");
auto last = str.find("]");
auto str2 = str.substr(first + 2, last - first + 1);
return str2.data();
}
class Foo {};
int main()
{
std::cout << to_string<Foo>() << std::endl;
return 0;
}
This outputs Foo]. How can I remove the trailing ] ?
Thanks.

If you insert a pointer to char into a character stream, the pointed string is printed until the null terminator is reached. If there is no null terminator, then the behaviour of the program is undefined.
std::string_view is not guaranteed to be null terminated. Therefore it is dangerous to insert std::string_view::data into a character stream. In this particular case, the string view points to a non-null-terminated substring within a null terminated string, so the behaviour is well defined, but not what you intended because the output will proceed to the outside of the substring.
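A minimal sketch of that effect, assuming a view of "Foo" inside the null-terminated literal "Foo]". The helper names `length_seen_by_c_api` and `foo_view` are made up for illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string_view>

// Hypothetical helper: how many characters a C API (or operator<< on a
// const char*) reads when handed sv.data(). It scans to the next '\0' and
// ignores sv.size(). Only safe if the underlying buffer is null-terminated.
inline std::size_t length_seen_by_c_api(std::string_view sv) {
    assert(sv.data() != nullptr);
    return std::strlen(sv.data());
}

// A 3-character view of "Foo" inside the literal "Foo]".
inline std::string_view foo_view() {
    constexpr std::string_view whole = "Foo]";
    return whole.substr(0, 3);
}
```

Here `foo_view().size()` is 3, but `length_seen_by_c_api(foo_view())` is 4, because reading through the pointer continues up to the literal's own terminator.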
How can I remove the trailing ] ?
Return the string view to the substring rather than a pointer:
constexpr std::string_view to_string() {
...
return str2;
}
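A fuller sketch of that suggestion, assuming GCC or Clang (`__PRETTY_FUNCTION__` is a compiler extension and its exact text is not specified by the standard). The name `type_name` is a placeholder. Note that the second argument of `substr` is a count, not an end position, and that GCC may list further entries after the type once the return type is `std::string_view`, so the search stops at either `;` or `]`:

```cpp
#include <cassert>
#include <string_view>

// Assumption: GCC/Clang only. Typical expansions look like
//   "... type_name() [with Class = Foo; ...]"  (GCC)
//   "... type_name() [Class = Foo]"            (Clang)
template <typename Class>
constexpr std::string_view type_name() {
    std::string_view str = __PRETTY_FUNCTION__;
    const auto first = str.find("= ");
    assert(first != std::string_view::npos);
    const auto start = first + 2;
    // Stop at ';' (GCC lists more entries) or ']' (end of the list).
    const auto last = str.find_first_of(";]", start);
    return str.substr(start, last - start);  // (position, count)
}

class Foo {};
```

Streaming the returned view (`std::cout << type_name<Foo>()`) prints only the characters inside the view, so no null terminator is needed.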


I am facing an issue with string and null characters in C++. Null character is behaving differently

I am facing an issue with string and null characters in C++.
When I write '\0' in the middle of a string literal and print the string, I get only the part before the '\0'. On the other hand, when I read the string as input and then set one of its characters to '\0', it prints differently. Why is that, and why is sizeof(string) 32 in both cases?
Code is below for reference. Please help.
First code:
#include<iostream>
using namespace std;
int main(){
string s = "he\0llo";
cout<<s.length()<<"\n";
cout<<s<<endl;
cout<<sizeof(s)<<"\n";
}
output of First code:
2\n
he\n
32\n
Second code
#include<iostream>
using namespace std;
int main(){
string s;
cin>>s;
s[1] = '\0';
cout<<s<<"\n";
cout<<s.length()<<"\n";
cout<<sizeof(s)<<"\n";
return 0;
}
output of second code:
hllo\n
5\n
32\n
std::string's implicit const CharT* constructor and assignment operator don't know the exact length of the string argument. Instead, they only know that it is a const char* and are forced to assume that the string is null-terminated, so they compute the length using std::char_traits::length(...) (effectively std::strlen).
As a result, constructing a std::string object with an expression like:
std::string s = "he\0llo";
will compute 2 as the length of the string, since it assumes the first \0 character is the null terminator for the string, whereas your second example of:
s[1] = '\0';
is simply adding a null character into an already constructed string -- which does not change the size of the string.
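The two cases can be compared side by side. This is only a sketch; the function names are made up for illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Length computed by the const char* constructor: the scan stops at the
// first '\0', so only "he" is copied.
inline std::size_t size_from_literal() {
    std::string s = "he\0llo";
    return s.size();
}

// Overwriting a character of an existing string never changes its size.
inline std::size_t size_after_overwrite() {
    std::string s = "hello";
    assert(s.size() > 1);
    s[1] = '\0';
    return s.size();
}
```

`size_from_literal()` yields 2 and `size_after_overwrite()` yields 5, which matches the lengths printed by the two programs in the question.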
If you want to construct a string with a null character in the middle, you can't let it compute the length of the string for you. Instead, you will have to construct the std::string and give it the length in some other way. This could either be done with the string(const char*, size_t) constructor, or with an iterator pair if this is an array:
// Specify the length manually
std::string s{"he\0llo",6};
// Using iterators to a different container (such as an array)
const char c_str[] = "he\0llo";
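// Note: std::end(c_str) points past the array's own terminating '\0',
// so this string has 7 characters, the last one being '\0'.
std::string s{std::begin(c_str), std::end(c_str)};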
Note: sizeof(s) is telling you the size of std::string class itself in bytes for your implementation of the standard library. This does not tell you the length of the contained string -- which can be determined from either s.length() or s.size().
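A small check of that note, as a sketch (the function name is hypothetical). The actual value of `sizeof(std::string)` depends on the standard library implementation, so the code only compares sizes and does not assume 32:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// sizeof is a compile-time property of the type, so it is identical for
// every std::string object regardless of how many characters it holds.
inline bool sizeof_is_independent_of_contents() {
    const std::string empty;
    const std::string big(1000, 'x');
    assert(empty.size() == 0 && big.size() == 1000);
    return sizeof(empty) == sizeof(big) && sizeof(big) == sizeof(std::string);
}
```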
As of C++14, you can write a literal that is a std::string directly by using the s suffix from std::literals. This avoids the conversion from an array of chars to std::string, which stops at the first null character.
#include<iostream>
#include <string>
using namespace std;
using namespace std::literals;
int main() {
string s = "he\0llo"s; // This initializes s to the full 6 char sequence.
cout << s.length() << "\n";
cout << s << endl;
cout << sizeof(s) << "\n"; // prints size of the s object, not the size of its contents
}
Results:
6
hello
28

Convert std::string to char* when string has nulls in middle

char* convert()
{
std::string data = "stack\0over\0flow";
return data.c_str();
}
This returns a pointer, and when the caller builds a string from it, the result is stack instead of the complete string. Is there a workaround for this without changing the input and return types in C++11?
I want to rebuild the entire string on the caller side from char*.
Convert std::string to char* when string has nulls in middle
In order to convert a std::string that has nulls in the middle to a char*, you must first have a std::string that has nulls in the middle. You don't have such string.
Because you used the constructor std::string(const char*), the string that you created treated the passed pointer as a pointer to first element of a null terminated string, and as such the std::string only contains "stack".
You can use:
const auto& str = "stack\0over\0flow";
std::string data(str, std::size(str) - 1);
This will return stack instead of complete string
If the string were to actually contain "stack\0over\0flow", then c_str will return a pointer to the first element of the complete string "stack\0over\0flow".
If you treat the pointer as a pointer to null terminated string, then the first null terminator character terminates the null terminated string. There is no way to avoid that if you treat the pointer as a pointer to null terminated string. So, if you wish to avoid the string being terminated by the first null terminator character, then don't treat it as a pointer to a null terminated string (such as when you used the string literal as a pointer to null terminated string in your example).
However, that's mostly a moot issue since the pointed string will have been deallocated and the returned pointer will be dangling when the function returns. Attempting to access through the dangling pointer will result in undefined behaviour.
Furthermore, c_str always returns a const char* and never char*.
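One way to sidestep both problems is sketched below. It does change the return type, which the question wanted to avoid, because a bare char* carries no length. The names `convert_all` and `rebuild` are hypothetical:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Return the std::string by value so nothing dangles. The length is passed
// explicitly, so the embedded '\0' characters are kept.
inline std::string convert_all() {
    static const char raw[] = "stack\0over\0flow";
    return std::string(raw, sizeof(raw) - 1);  // drop the final terminator
}

// Caller side: rebuilding from a pointer only works together with a length.
inline std::string rebuild(const char* p, std::size_t n) {
    assert(p != nullptr);
    return std::string(p, n);
}
```

With `const std::string d = convert_all();`, `rebuild(d.data(), d.size())` reproduces all 15 characters, whereas `std::string(d.c_str())` stops after "stack".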
To be able to use the full string with the \0's safely, you must put the data in a buffer.
E.g. like this:
#include <iostream>
#include <array>
#include <vector>
template<std::size_t N>
auto make_buffer(const char(&chars)[N])
{
// note: returning an object by value uses RVO and nothing is left dangling.
std::array<char, N> buffer{};
for (std::size_t n = 0; n < N; ++n) buffer[n] = chars[n];
return buffer;
}
int main()
{
auto buffer = make_buffer("stack\0over\0flow");
// this is the closest you can get to having a char*
// pointing to ALL the data.
char* data_ptr = buffer.data();
// but you still must rely on the buffer size
// for correct looping over the valid values!
for (std::size_t n = 0; n < buffer.size(); ++n)
{
std::cout << data_ptr[n];
}
std::cout << std::endl;
// But with a buffer like this a range based for loop is recommended
for (const auto c : buffer)
{
std::cout << c;
}
std::cout << std::endl;
return 0;
}

C-style String not updated with = operator - C++

So recently I was playing with the concept of creating my own C++ classes that represent generic data (such as strings, numbers and arrays).
And so far my progress on this has been good (as seen here: https://github.com/LapysDev/LapysCPP).
Except one hitch. For the life of me, I can not figure out why the code below faults when it comes to creating a String class object with a variable amount of arguments.
#include <iostream>
#include <sstream>
#include <string.h>
// Make a new C-style string (or stringify a value).
char* stringify(char character) {
std::string stream = static_cast<std::ostringstream*>(&(std::ostringstream() << character)) -> str();
char* string = new char[stream.size() + 1];
strcpy(string, stream.c_str());
return string;
}
template <typename data> char* stringify(data string) { return strdup(std::string(string).c_str()); }
char* globalString = stringify("");
class String {
public:
char* value = stringify("");
String() {}
template <typename data>
String(data value) {
strcat(globalString, value);
this -> value = stringify(globalString);
globalString = stringify("");
}
template <typename data, typename... argumentsData>
String(data value, argumentsData... values) {
strcat(globalString, stringify(value));
String(values...);
}
};
int main(int argc, char* argv[]) {
std::cout << "String [1]: '" << String("Hello, World!").value << '\'' << std::endl;
// -> String [1]: 'Hello, World!'
std::cout << "String [3]: '" << String("Hello,", ' ', "World!").value << '\'';
// -> String [3]: ''
return 0;
}
I have tried everything I can with the code already (and yes, using an std::string for the text value is banned). If there's anyone out there that can explain why using multiple arguments faults when using char*'s, you're welcome to comment.
To summarize, I need to be able to create a String object with a proper value property using a variable amount of arguments.
// Works fine
String("Hello, World!").value // -> Hello, World!
// Needs fixing
String("Hello,", ' ', "World!").value // -> ...
I understand that this may not be the platform to ask questions of this nature, but a little help would go a long way. Thanks for reading through.
globalString is char* that points to the return value of stringify("");.
stringify("") returns strdup(std::string(string).c_str());. strdup returns a dynamically allocated copy of its parameter, with the same length and contents.
Here, "" contains only \0, so the C-string returned from strdup occupies just 1 byte.
You then try to call strcat(destination, source) with globalString as the destination, but globalString isn't big enough to fit the source.
strcat says:
The behavior is undefined if the destination array is not large
enough for the contents of both src and dest and the terminating null
character. The behavior is undefined if the strings overlap. The
behavior is undefined if either dest or src is not a pointer to a
null-terminated byte string.
So both of your test cases are UB. Even the first test that seems to work well.
std::string handles all of this for you. If you somehow aren't allowed to use it for whatever (stupid) reason a professor has given you, then make sure to allocate enough space for globalString before calling strcat on it. C-strings are tricky beasts.
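A minimal sketch of "allocate enough space first", without std::string. The helper `concat` is hypothetical and handles only two arguments. It uses std::unique_ptr<char[]> so the buffer is freed automatically:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>

// Concatenate two C strings into a freshly allocated buffer that is large
// enough for both inputs plus the terminating '\0'.
inline std::unique_ptr<char[]> concat(const char* a, const char* b) {
    assert(a != nullptr && b != nullptr);
    const std::size_t la = std::strlen(a);
    const std::size_t lb = std::strlen(b);
    std::unique_ptr<char[]> out(new char[la + lb + 1]);
    std::memcpy(out.get(), a, la);
    std::memcpy(out.get() + la, b, lb + 1);  // also copies b's terminator
    return out;
}
```

For example, `concat("Hello,", " World!").get()` points to "Hello, World!". A variadic version could apply the same idea by summing all lengths before allocating.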

std::string stops at \0

I am having problems with std::string..
Problem is that '\0' is being recognized as end of the string as in C-like strings.
For example following code:
#include <iostream>
#include <string>
int main ()
{
std::string s ("String!\0 This is a string too!");
std::cout << s.length(); // same result as with s.size()
std::cout << std::endl << s;
return 0;
}
outputs this:
7
String!
What is the problem here? Shouldn't std::string treat '\0' just as any other character?
Think about it: if you are given a const char*, how will you determine which 0 is the true terminator and which is an embedded one?
You need to either explicitly pass the size of the string, or construct the string from two iterators (pointers?)
#include <string>
#include <iostream>
int main()
{
auto& str = "String!\0 This is a string too!";
std::string s(std::begin(str), std::end(str));
std::cout << s.size() << '\n' << s << '\n';
}
Example: http://coliru.stacked-crooked.com/a/d42211b7199d458d
Edit: @Rakete1111 reminded me about string literals:
using namespace std::literals::string_literals;
auto str = "String!\0 This is a string too!"s;
Your std::string really has only 7 characters and a terminating '\0', because that's how you construct it. Look at the list of std::basic_string constructors: There is no array version which would be able to remember the size of the string literal. The one at work here is this one:
basic_string( const CharT* s,
const Allocator& alloc = Allocator() );
The "String!\0 This is a string too!" char const[] array is converted to a pointer to the first char element. That pointer is passed to the constructor and is all information it has. In order to determine the size of the string, the constructor has to increment the pointer until it finds the first '\0'. And that happens to be one inside of the array.
If you happen to work with a lot of zero bytes in your strings, then chances are that std::vector<char> or even std::vector<unsigned char> would be a more natural solution to your problem.
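A sketch of that alternative, assuming the data arrives as a raw buffer plus a length. The helper name `make_bytes` is made up:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Copy a raw buffer into a vector. Zero bytes are ordinary elements here,
// because the size is stored explicitly rather than found by scanning.
inline std::vector<unsigned char> make_bytes(const unsigned char* data,
                                             std::size_t n) {
    assert(data != nullptr || n == 0);
    return std::vector<unsigned char>(data, data + n);
}
```

Given `const unsigned char raw[] = {'a', 0, 'b', 0, 0};`, the call `make_bytes(raw, sizeof raw)` returns a vector with 5 elements.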
You are constructing your std::string from a string literal. String literals are automatically terminated with a '\0'. A string literal "f\0o" is thus encoded as the following array of characters:
{'f', '\0', 'o', '\0'}
The string constructor taking a char const* will be called, and will be implemented something like this:
string(char const* s) {
auto e = s;
while (*e != '\0') ++e;
m_length = e - s;
m_data = new char[m_length + 1];
memcpy(m_data, s, m_length + 1);
}
Obviously this isn't a technically correct implementation, but you get the idea. The '\0' you manually inserted will be interpreted as the end of the string literal.
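The difference between the array and what the constructor sees can be checked directly. This is just a sketch; the function name is hypothetical:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// The literal "f\0o" is an array of 4 chars: 'f', '\0', 'o', '\0'.
static_assert(sizeof("f\0o") == 4, "array includes both null characters");

// Once the array has decayed to a pointer, only the scan for '\0' is left,
// which is what the const char* constructor relies on.
inline std::size_t length_seen_by_constructor() {
    const char* p = "f\0o";
    assert(p[1] == '\0');
    return std::char_traits<char>::length(p);
}
```

`length_seen_by_constructor()` yields 1, and so does `std::string("f\0o").size()`.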
If you want to ignore the extra '\0', you can use a std::string literal:
#include <iostream>
#include <string>
int main ()
{
using namespace std::string_literals;
std::string s("String!\0 This is a string too!"s);
std::cout << s.length(); // same result as with s.size()
std::cout << std::endl << s;
return 0;
}
Output:
30
String! This is a string too!
\0 is known as the terminating character, so you'll need to escape it somehow.
For example, whenever you want a special character sequence to be taken literally, use two backslashes: "\\0".
And '\\0' is a two-character literal.
std::string test = "Test\\0 Test";
Results :
Test\0 Test
Most beginners also make this mistake when loading e.g. files:
std::ifstream some_file("\new_dir\test.txt"); //Wrong
//You should be using it like this :
std::ifstream some_file("\\new_dir\\test.txt"); //Correct
In very few words, you're constructing your C++ string from a standard C string.
And standard C strings are zero-terminated. So, your C string parameter will be terminated in the first \0 character it can find. And that character is the one you explicitly provided in your string "String!\0 This is a string too!"
And not in the 2nd one that is implicitly and automatically provided by the compiler at the end of your standard C string.
Escape your \0
std::string s ("String!\\0 This is a string too!");
and you will get what you need:
31
String!\0 This is a string too!
That's not a problem, that's the intended behavior.
Maybe you could elaborate why you have a \0 in your string.
Using a std::vector would allow you to use \0 in your string.

Inconsistency between std::string and string literals

I have discovered a disturbing inconsistency between std::string and string literals in C++0x:
#include <iostream>
#include <string>
int main()
{
int i = 0;
for (auto e : "hello")
++i;
std::cout << "Number of elements: " << i << '\n';
i = 0;
for (auto e : std::string("hello"))
++i;
std::cout << "Number of elements: " << i << '\n';
return 0;
}
The output is:
Number of elements: 6
Number of elements: 5
I understand the mechanics of why this is happening: the string literal is really an array of characters that includes the null character, and when the range-based for loop calls std::end() on the character array, it gets a pointer past the end of the array; since the null character is part of the array, it thus gets a pointer past the null character.
However, I think this is very undesirable: surely std::string and string literals should behave the same when it comes to properties as basic as their length?
Is there a way to resolve this inconsistency? For example, can std::begin() and std::end() be overloaded for character arrays so that the range they delimit does not include the terminating null character? If so, why was this not done?
EDIT: To justify my indignation a bit more to those who have said that I'm just suffering the consequences of using C-style strings which are a "legacy feature", consider code like the following:
template <typename Range>
void f(Range&& r)
{
for (auto e : r)
{
...
}
}
Would you expect f("hello") and f(std::string("hello")) to do something different?
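For reference, the effect in such a generic function can be reproduced with a small counting template. This is a sketch; `count_elements` and `show_difference` are illustrative names:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Counts the elements visited by a range-based for loop over r.
template <typename Range>
std::size_t count_elements(Range&& r) {
    std::size_t n = 0;
    for (auto&& e : r) {
        (void)e;  // the element itself is not needed
        ++n;
    }
    return n;
}

inline void show_difference() {
    // The literal is a const char[6]: five letters plus the terminator.
    assert(count_elements("hello") == 6);
    // The std::string range covers only the five letters.
    assert(count_elements(std::string("hello")) == 5);
}
```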
If we overloaded std::begin() and std::end() for const char arrays to return one less than the size of the array, then the following code would output 4 instead of the expected 5:
#include <iostream>
int main()
{
const char s[5] = {'h', 'e', 'l', 'l', 'o'};
int i = 0;
for (auto e : s)
++i;
std::cout << "Number of elements: " << i << '\n';
}
However, I think this is very undesirable: surely std::string and string literals should behave the same when it comes to properties as basic as their length?
String literals by definition have a (hidden) null character at the end of the string. std::strings do not. Because std::strings have a length, that null character is a bit superfluous. The standard section on the string library explicitly allows non-null terminated strings.
Edit
I don't think I've ever given a more controversial answer in the sense of a huge amount of upvotes and a huge amount of downvotes.
The auto iterator when applied to a C-style array iterates over each element of the array. The determination of the range is made at compile-time, not run time. This is ill-formed, for instance:
char * str;
for (auto c : str) {
do_something_with (c);
}
Some people use arrays of type char to hold arbitrary data. Yes, it is an old-style C way of thinking, and perhaps they should have used a C++-style std::array, but the construct is quite valid and quite useful. Those people would be rather upset if their auto iterator over a char buffer[1024]; stopped at element 15 just because that element happens to have the same value as the null character. An auto iterator over a Type buffer[1024]; will run all the way to the end. What makes a char array so worthy of a completely different implementation?
Note that if you want the auto iterator over a character array to stop early there is an easy mechanism to do that: Add an if (c == '\0') break; statement to the body of your loop.
Bottom line: There is no inconsistency here. The auto iterator over a char[] array is consistent with how the auto iterator works for any other C-style array.
That you get 6 in the first case is an abstraction leak that couldn't be avoided in C. std::string "fixes" that. For compatibility, the behaviour of C-style string literals does not change in C++.
For example, can std::begin() and std::end() be overloaded for
character arrays so that the range they delimit does not include the
terminating null character? If so, why was this not done?
Assuming access through a pointer (as opposed to char[N]), only by embedding a variable inside the string containing the number of characters, so that seeking for NULL isn't required any more. Oops! That's std::string.
The way to "resolve the inconsistency" is not to use legacy features at all.
According to N3290 6.5.4, if the range is an array, boundary values are
initialized automatically without begin/end function dispatch.
So, how about preparing some wrapper like the following?
struct literal_t {
char const *b, *e;
literal_t( char const* b, char const* e ) : b( b ), e( e ) {}
char const* begin() const { return b; }
char const* end () const { return e; }
};
template< int N >
literal_t literal( char const (&a)[N] ) {
return literal_t( a, a + N - 1 );
}
Then the following code will be valid:
for (auto e : literal("hello")) ...
If your compiler provides user-defined literal, it might help to abbreviate:
literal_t operator"" _l( char const* p, std::size_t l ) {
return literal_t( p, p + l ); // l excludes '\0'
}
for (auto e : "hello"_l) ...
EDIT: The following will have smaller overhead
(user-defined literal won't be available though).
template< size_t N >
char const (&literal( char const (&x)[ N ] ))[ N - 1 ] {
return (char const(&)[ N - 1 ]) x;
}
for (auto e : literal("hello")) ...
If you wanted the length, you should use strlen() for the C string and .length() for the C++ string. You can't treat C strings and C++ strings identically--they have different behavior.
The inconsistency can be resolved using another tool in C++0x's toolbox: user-defined literals. Using an appropriately-defined user-defined literal:
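// Note: suffixes without a leading underscore are reserved for the standard
// library; since C++14 this operator is provided in std::literals.
std::string operator""s(const char* p, size_t n)
{
return std::string(p, n);
}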
We'll be able to write:
int i = 0;
for (auto e : "hello"s)
++i;
std::cout << "Number of elements: " << i << '\n';
Which now outputs the expected number:
Number of elements: 5
With these new std::string literals, there is arguably no more reason to use C-style string literals, ever.