I am aware of the + operator for joining two std::string objects, and of this method for joining multiple strings, but I am wondering about the performance. This answer explains why the + operator is inefficient compared to a method (in Python). Is there a similar function for joining multiple strings in the C++ standard library?
An operator overload is an ordinary function with special call syntax, so there is no inherent performance difference between an operator overload and an equivalent method call. By the time you need to worry about the difference, you are micro-optimizing.
Here's an abstract example demonstrating the concept.
class MyString
{
public:
// assume this class is implemented
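// both functions take a MyString so that str1 + str2 below compiles
std::string operator+(const MyString &rhs) const
{
return concatenate(rhs);
}
std::string concatenate(const MyString &rhs) const
{
// some implementation
return std::string(); // placeholder result
}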
};
MyString str1, str2;
// we can use the operator overload
std::string option1 = str1 + str2;
// or we can call a method
std::string option2 = str1.concatenate(str2);
Operator overloads exist (for the most part) to avoid typing out lengthy method calls like the latter example. They make code cleaner and more concise.
If you were specifically talking about the performance of more than 2 strings, this is a different scenario. It's more performant to batch objects into one method call, as it avoids constructing more temporary objects than necessary. You won't be able to do this without a new data structure to do the heavy lifting for you.
Using the class above, we'll look at concatenating a bunch of objects together using the + operator in one expression.
std::string bigConcatenation = str1 + str2 + str1 + str2 + str1;
Firstly, you probably wouldn't be doing this if you were concerned about performance in the first place. That said, here's a pretty decent approximation of what would be happening (assuming no optimizations done by the compiler).
std::string bigConcatenation = str1;
bigConcatenation = bigConcatenation + str2;
bigConcatenation = bigConcatenation + str1;
bigConcatenation = bigConcatenation + str2;
bigConcatenation = bigConcatenation + str1;
The reason this is not ideal is that each + creates a temporary object holding the combined result, which is then assigned back to bigConcatenation.
Without using any extra containers, according to this answer, the most performant way of doing this would be something like this (hint: no temporaries are created in the process).
std::string bigConcatenation = str1;
bigConcatenation += str2;
bigConcatenation += str1;
bigConcatenation += str2;
bigConcatenation += str1;
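As a rough sketch of the "batch into one call" idea, the += approach can be combined with reserve() so that the result buffer is allocated once. The helper name concat_all is made up for illustration (it is not a standard library function), and it assumes the pieces are already available in a std::vector<std::string>:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of the standard library): concatenates all
// strings in `parts` into one std::string using a single up-front allocation.
inline std::string concat_all(const std::vector<std::string>& parts)
{
    // First pass: add up the lengths so the final size is known.
    std::size_t total = 0;
    for (const std::string& p : parts)
        total += p.size();

    std::string result;
    result.reserve(total);   // allocate once instead of growing repeatedly

    // Second pass: append in place, so no intermediate strings are created.
    for (const std::string& p : parts)
        result += p;
    return result;
}
```

A call such as concat_all(pieces) then replaces a chain of + operators. Whether this is measurably faster depends on the number and size of the strings, so it is worth profiling before adopting it.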
Concatenating strings is rarely enough a bottleneck in C++ that almost nobody ever attempts to optimize it much.
If you get into a situation where it's honestly important to do so, it's fairly easy to avoid the problem though, with fairly minimal change, so a = b + c + d + e; stays a = b + c + d + e; instead of something like a.concat(b, c, d, e); (which, I'd add, is actually quite non-trivial to make work well). Depending on the types involved, you may need to add code to convert the first item in the list to the right type. For the obvious example, we can't overload operator+ to work on string literals, so if you wanted to concatenate string literals together, you'd have to explicitly convert the first to some other type.
The "trick" is in something called expression templates (except that in this case it doesn't really even need to be a template). What you do is create a string builder object that overloads its operator+ to just store pointers/references to the source strings. That, in turn, overloads a conversion to std::string that adds up the lengths of all the source strings, concatenates them together into an std::string, and finally returns that entire string. For example, code can look like this:
#include <string>
#include <iostream>
#include <vector>
#include <numeric>
#include <cstring>
namespace string {
class ref;
class builder {
std::vector<ref const *> strings;
public:
builder(ref const &s) { strings.push_back(&s); }
builder & operator+(ref const &s) {
strings.push_back(&s);
return *this;
}
operator std::string() const;
};
class ref {
char const *data;
size_t N;
public:
ref(char const *s) : data(s), N(std::strlen(s)) {}
ref(char const *s, size_t N) : data(s), N(N) {} // N already excludes the terminating '\0'
size_t length() const { return N; }
operator char const *() const { return data; }
builder operator+(ref const &other) {
return builder(*this) + other;
}
};
builder::operator std::string() const {
size_t length = std::accumulate(strings.begin(), strings.end(),
size_t(),
[](size_t a, auto s) { return a + s->length(); });
std::string ret;
ret.reserve(length);
for (auto s : strings)
ret.append(*s);
return ret;
}
}
string::ref operator""_s(char const *s, std::size_t N) { return string::ref(s, N); }
int main() {
std::string a = "b"_s + "c" + "d" + "e";
std::cout << a << "\n";
}
Note that as the code is written above, you (at least theoretically) gain some run-time efficiency if you convert all the literals to string::ref objects up front:
std::string a = "b"_s + "c"_s + "d"_s + "e"_s;
This way, the compiler measures the length of each literal at compile time, and passes the length directly to the user-defined literal operator. If you just pass string literals directly, the code uses std::strlen to measure the length at run time instead.
You can use a stringstream. Remember to add the directive #include <sstream>
to the header file or source file of your class, whichever one has your method definitions.
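A minimal sketch of the stringstream approach, assuming a made-up helper named describe; the streaming operators also take care of converting non-string values such as int:

```cpp
#include <sstream>
#include <string>

// Illustrative helper (hypothetical name): builds one string from mixed
// pieces by streaming them into a std::ostringstream.
inline std::string describe(const std::string& name, int count)
{
    std::ostringstream out;
    out << name << " has " << count << " items";
    return out.str();   // copy of the accumulated buffer
}
```

This is convenient rather than fast: streams carry formatting and locale overhead, so for plain string-to-string concatenation += is usually the simpler choice.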
I've wanted to create a program using the operator new in order to obtain the right amount of memory for a string of characters.
#include <iostream>
#include <cstring>
using namespace std;
class String
{
private:
char* str;
public:
String(char* s)
{
int len = strlen(s);
str = new char[len + 1]; // points to a memory
strcpy(str, s);
}
~String()
{
cout << "Deleting";
delete[] str;
}
void display()
{
cout << str << endl;
}
};
int main()
{
String s1 = "who knows";
cout << "s1=";
s1.display();
return 0;
}
The constructor in this example takes a normal char* string as its argument. It obtains space in
memory for this string with new; str points to the newly obtained memory. The constructor
then uses strcpy() to copy the string into this new space. Of course, I've used a destructor as well.
However, the error is: no suitable constructor exists to convert from const char[10] to "String".
I'm a total beginner when it comes to pointers and I'm trying to understand why my constructor doesn't work as intended.
As noted in the comments, some compilers will accept your code (depending on how strict they are). For example, MSVC will accept it when "conformance mode" is disabled - specifically, the /Zc:strictStrings option.
However, to fully conform to strict C++ rules, you need to supply a constructor for your String class that takes a const char* argument. This can be done readily by just 'redirecting' that constructor to the one without the const keyword, and casting away the 'constness':
String(const char* cs) : String(const_cast<char*>(cs)) { }
An alternative (and IMHO far better) way is simply to add the const qualifier to your existing constructor's argument, as all the operations you do therein can be done perfectly well with a const char* (you would then not actually need the non-const version):
String(const char* s) {
int len = strlen(s);
str = new char[len + 1]; // points to a memory
strcpy(str, s);
}
Without one or other of these 'amendments' (or something equivalent), you are passing the address of a string literal (which is immutable) to a function (the constructor) that takes an argument that (in theory, at least) points to data that could be changed within that function; thus, a strict compiler is within its 'rights' to disallow this. As your constructor doesn't change the data, you should have no problem qualifying its argument as const.
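To see where the const char[10] in the error message comes from, here is a small sketch; the names literal_size and visible_length are invented for illustration. A string literal is an array of const char whose size includes the terminating '\0', and a function that only reads the characters can accept it through a const char* parameter:

```cpp
#include <cstddef>
#include <cstring>

// Deduces the array size N of a string literal, which has type const char[N].
template <std::size_t N>
constexpr std::size_t literal_size(const char (&)[N]) { return N; }

// "who knows" has 9 characters plus the terminator, hence const char[10].
static_assert(literal_size("who knows") == 10, "9 characters plus the terminator");

// Only reads from s, so a const char* parameter is sufficient and accepts
// string literals without any cast.
inline std::size_t visible_length(const char* s) { return std::strlen(s); }
```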
after years of writing Java, I would like to dig deeper into C++ again.
Although I think I can handle it, I don't know if I handle it the "state of the art"-way.
Currently I am trying to understand how to handle a std::string passed as a const pointer parameter to a method.
In my understanding, any string manipulations I would like to perform on the content of the pointer (the actual string) are not possible because it is const.
I have a method that should convert the given string to lower case and I did quite a big mess (I believe) in order to make the given string editable. Have a look:
class Util
{
public:
static std::string toLower(const std::string& word)
{
// in order to make a modifiable string from the const parameter
// copy into char array and then instantiate new std::string
int length = word.length();
char workingBuffer[length];
word.copy(workingBuffer, length, 0);
// create modifiable string
std::string str(workingBuffer, length);
std::cout << str << std::endl;
// string to lower case (include <algorithm> for this!!!!)
std::transform(str.begin(), str.end(), str.begin(), ::tolower);
std::cout << str << std::endl;
return str;
}
};
Especially the first part, where I use the char buffer, to copy the given string into a modifiable string annoys me.
Are there better ways to implement this?
Regards,
Maik
The parameter is const (it's a reference, not a pointer!) but that does not prevent you from copying it:
// create modifiable string
std::string str = word;
That being said, why did you make the parameter a const reference in the first place? Using a const reference is good for avoiding a copy of the argument, but if you need the copy anyhow, then simply take the parameter by value:
std::string toLower(std::string word) {
std::transform(word.begin(), word.end(), word.begin(), ::tolower);
// ....
Remember that C++ is not Java: values are values, not references. Copies are real copies, and modifying word inside the function won't have any effect on the argument that is passed to the function.
You should replace all this:
// in order to make a modifiable string from the const parameter
// copy into char array and then instantiate new std::string
int length = word.length();
char workingBuffer[length];
word.copy(workingBuffer, length, 0);
// create modifiable string
std::string str(workingBuffer, length);
with simply this:
std::string str(word);
and it should work just fine =)
As you must make a copy of the input string, you may as well take it by value (also better use a namespace than a class with static members):
namespace util {
// modifies the input string (taken by reference), then returns a reference to
// the modified string
inline std::string&convert_to_lower(std::string&str)
{
for(auto&c : str)
c = std::tolower(static_cast<unsigned char>(c));
return str;
}
// returns a modified version of the input string, taken by value such that
// the passed string at the caller remains unaltered
inline std::string to_lower(std::string str)
{
// str is a (deep) copy of the string provided by caller
convert_to_lower(str);
// str is a by-value parameter, so it is moved (not deep-copied) upon return
return str;
}
}
std::string str = "Hello";
auto str1 = util::to_lower(str);
std::cout << str << ", " << str1 << std::endl;
leaves str un-modified: it prints
Hello, hello
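I cast to unsigned char because std::tolower has undefined behavior if its argument is neither representable as unsigned char nor equal to EOF, which can happen with negative char values.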
I'm working through an old book of C++ at the moment, and in it was developed a "rational numbers" class to introduce the idea of operator overloading, etc. Here's some example code from the book:
interface:
const Rational operator+(const Rational& Rhs) const;
implementation:
const Rational Rational::operator+(const Rational& Rhs) const
{
Rational Answer(*this);
Answer += Rhs;
return Answer;
}
The copy constructor does what you think it would do, and the += operator is correctly overloaded.
I decided to practice a bit by implementing a string class, and so I took a similar approach. My += overload works fine, but the + seems to have no effect ultimately.
interface:
const String operator+(const String&) const;
implementation:
const String String::operator+(const String& Rhs) const
{
String Answer (*this);
Answer += Rhs;
return Answer;
}
where the copy constructor (which works) is defined as such:
String::String(const String& str)
{
unsigned _strlen = str.len() + 1;
content = new char[_strlen];
std::memcpy(content, str.content, _strlen);
length = _strlen - 1;
content[length] = '\0';
}
and += is overloaded by the following:
const String& String::operator+=(const String& Rhs)
{
unsigned _Addl = Rhs.len();
unsigned newLen = _Addl + length; //length is member variable -- current length of content
content = (char*) realloc( content, newLen+1 );
std::memcpy(content+length, Rhs.content, _Addl);
content[newLen] = '\0';
return *this;
}
However -- while I can get proper output for +=, the + operator is failing to actually return a concatenated string. With debugging outputs inside of the function, Answer is holding the correct content, but it returns the original String instead of the concatenated one. I have a feeling this is something to do with const's being everywhere, but I've tried without it as well to no good fortune.
Test Code:
(in main):
String s1 ("Hello");
String s2 (" World!");
String s3 = (s1+s2); //prints "Hello" when s3 is output
cout << (s1+s2) << endl; //prints "Hello"
String constructor for const char*
String::String(const char* str)
{
unsigned _strlen = strlen(str) + 1;
content = new char[_strlen];
std::memcpy(content, str, _strlen);
length = _strlen - 1;
content[length] = '\0';
}
Your String::operator+=(), despite your claim it is properly implemented, is not properly implemented.
First, realloc() returns NULL if it fails, and your code is not checking for that.
Second, and more critical, is that the length member is not being updated. Since your code calls the len() member function to get the length of one string, and uses the length member to get the length of the other, all of your functions need to ensure those two methods are in sync (i.e. that they give consistent results for a given instance of String). Since length is not being updated, your code does not ensure that.
There are probably better approaches than using C-style memory allocation as well (in particular, passing memory obtained from new[] to realloc() is undefined behavior), but (assuming this is a learning exercise) I'll leave that alone.
You've given no pertinent code for your Rational class but, if it is not working, your code presumably exhibits similar inconsistencies between what various constructors and member functions do.
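The two problems identified in operator+= can be illustrated with a stripped-down sketch. This is not the asker's full class: the c_str() accessor and the free-function operator+ are assumptions added for the example, and new[]/delete[] replaces realloc() so that allocation and deallocation match. The key line is the one that updates length.

```cpp
#include <cstddef>
#include <cstring>

class String {
    char* content;
    std::size_t length;
public:
    String(const char* s)
        : content(new char[std::strlen(s) + 1]), length(std::strlen(s))
    { std::memcpy(content, s, length + 1); }

    String(const String& other)
        : content(new char[other.length + 1]), length(other.length)
    { std::memcpy(content, other.content, length + 1); }

    String& operator=(const String& other) {
        // Allocate and copy first, so a failed allocation leaves *this intact.
        char* fresh = new char[other.length + 1];
        std::memcpy(fresh, other.content, other.length + 1);
        delete[] content;
        content = fresh;
        length = other.length;
        return *this;
    }

    ~String() { delete[] content; }

    String& operator+=(const String& rhs) {
        std::size_t newLen = length + rhs.length;
        char* fresh = new char[newLen + 1];   // throws std::bad_alloc on failure
        std::memcpy(fresh, content, length);
        std::memcpy(fresh + length, rhs.content, rhs.length + 1); // includes '\0'
        delete[] content;
        content = fresh;
        length = newLen;                      // the update the original missed
        return *this;
    }

    std::size_t len() const { return length; }
    const char* c_str() const { return content; }   // hypothetical accessor
};

// operator+ defined in terms of +=, taking the left operand by value.
inline String operator+(String lhs, const String& rhs) { lhs += rhs; return lhs; }
```

With length kept in sync, the copy constructor called inside operator+ copies the full concatenated buffer rather than only the first len() characters.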
In Lua (apologise, I like working with it the best), the conversion between int and string is done automatically, so
"hi"..2
would result as
"hi2"
In C++ (cause I can't seem to get the default C++11 stoi() and to_string() methods to work) I defined these for myself:
int stoi(string str) {
char* ptr;
strtol(str.c_str(), &ptr, 10);
}
string to_string(int i) {
char* buf;
sprintf(buf, "%d", i);
return buf;
}
which are basically how the default ones are defined anyways.
Then I did this:
string operator+ (string& stuff, int why) {
stuff.append(to_string(why));
}
I tried it on the following code:
void print(string str) {
cout << str << endl;
}
int main() {
cout << stoi("1") + 2 << endl;
print("die" + 1);
return 0;
}
And it outputs
3
ie
Why is this so, and how can I fix it?
EDIT:
Here's what the code looks like now:
using namespace std;
string to_string(int i) {
char* buf;
sprintf(buf, "%d", i);
return buf;
}
string operator+ (string stuff, int why) {
stuff.append(to_string(why));
return stuff;
}
int main() {
cout << string("die") + 2 << endl;
return 0;
}
And it just keeps giving me stackdumps.
Replace print("die" + 1); with cout << std::string("die") + 1;
print() itself is not the problem. "die" is a string literal that decays to a const char*, so + 1 increments the pointer before the std::string is constructed. Convert to std::string first, as above.
std::string to_string(int i) {
char buf[(sizeof(int)*CHAR_BIT+2)/3+3];
sprintf(buf, "%d", i);
return buf;
}
You need to make an actual buffer to print to. The math is a quick over-estimate of how big the largest decimal int is in characters: each decimal character covers at least 3 bits, plus null, plus negation, plus rounding, plus 1 for good measure. Hopefully I did not err: do some testing.
Also use snprintf instead of sprintf while you are at it: buffer overflows are not to be toyed with.
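A sketch of the same conversion using snprintf, under the same buffer-size estimate as above. The name int_to_string is hypothetical, chosen to avoid clashing with std::to_string, and returning an empty string on failure is just one possible choice:

```cpp
#include <climits>
#include <cstdio>
#include <string>

// Hypothetical helper: converts an int to its decimal representation.
inline std::string int_to_string(int i)
{
    // Over-estimate: about one decimal digit per 3 bits, plus room for the
    // sign and the terminating '\0'.
    char buf[(sizeof(int) * CHAR_BIT + 2) / 3 + 3];

    // snprintf never writes more than sizeof buf bytes, terminator included.
    int written = std::snprintf(buf, sizeof buf, "%d", i);
    if (written < 0 || static_cast<std::size_t>(written) >= sizeof buf)
        return std::string();   // formatting failed or output was truncated

    return std::string(buf, static_cast<std::size_t>(written));
}
```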
The next problem is that "hello" is not a std::string. It is a char const[6], an array of 6 char. It can be converted to std::string, but + 1 will instead convert it to a pointer to the first character, then advance it to the 2nd character.
Cast it to std::string before.
Finally, it is ambiguous in the standard (really) whether overloading an operator on std::string + int is legal. It is definitely poor practice, as you cannot do it in std legally, and you should overload operators in the type's namespace (so ADL works): these two conflict. On top of that, if std in the future adds such a + your code starts behaving strangely. On top of that, operators are part of a class's interface, and modifying the interface of a class you do not 'own' is rude and a bad habit.
Write your own string class that owns a std::string instead. Or a string view.
Finally, consider telling your compiler to use c++11, you probably just need to pass a flag to it like -std=c++11.
std::string s1("hi");
std::string s2("2");
s1 += s2;
If you are using C++11 compatible compiler you can convert int to string like this:
int i = 2;
std::string s = std::to_string(i);
If you are using Boost library:
#include <boost/lexical_cast.hpp>
int i = 2;
std::string s = boost::lexical_cast<std::string>(i);
Please do not use raw char pointers in C++ for strings.
Overloading operator+ on types other than your own is at best dangerous.
Just use std::to_string in conjunction with operator+ or +=, e.g.
std::string x = "hi";
x += std::to_string(2);
C++14 introduces a user-defined literal that takes a string literal (conversions are applied to make this a pointer) and returns a std::string. In C++11, you can just write your own (this is taken from libstdc++):
inline std::string
operator""_s(const char* str, size_t len)
{
return std::string{str, len};
}
(Note: UDLs without a preceding underscore are reserved names)
And you can use it like this:
// Assumes operator+ is overloaded
print("die"_s + 1);
"die" is not a std::string. It's a string literal.
Thus when you add 1 to the string literal, it decays to a const char* and the + 1 simply increments that pointer — to next char, 'i'.
Then you call print with the incremented pointer, which causes a std::string to be constructed using that pointer. Since it pointed to the 'i' character, the constructed string is initialized to "ie".
You must first make a std::string out of your string literal to make it call your operator+:
std::cout << std::string("die") + 1;
And then make a few fixes to your operator+:
string operator+ (string stuff, int why) {
return stuff.append(to_string(why));
}
Now it works.
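The difference between the two expressions can be checked with a short sketch. The function names are invented for the example, and it uses std::to_string (C++11) instead of the hand-written operator+ so that it stands alone:

```cpp
#include <string>

// Mirrors "die" + 1 from the question: the literal decays to a pointer,
// which is advanced by one character before the std::string is built.
inline std::string literal_plus_one()
{
    const char* p = "die";
    return std::string(p + 1);
}

// Converting to std::string first makes + mean concatenation.
inline std::string string_plus_one()
{
    return std::string("die") + std::to_string(1);
}
```

The first function yields "ie", matching the output seen in the question, while the second yields "die1".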
I'm writing a string class myself, and I overloaded the + operator. It works fine, but when I tried to assign cstr = str + pop, it did nothing. You can see my error in the main() function. The compiler doesn't report any errors.
#include <iostream>
#include <string.h>
#include <stdlib.h>
using namespace std;
class S {
public:
S();
S(const char *str);
S(const S &s);
~S() { delete []string;}
S &operator =(const S &s);
int lenght() const {return l ;}
char* strS() const {return string;}
friend ostream &operator <<(ostream &, const S &first) {cout<<first.string;}
friend S operator+ (const S& first, const S& second);
private:
char *string;
int l;
};
int main(){
S pop("Q6");
S str("M5");
S cstr = str +pop; // works correct
cout<<str;
str = str + pop;
cout<<str ; // doesnt work, it doesnt write in terminal
return 0;
}
S::S()
{
l = 0;
string = new char[1];
string[0]='\0';
}
S::S(const char *str)
{
l = strlen(str);
string = new char[l+1];
memcpy(string, str, l+1);
}
S::S(const S &s)
{
l = s.l;
string = new char[l+1];
memcpy(string,s.string,l+1);
}
S &S::operator=(const S &s)
{
if (this != &s)
{
delete []string;
string = new char[s.l+1];
memcpy(string,s.string,s.l+1);
return *this;
}
return *this;
}
S operator +(const S& first, const S& second)
{
S temp;
temp.string = strcat(first.strS(),second.strS());
temp.l = first.lenght() + second.lenght();
return temp;
}
I'm looking forward to your help.
Your operator has bugs!
S temp;
//^^^^ has only one byte buffer!!!
temp.string = strcat(first.strS(),second.strS());
// 1 byte ^^^^^ strcat appends second.strS to first.strS
You should re-allocate memory for temp:
S temp;
temp.l = first.lenght() + second.lenght();
delete [] temp.string; // !!!! -
temp.string = new char[temp.l + 1]; // !!!!
// you should have another c-tor which can allocate memory!!!
// like: S(unsigned length, unsigned char c = '\0')
strcpy(temp.string, first.strS());
strcat(temp.string, second.strS());
Besides this obvious bug, you should also take care of exceptions, std::bad_alloc for example. Look at the copy-and-swap idiom for a better approach to this task.
From the manpage for strcat:
The strcat() and strncat() functions append a copy of the null-terminated
string s2 to the end of the null-terminated string s1, then add a
terminating `\0'. The string s1 must have sufficient space to hold the
result.
You're using it as if it allocates room for a new char array, then fills it. But, it doesn't do that.
The problem is that your operator+ doesn't allocate any memory for the combined string. Nor does it copy the string to the right place (it copies the string to first, not to temp). There's no easy fix with the class design you have.
The problem is with your implementation of operator+. strcat() appends the string pointed to by the second argument to the string pointed to by the first argument. The return value is the first argument. Therefore, on return from operator+, the resulting S and the first S argument will be pointing to the same buffer, which will later be deleted twice...
Check the description of strcat. It appends the second argument to
the first, supposing both are null-terminated strings, and returns the
first argument. In your case:
- it appends to the string member of first, although there isn't
enough memory for it (undefined behavior), and
- it sets the string pointer in temp to point to the same memory as
that in first; the first one to be destructed leaves the other
pointing to deleted memory, and the memory allocated in the default
constructor of temp is leaked.
Also, you never terminate your strings with '\0', so strcat may do
just about anything.
A better solution would be to implement += first, and define + in
terms of it. += would have to grow the memory it has, and append the
text from the second string to it.
And while I'm at it: your operator= doesn't work either. It will
leave the object in a state where it cannot be destructed if the new
fails (throwing std::bad_alloc). You must ensure that all operations
that can fail occur before the delete. (The fact that you need to
test for self assignment is a warning sign. It's very rare for this
test to be necessary in a correctly written assignment operator.) In
this case, the swap idiom would probably be your best bet: copy
construct a new S in a local variable, then swap their members.
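Putting the suggestions from these answers together, here is one possible sketch: += is implemented first, + is defined in terms of it, and assignment uses the swap idiom. The member names string_ and l_, the c_str() accessor, and the swap() member are assumptions made for the example, not part of the original class:

```cpp
#include <cstddef>
#include <cstring>
#include <utility>

class S {
public:
    S() : string_(new char[1]), l_(0) { string_[0] = '\0'; }
    S(const char* str)
        : string_(new char[std::strlen(str) + 1]), l_(std::strlen(str))
    { std::memcpy(string_, str, l_ + 1); }
    S(const S& s) : string_(new char[s.l_ + 1]), l_(s.l_)
    { std::memcpy(string_, s.string_, l_ + 1); }
    ~S() { delete[] string_; }

    void swap(S& other) {
        std::swap(string_, other.string_);
        std::swap(l_, other.l_);
    }

    // Swap idiom: s is already a copy of the right-hand side. If making that
    // copy throws, *this has not been touched, and no self-assignment test
    // is needed.
    S& operator=(S s) { swap(s); return *this; }

    // Grows the buffer and appends rhs; everything that can fail (the
    // allocation) happens before the old buffer is deleted.
    S& operator+=(const S& rhs) {
        char* grown = new char[l_ + rhs.l_ + 1];
        std::memcpy(grown, string_, l_);
        std::memcpy(grown + l_, rhs.string_, rhs.l_ + 1);  // includes '\0'
        delete[] string_;
        string_ = grown;
        l_ += rhs.l_;
        return *this;
    }

    std::size_t length() const { return l_; }
    const char* c_str() const { return string_; }   // hypothetical accessor

private:
    char* string_;
    std::size_t l_;
};

// + in terms of +=: first is a copy, so neither operand is modified.
inline S operator+(S first, const S& second) { first += second; return first; }
```

Each S now owns its own buffer, so str = str + pop no longer leaves two objects pointing at the same memory.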