Is there any way I could further optimize this program? [duplicate] - c++

This question already has answers here:
read words from line in C++
(2 answers)
Right way to split an std::string into a vector<string>
(12 answers)
Closed 12 days ago.
I've been trying to learn how to optimize C++ in the context of low latency trading systems, and am wondering if this implementation could be improved. I would appreciate any insight, either specific or general.
// Code to add each word in string to vector
#include <string>
#include <vector>

int main() {
    std::string originalText = "Hello World!";
    std::vector<std::string> words;
    words.reserve(originalText.length()); // unsure if this could be more accurate
    std::size_t wStart = 0;
    std::size_t pos = originalText.find(" ");
    while(pos != std::string::npos) {
        words.emplace_back(&originalText[wStart], pos - wStart);
        wStart = pos + 1;
        pos = originalText.find(" ", wStart);
    }
    words.emplace_back(&originalText[wStart], originalText.size() - wStart);
    return 0;
}

You still allocate and copy std::strings from the input, which isn't needed; you can use std::string_view instead. This is nowhere near optimized for a trading system, though.
I still use a vector, and that will reallocate as it grows.
When optimizing code there are two things to do: keep thinking about and researching whether there is another algorithm (that usually brings the biggest performance gain), and keep profiling to find your real bottlenecks.
Anyway, here is a split function that splits without making string copies and can split on multiple delimiter characters (you can still optimize it to split only on spaces if that is the only delimiter). The returned views point into the input, so the input must outlive them:
#include <algorithm>
#include <string_view>
#include <vector>

std::vector<std::string_view> split_string(std::string_view string, std::string_view delimiters)
{
    std::vector<std::string_view> substrings;
    if (delimiters.empty())
    {
        // No delimiters: the whole input is a single token.
        substrings.emplace_back(string);
        return substrings;
    }
    // Skip leading delimiters; npos here means there are no tokens at all.
    auto start_pos = string.find_first_not_of(delimiters);
    auto end_pos = start_pos;
    auto max_length = string.length();
    while (start_pos < max_length)
    {
        // End of the current token: next delimiter, or end of input.
        end_pos = std::min(max_length, string.find_first_of(delimiters, start_pos));
        if (end_pos != start_pos)
        {
            substrings.emplace_back(&string[start_pos], end_pos - start_pos);
            // Jump over the run of delimiters to the start of the next token.
            start_pos = string.find_first_not_of(delimiters, end_pos);
        }
    }
    return substrings;
}
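On the reserve question and the reallocation caveat above: one option is to count the delimiters first, so the vector is sized exactly once. The following is only a sketch under stated assumptions: a single delimiter character, C++17 for std::string_view, and `split_on_char` is a hypothetical name. Unlike the function above, it keeps empty pieces between consecutive delimiters, which matches the behaviour of the code in the question.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical helper: splits on one delimiter character.
// The returned views point into `text`, so `text` must outlive them.
std::vector<std::string_view> split_on_char(std::string_view text, char delim = ' ')
{
    // One pass to count delimiters gives the exact number of pieces.
    const auto pieces =
        static_cast<std::size_t>(std::count(text.begin(), text.end(), delim)) + 1;

    std::vector<std::string_view> parts;
    parts.reserve(pieces);  // single allocation, no regrowth later

    std::size_t start = 0;
    for (std::size_t pos = text.find(delim); pos != std::string_view::npos;
         pos = text.find(delim, start)) {
        parts.push_back(text.substr(start, pos - start));
        start = pos + 1;
    }
    parts.push_back(text.substr(start));  // last piece (may be empty)

    assert(parts.size() == pieces);  // the reserve was exact
    return parts;
}
```

Whether the extra counting pass pays for itself depends on the input, so it is worth measuring against the simpler version before adopting it.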

Related

string::replace not working correctly 100% of the time?

I'm trying to replace every space character with '%20' in a string, and I'm thinking of using the built-in replace function of the string class.
Currently, I have:
void replaceSpace(string& s)
{
    int len = s.length();
    string str = "%20";
    for(int i = 0; i < len; i++) {
        if(s[i] == ' ') {
            s.replace(i, 1, str);
        }
    }
}
When I pass in the string "_a_b_c_e_f_g__", where the underscores represent spaces, my output is "%20a%20b%20c%20e_f_g__". Again, underscores represent spaces.
Why is it that the spaces near the beginning of the string are replaced, but the spaces towards the end aren't?
You are making s longer with each replacement, but you are not updating len which is used in the loop condition.
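A minimal sketch of that fix, keeping the forward loop from the question: the loop condition reads the current length on every iteration, and the index skips over the text that was just inserted. The skip is my addition; it is not needed for "%20" but avoids rescanning the replacement.

```cpp
#include <string>

void replaceSpace(std::string& s)
{
    const std::string str = "%20";
    // Re-read s.length() each time, because s grows with every replacement.
    for (std::string::size_type i = 0; i < s.length(); ++i) {
        if (s[i] == ' ') {
            s.replace(i, 1, str);
            i += str.length() - 1;  // step over the inserted "%20"
        }
    }
}
```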
Modifying the string that you are just scanning is like cutting the branch under your feet. It may work if you are careful, but in this case you aren't.
Namely, you take the string len at the beginning but with each replacement your string gets longer and you are pushing the replacement places further away (so you never reach all of them).
The correct way to cut this branch is from its end (tip) towards the trunk - this way you always have a safe footing:
void replaceSpace(string& s)
{
    int len = s.length();
    string str = "%20";
    for(int i = len - 1; i >= 0; i--) {
        if(s[i] == ' ') {
            s.replace(i, 1, str);
        }
    }
}
You're growing the string but only looping to its initial size.
Looping over a collection while modifying it is very prone to error.
Here's a solution that doesn't:
void replace(string& s)
{
    string s1;
    std::for_each(s.begin(),
                  s.end(),
                  [&](char c) {
                      if (c == ' ') s1 += "%20";
                      else s1 += c;
                  });
    s.swap(s1);
}
As others have already mentioned, the problem is you're using the initial string length in your loop, but the string gets bigger along the way. Your loop never reaches the end of the string.
You have a number of ways to fix this. You can correct your solution and make sure you go to the end of the string as it is now, not as it was before you started looping.
Or you can use @molbdnilo's way, which creates a copy of the string along the way.
Or you can use something like this:
std::string input = " a b c e f g ";
std::string::size_type pos = 0;
while ((pos = input.find(' ', pos)) != std::string::npos)
{
    input.replace(pos, 1, "%20");
}
Here's a function that can make it easier for you:
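string replace_char_str(string str, const string& find_str, const string& replace_str)
{
    if (find_str.empty()) return str;   // an empty search string would never advance
    size_t pos = 0;
    while ((pos = str.find(find_str, pos)) != std::string::npos)
    {
        // Replace the whole match, not just its first character.
        str.replace(pos, find_str.length(), replace_str);
        // Continue after the inserted text so it is never rescanned.
        pos += replace_str.length();
    }
    return str;
}
So when you want to replace the spaces, try it like this:
    string new_str = replace_char_str(yourstring, " ", "%20");
Hope this helps! :)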

Replace all occurrences of the search string after specific position

I'm looking for a replace-all algorithm that replaces all occurrences of a substring after a specific position. So far I have a replace_all_copy approach. What is the most convenient way to do it without allocating any new string other than this one? Is there a convenient way to do it with Boost?
#include <iostream>
#include <iterator>
#include <string>
#include <boost/algorithm/string/replace.hpp>
int main() {
    std::string str = "1234 abc1234 marker 1234 1234 123 1 12 134 12341234";
    const std::string marker("marker");
    size_t pos = str.find(marker);
    if (pos == std::string::npos) {
        return 0;
    }
    pos += marker.length();
    std::string str_out(str, 0, pos);
    boost::algorithm::replace_all_copy(std::back_inserter(str_out), str.substr(pos, std::string::npos), "12", "XXXX");
    std::cout << str << std::endl;
    std::cout << str_out << std::endl;
}
If you want to do an in-place find and replace operation, you'll have to be aware of the performance implications. In order to do such an operation, you will likely have to read the string backwards which can lead to bad cache behavior, or do a lot of memory shuffling, which can also be bad for performance. In general, it's best to do a replace-in-copy operation since any strings you'll be operating on will likely be relatively small, and most memory caches will handle things quite easily.
If you must have an in-place find and replace algorithm, use the following code if you're just looking for a drop-in function. I benchmarked it and it's very fast.
std::string& find_replace_in_place( std::string &haystack, const std::string needle, const std::string replacement, size_t start = 0 ){
    if( needle.empty() ) return haystack;   // an empty needle would match forever
    size_t position = haystack.find( needle, start );
    while( position != std::string::npos ){
        haystack.replace( position, needle.length(), replacement );
        // Resume after the replacement so it is not searched again.
        position = haystack.find( needle, position + replacement.length() );
    }
    return haystack;
}
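For the replace-in-copy route recommended above, a standard-library-only sketch might look like the following. It builds the output in a single string and avoids the temporary created by `str.substr(pos, ...)` in the question. `replace_all_from` is a hypothetical name, not a Boost or standard function, and the `reserve` is only a lower bound when the replacement is longer than the needle.

```cpp
#include <string>

// Hypothetical helper: returns a copy of `in` where every `needle`
// found at or after `start` is replaced by `replacement`.
std::string replace_all_from(const std::string& in, const std::string& needle,
                             const std::string& replacement,
                             std::string::size_type start = 0)
{
    if (needle.empty()) return in;  // nothing sensible to search for

    std::string out;
    out.reserve(in.size());  // may still grow if replacement is longer

    std::string::size_type copied = 0;  // how much of `in` is already in `out`
    for (auto pos = in.find(needle, start); pos != std::string::npos;
         pos = in.find(needle, copied)) {
        out.append(in, copied, pos - copied);  // text before the match
        out += replacement;
        copied = pos + needle.size();
    }
    out.append(in, copied, std::string::npos);  // tail after the last match
    return out;
}
```

With the question's data, the call would be `replace_all_from(str, "12", "XXXX", pos)` where `pos` is the index just past "marker".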

need help on splitting a string in c++ [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Splitting a string in C++
I need a suggestion on how to take one string of text and split it up based on a certain character, in this case ",", without using any outside libraries.
The lines of text are:
Amadeus,Drama,160 Mins.,1984,14.83
As Good As It Gets,Drama,139 Mins.,1998,11.3
Batman,Action,126 Mins.,1989,10.15
Billy Elliot,Drama,111 Mins.,2001,10.23
BR,SF,117,1982,11.98
Shadowlands,Drama,133 Mins.,1993,9.89
Shrek,Animation,93 Mins,2001,15.99
Snatch,Action,103 Mins,2001,20.67
The Lord of the Rings,Fantasy,178 Mins,2001,25.87
If you don't want to resort to other libraries (Boost.Tokenizer is a good choice IMO), here is some simple code for doing that:
#include <string>
#include <vector>
using namespace std;
vector<string> tokenize(string const& s, string const& separator)
{
    size_t start = 0;
    size_t pos = s.find(separator);
    vector<string> v;
    while (pos != string::npos)
    {
        string sub = s.substr(start, pos - start);
        v.push_back(sub);
        // Skip the whole separator, which may be longer than one character.
        start = pos + separator.length();
        pos = s.find(separator, start);
    }
    // Last token: pos is npos here, so substr runs to the end of s.
    string sub = s.substr(start, pos - start);
    v.push_back(sub);
    return v;
}
int main()
{
    string s = "asfa,adf,daf,c";
    vector<string> v = tokenize(s, ",");
    // Do what you want with v...
    return 0;
}
You could just find the indices of the commas and store them in a vector, then use string::substr (http://www.cplusplus.com/reference/string/string/substr/) to get the substrings between those indices.
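That two-step idea could be sketched roughly as follows, using only the standard library. `split_by_indices` is a hypothetical name, and the sketch assumes a single-character separator and keeps empty fields.

```cpp
#include <string>
#include <vector>

// Hypothetical helper illustrating "collect indices, then substr".
std::vector<std::string> split_by_indices(const std::string& line, char sep = ',')
{
    // Step 1: record the index of every separator.
    std::vector<std::string::size_type> cuts;
    for (auto pos = line.find(sep); pos != std::string::npos;
         pos = line.find(sep, pos + 1)) {
        cuts.push_back(pos);
    }

    // Step 2: take the substrings between consecutive indices.
    std::vector<std::string> fields;
    fields.reserve(cuts.size() + 1);
    std::string::size_type start = 0;
    for (auto cut : cuts) {
        fields.push_back(line.substr(start, cut - start));
        start = cut + 1;
    }
    fields.push_back(line.substr(start));  // text after the last separator
    return fields;
}
```

For a line such as "Amadeus,Drama,160 Mins.,1984,14.83" this yields five fields, which can then be converted (for example with std::stoi or std::stod) as needed.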

C++ Breaking up a string using multiple delimiters [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Split a string into words by multiple delimiters in C++
I'm currently trying to read a file where each line has a variation of tabs and spaces separating key attributes that need to be inserted into a binary tree.
My question is: How do I split a line up using multiple delimiters using only the STL? I've been trying to wrap my head around this for the good part of the day to no avail.
Any advice would be very much appreciated.
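Use std::string::find_first_of:
    vector<string> bits;
    size_t pos = 0;
    size_t newpos;
    while(pos != string::npos) {
        // Find the next delimiter at or after pos (npos if there is none).
        newpos = str.find_first_of(" \t", pos);
        // Runs of consecutive delimiters yield empty strings here.
        bits.push_back(str.substr(pos, newpos-pos));
        // Continue just past the delimiter, or stop once none is left.
        pos = newpos;
        if(pos != string::npos)
            pos++;
    }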
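Since the lines in the question mix runs of tabs and spaces, it may be more convenient to skip empty tokens altogether. A possible sketch pairs find_first_not_of with find_first_of; `split_any` is a hypothetical name, and the default delimiter set of space and tab is an assumption taken from the question.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: splits on any character in `delims`,
// treating a run of delimiters as a single separator.
std::vector<std::string> split_any(const std::string& line,
                                   const std::string& delims = " \t")
{
    std::vector<std::string> tokens;
    // Start of the first token (npos if the line has only delimiters).
    auto start = line.find_first_not_of(delims);
    while (start != std::string::npos) {
        // End of the current token; npos means it runs to the end of the line.
        auto end = line.find_first_of(delims, start);
        tokens.push_back(line.substr(start, end - start));
        // Start of the next token, skipping the whole run of delimiters.
        start = line.find_first_not_of(delims, end);
    }
    return tokens;
}
```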
Using string::find_first_of() [1]:
#include <iostream>
#include <string>
using namespace std;
int main ()
{
    string str("Replace the vowels in this sentence by asterisks.");
    size_t found;
    found = str.find_first_of("aeiou");
    while (found != string::npos) {
        str[found]='*';
        found=str.find_first_of("aeiou", found + 1);
    }
    cout << str << endl;
    return 0;
}

Reverse string find_first_not_of

I have a std::string and I want to find the position of the first character that:
Is different from all the following characters: ' ', '\n' and '\t'.
Has a lower position than the one I indicate.
So, for example, if I have the following string and position:
    string str("AAA BBB=CCC DDD");
    size_t pos = 7;
I want to be able to use a function like this:
    size_t res = find_first_of_not_reverse(str, pos, " \n\t");
    // now res = 4, because 4 is the position of the space character + 1
How can I do this?
As Bo commented, templatetypedef's answer was 99% of the way there; we just need std::string::find_last_of rather than std::string::find_last_not_of:
#include <cassert>
#include <string>
std::string::size_type find_first_of_not_reverse(
    std::string const& str,
    std::string::size_type const pos,
    std::string const& chars)
{
    assert(pos > 1);
    assert(pos < str.size());
    std::string::size_type const res = str.find_last_of(chars, pos - 1) + 1;
    return res == pos ? find_first_of_not_reverse(str, pos - 1, chars)
         : res ? res
         : std::string::npos;
}
int main()
{
    std::string const str = "AAA BBB=CCC DDD";
    std::string const chars = " \n\t";
    std::string::size_type res = find_first_of_not_reverse(str, 7, chars); // res == 4
    res = find_first_of_not_reverse(str, 2, chars); // res == npos
}
I was curious myself why basic_string does not define rfind_first_of and friends. I think it should. Regardless, here is a non-recursive implementation (see ildjarn's answer for the recursive one) that should fulfill the requirements of this question. It compiles but I've not tested it.
std::string delims = " \n\t";
reverse_iterator start = rend()-pos-1, found =
std::find_first_of(start,rend(),delims.begin(),delims.end());
return found==rend()?npos:pos-(found-start);
To be like rfind, pos needs to be clamped to size() - 1 if it's npos or not less than size(); otherwise the start iterator above would fall outside the string.
PS: I think this question could benefit from some editing. For one, "find_first_of_not_reverse" is pretty misleading. It should be rfind_first_of, I think (and then add 1 to the result).
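For anyone who wants to try the reverse-iterator idea outside the class, here is a self-contained sketch written as a free function. The name `rfind_first_of` is hypothetical (taken from the suggestion above), the parameter order is my own choice, and the clamping of `pos` to the last index is an assumption about the intended behaviour. For in-range positions it should agree with std::string::find_last_of, which is the standard way to get the same result.

```cpp
#include <algorithm>
#include <string>

// Hypothetical free function: index of the last character at or before
// `pos` that is one of `delims`, or npos if there is none.
std::string::size_type rfind_first_of(const std::string& s,
                                      const std::string& delims,
                                      std::string::size_type pos = std::string::npos)
{
    if (s.empty()) return std::string::npos;
    if (pos >= s.size()) pos = s.size() - 1;  // clamp to the last valid index

    // Reverse iterator that refers to s[pos].
    auto start = s.rend() - static_cast<std::string::difference_type>(pos) - 1;
    auto found = std::find_first_of(start, s.rend(), delims.begin(), delims.end());
    if (found == s.rend()) return std::string::npos;

    // Walking (found - start) steps backwards from pos gives the index.
    return pos - static_cast<std::string::size_type>(found - start);
}
```

For the question's example, `rfind_first_of(str, " \n\t", 6) + 1` gives 4, the same value as `str.find_last_of(" \n\t", 6) + 1`.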