Split string on the left in C++

I'm trying to write a short and stupid equation parser, and need to split a string around a given operator. I can split off the right side of a string by doing
return std::string(oprtr + 1, equ.end());
where equ is the string, and oprtr is an iterator for the position I need to split from. Doing this works perfectly, but splitting off the left, however, doesn't:
return std::string(equ.begin(), oprtr - 1);
====
terminate called after throwing an instance of 'std::length_error'
what(): basic_string::_S_create
I've tried a variety of other nasty workarounds that I'm really not proud of, like
return equ.substr(0, std::distance(equ.begin(), oprtr));
This one doesn't give errors, but actually just returns the entire equation. What am I doing wrong here?

Works for me with g++ 4.8.2:
#include <string>
#include <algorithm>
#include <iostream>
int main() {
std::string eq("a+b=c");
std::string::iterator opit = std::find(eq.begin(),eq.end(),'=');
std::string lhs = std::string(eq.begin(),opit);
std::cout << "lhs: " << lhs << "\n";
return 0;
}
The output is:
lhs: a+b

Seems you are doing something like this
void my_func(string equ, string::iterator oprtr)
{
string left = std::string(equ.begin(), oprtr);
}
string::iterator oprtr = std::find(equ.begin(), equ.end(), '=');
my_func(equ, oprtr);
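That won't work because my_func takes equ by value, so the original string is copied when you call it. Inside my_func, equ.begin() points into the copy while oprtr still points into the original, so the two iterators refer to different strings.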
One fix is to pass by reference
void my_func(string& equ, string::iterator oprtr)
Another fix is to use integers instead of iterators. Integers aren't tied to one particular string instance like iterators are.
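The index-based fix can be sketched as follows. This is only an illustration: split_around is a hypothetical helper name, and it assumes the operator is present in the string (checked with an assert). Because std::string::find returns a position rather than an iterator, the result stays valid even if the string is copied.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Split `equ` around the first occurrence of `op`, using an index rather
// than an iterator. Returns {left, right}.
// (split_around is a hypothetical helper, not code from the question.)
std::pair<std::string, std::string> split_around(const std::string& equ, char op)
{
    const std::string::size_type pos = equ.find(op);
    assert(pos != std::string::npos);  // assumption: the operator is present
    // [0, pos) is everything left of the operator; pos + 1 skips the operator.
    return {equ.substr(0, pos), equ.substr(pos + 1)};
}
```

For the string "a+b=c" and the operator '=', the left part is "a+b" and the right part is "c".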

Related

Rotate last n elements of array in C++

I'm trying to implement a function that deletes a character from a string wherever the current index is. Below is a skeleton of what I have so far. I'm trying to rotate the character I want to remove to the end of the string then replace it with a null terminator. The code I have below does not seem to actually rotate buffer because the output I'm getting is "wor" instead of the expected output "wrd".
int main() {
char buffer[]="word";
int currIndex=2;
int endIndex=strlen(buffer);
currIndex--;
endIndex--;
rotate(buffer+currIndex,
buffer+1,
buffer+strlen(buffer));
buffer[endIndex]='\0';
cout << buffer << endl;
return 0;
}
This doesn't attempt to answer the question being asked, but rather solve the underlying problem: Removing a single character from a string.
The solution is a simple application of the std::string::erase class member:
#include <string>
#include <iostream>
int main() {
std::string word{ "word" };
std::string::size_type currIndex{ 2 };
word.erase( currIndex, 1 );
std::cout << word << std::endl;
}
Using a std::string makes things way easier because I don't have to think about pointers:
std::string buffer="word";
rotate(buffer.begin()+1, buffer.begin()+2, buffer.end());
buffer.resize(buffer.size()-1);
Alternatively, we can stick with a c-style array:
char buffer[]="word";
rotate(buffer+1, buffer+2, buffer+4);
buffer[3] = '\0';
std::rotate accepts 3 arguments:
template< class ForwardIt >
ForwardIt rotate( ForwardIt first, ForwardIt n_first, ForwardIt last );
first is the first element in the range you want to left rotate.
n_first is the element you want to be at the start of the range after you've rotated (this tells the algorithm how many times to left rotate, effectively).
last is one past the last element in the range you want to rotate.
Your code:
char buffer[]="word";
int currIndex=2;
int endIndex=strlen(buffer);
currIndex--;
endIndex--;
rotate(buffer+currIndex,
buffer+1,
buffer+strlen(buffer));
buffer[endIndex]='\0';
Was actually really close. You just got the second argument wrong. It should have been
rotate(buffer+currIndex,
buffer+2,
buffer+strlen(buffer));
buffer[endIndex]='\0';
But the code was admittedly a bit confusing written with the increments and decrements.
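As a sketch of the same idea without the increments and decrements, the rotate-then-truncate approach can be wrapped in a small function. The name remove_by_rotate is hypothetical, and the precondition that index is in range is an assumption; std::string::erase remains the more direct tool.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Remove the character at `index` by rotating it to the end of the string
// and then dropping the last character.
// (remove_by_rotate is a hypothetical name used for illustration.)
std::string remove_by_rotate(std::string s, std::size_t index)
{
    assert(index < s.size());  // assumption: index refers to an existing character
    // first   = the element to get rid of
    // n_first = the element that should take its place
    // last    = one past the end of the range
    std::rotate(s.begin() + index, s.begin() + index + 1, s.end());
    s.pop_back();  // the unwanted character is now at the back
    return s;
}
```

With the string "word" and index 1, the rotation produces "wrdo" and dropping the last character leaves "wrd".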

Splitting a vector into values

Say I have a vector of values from a tokenizing function, tokenize(). I know it will only have two values. I want to store the first value in a and the second in b. In Python, I would do:
a, b = string.split(' ')
I could do it as such in an ugly way:
vector<string> tokens = tokenize(string);
string a = tokens[0];
string b = tokens[1];
But that requires two extra lines of code, an extra variable, and less readability.
How would I do such a thing in C++ in a clean and efficient way?
EDIT: I must emphasize that efficiency is very important. Too many answers don't satisfy this. This includes modifying my tokenization function.
EDIT 2: I am using C++11 for reasons outside of my control and I also cannot use Boost.
With structured bindings (available since C++17), you'd be able to write something like:
auto [a,b] = as_tuple<2>(tokenize(str));
where as_tuple<N> is some to-be-declared function that converts a vector<string> to a tuple<string, string, ... N times ...>, probably throwing if the sizes don't match. You can't destructure a std::vector since its size isn't known at compile time. This will necessarily do extra moves of the strings, so you're losing some efficiency in order to gain some code clarity. Maybe that's ok.
Or maybe you write a tokenize<N> that returns a tuple<string, string, ... N times ...> directly, avoiding the extra move. In that case:
auto [a, b] = tokenize<2>(str);
is great.
Before C++17, what you have is what you can do. But just make your variables references:
std::vector<std::string> tokens = tokenize(str);
std::string& a = tokens[0];
std::string& b = tokens[1];
Yeah, it's a couple extra lines of code. That's not the end of the world. It's easy to understand.
If you "know it will only have two values", you could write something like:
#include <cassert>
#include <iostream>
#include <string>
#include <tuple>
std::pair<std::string, std::string> tokenize(const std::string &text)
{
const auto pos(text.find(' '));
assert(pos != std::string::npos);
return {text.substr(0, pos), text.substr(pos + 1)};
}
int main()
{
std::string a, b;
std::tie(a, b) = tokenize("first second");
std::cout << a << " " << b << '\n';
}
Unfortunately without structured bindings (C++17) you have to use the std::tie hack and the variables a and b have to exist.
Ideally you'd rewrite the tokenize() function so that it returns a pair of strings rather than a vector:
std::pair<std::string, std::string> tokenize(const std::string& str);
Or you would pass two references to empty strings to the function as parameters.
void tokenize(const std::string& str, std::string& result_1, std::string& result_2);
If you have no control over the tokenize function the best you can do is move the strings out of the vector in an optimal way.
std::vector<std::string> tokens = tokenize(str);
std::string a = std::move(tokens.front());
std::string b = std::move(tokens.back());
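If tokenize() cannot be changed, the move-out step could be wrapped in a small C++11 helper so the call site stays on one line. This is a sketch only: take_two is a hypothetical name, and it assumes (via an assert) that the vector really holds exactly two tokens.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Move the two tokens out of a vector and into a pair (C++11).
// (take_two is a hypothetical helper used for illustration.)
std::pair<std::string, std::string> take_two(std::vector<std::string> tokens)
{
    assert(tokens.size() == 2);  // assumption: exactly two tokens
    // The vector is taken by value, so a temporary is moved in and the
    // strings can be moved out without copying their contents.
    return {std::move(tokens.front()), std::move(tokens.back())};
}
```

With existing variables a and b, the call site would then read std::tie(a, b) = take_two(tokenize(str)); where tokenize is the unchanged function from the question.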

ostream, copy function printing string address, instead of string contents

This prints the address of my string, but not its contents:
#include <memory>
#include <string>
#include <list>
#include <iostream>
#include <iterator>
using namespace std;
int _tmain(int argc, _TCHAR* argv[])
{
unique_ptr<list<shared_ptr<string>>> upList (new list<shared_ptr<string>>);
shared_ptr<string> spNation (new string ("India"));
upList->push_back (spNation);
copy (upList->begin(), upList->end(), ostream_iterator<shared_ptr<string>> (cout, "\n "));
return 0;
}
My questions are:
Is ostream_iterator<shared_ptr<string>> taking shared_ptr or string as its prime object?
How to print actual string contents (i.e. India) using this approach.
Is this approach preferable to a traditional for loop for printing all node contents?
Is ostream_iterator<shared_ptr<string>> taking shared_ptr or string as its prime object?
You've instantiated ostream_iterator for shared_ptr<string>, so that is what it will attempt to output.
How to print actual string contents (i.e. India) using this approach.
If you really want to use shared pointers for some reason, then you can't use copy since that won't undo the extra level of indirection. Either use a plain loop, or get rid of the unnecessary indirection:
list<string> list;
list.push_back("India");
copy(list.begin(), list.end(), ostream_iterator<string>(cout, "\n "));
Of course, it doesn't look as exciting without all the arrows, templates, new-expressions and pseudohungarian warts, but anyone trying to maintain the code won't thank you for adding such embellishments.
Is this approach preferable to a traditional for loop for printing all node contents?
It's preferable when it makes the code simpler. When it doesn't, it isn't.
Firstly: why do you use shared_ptr<string> instead of string here? You shouldn't do this.
1) It takes shared_ptr<string>.
2) Use std::for_each with a lambda (or a range-based for loop):
for_each(upList->begin(), upList->end(), [](const shared_ptr<string>& p)
{
cout << *p << endl;
});
or
for (const auto& p : *upList)
{
std::cout << *p << std::endl;
}
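If the shared pointers have to stay and an algorithm is still preferred over a loop, one option is std::transform, which can dereference each pointer on its way into an ostream_iterator<string>. The sketch below writes into a std::ostringstream so the result can be inspected; render_strings is a hypothetical name, and it assumes no pointer in the list is null.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <list>
#include <memory>
#include <sstream>
#include <string>

// Write every pointed-to string to a stream, one per line, and return the text.
// (render_strings is a hypothetical name used for illustration.)
std::string render_strings(const std::list<std::shared_ptr<std::string>>& items)
{
    std::ostringstream out;
    std::transform(items.begin(), items.end(),
                   std::ostream_iterator<std::string>(out, "\n"),
                   [](const std::shared_ptr<std::string>& p) {
                       assert(p);   // assumption: no null pointers in the list
                       return *p;   // undo the extra level of indirection
                   });
    return out.str();
}
```

Replacing the ostringstream with std::cout would print the strings directly instead of collecting them.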

Can we split, manipulate and rejoin a string in c++ in one statement?

This is a bit of a daft question, but out of curiosity, would it be possible to split a string on commas, apply a function to each piece and then rejoin the pieces with commas, all in one statement in C++?
This is what I have so far:
string dostuff(const string& a) {
return string("Foo");
}
int main() {
string s("a,b,c,d,e,f");
vector<string> foobar(100);
transform(boost::make_token_iterator<string>(s.begin(), s.end(), boost::char_separator<char>(",")),
boost::make_token_iterator<string>(s.end(), s.end(), boost::char_separator<char>(",")),
foobar.begin(),
boost::bind(&dostuff, _1));
string result = boost::algorithm::join(foobar, ",");
}
So this would result in turning "a,b,c,d,e,f" into "Foo,Foo,Foo,Foo,Foo,Foo"
I realise this is OTT, but was just looking to expand my boost wizardry.
First, note that your program writes "Foo,Foo,Foo,Foo,Foo,Foo,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,," to your result string -- as already mentioned in comments, you wanted to use back_inserter there.
As for the answer, whenever there's a single value resulting from a range, I look at std::accumulate (since that is the C++ version of fold/reduce)
#include <string>
#include <iostream>
#include <numeric>
#include <boost/tokenizer.hpp>
#include <boost/algorithm/string.hpp>
#include <boost/bind.hpp>
std::string dostuff(const std::string& a) {
return std::string("Foo");
}
int main() {
std::string s("a,b,c,d,e,f");
std::string result =
accumulate(
++boost::make_token_iterator<std::string>(s.begin(), s.end(), boost::char_separator<char>(",")),
boost::make_token_iterator<std::string>(s.end(), s.end(), boost::char_separator<char>(",")),
dostuff(*boost::make_token_iterator<std::string>(s.begin(), s.end(), boost::char_separator<char>(","))),
boost::bind(std::plus<std::string>(), _1,
bind(std::plus<std::string>(), ",",
bind(dostuff, _2)))); // or lambda, for slightly better readability
std::cout << result << '\n';
}
Except now it's way over the top and repeats make_token_iterator twice. I guess boost.range wins.
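The back_inserter point is independent of Boost, so here is a sketch of it using only the standard library (and several statements rather than one). The name map_csv is hypothetical, and splitting with std::getline is a substitution for the Boost tokenizer, not the approach from the question.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

std::string dostuff(const std::string&) { return "Foo"; }

// Split `s` on commas, apply dostuff to each piece, and rejoin with commas.
// (map_csv is a hypothetical name used for illustration.)
std::string map_csv(const std::string& s)
{
    std::vector<std::string> pieces;
    std::istringstream in(s);
    for (std::string piece; std::getline(in, piece, ',');) {
        pieces.push_back(piece);
    }

    std::vector<std::string> mapped;  // starts empty instead of vector<string>(100)
    std::transform(pieces.begin(), pieces.end(), std::back_inserter(mapped), dostuff);
    assert(mapped.size() == pieces.size());  // grew to exactly the needed size

    std::string result;
    for (std::size_t i = 0; i < mapped.size(); ++i) {
        if (i != 0) {
            result += ',';
        }
        result += mapped[i];
    }
    return result;
}
```

Because the output vector grows only as elements are written, the joined result has no run of trailing commas.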
void dostuff(string& a) {
a = "Foo";
}
int main()
{
string s("a,b,c,d,e,f");
vector<string> tmp;
s = boost::join(
(
boost::for_each(
boost::split(tmp, s, boost::is_any_of(",")),
dostuff
),
tmp
),
","
);
return 0;
}
Unfortunately I can't eliminate mentioning tmp twice. Maybe I'll think of something later.
I am actually working on a library to allow writing code in a more readable fashion than iterators alone... don't know if I'll ever finish the project though, seems dead projects tend to accumulate on my computer...
Anyway the main reproach I have here is obviously the use of iterators. I tend to think of iterators as low-level implementation details, when coding you rarely want to use them at all.
So, let's assume that we have a proper library:
struct DoStuff { std::string operator()(std::string const&); };
int main(int argc, char* argv[])
{
std::string const reference = "a,b,c,d,e,f";
std::string const result = boost::join(
view::transform(
view::split(reference, ","),
DoStuff()
),
","
);
}
The idea of a view is to be a lightweight wrapper around another container:
from the user's point of view it behaves like a container (minus the operations that actually modify the container structure)
from the implementation point of view, it's a lightweight object, containing as little data as possible --> the value is ephemeral here, and only lives as long as the iterator lives.
I already have the transform part working, I am wondering how the split could work (generally), but I think I'll get into it ;)
Okay, I guess it's possible, but please please don't really do this in production code.
Much better would be something like
std::string MakeCommaEdFoo(std::string input)
{
std::size_t commas = std::count(input.begin(), input.end(), ',');
std::string output("foo");
output.reserve((commas+1)*4-1);
for(std::size_t idx = 0; idx < commas; ++idx) // one ",foo" per comma
output.append(",foo");
return output;
}
Not only will it perform better, it is much easier for the next guy to read and understand.

String handling in C++

How do I write a function in C++ that takes a string s and an integer n as input and gives at output a string that has spaces placed every n characters in s?
For example, if the input is s = "abcdefgh" and n = 3 then the output should be "abc def gh"
EDIT:
I could have used loops for this, but I am looking for a concise and idiomatic C++ solution (i.e. one that uses algorithms from the STL).
EDIT:
Here's how I would do it in Scala (which happens to be my primary language):
def drofotize(s: String, n: Int) = s.grouped(n).toSeq.flatMap(_ + " ").mkString
Is this level of conciseness possible with C++? Or do I have to use explicit loops after all?
Copy each character in a loop and, when i>0 && i%n==0 (where i is the index into the source string), add an extra space to the destination string.
As for Standard Library you could write your own std::back_inserter which will add extra spaces and then you could use it as follows:
std::copy( str1.begin(), str1.end(), my_back_inserter(str2, n) );
but I would say that writing such a functor is just a waste of your time. It is much simpler to write a function copy_with_spaces with a good old for-loop in it.
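A minimal sketch of such a copy_with_spaces function follows. The signature and the requirement that n be greater than zero are assumptions; the loop inserts a space whenever the source index is a non-zero multiple of n.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Copy `s`, inserting a space after every `n` characters.
// (The signature is an assumption; only the name comes from the text above.)
std::string copy_with_spaces(const std::string& s, std::size_t n)
{
    assert(n > 0);  // assumption: a group size of zero is not meaningful
    std::string out;
    out.reserve(s.size() + s.size() / n);  // room for the added spaces
    for (std::size_t i = 0; i < s.size(); ++i) {
        if (i > 0 && i % n == 0) {
            out += ' ';  // i is an index into the source string
        }
        out += s[i];
    }
    return out;
}
```

For s = "abcdefgh" and n = 3 this yields "abc def gh", matching the example in the question.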
STL algorithms don't really provide anything like this. Best I can think of:
#include <string>
using namespace std;
string drofotize(const string &s, size_t n)
{
if (s.size() <= n)
{
return s;
}
return s.substr(0,n) + " " + drofotize(s.substr(n), n);
}