I'm looking for an easy way to build an array of strings at compile time. For a test, I put together a class named Strings that has the following members:
Strings();
Strings(const Strings& that);
Strings(const char* s1);
Strings& operator=(const char* s1);
Strings& operator,(const char* s2);
Using this, I can successfully compile code like this:
Strings s;
s="Hello","World!";
The s="Hello" part invokes operator=, which returns a Strings&, and then operator, gets called for "World!".
What I can't get to work (in MSVC, haven't tried any other compilers yet) is
Strings s="Hello","World!";
I'd assume here that Strings s="Hello" would call the copy constructor and then everything would behave the same as the first example. But I get the error: error C2059: syntax error : 'string'
However, this works fine:
Strings s="Hello";
So I know that the copy constructor does at least work for one string. Any ideas? I'd really like to have the second method work just to make the code a little cleaner.
I think that the comma in your second example is not the comma operator but rather the grammar element for multiple variable declarations.
e.g., the same way that you can write:
int a=3, b=4
It seems to me that you are essentially writing:
Strings s="Hello", stringliteral
So the compiler expects the item after the comma to be the name of a variable, and instead it sees a string literal and announces an error. In other words, the constructor is applied to "Hello", but the comma afterwards is not the comma operator of Strings.
By the way, the constructor is not really a copy constructor. It creates a Strings object from a literal string parameter... The term copy constructor is typically applied to the same type.
I wouldn't recommend this kind of an API. You are going to continue discovering cases that don't work as expected, since comma is the operator with the lowest precedence. For example, this case won't work either:
if ("Hello","world" == otherStrings) { ... }
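Wrapping the set of strings in parentheses does not help by itself: in ("Hello","World!") both operands are plain const char*, so the built-in comma operator is used and only "World!" survives. At least one operand has to be a Strings for your overload to be picked, like this:
Strings s=(Strings("Hello"),"World!");
And my example above would look like this:
if ((Strings("Hello"),"world") == otherStrings) { ... }
That can be made to work, but the shorthand syntax is probably not worth the tricky semantics that come with it.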
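To make the declarator-versus-operator distinction concrete, here is a minimal sketch. The Strings class below is a hypothetical stand-in for the asker's class (its members are my assumption, not the original code), and the two helper functions exist only to show which comma is in play.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the asker's Strings class.
class Strings {
    std::vector<std::string> items_;
public:
    Strings() = default;
    Strings(const char* s) { items_.emplace_back(s); }
    Strings& operator=(const char* s) {
        items_.clear();
        items_.emplace_back(s);
        return *this;
    }
    // Overloaded comma: appends and returns *this so calls can chain.
    Strings& operator,(const char* s) {
        items_.emplace_back(s);
        return *this;
    }
    std::size_t size() const { return items_.size(); }
};

// Statement form: (s = "Hello") runs first, then the overloaded comma.
std::size_t assigned_count() {
    Strings s;
    s = "Hello", "World!";
    return s.size();
}

// Declaration form: the left operand must already be a Strings,
// otherwise the built-in comma on two const char* would be used.
std::size_t declared_count() {
    Strings s = (Strings("Hello"), "World!");
    return s.size();
}
```

Both helpers end up with two stored strings. Without the parentheses the declaration form does not compile, because the comma is then parsed as a declarator separator.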
Use boost::assign::list_of.
It's possible to make this work, for a sufficiently loose definition of "work." Here's a working example I wrote in response to a similar question some years ago. It was fun as a challenge, but I wouldn't use it in real code:
#include <iostream>
#include <algorithm>
#include <iterator>
#include <vector>
void f0(std::vector<int> const &v) {
std::copy(v.begin(), v.end(),
std::ostream_iterator<int>(std::cout, "\t"));
std::cout << "\n";
}
template<class T>
class make_vector {
std::vector<T> data;
public:
make_vector(T const &val) {
data.push_back(val);
}
make_vector<T> &operator,(T const &t) {
data.push_back(t);
return *this;
}
operator std::vector<T>() { return data; }
};
template<class T>
make_vector<T> makeVect(T const &t) {
return make_vector<T>(t);
}
int main() {
f0((makeVect(1), 2, 3, 4, 5));
f0((makeVect(1), 2, 3));
return 0;
}
You could use an array of character pointers
Strings::Strings(const char* input[]);
const char* input[] = {
"string one",
"string two",
0};
Strings s(input);
and inside the constructor, iterate through the pointers until you hit the null.
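The loop inside such a constructor could look roughly like this. It is only a sketch: collect is a hypothetical free function standing in for the constructor body, and it assumes the caller always terminates the array with a null pointer.

```cpp
#include <string>
#include <vector>

// Copies C strings from a null-terminated array of pointers.
// Assumption: the array always ends with a null pointer sentinel.
std::vector<std::string> collect(const char* const input[]) {
    std::vector<std::string> out;
    for (const char* const* p = input; *p != nullptr; ++p) {
        out.emplace_back(*p);  // copy each C string into a std::string
    }
    return out;
}
```

Forgetting the terminating 0 makes the loop read past the end of the array, which is the main drawback of this approach.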
If the only job of Strings is to store a list of strings, then boost::assign could do the job better with standard containers, I think :)
using namespace boost::assign;
vector<string> listOfThings;
listOfThings += "Hello", "World!";
If you can use C++0x, it has new initializer lists for this! I wish you could use those. For example:
std::vector<std::string> v = { "xyzzy", "plugh", "abracadabra" };
std::vector<std::string> v{ "xyzzy", "plugh", "abracadabra" };
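The same mechanism can be given to your own class. A minimal sketch, assuming a hypothetical Strings class that simply wraps a std::vector<std::string> (the member names are mine, not from the question):

```cpp
#include <cstddef>
#include <initializer_list>
#include <string>
#include <vector>

// Hypothetical Strings class; the name mirrors the question.
class Strings {
    std::vector<std::string> items_;
public:
    // Accepts a braced list such as {"Hello", "World!"}.
    Strings(std::initializer_list<std::string> il) : items_(il) {}
    std::size_t size() const { return items_.size(); }
    const std::string& operator[](std::size_t i) const { return items_[i]; }
};
```

With that constructor, Strings s = {"Hello", "World!"}; compiles and stores both strings, with no comma operator involved.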
Related
I want to have an input to a function similar to python so that then I can loop over it in inside the function. But I am not sure how I should define the input.
func(["a","b","c"])
so that it can also be called
func(["a","b","c", "d"])
is there actually such style of input in c++?
I'd be glad if someone also suggested a way of looping over it since my c++ experience is quite basic.
-------edit,
will be glad if this "[]" style of brackets are possible instead of "{}" similar to python and with minimal code.
Yes, you can use std::initializer_list to do that:
#include <initializer_list>
template<class T>
void func(std::initializer_list<T> il) {
for (auto x : il);
}
int main() {
func({"a","b","c"});
func({"a","b","c", "d"});
}
will be glad if this "[]" style of brackets are possible instead of
"{}" similar to python and with minimal code.
Unfortunately, the multidimensional subscript operator only works in C++23, see p2128 for more details.
You can use a std::initializer_list:
#include <iostream>
#include <initializer_list>
#include <string>
void foo(std::initializer_list<std::string> l){
for (const auto& s : l) std::cout << s << " ";
}
int main() {
foo({"a","b","c"});
}
I think python does not distinguish between character and string literals, but C++ does. "a" is a string literal, while 'a' is a character literal. If you actually wanted characters you can use a std::initializer_list<char>. You can also consider to simply pass a std::string to the function (foo("abc")).
will be glad if this "[]" style of brackets are possible instead of "{}" similar to python and with minimal code.
Better get used to different languages being different. Trying to make code in one language look like a different language usually does not pay off, because not only in details python and C++ are very different.
The other answers will work, but I think you're looking for std::vector, which is an array that can dynamically grow and shrink. It is basically the C++ equivalent of a Python list (except you can only store one data type in it).
#include <iostream>
#include <string>
#include <vector>
void foo (std::vector<std::string> vec)
{
// normal for loop
for (std::size_t i = 0; i < vec.size (); i++)
{
std::cout << vec[i] << std::endl; // do something
}
std::cout << "#########" << std::endl;
// range based for loop
for (auto val : vec)
{
std::cout << val << std::endl;
}
std::cout << "#########" << std::endl;
}
int main ()
{
foo ({"a", "b", "c"});
foo ({"a", "b", "c", "d"});
}
replace std::string with the data type that you need.
I would recommend you to use std::initializer_list for that purpose.
The function may be defined as follows:
void func(std::initializer_list<std::string> il)
{
for(const std::string & s : il)
{
// ...
}
}
And you may use it the following way:
int main()
{
func({"a", "b", "c"});
return 0;
}
will be glad if this "[]" style of brackets are possible instead of "{}" similar to python and with minimal code.
Python and C++ are not the same languages and symbols, keywords, etc... have their own meaning. In Python, [] means a list, but in C++ it is the subscript operator (supposed to be called for a given object), which is a completely different thing.
Using http://en.cppreference.com/w/cpp/string/basic_string_view as a reference, I see no way to do this more elegantly:
std::string s = "hello world!";
std::string_view v = s;
v = v.substr(6, 5); // "world"
Worse, the naive approach is a pitfall and leaves v a dangling reference to a temporary:
std::string s = "hello world!";
std::string_view v(s.substr(6, 5)); // OOPS!
I seem to remember talk of a possible standard library addition that returns a substring as a view:
auto v(s.substr_view(6, 5));
I can think of the following workarounds:
std::string_view(s).substr(6, 5);
std::string_view(s.data()+6, 5);
// or even "worse" (remove_prefix and remove_suffix return void, so they cannot be chained):
std::string_view w(s); w.remove_prefix(6); w.remove_suffix(1);
Frankly, I don't think any of these are very nice. Right now the best thing I can think of is using aliases to simply make things less verbose.
using sv = std::string_view;
sv(s).substr(6, 5);
There's the free-function route, but unless you also provide overloads for std::string it's a snake-pit.
#include <string>
#include <string_view>
std::string_view sub_string(
std::string_view s,
std::size_t p,
std::size_t n = std::string_view::npos)
{
return s.substr(p, n);
}
int main()
{
using namespace std::literals;
auto source = "foobar"s;
// this is fine and elegant...
auto bar = sub_string(source, 3);
// but uh-oh...
bar = sub_string("foobar"s, 3);
}
IMHO the whole design of string_view is a horror show which will take us back to a world of segfaults and angry customers.
update:
Even adding overloads for std::string is a horror show. See if you can spot the subtle segfault timebomb...
#include <string>
#include <string_view>
std::string_view sub_string(std::string_view s,
std::size_t p,
std::size_t n = std::string_view::npos)
{
return s.substr(p, n);
}
std::string sub_string(std::string&& s,
std::size_t p,
std::size_t n = std::string::npos)
{
return s.substr(p, n);
}
std::string sub_string(std::string const& s,
std::size_t p,
std::size_t n = std::string::npos)
{
return s.substr(p, n);
}
int main()
{
using namespace std::literals;
auto source = "foobar"s;
auto bar = sub_string(std::string_view(source), 3);
// but uh-oh...
bar = sub_string("foobar"s, 3);
}
The compiler found nothing to warn about here. I am certain that a code review would not either.
I've said it before and I'll say it again, in case anyone on the c++ committee is watching, allowing implicit conversions from std::string to std::string_view is a terrible error which will only serve to bring c++ into disrepute.
Update
Having raised this (to me) rather alarming property of string_view on the cpporg message board, my concerns have been met with indifference.
The consensus of advice from this group is that std::string_view must never be returned from a function, which means that my first offering above is bad form.
There is of course no compiler help to catch times when this happens by accident (for example through template expansion).
As a result, std::string_view should be used with the utmost care, because from a memory management point of view it is equivalent to a copyable pointer pointing into the state of another object, which may no longer exist. However, it looks and behaves in all other respects like a value type.
Thus code like this:
auto s = get_something().get_suffix();
Is safe when get_suffix() returns a std::string (either by value or reference)
but is UB if get_suffix() is ever refactored to return a std::string_view.
Which in my humble view means that any user code that stores returned strings using auto will break if the libraries they are calling are ever refactored to return std::string_view in place of std::string const&.
So from now on, at least for me, "almost always auto" will have to become, "almost always auto, except when it's strings".
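The "copyable pointer into another object's state" behaviour can be seen in a small sketch. The function name view_tracks_owner is made up for illustration; the point is only that the view observes later changes to the string it refers to.

```cpp
#include <string>
#include <string_view>

// Returns true if a view taken before an in-place edit sees the edit.
bool view_tracks_owner() {
    std::string s = "hello world!";
    std::string_view v = std::string_view(s).substr(6, 5);  // "world"
    s[6] = 'W';  // in-place write, no reallocation, so v stays valid
    return v == "World";
}
```

A std::string copy taken at the same point would still read "world", which is the value-type behaviour that code written with auto tends to assume.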
You can use the conversion operator from std::string to std::string_view:
std::string s = "hello world!";
std::string_view v = std::string_view(s).substr(6, 5);
This is how you can efficiently create a sub-string string_view.
#include <algorithm>
#include <limits>
#include <string>
#include <string_view>
inline std::string_view substr_view(const std::string& source, size_t offset = 0,
std::string_view::size_type count =
std::numeric_limits<std::string_view::size_type>::max()) {
if (offset < source.size())
return std::string_view(source.data() + offset,
std::min(source.size() - offset, count));
return {};
}
#include <iostream>
int main(void) {
using namespace std::string_literals; // enables the "..."s literal used below
std::cout << substr_view("abcd",3,11) << "\n";
std::string s {"0123456789"};
std::cout << substr_view(s,3,2) << "\n";
// be cautious about lifetime, as illustrated at https://en.cppreference.com/w/cpp/string/basic_string_view
std::string_view bad = substr_view("0123456789"s, 3, 2); // "bad" holds a dangling pointer
std::cout << bad << "\n"; // possible access violation
return 0;
}
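One way to get some compiler help with the lifetime problem is to delete the overload that takes a temporary. This is only a sketch of that idea, not something from the answers above; it reuses the substr_view name but delegates to std::string_view::substr.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Lvalue strings outlive the call, so returning a view into them is fine.
inline std::string_view substr_view(const std::string& source,
                                    std::size_t offset,
                                    std::size_t count = std::string_view::npos) {
    // Note: string_view::substr throws std::out_of_range if offset > size().
    return std::string_view(source).substr(offset, count);
}

// Temporaries die at the end of the full expression, so reject them outright.
std::string_view substr_view(std::string&&, std::size_t,
                             std::size_t = std::string_view::npos) = delete;
```

With the deleted overload, substr_view("0123456789"s, 3, 2) no longer compiles. The trade-off is that harmless uses on a temporary within a single expression are rejected as well.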
I realize that the question is about C++17, but it's worth noting that C++20 introduced a string_view constructor that accepts two iterators to char (or whatever the base type is) which allows writing
std::string_view v{ s.begin() +6, s.begin()+6 +5 };
Not sure if there is a cleaner syntax, but it's not difficult to
#define RANGE(_container,_start,_length) (_container).begin() + (_start), (_container).begin() + (_start) + (_length)
for a final
std::string_view v{ RANGE(s,6,5) };
PS: I called RANGE's first parameter _container instead of _string for a reason: the macro can be used with any Container (or class supporting at least begin() and end()), even as part of a function call like
auto pisPosition= std::find( RANGE(myDoubleVector,11,23), std::numbers::pi );
PPS: When possible, prefer C++20's actual ranges library to this poor person's solution.
Say I have a vector of values from a tokenizing function, tokenize(). I know it will only have two values. I want to store the first value in a and the second in b. In Python, I would do:
a, b = string.split(' ')
I could do it as such in an ugly way:
vector<string> tokens = tokenize(string);
string a = tokens[0];
string b = tokens[1];
But that requires two extra lines of code, an extra variable, and less readability.
How would I do such a thing in C++ in a clean and efficient way?
EDIT: I must emphasize that efficiency is very important. Too many answers don't satisfy this. This includes modifying my tokenization function.
EDIT 2: I am using C++11 for reasons outside of my control and I also cannot use Boost.
With structured bindings (definitely will be in C++17), you'd be able to write something like:
auto [a,b] = as_tuple<2>(tokenize(str));
where as_tuple<N> is some to-be-declared function that converts a vector<string> to a tuple<string, string, ... N times ...>, probably throwing if the sizes don't match. You can't destructure a std::vector since its size isn't known at compile time. This will necessarily do extra moves of the string so you're losing some efficiency in order to gain some code clarity. Maybe that's ok.
Or maybe you write a tokenize<N> that returns a tuple<string, string, ... N times ...> directly, avoiding the extra move. In that case:
auto [a, b] = tokenize<2>(str);
is great.
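For what it's worth, as_tuple<N> could be sketched along these lines. This needs C++14 for std::index_sequence and return type deduction (so it does not fit the asker's C++11 constraint), and the helper name as_tuple_impl and the choice of exception are my assumptions.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <tuple>
#include <utility>
#include <vector>

// Expands to make_tuple(move(v[0]), move(v[1]), ...).
template <std::size_t... I>
auto as_tuple_impl(std::vector<std::string>& v, std::index_sequence<I...>) {
    return std::make_tuple(std::move(v[I])...);
}

// Converts a vector of exactly N strings into a tuple of N strings.
template <std::size_t N>
auto as_tuple(std::vector<std::string> v) {
    if (v.size() != N) {
        throw std::length_error("as_tuple: unexpected token count");
    }
    return as_tuple_impl(v, std::make_index_sequence<N>{});
}
```

Each string is moved once out of the vector, which is the extra move mentioned above.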
Before C++17, what you have is what you can do. But just make your variables references:
std::vector<std::string> tokens = tokenize(str);
std::string& a = tokens[0];
std::string& b = tokens[1];
Yeah, it's a couple extra lines of code. That's not the end of the world. It's easy to understand.
If you "know it will only have two values", you could write something like:
#include <cassert>
#include <iostream>
#include <string>
#include <tuple>
std::pair<std::string, std::string> tokenize(const std::string &text)
{
const auto pos(text.find(' '));
assert(pos != std::string::npos);
return {text.substr(0, pos), text.substr(pos + 1)};
}
your code is a great example of the power of STL but it's probably a bit slower.
int main()
{
std::string a, b;
std::tie(a, b) = tokenize("first second");
std::cout << a << " " << b << '\n';
}
Unfortunately without structured bindings (C++17) you have to use the std::tie hack and the variables a and b have to exist.
Ideally you'd rewrite the tokenize() function so that it returns a pair of strings rather than a vector:
std::pair<std::string, std::string> tokenize(const std::string& str);
Or you would pass two references to empty strings to the function as parameters.
void tokenize(const std::string& str, std::string& result_1, std::string& result_2);
If you have no control over the tokenize function the best you can do is move the strings out of the vector in an optimal way.
std::vector<std::string> tokens = tokenize(str);
std::string a = std::move(tokens.front());
std::string b = std::move(tokens.back());
If I write this
std::vector<std::string> v{"one","two","three"};
What type is deduced for the associated std::initializer_list template?
In other words, when are the const char* string literals converted to std::string?
Is it a better idea to declare it as
std::vector<std::string> v{std::string("one"),
std::string("two"),
std::string("three")};
to avoid issues connected to the type-deduction mechanism of the templates involved?
Will I keep the same optimizations with this?
Update: To answer your question about type inference:
The initializer list constructor of vector<string> takes an initializer_list<string>. It is not templated, so nothing happens in terms of type inference.
Still, the type conversion and overload resolution rules applied here are of some interest, so I'll let my initial answer stand, since you have accepted it already:
Original answer:
At first, the compiler only sees the initializer list {"one","two","three"}, which is only a list of initializers, not yet an object of the type std::initializer_list.
Then it tries to find an appropriate constructor of vector<string> to match that list.
How it does that is a somewhat complicated process you would do best to look up in the standard itself if you are interested in the exact process.
In short, the compiler decides to create an actual object of std::initializer_list<string> from the initializer list, since the implicit conversion from the char*'s to std::strings makes that possible.
Another, maybe more interesting example:
std::vector<long> vl1{3};
std::vector<string> vs1{3};
std::vector<string> vs2{0};
What do these do?
The first line is relatively easy. The initializer list {3} can be converted into a std::initializer_list<long> analogous to the {"one", "two", "three"} example above, so you get a vector with a single element, which has value 3.
The second line is different. It constructs a vector of 3 empty strings. Why? Because an initializer list {3} can by no means be converted into an std::initializer_list<string>, so the "normal" constructor std::vector<T>::vector(size_t, T = T()) kicks in and gives three default-constructed strings.
The third line should be roughly the same as the second, right? It should give an empty vector, in other words, with zero default-constructed strings. WRONG! The 0 can be treated as a null pointer constant, and validates the std::initializer_list<string>. Only this time the single string in that list gets constructed from a null pointer, which is not allowed (formally undefined behavior), so in practice you typically get an exception or a crash.
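The first two cases are easy to check for yourself. A small sketch (the function name brace_init_sizes is made up; the null pointer case is deliberately left out because it must not be executed):

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Returns {vl1.size(), vs1.size()} for the two brace-initialized vectors.
std::pair<std::size_t, std::size_t> brace_init_sizes() {
    std::vector<long> vl1{3};         // initializer_list<long>: one element, 3
    std::vector<std::string> vs1{3};  // size constructor: three empty strings
    return {vl1.size(), vs1.size()};
}
```

The same braces give a size of 1 for the vector of long and a size of 3 for the vector of string.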
There is no type inference because vector provides only a fully specialized constructor for the initializer list. We could add a template indirection to play with type deduction. The example below shows that a std::initializer_list<const char*> is an invalid argument to the vector constructor.
#include <cstddef>
#include <initializer_list>
#include <string>
#include <vector>
std::string operator"" _s( const char* s, size_t sz ) { return {s, s+sz}; }
template<typename T>
std::vector<std::string> make_vector( std::initializer_list<T> il ) {
return {il};
}
int main() {
auto compile = make_vector<std::string>( { "uie","uieui","ueueuieuie" } );
auto compile_too = make_vector<std::string>( { "uie"_s, "uieui", "ueueuieuie" } );
//auto do_not_compile = make_vector( { "uie","uieui","ueueuieuie" } );
}
From http://en.cppreference.com/w/cpp/language/string_literal:
The type of an unprefixed string literal is const char[]
Thus things go this way:
#include <iostream>
#include <initializer_list>
#include <string>
#include <vector>
#include <typeinfo>
#include <type_traits>
using namespace std;
int main() {
std::cout << std::boolalpha;
std::initializer_list<const char*> v = {"one","two","three"}; // Takes pointers to the string literals (const char*)
auto var = v.begin();
const char *myvar;
cout << (typeid(decltype(*var)) == typeid(decltype(myvar))); // true
std::string ea = "hello";
std::initializer_list<std::string> v2 = {"one","two","three"}; // Constructs 3 std::string objects
auto var2 = v2.begin();
cout << (typeid(decltype(*var2)) == typeid(decltype(ea))); // true
std::vector<std::string> vec(v2);
return 0;
}
http://ideone.com/UJ4a0i
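The literal's type can also be checked at compile time. A short sketch using only <type_traits>; the function literal_is_array is a made-up name that mirrors the first check at run time.

```cpp
#include <type_traits>

// "one" is an lvalue of type const char[4] (three letters plus '\0').
static_assert(std::is_same<decltype("one"), const char (&)[4]>::value,
              "a string literal is an array of const char");

// When passed by value the array decays to a pointer to const char.
static_assert(std::is_same<std::decay<decltype("one")>::type, const char*>::value,
              "the array decays to const char*");

// Run-time mirror of the first check.
bool literal_is_array() {
    return std::is_array<std::remove_reference<decltype("one")>::type>::value;
}
```

So the elements reach the std::initializer_list<std::string> as const char arrays, and each one is converted to std::string when the list is built.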
This is a bit of a daft question, but out of curiosity would it be possible to split a string on comma, perform a function on the string and then rejoin it on comma in one statement with C++?
This is what I have so far:
string dostuff(const string& a) {
return string("Foo");
}
int main() {
string s("a,b,c,d,e,f");
vector<string> foobar(100);
transform(boost::make_token_iterator<string>(s.begin(), s.end(), boost::char_separator<char>(",")),
boost::make_token_iterator<string>(s.end(), s.end(), boost::char_separator<char>(",")),
foobar.begin(),
boost::bind(&dostuff, _1));
string result = boost::algorithm::join(foobar, ",");
}
So this would result in turning "a,b,c,d,e,f" into "Foo,Foo,Foo,Foo,Foo,Foo"
I realise this is OTT, but was just looking to expand my boost wizardry.
First, note that your program writes "Foo,Foo,Foo,Foo,Foo,Foo,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,," to your result string -- as already mentioned in comments, you wanted to use back_inserter there.
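The back_inserter point can be shown without Boost. In this sketch I swapped the Boost tokenizer for std::getline on a std::istringstream, purely to keep it self-contained; transform_tokens is a hypothetical helper name.

```cpp
#include <algorithm>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

std::string dostuff(const std::string&) { return "Foo"; }

// Splits on ',' and maps dostuff over the tokens.
std::vector<std::string> transform_tokens(const std::string& s) {
    std::vector<std::string> tokens;
    std::istringstream in(s);
    for (std::string tok; std::getline(in, tok, ',');) {
        tokens.push_back(tok);
    }

    std::vector<std::string> out;  // deliberately empty, no vector(100)
    std::transform(tokens.begin(), tokens.end(),
                   std::back_inserter(out), dostuff);
    return out;
}
```

Because the output vector grows only through back_inserter, joining it afterwards produces no trailing commas.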
As for the answer, whenever there's a single value resulting from a range, I look at std::accumulate (since that is the C++ version of fold/reduce)
#include <string>
#include <iostream>
#include <numeric>
#include <boost/tokenizer.hpp>
#include <boost/algorithm/string.hpp>
#include <boost/bind.hpp>
std::string dostuff(const std::string& a) {
return std::string("Foo");
}
int main() {
std::string s("a,b,c,d,e,f");
std::string result =
accumulate(
++boost::make_token_iterator<std::string>(s.begin(), s.end(), boost::char_separator<char>(",")),
boost::make_token_iterator<std::string>(s.end(), s.end(), boost::char_separator<char>(",")),
dostuff(*boost::make_token_iterator<std::string>(s.begin(), s.end(), boost::char_separator<char>(","))),
boost::bind(std::plus<std::string>(), _1,
bind(std::plus<std::string>(), ",",
bind(dostuff, _2)))); // or lambda, for slightly better readability
std::cout << result << '\n';
}
Except now it's way over the top and repeats make_token_iterator twice. I guess boost.range wins.
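Stripped of the tokenizer, the fold itself is fairly small. A sketch over an already split vector, with a lambda instead of boost::bind; join_transformed is a made-up name and dostuff is the same placeholder as above.

```cpp
#include <iterator>
#include <numeric>
#include <string>
#include <vector>

std::string dostuff(const std::string&) { return "Foo"; }

// Folds the parts into one comma-separated string of transformed values.
std::string join_transformed(const std::vector<std::string>& parts) {
    if (parts.empty()) return std::string();
    // Seed with the first transformed element so no leading comma appears.
    return std::accumulate(
        std::next(parts.begin()), parts.end(), dostuff(parts.front()),
        [](std::string acc, const std::string& part) {
            acc += ',';
            acc += dostuff(part);
            return acc;
        });
}
```

The first element is used as the initial value, which is why the original one-statement version has to build the token iterator a second time.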
void dostuff(string& a) {
a = "Foo";
}
int main()
{
string s("a,b,c,d,e,f");
vector<string> tmp;
s = boost::join(
(
boost::for_each(
boost::split(tmp, s, boost::is_any_of(",")),
dostuff
),
tmp
),
","
);
return 0;
}
Unfortunately I can't eliminate mentioning tmp twice. Maybe I'll think of something later.
I am actually working on a library to allow writing code in a more readable fashion than iterators alone... don't know if I'll ever finish the project though, seems dead projects tend to accumulate on my computer...
Anyway the main reproach I have here is obviously the use of iterators. I tend to think of iterators as low-level implementation details, when coding you rarely want to use them at all.
So, let's assume that we have a proper library:
struct DoStuff { std::string operator()(std::string const&); };
int main(int argc, char* argv[])
{
std::string const reference = "a,b,c,d,e,f";
std::string const result = boost::join(
view::transform(
view::split(reference, ","),
DoStuff()
),
","
);
}
The idea of a view is to be a light wrapper around another container:
from the user's point of view, it behaves like a container (minus the operations that actually modify the container structure)
from the implementation point of view, it's a lightweight object, containing as little data as possible --> the value is ephemeral here, and only lives as long as the iterator lives.
I already have the transform part working, I am wondering how the split could work (generally), but I think I'll get into it ;)
Okay, I guess it's possible, but please please don't really do this in production code.
Much better would be something like
std::string MakeCommaEdFoo(std::string input)
{
// std::count replaces count_if + bind2nd (bind2nd was removed in C++17)
std::size_t commas = std::count(input.begin(), input.end(), ',');
std::string output("foo");
output.reserve((commas+1)*4-1);
// n commas separate n+1 tokens, so append once per comma
for(std::size_t idx = 0; idx < commas; ++idx)
output.append(",foo");
return output;
}
Not only will it perform better, it will be much easier for the next guy to read and understand.