Why is ranges::split_view not a Bidirectional Range? - c++

I am using the cmcstl2 library, an implementation of the proposed C++ Ranges, with gcc 8:
std::string text = "Let me split this into words";
std::string pattern = " ";
auto splitText = text | ranges::view::split(pattern)
                      | ranges::view::reverse;
But this does not work, since the split view is only a Forward Range and not a Bidirectional Range, which reverse requires (at least that is what I think is going on). Why? If
text | ranges::view::split(pattern)
outputs a view of subranges, can't that view be reversed?
Also, in cmcstl2 I must do the following to print it out:
for (auto x : splitText)
{
    for (auto m : x)
        std::cout << m;
    std::cout << " ";
}
But with range-v3 version 0.4.0 I can do:
for (auto x : splitText)
    std::cout << x << '\n';
Why? What is the type of x?

As written, split_view only supports ForwardRange.
You can certainly try to make a BidirectionalRange version, although I suspect that is either hard or less general.
Consider how you would specify all the options for pattern such that it can also be matched backwards.
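If the pieces are only needed in reverse order, one workaround is to split eagerly into a container and walk it backwards. The following is a minimal sketch using only the standard library, not the way cmcstl2 or range-v3 implement split; the helper names split_words and reversed_words are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: eagerly split `text` on every occurrence of `pattern`.
std::vector<std::string> split_words(const std::string& text,
                                     const std::string& pattern)
{
    assert(!pattern.empty());  // an empty pattern would never advance
    std::vector<std::string> parts;
    std::size_t start = 0;
    for (;;) {
        const std::size_t pos = text.find(pattern, start);
        if (pos == std::string::npos) {
            parts.push_back(text.substr(start));  // last piece
            break;
        }
        parts.push_back(text.substr(start, pos - start));
        start = pos + pattern.size();
    }
    return parts;
}

// Hypothetical helper: join the pieces in reverse order, separated by `sep`.
std::string reversed_words(const std::string& text,
                           const std::string& pattern,
                           const std::string& sep)
{
    const std::vector<std::string> parts = split_words(text, pattern);
    std::string out;
    // A std::vector is bidirectional, so reverse iteration is available.
    for (auto it = parts.rbegin(); it != parts.rend(); ++it) {
        if (it != parts.rbegin()) out += sep;
        out += *it;
    }
    return out;
}
```

This gives up laziness: every piece is copied into the vector before anything is reversed, which is the trade-off for not needing a pattern that can be matched backwards.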

Related

What is the proper evaluation order when assigning a value in a map?

I know that the compiler is usually the last thing to blame for bugs in code, but I do not see any other explanation for the behaviour of the following C++ code (distilled down from an actual project):
#include <iostream>
#include <map>
int main()
{
    auto values = { 1, 3, 5 };
    std::map<int, int> valMap;
    for (auto const & val : values) {
        std::cout << "before assignment: valMap.size() = " << valMap.size();
        valMap[val] = valMap.size();
        std::cout << " -> set valMap[" << val << "] to " << valMap[val] << "\n";
    }
}
The expected output of this code is:
before assignment: valMap.size() = 0 -> set valMap[1] to 0
before assignment: valMap.size() = 1 -> set valMap[3] to 1
before assignment: valMap.size() = 2 -> set valMap[5] to 2
However, when I build a Release version with the (default) C++14 compiler, the output becomes:
before assignment: valMap.size() = 0 -> set valMap[1] to 1
before assignment: valMap.size() = 1 -> set valMap[3] to 2
before assignment: valMap.size() = 2 -> set valMap[5] to 3
In other words, all values in valMap are larger by 1 than what they should be - it looks like the new element is inserted into the map before the right-hand side of the assignment is evaluated.
This happens only in a Release build with C++14 language standard (which is the default in VS2019). Debug builds work fine (I hate when this happens - it took me hours to find out what is going on), as do Release builds of C++17 and C++20. This is why it looks like a bug to me.
My question is: is this a compiler bug, or am I doing something wrong/dangerous by using .size() in the assignment?
Before C++17, the evaluation order of the operands in A = B was unspecified; since C++17, B is guaranteed to be evaluated before A (see https://en.cppreference.com/w/cpp/language/eval_order, rule 20).
The behaviour of valMap[val] = valMap.size(); is therefore unspecified in C++14. You should use:
auto size = valMap.size();
valMap[val] = size;
Or avoid the problem by using emplace which is more explicit than relying on [] to automatically insert a value if it doesn't already exist:
valMap.emplace(val, size);
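Put together, the fix looks like the sketch below, which behaves the same under C++11, C++14 and C++17. The function name index_keys and the use of std::size_t as the mapped type are assumptions made for the example, not part of the original code.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <map>

// Hypothetical helper: map each key to the number of entries that were
// in the map before that key was inserted.
std::map<int, std::size_t> index_keys(std::initializer_list<int> keys)
{
    std::map<int, std::size_t> indexOf;
    for (int key : keys) {
        // Read the size in its own statement, so it cannot be affected
        // by the insertion that operator[] performs.
        const std::size_t size = indexOf.size();
        indexOf[key] = size;
    }
    return indexOf;
}
```

Calling index_keys({1, 3, 5}) maps 1 to 0, 3 to 1 and 5 to 2, regardless of build configuration or language standard.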

C++ Need help sorting a 2D string array

I'm a little stuck sorting a string Table[X][Y]. As tagged, I'm using C++; I have to use only the standard library, and the code must work with any C++ standard (not only C++11).
The size of the Table is fixed (I get X by counting how many lines a file has, and Y is fixed because it is the number of "attributes" each line has).
When I create the Table, each cell is assigned as Table[X][Y] = stringX.data(); from values previously read from a file and stored in strings. I have numbers in the first column (the one I'm going to use as the sorting criterion), and names, addresses, etc. in the others.
The part where the Table is created is:
Table[i][0] = string1.data();
Table[i][1] = string2.data();
Table[i][2] = string3.data();
Table[i][3] = string4.data();
Table[i][4] = string5.data();
Here "i" is the current iteration of a while(fgets) loop that reads one line at a time from a file, does some operations, and stores the final value of each part of the line in those strings.
I have to sort that Table in decreasing order, using the first column as the criterion.
Let's imagine the Table is like this: Table[4][3]
20 | Jhon | 14th July
2 | Mary | 9th June
44 | Mark | 10th December
1 | Chris | 4th Feb
And I need the output to be this:
44 | Mark | 10th December
20 | Jhon | 14th July
2 | Mary | 9th June
1 | Chris | 4th Feb
I have been reading several questions and pages, and they either sort int/char arrays or convert the array into a vector and then work with that. I'm trying to sort the string Table I have without converting anything (I don't know if that is possible).
I don't know if I managed to explain the issue and my situation clearly enough. I'm not posting all the code I have because, apart from the declaration of the string Table and the strings that are then placed into it via string.data(), the rest of the code has nothing to do with the Table or the sorting process. The code opens the file, reads it line by line, filters the info I need using some separators and special characters, and places each of the "ranking criteria" in a string; it then assigns a "ranking" by evaluating each of the criteria and computing a total score (which is then stored in "string1").
After all this is done, I create the string Table[x][y] and place the filtered and processed information in that Table one row at a time (because I assign it while reading each line from the file).
The only thing that remains is to sort the table from the best score to the worst and then create a file with the top 10.
I appreciate, and thank you in advance for, the time you took to read this and any tip, information, code or reading material you can provide.
First, as mentioned in the comments, a variable length array is accomplished in C++ by using std::vector. The current syntax you're using
std::string Table[X][Y]
where either X or Y is a runtime variable, is not legal C++. Given your example, a standard C++ declaration would be this:
std::vector<std::array<std::string, 3>> Table;
So let's assume that this is what you are going to use.
The next step is to sort the data based on the first column. That can be accomplished by utilizing the std::sort algorithm function, along with the appropriate predicate indicating that you are using the first column as the sorting criteria.
Here is a short example, using your data, of how this is all accomplished:
#include <vector>
#include <array>
#include <iostream>
#include <algorithm>
#include <string>
int main()
{
    std::vector<std::array<std::string, 3>> Table;
    // Set up the test data
    Table.push_back({"20", "Jhon", "14th July"});
    Table.push_back({"2", "Mary", "9th June"});
    Table.push_back({"44", "Mark", "10th December"});
    Table.push_back({"1", "Chris", "4th Feb"});
    std::cout << "Before sort:\n\n";
    for (auto& s : Table)
        std::cout << s[0] << " | " << s[1] << " | " << s[2] << "\n";
    std::cout << "\n\nAfter sort:\n\n";
    // Sort the data using the first column of each `std::array` as the criteria
    // (the lambda captures nothing and takes its arguments by const reference)
    std::sort(Table.begin(), Table.end(), [](const auto& a1, const auto& a2)
              { return std::stoi(a1[0]) > std::stoi(a2[0]); });
    // Output the results:
    for (auto& s : Table)
        std::cout << s[0] << " | " << s[1] << " | " << s[2] << "\n";
}
Here is the final output:
Before sort:
20 | Jhon | 14th July
2 | Mary | 9th June
44 | Mark | 10th December
1 | Chris | 4th Feb
After sort:
44 | Mark | 10th December
20 | Jhon | 14th July
2 | Mary | 9th June
1 | Chris | 4th Feb
The output needs a little bit of formatting, but that's not important.
Remember, it is not important as to where the data comes from, whether it is from a file or hardcoded as the example above shows. However you populate the Table, that's up to you. The goal is to show you once populated, how to sort the data.
The first thing we did was create the Table and fill it in with the test data. Note that the vector has a push_back function to add entries to the vector.
Then the call to std::sort has a predicate function (the lambda), where the predicate is given two items, in this case two std::arrays by reference. The predicate's job is to return whether the first std::array (in this case, a1) should be placed before the second std::array (a2).
Note that we only care about the first column, so we only need to consider array[0] of each of those arrays, and compare them.
Also note that since array[0] is a std::string, we can't simply compare it lexicographically (as strings, "100" compares less than "20") -- we need to convert the string to an int and compare the int values. That's the reason for the std::stoi call to convert to an integer.
The final thing about the sort predicate is that we want a descending sort. Thus the comparison operator to use is > instead of the "traditional" < (which would have sorted in an ascending manner).
Hopefully this explains what the code is doing.
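To see why the conversion matters, here is a small sketch that contrasts the two comparisons. The function names are made up for the example, and std::atoi is used instead of std::stoi only so that the snippet also compiles as C++98.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical predicate: true when `a` should come before `b` in a
// descending sort by numeric value.
bool numeric_greater(const std::string& a, const std::string& b)
{
    return std::atoi(a.c_str()) > std::atoi(b.c_str());
}

// Plain string comparison, shown only for contrast: it compares
// character by character, so "100" is less than "20" ('1' < '2').
bool lexicographic_greater(const std::string& a, const std::string& b)
{
    return a > b;
}
```

With the sample data in the question both predicates happen to give the same order, but a score such as "100" would end up below "20" with the string comparison.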
Edit:
Since you are attempting to get this code to work in C++98, the easiest way to do that is to:
Change to std::vector<std::vector<std::string> > instead of std::vector<std::array<std::string, 3>> (C++98 needs the space between the two closing angle brackets)
Not use the brace-initialization that C++11 offers
Use a comparison function instead of a lambda.
Given that, here is the code for C++98:
#include <vector>
#include <iostream>
#include <algorithm>
#include <string>
#include <cstdlib>
bool SortFirstColumn(const std::vector<std::string>& a1,
                     const std::vector<std::string>& a2)
{
    return std::atoi(a1[0].c_str()) > std::atoi(a2[0].c_str());
}
int main()
{
    // Note the space in "> >": C++98 parses ">>" as the shift operator
    std::vector<std::vector<std::string> > Table;
    // Set up the test data
    std::vector<std::string> vect(3);
    vect[0] = "20";
    vect[1] = "Jhon";
    vect[2] = "14th July";
    Table.push_back(vect);
    vect[0] = "2";
    vect[1] = "Mary";
    vect[2] = "9th June";
    Table.push_back(vect);
    vect[0] = "44";
    vect[1] = "Mark";
    vect[2] = "10th December";
    Table.push_back(vect);
    vect[0] = "1";
    vect[1] = "Chris";
    vect[2] = "4th Feb";
    Table.push_back(vect);
    std::cout << "Before sort:\n\n";
    for (size_t i = 0; i < Table.size(); ++i)
        std::cout << Table[i][0] << " | " << Table[i][1] << " | " << Table[i][2] << "\n";
    std::cout << "\n\nAfter sort:\n\n";
    // Sort the data using the first column of each `std::vector<std::string>` as the criteria
    std::sort(Table.begin(), Table.end(), SortFirstColumn);
    // Output the results:
    for (size_t i = 0; i < Table.size(); ++i)
        std::cout << Table[i][0] << " | " << Table[i][1] << " | " << Table[i][2] << "\n";
}

Why does operator>> on complex<double> not set eofbit if it reaches EOF?

I'm trying to read as many std::complex<double> as possible from a file (or any std::istream). If the operation fails, I check for ios::eof(). If it hasn't been set, I assume that there was an error in parsing the data, and I can report to the user that the input file has errors. This scheme works for double, but somehow it fails on complex numbers. Why?
Here is some code to reproduce the problem:
std::istringstream istr("4 1.2");
std::complex<double> val;
while( istr >> val )
    std::cout << " val = " << val << std::endl;
std::cout << "flags: eof=" << istr.eof() << " fail=" << istr.fail() << " bad=" << istr.bad() << std::endl;
The output is
val = (4,0)
val = (1.2,0)
flags: eof=0 fail=1 bad=0
If I replace std::complex<double> with double, it works as expected, yielding
val = 4
val = 1.2
flags: eof=1 fail=1 bad=0
This problem occurs with libstdc++, but it seems to work with libc++:
run on coliru with g++
run on coliru with clang++ and libc++
EDIT: I found a bug report from 2013, but the problem still seems to be there, and the library is quite common. Is there a way for me to make it work for anybody without having to write my own parser?
It stems from standard wording:
[complex.ops]
12 Effects: Extracts a complex number x of the form: u, (u), or (u,v), where u is the real part and v is the imaginary part (27.7.2.2).
13 Requires: The input values shall be convertible to T. If bad input is encountered, calls is.setstate(ios_base::failbit) (which may throw ios::failure (27.5.5.4)).
14 Returns: is.
15 Remarks: This extraction is performed as a series of simpler extractions. Therefore, the skipping of whitespace is specified to be the same for each of the simpler extractions.
In particular, it does not specify that eofbit should be set in any case. Even the remark does not specify which operations are performed or what their semantics are. There is a Defect Report about it, which suggests a resolution that specifies the semantics of the operation; if we are lucky, it will make its way into C++17.
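Until the wording is tightened, one possible workaround is to avoid relying on eofbit after a failed extraction and instead check for the end of input before each read. This is only a sketch under that assumption, not something the standard or either library prescribes; the names read_all and read_all_from_text are made up for the example.

```cpp
#include <cassert>
#include <complex>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: read complex numbers until the input is exhausted.
// Sets `parseError` when an extraction fails while characters remain.
std::vector<std::complex<double>> read_all(std::istream& is, bool& parseError)
{
    typedef std::char_traits<char> traits;
    std::vector<std::complex<double>> values;
    parseError = false;
    for (;;) {
        is >> std::ws;  // skip whitespace before looking ahead
        if (traits::eq_int_type(is.peek(), traits::eof()))
            break;      // nothing left to read: a clean end of input
        std::complex<double> val;
        if (!(is >> val)) {
            parseError = true;  // characters remained but did not parse
            break;
        }
        values.push_back(val);
    }
    return values;
}

// Convenience wrapper for in-memory text.
std::vector<std::complex<double>> read_all_from_text(const std::string& text,
                                                     bool& parseError)
{
    std::istringstream is(text);
    return read_all(is, parseError);
}
```

The design choice here is to decide "end of file or parse error" before calling operator>>, so the result does not depend on which flags a particular library leaves set afterwards.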

How to capture 0-2 groups in C++ regular expressions and print them?

Edit 3
I went with the good old custom parsing approach, as I got stuck with the regular expression. It didn't turn out to be that bad, as the file contents can be tokenized quite neatly and the tokens can be parsed in a loop with a very simple state machine. For those who want to check, there's a snippet of code doing this with range-for, ifstream iterators and a custom stream tokenizer in my other question on Stack Overflow. These techniques considerably lessen the complexity of writing a custom parser.
I'd like to tokenize the first part of a file's contents into capture groups of two, and then the rest just line by line. I have a semi-functional solution, but I'd like to learn how to make it better, that is, without the "extra processing" that makes up for my lack of knowledge of capture groups. Some preliminaries follow, and at the end a more exact question (the line
const std::regex expression("([^:]+?)(^:|$)");
...is the one I'd like to ask about in combination with processing the results of it).
The files are basically defined like this:
definition_literal : value_literal
definition_literal : value_literal
definition_literal : value_literal
definition_literal : value_literal
HOW TO INTERPRET THE FOLLOWING SECTION OF ROWS
[DATA ROW 1]
[DATA ROW 2]
...
[DATA ROW n]
Where each of the data rows consists of a certain number of either integers or floating point numbers separated by whitespace. Each row having as many numbers as the others (e.g. each row could have four integers). So, the "interpretation section" basically tells this format in plain text in one row.
I have an almost working solution that reads such files like this:
#include <cstdlib>
#include <fstream>
#include <iostream>
#include <regex>
#include <sstream>
#include <string>
int main()
{
    std::ifstream file("xyz", std::ios_base::in);
    if(file.good())
    {
        // Read the whole file into memory
        std::stringstream file_memory_buffer;
        file_memory_buffer << file.rdbuf();
        std::string str = file_memory_buffer.str();
        file.close();
        const std::regex expression("([^:]+?)(^:|$)");
        std::smatch result;
        const std::sregex_token_iterator end;
        for(std::sregex_token_iterator i(str.begin(), str.end(), expression); i != end; ++i)
        {
            std::cout << (*i) << std::endl;
        }
    }
    return EXIT_SUCCESS;
}
With the regex defined as above, it now prints the <value> parts of the definition file, then the interpretation part and then the data rows one by one. If I change the regex to
"([^:]+?)(:|$)"
...it prints all the lines tokenized in groups of one, almost as I would like, but how do I tokenize the first part in groups of two and the rest line by line?
Any pointers, code, explanations are truly welcomed. Thanks.
EDIT:
As I already noted to Tom Kerr, with some additional points: this is also a rehearsal, or coding kata if you will, in not writing a custom parser, to see if I could -- or we could :-) -- accomplish this with regex. I know regex isn't the most efficient thing to use here, but it doesn't matter.
What I'd hope to have is something like a list of tuples of header information (tuples of size 2), then the INTERPRET line (tuple of size 1), which I could use to choose a function that decides what to do with the data lines (tuples of size 1).
Yep, the "HOW TO INTERPRET" line comes from a set of well-defined strings, and I could just read line by line from the beginning, splitting strings along the way, until one of the INTERPRET lines is met. This regex solution is not the most efficient method, I know, but it is more like a coding kata to get myself to write something other than custom parsers (and it's been quite some time since I last wrote C++, so this is a rehearsal in that respect too).
EDIT 2
I have managed to get access to the tuples (in the context of this question) by changing the iterator type, like so
const std::sregex_iterator end;
for(std::sregex_iterator i(str.begin(), str.end(), expression); i != end; ++i)
{
    std::cout << "0: " << (*i)[0] << std::endl;
    std::cout << "1: " << (*i)[1] << std::endl;
    std::cout << "2: " << (*i)[2] << std::endl;
    std::cout << "***" << std::endl;
}
This is still way off what I'd like to have, though, and there's something wrong with the regular expression I'm trying to use. In any event, this new find, another kind of iterator, helps too.
I believe the re you are attempting is this:
TEST(re) {
    static const boost::regex re("^([^:]+) : ([^:]+)$");
    std::string str = "a : b";
    CHECK(boost::regex_match(str, re));
    CHECK(!boost::regex_match("a:a : bbb", re));
    CHECK(!boost::regex_match("aaa : b:b", re));
    boost::smatch what;
    CHECK(boost::regex_match(str, what, re, boost::match_extra));
    CHECK_EQUAL(3, what.size());
    CHECK_EQUAL(str, what[0]);
    CHECK_EQUAL("a", what[1]);
    CHECK_EQUAL("b", what[2]);
}
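The same pattern can also be tried with std::regex from the C++11 standard library, which the question already uses. The following is a minimal sketch applied to one line at a time; the helper name parse_definition is made up for the example.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <utility>

// Hypothetical helper: split "definition : value" into its two parts.
// Returns false (leaving `out` untouched) when the line does not match.
bool parse_definition(const std::string& line,
                      std::pair<std::string, std::string>& out)
{
    // Two capture groups, neither of which may contain a ':'
    static const std::regex re("^([^:]+) : ([^:]+)$");
    std::smatch what;
    if (!std::regex_match(line, what, re))
        return false;
    out = std::make_pair(what[1].str(), what[2].str());
    return true;
}
```

Header lines yield a pair of captures, while a line without " : " in it, such as the sentinel line, is simply reported as not matching.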
I'm not sure I would recommend regex in this instance though. I think you'll find simply reading a line at a time, splitting on :, and then trimming the spaces more manageable.
I guess if you can't depend on the line below as a sentinel, then it would be more difficult. Usually I would expect a format like this to be obvious from that line, not from the format of each line of the header.
HOW TO INTERPRET THE FOLLOWING SECTION OF ROWS
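The non-regex approach suggested above (read a line, split on :, trim the spaces) could be sketched as follows. The helper names trim and split_header_line are made up for the example, and treating "a line without a colon" as the end of the header is an assumption about the file format.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Strip leading and trailing whitespace (spaces, tabs, CR, LF).
std::string trim(const std::string& s)
{
    const char* const ws = " \t\r\n";
    const std::size_t first = s.find_first_not_of(ws);
    if (first == std::string::npos)
        return std::string();  // empty or whitespace-only
    const std::size_t last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}

// Hypothetical helper: split a header line at its first ':'.
// Returns false for lines without a colon, such as the sentinel line.
bool split_header_line(const std::string& line,
                       std::pair<std::string, std::string>& out)
{
    const std::size_t colon = line.find(':');
    if (colon == std::string::npos)
        return false;
    out.first = trim(line.substr(0, colon));
    out.second = trim(line.substr(colon + 1));
    return true;
}
```

A reading loop would call split_header_line on each line from std::getline until it returns false, then treat that line as the interpretation line and the remaining lines as data rows.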

C++ Multi-Assign Python feature

I am trying to learn C++. While experimenting, I typed in the following code, not expecting it to work, but hoping it would:
int one = 1, two = 2;
one, two = two, one;
cout << "one = " << one << "\n";
cout << "two = " << two << "\n";
I was encouraged by the fact that the compiler didn't complain, because this is one of the main features that I love about python that most every programming language I've ever learned does not match - the ability to evaluate multiple expressions before assigning the results WITHOUT using a temporary variable. However, I found when I ran it that this code seems to be ineffectual.
After playing around a bit, I discovered that the variable two is actually being set - so, if I ran this code:
one, two = 3, 4;
two would be equal to 3, but one would be unchanged. And so my question is, what exactly is the compiler doing in this statement? I can't for the life of me figure it out.
Thanks,
Brandon
Python-style multi-assignment is not supported in C++, and the comma operator does not do what you intend: http://en.wikipedia.org/wiki/Comma_operator
If you were assigning the same value to both one and two then you could do
one = two = 3;
but in your case you are not, so they must be separate statements:
one = 3;
two = 4;
If you wanted to swap the values, you could use std::swap(one, two) or a temporary third variable:
int temp = one;
one = two;
two = temp;
If we consider the following:
one, two = 3, 4;
The assignment operator = has higher precedence than the , operator, so the statement is parsed as (one), (two = 3), (4). The operands one and 4 are evaluated and their values discarded, so the only part with any effect is:
two = 3;
First, the = operator has higher precedence than the , operator, thus both lines below are equivalent:
(one), (two = 3), (4);
one, two = 3, 4;
Second, the comma operator is a way to group multiple expressions, but it only yields the value of the last one, so:
int a, b;
a = (b = 5, b*b);
cout << a << '\n';
would print 25. I hope this sheds some light on what the compiler was doing.
In C++11 there is std::tie.
Example:
#include <tuple>
#include <iostream>
int main()
{
    int a, b;
    std::tie(a, b) = std::make_tuple(1, 2);
    std::cout << a << ", " << b;
}
Output: 1, 2. std::tie() creates a tuple of references, and std::tuple has an overload of operator=().
I know it's not the same syntax as the one used in Python, but it's functionally the same and you may want to use it.
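The same idea covers the swap from the question. Here is a minimal sketch, assuming C++11; the function name tie_swap is made up for the example, and in practice std::swap does the same job.

```cpp
#include <cassert>
#include <tuple>

// Hypothetical helper: swap two ints the way Python's `a, b = b, a` does.
void tie_swap(int& a, int& b)
{
    // std::make_tuple copies both values first; std::tie then assigns
    // them back through references, so neither assignment sees a
    // half-updated variable.
    std::tie(a, b) = std::make_tuple(b, a);
}
```

The right-hand side must be std::make_tuple rather than another std::tie: a tuple of references on both sides would read one of the values after it had already been overwritten.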