Extract data from file into vectors but avoid duplicates - c++

I want to read data in files that are formatted like:
Point1, [3, 4]
I treat '[', ']' and ',' as delimiters and replace each of them with ' ' (a space). That part works. The problem is that if Point1, [3, 4] has already been read once, I want it stored only once and not added again when the same data appears later in the text file.
Here is what I have:
string line, name;
char filename[50];
int x,y;
cout << "Please enter filename : ";
cin >> filename;
ifstream myfile(filename);
if (myfile.is_open()) {
while ( myfile.good() ) {
getline(myfile, line);
for (unsigned int i = 0; i < line.size(); ++i) {
if (line[i] == '[' || line[i] == ']' || line[i] == ',') {
line[i] = ' ';
}
istringstream in(line);
in >> name >> x >> y;
}
cout <<name <<endl;
if (name=="Point") {
p.push_back(Point(x,y));
}
count++;
}
myfile.close();
cout << count;
}
else cout<< "Unable to open file";
How do I do this? I tried adding this after if(name=="Point"):
for (int j=0; j<p.size(); j++) {
if(p.getX != x && p.getY) != y) {
p.push_back(Point(x,y))
}
}
...but this does not work properly: no data gets stored in the vector.
Can anyone help?

Instead of storing your data in a vector, you can store it in a set. A set contains only unique values, so you don't have to check the uniqueness of your Points yourself; the set manages that.
Define set<Point> instead of vector<Point>, and use p.insert instead of p.push_back to add points. Note that std::set keeps its elements ordered, so Point needs an operator< (or you have to supply a comparator).
See the std::set documentation for details.
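A minimal sketch of the set-based approach, assuming a hypothetical Point struct with public x and y members (the question does not show its Point class). The operator< and the helper name unique_points are assumptions made for illustration only.

```cpp
#include <set>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical stand-in for the Point class used in the question.
struct Point {
    int x;
    int y;
};

// std::set needs a strict weak ordering to recognise duplicates.
bool operator<(const Point& a, const Point& b) {
    return std::tie(a.x, a.y) < std::tie(b.x, b.y);
}

// Inserts every (x, y) pair; repeated points are silently dropped by the set.
std::set<Point> unique_points(const std::vector<std::pair<int, int>>& raw) {
    std::set<Point> p;
    for (const auto& xy : raw) {
        p.insert(Point{xy.first, xy.second});
    }
    return p;
}
```

Keep in mind that a set hands the points back in sorted order, not in the order they appeared in the file.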

Assuming you want to keep your data stored in a std::vector<Point>, you could just check that no corresponding point already exists. Assuming there is an equality operator defined, it is as easy as this:
if (p.end() == std::find(p.begin(), p.end(), Point(x, y))) {
p.push_back(Point(x, y));
}
If your Point type doesn't have an equality operator and shouldn't get one, you can use a function object together with find_if() instead, e.g.:
if (p.end() == std::find_if(p.begin(), p.end(),
[=](Point const& v) { return x == v.x && y == v.y; })) {
...
You should separate your loops from other operations: The loop you propose to check if the point already exists is basically what std::find() does except that you insert a new point in each iteration! You first want to go through all existing points and see if it exists anywhere. Only if it does not, you'd insert a new point.
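To make the "search first, insert afterwards" order concrete, here is a small sketch. The Point struct, its operator== and the helper name add_unique are assumptions for illustration; the question's real Point class is not shown.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for the Point class used in the question.
struct Point {
    int x;
    int y;
};

bool operator==(const Point& a, const Point& b) {
    return a.x == b.x && a.y == b.y;
}

// Scans the whole vector first; only if no equal point exists is one appended.
// Returns true if the point was added, false if it was already present.
bool add_unique(std::vector<Point>& p, const Point& pt) {
    if (std::find(p.begin(), p.end(), pt) != p.end()) {
        return false;
    }
    p.push_back(pt);
    return true;
}
```

Each call is a linear search, so this is fine for small inputs; for large files a set is probably the better fit.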
Note that you made a similar mistake when replacing characters in your string: you try to decode the string after checking each character. This isn't so much a semantic problem, but it is a major performance problem. You should replace the characters in a loop, and after that loop you should decode the string just once. Of course, when decoding the string you need to check whether the format was OK, as it is always necessary to determine if an input operation was successful. That is, after your loop replacing the characters you want something like:
std::istringstream in(line);
if (in >> name >> x >> y) {
// now process the data
}
... or, if you are like me and don't like naming things which are only around temporarily:
if (std::istringstream(line) >> std::ws >> name >> x >> y) {
As another note: checking a stream for good() is generally the wrong thing to do because a stream can be in perfectly good condition except that it has seen EOF at some point and has thus set std::ios_base::eofbit. More importantly, you need to check after the input operation, not before! That is, your first loop should start with something like this:
while (std::getline(myfile, line)) {
...
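Putting these notes together, a possible sketch looks like this: replace the delimiters in one pass, decode the line once, and test every read as it happens. The function names parse_point_line and count_valid_lines are made up for this example.

```cpp
#include <algorithm>
#include <sstream>
#include <string>

// Replaces '[', ']' and ',' with spaces, then decodes the line exactly once.
// Returns false if the line does not have the form "name x y".
bool parse_point_line(std::string line, std::string& name, int& x, int& y) {
    std::replace_if(line.begin(), line.end(),
                    [](char c) { return c == '[' || c == ']' || c == ','; },
                    ' ');
    std::istringstream in(line);
    return static_cast<bool>(in >> name >> x >> y);
}

// Counts the well-formed lines; the loop condition is the read itself.
int count_valid_lines(std::istream& file) {
    int count = 0;
    std::string line, name;
    int x = 0, y = 0;
    while (std::getline(file, line)) {
        if (parse_point_line(line, name, x, y)) {
            ++count;
        }
    }
    return count;
}
```

Blank or malformed lines simply fail the parse and are skipped, instead of being processed with stale values of name, x and y.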

Related

why does this routine return an array of length 0?

everyone, here is a function I wrote to read a user input which is a vector of double of unknown size, the input must terminate when 'enter' is pressed:
vector<double> read_array()
{
vector<double> array_in;
double el;
while (!cin.get())
{
cin >> el;
array_in.push_back(el);
}
return array_in;
}
To illustrate it consider the following code:
void init() // the function that calls the read_array function
{
cout << "Enter array X: " << endl;
vector<double> X = read_array();
int l = X.size();
cout << l << endl;
}
A typical input when prompted is:
1(space)2(space)3(space)4(enter)
When Enter is pressed, the input terminates and the variable 'l' is initialised, but it is equal to 0, i.e. the array size is 0. Stepping through with a debugger suggests that the loop body is never entered.
The same routine works well if the input value is not an array.
Thanks to everyone in advance!
I don't know what you hope std::cin.get() does but based on your comment it seems you hope that it somehow deals with end of lines: it doesn't. It simply reads the next character which is unlikely to do you much good. In particular, if the character is anything but '\0' negating it will result in the boolean value false. That said, the loop should in principle work unless you only input a single digit numeric value followed (possibly after space) by a non-digit or the end of the input.
The easiest approach to deal with line-based input is to read the line into a std::string using std::getline() and then to parse the line using std::istringstream:
std::vector<double> read_array() {
    std::vector<double> result;
    if (std::string line; std::getline(std::cin, line)) {
        std::istringstream lin(line);
        // parse from the line's own stream, not from std::cin
        for (double tmp; lin >> tmp; ) {
            result.push_back(tmp);
        }
    }
    return result;
}
As std::cin is only involved while reading lines, std::cin.fail() won't be set when parsing doubles fails. That is, you can read multiple lines with arrays of doubles, each of which can also be empty.
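To illustrate that last point, here is a variant that takes the stream as a parameter so it can be tried on an in-memory string. The name read_row and its bool return value are assumptions of this sketch, not part of the answer's code.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Reads one line from `in` and parses as many doubles from it as possible.
// Returns false only when no further line could be read.
bool read_row(std::istream& in, std::vector<double>& row) {
    row.clear();
    std::string line;
    if (!std::getline(in, line)) {
        return false;
    }
    std::istringstream lin(line);  // parse failures stay inside this stream
    for (double tmp; lin >> tmp; ) {
        row.push_back(tmp);
    }
    return true;
}
```

An empty line yields an empty row, and a line such as "5.5 x 6" yields just the values before the first unparsable token, while the outer stream stays usable for the next line.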
If you don't want to read an auxiliary line, you'll need to understand a bit more about how formatted input in C++ works: it starts off by skipping whitespace. As newlines are whitespace, you instead need to read the whitespace yourself and stop if you hit a newline or a non-whitespace character. I'd use a function doing this skipping which returns false if it reached a newline (which is still extracted):
bool skip_non_nl_ws(std::istream& in) {
    for (int c; std::isspace(c = in.peek()); in.ignore()) {
        if (c == '\n') {
            in.ignore();  // extract the newline before reporting it
            return false;
        }
    }
    return true;
}
std::vector<double> read_array() {
    std::vector<double> result;
    for (double tmp; skip_non_nl_ws(std::cin) && std::cin >> tmp; ) {
        result.push_back(tmp);
    }
    return result;
}
This approach has a similar property: std::ios_base::failbit won't be set when a line ends normally. However, if any of the characters on a line can't be parsed as a double, the bit will be set. That way you can detect input errors, whereas the approach using std::getline() will just go on to the next line.

Trouble with Out of range in memory using stringstream

I just started using the stringstream for the first time and I love the concept, but I am having a hard time finding where exactly I am having an out of range in memory with my stringstream function.
What my function does is take in a string, for example "N02550 G3 X16.7379 Y51.7040 R0.0115". This is machine code for a CNC machine at my job. I pass the string to a stringstream in order to find the strings that have an X, Y or Z next to them; these are coordinates. It then gets rid of the character at the beginning in order to save the float number to my struct "Coordinate" (there are 3 doubles: x, y, z).
When I run a text file that has this machine code with 33 lines, my program works. When I run it with machine code of 718 lines, it gets to 718, then crashes with out of range memory. Another weird part is that when I run machine code with 118,000 lines, it goes up to around 22,000 lines and then crashes. So I'm having trouble figuring out why it is able to do that and what's causing the problem.
Here is the function:
void getC(string& line, Coordinates& c)//coordinates holds 3 doubles, x, y, z
{
//variables
string holder;
stringstream ss(line);
while(ss)
{
ss >> holder;
if(holder.at(0) == 'X')
{
holder.erase(0,1);//get rid the the character at the beggining
stringstream sss(holder);
sss >> c.x;
sss.clear();
}
if(holder.at(0) == 'Y')
{
holder.erase(0,1);
stringstream sss(holder);
sss >> c.y;
sss.clear();
}
if(holder.at(0) == 'Z')
{
holder.erase(0,1);
stringstream sss(holder);
sss >> c.z;
sss.clear();
}
if(ss.eof()) // to get out of the ss stream
break;
}
ss.clear();
}
If you want to see the whole application(the application is well documented) then ask or if you need the txt files containing the machine code. Thank you!
Try changing:
while(ss)
{
ss >> holder;
...
if(ss.eof()) // to get out of the ss stream
break;
}
To simply this:
while(ss >> holder)
{
...
}
And you can get rid of those calls to clear in each branch (X/Y/Z) as it doesn't really do anything given that sss is a temporary and you're not doing anything more with it (no point setting flags on something you're going to discard right after). I suspect your out of range issue is coming from trying to access holder.at(0) after ss >> holder fails.
You generally want to check for input failure right after reading a token, and a convenient way to both attempt to input and check for failure at once is to simply check if ss >> token evaluates to true. So we can write code like:
if (ss >> token)
{
...
}
else
{
// handle failure if necessary
}
I generally find it's a lot easier to avoid getting in trouble writing code that way than manually checking error flags.
As a simplified version:
void getC(string& line, Coordinates& c)
{
stringstream ss(line);
for (string holder; ss >> holder; )
{
const char ch = holder.at(0);
stringstream sss(holder.substr(1));
if (ch == 'X')
sss >> c.x;
else if (ch == 'Y')
sss >> c.y;
else if (ch == 'Z')
sss >> c.z;
}
}

How to obtain certain information from a specified format in c++?

Beginner programmer here,
Say I want to obtain initial starting coordinates in the form (x,y), so I ask the user to enter in a point using the specific form "(x,y)". Is there a way that I could recognize the format and parse the string so that I could obtain the x and y values?
Read a line of text using:
char line[200]; // Make it large enough.
int x;
int y;
fgets(line, sizeof(line), stdin);
Then, use sscanf to read the numbers from the line of text.
if ( sscanf(line, "(%d,%d)", &x, &y) != 2 )
{
// Deal with error.
}
else
{
// Got the numbers.
// Use them.
}
If you want to use iostreams instead of stdio, use getline instead of fgets.
You can use regex to find the matching sequence anywhere in the input.
#include <iostream>
#include <string>
#include <regex>
int main() {
std::string line;
std::getline(std::cin, line);
std::regex e(R"R(\(([-+]?(?:\d*[.])?\d+)\s*,\s*([-+]?(?:\d*[.])?\d+)\))R");
std::smatch sm;
if (std::regex_search(line, sm, e)) {
auto x = std::stod(sm[1]);
auto y = std::stod(sm[2]);
std::cout << "Numbers are: " << x << ", " << y << std::endl;
}
return 0;
}
In order to parse stuff you need a parser. There are many ways to write a parser, but generally parsers read tokens and decide what to do next based on which token it is. (The terms "parser" and "token" are important; look them up.)
In your case you don't have to explicitly introduce a separate token entity. Reading elements from the input stream with the >> operator will do.
You need to:
read a character; verify it's a '('
read a number
read a character; verify it's a ','
read a number
read a character; verify it's a ')'
If any step fails, the entire parsing fails.
You can see the same basic step is done three times, so you can write a function for it.
bool expect_char(std::istream& is, char what)
{
char ch;
return is >> ch && ch == what;
}
This works because is >> ch returns the stream after the read operation, and the stream can be viewed as a boolean value: true if the last operation succeeded, false otherwise.
Now you can compose your parser:
bool get_vector (std::istream& is, int& x, int& y)
{
return expect_char(is, '(') &&
is >> x &&
expect_char(is, ',') &&
is >> y &&
expect_char(is, ')');
}
This method has a nice property that blanks are allowed between the numbers and the symbols.
Now this may look like a lot of stuff to type compared to the solution that uses sscanf:
bool get_numbers2 (std::istream& is, int& x, int& y)
{
    std::string s;
    return std::getline(is, s) &&
        (std::sscanf(s.c_str(), "(%d,%d)", &x, &y) == 2);
}
But sscanf is:
dangerous (there is no typechecking of its arguments)
less powerful (the input format is very rigid)
not generic (doesn't work in a template)
It's OK to use the scanf functions family where appropriate, but I don't recommend it for new C++ programmers.
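As a usage sketch for the stream-based parser, the two functions can be wrapped so that a point is parsed straight from a string. The wrapper parse_point is a hypothetical helper added for this example.

```cpp
#include <istream>
#include <sstream>
#include <string>

bool expect_char(std::istream& is, char what) {
    char ch;
    return is >> ch && ch == what;
}

bool get_vector(std::istream& is, int& x, int& y) {
    return expect_char(is, '(') &&
           is >> x &&
           expect_char(is, ',') &&
           is >> y &&
           expect_char(is, ')');
}

// Hypothetical convenience wrapper: parse "(x,y)" from a string.
bool parse_point(const std::string& text, int& x, int& y) {
    std::istringstream is(text);
    return get_vector(is, x, y);
}
```

Because operator>> skips leading whitespace, input such as " ( -7 , 12 ) " is accepted just like "(3,4)", while "(3;4)" is rejected.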

c++ moving to next element in a file.txt [duplicate]

I have a stupid question.
I have a .txt file. Once it is opened, I need to read only the numbers and skip the words.
Is there any way to check whether the next element is a word or not?
My file looks like: word 1 2 word 1 2 3 4 5 6...
int n,e;
string s;
ifstream myfile("input.txt");
So far the only way around the problem I can think of is a clumsy one: read each word into a string and then read the numbers, like this:
myfile >> s;
myfile >> n;
myfile >> e;
You can do the following
int num = 0;
while(myfile >> num || !myfile.eof()) {
if(myfile.fail()) { // Number input failed, skip the word
myfile.clear();
string dummy;
myfile >> dummy;
continue;
}
cout << num << endl; // Do whatever necessary with the next number read
}
When reading from a file as you are doing, you can treat every token as a string and then check whether that string is a number. One way to convert a string to an integer (if that string is an integer) is the atoi() function.
Be careful though: you must pass it a C string.
You can read all the data as strings and try to convert each one to an integer in a try {} catch () {} block. If the data really is an integer, perform the operation in the try section; otherwise control goes to the catch block, where you do nothing.
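A sketch of the try/catch idea, with one substitution: it uses std::stoi (C++11), which the answers above do not name, because atoi() never throws and so cannot report a failed conversion this way. The helper name to_int is made up.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Returns true and stores the value only if `token` is entirely an integer.
// `value` is left untouched on failure.
bool to_int(const std::string& token, int& value) {
    try {
        std::size_t used = 0;
        const int parsed = std::stoi(token, &used);
        if (used != token.size()) {
            return false;  // trailing junk such as "12abc"
        }
        value = parsed;
        return true;
    } catch (const std::invalid_argument&) {
        return false;      // not a number at all
    } catch (const std::out_of_range&) {
        return false;      // does not fit in an int
    }
}
```

Reading tokens with myfile >> s and passing each one to to_int keeps the numbers and skips the words.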
Oops, it's already solved. But it's worth mentioning that it is also possible to:
either read individual chars from the stream and putback() them if they are digits before using operator >>
or peek() at the next char in the stream without reading it, to decide whether to ignore it or to use operator >>
Just be careful about '-', which is not a digit but could be the sign of an integer.
Here is a small example:
int c, n, sign=1;
ifstream ifs("test.txt", std::ifstream::in);
while (ifs.good() && (c=ifs.peek())!=EOF ) {
if (isdigit(c)) {
ifs >> n;
n *= sign;
sign = 1;
cout << n << endl;
}
else {
c=ifs.get();
if (c == '-')
sign = -1;
else sign = 1;
}
}
ifs.close();
It's not the most performant approach, however it has the advantage of only reading from stream, without intermediary strings and memory management.

failing to compare space in a string

I'm writing a program that gets inputs from the user, where each input contains ints delimited by spaces, e.g. "2 3 4 5".
My own atoi implementation works, but whenever I iterate over the string and try to skip the spaces I get a runtime error:
for(int i=0, num=INIT; i<4; i++)
{
if(input[i]==' ')
continue;
string tmp;
for(int j=i; input[j]!=' '; j++)
{
//add every char to the temp string
tmp+=input[j];
//means we are at the end of the number. convert to int
if(input[i+1]==' ' || input[i+1]==NULL)
{
num=m_atoi(tmp);
i=j;
}
}
}
In the line 'if(input[i+1]==' '.....' I get an exception.
Basically, I'm trying to insert just "2 2 2 2".
I realized that whenever I compare an actual space in the string with ' ', the exception is raised.
I tried to compare with the ASCII value of space which is 32 but that failed too.
Any ideas?
The problem is that you don't check for the end of the string in your main loop:
for(int j=i; input[j]!=' '; j++)
should be:
for(int j=i; input[j]!=0 && input[j]!=' '; j++)
Also, don't use NULL for the NUL char. You should use '\0' or simply 0. The macro NULL should be used only for pointers.
That said, it may be easier in your case to just use strtol or istringstream or something similar.
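For the strtol route, a minimal sketch could look like the following. The function name parse_ints is an assumption; strtol skips leading whitespace itself and reports through its end pointer where it stopped, so no manual index juggling is needed.

```cpp
#include <cstdlib>
#include <string>
#include <vector>

// Parses whitespace-separated integers and stops at the first thing
// that is not a number (or at the end of the string).
std::vector<long> parse_ints(const std::string& input) {
    std::vector<long> values;
    const char* p = input.c_str();
    char* end = nullptr;
    for (long v = std::strtol(p, &end, 10); end != p;
         v = std::strtol(p, &end, 10)) {
        values.push_back(v);
        p = end;  // continue right after the number just parsed
    }
    return values;
}
```

When no digits could be converted, strtol leaves end equal to p, which is what terminates the loop.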
Not an answer to the question, but too big for a comment.
You should note the C++ stream library automatically reads and decodes int from a space separated stream:
int main()
{
int value;
std::cin >> value; // Reads and ignores space then stores the next int into `value`
}
Thus to read multiple ints just put it in a loop:
while(std::cin >> value) // Loop will break if user hits ctrl-D or ctrl-Z
{ // Or a normal file is piped to the stdin and it is finished.
// Use value
}
To read a single line that contains space-separated values, just read the line into a string, convert this to a stream, then read the values:
std::string line;
std::getline(std::cin, line); // Read a line into a string
std::stringstream linestream(line); // Convert string into a stream
int value;
while(linestream >> value) // Loop as above.
{
// Use Value
}