confusion about lists and pairs - c++

So I'm experimenting with adding first and last names to a doubly linked list.
I have various text files of different lengths with the format "string, string", and am using list<pair<string,string> > to store my data.
I am using this code:
typedef std::list< std::pair<string,string> > listPair;
...
list<pair<string, string> > mylist;
ifstream myFile;
myFile.open("20.txt");
pair<string, string> stuff;
while (myFile >> stuff.first >> stuff.second)
{
mylist.push_back(stuff);
}
listPair::iterator iter = mylist.begin();
for(;iter != mylist.end();iter++)
{
string s = (*iter).first;
cout << s << endl;
string c = (*iter).second;
cout << c << endl;
}
Now the problem I'm having is that, firstly, the last item in the list is not being added.
Every file just misses the last line, so that's a little confusing.
Also, I'm calling mylist.size() to check that all the names have been added, and it's confusing me: for a text file containing 99 names, i.e. 99 lines of text, it says the list has size 48 (bearing in mind it only reads in 98 because of the missing last line).
WHY 48?
Is it something to do with the fact that I used pairs? That still would not make sense: if it were not reading in pairs there would be double the amount, since each pair just holds a first and last name as one value.
Mind boggling to me.
once again thanks for all your help!

I have a feeling your file doesn't actually have spaces between the values as you've described, so it looks like this:
one,two
three,four
five,six
seven,eight
nine,ten
If you were to run your program on this, the size of the list will be 2 (floor(number_of_lines/2), which for you would give 48) and the last line won't have been put in the list at all. Why?
Firstly, each call to std::ifstream::operator>>(std::string&) will extract up until it hits some white space. In this case, the first white space on the first line is the \n at the end of it. So on the first iteration, stuff.first will be "one,two" and then the next line will be read into stuff.second, making it "three,four". This is then pushed into the list. The next two lines are read in the same way, giving you the pair {"five,six","seven,eight"}. On the next iteration, the first operator>> will extract "nine,ten" and the second will fail, causing the while condition to end and the last line to be discarded.
Even if you did have spaces, you would end up with commas in the first of every pair, which is certainly not what you want.
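To see the effect described above in isolation, here is a minimal sketch that runs the same extraction loop over an in-memory copy of the five-line sample. The function name read_with_extraction is made up for illustration, and a std::istringstream stands in for the file.

```cpp
#include <list>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical helper: runs the question's "stream >> first >> second" loop
// over a string that stands in for the contents of the text file.
std::list<std::pair<std::string, std::string>>
read_with_extraction(const std::string& file_contents) {
    std::istringstream in(file_contents);
    std::list<std::pair<std::string, std::string>> result;
    std::pair<std::string, std::string> item;
    // Each >> stops only at whitespace, so one iteration consumes two lines
    // when the lines themselves contain no spaces.
    while (in >> item.first >> item.second) {
        result.push_back(item);
    }
    return result;
}
```

Calling it with "one,two\nthree,four\nfive,six\nseven,eight\nnine,ten\n" gives a list of two pairs, the first being {"one,two", "three,four"}, and "nine,ten" is dropped because the second extraction of the third iteration fails.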
The nicer way to approach a problem like this is to use std::getline to extract a line, and then parse that line as appropriate:
std::string line;
std::pair<std::string, std::string> line_pair;
while (std::getline(myFile, line)) {
std::stringstream line_stream(line);
std::getline(line_stream, line_pair.first, ',');
std::getline(line_stream, line_pair.second);
mylist.push_back(line_pair);
}
I also recommend using std::vector unless you have a good reason to use std::list.
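Putting both suggestions together, a self-contained sketch might look like the following. The names NamePair and parse_names are assumptions for illustration, and skipping the whitespace after the comma is my addition to cope with the "string, string" format mentioned in the question.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

using NamePair = std::pair<std::string, std::string>;

// Hypothetical helper: reads "first,last" (or "first, last") lines into a vector.
std::vector<NamePair> parse_names(std::istream& in) {
    std::vector<NamePair> names;
    std::string line;
    while (std::getline(in, line)) {
        if (line.empty()) continue;                   // ignore blank lines
        std::istringstream line_stream(line);
        NamePair name;
        std::getline(line_stream, name.first, ',');   // text before the comma
        // std::ws discards any spaces that follow the comma.
        std::getline(line_stream >> std::ws, name.second);
        names.push_back(name);
    }
    return names;
}
```

Because std::getline returns the last line even when it has no trailing newline, the final name in the file is not lost.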

Operator >> on ifstream treats a newline as just another piece of whitespace, not as the end of a record. Hence it does not care where your lines end: it simply reads the next whitespace-delimited token, and if a line holds only one such token, the second read takes its value from the following line.
Try using getline to read a whole line at a time instead.

how to ignore n integers from input

I am trying to read the last integer from an input such as-
100 121 13 ... 7 11 81
I'm only interested in the last integer and hence want to ignore all
previous integers.
I thought of using cin.ignore but that won't work here due to
unknown integers (100 is of 3 digits, while 13 is of 2 digits & so on)
I can input integer by integer using a loop and do nothing with them. Is there a better way?
It all depends on the use case that you have.
Reading an unspecified number of integers from std::cin is not as easy as it may seem because, in contrast to reading from a file, you will not automatically get an EOF condition. If you were reading from a file stream, it would be very simple:
int value{};
while (fileStream >> value)
;
If you are using std::cin you could try pressing CTRL-D or CTRL-Z or whatever works on your terminal to produce an EOF (End Of File) condition. But usually the approach is to use std::getline to read a complete line until the user presses enter, then put this line into a std::istringstream and extract from there.
In that respect, one of the answers given below is not that good.
So, next solution:
std::string line{};
std::getline(std::cin, line);
std::istringstream iss{line};
int value{};
while (iss >> value)
;
You were asking
Is there a better way?
That also depends a little. If you are just reading a few integers, then please go with the above approach. If you have very many values, you may waste time unnecessarily converting many substrings to integers.
Then it would be better to first read the complete string, use rfind to find the last space in the string, and use std::stoi to convert the last substring to an integer.
Caveat: in this case you must be sure (or check with more lines of code) that there is no white space at the end and that the last substring is really a number. That is a lot of string/character fiddling, which can most probably be avoided.
So, I would recommend the getline-stringstream approach.
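For completeness, the rfind idea mentioned above could be sketched roughly like this. The function name last_integer is hypothetical, std::optional needs C++17, and I use find_last_not_of / find_last_of instead of a plain rfind(' ') so that trailing white space and tabs are covered by the checks from the caveat.

```cpp
#include <cstddef>
#include <optional>
#include <stdexcept>
#include <string>

// Hypothetical helper: returns the last whitespace-separated token of `line`
// as an int, or std::nullopt if there is no token or it is not a whole number.
std::optional<int> last_integer(const std::string& line) {
    const char* whitespace = " \t\r\n";
    // Position of the last character that is not white space.
    const std::size_t end = line.find_last_not_of(whitespace);
    if (end == std::string::npos) return std::nullopt;   // empty or blank line
    // Position of the separator in front of the last token, if any.
    const std::size_t sep = line.find_last_of(whitespace, end);
    const std::size_t begin = (sep == std::string::npos) ? 0 : sep + 1;
    const std::string token = line.substr(begin, end - begin + 1);
    try {
        std::size_t consumed = 0;
        const int value = std::stoi(token, &consumed);
        if (consumed != token.size()) return std::nullopt;   // e.g. "81x"
        return value;
    } catch (const std::exception&) {
        return std::nullopt;   // not a number, or out of range for int
    }
}
```

Only the final token is converted, so the cost of the conversion no longer depends on how many integers precede it.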
You can try this simple solution for dynamically ignoring rest of the values except the last given in this problem as shown:
int count = 0;
int values, lastValue; // lastValue used for future use
std::cout << "Enter your input: ";
while (std::cin >> values) {
lastValue = values; // must be used, otherwise values = 0 when loop ends
count++;
}
std::cout << lastValue; // prints
Note: A non-numeric character is required to stop the while(), hence it's better to put a . at the end of the input.
Output example
Enter your input: 3 2 4 5 6 7.
7
Try this:
for( int i=0; i<nums_to_ignore; i++) {
int ignored;
std::cin >> ignored;
}

cpp stringstream read input file algorithm to find LCS

Hi, here's my first question here. I will write as clearly as possible; if I am too much of a newbie, please bear with me. Thanks.
Background: I was asked to solve the longest common substring (LCS) problem with given input files in C++.
The purpose is to optimize the algorithm, so it has limited run-time and RAM requirements (case insensitive).
My approach: I used a stringstream to parse every input line and stored them in a vector. I use something like a suffix tree to chop the string, sort it and put it into a vector array (a vector that stores vectors), and compare every 2 lines (v1, v2) to find common substrings (I used a nested for loop to compare each word inside every vector), and then put the common substrings back into the array and remove v1 and v2.
suffix tree eg. banana -> anana -> nana -> ana -> na -> a..[I stored all 5 elements into the vector]
result: it works for most of the files (normal textfiles)
problem: I got 2 special test cases that take forever to find the LCS.
1. has 10000 input lines, and each line has 3000 chars on average (including spaces). It took me 50 mins to find the LCS. The requirement is that it must not exceed 5 mins.
2. has 100 input lines, and each line has 60k chars on average. It never finishes running.
what I tried: build a common word dictionary from the first 2 sentences
read first two lines and stored into vectors
used suffix tree again to find common elements(substring) and named as dictionary
for rest of input lines,
if (words read is within dictionary)
fine do what I did before, read next one
else if (word is not in dictionary)
ignore this word, read next one
help needed: I still cannot process the first two lines if each line contains 60k chars, so building the dictionary itself would exceed the run-time limit. I am not sure whether a hash table would work much better than vectors. I know a bit about hash tables but have never written anything with one, so if you can patiently explain them, I would appreciate that.
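To make the dictionary idea a bit more concrete, here is a rough sketch of how it might look with a hash table, using std::unordered_set from the standard library. This is not my actual code: the names to_lower, common_words and keep_known_words are made up for illustration, and it only covers the word filtering step, not the LCS search itself.

```cpp
#include <algorithm>
#include <cctype>
#include <sstream>
#include <string>
#include <unordered_set>
#include <vector>

// Lower-cases a word so that comparisons are case insensitive.
std::string to_lower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// Hypothetical helper: the "dictionary" is the set of words found in both lines.
std::unordered_set<std::string> common_words(const std::string& a,
                                             const std::string& b) {
    std::unordered_set<std::string> in_a, dictionary;
    std::istringstream sa(a), sb(b);
    std::string word;
    while (sa >> word) in_a.insert(to_lower(word));
    while (sb >> word) {
        const std::string lowered = to_lower(word);
        if (in_a.count(lowered) != 0) dictionary.insert(lowered);
    }
    return dictionary;
}

// Hypothetical helper: keeps only the words of `line` that are in the dictionary.
std::vector<std::string> keep_known_words(
        const std::string& line,
        const std::unordered_set<std::string>& dictionary) {
    std::vector<std::string> kept;
    std::istringstream ss(line);
    std::string word;
    while (ss >> word) {
        const std::string lowered = to_lower(word);
        if (dictionary.count(lowered) != 0) kept.push_back(lowered);
    }
    return kept;
}
```

Each lookup in the set takes constant time on average, whereas searching a vector of words means scanning it (or keeping it sorted and using binary search).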
Update:
As suggested, I put some code here (the first part parses each line and stores it into a vector, the second shows how I compare 2 strings and find the common elements)
vector< vector<string> > parsed_array;
vector<string> choped_element;
// Num1::read from file in a while loop
while (getline (myfile,line))
{
cout << "< InputlineLoopCounter: "<<InputlineLoopCounter<<endl;
choped_element.clear();
choped_element.push_back(line); //whole string as first element, e.g. "Hello World"
stringstream ss(line);
string copystr (line);
while (ss >> temp)
{
copystr.erase(0,copystr.find_first_of(" \t")+1); // here turns into "World"
choped_element.push_back(copystr);
}
choped_element.pop_back();//since I stored the whole string as the first element, the last one is not necessary
sort(choped_element.begin(),choped_element.end());
parsed_array.push_back(choped_element);//stored into vector array
InputlineLoopCounter ++;
}
//Num2::compare 2 different strings and assemble the common part into a new string
//v1 and v2 are 2 vectors full of chopped strings and v3 should hold the common elements
// eg. v1[0]="hello world"; v1[1]="world"
// eg. v2[0]="I dislike hello world"; v2[1]="dislike hello world"; v2[2]="hello world"; v2[3]="world"
// eg. v3 as result would be v3[0]="hello world";v3[1]="world"
for (size_t i = 0; i < v1len; i++)
{
for (size_t j = 0; j< v2len; j++)
{
stringstream ss1(v1[i]);
string fword1;
ss1>>fword1;
stringstream ss2(v2[j]);
string fword2;
ss2>>fword2;
if(fword1 == fword2) //v1[i] and v2[j] are space separated words
{
string nword1;
string nword2;
string lcommon;
int comlen = 1;
string combine;
combine.append(fword1);
combine.append(space);
while (ss1>>nword1 && ss2>>nword2)
{
if (nword1 == nword2)
{
combine.append(nword1);
combine.append(space);
comlen ++;
}
else
break;
}
combine.erase(combine.find_last_of(" "));
cout<< "common word: "<<combine<<endl;
v3.push_back(combine);
}
}
}

How to Read last line of a delimited File

I have a delimited file with semi-colon delimiter.
I am using the following code to read the last line. I believe the loop runs until the last line and keeps overwriting the lastLine string. So once the last line is reached, the loop breaks and the string is the last line.
while(getline(finlocal, chuy, ';'))
{
getline(finlocal,lastLine, ';');
}
cout<<lastLine; //last line.
But this method does not work properly or efficiently.
Any suggestions on how to find the last line of this delimited file?
You did not explain what your problem is, but I guess it doesn't check every line because the getline() inside the loop advances to the next element, so you lose half of your elements.
while(getline(finlocal, chuy, ';'))
{
}
std::cout<<chuy;
should work
Let's say this is your lines:
1.; <--
2.;
3.;
4.;
...
after the first iteration it goes to:
1.; <--
2.; <--(by while)
3.; <--(by inside getline)
4.;
...
So you lose the 2nd element. The loop goes on, and in the end:
5.; <-- (by inside getline)
end.; <-- (by while): this assigns the right line to chuy and advances to the next element,
which does not exist, so the inside getline() fails
use
while(getline(finlocal, lastLine, ';'));
cout<<lastLine;
or otherwise you might not get the last line.
you can use
while(getline(finlocal, lastLine, ';'))
{
getline(finlocal,lastLine, ';');
}
cout<<lastLine;
but there is no point in doing getline twice
for a better delimiter in your case i would have used a special sign, say do:
char delimiter = 251;
that char is the sign for square root, and most probably won't be used in chat. still, for each chat message that arrives you need to check if the sign is in use there and if so delete it. now for each message you send, send with the sign at the end of it, and when receiving, split at that character with the getline.
that way you split easily in a place you know the user won't mess.
If the file is not very big, you may first read the whole file into a string and use string::find_last_of to find the index of the last semicolon. Then print everything from that index to the end of the file.
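A minimal sketch of that whole-file approach, assuming the file fits comfortably in memory. The function name last_field is made up, and dropping a trailing delimiter or newline before searching is my own assumption about what the file looks like.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: returns the text after the last delimiter in the stream.
std::string last_field(std::istream& in, char delim = ';') {
    std::ostringstream buffer;
    buffer << in.rdbuf();                 // read everything in one go
    std::string text = buffer.str();

    // Ignore white space and delimiters at the very end, so a file ending
    // in ";\n" still yields its last non-empty field.
    const std::string trailing = std::string(" \t\r\n") + delim;
    const std::size_t end = text.find_last_not_of(trailing);
    if (end == std::string::npos) return "";
    text.erase(end + 1);

    const std::size_t pos = text.find_last_of(delim);
    return (pos == std::string::npos) ? text : text.substr(pos + 1);
}
```

With an std::ifstream you would call last_field(finlocal). Any white space at the start of the last field (for example a newline after the previous semicolon) is kept as-is.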

How do I make std::getline inform me when it has hit the end of a stringstream?

Supposing stringstream contains James is 4 , I can write something like getline (stream, stringjames, ' ') to get the individual words, but is there any way to know that I've hit the end of the line?
Bonus Question!
Case 1: James is 4
Case 2: James is four
If I were iterating through the words in a stringstream, and I expect to receive an int val of 4, but instead I received a string, what would be the best way to check this?
You check the return value to see if it evaluates true or false:
if (getline(stream, stringjames, ' '))
// do stuff
else
// fail
As for the "bonus question" you can also do the same thing when extracting ints and things from streams. The return value of operator>> will evaluate to true if the read was successful, and false if there was an error (such as there being letters instead of numbers):
int intval;
if (stream >> intval)
// int read, process
else if (stream.eof())
// end-of-stream reached
else
// int failed to read but there is still stuff left in the stream
Supposing stringstream contains James is 4 , I can write something like getline (stream, stringjames, ' ') to get the individual words, but is there any way to know that I've hit the end of the line?
It's normally easiest to read into a std::string variable - the default is to consider it delimited by space anyway:
std::string word;
while (some_stream >> word)
{
// first iteration "James", then "is", then "4", then breaks from while...
}
Bonus Question! Case 1: James is 4 Case 2: James is four
If I were iterating through the words in a stringstream, and I expect to receive an int val of 4, but instead I received a string, what would be the best way to check this?
You're best off reading it into a string first, then checking whether you can convert that string into a number. You might try strtol, strtoul etc. - through their end-pointer argument they helpfully indicate whether the entire value is a legal number so you can detect and reject values like say "4q".
An alternative is to try streaming into an integral type first, and only if it fails reset the error flags on the stream and get a string instead. I can't remember if you could need to reposition the stream so you could read the string variable, but you could write a couple test cases and nut it out.
Alternatively, you could use regular expressions and subexpression matches to parse your input: more useful as the expression gets more complicated.
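As a sketch of the first suggestion (read a word into a string, then check whether the whole word converts), something along these lines should do. The function name parse_whole_long is made up, and std::optional needs C++17.

```cpp
#include <cerrno>
#include <cstdlib>
#include <optional>
#include <string>

// Hypothetical helper: returns the value if `word` is entirely a decimal
// integer that fits in a long, otherwise std::nullopt.
std::optional<long> parse_whole_long(const std::string& word) {
    if (word.empty()) return std::nullopt;
    errno = 0;
    char* end = nullptr;
    const long value = std::strtol(word.c_str(), &end, 10);
    if (errno == ERANGE) return std::nullopt;          // out of range for long
    if (end == word.c_str()) return std::nullopt;      // no digits at all: "four"
    if (*end != '\0') return std::nullopt;             // trailing junk: "4q"
    return value;
}
```

In the word loop you would then call parse_whole_long(word) on the token where a number is expected and treat an empty result as "got a string instead".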

Is there a better way to parse a line of text like this?

I have a text file with lines of text that have a string, then another (quoted) string, followed by up to 4 integers,
ex:
clear "clear water.png" 5 7
wet "wet water.png" 9 5 33 17
soft "soft rain falling.png"
The only way I see it is:
read until space is found
set string to wet
read until double quote
read until second double quote
set second string to wet water.png
while not end of line
read until space
put string through string stream
push resulting integer into vector of int
Is there a better way to do this?
Thanks
This is the sort of task for which scanf and company truly shine.
char first_string[33], second_string[129];
int first_int, second_int, third_int, fourth_int;
sscanf(input_string,
"%32s%*[^\"]%*c%128[^\"]%*c %d %d %d %d",
first_string,
second_string,
&first_int,
&second_int,
&third_int,
&fourth_int);
You probably want to do that in an if statement so you can test the return value, to tell you how many of those fields converted (e.g., so you know how many integers you read at the end).
Edit: perhaps some explanation would be helpful. Let's dissect that:
%32s reads a string to the first white-space (or 32 characters, whichever comes first).
%*[^\"] ignores input up to the first ".
%*c ignores one more byte of input (the quote itself)
%128[^\"] reads the string in the quote (i.e., up to the next quote character).
%*c Ignores the closing quote
%d Reads an int (which we've done four times).
The space before each %d is really unnecessary -- it'll skip whitespace, but without the space, %d will skip leading whitespace anyway. I've included them purely to make it a little more readable.
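To illustrate the point about testing the return value, here is a sketch that wraps the same format string in a function and uses the number of converted fields to decide how many integers were present. The names ScannedLine and scan_line are made up for illustration.

```cpp
#include <cstdio>
#include <string>
#include <vector>

struct ScannedLine {
    std::string first;
    std::string second;
    std::vector<int> numbers;   // between 0 and 4 values
};

// Hypothetical helper: returns false unless both strings were read.
bool scan_line(const char* input, ScannedLine& out) {
    char first[33] = "";
    char second[129] = "";
    int v[4] = {0, 0, 0, 0};
    // sscanf returns how many fields were assigned; the %* fields do not count.
    const int converted = std::sscanf(
        input, "%32s%*[^\"]%*c%128[^\"]%*c %d %d %d %d",
        first, second, &v[0], &v[1], &v[2], &v[3]);
    if (converted < 2) return false;
    out.first = first;
    out.second = second;
    out.numbers.assign(v, v + (converted - 2));   // keep only the ints actually read
    return true;
}
```

A line with no integers at all still succeeds here, since the two strings alone give a return value of 2.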
Ugly, with no error-checking, but no dependencies on any non-standard libraries:
string s;
while(getline(fin, s))
{
string word, quoted_string;
vector<int> vec;
istringstream line(s);
line >> word;
line.ignore(numeric_limits<streamsize>::max(), '"');
getline(line, quoted_string, '"');
int n;
while(line >> n) vec.push_back(n);
// do something with word, quoted_string and vec...
}
Depending on the restrictions of the input strings you could try splitting on double-quote then splitting on space.
Yes
Use getline to read one line at a time. Parse the lines using a regular expression library.
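A rough sketch of the regular expression route, using std::regex from the standard library (C++11, with std::optional from C++17) rather than any particular third-party library. The Entry struct, the parse_entry name and the pattern itself are assumptions for illustration.

```cpp
#include <optional>
#include <regex>
#include <sstream>
#include <string>
#include <vector>

struct Entry {
    std::string name;
    std::string file;
    std::vector<int> numbers;
};

// Hypothetical helper: parses one line such as: wet "wet water.png" 9 5 33 17
std::optional<Entry> parse_entry(const std::string& line) {
    // Group 1: first word, group 2: text inside the quotes, group 3: the rest.
    static const std::regex pattern(R"re(^\s*(\S+)\s+"([^"]*)"(.*)$)re");
    std::smatch match;
    if (!std::regex_match(line, match, pattern)) return std::nullopt;

    Entry entry;
    entry.name = match[1].str();
    entry.file = match[2].str();
    // The trailing integers are easier to pull out with a stream than a regex.
    std::istringstream rest(match[3].str());
    int n;
    while (rest >> n) entry.numbers.push_back(n);
    return entry;
}
```

Each line obtained from getline is passed to parse_entry, and an empty result means the line did not have the expected shape.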