Reading two last occurrences from a file

I have a little problem.
I've got a file containing a lot of information (posted on Pastebin because it's pretty long: http://pastebin.com/MPcTMHfd).
Yes, it is a PokerStars card log, but I am making an odds calculator for myself; I even asked PokerStars about that.
When I filter the file I get something like
CODE FOR FILTERING:
getline(failas,line);
if( line.find(search_str) != std::string::npos )
{
firstCard = line.substr(4);
cout << firstCard << '\n' ;
}
- Result:
::: 7c
::: 5d
::: 13c
::: 7d
::: 12h
::: 13d
and so on. What I want is to get the last two cards (12h and 13d in the example above).
All I have managed to get is the last card (13d).
Any ideas on how I could read two lines, or any other ideas on how to solve my little problem?
I know that it is a beginner question, but I haven't really found a suitable answer anywhere.

So you want the n last cards ? Then maintain a "last n cards" list and update it for every card found.
Like that : (code written by hand)
#include <list>
#include <string>
std::list<std::string> last_n_cards;
const unsigned int last_n_cards_max = 2;
// basic init to make code simpler, can be done differently
last_n_cards.push_back( "?" ); // too lazy to write the for(last_n_cards_max) loop ;)
last_n_cards.push_back( "?" );
(loop)
if( line.find(search_str) != std::string::npos )
{
currentCard = line.substr(4);
cout << currentCard << '\n';
last_n_cards.push_back(currentCard); // add the new card to the end
last_n_cards.pop_front(); // remove 1 card from the front
}
// At the end (std::list has no operator[], so use front() and back()):
cout << "Last two cards are : " << last_n_cards.front() << " and " << last_n_cards.back() << '\n';
std::list API is here http://en.cppreference.com/w/cpp/container/list
Note: have you considered using a language other than C++ for this task? File parsing that is not performance-intensive may be easier with a dynamic language like Python.

There could be two simple methods to solve your problem.
Scan the whole file to find out how many lines (cards) it has, then seek to the second-to-last card and read it and the one after it. You will then have read the last two cards.
Use two pointers. Set the first pointer to the first line and the second to first + 2. Then iterate through the whole file; once the second pointer reaches end of file (EOF), the first pointer points to the second-to-last card.

Related

How to return sequences in YAML node as string?

I'm trying to parse a dialogue tree (YAML) in C++ using yaml-cpp. Here's a sample YAML:
dialogue_block:
  character_name:
    - Hello
    - How are you?
    - :main
main:
  - 1: ["I'm fine, thank you", :response1]
  - 2: ["Not very well", :response2]
  - 3: ["I don't want to talk", :exit]
I'm relatively new to C++ and Yaml, so if there's an easier/more intuitive way, please point me in the right direction. My idea is to store each block as a dialogue node. In the example above, I want to be able to call on dialogue_block, and extract character_name to identify the character speaking, print all of the sequences up to :main, where it'll switch to the main node, with 3 choices for the player. I'm currently stuck on step 1 - parsing the yaml file...
The following works...
YAML::Node dialogue = YAML::LoadFile("dialogue.yaml");
if(dialogue["dialogue_block"]){
std::cout << dialogue["dialogue_block"]["character_name"][0].as<std::string>() << "\n";
}
and it prints "Hello". However, I'm stumped on the next steps: how can I retrieve "character_name" without hardcoding the string into my code? Is there a way to print all of the strings leading up to, but not including ":main"? And then make "main" the next node?
First time posting on stackoverflow, so please do let me know if there's more info needed! Thanks.
Edit:
Here's the updated code I'm using:
// read in file
YAML::Node dialogue = YAML::LoadFile("dialogue.yaml");
// Extract names of each block
std::vector<std::string> dialogueBlocks;
for (const auto& kv : dialogue) {
dialogueBlocks.push_back(kv.first.as<std::string>());
} // will return "dialogue_block" and "main"
std::string character;
// if first_encounter exists, always start at that block
if(dialogue["first_encounter"]){
for(YAML::iterator it = dialogue["first_encounter"].begin(); it != dialogue["first_encounter"].end(); ++it){
character = it->first.as<std::string>();
std::cout << "\nCharacter: " << character << "\n";
for (YAML::iterator it=dialogue["first_encounter"][character].begin();it!=dialogue["first_encounter"][character].end(); ++it) {
std::cout << it->as<std::string>() << "\n";
}
}
}
I can successfully extract the character name and the dialogue, but there are a few things I'm struggling with:
1) It also prints ":main", which I want it to leave out. I'm not sure how to get it to terminate when it reaches a string starting with ":", or if there's an appropriate built-in function to use.
2) Store ":main" as the next block to pass through the for loop when called upon.

You're asking how to find the "key name" for the list. You could certainly look at all the keys under dialogue["dialogue_block"], but it would be much more idiomatic yaml to make character a separate field from their lines, like so
dialogue_block:
  character: Bob
  lines:
    - Hello
    - How are you?
    - :main
or if a block is intended to be a list,
dialogue_block:
  - character: Bob
    lines:
      - Hello
      - How are you?
      - :main
  - character: Alice
    lines:
      - Blah
      - :main
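For the two open points in the question's edit (leaving out ":main" when printing and remembering it as the next block), a library-independent sketch may help. It assumes the strings of one block have already been copied out of the YAML node into a std::vector<std::string>; the names BlockContent and splitAtJump are hypothetical and not part of yaml-cpp.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical result type: what to print, and where to go next.
struct BlockContent {
    std::vector<std::string> spoken; // entries before the first ":target"
    std::string next_block;          // "main" for ":main"; empty if none found
};

// Splits the entries of one block at the first string starting with ':'.
BlockContent splitAtJump(const std::vector<std::string>& entries)
{
    BlockContent result;
    for (std::size_t i = 0; i < entries.size(); ++i) {
        const std::string& s = entries[i];
        if (!s.empty() && s[0] == ':') {
            result.next_block = s.substr(1); // drop the leading ':'
            break;                           // nothing after the jump is spoken
        }
        result.spoken.push_back(s);
    }
    return result;
}
```

The caller would print result.spoken and then look up result.next_block as a key in the loaded document.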

Parsing Data of data from a file

I have this project due; however, I am unsure how to parse the data into the word, its part of speech and its definition. I know that I should make use of the tab spacing to read it, but I have no idea how to implement it. Here is an example of the file:
Recollection n. The power of recalling ideas to the mind, or the period within which things can be recollected; remembrance; memory; as, an event within my recollection.
Nip n. A pinch with the nails or teeth.
Wodegeld n. A geld, or payment, for wood.
Xiphoid a. Of or pertaining to the xiphoid process; xiphoidian.
NB: Each word and part of speech and definition is one line in a text file.

If you can be sure that the definition will always follow the first period on a line, you could use an implementation like this. But it will break if there are ever more than 2 periods on a single line.
string str;
vector<pair<string,string>> v; // <word, definition>
while (getline(fileStream >> ws, str, '.')) { // skip leading whitespace, then read up to the first '.', e.g. "Nip n"
    const string::size_type cut = str.find_last_of(" \t"); // last separator before the part-of-speech tag
    if (cut != string::npos)
        str.erase(cut); // get rid of n, v, etc. from word
    v.push_back(make_pair(str, string())); // push the word
    getline(fileStream >> ws, str, '.'); // grab the next part of the line, up to the next '.'
    v.back().second = str; // store the definition in the last added element
}
for (const auto& x : v) { // check your results
    cout << "word -> " << x.first << endl;
    cout << "definition -> " << x.second << endl << endl;
}
The better solution would be to learn Regular Expressions. It's a complicated topic but absolutely necessary if you want to learn how to parse text efficiently and properly:
http://www.cplusplus.com/reference/regex/
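As a rough illustration of the regular-expression route, here is a sketch using std::regex from the standard library. It assumes each line has the shape "Word pos. Definition", with a single-word headword and a part-of-speech tag ending in a period, as in the sample lines; the names DictEntry and parseEntry are invented for the example.

```cpp
#include <regex>
#include <string>

// Hypothetical record for one dictionary line.
struct DictEntry {
    std::string word;
    std::string part_of_speech;
    std::string definition;
};

// Returns true and fills `out` if `line` looks like "Word pos. Definition text".
bool parseEntry(const std::string& line, DictEntry& out)
{
    // group 1: headword, group 2: tag such as "n." or "a.", group 3: the rest
    static const std::regex pattern(R"((\S+)\s+(\w+\.)\s+(.*))");
    std::smatch m;
    if (!std::regex_match(line, m, pattern))
        return false;
    out.word = m[1].str();
    out.part_of_speech = m[2].str();
    out.definition = m[3].str();
    return true;
}
```

Unlike splitting on '.', this keeps the whole definition even when it contains further periods, because only the first tag is matched and everything after it goes into the last group.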

Reading text file by scanning for keywords

As part of a bigger application I am working on a class for reading input from a text file for use in the initialization of the program. Now I am myself fairly new to programming, and I only started to learn C++ in December, so I would be very grateful for some hints and ideas on how to get started! I apologise in advance for a rather long wall of text.
The text file format is "keyword-driven" in the following way:
There are a rather small number of main/section keywords (currently 8) that need to be written in a given order. Some of them are optional, but if they are included they should adhere to the given ordering.
Example:
Suppose there are 3 potential keywords ordered as follows:
"KEY1" (required)
"KEY2" (optional)
"KEY3" (required)
If the input file only includes the required ones, the ordering should be:
"KEY1"
"KEY3"
Otherwise it should be:
"KEY1"
"KEY2"
"KEY3"
If all the required keywords are present, and the total ordering is ok, the program should proceed by reading each section in the sequence given by the ordering.
Each section will include a (possibly large) amount of subkeywords, some of which are optional and some of which are not, but here the order does NOT matter.
Lines starting with characters '*' or '--' signify commented lines, and they should be ignored (as well as empty lines).
A line containing a keyword should (preferably) include nothing else than the keyword. At the very least, the keyword must be the first word appearing there.
I have already implemented parts of the framework, but I feel my approach so far has been rather ad hoc. Currently I have manually created one method per section/main keyword, and the first task of the program is to scan the file to locate these keywords and pass the necessary information on to the methods.
I first scan through the file using an std::ifstream object, removing empty and/or commented lines and storing the remaining lines in an object of type std::vector<std::string>.
Do you think this is an ok approach?
Moreover, I store the indices where each of the keywords start and stop (in two integer arrays) in this vector. This is the input to the above-mentioned methods, and it would look something like this:
bool readMAINKEY(int start, int stop);
Now I have already done this, and even though I do not find it very elegant, I guess I can keep it for the time being.
However, I feel that I need a better approach for handling the reading inside of each section, and my main issue is how should I store the keywords here? Should they be stored as arrays within a local namespace in the input class or maybe as static variables in the class? Or should they be defined locally inside relevant functions? Should I use enums? The questions are many!
Now I've started by defining the sub-keywords locally inside each readMAINKEY() method, but I found this to be less than optimal. Ideally I want to reuse as much code as possible inside each of these methods, calling upon a common readSECTION() method, and my current approach seems to lead to much code duplication and potential for error in programming. I guess the smartest thing to do would simply be to remove all the (currently 8) different readMAINKEY() methods, and use the same function for handling all kinds of keywords. There is also the possibility for having sub-sub-keywords etc. as well (i.e. a more general nested approach), so I think maybe this is the way to go, but I am unsure on how it would be best to implement it?
Once I've processed a keyword at the "bottom level", the program will expect a particular format of the following lines depending on the actual keyword. In principle each keyword will be handled differently, but here there is also potential for some code reuse by defining different "types" of keywords depending on what the program expects to do after triggering the reading of it. Common task include e.g. parsing an integer or a double array, but in principle it could be anything!
If a keyword for some reason cannot be correctly processed, the program should attempt as far as possible to use default values instead of terminating the program (if reasonable), but an error message should be written to a logfile. For optional keywords, default values will of course also be used.
In order to summarise, therefore, my main questions are the following:
1. Do you think my approach of storing the relevant lines in a std::vector<std::string> is reasonable?
This will of course require me to do a lot of "indexing work" to keep track of where in the vector the different keywords are located. Or should I work more "directly" with the original std::ifstream object? Or something else?
2. Given such a vector storing the lines of the text file, how can I best go about detecting the keywords and start reading the information following them?
Here I will need to take account of possible ordering and whether a keyword is required or not. Also, I need to check whether the lines following each "bottom level" keyword are in the format expected in each case.
One idea I've had is to store the keywords in different containers depending on whether they are optional or not (or maybe use object(s) of type std::map<std::string,bool>), and then remove them from the container(s) if correctly processed, but I am not sure exactly how I should go about it..
I guess there is really a thousand different ways one could answer these questions, but I would be grateful if someone more experienced could share some ideas on how to proceed. Is there e.g. a "standard" way of doing such things? Of course, a lot of details will also depend on the concrete application, but I think the general format indicated here can be used in a lot of different applications without a lot of tinkering if programmed in a good way!
UPDATE
Ok, so let me try to be more concrete. My current application is supposed to be a reservoir simulator, so as part of the input I need information about the grid/mesh, about rock and fluid properties, about wells/boundary conditions throughout the simulation and so on. At the moment I've been thinking about using (almost) the same set-up as the commercial Eclipse simulator when it comes to input, for details see
http://petrofaq.org/wiki/Eclipse_Input_Data.
However, I will probably change things a bit, so nothing is set in stone. Also, I am interested in making a more general "KeywordReader" class that with slight modifications can be adapted for use in other applications as well, or at least so that this can be done in a reasonable amount of time.
As an example, I can post the current code that does the initial scan of the text file and locates the positions of the main keywords. As I said, I don't really like my solution very much, but it seems to work for what it needs to do.
At the top of the .cpp file I have the following namespace:
//Keywords used for reading input:
namespace KEYWORDS{
/*
* Main keywords and corresponding boolean values to signify whether or not they are required as input.
*/
enum MKEY{RUNSPEC = 0, GRID = 1, EDIT = 2, PROPS = 3, REGIONS = 4, SOLUTION = 5, SUMMARY =6, SCHEDULE = 7};
std::string mainKeywords[] = {std::string("RUNSPEC"), std::string("GRID"), std::string("EDIT"), std::string("PROPS"),
std::string("REGIONS"), std::string("SOLUTION"), std::string("SUMMARY"), std::string("SCHEDULE")};
bool required[] = {true,true,false,true,false,true,false,true};
const int n_key = 8;
}//end KEYWORDS namespace
Then further down I have the following function. I am not sure how understandable it is though..
bool InputReader::scanForMainKeywords(){
logfile << "Opening file.." << std::endl;
std::ifstream infile(filename);
//Test if file was opened. If not, write error message:
if(!infile.is_open()){
logfile << "ERROR: Could not open file! Unable to proceed!" << std::endl;
std::cout << "ERROR: Could not open file! Unable to proceed!" << std::endl;
return false;
}
else{
logfile << "Scanning for main keywords..." << std::endl;
int nkey = KEYWORDS::n_key;
//Initially no keywords have been found:
startIndex = std::vector<int>(nkey, -1);
stopIndex = std::vector<int>(nkey, -1);
//Variable used to control that the keywords are written in the correct order:
int foundIndex = -1;
//STATISTICS:
int lineCount = 0;//number of non-comment lines in text file
int commentCount = 0;//number of commented lines in text file
int emptyCount = 0;//number of empty lines in text file
//Create lines vector:
lines = std::vector<std::string>();
//Remove comments and empty lines from text file and store the result in the variable lines:
std::string str;
while(std::getline(infile,str)){
if(str.size()>=1 && str.at(0)=='*'){
commentCount++;
}
else if(str.size()>=2 && str.at(0)=='-' && str.at(1)=='-'){
commentCount++;
}
else if(str.size()==0){
emptyCount++;
}
else{
//Found a non-empty, non-comment line.
lines.push_back(str);//store in std::vector
//Start by checking if the first word of the line is one of the main keywords. If so, store the location of the keyword:
std::string fw = IO::getFirstWord(str);
for(int i=0;i<nkey;i++){
if(fw.compare(KEYWORDS::mainKeywords[i])==0){
if(i > foundIndex){
//Found a valid keyword!
foundIndex = i;
startIndex[i] = lineCount;//store where the keyword was found!
//logfile << "Keyword " << fw << " found at line " << lineCount << " in lines array!" << std::endl;
//std::cout << "Keyword " << fw << " found at line " << lineCount << " in lines array!" << std::endl;
break;//fw cannot equal several different keywords at the same time!
}
else{
//we have found a keyword, but in the wrong order... Terminate program:
std::cout << "ERROR: Keywords have been entered in the wrong order or been repeated! Cannot continue initialisation!" << std::endl;
logfile << "ERROR: Keywords have been entered in the wrong order or been repeated! Cannot continue initialisation!" << std::endl;
return false;
}
}
}//end for loop
lineCount++;
}//end else (found non-comment, non-empty line)
}//end while (reading ifstream)
logfile << "\n";
logfile << "FILE STATISTICS:" << std::endl;
logfile << "Number of commented lines: " << commentCount << std::endl;
logfile << "Number of non-commented lines: " << lineCount << std::endl;
logfile << "Number of empty lines: " << emptyCount << std::endl;
logfile << "\n";
/*
Print lines vector to screen:
for(int i=0;i<lines.size();i++){
std:: cout << "Line nr. " << i << " : " << lines[i] << std::endl;
}*/
/*
* So far, no keywords have been entered in the wrong order, but have all the necessary ones been found?
* Otherwise return false.
*/
for(int i=0;i<nkey;i++){
if(KEYWORDS::required[i] && startIndex[i] == -1){
logfile << "ERROR: Incorrect input of required keywords! At least " << KEYWORDS::mainKeywords[i] << " is missing!" << std::endl;;
logfile << "Cannot proceed with initialisation!" << std::endl;
std::cout << "ERROR: Incorrect input of required keywords! At least " << KEYWORDS::mainKeywords[i] << " is missing!" << std::endl;
std::cout << "Cannot proceed with initialisation!" << std::endl;
return false;
}
}
//If everything is in order, we also initialise the stopIndex array correctly:
int counter = 0;
//Find first existing keyword:
while(counter < nkey && startIndex[counter] == -1){
//Keyword doesn't exist. Leave stopindex at -1!
counter++;
}
//Store stop index of each keyword:
while(counter<nkey){
int offset = 1;
//Find next existing keyword:
while(counter+offset < nkey && startIndex[counter+offset] == -1){
offset++;
}
if(counter+offset < nkey){
stopIndex[counter] = startIndex[counter+offset]-1;
}
else{
//reached the end of array!
stopIndex[counter] = lines.size()-1;
}
counter += offset;
}//end while
/*
//Print out start/stop-index arrays to screen:
for(int i=0;i<nkey;i++){
std::cout << "Start index of " << KEYWORDS::mainKeywords[i] << " is : " << startIndex[i] << std::endl;
std::cout << "Stop index of " << KEYWORDS::mainKeywords[i] << " is : " << stopIndex[i] << std::endl;
}
*/
return true;
}//end else (file opened properly)
}//end scanForMainKeywords()

You say your purpose is to read initialization data from a text file.
Seems you need to parse (syntax analyze) this file and store the data under the right keys.
If the syntax is fixed and each construct starts with a keyword, you could write a recursive-descent (LL(1)) parser that builds a tree (each node holding an STL vector of sub-branches) to store your data.
If the syntax is free, you might pick JSON or XML and use an existing parsing library.
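Short of a full parser, the ordering rule from the question (section keywords in a fixed order, some of them optional) can be checked with a small table-driven function. This is a sketch only: KeywordSpec and checkKeywordOrder are hypothetical names, and the function expects the section keywords already extracted in the order they appear in the file.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical description of one section keyword.
struct KeywordSpec {
    std::string name;
    bool required;
};

// Returns an empty string if `found` (keywords in file order) is consistent
// with `specs` (keywords in their mandatory order); otherwise an error message.
std::string checkKeywordOrder(const std::vector<KeywordSpec>& specs,
                              const std::vector<std::string>& found)
{
    std::size_t next = 0; // first spec entry that may still be matched
    for (std::size_t f = 0; f < found.size(); ++f) {
        std::size_t i = next;
        while (i < specs.size() && specs[i].name != found[f]) {
            if (specs[i].required) // a required keyword would have to be skipped
                return "missing or misplaced keyword: " + specs[i].name;
            ++i;
        }
        if (i == specs.size())
            return "unknown, repeated or misplaced keyword: " + found[f];
        next = i + 1;
    }
    for (; next < specs.size(); ++next) // anything required left unmatched?
        if (specs[next].required)
            return "missing keyword: " + specs[next].name;
    return std::string();
}
```

Keeping the names and the required flags in one table means the same function can be reused for sub-keyword lists that also have a fixed order, with only the table changing.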

read in values and store in list in c++

i have a text file with data like the following:
name
weight
groupcode
name
weight
groupcode
name
weight
groupcode
Now I want to write the data of all persons into an output file until the maximum weight of 10000 kg is reached.
currently i have this:
void loadData(){
ifstream readFile( "inFile.txt" );
if( !readFile.is_open() )
{
cout << "Cannot open file" << endl;
}
else
{
cout << "Open file" << endl;
}
char row[30]; // max length of a value
while(readFile.getline (row, 50))
{
cout << row << endl;
// how can i store the data into a list and also calculating the total weight?
}
readFile.close();
}
i work with visual studio 2010 professional!
Because I am a C++ beginner, there could be a better way! I am open to any ideas and suggestions.
thanks in advance!

#include <iostream>
#include <fstream>
#include <string>
#include <vector>
#include <limits>
struct entry
{
entry()
: weight()
{ }
std::string name;
int weight; // kg
std::string group_code;
};
// content of data.txt
// (without leading space)
//
// John
// 80
// Wrestler
//
// Joe
// 75
// Cowboy
int main()
{
std::ifstream stream("data.txt");
if (stream)
{
std::vector<entry> entries;
const int limit_total_weight = 10000; // kg
int total_weight = 0; // kg
entry current;
while (std::getline(stream, current.name) &&
stream >> current.weight &&
stream.ignore(std::numeric_limits<std::streamsize>::max(), '\n') && // skip the rest of the line containing the weight
std::getline(stream, current.group_code))
{
entries.push_back(current);
total_weight += current.weight;
if (total_weight > limit_total_weight)
{
break;
}
// ignore empty line
stream.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
}
}
else
{
std::cerr << "could not open the file" << std::endl;
}
}
Edit: Since you want to write the entries to a file, just stream out the entries instead of storing them in the vector. And of course you could overload operator >> and operator << for the entry type.

Well here's a clue. Do you see the mismatch between your code and your problem description? In your problem description you have the data in groups of four lines, name, weight, groupcode, and a blank line. But in your code you only read one line each time round your loop, you should read four lines each time round your loop. So something like this
char name[30];
char weight[30];
char groupcode[30];
char blank[30];
while (readFile.getline (name, 30) &&
readFile.getline (weight, 30) &&
readFile.getline (groupcode, 30) &&
readFile.getline (blank, 30))
{
// now do something with name, weight and groupcode
}
Not perfect by a long way, but hopefully will get you started on the right track. Remember the structure of your code should match the structure of your problem description.

Have two file pointers: read from the input file and keep writing to the output file. Meanwhile, keep a counter and increment it by each weight. When the total weight >= 10k, break out of the loop. By then you will have the required data in the output file.
Use this link for list of I/O APIs:
http://msdn.microsoft.com/en-us/library/aa364232(v=VS.85).aspx

If you want to struggle through things to build a working program on your own, read this. If you'd rather learn by example and study a strong example of C++ input/output, I'd definitely suggest poring over Simon's code.
First things first: You created a row buffer with 30 characters when you wrote, "char row[30];"
In the next line, you should change the readFile.getline(row, 50) call to readFile.getline(row, 30). Otherwise, it will try to read in 50 characters, and if someone has a name longer than 30, the memory past the buffer will become corrupted. So, that's a no-no. ;)
If you want to learn C++, I would strongly suggest that you use the standard library for I/O rather than the Microsoft-specific libraries that rplusg suggested. You're on the right track with ifstream and getline. If you want to learn pure C++, Simon has the right idea in his comment about switching out the character array for an std::string.
Anyway, john gave good advice about structuring your program around the problem description. As he said, you will want to read four lines with every iteration of the loop. When you read the weight line, you will want to find a way to get numerical output from it (if you're sticking with the character array, try http://www.cplusplus.com/reference/clibrary/cstdlib/atoi/, or try http://www.cplusplus.com/reference/clibrary/cstdlib/atof/ for non-whole numbers). Then you can add that to a running weight total. Each iteration, output data to a file as required, and once your weight total >= 10000, that's when you know to break out of the loop.
However, you might not want to use getline inside of your while condition at all: Since you have to use getline four times each loop iteration, you would either have to use something similar to Simon's code or store your results in four separate buffers if you did it that way (otherwise, you won't have time to read the weight and print out the line before the next line is read in!).
Instead, you can also structure the loop to be while(total <= 10000) or something similar. In that case, you can use four sets of if(readFile.getline(row, 30)) inside of the loop, and you'll be able to read in the weight and print things out in between each set. The loop will end automatically after the iteration that pushes the total weight over 10000...but you should also break out of it if you reach the end of the file, or you'll be stuck in a loop for all eternity. :p
Good luck!

Read file and extract certain part only

ifstream toOpen;
openFile.open("sample.html", ios::in);
if(toOpen.is_open()){
while(!toOpen.eof()){
getline(toOpen,line);
if(line.find("href=") && !line.find(".pdf")){
start_pos = line.find("href");
tempString = line.substr(start_pos+1); // i dont want the quote
stop_pos = tempString .find("\"");
string testResult = tempString .substr(start_pos, stop_pos);
cout << testResult << endl;
}
}
toOpen.close();
}
What I am trying to do is extract the "href" value. But I cant get it works.
EDIT:
Thanks to Tony's hint, I use this:
if(line.find("href=") != std::string::npos ){
// Process
}
it works!!

I'd advise against trying to parse HTML like this. Unless you know a lot about the source and are quite certain about how it'll be formatted, chances are that anything you do will have problems. HTML is an ugly language with an (almost) self-contradictory specification that (for example) says particular things are not allowed -- but then goes on to tell you how you're required to interpret them anyway.
Worse, almost any character can (at least potentially) be encoded in any of at least three or four different ways, so unless you scan for (and carry out) the right conversions (in the right order) first, you can end up missing legitimate links and/or including "phantom" links.
You might want to look at the answers to this previous question for suggestions about an HTML parser to use.

As a start, you might want to take some shortcuts in the way you write the loop over lines in order to make it clearer. Here is the conventional "read line at a time" loop using C++ iostreams:
#include <cstdlib> // EXIT_FAILURE
#include <fstream>
#include <iostream>
#include <string>
int main ( int, char ** )
{
std::ifstream file("sample.html");
if ( !file.is_open() ) {
std::cerr << "Failed to open file." << std::endl;
return (EXIT_FAILURE);
}
for ( std::string line; (std::getline(file,line)); )
{
// process line.
}
}
As for the inner part that processes the line, there are several problems.
It doesn't compile. I suppose this is what you meant with "I cant get it works". When asking a question, this is the kind of information you might want to provide in order to get good help.
There is confusion between variable names temp and tempString etc.
string::find() returns std::string::npos, a very large positive integer, when there is no match (size_type is unsigned), so the condition is true unless a match is found starting at character position 0, which is the one case where you probably do want to enter the if block.
Here is a simple test content for sample.html.
<html>
<a href="foo.pdf"/>
</html>
Sticking the following inside the loop:
if ((line.find("href=") != std::string::npos) &&
(line.find(".pdf" ) != std::string::npos))
{
const std::size_t start_pos = line.find("href");
std::string temp = line.substr(start_pos+6);
const std::size_t stop_pos = temp.find("\"");
std::string result = temp.substr(0, stop_pos);
std::cout << "'" << result << "'" << std::endl;
}
I actually get the output
'foo.pdf'
However, as Jerry pointed out, you might not want to use this in a production environment. If this is a simple homework or exercise on how to use the <string>, <iostream> and <fstream> libraries, then go ahead with such a procedure.
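If a line can hold more than one link, the same find/substr idea can be wrapped in a loop. The sketch below is for exercises like this one only, with the caveats about parsing HTML by hand still applying; extractHrefs is a made-up name, and it assumes attribute values are written exactly as href="..." with double quotes.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: collects every double-quoted href value on one line.
std::vector<std::string> extractHrefs(const std::string& line)
{
    std::vector<std::string> result;
    const std::string key = "href=\"";
    std::size_t pos = 0;
    while ((pos = line.find(key, pos)) != std::string::npos) {
        const std::size_t start = pos + key.size();     // first character of the value
        const std::size_t stop = line.find('"', start); // closing quote
        if (stop == std::string::npos)
            break; // unterminated attribute, give up on this line
        result.push_back(line.substr(start, stop - start));
        pos = stop + 1; // continue searching after this attribute
    }
    return result;
}
```

Filtering for ".pdf" can then be done on each returned value instead of on the whole line.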
