Reading and storing every second line - c++

I am creating a small search engine to find values in files and store them. I have a txt file with the data:
link1
cat, dog, bird
link2
car, wheel, chair
Here is my code to read and store the data, but the index map stays empty.
int function(string filename, map<string, set<string>>& index) {
int counter = 0;
set <string> tokens;
ifstream inStream;
inStream.open(filename);
if (inStream.fail()){
counter = 0;
}
string http, definition;
while (getline(inStream, http) && getline(inStream, definition)){
for (auto v : tokens){
index[v].insert(http);
counter++
}
}
return counter;
}

You forgot to split the items in every second line: tokens is never filled, so the inner loop never runs and nothing is inserted into index.
The fix is simple.
You can use std::getline with a delimiter to split the line at the commas.
A corrected version of your code:
#include <iostream>
#include <sstream>
#include <string>
#include <map>
#include <set>
std::istringstream fileSimulationStream{R"(link1
cat, dog, bird
link2
car, wheel, chair)"};
int main() {
// Resulting data
std::map<std::string, std::set<std::string>> data;
// Read all lines from file. always 2 lines in one shot
for (std::string link{}, item{}; std::getline(fileSimulationStream, link) and std::getline(fileSimulationStream, item); ) {
// Put line in an istringstream for further extraction
std::istringstream iss{item};
// Read the items; std::ws skips the blank that follows each comma
std::set<std::string> tempSet{}; std::string tempString{};
while (std::getline(iss >> std::ws, tempString, ',')) tempSet.insert(tempString);
// Now add everything to the map
data[link] = std::move(tempSet);
}
// Create some debug output
for (const auto&[key, set] : data) {
std::cout << key << "\t--> ";
for (const std::string& s : set) std::cout << s << ' ';
std::cout << '\n';
}
}
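The answer above maps each link to its items, while the function in the question fills index the other way round (each item maps to the links that mention it). A minimal sketch of that inverted form follows. The names build_index and trim are hypothetical, and the function takes a std::istream so it can be fed from an std::ifstream or from a string stream.

```cpp
#include <istream>
#include <map>
#include <set>
#include <sstream>
#include <string>

// Hypothetical helper: strips leading and trailing blanks from a token.
std::string trim(const std::string& s) {
    const auto first = s.find_first_not_of(" \t\r");
    if (first == std::string::npos) return "";
    const auto last = s.find_last_not_of(" \t\r");
    return s.substr(first, last - first + 1);
}

// Reads pairs of lines (link, then comma-separated items) and records,
// for every item, the set of links it appears under.
// Returns the number of (item, link) pairs that were newly inserted.
int build_index(std::istream& in, std::map<std::string, std::set<std::string>>& index) {
    int counter = 0;
    std::string link, items;
    while (std::getline(in, link) && std::getline(in, items)) {
        std::istringstream iss{items};
        std::string token;
        while (std::getline(iss, token, ',')) {
            token = trim(token);
            if (token.empty()) continue;              // ignore empty fields such as "a,,b"
            if (index[token].insert(link).second) ++counter;
        }
    }
    return counter;
}
```

To use it with a file, open an std::ifstream, check that it opened, and pass it as the first argument.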

Related

How to access data in a struct? C++

Just a quick one: how should I go about printing a value from a struct? The 'winningnums' member contains a string from another function (edited down for a minimal example).
I've tried the code below, but the program doesn't output anything at all. Is my syntax incorrect?
struct past_results {
std::string date;
std::string winningnums;
};
int main(){
past_results results;
std::cout << results.winningnums;
return 0;
}
EDIT:
Just to give some more insight, here's the function that populates my struct members. Is there something I'm doing wrong here?
//function to read csv
void csv_reader(){
std::string line;
past_results results;
past_results res[104];
int linenum = 0;
//open csv file for reading
std::ifstream file("Hi S.O!/path/to/csv", std::ios::in);
if(file.is_open())
{
while (getline(file, line))
{
std::istringstream linestream(line);
std::string item, item1;
//gets up to first comma
getline(linestream, item, ',');
results.date = item;
//convert to a string stream and put into winningnums
getline(linestream, item1);
results.winningnums = item1;
//add data to struct
res[linenum] = results;
linenum++;
}
}
//display data from struct
for(int i = 0; i < linenum; i++) {
std::cout << "Date: " << res[i].date << " \\\\ Winning numbers: " << res[i].winningnums << std::endl;
}
}
is my syntax incorrect?
No, it's just fine and if you add the inclusion of the necessary header files
#include <iostream>
#include <string>
then your whole program is ok and will print the value of the default constructed std::string winningnums in the results instance of past_results. A default constructed std::string is empty, so your program will not produce any output.
Your edited question shows another problem. You never call csv_reader() and even if you did, the result would not be visible in main() since all the variables in csv_reader() are local. Given a file with the content:
today,123
tomorrow,456
and if you call csv_reader() from main(), it would produce the output:
Date: today \\ Winning numbers: 123
Date: tomorrow \\ Winning numbers: 456
but as I mentioned, this would not be available in main().
Here's an example of how you could read from the file and make the result available in main(). I've used a std::vector to store all the past_results in. It's very practical since it grows dynamically, so you don't have to declare a fixed size array.
#include <fstream>
#include <iostream>
#include <sstream> // istringstream
#include <string> // string
#include <utility> // move
#include <vector> // vector
struct past_results {
std::string date;
std::string winningnums;
};
// added operator to read one `past_results` from any istream
std::istream& operator>>(std::istream& is, past_results& pr) {
std::string line;
if(std::getline(is, line)) {
std::istringstream linestream(line);
if(!(std::getline(linestream, pr.date, ',') &&
std::getline(linestream, pr.winningnums)))
{ // if reading either field failed, set the failbit on the stream
is.setstate(std::ios::failbit);
}
}
return is;
}
std::vector<past_results> csv_reader() { // not `void` but returns the result
std::vector<past_results> result; // to store all past_results read in
// open csv file for reading
std::ifstream file("csv"); // std::ios::in is default for an ifstream
if(file) {
// loop and read records from the file until that fails:
past_results tmp;
while(file >> tmp) { // this uses the `operator>>` we added above
// and save them in the `result` vector:
result.push_back(std::move(tmp));
}
}
return result; // return the vector with all the records in
}
int main() {
// get the result from the function:
std::vector<past_results> results = csv_reader();
// display data from all the structs
for(past_results& pr : results) {
std::cout << "Date: " << pr.date
<< " \\\\ Winning numbers: " << pr.winningnums << '\n';
}
}
Your example doesn't initialize the struct members. There is no data to print, so why would you expect it to output anything?

Trying to push an unknown number of strings into a vector, but my cin loop doesn't terminate

I am trying to take strings as input from cin, and then push each string into a vector. However, my loop doesn't terminate even when I put a '\n' at the end of all my input.
int main(void) {
string row;
vector<string> log;
while (cin >> row) {
if (row == "\n") {
break;
}
log.push_back(row);
}
return 0;
}
I've tried replacing the (cin >> row) with (getline(cin,row)), but it didn't make any difference. I've tried using stringstream, but I don't really know how it works. How do I go about resolving this?
As @SidS commented, operator>> discards whitespace, including the newline, so row can never equal "\n". You have to think about another strategy.
You could instead check whether row is empty, but that only works with std::getline:
#include <vector>
#include <string>
#include <iostream>
int main() {
std::string row;
std::vector<std::string> log;
while (std::getline(std::cin, row)) {
if (row.empty()) {
break;
}
log.push_back(row);
}
std::cout << "done\n";
}
OP, in case you want to save single words (rather than a whole line), you can use a regex to extract the words from each line and push them into log one by one:
#include <vector>
#include <string>
#include <iostream>
#include <regex>
int main() {
const std::regex words_reg{ "[^\\s]+" };
std::string row;
std::vector<std::string> log;
while (std::getline(std::cin, row)) {
if (row.empty()) {
break;
}
for (auto it = std::sregex_iterator(row.begin(), row.end(), words_reg); it != std::sregex_iterator(); ++it){
log.push_back((*it)[0]);
}
}
for (unsigned i = 0u; i < log.size(); ++i) {
std::cout << "log[" << i << "] = " << log[i] << '\n';
}
}
Example run:
hello you
a b c d e f g
18939823
#_#_# /////
log[0] = hello
log[1] = you
log[2] = a
log[3] = b
log[4] = c
log[5] = d
log[6] = e
log[7] = f
log[8] = g
log[9] = 18939823
log[10] = #_#_#
log[11] = /////
If you want to store the tokens of one line from std::cin, separated by the standard mechanism as in the operator>> overloads from <iostream> (i.e., split by whitespace/newline), you can do it like this:
std::string line;
std::getline(std::cin, line);
std::stringstream ss{line};
const std::vector<std::string> tokens{std::istream_iterator<std::string>{ss},
std::istream_iterator<std::string>{}};
Note that this is not the most efficient solution, but it should work as expected: process only one line and use an existing mechanism to split this line into individual std::string objects.
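For reference, the snippet above can be wrapped into a self-contained helper with the headers it needs. The function name split_whitespace is only an illustrative choice.

```cpp
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Splits one line into whitespace-separated tokens, using the same
// rules as operator>> for std::string.
std::vector<std::string> split_whitespace(const std::string& line) {
    std::istringstream ss{line};
    std::istream_iterator<std::string> first{ss};
    std::istream_iterator<std::string> last{};
    return std::vector<std::string>(first, last);
}
```

A caller would read a line with std::getline(std::cin, line) and then pass line to the helper.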
You can't read a newline with the istream& operator>> for std::string. This operator skips whitespace and will never produce the string "\n". Consider using getline instead.

How to get input an array of strings with \n as delimiter?

#include<bits/stdc++.h>
using namespace std;
int main()
{
int i=0;
char a[100][100];
do {
cin>>a[i];
i++;
}while( strcmp(a[i],"\n") !=0 );
for(int j=0;j<i;i++)
{
cout<<a[i]<<endl;
}
return 0;
}
Here, I want to exit the do-while loop as soon as the user hits Enter, but the code never leaves the loop.
The following reads one line and splits it on whitespace. This code is not something one would normally expect a beginner to write from scratch. However, searching on DuckDuckGo or Stack Overflow will reveal lots of variations on this theme. When programming, know that you are probably not the first to need the functionality you seek. The engineering way is to find the best and learn from it. Study the code below. From one tiny example, you will learn about getline, string streams, iterators, copy, back_inserter, and more. What a bargain!
#include <iostream>
#include <string>
#include <sstream>
#include <algorithm>
#include <iterator>
#include <vector>
int main() {
using namespace std;
vector<string> tokens;
{
string line;
getline(cin, line);
istringstream stream(line);
copy(istream_iterator<string>(stream),
istream_iterator<string>(),
back_inserter(tokens));
}
for (auto s : tokens) {
cout << s << '\n';
}
return 0;
}
First of all, we need to read the line up to the '\n' character, which we can do with getline(). The extraction operator >> alone won't work here, since it also stops reading at a space. Once we have the whole line, we can put it into a stringstream ss and use ss >> str or getline(ss, str, ' ') to read the individual strings.
Another approach might be to take advantage of the fact that the extraction operator will leave the delimiter in the stream. We can then check if it's a '\n' with cin.peek().
Here's the code for the first approach:
#include <iostream> //include the standard library files individually
#include <string>
#include <vector> //#include <bits/stdc++.h> is terrible practice.
#include <sstream>
int main()
{
std::vector<std::string> words; //vector to store the strings
std::string line;
std::getline(std::cin, line); //get the whole line
std::stringstream ss(line); //create stringstream containing the line
std::string str;
while(std::getline(ss, str, ' ')) //loops until the input fails (when ss is empty)
{
words.push_back(str);
}
for(std::string &s : words)
{
std::cout << s << '\n';
}
}
And for the second approach:
#include <iostream> //include the standard library files individually
#include <string>
#include <vector> //#include <bits/stdc++.h> is terrible practice.
int main()
{
std::vector<std::string> words; //vector to store the strings
std::string str;
//loop until the next character to be read is '\n';
//also stop if reading a word fails (e.g. at end of input)
while(std::cin.peek() != '\n' && std::cin >> str)
{
words.push_back(str);
}
for(std::string &s : words)
{
std::cout << s << '\n';
}
}
You can use getline to detect ENTER (an empty line). This version runs on Windows:
//#include<bits/stdc++.h>
#include <cstring> // for strcpy_s() (Microsoft's bounds-checked copy)
#include <iostream>
#include <string> // for getline()
using namespace std;
int main()
{
int i = 0;
char a[100][100];
string temp;
do {
getline(std::cin, temp);
if (temp.empty())
break;
// keep at most 99 characters so the terminating '\0' still fits into a[i]
strcpy_s(a[i], temp.substr(0, 99).c_str());
} while (++i < 100);
for (int j = 0; j<i; j++)
{
cout << a[j] << endl;
}
return 0;
}
Each getline call reads a whole line, so "hello world" arrives as a single string. You can split it into words afterwards, for example with a std::istringstream.

How to parse table of numbers in C++

I need to parse a table of numbers formatted as ascii text. There are 36 space delimited signed integers per line of text and about 3000 lines in the file. The input file is generated by me in Matlab so I could modify the format. On the other hand, I also want to be able to parse the same file in VHDL and so ascii text is about the only format possible.
So far, I have a little program like this that can loop through all the lines of the input file. I just haven't found a way to get individual numbers out of the line. I am not a C++ purist. I would consider fscanf(), but 36 numbers is a bit much for that. Please suggest practical ways to get numbers out of a text file.
int main()
{
string line;
ifstream myfile("CorrOut.dat");
if (!myfile.is_open())
cout << "Unable to open file";
else{
while (getline(myfile, line))
{
cout << line << '\n';
}
myfile.close();
}
return 0;
}
Use std::istringstream. Here is an example:
#include <sstream>
#include <string>
#include <fstream>
#include <iostream>
using namespace std;
int main()
{
string line;
int num;
ifstream ifs("YourData");
while (getline(ifs, line))
{
istringstream strm(line);
while ( strm >> num )
cout << num << " ";
cout << "\n";
}
}
If you want to create a table, use a std::vector or other suitable container:
#include <sstream>
#include <string>
#include <fstream>
#include <iostream>
#include <vector>
using namespace std;
int main()
{
string line;
// our 2 dimensional table
vector<vector<int>> table;
int num;
ifstream ifs("YourData");
while (getline(ifs, line))
{
vector<int> vInt;
istringstream strm(line);
while ( strm >> num )
vInt.push_back(num);
table.push_back(vInt);
}
}
The table vector gets populated, row by row. Note we created an intermediate vector to store each row, and then that row gets added to the table.
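Since the question states that every line holds 36 integers, it may be worth validating the rows while reading. The following is a sketch under that assumption. The function name read_table and the parameter expected_cols are hypothetical, and throwing std::runtime_error is just one possible way to report a malformed line.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Parses whitespace-separated integers, one table row per line.
// expected_cols is the required row width (36 in the question); 0 disables the check.
std::vector<std::vector<int>> read_table(std::istream& in, std::size_t expected_cols) {
    std::vector<std::vector<int>> table;
    std::string line;
    std::size_t line_no = 0;
    while (std::getline(in, line)) {
        ++line_no;
        std::istringstream strm(line);
        std::vector<int> row;
        int num;
        while (strm >> num) row.push_back(num);
        // If extraction stopped before the end of the line, a token did not parse as an int.
        if (!strm.eof())
            throw std::runtime_error("non-numeric data on line " + std::to_string(line_no));
        if (row.empty()) continue; // skip blank lines
        if (expected_cols != 0 && row.size() != expected_cols)
            throw std::runtime_error("unexpected column count on line " + std::to_string(line_no));
        table.push_back(std::move(row));
    }
    return table;
}
```

Called with an open std::ifstream and 36 as the second argument, it either returns the full table or reports the first offending line.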
You can use a few different approaches. The one offered above is probably the quickest of them, but if you have different delimiter characters you may consider one of the following solutions.
The first solution reads the file line by line. For each line it uses find to locate the first occurrence of the delimiter, extracts the number in front of it, erases that part of the line, and repeats until the delimiter is no longer found.
You can customize the delimiter by changing the value of the delimiter variable.
#include <cstdlib> // atoi
#include <iostream>
#include <string>
#include <fstream>
#include <vector>
using namespace std;
int main()
{
string line;
ifstream myfile("CorrOut.dat");
string delimiter = " ";
size_t pos = 0;
string token;
vector<vector<int>> data;
if (!myfile.is_open())
cout << "Unable to open file";
else {
while (getline(myfile, line))
{
vector<int> temp;
pos = 0;
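while ((pos = line.find(delimiter)) != std::string::npos) {
token = line.substr(0, pos);
line.erase(0, pos + delimiter.length());
if (token.empty()) continue; // skip empty fields caused by repeated delimiters
std::cout << token << std::endl;
temp.push_back(atoi(token.c_str()));
}
// whatever remains after the last delimiter is the final number on the line
if (!line.empty()) {
std::cout << line << std::endl;
temp.push_back(atoi(line.c_str()));
}
data.push_back(temp);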
}
myfile.close();
}
return 0;
}
The second solution makes use of regex and doesn't care which delimiter is used; it searches for and matches every integer found in the string.
#include <cstdlib> // atoi
#include <iostream>
#include <string>
#include <regex> // The new library introduced in C++11
#include <fstream>
#include <vector>
using namespace std;
int main()
{
string line;
ifstream myfile("CorrOut.dat");
std::smatch m;
std::regex e("[-+]?\\d+");
vector<vector<int>> data;
if (!myfile.is_open())
cout << "Unable to open file";
else {
while (getline(myfile, line))
{
vector<int> temp;
while (regex_search(line, m, e)) {
for (auto x : m) {
std::cout << x.str() << " ";
temp.push_back(atoi(x.str().c_str()));
}
std::cout << std::endl;
line = m.suffix().str();
}
data.push_back(temp);
}
myfile.close();
}
return 0;
}

Locate and tag words in text file

I need to read in a text file of 500 words or more (a real-world article from a newspaper, etc.), locate certain words and tag them like this, <location> word <location/>, and then print the entire article on the screen. I'm using Boost regex right now and it's working OK. I want to try to use a list, an array, or some other data structure to hold the states and major cities, and search the article for those. Right now I'm using an array, but I'm willing to use anything. Any ideas or clues?
#include <boost/regex.hpp>
#include <iostream>
#include <string>
#include <boost/iostreams/filter/regex.hpp>
#include <fstream>
using namespace std;
int main()
{
string cities[389];
string states [60];
string filename, line,city,state;
ifstream file,cityfile, statefile;
int i=0;
int j=0;
cityfile.open("c:\\cities.txt");
while (!cityfile.eof())
{
getline(cityfile,city);
cities[i]=city;
i++;
//for (int i=0;i<500;i++)
//file>>cities[i];
}
cityfile.close();
statefile.open("c:\\states.txt");
while (!statefile.eof())
{
getline(statefile,state);
states[j]=state;
//for (int i=0;i<500;i++)
//cout<<states[j];
j++;
}
statefile.close();
//4cout<<cities[4];
cout<<"Please enter the path and file name "<<endl;
cin>>filename;
file.open(filename);
while (!file.eof())
{
while(getline(file, line)
{
}
while(getline(file, line))
{
//string text = "Hello world";
boost::regex re("[A-Z/]\.[A-Z\]\.|[A-Z/].*[:space:][A-Z/]|C........a");
//boost::regex re(
string fmt = "<locations>$&<locations\>";
if(boost::regex_search(line, re))
{
string result = boost::regex_replace(line, re, fmt);
cout << result << endl;
}
/*else
{
cout << "Found Nothing" << endl;
}*/
}
}
file.close();
cin.get(),cin.get();
return 0;
}
If you are after asymptotic complexity, the Aho-Corasick algorithm searches a string for all words of a dictionary in linear time: O(n+m), plus the number of matches reported, where n and m are the lengths of the input strings (the text and the dictionary words in total).
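As a rough illustration of the idea, here is a compact Aho-Corasick sketch. The class name AhoCorasick, its members, and the choice to return (start offset, pattern index) pairs are assumptions made for this example. It stores transitions in std::map, which adds a logarithmic factor in the alphabet size compared with the textbook bound.

```cpp
#include <cstddef>
#include <map>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Illustrative Aho-Corasick automaton (all names are assumptions).
class AhoCorasick {
    struct Node {
        std::map<char, std::size_t> next; // trie edges
        std::size_t fail = 0;             // failure link: longest proper suffix present in the trie
        std::vector<std::size_t> out;     // indices of the patterns that end at this node
    };
    std::vector<Node> nodes_;
    std::vector<std::size_t> lengths_;    // length of each pattern

public:
    explicit AhoCorasick(const std::vector<std::string>& patterns) : nodes_(1) {
        // Step 1: insert every pattern into the trie.
        for (std::size_t p = 0; p < patterns.size(); ++p) {
            lengths_.push_back(patterns[p].size());
            if (patterns[p].empty()) continue; // an empty pattern would match everywhere
            std::size_t cur = 0;
            for (char c : patterns[p]) {
                const auto it = nodes_[cur].next.find(c);
                if (it != nodes_[cur].next.end()) {
                    cur = it->second;
                } else {
                    const std::size_t fresh = nodes_.size();
                    nodes_[cur].next[c] = fresh;
                    nodes_.emplace_back();
                    cur = fresh;
                }
            }
            nodes_[cur].out.push_back(p);
        }
        // Step 2: breadth-first pass that sets failure links and inherits outputs.
        std::queue<std::size_t> pending;
        for (const auto& edge : nodes_[0].next) pending.push(edge.second);
        while (!pending.empty()) {
            const std::size_t u = pending.front();
            pending.pop();
            for (const auto& edge : nodes_[u].next) {
                const char c = edge.first;
                const std::size_t v = edge.second;
                std::size_t f = nodes_[u].fail;
                while (f != 0 && nodes_[f].next.count(c) == 0) f = nodes_[f].fail;
                const auto it = nodes_[f].next.find(c);
                nodes_[v].fail = (it != nodes_[f].next.end()) ? it->second : 0;
                const std::vector<std::size_t>& inherited = nodes_[nodes_[v].fail].out;
                nodes_[v].out.insert(nodes_[v].out.end(), inherited.begin(), inherited.end());
                pending.push(v);
            }
        }
    }

    // Returns (start offset in text, pattern index) for every occurrence.
    std::vector<std::pair<std::size_t, std::size_t>> find_all(const std::string& text) const {
        std::vector<std::pair<std::size_t, std::size_t>> hits;
        std::size_t state = 0;
        for (std::size_t i = 0; i < text.size(); ++i) {
            const char c = text[i];
            while (state != 0 && nodes_[state].next.count(c) == 0) state = nodes_[state].fail;
            const auto it = nodes_[state].next.find(c);
            if (it != nodes_[state].next.end()) state = it->second;
            for (std::size_t p : nodes_[state].out) hits.emplace_back(i + 1 - lengths_[p], p);
        }
        return hits;
    }
};
```

Because the automaton works on characters, it also finds multi-word names such as city names containing a space. It matches inside longer words as well, so for tagging locations you would probably want an extra word-boundary check on each hit.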
An alternative is to put the tokenized words of the text in a map (where the value is a list of the positions at which each word occurs in the stream), and then look up each search string in that tree. The complexity will be O(|S| * (n log n + m log n)), where m is the number of searched words, n is the number of words in the text, and |S| is the length of the average word.
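The map-based alternative could look roughly like this. The names WordIndex, index_words and locate are made up for the sketch, and positions are counted in words (0-based), which is an assumption about what "places in the stream" should mean.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// word -> positions (0-based word numbers) at which it occurs in the text
using WordIndex = std::map<std::string, std::vector<std::size_t>>;

// Tokenizes the text on whitespace and records where every word occurs.
WordIndex index_words(const std::string& text) {
    WordIndex index;
    std::istringstream iss(text);
    std::string word;
    for (std::size_t i = 0; iss >> word; ++i) index[word].push_back(i);
    return index;
}

// Looks up each searched name in the index and returns all positions, sorted.
std::vector<std::size_t> locate(const WordIndex& index, const std::vector<std::string>& names) {
    std::vector<std::size_t> hits;
    for (const std::string& name : names) {
        const auto it = index.find(name);
        if (it != index.end()) hits.insert(hits.end(), it->second.begin(), it->second.end());
    }
    std::sort(hits.begin(), hits.end());
    return hits;
}
```

The returned positions tell you which words of the article to wrap in tags when printing it.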
You can use any container that has a .find() method or supports std::find(). I'd use std::set, since set::find() runs in logarithmic rather than linear time.
Here is a program which does what you talk about. Note that the parsing doesn't work great, but that's not what I'm trying to demonstrate. You could continue to find the words using your parser, and use the call to set::find() to determine if they are locations.
#include <set>
#include <string>
#include <iostream>
#include <sstream>
const std::set<std::string> locations { "Springfield", "Illinois", "Pennsylvania" };
int main () {
std::string line;
while(std::getline(std::cin, line)) {
std::istringstream iss(line);
std::string word;
while(iss >> word) {
if(locations.find(word) == locations.end())
std::cout << word << " ";
else
std::cout << "<location>" << word << "</location> ";
}
std::cout << "\n";
}
}
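One weakness of the simple whitespace parsing above is punctuation: in a real article a word such as "Illinois," or "Illinois." would not be found in the set. A possible refinement, sketched here with the hypothetical function name tag_locations, splits trailing punctuation off each word before the lookup and puts it back after the closing tag.

```cpp
#include <cctype>
#include <cstddef>
#include <set>
#include <sstream>
#include <string>

// Tags known locations in one line and tolerates trailing punctuation.
// Words in the result are separated by single spaces.
std::string tag_locations(const std::string& line, const std::set<std::string>& locations) {
    std::istringstream iss(line);
    std::ostringstream out;
    std::string word;
    bool first = true;
    while (iss >> word) {
        // Separate trailing punctuation so that "Illinois." still matches "Illinois".
        std::size_t end = word.size();
        while (end > 0 && std::ispunct(static_cast<unsigned char>(word[end - 1]))) --end;
        const std::string core = word.substr(0, end);
        const std::string tail = word.substr(end);
        if (!first) out << ' ';
        first = false;
        if (locations.count(core) != 0)
            out << "<location>" << core << "</location>" << tail;
        else
            out << word;
    }
    return out.str();
}
```

It can replace the inner loop of the program above: call it once per line read with std::getline and print the returned string.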