How to read a file with multiple delimiters within a single line - c++

I am trying to read a file which has multiple delimiters per line.
Below is the data
2,22:11
3,33:11
4,44:11
5,55:11
6,66:11
7,77:11
8,88:11
9,99:11
10,00:11
11,011:11
12,022:11
13,033:11
14,044:11
15,055:11
16,066:11
17,077:11
18,088:11
19,099:11
The code is below.
Here, I am trying to read each line first up to the comma, and then up to the colon.
#include <fstream>
#include <iostream>
#include <string>
int main() {
std::string line;
std::string token;
std::ifstream infile;
infile.open("./data.txt");
while (std::getline(infile, line,',')) {
std::cout << line << std::endl;
while(getline(infile, token, ':'))
{
std::cout << " : " << token;
}
}
}
But there is an issue with the code: it skips the first line.
Also, if I comment out the second while loop, only the first line is printed. The output is below.
I am unable to figure out where exactly the code has gone wrong.
Output
2
: 22 : 11
3,33 : 11
4,44 : 11
5,55 : 11
6,66 : 11
7,77 : 11
8,88 : 11
9,99 : 11
10,00 : 11
11,011 : 11
12,022 : 11
13,033 : 11
14,044 : 11
15,055 : 11
16,066 : 11
17,077 : 11
18,088 : 11
19,099 : 11

Why two while loops?
Your problem is that the inner while never stops until the file is exhausted. The first while runs only once, to get the first 2; the second while then reads to the end of the file.
You can do it all with a single while, something like:
#include <fstream>
#include <iostream>
#include <string>
using namespace std;
int main() {
std::string line;
std::string token;
std::string num;
ifstream infile;
infile.open("a.txt");
while ( getline(infile, line,',')
&& getline(infile, token, ':')
&& getline(infile, num) )
cout << line << ',' << token << ':' << num << endl;
}

The problem comes from the way you use std::getline twice.
At the beginning you enter the first loop. The first call to std::getline returns what you expect: the first line up to the , delimiter.
Then you enter the second std::getline, in a nested loop, to get the rest of the line. But you never leave that second loop until the end of the file, so you read the whole rest of the file split on the : delimiter.
When the second std::getline reaches the end of the file, it leaves the nested loop.
Because you have already read the whole file, nothing is left to read and the first loop exits immediately.
Here is some debug output to help you understand what happens:
#include <fstream>
#include <iostream>
#include <string>
int main() {
std::string line;
std::string token;
std::ifstream infile;
infile.open("./data.txt");
while (std::getline(infile, line, ',')) {
std::cout << "First loop : " << line << std::endl;
while(getline(infile, token, ':'))
{
std::cout << "Inner loop : " << token << std::endl;
}
}
}
The first lines to be printed are :
First loop : 2
Inner loop : 22
Inner loop : 11
3,33
Inner loop : 11
4,44
You can clearly see it doesn't exit the second loop until the end.
I would advise reading the entire line, ignoring the delimiters, and then splitting the string into tokens with a tailor-made function. It is easy and very clean.
Solution :
#include <fstream>
#include <list>
#include <iostream>
#include <string>
struct line_content {
std::string line_number;
std::string token;
std::string value;
};
struct line_content tokenize_line(const std::string& line)
{
line_content l;
auto comma_pos = line.find(',');
l.line_number = line.substr(0, comma_pos);
auto point_pos = line.find(':');
l.token = line.substr(comma_pos + 1, point_pos - comma_pos - 1); // stop before the ':'
l.value = line.substr(point_pos + 1);
return l;
}
void print_tokens(const std::list<line_content>& tokens)
{
for (const auto& line: tokens) {
std::cout << "Line number : " << line.line_number << std::endl;
std::cout << "Token : " << line.token << std::endl;
std::cout << "Value : " << line.value << std::endl;
}
}
int main() {
std::string line;
std::ifstream infile;
std::list<line_content> tokens;
infile.open("./data.txt");
while (std::getline(infile, line)) {
tokens.push_back(tokenize_line(line));
}
print_tokens(tokens);
return 0;
}
I think this should let you do what you want.
Compiled as follows: g++ -Wall -Wextra --std=c++1y <your c++ file>
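The same read-a-line-then-split idea can also be sketched with std::istringstream, calling std::getline with a different delimiter for each field. This is only an illustration, assuming every line has the form a,b:c; the names Record and parse_record are made up for this example and do not come from the answer above.

```cpp
#include <cassert>  // assert, for quick self-checks
#include <sstream>
#include <string>

// Hypothetical record type for one "a,b:c" line.
struct Record {
    std::string first;   // text before ','
    std::string second;  // text between ',' and ':'
    std::string third;   // text after ':'
};

// Splits one line of the form "a,b:c". Returns false if ',' or ':' is missing.
bool parse_record(const std::string& line, Record& out)
{
    std::istringstream in(line);
    // Each getline stops at its own delimiter. If eof() is set right after
    // a read, the input ended before the delimiter was found.
    if (!std::getline(in, out.first, ',') || in.eof()) return false;
    if (!std::getline(in, out.second, ':') || in.eof()) return false;
    std::getline(in, out.third);  // rest of the line (may be empty)
    return true;
}
```

In the reading loop this would be used as while (std::getline(infile, line)) { Record r; if (parse_record(line, r)) { /* use r */ } }, so a malformed line can be skipped instead of shifting all later fields.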

If you want to split a string on multiple delimiters, without having to worry about the order of the delimiters, you can use std::string::find_first_of()
#include <fstream>
#include <iostream>
#include <streambuf>
#include <string>
int
main()
{
std::ifstream f("./data.txt");
std::string fstring = std::string(std::istreambuf_iterator<char>(f),
std::istreambuf_iterator<char>());
std::size_t next, pos = 0;
while((next = fstring.find_first_of(",:\n", pos)) != std::string::npos)
{
std::cout.write(&fstring[pos], ++next - pos);
pos = next;
}
std::cout << &fstring[pos] << '\n';
return 0;
}

Related

How do I search a string from a file and return the line location using functions in C++?

I am trying to make a program that lets me search for groups of words in a file and then it would return the line locations where they are found. I made it work for a little bit but for some reason, the value of int FileLine (line location) keeps on stacking up whenever a new word search is introduced.
#include <iostream>
#include <fstream>
#include <iomanip>
#include <string>
using namespace std;
string S1, S2, S, Line;
int FileLine = 0;
int CountInFile(string S) {
ifstream in("DataFile.txt");
while (getline(in, Line)) {
FileLine++;
if (Line.find(S, 0) != string::npos) {
cout << "Found " << S << " at line " << FileLine << "\n";
}
}
in.close();
return 0;
}
int main()
{
// Words to search
CountInFile("Computer Science");
CountInFile("Programming");
CountInFile("C++");
CountInFile("COSC");
CountInFile("computer");
return 0;
}
This is the output:
Is there a way that I can stop the FileLine value from stacking?
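Judging from the code above, the likely cause is that FileLine is a global that is never reset, so each call to CountInFile keeps counting from where the previous call stopped. A minimal sketch of a fix, assuming the count should restart for every search, is to make the counter local to the function. The name find_lines and the choice to take a std::istream (so any stream can be searched) are assumptions made for this example.

```cpp
#include <cassert>  // assert, for quick self-checks
#include <istream>
#include <sstream>  // std::istringstream, handy for trying this without a file
#include <string>
#include <vector>

// Returns the 1-based numbers of all lines in `in` that contain `needle`.
std::vector<int> find_lines(std::istream& in, const std::string& needle)
{
    std::vector<int> hits;
    int line_number = 0;  // local, so it starts from 0 on every call
    std::string line;
    while (std::getline(in, line)) {
        ++line_number;
        if (line.find(needle) != std::string::npos) {
            hits.push_back(line_number);
        }
    }
    return hits;
}
```

With a file, the call would look like std::ifstream in("DataFile.txt"); auto hits = find_lines(in, "C++"); with a fresh stream opened for each search.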

c++ : istream_iterator skip spaces but not newline

Suppose I have
istringstream input("x = 42\n"s);
I'd like to iterate over this stream using std::istream_iterator<std::string>
int main() {
std::istringstream input("x = 42\n");
std::istream_iterator<std::string> iter(input);
for (; iter != std::istream_iterator<std::string>(); iter++) {
std::cout << *iter << std::endl;
}
}
I get the following output as expected:
x
=
42
Is it possible to have the same iteration skipping spaces but not a newline symbol? So I'd like to have
x
=
42
\n
std::istream_iterator isn't really the right tool for this job, because it doesn't let you specify the delimiter character to use. Instead, use std::getline, which does. Then check for the newline manually and strip it off if found:
#include <iostream>
#include <string>
#include <sstream>
int main() {
std::istringstream input("x = 42\n");
std::string s;
while (getline (input, s, ' '))
{
bool have_newline = !s.empty () && s.back () == '\n';
if (have_newline)
s.pop_back ();
std::cout << "\"" << s << "\"" << std::endl;
if (have_newline)
std::cout << "\"\n\"" << std::endl;
}
}
Output:
"x"
"="
"42"
"
"
If you can use Boost, use this:
boost::algorithm::split_regex(cont, str, boost::regex("\\s"));
where cont is the result container and str is your input string. Note the doubled backslash: "\s" is not a valid escape sequence in a C++ string literal.
https://www.boost.org/doc/libs/1_76_0/doc/html/boost/algorithm/split_regex.html
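Without Boost, the behaviour the question asks for can be sketched with the standard library alone: walk the string once, treat spaces as separators, and emit every newline as a token of its own. The function name split_keep_newlines is hypothetical, and the sketch assumes only ' ' and '\n' need special handling.

```cpp
#include <cassert>  // assert, for quick self-checks
#include <string>
#include <vector>

// Splits on spaces; each '\n' ends the current token and becomes its own token.
std::vector<std::string> split_keep_newlines(const std::string& text)
{
    std::vector<std::string> tokens;
    std::string current;
    for (char ch : text) {
        if (ch == ' ' || ch == '\n') {
            if (!current.empty()) {  // flush the word collected so far
                tokens.push_back(current);
                current.clear();
            }
            if (ch == '\n') {
                tokens.push_back("\n");
            }
        } else {
            current += ch;
        }
    }
    if (!current.empty()) {  // last word when the text does not end in a separator
        tokens.push_back(current);
    }
    return tokens;
}
```

For the input "x = 42\n" this yields the four tokens x, =, 42 and a newline token.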

If a word is repeated many times in a string, how can I count the number of repetitions of the word and their positions?

If a word is repeated many times in a string, how can I count the number of repetitions of the word and their positions?
#include <cstring>
#include <iostream>
#include <string>
using namespace std;
int main()
{
string str;
getline(cin, str);
string str2;
getline(cin, str2);
const char* p = strstr(str.c_str(), str2.c_str());
if (p)
cout << "'" << str2 << "' found at " << p - str.c_str();
else
cout << "'" << str2 << "' not found in \"" << str << "\"";
return 0;
}
Just off the top of my head, you could use find() within std::string. find() returns the first match of a substring within your string (or std::string::npos if there is no match), so you would need to loop until find() was not able to find any more matches of your string.
Something like:
#include <string>
#include <vector>
#include <cstdio>
int main(void) {
std::string largeString = "Some string with substrings";
std::string mySubString = "string";
int numSubStrings = 0;
std::vector<size_t> locations;
// Start searching at index 0 so a match at the very beginning is not skipped.
size_t found = largeString.find(mySubString);
while (found != std::string::npos) {
numSubStrings += 1;
locations.push_back(found);
// Continue one character past the last match (this also counts overlapping matches).
found = largeString.find(mySubString, found + 1);
}
printf("There are %d occurrences of: \n%s \nin \n%s\n", numSubStrings, mySubString.c_str(), largeString.c_str());
}
Which outputs:
There are 2 occurrences of:
string
in
Some string with substrings
The code below leans on the Standard Library to do the common work. I read a file into one large string, use a std::stringstream to separate the words on whitespace, and store the individual words in a std::vector (an array that manages its size and grows when needed). To get a good count of the words, punctuation and capitalization must also be removed; this is done in the sanitize_word() function. I then add the words to a map where the word is the key and the int is the count of how many times that word occurred. Finally, I print the map to get a complete word count.
The only place I directly did any string parsing is in the sanitize function, and it was done using the aptly named erase/remove idiom. Letting the Standard Library do the work for us when possible is much simpler.
Locating where a word occurs also becomes trivial after they've been separated and sanitized.
Contents of input.txt:
I must not fear. Fear is the mind-killer. Fear is the little-death that brings total obliteration. I will face my fear. I will permit it to pass over me and through me. And when it has gone past, I will turn the inner eye to see its path. Where the fear has gone, there will be nothing. Only I will remain.
#include <algorithm>
#include <cctype>
#include <fstream>
#include <iostream>
#include <map>
#include <sstream>
#include <string>
#include <vector>
// Removes punctuation marks and converts words to all lowercase
std::string sanitize_word(std::string word) {
word.erase(std::remove_if(word.begin(), word.end(),
[punc = std::string(".,?!")](auto c) {
return punc.find(c) != std::string::npos;
}),
word.end());
for (auto& c : word) {
c = static_cast<char>(std::tolower(static_cast<unsigned char>(c))); // tolower needs an unsigned char value
}
return word;
}
int main() {
// Set up
std::ifstream fin("input.txt");
if (!fin) {
std::cerr << "Error opening file...\n";
return 1;
}
std::string phrases;
for (std::string tmp; std::getline(fin, tmp);) {
phrases += tmp;
}
fin.close();
// Words are collected, now the part we care about
std::stringstream strin(phrases);
std::vector<std::string> words;
for (std::string tmp; strin >> tmp;) {
words.push_back(tmp);
}
for (auto& i : words) {
i = sanitize_word(i);
}
// std::map's operator[]() function will create a new element in the map if it
// doesn't already exist
std::map<std::string, int> wordCounts;
for (auto i : words) {
++wordCounts[i];
}
for (auto i : wordCounts) {
std::cout << i.first << ": " << i.second << '\n';
}
// Now we'll do code to locate a certain word, "fear" for this example
std::string wordToFind("fear");
auto it = wordCounts.find(wordToFind);
std::cout << "\n\n" << it->first << ": " << it->second << '\n';
std::vector<int> locations;
for (std::size_t i = 0; i < words.size(); ++i) {
if (words[i] == wordToFind) {
locations.push_back(i);
}
}
std::cout << "Found at locations: ";
for (auto i : locations) {
std::cout << i << ' ';
}
std::cout << '\n';
}
Output:
and: 2
be: 1
brings: 1
eye: 1
face: 1
fear: 5
gone: 2
has: 2
i: 5
inner: 1
is: 2
it: 2
its: 1
little-death: 1
me: 2
mind-killer: 1
must: 1
my: 1
not: 1
nothing: 1
obliteration: 1
only: 1
over: 1
pass: 1
past: 1
path: 1
permit: 1
remain: 1
see: 1
that: 1
the: 4
there: 1
through: 1
to: 2
total: 1
turn: 1
when: 1
where: 1
will: 5
fear: 5
Found at locations: 3 4 8 20 50
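If the positions of every word are wanted, not just those of one word, the counts and the locations can be collected in a single pass. The sketch below is a variation on the code above rather than part of it: it assumes the words have already been split and sanitized, and the name index_words is made up for illustration.

```cpp
#include <cassert>  // assert, for quick self-checks
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Maps each word to the list of indices at which it appears in `words`.
std::map<std::string, std::vector<std::size_t>>
index_words(const std::vector<std::string>& words)
{
    std::map<std::string, std::vector<std::size_t>> index;
    for (std::size_t i = 0; i < words.size(); ++i) {
        // operator[] creates an empty vector the first time a word is seen.
        index[words[i]].push_back(i);
    }
    return index;
}
```

The number of occurrences of a word is then the size of its vector, for example index.at("fear").size(), and index.find(word) can be used first when the word may be absent.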

How to skip blank spaces when reading in a file c++

Here is the codeshare link of the exact input file: https://codeshare.io/5DBkgY
Ok, as you can see, there are 2 blank lines (or tabs) between 8 and ROD. How would I skip them and continue with the program? I am trying to put each line into 3 vectors (so keys, lamp, and rod into one vector, etc.). Here is my code (but it does not skip the blank lines):
#include <string>
#include <iostream>
#include <sstream>
#include <vector>
#include <fstream>
using namespace std;
int main() {
ifstream objFile;
string inputName;
string outputName;
string header;
cout << "Enter image file name: ";
cin >> inputName;
objFile.open(inputName);
string name;
vector<string> name2;
string description;
vector<string> description2;
string initialLocation;
vector<string> initialLocation2;
string line;
if(objFile) {
while(!objFile.eof()){
getline(objFile, line);
name = line;
name2.push_back(name);
getline(objFile, line);
description = line;
description2.push_back(description);
getline(objFile, line);
initialLocation = line;
initialLocation2.push_back(initialLocation);
}
} else {
cout << "not working" << endl;
}
for (std::vector<string>::const_iterator i = name2.begin(); i != name2.end(); ++i)
std::cout << *i << ' ';
for (std::vector<string>::const_iterator i = description2.begin(); i != description2.end(); ++i)
std::cout << *i << ' ';
for (std::vector<string>::const_iterator i = initialLocation2.begin(); i != initialLocation2.end(); ++i)
std::cout << *i << ' ';
}
#include <cstddef> // std::size_t
#include <cstdlib> // EXIT_FAILURE
#include <cctype> // std::isspace()
#include <iterator> // std::size() (C++17)
#include <string>
#include <vector>
#include <fstream>
#include <iostream>
bool is_empty(std::string const &str)
{
for (auto const &ch : str)
if (!std::isspace(static_cast<char unsigned>(ch)))
return false;
return true;
}
int main()
{
std::cout << "Enter image file name: ";
std::string filename;
std::getline(std::cin, filename); // at least on Windows paths containing whitespace
// are valid.
std::ifstream obj_file{ filename }; // define variables as close to where they're used
// as possible and use the ctors for initialization.
if (!obj_file.is_open()) { // *)
std::cerr << "Couldn't open \"" << filename << "\" for reading :(\n\n";
return EXIT_FAILURE;
}
std::vector<std::string> name;
std::vector<std::string> description;
std::vector<std::string> initial_location;
std::string line;
std::vector<std::string> *destinations[] = { &name, &description, &initial_location };
for (std::size_t i{}; std::getline(obj_file, line); ++i) {
if (is_empty(line)) { // if line only consists of whitespace
--i;
continue; // skip it.
}
destinations[i % std::size(destinations)]->push_back(line);
}
for (auto const &s : name)
std::cout << s << '\n';
for (auto const &s : description)
std::cout << s << '\n';
for (auto const &s : initial_location)
std::cout << s << '\n';
}
... initial_locations look like integers, though.
*) Better early exit if something bad happens. Instead of
if (obj_file) {
// do stuff
}
else {
// exit
}
-->
if(!obj_file)
// exit
// do stuff
makes your code easier to read and takes away one level of indentation for the most part.

How can I read a line from a stringstream only if it contains any newline?

I'm reading some network data into a stringstream as an input_buffer.
The data is ASCII lines separated by a LF char.
The input_buffer may be in a state where there is only a partial line in it.
I'm trying to call getline (), but only when there actually is a new newline char in the stringstream. In other words it should extract completed lines, but leave a partial line in the buffer.
Here is a MVCE:
#include <string>
#include <sstream>
#include <iostream>
int
main (void)
{
std::stringstream input_buffer;
input_buffer << "test123\nOK\n";
while (input_buffer.str ().find ('\n') != std::string::npos)
{
std::string line;
std::getline (input_buffer, line, '\n');
std::cout << "input_buffer.str ().size: " << input_buffer.str ().size () << "\n";
std::cout << "line: " << line << "\n";
}
return 0;
}
It currently does not terminate, here is a fragment of the output:
input_buffer.str ().size: 11
line: test123
input_buffer.str ().size: 11
line: OK
input_buffer.str ().size: 11
line:
input_buffer.str ().size: 11
...
How can I read a line from a stringstream only if it contains any newline?
Edit: For clarification here is another code sample with partial input:
#include <string>
#include <sstream>
#include <iostream>
#include <vector>
void
extract_complete_lines_1 (std::stringstream &input_buffer, std::vector<std::string> &lines)
{
while (input_buffer.str ().find ('\n') != std::string::npos)
{
std::string line;
std::getline (input_buffer, line, '\n');
lines.push_back (line);
}
}
void
print_lines (const std::vector<std::string> &v)
{
for (auto l : v)
{
std::cout << l << '\n';
}
}
int
main (void)
{
std::vector<std::string> lines;
std::stringstream input_buffer {"test123\nOK\npartial line"};
extract_complete_lines_1 (input_buffer, lines);
print_lines (lines);
return 0;
}
This should print "test123" and "OK", but not "partial line".
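The loop in both samples cannot terminate on its own because str() returns a copy of the whole underlying buffer, including text that getline has already consumed, so the search for '\n' keeps succeeding. One way to stay with std::stringstream is to search only the unread part. The sketch below assumes tellg() reports the current read position, as it does for an in-memory stream in a good state; the helper name has_complete_line is hypothetical. It copies the buffer on every call and never shrinks it, so it is meant as an illustration rather than an efficient solution.

```cpp
#include <cassert>  // assert, for quick self-checks
#include <cstddef>
#include <ios>
#include <sstream>
#include <string>
#include <vector>

// True if the unread part of `in` still contains a '\n'.
bool has_complete_line(std::stringstream& in)
{
    const std::streamoff pos = in.tellg();  // -1 if the stream is in a failed state
    if (pos < 0) {
        return false;
    }
    const std::string all = in.str();  // whole buffer, read and unread
    return all.find('\n', static_cast<std::size_t>(pos)) != std::string::npos;
}
```

Used as the loop condition, while (has_complete_line(input_buffer)) { std::getline(input_buffer, line); ... }, the partial line stays unread in the stream.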
As mentioned here, you could override the underflow function of the buffer so that it will refill using a function that you can specify.
Here is an example adapted from here:
#include <iostream>
#include <sstream>
#include <string>
class Mybuf : public std::streambuf {
std::string line{};
char ch{}; // single-byte buffer
protected:
int underflow() override {
if(line.empty()) {
std::cout << "Please enter a line of text for the stream: ";
getline(std::cin, line);
line.push_back('\n');
}
ch = line[0];
line.erase(0, 1);
setg(&ch, &ch, &ch + 1); // make one read position available
return ch;
}
public:
Mybuf(std::string line) : line{line} {};
};
class mystream : public std::istringstream {
Mybuf mybuf;
public:
mystream(std::string line) : std::istringstream{}, mybuf{line}
{
static_cast<std::istream&>(*this).rdbuf(&mybuf);
}
};
int main()
{
mystream ms{"The first line.\nThe second line.\nA partial line"};
for(std::string line{}; std::getline(ms, line); )
std::cout << "line: " << line << "\n";
}
Output:
line: The first line.
line: The second line.
Please enter a line of text for the stream: Here is more!
line: A partial lineHere is more!
Please enter a line of text for the stream:
I think that it's not easily possible with std::stringstream. I tried to manipulate the stream position with tellg () and seekg (), but they don't behave like I expected.
I have found a solution using a std::vector<char> as a buffer:
#include <string>
#include <sstream>
#include <iostream>
#include <vector>
#include <iterator>
#include <algorithm>
void
extract_complete_lines (std::vector<char> &buf, std::vector<std::string> &lines)
{
auto pos = std::end (buf);
while ((pos = std::find (std::begin (buf), std::end (buf), '\n')) != std::end (buf))
{
std::string line (std::begin (buf), pos);
buf.erase (std::begin(buf), pos + 1);
lines.push_back (line);
}
}
void
print_lines (const std::vector<std::string> &v)
{
for (auto l : v)
{
std::cout << l << '\n';
}
}
int
main (void)
{
std::vector<std::string> lines;
const std::string test_input = "test123\nOK\npartial line";
std::vector<char> input_buffer {std::begin (test_input), std::end (test_input)};
extract_complete_lines (input_buffer, lines);
print_lines (lines);
return 0;
}
It prints the first two lines as expected and the "partial line" is left in the vector.
Or even better, a std::vector<char> is not too different from a std::string:
#include <string>
#include <sstream>
#include <iostream>
#include <vector>
#include <iterator>
#include <algorithm>
void
extract_complete_lines (std::string &buf, std::vector<std::string> &lines)
{
std::string::size_type pos;
while ((pos = buf.find ('\n')) != std::string::npos)
{
lines.push_back (buf.substr (0, pos));
buf.erase (0, pos + 1);
}
}
void
print_lines (const std::vector<std::string> &v)
{
for (auto l : v)
{
std::cout << l << '\n';
}
}
int
main (void)
{
std::vector<std::string> lines;
std::string input_buffer = "test123\nOK\npartial line";
extract_complete_lines (input_buffer, lines);
print_lines (lines);
return 0;
}
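To connect this back to the network scenario from the question, here is a hedged sketch in which data arrives in chunks and each chunk is appended to the string buffer before extraction. The chunk contents and the helper name on_data are invented for the example; extract_complete_lines is repeated from above so the snippet stands alone.

```cpp
#include <cassert>  // assert, for quick self-checks
#include <string>
#include <vector>

// Same function as above: moves every complete line out of buf into lines.
void extract_complete_lines(std::string& buf, std::vector<std::string>& lines)
{
    std::string::size_type pos;
    while ((pos = buf.find('\n')) != std::string::npos) {
        lines.push_back(buf.substr(0, pos));
        buf.erase(0, pos + 1);
    }
}

// Hypothetical receive handler: append the new chunk, then extract what is complete.
void on_data(std::string& buf, const std::string& chunk,
             std::vector<std::string>& lines)
{
    buf += chunk;
    extract_complete_lines(buf, lines);
}
```

After on_data(buf, "test123\nOK\npart", lines) the vector holds two lines and buf holds "part"; a later on_data(buf, "ial line\n", lines) completes the third line and leaves buf empty.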