How to parse a file into blocks in C++ - c++

I am looking for an elegant way to parse a file into blocks and create a new file for each block, for example:
original file:
line 1
line 2
line 3
line 4
line 5
line 6
line 7
result:
first file:
line 1
second file:
line 2
line 3
third file:
line 4
line 5
line 6
fourth file:
line 7
thanks

It looks like you could use this algorithm:
Count the number of spaces at the beginning of each line; if it is less than or equal to the number of spaces at the beginning of the preceding non-empty line, open a new file.
What have you tried so far?
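The rule above could be sketched roughly as follows. This is only an illustration: splitByIndent and leadingSpaces are hypothetical names, the blocks are collected in memory instead of being written to files, only space characters count as indentation, and empty lines are assumed to stay in the current block.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>   // std::istringstream, handy for trying the function out
#include <string>
#include <vector>

// Number of leading space characters in a line (tabs are not counted here).
std::size_t leadingSpaces(const std::string& line) {
    std::size_t n = 0;
    while (n < line.size() && line[n] == ' ') ++n;
    return n;
}

// Hypothetical helper: groups lines into blocks using the rule above.
// A non-empty line starts a new block when its indentation is less than or
// equal to that of the preceding non-empty line, or when it is the first one.
std::vector<std::vector<std::string>> splitByIndent(std::istream& in) {
    std::vector<std::vector<std::string>> blocks;
    std::string line;
    bool havePrev = false;       // has a non-empty line been seen yet?
    std::size_t prevIndent = 0;  // indentation of the preceding non-empty line
    while (std::getline(in, line)) {
        if (line.empty()) {
            // Assumption: empty lines stay in the current block and do not
            // take part in the comparison.
            if (!blocks.empty()) blocks.back().push_back(line);
            continue;
        }
        const std::size_t indent = leadingSpaces(line);
        if (!havePrev || indent <= prevIndent) blocks.emplace_back();
        blocks.back().push_back(line);
        prevIndent = indent;
        havePrev = true;
    }
    return blocks;
}
```

Each inner vector could then be written to its own output file, numbered by its position in the outer vector.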

You could use a scoped_ptr to switch the output file whenever an input line does not begin with whitespace:
#include <fstream>
#include <sstream>
#include <string>
#include <boost/scoped_ptr.hpp>

int main()
{
    std::ifstream in("/your/input/file");
    boost::scoped_ptr<std::ofstream> out; // no output file open yet
    int out_serial = 0;
    std::string line;
    while (std::getline(in, line))
    {
        // test: first character non blank
        if (!line.empty() && line.at(0) != ' ' && line.at(0) != '\t')
        {
            std::ostringstream new_output_file;
            new_output_file << "/your/output/file/prefix_" << out_serial++;
            // c_str() keeps this working with pre-C++11 compilers
            out.reset(new std::ofstream(new_output_file.str().c_str()));
        }
        // lines before the first block start have no file to go to
        if (out) (*out) << line << std::endl;
    }
}

If your input is not properly indented, or your blocks are delimited by braces rather than by indentation, you can use a stack (std::stack from the STL): push on each opening brace and pop on each closing brace, and open a new file each time the stack becomes empty.
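A minimal sketch of the stack idea, under a few assumptions: splitByBraces is a hypothetical name, blocks are returned in memory rather than written to files, and braces inside string literals or comments are not treated specially.

```cpp
#include <istream>
#include <sstream>   // std::istringstream, handy for trying the function out
#include <stack>
#include <string>
#include <vector>

// Hypothetical helper: a block ends on the line whose closing brace empties
// the stack. Lines before the first opening brace belong to the first block.
std::vector<std::vector<std::string>> splitByBraces(std::istream& in) {
    std::vector<std::vector<std::string>> blocks;
    std::vector<std::string> current;
    std::stack<char> open;  // one entry per unmatched '{'
    std::string line;
    while (std::getline(in, line)) {
        current.push_back(line);
        bool closedBlock = false;
        for (char c : line) {
            if (c == '{') {
                open.push(c);
                closedBlock = false;
            } else if (c == '}' && !open.empty()) {
                open.pop();
                closedBlock = open.empty();
            }
        }
        if (closedBlock) {  // the stack became empty on this line
            blocks.push_back(current);
            current.clear();
        }
    }
    if (!current.empty()) blocks.push_back(current);  // trailing lines
    return blocks;
}
```

Opening a new output file would happen at the point where a finished block is pushed onto blocks.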

Related

Reading from multiple lines from a txt file in a single iteration

I am working with linked lists at the moment, and my nodes have 4 elements (each of them a string variable). In a .txt file there are groups of text where each group has 4 lines, for example:
This is the first line
This is the second line
This is the third line
This is the fourth line
'\n'
This is the fifth line
This is the sixth line
This is the seventh line
This is the eighth line
and so on and so on...
What I'm trying to achieve is to read the four lines in a single iteration, give them to a node, and let the program iterate until there are no more lines.
So after reading the example above, our first node will be left with:
Node1.string1 = This is the first line;
Node1.string2 = This is the second line;
Node1.string3 = This is the third line;
Node1.string4 = This is the fourth line;
While looking on the internet for a way to do this, I found a way to tell the "ifstream reader" to skip a '\n' before the next iteration, but I lost the page and can't seem to find it.
Easy enough:
while (getline(in, node.string1) &&
       getline(in, node.string2) &&
       getline(in, node.string3) &&
       getline(in, node.string4))
{
    ...
    string dummy;
    getline(in, dummy); // skip blank line
}
You can also use in.ignore(std::numeric_limits<std::streamsize>::max(), '\n'); (declared in <limits>) to skip the blank line, but reading into a dummy variable allows you to easily check if the blank line really is blank.
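For illustration, here is the same loop filling a std::list. The Node struct is a stand-in built from the field names in the question, readNodes is a hypothetical name, and stopping on a non-blank separator line is an assumption about how malformed input should be handled.

```cpp
#include <istream>
#include <list>
#include <sstream>   // std::istringstream, handy for trying the function out
#include <string>

// Stand-in for the question's node type (field names taken from the question).
struct Node {
    std::string string1, string2, string3, string4;
};

// Reads groups of four lines separated by one blank line.
std::list<Node> readNodes(std::istream& in) {
    std::list<Node> nodes;
    Node node;
    while (std::getline(in, node.string1) &&
           std::getline(in, node.string2) &&
           std::getline(in, node.string3) &&
           std::getline(in, node.string4)) {
        nodes.push_back(node);
        std::string separator;
        // Skip the separator line; stop if it is not blank, since the input
        // then does not have the expected layout.
        if (std::getline(in, separator) && !separator.empty()) break;
    }
    return nodes;
}
```

A group that is cut off before its fourth line is dropped, because the loop condition fails as soon as one of the four reads fails.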

Read a file line by line and insert data in to variables and arrays [duplicate]

VA301
20/02/2020 10:20
COLOMBO
SINGAPORE
10 E AB
15 E CDE
22 E ADF
31 E BCF
35 E ABCD
45 E AB
50 E DEF
These are the details in my file. I want to read this file line by line and store the first 5 lines in a variable and the other lines in 3 char arrays.
I don't know why you really want to do so. If you can give me a better explanation, I can give you a better answer.
To read from a file you have to use an input file stream.
Example:
ifstream infile("thefile.txt"); // change thefile.txt to your file name and make sure it is in the same folder as the program
Now you can use the getline() function to get data from the stream.
string line;
char ch[200];
getline(infile, line); // stores the line in a string
getline(infile, line, '&'); // the last parameter is the "delimiter"
// getline() uses the delimiter to decide when to stop reading data.
infile.getline(ch, 200); // stores the line in a char array
You can read to the end of the file with a loop. Use the read itself as the loop condition: testing eof() before reading does not work reliably, because the flag is only set after a read has already failed.
while (getline(infile, line)) // read until the end of the file
{
    // do something
}
To get everything together:
#include <fstream>
#include <iostream>
#include <string>
using namespace std;
int main() {
    ifstream infile("thefile.txt"); // change thefile.txt to your file name and make sure it is in the same folder as the program
    if (!infile) {
        cerr << "cannot open thefile.txt" << endl;
        return 1;
    }
    string line, var;
    // store the first 5 lines in one string
    for (int i = 0; i < 5 && getline(infile, line); ++i)
        var += line + '\n';
    // assuming that there are just 3 other lines
    char ch1[200], ch2[200], ch3[200]; // you can choose another size
    infile.getline(ch1, 200);
    infile.getline(ch2, 200);
    infile.getline(ch3, 200);
}
For more information you can read:
https://en.cppreference.com/w/cpp/string/basic_string/getline
https://www.tutorialspoint.com/cplusplus/cpp_files_streams.htm
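If the goal is to split the sample into separate fields, something along these lines might work. It is a sketch under assumptions: the sample as shown has four header lines followed by lines of the form "row class seats", the Flight struct and its member names are guesses, and std::vector is used in place of fixed-size char arrays.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical layout: the member names are guesses based on the sample data.
struct Flight {
    std::string number, departure, origin, destination;
    std::vector<int> rows;           // e.g. 10
    std::vector<char> classes;       // e.g. 'E'
    std::vector<std::string> seats;  // e.g. "AB"
};

// Returns false if the four header lines could not be read.
bool readFlight(std::istream& in, Flight& f) {
    if (!std::getline(in, f.number) || !std::getline(in, f.departure) ||
        !std::getline(in, f.origin) || !std::getline(in, f.destination))
        return false;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        int row;
        char cls;
        std::string seats;
        if (fields >> row >> cls >> seats) {  // lines that do not fit are skipped
            f.rows.push_back(row);
            f.classes.push_back(cls);
            f.seats.push_back(seats);
        }
    }
    return true;
}
```

The three vectors stay index-aligned, so rows[i], classes[i] and seats[i] always describe the same input line.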

Printing duplicate strings and how many times they appear in a file C++

Here is the question I have to solve and the code I've written so far.
Write a function named printDuplicates that accepts an input stream and an output stream as parameters.
The input stream represents a file containing a series of lines. Your function should examine each line looking for consecutive occurrences of the same token on the same line and print each duplicated token along with how many times it appears consecutively.
Non-repeated tokens are not printed. Repetition across multiple lines (such as if a line ends with a given token and the next line starts with the same token) is not considered in this problem.
For example, if the input file contains the following text:
hello how how are you you you you
I I I am Jack's Jack's smirking smirking smirking smirking smirking revenge
bow wow wow yippee yippee yo yippee yippee yay yay yay
one fish two fish red fish blue fish
It's the Muppet Show, wakka wakka wakka
My expected result should be:
how*2 you*4
I*3 Jack's*2 smirking*5
wow*2 yippee*2 yippee*2 yay*3
\n
wakka*3
Here is my function:
1 void printDuplicates(istream& in, ostream& out)
2 {
3 string line; // Variable to store lines in
4 while(getline(in, line)) // While there are lines to get do the following
5 {
6 istringstream iss(line); // String stream initialized with line
7 string word; // Current word
8 string prevWord; // Previous word
9 int numWord = 1; // Starting index for # of a specific word
10 while(iss >> word) // Storing strings in word variable
11 {
12 if (word == prevWord) ++numWord; // If a word and the word
13 // before it are equal add to word counter
14 else if (word != prevWord) // Else if the word and the word before it are not equal
15 {
16 if (numWord > 1) // And there are at least two copies of that word
17 {
18 out << prevWord << "*" << numWord << " "; // Print out "word*occurrences"
19 }
20 numWord = 1; // Reset the num counter variable for next word
21 }
22 prevWord = word; // Set current word to previous word, loop begins again
23 }
24 out << endl; // Prints new line between each iteration of line loop
25 }
26 }
My result thus far is:
how*2
I*3 Jack's*2 smirking*5
wow*2 yippee*2 yippee*2
I have tried adding (|| iss.eof()), (|| iss.peek() == EOF), etc. inside the nested else if statement on Line 14, but I am unable to figure this out. I need some way of knowing I'm at the end of the line so my else if statement will be true and the last word on the line gets printed.
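One possible way around this, not taken from the thread, is to stop trying to detect the end of the line inside the loop and instead print the pending run once more after the inner loop finishes. The sketch below assumes that approach. flushRun is a hypothetical helper, and unlike the code above it puts spaces between entries rather than after each one.

```cpp
#include <iostream>
#include <sstream>
#include <string>

// Prints one "token*count" entry for a finished run, if the token repeated.
void flushRun(std::ostream& out, const std::string& word, int count,
              bool& first) {
    if (count < 2) return;
    if (!first) out << ' ';
    out << word << '*' << count;
    first = false;
}

void printDuplicates(std::istream& in, std::ostream& out) {
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream iss(line);
        std::string word, prevWord;
        int numWord = 0;    // length of the current run; 0 before any token
        bool first = true;  // no entry printed on this line yet
        while (iss >> word) {
            if (numWord > 0 && word == prevWord) {
                ++numWord;
            } else {
                flushRun(out, prevWord, numWord, first);
                prevWord = word;
                numWord = 1;
            }
        }
        flushRun(out, prevWord, numWord, first);  // the run that ends the line
        out << '\n';
    }
}
```

The second flushRun call is the whole fix: a run that reaches the end of the line is never followed by a different token, so nothing inside the loop would ever print it.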

Parsing of file with Key value in C/C++

Need some help in parsing the file
Device# Device Name Serial No. Active Policy Disk# P.B.T.L ALB
Paths
--------------------------------------------------------------------------------------- -------------------------------------
1 AB OPEN-V-CM 50 0BC1F1621 1 SQST Disk 2 3.1.4.0 N/A
2 AB OPEN-V-CM 50 0BC1F1605 1 SQST Disk 3 3.1.4.1 N/A
3 AB OPEN-V*2 50 0BC1F11D4 1 SQST Disk 4 3.1.4.2 N/A
4 AB OPEN-V-CM 50 0BC1F005A 1 SQST Disk 5 3.1.4.3 N/A
The above information is in a devices.txt file, and I want to extract the device number corresponding to the disk number I input.
The disk number I input is just an integer (and not "Disk 2" as shown in the file).
Open the file and skip the first 3 lines.
Start reading line by line from the 4th line onward. You can get the device number easily, as it is the first column.
To get the disk number, scan each line for space characters. Each run of spaces means you have gone past one column; ignore the repeated spaces and continue until you reach the disk number. Spaces inside the column data, if there are any, must be handled separately.
Load the disk number and device number into, say, a map; later you can use your input to query the device number from this map.
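A rough sketch of the map idea, with some liberties: loadDiskToDevice is a hypothetical name, header and separator lines are skipped by checking for a leading integer instead of counting three lines, and the disk number is assumed to be the token right after the word "Disk", as in the rows shown in the question.

```cpp
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Builds a disk number -> device number map from the listing.
std::map<int, int> loadDiskToDevice(std::istream& in) {
    std::map<int, int> diskToDevice;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        int device;
        if (!(fields >> device)) continue;  // header, separator or blank line
        std::string token;
        int disk;
        while (fields >> token) {
            // Assumption: the disk number directly follows the word "Disk".
            if (token == "Disk" && fields >> disk) {
                diskToDevice[disk] = device;
                break;
            }
        }
    }
    return diskToDevice;
}
```

Looking up the user's input is then a find() on the map; a disk number that is not present simply returns end().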
#include <cstdlib>
#include <fstream>
#include <iostream>
#include <sstream>
#include <string>
using namespace std;
int main(int argc, char* argv[])
{
    int wantedDisknum = 4;
    int finalDeviceNum = -1;
    ifstream fin("test.txt");
    if(!fin.is_open())
        return -1;
    string line;
    while(getline(fin, line)) // stops at end of file or on a read error
    {
        stringstream ss(line);
        int deviceNum;
        if(!(ss >> deviceNum))
            continue; // header or separator line: no leading integer
        string unused;
        int diskNum;
        // skip the seven tokens between the device number and the disk number
        if(!(ss >> unused >> unused >> unused >> unused >> unused >> unused >> unused >> diskNum))
            continue; // row does not have the expected layout
        if(diskNum == wantedDisknum)
        {
            finalDeviceNum = deviceNum;
            break;
        }
    }
    fin.close();
    cout << finalDeviceNum << endl;
    system("pause"); // Windows-only: keeps the console window open
    return 0;
}
In UNIX, you can easily achieve this using awk or another scripting language.
cat Device.txt | awk '{if ( $1 == 2 ) print}'
In C++, you have to extract the specific column using strtok and compare it with 'val'; if it matches, print that line.
Assuming there is no "Disk" in any of the following columns:
1) Skip lines until you encounter '-' as the first character of a line, then skip that line too.
2) read a line
2.a) skip characters of the current line until isdigit(line[i]) function returns true, then read current character and characters following it into a temporary buffer until isdigit(line[i]) returns false. This is the device id.
2.b) Skip characters of the current line until you find a 'D'
2.b.i) match 'i', 's', 'k' characters, if any of them fails, go to 2.b
2.c) skip characters of the current line until isdigit(line[i]) function returns true, then read current character and characters following it into another buffer until isdigit(line[i]) returns false. This is the disk id.
3) print out both buffers
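Steps 2.a to 2.c for a single line might look like the sketch below. The names nextDigits and parseRow are hypothetical, std::string::find stands in for the character-by-character match of "Disk" in step 2.b, and skipping the header (step 1) is left to the caller.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Reads a run of digits starting at the first digit at or after pos.
// Advances pos past the run; returns an empty string if no digit is found.
std::string nextDigits(const std::string& line, std::size_t& pos) {
    while (pos < line.size() &&
           !std::isdigit(static_cast<unsigned char>(line[pos])))
        ++pos;
    std::string digits;
    while (pos < line.size() &&
           std::isdigit(static_cast<unsigned char>(line[pos])))
        digits += line[pos++];
    return digits;
}

// Hypothetical helper for one data row: fills deviceId and diskId and
// returns false if the row does not contain both.
bool parseRow(const std::string& line, std::string& deviceId,
              std::string& diskId) {
    std::size_t pos = 0;
    deviceId = nextDigits(line, pos);                 // step 2.a
    if (deviceId.empty()) return false;
    const std::size_t disk = line.find("Disk", pos);  // steps 2.b and 2.b.i
    if (disk == std::string::npos) return false;
    pos = disk + 4;                                   // just past "Disk"
    diskId = nextDigits(line, pos);                   // step 2.c
    return !diskId.empty();
}
```

The cast to unsigned char matters because passing a negative char value to std::isdigit is undefined behaviour.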
I don't have my Regular Expression cheat sheet handy, but I'm pretty sure it would be straightforward to run each line in the file through a regex that:
1) looks for an integer in the line
2) skips whitespace followed by text three times
3) matches characters, one space, and more characters
Boost, Qt, and most other common C++ class libraries have a Regex parser for just this kind of thing.
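Since C++11 the standard library's std::regex can be used instead of Boost or Qt. The sketch below swaps that in and uses a simpler pattern than the steps above; the pattern itself is an assumption (a leading integer, any text, then "Disk" followed by an integer), and matchRow is a hypothetical name.

```cpp
#include <regex>
#include <string>

// Extracts the device and disk numbers from one data row.
// Assumed pattern: leading integer, any text, then "Disk <integer>".
bool matchRow(const std::string& line, int& device, int& disk) {
    static const std::regex row(R"(^\s*(\d+)\s.*\bDisk\s+(\d+)\b)");
    std::smatch m;
    if (!std::regex_search(line, m, row)) return false;
    device = std::stoi(m[1].str());  // first capture group: device number
    disk = std::stoi(m[2].str());    // second capture group: disk number
    return true;
}
```

Header and separator lines do not start with an integer, so they fail to match and can be skipped without any special handling.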

String vectors not working as expected with newline and iterators? (C++)

I have a text file made of 3 lines:
Line 1
Line 3
(Line 1, a blank line, and Line 3)
vector<string> text;
vector<string>::iterator it;
ifstream file("test.txt");
string str;
while (getline(file, str))
{
    if (str.length() == 0)
        str = "\n";
    // since getline discards the newline character, replacing blank strings with newline
    text.push_back(str);
} // while
for (it=text.begin(); it < text.end(); it++)
    cout << (*it);
Prints out:
Line 1
Line 3
I'm not sure why the string with only a newline was not printed out. Any help would be appreciated. Thanks.
Wasn't? Actually, it was! The reason you have a newline after Line 1 is exactly that empty string with newline in it and nothing else. If not for that second line, you'd see Line 1Line 3 as output. (You said it yourself: getline discards newline characters.)
As I understand your intent, you were supposed to write your output loop as follows:
for (it = text.begin(); it < text.end(); it++)
    cout << *it << endl;
That way you add a newline after each string during output. But if so, then you don't need to manually add a \n character to empty strings during reading.
In other words, decide what it is you want to do. At this time it is not clear.
If you want to restore the discarded newline characters during reading, you have to do it for all lines, not just for empty ones.
If you want to add the newline characters during output, you don't need to explicitly push them into the read lines at all.
In fact, it is a rather strange idea to literally push the newline characters into your strings. What for? Since you already read and store your text line-by-line, the newline characters can be implied. I.e. you can do the printing as I do it above (with endl), and everything will look as expected.
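To make the two options concrete, here is a small sketch of each. The function names are hypothetical, and both return the text as a string instead of printing it so that the results are easy to compare.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Option 1: restore the newline that getline discards, for every line.
std::string restoreOnRead(std::istream& in) {
    std::vector<std::string> text;
    std::string str;
    while (std::getline(in, str)) text.push_back(str + '\n');
    std::string out;
    for (const std::string& s : text) out += s;
    return out;
}

// Option 2: store the lines as read and add the newline during output.
std::string addOnOutput(std::istream& in) {
    std::vector<std::string> text;
    std::string str;
    while (std::getline(in, str)) text.push_back(str);
    std::ostringstream out;
    for (const std::string& s : text) out << s << '\n';
    return out.str();
}
```

For input in which every line ends with a newline, both functions reproduce the input exactly, blank lines included.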
I think the simple answer here is that getline() strips the trailing newline whether or not there is content in the string. So the three reads you do are as follows:
"Line 1"
""
"Line 3"
which you transform into:
"Line 1"
"\n"
"Line 3"
which when printed is:
Line 1
Line 3
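I'd use something like this (infile is the already opened input stream; it needs <algorithm>, <iostream>, <iterator>, <string> and <vector>):
std::vector<std::string> text;
std::string str;
while (std::getline(infile, str))
    text.push_back(str);
std::copy(text.begin(), text.end(),
          std::ostream_iterator<std::string>(std::cout, "\n"));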
You're adding complexity that stops your code from working.