Parsing of file with Key value in C/C++ - c++

Need some help in parsing the file
Device# Device Name Serial No. Active Policy Disk# P.B.T.L ALB
Paths
--------------------------------------------------------------------------------------- -------------------------------------
1 AB OPEN-V-CM 50 0BC1F1621 1 SQST Disk 2 3.1.4.0 N/A
2 AB OPEN-V-CM 50 0BC1F1605 1 SQST Disk 3 3.1.4.1 N/A
3 AB OPEN-V*2 50 0BC1F11D4 1 SQST Disk 4 3.1.4.2 N/A
4 AB OPEN-V-CM 50 0BC1F005A 1 SQST Disk 5 3.1.4.3 N/A
The above information is in a devices.txt file, and I want to extract the device number corresponding to the disk number I input.
The disk number I input is just an integer (not "Disk 2" as shown in the file).

Open the file and skip the first 3 lines.
Start reading line by line from the 4th line onward. The device number is easy to get because it is the first column.
To get the disk number, scan each line for space characters. Each run of one or more spaces marks the end of a column, so treat repeated spaces as a single separator and keep counting columns until you reach the disk number. Columns whose data itself contains spaces need separate handling.
Load each disk number and device number into a container such as a map; later you can use your input to look up the device number in that map.
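The map idea could be sketched roughly as follows. This is only an illustration under a few assumptions: the function name buildDiskToDevice is made up, rows are recognised by starting with an integer (instead of skipping exactly 3 lines), and the disk number is taken to be the integer right after the literal token "Disk".

```cpp
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper: builds a disk-number -> device-number lookup.
// Assumes each data row starts with the device number and contains
// the token "Disk" immediately followed by the disk number.
std::map<int, int> buildDiskToDevice(std::istream& in)
{
    std::map<int, int> diskToDevice;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream row(line);
        int device = 0;
        if (!(row >> device))
            continue;                       // header, separator, or blank line

        std::string token;
        int disk = 0;
        while (row >> token) {
            if (token == "Disk" && (row >> disk)) {
                diskToDevice[disk] = device;
                break;
            }
        }
    }
    return diskToDevice;
}
```

After building the map once, a lookup such as diskToDevice.find(wantedDisk) gives the device number for the disk number you input, or end() if that disk is not listed.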

#include <fstream>
#include <iostream>
#include <sstream>
#include <string>
using namespace std;

int main()
{
    int wantedDisknum = 4;
    int finalDeviceNum = -1;
    ifstream fin("devices.txt");
    if (!fin.is_open())
        return -1;

    string line;
    while (getline(fin, line))   // stops cleanly at end of file, unlike while(!fin.eof())
    {
        stringstream ss(line);
        int deviceNum;
        if (!(ss >> deviceNum))
            continue;            // header, separator, or blank line

        // Skip the seven whitespace-separated tokens between the device number
        // and the disk number; the last of them is the word "Disk".
        string unused;
        int diskNum;
        ss >> unused >> unused >> unused >> unused >> unused >> unused >> unused >> diskNum;
        if (ss && diskNum == wantedDisknum)
        {
            finalDeviceNum = deviceNum;
            break;
        }
    }
    cout << finalDeviceNum << endl;
    return 0;
}

On UNIX, you can do this easily with awk or another scripting language.
awk -v disk=2 '$8 == "Disk" && $9 == disk { print $1 }' devices.txt
In C++, you have to extract the specific column yourself, for example with strtok, compare it with the wanted value, and print the device number if it matches.
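A rough sketch of the strtok route, assuming whitespace-separated columns and that the disk number is the token right after "Disk". The function name deviceForDisk and the -1 "not found" return value are my own choices, not part of the question.

```cpp
#include <cstdlib>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: returns the device number (first column) if this row's
// disk number equals wantedDisk, otherwise -1.
int deviceForDisk(const std::string& line, int wantedDisk)
{
    // strtok writes into its input, so work on a mutable, NUL-terminated copy.
    std::vector<char> buf(line.begin(), line.end());
    buf.push_back('\0');

    char* tok = std::strtok(buf.data(), " \t");
    if (tok == nullptr)
        return -1;

    char* end = nullptr;
    long device = std::strtol(tok, &end, 10);
    if (end == tok || *end != '\0')
        return -1;                          // first column is not a plain integer

    while ((tok = std::strtok(nullptr, " \t")) != nullptr) {
        if (std::strcmp(tok, "Disk") == 0) {
            tok = std::strtok(nullptr, " \t");
            if (tok != nullptr && std::atoi(tok) == wantedDisk)
                return static_cast<int>(device);
            return -1;
        }
    }
    return -1;
}
```

Call it once per line read with getline and stop at the first result that is not -1.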

Assuming there is no "Disk" in any of the following columns:
1) Skip lines until you encounter '-' as the first character of a line, then skip that line too.
2) read a line
2.a) skip characters of the current line until isdigit(line[i]) function returns true, then read current character and characters following it into a temporary buffer until isdigit(line[i]) returns false. This is the device id.
2.b) Skip characters of the current line until you find a 'D'
2.b.i) match 'i', 's', 'k' characters, if any of them fails, go to 2.b
2.c) skip characters of the current line until isdigit(line[i]) function returns true, then read current character and characters following it into another buffer until isdigit(line[i]) returns false. This is the disk id.
3) print out both buffers
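Steps 2.a to 2.c could look roughly like this for a single line. It is a sketch, not a drop-in solution: scanRow is a made-up name, skipping the header (step 1) is left to the caller, and std::string::find stands in for matching 'D', 'i', 's', 'k' one character at a time.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: fills deviceId and diskId from one data row.
// Returns false if the row has no leading number or no "Disk <n>" part.
bool scanRow(const std::string& line, std::string& deviceId, std::string& diskId)
{
    const std::size_t n = line.size();
    auto isDigitAt = [&](std::size_t k) {
        return std::isdigit(static_cast<unsigned char>(line[k])) != 0;
    };

    // 2.a) the first run of digits is the device id
    std::size_t i = 0;
    while (i < n && !isDigitAt(i)) ++i;
    std::size_t start = i;
    while (i < n && isDigitAt(i)) ++i;
    deviceId = line.substr(start, i - start);
    if (deviceId.empty())
        return false;

    // 2.b) find the literal "Disk" after the device id
    std::size_t d = line.find("Disk", i);
    if (d == std::string::npos)
        return false;
    i = d + 4;

    // 2.c) the next run of digits is the disk id
    while (i < n && !isDigitAt(i)) ++i;
    start = i;
    while (i < n && isDigitAt(i)) ++i;
    diskId = line.substr(start, i - start);
    return !diskId.empty();
}
```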

I don't have my Regular Expression cheat sheet handy, but I'm pretty sure it would be straightforward to run each line in the file through a regex that:
1) looks for an integer in the line
2) skips whitespace followed by text three times
3) matches characters, then one space, then more characters
Boost, Qt, and most other common C++ class libraries have a Regex parser for just this kind of thing.
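If a C++11 compiler is available, std::regex can be used instead of Boost or Qt. The pattern below is only my guess at a workable expression for the sample rows (an integer at the start of the line, then later the word "Disk" followed by an integer), and matchRow is a made-up name.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: extracts the device number and disk number from a row.
// Assumes the row starts with the device number and contains "Disk <n>".
bool matchRow(const std::string& line, int& device, int& disk)
{
    static const std::regex rowPattern(R"(^\s*(\d+)\s.*\bDisk\s+(\d+)\b)");
    std::smatch m;
    if (!std::regex_search(line, m, rowPattern))
        return false;
    device = std::stoi(m[1].str());
    disk = std::stoi(m[2].str());
    return true;
}
```

Header and separator lines do not start with digits, so they simply fail to match and can be skipped.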

Related

Extraction Operator reaching EOF on istringstream, behavioral change with int/char/string

So I am doing some simple file I/O in c++ and I notice this behaviour, not sure if I am forgetting something about the extraction operator and chars
Note that the file is in Unix format.
ifstream infile("test.txt");
string line;
while(getline(infile, line)){
    istringstream iss(line);
    <type> a;
    for(...){
        iss >> a;
    }
    if(iss.eof())
        cout << "FAIL" << endl;
}
Say the input file test.txt looks like this and the <type> of a is int
($ marks the newline character, as displayed by :set list):
100 100 100$
100 100 100$
What I notice is that after the first line is read, EOF is set.
If the input file is like so, and the <type> of a is char:
a b c$
a b c$
Then the code behaves as expected.
From what I understand about File I/O and the extraction operator, the leading spaces are ignored, and the carriage lands on the character after the input is taken out of the input stringstream iss. So in both cases, at the end of each stringstream the carriage lands on the newline character, and it shouldn't be an EOF.
Changing the <type> of a to string fails in the same way as <type> = int.
BTW, failbit is not set.
At the end:
good = 0
fail = 0
eof = 1
getline has extracted and discarded the newline, so line contains 100 100 100, not 100 100 100$, where $ represents the newline. This means that reading all three tokens from the line with a stringstream and the >> operator can reach the end of the stream, set the EOF flag, and produce the FAIL message.
iss >> a; when a is an int or a string skips all leading whitespace and then keeps extracting until it reaches a character that cannot be part of the value (for a string, any whitespace) or the end of the stream. On the third >> from the stream, the end of the stream stops the extraction and the stream's EOF flag is set.
iss >> a; when a is a char skips all leading whitespace and then extracts exactly one character. In this case the third >> extracts the final character and stops without looking at the end of the stream, so the EOF flag is not set.
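To see the difference in isolation, here is a small sketch. The helper name eofAfterThreeReads is made up; it just performs three extractions from a string that has no trailing newline (as after getline) and reports the EOF flag.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: extracts three values of type T from text and
// reports whether the stream's eofbit is set afterwards.
template <typename T>
bool eofAfterThreeReads(const std::string& text)
{
    std::istringstream iss(text);
    T value;
    for (int i = 0; i < 3; ++i)
        iss >> value;
    return iss.eof();
}
```

eofAfterThreeReads<int>("100 100 100") and eofAfterThreeReads<std::string>("100 100 100") report true, while eofAfterThreeReads<char>("a b c") reports false. If the goal is to detect a failed read, testing iss.fail() rather than iss.eof() avoids the surprise.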

Printing duplicate strings and how many times they appear in a file C++

Here is the question I have to solve and the code I've written so far.
Write a function named printDuplicates that accepts an input stream and an output stream as parameters.
The input stream represents a file containing a series of lines. Your function should examine each line looking for consecutive occurrences of the same token on the same line and print each duplicated token along how many times it appears consecutively.
Non-repeated tokens are not printed. Repetition across multiple lines (such as if a line ends with a given token and the next line starts with the same token) is not considered in this problem.
For example, if the input file contains the following text:
hello how how are you you you you
I I I am Jack's Jack's smirking smirking smirking smirking smirking revenge
bow wow wow yippee yippee yo yippee yippee yay yay yay
one fish two fish red fish blue fish
It's the Muppet Show, wakka wakka wakka
My expected result should be:
how*2 you*4
I*3 Jack's*2 smirking*5
wow*2 yippee*2 yippee*2 yay*3
\n
wakka*3
Here is my function:
1 void printDuplicates(istream& in, ostream& out)
2 {
3 string line; // Variable to store lines in
4 while(getline(in, line)) // While there are lines to get do the following
5 {
6 istringstream iss(line); // String stream initialized with line
7 string word; // Current word
8 string prevWord; // Previous word
9 int numWord = 1; // Starting index for # of a specific word
10 while(iss >> word) // Storing strings in word variable
11 {
12 if (word == prevWord) ++numWord; // If a word and the word
13 // before it are equal add to word counter
14 else if (word != prevWord) // Else if the word and the word before it are not equal
15 {
16 if (numWord > 1) // And there are at least two copies of that word
17 {
18 out << prevWord << "*" << numWord << " "; // Print out "word*occurrences"
19 }
20 numWord = 1; // Reset the num counter variable for next word
21 }
22 prevWord = word; // Set current word to previous word, loop begins again
23 }
24 out << endl; // Prints new line between each iteration of line loop
25 }
26 }
My result thus far is:
how*2
I*3 Jack's*2 smirking*5
wow*2 yippee*2 yippee*2
I have tried adding (|| iss.eof()), (|| iss.peek == EOF), etc inside the nested else if statement on Line 14, but I am unable to figure this guy out. I need some way of knowing I'm at the end of the line so my else if statement will be true and try to print the last word on the line.
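One possible approach, offered as a sketch rather than the definitive fix: instead of trying to detect the end of the line inside the loop, print the pending run once more after the inner while loop finishes. The version below keeps the same function signature; the variable name count replaces numWord, and it leaves a trailing space after each printed token, as the original code does.

```cpp
#include <iostream>
#include <sstream>
#include <string>

void printDuplicates(std::istream& in, std::ostream& out)
{
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream iss(line);
        std::string word, prevWord;
        int count = 0;                      // length of the current run
        while (iss >> word) {
            if (count > 0 && word == prevWord) {
                ++count;
            } else {
                if (count > 1)              // a run just ended: report it
                    out << prevWord << "*" << count << " ";
                prevWord = word;
                count = 1;
            }
        }
        if (count > 1)                      // flush the run that ends the line
            out << prevWord << "*" << count << " ";
        out << '\n';
    }
}
```

The check after the loop is what prints you*4 and yay*3 in the sample input; lines with no repeated tokens still produce an empty output line.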

Extracting numbers from string into an array

I'm having a problem with an assignment. I have to open a text file that looks more or less like this:
-------------------------------------------------------------
|ammount | time |delay |
-------------------------------------------------------------
|100 | 342 | 4324 |
with a few more rows. All I have to do is get the numbers into an array, which, for the example above, would look like this: ar[0]=100, ar[1]=342, ar[2]=4324. I imagine that I need to read the file line by line into strings with getline, but what next? If I use stringstream, I would get |100 instead of just 100. I'm really out of ideas now.
To read one line of input like you described (file may be an ifstream or an istringstream here; numeric_limits needs #include <limits>):
for (int i = 0; i < 3; ++i)
{
file.ignore(numeric_limits<streamsize>::max(), '|'); // Ignores all characters until it finds a '|' character
file >> ar[i]; // Reads the number following the '|' to ar[i]
}
file.ignore(numeric_limits<streamsize>::max(), '\n'); // Finally, ignores all characters until newline
You can even make a small shortcut macro if you want:
#define ignore_until(c) ignore(numeric_limits<streamsize>::max(), c)
and use it like this:
file.ignore_until('|');
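Putting it together for a whole file, here is a minimal sketch that applies the same ignore-then-read idea to each line separately, so header and separator lines are skipped when the numeric read fails. The function name readTable and the fixed column count of 3 are assumptions based on the sample table.

```cpp
#include <istream>
#include <limits>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: collects the three numbers from every "|a | b | c |" row.
std::vector<int> readTable(std::istream& in)
{
    const std::streamsize all = std::numeric_limits<std::streamsize>::max();
    std::vector<int> values;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream row(line);
        int cells[3];
        bool ok = true;
        for (int i = 0; i < 3 && ok; ++i) {
            row.ignore(all, '|');                      // skip up to and including the next '|'
            ok = static_cast<bool>(row >> cells[i]);   // read the number that follows it
        }
        if (ok)
            values.insert(values.end(), cells, cells + 3);
    }
    return values;
}
```

For the sample table this yields 100, 342, 4324 as the first three elements.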

c++ How to extract the whitespace between words if there is one

I've got two questions. I need to write a program that extracts all non-alphabetic characters and displays them, then removes them.
I am using isalpha which is working for symbols, but only if the input string has no spaces like "hello world"
but if it is more than one word like "hello! world!", it will only extract the first exclamation mark but not the second.
Second question, which may be related: I want my program to detect the spaces between the words (I tried isspace but I must have used it wrong), remove them, and put them in a char variable
so for example
if the input is hello4 world! How3 are you today?
I want it to tell me
removed: 4
removed:
removed: !
removed:
removed: 3
removed:
removed:
removed:
long story short, if there is no other way, I'd like to detect spaces as !isalpha, or find something similar to isalpha for space between text.
Thanks
# include <cctype>
# include <iostream>
# include <string>
using namespace std;
int main()
{
    string message;
    cin >> message;
    for (int i = 0; message[i]; i++)
        if(!isalpha(message[i]))
            cout << "deleted following character: " << message[i] <<endl;
        else
            cout <<"All is good! \n";
}
>> reads a single word, stopping when a whitespace character is found. To read a whole line, you want
std::getline(std::cin, message);
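Once the whole line is in the string, spaces are reported like any other non-alphabetic character, because isalpha returns false for them. A small sketch, where the Split struct and the name splitNonAlpha are made up for illustration:

```cpp
#include <cctype>
#include <string>

// Hypothetical result type: the letters that were kept and the characters removed.
struct Split {
    std::string kept;
    std::string removed;
};

// Separates alphabetic characters from everything else (digits, punctuation, spaces).
Split splitNonAlpha(const std::string& message)
{
    Split result;
    for (char c : message) {
        if (std::isalpha(static_cast<unsigned char>(c)))
            result.kept += c;
        else
            result.removed += c;
    }
    return result;
}
```

Read the input with std::getline(std::cin, message), call splitNonAlpha(message), and print "removed: " followed by each character of result.removed; the spaces will show up as removed characters too.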
There is a better way to get the non-alphabetic characters.
You can check the ASCII value of each character against the ranges of the alphabetic characters ('A' to 'Z' and 'a' to 'z'); if it is outside those ranges and is not a space (ASCII 32),
then it is a non-alphabetic character.
You can find all the ASCII codes here: http://www.asciitable.com/
-Jayesh

C++ Reading a text file backwards from the end of each line up until a space

Is it possible to read a text file backwards from the end of each line up until a space? I need to be able to output the numbers at the end of each line. My text file is formatted as follows:
1 | First Person | 123.45
2 | Second Person | 123.45
3 | Third Person | 123.45
So my output would be, 370.35.
Yes. But in your case, it's most likely more efficient to simply read the whole file and parse out the numbers.
You could do something like this (and I'm writing this in pseudocode so you have to actually write the real code, since that's how you learn):
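seek to end of file.
pos = position of the last character
collecting = true    // true while we are still inside the number at the end of a line
while(pos >= 0)
{
    seek to pos and read a char from file.
    if (char == newline)
    {
        collecting = true;    // going backwards, a newline means the previous line's end
        clear string
    }
    else if (collecting)
    {
        if (char == space)
        {
            collecting = false;
            reverse string, convert it to a number and add it to sum.
        }
        else
        {
            add char to string
        }
    }
    pos--
}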
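For the simpler alternative mentioned above (read the file forwards and parse out the numbers), a minimal sketch could look like this. The name sumLastColumn is made up, and it assumes the number is whatever follows the last space on each line.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: sums the number that follows the last space on each line.
double sumLastColumn(std::istream& in)
{
    double sum = 0.0;
    std::string line;
    while (std::getline(in, line)) {
        // Drop trailing blanks and a possible '\r' from Windows line endings.
        std::size_t last = line.find_last_not_of(" \t\r");
        if (last == std::string::npos)
            continue;                       // empty line
        line.erase(last + 1);

        std::size_t space = line.rfind(' ');
        std::string tail = (space == std::string::npos) ? line : line.substr(space + 1);

        std::istringstream cell(tail);
        double value = 0.0;
        if (cell >> value)                  // ignore lines whose tail is not a number
            sum += value;
    }
    return sum;
}
```

Pass it an std::ifstream opened on the text file; for the three sample lines the result is 370.35, up to floating-point rounding.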