How to parse this data of unknown size - c++

I have a simple text file containing one instruction per line, e.g.
A 1 1
B 2 1 A
C 3 1 A
D 4 1 B C
The basic syntax is: Letter, Num, Num, Letter(s).
I just don't know which function I should call to read the data, or how to parse it according to this syntax. I feel like there are so many ways to do it.

The following C++ example shows one possible way to read single characters from a file while detecting the end of each line:
#include <string>
#include <fstream>
#include <sstream>
#include <iostream>
using namespace std;
int main(void)
{
ifstream inpFile("test.txt");
string str;
char c;
// read the file line by line; the loop ends when getline fails
while (getline(inpFile, str)) {
// make string stream for reading small pieces of data
istringstream is(str);
// read data ignoring spaces; the loop ends when extraction fails
while (is >> c) // read a single character
cout << c << " "; // output this character
cout << "[End of line]" << endl;
}
cout << "[End of file]" << endl;
}
Here istringstream is used to process a single line obtained by getline.
After reading a char with is >> c, the value in c can be checked for content, e.g.:
while (is >> c) // after successful reading
{
// analyze the content
if ( isdigit(static_cast<unsigned char>(c)) )
cout << (c - '0') << "(number) "; // output as a digit
else
cout << c << "(char) "; // output as a non-number
}
Note: if the file can contain numbers and words rather than single characters and digits, c should have an appropriate type (e.g. string).
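As a rough sketch of the note above, the same line-then-stringstream approach can collect whole tokens when the variable is a std::string. The function name split_tokens is made up for this illustration and does not appear in the answer.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Split one line into whitespace-separated tokens.
// split_tokens is a hypothetical helper name used only for this sketch.
std::vector<std::string> split_tokens(const std::string& line)
{
    std::istringstream is(line);
    std::vector<std::string> tokens;
    std::string token;
    while (is >> token)          // stops cleanly at the end of the line
        tokens.push_back(token);
    return tokens;
}
```

Each token can then be inspected (digit or letter) in the same way as the single characters above.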

In C++, read an entire line and make a stream from it, then read from that stream with >>.
Example:
std::ifstream file(filename);
std::string line;
while (std::getline(file, line))
{
std::istringstream in(line);
char letter;
int number1;
int number2;
std::vector<char> letters;
if (in >> letter >> number1 >> number2)
{
char letter2;
while (in >> letter2)
{
letters.push_back(letter2);
}
}
}
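A self-contained sketch of the same idea, assuming the "Letter Num Num Letter(s)" syntax from the question. The Instruction struct, its field names and parse_instruction are invented for this example; the question does not say what the fields mean.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record type; the field names are illustrative only.
struct Instruction {
    char name = '\0';
    int first = 0;
    int second = 0;
    std::vector<char> deps;   // the optional trailing letters
};

// Parses "Letter Num Num Letter(s)". Returns false if any of the three
// mandatory fields is missing or malformed; `out` is untouched in that case.
bool parse_instruction(const std::string& line, Instruction& out)
{
    std::istringstream in(line);
    Instruction tmp;
    if (!(in >> tmp.name >> tmp.first >> tmp.second))
        return false;
    char dep;
    while (in >> dep)          // zero or more trailing letters
        tmp.deps.push_back(dep);
    out = tmp;
    return true;
}
```

Calling this once per line read with std::getline gives one record per instruction, and a false return identifies a malformed line without disturbing the file stream.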

This is a C example that reads lines and then walks a pointer from the beginning of each line, printing the printable characters (those with a code greater than 32):
#include <stdio.h>
#include <ctype.h>
#define MAX_LINE_LEN 80
int main(void)
{
FILE * inpFile = fopen("test.txt", "r");
char buf[MAX_LINE_LEN];
char *p;
// read lines until fgets reports end of file or an error
while (fgets(buf, MAX_LINE_LEN, inpFile) != NULL)
{
p = buf; // start from the beginning of line
// reading data from string till the end
while (*p != '\n' && *p != '\0')
{
// skip spaces
while (isspace((unsigned char)*p) && *p != '\n') p++;
if (*p > 32)
{
// output character
printf("%c ", *p);
// move to next
p++;
}
else if (*p != '\n' && *p != '\0')
{
p++; // skip any other non-printable byte so the loop cannot stall
}
}
printf("[End of line]\n");
}
printf("[End of file]\n");
return 0;
}
To extract numbers and words from the line you can do something like:
// reading data from string till the end
while (*p != '\n' && *p != '\0')
{
// skip spaces
while (isspace(*p) && *p != '\n') p++;
if (*p > 32)
{
int num;
char word[MAX_LINE_LEN];
// trying to read number
if (sscanf(p, "%i", &num) == 1) // sscanf returns the number of converted items
{
printf("%i(number) ", num);
}
else // read string
{
sscanf(p, "%s", word);
printf("%s(string) ", word);
}
// move to next space in the simplest way
while (*p > 32) p++;
}
}
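Note that sscanf with "%i" accepts a token such as "12abc" as the number 12. If that matters, a stricter check is possible with strtol and its end pointer. This is only a sketch in C++; token_to_long is a hypothetical helper, it assumes the token has already been cut out of the line, and it does not check for overflow.

```cpp
#include <cstdlib>
#include <string>

// Returns true and stores the value if the whole token is an integer.
// Base 0 mirrors "%i": decimal, 0x-prefixed hex, or 0-prefixed octal.
bool token_to_long(const std::string& token, long& value)
{
    if (token.empty())
        return false;
    char* end = nullptr;
    long v = std::strtol(token.c_str(), &end, 0);
    if (end == token.c_str() || *end != '\0')
        return false;   // no digits at all, or trailing junk such as "12abc"
    value = v;
    return true;
}
```

Anything for which this returns false can then be treated as a word.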

Related

Odd Cout Behavior

I'm sorry for the initial post. This is tested and reproducible.
I'm trying to get cout to work inside an fstream while loop that inspects each character as it parses, but it's behaving oddly: the text is overwritten by the first variable that I send to cout.
main.cxx
#include <string.h>
#include <fstream>
#include <iostream>
#include <stdio.h>
using std::string;
using std::fstream;
using std::noskipws;
using std::cout;
int main(int argc, const char *argv[]){
char pipe;
string word; // stores the word of characters it's working on at the moment
string filename = "Directory.dat";
int type = 0; // 2 types counter, starts at 0
int newindicator = 0; // for detecting a new * for the data set
fstream fin(filename.c_str(), fstream::in);
while(fin >> noskipws >> pipe){
if(pipe == '*'){ // if the character is an asterisk
type++;
newindicator = 0;
word.clear();
}else if (pipe == '\n'){ // if the character is next line
if(newindicator == 0){ // tells the reader to know that it just finished reading a *, so it doesn't print anything.
newindicator = 1;
}else {
if(type == 1){
cout << "new word as: ";
cout << word << "\n";
}else if (type == 2){
cout << "new word as: ";
cout << word << "\n";
}
word.clear(); // clears the word string as it's reading the next line.
}
}else{
word+=pipe;
}
}
return 0;
}
Directory.dat
*
Chan
Johnathan
Joespeh
*
Betty
Lady Gaga
Output
Chanword as:
new word as: Johnathan
new word as: Joespeh
Bettyord as:
new word as: Lady Gaga
Note how "Chan" overwrites the characters "new " on the first line, while the following lines are fine. This happens at the start of every new type, i.e. whenever a new set begins. The same goes for Betty in the next set, which overwrites "new w" on that cout.
Any feedback would be much appreciated. Thank you!
I suspect your input file has Windows line endings. These contain the carriage return character that's handled differently on Unix.
https://superuser.com/questions/374028/how-are-n-and-r-handled-differently-on-linux-and-windows
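In the posted program the '\r' that follows each '*' is appended to word and is not cleared before the next name, so the first name of each set is printed as "\rChan\r"; a terminal typically moves the cursor back to the start of the line on '\r', which is why the name lands on top of "new ". One common way to cope with both kinds of line ending is to read whole lines and drop a trailing '\r'. A minimal sketch, with split_lines as an invented name:

```cpp
#include <sstream>
#include <string>
#include <vector>

// Splits text into lines, accepting both "\n" and "\r\n" endings.
// split_lines is a hypothetical helper, not part of the original program.
std::vector<std::string> split_lines(const std::string& text)
{
    std::istringstream in(text);
    std::vector<std::string> lines;
    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty() && line.back() == '\r')
            line.pop_back();          // drop the carriage return
        lines.push_back(line);
    }
    return lines;
}
```

The same two-line check works directly after std::getline on an fstream.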
Thank you all for the comments and feedback. Made the changes as suggested:
Corrected
#include <string.h>
#include <fstream>
#include <iostream>
#include <stdio.h>
using std::string;
using std::fstream;
using std::noskipws;
using std::cout;
int main(int argc, const char *argv[]){
char pipe;
string word; // stores the word of characters it's working on at the moment
string filename = "Directory.dat";
int type = 0; // 2 types counter, starts at 0
int newindicator = 0; // for detecting a new * for the data set
fstream fin(filename.c_str(), fstream::in);
while(fin >> noskipws >> pipe){
if(pipe == '*'){ // if the character is an asterisk
type++;
newindicator = 0;
word.clear();
}else if (pipe == '\n'){ // if the character is next line
if(newindicator == 0){ // tells the reader to know that it just finished reading a *, so it doesn't print anything.
newindicator = 1;
}else {
if(type == 1){
cout << "new word as: ";
cout << word << "\n";
}else if (type == 2){
cout << "new word as: ";
cout << word << "\n";
}
word.clear(); // clears the word string as it's reading the next line.
}
}else{
if (pipe != '\r'){
word+=pipe;
}
}
}
return 0;
}
Output
new word as: Chan
new word as: Johnathan
new word as: Joespeh
new word as: Betty
new word as: Lady Gaga

Error implementing Caesar Cipher Using C++

I am trying to implement Caesar Cipher using C++. The directions are to use this file which is already encrypted:
5
Asi ymj rtrjwfymjx tzylwfgj.
Aqq rnrxd bjwj ymj gtwtlwtajx
Dni ldwj fsi ldrgqj ns ymj bfgj.
Tbfx gwnqqnl fsi ymj xnymjd ytajx
The number 5 represents the shift that is applied to the text. I have to decode the Caesar-ciphered text and reverse the order of the lines, i.e. put line 4 in line 1's position and line 3 in line 2's. The first letter of each line (the uppercase letters) does not need to be decoded.
The text should look like this after running the program:
Twas brillig and the sithey toves
Did gyre and gymble in the wabe.
All mimsy were the borogroves
And the momerathes outgrabe.
As of right now, I have this code:
#include <iostream>
#include <vector>
#include <string>
#include <fstream>
using namespace std;
char decipher (char c, int shift);
int main(){
//declare variables
char c;
string deciphered = "";
int shift;
vector <string> lines;
//ask for filename and if not found, keep trying
ifstream inFile;
string filename;
cout << "What is the name of the file? ";
cin >> filename;
inFile.open(filename);
while (!inFile){
cout << "File not found. Try again: ";
cin >> filename;
inFile.open(filename);
}
//find shift from file
inFile >> shift;
//get lines from file
inFile >> noskipws;
while (inFile >> c){
char decipheredChar = decipher (c, shift);
deciphered += decipheredChar;
}
cout << deciphered;
}
char decipher (char c, int shift){
string letters = "abcdefghijklmnopqrstuvwxyz";
if (c == 'T'){
return c;
}
else if (c == 'D'){
return c;
}
else if (c == 'A'){
return c;
}
else if (c == ' '){
return c;
}
else {
int currentPosition = letters.find(c);
int shiftedPosition = currentPosition - shift;
if (shiftedPosition < 0){
shiftedPosition = 26 + shiftedPosition;
}
char shifted = letters[shiftedPosition];
return shifted;
}
}
The result I'm getting is this:
uAnd the momerathes outgrabeuuAll mimsy were the borogrovesuDid gyre and gymble in the wabeuuTwas brillig and the sithey tovesu
How do I get rid of the u's and also separate the words by line? I have an idea of reversing the lines using a vector and using a loop counting backwards but I'm not sure how to get to there yet. Please help. Thank you.
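To answer your question, the 'u's are the newlines (and the periods). You read them in and decipher them like any other character; since letters.find() does not find them, the position it returns is not a valid index, and after the shift and wrap-around the character pulled from letters happens to be 'u'. You should be able to add another case to decipher() to leave newlines alone: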
char decipher (char c, int shift){
string letters = "abcdefghijklmnopqrstuvwxyz";
if(c == '\n'){ // do not modify new lines.
return c;
}
else if (c == 'T'){
return c;
}
// ...
}
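Rather than listing every character to skip, a possible alternative is to shift only lowercase letters and return everything else unchanged. This is a sketch under the assumption that the text is plain ASCII; decipher_lower is a made-up name so it does not clash with the decipher() above.

```cpp
#include <cctype>

// Shifts lowercase ASCII letters back by `shift`; everything else
// (uppercase, spaces, punctuation, '\n') is returned unchanged.
char decipher_lower(char c, int shift)
{
    if (!std::islower(static_cast<unsigned char>(c)))
        return c;
    int pos = (c - 'a' - shift) % 26;
    if (pos < 0)
        pos += 26;                      // wrap around below 'a'
    return static_cast<char>('a' + pos);
}
```

With this version the periods and newlines pass through untouched, so no special cases for 'T', 'D', 'A' or ' ' are needed.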
Probably the cleanest way to reverse the lines is to split them as you read the characters. You can then pop them from the vector in reverse order. A working (but not robust) example would be to extend your while loop as follows:
while (inFile >> c){
char decipheredChar = decipher (c, shift);
deciphered += decipheredChar;
if(decipheredChar=='\n'){ //if full line
lines.push_back(deciphered); //push line
deciphered = ""; //start fresh for next line
}
}
lines.push_back(deciphered+'\n'); //push final line (if no newline)
while(!lines.empty()){
cout << lines.back(); //prints last line
lines.pop_back(); //removes last line
}
I say not robust because there are minor things you may still need to watch out for. For instance, this stores the newline that follows the 5 as a line of its own, and if the file ends in a newline, an extra empty line is pushed at the end. I'll leave those minor details for you to clear up.
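One way those details might be handled is to read whole lines with std::getline after skipping the rest of the shift line. The sketch below is not the approach from the answer above; decode_reversed is an invented name, and it assumes lowercase ASCII letters are the only characters to shift.

```cpp
#include <algorithm>
#include <cctype>
#include <istream>
#include <limits>
#include <sstream>   // std::istringstream, handy for trying this in memory
#include <string>
#include <vector>

// Reads the shift, decodes every following line, and returns the lines
// in reverse order. Returns an empty vector if no shift can be read.
std::vector<std::string> decode_reversed(std::istream& in)
{
    std::vector<std::string> lines;
    int shift = 0;
    if (!(in >> shift))
        return lines;
    // Skip the rest of the shift line so it is not stored as an empty line.
    in.ignore(std::numeric_limits<std::streamsize>::max(), '\n');

    std::string line;
    while (std::getline(in, line)) {    // getline drops the '\n' itself
        for (char& c : line) {
            if (std::islower(static_cast<unsigned char>(c))) {
                int pos = (c - 'a' - shift) % 26;
                if (pos < 0) pos += 26;
                c = static_cast<char>('a' + pos);
            }
        }
        lines.push_back(line);
    }
    std::reverse(lines.begin(), lines.end());
    return lines;
}
```

It can be fed an std::ifstream, or an std::istringstream for a quick check, and the caller prints each returned line followed by '\n'.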

How do you parse a c-string?

Hi I'm trying to take a c-string from a user, input it into a queue, parse the data with a single space depending on its contents, and output the kind of data it is (int, float, word NOT string).
E.g. Bobby Joe is 12 in 3.5 months \n
Word: Bobby
Word: Joe
Word: is
Integer: 12
Word: in
Float: 3.5
Word: months
Here's my code so far:
int main()
{
const int maxSize = 100;
char cstring[maxSize];
std::cout << "\nPlease enter a string: ";
std::cin.getline(cstring, maxSize, '\n');
//Keyboard Buffer Function
buffer::keyboard_parser(cstring);
return EXIT_SUCCESS;
}
Function:
#include <queue>
#include <string>
#include <cstring>
#include <iostream>
#include <cstdlib>
#include <vector>
namespace buffer
{
std::string keyboard_parser(char* input)
{
//Declare Queue
std::queue<std::string> myQueue;
//Declare String
std::string str;
//Declare iStringStream
std::istringstream isstr(input);
//While Loop to Read iStringStream to Queue
while(isstr >> str)
{
//Push onto Queue
myQueue.push(str);
std::string foundDataType = " ";
//Determine if Int, Float, or Word
for(int index = 0; index < str.length(); index++)
{
if(str[index] >= '0' && str[index] <= '9')
{
foundDataType = "Integer";
}
else if(str[index] >= '0' && str[index] <= '9' || str[index] == '.')
{
foundDataType = "Float";
break;
}
else if(!(str[index] >= '0' && str[index] <= '9'))
{
foundDataType = "Word";
}
}
std::cout << "\n" << foundDataType << ": " << myQueue.front();
std::cout << "\n";
//Pop Off of Queue
myQueue.pop();
}
}
}
Right now with this code, it doesn't hit the cout statement, it dumps the core.
I've read about using the find member function and the substr member function, but I'm unsure of how exactly I need to implement it.
Note: This is homework.
Thanks in advance!
UPDATE: Okay everything seems to work! Fixed the float and integer issue with a break statement. Thanks to everyone for all the help!
Your queue is sensible: it contains std::strings. Unfortunately, each of those is initialised by you passing cstring in without any length information and, since you certainly aren't null-terminating the C-strings (in fact, you're going one-off-the-end of each one), that's seriously asking for trouble.
Read directly into a std::string.
std::istreams are very useful for parsing text in C++... often with an initial read of a line from a stream, then further parsing from a std::istringstream constructed with the line content.
const char* token_type(const std::string& token)
{
// if I was really doing this, I'd use templates to avoid near-identical code
// but this is an easier-to-understand starting point...
{
std::istringstream iss(token);
int i;
char c;
if (iss >> i && !(iss >> c)) return "Integer";
}
{
std::istringstream iss(token);
float f;
char c; // used to check there's no trailing characters that aren't part
// of the float value... e.g. "1Q" is not a float (rather, "word").
if (iss >> f && !(iss >> c)) return "Float";
}
return "Word";
}
const int maxSize = 100; // an array bound must be a compile-time constant in standard C++
char cstring[maxSize];
std::deque<std::string> token_queue; // needs #include <deque>; a std::queue cannot be iterated
std::cout << "\nPlease enter a string: ";
if (std::cin.getline(cstring, maxSize, '\n'))
{
std::istringstream iss(cstring);
std::string token;
while (iss >> token) // by default, streaming into std::string takes a space-...
token_queue.push_back(token); // ...separated word at a time
for (std::deque<std::string>::const_iterator i = token_queue.begin();
i != token_queue.end(); ++i)
std::cout << token_type(*i) << ": " << *i << '\n';
}
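The comment in token_type mentions templates as a way to avoid the near-identical blocks. A possible sketch of that idea follows; parses_as and token_type_t are names made up here, and the behaviour is meant to match the two hand-written blocks above.

```cpp
#include <sstream>
#include <string>

// True if the whole token can be read as a T with nothing left over.
template <typename T>
bool parses_as(const std::string& token)
{
    std::istringstream iss(token);
    T value;
    char extra;   // used to detect trailing characters, e.g. the Q in "1Q"
    return (iss >> value) && !(iss >> extra);
}

// Same classification as token_type, built on the template.
const char* token_type_t(const std::string& token)
{
    if (parses_as<int>(token))   return "Integer";
    if (parses_as<float>(token)) return "Float";
    return "Word";
}
```

The order matters: "12" also parses as a float, so the integer check has to come first.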

What is simple way to read in randomly placed characters line by line?(C++)

It should ignore spaces and read in 2 characters.
My code to read it:
#include <iostream>
using namespace std ;
int main(){
char current_char1 ;
char current_char2 ;
//input comes from stdin
while(getc() != '\0'){
current_char1 = getc() ;
current_char2 = getc() ;
}
}
Can you show simpler way to do it?
To read two numbers from a single line, no matter the number of spaces, this will be fine:
std::string line;
std::getline(std::cin, line);
std::istringstream iss(line);
int a, b;
iss >> std::hex >> a >> b;
std::cout << "First value is " << a << ", the second value is " << b << '\n';
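The snippet above prints a and b even when the extraction fails. A hedged variant that reports failure instead is sketched below; read_two_hex is an invented name, and it uses unsigned values rather than the int used above.

```cpp
#include <sstream>
#include <string>

// Reads exactly two hexadecimal values from a line. Returns false on a
// missing value, malformed digits, or trailing non-space characters.
bool read_two_hex(const std::string& line, unsigned& a, unsigned& b)
{
    std::istringstream iss(line);
    unsigned x = 0, y = 0;
    if (!(iss >> std::hex >> x >> y))
        return false;
    char extra;
    if (iss >> extra)     // anything but whitespace after the second value
        return false;
    a = x;
    b = y;
    return true;
}
```

The outputs are only written on success, so the caller can keep its previous values when a line is rejected.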
You are writing C++, but nonetheless, many tasks are easier with stdio.h than iostream, and this is one of them.
#include <ctype.h>
#include <stdio.h>
#include <string>
using std::string;
// Returns the next two non-`isspace` characters as a std::string.
// If there aren't that many before EOF, returns all that there are.
// Note: line boundaries ('\n') are ignored just like all other whitespace.
string
read_next_two_nonspace(FILE *fp)
{
string s;
int c;
do c = getc(fp);
while (c != EOF && isspace(c));
if (c != EOF) s.append(1, c);
do c = getc(fp);
while (c != EOF && isspace(c));
if (c != EOF) s.append(1, c);
return s;
}
EDIT: If what you actually want is to read two hexadecimal numbers from a line-oriented file that is supposed to have two such numbers per line and may have random amounts of whitespace around them, then this is my preferred method. Joachim's method is shorter, but less reliable; in particular, iostreams cannot be used safely for numeric input (!) owing to their being defined in terms of scanf, which provokes undefined behavior (!!) upon numeric overflow. This is more code but handles arbitrarily malformed input. Again, note free mixing of C++ and C library facilities -- there is no reason not to use the older library if it does what you need, as it does in this case.
Preamble:
#include <iostream> // std::istream and std::cerr
#include <stdexcept>
#include <string>
#include <vector>
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>
using std::getline;
using std::invalid_argument;
using std::istream;
using std::vector;
using std::string;
struct two_numbers { unsigned long a; unsigned long b; };
Parser:
#define isspace_or_eol(c) (isspace(c) || (c) == '\0')
// Parse a line consisting of two hexadecimal numbers, separated and surrounded
// by arbitrary whitespace.
two_numbers
parse_line(string const &s)
{
const char *p;
char *endp;
two_numbers val;
// N.B. strtoul skips *leading* whitespace.
errno = 0;
p = s.c_str();
val.a = strtoul(p, &endp, 16);
if (endp == p || !isspace_or_eol(*endp) || errno)
throw invalid_argument("first number is missing, malformed, or too large");
p = endp;
val.b = strtoul(p, &endp, 16);
if (endp == p || !isspace_or_eol(*endp) || errno)
throw invalid_argument("second number is missing, malformed, or too large");
// Verify only whitespace after the second number.
p = endp;
while (isspace(*p)) p++;
if (*p != '\0')
throw invalid_argument("junk on line after second number");
return val;
}
Example usage:
vector<two_numbers>
read_file(istream &fp, const char *fname)
{
string line;
unsigned int lineno = 0;
vector<two_numbers> contents;
bool erred = false;
while (getline(fp, line))
{
lineno++;
try
{
contents.push_back(parse_line(line));
}
catch (invalid_argument &e)
{
std::cerr << fname << ':' << lineno << ": parse error: "
<< e.what() << '\n';
erred = true;
}
}
if (erred)
throw invalid_argument("parse errors in file");
return contents;
}
You can try the code below, a small modification based on Zack's
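#include "stdafx.h"
#include <ctype.h>
#include <stdio.h>
#include <iostream>
#include <string>
using namespace std;
bool read(FILE * fp, string * ps)
{
int c = fgetc(fp); // int, so that EOF can be told apart from a valid character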
if(c == EOF)
return false;
if(isspace(c))
{
ps->clear();
return read(fp, ps);
}
if(isdigit(c))
{
ps->append(1, c);
if(ps->length() == 2)
return true;
}
return read(fp, ps);
}
int _tmain(int argc, _TCHAR* argv[])
{
FILE * file;
fopen_s(&file, "YOURFILE", "r");
string s;
while(read(file, &s))
{
cout<<s.c_str()<<endl;
s.clear();
}
return 0;
}

Swapping chars in a file, multiple times

I have code that works but only once. I need an input char a to be swapped with an input char b. The first time through the loop, it swaps the two selected chars fine, but on the second and following iterations it does nothing but keep the outFile the same. How can I swap more than two chars until I want to stop?
ifstream inFile("decrypted.txt");
ofstream outFile("swapped.txt");
const char exist = 'n';
char n = '\0';
char a = 0;
char b = 0;
cout<<"\nDo u want to swap letters? press <n> to keep letters or any button to continue:\n"<<endl;
cin>>n;
while (n != exist)
{
cout<<"\nWhat is the letter you want to swap?\n"<<endl;
cin>>a;
cout<<"\nWhat is the letter you want to swap it with?\n"<<endl;
cin>>b;
if (inFile.is_open())
{
while (inFile.good())
{
inFile.get(c);
if( c == b )
{
outFile<< a;
}
else if (c == a)
{
outFile<< b;
}
else
{
outFile<< c;
}
}
}
else
{
cout<<"Please run the decrypt."<<endl;
}
cout<<"\nAnother letter? <n> to stop swapping\n"<<endl;
cin>>n;
}
Consider a different approach.
Collect all the character swaps in a lookup table. By default translate['a'] == 'a', the input character is the same as the output character. To swap a with z just set translate['a'] = 'z' and translate['z'] = 'a'.
Then perform a single pass over the file, copying and translating at the same time.
#include <algorithm>
#include <array>
#include <fstream>
#include <iostream>
#include <iterator>
#include <numeric>
#include <utility>
int main()
{
std::array<char,256> translate;
std::iota(translate.begin(), translate.end(), 0); // identity function
for (;;)
{
char a, b;
std::cout << "\nEnter ~ to end input and translate file\n";
std::cout << "What is the letter you want to swap? ";
std::cin >> a;
if (a == '~') break;
std::cout << "What is the letter you want to swap it with? ";
std::cin >> b;
if (b == '~') break;
std::swap(translate[static_cast<unsigned char>(a)], translate[static_cast<unsigned char>(b)]); // update translation table
}
std::ifstream infile("decrypted.txt");
std::ofstream outfile("swapped.txt");
if (infile && outfile)
{
std::istreambuf_iterator<char> input(infile), eof;
std::ostreambuf_iterator<char> output(outfile);
// this does the actual file copying and translation
std::transform(input, eof, output, [&](char c){ return translate[static_cast<unsigned char>(c)]; });
}
}
You have already read the entire file, so the stream is at end-of-file and later passes read nothing and write nothing. To read it again, clear the stream's error state with clear() and seek back to the beginning with seekg(0), or simply close and re-open the files.
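A small sketch of the rewind approach, written against std::istream so it applies to an std::ifstream as well. The helper names rewind_stream and read_all are invented for this example.

```cpp
#include <istream>
#include <sstream>   // std::istringstream, handy for trying this in memory
#include <string>

// Rewinds an input stream that has already been read to the end.
void rewind_stream(std::istream& in)
{
    in.clear();                  // reset eofbit/failbit left by the last read
    in.seekg(0, std::ios::beg);  // move the read position back to the start
}

// Reads every character from the current position to the end.
std::string read_all(std::istream& in)
{
    std::string text;
    char c;
    while (in.get(c))
        text += c;
    return text;
}
```

In the question's loop the output file would also need attention: an ofstream that stays open keeps writing at its current position, so a second pass would append to swapped.txt rather than replace its contents.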