I'm trying to read information from a file and process it in a certain way. I need to build an array of all the words at the far left of each line that have no whitespace in front of them. However, I keep getting really odd output when I try to display the contents of that char array.
Here is the sample input:
# Sample Input
LA 1,3
LA 2,1
TOP NOP
ADDR 3,1
ST 3, VAL
CMPR 3,4
JNE TOP
P_INT 1,VAL
P_REGS
HALT
VAL INT 0
TAN LA 2,1
So for instance when I run my program, my output should be:
TOP
VAL
TAN
Instead I'm getting:
a
aTOP
aVAL
aTAN
a
a
I'm not sure why this is happening. Any minor changes I make don't actually help, they just change what's in front of my expected output. Sometimes it's ASCII value 0 or 20 characters. Hopefully someone can help me fix this because it's driving me crazy.
Here's my code:
#include <string>
#include <iostream>
#include <cstdlib>
#include <string.h>
#include <fstream>
#include <stdio.h>
using namespace std;
int main(int argc, char *argv[])
{
// If no extra file is provided then exit the program with error message
if (argc <= 1)
{
cout << "Correct Usage: " << argv[0] << " <Filename>" << endl;
exit (1);
}
// Array to hold the registers and initialize them all to zero
int registers [] = {0,0,0,0,0,0,0,0};
string memory [16000];
string symTbl [1000][1000];
char line[100];
char label [9];
char opcode[9];
char arg1[256];
char arg2[256];
char* pch;
// Open the file that was input on the command line
ifstream myFile;
myFile.open(argv[1]);
if (!myFile.is_open())
{
cerr << "Cannot open the file." << endl;
}
int counter = 0;
int i = 0;
while (myFile.good())
{
myFile.getline(line, 100, '\n');
// If the line begins with a #, then just get the next line
if (line[0] == '#')
{
continue;
}
// If there is a label, then this code will run
if ( line[0] != '\t' && line[0]!=' ')
{
if( pch = strtok(line-1," \t"));
{
strcpy(label,pch);
cout << label << endl;
}
if (pch = strtok(NULL, " \t"))
{
strcpy(opcode,pch);
}
if (pch = strtok(NULL, " \t,"))
{
strcpy(arg1,pch);
}
if (pch = strtok(NULL, ","))
{
strcpy(arg2, pch);
}
}
}
return 0;
}
You are passing line-1 to strtok, which will cause it to return a pointer to the character before the start of the string; accessing line[-1] will produce undefined behaviour. strtok takes a pointer to the start of a string.
You've also got a ; at the end of your if( pch = strtok(line-1," \t")) statement, which nullifies the if test and causes the block to run even if pch is NULL.
You have a bug here: strtok(line-1," \t")
line-1 is the address of line[-1]. It's an invalid address and using it produces undefined behavior.
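Putting both fixes together, the label extraction could look roughly like the sketch below. It is only an illustration under stated assumptions: collectLabels is a hypothetical helper name, and the input is assumed to arrive through any std::istream rather than the asker's ifstream. The points that matter are that strtok receives the start of a writable buffer and that no stray ; follows the if.

```cpp
#include <cstring>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: collect the label of every line that starts in column 0.
// Lines that are empty or begin with '#', a space, or a tab carry no label.
std::vector<std::string> collectLabels(std::istream& in)
{
    std::vector<std::string> labels;
    std::string text;
    while (std::getline(in, text))
    {
        if (text.empty() || text[0] == '#' || text[0] == ' ' || text[0] == '\t')
            continue;

        // strtok modifies its argument, so give it a writable, NUL-terminated copy.
        std::vector<char> line(text.begin(), text.end());
        line.push_back('\0');

        // Pass the start of the buffer (not line - 1), with no ';' after the if.
        if (char* pch = std::strtok(line.data(), " \t"))
            labels.push_back(pch);
    }
    return labels;
}
```

Called on a stream holding the sample input (with the unlabeled lines indented), this should yield TOP, VAL and TAN, with no stray character in front of them.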
I have a file of data like this:
Judy Henn 2 Oaklyn Road Saturday 2001
Norman Malnark 15 Manor Drive Saturday 2500
Rita Fish 210 Sunbury Road Friday 750
I need to assign the first 20 characters as the name, next 20 as address, next 10 as day, and the number as yardSize, using the istream::get() method. My professor is requiring the use of .get() to accomplish this.
I am having a really hard time figuring out how to assign the data from the file to the right variables while still looping.
struct Customer{
char name[21];
char address[21];
char day[11];
int yardSize;
};
int main(){
const int arrSize = 50;
Customer custArr[arrSize];
int i = 0;
//set up file
ifstream dataFile;
dataFile.open("Data.txt");
//try to open file
if(!dataFile){
cout << "couldn't open file";
}
//while dataFile hasn't ended
while(!dataFile.eof()){
dataFile.get(custArr[i].name, 21);
cout << custArr[i].name;
i++;
}
}; //end
I would have thought that the while loop would assign the first 21 characters into custArr[i].name, then loop over and over until the end of file. However, when I print out custArr[i].name, I get this and ONLY this:
Judy Henn 2 Oaklyn Road Saturday 2001
I'm not sure how to go about assigning a specified number of characters to a variable, while still iterating through the entire file.
First off, the character counts you mentioned don't match the data file you have shown. There are only 19 characters available for the name, not 20. And only 9 characters available for the day, not 10.
After fixing that, your code is still broken, as it is reading only into the Customer::name field. So it will try to read Judy Henn into custArr[0].name, then 2 Oaklyn Road into custArr[1].name, then Saturday into custArr[2].name, and so on.
I would suggest something more like this instead:
#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
using namespace std;
struct Customer
{
char name[21];
char address[21];
char day[11];
int yardSize;
};
int main()
{
const int arrSize = 50;
Customer custArr[arrSize];
string line;
int i = 0;
//set up file
ifstream dataFile("Data.txt");
if (!dataFile)
{
cout << "couldn't open file";
return 0;
}
//while dataFile hasn't ended
while ((i < arrSize) && getline(dataFile, line))
{
istringstream iss(line);
if (iss.get(custArr[i].name, 21) &&
iss.get(custArr[i].address, 21) &&
iss.get(custArr[i].day, 11) &&
iss >> custArr[i].yardSize)
{
cout << custArr[i].name;
++i;
}
}
return 0;
}
Reading fixed-width (mainframe type) records isn't something C++ was written to do specifically. While C++ provides a wealth of string manipulation functions, reading fixed-width records is still something you have to put together yourself using basic I/O functions.
In addition to the great answer by @RemyLebeau, a similar approach that uses std::vector<Customer> instead of an array of customers eliminates bounds concerns. With a std::vector, you can adapt the code to read as many records as needed (up to the limits of your physical memory) without the fear of writing past an array bound.
Additionally, as currently written, you leave the leading and trailing whitespace in each array. For example, your name array would hold " Judy Henn " instead of just "Judy Henn". You will generally want to trim leading and trailing whitespace from what you store. Otherwise, you will need some way to deal with the whitespace every time the contents are used. While std::string provides a number of methods you can use to trim leading and trailing whitespace, your use of plain old char[] requires manual removal.
Adding code to trim the excess leading and trailing whitespace from the character arrays in the collection of Customer could be written as follows.
#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>
#include <cstring>
#define NAMLEN 20 /* if you need a constant, #define one (or more) */
#define ADDRLEN 21 /* (these marking the fixed-widths of the fields) */
#define DAYLEN 10
struct Customer {
char name[21];
char address[21];
char day[11];
int yardSize;
};
int main (int argc, char **argv) {
if (argc < 2) { /* validate at least one argument given for filename */
std::cerr << "error: insufficient no. of arguments\n"
"usage: " << argv[0] << " <filename>\n";
return 1;
}
std::string line {}; /* string to hold each line read from file */
std::vector<Customer> customers {}; /* vector of Customer struct */
std::ifstream f (argv[1]); /* file stream (filename in 1st arg) */
if (!f.is_open()) { /* validate file open for reading */
std::cerr << "error: file open failed '" << argv[1] << "'.\n"
<< "usage: " << argv[0] << " <filename>\n";
return 1;
}
while (getline (f, line)) { /* read each line into line */
std::stringstream ss (line); /* create stringstream from line */
Customer tmp {}; /* declare temporary instance */
char *p; /* pointer to trim leading ws from name */
size_t wslen; /* whitespace len to use in trim */
ss.get (tmp.name, NAMLEN); /* read up to NAMLEN chars from ss */
if (ss.gcount() != NAMLEN - 1) { /* validate gcount()-1 chars read */
std::cerr << "error: invalid format for name.\n";
continue;
}
for (int i = NAMLEN - 2; tmp.name[i] == ' '; i--) /* loop from end of name */
tmp.name[i] = 0; /* overwrite spaces with nul-char */
for (p = tmp.name; *p == ' '; p++) {} /* count leading spaces */
wslen = strlen (p); /* get remaining length */
memmove (tmp.name, p, wslen + 1); /* move name to front of array */
ss.get (tmp.address, ADDRLEN); /* read up to ADDRLEN chars from ss */
if (ss.gcount() != ADDRLEN - 1) { /* validate gcount()-1 chars read */
std::cerr << "error: invalid format for address.\n";
continue;
}
for (int i = ADDRLEN - 2; tmp.address[i] == ' '; i--) /* loop from end of address */
tmp.address[i] = 0; /* overwrite spaces with nul-char */
ss.get (tmp.day, DAYLEN); /* read up to DAYLEN chars from ss */
if (ss.gcount() != DAYLEN - 1) { /* validate gcount()-1 chars read */
std::cerr << "error: invalid format for day.\n";
continue;
}
for (int i = DAYLEN - 2; tmp.day[i] == ' '; i--) /* loop from end of day */
tmp.day[i] = 0; /* overwrite spaces with nul-char */
if (!(ss >> tmp.yardSize)) { /* extract final int value from ss */
std::cerr << "error: invalid format for yardSize.\n";
continue;
}
customers.push_back(tmp); /* add temp to vector */
}
for (Customer c : customers) /* output information */
std::cout << "\n'" << c.name << "'\n'" << c.address << "'\n'" <<
c.day << "'\n'" << c.yardSize << "'\n";
}
(Note: the program expects the name of the file to read as the first command-line argument. You can change how you provide the filename to suit your needs, but you should not hardcode filenames or use magic numbers in your code. You shouldn't have to recompile your program just to read from a different file.)
Also note that in the for() loop trimming whitespace, you are dealing with 0-based indexes instead of a 1-based count of characters which is why you are using gcount() - 1 or the total number of chars minus two, e.g. NAMLEN - 2 to loop from the last character in the array back towards the beginning.
The removal of trailing whitespace simply loops from the last character in each string from the end of each array back toward the beginning overwriting each space with a nul-terminating character. To trim leading whitespace from name, the number of whitespace characters are counted and then C memmove() is used to move the name back to the beginning of the array.
Example Use/Output
$ ./bin/read_customer_day_get dat/customer_day_get.txt
'Judy Henn'
'2 Oaklyn Road'
'Saturday'
'2001'
'Norman Malnark'
'15 Manor Drive'
'Saturday'
'2500'
'Rita Fish'
'210 Sunbury Road'
'Friday'
'750'
The output of each value has been wrapped in single-quotes to provide visual confirmation that the name field has had both leading and trailing whitespace removed, while address and day have both had trailing whitespace removed.
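Because the same trimming is repeated for each field, it could be factored into one helper. The following is a minimal sketch and not part of the program above: trimField is a hypothetical name, and unlike the loops above it checks the length before stepping backward, so a field made up entirely of spaces cannot walk past the start of the array.

```cpp
#include <cstring>

// Hypothetical helper: trim leading and trailing spaces from a
// NUL-terminated char array, in place.
void trimField(char* field)
{
    // Overwrite trailing spaces with NUL, stopping at the start of the array.
    std::size_t len = std::strlen(field);
    while (len > 0 && field[len - 1] == ' ')
        field[--len] = '\0';

    // Skip leading spaces, then slide the remainder (including the NUL) forward.
    char* p = field;
    while (*p == ' ')
        ++p;
    std::memmove(field, p, std::strlen(p) + 1);
}
```

With such a helper, each field could be trimmed with a single call after its get(), for example trimField(tmp.name).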
I have recently been working on implementing word counting in C++, like wc in Linux. I've read a lot of posts about how to implement this, but I still have a problem. When I use a text file as input, it returns the right word count. Otherwise, it returns an incorrect count. So I am wondering whether the logic of my code is wrong. I really can't figure this out. Please help me solve this problem.
What I expected is to get the exact number of word counts as wc does, for example:
wc -w filename
it'll return
wordCounts filename
I want to get exactly the same word count as wc does and return it as the result of the function.
I've used .cpp and .txt files as input and got the right word count. But when I use .out or other files, it returns a different result.
Here's my code:
int countWords()
{
std::ifstream myFile(pathToFile);
char buffer[1];
enum states {WHITESPACE,WORD};
int state = WHITESPACE;
int wordCount = 0;
if(myFile.is_open())
{
while(myFile.read(buffer,1))
{
if(!isspace(buffer[0]))
{
if (state == WHITESPACE )
{
wordCount++;
state = WORD;
}
}
else
{
state = WHITESPACE;
}
}
myFile.close();
}
else
{
throw std::runtime_error("File has not opend");
}
return wordCount;
}
It isn't obvious at all, but on a Mac (so it might not apply to Linux, but I expect it does), I can get the same result as wc by calling setlocale(LC_ALL, "") before calling your counting function.
The other problem is, as I noted in a comment, that isspace() takes an integer in the range 0..255 or EOF, but when you get signed values from plain char, you are indexing out of range and you get whatever you get (it is undefined behaviour). Without the setlocale() call, the character codes 9 (tab, '\t'), 10 (newline, '\n'), 11 (vertical tab, '\v'), 12 (formfeed, '\f'), 13 (carriage return, '\r') and 32 (space, ' ') are treated as 'space' by isspace(). When the setlocale() function is called on the Mac, the character code 160 (NBSP — non-breaking space) is also counted as a space, and that brings the calculation into sync with wc.
Here's a mildly modified version of your function which counts lines and characters as well as words. The function is revised to take a file name argument — you may do as you wish with yours, but you should avoid global variables whenever possible.
#include <cctype>
#include <clocale>
#include <fstream>
#include <iostream>
#include <stdexcept>
int countWords(const char *pathToFile)
{
std::ifstream myFile(pathToFile);
char buffer[1];
enum states { WHITESPACE, WORD };
int state = WHITESPACE;
int wordCount = 0;
int cc = 0;
int lc = 0;
if (myFile.is_open())
{
while (myFile.read(buffer, 1))
{
cc++;
if (buffer[0] == '\n')
lc++;
if (!isspace(static_cast<unsigned char>(buffer[0])))
{
if (state == WHITESPACE)
{
wordCount++;
state = WORD;
}
}
else
{
state = WHITESPACE;
}
}
myFile.close();
std::cerr << "cc = " << cc << ", lc = " << lc
<< ", wc = " << wordCount << "\n";
}
else
{
throw std::runtime_error("File has not opened");
}
return wordCount;
}
int main()
{
setlocale(LC_ALL, "");
std::cout << countWords("/dev/stdin") << "\n";
}
When compiled (from wc79.cpp into wc79), and run on itself, I get the output:
$ ./wc79 < wc79
cc = 13456, lc = 6, wc = 118
118
$ wc wc79
6 118 13456 wc79
$
The two are now in agreement.
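To try the counting logic without a file, the same state machine can be written against any std::istream. This is only a sketch under stated assumptions: countWordsIn is a hypothetical name, and it does not call setlocale(), so which bytes count as whitespace depends on the locale the caller has set.

```cpp
#include <cctype>
#include <istream>
#include <sstream>

// Hypothetical helper: count whitespace-separated words on any input stream.
int countWordsIn(std::istream& in)
{
    int words = 0;
    bool inWord = false;
    char c;
    while (in.get(c))
    {
        // Convert to unsigned char first: passing a negative char value
        // to isspace() is undefined behaviour.
        if (std::isspace(static_cast<unsigned char>(c)))
            inWord = false;
        else if (!inWord)
        {
            inWord = true;   // first non-space byte after whitespace starts a word
            ++words;
        }
    }
    return words;
}
```

It can be fed a std::istringstream in a test or a std::ifstream in the real program.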
I wrote C++ code that opens a .txt file and reads its contents. Think of it as a MAC address database in which each MAC address is terminated by a period (.). My problem is that after I scan the file to count the total number of lines, I am unable to return the file pointer to the start of the file. I use seekg() and tellg() to manipulate the file pointer.
Here is the code:
#include <iostream>
#include <fstream>
#include <conio.h>
using namespace std;
int main ()
{
int i = 0;
string str1;
ifstream file;
file.open ("C:\\Users\\...\\Desktop\\MAC.txt");
//this section calculates the no. of lines
while (!file.eof() )
{
getline (file,str1);
for (int z =0 ; z<=15; z++)
if (str1[z] == '.')
i++;
}
file.seekg(0,ios::beg);
getline(file,str2);
cout << "the number of lines are " << i << endl;
cout << str2 << endl;
file.close();
getchar();
return 0;
}
and here is the contents of the MAC.txt file:
0090-d0f5-723a.
0090-d0f2-87hf.
b048-7aae-t5t5.
000e-f4e1-xxx2.
1c1d-678c-9db3.
0090-d0db-f923.
d85d-4cd3-a238.
1c1d-678c-235d.
The output of the code is supposed to be the first MAC address, but it returns the last one.
file.seekg(0,ios::end);
I believe you wanted file.seekg(0,ios::beg); here.
Zero offset from the end (ios::end) is the end of the file. The read fails and you're left with the last value you read in the buffer.
Also, once you've reached eof, you should manually reset it with file.clear(); before you seek:
file.clear();
file.seekg(0,ios::beg);
getline(file,str2);
The error would have been easier to catch if you checked for errors when you perform file operations. See Kerrek SB's answer for examples.
Your code is making all sorts of mistakes. You never check any error states!
This is how it should go:
std::ifstream file("C:\\Users\\...\\Desktop\\MAC.txt");
for (std::string line; std::getline(file, line); )
// the loop exits when "file" is in an error state
{
/* whatever condition */ i++;
}
file.clear(); // reset error state
file.seekg(0, std::ios::beg); // rewind
std::string firstline;
if (!std::getline(file, firstline)) { /* error */ }
std::cout << "The first line is: " << firstline << "\n";
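The count-then-rewind pattern can also be packaged so it is easy to exercise on an in-memory stream. The sketch below is an illustration, not the code from either answer: countThenFirstLine is a hypothetical name, and the stream is assumed to be seekable (an ifstream or a stringstream).

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: count the lines on a seekable stream, then rewind
// and return the first line (empty if there is none).
std::string countThenFirstLine(std::istream& in, int& lineCount)
{
    lineCount = 0;
    for (std::string line; std::getline(in, line); )
        ++lineCount;               // loop ends when a read fails (normally at EOF)

    in.clear();                    // reset eofbit/failbit left by the failed read
    in.seekg(0, std::ios::beg);    // rewind to the start of the stream

    std::string first;
    if (!std::getline(in, first))
        first.clear();             // empty input, or the rewind did not work
    return first;
}
```

Without the clear() call, the stream is still in a failed state after the loop, so the following getline() would not read anything.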
I wrote the code below that successfully gets a random line from a file; however, I need to be able to modify one of the lines, so I need to be able to get the line character by character.
How can I change my code to do this?
Use std::istream::get instead of std::getline. Just read your string character by character until you reach \n, EOF or other errors. I also recommend you read the full std::istream reference.
Good luck with your homework!
UPDATE:
OK, I don't think an example will hurt. Here is how I'd do it if I were you:
#include <string>
#include <iostream>
#include <fstream>
#include <cstdlib>
#include <ctime>
using namespace std;
static std::string
answer (const string & question)
{
std::string answer;
const string filename = "answerfile.txt";
ifstream file (filename.c_str ());
if (!file)
{
cerr << "Can't open '" << filename << "' file.\n";
exit (1);
}
for (int i = 0, r = rand () % 5; i <= r; ++i)
{
answer.clear ();
char c;
while (file.get (c).good () && c != '\n')
{
if (c == 'i') c = 'I'; // Replace character? :)
answer.append (1, c);
}
}
return answer;
}
int
main ()
{
srand (time (NULL));
string question;
cout << "Please enter a question: " << flush;
cin >> question;
cout << answer (question) << endl;
}
... the only thing is that I have no idea why you need to read the string character by character in order to modify it. You can modify a std::string object directly, which is even easier. Say you want to replace "I think" with "what if": you might be better off reading more about std::string and using find, erase, replace, etc.
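As a rough illustration of that suggestion, here is a minimal sketch that edits a whole line with std::string::find and std::string::replace instead of working character by character. replaceAll is a hypothetical helper name, not something from the standard library.

```cpp
#include <string>

// Hypothetical helper: return a copy of `text` with every occurrence
// of `from` replaced by `to`.
std::string replaceAll(std::string text, const std::string& from, const std::string& to)
{
    if (from.empty())
        return text;               // nothing sensible to search for

    std::string::size_type pos = 0;
    while ((pos = text.find(from, pos)) != std::string::npos)
    {
        text.replace(pos, from.size(), to);
        pos += to.size();          // continue after the inserted text
    }
    return text;
}
```

For example, replaceAll(line, "I think", "what if") could be applied to the line returned by std::getline before printing it.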
UPDATE 2:
What happens with your latest code is simply this: you open a file, then read its content character by character until you reach a newline (\n). So in either case you end up reading only the first line, and then your do-while loop terminates. In my example, the while loop that reads a line up to \n sits inside a for loop. That is basically what you should do: repeat your do-while loop once for each line you want (or can get) from that file. For example, something like this will read two lines:
for (int i = 1; i <= 2; ++i)
{
do
{
answerfile.get (answer);
cout << answer << " (from line " << i << ")\n";
}
while (answer != '\n');
}
I have written a program that takes a filename from argv[1] and performs operations on that file.
When debugging from Visual Studio, I pass the filename via Project Options > Debugging > Command Arguments, and it works fine and prints all results correctly.
But when I try from the command prompt, I go to the project's Debug directory and type
program
It works fine and prints "Not valid input file" in the same window (which is my error handling),
but when I type
program test.txt
it does nothing. I don't think the problem is in the code, because it works fine from the debugger.
Code :
int main(int argc, char *argv[])
{
int nLines;
string str;
if(argv[1]==NULL)
{
std::cout << "Not valid input file" << endl;
return 0 ;
}
ifstream infile(argv[1]);
getline(infile,str);
nLines = atoi(str.c_str());//get number of lines
for(int line=0 ;line < nLines;line++)
{
//int currTime , and a lot of variables ..
//do a lot of stuff and while loops
cout << currTime <<endl ;
}
return 0 ;
}
You don't check whether the file was opened successfully, whether getline failed, or whether the string-to-integer conversion failed. If any of those errors occurs, which I guess is the case, nLines will be 0, the loop body will never run, and the program will exit with return code 0.
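One way to make those three failures visible is to do the reading and conversion in a function that reports success or failure. This is a sketch under stated assumptions: readLineCount is a hypothetical name, and it swaps atoi for stream extraction so that a failed conversion can be told apart from a genuine 0.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: read the first line of `in` and parse it as a
// non-negative line count. Returns false, leaving nLines untouched, if the
// stream is unusable, the line is missing, or the text is not a clean integer.
bool readLineCount(std::istream& in, int& nLines)
{
    if (!in)
        return false;                      // e.g. the file failed to open

    std::string first;
    if (!std::getline(in, first))
        return false;                      // empty file or read error

    std::istringstream iss(first);
    int value = 0;
    if (!(iss >> value) || value < 0)
        return false;                      // not a number, or negative

    char extra;
    if (iss >> extra)
        return false;                      // non-whitespace junk after the number

    nLines = value;
    return true;
}
```

In main, the caller could print a message to std::cerr and return a non-zero exit code whenever this returns false, so a bad input file no longer looks like a silent success.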
This code worked correctly for me running on the command line.
#include <string>
#include <algorithm>
#include <functional>
#include <vector>
#include <iostream>
using namespace std;
int main(int argc, char *argv[])
{
int nLines;
string str;
if(argv[1]==NULL)
{
std::cout << "Not valid input file" << endl;
return 0 ;
}
else
std::cout << "Input file = " << argv[1] << endl;
}
Output :
C:\Users\john.dibling\Documents\Visual Studio 2008\Projects\hacks_vc9\x64\Debug>hacks_vc9.exe hello
Input file = hello
By the way, this code is dangerous, at best:
if(argv[1]==NULL)
You should probably be checking the value of argc before attempting to dereference a possibly-wild pointer.
The file probably contains an invalid numeric first line (perhaps one that starts with a BOM or another non-numeric character).
That would explain the missing output: if nLines == 0, no output should be expected.
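If a byte-order mark is the cause, it can be removed before the conversion. The following is a minimal sketch that assumes the file is UTF-8 encoded, so the mark is the three bytes EF BB BF; stripUtf8Bom is a hypothetical helper name.

```cpp
#include <string>

// Hypothetical helper: return `line` without a leading UTF-8 byte-order
// mark (bytes EF BB BF), or unchanged if no mark is present.
std::string stripUtf8Bom(const std::string& line)
{
    if (line.size() >= 3 &&
        static_cast<unsigned char>(line[0]) == 0xEF &&
        static_cast<unsigned char>(line[1]) == 0xBB &&
        static_cast<unsigned char>(line[2]) == 0xBF)
    {
        return line.substr(3);     // drop the three BOM bytes
    }
    return line;
}
```

Applied to the first line before atoi, for example atoi(stripUtf8Bom(str).c_str()), the number would then start at the first character of the string.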