How to read in a text file with commas and write it out to a new file without commas?
The file has nine columns: the first is the month, the next is the date followed by a comma, then the year, and then six columns of numbers representing dollar amounts that contain commas, for example 17,751.24
I need to make a new text file out of it that has no commas.
I was trying to use double variables for each column, but the commas forced me to use string variables. Then I tried to read it in with a while loop:
while (infile >> month >> date >> year >> open >> high >> low >> close >> volume >> adj)
{
...
}
Side note: the file is about some stock exchange thing.
How do I make a new text file with the same content, except that there should be no commas?
This is for a beginners' C++ class, so please avoid using advanced stuff in answers.
Thanks in advance.
Perhaps this is still too soon, but as what I provide below is not a complete answer, I hope it might help to get you going.
The simple question seems to be:
How to read in a text file with commas and write it out to a new file
without commas?
This simple question is broad (with many approaches), but you might consider the following:
// To use:
// invoke with i/o redirection
//
// Example with executable named t160
// and some file named 'inputFile':
//
// t160 < inputFile > noCommaFile
//
#include <iostream>

int t160()
{
    int commaCount = 0;
    do {
        int kar = std::cin.get();          // read single char from cin
        if (std::cin.eof()) { break; }     // break out when we finish
        if (std::cin.bad()) {              // break out on file i/o error
            std::cerr << "ERR: std::cin.bad() " << std::endl;
            break;
        }
        // found a comma: continue the loop, do not cout the comma
        if (',' == kar) { commaCount += 1; continue; }
        std::cout << char(kar);            // not a comma, so cout the kar
    } while (1);
    // uncomment for test:
    // std::cerr << "commaCount = " << commaCount << std::endl;
    return 0;
}
This code reads one char at a time, and writes one char at a time or nothing.
When a comma is found, it is 'discarded'. All other kars are included in the output.
You might later decide that you want to replace the comma with a space (instead of discarding it).
I can imagine that you might decide to treat a comma-space (", ") differently than a comma-digit (as in "123,456").
This simple loop should support those decisions, and many more.
Good luck.
I am trying to write C++ code that reads a text file containing a series of numbers. For example, I have this .txt file, which contains the following series of numbers mixed with a character:
1 2 3 a 5
I am trying to make the code capable of telling numbers and characters apart, such as the 4th entry above (which is a character), and then reporting an error.
What I am doing is something like:
double value;
while(in) {
in >> value;
if(!isdigit(value)) {
cout << "Has non-numeric entry!" << endl;
break;
}
else
// some codes for storing the entry
}
However, the isdigit function doesn't work for the text file. It seems that when I do in >> value, the code implicitly type-casts a into double.
Can anyone give me some suggestions?
Thanks a lot!
Your while loop doesn't do what you think it does.
It only iterates one statement:
in >> value;
The rest of the statements are actually outside the loop.
Using curly braces for the while body is always recommended.
I wrote a small program that reads a file through a standard fstream object, as I was a little unsure what your "in" represented.
Essentially, try to read in every element as a character and check it with isdigit. If you're reading in elements that are longer than one character, a few modifications would have to be made. Let me know if that's the case and I'll try to help!
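#include <cctype>
#include <fstream>
#include <iostream>

int main() {
    std::fstream fin("detect_char.txt");
    char x;
    while (fin >> x) {  // reads one non-whitespace character at a time
        if (!std::isdigit(static_cast<unsigned char>(x))) {
            std::cout << "found non-int value = " << x << '\n';
        }
    }
    std::cout << '\n';
    return 0;
}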
Try reading each token into a string and parsing it explicitly:
ifstream infile("data.txt");
string token;
while (infile >> token) {
    try {
        double num = stod(token);
        cout << num << endl;
    }
    catch (const invalid_argument&) {  // catch by const reference, not by value
        cerr << "Has non-numeric entry!" << endl;
    }
}
Since it looks like the Asker's end goal is to have a double value for their own nefarious purposes and not simply detect the presence of garbage among the numbers, what the heck. Let's read a double.
double value;
while (in) // loop until failed even after the error handling case
{
    if (in >> value) // read a double.
    {
        std::cout << value; // printing for now. Store as you see fit
    }
    else // failed to read a double
    {
        in.clear(); // clear error
        std::string junk;
        in >> junk; // easiest way I know of to read up to any whitespace.
                    // It's kinda gross if the discard is long and the string resizes
    }
}
Caveat:
What this can't handle is stuff like 3.14A. This will be read as 3.14 and stop, returning the 3.14 and leaving the A for the next read, where it will fail to parse and then be consumed and discarded by in >> junk; Catching that efficiently is a bit trickier and covered by William Lee's answer. If the exception handling of stod is deemed too expensive, use strtod and test that the end parameter reached the end of the string and no range errors were generated. See the example in the linked strtod documentation.
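Since the link to the strtod documentation may not survive, here is a minimal sketch of the check described above. The helper name parseDouble is my own invention; std::strtod, errno and ERANGE are the standard pieces it relies on.

```cpp
#include <cerrno>
#include <cstdlib>
#include <string>

// Hypothetical helper: returns true only if the whole token is one valid,
// in-range double. `result` is left untouched on failure.
bool parseDouble(const std::string& token, double& result)
{
    const char* begin = token.c_str();
    char* end = nullptr;
    errno = 0;                                // strtod only sets errno on error
    double value = std::strtod(begin, &end);
    if (end == begin) return false;           // nothing converted, e.g. "a"
    if (*end != '\0') return false;           // trailing junk, e.g. "3.14A"
    if (errno == ERANGE) return false;        // overflow or underflow
    result = value;
    return true;
}
```

You would read each entry with in >> token (a std::string) and then call parseDouble(token, value). Be aware that strtod also accepts forms such as "inf", "nan" and hexadecimal floats, so add an extra check if those should count as errors in your file.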
I'm trying to implement my own MergeSort, but I've run into some problems; maybe someone can help me a little.
I have a big file with some info separated by commas (Name,city,mail,telf). I would like to apply merge sort to order it, because I suppose the client computer won't have enough memory to do it in one go.
So I split it into files of MAX_CUSTOMERS lines and sort them individually. Everything is correct up to here, but when I take the first two files and merge them, the problems start: some records get repeated and others disappear. Here's my code:
void MergeSort(string file1Name, string file2Name,string name){
printf("Enter MERGE SORT %s AND %s\n",file1Name.c_str(),file2Name.c_str());
string temp;
string fileName;
string lineFile1, lineFile2;
bool endFil1 = false, endFil2 = false;
int numCust1 = 0;
int numCust2 = 0;
int x1 = 0, x2 = 0;
ifstream file1;
file1.open(file1Name.c_str());
ifstream file2;
file2.open(file2Name.c_str());
ofstream mergeFile;
fileName = "customers_" +name +".txt";
cout << "Result file " << fileName << endl;
mergeFile.open("temp.txt");
getline(file1,lineFile1);
getline(file2,lineFile2);
while(!endFil1 && !endFil2){
if(CompareTelf(lineFile1,lineFile2)==1){
mergeFile << lineFile1 << endl;
if(!getline(file1,lineFile1)){
cout << lineFile1 << endl;
cout << "1st file end" << endl;
endFil1 = true;
}
}else{
mergeFile << lineFile2 << endl;
if(!getline(file2,lineFile2)){
cout << lineFile2 << endl;
cout << "2nd file end" << endl;
endFil2 = true;
}
}
}
if(endFil1){
//mergeFile << lineFile2 << endl;
while(getline(file2,lineFile2)){
mergeFile << lineFile2 << endl;
}
}else{
//mergeFile << lineFile1 << endl;
while(getline(file1,lineFile1)){
mergeFile << lineFile1 << endl;
}
}
file1.close();
file2.close();
mergeFile.close();
rename("temp.txt",fileName.c_str());
return;
}
Customer SplitLine(string line){
string splitLine;
string temp;
Customer cust;
int actProp = 0;
int number;
istringstream readLineStream(line); //convert String readLine to Stream readLine
while(getline(readLineStream,splitLine,',')){
if (actProp == 0)cust.name = splitLine;
else if (actProp == 1)cust.city = splitLine;
else if (actProp == 2)cust.mail = splitLine;
else if (actProp == 3)cust.telf = atoi(splitLine.c_str());
actProp++;
}
//printf("Customer read: %s, %s, %s, %i\n",cust.name.c_str(), cust.city.c_str(), cust.mail.c_str(), cust.telf);
return cust;
}
int CompareTelf(string str1, string str2){
Customer c1 = SplitLine(str1);
Customer c2 = SplitLine(str2);
if(c1.telf<c2.telf)return 1; //return 1 if 1st string its more important than second, otherwise, return -1
else return -1;
}
struct Customer{
string name;
string city;
string mail;
long telf;
};
If you have any questions about the code, just ask! I tried to make the variable names as descriptive as possible!
Thanks a lot.
Your code seems quite good, but it has several flaws and one important omission.
One of the minor flaws is the lack of initialization of the Customer structure: you didn't provide a constructor for the struct, and you don't explicitly initialize the cust variable. The string members are default-initialized to empty strings by the string class constructor, but long telf may get any initial value.
Another one is the lack of format checking when splitting an input line. Are you sure that every input line has the same format? If there are lines with too many commas (say, a comma inside a name) then the loop may incorrectly try to assign 'email' data to the 'telf' member...
OTOH if there are too few commas, the 'telf' member may remain uninitialized, with a random initial value...
Together with the first one, this flaw may lead to an incorrect order of the output data.
Similar problems arise when you use the atoi function: it returns int but your variable is long. I suppose you have chosen the long type because of the expected range of values. If so, converting the input data to int may lose a significant part of the data! When the value does not fit in an int, the behavior of atoi is undefined, so you cannot rely on any particular result. A wrong value leads to incorrect sorting, so you'd better use atol instead (or strtol, which also lets you detect errors).
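To make the three points above concrete, here is one possible sketch of a stricter line parser. It is not a drop-in fix from the original code: the name ParseCustomer, the bool return value and the "exactly three commas" rule are assumptions of mine, and the struct is repeated only so the example is self-contained.

```cpp
#include <algorithm>
#include <cerrno>
#include <cstdlib>
#include <sstream>
#include <string>

struct Customer {
    std::string name;
    std::string city;
    std::string mail;
    long telf = 0;   // explicit default, so telf never holds a random value
};

// Hypothetical replacement for SplitLine: returns false (and leaves `cust`
// untouched) unless the line has exactly four comma-separated fields and
// the last one is a whole number that fits in a long.
bool ParseCustomer(const std::string& line, Customer& cust)
{
    if (std::count(line.begin(), line.end(), ',') != 3) return false;

    std::istringstream fields(line);
    Customer parsed;
    std::string telfText;
    std::getline(fields, parsed.name, ',');
    std::getline(fields, parsed.city, ',');
    std::getline(fields, parsed.mail, ',');
    std::getline(fields, telfText);          // rest of the line

    const char* begin = telfText.c_str();
    char* end = nullptr;
    errno = 0;
    long value = std::strtol(begin, &end, 10);
    if (end == begin || *end != '\0' || errno == ERANGE) return false;

    parsed.telf = value;
    cust = parsed;
    return true;
}
```

With this, a line such as "Smith, Ann,Paris,ann@example.com,5551234" is rejected instead of silently putting the mail address into telf. What to do with rejected lines (skip, log, abort) is up to you.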
The next issue is reading the first line from both input files. You don't check if getline() succeeded. If an input file is empty, the corresponding lineFile_num string will be empty, but endFil_num will not reflect that - it will still be false. So you again go into comparing invalid data.
Finally, the main problem. Assume the contents of file1 are 'greater than' (that is: go after) the whole of file2. Then the first line stored in lineFile1 results in CompareTelf() returning -1 all the time. The main loop copies the whole of file2 into the output, and...? And the final while() loop starts with getline(file1,lineFile1), thus discarding the first line of file1!
A similar result happens with files consisting of records (A,C) and (B), to be merged as (A,B,C): first A and B are read in, then A is saved and C is read in, then B is saved and the end of file 2 is detected. Then while(getline(...)) discards the C held in memory and finds the end of file 1, which terminates the loop. Record C gets lost.
Generally, when the main merging loop while(!endFil1 && !endFil2) exhausts one of the files, the first unsaved line of the other file gets discarded. To avoid this you need to store the result of the first read:
endFil1 = ! getline(file1,lineFile1);
endFil2 = ! getline(file2,lineFile2);
then, after the main loop, start copying the input file's tail with the unsaved line:
while(!endFil1) {
    mergeFile << lineFile1 << endl;
    endFil1 = !getline(file1,lineFile1);
}
while(!endFil2) {
    mergeFile << lineFile2 << endl;
    endFil2 = !getline(file2,lineFile2);
}
(I apologize that this is so low level compared to most of the questions I have seen on this website, but I have run out of ideas and I do not know who else to ask.)
I am working on a school project that requires me to read basketball statistics from a file named in06.txt. The file in06.txt looks exactly as follows:
5
P 17 24 9 31 28
R 4 5 1 10 7
A 9 2 3 6 8
S 3 4 0 5 4
I am required to read and store the first number, 5, into a variable called "games." From there, I must read the numbers from the second line and determine the high, the low, and the average. I must do the same thing for lines 3, 4, and 5. (FYI, the letters P, R, A, and S are there to indicate "Points," "Rebounds," "Assists," and "Steals.")
Since I only have been learning about programming for a few weeks, I do not want to overwhelm myself by jumping right into dealing with every aspect of the project. So, I am first working on determining the average from each line. My plan is to keep a running total of each line and then divide the running total by the number of games, which is 5 in this case.
This is my code:
#include <iostream>
#include <fstream>
#include <cstdlib>
using namespace std;
int main()
{
int games;
int points_high, points_low, points_total;
int rebounds_high, rebounds_low, rebounds_total;
int assists_high, assists_low, assists_total;
int steals_high, steals_low, steals_total;
double points_average, rebounds_average, assists_average, steals_average;
ifstream fin;
ofstream fout;
fin.open("in06.txt");
if( fin.fail() ) {
cout << "\nInput file opening failed.\n";
exit(1);
}
else
cout << "\nInput file was read successfully.\n";
int tempint1, tempint2, tempint3, tempint4;
char tempchar;
fin >> games;
fin.get(tempchar); // Takes the endl; from the text file.
fin.get(tempchar); // Takes the character P from the text file.
while( fin >> tempint1 ) {
points_total += tempint1;
}
fin.get(tempchar); // Takes the endl; from the text file.
fin.get(tempchar); // Takes the character R from the text file.
while( fin >> tempint2 ) {
rebounds_total += tempint2;
}
fin.get(tempchar); // Takes the endl; from the text file.
fin.get(tempchar); // Takes the character A from the text file.
while( fin >> tempint3 ) {
assists_total += tempint3;
}
fin.get(tempchar); // Takes the endl; from the text file.
fin.get(tempchar); // Takes the character S from the text file.
while( fin >> tempint4 ) {
steals_total += tempint4;
}
cout << "The total number of games is " << games << endl;
cout << "The value of total points is " << points_total << endl;
cout << "The value of total rebounds is " << rebounds_total << endl;
cout << "The value of total assists is " << assists_total << endl;
cout << "The value of total steals is " << steals_total << endl;
return 0;
}
And this is the (incorrect) output:
Input file was read successfully.
The total number of games is 5
The value of total points is 111
The value of total rebounds is 134522076
The value of total assists is 134515888
The value of total steals is 673677934
I have been reading about file input in my textbook for hours, hoping that I will find something that will indicate why my program is outputting the incorrect values. However, I have found nothing. I have also researched similar problems on this forum as well as other forums, but the solutions use methods that I have not yet learned about and thus, my teacher would not allow them in my project code. Some of the methods I saw were arrays and the getline function. We have not yet learned about either.
Note: My teacher does not want us to store every integer from the input file. He wants us to open the file a single time and store the number of games, and then use loops and if statements for determining the high, average, and low numbers from each line.
If anyone could help me out, I would GREATLY appreciate it!
Thanks!
You have all these variables declared:
int games;
int points_high, points_low, points_total;
int rebounds_high, rebounds_low, rebounds_total;
int assists_high, assists_low, assists_total;
int steals_high, steals_low, steals_total;
double points_average, rebounds_average, assists_average, steals_average;
And then you increment them:
points_total += tempint1;
Those variables were never initialized to a known value (0), so they have garbage in them. You need to initialize them.
Besides what OldProgrammer said, you've approached the reading of integers incorrectly. A loop like this
while( fin >> tempint2 ) {
rebounds_total += tempint2;
}
will stop when an error occurs. That is, either it reaches EOF or the extraction encounters data that cannot be formatted as an integer; in other words, the stream enters a failed state (fail() returns true). It does not, as you seem to think, stop reading at the end of a line. Once an error flag is set, all further extractions will fail until you clear the flags. In your case, a loop starts reading after P, extracts five integers, but then it encounters the R from the next line and errors out.
Change this to a loop that reads a fixed number of integers or alternatively, read a whole line using std::getline into a std::string, put it into a std::stringstream and read from there.
In any case, learn to write robust code. Check for success of extractions and count how many elements you get.
An example of a loop that reads at most 5 integers:
int i;
int counter = 0;
while (counter < 5 && file >> i) {
    ++counter;
    // do something with i
}
if (counter < 5) {
    // hm, got less than 5 ints...
}
Can anyone help me make this more generalised and more pro?
#include <fstream>
#include <iostream>
#include <string>
#include <vector>
using namespace std;
int main()
{
// open text file for input:
string file_name;
cout << "please enter file name: ";
cin >> file_name;
// associate the input file stream with a text file
ifstream infile(file_name.c_str());
// error checking for a valid filename
if ( !infile )
{
cerr << "Unable to open file "
<< file_name << " -- quitting!\n";
return( -1 );
}
else cout << "\n";
// some data structures to perform the function
vector<string> lines_of_text;
string textline;
// read in text file, line by
while (getline( infile, textline, '\n' ))
{
// add the new element to the vector
lines_of_text.push_back( textline );
// print the 'back' vector element - see the STL documentation
cout << lines_of_text.back() << "\n";
}
cout<<"OUTPUT BEGINS HERE: "<<endl<<endl;
cout<<"the total capacity of vector: lines_of_text is: "<<lines_of_text.capacity()<<endl;
int PLOC = (lines_of_text.size()+1);
int numbComments =0;
int numbClasses =0;
cout<<"\nThe total number of physical lines of code is: "<<PLOC<<endl;
for (int i=0; i<(PLOC-1); i++)
//reads through each part of the vector string line-by-line and triggers if the
//it registers the "//" which will output a number lower than 100 (since no line is 100 char long and if the function does not
//register that character within the string, it outputs a public status constant that is found in the class string and has a huge value
//alot more than 100.
{
string temp(lines_of_text [i]);
if (temp.find("//")<100)
numbComments +=1;
}
cout<<"The total number of comment lines is: "<<numbComments<<endl;
for (int j=0; j<(PLOC-1); j++)
{
string temp(lines_of_text [j]);
if (temp.find("};")<100)
numbClasses +=1;
}
cout<<"The total number of classes is: "<<numbClasses<<endl;
Format the code properly, use consistent style and nomenclature and throw out the utterly redundant comments and empty lines. The resulting code should be fine. Or “pro”.
Here, I’ve made the effort (along with some stylistic things that are purely subjective):
Notice that the output is actually wrong (just run it on the program code itself to see that …).
#include <fstream>
#include <iostream>
#include <string>
#include <vector>

using namespace std;

int main()
{
    string file_name;
    cout << "please enter file name: ";
    cin >> file_name;

    ifstream infile(file_name.c_str());
    if (not infile) {
        cerr << "Unable to open file " << file_name << " -- quitting!" << endl;
        return -1;
    }
    else cout << endl;

    vector<string> lines_of_text;
    string textline;
    while (getline(infile, textline)) {
        lines_of_text.push_back(textline);
        cout << lines_of_text.back() << endl;
    }

    cout << "OUTPUT BEGINS HERE: " << endl << endl;
    cout << "the total capacity of vector: lines_of_text is: "
         << lines_of_text.capacity() << endl << endl;

    int ploc = lines_of_text.size() + 1;
    cout << "The total number of physical lines of code is: " << ploc << endl;

    // Look for comments `//` and count them.
    int num_comments = 0;
    for (vector<string>::iterator i = lines_of_text.begin();
            i != lines_of_text.end();
            ++i) {
        if (i->find("//") != string::npos)
            ++num_comments;
    }
    cout << "The total number of comment lines is: " << num_comments << endl;

    // Same for number of classes ...
}
I'm not really sure what you're asking, but I can point out some things that can be improved in this code. I'll focus on the actual statements and leave stylistic comments to others.
cin >> file_name;
To handle file names with spaces, better write
getline(cin, file_name);
int PLOC = (lines_of_text.size()+1);
Why do you claim that there's one more line than there actually is?
if (temp.find("//")<100)
with some complicated comment explaining this. Better write
if (temp.find("//")<temp.npos)
to work correctly on all line lengths.
cout<<"The total number of comment lines is: "<<numbComments<<endl;
Actually, you counted the number of end-of-line comments. I wouldn't call a comment at the end of a statement a "comment line".
You don't count /* */ style comments.
Counting the number of classes as };? Really? How about structs, enums, and plain superfluous semicolons? Simply count the number of occurrences of the class keyword. It should have no alphanumeric character or underscore on either side.
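A minimal sketch of that keyword test, for one line of text, could look like this. The function names are hypothetical. It is still only a heuristic: it will also count class inside comments and string literals, in "enum class", and in template parameter lists such as template <class T>.

```cpp
#include <cctype>
#include <string>

// True for characters that may be part of an identifier.
bool IsIdentifierChar(char c)
{
    return std::isalnum(static_cast<unsigned char>(c)) != 0 || c == '_';
}

// Hypothetical helper: counts occurrences of the whole word "class" in one
// line, i.e. with no alphanumeric character or underscore on either side.
int CountClassKeyword(const std::string& line)
{
    const std::string keyword = "class";
    int count = 0;
    std::string::size_type pos = line.find(keyword);
    while (pos != std::string::npos) {
        std::string::size_type after = pos + keyword.size();
        bool boundaryBefore = (pos == 0) || !IsIdentifierChar(line[pos - 1]);
        bool boundaryAfter = (after == line.size()) || !IsIdentifierChar(line[after]);
        if (boundaryBefore && boundaryAfter) {
            ++count;
        }
        pos = line.find(keyword, after);   // continue after this match
    }
    return count;
}
```

In the program above you would add CountClassKeyword(lines_of_text[j]) to numbClasses for every line instead of testing for "};".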
Use proper indentation, your code is very difficult to read in its current form. Here is a list of styles.
Prefer ++variable instead of variable += 1 when possible; the ++ operator exists for a reason.
Be consistent in your coding style. If you're going to leave spaces between things like cout and <<, or between function arguments and the function parentheses, do it everywhere; otherwise don't, but be consistent. Pick one naming convention for your variables and stick to it. There is a lot about styles you can find on google, for example here and here.
Don't use the entire std namespace, only what you need. Either write using std::cout; or prefix all of your cout statements with std::
Avoid needless comments. Everyone knows what ifstream infile(file_name.c_str()); does, for example; what I don't know is what your program does as a whole, because I don't really care to understand what it does due to the indentation. It's a short program, so rather than explaining every statement on its own, why not explain what the program's purpose is, and how to use it?
These are all stylistic points. Your program doesn't work in its current form, assuming your goal is to count comments and classes. Doing that is a lot more difficult than you are considering. What if I have a "};" as part of a string for example? What if I have comments in strings?
Don't import the whole std namespace, only things you need from it:
using std::string;
Use a consistent naming convention: decide whether you prefer name_for_a_variable or nameforavariable or nameForAVariable. And use meaningful names: numbComments makes me think of very different things than numberOfComments, numComments or commentCount would.
If your original code looks like this, I strongly recommend to select a single consistent indentation style: either
if ( ... )
{
    ...
}
or
if ( ... )
    {
    ...
    }
but not both in the same source file.
Also remove the useless comments like
// add the new element to the vector
This is "only" about the readability of your code, not even touching its functionality... (which, as others have already pointed out, is incorrect). Note that any piece of code is likely to be read many more times than edited. I am fairly sure that you will have trouble reading (and understanding) your own code in this shape, if you need to read it even a couple of months after.
"More professional" would be not doing it at all. Use an existing SLOC counter, so you don't reinvent the wheel.
This discussion lists a few:
http://discuss.techinterview.org/default.asp?joel.3.207012.14
Another tip: Don't use "temp.find("};") < 100", use "temp.find("};") != temp.npos".
Edit: s/end()/npos. Ugh.
Not as in "can't find the answer on stackoverflow", but as in "can't see what I'm doing wrong", big difference!
Anywho, the code is attached below. What it does is fairly basic, it takes in a user created text file, and spits out one that has been encrypted. In this case, the user tells it how many junk characters to put between each real character. (IE: if I wanted to encrypt the word "Hello" with 1 junk character, it would look like "9H(eal~l.o")
My problem is that for some reason, it isn't reading the input file correctly. I'm using the same setup to read in the file as I had done previously on decrypting, yet this time it's reading garbage characters, and when I tell it to output to file, it prints it on the screen instead, and it seems like nothing is being put in the output file (though it is being created, so that means I've done something correctly, point for me!).
code:
string start;
char choice;
char letter;
int x;
int y;
int z;
char c;
string filename;
while(start == "enc")
{
x = 1;
y = 1;
cout << "How many garbage characters would you like between each correct character?: " ;
cin >> z;
cout << endl << "Please insert the name of the document you wish to encrypt, make sure you enter the name, and the file type (ie: filename.txt): " ;
cin >> filename;
ifstream infile(filename.c_str());
ofstream outfile("encrypted.txt", ios::out);
while(!infile.eof())
{
infile.get(letter);
while ((x - y) != z)
{
outfile << putchar(33 + rand() % 94);
x++;
}
while((x - y) == z)
{
outfile << letter;
y = 1;
x = 1;
}
}
outfile.close();
cout << endl << "Encryption complete...please return to directory of program, a new file named encrypted.txt will be there." << endl;
infile.close();
cout << "Do you wish to try again? Please press y then enter if yes (case sensitive).";
cin >> choice;
What I pasted above the start of the while loop are the variable declarations. This is part of a much larger program that will not only encrypt but decrypt as well. I left the decryption part out as it works perfectly; it's this part I'm having an issue with.
Thanks in advance for the assist!
EDIT:: I'm using visual C++ express 2008, and it shoots back that there are no errors at all, nor any warnings.
IMPORTANT EDIT
It turns out it is outputting to the file! However, it is outputting numbers instead of ascii characters, and it is also outputting a garbage character in place of the letter it should be. When it goes back to the "infile.get(letter)", it doesn't get a new character. So right now the issues seem to be twofold:
1) Printing numbers instead of ascii characters.
2) Using garbage instead of the actual character it should be getting.
Question Answered
Found out the second part in the "Important Edit" ...it turns out if you name something test.txt...that means it is actually called test.txt.txt when you type it into a C++ program. Just goes to show it's the tiny, minute, simple details that cause any program to go pooey.
Thank you to George Shore. Your comment about the input file being in the wrong place is what gave me the idea to try the item's actual name.
Thank you to everyone who helped with the answer!
Further to the previous answers, I believe it's because the file you wish to encrypt is not being found by the original code. Is it safe to assume that you're running the code from the IDE? If so, then the file that is to be encrypted has to be in the same directory as the source.
Also:
outfile << putchar(33 + rand() % 94);
seems to be the source of your garbage to the screen; the 'putchar' function echoes to the screen whilst returning the integer value of that character. What is then going to happen is that number will be output to the file, as opposed to the character.
Changing that block to something like:
while ((x - y) != z)
{
    c = (33 + rand() % 94);
    outfile << c;
    x++;
}
should enable the code to run as you want it to.
Rather than doing this:
while (!infile.eof())
{
infile.get(letter);
if (infile.good())
{
Do this:
while (infile.get(letter))
{
This is the standard pattern for reading a file.
It gets a character and the resulting infile (that is returned by get) is then checked to see if it is still good by converting it to bool.
The line:
outfile << putchar(33 + rand() % 94);
Should probably be:
outfile << static_cast<char>(33 + rand() % 94);
putchar() prints to the standard output. But the return value (same as the input) goes to the outfile. To stop this just convert the value to char and send to outfile.
Is the use of 'y' necessary? It seems confusing and unnecessary to me. If I were implementing this, then I'd expect to use just 'x' and 'z'.
I'm also not sure about the 'while (!infile.eof())' condition; Pascal determines EOF ahead of time, but C++ can only tell you about EOF after attempting to read a character. However, this would only affect the end of the file, not the main body of the loop.
while (!infile.eof())
{
    infile.get(letter);
    if (infile.good())
    {
        for (int i = 0; i < z; i++)
            outfile << static_cast<char>(33 + rand() % 94);
        outfile << letter;
    }
}
(Uncompiled code!).
Also, do not use this for security - it may help a little, but it certainly won't conceal the information from the determined.