Dumping a File into a String Array - C++

I am writing an MFC program in Visual C++ (Visual Studio 2008) that will be creating or appending to an XML file.
If the file doesn't exist, it is created and there are no worries; the issue arises when the file already exists and I have to append to it.
What I was instructed, and found through some research, was to read the file into a string, back up a bit, and write to the end of the string. My idea for that was to read the file into an array of strings.
bool WriteXMLHeader(string header, ofstream xmlFile)
{
    int fileSize = 1;
    while(!xmlFile.eof())
    {
        fileSize++;
    }

    string entireFile[fileSize];

    for(int i = 0; i < fileSize; i++)
    {
        xmlFile >> entireFile[i];
    }

    //Processing code to add more to the end

    //Save the File

    return true;
}
However, this causes an error because entireFile is of unknown size at compile time, and errors about needing a constant expression keep popping up.
I am not allowed to use any third party software (already looked into TinyXML and RapidXML).
What would be a better way to append to the end of an XML file above an unknown number of closing tags?
Edit: My boss keeps talking about sending in a path to a node and writing after the last instance of that node. He wants this capable of processing XML files with a million indents if needed. (Impossible for one man to accomplish?)

std::vector<std::string>
"I mentioned that, and my boss said no and to focus on strings"
Well, this is the most preferred, easiest, least error-prone solution.
Setting aside your XML parsing (if any), and coming to the question itself, consider the following:
#include <fstream>
#include <string>
#include <vector>
//...
std::vector<std::string> entireFile;
std::string line;
while (std::getline(xmlFile, line))
{
    entireFile.push_back(line);
}
xmlFile.close();

// entireFile now contains all lines from the xml file.
// Iterating over it is just like iterating over a simple array:
for (std::size_t i = 0; i < entireFile.size(); ++i)
{
    // entireFile[i]
}
Note: with <algorithm> and <iterator> you can achieve this in still fewer lines of code.
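For instance, a minimal sketch of the iterator version (note that std::istream_iterator<std::string> splits on whitespace, so this collects tokens rather than whole lines):

#include <fstream>
#include <iterator>
#include <string>
#include <vector>
//...
// Construct the vector straight from a pair of stream iterators.
std::vector<std::string> entireFile(
    (std::istream_iterator<std::string>(xmlFile)), // extra parens avoid the most vexing parse
    std::istream_iterator<std::string>());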
Suggested Reading: Why is iostream::eof inside a loop condition considered wrong?
If your boss says no, ask him, with courtesy, why.
There can't be any valid reason, unless you're tied to a specific environment/platform with limited capabilities.

Related

Writing and reading from a file sequentially using the same file object

I am learning the basics of data file handling in C++ (and am working in the Turbo C++ compiler).
I wanted to create a text file, write some data to it, and then read it back.
So I wrote this:
int main()
{
    fstream fin;
    fin.open("textfile.txt", ios::in | ios::out);
    for(int i = 0; i < 3; i++)
    {
        char x;
        cin >> x;
        fin << x;
    }
    fin.seekg(0, ios::beg); //I added this and also tried seekp() when I didn't get
                            //the desired output, but to no use
    while(!fin.eof())
    {
        char v;
        fin >> v;
        cout << v;
    }
    fin.close();
    getch();
    return 0;
}
But instead of outputting only the 3 characters which I input, it outputs 4 characters.
I tried removing the loops and taking input and giving outputs one by one like this (among other things):
...
char x,y,z;
cin>>x>>y>>z;
fin<<x<<y<<z;
fin.seekg(0,ios::beg);
char q,w,e;
fin>>q>>w>>e;
cout<<q<<w<<e;
...
But it still didn't work.
I think it has something to do with file pointers and their location, but I don't know what. I tried finding a similar question on the net, but to no avail.
So I want to know what is wrong with what I did, and how to improve it so I can actually write to and read from a file sequentially using the same file object (if that is even possible). And is seekg() even necessary here?
Thanks.
The problem you face is a general one. Your input code is right; there is no error in that. The problem is your output code, specifically the line while(!fin.eof()). eof (end-of-file) checks the end-of-file condition, but the flag only becomes true after a read has already tried to go past the last character and failed. That means the loop body runs one extra time with a stale value, which is why you see a fourth character. To remove this error, move the read itself into the loop condition, i.e. move fin >> v from the loop body into the condition; the loop then stops as soon as a read fails at the end of the file.
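That is, a minimal sketch of the corrected loop (same fin as above):

char v;
while (fin >> v)  // the read is the condition: the body runs only when a char was actually extracted
    cout << v;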

Dynamically Allocating Array With Datafile

On a C++ project, I have been trying to use an array to store data from a text file that I will use later. I have been having problems initializing the array without a size. Here is a basic sample of what I have been doing:
#include <iostream>
#include <string>
#include <fstream>
using namespace std;

int main()
{
    int i = 0;
    ifstream usern;
    string data;
    string otherdata;
    string *users = nullptr;
    usern.open("users.txt");
    while(usern >> otherdata)
        i++;
    users = new (nothrow) string[i];
    for (int n = 0; usern >> data; n++)
    {
        users[n] = data;
    }
    usern.close();
    return 0;
}
This is a pretty rough example that I threw together. Basically, I try to read the items from a text file called users.txt and store them in an array. I used pointers in the example (which probably wasn't the best idea, considering I don't know too much about pointers). When I run this program, regardless of the data in the file, I do not get any result when I test the values with cout << *(users + 1); it just leaves a blank line in the window. I am guessing my error is in my use of pointers or in how I am assigning values through them. I was wondering if anybody could point me in the right direction on how to get the correct values into the array. Thanks!
Try reopening usern after
while(usern >> otherdata)
i++;
perhaps, try putting in
usern.close();
ifstream usern2;
usern2.open("users.txt");
right after that.
There may be other issues, but this seems like the most likely one to me. Let me know if you find success with this. It appears to me that usern has already reached eof, and then you try to read from it a second time.
One thing that helps me a lot in finding such issues is to just put a cout << "looping"; or something inside the for loop so you know that you're at least getting in that for loop.
You can also do the same thing with usern.seekg(0, ios::beg);
What I think has happened in your code is that you have moved the file pointer, which marks where the file is being read from. This happened when you counted the number of strings to be read in, using the code below.
while(usern >> otherdata)
i++;
This, however, brought the file pointer to the end of the file, which means that in order to re-read the file you need to move the file pointer back to the beginning before reading it into the array of strings you allocated with size i. This can be achieved by adding usern.seekg(0, ios::beg); after your while loop, as shown below. (For a good tutorial on file pointers, see here.)
while(usern >> otherdata)
i++;
// Returns file pointer to beginning of file.
usern.seekg(0, ios::beg);
// The rest of your code.
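One caveat: once the counting loop has run off the end of the file, the stream's eofbit and failbit are set, and a stream in a failed state ignores further reads and seeks until the flags are cleared. A minimal sketch of the full rewind, assuming usern is the stream from the question:

// Clear the error flags left by the failed extraction at end-of-file,
// then seek back to the start so the second read loop actually runs.
usern.clear();
usern.seekg(0, ios::beg);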
Warning: I am unsure how safe dynamically allocating arrays of STL containers is; I have previously run into issues with code similar to yours and would recommend staying away from it in production code.

Read .part files and concatenate them all

So I am writing my own custom FTP client for a school project. I managed to get everything to work with the swarming FTP client and am down to one last small part: reading the .part files into the main file. I need to do two things: (1) get this to read each file and write to the final file properly, and (2) delete the part files after I am done with each one.
Can someone please help me fix the concatenate function I wrote below? I thought I had it right: read each file until EOF, then move on to the next.
In this case *numOfThreads is 17. I ended up with a file of 4742442 bytes instead of 594542592 bytes. Thanks, and I am happy to provide any other useful information.
EDIT: Modified code per the comment below.
std::string s = "Fedora-15-x86_64-Live-Desktop.iso";
std::ofstream out;
out.open(s.c_str(), std::ios::out | std::ios::binary); // binary: don't translate line endings
for (int i = 0; i < 17; ++i)
{
    std::ifstream in;
    std::ostringstream convert;
    convert << i;
    std::string t = s + ".part" + convert.str();
    in.open(t.c_str(), std::ios::in | std::ios::binary);
    int size = 32*1024;
    char *tempBuffer = new char[size];
    if (in.good())
    {
        while (in.read(tempBuffer, size))
            out.write(tempBuffer, in.gcount());
        out.write(tempBuffer, in.gcount()); // write the final partial block too
    }
    delete [] tempBuffer;
    in.close();
}
out.close();
return 0;
Almost everything in your copying loop has problems.
while (!in.eof())
This is broken. Not much more to say than that.
bzero(tempBuffer, size);
This is fairly harmless, but utterly pointless.
in.read(tempBuffer, size);
This is the "almost" part -- i.e., the one piece that isn't obviously broken.
out.write(tempBuffer, strlen(tempBuffer));
You don't want to use strlen to determine the length -- it's intended only for NUL-terminated (C-style) strings. If (as is apparently the case) the data you read may contain zero-bytes (rather than using zero-bytes only to signal the end of a string), this will simply produce the wrong size.
What you normally want to do is a loop something like:
while (read(some_amount) == succeeded)
write(amount that was read);
In C++ that will typically be something like:
while (infile.read(buffer, buffer_size))
outfile.write(buffer, infile.gcount());
It's probably also worth noting that since you're allocating memory for the buffer using new, but never using delete, your function is leaking memory. Probably better to do without new for this -- an array or vector would be obvious alternatives here.
Edit: as for why while (infile.read(...)) works: read returns a reference to the stream, and the stream provides a conversion to bool (in C++11) or void * (in C++03) that can be used in a Boolean context. That conversion reflects the state of the stream, so if reading failed it evaluates as false, and as long as it succeeded it evaluates as true.
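Putting the pieces together, a minimal sketch of that loop using a std::vector as the buffer (per the note above about avoiding a bare new/delete); the function name is hypothetical:

#include <fstream>
#include <vector>

void AppendFileTo(std::ifstream& in, std::ofstream& out)
{
    std::vector<char> buffer(32 * 1024);        // freed automatically, no delete[] needed
    while (in.read(&buffer[0], buffer.size()))  // full blocks
        out.write(&buffer[0], in.gcount());
    out.write(&buffer[0], in.gcount());         // final partial block, if any
}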

Keep a text file from wiping in a function but keep ability to write to it? C++

I have a function that swaps two chars at a time in a file, which works. However, if I try to use the function more than once, the previous swap I made is wiped from the text file and the original text is back in, so the second change appears as if it were my first. How can I resolve this?
void swapping_letters()
{
    ifstream inFile("decrypted.txt");
    ofstream outFile("swap.txt");
    char a;
    char b;
    vector<char> fileChars;
    if (inFile.is_open())
    {
        cout << "What is the letter you want to replace?" << endl;
        cin >> a;
        cout << "What is the letter you want to replace it with?" << endl;
        cin >> b;
        while (inFile.good())
        {
            char c;
            inFile.get(c);
            fileChars.push_back(c);
        }
        replace(fileChars.begin(), fileChars.end(), a, b);
    }
    else
    {
        cout << "Please run the decrypt." << endl;
    }
    for(int i = 0; i < fileChars.size(); i++)
    {
        outFile << fileChars[i];
    }
}
What you probably want to do is parameterize your function:
void swapping_letters(string inFileName, string outFileName)
{
    ifstream inFile(inFileName);
    ofstream outFile(outFileName);
    ...
Because you don't have parameters, calling it twice is equivalent to:
swapping_letters("decrypted.txt", "swap.txt");
swapping_letters("decrypted.txt", "swap.txt");
But "decrypted.txt" wasn't modified after the first call, because you don't change the input file. So if you wanted to use the output of the first operation as the input to the second you'd have to write:
swapping_letters("decrypted.txt", "intermediate.txt");
swapping_letters("intermediate.txt", "swap.txt");
There are other ways of approaching this problem. By reading the file one character at a time, you are making quite a number of function calls...a million-byte file will involve 1 million calls to get() and 1 million calls to push_back(). Most of the time the internal buffering means this won't be too slow, but there are better ways:
Read whole ASCII file into C++ std::string
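A minimal sketch of the idiom from that link (function name hypothetical):

#include <fstream>
#include <sstream>
#include <string>

std::string slurp(const char* path)
{
    std::ifstream in(path, std::ios::binary);
    std::ostringstream contents;
    contents << in.rdbuf();  // one bulk copy instead of a get() call per character
    return contents.str();
}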
Note that if this is the actual problem you're solving, you don't actually need to read the whole file into memory. You can read the file in blocks (or character-by-character as you are doing) and do your output without holding the entire file.
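Conversely, a minimal sketch of that streaming approach (C++11, function name hypothetical), which never holds more than one character at a time:

#include <algorithm>
#include <fstream>
#include <iterator>

void swap_letters_streaming(const char* inName, const char* outName, char a, char b)
{
    std::ifstream in(inName, std::ios::binary);
    std::ofstream out(outName, std::ios::binary);

    // Replace a with b on the fly as characters stream from in to out.
    std::transform(std::istreambuf_iterator<char>(in),
                   std::istreambuf_iterator<char>(),
                   std::ostreambuf_iterator<char>(out),
                   [a, b](char c) { return c == a ? b : c; });
}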
An advanced idea that you may be interested in at some point is memory-mapped files. These let you treat a disk file as if it were a big array and easily modify it in memory, while letting the operating system worry about the details of how much of the file to page in or out at a time. They're a good fit for some problems, and there's a platform-independent C++ API for memory-mapped files in the Boost library:
http://en.wikipedia.org/wiki/Memory-mapped_file

HUGE .cpp file better than reading from text file?

Is it a legitimate optimisation to simply create a really HUGE source file which initialises a vector with hundreds of thousands of values manually, rather than parsing a text file with the same values into a vector?
Sorry, that could probably be worded better. The function that parses the text file is very slow due to C++'s stream reading being very slow (it takes about 6 minutes, as opposed to about 6 seconds in the C# version).
Would making a massive array initialisation file be a legitimate solution? It doesn't seem elegant, but if it's faster then I suppose it's better?
This is the file-reading code:
//parses the text path vector into the engine
void Level::PopulatePathVectors(string pathTable)
{
    // Read the file line by line.
    ifstream myFile(pathTable);

    for (unsigned int i = 0; i < nodes.size(); i++)
    {
        pathLookupVectors.push_back(vector<vector<int>>());

        for (unsigned int j = 0; j < nodes.size(); j++)
        {
            string line;

            if (getline(myFile, line)) //enter if a line is read successfully
            {
                stringstream ss(line);
                istream_iterator<int> begin(ss), end;
                pathLookupVectors[i].push_back(vector<int>(begin, end));
            }
        }
    }
    myFile.close();
}
A sample line from the text file (of which there are about half a million, of similar format but varying length):
0 5 3 12 65 87 n
First, make sure you're compiling with the highest optimization level available; then add the lines marked HERE below and test again. I doubt this will fix the problem, but it may help. Hard to say until I see the results.
//parses the text path vector into the engine
void Level::PopulatePathVectors(string pathTable)
{
    // Read the file line by line.
    ifstream myFile(pathTable);

    pathLookupVectors.reserve(nodes.size()); // HERE

    for (unsigned int i = 0; i < nodes.size(); i++)
    {
        pathLookupVectors.push_back(vector<vector<int> >());
        pathLookupVectors[i].reserve(nodes.size()); // HERE

        for (unsigned int j = 0; j < nodes.size(); j++)
        {
            string line;

            if (getline(myFile, line)) //enter if a line is read successfully
            {
                stringstream ss(line);
                istream_iterator<int> begin(ss), end;
                pathLookupVectors[i].push_back(vector<int>(begin, end));
            }
        }
    }
    myFile.close();
}
6 minutes vs 6 seconds!! There must be something wrong with your C++ code. Optimize it using good old methods before you resort to such an extreme "optimization" as the one mentioned in your post.
Also know that reading from a file allows you to change the vector contents without changing the source code. If you do it the way you mention, you'll have to re-code, compile, and link all over again.
It depends on whether the data change. If the data can or need to be changed after compile time, then the only option is to load them from a text file. If not, well, I don't see any harm in compiling them in.
I was able to get the following result with Boost.Spirit 2.5:
$ time ./test input
real 0m6.759s
user 0m6.670s
sys 0m0.090s
'input' is a file containing 500,000 lines, each with 10 random integers between 0 and 65535.
Here's the code:
#include <vector>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/classic_file_iterator.hpp>

using namespace std;
namespace spirit = boost::spirit;
namespace qi = boost::spirit::qi;
namespace ascii = boost::spirit::ascii;

typedef vector<int> ragged_matrix_row_type;
typedef vector<ragged_matrix_row_type> ragged_matrix_type;

template <class Iterator>
struct ragged_matrix_grammar : qi::grammar<Iterator, ragged_matrix_type()> {
    ragged_matrix_grammar() : ragged_matrix_grammar::base_type(ragged_matrix_) {
        ragged_matrix_ %= ragged_matrix_row_ % qi::eol;
        ragged_matrix_row_ %= qi::int_ % ascii::space;
    }
    qi::rule<Iterator, ragged_matrix_type()> ragged_matrix_;
    qi::rule<Iterator, ragged_matrix_row_type()> ragged_matrix_row_;
};

int main(int argc, char** argv){
    typedef spirit::classic::file_iterator<> ragged_matrix_file_iterator;

    ragged_matrix_type result;
    ragged_matrix_grammar<ragged_matrix_file_iterator> my_grammar;
    ragged_matrix_file_iterator input_it(argv[1]);

    qi::parse(input_it, input_it.make_end(), my_grammar, result);

    return 0;
}
At this point, result contains the ragged matrix, which can be confirmed by printing its contents. In my case the 'ragged matrix' isn't so ragged (it's a 500000 x 10 rectangle), but it shouldn't matter because I'm pretty sure the grammar is correct. I got even better results when I read the entire file into memory before parsing (~4 sec), but the code for that is longer, and it's generally undesirable to copy large files into memory in their entirety.
Note: my test machine has an SSD, so I don't know if you'll get the same numbers I did (unless your test machine has an SSD as well).
HTH!
I wouldn't consider compiling static data into your application to be bad practice. If there is little conceivable need to change your data without a recompilation, parsing the file at compile time not only improves runtime performance (since your data have been pre-parsed by the compiler and are in a usable format at runtime), but also reduces risks (like the data file not being found at runtime or any other parse errors).
Make sure that users won't have need to change the data (or have the means to recompile the program), document your motivation and you should be absolutely fine.
That said, you could make the iostream version a lot faster if necessary.
Using a huge array in a C++ file is a totally allowed option, depending on the case.
You must consider whether the data will change and how often.
If you put it in a C++ file, that means you will have to recompile your program each time the data change (and distribute it to your customers each time!), so that wouldn't be a good solution if you have to distribute the program to other people.
Now if a compilation is allowed for every data change, then you can have the best of both worlds: just use a small script (for example in Python or Perl) which takes your .txt and generates a C++ file, so the file parsing only has to be done once per data change. You can even integrate this step into your build process with automatic dependency management.
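A minimal sketch of such a generator, written in C++ for consistency with the rest of this thread (file names hypothetical; this assumes the data lines contain no characters that would need escaping inside a string literal):

#include <fstream>
#include <string>

// Emits paths_data.cpp from paths.txt: one string literal per input line.
// Re-run it (e.g. as a pre-build step) whenever the data change.
int main()
{
    std::ifstream in("paths.txt");
    std::ofstream out("paths_data.cpp");
    std::string line;

    out << "const char* const kPathLines[] = {\n";
    while (std::getline(in, line))
        out << "    \"" << line << "\",\n";
    out << "};\n";
    out << "const unsigned kPathLineCount = sizeof(kPathLines) / sizeof(kPathLines[0]);\n";
    return 0;
}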
Good luck!
Don't use the std input stream; it's extremely slow.
There are better alternatives.
Since people decided to downvote my answer because they are too lazy to use Google, here:
http://accu.org/index.php/journals/1539
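For illustration, a minimal sketch of one such alternative (a rough sketch, not a drop-in replacement for the code above): slurp the file with C stdio and parse with strtol, which skips leading whitespace itself and sets its end pointer to where parsing stopped, so non-numeric text such as the trailing 'n' on each line can simply be stepped over:

#include <cstdio>
#include <cstdlib>
#include <vector>

std::vector<int> parse_all_ints(const char* path)
{
    std::vector<int> values;
    std::FILE* f = std::fopen(path, "rb");
    if (!f)
        return values;

    // Read the whole file into one buffer in a single call.
    std::fseek(f, 0, SEEK_END);
    long size = std::ftell(f);
    std::fseek(f, 0, SEEK_SET);
    std::vector<char> text(size + 1);
    std::fread(&text[0], 1, size, f);
    std::fclose(f);
    text[size] = '\0';

    char* p = &text[0];
    char* end;
    while (*p)
    {
        long v = std::strtol(p, &end, 10);
        if (end == p) { ++p; continue; }  // not a number here, skip one char
        values.push_back(static_cast<int>(v));
        p = end;
    }
    return values;
}

(This flattens everything into one vector; splitting rows on '\n' before parsing recovers the per-line structure.)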