Include massive text file in C++ program - c++

I have a comma delimited text file that has a few million entries. After every 23 entries there is a newline. I will add each full line as an instance of a vector, with the 23 fields as instances of a sub-vector. So, the first instance will be vec[0][0-22], followed by vec[1][0-22], etc.
This file is a part of my program and needs to be compiled with it. Meaning, I don't want to have to provide the file additionally and use ifstream to read the data from the separate file.
I already can sort the data using ifstream, but now I need to integrate the raw data into the program so that I can compile it all together.
I am unable to make this large comma-delimited-field text file into one long string and then separate it into fields because some of the fields have quotes within them, with commas between the quotes too.
example:
`19891656,PLANTAE,TRACHEOPHYTA,MAGNOLIOPSIDA,FABALES,FABACEAE,Zygia,ampla,(Benth.) Pittier,,,,,Pithecellobium amplum |Pithecolobium brevispicatum ,Jarendeua de Sapo,,,LC,,3.1,2012,stable,N
19891919,PLANTAE,TRACHEOPHYTA,MAGNOLIOPSIDA,FABALES,FABACEAE,Zygia,biflora,L.Rico,,,,,,,,,VU,B2ab(iii),3.1,2012,stable,N
2060,ANIMALIA,CHORDATA,MAMMALIA,CARNIVORA,OTARIIDAE,Arctocephalus,pusillus,"(Schreber, 1775)",,,,,Phoca pusilla,"Afro-Australian Fur Seal, Australian Fur Seal, Brown Fur Seal, Cape Fur Seal, South African Fur Seal",Arctocphale d'Afrique du Sud,,LC,,3.1,2015,increasing,N`
When my program runs it will source data from this mass of text, and it will not need to use ifstream with a path to an external file. How can I include this text file in my program? Is there a way to "include" text files? If I need to make a massive array of strings, how do I do this with quoted fields with commas between the quotes? I would be happy to clarify any part of this question which seems vague as I am really curious as to how I can make this work.
Technically this text file is a csv, but I am hesitant to include csv as a tag because I think people will think I am looking for a csv parsing solution.

You may want to write a script that converts each line of your data file into an initializer of a record struct, with a trailing comma after each line (except the last line, if you don't want to use a terminator entry; see below). This script may be specific to your data type. Say,
12,Joe,,,YES -> MyType(12,"Joe",0,0,true),
Then #include the entire converted file in place of your data array/vector element initializers, for example:
MyType myData [] =
{
#include "my_data_file_converted"
MyType() //an optional terminal entry
};
Of course MyType should have constructor(s) accepting your initialization sequences.
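As a rough sketch of what such a record type could look like: the field names, the `terminal` flag and the `recordCount` helper below are hypothetical, and the two rows written inline stand in for the lines that `#include "my_data_file_converted"` would pull in. Note that a quoted field containing commas is harmless once it is a C++ string literal.

```cpp
#include <cstddef>
#include <string>

// Hypothetical record type; field names and types are illustrative only.
struct MyType {
    int id;
    std::string name;
    int a;
    int b;
    bool flag;
    bool terminal;  // true only for the sentinel (terminal) entry

    // Default constructor: used by the optional terminal entry.
    MyType() : id(0), a(0), b(0), flag(false), terminal(true) {}

    // Constructor matching the initializers the conversion script emits.
    MyType(int id_, const char* name_, int a_, int b_, bool flag_)
        : id(id_), name(name_), a(a_), b(b_), flag(flag_), terminal(false) {}
};

// In the real program the two rows below would come from
//   #include "my_data_file_converted"
MyType myData[] = {
    MyType(12, "Joe", 0, 0, true),
    MyType(13, "Ann, Jr.", 0, 0, false),  // commas inside the literal are fine
    MyType()                              // terminal entry
};

// Count the records that precede the terminal entry.
std::size_t recordCount() {
    std::size_t n = 0;
    while (!myData[n].terminal) ++n;
    return n;
}
```

With the terminal entry in place, code can walk the array without knowing its size in advance; without it, `sizeof(myData) / sizeof(myData[0])` gives the count instead.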

Related

ofstream overwriting the selected part of a txt file

I am new to C++98. I am reading some fields from a large text file, and I want to update only 4 out of 50 lines in it. Here is my code. It gets the text from a lineEdit of a Qt4 form.
strcpy(Name,ui->lineEdit_1->setText(QString::fromStdString(Name)) );
strcpy(Class,ui->lineEdit_1->setText(QString::fromStdString(Class)));
strcpy(Grade,ui->lineEdit_1->setText(QString::fromStdString(Grade)));
std::fstream myfile;
myfile.open(mypath,std::fstream::in | std::fstream::out );
myfile<<Name<<"\t"<<"Name"<<"\n";
myfile<<Class<<"\t"<<"Class"<<"\n";
myfile<<Grade<<"\t"<<"Grade"<<"\n";
Here is sample.conf.txt:
Hello. Name
One. Class
Two. Classsec
A+. Grade
B+. Gradesec
On updating it by adding random values:
Name AA
Class BB
Grade CC
After executing the above code, it shows this updated sample.conf.txt:
AA Name
BB Class
CC Grade
A+. Grade
B+. Gradesec
It should be like this Model instead:
AA Name
BB. Class
Two. Classsec
CC. Grade
B+. Gradesec
This means that fstream is just:
1- overwriting the top 3 lines in the file, leaving the rest of the file intact.
2- not locating the field by name in order to overwrite its value according to the input.
How can I find a specific line by its field name and overwrite the corresponding value, or write column-wise? How can I accomplish this task? Please help.
You cannot do this. When you update a file, then any text you write will replace exactly the same number of bytes as the size of the text you are writing.
It's not the case that if you write three lines of text, they will replace the first three lines of text currently in the file (unless those two pieces of text happen to be exactly the same length).
Unless you are doing binary IO with fixed length records then trying to update files is not the way to go. Instead your program should read in the whole file into some data structure, manipulate that data structure as required, and then write out the entire data structure to a file, replacing the whole contents of the file.
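A minimal sketch of that read, modify, rewrite approach, written in C++98 style. It assumes every line has the form "value<TAB>label", as the question's output code suggests; the function name `updateFile` and the use of a map from label to new value are illustrative choices, not something from the question.

```cpp
#include <cstddef>
#include <fstream>
#include <map>
#include <string>
#include <vector>

// Rewrites `path`, replacing the value on every line whose label (the text
// after the first tab) is a key in `updates`. Lines without a tab, or with
// an unknown label, are written back unchanged.
bool updateFile(const std::string& path,
                const std::map<std::string, std::string>& updates) {
    // 1. Read the whole file into memory.
    std::vector<std::string> lines;
    {
        std::ifstream in(path.c_str());
        if (!in) return false;
        std::string line;
        while (std::getline(in, line)) lines.push_back(line);
    }

    // 2. Modify the in-memory copy.
    for (std::size_t i = 0; i < lines.size(); ++i) {
        std::size_t tab = lines[i].find('\t');
        if (tab == std::string::npos) continue;
        std::string label = lines[i].substr(tab + 1);
        std::map<std::string, std::string>::const_iterator it = updates.find(label);
        if (it != updates.end()) lines[i] = it->second + "\t" + label;
    }

    // 3. Write everything back, replacing the old contents.
    std::ofstream out(path.c_str(), std::ios::out | std::ios::trunc);
    if (!out) return false;
    for (std::size_t i = 0; i < lines.size(); ++i) out << lines[i] << '\n';
    return true;
}
```

Because labels are compared as whole strings, updating "Class" leaves a "Classsec" line untouched.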

How to load a string made of 2+ words into a string array

For my basics of programming project number 2, I have to construct a small app that teaches the user English words and expressions. Apart from the source code, my project also needs two .txt files attached. Both of the txt files contain 100 words, in Polish and English respectively. I know how to input words from the file into a string array, but I have trouble with expressions. For example, "To kill two birds with one stone" has to be just one element of the array I'll be using in the program. Each of the words/expressions in the txt file is contained within one line like this:
Dog Cat Woman By the skin off their teeth Lion
Is there a function in fstream library that conveniently allows me to accomplish my task?
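One possible approach, assuming each word or expression really does sit on its own line: `std::getline` reads up to the end of the line, so spaces inside an expression are kept. The helper name `loadLines` is made up for this sketch, and it takes any `std::istream` so it works with an `std::ifstream` as well as with a string stream.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Reads every non-empty line of `in` as one element, spaces included.
std::vector<std::string> loadLines(std::istream& in) {
    std::vector<std::string> entries;
    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty()) entries.push_back(line);
    }
    return entries;
}
```

With a file, this would be called as `std::ifstream f("english.txt"); loadLines(f);` where the file name is only a placeholder.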

How to input an arbitrary number of text files in C++?

so I'm working on a coding project for a class, and I understand the basic things I want to accomplish, but one thing that nobody seems to be able to help me with is inputting an unspecified number of text files. The user is prompted to enter the text files they want to compare (overall purpose of my code), separated by spaces, thus allowing them to compare an arbitrary amount of text files (eg. 2, 3, 8, 16, etc). I know that the getline function is helpful here, as well as searching for the number of "." because files can only contain one ".", all within a for loop. After that logic I am utterly lost. Eventually, I'm going to have to open the text files and put them in sets to compare them against every other file once, and output their similarities and differences into yet another text file. Any ideas?
Here is the general process I would try to follow (if I interpreted the prompt correctly)
Get the line of text files using getline
Put that into a stringstream
Open the next file in the stream while there is still information in the stringstream (not at eof)
Store all of that information in a Vector of strings, each new file just appended on after it is read
compare strings in the vector
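The first steps of that process might be sketched as follows. The function names are hypothetical, and the sketch assumes file names contain no spaces, since whitespace is used as the separator.

```cpp
#include <fstream>
#include <sstream>
#include <string>
#include <vector>

// Splits one line of user input into whitespace-separated file names.
std::vector<std::string> splitNames(const std::string& line) {
    std::istringstream ss(line);
    std::vector<std::string> names;
    std::string name;
    while (ss >> name) names.push_back(name);
    return names;
}

// Reads every whitespace-separated word of one file.
// Returns an empty vector if the file cannot be opened.
std::vector<std::string> readWords(const std::string& path) {
    std::vector<std::string> words;
    std::ifstream in(path.c_str());
    std::string word;
    while (in >> word) words.push_back(word);
    return words;
}
```

A caller would read the prompt line with `std::getline(std::cin, line)`, pass it to `splitNames`, and call `readWords` once per name before comparing the results.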
If you pass the text files on the commandline rather than getting them from a little dialog with the user via stdin life will be easier. Most users will type
compare *
which on Unix-type systems is expanded to a list of files. On DOS you need to match and expand the wildcard yourself.
You've got an N squared problem, but the logic is easy, it's just
int main(int argc, char **argv)
{
    int i, j;
    /* compare every pair of files exactly once */
    for (i = 1; i < argc; i++)
        for (j = i + 1; j < argc; j++)
            compare(argv[i], argv[j]);
    return 0;
}

Folder with 1300 png files into html images list

I've got folder with about 1300 png icons. What I need is html file with all of them inside like:
<img src="path-to-image.png" alt="file name without .png" id="file-name-without-.png" class="icon"/>
It's easy as hell, but with that number of files it's a pure waste of time to do it manually. Do you have any ideas how to automate it?
If you need it just once, then do a "dir" or "ls" and redirect it to a file, then use an editor with macro-ability like notepad++ to record modifying a single line like you desire, then hit play macro for the remainder of the file. If it's dynamic, use PHP.
I would not use C++ to do this. I would use vi, honestly, because running regular expressions repeatedly is all that is needed for this.
But you can do this in C++. I would start with a plain text file containing all the file names, generated by dir or ls at the command prompt.
Then write code that takes a line of input and turns it into a line formatted the way you want. Test this and get it working on a single line first.
The RE engine of C++ is probably overkill (and is not all that well supported in compilers), but substr and basic find and replace is all you need. Is there a string library you are familiar with? std::string would do.
To generate the file name without PNG, check the last four characters and see if they exist and are .PNG (if not report an error). Then strip them. To remove dashes, copy characters to a new string but if you are reading a dash write a space. Everything else is just string concatenation.
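A small sketch of that per-line conversion using only `std::string`. The function name `makeImgTag` is made up here, the extension check is case-sensitive (lowercase `.png` only), and nothing is HTML-escaped, so it assumes the file names contain no quotes or angle brackets.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Turns "some-icon.png" into the <img> line from the question.
// alt gets the name with dashes replaced by spaces; id keeps the dashes.
std::string makeImgTag(const std::string& fileName) {
    const std::string ext = ".png";
    if (fileName.size() <= ext.size() ||
        fileName.compare(fileName.size() - ext.size(), ext.size(), ext) != 0) {
        throw std::invalid_argument("not a .png file: " + fileName);
    }

    // Strip the extension.
    const std::string stem = fileName.substr(0, fileName.size() - ext.size());

    // Copy the stem, writing a space wherever a dash is read.
    std::string alt = stem;
    for (std::size_t i = 0; i < alt.size(); ++i) {
        if (alt[i] == '-') alt[i] = ' ';
    }

    return "<img src=\"" + fileName + "\" alt=\"" + alt +
           "\" id=\"" + stem + "\" class=\"icon\"/>";
}
```

Calling it on each line of the dir/ls listing and writing the results to a file gives the body of the HTML page.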

Incorporating text files in applications?

Is there any way I can incorporate a pretty large text file (about 700 KB) into the program itself, so I don't have to ship the text file together with the application in its directory? This is the first time I'm trying to do something like this, and I have no idea where to start.
Help is greatly appreciated (:
Depending on the platform that you are on, you will more than likely be able to embed the file in a resource container of some kind.
If you are programming on the Windows platform, then you might want to look into resource files. You can find a basic intro here:
http://msdn.microsoft.com/en-us/library/y3sk7e6b.aspx
With more detailed information here:
http://msdn.microsoft.com/en-us/library/zabda143.aspx
Have a look at the xxd command and its -include option. You will get a buffer and a length variable in a C formatted file.
If you can figure out how to use a resource file, that would be the preferred method.
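For the xxd route, here is a sketch of what consuming its output can look like. It assumes a file named hello.txt containing just "Hi" and a newline; running `xxd -i hello.txt` emits an array and a length variable named after the file, reproduced by hand below. The helper `embeddedText` is hypothetical.

```cpp
#include <string>

// Shape of what `xxd -i hello.txt` generates for a file containing "Hi\n".
// In a real build this part would live in the generated file and be #included.
unsigned char hello_txt[] = {
  0x48, 0x69, 0x0a
};
unsigned int hello_txt_len = 3;

// Wraps the embedded bytes in a std::string. The explicit length matters
// because the generated array is not null-terminated.
std::string embeddedText() {
    return std::string(reinterpret_cast<const char*>(hello_txt), hello_txt_len);
}
```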
It wouldn't be hard to turn a text file into a file that can be compiled directly by your compiler. This might only work for small files - your compiler might have a limit on the size of a single string. If so, a tiny syntax change would make it an array of smaller strings that would work just fine.
You need to convert your file by adding a line at the top, enclosing each line within quotes, putting a newline character at the end of each line, escaping any quotes or backslashes in the text, and adding a semicolon at the end. You can write a program to do this, or it can easily be done in most editors.
This is my example document:
"Four score and seven years ago,"
can be found in the file c:\quotes\GettysburgAddress.txt
Convert it to:
static const char Text[] =
"This is my example document:\n"
"\"Four score and seven years ago,\"\n"
"can be found in the file c:\\quotes\\GettysburgAddress.txt\n"
;
This produces a variable Text which contains a single string with the entire contents of your file. It works because consecutive strings with nothing but whitespace between get concatenated into a single string.
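If you would rather write a program for the conversion than do it in an editor, a sketch could look like this. The function name `toCppLiteral` is made up; it handles only the quote and backslash escaping described above and does not deal with non-printable characters or compiler limits on literal length.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Converts the text read from `in` into C++ source that defines
// `static const char <varName>[]` holding the same text.
std::string toCppLiteral(std::istream& in, const std::string& varName) {
    std::string out = "static const char " + varName + "[] =\n";
    std::string line;
    while (std::getline(in, line)) {
        out += '"';
        for (std::size_t i = 0; i < line.size(); ++i) {
            // Escape the two characters that would break a string literal.
            if (line[i] == '"' || line[i] == '\\') out += '\\';
            out += line[i];
        }
        out += "\\n\"\n";  // a literal \n inside the quotes, then end the source line
    }
    out += ";\n";
    return out;
}
```

Passing an `std::ifstream` opened on the text file and writing the returned string to a header gives a file that can be #included directly.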