Incorporating text files in applications? - c++

Is there any way I can incorporate a fairly large text file (about 700 KB) into the program itself, so I don't have to ship the text file alongside it in the application directory? This is the first time I'm trying to do something like this, and I have no idea where to start.
Help is greatly appreciated (:

Depending on the platform that you are on, you will more than likely be able to embed the file in a resource container of some kind.
If you are programming on the Windows platform, then you might want to look into resource files. You can find a basic intro here:
http://msdn.microsoft.com/en-us/library/y3sk7e6b.aspx
With more detailed information here:
http://msdn.microsoft.com/en-us/library/zabda143.aspx

Have a look at the xxd command and its -include option. You will get a buffer and a length variable in a C formatted file.
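As a rough illustration, running xxd -i on a hypothetical file named message.txt that contains "Hello" and a newline should produce something close to the array and length below (the variable names are derived from the file name). The helper embedded_text is an assumption of this sketch and is not something xxd generates.

```cpp
#include <string>

// Roughly what `xxd -i message.txt` emits for a file containing "Hello\n".
// (message.txt is a hypothetical file name used only for this sketch.)
unsigned char message_txt[] = {
  0x48, 0x65, 0x6c, 0x6c, 0x6f, 0x0a
};
unsigned int message_txt_len = 6;

// Hypothetical helper: the array is not NUL-terminated, so build the
// string from the pointer and the explicit length.
std::string embedded_text() {
    return std::string(reinterpret_cast<const char*>(message_txt),
                       message_txt_len);
}
```

In a real project the generated file would normally be #included or compiled as its own translation unit rather than typed by hand.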

If you can figure out how to use a resource file, that would be the preferred method.
It wouldn't be hard to turn a text file into a file that can be compiled directly by your compiler. This might only work for small files - your compiler might have a limit on the size of a single string. If so, a tiny syntax change would make it an array of smaller strings that would work just fine.
You need to convert your file by adding a line at the top, enclosing each line within quotes, putting a newline character at the end of each line, escaping any quotes or backslashes in the text, and adding a semicolon at the end. You can write a program to do this, or it can easily be done in most editors.
This is my example document:
"Four score and seven years ago,"
can be found in the file c:\quotes\GettysburgAddress.txt
Convert it to:
static const char Text[] =
"This is my example document:\n"
"\"Four score and seven years ago,\"\n"
"can be found in the file c:\\quotes\\GettysburgAddress.txt\n"
;
This produces a variable Text which contains a single string with the entire contents of your file. It works because consecutive strings with nothing but whitespace between get concatenated into a single string.
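The conversion described above could be automated with a small function along these lines. This is only a sketch: the name to_c_literal is made up for illustration, it assumes the input has at least one line (an empty input would yield a declaration with no initializer), and it does not deal with trigraphs or non-printable characters.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: turns the text read from `in` into the source code
// of a C/C++ string constant named `name`, one quoted literal per line.
std::string to_c_literal(std::istream& in, const std::string& name) {
    std::ostringstream out;
    out << "static const char " << name << "[] =\n";
    std::string line;
    while (std::getline(in, line)) {
        out << '"';
        for (char c : line) {
            if (c == '"' || c == '\\') out << '\\';  // escape quotes and backslashes
            out << c;
        }
        out << "\\n\"\n";  // a literal \n inside the quotes, then a real newline
    }
    out << ";\n";
    return out.str();
}
```

Calling it with a std::ifstream opened on the text file and writing the result to a header would let the build regenerate the constant whenever the text changes.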

Related

How to input an arbitrary number of text files in C++?

so I'm working on a coding project for a class, and I understand the basic things I want to accomplish, but one thing that nobody seems to be able to help me with is inputting an unspecified number of text files. The user is prompted to enter the text files they want to compare (overall purpose of my code), separated by spaces, thus allowing them to compare an arbitrary amount of text files (eg. 2, 3, 8, 16, etc). I know that the getline function is helpful here, as well as searching for the number of "." because files can only contain one ".", all within a for loop. After that logic I am utterly lost. Eventually, I'm going to have to open the text files and put them in sets to compare them against every other file once, and output their similarities and differences into yet another text file. Any ideas?
Here is the general process I would try to follow (if I interpreted the prompt correctly)
Get the line of text files using getline
Put that into a stringstream
Open the next file in the stream while there is still information in the stringstream (not at eof)
Store all of that information in a Vector of strings, each new file just appended on after it is read
compare strings in the vector
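The first two steps above (read one line, then pull the names out of a stringstream) might look like the sketch below. The function name split_file_names is hypothetical, and it assumes file names do not themselves contain spaces.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits one line of space-separated file names.
std::vector<std::string> split_file_names(const std::string& line) {
    std::istringstream stream(line);
    std::vector<std::string> names;
    std::string name;
    while (stream >> name) {   // operator>> skips any run of whitespace
        names.push_back(name);
    }
    return names;
}
```

The line itself would come from std::getline(std::cin, line), and each returned name can then be opened with std::ifstream. There is no need to count "." characters.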
If you pass the text files on the command line, rather than getting them from a little dialog with the user via stdin, life will be easier. Most users will type
compare *
which on Unix-type systems is expanded to a list of files. On DOS you need to match and expand the wildcard yourself.
You've got an N squared problem, but the logic is easy, it's just
int main(int argc, char **argv)
{
    int i, j;
    /* compare every file against every later file exactly once */
    for (i = 1; i < argc; i++)
        for (j = i + 1; j < argc; j++)
            compare(argv[i], argv[j]);
    return 0;
}

C++- Code not working to alternate lines when writing on a text file

I need to create a program that reads strings from two different files and write these strings on a new file. The thing is, it must alternate both files, meaning that it should write a line from one file, and then one line from the other, and so on.
I'm having a problem with my code: it writes the first line of the first file, and then it writes all the lines from the second file.
Does anyone know how to solve this problem?
do {
getline(archivo1, sLinea);
archivoS << sLinea << endl;
getline(archivo2, sLinea2);
archivoS << sLinea2 << endl;
} while (!archivo1.eof() && !archivo2.eof());
The code looks correct and should work under normal circumstances. This might be a problem with the encoding of the second file, where the newline characters are not being recognised as such on your platform, which could result in the entire second file being interpreted as a single line by the C++ standard library.
Windows (CR+LF), Unix/Linux (LF), and Mac (CR) each have different conventions for newlines. Search about the carriage return and line feed characters across platforms to learn more about this topic.
To identify whether this is the issue, try running the code on two separate copies of the first file and see if it produces the expected output.
If newline encoding is your issue, you will either need to convert the second file to use your platform's newline encoding (you can use a tool like Notepad++ to easily do this) or incorporate logic which controls for this into your program.
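If you go the second route, one possible sketch of such logic is a line reader that accepts LF, CR+LF, or a lone CR as the end of a line. The names read_line_any_eol and split_lines are made up for this example; this is one way to do it, not a standard library facility.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: reads one line, accepting LF, CR+LF, or a lone CR
// as the terminator. Returns false only when nothing is left to read.
bool read_line_any_eol(std::istream& in, std::string& line) {
    line.clear();
    char c;
    bool got_any = false;
    while (in.get(c)) {
        got_any = true;
        if (c == '\n') return true;
        if (c == '\r') {
            if (in.peek() == '\n') in.get(c);  // swallow the LF of a CR+LF pair
            return true;
        }
        line += c;
    }
    return got_any;  // a last line without a terminator still counts
}

// Convenience wrapper for text that is already in memory.
std::vector<std::string> split_lines(const std::string& text) {
    std::istringstream in(text);
    std::vector<std::string> lines;
    std::string line;
    while (read_line_any_eol(in, line)) lines.push_back(line);
    return lines;
}
```

Using read_line_any_eol(archivo2, sLinea2) in place of getline would then return one line at a time even if the second file uses CR-only line endings.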
Check your second file. In all likelihood it does not contain the line delimiter "\n" on each line; there may be only one at the end.

How to keep characters in C++ from combining when outputted to text file

I have a fairly simple program with a vector of characters which is then outputted to a .txt file.
ofstream op ("output.txt");
vector <char> outp;
for(int i=0;i<outp.size();i++){
op<<outp[i]; //the final output of this is incorrect
cout<<outp[i]; //this output is correct
}
op.close();
The text that is output by cout is correct, but when I open the text file that was created, the output is wrong, with what look like Chinese characters that shouldn't have been an option for the program to output. For example, when the program should output:
O dsof
And cout prints the right output, the .txt file has this:
O獤景
I have even tried adding the characters into a string before outputting it but it doesn't help. My best guess is that the characters are combining together and getting a different value for unicode or ascii but I don't know enough about character codes to know for sure or how to stop this from happening. Is there a way to correct the output so that it doesn't do this? I am currently using a windows 8.1 computer with code::blocks 12.11 and the GNU GCC compiler in case that helps.
Some text editors try to guess the encoding of a file and occasionally get it wrong. This can particularly happen with very small amounts of text, because whatever statistical analysis is being used just doesn't have enough data to reach a good conclusion. Windows Notepad has/had an infamous example with the text "Bush hid the facts".
More advanced text editors (for example Notepad++) may either not experience the same problem or may give you options to change what encoding is being assumed. You could use such to verify that the contents of the file are actually correct.
Hex editors/viewers are another way, since they allow you to examine the raw bytes of the file without interpretation. For instance, HxD is a hex editor that I have used in the past.
Alternatively, you can simply output more text. The more there is, generally the less likely something will guess wrong. From some of my experiences, newlines are particularly helpful in convincing the text editor to assume the correct encoding.
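If no hex editor is at hand, the raw bytes can also be checked from the program itself. The sketch below formats a byte string as hex; the name to_hex is hypothetical, and it assumes the file contents have already been read into a std::string (for example from a std::ifstream opened with std::ios::binary).

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: formats raw bytes as space-separated two-digit hex.
std::string to_hex(const std::string& bytes) {
    static const char digits[] = "0123456789abcdef";
    std::string out;
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        unsigned char b = static_cast<unsigned char>(bytes[i]);
        if (i != 0) out += ' ';
        out += digits[b >> 4];     // high nibble
        out += digits[b & 0x0f];   // low nibble
    }
    return out;
}
```

For the text "O dsof" this gives 4f 20 64 73 6f 66. If the file on disk contains exactly those bytes, the program wrote plain ASCII and the editor is misreading it.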
There is nothing wrong with your code.
Maybe the text editor you use has a default encoding.
Use a more advanced editor and you will get the right output.

Folder with 1300 png files into html images list

I've got folder with about 1300 png icons. What I need is html file with all of them inside like:
<img src="path-to-image.png" alt="file name without .png" id="file-name-without-.png" class="icon"/>
It's easy as hell, but with that number of files it's a pure waste of time to do it manually. Do you have any ideas how to automate it?
If you need it just once, then do a "dir" or "ls" and redirect it to a file, then use an editor with macro-ability like notepad++ to record modifying a single line like you desire, then hit play macro for the remainder of the file. If it's dynamic, use PHP.
I would not use C++ to do this. I would use vi, honestly, because running regular expressions repeatedly is all that is needed for this.
But you can do this in C++. I would start with a plain text file with all the file names generated by dir or ls on the command prompt.
Then write code that takes a line of input and turns it into a line formatted the way you want. Test this and get it working on a single line first.
The RE engine of C++ is probably overkill (and is not all that well supported in compilers), but substr and basic find and replace is all you need. Is there a string library you are familiar with? std::string would do.
To generate the file name without PNG, check the last four characters and see if they exist and are .PNG (if not report an error). Then strip them. To remove dashes, copy characters to a new string but if you are reading a dash write a space. Everything else is just string concatenation.
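A minimal sketch of that per-line transformation, using only std::string, could look like this. The function name make_img_tag is made up; it assumes a lowercase ".png" extension and signals an error by returning an empty string, and it keeps dashes in the id while replacing them with spaces in the alt text.

```cpp
#include <string>

// Hypothetical helper: builds one <img> line from a file name such as
// "arrow-left.png". Returns an empty string if the name does not end in .png.
std::string make_img_tag(const std::string& file_name) {
    const std::string ext = ".png";
    if (file_name.size() <= ext.size() ||
        file_name.compare(file_name.size() - ext.size(), ext.size(), ext) != 0) {
        return "";  // caller can report the error
    }
    const std::string stem = file_name.substr(0, file_name.size() - ext.size());
    std::string alt = stem;
    for (char& c : alt) {
        if (c == '-') c = ' ';  // dashes become spaces in the alt text
    }
    return "<img src=\"" + file_name + "\" alt=\"" + alt +
           "\" id=\"" + stem + "\" class=\"icon\"/>";
}
```

Reading the file list line by line with std::getline and writing make_img_tag(line) for each entry would produce the whole HTML list.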

Including files as raw string literals [duplicate]

I have a C++ source file and a Python source file. I'd like the C++ source file to be able to use the contents of the Python source file as a big string literal. I could do something like this:
char* python_code = "
#include "script.py"
"
But that won't work because there need to be \'s at the end of each line. I could manually copy and paste in the contents of the Python code and surround each line with quotes and a terminating \n, but that's ugly. Even though the python source is going to effectively be compiled into my C++ app, I'd like to keep it in a separate file because it's more organized and works better with editors (emacs isn't smart enough to recognize that a C string literal is python code and switch to python mode while you're inside it).
Please don't suggest I use PyRun_File, that's what I'm trying to avoid in the first place ;)
The C/C++ preprocessor acts in units of tokens, and a string literal is a single token. As such, you can't intervene in the middle of a string literal like that.
You could preprocess script.py into something like:
"some code\n"
"some more code that will be appended\n"
and #include that, however. Or you can use xxd -i to generate a C static array ready for inclusion.
This won't get you all the way there, but it will get you pretty damn close.
Assuming script.py contains this:
print "The current CPU time in seconds is: ", time.clock()
First, wrap it up like this:
STRINGIFY(print "The current CPU time in seconds is: ", time.clock())
Then, just before you include it, do this:
#define STRINGIFY(x) #x
const char * script_py =
#include "script.py"
;
There's probably an even tighter answer than that, but I'm still searching.
The best way to do something like this is to include the file as a resource if your environment/toolset has that capability.
If not (like embedded systems, etc.), you can use a bin2c utility (something like http://stud3.tuwien.ac.at/~e0025274/bin2c/bin2c.c). It'll take a file's binary representation and spit out a C source file that includes an array of bytes initialized to that data. You might need to do some tweaking of the tool or the output file if you want the array to be '\0' terminated.
Incorporate running the bin2c utility into your makefile (or as a pre-build step of whatever you're using to drive your builds). Then just have the file compiled and linked with your application and you have your string (or whatever other image of the file) sitting in a chunk of memory represented by the array.
If you're including a text file as a string, one thing you should be aware of is that the line endings might not match what functions expect. This might be another thing you'd want to add to the bin2c utility, or you'll want to make sure your code properly handles whatever line endings are in the file. Maybe modify the bin2c utility to have a '-s' switch that indicates you want a text file incorporated as a string, so line endings will be normalized and a zero byte will be at the end of the array.
You're going to have to do some of your own processing on the Python code, to deal with any double-quotes, backslashes, trigraphs, and possibly other things, that appear in it. You can at the same time turn newlines into \n (or backslash-escape them) and add the double-quotes on either end. The result will be a header file generated from the Python source file, which you can then #include. Use your build process to automate this, so that you can still edit the Python source as Python.
You could use Cog as part of your build process (to do the preprocessing and to embed the code). I admit that the result of this is probably not ideal, since then you end up seeing the code in both places. But any time I see "Python", "C++", and "Preprocessor" in close proximity, I feel it deserves a mention.
Here is how to automate the conversion with cmd.exe
------ html2h.bat ------
@echo off
echo const char * html_page = "\
sed "/.*/ s/$/ \\n\\/" ../src/page.html | sed s/\"/\\\x22/g
echo.
echo ";
It was called like
cmd /c "..\Debug\html2h.bat" > "..\debug\src\html.h"
and attached to the code by
#include "../Debug/src/html.h"
printf("%s\n", html_page);
This is a quite system-dependent approach but, like most people, I disliked the hex dump.
Use fopen, getline, and fclose.