How to get the bytes of a file? - c++

Simple question I know. What I want to do is get the bytes of a file so I can add them to a bit array, which I can then use to write a file named bytes.exe and launch it. I know how to read the bytes of an existing file at runtime, but I don't know how to get the bytes of a file to copy and paste into my bitarray[] at design time.
The goal is to be able to write the bytes of bitarray[] to myfile.exe at runtime and then launch said file. There are many bitarray[]'s I'll be using, based on many different file types, so I'm looking for an easy method.
Is there some kind of decompiler that should be used? I just looked into resource scripts, but I don't want to attach any dependencies to my main .exe.

If you are targeting Windows, the easiest way to do this is to embed myfile.exe as a resource, then load the resource at runtime and create a file and write the contents of the resource to your new file.
If you can't use resources, then you'll need to create a source file (.c or .h) that initializes a byte array with the contents of myfile.exe and include that as part of your build. Check out this answer for one possible approach:
https://stackoverflow.com/a/73653/333127
EDIT: After further review, I don't think the source code in the link I referenced above will work for binary input files. Here's a quick alternative I just threw together:
#include <stdio.h>
#include <stdlib.h>

#define BYTES_PER_LINE 70

int main(int argc, char* argv[])
{
    FILE* fp;
    int ch;
    int numBytes = 0;

    if (argc < 2) {
        printf("Usage: tobytes <file>\n");
        exit(1);
    }
    fp = fopen(argv[1], "rb");
    if (fp == NULL) {
        printf("Cannot open file %s\n", argv[1]);
        exit(1);
    }
    printf("char fileContents[] = {\n");
    while ((ch = fgetc(fp)) != EOF) {
        if (numBytes > 0)
            printf(",");
        ++numBytes;
        if (numBytes % BYTES_PER_LINE == 0)
            printf("\n");
        printf("0x%x", ch);
    }
    printf("\n};\n");
    fclose(fp);
    return 0;
}

It's not 100% clear what you want to do, but why not write a small program that reads a file and translates it into a C array?
That is, if the file data is:
01 02 03 04 (binary)
the program will generate a file that is:
char data[] = {0x01, 0x02, 0x03, 0x04};
Then run this program as a pre-build step of your application (in your Makefile or whatever build system you are using) and generate the output into your source tree.
That way the data is compiled into your application and available statically.
As I said, I'm not clear if this is the problem you are trying to solve.

Related

fread issues on Windows 7 64-bit

I'm trying to create a part of my program that writes a binary file. It seems to work fine, but to be sure, I created another part which reads this binary file back after I close it.
Here come the issues. I use the fopen/fwrite/fread functions and Visual Studio 12 on Windows 7 64-bit, and it seems that fread doesn't work correctly on Windows (I tried reading on Linux, no problem; when I copy/paste the code, it compiles, but the values I get are bad).
Here is my code to read:
int en;
float fl;
double dl;
char c;
FILE *F;
int cpt;

cpt = 0;
if ((F = fopen("SimpleTest.twd", "rb")) == NULL)
{
    printf("error on fopen\n");
    return ;
}
while (cpt < 9)
{
    fread(&fl, 4, 1, F);
    printf("%f\n", fl);
    cpt++;
}
I included cstdlib and cstdio, and I'm sure SimpleTest.twd exists and that the file's location is correct.
Thanks.

Append to gzipped Tar-Archive

I've written a program, generating a tarball, which gets compressed by zlib.
At regular intervals, the same program is supposed to add a new file to the tarball.
By definition, the tarball needs empty records (512-byte blocks) at its end to work properly, which already shows my problem.
According to the documentation, gzopen is unable to open the file in r+ mode, meaning I can't simply jump to the beginning of the empty records, append my file information, and seal it again with empty records.
Right now, I'm at my wits' end. Appending works fine with zlib, as long as the empty records are not involved, yet I need them to 'finalize' my compressed tarball.
Any ideas?
Ah yes, it would be nice if I could avoid decompressing the whole thing and/or parsing the entire tarball.
I'm also open to other (preferably simple) file formats I could implement instead of tar.
This is two separate problems, both of which are solvable.
The first is how to append to a tar file. All you need to do there is overwrite the final two zeroed 512-byte blocks with your file. You would write the 512-byte tar header, your file rounded up to an integer number of 512-byte blocks, and then two 512-byte blocks filled with zeros to mark the new end of the tar file.
The second is how to frequently append to a gzip file. The simplest approach is to write separate gzip streams and concatenate them. Write the last two 512-byte zeroed blocks in a separate gzip stream, and remember where that starts. Then overwrite that with a new gzip stream with the new tar entry, and then another gzip stream with the two end blocks. This can be done by seeking back in the file with lseek() and then using gzdopen() to start writing from there.
That will work well, with good compression, for added files that are large (at a minimum several tens of kilobytes). If however you are adding very small files, simply concatenating small gzip streams will result in lousy compression, or worse, expansion. You can do something more complicated to actually add small amounts of data to a single gzip stream so that the compression algorithm can make use of the preceding data for correlation and string matching. For that, take a look at the approach in gzlog.h and gzlog.c in examples/ in the zlib distribution.
Here is an example of how to do the simple approach:
/* tapp.c -- Example of how to append to a tar.gz file with concatenated gzip
   streams.  Placed in the public domain by Mark Adler, 16 Jan 2013. */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <assert.h>
#include <unistd.h>
#include <fcntl.h>
#include "zlib.h"

#define local static

/* Build an allocated string with the prefix string and the NULL-terminated
   sequence of word strings separated by spaces.  The caller should free the
   returned string when done with it. */
local char *build_cmd(char *prefix, char **words)
{
    size_t len;
    char **scan;
    char *str, *next;

    len = strlen(prefix) + 1;
    for (scan = words; *scan != NULL; scan++)
        len += strlen(*scan) + 1;
    str = malloc(len);                              assert(str != NULL);
    next = stpcpy(str, prefix);
    for (scan = words; *scan != NULL; scan++) {
        *next++ = ' ';
        next = stpcpy(next, *scan);
    }
    return str;
}

/* Usage:

       tapp archive.tar.gz addthis.file andthisfile.too

   tapp will create a new archive.tar.gz file if it doesn't exist, or it will
   append the files to the existing archive.tar.gz.  tapp must have been used
   to create the archive in the first place.  If it did not, then tapp will
   exit with an error and leave the file unchanged.  Each use of tapp appends
   a new gzip stream whose compression cannot benefit from the files already
   in the archive.  As a result, tapp should not be used to append a small
   amount of data at a time, else the compression will be particularly poor.
   Since this is just an instructive example, the error checking is done
   mostly with asserts. */
int main(int argc, char **argv)
{
    int tgz;
    off_t offset;
    char *cmd;
    FILE *pipe;
    gzFile gz;
    int page;
    size_t got;
    int ret;
    ssize_t raw;
    unsigned char buf[3][512];
    const unsigned char z1k[] =     /* gzip stream of 1024 zeros */
        {0x1f, 0x8b, 8, 0, 0, 0, 0, 0, 2, 3, 0x63, 0x60, 0x18, 5, 0xa3, 0x60,
         0x14, 0x8c, 0x54, 0, 0, 0x2e, 0xaf, 0xb5, 0xef, 0, 4, 0, 0};

    if (argc < 2)
        return 0;
    tgz = open(argv[1], O_RDWR | O_CREAT, 0644);    assert(tgz != -1);
    offset = lseek(tgz, 0, SEEK_END);
    assert(offset == 0 || offset >= (off_t)sizeof(z1k));
    if (offset) {
        if (argc == 2) {
            close(tgz);
            return 0;
        }
        offset = lseek(tgz, -sizeof(z1k), SEEK_END);    assert(offset != -1);
        raw = read(tgz, buf, sizeof(z1k));          assert(raw == sizeof(z1k));
        if (memcmp(buf, z1k, sizeof(z1k)) != 0) {
            close(tgz);
            fprintf(stderr, "tapp abort: %s was not created by tapp\n",
                    argv[1]);
            return 1;
        }
        offset = lseek(tgz, -sizeof(z1k), SEEK_END);    assert(offset != -1);
    }
    if (argc > 2) {
        gz = gzdopen(tgz, "wb");                    assert(gz != NULL);
        cmd = build_cmd("tar cf - -b 1", argv + 2);
        pipe = popen(cmd, "r");                     assert(pipe != NULL);
        free(cmd);
        got = fread(buf, 1, 1024, pipe);            assert(got == 1024);
        page = 2;
        while ((got = fread(buf[page], 1, 512, pipe)) == 512) {
            if (++page == 3)
                page = 0;
            ret = gzwrite(gz, buf[page], 512);      assert(ret == 512);
        }                                           assert(got == 0);
        ret = pclose(pipe);                         assert(ret != -1);
        ret = gzclose(gz);                          assert(ret == Z_OK);
        tgz = open(argv[1], O_WRONLY | O_APPEND);   assert(tgz != -1);
    }
    raw = write(tgz, z1k, sizeof(z1k));             assert(raw == sizeof(z1k));
    close(tgz);
    return 0;
}
In my opinion this is not possible while conforming strictly to the TAR standard. I have read through the zlib[1] manual and the GNU tar[2] file specification, and I did not find any information on how appending to a TAR can be implemented, so I am assuming it has to be done by overwriting the empty blocks.
So I assume, again, that you can do it by using gzseek(). However, you would need to know how large the uncompressed archive is and set the offset to size - 2*512.
Note that this might be cumbersome, since "The whence parameter is defined as in lseek(2); the value SEEK_END is not supported."[1] and you can't open the file for reading and writing at the same time, i.e. to introspect where the end blocks are.
However, it should be possible by abusing the TAR spec slightly. The GNU tar[2] docs mention something funny:
"
Each file archived is represented by a header block which describes the file, followed by zero or more blocks which give the contents of the file. At the end of the archive file there are two 512-byte blocks filled with binary zeros as an end-of-file marker. A reasonable system should write such end-of-file marker at the end of an archive, but must not assume that such a block exists when reading an archive. In particular GNU tar always issues a warning if it does not encounter it.
"
This means you can deliberately not write those blocks. That is easy if you wrote the tarball compressor yourself. Then you can use zlib in its normal append mode, remembering that the TAR decompressor must be aware of the "broken" TAR file.
[1]http://www.zlib.net/manual.html#Gzip
[2]http://www.gnu.org/software/tar/manual/html_node/Standard.html#SEC182

reading this text in C/C++

Hi, I am trying to read this text using a file input stream of some sort:
E^#^#<a^R#^##^FÌø<80>è^AÛ<80>è ^F \^DÔVn3Ï^#^#^#^# ^B^VÐXâ^#^#^B^D^E´^D^B^H
IQRÝ^#^#^#^#^A^C^C^GE^#^#<^#^##^##^F.^K<80>è ^F<80>è^AÛ^DÔ \»4³ÕVn3Р^R^V J ^#^#^B^D^E´^D^B^H
^#g<9f><86>IQRÝ^A^C^C^GE^#^#4a^S#^##^FÌÿ<80>è^AÛ<80>è ^F \^DÔVn3л4³Ö<80>^P^#.<8f>F^#^#^A^A^H
IQRÞ^#g<9f><86>E^#^A±,Q#^##^F^#E<80>è ^F<80>è^AÛ^DÔ \»4³ÖVn3Ð<80>^X^#.^NU^#^#^A^A^H
^#g<9f><87>
Here's the code I tried to read it with, but I am getting a bunch of 0s.
#include <stdio.h> /* required for file operations */

int main(int argc, char *argv[]){
    int n;
    FILE *fr;
    unsigned char c;
    if (argc != 2) {
        perror("Usage: summary <FILE>");
        return 1;
    }
    fr = fopen(argv[1], "rt");  /* open the file for reading */
    while (1 == 1){
        read(fr, &c, sizeof(c));
        printf("<0x%x>\n", c);
    }
    fclose(fr); /* close the file prior to exiting the routine */
}
What's wrong with my code? I think I am not reading the file correctly.
You're using fopen() to open your file, which returns a FILE *, and read() to read it, which takes an int. You need to either use open() and read() together, or fopen() and fread(). You can't mix these together.
To clarify, fopen() and fread() make use of FILE pointers, which are a different abstraction than straight-up file descriptors. open() and read() make use of "raw" file descriptors, which are a notion understood by the operating system.
While not related to the program's failure here, your fclose() call must also match. In other words, fopen(), fread(), and fclose(), or open(), read(), and close().
Yours didn't compile for me, but I made a few fixes and it's right as rain ;-)
#include <stdio.h> /* required for file operations */

int main(int argc, char *argv[]){
    FILE *fr;
    unsigned char c;
    if (argc != 2) {
        perror("Usage: summary <FILE>");
        return 1;
    }
    fr = fopen(argv[1], "rt");  /* open the file for reading */
    if (fr == NULL)
        return 1;
    // can't read forever, need to stop when reading is done -- checking
    // fread's return value avoids the extra iteration a feof() loop gets
    // my ubuntu didn't have read in stdio.h, but it does have fread
    while (fread(&c, sizeof(c), 1, fr) == 1)
        printf("<0x%x>\n", c);
    fclose(fr); /* close the file prior to exiting the routine */
    return 0;
}
That doesn't look like text to me. So use the "r" mode to fopen, not "rt".
Also, ^# represents '\0', so you probably will read a bunch of zeros in any case. But not ALL zeros.

using stat to detect whether a file exists (slow?)

I'm using code like the following to check whether a file has been created before continuing; the thing is, the file shows up in the file browser well before it is detected by stat... is there a problem with doing this?
//... do something
struct stat buf;
while (stat("myfile.txt", &buf))
    sleep(1);
//... do something else
Alternatively, is there a better way to check whether a file exists?
Using inotify, you can arrange for the kernel to notify you when a change to the file system (such as a file creation) takes place. This may well be what your file browser is using to know about the file so quickly.
The stat system call collects various pieces of information about the file, such as the number of hard links pointing to it or its inode number. You might want to look at the access system call, which you can use to perform an existence check only, by specifying the F_OK flag in mode.
There is, however, a little problem with your code: it puts the process to sleep for a second every time it checks for a file that doesn't exist yet. To avoid that, you have to use the inotify API, as suggested by Jerry Coffin, to get notified by the kernel when the file you are waiting for is there. Keep in mind that inotify does not notify you if the file is already there, so in fact you need to use both access and inotify to avoid a race condition if you start watching for the file just after it was created.
There is no better or faster way to check if file exists. If your file browser still shows the file slightly faster than this program detects it, then Greg Hewgill's idea about renaming is probably taking place.
Here is a C++ code example that sets up an inotify watch, checks if file already exists and waits for it otherwise:
#include <cstdio>
#include <cstring>
#include <string>
#include <unistd.h>
#include <sys/inotify.h>

int main()
{
    const std::string directory = "/tmp";
    const std::string filename = "test.txt";
    const std::string fullpath = directory + "/" + filename;

    int fd = inotify_init();
    int watch = inotify_add_watch(fd, directory.c_str(),
                                  IN_MODIFY | IN_CREATE | IN_MOVED_TO);

    if (access(fullpath.c_str(), F_OK) == 0) {
        printf("File %s exists.\n", fullpath.c_str());
        return 0;
    }

    char buf[1024 * (sizeof(inotify_event) + 16)];
    ssize_t length;
    bool isCreated = false;

    while (!isCreated) {
        length = read(fd, buf, sizeof(buf));
        if (length < 0)
            break;
        inotify_event *event;
        for (size_t i = 0; i < static_cast<size_t>(length);
             i += sizeof(inotify_event) + event->len) {
            event = reinterpret_cast<inotify_event *>(&buf[i]);
            if (event->len > 0 && filename == event->name) {
                printf("The file %s was created.\n", event->name);
                isCreated = true;
                break;
            }
        }
    }

    inotify_rm_watch(fd, watch);
    close(fd);
}
Your code checks whether the file is there every second; you can use inotify to get an event instead.

Why is calling close() after fopen() not closing?

I ran across the following code in one of our in-house DLLs and I am trying to understand the behavior it was showing:
long GetFD(long* fd, const char* fileName, const char* mode)
{
    string fileMode;
    if (strlen(mode) == 0 || tolower(mode[0]) == 'w' || tolower(mode[0]) == 'o')
        fileMode = string("w");
    else if (tolower(mode[0]) == 'a')
        fileMode = string("a");
    else if (tolower(mode[0]) == 'r')
        fileMode = string("r");
    else
        return -1;

    FILE* ofp;
    ofp = fopen(fileName, fileMode.c_str());
    if (! ofp)
        return -1;

    *fd = (long)_fileno(ofp);
    if (*fd < 0)
        return -1;

    return 0;
}

long CloseFD(long fd)
{
    close((int)fd);
    return 0;
}
After repeatedly calling GetFD with the matching CloseFD, the whole DLL would no longer be able to do any file I/O. I wrote a tester program and found that I could call GetFD 509 times, but the 510th time would error.
Using Process Explorer, the number of handles did not increase.
So it seems that the DLL is reaching the limit for the number of open files; setting _setmaxstdio(2048) does increase the number of times we can call GetFD. Obviously, the close() is not working quite right.
After a bit of searching, I replaced the fopen() call with:
long GetFD(long* fd, const char* fileName, const char* mode)
{
    *fd = (long)open(fileName, 2);
    if (*fd < 0)
        return -1;
    return 0;
}
Now, repeatedly calling GetFD/CloseFD works.
What is going on here?
If you open a file with fopen, you have to close it with fclose, symmetrically.
The C++ runtime must be given a chance to clean up/deallocate its inner file-related structures.
You need to use fclose with files opened via fopen, or close with files opened via open.
The standard library you are using has a static array of FILE structures. Because you are not calling fclose(), the standard library doesn't know that the underlying files have been closed, so it doesn't know it can reuse the corresponding FILE structures. You get an error after it has run out of entries in the FILE array.
fopen opens its own file descriptor, so you'd need to call fclose(ofp) in your original function to prevent running out of file descriptors. Usually, one either uses the lower-level file descriptor functions open and close, OR the buffered fopen and fclose functions.
You opened the file with the fopen() function, so you have to close it with fclose(). Likewise, if you open a file with the open() function and then try to call fclose() on it, it will not work.