How to embed a file into an executable? - c++

I have a small demo executable written in C++ that depends on only one 5 KB PNG image being loaded before it can run; the image is used for a pixel text I made. Because of this one file, I would need to give out a ZIP archive instead of just one executable file, which creates enough friction between download and 'play' that I believe it would dissuade some people from trying it out.
My question is: is there any way to embed the PNG file (and any other file, really) into the executable or source code so that it is a single file and the executable can use it?
I have the ability to parse the PNG as a byte stream, so it does not need to be converted to pixel data.
Thanks in advance! (Other questions with a similar title to this exist, but they and their answers seem to get into more specific issues and weren't very helpful)
edit:The compiler is Visual C++ 2010 and this is on Windows (though I would want to avoid windows specific utilities for this)
edit2: Alf's answer seemed like the most portable method, so I quickly wrote a function to convert the PNG file into a TXT or header file that could be read as an unsigned char array. It appears to be identical in this form to the PNG file itself, but my PNG loader won't accept the array. When loading from memory, the PNG parser takes a (void * buffer, size_t length), if it matters.
Here is the code if you want to see it, but I'll still accept other answers if you think they're better than this method:
void compileImagePNGtoBinary(char * filename, char * output){
    FILE * file = fopen(filename, "rb");
    FILE * out = fopen(output, "w");
    unsigned char buffer[32];
    size_t count;
    fprintf(out, "#pragma once \n\n static unsigned char TEXT_PNG_BYTES[] = { ");
    while(!feof(file)){
        count = fread(buffer, 1, 32, file);
        for(int n = 0; n < count; ++n){
            fprintf(out, "0x%02X, ", buffer[n]);
        }
    }
    fprintf(out, "};");
    fclose(file);
    fclose(out);
}
Final Edit: ImageMagick which Alf also mentioned did exactly what I needed of it, thanks!

A portable way is to define a function like
typedef unsigned char Byte;
Byte const* pngFileData()
{
    static Byte const data[] =
    {
        // Byte data generated by a helper program.
    };
    return data;
}
Then all you have to do is write a little helper program that reads the PNG file as binary and generates the C++ curly-brace initializer text. Edit: @awoodland has pointed out in a comment to the question that ImageMagick has such a little helper program…
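For illustration, here is a minimal sketch of what such a helper could look like. The function names makeByteArraySource and readBinaryFile are made up for this example, and the output layout (12 bytes per line) is just one possible choice, not something the answer prescribes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper: formats raw bytes as the text of a C++ array definition.
std::string makeByteArraySource(const std::vector<unsigned char>& bytes,
                                const std::string& name) {
    assert(!name.empty());
    std::string out = "static unsigned char const " + name + "[] = {";
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        char hex[8];
        std::snprintf(hex, sizeof hex, "0x%02X", static_cast<unsigned>(bytes[i]));
        if (i == 0)           out += "\n    ";   // first element opens the first line
        else if (i % 12 == 0) out += ",\n    ";  // wrap after every 12 bytes
        else                  out += ", ";
        out += hex;
    }
    out += "\n};\n";
    return out;
}

// Hypothetical helper: reads a whole file in binary mode.
// Returns an empty vector if the file cannot be opened.
std::vector<unsigned char> readBinaryFile(const char* path) {
    std::vector<unsigned char> bytes;
    if (std::FILE* f = std::fopen(path, "rb")) {
        unsigned char buf[4096];
        std::size_t n;
        while ((n = std::fread(buf, 1, sizeof buf, f)) > 0)
            bytes.insert(bytes.end(), buf, buf + n);
        std::fclose(f);
    }
    return bytes;
}
```

A small main() could pass the result of readBinaryFile("text.png") to makeByteArraySource and write the returned text to a header. Opening the input with "rb" matters on Windows, where text mode would translate line endings and corrupt the PNG bytes.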
Of course, for a Windows-specific program, instead use the ordinary Windows resource scheme.
Cheers & hth.,

Look at XD:
http://www.fourmilab.ch/xd/
Finally, xd can read a binary file and emit a C language data
declaration which contains the data from the file. This is handy when
you wish to embed binary data within C programs.
Personally, I'd use resources for windows, but if you require a truly portable way that doesn't involve knowledge of the executable format, this is the way to go. PNG, JPG, whatever...

Base64 encode the file and put it in a string somewhere in your code ;)
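A rough sketch of the decoding side, assuming the Base64 text is stored as an ordinary string literal in the program. The names base64Value and decodeBase64 are invented for this example; the decoder handles the standard alphabet only and skips characters it does not recognise.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Maps one Base64 character to its 6-bit value, or -1 if it is not in the alphabet.
static int base64Value(char c) {
    if (c >= 'A' && c <= 'Z') return c - 'A';
    if (c >= 'a' && c <= 'z') return c - 'a' + 26;
    if (c >= '0' && c <= '9') return c - '0' + 52;
    if (c == '+') return 62;
    if (c == '/') return 63;
    return -1;
}

// Decodes standard Base64 text back into the original bytes.
std::vector<unsigned char> decodeBase64(const std::string& text) {
    std::vector<unsigned char> out;
    unsigned int acc = 0;  // bit accumulator; only the low `bits` bits are meaningful
    int bits = 0;
    for (char c : text) {
        if (c == '=') break;              // padding marks the end of the data
        const int v = base64Value(c);
        if (v < 0) continue;              // skip line breaks and other stray characters
        acc = (acc << 6) | static_cast<unsigned>(v);
        bits += 6;
        if (bits >= 8) {
            bits -= 8;
            out.push_back(static_cast<unsigned char>((acc >> bits) & 0xFF));
        }
    }
    assert(bits < 6 && "truncated Base64 input");
    return out;
}
```

The decoded vector can then be handed to a loader that takes (buffer, length) via data() and size(). Keep in mind that Base64 makes the embedded text about a third larger than the file, and that the data has to be decoded at startup.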

You can embed any arbitrary file into your program resources: (MSDN) User-Defined Resource.
A user-defined resource-definition statement defines a resource that contains application-specific data. The data can have any format and can be defined either as the content of a given file (if the filename parameter is given) or as a series of numbers and strings (if the raw-data block is specified).
nameID typeID filename
The filename specifies the name of a file containing the binary data of the resource. The contents of the file are included as the resource. RC does not interpret the binary data in any way. It is the programmer's responsibility to ensure that the data is properly aligned for the target computer architecture.
Once you've done that you can use the LoadResource function to access the bytes contained in the file.

This is executable-format dependent, which means inherently operating system/compiler dependent. Windows offers the Resources system for this as mentioned in this question.

On Linux I use this. It's based on a few examples I found when trying to do some 4k demos, albeit modified a bit. I believe it can work on Windows too, but not with the default VS inline assembly. My workaround is #defining a macro to use either this code or the Windows resource system that @MarkRansom suggests (quite painful to get working, but it does work eventually).
//USAGE: call BINDATA(name, file.txt) and access the char array &name.
#ifndef EMBED_DATA_H
#define EMBED_DATA_H
#ifdef _WIN32
//#error The VS ASM compiler won't work with this, but you can get external ones to do the trick
#define BINDATA #error BINDATA requires nasm
#else
__asm__(
".altmacro\n" \
".macro binfile p q\n" \
" .global \\p\n" \
"\\p:\n" \
" .incbin \\q\n" \
"\\p&_end:\n" \
" .byte 0\n" \
" .global \\p&_len\n" \
"\\p&_len:\n" \
" .int(\\p&_end - \\p)\n" \
".endm\n\t"
);
#ifdef __cplusplus
extern "C" {
#endif
#define BINDATA(n, s) \
__asm__("\n\n.data\n\tbinfile " #n " \"" #s "\"\n"); \
extern char n; \
extern int n##_len;
#ifdef __cplusplus
}
#endif
#endif
#endif

If I want to embed static data into an executable, I would package it into a .lib/.a file or a header file as an array of unsigned chars. That's if you are looking for a portable approach.
I have created a command line tool that does both, actually, here. All you have to do is list the files and pick the option -l64 to output a 64-bit library file along with a header that includes pointers to each piece of data.
You can explore more options as well. For example, this command:
>BinPack image.png -j -hx
will output the data of image.png into a header file as hexadecimal, with the lines justified per the -j option.
const unsigned char BP_icon[] = {
0x89,0x50,0x4e,0x47,0x0d,0x0a,0x1a,0x0a,0x00,0x00,0x00,0x0d,0x49,0x48,0x44,0x52,
0x00,0x00,0x01,0xed,0x00,0x00,0x01,0xed,0x08,0x06,0x00,0x00,0x00,0x34,0xb4,0x26,
0xfb,0x00,0x00,0x02,0xf1,0x7a,0x54,0x58,0x74,0x52,0x61,0x77,0x20,0x70,0x72,0x6f,
0x66,0x69,0x6c,0x65,0x20,0x74,0x79,0x70,0x65,0x20,0x65,0x78,0x69,0x66,0x00,0x00,
0x78,0xda,0xed,0x96,0x5d,0x92,0xe3,0x2a,0x0c,0x85,0xdf,0x59,0xc5,0x2c,0x01,0x49,
0x08,0x89,0xe5,0x60,0x7e,0xaa,0xee,0x0e,0xee,0xf2,0xef,0x01,0x3b,0x9e,0x4e,0xba,
0xbb,0x6a,0xa6,0x66,0x5e,0x6e,0x55,0x4c,0x8c,0x88,0x0c,0x07,0xd0,0x27,0x93,0x84,
0xf1,0xef,0x3f,0x33,0xfc,0xc0,0x45,0xc5,0x52,0x48,0x6a,0x9e,0x4b,0xce,0x11,0x57,
0x2a,0xa9,0x70,0x45,0xc3,0xe3,0x79,0xd5,0x5d,0x53,0x4c,0xbb,0xde,0xd7,0xe8,0x57,
0x8b,0x9e,0xfd,0xe1,0x7e,0xc0,0xb0,0x02,0x2b,0xe7,0x03,0xcf,0xa7,0xa5,0x87,0xff,
0x1a,0xf0,0xb0,0x54,0xd1,0xd2,0x0f,0x42,0xde,0xae,0x07,0xc7,0xf3,0x83,0x92,0x4e,
0xcb,0xfe,0x22,0xc4,0xa7,0x91,0xb5,0xa2,0xd5,0xee,0x97,0x50,0xb9,0x84,0x84,0xcf,
0x07,0x74,0x09,0xd4,0x73,0x5b,0x31,0x17,0xb7,0x8f,0x5b,0x38,0xc6,0x69,0xaf}

I came here looking for a bash script, so that I can generate the C array of bytes in a mostly-cross-platform compatible way (I depend on mingw bash for my windows builds anyway) without having to compile a helper tool or depend on any tools that don't come standard with a normal bash shell. Here's my take:
#!/bin/bash
set -e
SCRIPT_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" &> /dev/null && pwd )"
OUT_FILE="$SCRIPT_DIR/src/alloverse_binary_schema.h"
BINARY_FILE="$SCRIPT_DIR/include/allonet/schema/alloverse.bfbs"
VAR_NAME="alloverse_schema"
echo "static const unsigned char ${VAR_NAME}_bytes[] = {" > "$OUT_FILE"
hexdump -ve '1/1 "0x%02x, "' "$BINARY_FILE" >> "$OUT_FILE"
echo "0x00}; static const int ${VAR_NAME}_size = sizeof(${VAR_NAME}_bytes); " >> "$OUT_FILE"
I can then just #include this from the C file where I use it, and use the generated <VAR_NAME>_bytes and <VAR_NAME>_size (here alloverse_schema_bytes and alloverse_schema_size) as needed:
#include "alloverse_binary_schema.h"
bool allo_initialize(void)
{
g_alloschema = reflection_Schema_as_root(alloverse_schema_bytes);
}
This script should be adaptable to your needs by adjusting OUT_FILE, BINARY_FILE and VAR_NAME (perhaps taking them as arguments to the script).

Related


Run a few command line commands from c++

I wrote a program in Python that can find out file sizes, create directories, and move around as though I'm just in a regular shell. The problem is that I need to be able to do this in C++.
Here's the python code that I need c++ functionality from:
os.chdir('r'+str(r)+'n'+str(n))
def build_path(newpath):
    if os.path.isdir(newpath):
        os.chdir(newpath)
    else:
        os.mkdir(newpath)
        os.chdir(newpath)
And also this piece:
if os.stat('data'+str(tick)).st_size > 2500000:
    heavyFile.close()
    tick+=1
    heavyFile=open('data'+str(tick),'w')
os.system('touch COMPLETED'+str(r)+str(n))
So basically I need to be able to make some directories, change into those directories, build files, but don't let them get much larger than 2.5 MB, and when they finally get over that size, create a new file that is incremented by one.
so the file tree ends up looking like:
r4n4/dir1/data0,data1,data2,etc
r4n4/dir2/data0,data1,data2,etc
and so on.
How can I do this in c++?
I know I can call system('command')
but I don't know how to get file size nicely using that and I'm just hoping for an easier way to do this.
Also, I do not have access to boost where I am running this program.
Try checking out the boost::filesystem library. (http://www.boost.org/doc/libs/1_54_0/libs/filesystem/doc/index.htm) All three of your requests are covered in the tutorial.
You can use stat() in your code to get properties of filesystem objects. Here's an example:
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
struct stat buf;
// stat() returns 0 on success and fills in buf
if (stat(filename, &buf) == 0)
{
    // If it's a regular file, print the size in bytes
    if ((buf.st_mode & S_IFMT) == S_IFREG)
    {
        off_t size = buf.st_size;
        // off_t has no printf length modifier of its own, so cast it
        fprintf(stdout, "%s is a regular file: size %lld bytes\n", filename, (long long)size);
    }
}
There are also macros within stat.h which make it a little easier to check if something is a regular file or whatever, instead of AND'ing multiple things as above. For example, the S_ISREG macro will do the same thing as the code above:
if(S_ISREG(buf.st_mode)) /* stat.h macro, instead of AND'ing */
{
fprintf(stdout, "%s is a regular file\n", filename);
}
The macro S_ISDIR would tell you if it's a directory. There are other macros like this.
You can do man -s 2 stat to see the man page for stat() and get more details. Hope this helps.
You can make use of system calls to achieve what you want. If you are on Linux, check out the following man pages:
man 2 chdir
man 2 mkdir
man 2 stat
You can also just call your script from your c++ code with this command:
system ("python script.py");
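Putting the chdir/mkdir/stat calls mentioned above together, a sketch of the Python logic in C++ without Boost might look like this. It assumes a POSIX system; the names enterDir, fileSize and appendRecord are invented for the example, and the size check happens before each write rather than after it as in the Python snippet.

```cpp
#include <cassert>
#include <fstream>
#include <string>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

// Like build_path() in the question: create the directory if needed, then enter it.
bool enterDir(const std::string& path) {
    struct stat st;
    if (stat(path.c_str(), &st) != 0) {
        if (mkdir(path.c_str(), 0755) != 0) return false;  // could not create it
    } else if (!S_ISDIR(st.st_mode)) {
        return false;                                      // exists, but is not a directory
    }
    return chdir(path.c_str()) == 0;
}

// Size of a file in bytes, or -1 if it does not exist or cannot be inspected.
long long fileSize(const std::string& path) {
    struct stat st;
    if (stat(path.c_str(), &st) != 0) return -1;
    return static_cast<long long>(st.st_size);
}

// Appends text to data<tick>, moving on to data<tick+1> once the current
// file has grown past `limit` bytes.
bool appendRecord(int& tick, const std::string& text, long long limit) {
    assert(tick >= 0);
    if (fileSize("data" + std::to_string(tick)) > limit) ++tick;  // rotate
    std::ofstream out("data" + std::to_string(tick), std::ios::app);
    out << text << std::flush;
    return static_cast<bool>(out);
}
```

With this, enterDir("r4n4") followed by enterDir("dir1") and repeated appendRecord(tick, line, 2500000) calls would produce the r4n4/dir1/data0, data1, ... layout from the question.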

embedding a text file in an exe which can be accessed using fopen

I would like to embed a text file with some data into my program.
let's call it "data.txt".
This text file is usually loaded with a function that requires the text file's name as input and eventually opens it using an fopen() call, something along the lines of
FILE* name = fopen("data.txt", "r");
I can't really change this function and I would like the routine to open this same file every time it runs. I've seen people ask about embedding the file as a header but it seems that I wouldn't be able to call fopen() on a file that I embed into the header.
So my question is: is there a way to embed a text file as a callable file/variable to fopen()?
I am using VS2008.
Yes and No. The easiest way is to transform the content of the text file into an initialized array.
char data_txt[] = {
'd','a','t','a',' ','g','o','e','s',' ','h','e','r','e', //....
};
This transformation is easily done with a small perl script or even a small C program. You then compile and link the resulting module into your program.
An old trick to make this easier to manage with a Makefile is to make the script transform its data into the body of the initializer and write it to a file without the surrounding variable declaration or even the curly braces. If data.txt is transformed to data.inc, then it is used like so:
char data_txt[] = {
#include "data.inc"
};
Update
On many platforms, it is possible to append arbitrary data to the executable file itself. The trick then is to find it at run time. On platforms where this is possible, there will be file header information for the executable that indicates the length of the executable image. That can be used to compute an offset to use with fseek() after you have opened the executable file for reading. That is harder to do in a portable way, since it may not even be possible to learn the actual file name of your executable image at run time in a portable way. (Hint, argv[0] is not required to point to the actual program.)
If you cannot avoid the call to fopen(), then you can still use this trick to keep a copy of the content of data.txt, and put it back in a file at run time. You could even be clever and only write the file if it is missing....
If you can drop the call to fopen() but still need a FILE * pointing at the data, then this is likely possible if you are willing to play fast and loose with your C runtime library's implementation of stdio. In the GNU version of libc, functions like sprintf() and sscanf() are actually implemented by creating a "real enough" FILE * that can be passed to a common implementation (vfprintf() and vfscanf(), IIRC). That faked FILE is marked as buffered, and points its buffer to the user's buffer. Some magic is used to make sure the rest of stdio doesn't do anything stupid.
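As a sketch of the "put it back in a file at run time, but only if it is missing" idea from this answer: the array below is a stand-in for whatever the conversion script generates, and ensureFileOnDisk is a name made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Stand-in for the generated array; the real contents would come from the
// conversion script described in the answer.
static const char data_txt[] = {
    'd','a','t','a',' ','g','o','e','s',' ','h','e','r','e'
};

// Writes the embedded bytes to `path` unless a readable file already exists there.
bool ensureFileOnDisk(const char* path, const char* data, std::size_t size) {
    assert(path != nullptr && data != nullptr);
    if (std::FILE* existing = std::fopen(path, "rb")) {
        std::fclose(existing);
        return true;                        // already present, leave it alone
    }
    std::FILE* out = std::fopen(path, "wb");
    if (!out) return false;                 // e.g. read-only directory
    const bool ok = std::fwrite(data, 1, size, out) == size;
    return std::fclose(out) == 0 && ok;
}
```

Calling ensureFileOnDisk("data.txt", data_txt, sizeof data_txt) once at startup would let the unchanged routine keep calling fopen("data.txt", ...) as before. Whether the working directory is writable is of course something the program cannot take for granted.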
For any kind of file, based on RBerteig's answer, you could do something as simple as this with Python.
This program will generate a text.txt.c file that can be compiled and linked with your code, to embed any text or binary file directly into your exe and read it directly from a variable:
import struct  # Needed to convert raw bytes to integers
f = open("text.txt", "rb")  # Open the file in read binary mode
s = "unsigned char text_txt_data[] = {"
b = f.read(1)  # Read one byte from the stream
db = struct.unpack("B", b)[0]  # Transform it to an unsigned byte value
h = hex(db)  # Generate hexadecimal string
s = s + h  # Add it to the final code
b = f.read(1)  # Read one byte from the stream
while b != b"":  # An empty result means end of file
    s = s + ","  # Add a comma to separate the array elements
    db = struct.unpack("B", b)[0]  # Transform it to an unsigned byte value
    h = hex(db)  # Generate hexadecimal string
    s = s + h  # Add it to the final code
    b = f.read(1)  # Read one byte from the stream
s = s + "};"  # Close the brackets
f.close()  # Close the file
# Write the resulting code to a file that can be compiled
fw = open("text.txt.c", "w")
fw.write(s)
fw.close()
Will generate something like
unsigned char text_txt_data[] = {0x52,0x61,0x6e,0x64,0x6f,0x6d,0x20,0x6e,0x75...
You can later use your data in another C file through the variable, with code like this:
extern unsigned char text_txt_data[];
Right now I can think of two ways of converting it to readable text: using memory streams or converting it to a C string.
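Both options can be sketched in C++ roughly as follows. The array here is a stand-in for the generated one (it spells "Random nu", matching the sample output above), and text_txt_size, embeddedAsString and firstLine are names invented for the example. In a real build the generator would also have to emit the length, since sizeof does not work on an extern array of unknown size.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Stand-in for the generated array so that the sketch is self-contained.
unsigned char text_txt_data[] = {0x52,0x61,0x6e,0x64,0x6f,0x6d,0x20,0x6e,0x75};
const std::size_t text_txt_size = sizeof text_txt_data;

// Option 1: copy the bytes into a std::string (use c_str() where a C string is needed).
std::string embeddedAsString(const unsigned char* data, std::size_t size) {
    assert(data != nullptr);
    return std::string(reinterpret_cast<const char*>(data), size);
}

// Option 2: wrap the text in a memory stream and read it like a file.
std::string firstLine(const unsigned char* data, std::size_t size) {
    std::istringstream in(embeddedAsString(data, size));
    std::string line;
    std::getline(in, line);
    return line;
}
```

Passing the size explicitly avoids relying on a terminating zero byte, which the generated array does not contain.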

How to embed a file into an executable file?

I have two problems, the first has been solved.
Current problem
If I embed a file that requires a library to load it, such as a jpeg image or a mp3 music, I will need to use the file as input to the library. However, each library is different and uses a way to get a file as input, the input may be the file name or a FILE* pointer (from libc's file interface).
I would like to know how to access an embedded file with a name. It will be inefficient if I create a temporary file, is there another way? Can I map a file name to memory? My platforms are Windows and Linux.
If show_file(const char* name) is a function from a library, I will need a string to open the file.
I have seen these questions:
How to get file descriptor of buffer in memory?
Getting Filename from file descriptor in C
and the following code is my solution. Is it a good solution? Is it inefficient?
# include <stdio.h>
# include <unistd.h>
extern char _binary_data_txt_start;
extern const void* _binary_data_txt_size;
const size_t len = (size_t)&_binary_data_txt_size;
void show_file(const char* name){
FILE* file = fopen(name, "r");
if (file == NULL){
printf("Error (show_file): %s\n", name);
return;
}
while (true){
char ch = fgetc(file);
if (feof(file) )
break;
putchar( ch );
}
printf("\n");
fclose(file);
}
int main(){
int fpipe[2];
pipe(fpipe);
if( !fork() ){
for( int buffsize = len, done = 0; buffsize>done; ){
done += write( fpipe[1], &_binary_data_txt_start + done, buffsize-done );
}
_exit(0);
}
close(fpipe[1]);
char name[200];
sprintf(name, "/proc/self/fd/%d", fpipe[0] );
show_file(name);
close(fpipe[0]);
}
The other problem (solved)
I tried to embed a file on Linux, with GCC, and it worked. However, I tried to do the same thing on Windows, with Mingw, and it did not compile.
The code is:
#include <stdio.h>
extern char _binary_data_txt_start;
extern char _binary_data_txt_end;
int main(){
    // _binary_data_txt_end marks one past the last byte, so stop before it
    for (char* my_file = &_binary_data_txt_start; my_file < &_binary_data_txt_end; my_file++)
        putchar(*my_file);
    printf("\n");
}
The compilation commands are:
objcopy --input-target binary --output-target elf32-i386 --binary-architecture i386 data.txt data.o
g++ main.cpp data.o -o test.exe
On Windows, I get the following compiler error:
undefined reference to `_binary_data_txt_start'
undefined reference to `_binary_data_txt_end'
I tried to replace elf32-i386 with i386-pc-mingw32, but I still get the same error.
I think that for this to work with MinGW you'll need to remove the leading underscore from the names in the .c file. See Embedding binary blobs using gcc mingw for some details.
See if using the following helps:
extern char binary_data_txt_start;
extern char binary_data_txt_end;
If you need the same source to work for Linux or MinGW builds, you might need to use the preprocessor to have the right name used in the different environments.
If you're using a library that requires a FILE* for reading data, then you can use fmemopen(3) to create a pseudofile out of a memory blob. This avoids creating a temporary file on disk. Unfortunately, it started out as a GNU extension (it was later standardized in POSIX.1-2008), so I don't know if it's available with MinGW (likely not).
However, most well-written libraries (such as libpng and the IJG's JPEG library) provide routines for opening a file from memory as opposed to from disk. libpng, in particular, even offers a streaming interface, where you can incrementally decode a PNG file before it's been completely read into memory. This is useful if, say, you're streaming an interlaced PNG from the network and you want to display the interlaced data as it loads for a better user experience.
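A minimal sketch of the fmemopen(3) route mentioned above, assuming a POSIX system where fmemopen is available. readThroughFile is a made-up stand-in for a library function that only knows how to consume a FILE*.

```cpp
#include <cassert>
#include <stddef.h>
#include <stdio.h>   // fmemopen is declared here on POSIX systems
#include <string>

// Opens a memory blob as a read-only FILE* and reads it back through stdio.
std::string readThroughFile(const char* blob, size_t size) {
    assert(blob != nullptr && size > 0);
    // fmemopen takes a non-const buffer; in "r" mode it is never written to.
    FILE* f = fmemopen(const_cast<char*>(blob), size, "r");
    if (!f) return std::string();
    std::string out;
    int ch;                                  // int, so that EOF can be told apart from data
    while ((ch = fgetc(f)) != EOF) out.push_back(static_cast<char>(ch));
    fclose(f);
    return out;
}
```

With objcopy-style symbols, the blob pointer and size would come from the _binary_..._start and _binary_..._end addresses instead of a literal.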
On Windows, you can embed custom resource into executable file. You would need a .RC file and a resource compiler. With Visual Studio IDE you can do it without hassle.
In your code, you would use FindResource, LoadResource and LockResource functions to load the contents into memory at runtime. A sample code that reads the resource as long string:
void GetResourceAsString(int nResourceID, CStringA &strResourceString)
{
    HRSRC hResource = FindResource(NULL, MAKEINTRESOURCE(nResourceID), L"DATA");
    if (hResource == NULL)
        return; // no such resource
    HGLOBAL hResHandle = LoadResource(NULL, hResource);
    if (hResHandle == NULL)
        return;
    const char* lpData = static_cast<const char*>(LockResource(hResHandle));
    strResourceString.SetString(lpData, SizeofResource(NULL, hResource));
    // No FreeResource call: it is obsolete and not needed on 32/64-bit Windows.
}
Where nResourceID is the ID of resource under custom resource type DATA. DATA is just a name, you may choose another name. Other in-built resources are cursors, dialogs, string-tables etc.
I've created a small library called elfdataembed which provides a simple interface for extracting/referencing sections embedded using objcopy. This allows you to pass the offset/size to another tool, or reference it directly from the runtime using file descriptors. Hopefully this will help someone in the future.
It's worth mentioning this approach is more efficient than compiling to a symbol, as it allows external tools to reference the data without needing to be extracted, and it also doesn't require the entire binary to be loaded into memory in order to extract/reference it.
Use nm data.o to see what it named the symbols. It may be something as simple as the filesystem differences causing the filename-derived symbols to be different (eg filename capitalized).
Edit: Just saw your second question. If you are using threads you can make a pipe and pass that to the library (first using fdopen() if it wants a FILE *). If you are more specific about the API you need to talk to I can add more specific advice.

Embedding resources in executable using GCC

I'm looking for a way to easily embed any external binary data in a C/C++ application compiled by GCC.
A good example of what I'd like to do is handling shader code - I can just keep it in source files like const char* shader = "source here"; but that's extremely impractical.
I'd like the compiler to do it for me: upon compilation (linking stage), read file "foo.bar" and link its content to my program, so that I'd be able to access the contents as binary data from the code.
Could be useful for small applications which I'd like to distribute as a single .exe file.
Does GCC support something like this?
There are a couple possibilities:
use ld's capability to turn any file into an object (Embedding binary blobs using gcc mingw):
ld -r -b binary -o binary.o foo.bar # then link in binary.o
use a bin2c/bin2h utility to turn any file into an array of bytes (Embed image in code, without using resource section or external images)
Update: Here's a more complete example of how to use data bound into the executable using ld -r -b binary:
#include <stdio.h>
// a file named foo.bar with some example text is 'imported' into
// an object file using the following command:
//
//     ld -r -b binary -o foo.bar.o foo.bar
//
// That creates an object file named "foo.bar.o" with the following
// symbols:
//
//     _binary_foo_bar_start
//     _binary_foo_bar_end
//     _binary_foo_bar_size
//
// Note that the symbols are addresses (so for example, to get the
// size value, you have to get the address of the _binary_foo_bar_size
// symbol).
//
// In my example, foo.bar is a simple text file, and this program will
// dump the contents of that file which has been linked in by specifying
// foo.bar.o as an object file input to the linker when the program is built
extern char _binary_foo_bar_start[];
extern char _binary_foo_bar_end[];
int main(void)
{
    printf( "address of start: %p\n", (void*)_binary_foo_bar_start);
    printf( "address of end: %p\n", (void*)_binary_foo_bar_end);
    for (char* p = _binary_foo_bar_start; p != _binary_foo_bar_end; ++p) {
        putchar( *p);
    }
    return 0;
}
Update 2 - Getting the resource size: I could not read _binary_foo_bar_size correctly. At runtime, gdb shows me the right size of the text resource with display (unsigned int)&_binary_foo_bar_size, but assigning this to a variable always gave a wrong value. I could solve the issue the following way, by subtracting the start address from the end address (with the array declarations above, the names already decay to pointers, so no & is needed):
unsigned int iSize = (unsigned int)(_binary_foo_bar_end - _binary_foo_bar_start);
It is a workaround, but it works well and is not too ugly.
As well as the suggestions already mentioned, under linux you can use the hex dump tool xxd, which has a feature to generate a C header file:
xxd -i mybinary > myheader.h
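For reference, xxd -i names the array after the input file and adds a matching _len variable. The sketch below inlines a stand-in for such a header (the bytes are just the 8-byte PNG signature, used as sample data) and passes it to a hypothetical looksLikePng function that takes (buffer, length) like a typical in-memory loader.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Stand-in for the contents of myheader.h as xxd -i would lay it out.
unsigned char mybinary[] = {
  0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a
};
unsigned int mybinary_len = 8;

// Hypothetical consumer: checks whether the buffer starts with the PNG signature.
bool looksLikePng(const void* buffer, std::size_t length) {
    static const unsigned char signature[8] =
        {0x89, 'P', 'N', 'G', '\r', '\n', 0x1a, '\n'};
    assert(buffer != nullptr);
    return length >= sizeof signature &&
           std::memcmp(buffer, signature, sizeof signature) == 0;
}
```

A quick check like looksLikePng(mybinary, mybinary_len) is a cheap way to confirm that the embedded bytes survived the conversion before handing them to the real decoder.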
The .incbin GAS directive can be used for this task. Here is a totally free licenced library that wraps around it:
https://github.com/graphitemaster/incbin
To recap. The incbin method is like this. You have a thing.s assembly file that you compile with gcc -c thing.s
    .section .rodata
    .global thing
    .type   thing, @object
    .align  4
thing:
    .incbin "meh.bin"
    .global thing_end
thing_end:
    .global thing_size
    .type   thing_size, @object
    .align  4
thing_size:
    .int    thing_end - thing
In your C or C++ code you can reference it with:
extern const char thing[];
extern const char thing_end[]; /* a label marking the end, not a pointer variable */
extern const int thing_size;
So then you link the resulting .o with the rest of the compilation units.
Credit where due is to @John Ripley with his answer here: C/C++ with GCC: Statically add resource files to executable/library
But the above method is not as convenient as what incbin can give you. To accomplish the above with incbin you don't need to write any assembler. Just the following will do:
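#include <stdio.h>
#include "incbin.h"
/* With incbin's default prefix and naming style this defines
   gThingData, gThingEnd and gThingSize. */
INCBIN(Thing, "meh.bin");
int main(int argc, char* argv[])
{
    // Now use thing
    printf("thing=%p\n", (const void*)gThingData);
    printf("thing len=%u\n", gThingSize);
}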
For C23, there now exists the preprocessor directive #embed, which achieves exactly what you are looking for without using external tools. See 6.10.3.1 of the C23 standard (here is a link to the most recent working draft). Here's a good blog post about the history of #embed by one of the committee members behind this new feature.
Here is a snippet from the draft standard demonstrating its use:
#include <stddef.h>
void have_you_any_wool(const unsigned char*, size_t);
int main (int, char*[]) {
static const unsigned char baa_baa[] = {
#embed "black_sheep.ico"
};
have_you_any_wool(baa_baa, sizeof(baa_baa));
return 0;
}
An equivalent directive for C++ does not exist at this time.
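You could do this in a header file, using a C++11 raw string literal so that the source can span several lines (an ordinary string literal cannot contain unescaped line breaks):
#ifndef SHADER_SRC_HPP
#define SHADER_SRC_HPP
static const char* shader = R"(
//source
)";
#endif
and just include that.
Another way is to read the shader file at run time.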